CitableCorpusAnalysis.jl

Work with multiple models of a text corpus.

Author neelsmith

Popularity: 0 Stars

Updated Last: 2 Years Ago

Started In: September 2021

A Julia module to work with multiple models of a text corpus.

See the documentation.

Current focus

The current focus of work on this module is to profile a citable corpus. This includes measuring:

Size of corpus, coverage of analysis

total number of lexical tokens in corpus (size of corpus)
number of distinct lexical tokens in corpus (and ratio to total size) (the token vocabulary)
number (and proportion) of lexical tokens parsed by analyzer (coverage of corpus)
number (and proportion) of distinct lexical tokens parsed by analyzer (coverage of token vocabulary)
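The four measures above can be sketched in plain Julia. This is only an illustration under stated assumptions, not the API of CitableCorpusAnalysis.jl: the function `coverage_profile`, the token vector, and the `Dict` standing in for an analyzer's output are all hypothetical, and a token counts as "parsed" when the dictionary holds at least one analysis for it.

```julia
# Hypothetical helper, not part of CitableCorpusAnalysis.jl.
# `tokens` is the sequence of lexical tokens in the corpus;
# `parses` maps a token to its analyses (a missing key means "not parsed").
function coverage_profile(tokens, parses)
    isparsed(t) = !isempty(get(parses, t, ()))
    vocabulary = unique(tokens)                 # distinct lexical tokens
    parsed_tokens = count(isparsed, tokens)
    parsed_vocabulary = count(isparsed, vocabulary)
    return (
        total_tokens = length(tokens),          # size of corpus
        distinct_tokens = length(vocabulary),   # size of token vocabulary
        type_token_ratio = length(vocabulary) / length(tokens),
        parsed_tokens = parsed_tokens,
        corpus_coverage = parsed_tokens / length(tokens),
        parsed_vocabulary = parsed_vocabulary,
        vocabulary_coverage = parsed_vocabulary / length(vocabulary),
    )
end

# Toy data, invented for illustration only.
coverage_tokens = ["arma", "virumque", "cano", "arma", "xyz"]
coverage_parses = Dict("arma" => ["arma"], "cano" => ["cano"])

coverage = coverage_profile(coverage_tokens, coverage_parses)
println(coverage)
```

In this toy corpus, three of the five tokens and two of the four distinct tokens have an analysis, so coverage of the corpus (0.6) is higher than coverage of the token vocabulary (0.5): a frequent parsed token raises the first figure but not the second.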

Size of lexicon, formal and lexical complexity

number of lexemes recognized (lexicon of the corpus)
proportion of lexemes to total tokens parsed (lexical density or complexity of the corpus)
proportion of lexemes to distinct tokens parsed (lexical density or complexity of token vocabulary)
number of distinct forms in relation to lexemes, to distinct tokens, and to total tokens (morphological complexity of lexicon, token vocabulary and corpus)
number (and proportion) of tokens that are morphologically ambiguous, counted in the token vocabulary and in the corpus (formal ambiguity of token vocabulary and corpus)
number (and proportion) of tokens that are lexically ambiguous, counted in the token vocabulary and in the corpus (lexical ambiguity of token vocabulary and corpus)
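One possible reading of these lexicon measures, again as a sketch rather than the module's implementation: `lexicon_profile` and the `(lexeme, form)` named tuples are hypothetical, the identifiers in the toy data are invented, and proportions are taken over parsed tokens only, which is an assumption the list above does not settle.

```julia
# Hypothetical helper, not part of CitableCorpusAnalysis.jl.
# `parses` maps a token to a vector of analyses, each a named tuple
# pairing a lexeme identifier with a morphological form identifier.
function lexicon_profile(tokens, parses)
    isparsed(t) = !isempty(get(parses, t, ()))
    parsed = filter(isparsed, tokens)   # parsed tokens, with repeats
    vocab = unique(parsed)              # parsed token vocabulary
    lexemes = unique(a.lexeme for t in vocab for a in parses[t])
    forms = unique(a.form for t in vocab for a in parses[t])
    # A token is ambiguous when its analyses disagree on form or on lexeme.
    ndistinct(f, t) = length(unique(f(a) for a in parses[t]))
    form_ambiguous(t) = ndistinct(a -> a.form, t) > 1
    lex_ambiguous(t) = ndistinct(a -> a.lexeme, t) > 1
    return (
        lexemes = length(lexemes),
        lexemes_per_token = length(lexemes) / length(parsed),
        lexemes_per_distinct_token = length(lexemes) / length(vocab),
        forms = length(forms),
        forms_per_lexeme = length(forms) / length(lexemes),
        form_ambiguous_vocab = count(form_ambiguous, vocab),
        form_ambiguous_corpus = count(form_ambiguous, parsed),
        lex_ambiguous_vocab = count(lex_ambiguous, vocab),
        lex_ambiguous_corpus = count(lex_ambiguous, parsed),
        lex_ambiguity_corpus_ratio = count(lex_ambiguous, parsed) / length(parsed),
    )
end

# Toy data, invented for illustration only.
lexicon_tokens = ["amor", "amor", "canis", "est", "xyz"]
lexicon_parses = Dict(
    "amor" => [(lexeme = "amor_n", form = "noun_nom_sg"),
               (lexeme = "amo_v", form = "verb_pres_pass_1sg")],
    "canis" => [(lexeme = "canis_n", form = "noun_nom_sg"),
                (lexeme = "canis_n", form = "noun_gen_sg")],
    "est" => [(lexeme = "sum_v", form = "verb_pres_act_3sg")],
)

lexicon = lexicon_profile(lexicon_tokens, lexicon_parses)
println(lexicon)
```

The toy data separates the two kinds of ambiguity: "canis" has two forms of a single lexeme, so it is morphologically but not lexically ambiguous, while "amor" is ambiguous in both senses.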
