spaCy/spacy
Latest commit: d3dc5718b2 "Fix syntax error in Doc" by Matthew Honnibal, 2016-09-28 11:39:49 +02:00
data
de Add language data for German 2016-09-25 15:44:45 +02:00
en
fi access model via sputnik 2015-12-07 06:01:28 +01:00
it
munge * Fix Python3 problem in align_raw 2015-07-28 16:06:53 +02:00
serialize * Whitespace 2016-01-29 03:59:22 +01:00
syntax
tests Fix test for empty sentence string. 2016-09-27 19:21:22 +02:00
tokens Fix syntax error in Doc 2016-09-28 11:39:49 +02:00
zh * Work on Chinese support 2016-05-05 11:39:12 +02:00
__init__.pxd
__init__.py
about.py
attrs.pxd introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
attrs.pyx introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
cfile.pxd
cfile.pyx
deprecated.py
download.py
gold.pxd
gold.pyx
language.py
lemmatizer.py
lexeme.pxd
lexeme.pyx Fix Issue #371: Lexeme objects were unhashable. 2016-09-27 13:22:30 +02:00
matcher.pyx Finish refactoring data loading 2016-09-24 20:26:17 +02:00
morphology.pxd
morphology.pyx Fix pos name conflict in lemmatize 2016-09-27 17:35:58 +02:00
multi_words.py
orth.pxd
orth.pyx
parts_of_speech.pxd * Fix parts_of_speech now that symbols list has been reformed 2015-10-13 13:44:40 +11:00
parts_of_speech.pyx
scorer.py
strings.pxd
strings.pyx remove ujson as default non-dev dependency (still works as fallback if installed), because ujson doesn't ship wheels 2016-04-12 11:28:07 +02:00
structs.pxd Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match. 2016-09-21 14:54:55 +02:00
symbols.pxd
symbols.pyx
tagger.pxd * Move to thinc 5.0 2016-01-29 03:58:55 +01:00
tagger.pyx
tokenizer.pxd
tokenizer.pyx Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00
typedefs.pxd
typedefs.pyx
util.py Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00
vocab.pxd
vocab.pyx Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00