spaCy/spacy
Latest commit: d3dc5718b2 "Fix syntax error in Doc" by Matthew Honnibal, 2016-09-28 11:39:49 +02:00
data
de Add language data for German 2016-09-25 15:44:45 +02:00
en
fi access model via sputnik 2015-12-07 06:01:28 +01:00
it
munge * Fix Python3 problem in align_raw 2015-07-28 16:06:53 +02:00
serialize * Whitespace 2016-01-29 03:59:22 +01:00
syntax
tests Fix test for empty sentence string. 2016-09-27 19:21:22 +02:00
tokens Fix syntax error in Doc 2016-09-28 11:39:49 +02:00
zh * Work on Chinese support 2016-05-05 11:39:12 +02:00
__init__.pxd
__init__.py
about.py
attrs.pxd introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
attrs.pyx introduce lang field for LexemeC to hold language id 2016-03-10 13:01:34 +01:00
cfile.pxd
cfile.pyx
deprecated.py
download.py
gold.pxd
gold.pyx
language.py
lemmatizer.py
lexeme.pxd
lexeme.pyx Fix Issue #371: Lexeme objects were unhashable. 2016-09-27 13:22:30 +02:00
matcher.pyx Finish refactoring data loading 2016-09-24 20:26:17 +02:00
morphology.pxd
morphology.pyx Fix pos name conflict in lemmatize 2016-09-27 17:35:58 +02:00
multi_words.py
orth.pxd
orth.pyx
parts_of_speech.pxd * Fix parts_of_speech now that symbols list has been reformed 2015-10-13 13:44:40 +11:00
parts_of_speech.pyx
scorer.py
strings.pxd
strings.pyx remove ujson as default non-dev dependency (still works as fallback if installed), because ujson doesn't ship wheels 2016-04-12 11:28:07 +02:00
structs.pxd Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match. 2016-09-21 14:54:55 +02:00
symbols.pxd
symbols.pyx
tagger.pxd * Move to thinc 5.0 2016-01-29 03:58:55 +01:00
tagger.pyx
tokenizer.pxd
tokenizer.pyx Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00
typedefs.pxd
typedefs.pyx
util.py Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00
vocab.pxd
vocab.pyx Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00