💫 Industrial-strength Natural Language Processing (NLP) in Python

Matthew Honnibal 9f17467c2e * Fix EMPTY_TOKEN		2014-12-07 22:07:41 +11:00
data	* Revise tokenization rules to match PTB. Rules are pretty messy around periods, need better support for these.	2014-12-07 22:04:47 +11:00
docs	* Make intro chattier, explain philosophy better	2014-12-02 15:20:18 +11:00
spacy	* Fix EMPTY_TOKEN	2014-12-07 22:07:41 +11:00
tests	* Make StringStore.__getitem__ accept unicode-typed keys.	2014-12-03 01:33:20 +11:00
.gitignore	* Upd gitignore	2014-11-12 23:25:27 +11:00
README.md	Initial commit	2014-07-04 01:15:40 +10:00
fabfile.py	* Add conll experiments	2014-11-12 23:22:05 +11:00
requirements.txt	* Rename external hashing lib, from trustyc to preshed	2014-09-26 18:40:03 +02:00
setup.py	* Compile context.pyx and tagger.pyx modules	2014-12-07 15:29:54 +11:00

spaCy

Lightning-fast, full-cream natural language tokenization. Tokens are pointers to rich Lexeme structs.
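
To illustrate the "tokens are pointers to Lexeme structs" idea, here is a minimal pure-Python sketch. It is not spaCy's implementation or API: the `Lexeme` fields, the `Vocab` class, and the `tokenize` function are all hypothetical names chosen for illustration, and whitespace splitting stands in for the real tokenization rules. The point it shows is that per-word features are computed once per distinct string, and every occurrence of that string in a text refers to the same shared record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    """Hypothetical record of per-word-type features, computed once."""
    text: str
    lower: str
    length: int
    is_alpha: bool


class Vocab:
    """Hypothetical cache mapping each distinct string to one Lexeme."""

    def __init__(self):
        self._lexemes = {}

    def get(self, text):
        # Build the Lexeme the first time a string is seen, then reuse it.
        lex = self._lexemes.get(text)
        if lex is None:
            lex = Lexeme(text, text.lower(), len(text), text.isalpha())
            self._lexemes[text] = lex
        return lex

    def __len__(self):
        return len(self._lexemes)


def tokenize(vocab, text):
    # Assumption: whitespace splitting stands in for real tokenization rules.
    # Each token is a reference to a shared Lexeme, not a fresh copy.
    return [vocab.get(chunk) for chunk in text.split()]


vocab = Vocab()
tokens = tokenize(vocab, "the cat saw the dog")
```

In this sketch both occurrences of "the" are the same object (`tokens[0] is tokens[3]`), so the cost of computing features is paid per word type rather than per token.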