💫 Industrial-strength Natural Language Processing (NLP) in Python

ai artificial-intelligence cython data-science deep-learning entity-linking machine-learning named-entity-recognition natural-language-processing neural-network neural-networks nlp nlp-library python spacy starred-explosion-repo starred-repo text-classification tokenization

Matthew Honnibal 64645a1c2f * Improve docstring on English		2015-02-11 15:13:20 -05:00
bin/parser	* Fix parser training script	2015-02-09 03:57:56 -05:00
docs	* Make corrections to example code	2015-02-07 08:45:09 -05:00
spacy	* Improve docstring on English	2015-02-11 15:13:20 -05:00
tests	Add tokenizer test for zero length string	2015-02-10 08:20:32 -05:00
.gitignore	* Upd gitignore	2015-01-30 16:49:44 +11:00
.travis.yml	* Upd travis.yml	2015-01-31 13:50:30 +11:00
LICENSE.txt	* Add license file	2015-01-26 03:07:18 +11:00
MANIFEST.in	* Add manifest file	2015-01-30 16:49:02 +11:00
README.md	Update README.md	2015-02-01 18:38:22 +11:00
dev_setup.py	* Upd dev_setup	2015-01-03 21:02:03 +11:00
fabfile.py	* Fix clean command	2015-01-25 14:49:29 +11:00
requirements.txt	* Require advanced version of cymem	2015-02-01 17:04:59 +11:00
setup.py	* Inc version	2015-02-11 14:20:57 -05:00
wordnet_license.txt	* Add WordNet license file	2015-02-01 16:11:53 +11:00

README.md

spaCy

http://honnibal.github.io/spaCy

Fast, state-of-the-art natural language processing pipeline. Commercial licenses available, or use under AGPL.

Version 0.40 released

2015-02-01

Several bug fixes have now been pushed to master.
Tests fail on some platforms, including Travis CI, due to memory errors.
Tests pass on my local OSX and Ubuntu machines (for Python 2.7 and Python 3.4).

The problem is likely due to non-portable usage of the Py_UNICODE data type in my Cython code, or possibly in the binary file formats of lexemes.bin, vec.bin, or the model file read by thinc.learner.LinearModel.
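To narrow down this kind of portability issue, it can help to print the platform facts that affect character width and binary layout on both a passing and a failing machine. The following is a minimal diagnostic sketch, not code from spaCy: the function names `unicode_build_info`, `pack_code_points`, and `unpack_code_points` are hypothetical, and the fixed-width little-endian layout is an assumed illustration of a platform-independent format, not the actual format of lexemes.bin or vec.bin.

```python
import ctypes
import struct
import sys


def unicode_build_info():
    """Collect platform facts relevant to character width and binary layout."""
    return {
        # 0xFFFF on narrow Python 2 builds; 0x10FFFF on wide builds and Python 3.3+.
        "max_code_point": sys.maxunicode,
        # Size of the C wchar_t type in bytes (typically 4 on Linux/OSX, 2 on Windows).
        "wchar_bytes": ctypes.sizeof(ctypes.c_wchar),
        # "little" or "big"; matters for files written with native byte order.
        "byte_order": sys.byteorder,
        # Pointer size in bytes, which distinguishes 32-bit from 64-bit builds.
        "pointer_bytes": struct.calcsize("P"),
    }


def pack_code_points(code_points):
    """Serialize code points as little-endian 32-bit unsigned integers."""
    return struct.pack("<%dI" % len(code_points), *code_points)


def unpack_code_points(data):
    """Inverse of pack_code_points: read little-endian 32-bit unsigned integers."""
    count = len(data) // 4
    return list(struct.unpack("<%dI" % count, data))


if __name__ == "__main__":
    for key, value in sorted(unicode_build_info().items()):
        print("%s: %s" % (key, value))
```

If `max_code_point` or `wchar_bytes` differs between the machines where tests pass and those where they fail, code that assumes a fixed character width is a plausible suspect. Writing explicit widths and byte order, as in `pack_code_points`, avoids depending on either value.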

I'm trying to reproduce the problem. Once this is fixed and the docs are updated, I will push version 0.4 to PyPI.

I have a flight from Sydney to New York in 24 hours, so this problem may remain unfixed for a few days.

Supports:

CPython 2.7
CPython 3.4
OSX
Linux

Want to support:

Windows

Difficult to support:

PyPy 2.7
PyPy 3.4