spaCy/spacy
Matthew Honnibal df01a88763 Merge branch 'refactor' (and serialization)
Add Huffman-code serialization, and do a lot of
refactoring. Highlights include:

* Much more efficient StringStore
* Vocab maintains a by-orth mapping of Lexemes
* Avoid manually slicing Py_UNICODE buffers,
  simplifying tokenizer and vocab C APIs
* Remove various bits of dead code
* Work on removing GIL around parser
* Work on bridge to Theano

Conflicts:
	spacy/strings.pxd
	spacy/strings.pyx
	spacy/structs.pxd
2015-07-23 02:18:35 +02:00
en
munge
serialize
syntax
tokens
__init__.pxd
__init__.py
_ml.pxd
_ml.pyx
_nn.py
_nn.pyx
_theano.pxd
_theano.pyx
attrs.pxd
attrs.pyx
cfile.pxd
cfile.pyx
gold.pxd
gold.pyx
lexeme.pxd
lexeme.pyx
morphology.pxd
morphology.pyx
multi_words.py
orth.pxd
orth.pyx
parts_of_speech.pxd
parts_of_speech.pyx
scorer.py
senses.pxd
senses.pyx
strings.pxd
strings.pyx
structs.pxd
tokenizer.pxd
tokenizer.pyx
typedefs.pxd
typedefs.pyx
util.py
vocab.pxd
vocab.pyx