spaCy

Matthew Honnibal f07457a91f * Remove POS alignment stuff. Now use training data based on raw text, instead of clumsy detokenization stuff		2014-11-04 01:06:43 +11:00
__init__.pxd	* Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags.	2014-10-24 02:23:42 +11:00
__init__.py	* Basic punct tests updated and passing	2014-08-27 19:38:57 +02:00
en.pxd	* Large refactor, particularly to Python API	2014-10-24 00:59:17 +11:00
en.pyx	* Tighten interfaces	2014-10-30 18:14:42 +11:00
lang.pxd	* Add pos_tag method to Language	2014-11-02 14:21:43 +11:00
lang.pyx	* Add pos_tag method to Language	2014-11-02 14:21:43 +11:00
lexeme.pxd	* Remove vocab10k field, and add flags for gazetteers	2014-11-03 00:13:51 +11:00
lexeme.pyx	* Remove vocab10k field, and add flags for gazetteers	2014-11-03 00:13:51 +11:00
orth.py	* Remove non_sparse method --- features wanting this can do it easily enough.	2014-11-03 00:15:47 +11:00
pos.pxd	* Add count_tags function to pos.pyx, which should probably live in another file. Feature set achieves 97.9 on wsj19-21, 95.85 on onto web.	2014-10-31 17:42:15 +11:00
pos.pyx	* Fiddle with POS tag features	2014-11-03 00:15:03 +11:00
pos_util.py	* Remove POS alignment stuff. Now use training data based on raw text, instead of clumsy detokenization stuff	2014-11-04 01:06:43 +11:00
tokens.pxd	* Remove vocab10k from tokens	2014-11-03 00:23:20 +11:00
tokens.pyx	* Remove vocab10k from tokens	2014-11-03 00:23:20 +11:00
typedefs.pxd	* Fiddle with data types on Lexeme, to compress them to a much smaller size.	2014-10-30 15:42:15 +11:00
utf8string.pxd	* Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags.	2014-10-24 02:23:42 +11:00
utf8string.pyx	* Fix strings i/o, removing use of ujson library in favour of plain text file. Allows better control of codecs.	2014-11-02 13:20:37 +11:00
util.py	* Tighten interfaces	2014-10-30 18:14:42 +11:00