266 Commits

Author SHA1 Message Date
Matthew Honnibal 3b793cf4f7 * Tests passing for new Word object version 2014-08-24 18:13:53 +02:00
Matthew Honnibal 89d6faa9c9 * Move en_ptb to ptb3 2014-08-22 04:24:05 +02:00
Matthew Honnibal d42cdbb446 * Compile orthography.latin.pyx 2014-08-20 17:03:19 +02:00
Matthew Honnibal 01469b0888 * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. 2014-08-18 19:14:00 +02:00
Matthew Honnibal 865cacfaf7 * Remove dependence on murmurhash 2014-08-16 17:37:09 +02:00
Matthew Honnibal 7fd9b2f1f8 * Add murmurhash to setup while we figure out cython includes 2014-08-15 23:56:57 +02:00
Matthew Honnibal 365a2af756 * Restore happax. Commit uncommitted work 2014-08-02 21:27:03 +01:00
Matthew Honnibal 18fb76b2c4 * Removed happax. Not sure if good idea. 2014-08-02 20:53:35 +01:00
Matthew Honnibal d4b8bc07ce * Use FixedTable to control index size 2014-08-01 07:27:48 +01:00
Matthew Honnibal a235804730 * Fix setup.py 2014-07-31 02:03:53 +01:00
Matthew Honnibal 5461399924 * Fix setup.py 2014-07-31 02:03:10 +01:00
Matthew Honnibal b9016c4633 * Switch to using sparsehash and murmurhash libraries out of pip 2014-07-25 15:47:27 +01:00
Matthew Honnibal 1c5ab3b49a * Add tokens module to setup 2014-07-07 12:51:23 +02:00
Matthew Honnibal 648d1fe3ed * Compile en_ptb 2014-07-07 05:10:28 +02:00
Matthew Honnibal 0c1be7effe * Compile string_tools module 2014-07-07 04:24:00 +02:00
Matthew Honnibal ca7045f3f2 * Add build/setup stuff 2014-07-05 20:49:34 +02:00