12 Commits

Author SHA1 Message Date
Matthew Honnibal 2348a08481 * Load/dump strings with a json file, instead of the hacky strings file we were using. 2015-10-22 21:13:03 +11:00
Matthew Honnibal 9baf0abd59 * Save vocab after training. 2015-10-22 21:09:14 +11:00
Matthew Honnibal 20fd36a0f7 * Very scrappy, likely buggy first-cut pickle implementation, to work on Issue #125: allow pickle for Apache Spark. The current implementation sends stuff to temp files, and does almost nothing to ensure all modifiable state is actually preserved. The Language() instance is a deep tree of extension objects, and if pickling during training, some of the C-data state is hard to preserve. 2015-10-13 13:44:41 +11:00
Matthew Honnibal a6ced80c0c * Fix Issue #116: Misleading handling of True value in Language.__init__. 2015-09-29 20:54:12 +10:00
Matthew Honnibal 27f988b167 * Remove the vectors option to Vocab, preferring to either load vectors from disk, or set them on the Lexeme objects. 2015-09-15 14:41:48 +10:00
Matthew Honnibal e13e47e9e5 * Add English stop words 2015-09-14 17:48:51 +10:00
Matthew Honnibal d9f1fc2112 * Add deprecation warning for unused load_vectors argument. 2015-09-09 14:31:09 +02:00
Matthew Honnibal 534e3dda3c * More work on language independent parsing 2015-08-28 03:44:54 +02:00
Matthew Honnibal c2307fa9ee * More work on language-generic parsing 2015-08-28 02:02:33 +02:00
Matthew Honnibal 0af139e183 * Tagger training now working. Still need to test load/save of model. Morphology still broken. 2015-08-27 09:16:11 +02:00
Matthew Honnibal 76996f4145 * Hack on generic Language class. Still needs work for morphology, defaults, etc 2015-08-26 19:16:09 +02:00
Matthew Honnibal f2f699ac18 * Add language base class 2015-08-25 15:37:17 +02:00