Commit Graph

2124 Commits

Author SHA1 Message Date
Matthew Honnibal 329ae57520 * Fix whitespace attachment thing 2015-10-13 09:46:38 +02:00
Matthew Honnibal 37919eac82 * Fix whitespace attachment in simpler way. Leaves problem with setting left/right children. 2015-10-13 18:23:24 +11:00
Matthew Honnibal 883ff1f59e * Fix test 2015-10-13 15:58:46 +11:00
Matthew Honnibal c70eb776ae * Fix whitespace attachment, so that left/right children are consistent with head. 2015-10-13 15:58:22 +11:00
Matthew Honnibal 63df729edd * Fix test 2015-10-13 15:48:15 +11:00
Matthew Honnibal 00ae3edd3a * Fix tests 2015-10-13 15:39:52 +11:00
Matthew Honnibal 531182f937 * Fix Model.__reduce__ 2015-10-13 15:14:38 +11:00
Matthew Honnibal 6c227a6c1f * Fix Model.__reduce__ 2015-10-13 15:10:04 +11:00
Matthew Honnibal f6d74b14de * Merge 2015-10-13 05:25:49 +02:00
Matthew Honnibal 59b792058d * Fix test_parse_navigate looking for test file in wrong place 2015-10-13 14:19:12 +11:00
Matthew Honnibal 358c82595c * Fix NAMES list in spacy/parts_of_speech.pyx 2015-10-13 14:18:45 +11:00
Matthew Honnibal c1fdc487bc Merge branch 'attrs' 2015-10-13 14:03:41 +11:00
Matthew Honnibal 41cbbdefe3 Merge branch 'attrs' 2015-10-13 05:03:25 +02:00
Matthew Honnibal 38109dd912 * Allow preshed v0.42 2015-10-13 13:56:23 +11:00
Matthew Honnibal d698aa546d Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-10-13 13:56:09 +11:00
Matthew Honnibal 1ca1beff4b * Allow preshed v0.42 in setup.py 2015-10-13 13:55:50 +11:00
Matthew Honnibal 404e484276 * Fix prag_sbd tests 2015-10-13 04:54:15 +02:00
Matthew Honnibal b866f1443e Merge branch 'master' of https://github.com/honnibal/spaCy into attrs 2015-10-13 04:52:27 +02:00
Matthew Honnibal 6c2da06c18 * Package tag_map.json 2015-10-13 13:52:10 +11:00
Matthew Honnibal e886e6a406 * Inc version 2015-10-13 13:46:17 +11:00
Matthew Honnibal 20fd36a0f7 * Very scrappy, likely buggy first-cut pickle implementation, to work on Issue #125: allow pickle for Apache Spark. The current implementation sends stuff to temp files, and does almost nothing to ensure all modifiable state is actually preserved. The Language() instance is a deep tree of extension objects, and if pickling during training, some of the C-data state is hard to preserve. 2015-10-13 13:44:41 +11:00
Matthew Honnibal f8de403483 * Work on pickling Vocab instances. The current implementation is not correct, but it may serve to see whether this approach is workable. Pickling is necessary to address Issue #125 2015-10-13 13:44:41 +11:00
Matthew Honnibal 85e7944572 * Start trying to pickle Vocab 2015-10-13 13:44:41 +11:00
Matthew Honnibal 5ca57bd859 * Ensure Morphology can be pickled, to address Issue #125. 2015-10-13 13:44:41 +11:00
Matthew Honnibal dfe0ad51ff * Add pickle test for lemmatizer 2015-10-13 13:44:41 +11:00
Matthew Honnibal 0cee928467 * Allow StringStore to be pickled, to start addressing Issue #125 2015-10-13 13:44:41 +11:00
Matthew Honnibal 41012907a8 * Fix variable name 2015-10-13 13:44:40 +11:00
Matthew Honnibal e70368d157 * Use lower case strings for dependency label names in symbols enum 2015-10-13 13:44:40 +11:00
Matthew Honnibal 7b4af3d1e7 * Fix parts_of_speech now that symbols list has been reformed 2015-10-13 13:44:40 +11:00
Matthew Honnibal 37b909b6b6 * Use the symbols file in vocab instead of the symbols subfiles like attrs.pxd 2015-10-13 13:44:40 +11:00
Matthew Honnibal ce65ec698c * Remove qualified naming in symbols 2015-10-13 13:44:40 +11:00
Matthew Honnibal 9f4be0adcd * Map NO_TAG to NIL in parts_of_speech.pxd 2015-10-13 13:44:40 +11:00
Matthew Honnibal 278e12f7e8 * Add morphology symbols to morphology. May need to remove these as an enum. 2015-10-13 13:44:40 +11:00
Matthew Honnibal d80067eda1 * Map empty string to NULL_ATTR in attrs 2015-10-13 13:44:40 +11:00
Matthew Honnibal fd204d3cd5 * Map NIL to empty string in tag map 2015-10-13 13:44:40 +11:00
Matthew Honnibal d70e8cac2c * Fix empty values in attributes and parts of speech, so symbols align correctly with the StringStore 2015-10-13 13:44:40 +11:00
Matthew Honnibal ce3e306376 * Allow SPACY_DATA environment variable in website tests 2015-10-13 13:44:40 +11:00
Matthew Honnibal a29c8ee23d * Add symbols to the vocab before reading the strings, so that they line up correctly 2015-10-13 13:44:39 +11:00
Matthew Honnibal 74c0853471 * Rename ATTR_IDS to attrs.IDS. Rename ATTR_NAMES to attrs.NAMES. Rename UNIV_POS_IDS to parts_of_speech.IDS 2015-10-13 13:44:39 +11:00
Matthew Honnibal 10a4a843ea * Enumerate all symbols in one file 2015-10-13 13:44:39 +11:00
Matthew Honnibal 5c24ad3f5c * Whitespace 2015-10-13 13:44:39 +11:00
Matthew Honnibal 85ce36ab11 * Refactor symbols, so that frequency rank can be derived from the orth id of a word. 2015-10-13 13:44:39 +11:00
Matthew Honnibal 3b79d67462 * Fix assertion in test_basic_create 2015-10-12 00:48:18 +11:00
Matthew Honnibal afec8cac20 * Add more tests to probe mingw32 failure 2015-10-11 22:40:04 +11:00
Matthew Honnibal dba1daf597 * Add script to test loading different components 2015-10-11 19:46:53 +11:00
Matthew Honnibal 92f750cf8b * Use a gzipped frequencies file in init_model 2015-10-11 06:59:44 +02:00
Matthew Honnibal cc92f3f0ed * Fix Matcher test 2015-10-11 14:59:12 +11:00
Matthew Honnibal 1f8f81f0c8 * Fix missing import 2015-10-11 14:38:21 +11:00
Matthew Honnibal 693dd06547 * Add basic, non-data dependent class creation tests, without depending on pytest. For use in debugging MS build issues, for Issue #132 2015-10-11 14:29:12 +11:00
Matthew Honnibal 0090f79fbd * Use lower case strings for dependency label names in symbols enum 2015-10-10 22:59:14 +11:00