1853 Commits

Author SHA1 Message Date
Matthew Honnibal 9e7bfe8449 * Fix space at end of merged token 2015-09-10 14:45:17 +02:00
Matthew Honnibal f634191e27 * Fix vocab read/write 2015-09-10 14:44:38 +02:00
Matthew Honnibal 31ccf494e6 Merge branch 'develop' of https://github.com/honnibal/spaCy into develop 2015-09-09 14:33:38 +02:00
Matthew Honnibal a7f4b26c8c * Tmp 2015-09-09 14:33:26 +02:00
Matthew Honnibal 07686470a9 * Don't consider a coordinated NP a base chunk 2015-09-09 14:32:28 +02:00
Matthew Honnibal d9f1fc2112 * Add deprecation warning for unused load_vectors argument. 2015-09-09 14:31:09 +02:00
Matthew Honnibal 0b527fbdc8 * Set POS tag in morphology 2015-09-09 14:30:24 +02:00
Matthew Honnibal 07c09a0e1b * Fix attribute getters and setters in Lexeme 2015-09-09 14:29:22 +02:00
Matthew Honnibal d6561988cf * Fix lexemes.bin 2015-09-09 11:49:51 +02:00
Matthew Honnibal c301bebd33 Merge branch 'master' of https://github.com/honnibal/spaCy into develop 2015-09-09 10:55:39 +02:00
Matthew Honnibal 0e24d099a1 * Fix L/R edge bug, by ensuring l_edge and r_edge are preset, and fixing the way the edge update in del_arc. Bugs keep arising here because the edges are absolute positions, where everything else is relative. I'm also not 100% convinced that del_arc is handled correctly. Do we need to update the parents? 2015-09-09 03:40:44 +02:00
Matthew Honnibal 83d1a1e512 * Fix lemmatizer tests 2015-09-08 15:39:43 +02:00
Matthew Honnibal 2be3620333 * Save morphological analyses in a cache 2015-09-08 15:39:24 +02:00
Matthew Honnibal 1def5a6cbe * Fix print statements in matcher 2015-09-08 15:38:19 +02:00
Matthew Honnibal 64d71f8893 * Fix lemmatizer 2015-09-08 15:38:03 +02:00
Matthew Honnibal b2e82e55f6 * Create POS model dir in training script 2015-09-08 15:36:23 +02:00
Matthew Honnibal 623329b19a Merge branch 'master' of ssh://github.com/honnibal/spaCy into develop 2015-09-08 14:27:01 +02:00
Matthew Honnibal 62a01dd41d * Fix issue #92: lexemes.bin read error on 32-bit platforms. 2015-09-08 14:23:58 +02:00
Matthew Honnibal 55ed3b3a63 Merge pull request #85 from NSchrading/master: Add a script to generate the specials.json file 2015-09-07 09:05:19 +10:00
Matthew Honnibal ef58607a99 * Add spacy.it 2015-09-06 22:10:37 +02:00
Matthew Honnibal 2154a54f6b * Add spacy.de 2015-09-06 21:56:47 +02:00
Matthew Honnibal a03e2a0b65 * Remove old docs files 2015-09-06 20:20:55 +02:00
Matthew Honnibal fc8f7b123d * Mark a matcher test as requiring the model 2015-09-06 20:19:51 +02:00
Matthew Honnibal f6ec5bf1b0 * Use empty tag map in vocab if none supplied 2015-09-06 20:19:27 +02:00
Matthew Honnibal 4f8e38271d * Fix merge errors in lexeme.pxd 2015-09-06 20:19:08 +02:00
Matthew Honnibal 5ad4527c42 * Rename Deutsch to German 2015-09-06 20:18:58 +02:00
Matthew Honnibal 86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal d2fc104a26 * Begin merge of Gazetteer and DE branches 2015-09-06 19:45:15 +02:00
Matthew Honnibal dbf8dce109 Merge branch 'gaz' of ssh://github.com/honnibal/spaCy into gaz 2015-09-06 18:44:14 +02:00
Matthew Honnibal 577418986a * Add draft Italian stuff 2015-09-06 18:44:10 +02:00
Matthew Honnibal 80a66c0159 * Add draft finnish stuff 2015-09-06 18:43:44 +02:00
Matthew Honnibal b3703836f9 * Add en lemma rules 2015-09-06 17:56:11 +02:00
Matthew Honnibal 238b2f533b * Add lemma rules 2015-09-06 17:55:53 +02:00
Matthew Honnibal c9f2082e3c * Fix compilation error in en/tag_map.json 2015-09-06 17:54:51 +02:00
Matthew Honnibal 9eae9837c4 * Fix morphology look up 2015-09-06 17:53:39 +02:00
Matthew Honnibal 6427a3fcac * Temporarily import flag attributes in matcher 2015-09-06 17:53:12 +02:00
Matthew Honnibal 7cc56ada6e * Temporarily add py_set_flag attribute in Lexeme 2015-09-06 17:52:51 +02:00
Matthew Honnibal e35bb36be7 * Ensure Lexeme.check_flag returns a boolean value 2015-09-06 17:52:32 +02:00
Matthew Honnibal d1eea2d865 * Update train.py for language-generic spaCy 2015-09-06 17:51:48 +02:00
Matthew Honnibal 950ce36660 * Update init model 2015-09-06 17:51:30 +02:00
Matthew Honnibal 4f765eee79 Merge branch 'gaz' of https://github.com/honnibal/spaCy into gaz 2015-09-06 14:07:43 +02:00
Matthew Honnibal 7e4fea67d3 * Fix bug in token subtree, introduced by duplication of L/R code in Stateclass. Need to consolidate the two methods. 2015-09-06 10:48:36 +02:00
Matthew Honnibal 571b6eda88 * Upd tests 2015-09-06 05:40:10 +02:00
Matthew Honnibal 5edac11225 * Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag. 2015-09-06 04:15:00 +02:00
Matthew Honnibal fd1eeb3102 * Add POS attribute support in get_attr 2015-09-06 04:13:03 +02:00
Matthew Honnibal 534e3dda3c * More work on language independent parsing 2015-08-28 03:44:54 +02:00
Matthew Honnibal c2307fa9ee * More work on language-generic parsing 2015-08-28 02:02:33 +02:00
Matthew Honnibal 86c4a8e3e2 * Work on new morphology organization 2015-08-27 23:11:51 +02:00
Matthew Honnibal 5b89e2454c * Improve error-reporting in tagger 2015-08-27 10:26:36 +02:00
Matthew Honnibal f0a7c99554 * Relax rule-requirement in lemmatizer 2015-08-27 10:26:19 +02:00