1840 Commits

Author SHA1 Message Date
Matthew Honnibal e7af6b937f Fix syntax error while fixing doc strings 2016-11-01 13:27:32 +01:00
Matthew Honnibal 62fc6b1afa Use 32 bit hashes for OOV, re Issue #589, Issue #285 2016-11-01 13:27:13 +01:00
Matthew Honnibal 6977a2b8cd Add test for Issue #589 2016-11-01 12:33:36 +01:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal d563f1eadb Fix Issue #587: Segfault in Matcher, due to simple error in the state machine. 2016-10-28 17:42:00 +02:00
Matthew Honnibal 7e5f63a595 Improve test slightly 2016-10-28 17:41:16 +02:00
Matthew Honnibal 782e4814f4 Test Issue #587: Matcher segfaults on particular input 2016-10-28 16:38:32 +02:00
Matthew Honnibal 708ea22208 Infer types in transition_system.pyx 2016-10-27 18:08:13 +02:00
Matthew Honnibal 18590eba94 Fix training evaluate method 2016-10-27 18:02:19 +02:00
Matthew Honnibal 301f3cc898 Fix Issue #429. Add an initialize_state method to the named entity recogniser that adds missing entity types. This is a messy place to add this, because it's strange to have the method mutate state. A better home for this logic could be found. 2016-10-27 18:01:55 +02:00
Matthew Honnibal afea6505f3 Test Issue 429: No valid actions for NER after matcher adds a new entity label. 2016-10-27 18:01:34 +02:00
Matthew Honnibal 03a520ec4f Change signature of Parser.parseC, so that nr_class is read from the transition system. This allows the transition system to modify the number of actions in initialize_state. 2016-10-27 17:58:56 +02:00
Matthew Honnibal 6c47048912 Fix test, after IOB tweak. 2016-10-26 17:22:03 +02:00
Matthew Honnibal 4ca31b4d87 Fix clobbering of 'missing' named ent values after assigning ents. 2016-10-26 13:13:56 +02:00
Matthew Honnibal cb49189477 Remove dead code 2016-10-26 13:11:07 +02:00
Matthew Honnibal a209b10579 Improve error message when oracle fails for non-projective trees, re Issue #571. 2016-10-24 20:31:30 +02:00
Matthew Honnibal b2d43b93d2 Fix Python 3 basestring error 2016-10-24 14:22:51 +02:00
Matthew Honnibal 276478fe0f Update strings.pxd 2016-10-24 14:00:35 +02:00
Matthew Honnibal d8134817ff Workaround Issue #285: Allow the StringStore to be 'frozen', in which case strings will be pushed into an OOV map. We can then flush this OOV map, freeing all of the OOV strings. 2016-10-24 13:49:03 +02:00
Matthew Honnibal d3a617aa99 Test workaround for Issue #285: Streaming data memory growth 2016-10-24 13:48:06 +02:00
Matthew Honnibal 64e5f02cf7 Update test 2016-10-23 21:08:07 +02:00
Matthew Honnibal 66d7a6eca2 Update test 2016-10-23 21:02:05 +02:00
Matthew Honnibal 90bf797125 Update test 2016-10-23 20:54:17 +02:00
Matthew Honnibal 5e76320ffe Update test 2016-10-23 20:44:54 +02:00
Matthew Honnibal aa105927f3 Update test 2016-10-23 20:31:25 +02:00
Matthew Honnibal 6b9237aa83 Increment version 2016-10-23 20:22:53 +02:00
Matthew Honnibal 150e02d72e Fix Issue #566 2016-10-23 20:19:01 +02:00
Matthew Honnibal e120561294 Fix vector_norm test. 2016-10-23 19:56:16 +02:00
Matthew Honnibal fefde8aef8 Make installation print data path. 2016-10-23 19:46:44 +02:00
Matthew Honnibal e7414cd064 Try to fix weird install glitch. 2016-10-23 19:46:28 +02:00
Matthew Honnibal 90f7544edd Increment version 2016-10-23 19:43:06 +02:00
Matthew Honnibal 6036ec7c77 Fix vector norm when loading lexemes. 2016-10-23 19:40:18 +02:00
Matthew Honnibal c05cd2356e Fix similarity test for Python 3 2016-10-23 18:16:56 +02:00
Matthew Honnibal 3e688e6d4b Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness. 2016-10-23 17:45:44 +02:00
Matthew Honnibal 79aa03fe98 Test Issue #514: Serializer fails when new entity type has been added. 2016-10-23 17:41:44 +02:00
Matthew Honnibal f97548c6f1 Fix broken test, re Issue #461 2016-10-23 17:02:23 +02:00
Matthew Honnibal 4de30a8e38 Test Issue #514: Serialization fails after adding a new entity label. 2016-10-23 16:40:27 +02:00
Matthew Honnibal 936e6246aa Fix Issue #459 -- failed to deserialize empty doc. 2016-10-23 16:31:05 +02:00
Matthew Honnibal e99b3f5322 Test Issue #459: Fail to deserialize empty doc 2016-10-23 16:30:22 +02:00
Matthew Honnibal 49c117960c Fix bug where huffman codec died if given empty freqs dict. 2016-10-23 16:28:05 +02:00
Matthew Honnibal 99ff8b902f Test that huffman codec works with empty freqs dict 2016-10-23 16:27:45 +02:00
Matthew Honnibal 15c9b59f0e Fix Issue #461: O tag was being clobbered by doc.ents.__set__ 2016-10-23 15:50:26 +02:00
Matthew Honnibal e5627134d9 Test Issue #461: ent_iob tag incorrect after setting entities. 2016-10-23 15:50:04 +02:00
Matthew Honnibal f62088d646 Fix compile error 2016-10-23 14:50:50 +02:00
Matthew Honnibal 2c3a67b693 Fix calculation of vector norm, re Issue #522. Need to consolidate the calculations into a helper function. 2016-10-23 14:49:31 +02:00
Matthew Honnibal a0a4ada42a Fix calculation of L2-norm for Lexeme 2016-10-23 14:44:45 +02:00
Matthew Honnibal 2989072aac Add tests to verify that Issue #442 is fixed in 1.1 2016-10-23 14:33:13 +02:00
Matthew Honnibal 739213a8af Fix create_pipeline keyword argument. 2016-10-23 14:24:16 +02:00
Matthew Honnibal bea44bd3c4 Fix vector_norm when vector is assigned to Lexeme. 2016-10-23 14:23:56 +02:00
Matthew Honnibal e838b6d53f Add tests for using the new Entity ID tracking in the rule matcher 2016-10-23 14:04:01 +02:00