Commit Graph

57 Commits

Author SHA1 Message Date
adrianeboyd d359da9687 Replace Entity/MatchStruct with SpanC (#4459)
* Replace MatchStruct with Entity

Replace MatchStruct with Entity since the existing Entity struct is
nearly identical.

* Replace Entity with more general SpanC
2019-10-18 11:01:47 +02:00
Matthew Honnibal a5b1f6dcec Fix NER when preset entities cross sentence boundaries (#3379)
💫 Fix NER when preset entities cross sentence boundaries
2019-03-10 14:53:03 +01:00
Matthew Honnibal 8cd06cc763 Try to fix root-outside-sentence bug 2018-05-02 14:39:48 +00:00
Matthew Honnibal 1f7229f40f Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit c9ba3d3c2d, reversing
changes made to 92c26a35d4.
2018-03-27 19:23:02 +02:00
Matthew Honnibal a0c7dabb72 Fix bug in 8-token parser features 2017-10-28 23:01:35 +00:00
Matthew Honnibal dd5b2d8fa3 Check for out-of-memory when calling calloc. Closes #1446 2017-10-24 12:40:47 +02:00
Matthew Honnibal e938bce320 Adjust parsing transition system to allow preset sentence segments. 2017-10-08 23:53:34 +02:00
Matthew Honnibal 5c750a9c2f Reserve 0 for 'missing' in history features 2017-10-06 06:10:13 -05:00
Matthew Honnibal b0618def8d Add support for 2-token state option 2017-10-05 21:54:12 -05:00
Matthew Honnibal ee41e4fea7 Support history features in stateclass 2017-10-03 12:43:48 +02:00
Matthew Honnibal 664c5af745 Revert padding in parser 2017-09-14 16:59:25 +02:00
Matthew Honnibal c6395b057a Improve parser feature extraction, for missing values 2017-09-14 16:18:02 +02:00
Matthew Honnibal c307a0ffb8 Restore patches from nn-beam-parser to spacy/syntax 2017-08-18 22:38:59 +02:00
Matthew Honnibal 5f81d700ff Restore patches from nn-beam-parser to spacy/syntax 2017-08-18 22:23:03 +02:00
Matthew Honnibal 426f84937f Resolve conflicts when merging new beam parsing stuff 2017-08-18 13:38:32 -05:00
Matthew Honnibal 3533bb61cb Add option of 8 feature parse state 2017-08-16 18:23:27 -05:00
Matthew Honnibal 52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal d4308d2363 Initialize State offset to 0 2017-08-12 17:14:39 -05:00
Matthew Honnibal 53a3824334 Fix mistake in ner feature 2017-05-31 03:01:02 +02:00
Matthew Honnibal cc911feab2 Fix bug in NER state 2017-05-30 22:12:19 +02:00
Matthew Honnibal 7996d21717 Fixes for new StringStore 2017-05-28 11:09:27 -05:00
Matthew Honnibal 655ca58c16 Clarifying change to StateC.clone 2017-05-27 15:49:37 -05:00
Matthew Honnibal 3d5a536eaa Improve efficiency of parser batching 2017-05-26 11:31:23 -05:00
Matthew Honnibal 8a9e318deb Put the parsing loop in a nogil prange block 2017-05-22 17:58:12 -05:00
Matthew Honnibal a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
Matthew Honnibal c79b3129e3 Fix setting of empty lexeme in initial parse state 2017-03-15 09:26:53 -05:00
Matthew Honnibal d59c6926c1 I think this fixes the segfault 2017-03-11 06:58:34 -06:00
Matthew Honnibal 318b9e32ff WIP on beam parser. Currently segfaults. 2017-03-11 06:19:52 -06:00
Wolfgang Seeker dbf8f5f3ec fix bug in StateC.set_break() 2016-05-05 15:15:34 +02:00
Wolfgang Seeker 289b10f441 remove some comments 2016-04-14 15:37:51 +02:00
Wolfgang Seeker d99a9cbce9 different handling of space tokens
space tokens are now always attached to the previous non-space token
there are two exceptions:
leading space tokens are attached to the first following non-space token
in input that consists exclusively of space tokens, the last space token
is the head of all others.
2016-04-13 15:28:28 +02:00
Matthew Honnibal 1b83cb9dfa * Fix Issue #251: Incorrect right edge calculation on left-clobber low in the tree 2016-02-07 00:00:42 +01:00
Matthew Honnibal 4412a70dc5 * Initialize StateC._empty_token to 0, to avoid undefined behaviour. 2016-02-06 13:34:38 +01:00
Matthew Honnibal e3db39dd21 * Fix compiler warning about signed/unsigned comparison 2016-02-01 09:08:07 +01:00
Matthew Honnibal a47f00901b * Pass a StateC pointer into the transition and validation methods in the parser, so that the GIL can be released over a batch of documents 2016-02-01 02:58:14 +01:00
Matthew Honnibal 7a0e3bb9c1 * Continue proxying. Some problem currently 2016-02-01 02:22:21 +01:00
Matthew Honnibal 2fa228458e * Add _state file, which StateClass will proxy to 2016-02-01 01:09:21 +01:00
Matthew Honnibal 7f9384f53c * Remove deprecated _state module 2015-06-23 17:28:24 +02:00
Matthew Honnibal 77d7e79c7e * Fix r/l and distance features. 2015-06-12 13:06:15 +02:00
Matthew Honnibal dd0867645d * Remove stray const from State header 2015-06-03 00:10:04 +02:00
Matthew Honnibal e09a08bd00 * Add copy_state function 2015-06-01 23:06:30 +02:00
Matthew Honnibal 9dfc9c039c * Work on constituency parsing. 2015-05-20 16:02:51 +02:00
Matthew Honnibal 03a6626545 * Tmp commit 2015-05-12 20:27:56 +02:00
Matthew Honnibal d2ac8d8007 * Add ctnt field to State, in preparation for constituency parsing 2015-05-12 20:27:56 +02:00
Matthew Honnibal a4e2af54f9 * Add support for l/r edge to add_dep, and move inlined methods into _state.pyx where possible 2015-05-12 20:26:41 +02:00
Matthew Honnibal a3af6b7c3d * Left-Arc from Root, to allow non-monotonic reduce to compete with left-arc when the stack is not empty. 2015-03-27 17:39:16 +01:00
Matthew Honnibal e181c051d5 * Improve features for NER 2015-03-26 16:44:44 +01:00
Matthew Honnibal 8057a95f20 * NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only use current word, 2 words context. Need refactoring. 2015-03-26 16:44:44 +01:00
Matthew Honnibal b3eda03c9c * Tmp 2015-03-26 16:44:44 +01:00
Matthew Honnibal b34a1325d3 * Everything compiling after reorg. About to start testing. 2014-12-21 05:42:23 +11:00