77 Commits

Author SHA1 Message Date
Matthew Honnibal fe11564b8e Finish stringstore change. Also xfail vectors tests 2017-05-28 15:10:22 +02:00
Matthew Honnibal e27262f431 Go back to previous matcher signature, with on_match positional 2017-05-23 04:37:40 -05:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency"" (This reverts commit 532afef4a8.) 2017-05-23 03:06:53 -05:00
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency" (This reverts commit bdaac7ab44.) 2017-05-23 03:05:25 -05:00
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
Matthew Honnibal 187f370734 Update tests for matcher changes 2017-05-22 12:59:50 +02:00
ines 4ed6a36622 Update docstrings and API docs for Matcher 2017-05-20 14:43:10 +02:00
ines 39f36539f6 Update docstrings and API docs for Matcher 2017-05-20 14:32:34 +02:00
ines c00ff257be Update docstrings and API docs for Matcher 2017-05-20 14:26:10 +02:00
ines 790435e51c Update docstrings 2017-05-20 14:05:07 +02:00
Matthew Honnibal ce9234f593 Update Matcher API 2017-05-20 13:54:53 +02:00
ines 1d4d3d0ecd Add TODO 2017-05-20 01:38:04 +02:00
ines fe5d8819ea Update Matcher docstrings and API docs 2017-05-19 21:47:06 +02:00
ines e1efd589c3 Fix json imports and use ujson 2017-04-15 12:13:34 +02:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
Matthew Honnibal 725249c59a Add merge_phrase callback in matcher.pyx 2017-03-31 13:58:59 +02:00
Raphaël Bournhonesque f332bf05be Remove unused import statements 2017-03-21 21:08:54 +01:00
Matthew Honnibal 8f94897d07 Add 1 operator to matcher, and make sure open patterns are closed at end of document. Closes Issue #766 2017-02-24 14:27:02 +01:00
Dmytro Sadovnychyi e70a7050e1 Remove duplicated line of vocab declaration (As already declared on line 211.) 2016-11-13 18:52:49 +08:00
Dmitry Sadovnychyi 9488222e79 Fix PhraseMatcher to work with updated Matcher (#613) 2016-11-09 00:14:26 +08:00
Matthew Honnibal 5cd3acb265 Fix #605: Acceptor now rejects matches as expected. 2016-11-06 10:50:42 +01:00
Matthew Honnibal f1605df2ec Fix #588: Matcher should reject empty pattern. 2016-11-03 00:16:44 +01:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal d563f1eadb Fix Issue #587: Segfault in Matcher, due to simple error in the state machine. 2016-10-28 17:42:00 +02:00
Matthew Honnibal 2e92c6fb3a Fix JSON encoding issue on load 2016-10-20 21:06:48 +02:00
Matthew Honnibal f189a3cb00 Fix encoding when opening files in Python 2.7, re Issue #539 2016-10-20 14:42:56 +02:00
Matthew Honnibal 05e2a589a4 Fix None label in matcher 2016-10-18 18:05:21 +02:00
Matthew Honnibal 9258db788a Revert "Have the matcher return character offsets, to handle the match better." (This reverts commit 049c937540.) 2016-10-17 16:49:51 +02:00
Matthew Honnibal 2fd97c71cc Revert "Don't try to pickle matcher." (This reverts commit 97bd0c9d00.) 2016-10-17 16:49:43 +02:00
Matthew Honnibal 97bd0c9d00 Don't try to pickle matcher. 2016-10-17 16:38:40 +02:00
Matthew Honnibal 049c937540 Have the matcher return character offsets, to handle the match better. 2016-10-17 15:58:57 +02:00
Matthew Honnibal 6cbdc94959 Lots of updates to Matcher, to make entity handling sane. 2016-10-17 15:23:31 +02:00
Matthew Honnibal 90baa9c7e6 Revert "Changes to matcher.pyx for new StringStore scheme" (This reverts commit 3ff09614e0.) 2016-09-30 20:20:13 +02:00
Matthew Honnibal 3ff09614e0 Changes to matcher.pyx for new StringStore scheme 2016-09-30 19:56:48 +02:00
Matthew Honnibal fd65cf6cbb Finish refactoring data loading 2016-09-24 20:26:17 +02:00
Matthew Honnibal 83e364188c Mostly finished loading refactoring. Design is in place, but doesn't work yet. 2016-09-24 15:42:01 +02:00
Matthew Honnibal eaf4065480 Expose the _patterns private member 2016-09-24 11:20:42 +02:00
Matthew Honnibal 55f1f7edaf Don't automatically write new entities into the Doc in the Matcher. This fixes a long-standing wart, but introduces a *backwards incompatibility.* 2016-09-24 01:16:45 +02:00
Matthew Honnibal 58e83fe34b Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match. 2016-09-21 14:54:55 +02:00
Matthew Honnibal 67ce96c9c9 * Make patterns argument to Matcher class optional 2016-04-17 21:32:24 +02:00
Wolfgang Seeker e6945c4d0e bugfix: uppercase attr values before looking them up 2016-04-15 15:46:31 +02:00
Matthew Honnibal 108aca0e50 * Make Matcher use attrs from the attrs.pyx file, rather than having an incomplete function doing the mapping. 2016-04-14 10:37:39 +02:00
Matthew Honnibal 7119e77fb6 * Fix Matcher.pipe 2016-02-05 19:46:02 +01:00
Matthew Honnibal 1ef84a0557 * Merge master into rethinc2 2016-02-05 12:55:59 +01:00
Matthew Honnibal 9703ccc3de * Remove unused import 2016-02-04 13:04:33 +01:00
Matthew Honnibal 84b247ef83 * Add a .pipe method, that takes a stream of input, operates on it, and streams the output. Internally, the stream may be buffered, to allow multi-threading. 2016-02-03 02:10:58 +01:00
Henning Peters 235f094534 untangle data_path/via 2016-01-16 12:23:45 +01:00
Henning Peters 846fa49b2a distinct load() and from_package() methods 2016-01-16 10:00:57 +01:00
Henning Peters 788f734513 refactored data_dir->via, add zip_safe, add spacy.load() 2016-01-15 18:01:02 +01:00