98 Commits

Author SHA1 Message Date
ines d96e72f656 Tidy up rest 2017-10-27 21:07:59 +02:00
ines c0b55ebdac Fix PhraseMatcher.__contains__ and add more tests 2017-10-25 16:31:11 +02:00
ines 91beacf5e3 Fix Matcher.__contains__ 2017-10-25 16:19:38 +02:00
ines 4d97efc3b5 Add missing docstrings 2017-10-25 12:10:16 +02:00
ines 1262aa0bf9 Implement PhraseMatcher.__contains__ 2017-10-25 12:10:04 +02:00
ines 9c733a8849 Implement PhraseMatcher.__len__ 2017-10-25 12:09:56 +02:00
ines 7eebeeaf85 Fix Matcher.__contains__ 2017-10-25 12:09:47 +02:00
ines 7bcec57462 Remove unused attribute 2017-10-25 12:08:54 +02:00
Matthew Honnibal 4bea65a1a8 Fix Issue #1450: Off-by-1 in * and ? matches
Patterns that end in variable-length operators e.g. * and ? now end on
the correct token. Previously, they were off by 1: the next token was
pulled into the match, even if that's where the pattern failed.
2017-10-24 14:26:27 +02:00
Matthew Honnibal d8391b1c4d Fix #1434: Matcher failed on ending ? if no token 2017-10-20 16:49:36 +02:00
Matthew Honnibal 56aa42cc5d Fix and document matcher operator 'shadowing' behaviour 2017-10-16 13:38:20 +02:00
Matthew Honnibal 0433181658 Document operator semantics in Matcher docstring 2017-10-16 12:06:33 +02:00
Matthew Honnibal 2534cd57d7 Add bandaid solution to the 'shadowing' problem in #864 2017-10-09 08:59:35 +02:00
Matthew Honnibal 3b67eabfea Allow empty dictionaries to match any token in Matcher
Often patterns need to match "any token". A clean way to denote this
is with the empty dict {}: this sets no constraints on the token,
so should always match.

The problem was that having attributes length==0 was used as an
end-of-array signal, so the matcher didn't handle this case correctly.

This patch compiles empty token spec dicts into a constraint
NULL_ATTR==0. The NULL_ATTR attribute, 0, is always set to 0 on the
lexeme -- so this always matches.
2017-10-07 03:36:15 +02:00
Matthew Honnibal 19c7c09bf7 Fix PhraseMatcher.__contains__ 2017-09-26 08:35:53 -05:00
Ines Montani 7123139b2b Add __contains__ to PhraseMatcher 2017-09-26 13:13:27 +02:00
Ines Montani 50ad50f96a Update matcher.pyx 2017-09-26 13:11:17 +02:00
Matthew Honnibal 842e21de9f Fix int type error for Python 2 2017-09-20 23:55:30 +02:00
Matthew Honnibal 0c93c73e49 Add __reduce__ method for PhraseMatcher 2017-09-20 22:26:40 +02:00
Matthew Honnibal cc408fc189 Make PhraseMatcher API like Matcher API 2017-09-20 22:20:35 +02:00
Matthew Honnibal 828cc91545 Fix PhraseMatcher for spaCy 2 2017-09-20 21:54:31 +02:00
Matthew Honnibal fe11564b8e Finish stringstore change. Also xfail vectors tests 2017-05-28 15:10:22 +02:00
Matthew Honnibal e27262f431 Go back to previous matcher signature, with on_match positional 2017-05-23 04:37:40 -05:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
Matthew Honnibal 187f370734 Update tests for matcher changes 2017-05-22 12:59:50 +02:00
ines 4ed6a36622 Update docstrings and API docs for Matcher 2017-05-20 14:43:10 +02:00
ines 39f36539f6 Update docstrings and API docs for Matcher 2017-05-20 14:32:34 +02:00
ines c00ff257be Update docstrings and API docs for Matcher 2017-05-20 14:26:10 +02:00
ines 790435e51c Update docstrings 2017-05-20 14:05:07 +02:00
Matthew Honnibal ce9234f593 Update Matcher API 2017-05-20 13:54:53 +02:00
ines 1d4d3d0ecd Add TODO 2017-05-20 01:38:04 +02:00
ines fe5d8819ea Update Matcher docstrings and API docs 2017-05-19 21:47:06 +02:00
ines e1efd589c3 Fix json imports and use ujson 2017-04-15 12:13:34 +02:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
Matthew Honnibal 725249c59a Add merge_phrase callback in matcher.pyx 2017-03-31 13:58:59 +02:00
Raphaël Bournhonesque f332bf05be Remove unused import statements 2017-03-21 21:08:54 +01:00
Matthew Honnibal 8f94897d07 Add 1 operator to matcher, and make sure open patterns are closed at end of document. Closes Issue #766 2017-02-24 14:27:02 +01:00
Dmytro Sadovnychyi e70a7050e1 Remove duplicated line of vocab declaration
As already declared on line 211.
2016-11-13 18:52:49 +08:00
Dmitry Sadovnychyi 9488222e79 Fix PhraseMatcher to work with updated Matcher
#613
2016-11-09 00:14:26 +08:00
Matthew Honnibal 5cd3acb265 Fix #605: Acceptor now rejects matches as expected. 2016-11-06 10:50:42 +01:00
Matthew Honnibal f1605df2ec Fix #588: Matcher should reject empty pattern. 2016-11-03 00:16:44 +01:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal d563f1eadb Fix Issue #587: Segfault in Matcher, due to simple error in the state machine. 2016-10-28 17:42:00 +02:00
Matthew Honnibal 2e92c6fb3a Fix JSON encoding issue on load 2016-10-20 21:06:48 +02:00
Matthew Honnibal f189a3cb00 Fix encoding when opening files in Python 2.7, re Issue #539 2016-10-20 14:42:56 +02:00
Matthew Honnibal 05e2a589a4 Fix None label in matcher 2016-10-18 18:05:21 +02:00
Matthew Honnibal 9258db788a Revert "Have the matcher return character offsets, to handle the match better."
This reverts commit 049c937540.
2016-10-17 16:49:51 +02:00