351 Commits

Author SHA1 Message Date
Matthew Honnibal 094512fd47 Fix model-mark on regression test. 2017-10-25 14:44:00 +02:00
Matthew Honnibal 908809d488 Update tests 2017-10-24 17:05:15 +02:00
Matthew Honnibal 63f0bde749 Add test for #1250: Tokenizer cache clobbered special-case attrs 2017-10-24 16:07:18 +02:00
Matthew Honnibal 4bea65a1a8 Fix Issue #1450: Off-by-1 in * and ? matches
Patterns that end in variable-length operators e.g. * and ? now end on
the correct token. Previously, they were off by 1: the next token was
pulled into the match, even if that's where the pattern failed.
2017-10-24 14:26:27 +02:00
Matthew Honnibal 391d5ef0d1 Normalize imports in regression test 2017-10-24 14:25:49 +02:00
Matthew Honnibal b66b8f028b Fix #1375 -- out-of-bounds on token.nbor() 2017-10-24 12:10:39 +02:00
Matthew Honnibal a68d89a4f3 Add failing test for bug #1375 -- no out-of-bounds error for token.nbor() 2017-10-24 12:05:25 +02:00
Matthew Honnibal 490ad3eaf0 Check that empty strings are handled. Closes #1242 2017-10-21 00:52:14 +02:00
Matthew Honnibal d8391b1c4d Fix #1434: Matcher failed on ending ? if no token 2017-10-20 16:49:36 +02:00
Matthew Honnibal f111b228e0 Fix re-parsing of previously parsed text
If a Doc object had been previously parsed, it was possible for
invalid parses to be added. There were two problems:

1) The parse was only being partially erased
2) The RightArc action was able to create a 1-cycle.

This patch fixes both errors, and avoids resetting the parse if one is
present. In theory this might allow a better parse to be predicted by
running the parser twice.

Closes #1253.
2017-10-20 16:27:36 +02:00
ines 3516aa0cea Port over changes from #1389 2017-10-14 13:32:55 +02:00
ines 15fe0fd82d Fix tests 2017-10-11 13:27:18 +02:00
Matthew Honnibal c6cd81f192 Wrap try/except around model saving 2017-10-05 08:14:24 -05:00
Matthew Honnibal fd4baff475 Update tests 2017-10-05 08:12:27 -05:00
Matthew Honnibal 40edb65ee7 Make test work for Python 2.7 2017-10-04 16:36:50 +02:00
Matthew Honnibal db05d4d582 Add test for #1380. Passes without fix? 2017-10-04 14:56:31 +02:00
Matthew Honnibal 456bb8a74c Unxfail and close #1305 2017-09-06 19:14:17 +02:00
Matthew Honnibal 99e44fbdbb Update regression test 2017-09-06 19:13:51 +02:00
Matthew Honnibal 497a9308a8 Xfail new lemmatizer test 2017-09-06 18:41:22 +02:00
Matthew Honnibal 5384fff5ce Add test for 1305: Incorrect lemmatization of VBZ for English 2017-09-06 18:40:18 +02:00
Matthew Honnibal d55d6e1cfa Fix comparison of Token from different docs. Closes #1257 2017-08-19 16:39:32 +02:00
ines 51d7414e94 Make sure sents are a list 2017-06-05 12:30:13 +02:00
ines a0f4592f0a Update tests 2017-06-05 02:26:13 +02:00
ines 3e105bcd36 Update tests 2017-06-05 02:09:27 +02:00
Matthew Honnibal bb98d45a63 Fix tests 2017-06-04 16:00:44 -05:00
Matthew Honnibal 55d0621532 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-06-04 15:53:25 -05:00
Matthew Honnibal 5b9f116aca Update tests 2017-06-04 15:53:17 -05:00
ines 8a29308d0b Remove unused imports 2017-06-04 22:39:29 +02:00
ines 96867a24ae Fix typo 2017-06-04 22:36:40 +02:00
ines 20a7003c0d Update model fixtures and reorganise tests 2017-05-29 22:14:31 +02:00
Matthew Honnibal fe11564b8e Finish stringstore change. Also xfail vectors tests 2017-05-28 15:10:22 +02:00
ines fb0ff0272f xfail neural parser tests for now and remove test for deprecated method 2017-05-23 12:40:37 +02:00
Matthew Honnibal 5418bcf5d7 Resolve conflict on test 2017-05-23 04:37:16 -05:00
ines e6acd3bbf2 Fix matcher tests and matcher docs 2017-05-23 11:36:02 +02:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
ines b3c7ee0148 Fix tests and use the new Matcher API 2017-05-22 13:54:20 +02:00
Matthew Honnibal 8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
ines 3c0f85de8e Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
ines be5541bd16 Fix import and tokenizer exceptions 2017-05-08 16:20:14 +02:00
Matthew Honnibal 24c4c51f13 Try to make test999 less flakey 2017-04-26 18:42:06 +02:00
Matthew Honnibal c4be9c36fe Fix unicode header in tests 2017-04-24 10:09:01 +02:00
Matthew Honnibal 65f10b53e5 Fix test 2017-04-24 00:25:55 +02:00
Matthew Honnibal 70a43858e1 Fix flakey test 2017-04-24 00:06:30 +02:00
Matthew Honnibal 3973af2d15 Make training test less flakey 2017-04-23 22:59:34 +02:00
Matthew Honnibal 874a3cbb07 Add test for Issue #955 2017-04-23 17:57:01 +02:00
Matthew Honnibal 5d8af40445 Add test for Issue #999 2017-04-23 17:06:30 +02:00
Matthew Honnibal 040751ad17 Remove xfail on Test #910 2017-04-23 16:28:55 +02:00
Matthew Honnibal 1dca7eeb03 Add unicode declaration on new regression test 2017-04-07 18:09:23 +02:00
ines 887827fc6a Merge branch 'develop' 2017-04-07 17:36:23 +02:00
ines bf0f15e762 Add / to tokenizer infixes (resolves #891) 2017-04-07 17:30:44 +02:00
ines 00b9011a49 Fix whitespace 2017-04-07 17:29:59 +02:00
Matthew Honnibal cc36c308f4 Fix noun_chunk rules around coordination
Closes #693.
2017-04-07 17:06:40 +02:00
Matthew Honnibal 83dca920d4 Rename test #913 -> #957, comment
Make test for #957 reference correct bug. Add comment.

Previous commit closes #957.
2017-04-07 15:54:25 +02:00
Matthew Honnibal 5887383fc0 Add test for Issue #913: Hang from bad regex 2017-04-07 15:47:27 +02:00
Matthew Honnibal cfff4e0f61 Improve test 2017-03-31 13:59:32 +02:00
Matthew Honnibal e854f28304 Add test for Issue #758
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Matthew Honnibal b94286de30 Fix regression test 2017-03-25 22:35:07 +01:00
Matthew Honnibal 4f400fa486 Prevent lemmatization of base nouns
Update lemmatizer's base-form check, for change in morphology class.
Closes #903.
2017-03-25 21:51:12 +01:00
Matthew Honnibal 4454c1b23f Block lemmatization of base-form adjectives
Fixes check that an adjective is a base form (as opposed to a
comparative or superlative), so that it's not lemmatized.
e.g. inner -!> inn. Closes #912.
2017-03-25 21:29:57 +01:00
Matthew Honnibal f40fbc3710 Add test for Issue #910: Resuming entity training 2017-03-23 23:38:57 +01:00
ines fe0ff00fe1 Fix spacing 2017-03-19 11:55:37 +01:00
ines 5712da6095 Add regression test for #891 2017-03-19 11:48:01 +01:00
ines aefb898e37 Add title-case version of morph rules (resolves #686) 2017-03-18 17:27:11 +01:00
ines d0b85faf69 Pass regression test for #401 (resolves #401)
Fixed in new English models.
2017-03-18 17:06:49 +01:00
Matthew Honnibal de0e6385b4 Merge branch 'master' of https://github.com/explosion/spaCy 2017-03-18 16:17:28 +01:00
Matthew Honnibal fe442cac53 Fix #717: Set correct lemma for contracted verbs 2017-03-18 16:16:10 +01:00
ines ad934a9abd Add regression test for #693 2017-03-18 16:12:30 +01:00
ines f57c616830 Add regression test for #704 and test new model (resolves #704)
(using new English model)
2017-03-18 16:04:14 +01:00
Matthew Honnibal 413138de79 Fix #719: Lemmatizer can no longer output empty string 2017-03-18 16:02:06 +01:00
Matthew Honnibal db51abf685 Fix tests 2017-03-16 18:53:47 -05:00
Matthew Honnibal fea9fe08af Merge pull request #866 from juanmirocks/master
Fix lemmatization of OOV words
2017-03-16 23:37:36 +01:00
ines 42ba740dde Revert "Merge branch 'debug'"
This reverts commit 89b79d1178, reversing
changes made to 02bdf490a1.
2017-03-13 20:11:52 +01:00
ines 4c5f51e49e Update regression test 2017-03-13 15:16:11 +01:00
ines 02bdf490a1 Remove regression test to see if it caused pytest Travis error 2017-03-13 13:00:22 +01:00
ines 17018750ac Add regression test for #717 2017-03-13 12:58:22 +01:00
ines 2883ebfca2 Remove print statement 2017-03-13 12:30:42 +01:00
ines 98c13d8aa9 Add regression test for #401 2017-03-13 12:28:41 +01:00
ines 444d665f9d Add regression test for #686 2017-03-13 12:23:35 +01:00
ines 46b17e5b51 Add regression test for #719 2017-03-13 12:17:35 +01:00
ines c8ae682ff9 Add regression test for #636 2017-03-13 12:08:31 +01:00
ines 337f9601f2 Add missing unicode declaration 2017-03-13 12:08:19 +01:00
ines d70386ec6e Update docstring in #886 regression test 2017-03-13 12:00:38 +01:00
ines 51ba3ef0a8 Add regression test for #886 2017-03-13 11:44:58 +01:00
ines 66c1f194f9 Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
Matthew Honnibal 5b0b968d13 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-03-08 15:03:10 +01:00
Matthew Honnibal 0ac3d27689 Fix handling of trailing whitespace
Fix off-by-one error that meant trailing spaces were being dropped.
Closes #792
2017-03-08 15:01:40 +01:00
ines c2e3e651b8 Re-add regression test for #859 2017-03-08 14:36:09 +01:00
Matthew Honnibal 3edb8ae207 Whitespace 2017-03-07 17:16:26 +01:00
Matthew Honnibal 4e75e74247 Update regression test for variable-length pattern problem in the matcher. 2017-03-07 16:08:32 +01:00
Matthew Honnibal 6d67213b80 Add test for 850: Matcher fails on zero-or-more. 2017-03-07 15:55:28 +01:00
ines 8dff040032 Revert "Add regression test for #859"
This reverts commit c4f16c66d1.
2017-03-01 21:56:20 +01:00
Juan Miguel Cejuela a8cfde46d3 #781 Fix test: colocalizes is lemmatized to colocaliz and colocalize 2017-03-01 21:43:08 +01:00
Juan Miguel Cejuela a471114eb2 #781 add regression test, failing previous bug fix 2017-03-01 21:30:51 +01:00
ines c4f16c66d1 Add regression test for #859 2017-03-01 16:07:27 +01:00
Matthew Honnibal 0aaa546435 Fix test after updating the French tokenizer stuff 2017-02-27 11:20:47 +01:00
ines 7c1260e98c Add regression test 2017-02-24 18:22:49 +01:00
ines 67991b6e5f Add more test cases to #775 regression test to cover #847 2017-02-18 14:10:44 +01:00
ines 44de3c7642 Reformat test and use text_file fixture 2017-02-16 23:49:19 +01:00
ines 3dd22e9c88 Mark vectors test as xfail (temporary) 2017-02-16 23:28:51 +01:00
ines 85d249d451 Revert "Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)""
This reverts commit ea05f78660.
2017-02-16 23:26:25 +01:00
ines ea05f78660 Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)"
This reverts commit 7d8c9eee7f, reversing
changes made to f6b69babcc.
2017-02-16 15:27:12 +01:00
Raphaël Bournhonesque 06a71d22df Fix test failure by using unicode literals 2017-02-16 14:48:00 +01:00
Raphaël Bournhonesque 3ba109622c Add regression test with non ' ' space character as token 2017-02-16 12:23:27 +01:00
Michael Wallin 35100c8bdd [issue 805] Add regression test and the required fixture 2017-02-04 16:21:34 +02:00
Ines Montani afc6365388 Update regression test for #801 to match current expected behaviour 2017-02-02 16:23:05 +01:00
Ines Montani 13a4ab37e0 Add regression test for #801 2017-02-02 15:33:52 +01:00
Ines Montani e4875834fe Fix formatting 2017-01-31 15:19:33 +01:00
Ines Montani c304834e45 Add missing import 2017-01-31 15:18:30 +01:00
Ines Montani e6465b9ca3 Parametrize test cases and mark as xfail 2017-01-31 15:14:42 +01:00
latkins e4c84321a5 Added regression test for Issue #792. 2017-01-31 13:47:42 +00:00
Ines Montani 19501f3340 Add regression test for #775 2017-01-25 13:16:52 +01:00
Ines Montani 0967eb07be Add regression test for #768 2017-01-23 21:25:46 +01:00
Ines Montani 5f6f48e734 Add regression test for #759 2017-01-20 15:11:48 +01:00
Matthew Honnibal 2c60d0cb1e Test #743: Tokens unhashable. 2017-01-16 13:27:26 +01:00
Ines Montani 50878ef598 Exclude "were" and "Were" from tokenizer exceptions and add regression test (resolves #744) 2017-01-16 13:10:38 +01:00
Ines Montani e053c7693b Fix formatting 2017-01-16 13:09:52 +01:00
Ines Montani e9e99a5670 Add regression test for #740 2017-01-12 22:57:38 +01:00
Ines Montani 6935d55409 Fix formatting 2017-01-12 22:56:20 +01:00
Ines Montani 9b4bea1df9 Tidy up and rename regression tests and remove unnecessary imports 2017-01-12 22:00:37 +01:00
Ines Montani 27482ebed8 Move matcher tests for #188 and #242 to regression tests
Modernise tests and remove unnecessary imports
2017-01-12 17:33:57 +01:00
Ines Montani 0a4dc632bd Update test to not create redundant Doc object 2017-01-12 17:33:18 +01:00
Ines Montani 51ef75f629 Fix regression test for #615 and remove unnecessary imports 2017-01-12 16:51:12 +01:00
Ines Montani c3d4516fc2 Move test for #361 to regression tests 2017-01-12 16:51:12 +01:00
Ines Montani 359f73a96b Move test for #54 to regression tests 2017-01-12 12:25:51 +01:00
Ines Montani c5914c6fe5 Fix and pass regression test for #736 2017-01-12 11:48:56 +01:00
Ines Montani ec7739b76e Add regression test for #736 2017-01-12 11:12:44 +01:00
Ines Montani c9671329dc Move test for #309 to regression tests 2017-01-11 23:52:13 +01:00
Ines Montani 3e6e1f0251 Tidy up regression tests 2017-01-10 19:24:10 +01:00
Ines Montani c6e5a5349d Move regression test for #360 into own file 2017-01-04 00:49:31 +01:00
Ines Montani 59059fed27 Move regression test for #351 to own file 2017-01-04 00:47:11 +01:00
Matthew Honnibal bdcecb3c96 Add import in regression test 2016-12-18 16:51:31 +01:00
Ines Montani 77cf2fb0f6 Remove unnecessary argument in test 2016-12-18 14:06:27 +01:00
Ines Montani 121c310566 Remove trailing whitespace 2016-12-18 14:06:27 +01:00
Matthew Honnibal 0595cc0635 Change test595 to mock data, instead of requiring model. 2016-12-18 13:28:51 +01:00
Matthew Honnibal e01c1875ee Work on test for #615 2016-11-23 23:48:41 +01:00
Matthew Honnibal e86f440ca6 Fix test for issue 617 2016-11-10 22:48:10 +01:00
Matthew Honnibal a2c7de8329 spacy/tests/regression/test_issue617.py
Test Issue #617
2016-11-10 22:46:23 +01:00
Matthew Honnibal 3ea15b257f Fix test for 605 2016-11-06 11:59:26 +01:00
Matthew Honnibal efe7790439 Test #590: Order dependence in Matcher rules. 2016-11-06 11:21:36 +01:00
Matthew Honnibal 75805397dd Test Issue #605 2016-11-06 10:42:32 +01:00
Matthew Honnibal 4a8a2b6001 Test #595 -- Bug in lemmatization of base forms. 2016-11-04 00:27:32 +01:00
Matthew Honnibal 72b9bd57ec Test Issue #588: Matcher accepts invalid, empty patterns. 2016-11-03 00:09:35 +01:00
Matthew Honnibal 3d6c79e595 Test Issue #599: .is_tagged and .is_parsed attributes not reflected after deserialization for empty documents. 2016-11-02 23:40:11 +01:00
Matthew Honnibal 125c910a8d Test Issue #600 2016-11-02 23:24:13 +01:00
Matthew Honnibal d8db648ebf Add __init__.py file for regression tests 2016-11-01 13:45:06 +01:00
Matthew Honnibal 6977a2b8cd Add test for Issue #589 2016-11-01 12:33:36 +01:00
Matthew Honnibal 7e5f63a595 Improve test slightly 2016-10-28 17:41:16 +02:00
Matthew Honnibal 782e4814f4 Test Issue #587: Matcher segfaults on particular input 2016-10-28 16:38:32 +02:00
Matthew Honnibal afea6505f3 Test Issue 429: No valid actions for NER after matcher adds a new entity label. 2016-10-27 18:01:34 +02:00