Commit Graph

863 Commits

Author SHA1 Message Date
Matthew Honnibal ef87562741 Restore vectors test utils 2017-08-19 20:35:16 +02:00
Matthew Honnibal 1391f9da37 Restore vectors tests 2017-08-19 20:34:58 +02:00
Matthew Honnibal d55d6e1cfa Fix comparison of Token from different docs. Closes #1257 2017-08-19 16:39:32 +02:00
Matthew Honnibal 4fda02c7e6 Add test for new Span.to_array method 2017-08-19 16:24:38 +02:00
Matthew Honnibal c606b4a42c Add test for Doc.char_span 2017-08-19 16:18:23 +02:00
Matthew Honnibal 42d47c1e5c Fix tagger serialization 2017-08-19 04:16:32 +02:00
Matthew Honnibal 2da96a0ec7 Fix beam test 2017-08-19 04:15:46 +02:00
Matthew Honnibal a7309a217d Update tagger serialization 2017-08-18 23:12:05 +02:00
Matthew Honnibal de7e8703e3 Restore tests for beam parser 2017-08-18 22:27:42 +02:00
Matthew Honnibal 52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal 92ebab6073 Update beam-update tests 2017-08-13 08:56:02 +02:00
Matthew Honnibal 24b45b45c6 Add test for beam update 2017-08-12 17:15:28 -05:00
Matthew Honnibal b353e4d843 Work on parser beam training 2017-08-12 14:47:45 -05:00
Jim Geovedi cc4772cac2 reworks 2017-08-03 13:08:38 +07:00
Jim Geovedi 783f7d8b86 added test set for Indonesian language 2017-07-29 18:21:07 +07:00
Matthew Honnibal d6a5c2c85a Add test for NER 2017-07-22 01:48:58 +02:00
Matthew Honnibal 28244df4da Add test for beam parsing 2017-07-22 01:48:35 +02:00
Matthew Honnibal 2424493970 Remove unnecessary import of Mock 2017-07-22 01:13:54 +02:00
Matthew Honnibal 289f23df51 Test beam parsing 2017-07-20 15:03:10 +02:00
Matthew Honnibal f014138c11 Fix parser tests 2017-07-20 00:16:52 +02:00
mollerhoj e840077601 Add some basic tests for Danish 2017-07-03 15:49:51 +02:00
ines 34a2eecb17 Add simple "naughty strings" test (see #1107) 2017-06-06 17:43:51 +02:00
ines cc9c5dc7a3 Fix noun chunks test 2017-06-05 16:39:04 +02:00
Matthew Honnibal b4cdd05466 Add vectors.pyx in setup 2017-06-05 12:45:29 +02:00
Matthew Honnibal 30369d580f Start testing Vectors class 2017-06-05 12:32:49 +02:00
ines 51d7414e94 Make sure sents are a list 2017-06-05 12:30:13 +02:00
ines a0f4592f0a Update tests 2017-06-05 02:26:13 +02:00
ines 3e105bcd36 Update tests 2017-06-05 02:09:27 +02:00
ines 078232932c Fix tokenizer fixture scope 2017-06-05 01:06:34 +02:00
Matthew Honnibal 58be0e1f6f Update tests 2017-06-04 16:35:06 -05:00
Matthew Honnibal bb98d45a63 Fix tests 2017-06-04 16:00:44 -05:00
Matthew Honnibal 55d0621532 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-06-04 15:53:25 -05:00
Matthew Honnibal 5b9f116aca Update tests 2017-06-04 15:53:17 -05:00
ines 8a29308d0b Remove unused imports 2017-06-04 22:39:29 +02:00
Ines Montani 112c5787eb Merge pull request #1101 from oroszgy/hu_tokenizer_fix
More robust Hungarian tokenizer.
2017-06-04 22:37:51 +02:00
ines 96867a24ae Fix typo 2017-06-04 22:36:40 +02:00
ines f432bb4b48 Fix fixture scopes 2017-06-04 22:34:31 +02:00
ines a66cf24ee8 xfail tokenizer serialization tests for now
Tests pass locally, but not on Travis – needs more investigation
2017-06-04 13:58:20 +02:00
ines e47eef5e03 Update German tokenizer exceptions and tests 2017-06-03 21:07:44 +02:00
ines d77c2cc8bb Add tests for English norm exceptions 2017-06-03 20:59:50 +02:00
ines 3152ee5ca2 Update serialization tests for tokenizer 2017-06-03 17:05:28 +02:00
ines 1ebd0d3f27 Add assert_packed_msg_equal util function 2017-06-03 17:04:30 +02:00
ines de974f7bef Add serializer tests for tokenizer 2017-06-03 13:26:34 +02:00
ines d21459f87d Update serializer tests 2017-06-02 21:42:26 +02:00
ines d86e7cde93 Add entity recognizer to parser serialization tests 2017-06-02 18:40:06 +02:00
ines 0051c05964 Add tests for serializing parser 2017-06-02 18:37:19 +02:00
ines cef547a9f0 Add serialization tests for tensorizer 2017-06-02 18:18:30 +02:00
ines f74a45c1fe Remove unnecessary argument 2017-06-02 18:17:46 +02:00
ines 43b4d63f85 Add serialization tests for tagger 2017-06-02 17:29:34 +02:00
ines acd65c00f6 Add serialization tests for StringStore and Vocab 2017-06-02 10:57:42 +02:00
ines 9692c98f57 Add test utils for temp file and temp dir 2017-06-02 10:56:09 +02:00
Matthew Honnibal 4c97371051 Fixes for thinc 6.7 2017-06-01 04:22:16 -05:00
Gyorgy Orosz f0c3b09242 More robust Hungarian tokenizer. 2017-05-31 22:28:40 +02:00
ines 5e1c361270 Update tests README with info on model tests 2017-05-31 12:22:58 +02:00
Ines Montani e6cf3c7e1c Merge pull request #1093 from oroszgy/hu_emoji_fix
Fixed emoji handling for Hungarian
2017-05-31 11:33:24 +02:00
Matthew Honnibal 6937e311a4 Update doc tests 2017-05-30 23:34:23 +02:00
Gyorgy Orosz 8c0b4b850e Fixed emoji handling for Hungarian 2017-05-30 21:34:46 +02:00
Matthew Honnibal b127645afc Fix test_misc merge conflict 2017-05-29 18:31:44 -05:00
Matthew Honnibal e0e8eae7c7 Tweak package test 2017-05-29 18:30:42 -05:00
ines 20a7003c0d Update model fixtures and reorganise tests 2017-05-29 22:14:31 +02:00
ines 795fe43a4d Add load_test_model function with importorskip()
Loads model only if it can be imported, i.e. if it's installed as a
package.
2017-05-29 22:11:31 +02:00
ines 6e3937efc5 Check for arguments of model markers to specify models to test
Lets user set --models --en for only English models
2017-05-29 22:10:16 +02:00
Matthew Honnibal f4aafca222 Merge changes to test_misc 2017-05-29 12:26:02 +02:00
Matthew Honnibal ff26aa6c37 Work on to/from bytes/disk serialization methods 2017-05-29 11:45:45 +02:00
ines df920ba0e7 Add tests for displaCy and util functions and fix util typo 2017-05-29 10:51:19 +02:00
ines c5714d4fb2 xfail matcher test for now until setting norm via Span.merge works 2017-05-29 10:51:02 +02:00
Matthew Honnibal c91b121aeb Move serialization functions to util 2017-05-29 10:13:42 +02:00
Matthew Honnibal 1fa2bfb600 Add model_to_bytes and model_from_bytes helpers. Probably belong in thinc. 2017-05-29 09:27:04 +02:00
Matthew Honnibal 6dad4117ad Work on serialization for models 2017-05-29 01:37:57 +02:00
ines 7b1ddcc04d Add test for vocab serialization 2017-05-29 01:09:52 +02:00
ines 00b2094dc3 Fix typos, long integers and tests 2017-05-29 01:09:52 +02:00
ines 804dbb8d25 Add StringStore test for API docs 2017-05-29 01:09:52 +02:00
Matthew Honnibal 92dbf28c1e Hack a fixture in the vectors tests, for xfail 2017-05-28 20:28:32 +02:00
Matthew Honnibal fe11564b8e Finish stringstore change. Also xfail vectors tests 2017-05-28 15:10:22 +02:00
Matthew Honnibal b007a2b0d3 Update stringstore tests 2017-05-28 14:08:09 +02:00
Matthew Honnibal 84e66ca6d4 WIP on stringstore change. 27 failures 2017-05-28 14:06:40 +02:00
Matthew Honnibal fe4a746300 Accomodate symbols in new string scheme 2017-05-28 13:03:16 +02:00
Matthew Honnibal a5606c3eda Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
ines a8e58e04ef Add symbols class to punctuation rules to handle emoji (see #1088)
Currently doesn't work for Hungarian, because of conflicts with the
custom punctuation rules. Also doesn't take multi-character emoji like
👩🏽‍💻 into account.
2017-05-27 17:57:10 +02:00
Matthew Honnibal 4917cbb484 Include sent_start test 2017-05-23 18:40:37 +02:00
ines fb0ff0272f xfail neural parser tests for now and remove test for deprecated method 2017-05-23 12:40:37 +02:00
Matthew Honnibal 5418bcf5d7 Resolve conflict on test 2017-05-23 04:37:16 -05:00
ines e6acd3bbf2 Fix matcher tests and matcher docs 2017-05-23 11:36:02 +02:00
ines d0c6d4f76d Fix formatting 2017-05-23 11:32:00 +02:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
ines b3c7ee0148 Fix tests and use the new Matcher API 2017-05-22 13:54:20 +02:00
Matthew Honnibal 187f370734 Update tests for matcher changes 2017-05-22 12:59:50 +02:00
Matthew Honnibal 7e2cdc0c81 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-22 12:39:34 +02:00
Matthew Honnibal 2f78413a02 PseudoProjectivity->nonproj 2017-05-22 05:39:03 -05:00
Matthew Honnibal d8bb5bb959 Implement StringStore serialization, and update tests 2017-05-22 12:38:00 +02:00
Matthew Honnibal 5db89053aa Merge docstrings 2017-05-21 13:46:23 -05:00
Matthew Honnibal 836fe1d880 Update neural net tests 2017-05-19 18:11:29 -05:00
ines a804045597 Use is_ancestor instead of deprecated is_ancestor_of 2017-05-19 20:23:40 +02:00
Matthew Honnibal 793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal c9a5d5d24b Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-16 16:22:05 +02:00
Matthew Honnibal 8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal 221b4c1ee8 Fix test for Python 3 2017-05-16 13:06:30 +02:00
Matthew Honnibal 1d7c18e58a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-15 21:53:47 +02:00
Matthew Honnibal a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
ines b462076d80 Merge load_lang_class and get_lang_class 2017-05-14 01:31:10 +02:00
ines 5858857a78 Update languages list in conftest 2017-05-13 15:37:54 +02:00
ines 8c2a0c026d Fix parse_tree test 2017-05-13 12:32:45 +02:00
Matthew Honnibal ee1d35bdb0 Fix merge conflict 2017-05-13 03:20:19 +02:00
Matthew Honnibal b2540d2379 Merge Kengz's tree_print patch 2017-05-13 03:18:49 +02:00
Matthew Honnibal 7253b4e649 Remove old serialization tests 2017-05-09 18:12:58 +02:00
Matthew Honnibal f9327343ce Start updating serializer test 2017-05-09 18:12:03 +02:00
ines 2c3bdd09b1 Add English test for like_num 2017-05-09 11:06:34 +02:00
ines 22375eafb0 Fix and merge attrs and lex_attrs tests 2017-05-09 11:06:25 +02:00
ines c714841cc8 Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
ines bd57b611cc Update conftest to lazy load languages 2017-05-09 00:02:21 +02:00
ines 3c0f85de8e Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
ines be5541bd16 Fix import and tokenizer exceptions 2017-05-08 16:20:14 +02:00
ines 2324788970 Remove bad tests 2017-05-08 16:15:27 +02:00
Gregory Howard c0afcd22bb Merge remote-tracking branch 'remotes/upstream/master' 2017-04-27 14:42:54 +02:00
Gregory Howard 8ff4682255 correcting tokenizer exception.
Adding tests for lemmatization
2017-04-27 11:52:14 +02:00
Ines Montani 7da9cefd25 Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Gregory Howard 44cb486849 Adding unitest for tokenization in french (with title) 2017-04-27 10:59:38 +02:00
luvogels d12a0b6431 Hooked up tokenizer tests 2017-04-26 23:21:41 +02:00
luvogels 8de59ce3b9 Added tokenizer tests 2017-04-26 19:10:18 +02:00
Matthew Honnibal 4d98511db7 Make Span hashable. Closes #1019 2017-04-26 19:01:05 +02:00
Matthew Honnibal 24c4c51f13 Try to make test999 less flakey 2017-04-26 18:42:06 +02:00
Gregory Howard ed5f094451 Adding insensitive lemmatisation test 2017-04-25 18:07:02 +02:00
ghoward 26e31afc18 renamming tests 2017-04-25 17:46:01 +02:00
ghoward c085c2d391 Adding some unitests 2017-04-25 17:44:16 +02:00
Matthew Honnibal c4be9c36fe Fix unicode header in tests 2017-04-24 10:09:01 +02:00
Matthew Honnibal 65f10b53e5 Fix test 2017-04-24 00:25:55 +02:00
Matthew Honnibal 70a43858e1 Fix flakey test 2017-04-24 00:06:30 +02:00
Matthew Honnibal 3973af2d15 Make training test less flakey 2017-04-23 22:59:34 +02:00
ines 42305bc519 Remove unnecessary test 2017-04-23 21:21:41 +02:00
ines 012ea594d1 Add file for misc tests 2017-04-23 21:06:51 +02:00
ines 83f66947dc Rename test_download to test_cli 2017-04-23 21:06:50 +02:00
Matthew Honnibal 874a3cbb07 Add test for Issue #955 2017-04-23 17:57:01 +02:00
Matthew Honnibal 5d8af40445 Add test for Issue #999 2017-04-23 17:06:30 +02:00
Matthew Honnibal 040751ad17 Remove xfail on Test #910 2017-04-23 16:28:55 +02:00
Ben Eyal e90e8a3f10 Enable test 2017-04-20 02:25:24 +03:00
ines 2bd89e7ade Tidy up Hebrew tests and test for punctuation (see #995) 2017-04-19 19:28:03 +02:00
ines 13d30b6c01 xfail lemmatizer test that's causing problems (see #546) 2017-04-16 21:18:39 +02:00
ines 0084466a66 Remove unused utf8open util and replace os.path with ensure_path 2017-04-16 20:37:45 +02:00
Matthew Honnibal 1dca7eeb03 Add unicode declaration on new regression test 2017-04-07 18:09:23 +02:00
ines 887827fc6a Merge branch 'develop' 2017-04-07 17:36:23 +02:00
ines 444dd511c5 Fix xpassing URL test case 2017-04-07 17:36:05 +02:00
ines bf0f15e762 Add / to tokenizer infixes (resolves #891) 2017-04-07 17:30:44 +02:00
ines 00b9011a49 Fix whitespace 2017-04-07 17:29:59 +02:00
Matthew Honnibal 0513c43bf0 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-07 17:07:10 +02:00
Matthew Honnibal cc36c308f4 Fix noun_chunk rules around coordination
Closes #693.
2017-04-07 17:06:40 +02:00
Matthew Honnibal ab846256cf Merge pull request #966 from recognai/master
Prepare Spanish language for training models, including configuration, rich-UD tag map and tests
2017-04-07 16:12:29 +02:00
Matthew Honnibal 83dca920d4 Rename test #913 -> #957, comment
Make test for #957 reference correct bug. Add comment.

Previous commit closes #957.
2017-04-07 15:54:25 +02:00
Matthew Honnibal 5887383fc0 Add test for Issue #913: Hang from bad regex 2017-04-07 15:47:27 +02:00
oeg c693d40791 feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basich tests 2017-04-06 18:48:45 +02:00
Matthew Honnibal cfff4e0f61 Improve test 2017-03-31 13:59:32 +02:00
Matthew Honnibal e854f28304 Add test for Issue #758
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Matthew Honnibal 0fefdfcbda Merge pull request #935 from ericzhao28/master
Add option to use label=ent_type in doc.merge arguments (Bug fix for issue #862)
2017-03-30 02:51:24 +02:00
Eric Zhao aafdf6ffb8 Add option to use label karg to determine ent_type in doc.merge 2017-03-28 23:35:03 -07:00
Matthew Honnibal b94286de30 Fix regression test 2017-03-25 22:35:07 +01:00
Matthew Honnibal 4f400fa486 Prevent lemmatization of base nouns
Update lemmatizer's base-form check, for change in morphology class.
Closes #903.
2017-03-25 21:51:12 +01:00
Matthew Honnibal 4454c1b23f Block lemmatization of base-form adjectives
Fixes check that an adjective is a base form (as opposed to a
comparative or superlative), so that it's not lemmatized.
e.g. inner -!> inn. Closes #912.
2017-03-25 21:29:57 +01:00
Ines Montani 97cb4d5e3c Merge branch 'master' into master 2017-03-25 10:03:47 +01:00
Iddo Berger da135bd823 add hebrew tokenizer 2017-03-24 18:27:44 +03:00
Matthew Honnibal f40fbc3710 Add test for Issue #910: Resuming entity training 2017-03-23 23:38:57 +01:00
ines f830213c4c Remove compatibility check test
Will only cause problems when incrementing version and not updating
table. Also depends on external URL, which is bad.
2017-03-20 13:20:26 +01:00
Ines Montani b6ee241e26 Fix print statements 2017-03-20 11:46:37 +01:00
ines fe0ff00fe1 Fix spacing 2017-03-19 11:55:37 +01:00
ines 5712da6095 Add regression test for #891 2017-03-19 11:48:01 +01:00
ines aefb898e37 Add title-case version of morph rules (resolves #686) 2017-03-18 17:27:11 +01:00
ines 64ec17abc1 Pass xpassing tests and add xfails for failures 2017-03-18 17:20:46 +01:00
ines d0b85faf69 Pass regression test for #401 (resolves #401)
Fixed in new English models.
2017-03-18 17:06:49 +01:00
ines be9daefbdd Remove actual model downloading from tests 2017-03-18 17:01:10 +01:00
Matthew Honnibal de0e6385b4 Merge branch 'master' of https://github.com/explosion/spaCy 2017-03-18 16:17:28 +01:00
Matthew Honnibal fe442cac53 Fix #717: Set correct lemma for contracted verbs 2017-03-18 16:16:10 +01:00
ines ad934a9abd Add regression test for #693 2017-03-18 16:12:30 +01:00
ines f57c616830 Add regression test for #704 and test new model (resolves #704)
(using new English model)
2017-03-18 16:04:14 +01:00
Matthew Honnibal 413138de79 Fix #719: Lemmatizer can no longer output empty string 2017-03-18 16:02:06 +01:00
ines ab1451f997 Don't mark compatibility test as slow 2017-03-18 15:17:39 +01:00
ines ec3e810662 Add directory cli and set up command line interface 2017-03-18 15:14:48 +01:00
Matthew Honnibal 6420f86f02 Merge changes to __init__.py 2017-03-17 19:51:45 +01:00
ines 0e533ad0cc Mark compatibility table test as slow (temporary)
Prevent Travis from running test test until models repo is published
2017-03-17 13:11:36 +01:00
Matthew Honnibal a630726b13 Fix typo in tests 2017-03-16 20:50:36 -05:00
Matthew Honnibal f98b30583f Fix tests 2017-03-16 19:48:00 -05:00
Matthew Honnibal db51abf685 Fix tests 2017-03-16 18:53:47 -05:00
Matthew Honnibal fea9fe08af Merge pull request #866 from juanmirocks/master
Fix lemmatization of OOV words
2017-03-16 23:37:36 +01:00
Matthew Honnibal 28bb546939 Merge pull request #883 from ericzhao28/master
Add `lower_` and `upper_` properties to `Span` class
2017-03-16 23:35:47 +01:00
Matthew Honnibal 8843b84bd1 Merge remote-tracking branch 'origin/develop-downloads' 2017-03-16 12:00:42 -05:00
ines 4cfc8ffbd2 Reformat pickle tests 2017-03-15 17:39:54 +01:00
ines 2a0fcf1354 Add tests for new download module 2017-03-15 17:39:43 +01:00
Matthew Honnibal 4cab8ac136 Update morph exceptions test 2017-03-15 09:31:34 -05:00
ines 42ba740dde Revert "Merge branch 'debug'"
This reverts commit 89b79d1178, reversing
changes made to 02bdf490a1.
2017-03-13 20:11:52 +01:00
ines 4c5f51e49e Update regression test 2017-03-13 15:16:11 +01:00
ines 02bdf490a1 Remove regression test to see if it caused pytest Travis error 2017-03-13 13:00:22 +01:00
ines 17018750ac Add regression test for #717 2017-03-13 12:58:22 +01:00
ines 2883ebfca2 Remove print statement 2017-03-13 12:30:42 +01:00
ines 98c13d8aa9 Add regression test for #401 2017-03-13 12:28:41 +01:00
ines 444d665f9d Add regression test for #686 2017-03-13 12:23:35 +01:00
ines 46b17e5b51 Add regression test for #719 2017-03-13 12:17:35 +01:00
ines c8ae682ff9 Add regression test for #636 2017-03-13 12:08:31 +01:00
ines 337f9601f2 Add missing unicode declaration 2017-03-13 12:08:19 +01:00
ines d70386ec6e Update docstring in #886 regression test 2017-03-13 12:00:38 +01:00
ines 51ba3ef0a8 Add regression test for #886 2017-03-13 11:44:58 +01:00
ines 1da29a7146 Use new Lemmatizer data and remove file import
Since there's currently only an English lemmatizer, the global
Lemmatizer imports from spacy.en. This is unideal and still needs to be
fixed.
2017-03-12 13:58:22 +01:00
ines c89e30d1a3 Add test for English time exceptions ("1a.m." etc.) 2017-03-12 13:58:22 +01:00
ines 66c1f194f9 Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
Em 9c809efc25 Removed mapStr 2017-03-11 16:23:26 -08:00
Matthew Honnibal ea2592879f Merge branch 'master' of https://github.com/explosion/spaCy 2017-03-11 11:13:37 -06:00
Em 426d17167f Added string manipulation for spans 2017-03-10 16:50:02 -08:00
ines 10e29189ac Adjust URL testcases and xfail problems (instead of comment) 2017-03-10 14:22:50 +01:00
Matthew Honnibal ea53647362 Merge branch 'develop' 2017-03-10 02:49:39 -06:00
Dan Rapp 123d3f2d38 Fix error in test case parameterization 2017-03-09 12:18:21 -07:00
Dan Rapp b9307dfcd7 Merge branch 'master' into rappdw/tokenizer_exceptions_url_fix 2017-03-09 11:42:14 -07:00
Dan Rapp 3b1df3808d Issue #840 - URL pattenr too broad 2017-03-09 11:39:39 -07:00
Matthew Honnibal 5b0b968d13 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-03-08 15:03:10 +01:00
Matthew Honnibal 0ac3d27689 Fix handling of trailing whitespace
Fix off-by-one error that meant trailing spaces were being dropped.
Closes #792
2017-03-08 15:01:40 +01:00
ines c2e3e651b8 Re-add regression test for #859 2017-03-08 14:36:09 +01:00
Matthew Honnibal 16670d3251 Xfail the vocab pickling for now 2017-03-07 21:43:28 +01:00
Matthew Honnibal a89c3500f6 Fixes to hacky vocab pickling 2017-03-07 20:58:55 +01:00
Matthew Honnibal 3edb8ae207 Whitespace 2017-03-07 17:16:26 +01:00
Matthew Honnibal 5de7e712b7 Add support for pickling StringStore. 2017-03-07 17:15:18 +01:00
Matthew Honnibal 4e75e74247 Update regression test for variable-length pattern problem in the matcher. 2017-03-07 16:08:32 +01:00
Matthew Honnibal 6d67213b80 Add test for 850: Matcher fails on zero-or-more. 2017-03-07 15:55:28 +01:00
Aniruddha Adhikary 696215a3fb add tests for Bengali 2017-03-05 11:25:12 +06:00
ines 8dff040032 Revert "Add regression test for #859"
This reverts commit c4f16c66d1.
2017-03-01 21:56:20 +01:00
Juan Miguel Cejuela a8cfde46d3 #781 Fix test — colocalizes is lemmatized to colocaliz and colicalize 2017-03-01 21:43:08 +01:00
Juan Miguel Cejuela a471114eb2 #781 add regression test, failing previous bug fix 2017-03-01 21:30:51 +01:00
ines c4f16c66d1 Add regression test for #859 2017-03-01 16:07:27 +01:00
Matthew Honnibal 34bcc8706d Merge branch 'french-tokenizer-exceptions' 2017-02-27 11:21:21 +01:00
Matthew Honnibal 0aaa546435 Fix test after updating the French tokenizer stuff 2017-02-27 11:20:47 +01:00
ines 376c5813a7 Remove print statements from test 2017-02-24 18:26:32 +01:00
ines 7c1260e98c Add regression test 2017-02-24 18:22:49 +01:00
ines 51eb190ef4 Remove print statements from test 2017-02-24 17:41:12 +01:00
Matthew Honnibal db5ada3995 Merge branch 'master' of https://github.com/explosion/spaCy 2017-02-24 14:28:12 +01:00
Matthew Honnibal 8f94897d07 Add 1 operator to matcher, and make sure open patterns are closed at end of document. Closes Issue #766 2017-02-24 14:27:02 +01:00
ines 67991b6e5f Add more test cases to #775 regression test to cover #847 2017-02-18 14:10:44 +01:00
ines 44de3c7642 Reformat test and use text_file fixture 2017-02-16 23:49:19 +01:00
ines 3dd22e9c88 Mark vectors test as xfail (temporary) 2017-02-16 23:28:51 +01:00
ines 85d249d451 Revert "Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)""
This reverts commit ea05f78660.
2017-02-16 23:26:25 +01:00
ines ea05f78660 Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)"
This reverts commit 7d8c9eee7f, reversing
changes made to f6b69babcc.
2017-02-16 15:27:12 +01:00
Raphaël Bournhonesque 06a71d22df Fix test failure by using unicode literals 2017-02-16 14:48:00 +01:00
Raphaël Bournhonesque 3ba109622c Add regression test with non ' ' space character as token 2017-02-16 12:23:27 +01:00
ines 21f09d10d7 Revert "Revert "Merge pull request #818 from raphael0202/tokenizer_exceptions""
This reverts commit f02a2f9322.
2017-02-10 13:17:05 +01:00
ines f02a2f9322 Revert "Merge pull request #818 from raphael0202/tokenizer_exceptions"
This reverts commit b95afdf39c, reversing
changes made to b0ccf32378.
2017-02-09 17:07:21 +01:00
Raphaël Bournhonesque 309da78bf0 Merge branch 'master' into tokenizer_exceptions 2017-02-09 16:32:12 +01:00
Raphaël Bournhonesque 4ce0bbc6b6 Update unit tests 2017-02-09 16:30:43 +01:00
ines 654fe447b1 Add Swedish tokenizer tests (see #807) 2017-02-05 11:47:07 +01:00
Michael Wallin 35100c8bdd [issue 805] Add regression test and the required fixture 2017-02-04 16:21:34 +02:00
Michael Wallin 1a1952afa5 [finnish] Add initial tests for tokenizer 2017-02-04 13:54:10 +02:00
Ines Montani afc6365388 Update regression test for #801 to match current expected behaviour 2017-02-02 16:23:05 +01:00
Ines Montani 13a4ab37e0 Add regression test for #801 2017-02-02 15:33:52 +01:00
Raphaël Bournhonesque 85f951ca99 Add tokenizer exceptions for French 2017-02-02 08:36:16 +01:00
Ines Montani e4875834fe Fix formatting 2017-01-31 15:19:33 +01:00
Ines Montani c304834e45 Add missing import 2017-01-31 15:18:30 +01:00