2341 Commits

Author SHA1 Message Date
Gyorgy Orosz b03a46792c Better error handling 2017-01-14 22:09:29 +01:00
Gyorgy Orosz a45f22913f Added further abbreviations present in the Szeged corpus 2017-01-14 22:08:55 +01:00
Ines Montani 332ce2d758 Update README.md 2017-01-14 21:12:11 +01:00
Gyorgy Orosz 9505c6a72b Passing all old tests. 2017-01-14 20:39:21 +01:00
Gyorgy Orosz 63037e79af Fixed hyphen handling in the Hungarian tokenizer. 2017-01-14 16:30:11 +01:00
Gyorgy Orosz f77c0284d6 Maintaining compatibility with other spacy tokenizers. 2017-01-14 16:19:15 +01:00
Gyorgy Orosz be7a7aeb1a Reversed accidental changes. 2017-01-14 15:59:36 +01:00
Gyorgy Orosz 1be5da1ac6 Fixed Hungarian tokenizer for numbers 2017-01-14 15:51:59 +01:00
Ines Montani a89e269a5a Fix test formatting and consistency 2017-01-14 13:41:19 +01:00
Ines Montani 3424e3a7e5 Update README.md 2017-01-13 15:54:54 +01:00
Ines Montani 49186b34a1 Mark lemmatizer tests as models since they use installed data 2017-01-13 15:12:07 +01:00
Ines Montani 138deb80a1 Modernise vector tests, use add_vecs_to_vocab and don't depend on models 2017-01-13 15:12:07 +01:00
Ines Montani 96f0caa28a Fix test name for consistency 2017-01-13 15:12:07 +01:00
Ines Montani dc2bb1259f Add util function to add vectors to vocab 2017-01-13 15:12:07 +01:00
Ines Montani db9b25663d Reformat add_docs_equal and add docstring 2017-01-13 15:12:07 +01:00
Ines Montani 62ce0a0073 Add README.md to tests to explain organisation and conventions 2017-01-13 15:11:18 +01:00
Ines Montani 38d60f6b90 Modernise serializer I/O tests and don't depend on models where possible 2017-01-13 02:24:56 +01:00
Ines Montani 4bb5b89ee4 Add text_file_b fixture using BytesIO 2017-01-13 02:23:50 +01:00
Ines Montani 49febd8c62 Modernise noun chunks tests and don't depend on models 2017-01-13 02:01:00 +01:00
Ines Montani 3ee97b5686 Rename test_parser to test_noun_chunks 2017-01-13 01:36:33 +01:00
Ines Montani a308703f47 Remove old tests 2017-01-13 01:34:48 +01:00
Ines Montani 12eb8edf26 Move parser tests from unit to parser 2017-01-13 01:34:38 +01:00
Ines Montani 138c53ff2e Merge tokenizer tests 2017-01-13 01:34:14 +01:00
Ines Montani 01f36ca3ff Move attrs tests from unit to root and modernise 2017-01-13 01:33:50 +01:00
Ines Montani 3610d27967 Move alignment tests from munge to gold and modernise 2017-01-13 01:33:31 +01:00
Ines Montani 094ff7396a Reformat and rename Pragmatic Segmenter tests and mark xfails 2017-01-13 01:30:20 +01:00
Ines Montani affcf1b19d Modernise lemmatizer tests 2017-01-12 23:41:17 +01:00
Ines Montani 33d9cf87f9 Modernise tagger tests and fix xpassing test 2017-01-12 23:40:52 +01:00
Ines Montani 33e5f8dc2e Create basic and extended test set for URLs 2017-01-12 23:40:02 +01:00
Ines Montani 5e4f5ebfc8 Modernise BILUO tests 2017-01-12 23:39:18 +01:00
Ines Montani 09acfbca01 Add Lemmatizer fixture 2017-01-12 23:38:55 +01:00
Ines Montani 514bfa2597 Add path fixture for spaCy data path 2017-01-12 23:38:47 +01:00
Ines Montani 0894b8c0ef Don't split tokens with digits and "/" infixes (resolves #740) 2017-01-12 22:58:26 +01:00
Ines Montani e9e99a5670 Add regression test for #740 2017-01-12 22:57:38 +01:00
Ines Montani 6935d55409 Fix formatting 2017-01-12 22:56:20 +01:00
Ines Montani 5f0d196a31 Modernise and merge matcher tests 2017-01-12 22:23:11 +01:00
Ines Montani d5d774413a Update comments on EN and DE fixtures 2017-01-12 22:03:07 +01:00
Ines Montani 9b4bea1df9 Tidy up and rename regression tests and remove unnecessary imports 2017-01-12 22:00:37 +01:00
Ines Montani 5e1b6178e3 Fix formatting and consistency 2017-01-12 22:00:06 +01:00
Ines Montani a3fd32455e Remove redundant language loading integration tests 2017-01-12 21:59:48 +01:00
Ines Montani 61f1ca09c2 Modernise serializer codecs tests 2017-01-12 21:58:55 +01:00
Ines Montani 5dbc6e59f6 Modernise Huffman tests 2017-01-12 21:58:40 +01:00
Ines Montani edeeeccea5 Modernise packer tests and don't depend on models where possible 2017-01-12 21:58:07 +01:00
Ines Montani d084676cd0 Modernise and merge serialization tests 2017-01-12 21:57:19 +01:00
Ines Montani 442237787c Add assert_docs_equal util to compare two docs 2017-01-12 21:56:52 +01:00
Ines Montani eac3f700fb Add fixture for entity recognizer 2017-01-12 21:56:32 +01:00
Ines Montani b438cfddbc Modernise matcher tests and split into two files 2017-01-12 17:51:46 +01:00
Ines Montani 27482ebed8 Move matcher tests for #188 and #242 to regression tests. Modernise tests and remove unnecessary imports 2017-01-12 17:33:57 +01:00
Ines Montani 0a4dc632bd Update test to not create redundant Doc object 2017-01-12 17:33:18 +01:00
Ines Montani a2526e66d8 Fix formatting, naming and unicode declaration 2017-01-12 16:51:13 +01:00
Ines Montani 052cdff07d Modernise vector similarity tests 2017-01-12 16:51:13 +01:00
Ines Montani bd20ec0a6a Add get_cosine util function 2017-01-12 16:51:13 +01:00
Ines Montani 51ef75f629 Fix regression test for #615 and remove unnecessary imports 2017-01-12 16:51:12 +01:00
Ines Montani aeb747e10c Adjust formatting 2017-01-12 16:51:12 +01:00
Ines Montani 8e3e58a7e6 Modernise and merge lexeme vocab tests 2017-01-12 16:51:12 +01:00
Ines Montani c3d4516fc2 Move test for #361 to regression tests 2017-01-12 16:51:12 +01:00
Daniel Hershcovich 99eb494a82 Fix #737: support loading word vectors with " " as a word 2017-01-12 17:00:14 +02:00
Ines Montani 7cb3d74426 Modernise span tests and don't depend on models 2017-01-12 15:30:49 +01:00
Ines Montani 92e3d8b3ee Modernise vocab API tests and remove old xfailing tests 2017-01-12 15:27:46 +01:00
Ines Montani 7ea87684cd Rename test_vocab.py to test_vocab_api.py 2017-01-12 15:12:21 +01:00
Ines Montani 0da2ee5c68 Merge flag features tests into orth tests in tests root 2017-01-12 15:12:00 +01:00
Ines Montani 03c136cfd3 Remove StringStore tests from vocab tests 2017-01-12 15:11:15 +01:00
Ines Montani d7bd57abdf Modernise add vectors vocab test 2017-01-12 15:09:49 +01:00
Ines Montani 89525ef345 Use consistent test names 2017-01-12 15:09:21 +01:00
Ines Montani f8803808ce Remove old unused tests and conftest files 2017-01-12 15:09:05 +01:00
Ines Montani 4d0bfebcd9 Move Pragmatic Segmenter test cases (currently unused) to parser tests 2017-01-12 15:08:02 +01:00
Ines Montani 26d018d874 Add tests for StringStore 2017-01-12 15:07:31 +01:00
Ines Montani 9b6784bab5 Add fixture for StringStore 2017-01-12 15:05:40 +01:00
Ines Montani 99d66d613a Modernise tests for merging spans and don't depend on models 2017-01-12 12:26:26 +01:00
Ines Montani fa8f67596d Remove unused old test 2017-01-12 12:26:08 +01:00
Ines Montani 359f73a96b Move test for #54 to regression tests 2017-01-12 12:25:51 +01:00
Ines Montani 3f3a46722c Remove unused conftest 2017-01-12 12:25:24 +01:00
Ines Montani c2406e92bc Allow setting ents in get_doc 2017-01-12 12:25:10 +01:00
Ines Montani c5914c6fe5 Fix and pass regression test for #736 2017-01-12 11:48:56 +01:00
Matthew Honnibal 4e48862fa8 Remove print statement 2017-01-12 11:25:39 +01:00
Matthew Honnibal d1d8214767 Increment version 2017-01-12 11:21:57 +01:00
Matthew Honnibal fba67fa342 Fix Issue #736: Times were being tokenized with incorrect string values. 2017-01-12 11:21:01 +01:00
Ines Montani a6790b6694 Rename tags to pos in get_doc and allow adding tags to tokens 2017-01-12 11:18:36 +01:00
Ines Montani 1add8ace67 Merge lemmatizer tests 2017-01-12 11:16:53 +01:00
Ines Montani 3bc082abdf Modernise morph exceptions test and don't depend on models 2017-01-12 11:14:29 +01:00
Ines Montani ec7739b76e Add regression test for #736 2017-01-12 11:12:44 +01:00
Ines Montani 6c1c564891 Move language-specific tests out of redundant tokenizer directories 2017-01-12 02:17:18 +01:00
Ines Montani 8fecedac3a Tidy up 2017-01-12 02:16:37 +01:00
Ines Montani ae7edd30e7 Move text file back to tokenizer tests directory 2017-01-12 02:10:23 +01:00
Ines Montani ffcaba9017 Remove old and/or redundant tests 2017-01-12 02:10:18 +01:00
Ines Montani 19c4132097 Modernise space attachment parser tests and don't depend on models 2017-01-12 01:54:44 +01:00
Ines Montani 69778924c8 Modernise and merge parser tests and don't depend on models 2017-01-12 01:07:29 +01:00
Ines Montani 178c147612 Modernise nonprojectivity tests and don't depend on models 2017-01-12 01:06:36 +01:00
Ines Montani 1a3984742c Modernise sentence boundary detection tests and don't depend on models (where possible) 2017-01-11 23:53:08 +01:00
Ines Montani 0cdb6ea61d Remove old unused pickle test 2017-01-11 23:52:28 +01:00
Ines Montani c9671329dc Move test for #309 to regression tests 2017-01-11 23:52:13 +01:00
Ines Montani d0e37b5670 Modernise parser tests and don't depend on models 2017-01-11 21:30:27 +01:00
Ines Montani 342cb41782 Add apply_transition_sequence util function to utils 2017-01-11 21:30:14 +01:00
Ines Montani 09807addff Add en_parser fixture 2017-01-11 21:29:59 +01:00
Ines Montani 55d151aa61 Modernise Doc parse tree navigation tests and don't depend on models 2017-01-11 21:14:15 +01:00
Ines Montani 7262421bb2 Use consistent test names 2017-01-11 19:00:52 +01:00
Ines Montani 33800c9367 Rename "tokens" tests to "doc" 2017-01-11 18:59:01 +01:00
Ines Montani 3a9c6a9563 Remove old unused files 2017-01-11 18:58:38 +01:00
Ines Montani 8e962de39f Remove old word vector tests 2017-01-11 18:55:08 +01:00
Ines Montani e027936920 Modernise Doc noun chunks tests 2017-01-11 18:54:56 +01:00