1188 Commits

Author SHA1 Message Date
Matthew Honnibal 1ec4e6fc95 * Don't score whitespace tokens 2015-06-07 19:10:32 +02:00
Matthew Honnibal de8f32ba4d * Upd version in docs 2015-06-07 19:09:39 +02:00
Matthew Honnibal 731e5f1e46 * Add get() function in spacy/syntax/Config 2015-06-07 19:09:15 +02:00
Matthew Honnibal ac422492cf * Fix write_parses mode of bin/parser/train.py 2015-06-07 19:08:48 +02:00
Matthew Honnibal 1cfa326f6e * Comment out test_conjuncts 2015-06-07 19:08:04 +02:00
Matthew Honnibal 48bc4122d8 * Upd version in setup.py 2015-06-07 19:05:28 +02:00
Matthew Honnibal 638e07939d * Avoid loading vectors in test_token_references 2015-06-07 19:03:16 +02:00
Matthew Honnibal 50768241b3 * Fix test_docs.py 2015-06-07 19:02:43 +02:00
Matthew Honnibal d83255db17 * Fix ner test 2015-06-07 18:57:42 +02:00
Matthew Honnibal c6dc151fc3 * Fix spans/test_merge.py 2015-06-07 18:46:16 +02:00
Matthew Honnibal 2676240cbb * Fix spans/test_merge.py 2015-06-07 18:45:19 +02:00
Matthew Honnibal 9abb0dd4fd * Fix spans/test_merge.py 2015-06-07 18:44:18 +02:00
Matthew Honnibal 8a4c9c33f1 * Fix test_token_references test 2015-06-07 18:33:04 +02:00
Matthew Honnibal 15123329b1 * Have travis test the pip version of the code 2015-06-07 18:17:19 +02:00
Matthew Honnibal 5f44adc659 * Add tests/spans/conftest.py 2015-06-07 18:07:59 +02:00
Matthew Honnibal dd587b7477 * Fix tests 2015-06-07 18:07:32 +02:00
Matthew Honnibal e3af6af83c * Add tests/vocab/conftest.py 2015-06-07 18:02:47 +02:00
Matthew Honnibal 88041f69d1 * More work on reorganising tests, using conftest.py 2015-06-07 18:02:24 +02:00
Matthew Honnibal 674ee5dde7 * Add conftest.py to tests/, to allow session-global pipeline. This allows much faster tests. 2015-06-07 17:53:14 +02:00
Matthew Honnibal 877abb0e5b * Set up tokenizer/ tests properly, using a session-scoped fixture to avoid long load/unload times. Tokenizer tests now complete in 20 seconds. 2015-06-07 17:24:49 +02:00
Matthew Honnibal 1d5f20fdda * Move nlp variable from global scope 2015-06-07 16:55:11 +02:00
Matthew Honnibal d37dca72dd * Reorganize tests 2015-06-07 16:49:46 +02:00
Matthew Honnibal 2ef3555d88 * Add ujson to requirements.txt 2015-06-07 03:22:17 +02:00
Matthew Honnibal 8f142c1838 * Refactor transition system oracles, to split out move and label cost. Preparing to add Unshift move. Will exclude non-monotonic. 2015-06-07 03:21:29 +02:00
Matthew Honnibal e2578fbb90 * Avoid parsing and tagging in test_emoticons 2015-06-06 05:59:20 +02:00
Matthew Honnibal 89b8775887 * Fix output from _min_edit_path when inputs match. 2015-06-06 05:58:53 +02:00
Matthew Honnibal 27c8dc3db2 * Run tests one file at a time, as the teardown isn't cleaning up objects in global namespace properly 2015-06-06 05:58:16 +02:00
Matthew Honnibal 4126ef3b8c * Restore hyphenation test to test_infix 2015-06-06 05:57:36 +02:00
Matthew Honnibal 98cfd84123 * Remove hyphenation from main tokenizer loop: do it in infix.txt instead. This lets emoticons work 2015-06-06 05:57:03 +02:00
Matthew Honnibal 45ec92243a * Add hyphenation rule to infix.txt for tokenizer 2015-06-06 05:56:00 +02:00
Matthew Honnibal 4073533e28 * Upd munge_ewtb for the new json format 2015-06-06 02:10:33 +02:00
Matthew Honnibal 6a1341b29e * Add tb pre-process script 2015-06-06 01:59:44 +02:00
Matthew Honnibal a57ced0ead * Pin cython packages to particular versions, so that the current version works even if updates to them are pushed. 2015-06-05 23:51:39 +02:00
Matthew Honnibal 1736fc5a67 * Add more options to bin/parser/train 2015-06-05 23:49:26 +02:00
Matthew Honnibal 1fee7ade61 * Tweak to ner 2015-06-05 23:48:43 +02:00
Matthew Honnibal 362f87dc3a * Update input corruption method to work with lists as well as strings 2015-06-05 19:33:32 +02:00
Matthew Honnibal 33e70b167f * Remove dead code from ner.pyx 2015-06-05 17:12:47 +02:00
Matthew Honnibal 88ac5c6e98 * Send beam_width < 0 to greedy parser 2015-06-05 17:12:06 +02:00
Matthew Honnibal 0114e7600d * Fix NER oracle 2015-06-05 17:11:26 +02:00
Matthew Honnibal c04e6ebca6 * Allow user to load different sized vectors. 2015-06-05 16:26:39 +02:00
Matthew Honnibal 0aed9c9a33 * Fix train.py 2015-06-05 15:50:24 +02:00
Matthew Honnibal 23e2f26535 * Require thinc 1.76 2015-06-05 15:50:05 +02:00
Matthew Honnibal 8466600add * Clean up train.py, removing unused tag jackknifing code 2015-06-05 15:01:28 +02:00
Matthew Honnibal e772b48dcd * Skip sentences of length 1 in training 2015-06-05 02:29:03 +02:00
Matthew Honnibal 6bf35cecc3 * Refactor transition system to use classes with staticmethods. 2015-06-05 02:27:17 +02:00
Matthew Honnibal 36a34d544b * Refactoring arc_eager, grouping oracle functions into transitions 2015-06-04 22:43:03 +02:00
Matthew Honnibal 4433396005 * Improve efficiency of dynamic oracle, making beam training faster 2015-06-04 21:15:14 +02:00
Matthew Honnibal 079dad28a7 * Update for faster beam training 2015-06-04 19:32:32 +02:00
Matthew Honnibal f8843906ad Merge branch 'constituency': Add beam parsing and training from JSON files, with Levenshtein alignment. 2015-06-03 06:07:24 +02:00
Matthew Honnibal ae653b850a * Remove unused import from gold.pyx 2015-06-03 06:07:15 +02:00