Commit Graph

4319 Commits

Author SHA1 Message Date
ines 3dd22e9c88 Mark vectors test as xfail (temporary) 2017-02-16 23:28:51 +01:00
ines 85d249d451 Revert "Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)""
This reverts commit ea05f78660.
2017-02-16 23:26:25 +01:00
Matthew Honnibal 2f82d68430 Disable sdist setting for now while investigating server problem. 2017-02-16 23:12:22 +01:00
Matthew Honnibal 49cf28e4c6 Fix Travis.yml 2017-02-16 23:04:41 +01:00
Matthew Honnibal c744ce4b6d Fix bad change to cythonize.py script, re subprocess call 2017-02-16 19:01:25 +01:00
ines ea05f78660 Revert "Merge pull request #836 from raphael0202/load_vectors (closes #834)"
This reverts commit 7d8c9eee7f, reversing
changes made to f6b69babcc.
2017-02-16 15:27:12 +01:00
Matthew Honnibal 0836cbe064 Pass shell to cythonize.py. See Issue #791 2017-02-17 01:06:06 +11:00
Matthew Honnibal 071d11cb35 Pass environment to Cythonize script. Closes #791 2017-02-17 01:04:16 +11:00
Ines Montani 7d8c9eee7f Merge pull request #836 from raphael0202/load_vectors (closes #834)
load_vectors should accept arbitrary space characters as word tokens
2017-02-16 14:52:40 +01:00
Raphaël Bournhonesque 06a71d22df Fix test failure by using unicode literals 2017-02-16 14:48:00 +01:00
Raphaël Bournhonesque 3ba109622c Add regression test with non ' ' space character as token 2017-02-16 12:23:27 +01:00
ines f6b69babcc Fix years in footer 2017-02-16 12:14:35 +01:00
Ines Montani 4e673bfeea Merge pull request #833 from vaulttech/master (resolves #832)
Fixes example 3 of entity recognition (see issue #832)
2017-02-16 12:13:48 +01:00
Raphaël Bournhonesque e17dc2db75 Remove useless import 2017-02-16 12:10:24 +01:00
Raphaël Bournhonesque 3fd2742649 load_vectors should accept arbitrary space characters as word tokens
Fix bug #834
2017-02-16 12:08:30 +01:00
John Gamboa e31894b800 Fixes example 3 of entity recognition (see issue #832) 2017-02-16 11:19:53 +01:00
Ines Montani 813989940e Merge pull request #821 from knub/patch-1
Fix error in pipeline loading documentation
2017-02-10 17:24:44 +01:00
ines f08e180a47 Make groups non-capturing
Prevents hitting the 100 named groups limit in Python
2017-02-10 13:35:02 +01:00
ines fa3b8512da Use consistent imports and exports
Bundle everything in language_data to keep it consistent with other
languages and make TOKENIZER_EXCEPTIONS importable from there.
2017-02-10 13:34:09 +01:00
ines 21f09d10d7 Revert "Revert "Merge pull request #818 from raphael0202/tokenizer_exceptions""
This reverts commit f02a2f9322.
2017-02-10 13:17:05 +01:00
Stefan Bunk 2bf19d4735 Fix error in pipeline loading documentation
The cell for the `vocab` parameter is not displayed, making it seem as if the explanation belongs to the previous param.
2017-02-10 12:06:55 +01:00
ines f02a2f9322 Revert "Merge pull request #818 from raphael0202/tokenizer_exceptions"
This reverts commit b95afdf39c, reversing
changes made to b0ccf32378.
2017-02-09 17:07:21 +01:00
Ines Montani b95afdf39c Merge pull request #818 from raphael0202/tokenizer_exceptions
Add tokenizer exceptions for French
2017-02-09 16:41:21 +01:00
Raphaël Bournhonesque 309da78bf0 Merge branch 'master' into tokenizer_exceptions 2017-02-09 16:32:12 +01:00
Raphaël Bournhonesque 4ce0bbc6b6 Update unit tests 2017-02-09 16:30:43 +01:00
Raphaël Bournhonesque 5d706ab95d Merge tokenizer exceptions from PR #802 2017-02-09 16:30:28 +01:00
Ines Montani b0ccf32378 Update CONTRIBUTING.md 2017-02-09 16:27:31 +01:00
ines 1b8719bf9a Adjust formatting and increment version 2017-02-08 21:33:22 +01:00
Ines Montani c63bf3fc94 Merge pull request #814 from wehlutyk/website-nav-hysteresis
Make the website nav header's hysteresis a bit more robust
2017-02-08 21:30:34 +01:00
Sébastien Lerique e1f87858ad Make the website nav header's hysteresis a bit more robust
In particular, this prevents the nav header from reappearing all the
time while scrolling down on Firefox.
2017-02-08 15:08:33 +01:00
Ines Montani 8ac741c217 Merge pull request #811 from knub/patch-1
Fix error in matching documentation
2017-02-07 16:57:02 +01:00
Stefan Bunk e972b2fa87 Fix error in matching documentation
LOWER and IS_PUNCT are members of `spacy` and not of the `Matcher` class.
2017-02-07 16:52:01 +01:00
ines 654fe447b1 Add Swedish tokenizer tests (see #807) 2017-02-05 11:47:07 +01:00
ines 6715615d55 Add missing EXC variable and combine tokenizer exceptions 2017-02-05 11:42:52 +01:00
Ines Montani 30a52d576b Merge pull request #807 from magnusburton/master
Added swedish lemma rules and more verb contractions
2017-02-05 11:34:19 +01:00
Matthew Honnibal 9aaa2c5633 Fix entity recognition example (closes #803) 2017-02-05 11:23:12 +01:00
Magnus Burton 19c0ce745a Added swedish lemma rules 2017-02-04 17:53:32 +01:00
Ines Montani cf529f4774 Merge pull request #806 from wallinm1/fix/swedish-tokenizer-exceptions
Fix issue #805
2017-02-04 17:40:40 +01:00
Michael Wallin d25556bf80 [issue 805] Fix issue 2017-02-04 16:22:21 +02:00
Michael Wallin 35100c8bdd [issue 805] Add regression test and the required fixture 2017-02-04 16:21:34 +02:00
ines a44da8fb34 Update language models and alpha support overview 2017-02-04 13:49:05 +01:00
Ines Montani 708cd37a2e Update README.rst 2017-02-04 13:42:46 +01:00
Ines Montani ff91be6d17 Update CONTRIBUTORS.md 2017-02-04 13:41:21 +01:00
ines 0ab353b0ca Add line breaks to Finnish stop words for better readability 2017-02-04 13:40:25 +01:00
Ines Montani 3431e7b86f Merge pull request #804 from wallinm1/finnish-alpha-support
Alpha support for Finnish
2017-02-04 13:37:08 +01:00
Michael Wallin 55b1e5e682 [finnish] Add contributor file 2017-02-04 13:54:10 +02:00
Michael Wallin 1a1952afa5 [finnish] Add initial tests for tokenizer 2017-02-04 13:54:10 +02:00
Michael Wallin f9bb25d1cf [finnish] Reformat and correct stop words 2017-02-04 13:54:10 +02:00
Michael Wallin 73f66ec570 Add preliminary support for Finnish 2017-02-04 13:54:10 +02:00
Ines Montani 932aaba7de Update CONTRIBUTORS.md 2017-02-03 10:55:42 +01:00
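
Note on commit f08e180a47 ("Make groups non-capturing"): the message says the change avoids Python's limit of 100 groups in a regular expression. The sketch below only illustrates the general technique and is not the code from that commit; the `words` list is a hypothetical stand-in for tokenizer exception strings. Wrapping each alternative in `(?:...)` groups it without registering a capture, so the compiled pattern reports zero groups no matter how many alternatives are joined.

```python
import re

# Hypothetical exception strings, for illustration only (not spaCy data).
words = ["word{}".format(i) for i in range(150)]

# (?:...) is a non-capturing group: it groups the alternative
# but does not count toward the pattern's capture groups.
pattern = re.compile("|".join("(?:{})".format(re.escape(w)) for w in words))

# Number of capture groups in the compiled pattern.
num_groups = pattern.groups

# The pattern still matches any of the alternatives.
match = pattern.fullmatch("word42")
```

With capturing parentheses instead, the same pattern would contain 150 groups, which is the situation the commit message describes as hitting the limit.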