Ines Montani
bdafb514c5
Update version
2017-01-26 13:47:32 +01:00
Ines Montani
19501f3340
Add regression test for #775
2017-01-25 13:16:52 +01:00
Ines Montani
209c37bbcf
Exclude "shell" and "Shell" from English tokenizer exceptions (resolves #775)
2017-01-25 13:15:02 +01:00
Ines Montani
a3c92e1bf6
Update README.rst
2017-01-25 10:48:09 +01:00
Ines Montani
c784b49d33
Merge pull request #772 from raphael0202/french-support
Add French tokenization support
2017-01-24 14:27:16 +01:00
Raphaël Bournhonesque
1be9c0e724
Add fr tokenization unit tests
2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
1faaf698ca
Add infixes and abbreviation exceptions (fr)
2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
cf8474401b
Remove unused import statement
2017-01-24 10:57:37 +01:00
Raphaël Bournhonesque
902f136f18
Add support for elision in French
2017-01-24 10:57:37 +01:00
Ines Montani
199ae10690
Update CONTRIBUTORS.md
2017-01-23 21:36:53 +01:00
Ines Montani
55c9c62abc
Use relative import
2017-01-23 21:27:49 +01:00
Ines Montani
0967eb07be
Add regression test for #768
2017-01-23 21:25:46 +01:00
Ines Montani
6baa98f774
Merge pull request #769 from raphael0202/spacy-768
Allow zero-width 'infix' token
2017-01-23 21:24:33 +01:00
Raphaël Bournhonesque
dce8f5515e
Allow zero-width 'infix' token
2017-01-23 18:28:01 +01:00
Ines Montani
5f6f48e734
Add regression test for #759
2017-01-20 15:11:48 +01:00
Ines Montani
09ecc39b4e
Fix multi-line string of NUM_WORDS (resolves #759)
2017-01-20 15:11:48 +01:00
Matthew Honnibal
be26085277
Fix missing import
Closes #755
2017-01-19 22:03:52 +11:00
Ines Montani
94ddfb2304
Merge pull request #750 from oiwah/span-doc-typofix-patch
Documentation Typo Fix: start_char description in the span API
2017-01-18 09:46:19 +01:00
Hidekazu Oiwa
7806ebafd2
Fix the span doc typo
Fix the typo in the span API doc.
It explains the `end` of the span as the `start_char` description.
2017-01-17 20:37:14 -08:00
Matthew Honnibal
300650a6f8
Merge pull request #749 from sudowork/custom-tokenizer-docs
Fix Custom Tokenizer docs
2017-01-18 11:39:43 +11:00
Kevin Gao
7ec710af0e
Fix Custom Tokenizer docs
- Fix mismatched quotations
- Make it more clear where ORTH, LEMMA, and POS symbols come from
- Make strings consistent
- Fix lemma_ assertion s/-PRON-/me/
2017-01-17 10:38:14 -08:00
Ines Montani
dbe8dafb52
Fix logo width and height to avoid link overlap in Safari (resolves #748)
2017-01-17 17:56:34 +01:00
Ines Montani
ee45619307
Fix formatting
2017-01-17 17:55:59 +01:00
Ines Montani
7e36568d5b
Fix title to accommodate sputnik
2017-01-17 00:51:09 +01:00
Ines Montani
d704cfa60d
Fix typo
2017-01-16 21:30:33 +01:00
Ines Montani
fb482ff049
Fix typo
2017-01-16 21:30:23 +01:00
Ines Montani
b50c499c04
Fix consistency
2017-01-16 20:44:31 +01:00
Ines Montani
8a615e8961
Simplify and update pull request template
2017-01-16 20:43:52 +01:00
Ines Montani
5909804a61
Merge pull request #747 from JasonKessler/patch-1
Clarify Rule-Based Workflow Docs
2017-01-16 20:39:27 +01:00
Jason Kessler
9fa6f9fb40
Origin of spacy.matcher attributes
Make it clear that Matcher attributes live in spacy.matcher.attrs.
2017-01-16 13:31:35 -06:00
Ines Montani
842155e3ae
Merge pull request #746 from jktong/patch-1
Correct typo "chldren" in doc.jade
2017-01-16 17:58:37 +01:00
jktong
df0aeff379
Correct typo "chldren" in doc.jade
2017-01-16 09:34:59 -05:00
Ines Montani
64e142f460
Update about.py
2017-01-16 14:23:08 +01:00
Matthew Honnibal
63adcb8141
Merge branch 'master' of ssh://github.com/explosion/spaCy
2017-01-16 14:02:12 +01:00
Matthew Honnibal
e889cd698e
Increment version
2017-01-16 14:01:35 +01:00
Ines Montani
5e3793f711
Update README.rst
2017-01-16 14:00:56 +01:00
Matthew Honnibal
e7f8e13cf3
Make Token hashable. Fixes #743
2017-01-16 13:27:57 +01:00
Matthew Honnibal
2c60d0cb1e
Test #743: Tokens unhashable.
2017-01-16 13:27:26 +01:00
Matthew Honnibal
48c712f1c1
Merge branch 'master' of ssh://github.com/explosion/spaCy
2017-01-16 13:18:06 +01:00
Matthew Honnibal
7ccf490c73
Increment version
2017-01-16 13:17:58 +01:00
Matthew Honnibal
d4e6d4c1c4
Use new thinc
2017-01-16 13:17:14 +01:00
Ines Montani
50878ef598
Exclude "were" and "Were" from tokenizer exceptions and add regression test (resolves #744)
2017-01-16 13:10:38 +01:00
Ines Montani
e053c7693b
Fix formatting
2017-01-16 13:09:52 +01:00
Ines Montani
116c675c3c
Merge pull request #742 from oroszgy/hu_tokenizer_fix
Improved Hungarian tokenizer
2017-01-14 23:52:44 +01:00
Gyorgy Orosz
92345b6a41
Further numeric test.
2017-01-14 22:44:19 +01:00
Gyorgy Orosz
b4df202bfa
Better error handling
2017-01-14 22:24:58 +01:00
Ines Montani
853130bcf8
Update installation instructions (see #727)
2017-01-14 22:12:42 +01:00
Gyorgy Orosz
b03a46792c
Better error handling
2017-01-14 22:09:29 +01:00
Gyorgy Orosz
a45f22913f
Added further abbreviations present in the Szeged corpus
2017-01-14 22:08:55 +01:00
Ines Montani
a3e3df3e33
Clean up fabfile
2017-01-14 21:30:38 +01:00