spaCy/spacy
Koichi Yasuoka 0afb54ac93
JapaneseTokenizer.pipe added (#6515)
* JapaneseTokenizer.pipe added

For [spacymoji](https://spacy.io/universe/project/spacymoji) with `Japanese()`.

* DummyTokenizer.pipe added instead
2020-12-08 20:02:23 +01:00
cli Remove non-working --use-chars from train CLI 2020-12-08 08:30:00 +01:00
data Make spacy/data a package 2017-03-18 20:04:22 +01:00
displacy Fix on EntityRendered to support break lines (after last entity) (closes #5838) 2020-07-29 18:48:39 +02:00
lang Added Multext-East V5 tagset for Croatian language (#6248) 2020-11-05 12:19:22 +01:00
matcher Add SPACY as a Matcher attribute (#6463) 2020-11-30 09:34:50 +08:00
ml Reproducibility for TextCat and Tok2Vec (#6218) 2020-10-08 00:43:46 +02:00
pipeline
syntax Restore cleanup_beam method (#6446) 2020-11-25 13:21:48 +01:00
tests
tokens Only set NORM on Token in retokenizer (#6464) 2020-11-30 09:35:42 +08:00
__init__.pxd
__init__.py
__main__.py
_ml.py Bugfix textcat reproducibility on GPU (#6411) 2020-11-23 12:29:35 +01:00
about.py Set version to v2.3.4 2020-11-26 08:48:52 +01:00
analysis.py
attrs.pxd
attrs.pyx
compat.py
errors.py
glossary.py
gold.pxd
gold.pyx
kb.pxd
kb.pyx
language.py Move max_length to nlp.make_doc() (#6512) 2020-12-08 14:24:02 +08:00
lemmatizer.py Fix lemmatizer is_base_form for python2.7 (#5734) 2020-07-09 22:11:24 +02:00
lexeme.pxd
lexeme.pyx
lookups.py
morphology.pxd
morphology.pyx Improve tag map initialization and updating (#5768) 2020-07-19 11:13:39 +02:00
parts_of_speech.pxd
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
scorer.py
strings.pxd
strings.pyx Merge branch 'master' into feature/lemmatizer 2019-03-16 13:44:22 +01:00
structs.pxd
symbols.pxd
symbols.pyx
tokenizer.pxd
tokenizer.pyx
typedefs.pxd
typedefs.pyx
util.py
vectors.pyx
vocab.pxd
vocab.pyx