spaCy

adrianeboyd 3bf111585d Update Japanese tokenizer config and add serialization (#5562)		2020-06-08 16:29:05 +02:00
* Use `config` dict for tokenizer settings
* Add serialization of split mode setting
* Add tests for tokenizer split modes and serialization of split mode setting

Based on #5561
cli	prevent loading a pretrained Tok2Vec layer AND pretrained components	2020-05-29 17:38:33 +02:00
data	…
displacy	Add missing import	2020-04-28 13:48:37 +02:00
lang	Update Japanese tokenizer config and add serialization (#5562)	2020-06-08 16:29:05 +02:00
matcher	Switch to new add API in PhraseMatcher unpickle	2020-05-25 11:22:47 +02:00
ml	…
pipeline	Preserve _SP when filtering tag map in Tagger	2020-05-31 19:57:54 +02:00
syntax	Revert "Remove peeking from Parser.begin_training (#5456)"	2020-05-29 23:21:55 +02:00
tests	Update Japanese tokenizer config and add serialization (#5562)	2020-06-08 16:29:05 +02:00
tokens	Remove MorphAnalysis __str__ and __repr__	2020-05-29 14:33:47 +02:00
__init__.pxd	…
__init__.py	Simplify warnings	2020-04-28 13:37:37 +02:00
__main__.py	…
_ml.py	Skip duplicate lexeme rank setting (#5401)	2020-05-14 18:26:12 +02:00
about.py	Switch to v2.3.0.dev0	2020-05-25 12:57:20 +02:00
analysis.py	Simplify warnings	2020-04-28 13:37:37 +02:00
attrs.pxd	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
attrs.pyx	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
compat.py	…
errors.py	Add rudimentary version checks on model load	2020-06-02 17:33:48 +02:00
glossary.py	…
gold.pxd	…
gold.pyx	Add warning for misaligned character offset spans (#5007)	2020-05-19 16:01:18 +02:00
kb.pxd	Tidy up and avoid absolute spacy imports in core	2020-05-21 20:05:03 +02:00
kb.pyx	Merge pull request #5264 from lfiedler/issue-5230	2020-05-22 00:31:07 +02:00
language.py	Improve vector name loading from model meta	2020-05-27 14:48:54 +02:00
lemmatizer.py	Return lowercase form as default except for PROPN	2020-05-20 15:35:08 +02:00
lexeme.pxd	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
lexeme.pyx	Avoid libc.stdint for UINT64_MAX (#5545)	2020-06-04 20:02:05 +02:00
lookups.py	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
morphology.pxd	…
morphology.pyx	Prefer _SP over SP for default tag map space attrs	2020-05-26 14:57:13 +02:00
parts_of_speech.pxd	…
parts_of_speech.pyx	…
scorer.py	Fix GoldParse init when token count differs (#5191)	2020-03-26 10:46:23 +01:00
strings.pxd	…
strings.pyx	…
structs.pxd	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
symbols.pxd	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
symbols.pyx	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
tokenizer.pxd	Rename to url_match	2020-05-22 12:41:03 +02:00
tokenizer.pyx	Rename to url_match	2020-05-22 12:41:03 +02:00
typedefs.pxd	…
typedefs.pyx	…
util.py	Remove unnecessary check	2020-06-02 17:41:25 +02:00
vectors.pyx	fix deserialization order	2020-05-30 12:53:32 +02:00
vocab.pxd	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
vocab.pyx	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00