spaCy/spacy/tests/serialize
Adriane Boyd f4339f9bff
Fix tokenizer cache flushing (#7836)
* Fix tokenizer cache flushing

Fix/simplify tokenizer init detection in order to fix cache flushing
when properties are modified.

* Remove init reloading logic

* Remove logic disabling `_reload_special_cases` on init
  * Setting `rules` last in `__init__` (as before) means that setting
    other properties doesn't reload any special cases
  * Reset `rules` first in `from_bytes` so that setting other properties
    during deserialization doesn't reload any special cases
    unnecessarily
* Reset all properties in `Tokenizer.from_bytes` to allow any settings
  to be `None`

* Also reset special matcher when special cache is flushed

* Remove duplicate special case validation

* Add test for special cases flushing

* Extend test for tokenizer deserialization of None values
2021-04-22 18:14:57 +10:00
..
__init__.py
test_resource_warning.py
test_serialize_config.py Ensure hyphen in config file works as string value (#7642) 2021-04-12 14:35:57 +02:00
test_serialize_doc.py
test_serialize_extension_attrs.py
test_serialize_kb.py
test_serialize_language.py
test_serialize_pipeline.py
test_serialize_tokenizer.py Fix tokenizer cache flushing (#7836) 2021-04-22 18:14:57 +10:00
test_serialize_vocab_strings.py