spaCy/spacy
Matthew Honnibal 333b1a308b
Adapt parser and NER for transformers (#5449)
* Draft layer for BILUO actions

* Fixes to biluo layer

* WIP on BILUO layer

* Add tests for BILUO layer

* Format

* Fix transitions

* Update test

* Link in the simple_ner

* Update BILUO tagger

* Update __init__

* Import simple_ner

* Update test

* Import

* Add files

* Add config

* Fix label passing for BILUO and tagger

* Fix label handling for simple_ner component

* Update simple NER test

* Update config

* Hack train script

* Update BILUO layer

* Fix SimpleNER component

* Update train_from_config

* Add biluo_to_iob helper

* Add IOB layer

* Add IOBTagger model

* Update biluo layer

* Update SimpleNER tagger

* Update BILUO

* Read random seed in train-from-config

* Update use of normal_init

* Fix normalization of gradient in SimpleNER

* Update IOBTagger

* Remove print

* Tweak masking in BILUO

* Add dropout in SimpleNER

* Update thinc

* Tidy up simple_ner

* Fix biluo model

* Unhack train-from-config

* Update setup.cfg and requirements

* Add tb_framework.py for parser model

* Try to avoid memory leak in BILUO

* Move ParserModel into spacy.ml, avoid need for subclass.

* Use updated parser model

* Remove incorrect call to model.initialize in PrecomputableAffine

* Update parser model

* Avoid divide by zero in tagger

* Add extra dropout layer in tagger

* Refine minibatch_by_words function to avoid oom

* Fix parser model after refactor

* Try to avoid div-by-zero in SimpleNER

* Fix infinite loop in minibatch_by_words

* Use SequenceCategoricalCrossentropy in Tagger

* Fix parser model when a hidden layer is present

* Remove extra dropout from tagger

* Add extra nan check in tagger

* Fix thinc version

* Update tests and imports

* Fix test

* Update test

* Update tests

* Fix tests

* Fix test

Co-authored-by: Ines Montani <ines@ines.io>
2020-05-18 22:23:33 +02:00
cli Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
displacy Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
lang Remove "pala" tokenizer exception for Spanish (#5265) 2020-04-09 10:21:20 +02:00
matcher Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
ml Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
pipeline Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
syntax Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
tests Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
tokens Update morphologizer (#5108) 2020-04-02 14:46:32 +02:00
__init__.pxd
__init__.py Simplify warnings 2020-02-28 12:20:23 +01:00
__main__.py Update spaCy for thinc 8.0.0 (#4920) 2020-01-29 17:06:46 +01:00
_ml.py take care of global vectors in multiprocessing (#5081) 2020-03-03 13:58:22 +01:00
about.py bump to 3.0.0.dev7 and thinc to 8.0.0a8 2020-05-15 13:25:54 +02:00
analysis.py Simplify warnings 2020-02-28 12:20:23 +01:00
attrs.pxd Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
attrs.pyx Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
compat.py Merge branch 'develop' into refactor/remove-symlinks 2020-02-18 17:22:20 +01:00
errors.py throw warning when model_cfg is None 2020-05-15 11:02:10 +02:00
glossary.py Tidy up and auto-format 2020-02-18 15:38:18 +01:00
gold.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
gold.pyx Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
kb.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
kb.pyx Merge branch 'develop' into refactor/simplify-warnings 2020-03-04 16:38:55 +01:00
language.py Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
lemmatizer.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
lexeme.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
lexeme.pyx Simplify warnings 2020-02-28 12:20:23 +01:00
lookups.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
morphology.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
morphology.pyx Fix small errors 2020-03-26 13:47:31 +01:00
parts_of_speech.pxd
parts_of_speech.pyx Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
schemas.py Add sent_start to pattern schema 2020-03-26 14:05:40 +01:00
scorer.py Update morphologizer (#5108) 2020-04-02 14:46:32 +02:00
strings.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
strings.pyx Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
structs.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
symbols.pxd Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
symbols.pyx Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
tokenizer.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
tokenizer.pyx Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
typedefs.pxd Update spaCy for thinc 8.0.0 (#4920) 2020-01-29 17:06:46 +01:00
typedefs.pyx
util.py Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
vectors.pyx Merge branch 'master' into tmp/sync 2020-03-26 13:38:14 +01:00
vocab.pxd Tidy up compiler flags and imports (#5071) 2020-03-02 11:48:10 +01:00
vocab.pyx Tidy up and auto-format 2020-02-18 15:38:18 +01:00