spaCy/spacy
Sofie Van Landeghem cd70c3cb79
Fixing pretrain (#7342)
* initialize NLP with train corpus

* add more pretraining tests

* more tests

* function to fetch tok2vec layer for pretraining

* clarify parameter name

* test different objectives

* formatting

* fix check for static vectors when using vectors objective

* clarify docs

* logger statement

* fix init_tok2vec and proc.initialize order

* test training after pretraining

* add init_config tests for pretraining

* pop pretraining block to avoid config validation errors

* custom errors
2021-03-09 14:01:13 +11:00
cli Merge pull request #7255 from adrianeboyd/bugfix/extraneous-tok2vec 2021-03-03 23:15:06 +11:00
displacy Replace links to nightly docs [ci skip] 2021-01-30 20:09:38 +11:00
lang Fix Ukrainian lemmatizer init (#7127) 2021-02-22 11:05:08 +11:00
matcher Run PhraseMatcher on Spans (#6918) 2021-02-10 23:43:32 +11:00
ml Fixing pretrain (#7342) 2021-03-09 14:01:13 +11:00
pipeline Re-refactor Sentencizer with Pipe API (#7176) 2021-02-26 09:48:14 +01:00
tests Fixing pretrain (#7342) 2021-03-09 14:01:13 +11:00
tokens Fix spans weak ref in doc copy (#7225) 2021-02-28 12:32:48 +11:00
training Fixing pretrain (#7342) 2021-03-09 14:01:13 +11:00
__init__.pxd
__init__.py Pass on vocab arg in spacy.blank() (#6924) 2021-02-04 15:09:01 +01:00
__main__.py
about.py Increment version [ci skip] 2021-02-14 13:36:13 +11:00
attrs.pxd
attrs.pyx
compat.py Use Literal type for nr_feature_tokens 2020-09-23 16:00:03 +02:00
default_config.cfg Add initialize.before_init and after_init callbacks 2021-01-12 13:07:44 +01:00
default_config_pretraining.cfg pretrain architectures (#6451) 2020-12-08 14:41:03 +08:00
errors.py Fixing pretrain (#7342) 2021-03-09 14:01:13 +11:00
glossary.py
kb.pxd Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
kb.pyx Replace links to nightly docs [ci skip] 2021-01-30 20:09:38 +11:00
language.py Fixing pretrain (#7342) 2021-03-09 14:01:13 +11:00
lexeme.pxd
lexeme.pyx reduce memory load when reading all vectors from file (#6945) 2021-02-07 08:05:43 +08:00
lookups.py Replace links to nightly docs [ci skip] 2021-01-30 20:09:38 +11:00
morphology.pxd
morphology.pyx Prevent 0-length mem alloc (#6653) 2021-01-06 12:50:17 +11:00
parts_of_speech.pxd
parts_of_speech.pyx
pipe_analysis.py Tidy up and auto-format 2020-09-29 21:39:28 +02:00
schemas.py Support env vars and CLI overrides for project.yml 2021-02-10 13:45:27 +11:00
scorer.py fix type in docs 2021-02-26 14:27:10 +01:00
strings.pxd
strings.pyx Replace links to nightly docs [ci skip] 2021-01-30 20:09:38 +11:00
structs.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
symbols.pxd introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
symbols.pyx introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
tokenizer.pxd
tokenizer.pyx Run PhraseMatcher on Spans (#6918) 2021-02-10 23:43:32 +11:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx
util.py Fix is_cython_func for additional imported code 2021-03-01 16:37:39 +01:00
vectors.pyx Replace links to nightly docs [ci skip] 2021-01-30 20:09:38 +11:00
vocab.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
vocab.pyx Extend docs related to Vocab.get_noun_chunks 2021-02-25 16:38:21 +01:00