Commit Graph

24 Commits

Author SHA1 Message Date
Sofie Van Landeghem 1543558d08
Add test for old architectures (#10751)
* add v1 and v2 tests for tok2vec architectures

* textcat architectures are not "layers"

* test older textcat architectures

* test older parser architecture
2022-05-10 08:24:42 +02:00
Daniël de Kok e5debc68e4
Tagger: use unnormalized probabilities for inference (#10197)
* Tagger: use unnormalized probabilities for inference

Using unnormalized softmax avoids use of the relatively expensive exp function,
which can significantly speed up non-transformer models (e.g. I got a speedup
of 27% on a German tagging + parsing pipeline).

* Add spacy.Tagger.v2 with configurable normalization

Normalization of probabilities is disabled by default to improve
performance.

* Update documentation, models, and tests to spacy.Tagger.v2

* Move Tagger.v1 to spacy-legacy

* docs/architectures: run prettier

* Unnormalized softmax is now a Softmax_v2 option

* Require thinc 8.0.14 and spacy-legacy 3.0.9
2022-03-15 14:15:31 +01:00
Adriane Boyd f4c74764b8
Fix Tok2Vec for empty batches (#10324)
* Add test for tok2vec with vectors and empty docs

* Add shortcut for empty batch in Tok2Vec.predict

* Avoid types
2022-02-21 10:22:36 +01:00
Adriane Boyd 5eeb25f043 Tidy up code 2021-06-28 12:08:15 +02:00
Sofie Van Landeghem fff662e41f
Ensemble textcat with listener (#8012)
* add unit test for two listeners, with a textcat ensemble in the middle

* return zero gradients instead of None in accumulate_gradient
2021-05-31 18:21:06 +10:00
svlandeg ece8be4fec extend test to training with replaced tok2vec layer 2021-05-12 11:32:22 +02:00
Adriane Boyd 36ecba224e
Set up GPU CI testing (#7293)
* Set up CI for tests with GPU agent

* Update tests for enabled GPU

* Fix steps filename

* Add parallel build jobs as a setting

* Fix test requirements

* Fix install test requirements condition

* Fix pipeline models test

* Reset current ops in prefer/require testing

* Fix more tests

* Remove separate test_models test

* Fix regression 5551

* fix StaticVectors for GPU use

* fix vocab tests

* Fix regression test 5082

* Move azure steps to .github and reenable default pool jobs

* Consolidate/rename azure steps

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2021-04-22 14:58:29 +02:00
Ines Montani e6accb3a9e Tidy up and auto-format 2021-01-30 12:52:33 +11:00
Ines Montani bc089b693c Update tests 2021-01-29 19:38:09 +11:00
Ines Montani 911dfcccfc Add option to replace listeners for sourced components 2021-01-29 15:57:04 +11:00
Sofie Van Landeghem 75d9019343
Fix types of Tok2Vec encoding architectures (#6442)
* fix TorchBiLSTMEncoder documentation

* ensure the types of the encoding Tok2vec layers are correct

* update references from v1 to v2 for the new architectures
2021-01-07 16:39:27 +11:00
Adriane Boyd 39aabf50ab Also rename to include_static_vectors in CharEmbed 2020-10-09 11:54:48 +02:00
Matthew Honnibal db84d175c3 Fix test 2020-10-05 19:59:30 +02:00
Matthew Honnibal 6dcc4a0ba6 Simplify MultiHashEmbed signature 2020-10-05 19:57:45 +02:00
Matthew Honnibal 7d93575f35 spacy/tests/ 2020-10-05 15:28:12 +02:00
Matthew Honnibal f4ca9a39cb spacy/tests/ 2020-10-05 15:27:06 +02:00
Matthew Honnibal f2f1deca66 spacy/tests/ 2020-10-05 15:24:33 +02:00
Ines Montani fa47f87924 Tidy up and auto-format 2020-09-29 21:39:28 +02:00
Ines Montani ff9a63bfbd begin_training -> initialize 2020-09-28 21:35:09 +02:00
Ines Montani 7e938ed63e Update config resolution to use new Thinc 2020-09-27 22:21:31 +02:00
Sofie Van Landeghem 86a08f819d
tok2vec.update instead of predict (#6113) 2020-09-22 21:54:52 +02:00
Sofie Van Landeghem d53c84b6d6
avoid None callback (#6100) 2020-09-22 13:54:44 +02:00
svlandeg 781fae678b Merge remote-tracking branch 'upstream/develop' into fix/corpus 2020-09-17 09:24:36 +02:00
svlandeg bd87e8686e move tests to correct subdir 2020-09-15 21:40:38 +02:00