Commit Graph

372 Commits

Author SHA1 Message Date
svlandeg 44e14ccae8 one more losses fix 2020-10-14 15:11:34 +02:00
svlandeg 0aa8851878 always return losses 2020-10-14 15:00:49 +02:00
svlandeg 68d79796c6 add test for vocab after serializing KB 2020-10-10 20:59:48 +02:00
Ines Montani bfa3931c9d Revert added_strings change (#6236) 2020-10-10 18:55:07 +02:00
Adriane Boyd 39aabf50ab Also rename to include_static_vectors in CharEmbed 2020-10-09 11:54:48 +02:00
Sofie Van Landeghem d093d6343b TrainablePipe (#6213)
* rename Pipe to TrainablePipe

* split functionality between Pipe and TrainablePipe

* remove unnecessary methods from certain components

* cleanup

* hasattr(component, "pipe") should be sufficient again

* remove serialization and vocab/cfg from Pipe

* unify _ensure_examples and validate_examples

* small fixes

* hasattr checks for self.cfg and self.vocab

* make is_resizable and is_trainable properties

* serialize strings.json instead of vocab

* fix KB IO + tests

* fix typos

* more typos

* _added_strings as a set

* few more tests specifically for _added_strings field

* bump to 3.0.0a36
2020-10-08 21:33:49 +02:00
Ines Montani 064575d79d Merge pull request #6216 from svlandeg/feature/nel-initialize 2020-10-08 11:14:12 +02:00
svlandeg eaf5c265cb set_kb method for entity_linker 2020-10-08 10:34:01 +02:00
Ines Montani 010956d493 Clear rule-based components on initialize 2020-10-08 09:51:31 +02:00
svlandeg 33c2d4af16 move kb_loader to initialize for NEL instead of constructor 2020-10-07 14:56:00 +02:00
svlandeg ff9ac39c88 read entity_ruler patterns with srsly.read_jsonl.v1 2020-10-05 22:50:14 +02:00
svlandeg 193e0d5a98 add docs for entity_ruler.initialize 2020-10-05 18:04:08 +02:00
svlandeg 9eb813a35d Merge remote-tracking branch 'upstream/develop' into fix/patterns-init 2020-10-05 17:49:44 +02:00
svlandeg 4e3ace4b8c is_trainable method 2020-10-05 17:43:42 +02:00
svlandeg 65abd77779 add finish_update to Pipe 2020-10-05 16:23:33 +02:00
svlandeg 251b3eb4e5 add initialize method for entity_ruler 2020-10-05 14:59:13 +02:00
Sofie Van Landeghem f4f49f5877 update blis (#6198)
* allow higher blis version

* fix typo

* bump to 3.0.0a34

* fix pins in other files
2020-10-05 14:58:56 +02:00
Ines Montani 11347f34da Tidy up, tests and docs 2020-10-04 13:54:05 +02:00
Matthew Honnibal 96b636c2d3 Update attribute ruler 2020-10-04 13:08:21 +02:00
Ines Montani bcd52e5486 Tidy up errors and warnings 2020-10-04 11:16:31 +02:00
Ines Montani d3b3663942 Adjust error message and add test 2020-10-04 10:11:27 +02:00
Ines Montani cc08c88a89 Merge pull request #6187 from svlandeg/fix/begin_training_pipe 2020-10-04 10:01:02 +02:00
svlandeg 3f657ed3a1 implement warning in __init_subclass__ instead 2020-10-03 22:34:10 +02:00
Matthew Honnibal 3b2a78720c Upd morphologizer 2020-10-03 19:35:19 +02:00
Matthew Honnibal 4fccd2ceaf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-10-03 19:13:55 +02:00
Matthew Honnibal 8ea8b7d940 Support loading labels in morphologizer 2020-10-03 19:13:42 +02:00
Ines Montani 80603f0fa5 Make SentenceRecognizer.label_data return None
Overwrite the method from the base class (Tagger) but don't export anything in "init labels"
2020-10-03 18:54:09 +02:00
Ines Montani 3bc3c05fcc Tidy up and auto-format 2020-10-03 17:20:18 +02:00
Ines Montani dd542ec6a4 Fix label initialization of textcat component (#6190) 2020-10-03 17:07:38 +02:00
Ines Montani f0b30aedad Make lemmatizers use initialize logic (#6182)
* Make lemmatizer use initialize logic and tidy up

* Fix typo

* Raise for uninitialized tables
2020-10-02 15:42:36 +02:00
Adriane Boyd 86c3ec9c2b Refactor Token morph setting (#6175)
* Refactor Token morph setting

* Remove `Token.morph_`
* Add `Token.set_morph()`
  * `0` resets `token.c.morph` to unset
  * Any other values are passed to `Morphology.add`

* Add token.morph setter to set from MorphAnalysis
2020-10-01 22:21:46 +02:00
Ines Montani f2627157c8 Update docs [ci skip] 2020-10-01 17:38:17 +02:00
Ines Montani b799af16de Don't raise in Pipe.initialize if not implemented 2020-09-30 00:05:27 +02:00
Ines Montani fa47f87924 Tidy up and auto-format 2020-09-29 21:39:28 +02:00
Matthew Honnibal a4da3120b4 Fix multitasks 2020-09-29 18:33:16 +02:00
Matthew Honnibal 0b5c72fce2 Fix incorrect docstrings 2020-09-29 18:30:38 +02:00
Matthew Honnibal e4f535a964 Fix Pipe.labels 2020-09-29 16:55:07 +02:00
Matthew Honnibal 1fd002180e Allow more components to use labels 2020-09-29 16:48:56 +02:00
Matthew Honnibal 99bff78617 Use labels in tagger 2020-09-29 16:48:44 +02:00
Matthew Honnibal 58c8d4b414 Add label_data property to pipeline 2020-09-29 16:22:13 +02:00
Ines Montani f171903139 Clean up sgd and pipeline -> nlp 2020-09-29 12:20:26 +02:00
Ines Montani 42f0e4c946 Clean up 2020-09-29 12:14:08 +02:00
Matthew Honnibal 9c8b2524fe Upd initialize args 2020-09-29 12:08:37 +02:00
Matthew Honnibal f2d1b7feb5 Clean up sgd 2020-09-29 12:00:08 +02:00
Ines Montani dec984a9c1 Update Language.initialize and support components/tokenizer settings 2020-09-29 11:52:45 +02:00
Matthew Honnibal b3b6868639 Remove 'sgd' arg from component initialize 2020-09-29 11:42:35 +02:00
Ines Montani ff9a63bfbd begin_training -> initialize 2020-09-28 21:35:09 +02:00
Adriane Boyd 6c25e60089 Simplify string match IDs for AttributeRuler 2020-09-26 11:12:39 +02:00
Matthew Honnibal 702edf52a0 Fix attributeruler 2020-09-26 00:30:48 +02:00
Matthew Honnibal 821f37254c Fix attributeruler 2020-09-26 00:19:53 +02:00