Commit Graph

362 Commits

Author SHA1 Message Date
svlandeg ff9ac39c88 read entity_ruler patterns with srsly.read_jsonl.v1 2020-10-05 22:50:14 +02:00
svlandeg 193e0d5a98 add docs for entity_ruler.initialize 2020-10-05 18:04:08 +02:00
svlandeg 9eb813a35d Merge remote-tracking branch 'upstream/develop' into fix/patterns-init 2020-10-05 17:49:44 +02:00
svlandeg 4e3ace4b8c is_trainable method 2020-10-05 17:43:42 +02:00
svlandeg 65abd77779 add finish_update to Pipe 2020-10-05 16:23:33 +02:00
svlandeg 251b3eb4e5 add initialize method for entity_ruler 2020-10-05 14:59:13 +02:00
Sofie Van Landeghem f4f49f5877 update blis (#6198)
* allow higher blis version

* fix typo

* bump to 3.0.0a34

* fix pins in other files
2020-10-05 14:58:56 +02:00
Ines Montani 11347f34da Tidy up, tests and docs 2020-10-04 13:54:05 +02:00
Matthew Honnibal 96b636c2d3 Update attribute ruler 2020-10-04 13:08:21 +02:00
Ines Montani bcd52e5486 Tidy up errors and warnings 2020-10-04 11:16:31 +02:00
Ines Montani d3b3663942 Adjust error message and add test 2020-10-04 10:11:27 +02:00
Ines Montani cc08c88a89 Merge pull request #6187 from svlandeg/fix/begin_training_pipe 2020-10-04 10:01:02 +02:00
svlandeg 3f657ed3a1 implement warning in __init_subclass__ instead 2020-10-03 22:34:10 +02:00
Matthew Honnibal 3b2a78720c Upd morphologizer 2020-10-03 19:35:19 +02:00
Matthew Honnibal 4fccd2ceaf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-10-03 19:13:55 +02:00
Matthew Honnibal 8ea8b7d940 Support loading labels in morphologizer 2020-10-03 19:13:42 +02:00
Ines Montani 80603f0fa5 Make SentenceRecognizer.label_data return None
Overwrite the method from the base class (Tagger) but don't export anything in "init labels"
2020-10-03 18:54:09 +02:00
Ines Montani 3bc3c05fcc Tidy up and auto-format 2020-10-03 17:20:18 +02:00
Ines Montani dd542ec6a4 Fix label initialization of textcat component (#6190) 2020-10-03 17:07:38 +02:00
Ines Montani f0b30aedad Make lemmatizers use initialize logic (#6182)
* Make lemmatizer use initialize logic and tidy up

* Fix typo

* Raise for uninitialized tables
2020-10-02 15:42:36 +02:00
Adriane Boyd 86c3ec9c2b Refactor Token morph setting (#6175)
* Refactor Token morph setting

* Remove `Token.morph_`
* Add `Token.set_morph()`
  * `0` resets `token.c.morph` to unset
  * Any other values are passed to `Morphology.add`

* Add token.morph setter to set from MorphAnalysis
2020-10-01 22:21:46 +02:00
Ines Montani f2627157c8 Update docs [ci skip] 2020-10-01 17:38:17 +02:00
Ines Montani b799af16de Don't raise in Pipe.initialize if not implemented 2020-09-30 00:05:27 +02:00
Ines Montani fa47f87924 Tidy up and auto-format 2020-09-29 21:39:28 +02:00
Matthew Honnibal a4da3120b4 Fix multitasks 2020-09-29 18:33:16 +02:00
Matthew Honnibal 0b5c72fce2 Fix incorrect docstrings 2020-09-29 18:30:38 +02:00
Matthew Honnibal e4f535a964 Fix Pipe.labels 2020-09-29 16:55:07 +02:00
Matthew Honnibal 1fd002180e Allow more components to use labels 2020-09-29 16:48:56 +02:00
Matthew Honnibal 99bff78617 Use labels in tagger 2020-09-29 16:48:44 +02:00
Matthew Honnibal 58c8d4b414 Add label_data property to pipeline 2020-09-29 16:22:13 +02:00
Ines Montani f171903139 Clean up sgd and pipeline -> nlp 2020-09-29 12:20:26 +02:00
Ines Montani 42f0e4c946 Clean up 2020-09-29 12:14:08 +02:00
Matthew Honnibal 9c8b2524fe Upd initialize args 2020-09-29 12:08:37 +02:00
Matthew Honnibal f2d1b7feb5 Clean up sgd 2020-09-29 12:00:08 +02:00
Ines Montani dec984a9c1 Update Language.initialize and support components/tokenizer settings 2020-09-29 11:52:45 +02:00
Matthew Honnibal b3b6868639 Remove 'sgd' arg from component initialize 2020-09-29 11:42:35 +02:00
Ines Montani ff9a63bfbd begin_training -> initialize 2020-09-28 21:35:09 +02:00
Adriane Boyd 6c25e60089 Simplify string match IDs for AttributeRuler 2020-09-26 11:12:39 +02:00
Matthew Honnibal 702edf52a0 Fix attributeruler 2020-09-26 00:30:48 +02:00
Matthew Honnibal 821f37254c Fix attributeruler 2020-09-26 00:19:53 +02:00
Matthew Honnibal 98327f66a9 Fix attributeruler key 2020-09-25 23:20:50 +02:00
Matthew Honnibal 16475528f7 Fix skipped documents in entity scorer (#6137)
* Fix skipped documents in entity scorer

* Add back the skipping of unannotated entities

* Update spacy/scorer.py

* Use more specific NER scorer

* Fix import

* Fix get_ner_prf

* Add scorer

* Fix scorer

Co-authored-by: Ines Montani <ines@ines.io>
2020-09-24 20:38:57 +02:00
Ines Montani 0b52b6904c Update entity_linker.py 2020-09-24 17:10:35 +02:00
Adriane Boyd 59340606b7 Add option to disable Matcher errors (#6125)
* Add option to disable Matcher errors

* Add option to disable Matcher errors when a doc doesn't contain a
particular type of annotation

Minor additional change:

* Update `AttributeRuler.load_from_morph_rules` to allow direct `MORPH`
values

* Rename suppress_errors to allow_missing

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>

* Refactor annotation checks in Matcher and PhraseMatcher

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-09-24 16:54:39 +02:00
Sofie Van Landeghem c7eedd3534 updates to NEL functionality (#6132)
* NEL: read sentences and ents from reference

* fiddling with sent_start annotations

* add KB serialization test

* KB write additional file with strings.json

* score_links function to calculate NEL P/R/F

* formatting

* documentation
2020-09-24 16:53:59 +02:00
Ines Montani ae51f580c1 Fix handling of score_weights 2020-09-24 10:27:33 +02:00
svlandeg dd2292793f 'parser' instead of 'deps' for state_type 2020-09-23 16:53:49 +02:00
svlandeg 6c85fab316 state_type and extra_state_tokens instead of nr_feature_tokens 2020-09-23 13:35:09 +02:00
Sofie Van Landeghem d53c84b6d6 avoid None callback (#6100) 2020-09-22 13:54:44 +02:00
svlandeg 781fae678b Merge remote-tracking branch 'upstream/develop' into fix/corpus 2020-09-17 09:24:36 +02:00