Commit Graph

2409 Commits

Author SHA1 Message Date
svlandeg 218abaa69a typo 2020-11-20 22:36:49 +01:00
svlandeg e861e928df more small corrections 2020-11-20 22:29:58 +01:00
svlandeg 5ac0867427 final fixes 2020-11-20 22:18:53 +01:00
svlandeg 331ec83493 edits and updates to implementing REL component docs 2020-11-20 21:41:52 +01:00
svlandeg 4a3e611abc small fixes and formatting 2020-11-20 15:55:05 +01:00
svlandeg 124f49feb6 update REL model code 2020-11-20 15:25:20 +01:00
svlandeg 636be3c791 Merge remote-tracking branch 'upstream/develop' into feature/trf-docs 2020-11-19 14:15:35 +01:00
Sofie Van Landeghem 165993d8e5 fix typo in transformer docs (#6404) 2020-11-19 14:11:38 +01:00
M. Revuelta Espinosa 51232ffb9e Update universe.json (include PatternOmatic) (#6399)
Request to include PatternOmatic in spaCy Universe

Adds @revuel to contributors
2020-11-19 13:15:50 +01:00
Adriane Boyd 3cf6479467 Fix JSON in #6395 2020-11-17 15:25:41 +01:00
Sam Edwardes 78913a4f95 Added spaCyTextBlob to universe.json (#6395) 2020-11-17 14:38:34 +01:00
Adriane Boyd 96726ec1f6 Fix DocBin init in training example (#6396) 2020-11-17 14:36:44 +01:00
Adriane Boyd ed32fa80cd Update source install instructions
* Use `pip install` instead of `python setup.py install`
* For developers recommend:
  * `python setup.py build_ext --inplace -j N`
  * `python setup.py develop`
2020-11-16 10:13:51 +01:00
svlandeg 99d0412b6e add link to REL project 2020-11-15 18:35:56 +01:00
svlandeg 73fc1ed963 remove labels from morphologizer constructor 2020-11-11 21:48:50 +01:00
svlandeg fcd79e0655 remove set_morphology from docs 2020-11-11 21:32:34 +01:00
Ines Montani 3ca5c7082d Use pip install . in quickstart [ci skip] 2020-11-10 17:27:49 +08:00
Ines Montani de6453940e Merge pull request #6305 from svlandeg/feature/score-docs [ci skip] 2020-11-10 02:52:11 +01:00
Ines Montani 4d337eedf2 Merge pull request #6322 from medspacy/master 2020-11-10 02:47:29 +01:00
Ines Montani d7950c5ada Merge pull request #6297 from adrianeboyd/docs/nightly-conda-install [ci skip] 2020-11-10 02:45:52 +01:00
Ines Montani 448bfbdc30 Remove conda from nightly install widget [ci skip] 2020-11-10 09:44:52 +08:00
svlandeg 789fb3d124 add docs for upstream argument of TransformerListener 2020-11-09 21:42:58 +01:00
Ines Montani 363ac73c72 Update docs [ci skip] 2020-11-09 12:43:26 +08:00
Adriane Boyd 8644ee3e3f Update TIGER link and tag description (#6344) 2020-11-05 09:33:00 +01:00
Sofie Van Landeghem 8ef056cf98 fix embed_size in Entity Linker architecture (#6343) 2020-11-04 22:20:13 +01:00
Ines Montani 019a1dd5e8 Fix v3 overview [ci skip] 2020-11-03 18:10:06 +01:00
Adriane Boyd a4b32b9552 Handle missing reference values in scorer (#6286)
* Handle missing reference values in scorer

Handle missing values in reference doc during scoring where it is
possible to detect an unset state for the attribute. If no reference
docs contain annotation, `None` is returned instead of a score. `spacy
evaluate` displays `-` for missing scores and the missing scores are
saved as `None`/`null` in the metrics.

Attributes without unset states:

* `token.head`: relies on `token.dep` to recognize unset values
* `doc.cats`: unable to handle missing annotation

Additional changes:

* add optional `has_annotation` check to `score_spans` to replace
`doc.sents` hack
* update `score_token_attr_per_feat` to handle missing and empty morph
representations
* fix bug in `Doc.has_annotation` for normalization of `IS_SENT_START`
vs. `SENT_START`

* Fix import

* Update return types
2020-11-03 15:47:18 +01:00
Alec Chapman 204c7c8a00 fix thumbnail link to be github raw url 2020-11-01 07:53:48 -07:00
Alec Chapman 73d22d96ff add medspacy to universe and fix example w/ cov-bsv 2020-10-29 07:53:56 -06:00
Adriane Boyd 8cc5ed6771 Add Macedonian to website languages 2020-10-29 08:49:56 +01:00
Adriane Boyd dc816bba9d Fix node name typo in dependency matcher example (#6311) 2020-10-28 16:32:46 +01:00
Adriane Boyd 4dd86306e9 Add Nepali to supported languages on website (#6315) 2020-10-28 16:32:07 +01:00
svlandeg 77688b0072 fix config 2020-10-26 11:14:34 +01:00
svlandeg 5878ff6bcd cleanup 2020-10-26 11:13:02 +01:00
svlandeg e95d9caa87 small edits 2020-10-26 11:09:25 +01:00
svlandeg a664994a81 adding score method to explanation of new component 2020-10-26 10:52:47 +01:00
Adriane Boyd 253480353c Remove zh from quickstart extras 2020-10-23 11:39:25 +02:00
Adriane Boyd af26886fff Fix formatting 2020-10-23 11:38:14 +02:00
Adriane Boyd c0b76f4c19 Add install step to "Compile from source" 2020-10-23 11:36:36 +02:00
Adriane Boyd 8fe7ede667 Add install step to source install quickstart 2020-10-23 11:34:43 +02:00
Adriane Boyd 4299a7f654 Setup / install / quickstart updates
* Add `cuda110` to setup.cfg and quickstart dropdown
* Switch to `pip` for pip-only packages in conda quickstart instructions
* Update zh pkuseg install message with version range and conda
* Remove `zh` from `extras_require` because the default doesn't require
additional packages
2020-10-23 11:27:54 +02:00
Kunal Sharma 01aec7a313 Adding MindMeld to Universe JSON (#6275)
* Adding Mindmeld to Universe JSON

Mindmeld is a conversational AI platform for deep-domain voice interfaces and chatbots. https://www.mindmeld.com/

* Signing contribution agreement.

Co-authored-by: kunshar2 <kunshar2@cisco.com>
2020-10-21 18:42:11 +02:00
Ines Montani 6523f2daac Merge pull request #6273 from adrianeboyd/bugfix/detailed-scores-in-evaluate2 2020-10-20 10:03:09 +02:00
Adriane Boyd fbe65b257b Convert accuracy numbers on website models page 2020-10-19 18:55:55 +02:00
Ines Montani b6b1c1e23c Merge pull request #6271 from walterhenry/develop-proof [ci skip] 2020-10-19 16:31:43 +02:00
walterhenry db24dc5614 Proofread remarks
I think these may be the last remarks for the nightly docs. Only two minor things actually.
2020-10-19 11:11:32 +02:00
Sofie Van Landeghem 75a202ce65 TextCat updates and fixes (#6263)
* small fix in example imports

* throw error when train_corpus or dev_corpus is not a string

* small fix in custom logger example

* limit macro_auc to labels with 2 annotations

* fix typo

* also create parents of output_dir if need be

* update documentation of textcat scores

* refactor TextCatEnsemble

* fix tests for new AUC definition

* bump to 3.0.0a42

* update docs

* rename to spacy.TextCatEnsemble.v2

* spacy.TextCatEnsemble.v1 in legacy

* cleanup

* small fix

* update to 3.0.0rc2

* fix import that got lost in merge

* cursed IDE

* fix two typos
2020-10-18 14:50:41 +02:00
Ines Montani e2f3c4e12d Fix robots [ci skip] 2020-10-16 17:44:13 +02:00
Adriane Boyd e896803792 Add and update website license links 2020-10-16 17:01:52 +02:00
Ines Montani c655742b8b Remove docs references to starters for now (see #6262) [ci skip] 2020-10-16 15:46:34 +02:00
Ines Montani 3851300e80 Update landing [ci skip] 2020-10-16 11:46:33 +02:00
Ines Montani c968d1560f Fix docs example [ci skip] 2020-10-16 11:33:20 +02:00
Ines Montani ba1e004049 Fix typo [ci skip] 2020-10-15 23:39:04 +02:00
Ines Montani 32dc4f4796 Sort models sidebar alphabetically [ci skip] 2020-10-15 22:47:16 +02:00
Ines Montani 20f80587d6 Merge pull request #6257 from walterhenry/develop-proof
A few tiny typo fixes to push through with release of nightly
2020-10-15 18:17:30 +02:00
walterhenry 75b7f86383 Three small typos
Some little typos since v3.0 is out.
2020-10-15 18:06:37 +02:00
Ines Montani 09dbbe75d7 Update docs [ci skip] 2020-10-15 17:27:24 +02:00
Ines Montani 7f05ccc170 Update docs [ci skip] 2020-10-15 12:35:30 +02:00
Ines Montani 4fa869e6f7 Update docs [ci skip] 2020-10-15 11:16:06 +02:00
Ines Montani 178760855f Merge branch 'develop' into master-tmp 2020-10-15 09:06:03 +02:00
Ines Montani abeafcbc08 Update docs [ci skip] 2020-10-15 08:58:30 +02:00
Ines Montani 050aa1e0e2 Update languages.json [ci skip] 2020-10-14 20:51:50 +02:00
Ines Montani a966c271f7 Update models docs [ci skip] 2020-10-14 20:50:23 +02:00
Ines Montani a2d4aaee70 Apply suggestions from code review 2020-10-14 19:51:36 +02:00
Ines Montani d94e241fce Merge branch 'develop' into pr/6253 2020-10-14 16:55:46 +02:00
Ines Montani cb47f25cda Merge pull request #6252 from svlandeg/fix/docs 2020-10-14 16:43:12 +02:00
walterhenry 6af585dba5 New batch of proofs
Just tiny fixes to the docs as a proofreader
2020-10-14 16:37:57 +02:00
svlandeg 478a14a619 fix few typos 2020-10-14 15:01:19 +02:00
Ines Montani 1aa8e8f2af Update docs [ci skip] 2020-10-14 14:58:45 +02:00
Ines Montani 4d99d2b94a Update docs [ci skip] 2020-10-13 11:38:52 +02:00
svlandeg 40276fd3be update NEL docs after latest refactor 2020-10-12 11:41:27 +02:00
svlandeg 08cb085f6c Merge remote-tracking branch 'upstream/develop' into fix/various 2020-10-09 17:01:27 +02:00
Ines Montani 97ff090e49 Fix docs example [ci skip] 2020-10-09 16:03:57 +02:00
Ines Montani 9fb3244672 Merge pull request #6231 from adrianeboyd/feature/include-static-vectors 2020-10-09 15:54:52 +02:00
Adriane Boyd 2dd79454af Update docs 2020-10-09 14:42:07 +02:00
svlandeg 853edace37 fix MultiHashEmbed example in documentation 2020-10-09 14:11:06 +02:00
Ines Montani e50dc2c1c9 Update docs [ci skip] 2020-10-09 12:04:52 +02:00
Ines Montani 7c52def5da Merge pull request #6227 from adrianeboyd/chore/update-3.0.0a36-from-master 2020-10-09 10:49:20 +02:00
Ines Montani 329b61ee7b Update docs [ci skip] 2020-10-09 10:36:06 +02:00
Šarūnas Navickas 287ba94a2f Website (Universe): An entry for rita-dsl (#6138)
* Create zaibacu.md

* Add RITA-DSL entry

* Update agreement

* Fix formatting
2020-10-09 10:14:40 +02:00
delzac 668507be1b Reflect on usage doc that IS_SENT_START attribute exist (#6114)
* Reflect on usage doc that IS_SENT_START attribute exist

* Create delzac.md
2020-10-09 10:14:40 +02:00
Sofie Van Landeghem d093d6343b TrainablePipe (#6213)
* rename Pipe to TrainablePipe

* split functionality between Pipe and TrainablePipe

* remove unnecessary methods from certain components

* cleanup

* hasattr(component, "pipe") should be sufficient again

* remove serialization and vocab/cfg from Pipe

* unify _ensure_examples and validate_examples

* small fixes

* hasattr checks for self.cfg and self.vocab

* make is_resizable and is_trainable properties

* serialize strings.json instead of vocab

* fix KB IO + tests

* fix typos

* more typos

* _added_strings as a set

* few more tests specifically for _added_strings field

* bump to 3.0.0a36
2020-10-08 21:33:49 +02:00
Ines Montani 5ebd1fc2cf Update docs [ci skip] 2020-10-08 16:23:12 +02:00
Ines Montani 741796e500 Update docs [ci skip] 2020-10-08 14:31:34 +02:00
Ines Montani d1602e1ece Update docs [ci skip] 2020-10-08 11:56:50 +02:00
Ines Montani 064575d79d Merge pull request #6216 from svlandeg/feature/nel-initialize 2020-10-08 11:14:12 +02:00
Ines Montani 43e59bb22a Update docs and install extras [ci skip] 2020-10-08 10:58:50 +02:00
svlandeg eaf5c265cb set_kb method for entity_linker 2020-10-08 10:34:01 +02:00
svlandeg bcaad28eda fix typos 2020-10-07 13:05:37 +02:00
delzac 15ea401b39 Reflect on usage doc that IS_SENT_START attribute exist (#6114)
* Reflect on usage doc that IS_SENT_START attribute exist

* Create delzac.md
2020-10-06 15:11:01 +02:00
Ines Montani ce14520789 Update docs [ci skip] 2020-10-06 14:35:17 +02:00
Ines Montani 2a17566da3 Update docs [ci skip] 2020-10-06 14:15:08 +02:00
Ines Montani 967377287a Merge pull request #6210 from adrianeboyd/docs/various-v3-3 [ci skip] 2020-10-06 11:28:45 +02:00
Adriane Boyd aa9c9f3bf0 Update Chinese usage for spacy-pkuseg 2020-10-06 11:21:17 +02:00
Šarūnas Navickas 047fb9f8b8 Website (Universe): An entry for rita-dsl (#6138)
* Create zaibacu.md

* Add RITA-DSL entry

* Update agreement

* Fix formatting
2020-10-06 11:19:36 +02:00
Ines Montani 2fd7122074 Update docs [ci skip] 2020-10-06 10:31:48 +02:00
Ines Montani 568e12215d Merge pull request #6206 from svlandeg/fix/patterns-init 2020-10-06 10:27:23 +02:00
Ines Montani 2e961817cb Update docs [ci skip] 2020-10-06 10:23:01 +02:00
svlandeg 9b4cf7b0b6 update output of debug config command 2020-10-06 09:47:23 +02:00
svlandeg fd0f60e2bc updates to data format for training and pretraining 2020-10-06 09:28:53 +02:00