1620 Commits

Author SHA1 Message Date
Paul O'Leary McCann b53e39455e
Fix UD POS docs links (fix #9013) (#9407)
* Fix UD POS docs links (fix #9013)

The previous link seems to have been for UD v1.

* Fix link
2021-10-11 11:51:19 +02:00
Adriane Boyd a5231cb044
Remove traces of lexemes from vocab serialization (#9400) 2021-10-11 11:13:35 +02:00
Sofie Van Landeghem f87ae3cb7d
Doc fixes in convert API (#9350)
* add more info on the spacy debug command

* formatting
2021-10-06 13:13:18 +09:00
Paul O'Leary McCann 6e833b617a
Updating Troubleshooting Docs (#9329)
* Add link to Discussions FAQ

* Remove old FAQ entries

I think these are no longer relevant.

- no-cache-dir: affected pip versions are *very* old now
- narrow unicode: not an issue from py3.3+
- utf-8 osx: upstream bug closed in 2019

Some of the other issues are also maybe not frequent.
2021-10-01 12:28:22 +02:00
Ines Montani 6bb0324b81 Adjust kb_id visualizer templating and docs 2021-09-23 11:59:02 +02:00
Ines Montani beb4a8c524
Merge pull request #9199 from shigapov/master (resolves #9129) 2021-09-23 19:41:53 +10:00
Jozef Harag 865cfbc903
feat: add `spacy.WandbLogger.v3` with optional `run_name` and `entity` parameters (#9202)
* feat: add `spacy.WandbLogger.v3` with optional `run_name` and `entity` parameters

* update versioning in docs

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2021-09-16 12:26:41 +02:00
Paul O'Leary McCann 1d57d78758 Make docs consistent (fix #9126) 2021-09-16 15:54:12 +09:00
Renat Shigapov d5cc009faf
Merge branch 'explosion:master' into master 2021-09-13 08:43:48 +02:00
Renat Shigapov e61d93f8c3
add NEL-visualisation to manual-usage 2021-09-13 08:38:58 +02:00
Paul O'Leary McCann f89e1c34c9
Minor typo fix in docs 2021-09-11 14:22:05 +09:00
Sofie Van Landeghem 8895e3c9ad
matcher doc corrections (#9115)
* update error message to current UX

* clarify uppercase effect

* fix docstring
2021-09-02 09:26:33 +02:00
Robyn Speer d60b748e3c
Fix surprises when asking for the root of a git repo (#9074)
* Fix surprises when asking for the root of a git repo

In the case of the first asset I wanted to get from git, the data I
wanted was the entire repository. I tried leaving "path" blank, which
gave a less-than-helpful error, and then I tried `path: "/"`, which
started copying my entire filesystem into the project. The path I should
have used was "".

I've made two changes to make this smoother for others:

- The 'path' within a git clone defaults to ""
- If the path points outside of the tmpdir that the git clone goes into, we fail with an error

Signed-off-by: Elia Robyn Speer <elia@explosion.ai>

* use a descriptive error instead of a default

plus some minor fixes from PR review

Signed-off-by: Elia Robyn Speer <elia@explosion.ai>

* check for None values in assets

Signed-off-by: Elia Robyn Speer <elia@explosion.ai>

Co-authored-by: Elia Robyn Speer <elia@explosion.ai>
2021-09-01 22:52:08 +02:00
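
The containment check described in #9074 above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `resolve_inside` and its parameter names are hypothetical, and the empty-string default only mirrors the first version of the change (the follow-up commit replaced the default with a descriptive error).

```python
from pathlib import Path


def resolve_inside(base_dir: Path, subpath: str = "") -> Path:
    """Return base_dir / subpath, refusing anything that escapes base_dir.

    Hypothetical helper for illustration; not part of spaCy.
    """
    base = base_dir.resolve()
    # Joining with an absolute subpath such as "/" discards the base
    # entirely, which is how `path: "/"` can end up meaning the whole
    # filesystem instead of the root of the clone.
    target = (base / subpath).resolve()
    if target != base and base not in target.parents:
        raise ValueError(f"path {subpath!r} points outside of {base}")
    return target
```

With this sketch, an empty `subpath` resolves to the clone directory itself, while `"/"` or `"../elsewhere"` raises instead of silently copying files from outside the temporary directory.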
Paul O'Leary McCann ba6a37d358
Document Assigned Attributes of Pipeline Components (#9041)
* Add textcat docs

* Add NER docs

* Add Entity Linker docs

* Add assigned fields docs for the tagger

This also adds a preamble, since there wasn't one.

* Add morphologizer docs

* Add dependency parser docs

* Update entityrecognizer docs

This is a little weird because `Doc.ents` is the only thing assigned to,
but it's actually a bidirectional property.

* Add token fields for entityrecognizer

* Fix section name

* Add entity ruler docs

* Add lemmatizer docs

* Add sentencizer/recognizer docs

* Update website/docs/api/entityrecognizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/tagger.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/api/entityruler.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update type for Doc.ents

This was `Tuple[Span, ...]` everywhere but `Tuple[Span]` seems to be
correct.

* Run prettier

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

* Add transformers section

This basically just moves and renames the "custom attributes" section
from the bottom of the page to be consistent with "assigned attributes"
on other pages.

I looked at moving the paragraph just above the section into the
section, but it includes the unrelated registry additions, so it seemed
better to leave it unchanged.

* Make table header consistent

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-09-01 12:09:39 +02:00
Davide Fiocco 1dd69be1f1
Fix point typo on docbin docs (#9097) 2021-08-31 10:55:44 +02:00
Sofie Van Landeghem 1e974de837
config is not Optional (#9024) 2021-08-27 11:44:31 +02:00
Sofie Van Landeghem 4d39430b82
Document use-case of freezing tok2vec (#8992)
* update error msg

* add sentence to docs

* expand note on frozen components
2021-08-26 09:50:35 +02:00
Sofie Van Landeghem 94fb840443
fix docs for Span constructor arguments (#9023) 2021-08-25 16:06:22 +02:00
Sofie Van Landeghem de025beb5f
Warn and document spangroup.doc weakref (#8980)
* test for error after Doc has been garbage collected

* warn about using a SpanGroup when the Doc has been garbage collected

* add warning to the docs

* rephrase slightly

* raise error instead of warning

* update

* move warning to doc property
2021-08-20 11:06:19 +02:00
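
The weak-reference behaviour that #8980 above warns about can be reproduced with plain Python. The classes below are toy stand-ins (`ToyDoc` and `ToyGroup` are hypothetical names, not spaCy's `Doc` or `SpanGroup`), and the error text is invented for the example; the sketch only assumes that the group holds a `weakref` to its document and raises from the `doc` property once the document is gone.

```python
import weakref


class ToyDoc:
    """Stand-in for a document object (hypothetical, not spaCy's Doc)."""

    def __init__(self, text: str):
        self.text = text


class ToyGroup:
    """Stand-in for a span group that only weakly references its doc."""

    def __init__(self, doc: ToyDoc, name: str):
        # A weak reference does not keep the document alive.
        self._doc_ref = weakref.ref(doc)
        self.name = name

    @property
    def doc(self) -> ToyDoc:
        doc = self._doc_ref()
        if doc is None:
            # Raise an error rather than returning None, matching the
            # "raise error instead of warning" step in the commit.
            raise RuntimeError(
                f"group {self.name!r} outlived its document; keep a "
                "reference to the document for as long as the group is used"
            )
        return doc
```

Keeping the document in a variable for as long as the group is in use is enough to avoid the error.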
Paul O'Leary McCann 37fe847af4 Fix type annotation in docs 2021-08-20 15:34:22 +09:00
Paul O'Leary McCann 9391998c77
Add notes on preparing training data to docs (#8964)
* Add training data section

Not entirely sure this is in the right location on the page - maybe it
should be after quickstart?

* Add pointer from binary format to training data section

* Minor cleanup

* Add to ToC, fix filename

* Update website/docs/usage/training.md

Co-authored-by: Ines Montani <ines@ines.io>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Move the training data section further down the page

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update website/docs/usage/training.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Run prettier

Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-08-16 17:37:21 +02:00
Ines Montani 4f769ff913 Update Prodigy project template for v1.11 [ci skip] 2021-08-12 13:46:20 +10:00
Paul O'Leary McCann e227d24d43
Allow passing in array vars for speedup (#8882)
* Allow passing in array vars for speedup

This fixes #8845. Not sure about the docstring changes here...

* Update docs

Types maybe need more detail? Maybe not?

* Run prettier on docs

* Update spacy/tokens/span.pyx

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-08-10 15:13:53 +02:00
Paul O'Leary McCann 6029cfc391
Add scores to output in spancat (#8855)
* Add scores to output in spancat

This exposes the scores as an attribute on the SpanGroup. Includes a
basic test.

* Add basic doc note

* Vectorize score calcs

* Add "annotation format" section

* Update website/docs/api/spancategorizer.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Clean up doc section

* Ran prettier on docs

* Get arrays off the gpu before iterating over them

* Remove int() calls

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-08-10 13:47:49 +02:00
Paul O'Leary McCann cac298471f
Fix #8902 (bad link in docs)
typo fix
2021-08-08 22:04:00 +09:00
Adriane Boyd 175847f92c
Support list values and INTERSECTS in Matcher (#8784)
* Support list values and IS_INTERSECT in Matcher

* Support list values as token attributes for set operators, not just as
pattern values.

* Add `IS_INTERSECT` operator.

* Fix incorrect `ISSUBSET` and `ISSUPERSET` in schema and docs.

* Rename IS_INTERSECT to INTERSECTS
2021-08-02 19:39:26 +02:00
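
As a rough guide to the set operators touched by #8784 above, the comparison between a list-valued token attribute and a pattern list can be modelled with Python sets. This is only a sketch of the semantics as I understand them, not the Matcher's implementation; `compare_sets` is a hypothetical function, and it deliberately leaves out `IN` and `NOT_IN`.

```python
def compare_sets(operator: str, token_values, pattern_values) -> bool:
    """Model IS_SUBSET, IS_SUPERSET and INTERSECTS on two collections.

    Hypothetical helper for illustration; not part of spaCy.
    """
    value = set(token_values)
    pattern = set(pattern_values)
    if operator == "IS_SUBSET":
        # Every token value appears in the pattern list.
        return value <= pattern
    if operator == "IS_SUPERSET":
        # Every pattern value appears among the token values.
        return value >= pattern
    if operator == "INTERSECTS":
        # At least one value is shared.
        return bool(value & pattern)
    raise ValueError(f"unsupported operator: {operator}")
```

For example, a token with values `["a", "b"]` is a superset of the pattern `["a"]` and intersects `["b", "c"]`, but is not a subset of either.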
Ines Montani 30f20496d5
Merge pull request #8840 from polm/docs/evaluate-speed [ci skip] 2021-07-30 09:10:15 +10:00
Ines Montani 65d163fab5
Adjust formatting [ci skip] 2021-07-30 09:10:04 +10:00
Ines Montani 3a701d3645
Merge pull request #8841 from adrianeboyd/docs/ent-id-sep [ci skip]
Fix formatting of ent_id_sep in EntityRuler API docs
2021-07-30 09:09:25 +10:00
thomashacker 02258916c8 Fix example config typo for transformer architecture 2021-07-29 11:19:40 +02:00
Adriane Boyd 15b12f3e35 Fix formatting of ent_id_sep in EntityRuler API docs 2021-07-29 10:10:12 +02:00
Paul O'Leary McCann a60cb13910 Update speed entry in metrics table 2021-07-29 16:35:19 +09:00
Paul O'Leary McCann e125313a50 Revert "Add note about SPEED in output"
This reverts commit c92d268176.
2021-07-29 16:34:08 +09:00
Ines Montani 0a1e299d30
Merge pull request #8814 from polm/docs/migrate-lexeme-tables [ci skip] 2021-07-29 17:18:02 +10:00
Paul O'Leary McCann c92d268176 Add note about SPEED in output
In #8823 it was pointed out that the `SPEED` value wasn't documented
anywhere.
2021-07-29 15:03:07 +09:00
Paul O'Leary McCann 8867e60fbb
Update website/docs/usage/v3.md
Co-authored-by: Ines Montani <ines@ines.io>
2021-07-29 14:56:56 +09:00
Adriane Boyd 8547514aa4
Remove labels from textcat component config example (#8815) 2021-07-27 13:14:38 +02:00
Paul O'Leary McCann 76ac95923a Add note to migration guide about lexeme tables (fix #7290)
This just adds the resolution from #6388 to the docs.
2021-07-27 19:19:25 +09:00
Paul O'Leary McCann 67ecdcc3ac
Update subset/superset docs (#8795)
* Update subset/superset docs

* Update website/docs/usage/rule-based-matching.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-07-27 12:08:46 +02:00
Ines Montani 134cb06af3
Merge pull request #8808 from kevinlu1248/master [ci skip]
Changed a CLI command in data-formats.md due to erroneous information
2021-07-27 12:15:16 +10:00
Kevin Lu 4a8e9e4e4e
Update data-formats.md 2021-07-25 22:58:53 -07:00
Adriane Boyd f5acc48111
Remove TrainablePipe as base class for Lemmatizer in API docs (#8725) 2021-07-15 16:41:36 +02:00
Sofie Van Landeghem 77859beb99
spacy.ngram_range_suggester.v1 (#8699) 2021-07-15 10:01:22 +02:00
Ines Montani 50000d37e4
Avoid double parentheses [ci skip] 2021-07-10 10:52:01 +10:00
Calum Sieppert e2d53aa1a6
Typo fixes 2021-07-09 10:25:56 -06:00
Ines Montani 39c8f7949e Add code preview for textcat_multilabel [ci skip] 2021-07-08 13:33:25 +10:00
Calum Sieppert 889c187bc2
Typo fixes 2021-07-07 16:53:04 -06:00
Adriane Boyd 6db647dfe0 Update v3.1 usage docs 2021-07-07 08:43:33 +02:00
Sofie Van Landeghem 64fac754fe
add spacy prefix to ngram_suggester.v1 (#8623) 2021-07-07 08:09:30 +02:00
Sofie Van Landeghem e7d747e3ee
TransitionBasedParser.v1 to legacy (#8586)
* TransitionBasedParser.v1 to legacy

* register sublayers

* bump spacy-legacy to 3.0.7
2021-07-06 15:26:45 +02:00