Merge pull request #6105 from adrianeboyd/docs/various-v3-2 [ci skip]

Ines Montani 2020-09-22 09:41:55 +02:00 committed by GitHub
commit 709ebf5550
4 changed files with 32 additions and 30 deletions

@@ -197,8 +197,8 @@ Remove a previously registered extension.
## Doc.char_span {#char_span tag="method" new="2"}
Create a `Span` object from the slice `doc.text[start_idx:end_idx]`. Returns
`None` if the character indices don't map to a valid span using the default mode
`"strict"`.
`None` if the character indices don't map to a valid span using the default
alignment mode `"strict"`.
> #### Example
>
@@ -208,15 +208,15 @@ Create a `Span` object from the slice `doc.text[start_idx:end_idx]`. Returns
> assert span.text == "New York"
> ```
| Name | Description |
| ------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `start` | The index of the first character of the span. ~~int~~ |
| `end` | The index of the first character after the span. ~~int~~ |
| `label` | A label to attach to the span, e.g. for named entities. ~~Union[int, str]~~ |
| `kb_id` <Tag variant="new">2.2</Tag> | An ID from a knowledge base to capture the meaning of a named entity. ~~Union[int, str]~~ |
| `vector` | A meaning representation of the span. ~~numpy.ndarray[ndim=1, dtype=float32]~~ |
| `mode` | How character indices snap to token boundaries. Options: `"strict"` (no snapping), `"inside"` (span of all tokens completely within the character span), `"outside"` (span of all tokens at least partially covered by the character span). Defaults to `"strict"`. ~~str~~ |
| **RETURNS** | The newly constructed object or `None`. ~~Optional[Span]~~ |
| Name | Description |
| ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `start` | The index of the first character of the span. ~~int~~ |
| `end` | The index of the first character after the span. ~~int~~ |
| `label` | A label to attach to the span, e.g. for named entities. ~~Union[int, str]~~ |
| `kb_id` <Tag variant="new">2.2</Tag> | An ID from a knowledge base to capture the meaning of a named entity. ~~Union[int, str]~~ |
| `vector` | A meaning representation of the span. ~~numpy.ndarray[ndim=1, dtype=float32]~~ |
| `alignment_mode` | How character indices snap to token boundaries. Options: `"strict"` (no snapping), `"contract"` (span of all tokens completely within the character span), `"expand"` (span of all tokens at least partially covered by the character span). Defaults to `"strict"`. ~~str~~ |
| **RETURNS** | The newly constructed object or `None`. ~~Optional[Span]~~ |
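
The difference between the three `alignment_mode` options can be illustrated with a small pure-Python sketch. It is not spaCy's implementation: the helper `align_char_span` and the hand-written `token_offsets` list are assumptions made for illustration, and the result is a pair of token indices rather than a `Span`.

```python
from typing import List, Optional, Tuple

# Hypothetical (start_char, end_char) offsets for the tokens of
# "I like New York"; the end offset is exclusive.
token_offsets: List[Tuple[int, int]] = [(0, 1), (2, 6), (7, 10), (11, 15)]


def align_char_span(
    offsets: List[Tuple[int, int]],
    start: int,
    end: int,
    alignment_mode: str = "strict",
) -> Optional[Tuple[int, int]]:
    """Map a character span to (start_token, end_token), or None."""
    if alignment_mode == "strict":
        # No snapping: both indices must fall exactly on token boundaries.
        starts = {s: i for i, (s, _) in enumerate(offsets)}
        ends = {e: i for i, (_, e) in enumerate(offsets)}
        if start in starts and end in ends and starts[start] <= ends[end]:
            return starts[start], ends[end] + 1
        return None
    if alignment_mode == "contract":
        # Keep only tokens that lie completely within the character span.
        hits = [i for i, (s, e) in enumerate(offsets) if s >= start and e <= end]
    elif alignment_mode == "expand":
        # Keep every token that is at least partially covered.
        hits = [i for i, (s, e) in enumerate(offsets) if s < end and e > start]
    else:
        raise ValueError(f"Unknown alignment mode: {alignment_mode!r}")
    if not hits:
        return None
    return hits[0], hits[-1] + 1


exact = align_char_span(token_offsets, 7, 15)                 # "New York"
misaligned = align_char_span(token_offsets, 8, 15)            # starts inside "New"
contracted = align_char_span(token_offsets, 8, 15, "contract")
expanded = align_char_span(token_offsets, 8, 15, "expand")
```

In this sketch the misaligned span yields `None` under `"strict"`, shrinks to the token "York" under `"contract"`, and grows to "New York" under `"expand"`.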
## Doc.similarity {#similarity tag="method" model="vectors"}

@@ -65,22 +65,22 @@ Matchers help you find and extract information from [`Doc`](/api/doc) objects
based on match patterns describing the sequences you're looking for. A matcher
operates on a `Doc` and gives you access to the matched tokens **in context**.
| Name | Description |
| --------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [`Matcher`](/api/matcher) | Match sequences of tokens, based on pattern rules, similar to regular expressions. |
| [`PhraseMatcher`](/api/phrasematcher) | Match sequences of tokens based on phrases. |
| [`DependencyMatcher`](/api/dependencymatcher) | Match sequences of tokens based on dependency trees using the [Semgrex syntax](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html). |
| Name | Description |
| --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [`Matcher`](/api/matcher) | Match sequences of tokens, based on pattern rules, similar to regular expressions. |
| [`PhraseMatcher`](/api/phrasematcher) | Match sequences of tokens based on phrases. |
| [`DependencyMatcher`](/api/dependencymatcher) | Match sequences of tokens based on dependency trees using [Semgrex operators](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html). |
### Other classes {#architecture-other}
| Name | Description |
| ------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------- |
| [`Vocab`](/api/vocab) | The shared vocabulary that stores strings and gives you access to [`Lexeme`](/api/lexeme) objects. |
| [`StringStore`](/api/stringstore) | Map strings to and from hash values. |
| [`Vectors`](/api/vectors) | Container class for vector data keyed by string. |
| [`Lookups`](/api/lookups) | Container for convenient access to large lookup tables and dictionaries. |
| [`Morphology`](/api/morphology) | Assign linguistic features like lemmas, noun case, verb tense etc. based on the word and its part-of-speech tag. |
| [`MorphAnalysis`](/api/morphology#morphanalysis) | A morphological analysis. |
| [`KnowledgeBase`](/api/kb) | Storage for entities and aliases of a knowledge base for entity linking. |
| [`Scorer`](/api/scorer) | Compute evaluation scores. |
| [`Corpus`](/api/corpus) | Class for managing annotated corpora for training and evaluation data. |
| Name | Description |
| ------------------------------------------------ | -------------------------------------------------------------------------------------------------- |
| [`Vocab`](/api/vocab) | The shared vocabulary that stores strings and gives you access to [`Lexeme`](/api/lexeme) objects. |
| [`StringStore`](/api/stringstore) | Map strings to and from hash values. |
| [`Vectors`](/api/vectors) | Container class for vector data keyed by string. |
| [`Lookups`](/api/lookups) | Container for convenient access to large lookup tables and dictionaries. |
| [`Morphology`](/api/morphology) | Store morphological analyses and map them to and from hash values. |
| [`MorphAnalysis`](/api/morphology#morphanalysis) | A morphological analysis. |
| [`KnowledgeBase`](/api/kb) | Storage for entities and aliases of a knowledge base for entity linking. |
| [`Scorer`](/api/scorer) | Compute evaluation scores. |
| [`Corpus`](/api/corpus) | Class for managing annotated corpora for training and evaluation data. |

@@ -299,9 +299,10 @@ installed in the same environment that's it.
When you load a pipeline, spaCy will generally use its `config.cfg` to set up
the language class and construct the pipeline. The pipeline is specified as a
list of strings, e.g. `pipeline = ["tagger", "paser", "ner"]`. For each of those
strings, spaCy will call `nlp.add_pipe` and look up the name in all factories
defined by the decorators [`@Language.component`](/api/language#component) and
list of strings, e.g. `pipeline = ["tagger", "parser", "ner"]`. For each of
those strings, spaCy will call `nlp.add_pipe` and look up the name in all
factories defined by the decorators
[`@Language.component`](/api/language#component) and
[`@Language.factory`](/api/language#factory). This means that you have to import
your custom components _before_ loading the pipeline.
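
Why the import has to come first can be sketched with a toy registry. This is not spaCy's code: `FACTORIES`, `factory` and `load_pipeline` are hypothetical names standing in for the decorator-based lookup described above, and the component is a plain function on strings.

```python
from typing import Callable, Dict, List

# Hypothetical registry: component name -> function that builds the component.
FACTORIES: Dict[str, Callable[[], Callable[[str], str]]] = {}


def factory(name: str):
    """Decorator that registers a component factory under a string name."""
    def decorator(func):
        FACTORIES[name] = func
        return func
    return decorator


def load_pipeline(names: List[str]) -> List[Callable[[str], str]]:
    """Build each component by looking its name up in the registry."""
    components = []
    for name in names:
        if name not in FACTORIES:
            raise KeyError(f"No factory registered for {name!r}")
        components.append(FACTORIES[name]())
    return components


# Loading before the decorated function has been defined (or imported) fails,
# because the name is not in the registry yet.
try:
    load_pipeline(["my_component"])
    loaded_early = True
except KeyError:
    loaded_early = False


@factory("my_component")
def make_my_component() -> Callable[[str], str]:
    return lambda text: text.upper()


# Once the decorator has run, the same lookup succeeds.
pipeline = load_pipeline(["my_component"])
```

The decorator only runs when the module that contains it is executed, which is why the lookup by name can only succeed after the import.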

@@ -119,6 +119,7 @@
{ "text": "Corpus", "url": "/api/corpus" },
{ "text": "KnowledgeBase", "url": "/api/kb" },
{ "text": "Lookups", "url": "/api/lookups" },
{ "text": "MorphAnalysis", "url": "/api/morphology#morphanalysis" },
{ "text": "Morphology", "url": "/api/morphology" },
{ "text": "Scorer", "url": "/api/scorer" },
{ "text": "StringStore", "url": "/api/stringstore" },