mirror of https://github.com/explosion/spaCy.git
Merge pull request #6105 from adrianeboyd/docs/various-v3-2 [ci skip]
This commit is contained in:
commit
709ebf5550
|
@ -197,8 +197,8 @@ Remove a previously registered extension.
|
|||
## Doc.char_span {#char_span tag="method" new="2"}
|
||||
|
||||
Create a `Span` object from the slice `doc.text[start_idx:end_idx]`. Returns
|
||||
`None` if the character indices don't map to a valid span using the default mode
|
||||
`"strict".
|
||||
`None` if the character indices don't map to a valid span using the default
|
||||
alignment mode `"strict".
|
||||
|
||||
> #### Example
|
||||
>
|
||||
|
@ -208,15 +208,15 @@ Create a `Span` object from the slice `doc.text[start_idx:end_idx]`. Returns
|
|||
> assert span.text == "New York"
|
||||
> ```
|
||||
|
||||
| Name | Description |
|
||||
| ------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `start` | The index of the first character of the span. ~~int~~ |
|
||||
| `end` | The index of the last character after the span. ~int~~ |
|
||||
| `label` | A label to attach to the span, e.g. for named entities. ~~Union[int, str]~~ |
|
||||
| `kb_id` <Tag variant="new">2.2</Tag> | An ID from a knowledge base to capture the meaning of a named entity. ~~Union[int, str]~~ |
|
||||
| `vector` | A meaning representation of the span. ~~numpy.ndarray[ndim=1, dtype=float32]~~ |
|
||||
| `mode` | How character indices snap to token boundaries. Options: `"strict"` (no snapping), `"inside"` (span of all tokens completely within the character span), `"outside"` (span of all tokens at least partially covered by the character span). Defaults to `"strict"`. ~~str~~ |
|
||||
| **RETURNS** | The newly constructed object or `None`. ~~Optional[Span]~~ |
|
||||
| Name | Description |
|
||||
| ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `start` | The index of the first character of the span. ~~int~~ |
|
||||
| `end` | The index of the last character after the span. ~int~~ |
|
||||
| `label` | A label to attach to the span, e.g. for named entities. ~~Union[int, str]~~ |
|
||||
| `kb_id` <Tag variant="new">2.2</Tag> | An ID from a knowledge base to capture the meaning of a named entity. ~~Union[int, str]~~ |
|
||||
| `vector` | A meaning representation of the span. ~~numpy.ndarray[ndim=1, dtype=float32]~~ |
|
||||
| `alignment_mode` | How character indices snap to token boundaries. Options: `"strict"` (no snapping), `"contract"` (span of all tokens completely within the character span), `"expand"` (span of all tokens at least partially covered by the character span). Defaults to `"strict"`. ~~str~~ |
|
||||
| **RETURNS** | The newly constructed object or `None`. ~~Optional[Span]~~ |
|
||||
|
||||
## Doc.similarity {#similarity tag="method" model="vectors"}
|
||||
|
||||
|
|
|
@ -65,22 +65,22 @@ Matchers help you find and extract information from [`Doc`](/api/doc) objects
|
|||
based on match patterns describing the sequences you're looking for. A matcher
|
||||
operates on a `Doc` and gives you access to the matched tokens **in context**.
|
||||
|
||||
| Name | Description |
|
||||
| --------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| [`Matcher`](/api/matcher) | Match sequences of tokens, based on pattern rules, similar to regular expressions. |
|
||||
| [`PhraseMatcher`](/api/phrasematcher) | Match sequences of tokens based on phrases. |
|
||||
| [`DependencyMatcher`](/api/dependencymatcher) | Match sequences of tokens based on dependency trees using the [Semgrex syntax](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html). |
|
||||
| Name | Description |
|
||||
| --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| [`Matcher`](/api/matcher) | Match sequences of tokens, based on pattern rules, similar to regular expressions. |
|
||||
| [`PhraseMatcher`](/api/phrasematcher) | Match sequences of tokens based on phrases. |
|
||||
| [`DependencyMatcher`](/api/dependencymatcher) | Match sequences of tokens based on dependency trees using [Semgrex operators](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html). |
|
||||
|
||||
### Other classes {#architecture-other}
|
||||
|
||||
| Name | Description |
|
||||
| ------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------- |
|
||||
| [`Vocab`](/api/vocab) | The shared vocabulary that stores strings and gives you access to [`Lexeme`](/api/lexeme) objects. |
|
||||
| [`StringStore`](/api/stringstore) | Map strings to and from hash values. |
|
||||
| [`Vectors`](/api/vectors) | Container class for vector data keyed by string. |
|
||||
| [`Lookups`](/api/lookups) | Container for convenient access to large lookup tables and dictionaries. |
|
||||
| [`Morphology`](/api/morphology) | Assign linguistic features like lemmas, noun case, verb tense etc. based on the word and its part-of-speech tag. |
|
||||
| [`MorphAnalysis`](/api/morphology#morphanalysis) | A morphological analysis. |
|
||||
| [`KnowledgeBase`](/api/kb) | Storage for entities and aliases of a knowledge base for entity linking. |
|
||||
| [`Scorer`](/api/scorer) | Compute evaluation scores. |
|
||||
| [`Corpus`](/api/corpus) | Class for managing annotated corpora for training and evaluation data. |
|
||||
| Name | Description |
|
||||
| ------------------------------------------------ | -------------------------------------------------------------------------------------------------- |
|
||||
| [`Vocab`](/api/vocab) | The shared vocabulary that stores strings and gives you access to [`Lexeme`](/api/lexeme) objects. |
|
||||
| [`StringStore`](/api/stringstore) | Map strings to and from hash values. |
|
||||
| [`Vectors`](/api/vectors) | Container class for vector data keyed by string. |
|
||||
| [`Lookups`](/api/lookups) | Container for convenient access to large lookup tables and dictionaries. |
|
||||
| [`Morphology`](/api/morphology) | Store morphological analyses and map them to and from hash values. |
|
||||
| [`MorphAnalysis`](/api/morphology#morphanalysis) | A morphological analysis. |
|
||||
| [`KnowledgeBase`](/api/kb) | Storage for entities and aliases of a knowledge base for entity linking. |
|
||||
| [`Scorer`](/api/scorer) | Compute evaluation scores. |
|
||||
| [`Corpus`](/api/corpus) | Class for managing annotated corpora for training and evaluation data. |
|
||||
|
|
|
@ -299,9 +299,10 @@ installed in the same environment – that's it.
|
|||
|
||||
When you load a pipeline, spaCy will generally use its `config.cfg` to set up
|
||||
the language class and construct the pipeline. The pipeline is specified as a
|
||||
list of strings, e.g. `pipeline = ["tagger", "paser", "ner"]`. For each of those
|
||||
strings, spaCy will call `nlp.add_pipe` and look up the name in all factories
|
||||
defined by the decorators [`@Language.component`](/api/language#component) and
|
||||
list of strings, e.g. `pipeline = ["tagger", "parser", "ner"]`. For each of
|
||||
those strings, spaCy will call `nlp.add_pipe` and look up the name in all
|
||||
factories defined by the decorators
|
||||
[`@Language.component`](/api/language#component) and
|
||||
[`@Language.factory`](/api/language#factory). This means that you have to import
|
||||
your custom components _before_ loading the pipeline.
|
||||
|
||||
|
|
|
@ -119,6 +119,7 @@
|
|||
{ "text": "Corpus", "url": "/api/corpus" },
|
||||
{ "text": "KnowledgeBase", "url": "/api/kb" },
|
||||
{ "text": "Lookups", "url": "/api/lookups" },
|
||||
{ "text": "MorphAnalysis", "url": "/api/morphology#morphanalysis" },
|
||||
{ "text": "Morphology", "url": "/api/morphology" },
|
||||
{ "text": "Scorer", "url": "/api/scorer" },
|
||||
{ "text": "StringStore", "url": "/api/stringstore" },
|
||||
|
|
Loading…
Reference in New Issue