mirror of https://github.com/explosion/spaCy.git
Merge branch 'develop' into spacy.io
Commit 41f86f640b
@@ -315,7 +315,7 @@ def read_regex(path):
 
 
 def compile_prefix_regex(entries):
-    """Compile a list of prefix rules into a regex object.
+    """Compile a sequence of prefix rules into a regex object.
 
     entries (tuple): The prefix rules, e.g. spacy.lang.punctuation.TOKENIZER_PREFIXES.
     RETURNS (regex object): The regex object, to be used for Tokenizer.prefix_search.
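For context on the function this hunk touches: a minimal sketch of how `compile_prefix_regex` is typically used to customize tokenization, assuming a spaCy v2 install with the `en_core_web_sm` model available:

```python
import spacy
from spacy.util import compile_prefix_regex

nlp = spacy.load("en_core_web_sm")

# Compile the language's default prefix rules into one regex and
# install its .search method as the tokenizer's prefix_search.
prefix_regex = compile_prefix_regex(nlp.Defaults.prefixes)
nlp.tokenizer.prefix_search = prefix_regex.search
```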
@@ -332,7 +332,7 @@ def compile_prefix_regex(entries):
 
 
 def compile_suffix_regex(entries):
-    """Compile a list of suffix rules into a regex object.
+    """Compile a sequence of suffix rules into a regex object.
 
     entries (tuple): The suffix rules, e.g. spacy.lang.punctuation.TOKENIZER_SUFFIXES.
     RETURNS (regex object): The regex object, to be used for Tokenizer.suffix_search.
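Similarly for suffixes, a sketch of extending the default rules with a custom entry and rebuilding the regex; the `r"-+$"` pattern is illustrative only, not part of this commit:

```python
import spacy
from spacy.util import compile_suffix_regex

nlp = spacy.load("en_core_web_sm")

# Append an illustrative custom rule to the default suffixes,
# recompile, and install the new .search on the tokenizer.
suffixes = list(nlp.Defaults.suffixes) + [r"-+$"]
suffix_regex = compile_suffix_regex(suffixes)
nlp.tokenizer.suffix_search = suffix_regex.search
```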
@@ -342,7 +342,7 @@ def compile_suffix_regex(entries):
 
 
 def compile_infix_regex(entries):
-    """Compile a list of infix rules into a regex object.
+    """Compile a sequence of infix rules into a regex object.
 
     entries (tuple): The infix rules, e.g. spacy.lang.punctuation.TOKENIZER_INFIXES.
     RETURNS (regex object): The regex object, to be used for Tokenizer.infix_finditer.
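And for infixes, whose compiled regex is consumed through `finditer` rather than `search`, a minimal sketch under the same assumptions:

```python
import spacy
from spacy.util import compile_infix_regex

nlp = spacy.load("en_core_web_sm")

# Rebuild the default infix regex; the tokenizer calls its
# .finditer to locate split points inside a token.
infix_regex = compile_infix_regex(nlp.Defaults.infixes)
nlp.tokenizer.infix_finditer = infix_regex.finditer
```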
@@ -29,10 +29,10 @@ components. spaCy then does the following:
 >
 > ```json
 > {
->     "name": "example_model",
 >     "lang": "en",
+>     "name": "core_web_sm",
 >     "description": "Example model for spaCy",
->     "pipeline": ["tagger", "parser"]
+>     "pipeline": ["tagger", "parser", "ner"]
 > }
 > ```
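To see where this `meta.json` data surfaces at runtime: once a model is loaded, its metadata is available as a plain dict on `nlp.meta` (sketch assumes `en_core_web_sm` is installed):

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# meta.json is exposed on the loaded Language object.
print(nlp.meta["lang"])      # "en"
print(nlp.meta["pipeline"])  # ["tagger", "parser", "ner"]
```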
@@ -51,11 +51,11 @@ components. spaCy then does the following:
 So when you call this...
 
 ```python
-nlp = spacy.load("en")
+nlp = spacy.load("en_core_web_sm")
 ```
 
-... the model tells spaCy to use the language `"en"` and the pipeline
-`["tagger", "parser", "ner"]`. spaCy will then initialize
+... the model's `meta.json` tells spaCy to use the language `"en"` and the
+pipeline `["tagger", "parser", "ner"]`. spaCy will then initialize
 `spacy.lang.en.English`, and create each pipeline component and add it to the
 processing pipeline. It'll then load in the model's data from its data directory
 and return the modified `Language` class for you to use as the `nlp` object.
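Leaving out path resolution and error handling, the steps this prose describes amount to roughly the following sketch. `get_lang_class`, `create_pipe`, `add_pipe`, and `from_disk` are real spaCy v2 APIs, while `data_path` is a hypothetical placeholder for the model's data directory:

```python
import spacy

data_path = "/path/to/en_core_web_sm/en_core_web_sm-2.0.0"  # hypothetical path

# Roughly what spacy.load() does with the values from meta.json:
lang_cls = spacy.util.get_lang_class("en")   # -> spacy.lang.en.English
nlp = lang_cls()                             # initialize the Language subclass
for name in ["tagger", "parser", "ner"]:     # pipeline names from meta.json
    component = nlp.create_pipe(name)        # create each pipeline component
    nlp.add_pipe(component)                  # add it to the processing pipeline
nlp.from_disk(data_path)                     # load the model's binary data
```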