mirror of https://github.com/explosion/spaCy.git
Update docs [ci skip]
parent 1139247532
commit 0c31f03ec5
@@ -838,8 +838,6 @@ domain. There are five things you would need to define:
 hyphens etc.
 5. An optional boolean function `token_match` matching strings that should never
 be split, overriding the infix rules. Useful for things like URLs or numbers.
-Note that prefixes and suffixes will be split off before `token_match` is
-applied.
 
 You shouldn't usually need to create a `Tokenizer` subclass. Standard usage is
 to use `re.compile()` to build a regular expression object, and pass its
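The documented text describes `token_match` as a boolean function built from a compiled regular expression. A minimal standard-library sketch of that idea follows; the URL pattern and the sample strings are hypothetical and chosen only for illustration, and nothing here is spaCy's own pattern or a statement about when the tokenizer calls the function.

```python
import re

# Hypothetical pattern for illustration only: a crude URL matcher,
# not the pattern spaCy ships with.
url_pattern = re.compile(r"https?://\S+")

# A compiled pattern's bound `match` method is itself a callable. It returns
# a match object (truthy) or None (falsy), so it can act as a boolean-style
# `token_match` function.
token_match = url_pattern.match

for text in ["https://example.com/a-b", "well-known"]:
    # Strings for which token_match is truthy are the ones that
    # "should never be split".
    print(text, bool(token_match(text)))
```

Passing the bound method rather than the pattern object keeps the tokenizer-facing interface a plain callable, so any function with the same signature could be swapped in.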