mirror of https://github.com/explosion/spaCy.git
Update Tokenizer documentation to reflect token_match and url_match signatures (#9859)
parent ba0fa7a64e
commit ac45ae3779
@@ -45,10 +45,12 @@ cdef class Tokenizer:
             `re.compile(string).search` to match suffixes.
         `infix_finditer` (callable): A function matching the signature of
             `re.compile(string).finditer` to find infixes.
-        token_match (callable): A boolean function matching strings to be
+        token_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
             recognized as tokens.
-        url_match (callable): A boolean function matching strings to be
-            recognized as tokens after considering prefixes and suffixes.
+        url_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
+            recognized as urls.
 
         EXAMPLE:
             >>> tokenizer = Tokenizer(nlp.vocab)
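For reference, a minimal sketch of how callables with the documented signatures are passed to the tokenizer. The dot-splitting infix rule and the version-string pattern are illustrative assumptions, not part of this commit; it assumes a spaCy v3.x install.

import re

import spacy
from spacy.tokenizer import Tokenizer

nlp = spacy.blank("en")

# Hypothetical rules for illustration: treat "." as an infix, but let
# token_match rescue dotted version strings such as "3.2.1" as single
# tokens. Bound methods of compiled patterns already have the
# `re.compile(string).match` / `re.compile(string).finditer` signatures
# that the reworded docstring describes.
infix_re = re.compile(r"\.")
version_re = re.compile(r"^\d+(?:\.\d+)+$")

tokenizer = Tokenizer(
    nlp.vocab,
    infix_finditer=infix_re.finditer,  # signature of re.compile(string).finditer
    token_match=version_re.match,      # signature of re.compile(string).match
)

print([t.text for t in tokenizer("spaCy 3.2.1")])
# ['spaCy', '3.2.1'] -- without token_match, the infix rule would split "3.2.1"

url_match takes a callable with the same `re.compile(string).match` signature; the difference the docstring draws is that url_match is consulted only after prefixes and suffixes have been trimmed, whereas token_match is checked on the full substring first.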