mirror of https://github.com/explosion/spaCy.git
//- 💫 DOCS > API > TEXTCATEGORIZER
include ../../_includes/_mixins
p
    | Add text categorization models to spaCy pipelines. The model supports
    | classification with multiple, non-mutually exclusive labels.
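
p
    | To illustrate what "non-mutually exclusive" means in practice, here is a
    | minimal standalone sketch. It is not spaCy code: the label names, the raw
    | scores and the 0.5 threshold are all hypothetical, chosen only to contrast
    | independent per-label probabilities with a softmax over the same scores.

```python
import math


def sigmoid(x):
    """Logistic function: maps a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Normalize raw scores so that they sum to 1 (mutually exclusive classes)."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for one document (assumption, not model output)
logits = {"SPORTS": 2.0, "POLITICS": 1.5, "COOKING": -3.0}

# Multi-label view: each label gets its own probability
independent = {label: sigmoid(score) for label, score in logits.items()}

# Single-label view, for contrast: probabilities compete and sum to 1
exclusive = dict(zip(logits, softmax(list(logits.values()))))

# With independent probabilities, several labels can pass the threshold
THRESHOLD = 0.5  # hypothetical cut-off
assigned = [label for label, p in independent.items() if p >= THRESHOLD]
print(assigned)
```

p
    | Because each label is scored on its own, the independent probabilities
    | need not sum to 1, and a document can receive several labels or none.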
p
    | The model architecture can be swapped out, but by default the
    | #[code TextCategorizer] class uses a convolutional neural network (CNN)
    | to assign position-sensitive vectors to each word in the document. This
    | step is similar to the #[+api("tensorizer") #[code Tensorizer]]
    | component, but the #[code TextCategorizer] uses its own CNN model so
    | that it does not share weights with the other pipeline components. The
    | resulting document tensor is summarized by concatenating its max-pooled
    | and mean-pooled vectors, and a multilayer perceptron predicts an output
    | vector of length #[code nr_class]. A logistic activation is then applied
    | elementwise, so the value of each output neuron is the probability that
    | the corresponding class is present.
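
p
    | The summarization and output steps described above can be sketched in
    | plain Python. This is an illustrative toy under stated assumptions, not
    | spaCy's implementation: the function names, the single ReLU hidden layer,
    | the vector sizes and the hand-picked weights are all hypothetical, and the
    | CNN that would produce the per-token vectors is left out.

```python
import math


def max_mean_pool(token_vectors):
    """Concatenate the elementwise max and mean over all token vectors."""
    n_tokens = len(token_vectors)
    columns = list(zip(*token_vectors))  # one tuple per vector dimension
    max_pooled = [max(col) for col in columns]
    mean_pooled = [sum(col) / n_tokens for col in columns]
    return max_pooled + mean_pooled


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def logistic(x):
    """Elementwise logistic activation."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


def predict(token_vectors, hidden_w, hidden_b, out_w, out_b):
    """Pool the document tensor, run a small MLP, return per-class scores."""
    summary = max_mean_pool(token_vectors)
    hidden = relu(linear(summary, hidden_w, hidden_b))
    return logistic(linear(hidden, out_w, out_b))


# Toy document tensor: 3 tokens, 2 dimensions each (stand-in for CNN output)
doc_tensor = [[1.0, -2.0], [3.0, 0.0], [-1.0, 2.0]]
summary = max_mean_pool(doc_tensor)  # length 4: 2 max values + 2 mean values

# Hand-picked weights: 4 inputs -> 2 hidden units -> nr_class = 3 outputs
hidden_w = [[0.5, 0.0, -1.0, 0.0], [0.0, -0.5, 0.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0]]
out_b = [0.0, 0.0, 0.0]

scores = predict(doc_tensor, hidden_w, hidden_b, out_w, out_b)
print(scores)
```

p
    | Concatenating the two pooled vectors doubles the width of the summary, and
    | the elementwise logistic keeps every score in the open interval (0, 1)
    | without forcing the scores to sum to 1.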
+under-construction