20 Commits

Author SHA1 Message Date
mpuels ee4d6fdd40 Fix typo in comment 2017-12-09 13:14:57 +01:00
Ines Montani 1a23a0f87e Remove broken link (resolves #1541) 2017-11-10 12:28:39 +01:00
ines 89bd40b821 Fix print statement in textcat training example (resolves #1515) 2017-11-08 17:17:40 +01:00
ines a09c096d3c Get docs ready for v2.0.0 2017-11-07 12:00:43 +01:00
ines 173b1551af Update examples 2017-11-07 01:22:30 +01:00
ines 1b1c9105b4 Update example compatibility statements 2017-11-07 01:11:45 +01:00
ines 8fb48b9b91 Update and document new util functions 2017-11-07 00:22:43 +01:00
ines fe498b3d5e Update training examples to use "simple style" 2017-11-06 23:14:04 +01:00
ines 8f1d3fc3ee Update textcat example 2017-11-01 17:09:22 +01:00
Matthew Honnibal dad8f09fba Fix print statements in text classifier example 2017-11-01 16:34:31 +01:00
ines bfe17b7df1 Fix begin_training if get_gold_tuples is None 2017-11-01 13:14:31 +01:00
ines 4b196fdf7f Fix formatting 2017-11-01 00:43:22 +01:00
ines a7b9074b4c Update textcat training example and docs 2017-10-27 00:48:45 +02:00
ines b61866a2e4 Update textcat example 2017-10-27 00:32:19 +02:00
Matthew Honnibal 563f46f026 Fix multi-label support for text classification
The TextCategorizer class is supposed to support multi-label
text classification, and allow training data to contain missing
values.

For this to work, the gradient of the loss should be 0 when labels
are missing. Instead, there was no way to actually denote "missing"
in the GoldParse class, and so the TextCategorizer class treated
the label set within gold.cats as complete.

To fix this, we change GoldParse.cats to be a dict instead of a list.
The GoldParse.cats dict should map to floats, with 1. denoting
'present' and 0. denoting 'absent'. Gradients are zeroed for categories
absent from the gold.cats dict. A nice bonus is that you can also set
values between 0 and 1 for partial membership. You can also set numeric
values, if you're using a text classification model that uses an
appropriate loss function.

Unfortunately this is a breaking change; although the functionality
was only recently introduced and hasn't been properly documented
yet. I've updated the example script accordingly.
2017-10-05 18:43:02 -05:00
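
The behaviour described in commit 563f46f026 can be illustrated with a minimal sketch. This is not the TextCategorizer implementation; the function name `masked_gradient`, the label names, and the squared-error loss are assumptions chosen for illustration. It only shows the idea that labels missing from a `cats`-style dict contribute a zero gradient, while labels that are present (including partial-membership values between 0 and 1) contribute the usual error term.

```python
def masked_gradient(scores, cats, labels):
    """Gradient of 0.5 * (score - truth)**2 with respect to each score.

    scores: predicted value per label, in the same order as `labels`.
    cats:   dict mapping label -> float (1.0 present, 0.0 absent,
            values in between for partial membership).
    Labels that do not appear in `cats` are treated as missing and
    receive a gradient of exactly 0.0 (hypothetical helper, not spaCy API).
    """
    grad = []
    for label, score in zip(labels, scores):
        if label in cats:
            # Annotated label: ordinary squared-error gradient.
            grad.append(score - float(cats[label]))
        else:
            # Missing label: no training signal.
            grad.append(0.0)
    return grad


labels = ["POSITIVE", "NEGATIVE", "SPAM"]
scores = [0.75, 0.5, 0.25]
cats = {"POSITIVE": 1.0, "NEGATIVE": 0.0}  # "SPAM" is not annotated

grad = masked_gradient(scores, cats, labels)
print(grad)
```

Here the "SPAM" score is left untouched by the update because the annotation says nothing about it, which is different from annotating it as 0.0 (absent).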
Matthew Honnibal f1b86dff8c Update textcat example 2017-10-04 15:12:28 +02:00
Matthew Honnibal 79a94bc166 Update textcat exampe 2017-10-04 14:55:30 +02:00
Matthew Honnibal c16ef0a85c Clarify train textcat example 2017-07-29 21:59:27 +02:00
Matthew Honnibal 54a539a113 Finish text classifier example 2017-07-23 00:34:12 +02:00
Matthew Honnibal 2bc7d87c70 Add example for training text classifier 2017-07-22 20:15:32 +02:00