spaCy


Daniël de Kok da7ad97519 Update `TextCatBOW` to use the fixed `SparseLinear` layer (#13149)	2023-11-29 09:11:54 +01:00
quickstart_training.jinja	Update `TextCatBOW` to use the fixed `SparseLinear` layer (#13149)	2023-11-29 09:11:54 +01:00
quickstart_training_recommendations.yml	Add transformer recommendation for ca (#11819)	2022-11-18 08:15:27 +01:00

Update `TextCatBOW` to use the fixed `SparseLinear` layer (#13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: https://github.com/explosion/thinc/pull/754

This change updates `TextCatBOW` to `v3`, which uses the new
`SparseLinear_v2` layer. On the text categorization task we tested,
this gave a sizeable improvement.

While we are at it, `spacy.TextCatBOW.v3` also adds a `length_exponent`
option that makes it possible to change the hidden size. Ideally, the
option would simply be called `length`, but the way `TextCatBOW` uses
hashes leads to a non-uniform distribution of parameters when the
length is not a power of two.

* Replace the TextCatBOW `length_exponent` parameter with `length`

The length is now rounded up to the next power of two if it is not
already a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import

2023-11-29 09:11:54 +01:00
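
The rounding described in the commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper names `round_up_to_power_of_two` and `reachable_buckets` are hypothetical, and the second helper assumes that a bucket index is obtained by masking a hash with `length - 1`, which is one common scheme that only covers every bucket when `length` is a power of two. The actual hashing in `SparseLinear_v2` may differ.

```python
def round_up_to_power_of_two(length: int) -> int:
    """Return the smallest power of two that is >= length (hypothetical helper)."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    # (length - 1).bit_length() is the number of bits needed for length - 1,
    # so shifting 1 by that amount gives the next power of two (or length
    # itself when it already is one).
    return 1 << (length - 1).bit_length()


def reachable_buckets(length: int, n_hashes: int = 1024) -> set:
    """Buckets that can be hit when indexing with `hash & (length - 1)`.

    ASSUMPTION: mask-based indexing is used here purely for illustration.
    Consecutive integers stand in for real hash values.
    """
    mask = length - 1
    return {h & mask for h in range(n_hashes)}
```

Under that masking assumption, `reachable_buckets(6)` never reaches buckets 2 and 3, while `reachable_buckets(round_up_to_power_of_two(6))` covers all eight buckets, which is the kind of non-uniformity that rounding up avoids.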
