mehrad
3cb97e63bf
Adding curriculum learning
...
Introducing harder datasets gradually, after the model has learned the basic features from easier datasets, makes training more robust and usually yields better precision.
2019-03-12 10:47:57 -07:00
Giovanni Campagna
1aa6306702
server: automatically grow the embedding matrix when new words are encountered
...
Otherwise, we feed actual unks to the model, while the model
is trained with character embeddings and expects to see some
actual value for everything.
2019-03-02 00:44:51 -08:00
Giovanni Campagna
72175af080
torchtext.field: use <unk> as the unk token
...
" UNK " has spaces, and a token with spaces is asking for trouble
2019-03-02 00:17:34 -08:00
Giovanni Campagna
99e39f9528
Move generic dataset outside of torchtext
...
There is no reason for it to live inside torchtext, and having it
outside will help with using torchtext as a library
2019-03-01 15:47:11 -08:00
Giovanni Campagna
41b80bb4f4
Move all python files to a decanlp/ package
...
As per Python conventions
2019-03-01 15:43:02 -08:00