Commit Graph

4 Commits

Author SHA1 Message Date
Giovanni Campagna 1aa6306702 server: automatically grow the embedding matrix when new words are encountered
Otherwise, we feed actual unks to the model, while the model
is trained with character embeddings and expects to see some
actual value for everything.
2019-03-02 00:44:51 -08:00
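
Note on the commit above: the idea of growing the embedding matrix at serving time can be sketched roughly as follows. This is a minimal illustration, not the code from the commit. `GrowableEmbedding`, `char_embedding`, and `EMBED_DIM` are hypothetical names, and the hash-based vector is only a stand-in for whatever character-level embedding the real model uses.

```python
import hashlib

EMBED_DIM = 4  # hypothetical dimension, kept tiny for illustration


def char_embedding(word, dim=EMBED_DIM):
    """Hypothetical stand-in for a character-level embedder.

    Derives a deterministic vector from the word's bytes, so every word,
    including one never seen before, gets some concrete value.
    """
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    # Map the first `dim` bytes (0..255) to floats in [-1, 1].
    return [(b / 127.5) - 1.0 for b in digest[:dim]]


class GrowableEmbedding:
    """Vocabulary plus embedding rows that grow when new words appear."""

    def __init__(self, words):
        self.stoi = {}   # word -> row index
        self.rows = []   # one embedding vector per known word
        self.grow(words)

    def grow(self, words):
        """Append a row for each unseen word; return how many were added."""
        added = 0
        for word in words:
            if word not in self.stoi:
                self.stoi[word] = len(self.rows)
                self.rows.append(char_embedding(word))
                added += 1
        return added

    def lookup(self, tokens):
        """Grow the matrix first, so no token has to fall back to <unk>."""
        self.grow(tokens)
        return [self.rows[self.stoi[t]] for t in tokens]


emb = GrowableEmbedding(["<unk>", "the"])
vectors = emb.lookup(["the", "zebra", "zebra"])  # "zebra" is new here
```

The design choice being illustrated is that lookup grows the table before indexing, so the model receives a real vector for every token rather than a shared unknown-token row.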
Giovanni Campagna 72175af080 torchtext.field: use <unk> as the unk token
" UNK " has spaces, and a token with spaces is asking for trouble
2019-03-02 00:17:34 -08:00
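
Note on the commit above: one plausible way a token containing spaces causes trouble is a whitespace round trip, sketched below. This is an assumption about the kind of failure meant, not a reproduction of a bug from the repository, and `roundtrip` is a hypothetical helper.

```python
def roundtrip(tokens):
    """Join tokens with spaces, then split on whitespace again."""
    return " ".join(tokens).split()


old_style = ["hello", " UNK ", "world"]   # unk token with embedded spaces
new_style = ["hello", "<unk>", "world"]   # unk token without spaces

# The spaced token loses its padding, so it no longer matches the
# vocabulary entry " UNK " after the round trip.
old_survives = roundtrip(old_style) == old_style

# The space-free token comes back unchanged.
new_survives = roundtrip(new_style) == new_style
```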
Giovanni Campagna 99e39f9528 Move generic dataset outside of torchtext
There is no reason for it to live inside torchtext, and having it
outside will help with using torchtext as a library
2019-03-01 15:47:11 -08:00
Giovanni Campagna 41b80bb4f4 Move all python files to a decanlp/ package
As per Python conventions
2019-03-01 15:43:02 -08:00