25 Commits

Author SHA1 Message Date
mehrad b3bbad24c5 update tests 2019-03-04 15:08:56 -08:00
mehrad ef94113a67 let user specify total number of checkpoints to keep 2019-03-04 12:04:03 -08:00
Giovanni Campagna 96ce6dc26f saver: add .pth to new file 2019-03-04 00:21:42 -08:00
Giovanni Campagna d1e54e9734 saver: actually save checkpoint.json
Won't go far without it
2019-03-03 23:37:37 -08:00
Giovanni Campagna c9d8e920a9 Fix logger.debug call
It does not take more than one argument
2019-03-03 22:50:30 -08:00
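The kind of mistake this commit describes can be sketched as follows. This is an illustration rather than the code that was changed: the logger name and the `ListHandler` class are hypothetical. Python's `logging` methods take one message string, and any further positional arguments are treated as values for `%`-style placeholders in that message, so a `print()`-style call with several unrelated arguments does not format correctly.

```python
import logging

logger = logging.getLogger("debug_call_sketch")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

# A print()-style call such as the one below fails at formatting time,
# because "epoch" has no placeholder for the extra argument:
#     logger.debug("epoch", 3)

# Either put placeholders in the message ...
logger.debug("epoch %d, loss %.2f", 3, 0.5)
# ... or build a single string first.
logger.debug("epoch " + str(3))
```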
Giovanni Campagna b950927a2b Clean up old checkpoints as we go along
Introduce a utility Saver class that does what tensorflow's Saver
does: keeps track of saved checkpoints in a separate file, and
deletes the old ones before saving a new one.
2019-03-03 22:35:29 -08:00
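A minimal sketch of the idea described in this commit, not the actual class from the repository. The constructor arguments, the `save` signature, and writing raw bytes (instead of serializing a model) are assumptions made to keep the example self-contained. The `checkpoint.json` index and the `.pth` suffix follow the later commits in this log.

```python
import json
import os


class Saver:
    """Hypothetical sketch: keeps at most `max_to_keep` checkpoint files."""

    def __init__(self, savedir, max_to_keep=5):
        if max_to_keep < 1:
            raise ValueError("max_to_keep must be at least 1")
        self._savedir = savedir
        self._max_to_keep = max_to_keep
        self._index_path = os.path.join(savedir, "checkpoint.json")
        self._checkpoints = []
        # Resume from an existing index so old files are still tracked.
        if os.path.exists(self._index_path):
            with open(self._index_path) as fp:
                self._checkpoints = json.load(fp)

    def save(self, data, name):
        # Delete the oldest checkpoints first, so that at most
        # max_to_keep files remain once the new one is written.
        while len(self._checkpoints) >= self._max_to_keep:
            oldest = self._checkpoints.pop(0)
            try:
                os.unlink(os.path.join(self._savedir, oldest))
            except FileNotFoundError:
                pass  # already removed by hand; nothing to do

        filename = name + ".pth"
        with open(os.path.join(self._savedir, filename), "wb") as fp:
            fp.write(data)  # assumption: `data` is already serialized bytes

        # Record the new checkpoint in the separate index file.
        self._checkpoints.append(filename)
        with open(self._index_path, "w") as fp:
            json.dump(self._checkpoints, fp)
```

With `max_to_keep=2`, saving four checkpoints in a row leaves only the last two files on disk, plus the index listing them.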
Giovanni Campagna 410c6cd8ec Merge remote-tracking branch 'origin/master' into wip/more-cleanups 2019-03-03 22:16:09 -08:00
Giovanni Campagna c2406e62e0 Fix "get_commit" when invoking "decanlp" installed with "pip install -e"
sys.argv will be a script living in ~/.bin that loads and executes
the module, so it will not live in the git repository
2019-03-02 18:28:04 +00:00
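One way to make the commit lookup independent of `sys.argv` is to start from the location of the module file instead. The sketch below is an assumption about the approach, not the repository's `get_commit`: the `module_file` parameter and the empty-string fallback are hypothetical. It only relies on `git rev-parse HEAD`, which prints the current commit hash.

```python
import os
import subprocess


def get_commit(module_file):
    """Return the HEAD commit of the repository containing `module_file`.

    Returns an empty string when git is missing or the file does not
    live inside a git repository (a hypothetical fallback).
    """
    directory = os.path.dirname(os.path.abspath(module_file))
    try:
        output = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            cwd=directory,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        return ""
    return output.decode("ascii").strip()
```

A module would call this as `get_commit(__file__)`, which resolves to the source checkout even when the entry point is a wrapper script installed elsewhere.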
Giovanni Campagna 1aa6306702 server: automatically grow the embedding matrix when new words are encountered
Otherwise, we feed actual unks to the model, while the model
is trained with character embeddings and expects to see some
actual value for everything.
2019-03-02 00:44:51 -08:00
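The growth step can be pictured with plain Python lists. This is only a sketch of the idea: the `GrowingVocab` class, its method names, and the `embed_fn` callback (standing in for whatever computes a vector for an unseen word, such as a character embedding) are all hypothetical, and the real server works on tensors rather than lists.

```python
class GrowingVocab:
    """Hypothetical sketch: a vocabulary whose embedding table grows on demand."""

    def __init__(self, words, vectors, embed_fn):
        self.stoi = {word: index for index, word in enumerate(words)}
        self.vectors = [list(vector) for vector in vectors]
        self._embed_fn = embed_fn  # assumption: maps a word to a vector

    def lookup(self, word):
        # A new word gets the next free index and a freshly computed row,
        # instead of being mapped to the unknown-word index.
        if word not in self.stoi:
            self.stoi[word] = len(self.vectors)
            self.vectors.append(list(self._embed_fn(word)))
        return self.stoi[word]
```

Looking up the same new word twice adds only one row, so the table grows with the number of distinct unseen words.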
Giovanni Campagna 72175af080 torchtext.field: use <unk> as the unk token
" UNK " has spaces, and a token with spaces is asking for trouble
2019-03-02 00:17:34 -08:00
Giovanni Campagna dac1e88a8e server: close gracefully on Ctrl-c and EOF 2019-03-02 00:17:34 -08:00
mehrad 3a1b969ad3 updates 2019-03-01 17:51:54 -08:00
Giovanni Campagna 3a1fb01286 server: flush stdout after a request 2019-03-01 17:40:19 -08:00
Giovanni Campagna ce5c1aa8df Use logger instead of print()
print() uses stdout by default, which has two problems:

- it is not flushed until later (so messages don't show, or don't show
  up in order with other loggers)
- it conflicts with stdin/stdout usage by `decanlp server --stdin`
2019-03-01 17:35:04 -08:00
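A short sketch of the pattern this commit argues for, under assumed names: the logger name, `configure_logging`, and `answer` are hypothetical. `logging.StreamHandler` writes to `sys.stderr` unless told otherwise and flushes after every record, so diagnostics show up promptly and stay out of the stdout stream that carries responses.

```python
import logging
import sys

logger = logging.getLogger("decanlp_sketch")  # hypothetical logger name


def configure_logging(level=logging.INFO):
    # Diagnostics go to stderr; stdout stays free for protocol output.
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler


def answer(request):
    # Hypothetical request handler: logs a diagnostic, returns the reply.
    logger.info("handling request of %d characters", len(request))
    return request.upper()
```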
Giovanni Campagna 8d4136a79a server: add a simpler interface that just works over stdin/stdout 2019-03-01 16:29:05 -08:00
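A line-based loop over stdin/stdout could look roughly like the sketch below. The `serve` function, the one-request-per-line framing, and the `handle` callback are assumptions for illustration; the actual wire format of `decanlp server --stdin` is not shown in this log.

```python
import sys


def serve(handle, in_stream=sys.stdin, out_stream=sys.stdout):
    """Read one request per line, write one response per line.

    The loop ends when the input stream reaches EOF.
    """
    for line in in_stream:
        request = line.strip()
        if not request:
            continue  # ignore blank lines
        out_stream.write(handle(request) + "\n")
        # Flush so a client waiting on a pipe sees the reply right away.
        out_stream.flush()
```

Passing the streams as parameters keeps the loop testable with in-memory files such as `io.StringIO`.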
Giovanni Campagna 1a2a4a9ea9 One more stanford copyright 2019-03-01 16:18:10 -08:00
Giovanni Campagna 03ce9501ad server: remove --data argument
The server does not load any data file
2019-03-01 16:14:08 -08:00
Giovanni Campagna bafabac483 Fix argument handling 2019-03-01 16:13:10 -08:00
Giovanni Campagna ad1a15637a Fix missing import 2019-03-01 16:09:58 -08:00
Giovanni Campagna 4bddbada45 Fix typo 2019-03-01 16:08:52 -08:00
Giovanni Campagna ac3ac8c680 Add Stanford copyright to all files that we touched 2019-03-01 15:54:54 -08:00
Giovanni Campagna 8e2b519ac3 Add copyright notices to all files
Makes the license clear and explicit
2019-03-01 15:51:45 -08:00
Giovanni Campagna 99e39f9528 Move generic dataset outside of torchtext
There is no reason for it to live inside torchtext, and having it
outside will help with using torchtext as a library
2019-03-01 15:47:11 -08:00
Giovanni Campagna 5447d0c37c Add a "decanlp" script that calls out to the different subcommands
Usage:
- decanlp train ...
- decanlp predict ...
- decanlp convert-to-logical-forms ...
2019-03-01 15:43:02 -08:00
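The dispatch described above can be sketched with a plain dictionary. This is an assumed structure, not the script from the commit: the handler functions are placeholders and the usage text is hypothetical. Only the three subcommand names come from the commit message.

```python
import sys


def train(argv):
    # Placeholder for the real training entry point.
    return 0


def predict(argv):
    # Placeholder for the real prediction entry point.
    return 0


def convert_to_logical_forms(argv):
    # Placeholder for the real conversion entry point.
    return 0


SUBCOMMANDS = {
    "train": train,
    "predict": predict,
    "convert-to-logical-forms": convert_to_logical_forms,
}


def main(argv):
    # argv[0] is the program name, argv[1] selects the subcommand,
    # and everything after that is handed to the subcommand untouched.
    if len(argv) < 2 or argv[1] not in SUBCOMMANDS:
        print("Usage: %s SUBCOMMAND [OPTIONS]" % argv[0], file=sys.stderr)
        print("Available subcommands: " + ", ".join(SUBCOMMANDS), file=sys.stderr)
        return 1
    return SUBCOMMANDS[argv[1]](argv[2:])
```

An entry point would then end with `sys.exit(main(sys.argv))`, so that `decanlp train ...` forwards the remaining options to `train`.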
Giovanni Campagna 41b80bb4f4 Move all python files to a decanlp/ package
As per python conventions
2019-03-01 15:43:02 -08:00