Commit Graph

64 Commits

Author SHA1 Message Date
mehrad 16e7cac99d Add tests for sts filtering 2021-01-03 11:54:07 -08:00
Giovanni Campagna f283774bb9
Merge pull request #66 from stanford-oval/wip/version_bumps
Wip/version bumps
2020-12-30 09:03:13 -08:00
Giovanni Campagna 10b51e4623
Fix exporting models (#67)
Initialize the numericalizer correctly so we don't try to load files
that don't exist
2020-12-28 19:13:50 -08:00
Sina aa9927a100 Add support and test for diverse beam search 2020-12-20 00:08:10 -08:00
Giovanni Campagna 234180487b tests: reduce batch size to make travis happy 2020-12-19 10:54:29 -08:00
Giovanni Campagna 826d54ccd7 tests: temporarily remove mt5 test
Too big
2020-12-19 10:39:50 -08:00
Giovanni Campagna ebc09794f2 tests: ensure we run T5 with --preprocess-special-tokens
T5 models don't know how to extend their vocabulary properly, so
they cannot deal with new special tokens and must preprocess all
special tokens into regular words.
2020-12-14 23:11:17 -08:00
Giovanni Campagna db8e9fb1ee Merge remote-tracking branch 'origin/master' into wip/thingtalk2 2020-12-14 21:46:51 -08:00
Giovanni Campagna ae93e290ec Add automatic preprocessing of special ThingTalk tokens
Automatically compute a replacement for all @ tokens, ^^ tokens,
and entity tokens, into natural-language-like words. The replacement
is based on the longest unambiguous suffix after splitting on
delimiters.
2020-12-10 18:41:26 -08:00
mehrad 84ada6eedb Update test data for t5 2020-12-10 18:03:49 -08:00
mehrad 8f1c080936 Add tests for mbart and mt5 2020-12-10 11:54:45 -08:00
Giovanni Campagna 8b34f7bed8 Fix tests, address review comments 2020-12-07 22:46:25 -08:00
Giovanni Campagna 1ebc6a6a95 Cleanup embedding code and command-line flags
Remove all flags that are always disabled (like the Transformer
layers on the decoder side) and set the dimension based on
the pretrained model size.
2020-12-07 08:01:45 -08:00
Giovanni Campagna e930a6a0b5 Remove GloVe embeddings
Not useful in a modern NLP pipeline
2020-12-05 11:36:34 -08:00
mehrad 69c0d5f801 Fix and update tests 2020-12-03 17:24:55 -08:00
mehrad 8b644cca02 bump transformers version to 3.5.1 2020-12-03 12:50:34 -08:00
s-jse 653424214c
Merge pull request #58 from stanford-oval/wip/lower-memory
Submission work
2020-12-03 00:23:03 -08:00
mehrad f0a77fc172 Fix and add test for paired encoding 2020-12-02 21:55:35 -08:00
mehrad 2384ae9b1e Use bart-tiny-random instead of bart-large for tests 2020-11-28 14:57:34 -08:00
mehrad c3bc5835ad Add translation test files 2020-11-15 14:33:45 -08:00
mehrad dbfdfb20d3 run_lm_finetuning: fix attention_mask 2020-11-15 14:22:28 -08:00
mehrad 744a7a8483 run_generation.py: fix and update denoising functions 2020-11-15 13:46:22 -08:00
mehrad b84a6548a6 Changes to make run_lm_finetuning compatible with mbart and marian 2020-11-15 13:46:17 -08:00
Sina 2f94fc2d35 Fix bug 2020-11-11 17:21:12 -08:00
Sina e2ad021972 Clarify test case 2020-11-07 23:42:27 -08:00
Sina 9a847b4b1c Add Bart test 2020-11-07 15:19:44 -08:00
Sina cdcfa97ce0 Fix bugs 2020-07-24 16:32:54 -07:00
Sina 44021e2ba0 Add test for server mode 2020-07-24 16:32:54 -07:00
mehrad 5c0453a4ec Add multilingual dialog almond tasks 2020-07-22 22:04:27 -07:00
mehrad 813e28b8fa Update tests 2020-07-20 22:49:31 -07:00
Sina a943b2a1b5 Postpone BART tests until a later transformers version 2020-07-03 18:46:29 -07:00
Sina 04104f0dc3 Free up space during tests 2020-07-03 17:59:10 -07:00
Sina f8976ac6d7 Tests are less resource-intensive to run on Travis 2020-07-03 17:14:24 -07:00
Sina 31fca52dde Fix BART tests 2020-07-03 16:44:29 -07:00
Sina eedb4b6b5e Merge branch 'master' into paraphrase-merge 2020-07-03 15:09:51 -07:00
mehrad 6de6bd98a5 Added option to use language in the question 2020-06-23 14:58:43 -07:00
mehrad 0489d19f2e Option to pool encoded values by choosing cls token representation 2020-06-23 14:58:43 -07:00
mehrad a93dfc9099 Merge BART code with the rest of paraphrasing codebase 2020-05-04 15:54:09 -07:00
mehrad 50c4de9d60 merge master branch 2020-05-04 10:06:35 -07:00
mehrad 8ba1f4c490 code refactoring 2020-05-04 10:03:08 -07:00
mehrad 5f871512a6 bart_evaluation is integrated with run_generation script 2020-04-29 21:14:03 -07:00
mehrad 9d5945376f move files to test dir 2020-04-27 16:36:36 -07:00
mehrad 2eb78d7080 add tests 2020-04-27 16:30:19 -07:00
mehrad 36b0fb5317 fix bug in iter 2020-04-19 18:03:11 -07:00
Sina e24e426a3c test datasets should not be ignored 2020-04-18 19:54:48 -07:00
Sina a122c05062 paraphrase training is now done with tsv files 2020-04-18 18:46:45 -07:00
mehrad 891bee8632 merge master branch 2020-04-09 22:10:35 -07:00
mehrad 31190c3f18 update tests 2020-04-06 16:03:05 -07:00
mehrad f3bf392231 adding tests + fixing bugs 2020-03-30 14:43:03 -07:00
mehrad 703edc4fdc Option to do sentence batching
Batches can have multiple minibatches, each containing the same sentences in a different language
2020-03-30 12:50:02 -07:00
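
The message for commit ae93e290ec above describes replacing special tokens (@ tokens, ^^ tokens, entity tokens) with natural-language-like words, based on an unambiguous suffix after splitting on delimiters. The following is a minimal Python sketch of one reading of that idea, not the project's actual code. The delimiter set, the function names, and the example tokens are all assumptions made for illustration. It also picks the shortest suffix that no other token shares, whereas the commit message says "longest" without spelling out the selection rule.

```python
import re
from collections import Counter

# Hypothetical delimiter set; the commit message does not list the delimiters.
DELIMITERS = re.compile(r"[.:_@^]+")


def split_token(token):
    """Split a special token into its non-empty parts."""
    return [part for part in DELIMITERS.split(token) if part]


def suffixes(parts):
    """Yield the suffixes of `parts` as tuples, from shortest to longest."""
    for start in range(len(parts) - 1, -1, -1):
        yield tuple(parts[start:])


def build_replacements(tokens):
    """Map each token to a word-like replacement built from an unambiguous suffix."""
    split = {token: split_token(token) for token in tokens}
    # Count how many distinct tokens share each suffix.
    counts = Counter(s for parts in split.values() for s in suffixes(parts))
    replacements = {}
    for token, parts in split.items():
        # Fall back to all parts, then prefer the shortest suffix
        # that no other token shares.
        chosen = tuple(parts)
        for s in suffixes(parts):
            if counts[s] == 1:
                chosen = s
                break
        replacements[token] = " ".join(chosen)
    return replacements


# Made-up tokens, for illustration only.
example = build_replacements([
    "@com.twitter.post",
    "@com.facebook.post",
    "@com.twitter.search",
    "^^com.twitter:username",
])
```

With these made-up tokens, the two `post` functions need two parts to be told apart ("twitter post", "facebook post"), while `search` and `username` are already unique on their last part. Tokens whose parts are identical after splitting would still collide in this sketch.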