Commit Graph

167 Commits

Author SHA1 Message Date
Sina abc65eaf53 Remove `train-paraphrase` command 2022-08-26 12:04:35 -07:00
Sina bc82aeb4b6 Add test to check the outputs of --is_hf_model 2022-08-25 20:29:37 -07:00
Sina c046ec0eb0 Fix tests 2022-08-25 18:37:49 -07:00
Sina 25ebee3964 Factor out some shared parameters in test cases 2022-08-24 21:20:05 -07:00
Sina c5d04f31bc Make tests multi-line commands 2022-08-24 21:04:08 -07:00
Sina 16b7c7a717 Add test for direct HF model prediction 2022-08-24 20:40:48 -07:00
mehrad 996b406184
Add test for bitod_error_cls task 2022-08-08 11:56:31 -07:00
mehrad c1c38298ec
Update OOD expected results 2022-08-05 14:20:33 -07:00
mehrad 101a6d3f6c
Update e2e tests and results 2022-06-29 14:35:56 -07:00
mehrad 6e8e7f1f84
Update e2e_dialogue_preds.json
Due toX_re changes in bitod/main (7b1e6ae9c4fdfc124140ca33ff71626037a39820)
2022-06-16 16:59:11 -07:00
mehrad 968b708336
Fix and update tests 2022-05-09 14:16:41 -07:00
mehrad 7ce4cdaf7c
Add test for entity translation 2022-05-09 10:48:12 -07:00
dependabot[bot] 1613516699
build(deps): bump transformers from 4.16.2 to 4.17.0 (#256)
* build(deps): bump transformers from 4.16.2 to 4.17.0

Bumps [transformers](https://github.com/huggingface/transformers) from 4.16.2 to 4.17.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.16.2...v4.17.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bart tokenizer no longer inherits from Roberta

* Add XGLMTokenizer

* Update TokenClassification test results

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: mehrad <mehrad@stanford.edu>
2022-03-07 11:14:59 -08:00
mehrad 4fe3f06bed
test_paraphrasing: compute rouge1 and rougeL 2022-03-02 17:39:44 -08:00
mehrad 9a6de26afb
Update e2e tests
Use a pretrained marian model instead of randomly initialized bart so we get some non-trivial results.
Shrink bitod test datasets. Now that we are testing e2e_dialogue_evaluation it will help to keep testing time almost the same.
2022-03-02 16:42:26 -08:00
mehrad 26fb840339
Test evaluate-file 2022-03-02 16:27:22 -08:00
mehrad 941881fb59
test_e2e_dialogues: test e2e prediction too 2022-03-02 16:27:21 -08:00
mehrad 1b790f6658
Remove test_main_almond_multilingual 2022-02-28 13:22:45 -08:00
mehrad 96f6db7e3f
Remove data caching
it's almost never been used
2022-02-28 13:19:39 -08:00
mehrad e07c8c5216
Update test results after adding input column to predictions 2022-02-25 14:01:41 -08:00
mehrad 86044a8d9b
Add tests for bitod tasks 2022-02-25 13:33:27 -08:00
Sina 53746f351e Fix prediction with `num_beams > 1` and `num_outputs > 1` 2022-02-21 20:51:36 -08:00
mehrad 633ab40ce8
Update tests 2021-12-15 15:25:31 -08:00
Mehrad Moradshahi 1d50ba072e
Fix bug in server for TransformerLSTM (#229) 2021-11-29 10:51:47 -08:00
mehrad 699b784483
predict: assign eval_lang to pred_lang if not provided 2021-10-06 17:33:41 -07:00
mehrad 302075f32c
Update bleu test expected results
In sacrebleu 2.0 except bleu score is 0.0 if there are no ngram matches (i.e. smoothing is skipped)
2021-09-20 09:53:25 -07:00
mehrad b0e02a72d7
Address PR comments
- change default names for add_entities_to_text and ned_normalize_types
- improve code style and comments
- move banned_phrases to a text file
2021-09-06 17:17:26 -07:00
mehrad 727e607040
Remove --database_lookup_method argument
and some cleanups
2021-09-06 17:17:24 -07:00
mehrad 9dc6295d45
Use marisa_trie for alias2qids and qid2typeqid to save memory 2021-09-06 17:17:24 -07:00
mehrad b64b952948
Rename bootleg_post_process_types to ned_normalized_types
- the argument is not specific to bootleg anymore and can be used with Naive and Oracle too
- it now has three modes: no, yes, and force. The previous functionality supported only yes and no.
2021-09-06 17:17:24 -07:00
kevintangzero 911fef976c Add a test script for cuda errors 2021-08-10 03:42:08 -07:00
mehrad e584ac70d4
Rename canonical2type.json to alias2type.json 2021-08-05 01:16:56 -07:00
Mehrad Moradshahi d1bc56427e
Merge pull request #183 from stanford-oval/wip/mehrad/pr_10
NED Refactoring (2)
2021-08-04 10:28:50 -07:00
mehrad 13b8f23f14
Address PR comments (2) 2021-08-02 13:07:40 -07:00
mehrad df01c76924
Remove ElasticDatabase and elasticsearch dep 2021-08-01 20:55:33 -07:00
mehrad b46d477724
Address PR comments 2021-08-01 20:55:32 -07:00
mehrad 49ab643afd
Updates to make tests pass 2021-08-01 12:10:45 -07:00
mehrad bcb5ea5528
Remove nonessential files from database dir 2021-08-01 12:10:44 -07:00
mehrad 455dc7c869
Use add_entity_to_text and entity_attributes
add_entity_to_text tells how to add entity attributes to text.
entity_attributes indicates what to add (e.g. type, qid)
2021-08-01 12:09:43 -07:00
mehrad b52ac345e4
Rename Feature to Entity 2021-08-01 12:09:42 -07:00
kevintangzero f404fcc130 Add prediction and compare expected results in ood test script 2021-08-01 02:35:22 -07:00
kevintangzero 0a5b3a6f4b Fix dataloader and dataset 2021-07-31 05:18:32 -07:00
kevintangzero 9f6c414aee Split training and test data 2021-07-29 02:46:51 -07:00
kevintangzero 61cabfa07f Fix bugs in generate_with_classification_model 2021-07-27 08:13:01 -07:00
kevintangzero 6837d40a43 Add ood dataloader, model, and toy dataset 2021-07-27 02:29:43 -07:00
mehrad c4bad249eb
Major refactoring of alignment code 2021-07-23 17:47:22 -07:00
mehrad 3d9708b3cb
predict: include gold answers in prediction file
- so running local evaluation is possible without accessing the original dataset
- easier debugging since the answer might have been changed after passing through genienlp
2021-07-23 17:47:13 -07:00
mehrad 871e5d059c
Updates to make NED tests pass 2021-07-14 11:45:13 -07:00
mehrad 724dc23b57
Simplify how we keep track of spans 2021-06-29 21:45:24 -07:00
mehrad d96ddcd08f
Misc. code updates
update translation tests, address some bugs
2021-06-29 19:41:39 -07:00