mehrad
a5b885ecd4
fix
2019-11-06 01:47:28 -08:00
mehrad
3addf7da94
fix
2019-11-06 01:18:08 -08:00
mehrad
1339767da3
update FastText embedding
2019-11-06 01:09:55 -08:00
Giovanni Campagna
4e10b5240b
Fix downloading fasttext vectors
2019-11-05 21:54:06 -08:00
Giovanni Campagna
06eca199d7
Merge pull request #19 from stanford-oval/wip/dockerupdate
...
docker: Switch to ubi8 (RHEL 8)
2019-11-04 16:22:35 -08:00
Giovanni Campagna
9cdb85abbb
Merge pull request #20 from stanford-oval/wip/i18n
...
Support word embeddings for arbitrary languages
2019-11-03 22:10:16 -08:00
Giovanni Campagna
2973165de0
util: load "locale" parameter from config.json
...
And if not present, set it to "en"
2019-11-03 21:25:57 -08:00
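The fallback described in commit 2973165de0 can be illustrated with a minimal sketch. The function names below are hypothetical and not taken from the repository; only the "locale" key and the "en" default come from the commit message.

```python
import json


def locale_from_config(config, default="en"):
    """Return the "locale" entry of a parsed config dict, or `default` if absent."""
    return config.get("locale", default)


def load_locale(config_path, default="en"):
    """Read a JSON config file (e.g. config.json) and return its locale."""
    with open(config_path, encoding="utf-8") as f:
        return locale_from_config(json.load(f), default)
```

With this sketch, a config saved before the parameter existed resolves to "en", while a newer one returns whatever tag it stored.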
Giovanni Campagna
ba4c963176
fasttext: use common-crawl not wiki vectors
...
They are trained on a larger corpus and thus better
2019-11-01 17:38:38 -07:00
Giovanni Campagna
d5aacba674
embeddings: allow passing full locale tags as --locale
...
This way, we don't need to do anything too special in Genie,
and we can call decanlp with --locale zh-tw or --locale zh-cn
if needed to distinguish
2019-11-01 17:36:36 -07:00
Giovanni Campagna
d46353d352
Add FastText support to cache-embeddings command
2019-11-01 17:30:59 -07:00
Giovanni Campagna
026e481df9
Add support for arbitrary word embeddings
...
FastText has all the word embeddings we care about anyway
2019-11-01 17:27:35 -07:00
Giovanni Campagna
668cc6f917
docker: Switch to ubi8 (RHEL 8)
...
ubi7 was updated to rhel 7.7, which should contain python36 natively,
but python3 is not part of the ubi repos. Luckily, it is part of the
ubi8 repos instead.
2019-11-01 17:08:06 -07:00
Giovanni Campagna
a546c7b62c
Fix cuda image name
2019-08-31 02:22:04 +02:00
Giovanni Campagna
28e8296032
Merge pull request #17 from stanford-oval/wip/docker
...
Remove obsolete Dockerfiles, and replace with a new one
2019-08-31 00:36:44 +02:00
Giovanni Campagna
dcb4f020ae
docker: fix hooks
...
They are executed from the directory containing the Dockerfile
2019-08-30 23:13:28 +02:00
Giovanni Campagna
86a51de115
MQAN: fix decoding with out of vocabulary words
...
If the vocabulary was limited due to max_generative_words and we
encounter an OOV word, we need to map it back to the full index
in GloVe, using the limited_idx_to_full_idx map set by the server
dynamically.
2019-08-30 21:39:17 +02:00
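The index translation described in commit 86a51de115 might look roughly like the sketch below. It assumes limited_idx_to_full_idx behaves like a dict from limited-vocabulary indices to full embedding indices, which the commit message does not confirm; the function name is hypothetical.

```python
def to_full_indices(predicted, limited_idx_to_full_idx):
    """Map indices from the limited vocabulary back to the full vocabulary.

    Assumption: indices that have no entry in the map are already valid
    in the full vocabulary and are returned unchanged.
    """
    return [limited_idx_to_full_idx.get(idx, idx) for idx in predicted]
```

For example, with a map of `{3: 1057}`, the prediction `[0, 3]` would be translated to `[0, 1057]`.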
Giovanni Campagna
1c0a71aa0b
Add debugging to docker hooks
...
To see exactly how they get called...
2019-08-30 21:37:42 +02:00
Giovanni Campagna
fb72c84f73
docker: move embeddings to a shared directory
...
This way, the image can be used as a base image by almond-cloud
(which runs as a different user)
2019-08-30 21:25:54 +02:00
Giovanni Campagna
ff311bd35b
Add docker hub hooks to build both CPU and GPU images
2019-08-30 21:01:47 +02:00
Giovanni Campagna
a9264a699d
Remove obsolete Dockerfiles, and replace with a new one
...
A new one that wraps the new decanlp command and installs all
the dependencies correctly.
2019-08-30 18:55:28 +02:00
Giovanni Campagna
48b88317b7
Refuse to run with pytorch 1.2.0
...
Because pytorch 1.2.0 changed the behavior of Bools vs Uint8 and
that broke us...
2019-08-08 16:07:48 -07:00
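A guard like the one in commit 48b88317b7 could be written as follows. This is a sketch only: the function name and error text are hypothetical, and the version is passed in as a string (for instance `torch.__version__`) so the check itself does not import PyTorch.

```python
def check_torch_version(version):
    """Raise if `version` is the unsupported PyTorch release 1.2.0."""
    # Drop any local build suffix such as "+cu92" before comparing.
    base = version.split("+")[0]
    if base == "1.2.0":
        raise RuntimeError(
            "pytorch 1.2.0 is not supported; please install a different version"
        )
```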
Giovanni Campagna
f5ea63ecb0
Merge pull request #15 from stanford-oval/wip/mehrad/multi-language
...
Wip/mehrad/multi language
2019-05-30 08:37:37 -07:00
mehrad
21db011ad2
fixes
2019-05-29 20:57:09 -07:00
mehrad
4919b28627
fixes
2019-05-29 19:38:24 -07:00
mehrad
672aa14117
merge branch wip/mehrad/multi_lingual
2019-05-29 18:43:10 -07:00
mehrad
4004664259
fixes
2019-05-29 16:42:36 -07:00
mehrad
a27c3a8d8b
bug fixes
2019-05-29 13:37:18 -07:00
mehrad
d52d862310
fixing bugs
2019-05-29 12:18:57 -07:00
mehrad
eb10a788b0
updates
2019-05-28 18:12:31 -07:00
mehrad
bb90a35bc0
updates
2019-05-28 18:11:32 -07:00
mehrad
612e3bdd4d
output context sentences as well for predict.py
2019-05-28 17:43:10 -07:00
mehrad
85c7a99ec2
bunch of updates
2019-05-22 14:04:16 -07:00
mehrad
90308a84e3
minor fix
2019-05-20 17:50:50 -07:00
mehrad
acfcbf88c4
updating prediction scripts
2019-05-20 17:43:45 -07:00
mehrad
7db36a90af
fixes
2019-05-20 14:29:25 -07:00
mehrad
ec77faacca
Glueing the models together
...
end-to-end finetuning works now
2019-05-20 13:41:59 -07:00
mehrad
9bf2217324
adding arguments
2019-05-20 11:02:42 -07:00
Giovanni Campagna
25db020107
Reduce memory usage while loading almond datasets
...
Don't load all lines in memory
2019-05-15 09:27:00 -07:00
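The idea in commit 25db020107 (not loading all lines in memory) is commonly implemented with a generator. The sketch below is an illustration of that pattern, not the repository's code; the function name is hypothetical.

```python
def iter_nonempty_lines(lines):
    """Yield stripped, non-empty lines one at a time.

    `lines` can be any iterable of strings, including an open file object,
    so only the current line is held in memory.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line:
            yield line
```

Passing an open file handle (`with open(path) as f: for line in iter_nonempty_lines(f): ...`) streams the dataset instead of materializing it with `readlines()`.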
Giovanni Campagna
488a4feb64
Fix contextual almond
2019-05-15 09:26:54 -07:00
mehrad
df94f7dd3a
ignore malformed sentences
2019-05-13 14:50:08 -07:00
mehrad
0495d4d0eb
Updates
...
1) use FastText for encoding Persian text
2) let the user choose the question for almond task
3) bug fixes
2019-05-13 13:03:51 -07:00
Giovanni Campagna
27a3a8b173
Merge pull request #14 from stanford-oval/wip/contextual
...
Contextual Almond
2019-05-10 09:54:51 -07:00
mehrad
925d839e15
adding an end-to-end combined model
2019-04-29 13:56:39 -07:00
Giovanni Campagna
46eaae8ba8
fix almond dataset name
2019-04-23 09:35:57 -07:00
Giovanni Campagna
64020a497f
Add ContextualAlmond task
...
Its training files have four columns: <id> <context> <sentence> <target_program>
2019-04-23 09:30:49 -07:00
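The four-column layout named in commit 64020a497f can be parsed along these lines. The tab separator is an assumption (the commit message only lists the columns), and the class and function names are hypothetical.

```python
from typing import NamedTuple


class ContextualExample(NamedTuple):
    id: str
    context: str
    sentence: str
    target_program: str


def parse_contextual_line(line, sep="\t"):
    """Split one training line into its four fields, or raise ValueError."""
    parts = line.rstrip("\n").split(sep)
    if len(parts) != 4:
        raise ValueError(f"expected 4 columns, got {len(parts)}")
    return ContextualExample(*parts)
```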
Giovanni Campagna
6837cc7e7b
Add missing dependency
...
Probably before it was coming from somewhere else, like allennlp
2019-04-17 12:10:45 -07:00
Giovanni Campagna
73ffec5365
Populate "install_requires" package metadata
...
This is necessary to automatically install dependencies when
the user installs the library with pip.
2019-04-17 11:40:40 -07:00
Giovanni Campagna
68e76f7990
Remove unused dependencies
...
These are not used anywhere I can see.
2019-04-17 11:39:54 -07:00
Giovanni Campagna
13e1c0335e
Load allennlp, cove libraries lazily
...
These libraries are only needed if one passes --elmo or --cove
on the command line. They are annoyingly big libraries, so
it makes sense to keep them optional.
2019-04-17 11:39:15 -07:00
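Lazy loading as described in commit 13e1c0335e is usually done by importing the optional package only when its flag is set. The helper below is a hypothetical sketch of that pattern using the standard `importlib` module, not the code from the commit.

```python
import importlib


def load_optional(module_name, flag):
    """Import `module_name` on demand, with a clear error if it is missing.

    `flag` is the command-line option that requires the module,
    e.g. "--elmo" or "--cove".
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{flag} requires the optional package {module_name!r}"
        ) from exc
```

Calling the helper only inside the branch that handles the flag keeps the large dependency out of the default start-up path.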
mehrad
19067c71ba
option to retrain encoder embeddings
2019-04-15 16:11:10 -07:00