Giovanni Campagna
1c0a71aa0b
Add debugging to docker hooks
...
To see exactly how they get called...
2019-08-30 21:37:42 +02:00
Giovanni Campagna
fb72c84f73
docker: move embeddings to a shared directory
...
This way, the image can be used as a base image by almond-cloud
(which runs as a different user)
2019-08-30 21:25:54 +02:00
Giovanni Campagna
ff311bd35b
Add docker hub hooks to build both CPU and GPU images
2019-08-30 21:01:47 +02:00
Giovanni Campagna
a9264a699d
Remove obsolete Dockerfiles, and replace with a new one
...
The new Dockerfile wraps the new decanlp command and installs all
the dependencies correctly.
2019-08-30 18:55:28 +02:00
Giovanni Campagna
48b88317b7
Refuse to run with pytorch 1.2.0
...
Because pytorch 1.2.0 changed comparison ops to return bool tensors
instead of uint8, and that broke us...
2019-08-08 16:07:48 -07:00
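A minimal sketch of such a guard (the check site and the error message are assumptions, not the actual decaNLP code):

```python
import sys

import torch

# Hypothetical guard: pytorch 1.2.0 changed comparison operators to
# return bool tensors instead of uint8, breaking code that indexes or
# does arithmetic with the old uint8 masks.
if torch.__version__.startswith('1.2.0'):
    sys.exit('pytorch 1.2.0 is not supported, please use a different version')
```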
Giovanni Campagna
f5ea63ecb0
Merge pull request #15 from stanford-oval/wip/mehrad/multi-language
...
Wip/mehrad/multi language
2019-05-30 08:37:37 -07:00
mehrad
21db011ad2
fixes
2019-05-29 20:57:09 -07:00
mehrad
4919b28627
fixes
2019-05-29 19:38:24 -07:00
mehrad
672aa14117
merge branch wip/mehrad/multi_lingual
2019-05-29 18:43:10 -07:00
mehrad
4004664259
fixes
2019-05-29 16:42:36 -07:00
mehrad
a27c3a8d8b
bug fixes
2019-05-29 13:37:18 -07:00
mehrad
d52d862310
fixing bugs
2019-05-29 12:18:57 -07:00
mehrad
eb10a788b0
updates
2019-05-28 18:12:31 -07:00
mehrad
bb90a35bc0
updates
2019-05-28 18:11:32 -07:00
mehrad
612e3bdd4d
output context sentences as well for predict.py
2019-05-28 17:43:10 -07:00
mehrad
85c7a99ec2
bunch of updates
2019-05-22 14:04:16 -07:00
mehrad
90308a84e3
minor fix
2019-05-20 17:50:50 -07:00
mehrad
acfcbf88c4
updating prediction scripts
2019-05-20 17:43:45 -07:00
mehrad
7db36a90af
fixes
2019-05-20 14:29:25 -07:00
mehrad
ec77faacca
Gluing the models together
...
end-to-end finetuning works now
2019-05-20 13:41:59 -07:00
mehrad
9bf2217324
adding arguments
2019-05-20 11:02:42 -07:00
Giovanni Campagna
25db020107
Reduce memory usage while loading almond datasets
...
Don't load all lines in memory
2019-05-15 09:27:00 -07:00
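The fix amounts to iterating over the file instead of materializing every line; a sketch (the function name and tab-separated format are assumptions):

```python
def iter_examples(path):
    # Iterating the file object yields one line at a time, so the
    # whole dataset never has to sit in memory at once (unlike
    # f.readlines(), which materializes every line up front).
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line:
                yield line.split('\t')
```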
Giovanni Campagna
488a4feb64
Fix contextual almond
2019-05-15 09:26:54 -07:00
mehrad
df94f7dd3a
ignore malformed sentences
2019-05-13 14:50:08 -07:00
mehrad
0495d4d0eb
Updates
...
1) use FastText for encoding Persian text (sketched below)
2) let the user choose the question for almond task
3) bug fixes
2019-05-13 13:03:51 -07:00
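For item 1, a sketch of looking up Persian word vectors with the fasttext package (the use of the pretrained cc.fa.300.bin Common Crawl model is an assumption):

```python
import fasttext

# Assumed setup: the pretrained Common Crawl Persian model from
# https://fasttext.cc (cc.fa.300.bin).
model = fasttext.load_model('cc.fa.300.bin')

# FastText composes vectors from character n-grams, so even
# out-of-vocabulary Persian words get a meaningful embedding.
vector = model.get_word_vector('کتابخانه')  # "library"
print(vector.shape)  # (300,)
```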
Giovanni Campagna
27a3a8b173
Merge pull request #14 from stanford-oval/wip/contextual
...
Contextual Almond
2019-05-10 09:54:51 -07:00
mehrad
925d839e15
adding an end-to-end combined model
2019-04-29 13:56:39 -07:00
Giovanni Campagna
46eaae8ba8
fix almond dataset name
2019-04-23 09:35:57 -07:00
Giovanni Campagna
64020a497f
Add ContextualAlmond task
...
Its training files have four columns: <id> <context> <sentence> <target_program>
2019-04-23 09:30:49 -07:00
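A sketch of parsing one such line (the tab delimiter and function name are assumptions):

```python
def parse_contextual_example(line):
    # Four tab-separated columns: <id> <context> <sentence> <target_program>
    example_id, context, sentence, target_program = \
        line.rstrip('\n').split('\t')
    return example_id, context, sentence, target_program
```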
Giovanni Campagna
6837cc7e7b
Add missing dependency
...
It was probably being pulled in transitively before, e.g. through allennlp
2019-04-17 12:10:45 -07:00
Giovanni Campagna
73ffec5365
Populate "install_requires" package metadata
...
This is necessary to automatically install dependencies when
the user installs the library with pip.
2019-04-17 11:40:40 -07:00
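An illustrative setup.py excerpt (the dependency list here is hypothetical, not the actual decaNLP metadata):

```python
from setuptools import setup, find_packages

setup(
    name='decanlp',
    packages=find_packages(),
    # pip reads install_requires and installs these automatically
    # when the user runs `pip install decanlp`.
    install_requires=[
        'torch',   # hypothetical entries, for illustration only
        'numpy',
    ],
)
```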
Giovanni Campagna
68e76f7990
Remove unused dependencies
...
These are not used anywhere I can see.
2019-04-17 11:39:54 -07:00
Giovanni Campagna
13e1c0335e
Load allennlp, cove libraries lazily
...
These libraries are only needed if one passes --elmo or --cove
on the command line. They are annoyingly big libraries, so
it makes sense to keep them optional.
2019-04-17 11:39:15 -07:00
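A sketch of the lazy-import pattern (the import path follows the cove README; the surrounding function and constructor call are illustrative):

```python
import argparse

def build_cove_encoder(args):
    if not args.cove:
        return None
    # Imported here instead of at module scope, so the large cove
    # dependency is only loaded when --cove is actually requested.
    from cove import MTLSTM
    return MTLSTM()

parser = argparse.ArgumentParser()
parser.add_argument('--cove', action='store_true')
```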
mehrad
19067c71ba
option to retrain encoder embeddings
2019-04-15 16:11:10 -07:00
Giovanni Campagna
bb84b2b130
Merge pull request #13 from stanford-oval/wip/mmap-embeddings
...
Memory-mappable embeddings
2019-04-10 23:21:33 -07:00
Giovanni Campagna
8399064f15
vocab: restore "dim" property on load
2019-04-10 11:21:31 -07:00
Giovanni Campagna
aed5576756
vocab: use a better hash function
...
The previous one was not great, and it was particularly bad for
char ngrams, where it would produce collisions almost constantly.
2019-04-10 10:59:57 -07:00
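The log doesn't show the replacement, but FNV-1a is a typical choice for hashing short strings like char ngrams; a sketch (the choice of FNV-1a is an assumption):

```python
FNV_OFFSET = 0xcbf29ce484222325
FNV_PRIME = 0x100000001b3

def fnv1a_64(word):
    # FNV-1a folds every byte into the state with xor-then-multiply,
    # so n-grams sharing a long prefix still hash far apart.
    h = FNV_OFFSET
    for byte in word.encode('utf-8'):
        h = ((h ^ byte) * FNV_PRIME) & 0xffffffffffffffff  # stay in 64 bits
    return h
```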
Giovanni Campagna
94bebc4435
update tests
2019-04-10 10:38:16 -07:00
Giovanni Campagna
335c792a27
mmappable embeddings: make it work
...
- handle integer overflow correctly in hashing
- store table, itos and vectors in separate files, because numpy
ignores mmap_mode for npz files
- optimize the loading of the txt vectors and free memory eagerly
because otherwise we run out of memory before saving
2019-04-10 10:31:25 -07:00
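A sketch of the separate-files layout (the file names are assumptions):

```python
import numpy as np

def save_embeddings(prefix, table, vectors):
    # np.savez would bundle everything into one .npz, but numpy
    # ignores mmap_mode for npz archives, so each array gets its
    # own .npy file instead (itos is stored the same way).
    np.save(prefix + '.table.npy', table)
    np.save(prefix + '.vectors.npy', vectors)

def load_embeddings(prefix):
    # mmap_mode='r' maps the files instead of reading them: pages
    # are faulted in on first access and shared across processes.
    table = np.load(prefix + '.table.npy', mmap_mode='r')
    vectors = np.load(prefix + '.vectors.npy', mmap_mode='r')
    return table, vectors
```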
Giovanni Campagna
8112a985c8
Add "cache-embeddings" subcommand to download embeddings
...
It's useful to download the embeddings as a separate step
from training or deployment, for example to train on a
firewalled machine.
2019-04-09 16:54:12 -07:00
Giovanni Campagna
3f8f836d02
torchtext.Vocab: store word embeddings in mmap-friendly format on disk
...
torch.load/save uses pickle, which is not mmappable and causes high
memory usage: the vectors must be completely stored in memory.
This is fine during training, because the training machines are
large and have a lot of ram, but during inference we want to reduce
memory usage to deploy more models on one machine.
Instead, if we use numpy's npz format (uncompressed), all the word
vectors can be stored on disk and loaded on demand when the page
is faulted in. Furthermore, all pages are shared between processes
(so multiple models only use one copy of the embeddings), and the
kernel can drop clean pages under memory pressure.
The annoying part is that we can only store numpy ndarrays in this
format, and not Python native dicts. So instead we need a custom
HashTable implementation that is backed by numpy ndarrays.
As a side bonus, the custom implementation keeps only one copy
of all the words in memory, so memory usage is lower.
2019-04-09 16:54:12 -07:00
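A minimal sketch of a hash table of this flavor, with the lookup structure in a numpy array so it can be dumped to disk and mmapped back (the names, probing scheme, and hash are assumptions, not the actual implementation):

```python
import zlib

import numpy as np

class NdarrayHashTable:
    """Maps word -> row index using a numpy array of slots.
    Capacity must stay well above the number of inserted words
    (no resizing in this sketch)."""

    def __init__(self, capacity):
        self.slots = np.full(capacity, -1, dtype=np.int64)  # -1 = empty
        self.itos = []  # the single in-memory copy of every word

    def _hash(self, word):
        # Deterministic across processes, unlike Python's salted
        # hash(): essential because the table is persisted to disk.
        return zlib.crc32(word.encode('utf-8'))

    def _probe(self, word):
        i = self._hash(word) % len(self.slots)
        # Linear probing: walk forward until the word or an empty slot.
        while self.slots[i] != -1 and self.itos[self.slots[i]] != word:
            i = (i + 1) % len(self.slots)
        return i

    def insert(self, word):
        i = self._probe(word)
        if self.slots[i] == -1:
            self.slots[i] = len(self.itos)
            self.itos.append(word)
        return int(self.slots[i])

    def lookup(self, word):
        return int(self.slots[self._probe(word)])  # -1 if absent
```

Only the slots array needs to be an ndarray to get the mmap benefit; the itos list is serialized alongside it.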
Giovanni Campagna
1021c4851c
word vectors: ignore all words longer than 100 characters
...
There are ~100 of these in GloVe and they are all garbage (horizontal
lines, runs of numbers, and URLs). This keeps the maximum word
length in check.
2019-04-09 16:54:11 -07:00
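A sketch of the filter while parsing GloVe's .txt format (the helper name is an assumption):

```python
MAX_WORD_LENGTH = 100

def parse_glove_line(line):
    # GloVe .txt lines are: word, then the vector components.
    word, *values = line.rstrip().split(' ')
    if len(word) > MAX_WORD_LENGTH:
        return None  # garbage entry: separator art, number runs, URLs
    return word, [float(v) for v in values]
```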
mehrad
4905ad6ce8
Fixes
...
Apparently the layer norm implementation can't be tampered with!
Reverting the change for now and moving to a new branch to fix this properly.
2019-04-08 17:24:02 -07:00
mehrad
03cdc2d0c1
consistent formatting
2019-04-08 16:18:30 -07:00
mehrad
a7a2d752d2
Fixes
...
std() in layer normalization is the culprit for generating NaN.
It happens in the backward pass for values with zero variance.
Just update the mean for these batches.
2019-04-08 14:48:23 -07:00
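One standard way to get this effect is to normalize with sqrt(var + eps) instead of calling std(): var() has a well-behaved backward pass, and zero-variance rows degenerate to plain mean-centering. A sketch (not necessarily the exact change):

```python
import torch

def layer_norm(x, eps=1e-6):
    # std() has a NaN gradient when the variance is exactly zero
    # (its backward pass divides by std). sqrt(var + eps) is always
    # differentiable, and for zero-variance rows it reduces to
    # mean-centering divided by a small constant.
    mean = x.mean(-1, keepdim=True)
    var = x.var(-1, unbiased=False, keepdim=True)
    return (x - mean) / torch.sqrt(var + eps)
```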
mehrad
4acdba6c22
fix for NaN loss
2019-04-05 10:26:35 -07:00
Giovanni Campagna
d16277b4d3
stop if loss is less than 1e-5 for more than 100 iterations
2019-03-31 17:12:38 -07:00
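A sketch of the stopping rule (the training-loop shape is an assumption):

```python
def train(batches, train_step, loss_floor=1e-5, patience=100):
    # Stop once the loss has stayed below the floor for more than
    # `patience` consecutive iterations: the model has converged and
    # further updates only burn compute.
    below = 0
    for batch in batches:
        loss = train_step(batch)
        below = below + 1 if loss < loss_floor else 0
        if below > patience:
            break
```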
Giovanni Campagna
09c6e77525
Merge pull request #12 from Stanford-Mobisocial-IoT-Lab/wip/thingtalk-lm
...
Pretrained decoder language model
2019-03-28 17:58:58 -07:00
mehrad
34ba4d2600
skip batches with NaN loss
2019-03-28 12:37:01 -07:00
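A sketch of the guard (the step function is an assumption):

```python
import math

def train_step(model, batch, optimizer):
    loss = model(batch)
    if math.isnan(loss.item()):
        # One bad batch would poison every weight through backprop;
        # skipping it keeps training alive while the root cause is
        # tracked down.
        return None
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```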
Giovanni Campagna
3e3755b19b
use a slightly different strategy to make the pretrained lm non-trainable
2019-03-28 00:31:36 -07:00
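The log doesn't say what the new strategy is; the standard way to freeze a pretrained module in pytorch looks like this (a sketch, not necessarily the commit's approach):

```python
def freeze(module):
    # Exclude the module's weights from gradient computation so the
    # optimizer never updates them.
    for param in module.parameters():
        param.requires_grad = False
    # Also switch off training-time behavior such as dropout.
    module.eval()
```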