Commit Graph

289 Commits

Author SHA1 Message Date
Giovanni Campagna 68e76f7990 Remove unused dependencies
These are not used anywhere I can see.
2019-04-17 11:39:54 -07:00
Giovanni Campagna 13e1c0335e Load allenlp, cove libraries lazily
These libraries are only needed if one passes --elmo or --cove
on the command line. They are annoyingly big libraries, so
it makes sense to keep them optional.
2019-04-17 11:39:15 -07:00
Giovanni Campagna bb84b2b130
Merge pull request #13 from stanford-oval/wip/mmap-embeddings
Memory-mappable embeddings
2019-04-10 23:21:33 -07:00
Giovanni Campagna 8399064f15 vocab: restore "dim" property on load 2019-04-10 11:21:31 -07:00
Giovanni Campagna aed5576756 vocab: use a better hash function
The previous one was not great, and it was particularly bad for
char ngrams, where it would produce collisions almost constantly
2019-04-10 10:59:57 -07:00
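
A minimal sketch of the kind of hash that behaves well on short char ngrams, assuming an FNV-1a-style design (the commit does not say which function was actually adopted):

```python
# Illustrative only: a 64-bit FNV-1a-style string hash. Mixing every byte
# avoids the near-constant collisions a naive additive hash produces on
# short char ngrams. Explicit masking emulates fixed-width overflow.
_FNV_OFFSET = 0xcbf29ce484222325
_FNV_PRIME = 0x100000001b3
_MASK64 = (1 << 64) - 1

def hash_string(s: str, num_buckets: int) -> int:
    h = _FNV_OFFSET
    for byte in s.encode('utf-8'):
        h = ((h ^ byte) * _FNV_PRIME) & _MASK64
    return h % num_buckets

# e.g. bucketing the char trigrams of a word:
trigrams = ['<ex', 'exa', 'xam', 'amp', 'mpl', 'ple', 'le>']
buckets = [hash_string(t, num_buckets=2 ** 20) for t in trigrams]
```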
Giovanni Campagna 94bebc4435 update tests 2019-04-10 10:38:16 -07:00
Giovanni Campagna 335c792a27 mmappable embeddings: make it work
- handle integer overflow correctly in hashing
- store table, itos and vectors in separate files, because numpy
  ignores mmap_mode for npz files
- optimize the loading of the txt vectors and free memory eagerly
  because otherwise we run out of memory before saving
2019-04-10 10:31:25 -07:00
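
A minimal sketch of the split-file layout this commit describes (file names are illustrative, not the ones used in the repository). np.load only honors mmap_mode for plain .npy files, so each array gets its own file:

```python
import numpy as np

def save_vocab(prefix, table, itos, vectors):
    # Separate .npy files instead of one .npz archive: np.load ignores
    # mmap_mode for members of an .npz, so the vectors would be read
    # fully into memory anyway.
    np.save(prefix + '.table.npy', table)          # hash table of int indices
    np.save(prefix + '.itos.npy', np.array(itos))  # index -> word strings
    np.save(prefix + '.vectors.npy', vectors)      # embedding matrix

def load_vocab(prefix):
    table = np.load(prefix + '.table.npy')
    itos = np.load(prefix + '.itos.npy')
    # Only the big matrix needs mmap_mode: pages are faulted in on first
    # access and shared read-only across processes.
    vectors = np.load(prefix + '.vectors.npy', mmap_mode='r')
    return table, itos, vectors
```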
Giovanni Campagna 8112a985c8 Add "cache-embeddings" subcommand to download embeddings
It's useful to download the embeddings as a separate step
from training or deployment, for example to train on a
firewalled machine.
2019-04-09 16:54:12 -07:00
Giovanni Campagna 3f8f836d02 torchtext.Vocab: store word embeddings in mmap-friendly format on disk
torch.load/save uses pickle, which is not mmappable and causes high
memory usage: the vectors must be completely stored in memory.
This is fine during training, because the training machines are
large and have a lot of RAM, but during inference we want to reduce
memory usage to deploy more models on one machine.

Instead, if we use numpy's npz format (uncompressed), all the word
vectors can be stored on disk and loaded on demand when the page
is faulted in. Furthermore, all pages are shared between processes
(so multiple models only use one copy of the embeddings), and the
kernel can free the memory back to disk under pressure.

The annoying part is that we can only store numpy ndarrays in this
format, and not Python native dicts. So instead we need a custom
HashTable implementation that is backed by numpy ndarrays.
As a side bonus, the custom implementation keeps only one copy
of all the words in memory, so memory usage is lower.
2019-04-09 16:54:12 -07:00
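
A rough sketch of such a numpy-backed table, using open addressing with linear probing (illustrative only; the class in the repository differs, and `_hash` is the FNV-style stand-in from the note above):

```python
import numpy as np

def _hash(word, num_buckets):
    # Deterministic stand-in hash (see the FNV-1a sketch further up).
    h = 0xcbf29ce484222325
    for b in word.encode('utf-8'):
        h = ((h ^ b) * 0x100000001b3) & ((1 << 64) - 1)
    return h % num_buckets

class NumpyHashTable:
    """word -> row-index table backed by flat ndarrays.

    Only flat arrays (not Python dicts) can be stored in mmap-friendly
    .npy files, hence the open-addressing layout with linear probing.
    """

    def __init__(self, words, num_buckets=None):
        self.itos = list(words)                      # the only copy of the words
        num_buckets = num_buckets or 2 * len(self.itos) + 1
        self.table = np.full(num_buckets, -1, dtype=np.int64)
        for index, word in enumerate(self.itos):
            slot = _hash(word, num_buckets)
            while self.table[slot] != -1:            # linear probing
                slot = (slot + 1) % num_buckets
            self.table[slot] = index

    def lookup(self, word):
        num_buckets = len(self.table)
        slot = _hash(word, num_buckets)
        while self.table[slot] != -1:
            index = int(self.table[slot])
            if self.itos[index] == word:
                return index                         # row in the vectors matrix
            slot = (slot + 1) % num_buckets
        return -1                                    # out-of-vocabulary
```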
Giovanni Campagna 1021c4851c word vectors: ignore all words longer than 100 characters
There are ~100 of these in GloVe and they are all garbage (horizontal
lines, sequences of numbers and URLs). This will keep the maximum
word length in check.
2019-04-09 16:54:11 -07:00
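
In practice this is a one-line guard while parsing the GloVe .txt file (hypothetical helper, not the repository's code):

```python
MAX_WORD_LENGTH = 100

def parse_vector_line(line):
    word, *values = line.rstrip('\n').split(' ')
    if len(word) > MAX_WORD_LENGTH:
        return None  # skip the junk entries: separator lines, number runs, URLs
    return word, [float(v) for v in values]
```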
mehrad 4905ad6ce8 Fixes
Apparently the layer norm implementation can't be tampered with!
Reverting the change for now and switching to a new branch to fix this properly.
2019-04-08 17:24:02 -07:00
mehrad 03cdc2d0c1 consistent formatting 2019-04-08 16:18:30 -07:00
mehrad a7a2d752d2 Fixes
std() in layer normalization is the culprit for generating NaN.
It happens in the backward pass for values with zero variance.
Just update the mean for these batches.
2019-04-08 14:48:23 -07:00
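
The change described here was reverted in the commit above it. For reference, a common way to keep a hand-rolled layer norm NaN-free is to put the epsilon inside the square root rather than adding it to the std, since the gradient of std() is infinite at zero variance (a sketch, not the repository's code):

```python
import torch
import torch.nn as nn

class SafeLayerNorm(nn.Module):
    """Layer norm written so zero-variance rows cannot produce NaN gradients.

    (x - mean) / (x.std() + eps) is the fragile form: the backward pass of
    std() divides by the std itself, which is zero for constant rows.
    Putting eps under the square root keeps every gradient finite.
    """

    def __init__(self, features, eps=1e-6):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(features))
        self.beta = nn.Parameter(torch.zeros(features))
        self.eps = eps

    def forward(self, x):
        mean = x.mean(-1, keepdim=True)
        centered = x - mean
        var = centered.pow(2).mean(-1, keepdim=True)
        return self.gamma * centered / torch.sqrt(var + self.eps) + self.beta
```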
mehrad 4acdba6c22 fix for NaN loss 2019-04-05 10:26:35 -07:00
Giovanni Campagna d16277b4d3 stop if loss is less than 1e-5 for more than 100 iterations 2019-03-31 17:12:38 -07:00
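
A sketch of that stopping rule as it might sit in a training loop (train_iterator and train_step are hypothetical names, not the repository's):

```python
LOSS_THRESHOLD = 1e-5
PATIENCE = 100  # iterations

low_loss_iterations = 0
for batch in train_iterator:
    loss = train_step(batch)  # assumed to return the scalar loss
    low_loss_iterations = low_loss_iterations + 1 if loss < LOSS_THRESHOLD else 0
    if low_loss_iterations > PATIENCE:
        print(f'loss below {LOSS_THRESHOLD:g} for more than {PATIENCE} iterations, stopping')
        break
```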
Giovanni Campagna 09c6e77525
Merge pull request #12 from Stanford-Mobisocial-IoT-Lab/wip/thingtalk-lm
Pretrained decoder language model
2019-03-28 17:58:58 -07:00
mehrad 34ba4d2600 skip batches with NaN loss 2019-03-28 12:37:01 -07:00
Giovanni Campagna 3e3755b19b use a slightly different strategy to make the pretrained lm non-trainable 2019-03-28 00:31:36 -07:00
Giovanni Campagna 25cc4ee55e support pretrained embeddings smaller than the model size
add a feed-forward layer in that case
2019-03-27 23:50:14 -07:00
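
A sketch of that bridging layer (names are illustrative, not the ones in the repository):

```python
import torch.nn as nn

class ProjectedEmbedding(nn.Module):
    """Frozen pretrained embedding bridged up to the model dimension.

    The feed-forward layer is only needed when the pretrained vectors are
    smaller than d_model; otherwise it degenerates to the identity.
    """

    def __init__(self, pretrained_vectors, d_model):
        super().__init__()
        d_emb = pretrained_vectors.size(1)
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.projection = nn.Linear(d_emb, d_model) if d_emb != d_model else nn.Identity()

    def forward(self, token_ids):
        return self.projection(self.embedding(token_ids))
```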
Giovanni Campagna 182d2698da fix prediction
*_elmo was renamed to *_tokens
2019-03-27 14:07:30 -07:00
Giovanni Campagna 82d15a4ae3 load pretrained_decoder_lm from config.json 2019-03-27 14:06:44 -07:00
Giovanni Campagna 6a97970b13 fix typo 2019-03-27 12:46:20 -07:00
Giovanni Campagna fbe17b565e make it work
Fix time/batch confusion
2019-03-27 12:18:47 -07:00
Giovanni Campagna 9814d6bf4f Implement using a pretrained language model for the decoder embedding
Let's see if it makes a difference
2019-03-27 11:40:59 -07:00
Giovanni Campagna cea6092f90 Fix evaluating
- fix loading old config.json files that are missing some parameters
- fix expanding the trained embedding
- add a default context for "almond_with_thingpedia_as_context"
  (to include thingpedia)
- fix handling empty sentences
2019-03-23 17:28:22 -07:00
Giovanni Campagna d22e13f6c5
Merge pull request #9 from Stanford-Mobisocial-IoT-Lab/wip/thingpedia_as_context
Wip/thingpedia as context
2019-03-23 16:59:42 -07:00
mehrad d6198efc77 fix small bug 2019-03-21 21:15:29 -07:00
mehrad 487bdb8317 suppress logging epoch number 2019-03-21 21:12:44 -07:00
mehrad a85923264b Bug fixes 2019-03-21 16:01:14 -07:00
mehrad 91e6f5ded8 merge master + updates 2019-03-21 14:38:34 -07:00
Mehrad Moradshahi 48bd1d67ef
Merge pull request #8 from Stanford-Mobisocial-IoT-Lab/wip/curriculum
Wip/curriculum
2019-03-21 12:24:08 -07:00
mehrad 7555ec6b82 master updates + additional tweaks 2019-03-21 11:20:48 -07:00
Giovanni Campagna e41c9d89c3
Merge pull request #10 from Stanford-Mobisocial-IoT-Lab/wip/grammar
Grammar support
2019-03-20 17:33:03 -07:00
Giovanni Campagna 799d8c4993 fix syntax 2019-03-19 20:40:01 -07:00
Giovanni Campagna d18eca650b add new argument to load_json 2019-03-19 20:38:24 -07:00
Giovanni Campagna a3cf02cbe7 Add a way to disable glove embeddings on the decoder side
With grammar, they just add noise and overfit badly
2019-03-19 20:36:20 -07:00
Giovanni Campagna 7f1a8b2578 fix 2019-03-19 18:34:02 -07:00
Giovanni Campagna 63c96cd76a Fix plain thingtalk grammar
I copied the wrong version of genieparser...
2019-03-19 18:32:23 -07:00
Giovanni Campagna d67ef67fb8 Fix 2019-03-19 17:49:42 -07:00
Giovanni Campagna 2769cc96e3 Add the option to train a portion of decoder embeddings
This will be needed because GloVe/char embeddings are meaningless
for tokens that encode grammar productions (which are of the form
"R<id>")
2019-03-19 17:31:53 -07:00
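
One way to wire this up, sketched under the assumption that the grammar-production tokens occupy the tail of the decoder vocabulary (the repository's actual mechanism may differ):

```python
import torch
import torch.nn as nn

class PartiallyTrainableEmbedding(nn.Module):
    """Frozen pretrained vectors for ordinary words, freshly trained vectors
    for the grammar-production tokens ("R<id>") appended after them."""

    def __init__(self, pretrained_vectors, num_grammar_tokens):
        super().__init__()
        self.num_pretrained = pretrained_vectors.size(0)
        self.frozen = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.trainable = nn.Embedding(num_grammar_tokens, pretrained_vectors.size(1))

    def forward(self, token_ids):
        is_grammar = (token_ids >= self.num_pretrained).unsqueeze(-1)
        frozen_out = self.frozen(token_ids.clamp(max=self.num_pretrained - 1))
        grammar_out = self.trainable((token_ids - self.num_pretrained).clamp(min=0))
        # Gradients only ever reach the trainable table, and only at the
        # positions where a grammar token actually occurs.
        return torch.where(is_grammar, grammar_out, frozen_out)
```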
Giovanni Campagna 112bb0bbbf Fix 2019-03-19 17:23:36 -07:00
Giovanni Campagna c4ba6d7bcd Add a progbar when loading the almond dataset
Because it takes a while
2019-03-19 14:53:11 -07:00
Giovanni Campagna 7325ca1cc7 Add option to use grammar in Almond task 2019-03-19 14:38:18 -07:00
Giovanni Campagna 17f4381ea3 Import the grammar code from genie-parser
Now purged of unnecessary messing with numpy, and of unnecessary
tensorflow
2019-03-19 12:06:22 -07:00
Giovanni Campagna f40f168f17 Reshuffle code around
Move task specific stuff into tasks/
2019-03-19 11:22:54 -07:00
Giovanni Campagna 02e4d6ddac Prepare for supporting grammar
Use a consistent preprocessing function, provided by the task class,
between server and train/predict, and load the tasks once.
2019-03-19 11:14:32 -07:00
Giovanni Campagna 14caf01e49 server: update to use task classes 2019-03-19 10:58:34 -07:00
Giovanni Campagna 83d113dc48 Fix 2019-03-19 10:50:53 -07:00
Giovanni Campagna 42331a3c08 Fix JSON serialization of arguments 2019-03-19 10:07:00 -07:00
Giovanni Campagna 6f777425ea Remove --reverse_task argument
If you want to train on the reverse Almond task, use "reverse_almond"
as a task name, as you should.
2019-03-19 10:03:28 -07:00