mehrad
4919b28627
fixes
2019-05-29 19:38:24 -07:00
mehrad
672aa14117
merge branch wip/mehrad/multi_lingual
2019-05-29 18:43:10 -07:00
Giovanni Campagna
25db020107
Reduce memory usage while loading Almond datasets
Don't load all lines in memory
2019-05-15 09:27:00 -07:00
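A minimal sketch of the "don't load all lines in memory" idea, not the code from this commit: iterate over the file object lazily and yield one example at a time. The function name `iter_examples` and the tab-separated layout are assumptions made for illustration.

```python
import io
from typing import Iterator, List, TextIO


def iter_examples(stream: TextIO) -> Iterator[List[str]]:
    """Yield one tab-split example at a time instead of reading the whole file."""
    # File objects iterate lazily, so only the current line is held in memory.
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        yield line.split("\t")


# Usage with an in-memory stand-in for a dataset file (contents are made up).
sample = io.StringIO("1\tsentence one\tprogram one\n2\tsentence two\tprogram two\n")
for fields in iter_examples(sample):
    print(fields)
```

Compared with `readlines()`, peak memory then depends on the longest line rather than on the file size.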
Giovanni Campagna
488a4feb64
Fix contextual almond
2019-05-15 09:26:54 -07:00
mehrad
df94f7dd3a
ignore malformed sentences
2019-05-13 14:50:08 -07:00
mehrad
0495d4d0eb
Updates
1) use FastText for encoding Persian text
2) let the user choose the question for the Almond task
3) bug fixes
2019-05-13 13:03:51 -07:00
mehrad
925d839e15
adding an end-to-end combined model
2019-04-29 13:56:39 -07:00
Giovanni Campagna
46eaae8ba8
fix almond dataset name
2019-04-23 09:35:57 -07:00
Giovanni Campagna
64020a497f
Add ContextualAlmond task
Its training files have four columns: <id> <context> <sentence> <target_program>
2019-04-23 09:30:49 -07:00
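The four-column layout described in this commit could be parsed roughly as follows. This is a hedged sketch: the commit does not state the column separator, so a tab is assumed, and `ContextualExample` and `parse_line` are hypothetical names. Returning `None` for lines with the wrong column count is a choice made for the sketch, not something the log confirms.

```python
from typing import NamedTuple, Optional


class ContextualExample(NamedTuple):
    # Field order follows the commit message:
    # <id> <context> <sentence> <target_program>
    id: str
    context: str
    sentence: str
    target_program: str


def parse_line(line: str) -> Optional[ContextualExample]:
    """Parse one training line; return None if it does not have four columns."""
    parts = line.rstrip("\n").split("\t")  # assumption: tab-separated columns
    if len(parts) != 4:
        return None
    return ContextualExample(*parts)


# Usage with placeholder values.
example = parse_line("42\tsome context\tsome sentence\tsome program\n")
print(example)
```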
mehrad
a7a2d752d2
Fixes
std() in layer normalization is the culprit for generating NaN.
It happens in the backward pass for values with zero variance.
Just update the mean for these batches.
2019-04-08 14:48:23 -07:00
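Some background on why zero variance is a problem: the derivative of sqrt(v) is 1 / (2 * sqrt(v)), which is unbounded at v = 0, so back-propagating through a standard deviation of a constant vector can yield NaN. The sketch below shows one possible reading of "just update the mean": when the variance is exactly zero, only subtract the mean and skip the division by the standard deviation. It is a forward-only, pure-Python illustration using the population variance; the function name and the scalar `gamma`/`beta` parameters are assumptions, not the project's implementation.

```python
import math
from typing import List


def layer_norm(xs: List[float], gamma: float = 1.0, beta: float = 0.0) -> List[float]:
    """Normalize xs to zero mean and unit variance, guarding the zero-variance case."""
    mean = sum(xs) / len(xs)
    centered = [x - mean for x in xs]
    var = sum(c * c for c in centered) / len(xs)
    if var == 0.0:
        # All inputs are identical: skip std entirely and only remove the mean.
        return [gamma * c + beta for c in centered]
    std = math.sqrt(var)
    return [gamma * c / std + beta for c in centered]


print(layer_norm([1.0, 3.0]))
print(layer_norm([2.0, 2.0, 2.0]))
```

In an autograd framework the same guard would keep the square root out of the computation graph for constant inputs, which is where the unbounded gradient comes from.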
Giovanni Campagna
cea6092f90
Fix evaluating
- fix loading old config.json files that are missing some parameters
- fix expanding the trained embedding
- add a default context for "almond_with_thingpedia_as_context"
(to include Thingpedia)
- fix handling empty sentences
2019-03-23 17:28:22 -07:00
mehrad
91e6f5ded8
merge master + updates
2019-03-21 14:38:34 -07:00
mehrad
7555ec6b82
master updates + additional tweaks
2019-03-21 11:20:48 -07:00
Giovanni Campagna
7f1a8b2578
fix
2019-03-19 18:34:02 -07:00
Giovanni Campagna
63c96cd76a
Fix plain thingtalk grammar
I copied the wrong version of genieparser...
2019-03-19 18:32:23 -07:00
Giovanni Campagna
112bb0bbbf
Fix
2019-03-19 17:23:36 -07:00
Giovanni Campagna
c4ba6d7bcd
Add a progress bar when loading the Almond dataset
Because it takes a while
2019-03-19 14:53:11 -07:00
Giovanni Campagna
7325ca1cc7
Add option to use grammar in Almond task
2019-03-19 14:38:18 -07:00
Giovanni Campagna
17f4381ea3
Import the grammar code from genie-parser
Now purged of unnecessary messing with NumPy, and of unnecessary
TensorFlow
2019-03-19 12:06:22 -07:00
Giovanni Campagna
f40f168f17
Reshuffle code around
Move task specific stuff into tasks/
2019-03-19 11:22:54 -07:00
Giovanni Campagna
02e4d6ddac
Prepare for supporting grammar
Use a consistent preprocessing function, provided by the task class,
between server and train/predict, and load the tasks once.
2019-03-19 11:14:32 -07:00
Giovanni Campagna
14caf01e49
server: update to use task classes
2019-03-19 10:58:34 -07:00
Giovanni Campagna
5989846771
Make use of task classes
And clean up the metric handling code as well
2019-03-19 10:01:45 -07:00
Giovanni Campagna
92aade6a66
Introduce a registry mechanism and a class hierarchy for tasks
Most tasks can be framed as CQA naturally, with the task specific
code living in the dataset code, but some require extra task-specific
handling (IDs, metrics, tokenization, etc.)
In preparation for having even more task specific handling for Almond,
start cleaning up by creating a class for each task.
2019-03-19 09:15:19 -07:00
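A registry plus a task class hierarchy, as described in this commit, might look roughly like the sketch below. All names here (`register_task`, `BaseTask`, `AlmondTask`, `get_task`, and the metric strings) are hypothetical and chosen for illustration; the log does not show the real API.

```python
from typing import Callable, Dict, List, Type

# Maps a task name to the class that handles it.
_TASK_REGISTRY: Dict[str, Type["BaseTask"]] = {}


def register_task(name: str) -> Callable[[Type["BaseTask"]], Type["BaseTask"]]:
    """Class decorator that records a task class under the given name."""
    def decorator(cls: Type["BaseTask"]) -> Type["BaseTask"]:
        if name in _TASK_REGISTRY:
            raise ValueError(f"task {name!r} is already registered")
        _TASK_REGISTRY[name] = cls
        return cls
    return decorator


class BaseTask:
    """Default behaviour shared by tasks that fit the generic framing."""

    def __init__(self, name: str) -> None:
        self.name = name

    def tokenize(self, sentence: str) -> List[str]:
        return sentence.split()

    @property
    def metrics(self) -> List[str]:
        return ["exact_match"]  # placeholder metric name


@register_task("almond")
class AlmondTask(BaseTask):
    """Overrides only the parts that need task-specific handling."""

    @property
    def metrics(self) -> List[str]:
        return ["exact_match", "bleu"]  # placeholder metric names


def get_task(name: str) -> BaseTask:
    """Look up a registered task by name and instantiate it."""
    return _TASK_REGISTRY[name](name)


task = get_task("almond")
print(type(task).__name__, task.metrics, task.tokenize("turn on the lights"))
```

With this shape, the server and the train/predict entry points can call the same `tokenize` method on the task object, so preprocessing stays consistent between them.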