6 Commits

Author SHA1 Message Date
Giovanni Campagna fdfdd154c4 Remove hacks needed to use a different tokenizer for Almond 2019-01-23 10:48:49 -08:00

    Almond data is always pretokenized (because we must preprocess
    numbers/quoted strings/etc.).

    Previously, we used a bad HACK of hardcoding almond in the generic
    Field subclass. Now, we instead thread through the tokenizer/detokenizer
    arguments from the right places.

    In the future, we will probably want task classes, to clean up
    the mess of hacks and hardcoded task-specific tweaks everywhere.

Bryan Marcus McCann 847a9dd094 bug fix for string interning 2018-08-29 02:44:24 +00:00
Bryan Marcus McCann c00de93356 re-interning for alternatives to revtok 2018-08-21 03:24:21 +00:00
Bryan Marcus McCann 2065ccf4b5 rm interning; integrated into revtok 2018-08-21 03:18:35 +00:00
Bryan Marcus McCann 1f83b7a739 interning strings to fix memory consumption 2018-08-21 02:47:14 +00:00
Bryan Marcus McCann 8997039aec Initial commit 2018-06-20 12:36:21 +00:00
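
The August 2018 commits above mention interning strings to reduce memory consumption. The listing does not show the actual change, so the following is only a minimal sketch of the general technique, assuming whitespace tokenization as a stand-in for the real tokenizer; `tokenize_interned` is a hypothetical name, not a function from this repository.

```python
import sys


def tokenize_interned(line):
    """Split a line into tokens and intern each one.

    Hypothetical helper: str.split() stands in for the project's tokenizer.
    sys.intern() returns one canonical string object per distinct value,
    so a token repeated across a corpus is stored only once.
    """
    return [sys.intern(token) for token in line.split()]


first = tokenize_interned("the cat saw the dog")
second = tokenize_interned("the dog saw the cat")

# Equal tokens are now the same object, within and across lines.
assert first[0] is first[3]
assert first[0] is second[0]
```

Interning pays off when a small vocabulary is repeated many times, as with tokenized text, because each repeated token becomes a reference to a shared object rather than a separate copy.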