Most tasks can be framed as CQA naturally, with the task-specific
code living in the dataset code, but some require extra task-specific
handling (IDs, metrics, tokenization, etc.).
In preparation for adding even more task-specific handling for Almond,
start cleaning up by creating a class for each task.
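
As a rough illustration of the class-per-task idea, a minimal sketch
follows. It is not the code in this change: the names BaseTask,
AlmondTask, register_task, get_task, and make_example_id, as well as
the metric names, are all hypothetical and chosen only to show where
per-task IDs, metrics, and tokenization could live.

```python
from typing import Dict, List, Type

# Hypothetical registry mapping task names to task classes.
_TASK_REGISTRY: Dict[str, Type["BaseTask"]] = {}


def register_task(name: str):
    """Class decorator that records a task class under the given name."""
    def decorator(cls):
        _TASK_REGISTRY[name] = cls
        return cls
    return decorator


class BaseTask:
    """Default behavior for tasks that fit the CQA framing as-is."""

    def __init__(self, name: str):
        self.name = name

    @property
    def metrics(self) -> List[str]:
        # Assumed default metric; real tasks may report others.
        return ["em"]

    def tokenize(self, sentence: str) -> List[str]:
        # Generic whitespace tokenization.
        return sentence.split()

    def make_example_id(self, index: int) -> str:
        # Assumed ID scheme: task name plus position in the dataset.
        return f"{self.name}/{index}"


@register_task("almond")
class AlmondTask(BaseTask):
    """Example of a task that overrides the defaults."""

    @property
    def metrics(self) -> List[str]:
        return ["em", "bleu"]

    def tokenize(self, sentence: str) -> List[str]:
        # Assumption: input is pre-tokenized, so split on single spaces only.
        return sentence.split(" ")


def get_task(name: str) -> BaseTask:
    """Return the registered task, falling back to the generic BaseTask."""
    cls = _TASK_REGISTRY.get(name, BaseTask)
    return cls(name)
```

With this layout, tasks that need nothing special fall through to
BaseTask, and only tasks such as Almond carry their own overrides.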