Add setup directions for data dir

This script's data requirements are not obvious. I have added a note explaining (a) that it expects pos/neg polarity data, (b) how the data dir should be structured (train/test), and (c) where to find a standard source of such polarity data.
Kyle P. Johnson 2016-11-13 10:08:16 -08:00 committed by GitHub
parent c8d3694e2d
commit d105771a07
1 changed file with 8 additions and 0 deletions

@@ -1,3 +1,11 @@
"""This script expects something like a binary sentiment data set, such as
that available here: `http://www.cs.cornell.edu/people/pabo/movie-review-data/`
It expects a directory structure like: `data_dir/train/{pos|neg}`
and `data_dir/test/{pos|neg}`. Put (say) 90% of the files in the former
and the remainder in the latter.
"""
from __future__ import unicode_literals
from __future__ import print_function
from __future__ import division
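
The layout described in the docstring can be prepared with a small helper. The following is a minimal sketch, not part of this script: it assumes Python 3 and a source directory that already contains `pos/` and `neg/` subdirectories of review files. The names `split_polarity_data`, `src_dir`, `train_fraction`, and `seed` are hypothetical.

```python
import random
import shutil
from pathlib import Path


def split_polarity_data(src_dir, data_dir, train_fraction=0.9, seed=0):
    """Copy src_dir/{pos,neg} files into data_dir/{train,test}/{pos,neg}.

    Hypothetical helper: assumes src_dir holds one sub-directory per label.
    Returns a dict mapping (split, label) to the number of files copied.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for label in ("pos", "neg"):
        files = sorted(p for p in (Path(src_dir) / label).iterdir() if p.is_file())
        rng.shuffle(files)
        cut = int(len(files) * train_fraction)  # first `cut` files go to train
        for split, chunk in (("train", files[:cut]), ("test", files[cut:])):
            dest = Path(data_dir) / split / label
            dest.mkdir(parents=True, exist_ok=True)
            for path in chunk:
                shutil.copy2(path, dest / path.name)
            counts[(split, label)] = len(chunk)
    return counts
```

Calling `split_polarity_data("reviews", "data_dir")` (both paths are placeholders) would leave the source files in place and produce `data_dir/train/{pos|neg}` and `data_dir/test/{pos|neg}`. Files are split per label, so both classes keep roughly the same train/test ratio.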