mirror of https://github.com/explosion/spaCy.git
Add setup directions for data dir
This script's data needs are not intuitive. I have added a note explaining (a) that it expects pos/neg polarity data, (b) how the data dir should be structured (train/test), and (c) where to find a standard resource for such polarity data.
parent c8d3694e2d
commit d105771a07
@@ -1,3 +1,11 @@
+"""This script expects something like a binary sentiment data set, such as
+that available here: `http://www.cs.cornell.edu/people/pabo/movie-review-data/`
+
+It expects a directory structure like: `data_dir/train/{pos|neg}`
+and `data_dir/test/{pos|neg}`. Put (say) 90% of the files in the former
+and the remainder in the latter.
+"""
+
 from __future__ import unicode_literals
 from __future__ import print_function
 from __future__ import division
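
The directory layout described in the docstring can be prepared with a short helper. The following is only a minimal sketch and not part of this commit: the function name `split_polarity_data`, its parameters, and the assumption that the raw data already sits in `src_dir/pos` and `src_dir/neg` are all hypothetical. It shuffles the files of each label and copies roughly 90% into `data_dir/train/{pos|neg}` and the rest into `data_dir/test/{pos|neg}`.

```python
import random
import shutil
from pathlib import Path


def split_polarity_data(src_dir, data_dir, train_fraction=0.9, seed=0):
    """Copy src_dir/{pos,neg}/* into data_dir/{train,test}/{pos,neg}/.

    Hypothetical helper: assumes src_dir contains one sub-directory per
    label ("pos" and "neg"), each holding one file per document.
    Returns a dict mapping (split, label) to the number of files copied.
    """
    src_dir, data_dir = Path(src_dir), Path(data_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for label in ("pos", "neg"):
        # Sort first so the shuffle does not depend on filesystem order.
        files = sorted(p for p in (src_dir / label).iterdir() if p.is_file())
        rng.shuffle(files)
        n_train = round(len(files) * train_fraction)
        splits = (("train", files[:n_train]), ("test", files[n_train:]))
        for split, chunk in splits:
            out_dir = data_dir / split / label
            out_dir.mkdir(parents=True, exist_ok=True)
            for path in chunk:
                shutil.copy(path, out_dir / path.name)
            counts[(split, label)] = len(chunk)
    return counts
```

Calling `split_polarity_data("raw_reviews", "data_dir")` (both paths are placeholders) would then leave a `data_dir` that matches the structure the script expects. Files are copied rather than moved so the original download stays intact.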