diff --git a/README.md b/README.md
index 6c3ef5a..8a25fc0 100644
--- a/README.md
+++ b/README.md
@@ -8,8 +8,10 @@ This is a result of [N2I project](http://codh.rois.ac.jp/collaboration/#n2i) for
 The system has 2 main modules: text line extraction and text line recognition. The overall architechture is shown in the below figure.
-For text line extraction, we retrain the CRAFT (Character Region Awareness for Text Detection) on our dataset.
-For text line recognition, we employ the attention-based encoder-decoder on our previous publication.
+For text line extraction, we retrain the CRAFT (Character Region Awareness for Text Detection) on 1000 annotated images provided by Center for Research and Development of Higher Education, The University of Tokyo.
+For text line recognition, we employ the attention-based encoder-decoder on our previous publication. We train the text line recognition on 1000 annotated images and 1600 unannotated images provided by Center for Research and Development of Higher Education and National Institute for Japanese Language and Linguistics, respectively.
+
+