# Kindai-OCR
OCR system for recognizing modern Japanese magazines
## About
This repo contains an OCR system for converting images of modern Japanese documents to text.
It is an outcome of the [N2I project](http://codh.rois.ac.jp/collaboration/#n2i) for the digitization of modern Japanese documents.

The system has two main modules: text line extraction and text line recognition. The overall architecture is shown in the figure below.

For text line extraction, we retrained CRAFT (Character Region Awareness for Text Detection) on our dataset.
For text line recognition, we use the attention-based encoder-decoder from our previous publication.
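
The two-module design described above can be pictured as a simple two-stage pipeline: one stage proposes text line regions, the other transcribes each cropped region. The sketch below is only an illustration of that data flow and is not the code in this repo. All names in it (`LineBox`, `crop`, `run_ocr`, `dummy_extract_lines`, `dummy_recognize_line`) are hypothetical, and the two dummy stages are placeholders for the retrained CRAFT detector and the attention-based encoder-decoder.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A grayscale image as rows of pixel values; a stand-in for a real image array.
Image = List[List[int]]


@dataclass(frozen=True)
class LineBox:
    """Axis-aligned bounding box of one detected text line (hypothetical type)."""
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def crop(image: Image, box: LineBox) -> Image:
    """Cut the region covered by `box` out of `image`."""
    return [row[box.left:box.right] for row in image[box.top:box.bottom]]


def run_ocr(
    image: Image,
    extract_lines: Callable[[Image], Sequence[LineBox]],
    recognize_line: Callable[[Image], str],
) -> List[str]:
    """Stage 1 finds text lines; stage 2 transcribes each cropped line."""
    boxes = extract_lines(image)
    return [recognize_line(crop(image, box)) for box in boxes]


# Placeholder stages so the sketch runs. Real ones would wrap the
# text line extraction model and the text line recognition model.
def dummy_extract_lines(image: Image) -> List[LineBox]:
    # Pretend every pair of pixel rows is one text line spanning the full width.
    width = len(image[0]) if image else 0
    return [
        LineBox(0, top, width, min(top + 2, len(image)))
        for top in range(0, len(image), 2)
    ]


def dummy_recognize_line(line_image: Image) -> str:
    # Report the crop size instead of real recognized text.
    height = len(line_image)
    width = len(line_image[0]) if line_image else 0
    return f"<line {width}x{height}>"


page = [[0] * 6 for _ in range(4)]  # a blank 6x4 "page"
results = run_ocr(page, dummy_extract_lines, dummy_recognize_line)
print(results)
```

Passing the two stages in as plain callables keeps them independent, so either one could be swapped out or tested on its own under this assumed interface.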
## Installing Kindai OCR
## Running Kindai OCR