Kindai-OCR

OCR system for recognizing modern Japanese magazines

About

This repo contains an OCR system for converting images of modern Japanese documents to text. It is a result of the N2I project for the digitization of modern Japanese documents.

The system has two main modules: text line extraction and text line recognition. The overall architecture is shown in the figures below.

For text line extraction, we retrained CRAFT (Character Region Awareness for Text Detection) on 1000 annotated images provided by the Center for Research and Development of Higher Education, The University of Tokyo.

For text line recognition, Kindai V1.0 employs the attention-based encoder-decoder from our previous publication. We trained it on 1000 annotated images and 1600 unannotated images provided by the Center for Research and Development of Higher Education, The University of Tokyo, and the National Institute for Japanese Language and Linguistics, respectively.
For Kindai V2.0, we trained a transformer with more data from the National Diet Library and the Center for Open Data in the Humanities.
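
The two-stage flow described above (extract text lines, then recognize each line) can be sketched as follows. This is only an illustration of how the two modules compose, not the repository's code: `run_pipeline`, `crop`, `fake_detector`, and `fake_recognizer` are hypothetical names, and the stubs stand in for the retrained CRAFT detector and the encoder-decoder or transformer recognizer.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; the repository's own structures may differ.
Image = Sequence[Sequence[int]]      # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]      # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of `image`."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def run_pipeline(
    image: Image,
    detect_lines: Callable[[Image], List[Box]],
    recognize_line: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds text line boxes; stage 2 transcribes each cropped line."""
    results = []
    for box in detect_lines(image):
        results.append((box, recognize_line(crop(image, box))))
    return results


def fake_detector(image: Image) -> List[Box]:
    # Stub: pretend the whole page is a single text line.
    return [(0, 0, len(image[0]), len(image))]


def fake_recognizer(line_image: List[List[int]]) -> str:
    # Stub: report the crop size (width x height) instead of real text.
    return "{}x{}".format(len(line_image[0]), len(line_image))


page = [[0] * 4 for _ in range(3)]   # a blank image, 4 pixels wide and 3 high
lines = run_pipeline(page, fake_detector, fake_recognizer)
print(lines)
```

Passing the two stages as functions keeps the sketch independent of any particular detector or recognizer, which mirrors how V1.0 and V2.0 share the same text line extraction step but use different recognition models.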

Installing Kindai OCR

Python==3.7.11
torch==1.7.0
torchvision==0.8.1
opencv-python==3.4.2.17
scikit-image==0.14.2
scipy==1.1.0
Polygon3
pillow==4.3.0
pytorch-lightning==1.3.5
einops==0.3.0
editdistance==0.5.3
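
Because the versions above are pinned fairly tightly, it can help to compare them with what is actually installed before running the scripts. The following is a minimal sketch, assuming the packages were installed with pip under the distribution names listed above; `parse_pins`, `installed_version`, and `report` are hypothetical helpers and are not part of this repository.

```python
import sys

# Pins copied from the requirement list above (Python itself is checked separately).
PINNED = """\
torch==1.7.0
torchvision==0.8.1
opencv-python==3.4.2.17
scikit-image==0.14.2
scipy==1.1.0
Polygon3
pillow==4.3.0
pytorch-lightning==1.3.5
einops==0.3.0
editdistance==0.5.3
"""


def parse_pins(text):
    """Map each package name to its pinned version, or None if it is unpinned."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        pins[name.strip()] = version.strip() if sep else None
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.7
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def report(pins):
    """Return (name, wanted, found, status) for every pinned package."""
    rows = []
    for name, wanted in pins.items():
        found = installed_version(name)
        if found is None:
            status = "missing"
        elif wanted is None or found == wanted:
            status = "ok"
        else:
            status = "mismatch"
        rows.append((name, wanted, found, status))
    return rows


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "(README pins 3.7.11)")
    for name, wanted, found, status in report(parse_pins(PINNED)):
        print("{:20s} wanted={} found={} -> {}".format(name, wanted, found, status))
```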

Running Kindai OCR

  • You should first download the pretrained models and put them into the ./pretrain/ folder: VGG model, CRAFT model, OCR V1.0 model, and OCR V2.0 model (https://drive.google.com/file/d/1cq4PwPS2mXXRjOApst2i7n4G3mBSVqpI/view?usp=drive_link)
  • Copy your images into ./data/test/ folder
  • Run one of the following scripts to recognize images (V1.0 or V2.0 model):
    python test_kindai_1.0.py
    python test_kindai_2.0.py
  • The recognized text transcription is in ./data/result.xml and the result images are in ./data/result/
  • You may need to check the path to the Japanese font in test.py for correct visualization results:
    fontPIL = '/usr/share/fonts/truetype/fonts-japanese-gothic.ttf' # japanese font
  • Set --cuda to True to run on a GPU device and to False to run on a CPU device
  • Use --canvas_size to set the image size for text line detection
  • An example result from our OCR system
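
The --cuda and --canvas_size flags above take a value on the command line. How the scripts parse them is not shown in this README, so the following is only a sketch of one common way to handle such flags with argparse; `str2bool`, `build_parser`, and both default values are assumptions made for illustration.

```python
import argparse


def str2bool(value):
    """Turn strings such as 'True' or 'false' into a real boolean."""
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got {!r}".format(value))


def build_parser():
    parser = argparse.ArgumentParser(
        description="Illustrative flag parsing (not the repository's parser)"
    )
    # Default values here are assumptions, not taken from the repository.
    parser.add_argument("--cuda", type=str2bool, default=True,
                        help="True to run on a GPU, False to run on the CPU")
    parser.add_argument("--canvas_size", type=int, default=1280,
                        help="image size used for text line detection")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--cuda", "False", "--canvas_size", "1280"])
print(args.cuda, args.canvas_size)
```

If the scripts parse their flags in a similar way, a CPU run would look like `python test_kindai_2.0.py --cuda False`. Without a converter like `str2bool`, argparse would treat the string "False" as truthy, which is why a plain `type=bool` is avoided here.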

Citation

If you find Kindai OCR useful in your research, please consider citing:
Anh Duc Le, Daichi Mochihashi, Katsuya Masuda, Hideki Mima, and Nam Tuan Ly. 2019. Recognition of Japanese historical text lines by an attention-based encoder-decoder and text line generation. In Proceedings of the 5th International Workshop on Historical Document Imaging and Processing (HIP '19). Association for Computing Machinery, New York, NY, USA, 37-41. DOI: https://doi.org/10.1145/3352631.3352641

Acknowledgment

We thank the Center for Research and Development of Higher Education, The University of Tokyo, and the National Institute for Japanese Language and Linguistics for providing the Kindai datasets.

Contact

Dr. Anh Duc Le, email: leducanh841988@gmail.com or anh@ism.ac.jp