OCR-related utils

Features

  • Scrape PDFs from the web
  • Extract coordinates and descriptions from PDFs
  • Convert PDFs to image objects (PNG)
  • Build an OCR dataset in the style of a PyTorch Dataset
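
As a rough illustration of the coordinate-extraction feature, the sketch below parses the XML that `pdftohtml -xml` (from poppler-utils) emits, where each `<text>` element carries `top`, `left`, `width`, and `height` attributes. This is not ocra's own API: the function name `extract_text_boxes`, the returned dict layout, and the sample XML are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def extract_text_boxes(xml_string):
    """Collect text boxes from `pdftohtml -xml` output.

    Hypothetical helper for illustration; not part of ocra.
    """
    root = ET.fromstring(xml_string)
    boxes = []
    for page in root.iter("page"):
        page_number = int(page.get("number"))
        for text in page.iter("text"):
            boxes.append({
                "page": page_number,
                "left": int(text.get("left")),
                "top": int(text.get("top")),
                "width": int(text.get("width")),
                "height": int(text.get("height")),
                # itertext() also picks up text nested in <b>/<i> tags
                "description": "".join(text.itertext()).strip(),
            })
    return boxes


# Hand-written sample in the shape pdftohtml produces (an assumption, not real output)
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="50" width="200" height="20" font="0">Hello <b>OCR</b></text>
<text top="130" left="50" width="120" height="20" font="0">world</text>
</page>
</pdf2xml>"""

boxes = extract_text_boxes(SAMPLE)
for box in boxes:
    print(box)
```

Real input for this helper could be produced by running `pdftohtml -xml` on a PDF and reading the resulting `.xml` file; how ocra itself wraps that step is not shown here.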

Installation

git clone https://github.com/mzntaka0/ocra.git
cd ocra
python setup.py install

Dependencies

  • poppler-utils (pdftohtml)
  • Python >= 3.6.2
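
Before installing, it may help to confirm that both requirements are met. The following is a minimal sketch that checks the interpreter version and looks for the `pdftohtml` binary on `PATH`; the helper name `check_dependencies` is hypothetical and not something ocra provides.

```python
import shutil
import sys

# Minimum interpreter version stated in the dependency list
MIN_PYTHON = (3, 6, 2)


def check_dependencies():
    """Report whether each requirement appears to be satisfied.

    Hypothetical helper for illustration; not part of ocra.
    """
    return {
        "python": sys.version_info[:3] >= MIN_PYTHON,
        # shutil.which returns None when the executable is not on PATH
        "pdftohtml": shutil.which("pdftohtml") is not None,
    }


status = check_dependencies()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```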