DANY Bank Parser

Setup

Before handover to DANY, we will have dockerized this code for easy deployment (I will check with Alex if this is a good idea).

Install virtualenv
Create a virtualenv for this project. This repository automatically ignores the venv folder, so venv is a good name for it.
Remember to activate your virtualenv with source venv/bin/activate.
Install all the dependencies with pip install -r requirements.txt.
Install imagemagick and libimagemagickdev (differs by platform; available via apt on Ubuntu).
Install tesseract (as with imagemagick, this differs by platform).
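The system dependencies above are installed outside of pip, so it can help to confirm they are visible before running anything. The following is a minimal sketch, not part of this repository: the function name find_missing_tools is hypothetical, and it assumes the executables are named tesseract for Tesseract and either magick (ImageMagick 7) or convert (ImageMagick 6) for ImageMagick.

```python
import shutil

# Assumed executable names (not taken from this repository):
# "tesseract" for Tesseract, and "magick" (ImageMagick 7) or
# "convert" (ImageMagick 6) for ImageMagick.
REQUIRED_TOOLS = {
    "tesseract": ("tesseract",),
    "imagemagick": ("magick", "convert"),
}


def find_missing_tools(which=shutil.which):
    """Return the tools for which no candidate executable is found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    missing = []
    for tool, candidates in REQUIRED_TOOLS.items():
        if not any(which(name) for name in candidates):
            missing.append(tool)
    return missing


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing system dependencies: " + ", ".join(missing))
    else:
        print("All system dependencies found on PATH.")
```

This only checks that the executables can be found; it does not verify versions or that the ImageMagick development headers are installed.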

Invoking

The main entry point for this program is currently ocr_parser.py. To run OCR on a file, run: python ocr_parser.py <FILE_NAME>. If the file is a PDF, it is first converted to an image and OCR is then run on that image. If it is an image file, OCR is run on it directly. The file type is detected entirely from the file extension.
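The extension-based dispatch described above could look roughly like the sketch below. It is an illustration only and is not copied from ocr_parser.py: the function name plan_ocr_steps, the step names, and the set of accepted image extensions are all assumptions.

```python
from pathlib import Path

# Hypothetical extension sets; the real ocr_parser.py may accept different ones.
PDF_EXTENSIONS = {".pdf"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def plan_ocr_steps(file_name):
    """Return the processing steps for a file, chosen only by its extension."""
    extension = Path(file_name).suffix.lower()
    if extension in PDF_EXTENSIONS:
        # PDFs are converted to an image first, then OCR runs on that image.
        return ["convert_pdf_to_image", "run_ocr"]
    if extension in IMAGE_EXTENSIONS:
        # Image files go straight to OCR.
        return ["run_ocr"]
    raise ValueError(f"Unsupported file extension: {extension!r}")
```

Because only the extension is inspected, a misnamed file (for example, a PDF saved with a .png extension) would be routed to the wrong branch; checking the file contents would be needed to catch that.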


hackNY-labs-2018/dany-parsing
