Test cases #3

falktan · 2021-04-15T10:44:54Z

A small collection of sample pictures for testing the quality of the OCR algorithm would be helpful.
For example, about 15 pictures covering different lighting conditions, text sizes, and text quality, with more or less distracting backgrounds.
The repo https://github.com/falktan/ovip-supplementary is probably a good place for these pictures.

In addition, it would be helpful to have a convenient way to test the algorithm on those pictures.
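One possible shape for such a test helper, as a rough sketch only: score each picture by its character error rate (edit distance between the recognized text and a hand-typed expected text, divided by the length of the expected text). Everything named here is an assumption for illustration and not part of the project: the `samples` mapping, the `ocr` callable that wraps whatever OCR engine is used, and the file names in the usage example.

```python
from typing import Callable, Dict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def character_error_rate(expected: str, actual: str) -> float:
    """Edit distance normalized by the length of the expected text."""
    if not expected:
        # Nothing was expected: any recognized text counts as a full error.
        return 0.0 if not actual else 1.0
    return edit_distance(expected, actual) / len(expected)


def evaluate(samples: Dict[str, str], ocr: Callable[[str], str]) -> Dict[str, float]:
    """Run `ocr` (hypothetical wrapper around the OCR engine) on every image path
    and return the character error rate per picture."""
    return {
        path: character_error_rate(expected, ocr(path))
        for path, expected in samples.items()
    }


if __name__ == "__main__":
    # Hypothetical sample set: image path -> expected text.
    samples = {"samples/exit_sign.jpg": "EXIT", "samples/menu.jpg": "Soup of the day"}

    # Stand-in for a real OCR call, used here only so the sketch runs on its own.
    def fake_ocr(path: str) -> str:
        return {"samples/exit_sign.jpg": "EX1T"}.get(path, "")

    for path, score in evaluate(samples, fake_ocr).items():
        print(f"{path}: CER = {score:.2f}")
```

A per-picture score like this makes it easy to see which conditions (lighting, text size, background) hurt recognition most, and to compare results before and after a change to the algorithm.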

falktan · 2021-04-17T17:55:27Z

It might also be a good idea to look for a suitable collection of sample images that is already publicly available.

falktan · 2021-04-17T18:42:31Z

This paper references a number of sources for "in the wild" text images (as a bonus, it is an interesting read on how to use LSTMs for OCR):
https://arxiv.org/pdf/1507.05717.pdf

falktan · 2021-04-25T11:28:38Z

Started work here:
https://github.com/falktan/ocrjs

falktan · 2021-07-26T12:19:49Z

For ideas on how to improve Tesseract's output quality, see:
https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html

falktan added the good first issue label Apr 15, 2021

falktan removed the good first issue label Apr 17, 2021

falktan self-assigned this Apr 24, 2021

falktan added the enhancement label Apr 24, 2021

falktan mentioned this issue Apr 29, 2021

Try adaptive thresholding #5

Closed
