A web interface for performing Optical Character Recognition (OCR) on PDF files. Upload a PDF file using the provided form and, after a while, you will be presented with a ZIP file containing its pages in text format. Note: Processing may be very slow, so great hardware, great patience, or sometimes both are advised.
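Once the archive has been downloaded, its pages can be read back programmatically. The following is a minimal sketch, not part of this project: the function name `read_pages` is hypothetical, and it assumes the archive holds one UTF-8 text file per page, which may not match the actual layout.

```python
import io
import zipfile


def read_pages(zip_bytes: bytes) -> dict[str, str]:
    """Return a mapping of archive member name to decoded page text.

    Assumption: each member is a UTF-8 text file holding one page.
    """
    pages = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Sort by name so the result has a stable, predictable order.
        for name in sorted(archive.namelist()):
            if name.endswith("/"):
                continue  # skip directory entries
            # Replace undecodable bytes instead of failing on odd OCR output.
            pages[name] = archive.read(name).decode("utf-8", errors="replace")
    return pages
```

Note that names are sorted lexically, so a member named `page-10.txt` would sort before `page-2.txt`; adjust the ordering if the real file names are not zero-padded.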
This web interface is best deployed as a Docker image, either locally or in a more advanced configuration with an ingress service. For this purpose, a ready-to-build Dockerfile is provided.
Apart from very rudimentary input sanitization, there is no security or authentication provided, so great caution is advised when exposing the interface to an untrusted network. In addition, since OCR processing can be very CPU-intensive, a denial-of-service attack through request flooding is extremely easy to perform.
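One common way to blunt request flooding is to cap how many OCR jobs may run at once and reject the rest. The sketch below illustrates that idea only; it is not something this project provides, and the class `JobLimiter` and its methods are hypothetical names. It does not replace authentication or network-level protection.

```python
import threading


class JobLimiter:
    """Admit at most `max_jobs` concurrent jobs; reject any beyond that."""

    def __init__(self, max_jobs: int) -> None:
        self._slots = threading.BoundedSemaphore(max_jobs)

    def try_start(self) -> bool:
        # Non-blocking: returns False immediately when all slots are taken,
        # so an overloaded server can refuse work instead of queueing it.
        return self._slots.acquire(blocking=False)

    def finish(self) -> None:
        # Free the slot taken by a successful try_start().
        self._slots.release()
```

A request handler would call `try_start()` before beginning OCR, respond with an error such as HTTP 503 when it returns `False`, and call `finish()` in a `finally` block once the job ends.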