This repository contains pre-trained Kraken models for the OCR of historical classical commentaries. These models were trained on ground truth (GT) data from two sources:
- the Polytonic Greek Training Data from Historic Texts (Pogretra) dataset v1.0 (31,972 lines)
- the OCR GT for Historical Commentaries dataset (3,356 lines)

For detailed information about each model, please refer to the `metadata.json` file in that model's directory.
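
As a minimal sketch of how those metadata files could be read programmatically, the snippet below uses only the Python standard library. The helper names `load_model_metadata` and `list_model_metadata` are hypothetical, and the code assumes that each model directory sits directly under the repository root; it makes no assumption about which keys `metadata.json` contains.

```python
import json
from pathlib import Path


def load_model_metadata(model_dir):
    """Return the parsed metadata.json of a single model directory."""
    metadata_path = Path(model_dir) / "metadata.json"
    with metadata_path.open(encoding="utf-8") as handle:
        return json.load(handle)


def list_model_metadata(repo_root="."):
    """Map each model directory name to its parsed metadata.

    Assumption: model directories are direct children of repo_root.
    """
    found = {}
    for metadata_path in sorted(Path(repo_root).glob("*/metadata.json")):
        found[metadata_path.parent.name] = load_model_metadata(metadata_path.parent)
    return found
```

Calling `list_model_metadata(".")` from a local clone would then return one entry per model directory that contains a `metadata.json` file.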

Name | Description | Line Example |
---|---|---|
greek-english_porson_sophoclesplaysa05campgoog | Model trained on Pogretra's Porson data, enhanced with additional training materials from Jebb's commentary. | |
greek-german_serifs_sophokle1v3soph | Model trained on Pogretra's German-serifs data, enhanced with additional training materials from Schneidewin's commentary. | |
greek-german_serifs_bsb10234118 | Model trained on Pogretra's German-serifs data, enhanced with additional training materials from Lobeck's Latin commentary. | |

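
One possible way to apply one of the models listed above is through the Kraken command-line tool. The sketch below only builds the command; the helper name `build_kraken_command` is hypothetical, and it assumes that the model directory holds its weights as a single `*.mlmodel` file, which this README does not state. The `segment -bl` and `ocr -m` subcommands follow Kraken's documented command-line usage, but options can differ between Kraken versions.

```python
from pathlib import Path


def build_kraken_command(image_path, output_path, model_dir):
    """Build a Kraken CLI call that segments a page image and recognises its lines."""
    # Assumption: the model weights are stored as one *.mlmodel file in model_dir.
    candidates = sorted(Path(model_dir).glob("*.mlmodel"))
    if not candidates:
        raise FileNotFoundError(f"no .mlmodel file found in {model_dir}")
    return [
        "kraken",
        "-i", str(image_path), str(output_path),  # input image, output text file
        "segment", "-bl",                          # baseline segmentation
        "ocr", "-m", str(candidates[0]),           # recognition with the chosen model
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)` on a machine where Kraken is installed.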

Data in this repository were produced in the context of the Ajax Multi-Commentary project, funded by the Swiss National Science Foundation under Ambizione grant PZ00P1_186033.

Contributors: Bruce Robertson (Mount Allison University), Sven Najem-Meyer (EPFL), Matteo Romanello (UNIL).