generated from UB-Mannheim/ocr-model-repo-template
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[Automatic] Update README and METADATA
1 parent 3fd5deb commit af9a4ac
Showing 19 changed files with 1,116 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
theme: jekyll-theme-dinky |
28 changes: 28 additions & 0 deletions
docs/data/german-newspapers/data/kraken/text/german_newspapers/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">Kraken (default) model for German newspapers trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd>[1,120,0,1 Cr{C_0}3,13,32 Do{Do_1}0.1,2 Mp{Mp_2}2,2 Cr{C_3}3,13,32 Do{Do_4}0.1,2 Mp{Mp_5}2,2 Cr{C_6}3,9,64 Do{Do_7}0.1,2 Mp{Mp_8}2,2 Cr{C_9}3,9,64 Do{Do_10}0.1,2 S{S_11}1(1x0)1,3 Lbx{L_12}200 Do{Do_13}0.1,2 Lbx{L_14}200 Do.{Do_15}1,2 Lbx{L_16}200 Do{Do_17} O{O_18}1c264]</dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>39</dd> | ||
</dl> | ||
</div> |
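Every METADATA.md file in this commit uses the same `<dt>`/`<dd>` layout inside `<dl class="grid">` blocks, so the fields can be read mechanically. The following is a minimal sketch using only Python's standard library. The names `MetadataParser` and `parse_metadata` are hypothetical and not part of the repository, and the sample is a shortened excerpt of the file above.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect <dt>/<dd> pairs from a METADATA.md fragment into a dict.

    Hypothetical helper for illustration; not part of the repository.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # tag currently being read: "dt", "dd", or None
        self._label = None  # text of the most recent <dt>
        self._buf = []      # text chunks of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = "".join(self._buf).strip()
        if tag == "dt":
            # Labels end with a colon in the source, e.g. "Format:".
            self._label = text.rstrip(":")
        elif self._label is not None:
            # An empty <dd></dd> is kept as an empty string.
            self.fields[self._label] = text
            self._label = None
        self._tag = None


def parse_metadata(html_text):
    """Return a {label: value} dict for all <dt>/<dd> pairs in html_text."""
    parser = MetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser.fields


sample = """
<dl class="grid">
<dt id="Language">OCR engine / software:</dt>
<dd>Kraken</dd>
<dt id="Format">Format:</dt>
<dd>.mlmodel</dd>
<dt id="Creation">Creation:</dt>
<dd></dd>
</dl>
<dl class="grid">
<dt id="Epochs">Epochs:</dt>
<dd>39</dd>
</dl>
"""
fields = parse_metadata(sample)
print(fields)
```

Keying on the visible label rather than the `id` attribute is a deliberate choice here, because the ids do not always describe the field (the OCR engine entry carries `id="Language"`).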
28 changes: 28 additions & 0 deletions
...german-newspapers/data/kraken/text/german_newspapers_topologies/gpt/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">Kraken model with gpt topology for German newspapers trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd>[1,120,0,1 Cr{C_0}3,3,32,1,1 Gn{Gn_1}32 Mp{Mp_2}2,2 Cr{C_3}3,3,64,1,1 Gn{Gn_4}64 Mp{Mp_5}2,2,2,2 Cr{C_6}3,3,128,1,1 Gn{Gn_7}128 Mp{Mp_8}2,2,2,2 Cr{C_9}3,3,256,1,1 Gn{Gn_10}256 Mp{Mp_11}2,2,2,2 S{S_12}1(1x0)1,3 Lbx{L_13}256 Do{Do_14}0.2 Lbx{L_15}256 Do{Do_16}0.2 Lbx{L_17}256 Do{Do_18}0.2 O{O_19}1c264]</dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>36</dd> | ||
</dl> | ||
</div> |
28 changes: 28 additions & 0 deletions
...erman-newspapers/data/kraken/text/german_newspapers_topologies/htr+/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">Kraken model with htr+ topology for German newspapers trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd>[1,128,0,1 Cr{C_0}4,2,8,4,2 Cr{C_1}4,2,32,1,1 Mp{Mp_2}4,2,4,2 Cr{C_3}3,3,64,1,1 Mp{Mp_4}1,2,1,2 S{S_5}1(1x0)1,3 Lbx{L_6}256 Do{Do_7}0.5 Lbx{L_8}256 Do{Do_9}0.5 Lbx{L_10}256 Do{Do_11}0.5 O{O_12}1c264]</dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>36</dd> | ||
</dl> | ||
</div> |
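The Topology field holds a VGSL-style network specification: a bracketed, space-separated list whose first token is the input shape and whose remaining tokens each describe one layer. The following is a rough sketch of how such a string could be split into its parts, using the htr+ topology above as input. The function `parse_topology` and the regular expression are illustrative assumptions, not Kraken's own parser. Layer parameters are left as opaque strings, and reading the input shape as batch, height, width, channels is assumed from Kraken's VGSL conventions.

```python
import re

# One layer token: a letter prefix (e.g. "Cr", "Mp", "Lbx"), an optional
# "{name}" tag, and whatever parameters follow. Illustrative only.
LAYER_RE = re.compile(r"^(?P<kind>[A-Za-z]+)(?:\{(?P<name>\w+)\})?(?P<params>.*)$")


def parse_topology(spec):
    """Split a VGSL-style spec into (input_shape, [(kind, name, params), ...])."""
    body = spec.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    tokens = body.split()
    # First token is the input shape, assumed to be batch,height,width,channels.
    input_shape = tuple(int(n) for n in tokens[0].split(","))
    layers = []
    for tok in tokens[1:]:
        m = LAYER_RE.match(tok)
        if m is None:
            raise ValueError(f"unrecognised layer token: {tok!r}")
        layers.append((m["kind"], m["name"], m["params"]))
    return input_shape, layers


htr_plus = (
    "[1,128,0,1 Cr{C_0}4,2,8,4,2 Cr{C_1}4,2,32,1,1 Mp{Mp_2}4,2,4,2 "
    "Cr{C_3}3,3,64,1,1 Mp{Mp_4}1,2,1,2 S{S_5}1(1x0)1,3 Lbx{L_6}256 "
    "Do{Do_7}0.5 Lbx{L_8}256 Do{Do_9}0.5 Lbx{L_10}256 Do{Do_11}0.5 "
    "O{O_12}1c264]"
)
input_shape, layers = parse_topology(htr_plus)
print(input_shape)
for kind, name, params in layers:
    print(kind, name, params)
```

Because the name tag is optional in the pattern, the same sketch also tokenizes specs written without `{...}` names, such as the sgd topology in this commit.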
28 changes: 28 additions & 0 deletions
...erman-newspapers/data/kraken/text/german_newspapers_topologies/htru/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">Kraken model with htru topology for German newspapers trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd>[1,120,0,1 Cr{C_0}4,2,32,4,2 Gn{Gn_1}32 Cr{C_2}4,2,64,1,1 Gn{Gn_3}32 Mp{Mp_4}4,2,4,2 Cr{C_5}3,3,128,1,1 Gn{Gn_6}32 Mp{Mp_7}1,2,1,2 S{S_8}1(1x0)1,3 Lbx{L_9}256 Do{Do_10}0.5 Lbx{L_11}256 Do{Do_12}0.5 Lbx{L_13}256 Do{Do_14}0.5 O{O_15}1c264]</dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>36</dd> | ||
</dl> | ||
</div> |
28 changes: 28 additions & 0 deletions
...man-newspapers/data/kraken/text/german_newspapers_topologies/kraken/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">Kraken model with kraken topology for German newspapers trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd>[1,120,0,1 Cr{C_0}3,13,32 Do{Do_1}0.1,2 Mp{Mp_2}2,2 Cr{C_3}3,13,32 Do{Do_4}0.1,2 Mp{Mp_5}2,2 Cr{C_6}3,9,64 Do{Do_7}0.1,2 Mp{Mp_8}2,2 Cr{C_9}3,9,64 Do{Do_10}0.1,2 S{S_11}1(1x0)1,3 Lbx{L_12}200 Do{Do_13}0.1,2 Lbx{L_14}200 Do.{Do_15}1,2 Lbx{L_16}200 Do{Do_17} O{O_18}1c264]</dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>39</dd> | ||
</dl> | ||
</div> |
28 changes: 28 additions & 0 deletions
...german-newspapers/data/kraken/text/german_newspapers_topologies/sgd/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">Kraken model with sgd topology for German newspapers trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd>[1,144,0,1 Cr4,2,16,1,1 Mp4,2 Cr2,2,48,1,1, Gn24 Mp2,2 Cr2,2,72,1,1 Gn36 Mp2,2 S1(1x0)1,3 Lbx288 Do0.2,2 Lbx288 Do0.2,2 Lbx288]</dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>30</dd> | ||
</dl> | ||
</div> |
29 changes: 29 additions & 0 deletions
docs/data/german-newspapers/data/tesseract/best/german_newspapers/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
<link rel="stylesheet" href="../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">OCR model for German newspapers trained from several datasets. | ||
Best model variant for Tesseract. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-german-newspapers</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Tesseract</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.traineddata</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd></dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>20</dd> | ||
</dl> | ||
</div> |
29 changes: 29 additions & 0 deletions
docs/data/german-newspapers/data/tesseract/fast/german_newspapers/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
<link rel="stylesheet" href="../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German newspapers</h1> | ||
<p id="paragraph">OCR model for German newspapers trained from several datasets. | ||
Fast model variant for Tesseract. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-german-newspapers</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Tesseract</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.traineddata</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd></dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>20</dd> | ||
</dl> | ||
</div> |
28 changes: 28 additions & 0 deletions
docs/data/german-print/data/kraken/text/german_print/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
<link rel="stylesheet" href="../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German print</h1> | ||
<p id="paragraph">Kraken model for German prints trained from several datasets. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Kraken</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.mlmodel</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd></dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>17</dd> | ||
</dl> | ||
</div> |
29 changes: 29 additions & 0 deletions
docs/data/german-print/data/tesseract/best/german_print/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
<link rel="stylesheet" href="../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German print</h1> | ||
<p id="paragraph">OCR model for German prints trained from several datasets. | ||
Best model variant for Tesseract. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Tesseract</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.traineddata</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd></dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>20</dd> | ||
</dl> | ||
</div> |
29 changes: 29 additions & 0 deletions
docs/data/german-print/data/tesseract/fast/german_print/METADATA.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
<link rel="stylesheet" href="../../../../../../table_hide.css"/> | ||
<div> | ||
<h1 id="title">German print</h1> | ||
<p id="paragraph">OCR model for German prints trained from several datasets. | ||
Fast model variant for Tesseract. | ||
See https://github.com/UB-Mannheim/kraken/wiki/Training-German-Print</p> | ||
<h2>Metadata</h2> | ||
<dl class="grid"> | ||
<dt id="Language">OCR engine / software:</dt> | ||
<dd>Tesseract</dd> | ||
<dt id="Type">Model type:</dt> | ||
<dd>Text recognition</dd> | ||
<dt id="Format">Format:</dt> | ||
<dd>.traineddata</dd> | ||
<dt id="Topology">Topology:</dt> | ||
<dd></dd> | ||
<dt id="Creation">Creation:</dt> | ||
<dd></dd> | ||
<dt id="License">License:</dt> | ||
<dd>PublicDomainMark 1.0 (see: https://creativecommons.org/publicdomain/mark/1.0/)</dd> | ||
</dl> | ||
<h2>Training</h2> | ||
<dl class="grid"> | ||
<dt id="Training-type">Type of training:</dt> | ||
<dd>From scratch</dd> | ||
<dt id="Epochs">Epochs:</dt> | ||
<dd>20</dd> | ||
</dl> | ||
</div> |