Benchmark workflows "selected_pages_ocr" do not produce text results #22
Comments
Would be interesting to see the files of
Here is
The QuiVer benchmark workflow selected_pages_ocr uses a process which binarizes twice. That gives an image which is too light for good OCR results (some characters are even missing completely). Nevertheless, most of the text is still readable, so there should be some OCR result.
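One way to quantify "too light" is the foreground-pixel ratio of the binarized page image. The following is only an illustrative sketch, not part of QuiVer; the file path is a placeholder:

```python
# Illustrative only: estimate how "light" a binarized page image is by its
# share of foreground (black) pixels. The path is a placeholder.
import numpy as np
from PIL import Image

def foreground_ratio(path):
    """Fraction of black pixels in a binarized (black-on-white) image."""
    img = np.array(Image.open(path).convert("1"))  # boolean array, True = white
    return 1.0 - img.mean()

# A page that went through binarization twice should show a noticeably lower
# ratio than the singly binarized version of the same page.
print(foreground_ratio("page_0001_binarized.png"))
```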
All data is now available online. It also includes the generated page images, for example page 1 (binarized twice, denoised, deskewed).
There are no TextLines to recognize text from, so this is expected.
(I'm going on vacation in 2 hours so I'm not checking where the segmentation step is missing/going wrong, but I can check when I'm back)
Commit 3b32589 removed a parameter. If that parameter is added again, some tests work fine, but others fail with a runtime error.
Meanwhile I restored the line segmentation for the workflow and got OCR results, at least for the tests where the segmentation process did not crash (see cisocrgroup/ocrd_cis#94). It looks like the segmentation of a single newspaper page takes several hours (the first one has now been running for 252 minutes, see cisocrgroup/ocrd_cis#98). I am afraid that the whole workflow cannot be used in the benchmark tests because of that.
The workflow selected_pages_ocr uses more than 118 GiB of RAM while running OCR with ocrd-calamari-recognize.
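For reference, a figure like this can be reproduced by recording the peak RSS of the processor's process tree. A minimal sketch, assuming a Linux host and a plain CLI invocation; the command shown is only a placeholder:

```python
# Record the peak resident set size of a child process tree (Linux: KiB).
import resource
import subprocess

def peak_rss_gib(cmd):
    """Run a command and return the peak RSS of its children in GiB."""
    subprocess.run(cmd, check=False)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss / 1024**2

# Placeholder invocation; a real run would include the workspace and file groups.
print(peak_rss_gib(["ocrd-calamari-recognize", "--help"]))
```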
That change is faulty btw: default is
On which input data is that, specifically? Your upload does not seem to be up to date. I'd gladly reproduce and debug if I had the workspace, including the segmentation used. The "configuration" used is workflows/ocrd_workflows/selected_pages_ocr.txt, I take it?
That's right, selected_pages_ocr.txt is the workflow file.
It would help if I had a workspace up to that point. From the looks of it, @bertsky seems to be right (above) and the workflow still doesn't produce line segmentation (only region segmentation), so this behaviour would be even more curious.
@stweil didn't we already establish (in the OCR-D Forum) that the version of ocrd_all used by Quiver at the time was hopelessly outdated? But I agree we should get to the bottom of this – with or without line segments, ocrd-calamari-recognize should not be allowed (or motivated) to allocate large amounts of memory.
Yep. The way it works (line-by-line processing), it shouldn't happen, but (a) I didn't test many newspaper pages myself and did that on a host with a lot of memory, and (b) it wouldn't be the first time I've seen a memory leak with TensorFlow.
(Should probably run processors with ulimit or in a cgroup)
Agreed! Could also be easily done in ocrd_all Docker images. Docker itself offers options like --memory.
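To illustrate the ulimit/cgroup idea: a processor invocation could be wrapped so that the child process inherits a hard virtual-memory cap, roughly what `ulimit -v` does. This is only a sketch, not something ocrd_all currently does; the command line, output file group and the 16 GiB cap are placeholders:

```python
# Sketch: run an OCR-D processor CLI under a hard virtual-memory limit,
# roughly equivalent to starting it from a shell after `ulimit -v`.
import resource
import subprocess

def run_with_memory_limit(cmd, max_bytes):
    """Run a command with RLIMIT_AS capped in the child process."""
    def limit():
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
    return subprocess.run(cmd, preexec_fn=limit, check=False)

# Placeholder invocation with a 16 GiB cap; exceeding it makes allocations
# fail instead of exhausting the host's RAM.
run_with_memory_limit(
    ["ocrd-calamari-recognize",
     "-I", "OCR-D-SEG-LINE-RESEG-DEWARP",  # file group name taken from this thread
     "-O", "OCR-D-OCR"],                   # placeholder output file group
    16 * 1024**3,
)
```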
I have thoughts about this (for example, I don't think profile.d would work here), should we open an issue in ocrd_all then? Have to look into the "slim image" efforts anyway.
It is still outdated, see issue #23. And I don't know whether there are plans and resources to change that.
@mikegerber I added it to OCR-D/ocrd_all#280 – please add your ideas there.
Because I didn't have the workspace to debug the memory problem involving ocrd-calamari-recognize, I tried to re-run the workflow steps myself.
Workspace at this point, if someone wants to have a look: https://qurator-data.de/~mike.gerber/2024-02-quiver-benchmarks-issue-22/reichsanzeiger_random_selected_pages_ocr.zip
At this point, I am not willing to look into this specific ocrd-calamari-recognize memory issue further, because I can't reproduce anything properly; it already involved guessing which original workspace it could have been and trying to run 7 processors. I am willing to look into it further if I get the workspace in the state before ocrd-calamari-recognize ran, including OCR-D-SEG-LINE-RESEG-DEWARP. I'll test with some other segmentation in OCR-D/ocrd_calamari#110, just to make sure that there is no general issue.
I am not sure that the images are binarized twice. The workflow runs binarization twice, yes, but the second binarization step may just use the original image (cropped), via AlternativeImage. @kba @bertsky Is this correct? Is there a way to verify this with the log? (In the ZIP in the comment above this.)
@mikegerber exactly. All binarization processors filter out already binarized images on the input side (via their image feature selection), so the second step binarizes the cropped original rather than the output of the first step. The log would only detail this if you were to enable the respective debug loggers.
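For reference, a sketch of how that input-side filtering looks with OCR-D core's workspace API, as far as I understand it; the file group and METS path are placeholders:

```python
# Sketch of input-side image selection as done by binarization processors,
# using ocrd core's Workspace.image_from_page. Names below are placeholders.
from ocrd import Resolver
from ocrd_modelfactory import page_from_file

workspace = Resolver().workspace_from_url("mets.xml")
input_file = list(workspace.mets.find_files(fileGrp="OCR-D-IMG-CROP"))[0]
pcgts = page_from_file(workspace.download_file(input_file))
page = pcgts.get_Page()

# Request the page image while *excluding* anything already binarized:
# the processor therefore works on the cropped/deskewed original, not on
# the output of an earlier binarization step.
page_image, page_coords, _ = workspace.image_from_page(
    page, input_file.pageId,
    feature_filter="binarized",
)
```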
The related workflows all end with CER / WER 1.0, so no text is recognized by Calamari.
A manual run for a single GT page terminates in less than 1 second without an error message, but also without a usable result.
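Such a manual single-page run presumably looks roughly like the following (a hypothetical reconstruction; the actual command and its output were not captured here, and file group, page ID and output group are placeholders):

```python
# Hypothetical reconstruction of a manual single-page run of the recognizer.
import subprocess

subprocess.run(
    [
        "ocrd-calamari-recognize",
        "-m", "mets.xml",                     # METS of the benchmark workspace
        "-I", "OCR-D-SEG-LINE-RESEG-DEWARP",  # input file group named in this thread
        "-O", "OCR-D-OCR-CALAMARI",           # placeholder output file group
        "-g", "PHYS_0001",                    # restrict processing to one page
        # a model/checkpoint parameter would additionally be required via -P
    ],
    check=True,
)
```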