add OCR Decoding support - WIP #113

N950 · 2024-06-10T14:24:21Z

This is a WIP to add CTC OCR recognition/decoding.
Conformity to the contribution guidelines will be fixed before closing.

The next change will add a kpt label to the TEXT LabelType. Even though the goal is only OCR recognition, on the data side it makes sense to create the annotation/LabelType from the beginning to support kpt annotations.
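
For readers unfamiliar with CTC decoding, here is a minimal sketch of the greedy variant: collapse consecutive repeated indices, then drop the blank index. It is not the code in this PR. The function name, the blank index of 0, and the convention that index `i` maps to `charset[i - 1]` are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(frame_ids: Sequence[int], charset: str, blank: int = 0) -> str:
    """Greedy CTC decoding of per-frame argmax indices (illustrative only).

    Assumes index 0 is the CTC blank and index i > 0 maps to charset[i - 1].
    """
    decoded: List[str] = []
    previous: Optional[int] = None
    for idx in frame_ids:
        # Emit a character only when the index changes and is not the blank.
        if idx != previous and idx != blank:
            decoded.append(charset[idx - 1])
        previous = idx
    return "".join(decoded)


# A blank between two identical indices keeps both characters.
print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], "ab"))
```

The blank is what lets a repeated character survive: without the `0` between the two runs of `1`, both runs would collapse into a single `a`.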

kozlov721 · 2024-06-10T17:40:07Z

luxonis_ml/data/datasets/annotation.py

+        class_mapping: Dict[str, int],
+        **_,
+    ) -> np.ndarray:
+        text_labels = None


text_labels can already be set to np.zeros((len(annotations), ann.max_len)) here, so there's no chance of returning None.
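
A minimal sketch of that suggestion, assuming a hypothetical helper `encode_text_labels` that takes plain strings and a character-level `class_mapping` (the real method works on annotation objects and may differ):

```python
from typing import Dict, List

import numpy as np


def encode_text_labels(
    texts: List[str], class_mapping: Dict[str, int], max_len: int
) -> np.ndarray:
    """Hypothetical helper: encode strings into a zero-padded integer array."""
    # Preallocating means the function always returns an array,
    # even when `texts` is empty, so None can never leak out.
    text_labels = np.zeros((len(texts), max_len), dtype=int)
    for row, text in enumerate(texts):
        # Assumption: class_mapping maps single characters to integer ids.
        ids = [class_mapping[ch] for ch in text[:max_len]]
        text_labels[row, : len(ids)] = ids
    return text_labels
```

With an empty input the result has shape `(0, max_len)`, which callers can handle without a separate `None` check.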

kozlov721 · 2024-06-10T17:40:49Z

luxonis_ml/data/loaders/luxonis_loader.py

@@ -174,6 +174,7 @@ def _load_image_with_annotations(self, idx: int) -> Tuple[np.ndarray, Labels]:

        uuid = self.instances[idx]
        df = self.df.loc[uuid]
+        print(df)


forgotten print

kozlov721 · 2024-06-10T17:47:01Z

luxonis_ml/data/utils/data_utils.py

+
+
+def validate_text_value(
+    value: str,


The annotation of value seems to be incorrect.

kozlov721

Left some comments, otherwise looks good.

klemen1999 · 2024-06-13T09:37:09Z

luxonis_ml/data/augmentations/custom/ocr.py

+        @type is_train: bool
+        """
+        super(OCRAugmentation, self).__init__()
+        self.transforms = A.Compose(


Is this a set of some standard augmentations that are usually performed for OCR task or how is this defined?

Also curious about this.

klemen1999 · 2024-06-13T09:41:56Z

luxonis_ml/data/augmentations/custom/ocr.py

+                    ],
+                    p=0.2
+                ),
+                A.Compose(  # resize to image_size with aspect ratio, pad if needed


Resize and Normalize are already part of the default augmentations. Resize is always done (you can control whether it keeps the aspect ratio) and Normalize is also appended to the list of augmentations (if used by luxonis-train; it can be deactivated through the config, though). So is this needed here?

kozlov721 · 2024-06-14T00:51:58Z

luxonis_ml/data/augmentations/custom/ocr.py

+        @param is_train: True if image is train. False if image is val/test.
+        @type is_train: bool
+        """
+        super(OCRAugmentation, self).__init__()


Let's keep it as just super().__init__(). The arguments in super are a relic from Python 2.
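
To illustrate, both forms behave the same in Python 3. The class names below are made up for the example:

```python
class Base:
    def __init__(self) -> None:
        self.initialized = True


class LegacyStyle(Base):
    def __init__(self) -> None:
        # Python 2 compatible form: class and instance passed explicitly.
        super(LegacyStyle, self).__init__()


class ModernStyle(Base):
    def __init__(self) -> None:
        # Python 3 form: the class and instance are inferred.
        super().__init__()
```

The zero-argument form also survives a class rename, since the class name is not repeated inside the call.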

conorsim

The next change will add a kpt label to the TEXT LabelType. Even though the goal is only OCR recognition, on the data side it makes sense to create the annotation/LabelType from the beginning to support kpt annotations.

What is meant by this? We already have LabelType.KEYPOINTS. We also plan to support nested annotations, so I think the final form for OCR + keypoints would be TEXT and KEYPOINTS nested within a BOUNDINGBOX.

conorsim · 2024-06-18T16:56:39Z

luxonis_ml/data/datasets/luxonis_dataset.py

+    def set_global_metadata(self, metadata: Dict[str, Any]) -> None:
+        self.global_metadata = metadata


Due to GCS datasets, I think we need a way to persist this via storage instead of just memory? Perhaps we could use the existing datasets.json or metadata folder?
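
A rough sketch of what file-backed persistence could look like, assuming a hypothetical `MetadataStore` class and a `global_metadata.json` file inside the metadata folder. Neither name exists in the codebase as far as this thread shows, and syncing the file to GCS would still have to go through whatever upload mechanism the dataset already uses.

```python
import json
from pathlib import Path
from typing import Any, Dict


class MetadataStore:
    """Hypothetical helper that keeps global metadata in a JSON file."""

    def __init__(self, metadata_dir: Path) -> None:
        # Assumed file name; the real layout may differ.
        self.path = Path(metadata_dir) / "global_metadata.json"

    def set_global_metadata(self, metadata: Dict[str, Any]) -> None:
        # Write to disk so the value outlives the current process.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(metadata))

    def get_global_metadata(self) -> Dict[str, Any]:
        # A missing file is treated as "no metadata set yet".
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text())
```

Because the value lives on disk, a fresh instance pointed at the same folder reads back what an earlier one wrote, which an in-memory attribute cannot do.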

add OCR Decoding support

d24ff7e

N950 requested review from kozlov721, tersekmatija and conorsim June 10, 2024 14:24

N950 requested review from klemen1999 and CaptainTrojan June 11, 2024 09:22

Base automatically changed from dev to main July 1, 2024 23:35

kozlov721 changed the base branch from main to dev July 8, 2024 14:30

kozlov721 requested a review from a team as a code owner October 8, 2024 22:52

kozlov721 requested review from kozlov721, klemen1999 and conorsim and removed request for a team October 8, 2024 22:52
