Example 3 of D4.3 - Pre-processing #12

cozzolinoac11 · 2023-05-08T08:49:33Z

Use case

common

Name of resource

JPEG images to numpy array transformation

ID

JPEG_to_numpy_transformation

Description

Builds a dataset of NumPy arrays. Python machine learning libraries generally expect image data as NumPy arrays in [Height, Width, Channel] format, so the images must be converted to that format. Here the images are JPEG files, and the transformation is performed with Pillow, NumPy and OpenCV functions. The cv2 package (OpenCV) provides imread(), which loads an image file and returns it directly as a NumPy array (note that OpenCV orders the channels as BGR by default). All images in the dataset (i.e., the NumPy arrays) must have the same size to be used, so, also to save memory and computing power, cv2's resize() is used to resize them from 350x350 pixels to 100x100 pixels (this size can easily be changed). There are three channels because the images are RGB. The method returns a dataset containing the images as NumPy arrays together with their respective class labels.

Main category

Pre-processing

Other category

No response

Publication date

2023-08-05

Objective

data-transformation

Platform

Google Colab

Framework

OpenCV

Architecture

None

Approach

None

Algorithm

custom-method

Processor

cpu

OS

linux

Keyword

numpy array, data transformation, jpeg

Reference link

No response

Example

https://github.com/cozzolinoac11/wildfire_prediction/blob/main/img_to_NPY_transformation.ipynb

Input data used

https://open.canada.ca/data/en/dataset/9d8f219c-4df0-4481-926f-8a2a532ca003

Characteristics of input data

Refer to Canada's website for the original wildfire data. The dataset is composed of satellite images (each 350x350 pixels).

Biases and ethical aspects

No response

Output data obtained

https://public.epsilon-italia.it/FAIRiCUBE/wildfire-classification/data_numpy.zip

Characteristics of output data

Dataset in NumPy array format. The images are resized to 100x100 pixels.
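As a rough illustration of what such an output looks like, the sketch below saves and reloads arrays of the shape described here. The file names `images.npy` and `labels.npy` and the stand-in data are assumptions for the example; the actual contents of the linked archive may be organised differently.

```python
import os
import tempfile

import numpy as np

# Stand-in data with the described shape: N images of 100x100 pixels, 3 channels
images = np.zeros((4, 100, 100, 3), dtype=np.uint8)
labels = np.array([0, 0, 1, 1])

# Hypothetical file names, written to a temporary directory
out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "images.npy"), images)
np.save(os.path.join(out_dir, "labels.npy"), labels)

# Reload and inspect the arrays
loaded_images = np.load(os.path.join(out_dir, "images.npy"))
loaded_labels = np.load(os.path.join(out_dir, "labels.npy"))
print(loaded_images.shape, loaded_images.dtype)
```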

Performance

No response

Conditions for access and use

cc-by-4.0

Constraints

No response

KathiSchleidt · 2023-05-08T12:01:03Z

Similar to the comment on #11, I think a bit more detail may be useful for non-expert users.

On the description, could you provide a bit more detail on how the transformation is performed and what's available in the NumPy array (how do you split the JPEG RGB into the array)?

On "Input data used", the page you link to provides diverse datasets, so it's unclear which are being used. In "Characteristics of input data", there's no link; the only way of finding the information is the input data link.

On sizes, you don't provide a UoM. I'm assuming meters, but it would be nice to add.

cozzolinoac11 · 2023-05-11T10:48:43Z

The same comment made in issue #11

cozzolinoac11 added the documentation and good first issue labels May 8, 2023

cozzolinoac11 changed the title ~~Example 1 of D4.3 - Pre-processing~~ Example 3 of D4.3 - Pre-processing May 8, 2023

cozzolinoac11 mentioned this issue May 8, 2023: additional values (for several codelists) #5 (Open)

cozzolinoac11 added the a/p metadata label Jun 29, 2023
