Fixes #1659: Convert image mode I;16 to RGB #1664

haianhng31 · 2024-10-20T17:32:04Z

== Summary ==

Fixes [pdq] ValueError when computing PDQ hash on some pngs #1659
Convert image mode I;16 to RGB

== Test Plan ==

Add test cases for images of mode LA, I;16, RGB
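
The diff itself is not reproduced in this thread. As a minimal sketch of the kind of normalization the summary describes, assuming Pillow and NumPy are available, the hypothetical helper `to_rgb` below reduces 16-bit samples to 8 bits before converting to RGB. The right-shift scaling is an assumption chosen for illustration and is not necessarily what the PR does.

```python
import numpy as np
from PIL import Image


def to_rgb(image: Image.Image) -> Image.Image:
    """Return an 8-bit RGB copy of `image` (hypothetical helper, not the PR's code)."""
    if image.mode.startswith("I;16"):
        # Assumption: keep the high byte of each 16-bit sample so that the
        # full 0..65535 range maps onto 0..255 rather than being clipped.
        samples = np.asarray(image).astype(np.uint16)
        # A 2-D uint8 array is interpreted by Pillow as mode "L".
        image = Image.fromarray((samples >> 8).astype(np.uint8))
    # Modes such as "L", "LA" and "RGB" go through Pillow's own conversion.
    return image.convert("RGB")
```

Scaling before conversion keeps the relative brightness of the 16-bit data; other mappings (for example, stretching to the image's actual min/max) are equally plausible.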

facebook-github-bot · 2024-10-20T17:32:09Z

Hi @haianhng31!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

facebook-github-bot · 2024-10-20T17:33:39Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

Dcallies

Thanks a ton for taking this on! Your unit tests convince me that this PR fixes the issue, but if you can confirm in the test plan in the summary that they reproduce the failing condition, I'll be doubly convinced.

I have a few concerns about the test images, since they appear to come from Reddit user data. We don't want to include images we don't have the rights to.

You have a few options for generating test data:

Produce the image yourself (essentially donating it to the repo)
Algorithmically generate the image by generating noise, random data, etc.
Modify images in pdq/data/bridge_pics to generate the case, since those were generated by RFC: Python Library for ThreatExchange #1 by a previous author.

We cannot do:

Images that we find on the internet (even on sites purporting to be public domain, because people can re-host images there)
Known copyrighted images

Can you try and produce a local reproduction by using the bridge pics? That might be nicest because those hashes have a known value.
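
One way to derive such cases from an image you already have the rights to, sketched under the assumption that Pillow and NumPy are available. The helper names `as_i16` and `as_la` are hypothetical, and no specific file under pdq/data/bridge_pics is assumed.

```python
import numpy as np
from PIL import Image


def as_i16(image: Image.Image) -> Image.Image:
    """Re-encode `image` as 16-bit grayscale (mode "I;16")."""
    # Little-endian uint16 so that Pillow infers mode "I;16" from the array.
    gray = np.asarray(image.convert("L")).astype("<u2")
    # Multiplying by 257 stretches 0..255 onto 0..65535 (255 * 257 == 65535).
    return Image.fromarray(gray * 257)


def as_la(image: Image.Image) -> Image.Image:
    """Re-encode `image` as 8-bit grayscale with an alpha channel (mode "LA")."""
    return image.convert("LA")
```

Calling `as_i16(Image.open(path)).save("out.png")` on a source image would then produce a PNG in the mode that triggered the original error, while the visual content stays that of the source image.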

16BitNarwhal · 2024-10-23T16:25:01Z

Hi @Dcallies,
Made some changes and generated test images as follows.
I'm not sure why the quality of the i16 (16-bit grayscale) image is at 36 though

from PIL import Image
import numpy as np

# Every test image is a square of uniform random noise.
width, height = 256, 256

def generate_i16_image():
    # 16-bit grayscale: one uint16 sample per pixel, full 0..65535 range.
    noise = np.random.randint(0, 65536, (height, width), dtype=np.uint16)
    image = Image.fromarray(noise, mode='I;16')
    image.save('noise_image_I16.png')

def generate_la_image():
    # 8-bit luminance plus alpha, stacked into a (height, width, 2) array.
    luminance = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    alpha = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    la_image_data = np.stack((luminance, alpha), axis=-1)
    la_image = Image.fromarray(la_image_data, mode='LA')
    la_image.save('noise_image_LA.png')

def generate_rgb_image():
    # Three independent 8-bit channels, stacked into a (height, width, 3) array.
    red = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    green = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    blue = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    rgb_image_data = np.stack((red, green, blue), axis=-1)
    rgb_image = Image.fromarray(rgb_image_data, mode='RGB')
    # JPEG is lossy, so the saved pixels will not match the array exactly.
    rgb_image.save('noise_image_RGB.jpeg', 'JPEG')

generate_i16_image()
generate_la_image()
generate_rgb_image()
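
Because the script above draws from an unseeded generator, rerunning it produces different files each time. A possible variant, assuming NumPy's `default_rng` and Pillow's array-based mode inference, makes the fixtures reproducible. The function name `noise_image` and its parameters are hypothetical.

```python
import numpy as np
from PIL import Image


def noise_image(mode: str, size: int = 256, seed: int = 0) -> Image.Image:
    """Build a square noise image in `mode` from a seeded generator."""
    rng = np.random.default_rng(seed)
    if mode == "I;16":
        # Little-endian uint16 samples; Pillow infers "I;16" from this dtype.
        data = rng.integers(0, 65536, (size, size), dtype=np.uint16).astype("<u2")
    elif mode == "LA":
        # Two uint8 channels: luminance and alpha.
        data = rng.integers(0, 256, (size, size, 2), dtype=np.uint8)
    elif mode == "RGB":
        # Three uint8 channels: red, green and blue.
        data = rng.integers(0, 256, (size, size, 3), dtype=np.uint8)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    # The mode is inferred from the array's dtype and shape.
    return Image.fromarray(data)
```

The same seed always yields the same pixels, so a regenerated fixture only matches a previously recorded hash if it is saved in a lossless format such as PNG.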

thedanielsun · 2024-10-24T21:10:07Z

Oh, is this the same issue as what I tried to fix in #1264 ?
I added test data in #1265 idk if that's useful

Edit: I see the PR includes I;16, maybe you can copy what I did in #1265 and convert the b.jpg to I;16 if you want to avoid licensing for the test data

Dcallies

Hmm, I had a reply that seems to have never been submitted -

The reason that your quality is low is that the image quality is low - it's just random noise, and PDQ can detect that and advise against using it for matching!

Instead, we should do as Daniel recommended, which is to use the bridge-mod images and just convert them. Since this has been sitting for a long time and nothing about this PR is blocking, I suggest we merge it and iterate if need be.

haianhng31 and others added 4 commits October 17, 2024 21:10

Convert image type I16 to RGB

c5bc1df

Add test cases for images of mode LA, I16, RGB

6a469cc

add back the old test cases

ae737e8

lint

068c4af

haianhng31 requested a review from Dcallies as a code owner October 20, 2024 17:32

facebook-github-bot added the CLA Signed label Oct 20, 2024

Dcallies requested changes Oct 22, 2024

use noisy images

420daa9

Dcallies self-requested a review October 28, 2024 20:43

Dcallies approved these changes Nov 1, 2024

Dcallies merged commit 9e067ec into facebook:main Nov 1, 2024
6 checks passed
