Fixes #1659: Convert image mode I;16 to RGB #1664

Merged
merged 5 commits into from
Nov 1, 2024

@haianhng31 haianhng31 commented Oct 20, 2024

== Summary ==

== Test Plan ==

  • Add test cases for images of mode LA, I;16, RGB
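For readers without the diff at hand, the kind of normalization these test cases exercise might look like the sketch below. This is an illustration, not the merged code: `normalize_to_rgb` is a hypothetical name, and keeping the high byte of each 16-bit sample is an assumed scaling choice.

```python
import numpy as np
from PIL import Image


def normalize_to_rgb(image):
    """Return an 8-bit RGB copy of a Pillow image (hypothetical helper)."""
    if image.mode == "I;16":
        # Assumption: keep the high byte of every 16-bit sample so the
        # values fit in 0..255 instead of being clipped.
        high_byte = (np.asarray(image) >> 8).astype(np.uint8)
        image = Image.fromarray(high_byte)  # 2-D uint8 array becomes mode "L"
    # LA and RGB inputs (and the "L" image built above) convert directly.
    return image.convert("RGB")
```

A white 16-bit pixel (65535) comes out as `(255, 255, 255)`, while LA and RGB inputs go straight through Pillow's own `convert`.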

@facebook-github-bot

Hi @haianhng31!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@Dcallies Dcallies left a comment

Thanks a ton for taking this on! Your unit tests convince me that this PR fixes the issue, but if you can confirm in the test plan in your summary that they reproduce the original failure, I'll be doubly convinced.

A few concerns about the test images, since they appear to be from reddit user data. We don't want to include images we don't have the rights to.

You have a few options for generating test data:

  1. Produce the image yourself (essentially donating it to the repo)
  2. Algorithmically generate the image by generating noise, random data, etc.
  3. Modify images in pdq/data/bridge_pics to generate the case, since those were generated by RFC: Python Library for ThreatExchange #1 by a previous author.

We cannot use:

  1. Images that we find on the internet (even from sites purporting to be public domain, because people can re-host images there)
  2. Known copyrighted images

Can you try and produce a local reproduction by using the bridge pics? That might be nicest because those hashes have a known value.

@16BitNarwhal

Hi @Dcallies,
Made some changes and generated test images as follows.
I'm not sure why the quality of the i16 (16-bit grayscale) image is at 36 though

from PIL import Image
import numpy as np

width, height = 256, 256

def generate_i16_image():
    noise = np.random.randint(0, 65536, (height, width), dtype=np.uint16)
    image = Image.fromarray(noise, mode='I;16')
    image.save('noise_image_I16.png')

def generate_la_image():
    luminance = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    alpha = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    la_image_data = np.stack((luminance, alpha), axis=-1)
    la_image = Image.fromarray(la_image_data, mode='LA')
    la_image.save('noise_image_LA.png')

def generate_rgb_image():
    red = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    green = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    blue = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    rgb_image_data = np.stack((red, green, blue), axis=-1)
    rgb_image = Image.fromarray(rgb_image_data, mode='RGB')
    rgb_image.save('noise_image_RGB.jpeg', 'JPEG')

generate_i16_image()
generate_la_image()
generate_rgb_image()

@thedanielsun

thedanielsun commented Oct 24, 2024

Oh, is this the same issue as what I tried to fix in #1264 ?
I added test data in #1265 idk if that's useful

Edit: I see the PR includes I;16, maybe you can copy what I did in #1265 and convert the b.jpg to I;16 if you want to avoid licensing for the test data
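A minimal sketch of that conversion, assuming Pillow and NumPy are available. `to_i16` is a hypothetical helper, and the small synthetic gradient in the usage lines stands in for a real test image, since no file path is assumed here.

```python
import numpy as np
from PIL import Image


def to_i16(image):
    """Return a 16-bit grayscale (I;16) copy of a Pillow image (hypothetical helper)."""
    gray = np.asarray(image.convert("L"), dtype=np.uint16)
    # Multiplying by 257 stretches 0..255 onto the full 0..65535 range.
    # A 2-D uint16 array is mapped to mode "I;16" on little-endian machines.
    return Image.fromarray(gray * 257)


# Usage with a stand-in image: a horizontal 0..255 gradient, 4 rows high.
gradient = Image.fromarray(np.tile(np.arange(256, dtype=np.uint8), (4, 1)))
converted = to_i16(gradient)
```

Saving `converted` as a PNG should keep the 16-bit depth, which gives a test input of the problematic mode without any licensing questions.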

@Dcallies Dcallies self-requested a review October 28, 2024 20:43

@Dcallies Dcallies left a comment

Hmm, I had a reply that seems to have never been submitted -

The reason that your quality is low is that the image quality is low - it's just random noise, and PDQ can detect that and advise against using it for matching!
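As a rough intuition only, the sketch below shows how quickly random noise loses contrast once it is averaged down to a small grid. It assumes that a quality score of this kind depends on how much gradient survives downsampling. That is an assumption about the mechanism, not PDQ's actual computation, and `block_mean` and `mean_abs_gradient` are illustrative helpers.

```python
import numpy as np

rng = np.random.default_rng(0)
# 256x256 uniform noise, like the generated test images above.
noise = rng.integers(0, 256, size=(256, 256)).astype(np.float64)


def block_mean(pixels, block):
    """Downsample by averaging non-overlapping block x block tiles."""
    h, w = pixels.shape
    return pixels.reshape(h // block, block, w // block, block).mean(axis=(1, 3))


def mean_abs_gradient(pixels):
    """Average absolute difference between neighbouring pixels."""
    dx = np.abs(np.diff(pixels, axis=1)).mean()
    dy = np.abs(np.diff(pixels, axis=0)).mean()
    return (dx + dy) / 2


small = block_mean(noise, 4)  # 64x64 grid of 4x4 tile averages
print(mean_abs_gradient(noise), mean_abs_gradient(small))
```

The averaged grid has far weaker neighbour-to-neighbour differences than the raw noise, which matches the idea that noise leaves little structure for a perceptual hash to use.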

Instead, we should do as Daniel recommended: use the bridge-mod images and just convert them. Since this has been sitting for a long time, and there's nothing blocking about this PR, I suggest we merge it and iterate if need be.

@Dcallies Dcallies merged commit 9e067ec into facebook:main Nov 1, 2024
6 checks passed
Development

Successfully merging this pull request may close these issues.

[pdq] ValueError when computing PDQ hash on some pngs
5 participants