[pytx] Implement FileContent class #1680
base: main
Changes from 5 commits
a661224
b31647e
8aff837
d18a8e9
95282f0
69d9abc
bb816aa
a5a1930
2580a48
@@ -0,0 +1,30 @@
import typing as t
from content_base import ContentType
from photo import PhotoContent
from video import VideoContent
from PIL import Image

class FileContent(ContentType):
    """
    ContentType representing a generic file.

    Determines if a file is a photo or video based on file extension.
    """

    VALID_PHOTO_EXTENSIONS = {ext.lower() for ext in Image.registered_extensions()}
    VALID_VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}

    @classmethod
    def get_content_type_from_filename(cls, file_name: str) -> t.Type[ContentType]:
        """
        Determines content type based on file extension.
        """
        file_extension = file_name.lower().rsplit('.', 1)[-1]
        file_extension = f".{file_extension}"

        if file_extension in cls.VALID_PHOTO_EXTENSIONS:
            return PhotoContent
        elif file_extension in cls.VALID_VIDEO_EXTENSIONS:
            return VideoContent
        else:
            return None
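One edge case in the extension parsing above: `rsplit('.', 1)[-1]` returns the whole name when there is no dot, so a bare file name such as `png` becomes `.png` and is classified as a photo. The sketch below shows one possible way to avoid that with `pathlib.PurePath.suffix`, and annotates the return type as optional since `None` is a valid result. It is not the PR's implementation: the three classes are stand-ins for the project's own, the function name is hypothetical, and the photo extension set is a hard-coded subset used only for illustration (the PR derives it from `Image.registered_extensions()`).

```python
import typing as t
from pathlib import PurePath


class ContentType:
    """Stand-in for the project's ContentType base class."""


class PhotoContent(ContentType):
    """Stand-in for the project's PhotoContent."""


class VideoContent(ContentType):
    """Stand-in for the project's VideoContent."""


# Assumption: hard-coded subset for illustration only.
PHOTO_EXTENSIONS = frozenset({".jpg", ".jpeg", ".png", ".gif"})
VIDEO_EXTENSIONS = frozenset({".mp4", ".avi", ".mov", ".mkv"})


def content_type_from_filename(file_name: str) -> t.Optional[t.Type[ContentType]]:
    """Map a file name to a content type, or None if the extension is unknown."""
    # PurePath.suffix is "" when the name has no extension, so a bare
    # name such as "png" is not mistaken for a ".png" file.
    suffix = PurePath(file_name).suffix.lower()
    if suffix in PHOTO_EXTENSIONS:
        return PhotoContent
    if suffix in VIDEO_EXTENSIONS:
        return VideoContent
    return None
```

Names with several dots, such as `movie.backup.mp4`, still resolve on the final suffix, matching the behaviour the PR's tests expect.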
@@ -0,0 +1,27 @@
import pytest
from threatexchange.content_type.photo import PhotoContent
from threatexchange.content_type.video import VideoContent
from threatexchange.content_type.file_content import FileContent

@pytest.mark.parametrize("file_name,expected_content_type", [
[Review comment] 👍
    ("file.jpg", PhotoContent),
    ("file.JPG", PhotoContent),
    ("file.mp4", VideoContent),
    ("file.MP4", VideoContent),
    ("archive.photo.png", PhotoContent),
    ("movie.backup.mp4", VideoContent),
])
def test_file_content_detection(file_name, expected_content_type):
    """
    Tests that FileContent correctly identifies the content type
    as either PhotoContent or VideoContent based on file extension.
    """
    content_type = FileContent.get_content_type_from_filename(file_name)
    assert content_type == expected_content_type, f"Failed for {file_name}"
[Review comment] You probably don't need to add a message - parameterize will do that for you.

def test_unknown_file_type():
[Review comment] nit: You can combine this into your previous test by just doing
    """
    Tests that an unknown file type returns None.
    """
    file_content = FileContent.get_content_type_from_filename("file.txt")
    assert file_content is None
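The two review comments on this test file point toward a single parametrized test with no custom assert message. The reviewer's own snippet is not preserved on this page, so the following is only one possible shape of that suggestion, not a reconstruction of it. To keep the sketch self-contained, `PhotoContent`, `VideoContent`, and `get_content_type_from_filename` are small stand-ins with hard-coded extensions rather than the real `threatexchange` imports.

```python
import typing as t

import pytest


class PhotoContent:
    """Stand-in for threatexchange's PhotoContent."""


class VideoContent:
    """Stand-in for threatexchange's VideoContent."""


def get_content_type_from_filename(file_name: str) -> t.Optional[type]:
    """Stand-in for FileContent.get_content_type_from_filename (assumed subset)."""
    extension = "." + file_name.lower().rsplit(".", 1)[-1]
    if extension in {".jpg", ".png"}:
        return PhotoContent
    if extension in {".mp4", ".avi", ".mov", ".mkv"}:
        return VideoContent
    return None


@pytest.mark.parametrize(
    "file_name,expected_content_type",
    [
        ("file.jpg", PhotoContent),
        ("file.JPG", PhotoContent),
        ("file.mp4", VideoContent),
        ("file.MP4", VideoContent),
        ("archive.photo.png", PhotoContent),
        ("movie.backup.mp4", VideoContent),
        ("file.txt", None),  # unknown extension folded into the same test
    ],
)
def test_file_content_detection(file_name, expected_content_type):
    # No assert message: pytest reports the parameters of the failing case.
    assert get_content_type_from_filename(file_name) is expected_content_type
```

Folding the `None` case into the parameter list removes the need for a separate `test_unknown_file_type` function while keeping each case reported individually.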
[Review comment] This is doing work during module import time, which is generally considered an anti-pattern, but I don't think it's worth changing.
[Review comment] Could you provide more details?
[Review comment] Could you tell me more details?
[Review comment] Here's the first result for me from Google for "doing work during module import time":
https://www.benkuhn.net/importtime/
As mentioned, I don't think it's worth changing.
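For readers wondering what avoiding import-time work could look like, here is a minimal sketch that defers the computation to first use with `functools.lru_cache`. The reviewer explicitly does not ask for this change, so it is illustrative only. `load_registered_extensions` and `LOAD_CALLS` are hypothetical stand-ins (the former for an expensive call such as PIL's `Image.registered_extensions()`), used so the example runs without PIL installed.

```python
import functools
import typing as t

# Records each time the (pretend) expensive lookup runs.
LOAD_CALLS: t.List[str] = []


def load_registered_extensions() -> t.Dict[str, str]:
    """Hypothetical stand-in for an expensive lookup done by a library."""
    LOAD_CALLS.append("loaded")
    return {".JPG": "JPEG", ".png": "PNG"}


@functools.lru_cache(maxsize=None)
def valid_photo_extensions() -> t.FrozenSet[str]:
    """Lower-cased photo extensions, computed on first call and then cached."""
    return frozenset(ext.lower() for ext in load_registered_extensions())


def is_photo_extension(extension: str) -> bool:
    # Nothing is loaded until this function is first called.
    return extension.lower() in valid_photo_extensions()
```

Importing a module written this way does no lookup at all; the first call to `is_photo_extension` pays the cost once, and later calls reuse the cached set.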