feat: add support for ARW, DNG, CR2 raw #140

Merged
36 changes: 34 additions & 2 deletions poetry.lock

1 change: 1 addition & 0 deletions pyproject.toml
@@ -32,6 +32,7 @@ torchvision = [
   { version = "==0.17.2+cpu", source = "pytorch-cpu", markers = "sys_platform == 'linux' and platform_machine != 'aarch64'" }
 ]
 tqdm = "^4.65.0"
+rawpy = "^0.23.2"
 
 [tool.poetry.group.dev.dependencies]
 pycodestyle = ">=2.7,<3.0"
5 changes: 5 additions & 0 deletions rclip/const.py
@@ -4,3 +4,8 @@
 IS_MACOS = sys.platform == 'darwin'
 IS_LINUX = sys.platform.startswith('linux')
 IS_WINDOWS = sys.platform == 'win32' or sys.platform == 'cygwin'
+
+# these images are always processed
+IMAGE_EXT = ["jpg", "jpeg", "png", "webp"]
+# RAW images are processed only if there is no processed image alongside it
+IMAGE_RAW_EXT = ["arw", "cr2"]
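
In rclip/main.py, this PR joins these two lists into a single case-insensitive filename pattern. The sketch below reproduces that idea in isolation. `is_supported` is a hypothetical helper introduced only for illustration, and the `re.escape` call is an extra precaution that is not part of the PR.

```python
import re

# Extension lists as introduced in rclip/const.py by this PR.
IMAGE_EXT = ["jpg", "jpeg", "png", "webp"]
IMAGE_RAW_EXT = ["arw", "cr2"]

# Join every extension into one alternation group; re.I makes the match
# case-insensitive. re.escape is an assumption of this sketch, not PR code.
pattern = "|".join(re.escape(ext) for ext in [*IMAGE_EXT, *IMAGE_RAW_EXT])
IMAGE_REGEX = re.compile(rf"^.+\.({pattern})$", re.I)


def is_supported(filename: str) -> bool:
    """Hypothetical helper: True if the filename ends with a listed extension."""
    return IMAGE_REGEX.match(filename) is not None
```

With the lists as shown, `is_supported("DSC001.ARW")` and `is_supported("photo.jpeg")` hold, while a `.dng` filename does not match because `dng` is not in either list.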
26 changes: 23 additions & 3 deletions rclip/main.py
Owner

@yurijmikhalevich yurijmikhalevich Oct 6, 2024

This is an amazing start 🙌 Thank you. Can you please add a test or two that includes all of the newly added image formats?

When adding test images, can you please pick or create RAW files that weigh as little as possible to keep the repo size small?

Contributor Author

I downloaded RAW images from the internet, but they are relatively large (about 10 to 40 MB). I also looked for conversion tools and tried converting the images with Python libraries, but that didn't work.

Adding images this large could increase the size of the tool. What do you think about adding images of this size to the test folder? @yurijmikhalevich

Owner

@abidkhan484, I think adding three 10 MB images to the repository is OK. We aren't bundling them in the distributions anyway.

But if we can create three small 100 × 100 px RAW images, that would be much better.

Also, when adding images, we should be mindful of their license. All of the images in the rclip test dir are the images I took myself 😄

Contributor Author

@yurijmikhalevich, I tried to resize and convert the images, but I failed, so it's difficult for me to add different RAW photos. I can download some RAW images from the internet, but they average about 40 MB each. I would really appreciate your suggestion.

Owner

@abidkhan484, let me see what I can do.

Owner

@abidkhan484, I am going to clean up and merge your branch, and then add tests in a separate PR.

I can't add it in this PR because GitHub and git-lfs don't let me push large images to your branch (-:

Owner

@abidkhan484, another important thing to do in this PR was to ensure that --preview works for RAW images. Check out this diff for the implementation: 46f2fc8.

Owner

Hm. Also, it doesn't actually support "DNG" files created by Lightroom. This is why tests are important 💭

Owner

@abidkhan484, merged! 🙌 Congrats on your first contribution to rclip, and thank you for the help 🔥

@@ -11,6 +11,7 @@
 from PIL import Image, ImageFile
 
 from rclip import db, fs, model
+from rclip.const import IMAGE_EXT, IMAGE_RAW_EXT
 from rclip.utils.preview import preview
 from rclip.utils.snap import check_snap_permissions, is_snap
 from rclip.utils import helpers
@@ -41,7 +42,7 @@ def is_image_meta_equal(image: db.Image, meta: ImageMeta) -> bool:
 
 class RClip:
   EXCLUDE_DIRS_DEFAULT = ['@eaDir', 'node_modules', '.git']
-  IMAGE_REGEX = re.compile(r'^.+\.(jpe?g|png|webp)$', re.I)
+  IMAGE_REGEX = re.compile(f'^.+\\.({"|".join([*IMAGE_EXT, *IMAGE_RAW_EXT])})$', re.I)
   DB_IMAGES_BEFORE_COMMIT = 50_000
 
   class SearchResult(NamedTuple):
@@ -67,7 +68,7 @@ def _index_files(self, filepaths: List[str], metas: List[ImageMeta]):
     filtered_paths: List[str] = []
     for path in filepaths:
       try:
-        image = Image.open(path)
+        image = helpers.read_image(path)
         images.append(image)
         filtered_paths.append(path)
       except PIL.UnidentifiedImageError as ex:
@@ -88,6 +89,18 @@ def _index_files(self, filepaths: List[str], metas: List[ImageMeta]):
         vector=vector.tobytes()
       ), commit=False)
 
+  def _does_processed_image_exist_for_raw(self, raw_path: str) -> bool:
+    """Check if there is a processed image alongside the raw one; doesn't support mixed-case extensions,
+    e.g. it won't detect the .JpG image, but will detect .jpg or .JPG"""
+
+    image_path = os.path.splitext(raw_path)[0]
+    for ext in IMAGE_EXT:
+      if os.path.isfile(image_path + "." + ext):
+        return True
+      if os.path.isfile(image_path + "." + ext.upper()):
+        return True
+    return False
+
   def ensure_index(self, directory: str):
     print(
       'checking images in the current directory for changes;'
@@ -113,7 +126,13 @@ def update_total_images(count: int):
       metas: List[ImageMeta] = []
       for entry in fs.walk(directory, self._exclude_dir_regex, self.IMAGE_REGEX):
         filepath = entry.path
-        image = self._db.get_image(filepath=filepath)
+
+        file_ext = helpers.get_file_extension(filepath)
+        if file_ext in IMAGE_RAW_EXT and self._does_processed_image_exist_for_raw(filepath):
+          images_processed += 1
+          pbar.update()
+          continue
+
         try:
           meta = get_image_meta(entry)
         except Exception as ex:
@@ -125,6 +144,7 @@ def update_total_images(count: int):
         images_processed += 1
         pbar.update()
 
+        image = self._db.get_image(filepath=filepath)
         if image and is_image_meta_equal(image, meta):
           self._db.remove_indexing_flag(filepath, commit=False)
           continue
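
The docstring of `_does_processed_image_exist_for_raw` notes that mixed-case extensions such as `.JpG` are not detected. As a possible alternative (not the approach taken in this PR), the sibling check could list the directory and compare extensions case-insensitively. This is a minimal sketch: `has_processed_sibling` is a hypothetical name, and the temporary files exist only to demonstrate the behavior.

```python
import os
import tempfile

# Processed-image extensions, mirroring IMAGE_EXT from rclip/const.py.
IMAGE_EXT = ["jpg", "jpeg", "png", "webp"]


def has_processed_sibling(raw_path: str) -> bool:
    """Hypothetical variant: True if a file with the same stem and a
    processed-image extension (in any letter case) sits next to raw_path."""
    directory = os.path.dirname(raw_path) or "."
    raw_name = os.path.basename(raw_path)
    stem = os.path.splitext(raw_name)[0]
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name == raw_name or not entry.is_file():
                continue
            entry_stem, entry_ext = os.path.splitext(entry.name)
            # Compare the extension without its dot, lower-cased.
            if entry_stem == stem and entry_ext[1:].lower() in IMAGE_EXT:
                return True
    return False


# Usage: a mixed-case .JpG sibling is found; a RAW file without one is not.
with tempfile.TemporaryDirectory() as tmp:
    raw = os.path.join(tmp, "IMG_1.arw")
    lonely_raw = os.path.join(tmp, "IMG_2.cr2")
    for name in (raw, lonely_raw, os.path.join(tmp, "IMG_1.JpG")):
        open(name, "w").close()
    found = has_processed_sibling(raw)
    missing = has_processed_sibling(lonely_raw)
```

The trade-off is cost: scanning the directory for every RAW file is slower than the fixed number of `os.path.isfile` calls in the PR, so a real implementation would likely cache the listing per directory.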
22 changes: 19 additions & 3 deletions rclip/utils/helpers.py
@@ -4,11 +4,13 @@
 import textwrap
 from PIL import Image, UnidentifiedImageError
 import re
+import numpy as np
+import rawpy
 import requests
 import sys
 from importlib.metadata import version
 
-from rclip.const import IS_LINUX, IS_MACOS, IS_WINDOWS
+from rclip.const import IMAGE_RAW_EXT, IS_LINUX, IS_MACOS, IS_WINDOWS
 
 
 MAX_DOWNLOAD_SIZE_BYTES = 50_000_000
@@ -186,15 +188,29 @@ def download_image(url: str) -> Image.Image:
   return img
 
 
+def get_file_extension(path: str) -> str:
+  return os.path.splitext(path)[1].lower()[1:]
+
+
+def read_raw_image_file(path: str):
+  raw = rawpy.imread(path)
+  rgb = raw.postprocess()
+  return Image.fromarray(np.array(rgb))
+
+
 def read_image(query: str) -> Image.Image:
   path = remove_prefix(query, 'file://')
   try:
-    img = Image.open(path)
+    file_ext = get_file_extension(path)
+    if file_ext in IMAGE_RAW_EXT:
+      image = read_raw_image_file(path)
+    else:
+      image = Image.open(path)
   except UnidentifiedImageError as e:
     # by default the filename on the UnidentifiedImageError is None
     e.filename = path
     raise e
-  return img
+  return image
 
 
 def is_http_url(path: str) -> bool:
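
The `read_image` change above picks a decoder by file extension, and its `except` clause handles only `UnidentifiedImageError`. The sketch below isolates the dispatch pattern and shows one way to funnel failures from any decoder into a single error type. It is an illustration under assumptions, not code from the PR: `UnreadableImageError` and `read_with` are hypothetical names, and plain callables stand in for `Image.open` and `read_raw_image_file` so the example runs without Pillow or rawpy.

```python
import os
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class UnreadableImageError(Exception):
    """Hypothetical single error type that callers can catch for any format."""


def get_file_extension(path: str) -> str:
    # Same approach as the helper in this PR: lower-cased extension, no dot.
    return os.path.splitext(path)[1].lower()[1:]


def read_with(
    loaders: Dict[str, Callable[[str], T]],
    default: Callable[[str], T],
    path: str,
) -> T:
    """Pick a loader by extension; convert any loader failure into one error type."""
    loader = loaders.get(get_file_extension(path), default)
    try:
        return loader(path)
    except Exception as ex:
        raise UnreadableImageError(f"cannot read image: {path}") from ex


# Usage with stand-in loaders (assumptions made for this example only).
loaders = {"arw": lambda p: "decoded as RAW", "cr2": lambda p: "decoded as RAW"}
raw_result = read_with(loaders, lambda p: "decoded by default loader", "IMG_1.ARW")
png_result = read_with(loaders, lambda p: "decoded by default loader", "IMG_1.png")
```

Wrapping decoder-specific exceptions this way would let an indexing loop skip an unreadable file with one `except` clause, regardless of which library failed to decode it.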
4 changes: 3 additions & 1 deletion rclip/utils/preview.py
@@ -3,6 +3,8 @@
 import os
 from PIL import Image
 
+from rclip.utils.helpers import read_image
+
 
 def _get_start_sequence():
   term_env_var = os.getenv('TERM')
@@ -19,7 +21,7 @@ def _get_end_sequence():
 
 
 def preview(filepath: str, img_height_px: int):
-  with Image.open(filepath) as img:
+  with read_image(filepath) as img:
     if img_height_px >= img.height:
       width_px, height_px = img.width, img.height
     else: