Metadata API unusable and lacking features #74

Open
fredprodibi opened this issue May 1, 2020 · 4 comments
Labels
question Further information is requested

Comments

@fredprodibi

I'm trying to extract metadata from the attached image, but the results from net-vips are not usable in a production environment, and features for reading metadata are missing.

1. Let's take the Make example:
net-vips will return exif-ifd0-Make : "Canon (Canon, ASCII, 6 components, 6 bytes)"
so instead of Canon we now have to parse a string to extract the value.
A user-supplied description may contain something like "My description (very good description)"
and net-vips will return something like "My description (very good description) (My description (very good description), ASCII, xx components, xx bytes)", which would be very difficult to parse reliably when the input is not controlled.

So the output should be only the value. If you want to expose more details, you need to return a structured object with fields such as value, components, and length.

2. We cannot navigate or read anything inside IPTC or XMP.
Get("iptc-data") returns a raw byte[] array that we don't know what to do with. It should be something navigable that we can keep querying with GetFields or Get.

I use this image to test: libviptest.jpg

The expected result is the one exiftool produces with the following command:

exiftool.exe -Make -Model -Artist -ExposureTime -FNumber -ISO -DateTimeOriginal -CreateDate -LensModel -ObjectName -Keywords -By-line -Caption-Abstract -Rating -Label -Lens -DateCreated -Description -Title -Creator -Subject -HierarchicalSubject -DateTimeCreated -DigitalCreationDateTime -LensID -LensType -gps* -j libviptest.jpg


[{
  "SourceFile": "libviptest.jpg",
  "Make": "Canon",
  "Model": "Canon EOS 5DS R",
  "Artist": "Fred",
  "ExposureTime": "1/200",
  "FNumber": 6.3,
  "ISO": 100,
  "DateTimeOriginal": "2019:08:12 16:32:47",
  "CreateDate": "2019:08:12 16:32:47",
  "LensModel": "Canon EF 85mm f/1.2L II USM",
  "ObjectName": "Fashion in studio",
  "Keywords": ["girl","studio","jeans","photography"],
  "By-line": "Fred",
  "Caption-Abstract": "Girl in jeans standing in front of gray background",
  "Rating": 3,
  "Label": "Yellow",
  "Lens": "Canon EF 85mm f/1.2L II USM",
  "DateCreated": "2019:08:12 16:32:47",
  "Description": "Girl in jeans standing in front of gray background",
  "Title": "Fashion in studio",
  "Creator": "Fred",
  "Subject": ["girl","studio","jeans","photography"],
  "HierarchicalSubject": ["girl","studio","jeans","photography"],
  "DateTimeCreated": "2019:08:12 16:32:47",
  "DigitalCreationDateTime": "2019:08:12 16:32:47"
}]

And this code to test:

```csharp
using System.Text;
using NetVips;

using var test1Image = Image.NewFromFile("libviptest.jpg");

// Dump every string-valued metadata field as "name : value".
var exifs = new StringBuilder();
var fields = test1Image.GetFields();
foreach (var field in fields)
{
    var fieldValue = test1Image.Get(field);
    if (fieldValue is string) exifs.AppendLine($"{field} : {fieldValue}");
}

var exifsDump = exifs.ToString();
```

The result is:

format : uchar
coding : none
interpretation : srgb
filename : C:\Users\fred\source\tools\exiftool-11.81\libviptest.jpg
vips-loader : jpegload
jpeg-chroma-subsample : 4:4:4
resolution-unit : in
exif-ifd0-ImageDescription : Girl in jeans standing in front of gray background (Girl in jeans standing in front of gray background, ASCII, 51 components, 51 bytes)
exif-ifd0-Make : Canon (Canon, ASCII, 6 components, 6 bytes)
exif-ifd0-Model : Canon EOS 5DS R (Canon EOS 5DS R, ASCII, 16 components, 16 bytes)
exif-ifd0-XResolution : 300/1 (300, Rational, 1 components, 8 bytes)
exif-ifd0-YResolution : 300/1 (300, Rational, 1 components, 8 bytes)
exif-ifd0-ResolutionUnit : 2 (Inch, Short, 1 components, 2 bytes)
exif-ifd0-Software : Capture One 20 Windows (Capture One 20 Windows, ASCII, 23 components, 23 bytes)
exif-ifd0-DateTime : 2020:04:03 22:02:14 (2020:04:03 22:02:14, ASCII, 20 components, 20 bytes)
exif-ifd0-Artist : Fred (Fred, ASCII, 5 components, 5 bytes)
exif-ifd0-Copyright : Fred (Fred (Photographer) - [None] (Editor), ASCII, 5 components, 5 bytes)
exif-ifd0-Padding : 2060 bytes undefined data (2060 bytes undefined data, Undefined, 2060 components, 2060 bytes)
exif-ifd1-Compression : 6 (JPEG compression, Short, 1 components, 2 bytes)
exif-ifd1-XResolution : 300/1 (300, Rational, 1 components, 8 bytes)
exif-ifd1-YResolution : 300/1 (300, Rational, 1 components, 8 bytes)
exif-ifd1-ResolutionUnit : 2 (Inch, Short, 1 components, 2 bytes)
exif-ifd2-ExposureTime : 1/200 (1/200 sec., Rational, 1 components, 8 bytes)
exif-ifd2-FNumber : 63/10 (f/6.3, Rational, 1 components, 8 bytes)
exif-ifd2-ExposureProgram : 1 (Manual, Short, 1 components, 2 bytes)
exif-ifd2-ISOSpeedRatings : 100 (100, Short, 1 components, 2 bytes)
exif-ifd2-ExifVersion : Exif Version 2.3 (Exif Version 2.3, Undefined, 4 components, 4 bytes)
exif-ifd2-DateTimeOriginal : 2019:08:12 16:32:47 (2019:08:12 16:32:47, ASCII, 20 components, 20 bytes)
exif-ifd2-DateTimeDigitized : 2019:08:12 16:32:47 (2019:08:12 16:32:47, ASCII, 20 components, 20 bytes)
exif-ifd2-ShutterSpeedValue : 7643856/1000000 (7.64 EV (1/199 sec.), SRational, 1 components, 8 bytes)
exif-ifd2-ApertureValue : 5310704/1000000 (5.31 EV (f/6.3), Rational, 1 components, 8 bytes)
exif-ifd2-ExposureBiasValue : 0/1 (0.00 EV, SRational, 1 components, 8 bytes)
exif-ifd2-SubjectDistance : 0/1 (0.0 m, Rational, 1 components, 8 bytes)
exif-ifd2-MeteringMode : 5 (Pattern, Short, 1 components, 2 bytes)
exif-ifd2-Flash : 16 (Flash did not fire, compulsory flash mode, Short, 1 components, 2 bytes)
exif-ifd2-FocalLength : 85/1 (85.0 mm, Rational, 1 components, 8 bytes)
exif-ifd2-SubSecTimeOriginal : 32 (32, ASCII, 3 components, 3 bytes)
exif-ifd2-SubSecTimeDigitized : 32 (32, ASCII, 3 components, 3 bytes)
exif-ifd2-PixelXDimension : 1365 (1365, Short, 1 components, 2 bytes)
exif-ifd2-PixelYDimension : 2048 (2048, Short, 1 components, 2 bytes)
exif-ifd2-FocalPlaneXResolution : 196336816/32768 (5991.72412, Rational, 1 components, 8 bytes)
exif-ifd2-FocalPlaneYResolution : 196675920/32768 (6002.07275, Rational, 1 components, 8 bytes)
exif-ifd2-FocalPlaneResolutionUnit : 2 (Inch, Short, 1 components, 2 bytes)
exif-ifd2-FileSource : DSC (DSC, Undefined, 1 components, 1 bytes)
exif-ifd2-SceneType : Directly photographed (Directly photographed, Undefined, 1 components, 1 bytes)
exif-ifd2-CustomRendered : 0 (Normal process, Short, 1 components, 2 bytes)
exif-ifd2-ExposureMode : 1 (Manual exposure, Short, 1 components, 2 bytes)
exif-ifd2-WhiteBalance : 1 (Manual white balance, Short, 1 components, 2 bytes)
exif-ifd2-SceneCaptureType : 0 (Standard, Short, 1 components, 2 bytes)
exif-ifd2-Padding : 2060 bytes undefined data (2060 bytes undefined data, Undefined, 2060 components, 2060 bytes)
exif-ifd2-FlashPixVersion : FlashPix Version 1.0 (FlashPix Version 1.0, Undefined, 4 components, 4 bytes)
exif-ifd2-ColorSpace : 65535 (Uncalibrated, Short, 1 components, 2 bytes)

I think it would be easier if this API allowed querying by tag name, like exiftool, or at least by path, and maybe provided a TryGet method so we can check whether something exists without having to catch exceptions when it is not found. The string output of the current implementation makes it completely unusable.

@kleisauke kleisauke added the question Further information is requested label May 1, 2020
@kleisauke
Owner

Hi @fredprodibi,

> net-vips will return exif-ifd0-Make : "Canon (Canon, ASCII, 6 components, 6 bytes)"
> so instead of Canon we now have to parse a string to extract the value.
> A user-supplied description may contain something like "My description (very good description)"
> and net-vips will return something like "My description (very good description) (My description (very good description), ASCII, xx components, xx bytes)", which would be very difficult to parse reliably when the input is not controlled.

The final " (xx, yy, zz, kk)" part of the string (if present) is added by libvips so that it can reconstruct EXIF values during saving. You could use logic similar to what libvips itself uses to remove this tail. Or you could try parsing the image's raw EXIF data (provided by image.Get("exif-data");) with a separate library.

@jcupitt Is there any way to read EXIF values without this tail? I remember we had the same problem in the test suite.

> Get("iptc-data") returns a raw byte[] array that we don't know what to do with. It should be something navigable that we can keep querying with GetFields or Get.

image.Get("iptc-data"); will return a buffer containing raw IPTC data (if present). libexif, the library that libvips uses, only provides support for EXIF data. The other two (IPTC and XMP) are therefore not parsed by libvips. There was a request to replace libexif with Exiv2 (which GIMP uses for metadata), but it's not possible for a variety of reasons, see libvips/libvips#453.
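Since XMP is an XML packet, one way to make the raw buffer navigable on the client side is to hand it to an XML parser. The following is only a rough sketch, not part of NetVips: the `XmpReader` class and `ReadSubjects` method are hypothetical names, and it assumes the buffer (for example the one stored under the libvips `xmp-data` field) is a UTF-8 encoded XMP packet that uses the standard RDF and Dublin Core namespaces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: reads the dc:subject keywords out of a raw XMP buffer.
public static class XmpReader
{
    private static readonly XNamespace Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    private static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";

    public static List<string> ReadSubjects(byte[] xmp)
    {
        // Assumption: the packet is UTF-8. Strip a leading BOM and any
        // trailing NUL/whitespace padding before parsing.
        string xml = Encoding.UTF8.GetString(xmp)
            .TrimStart('\uFEFF')
            .TrimEnd('\0', ' ', '\t', '\r', '\n');

        XDocument doc = XDocument.Parse(xml);

        // dc:subject holds an rdf:Bag of rdf:li entries, one per keyword.
        return doc.Descendants(Dc + "subject")
                  .Descendants(Rdf + "li")
                  .Select(li => li.Value)
                  .ToList();
    }
}
```

With NetVips this would presumably be called as `XmpReader.ReadSubjects((byte[])image.Get("xmp-data"))`, guarded by `image.Contains("xmp-data")`. IPTC is a binary format and would need its own parser.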

> I think it would be easier if this API allowed querying by tag name, like exiftool, or at least by path

I personally think this is outside the scope of libvips. Can't you use exiftool for that purpose?

> provide a TryGet method so we can check whether something exists without having to catch exceptions when it is not found.

To check whether the image contains a given metadata property, you could use:

```csharp
var image = Image.NewFromFile("zebra.jpg", access: Enums.Access.Sequential);
if (image.Contains("exif-ifd0-Orientation"))
{
    ...
}
```
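A TryGet-style helper can be layered on top of this Contains/Get pair. The sketch below is an illustration only: `IMetadataSource`, `DictionaryMetadata` and `TryGet` are hypothetical names that do not exist in NetVips. The interface just mirrors the `Contains(name)` and `Get(name)` calls shown in this thread so that the example runs without an image file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction mirroring the Contains/Get calls used above.
public interface IMetadataSource
{
    bool Contains(string name);
    object Get(string name);
}

public static class MetadataExtensions
{
    // Returns true and sets value only if the field exists AND has type T.
    public static bool TryGet<T>(this IMetadataSource source, string name, out T value)
    {
        if (source.Contains(name) && source.Get(name) is T typed)
        {
            value = typed;
            return true;
        }

        value = default;
        return false;
    }
}

// In-memory stand-in for an image, used here only to exercise the helper.
public sealed class DictionaryMetadata : IMetadataSource
{
    private readonly Dictionary<string, object> _fields;

    public DictionaryMetadata(Dictionary<string, object> fields)
    {
        _fields = fields;
    }

    public bool Contains(string name) => _fields.ContainsKey(name);

    public object Get(string name) => _fields[name];
}
```

The same body could be written as an extension method on the image type directly, with `image.Contains(name)` and `image.Get(name)` in place of the interface calls.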

@jcupitt
Contributor

jcupitt commented May 1, 2020

From memory, the problem with exif data values is that they are typed.

Each field (orientation, resolution, etc.) has an associated type (int, rational, etc.), and the string value must be reconstructible, i.e. you have to be able to go EXIF binary data -> string -> EXIF binary data and get exactly the same thing back.

This means something like exif-ifd2-ShutterSpeedValue must be represented as 7643856/1000000, which is just awful for humans to work with, so libvips attaches an (explanation) at the end, (7.64 EV (1/199 sec.), SRational, 1 components, 8 bytes) in this case.

I suppose an alternative would be to add GType objects for Rational, SRational, Int etc. and not have string-valued EXIF fields, but that would still push a lot of complexity down to clients, just different complexity.
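To give an idea of the client-side complexity involved, here is a minimal sketch of turning the round-trippable rational string into a number. `ExifRational` is a hypothetical helper, not a libvips or NetVips type. The conversion of ShutterSpeedValue assumes the usual APEX convention, where the exposure time in seconds is 2 raised to the negative of the stored value.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for the "numerator/denominator" strings shown above.
public static class ExifRational
{
    // Parses e.g. "63/10" into 6.3. Works for signed values too (SRational).
    public static double Parse(string text)
    {
        string[] parts = text.Split('/');
        if (parts.Length != 2)
            throw new FormatException("Expected numerator/denominator: " + text);

        long numerator = long.Parse(parts[0], CultureInfo.InvariantCulture);
        long denominator = long.Parse(parts[1], CultureInfo.InvariantCulture);

        // A zero denominator cannot be evaluated; report it as NaN.
        if (denominator == 0)
            return double.NaN;

        return (double)numerator / denominator;
    }

    // Assumption: APEX convention, exposure time = 2^(-Tv) seconds.
    public static double ShutterSpeedToSeconds(double apexValue)
    {
        return Math.Pow(2.0, -apexValue);
    }
}
```

For the value in this thread, `ExifRational.ShutterSpeedToSeconds(ExifRational.Parse("7643856/1000000"))` comes out at roughly 1/200 of a second, in line with the ExposureTime field.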

@jcupitt
Contributor

jcupitt commented May 1, 2020

Another idea might be to output the EXIF tags in two forms, perhaps:

exif-ifd2-ShutterSpeedValue: 7643856/1000000
exif-ifd2-human-ShutterSpeedValue: 7.64 EV (1/199 sec.), SRational, 1 components, 8 bytes

The human (or -info?) versions are written by libvips but never read, so there would be no need to parse them back. Then I suppose the problem would be that the two tags would get out of sync.

@fredprodibi
Author

> You could use logic similar to what libvips itself uses to remove this tail.

Yes, but the issue is that if the user has metadata containing ( or ), it may return wrong results, so it's not very robust.

> To check whether the image contains a given metadata property, you could use

Thank you :)

> Can't you use exiftool for that purpose?

Yes, I can, and that's what I do right now, but it means launching an additional process just to fetch the metadata, so Get("xxx") looked like a good solution.

> EXIF data (provided by image.Get("exif-data");) with a separate library

Yes, if it's not easier in libvips then we will have to go back to exiftool.

> Another idea might be to output the EXIF tags in two forms

That would help, but I think the best would be to output a class that would give you:

  1. the value
  2. the human-friendly value
  3. the type
  4. the number of components
  5. the size

The ToString() method would return the string as it is right now.
Or even a single string in a structured format like JSON: exif-ifd2-ShutterSpeedValue would be something like { value: "7643856/1000000", friendlyValue : "7.64 EV (1/199 sec.)", dataType: "SRational", components: 1, length: 8 }
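Until something like that exists upstream, a structure of this shape can be approximated on the client side by parsing the current string. The sketch below is only a heuristic and `ExifField` is a hypothetical class. It assumes the tail always ends with ", <type>, <n> components, <m> bytes)" as in the dump above, and that for ASCII fields the raw value is exactly components - 1 characters long (the count includes the terminating NUL). That makes user-supplied parentheses harmless for plain ASCII text. For other types it splits at the first " (", since the raw values shown in this thread do not contain one.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical structured view of a libvips EXIF string such as
// "Canon (Canon, ASCII, 6 components, 6 bytes)".
public sealed class ExifField
{
    private static readonly Regex Suffix = new Regex(
        @", (?<type>[A-Za-z]+), (?<n>\d{1,9}) components, (?<len>\d{1,9}) bytes\)$",
        RegexOptions.CultureInvariant);

    public string Value { get; }
    public string FriendlyValue { get; }
    public string DataType { get; }
    public int Components { get; }
    public int Length { get; }

    private ExifField(string value, string friendly, string type, int components, int length)
    {
        Value = value;
        FriendlyValue = friendly;
        DataType = type;
        Components = components;
        Length = length;
    }

    public static bool TryParse(string raw, out ExifField field)
    {
        field = null;
        if (raw == null) return false;

        Match m = Suffix.Match(raw);
        if (!m.Success) return false;

        string type = m.Groups["type"].Value;
        int components = int.Parse(m.Groups["n"].Value, CultureInfo.InvariantCulture);
        int length = int.Parse(m.Groups["len"].Value, CultureInfo.InvariantCulture);

        // Everything before the suffix: "<value> (<friendly value>".
        string head = raw.Substring(0, m.Index);

        int split = -1;
        if (type == "ASCII")
        {
            // Assumption: value length is components - 1. This holds for
            // single-byte text; otherwise fall through to the search below.
            int candidate = components - 1;
            if (candidate >= 0 && candidate + 2 <= head.Length &&
                head[candidate] == ' ' && head[candidate + 1] == '(')
            {
                split = candidate;
            }
        }

        if (split < 0) split = head.IndexOf(" (", StringComparison.Ordinal);
        if (split < 0) return false;

        field = new ExifField(head.Substring(0, split), head.Substring(split + 2),
                              type, components, length);
        return true;
    }
}
```

For the shutter speed string this yields Value "7643856/1000000", FriendlyValue "7.64 EV (1/199 sec.)", DataType "SRational", Components 1 and Length 8. Strings without the tail make TryParse return false, so they can be used as-is.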

3 participants