Autocorrect misidentified Files from remote sources after Exif/pronom extraction #278
Labels: Configuration, Danger Mr Robinson, Digital Preservation, enhancement, Events and Subscriber, JSON Postprocessors, question
What?
I should have come up with this before but here I am, just a boy standing in front of an `application/octet-stream` knowing it is a `tiff` (but still does not love me?)
How?
Especially when dealing with remote Files and ill-configured HTTP servers, we have ended up with Files being ingested via AMI and identified/routed to `as:document` bc the headers were absent and we did not even have an extension. But once persisted, saved and exif/pronom etc. kicked in, we could get the real format from inside (and more precisely than we ever could have by just fetching and downloading). And all this wonderful tech metadata is there and stored. The issue is that the Drupal File entity is already created and the file is in its final position (probably in S3://, using either just a dot with no extension or something like `.bin`).
But we have 3 last minute "signs", things we can act on (in the analogy of the boy standing in front, let's say these are orange flowers and chocolate covered cherries as response to a smile).
- `application/octet-stream` at the `dr:mimetype` level
- `flv:exif` && `pronom` inferred from signature mimetype are telling us a different story
Based on that we could kick off a "save the night and dance at least once" action that under these conditions makes a last attempt: deduces the right extension, renames the file name and the S3 file path (cheaper than deleting/re-ingesting), edits the File entity adding the real exif, changing the URL to the source (same size, same checksum), and moves the file to its right place under `as:image` or `as:audio`, who knows.
Now the question: should this be a setting? Or are the "signs" enough to try one more time and get flowers, at midnight, at the gas station (or from someone's front yard)?
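To make the proposal concrete, here is a rough Python sketch of the decision logic only (check the signs, deduce the extension, compute the new path and `as:` key). It is not Drupal or Archipelago code. The nested key names (`MIMEType` inside `flv:exif`, a `flv:pronom` entry with a `mimetype` field), the `url` field, the extension table, and all function names are assumptions made up for illustration.

```python
GENERIC = "application/octet-stream"

# Illustrative only: a tiny mimetype -> extension table (assumption).
EXTENSIONS = {"image/tiff": "tif", "image/jpeg": "jpg", "audio/mpeg": "mp3"}

# Hypothetical routing from top-level media type to the as:* key.
AS_KEYS = {"image": "as:image", "audio": "as:audio", "video": "as:video"}

# Extensions treated as "no real extension": a bare trailing dot or .bin.
PLACEHOLDER_EXTENSIONS = {"", "bin"}


def detected_mimetype(file_info):
    """Return the mimetype the signs agree on, or None if nothing to fix."""
    declared = file_info.get("dr:mimetype")
    if declared and declared != GENERIC:
        return None  # the declared type is already specific
    exif_mime = (file_info.get("flv:exif") or {}).get("MIMEType")
    pronom_mime = (file_info.get("flv:pronom") or {}).get("mimetype")
    # Conservative choice: act only when exif and pronom agree.
    if exif_mime and exif_mime == pronom_mime and exif_mime != GENERIC:
        return exif_mime
    return None


def corrected_path(path, extension):
    """Swap a placeholder extension for the deduced one, or append it."""
    head, sep, name = path.rpartition("/")
    stem, dot, old_ext = name.rpartition(".")
    if not dot or old_ext.lower() not in PLACEHOLDER_EXTENSIONS:
        stem = name  # no dot, or a meaningful suffix: keep the full name
    return f"{head}{sep}{stem}.{extension}"


def plan_correction(file_info):
    """Describe the rename/re-route to perform, or None to leave as is."""
    mime = detected_mimetype(file_info)
    if mime is None or mime not in EXTENSIONS:
        return None
    return {
        "dr:mimetype": mime,
        "url": corrected_path(file_info["url"], EXTENSIONS[mime]),
        "as_key": AS_KEYS.get(mime.split("/")[0], "as:document"),
    }
```

Requiring exif and pronom to agree is one possible answer to the "setting or not" question: when both point to the same format the signs are probably strong enough on their own, and a setting could still gate whether the resulting plan is applied on save.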
@alliomeria ping. This would be just great. Bc we could re-process (simply save) ADOs that have this issue instead of patching?