Iiif View of large tiffs does not work #848

Open
KatharineV opened this issue Oct 4, 2024 · 11 comments
Assignees
Labels
bug something isn't working high priority M1 Milestone 1

Comments

@KatharineV
Collaborator

Today I tried to upload a couple hundred images (JPEG and TIFF files) for a contributor. They have an alumni event next weekend that they need these images online for. The importer is taking waaaaaaay longer than usual to process works.
https://adl.b2.adventistdigitallibrary.org/importers/468?locale=en

I checked GoodJob and I saw some very weird jobs in the "Running" tab. The jobs cleared before I could take a screenshot. They were related to Derivative Rodeo jobs running on the image files. The messages suggested that the jobs were failing. I do believe something is wrong, because the works that have passed out of pending into "finished" on the importer are still not rendering their images in the UV.

https://adl.b2.adventistdigitallibrary.org/concern/generic_works/p024452_grand_ledge_academy_graduating_class_of_1984

The files are attached in the items list, but the UV is a black box with a spinning load icon.

Image

Thumbnails are not loading for most of the imported works.

https://adl.b2.adventistdigitallibrary.org/catalog?utf8=%E2%9C%93&locale=en&search_field=all_fields&q=

It has been almost an hour since the importer started.

And GoodJob just now generated these error messages for a new batch of works:

Image

@KatharineV KatharineV converted this from a draft issue Oct 4, 2024
@KatharineV KatharineV added the bug something isn't working label Oct 4, 2024
@KatharineV
Collaborator Author

I tested the same works and CSV in staging, with Valkyrie.

https://adl.s2.adventistdigitallibrary.org/importers/359?locale=en

The importer is pending and works are taking ages to import in Valkyrie as well.

The UV is spinning and black on staging too.

GoodJob for staging has error messages:
Image

Maybe there's something wrong with these S3 files or with the CSV? Either way, I really do need help to get these works into the repository and to avoid the problem happening again. So, thank you.

@ShanaLMoore
Contributor

 Error: Ldp::NotFound - <html> <head> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/> <title>Error 404 Not Found</title> </head> <body><h2>HTTP ERROR 404 Not Found</h2> <table> <tr><th>URI:</th><td>/rest/3929b765-8d25-48ab-986f-3ce4edf75a6b</td></tr> <tr><th>STATUS:</th><td>404</td></tr> <tr><th>MESSAGE:</th><td>Not Found</td></tr> <tr><th>SERVLET:</th><td>jersey-servlet</td></tr> </table> <hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 9.4.51.v20230217</a><hr/> </body> </html> 

for each importer:

image

@ShanaLMoore
Contributor

ShanaLMoore commented Oct 4, 2024

@KatharineV have you tried importing the CSV into staging? Are the results the same?

Could you attach the CSV please?

In the meantime I'm going to try restarting Fedora and re-running this importer.

Edit: my apologies, I overlooked the previous comment that had answers to my above questions.

@ShanaLMoore ShanaLMoore self-assigned this Oct 7, 2024
@ShanaLMoore ShanaLMoore moved this from Ready for Development to In Development in Adventist Knapsack Oct 7, 2024
@jillpe jillpe added the M1 Milestone 1 label Oct 7, 2024
@ShanaLMoore ShanaLMoore added high priority and removed M1 Milestone 1 labels Oct 7, 2024
@KatharineV
Collaborator Author

I just tried another importer that generated an error message that could be relevant.
https://adl.b2.adventistdigitallibrary.org/importers/433?locale=en
I tried to upload a zip file with PDFs and CSV metadata. A zip of this size never failed for size reasons before the move to Cloudflare, so it seems the Cloudflare service, necessary as it is, has some unintended consequences.
Image

I'm going to create a new importer for the same works but pulling the files from S3 rather than a zip.

@ShanaLMoore
Contributor

ShanaLMoore commented Oct 7, 2024

https://adl.b2.adventistdigitallibrary.org/importers/433?locale=en
@KatharineV
What does the CSV look like? The linked importer has our favorite error message:

Error: StandardError - Missing at least one required element, missing element(s) are: identifier

@ShanaLMoore ShanaLMoore added the blocked other work must be completed first label Oct 7, 2024
@KatharineV
Collaborator Author

@ShanaLMoore I worked around the 413 request entity too large error with importer 433. I modified the existing CSV and put different columns first (solved the "missing element" error) and I added the files to S3 and pulled them into the repo from there rather than a zip (solved the "413 request entity" error).

https://adl.b2.adventistdigitallibrary.org/importers/502?locale=en

Trial and error has shown me a few workarounds for the erroneous "Missing element" error. They don't always work, but here's what I try and sometimes it works:

  • I reorganize the columns in the CSV so something other than title or identifier is first. Those two columns can be marked as "missing" if Bulkrax reads them first (sometimes, not always, ugh).
  • I strip the metadata out of the CSV, paste it into Notepad, save it there, then reopen the Notepad CSV or TSV data in Excel and resave as a CSV.
  • I save as CSV UTF-8 or CSV without UTF-8, whichever option is the opposite of the file I tried first. Sometimes adding or removing UTF-8 will solve the problem.

None of these solutions appear to be consistently helpful, but that's probably because I'm solving for a variety of invisible errors and I don't have the bandwidth to document every single variable in the way that I should. Either way, signs point to Valkyrie and Version 6 solving some of these problems at least.
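One possible explanation for the first and third workarounds, not confirmed anywhere in this thread, is a UTF-8 byte order mark (BOM). Excel's "CSV UTF-8" option writes a BOM at the start of the file, and a reader that does not strip it sees the first header as `\ufeffidentifier` rather than `identifier`. The sketch below shows the effect and a BOM-tolerant way to read the header row. The `REQUIRED` tuple and both function names are hypothetical, and this is not Bulkrax's actual parsing code.

```python
import csv
import io

# Assumption: the two columns that get reported as "missing" in this thread.
REQUIRED = ("title", "identifier")


def read_headers(raw: bytes) -> list[str]:
    """Return the header row, decoding with utf-8-sig so a leading BOM is dropped."""
    text = raw.decode("utf-8-sig")
    reader = csv.reader(io.StringIO(text, newline=""))
    return [header.strip() for header in next(reader, [])]


def missing_required(headers, required=REQUIRED):
    """List the required column names that are absent from the header row."""
    return [name for name in required if name not in headers]


# A tiny CSV as Excel's "CSV UTF-8" option would save it: BOM first.
raw = "\ufeffidentifier,title\n1,Example\n".encode("utf-8")

# Naive decoding keeps the BOM glued to the first header.
naive_headers = raw.decode("utf-8").splitlines()[0].split(",")
print(missing_required(naive_headers))      # the first column looks "missing"
print(missing_required(read_headers(raw)))  # nothing missing once the BOM is stripped
```

If this is the cause, it would fit the observation that moving `title` or `identifier` out of the first column, or toggling the UTF-8 save option, sometimes clears the error.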

@ShanaLMoore
Contributor

@KatharineV So are you OK for this weekend's demo?

I really hate that you have to go so much out of your way to get the data in 😓 I too am looking forward to the brighter days and promise of Valkyrie.

@ShanaLMoore ShanaLMoore removed the blocked other work must be completed first label Oct 8, 2024
@ShanaLMoore ShanaLMoore removed their assignment Oct 8, 2024
@KatharineV
Collaborator Author

KatharineV commented Oct 15, 2024

See #856
Team, I'm adding to this ticket because we have a recurring error that feels related to the issues I first noticed with the importer at the top of the ticket. Large files are being rejected by the repository. This has only happened since the move to Cloudflare. I was able to work around the issue last week with the images from the importer, but in general we need to upload large files because we deliver high-resolution TIFFs to researchers around the world. Today, I have to work around Hyku and use a different service to deliver files to a user in France because the repo is rejecting the TIFFs I tried to upload.

The repo is rejecting all files over 150 MB, according to my student worker. That limit is far too small for our purposes. Can you help us fix this issue fairly urgently? It is a blocker for everything we try to do.

Work I needed to add a TIFF to:
https://adl.b2.adventistdigitallibrary.org/concern/generic_works/20213926_illustrations_of_miller_s_views_of_the_end_of_the_world_in_1843/

Error message and failure to upload:
Image

Dropbox link to the TIFF:
https://www.dropbox.com/scl/fo/mipr9ujtltqgxempmgdrp/APxFmQU3El5vlFC_pRX9zJI?rlkey=c2dz52x6zrowg6knytoe26vnw&st=c314ffy5&dl=0
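Until the upload limit is sorted out, a quick local check can flag which TIFFs are likely to be rejected before anyone tries to upload them. This is a minimal sketch: the 150 MB threshold is only the figure reported above, not a confirmed server or Cloudflare setting, and `find_oversized` and the folder name are hypothetical.

```python
from pathlib import Path

# Assumption: the limit reported in this thread, not a verified setting.
LIMIT_BYTES = 150 * 1024 * 1024


def find_oversized(directory, limit_bytes=LIMIT_BYTES, suffixes=(".tif", ".tiff")):
    """Return (file name, size in bytes) for each matching file larger than the limit."""
    hits = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            size = path.stat().st_size
            if size > limit_bytes:
                hits.append((path.name, size))
    return hits


if __name__ == "__main__":
    # "to_upload" is a placeholder folder name.
    for name, size in find_oversized("to_upload"):
        print(f"{name}: {size / (1024 * 1024):.1f} MB")
```

Files on the list could then be routed through S3 instead of a direct upload, as was done for importer 502.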

@orangewolf
Contributor

@KatharineV the upload issue, especially the solution for it, is separate. I've made a new ticket for it here: #856

@ShanaLMoore ShanaLMoore changed the title Image importer with splitting issues Iiif View of large tiffs does not work Oct 21, 2024
@ShanaLMoore ShanaLMoore added the M1 Milestone 1 label Oct 21, 2024
@orangewolf orangewolf moved this from In Development to Client Verification in Adventist Knapsack Oct 23, 2024
@orangewolf
Contributor

If we make this ticket only about the IIIF view of large TIFFs not working, then I think it is resolved. The timeout for big non-pyramidal TIFFs was set too low. I've expanded it. However, we may want to revisit these large TIFFs in the future and create much faster-loading pyramidal versions of them.
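For context on why pyramidal versions load faster: a pyramidal TIFF stores the image at several reduced resolutions, usually split into tiles, so a zoomed-out view can be served from a small level instead of decoding the full-resolution image. The sketch below only does the arithmetic. The 256-pixel tile size, the halving scheme, and the example dimensions are illustrative assumptions, not this site's actual settings.

```python
import math


def pyramid_levels(width, height, tile=256):
    """Return (width, height, tile_count) per level, halving until one tile covers the image."""
    levels = []
    w, h = width, height
    while True:
        tiles = math.ceil(w / tile) * math.ceil(h / tile)
        levels.append((w, h, tiles))
        if w <= tile and h <= tile:
            break
        w, h = max(1, math.ceil(w / 2)), max(1, math.ceil(h / 2))
    return levels


# Hypothetical scan dimensions, for illustration only.
for w, h, tiles in pyramid_levels(12000, 9000):
    print(f"{w} x {h}: {tiles} tile(s)")
```

With a non-pyramidal source the image server has only the top row of that list to work from, which fits the long response times the raised timeout is covering for.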

@orangewolf
Contributor

the upload issue persists and is its own ticket now

Projects
Status: Client Verification

4 participants