write cron job to find resources with NULL images #255

Open
elrayle opened this issue Mar 12, 2020 · 0 comments

Description

Some images fail to upload for a variety of reasons. When this happens, a resource and a featured image are created in the database, but the featured image does not have a physical file in S3. Once this state occurs, the exhibit that holds the resource cannot be reindexed. The reindexing job fails in Sidekiq with the error Riiif::ImageNotFoundError: unable to find file for 133 (where 133 is the ID of the image in the spotlight_featured_images table).

Early Detection

Write a cron job that lists any resource/featured_image combination where it is likely the image is missing.

SQL to find resources with featured images where the image is missing

select id, exhibit_id, type, upload_id from spotlight_resources where upload_id IS NULL;

+------+------------+------------------------------+-----------+
| id   | exhibit_id | type                         | upload_id |
+------+------------+------------------------------+-----------+
| 2053 |         28 | Spotlight::Resources::Upload |      NULL |
| 2162 |         36 | Spotlight::Resources::Upload |      NULL |
| 2163 |         36 | Spotlight::Resources::Upload |      NULL |
| 2165 |         36 | Spotlight::Resources::Upload |      NULL |
| 2166 |         36 | Spotlight::Resources::Upload |      NULL |
| 2361 |         44 | Spotlight::Resources::Upload |      NULL |
| 2362 |         44 | Spotlight::Resources::Upload |      NULL |
| 2374 |         44 | Spotlight::Resources::Upload |      NULL |
| 2376 |         43 | Spotlight::Resources::Upload |      NULL |
+------+------------+------------------------------+-----------+
9 rows in set (0.00 sec)

To execute this in Rails...

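# Print a MySQL-style table of resources that have no upload_id.
border = "+------+------------+------------------------------+-----------+"
puts border
puts "| id   | exhibit_id | type                         | upload_id |"
puts border

select_stmt = "select id, exhibit_id, type, upload_id from spotlight_resources where upload_id IS NULL;"
# select_rows returns each row as an array, whichever database adapter is in use.
results = ActiveRecord::Base.connection.select_rows(select_stmt)
# Show nil as NULL and left-justify type at its full 28-character width.
results.each { |r| puts "| %4d | %10d | %-28s | %9s |" % r.map { |v| v.nil? ? "NULL" : v } }
puts border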
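
For a cron job, it may be easier to build the report as a string so it can be printed, logged, or mailed. The following is a minimal sketch in plain Ruby, not the project's implementation. The method name null_upload_report and the two constants are hypothetical, and each row is assumed to be an array of [id, exhibit_id, type, upload_id], as the query above selects.

```ruby
# Hypothetical helper: formats rows of [id, exhibit_id, type, upload_id]
# as a MySQL-style text table. Names are illustrative, not from the project.
REPORT_BORDER = "+------+------------+------------------------------+-----------+"
REPORT_HEADER = "| id   | exhibit_id | type                         | upload_id |"

def null_upload_report(rows)
  lines = [REPORT_BORDER, REPORT_HEADER, REPORT_BORDER]
  rows.each do |row|
    # Render nil (SQL NULL) as the text "NULL" so the column is never blank.
    values = row.map { |v| v.nil? ? "NULL" : v }
    lines << format("| %4d | %10d | %-28s | %9s |", *values)
  end
  lines << REPORT_BORDER
  lines.join("\n")
end

# Sample rows copied from the query output above.
sample = [
  [2053, 28, "Spotlight::Resources::Upload", nil],
  [2162, 36, "Spotlight::Resources::Upload", nil]
]
puts null_upload_report(sample)
```

In a Rails environment, the rows argument could come from the select statement shown above. How the job is scheduled (rake task, runner script, crontab entry) is left open here.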

This gives a list of potential problem resources. Each should be investigated to determine if the resource should be deleted along with its associated database objects. See issue #192 for more information on what and how to delete.
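
Because reindexing fails per exhibit, it may help to group the flagged resources by exhibit before investigating. This is a small sketch under the same assumption about row shape ([id, exhibit_id, type, upload_id]). The method name resource_ids_by_exhibit is hypothetical.

```ruby
# Hypothetical helper: maps exhibit_id => ids of resources with no upload.
# Assumes each row is [id, exhibit_id, type, upload_id].
def resource_ids_by_exhibit(rows)
  rows.group_by { |row| row[1] }
      .transform_values { |group| group.map(&:first) }
end

# Sample rows copied from the query output above.
rows = [
  [2053, 28, "Spotlight::Resources::Upload", nil],
  [2162, 36, "Spotlight::Resources::Upload", nil],
  [2163, 36, "Spotlight::Resources::Upload", nil],
  [2376, 43, "Spotlight::Resources::Upload", nil]
]

resource_ids_by_exhibit(rows).each do |exhibit_id, ids|
  puts "exhibit #{exhibit_id}: #{ids.size} resource(s): #{ids.join(', ')}"
end
```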

Related Work

Issue #192 Reindexing fails with... ActiveRecord::RecordNotFound: Couldn't find Spotlight::FeaturedImage without an ID
