Stop storing FileError items from Kingfisher Collect in the database #366

Open
jpmckinney opened this issue Apr 20, 2022 · 2 comments
Labels
database Changes to the database (adding indices, renaming columns)
Milestone

Comments

@jpmckinney
Member

open-contracting/kingfisher-collect#917 (comment)

Since this version of Process stores the Scrapyd job ID, it's easy to use scrapy-log-analyzer to parse the log file itself. This avoids errors being introduced by Process, the network, etc.

@jpmckinney
Member Author

jpmckinney commented Jun 8, 2022

Hmm, the job ID is stored by the data registry. It is sent to create_collection from Collect's spider_opened callback, but the ID is not yet stored. See #341. (Done.)

@jpmckinney jpmckinney added refactor feature Relating to loading data from the web API or CLI command and removed refactor labels Jun 8, 2022
@jpmckinney jpmckinney added this to the Priority milestone Jun 8, 2022
@jpmckinney jpmckinney changed the title Stop storing collection errors in the database Stop storing FileError items from Kingfisher Collect in the database Jul 4, 2023
@jpmckinney jpmckinney modified the milestones: Database changes, Priority Apr 12, 2024
@jpmckinney
Member Author

jpmckinney commented Apr 12, 2024

Obviously, as part of this, we would also stop sending messages for file errors from Collect.

Edit: There is also some logic in collectionstatus that we can remove: .exclude(data__has_key="http_error")

As part of this, we can delete collection_note rows WHERE note LIKE 'Couldn''t download %'
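
A minimal sketch of that cleanup, for illustration only. It uses an in-memory SQLite database with a simplified, hypothetical schema: only the collection_note table name, the note column and the LIKE pattern come from this issue. The real database is not SQLite, so LIKE case-sensitivity and the actual columns may differ. Counting the matching rows first makes it easy to check the pattern before deleting.

```python
import sqlite3

# Hypothetical, simplified schema: the real collection_note table has other columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE collection_note (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany(
    "INSERT INTO collection_note (note) VALUES (?)",
    [
        ("Couldn't download https://example.com/1.json",),  # made-up sample note
        ("Some other note",),                               # made-up sample note
        ("Couldn't download https://example.com/2.json",),  # made-up sample note
    ],
)

# Passing the pattern as a parameter avoids escaping the apostrophe by hand
# (in a raw SQL literal it is written as 'Couldn''t download %').
pattern = "Couldn't download %"

# Dry run: count what would be deleted.
to_delete = conn.execute(
    "SELECT COUNT(*) FROM collection_note WHERE note LIKE ?", (pattern,)
).fetchone()[0]

# Delete inside a transaction; the context manager commits on success.
with conn:
    deleted = conn.execute(
        "DELETE FROM collection_note WHERE note LIKE ?", (pattern,)
    ).rowcount

remaining = [row[0] for row in conn.execute("SELECT note FROM collection_note ORDER BY id")]
```

Comparing to_delete with deleted is a cheap sanity check that nothing else matched between the dry run and the delete.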

@jpmckinney jpmckinney added database Changes to the database (adding indices, renaming columns) and removed feature Relating to loading data from the web API or CLI command labels Apr 17, 2024