Ingest updates #222
Merged
Simplify the `files_to_upload` config into a single mapping where the key is the remote file name and the value is the local file, instead of maintaining two lists of files. This ensures that we know exactly which local file is uploaded to which remote file without worrying about order or duplicates.
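A minimal sketch of how such a mapping can be consumed, written in Python. The local paths, the bucket name, and the `upload_pairs` helper are assumptions for illustration only; they are not taken from the workflow's actual config or code.

```python
# Hypothetical config values: remote file name -> local file.
files_to_upload = {
    "metadata.tsv.gz": "results/metadata.tsv",
    "sequences.fasta.xz": "results/sequences.fasta",
}


def upload_pairs(mapping, remote_prefix):
    """Return (local_path, remote_url) pairs, one per mapping entry."""
    prefix = remote_prefix.rstrip("/")
    return [(local, f"{prefix}/{remote}") for remote, local in mapping.items()]


# "s3://example-bucket/..." is a placeholder destination, not the real bucket.
pairs = upload_pairs(files_to_upload, "s3://example-bucket/files/workflows/mpox/")
for local, remote in pairs:
    print(f"{local} -> {remote}")
```

Because each remote name is a dictionary key, a remote file cannot be listed twice, and each local file is stated right next to its destination rather than matched up by list position.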
Change the expectation so that the local file paths in `files_to_upload` must be relative to the ingest directory instead of the ingest/data directory. This is done in preparation for moving the final outputs of the ingest workflow to an ingest/results directory.
Instead of mixing the final results with the intermediate files produced during the workflow run, output the final files to the results directory.
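The path convention described in these two commits could be sketched as follows. This is an illustration under assumptions: the helper names and the example file names are hypothetical, and only the directory names (ingest, data, results) come from the commit messages.

```python
from pathlib import PurePosixPath

# Assumed location of the ingest workflow within the repository.
INGEST_DIR = PurePosixPath("ingest")


def resolve_local_path(path_in_config):
    """Interpret a configured local path as relative to the ingest directory."""
    return INGEST_DIR / path_in_config


def is_final_output(path_in_config):
    """True when the configured path sits under the results directory."""
    return PurePosixPath(path_in_config).parts[:1] == ("results",)


print(resolve_local_path("results/metadata.tsv"))
print(is_final_output("results/metadata.tsv"))
print(is_final_output("data/intermediate.tsv"))
```

With paths relative to the ingest directory, the same config entry can point at either the data or the results directory, so the move of final outputs only changes the configured values.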
for more information, see https://pre-commit.ci
j23414 reviewed and approved these changes on Nov 6, 2023
I particularly like the switch to the key:value style of listing remote and local files, thanks for the changes!
As far as I can tell, the test ran to completion, and I checked s3:
nextstrain remote ls s3://nextstrain-data/files/workflows/mpox/branch/ingest-updates
files/workflows/mpox/branch/ingest-updates/alignment.fasta.xz
files/workflows/mpox/branch/ingest-updates/all_sequences.ndjson.xz
files/workflows/mpox/branch/ingest-updates/genbank.ndjson.xz
files/workflows/mpox/branch/ingest-updates/insertions.csv.gz
files/workflows/mpox/branch/ingest-updates/metadata.tsv.gz
files/workflows/mpox/branch/ingest-updates/sequences.fasta.xz
files/workflows/mpox/branch/ingest-updates/translations.zip
I downloaded the metadata and sequences files and they weren’t empty and look reasonable.
joverlee521 added a commit that referenced this pull request on Nov 15, 2023
Fix bug that was introduced in #222. The testing done on branch runs does not include the Slack notifications, so this bug was not revealed in the test run. I only noticed that the workflow was failing because our internal #monkeypox-updates channel had been quiet for over a week. This did not trigger our usual error notifications in Slack because the error occurs during the DAG building process, before the start of the actual workflow run.
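The failure mode described here can be illustrated with a toy model in Python. This is not the workflow engine's real implementation: `run_workflow`, `broken_build`, and the config key are hypothetical names, chosen only to show why an error raised while building the DAG never reaches an error hook that guards the run phase.

```python
def run_workflow(build_dag, run_jobs, on_error):
    """Toy runner: on_error only guards the job-execution phase."""
    dag = build_dag()  # a failure here propagates without calling on_error
    try:
        run_jobs(dag)
    except Exception as err:
        on_error(err)
        raise


notifications = []


def broken_build():
    # Stand-in for an error raised while the DAG is being assembled.
    raise KeyError("some_config_key")


try:
    run_workflow(broken_build, lambda dag: None, notifications.append)
except KeyError:
    pass

# The error hook was never invoked, so no notification was recorded.
print(len(notifications))
```

In this model, the only way to notice a build-time failure is from outside the runner, which matches how the bug was found: by the absence of the usual messages.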
joverlee521 added a commit that referenced this pull request on Nov 15, 2023
This change was motivated by the unintentional bug introduced in #222 that would only be triggered by using Slack notifications. This allows us to test branches and send notifications to the testing channel. As part of this change, I've added an organization-level variable `TEST_SLACK_CHANNEL` that points to our #scratch channel for testing Slack notifications.
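One way the channel selection could look, sketched in Python. Only `TEST_SLACK_CHANNEL` is named in the commit message; the `SLACK_CHANNELS` key, the `slack_channel` helper, and the example values are assumptions for illustration.

```python
def slack_channel(env, is_test_run):
    """Pick the Slack channel for notifications.

    Test runs (for example, branch runs) go to the testing channel so that
    the notification code path is exercised without posting to the
    production channel.
    """
    if is_test_run:
        return env["TEST_SLACK_CHANNEL"]
    # Hypothetical name for the production channel setting.
    return env["SLACK_CHANNELS"]


# Example values; real values would come from the CI environment.
example_env = {
    "TEST_SLACK_CHANNEL": "scratch",
    "SLACK_CHANNELS": "monkeypox-updates",
}
print(slack_channel(example_env, is_test_run=True))
print(slack_channel(example_env, is_test_run=False))
```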
Description of proposed changes
Minor refactoring and updates to the ingest workflow that came from review of nextstrain/dengue#13, mainly:
- `files_to_upload` mapping
- `results/` directory

Checklist