Fixed genome only option not working #57

Merged: 10 commits, Nov 6, 2024

FernandoDuarteF
Collaborator

Fixed --genome_only option not working.

@chriswyatt1
Contributor

Great, OK, let's wait for our first review and then push it. I will test it now.

@FernandoDuarteF
Collaborator Author

FernandoDuarteF commented Nov 5, 2024

As mentioned in #53:

  • Fixed decompression of fasta and gff files, which was not working, as done in decontamination workflow #72.
  • Improved readability by making explicit what each item is in the mapping functions.
  • Fixed sync problems:
    • Changed CREATE_PATH so that it now outputs a single tuple.
    • Combined the fasta and gff channels into a single channel, then split it into two separate channels using the multiMap operator. Did the same for the NCBIGENOMEDOWNLOAD input.
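
A rough way to picture the single-channel idea, written in plain Python rather than the pipeline's Nextflow code. The sample names, file names, and the `multi_map` helper are hypothetical, and the sketch assumes the sync problem came from fasta and gff travelling in separate channels that could fall out of step. Here each record keeps a sample's files together, and both outputs are built from the same record in the same pass, so the two outputs stay aligned by position.

```python
# Hypothetical records: one tuple per sample, as a combined channel might carry them.
records = [
    ("sample_a", "a.fasta", "a.gff"),
    ("sample_b", "b.fasta", "b.gff"),
    ("sample_c", "c.fasta", "c.gff"),
]


def multi_map(items):
    """Split combined records into two lists that stay aligned by position.

    Loosely mirrors the idea behind Nextflow's multiMap operator: each input
    item is visited once and contributes exactly one entry to every output.
    """
    fasta_out, gff_out = [], []
    for sample_id, fasta, gff in items:
        fasta_out.append((sample_id, fasta))
        gff_out.append((sample_id, gff))
    return fasta_out, gff_out


fasta_ch, gff_ch = multi_map(records)

# Both outputs list the samples in the same order.
for (fasta_id, _), (gff_id, _) in zip(fasta_ch, gff_ch):
    assert fasta_id == gff_id
```

Because the split happens after the files are already paired, downstream steps that consume the two outputs side by side see matching samples.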

@FernandoDuarteF
Collaborator Author

Hi @chriswyatt1, looks like meryl and merqury are working fine on the test datasets.

I think this is ready to be merged.

Contributor

@chriswyatt1 chriswyatt1 left a comment

That all looks good. I have tested it and it works as expected. Questions/Comments:

  1. Will the pipeline now work with mixed inputs, e.g. some with just genomes, others with RefSeq IDs, etc.?

  2. Maybe we should add an option to the README for this already.
    Although the README does require a revamp with all the new modules/subworkflows added.

@FernandoDuarteF
Collaborator Author

FernandoDuarteF commented Nov 6, 2024

1. Will the pipeline now work with mixed inputs, e.g. some with just genomes, others with RefSeq IDs, etc.?

It works like this:

  • If --genome_only is set, it runs in GENOME_ONLY mode: it checks that a genome is present and runs. It doesn't check for an annotation; if you provide one, it is ignored.
  • If --genome_only is not set, it runs in GENOME_AND_ANNOTATION mode: it checks for a RefSeq ID or a local fasta and gff. If these are not present, it fails.
    Note to myself: I need to improve the error messages.
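
The two modes described above can be sketched in plain Python. This is only an illustration of the decision logic, not the pipeline's actual code: the `Sample` fields, the `select_mode` helper, and the error messages are hypothetical, and it assumes that "a genome" in GENOME_ONLY mode means either a RefSeq ID or a local fasta.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """Hypothetical samplesheet row; the field names are illustrative only."""
    name: str
    refseq_id: Optional[str] = None
    fasta: Optional[str] = None
    gff: Optional[str] = None


def select_mode(sample: Sample, genome_only: bool) -> str:
    """Return the mode a sample would run in, or raise if inputs are missing."""
    if genome_only:
        # Assumption: "a genome" means either a RefSeq ID or a local fasta.
        if sample.refseq_id is None and sample.fasta is None:
            raise ValueError(f"{sample.name}: no genome provided")
        # Any gff supplied is simply ignored in this mode.
        return "GENOME_ONLY"

    # Without the flag, a RefSeq ID or a local fasta + gff pair is required.
    has_local_pair = sample.fasta is not None and sample.gff is not None
    if sample.refseq_id is None and not has_local_pair:
        raise ValueError(f"{sample.name}: need a RefSeq ID or both fasta and gff")
    return "GENOME_AND_ANNOTATION"


fasta_only = Sample(name="sample_a", fasta="a.fasta")
print(select_mode(fasta_only, genome_only=True))  # GENOME_ONLY
```

In this sketch the flag applies to the whole run, so a sample with only a fasta passes with `genome_only=True` but raises an error without it.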
2. Maybe we should add an option to the README for this already.
   Although the README does require a revamp with all the new modules/subworkflows added.

Yeah, I haven't looked at the README yet. I'll check tomorrow and see what I can add. Probably a lot.

@FernandoDuarteF
Collaborator Author

FernandoDuarteF commented Nov 6, 2024

I also added .pre-commit-config.yaml to the repository, which is needed to commit changes in Gitpod.

@FernandoDuarteF FernandoDuarteF merged commit 49a21b3 into nf-core:dev Nov 6, 2024