add header description in the fasta file used for the mixed species modules. #463

Open
mlocardpaulet opened this issue Nov 29, 2024 · 0 comments

Comments

@mlocardpaulet
Contributor

Some software tools need the header description to parse FASTA headers properly with their default parameters. The FASTA file currently proposed for the DDA and DIA modules does not include these descriptions.

So we decided to generate a FASTA file with these descriptions and the same sequences as in the FASTA file that we currently use.

I had a look at it and: problem!

Whatever I do, we won't have descriptions for all the accessions because:

  1. some accessions are "weird" but necessary: the spiked-in Biognosys iRTs have no descriptions, and some sequences in the contaminants are tags or other non-protein entries (hence, no header description).
  2. some accessions have been deleted or are no longer annotated in UniProt.

So, whatever I do there will be accessions without descriptions.
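
To make the problem concrete, here is a minimal Python sketch of the header rewrite, not the actual script. It assumes UniProt-style headers (`>db|ACCESSION|ENTRY_NAME`) and a hypothetical `descriptions` dictionary mapping accessions to description strings; the function name, the accession `P00000`, the `iRT_placeholder` entry and the sequences are all made up for illustration. Headers without a known description are left unchanged and reported.

```python
def add_descriptions(fasta_lines, descriptions):
    """Append a description to each FASTA header when one is known.

    Returns (new_lines, missing), where `missing` lists the accessions
    whose headers were left unchanged because no description was found.
    """
    new_lines, missing = [], []
    for raw in fasta_lines:
        line = raw.rstrip("\n")
        if not line.startswith(">"):
            new_lines.append(line)  # sequence line: copied verbatim
            continue
        # Identifier = header text up to the first whitespace.
        header_id = line[1:].split(None, 1)[0] if line[1:].strip() else ""
        parts = header_id.split("|")
        # Assumption: ">db|ACCESSION|ENTRY_NAME"; otherwise use the whole identifier.
        accession = parts[1] if len(parts) >= 3 else header_id
        description = descriptions.get(accession)
        if description is None:
            missing.append(accession)  # e.g. iRTs, tags, deleted entries
            new_lines.append(line)
        else:
            new_lines.append(f">{header_id} {description}")
    return new_lines, missing


# Made-up example: one annotated entry and one entry without annotation.
example = [
    ">sp|P00000|EXAMPLE_HUMAN",
    "MKTAYIAK",
    ">iRT_placeholder",
    "LGGNEQVTR",
]
descriptions = {"P00000": "Example protein OS=Homo sapiens OX=9606"}
new_lines, missing = add_descriptions(example, descriptions)
```

In this example `missing` contains only `iRT_placeholder`, which is the kind of entry that would stay without a description.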

Will this be an issue for some software tools? If yes, how do I proceed? What do I do with the sequences that are not "real" proteins?

@wolski: do you have an opinion on how to proceed?
