tabulate-seqs reports the number of sequences in the input, not the number of rows in the table #316

Open
gregcaporaso opened this issue Oct 3, 2024 · 0 comments
Labels
bug-sev:2|low
bug-type:1|cosmetic (UX is less than ideal; something is superficially wrong.)
diff:2|intermediate (A modest understanding of the languages involved and platform is required.)
good first issue (Good for newcomers)

Comments

@gregcaporaso
Member

If tabulate-seqs is called with --p-merge-method intersect or --p-merge-method union, the Sequence Count reported at the top left (see the star in the screenshot below) is the number of sequences in the input data Artifact. That can differ from the number of sequences reported on in the table (i.e., the number of table rows), which can be lower (with --p-merge-method intersect) or higher (with --p-merge-method union). The sequence length stats, however, are computed on all of the sequences included in the reported Sequence Count.
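To make the mismatch concrete, here is a minimal sketch in plain Python. It is not code from the plugin: the feature IDs, the `other_ids` input, and the `table_row_ids` helper are all made up for illustration, and only the two merge methods discussed in this issue are modeled.

```python
# Hypothetical feature IDs; nothing here is taken from the plugin's code.
sequence_ids = {"f1", "f2", "f3", "f4"}  # IDs in the input sequence Artifact
other_ids = {"f3", "f4", "f5"}           # IDs in an accompanying input (e.g. metadata)


def table_row_ids(seq_ids, extra_ids, merge_method):
    """Return the IDs that would end up as table rows (illustrative only)."""
    if merge_method == "intersect":
        return seq_ids & extra_ids
    if merge_method == "union":
        return seq_ids | extra_ids
    # Any other merge method is outside the scope of this sketch.
    raise ValueError(f"merge method not modeled here: {merge_method!r}")


input_count = len(sequence_ids)
intersect_rows = len(table_row_ids(sequence_ids, other_ids, "intersect"))
union_rows = len(table_row_ids(sequence_ids, other_ids, "union"))

print(f"input sequences: {input_count}")
print(f"rows (intersect): {intersect_rows}")
print(f"rows (union): {union_rows}")
```

With these made-up inputs, the count of input sequences sits between the two row counts, which is the discrepancy described above.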

See these lines:


seq_len_stats = _compute_descriptive_stats(seq_lengths)

and notice that seq_lengths isn't filtered in the --p-merge-method intersect case.

In the --p-merge-method intersect case, it would probably make the most sense for the Sequence Count and length stats to reflect only the sequences that are reported on in the table.
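One possible shape for the intersect fix, as a rough sketch rather than a patch: restrict the lengths to the IDs that survive the merge before computing stats. `describe_lengths` below is a hypothetical stand-in (I'm not reproducing the real signature or output of `_compute_descriptive_stats` here), and the sequences and `table_ids` are invented.

```python
import statistics


def describe_lengths(lengths):
    """Hypothetical stand-in for the plugin's stats helper."""
    lengths = list(lengths)
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": statistics.mean(lengths),
    }


# Invented example data: sequences from the input Artifact, keyed by feature ID.
sequences = {"f1": "ACGT", "f2": "ACGTAC", "f3": "ACGTACGT", "f4": "AC"}
# IDs that remain as table rows after an intersect-style merge (assumed).
table_ids = {"f3", "f4"}

# Filter lengths to the rows that are actually shown before computing stats.
filtered_lengths = [
    len(seq) for feature_id, seq in sequences.items() if feature_id in table_ids
]
stats = describe_lengths(filtered_lengths)
print(stats)
```

Computed this way, the reported count and the length stats both describe the same set of sequences as the table rows.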

In the --p-merge-method union case, on the other hand, we may not have sequences for some of the table rows, in which case the sequence length stats would reflect only what's present in the input data Artifact. There it probably makes sense for Sequence Count to reflect what's in the input data as well.
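For the union case, a sketch of the bookkeeping that would let the visualization say what each number means. Again, all names and data are hypothetical; this only illustrates counting input sequences separately from table rows that have no sequence.

```python
# Invented example data, not taken from the plugin.
sequences = {"f1": "ACGT", "f2": "ACGTAC", "f3": "ACGTACGT"}  # input Artifact
other_ids = {"f3", "f4", "f5"}                                # accompanying input

# A union-style merge keeps every ID from either input as a table row.
table_ids = set(sequences) | other_ids

# Rows that appear in the table but have no sequence to measure.
missing_sequence_ids = sorted(table_ids - set(sequences))

summary = {
    "sequences_in_input": len(sequences),
    "table_rows": len(table_ids),
    "rows_without_sequence": len(missing_sequence_ids),
}
print(summary)
```

Keeping the three numbers separate would allow the Sequence Count to stay tied to the input data while still making the difference from the row count visible.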

In both cases, we should make it clearer exactly what Sequence Count is a count of.

[Image: Screenshot 2024-10-03 at 12 09 41 PM]

github-project-automation bot moved this to Needs Triage in QIIME 2 - Triage 🚑 on Oct 3, 2024
gregcaporaso added the bug-sev:2|low, bug-type:1|cosmetic, and diff:2|intermediate labels on Oct 3, 2024
github-project-automation bot moved this to Backlog in 2025.4 🌻 on Oct 10, 2024
lizgehret added the good first issue label on Oct 10, 2024
Projects
Status: Backlog
Development

No branches or pull requests

2 participants