Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add formatting #295

Merged
merged 9 commits into from
Dec 20, 2023
Merged

add formatting #295

merged 9 commits into from
Dec 20, 2023

Conversation

Clockwork-Rat
Copy link
Contributor

Closes #293

Adds formatting and rounds summary output.

@Clockwork-Rat
Copy link
Contributor Author

@Oddant1 I still have to add number formatting to parts of the output, which I am working on now, but this is the basis and just adds the rounding. I am trying to see where I can do the rounding -- either in the jinja html or in the summary function. I am not sure which is better or more clear.

@gregcaporaso
Copy link
Member

gregcaporaso commented Nov 30, 2023

I would advocate for doing the rounding in the Python function rather than the jinja html. My gut is that the former would be easier to test than the latter, and potentially more clear to future developers.

@Clockwork-Rat
Copy link
Contributor Author

@Oddant1 @gregcaporaso this should be ready to pull in.

@gregcaporaso
Copy link
Member

@Clockwork-Rat, should this be a Draft Pull Request until we sort out the plan for handling FeatureTable[RelativeFrequency]?

@Clockwork-Rat
Copy link
Contributor Author

@gregcaporaso @Oddant1 this should be working as expected.

@gregcaporaso gregcaporaso self-assigned this Dec 8, 2023
summary = pd.Series([frequencies.min(), frequencies.quantile(0.25),
frequencies.median(), frequencies.quantile(0.75),
frequencies.max(), frequencies.mean()],
table = table.to_dataframe()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a better approach we can use:

nonzeros = table.matrix_data.data

This will take the BIOM table to a Sparse Matrix, and then the raw non-zero data, which will prevent setting up a dataframe and iterating over rows or columns, since we don't care about where the values are, just their intergery-ness.

@hagenjp
Copy link
Contributor

hagenjp commented Dec 11, 2023

@Clockwork-Rat,
We are doing a PR sprint today and I was assigned to review this PR with the help of @ebolyen and @gregcaporaso.
When rendering your changes locally using feature-table summarize we realized that relative frequency just does not make sense as an input of this method. In the attached screenshot the 2.13 maximum frequency stood out to us, summing across the relative frequency values is creating meaningless values inside the Frequency Per Feature table.
In summarize and summarize-plus please drop relative frequency as an input option of these actions.
Feel free to close this PR and open a new one, if you would like, and let me/us know if you need any clarification. (Evan and Greg apologize for not noticing this sooner, it was only after looking at the relative frequency summary that they realized the outcome here!)
Screen Shot 2023-12-11 at 11 01 06 AM

@Clockwork-Rat
Copy link
Contributor Author

@hagenjp @ebolyen These changes have been made and tested.

@gregcaporaso
Copy link
Member

gregcaporaso commented Dec 13, 2023

Thanks @Clockwork-Rat! I think (@ebolyen, please correct me if I'm wrong) that we'll never have ints as input here, so it would be safe to drop your check and just always round.

@hagenjp, do you mind pulling this down and testing the changes like you did with the last iteration? Note that in this iteration, both summarize and summarize-plus should fail if provided with a FeatureTable[RelativeFrequency].

@Clockwork-Rat, can you also add a note to the release notes document that the interface of summarize will be changing in 2024.2 to no longer accept a FeatureTable[RelativeFrequency]? We should discuss next week whether it makes sense to add another action to use in its place that generates more relevant information.

@hagenjp
Copy link
Contributor

hagenjp commented Dec 14, 2023

@Clockwork-Rat @gregcaporaso These changes look good to me!
I was able to see that relative frequency tables were absent from the accepted table formats in the --help output for feature-table summarize and summarize plus. And when I provided a relative frequency table as an input both summarize and summarize plus both returned errors for invalid value for --i-table.

Copy link
Member

@gregcaporaso gregcaporaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Clockwork-Rat, could you a take a pass through and remove everything we don't need from this PR, now that we're not concerned about supporting relative frequency values? Sorry for the trouble - this is one of those PRs that had a big change in direction once we got into it.

q2_feature_table/_summarize/_visualizer.py Show resolved Hide resolved
@Clockwork-Rat
Copy link
Contributor Author

@hagenjp @gregcaporaso here are the changes with the removed logic.

Copy link
Member

@gregcaporaso gregcaporaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @Clockwork-Rat!

@gregcaporaso gregcaporaso merged commit a9e47ba into qiime2:dev Dec 20, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Completed
Development

Successfully merging this pull request may close these issues.

minor numeric formatting changes in summarize and summarize-plus
4 participants