Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Value based styling for pivoted tables? #389

Open
igorcalabria opened this issue Jul 1, 2024 · 8 comments
Open

Value based styling for pivoted tables? #389

igorcalabria opened this issue Jul 1, 2024 · 8 comments

Comments

@igorcalabria
Copy link

Question

What would you like to know?

Hi, not sure if I've missed something but is there an easy way to apply conditional styles (https://posit-dev.github.io/great-tables/get-started/basic-styling.html) to pivoted tables (ie. Dates as columns)?

I know it's possible to pass a polars column selector in the location body, but I haven´t found a proper way to select rows per column automatically.

Something like this

 tab_style(
        style=style.text(color="white"),
        locations=loc.body(
            columns=cs.numeric(),
            rows=pl.col("each-column-from-the-selector").eq(0)
        )
    )

The alternative is applying styles on individual columns but that's a bit cumbersome when columns are dynamic

@igorcalabria igorcalabria changed the title Valeu based styling for pivoted tables? Value based styling for pivoted tables? Jul 1, 2024
@jrycw
Copy link
Collaborator

jrycw commented Jul 1, 2024

@igorcalabria If I understand correctly, your idea should work as expected.

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

(
    GT(pl.from_pandas(gtcars).head())
    .tab_style(
        style=style.text(color="red"),
        locations=loc.body(
            columns=cs.numeric(),
            rows=pl.col("year").eq(2017.0)
        )
    )
)

image

@jrycw
Copy link
Collaborator

jrycw commented Jul 1, 2024

Oh, I think I understand what you're looking for now. Maybe something like this:

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head()
gt = GT(df_mini)


for column in df_mini.select(cs.numeric()).columns:
    gt = gt.tab_style(
        style=style.text(color="red"),
        locations=loc.body(columns=column, rows=pl.col(column).gt(600)),
    )

gt

image

@igorcalabria
Copy link
Author

igorcalabria commented Jul 1, 2024

Yeah that's pretty much it. A closer example is something like this

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head().pivot(on="year", index=["trim", "mfr"], values="hp")
gt = GT(df_mini,rowname_col="trim", groupname_col="mfr")
for dt in df_mini.select(cs.numeric()).columns:
    gt = gt.tab_style(
        style=style.text(color="red"),
        locations=loc.body(columns=dt, rows=pl.col(dt).gt(600))
    )
gt

image

My question was if there's a more idiomatic way of doing it. I feel like this kind of pivots on dates are super common (there's one in the examples page).

From a user's point of view, I think the most natural API would be to simply style stuff based based on cell values and not the entire series:

    gt = gt.tab_style(
        style=style.text(color="red"),
        locations=loc.body(columns=cs.numeric(), rows=pl.col("cell_value").gt(600))
    )

I understand that this may be super awkward to implement across different dataframe implementations and that pl.col("cell-value") doesn't make much sense from polars point of view so It can get confusing too

EDIT:

Maybe

    gt = gt.tab_style(
        style=style.text(color="red"),
       # if rows is a lambda expr, call it for each column that matches the selector
        locations=loc.body(columns=cs.numeric(), rows=lambda col: pl.col(col).gt(600))
    )

@jrycw
Copy link
Collaborator

jrycw commented Jul 2, 2024

Thanks for your feedback. Personally, I'm comfortable calling multiple tab_style() to achieve the desired result, as it seems to be the most idiomatic approach I'm aware of.

@machow
Copy link
Collaborator

machow commented Jul 2, 2024

Hey thanks for flagging this, and for the useful examples. @rich-iannone and I have talked before about potentially adding a loc.body(mask=...) argument, that could handle the multiple cols selected -> those cols transformed situation.

Something like this...

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head().pivot(on="year", index=["trim", "mfr"], values="hp")
gt = (
    GT(df_mini,rowname_col="trim", groupname_col="mfr")
    .tab_style(
        style=style.text(color="red"),
        locations=loc.body(mask = cs.numeric().gt(600))
    )
)

Would this cover your usecase?

@igorcalabria
Copy link
Author

Yes, that's exactly what I needed. It's also way better than my suggestion, makes a lot of sense to use masks for styling and it fits pretty well with polars(and probably other dataframe) APIs

@henryharbeck
Copy link

FWIW I just answered a stack overflow question asking this and landed on a sliiiightly simpler current solution than what is presented here. Given that style=style.text(color="red") does not change within the for loop, a list comprehension passed to locations simplifies things a tad. It also avoids repeatedly overriding the gt variable

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head().pivot(on="year", index=["trim", "mfr"], values="hp")

(
    GT(df_mini, rowname_col="trim", groupname_col="mfr")
    .tab_style(
        style=style.text(color="red"),
        locations=[
            loc.body(columns=dt, rows=pl.col(dt).gt(600))
            for dt in df_mini.select(cs.numeric()).columns
        ],
    )
)

Image

There is another stack overflow question asking the same thing as well, so some demand from users does look be there.
The loc.body(mask=...) solution does look like it would cover both of these SO questions (and is intuitive to me too).

@jrycw
Copy link
Collaborator

jrycw commented Dec 15, 2024

@henryharbeck, thanks for mentioning the workaround; it's quite clever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants