Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve GT.save() usability #499

Merged
merged 11 commits into from
Nov 22, 2024
Merged

Improve GT.save() usability #499

merged 11 commits into from
Nov 22, 2024

Conversation

jrycw
Copy link
Collaborator

@jrycw jrycw commented Oct 28, 2024

Related PR: #496.

This PR introduces several improvements to enhance the usability of GT.save():

  • Propose exposing **params to allow advanced users to customize save parameters directly via Pillow.
  • Streamline the .png extension check to a single line.
  • Construct the URL directly, similar to FmtImage._get_image_uri(), without using tempfile.TemporaryDirectory().
  • Replace the final time.sleep(0.05) with WebDriverWait(driver, 1).until() (we may need to determine the optimal timeout for most use cases).
  • Remove the .png saving branch that relied on selenium; all files are now saved using Pillow.
  • Suggest modifying GT.save() to return itself, allowing users to save intermediate tables. For example:
    GT(exibble).save("initial.png").tab_header("title").save("with_title.png")

Additionally, I noticed a potential speed boost (approximately 40-50% faster on my machine) when calling headless_browser=wdriver(options=wd_options) directly, bypassing the context manager. However, I'm unsure about the safety implications of this approach.

Copy link

codecov bot commented Oct 28, 2024

Codecov Report

Attention: Patch coverage is 91.52542% with 5 lines in your changes missing coverage. Please review.

Project coverage is 89.38%. Comparing base (1ad3ec1) to head (23f12cb).
Report is 14 commits behind head on main.

Files with missing lines Patch % Lines
great_tables/_utils_selenium.py 90.90% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #499      +/-   ##
==========================================
+ Coverage   87.86%   89.38%   +1.52%     
==========================================
  Files          42       45       +3     
  Lines        4852     5239     +387     
==========================================
+ Hits         4263     4683     +420     
+ Misses        589      556      -33     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@machow machow requested a review from rich-iannone November 12, 2024 21:25
@jrycw
Copy link
Collaborator Author

jrycw commented Nov 18, 2024

Hello team,

In my last commit, I experimented with isolating the web driver preparation logic into _utils_selenium.py. Please feel free to review and decide whether to keep or discard this change.

Comment on lines 38 to 41
self.debug_port = debug_port
self.wd_options = webdriver.ChromeOptions()
super().__init__()
self.driver = webdriver.Chrome(self.wd_options)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we consolidate the driver initialization across all these subclasses, by adding cls_wd_options=ChromeOptions, and cls_driver=Chrome as class attributes?

That way, the BaseClass could essentially require subclasses to specify those pieces. Then, the init would not have to be overridden by child classes.

This makes the child classes mostly about:

  • specifying Driver and Option classes (as class attributes)
  • customizing add_arguments method

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@machow , thanks for the quick response.

Using cls_driver seems like a good idea, but I’m concerned that cls_wd_options might lead to issues. For example, what if we need to create two Chrome instances with different ChromeOptions in the future?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@machow , I think I understand now. Does the latest commit align with what you had in mind?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thanks -- that looks really nice!

@rich-iannone
Copy link
Member

This is looking good! I had a look at the output images on all 4 webdrivers (at 3 different scale values) to make sure outputs are similar to main. In case this is interesting I've included the images in this zip file:

image-tests.zip

I've found no reduction in quality with the changes here. Chrome is the only one that is problematic (cuts off content at the bottom) but that's also on main and an open issue anyway.

Copy link
Collaborator

@machow machow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is so helpful! Thanks for taking the time to work on .save(), which afaict is one of the most important parts of our API. My one request was that we remove **params so that PIL is a internal detail.

Excited to merge all these PRs, sorry for slowing your roll 😭

_debug_dump
Whether the saved image should be a big browser window, with key elements outlined. This is
helpful for debugging this function's resizing, cropping heuristics. This is an internal
parameter and subject to change.
**params
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay if we remove the **params piece of this PR? I'm onboard with everything else. Because **params only applies to when we go from a .png -> something else, it exposes PIL as part of the user API.

If we leave it out, we can swap out PIL down the road. As an alternative could we tell users that we're using Pillow? We could even tell them about PIL open and save if needed (so they can go from .png -> anything PIL supports on their own if more customization is needed).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I agree that we should remove **params, making Pillow an internal dependency rather than a public one.

Regarding hints, I believe that diligent readers likely already recognize we use Pillow under the hood and would consult its documentation for further customization if needed.

This aligns with our goal of providing a user-friendly way to save generated tables rather than focusing on creating highly customized table figures in various formats🤔.

@jrycw
Copy link
Collaborator Author

jrycw commented Nov 22, 2024

This is looking good! I had a look at the output images on all 4 webdrivers (at 3 different scale values) to make sure outputs are similar to main. In case this is interesting I've included the images in this zip file:

image-tests.zip

I've found no reduction in quality with the changes here. Chrome is the only one that is problematic (cuts off content at the bottom) but that's also on main and an open issue anyway.

As a side note, this PR does not address the cutoff bug encountered when using Chrome, as mentioned in #480 .

Copy link
Member

@rich-iannone rich-iannone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@machow machow merged commit 538fbf1 into posit-dev:main Nov 22, 2024
14 checks passed
@machow
Copy link
Collaborator

machow commented Nov 22, 2024

Shoot, I just noticed the change to passing a base64 encoded url directly to the headless browser. I think there could be some challenges with this approach (and url length limitations). In general, modern browsers seem to have large limits, but it's a tricky territory.

Opened an issue to track:

#518

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants