Improve `GT.save()` usability #499

jrycw · 2024-10-28T07:11:45Z

Related PR: #496.

This PR introduces several improvements to enhance the usability of GT.save():

Propose exposing **params to allow advanced users to customize save parameters directly via Pillow.
Streamline the .png extension check to a single line.
Construct the URL directly, similar to FmtImage._get_image_uri(), without using tempfile.TemporaryDirectory().
Replace the final time.sleep(0.05) with WebDriverWait(driver, 1).until() (we may need to determine the optimal timeout for most use cases).
Remove the .png saving branch that relied on selenium; all files are now saved using Pillow.
Suggest modifying GT.save() to return itself, allowing users to save intermediate tables. For example:
```
GT(exibble).save("initial.png").tab_header("title").save("with_title.png")
```

Additionally, I noticed a potential speed boost (approximately 40-50% faster on my machine) when calling headless_browser=wdriver(options=wd_options) directly, bypassing the context manager. However, I'm unsure about the safety implications of this approach.

codecov · 2024-10-28T07:13:40Z

Codecov Report

Attention: Patch coverage is 91.52542% with 5 lines in your changes missing coverage. Please review.

Project coverage is 89.38%. Comparing base (1ad3ec1) to head (23f12cb).
Report is 14 commits behind head on main.

Files with missing lines	Patch %	Lines
great_tables/_utils_selenium.py	90.90%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #499      +/-   ##
==========================================
+ Coverage   87.86%   89.38%   +1.52%     
==========================================
  Files          42       45       +3     
  Lines        4852     5239     +387     
==========================================
+ Hits         4263     4683     +420     
+ Misses        589      556      -33

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚨 Try these New Features:

Flaky Tests Detection - Detect and resolve failed and flaky tests

jrycw · 2024-11-18T19:12:29Z

Hello team,

In my last commit, I experimented with isolating the web driver preparation logic into _utils_selenium.py. Please feel free to review and decide whether to keep or discard this change.

machow · 2024-11-18T19:28:58Z

great_tables/_utils_selenium.py

+        self.debug_port = debug_port
+        self.wd_options = webdriver.ChromeOptions()
+        super().__init__()
+        self.driver = webdriver.Chrome(self.wd_options)


Could we consolidate the driver initialization across all these subclasses, by adding cls_wd_options=ChromeOptions, and cls_driver=Chrome as class attributes?

That way, the BaseClass could essentially require subclasses to specify those pieces. Then, the init would not have to be overridden by child classes.

This makes the child classes mostly about:

specifying Driver and Option classes (as class attributes)

customizing add_arguments method

@machow , thanks for the quick response.

Using cls_driver seems like a good idea, but I’m concerned that cls_wd_options might lead to issues. For example, what if we need to create two Chrome instances with different ChromeOptions in the future?

@machow , I think I understand now. Does the latest commit align with what you had in mind?

Yeah, thanks -- that looks really nice!

rich-iannone · 2024-11-18T21:35:40Z

This is looking good! I had a look at the output images on all 4 webdrivers (at 3 different scale values) to make sure outputs are similar to main. In case this is interesting I've included the images in this zip file:

image-tests.zip

I've found no reduction in quality with the changes here. Chrome is the only one that is problematic (cuts off content at the bottom) but that's also on main and an open issue anyway.

machow

This PR is so helpful! Thanks for taking the time to work on .save(), which afaict is one of the most important parts of our API. My one request was that we remove **params so that PIL is a internal detail.

Excited to merge all these PRs, sorry for slowing your roll 😭

machow · 2024-11-21T18:27:05Z

great_tables/_export.py

    _debug_dump
        Whether the saved image should be a big browser window, with key elements outlined. This is
        helpful for debugging this function's resizing, cropping heuristics. This is an internal
        parameter and subject to change.
+    **params


Is it okay if we remove the **params piece of this PR? I'm onboard with everything else. Because **params only applies to when we go from a .png -> something else, it exposes PIL as part of the user API.

If we leave it out, we can swap out PIL down the road. As an alternative could we tell users that we're using Pillow? We could even tell them about PIL open and save if needed (so they can go from .png -> anything PIL supports on their own if more customization is needed).

Thanks for the explanation. I agree that we should remove **params, making Pillow an internal dependency rather than a public one.

Regarding hints, I believe that diligent readers likely already recognize we use Pillow under the hood and would consult its documentation for further customization if needed.

This aligns with our goal of providing a user-friendly way to save generated tables rather than focusing on creating highly customized table figures in various formats🤔.

jrycw · 2024-11-22T04:56:48Z

This is looking good! I had a look at the output images on all 4 webdrivers (at 3 different scale values) to make sure outputs are similar to main. In case this is interesting I've included the images in this zip file:

image-tests.zip

I've found no reduction in quality with the changes here. Chrome is the only one that is problematic (cuts off content at the bottom) but that's also on main and an open issue anyway.

As a side note, this PR does not address the cutoff bug encountered when using Chrome, as mentioned in #480 .

…erent formats

rich-iannone

LGTM!

machow · 2024-11-22T19:49:23Z

Shoot, I just noticed the change to passing a base64 encoded url directly to the headless browser. I think there could be some challenges with this approach (and url length limitations). In general, modern browsers seem to have large limits, but it's a tricky territory.

Opened an issue to track:

#518

Improve GT.save() usability

cccee66

machow requested a review from rich-iannone November 12, 2024 21:25

jrycw added 4 commits November 19, 2024 02:23

Isolate web driver preparation logic from GT.save()

f23661a

Add quit() to _NoOpDriverCtx

79f5461

Try to replace _NoOpDriverCtx with no_op_callable()

5abca23

Add from __future__ import annotations to _utils_selenium.py

1b599da

machow reviewed Nov 18, 2024

View reviewed changes

jrycw added 3 commits November 19, 2024 04:41

Introduce cls_driver and cls_wd_options to WebDriver

e0b6365

Fix wrong cls_driver for _SafariWebDriver

73a7c86

Fix the type hint for debug_port in _BaseWebDriver

4c6d965

jrycw force-pushed the screenshot branch from 41e1611 to 2183a87 Compare November 19, 2024 05:00

Add test__utils_selenium.py

d319ce1

jrycw force-pushed the screenshot branch from 2183a87 to d319ce1 Compare November 19, 2024 05:02

machow requested changes Nov 21, 2024

View reviewed changes

Remove **params from GT.save()

a2ee377

Restore the comment about using PIL for converting tables into diff…

23f12cb

…erent formats

rich-iannone approved these changes Nov 22, 2024

View reviewed changes

rich-iannone assigned machow Nov 22, 2024

machow merged commit 538fbf1 into posit-dev:main Nov 22, 2024
14 checks passed

machow mentioned this pull request Nov 22, 2024

save: investigate use of base64 encoded urls in headless browser #518

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve `GT.save()` usability #499

Improve `GT.save()` usability #499

jrycw commented Oct 28, 2024

codecov bot commented Oct 28, 2024 •

edited

Loading

jrycw commented Nov 18, 2024

machow Nov 18, 2024

jrycw Nov 18, 2024

jrycw Nov 18, 2024

machow Nov 19, 2024

rich-iannone commented Nov 18, 2024

machow left a comment

machow Nov 21, 2024

jrycw Nov 22, 2024

jrycw commented Nov 22, 2024

rich-iannone left a comment

machow commented Nov 22, 2024

Improve GT.save() usability #499

Improve GT.save() usability #499

Conversation

jrycw commented Oct 28, 2024

codecov bot commented Oct 28, 2024 • edited Loading

Codecov Report

jrycw commented Nov 18, 2024

machow Nov 18, 2024

Choose a reason for hiding this comment

jrycw Nov 18, 2024

Choose a reason for hiding this comment

jrycw Nov 18, 2024

Choose a reason for hiding this comment

machow Nov 19, 2024

Choose a reason for hiding this comment

rich-iannone commented Nov 18, 2024

machow left a comment

Choose a reason for hiding this comment

machow Nov 21, 2024

Choose a reason for hiding this comment

jrycw Nov 22, 2024

Choose a reason for hiding this comment

jrycw commented Nov 22, 2024

rich-iannone left a comment

Choose a reason for hiding this comment

machow commented Nov 22, 2024

Improve `GT.save()` usability #499

Improve `GT.save()` usability #499

codecov bot commented Oct 28, 2024 •

edited

Loading