-
-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow using default browser context #471
base: main
Are you sure you want to change the base?
Conversation
This has access to the browser's persisted state (e.g. cookies, sessions, history, etc).
This would grant access to persistent state such as cookies that are associated with the default browser profile.
This was probably leftover from my local monkeypatch.
@contexts[default_context_id] = ::Ferrum::Context.new(@client, self, nil) | ||
end | ||
|
||
# Compute the default context ID by looking for contexts not returned by Target.getBrowserContexts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kinda hacky but I didn't find a better way to find this looking through https://chromedevtools.github.io/devtools-protocol. Target.getBrowserContexts
does not return the default context, but when you create targets without specifying the browserContextId it returns this context ID. This is why we need to create an entry in @contexts
with that ID to receive the targetCreated and put the Target in the right Context.
Another way to get the ID that might be more reliable is to create a target in the default context and then get the ID from there, but that would require creating a throwaway target or something. I guess it'd be fine to create a Target, get its browserContextId, and then destroy it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm this method was also suggested (independently) by one of the Chromium engineers, so it might be fine. I am slightly nervous about the case when no tabs exist in the default context which presumably could happen, but I think generally people using browser automation should be able to ensure that the correct prerequisites are met.
One other approach might be to have the Contexts
class (optionally) enumerate all existing contexts at startup and create entries for them, and then any targets that aren't part of the discovered contexts are probably from the default context. This could get out of sync potentially, but I'm not sure if there is really any problem with a Target (in Ruby) being assigned to the incorrect Context object from the Contexts map, since the Target seems to have sufficient information to operate on its own and doesn't really need info from the Context. I think the main purpose of that association is that when Context#dispose
is called, it closes the connections to the Targets
in that Context
, so that's probably not going to be an issue in practice because if Ferrum is not aware of that Context
you're never going to call #dispose
on it.
Edit: The above might be slightly confusing, but this is what I mean: ibrahima@b1b7732
@@ -125,6 +125,9 @@ class Browser | |||
# @option options [Hash] :env | |||
# Environment variables you'd like to pass through to the process. | |||
# | |||
# @option options [Boolean] :use_default_context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what the best name for this is... it's kinda confusing since Contexts#default_context
is already a thing and the current behavior of that method is to create a new BrowserContext. But I felt like "default" should be the right thing because the effective change is to not pass a browserContextId when creating a target.
Perhaps a less confusing name would be use_persistent_context
? Similar to how Playwright has launchPersistentContext
.
@@ -11,6 +11,7 @@ class Contexts | |||
def initialize(client) | |||
@contexts = Concurrent::Map.new | |||
@client = client | |||
@default_context = create_default_context if @client.options.use_default_context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One side effect of setting this in the constructor is that any existing tabs/targets in the browser's default context will be available as targets (e.g. browser.default_context.targets
will return a nonzero number of targets). This felt right to me, because then it handles use cases like #320 as well. Similar to the fork mentioned in #320 (comment) but in my opinion cleaner because it just gives you access to all the existing tabs to do with what you will, rather than somewhat arbitrarily picking the default (latest?) tab to be the default target.
It also seems to me that doing this actually makes the discover
method call in the constructor do something, which I was wondering about. Without setting up any initial contexts, calling discover
will start enumerating targets but because no Contexts exist yet, it won't do anything with those targets.
@ibrahima I'm on vacation currently and will take a look on Monday or sooner if I have time |
@route how about reviewing this? |
@yann120 also reusing the chrome data dir will help you |
Hey @ibrahima. Thanks for working on this! I was trying to add the same functionality. I'm seeing where if I run the following:
I get:
I'm digging in on it, but I wasn't sure if you had seen this. Thanks! |
Thanks, I ended up saving the cookies, and reload them, and it works. Maybe the easiest solution :) |
This PR adds an option for Ferrum to use the default browser context instead of a created BrowserContext that doesn't have access to the persisted browser state.
I know that historically for testing purposes it's been recommended here to use clean browser contexts for reproducibility, but I think this is a reasonable option for some use cases (that may not be related to testing). Other similar libraries provide such options. For example, Playwright lets you do launchPersistentContext to launch a browser with a persistent context. Puppeteer seems to default to using the default browser context and only creates new BrowserContexts if you manually create them. I don't have experience with these libraries, but I did a lot of digging through their source to see how they handled Target creation and BrowserContext creation.
Fixes #47
Re: this comment from that thread:
I seem to be able to create pages/targets within the default context fine. However, I haven't explored much so there could very well be bugs associated with this change. All I know is that it works for my use case and it would be very useful to have this in the upstream gem. I'm happy to discuss whether this change makes sense or not and I don't mind if the answer is that it doesn't. But at least in my case, the save/load cookies feature from 99cfa84 doesn't solve my use case because the persisted browser state that I'm trying to restore is coming from a browser hosting service (https://www.browserbase.com/), so the only way I can access the state is through the browser's default context.
Thank you for your consideration!