[Feature Request] Screencast: Allow some way to request windows by name or process #1064

Open
zorbathut opened this issue Aug 1, 2023 · 12 comments
Labels
needs discussion Needs discussion on how to implement or fix the corresponding task new api This requires adding API to an existing portal portal: screencast Screencast portal

Comments

@zorbathut

I'm working on a system that requires the ability for OBS to, without user intervention, capture the contents of a new window. At the moment xdg-desktop-portal cannot support this; it allows selecting a window with user intervention, or it allows selecting a window with an opaque restore token. But the restore token is generated only with user intervention, so there's essentially no solution to allow screencasting a window without going through the window picker.

This is something that you can do on Windows because OBS identifies windows by name, and it's sometimes really convenient because it dramatically reduces the amount of user intervention needed when you do things.

I'd like an option that passes in a desired window identifier, matching either the name of the window or the name of the process. Using the game Hades as an example because I have it handy, this would look something like the following, as an added parameter to org.freedesktop.portal.ScreenCast's SelectSources:

request_by_name Hades

or

request_by_process Z:\home\zorba\.local\share\Steam\steamapps\common\Hades\x64Vk\Hades.exe

If the request fails, it could either fall back on the picker, or just return failure. I'm not sure which of those is better.
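
As a rough sketch of what a caller might build under this proposal (not an existing API): `handle_token`, `types`, and `multiple` are existing SelectSources options, while `request_by_name` and `request_by_process` are the hypothetical keys suggested above. The helper name is made up for illustration, and a real D-Bus call would wrap each value in a variant.

```python
# Source-type bitmask values from the ScreenCast portal documentation.
MONITOR = 1
WINDOW = 2


def build_select_sources_options(handle_token, window_name=None, process=None):
    """Build a SelectSources options dict, including the *proposed* keys."""
    if window_name is not None and process is not None:
        raise ValueError("pass either window_name or process, not both")
    options = {
        "handle_token": handle_token,  # existing option
        "types": WINDOW,               # existing option: only offer windows
        "multiple": False,             # existing option: a single source
    }
    # HYPOTHETICAL keys: they exist only in this proposal, not in the portal.
    if window_name is not None:
        options["request_by_name"] = window_name
    if process is not None:
        options["request_by_process"] = process
    return options


opts = build_select_sources_options("obs1", window_name="Hades")
```

Whether a failed match falls back to the picker or returns an error would be decided by the portal, so the sketch leaves that out.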


Concerns:

There are definite potential security issues in this. I'm not proposing that you should be able to get any window just by guessing the window name or process name; instead, I'd personally expect a checkbox on the picker labeled something like "allow future capturing of windows with this name", which saves that flag permanently somewhere. This means you still need to manually intervene once, but after that, it can just happen transparently for you.

I'm kinda handwaving on "saves that flag permanently somewhere". I think if this were to be done completely right, this would also need a dialog somewhere so you could manage the authorized names/processes and revoke them. This may end up being complicated. A hacky initial option could forgo the window checkbox and just allow a hand-authored config file somewhere; this works for my purposes and might work for ironing out interface problems.

There's potential ambiguity if there's more than one window that matches the pattern. In my case, I don't care! Just pick one! Maybe other people care.
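
To make the concerns above concrete, here is a minimal sketch of the matching step, assuming the portal keeps a set of user-authorized window titles. The `Window` type, the function name, and the "lowest id wins" tie-break are all assumptions for illustration; nothing like this exists in xdg-desktop-portal today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    window_id: int
    title: str


def pick_authorized_window(windows, requested_title, allowed_titles):
    """Return a window to capture without prompting, or None to use the picker."""
    if requested_title not in allowed_titles:
        return None  # the user never authorized this title: no silent match
    matches = [w for w in windows if w.title == requested_title]
    if not matches:
        return None
    # Ambiguity: "just pick one". Taking the lowest id keeps it deterministic.
    return min(matches, key=lambda w: w.window_id)


windows = [Window(7, "Hades"), Window(3, "Hades"), Window(5, "Terminal")]
allowed = {"Hades"}
chosen = pick_authorized_window(windows, "Hades", allowed)
```

A guessed title that was never authorized, such as "Terminal" here, returns `None`, which is the point of requiring one manual approval first.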


In my case, I have control over the window, so I could send a D-Bus message that says "identify this window with this given tag", then we could have request_by_tag. I don't think this is a good solution, though, because most people who want this feature are not going to have code-level control over the window.

I'd love to get this up to feature-parity with Windows; right now it errs on the side of security, which is a good direction to err in, but, man, sometimes convenience and automation are really nice!

@orowith2os

This feels like a very big security issue, as you mentioned, as well as a usability issue. You can already continue the capture of a previously selected window if the user allows it - what more do you need? You can also use things like ObsVkCapture for games, which does work with Flatpak.

Not to mention that requesting by name or process is iffy: window names are subject to change, and the process can differ depending on the environment it's run in. The process in a flatpak isn't necessarily the same as the process running on the host, and there are file path issues as well.

You can also probably figure something out with window handles, and the same mechanism that ObsVkCapture uses.

@jadahl
Collaborator

jadahl commented Aug 4, 2023

Would it help if you could pass a window title/app-id as a filter, then still require the user to click "Share"? It'd simplify the user interaction by potentially having a single window to choose.

@zorbathut
Author

zorbathut commented Aug 4, 2023

> This feels like a very big security issue, as you mentioned, as well as a usability issue.

Note that I'm not asking for this to be enabled by default, I just want another checkbox to loosen the security a bit further. This is one of those cases where security and usability clash a bit.

> This feels like a very big security issue, as you mentioned, as well as a usability issue. You can already continue the capture of a previously selected window if the user allows it - what more do you need?

The problem is that it's a new window spawned by a new process (with the same name, and with the same process name, but still a new PID.) This makes it impossible to "continue" the capture; it needs to be a new capture of a new window that has many of the same properties as the last window.

> You can also use things like ObsVkCapture for games, which does work with Flatpak.

This might work; I'll check it out.

> Not to mention that requesting by name or process is iffy: window names are subject to change, and the process can differ depending on the environment it's run in. The process in a flatpak isn't necessarily the same as the process running on the host, and there are file path issues as well.

In my case, I have control over the window name, and it's not running in a flatpak anyway. I agree this might be something that needs to be tackled for general purposes though.

(Although I will note that "capture based on window name" has been an identifier used in OBS for quite a while.)

@zorbathut
Author

> Would it help if you could pass a window title/app-id as a filter, then still require the user to click "Share"?

Unfortunately not. Needs to be fully without interaction.

@zorbathut
Author

In response to the confused emoji:

The thing I'm working on is an automated test framework for a game. I need to be able to spawn new fresh instances of the game running test scripts while automatically recording footage. "Automatically" is the entire point here; needing a human to sit there clicking the "share" button every fifteen seconds is unacceptable.

Right now, I'm solving this by running under X11 with XComposite capture. I tried switching to Wayland, but as near as I can tell all captures in Wayland must go through xdg-desktop-portal. There's no way to tell xdg-desktop-portal "no, seriously, let me capture this window without user intervention", so this makes Wayland completely unusable for this purpose (regardless of whether a flatpak is involved, for the record). I'd like to head this problem off sooner rather than later.

If I do need to switch to Wayland, my current solution is going to be a custom build of xdg-desktop-portal that implements "search by window name" and a matching custom build of OBS to pass chosen window names in, because I frankly don't care about the security implications in this context. But it'd be nice to come up with a solution that other people can use as well :)

@orowith2os

Then your best bet will be OBS-VkCapture. Unless there's a more real-world use case for screen capturing via the ScreenCast API with specific window names/etc, it's not really useful. Test frameworks don't usually need to, nor do they normally go through, desktop APIs like this.

@Mikenux

Mikenux commented Aug 5, 2023

There is request #304, where the window name is needed (for display within app). If such a portal existed, then what is requested here would be an extension of it by asking to monitor a specific name or selecting a specific application. Am I right?

@orowith2os

That sounds about right.

@GeorgesStavracas GeorgesStavracas moved this to Needs Triage in Triage Oct 2, 2023
@GeorgesStavracas GeorgesStavracas added new api This requires adding API to an existing portal needs discussion Needs discussion on how to implement or fix the corresponding task portal: screencast Screencast portal labels Oct 5, 2023
@GeorgesStavracas GeorgesStavracas moved this from Needs Triage to Triaged in Triage Oct 5, 2023
@ruineka

ruineka commented Jun 1, 2024

This issue requires careful consideration as it poses a significant challenge for serious streamers transitioning from Windows to Linux. There seems to be a misunderstanding regarding the complexity of streamer setups, where multiple sources need to be added as overlays to create engaging content. While selecting sources for a camera and a game is straightforward, the process becomes frustrating beyond that.

Modern streamers incorporate 3D avatars, multiple webcams capturing different angles of their keyboard, avatar, and face, along with various web browser plugins as additional sources. Currently, upon opening OBS, users are inundated with countless requests to select window sources, causing confusion about which window to choose. This cumbersome process creates significant barriers for streamers considering Linux as a viable platform.

@bbb651

bbb651 commented Jun 10, 2024

> Currently, upon opening OBS, users are inundated with countless requests to select window sources, causing confusion about which window to choose. This cumbersome process creates significant barriers for streamers considering Linux as a viable platform.

Restoration is already a thing, and is implemented in OBS (you need your portal backend to support ScreenCast v4):

> restore_token (s)
>
> The token to restore a previous session.
>
> If the stored session cannot be restored, this value is ignored and the user will be prompted normally. This may happen when, for example, the session contains a monitor or a window that is not available anymore, or when the stored permissions are withdrawn.
>
> The restore token is invalidated after using it once. To restore the same session again, use the new restore token sent in response to starting this session.
>
> Setting a restore_token is only allowed for screen cast sessions. Persistent remote desktop screen cast sessions can only be handled via the Remote Desktop interface.
>
> This option was added in version 4 of this interface.
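
Because the token is single-use, a client has to swap in the fresh token after every successful start. A minimal sketch of that bookkeeping, assuming the token is kept in a small JSON file: the file layout and function names are made up for illustration, while `persist_mode` and `restore_token` are the portal's documented option and result keys. The actual D-Bus calls are left out.

```python
import json
from pathlib import Path


def load_restore_token(path):
    """Return the saved token, or None if nothing usable has been saved."""
    try:
        data = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    return data.get("restore_token") if isinstance(data, dict) else None


def select_sources_options(path):
    """Options for SelectSources: request persistence, offer the saved token."""
    options = {"persist_mode": 2}  # 2 = persist until explicitly revoked
    token = load_restore_token(path)
    if token is not None:
        options["restore_token"] = token
    return options


def save_token_from_start_results(path, results):
    """Replace the used token with the new one from the Start response."""
    new_token = results.get("restore_token")
    if new_token is not None:
        Path(path).write_text(json.dumps({"restore_token": new_token}))
```

If the Start response carries no new token, the sketch leaves the file alone; a stale token is simply ignored by the portal and the user is prompted as usual.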

Although it seems like it doesn't survive restarts of the captured application or of the compositor; I think that's already tracked in #1355

@Mikenux

Mikenux commented Jun 10, 2024

@ruineka: Maybe open a discussion (https://github.com/flatpak/xdg-desktop-portal/discussions/new/choose) to document what streamers exactly need and expect?

@m0rg-dev

I'm in the same situation as @ruineka here with regard to streaming. OBS currently pops four screencast selector windows every time I start it up, and since it doesn't identify which source each one is supposed to go to I have to close out of all of them and then manually re-assign every capture source. It's a real pain in the neck, and I don't think I have a particularly complicated stream layout. Session restore doesn't work since it's not practical for me to keep the windows I want to include in the stream open all the time (and in some cases I have to close and reopen them while the stream is running, which forces another screencast select since the window ID changes).

In my opinion, this is a real-world use case that really does warrant a flag or permissions bit or something for unattended screencasting. It's a security issue if we let any program do that, but at some point I don't need to know every time if OBS is going to capture my window contents and send them off to some random external server because that is the thing that OBS is designed to do. We don't have to blow the thing wide open, but I'd at least like to be able to say "/usr/bin/obs can watch /usr/bin/dolphin-emu without asking me first" because that is often my explicit purpose in opening OBS and I don't really care that much if some other program could hypothetically pretend to be OBS and capture from my dolphin-emu window.
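
One way to picture the "/usr/bin/obs can watch /usr/bin/dolphin-emu" rule is as a list of (capturer, target) pairs. The sketch below assumes a hand-written rule file with one `capturer -> target` line per rule; that format and both function names are invented here for illustration, and identifying a process by its path is exactly the weak spot raised earlier in the thread.

```python
def parse_rules(text):
    """Parse lines of the form '<capturer> -> <target>' into a set of pairs."""
    rules = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        capturer, sep, target = line.partition("->")
        if not sep or not capturer.strip() or not target.strip():
            raise ValueError(f"malformed rule: {line!r}")
        rules.add((capturer.strip(), target.strip()))
    return rules


def may_capture_unattended(rules, capturer, target):
    """True only when this exact capturer/target pair was authorized."""
    return (capturer, target) in rules


RULES = parse_rules("""
# capturer -> target
/usr/bin/obs -> /usr/bin/dolphin-emu
""")
```

Anything not listed, including the same pair in the opposite direction, would still go through the normal picker.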

(Also, the current model doesn't even prevent you from streaming the wrong content by mistake. If you're streaming a capture of a browser window (as I do for LiveSplit One), and something xdg-opens a link and opens a new tab in that window -- which can happen without user interaction -- the capture keeps on rolling...)

Projects
Status: Triaged
Development

No branches or pull requests

8 participants