Add an InputCapture portal #714
Conversation
<member>1: RELATIVE_POINTER</member>
<member>2: ABSOLUTE_POINTER</member>
<member>4: KEYBOARD</member>
<member>8: TOUCH</member>
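For illustration, since these values are powers of two an application can OR them together when requesting capabilities. A minimal sketch, assuming a vardict option named `capabilities` (as discussed later in this thread for `CreateSession()`); everything else here is illustrative:

```python
# Capability bitflags from the draft XML above; an application ORs
# together the kinds of input it wants captured. The option name
# "capabilities" follows the CreateSession() discussion below.
RELATIVE_POINTER = 1
ABSOLUTE_POINTER = 2
KEYBOARD = 4
TOUCH = 8

# e.g. a Synergy-style server capturing mouse and keyboard input:
options = {"capabilities": RELATIVE_POINTER | KEYBOARD}  # value 5
```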
Should there be multiple "classes" of capabilities?
These describe what type of input events can be captured, but perhaps we need a "trigger" capability as well, so that "pointer barrier" can be allowed while some future alternative trigger method is not, or vice versa.
For triggers, we have DBus methods (currently `SetPointerBarrier`, but I can envision at least an `ActivateNow` type too). Those have return values, so I think it may be good enough to just error out from those and let the application deal with it. If in the future we notice that applications need to know ahead of time whether they can set up a barrier, we can add that, separate from the input capabilities. But right now I think they're not needed - doubly so because the capturing triggers will be very few, so it's likely all implementations will support all of them.
I'm thinking of when the portal backend should present a dialog to the user. At this point, it should be known what the application wants, both in terms of possible triggers and the type of input devices. Doing it at `Enable()` is probably too late, since `ConnectToEIS()` would be called prior to that, which wouldn't be possible since it's not known what kind of permissions there will be. Doing it at `SetPointerBarrier()` wouldn't be good either, since that means adding another delayed trigger would require yet another dialog. My point is that at some point before any `ConnectToEIS()` or `Enable()`, the whole intention of the application must be known, so that a complete picture of its needs can be presented to the user.
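To make the ordering concern concrete, here is a rough sketch of the sequence being discussed, with a stand-in stub for the portal proxy. Method names follow the draft interface; all argument shapes and return values are assumptions, not the real D-Bus API:

```python
# Stand-in stub modelling the draft InputCapture call ordering; not a
# real D-Bus binding. The point above: the permission dialog has to
# happen at CreateSession() time, before ConnectToEIS()/Enable().
class CapturePortalStub:
    def CreateSession(self, options): return "/session/0"
    def GetRegions(self, session, options): return [(2560, 1440, 0, 0, 1.0)]
    def SetPointerBarrier(self, session, options, barrier): return True
    def ConnectToEIS(self, session, options): return 42  # a real fd in practice
    def Enable(self, session, options): pass

capture = CapturePortalStub()
session = capture.CreateSession({"capabilities": 1 | 4})  # intent known here
regions = capture.GetRegions(session, {})                 # learn screen layout
capture.SetPointerBarrier(session, {}, (2560, 0, 2560, 1440))
fd = capture.ConnectToEIS(session, {})                    # event transport
capture.Enable(session, {})                               # capturing may start
```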
Yep, we don't disagree. I'm just suggesting to leave this off for now and work it into the various `options a{sv}` and `result a{sv}` if we end up having a specific use-case for it. Probably at `CreateSession()` time, on the assumption that if we allow an application to set up a pointer barrier and it ends up setting up the wrong one, that's a bug rather than a sandbox violation.
As for "when to present the dialog", I'd say at CreateSession
time. That currently already has the capabilities field so "An application wants to capture pointer input" is possible there. SetPointerBarriers
uses a Request and has a return value for failed barriers, so an extra dialog could be inserted here, if that's really needed.
I think (and that needs to be spelled out in the docs) that the EIS implementation needs integrate with the impl.portal
correctly anyway, because ConnectToEis()
is only called once but if regions change and/or the client set up invalid pointer barriers later, the fd is still there even though the client must no longer receive events.
GetRegions:
@session_handle: Object path for the #org.freedesktop.portal.Session object
@options: Vardict with optional further information
@regions: An array of 4 integers, each specifying a region's width, height, x/y offset and physical scale
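For illustration, a sketch of consuming such a reply. The tuple layout (width, height, x/y offset, plus the physical scale under discussion) and all values are assumptions based on the draft docstring above:

```python
# Hypothetical GetRegions reply: (width, height, x_off, y_off, scale)
# per region; values are made up for illustration.
regions = [(2560, 1440, 0, 0, 1.0), (1920, 1080, 2560, 0, 2.0)]

for width, height, x, y, scale in regions:
    # a barrier along a region's right edge would run from (x + width, y)
    # to (x + width, y + height) in the compositor's logical grid
    print(f"region {width}x{height}@{x},{y} scale={scale}")
```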
Will this scale not be exposed via the absolute pointer device?
Do we even need to bother at all with these for captured input?
as a general rule, the devices exposed in the CaptureInput portal will more closely match the physical input devices, not the logical ones. And those don't have that physical scale, that one only exists on screen.
And since we expect the pointer barriers to be set up on screen, I think we need the physical scale as part of the region.
Shouldn't those properties (physical vs logical movement) be handled by libei events instead? Or do you think it's better to leave any meaning of logical pixels out of the libei API? (What would need to be added would be a `get_scale()` or something to `ei_region`.)
hmm, I'm not actually sure right now how to handle this correctly. In the touchscreen case it's easy enough since they are mapped to a (screen) region to function correctly. But forwarding a tablet? That may not have a region (or rather: the region is physical scale rather than a logical region). Maybe something like `ei_region_is_physical()` would do the trick here? Dunno yet, but filed as a libei issue.
However, that seems independent of the issue here - the regions exposed here in the portal are purely so the client can set up the pointer barriers, the regions are completely independent of the input devices (which are handled by libei). For example, if all you have is a relative device, you'd still have the portal regions but the libei device would have no regions - it just forwards events in logical pixels.
For the sake of pointer barriers alone, I still don't see how any scale is relevant. The only thing that matters is the screen edges in a logical pixel grid that represents the layout the compositor uses.
Sometimes the logical pixel grid matches the physical one in some sub-regions, sometimes it doesn't; sometimes the region is "HiDPI", but that is somewhat independent of the scale here. That is, a scale == 1.0 (if the scale is meant to represent the logical-to-physical pixel grid relationship) doesn't mean it isn't a HiDPI monitor, and it doesn't mean raw input isn't transformed in some way.
Isn't what we're really after always the way relative input events are transformed? For example, let's say we have two identical HiDPI setups:
- A) 4K monitor, represented by a 2K region; logical-physical pixel scale is `2`, clients draw with scale `2`
- B) 4K monitor, represented by a 4K region; logical-physical pixel scale is `1`, clients draw with scale `2`
In both A) and B) the displayed output is completely identical, both show a HiDPI client exactly the same way, pixel by pixel, but the way input events are represented in libei would differ. The same applies to physically moving a pointer device a given distance - it will result in the same distance traveled by the pointer sprite on screen.
In A) moving a relative pointer device `(dx, dy)` units after pointer acceleration etc. moves the pointer `(dx, dy)` logical pixels in the grid. For something looking at absolute events it'll see `(x, y)` and then `(x', y')`, and to get the original device delta, it just needs to do `dx = x' - x` and `dy = y' - y`.
For B) moving a relative pointer device `(dx, dy)` units after pointer acceleration etc. moves the pointer `(dx * 2, dy * 2)` logical pixels in the grid, since the logical-physical pixel scale is `1` but the pointer still has to cover the same physical distance as in A). Thus when looking at libei events, a relative pointer should still send `(dx, dy)`, but absolute events would see first `(x, y)` then `(x + dx * 2, y + dy * 2)`. If something that looks only at absolute events sees `(x, y)` then `(x', y')`, in order to get the original input event it will need to know the scale the compositor used to boost the pointer movement (let's call it `s`); thus to get the original `(dx, dy)` it will need to do `dx = (x' - x) / s` and `dy = (y' - y) / s`.
To come up with some kind of conclusion of what I'm thinking:
- For pointer barriers alone, I don't think scale is needed anywhere at all.
- In order to be able to get the original post-processed relative input event from two absolute events, the scale used to scale input events is needed - not the one related to the pixel grid.
- The only time a physical-logical scale is needed is to know how far a relative event traveled in physical space, but I'm not sure I understand when this is actually needed when capturing input?
First - agreed, scale for pointer barriers appears to not be necessary.
For the events, there are missing bits; I tried to type out the four cases that apply:
Setup

Server-side:
- we have a 4k screen with logical-physical `2` on the synergy server
  - technically the client-scale doesn't matter from an input perspective so we can leave it out here
- we have a touchscreen and a mouse
- synergy server connects to the portal and receives a CaptureInput region `(2560x1440@0,0)`, scale `1.0`
  - scale here can be `1` because the client-scale is completely transparent to anyone but the compositor; those pixels might as well not exist and we effectively have a 2k screen
- synergy server sets up a pointer barrier at `x=2560, y=[0-1440]`
- compositor creates an EIS device with capability `rel` and no input region
- compositor creates an EIS device with capability `abs` and an input region of `(500x312@0,0)` in mm
  - Missing: there is no connection between the portal region and the EIS region

And on the synergy-client side:
- we have a 4k screen with logical-physical `1`
- synergy-client has an EIS region of `(1920x1080@0,0)` with scale `2.6`
  - contrived example to make this explanation more complex
rel → rel event
- mouse/libinput generate a relative motion of `(dx, dy)`
- compositor converts this to `(2dx, 2dy)` physical pixels movement on-screen
- input capture activates
- mouse/libinput generate a relative motion event of `(dx, dy)`
- compositor passes this as `(dx, dy)` to the libei context on synergy-server
- synergy-server sends `(dx, dy)` as-is to synergy-client
- synergy-client converts to local delta `(dx/scale, dy/scale)` (where `scale == 2.6`) and passes that to EIS (sketched below)
- remote compositor moves cursor by that delta
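A minimal sketch of the synergy-client conversion step above, assuming the `scale == 2.6` from the setup; the function name is illustrative:

```python
# Divide the incoming server delta by the client's EIS region scale
# (2.6 in the setup above) before handing it to EIS.
def to_local_delta(dx, dy, scale=2.6):
    return dx / scale, dy / scale
```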
abs → abs event
- touchscreen generates absolute motion to `(x, y)`
- compositor converts this to the matching screen position using `libinput_event_touch_get_x_transformed()` in the 4k range
- input capture activates
- touchscreen generates an absolute motion event to `(x, y)`
- compositor passes these as `(xmm, ymm)` in `mm` to the libei context on synergy-server
- synergy-server converts this to abs on the client by converting from `500x312` in mm to the `1920x1080` logical pixel range and sends `(ax, ay)` to the client (sketched below)
- synergy-client libei sends abs `(ax, ay)` to the EIS implementation
- remote compositor converts `(ax, ay)` (in the 1920 range) to the right position in the 4k range
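A sketch of the synergy-server conversion step, assuming the 500x312 mm device region and the client's 1920x1080 logical region from the setup; names and tuple layout are illustrative:

```python
# Map an absolute position in the 500x312 mm touchscreen region onto
# the client's 1920x1080 logical-pixel region.
def mm_to_client_abs(xmm, ymm, mm=(500, 312), px=(1920, 1080)):
    return xmm / mm[0] * px[0], ymm / mm[1] * px[1]
```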
rel → abs event
- mouse/libinput generate a relative motion of `(dx, dy)`
- compositor converts this to `(2dx, 2dy)` physical pixels movement on-screen
- input capture activates
- mouse/libinput generate a relative motion event of `(dx, dy)`
- compositor passes this as `(dx, dy)` to the libei context on synergy-server
- synergy-server sends `(dx, dy)` on as-is to synergy-client
  - technically synergy-server does everything here too but that's irrelevant to this example
- synergy-client applies the local scale and converts the delta `(dx/scale, dy/scale)` to an absolute position `(ax, ay)` in that region (sketched below)
- synergy-client libei sends abs `(ax, ay)` to the EIS implementation
- compositor converts `(ax, ay)` to the right position on the 4k screen (mapping 1920x1080 to the 4k actual range)
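A sketch of the synergy-client rel→abs step; clamping at the region edges is an assumption of mine, not something stated in the thread:

```python
# Accumulate the scaled delta into an absolute position inside the
# client's 1920x1080 region, clamping at the edges (assumed policy).
def apply_delta(ax, ay, dx, dy, scale=2.6, region=(1920, 1080)):
    ax = min(max(ax + dx / scale, 0), region[0])
    ay = min(max(ay + dy / scale, 0), region[1])
    return ax, ay
```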
abs → rel event
- touchscreen generates absolute motion to `(x, y)`
- compositor converts this to the matching screen position using `libinput_event_touch_get_x_transformed()` in the 4k range
- input capture activates
- touchscreen generates absolute motion events to `(x, y)`, then `(x', y')`
- compositor passes these as `(xmm, ymm)` and `(x'mm, y'mm)` in `mm` to the libei context on synergy-server
- synergy-server calculates the delta as `(dxmm, dymm)`
  - Missing: we cannot reliably convert abs motion in mm to rel motion in logical pixels with the current data.
- synergy-server converts this to a delta on the client by calculating a factor `f` to convert from the `500x312` to the `1920x1080` range, applying that conversion as `(dxmm * f, dymm * f)` and sending that as a relative event to synergy-client (sketched below)
- synergy-client applies the local scale `2.6` to the incoming delta and passes it on to EIS
- compositor moves cursor by that delta
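And a sketch of the abs→rel factor conversion; as the "Missing" note says, such a factor is only well-defined if the mm region is actually associated with a logical region. A per-axis factor is used here for generality:

```python
# Scale a delta in mm by the factor mapping the 500x312 mm region onto
# the 1920x1080 px region; only meaningful if that association exists.
def mm_delta_to_client(dxmm, dymm, mm=(500, 312), px=(1920, 1080)):
    return dxmm * (px[0] / mm[0]), dymm * (px[1] / mm[1])
```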
Summary
So, the missing bits in the above are:
We cannot convert from abs motion in mm to rel motion in logical pixels, and the reason we can't is that there is no association between a physical region and the logical region it may be mapped to. Or like in the example above: there's no hint that the 500x312mm touchscreen corresponds to the 2k region on the host.
This could be a problem, like in the `abs → rel` case. If we had an association between the two, we could map the mm to pixels.
The other bit that is missing is that the size of pixels differs. All the above treats a logical pixel as a "reachable pixel", but in the case of a 4k screen and a VGA screen, the same pixel movement will translate to a significantly different physical speed. Maybe a scale < 1.0 could help there; I haven't thought that through yet.
In terms of input handling, how would this portal interact with #711? I appreciate that the other one is about key combinations, and is designed around that, but what would happen when both global shortcuts AND this portal are used at the same time?
I'm aware of #711; so far my best idea is to have a specific method …
I think the question is more about input routing, not about triggering. When input is captured by an application, do global shortcuts registered via #711 get triggered or not? Should it be possible to tweak this with some property? For the intended primary use case (Synergy/Barrier/whatever the new one is called), I'd say captured input should not trigger keyboard shortcuts, as input being captured really should be seen as being designated for another computer, but are there use cases where this isn't desirable?
yes, if we (theoretically) switch qemu to use libei instead of wl_relative_pointer, that would be a prime use-case for this: you'd still want ctrl+alt to escape the pointer capture. I haven't added anything to that effect yet beyond the documentation that the compositor can filter any event, because without more specific use-cases it's hard to get that API right. And much of it relies on the individual use-cases anyway, e.g. you may want to not forward volume keys in a synergy setup because the music is playing on the server, not the client. Or maybe it's playing on the client and you do want to forward them. With the …
There would be two levels of breaking the capture in the qemu/virt-manager case, wouldn't there? As in an application level and a compositor level. A compositor-level escape hatch would always be possible (as it is with the "inhibit keyboard shortcuts" Wayland protocol), but the virt-manager-level escape hatch is controlled by virt-manager, which would see all the events before they reach qemu, I suppose?
Summary after a meeting I had with @jadahl last night, putting it here so it doesn't end up hidden in a resolved conversation. This was about #714 (comment) and which other bits we need. In conclusion: …
Renamed to `InputCapture`.
We now have an implementation, and it seems to work too! (according to the pytest test suite, that is)
Draft backend implementation in xdg-desktop-portal-gnome: https://gitlab.gnome.org/GNOME/xdg-desktop-portal-gnome/-/merge_requests/33 It lacks libei integration, but can create barriers, get zones, etc.
Looking at the DBus interface and implementation this makes sense to me. A slight concern I have is about the fine-grained capabilities that are captured in …

On one hand, the behavior of the compositor with non-captured capabilities is unspecified (and probably could be called an "implementation detail" here), but leaving it like that is subject to many possible interpretations and grades of complexity in handling it. On the other hand, there are also combinations that are more directly awkward to handle, like the behavior of pointers when they enter a barrier but pointers are not in the captured capability set.

From the Mutter perspective I've been looking to reduce the ways things can grab input from each other (and adding ways to notify each other). I would prefer to avoid the complexities that arise from adding per-device/capability granularity to input capturing. I wonder if it would be possible to remove the requested capabilities or make them optional, defaulting to "give me all events I could care about", or perhaps that's been the way to interpret those all along?
Just for the record, we can differentiate between capturing (portal) and transmitting (libei) events. In the case of synergy, we cannot transmit touch events, they're not supported by synergy. But we could still capture touch events (and discard them) once the pointer barrier is triggered. I'm not sure that's the best user-visible behaviour though.

I'm thinking of the use-case of e.g. forwarding a tablet device. This should probably work like USB forwarding, i.e. per-device only. The tablet device generally makes the pointer barriers more "interesting" - a tablet in relative mode can hit the pointer barrier without a pointer capability being necessary anywhere. Especially in mutter where tablets control separate cursors. But since we don't have tablet support right now, removing the capabilities in …
Aha, so the capabilities reported here are about capturing, and the compositor/libei can still think otherwise about the events sent.
I didn't get there as there's the overall TABLET capability AFAICS, but indeed, multiple pointers make it all more fun :). If it is possible to have different pointing devices trigger multiple barriers capturing input towards different computers, questions like "what happens if all request KEYBOARD capability" start to pop up. For the specific use case of tablets, I tend to see more sense in a "switch host" pad button action like there is for cycling displays.
Cool, time will tell if it's simple or simplistic :); it could always be extended at some point later on.
correct. the libei events must be a subset of the capabilities in the portal, but otherwise they're independent. libei also has the concept of seats (decided by the compositor) which cater for the multi-pointer scenario. This portal does not handle the case where you want to capture only one of the seats, pointer barriers apply to all (provided the compositor agrees of course). libei however splits devices by seat so you can separate the event stream per seat.

as for the tablet capability: I'm somewhat planning to add this to libei together with gestures and whatever else libinput does already (in a similar way anyway). since libei is supposed to be a generic input emulation layer, it should provide the various devices we know of already. it's not implemented anywhere but think of this as libinput-over-the-wire and that's fairly close.
Dropped by to describe a use case I have for this. I work on a remote sharing application that emulates multiple cursors. One user is the owner and controls the "real" cursor; the other cursors are "fake" ones that are just rendered in addition to the real cursor.

The problem occurs when a remote user owns the real cursor and the local user moves their mouse around. The X server will continue to listen to the local mouse and move the cursor around. This can be addressed by telling the X server to stop listening for events from the local mouse input devices; however, our application still needs those events to control a fake cursor and retake control when the local user clicks a mouse button. So when this occurs, our application needs to listen for mouse input outside of the compositor, since we had to disable it there. This is where we could make use of the proposed CaptureInput portal.
If you follow semver, could you not make a breaking change on every minor release (0.x -> 0.y) until 1.0? Could be useful to make a usable release for this portal, so that it gets some real-world testing.
That's what semver is for, no?
ftr, libei 1.0.0RC1 was released last week and the API is now stable. however, this PR was updated to remove any dependencies on libei itself - just like #762 there is no libei-specific code here anymore, it's just a plain fd that gets passed around. Moving out of Draft, I think this is ready.
Seems I have 3 comments that I didn't know of, let's see what they are (can't see them in the GitHub UI).
Is there anything else left for merging this in? It seems like all the feedback has been addressed. I'm trying to understand where this stands so that I can ask the KDE folks to look into implementing this in …
This portal is aimed at allowing applications to capture input from physical devices; the prime use case is the server component of an InputLeap (a Synergy fork) setup where local devices are used to control a remote display.

The general flow for the InputCapture session is:
- application queries the "zones" available (usually representing the desktop)
- application sets up "pointer barriers"
- application provides a libei-compatible handle for the actual input events
- when a cursor moves against/across a pointer barrier, the compositor notifies the client and re-routes input events from libinput to libei instead of controlling the local cursor

Notable: the actual input events are not part of the portal here, they're handled by the direct libei connection between compositor and application. This portal has no direct dependencies on libei itself.

The compositor is in charge of everything - it can restrict some screens from showing up in the regions, it can deny pointer barriers at locations, it decides when a pointer barrier is crossed, it can stop capturing (or filter events) at any time, etc.
Took the liberty to push cosmetic code style changes.
This portal is aimed at allowing applications to capture input from physical devices; the prime use case is the server component of a Barrier/Synergy setup where local devices are used to control a remote (Barrier/Synergy client) display.

The general flow for an InputCapture session is:
- application queries the "zones" available (usually representing the desktop)
- application sets up "pointer barriers"
- application provides a libei-compatible handle for the actual input events
- when a cursor moves against/across a pointer barrier, the compositor notifies the client and re-routes input events from libinput to libei instead of controlling the local cursor

Notable: the actual input events are not part of the portal here, they're handled by the direct libei connection between compositor and application.

The compositor is in charge of everything - it can restrict some screens from showing up in the zones, it can deny pointer barriers at locations, it decides when a pointer barrier is crossed, it can stop capturing (or filter events) at any time, etc.
This requires active/passive libei contexts, see
https://gitlab.freedesktop.org/libinput/libei/-/merge_requests/80
cc @jadahl, @ofourdan, @p12tic