Windows input improvements #651
This enables us to deliver more key combinations triggered with Tab, namely:
* Ctrl+Tab
* Ctrl+Shift+Tab
This entails proper decoding of UTF-16 surrogate pairs. With this change, we deliver KeyEvent::Char(...) events for code points outside of the BMP, such as many CJK code points as well as most emoji.

This works with both pasting and IME input in Windows Terminal. In a Conhost terminal it currently only works with IME input. Pasting doesn't work there because Conhost synthesizes Alt codes for the higher Unicode scalar values, rather than delivering a pair of surrogate code points. Some special handling will be required to interpret Unicode scalar values from Alt codes.
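As a rough illustration of the surrogate handling described above, here is a minimal sketch of a decoder that remembers a high surrogate until its low partner arrives in the next event. `SurrogateBuffer` and `push` are hypothetical names chosen for this example; they are not taken from the crossterm code in this PR.

```rust
/// Hypothetical helper (not crossterm's actual type): holds a high
/// surrogate until the matching low surrogate arrives in a later event.
#[derive(Default)]
struct SurrogateBuffer {
    pending_high: Option<u16>,
}

impl SurrogateBuffer {
    /// Feed one UTF-16 code unit. Returns a `char` once a complete
    /// code point is available, otherwise `None`.
    fn push(&mut self, unit: u16) -> Option<char> {
        match unit {
            // High surrogate: remember it and wait for the low half.
            0xD800..=0xDBFF => {
                self.pending_high = Some(unit);
                None
            }
            // Low surrogate: only meaningful right after a high surrogate.
            0xDC00..=0xDFFF => {
                let high = self.pending_high.take()?;
                char::decode_utf16([high, unit]).next()?.ok()
            }
            // Any other code unit is a complete BMP code point on its own.
            _ => {
                self.pending_high = None;
                char::from_u32(u32::from(unit))
            }
        }
    }
}
```

In this sketch, an unpaired low surrogate is silently dropped. Whether to drop it or report it is a design choice that the sketch does not settle.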
In addition to handling manual user input of Alt codes, this also handles pasting of Unicode characters from the supplementary planes into a Conhost terminal, as Conhost encodes such input by synthesizing the key sequence for an Alt code.
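A minimal sketch of how an Alt code result might be recognized, under the assumption that the console reports the composed character on the key-up event of the Alt key (`VK_MENU`) with a non-zero character field. `KeyRecord` is a simplified, hypothetical stand-in for the relevant fields of the Win32 `KEY_EVENT_RECORD`, and `alt_code_unit` is a name invented for this example.

```rust
/// Simplified, hypothetical stand-in for the fields of a Win32
/// KEY_EVENT_RECORD that matter for this sketch.
struct KeyRecord {
    key_down: bool,
    virtual_key_code: u16,
    u_char: u16,
}

/// Virtual-key code of the Alt key.
const VK_MENU: u16 = 0x12;

/// Returns the UTF-16 code unit carried by an Alt code, if this record
/// looks like the end of one. Assumption: the composed character is
/// delivered when Alt is released, with a non-zero `u_char`.
fn alt_code_unit(record: &KeyRecord) -> Option<u16> {
    let is_alt_code =
        record.virtual_key_code == VK_MENU && !record.key_down && record.u_char != 0;
    if is_alt_code {
        Some(record.u_char)
    } else {
        None
    }
}
```

The returned code unit can then go through the same UTF-16 decoding as any other input, so a surrogate delivered this way would be paired up like one delivered directly.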
Many key combinations produce key events which have u_char == 0, and these have been discarded until now. This includes, for example, all combinations involving the Ctrl+Alt modifiers, as well as many key combinations with just Ctrl.

We can provide events for such key combinations by consulting the keyboard layout to determine the character associated with the keys. Almost all keys on a keyboard have characters associated with them; it's just a question of whether we can determine what character corresponds to a key event. There are some caveats involved in doing that...

In addition, key events with u_char in the ASCII control code range were until now mapped into the ASCII range '@' to '_', which is inaccurate for many keys and for users with non-US keyboard layouts. The characters for key combinations that produce control codes are now also determined by consulting the keyboard layout.

The caveats revolve around determining the keyboard layout, which has two issues:
1. There is a race condition between the user typing in their terminal with one keyboard layout active, and the console application determining the keyboard layout while processing the key event later, as these two events happen asynchronously. If a user changes the active keyboard layout in between the two events, then the console application might misinterpret the character.
2. For console applications running in a Conhost terminal, it turns out to be very difficult to determine the active keyboard layout. There appear to be no APIs that reliably provide the layout.
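To make the problem with the old control-code handling concrete, here is a sketch of one way to read the mapping described above: shift the control code by 0x40 so it lands in '@' to '_'. The function name is hypothetical and the exact previous implementation may have differed.

```rust
/// Hypothetical reconstruction of the legacy behaviour: map an ASCII
/// control code (0x00..=0x1F) to the character 0x40 positions above it,
/// i.e. into the range '@'..='_'.
fn legacy_control_to_char(code: u8) -> Option<char> {
    match code {
        0x00..=0x1F => Some(char::from(code + 0x40)),
        _ => None,
    }
}
```

Under this mapping 0x03 becomes 'C', which matches Ctrl+C, but 0x09 becomes 'I' and 0x1B becomes '[' no matter which physical key produced the code. The control code alone does not identify the key, which is why the keyboard layout has to be consulted.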
What exactly is a surrogate and what does it try to fix?
In the context of Unicode, surrogates are code points in the range 0xD800 - 0xDFFF which are used in UTF-16 to encode other code points that don't otherwise fit in a single 2-byte code unit (i.e. every code point larger than 65535). They're relevant because we retrieve UTF-16 encoded input from the Win32 Console API when using ReadConsoleInputW in crossterm_winapi::console::Console::read_input. When the terminal wants to send us a character from outside the Basic Multilingual Plane (BMP), it encodes the character as a surrogate pair and passes it to us in two consecutive INPUT_RECORDs, each containing one UTF-16 code unit (the …
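For reference, the arithmetic behind a surrogate pair can be sketched as follows. `to_surrogate_pair` is a name made up for this illustration; the standard library offers `char::encode_utf16` for real use.

```rust
/// Illustrative only: split a code point above U+FFFF into its UTF-16
/// high and low surrogates. Returns `None` for BMP characters, which
/// fit in a single code unit.
fn to_surrogate_pair(c: char) -> Option<(u16, u16)> {
    let code_point = u32::from(c);
    if code_point < 0x1_0000 {
        return None;
    }
    // Subtracting 0x10000 leaves a 20-bit value.
    let offset = code_point - 0x1_0000;
    // The top 10 bits go into the high surrogate, the bottom 10 into the low.
    let high = 0xD800 + (offset >> 10) as u16;
    let low = 0xDC00 + (offset & 0x3FF) as u16;
    Some((high, low))
}
```

For example, U+1F600 yields the offset 0xF600, whose top ten bits are 0x03D and bottom ten bits are 0x200, giving the pair (0xD83D, 0xDE00).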
Awesome! Gave it a try and a test today. Looks like a good solution! Thanks for the contribution.
Hi
I was playing around with tui and crossterm and ran into issue #561, so I thought I'd dig in and fix it. While I was there, I then also decided to fix some other issues around missing or inaccurate events for certain key combinations (#536, #643).
The Unicode issue turns out to be fairly straightforward; we need to parse surrogate pairs (for Windows Terminal) and handle Alt codes (for Conhost).
The missing/inaccurate keys issue turns out to be much tougher. See commit f52347c for details. I don't think it's possible to solve correctly in general -- the available data and APIs are simply insufficient. The solution I propose makes a best-effort attempt at providing an accurate event for all key combinations.
This works as one would want in Windows Terminal, but is unreliable in Conhost if users change their keyboard layout. I could not find any reliable way to detect the active keyboard layout under a Conhost terminal. I decided to settle for a partial solution (works when not changing keyboard layouts) rather than hacking together an unreliable solution.
Regarding a hacky, unreliable solution: It is actually possible to find a window handle for a Conhost terminal that can be queried for the active keyboard layout, but no API directly offers such a handle. I found this out by looking through process/window/thread handles in Spy++ for a Conhost terminal session. Enumerating and digging through attached console processes and their window handles seemed very brittle to me, so I didn't pursue it further. We could dig more in that direction if you think it's worthwhile to have that for Conhost terminals, even if it could break for any future change to the OS/Conhost infrastructure.