Windows input improvements #651

matkaas · 2022-04-10T13:01:54Z

Hi

I was playing around with tui and crossterm and ran into issue #561, so I thought I'd dig in and fix it. While I was there, I also decided to fix some other issues around missing or inaccurate events for certain key combinations (#536, #643).

The unicode issue turns out to be fairly straightforward; we need to parse surrogate pairs (for Windows Terminal) and handle alt codes (for Conhost).

The missing/inaccurate keys issue turns out to be much tougher. See commit f52347c for details. I don't think it's possible to solve correctly in general -- the available data and APIs are simply insufficient. The solution I propose makes a best-effort attempt at providing an accurate event for all key combinations.
This works as one would want in Windows Terminal, but is unreliable in Conhost if users change their keyboard layout. I could not find any reliable way to detect the active keyboard layout under a Conhost terminal. I decided to settle for a partial solution (works when not changing keyboard layouts) rather than hacking together an unreliable solution.
Regarding a hacky, unreliable solution: It is actually possible to find a window handle for a Conhost terminal that can be queried for the active keyboard layout, but no API directly offers such a handle. I found this out by looking through process/window/thread handles in Spy++ for a Conhost terminal session. Enumerating and digging through attached console processes and their window handles seemed very brittle to me, so I didn't pursue it further. We could dig more in that direction if you think it's worthwhile to have that for Conhost terminals, even if it could break for any future change to the OS/Conhost infrastructure.

This enables us to deliver more key combinations triggered with tab, namely:

* Ctrl+Tab
* Ctrl+Shift+Tab
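As a rough illustration of the idea (not the actual crossterm code), the sketch below identifies Tab from the virtual key code instead of from the character in the record. `VK_TAB` (0x09) is the real Win32 constant; the `Modifiers` and `Key` types and the `parse_tab` function are hypothetical stand-ins invented for this example.

```rust
// Hypothetical, simplified stand-ins for the real event types.
const VK_TAB: u16 = 0x09; // Win32 virtual-key code for the Tab key

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct Modifiers {
    ctrl: bool,
    shift: bool,
    alt: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Key {
    Tab,
    BackTab, // Tab pressed together with Shift
}

/// Recognise Tab from the virtual key code, so the result does not
/// depend on what the record's `u_char` happens to hold.
fn parse_tab(virtual_key_code: u16, modifiers: Modifiers) -> Option<(Key, Modifiers)> {
    if virtual_key_code != VK_TAB {
        return None;
    }
    let key = if modifiers.shift { Key::BackTab } else { Key::Tab };
    Some((key, modifiers))
}
```

With this shape, Ctrl+Tab comes out as `Key::Tab` with `ctrl` set, and Ctrl+Shift+Tab as `Key::BackTab` with both `ctrl` and `shift` set.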

This entails proper decoding of UTF-16 surrogate pairs. With this change, we deliver KeyEvent::Char(...) events for code points outside of the BMP, such as many CJK code points as well as all emoji. This works with both pasting and IME input in Windows Terminal. In a Conhost terminal it currently only works with IME input. Pasting doesn't work there because Conhost synthesizes Alt codes for the higher Unicode scalar values, rather than delivering a pair of surrogate code points. Some special handling will be required to interpret Unicode scalar values from Alt codes.
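Since the two halves of a surrogate pair arrive in separate key records, the decoder has to keep state between records. A minimal sketch of that, assuming a hypothetical `SurrogateBuffer` helper (the name and shape are made up for illustration, only `char::decode_utf16` and `char::from_u32` are real standard library calls):

```rust
/// Hypothetical helper: remembers a high surrogate until its partner arrives.
#[derive(Default)]
struct SurrogateBuffer {
    pending_high: Option<u16>,
}

impl SurrogateBuffer {
    /// Feed one UTF-16 code unit (one key record's `u_char`).
    /// Returns a `char` once a full scalar value is available.
    fn push(&mut self, unit: u16) -> Option<char> {
        match unit {
            // High (leading) surrogate: store it and wait for the next record.
            0xD800..=0xDBFF => {
                self.pending_high = Some(unit);
                None
            }
            // Low (trailing) surrogate: only valid right after a high one.
            0xDC00..=0xDFFF => {
                let high = self.pending_high.take()?;
                char::decode_utf16([high, unit]).next()?.ok()
            }
            // Any other unit is a complete BMP code point on its own.
            _ => {
                self.pending_high = None;
                char::from_u32(u32::from(unit))
            }
        }
    }
}
```

A lone low surrogate, or a high surrogate followed by an ordinary unit, is silently dropped in this sketch; a real implementation might prefer to report such input differently.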

In addition to handling manual user input of Alt codes, this also handles pasting of unicode from the supplemental planes into a Conhost terminal, as the Conhost terminal encodes such input by synthesizing key sequences for an Alt code.
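One way such Alt code detection could look is sketched below. It rests on an assumption that is not spelled out in this PR text: that a completed Alt code reaches the application as the release of the Alt key with a non-zero character attached. `VK_MENU` (0x12) is the real Win32 virtual-key code for Alt; `KeyRecord` and `alt_code_unit` are hypothetical names for this example.

```rust
const VK_MENU: u16 = 0x12; // Win32 virtual-key code for the Alt key

/// Hypothetical, trimmed-down view of a console key record.
struct KeyRecord {
    key_down: bool,
    virtual_key_code: u16,
    u_char: u16,
}

/// Assumption: a completed Alt code shows up as the *release* of Alt
/// carrying a non-zero UTF-16 code unit.
fn alt_code_unit(record: &KeyRecord) -> Option<u16> {
    let is_alt_code =
        record.virtual_key_code == VK_MENU && !record.key_down && record.u_char != 0;
    if is_alt_code {
        Some(record.u_char)
    } else {
        None
    }
}
```

The returned value is still a UTF-16 code unit, so it would need to go through the same UTF-16 decoding as any other character input before it becomes a `char`.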

matkaas · 2022-04-10T13:10:53Z

Whoops, commit 90d11bb breaks the alt code parsing introduced by 0fcce0d. I'll get that fixed.

Many key combinations produce key events which have u_char == 0, and these have been discarded until now. This includes, for example, all combinations involving the Ctrl+Alt modifiers, as well as many key combinations with just Ctrl. We can provide events for such key combinations by consulting the keyboard layout to determine the character associated with the keys. Almost all keys on a keyboard have characters associated with them -- it's just a question of whether we can determine what character corresponds to a key event. There are some caveats involved in doing that...

In addition, key events with u_char in the ASCII control code range were until now mapped into the ASCII range '@' to '_', which is inaccurate for many keys and for users with non-US keyboard layouts. The characters for key combinations that produce control codes are now also determined by consulting the keyboard layout.

The caveats revolve around determining the keyboard layout, which has two issues:

1. There is a race condition between the user typing in their terminal with one keyboard layout active, and the console application determining the keyboard layout while processing the key event later, as these two events happen asynchronously. If a user changes the active keyboard layout in between the two events, then the console application might misinterpret the character.
2. For console applications running in a Conhost terminal, it turns out to be very difficult to determine the active keyboard layout. There appear to be no available APIs that reliably provide the layout.
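For reference, the old control-code mapping described above amounts to adding 0x40 to the code. The sketch below is a reconstruction from that description, not the removed crossterm code, and `legacy_control_char` is a hypothetical name.

```rust
/// Sketch of the previous behaviour: an ASCII control code (0x00..=0x1F)
/// is shifted into the range '@'..='_' by adding 0x40.
fn legacy_control_char(u_char: u16) -> Option<char> {
    match u_char {
        0x00..=0x1F => char::from_u32(u32::from(u_char) + 0x40),
        _ => None,
    }
}
```

The mapping looks only at the control code, never at the key that was pressed. Control code 0x1B always becomes '[', whether or not the pressed key carries that character on the user's layout, which is why the result is unreliable on non-US layouts.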

matkaas · 2022-04-10T14:00:48Z

> Whoops, commit 90d11bb breaks the alt code parsing introduced by 0fcce0d. I'll get that fixed.

Made a fixup for 90d11bb; it's fixed in the new commit f52347c.

TimonPost · 2022-04-17T15:59:50Z

What exactly is a surrogate and what does it try to fix?

matkaas · 2022-04-17T16:56:38Z

> What exactly is a surrogate and what does it try to fix?

In the context of Unicode, surrogates are code points in the range 0xD800 - 0xDFFF which are used in UTF-16 to encode other code points that don't otherwise fit in a single 2-byte code unit (i.e. every code point larger than 65535).
The Wikipedia page for UTF-16 explains this in more detail: https://en.wikipedia.org/wiki/UTF-16

They're relevant because we retrieve UTF-16 encoded input from the Win32 Console API when using ReadConsoleInputW in crossterm_winapi::console::Console::read_input. When the terminal wants to send us a character from outside the Basic Multilingual Plane (BMP), it encodes the character as a surrogate pair and passes it to us in two consecutive INPUT_RECORDs, each containing one UTF-16 code unit (the u_char member). We have to decode the two code units together to retrieve the character (unicode scalar value).
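To make the decoding step concrete, here is the arithmetic that combines the two code units, written out by hand as a small sketch. The function name `combine_surrogates` is made up for this example; in practice the standard library's `char::decode_utf16` does the same job.

```rust
/// Combine a high and a low surrogate into a Unicode scalar value.
/// Returns `None` if either unit is outside its surrogate range.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    if !(0xD800..=0xDBFF).contains(&high) || !(0xDC00..=0xDFFF).contains(&low) {
        return None;
    }
    // Each surrogate contributes 10 bits; the result is offset by 0x10000,
    // the first code point beyond the BMP.
    let code_point =
        0x10000 + ((u32::from(high) - 0xD800) << 10) + (u32::from(low) - 0xDC00);
    char::from_u32(code_point)
}
```

For example, U+1F600 is encoded as the pair 0xD83D, 0xDE00: subtracting 0x10000 leaves 0xF600, whose upper 10 bits (0x3D) are added to 0xD800 and whose lower 10 bits (0x200) are added to 0xDC00.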

TimonPost

Awesome! Gave it a try and a test today. Looks like a good solution! Thanks for the contribution.

matkaas added 3 commits April 10, 2022 13:14

Recognise Tab by virtual key code on Windows

c35dae7

This enables us to deliver more key combinations triggered with tab, namely:

* Ctrl+Tab
* Ctrl+Shift+Tab

Handle Alt code input on Windows

0fcce0d

In addition to handling manual user input of Alt codes, this also handles pasting of unicode from the supplemental planes into a Conhost terminal, as the Conhost terminal encodes such input by synthesizing key sequences for an Alt code.

matkaas requested a review from TimonPost as a code owner April 10, 2022 13:01

TimonPost approved these changes Apr 23, 2022

TimonPost merged commit 0b4a06a into crossterm-rs:master Apr 23, 2022

matkaas deleted the windows-input-improvements branch April 23, 2022 09:25

This was referenced Jun 30, 2022

Ctrl + Space Key Event not read on Windows. #643

Closed

Combination of CONTROL + <Non-Alphanumeric> i.e. [ , ], \, / etc. gives, faulty readings or no readings at all. #536

Closed

Some unicode chars not working on windows terminal #561

Closed

fdncred mentioned this pull request Oct 2, 2023

Can't insert or paste some unicode points nushell/nushell#10578

Open
