Add range formatting support #92

lionel- · 2024-12-11T14:28:35Z

Closes #63.

Range formatting is trickier than whole document formatting because we need to find the complete entity that makes the most sense to format around the provided range. To that end we're allowed to format a larger range than provided (maybe a smaller as well but we don't do that here and that would likely feel inconsistent).

Once we have found the piece of code to format, anything outside it is left alone. In order to match the surrounding indentation, Biome (in this cas biome_formatter::format_sub_tree()) does some work to figure out the initial indentation and the resulting output is printed with this indentation.

Biome provides format_range() but I haven't ended up using it because of the way it selects a safe node to format. There are two issues with it:

First it tends to select too high in the tree. It uses this algorithm (https://github.com/biomejs/biome/blob/2a098f94/crates/biome_formatter/src/lib.rs#L1590-L1607) from prettier (https://github.com/prettier/prettier/blob/cae19518/src/main/range-util.js#L36).

If you try to format the second line in:
```
1+1
2+2
```
It ends up selecting the whole expression list instead of the binary expression.
Second it only supports selecting a single node. If you wanted to format the first two lines in
```
1+1
2+2
3+3
```
its only possible choice would be to extend the range to the enclosing expression list and format the third line as well.

For comparison here is what Ruff does with an algorithm that works in the opposite direction, starting from the root node: https://github.com/astral-sh/ruff/blob/0e9427255fe3f9f42e1947a6f35af4483c101e95/crates/ruff_python_formatter/src/range.rs#L149. The idea is to find the deepest "logical line" starting from the root. I used this as an inspiration for the find_deepest_enclosing_logical_lines() routine implemented in this PR. The main difference is that it finds all consecutive logical lines within the range so that the user can select multiple lines of code to format.

Currently I define a logical line as any child of the top-level program or of an { expression. In the future, I think it'd be helpful to also treat function arguments and binary operands as logical lines. That would be helpful when format-on-paste is enabled and the user is pasting inside an expanded argument list or an expanded pipeline. Air will be able to only format what's been pasted. Currently we'll extend the range to format the whole thing.

There's a bit of trickery involving rewrapping the nodes in an expression list, itself wrapped in an artificial RRoot node. The first wrapping is necessary to handle multiple logical lines. The second is necessary to resolve an issue in Biome's comment mapper. If the node to format is not at least two levels deep, leading comments end up attached to nodes deeper in the tree (see https://github.com/biomejs/biome/blob/2a098f94aa58cedfd2691de4f3fd3ef65c05c5d6/crates/biome_formatter/src/comments/builder.rs#L104-L112). This introduces hard line breaks where they are not expected. One effect from wrapping nodes in this way is that a hard line is introduced at the end of the expression list. We just remove it at the end.

DavisVaughan · 2024-12-11T19:46:41Z

That would be helpful when format-on-paste is enabled and the user is pasting inside an expanded argument list or an expanded pipeline. Air will be able to only format what's been pasted. Currently we'll extend the range to format the whole thing.

I don't mind this honestly because if an argument breaks, then that means the entire function call should break. I imagine you might could get yourself into weird scenarios where an argument is formatted correctly but the function call as a whole isn't

DavisVaughan

I think everything looks good except for the find_deepest_enclosing_logical_lines algorithm, see comments below

CHANGELOG.md

crates/lsp/src/handlers_format.rs

lionel- · 2024-12-12T16:25:12Z

I don't mind this honestly because if an argument breaks, then that means the entire function call should break. I imagine you might could get yourself into weird scenarios where an argument is formatted correctly but the function call as a whole isn't

I was envisioning only expanded calls / pipelines would work this way. Flattened calls and pipelines would be formatted as a whole.

I think the idea for range formatting is not to leave the whole document in a consistently formatted state but just the minimal range that the user is currently working with. Changing code outside of this scope is likely to be distracting. Then the user can save or format the whole doc if needed.

That said I don't think this would be worth the extra complexity so I'm happy with what we have here.

DavisVaughan · 2024-12-12T22:17:00Z

crates/lsp/src/handlers_format.rs

+    let logical_lines: Vec<RSyntaxNode> = iter
+        .map(|expr| expr.into_syntax())
+        .skip_while(|node| !node.text_range().contains(range.start()))
+        .take_while(|node| node.text_range().start() <= range.end())
+        .collect();


In this case, were you expecting that it formats both 2 + 2 and THE ENTIRE for loop? i.e. note how 1 : 10 is formatted even though it is outside the selection.

It seems to be due to this bit here.

Screen.Recording.2024-12-12.at.5.16.07.PM.mov

As I mentioned in #92 (comment) I was expecting that this just formats 2 + 2 because that is the only node inside the common logical line "root" fully covered by the user's selection

It seems like the most conservative approach that avoids going outside of the user's selection?

Should it be this?

Suggested change

let logical_lines: Vec<RSyntaxNode> = iter

.map(|expr| expr.into_syntax())

.skip_while(|node| !node.text_range().contains(range.start()))

.take_while(|node| node.text_range().start() <= range.end())

.collect();

let logical_lines: Vec<RSyntaxNode> = iter

.map(|expr| expr.into_syntax())

.filter(|node| range.contains_range(node.text_trimmed_range()))

.collect();

Careful to use text_trimmed_range() here, because we'd want this (where < and > represent the selection) to still format the 1+1 even though it doesn't span the full text_range() when you add in the comment

# hi there <1+1>

This breaks one of your tests related to partial selection (reproduced in the image below) where I'd argue that only 1+1 should be reformatted

yes this was on purpose. We always format the selected part of user code. Sometimes it means widening the range to the highest level of logical line selected by the user. In other words we never narrow the requested range, we only widen it.

I think what I don't love about this is that it means we can format lines arbitrarily far away from the user's actual selection

Like in this case I accidentally selected a little too far up and it means that the whole function is formatted, even if thats 20 lines further up

Screen.Recording.2024-12-13.at.10.02.53.AM.mov

We talked and decided to keep the current behavior, but with a comment around these lines and the fact that they allow the current implementation to expand the user's range, and add in my suggested code above as a commented out alternative that is more conservative if we ever wanted to switch to that

Happens in edge cases when biome returns a `Printed::new_empty()`

lionel- · 2024-12-16T11:13:37Z

Here is an example of the behaviour difference I'm worried about between { lists and argument lists. In this screencast I'm pasting an expression in different parts of the file:

Screen.Recording.2024-12-16.at.12.10.44.mov

The formatting of other arguments in argument lists feel inconsistent with how we don't format expressions in braced lists. This is why I wondered if we should extend our notion of logical lines to include lines in expanded lists (and possibly lines in pipelines). But let's try out the current approach to start with.

lionel- requested a review from DavisVaughan December 11, 2024 14:28

DavisVaughan requested changes Dec 11, 2024

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

crates/lsp/src/handlers_format.rs Show resolved Hide resolved

crates/lsp/src/handlers_format.rs Show resolved Hide resolved

crates/lsp/src/handlers_format.rs Show resolved Hide resolved

lionel- force-pushed the feature/range-formatting branch from e619633 to c7741dc Compare December 12, 2024 13:32

DavisVaughan reviewed Dec 12, 2024

View reviewed changes

DavisVaughan approved these changes Dec 13, 2024

View reviewed changes

lionel- added 17 commits December 16, 2024 11:58

Support RangeFormatting LSP request

cc01f0a

Respect output range

7702278

Add failing test for formatting range

f7fe3ab

Add tests for no formatting changes

c78ee7e

Handle null range

02643f0

Happens in edge cases when biome returns a `Printed::new_empty()`

Add test for range bounded by curly braces

1ab2fd9

Find deepest enclosing logical line

f5e6c50

Wrap in root node to fix wrong comment attachment

73e39cb

Add test for mismatched indentation

06d04e8

Format all logical lines included in range

1a2e1cb

Add test for deep logical line

7577908

Add changelog

dc87878

Rework algorithm to handle ranges over nested lists

a754eff

Rename dev version header

cdc491a

Fix clippy notes

523004b

Add doodle_and_range() for easier testing

4be4b61

Trim range when inspecting end positions

5e8dfd8

lionel- force-pushed the feature/range-formatting branch from f90e017 to 5e8dfd8 Compare December 16, 2024 10:58

Comment on our choice to always widen selections

8f425bc

lionel- merged commit 357be89 into main Dec 16, 2024
4 checks passed

lionel- deleted the feature/range-formatting branch December 16, 2024 11:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add range formatting support #92

Add range formatting support #92

lionel- commented Dec 11, 2024

DavisVaughan commented Dec 11, 2024

DavisVaughan left a comment

lionel- commented Dec 12, 2024

DavisVaughan Dec 12, 2024

DavisVaughan Dec 12, 2024 •

edited

Loading

DavisVaughan Dec 12, 2024

lionel- Dec 13, 2024 •

edited

Loading

DavisVaughan Dec 13, 2024 •

edited

Loading

DavisVaughan Dec 13, 2024

lionel- commented Dec 16, 2024

Add range formatting support #92

Add range formatting support #92

Conversation

lionel- commented Dec 11, 2024

DavisVaughan commented Dec 11, 2024

DavisVaughan left a comment

Choose a reason for hiding this comment

lionel- commented Dec 12, 2024

DavisVaughan Dec 12, 2024

Choose a reason for hiding this comment

DavisVaughan Dec 12, 2024 • edited Loading

Choose a reason for hiding this comment

DavisVaughan Dec 12, 2024

Choose a reason for hiding this comment

lionel- Dec 13, 2024 • edited Loading

Choose a reason for hiding this comment

DavisVaughan Dec 13, 2024 • edited Loading

Choose a reason for hiding this comment

DavisVaughan Dec 13, 2024

Choose a reason for hiding this comment

lionel- commented Dec 16, 2024

DavisVaughan Dec 12, 2024 •

edited

Loading

lionel- Dec 13, 2024 •

edited

Loading

DavisVaughan Dec 13, 2024 •

edited

Loading