multiply applied capture groups seems to ignore some captures #127

Open
asottile opened this issue Mar 11, 2020 · 3 comments
Labels
bug Issue identified by VS Code Team member as probable bug

Comments

@asottile
Contributor

This is a bit of an edge case, and I'm not sure how it is supposed to be handled. I don't have a concrete use case; I'm just trying to implement my own parser in Python using this project as a reference.

sample grammar

{
    "scopeName": "test",
    "patterns": [
        {
            "match": "((a)) ((b) c) (d (e)) ((f) )",
            "name": "matched",
            "captures": {
                "1": {"name": "g1"},
                "2": {"name": "g2"},
                "3": {"name": "g3"},
                "4": {"name": "g4"},
                "5": {"name": "g5"},
                "6": {"name": "g6"},
                "7": {
                    "patterns": [
                        {"match": "f", "name": "g7f"},
                        {"match": " ", "name": "g7space"}
                    ]
                },
                "8": {"name": "g8"}
            }
        }
    ]
}

sample file

a b c d e f z

tokenization using VS Code

$ node vsc.js cap.json f

Tokenizing line: a b c d e f z
 - token from 0 to 1 (a) with scopes test, matched, g1, g2
 - token from 1 to 2 ( ) with scopes test, matched
 - token from 2 to 3 (b) with scopes test, matched, g3, g4
 - token from 3 to 5 ( c) with scopes test, matched, g3
 - token from 5 to 6 ( ) with scopes test, matched
 - token from 6 to 8 (d ) with scopes test, matched, g5
 - token from 8 to 9 (e) with scopes test, matched, g5, g6
 - token from 9 to 10 ( ) with scopes test, matched
 - token from 10 to 11 (f) with scopes test, matched, g7f
 - token from 11 to 12 ( ) with scopes test, matched, g7space
 - token from 12 to 14 (z) with scopes test

I expect the f to have the scopes test, matched, g7f, g8, as in this output from my Python implementation:

>>> # ...
>>> state, regions = highlight_line(compiler, state, 'a b c d e f z', first_line=True)
>>> import pprint
>>> pprint.pprint(regions)
(Region(start=0, end=1, scope=('test', 'matched', 'g1', 'g2')),
 Region(start=1, end=2, scope=('test', 'matched')),
 Region(start=2, end=3, scope=('test', 'matched', 'g3', 'g4')),
 Region(start=3, end=5, scope=('test', 'matched', 'g3')),
 Region(start=5, end=6, scope=('test', 'matched')),
 Region(start=6, end=8, scope=('test', 'matched', 'g5')),
 Region(start=8, end=9, scope=('test', 'matched', 'g5', 'g6')),
 Region(start=9, end=10, scope=('test', 'matched')),
 Region(start=10, end=11, scope=('test', 'matched', 'g7f', 'g8')),
 Region(start=11, end=12, scope=('test', 'matched', 'g7space')),
 Region(start=12, end=13, scope=('test',)))
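
For readers following along, here is a minimal sketch of the expected behavior, assuming every capture that covers a character contributes its scope in group-number order, including a later group (8) nested inside a capture that has its own `patterns` (7). It uses Python's `re` module rather than Oniguruma, and `expected_regions`, `NAMES`, and `SUBPATTERNS` are hypothetical names for illustration only; this is not the vscode-textmate algorithm or the reporter's `highlight_line`.

```python
import re

# The sample grammar from this issue, hard-coded for illustration.
PATTERN = re.compile(r"((a)) ((b) c) (d (e)) ((f) )")
NAMES = {1: "g1", 2: "g2", 3: "g3", 4: "g4", 5: "g5", 6: "g6", 8: "g8"}
SUBPATTERNS = {7: [(re.compile("f"), "g7f"), (re.compile(" "), "g7space")]}


def expected_regions(line):
    """Return (start, end, scopes) tuples for one line of input."""
    # One scope stack per character; every character starts in the root scope.
    stacks = [["test"] for _ in line]
    m = PATTERN.search(line)
    if m:
        for i in range(m.start(), m.end()):
            stacks[i].append("matched")
        # Visit groups in numeric order so outer captures push before inner ones.
        for group in range(1, PATTERN.groups + 1):
            start, end = m.span(group)
            if start < 0:  # group did not participate in the match
                continue
            if group in NAMES:
                for i in range(start, end):
                    stacks[i].append(NAMES[group])
            # Simplification: each sub-pattern is applied independently
            # within the capture's span.
            for sub_re, sub_name in SUBPATTERNS.get(group, ()):
                for sub in sub_re.finditer(line, start, end):
                    for i in range(sub.start(), sub.end()):
                        stacks[i].append(sub_name)
    # Merge adjacent characters that share an identical scope stack.
    regions = []
    for i, stack in enumerate(stacks):
        scope = tuple(stack)
        if regions and regions[-1][2] == scope:
            regions[-1] = (regions[-1][0], i + 1, scope)
        else:
            regions.append((i, i + 1, scope))
    return regions


for region in expected_regions("a b c d e f z"):
    print(region)
```

Under these assumptions, the region covering `f` (10 to 11) carries both `g7f` and `g8`, whereas the VS Code output above stops at `g7f`.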
@alexdima
Member

alexdima commented Mar 11, 2020

I have tried also in TextMate and they appear to handle this in the way you expect:

image

Here is the grammar converted to TextMate's format:

{	patterns = (
		{	
			match = "((a)) ((b) c) (d (e)) ((f) )";
			name = "matched";
			captures = {
				1 = { name = "g1"; };
				2 = { name = "g2"; };
				3 = { name = "g3"; };
				4 = { name = "g4"; };
				5 = { name = "g5"; };
				6 = { name = "g6"; };
				7 = {
					patterns = (
						{ match = "f"; name = "g7f"; },
						{ match = " "; name = "g7space"; },
					);
				};
				8 = { name = "g8"; };
			};
		},
	);
}

alexdima added the bug label Mar 11, 2020
@RedCMD

RedCMD commented Oct 5, 2024

dup:
#164
#208

@asottile
Contributor Author

asottile commented Oct 5, 2024

@RedCMD usually dupe goes the other way since this one is older and has more context

asottile closed this as completed Oct 5, 2024
asottile reopened this Oct 5, 2024