[Proposal] Look at using Tree Sitter to parse C# #23

Open
wiltaylor opened this issue Sep 20, 2023 · 2 comments

Comments

@wiltaylor

Another possibility: instead of using AI to parse code, you could use something like Tree-sitter to parse the C# code into an abstract syntax tree and then use that tree to translate into other languages/engines.

This might be a more maintainable approach moving forward.

Tree-sitter has grammars for nearly every mainstream language and is used in many popular projects, such as Neovim.

Some links:

@bshikin
Contributor

bshikin commented Sep 20, 2023

Tree-sitter is already used as the base for C# parsing and for separating class definitions from methods.

How would you perform the actual translation with tree-sitter?

@orbikm

orbikm commented Sep 20, 2023

Typically this is how this type of tooling is built. Using AI and LLMs may be passable with a lot of subject-matter expertise in that area, but my impression is that it would be easy to get to 90% correct and then very difficult to get the remaining 10% done.

Using a C# parser like Tree-sitter (or an alternative) to generate an AST is one step of the process. The next step is to use the AST to generate some kind of model, which contains similar info to the AST but can also embed useful context into the nodes. Finally, you perform a projection step, where the model is used to generate output code in the target language. This is typically done using some kind of templating language, e.g. Jinja (if Python).
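To make the three steps concrete, here is a rough sketch in Python using only the standard library. Everything in it is an assumption for illustration: the `AST` dict is a hand-written stand-in for what a real parser would emit, `TYPE_MAP`, `FieldModel`, `ClassModel`, `build_model`, and `project` are hypothetical names, the target is a Python-style class picked arbitrarily, and `string.Template` stands in for a fuller templating engine such as Jinja.

```python
import re
from dataclasses import dataclass, field
from string import Template

# Step 1 (stand-in): a simplified, hand-written AST for
#   public class Player { public int Health; public string Name; }
# A real parser would produce a much richer tree than this.
AST = {
    "type": "class_declaration",
    "name": "Player",
    "children": [
        {"type": "field_declaration", "name": "Health", "cs_type": "int"},
        {"type": "field_declaration", "name": "Name", "cs_type": "string"},
    ],
}

# Hypothetical mapping from C# types to target-language types.
TYPE_MAP = {"int": "int", "string": "str", "float": "float", "bool": "bool"}


@dataclass
class FieldModel:
    name: str          # already renamed for the target language
    target_type: str   # already resolved to a target-language type


@dataclass
class ClassModel:
    name: str
    fields: list = field(default_factory=list)


def to_snake(name: str) -> str:
    """Convert PascalCase to snake_case, e.g. MaxHealth -> max_health."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_model(ast: dict) -> ClassModel:
    """Step 2: turn AST nodes into a model with extra context embedded."""
    model = ClassModel(name=ast["name"])
    for child in ast["children"]:
        if child["type"] == "field_declaration":
            model.fields.append(
                FieldModel(
                    name=to_snake(child["name"]),
                    # Unknown types fall back to "object" in this sketch.
                    target_type=TYPE_MAP.get(child["cs_type"], "object"),
                )
            )
    return model


CLASS_TEMPLATE = Template("class $name:\n$body")
FIELD_TEMPLATE = Template("    $name: $type")


def project(model: ClassModel) -> str:
    """Step 3: project the model into target-language source via templates."""
    body = "\n".join(
        FIELD_TEMPLATE.substitute(name=f.name, type=f.target_type)
        for f in model.fields
    )
    return CLASS_TEMPLATE.substitute(name=model.name, body=body)


if __name__ == "__main__":
    print(project(build_model(AST)))
```

The point of the middle step is that naming and type decisions are made once in the model, so the templates stay dumb and supporting another target mostly means adding another set of templates.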

I have written and maintained numerous systems like this for exposing API projections in different languages for over a decade now, and I think this is a scalable and effective solution compared with relying on an LLM.

If this is an avenue we want to go down, I would suggest looking into some of the core Reflection libraries included in .NET as a means of building the AST / model. It may not be necessary to pull in a dependency for that.
