adding more granular diff format for autoedits model training #6173

Open · wants to merge 15 commits into base: main
Conversation

@hitesh-1997 (Contributor) commented Nov 21, 2024

Context

The PR makes the following high-level changes:

  1. The current auto-edits model has trouble understanding the most recent diffs: it may suggest deleting a recently added line, or re-suggest a change that was recently deleted. One reason is that it lacks a separate view of short-term and long-term diffs.
  2. Introduces a more granular diff format for training the auto-edits model. Currently we use only a single diff format. The PR computes line-level diffs for the changes made in the editor and ensures that all contiguous changes are grouped together as a single entity. Additionally, it derives strategies to calculate the diff at different granularity levels. Refer to the class for the entry point.
  3. Introduces a helper function for the diff format that simulates document changes using markers. Refer to the helper function here.
  4. Refactors recent-edits handling to separate long-term and short-term diffs.
  5. Initially, the data is logged to telemetry so it can be used to train and evaluate the model offline.
  6. One final change logs 10 seconds of the user's diff data to analytics to capture the short-term diffs.
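The grouping described in item 2 can be illustrated with a small sketch. The `LineHunk` type and `groupContiguousLines` helper below are hypothetical names for illustration, not the PR's actual API:

```typescript
// Hypothetical sketch: group changed line numbers into hunks, so that
// contiguous edits are treated as a single entity when computing diffs.
interface LineHunk {
  startLine: number;
  endLine: number;
}

function groupContiguousLines(changedLines: number[]): LineHunk[] {
  const sorted = [...changedLines].sort((a, b) => a - b);
  const hunks: LineHunk[] = [];
  for (const line of sorted) {
    const last = hunks[hunks.length - 1];
    if (last && line <= last.endLine + 1) {
      // Adjacent (or duplicate) line: extend the current hunk.
      last.endLine = Math.max(last.endLine, line);
    } else {
      // Gap of more than one line: start a new hunk.
      hunks.push({ startLine: line, endLine: line });
    }
  }
  return hunks;
}
```

For example, edits on lines 1, 2, 3, 7, 8, and 10 would collapse into three hunks: 1–3, 7–8, and 10.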
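The marker-based simulation of document changes in item 3 might look roughly like the following. The `<I>`/`<D>` marker syntax and the `applyMarkers` helper are illustrative assumptions; the PR's actual helper may differ:

```typescript
// Hypothetical test helper: a document is written with <I>…</I> (inserted)
// and <D>…</D> (deleted) markers. The helper derives the "before" and
// "after" text, simulating the edits the markers describe.
function applyMarkers(marked: string): { before: string; after: string } {
  const before = marked
    .replace(/<I>[\s\S]*?<\/I>/g, '')       // inserted text is absent before the edit
    .replace(/<D>([\s\S]*?)<\/D>/g, '$1');  // deleted text is still present before
  const after = marked
    .replace(/<D>[\s\S]*?<\/D>/g, '')       // deleted text is gone after the edit
    .replace(/<I>([\s\S]*?)<\/I>/g, '$1');  // inserted text is present after
  return { before, after };
}
```

Writing test fixtures this way keeps the before/after states in a single annotated string instead of two near-duplicate documents.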
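The short-term versus long-term split described in items 1, 4, and 6 could be sketched as follows. The names, the `TimedEdit` shape, and the placement of the 10-second window are assumptions based on the description above:

```typescript
// Hypothetical sketch: partition recorded edits into a short-term bucket
// (within the last `shortTermMs` milliseconds) and a long-term bucket,
// so the model can be given separate views of recent and older changes.
interface TimedEdit {
  timestamp: number; // epoch milliseconds when the edit occurred
  diff: string;      // diff text for the edit
}

function splitEdits(
  edits: TimedEdit[],
  now: number,
  shortTermMs: number = 10_000 // 10-second window, per item 6
): { shortTerm: TimedEdit[]; longTerm: TimedEdit[] } {
  const shortTerm: TimedEdit[] = [];
  const longTerm: TimedEdit[] = [];
  for (const edit of edits) {
    (now - edit.timestamp <= shortTermMs ? shortTerm : longTerm).push(edit);
  }
  return { shortTerm, longTerm };
}
```

Keeping the two buckets separate lets the prompt (and the logged training data) distinguish an edit made seconds ago from the accumulated session history.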

Test plan

Added unit tests for the various changes.

@hitesh-1997 hitesh-1997 marked this pull request as ready for review November 24, 2024 00:12