Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add prompt snippets support #234220

Open
wants to merge 58 commits into
base: main
Choose a base branch
from

Conversation

legomushroom
Copy link
Member

@legomushroom legomushroom commented Nov 19, 2024

@legomushroom legomushroom self-assigned this Nov 19, 2024
@vs-code-engineering vs-code-engineering bot added this to the November 2024 milestone Nov 19, 2024
@legomushroom legomushroom force-pushed the legomushroom/prompt-snippet-completions branch from 6f5026f to a927e4e Compare November 19, 2024 21:45

prompt.parts
.forEach((part, i) => {
public async resolveVariables(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly readability improvements in this file, the logic remains the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but I'd prefer to keep PRs focused more narrowly, and the word "mostly" makes it a little harder to review :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sorry, it was very hard to read even on 27" monitor, so I was formatting it on the go as I was parsing through the code, and then decided to keep it. The same applies for other cases where I split long lines over. Let me know if you think that reverting this would help with the review at this point 👂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also recommend scoping PRs down where possible, large PRs with many changes make it hard on the reviewers and ultimately delay landing features.

) {
super();
this._register(widget.inputEditor.onDidChangeModelContent(e => {
e.changes.forEach(c => {
// Don't mutate entries in _variables, since they will be returned from the getter
this._variables = coalesce(this._variables.map(ref => {
this._variables = coalesce(this._variables.map((ref) => {
if (c.text === `#file:${ref.filenameWithReferences}`) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the only way I made it work and its not ideal, any better ideas? cc @roblourens @joyceerhl

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this trying to do? Is this triggered when that text is inserted all at once?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we add the (+N more) label to a filename after nested references are resolved, this is run and as the result the variables are removed. The logic below infers that the variables text have changed and are hence the variables are no longer valid and need to be removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we update the variable objects when we update the labels, this if check works, - if the variable text is really the same as the variable should have, do nothing.

endLineNumber: ref.range.endLineNumber,
endColumn: ref.range.endColumn + delta
}
ref.range = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roblourens do you recall the main reason for the // Don't mutate entries in _variables.. above? Because the ref is now a class instance with event listeners now, I do need to mutate the variables here 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because those objects will have been previously returned, and other objects will be holding on to them, and not expecting them to change. Maybe there is somewhere that we diff the previous object with the new one, and mutating the object breaks that. There are other ways we can solve that. But I don't quite understand the problem for you

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there is somewhere that we diff the previous object with the new one

But that would break only if the diff logic relies on the object pointers, doing diffing by-attribute should be unaffected in either case?

There is no really a problem for me, just wanted to make sure that mutating the objects is fine here. We do need to mutate them now because the ref variable is a class instance now rather than a simple object it was before. The class instance can have event listeners registered on it by other code, and it handles nested file references resolution which we don't want to again over and over 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that would break only if the diff logic relies on the object pointers, doing diffing by-attribute should be unaffected in either case?

The attributes would change, and break the diffing. So we just need to figure out where this matters, and make sure that code can handle a mutable object. I don't know where that is off the top of my head, I can help figure it out later.

@legomushroom legomushroom force-pushed the legomushroom/prompt-snippet-completions branch from a927e4e to 2fa00b9 Compare November 19, 2024 22:18
Copy link
Member

@roblourens roblourens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not done reviewing but wanted to drop the first round of comments. Overall, when I'm working on a large feature, I usually try to start pushing small PRs to main as early as possible. That's the best way to avoid merge conflicts and get feedback early. Don't be afraid of pushing an incomplete feature behind a hidden setting, as long as it won't break anything. We do that often on this team. I also would rather avoid PRs that make multiple unrelated changes. It just gets a bit hard to review.

src/vs/base/common/assertDefined.ts Outdated Show resolved Hide resolved
@@ -186,7 +186,7 @@ export interface ITransformer<Original, Transformed> {
error?: IErrorTransformer;
}

export function newWriteableStream<T>(reducer: IReducer<T>, options?: WriteableStreamOptions): WriteableStream<T> {
export function newWriteableStream<T>(reducer: IReducer<T> | null, options?: WriteableStreamOptions): WriteableStream<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional argument? We generally use undefined over null

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used null on purpose to make it explicit to prevent subtle silent errors. It is too easy to overlook a variable or a function return value being an undefined and pass it over, hence I wanted to make sure the folks that using this to be very specific.

I do appreciate the work that TS has done with void and not treating the foo?: string and foo: undefined | string the same anymore, but there is still the cases when requiring the explicit null saves some hair on your head 🤗 Do you think it would be fine with undefined? You have more context so happy to revert if you think it won't be a pain in the future.

state: WorkingSetEntryState.Attached,
kind: 'reference',
});

seenEntries.add(child);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a correct fix (cc @joyceerhl) but is this method even related to your new feature? I prefer not grouping unrelated fixes into a PR that's already really big

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one relates to my feature only - the child references is the new logic. The existing logic is now on the line 1086 above now, and that one would still need the fix to be applied.


prompt.parts
.forEach((part, i) => {
public async resolveVariables(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but I'd prefer to keep PRs focused more narrowly, and the word "mostly" makes it a little harder to review :)

src/vs/workbench/contrib/chat/browser/chatWidget.ts Outdated Show resolved Hide resolved
src/vs/workbench/contrib/chat/browser/chatWidget.ts Outdated Show resolved Hide resolved
}

if (typeof value === 'string') {
return value.trim().toLowerCase() === 'true';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this also expecting a string vs boolean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The returned value is unknown and we provide the JSON view of the config, so users may add the "true"(or even "True") value there and get confused? Or we already parse it to a correct boolean value under the hood?

}

// get all file references in the file contents
const references = await this.codec.decode(fileStream.value).consumeAll();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering whether this needs to be a stream? Do you use it as a stream?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just nice to have the stream since you are working from a file stream, just wondering if there is a case where you would use it that way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the amount of data we need to parse might be substantial and because filesystem IO is involved, I thought it is important to keep it as a stream so we can do the parsing concurrently.

I have a follow-up task to make the logic here take leverage of that. Beside this line, the decoders are composed(e.g., PromptFileDecoder uses SimpleTokenDecoder) hence fully take advantage of the fact that an underlying decoder is a stream of messages 🚀

@roblourens
Copy link
Member

I can't get this to work- I set the setting, should I see intellisense for #file? If I manually type #file:./test.js, it doesn't include that file.

Also, it seems like this might break normal implicit context?

@legomushroom legomushroom force-pushed the legomushroom/prompt-snippet-completions branch from 356d78b to dc90f20 Compare November 22, 2024 01:27
@legomushroom
Copy link
Member Author

@roblourens

I can't get this to work- I set the setting, should I see intellisense for #file? If I manually type #file:./test.js, it doesn't include that file.

You mean autocompletion? Yeah it should work just fine, I don't recall it to broken at any point 🤔

Also, it seems like this might break normal implicit context?

It was broken for a bit a while back but should work just fine now, please try again!

*/
private updateVariableTexts(): void {
for (const variable of this._variables) {
const text = `#file:${variable.filenameWithReferences}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked the idea of having a distinct file extension for these files, which would also enable implementing attaching prompt snippets as a separate variable type instead of overloading the existing #file: variable and implicit context. I didn't see discussion of this route in the issue you posted but let me know if that was already evaluated as an option and dismissed. For context, we have #kb: for GitHub knowledge bases when using @github. I think that would result in a simpler and more contained implementation of this functionality, and also as a user I personally wouldn't expect to pick a prompt snippet through #file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we decided to revisit this after the first round of feedback, once we got some usage we should be in a better place to make this call 👍


prompt.parts
.forEach((part, i) => {
public async resolveVariables(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also recommend scoping PRs down where possible, large PRs with many changes make it hard on the reviewers and ultimately delay landing features.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants