Support multiple selection in "Select and Edit" #418

Open
radrad opened this issue Oct 24, 2024 · 5 comments

radrad commented Oct 24, 2024

Describe the bug

I am using VS Code Insiders in admin mode.

In the backend .env I entered my API keys:
OPENAI_API_KEY=sk-2siLny...
ANTHROPIC_API_KEY=sk-ant-api0...

When I drag and drop the .mp4 video below:
https://github.com/user-attachments/assets/22713a47-4d23-44e9-b83f-dcb774ebbcc8
I get a notification dialog:
Error assembling prompt. Contact support at [email protected]

How can I use the latest models, and what code should I change?
I want to use the latest OpenAI model, o1-preview, which points to o1-preview-2024-09-12,
and the latest Anthropic model, claude-3-5-sonnet-latest, which points to claude-3-5-sonnet-20241022.

I am confused about where in the code I can designate the latest versions.

When I drag and drop a .png image:
screenshot1
I cannot see Option 1 rendering.
What is Option 1 supposed to show? OpenAI-based generation?

How can I use the latest o1 model?

frontend\src\lib\models.ts
`// Keep in sync with backend (llm.py)
// Order here matches dropdown order
export enum CodeGenerationModel {
  CLAUDE_3_5_SONNET_2024_06_20 = "claude-3-5-sonnet-20240620",
  GPT_4O_2024_05_13 = "gpt-4o-2024-05-13",
  GPT_4_TURBO_2024_04_09 = "gpt-4-turbo-2024-04-09",
  GPT_4_VISION = "gpt_4_vision",
  CLAUDE_3_SONNET = "claude_3_sonnet",
}

// Will generate a static error if a model in the enum above is not in the descriptions
export const CODE_GENERATION_MODEL_DESCRIPTIONS: {
  [key in CodeGenerationModel]: { name: string; inBeta: boolean };
} = {
  "gpt-4o-2024-05-13": { name: "GPT-4o", inBeta: false },
  "claude-3-5-sonnet-20240620": { name: "Claude 3.5 Sonnet", inBeta: false },
  "gpt-4-turbo-2024-04-09": { name: "GPT-4 Turbo (deprecated)", inBeta: false },
  gpt_4_vision: { name: "GPT-4 Vision (deprecated)", inBeta: false },
  claude_3_sonnet: { name: "Claude 3 (deprecated)", inBeta: false },
};`

Console log for backend:

Using openAiApiKey from client-side settings dialog
Using anthropicApiKey from client-side settings dialog
Using official OpenAI URL
Generating react_tailwind code in video mode using Llm.CLAUDE_3_5_SONNET_2024_06_20...
Status (variant 0): Generating code...
Status (variant 1): Generating code...
C:\Users\Greg\AppData\Local\Temp\tmpkwxwbc8s.mp4
Error assembling prompt. Contact support at [email protected]
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 285, in ffmpeg_parse_infos
line = [l for l in lines if keyword in l][index]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\uvicorn\protocols\websockets\websockets_impl.py", line 250, in run_asgi
result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
return await self.app(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\fastapi\applications.py", line 276, in __call__
await super().__call__(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\middleware\errors.py", line 149, in __call__
await self.app(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\middleware\cors.py", line 75, in __call__
await self.app(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
raise exc
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
raise e
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\routing.py", line 341, in handle
await self.app(scope, receive, send)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\starlette\routing.py", line 82, in app
await func(session)
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\fastapi\routing.py", line 289, in app
await dependant.call(**values)
File "J:\k8s\ArgoCD\Git\Maui\The Path to Self-Transformation\Automation\screenshot-to-code\backend\routes\generate_code.py", line 234, in stream_code
prompt_messages, image_cache = await create_prompt(params, stack, input_mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "J:\k8s\ArgoCD\Git\Maui\The Path to Self-Transformation\Automation\screenshot-to-code\backend\prompts\__init__.py", line 72, in create_prompt
prompt_messages = await assemble_claude_prompt_video(video_data_url)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "J:\k8s\ArgoCD\Git\Maui\The Path to Self-Transformation\Automation\screenshot-to-code\backend\video\utils.py", line 21, in assemble_claude_prompt_video
images = split_video_into_screenshots(video_data_url)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "J:\k8s\ArgoCD\Git\Maui\The Path to Self-Transformation\Automation\screenshot-to-code\backend\video\utils.py", line 79, in split_video_into_screenshots
clip = VideoFileClip(temp_video_file.name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\moviepy\video\io\VideoFileClip.py", line 88, in __init__
self.reader = FFMPEG_VideoReader(filename, pix_fmt=pix_fmt,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 35, in __init__
infos = ffmpeg_parse_infos(filename, print_infos, check_duration,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Greg\.virtualenvs\agents_with_json_mode_only-I1PZcyP6\Lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 289, in ffmpeg_parse_infos
raise IOError(("MoviePy error: failed to read the duration of file %s.\n"
OSError: MoviePy error: failed to read the duration of file C:\Users\Greg\AppData\Local\Temp\tmpkwxwbc8s.mp4.
Here are the file infos returned by ffmpeg:

ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 9.2.1 (GCC) 20200122
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
C:\Users\Greg\AppData\Local\Temp\tmpkwxwbc8s.mp4: Permission denied
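
The last ffmpeg line in the log reports `Permission denied` for the temporary file, which may point to a file-locking problem rather than an encoding problem: on Windows, a file created with `tempfile.NamedTemporaryFile` generally cannot be opened a second time by name while the original handle is still open. Below is a minimal sketch of one possible workaround, assuming the backend writes the uploaded bytes to a temp file and then passes the path to MoviePy. The helper name `write_temp_video` is hypothetical and does not come from the repository.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def write_temp_video(video_bytes: bytes, suffix: str = ".mp4") -> Iterator[str]:
    """Yield the path of a closed temp file holding video_bytes, then delete it.

    Hypothetical helper: the file is closed before the path is handed out so
    that another process (for example ffmpeg) can open it on Windows.
    """
    # delete=False keeps the file on disk after close(); we remove it ourselves.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(video_bytes)
        tmp.close()  # release our handle before anyone else opens the path
        yield tmp.name
    finally:
        if not tmp.closed:
            tmp.close()
        if os.path.exists(tmp.name):
            os.remove(tmp.name)
```

Inside the `with write_temp_video(data) as path:` block, the path could be given to `VideoFileClip(path)`. The clip should be closed before the block exits, otherwise the final `os.remove` may itself fail on Windows.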

Owner

abi commented Oct 24, 2024

Re: images, I made it a little trickier to set the model in code with the newest change that supports multiple options.

If you pull the latest and have both Anthropic and OpenAI keys set, it will use the latest Claude 1022. See

model=Llm.CLAUDE_3_5_SONNET_2024_10_22,

We currently use GPT_4O_2024_05_13, which is one update behind:

model=Llm.GPT_4O_2024_05_13,
You can update that to the latest if you want.

You can't use o1-preview-2024-09-12 because that doesn't support image input as far as I know.

I'll make it easier to choose models in the future.
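
As a rough illustration of the behaviour described above (one model per option, chosen from whichever keys are set), here is a minimal sketch. The enum members mirror the names quoted in this thread, but the function `pick_variant_models`, the order of the returned list, and the single-key fallback rules are assumptions for illustration, not the repository's actual code.

```python
from enum import Enum
from typing import List, Optional


class Llm(Enum):
    # Values are the provider model IDs mentioned in this thread.
    CLAUDE_3_5_SONNET_2024_10_22 = "claude-3-5-sonnet-20241022"
    GPT_4O_2024_05_13 = "gpt-4o-2024-05-13"


def pick_variant_models(
    openai_api_key: Optional[str],
    anthropic_api_key: Optional[str],
) -> List[Llm]:
    """Return one model per generated option (hypothetical selection rule)."""
    if openai_api_key and anthropic_api_key:
        # Both providers available: one option from each.
        return [Llm.CLAUDE_3_5_SONNET_2024_10_22, Llm.GPT_4O_2024_05_13]
    if anthropic_api_key:
        # Assumed fallback: both options use the only available provider.
        return [Llm.CLAUDE_3_5_SONNET_2024_10_22] * 2
    if openai_api_key:
        return [Llm.GPT_4O_2024_05_13] * 2
    raise ValueError("No API key configured for OpenAI or Anthropic")
```

Under this assumed rule, a missing or invalid key for one provider would explain why only one of the two options renders.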

Re: video, this is a known issue. If you convert the video to a different format using a video converter, it should work. Some browsers don't set the duration correctly when capturing video, so it doesn't work. You could also try a different browser.

Author

radrad commented Oct 24, 2024

What about having both Option 1 (which I don't have, even though I have both Anthropic and OpenAI keys in .env) and Option 2 (which I do have)?

I find that hard-coding models in multiple places makes them very hard to maintain for future updates.
I understand that the code treats models from different providers differently (to enable or disable some features).
This is what I changed; feel free to apply this patch if my changes are appropriate for bringing in the latest models from both Anthropic and OpenAI.
my_changes.patch

Re: "If you can convert the format of the video using video convertor, it should work."
What should I convert in the video?
image

The video (which you can check at https://github.com/user-attachments/assets/22713a47-4d23-44e9-b83f-dcb774ebbcc8) was captured with the Snagit tool, and you can see from the file's properties that it does have a Length and other video data (shown in the picture).

Where exactly in the code am I getting this error?

Next: it would be good if fine-grained edits could accumulate more than one selection. Currently, after I select an HTML element and describe the desired update, the code is regenerated immediately. I would prefer to be able to make multiple selections and update prompts before regenerating.
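
To make the request concrete, here is a minimal, language-agnostic sketch (written in Python) of queueing several select-and-edit requests and sending them as one prompt. The names `PendingEdit`, `EditBatch`, and the prompt wording are all hypothetical; the project's frontend may structure this quite differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PendingEdit:
    element_html: str  # outer HTML of the selected element
    instruction: str   # the change the user asked for


@dataclass
class EditBatch:
    """Collects edits until the user explicitly asks to regenerate."""

    edits: List[PendingEdit] = field(default_factory=list)

    def add(self, element_html: str, instruction: str) -> None:
        # Queue the edit instead of triggering a regeneration right away.
        self.edits.append(PendingEdit(element_html, instruction))

    def build_prompt(self) -> str:
        # Combine every queued edit into a single numbered prompt.
        if not self.edits:
            raise ValueError("No edits queued")
        parts = [
            f"{i}. Element: {e.element_html}\n   Change: {e.instruction}"
            for i, e in enumerate(self.edits, start=1)
        ]
        return "Apply all of the following changes in one pass:\n" + "\n".join(parts)

    def clear(self) -> None:
        # Call after a successful regeneration.
        self.edits.clear()
```

With this shape, regeneration would run once per batch rather than once per selection.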

Owner

abi commented Oct 25, 2024

Yeah, I will look to support the newest GPT-4o model. I need to do some testing to ensure quality before switching to it.

I think a good approach is to just re-encode it as MP4. If that's confusing, you could convert MP4 to WebM, or MP4 -> WebM -> MP4. As far as I know, the duration error shows up because of an encoding issue with the video.

Good suggestion re: more than one selection. I'm also exploring newer models like Llama on Groq so the change is instant.
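
One way to do the re-encode is with the ffmpeg command-line tool. Below is a minimal sketch, assuming `ffmpeg` is installed and on the PATH. The helper names and the codec choices (H.264 video, AAC audio) are illustrative assumptions, not something the project prescribes.

```python
import subprocess
from typing import List


def build_reencode_command(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that re-encodes src into a fresh MP4 at dst."""
    return [
        "ffmpeg",
        "-y",                       # overwrite dst if it already exists
        "-i", src,                  # input file
        "-c:v", "libx264",          # re-encode video as H.264
        "-c:a", "aac",              # re-encode audio as AAC
        "-movflags", "+faststart",  # place the index (moov atom) at the start
        dst,
    ]


def reencode(src: str, dst: str) -> None:
    """Run ffmpeg; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_reencode_command(src, dst), check=True)
```

For example, `reencode("capture.mp4", "capture_fixed.mp4")` would write a re-encoded copy next to the original (both file names are placeholders).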

abi changed the title from "Picture generation works only for Option 2, Video generation causes: Error assembling prompt. Contact support at [email protected]" to "Support multiple selection in "Select and Edit"" on Oct 25, 2024
Author

radrad commented Oct 25, 2024

Snagit is a professional screen video capture tool. I cannot see any problem with the .mp4 it creates.
I did what you suggested (MP4 -> WebM -> MP4). Here is the same file after those conversion steps:
https://github.com/user-attachments/assets/fe15a9de-ab1a-4423-ac6b-9fa7da46e239

Can you try with my original video and this one?

Can you provide me with a couple of videos that do work, so I can see if there is some other problem?

What about producing both Option 1 and Option 2? Right now I only get Option 2. When would Option 1 generate code?

Owner

abi commented Oct 28, 2024

Screenshot 2024-10-28 at 6 30 59 PM

The video you provided works for me. What error do you get?

If Option 1 isn't working, please share the backend logs. But even without that, my guess would be that the Anthropic key isn't right.
