Real-time camera vision still not supported? #11

Open
skyxiaobai opened this issue Sep 10, 2024 · 2 comments

Comments

@skyxiaobai

I tested this and found that the vision part only shows the front camera view; the video frames are not sent to the LLM in real time. Is there any plan to support this?

@marcus-daily
Collaborator

Thanks for the question @skyxiaobai. Vision is only supported with some models, such as Claude Sonnet. Also please make sure you have "Voice and Vision" selected in the settings.

image

@skyxiaobai
Author

Thank you for the reminder. That is what I am doing. I am using GPT-4o by default with the RTVI Android SDK. After building the APK and running it on the phone, I can see that the camera opens, but during the conversation the camera content is not recognized. For example, if I ask "Can you see what I am doing?", the model does not pick up anything from the camera. My goal is to use the camera's real-time video stream in the conversation, similar to a video chat. In my testing, however, only voice works in real time. Thanks again for your support.
