Thanks for the question @skyxiaobai. Vision is only supported with some models, such as Claude Sonnet. Also please make sure you have "Voice and Vision" selected in the settings.
Thank you for the reminder. That is what I am already doing: I am using GPT-4o by default with the RTVI Android SDK. After building the APK and running it on the phone, I can see that the camera opens, but during the conversation the camera content is not recognized. For example, when I ask, "Can you see what I am doing?", nothing from the camera is detected. My goal is to use the camera's real-time video stream in the conversation, similar to a video chat. However, in my testing only the voice is real-time. Thanks again for your support.
I tested this and found that the vision part only shows the front camera view; the video frames are not sent to the LLM in real time. Is there any plan to support this?