Visione's Capabilities #25
-
Hi @MalikAhmed2, right now VISIONE's visual search is supported by the following global image descriptors: OpenCLIP, ALADIN, CLIP2Video, and DINOv2. For faces, you might be able to search for some celebrities and other public figures that ended up in CLIP's training set, but that's all; there is no custom face search. As for audio, it is not analyzed in VISIONE (yet).
-
Is it possible to add different modules, such as face recognition, to VISIONE?
-
I'm eager to contribute in the future when my schedule allows. Keep up the excellent work!
-
I'm curious about VISIONE's abilities. Can VISIONE identify faces and logos in videos? Can I search for videos based on a specific face or logo? And does VISIONE work with audio too, for example by turning speech into text?