# SherlookingArt

A project conducted during the King's College Prompting Hackathon.

To what extent can LLMs be useful for multi-modal knowledge acquisition and inference?

- Prior work has leveraged text-only LLMs for knowledge extraction and knowledge graph (KG) completion (overview here).
- We would like to extend such approaches to multi-modal knowledge, covering not only text and images but also audio, video, haptics, etc.
- The goal is to test the ability of multi-modal LLMs such as GPT-4 (among others) to construct and complete a multi-modal KG in the context of the MuseIT project (https://www.muse-it.eu/); a minimal sketch of such a triple-extraction step follows this list.
- It would be particularly interesting to explore the capabilities of LLMs for multi-modal reasoning and inference.
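
As an illustration of the extraction step envisaged above, here is a minimal sketch that asks a vision-capable chat model to propose KG triples for an artwork image and its caption. It assumes the OpenAI Python SDK; the model name, prompt wording, and image URL are illustrative placeholders, not project code.

```python
# Minimal sketch: prompting a multi-modal LLM to propose knowledge-graph
# triples for an artwork image plus its caption. Model name, prompt, and
# image URL are illustrative assumptions, not part of the project itself.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

IMAGE_URL = "https://example.org/artwork.jpg"  # placeholder image
CAPTION = "Oil on canvas, attributed to an unknown Flemish master."

PROMPT = (
    "Extract knowledge-graph triples from this artwork image and caption. "
    "Respond with a JSON array of [subject, predicate, object] triples "
    "and nothing else.\n\nCaption: " + CAPTION
)

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable chat model would do here
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": PROMPT},
            {"type": "image_url", "image_url": {"url": IMAGE_URL}},
        ],
    }],
)

# Assumes the model returned bare JSON; a real pipeline would validate this.
triples = json.loads(response.choices[0].message.content)
for subj, pred, obj in triples:
    print(f"({subj}, {pred}, {obj})")
```

In a full pipeline, the returned triples would be validated and merged into the existing KG rather than printed.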
