Image support inside complex types #1767
Comments
This is how I did it:

def format_input_simple(pydantic_object: BaseModel, img_formatter=None) -> dict[str, Any]:
    if img_formatter is None:
        img_formatter = gpt_format_image
    image_map = {}

    def replace_image_with_id(obj: Any) -> Any:
        image_id = f"[image {len(image_map) + 1}]"
        image_map[image_id] = obj.base64()
        return image_id

    dict_obj = map_images(pydantic_object, replace_image_with_id)
    processed = json.dumps(dict_obj)
    content = [{"type": "text", "text": processed}]
    for image_id, image in image_map.items():
        content.append({"type": "text", "text": image_id + ":"})
        content.append(img_formatter(image))
    return {"role": "user", "content": content}

Basically, when I turn the input object into JSON, I replace all images with an ID. Works reasonably well.
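The map_images helper used in the comment above is not shown in the thread. As a rough sketch of what such a helper might look like, here is a recursive walk over dicts and lists that swaps each image for whatever the callback returns. FakeImage is a hypothetical stand-in for the real image type, and the sketch assumes the input has already been converted to plain dicts and lists rather than being a pydantic model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FakeImage:
    """Hypothetical stand-in for an image type that exposes base64()."""
    data: str

    def base64(self) -> str:
        return self.data


def map_images(obj: Any, fn: Callable[[Any], Any]) -> Any:
    """Return a copy of obj in which every image is replaced by fn(image).

    Assumption: obj is built from dicts, lists, tuples and leaf values.
    """
    if isinstance(obj, FakeImage):
        return fn(obj)
    if isinstance(obj, dict):
        return {key: map_images(value, fn) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [map_images(item, fn) for item in obj]
    return obj  # strings, numbers, None, ... pass through unchanged


# Usage: replace each image with a placeholder ID, as in the comment above.
image_map: dict[str, str] = {}


def replace_image_with_id(img: FakeImage) -> str:
    image_id = f"[image {len(image_map) + 1}]"
    image_map[image_id] = img.base64()
    return image_id


nested = {"question": "Which is larger?", "options": [FakeImage("AAAA"), FakeImage("BBBB")]}
result = map_images(nested, replace_image_with_id)
```

After this runs, result holds only JSON-serializable values and image_map maps each placeholder to its base64 payload, which is the shape format_input_simple relies on.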
Hey, I was trying to perform VQA with an LLM using dspy for optimized prompting, and I'm not able to pass the base64 image to the LLM via dspy. Could you let me know how you were able to do it? I tried dspy.Image, but I get an error saying "No module called dspy.Image". Thanks
@rzr2kor Are you on the latest version of DSPy?
@thomasahle Did you find that this worked better than interweaving the |
With images complex types it seems like we could unlock MiproV2 w fewshots aware enabled as |
I couldn't put it in "the actual context", since that was just one big json string |
Currently, you can only pass a single image at a time in a signature.
E.g. this will work
But any more complex types involving images won't:
This is due to how images are compiled into OAI-compatible messages, where inside chat_adapter.py we create a large list of content blocks by giving fields with an image_url special privileges:
I do some fairly naive parsing inside ChatAdapter, and there is definitely a more elegant solution here.
#1763 addresses the List case, but I want a more generalized solution.
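To make the idea of a generalized solution concrete, here is one possible sketch (not how ChatAdapter currently works). It serializes an arbitrarily nested value to JSON with placeholders, then splits on those placeholders so each image becomes an image_url content block at the position where it appeared. ImageRef, to_content_blocks and the placeholder format are all hypothetical names chosen for illustration.

```python
import json
import re
from typing import Any


class ImageRef:
    """Hypothetical image wrapper holding a URL or data URI."""

    def __init__(self, url: str) -> None:
        self.url = url


def to_content_blocks(value: Any) -> list[dict]:
    """Flatten a nested value into OpenAI-style text and image_url blocks."""
    images: list[ImageRef] = []

    def encode(obj: Any) -> Any:
        # Swap each image for an indexed placeholder string.
        if isinstance(obj, ImageRef):
            images.append(obj)
            return f"<<IMG:{len(images) - 1}>>"
        if isinstance(obj, dict):
            return {key: encode(val) for key, val in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [encode(item) for item in obj]
        return obj

    text = json.dumps(encode(value))
    # With a capture group, re.split keeps the image indices at odd positions.
    parts = re.split(r'"<<IMG:(\d+)>>"', text)
    blocks: list[dict] = []
    for position, part in enumerate(parts):
        if position % 2 == 1:
            url = images[int(part)].url
            blocks.append({"type": "image_url", "image_url": {"url": url}})
        elif part:
            blocks.append({"type": "text", "text": part})
    return blocks


blocks = to_content_blocks(
    {
        "caption": "two cats",
        "images": [ImageRef("https://example.com/a.png"), ImageRef("https://example.com/b.png")],
    }
)
```

One limitation of this sketch: a user-supplied string that happens to contain the placeholder pattern would be misread as an image, so a real implementation would need a collision-proof marker.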
cc @okhat