feat: openai vision toolkit #269

Draft
wants to merge 5 commits into
base: main


AaronGoldsmith
Contributor

@AaronGoldsmith AaronGoldsmith commented Nov 18, 2024

This pull request introduces a new ImageUrl content type and integrates it across the exchange package: content handling, message validation, and new tests covering the ImageUrl type. It also adds a VisionToolkit to the goose toolkit for image analysis.

The toolkit uses the OpenAI Vision API to analyze image URLs and base64-encoded images.
Future enhancements: support for additional providers (e.g. Anthropic).


New Content Type and Integration:

  • Added ImageUrl class to handle image URLs as content and implemented to_dict and summary methods in packages/exchange/src/exchange/content.py.
  • Updated validate_role_and_content and content_converter functions in packages/exchange/src/exchange/message.py to support ImageUrl.
  • Added image_content property to the Message class to retrieve all ImageUrl instances in packages/exchange/src/exchange/message.py.
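As a rough illustration of the pieces listed above, here is a minimal sketch of an ImageUrl content type with to_dict and summary methods, plus an image_content property on a message. It uses standard-library dataclasses so it runs on its own; the field name `url`, the `"type"` key holding the class name, the truncation length, and the stand-in `Text` and `Message` classes are all assumptions, not the code in this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    """Stand-in for the package's text content type (assumption)."""
    text: str


@dataclass
class ImageUrl:
    """Hypothetical image content: a remote URL or a data: URL."""
    url: str

    def to_dict(self) -> dict:
        # Tag the payload with the class name so a converter can rebuild it.
        return {"type": "ImageUrl", "url": self.url}

    def summary(self) -> str:
        # Truncate long values (for example base64 data URLs) for logging.
        shown = self.url if len(self.url) <= 40 else self.url[:40] + "..."
        return f"image:{shown}"


@dataclass
class Message:
    """Stand-in message holding a role and a list of content items."""
    role: str
    content: list = field(default_factory=list)

    @property
    def image_content(self) -> list[ImageUrl]:
        # Collect only the ImageUrl items, preserving their order.
        return [c for c in self.content if isinstance(c, ImageUrl)]


msg = Message(
    role="user",
    content=[Text("Describe this:"), ImageUrl("https://example.com/cat.png")],
)
print([img.summary() for img in msg.image_content])
```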

Providers and Utilities:

  • Updated messages_to_openai_spec function in packages/exchange/src/exchange/providers/utils.py to handle ImageUrl content.
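For context, the OpenAI chat format represents an image as a content part of type `image_url`. The sketch below shows one way content could be mapped to that shape. The helper name `content_to_openai_parts` and the simplified input dicts are hypothetical and are not the body of messages_to_openai_spec in this PR.

```python
def content_to_openai_parts(content: list[dict]) -> list[dict]:
    """Map simplified content dicts to OpenAI chat 'content' parts."""
    parts = []
    for item in content:
        if item["type"] == "Text":
            parts.append({"type": "text", "text": item["text"]})
        elif item["type"] == "ImageUrl":
            # OpenAI nests the URL (remote or data:) under an "image_url" object.
            parts.append({"type": "image_url", "image_url": {"url": item["url"]}})
        else:
            raise ValueError(f"Unsupported content type: {item['type']}")
    return parts


def to_openai_message(role: str, content: list[dict]) -> dict:
    """Build a single chat message with mixed text and image parts."""
    return {"role": role, "content": content_to_openai_parts(content)}


example = to_openai_message(
    "user",
    [
        {"type": "Text", "text": "What is in this picture?"},
        {"type": "ImageUrl", "url": "https://example.com/cat.png"},
    ],
)
print(example)
```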

Testing Enhancements:

  • Added tests for ImageUrl integration in packages/exchange/tests/test_image_tool_integration.py and packages/exchange/tests/test_message.py.

New Toolkit:

  • Introduced VisionToolkit for image analysis in src/goose/toolkit/vision.py and updated pyproject.toml to include the new toolkit entry point.
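Since the toolkit accepts both image URLs and base64-encoded images, a helper along these lines could normalize either input into a single URL string before it is wrapped in an ImageUrl. This is only a sketch: the function name `to_image_url` and the decision to treat any non-URL string as a local file path are assumptions, not the toolkit's actual behavior.

```python
import base64
import mimetypes
from pathlib import Path


def to_image_url(source: str) -> str:
    """Return `source` unchanged if it is already a URL, else a base64 data URL."""
    if source.startswith(("http://", "https://", "data:")):
        return source
    path = Path(source)
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {source}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

A remote URL passes through untouched, while a local path such as `photo.png` becomes a `data:image/png;base64,...` string that can be sent in the same field.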

If the team prefers having the toolkit and the new content type as separate PRs, I’d be happy to split them up.

Comment on lines +28 to +42
def content_converter(contents: list) -> list[Content]:
    result = []
    for c in contents:
        if isinstance(c, dict) and 'type' in c:  # Structured content logic
            content_type = c.pop('type')
            if content_type in CONTENT_TYPES:
                result.append(CONTENT_TYPES[content_type](**c))
        elif isinstance(c, Content):  # Already a Content instance
            result.append(c)
        elif isinstance(c, str):  # Plain text handling
            result.append(Text(text=c))
        else:
            # Handle unexpected content formats if necessary
            raise ValueError(f"Unsupported content type: {type(c)}")
    return result
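Two details of the converter as written may be worth a look: `c.pop('type')` removes the key from the caller's dict, and a dict whose type is not in CONTENT_TYPES appears to be skipped without an error. The variant below is one possible way to handle both, offered as a sketch rather than a required change. The `Content`, `Text`, and `ImageUrl` classes and the `CONTENT_TYPES` mapping here are simplified stand-ins so the example runs on its own.

```python
from dataclasses import dataclass


class Content:
    """Stand-in base class for the package's content types."""


@dataclass
class Text(Content):
    text: str


@dataclass
class ImageUrl(Content):
    url: str


# Assumed shape of the registry: class name -> class.
CONTENT_TYPES = {cls.__name__: cls for cls in (Text, ImageUrl)}


def content_converter(contents: list) -> list[Content]:
    result = []
    for c in contents:
        if isinstance(c, Content):  # Already a Content instance
            result.append(c)
        elif isinstance(c, str):  # Plain text becomes Text
            result.append(Text(text=c))
        elif isinstance(c, dict) and "type" in c:
            fields = dict(c)  # Copy so the caller's dict keeps its "type" key
            content_type = fields.pop("type")
            if content_type not in CONTENT_TYPES:
                raise ValueError(f"Unknown content type: {content_type!r}")
            result.append(CONTENT_TYPES[content_type](**fields))
        else:
            raise ValueError(f"Unsupported content type: {type(c)}")
    return result


raw = {"type": "ImageUrl", "url": "https://example.com/cat.png"}
print(content_converter(["Describe this image: ", raw]))
```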
Contributor Author


Before I updated content_converter, passing a plain string in a Message's content list raised the following error:

Traceback (most recent call last):
  File "/Users/aarong/Development/goose/packages/exchange/src/exchange/exchange.py", line 149, in call_function
    output = json.dumps(tool.function(**tool_use.parameters))
  File "/Users/aarong/Development/goose/src/goose/toolkit/vision.py", line 22, in describe_image
    user_message = Message(role="user", content=[f"{instructions}: ", image])
  File "<attrs generated init exchange.message.Message>", line 13, in __init__
    _setattr('content', __attr_converter_content(content))
  File "/Users/aarong/Development/goose/packages/exchange/src/exchange/message.py", line 29, in content_converter
    return [(CONTENT_TYPES[c.pop("type")](**c) if c.__class__ not in CONTENT_TYPES.values() else c) for c in contents]
  File "/Users/aarong/Development/goose/packages/exchange/src/exchange/message.py", line 29, in <listcomp>
    return [(CONTENT_TYPES[c.pop("type")](**c) if c.__class__ not in CONTENT_TYPES.values() else c) for c in contents]
AttributeError: 'str' object has no attribute 'pop'

The tool call returned this as a ToolResult with "is_error": true.
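The failure can be reproduced outside goose with the one-line converter quoted in the traceback. In the sketch below, `Text` and `CONTENT_TYPES` are simplified stand-ins for the real exchange types, and the function name `old_content_converter` is only a label for this example.

```python
from dataclasses import dataclass


@dataclass
class Text:
    text: str


CONTENT_TYPES = {"Text": Text}


def old_content_converter(contents):
    # The list comprehension from message.py line 29 in the traceback.
    return [(CONTENT_TYPES[c.pop("type")](**c) if c.__class__ not in CONTENT_TYPES.values() else c) for c in contents]


# Dicts with a "type" key convert fine.
print(old_content_converter([{"type": "Text", "text": "hi"}]))

# A plain string is not a known Content class, so the code calls c.pop on it.
try:
    old_content_converter(["Describe this image: "])
except AttributeError as err:
    print(err)  # 'str' object has no attribute 'pop'
```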

