Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: SDK code #3

Merged
merged 65 commits into from
Dec 18, 2023
Merged

feat: SDK code #3

merged 65 commits into from
Dec 18, 2023

Conversation

asafgardin
Copy link
Contributor

No description provided.

@asafgardin asafgardin changed the title Sdk code feat: SDK code Dec 11, 2023

match = "(main)"
upload_to_release = true
build_command = "pip install poetry && poetry build"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is thi used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. its for python-semantic-release

pyproject.toml Outdated
]

match = "(main)"
upload_to_release = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used. Removed

@@ -0,0 +1,4 @@
import ai21
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? I think I'm missing what's up here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And added tests

from ai21.tokenizers.factory import get_tokenizer


class AI21Client(AI21StudioClient):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rethink the structure of this class. IMO it should be simplified:

  1. I'm not sure why the client that's externally facing needs to extend the AI21StudioClient. IMO in the init you can create the client and then pass it to the resources. Let's try to avoid unnecessary inheritance when not needed
  2. I think that a pattern of passing self to your members can complicate things (such as recursive dependency tree) and should be avoided. This also correlates with the 1st suggestion here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created the AI21StudioClient inside the constructor

self.library = StudioLibrary(self)
self.segmentation = StudioSegmentation(self)

def count_token(self, text: str, model_id: str = "j2-instruct") -> int:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are not using the model_id here, so why expose it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to remove it. Done

text: str,
*,
style: Optional[str] = None,
start_index: Optional[int] = 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if start_index is passed as None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server translates it to 0 by checking if the value is None

count_penalty: Optional[Dict[str, Any]] = None,
**kwargs,
) -> CompletionsResponse:
body = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it a dup of Sagemaker resource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bedrock receives model_id as well

from typing import Any, Dict, TYPE_CHECKING

if TYPE_CHECKING:
from ai21.clients.bedrock.ai21_bedrock_client import AI21BedrockClient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are to many conditional imports and it's going to be difficult to maintain.
Consider adding in the resource creation (or in __init__) a try+import+except before adding the resource to the client or exposing it in __init__ (Along with a proper error)
You can get insipration from Langchain and their conditional imports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

@asafgardin asafgardin Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error message for bad import would be
ImportError: Please install "ai21[AWS]" in order to use AI21BedrockClient

return GECResponse.model_validate(json)

def _create_body(self, text: str, **kwargs) -> Dict[str, Any]:
return {"text": text, **kwargs}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to let users pass unrelated args?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Come to think of it, not necessarily. It could allow for an indicative error, telling the user that this arg is no longer supported in case we decide to omit a field.
But for future proofing case - I'll remove it from the body param for now

ai21/utils.py Outdated
logger = logging.getLogger("ai21")


def log_info(message):
Copy link
Contributor

@etang-ai21 etang-ai21 Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revisit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only kept logger = logging.getLogger('ai21')
What we could also do is something like

def setup_logger():
    if AI21EnvConfig.log_level == "debug":
        logger.setLevel(logging.DEBUG)
    elif AI21EnvConfig.log_level == "info":
        logger.setLevel(logging.INFO)
...

with maybe some more configs and then call it from our main __init__.py

@asafgardin asafgardin merged commit 0e9b36a into main Dec 18, 2023
8 checks passed
@asafgardin asafgardin deleted the sdk_code branch December 18, 2023 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants