fix: align readme with current mteb #1493
base: main
```diff
@@ -50,6 +50,8 @@ model_name = "average_word_embeddings_komninos"
 # model_name = "sentence-transformers/all-MiniLM-L6-v2"

 model = SentenceTransformer(model_name)
+# or directly from mteb:
+model = mteb.get_model(model_name)
 tasks = mteb.get_tasks(tasks=["Banking77Classification"])
 evaluation = mteb.MTEB(tasks=tasks)
 results = evaluation.run(model, output_folder=f"results/{model_name}")
```
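The updated snippet can be exercised end to end. Below is a minimal sketch of that flow, guarded so it also runs where `mteb` is not installed; the model name, task name, and output folder all come from the diff above:

```python
import importlib.util

# Names taken from the diff above.
model_name = "average_word_embeddings_komninos"
output_folder = f"results/{model_name}"

# The guard is only so this sketch runs even without mteb installed.
if importlib.util.find_spec("mteb") is not None:
    import mteb

    model = mteb.get_model(model_name)  # the shortcut this PR documents
    tasks = mteb.get_tasks(tasks=["Banking77Classification"])
    evaluation = mteb.MTEB(tasks=tasks)
    results = evaluation.run(model, output_folder=output_folder)

print(output_folder)
```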
````diff
@@ -220,9 +222,13 @@ Note that the public leaderboard uses the test splits for all datasets except MS
 Models should implement the following interface, implementing an `encode` function taking as inputs a list of sentences, and returning a list of embeddings (embeddings can be `np.array`, `torch.tensor`, etc.). For inspiration, you can look at the [mteb/mtebscripts repo](https://github.com/embeddings-benchmark/mtebscripts) used for running diverse models via SLURM scripts for the paper.

 ```python
+import mteb
+from mteb.encoder_interface import PromptType
+from mteb.models.wrapper import Wrapper
 import numpy as np


-class CustomModel:
+class CustomModel(Wrapper):
     def encode(
         self,
         sentences: list[str],
````

**Review thread on `class CustomModel(Wrapper):`**

> Why does it need to inherit from `Wrapper`? I would instead inherit from the `Encoder` protocol.

> (Quotes lines 366 to 367 in `3ff38ec`.)

> Hmm, shouldn't it just wrap SentenceTransformers?

> It can be like this. You changed it in mieb. (Quotes lines 356 to 357 in `fab0b82`.)

> Hmm, then it will also be merged into v2.0.0, in which case we should probably just update the readme there (I plan to merge it during December).

> So, for now I think this PR can be merged, and I will update the readme in the 2.0 branch.

````diff
@@ -244,7 +250,7 @@ class CustomModel:
         pass

 model = CustomModel()
-tasks = mteb.get_task("Banking77Classification")
+tasks = mteb.get_tasks(tasks=["Banking77Classification"])
 evaluation = MTEB(tasks=tasks)
 evaluation.run(model)
 ```
````
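To make the documented interface concrete: a toy, dependency-free encoder that satisfies the shape of the `encode` contract above. The bag-of-characters embedding is a placeholder for a real model, and the mteb calls are left commented because they require the library to be installed:

```python
class ToyEncoder:
    """Minimal object matching the documented interface: `encode` takes a
    list of sentences and returns one embedding (a list of floats) each."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def encode(self, sentences: list[str], **kwargs) -> list[list[float]]:
        # Deterministic bag-of-characters projection; a stand-in for a model.
        embeddings = []
        for sent in sentences:
            vec = [0.0] * self.dim
            for ch in sent:
                vec[ord(ch) % self.dim] += 1.0
            embeddings.append(vec)
        return embeddings


model = ToyEncoder()
embeddings = model.encode(["hello world", "Banking77"])
print(len(embeddings), len(embeddings[0]))  # 2 8

# With mteb installed, the same object can be evaluated as in the diff:
# import mteb
# tasks = mteb.get_tasks(tasks=["Banking77Classification"])
# mteb.MTEB(tasks=tasks).run(model)
```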
**Review thread on the diff**

> Should we just recommend this one always? And maybe rephrase to:

> Yes, but I'm not sure removing it is the best approach. I can make changes to whatever you think is better.
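The reviewer's suggestion to "inherit from the Encoder protocol" rather than from `Wrapper` can be illustrated with a structural protocol. Note that `EncoderLike` below is an illustrative stand-in, not mteb's actual `Encoder` definition (which lives in `mteb.encoder_interface` and may differ):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class EncoderLike(Protocol):
    """Illustrative stand-in for an Encoder protocol; the real mteb
    interface may have a different name and signature."""

    def encode(self, sentences: list[str], **kwargs) -> list[list[float]]: ...


class CustomModel:
    # No base class needed: protocol conformance is structural, so any
    # object with a matching `encode` method satisfies EncoderLike.
    def encode(self, sentences: list[str], **kwargs) -> list[list[float]]:
        return [[float(len(s))] for s in sentences]


print(isinstance(CustomModel(), EncoderLike))  # True
```

This is the practical difference the reviewer is pointing at: a protocol specifies the required `encode` shape without forcing models to inherit from a library base class.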