
Improve logging #6

Merged
merged 3 commits into main on Nov 14, 2024

Conversation

lukasgarbas (Collaborator)

Logging should give a clear idea of what happens in the ranker.

from datasets import load_dataset
from transformer_ranker import TransformerRanker, prepare_popular_models

# Load a dataset
dataset = load_dataset('conll2003')

# Prepare some language models
language_models = prepare_popular_models('large')

# Initialize the ranker with the dataset
ranker = TransformerRanker(dataset, dataset_downsample=0.2)

# Run it with your models
results = ranker.run(language_models, batch_size=32)

# Inspect results
print(results)

First, the dataset is preprocessed. The logger shows the dataset info, including column names for texts and labels, the label map, dataset size, and the task category.

transformer_ranker:Texts and labels: tokens, ner_tags
transformer_ranker:Label map: {'O': 0, 'ORG': 1, 'LOC': 2, 'MISC': 3, 'PER': 4}
transformer_ranker:Dataset size: 4148 texts (down-sampled to 0.2)
transformer_ranker:Task category: token classification
transformer_ranker:Running on cuda:0
transformer_ranker:Models found in cache: ['bert-large-uncased', 'roberta-large', ..., 'KISTI-AI/scideberta']
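The message prefixes suggest these lines come from a standard Python logger named "transformer_ranker". Assuming that is the case (the logger setup itself is not shown in this PR), the verbosity can be adjusted from user code:

import logging

# Assumption: the package logs through logging.getLogger("transformer_ranker"),
# as the "transformer_ranker:" prefixes suggest.
logging.getLogger("transformer_ranker").setLevel(logging.INFO)      # show these messages
# logging.getLogger("transformer_ranker").setLevel(logging.WARNING) # or hide them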

Second, the models are downloaded or loaded from the cache. This stage can take most of the overall runtime.
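One way to front-load the download time is to warm the Hugging Face cache before ranking. This is a sketch under the assumption that the prepared model names are ordinary Hugging Face checkpoints:

from transformers import AutoModel, AutoTokenizer

# Download (or reuse from the local cache) every candidate checkpoint up front,
# so the ranking loop itself only reads from disk.
for name in language_models:
    AutoTokenizer.from_pretrained(name)
    AutoModel.from_pretrained(name)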

Third, the ranking starts. Each model gets two progress bars: (1) one for embedding the texts and (2) one for scoring the embeddings with a transferability metric.

Retrieving Embeddings: 100%|██████████| 130/130 [00:12<00:00, 10.75it/s]
Transferability Score: 100%|██████████| 1/1 [00:00<00:00,  2.74it/s]
transformer_ranker:bert-large-uncased estimation: 2.6677 (hscore)
Retrieving Embeddings: 100%|██████████| 130/130 [00:12<00:00, 10.63it/s]
Transferability Score: 100%|██████████| 1/1 [00:00<00:00,  2.62it/s]
transformer_ranker:roberta-large estimation: 2.7269 (hscore)
Retrieving Embeddings: 100%|██████████| 130/130 [00:11<00:00, 11.01it/s]
Transferability Score: 100%|██████████| 1/1 [00:00<00:00,  1.38it/s]
transformer_ranker:google/electra-large-discriminator estimation: 2.7463 (hscore)
...
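The "(hscore)" next to each estimate names the transferability metric, presumably the H-score. As a rough, self-contained illustration of the idea only, not the code used in this repository:

import numpy as np

def hscore(features: np.ndarray, labels: np.ndarray) -> float:
    # H-score idea: tr(pinv(cov(f)) @ cov(E[f | y])) over embedded features f and
    # labels y; higher values mean the classes separate more easily in feature space.
    features = features - features.mean(axis=0)
    cov_f = np.cov(features, rowvar=False)
    class_means = np.zeros_like(features)
    for label in np.unique(labels):
        mask = labels == label
        class_means[mask] = features[mask].mean(axis=0)
    cov_g = np.cov(class_means, rowvar=False)
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))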

Finally, the results can be printed out:

Rank 1. microsoft/deberta-v3-large: 2.7883
...
Rank 11. dmis-lab/biobert-large-cased-v1.1: 1.7927
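A natural follow-up (hypothetical, not part of this PR) is to fine-tune the top-ranked model with any standard Hugging Face setup:

from transformers import AutoModelForTokenClassification

# Top-ranked checkpoint from the output above; num_labels matches the
# five labels in the logged label map.
model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/deberta-v3-large", num_labels=5
)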

Improvements made to the logging:

  • Log the device.
  • Add the label map to the log.
  • Add metric names next to scores.

Other changes:

  • Use Python 3.9 typing (built-in generics; see the short example after this list).
  • Use consistent double quotes.
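"Python 3.9 typing" here means built-in generics such as list[str] and dict[str, float] instead of typing.List and typing.Dict. A hypothetical signature, not taken from this PR:

def rank_models(scores: dict[str, float]) -> list[tuple[str, float]]:
    # Sort model names by transferability score, best first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)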

@lukasgarbas merged commit 84088de into main on Nov 14, 2024
1 check passed