kellylab/plm-model-comparison

plm-model-comparison

Comparing novel Protein Language Models

The procedures below closely follow those described in https://www.nature.com/articles/s41564-023-01584-8.

PHROGs annotation table:

.tsv: https://storage.googleapis.com/plm-model-comparison/PHROG_index.tsv

.csv: https://storage.googleapis.com/plm-model-comparison/EFAM_embed/PHROG_index.csv
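Once one of the annotation tables above has been downloaded, it can be read with the standard library. The following is only a minimal sketch: the function name `load_annotation_table` is hypothetical, the two-row sample and its column names (`phrog`, `category`) are made up for illustration, and the real PHROG_index files may use different headers.

```python
import csv
import io


def load_annotation_table(handle, delimiter="\t"):
    """Read a delimited annotation table into a dict keyed by its first column.

    Hypothetical helper: it assumes the first row is a header and that the
    first column holds a unique identifier for each row.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    if not reader.fieldnames:
        return {}  # empty file: nothing to index
    key_field = reader.fieldnames[0]
    return {row[key_field]: row for row in reader}


# Made-up sample standing in for a downloaded .tsv file; the real column
# names and values are not known here.
sample = "phrog\tcategory\n1\ttail\n2\tlysis\n"
table = load_annotation_table(io.StringIO(sample))
```

For the .csv version, the same function could be called with `delimiter=","` on an opened file handle.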

PHROG Embedding:

Necessary Files:

Code and Data Availability:

Code for extracting the embeddings for each model is present in the directory extracting-embeddings

PHROGs model embeddings and final averaged embeddings are available at https://console.cloud.google.com/storage/browser/plm-model-comparison, in the folders final_embeddings and final_average_embeddings, respectively

Code for creating the embedding figures for each model is present in the directory phrog-embedding-figures

Figures of the PHROGs averaged embeddings are available at https://console.cloud.google.com/storage/browser/plm-model-comparison/phrog-embedding-figures
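As a rough illustration of what an averaged embedding is, the sketch below mean-pools per-residue vectors into a single fixed-length vector per protein. This assumes that "averaged" means averaging over sequence positions, which is a common convention but is not confirmed by this README; the function name `mean_pool` is hypothetical, and the actual extraction code lives in extracting-embeddings.

```python
def mean_pool(residue_embeddings):
    """Average a list of per-residue vectors into one vector.

    Assumption: each inner list is the embedding of one residue and all
    vectors share the same dimension.
    """
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    dim = len(residue_embeddings[0])
    totals = [0.0] * dim
    for vec in residue_embeddings:
        if len(vec) != dim:
            raise ValueError("all residue embeddings must have the same dimension")
        for i, value in enumerate(vec):
            totals[i] += value
    n = len(residue_embeddings)
    return [t / n for t in totals]


# Toy input: a 3-residue protein with 2-dimensional embeddings.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

Pooling this way gives every protein a vector of the same length regardless of sequence length, which is what makes the embeddings usable as classifier input.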

Trained Model Performances on PHROGs:

Necessary Files:

Code and Data Availability:

Code for training the functional classifiers and creating the precision, recall, and F1 boxplots is present in the directory phrog-performance

PHROGs functional classifier data present under 5CV_LMs_performance in each model directory
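For reference, the precision, recall, and F1 values summarized in the boxplots can be computed per class as sketched below. This is a plain-Python illustration, not the code in phrog-performance; the function name `per_class_metrics` and the category labels in the toy example are hypothetical.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen.

    Precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = harmonic mean of precision and recall. Undefined ratios
    (zero denominator) are reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Toy labels for illustration only.
y_true = ["tail", "tail", "lysis", "lysis", "other"]
y_pred = ["tail", "lysis", "lysis", "lysis", "other"]
scores = per_class_metrics(y_true, y_pred)
```

In a 5-fold cross-validation setting, a function like this would be applied to the held-out predictions of each fold, giving five values per class and metric.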

Trained Model Performances on EFAM:

Necessary Files:

Code and Data Availability:

Code for training the functional classifiers and testing their performance on EFAM is present in the directory efam-performance

Classifiers trained on PHROGs are available at https://console.cloud.google.com/storage/browser/plm-model-comparison, in each model's respective folder, labeled models
