EC2Vec

EC2Vec is a machine learning tool that embeds Enzyme Commission (EC) numbers into vector representations.

Dependencies

  1. PyTorch 1.10.0
  2. NumPy 1.19.2
  3. scikit-learn 0.23.2
  4. pandas 1.1.3
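
For example, assuming a pip-based environment (conda works equally well), the pinned versions above can be installed via the corresponding PyPI packages (torch for PyTorch, scikit-learn for sklearn):

```
pip install torch==1.10.0 numpy==1.19.2 scikit-learn==0.23.2 pandas==1.1.3
```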

Input data to the model

EC2Vec takes raw EC numbers as input. The ./Datasets/EC_numbers.csv file contains the EC numbers used for training the model.
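
As a quick sanity check before running anything, the input file can be inspected with pandas. This is just a convenience sketch; it makes no assumption about the column header beyond whatever ./Datasets/EC_numbers.csv already uses:

```python
import pandas as pd

# Load the EC numbers used as model input.
ec_df = pd.read_csv('./Datasets/EC_numbers.csv')

# Show the first few rows and the total count.
print(ec_df.head())
print(f'{len(ec_df)} EC numbers loaded')
```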

Get EC number embeddings using the trained model

The trained model embeds each EC number as a 1024-dimensional vector.

To get EC number embeddings with the trained model, put your EC number data under the ./Datasets/ directory, following the format of ./Datasets/EC_numbers.csv.

Then simply run get_ec2vec_embeddings.py.

The generated embedding file will be saved under the ./Embedding_Results/ directory as embedded_EC_number.csv.
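
A minimal sketch of how the output could be consumed downstream, assuming the first column of embedded_EC_number.csv holds the EC number and the remaining 1024 columns hold the embedding (adjust to the file's actual layout):

```python
import pandas as pd

# Load the generated embeddings.
emb_df = pd.read_csv('./Embedding_Results/embedded_EC_number.csv')

# Assumption: first column = EC number, remaining columns = 1024-dim vector.
ec_col = emb_df.columns[0]
embeddings = {row[ec_col]: row.iloc[1:].to_numpy(dtype=float)
              for _, row in emb_df.iterrows()}

vec = embeddings.get('1.1.1.1')  # look up one EC number, if present in your data
if vec is not None:
    print(vec.shape)             # expected: (1024,)
```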

Train your own EC2Vec

To train the EC2Vec model on your own data, put your EC number data under the ./Datasets/ directory, following the format of ./Datasets/EC_numbers.csv.

Then simply run ec2vec.py.

The model trained on your data will be saved under the ./Trained_model/ directory as model.pth.

Note that we used 1024 as the embedding size for an EC number. You can adjust this dimension by changing the hidden_sizes parameter in the code.
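
For illustration only, the snippet below shows how a hidden_sizes-style setting typically controls the embedding dimension in a PyTorch encoder; it is not the actual EC2Vec architecture, and input_dim is a made-up value — see ec2vec.py for the real model:

```python
import torch
import torch.nn as nn

# Hypothetical configuration: the last entry of hidden_sizes sets the embedding size.
hidden_sizes = [2048, 1024]  # change 1024 to get a different embedding dimension
input_dim = 500              # placeholder size of the encoded EC-number input

layers = []
prev = input_dim
for h in hidden_sizes:
    layers += [nn.Linear(prev, h), nn.ReLU()]
    prev = h
encoder = nn.Sequential(*layers[:-1])  # drop the final ReLU so the embedding is unbounded

x = torch.randn(4, input_dim)  # dummy batch of 4 encoded EC numbers
print(encoder(x).shape)        # torch.Size([4, 1024])
```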

Remark

EC2Vec can process incomplete EC numbers, such as 3.4.25.-, 1.8.-.-, 6.-.-.-, and 0.0.0.0 (spontaneous reaction).
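
One simple way to handle such wildcards during preprocessing (not necessarily what EC2Vec does internally) is to split each EC number into its four fields and treat '-' as an unspecified level:

```python
def parse_ec(ec: str):
    """Split an EC number into four fields; '-' marks an unspecified level."""
    fields = ec.split('.')
    fields += ['-'] * (4 - len(fields))  # pad, in case fewer than four fields are given
    return [None if f == '-' else int(f) for f in fields]

for ec in ['3.4.25.-', '1.8.-.-', '6.-.-.-', '0.0.0.0']:
    print(ec, '->', parse_ec(ec))
```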

Application of EC2Vec embeddings in downstream classification tasks

For downstream machine learning applications using the EC embeddings generated by EC2Vec, please visit the GitHub repository at https://github.com/MengLiu90/Classification_Using_EC2Vec.
