diff --git a/LICENSE.md b/LICENSE.md index 6162710..fe0b992 100644 --- a/LICENSE.md +++ b/LICENSE.md @@ -1,82 +1,13 @@ -# EvolutionaryScale Community License Agreement - -This license is a reproduction of the EvolutionaryScale Community License Agreement available at . - -Please read this EvolutionaryScale Community License Agreement (the “**Agreement**”) carefully before using the AI Model (as defined below), which is offered by EvolutionaryScale, PBC (“**ES**”). - -By downloading the AI Model, or otherwise using the AI Model in any manner, You agree that You have read and agree to be bound by the terms of this Agreement. If You are accessing the AI Model on behalf of an organization or entity, You represent and warrant that You are authorized to enter into this Agreement on that organization’s or entity’s behalf and bind them to the terms of this Agreement (in which case, the references to “**You**” and “**Your**” in this Agreement, except for in this sentence, refer to that organization or entity) and that such entity is a non-commercial organization (such as a university, non-profit organization, research institute or educational or governmental body). Use of the AI Model is expressly conditioned upon Your assent to all terms of this Agreement, to the exclusion of all other terms. - -## Definitions. - -In addition to other terms defined elsewhere in this Agreement, the terms below have the following meanings. - -1. “**AI Model**” means the EvolutionaryScale ESM-3 Open Model code and model weights made available at the following link [https://github.com/evolutionaryscale/esm] (the “**GitHub Page**”), as may be updated and amended from time to time, whether in Source or Object form, made available to You pursuant to this Agreement. -2. 
“**Commercial Entity**” means any entity engaged in any activity intended for or directed toward commercial advantage or monetary compensation, including, without limitation, the development of any product or service intended to be sold or made available for a fee. For the purpose of this Agreement, references to a Commercial Entity expressly exclude any universities, non-profit organizations, not-for-profit entities, research institutes and educational and government bodies. -3. “**Contribution**” means any work of authorship, including the original version of the AI Model and any modifications or additions to that AI Model or Derivative Works thereof, that is intentionally submitted to ES for inclusion in the AI Model by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "**submitted**" means any form of electronic, verbal, or written communication sent to ES or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, ES for the purpose of discussing and improving the AI Model, but excluding Outputs and all communications that are conspicuously marked or otherwise designated in writing by the copyright owner as "**Not a Contribution**." -4. “**Contributor**” means ES and any individual or Legal Entity on behalf of whom a Contribution has been received by ES and subsequently incorporated within the AI Model. -5. “**Derivative Work**” means any work, whether in Source or Object form, that is based on (or derived from) the AI Model and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this Agreement, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the AI Model and Derivative Works thereof. -6. “**Legal Entity**” means the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "**control**" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. -7. “**Non-Commercial Purposes**” means uses not intended for or directed toward commercial advantage or monetary compensation, or the facilitation of development of any product or service to be sold or made available for a fee. For the avoidance of doubt, the provision of Outputs as a service is not a Non-Commercial Purpose. -8. “**Object**” means any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. -9. “**Output**” means any output, including any protein sequence, structure prediction, functional annotation, molecule, descriptions of a molecule, model, sequence, text, and/or image that is elicited directly or indirectly by, or otherwise made available to, You in connection with Your use of the AI Model, including, but not limited to, the use of AI-Powered Technology. -10. “**Output Derivatives**” means any enhancements, modifications and derivative works of Outputs (including, but not limited to, any derivative sequences or molecules). -11. “**Source**” means the preferred form for making modifications, including but not limited to AI Model source code, documentation source, and configuration files. -12. 
“**Third Party Model**” means any non-human tool, platform and/or other technology powered or made available in connection with the use of generative artificial intelligence or machine learning models that is operated by any third party. -13. “**You**” or “**Your**” means the individual entering into this Agreement or the organization or entity on whose behalf such individual is entering into this Agreement. - -## Intellectual Property Rights and Licenses. - -1. **Copyright License. Subject to the terms and conditions of this Agreement, each Contributor hereby grants to You a non-exclusive, non-transferable, limited copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the AI Model and such Derivative Works in Source or Object form for Your Non-Commercial Purposes.** -2. **Patent License**. Subject to the terms and conditions of this Agreement, each Contributor hereby grants to You a non-exclusive, non-transferable, limited patent license to make, have made, use, import, and otherwise transfer the AI Model for Your Non-Commercial Purposes, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the AI Model to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the AI Model or a Contribution incorporated within the AI Model constitutes direct or contributory patent infringement, then any patent licenses granted to You under this Agreement for that AI Model shall terminate as of the date such litigation is filed. - -## Licensing Process. 
- -In connection with Your licensing of the AI Model, ES may collect, from You or automatically through Your use of the AI Model, certain registration information about You, any Legal Entity You may represent, and Your use of the AI Model. The collection of this information and ES’s policies and procedures regarding the collection, use, disclosure and security of information received are described further in ES’s Privacy Policy available at , as may be updated from time to time. - -## Redistribution. - -Subject to Section 5 (Use Restrictions) below, You may reproduce and distribute copies of the AI Model or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form for Your Non-Commercial Purposes, provided that You meet the following conditions: - -1. You must not distribute copies of the AI Model or Derivative Works thereof to, or allow the use of any reproductions or copies thereof by, on behalf of or for, any Commercial Entity; and -2. You must restrict the usage of any copies of the AI Model or Derivative Works to usage for Non-Commercial Purposes; and -3. You must give any other recipients of the AI Model or Derivative Works a copy of this Agreement; and -4. You must cause any modified files to carry prominent notices stating that You changed the files; and -5. You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the AI Model, excluding those notices that do not pertain to any part of the Derivative Works; and -6. 
If the AI Model includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify this Agreement. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the AI Model, provided that such additional attribution notices cannot be construed as modifying this Agreement. - -You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the AI Model otherwise complies with the conditions stated in this Agreement. - -## Use Restrictions. - -**No Commercial Use**.  You may only use the AI Model, Contributions, Derivative Works, Outputs and Output Derivatives (as defined below) for Non-Commercial Purposes. For the avoidance of doubt, structure tokens are also considered Outputs and may only be used for Non-Commercial Purposes. Any commercial use of any of the foregoing, including, without limitation, any use by, on behalf of or for any Commercial Entity or to facilitate the development of any product or service to be sold or made available for a fee, is strictly prohibited under this Agreement. - -**No Use in Drug Development or Discovery**. 
Without limiting the foregoing, You may not use the AI Model or any Contributions, Derivative Works, Outputs or Output Derivatives in or in connection with: (i) the development (at any stage) or discovery of any drug, medication or pharmaceutical of any kind; (ii) any molecular or biological target, hit or lead identification; (iii) drug candidate selection; or (iv) lead optimization. - -**Use of Outputs**.  Notwithstanding anything to the contrary in this Agreement, You may not use or provide access to any Outputs or Output Derivatives to train, optimize, improve or otherwise influence the functionality or performance of any: (i) other large language model; (ii) technology for protein structure prediction; ****or (iii) other Third Party Model ****that is similar to the AI Model. You may, however, use the Outputs and Outputs Derivatives to train, optimize, improve or otherwise influence the functionality or performance of the AI Model itself and downstream Derivative Works thereof. - -**Additional Restrictions**.  Your use of the AI Model may also be subject to additional use restrictions communicated to You through the AI Model or otherwise, including those set forth in the ES Acceptable Use Policy available at , as may be updated and amended from time to time (the “**AUP**”), the terms of which are incorporated herein by reference. In the event of any conflict between the terms of this Agreement and the terms of the AUP, the terms that are more restrictive of Your use of the AI Model, Derivative Works, Outputs and Output Derivatives, as applicable, shall govern and control. - -## Submission of Contributions. - -Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the AI Model by You to ES shall be under the terms and conditions of this Agreement, without any additional terms or conditions. 
Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with ES regarding such Contributions. - -## Trademarks. - -This Agreement does not grant permission to use the trade names, trademarks, service marks, or product names of ES, except as required for reasonable and customary use in describing the origin of the AI Model and reproducing the content of the NOTICE file. - -## Disclaimer of Warranty. - -UNLESS REQUIRED BY APPLICABLE LAW OR OTHERWISE EXPRESSLY AGREED MUTUALLY AGREED UPON BY YOU AND ES IN WRITING, ES PROVIDES THE AI MODEL (AND EACH CONTRIBUTOR PROVIDES ITS CONTRIBUTIONS) ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, CORRECTNESS, RELIABILITY OR FITNESS FOR A PARTICULAR PURPOSE, ALL OF WHICH ARE HEREBY DISCLAIMED. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE AI MODEL AND ASSUME ANY RISKS ASSOCIATED WITH YOUR EXERCISE OF PERMISSIONS UNDER THIS AGREEMENT. - -## Limitation of Liability. - -In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this Agreement or out of the use or inability to use the AI Model (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. - -## General. - -1. **Entire Agreement**. 
This Agreement constitutes the entire agreement between You and ES relating to the subject matter hereof and supersedes all proposals, understandings, or discussions, whether written or oral, relating to the subject matter of this Agreement and all past dealing or industry custom. The failure of either party to enforce its rights under this Agreement at any time for any period shall not be construed as a waiver of such rights. ES may amend or modify this Agreement from time to time and will use reasonable efforts to provide You with notice of any material changes that may negatively impact Your use of the AI Model through the GitHub Page or through another means made available to You. No other changes, modifications or waivers to this Agreement will be effective unless in writing and signed by both parties. -2. **Relationship of Parties**. You and ES are independent contractors, and nothing herein shall be deemed to constitute either party as the agent or representative of the other or both parties as joint venturers or partners for any purpose. -3. **Export Control**. You shall comply with the U.S. Foreign Corrupt Practices Act and all applicable export laws, restrictions and regulations of the U.S. Department of Commerce, and any other applicable U.S. and foreign authority. -4. **Assignment**. This Agreement and the rights and obligations herein may not be assigned or transferred, in whole or in part, by You without the prior written consent of ES. Any assignment in violation of this provision is void. ES may freely assign or transfer this Agreement, in whole or in part. This Agreement shall be binding upon, and inure to the benefit of, the successors and permitted assigns of the parties. -5. **Governing Law**. This Agreement shall be governed by and construed under the laws of the State of New York and the United States without regard to conflicts of laws provisions thereof, and without regard to the Uniform Computer Information Transactions Act. -6. 
**Severability**.  If any provision of this Agreement is held to be invalid, illegal or unenforceable in any respect, that provision shall be limited or eliminated to the minimum extent necessary so that this Agreement otherwise remains in full force and effect and enforceable. +# License Information + +The following licenses govern access to the ESM codebase and the models, including weights: + +| License | What it covers | +|------------------------------------------------------------|-------------------------------------------------------------------| +| [Cambrian Open License Agreement](https://www.evolutionaryscale.ai/policies/cambrian-open-license-agreement) | Code on GitHub (excluding model weights) | +| [Cambrian Open License Agreement](https://www.evolutionaryscale.ai/policies/cambrian-open-license-agreement) | ESM C 300M (incl. weights) | +| [Cambrian Non-Commercial License Agreement](https://www.evolutionaryscale.ai/policies/cambrian-non-commercial-license-agreement) | ESM-3 Open Model (incl. weights) | +| [Cambrian Non-Commercial License Agreement](https://www.evolutionaryscale.ai/policies/cambrian-non-commercial-license-agreement) | ESM C 600M (incl. weights) | +| Governed by API agreements (see below) | API-only models (ESM C 6B, ESM3 family) | +| [Forge API Terms of Use](https://www.evolutionaryscale.ai/policies/terms-of-use) | Free non-commercial API access via Forge | +| [Cambrian Inference Clickthrough License Agreement](https://www.evolutionaryscale.ai/policies/cambrian-inference-clickthrough-license-agreement) | Commercial inference via SageMaker | diff --git a/README.md b/README.md index e4dca9c..09418dd 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,72 @@ -# ESM3 +# Table of Contents + +1. [Installation](#installation) +2. [ESM C](#esm-c) +3. [ESM 3](#esm-3) +4. [Responsible Development](#responsible-development) +5.
[License](#license) + +## Installation + +To get started with ESM, install the library using pip: + +```bash +pip install esm +``` + +## ESM C +[ESM Cambrian](https://www.evolutionaryscale.ai/blog/esm-cambrian) is a parallel model family to our flagship ESM3 generative models. While ESM3 focuses on controllable generation of proteins for therapeutic and many other applications, ESM C focuses on creating representations of the underlying biology of proteins. + +ESM C comes with major performance benefits over ESM2. The 300M parameter ESM C delivers similar performance to ESM2 650M with dramatically reduced memory requirements and faster inference. The 600M parameter ESM C rivals the 3B parameter ESM2 and approaches the capabilities of the 15B model, delivering frontier performance with far greater efficiency. At the leading edge, the 6B parameter ESM C sets a new benchmark, outperforming all prior protein language models by a wide margin. + +ESM C models are available immediately for academic and commercial use under a new license structure designed to promote openness and enable scientists and builders. You can find our [open](https://www.evolutionaryscale.ai/policies/cambrian-open-license-agreement) and [non-commercial](https://www.evolutionaryscale.ai/policies/cambrian-non-commercial-license-agreement) license agreements here. + +You can use the following guides to start using ESM C models today through [Hugging Face](https://huggingface.co/EvolutionaryScale), [the Forge API](https://forge.evolutionaryscale.ai/) and [AWS SageMaker](https://aws.amazon.com/sagemaker/). + +### Using ESM C 300M and 600M via GitHub +ESM C model weights are stored on the Hugging Face Hub under https://huggingface.co/EvolutionaryScale/.
+```py +from esm.models.esmc import ESMC +from esm.sdk.api import ESMProtein, LogitsConfig + +protein = ESMProtein(sequence="AAAAA") +client = ESMC.from_pretrained("esmc_300m").to("cuda") # or "cpu" +protein_tensor = client.encode(protein) +logits_output = client.logits( + protein_tensor, LogitsConfig(sequence=True, return_embeddings=True) +) +print(logits_output.logits, logits_output.embeddings) +``` + +### Using ESM C 6B via Forge API + +ESM C models, including ESM C 6B, are accessible via EvolutionaryScale Forge. You can request access and use these models through forge.evolutionaryscale.ai, as demonstrated in the example below. +```py +from esm.sdk.forge import ESM3ForgeInferenceClient +from esm.sdk.api import ESMProtein, LogitsConfig + +# Apply for Forge access and get an access token +protein = ESMProtein(sequence="AAAAA") +forge_client = ESM3ForgeInferenceClient(model="esmc-6b-2024-12", url="https://forge.evolutionaryscale.ai", token="<your forge token>") +protein_tensor = forge_client.encode(protein) +logits_output = forge_client.logits( + protein_tensor, LogitsConfig(sequence=True, return_embeddings=True) +) +print(logits_output.logits, logits_output.embeddings) +``` + +### Using ESM C 6B via SageMaker + +ESM C models are also available on Amazon SageMaker. They function similarly to the ESM3 model family, and you can refer to the sample notebooks provided in this repository for examples. + +After creating the endpoint, you can create a SageMaker client and use it the same way as a Forge client; they share the same API. + +```py +from esm.sdk.sagemaker import ESM3SageMakerClient + +sagemaker_client = ESM3SageMakerClient( + endpoint_name=SAGE_ENDPOINT_NAME, model="esmc-6b-2024-12" +) +``` + +## ESM 3 [ESM3](https://www.evolutionaryscale.ai/papers/esm3-simulating-500-million-years-of-evolution-with-a-language-model) is a frontier generative model for biology, able to jointly reason across three fundamental biological properties of proteins: sequence, structure, and function.
These three data modalities are represented as tracks of discrete tokens at the input and output of ESM3. You can present the model with a combination of partial inputs across the tracks, and ESM3 will provide output predictions for all the tracks. @@ -11,10 +79,10 @@ The ESM3 architecture is highly scalable due to its transformer backbone and all Learn more by reading the [blog post](https://www.evolutionaryscale.ai/blog/esm3-release) and [the pre-print (Hayes et al., 2024)](https://www.evolutionaryscale.ai/papers/esm3-simulating-500-million-years-of-evolution-with-a-language-model). Here we present `esm3-open-small`. With 1.4B parameters it is the smallest and fastest model in the family. -ESM3-open is available under a [non-commercial license](https://www.evolutionaryscale.ai/policies/community-license-agreement), reproduced under `LICENSE.md`. +ESM3-open is available under the [Cambrian non-commercial license agreement](https://www.evolutionaryscale.ai/policies/cambrian-non-commercial-license-agreement), as outlined in `LICENSE.md` (note: updated with ESM C release). Visit our [Discussions page](https://github.com/evolutionaryscale/esm/discussions) to get in touch, provide feedback, ask questions or share your experience with ESM3! -## Quickstart for ESM3-open +### Quickstart for ESM3-open ``` pip install esm @@ -65,7 +133,7 @@ We also provide example scripts that show common workflows under `examples/`: - [local_generate.py](./examples/local_generate.py) shows how simple and elegant common tasks are: it shows folding, inverse folding and chain of thought generation, all by calling just `model.generate()` for iterative decoding. - [seqfun_struct.py](./examples/seqfun_struct.py) shows direct use of the model as a standard pytorch model with a simple model `forward` call. 
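The iterative decoding that `local_generate.py` drives through repeated `model.generate()` calls can be illustrated with a dependency-free toy sketch. This is illustrative only: `iterative_decode` and the trivial predictor below are hypothetical stand-ins, not the ESM3 API. At each step a batch of still-masked positions is filled in, conditioned on everything decoded so far.

```python
def iterative_decode(masked, predict, num_steps):
    """Toy sketch of iterative masked decoding: per step, fill a batch
    of still-masked positions (None) using `predict`, which sees the
    current partial sequence."""
    seq = list(masked)
    todo = [i for i, t in enumerate(seq) if t is None]
    per_step = max(1, len(todo) // num_steps)
    while todo:
        for i in todo[:per_step]:
            seq[i] = predict(seq, i)  # condition on the partial sequence
        todo = todo[per_step:]
    return seq

# A trivial "predictor" standing in for the model: always emits alanine.
print(iterative_decode([None, "A", None, "G"], lambda s, i: "A", num_steps=2))
# ['A', 'A', 'A', 'G']
```

In the real model the predictor samples from the network's logits and all three token tracks are decoded this way; the toy only shows the schedule.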
-## Forge: Access to larger ESM3 models +### Forge: Access to larger ESM3 models You can apply for beta access to the full family of larger and higher capability ESM3 models at [EvolutionaryScale Forge](https://forge.evolutionaryscale.ai). @@ -101,19 +169,4 @@ The core tenets of our framework are With this in mind, we have performed a variety of mitigations for `esm3-sm-open-v1`, detailed in our [paper](https://www.evolutionaryscale.ai/papers/esm3-simulating-500-million-years-of-evolution-with-a-language-model) ## License - -**The Big Picture:** - -1. The EvolutionaryScale AI Model is **only** available under this Community License Agreement for **non-commercial use** by **individuals** or **non-commercial organizations** (including universities, non-profit organizations and research institutes, educational and government bodies). - -2. You **may not** use the EvolutionaryScale AI Model or any derivative works of the EvolutionaryScale AI Model or its outputs: - - 1. in connection with **any commercial activities**, for example, any activities **by, on behalf of or for a commercial entity** or to develop **any product or service** such as hosting the AI Model behind an API; or - - 2. without attribution to EvolutionaryScale and this Community License Agreement; or - - 3. to **train** a AI-powered third party model **similar to EvolutionaryScale’s AI Model**, even for non-commercial usage. You may, however, create **Derivative Works** of ESM3, for example by finetuning or adding model layers. - -3. You **can publish, share and adapt** the EvolutionaryScale AI Model and its outputs for **non-commercial purposes** in accordance with the Community License Agreement, including a **non-commercial restriction** on the adapted model. - -Please read our non-commercial [Community License Agreement](https://www.evolutionaryscale.ai/policies/community-license-agreement) reproduced under [./LICENSE.md](LICENSE.md) before using ESM3. 
+The code and model weights of ESM3 and ESM C are available under a mixture of non-commercial and more permissive licenses, fully outlined in [LICENSE.md](LICENSE.md). diff --git a/esm/__init__.py b/esm/__init__.py index 0dd7cd4..6e3ccdf 100644 --- a/esm/__init__.py +++ b/esm/__init__.py @@ -1,2 +1,2 @@ -__version__ = "3.0.8" +__version__ = "3.1.0" diff --git a/esm/layers/attention.py b/esm/layers/attention.py index 41833b4..d8eb6ec 100644 --- a/esm/layers/attention.py +++ b/esm/layers/attention.py @@ -43,7 +43,10 @@ def _apply_rotary(self, q: torch.Tensor, k: torch.Tensor): def forward(self, x, seq_id): qkv_BLD3 = self.layernorm_qkv(x) query_BLD, key_BLD, value_BLD = torch.chunk(qkv_BLD3, 3, dim=-1) - query_BLD, key_BLD = self.q_ln(query_BLD), self.k_ln(key_BLD) + query_BLD, key_BLD = ( + self.q_ln(query_BLD).to(query_BLD.dtype), + self.k_ln(key_BLD).to(query_BLD.dtype), + ) query_BLD, key_BLD = self._apply_rotary(query_BLD, key_BLD) n_heads = self.n_heads diff --git a/esm/models/esmc.py b/esm/models/esmc.py new file mode 100644 index 0000000..c153a5c --- /dev/null +++ b/esm/models/esmc.py @@ -0,0 +1,145 @@ +from __future__ import annotations + +import contextlib + +import attr +import torch +import torch.nn as nn +from attr import dataclass + +from esm.layers.regression_head import RegressionHead +from esm.layers.transformer_stack import TransformerStack +from esm.sdk.api import ( + ESMCInferenceClient, + ESMProtein, + ESMProteinTensor, + ForwardTrackData, + LogitsConfig, + LogitsOutput, +) +from esm.tokenization import EsmSequenceTokenizer +from esm.utils import encoding +from esm.utils.constants.models import ESMC_600M +from esm.utils.decoding import decode_sequence +from esm.utils.sampling import _BatchedESMProteinTensor + + +@dataclass +class ESMCOutput: + sequence_logits: torch.Tensor + embeddings: torch.Tensor | None + + +class ESMC(nn.Module, ESMCInferenceClient): + """ + ESMC model implementation. 
+ + Args: + d_model (int): The dimensionality of the input and output feature vectors. + n_heads (int): The number of attention heads in the transformer layers. + n_layers (int): The number of transformer layers. + """ + + def __init__( + self, d_model: int, n_heads: int, n_layers: int, tokenizer: EsmSequenceTokenizer + ): + super().__init__() + self.embed = nn.Embedding(64, d_model) + self.transformer = TransformerStack( + d_model, n_heads, None, n_layers, n_layers_geom=0 + ) + self.sequence_head = RegressionHead(d_model, 64) + self.tokenizer = tokenizer + + @classmethod + def from_pretrained( + cls, model_name: str = ESMC_600M, device: torch.device | None = None + ) -> ESMC: + from esm.pretrained import load_local_model + + if device is None: + device = torch.device("cuda" if torch.cuda.is_available() else "cpu") + model = load_local_model(model_name, device=device) + if device.type != "cpu": + model = model.to(torch.bfloat16) + assert isinstance(model, ESMC) + return model + + @property + def device(self): + return next(self.parameters()).device + + @property + def raw_model(self): + return self + + def forward( + self, + sequence_tokens: torch.Tensor | None = None, + sequence_id: torch.Tensor | None = None, + ) -> ESMCOutput: + """ + Performs forward pass through the ESMC model. Check utils to see how to tokenize inputs from raw data. + + Args: + sequence_tokens (torch.Tensor, optional): The amino acid tokens. + sequence_id (torch.Tensor, optional): The sequence ID. + + Returns: + ESMCOutput: The output of the ESMC model. 
+ + """ + if sequence_id is None: + sequence_id = sequence_tokens == self.tokenizer.pad_token_id + + x = self.embed(sequence_tokens) + x, _ = self.transformer(x, sequence_id=sequence_id) + sequence_logits = self.sequence_head(x) + output = ESMCOutput(sequence_logits=sequence_logits, embeddings=x) + return output + + def encode(self, input: ESMProtein) -> ESMProteinTensor: + input = attr.evolve(input) # Make a copy + sequence_tokens = None + + if input.sequence is not None: + sequence_tokens = encoding.tokenize_sequence( + input.sequence, self.tokenizer, add_special_tokens=True + ) + return ESMProteinTensor(sequence=sequence_tokens).to( + next(self.parameters()).device + ) + + def decode(self, input: ESMProteinTensor) -> ESMProtein: + input = attr.evolve(input) # Make a copy + + assert input.sequence is not None + sequence = decode_sequence(input.sequence[1:-1], self.tokenizer) + + return ESMProtein(sequence=sequence) + + def logits( + self, + input: ESMProteinTensor | _BatchedESMProteinTensor, + config: LogitsConfig = LogitsConfig(), + ) -> LogitsOutput: + if not isinstance(input, _BatchedESMProteinTensor): + # Create batch dimension if necessary. 
+ input = _BatchedESMProteinTensor.from_protein_tensor(input) + + device = torch.device(input.device) + + with ( + torch.no_grad(), + torch.autocast(enabled=True, device_type=device.type, dtype=torch.bfloat16) # type: ignore + if device.type == "cuda" + else contextlib.nullcontext(), + ): + output = self.forward(sequence_tokens=input.sequence) + + return LogitsOutput( + logits=ForwardTrackData( + sequence=output.sequence_logits if config.sequence else None + ), + embeddings=output.embeddings if config.return_embeddings else None, + ) diff --git a/esm/models/function_decoder.py b/esm/models/function_decoder.py index 2c34ce6..913af17 100644 --- a/esm/models/function_decoder.py +++ b/esm/models/function_decoder.py @@ -42,7 +42,7 @@ class FunctionTokenDecoderConfig: interpro_entry_list: str = field(default_factory=lambda: str(C.INTERPRO_ENTRY)) # Path to keywords vocabulary. keyword_vocabulary_path: str = field( - default_factory=lambda: str(C.data_root() / C.KEYWORDS_VOCABULARY) + default_factory=lambda: str(C.data_root("esm3") / C.KEYWORDS_VOCABULARY) ) # Whether to unpack LSH bits into single-bit tokens. 
     unpack_lsh_bits: bool = True
diff --git a/esm/pretrained.py b/esm/pretrained.py
index a570cb3..df2686c 100644
--- a/esm/pretrained.py
+++ b/esm/pretrained.py
@@ -4,6 +4,7 @@
 import torch.nn as nn
 
 from esm.models.esm3 import ESM3
+from esm.models.esmc import ESMC
 from esm.models.function_decoder import FunctionTokenDecoder
 from esm.models.vqvae import (
     StructureTokenDecoder,
@@ -16,6 +17,8 @@
     ESM3_OPEN_SMALL,
     ESM3_STRUCTURE_DECODER_V0,
     ESM3_STRUCTURE_ENCODER_V0,
+    ESMC_300M,
+    ESMC_600M,
 )
 
 ModelBuilder = Callable[[torch.device | str], nn.Module]
@@ -27,7 +30,8 @@ def ESM3_structure_encoder_v0(device: torch.device | str = "cpu"):
         d_model=1024, n_heads=1, v_heads=128, n_layers=2, d_out=128, n_codes=4096
     ).eval()
     state_dict = torch.load(
-        data_root() / "data/weights/esm3_structure_encoder_v0.pth", map_location=device
+        data_root("esm3") / "data/weights/esm3_structure_encoder_v0.pth",
+        map_location=device,
     )
     model.load_state_dict(state_dict)
     return model
@@ -37,7 +41,8 @@ def ESM3_structure_decoder_v0(device: torch.device | str = "cpu"):
     with torch.device(device):
         model = StructureTokenDecoder(d_model=1280, n_heads=20, n_layers=30).eval()
     state_dict = torch.load(
-        data_root() / "data/weights/esm3_structure_decoder_v0.pth", map_location=device
+        data_root("esm3") / "data/weights/esm3_structure_decoder_v0.pth",
+        map_location=device,
     )
     model.load_state_dict(state_dict)
     return model
@@ -47,12 +52,47 @@ def ESM3_function_decoder_v0(device: torch.device | str = "cpu"):
     with torch.device(device):
         model = FunctionTokenDecoder().eval()
     state_dict = torch.load(
-        data_root() / "data/weights/esm3_function_decoder_v0.pth", map_location=device
+        data_root("esm3") / "data/weights/esm3_function_decoder_v0.pth",
+        map_location=device,
     )
     model.load_state_dict(state_dict)
     return model
 
 
+def ESMC_300M_202412(device: torch.device | str = "cpu"):
+    with torch.device(device):
+        model = ESMC(
+            d_model=960,
+            n_heads=15,
+            n_layers=30,
+            tokenizer=get_model_tokenizers(ESM3_OPEN_SMALL).sequence,
+        ).eval()
+    state_dict = torch.load(
+        data_root("esmc-300") / "data/weights/esmc_300m_2024_12_v0.pth",
+        map_location=device,
+    )
+    model.load_state_dict(state_dict)
+
+    return model
+
+
+def ESMC_600M_202412(device: torch.device | str = "cpu"):
+    with torch.device(device):
+        model = ESMC(
+            d_model=1152,
+            n_heads=18,
+            n_layers=36,
+            tokenizer=get_model_tokenizers(ESM3_OPEN_SMALL).sequence,
+        ).eval()
+    state_dict = torch.load(
+        data_root("esmc-600") / "data/weights/esmc_600m_2024_12_v0.pth",
+        map_location=device,
+    )
+    model.load_state_dict(state_dict)
+
+    return model
+
+
 def ESM3_sm_open_v0(device: torch.device | str = "cpu"):
     with torch.device(device):
         model = ESM3(
@@ -66,7 +106,7 @@ def ESM3_sm_open_v0(device: torch.device | str = "cpu"):
         tokenizers=get_model_tokenizers(ESM3_OPEN_SMALL),
     ).eval()
     state_dict = torch.load(
-        data_root() / "data/weights/esm3_sm_open_v1.pth", map_location=device
+        data_root("esm3") / "data/weights/esm3_sm_open_v1.pth", map_location=device
     )
     model.load_state_dict(state_dict)
     return model
@@ -77,6 +117,8 @@
     ESM3_STRUCTURE_ENCODER_V0: ESM3_structure_encoder_v0,
     ESM3_STRUCTURE_DECODER_V0: ESM3_structure_decoder_v0,
     ESM3_FUNCTION_DECODER_V0: ESM3_function_decoder_v0,
+    ESMC_600M: ESMC_600M_202412,
+    ESMC_300M: ESMC_300M_202412,
 }
diff --git a/esm/sdk/api.py b/esm/sdk/api.py
index 3f9dec5..ae96b4c 100644
--- a/esm/sdk/api.py
+++ b/esm/sdk/api.py
@@ -423,3 +423,23 @@ def forward_and_sample(
     def raw_model(self):
         # Get underlying esm3 model of an inference client.
         raise NotImplementedError
+
+
+class ESMCInferenceClient(ABC):
+    def encode(self, input: ESMProtein) -> ESMProteinTensor:
+        # Encode converts a RawRepresentation into a TokenizedRepresentation.
+        raise NotImplementedError
+
+    def decode(self, input: ESMProteinTensor) -> ESMProtein:
+        # Decode is the inverse of encode.
+        raise NotImplementedError
+
+    def logits(
+        self, input: ESMProteinTensor, config: LogitsConfig = LogitsConfig()
+    ) -> LogitsOutput:
+        raise NotImplementedError
+
+    @property
+    def raw_model(self):
+        # Get underlying esmc model of an inference client.
+        raise NotImplementedError
diff --git a/esm/tokenization/function_tokenizer.py b/esm/tokenization/function_tokenizer.py
index 15d6070..e5d2169 100644
--- a/esm/tokenization/function_tokenizer.py
+++ b/esm/tokenization/function_tokenizer.py
@@ -19,7 +19,7 @@
 def _default_data_path(x: PathLike | None, d: PathLike) -> PathLike:
-    return x if x is not None else C.data_root() / d
+    return x if x is not None else C.data_root("esm3") / d
 
 
 def _default_local_data_path(x: PathLike | None, d: PathLike) -> PathLike:
diff --git a/esm/tokenization/residue_tokenizer.py b/esm/tokenization/residue_tokenizer.py
index cf6ff48..c64fdb3 100644
--- a/esm/tokenization/residue_tokenizer.py
+++ b/esm/tokenization/residue_tokenizer.py
@@ -15,7 +15,7 @@ class ResidueAnnotationsTokenizer(EsmTokenizerBase):
     def __init__(self, csv_path: str | None = None, max_annotations: int = 16):
         if csv_path is None:
-            csv_path = str(C.data_root() / C.RESID_CSV)
+            csv_path = str(C.data_root("esm3") / C.RESID_CSV)
         self.csv_path = csv_path
         self.max_annotations = max_annotations
diff --git a/esm/utils/constants/esm3.py b/esm/utils/constants/esm3.py
index 8a02be5..bea46ed 100644
--- a/esm/utils/constants/esm3.py
+++ b/esm/utils/constants/esm3.py
@@ -97,11 +97,18 @@
 @staticmethod
 @cache
-def data_root():
+def data_root(model: str):
     if "INFRA_PROVIDER" in os.environ:
         return Path("")
     # Try to download from huggingface if it doesn't exist
-    path = Path(snapshot_download(repo_id="EvolutionaryScale/esm3-sm-open-v1"))
+    if model.startswith("esm3"):
+        path = Path(snapshot_download(repo_id="EvolutionaryScale/esm3-sm-open-v1"))
+    elif model.startswith("esmc-300"):
+        path = Path(snapshot_download(repo_id="EvolutionaryScale/esmc-300m-2024-12"))
+    elif model.startswith("esmc-600"):
+        path = Path(snapshot_download(repo_id="EvolutionaryScale/esmc-600m-2024-12"))
+    else:
+        raise ValueError(f"{model=} is an invalid model name.")
     return path
diff --git a/esm/utils/constants/models.py b/esm/utils/constants/models.py
index d3835d7..ea20342 100644
--- a/esm/utils/constants/models.py
+++ b/esm/utils/constants/models.py
@@ -6,6 +6,8 @@
 ESM3_STRUCTURE_ENCODER_V0 = "esm3_structure_encoder_v0"
 ESM3_STRUCTURE_DECODER_V0 = "esm3_structure_decoder_v0"
 ESM3_FUNCTION_DECODER_V0 = "esm3_function_decoder_v0"
+ESMC_600M = "esmc_600m"
+ESMC_300M = "esmc_300m"
 
 
 def model_is_locally_supported(x: str):
diff --git a/examples/esmc_examples.py b/examples/esmc_examples.py
new file mode 100644
index 0000000..2fe423d
--- /dev/null
+++ b/examples/esmc_examples.py
@@ -0,0 +1,34 @@
+from esm.models.esmc import ESMC
+from examples.local_generate import get_sample_protein
+from esm.sdk.api import (
+    ESMCInferenceClient,
+    LogitsConfig,
+    LogitsOutput,
+)
+
+
+def main(client: ESMCInferenceClient):
+    # ================================================================
+    # Example usage: one single protein
+    # ================================================================
+    protein = get_sample_protein()
+    protein.coordinates = None
+    protein.function_annotations = None
+    protein.sasa = None
+
+    # Use the logits endpoint; bf16 is used on CUDA for inference optimization.
+    protein_tensor = client.encode(protein)
+    logits_output = client.logits(
+        protein_tensor, LogitsConfig(sequence=True, return_embeddings=True)
+    )
+    assert isinstance(
+        logits_output, LogitsOutput
+    ), f"LogitsOutput was expected but got {logits_output}"
+    assert (
+        logits_output.logits is not None and logits_output.logits.sequence is not None
+    )
+    assert logits_output.embeddings is not None
+
+
+if __name__ == "__main__":
+    main(ESMC.from_pretrained("esmc_300m"))
diff --git a/pyproject.toml b/pyproject.toml
index 8ee478e..f7ff9db 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "esm"
-version = "3.0.8"
+version = "3.1.0"
 description = "EvolutionaryScale open model repository"
 readme = "README.md"
 requires-python = ">=3.10"
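Reviewer note: the `data_root(model)` change in `esm/utils/constants/esm3.py` dispatches on a model-name prefix to select which Hugging Face snapshot to download. A minimal standalone sketch of that dispatch logic (the `resolve_repo_id` helper name is hypothetical; no network access or `snapshot_download` call, just the prefix-to-repo mapping taken from the diff):

```python
def resolve_repo_id(model: str) -> str:
    # Mirrors the prefix dispatch added to data_root(): the model-name
    # prefix selects the Hugging Face repo holding the weights.
    if model.startswith("esm3"):
        return "EvolutionaryScale/esm3-sm-open-v1"
    elif model.startswith("esmc-300"):
        return "EvolutionaryScale/esmc-300m-2024-12"
    elif model.startswith("esmc-600"):
        return "EvolutionaryScale/esmc-600m-2024-12"
    # Unknown prefixes fail loudly, matching the ValueError in the diff.
    raise ValueError(f"{model=} is an invalid model name.")

print(resolve_repo_id("esmc-300"))  # -> EvolutionaryScale/esmc-300m-2024-12
```

Prefix matching (rather than exact names) is what lets callers pass variants like `"esmc-300"` and `"esmc-600"` while the builder functions stay in control of the exact weight filenames under `data/weights/`.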