Document (1) how to use local models, (2) which model classes are supported by prompt2model #356
Comments
Thanks for the interest @chensimian, can I clarify your request? Basically, prompt2model identifies the most useful model to fine-tune on the Hugging Face Hub, but instead of using a model from Hugging Face you'd like to use one on your local disk? I think you can do this by replacing the name of the pre-trained model that is passed into the trainer:

```python
trainer = GenerationModelTrainer(
    "/path/to/my/model",
    has_encoder=True,
    executor_batch_size=batch_size,
    tokenizer_max_length=1024,
    sequence_max_length=1280,
)
```

Please tell us if this works, or doesn't work.
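Before passing a local path to the trainer, it can help to confirm the directory actually contains the files that `transformers`' `save_pretrained` writes out. The sketch below is a stdlib-only illustration; `looks_like_local_model` is a hypothetical helper, not part of prompt2model, and the file names follow the usual `transformers` conventions.

```python
# Hedged sketch: check that a directory looks like a model saved with
# transformers' save_pretrained() before handing it to GenerationModelTrainer.
# looks_like_local_model is a hypothetical helper, not part of prompt2model.
import os


def looks_like_local_model(path: str) -> bool:
    """Return True if `path` contains a config plus at least one weights file."""
    if not os.path.isdir(path):
        return False
    has_config = os.path.isfile(os.path.join(path, "config.json"))
    has_weights = any(
        os.path.isfile(os.path.join(path, name))
        for name in ("pytorch_model.bin", "model.safetensors")
    )
    return has_config and has_weights
```

If this returns False for your path, the trainer will likely fall back to treating the string as a Hub model name and try to download it.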
Which models are supported, such as Llama or Baichuan?
Where on disk is the downloaded Hugging Face model stored?
I tried the modification you suggested, but it didn't work. It still redirects to the Hugging Face website. For example, the following error occurs:
Hi @chensimian, thanks for clarifying! Let me respond to the questions. I think the first two questions should be documented, so I'd like to leave this issue open until we document them.
We support all models that are supported by Hugging Face's `transformers` library.
This is stored in your standard Hugging Face cache directory.
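For reference, the default cache location can be computed as in the sketch below, which assumes the defaults documented by `huggingface_hub`: the `HF_HOME` environment variable overrides `~/.cache/huggingface`, and downloaded snapshots live under its `hub` subdirectory.

```python
# Sketch of the default Hugging Face cache path resolution.
# Assumption: defaults as documented by huggingface_hub (HF_HOME env var,
# otherwise ~/.cache/huggingface, with models under the "hub" subfolder).
import os


def default_hf_cache_dir() -> str:
    hf_home = os.environ.get(
        "HF_HOME",
        os.path.join(os.path.expanduser("~"), ".cache", "huggingface"),
    )
    return os.path.join(hf_home, "hub")
```

Models fetched with `from_pretrained` typically appear there in folders named `models--<org>--<name>`.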
This is a different error, related to the use of models that require execution of code that is not trusted by Hugging Face (so you need to grant special permission). I will create a separate issue for supporting this. We would be happy to accept a PR if you can fix this!
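For readers hitting the same error when loading such a model themselves, the usual workaround is the `trust_remote_code` flag on `from_pretrained`. A minimal sketch is below; the path is a placeholder, and the flag should only be enabled for repositories whose code you have reviewed, since it runs the repo's own Python code.

```python
# Sketch: kwargs for loading a model that ships custom modeling code.
# The path is a placeholder. trust_remote_code=True tells transformers to
# execute the repository's own modeling code, so only set it for code you trust.
load_kwargs = {
    "pretrained_model_name_or_path": "/path/to/local/model",
    "trust_remote_code": True,
}
# Then, for example:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(**load_kwargs)
```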
You say the prompt2model model retriever only retrieves models that are smaller than 3GB on disk, so many models will be excluded as being too big. But I want to use a model that is larger than 3GB. What should I do?
If you're downloading and loading the model locally (as you asked at the beginning of this thread), then this is not a problem: prompt2model will happily train that model for you. If you want to use the model retriever, we will need to fix issue #273 first to make the maximum model size configurable.
Are you asking whether the issue you mentioned is still unresolved? For example, when I try to train using the chatgpt 6b model, I encounter this error: `ValueError: Expected input batch_size (284) to match target batch_size (51).`
Hi @chensimian, I haven't tried with this model (and I think you mean chatglm-6b?). Can you please share the full stack trace for that `ValueError` to help us debug? Thank you.
I don't use the model retriever to retrieve models. For example, if I train a Baichuan model directly from disk, the following error occurs:

```
Error: Model type should be one of BartConfig, BigBirdPegasusConfig, BlenderbotConfig, BlenderbotSmallConfig, EncoderDecoderConfig, FSMTConfig, GPTSanJapaneseConfig, LEDConfig, LongT5Config, M2M100Config, MarianConfig, MBartConfig, MT5Config, MvpConfig, NllbMoeConfig, PegasusConfig, PegasusXConfig, PLBartConfig, ProphetNetConfig, SwitchTransformersConfig, T5Config, UMT5Config, XLMProphetNetConfig.
```
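An error like the one above means the local model's `config.json` declares a `model_type` outside the encoder-decoder families that `has_encoder=True` training accepts (Baichuan is a decoder-only model). A quick stdlib-only check is sketched below; the set of type strings is a partial, hand-written illustration keyed to some of the config classes in the error message, and the real check lives inside `transformers`.

```python
# Illustrative check: read a local model's config.json and see whether its
# model_type is one of the encoder-decoder families from the error above.
# ENCODER_DECODER_TYPES is a partial, hand-written set for illustration only;
# the authoritative mapping is inside the transformers library.
import json
import os

ENCODER_DECODER_TYPES = {"bart", "mbart", "t5", "mt5", "pegasus", "marian", "led", "longt5"}


def is_encoder_decoder_model(model_dir: str) -> bool:
    with open(os.path.join(model_dir, "config.json")) as f:
        config = json.load(f)
    return config.get("model_type", "") in ENCODER_DECODER_TYPES
```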
How can I point prompt2model at a locally downloaded model so that it doesn't download the model from Hugging Face?