any way to generate music longer than 47 seconds? #154
As the title describes. Can continuation methods be used?
Comments
Just change `sample_size` in the model_config to a value corresponding to more than 47 s and you should be able to do it.
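For concreteness, here is a minimal sketch of that edit, assuming a Stable Audio Open-style `model_config.json` with top-level `sample_rate` and `sample_size` keys (the default 2097152 samples at 44.1 kHz is what gives the ~47 s cap). The file name and the downsampling multiple are assumptions; adapt them to your setup:

```python
import json

# Sketch: raise sample_size so the model generates ~95 s instead of ~47 s.
# Assumes a Stable Audio Open-style model_config.json; the downsampling
# multiple (2048) is an assumption about the autoencoder's total stride.
TARGET_SECONDS = 95
DOWNSAMPLE = 2048  # keep sample_size a multiple of the autoencoder stride

with open("model_config.json") as f:
    cfg = json.load(f)

sr = cfg["sample_rate"]  # e.g. 44100
new_size = ((TARGET_SECONDS * sr) // DOWNSAMPLE) * DOWNSAMPLE
cfg["sample_size"] = new_size

with open("model_config.json", "w") as f:
    json.dump(cfg, f, indent=2)

print(f"sample_size -> {new_size} ({new_size / sr:.1f} s)")
```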
@kenkalang Thank you for your response. The length has indeed increased, but the audio quality seems to have deteriorated, and the `seconds_start` and `seconds_total` conditioning is not functioning as expected. The pre-trained model was trained for a duration of 47 seconds, so a direct modification like this might not be appropriate.
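For reference, the timing conditioning in question is passed at inference roughly like this (adapted from the stable-audio-tools usage example; the prompt and sampler settings are placeholders). A `seconds_total` beyond the ~47 s the model saw in training is out of distribution for the timing conditioners, which is likely why they stop behaving after raising `sample_size`:

```python
import torch
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

device = "cuda" if torch.cuda.is_available() else "cpu"
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
model = model.to(device)

# Timing conditioning: values past the training duration (~47 s)
# are out of distribution for the pre-trained conditioners.
conditioning = [{
    "prompt": "128 BPM tech house drum loop",  # placeholder prompt
    "seconds_start": 0,
    "seconds_total": 47,
}]

output = generate_diffusion_cond(
    model,
    steps=100,
    cfg_scale=7,
    conditioning=conditioning,
    sample_size=model_config["sample_size"],
    device=device,
)
```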
Yeah, you should also fine-tune the pre-trained model on your dataset if you want it to have better quality.
@kenkalang If fine-tuning is to be performed, does the autoencoder part of the network also require fine-tuning, aside from the DiT?
As the autoencoder is fully convolution-based, I think it does not need any fine-tuning.
Yeah, just tune the DiT; that's where the most impact happens.
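A minimal sketch of what "tune only the DiT" can look like, assuming the stable-audio-tools convention where the diffusion wrapper exposes the autoencoder as `model.pretransform` and the DiT backbone as `model.model` (these attribute names are assumptions; check your model class):

```python
import torch

def freeze_autoencoder(model: torch.nn.Module) -> None:
    # Assumed layout: `pretransform` wraps the convolutional autoencoder,
    # which stays frozen; only the DiT backbone keeps gradients.
    for p in model.pretransform.parameters():
        p.requires_grad = False
    model.pretransform.eval()

def trainable_params(model: torch.nn.Module):
    # Only the parameters left trainable (the DiT) go to the optimizer.
    return [p for p in model.parameters() if p.requires_grad]

# Usage (hypothetical): freeze_autoencoder(model)
# optimizer = torch.optim.AdamW(trainable_params(model), lr=1e-5)
```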
Look up the MultiDiffusion paper; it should work fine with this to generate arbitrary-length music.
@zaptrem Are you referring to the "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" paper when you mention MultiDiffusion? Are there any other applications related to music?
See pages 49 and 61 of the Movie Gen paper; they use it for their audio accompaniment model: https://ai.meta.com/static-resource/movie-gen-research-paper
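For intuition, the core MultiDiffusion trick applied to 1-D audio latents is: at each denoising step, run the fixed-window model on overlapping windows of a longer latent and average the predictions where windows overlap. A self-contained sketch follows; `denoise_fn` and the window/hop sizes are placeholders, not stable-audio-tools API:

```python
import torch

def multidiffusion_step(denoise_fn, x, t, window=1024, hop=512):
    """One MultiDiffusion-style update on a long latent x of shape
    (batch, channels, length), using a model that only handles
    `window`-length inputs. `denoise_fn(chunk, t)` stands in for one
    denoising step of a fixed-length diffusion model."""
    length = x.shape[-1]
    window = min(window, length)
    out = torch.zeros_like(x)
    count = torch.zeros_like(x)
    starts = list(range(0, length - window + 1, hop))
    if starts[-1] + window < length:
        starts.append(length - window)  # make sure the tail is covered
    for s in starts:
        out[..., s:s + window] += denoise_fn(x[..., s:s + window], t)
        count[..., s:s + window] += 1
    return out / count  # average overlapping window predictions
```

Since the autoencoder is fully convolutional (as noted above), only the DiT has a fixed window: the averaging can run in latent space at every sampler step, and the long latent is decoded once at the end.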