Make cache_length an argument from the Sampler #70

Merged: 1 commit into main from test_708307065 on Jan 6, 2025


@copybara-service copybara-service bot commented Jan 2, 2025

Make cache_length an argument from the Sampler

max_cache_length is currently defined in the TransformerConfig, but it really should be a sampler argument.

  • This would allow the same transformer instance to be used for both training and inference.
  • From an API perspective, creating the Gemma configs for the 2b,... models should not require any arguments.
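
The two points above can be illustrated with a minimal sketch. It is not the Gemma library's actual API: `ToyTransformerConfig`, `ToyTransformer`, `ToySampler`, `init_cache`, and `make_cache` are hypothetical names invented for this example, and the cache is a plain nested list rather than real key/value buffers. It only shows the shape of the design, where the config takes no arguments and the cache length lives on the sampler.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class ToyTransformerConfig:
    """Hypothetical config: describes the model only, with no cache size."""
    num_layers: int = 2
    embed_dim: int = 8


@dataclasses.dataclass(frozen=True)
class ToyTransformer:
    """Hypothetical stand-in for a transformer; it only holds its config."""
    config: ToyTransformerConfig

    def init_cache(self, cache_length: int) -> list[list[float]]:
        # One zero-filled buffer per layer, sized by the caller (assumption:
        # a real cache would hold per-layer key/value arrays instead).
        return [[0.0] * cache_length for _ in range(self.config.num_layers)]


@dataclasses.dataclass(frozen=True)
class ToySampler:
    """Hypothetical sampler: owns the inference-only cache_length."""
    transformer: ToyTransformer
    cache_length: int

    def make_cache(self) -> list[list[float]]:
        return self.transformer.init_cache(self.cache_length)


# The config is built without arguments, and one model instance is shared
# by samplers with different cache sizes (a training loop would need none).
model = ToyTransformer(ToyTransformerConfig())
short_sampler = ToySampler(model, cache_length=16)
long_sampler = ToySampler(model, cache_length=64)
```

Because the cache size is no longer baked into the model, changing it means constructing a new sampler rather than a new transformer.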

@copybara-service copybara-service bot changed the title from "Add SamplerEval" to "Make cache_length an argument from the Sampler" on Jan 2, 2025

PiperOrigin-RevId: 712484211
@copybara-service copybara-service bot merged commit d03cfd6 into main Jan 6, 2025
@copybara-service copybara-service bot deleted the test_708307065 branch January 6, 2025 12:38