Should we always pass rng to the model?
#721
Comments
I would prefer to force the user to pass the |
Where is it that it will tend to be available? I mean, is it always going to be obvious to a user which rng winds up getting used?
Second, having a |
Problem here is that the user then has to specify two rngs when calling a bunch of functions, e.g.
Regarding the
For example, samplers which only use log-density evaluation, e.g.

    f = LogDensityFunction(model)
    sample(f, external_sampler, ...)

We would then like a user to be able to do the following:

    sample(rng, model, external_sampler, ...)

and know that the provided
It's used in the tilde pipeline whenever we have a |
I don't see this as a problem. The two rngs would serve two quite distinct purposes.
I'm not sure this is a problem. Do you have a use case where this would result in unexpected or undesirable behaviour? I'm more tolerant of rngs mutating than mutation more generally. If I do something like

    @model function my_model(rng)
        blahblah
    end
    model_instance = my_model(rng)
    model_instance()
    sample(model_instance, NUTS(), 100)
    rand(model_instance)

then I wouldn't be offended that the results I get depend on things like how many samples I pulled or which sampler I used. Actually, that's exactly what I would expect. I certainly wouldn't want all my samples to be identical.
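The behaviour described in this comment can be sketched in plain Julia, without DynamicPPL. `ToyModel` and `draw` are hypothetical names made up for illustration; the only assumption is that the model object holds on to the rng it was constructed with.

```julia
using Random

# Hypothetical stand-in for a model instance that stores its own rng.
struct ToyModel{R<:AbstractRNG}
    rng::R
end

# Every draw advances the stored rng, so later results depend on earlier calls.
draw(m::ToyModel) = rand(m.rng)

m1 = ToyModel(MersenneTwister(0))
first_draw = draw(m1)
second_draw = draw(m1)   # differs from first_draw: the rng state has moved on

# A fresh model with the same seed reproduces the first draw.
m2 = ToyModel(MersenneTwister(0))
replayed = draw(m2)
```

So the sequence as a whole is reproducible from the seed, while any individual result depends on how many draws came before it.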
I was actually thinking of something simpler: If we would come to the conclusion that we want to make
Sure, but there's a reason why we try to stick to using a single RNG in a program. Simple example: I want to parallelize the sampling within my sampler, so when the user does

    sample(rng, model, parallel_sampler, ...)

I convert this into

    rngs = split(rng, num_devices)
    pmap(rngs) do rng
        sample(rng, model, parallel_sampler, ...)
    end

Here we're clearly not achieving what we want if there's also a separate
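A rough sketch of the splitting pattern, using only the `Random` standard library. `split_rng`, `worker` and `run_all` are hypothetical helpers, not existing functions: Julia's `Random` has no `split` for RNGs, so here the child RNGs are simply seeded from draws of the parent, which is enough to illustrate reproducibility but is not a statistically rigorous way to get independent streams. `map` stands in for `pmap` to keep the example free of `Distributed`.

```julia
using Random

# Hypothetical helper: derive `n` child RNGs by drawing their seeds from the parent rng.
split_rng(rng::AbstractRNG, n::Integer) =
    [MersenneTwister(rand(rng, UInt64)) for _ in 1:n]

# Stand-in for one worker's sampling run; all of its randomness comes from `rng`.
worker(rng::AbstractRNG) = rand(rng, 3)

# The whole run is a function of the single seed passed in by the user.
run_all(seed) = map(worker, split_rng(MersenneTwister(seed), 4))

results_a = run_all(1)
results_b = run_all(1)
```

If `worker` also drew from a second rng that was not derived from the one being split, `results_a` and `results_b` would no longer be guaranteed to match.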
Ah, sure then I'm very much whatever. I agree that |
Back in the day, the evaluator for a @model would look like [code example not preserved] or something like this.
But when we started making use of contexts more drastically in #249, this became instead the simpler [code example not preserved], and instead we provide the rng and sampler arguments sometimes through the SamplingContext(rng, sampler, context), thus providing a clear separation between when we're sampling and when we're evaluating a model.
However, this consequently doesn't allow someone to define models with inherent randomness in them while still preserving determinism conditional on an rng.
A simple example is an implementation of a model with subsampling of the data. I could do this as follows: [code example not preserved]
However, the issue in the above is ofc that the rand call inside @model doesn't have access to the rng used internally in the model, and thus cannot ensure that everything is deterministic given an rng.
Ofc, the user could provide the rng as a specific argument, but this seems quite redundant as we often will have an rng available.
As a result, I'm thinking that it might be useful to remove the rng from the SamplingContext and instead make it a similarly "private" variable so users could do [code example not preserved] as they could before.
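To make the determinism concern concrete, here is a minimal sketch in plain Julia. It is not the model code from this issue and does not use DynamicPPL; `subsample_global` and `subsample_with_rng` are hypothetical functions standing in for a model body that subsamples its data.

```julia
using Random

# Draws from the global RNG: an rng supplied elsewhere by the caller has no effect here.
subsample_global(data, n) = data[randperm(length(data))[1:n]]

# Threads the rng through: the subsample is a deterministic function of the rng state.
subsample_with_rng(rng::AbstractRNG, data, n) = data[randperm(rng, length(data))[1:n]]

data = collect(1:100)

# Same seed, same subsample.
a = subsample_with_rng(MersenneTwister(42), data, 10)
b = subsample_with_rng(MersenneTwister(42), data, 10)

# Depends on whatever state the global RNG happens to be in.
c = subsample_global(data, 10)
```

The second form is what a model body could do if it had access to the rng used for sampling, whether as an explicit argument or as a "private" variable.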
Thoughts? @mhauru @penelopeysm @yebai @willtebbutt @sunxd3