Added random module #18

k88hudson-cfa · 2024-08-08T17:35:42Z

This PR adds a module to support creating and using random number generators. It's implemented as a trait extension on Context, with a DataPlugin to store a base random seed and a hashmap of rng instances. The design supports any rng implementation that implements rand::SeedableRng.

The basic idea is you initialize the utility with a base random seed when setting up a context:

context.set_base_random_seed(12345678);

In a module that needs an independent random number generator, you use a macro to define an independent type, and then call context.get_rng::<YourType>() to retrieve it. For example:

define_rng!(FooRng);

let mut rng = context.get_rng::<FooRng>();
let value = rng.next_u64();

Note that an rng needs a mutable reference because it has to store internal state about previous calls in order to be reproducible.

One difference from the eosim implementation here is that I destroy existing rngs if base_seed is changed rather than using a flag on the holder.
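To make the design above concrete, here is a minimal standard-library-only sketch. It is not the code in this PR: `ToyRng` (a SplitMix64 generator), `ToyContext`, and the FNV-1a hash of the type name used to derive per-type seeds are all stand-ins chosen for illustration. It shows one rng per marker type, created lazily from the base seed, and all rngs destroyed when the base seed is reset.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Toy SplitMix64 generator standing in for a real `rand` rng (assumption).
pub struct ToyRng {
    state: u64,
}

impl ToyRng {
    pub fn new(seed: u64) -> Self {
        ToyRng { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// FNV-1a over the type name; an arbitrary way to give each marker type
/// its own seed offset in this sketch.
fn name_hash<R>() -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in std::any::type_name::<R>().bytes() {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hypothetical context: a base seed plus one lazily created rng per marker type.
pub struct ToyContext {
    base_seed: u64,
    rngs: HashMap<TypeId, ToyRng>,
}

impl ToyContext {
    pub fn new(base_seed: u64) -> Self {
        ToyContext { base_seed, rngs: HashMap::new() }
    }

    /// Changing the base seed destroys existing rngs; they are rebuilt on next use.
    pub fn set_base_random_seed(&mut self, seed: u64) {
        self.base_seed = seed;
        self.rngs.clear();
    }

    /// Returns the rng for marker type `R`, creating it on first use.
    pub fn get_rng<R: 'static>(&mut self) -> &mut ToyRng {
        let seed = self.base_seed.wrapping_add(name_hash::<R>());
        self.rngs
            .entry(TypeId::of::<R>())
            .or_insert_with(|| ToyRng::new(seed))
    }
}

// Marker types, playing the role of what `define_rng!` would generate.
pub struct FooRng;
pub struct BarRng;
```

Resetting the base seed to the same value restarts every stream from the beginning, which is the behavior that destroying the rngs is meant to provide.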

k88hudson-cfa · 2024-08-13T19:52:28Z

So I gave explicit initialization a try (implementation here if you're curious: k88hudson_rng_explcit_init), which ends up looking something like this:

In your init function of the module, you call create_rng:

fn init(context: &mut Context) {
    // ...
    context.create_rng::<FooRng>();
}

And then later you call get_rng as normal:

fn some_other_method(context: &mut Context) {
    let mut foo_rng = context.get_rng::<FooRng>();

    foo_rng.next_u64();
    // ...
    foo_rng.next_u64();
}

The thing I don't like about this is that if you want to reset the base random seed, you have to deal with reinitialization (maybe by iterating through the existing rngs and replacing them?), which seems kind of not great.

k88hudson-cfa · 2024-08-15T21:46:26Z

I spent a bunch of time but I couldn't figure out how to get this to work with interior mutability for the RngHolders which also having the whole HashMap in a RefCell, so I think I'm just gonna leave it as is for now

ekr-cfa · 2024-08-16T23:45:21Z

I spent a bunch of time but I couldn't figure out how to get this to work with interior mutability for the RngHolders which also having the whole HashMap in a RefCell, so I think I'm just gonna leave it as is for now

Should this say "without also"?

ekr-cfa · 2024-08-16T23:46:05Z

src/random.rs

+        struct $random_id {}
+
+        impl $crate::random::RngId for $random_id {
+            // TODO: This is hardcoded to StdRng; we should replace this


Suggested change

-            // TODO: This is hardcoded to StdRng; we should replace this
+            // TODO([email protected]): This is hardcoded to StdRng; we should replace this

https://google.github.io/styleguide/cppguide.html#TODO_Comments

ekr-cfa · 2024-08-16T23:49:33Z

src/random.rs

+}
+pub use define_rng;
+
+pub trait RngId: Any {


Why does this need to implement Any?

After some discussion, it turns out the problem here was the lifetime of R (the type id) in get_rng<R: RngId>, which apparently is scoped to the function, whereas the closure where it is used has a static lifetime. The right fix was to change the signature to fn get_rng<R: RngId + 'static>.

I think you could get away with just RngId: 'static so that it has a TypeId, or move the static lifetime requirement into the get_rng / sample method signatures.
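A small sketch of the `'static` point, using only the standard library. `TypeId::of` requires its type parameter to be `'static`, and a `'static` supertrait bound on the marker trait is enough to satisfy that without pulling in `Any`. The `rng_key` helper and the marker types are hypothetical names for illustration.

```rust
use std::any::TypeId;

/// Marker trait. The `'static` supertrait bound is what allows `TypeId::of::<R>()`
/// to be called for any `R: RngId`, with no need for `Any`.
pub trait RngId: 'static {}

pub struct FooRng;
pub struct BarRng;

impl RngId for FooRng {}
impl RngId for BarRng {}

/// Hypothetical helper: the key under which an rng for `R` would be stored.
pub fn rng_key<R: RngId>() -> TypeId {
    TypeId::of::<R>()
}
```

The alternative mentioned above, leaving the trait unbounded and writing `R: RngId + 'static` on each method, is equivalent for callers; it just moves the bound to every signature that needs a `TypeId`.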

ekr-cfa · 2024-08-16T23:52:33Z

src/random.rs

+    }
+);
+
+#[allow(clippy::module_name_repetitions)]


Let's add a comment that this is going to be a trait extension on Context.

I also wonder if we should have the convention be Context<Thingy> rather than <Thingy>Context. Or maybe ContextThingyExtn.

ekr-cfa · 2024-08-16T23:53:38Z

src/random.rs

This is conceptually like the eosim code, but we also talked about a version where you didn't get the RNG but just told it what distribution to draw from. @jasonasher did you ever prototype that?

ekr-cfa · 2024-08-17T00:09:59Z

src/random.rs

+            .expect("You must initialize the random number generator with a base seed");
+
+        let rng_holders = data_container.rng_holders.try_borrow_mut().unwrap();
+


Extra blank line.

ekr-cfa · 2024-08-17T00:12:42Z

src/random.rs

+        let mut foo_rng = context.get_rng::<FooRng>();
+        assert_eq!(foo_rng.next_u64(), 5113542052170610017);
+        assert_eq!(foo_rng.next_u64(), 8640506012583485895);
+        assert_eq!(foo_rng.next_u64(), 16699691489468094833);


Do you really want to hardcode these values? This will break if rand changes the algorithm behind StdRng, which its documentation says may happen.
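One way to test reproducibility without pinning exact outputs is to compare two generators built from the same seed. The sketch below uses a toy SplitMix64 generator (`ToyRng`, an assumption standing in for StdRng) and a hypothetical `first_n` helper, so that it runs with the standard library alone.

```rust
/// Toy SplitMix64 generator standing in for a seedable rng (assumption).
pub struct ToyRng(u64);

impl ToyRng {
    pub fn seed_from_u64(seed: u64) -> Self {
        ToyRng(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: the first `n` draws from a freshly seeded generator.
pub fn first_n(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = ToyRng::seed_from_u64(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}
```

A test can then assert that `first_n(42, 3) == first_n(42, 3)` and that a different seed gives a different sequence. This checks determinism within one build, not stability across library versions; if a pinned stream is required, a generator with a named algorithm (for example ChaCha8Rng from rand_chacha) is the usual choice.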

ekr-cfa · 2024-08-17T00:14:17Z

src/random.rs

+
+    #[test]
+    #[should_panic]
+    fn get_rng_one_ref_per_rng_id() {


Isn't this test misnamed? The rule it is testing is that you can't hold references to any two rngs at once, since the second one you are getting is BarRng.

ekr-cfa · 2024-08-17T00:17:07Z

src/random.rs

+
+        let mut foo_rng = context.get_rng::<FooRng>();
+        foo_rng.next_u64();
+        drop(foo_rng);


Maybe comment this drop to explain why it's different from the previous test.
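The difference between the two tests comes down to how `RefCell` tracks borrows. The sketch below uses a plain `RefCell<Vec<u64>>` as a stand-in for the cell holding all rng holders (an assumption about the layout, based on the `rng_holders.try_borrow_mut()` line in the diff). While one mutable borrow is alive, a second one fails no matter which entry it is for; after `drop`, it succeeds.

```rust
use std::cell::RefCell;

/// A second mutable borrow is attempted while the first is still alive.
pub fn second_borrow_ok_while_first_alive() -> bool {
    let holders = RefCell::new(vec![1u64, 2]);
    let _first = holders.borrow_mut();
    // `_first` is still in scope, so this returns Err.
    let ok = holders.try_borrow_mut().is_ok();
    ok
}

/// The first borrow is dropped explicitly before the second is attempted.
pub fn second_borrow_ok_after_drop() -> bool {
    let holders = RefCell::new(vec![1u64, 2]);
    let first = holders.borrow_mut();
    drop(first);
    // No outstanding borrow remains, so this returns Ok.
    let ok = holders.try_borrow_mut().is_ok();
    ok
}
```

Because the whole collection sits behind one cell, the restriction is "one outstanding rng reference at a time", not "one per rng id".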

k88hudson-cfa · 2024-08-17T22:11:04Z

To summarize some discussion: one thing we should consider here is that we will mostly be using rngs in combination with some kind of distribution/sample function. These may be created in place by a module (like the Exp example below), and sometimes they might need to be stored as a global variable / in a data container (because they are pretty large / used in multiple places or whatever).

The design in this PR returns a RefMut, and you have to deal with that when you pass it to a sample function, which is maybe not ideal:

let mut foo_rng = context.get_rng::<FooRng>();
let dist = Exp::new(1.0 / bar).unwrap();
let value = dist.sample(&mut *foo_rng);

@jasonasher did some prototyping of an alternative design where instead of the context extension providing a get_rng method, it implements a sample that retrieves the rng internally:

let dist = context.get_data_container::<SamplerData>().unwrap();
let value = context.sample::<FooRng, usize>(|rng| dist.sample(rng));
// Alternatively, pass in the whole distribution
let value2 = context.sample_distr::<FooRng, usize>(dist);

I don't hate this, honestly, if you're pretty much always using rngs this way.
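A rough standard-library-only sketch of the closure-based shape. It omits the rng type parameter (`FooRng`) and keeps a single rng to stay short; `ToyRng`, `ToyContext`, and `exp_sample` are illustrative names, not the prototype's API. The point is that the context borrows the rng only for the duration of the closure, so callers never handle a RefMut.

```rust
use std::cell::RefCell;

/// Toy SplitMix64 generator standing in for a real rng (assumption).
pub struct ToyRng(u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in [0, 1) built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical context holding one rng behind a RefCell.
pub struct ToyContext {
    rng: RefCell<ToyRng>,
}

impl ToyContext {
    pub fn new(seed: u64) -> Self {
        ToyContext { rng: RefCell::new(ToyRng(seed)) }
    }

    /// Borrows the rng, runs the closure, and releases the borrow on return.
    pub fn sample<T>(&self, sampler: impl FnOnce(&mut ToyRng) -> T) -> T {
        let mut rng = self.rng.borrow_mut();
        sampler(&mut *rng)
    }
}

/// Exponential draw by inverse transform; `rate` must be positive.
pub fn exp_sample(rng: &mut ToyRng, rate: f64) -> f64 {
    -(1.0 - rng.next_f64()).ln() / rate
}
```

Usage would look like `let value = context.sample(|rng| exp_sample(rng, 2.0));`, and consecutive calls cannot trip over an outstanding borrow because none escapes `sample`.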

jasonasher · 2024-08-18T17:50:46Z

In my experience the most common way we use random sampling is either with distributions like this or by grabbing random f64/f32s to make future decisions, such as setting individual propensities to engage in various behaviors. So having a general sample method and then sample_distr and sample_xyz for various primitive xyz would cover all of the common use cases. I'd be interested to know if @confunguido agrees.

confunguido · 2024-08-18T19:18:15Z

In my experience the most common way we use random sampling is either with distributions like this or by grabbing random f64/f32s to make future decisions, such as setting individual propensities to engage in various behaviors. So having a general sample method and then sample_distr and sample_xyz for various primitive xyz would cover all of the common use cases. I'd be interested to know if @confunguido agrees.

@jasonasher. I think I agree. Sampling from distribution, making decisions, and grabbing a random individual from a group. As a side note, I think it would be nice to have the ability to specify and sample from custom probability distributions.

k88hudson-cfa · 2024-08-20T00:41:26Z

Alright, I think this is ready: updated this to expose sample and sample_distr – also added something to the macro to prevent name collisions.

jasonasher · 2024-08-20T16:23:56Z

@jasonasher. I think I agree. Sampling from distribution, making decisions, and grabbing a random individual from a group. As a side note, I think it would be nice to have the ability to specify and sample from custom probability distributions.

@confunguido These would be supported right out of the box with sample_distr: rand_distr. And, by implementing the Distribution<T> trait we can support custom distributions.
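As an illustration of the custom-distribution point, here is a sketch with a local stand-in trait shaped like rand's `Distribution<T>` (the real trait is generic over the rng type; this one is tied to a toy rng so the example runs with the standard library alone). `ToyRng`, `Weighted`, and the free function `sample_distr` are hypothetical names.

```rust
/// Toy SplitMix64 generator standing in for a real rng (assumption).
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Local stand-in for rand's `Distribution<T>` trait (assumption).
pub trait Distribution<T> {
    fn sample(&self, rng: &mut ToyRng) -> T;
}

/// Custom discrete distribution: index `i` is drawn with probability
/// `weights[i] / sum(weights)`. Assumes a non-empty list of non-negative weights.
pub struct Weighted {
    pub weights: Vec<f64>,
}

impl Distribution<usize> for Weighted {
    fn sample(&self, rng: &mut ToyRng) -> usize {
        let total: f64 = self.weights.iter().sum();
        // Uniform draw in [0, 1), scaled to the total weight.
        let u = (rng.next_u64() >> 11) as f64 / (1u64 << 53) as f64;
        let mut target = u * total;
        for (i, w) in self.weights.iter().enumerate() {
            if target < *w {
                return i;
            }
            target -= *w;
        }
        // Fallback for floating-point rounding at the upper edge.
        self.weights.len() - 1
    }
}

/// Draws one value from any distribution implementing the trait.
pub fn sample_distr<T>(rng: &mut ToyRng, dist: &impl Distribution<T>) -> T {
    dist.sample(rng)
}
```

Any type implementing the trait can be passed to `sample_distr`, which is the extension point being described.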

k88hudson-cfa · 2024-08-21T18:16:08Z

So @jasonasher came up with a solution in https://github.com/CDCgov/ixa/tree/jma_rand_runtime_collision for runtime collision checking. The thing I don't like about it is that, given the current architecture of data containers, it's easy to forget to update the field that caches name collisions, so I think I'd still leave this as is (unless we really don't want to require paste).

jasonasher · 2024-08-21T18:23:29Z

I don't think requiring paste is much of an issue - I was more reacting to handling people potentially not using the macro. My instinct is we don't have much of an update problem here for the name field caching, as it's handled in one place and not obviously likely to change much in the future, but I could be missing something and defer to your and @ekr-cfa's judgment here.

ekr-cfa · 2024-08-21T18:25:14Z

I would prefer a compile time check even if it's imperfect. I think if people color this far outside the lines, then they can suffer.

ekr-cfa

LGTM

ekr-cfa · 2024-08-17T20:41:31Z

src/random.rs

+
+        let mut foo_rng = context.get_rng::<FooRng>();
+        foo_rng.next_u64();
+        // If you drop the first reference, you should be able to get a reference to a different rng


Suggested change

-        // If you drop the first reference, you should be able to get a reference to a different rng
+        // If you drop the first reference, you should be able to get another reference to an rng

The same or different, right?

ekr-cfa · 2024-08-26T19:52:14Z

src/random.rs

+
+    #[test]
+    fn multiple_references_with_drop() {
+        let mut context = Context::new();


This test doesn't seem to do what it says.

k88hudson-cfa added 2 commits August 8, 2024 12:30

Added random module

6505831

Refactor get_rng a bit

def51f2

k88hudson-cfa requested a review from ekr-cfa August 15, 2024 21:46

ekr-cfa requested changes Aug 17, 2024

k88hudson-cfa added 2 commits August 17, 2024 13:10

Review fixes

a99cedc

Added a test as an example of how to use with a distribution

4870f23

k88hudson-cfa added 3 commits August 19, 2024 19:05

Remove Any

1d01a69

Ensure uniqueness of

377efb2

Switch api to sample / sample_distr

23ef220

k88hudson-cfa requested a review from ekr-cfa August 20, 2024 00:40

namespace type collision guard

f6a9f3b

k88hudson-cfa added 6 commits August 27, 2024 14:48

add sample_range

7404c87

Use struct literals and things are better

522f24a

Goodbye turbofish

fc51d73

add sample bool

d0e2aed

typo

f61ae10

multiple rng types types name

fa7acb4

ekr-cfa approved these changes Aug 27, 2024

Better test for sample_bool

de14d50

ekr-cfa merged commit eac036c into main Aug 27, 2024
4 checks passed
