Skip to content
This repository has been archived by the owner on Jun 17, 2022. It is now read-only.

[braindump] ownerships and thread safety

Arvid E. Picciani edited this page Feb 8, 2020 · 8 revisions

I’ve been thinking about thread safety for a while. Traditionally in C/C++ we would document if a function is thread safe and hope the user of that function will read the documentation.

Thread safe in this context means that function a and b on an object C can be called from two threads in parallel without causing invalid state of C.

fn thread1(C mut* c) { c->a(); }
fn thread2(C mut* c) { c->b(); }

ZZ generally is about compile time enforcement of traditionally implicit api contracts. In this case the traditional contract would be a documentation marker

/// thread safe
pub fn start(Car *o) {

formalizing that marker into formal logic is unsurprisingly hard. Most papers are motivated by distributed systems, and revolve around proving partial order. That is given a set of events, will every order lead to the same result?

for example the sequence

  1. a = 1
  2. b = a
  3. f = 3

has only the valid permutations (1,2,3) (3,1,2) (1,3,2), because all other permutations lead to a different result. in science, that's super easy to formalize

1 -> 2
3

cool stuff, but pretty meaningless to us programmers. We can't be bothered to manually specify those arrows for everything. Why can't the computer figure this out? Yes, well, if your computer is the size of one google dot com, it could execute every permutation and find the ones that break. So no.

Rust has the right idea here that works in practice. It draws from the old idea of ownership and then takes it further by splitting ownership into write and read. It’s pretty neat and works most of the time.

  1. a = 1 (take and return mut a)
  2. b = a (take and return a, take and return mut b)
  3. f = 3 (take and return mut f)

so putting any instruction inbetween 1 and 2 that would take ref mut b will not be allowed.

  1. a = 1 (take and return mut a)
  2. x = &mut b (take mut b)
  3. b = a (take and return a, take and return mut b)
  4. f = 3 (take and return mut f)

this is provable in SSA form:

assert (= a0 1)
assert owner(a0, I1)

assert (= x0 ( ref b0 ))
assert owner(x0 , I2)
assert owner(b0,  I2)

assert (= b0 a0)

// nope, it is I2
assert owner(b0, I3)

but the proofs get excessively long to the point where manual tracking becomes easier. i'd argue this is also not a case where programmers make mistakes, because its easy to argue about.

it becomes much more interesting when we move ownership into a black box

  1. a = 1
  2. do_things_with(&a);
  3. b = a
  4. f = 3

if the function immediately returns ownership back, we all guccy, but how do we know the function will not store the pointer somewhere in a global? or worse, in a thread stack?

usually the documentation will tell us. But let's go to C++ for a second, because they did think about this stuff.

  • raw pointer: caller still owns the pointer
  • std::unique_ptr: function now owns the pointer, there may not be an alias
  • std::shared_ptr: both own the pointer and the functions lifetime is longer than the caller
  • std::weak_ptr: both own the pointer and the functions lifetime is shorter than the caller

they're runtime things, but at least we can see there are different practical categories of ownership.

formalized:

raw < weak < shared < unique

let's define some theory that implements that order of lifetimes

enum Lifetime {
 Disowned
 Raw,
 Weak,
 Shared,
 Unique,
}
theory lifetime(void *) -> Lifetime;

fn own_something(int *a)
   where lifetime(a) == Lifetime::Unique 
   model lifetime(a) == Lifetime::Disowned
{
  free(a);
}

fn just_a_quick_borrow(int *a)
   where lifetime(a) >= Lifetime::Raw
   model lifetime(a) >= Lifetime::Raw
{
   own_something(a);
  // error model requires lifetime(a) >= Lifetime::Raw
}

fn main() {
   int m = 0;
   int * a = &m;

   static_attest(lifetime(a) == Unique);
   just_a_quick_borrow(a);
}

cool demo, but super awkward in practice.

a more practical approach would be to define lifetime as a free integer between zero and infinite

theory lifetime(void *) -> int;

fn own_something(int * a)
   model lifetime(a) == 0
{
  free(a);
}

fn just_a_quick_borrow(int *a)
{
   own_something(a) 
   // error: implicit model of just_a_quick_borrow requires
   //    lifetime(a) > lifetime(just_a_quick_borrow)
}

fn main() {
   int m = 0;
   int * a = &m;

   static_attest(lifetime(a) == lifetime(main));
   just_a_quick_borrow(a);
}

that's less powerful and has edge cases, but naturally follows programming flow.

"the pointer i'm putting into the function, exists as long as the function exists"

unfortunately the error messages with free integers are pretty bad

struct A {
   int *x;
}

fn put_in_struct(A * a, int *x)
   model lifetime(x) >= lifetime(a)
{
   a->x = x;
}


fn main() {

   A a;
   for (;;) {
      int m = 0;
      put_in_struct(&a, &m);

      break;
      // error: violation of callsite constrain for drop(m)
      //   lifetime(x) == 8399278
      //     for:
      //   lifetime(x) == 178287123
   }

}

which is why i wanted to introduce "atoms" which are a single global enum with anonymous order value.

atom Bob;
atom Peter;
static_assert(Bob != Peter);
Clone this wiki locally