add std/smartptrs #17333

ringabout · 2021-03-11T03:24:01Z

After discussing with araq, move smartptrs from fusion, add isolation support.

The original code:
https://github.com/nim-lang/fusion/blob/master/src/fusion/smartptrs.nim

thanks for the contributions of @Araq @planetis-m and @narimiran

thanks @cooldome
#10485

Todo:

an atomic runnableExamples for sharedPtr at the muti-threads environement
conisder implementing AtomicSharedPtr
use std/atomics instead of system/atomics and use moAquire instead of moConsume

ringabout · 2021-03-11T03:35:07Z

Maybe also add

proc extract[T](p): T =
  move p.val

and count getter for SharedPtr

proc count(): int

and maybe use std/atomics instead of system/atomics?

lib/std/smartptrs.nim

tests/stdlib/tsmartptrs.nim

lib/std/smartptrs.nim

Co-authored-by: konsumlamm <[email protected]> Co-authored-by: Juan Carlos <[email protected]>

konsumlamm · 2021-03-19T14:34:54Z

a.isNil == true => a.isNil
a.isNil == false => not a.isNil
(see #17333 (comment) & #17333 (comment))

ringabout · 2021-03-19T14:47:03Z

a.isNil == true => a.isNil
a.isNil == false => not a.isNil
(see #17333 (comment) & #17333 (comment))

done

lib/std/smartptrs.nim

Varriount · 2021-06-23T19:13:30Z

I like the overall design, however the documentation needs improvement. The assumption should be that this module will be used by users who aren't familiar with C++ and its smart pointers.

arnetheduck · 2021-06-25T08:31:59Z

essence of a

More than anything, it captures a "unique instance" perhaps - but not a pointer - because this type allocates its own memory in a specific memory pool, it also breaks all pointers that point to the original instance - this is a significant distinction, together with the fact that copying the instance may be expensive - a type X = array[1024*1024*100, byte] structure, when you put it in a UniquePtr as proposed will cause a memory usage spike of 100 mb and a copy - this is quite different from the semantics that one would expect from moving a "pointer" around.

In other words, it's a nice little utility maybe, but it doesn't have that much to do with the use cases that unique_ptr in C++ typically are used for, neither is it really a traditional "pointer" in the nim sense.

Araq · 2021-06-25T08:35:32Z

I'm not sure what you mean.

planetis-m · 2021-06-25T12:25:03Z

@arnetheduck turns out you can attach destructors to distinct ptr, so imo that would be a more elegant way for an object pool to work, regardless post an example plz, I would be interested to tinker with.

dom96 · 2021-06-25T21:48:51Z

Agree with @Varriount. The documentation should also contain what this module's intended use cases are. I'm actually wondering: why are we adding such a big C++ concept to Nim's stdlib?

Grinshpon · 2021-06-25T23:59:09Z

I'm actually wondering: why are we adding such a big C++ concept to Nim's stdlib?

Nim seems to be going in a direction that promotes gc-less programming, at least according to Araq's blog, and so having smart pointers would be extremely beneficial as a way to not deal with raw pointers, in cases where the programmer still needs to be dealing with pointers.

Varriount · 2021-06-26T01:38:01Z

I'm actually wondering: why are we adding such a big C++ concept to Nim's stdlib?

Nim seems to be going in a direction that promotes gc-less programming, at least according to Araq's blog, and so having smart pointers would be extremely beneficial as a way to not deal with raw pointers, in cases where the programmer still needs to be dealing with pointers.

Eh. The ARC GC essentially replicates what C++ smart pointers do - both use refcounting mechanisms to determine when memory should be released. My interpretation is that this module was wanted because ARC doesn't do atomic refcounting, and this module can implement that.

Araq · 2021-06-26T05:50:44Z

My interpretation is that this module was wanted because ARC doesn't do atomic refcounting, and this module can implement that.

Entirely correct. These shared pointers are a last resort, the recommended solution is to send subgraphs via isolate.

dom96 · 2021-06-27T11:35:48Z

What's the blocker for changing ARC to do atomic refcounting? Is it something that we decided not to do ever or is this module just to provide a workaround in the short-term?

If the former we should make it clear in the docs of this module that it should be used for shared memory management. If the latter we shouldn't have this module in stdlib.

Araq · 2021-06-28T06:19:44Z

What's the blocker for changing ARC to do atomic refcounting?

Atomic refcounting is simply too slow for idiomatically written Nim code.

Is it something that we decided not to do ever or is this module just to provide a workaround in the short-term?

No, it's a very long-term solution as I doubt hardware will ever make atomic operations fast enough. Unfortunately.

dom96 · 2021-06-28T12:57:02Z

Isn't there a safer alternative? Having to manage memory manually will make our parallelism a worse experience than most languages out there.

Araq · 2021-06-28T16:05:20Z

Er, this is pretty much Rust's solution, it's not unsafe, the unsafety is hidden behind the API. But anyhow, the encouraged solution is to use isolate instead.

planetis-m · 2021-06-29T10:46:00Z

More than anything, it captures a "unique instance" perhaps - but not a pointer - because this type allocates its own memory in a specific memory pool, it also breaks all pointers that point to the original instance - this is a significant distinction, together with the fact that copying the instance may be expensive - a type X = array[1024*1024*100, byte] structure, when you put it in a UniquePtr as proposed will cause a memory usage spike of 100 mb and a copy - this is quite different from the semantics that one would expect from moving a "pointer" around.

Try: https://gist.github.com/planetis-m/42b675403212e018b5b9c9cc2378dffc

Clang is able to optimize this, there is no memcpy, data are placed directly on the heap. Whereas gcc produces code that crashes with SIGSEGV.

...but it only works with -d:release.

dom96 · 2021-06-29T21:03:13Z

Er, this is pretty much Rust's solution, it's not unsafe, the unsafety is hidden behind the API. But anyhow, the encouraged solution is to use isolate instead.

What stops us from making memory sharing across threads as easy as in Golang or Python? I think Nim should be more expressive than Rust here, so comparing to Rust isn't enough.

Araq · 2021-06-29T21:19:54Z

Golang needs runtime data race checking and Python's multi-threading story is famous for its global interpreter lock...

so comparing to Rust isn't enough.

It's good enough for me and nobody has a better solution.

dom96 · 2021-06-29T21:42:28Z

Golang needs runtime data race checking

And smartptrs doesn't?

Anyway, we can discuss offline. Before merging this let's change the docs of this module though to reference isolate and make it clear that this module is for rare edge cases where isolate is not enough.

planetis-m · 2021-07-09T10:45:24Z

So, in order for [] to be memory safe, we need to return Isolated[var T] or something.

Grinshpon · 2021-07-20T20:34:12Z

Atomic refcounting is simply too slow for idiomatically written Nim code.

I'd like some clarification on this. Are SharedPtr's slower than ref's with --gc:arc?

Araq · 2021-07-21T08:50:36Z

I'd like some clarification on this. Are SharedPtr's slower than ref's with --gc:arc?

Yes, definitely. Rust has both Rc and Arc for a reason.

Grinshpon · 2021-07-22T05:31:27Z

Yes, definitely. Rust has both Rc and Arc for a reason.

May I ask why? To my knowledge, a SharedPtr is just incrementing/decrementing an integer counter, unless this has to do with low-level stuff I don't know about yet

Araq · 2021-07-22T06:46:42Z

May I ask why? To my knowledge, a SharedPtr is just incrementing/decrementing an integer counter, unless this has to do with low-level stuff I don't know about yet

It's an atomic count operation, 30x slower than an ordinary count operation. It prevents the CPU from doing pipelining effectively and needs communication between the different CPU cores.

From http://iacoma.cs.uiuc.edu/iacoma-papers/pact18.pdf
"As shown in the figure, performing RC operations takes on
average 42% of the execution time in client programs, and 15% in
server programs. The average across all programs can be shown to
be 32%"

timotheecour · 2021-07-22T07:06:13Z

lib/std/smartptrs.nim

+    result.val[] = val.extract
+    # no destructor call for 'val: sink T' here either.
+
+  template newUniquePtr*[T](val: T): UniquePtr[T] =


as mention in #18450 (comment) for the other PR, this is what's needed:

proc newUniquePtr*[T](U: typedesc[T]): UniquePtr[T] = result.val = cast[ptr T](allocShared(sizeof(T)))

which also ends up producing correct results for the C++ example i gave, unlike what the other APIs do.

read the whole thread starting here for more context: #18450 (comment)

I also wonder whether we actually need proc newUniquePtr[T](val: sink Isolated[T]): UniquePtr[T] {.nodestroy.} = given that proc newUniquePtr*[T](U: typedesc[T]): UniquePtr[T] = seems simpler, more correct, and more performant (no copy)

I don't understand. Don't you want the unique pointer to point to some valid object? Your proc doesn't do the same.

I'm allocating on the heap only once instead of allocating in caller scope (eg on stack) and then allocating on heap and copying; it's more efficient and analogous to C++ placement new on which C++ make_unique relies, avoids complications of isolated, and is also more correct wrt destructors being called as you can see in #18450 (comment)

Caller can then, if needed, assign values in the heap allocated object, eg:

import std/smartptrs type ManagedFile = object name: string cfile: File proc `=destroy`(a: var ManagedFile) = echo ("in =destroy", a.name) close(a.cfile) a.cfile = nil proc main()= block: let p = newUniquePtr(ManagedFile) p[].name = currentSourcePath p[].cfile = open(p[].name, fmRead) echo p[].cfile.readAll.len echo "after" main()

prints:

381 ("in =destroy", "/Users/timothee/git_clone/nim/timn/tests/nim/all/t12621.nim") after

It's also the correct way to wrap C++ objects (eg with non-trivial constructor): it allows the C++ object to be constructed exactly once, and then destroyed when the unique_ptr goes out of scope.

@timotheecour its already taken care of https://github.com/nim-lang/threading/blob/master/threading/smartptrs.nim#L48 Its not more efficient though, you should use the sink one (unless you have to) as I said in #17333 (comment)

p[].name = currentSourcePath p[].cfile = open(p[].name, fmRead)

is not as efficient as direct inplace construction. Not because of the lack of moves, but because the compiler doesn't know that in p[] there is nothing to destroy yet.

Here's one, showing that the way I proposed is 3.4X faster, where the speedup depends on the size of the object; you'll get smaller speedups for smaller objects but always at least as fast.

But that's measuring artifacts of the current implementation, in particular NVRO (which is unfortunately currently fragile) or the lack thereof. In theory it's a pessimization as you mitigated the compiler's ability to reason about the code.

I still don't understand the reasoning here; in current code, you:
1 construct an object Foo in the stack
2 allocate sizeof(Foo) on the heap
3 copy Foo

with proposed code you:
1 allocate sizeof(Foo) on the heap
2 construct in-place

in all possible scenarios, it should be at least as fast, and usually faster; eg it allows placement-new with C++, or simply just constructing in-place the fields you care about updating from their default value (if some of the memory is to remain 0-initialized, you don't pay for it); you also don't have extra ctor/dtor to worry about, in particular for C++ objects.

n element version

Finally, this generalizes in a straightforward way to C++'s array version of make_unique, which we should add at some point; it allows dynamic allocation of n objects without having to use a seq (eg so you can interface with C++ or avoid relying on gc for allocating n elements); there are 2 approaches to consider:

same as C++, make the n implicit (user code need to pass that value around somewhere else)

proc newUniquePtr*[T](U: typedesc[T], n = 1): UniquePtr[T] = result.val = cast[ptr T](allocShared(sizeof(T) * n))

add a UniquePtrN type with an extra field n: int

type UniquePtrN*[T] = object val: ptr T n: int proc newUniquePtrN*[T](U: typedesc[T], n = 1): UniquePtrN[T] = result.val = cast[ptr T](allocShared(sizeof(T) * n)) result.n = n

I don't follow, the loophole is already there,

I am just pointing it out. I had totally missed it before.

n element version

Why not use a seq instead?

I still don't understand the reasoning here; in current code, you: ...

Your're correct, however, the code is still not as efficient as it could be if we had proper "in-place" construction. I think we could do proper in-place construction for the construct toSinkParam ObjConstr(x: x, y: y, ...).

for c++ objects, it should already be possible today without a language extension (via importcpp of new(buf) T), although i haven't tried yet
for nim objects, it shouldn't be hard to extend semObjConstr to take an optional input buffer address, eg:

var p: ptr Foo = ... Foo.new(p, a1: 1, a2: 2) # explicit syntax p[] = Foo(a1: 1, a2: 2) # syntax sugar possible?

in any case, these can be added later, and proc newUniquePtrN*[T](U: typedesc[T]): UniquePtrN[T] = can be introduced before those exist; code would just get auto-optimized once p[] = Foo(a1: 1, a2: 2) starts working as in-place construction

arnetheduck · 2021-07-23T10:25:23Z

More than anything, it captures a "unique instance" perhaps - but not a pointer - because this type allocates its own memory in a specific memory pool, it also breaks all pointers that point to the original instance - this is a significant distinction, together with the fact that copying the instance may be expensive - a type X = array[1024*1024*100, byte] structure, when you put it in a UniquePtr as proposed will cause a memory usage spike of 100 mb and a copy - this is quite different from the semantics that one would expect from moving a "pointer" around.

Try: https://gist.github.com/planetis-m/42b675403212e018b5b9c9cc2378dffc

Clang is able to optimize this, there is no memcpy, data are placed directly on the heap. Whereas gcc produces code that crashes with SIGSEGV.

...but it only works with -d:release.

Relying on the C-level optimizer tends to be a bit fragile - either we must limit construction such that the Nim compiler can guarantee that it generates code with in-place construction in all valid cases (ie even with instances returned from a function etc), or it might be difficult to rely on the type in generic code where it's hard to control the stack size.

We already have many issues with stack overflow in Nim because the rules for when RVO happens are .. fragile, like this.

arnetheduck · 2021-07-23T10:29:30Z

@arnetheduck turns out you can attach destructors to distinct ptr, so imo that would be a more elegant way for an object pool to work, regardless post an example plz, I would be interested to tinker with.

https://stackoverflow.com/questions/28843379/how-to-use-boostobject-pool-with-stdunique-ptr would be a classic example

ringabout added 6 commits March 8, 2021 20:28

improve test coverage for isolation

3273fd7

a bit better

b4fa432

Merge remote-tracking branch 'upstream/devel' into a141

421cf08

Merge remote-tracking branch 'upstream/devel' into a144

ea5fe84

add std/smartptrs

7b329bc

more

4aae28e

ringabout marked this pull request as draft March 11, 2021 03:24

add comments

b6af261

timotheecour reviewed Mar 11, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

ringabout added 3 commits March 11, 2021 13:20

add comments

f9bf78d

up

5e3eb8d

add comment

57e18ad

konsumlamm suggested changes Mar 11, 2021

View reviewed changes

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

tests/stdlib/tsmartptrs.nim Outdated Show resolved Hide resolved

juancarlospaco reviewed Mar 13, 2021

View reviewed changes

lib/std/smartptrs.nim Outdated Show resolved Hide resolved

ringabout and others added 4 commits March 19, 2021 22:17

Merge remote-tracking branch 'upstream/devel' into a147

162db95

Apply suggestions from code review

120716f

Co-authored-by: konsumlamm <[email protected]> Co-authored-by: Juan Carlos <[email protected]>

Merge branch 'a147' of https://github.com/xflywind/Nim into a147

5566515

use doAssert

6f30a8c

smartptrs

7992e60

ringabout marked this pull request as ready for review March 19, 2021 14:49

Varriount requested changes Jun 23, 2021

View reviewed changes

lib/std/smartptrs.nim Show resolved Hide resolved

planetis-m mentioned this pull request Jul 7, 2021

Fresh attempt on smartptrs #18450

Closed

timotheecour reviewed Jul 22, 2021

View reviewed changes

planetis-m mentioned this pull request Jul 23, 2021

Add smartptrs and atomics nim-lang/threading#5

Merged

ringabout closed this Sep 22, 2021

ringabout deleted the a147 branch September 19, 2022 14:27

ringabout removed the TODO: followup needed remove tag once fixed or tracked elsewhere label Sep 19, 2022

add std/smartptrs #17333

add std/smartptrs #17333

Conversation

ringabout commented Mar 11, 2021 • edited Loading

ringabout commented Mar 11, 2021 • edited Loading

konsumlamm commented Mar 19, 2021

ringabout commented Mar 19, 2021

Varriount commented Jun 23, 2021

arnetheduck commented Jun 25, 2021

Araq commented Jun 25, 2021

planetis-m commented Jun 25, 2021

dom96 commented Jun 25, 2021 • edited Loading

Grinshpon commented Jun 25, 2021

Varriount commented Jun 26, 2021

Araq commented Jun 26, 2021

dom96 commented Jun 27, 2021

Araq commented Jun 28, 2021 • edited Loading

dom96 commented Jun 28, 2021

Araq commented Jun 28, 2021

planetis-m commented Jun 29, 2021 • edited Loading

dom96 commented Jun 29, 2021

Araq commented Jun 29, 2021

dom96 commented Jun 29, 2021 • edited Loading

planetis-m commented Jul 9, 2021

Grinshpon commented Jul 20, 2021

Araq commented Jul 21, 2021

Grinshpon commented Jul 22, 2021

Araq commented Jul 22, 2021 • edited Loading

timotheecour Jul 22, 2021 • edited Loading

Choose a reason for hiding this comment

Araq Jul 22, 2021

Choose a reason for hiding this comment

timotheecour Jul 22, 2021 • edited Loading

Choose a reason for hiding this comment

planetis-m Jul 22, 2021 • edited Loading

Choose a reason for hiding this comment

Araq Jul 22, 2021

Choose a reason for hiding this comment

Araq Jul 23, 2021 • edited Loading

Choose a reason for hiding this comment

timotheecour Jul 23, 2021

Choose a reason for hiding this comment

n element version

planetis-m Jul 23, 2021

Choose a reason for hiding this comment

Araq Jul 24, 2021 • edited Loading

Choose a reason for hiding this comment

timotheecour Jul 25, 2021

Choose a reason for hiding this comment

arnetheduck commented Jul 23, 2021

arnetheduck commented Jul 23, 2021

ringabout commented Mar 11, 2021 •

edited

Loading

ringabout commented Mar 11, 2021 •

edited

Loading

dom96 commented Jun 25, 2021 •

edited

Loading

Araq commented Jun 28, 2021 •

edited

Loading

planetis-m commented Jun 29, 2021 •

edited

Loading

dom96 commented Jun 29, 2021 •

edited

Loading

Araq commented Jul 22, 2021 •

edited

Loading

timotheecour Jul 22, 2021 •

edited

Loading

timotheecour Jul 22, 2021 •

edited

Loading

planetis-m Jul 22, 2021 •

edited

Loading

Araq Jul 23, 2021 •

edited

Loading

Araq Jul 24, 2021 •

edited

Loading