doodspav/atomics

atomics

This library implements a wrapper around the lower level patomic C library (which is provided as part of this library through the build_patomic command in setup.py).

It exposes hardware level lock-free (and address-free) atomic operations on a memory buffer, either internally allocated or externally provided, via a set of atomic classes.

The operations in these classes are both thread-safe and process-safe, meaning that they can be used on a shared memory buffer for interprocess communication (including with other languages such as C/C++).

Installing

Linux/MacOS:

$ python3 -m pip install atomics

Windows:

$ py -m pip install atomics

This library requires Python 3.6+ and depends on the cffi library. While the code here does not depend on any implementation-specific features, the cffi library functions it uses are unlikely to work outside of CPython and PyPy.

Binaries are provided for the following platforms:

  • Windows [x86, amd64]
  • MacOSX [x86_64, universal2]
  • Linux [i686, x86_64, aarch64, ppc64le, s390x] [manylinux2014, musllinux_1_1]
  • Linux [i686, x86_64] [manylinux1]

If you are on one of these platforms and pip tries to build from source or fails to install, make sure that you have the latest version of pip installed. This can be done like so:

Linux/MacOS:

$ python3 -m pip install --upgrade pip

Windows:

$ py -m pip install --upgrade pip

If you need to build from source, check out the Building section as there are additional requirements for that.

Examples

Incorrect

The following example has a data race (a is modified from multiple threads). The program is not correct, and a's value will not equal total at the end.

from threading import Thread


a = 0


def fn(n: int) -> None:
    global a
    for _ in range(n):
        a += 1


if __name__ == "__main__":
    # setup
    total = 10_000_000
    # run threads to completion
    t1 = Thread(target=fn, args=(total // 2,))
    t2 = Thread(target=fn, args=(total // 2,))
    t1.start(), t2.start()
    t1.join(), t2.join()
    # print results
    print(f"a[{a}] != total[{total}]")

Multi-Threading

This example implements the previous example but a is now an AtomicInt which can be safely modified from multiple threads (as opposed to int which can't). The program is correct, and a will equal total at the end.

import atomics
from threading import Thread


def fn(ai: atomics.INTEGRAL, n: int) -> None:
    for _ in range(n):
        ai.inc()


if __name__ == "__main__":
    # setup
    a = atomics.atomic(width=4, atype=atomics.INT)
    total = 10_000
    # run threads to completion
    t1 = Thread(target=fn, args=(a, total // 2))
    t2 = Thread(target=fn, args=(a, total // 2))
    t1.start(), t2.start()
    t1.join(), t2.join()
    # print results
    print(f"a[{a.load()}] == total[{total}]")

Multi-Processing

This example is the counterpart to the above correct code, but using processes to demonstrate that atomic operations are also safe across processes. This program is also correct, and a will equal total at the end. It is also how one might communicate with processes written in other languages such as C/C++.

import atomics
from multiprocessing import Process, shared_memory


def fn(shmem_name: str, width: int, n: int) -> None:
    shmem = shared_memory.SharedMemory(name=shmem_name)
    buf = shmem.buf[:width]
    with atomics.atomicview(buffer=buf, atype=atomics.INT) as a:
        for _ in range(n):
            a.inc()
    del buf
    shmem.close()


if __name__ == "__main__":
    # setup
    width = 4
    shmem = shared_memory.SharedMemory(create=True, size=width)
    buf = shmem.buf[:width]
    total = 10_000
    # run processes to completion
    p1 = Process(target=fn, args=(shmem.name, width, total // 2))
    p2 = Process(target=fn, args=(shmem.name, width, total // 2))
    p1.start(), p2.start()
    p1.join(), p2.join()
    # print results and cleanup
    with atomics.atomicview(buffer=buf, atype=atomics.INT) as a:
        print(f"a[{a.load()}] == total[{total}]")
    del buf
    shmem.close()
    shmem.unlink()

NOTE: Although shared_memory is showcased here, atomicview accepts any type that supports the buffer protocol as its buffer argument, so other sources of shared memory such as mmap could be used instead.
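As a minimal sketch of the point in the note, the snippet below only checks that an mmap object supports the buffer protocol and yields a writable buffer of the right size. It uses the standard library alone and does not call atomics; the anonymous mapping and the WIDTH constant are illustrative assumptions, and a mapping shared between unrelated processes would normally be backed by a file instead.

```python
import mmap

WIDTH = 4  # bytes; same width as the example above (illustrative choice)

# Anonymous mapping: fileno=-1 asks the OS for memory not backed by a file.
mm = mmap.mmap(-1, WIDTH)

# memoryview() only succeeds on objects that support the buffer protocol.
full = memoryview(mm)
buf = full[:WIDTH]
view_nbytes = buf.nbytes
view_readonly = buf.readonly
print(view_nbytes, view_readonly)

# `buf` is the kind of object that could be passed as the buffer argument.

# Release every view before closing the mapping; CPython refuses to close
# an mmap while exported views still exist.
buf.release()
full.release()
mm.close()
```

The same release-before-close ordering applies to the shared_memory example above, which is why it deletes buf before calling shmem.close().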

Docs

Types

The following helper (abstract-ish base) types are available in atomics:

  • [ANY, INTEGRAL, BYTES, INT, UINT]

This library provides the following Atomic classes in atomics.base:

  • Atomic --- ANY
  • AtomicIntegral --- INTEGRAL
  • AtomicBytes --- BYTES
  • AtomicInt --- INT
  • AtomicUint --- UINT

These Atomic classes are constructable on their own, but it is strongly suggested that you use the atomic() function to construct them. Each class corresponds to one of the above helper types (as indicated).

This library also provides Atomic*View (in atomics.view) and Atomic*ViewContext (in atomics.ctx) counterparts to the Atomic* classes, corresponding to the same helper types.

The latter of the two sets of classes can be constructed manually, although it is strongly suggested that you use the atomicview() function to construct them. The former set of classes cannot be constructed manually with the available types, and should only be obtained by calling .__enter__() on a corresponding Atomic*ViewContext object.

Even though you should never need to directly use these classes (apart from the helper types), they are provided to be used in type hinting. The inheritance hierarchies are detailed in the ARCHITECTURE.md file (available on GitHub).

Construction

This library provides the functions atomic and atomicview, along with the types BYTES, INT, and UINT (as well as ANY and INTEGRAL) to construct atomic objects like so:

import atomics

a = atomics.atomic(width=4, atype=atomics.INT)
print(a)  # AtomicInt(value=0, width=4, readonly=False, signed=True)

buf = bytearray(2)
with atomics.atomicview(buffer=buf, atype=atomics.BYTES) as a:
    print(a)  # AtomicBytesView(value=b'\x00\x00', width=2, readonly=False)

You should only need to construct objects with an atype of BYTES, INT, or UINT. Using an atype of ANY or INTEGRAL will require additional kwargs, and an atype of ANY will result in an object that doesn't actually expose any atomic operations (only properties, explained in sections further on).

The atomic() function returns a corresponding Atomic* object.

The atomicview() function returns a corresponding Atomic*ViewContext object. You can use this context object in a with statement to obtain an Atomic*View object. The buffer parameter may be any object that supports the buffer protocol.

Construction can raise UnsupportedWidthException and AlignmentError.

NOTE: the width property of Atomic*View objects is derived from the buffer's length as if it were contiguous. It is equivalent to calling memoryview(buf).nbytes.
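The difference between a buffer's length and its size in bytes can be seen with the standard library alone. This is a small sketch, not library code: it reinterprets an 8-byte bytearray as unsigned shorts, so the item count and the byte count differ, and nbytes is the figure the note above refers to.

```python
buf = bytearray(8)

# Reinterpret the same 8 bytes as native unsigned short items.
shorts = memoryview(buf).cast("H")

item_count = len(shorts)      # number of items, not bytes
byte_count = shorts.nbytes    # total size in bytes, regardless of item size
print(item_count, shorts.itemsize, byte_count)

shorts.release()
```

For a plain bytearray or bytes object the two numbers coincide, which is why the examples in this document can use len(buf) and the width interchangeably.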

Lifetime

Objects of Atomic* classes (i.e. objects returned by the atomic() function) have a self-contained buffer which is automatically freed. They can be passed around and stored like regular variables, and there is nothing special about their lifetime.

Objects of Atomic*ViewContext classes (i.e. objects returned by the atomicview() function) and Atomic*View objects obtained from said objects have a much stricter usage contract.

Contract

The buffer used to construct an Atomic*ViewContext object (either directly or through atomicview()) MUST NOT be invalidated until .release() is called. This is aided by the fact that .release() is called automatically in .__exit__(...) and .__del__(). As long as you immediately use the context object in a with statement, and DO NOT invalidate the buffer inside that with scope, you will always be safe.

The protections implemented are shown in this example:

import atomics


buf = bytearray(4)
ctx = atomics.atomicview(buffer=buf, atype=atomics.INT)

# ctx.release() here will cause ctx.__enter__() to raise:
# ValueError("Cannot open context after calling 'release'.")

with ctx as a:  # this calls ctx.__enter__()
    # ctx.release() here will raise:
    # ValueError("Cannot call 'release' while context is open.")

    # ctx.__enter__() here will raise:
    # ValueError("Cannot open context multiple times.")
    
    print(a.load())  # ok

# ctx.__exit__(...) now called
# we can safely invalidate object 'buf' now

# ctx.__enter__() will raise:
# ValueError("Cannot open context after calling 'release'.")

# accessing object 'a' in any way will also raise an exception

Furthermore, in CPython, all built-in types supporting the buffer protocol will throw a BufferError exception if you try to invalidate them while they're in use (i.e. before calling .release()).
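The CPython behaviour described above can be observed without atomics at all. In this sketch a plain memoryview stands in for an open atomic view (that substitution is an assumption made for illustration): while the view is exported, resizing the bytearray is refused, and after the view is released the resize goes through.

```python
buf = bytearray(4)
view = memoryview(buf)  # exports the buffer, standing in for an open view

try:
    buf.extend(b"\x00")  # resizing could move the underlying memory
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()       # plays the role of the context's .release()
buf.extend(b"\x00")  # allowed again once nothing holds the buffer
print(resized_while_exported, len(buf))
```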

As a last resort, if you absolutely must invalidate the buffer inside the with context (where you can't call .release()), you may call .__exit__(...) manually on the Atomic*ViewContext object. This is to force explicitness about something considered to be bad practice and dangerous.

Where it's allowed, .release() may be called multiple times with no ill-effects. This also applies to .__exit__(...), which has no restrictions on where it can be called.

Alignment

Different platforms may each have their own alignment requirements for atomic operations of given widths. This library provides the Alignment class in atomics to ensure that a given buffer meets these requirements.

from atomics import Alignment

buf = bytearray(8)
align = Alignment(len(buf))
assert align.is_valid(buf)

If an atomic class is constructed from a misaligned buffer, the constructor will raise AlignmentError.

By default, .is_valid calls .is_valid_recommended. The class Alignment also exposes .is_valid_minimum. Currently, no atomic class makes use of the minimum alignment, so checking for it is pointless. Support for it will be added in a future release.
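To make the idea of alignment concrete, here is a rough sketch of an address check written with ctypes. The helper is_aligned is hypothetical and is not how the Alignment class is implemented; the alignment a platform recommends for a given width is also not necessarily equal to that width. The sketch only shows what "the buffer's address is a multiple of N" means.

```python
import ctypes


def is_aligned(buf: bytearray, alignment: int) -> bool:
    """Return True if the first byte of buf sits at a multiple of alignment."""
    # from_buffer() needs a writable buffer; addressof() yields its address.
    address = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    return address % alignment == 0


buf = bytearray(8)
print(is_aligned(buf, 1))  # every address is a multiple of 1
print(is_aligned(buf, 8))  # depends on where the allocator placed buf
```

A slice that starts at an odd offset into an otherwise well-aligned buffer is a typical way to end up with a misaligned view.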

Properties

All Atomic* and Atomic*View classes have the following properties:

  • width: width in bytes of the underlying buffer (as if it were contiguous)
  • readonly: whether the object supports modifying operations
  • ops_supported: a sorted list of OpType enum values representing which operations are supported on the object

Integral Atomic* and Atomic*View classes also have the following property:

  • signed: whether arithmetic operations are signed or unsigned

In both cases, the behaviour on overflow is defined to wrap around.
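Wraparound can be modelled in plain Python. The function below is an illustrative sketch, not part of the library: wrap is a hypothetical name, and it assumes a two's-complement representation for signed values, which the text above does not state explicitly.

```python
def wrap(value: int, width: int, signed: bool) -> int:
    """Reduce value to what a width-byte integer would hold after overflow."""
    bits = width * 8
    value &= (1 << bits) - 1  # keep only the low `bits` bits
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits    # reinterpret the top bit as the sign
    return value


# Largest signed 32-bit value plus one wraps to the smallest.
print(wrap((2**31 - 1) + 1, width=4, signed=True))  # -2147483648
# Zero minus one wraps to the largest unsigned 32-bit value.
print(wrap(0 - 1, width=4, signed=False))           # 4294967295
```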

Operations

Base Atomic and AtomicView objects (corresponding to ANY) expose no atomic operations.

AtomicBytes and AtomicBytesView objects support the following operations:

  • [base]: load, store
  • [xchg]: exchange, cmpxchg_weak, cmpxchg_strong
  • [bitwise]: bit_test, bit_compl, bit_set, bit_reset
  • [binary]: bin_or, bin_xor, bin_and, bin_not
  • [binary]: bin_fetch_or, bin_fetch_xor, bin_fetch_and, bin_fetch_not

Integral Atomic* and Atomic*View classes additionally support the following operations:

  • [arithmetic]: add, sub, inc, dec, neg
  • [arithmetic]: fetch_add, fetch_sub, fetch_inc, fetch_dec, fetch_neg

The usage of (most of) these functions is modelled directly on C++11's std::atomic, as described in its reference documentation.

Compare Exchange (cmpxchg_*)

The cmpxchg_* functions return CmpxchgResult. This has the attributes .success: bool which indicates whether the exchange took place, and .expected: T which holds the original value of the atomic object.
The cmpxchg_weak function may fail spuriously, even if expected matches the actual value. It should be used as shown below:

import atomics


def atomic_mul(a: atomics.INTEGRAL, operand: int):
    res = atomics.CmpxchgResult(success=False, expected=a.load())
    while not res:
        desired = res.expected * operand
        res = a.cmpxchg_weak(expected=res.expected, desired=desired)

In a real implementation of atomic_mul, care should be taken to ensure that desired fits in a (i.e. desired.bit_length() < (a.width * 8), assuming 8 bits in a byte).
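One way to perform such a check is an explicit range test. This is a sketch under assumptions: fits is a hypothetical helper, and it assumes two's-complement ranges for signed values. It is slightly more permissive than the bit_length() comparison above, which also rejects the most negative signed value and the upper half of the unsigned range.

```python
def fits(value: int, width: int, signed: bool) -> bool:
    """Return True if value is representable in width bytes."""
    bits = width * 8
    if signed:
        # Signed range: -2**(bits-1) .. 2**(bits-1) - 1
        return -(1 << (bits - 1)) <= value < (1 << (bits - 1))
    # Unsigned range: 0 .. 2**bits - 1
    return 0 <= value < (1 << bits)


print(fits(2**31 - 1, width=4, signed=True))  # True
print(fits(2**31, width=4, signed=True))      # False
```

Inside atomic_mul, such a helper could be called with the object's width and signed properties before each cmpxchg_weak attempt, raising OverflowError (or wrapping explicitly) when the product does not fit.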

Exceptions

All operations can raise UnsupportedOperationException (so check .ops_supported if you need to be sure).

Operations load, store, and cmpxchg_* can raise MemoryOrderError if called with an invalid memory order. MemoryOrder enum values expose the functions is_valid_store_order(), is_valid_load_order(), and is_valid_fail_order() to check with.

Special Methods

AtomicBytes and AtomicBytesView implement the __bytes__ special method.

Integral Atomic* and Atomic*View classes implement the __int__ special method. They intentionally do not implement __index__.

There is a notable lack of any classes implementing special methods corresponding to atomic operations; this is intentional. Assignment in Python is not available as a special method, and we do not want to encourage people to use other special methods with this class, lest it lead to them accidentally using assignment when they meant .store(...).
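The practical effect of implementing __int__ but not __index__ can be shown with a stand-in class. IntLike below is hypothetical and unrelated to the library's classes; it only demonstrates that explicit conversion with int() works while implicit use as an integer, such as list indexing, is rejected.

```python
class IntLike:
    """Hypothetical stand-in: defines __int__ but deliberately not __index__."""

    def __init__(self, value: int) -> None:
        self._value = value

    def __int__(self) -> int:
        return self._value


x = IntLike(2)
as_int = int(x)  # explicit conversion goes through __int__

try:
    [10, 20, 30][x]  # implicit use as an index requires __index__
    usable_as_index = True
except TypeError:
    usable_as_index = False

print(as_int, usable_as_index)
```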

Memory Order

The MemoryOrder enum class is provided in atomics, and the memory orders are copied directly from C++11's std::memory_order (see its reference documentation), except for CONSUME (which would be pointless to expose in this library).

All operations have a default memory order, SEQ_CST. This will enforce sequential consistency, and essentially make your multi-threaded and/or multi-processed program be as correct as if it were to run in a single thread.

IF YOU DO NOT UNDERSTAND THE LINKED DOCUMENTATION, DO NOT USE YOUR OWN MEMORY ORDERS!!! Stick with the defaults to be safe. (And realistically, this is Python, you won't get a noticeable performance boost from using a more lax memory order).

The following helper functions are provided:

  • .is_valid_store_order() (for store op)
  • .is_valid_load_order() (for load op)
  • .is_valid_fail_order() (for the fail ordering in cmpxchg_* ops)

Passing an invalid memory order to one of these ops will raise MemoryOrderError.
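The rules behind these helpers can be sketched with a stand-in enum. Order below is hypothetical and is not atomics.MemoryOrder; it encodes the C++11 rules (a store cannot acquire, a load cannot release, and the failure ordering of a compare-exchange is a load ordering). That the library's helpers return exactly these results is an assumption based on the statement that the orders are copied from C++11.

```python
from enum import Enum, auto


class Order(Enum):
    """Hypothetical stand-in for a memory-order enum (not atomics.MemoryOrder)."""

    RELAXED = auto()
    ACQUIRE = auto()
    RELEASE = auto()
    ACQ_REL = auto()
    SEQ_CST = auto()

    def is_valid_store_order(self) -> bool:
        # A store cannot have acquire semantics.
        return self not in (Order.ACQUIRE, Order.ACQ_REL)

    def is_valid_load_order(self) -> bool:
        # A load cannot have release semantics.
        return self not in (Order.RELEASE, Order.ACQ_REL)

    def is_valid_fail_order(self) -> bool:
        # A failed compare-exchange only performs a load.
        return self.is_valid_load_order()


for order in Order:
    print(order.name, order.is_valid_store_order(), order.is_valid_load_order())
```

SEQ_CST and RELAXED pass all three checks in this model, which is consistent with SEQ_CST being a safe default for every operation.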

Exceptions

The following exceptions are available in atomics.exc:

  • AlignmentError
  • MemoryOrderError
  • UnsupportedWidthException
  • UnsupportedOperationException

Building

IMPORTANT: Make sure you have the latest version of pip installed.

Using setup.py's build or bdist_wheel commands will run the build_patomic command (which you can also run directly).

This clones the patomic library into a temporary directory, builds it, and then copies the shared library into atomics._clib.

This requires that git be installed on your system (a requirement of the GitPython module). You will also need an ANSI/C90 compliant C compiler (although ideally a more recent compiler should be used). CMake is also required but should be automatically pip install'd if not available.

If you absolutely cannot get build_patomic to work, go to patomic, follow the instructions on building it (making sure to build the shared library version), and then copy-paste the shared library file into atomics._clib manually.

NOTE: Currently, the library builds a dummy extension in order to trick setuptools into building a non-purepython wheel. If you are ok with a purepython wheel, then feel free to remove the code for that from setup.py (at the bottom).
Otherwise, you will need a C99 compliant C compiler, and probably the development libraries/headers for whichever version of Python you're using.

Future Thoughts

  • add docstrings
  • add tests
  • add support for minimum alignment
  • add support for constructing Atomic classes' buffers in shared memory
  • add support for passing Atomic objects to sub-processes and sub-interpreters
  • reimplement in C or Cython for performance gains (preliminary benchmarks put such implementations at 2x the speed of a raw int)

Contributing

I don't have a guide for contributing yet. This section is here to make the following two points:

  • new operations must first be implemented in patomic before this library can be updated
  • new architectures, widths, and existing unsupported operations must be supported in patomic (no change required in this library)