v2.3.0

@etremel etremel released this 19 May 18:24
3aef3d8

This version includes some API changes as well as new features.

Notable Changes

  • The type ExternalCaller<T> is now named PeerCaller<T> to reflect the fact that it is not "external" to the Derecho group; it represents a Derecho group member that is not in subgroup type T (but is in the same top-level group as type T).
  • The type ExternalGroup is now named ExternalGroupClient to reflect the fact that it represents a client process that will communicate with the Derecho group, not the entire group or a group member.
  • The bundled mutils-serialization library now uses uint8_t* instead of char* as the type that represents a "pointer to a plain byte array" (in the to_bytes and from_bytes functions). The serialization functions for Derecho objects, and the DEFAULT_SERIALIZATION_SUPPORT macro for user-defined replicated types, have been updated to match. See issue #218 and pull request #223.
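
For user-defined types with hand-written serialization, the practical effect of the uint8_t* change is on the buffer parameter types. The following is a minimal, self-contained sketch of that shape, not code taken from mutils-serialization: the SensorReading type is hypothetical, and the function signatures are simplified assumptions (for instance, this from_bytes takes only a buffer, and the real library interface may take further arguments).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical user-defined type, used only for illustration.
// It does not inherit from any Derecho or mutils class.
struct SensorReading {
    std::uint32_t sensor_id;
    double value;

    // Number of bytes that to_bytes() will write.
    std::size_t bytes_size() const {
        return sizeof(sensor_id) + sizeof(value);
    }

    // Copies the fields into a caller-provided byte array (uint8_t*, not char*)
    // and returns the number of bytes written.
    std::size_t to_bytes(std::uint8_t* buffer) const {
        std::memcpy(buffer, &sensor_id, sizeof(sensor_id));
        std::memcpy(buffer + sizeof(sensor_id), &value, sizeof(value));
        return bytes_size();
    }

    // Rebuilds an object from a byte array laid out by to_bytes().
    // Assumption: simplified signature that takes only the buffer.
    static std::unique_ptr<SensorReading> from_bytes(const std::uint8_t* buffer) {
        auto result = std::make_unique<SensorReading>();
        std::memcpy(&result->sensor_id, buffer, sizeof(result->sensor_id));
        std::memcpy(&result->value, buffer + sizeof(result->sensor_id),
                    sizeof(result->value));
        return result;
    }
};
```

Code that previously passed char* buffers to such functions would need to switch the buffer type to uint8_t* (or cast at the call site) to compile against this release.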

New Features

  • External clients (processes running outside the Derecho group) can now receive notifications from members of the Derecho group. To enable this feature, the Derecho subgroup (replicated type) that the client communicates with must inherit from NotificationSupport and register the notify method as P2P-callable. The macro REGISTER_RPC_FUNCTIONS_WITH_NOTIFICATION can be used instead of REGISTER_RPC_FUNCTIONS when declaring the subgroup's class to ensure that notify is registered. See pull request #239.
  • Persistent objects now have a getDeltaSignature<DeltaType>() method that retrieves a signature from a Delta-supporting Persistent object only if it matches a user-provided search function. This is similar to the existing getDelta<DeltaType>() function, which also accepts a user-provided function as a parameter. See pull request #220.
  • Group has a new get_num_subgroups<SubgroupType>() method, which returns the number of subgroups of the same type that exist in the current configuration, and a new get_my_subgroup_indexes<SubgroupType>() method, which returns a vector of subgroup indexes (of that type) that the local node belongs to.
  • Replicated<T> now has the methods get_global_persistence_frontier() and get_global_verified_frontier(), which allow application code to learn the highest version number that has reached global-persistence stability or global-signature stability, respectively. In addition, the method wait_for_global_persistence_frontier() blocks until a specified version number has reached global persistence. See pull request #225.
  • RPC-callable functions in replicated types can now learn the ID of the calling node by calling _Group::get_rpc_caller_id() within their function body. See #227 and #228.
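
The frontier methods above follow a common "query or block on a monotonically advancing version" pattern. The sketch below illustrates that pattern in isolation, assuming nothing about Derecho's internals: FrontierTracker and its method names are hypothetical stand-ins, and it uses an ordinary mutex and condition variable rather than whatever mechanism Replicated<T> actually uses.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for frontier bookkeeping; not Derecho's implementation.
class FrontierTracker {
public:
    using version_t = std::int64_t;

    // Returns the highest version recorded as globally persisted (-1 if none yet).
    version_t get_frontier() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return frontier_;
    }

    // Records that `version` is now globally persisted.
    // The frontier only moves forward; older versions are ignored.
    void advance_to(version_t version) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if(version <= frontier_) {
                return;
            }
            frontier_ = version;
        }
        frontier_changed_.notify_all();
    }

    // Blocks the caller until the frontier has reached at least `version`.
    void wait_for(version_t version) const {
        std::unique_lock<std::mutex> lock(mutex_);
        frontier_changed_.wait(lock, [&] { return frontier_ >= version; });
    }

private:
    mutable std::mutex mutex_;
    mutable std::condition_variable frontier_changed_;
    version_t frontier_ = -1;
};
```

In this model, an application thread that has just issued an update would call wait_for() with that update's version before acknowledging it to a user, while a background thread calls advance_to() as persistence is confirmed. The non-blocking getter suits polling or logging, where waiting is not wanted.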

Bugs Fixed

  • The P2P send mechanism was not thread-safe and could suffer from a race condition between the P2P sending thread and the SST predicates (RPC-handling) thread. This was fixed by removing internal state from the P2PConnection object, so that concurrent threads can access it without modifying shared data. See #217.
  • Some internal Derecho files incorrectly used #include with < > instead of #include with quotes to include other Derecho files, which could cause compile errors when trying to rebuild the library after it is already installed: the < > syntax defaults to searching system library locations before files in the local source tree. The double-quotes syntax should always be used to refer to files within the same library.
  • CMakeLists.txt had some errors in the way it packaged the Derecho library for installation. It should be capable of handling custom installation locations now.
  • Throughput was lower than expected in groups that sent large numbers of both P2P and ordered messages. The cause was high contention between the P2P thread and the predicates thread for the RDMA queue pair managed by LibFabric: LibFabric internally used a biased spinlock to protect this resource, and one thread would end up starving during periods of contention. We now require LibFabric to be configured with spinlocks disabled, so that it uses fair mutexes instead (see commit ae97bea).