Concurrency Kit Changelog

New in Concurrency Kit 0.5.2 (Sep 17, 2016)

  • This release contains performance improvements and a bug fix to ck_epoch, a bug fix to the pointer-based ck_ring_enqueue_mpsc operation, regression test improvements, fixes to ck_ht custom hash callback functions, a bug fix to ck_rhs and documentation improvements.
  • Documentation: Roman Lebedev has fixed invalid references to TAILQ in ck_queue and added documentation for STAILQ. Olivier Houchard has updated ck_epoch_poll documentation.
  • ck_ht: Roman Lebedev has committed a fix to ck_ht. Previously, user-supplied hash functions were not being called by ck_ht_hash. Users that call ck_ht_hash but use a custom hash function are affected.
  • ck_rhs: Olivier Houchard fixed an uninitialized memory problem.
  • ck_epoch: The synchronize operation is now completely serialized for multi-writer scenarios.
  • ck_ring: 铁哥 has fixed an issue in the pointer-based ck_ring_enqueue_mpsc operation. Users of the CK_RING_* generic interface are unaffected.

New in Concurrency Kit 0.5.1 (Jan 4, 2016)

  • Fixes to ck_epoch. It is recommended that all users of ck_epoch upgrade. ck_epoch_{synchronize,barrier} now provide full barrier semantics. ck_epoch_poll provides acquire semantics. A bug fix in ck_epoch_section_t support is also included.
  • The ck_epoch_section_t issue only affects users who are taking advantage of nested epoch sections (non-NULL ck_epoch_section_t argument). Such sections that never leave the outermost protected section will cause write-side operations to hang due to an invalid mask being applied to bucket selection. An additional regression test has been added for epoch sections that better stresses forward-progress guarantees of the ck_epoch_section_t extensions.
  • ck_pr_fence_memory has also been changed to emit fences on TSO. This is not the idiom provided by various kernel facilities, but it is more consistent to provide this behavior since Concurrency Kit provides a richer fencing interface. Generally, it is expected a full fence is used in cases where weaker fences (of X_Y / acq-rel form) are insufficient and so, a fence should be emitted.
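
The reason a full fence has to emit an instruction even on TSO can be seen in a store-buffering handshake: TSO still lets a later load pass an earlier store to a different location. The sketch below uses C11 atomics rather than ck_pr, and every name in it is hypothetical; it only illustrates the ordering a full fence is expected to provide.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flags for a store-buffering (Dekker-style) handshake. */
atomic_int toy_flag_a;
atomic_int toy_flag_b;

/*
 * Each side publishes its own flag, issues a full fence, then reads the
 * other side's flag. The fence orders the earlier store before the later
 * load, which TSO hardware does not guarantee on its own.
 */
int
toy_side_a(void)
{
    atomic_store_explicit(&toy_flag_a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&toy_flag_b, memory_order_relaxed);
}

int
toy_side_b(void)
{
    atomic_store_explicit(&toy_flag_b, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&toy_flag_a, memory_order_relaxed);
}

/* The outcome the fences rule out: both sides reading 0. */
void
toy_check_outcome(int seen_by_a, int seen_by_b)
{
    assert(!(seen_by_a == 0 && seen_by_b == 0));
}
```

If the two sides run on different threads, at least one of them observes the other's flag. Without the fences, both could read 0 even on x86.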

New in Concurrency Kit 0.4.5 (Feb 3, 2015)

  • This release allows ck_epoch deferrals to reschedule deferrals through ck_epoch_call, and adds ck_rhs_apply functionality, improved register allocation for 64-bit ARM targets, constification of read-only ck_sequence operations and build system improvements.

New in Concurrency Kit 0.4.4 (Sep 2, 2014)

  • This release introduces the powerful ck_hs_apply function, regressions build fixes, clang static analyzer false positive fixes, a new ck_bitmap_bts function for atomic test-and-set, various improvements to the configure script (specifically to mimic GNU automake more closely) and fixes for ck_rhs shared library builds.
  • The ck_hs_apply function is designed to speed up get to conditional modify workloads and allows for in-band mutation of stored objects. I've seen 15%-35% latency improvements when using this in my workloads.
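
As a rough illustration of why an apply-style operation helps get to conditional modify workloads, the sketch below folds lookup and update into a single probe and hands the slot to a callback. It is a single-threaded toy: the table layout, the callback signature and every toy_* name are assumptions made for this example and do not reflect the ck_hs_apply interface or implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8 /* hypothetical fixed capacity, must be a power of two */

struct toy_entry {
    unsigned long key;
    long value;
    bool used;
};

struct toy_set {
    struct toy_entry slots[TOY_SLOTS];
};

/* Callback: mutate the stored value in place; existed tells it whether the key was present. */
typedef void toy_apply_fn(long *value, bool existed, void *arg);

/*
 * One linear probe finds either the key or the first free slot, then the
 * callback decides what to store. A separate get followed by a put would
 * walk the same probe sequence twice. Returns false if the table is full.
 */
bool
toy_apply(struct toy_set *set, unsigned long key, toy_apply_fn *fn, void *arg)
{
    assert(set != NULL && fn != NULL);

    for (size_t i = 0; i < TOY_SLOTS; i++) {
        struct toy_entry *e = &set->slots[(key + i) & (TOY_SLOTS - 1)];
        bool existed = e->used;

        if (existed && e->key != key)
            continue; /* occupied by another key, keep probing */

        if (!existed) {
            e->key = key;
            e->value = 0;
            e->used = true;
        }
        fn(&e->value, existed, arg);
        return true;
    }
    return false;
}

/* Example callback: add *arg to the counter, creating it at zero if absent. */
void
toy_add(long *value, bool existed, void *arg)
{
    (void)existed;
    *value += *(long *)arg;
}
```

Calling toy_apply(&set, key, toy_add, &delta) on a zero-initialized set behaves as an upsert: the first call creates the counter and later calls update it in place.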

New in Concurrency Kit 0.4.3 (Jul 31, 2014)

  • This release includes support for ppc64le, preliminary and experimental support for aarch64, header include fixes for ck_rhs and i386 PIC fixes for regressions.
  • aarch64: CK will now compile on aarch64 platforms. It currently relies on GCC intrinsics and as such, is still grossly sub-optimal. There is a known regression in ck_epoch that is being investigated. This is a great stepping stone towards the final port and thanks to Marcin Juszkiewicz for patches.
  • ck_pr/gcc: The GCC ck_pr-backend is on parity with the architecture-specific fence interface. Thanks to Marcin Juszkiewicz for patches.
  • ck_rhs: Header include was incorrectly conflicting with ck_hs include, preventing inclusion of both files. Thanks to Bruce Spang for patches.
  • i386/regressions: PIC regressions did not build on x86-32 due to clobbering of the PIC register in the rdtsc implementation. Thanks to Devon H. O'Dell for patches.
  • ppc64le: The configure script has been updated to support ppc64le (little endian) targets. Thanks to Phil Sorber for patches.

New in Concurrency Kit 0.4.1 (Feb 26, 2014)

  • Fixes a calculation error in ck_bitmap_union, which can lead to a crash condition. This was missed due to a lack of the typical testing cycle for the function.

New in Concurrency Kit 0.4 (Feb 24, 2014)

  • Critical bug fixes were made to ck_bytelock (thanks to Albi Kalvo) for unslotted readers and 32-bit architectures (see below for details). A new robin-hood hash set implementation was added by Olivier Houchard that provides excellent stable performance for delete and write-heavy workloads, while still providing SPMC progress guarantees. A new operation has been added to the bitmap, ck_bitmap_union, that provides a (non-linearized) union of two bitmaps. Some other API changes were made to ck_bitmap. Last but not least, the library now uses the correct naming convention for Darwin targets.
  • ck_rhs: Documentation to come next release. It is pretty much a 1:1 mapping to the ck_hs API, however.
  • ck_bitmap: Manual page has been added, see ck_bitmap_union. The MPMC suffix has been permanently dropped from ck_bitmap.
  • ck_bytelock: Thanks to Albi Kavo for spotting this issue. If there are more than 56 readers or if a thread uses a slot identifier of 57, a deadlock situation will occur. This was an off-by-one bug. Another issue was spotted by Albi, a calculation error in bytelock length for 32-bit architectures, which would result in sub-optimal performance.
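
To make the "(non-linearized)" qualifier on ck_bitmap_union concrete, here is a minimal sketch of a word-by-word union written with C11 atomics. The function name and signature are hypothetical and this is not the CK implementation; the point is only that each word is merged by its own atomic operation, so the union as a whole is not a single atomic step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical helper: OR every word of src into dst. Each word is merged
 * with a separate atomic read-modify-write, so a concurrent observer may
 * see some words already merged and others not yet merged.
 */
void
toy_bitmap_union(atomic_uint *dst, atomic_uint *src, size_t n_words)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n_words; i++) {
        unsigned int bits = atomic_load_explicit(&src[i], memory_order_relaxed);

        atomic_fetch_or_explicit(&dst[i], bits, memory_order_relaxed);
    }
}
```

Callers that need a consistent snapshot of the whole bitmap would have to provide their own synchronization around such a union.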

New in Concurrency Kit 0.3.5 (Jan 6, 2014)

  • This release includes a critical bug fix for ck_ht_gc that affects CK_HT_MODE_DIRECT hash tables (it does not affect bytestring hash tables). Additional optimizations were made to ck_ht_gc and ck_hs_gc for empty hash tables.

New in Concurrency Kit 0.3.4 (Jan 3, 2014)

  • Acquire and release fences have finally been introduced. Significant performance improvements were made to spinlock, rwlock, pflock, cohorts, brlocks and bytelocks on RMO and PSO architectures (PPC*, SPARCv9+, ARM). Significant improvements have been made to ck_hs write-side performance in the presence of deletes. Latency improvements to ck_epoch for read-side critical sections on x86 were introduced. Last but not least, the parametric ck_ring has been re-introduced following discussion and feedback.

New in Concurrency Kit 0.3.3 (Dec 21, 2013)

  • build: A generic method has been added to detect VMA bits on targets that do not expose this (it is done through cpuid on x86).
  • ck_bag: This data structure has been removed and is no longer supported.
  • ck_epoch: A redundant observation of e + 2 has been removed.
  • ck_hs: A delete hint (CK_HS_MODE_DELETE) may be passed to hash set initialization to allow for stronger probe sequence bounds and improved performance for delete-heavy workloads.
  • ck_ht: Alignment issues in MurmurHash have been fixed.
  • ck_pr/arm: Specialized memory barrier instructions have been added for ARMv7+. Support for cas_2 and other double-word operations has been added.
  • ck_ring: The template-driven implementation has been removed from this release.
  • ck_spinlock: A hierarchical CLH spinlock has been added, which exhibits NUMA-awareness on multi-socket systems.
  • doc: Documentation has been added for ck_pr_rtm, which is the RTM interface for Intel processors supporting TSX.

New in Concurrency Kit 0.3.2 (Oct 22, 2013)

  • This release includes various manual page fixes and introduces a simple SPMC non-blocking dense pointer array with an explicitly invoked linearization point to allow for simpler forms of elimination for put to remove and remove to put workloads.
  • It is designed primarily for iteration heavy workloads. See the ck_array manual pages for more information. Expect modifications to this interface in future releases.

New in Concurrency Kit 0.3.1 (Sep 27, 2013)

  • This release consists of crucial bug fixes to the ARM port as well as improvements in the form of Thumb 2 support.
  • An issue that prevented core counts from being counted correctly on the Linux ARM port has also been fixed.

New in Concurrency Kit 0.3 (Sep 16, 2013)

  • Added support for ARMv5 - ARMv7, thanks to Olivier Houchard.
  • Simplification improvements to the ck_spinlock API in the form of additional initializers for ease-of-use and a strict aliasing fix for ck_spinlock_unlock.
  • Additional manual pages have been added. CK_BITMAP_BITS has been updated to not require the second argument.
  • In addition to this, a new CK_SEQUENCE_READ operation has been added to help the readability of read-side ck_sequence sections.
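
For readers unfamiliar with read-side sequence sections, the retry loop that a macro such as CK_SEQUENCE_READ is meant to tidy up looks roughly like the following. This is a sketch in C11 atomics under a single-writer assumption; the struct and the toy_* functions are hypothetical and are not the ck_sequence interface.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical two-field record guarded by a sequence counter. */
struct toy_point {
    atomic_uint sequence; /* odd while the writer is mid-update */
    atomic_int x;
    atomic_int y;
};

/* Single writer: make the counter odd, update the fields, make it even again. */
void
toy_point_write(struct toy_point *p, int x, int y)
{
    unsigned int s = atomic_load_explicit(&p->sequence, memory_order_relaxed);

    assert((s & 1) == 0); /* single-writer assumption: no update in flight */
    atomic_store_explicit(&p->sequence, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, y, memory_order_relaxed);
    atomic_store_explicit(&p->sequence, s + 2, memory_order_release);
}

/* Reader: retry until the same even counter is seen before and after the reads. */
void
toy_point_read(struct toy_point *p, int *x, int *y)
{
    unsigned int before, after;

    do {
        before = atomic_load_explicit(&p->sequence, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->sequence, memory_order_relaxed);
    } while ((before & 1) != 0 || before != after);
}
```

The reader never blocks the writer; it simply discards any snapshot taken while an update was in progress.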

New in Concurrency Kit 0.2.20 (Jul 18, 2013)

  • This release adds support for Intel TSX extensions in the form of the ck_pr_rtm interface, support for lock elision, a ck_hs_move operation and includes support for reader-writer cohort locks.
  • Thanks to Brendon Scheinman for his contribution of a generalized read-write cohort implementation. These locks are designed for NUMA-awareness. Reader-preference, writer-preference and neutral algorithms have been added. Please see the documentation for additional information.
  • ck_sequence interface has been slightly changed. More specifically, ck_sequence_read_retry has been modified to take an unsigned int rather than uint32_t for portability reasons.
  • is_locked functionality has been added to all spinlocks and ck_rwlock.

New in Concurrency Kit 0.2.19 (Jun 26, 2013)

  • This release adds a missing compiler barrier in no-op ck_pr_fence operations, improves performance of ck_hs, fixes a bug in proportional back-off ticket locks on x86, and adds support for a swap operation in the hash set. I am also happy to announce that a FreeBSD port has been added (devel/concurrencykit) thanks to Kubilay Kocak (koobs@).

New in Concurrency Kit 0.2.18 (May 22, 2013)

  • This release implements an improved finer-grained memory barrier interface and has also implemented lighter-weight ck_epoch_synchronize functionality. In this latest release, ck_epoch_synchronize will no longer execute any deferred callbacks (as documented, ck_epoch_barrier must be used). The latter interface plays nicely for integrations that wish to time reclamation overheads.

New in Concurrency Kit 0.2.17 (Apr 24, 2013)

  • This release includes MingW64 support, SPMC CK_STAILQ with wait-free read-side traversal and migration of ck_hp to use strict fence semantics when setting hazard pointers.
  • In addition to this, ck_pr and ck_ring documentation has been added in the form of manual pages (which are also available on the website).

New in Concurrency Kit 0.2.14 (Jan 18, 2013)

  • This release includes a bug fix to ck_bytelock.h and a bug fix to ck_queue.h.
  • The ck_bytelock bug affects users that downgrade a write acquisition to a read-side acquisition. Thanks to Jeffrey Birnbaum for pointing this out.
  • The ck_queue bug affects users that are using CK_LIST_INSERT_HEAD followed by CK_LIST_REMOVE to non-head elements. This execution sequence causes incorrect removal of nodes. Thanks to Abel Mathew for providing the patch fixing this problem.
  • This release also includes improved unit test coverage to x86-32.

New in Concurrency Kit 0.2.13 (Dec 20, 2012)

  • This release includes a bug fix to ck_ring_dequeue_spmc for users of versions of GCC greater than 4.2X and users of LLVM-backed compilers, a bug fix to ck_pr_store_double for x86_64 when dealing with constant arguments (causes compilation failures) and rips out ck_backoff_eb.
  • Details for major changes:
  • ck_ring: Add a work-around for compiler bug(s) in ck_ring_dequeue_spmc.
  • ck_backoff: Replace ck_backoff_eb with ck_backoff_gb.
  • ck_pr/x86_64: GCC will attempt immediate expansion on double. Break out double functionality to not use immediate expansion at all.

New in Concurrency Kit 0.2.12 (Nov 5, 2012)

  • This release includes trydequeue functionality for ck_ring and bug fixes to ck_epoch for systems that dynamically register and unregister threads to and from an epoch object.

New in Concurrency Kit 0.2.11 (Oct 8, 2012)

  • This release includes significant performance improvements to the PPC64 port, strict semantics for reader-writer mutexes, a single-producer/multi-consumer ck_ring implementation, performance improvements to ck_brlock (8-20% reduction in latency depending on platform), new _MOVE operations in ck_queue.h and Andrew Schorr's patches for const-ified load (and an extension for ck_pr_store).

New in Concurrency Kit 0.2.10 (Sep 18, 2012)

  • This release includes build system improvements, various manual page improvements and fixes as well as a spanking new lock-free hash set for SPMC belonging to the "ck_hs" namespace. The default memory model for SPARCv9 has been changed to total store ordering. The default memory models may be changed via the --memory-model option to configure. In addition to this, a --vma-bits option has been added that will allow you to leverage pointer packing on architectures that have a VMA space that is smaller than the pointer range. This can also be used to prepare yourself for Haswell.
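
Pointer packing, as referred to by the --vma-bits option, relies on the high bits of a pointer-sized word being unused when the virtual address space is narrower than the pointer. The sketch below assumes 48 VMA bits and a 64-bit word; the macro and function names are hypothetical and it does not show how CK itself lays out packed words.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VMA_BITS 48 /* assumption: addresses fit in the low 48 bits */
#define TOY_ADDR_MASK ((UINT64_C(1) << TOY_VMA_BITS) - 1)

/* Store a 16-bit tag in the otherwise unused high bits of an address word. */
uint64_t
toy_pack(uint64_t address, uint16_t tag)
{
    assert((address & ~TOY_ADDR_MASK) == 0); /* high bits must be free */
    return address | ((uint64_t)tag << TOY_VMA_BITS);
}

/* Recover the original address by masking off the tag. */
uint64_t
toy_unpack_address(uint64_t packed)
{
    return packed & TOY_ADDR_MASK;
}

/* Recover the tag from the high bits. */
uint16_t
toy_unpack_tag(uint64_t packed)
{
    return (uint16_t)(packed >> TOY_VMA_BITS);
}
```

If the hardware later widens its virtual addresses, the mask and shift have to follow, which is why the number of VMA bits is a configure-time parameter rather than a constant.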

New in Concurrency Kit 0.2.9 (Sep 11, 2012)

  • This release includes some minor fixes for ck_ring and ck_fifo_spsc for RMO as well as drastic improvements to the ck_epoch implementation and programming interface. Documentation has also been added for ck_bag and the new ck_epoch interface in the form of manual pages.

New in Concurrency Kit 0.2.8 (Aug 30, 2012)

  • This release includes an overhaul of ck_barrier, ck_spinlock, ck_bytelock, ck_brlock, ck_hp, ck_epoch, ck_rwlock, ck_ring, ck_fifo and ck_queue for RMO architectures. All users are recommended to upgrade.

New in Concurrency Kit 0.2.6 (Jul 31, 2012)

  • ck_cc: A CK_CC_ALIASED attribute has been added to work-around literal interpretations of strict aliasing with GCC 4.4. We may want to push for an interface change to ck_pr_*_ptr_2* variants.
  • ck_stack: Adoption of CK_CC_ALIASED.
  • ck_ht: Removed ck_ht_allocator_set definition and added performance improvements to platforms that have pointer packing disabled (will affect regrowth speed).
  • ck_hp_fifo: Fixed race condition on dequeue.
  • ck_fifo: Add load fences for platforms with relaxed load ordering.

New in Concurrency Kit 0.2.5 (Jul 16, 2012)

  • A critical bug fix to ck_ht that affects behavior for builds with disabled pointer packing that are using bytestring-key hash tables.
  • ck_ht now supports user-defined hash functions and per-ck_ht allocator functions. These changes are exposed through an API change to ck_ht_init.
  • Geometric backoff (ck_backoff_gb) has been modified to saturate at the ck_backoff-specified backoff ceiling.
  • The factor argument for ck_rwlock has also been dropped.
  • Documentation updates have been made to the ck_ht_init manual page to explicitly point out that key values of 0 or UINTPTR_MAX are unsupported. Thanks to Wez Furlong for pointing out that this limitation is not documented.
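
The saturating behavior described for geometric backoff can be pictured with a few lines of C. This is only a sketch of the idea: the constants and the toy_backoff_next name are hypothetical and the real ck_backoff interface may differ.

```c
#include <assert.h>

#define TOY_BACKOFF_INITIAL 1u
#define TOY_BACKOFF_CEILING 64u /* hypothetical ceiling */

/* Double the delay after each failed attempt, but never exceed the ceiling. */
unsigned int
toy_backoff_next(unsigned int current)
{
    assert(current > 0);

    if (current >= TOY_BACKOFF_CEILING / 2)
        return TOY_BACKOFF_CEILING;

    return current * 2;
}
```

Starting from TOY_BACKOFF_INITIAL, repeated calls yield 2, 4, 8, 16, 32, 64 and then stay at 64 instead of growing without bound.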

New in Concurrency Kit 0.2.4 (Jul 2, 2012)

  • This release includes a family of try operations for the lock-free FIFO and stack implementations and a recursive writer reader-writer lock implementation (ck_rwlock_recursive). trylock variants have yet to be introduced to the latter but will be included in next week's release including saturated back-off behavior.

New in Concurrency Kit 0.2.3 (Jun 19, 2012)

  • This release includes a bug fix to ck_bag which corrects a memory allocation leak caused by available list mismanagement. Pointer packing has been disabled by default for the x86_64 as Intel expects to bump VMA bits to 56 bits after Haswell. In order to re-enable pointer packing please pass --enable-pointer-packing to the configure script.

New in Concurrency Kit 0.2.2 (May 1, 2012)

  • This release includes a bug fix to ck_bag which corrects allocation behavior on ck_bag_put_spmc for initial bag creation and an iterator for ck_bitmap.

New in Concurrency Kit 0.2.1 (Apr 24, 2012)

  • This release includes two new data structures, support for PPC32, documentation fixes and a bug fix to ck_ht_entry_key for non-x86_64 targets.

New in Concurrency Kit 0.2 (Apr 6, 2012)

  • Improvements to the RPM specification file, a fix to ck_ring, beginnings of manual pages and a new lock-free SPMC hash table.

New in Concurrency Kit 0.1.5 (Mar 13, 2012)

  • ck_epoch: This release includes usability improvements to ck_epoch in the form of delineated read/write protected sections and removal of load barrier on read_end. ck_epoch_end has been deprecated. Despite API incompatibility, ABI compatibility has been maintained.
  • ck_pr: Fix strict aliasing issues in the x86_64 implementation and the various unit tests.
  • ck_queue: A BSD-derived queue.h facility. This allows for lock-less traversal of these lists. Please see the FreeBSD or Mac OS X queue manual page for additional details while documentation is being cooked. Writer-side synchronization is still necessary.

New in Concurrency Kit 0.1.4 (Nov 15, 2011)

  • This release includes extensions to the ck_spinlock and ck_rwlock programming interface and support for ADD/INC/DEC/SUB/NEG on double operations by relying on the underlying ck_pr_cas_double implementation.
  • A default ck_spinlock implementation has been provided that lacks the implementation suffix. This implementation is currently implemented in terms of ck_spinlock_fas. Additionally, ck_spinlock_*_locked utility functions have been added that allow for lazy locking. For more information please refer to http://concurrencykit.org/doc/ck_spinlock.html
  • ck_rwlock now has a ck_rwlock_downgrade function which atomically downgrades an exclusive acquisition to a shared one.
  • Thanks to William Irwin for the initial patch for double support in ck_pr and thanks to David Joseph for adding top-level ck_pr support for double targets.
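
Building ADD on top of a compare-and-swap for doubles generally means a retry loop over the value's bit pattern. The sketch below shows that idea with C11 atomics; the struct and all toy_* names are hypothetical and this is not how ck_pr is implemented internally.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical cell: a double kept as its 64-bit pattern so it can be CASed. */
struct toy_double {
    _Atomic uint64_t bits;
};

static uint64_t
toy_to_bits(double d)
{
    uint64_t b;

    memcpy(&b, &d, sizeof b);
    return b;
}

static double
toy_from_bits(uint64_t b)
{
    double d;

    memcpy(&d, &b, sizeof d);
    return d;
}

void
toy_double_init(struct toy_double *cell, double value)
{
    atomic_init(&cell->bits, toy_to_bits(value));
}

double
toy_double_load(struct toy_double *cell)
{
    return toy_from_bits(atomic_load(&cell->bits));
}

/* ADD built from compare-and-swap: retry until no other update intervened. */
void
toy_double_add(struct toy_double *cell, double delta)
{
    uint64_t expected = atomic_load(&cell->bits);
    uint64_t desired;

    do {
        desired = toy_to_bits(toy_from_bits(expected) + delta);
    } while (!atomic_compare_exchange_weak(&cell->bits, &expected, desired));
}
```

INC, DEC, SUB and NEG follow the same shape with a different computation inside the loop.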

New in Concurrency Kit 0.1.3 (Nov 9, 2011)

  • Added support for ck_pr_load/store_double on ppc64, sparcv9 and x86_64 targets.

New in Concurrency Kit 0.1.1 (Nov 1, 2011)

  • This release includes an overhaul of the epoch reclamation interface to allow for per-object destructors, lower latency big reader spinlocks, a recursive big reader spinlock (for reader-side) and bounded variants of ck_rwlock and ck_brlock.

New in Concurrency Kit 0.0.6 (Aug 1, 2011)

  • Fix for a build system regression that affects Mac OS X and ck_ring_size. Additional functionality has been provided for the SPSC wait-free FIFO implementation including a spinlock abstraction, a volatile implementation of ck_fifo_spsc_isempty and an accessor for the FIFO stub node.

New in Concurrency Kit 0.0.5 (Jun 27, 2011)

  • Includes build system improvements, naive reader/writer spinlocks and big reader spinlocks.

New in Concurrency Kit 0.0.4 (May 18, 2011)

  • This release includes build system improvements, improvements to the SPARCv9 port and Sun Studio 12+ support.