ck
Concurrency primitives, safe memory reclamation mechanisms and non-blocking (including lock-free) data structures designed to aid in the research, design and implementation of high performance concurrent systems developed in C99+.
Top Related Projects
Concurrent data structures in C++
A bounded multi-producer multi-consumer concurrent queue written in C++11
A fast multi-producer, multi-consumer lock-free concurrent queue for C++11
An open-source C++ library developed and used at Facebook.
oneAPI Threading Building Blocks (oneTBB)
Quick Overview
Concurrency Kit (ck) is a high-performance concurrency library for C and C++. It provides a collection of lock-free and concurrent data structures, as well as low-level synchronization primitives designed for scalability and efficiency in multi-threaded environments.
Pros
- Highly optimized for performance and scalability
- Extensive collection of lock-free data structures
- Cross-platform support (Linux, BSD, Windows, macOS)
- Well-documented with thorough API references
Cons
- Steep learning curve for developers unfamiliar with lock-free programming
- Limited higher-level abstractions compared to some other concurrency libraries
- Requires careful usage to avoid subtle concurrency bugs
- May be overkill for simpler concurrent applications
Code Examples
- Using a lock-free stack:
#include <ck_stack.h>
ck_stack_t my_stack;
ck_stack_entry_t element;
// Initialize the stack
ck_stack_init(&my_stack);
// Push an element onto the stack
ck_stack_push_upmc(&my_stack, &element);
// Pop an element from the stack
ck_stack_entry_t *popped = ck_stack_pop_upmc(&my_stack);
- Implementing a spinlock:
#include <ck_spinlock.h>
ck_spinlock_t my_lock = CK_SPINLOCK_INITIALIZER;
// Acquire the lock
ck_spinlock_lock(&my_lock);
// Critical section
// ...
// Release the lock
ck_spinlock_unlock(&my_lock);
- Using an atomic operation:
#include <ck_pr.h>
uint64_t value = 0;
// Atomically increment the value
ck_pr_faa_64(&value, 1);
// Atomically compare and swap
uint64_t expected = 1;
uint64_t new_value = 2;
bool success = ck_pr_cas_64(&value, expected, new_value);
Getting Started
To use Concurrency Kit in your project:
- Download and install the library from the official repository.
- Include the necessary headers in your C/C++ file:
#include <ck_stack.h>
#include <ck_spinlock.h>
#include <ck_pr.h>
- Compile your program with the Concurrency Kit library:
gcc -o my_program my_program.c -lck
- Start using the Concurrency Kit data structures and primitives in your code as shown in the examples above.
Competitor Comparisons
Concurrent data structures in C++
Pros of Junction
- More focused on specific concurrent data structures like queues and hash tables
- Provides detailed documentation and examples for each data structure
- Designed with C++11 features in mind, offering modern C++ compatibility
Cons of Junction
- Smaller community and less frequent updates compared to CK
- Limited to C++ language, while CK supports C and can be used in C++ projects
- Narrower scope of concurrent primitives compared to CK's broader offering
Code Comparison
Junction queue usage:
junction::ConcurrentQueue<int> queue;
queue.push(42);
int value;
if (queue.try_pop(value)) {
// Use value
}
CK queue usage:
ck_queue_t queue;
ck_queue_init(&queue);
ck_queue_enqueue(&queue, &item);
void *value = ck_queue_dequeue(&queue);
Both libraries provide efficient concurrent data structures, but Junction focuses on C++ specific implementations while CK offers a broader range of primitives in C. CK has a larger community and more frequent updates, making it potentially more suitable for projects requiring long-term support and a wider range of concurrent operations. Junction, however, might be preferable for modern C++ projects seeking specialized concurrent data structures with detailed documentation.
A bounded multi-producer multi-consumer concurrent queue written in C++11
Pros of MPMCQueue
- Focused implementation of a multi-producer, multi-consumer queue
- Designed for low-latency and high-throughput scenarios
- Simple API with clear documentation
Cons of MPMCQueue
- Limited scope compared to CK's broader concurrency toolkit
- May require additional synchronization primitives for complex use cases
- Less mature and less widely adopted than CK
Code Comparison
MPMCQueue:
MPMCQueue<int> q(100);
q.try_push(42);
int value;
q.try_pop(value);
CK:
ck_ring_t ring;
ck_ring_buffer_t buffer[SIZE];
ck_ring_init(&ring, buffer, SIZE);
ck_ring_enqueue_spsc(&ring, buffer, &data);
ck_ring_dequeue_spsc(&ring, buffer, &result);
Summary
MPMCQueue offers a specialized, high-performance queue implementation, while CK provides a comprehensive concurrency toolkit. MPMCQueue is ideal for specific use cases requiring a multi-producer, multi-consumer queue, whereas CK offers a broader range of concurrency primitives and data structures. The choice between the two depends on the specific requirements of the project and the desired level of abstraction.
A fast multi-producer, multi-consumer lock-free concurrent queue for C++11
Pros of concurrentqueue
- Focused specifically on lock-free queues, providing a simpler API for queue operations
- Header-only library, making it easier to integrate into existing projects
- Extensive benchmarking and performance optimizations for queue operations
Cons of concurrentqueue
- Limited scope compared to CK, which offers a broader range of concurrency primitives
- Less flexibility for customizing low-level concurrency behavior
- Fewer options for atomic operations and memory ordering
Code Comparison
CK example (atomic increment):
ck_pr_inc_uint(&value);
concurrentqueue example (enqueue operation):
moodycamel::ConcurrentQueue<int> q;
q.enqueue(42);
Summary
CK provides a comprehensive toolkit for concurrency primitives, offering more flexibility and control over low-level operations. concurrentqueue, on the other hand, focuses specifically on efficient, lock-free queue implementations with a simpler API. CK is better suited for projects requiring a wide range of concurrency tools, while concurrentqueue excels in scenarios where high-performance queues are the primary concern.
An open-source C++ library developed and used at Facebook.
Pros of Folly
- Broader scope: Folly is a comprehensive C++ library with a wide range of utilities, while CK focuses primarily on concurrency primitives
- Active development: Folly has more frequent updates and a larger contributor base
- Integration with other Facebook projects: Seamlessly works with other Facebook open-source libraries
Cons of Folly
- Larger footprint: Folly's extensive feature set results in a larger codebase and potential overhead
- Steeper learning curve: The breadth of functionality may require more time to master compared to CK's focused approach
Code Comparison
CK (Atomic operations):
ck_pr_cas_ptr(&target, old_value, new_value);
Folly (Atomic operations):
folly::atomic_compare_exchange_strong(&target, &old_value, new_value);
Both libraries provide atomic operations, but Folly's implementation aligns more closely with C++11 standards.
Summary
Folly offers a more comprehensive set of utilities and active development, making it suitable for large-scale projects. CK, on the other hand, provides a focused, lightweight solution for concurrency primitives. The choice between the two depends on project requirements, team expertise, and performance considerations.
oneAPI Threading Building Blocks (oneTBB)
Pros of oneTBB
- More comprehensive parallel programming library with a wider range of features
- Better integration with C++ standard library and modern C++ practices
- Extensive documentation and support from Intel
Cons of oneTBB
- Larger footprint and potentially more complex to use for simple tasks
- May have more overhead for basic operations compared to CK
- Less suitable for low-level, performance-critical systems programming
Code Comparison
oneTBB example (parallel for loop):
#include <tbb/parallel_for.h>
#include <vector>
void process(std::vector<int>& data) {
tbb::parallel_for(0, data.size(), [&](int i) {
data[i] = data[i] * 2;
});
}
CK example (lock-free stack):
#include <ck_stack.h>
ck_stack_t stack;
ck_stack_entry_t entry;
ck_stack_init(&stack);
ck_stack_push_upmc(&stack, &entry);
ck_stack_entry_t *popped = ck_stack_pop_upmc(&stack);
The code examples highlight the different focus areas of the libraries. oneTBB provides high-level abstractions for parallel algorithms, while CK offers low-level concurrent data structures and primitives. oneTBB's example shows its integration with C++ features, whereas CK's example demonstrates its C-based, lock-free approach.
Convert
designs to code with AI
Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.
Try Visual CopilotREADME
Concurrency Kit
Modern concurrency primitives and building blocks for high performance applications.
Continuous Integration
Compilers tested in the past include gcc, clang, cygwin, icc, mingw32, mingw64 and suncc across all supported architectures. All new architectures are required to pass the integration test and under-go extensive code review.
Continuous integration is currently enabled for the following targets:
darwin/clang/arm64
freebsd/clang/x86-64
linux/gcc/arm64
linux/gcc/x86-64
linux/clang/x86-64
Compile and Build
-
Step 1.
./configure
For additional options try./configure --help
-
Step 2. In order to compile regressions (requires POSIX threads) use
make regressions
. In order to compile libck usemake all
ormake
. -
Step 3. In order to install use
make install
To uninstall usemake uninstall
.
See http://concurrencykit.org/ for more information.
Supported Architectures
Concurrency Kit supports any architecture using compiler built-ins as a fallback. There is usually a performance degradation associated with this.
Concurrency Kit has specialized assembly for the following architectures:
aarch64
arm
ppc
ppc64
riscv64
s390x
sparcv9+
x86
x86_64
Features
Concurrency Primitives
ck_pr
Concurrency primitives including architecture-specific ones. Provides wrappers around CAS in case of missing native support. This also provides support for RTM (transactional memory), pipeline control, read-for-ownership and more.
ck_backoff
A simple and efficient (minimal noise) backoff function.
ck_cc
Abstracted compiler builtins when writing efficient concurrent data structures.
Safe Memory Reclamation
ck_epoch
A scalable safe memory reclamation mechanism with support for idle threads and various optimizations that make it better than or competitive with many state-of-the-art solutions.
ck_hp
Implements support for hazard pointers, a simple and efficient lock-free safe memory reclamation mechanism.
Data Structures
ck_array
A simple concurrently-readable pointer array structure.
ck_bitmap
An efficient multi-reader and multi-writer concurrent bitmap structure.
ck_ring
Efficient concurrent bounded FIFO data structures with various performance trade-off. This includes specialization for single-reader, many-reader, single-writer and many-writer.
ck_fifo
A reference implementation of the first published lock-free FIFO algorithm, with specialization for single-enqueuer-single-dequeuer and many-enqueuer-single-dequeuer and extensions to allow for node re-use.
ck_hp_fifo
A reference implementation of the above algorithm, implemented with safe memory reclamation using hazard pointers.
ck_hp_stack
A reference implementation of a Treiber stack with support for hazard pointers.
ck_stack
A reference implementation of an efficient lock-free stack, with specialized variants for a variety of memory management strategies and bounded concurrency.
ck_queue
A concurrently readable friendly derivative of the BSD-queue interface. Coupled with a safe memory reclamation mechanism, implement scalable read-side queues with a simple search and replace.
ck_hs
An extremely efficient single-writer-many-reader hash set, that satisfies lock-freedom with bounded concurrency without any usage of atomic operations and allows for recycling of unused or deleted slots. This data structure is recommended for use as a general hash-set if it is possible to compute values from keys.
ck_ht
A specialization of the ck_hs
algorithm allowing for disjunct key-value pairs.
ck_rhs
A variant of ck_hs
that utilizes robin-hood hashing to allow for improved
performance with higher load factors and high deletion rates.
Synchronization Primitives
ck_ec
An extremely efficient event counter implementation, a better alternative to condition variables with specialization for fixed concurrency use-cases.
ck_barrier
A plethora of execution barriers including: centralized barriers, combining barriers, dissemination barriers, MCS barriers, tournament barriers.
ck_brlock
A simple big-reader lock implementation, write-biased reader-writer lock with scalable read-side locking.
ck_bytelock
An implementation of bytelocks, for research purposes, allowing for (in theory), fast read-side acquisition without the use of atomic operations. In reality, memory barriers are required on the fast path.
ck_cohort
A generic lock cohorting interface, allows you to turn any lock into a NUMA-friendly scalable NUMA lock. There is a significant trade-off in fast path acquisition cost. Specialization is included for all relevant lock implementations in Concurrency Kit. Learn more by reading "Lock Cohorting: A General Technique for Designing NUMA Locks".
ck_elide
A generic lock elision framework, allows you to turn any lock implementation into an elision-aware implementation. This requires support for restricted transactional memory by the underlying hardware.
ck_pflock
Phase-fair reader-writer mutex that provides strong fairness guarantees between readers and writers. Learn more by reading "Spin-Based Reader-Writer Synchronization for Multiprocessor Real-Time Systems".
ck_rwcohort
A generic read-write lock cohorting interface, allows you to turn any read-write lock into a NUMA-friendly scalable NUMA lock. There is a significant trade-off in fast path acquisition cost. Specialization is included for all relevant lock implementations in Concurrency Kit. Learn more by reading "Lock Cohorting: A General Technique for Designing NUMA Locks".
ck_rwlock
A simple centralized write-biased read-write lock.
ck_sequence
A sequence counter lock, popularized by the Linux kernel, allows for very fast read and write synchronization for simple data structures where deep copy is permitted.
ck_swlock
A single-writer specialized read-lock that is copy-safe, useful for data structures that must remain small, be copied and contain in-band mutexes.
ck_tflock
Task-fair locks are fair read-write locks, derived from "Scalable reader-writer synchronization for shared-memory multiprocessors".
ck_spinlock
A basic but very fast spinlock implementation.
ck_spinlock_anderson
Scalable and fast anderson spinlocks. This is here for reference, one of the earliest scalable and fair lock implementations.
ck_spinlock_cas
A basic spinlock utilizing compare_and_swap.
ck_spinlock_dec
A basic spinlock, a C adaption of the older optimized Linux kernel spinlock for x86. Primarily here for reference.
ck_spinlock_fas
A basic spinlock utilizing atomic exchange.
ck_spinlock_clh
An efficient implementation of the scalable CLH lock, providing many of the same performance properties of MCS with a better fast-path.
ck_spinlock_hclh
A NUMA-friendly CLH lock.
ck_spinlock_mcs
An implementation of the seminal scalable and fair MCS lock.
ck_spinlock_ticket
An implementation of fair centralized locks.
Top Related Projects
Concurrent data structures in C++
A bounded multi-producer multi-consumer concurrent queue written in C++11
A fast multi-producer, multi-consumer lock-free concurrent queue for C++11
An open-source C++ library developed and used at Facebook.
oneAPI Threading Building Blocks (oneTBB)
Convert
designs to code with AI
Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.
Try Visual Copilot