thread-pool

Thread pool implementation using c++11 threads


Top Related Projects

CTPL

1,947

Modern and efficient C++ Thread Pool Library

thread-pool

2,582

BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library

threadpool

1,063

based on C++11, a mini thread pool that is concise and accepts a variable number of parameters

asyncplusplus

1,404

Async++ concurrency framework for C++11

Quick Overview

mtrebi/thread-pool is a C++ implementation of a simple thread pool. It provides a way to manage and reuse a fixed number of threads for executing tasks concurrently, improving performance and resource utilization in multi-threaded applications.

Pros

Easy to use and integrate into existing C++ projects
Efficient task execution with minimal overhead
Supports both synchronous and asynchronous task submission
Header-only library, simplifying integration

Cons

Limited features compared to more comprehensive threading libraries
May not be suitable for complex threading scenarios
Lacks advanced scheduling or prioritization options
Not actively maintained (last update was in 2018)

Code Examples

Creating a thread pool and submitting a simple task:

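#include <iostream>
#include "ThreadPool.h"

ThreadPool pool(4); // Create a thread pool with 4 threads
pool.init();        // Start the worker threads before submitting work
auto result = pool.submit([]() {
    return 42;
});
std::cout << "Result: " << result.get() << std::endl;
pool.shutdown();    // Release all threads when done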

Submitting multiple tasks and waiting for their completion:

std::vector<std::future<int>> results;
for (int i = 0; i < 8; ++i) {
    results.emplace_back(pool.submit([i]() {
        return i * i;
    }));
}

for (auto& result : results) {
    std::cout << "Result: " << result.get() << std::endl;
}

Using the thread pool with a custom function:

int multiply(int a, int b) {
    return a * b;
}

auto result = pool.submit(multiply, 6, 7);
std::cout << "6 * 7 = " << result.get() << std::endl;

Getting Started

Clone the repository:

git clone https://github.com/mtrebi/thread-pool.git

Include the ThreadPool.h header in your C++ project:
```
#include "path/to/ThreadPool.h"
```

Create a thread pool and start submitting tasks:

ThreadPool pool(std::thread::hardware_concurrency());
pool.init(); // The pool must be initialized before submitting work
auto result = pool.submit([]() { return "Hello, Thread Pool!"; });
std::cout << result.get() << std::endl;
pool.shutdown();

Competitor Comparisons

CTPL

1,947

Modern and efficient C++ Thread Pool Library

Pros of CTPL

Simpler implementation with fewer dependencies
More lightweight and easier to integrate into existing projects
Supports both C++11 and C++14 standards

Cons of CTPL

Less feature-rich compared to thread-pool
Limited documentation and examples
Lacks advanced queue management options

Code Comparison

CTPL:

#include "ctpl_stl.h"
ctpl::thread_pool p(4);
p.push([](int id){ /* task */ });

thread-pool:

#include "ThreadPool.h"
ThreadPool pool(4);
pool.init();
pool.submit([](){ /* task */ });

Key Differences

CTPL uses a template-based approach, while thread-pool uses a more object-oriented design
thread-pool returns a std::future from submit, so the caller can retrieve each task's result later
CTPL provides a simpler API with fewer configuration options

Use Cases

CTPL: Ideal for projects requiring a straightforward thread pool implementation with minimal overhead
thread-pool: Better suited for learning how a C++11 thread pool works; its author advises against using it in a professional environment

Community and Maintenance

thread-pool has more recent updates and a larger community following
CTPL has fewer contributors but maintains a stable codebase

Both libraries offer efficient thread pool implementations, but they cater to different needs in terms of complexity and feature set.

thread-pool

2,582

BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library

Pros of thread-pool (bshoshany)

Header-only implementation, making it easier to integrate into projects
More extensive feature set, including parallel loops and custom task priorities
Better documentation and examples for usage

Cons of thread-pool (bshoshany)

More complex codebase, potentially harder to understand and modify
Requires C++17 or later, limiting compatibility with older codebases

Code Comparison

thread-pool (mtrebi):

ThreadPool pool(4);
pool.init();
auto future = pool.submit([](int a, int b) { return a + b; }, 2, 3);

thread-pool (bshoshany):

thread_pool pool(4);
auto future = pool.submit([](int a, int b) { return a + b; }, 2, 3);
int result = future.get();

Both interfaces return a future, which makes it easy to handle return values from tasks. The bshoshany implementation also provides additional features like parallel loops:

pool.parallelize_loop(0, 100, [](int i) { /* process i */ });

Both libraries offer efficient thread pool implementations, but bshoshany's version provides more features and flexibility at the cost of increased complexity and higher language standard requirements.

threadpool

1,063

based on C++11, a mini thread pool that is concise and accepts a variable number of parameters

Pros of threadpool

Simpler implementation with fewer files and dependencies
Supports both C++11 and C++17 standards
Includes a simple example demonstrating usage

Cons of threadpool

Less feature-rich compared to thread-pool
Lacks extensive documentation and explanations
No built-in support for future/promise-based task submission

Code Comparison

thread-pool:

ThreadPool pool(4);
auto result = pool.submit([](int answer) { return answer; }, 42);
std::cout << result.get() << std::endl;

threadpool:

ThreadPool pool(4);
auto result = pool.enqueue([](int answer) { return answer; }, 42);
std::cout << result.get() << std::endl;

The main difference in usage is the method name for submitting tasks: submit vs enqueue. Both implementations use similar syntax for task submission and result retrieval.

Both are small C++11 implementations. thread-pool documents its design step by step in its README, while threadpool keeps to a compact implementation with a short example. Note that thread-pool's author advises against using it in a professional environment, so neither should be chosen for complex, high-performance applications without careful evaluation.

asyncplusplus

1,404

Async++ concurrency framework for C++11

Pros of asyncplusplus

More comprehensive and feature-rich, offering a wider range of asynchronous programming tools
Better support for complex task dependencies and continuations
More actively maintained with regular updates and improvements

Cons of asyncplusplus

Steeper learning curve due to its more extensive API and features
Potentially higher overhead for simple use cases compared to thread-pool

Code Comparison

thread-pool:

ThreadPool pool(4);
pool.init();
auto result = pool.submit([](int answer) { return answer; }, 42);

asyncplusplus:

async::task<int> t = async::spawn([](int answer) { return answer; }, 42);
int result = t.get();

Key Differences

thread-pool is a simpler, more straightforward implementation focused solely on thread pooling
asyncplusplus provides a more comprehensive asynchronous programming framework
thread-pool uses a queue-based approach, while asyncplusplus offers task-based programming
asyncplusplus supports more advanced features like task cancellation and parallel algorithms

Use Case Recommendations

Choose thread-pool for simpler projects requiring basic thread pooling functionality
Opt for asyncplusplus when working on larger, more complex applications that benefit from advanced asynchronous programming features

Community and Support

thread-pool has fewer contributors and less frequent updates
asyncplusplus has a larger community, more frequent updates, and better documentation

README

Introduction
Build instructions
Thread pool
        Queue
        Submit function
        Thread worker
Usage example
        Use case#1
        Use case#2
        Use case#3
Future work
References

Introduction:

A thread pool is a technique that allows developers to exploit the concurrency of modern processors in an easy and efficient manner. It's easy because you send "work" to the pool and somehow this work gets done without blocking the main thread. It's efficient because threads are not initialized each time we want the work to be done. Threads are initialized once and remain inactive until some work has to be done. This way we minimize the overhead.

There are many other thread pool implementations in C++, and many of them are probably better (safer, faster...) than mine. However, I believe my implementation is very straightforward and easy to understand.

Disclaimer: Please do not use this project in a professional environment. It may contain bugs and/or not work as expected. I did this project to learn how C++11 threads work and to provide an easy way for other people to understand them too.

Build instructions:

This project was developed using NetBeans on Linux, but it should work on Windows, macOS and Linux. It can be easily built using CMake with any of its generators. The following commands generate the VS 2017 project files:

// VS 2017
cd <project-folder>
mkdir build
cd build/
cmake .. -G "Visual Studio 15 2017 Win64"

Then, from VS you can edit and execute the project. Make sure that the main project is set as the startup project.

If you are using Linux, use the default generator instead and run make as an extra step to build the executable:

// Linux
cd <project-folder>
mkdir build
cd build/
cmake ..
make

Thread pool

The way that I understand things better is with images. So, let's take a look at the image of thread pool given by Wikipedia:

As you can see, we have three important elements here:

Tasks Queue. This is where the work that has to be done is stored.
Thread Pool. This is the set of threads (or workers) that continuously take work from the queue and do it.
Completed Tasks. When the Thread has finished the work we return "something" to notify that the work has finished.

Queue

We use a queue to store the work because it's the most sensible data structure: we want the work to be started in the same order that we sent it. However, this queue is a little bit special. As I said in the previous section, threads are continuously (well, not really, but let's assume that they are) querying the queue to ask for work. When there's work available, threads take the work from the queue and do it. What would happen if two threads tried to take the same work at the same time? Unsynchronized access to a std::queue is a data race, so the behavior is undefined and the program would most likely crash.

To avoid these kinds of problems, I implemented a wrapper over the standard C++ Queue that uses mutex to restrict the concurrent access. Let's see a small sample of the SafeQueue class:

void enqueue(T& t) {
	std::unique_lock<std::mutex> lock(m_mutex);
	m_queue.push(t);
}

To enqueue, the first thing we do is lock the mutex to make sure that no one else is accessing the resource. Then, we push the element to the queue. When the lock goes out of scope, it is automatically released. Easy, huh? This way, we make the Queue thread-safe and thus we don't have to worry about many threads accessing and/or modifying it at the same "time".
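The snippet above shows only enqueue. As a rough illustration of how the rest of such a wrapper might look, here is a minimal sketch. The dequeue and empty names mirror the calls made by the worker later in this README, but the class name SafeQueueSketch and every method body are assumptions, not the project's actual SafeQueue:

```cpp
#include <mutex>
#include <queue>
#include <utility>

// Sketch of a mutex-guarded queue. The class name and method bodies are
// assumptions for illustration, not the project's actual SafeQueue.
template <typename T>
class SafeQueueSketch {
public:
    // Copy an element into the queue while holding the lock.
    void enqueue(const T& t) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push(t);
    }

    // Move the front element into `out`. Returns false if the queue was empty.
    bool dequeue(T& out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_queue.empty()) {
            return false;
        }
        out = std::move(m_queue.front());
        m_queue.pop();
        return true;
    }

    // Check emptiness under the same lock as every other operation.
    bool empty() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_queue.empty();
    }

private:
    std::queue<T> m_queue;
    mutable std::mutex m_mutex; // mutable so empty() can lock in a const method
};
```

Returning a bool from dequeue lets a caller check and take an element in one locked step, instead of calling empty() and then dequeue() with a gap in between.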

Submit function

The most important method of the thread pool is the one responsible for adding work to the queue. I called this method submit. It's not difficult to understand how it works, but its implementation can seem scary at first. Let's think about what it should do, and after that we will worry about how to do it. What:

Accept any function with any parameters.
Return "something" immediately to avoid blocking main thread. This returned object should eventually contain the result of the operation.

Cool, let's see how we can implement it.

Submit implementation

The complete submit function looks like this:

// Submit a function to be executed asynchronously by the pool
template<typename F, typename...Args>
auto submit(F&& f, Args&&... args) -> std::future<decltype(f(args...))> {
	// Create a function with bounded parameters ready to execute
	std::function<decltype(f(args...))()> func = std::bind(std::forward<F>(f), std::forward<Args>(args)...);
	// Encapsulate it into a shared ptr in order to be able to copy construct / assign 
	auto task_ptr = std::make_shared<std::packaged_task<decltype(f(args...))()>>(func);

	// Wrap packaged task into void function
	std::function<void()> wrapper_func = [task_ptr]() {
	  (*task_ptr)(); 
	};

	// Enqueue generic wrapper function
	m_queue.enqueue(wrapper_func);

	// Wake up one thread if its waiting
	m_conditional_lock.notify_one();

	// Return future from promise
	return task_ptr->get_future();
}

Nevertheless, we're going to inspect line by line what's going on in order to fully understand how it works.

Variadic template function

template<typename F, typename...Args>

This means that the next statement is templated. The first template parameter is called F (our function) and the second one is a parameter pack. A parameter pack is a special template parameter that can accept zero or more template arguments. It is, in fact, a way to express a variable number of arguments in a template. A template with at least one parameter pack is called a variadic template.

Summarizing, we are telling the compiler that our submit function is going to take one generic parameter of type F (our function) and a parameter pack Args (the parameters of the function F).

Function declaration

auto submit(F&& f, Args&&... args) -> std::future<decltype(f(args...))> {

This may seem weird, but it's not. A function, in fact, can be declared using two different syntaxes. The following is the most well known:

return-type identifier ( argument-declarations... )

But, we can also declare the function like this:

auto identifier ( argument-declarations... ) -> return_type

Why two syntaxes? Well, imagine that you have a function that has a return type that depends on the input parameters of the function. Using the first syntax you can't declare that function without getting a compiler error since you would be using a variable in the return type that has not been declared yet (because the return type declaration goes before the parameters type declaration).

Using the second syntax, you can declare the function to have return type auto and then, using the ->, declare a return type that depends on the arguments of the function, which have already been declared at that point.
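As a small illustration of the trailing return type, consider a function whose result type depends on its parameter types. The add helper below is hypothetical and not part of the project:

```cpp
#include <type_traits>

// Hypothetical helper: the return type depends on the parameters a and b,
// so it is written after the parameter list, where both names are in scope.
template <typename A, typename B>
auto add(A a, B b) -> decltype(a + b) {
    return a + b;
}

// int + double yields double, and the compiler works that out from decltype.
static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "add(int, double) should return double");
static_assert(std::is_same<decltype(add(2, 3)), int>::value,
              "add(int, int) should return int");
```

Writing decltype(a + b) in front of the function name would not compile, because a and b have not been declared at that point.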

Now, let's inspect the parameters of the submit function. When the type of a parameter is declared as T&& for some deduced type T, that parameter is a universal reference. This term was coined by Scott Meyers because T&& can also mean r-value reference. However, in the context of type deduction, it means that the parameter can be bound to both l-values and r-values, unlike non-const l-value references (which bind only to modifiable l-values) and r-value references (which bind only to r-values).
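To make the deduction rule concrete, the following sketch shows how T is deduced for l-value and r-value arguments, and how std::forward preserves the original value category. All function names here are hypothetical and chosen only for this illustration:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical probe: T is deduced as an lvalue reference only when the
// argument is an lvalue. For rvalues, T is a plain (non-reference) type.
template <typename T>
bool deduced_as_lvalue(T&&) {
    return std::is_lvalue_reference<T>::value;
}

// Overloads that report which value category they received.
inline std::string category(int&)  { return "lvalue"; }
inline std::string category(int&&) { return "rvalue"; }

// std::forward passes the argument on with its original value category.
// Without it, the named parameter t would always be treated as an lvalue.
template <typename T>
std::string forwarded_category(T&& t) {
    return category(std::forward<T>(t));
}
```

Calling forwarded_category with a named int variable selects the int& overload, while calling it with a literal or with std::move(variable) selects the int&& overload.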

The return type of the function is std::future. A std::future is a special type that provides a mechanism to access the result of asynchronous operations, in our case, the result of executing a specific function. This matches what we said earlier.

Finally, the template type of std::future is decltype(f(args...)). Decltype is a special C++ keyword that inspects the declared type of an entity or the type and value category of an expression. In our case, we want to know the return type of the function f, so we give decltype our generic function f and the parameter pack args.
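The same return type can be tried out in isolation. The sketch below uses std::async as a stand-in for the pool, so that it stays self-contained; run_async and multiply are hypothetical names, and only the signature is meant to mirror submit:

```cpp
#include <future>
#include <type_traits>
#include <utility>

inline int multiply(int a, int b) { return a * b; }

// decltype inspects the call expression without evaluating it.
static_assert(std::is_same<decltype(multiply(2, 3)), int>::value,
              "multiply(int, int) returns int");

// Hypothetical helper with the same shape as submit. std::async is used here
// only as a stand-in for the thread pool, which is an assumption of this sketch.
template <typename F, typename... Args>
auto run_async(F&& f, Args&&... args) -> std::future<decltype(f(args...))> {
    return std::async(std::launch::async,
                      std::forward<F>(f), std::forward<Args>(args)...);
}
```

A call such as run_async(multiply, 6, 7) returns a std::future<int>, and calling get() on it blocks until the product is available.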

Function body

// Create a function with bounded parameters ready to execute
std::function<decltype(f(args...))()> func = std::bind(std::forward<F>(f), std::forward<Args>(args)...);

There are many things happening here. First of all, std::bind(F, Args...) is a function that creates a wrapper for F with the given Args. Calling this wrapper is the same as calling F with the Args that have been bound to it. Here, we are simply calling bind with our generic function f and the parameter pack args, but using another wrapper, std::forward<T>(t), for each parameter. This second wrapper is needed to achieve perfect forwarding of universal references.

The result of this bind call is stored in a std::function. A std::function is a C++ object that encapsulates a function. It allows you to execute the function as if it were a normal function, calling operator() with the required parameters, but because it is an object, you can store it, copy it and move it around. The template type of any std::function is the signature of that function: std::function<return-type(arguments)>. In this case, we already know how to get the return type of this function using decltype. But what about the arguments? Well, because we bound all arguments args to the function f, we just have to add an empty pair of parentheses that represents an empty list of arguments: decltype(f(args...))().
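A minimal sketch of that binding step on its own, using hypothetical names (multiply_two and make_bound_multiply are not part of the project):

```cpp
#include <functional>

inline int multiply_two(int a, int b) { return a * b; }

// Bind both arguments up front, so the resulting callable takes none.
// The unspecified type returned by std::bind converts to std::function<int()>.
inline std::function<int()> make_bound_multiply(int a, int b) {
    return std::bind(multiply_two, a, b);
}
```

The returned object can be stored, copied and called later with no arguments, which is the property that submit relies on.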

// Encapsulate it into a shared ptr in order to be able to copy construct / assign 
auto task_ptr = std::make_shared<std::packaged_task<decltype(f(args...))()>>(func);

The next thing we do is create a std::packaged_task. A packaged_task is a wrapper around a function that can be executed asynchronously. Its result is stored in a shared state that can be read through a std::future object. The template type T of a std::packaged_task<T> is the signature of the function it wraps. As we said before, the signature of the bound function func is decltype(f(args...))(), which is therefore also the template type of the packaged_task. Then, we wrap this packaged task inside a std::shared_ptr using the helper function std::make_shared.
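The relationship between the task and its future can be observed without any threads. In this sketch (the function name run_packaged_task_demo is hypothetical) the task is invoked on the calling thread. The shared_ptr matters because std::packaged_task is move-only, whereas a lambda stored in a std::function must be copyable:

```cpp
#include <functional>
#include <future>
#include <memory>

// Runs a packaged task by hand and reads the result through its future.
inline int run_packaged_task_demo() {
    std::function<int()> func = [] { return 6 * 7; };

    // packaged_task is move-only, so it is held through a shared_ptr,
    // which can be copied into a lambda capture.
    auto task_ptr = std::make_shared<std::packaged_task<int()>>(func);

    std::future<int> future = task_ptr->get_future();

    // Invoking the task runs func and stores its result in the shared state.
    (*task_ptr)();

    // The future now holds the value, so get() returns immediately.
    return future.get();
}
```

In the pool, the only difference is that the invocation happens on a worker thread rather than on the caller's thread.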

// Wrap packaged task into void function
std::function<void()> wrapper_func = [task_ptr]() {
  (*task_ptr)(); 
};

Again, we create a std::function, but note that this time its template type is void(). Independently of the function f and its parameters args, the return type of this wrapper_func will always be void. Since different functions f may have different return types, the only way to store them in a container (our Queue) is to wrap them in a generic void function. Here, we are just declaring wrapper_func to execute the actual task task_ptr, which will execute the bound function func.
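To see why the void() wrapper makes a single container sufficient, here is a small single-threaded sketch. The push_task and drain helpers are hypothetical, and a plain std::queue stands in for the project's thread-safe queue:

```cpp
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical helper: wraps any nullary callable into a void() function,
// stores it in the queue and returns the future for its result.
template <typename F>
auto push_task(std::queue<std::function<void()>>& tasks, F f)
    -> std::future<decltype(f())> {
    auto task_ptr =
        std::make_shared<std::packaged_task<decltype(f())()>>(std::move(f));
    tasks.push([task_ptr]() { (*task_ptr)(); });
    return task_ptr->get_future();
}

// Runs everything that is queued, in FIFO order, on the calling thread.
inline void drain(std::queue<std::function<void()>>& tasks) {
    while (!tasks.empty()) {
        tasks.front()();
        tasks.pop();
    }
}
```

A task returning int and a task returning std::string can then sit in the same queue, and each caller still gets a correctly typed future back.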

// Enqueue generic wrapper function
m_queue.enqueue(wrapper_func);

We enqueue this wrapper_func.

// Wake up one thread if its waiting
m_conditional_lock.notify_one();

Before finishing, we wake up one thread in case it is waiting.

// Return future from promise
return task_ptr->get_future();

And finally, we return the future of the packaged_task. Because we are returning the future that is bound to the packaged_task task_ptr, which in turn is bound to the function func, executing task_ptr will automatically update the future. Because we wrapped the execution of task_ptr in a generic wrapper function, it is the execution of wrapper_func that, in fact, updates the future. And since we enqueued this wrapper function, it will be executed by a thread after being dequeued, by calling its operator().

Thread worker

Now that we understand how the submit method works, we're going to focus on how the work gets done. Probably, the simplest implementation of a thread worker could be using polling:

 Loop
	If Queue is not empty
		Dequeue work
		Do it

This looks alright but it's not very efficient. Do you see why? What would happen if there is no work in the Queue? The threads would keep looping and asking all the time: Is the queue empty?

The more sensible implementation is done by "sleeping" the threads until some work is added to the queue. As we saw before, as soon as we enqueue work, a signal notify_one() is sent. This allows us to implement a more efficient algorithm:

Loop
	If Queue is empty
		Wait signal
	Dequeue work
	Do it

This signal system is implemented in C++ with condition variables. Condition variables are always used together with a mutex, so I added a mutex to the thread pool class just to manage this. The final code of a worker looks like this:

void operator()() {
	std::function<void()> func;
	bool dequeued;
	while (!m_pool->m_shutdown) {
		{
			std::unique_lock<std::mutex> lock(m_pool->m_conditional_mutex);
			if (m_pool->m_queue.empty()) {
				m_pool->m_conditional_lock.wait(lock);
			}
			dequeued = m_pool->m_queue.dequeue(func);
		}
		if (dequeued) {
			func();
		}
	}
}

The code is really easy to understand so I am not going to explain anything. The only thing to note here is that func is our wrapper function, declared as:

std::function<void()> wrapper_func = [task_ptr]() {
  (*task_ptr)(); 
};

So, executing this function will automatically update the future.
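The wait-for-signal loop can also be written with the predicate overload of std::condition_variable::wait, which re-checks its condition after every wakeup. The following is a self-contained sketch of that variant, not the project's code: MiniPool, post and sum_with_pool are hypothetical names, and shutting down from the destructor is an assumption made to keep the example short:

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal stand-in for a pool. All names here are hypothetical.
class MiniPool {
public:
    explicit MiniPool(int n_threads) {
        for (int i = 0; i < n_threads; ++i) {
            m_threads.emplace_back([this] { worker_loop(); });
        }
    }

    // Signal shutdown, wake every worker and wait for them to finish.
    ~MiniPool() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_shutdown = true;
        }
        m_cv.notify_all();
        for (auto& t : m_threads) {
            t.join();
        }
    }

    // Queue a job and wake one sleeping worker.
    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_jobs.push(std::move(job));
        }
        m_cv.notify_one();
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                // Sleep until there is work or shutdown was requested.
                // The predicate is re-checked after every wakeup.
                m_cv.wait(lock, [this] { return m_shutdown || !m_jobs.empty(); });
                if (m_jobs.empty()) {
                    return; // shutdown requested and nothing left to do
                }
                job = std::move(m_jobs.front());
                m_jobs.pop();
            }
            job(); // run the job outside the lock
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::function<void()>> m_jobs;
    bool m_shutdown = false;
    std::vector<std::thread> m_threads; // declared last: starts after the rest
};

// Usage sketch: adds 1..n_jobs on the pool and returns the total.
inline int sum_with_pool(int n_threads, int n_jobs) {
    std::atomic<int> sum{0};
    {
        MiniPool pool(n_threads);
        for (int i = 1; i <= n_jobs; ++i) {
            pool.post([&sum, i] { sum += i; });
        }
    } // the destructor drains the queue and joins the workers
    return sum.load();
}
```

In this sketch the workers finish any queued jobs before exiting, so the sum is complete once the pool has been destroyed.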

Usage example

Creating the thread pool is as easy as:

// Create pool with 3 threads
ThreadPool pool(3);

// Initialize pool
pool.init();

When we want to shut down the pool, we just call:

// Shutdown the pool, releasing all threads
pool.shutdown();

If we want to send some work to the pool, after we have initialized it, we just have to call the submit function:

pool.submit(work);

Depending on the type of work, I've distinguished different use-cases. Suppose that the work that we have to do is multiply two numbers. We can do it in many different ways. I've implemented the three most common ways to do it that I can imagine:

Use-Case #1. Function returns the result
Use-Case #2. Function updates by ref parameter with the result
Use-Case #3. Function prints the result

Note: This is just to show how the submit function works. Options are not exclusive

Use-Case #1

The multiply function with a return looks like this:

// Simple function that multiplies two numbers and returns the result
int multiply(const int a, const int b) {
  const int res = a * b;
  return res;
}

Then, the submit:

// The type of future is given by the return type of the function
std::future<int> future = pool.submit(multiply, 2, 3);

We can also use the auto keyword for convenience:

auto future = pool.submit(multiply, 2, 3);

Nice, when the work is finished by the thread pool we know that the future will get updated and we can retrieve the result calling:

const int result = future.get();
std::cout << result << std::endl;

The get() function of std::future always returns the type T of the future. This type will always be equal to the return type of the function passed to the submit method. In this case, int.

Use-Case #2

The multiply function has a parameter passed by ref:

// Simple function that multiplies two numbers and updates the out_res variable passed by ref
void multiply(int& out_res, const int a, const int b) {
	out_res = a * b;
}

Now, we have to call the submit function with a subtle difference. Because we are using templates and type deduction (universal references), and because std::bind stores copies of its arguments, the parameter passed by ref needs to be wrapped with std::ref(param) to make sure that we are passing it by ref and not by value.

int result = 0;
auto future = pool.submit(multiply, std::ref(result), 2, 3);
// result is 0
future.get();
// result is 6
std::cout << result << std::endl;

In this case, what's the type of future? Well, as I said before, the type of the future will always match the return type of the function passed to the submit method. Because this function returns void, the future is a std::future<void>. Calling future.get() returns void. That's not very useful, but we still need to call .get() to make sure that the work has been done.
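The effect of std::ref can be checked without the pool, because submit hands its arguments to std::bind. The sketch below calls std::bind directly with a hypothetical multiply_into function and compares both variants:

```cpp
#include <functional>

inline void multiply_into(int& out_res, int a, int b) { out_res = a * b; }

// Without std::ref, bind stores a copy of `result`, so the function writes
// into that copy and the caller's variable keeps its old value.
inline int call_bound_by_value() {
    int result = 0;
    auto bound = std::bind(multiply_into, result, 2, 3);
    bound();
    return result;
}

// With std::ref, bind stores a reference wrapper, so the function writes
// into the caller's variable.
inline int call_bound_by_ref() {
    int result = 0;
    auto bound = std::bind(multiply_into, std::ref(result), 2, 3);
    bound();
    return result;
}
```

The first function returns 0 and the second returns 6. Note that with std::ref the referenced variable has to outlive the task, which in the pool means keeping it alive until future.get() has returned.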

Use-Case #3

The last case is the easiest one. Our multiply function has no output parameters and simply prints the result:

// Simple function that multiplies two numbers and prints the result
void multiply(const int a, const int b) {
  const int result = a * b;
  std::cout << result << std::endl;
}

Then, we can simply call:

auto future = pool.submit(multiply, 2, 3);
future.get();

In this case, we know that as soon as the multiplication is done it will be printed. If we care when this is done, we can wait for it calling future.get().

Check out the main program for a complete example.

Future work

Make it more reliable and safer (exceptions)
Find a better way to use it with member functions (thanks to @rajenk)
Run benchmarks and improve performance if needed
Evaluate performance and impact of std::function in the heap and try alternatives if necessary. (thanks to @JensMunkHansen)

References

MULTI-THREADED PROGRAMMING TERMINOLOGY - 2017: Fast analysis of how a multi-thread system works
Universal References in C++11 (Scott Meyers): Universal references in C++11 by Scott Meyers
Perfect forwarding and universal references in C++: Article about how and when to use perfect forwarding and universal references
C++ documentation: Thread, condition variables, mutex and many others...
