cat /dev/thoughts: Pthreads Overview

Here's a quick overview of the calls available to a programmer using Pthreads (both the standard Pthreads API and GNU extensions). It's intended to be as brief as possible while being (i) descriptive and (ii) not just a list of all the available calls. The links are to the relevant man pages on linux.die.net.

Note: The "NP" at the end of certain functions and macros stands for "non-portable". They are GNU extensions, accessed by e.g. #define-ing _GNU_SOURCE.

Thread creation and attributes
Thread operations

Fork/thread termination handlers
Thread termination and cancellation
Signal operations
Scheduling and CPU time/affinity

Thread-specific data and one-time-only operations
Locking mechanisms

Mutexes
Read/write locks
Spin locks

Memory barriers
Condition variables

Thread creation and attributes

A thread is created with pthread_create(). Thread attributes can be specified when it is created with a pthread_attr_t structure, which gets initialized with pthread_attr_init() and destroyed with pthread_attr_destroy(). A copy of a thread's current attributes can be obtained with the GNU-specific pthread_getattr_np().

The pthread_attr_t can have the following operations performed on it:

The thread's initial CPU affinity mask can be accessed/modified with pthread_attr_getaffinity_np() and pthread_attr_setaffinity_np().
Whether the thread is detached or joinable can be accessed/modified with pthread_attr_getdetachstate() and pthread_attr_setdetachstate().
The size of the guard area (area at the end of the stack that will generate a segmentation fault on access, to detect stack overflow) can be accessed/modified with pthread_attr_getguardsize() and pthread_attr_setguardsize() (but note that guard sizes are rounded up to page boundaries when applied).
Whether the thread gets its scheduling policy and parameters (see the next two points) by inheriting them from the current thread or explicitly from the values set in the attribute structure can be accessed/modified with pthread_attr_getinheritsched() and pthread_attr_setinheritsched().
The scheduling policy the new thread starts with (assuming it's been set to have this explicitly specified; see above) can be accessed/modified with pthread_attr_getschedpolicy() and pthread_attr_setschedpolicy().
The scheduling parameters (i.e. priority, unless more parameters are added in future) the new thread starts with (assuming it's been set to have this explicitly specified; see above) can be accessed/modified with pthread_attr_getschedparam() and pthread_attr_setschedparam().
Note: Only system-level contention is available on Linux. Whether the scheduling contention scope of the thread is at system-level (i.e. the thread's priority is compared with every other thread on the system when allocating CPU time) or process-level (i.e. the thread's priority is only compared with other threads in the process; the process itself must get CPU time before the scheduler decides if the thread deserves CPU time) can be accessed/modified by pthread_attr_getscope() and pthread_attr_setscope().
If a thread needs to run with a stack with a specific location and size, this can be accessed/modified with pthread_attr_getstack() and pthread_attr_setstack(). The size alone can be accessed/modified with pthread_attr_getstacksize() and pthread_attr_setstacksize(). (The functions for accessing and modifying the address are pthread_attr_getstackaddr() and pthread_attr_setstackaddr(), but they are obsolete).

Thread operations

A thread can get its own ID via pthread_self(). Two thread IDs can be tested to see if they refer to the same thread with pthread_equal() (don't just use ==).

Fork/thread termination handlers

You can specify functions to be executed immediately before and after a call to fork() (in both parent and child) with pthread_atfork(). This is useful in e.g. a threaded library, since the child process a fork() creates only consists of a single thread.

Thread clean-up handlers (analogous to atexit()/on_exit()) can be registered/unregistered with pthread_cleanup_push() and pthread_cleanup_pop(). They will be called if the thread is cancelled or it exits using pthread_exit() (but not if it just returns from the thread routine).

If you're worried about asynchronous cancellation between pushing/popping clean-up handlers (and who isn't!) you can use the GNU-specific pthread_cleanup_push_defer_np() and pthread_cleanup_pop_restore_np() instead, which (i) set the thread cancellation type to deferred (see Thread termination) and (ii) restore the previous thread cancellation type, respectively.

Thread termination and cancellation

A thread can exit from a thread at any time using pthread_exit().

A thread that is joinable can be joined with pthread_join(), which will block until the thread has finished. A thread can be moved to the detached state with pthread_detach() (the reverse cannot be done). Normally, an attempt to join will block until the thread has finished, but this blocking behaviour can be avoided with the GNU-specific pthread_tryjoin_np() and pthread_timedjoin_np().

One primitive way of terminating a thread is to cancel it with pthread_cancel(). Whether the thread is able to be cancelled in this way can be controlled with pthread_setcancelstate(), and whether such a cancellation is deferred or delivered at a cancellation point can be controlled with pthread_setcanceltype(). Within a thread, an "artificial" cancellation point can be generated by calling pthread_testcancel().

Signal operations

A signal can be sent to a specific thread using pthread_kill() or pthread_sigqueue(), analogous to the normal kill() and sigqueue functions.

If you're using LinuxThreads (you'll probably know if you are; most systems are using the Native POSIX Thread Library now, for which this function is irrelevant) you can use pthread_kill_other_threads_np() to send SIGKILL to all other threads, usually to correct a deficiency whereby other threads are not terminated when calling one of the exec() family of functions.

The signal mask for a thread can be accessed/modified with pthread_sigmask(), which uses a sigset_t that can be manipulated with the usual sigsetops.

Scheduling and CPU time/affinity

The clock_t of a specific thread (e.g. for a call to clock_gettime()) can be accessed with pthread_getcpuclockid().

A thread's scheduling policy and parameters (i.e. priority) can be accessed/modified using pthread_setschedparam() and pthread_getschedparam(). The priority alone can be set using pthread_setschedprio(). The CPU can be released with pthread_yield(), which is analogous to sched_yield() but for an individual thread.

Note: This is irrelevant on Linux, and is ignored. The concurrency (a hint to the system about how many lower-level entities (e.g. kernel threads) to assign) of a process can be accessed/modified with pthread_getconcurrency() and pthread_setconcurrency().

The CPU affinity of a thread can be accessed/modified after creation with the GNU-specific pthread_getaffinity_np() and pthread_setaffinity_np().

Thread-specific data and one-time-only operations

Once-only initialization can be achieved using pthread_once(). This requires a pthread_once_t which is initialized by assignment from the PTHREAD_ONCE_INIT macro.

Thread-local data is provided through a pthread_key_t, which is initialized/freed using pthread_key_create() and pthread_key_delete(). The value can then be accessed/modified using pthread_getspecific() and pthread_setspecific().

Locking mechanisms

Pthreads provides three types of locking mechanism - mutexes, read/write locks and spin locks.

Mutexes

These are the most common types of lock, as well as the most flexible. They are also the only locks that can be used with condition variables.

A mutex is initialized/freed with pthread_mutex_init() and pthread_mutex_destroy(). You can also initialize a statically allocated mutex using the PTHREAD_MUTEX_INITIALIZER macro or, as GNU extensions, PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP or PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP.

Mutexes can optionally be initialized with attributes in the form of a pthread_mutexattr_t, initialized and freed using pthread_mutexattr_init() and pthread_mutexattr_destroy(). The pthread_mutexattr_t can have the following operations performed on it:

The type of mutex used can be accessed/modified using pthread_mutexattr_gettype() and pthread_mutexattr_settype(). A mutex can be normal, error-checking (re-locking the mutex from the owning thread is an error) or recursive (relocking the mutex from the owning thread is fine).
Whether a mutex can be used from multiple processes can be accessed/modified using pthread_mutexattr_getpshared() and pthread_mutexattr_setpshared().
To prevent priority inversion, the behaviour of a thread's scheduling while a mutex is locked and/or blocking higher priority threads can be accessed/modified with pthread_mutexattr_getprotocol() and pthread_mutexattr_setprotocol(). The priority ceiling of a mutex can be accessed/modified with pthread_mutexattr_getprioceiling() and pthread_mutexattr_setprioceiling().

The priority ceiling of a thread can be accessed/modified after creation using pthread_mutex_getprioceiling() and pthread_mutex_setprioceiling(). Mutexes can be locked using any of pthread_mutex_lock(), pthread_mutex_timedlock() or pthread_mutex_trylock(). They can be unlocked with pthread_mutex_unlock().

Read/write locks

Read/write locks can provide performance benefits over mutexes since they allow any number of threads to access the protected resource with "read-only" behaviour, while granting exclusive access to threads that want to be able to modify the resource. They have some disadvantages over mutexes though, namely that they can't be configured to avoid priority inversion and they can't be used with condition variables.

A read/write lock is initialized/freed with pthread_rwlock_init() and pthread_rwlock_destroy(). You can initialize a statically allocated read/write lock with PTHREAD_RWLOCK_INITIALIZER or, as a GNU extension, PTHREAD_RWLOCK_WRITER_NONRECURSIVE_INITIALIZER_NP.

Read/write locks can optionally be initialized with attributes in the form of a pthread_rwlockattr_t, initialized and freed using pthread_rwlockattr_init() and pthread_rwlockattr_destroy().

Currently, the only attribute a read/write lock can be given is whether it can be used by multiple processes. This setting can be accessed/modified with pthread_rwlockattr_getpshared() and pthread_rwlockattr_setpshared().

A read lock can be obtained using one of pthread_rwlock_rdlock(), pthread_rwlock_timedrdlock() or pthread_rwlock_tryrdlock(). The corresponding functions for obtaining the write lock are pthread_rwlock_timedwrlock(), pthread_rwlock_trywrlock() and pthread_rwlock_wrlock(). The lock can be unlocked from either type of lock with pthread_rwlock_unlock().

Spin locks

These are the simplest form of locks, and are generally only used when you know that other threads will only be holding the locks for a short amount of time (e.g. not whilst performing blocking I/O).

A spin lock is initialized/freed with pthread_spin_init() andpthread_spin_destroy(). There is no associated attributes structure (but they can be made available to other processes upon initialization).

The functions for locking/unlocking a spin lock are pthread_spin_lock(), pthread_spin_trylock() and pthread_spin_unlock().

Memory barriers

These allow threads to synchronize at a pre-defined point of execution.

They are initialized and freed using pthread_barrier_init() and pthread_barrier_destroy().

A memory barrier can optionally be initialized with attributes in the form of a pthread_barrierattr_t, initialized and freed using pthread_barrierattr_init() and pthread_barrierattr_destroy().

Currently, the only attribute a memory barrier can be given is whether it can be used by multiple processes. This property can be accessed/modified using pthread_barrierattr_getpshared() andpthread_barrierattr_setpshared().

Once created, threads can synchronize at a barrier using pthread_barrier_wait(). There is no such thing as a "trywait" or "timedwait" function.

Condition variables

Condition variables can be used to signal changes in an application's state to other threads. They must be used with an associated pthread_mutex_t.

Condition variables are initialized/freed using pthread_cond_init() and pthread_cond_destroy().

A condition variable can optionally be initialized with attributes in the form of a pthread_condattr_t, initialized and freed using pthread_condattr_destroy() and pthread_condattr_init(). You can initialize a statically allocated condition variable using PTHREAD_COND_INITIALIZER.

Whether the condition variable is able to be used by multiple processes can be accessed/modified using pthread_condattr_getpshared() and pthread_condattr_setpshared().

The clock used (identified by a clockid_t) for timed waits on a condition variable can be accessed/modified using pthread_condattr_getclock() and pthread_condattr_setclock() (though it cannot be set to a CPU clock).

Threads can wait for changes in conditions using pthread_cond_timedwait() and pthread_cond_wait().

Threads can signal changes in conditions using pthread_cond_broadcast() and pthread_cond_signal().

cat /dev/thoughts

Thursday, 18 October 2012

Pthreads Overview