Parallel Programming With Pthreads

Threads and Concurrency
Michael Ibrahim
Lesson Preview
• What is a thread?
• How are threads different from processes?
• What data structures are used to implement and manage threads?
• POSIX Thread API

• Thread creation, termination and joining
• Thread safety
• Synchronization primitives in PThreads
Process Review
What if we have multiple CPUs?
Visual Metaphor
A thread is like a … worker
A worker:
• Is an active entity
• Executing unit of product order
• Works simultaneously with others
• Many workers completing product orders
• Requires coordination
• Sharing of tools, parts, workstations
A thread:
• Is an active entity
• Executing unit of a process
• Works simultaneously with others
• Many threads executing
• Requires coordination
• Sharing I/O devices, CPUs, memory, …
Process vs. Thread
Why are threads useful?
• Parallelization => Speedup

• Specialization => Hot cache
• Efficiency => lower memory requirement and cheaper IPC
Are threads useful on single CPU?
• If t_idle > 2 * t_ctx_switch, then context switch to hide idling time
• t_ctx_switch (thread) < t_ctx_switch (process)
Benefits to Applications and OS code
Process vs. Thread Quiz
Do the following statements apply to processes, threads, or both?
Can share a virtual address space
Take longer to context switch
Have an execution context
Usually result in hotter caches when multiple exist
Make use of some communication mechanism
What do we need to support threads?
• Thread data structure
• Identify threads, keep track of resource usage, …
• Mechanism to create and manage threads
• Mechanisms to safely coordinate among threads running

concurrently in the same address space
Threads and concurrency
Processes Threads
Concurrency Control & Coordination
Synchronization mechanisms:
• Mutual Exclusion
• Exclusive access for only one thread at a time
• Mutex
• Condition variable
• Waiting for a specific condition, established by other threads, before proceeding
• Waking up other threads from wait state
Threads and Thread Creation
• Thread type
• Thread data structure
• Fork (proc, args)

• Create a thread
• Not UNIX fork
• Join (thread)
• Terminate a thread
Thread Creation Example
Mutual Exclusion
Making safe_insert safe
Producer/Consumer Example
• What if the process you wish to perform with mutual exclusion needs
to occur only under certain conditions?
Producer/Consumer Pseudocode
Condition Variable
Condition Variable API
• Condition type
• Wait(mutex, cond)
• Mutex is automatically released and re-acquired on wait
• Signal (cond)
• Notify only one thread waiting on condition
• Broadcast(cond)
• Notify all waiting threads
Condition Variable Quiz
Recall the consumer code from
the previous example for condition
variable.
Instead of ‘while’, why did we not
simply use ‘if’?
‘While’ can support multiple consumer threads?
Cannot guarantee access to m once the condition is signaled?
The list can change before the consumer gets access again?
All of the above
Avoiding Common Mistakes
• Keep track of the mutex/condition variable used with a resource
• E.g., mutex_type m1; // mutex for var1
• Check that you are always (and correctly) using lock & unlock
• E.g., did you forget to lock/unlock?
• Use a single mutex to access a single resource
• With different mutexes for the same resource, operations can still occur concurrently on the shared resource
• Check that you are signaling the correct condition
• Check that you are not using signal when broadcast is needed
• With signal, only one thread will proceed … remaining threads will continue to wait
Spurious Wake-Ups
• A spurious wakeup happens when a thread wakes up from waiting on a condition variable that's been signaled, only to discover that the condition it was waiting for isn't satisfied.
• They usually happen because, in between the time when the condition variable was signaled and when the waiting thread finally ran, another thread ran and changed the condition.
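A minimal sketch of the usual guard against such wake-ups: the waiter re-tests its predicate in a while loop every time pthread_cond_wait returns. The names lock, cond, ready, observed, waiter and run_wait_demo are hypothetical and exist only for this illustration.

```c
#include <pthread.h>

/* Hypothetical shared state for illustration. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;      /* the predicate the waiter cares about */
static int observed = -1;  /* value of ready seen by the waiter after waking */

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    /* 'while', not 'if': after every wake-up the predicate is tested again,
       so a wake-up with ready == 0 simply goes back to waiting. */
    while (!ready)
        pthread_cond_wait(&cond, &lock);
    observed = ready;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts one waiter, sets the predicate, signals, and returns what the waiter saw. */
int run_wait_demo(void) {
    pthread_t t;
    ready = 0;
    observed = -1;
    pthread_create(&t, NULL, waiter, NULL);

    pthread_mutex_lock(&lock);
    ready = 1;                    /* change the predicate under the mutex */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);
    return observed;
}
```

Calling run_wait_demo() from main should return 1 regardless of whether the signal arrives before or after the waiter starts waiting, because the predicate is checked before the first wait as well.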
Deadlocks
• A deadlock occurs when two or more competing threads are waiting on each other to complete, but none of them ever does.
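One common way to avoid this situation is to make every thread acquire locks in the same global order. The sketch below assumes a hypothetical account_t type and transfer function; ordering by address is one possible convention, not the only one.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical resource with its own mutex. */
typedef struct {
    pthread_mutex_t lock;
    long balance;
} account_t;

static account_t acc_a = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static account_t acc_b = { PTHREAD_MUTEX_INITIALIZER, 1000 };

static void transfer(account_t *from, account_t *to, long amount) {
    /* Always lock the account with the lower address first, so two
       opposite transfers can never each hold one lock and wait for the other. */
    account_t *first  = ((uintptr_t)from < (uintptr_t)to) ? from : to;
    account_t *second = (first == from) ? to : from;
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    from->balance -= amount;
    to->balance   += amount;
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}

static void *a_to_b(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++)
        transfer(&acc_a, &acc_b, 1);
    return NULL;
}

static void *b_to_a(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++)
        transfer(&acc_b, &acc_a, 1);
    return NULL;
}

/* Runs both directions concurrently and returns the combined balance. */
long run_transfer_demo(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, a_to_b, NULL);
    pthread_create(&t2, NULL, b_to_a, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return acc_a.balance + acc_b.balance;
}
```

If transfer instead locked from first and to second, the two threads could each take one lock and wait forever for the other.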
Interrupts vs. Signals
Interrupts:
• Events generated externally by components other than the CPU (I/O devices, timers, other CPUs)
• Determined based on the physical platform
• Appear asynchronously
Signals:
• Events triggered by the CPU & software running on it
• Determined based on the operating system
• Appear synchronously or asynchronously
Interrupts
Interrupt Handler Table

Signals
Signal Handler Table

Kernel vs. User-level Threads
What is the relationship between the user and the kernel level threads?
One-to-One Model
+ OS sees/understands threads, synchronization, blocking, …
- Must go to OS for all operations (may be expensive)
- OS may impose limits on policies, number of threads
- Portability
Many-to-One Model
+ Totally portable, doesn't depend on OS limits and policies
- OS has no insights into application needs
- OS may block entire process if one user level thread blocks on I/O
Many-to-Many Model
+ Can be the best of both worlds
- Requires coordination between user and kernel-level thread managers
POSIX threads (PThreads)
• Threads used to implement parallelism in shared
memory multiprocessor systems, such as SMPs
• Historically, hardware vendors have implemented their
own proprietary versions of threads
– Portability a concern for software developers.
• For UNIX systems, a standardized C language threads
programming interface has been specified by the IEEE
POSIX 1003.1c standard.
– Implementations that adhere to this standard are referred to
as POSIX threads
The POSIX Thread API
• Commonly referred to as PThreads, POSIX has emerged as the
standard threads API, supported by most vendors.
– Implemented with a pthread.h header/include file and a thread
library
• Functionalities
– Thread management, e.g. creation and joining
– Thread synchronization primitives
• Mutex
• Condition variables
• Reader/writer locks
• Pthread barrier
– Thread-specific data
• The concepts discussed here are largely independent of the API

– Applied to other thread APIs (NT threads, Solaris threads, Java
threads, etc.) as well.
PThread API
• #include <pthread.h>
• Compile and link with gcc -pthread (or add -lpthread when linking)
Thread Creation
• Initially, main() program comprises a single, default thread
– All other threads must be explicitly created
int pthread_create(
pthread_t *thread,
const pthread_attr_t *attr,
void *(*start_routine)(void *),
void * arg);
• thread: An opaque, unique identifier for the new thread, returned by the subroutine
• attr: An opaque attribute object that may be used to set thread attributes. You can specify a thread attributes object, or NULL for the default values
• start_routine: the C routine that the thread will execute once it is created
• arg: A single argument that may be passed to start_routine. It must be passed by reference as a pointer cast to void *. NULL may be used if no argument is to be passed.
Opaque object: A letter is an opaque object to the mailman; only the sender and receiver know the information inside.
Thread Creation
• pthread_create creates a new thread and makes it runnable, i.e. in theory it can start running immediately
– can be called any number of times from anywhere within your code
• Once created, threads are peers, and may create other threads
• There is no implied hierarchy or dependency between threads
Example 1: pthread_create
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5

void *PrintHello(void *thread_id) {
    long tid = (long)thread_id;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}

int main(int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    long t;
    for (t = 0; t < NUM_THREADS; t++) {
        printf("In main: creating thread %ld\n", t);
        int rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
        if (rc) {
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
    }
    pthread_exit(NULL);
}

One possible output:
In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
In main: creating thread 3
Hello World! It's me, thread #0!
In main: creating thread 4
Hello World! It's me, thread #1!
Hello World! It's me, thread #3!
Hello World! It's me, thread #2!
Hello World! It's me, thread #4!
Terminating Threads
• pthread_exit is used to explicitly exit a thread
– Called after a thread has completed its work and is no longer
required to exist
• If main() finishes before the threads it has created:
– If it exits with pthread_exit(), the other threads will continue to execute
– Otherwise, they will be automatically terminated when main() finishes
• The programmer may optionally specify a termination status, which is stored as a void pointer for any thread that may join the calling thread
• Cleanup: the pthread_exit() routine does not close files
– Any files opened inside the thread will remain open after the thread is terminated
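As a minimal sketch of the termination status mentioned above, the worker below hands a heap-allocated result to pthread_exit and the creator retrieves it with pthread_join. The names square_worker and square_in_thread are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical worker: squares its input and hands the result back. */
static void *square_worker(void *arg) {
    long *result = malloc(sizeof *result);
    if (result == NULL)
        pthread_exit(NULL);
    long n = *(long *)arg;
    *result = n * n;
    /* The status pointer must outlive the thread, so it points to heap
       memory rather than to a local variable on this thread's stack. */
    pthread_exit(result);
}

/* Returns n*n computed in another thread, or -1 on failure. */
long square_in_thread(long n) {
    pthread_t t;
    void *status = NULL;
    if (pthread_create(&t, NULL, square_worker, &n) != 0)
        return -1;
    if (pthread_join(t, &status) != 0 || status == NULL)
        return -1;
    long value = *(long *)status;
    free(status);   /* the joining thread owns the result now */
    return value;
}
```

Passing &n is safe here only because the caller blocks in pthread_join until the worker has finished reading it.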
Thread Attribute
int pthread_create(
pthread_t *thread,
const pthread_attr_t *attr,
void *(*start_routine)(void *),
void * arg);
• Attribute contains details about

– whether scheduling policy is inherited or explicit
– scheduling policy, scheduling priority
– stack size, stack guard region size
• pthread_attr_init and pthread_attr_destroy are used

to initialize/destroy the thread attribute object
• Other routines are then used to query/set specific attributes in the
thread attribute object
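The attribute routines above might be used as in the following sketch, which sets an explicit stack size and the joinable state before creating a thread. The helper names noop and create_with_stack_size are hypothetical; the requested size is assumed to be at least the platform minimum.

```c
#include <pthread.h>
#include <stddef.h>

static void *noop(void *arg) { return arg; }

/* Creates a joinable thread with an explicit stack size.
   Returns the stack size stored in the attribute object, or 0 on error. */
size_t create_with_stack_size(size_t requested) {
    pthread_attr_t attr;
    pthread_t t;
    size_t actual = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_setstacksize(&attr, requested) != 0 ||
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE) != 0 ||
        pthread_attr_getstacksize(&attr, &actual) != 0 ||
        pthread_create(&t, &attr, noop, NULL) != 0) {
        pthread_attr_destroy(&attr);
        return 0;
    }
    /* The attribute object is only read at creation time and can be destroyed afterwards. */
    pthread_attr_destroy(&attr);
    pthread_join(t, NULL);
    return actual;
}
```

For example, create_with_stack_size((size_t)1 << 20) requests a 1 MiB stack.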
Passing Arguments to Threads
• The pthread_create() routine permits the programmer to
pass one argument to the thread start routine
• For cases where multiple arguments must be passed:
– Create a structure which contains all of the arguments
– Then pass a pointer to the object of that structure in the
pthread_create()routine.
– All arguments must be passed by reference and cast to (void *)
• Make sure that all passed data is thread safe (no data races):
– It cannot be changed by other threads, or
– It is changed only in a deterministic way, through thread coordination
Example 2: Argument Passing
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#define NUM_THREADS 8

struct thread_data {
    int thread_id;
    char *message;
};
struct thread_data thread_data_array[NUM_THREADS];

void *PrintHello(void *threadarg) {
    int taskid;
    char *hello_msg;
    sleep(1);
    struct thread_data *my_data = (struct thread_data *) threadarg;
    taskid = my_data->thread_id;
    hello_msg = my_data->message;
    printf("Thread %d: %s\n", taskid, hello_msg);
    pthread_exit(NULL);
}
Example 2: Argument Passing
int main(int argc, char *argv[]) {
    pthread_t threads[NUM_THREADS];
    int t;
    char *messages[NUM_THREADS];
    messages[0] = "English: Hello World!";
    messages[1] = "French: Bonjour, le monde!";
    messages[2] = "Spanish: Hola al mundo";
    messages[3] = "Klingon: Nuq neH!";
    messages[4] = "German: Guten Tag, Welt!";
    messages[5] = "Russian: Zdravstvytye, mir!";
    messages[6] = "Japan: Sekai e konnichiwa!";
    messages[7] = "Latin: Orbis, te saluto!";
    for (t = 0; t < NUM_THREADS; t++) {
        struct thread_data *thread_arg = &thread_data_array[t];
        thread_arg->thread_id = t;
        thread_arg->message = messages[t];
        pthread_create(&threads[t], NULL, PrintHello, (void *) thread_arg);
    }
    pthread_exit(NULL);
}

One possible output:
Thread 3: Klingon: Nuq neH!
Thread 0: English: Hello World!
Thread 1: French: Bonjour, le monde!
Thread 2: Spanish: Hola al mundo
Thread 5: Russian: Zdravstvytye, mir!
Thread 4: German: Guten Tag, Welt!
Thread 6: Japan: Sekai e konnichiwa!
Thread 7: Latin: Orbis, te saluto!
Wait for Thread Termination
Suspend execution of the calling thread until the target thread terminates
int pthread_join(
    pthread_t thread,
    void **value_ptr);
• thread: the thread to wait for
• value_ptr: pointer to a location that receives the value the terminating thread passed to pthread_exit
• It is a logical error to attempt simultaneous multiple joins on the same thread

Example 3: PThreads Joining
#include <pthread.h>
#include <stdio.h>
#include <math.h>
#define NUM_THREADS 4

void *BusyWork(void *t) {
    int i;
    long tid = (long)t;
    double result = 0.0;
    printf("Thread %ld starting...\n", tid);
    for (i = 0; i < 1000000; i++) {
        result = result + sin(i) * tan(i);
    }
    printf("Thread %ld done. Result = %e\n", tid, result);
    pthread_exit((void *) t);
}
Example 3: PThreads joining
int main(int argc, char *argv[])
{
    pthread_t thread[NUM_THREADS];
    pthread_attr_t attr;
    long t;
    void *status;

    /* Initialize the attribute object and explicitly mark threads joinable */
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    for (t = 0; t < NUM_THREADS; t++) {
        printf("Main: creating thread %ld\n", t);
        pthread_create(&thread[t], &attr, BusyWork, (void *)t);
    }
    /* Free attribute and wait for the other threads */
    pthread_attr_destroy(&attr);
    for (t = 0; t < NUM_THREADS; t++) {
        pthread_join(thread[t], &status);
        printf("Main: joined with thread %ld, status: %ld\n", t, (long)status);
    }
    printf("Main: program completed. Exiting.\n");
    pthread_exit(NULL);
}

One possible output:
Main: creating thread 0
Main: creating thread 1
Thread 0 starting...
Main: creating thread 2
Thread 1 starting...
Main: creating thread 3
Thread 2 starting...
Thread 3 starting...
Thread 1 done. Result = -3.153838e+06
Thread 0 done. Result = -3.153838e+06
Main: joined with thread 0, status: 0
Main: joined with thread 1, status: 1
Thread 2 done. Result = -3.153838e+06
Main: joined with thread 2, status: 2
Thread 3 done. Result = -3.153838e+06
Main: joined with thread 3, status: 3
Main: program completed. Exiting.
Thread Consequences
• Shared State!
– Accidental changes to global variables can be fatal.
– Changes made by one thread to shared system resources (such as
closing a file) will be seen by all other threads
– Two pointers having the same value point to the same data
– Reading and writing to the same memory locations is possible
– Therefore requires explicit synchronization by the programmer
• Many library functions are not thread-safe
– Library Functions that return pointers to static internal memory. E.g.
gethostbyname()
• Lack of robustness
– Crash in one thread will crash the entire process
Thread-safeness
• Thread-safeness: in a nutshell, refers to an application's ability to execute multiple threads simultaneously without "clobbering" shared data or creating "race" conditions
• Example: an application creates several threads, each of which makes a call to the same library routine:
– This library routine accesses/modifies a global structure or location in memory.
– As each thread calls this routine it is possible that they may try to modify this global structure/memory location at the same time.
– If the routine does not employ some sort of synchronization constructs to prevent data corruption, then it is not thread-safe.
Thread-safeness
The implication to users of external library routines:
• If you aren't 100% certain the routine is thread-safe, then you take your chances with problems that could arise.
• Recommendation
– Be careful if your application uses libraries or other objects that
don't explicitly guarantee thread-safeness.
– When in doubt, assume that they are not thread-safe until
proven otherwise
– This can be done by "serializing" the calls to the uncertain
routine, etc.
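"Serializing" the calls can be sketched as a mutex-protected wrapper around the uncertain routine. Here unsafe_next_id is a hypothetical stand-in for a library routine that keeps static internal state; safe_next_id, id_worker and run_id_demo are likewise illustrative names.

```c
#include <pthread.h>

/* Hypothetical non-thread-safe routine: updates static state without synchronization. */
static long unsafe_next_id(void) {
    static long last_id = 0;
    last_id = last_id + 1;
    return last_id;
}

static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread-safe wrapper: only one thread at a time may be inside unsafe_next_id(). */
long safe_next_id(void) {
    pthread_mutex_lock(&id_lock);
    long id = unsafe_next_id();
    pthread_mutex_unlock(&id_lock);
    return id;
}

static void *id_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++)
        safe_next_id();
    return NULL;
}

/* Runs 4 workers (40000 calls in total) and returns the next id handed out afterwards. */
long run_id_demo(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, id_worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return safe_next_id();
}
```

The wrapper only helps if every caller goes through it; a single direct call to the unsafe routine reintroduces the race.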
Why PThreads (not processes)?
• The primary motivation
– To realize potential program performance gains
• Compared to the cost of creating and managing a process
– A thread can be created with much less OS overhead
• Managing threads requires fewer system resources than
managing processes
• All threads within a process share the same address space
• Inter-thread communication is more efficient and, in many cases,
easier to use than inter-process communication
pthread_create vs fork
• Timing results for the fork() subroutine and the pthread_create() subroutine
– Timings reflect 50,000 process/thread creations
– units are in seconds
– no optimization flags
Why pthreads
• Potential performance gains and practical advantages over non-threaded applications:
– Overlapping CPU work with I/O
• For example, a program may have sections where it is performing a long I/O operation
• While one thread is waiting for an I/O system call to complete, CPU intensive work can be performed by other threads.
– Asynchronous event handling
• Tasks which service events of indeterminate frequency and duration can be interleaved
• For example, a web server can both transfer data from previous requests and manage the arrival of new requests.
AXPY with PThreads
• y = α·x + y
– x and y are vectors of size N
• In C, x[N], y[N]
– α is scalar
• Decomposition and mapping to pthreads
– A task will be mapped to a pthread
AXPY with PThreads
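One possible decomposition, shown as a sketch rather than the lecture's own code: the index range is split into contiguous chunks and each chunk is one task run by one pthread. The names axpy_task_t, axpy_task and axpy_parallel, and the choice of 4 threads, are assumptions made for illustration.

```c
#include <pthread.h>

#define NUM_THREADS 4   /* assumed thread count for this sketch */

/* Hypothetical task descriptor: one contiguous slice of the vectors. */
typedef struct {
    double alpha;
    const double *x;
    double *y;
    int begin;   /* first index of the slice */
    int end;     /* one past the last index */
} axpy_task_t;

static void *axpy_task(void *arg) {
    axpy_task_t *task = (axpy_task_t *)arg;
    for (int i = task->begin; i < task->end; i++)
        task->y[i] = task->alpha * task->x[i] + task->y[i];
    return NULL;
}

/* Computes y = alpha * x + y for vectors of length n using NUM_THREADS threads. */
void axpy_parallel(double alpha, const double *x, double *y, int n) {
    pthread_t threads[NUM_THREADS];
    axpy_task_t tasks[NUM_THREADS];
    int chunk = (n + NUM_THREADS - 1) / NUM_THREADS;   /* ceiling division */

    for (int t = 0; t < NUM_THREADS; t++) {
        int begin = t * chunk;
        int end = (begin + chunk < n) ? begin + chunk : n;
        tasks[t] = (axpy_task_t){ alpha, x, y, begin, end };
        pthread_create(&threads[t], NULL, axpy_task, &tasks[t]);
    }
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);
}
```

No mutex is needed here because each thread writes a disjoint slice of y and x is only read.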
Data Racing in a Multithread Program
Consider:
/* each thread to update shared variable
best_cost */
if (my_cost < best_cost)
best_cost = my_cost;
– two threads,
– the initial value of best_cost is 100,
– the values of my_cost are 50 and 75 for threads t1 and t2
T1                                   T2
if (my_cost (50) < best_cost)        if (my_cost (75) < best_cost)
    best_cost = my_cost;                 best_cost = my_cost;
• The value of best_cost could be 50 or 75!
• The value 75 does not correspond to any serialization of the two threads.
Critical Section and Mutual Exclusion
• Critical section = a segment that must be executed by
only one thread at any time
if (my_cost < best_cost)
best_cost = my_cost;
• Mutex locks protect critical sections in Pthreads

– locked and unlocked
– At any point of time, only one thread can acquire a mutex lock
• Using mutex locks

– request lock before executing critical section
– enter critical section when lock granted
– release lock when leaving critical section
Mutual Exclusion using Pthread Mutex
int pthread_mutex_lock (pthread_mutex_t *mutex_lock);
int pthread_mutex_unlock (pthread_mutex_t *mutex_lock);
int pthread_mutex_init (pthread_mutex_t *mutex_lock,
const pthread_mutexattr_t *lock_attr);
pthread_mutex_t cost_lock;

int main() {
    ...
    pthread_mutex_init(&cost_lock, NULL);
    pthread_create(&thhandle, NULL, find_best, …)
    ...
}
void *find_best(void *list_ptr) {
    ...
    pthread_mutex_lock(&cost_lock);     // enter CS
    if (my_cost < best_cost)            // critical section
        best_cost = my_cost;
    pthread_mutex_unlock(&cost_lock);   // leave CS
}
• pthread_mutex_lock blocks the calling thread if another thread holds the lock
• When the pthread_mutex_lock call returns:
1. Mutex is locked, enter CS
2. Any other locking attempt (call to pthread_mutex_lock) will block the calling thread
• When pthread_mutex_unlock returns:
1. Mutex is unlocked, leave CS
2. One thread that is blocked on a pthread_mutex_lock call will acquire the lock and enter CS
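A complete version of this pattern might look like the sketch below, which reuses the values from the data race example (initial best_cost of 100, thread costs 50 and 75). The driver run_best_cost_demo is a hypothetical name, and the static initializer is used in place of pthread_mutex_init for brevity.

```c
#include <pthread.h>

static pthread_mutex_t cost_lock = PTHREAD_MUTEX_INITIALIZER;
static int best_cost = 100;

static void *find_best(void *arg) {
    int my_cost = *(int *)arg;
    pthread_mutex_lock(&cost_lock);     /* enter critical section */
    if (my_cost < best_cost)            /* test and update are now one atomic step */
        best_cost = my_cost;
    pthread_mutex_unlock(&cost_lock);   /* leave critical section */
    return NULL;
}

/* Runs two threads with costs 50 and 75 and returns the final best_cost. */
int run_best_cost_demo(void) {
    int costs[2] = { 50, 75 };
    pthread_t t[2];
    best_cost = 100;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, find_best, &costs[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return best_cost;
}
```

With the lock in place the result is 50 for either execution order of the two threads.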
Overheads of Locking
• Locks enforce serialization
– Threads must execute critical sections one after another
• Large critical sections can lead to significant performance
degradation.
• Reduce the blocking overhead associated with locks using:
int pthread_mutex_trylock (
pthread_mutex_t *mutex_lock);
– acquire lock if available

– return EBUSY if not available
– enables a thread to do something else if lock unavailable
• pthread_mutex_trylock is typically much faster than pthread_mutex_lock on certain systems
– It does not have to deal with queues associated with locks for multiple threads waiting on the lock.
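A small sketch of the trylock pattern: the worker attempts the lock and, on EBUSY, would go on to do other work instead of blocking. The names work_lock, try_worker and run_trylock_demo are hypothetical.

```c
#include <pthread.h>
#include <errno.h>

static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;
static int trylock_result = -1;   /* what the worker's trylock call returned */

static void *try_worker(void *arg) {
    (void)arg;
    trylock_result = pthread_mutex_trylock(&work_lock);
    if (trylock_result == 0) {
        /* Got the lock: the critical section would go here. */
        pthread_mutex_unlock(&work_lock);
    } else if (trylock_result == EBUSY) {
        /* Lock is held elsewhere: do something else instead of blocking. */
    }
    return NULL;
}

/* If hold_lock is non-zero, the caller holds the mutex while the worker tries it.
   Returns the worker's trylock result. */
int run_trylock_demo(int hold_lock) {
    pthread_t t;
    if (hold_lock)
        pthread_mutex_lock(&work_lock);
    pthread_create(&t, NULL, try_worker, NULL);
    pthread_join(t, NULL);
    if (hold_lock)
        pthread_mutex_unlock(&work_lock);
    return trylock_result;
}
```

run_trylock_demo(1) should yield EBUSY and run_trylock_demo(0) should yield 0.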
Condition Variables for Synchronization
A condition variable: associated with a predicate and a mutex
– A sync variable for a condition, e.g. mybalance > 500
• A thread can block itself until a condition becomes true

– When blocked, release mutex so others can acquire it
• When a condition becomes true, observed by another
thread, the condition variable is used to signal other
threads who are blocked
• A condition variable always has a mutex associated with
it.
– A thread locks this mutex and tests the condition
Condition Variables for Synchronization
/* the opaque data structure */
pthread_cond_t
/* initialization and destroying */

int pthread_cond_init(pthread_cond_t *cond,
const pthread_condattr_t *attr);
int pthread_cond_destroy(pthread_cond_t *cond);
/* block and release lock until a condition is true */

int pthread_cond_wait(pthread_cond_t *cond,
pthread_mutex_t *mutex);
int pthread_cond_timedwait(pthread_cond_t *cond,
pthread_mutex_t *mutex, const struct timespec *wtime);
/* signal one or all waiting threads that condition is true */

int pthread_cond_signal(pthread_cond_t *cond);
int pthread_cond_broadcast(pthread_cond_t *cond);
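Unlike pthread_cond_wait, the timed variant takes an absolute deadline, not a relative timeout. The sketch below builds the deadline from time() and assumes the condition variable uses the default clock; flag, wait_for_flag and the other names are hypothetical, and nothing sets the flag in this example so the wait is expected to time out.

```c
#include <pthread.h>
#include <errno.h>
#include <time.h>

static pthread_mutex_t flag_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  flag_cond = PTHREAD_COND_INITIALIZER;
static int flag = 0;   /* hypothetical predicate; no thread sets it in this sketch */

/* Waits roughly up to 'seconds' for flag to become non-zero.
   Returns 0 if the flag was set, or ETIMEDOUT if the deadline passed first. */
int wait_for_flag(int seconds) {
    struct timespec deadline;
    int rc = 0;

    /* Absolute wall-clock deadline, whole seconds only. */
    deadline.tv_sec = time(NULL) + seconds;
    deadline.tv_nsec = 0;

    pthread_mutex_lock(&flag_lock);
    /* Same while-loop discipline as pthread_cond_wait, plus a check of the return code. */
    while (!flag && rc == 0)
        rc = pthread_cond_timedwait(&flag_cond, &flag_lock, &deadline);
    pthread_mutex_unlock(&flag_lock);

    return flag ? 0 : rc;
}
```

Because the deadline is absolute, re-entering the loop after an early wake-up does not extend the total waiting time.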
Producer-Consumer Using Condition Variables
pthread_cond_t cond_queue_empty, cond_queue_full;
pthread_mutex_t task_queue_cond_lock;
int task_available;
/* other data structures here */
main() {
/* declarations and initializations */
task_available = 0;
pthread_cond_init(&cond_queue_empty, NULL);
pthread_cond_init(&cond_queue_full, NULL);
pthread_mutex_init(&task_queue_cond_lock, NULL);
/* create and join producer and consumer threads */
}
• Two conditions:
• Queue is full: (task_available == 1) → cond_queue_full
• Queue is empty: (task_available == 0) → cond_queue_empty
• A mutex for protecting accessing the queue (CS): task_queue_cond_lock
void *producer(void *producer_thread_data) {
    int inserted;
    while (!done()) {
        create_task();
        pthread_mutex_lock(&task_queue_cond_lock);
        /* (1) wait releases the mutex while blocked and re-acquires it when awakened */
        while (task_available == 1)
            pthread_cond_wait(&cond_queue_empty,
                              &task_queue_cond_lock);
        insert_into_queue();                     /* (2) critical section */
        task_available = 1;
        pthread_mutex_unlock(&task_queue_cond_lock);
        pthread_cond_signal(&cond_queue_full);   /* (3) */
    }
}
Producer:
1. Wait for queue to become empty, notified by consumer through cond_queue_empty
2. Insert into queue
3. Signal consumer through cond_queue_full
void *consumer(void *consumer_thread_data) {
    while (!done()) {
        pthread_mutex_lock(&task_queue_cond_lock);
        /* (1) wait releases the mutex while blocked and re-acquires it when awakened */
        while (task_available == 0)
            pthread_cond_wait(&cond_queue_full,
                              &task_queue_cond_lock);
        my_task = extract_from_queue();          /* (2) critical section */
        task_available = 0;
        pthread_mutex_unlock(&task_queue_cond_lock);
        pthread_cond_signal(&cond_queue_empty);  /* (3) */
        process_task(my_task);
    }
}
Consumer:
1. Wait for queue to become full, notified by producer through cond_queue_full
2. Extract task from queue
3. Signal producer through cond_queue_empty
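The producer and consumer above leave done(), create_task() and the queue itself unspecified. A self-contained sketch under simple assumptions (a one-slot queue, tasks that are just the integers 1 to NUM_TASKS, and "processing" that sums them) could look like this; slot, consumed_sum, NUM_TASKS and run_producer_consumer are hypothetical names.

```c
#include <pthread.h>

#define NUM_TASKS 100   /* assumed number of tasks for this sketch */

static pthread_mutex_t task_queue_cond_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_queue_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  cond_queue_full  = PTHREAD_COND_INITIALIZER;
static int task_available = 0;
static int slot;            /* the one-element "queue" */
static long consumed_sum;   /* touched only by the consumer, read after join */

static void *producer(void *arg) {
    (void)arg;
    for (int task = 1; task <= NUM_TASKS; task++) {
        pthread_mutex_lock(&task_queue_cond_lock);
        while (task_available == 1)              /* wait until the slot is empty */
            pthread_cond_wait(&cond_queue_empty, &task_queue_cond_lock);
        slot = task;                             /* insert into queue */
        task_available = 1;
        pthread_mutex_unlock(&task_queue_cond_lock);
        pthread_cond_signal(&cond_queue_full);
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int n = 0; n < NUM_TASKS; n++) {
        pthread_mutex_lock(&task_queue_cond_lock);
        while (task_available == 0)              /* wait until the slot is full */
            pthread_cond_wait(&cond_queue_full, &task_queue_cond_lock);
        int my_task = slot;                      /* extract from queue */
        task_available = 0;
        pthread_mutex_unlock(&task_queue_cond_lock);
        pthread_cond_signal(&cond_queue_empty);
        consumed_sum += my_task;                 /* process the task outside the lock */
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of all consumed tasks. */
long run_producer_consumer(void) {
    pthread_t p, c;
    task_available = 0;
    consumed_sum = 0;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return consumed_sum;
}
```

Every task is consumed exactly once, so the returned sum is 1 + 2 + … + 100 = 5050.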
Thread and Synchronization Attributes
• Three major objects
– pthread_t
– pthread_mutex_t
– pthread_cond_t
• Default attributes when being created/initialized
– NULL
• An attributes object is a data-structure that describes entity
(thread, mutex, condition variable) properties.
– Once these properties are set, the attributes object can be
passed to the method initializing the entity.
– Enhances modularity, readability, and ease of modification.
Composite Synchronization Constructs
• Pthread Mutex and Condition Variables are two basic sync
operations.
• Higher level constructs can be built using basic constructs.
– Read-write locks
– Barriers
• Pthreads provides corresponding implementations
– pthread_rwlock_t
– pthread_barrier_t
• We will discuss our own implementations

Read-Write Locks
• Concurrent access to data structure:
– Read frequently but
– Written infrequently
• Behavior:
– Concurrent read: A read request is granted when there are only other reads, i.e. no write in progress and no pending write request.
– Exclusive write: A write request is granted only if there is no other write in progress and there are no reads.
• Interfaces:
– The rw lock data structure: struct mylib_rwlock_t
– Read lock: mylib_rwlock_rlock
– write lock: mylib_rwlock_wlock
– Unlock: mylib_rwlock_unlock.
Read-Write Locks
• Two types of mutual exclusions
– 0/1 mutex for protecting access to write
– Counter mutex (semaphore) for counting read access
• Component sketch
– a count of the number of readers,
– 0/1 integer specifying whether a writer is present,
– a condition variable readers_proceed that is signaled when readers
can proceed,
– a condition variable writer_proceed that is signaled when one of the
writers can proceed,
– a count pending_writers of pending writers, and
– a pthread_mutex_t read_write_lock associated with the shared data
structure
Read-Write Locks
typedef struct {
    int readers;
    int writer;
    pthread_cond_t readers_proceed;
    pthread_cond_t writer_proceed;
    int pending_writers;
    pthread_mutex_t read_write_lock;
} mylib_rwlock_t;

void mylib_rwlock_init(mylib_rwlock_t *l) {
    l->readers = 0; l->writer = 0; l->pending_writers = 0;
    pthread_mutex_init(&(l->read_write_lock), NULL);
    pthread_cond_init(&(l->readers_proceed), NULL);
    pthread_cond_init(&(l->writer_proceed), NULL);
}
Read-Write Locks
void mylib_rwlock_rlock(mylib_rwlock_t *l) {
    pthread_mutex_lock(&(l->read_write_lock));
    while ((l->pending_writers > 0) || (l->writer > 0))
        pthread_cond_wait(&(l->readers_proceed),     /* (1) */
                          &(l->read_write_lock));
    l->readers++;                                    /* (2) */
    pthread_mutex_unlock(&(l->read_write_lock));
}
Reader lock:
1. If there is a writer or there are pending writers, perform a condition wait,
2. else increment the count of readers and grant the read lock
Read-Write Locks
void mylib_rwlock_wlock(mylib_rwlock_t *l) {
    pthread_mutex_lock(&(l->read_write_lock));
    l->pending_writers++;
    while ((l->writer > 0) || (l->readers > 0)) {
        pthread_cond_wait(&(l->writer_proceed),      /* (1) */
                          &(l->read_write_lock));
    }
    l->pending_writers--;
    l->writer++;                                     /* (2) */
    pthread_mutex_unlock(&(l->read_write_lock));
}
Writer lock:
1. Increment the pending writers count and, while there are readers or a writer, wait.
2. On being woken, decrement the pending writers count and increment the writer count
Read-Write Locks
void mylib_rwlock_unlock(mylib_rwlock_t *l) {
    pthread_mutex_lock(&(l->read_write_lock));
    if (l->writer > 0)                /* (1) a writer held the lock */
        l->writer = 0;
    else if (l->readers > 0)          /* (2) a reader held the lock */
        l->readers--;
    /* Decide whom to wake while still holding the mutex, so the counters
       cannot change between the test and the notification. */
    if ((l->readers == 0) && (l->pending_writers > 0))
        pthread_cond_signal(&(l->writer_proceed));        /* (3) */
    else if (l->pending_writers == 0)
        pthread_cond_broadcast(&(l->readers_proceed));    /* (4) */
    pthread_mutex_unlock(&(l->read_write_lock));
}
Reader/Writer unlock:
1. If there is a write lock then unlock
2. If there are read locks, decrement count of read locks.
3. If the read count becomes 0 and there is a pending writer, notify a writer
4. Otherwise, if no writer is pending, let all waiting readers go through
Barrier
• A barrier holds one or multiple threads until all
threads participating in the barrier have reached the
barrier point
Barrier
• Needs a counter, a mutex and a condition variable
– The counter keeps track of the number of threads that have
reached the barrier.
• If the count is less than the total number of threads, the
threads execute a condition wait.
– The last thread entering (master) wakes up all the threads
using a condition broadcast.
typedef struct {
    int count;
    pthread_mutex_t count_lock;
    pthread_cond_t ok_to_proceed;
} mylib_barrier_t;

void mylib_barrier_init(mylib_barrier_t *b) {
    b->count = 0;
    pthread_mutex_init(&(b->count_lock), NULL);
    pthread_cond_init(&(b->ok_to_proceed), NULL);
}
Barriers
void mylib_barrier(mylib_barrier_t *b, int num_threads) {
    pthread_mutex_lock(&(b->count_lock));
    b->count++;                                        /* (1) */
    if (b->count == num_threads) {                     /* (2) */
        b->count = 0;
        pthread_cond_broadcast(&(b->ok_to_proceed));
    } else                                             /* (3) */
        while (pthread_cond_wait(&(b->ok_to_proceed),
                                 &(b->count_lock)) != 0);
    pthread_mutex_unlock(&(b->count_lock));
}
Barrier
1. Each thread increments the counter and checks whether all threads have reached the barrier
2. The thread (master) that detects that all threads have arrived signals the others to proceed
3. If not all threads have arrived, the thread waits
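The waiting step above relies only on the return value of pthread_cond_wait, so a thread that wakes early would pass the barrier. One possible variation, shown here as a sketch and not as the lecture's implementation, adds a generation counter as the predicate so waiters leave only after the barrier has actually opened. All names (gen_barrier_t, gen_barrier_wait, worker, run_barrier_demo) are hypothetical.

```c
#include <pthread.h>

/* Hypothetical barrier with a generation counter as wait predicate. */
typedef struct {
    int count;
    int generation;   /* incremented each time the barrier opens */
    pthread_mutex_t count_lock;
    pthread_cond_t ok_to_proceed;
} gen_barrier_t;

void gen_barrier_init(gen_barrier_t *b) {
    b->count = 0;
    b->generation = 0;
    pthread_mutex_init(&b->count_lock, NULL);
    pthread_cond_init(&b->ok_to_proceed, NULL);
}

void gen_barrier_wait(gen_barrier_t *b, int num_threads) {
    pthread_mutex_lock(&b->count_lock);
    int my_generation = b->generation;
    b->count++;
    if (b->count == num_threads) {          /* last arrival opens the barrier */
        b->count = 0;
        b->generation++;
        pthread_cond_broadcast(&b->ok_to_proceed);
    } else {
        /* Leave only once the generation has changed, whatever woke us up. */
        while (my_generation == b->generation)
            pthread_cond_wait(&b->ok_to_proceed, &b->count_lock);
    }
    pthread_mutex_unlock(&b->count_lock);
}

#define NUM_WORKERS 4
static gen_barrier_t barrier;
static int phase1_done[NUM_WORKERS];
static int saw_all[NUM_WORKERS];

static void *worker(void *arg) {
    long id = (long)arg;
    phase1_done[id] = 1;                        /* phase 1 work */
    gen_barrier_wait(&barrier, NUM_WORKERS);
    int ok = 1;                                 /* phase 2: check everyone finished phase 1 */
    for (int i = 0; i < NUM_WORKERS; i++)
        ok = ok && phase1_done[i];
    saw_all[id] = ok;
    return NULL;
}

/* Returns 1 if every worker saw all phase-1 results after the barrier. */
int run_barrier_demo(void) {
    pthread_t t[NUM_WORKERS];
    gen_barrier_init(&barrier);
    for (long i = 0; i < NUM_WORKERS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    int ok = 1;
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(t[i], NULL);
        ok = ok && saw_all[i];
    }
    return ok;
}
```

The generation counter also makes the barrier reusable across successive phases, since a fast thread re-entering the barrier starts waiting on the next generation.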
Flat/Linear vs Tree/Log Barrier
• Linear/Flat barrier
– O(n) for n threads
– A single master collects information from all threads and notifies them to continue
• Tree/Log barrier
– Organize threads in a tree logically
– Multiple submasters collect and notify
– Runtime grows as O(log n).
Barrier
• Execution time of 1000 sequential and logarithmic barriers as a function of the number of threads on a 32-processor SGI Origin 2000.
Lesson Summary
• What are threads? How and why do we use them
• Thread mechanisms
• Mutexes, condition variables
• Using threads
• Problems, solutions and design approaches
• POSIX Thread API

• Thread creation, termination and joining
• Thread safety
• Synchronization primitives in PThreads
