CSCI 380 -- Lab 4

Locks All the Way Down...

Handout

A tarfile can be acquired here: lab4-handout.tar

Part 0 (Background): Mutex

We've been looking at locks and mutual exclusion in class, so let's dive a little deeper. A lock, by definition, is either available or unavailable. If a lock is available, then it can be acquired (which changes the lock's state to unavailable). If a lock is unavailable, then only the thread that acquired it can release it (changing its state back to available). We use the terms wait and signal for acquiring and releasing, respectively.

def wait (lock l):
    while l.state == UNAVAILABLE:
        pass  # busy-wait until the lock looks available
    l.state = UNAVAILABLE

def signal (lock l):
    l.state = AVAILABLE

The problem with the example above arises when we have more than one thread running at the same time. Two threads may both observe that the lock is available, and both will then set the state to unavailable, each believing it alone holds the lock. In fact, for a correct implementation of a mutex, we need to rely on hardware support that ensures the check-and-set in the while-loop above happens atomically (acting like one instruction). Both Intel x86-64 and ARM provide such atomic instructions.


#define AVAILABLE   0
#define UNAVAILABLE 1

// attempts to set lock IFF it is available (to unavailable)
#define TRY_LOCK(Mutex) \
  __sync_lock_test_and_set (&(Mutex)->lock, UNAVAILABLE)

static void spin_lock (volatile struct mutex* m);

void mutex_wait (volatile struct mutex* m) {
  // if we fail to get the lock immediately, fall back to a spin lock
  if (TRY_LOCK(m)) {
    spin_lock (m);
  }
}

void mutex_signal (volatile struct mutex* m) {
  // sets lock to AVAILABLE
  __sync_lock_release (&m->lock); 
}

#define SPIN_LIMIT ((1 << 6) - 1)

static void spin_lock (volatile struct mutex* m) {
  int spins = 0;
  while (m->lock != AVAILABLE || TRY_LOCK(m)) {
    if (!(++spins & SPIN_LIMIT)) {
      // relinquish thread
      sched_yield ();
    }
  }
}

Part 1: Semaphores

Mutexes are really convenient for restricting a region of code so that only one thread may execute it at a time, but what if we want to allow up to some fixed number of threads in the same region at once? A mutex alone will not work, because it is only ever available or unavailable.

  1. We need to keep track of a count of how many threads are within a given region, so let's introduce a count.
  2. We need to safely update count, so we must introduce a mutex which guards it.
  3. We need to somehow "block" a thread from running once we have reached the maximum number of concurrent threads in a region, so we introduce another mutex called delay.

These three pieces make up our semaphore. For the first part of your lab, it is your responsibility to implement three functions which correctly implement a semaphore.

struct semaphore {
  struct mutex mutex;
  struct mutex delay;
  int count;
};

We have two different ways of implementing a semaphore. I am providing the pseudocode, so you will only need to translate the algorithm to an implementation in C.

Option 1:
def init (s : Semaphore, avail: Int):
  init (s.mutex)
  init (s.delay)
  wait (s.delay)
  s.count = avail

def wait (s : Semaphore):
  wait (s.mutex)
  s.count -= 1
  if s.count < 0:
    signal (s.mutex)
    wait (s.delay)
  signal (s.mutex)

def signal (s : Semaphore):
  wait (s.mutex)
  s.count += 1
  if s.count <= 0:
    signal (s.delay)
  else:
    signal (s.mutex)

Option 2:
def init (s : Semaphore, avail: Int):
  init (s.mutex)
  init (s.delay)
  if avail == 0:
    wait (s.delay)
  s.count = avail

def wait (s : Semaphore):
  wait (s.delay)
  wait (s.mutex)
  s.count -= 1
  if s.count > 0:
    signal (s.delay)
  signal (s.mutex)

def signal (s : Semaphore):
  wait (s.mutex)
  s.count += 1
  if s.count == 1:
    signal (s.delay)
  signal (s.mutex)

Note: whichever option you choose, you will be required in your writeup to include:

  1. The number of mutex wait()s and signal()s when the semaphore is available/unavailable within wait()
  2. The number of mutex wait()s and signal()s when the semaphore is available/unavailable within signal()
  3. Justification as to why you chose a particular implementation

Part 2: Making a Circular Buffer Queue Thread-Safe

A queue is a structure whose defining property is FIFO ordering: the first item added will always be the first item removed. Ours supports add, remove, and peek operations.

We can implement a queue as a circular buffer, which limits the number of elements that can exist in the queue at once. To make operations as fast as possible, we don't keep track of the size and always assume it is safe to add and remove.

I provide you with a functional implementation of queue within queue.h and queue.c. It is up to you to determine which synchronization primitives (mutex and/or semaphore) to add to your queue implementation. You will modify the queue.h header file to augment your struct queue definition. Then you will modify the queue.c implementation file to make add, remove, and peek thread safe.

You will then add to WRITEUP.txt your design rationale and discussion on performance of your thread-safe queue.

I provide a driver program, driver.c, which creates a specified number of "readers" and "writers" to remove/add values from a shared queue. I will use this driver to show experimental correctness (manual analysis is still required to prove correctness).

Putting It All Together

Command        Description
make           builds the program
make clean     removes all object files and the driver executable
make tidy      removes all dependency files, object files, and the driver executable
make handin    creates a tarball of semaphore.c, queue.h, queue.c, and WRITEUP.txt
./driver       runs the driver program

driver usage:

NOTE: the calls to fprintf() must stay within queue.c

Stress Testing:

By default, DIFFICULTY is set to 0 (a.k.a. easy mode)

You can explicitly set the difficulty up to 4 by changing your invocation to make:

make clean && DIFFICULTY=4 make

Submission

You can invoke make handin to generate a handin.tar file. Upload/submit this file to autolab.

Grading Criteria

This lab will be graded out of 100 points. Please note: only 25 points are autograded.