Queues, Synchronization Primitives, and Threads

OSTEP CH 28 Section 7.

Hey! Small intro…

The following is the C equivalent of the TestAndSet (TAS) primitive:

// NOTE: the hardware performs this whole sequence atomically;
// this C version only shows the semantics.
int TestAndSet(int *old_ptr, int new) {
    int old = *old_ptr; // fetch old value at old_ptr
    *old_ptr = new;     // store 'new' into old_ptr
    return old;         // return the old value
}

TAS works by letting the caller pass the address of the flag (the lock guard) along with a new value; it updates the flag and returns the value that was there before. What I found interesting is that regardless of whether the flag (the value at old_ptr) is already set, TAS always writes the new value. I wondered whether that could ever be problematic: is there a situation where you don't want to update the flag, and would it make sense to only update it when the value actually changes? I concluded it isn't a problem, because the flag is only used as a boolean indicating whether a thread may enter the critical section; writing 1 over 1 changes nothing, and since the read and the write happen atomically, it's safe to update the flag even if the lock is already held.
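
As a side note, if I wanted an actually-atomic version of this in portable C, C11's <stdatomic.h> has atomic_exchange, which is exactly the "write the new value, hand me back the old one" behavior. A hypothetical wrapper (just a sketch, assuming a C11 compiler):

#include <stdatomic.h>

// Always writes new_val, even if the old value was already new_val,
// and returns whatever was there before -- all in one atomic step.
int test_and_set_c11(atomic_int *ptr, int new_val) {
    return atomic_exchange(ptr, new_val);
}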

This is the hardware behavior written out as simple C code, but an actual primitive on x86 systems is bts (bit test and set), which needs a lock prefix to be atomic across cores:

; my_lock is a global 32-bit lock guard with the value 0x00000004
lock bts DWORD PTR [rip + my_lock], 2

This tests whether bit 2 is set (it is, since the value is 0x00000004), copies that old bit into the carry flag, and then sets the bit.
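
In C terms, the semantics of bts are roughly the following (just a sketch; as plain C this is not atomic, which is why the real instruction needs the lock prefix):

// Roughly what "bts [mem], n" does: copy bit n into the carry flag,
// then set bit n.
int bit_test_and_set(unsigned int *word, int n) {
    int old_bit = (*word >> n) & 1;  // this is what lands in CF
    *word |= 1u << n;                // set bit n
    return old_bit;
}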

To build the lock, we do:

typedef struct __lock_t {
    int flag;
} lock_t;

void init(lock_t *lock) {
    // 0: lock is available, 1: lock is held
    lock->flag = 0;
}

void lock(lock_t *lock) {
    while (TestAndSet(&lock->flag, 1) == 1)
        ; // spin-wait (do nothing)
}

void unlock(lock_t *lock) {
    lock->flag = 0;
}
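
Just to sanity-check how the lock would be used, here is a hypothetical sketch with two pthreads bumping a shared counter (it assumes the lock_t/init/lock/unlock above and a genuinely atomic TestAndSet):

#include <pthread.h>
#include <stdio.h>

lock_t counter_lock;   // the lock_t defined above
long counter = 0;

void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        lock(&counter_lock);
        counter++;                  // critical section
        unlock(&counter_lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    init(&counter_lock);
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   // 2000000 with a correct lock
    return 0;
}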

and again, I’ll try to translate that into what it might look like in x86.

init_lock:
mov     DWORD PTR [rip + my_lock], 0    ; flag = 0: lock is available
ret

; ...

spin_lock:
push    rbp
mov     rbp, rsp
retry:
lock bts DWORD PTR [rip + my_lock], 2   ; atomically: CF = old bit 2, then set bit 2
jc      retry                           ; CF == 1: lock was already held, spin again
leave
ret

; ...

unlock:
mov     DWORD PTR [rip + my_lock], 0    ; plain store releases the lock
ret

And the reason this works is atomicity: the test (reading the old value of the flag) and the set (writing 1) happen as a single, indivisible hardware operation. If we didn't use an atomic instruction, two threads could both read flag == 0 before either of them wrote 1, and then both would set the flag and both would enter the critical section at the same time.
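
To make that failure concrete, here is a sketch of the broken version (essentially OSTEP's first-attempt lock, where the test and the set are two separate steps):

// BROKEN: the test and the set are two separate, interruptible steps.
void broken_lock(lock_t *lock) {
    while (lock->flag == 1)   // test: two threads can both see 0 here,
        ;                     // get interleaved by the scheduler,
    lock->flag = 1;           // set: and then both "acquire" the lock.
}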

The spin lock is still not great, though: a thread that can't get the lock burns its entire time slice spinning, and nothing guarantees that a waiting thread ever wins the race for the flag, which leads to starvation. OSTEP's fix is to stop spinning on the flag and instead put waiting threads to sleep on a queue:

typedef struct __lock_t {
    int flag;
    int guard;
    queue_t *q;
} lock_t;

void lock_init(lock_t *m) {
    m->flag = 0;
    m->guard = 0;
    queue_init(m->q);
}

void lock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ; //acquire guard lock by spinning
    if (m->flag == 0) {
        m->flag = 1; // lock is acquired
        m->guard = 0;
    } else {
        queue_add(m->q, gettid());
        m->guard = 0;
        park();
    }
}

void unlock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ; //acquire guard lock by spinning
    if (queue_empty(m->q))
        m->flag = 0; // let go of lock; no one wants it
    else
        unpark(queue_remove(m->q)); // hold lock (for next thread!)
    m->guard = 0;
}

And this is better because a thread that fails to get the lock no longer spins away its whole time slice: it adds itself to the queue, releases the guard, and calls park(), which deschedules it until the lock holder unpark()s it. Why is spinning still acceptable here? Because the spinning is only on the guard, which protects the few instructions that manipulate the flag and the queue rather than the whole critical section, so a thread spins for at most a handful of instructions; a thread that actually has to wait for the lock is descheduled instead of spinning.

The reading poses this question

What would happen if the release of the guard lock came after the park(), and not before? Hint: something bad.

referring to these lines in lock():

queue_add(m->q, gettid());
m->guard = 0;
park();

And then it also offers a solution to the wakeup/waiting race: as written, a thread can be unparked (by unlock()) after it releases the guard but just before it actually calls park(), so the wakeup is lost and the thread sleeps forever. The fix described in the reading is setpark(): by calling it before releasing the guard, a thread announces that it is about to park, and if an unpark() arrives in between, the subsequent park() returns immediately instead of sleeping. I believe that addresses the race.
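
With that change, the tail of the else branch in lock() becomes (per the reading):

queue_add(m->q, gettid());
setpark();     // announce that we're about to park
m->guard = 0;
park();        // returns immediately if an unpark() already happened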

And one final question: does it make sense to make these locking functions themselves atomic, or to put locks around locks, perhaps? In a way the queue-based lock already does that: the guard is a small spin lock protecting the lock's own flag and queue.