EMBEDDED GEEK: RTOS : THREADING AND SYNCHRONIZATION

One of the most useful developments in the history of computing is multitasking and multithreading. This technique isn't always available to an embedded system engineer, but some embedded systems and RTOS have multithreading (MT) capability.

Preemptive Multithreading

When the first multi-tasking systems were established, they did not have a central controller. Multi-tasking was established by having programs voluntarily give up control to the system, and the system would then give control to another process. This system worked reasonably well, except that any program that was misbehaving would slow down the entire system. For instance, if a program got stuck in an infinite loop, it would never give up control, and the system would freeze.

The solution to this problem is preemptive multithreading. In a preemptive environment, control could be moved from one process to another process at any given time. The process that was "preempted" would not even know that anything had happened, except maybe there would be a larger than average delay between 2 instructions. Preemptive multithreading allows for programs that do not voluntarily give up control, and it also allows a computer to continue functioning when a single process hangs.

There are a number of problems associated with preemptive multithreading that all stem from the fact that control is taken away from one process when it is not necessarily prepared to give up control. For instance, if one process were writing to a memory location, and was preempted, the next process would see half-written data, or even corrupted data in that memory location. Or, if a task was reading in data from an input port, and it was preempted, the timing would be wrong, and data would be missed from the line. Clearly, this is unacceptable.

The solution to this new problem then is the idea of synchronization. Synchronization is a series of tools provided by the preemptive multithreaded operating system to ensure that these problems are avoided. Synchronization tools can include timers, "critical sections," and locks. Timers can ensure that a given process may be preempted, but only for a certain time. Critical sections are commands in the code that prevent the system from switching control for a certain time. Locks are commands that prevent interference in atomic operations.

Mutexes

The term Mutex is short for "Mutual Exclusion", and is a type of mechanism used in a preemptive environment that can prevent unauthorized access to resources that are currently in use. Mutexes follow several rules:

Mutexes are system wide objects, that are maintained by the kernel.
Mutexes can only be owned by one process at a time
Mutexes can be acquired by asking the kernel to allocate that mutex to the current task
If a Mutex is already allocated, the request function will block until the mutex is available.

In general, it is considered good programming practice to release mutexes as quickly as possible.

Spin Locks

Spin locks is a quick form of synchronization methods. It is named after its behavior - spin in the loop while the condition is false. To implement spin lock system should support test and set idiom or give exclusive access to a locking thread by any means (masking interrupts, locking bus).

An advantage of spin locks is that they are very simple. A disadvantage is that they waste CPU cycles in loop waiting. Most common use of spin locks is to synchronize quick access to objects. It is not advisable to do a long computations while spin locked a section.

Critical Sections

A critical section is a sequence of computer instructions that may malfunction if interrupted. An atomic operation is a sequence of computer instructions that cannot be interrupted and function correctly. In practice, these two subtly different definitions are often combined. Operating systems provide synchronization objects to meet these requirements, and some actually call these objects as "critical sections," "atomic operations" or "monitors."

An example of a critical section is code that removes data from a queue that is filled by an interrupt. If the critical section is not protected, the interrupt can occur while the dequeuing function is changing pointers, and corrupt the queue pointers. An example of an atomic operation is an I/O read where the process must read all the data at a particular rate, and cannot be preempted while reading.

A generally good programming practice is to have programs exit their critical sections as quickly as possible, because holding a critical section for too long will cause other processes on the system not to get any time, and will cause a major performance decrease. Critical sections should be used with care.

Priority Scheduling

Many RTOS have a mechanism to distinguish the relative priorities of different tasks. High-priority tasks are executed more frequently than the low priority tasks. Each implementation of priority scheduling will be a little different, however.

Deadlock

A deadlock occurs when a series of synchronization objects are held in a preemptive MT system in such a way that no process can move forward. Let us take a look at an example:

Let's say that we have 2 threads: T1 and T2. We also have 2 mutexes, M1 and M2.

T1 asks for and acquires mutex M1.
T2 acquires M2
T1 asks for M2, and the system transfers control to T2 until T2 releases M2.
T2 asks for M1, and the system is in deadlock (neither thread can continue until the other releases it's mutex).

Reading Counters without Locking

Perhaps the most common concurrency algorithm used in embedded systems is the "read the counter twice and compare" optimistic concurrency control pattern.

Many hardware timers and hardware counters are connected to the CPU over an 8-bit bus. If the low byte of the timer happens to be 0xFF when we start to read the timer, if the timer increments between two byte reads, a simple read of each byte separately will get a corrupted value

One simple solution is to read the counter twice and compare:

long atomic_read_counter(long *counter){
    do{
        long counter_old = *counter; // alas, not an atomic operation when the timer is connected to the CPU over an 8-bit bus.
        long counter_new = *counter;
    }while( counter_old != counter_new );
    return counter_new;
}

// "optimized" routine hard-wired to read a 16-bit Counter1:
// the entire routine takes 3 machine instruction on the 8051 -- see Craig Steiner, Abhishek Yadav, etc.
inline
int atomic_read_counter1(){
    do{
        byte upper = Counter1H;
        byte lower = Counter1L;
    }while( upper != Counter1H );
    return( (upper << 8) | lower );
}

Because no locks are used, the double-read algorithm avoids deadlocks, avoids priority inversion, and avoids other problems with locks.

There are many other algorithms based on this read-twice-and-compare algorithm.

For example, the seqlocks used in Linux use this read-twice-and-compare algorithm.

For example, single-reader single-writer ring buffers are also common in embedded systems, and both the reader and the writer can be implemented with a similar lock-free algorithm.

EMBEDDED GEEK

Sunday, 9 August 2015

RTOS : THREADING AND SYNCHRONIZATION