Threads and processes
A process is a running program. Multitasking operating systems (e.g. Linux, Windows etc.) run a number of processes simultaneously. Each process has a global (or static) memory area, a stack and code. Processes in multitasking operating systems are protected from one another using a hardware-based memory management unit. A scheduler allocates CPU time to each process. The simplest scheduler is a “round-robin” scheduler, which allows each process to run for a short time before switching to the next, so that each process gets a turn on the CPU.
Threads are similar to processes in some ways; however, the threads of a program share the same global/static data as well as the same code, but each has a separate stack.
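As a rough illustration of the round-robin idea, the sketch below picks the next task in a fixed circular order. The names `next_round_robin`, `current` and `task_count` are made up for this example and are not taken from any particular operating system.

```c
#include <stdint.h>

/* Hypothetical helper: return the index of the task that runs after
   `current` when `task_count` tasks take turns in a fixed circular order. */
uint32_t next_round_robin(uint32_t current, uint32_t task_count)
{
    uint32_t next = current + 1;
    if (next >= task_count) {
        next = 0; /* wrap around to the first task */
    }
    return next;
}
```

With three tasks, repeated calls starting from task 0 give the order 1, 2, 0, 1, 2 and so on, so every task gets a turn before any task gets a second one.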
Threads can be scheduled just like processes and so appear to operate in parallel – this is multi-threading.
Context switching
Each process or thread switch involves a context change: the current processor state (all of its register contents) must be saved and the processor state for the next thread or process loaded. The image below illustrates a context change from Thread 1 to Thread 2.
The context change is triggered by a timer interrupt, and ARM Cortex-M processors have a timer intended for exactly this role: the SysTick timer. In the following example, the SysTick timer is configured to interrupt the CPU every millisecond, and each interrupt triggers a context change.
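The reload value for a one millisecond tick follows from the clock that drives SysTick. The sketch below is only an illustration: it assumes the counter is clocked from the core clock, and the helper name `systick_reload_for_hz` is hypothetical. It does not touch any hardware registers, it just does the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu /* the reload register is 24 bits wide */

/* Hypothetical helper: compute the SysTick reload value that gives
   `tick_hz` interrupts per second from a counter clocked at `clock_hz`.
   The counter counts from the reload value down to zero, so one period
   lasts reload + 1 clock cycles. Returns false if no usable value exists. */
bool systick_reload_for_hz(uint32_t clock_hz, uint32_t tick_hz, uint32_t *reload)
{
    if (tick_hz == 0u) {
        return false;
    }
    uint32_t cycles = clock_hz / tick_hz;
    if (cycles < 2u || cycles - 1u > SYSTICK_MAX_RELOAD) {
        return false; /* period too short or too long for the counter */
    }
    *reload = cycles - 1u;
    return true;
}
```

For example, with an assumed 16 MHz clock and 1000 ticks per second the helper yields a reload value of 15999. The clock frequency here is just an example figure, not necessarily the one used on the board.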
ARM Cortex M0 Exception handling
The following registers are placed on the interrupted thread stack (Process Stack) automatically following an interrupt (such as SysTick):

Address                  Contents
SP prior to interrupt    ????????
SP + 0x0000001C          xPSR
SP + 0x00000018          PC
SP + 0x00000014          LR
SP + 0x00000010          R12
SP + 0x0000000C          R3
SP + 0x00000008          R2
SP + 0x00000004          R1
SP + 0x00000000          R0
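The stacked registers can also be described as a C struct, which makes the offsets in the table easy to check. This is a sketch for illustration only: the type name `HardwareFrame` and the helper `frame_return_address` are hypothetical and do not come from the demonstrator code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the eight words stacked by the hardware.
   A pointer to it corresponds to the stack pointer after exception entry. */
typedef struct {
    uint32_t r0;   /* SP + 0x00 */
    uint32_t r1;   /* SP + 0x04 */
    uint32_t r2;   /* SP + 0x08 */
    uint32_t r3;   /* SP + 0x0C */
    uint32_t r12;  /* SP + 0x10 */
    uint32_t lr;   /* SP + 0x14 */
    uint32_t pc;   /* SP + 0x18 */
    uint32_t xpsr; /* SP + 0x1C */
} HardwareFrame;

/* Hypothetical helper: the address the interrupted thread resumes at. */
uint32_t frame_return_address(const HardwareFrame *frame)
{
    return frame->pc;
}
```

The struct occupies 32 bytes, and `offsetof(HardwareFrame, pc)` evaluates to 0x18, matching the PC row of the table.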
Why not save all of the registers? Doing so on every interrupt would be too slow, and your ISR may not be changing all of the registers.
Why just these ones? R0-R3 are typically used for argument passing and, together with R12 and LR, are the registers that a compiled C function is free to overwrite, so saving them in hardware allows an ISR to be written as an ordinary function. R12 is used by some compilers in their inner function call glue. The LR may hold a function return address. PC must be remembered so we know where to go back to, and xPSR must be remembered for the flags.
For a full context switch, the remaining registers must be placed on the Process Stack also.

Address                  Contents
SP prior to interrupt    ????????
SP + 0x0000001C          xPSR
SP + 0x00000018          PC
SP + 0x00000014          LR
SP + 0x00000010          R12
SP + 0x0000000C          R3
SP + 0x00000008          R2
SP + 0x00000004          R1
SP + 0x00000000          R0
SP - 0x00000004          R11
SP - 0x00000008          R10
SP - 0x0000000C          R9
SP - 0x00000010          R8
SP - 0x00000014          R7
SP - 0x00000018          R6
SP - 0x0000001C          R5
SP - 0x00000020          R4
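A thread that has never run still needs a saved context in this 16-word layout, so that the first switch to it can restore something sensible. The sketch below shows one way such a stack could be prepared. It is an assumption about how this might be done, not the code from the demonstrator: `init_thread_stack`, `CONTEXT_WORDS` and `XPSR_THUMB_BIT` are hypothetical names, and the entry address is passed as a plain 32-bit number so that the example also compiles on a desktop machine.

```c
#include <stdint.h>

#define CONTEXT_WORDS  16u         /* R4-R11 plus the 8 hardware-stacked words */
#define XPSR_THUMB_BIT 0x01000000u /* T bit of xPSR, must be set on Cortex-M */

/* Hypothetical helper: lay out an initial saved context just below
   `stack_top` and return the value to record as the saved stack pointer.
   Word order, from the returned pointer upwards:
   R4-R7, R8-R11, R0-R3, R12, LR, PC, xPSR. */
uint32_t *init_thread_stack(uint32_t *stack_top, uint32_t entry_address)
{
    uint32_t *sp = stack_top - CONTEXT_WORDS;
    for (uint32_t i = 0u; i < CONTEXT_WORDS; i++) {
        sp[i] = 0u; /* general registers start out as zero */
    }
    sp[14] = entry_address;  /* PC: where the thread starts executing */
    sp[15] = XPSR_THUMB_BIT; /* xPSR: Thumb state, flags cleared */
    return sp;
}
```

The returned pointer is 64 bytes below `stack_top`, which matches the 32 bytes of R4-R11 plus the 32 bytes stacked by hardware. LR is left at zero here on the assumption that a thread function never returns.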
It is not possible to carry this out in the C language, so a little inline assembler is needed here to complete the context change.
// Preserve remaining registers on stack of thread that is being suspended (Thread A)
asm(" cpsid i ");               // disable interrupts during thread switch
asm(" MRS R0,PSP ");            // get Thread A stack pointer
asm(" SUB R0,#32");             // Make room for the other registers : R4-R11 = 8 x 4 = 32 bytes
asm(" STMIA R0! , { R4-R7 } "); // Can only do a multiple store on registers up to R7
asm(" MOV R4,R8 ");             // Copy higher registers to lower ones
asm(" MOV R5,R9 ");
asm(" MOV R6,R10 ");
asm(" MOV R7,R11 ");
asm(" STMIA R0! , { R4-R7 } "); // and repeat the multiple register store
// Locate the Thread Control Block (TCB) for Thread A
asm(" LDR R0,=TCB_Size ");      // get the size of each TCB
asm(" LDR R0,[R0] ");
asm(" LDR R1,=ThreadIndex ");   // Which one is being used right now?
asm(" LDR R1,[R1] ");
asm(" MUL R1,R0,R1 ");          // Calculate offset of Thread A TCB from start of TCB array
asm(" LDR R0,=Threads ");       // point to start of TCB array
asm(" ADD R1,R0,R1 ");          // add offset to get pointer to Thread A TCB
asm(" MRS R0,PSP ");            // get Thread A stack pointer
// Save Thread A's stack pointer (adjusted for new registers being pushed)
asm(" SUB R0,#32 ");            // Adjust for the other registers : R4-R11 = 8 x 4 = 32 bytes
asm(" STR R0,[R1] ");           // Save Thread A Stack pointer to the TCB (first entry = Saved stack pointer)
// Update the ThreadIndex
ThreadIndex++;
if (ThreadIndex >= ThreadCount)
    ThreadIndex = 0;
// Locate the Thread Control Block (TCB) for Thread B
asm(" LDR R0,=TCB_Size ");      // get the size of each TCB
asm(" LDR R0,[R0] ");
asm(" LDR R1,=ThreadIndex ");   // Which one is being used right now?
asm(" LDR R1,[R1] ");
asm(" MUL R1,R0,R1 ");          // Calculate offset of Thread B TCB from start of TCB array
asm(" LDR R0,=Threads ");       // point to start of TCB array
asm(" ADD R1,R0,R1 ");          // add offset to get pointer to Thread B TCB
asm(" LDR R0,[R1] ");           // read saved Thread B Stack pointer
asm(" ADD R0,#16 ");            // Skip past saved low registers for the moment
asm(" LDMIA R0!,{R4-R7} ");     // read saved HIGH register values into R4-R7
asm(" MOV R8,R4 ");             // Copy them from the lower registers to the higher ones
asm(" MOV R9,R5 ");
asm(" MOV R10,R6 ");
asm(" MOV R11,R7 ");
asm(" LDR R0,[R1] ");           // read saved Thread B Stack pointer
asm(" LDMIA R0!,{R4-R7} ");     // read saved LOW registers
asm(" LDR R0,[R1] ");           // read saved Thread B Stack pointer
asm(" ADD R0,#32 ");            // re-adjust saved stack pointer
asm(" MSR PSP,R0 ");            // write Thread B stack pointer
Threads are managed using a structure called a Thread Control Block which is defined as follows:
typedef struct {
    uint32_t *ThreadStack;
    void (*ThreadFn )();
    uint32_t Attributes;
} ThreadControlBlock;
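The assembler in the context switch finds the current TCB by multiplying `TCB_Size` by `ThreadIndex` and adding the result to the address of `Threads`. The sketch below expresses the same calculation in C. The variable names come from the assembler above, but their definitions, the `MAX_THREADS` constant and the `current_tcb` helper are assumptions made for this example.

```c
#include <stdint.h>

typedef struct {
    uint32_t *ThreadStack;  /* saved stack pointer, first entry in the TCB */
    void (*ThreadFn)(void); /* thread entry point */
    uint32_t Attributes;
} ThreadControlBlock;

#define MAX_THREADS 3u /* assumed: three threads, as in the demonstrator */

/* Assumed definitions of the symbols referenced from the assembler. */
ThreadControlBlock Threads[MAX_THREADS];
uint32_t TCB_Size = sizeof(ThreadControlBlock);
uint32_t ThreadIndex = 0u;
uint32_t ThreadCount = MAX_THREADS;

/* Hypothetical helper: byte offset into the TCB array, as the assembler does. */
ThreadControlBlock *current_tcb(void)
{
    uint8_t *base = (uint8_t *)Threads;
    return (ThreadControlBlock *)(base + (TCB_Size * ThreadIndex));
}
```

Because `ThreadStack` is the first member, a pointer to a TCB is also a pointer to its saved stack pointer, which is why the assembler can store to and load from the TCB address directly.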
Implementation
A demonstrator application with three threads was developed for the Tiva C Launchpad. Each thread flashes an LED on the board at a different rate. The trickiest part to get right was the initial launch of the thread switching, which involved a little bit of stack fiddling. Code is available on GitHub.