This repository has been archived by the owner on Jan 7, 2023. It is now read-only.

Thread Local Storage

Dylan Graham edited this page Oct 7, 2021 · 3 revisions

Before starting, here is some recommended reading:

  1. Thread Local Storage at Dlang Tour
  2. Performance of TLS Variables
  3. Thread Local Storage on FreeRTOS

How does it work?

Unshared global variables are thread-local by default in D. This requires runtime support, whose implementation can be found in rt.tls. LWDR piggybacks on the RTOS's implementation rather than rolling its own, for portability reasons.

When a thread is first registered with LWDR, LWDR uses the symbols defined by the linker script to determine how big the TLS block is, and allocates memory on the heap for it. LWDR then tells the RTOS that the pointer to this memory block is specific to this thread (via rtosbackend_setTLSPointerCurrThread). When D code accesses a TLS variable, the compiler emits a call to a special function that fetches the address of this block. On ARM EABI, this function is called __aeabi_read_tp. It asks the RTOS for the pointer stored for the current thread (via rtosbackend_getTLSPointerCurrThread) and returns it. The compiler then locates the variable within this block of memory, completing TLS variable support.
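The registration and lookup steps described above can be sketched on a host machine as follows. This is a minimal illustration, not LWDR's actual code: fake_tdata, FAKE_TBSS_SIZE, current_thread_tls, tls_register_current_thread and tls_read_pointer are all hypothetical names, the linker-provided image is replaced by a small array, and the RTOS slot is replaced by a single global. It also assumes the usual TLS convention that the initialised part of the block is copied from a template and the rest is zeroed, and it ignores any ABI-specific header or offset in front of the block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the linker-provided TLS image
 * (on target these would come from _tdata/_tdata_size and _tbss_size). */
static const uint8_t fake_tdata[4] = {0x2A, 0, 0, 0}; /* initialised TLS data */
enum { FAKE_TBSS_SIZE = 8 };                           /* zero-initialised TLS data */

/* Hypothetical one-slot stand-in for the RTOS per-thread pointer. */
static void *current_thread_tls;

/* Allocate and initialise a TLS block for the calling thread.
 * Returns NULL if the allocation fails. */
void *tls_register_current_thread(void)
{
    size_t tdata_size = sizeof fake_tdata;
    uint8_t *block = malloc(tdata_size + FAKE_TBSS_SIZE);
    if (block == NULL) {
        return NULL;
    }
    memcpy(block, fake_tdata, tdata_size);         /* copy initial values */
    memset(block + tdata_size, 0, FAKE_TBSS_SIZE); /* zero the rest */
    /* On target, this is where rtosbackend_setTLSPointerCurrThread is called. */
    current_thread_tls = block;
    return block;
}

/* Conceptually what __aeabi_read_tp does: return the block for this thread. */
void *tls_read_pointer(void)
{
    assert(current_thread_tls != NULL); /* thread must be registered first */
    return current_thread_tls;
}
```

After tls_register_current_thread has run once for a thread, every call to tls_read_pointer returns the same block, and the compiler-generated code would add a fixed offset to that address to reach a given variable.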

As mentioned above, the linker script must accept D's TLS sections (.tdata, .tbss). An example linker script segment for TLS variables on an STM32F407 is as follows:

  .tdata :
  {
  	_tdata = .;
  	*(.tdata .tdata.* .gnu.linkonce.td.*)
  	. = ALIGN(4);
  } >FLASH
  _tdata_size = SIZEOF(.tdata);
  
  .tbss :
  {
  	_tbss = .;
  	*(.tbss .tbss.* .gnu.linkonce.tb.*)
  	*(.tcommon)
  	_etbss = .;
  	. = ALIGN(4);
  } >FLASH
  _tbss_size = SIZEOF(.tbss);

Note: LWDR expects _tdata, _tdata_size, _tbss and _tbss_size to be defined, as they are referenced in LWDR.

Which hooks must be implemented?

rtosbackend_setTLSPointerCurrThread(void* ptr, int index)

This tells the thread implementation to store ptr in the current thread's thread control block (TCB) at position index. For example, on FreeRTOS this hook would call vTaskSetThreadLocalStoragePointer.

rtosbackend_getTLSPointerCurrThread(int index)

This gets a pointer at position index from the current thread's thread control block (TCB).
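To make the contract of the two hooks concrete, here is a host-side mock that keeps the slots in a plain struct instead of a real TCB. MockTcb, MOCK_TLS_SLOTS and mock_current_thread are hypothetical names used only for illustration, and the return type of the setter is assumed to be void. On FreeRTOS the setter would presumably forward to vTaskSetThreadLocalStoragePointer(NULL, index, ptr) instead, with configNUM_THREAD_LOCAL_STORAGE_POINTERS set high enough to cover index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of per-thread slots in the mock TCB. */
enum { MOCK_TLS_SLOTS = 2 };

/* Hypothetical thread control block holding the TLS pointer slots. */
typedef struct {
    void *tls_slots[MOCK_TLS_SLOTS];
} MockTcb;

/* The mock has exactly one thread; statics start zeroed, so slots are NULL. */
static MockTcb mock_current_thread;

/* Store ptr in slot index of the current thread. */
void rtosbackend_setTLSPointerCurrThread(void *ptr, int index)
{
    assert(index >= 0 && index < MOCK_TLS_SLOTS);
    mock_current_thread.tls_slots[index] = ptr;
}

/* Fetch the pointer previously stored in slot index of the current thread. */
void *rtosbackend_getTLSPointerCurrThread(int index)
{
    assert(index >= 0 && index < MOCK_TLS_SLOTS);
    return mock_current_thread.tls_slots[index];
}
```

The only property LWDR relies on here is the round trip: whatever was stored for a thread at a given index must come back unchanged when that same thread asks for it.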

Register clobbering on ARM EABI

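RTOS functions have been found to clobber registers that the compiler expects __aeabi_read_tp to leave intact. Like any ordinary C function on ARM, an RTOS call is free to overwrite the caller-saved registers r0 to r3. To correct this, guards must be placed around the call to the RTOS.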

For example, this is the __aeabi_read_tp code that fetches the pointer for the current thread:

extern(C) void* __aeabi_read_tp() nothrow @nogc
{
	auto ret = rtosbackend_getTLSPointerCurrThread(tlsPointerIndex);
	return ret;
}

It calls rtosbackend_getTLSPointerCurrThread, which is intended to be implemented by the user in C and to call into the RTOS's TLS implementation. For FreeRTOS, it would look like this:

void* rtosbackend_getTLSPointerCurrThread(int index) {
	void* ptr = pvTaskGetThreadLocalStoragePointer(NULL, index);
	return ptr;
}

However, this version is buggy: pvTaskGetThreadLocalStoragePointer will clobber registers and break the TLS implementation. To work around this, inline assembly must be added that pushes registers r0, r1, r2 and r3 onto the stack before the call and pops them afterwards. The corrected implementation is as follows:

void* rtosbackend_getTLSPointerCurrThread(int index) {
	__asm("push {r0, r1, r2, r3}"); // push the registers to the stack. This must be directly around the FreeRTOS call or it won't work!
	void* ptr = pvTaskGetThreadLocalStoragePointer(NULL, index); // FreeRTOS call
	__asm("pop {r0, r1, r2, r3}"); // pop the registers
	return ptr;
}

Compiler flags

LDC

On LDC, the compiler flag -fthread-model=local-exec must be included to make TLS work.