The linker script file

The relationship between the compiler, linker and linker script file.
Linker scripts control the linking process. Their main function is to tell the linker where memory can be found in the target system, how much there is and its type.
The linker script file is also used to label different sections of memory for different purposes.

/* useful reference: www.linuxselfhelp.com/gnu/ld/html_chapter/ld_toc.html */
MEMORY
{
    flash : org = 0x08000000, len = 64k
    ram : org = 0x20000000, len = 8k
}
SECTIONS
{
    . = ORIGIN(flash);
    .text : {
        *(.text);
    } >flash
    . = ORIGIN(ram);
    .data : {
        INIT_DATA_VALUES = LOADADDR(.data);
        INIT_DATA_START = .;
        *(.data);
        INIT_DATA_END = .;
    } >ram AT>flash
    BSS_START = .;
    .bss : {
        *(.bss);
    } >ram
    BSS_END = .;
}

The MEMORY part of this script states that Flash ROM starts at address 0x08000000 and that there is 64k available. It also states that RAM starts at address 0x20000000 and that there is 8k available.
The next part of the script labels and organises the different sections of memory. Think of this as a blueprint for creating a memory image of the program. A particular syntax is used here: a single dot identifies the current location (in much the same way as ‘.’ represents the current directory in various operating systems).
The first statement:
. = ORIGIN(flash);
moves the linker’s memory image pointer (usually called the location counter) to the start of the flash memory region (0x08000000 in this case). The linker outputs the sections that follow sequentially from this point onwards until the location counter is assigned a new value. The .text section is output here. It contains everything that the compiler marks as belonging to the .text section, typically executable code and constant data definitions.
At this point our memory image now looks like this:

[Figure alloc1: the memory image with the .text section placed in flash]

Line 13 of the linker script sets the location counter to a new value: 0x20000000 (the start of RAM). The section definition that follows covers global and static variables that are explicitly initialized (e.g. look at the way ‘x’ is declared in the simple C program above). The compiler places all such variables in the .data section. Explicitly initialized variables present a bit of a difficulty for embedded systems that use ROM (or Flash) memory: the initial values have to be stored in ROM so that they survive power-off, but the variables themselves must live in RAM so that they can be modified. When an embedded system boots, it must therefore copy the initialization data from ROM to the RAM-based variables just before the program starts.
Each initialized global/static variable is consequently associated with two addresses. One address is in ROM and holds its initialization value. The other address is in RAM and is used at runtime. The ROM address is sometimes referred to as the “load memory address” (LMA); the RAM or runtime address is sometimes called the “virtual memory address” (VMA). The code that initializes these variables could look like this:

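// do global/static data initialization
// (assumes the linker symbols are declared elsewhere, for example:
//  extern unsigned char INIT_DATA_VALUES, INIT_DATA_START, INIT_DATA_END;)
	unsigned char *src;
	unsigned char *dest;
	unsigned len;
	src = &INIT_DATA_VALUES;                  // load address (ROM)
	dest = &INIT_DATA_START;                  // runtime address (RAM)
	len = &INIT_DATA_END - &INIT_DATA_START;  // number of bytes to copy
	while (len--)
		*dest++ = *src++;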

The pointer src should point to the ROM address of the initialization data; the dest pointer should point to the RAM address. How do we figure out these addresses? You could work them out manually if you know the length of your .text section, but the linker script can automate this. The data section in the linker script creates three symbols:
INIT_DATA_VALUES, which marks the start of the initialization data in ROM.
INIT_DATA_START, which marks the start of the data in RAM.
INIT_DATA_END, which marks the end of the data in RAM.
Looking at the definitions of the first two of these, you might expect INIT_DATA_VALUES and INIT_DATA_START to have the same value. They would, but for the code at the end of the section definition: >ram AT>flash. This places the initial values of static data in flash ROM while preserving the RAM-based addresses of the variables. It also changes the address reported by the LOADADDR function. In this case, LOADADDR returns the address of the next available byte of flash memory, in other words the start of the initialization values. Our memory image now looks like this:

[Figure alloc2: the memory image with the .data section added]
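
To make the two .data addresses concrete, the following host-side sketch works through the arithmetic for this script. It is only an illustration: the type layout_t and the function compute_layout are hypothetical names, the section sizes are supplied by the caller, and the sketch assumes that .text is the only thing placed in flash ahead of the initialization values and that the linker inserts no alignment padding.

```c
#include <stdint.h>

#define FLASH_ORIGIN 0x08000000u  /* org of the flash region in the script */
#define RAM_ORIGIN   0x20000000u  /* org of the ram region in the script */

/* Addresses the linker script would assign (hypothetical helper type). */
typedef struct {
    uint32_t text_start;        /* start of .text in flash */
    uint32_t init_data_values;  /* LOADADDR(.data): load address in flash */
    uint32_t init_data_start;   /* start of .data in RAM (runtime address) */
    uint32_t init_data_end;     /* end of .data in RAM */
} layout_t;

/* Compute the addresses for given section sizes in bytes.
   Assumption: no alignment padding and nothing else placed in flash. */
static layout_t compute_layout(uint32_t text_size, uint32_t data_size)
{
    layout_t l;
    l.text_start = FLASH_ORIGIN;
    /* AT>flash: initial values go in the next free byte after .text */
    l.init_data_values = FLASH_ORIGIN + text_size;
    /* >ram: the variables themselves are addressed from the start of RAM */
    l.init_data_start = RAM_ORIGIN;
    l.init_data_end = l.init_data_start + data_size;
    return l;
}
```

With a hypothetical 0x400-byte .text section and 0x10 bytes of .data, compute_layout(0x400, 0x10) gives INIT_DATA_VALUES = 0x08000400, INIT_DATA_START = 0x20000000 and INIT_DATA_END = 0x20000010.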

The last section of the linker script file (Line 20 onwards) deals with global and static variables that are not explicitly initialized. The normal practice here is to zero out these variables. The compiler marks these symbols as belonging to the “.bss” section (see Wikipedia for the origin of the term). The linker script marks the beginning and end of the bss variables with a BSS_START and a BSS_END symbol.
Initialization code then uses these symbols to clear the memory involved, as shown below:

// zero out the uninitialized global/static variables
	dest = &BSS_START;
	len = &BSS_END - &BSS_START;
	while (len--)
		*dest++ = 0;
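
The two initialization steps can also be tried out on a host machine. The sketch below is an assumption-laden stand-in rather than real startup code: fake_flash and fake_ram are hypothetical arrays that play the role of the two memory regions, and the helper names copy_init_data, zero_bss and init_memory_demo are invented for illustration. On a real target the arguments would come from the linker symbols instead.

```c
/* Copy the .data initial values from their load address to RAM.
   dest..dest_end is the runtime range [INIT_DATA_START, INIT_DATA_END). */
static void copy_init_data(const unsigned char *src,
                           unsigned char *dest,
                           const unsigned char *dest_end)
{
    while (dest < dest_end)
        *dest++ = *src++;
}

/* Zero the range [start, end), i.e. [BSS_START, BSS_END). */
static void zero_bss(unsigned char *start, const unsigned char *end)
{
    while (start < end)
        *start++ = 0;
}

/* Hypothetical stand-ins: 4 bytes of initial values in "flash",
   and 8 bytes of "RAM" pre-filled with junk (0xAA). */
static const unsigned char fake_flash[4] = { 1, 2, 3, 4 };
static unsigned char fake_ram[8] = {
    0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA
};

/* Treat the first 4 bytes of fake_ram as .data and the last 4 as .bss. */
static void init_memory_demo(void)
{
    copy_init_data(fake_flash, fake_ram, fake_ram + 4);
    zero_bss(fake_ram + 4, fake_ram + 8);
}
```

After init_memory_demo() runs, the first four bytes of fake_ram hold the values copied from fake_flash and the remaining four are zero. The helpers compare against an end pointer rather than counting down a length; either form does the same job.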

Our memory image now looks like this:

[Figure linkerscr3: the memory image with the .bss section added]
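
As a recap, the sketch below shows how typical C declarations map onto the three sections discussed here. The variable and function names are hypothetical, and the exact placement depends on the toolchain and its options (for example, some compilers put constants in a separate read-only section such as .rodata rather than in .text).

```c
/* Explicitly initialized: typically placed in .data.
   The value 5 is stored in flash and copied to RAM at startup. */
int initialized_counter = 5;

/* Not explicitly initialized: typically placed in .bss
   and zeroed by the startup code. */
int uninitialized_total;
static int call_count;

/* Constant data: stays in flash alongside the code. */
const char greeting[] = "hello";

/* Executable code: placed in .text. */
static int bump(void)
{
    call_count++;
    return initialized_counter + uninitialized_total + call_count;
}
```

If the startup code skipped the copy or the zeroing step, initialized_counter, uninitialized_total and call_count would hold whatever happened to be in RAM at power-up, and bump() would return an unpredictable value.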

Reference: RedHat guide to linker scripts.
