Cortex-M0 Boot

Still alive and kicking, and with another year of Embedded Development on my back.

My recent tamperings have been about creating a bootloader for a Cortex-M0 µ-processor that performs firmware update either from UART or SPI.

There was an interesting bit on how to set up the system to have two firmwares running (boot mode and application mode), and that is what I’ll explain in this post: how to set up a project to build a boot and successfully run your main application.

Before Starting

This post and code are prepared for a specific Cortex-M0: an STM32F071 µ-controller from ST. Since ARM cores are manufactured in different ways and flavours, this procedure may differ on other parts. I actually doubt that this will change, but I have to cover my back 😛 It’s second nature.

STM32F0X – Memory Mapping

The guts of this problem is the memory mapping.

The Cortex-M0 has aliasing memory areas. It will “always” start execution at address 0x0000_0000. But what the address 0x0000_0000 represents can be configured.

So, we can have that address linking FLASH memory, SRAM memory, or SYSTEM memory. Our applications live in FLASH so that’s what we are going to execute by default.
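
To make the aliasing concrete, here is a small host-side sketch (not code for the target) that models which physical address a read from the window at 0x0000_0000 really hits. The enum and function names are mine, and the SYSTEM memory base is an assumption for the F07x family, so check section 2.2 of the reference manual for your part.

```c
#include <stdint.h>

/* Physical base addresses. FLASH and SRAM come from the memory map;
 * SYSTEM_BASE_ADDR is an assumption for STM32F07x parts. */
#define FLASH_BASE_ADDR  0x08000000u
#define SRAM_BASE_ADDR   0x20000000u
#define SYSTEM_BASE_ADDR 0x1FFFC800u  /* assumption: verify in the reference manual */

/* Hypothetical names for the three memories that address 0 can alias. */
typedef enum { ALIAS_FLASH, ALIAS_SYSTEM, ALIAS_SRAM } alias_t;

/* Translate an offset inside the aliased window at 0x0000_0000
 * into the physical address it refers to. */
uint32_t alias_to_physical(alias_t alias, uint32_t offset)
{
    switch (alias) {
    case ALIAS_SYSTEM: return SYSTEM_BASE_ADDR + offset;
    case ALIAS_SRAM:   return SRAM_BASE_ADDR + offset;
    case ALIAS_FLASH:
    default:           return FLASH_BASE_ADDR + offset;
    }
}
```

With the default FLASH mapping, a fetch of the reset vector at 0x0000_0004 actually reads 0x0800_0004. Once SRAM is mapped instead, the same fetch reads 0x2000_0004.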

STM32F0 memory map

Section 2.2 of STM32f071 doc

This information is important to understand some changes that we will need to set in our application.

Building the Bootloader

What we’ll build is a second stage bootloader. That means we are not going to erase the SYSTEM bootloader (provided by ST). We will simply shift our real application somewhere else, and put the Bootloader as the first thing to run.

To build this you just need a default “project” for your ARM core, and when I say project I mean: some code, an appropriate linker file, and a toolchain to build it. You can use whatever you please; in this case I started the project using STCube. At the end of the day, the idea is that your binary has to live at position 0x0000_0000, which by default aliases Flash address 0x0800_0000.

Your linker file has to show this. Additionally, since this application is actually the BOOT you may want to limit its size.

/* Specify the memory areas */
MEMORY
{
FLASH (rx)  : ORIGIN = 0x08000000, LENGTH = 0x2000
...
}

Building the Application

The application build has to be a little bit different. To do so, we’ll simply change its starting memory address that gets assigned on linking. Again, this is a task for the linker script file:

/* Specify the memory areas */
MEMORY
{
FLASH (rx)  : ORIGIN = 0x08002000, LENGTH = 128K - 0x2000
VTRAM (xrw) : ORIGIN = 0x20000000, LENGTH = 0xC0
...
}

You can see from this example that I increased the ORIGIN value of FLASH, so the application gets linked starting from 0x0800_2000. Then we can actually have the BOOTLOADER and the Application living in the same FLASH but not overlapping.

Start       End         Comment
0x08000000  0x08002000  Boot application
0x08002000  0x........  Firmware

Boot memory Diagram

Side Note: It is advisable (if not compulsory) to align the size of your BOOT and the starting address of the application with Flash page boundaries. If later on you want to erase flash pages (as I will be doing for a firmware update), you don’t want a page to hold half boot and half application.
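
A cheap way to enforce this is a compile-time check. The sketch below assumes 2 KB flash pages, which is what I would expect on an STM32F071 (verify it against your reference manual). The macro and function names are made up for the example.

```c
#include <stdint.h>
#include <stdbool.h>

#define FLASH_START     0x08000000u
#define APP_START       0x08002000u  /* must match ORIGIN in the application linker script */
#define FLASH_PAGE_SIZE 0x800u       /* assumption: 2 KB pages */

/* True when addr sits exactly on a flash page boundary. */
bool is_page_aligned(uint32_t addr)
{
    return ((addr - FLASH_START) % FLASH_PAGE_SIZE) == 0u;
}

/* Number of whole pages reserved for the bootloader. */
uint32_t boot_pages(void)
{
    return (APP_START - FLASH_START) / FLASH_PAGE_SIZE;
}

/* Fail the build if the boot region ends in the middle of a page. */
_Static_assert(((APP_START - FLASH_START) % FLASH_PAGE_SIZE) == 0u,
               "application must start on a flash page boundary");
```

With the values used in this post the bootloader owns four pages, and a firmware update can erase everything from the fifth page onwards without touching the boot code.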

Jump to application

So, we built an application that starts at 0x0800_0000 and we call it BOOTLOADER. Its only mission right now is to jump wherever the other application lives (0x0800_2000).

The Cortex-M0 needs two simple things to control the flow of the program:

  • the $pc (program counter), to know which instruction to execute next.
  • the $sp (stack pointer), to know where the stack is.

That’s easy, right? The only thing that the BOOTLOADER needs to know is where this $pc and $sp should point in memory.

To do this correctly we need to understand that, in the binary we linked, address 0x0800_2000 does not contain instructions but the vector table.

The vector table structure for a Cortex-M0 is defined in the Cortex-M0 Devices Generic User Guide. You can check the whole document in ARM infocenter, but I copied the relevant diagram here:

Cortex-M0 vector table

As I highlighted, the first two values of this vector table have the Initial SP value and the Reset value. Reset value being the address of the first instruction to execute.
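
Before jumping, it can be worth checking that those two words look sane, for instance to detect an erased flash. This is a minimal sketch, assuming 16 KB of SRAM at 0x2000_0000 and the flash layout used in this post. `image_looks_valid` and its constants are names I made up, and it takes a pointer so it can be tried on a PC with a fake table.

```c
#include <stdint.h>
#include <stdbool.h>

#define SRAM_START 0x20000000u
#define SRAM_SIZE  0x4000u      /* assumption: 16 KB of SRAM */
#define APP_START  0x08002000u
#define FLASH_END  0x08020000u  /* 128K of flash, as in the linker script */

/* vectors[0] = initial SP value, vectors[1] = Reset handler address. */
bool image_looks_valid(const uint32_t *vectors)
{
    uint32_t sp    = vectors[0];
    uint32_t reset = vectors[1];

    /* The stack pointer must be word aligned and inside SRAM
     * (it may point one past the end, since the stack grows down). */
    if ((sp & 0x3u) != 0u || sp <= SRAM_START || sp > SRAM_START + SRAM_SIZE)
        return false;

    /* Cortex-M only runs Thumb code, so bit 0 of the Reset vector is set. */
    if ((reset & 0x1u) == 0u)
        return false;

    /* The entry point must live inside the application's flash region. */
    return reset >= APP_START && reset < FLASH_END;
}
```

On the target the call would be something like `image_looks_valid((const uint32_t *)0x08002000)`, done before touching the stack pointer. An erased page reads as 0xFFFFFFFF and fails the first check.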

Once we know this, writing C code to jump to the application is fairly trivial. I do a couple of C tricks though.

First, let’s define a type pFunction. This represents a pointer to a function with no return value and no parameters. So if we store the address in a variable of type pFunction, calling it will change the $pc register as expected.

Before doing so, we have to set a correct $sp. Libraries like STCube have specific functions to do so (__set_MSP). To change the value directly we’ll use the register construct forcing a variable to map to the MSP register. This way when a value is assigned to that variable it is actually assigned to the register.


#include <stdint.h>

typedef void (*pFunction)(void);

int boot_main(void){
    uint32_t startAddress, applicationStack;
    /* Named register variable (ARM compiler syntax): writes go straight to MSP */
    register uint32_t __regMSP __ASM("msp");
    pFunction applicationEntry;
    
    //The address where the application is written
    startAddress     = 0x08002000;
    
    //Retrieve values: word 0 is the initial SP, word 1 is the Reset handler
    applicationStack = *(volatile uint32_t*) (startAddress);
    applicationEntry = (pFunction) *(volatile uint32_t*) (startAddress + 4);
    
    /*Set a valid stack pointer for the application */
    __regMSP = applicationStack;
    
    /*Start the application */
    applicationEntry();

    /*Never reached: the application does not return */
    return 0;
}

The Application Vector table

Since we are working on a Cortex-M0 we don’t have the luxury of relocating the vector table wherever we want. But there’s a simple trick that will suffice.

Cortex-M0 expects the vector table to be at address 0x0000_0000, and it can only be there. As discussed earlier, this address can represent different physical memory addresses. Since Flash memory has the BOOTLOADER, and we don’t want to modify SYSTEM memory, we’ll have to put the application vector table in SRAM memory, at 0x2000_0000.

The first thing our application has to do is copy its Vector table to SRAM memory and remap the Cortex-M0 memory, so 0x0000_0000 is actually 0x2000_0000.

In summary, we have to:

  • Disable all interrupts
  • Copy the application ISR vector table to SRAM
  • Modify the SYSCFG_CFGR1 register to map SRAM to 0x0000_0000
  • Re-enable interrupts and continue

I’ll dump here the code to perform this task:

    /**Firmware starts at address different than 0*/
    #define FW_START_ADDR 0x08002000

    /**Force VectorTable to specific memory position defined in linker*/
    volatile uint32_t VectorTable[48] __attribute__((section(".RAMVectorTable")));

    //copy the vector table to SRAM
    void remapMemToSRAM( void )
    {
        uint32_t vecIndex = 0;
        __disable_irq();

        for(vecIndex = 0; vecIndex < 48; vecIndex++){
            VectorTable[vecIndex] = *(volatile uint32_t*)(FW_START_ADDR + (vecIndex << 2));
        }

        __HAL_SYSCFG_REMAPMEMORY_SRAM();

        __enable_irq();
    }

Hell, I've been lazy and used a macro provided by ST. You can try to decipher it by yourself. :-D

I'll just comment that the register we have to modify is SYSCFG_CFGR1, living at memory address 0x4001_0000, and that we have to set its MEM_MODE[1:0] bits. And I know, this is a mouthful.

SYSCFG_CFGR1 on an STM32F071

#define __HAL_SYSCFG_REMAPMEMORY_SRAM() \
 do {SYSCFG->CFGR1 &= ~(SYSCFG_CFGR1_MEM_MODE); \
     SYSCFG->CFGR1 |= (SYSCFG_CFGR1_MEM_MODE_0 | SYSCFG_CFGR1_MEM_MODE_1); \
 }while(0) 
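
Stripped of the HAL names, the macro is a read-modify-write of the two lowest bits of CFGR1. The sketch below does the same on a plain integer so it can be checked off-target. The mask values reflect my reading of the register figure (MEM_MODE in bits 1:0, with 0b11 selecting SRAM) and the function names are hypothetical.

```c
#include <stdint.h>

#define MEM_MODE_MASK 0x3u  /* MEM_MODE[1:0] */
#define MEM_MODE_SRAM 0x3u  /* 0b11: embedded SRAM mapped at 0x0000_0000 */

/* Return the CFGR1 value with MEM_MODE switched to SRAM,
 * leaving every other bit untouched. */
uint32_t cfgr1_with_sram_remap(uint32_t cfgr1)
{
    cfgr1 &= ~MEM_MODE_MASK;  /* clear the field first */
    cfgr1 |= MEM_MODE_SRAM;   /* then select SRAM */
    return cfgr1;
}

/* Extract the current MEM_MODE field. */
uint32_t cfgr1_mem_mode(uint32_t cfgr1)
{
    return cfgr1 & MEM_MODE_MASK;
}
```

On the target, remember that the SYSCFG peripheral clock has to be enabled before the register can be written. Reading MEM_MODE back afterwards is a quick way to confirm the remap took effect.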

For the remapMemToSRAM function to work, we must ensure that the SRAM has some reserved space for our vector table. We do so in the SECTIONS block of the linker script:

SECTIONS {
  /*Other sections not related with VTRAM may go here before the RAMVectorTable definition*/

  .RAMVectorTable(NOLOAD):
  {
    KEEP(*(.RAMVectorTable))
  } > VTRAM
}
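
The 0xC0 length of VTRAM is not arbitrary: a Cortex-M0 table has 16 system entries plus up to 32 interrupts, 48 words in total. Here is a small sketch of that arithmetic, with names of my own choosing, that also breaks the build if the table ever outgrows the reserved region.

```c
#include <stdint.h>

#define CORE_VECTORS 16u   /* initial SP, Reset, NMI, HardFault, ... SysTick */
#define IRQ_VECTORS  32u   /* Cortex-M0 supports up to 32 external interrupts */
#define VTRAM_LENGTH 0xC0u /* LENGTH of VTRAM in the linker script */

/* Size in bytes of a vector table with the given number of entries. */
uint32_t vector_table_bytes(uint32_t entries)
{
    return entries * (uint32_t)sizeof(uint32_t);
}

/* Fail the build if the table no longer fits the reserved region. */
_Static_assert((CORE_VECTORS + IRQ_VECTORS) * sizeof(uint32_t) <= VTRAM_LENGTH,
               "VTRAM is too small for the vector table");
```

I would also double-check that the regular RAM region in the linker script starts after VTRAM, otherwise .data or .bss could end up on top of the table.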

Compile, link, create .hex (or whatever binary file your program uses) and GO!

Conclusions

It’s been a while since I wrote anything remotely long. I forgot how much time it takes.

Writing about low level C embedded software is also a little bit trickier than writing about higher level languages. I want to be precise and concise, but the more concise I get the more confusing the writing sounds. I hope that over time I can achieve better writing skills.

Hope that this post is useful for somebody, and if not, it has been a fun topic for me to write about.

Anyway, this is quite a niche topic.

References

ARM infocenter
STM32F071 reference manual

15 responses to “Cortex-M0 Boot”

  1. Jeff

    Hi Jordi,

    I really appreciated this post of yours, thank you very much! I do have a small question for you. I do not want to reset my STM32F071 to preserve my IO states, and I am trying to jump “back” to my bootloader. Am I right to say that all I need to do for the vector table to work again is to call the __HAL_SYSCFG_REMAPMEMORY_FLASH() function at the start of my main()? Everything seems to work for me, except that my systick does not want to run when I jump back to the bootloader from my main application.

    Any help on this topic would be greatly appreciated.

    Jeff

    • kxtells

      Jeff! Thanks for your comment.

      Yes, you are quite right. The only thing you should do here is remap to FLASH, either after jumping to your bootloader, or even before, if you don’t need to service any other ISRs. I was doing something similar in my project.

      Systick wise, there should be no problem, it is simply another interrupt. Do not forget to enable/disable interrupts when performing jumps between boot and firmware. Additionally, note that if you are using a global counter or similar it will have a different value when in bootloader mode and application mode as long as you don’t specify a “shared memory” region.

      I’m not sure if my response helped you at all :-p

      • Jeff

        Jordi,

        Thank you very much for your quick response and info. After omitting the DMA used by my I2C the problem went away, which kind of makes sense if there is a problem with it. What the DMA issue is, I do not know, but that is a problem for another day. At least my “jumping” up and down is working flawlessly and I can remote update my firmware.

  2. Guillem

    Thanks for spending some time explaining it step by step. It’s often when you write it and when you want to be accurate when the time goes by really quick.
    Gràcies 🙂

  3. Hi, your post is very interesting but I have a problem with my STM32F091RCTx M0 board.
    Using this piece of code

    #define APPLICATION_ADDRESS (uint32_t)0x08004000
    volatile uint32_t *VectorTable = (volatile uint32_t *)0x20000000;
    uint32_t ui32_VectorIndex = 0;
    for(ui32_VectorIndex = 0; ui32_VectorIndex < 48; ui32_VectorIndex++) {
    VectorTable[ui32_VectorIndex] = *(__IO uint32_t*)((uint32_t)APPLICATION_ADDRESS + (ui32_VectorIndex << 2));
    }
    [...]
    CubeMX -> SYS -> TimeBase Source -> SysTick
    If I choose this configuration
    CubeMX -> SYS -> TimeBase Source -> TIM1
    My firmware does not start. Have you got any suggestion for me?

  4. Pingback: STM32F0: Bootloader – Bytes On The Rocks

  5. İpek Peşkircioğlu

    Hello Jordi
    Thanks for this clear explanation. I am a beginner in this field, I read so many documents but after reading this I start to understand things. So thanks a lot. I have a question. Correct me if I am wrong, here is what I understood so far: There is a vector table at address 0x0000_0000, the boot loader uses that vector table, but our application needs a second vector table, otherwise in case of an interrupt the program will jump back to the boot loader, right? So we create a new vector table by copying the original one to the SRAM? But in the case of the first vector table (of the boot loader) you adjusted the stack pointer and program counter, but you didn’t do the same thing for the one in the SRAM? How would it know where to execute the program?
    Thanks,
    Bests,
    İpek

    • kxtells

      Hi İpek.

      yes, that’s right. We don’t want the application jumping to bootloader interrupt handlers.

      Long story short, when we are copying the vector table to SRAM the application is already running, so it already knows where to execute the program.

      Long story long:
      The PC, and MSP for the application are already set by our bootloader, so the application does not have to set anything specific because everything is already properly set.

      Imagine that we have the following in memory when starting bootloader execution:
      0x0800 0000 (bootloader vector table)
      * Reset = A
      * Initial SP Value = B

      0x0800 2000 (Application vector table)
      * Reset = K
      * Initial SP Value = W

      0xSRAM (our sram vector table before running application)
      * Reset = Garbage
      * Initial SP Value = Garbage

      Registers (at the start of bootloader running)
      * PC = A
      * MSP = B

      Then the bootloader has to jump to the application, so it properly sets the needed registers:

      Registers (just jumping to application)
      * PC = K
      * MSP = W

      The application runs its starting code (that is, copying the vector table and remapping, this leaves us with the SRAM with a proper vector table, that is a copy of whatever we had in 0x0800 2000)

      0xSRAM (our sram vector table after running application)
      Reset = K
      Initial SP Value = W

      You don’t have to copy again those registers, those are already set, otherwise the application would not be running 🙂

  6. Justin Squire

    Hi there,

    I found your post very helpful in understanding these concepts better! One thing I was a little confused about was whether you are building two different projects, or building them all in one project.

    I am also trying to make some custom bootloader code for a stm32F0 chip.

    For example, I am using stm32cubeIDE, and I am under the impression that I should create a project just dedicated to the bootloader part, which will live at 0x0800_0000, and then create another project for the application specific code that will live in a different part of flash (controlled by my settings in the linker script). Does this sound right?

    Are you willing to provide the code for how you made the complete bootloader? And the basics of the application side of the code?

    Thanks again.

    • kxtells

      Hello Justin,

      I can’t provide the full code since it’s proprietary code from one of the companies I worked for, but the basic ideas are free for all 😉

      In short, yes, you have to set-up two projects since those are two different application binaries (bootloader and main-application). You could of course end up creating your own Makefile to generate those two binaries with a simple make call instead of calling it twice, but that’s just a convenience.

      Hope this clarifies your doubts 😉

  7. awvreenen95

    I’m trying to redo this with the STM32F051t6. The program jumps correctly but interrupts refuse to work. I have push buttons (PA0 – 3) set up as interrupts. When I use the STM32 Workbench to debug the application at address 0x08003000 the interrupts work fine until I reset the chip and do the jump from the bootloader.

    I am using STM32 Workbench in conjunction with MXCube and the HAL drivers.

    • kxtells

      From totally out-of-the loop now, I would say check your ISER while running your program. If no interrupt is coming forward it probably means that all interrupts are disabled, or your expected push-button interrupts are not enabled. So I’m thinking that maybe some setup code is not being run for your application.

      Programming manual section 4.2.2

      There’s not much more I can say really 😀 Good luck!

      • Thanks for the reply. It turns out that the issue was because of a pending interrupt. I failed to disable it, which is ok since I was just using the external interrupts (push buttons) to test my custom bootloader jumping.

        Another problem I’ve encountered is that when I jump to the system bootloader at 0x1FFFEC00, I get a verification error when writing my program over UART (as per AN2606). The verification and flashing are done via a python script. When I enter the system bootloader via the physical pin I correctly flash and verify the program, but when I enter it via a jump from my custom bootloader it fails to verify (incorrect data read out). I’d appreciate any advice you have on this.

      • Forgot to mention that it does write to flash correctly because the program runs as it should (as far as I’ve tested)
