Measuring GD32VF103 Interrupt response time

Following on from the previous article that talked about the ECLIC in the GD32VF103 RISC-V microcontroller I decided to measure its interrupt response. The example works like this: Configure a port pin to generate an interrupt when it goes through a high-low transition. The interrupt handler then drives the pin back high again.
The main body of this example consists of a loop which drives the pin low thus triggering an interrupt. The interrupt handler sends it high again. This is done in a loop which incorporates a small delay to facilitate measurement. The GPIO pin in question is Port C bit 13 which happens to control the red LED on the Sipeed Longan Nano. The results are shown in the figure below.
GD32VF103IRQResponseTime

As can be seen, it takes 416.667ns to drive the pin high again. This corresponds to 45 clock cycles at 108Mhz which is the time it takes to save the CPU registers, determine the cause of the interrupt, jump to the handler and perform a standard function entry prologue (set up stack etc). Not too shabby 🙂

Code is available over on Github

Interrupts and the GD32VF103

Interrupt handling is a core part of embedded systems and different architectures have different ways of dealing with it. The RISC-V Bumblebee core in the GD32VF103 uses an interrupt controller called the Enhanced Core-Local Interrupt Controller (ECLIC). All interrupts (internal and external) are handled by the ECLIC. The ECLIC handles prioritization and level/edge triggering of interrupts.The documentation on the ECLIC is pretty poor at the moment. The Bumblebee core datasheet is formatted very badly (fonts all over the place, broken diagrams etc) and it was only by combining this with the assembler provided in the GD32VF103 that I began to get a clearer picture of what is going on.

The ECLIC has two modes of dealing with interrupts : vectored and non-vectored. Vectored interrupt handling is similar to other MCU’s such as the ARM, MSP430 etc. The CPU receives interrupt request ‘N’ and finds the interrupt handler by looking up entry ‘N’ in the interrupt vector table. Saving of registers (context saving) is left to the writer of the interrupt handler and on exit the handler often must execute a return from interrupt instruction (though not for all architectures).
Non-Vectored interrupt handling is quite different. All interrupt requests cause the same interrupt handler code to be executed initially. This code saves the CPU registers, finds out which interrupt happened and calls the handler pointed to by the appropriate interrupt vector table entry. This happens using a standard subroutine/function call instruction. The handler deals with the interrupt and executes a normal function return instruction. The registers are then restored and a return from interrupt instruction is executed.

The advantage of this approach is that the code to preserve/restore the registers is done (safely) and only needs to occur once in memory. Also, the interrupt handlers can be written in C or C++ without the need to tag on unusual attributes like “interrupt” etc. The potential disadvantage is that maybe the handling of interrupts is a little slower than it might have been using the vectored approach because you end up saving and restoring all CPU registers – whether you use them or not.

Excerpt from ~/.platformio/packages/framework-gd32vf103-sdk/RISCV/env_Eclipse/entry.S

.weak irq_entry
irq_entry: // -------------> This label will be set to MTVT2 register
  // Allocate the stack space
  

  SAVE_CONTEXT// Save 16 regs

  //------This special CSR read operation, which is actually use mcause as operand to directly store it to memory
  csrrwi  x0, CSR_PUSHMCAUSE, 17
  //------This special CSR read operation, which is actually use mepc as operand to directly store it to memory
  csrrwi  x0, CSR_PUSHMEPC, 18
  //------This special CSR read operation, which is actually use Msubm as operand to directly store it to memory
  csrrwi  x0, CSR_PUSHMSUBM, 19
 
// ****************** THIS IS WHERE THE JUMP TO THE INTERRUPT HANDLER FUNCTION HAPPENS! **************

service_loop:
  //------This special CSR read/write operation, which is actually Claim the CLIC to find its pending highest
  // ID, if the ID is not 0, then automatically enable the mstatus.MIE, and jump to its vector-entry-label, and
  // update the link register 
  csrrw ra, CSR_JALMNXTI, ra 
  
  //RESTORE_CONTEXT_EXCPT_X5

  #---- Critical section with interrupts disabled -----------------------
  DISABLE_MIE # Disable interrupts 

  LOAD x5,  19*REGBYTES(sp)
  csrw CSR_MSUBM, x5  
  LOAD x5,  18*REGBYTES(sp)
  csrw CSR_MEPC, x5  
  LOAD x5,  17*REGBYTES(sp)
  csrw CSR_MCAUSE, x5  


  RESTORE_CONTEXT

  
  // Return to regular code
  mret

The comments just before the instruction csrrw ra, CSR_JALMNXTI, ra are a little obscure but their sense is clear enough: If there is a pending interrupt the ECLIC will execute a call to its interrupt handler and will update the Link Register so that when the handler executes a return from subroutine instruction control will return to the next line in the irq_entry.
The macros SAVE_CONTEXT, RESTORE_CONTEXT are defined elsewhere in the startup source code. The constant CSR_JALMNXTI evalautes to 0x7ed – a register number in the ECLIC. According to the Bumblebee core data sheet this register is
“The custom register is used to enable the ECLIC interrupt. The read operation of this register can process the next interrupt and return the entry address of the next interrupt handler. Jump to this address.”.

This is the mechanism by which the developer supplied interrupt handler is called.
How does the irq_entry code get called in the first place? Looking at the code for .platformio/packages/framework-gd32vf103-sdk/RISCV/env_Eclipse/start.S we find the following code that executes during boot:

_start:

        csrc CSR_MSTATUS, MSTATUS_MIE
        /* Jump to logical address first to ensure correct operation of RAM region  */
    la          a0,     _start
    li          a1,     1
        slli    a1,     a1, 29
    bleu        a1, a0, _start0800
    srli        a1,     a1, 2
    bleu        a1, a0, _start0800
    la          a0,     _start0800
    add         a0, a0, a1
        jr      a0

_start0800:

    /* Set the the NMI base to share with mtvec by setting CSR_MMISC_CTL */
    li t0, 0x200
    csrs CSR_MMISC_CTL, t0

        /* Intial the mtvt*/
    la t0, vector_base
    csrw CSR_MTVT, t0

        /* Intial the mtvt2 and enable it*/
    la t0, irq_entry
    csrw CSR_MTVT2, t0
    csrs CSR_MTVT2, 0x1

Note the instructions la t0, irq_entry , csrw CSR_MTVT2, t0 . These tell the ECLIC where to find the irq_entry code. The Bumblebee core reference manual defines this register CSR_MTV2 (0x7ec) as
“Custom registers are used to set non-vector interrupt handling Mode interrupt entry address” .

So, there we have it. The picture below shows my rough understanding of the process.
ECLIC
I have written up two more demo programs: One which uses the internal clock cycle timer interrupt (mtimecmp) (Systick interrupt), the other uses Timer 6 as an external source of timer interrupts (TimerIRQ). Code is available over on Github

Longan-nano RISC-V and PlatformIO

longan_patchwork

The Longan-nano board shown above includes a microcontroller with a RISC-V core. The chip seems to be very similar to the STM32F103C8T6 with the exception that the ARM-Cortex M3 core has been swapped out for a RISC-V Bumblebee core called the GD32VF103CBT6. This little development kit has an Arduino nano form factor and also sports a 160×80 full colour display, an RGB LED and a micro SD-card socket. This particular one came from Seeed Studio at a cost of $4.90 + shipping.

There are two ways of programming this chip: You can use a JTAG debugger (various kinds supported) or you can put the chip into DFU (Device Firmware Update) mode by holding the Reset and Boot button down, releasing the Reset button first. This allows you to program the board using a simple USB-Serial converter. I’ve gone with this initially but will definitely be exploring the details of the RISC-V core with a proper debugger shortly.

Developing code
The documentation for this chip pushes you towards using PlatformIO for development. This is an extension for Visual Studio Code – another first for me. You first install VSCode and then add in the PlatformIO extension (I just followed the online guide and it all just worked 🙂 )

There is good support for the GD32VF103CBT6 within PlatformIO and so there is no problem with setting up environment variables, directories and so on. I chose to develop in C++ which caused me a slight problem with the gd32vf103.h header file. This is intended for use with C projects and includes a definition the bool type. This causes a problem for C++ as it already has such a type. If you edit gd32vf103.h as follows you can fix this (around line 180 look for the enum declaration called bool)

#ifndef __cplusplus
typedef enum {FALSE = 0, TRUE = !FALSE} bool;
#endif

One more thing: when you include this file in your C++ files be sure to surround the include statement with “extern C” as shown below:

extern "C" {
#include "gd32vf103.h"
}

If you don’t do this you run into all sorts of problems with C++ decorated names.

And along came a gotcha
I have to admit that I have been a little lazy when it comes to function return values over the years. Let’s say you write an I/O function that may, in some future design, return an error code – as the project is only starting however you have not written that code yet.
So, your code might look like this:

int test()
{
    int X;
    X = 1;
    /* etc */
}

Note the missing return statement – after all, you haven’t written that code yet. Well, this runs fine on ARM Cortex MCU’s I’ve used, and, also on x86. On RISC-V/GCC9.2 this crashes your program. I spent a while scratching my head over this one. The C++ standard apparently states that behavior in this example is “undefined”. So, be warned: If you state that you are going to return a value then do!

Demo code
I’ve started a repository on github with some examples (multi-colour blinky and a graphics demo for now). You can view it here

NRF905 Revisited

I have worked with the NRF905 radio transceiver in the past as part of a weather station. At the time I wasn’t entirely happy about the code for the radio as it felt a little hacky. Like many others before me I vowed “This time I’m going to do it right” 🙂 so I have set about rewriting the radio part in a more object oriented way. I have put together a transceiver pair as shown below and I intend working on the code to simply the NRF905 application interface as much as I can. Right now there is little abstraction going on – that needs to change.
The transceiver pair consists of a couple of NRF905’s wired to some STM32F030’s mounted on breakout boards. These are wire-wrapped together and a full schematic will eventually be posted along with the code over on github


Update 19-Nov-2019
I have updated the code on github with a tighter interface to the nrf905 class. I also did an initial range test and was getting error free line of sight range of about 200m @ 10dBm. The transmitter was on top of a car and the receiver was at about 1m above the ground. Transmitter and receiver had 1/4 wave monopole antennae. A longer range is possible if you are not concerned about the odd missing packet.

Added support for the ST7789 display module to Breadboard games

Breadboard games has been based on parallel interface displays so far. These displays have also had resistive touch screen interfaces. I have a number of ST7789 displays left over from Dublin Maker so, once again on a cross country train journey I had my breadboard, logic analyzer and laptop out on the table (making the other passengers a little nervous I suspect :). The result: an ST7789 port of Breadboard games. These displays don’t have touch input so the sprite designed doesn’t work however it does take about 2 euro off the price bringing it down below €5 per kit.
Code is available over here

A measurement of the performance of Rust vs C.

This post follows on from my previous one which was an opening foray into the world of Rust. Following a comment I received I decided I would look at the relative performance of Rust and C for the trivial example of blinky running as fast as the CPU can do it (using the default reset clock speed). This involved modifying the C and Rust code.
The modified main.c
The modified code for main.c is shown below. The main differences between this and the version from my previous post are:
#defines used to define the addresses of the I/O registers. This means that the handling of pointer indirection happens at compile time – not run time. Also, it means that RAM is not used to represent the pointers at run-time.
The BSR and BRR (bit-set and bit reset registers) are used for setting and clearing bit 13. This produces smaller and faster code as there is no software bitwise AND/OR involved at run-time.
Two build changes:
optimization is turned up to level 2 by passing the -O2 argument to arm-none-eabi-gcc
debugging information is removed (building for release)

/* User LED for the Blue pill is on PC13 */
#include  // This header includes data type definitions such as uint32_t etc.
#define rcc_apb2enr (  (volatile uint32_t *) 0x40021018  )
#define gpioc_crh  (  (volatile uint32_t *) 0x40011004  )
#define gpioc_odr (  (volatile uint32_t *) 0x4001100c  )
#define gpioc_bsr (  (volatile uint32_t *) 0x40011010  )
#define gpioc_brr (  (volatile uint32_t *) 0x40011014  )


// The main function body follows 
int main() {
    // Do I/O configuration
    // Turn on GPIO C
    *rcc_apb2enr |= (1 << 4); // set bit 4
    // Configure PC13 as an output
    *gpioc_crh |= (1 << 20); // set bit 20
    *gpioc_crh &= ~((1 << 23) | (1 << 22) | (1 << 21)); // clear bits 21,22,23
    while(1) // do this forever
    {
        *gpioc_bsr = (1 << 13); // set bit 13       
         *gpioc_brr = (1 << 13); // clear bit 13     
    }
}
// The reset interrupt handler
void reset_handler() {
    main(); // call on main 
    while(1); // if main exits then loop here until next reset. 
}

// Build the interrupt vector table
const void * Vectors[] __attribute__((section(".vector_table"))) ={
	reset_handler
};

This code is built and gdb reports it's load size as 76 bytes when loaded onto the STM32F103. Executing the code we see the port pin behave as follows:

rust_blink_speed
I’m not clear on why there is a variation of cycle length every other cycle – maybe its as a result of the sampling on my cheap logic analyzer but the important point here is that the waveform is exactly the same for both C and Rust versions of this program so there is no performance hit in tight loops such as this.

The Rust version
The Rust version of the program is shown below. This is different to the version in my previous post as it uses the stm32f1 crate. The example is taken from a Jonathan Klimt’s blog over here. There’s quite a lot going on under the surface in this crate but crucially this all seems to happen at compile time because the output code is just as efficient as the C code from a speed perspective although it is a good deal bigger : 542 bytes vs C’s 76. Why the difference? There are several reasons but one big chunk is accounted for by the fact that the Rust version has a fully populated interrupt vector table which takes up 304 bytes. The C-version only includes the initial stack pointer and the Reset vector : a total of 8 bytes.

// std and main are not available for bare metal software
#![no_std]
#![no_main]

extern crate stm32f1;
extern crate panic_halt;
extern crate cortex_m_rt;

use cortex_m_rt::entry;
use stm32f1::stm32f103;

// use `main` as the entry point of this application
#[entry]
fn main() -> ! {
    // get handles to the hardware
    let peripherals = stm32f103::Peripherals::take().unwrap();
    let gpioc = &peripherals.GPIOC;
    let rcc = &peripherals.RCC;   

    // enable the GPIO clock for IO port C
    rcc.apb2enr.write(|w| w.iopcen().set_bit());
    gpioc.crh.write(|w| unsafe{
        w.mode13().bits(0b11);
        w.cnf13().bits(0b00)
    });

    loop{
        gpioc.bsrr.write(|w| w.bs13().set_bit());        
        gpioc.brr.write(|w| w.br13().set_bit());                
    }
}

The blinky loops in assembler
The version produced by the C code is shown below. r1 and r2 have been previously set up to point at the BSR and BRR registers; while r3 contains the mask value (1 << 13).

   0x08000034 :    str     r3, [r1, #0]
   0x08000036 :    str     r3, [r2, #0]
   0x08000038 :    b.n     0x8000034 

The version produced by the Rust code is below. In this case, r0 is set to point at the data structure that manages Port C and the offsets used in the store instructions are used to select the BSR and BRR registers. Register r1 has been preset to the mask value (1 << 13).

   0x0800017a :    str     r1, [r0, #12]
   0x0800017c :    str     r1, [r0, #16]
   0x0800017e :    b.n     0x800017a 

Final thoughts
Rust is considered to be a safer language to use than C. This benefit must come at some cost however. I’ve seen here that the cost is not in execution speed so where is it? Well, the C version is completely built in 168ms while the Rust version takes over 2 minutes – note this is a “once-off” cost in a way; incremental builds are nearly as quick as the C version. A full rebuild only happens if you build after a “cargo clean” command or if you modify the Cargo.toml file (which changes the build options).
Does this encourage me to pursue Rust a little more – definitely yes and with a more complex project that might reveal the greater safety of Rust.

Rust and C side-by-side on the STM32F103C8T6 (“Blue pill”) board

rustybluepill2
For most people, the first program they write for an MCU is “blinky”. This post explores blinky in C and Rust for the STM32F103C8T6 Blue pill board. This board has a built-in LED on port C bit 13. Programs are downloaded via an STLink-V2 SWD debug interface clone.

What does the program do?

When you power on an STM32F103 it looks to the lower addresses in FLASH memory to figure out what to do. The first entry (32 bit) should contain the value for the initial stack pointer. This is typically set to the top of RAM. The next entry is the address of the code that handles reset : the reset_handler.
Typically, the job of the reset handler is to initialize global and static variables and maybe perform some clock initialization. When this is done it then calls on the “main” function although this is not essential : you can write all your code in the reset handler. I prefer to use the main function.
In order to make the LED blink the main function must configure the appropriate GPIO pin as an output (Port C, bit 13). Having done this, the program enters an endless loop which does the following:
Set port C bit 13 high (turning on the LED)
Wait for a while (so the user can see the LED change)
Set port C bit 13 low (turning off the LED)
Wait for a while (so the user can see the LED change)
And that’s it. Now lets look at the C and Rust ways of doing this. Note: these are very much minimal programs that seek to bridge the gap between the higher level languages and the hardware. They don’t necessarily represent a recommended programming style for complex systems.

Blinky in C : memory layout
memory_layout

The figure above shows how the linker script file (linker_script.ld) lays out the memory image output by the linker. Flash memory starts at address 0x08000000. The initial stack pointer value is set by the line:

          LONG(ORIGIN(RAM) + LENGTH(RAM));

which evaluates to 0x20005000.
The interrupt vector table is then placed immediately after this – the reset vector being the first and only entry in this case. The text section contains code, rodata contains constants and so on. The ARM.exidx section contains data that can be used during debugging to perform a stack backtrace. The last section is not really relevant to this example but is included for completeness. This is used to help initialize global and static data.

Blinky in C : code

/* User LED for the Blue pill is on PC13 */
#include  // This header includes data type definitions such as uint32_t etc.

// Simple software delay.  The larger dly is the longer it takes to count to zero.
void delay(uint32_t dly) {
    while(dly--);
}
// GPIO configuration
void config_pins() {
    // Make pointers to the relevant registers
    volatile uint32_t * rcc_apb2enr = (  (volatile uint32_t *) 0x40021018  );
    volatile uint32_t * gpioc_crh = (  (volatile uint32_t *) 0x40011004  );
    
    // Turn on GPIO C
    *rcc_apb2enr |= (1 << 4); // set bit 4
    // Configure PC13 as an output
    *gpioc_crh |= (1 << 20); // set bit 20
    *gpioc_crh &= ~((1 << 23) | (1 << 22) | (1 << 21)); // clear bits 21,22,23
}
void led_on() {
    // Make a pointer to the output data register for port C
    volatile uint32_t * gpioc_odr = (  (volatile uint32_t *) 0x4001100c  );
    *gpioc_odr |= (1 << 13); // set bit 13
}

void led_off() {
    // Make a pointer to the output data register for port C
    volatile uint32_t * gpioc_odr = (  (volatile uint32_t *) 0x4001100c  );
    *gpioc_odr &= ~(1 << 13); // clear bit 13
}
// The main function body follows 
int main() {
    // Do I/O configuratoin
    config_pins(); 
    while(1) // do this forever
    {
        led_on();
        delay(100000);
        led_off();
        delay(100000);
    }
}
// The reset interrupt handler
void reset_handler() {
    main(); // call on main 
    while(1); // if main exits then loop here until next reset. 
}

// Build the interrupt vector table
const void * Vectors[] __attribute__((section(".vector_table"))) ={
	reset_handler
};

Let's start at the bottom of the code: Here you can see the interrupt vector table being defined. It begins with:

    const void * Vectors[] __attribute__((section(".vector_table"))) ={

What does this mean:
Vectors is a an array of constant pointers to undefined types. It is given the linker section attribute ".vector_table" which places it at the appropriate place in the program memory image. There is just one element in this array: reset_handler which represents the address of that function. So, when reset or power up happens, the first code to be executed will be at the address “reset_handler”. In this example, the reset_handler code simply calls on main. If by some chance the main function exits, reset_handler then enters an empty endless loop.
The main function calls on lower level functions : config_pins, led_on, delay and led_off.

The job of config_pins is to set up GPIO Port C, bit 13 as an output. Note the way pointers to the hardware registers are created. This mechanism is less than ideal because the pointer takes up valuable RAM. It is more usual to use #define macros for these pointers instead but this approach is taken here so that the C code lines up with the Rust code more closely.
The functions led_on, led_off behave in a similar way. The delay function implements a simple software delay loop. The program is built with the following command line (this is in a “batch” file in the github repository called build.bat – this batch file extension is chosen so that it can be directly executed on Windows as well as Linux)

    arm-none-eabi-gcc -static -mthumb -g -mcpu=cortex-m3 *.c -T linker_script.ld -o main.elf -nostartfiles 

This invokes the arm gcc compliler, performs static linking (no dll’s), generates “thumb” code, with debugging information for the arm-cortex-m3 core. All C files in the current directory are included. The linker is instructed to use this particular linker script, the output program will be called main.elf and the linker is instructed not to include any default initialization code (the reset_handler function does this). You must have arm-none-eabi-gcc installed on your system and reachable via the PATH environment variable.

Blinky in Rust
Rust uses a linker file just like C – in fact it is identical apart from a couple of minor points:
1) It is called memory.x (which is in line with other examples on the Internet)
2) It doesn’t contain the section relating to the initialization of global and static variables.

While the C version of the program sits entirely in one directory, the Rust version is distributed across a number of sub-directories. This is because the Rust build tool Cargo is used. The directory tree is as follows:
cargo_directories

This is obviously WAY more complex however you really only have to concern yourself with the highlighted files.

The config file in the “.cargo” directory contains the following:

[target.thumbv7m-none-eabi]
rustflags = ["-C", "link-arg=memory.x"]

[build]
target = "thumbv7m-none-eabi"

This tells the rust compiler the target architecture : thumbv7m (Cortex m3), none (no operating system), eabi : defines the function calling convention, the way data is organized in memory and the object file format as embedded application binary interface. The rustflags setting causes the linker to use the linker script file memory.x when generating the program image.

The file Cargo.toml contains the following


[package]
name = "slightly_rusty"
version = "0.1.0"


[profile.release]
# enable debugging in release mode.
debug = true

This simply names the output executable file name, specifies the version number and causes debugging data to be included in the release version (can be handy while testing)

The memory.x linker script file has been discussed above.

The rust code for this project is included in main.rs. It is as follows:

#![no_main]
#![no_std]

use core::panic::PanicInfo;

fn delay(mut dly : u32) {
    
    while dly > 0
    {
        dly = dly -1;
    }
}
// GPIO configuration
fn config_pins() {
   unsafe {
        // Make pointers to the relevant registers
        let rcc_apb2enr  = 0x40021018 as *mut u32;
        let gpioc_crh    = 0x40011004 as *mut u32; 
                
        // Turn on GPIO C
        *rcc_apb2enr |= 1 << 4; // set bit 4
        // Configure PC13 as an output
        *gpioc_crh |= 1 << 20;  // set bit 20
        *gpioc_crh &= !((1<<23) | (1<<22) | (1 << 21)); // clear bits 21,22,23
    }
}

fn led_on() {
    unsafe {
        // Make a pointer to the output data register for port C
        let gpioc_odr  = 0x4001100C as *mut u32; 
        *gpioc_odr |= 1 << 13; // set bit 13
    }
}

fn led_off() {
    unsafe {
        // Make a pointer to the output data register for port C
        let gpioc_odr  = 0x4001100C as *mut u32; 
        *gpioc_odr &= !(1  ! = reset_handler;




// Rust requires a function to handle program panics - this one simply loops.  You 
// could perhaps write some code to output diagnostic information.
#[panic_handler]
fn panic(_panic: &PanicInfo) -> ! {
    loop {}
}

The code is pretty similar to the C code so let’s just look at some of the differences.
First of all, we see the #![no_main] macro. This tells rust that there is no default startup function. Without this the compiler generates an error saying “error: requires `start` lang_item” which presumably means it can’t find the
default entry point for this program – not a problem here as the reset_handler is the entry point.
The #![no_std] macro tells the rust ( and the linker) not to include the rust standard library as it is not implemented (at least not fully) for the thumbv7m-none-eabi target.

The next line: use core::panic::PanicInfo is a bit like a C-include. It includes the definition for the type “PanicInfo” which is needed by the panic handler at the end of the program. The rust functions that follow have obvious parallels with their C counterparts. The key differences are:
unsafe : this keyword suspends some of Rust’s compile time memory/type safety checking to allow raw pointers to be used.
(unsafe reference : https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html). The data types are quite similar to those defined in stdint.h : uint32_t = u32 etc. Pointer behaviour is similar to C also. Consider this line from config_pins:

  let rcc_apb2enr  = 0x40021018 as *mut u32;

This declares rcc_apb2enr as a pointer to the address 0x40021018 which contains a changeable unsigned 32 bit integer. Pointer deferencing is just like C’s.
One additional difference: the bitwise NOT is an exclamation point (!) as opposed to C’s tilde (~)

To build the rust program the following command is executed from the same directory as the Cargo.toml file:

cargo build

Assuming all goes well the output executable is written to the file
target/thumbv7m-none-eabi/debug/slightly_rusty

Loading on to the chip
The output files main.elf and slightly_rusty are loaded in the same way on to the target hardware. Start an openocd session in one terminal from the top level directory (rust_vs_c) as follows
/usr/bin/openocd -f stm32f103_aliexpress.cfg
This is the default openocd (version 0.10.0) that comes with Ubuntu 19.04
Assuming your devices are plugged in ok you should see an output something like this:
Open On-Chip Debugger 0.10.0
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
adapter speed: 1000 kHz
adapter_nsrst_delay: 100
none separate
none separate
Info : Unable to match requested speed 1000 kHz, using 950 kHz
Info : Unable to match requested speed 1000 kHz, using 950 kHz
Info : clock speed 950 kHz
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.244914
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints

In another window run arm-none-eabi-gdb and execute the following commands
target remote :3333
load c/main.elf
monitor reset

This loads the C version of the progam and you should see the blue pill’s onboard LED blink.
Execute the following commands to run the rust version:
load rust/target/thumbv7m-none-eabi/debug/slightly_rusty
monitor reset

Hopefully the LED starts blinking again.

Where to from here? There are crates (rust libraries) online that define the various peripherals for STM32F103 devices etc. I’ve begun looking at these and while they are great in that they convert the STM32 SVD files to Rust they are also quite big – leading to a quite large final executable. More investigations are needed as well as a project to drive it all along.
Code for these examples is over on github