
I have been working on a graphics library for the ST7735 and the Raspberry Pi Pico. The first version was written in pure MicroPython and worked well enough but was quite slow – especially when writing out blocks of colour (fillRectangle). This was the original code for fillRectangle:
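def fillRectangle(self,x1,y1,w,h,colour):
    self.openAperture(x1,y1,x1+w-1,y1+h-1)
    pixelcount=h*w
    self.command(0x2c)  # ST7735 memory write (RAMWR)
    self.a0.value(1)  # A0 high: what follows is data
    msg=bytearray()
    while(pixelcount >0):
        pixelcount = pixelcount-1
        msg.append(colour >> 8)  # high byte of the 16-bit colour
        msg.append(colour & 0xff)  # low byte
    self.spi.write(msg)  # send the whole buffer in one go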
Not only was this slow, it also needed a buffer in RAM holding two bytes for every pixel of the rectangle, so it was memory intensive as well.
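To put a rough number on the memory cost, the buffer size is easy to work out from the loop above. This is only a back-of-envelope sketch: the helper name is made up, and the 128x160 panel size is an assumption (a common ST7735 size, not something stated above).

```python
def fill_buffer_bytes(w, h):
    # The original loop appends two bytes (one 16-bit colour) per pixel.
    return w * h * 2

# Assumed panel size: 128 x 160 pixels.
full_screen = fill_buffer_bytes(128, 160)
print(full_screen)  # 40960
```

So a full-screen fill on a panel of that size would need a bytearray of about 40 KB, which is a sizeable chunk of the Pico's 264 KB of RAM.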
The new version looks like this:
def fillRectangle(self,x1,y1,w,h,colour):
    self.openAperture(x1,y1,x1+w-1,y1+h-1)
    pixelcount=h*w
    self.command(0x2c)
    self.a0.value(1)
    self.fill_block(colour,pixelcount)
It makes use of an inline assembler function, the source code of which is as follows:
@micropython.asm_thumb
def fill_block(r0,r1,r2):
    # pointer to self passed in r0
    # r1 contains the 16-bit data to be written
    # r2 contains the pixel count
    # Going to use SPI0.
    # Base address = 0x4003c000
    # SSPCR0 Register OFFSET 0
    # SSPCR1 Register OFFSET 4
    # SSPDR Register OFFSET 8
    # SSPSR Register OFFSET c
    push({r1,r2,r3,r4,r7})
    # Convoluted load of a 32-bit value (0x4003c000) into r7, one byte at a time
    mov(r7,0x40)
    lsl(r7,r7,8)
    add(r7,0x03)
    lsl(r7,r7,8)
    add(r7,0xc0)
    lsl(r7,r7,8)
    add(r7,0x00)
    mov(r4,2) # mask for SSPSR bit 1 (transmit FIFO not full)
    label(fill_block_loop_start)
    cmp(r2,0)
    beq(fill_block_exit)
    mov(r3,r1) # copy the colour
    lsr(r3,r3,8) # keep the high byte
    strb(r3,[r7,8]) # write high byte to SPI (SSPDR)
    label(fill_block_spi_wait1)
    ldr(r3,[r7,0xc]) # read SPI status (SSPSR)
    and_(r3,r4) # test the "not full" bit
    beq(fill_block_spi_wait1) # FIFO full, keep waiting
    mov(r3,r1) # copy the colour again
    strb(r3,[r7,8]) # write low byte to SPI (SSPDR)
    sub(r2,r2,1) # decrement count
    label(fill_block_spi_wait2)
    ldr(r3,[r7,0xc]) # read SPI status (SSPSR)
    and_(r3,r4) # test the "not full" bit
    beq(fill_block_spi_wait2) # FIFO full, keep waiting
    b(fill_block_loop_start)
    label(fill_block_exit)
    pop({r1,r2,r3,r4,r7})
This writes the colour value directly to the SPI port the required number of times. After each byte it polls the SPI status register and pauses while the transmit FIFO is full (hence the need for the labels fill_block_spi_wait1/2).
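To see why the wait loops matter, here is a rough desktop model of the same loop in plain Python. It is a sketch, not the library code: FakeSPI, write_dr, read_sr and fill_block_model are made-up names, the FIFO depth of 8 is an assumption about the SPI peripheral, and the rate at which the fake FIFO drains (one byte every third poll) is arbitrary.

```python
TNF = 0x02       # SSPSR bit 1: transmit FIFO not full
FIFO_DEPTH = 8   # assumed depth of the transmit FIFO

class FakeSPI:
    """Hypothetical stand-in for the SPI0 registers (desktop testing only)."""
    def __init__(self):
        self.fifo = []    # bytes waiting to be shifted out
        self.sent = []    # bytes that have gone out on the wire
        self.polls = 0
        self.peak = 0     # highest FIFO fill level seen

    def write_dr(self, value):
        # Stands in for strb to SSPDR: queue the low 8 bits.
        self.fifo.append(value & 0xFF)
        self.peak = max(self.peak, len(self.fifo))

    def read_sr(self):
        # Stands in for ldr from SSPSR. Every third poll one byte leaves
        # the FIFO, which imitates the hardware being slower than the CPU.
        self.polls += 1
        if self.polls % 3 == 0 and self.fifo:
            self.sent.append(self.fifo.pop(0))
        return TNF if len(self.fifo) < FIFO_DEPTH else 0

    def flush(self):
        # Let whatever is still queued drain out.
        self.sent.extend(self.fifo)
        self.fifo.clear()

def fill_block_model(spi, colour, count):
    while count != 0:
        spi.write_dr(colour >> 8)            # high byte
        while (spi.read_sr() & TNF) == 0:    # fill_block_spi_wait1
            pass
        spi.write_dr(colour)                 # low byte
        count -= 1
        while (spi.read_sr() & TNF) == 0:    # fill_block_spi_wait2
            pass

spi = FakeSPI()
fill_block_model(spi, 0xF800, 20)
spi.flush()
print(bytes(spi.sent[:4]))
print(spi.peak)
```

Because a byte is only written after the status poll reports "not full", the fake FIFO never goes past its depth, and the bytes come out as high byte then low byte for every pixel.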
The performance improvement is about a factor of 20!
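A side note on the "convoluted load" at the top of fill_block: the mov instruction used there only takes an 8-bit immediate, so the 32-bit base address is assembled one byte at a time with shifts and adds. The same arithmetic in plain Python (the function name is just for illustration) shows that it comes out at the SPI0 base address:

```python
def build_address(b3, b2, b1, b0):
    r7 = b3         # mov(r7,0x40)
    r7 = r7 << 8    # lsl(r7,r7,8)
    r7 = r7 + b2    # add(r7,0x03)
    r7 = r7 << 8    # lsl(r7,r7,8)
    r7 = r7 + b1    # add(r7,0xc0)
    r7 = r7 << 8    # lsl(r7,r7,8)
    r7 = r7 + b0    # add(r7,0x00)
    return r7

base = build_address(0x40, 0x03, 0xC0, 0x00)
print(hex(base))      # 0x4003c000
print(hex(base + 8))  # SSPDR, the data register
print(hex(base + 0xC))  # SSPSR, the status register
```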
Code is available over on GitHub and is likely to change lots in the next couple of weeks while I prepare for a STEM event.