Squeezing Rust onto more things, this time: Flashlights
For a while now I have been pretty interested in niche torches that run the Andúril flashlight firmware, which is written in C and supports many flashlights that use AVR microcontrollers, such as Hank’s and Fireflylite.
In my opinion the standout features of Andúril are:
- Support for aux LEDs: it’s common for these flashlights to have a set of RGB LEDs on the front, and Andúril can use these to show the battery voltage by changing between 6 colours.
- Many different modes, some useful like the candle flicker mode, which along with the normal mode supports a fading-off timer, and some silly modes like a police strobe (which flashes the aux lights between red and blue).
- Very good thermal and battery voltage handling: the maximum output is smoothly adjusted to regulate the temperature of the light, and to prevent the battery voltage dipping too low.
- Open source and user flashable: all (or most?) flashlights sold with Andúril have accessible flashing pads on the exposed side of the driver circuit board, making it easy for people to reflash the firmware.
I dabbled in customising Andúril for about a year while I was also playing around with writing Rust firmware for keyboards [1][2], and in January 2024 I decided to try squeezing (async) Rust onto an AVR chip. In this case the chip-to-be was an attiny1616, which has 16 KB of flash and 2 KB of RAM.
AVR Beginnings
I decided early on that I wanted to use Embassy [3] for writing the firmware. I had used it previously with my keyboards and loved how easy it made breaking up the work that needed to be performed into individual asynchronous tasks.
However, unlike microcontroller platforms such as nRF, STM32, and RP2040, AVR does not have first-class support in Embassy. Fortunately, it isn’t difficult to get Embassy up and running on new hardware as long as Rust supports it. Embassy only needs a hardware-specific timer queue implementation, which is used to schedule tasks for wakeup.
One other slight issue was that there was no Rust peripheral access crate for the attiny1616, but luckily there was already the avr-device crate, which has support for some similar AVR chips, and scripts for generating PACs from machine readable register description documents.
For those unfamiliar with embedded development (and specifically embedded Rust development), the ecosystem is generally broken down into:
- Peripheral Access Crates (PAC): Provide safe Rust abstractions for reading and writing to memory mapped IO registers.
- Hardware Abstraction Libraries (HAL): Provide ergonomic Rust bindings for peripherals which often provide fully compile time verified configuration and usage of peripherals. (in my opinion HALs in Rust are much nicer to use than in other languages, as documentation is easily explored and the type system makes sure you’re getting everything right)
- Instruction set and runtime support crates: Provide helper functions and the necessities to get the CPU going before main can run. (e.g. cortex-m(-rt))
After some fiddling I was able to generate the PAC for the attiny1616, and with that I was able to start poking registers and write functions such as this one to configure the ADC peripheral:
Also very lucky for me was that someone had already put in the effort of writing a hardware abstraction library for tinyAVR microcontrollers (of which the t1616 is a member), which only needs a working PAC to provide safe Rust abstractions for configuring clocks, GPIO pins, timers, and the ADC.
Now that we have a PAC for the attiny1616 we can proceed with implementing a timer queue so that we can get some tasks running.
Timer queue
It’s a common requirement in many situations that a delay can be inserted between two operations, and I suppose it’s even more common in embedded systems where simple inputs need to trigger complex outputs and vice-versa. In fact at the time of writing my flashlight firmware uses eight timeouts and fifteen sleeps.
Now in many embedded codebases, especially those written using Arduino and such, delays are usually implemented as busy loops that count cycles until a period has passed. This is usually because the alternative is to do horrible things like manually breaking up your code into state machines, or to use an RTOS that provides stackful [4] tasks (and you’ll struggle to get a stackful task scheduler onto an AVR microcontroller with only 2 KB of RAM).
I’ll spare you the usual Rust async spiel about how async functions map to the Future trait. The gist is that with Rust’s async the compiler does the state-machine-ification for you, so you can write nice sequential code that, once compiled, doesn’t require wasteful stack switching to achieve concurrency. With all our tasks being state machines, our sleep(...) function can just be an instruction to the scheduler to not schedule our task until the timeout has elapsed (this is of course how a sleep function works everywhere it isn’t a busy loop). With our task sleeping, the scheduler can choose to run another task if one is ready, or, if all the tasks are waiting on something, it can put the microcontroller to sleep instead.
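To make the idea concrete, here is a minimal host-side sketch in plain std Rust (not the firmware's code) of a sleep future and an async function that awaits it twice. The names `Sleep`, `sleep`, `blink` and `poll_once`, and the `AtomicU32` standing in for a hardware tick counter, are assumptions made for illustration; Embassy's real timer types are different.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that is ready once the stand-in tick counter reaches `deadline`.
pub struct Sleep<'a> {
    clock: &'a AtomicU32,
    deadline: u32,
}

/// Hypothetical helper: sleep for `ticks` ticks of `clock`.
pub fn sleep(clock: &AtomicU32, ticks: u32) -> Sleep<'_> {
    Sleep { clock, deadline: clock.load(Ordering::SeqCst) + ticks }
}

impl Future for Sleep<'_> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.clock.load(Ordering::SeqCst) >= self.deadline {
            Poll::Ready(())
        } else {
            // A real implementation would hand the waker from `_cx` to the
            // timer queue here, so the task is only polled again when due.
            Poll::Pending
        }
    }
}

/// Sequential-looking code; the compiler turns it into a state machine
/// with one state per await point.
pub async fn blink(clock: &AtomicU32, log: &RefCell<Vec<&'static str>>) {
    log.borrow_mut().push("on");
    sleep(clock, 2).await;
    log.borrow_mut().push("off");
    sleep(clock, 3).await;
    log.borrow_mut().push("done");
}

/// A waker that does nothing, enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future exactly once, as a scheduler would after a wake-up.
pub fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}
```

Each call to `poll_once` resumes `blink` at the await point where it last stopped, so no stack has to be saved or swapped between polls.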
The core of this functionality is the timer queue, and it simply has two jobs:
- When a task wants to sleep, Embassy passes to the timer queue a Waker and a timestamp indicating when the waker should be woken.
- When this timestamp is reached, the timer queue should call the .wake() method of the waker, which does whatever is necessary to mark the task as ready to run again.
Embassy models this as a trait with a single schedule_wake method:
; // not public!
So to implement a timer queue you need only implement this trait, plus some way to handle waking. On embedded devices there are generally two ways to do this:
1. Have an interrupt fire periodically (say, at 1000Hz). The handler for this interrupt can step a counter and then wake up all the tasks which have timestamps that are now in the past. This solution is simple, but forces the microcontroller to wake up periodically even when there’s no work to do.
2. Configure a hardware timer to fire an interrupt when the next timer is due, then process elapsed timeouts as in option 1. This is more complicated as you have to handle cancelling and restarting a hardware timer, but allows the system to sleep uninterrupted for longer periods of time.
I chose to use a periodic interrupt for the simplicity of implementation, as there’s always the option to switch to dynamically reconfiguring the timer in the future.
To begin with, we need to define how the state of the timer queue is stored. For my implementation I store for each queue entry:
use ;
use ;
pub type Time = u32;
const QUEUE_SIZE: usize = 10;
/// An array of queue entries. The `Mutex<Cell<_>>` here is actually
/// a noop at runtime, and just serves to prove we're inside a
/// critical section when accessing the entries
static ENTRIES: =
;
We then need a function to allocate an entry on this timer queue:
/// Allocate an entry, returning on success the index, and whether
/// there was already an entry for this waker
/// Add a waker to the queue, correctly handles when the
/// waker is already in the queue
And that’s all we need to implement the first half of the timer queue. We just need to provide an interface for Embassy to use it:
Now that we can add to our timer queue, we just need to periodically process the entries and wake up tasks which need waking up. I chose to do this using the Periodic Interrupt Timer functionality (PIT) of the Real Time Clock (RTC) peripheral on the AVR:
pub static TICKS_ELAPSED: = new;
const TICKS_PER_COUNT: Time = 1;
/// Check each entry in the queue, if the timer has elapsed,
/// then wake the associated task
// A flag we use to ensure we don't try to process the timer queue
// recursively if handle_tick is entered again somehow
static IN_PROGRESS: = new;
// Declare an interrupt handler for the RTC_PIT interrupt
unsafe
pub unsafe
/// Configure the RTC with the PIT enabled, firing at a rate of 1024Hz
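The two halves of the queue (adding a waker, then waking it from the periodic tick) can be modelled on a desktop in plain std Rust. This is a sketch under assumptions rather than the firmware's implementation: it uses a `&mut self` struct instead of a static guarded by a critical section, `handle_tick` is called by hand instead of from the RTC PIT interrupt, and `CountingWake` is a hypothetical helper for observing wake-ups.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

pub type Time = u32;
const QUEUE_SIZE: usize = 10;

/// Fixed-size timer queue: each slot holds a waker and the tick it is due at.
pub struct TimerQueue {
    entries: [Option<(Waker, Time)>; QUEUE_SIZE],
    now: Time,
}

impl TimerQueue {
    pub fn new() -> Self {
        Self { entries: std::array::from_fn(|_| None), now: 0 }
    }

    /// Current time in ticks (what a time driver would report).
    pub fn now(&self) -> Time {
        self.now
    }

    /// Queue `waker` to be woken at tick `at`. If the same task is already
    /// queued, keep the earlier deadline. Returns false if the queue is full.
    pub fn schedule_wake(&mut self, at: Time, waker: &Waker) -> bool {
        let existing = self
            .entries
            .iter_mut()
            .flatten()
            .find(|(w, _)| w.will_wake(waker));
        if let Some((_, t)) = existing {
            *t = (*t).min(at);
            return true;
        }
        match self.entries.iter_mut().find(|slot| slot.is_none()) {
            Some(slot) => {
                *slot = Some((waker.clone(), at));
                true
            }
            None => false,
        }
    }

    /// What the periodic interrupt would do: advance time by one tick and
    /// wake (and remove) every entry whose deadline has passed.
    pub fn handle_tick(&mut self) {
        self.now = self.now.wrapping_add(1);
        let now = self.now;
        for slot in self.entries.iter_mut() {
            let due = slot.as_ref().map_or(false, |(_, at)| *at <= now);
            if due {
                if let Some((waker, _)) = slot.take() {
                    waker.wake();
                }
            }
        }
    }
}

/// Hypothetical helper: a waker that counts how often it was woken.
pub struct CountingWake(pub AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

A fixed-size array keeps the memory cost known at compile time, which matters with 2 KB of RAM; the trade-off is that scheduling fails once all slots are taken.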
We’re almost done, the last thing to do is tell Embassy how to read what the current time is:
And with that, we can now use Embassy:
async
Async peripheral drivers
With timers out of the way we can now look into implementing peripheral drivers that are async compatible. Microcontrollers are already set up for this, as it’s common for a peripheral to fire an interrupt when its state changes, so we can simply hook up an interrupt handler to wake up tasks waiting on the peripheral.
As an example, for GPIO pins it is common to want to wait until the state of an input pin changes in some way, such as low to high, or high to low. On AVR you may configure the microcontroller to fire an interrupt when such a state transition happens.
This means we can easily build a Rust future which configures pin interrupts for a pin, and then registers a waker such that when an interrupt is fired for the pin, the task is woken back up.
To implement this for AVR I started with declaring a place to store a waker for each pin:
// GPIO pins on AVR are grouped into 'ports'
const PORTA_PIN_COUNT: usize = 8;
const PORTB_PIN_COUNT: usize = 8;
const PORTC_PIN_COUNT: usize = 6;
// AtomicWaker is effectively just `Mutex<Cell<Option<Waker>>>`
static WAKERS: =
;
Then we can declare the interrupt handlers for the pin interrupts, which will wake up any wakers for pins that have an interrupt pending.
// To reduce code size, the true handler for pin interrupts is this function,
// which is passed the port for which the interrupt was served and wakes up
// any wakers for pins which have a pending interrupt.
// Pin interrupts on AVR are grouped to the port the pin belongs to, the
// pin has a 'pending interrupt' flag which is used to check which pin(s)
// the interrupt was fired for.
unsafe
unsafe
unsafe
// Helper trait used for its vtable, this seems to have the least
// code size impact.
Then on the other side we just need to create a Future which configures the interrupt and registers the waker:
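As a rough host-side model of this pattern (not the actual AVR driver), the sketch below keeps one waker slot and one "interrupt fired" flag per pin. `Port`, `on_interrupt`, `wait_for_edge` and `CountingWake` are hypothetical names, and a `Mutex<Option<Waker>>` stands in for an atomic waker cell; a real driver would also enable and clear the pin interrupt in the port registers.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

const PIN_COUNT: usize = 8;

/// One waker slot and one 'interrupt fired' flag for each pin of a port.
pub struct Port {
    wakers: [Mutex<Option<Waker>>; PIN_COUNT],
    fired: [AtomicBool; PIN_COUNT],
}

impl Port {
    pub fn new() -> Self {
        Self {
            wakers: std::array::from_fn(|_| Mutex::new(None)),
            fired: std::array::from_fn(|_| AtomicBool::new(false)),
        }
    }

    /// What the interrupt handler would do for a pin with a pending
    /// interrupt: record the event and wake whoever is waiting on it.
    pub fn on_interrupt(&self, pin: usize) {
        self.fired[pin].store(true, Ordering::SeqCst);
        if let Some(waker) = self.wakers[pin].lock().unwrap().take() {
            waker.wake();
        }
    }

    /// Start waiting for the next edge on `pin` (stale events are dropped,
    /// standing in for configuring the pin interrupt).
    pub fn wait_for_edge(&self, pin: usize) -> WaitForEdge<'_> {
        self.fired[pin].store(false, Ordering::SeqCst);
        WaitForEdge { port: self, pin }
    }
}

pub struct WaitForEdge<'a> {
    port: &'a Port,
    pin: usize,
}

impl Future for WaitForEdge<'_> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // Register the waker first, then check the flag, so an interrupt
        // arriving between the two steps is not lost.
        *self.port.wakers[self.pin].lock().unwrap() = Some(cx.waker().clone());
        if self.port.fired[self.pin].swap(false, Ordering::SeqCst) {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

/// Hypothetical helper: a waker that counts how often it was woken.
pub struct CountingWake(pub AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

The future does no work while it is pending; the only activity between the first poll and completion is the simulated interrupt calling `wake()`.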
Now we can write a function which waits for a button press, and then lights up an LED for one second after:
async
This same technique can then be used to create an async driver for the ADC, which fires an interrupt when the result is ready to be retrieved.
Splitting up tasks
Initially I expected not to use that many async tasks for this flashlight firmware, but as it turns out, there are quite a few concurrent processes you can decompose a flashlight into:
Debouncing the power button
We want to do things when the power button is pressed and released, but due to the realities of the world we cannot just wait for highs and lows on the pin connected to the button, as the signal will very quickly flip between low and high whenever the button is pressed or released. And so we need to debounce the button.
We can model this incredibly simply as a single process: when we first see the button pressed we wait a period of time (16ms), and if the button is still pressed we treat it as a press. We act likewise for releases.
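The same rule can be checked on a desktop with a small synchronous model. This is only a sketch: instead of awaiting pin interrupts and timers it walks a recorded list of raw `(time_ms, pressed)` transitions, and the names `debounce`, `ButtonEvent` and `DEBOUNCE_MS` are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ButtonEvent {
    Press,
    Release,
}

pub const DEBOUNCE_MS: u32 = 16;

/// Debounce a trace of raw pin transitions, given in time order as
/// `(time_ms, pressed)`. Returns the debounced events together with the
/// time at which each one would be reported.
pub fn debounce(edges: &[(u32, bool)]) -> Vec<(u32, ButtonEvent)> {
    // Raw level of the pin at time `t`: the most recent transition wins.
    let level_at = |t: u32| {
        edges
            .iter()
            .take_while(|(et, _)| *et <= t)
            .last()
            .map_or(false, |(_, level)| *level)
    };

    let mut events = Vec::new();
    let mut stable = false; // debounced state, starts out released
    let mut i = 0;
    while i < edges.len() {
        let (t, level) = edges[i];
        i += 1;
        if level == stable {
            continue; // no change relative to the debounced state
        }
        // "Sleep" for the debounce period; transitions arriving in the
        // meantime are ignored.
        let check = t + DEBOUNCE_MS;
        while i < edges.len() && edges[i].0 <= check {
            i += 1;
        }
        // Only accept the change if the pin still reads the new level.
        if level_at(check) == level {
            stable = level;
            let event = if level { ButtonEvent::Press } else { ButtonEvent::Release };
            events.push((check, event));
        }
    }
    events
}
```

For example, a press that bounces at 0, 2 and 4 ms is reported once at 16 ms, while a glitch that has already gone away by the time the 16 ms have elapsed produces no event at all.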
pub async
Recognising button clicks and holds
The UI of Andúril is structured around sequences of clicks that are optionally finished by a hold (a long press). For example: when the torch is unlocked, 1C (a single click) will turn on the light at the previously used brightness, while 1H (a single hold) will turn the light on at a default ‘low’ brightness level. 4C (four clicks in a row) while the torch is locked will unlock it, and likewise 4C while it is unlocked will lock it.
I implemented recognising sequences of clicks and holds with a simple state machine that receives press and release events from the debouncer process. After receiving a press event we wait for either a release or a timeout of 300ms. If the timeout occurs we emit a hold event and then wait for the eventual release. If a release occurs first, we count the click and wait up to another 300ms in case the button is pressed again (in which case we go back to deciding whether that press is a click or a hold). If nothing is pressed within the timeout, we emit a click event containing the count of clicks so far.
Implemented in code, this looks like this:
// This isn't the tidiest as we're intentionally encoding the state machine as data rather than code
// as to reduce the number of await points.
pub async
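The transitions described above can also be written as a plain transition function, which is easy to test on a desktop. This is a sketch, not the firmware's code: the 300ms timeout is fed in as an explicit `Timeout` input rather than awaited, all type names are hypothetical, and reporting the hold as `Hold(n)` with `n` counting the held press (so that it lines up with Andúril's 1H, 2H notation) is my assumption.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Input {
    Press,
    Release,
    /// 300ms passed without any button event
    Timeout,
}

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Output {
    /// `n` clicks in a row, e.g. `Click(4)` for 4C
    Click(u8),
    /// A hold as the `n`th press, e.g. `Hold(2)` for 2H
    Hold(u8),
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum State {
    Idle,
    /// Button is down; not yet known whether this is a click or a hold
    Pressed { clicks: u8 },
    /// Button is up after `clicks` clicks; another press may follow
    Released { clicks: u8 },
    /// Hold already reported; waiting for the button to come back up
    Holding,
}

pub struct Recogniser {
    state: State,
}

impl Recogniser {
    pub fn new() -> Self {
        Self { state: State::Idle }
    }

    /// Feed one input into the state machine, returning an event if the
    /// sequence so far has been decided.
    pub fn step(&mut self, input: Input) -> Option<Output> {
        use Input::*;
        use State::*;
        let (next, out) = match (self.state, input) {
            (Idle, Press) => (Pressed { clicks: 0 }, None),
            (Pressed { clicks }, Release) => {
                (Released { clicks: clicks.saturating_add(1) }, None)
            }
            (Pressed { clicks }, Timeout) => {
                (Holding, Some(Output::Hold(clicks.saturating_add(1))))
            }
            (Released { clicks }, Press) => (Pressed { clicks }, None),
            (Released { clicks }, Timeout) => (Idle, Some(Output::Click(clicks))),
            (Holding, Release) => (Idle, None),
            // Anything else (e.g. a timeout while idle) changes nothing.
            (state, _) => (state, None),
        };
        self.state = next;
        out
    }
}
```

In an async task the `Timeout` input would come from racing the next button event against a 300ms timer; keeping the transition logic in one function like this is one way to limit the number of await points.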
Controlling the AUX lights
- PWM and low aux modes
- Monitoring temperature and battery voltage
- Also the watchdog
- Controlling output brightness
- Reducing brightness to limit temperature
- Smoothly interpolating between levels
Fighting the inliner, testing different representations, compiler flags and turbowakers
Pins need to be modelled at the type level so that we can verify pin compatibility statically, but once a pin is used by a peripheral, we can either keep it generic or turn it into a runtime integer representing the pin number, which can result in different code sizes.
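A small std-only sketch of the two representations follows. All names here (`GpioPin`, `AnyPin`, `PwmCapable`, `Pwm`) are hypothetical and not taken from any particular HAL; the point is only to show a pin number living in the type versus in a runtime field.

```rust
/// Pin identity carried in the type: zero-sized, checked at compile time.
pub struct GpioPin<const N: u8>;

/// Type-erased pin: the pin number is an ordinary runtime value.
pub struct AnyPin {
    num: u8,
}

impl<const N: u8> GpioPin<N> {
    /// Give up the type-level pin number in exchange for a runtime one.
    pub fn degrade(self) -> AnyPin {
        AnyPin { num: N }
    }
}

/// Marker trait for pins that can be routed to the (hypothetical) PWM output.
pub trait PwmCapable {}
impl PwmCapable for GpioPin<0> {}
impl PwmCapable for GpioPin<1> {}

/// A peripheral that can only be constructed from a compatible pin.
pub struct Pwm<P: PwmCapable> {
    _pin: P,
}

impl<P: PwmCapable> Pwm<P> {
    pub fn new(pin: P) -> Self {
        Self { _pin: pin }
    }
}

/// Generic version: the shift amount is a constant, but the compiler may
/// emit a separate copy of the function for every pin type it is used with.
pub fn mask_typed<const N: u8>(_pin: &GpioPin<N>) -> u8 {
    1u8 << N
}

/// Erased version: a single copy that shifts by a runtime value.
pub fn mask_erased(pin: &AnyPin) -> u8 {
    1u8 << pin.num
}
```

With these definitions `Pwm::new(GpioPin::<0>)` compiles while `Pwm::new(GpioPin::<3>)` is rejected at compile time. Which representation ends up smaller depends on how many distinct pins are in use and on what the inliner decides, so it is worth measuring both.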
Async state machine code sizes, coalescing future helpers to reduce code size.
Running out of flash
STM32 is the solution, but would require a custom driver design. Time to learn how a current controlled boost converter works.
Designed a simple MP3432 boost driver using an STM32 to control it.
- went with STM32L072KB, needs only some bypass caps and an LDO to provide a constant voltage, doesn’t require an external oscillator. This chip has plenty of timers, flash, DACs, and ADCs.
Ported codebase to stm32
Didn’t require much work as the UI and power control logic was already fairly generic.
Also tried out maitake and designed a new driver
[4] By this I mean a runtime system that swaps out thread stacks.