Edge Deployment Paths
How the HLM-Nano and HLM-Micro checkpoints reach real hardware. Three viable paths today, with explicit caveats about what is verified end-to-end and what is reference math waiting for a target board.
Status
Two of the three paths work today end to end. The third (Nano on bare MCU via the C reference kernel) has correct math and a complete cross-compile recipe but has not yet been flashed to a physical board.
Path A — Nano on bare MCU (C99 reference kernel)
A dependency-free C99 implementation of the Nano forward pass that never calls malloc: every buffer is statically allocated. It loads model.int8.bin plus the scales from model.int8.json, then runs the polynomial-Hopfield forward pass.
What ships:
- the C99 Nano kernel headers: 433 lines, dependency-free
- micro/code/nano/c_kernel/test_harness.c: desktop validation harness
- micro/code/nano/c_kernel/Makefile: host + AVR + Cortex-M targets
- micro/code/nano/hlm_nano_demo/: Arduino IDE sketch that runs the tiny preset and prints predictions over Serial
- micro/code/nano/tools/bin2c.py: regenerates model_data.h + scales.h for the sketch after retraining
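To illustrate what a bin-to-header step like the one listed above typically does, here is a minimal sketch that renders raw bytes as a C array in a header file. It is not the bundled bin2c.py: the function name `bytes_to_c_header`, the include guard, and the layout of the generated header are all assumptions made for illustration.

```python
def bytes_to_c_header(data: bytes, array_name: str, per_line: int = 12) -> str:
    """Render raw bytes as a C header defining a const uint8_t array.

    Hypothetical helper: the real bin2c.py may emit a different layout.
    """
    if not data:
        raise ValueError("refusing to emit a zero-length C array")
    guard = array_name.upper() + "_H"
    lines = [
        f"#ifndef {guard}",
        f"#define {guard}",
        "#include <stdint.h>",
        "",
        f"static const uint8_t {array_name}[{len(data)}] = {{",
    ]
    # Emit the payload as hex literals, a fixed number of bytes per row.
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines += [
        "};",
        f"static const unsigned int {array_name}_len = {len(data)};",
        "",
        f"#endif /* {guard} */",
    ]
    return "\n".join(lines) + "\n"


# Example with four made-up bytes standing in for a model blob.
header = bytes_to_c_header(bytes([0, 127, 128, 255]), "model_data")
print(header)
```

Compiling the model in as a const array means the sketch needs no filesystem or loader on the device, which fits the zero-malloc design described for the kernel.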
Cross-compile recipes:
# Host (verify the math)
cd micro/code/nano/c_kernel
make
./test_harness ../../../checkpoints/output-nano-tiny-v0/model.int8.json \
../../../checkpoints/output-nano-tiny-v0/model.int8.bin
# Arduino Uno (ATmega328P, 2 KB SRAM, 32 KB flash) — v1 work, see caveat below
make TARGET=avr
# nRF52, STM32L4, Teensy 4 (Cortex-M4 + FPU)
make TARGET=arm-m4
# STM32L0, Pi Pico (Cortex-M0+)
make TARGET=arm-m0
Arduino IDE path: open micro/code/nano/hlm_nano_demo/hlm_nano_demo.ino, pick the target board, upload. The sketch prints the predicted class and logits over Serial every two seconds on a synthetic impact signature.
Static state by preset:
| Preset | Params | INT8 size | C-kernel static state | Fits |
|---|---|---|---|---|
| Nano-Tiny | 1,525 | 1.5 KB | ~6.5 KB | nRF52, STM32L4, RP2040, Teensy, ESP32 |
| Nano-Small | 2,805 | 2.8 KB | ~12 KB | nRF52, STM32L4, RP2040, Teensy, ESP32 |
| Nano-Micro | 6,622 | 6.5 KB | ~27 KB | RP2040, Teensy, ESP32, large STM32 |
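The reasoning behind the Fits column is a static-memory budget check. The sketch below shows the idea under stated assumptions: the 1 KB reserve for stack and other globals is a hypothetical figure, `fits` is an illustrative helper that is not part of the bundle, and only two boards are listed (the 2 KB ATmega328P figure is from this document; 264 KB is the RP2040's on-chip SRAM). SRAM on the other listed families varies by part, so check the datasheet for the exact chip.

```python
# Static state per preset in KB, copied from the table above.
PRESET_STATIC_KB = {"Nano-Tiny": 6.5, "Nano-Small": 12.0, "Nano-Micro": 27.0}

# SRAM per board in KB. Only two boards are listed here on purpose.
BOARD_SRAM_KB = {"ATmega328P": 2.0, "RP2040": 264.0}


def fits(static_kb: float, sram_kb: float, reserve_kb: float = 1.0) -> bool:
    """True if the kernel's static state plus a reserve fits in SRAM.

    reserve_kb is an assumed allowance for stack and other globals.
    """
    return static_kb + reserve_kb <= sram_kb


for board, sram_kb in BOARD_SRAM_KB.items():
    ok = [name for name, kb in PRESET_STATIC_KB.items() if fits(kb, sram_kb)]
    print(f"{board}: {ok if ok else 'no preset fits'}")
```

Under these assumptions no preset fits the ATmega328P, which is consistent with the caveat that the Uno needs the planned INT16 fixed-point + PROGMEM variant.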
Caveats:
- Not yet validated on a flashed board. Math is verified — the C kernel reproduces the Python reference's prediction on the synthetic impact signature, and the INT8 quantization step (600 eval samples) confirms dequant matches FP32 accuracy. The next step is flashing one of the target boards.
- Arduino Uno (AVR) needs the v1 variant. Current sketch uses regular flash for the model and float activations. ATmega328P (2 KB SRAM, no FPU) needs the planned INT16 fixed-point + PROGMEM variant before it fits.
- Stem bias is zero-initialised in test_harness.c/scales.h. Works for the synthetic 3-class task because the bias learned to be near zero; a production deploy should parse the FP16 block of model.int8.bin.
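The dequantization check mentioned in the first caveat can be pictured with a generic INT8 round trip. This is a minimal sketch assuming symmetric per-tensor quantization with a single scale; the document does not say which scheme the checkpoints actually use, and `quantize_int8` and `dequantize_int8` are illustrative names rather than functions from the bundle.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization (assumed scheme).

    Returns (codes, scale) with every code in [-127, 127].
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map INT8 codes back to floats using the stored scale."""
    return [c * scale for c in codes]


# Made-up weights, not taken from any checkpoint.
weights = [0.50, -0.25, 0.125, -1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.6f} max_err={max_err:.6f} bound={scale / 2:.6f}")
```

A small per-weight error does not by itself guarantee unchanged predictions, which is why the accuracy comparison on held-out eval samples is the meaningful test.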
Path B — Micro on ESP32 + sensor, streaming to a laptop host
Working today end to end. ESP32 + MPU6050 6-axis accelerometer streams 32-channel feature frames (raw + derived + cheap FFT bins) over USB serial at ~125 Hz; a laptop receives the frames and runs HLM-Micro inference with audit certificate generation.
What ships:
- the ESP32 + MPU6050 firmware sketch
- micro/hardware/firmware/sensor_node_analog.ino: analog-input alternative
- micro/hardware/host/live_inference.py: laptop receiver + model runner
- micro/hardware/host/requirements.txt: Python deps
- micro/hardware/wiring/mpu6050_esp32.md: wiring guide
Hardware list:
- ESP32 DevKitC (or any ESP32 board with USB serial)
- MPU6050 6-axis IMU
- I2C wiring (SDA → GPIO21, SCL → GPIO22, default ESP32 pins)
- USB cable to the laptop
Caveat: the inference itself still runs on the laptop. A native ESP32-S3 INT4 kernel for HLM-Micro is on the roadmap but not yet built. Today's micro deployment is the "ESP32 streams frames, laptop runs the model" pattern, which is sufficient for an on-site demo with a real board in the buyer's hand.
Path C — Linux SBC (Raspberry Pi 3 / 4 / 5, Jetson, generic ARM SBC)
Trivial. Any Linux board with Python and PyTorch runs the Nano and Micro checkpoints directly. No cross-compile, no special kernel.
# On the Pi (or any Linux SBC):
git clone <bundle>
cd HLM-Demos
pip install -r requirements-demo.txt
python micro/code/nano/demo_nano.py --all
Output matches the laptop runner: same KB / accuracy / latency report.
This is the recommended path for any demo where you want to show a model running on a small device without flashing firmware.
Picking a path
| Scenario | Path |
|---|---|
| Demo: real MCU, model on-device, no laptop | A — Nano sketch (caveat: pre-flash validation) |
| Demo: real sensor, audit cert, talk to a partner | B — ESP32 + MPU6050 + laptop host (working today) |
| Demo: small device, no MCU expertise needed | C — Raspberry Pi or similar Linux board |
| Production pilot: high-volume MCU rollout | A with v1 hardening (INT16 AVR variant, real-board validation) |
| Production pilot: industrial sensor + audit trail | B with the ESP32-S3 native kernel (roadmap) |
What we are willing to claim publicly
- The Nano architecture trains end-to-end at sub-10 KB INT8 footprint.
- The Python runner reproduces recorded validation accuracy on synthetic data.
- The C reference kernel compiles for AVR, Cortex-M0+, and Cortex-M4, with a cross-compile Makefile and an Arduino IDE sketch.
- The ESP32 streaming path works today with a real MPU6050 sensor.
What we are not yet claiming
- "Runs on a flashed Arduino board." Math is verified, board validation is the next step.
- "Native ESP32-S3 INT4 HLM-Micro on-device." This is roadmap work.
- "Production accuracy on a real sensor stream." Synthetic-ceiling caveat applies to the current Nano checkpoints.