
Edge Deployment Paths

How the HLM-Nano and HLM-Micro checkpoints reach real hardware. There are three viable paths today, each with explicit caveats about what is verified end to end and what is reference math still waiting for a target board.

Status

Two of the three paths work today end to end. The third (Nano on bare MCU via the C reference kernel) has correct math and a complete cross-compile recipe but has not yet been flashed to a physical board.

Path A — Nano on bare MCU (C99 reference kernel)

A dependency-free C99 implementation of the Nano forward pass that never calls malloc: every buffer is statically allocated. It loads the weights from model.int8.bin and the scales from model.int8.json, then runs the polynomial-Hopfield forward pass.
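As an illustration of how INT8 weights and a scale relate to the original floating-point values, here is a minimal sketch of symmetric quantization and dequantization. The single per-tensor scale and the function names `quantize_int8` and `dequantize_int8` are assumptions for illustration; the actual layout of model.int8.bin and model.int8.json may differ.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): returns (int8 values, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the symmetric INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights: w ~= q * scale."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# The round-trip error per weight is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

With a scheme like this the kernel only needs the INT8 array and one float per tensor, which is what keeps the stored model small.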

What ships:

  • the C99 Nano kernel headers — 433 lines, dependency-free
  • micro/code/nano/c_kernel/test_harness.c — desktop validation harness
  • micro/code/nano/c_kernel/Makefile — host + AVR + Cortex-M targets
  • micro/code/nano/hlm_nano_demo/ — Arduino IDE sketch that runs the tiny preset and prints predictions over Serial
  • micro/code/nano/tools/bin2c.py — regenerates model_data.h + scales.h for the sketch after retraining
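For readers unfamiliar with the bin-to-header step, the general technique looks roughly like the sketch below. It is not the contents of bin2c.py: the function name `bytes_to_c_array`, the array naming, and the line width are hypothetical.

```python
def bytes_to_c_array(data: bytes, name: str, per_line: int = 12) -> str:
    """Render raw bytes as a C array definition suitable for a header file."""
    if not data:
        raise ValueError("refusing to emit a zero-length C array")
    lines = [f"static const unsigned char {name}[{len(data)}] = {{"]
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    lines.append(f"static const unsigned int {name}_len = {len(data)};")
    return "\n".join(lines) + "\n"


# Example: four bytes rendered as a header fragment.
header = bytes_to_c_array(bytes([0, 1, 127, 255]), "model_data")
```

Embedding the model as a `const` array lets the linker place it in flash instead of SRAM on most Cortex-M toolchains, which is why the sketch can ship the weights without a filesystem.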

Cross-compile recipes:

bash
# Host (verify the math)
cd micro/code/nano/c_kernel
make
./test_harness ../../../checkpoints/output-nano-tiny-v0/model.int8.json \
               ../../../checkpoints/output-nano-tiny-v0/model.int8.bin

# Arduino Uno (ATmega328P, 2 KB SRAM, 32 KB flash)  — v1 work, see caveat below
make TARGET=avr

# nRF52, STM32L4 (Cortex-M4 + FPU); Teensy 4 (Cortex-M7, which also runs Cortex-M4 code)
make TARGET=arm-m4

# STM32L0, Pi Pico (Cortex-M0+)
make TARGET=arm-m0

Arduino IDE path: open micro/code/nano/hlm_nano_demo/hlm_nano_demo.ino, pick the target board, upload. The sketch prints the predicted class and logits over Serial every two seconds on a synthetic impact signature.

Static state by preset:

| Preset | Params | INT8 size | C-kernel static state | Fits |
| --- | --- | --- | --- | --- |
| Nano-Tiny | 1,525 | 1.5 KB | ~6.5 KB | nRF52, STM32L4, RP2040, Teensy, ESP32 |
| Nano-Small | 2,805 | 2.8 KB | ~12 KB | nRF52, STM32L4, RP2040, Teensy, ESP32 |
| Nano-Micro | 6,622 | 6.5 KB | ~27 KB | RP2040, Teensy, ESP32, large STM32 |
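The fit column comes down to comparing static state against available SRAM. A minimal sketch of that check follows, using the approximate static-state figures listed above; the helper name `fits_in_sram` and the `reserve_kb` headroom for stack and peripherals are assumptions, and the 264 KB figure is the RP2040's on-chip SRAM.

```python
# Approximate C-kernel static state per preset, in KB (figures from this page).
STATIC_STATE_KB = {"Nano-Tiny": 6.5, "Nano-Small": 12.0, "Nano-Micro": 27.0}


def fits_in_sram(preset: str, sram_kb: float, reserve_kb: float = 0.0) -> bool:
    """True if the preset's static state plus a reserve fits in the given SRAM."""
    return STATIC_STATE_KB[preset] + reserve_kb <= sram_kb


# ATmega328P (2 KB SRAM): even the smallest preset does not fit with float state.
uno_ok = fits_in_sram("Nano-Tiny", sram_kb=2.0)
# RP2040 (264 KB SRAM): the largest preset fits with room to spare.
rp2040_ok = fits_in_sram("Nano-Micro", sram_kb=264.0, reserve_kb=8.0)
```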

Caveats:

  1. Not yet validated on a flashed board. The math is verified on the host: the C kernel reproduces the Python reference's prediction on the synthetic impact signature, and on 600 evaluation samples the dequantized INT8 model matches FP32 accuracy. The next step is flashing one of the target boards.
  2. Arduino Uno (AVR) needs the v1 variant. The current sketch stores the model in regular flash and uses float activations. The ATmega328P (2 KB SRAM, no FPU) needs the planned INT16 fixed-point + PROGMEM variant before the model fits.
  3. Stem bias is zero-initialised in test_harness.c / scales.h. This works for the synthetic 3-class task because the learned bias is near zero; a production deploy should parse the FP16 block of model.int8.bin.
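The planned INT16 variant is not specified here, but the underlying fixed-point technique can be sketched. The example below assumes a Q8.8 format (8 integer bits, 8 fractional bits) with saturation; the format choice and helper names are illustrative assumptions, not the v1 design.

```python
FRAC_BITS = 8                 # Q8.8 is an assumption for illustration
ONE = 1 << FRAC_BITS          # fixed-point representation of 1.0
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(x: int) -> int:
    """Clamp to the INT16 range instead of wrapping on overflow."""
    return max(INT16_MIN, min(INT16_MAX, x))


def to_fixed(x: float) -> int:
    return saturate16(round(x * ONE))


def from_fixed(q: int) -> float:
    return q / ONE


def fixed_mul(a: int, b: int) -> int:
    # The product of two Q8.8 values is Q16.16 (needs a 32-bit intermediate in C);
    # shifting right by FRAC_BITS brings it back to Q8.8.
    return saturate16((a * b) >> FRAC_BITS)


a = to_fixed(1.5)       # 384
b = to_fixed(-2.25)     # -576
prod = from_fixed(fixed_mul(a, b))
```

On an FPU-less part such as the ATmega328P, integer multiply-and-shift of this kind replaces software floating point, which is the usual motivation for a fixed-point kernel.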

Path B — Micro on ESP32 + sensor, streaming to a laptop host

Working today end to end. An ESP32 with an MPU6050 6-axis IMU (3-axis accelerometer plus 3-axis gyroscope) streams 32-channel feature frames (raw + derived + cheap FFT bins) over USB serial at ~125 Hz; a laptop receives the frames and runs HLM-Micro inference with audit certificate generation.
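To make the streaming pattern concrete, here is a sketch of how a host could reassemble frames from a serial byte stream. The wire format is entirely hypothetical (a two-byte sync marker, 32 little-endian float32 values, and a one-byte XOR checksum); the real firmware and live_inference.py may frame data differently.

```python
import struct

SYNC = b"\xaa\x55"                       # hypothetical sync marker
N_CHANNELS = 32
PAYLOAD_LEN = 4 * N_CHANNELS             # 32 float32 values
FRAME_LEN = len(SYNC) + PAYLOAD_LEN + 1  # sync + payload + checksum


def xor_checksum(payload: bytes) -> int:
    c = 0
    for b in payload:
        c ^= b
    return c


def encode_frame(values) -> bytes:
    """Build one frame (what the sensor node would send under this assumed format)."""
    payload = struct.pack(f"<{N_CHANNELS}f", *values)
    return SYNC + payload + bytes([xor_checksum(payload)])


def extract_frames(buffer: bytearray):
    """Consume complete, valid frames from buffer; partial data stays for the next call."""
    frames = []
    while True:
        start = buffer.find(SYNC)
        if start < 0:
            # Keep the last byte: it may be the first half of a sync marker.
            del buffer[:max(0, len(buffer) - 1)]
            break
        del buffer[:start]               # drop noise before the marker
        if len(buffer) < FRAME_LEN:
            break                        # wait for more bytes
        payload = bytes(buffer[len(SYNC):len(SYNC) + PAYLOAD_LEN])
        if xor_checksum(payload) == buffer[FRAME_LEN - 1]:
            frames.append(struct.unpack(f"<{N_CHANNELS}f", payload))
            del buffer[:FRAME_LEN]
        else:
            del buffer[:1]               # false sync: skip one byte and resynchronise
    return frames


# Example: leading noise, one full frame, then the start of the next frame.
buf = bytearray(b"\x00garbage" + encode_frame([float(i) for i in range(32)]) + SYNC)
frames = extract_frames(buf)
```

Under this assumed framing each frame is 131 bytes, so ~125 Hz corresponds to roughly 16.4 KB/s on the wire.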

What ships:

  • the ESP32 + MPU6050 firmware sketch
  • micro/hardware/firmware/sensor_node_analog.ino — analog-input alternative
  • micro/hardware/host/live_inference.py — laptop receiver + model runner
  • micro/hardware/host/requirements.txt — Python deps
  • micro/hardware/wiring/mpu6050_esp32.md — wiring guide

Hardware list:

  • ESP32 DevKitC (or any ESP32 board with USB serial)
  • MPU6050 6-axis IMU
  • I2C wiring (SDA → GPIO21, SCL → GPIO22, default ESP32 pins)
  • USB cable to the laptop

Caveat: inference itself still runs on the laptop. A native ESP32-S3 INT4 kernel for HLM-Micro is on the roadmap but not yet built. Today's Micro deployment follows the "ESP32 streams frames, laptop runs the model" pattern, which is sufficient for an on-site demo with a real board in the buyer's hand.

Path C — Linux SBC (Raspberry Pi 3 / 4 / 5, Jetson, generic ARM SBC)

Trivial. Any Linux board with Python and PyTorch runs the Nano and Micro checkpoints directly. No cross-compile, no special kernel.

bash
# On the Pi (or any Linux SBC):
git clone <bundle>
cd HLM-Demos
pip install -r requirements-demo.txt
python micro/code/nano/demo_nano.py --all

The output matches the laptop runner: the same report of model size (KB), accuracy, and latency, although the latency figures themselves depend on the board.
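When comparing boards, per-inference latency can be measured with a small timing harness like the sketch below. This is not how demo_nano.py measures latency; the helper name `measure_latency_ms` and the stand-in workload are assumptions, and any callable that runs one forward pass could be passed in instead.

```python
import statistics
import time


def measure_latency_ms(fn, warmup: int = 10, runs: int = 100) -> dict:
    """Time repeated calls to fn and summarise the latency in milliseconds."""
    for _ in range(warmup):
        fn()                              # warm caches before timing
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    ordered = sorted(samples)
    # Approximate 95th percentile by index into the sorted samples.
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": p95,
        "max_ms": ordered[-1],
    }


# Stand-in workload; replace with a call that runs one model forward pass.
stats = measure_latency_ms(lambda: sum(range(1000)), warmup=2, runs=20)
```

Reporting the median alongside a tail figure helps on small boards, where background tasks can cause occasional slow runs.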

This is the recommended path for any demo where you want to show a model running on a small device without flashing firmware.

Picking a path

| Scenario | Path |
| --- | --- |
| Demo: real MCU, model on-device, no laptop | A: Nano sketch (caveat: pre-flash validation) |
| Demo: real sensor, audit cert, talk to a partner | B: ESP32 + MPU6050 + laptop host (working today) |
| Demo: small device, no MCU expertise needed | C: Raspberry Pi or similar Linux board |
| Production pilot: high-volume MCU rollout | A with v1 hardening (INT16 AVR variant, real-board validation) |
| Production pilot: industrial sensor + audit trail | B with the ESP32-S3 native kernel (roadmap) |

What we are willing to claim publicly

  • The Nano architecture trains end to end at a sub-10 KB INT8 footprint.
  • The Python runner reproduces recorded validation accuracy on synthetic data.
  • The C reference kernel compiles for AVR, Cortex-M0+, and Cortex-M4, with a cross-compile Makefile and an Arduino IDE sketch.
  • The ESP32 streaming path works today with a real MPU6050 sensor.

What we are not yet claiming

  • "Runs on a flashed Arduino board." The math is verified; board validation is the next step.
  • "Native ESP32-S3 INT4 HLM-Micro on-device." This is roadmap work.
  • "Production accuracy on a real sensor stream." The current Nano checkpoints have only been evaluated on synthetic data, so the synthetic-ceiling caveat applies.