Why

I wanted to run real-time emotion detection on a Pi camera. The question was whether the Pi 5 could handle it.

Result

ONNX Runtime got the latency down to 120 ms. Camera frame rate is still the bottleneck.
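
For anyone who wants to check a latency figure like this on their own board, here is a minimal timing sketch. It is not the script behind the number above. The names measure_latency_ms, max_fps and fake_infer are made up for illustration, and fake_infer is only a placeholder for whatever function wraps the real model call. The second helper turns a per-frame latency into a frame-rate ceiling, assuming frames are processed one at a time, so it can be compared with what the camera delivers.

```python
import statistics
import time


def measure_latency_ms(infer, frame, warmup=5, runs=50):
    """Return the median wall-clock latency of infer(frame) in milliseconds."""
    for _ in range(warmup):
        infer(frame)  # warm-up calls are not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(frame)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def max_fps(latency_ms):
    """Frame-rate ceiling when frames are processed strictly one at a time."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Hypothetical stand-in for the real model call (assumption, not the author's code).
    def fake_infer(frame):
        return sum(frame)

    latency = measure_latency_ms(fake_infer, [0.0] * 1000)
    print(f"median latency: {latency:.3f} ms")
    print(f"ceiling at 120 ms per frame: {max_fps(120.0):.1f} fps")
```

The median is used rather than the mean so that one slow run (a scheduler hiccup, thermal throttling) does not skew the result.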

What I learned

Model quantization makes a bigger difference than I expected. INT8 was nearly 3x faster than FP32 on ARM.
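
To make the INT8 idea concrete, here is a toy sketch of symmetric per-tensor quantization in plain Python. It is only an illustration of the technique, not the tooling behind the numbers in this post, and the function names are invented for the example. Each float is mapped to an integer in [-127, 127] using a single scale factor, so a value takes 1 byte instead of the 4 bytes of FP32, at the cost of a small rounding error.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 if every value is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # made-up example values
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("quantized:", q)
    print(f"scale: {scale:.6f}, worst-case error: {worst:.6f}")
```

The rounding error per value is at most half the scale, which is why models with a narrow weight range tend to lose little accuracy.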