STM32N6 NPU Deployment — Politecnico di Milano  1.0
Documentation for Neural Network Deployment on STM32N6 NPU - Politecnico di Milano 2024-2025
Chapter 3 — Toolchain

The Software Toolchain

After understanding the hardware, it is time to look at the software side. This chapter walks through every tool involved in the deployment pipeline: what each one does, how the tools fit together, and where to find the detailed code explanations in Part 2 and Part 3.

Key Insight — Keep this in mind

Before diving into the tools, there is one idea that ties everything together: the division of labor between Python and C.

🐍 Python — Host side
Handles training, quantization, and evaluation. Runs on your laptop. Produces a quantized model file (.tflite, .onnx) and generates the C configuration header.
⚡ C — Board side
The C project is what actually runs on the board. It is compiled by STM32CubeIDE from C code generated by ST Edge AI Core, then flashed via ST-Link.
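
To make the hand-off from Python to C concrete, here is a minimal sketch of how a host-side script could render settings into a C header. The function name `render_c_header`, the include guard, the macro names, and the values are all hypothetical and chosen only for illustration; the header the pipeline actually generates is documented in Part 2.

```python
def render_c_header(guard: str, settings: dict) -> str:
    """Render a dict of settings as a C header made of #define lines."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for name, value in settings.items():
        if isinstance(value, str):
            # Strings become quoted C string literals.
            rendered = f'"{value}"'
        else:
            # Integers and floats are written as-is.
            rendered = str(value)
        lines.append(f"#define {name} {rendered}")
    lines += ["", f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"


# Hypothetical settings; the real macro names come from the generated header.
example_settings = {
    "NN_INPUT_WIDTH": 192,
    "NN_INPUT_HEIGHT": 192,
    "NN_MODEL_NAME": "movenet_lightning",
}
header_text = render_c_header("EXAMPLE_CONFIG_H", example_settings)
print(header_text)
```

The point of the sketch is the direction of the data flow: values are decided once on the host, and the C project only reads them at compile time.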

What this chapter covers

Now that we understand the hardware platform — the STM32N6570-DK, its Neural-ART NPU, and its memory hierarchy — we can turn to the software side and understand all the steps required to deploy a neural network firmware on the board.

For each tool we will look at what it contains (code, models, configuration files), how it is used in practice, and — crucially — we will link directly to the detailed code explanations in Part 2 — Code Reference and Part 3 — Module Groups, where every function is documented line by line.

This chapter is structured in three parts, mirroring the division of labor described above:

Part A — Python Pipeline
The ST Model Zoo and ModelZoo Services: where models come from, how they are quantized, and how the Python scripts orchestrate the entire pipeline from a single YAML configuration file.
Part B — ST Edge AI Core
The converter tool that takes the quantized model produced by the Python pipeline and generates optimised C arrays, NPU epoch assignments, and a detailed profiling report. This is the bridge between Python and C.
Part C — STM32CubeIDE & C Firmware
The IDE that compiles the generated C project and flashes it to the board. Here we look at the C files that actually run on the STM32N6570-DK.
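
The idea in Part A, that one configuration file drives the whole Python pipeline, can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the ModelZoo Services implementation: the key names (`operation_mode`, `model_name`, `board`), the mode names, and the step functions are hypothetical, and a plain dict stands in for the parsed YAML file.

```python
from typing import Callable, Dict, List


# Each step is a stub that records what it would do; the real services
# perform the actual training, quantization, and deployment work.
def train(cfg: dict) -> str:
    return f"train {cfg['model_name']}"


def quantize(cfg: dict) -> str:
    return f"quantize {cfg['model_name']}"


def evaluate(cfg: dict) -> str:
    return f"evaluate {cfg['model_name']}"


def deploy(cfg: dict) -> str:
    return f"deploy {cfg['model_name']} to {cfg['board']}"


# Hypothetical mode names mapped to the ordered steps they trigger.
MODES: Dict[str, List[Callable[[dict], str]]] = {
    "training": [train],
    "quantization": [quantize],
    "evaluation": [evaluate],
    "deployment": [deploy],
    "quantize_then_deploy": [quantize, deploy],
}


def run_pipeline(cfg: dict) -> List[str]:
    """Run the steps selected by cfg['operation_mode'], in order."""
    mode = cfg["operation_mode"]
    if mode not in MODES:
        raise ValueError(f"unknown operation_mode: {mode!r}")
    return [step(cfg) for step in MODES[mode]]


# Stand-in for the dict a YAML loader would return.
config = {
    "operation_mode": "quantize_then_deploy",
    "model_name": "movenet_lightning",
    "board": "STM32N6570-DK",
}
log = run_pipeline(config)
print(log)
```

Changing a single value in the configuration selects a different chain of steps, which is the property the YAML-driven design relies on.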

Pipeline Overview

The complete deployment pipeline is composed of four distinct software components, each responsible for a specific phase. They always execute in this exact order:

Pipeline diagram (left to right):

PYTHON SIDE: training, quantization, evaluation
1. 📚 ST Model Zoo: pre-trained models (.tflite / .h5 / .onnx)
2. 🛠 Zoo Services: train / quantize, Python + YAML

C SIDE: C code generation, compile, flash
3. ST Edge AI Core: model → C code + NPU profiling
4. 🔌 STM32CubeIDE: compile + flash via ST-Link → board

The diagram above separates the pipeline into a Python side and a C side. Everything on the Python side runs on your laptop in Python. Everything on the C side generates, compiles, and flashes the C code that runs on the board. ST Edge AI Core sits exactly at the boundary: it takes the quantized model produced by the Python side as input and produces C source files as output.
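
The ordering and the Python/C split can also be written down as data. The sketch below is only a restatement of the diagram: the `Stage` class and the `boundary_stage` helper are hypothetical names introduced for illustration, while the stage names and roles are taken from the diagram labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    role: str
    side: str  # "python" or "c", as labelled in the diagram


# The four components, in the order they execute.
PIPELINE: List[Stage] = [
    Stage("ST Model Zoo", "pre-trained models (.tflite / .h5 / .onnx)", "python"),
    Stage("Zoo Services", "train / quantize (Python + YAML)", "python"),
    Stage("ST Edge AI Core", "model -> C code + NPU profiling", "c"),
    Stage("STM32CubeIDE", "compile + flash via ST-Link", "c"),
]


def boundary_stage(stages: List[Stage]) -> Stage:
    """Return the first stage on the C side, i.e. the Python-to-C bridge."""
    for stage in stages:
        if stage.side == "c":
            return stage
    raise ValueError("pipeline has no C-side stage")


for number, stage in enumerate(PIPELINE, start=1):
    print(f"{number}. [{stage.side:>6}] {stage.name}: {stage.role}")

bridge = boundary_stage(PIPELINE)
print("Bridge:", bridge.name)
```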

Explore each component in detail

Each component has its own dedicated page. For each one we examine: the repository structure, the key files, how it is configured and invoked, and direct links to the annotated source code in Part 2.

📚
3.1 — ST Model Zoo
github.com/STMicroelectronics/stm32ai-modelzoo

The public repository of pre-trained, validated models for STM32 deployment. Covers object detection, pose estimation, image classification and more. This is where MoveNet Lightning was sourced.

🛠
3.2 — ModelZoo Services
Python pipeline + user_config.yaml

The Python framework that automates every step: training, quantization, evaluation, benchmarking, deployment. A single YAML file controls everything. All Python source files in Part 2 belong here.

3.3 — ST Edge AI Core
stedgeai CLI + NPU Add-on

The converter tool. Takes a quantized .tflite or .onnx model and produces optimised C arrays, assigns layers to NPU or CPU epochs, and generates a detailed profiling report. The Python-to-C bridge.

🔌
3.4 — STM32CubeIDE & C Firmware
compile + flash + C source files

The IDE invoked automatically by the Python script to compile the C project and flash it to the board via ST-Link. Here we also examine the C firmware files that implement the real-time inference loop on the STM32N6570-DK.
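
As a rough illustration of how a Python script can hand a model to the `stedgeai` CLI described in 3.3, the sketch below only builds the command line and prints it; it does not run anything. The helper name `build_stedgeai_command` and both paths are hypothetical. The options shown (`generate`, `--model`, `--target stm32n6`, `--st-neural-art`, `--output`) are given as I understand the CLI and should be checked against `stedgeai --help` for the installed version; the command actually issued by the pipeline is documented in Part 2.

```python
import shlex
from pathlib import Path
from typing import List


def build_stedgeai_command(model: Path, output_dir: Path) -> List[str]:
    """Build (but do not run) a stedgeai invocation for the STM32N6 NPU."""
    return [
        "stedgeai", "generate",
        "--model", str(model),
        "--target", "stm32n6",
        "--st-neural-art",  # assumed flag: enable the Neural-ART NPU compiler
        "--output", str(output_dir),
    ]


command = build_stedgeai_command(
    Path("models/example_int8.tflite"),  # hypothetical model path
    Path("build/stedgeai_out"),          # hypothetical output directory
)
print(shlex.join(command))

# To actually run it (requires ST Edge AI Core on PATH):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.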

🔗 This chapter connects directly to Part 2 and Part 3

Throughout the subpages of this chapter, every time we mention a Python function or a C file, we link directly to its annotated documentation in Part 2. To go deeper on a specific file right away, jump to its entry there.
