STM32N6 NPU Deployment — Politecnico di Milano  1.0
Documentation for Neural Network Deployment on STM32N6 NPU - Politecnico di Milano 2024-2025
deploy Namespace Reference

Functions

None deploy (DictConfig cfg=None, Optional[str] model_path_to_deploy=None, list credentials=None)
 Key parameters:
 
None deploy_mpu (DictConfig cfg=None, Optional[str] model_path_to_deploy=None, list credentials=None)
 

Function Documentation

◆ deploy()

None deploy.deploy ( DictConfig   cfg = None,
Optional[str]   model_path_to_deploy = None,
list   credentials = None 
)

Key parameters:

@brief Deploy a quantized INT8 model onto an STM32N6 MCU board.

@details
This function implements the full deployment pipeline for STM32N6-series boards:

**Step 1 — Parameter extraction from YAML config:**
  All deployment parameters are read from the Hydra configuration object:
  - `board`              : Target board name (e.g., "STM32N6570-DK").
  - `stlink_serial_number`: ST-Link serial number for multi-board setups.
  - `c_project_path`     : Path to the STM32CubeIDE C project directory.
  - `stm32ai_version`    : Version of ST Edge AI Core to use.
  - `optimization`       : Optimization level for code generation (e.g., "balanced").
  - `path_to_stm32ai`    : Local path to the stedgeai executable.
  - `path_to_cube_ide`   : Local path to the STM32CubeIDE executable.
  - `stm32ai_ide`        : IDE/compiler to use (must be "gcc" for STM32N6).
  - `stm32ai_serie`      : MCU series (must be "STM32N6" for this path).
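The Step 1 extraction can be sketched as follows. A plain dict stands in for the Hydra DictConfig, and the flat key layout is an assumption (the real user_config.yaml nests these keys under its own sections):

```python
def extract_deploy_params(cfg: dict) -> dict:
    """Collect the deployment keys listed above and sanity-check the
    combination accepted by the STM32N6 flow (illustrative sketch)."""
    keys = ("board", "stlink_serial_number", "c_project_path",
            "stm32ai_version", "optimization", "path_to_stm32ai",
            "path_to_cube_ide", "stm32ai_ide", "stm32ai_serie")
    params = {k: cfg[k] for k in keys}
    # The STM32N6 path only supports the GCC toolchain and the N6 series.
    if params["stm32ai_ide"] != "gcc" or params["stm32ai_serie"] != "STM32N6":
        raise TypeError("unsupported IDE/serie combination for the STM32N6 flow")
    return params
```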

**Step 2 — C header file generation:**
  gen_h_user_file_n6() generates `ai_model_config.h`, a C header that embeds
  model metadata (input/output shapes, number of keypoints, confidence thresholds)
  into the firmware. This allows the C application to configure itself at compile time.
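A hedged sketch of the kind of header gen_h_user_file_n6() emits; the macro names below are illustrative, not the exact ones produced by the real generator:

```python
def write_model_config_header(path, input_shape, n_keypoints, conf_thresh):
    """Write a compile-time C header embedding model metadata
    (hypothetical macro names; shapes are (H, W, C))."""
    lines = [
        "/* Auto-generated: do not edit. */",
        "#ifndef AI_MODEL_CONFIG_H",
        "#define AI_MODEL_CONFIG_H",
        f"#define NN_INPUT_HEIGHT   {input_shape[0]}",
        f"#define NN_INPUT_WIDTH    {input_shape[1]}",
        f"#define NN_INPUT_CHANNELS {input_shape[2]}",
        f"#define NN_NUM_KEYPOINTS  {n_keypoints}",
        f"#define NN_CONF_THRESHOLD {conf_thresh}f",
        "#endif /* AI_MODEL_CONFIG_H */",
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```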

**Step 3 — Board configuration file selection:**
  A board-specific .conf file is selected:
  - `stmaic_STM32N6570-DK.conf`   for the STM32N6570-DK Discovery Kit.
  - `stmaic_NUCLEO-N657X0-Q.conf` for the NUCLEO-N657X0-Q board.
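The Step 3 selection reduces to a lookup over the two .conf files named above (the lookup code itself is illustrative):

```python
# Board-specific configuration files, keyed by the `board` config value.
BOARD_CONF = {
    "STM32N6570-DK": "stmaic_STM32N6570-DK.conf",
    "NUCLEO-N657X0-Q": "stmaic_NUCLEO-N657X0-Q.conf",
}

def select_board_conf(board: str) -> str:
    """Return the .conf file for a supported STM32N6 board."""
    try:
        return BOARD_CONF[board]
    except KeyError:
        raise TypeError(f"board {board!r} is not supported by the STM32N6 flow")
```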

**Step 4 — Full deployment via stm32ai_deploy_stm32n6():**
  This common function:
    a. Invokes ST Edge AI Core to convert the .tflite model into optimized C arrays.
    b. Integrates the C code into the STM32CubeIDE project.
    c. Compiles the firmware using GCC.
    d. Flashes the binary onto the board via ST-Link (USB debugger).
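Substep (a) boils down to one ST Edge AI Core invocation; substeps (b)-(d) then run a headless STM32CubeIDE build and an ST-Link flash. The command builder below is a sketch: the flag names are an assumption about the stedgeai CLI, so verify them against `stedgeai --help` on your installation.

```python
def build_generate_cmd(path_to_stm32ai, model_path, output_dir):
    """Assemble the step (a) code-generation command (flag names are an
    assumption; check `stedgeai --help` for your version)."""
    return [path_to_stm32ai, "generate",
            "--model", model_path,
            "--target", "stm32n6",
            "--output", output_dir]
```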

@note After flashing, the user must manually toggle the boot switches on the
      STM32N6570-DK to the left and power-cycle the board to boot from flash.

@param cfg                Hydra DictConfig loaded from user_config.yaml.
@param model_path_to_deploy  Path to the quantized .tflite model to deploy.
                          If None, uses cfg.general.model_path.
@param credentials        Optional cloud credentials from a prior cloud_connect() call.

@return None
@throws ValueError  If the hardware series is STM32H7 (not supported in this flow).
@throws TypeError   If the board name or IDE/serie combination is not supported.

Key parameters passed to ST Edge AI Core during code generation:
  • input_data_type='uint8' : The NPU expects uint8 input images (0–255 range).
  • inputs_ch_position='chlast' : Channel-last (NHWC) format, as required by TFLite.
  • output_data_type='' : The output type is determined automatically.
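Because the firmware expects uint8, channel-last input, host-side test data must stay in the 0–255 range with no float normalization. A minimal packing helper (hypothetical, not part of the deploy module) illustrates the layout:

```python
def to_npu_input(pixels, height, width, channels=3):
    """Flatten nested [row][col][channel] pixel values into the uint8
    NHWC byte buffer described above (illustrative helper)."""
    flat = bytearray()
    for row in pixels:
        for px in row:
            for value in px:
                if not 0 <= value <= 255:
                    raise ValueError("uint8 inputs must lie in 0-255")
                flat.append(value)
    expected = height * width * channels
    if len(flat) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(flat)}")
    return bytes(flat)
```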

Definition at line 51 of file deploy.py.

References gen_h_file.gen_h_user_file_n6(), common_deploy.stm32ai_deploy_stm32n6(), and stm32ai_main.str.


◆ deploy_mpu()

None deploy.deploy_mpu ( DictConfig   cfg = None,
Optional[str]   model_path_to_deploy = None,
list   credentials = None 
)
@brief Deploy an AI model onto an STM32MP MPU target device.

@details
This function handles deployment on STM32MP-series Microprocessor Units (MPUs),
which run Linux and have more memory and compute resources than MCUs.

Unlike the MCU path, MPU deployment can optionally leverage the STM32Cube.AI
Developer Cloud to generate an optimized NBG (Network Binary Graph) file,
which is a proprietary binary format optimized for the NPU on STM32MP2 devices.

Cloud optimization flow (STM32MP2 + .tflite or .onnx + on_cloud=True):
  1. Upload the model to the Developer Cloud.
  2. Request NBG generation (optimized binary format for the NPU).
  3. Download the resulting .nb file.
  4. Deploy the .nb file instead of the original model.

If cloud optimization is not available or fails, the original model is used directly.
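The model-selection logic above can be sketched as follows. `generate_nbg` is a hypothetical callable standing in for the Developer Cloud upload/convert/download steps (returning a .nb path, or None on failure), and the board check assumes STM32MP257F-EV1 is the only MP2 target in the supported list:

```python
import os

def choose_model_for_mpu(board, model_path, on_cloud, generate_nbg):
    """Pick the model file to deploy on an MPU target (sketch)."""
    ext = os.path.splitext(model_path)[1].lower()
    if board == "STM32MP257F-EV1" and on_cloud and ext in (".tflite", ".onnx"):
        nb_path = generate_nbg(model_path)   # steps 1-3: upload, convert, download
        if nb_path:                          # step 4: deploy the .nb file
            return nb_path
    # Cloud optimization unavailable or failed: use the original model.
    return model_path
```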

Supported boards:
  - STM32MP257F-EV1
  - STM32MP157F-DK2
  - STM32MP135F-DK
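The board check that backs the documented TypeError can be sketched directly from the list above:

```python
# Supported MPU targets, as listed in the documentation above.
SUPPORTED_MPU_BOARDS = ("STM32MP257F-EV1", "STM32MP157F-DK2", "STM32MP135F-DK")

def check_mpu_board(board: str) -> str:
    """Raise TypeError for boards outside the supported MPU list (sketch)."""
    if board not in SUPPORTED_MPU_BOARDS:
        raise TypeError(f"{board} is not a supported MPU board")
    return board
```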

@param cfg                Hydra DictConfig loaded from user_config.yaml.
@param model_path_to_deploy  Path to the model file to deploy (.tflite or .onnx).
                          If None, uses cfg.general.model_path.
@param credentials        Optional cloud credentials from a prior cloud_connect() call.

@return None
@throws TypeError  If the board is not in the list of supported MPU boards,
                   or if the deployment process fails.

Definition at line 198 of file deploy.py.

References common_deploy.stm32ai_deploy_mpu().

Referenced by stm32ai_main.chain_qd(), and stm32ai_main.process_mode().
