STM32N6 NPU Deployment — Politecnico di Milano
Version 1.0
Documentation for Neural Network Deployment on the STM32N6 NPU, Politecnico di Milano, 2024-2025
Functions
| None | chain_qd (DictConfig cfg=None, str float_model_path=None, tf.data.Dataset train_ds=None, tf.data.Dataset quantization_ds=None) |
| None | chain_eqeb (DictConfig cfg=None, str float_model_path=None, tf.data.Dataset train_ds=None, tf.data.Dataset valid_ds=None, tf.data.Dataset quantization_ds=None, tf.data.Dataset test_ds=None) |
| None | chain_qb (DictConfig cfg=None, str float_model_path=None, tf.data.Dataset train_ds=None, tf.data.Dataset quantization_ds=None) |
| None | chain_eqe (DictConfig cfg=None, str float_model_path=None, tf.data.Dataset train_ds=None, tf.data.Dataset valid_ds=None, tf.data.Dataset quantization_ds=None, tf.data.Dataset test_ds=None) |
| None | chain_tqeb (DictConfig cfg=None, tf.data.Dataset train_ds=None, tf.data.Dataset valid_ds=None, tf.data.Dataset quantization_ds=None, tf.data.Dataset test_ds=None) |
| None | chain_tqe (DictConfig cfg=None, tf.data.Dataset train_ds=None, tf.data.Dataset valid_ds=None, tf.data.Dataset quantization_ds=None, tf.data.Dataset test_ds=None) |
| None | process_mode (str mode=None, DictConfig configs=None, tf.data.Dataset train_ds=None, tf.data.Dataset valid_ds=None, tf.data.Dataset quantization_ds=None, tf.data.Dataset test_ds=None, Optional[str] float_model_path=None, Optional[bool] fake=False) |
| None | main (DictConfig cfg) |
Variables
| SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) | Add the parent directory to sys.path so that shared modules in common/ are importable. More... |
| parser = argparse.ArgumentParser() | |
| type, str, default, help, nargs | Keyword arguments of parser.add_argument(), listed by Doxygen as separate module attributes. |
| args = parser.parse_args() | |
None stm32ai_main.chain_eqe (DictConfig cfg = None, str float_model_path = None, tf.data.Dataset train_ds = None, tf.data.Dataset valid_ds = None, tf.data.Dataset quantization_ds = None, tf.data.Dataset test_ds = None)
@brief Executes the Evaluation → Quantization → Evaluation pipeline (chain_eqe).
@details
Evaluates accuracy before and after INT8 quantization to measure the accuracy
degradation introduced by the quantization process. No on-device benchmarking.
@param cfg Hydra configuration dictionary.
@param float_model_path Path to the float32 model.
@param train_ds Training dataset (fallback for quantization calibration).
@param valid_ds Validation dataset for evaluation.
@param quantization_ds Dedicated calibration dataset for INT8 quantization.
@param test_ds Test dataset (takes priority over valid_ds).
@return None
Definition at line 229 of file stm32ai_main.py.
Referenced by process_mode().
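The evaluation-dataset priority described above (test_ds takes precedence over valid_ds) can be sketched as a small helper. `pick_eval_dataset` is a hypothetical name used for illustration only, not a function of stm32ai_main.py:

```python
def pick_eval_dataset(valid_ds=None, test_ds=None):
    """Return the dataset to evaluate on.

    Hypothetical sketch of the priority rule: test_ds wins over valid_ds;
    None if neither is provided.
    """
    return test_ds if test_ds is not None else valid_ds
```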
None stm32ai_main.chain_eqeb (DictConfig cfg = None, str float_model_path = None, tf.data.Dataset train_ds = None, tf.data.Dataset valid_ds = None, tf.data.Dataset quantization_ds = None, tf.data.Dataset test_ds = None)
@brief Executes the Evaluation → Quantization → Evaluation → Benchmarking pipeline (chain_eqeb).
@details
This chain is used to fully characterize both the float and quantized versions of a model:
1. Evaluate the float model to establish a baseline accuracy.
2. Quantize to INT8 using the provided calibration dataset.
3. Evaluate the quantized model to measure accuracy degradation.
4. Benchmark the quantized model on the target STM32 board to measure real-world latency.
@param cfg Hydra configuration dictionary.
@param float_model_path Path to the float32 model.
@param train_ds Training dataset (used as fallback for quantization calibration).
@param valid_ds Validation dataset for evaluation.
@param quantization_ds Dedicated calibration dataset for INT8 quantization.
@param test_ds Test dataset (takes priority over valid_ds for evaluation).
@return None
Definition at line 130 of file stm32ai_main.py.
Referenced by process_mode().
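The accuracy degradation measured between steps 1 and 3 is simply the difference between the two evaluations. `quantization_accuracy_drop` is a hypothetical helper sketched here for clarity, not part of the module:

```python
def quantization_accuracy_drop(float_accuracy: float, int8_accuracy: float) -> float:
    """Accuracy lost to INT8 quantization, in percentage points.

    Hypothetical sketch: positive values mean the quantized model is less
    accurate than the float baseline.
    """
    return round(float_accuracy - int8_accuracy, 4)
```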
None stm32ai_main.chain_qb (DictConfig cfg = None, str float_model_path = None, tf.data.Dataset train_ds = None, tf.data.Dataset quantization_ds = None)
@brief Executes the Quantization → Benchmarking pipeline (chain_qb).
@details
Useful when accuracy evaluation is not needed and the goal is to quickly
measure the on-device performance of a quantized model (latency, memory usage).
@param cfg Hydra configuration dictionary.
@param float_model_path Path to the float32 model to quantize.
@param train_ds Training dataset (fallback for quantization calibration).
@param quantization_ds Dedicated calibration dataset for INT8 quantization.
@return None
Definition at line 190 of file stm32ai_main.py.
Referenced by process_mode().
None stm32ai_main.chain_qd (DictConfig cfg = None, str float_model_path = None, tf.data.Dataset train_ds = None, tf.data.Dataset quantization_ds = None)
@brief Executes the Quantization → Deployment pipeline (chain_qd).
@details
This chain is used when a float model is already trained and only needs to be
quantized and then deployed onto the STM32N6 board.
Quantization strategy (in order of priority):
1. Use the dedicated quantization dataset if provided.
2. Fall back to the training dataset if no quantization dataset is available.
3. Use fake (synthetic) data if neither dataset is provided; accuracy will be degraded.
After quantization, the model is deployed:
- On MPU targets via deploy_mpu().
- On MCU targets (e.g., STM32N6570-DK) via deploy().
@param cfg Hydra configuration dictionary loaded from user_config.yaml.
@param float_model_path Path to the float32 model file (.tflite, .h5, or .onnx).
@param train_ds TensorFlow dataset used as fallback for quantization calibration.
@param quantization_ds Dedicated TensorFlow dataset for INT8 quantization calibration.
@return None
Definition at line 72 of file stm32ai_main.py.
References deploy.deploy_mpu().
Referenced by process_mode().
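The three-level calibration fallback described above can be sketched as a small selection helper. `pick_calibration_source` is a hypothetical name for illustration; the real chain_qd() performs this selection inline:

```python
def pick_calibration_source(quantization_ds=None, train_ds=None):
    """Mirror the calibration fallback order described for chain_qd.

    Returns (dataset, use_fake): the dedicated quantization set if given,
    else the training set, else (None, True) meaning synthetic data.
    """
    if quantization_ds is not None:
        return quantization_ds, False  # 1. dedicated calibration dataset
    if train_ds is not None:
        return train_ds, False         # 2. fall back to the training set
    return None, True                  # 3. fake data; accuracy will degrade
```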
None stm32ai_main.chain_tqe (DictConfig cfg = None, tf.data.Dataset train_ds = None, tf.data.Dataset valid_ds = None, tf.data.Dataset quantization_ds = None, tf.data.Dataset test_ds = None)
@brief Executes the Training → Quantization → Evaluation pipeline (chain_tqe).
@details
Similar to chain_tqeb but without the final on-device benchmarking step.
Useful when the goal is to verify accuracy after quantization without needing
to connect a physical STM32 board.
@param cfg Hydra configuration dictionary.
@param train_ds Training dataset.
@param valid_ds Validation dataset.
@param quantization_ds Dedicated calibration dataset (falls back to train_ds if not provided).
@param test_ds Test dataset (takes priority over valid_ds for evaluation).
@return None
Definition at line 339 of file stm32ai_main.py.
Referenced by process_mode().
None stm32ai_main.chain_tqeb (DictConfig cfg = None, tf.data.Dataset train_ds = None, tf.data.Dataset valid_ds = None, tf.data.Dataset quantization_ds = None, tf.data.Dataset test_ds = None)
@brief Executes the full Training → Quantization → Evaluation → Benchmarking pipeline (chain_tqeb).
@details
This is the most complete pipeline, covering the entire model lifecycle from
training to on-device performance measurement. It is particularly useful when
starting from scratch or when fine-tuning a model for a new dataset.
Pipeline steps:
1. Train the model on the provided training dataset.
2. Quantize the trained model to INT8.
3. Evaluate the quantized model for accuracy.
4. Benchmark on the target STM32 board.
@param cfg Hydra configuration dictionary.
@param train_ds Training dataset.
@param valid_ds Validation dataset.
@param quantization_ds Dedicated calibration dataset (falls back to train_ds if not provided).
@param test_ds Test dataset (takes priority over valid_ds for evaluation).
@return None
Definition at line 278 of file stm32ai_main.py.
Referenced by process_mode().
None stm32ai_main.main (DictConfig cfg)
@brief Main entry point of the STM32AI Model Zoo Services script.
@details
This function is decorated with @hydra.main, which means Hydra automatically
loads the configuration from `user_config.yaml` and passes it as a DictConfig object.
Execution flow:
1. Configure GPU memory limits (if specified in the config).
2. Parse and validate the full configuration via get_config().
3. Initialize MLflow experiment tracking.
4. Optionally initialize ClearML task tracking.
5. Set the global random seed for reproducibility.
6. Load and preprocess datasets (if required by the selected mode).
7. Dispatch to process_mode() based on cfg.operation_mode.
@note The operation mode is read from the YAML field `operation_mode`.
Modes requiring datasets (training, evaluation, etc.) will call preprocess()
to load and prepare the data. Modes like deployment do not require datasets.
@param cfg Hydra DictConfig object automatically populated from user_config.yaml.
@return None
Definition at line 536 of file stm32ai_main.py.
References parse_config.get_config(), and process_mode().
None stm32ai_main.process_mode (str mode = None, DictConfig configs = None, tf.data.Dataset train_ds = None, tf.data.Dataset valid_ds = None, tf.data.Dataset quantization_ds = None, tf.data.Dataset test_ds = None, Optional[str] float_model_path = None, Optional[bool] fake = False)
@brief Dispatches execution to the appropriate pipeline based on the operation mode.
@details
This function acts as a central dispatcher. It reads the `operation_mode` field
from the configuration and calls the corresponding function or chain.
Supported modes:
- 'training' : Train a model.
- 'evaluation' : Evaluate model accuracy on a dataset.
- 'quantization' : Quantize a float model to INT8.
- 'deployment' : Deploy the model onto the STM32 board (generates C code, compiles, flashes).
- 'prediction' : Run inference on new input data.
- 'benchmarking' : Measure on-device performance metrics.
- 'chain_tqeb' : Training → Quantization → Evaluation → Benchmarking.
- 'chain_tqe' : Training → Quantization → Evaluation.
- 'chain_eqe' : Evaluation → Quantization → Evaluation.
- 'chain_qb' : Quantization → Benchmarking.
- 'chain_eqeb' : Evaluation → Quantization → Evaluation → Benchmarking.
- 'chain_qd' : Quantization → Deployment.
@note In deployment mode for STM32N6570-DK, after flashing the user must manually
toggle the boot switches and power-cycle the board.
@param mode Operation mode string (e.g., 'deployment', 'chain_qd').
@param configs Hydra configuration dictionary.
@param train_ds Training TensorFlow dataset.
@param valid_ds Validation TensorFlow dataset.
@param quantization_ds Calibration dataset for INT8 quantization.
@param test_ds Test TensorFlow dataset.
@param float_model_path Path to the float32 model file.
@param fake If True, use synthetic data for quantization calibration.
@return None
@throws ValueError if an unsupported operation_mode is provided.
Definition at line 383 of file stm32ai_main.py.
References chain_eqe(), chain_eqeb(), chain_qb(), chain_qd(), chain_tqe(), chain_tqeb(), and deploy.deploy_mpu().
Referenced by main().
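The dispatch logic above can be sketched as a table mapping operation modes to ordered pipeline steps. This is a hypothetical simplification for illustration: the real process_mode() invokes the chain functions and single-step services directly rather than returning step names:

```python
# Hypothetical sketch of the process_mode() dispatch table.
CHAIN_STEPS = {
    "chain_tqeb": ["training", "quantization", "evaluation", "benchmarking"],
    "chain_tqe":  ["training", "quantization", "evaluation"],
    "chain_eqe":  ["evaluation", "quantization", "evaluation"],
    "chain_qb":   ["quantization", "benchmarking"],
    "chain_eqeb": ["evaluation", "quantization", "evaluation", "benchmarking"],
    "chain_qd":   ["quantization", "deployment"],
}
SINGLE_MODES = {"training", "evaluation", "quantization",
                "deployment", "prediction", "benchmarking"}

def dispatch(mode):
    """Return the ordered pipeline steps for a mode, or raise ValueError."""
    if mode in CHAIN_STEPS:
        return CHAIN_STEPS[mode]
    if mode in SINGLE_MODES:
        return [mode]
    # Mirrors the documented behaviour for unsupported operation modes.
    raise ValueError(f"Unsupported operation_mode: {mode}")
```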
stm32ai_main.args = parser.parse_args()
Definition at line 634 of file stm32ai_main.py.
stm32ai_main.default
Definition at line 628 of file stm32ai_main.py.
stm32ai_main.help
Definition at line 629 of file stm32ai_main.py.
stm32ai_main.nargs
Definition at line 632 of file stm32ai_main.py.
stm32ai_main.parser = argparse.ArgumentParser()
Definition at line 627 of file stm32ai_main.py.
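The entries type, str, default, help, and nargs above are keyword arguments of parser.add_argument() that Doxygen lists as separate module attributes. A hypothetical reconstruction is sketched below; the actual option name is not shown in this documentation, so "--config-path" is illustrative only:

```python
import argparse

# Hypothetical reconstruction of the parser set up around lines 627-632;
# the real option name and values are not visible in the generated docs.
parser = argparse.ArgumentParser()
parser.add_argument("--config-path", type=str, default="./", nargs="?",
                    help="path to the folder containing user_config.yaml")
args = parser.parse_args([])  # empty list: use defaults instead of sys.argv
```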
stm32ai_main.SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
Add the parent directory to sys.path so that shared modules in common/ are importable.
Definition at line 54 of file stm32ai_main.py.
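The sys.path manipulation described above follows a common pattern; a minimal sketch, assuming the standard os/sys idiom rather than the module's exact code:

```python
import os
import sys

# Compute the directory containing this script, then append its parent to
# sys.path so that modules under common/ become importable.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
PARENT_DIR = os.path.abspath(os.path.join(SCRIPT_DIR, os.pardir))
if PARENT_DIR not in sys.path:
    sys.path.append(PARENT_DIR)
```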
stm32ai_main.str
Definition at line 628 of file stm32ai_main.py.
Referenced by deploy.deploy(), and common_deploy.stm32ai_deploy_mpu().
stm32ai_main.type
Definition at line 628 of file stm32ai_main.py.