STM32N6 NPU Deployment — Politecnico di Milano  1.0
Documentation for Neural Network Deployment on STM32N6 NPU - Politecnico di Milano 2024-2025
quantize.py File Reference

INT8 Post-Training Quantization (PTQ) pipeline for STM32N6 deployment.


Namespaces

 quantize
 

Functions

None quantize._tflite_ptq_quantizer (tf.keras.Model model=None, tf.data.Dataset quantization_ds=None, bool fake=False, str output_dir=None, Optional[str] export_dir=None, tuple input_shape=None, str quantization_granularity=None, str quantization_input_type=None, str quantization_output_type=None, str quantization_split=None, str quantization_path=None)
 
str quantize.quantize (DictConfig cfg=None, Optional[tf.data.Dataset] quantization_ds=None, Optional[bool] fake=False, Optional[str] float_model_path=None)
 

Detailed Description

INT8 Post-Training Quantization (PTQ) pipeline for STM32N6 deployment.

Converts a float32 Keras model to INT8 using a representative calibration dataset and the TensorFlow Lite Converter. Called by stm32ai_main.py in the quantization, chain_qd, chain_qb, and chain_eqe operation modes.
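The conversion path described above can be sketched with the public TensorFlow Lite Converter API. This is a minimal illustration, not the module's actual implementation: the helper name `tflite_ptq_quantize` and its parameters are assumptions, and the real `_tflite_ptq_quantizer` additionally handles granularity, I/O types, and export directories per its signature.

```python
import numpy as np
import tensorflow as tf

def tflite_ptq_quantize(model: tf.keras.Model,
                        calib_ds: tf.data.Dataset,
                        num_samples: int = 100) -> bytes:
    """Convert a float32 Keras model to full-INT8 TFLite via PTQ.

    Hypothetical sketch of the PTQ flow; names and defaults are
    illustrative, not the actual quantize.py implementation.
    """
    def representative_dataset():
        # Yield calibration batches so the converter can estimate
        # activation ranges for INT8 quantization.
        for sample in calib_ds.take(num_samples):
            yield [tf.cast(sample, tf.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict to INT8 kernels and force integer I/O, as typically
    # required for integer-only NPU targets such as the STM32N6.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()

# Example usage with a toy model and random calibration data:
model = tf.keras.Sequential([tf.keras.layers.Input(shape=(8,)),
                             tf.keras.layers.Dense(4)])
calib = tf.data.Dataset.from_tensor_slices(
    np.random.rand(16, 8).astype("float32")).batch(1)
tflite_bytes = tflite_ptq_quantize(model, calib, num_samples=8)
```

The resulting flatbuffer (`tflite_bytes`) would then be written to the configured output directory for deployment on the target.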

Note
Politecnico di Milano, A.Y. 2024-2025. Multidisciplinary Project — Neural Network Deployment on STM32N6 NPU.

Definition in file quantize.py.