Quick Quantization Results
Overview
This document describes the output files generated by the quick quantization feature: how the quantization results are organized and what each file is for.
Generated Files
After executing the quick quantization command, the system generates the following files in the specified save_path directory:
├── config.json # Original model configuration file
├── generation_config.json # Original generation configuration file
├── quant_model_description.json # Description file for quantized weights
├── quant_model_weights.safetensors # Quantized model weight file (sharded and indexed by index.json if the weights are large)
├── tokenizer_config.json # Original tokenizer configuration file
├── tokenizer.json # Original tokenizer vocabulary
├── {model_type}_best_practice.yaml # Quantization configuration protocol file
├── vocab.json # Original vocabulary mapping file (for some models)
├── optional/ # Optional export directory (generated when certain algorithms are enabled)
│   └── quarot.safetensors # QuaRot global rotation matrix (generated when export_extra_info is enabled)
└── debug_info/ # Debug information directory (generated only when the --debug option is specified)
    ├── debug_info.json # Debug metadata in JSON format
    └── debug_info.safetensors # Debug tensor data in SafeTensors format
File Description
| File | Description |
|---|---|
| config.json | Original model configuration file, which contains key architectural parameters such as the model framework, number of layers, and hidden dimensions. |
| generation_config.json | Generation configuration file of the original model, which contains inference-related parameters such as the sampling strategy and maximum generation length. |
| quant_model_description.json | Quantization weight description file, which records the exact quantization type and metadata for each weight tensor. |
| quant_model_weights.safetensors | Quantized model weight file, which contains the quantized weight data of the model. If the weights are large, they are sharded across multiple files and indexed through model.safetensors.index.json. |
| tokenizer_config.json | Configuration file of the original tokenizer, which stores parameters such as special tokens and the vocabulary size. |
| tokenizer.json | Vocabulary file of the original tokenizer, which defines the mapping relationship between tokens and unique IDs. |
| {model_type}_best_practice.yaml | Quantization configuration protocol file, which records the complete configuration parameters used during execution. For details, see Quantization Configuration Protocol. |
| vocab.json | Original vocabulary mapping file, which is included automatically with certain architectures, such as GPT-style models. |
| optional/quarot.safetensors | QuaRot global rotation matrix file, which is generated only when QuaRot is used with export_extra_info set to True. It stores the global rotation matrix Q. For details, see QuaRot - Rotation Quantization. |
| debug_info/ | Debug information directory (generated only when the --debug option is specified), which contains the quantization context for troubleshooting and algorithm analysis. For details, see Debug Information Output. |
quant_model_description.json
The quant_model_description.json file records the quantization type and related metadata for each weight tensor in the model. The inference framework relies on it to load the quantized model.
File Structure Example
{
"model_quant_type": "W8A8",
"version": "1.0.0",
"group_size": 128,
"kv_quant_type": "KV8",
"model.layers.0.self_attn.qkv_proj.weight": "W8A8",
"model.layers.0.self_attn.o_proj.weight": "W8A8",
"model.layers.0.mlp.gate_proj.weight": "W8A8",
"model.layers.0.mlp.up_proj.weight": "W8A8",
"model.layers.0.mlp.down_proj.weight": "W8A8",
"metadata": {},
"optional": {}
}
Note
The *.weight field name is determined by the model architecture itself.
Field Description
Global Metadata Fields
| Field | Type | Description |
|---|---|---|
| model_quant_type | string | Quantization type of the entire model, which identifies the quantization scheme. |
| version | string | Version number of the quantization tool, in the format of x.x.x. |
| group_size | int | Quantization group size, which applies to group quantization. |
| kv_quant_type | string | KV Cache quantization type, where KV8 indicates 8-bit KV Cache. |
| kv_cache_type | string | Alias for the KV Cache quantization type, which shares an identical meaning with kv_quant_type. |
| fa_quant_type | string | Flash Attention quantization type, such as FAQuant. |
| reduce_quant_type | string | Communication quantization type, such as per_channel. |
| metadata | object | Additional metadata, such as the QuaRot information. |
| optional | object | Optional functional module information, such as the QuaRot global rotation matrix path. |
Quantization Type Enumerated Values
| Enumerated Value | Description |
|---|---|
| FLOAT | Floating-point (unquantized) |
| W16A16S | W16A16S sparse quantization with 16-bit sparse weights |
| W8A8 | W8A8 quantization with 8-bit weights and 8-bit activations |
| W8A8_DYNAMIC | W8A8 dynamic quantization with 8-bit weights and per-token dynamic activations |
| W8A8_MIX | W8A8 mixed quantization combining static and dynamic quantization |
| W8A16 | W8A16 quantization with 8-bit weights and 16-bit activations |
| W4A4_DYNAMIC | W4A4 dynamic quantization with 4-bit weights and per-token dynamic activations |
| WFP8AFP8_DYNAMIC | WFP8/AFP8 dynamic quantization |
| W8A8_MXFP8 | W8A8 MXFP8 quantization |
| W4A8_MXFP | W4A8 MXFP quantization |
| W4A4_MXFP4 | W4A4 MXFP4 quantization |
| W4A8_DYNAMIC | W4A8 dynamic quantization with 4-bit weights and per-token dynamic activations |
| C8 | KV Cache 8-bit quantization |
| FAQuant | Flash Attention quantization |
| FLATQUANT_DYNAMIC | FlatQuant dynamic quantization |
| FLATQUANT | FlatQuant static quantization |
Weight Tensor Quantization Type
In addition to the global metadata fields described above, other key-value pairs in the JSON file represent the quantization types of individual weight tensors in the model. The key is the weight name, and the value is the quantization type used by the weight.
Example:
{
"model.layers.0.self_attn.qkv_proj.weight": "W8A8"
}
This pair indicates that the model.layers.0.self_attn.qkv_proj.weight weight uses the W8A8 quantization type.
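Because global fields and per-tensor entries share one flat JSON object, a reader has to separate them by key name. The following is a minimal sketch of that split, not the loader used by any inference framework. The helper name split_description is hypothetical, the set of global keys is taken from the Global Metadata Fields table, and the model.norm.weight entry is an invented example of an unquantized tensor.

```python
import json
from collections import Counter

# Global keys, as listed in the Global Metadata Fields table.
GLOBAL_FIELDS = {
    "model_quant_type", "version", "group_size", "kv_quant_type",
    "kv_cache_type", "fa_quant_type", "reduce_quant_type",
    "metadata", "optional",
}


def split_description(description: dict) -> tuple[dict, dict]:
    """Separate global metadata from per-tensor quantization types (hypothetical helper)."""
    global_meta = {k: v for k, v in description.items() if k in GLOBAL_FIELDS}
    tensor_types = {k: v for k, v in description.items() if k not in GLOBAL_FIELDS}
    return global_meta, tensor_types


# Shortened example; "model.norm.weight" is an assumed entry for illustration.
description = json.loads("""
{
  "model_quant_type": "W8A8",
  "version": "1.0.0",
  "group_size": 128,
  "model.layers.0.self_attn.qkv_proj.weight": "W8A8",
  "model.layers.0.mlp.down_proj.weight": "W8A8",
  "model.norm.weight": "FLOAT",
  "metadata": {},
  "optional": {}
}
""")

global_meta, tensor_types = split_description(description)
type_counts = Counter(tensor_types.values())  # how many tensors use each type
print(global_meta["model_quant_type"], dict(type_counts))
```

Counting the values this way gives a quick check of how many tensors were left in FLOAT after quantization.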
Weight Type Details
This section describes the parameter structure of different quantization modes based on the fields in quant_model_description.json. Quantization parameters vary depending on the quantization mode, and the .safetensors weight files and .json description files differ accordingly.
Quantization Mode Parameters
The following content describes the parameter structure of each quantization mode based on the AscendV1Saver implementation.
FLOAT (Unquantized)
FLOAT indicates weights that have not undergone quantization. These weights retain their original floating-point precision.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | float16/bfloat16 | Original floating-point weight |
| bias | float16/bfloat16 | Bias (optional) |
Typical Weight Names
model.layers.0.self_attn.q_proj.weight
model.layers.0.self_attn.q_proj.bias
quant_model_description.json Identifier: FLOAT
W16A16S (Sparse Quantization)
W16A16S is a floating-point weight sparse quantization mode, where weights retain floating-point precision and undergo sparsification processing.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | float16/bfloat16 | Sparsified weights (non-zero values) |
| scale | float16/bfloat16 | Scaling factor |
Typical Weight Names
model.layers.0.mlp.gate_proj.weight
model.layers.0.mlp.gate_proj.scale
quant_model_description.json Identifier: W16A16S
W8A8 (Static Quantization)
W8A8 is a static quantization mode, which applies int8 quantization to both weights and activations.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8 | Quantized weight data |
| quant_bias | int32 | Quantization bias |
| input_scale | float16/bfloat16 | Activation quantization scale factor |
| input_offset | float16/bfloat16 | Activation quantization offset factor |
| deq_scale | int64/float32 | Combined dequantization scale factor calculated as input_scale × weight_scale |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.quant_bias
model.layers.0.self_attn.qkv_proj.input_scale
model.layers.0.self_attn.qkv_proj.input_offset
model.layers.0.self_attn.qkv_proj.deq_scale
quant_model_description.json Identifier: W8A8
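The role of deq_scale can be illustrated with a small worked example. The sketch below is not the AscendV1Saver or kernel implementation: it assumes symmetric quantization (input_offset of zero), leaves out quant_bias, and uses plain Python lists in place of tensors. The function names are hypothetical.

```python
def quantize_activation(values, input_scale):
    """Symmetric int8 quantization: round(x / scale), clamped to [-128, 127]."""
    return [max(-128, min(127, round(v / input_scale))) for v in values]


def w8a8_dot(x, weight_q, input_scale, weight_scale):
    """Integer dot product followed by a single multiplication with deq_scale."""
    x_q = quantize_activation(x, input_scale)
    acc = sum(a * b for a, b in zip(x_q, weight_q))  # integer accumulator
    deq_scale = input_scale * weight_scale           # combined scale from the table
    return acc * deq_scale


x = [0.5, 1.0]          # floating-point activations
weight_q = [4, 2]       # int8 weights; real weights are weight_q * weight_scale
input_scale = 0.5
weight_scale = 0.25

y = w8a8_dot(x, weight_q, input_scale, weight_scale)
reference = sum(a * (q * weight_scale) for a, q in zip(x, weight_q))
print(y, reference)
```

Because both scales are folded into deq_scale ahead of time, the integer accumulator needs only one multiplication to return to floating point.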
W8A8_DYNAMIC (Dynamic Quantization)
W8A8_DYNAMIC is a mode with int8 weight quantization and per-token dynamic activation quantization.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8 | Quantized weight data |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor, which contains all zeros for symmetric quantization |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
model.layers.0.self_attn.qkv_proj.weight_offset
Dequantization Formula
deq_weight = (weight - weight_offset) * weight_scale
quant_model_description.json Identifier: W8A8_DYNAMIC
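As an illustration of the dequantization formula, the sketch below applies it row by row to a small matrix. It assumes one scale and one offset per output channel (row), which the source does not state, and it uses plain Python lists instead of tensors. The function name dequantize is hypothetical.

```python
def dequantize(weight, weight_scale, weight_offset):
    """Apply deq_weight = (weight - weight_offset) * weight_scale to each row."""
    return [
        [(q - offset) * scale for q in row]
        for row, scale, offset in zip(weight, weight_scale, weight_offset)
    ]


weight = [[-128, 0, 127], [64, -64, 32]]  # int8 values, two output channels
weight_scale = [0.5, 0.25]                # assumed: one scale per row
weight_offset = [0.0, 0.0]                # all zeros for symmetric quantization

deq_weight = dequantize(weight, weight_scale, weight_offset)
print(deq_weight)
```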
W8A8_MIX (Mixed Quantization)
W8A8_MIX is a mixed mode that combines static and dynamic quantization.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8 | Quantized weight data |
| quant_bias | int32 | Quantization bias |
| input_scale | float16/bfloat16 | Activation quantization scale factor |
| input_offset | float16/bfloat16 | Activation quantization offset factor |
| deq_scale | int64/float32 | Combined dequantization scale factor |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.quant_bias
model.layers.0.self_attn.qkv_proj.input_scale
model.layers.0.self_attn.qkv_proj.input_offset
model.layers.0.self_attn.qkv_proj.deq_scale
model.layers.0.self_attn.qkv_proj.weight_scale
model.layers.0.self_attn.qkv_proj.weight_offset
quant_model_description.json Identifier: W8A8_MIX
W8A16 (Weight-Only Quantization)
W8A16 is a weight quantization mode that applies int8 quantization only to weights, while activations retain floating-point precision.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8 | Quantized weight data |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor, which contains all zeros for symmetric quantization |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
model.layers.0.self_attn.qkv_proj.weight_offset
Dequantization Formula
deq_weight = (weight - weight_offset) * weight_scale
quant_model_description.json Identifier: W8A16
W4A4_DYNAMIC (W4A4 Dynamic Quantization)
W4A4_DYNAMIC is an ultra-low-bit mode featuring int4 weight quantization and per-token dynamic activation quantization.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8 | Quantized weight data packed as int4 storage |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
model.layers.0.self_attn.qkv_proj.weight_offset
Note: Activation quantization parameters are calculated dynamically during inference and are not saved to the weight file.
quant_model_description.json Identifier: W4A4_DYNAMIC
WFP8AFP8_DYNAMIC (FP8 Dynamic Quantization)
WFP8AFP8_DYNAMIC is an FP8 floating-point dynamic quantization mode.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | float8_e4m3fn | Weight in FP8 format |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
model.layers.0.self_attn.qkv_proj.weight_offset
quant_model_description.json Identifier: WFP8AFP8_DYNAMIC
W8A8_MXFP8 (MXFP8 Quantization)
W8A8_MXFP8 is a quantization mode using Microscaling (MX) formats and FP8.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | float8_e4m3fn | Weight in FP8 format |
| weight_scale | uint8 | Scale factor stored with a +127 offset |
| bias | float32 | Original floating-point bias (optional) |
Note: The weight_scale parameter undergoes a +127 offset adjustment to shift its value range from -127 to 128 over to 0 to 255, which exactly covers the value range of uint8.
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
quant_model_description.json Identifier: W8A8_MXFP8
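The +127 offset described in the note above can be sketched as a pair of encode and decode helpers. This is an illustration under assumptions, not the tool's code: it treats the signed value as a power-of-two exponent, as in the Microscaling shared-scale convention, and the function names are hypothetical.

```python
EXPONENT_OFFSET = 127  # offset named in the note above


def encode_scale_exponent(exponent: int) -> int:
    """Shift a signed exponent into the uint8 range 0..255."""
    stored = exponent + EXPONENT_OFFSET
    if not 0 <= stored <= 255:
        raise ValueError(f"exponent {exponent} does not fit in uint8 after the offset")
    return stored


def decode_scale(stored: int) -> float:
    """Recover the scale, assuming the stored byte is a biased power-of-two exponent."""
    return 2.0 ** (stored - EXPONENT_OFFSET)


stored = encode_scale_exponent(-3)  # a scale of 2**-3
scale = decode_scale(stored)
print(stored, scale)
```

Under this interpretation, a stored byte of 127 corresponds to a scale of exactly 1.0.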
W4A8_MXFP (W4A8 MXFP Quantization)
W4A8_MXFP is an MXFP quantization mode with 4-bit weights and 8-bit activations.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | float8_e4m3fn | Weight in FP8 format |
| weight_scale | uint8 | Scale factor stored with a +127 offset |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
quant_model_description.json Identifier: W4A8_MXFP
W4A4_MXFP4 (W4A4 MXFP4 Quantization)
W4A4_MXFP4 is an ultra-low-bit MXFP4 quantization mode.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | float8_e4m3fn | Weight in FP8 format |
| weight_scale | uint8 | Scale factor stored with a +127 offset |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
quant_model_description.json Identifier: W4A4_MXFP4
C8 (KV Cache Quantization)
C8 is an 8-bit quantization mode for the Key-Value (KV) Cache.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| kv_cache_scale | float32/float16 | Scale factor for KV cache quantization |
| kv_cache_offset | float32/float16 | Offset factor for KV cache quantization |
Typical Weight Names
model.layers.0.self_attn.k_proj.kv_cache_scale
model.layers.0.self_attn.k_proj.kv_cache_offset
model.layers.0.self_attn.v_proj.kv_cache_scale
model.layers.0.self_attn.v_proj.kv_cache_offset
quant_model_description.json Identifier: C8
W4A8_DYNAMIC (W4A8 Dynamic Quantization)
W4A8_DYNAMIC is a mode with int4 weight quantization and per-token dynamic activation quantization.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8 | Quantized weight data packed as int4 storage |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor |
| scale_bias | float32 | Scale bias factor used for additional adjustments during dequantization |
| bias | float32 | Original floating-point bias (optional) |
Typical Weight Names
model.layers.0.self_attn.qkv_proj.weight
model.layers.0.self_attn.qkv_proj.weight_scale
model.layers.0.self_attn.qkv_proj.weight_offset
model.layers.0.self_attn.qkv_proj.scale_bias
Dequantization Formula
deq_weight = (weight - weight_offset) * weight_scale + scale_bias
quant_model_description.json Identifier: W4A8_DYNAMIC
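The extra scale_bias term in this formula can be checked with a few hand-picked values. The sketch below assumes the int4 values have already been unpacked into ordinary integers in the range -8 to 7 and that the three parameters are scalars. Both are simplifications for illustration, and the function name is hypothetical.

```python
def dequantize_w4a8(weight, weight_scale, weight_offset, scale_bias):
    """Apply deq_weight = (weight - weight_offset) * weight_scale + scale_bias."""
    return [(q - weight_offset) * weight_scale + scale_bias for q in weight]


unpacked_int4 = [-8, 0, 7]  # assumed: values already unpacked from int8 storage
deq_weight = dequantize_w4a8(
    unpacked_int4, weight_scale=0.5, weight_offset=1.0, scale_bias=0.25
)
print(deq_weight)
```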
FlatQuant_DYNAMIC (FlatQuant Dynamic Quantization)
FlatQuant_DYNAMIC is a FlatQuant dynamic quantization mode.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| weight | int8/int32 | Quantized weight data |
| weight_scale | float32 | Weight quantization scale factor |
| weight_offset | float32 | Weight quantization offset factor |
| input_scale | float32 | Activation quantization scale factor (dynamic) |
| input_offset | float32 | Activation quantization offset factor (dynamic) |
| deq_scale | float32 | Combined dequantization scale factor |
| quant_bias | int32 | Quantization bias |
| left_trans | float32 | Left matrix for linear transformation |
| right_trans | float32 | Right matrix for linear transformation |
| clip_ratio | float32 | Clipping ratio factor |
| bias | float32 | Original floating-point bias (optional) |
Notes:
- FlatQuant is a quantization method combined with linear transformation.
- The left_trans and right_trans parameters are matrices used for feature transformation.
- The clip_ratio parameter controls the quantization range.
quant_model_description.json Identifier: W8A8_FLATQUANT_DYNAMIC or W4A8_FLATQUANT_DYNAMIC
NonFusionSmoothQuant (Smooth Quantization)
NonFusionSmoothQuant is a smooth quantization mode that reduces quantization error.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| div.mul_scale | float32 | Smooth scale factor |
| Other parameters | - | Determined by the quantization type of the internal linear layer |
Typical Weight Names
model.layers.0.self_attn.q_proj.div.mul_scale
model.layers.0.self_attn.q_proj.linear.weight
quant_model_description.json Identifier: FLOAT (Inner Weights)
FAQuant (Flash Attention Quantization)
FAQuant is a Flash Attention quantization mode.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| scale | float16/bfloat16 | Quantization scale factor |
| offset | float16/bfloat16 | Quantization offset factor |
Typical Weight Names
model.layers.0.self_attn.q_proj.scale
model.layers.0.self_attn.q_proj.offset
quant_model_description.json Identifier: FAQuant
Notes:
- A checkmark (✓) indicates that the quantization mode contains this parameter, a hyphen (-) indicates that the quantization mode does not contain this parameter, and ✓ (+127) indicates that a +127 offset adjustment is required.
- The parameters of NonFusionSmoothQuant are determined by the quantization type of the internal linear layer, and this mode additionally includes the div.mul_scale parameter.
- The rotation matrix parameters of QuaRot may include a subset or the entirety of the rotation matrices based on the specific implementation.
QuaRot - Rotation Quantization
Parameter Description
QuaRot is a rotation-based quantization method that maintains the functional equivalence of a quantized model.
Quantization Parameters (Stored in .safetensors)
| Parameter | Data Type | Description |
|---|---|---|
| heads_rotation | float32 | Rotation matrix for multi-head attention |
| kronecker_rotation_m | float32 | Kronecker rotation matrix M |
| kronecker_rotation_n | float32 | Kronecker rotation matrix N |
| global_rotation | float32 | Global rotation matrix saved in the optional directory |
Notes:
- The heads_rotation parameter handles rotation for multi-head attention.
- The kronecker_rotation_m and kronecker_rotation_n parameters are used for rotation at the MLP layer.
- The global_rotation parameter is saved in the optional/quarot.safetensors file.
File Description
optional/quarot.safetensors
When using the QuaRot algorithm with export_extra_info set to True, the quantization tool generates an additional optional/ subdirectory within the save_path directory to store the global rotation matrix Q in SafeTensors format. The directory structure is as follows:
optional/
└── quarot.safetensors # QuaRot global rotation matrix file.
Global rotation matrix Q
| Key | Data Type | Description |
|---|---|---|
| global_rotation | float32 | QuaRot global rotation matrix Q |
Description Fields in quant_model_description.json
When online rotation is enabled, a metadata.quarot field is added to the quant_model_description.json file:
{
"metadata": { // Additional metadata information.
"quarot": { // Additional QuaRot export domain.
"max_tp_size": 4, // Maximum TP size, configured by the max_tp_size parameter in the QuaRot quantization settings.
"heads_rotation": { // Rotation matrix for multi-head attention.
"layers": [ // Layers that use online rotation (attention o layers).
"model.layers.0.self_attn.o_proj.",
"model.layers.1.self_attn.o_proj.",
"model.layers.2.self_attn.o_proj."
]
},
"kronecker_rotation": { // Kronecker rotation matrix.
"layers": [ // Layers using online rotation (down layers), specified by down_proj_online_layers in the QuaRot configurations, and described by kronecker_rotation_m and kronecker_rotation_n in the safetensors file.
"model.layers.2.mlp.down_proj."
]
}
}
}
}
When export_extra_info is enabled, an optional.quarot field is added to the quant_model_description.json file:
{
"optional": { // Primary entry for optional export assets
"quarot": { // Additional QuaRot export domain
"rotation_map": { // Rotation information mapping table
"global_rotation": "optional/quarot.safetensors" // Global rotation matrix file path (relative path)
}
}
}
}
Application Scenarios
- Inference framework loading: The inference framework reads the optional.quarot.rotation_map path inside the quant_model_description.json file and loads the global rotation matrix from that path to execute online rotation calculations.
- Algorithm reproduction and debugging: Load the rotation matrix directly to verify the mathematical equivalence of the QuaRot transformation.
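The lookup described in the first scenario can be sketched as follows. This is a minimal illustration rather than the logic of any particular inference framework: the helper name resolve_global_rotation and the directory name quantized_model are hypothetical, and the relative path is assumed to be resolved against save_path.

```python
from pathlib import Path
from typing import Optional


def resolve_global_rotation(description: dict, save_path: str) -> Optional[Path]:
    """Return the path of the global rotation file, or None if it was not exported."""
    rotation_map = (
        description.get("optional", {})
        .get("quarot", {})
        .get("rotation_map", {})
    )
    relative = rotation_map.get("global_rotation")
    if relative is None:
        return None
    # Assumption: the stored path is relative to the save_path directory.
    return Path(save_path) / relative


description = {
    "optional": {
        "quarot": {
            "rotation_map": {"global_rotation": "optional/quarot.safetensors"}
        }
    }
}
rotation_path = resolve_global_rotation(description, "quantized_model")
print(rotation_path)
```

The returned file can then be opened with safe_open, and the matrix read under the global_rotation key.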
Debug Information Output
When the --debug option is specified in the quantization command, msModelSlim automatically saves the execution context during the quantization process to the debug_info directory after quantization is complete.
Debug Information Directory Structure
debug_info/
├── debug_info.json # Debug metadata in JSON format
└── debug_info.safetensors # Debug tensor data in SafeTensors format
Debug Information Files
debug_info.json
Stores non-tensor data and tensor metadata generated during the quantization process. The data is organized by namespace.
File Structure Example
{
"linear_quant_namespace": {
"layer_name": "model.layers.0.self_attn.qkv_proj",
"quant_config": {
"weight_dtype": "int8",
"act_dtype": "int8"
},
"statistics": {
"weight_min": -0.5,
"weight_max": 0.5
},
"scale_tensor": {
"_type": "tensor",
"_file": "debug_info.safetensors",
"_key": "tensor_0"
}
},
"iter_smooth_namespace": {
"smoothing_factors": {
"_type": "tensor",
"_file": "debug_info.safetensors",
"_key": "tensor_1"
}
}
}
Field Description
- Namespace (namespace): An independent namespace is created by each processor or module to isolate the debug information of different phases.
- Common fields: record scalar values such as integers, floating-point numbers, strings, and Boolean values directly.
- Tensor fields: store reference information for PyTorch tensors.
  - _type: indicates that the field is a tensor reference with the fixed value "tensor".
  - _file: indicates the name of the file where the tensor data is stored (debug_info.safetensors).
  - _key: indicates the key name of the tensor in the SafeTensors file.
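The tensor reference fields make it possible to map each entry in debug_info.json to its key in the SafeTensors file. The sketch below walks the nested dictionaries and collects those references. The helper name collect_tensor_refs is hypothetical, and the sample data is a trimmed copy of the file structure example.

```python
def collect_tensor_refs(node, path=()):
    """Yield (dotted_path, safetensors_key) for every tensor reference in the tree."""
    if isinstance(node, dict):
        if node.get("_type") == "tensor":
            yield ".".join(path), node["_key"]
            return
        for name, child in node.items():
            yield from collect_tensor_refs(child, path + (name,))


debug_meta = {
    "linear_quant_namespace": {
        "layer_name": "model.layers.0.self_attn.qkv_proj",
        "statistics": {"weight_min": -0.5, "weight_max": 0.5},
        "scale_tensor": {
            "_type": "tensor",
            "_file": "debug_info.safetensors",
            "_key": "tensor_0",
        },
    },
    "iter_smooth_namespace": {
        "smoothing_factors": {
            "_type": "tensor",
            "_file": "debug_info.safetensors",
            "_key": "tensor_1",
        }
    },
}

refs = dict(collect_tensor_refs(debug_meta))
for field_path, key in refs.items():
    print(f"{field_path} -> {key}")
```

Scalar fields such as layer_name are skipped because they are not dictionaries carrying a _type marker. This sketch only descends into nested dictionaries, not lists.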
debug_info.safetensors
Stores all tensor data generated during the quantization process in SafeTensors format, including:
- Quantization parameters such as scale and zero_point
- Statistics such as minimum values, maximum values, and histograms
- Intermediate result tensors
- Smoothing factors for the outlier suppression algorithm
- Other tensors used for debugging
Features
- Efficient storage: The SafeTensors format supports fast loading and memory mapping.
- Cross-platform compatibility: The file can be shared across different frameworks and platforms.
- Security: Unlike the pickle format, SafeTensors files cannot execute arbitrary code when loaded, which avoids code injection risks.
Debug Information Usage
Debug information applies to the following scenarios:
- Quantization accuracy tuning: Identify layers with significant quantization error and verify whether the outlier suppression algorithm takes effect.
- Algorithm research and development: Compare different quantization algorithms and develop new quantization strategies.
- Issue locating and reporting: Quickly locate issues and provide detailed diagnostic information to technical support.
- Model analysis and optimization: Understand activation distribution characteristics across model layers and identify quantization-sensitive layers.
Debug Information Loading Example
import json
from safetensors import safe_open

# Load JSON metadata
with open("debug_info/debug_info.json", "r") as f:
    debug_meta = json.load(f)

# Load SafeTensors tensor data
with safe_open("debug_info/debug_info.safetensors", framework="pt") as f:
    # Obtain the key names of all tensors
    tensor_keys = f.keys()
    # Load each tensor and print its shape and data type
    for key in tensor_keys:
        tensor = f.get_tensor(key)
        print(f"{key}: shape={tensor.shape}, dtype={tensor.dtype}")
Precautions
- Debug information may occupy a large amount of disk space, typically 10% to 50% of the model size.
- Enabling debug mode slightly increases quantization time, typically by 5% to 10%.
- Debug information may contain sensitive model information. Keep it secure.
For details, see the Debug Mode User Guide.