accuracy_based_auto_calibration
Product Support
| Product | Supported |
|---|---|
| Ascend 950PR/Ascend 950DT | √ |
| Atlas A3 training series products/Atlas A3 inference series products | √ |
| Atlas A2 training series products/Atlas A2 inference series products | √ |
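Function Description
Runs an automatic calibration process based on the model and configuration file supplied by the user, and searches for a quantization configuration that meets the target accuracy. It outputs a fake_quant model that can be used for accuracy simulation in the ONNX Runtime environment, and a deploy model that can be used for inference on the AI processor.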
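To make the idea of "searching for a configuration that meets the target accuracy" more concrete, the sketch below shows one generic way such a search could be organized, loosely suggested by the default values `strategy='BinarySearch'` and `sensitivity='CosineSimilarity'` in the prototype. It is an assumption for illustration only and is not AMCT's actual implementation: `cosine_similarity`, `rank_by_sensitivity`, `search_rollback_count`, the `meets_target` callback, and the toy layer data are all hypothetical, and the search assumes that rolling back more of the most sensitive layers never makes accuracy worse.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_sensitivity(original_outputs, quantized_outputs):
    """Order layer names from most to least sensitive.

    Assumption: a lower similarity between the original and the quantized
    layer output means the layer is more sensitive to quantization.
    """
    scores = {
        name: cosine_similarity(original_outputs[name], quantized_outputs[name])
        for name in original_outputs
    }
    return sorted(scores, key=scores.get)


def search_rollback_count(num_layers, meets_target):
    """Binary-search the smallest k for which meets_target(k) is True.

    meets_target(k) is a hypothetical callback that reports whether accuracy
    is acceptable when the k most sensitive layers are left unquantized.
    Returns None if even rolling back every layer does not meet the target.
    """
    if not meets_target(num_layers):
        return None
    low, high = 0, num_layers
    while low < high:
        mid = (low + high) // 2
        if meets_target(mid):
            high = mid
        else:
            low = mid + 1
    return low


# Toy data: per-layer outputs before and after quantization (made up).
original = {"conv1": [1.0, 2.0], "conv2": [1.0, 0.0], "fc": [3.0, 4.0]}
quantized = {"conv1": [1.0, 2.1], "conv2": [0.0, 1.0], "fc": [3.0, 4.0]}

ranking = rank_by_sensitivity(original, quantized)
# Toy rule: accuracy is acceptable once at least one layer is rolled back.
rollback_count = search_rollback_count(len(ranking), lambda k: k >= 1)
print(ranking[:rollback_count])
```

In this toy run, the layer whose quantized output diverges most from the original is ranked first, and the search settles on rolling back only that layer. A real search would replace the lambda with an evaluation such as the `metric_eval()` check shown in the call example.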
Function Prototype
accuracy_based_auto_calibration(model,model_evaluator,config_file,record_file,save_dir,input_data,input_names,output_names,dynamic_axes,strategy='BinarySearch',sensitivity='CosineSimilarity')
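Parameter Description
dynamic_axes
Meaning: specifies the dynamic axes of the model's inputs and outputs. For example, if N, H, and W of the input inputs (NCHW) have undetermined sizes and N of the output outputs (NL) has an undetermined size, pass {"inputs": [0,2,3], "outputs": [0]}.
Data type: dict<string, dict<python:int, string>> or dict<string, list(int)>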
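The two accepted shapes of the dynamic-axes argument can be compared side by side. The sketch below is illustrative only: `normalize_dynamic_axes` is a hypothetical helper that is not part of amct_pytorch, and the naming scheme it uses for unnamed axes is an assumption made for this example.

```python
def normalize_dynamic_axes(dynamic_axes):
    """Convert list-style dynamic axes to the dict style with explicit names.

    Hypothetical helper for illustration; not part of amct_pytorch.
    """
    normalized = {}
    for tensor_name, axes in dynamic_axes.items():
        if isinstance(axes, dict):
            # Already in {axis_index: axis_name} form; copy it unchanged.
            normalized[tensor_name] = dict(axes)
        else:
            # Assumed naming scheme: "<tensor>_dynamic_axis_<index>".
            normalized[tensor_name] = {
                axis: f"{tensor_name}_dynamic_axis_{axis}" for axis in axes
            }
    return normalized


# List form: N, H, W of "inputs" (NCHW) and N of "outputs" (NL) are dynamic.
list_form = {"inputs": [0, 2, 3], "outputs": [0]}
# Dict form: each dynamic axis index is mapped to a readable name.
dict_form = {"input": {0: "batch_size"}, "output": {0: "batch_size"}}

print(normalize_dynamic_axes(list_form))
print(normalize_dynamic_axes(dict_form))
```

The list form is shorter when axis names do not matter; the dict form keeps readable axis names such as `batch_size`.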
Return Value
None
Example
import os

import torch

import amct_pytorch as amct
from amct_pytorch.common.auto_calibration import AutoCalibrationEvaluatorBase
# You need to implement calibration(), evaluate() and metric_eval() of AutoCalibrationEvaluator.
# model_forward is a user-provided inference helper that is not shown here.
class AutoCalibrationEvaluator(AutoCalibrationEvaluatorBase):
    """Subclass of AutoCalibrationEvaluatorBase."""

    def __init__(self, target_loss, batch_num):
        super(AutoCalibrationEvaluator, self).__init__()
        self.target_loss = target_loss
        self.batch_num = batch_num

    def calibration(self, model):
        """Implement the calibration function of AutoCalibrationEvaluatorBase.

        calibration() must complete the calibration inference procedure,
        so the number of inference batches must be >= the batch_num passed
        to create_quant_config.
        """
        model_forward(model=model, batch_size=32, iterations=self.batch_num)

    def evaluate(self, model):
        """Implement the evaluate function of AutoCalibrationEvaluatorBase.

        params: model: a torch.nn.Module
        return: the accuracy of the input model on the eval dataset, or another
            metric that describes the 'accuracy' of the model
        """
        top1, _ = model_forward(model=model, batch_size=32, iterations=5)
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return top1

    def metric_eval(self, original_metric, new_metric):
        """Implement the metric_eval function of AutoCalibrationEvaluatorBase.

        params: original_metric: the accuracy returned by evaluate() on the non-quantized model
                new_metric: the accuracy returned by evaluate() on the fake quant model
        return:
            [0]: whether the accuracy loss between the non-quantized model and
                 the fake quant model satisfies the requirement
            [1]: the accuracy loss between the non-quantized model and the fake quant model
        """
        loss = original_metric - new_metric
        # target_loss is compared against the loss expressed in percentage points.
        if loss * 100 < self.target_loss:
            return True, loss
        return False, loss
...
# Step 1: create the quantization config JSON file
config_json_file = os.path.join(TMP, 'config.json')
skip_layers = []
batch_num = 2
amct.create_quant_config(
    config_json_file,
    model,
    input_data,
    skip_layers,
    batch_num
)
# Step 2: construct the AutoCalibrationEvaluator instance
evaluator = AutoCalibrationEvaluator(target_loss=0.5, batch_num=batch_num)
# Step 3: quantize the model with accuracy_based_auto_calibration
record_file = os.path.join(TMP, 'scale_offset_record.txt')
result_path = os.path.join(PATH, 'result/mobilenet_v2')
amct.accuracy_based_auto_calibration(
    model=model,
    model_evaluator=evaluator,
    config_file=config_json_file,
    record_file=record_file,
    save_dir=result_path,
    input_data=input_data,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        'input': {0: 'batch_size'},
        'output': {0: 'batch_size'}
    },
    strategy='BinarySearch',
    sensitivity='CosineSimilarity'
)
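Output files:
- Accuracy simulation model file: an ONNX model file whose name contains fake_quant. It can be used for accuracy simulation in the ONNX Runtime environment.
- Deployment model file: an ONNX model file whose name contains deploy. After conversion with the ATC tool, it can be deployed on the AI processor.
- Quantization factor record file: the quantization factors are written to the file specified by record_file in the API call.
- Sensitivity information file: records how sensitive each layer to be quantized is to quantization. The layers to roll back are selected based on this information.
- Automatic quantization rollback history file: records information about the layers that were rolled back.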
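Because the two model files are distinguished by the markers fake_quant and deploy in their names, they can be located after a run with a short script. The following is a minimal sketch under stated assumptions: `find_output_models` is a hypothetical helper, not an AMCT API, and `output_dir` is assumed to be the directory in which the generated files were written.

```python
import os


def find_output_models(output_dir):
    """Group ONNX files under output_dir by the name markers listed above.

    Hypothetical helper for illustration; not part of amct_pytorch.
    """
    found = {"fake_quant": [], "deploy": []}
    for root, _dirs, files in os.walk(output_dir):
        for file_name in sorted(files):
            if not file_name.endswith(".onnx"):
                continue  # record and history files are skipped
            for marker in found:
                if marker in file_name:
                    found[marker].append(os.path.join(root, file_name))
    return found
```

For example, `find_output_models("result")["fake_quant"]` would return the paths of the accuracy simulation models found under a `result` directory, which could then be loaded in ONNX Runtime.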