How to force layernorm to run at FP32 precision using python #4225

Closed
OswaldoBornemann opened this issue Oct 28, 2024 · 6 comments

@OswaldoBornemann

How can I force layernorm to run at FP32 precision?

@lix19937

lix19937 commented Oct 28, 2024

@OswaldoBornemann see #3897 (comment)

Exporting the model to the latest available ONNX opset (later than opset 17), so that the INormalizationLayer is used, or forcing layernorm layers to run in FP32 precision, can help preserve accuracy.
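
For the first option, a minimal export sketch (pytorch_model and example_inputs are placeholders, not from this thread) could look like:

import torch

# Exporting at opset 18 (later than 17) lets the TensorRT ONNX parser map
# LayerNormalization onto INormalizationLayer, as described above.
torch.onnx.export(
    pytorch_model,      # placeholder: the trained model in eval mode
    example_inputs,     # placeholder: example inputs for tracing
    "model.onnx",
    opset_version=18,
)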

@OswaldoBornemann

  1. I exported the ONNX model using opset_version=18, but the warning still appeared. @lix19937
  2. How can I force layernorm layers to run in FP32 precision without using the trtexec command? My conversion code is as follows:
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Explicit-batch network (the flag value is 1)
EXPLICIT_BATCH_FLAG = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH_FLAG)
parser = trt.OnnxParser(network, TRT_LOGGER)
causal_ar_success = parser.parse_from_file(onnx_path)  # returns True on success

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
config.set_flag(trt.BuilderFlag.FP16)

profile = builder.create_optimization_profile()

shapes = {
    .....
}

for order, (name, shape_list) in enumerate(shapes.items()):
    print(order, name, shape_list)
    min_shape, opt_shape, max_shape = shape_list
    profile.set_shape(name, min_shape, opt_shape, max_shape)

config.add_optimization_profile(profile)
engine = builder.build_serialized_network(network, config)
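
One way to see what names and types the parser produced for the layers in this network (a small sketch over the network object above; the names are what per-layer precision settings refer to):

# List every layer in the parsed network; layernorm layers can be identified
# here by name or type before building the engine.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    print(i, layer.name, layer.type)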

@lix19937

  1. I exported the ONNX model using opset_version=18, but the warning still appeared. @lix19937

Do you use trtexec --fp16 --onnx=spec --verbose?

@OswaldoBornemann

No, I use the following code to convert the PyTorch model to ONNX:

import torch

# Export the model to ONNX
torch.onnx.export(
    pytorch_model,               # the model to export
    input_datas,                 # inputs passed to the forward function
    onnx_output_path,            # output ONNX file path
    export_params=True,          # store the trained parameter weights
    opset_version=17,            # ONNX opset version to export to
    do_constant_folding=True,    # run constant folding for optimization
    input_names=in_names,        # input names
    output_names=out_names,      # output names
    dynamic_axes=dynamic_axes,
)
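
To double-check which opset actually ended up in the exported file (a small sketch using the onnx package, reusing onnx_output_path from above):

import onnx

onnx_model = onnx.load(onnx_output_path)
# opset_import lists the opset version per domain; the default domain is the
# one that determines whether the LayerNormalization op (opset >= 17) is used.
for opset in onnx_model.opset_import:
    print(opset.domain or "ai.onnx", opset.version)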

@lix19937

Then use trtexec --fp16 to run.

If the layernorm layers should not run in FP16, use the following:

trtexec --layerPrecisions="layer_name":"fp32" --onnx=spec --fp16 --verbose
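
For the Python conversion code earlier in the thread, a rough equivalent of --layerPrecisions (a sketch, assuming TensorRT 8.6+ where ONNX LayerNormalization is parsed into a NORMALIZATION layer; the "LayerNorm" name-substring fallback is an assumption about how the parser names these layers) would be to set per-layer precisions before building:

# Pin layernorm layers to FP32; run this after parser.parse_from_file(...)
# and before builder.build_serialized_network(...).
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.type == trt.LayerType.NORMALIZATION or "LayerNorm" in layer.name:
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

The OBEY_PRECISION_CONSTRAINTS flag already set in that code makes the builder honor these per-layer settings instead of treating them as hints.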

@OswaldoBornemann

Thank you
