How to force layernorm to run at FP32 precision using python #4225

Closed
OswaldoBornemann opened this issue Oct 28, 2024 · 6 comments

@OswaldoBornemann

How can I force layernorm to run at FP32 precision?

@lix19937

lix19937 commented Oct 28, 2024

@OswaldoBornemann see #3897 (comment)

Exporting the model to the latest available ONNX opset (later than opset 17), so that the INormalizationLayer is used, or forcing layernorm layers to run in FP32 precision, can help preserve accuracy.
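
For the first option, a minimal export sketch (pytorch_model and example_inputs are placeholders, not from this thread) could look like:

import torch

# Exporting at opset 18 (later than 17) lets the TensorRT ONNX parser map
# LayerNormalization onto INormalizationLayer, as described above.
torch.onnx.export(
    pytorch_model,      # placeholder: the trained model in eval mode
    example_inputs,     # placeholder: example inputs for tracing
    "model.onnx",
    opset_version=18,
)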

@OswaldoBornemann

  1. I exported the ONNX model using opset_version=18, but the warning still appeared. @lix19937
  2. How can I force layernorm layers to run in FP32 precision without using the trtexec command? My conversion code is as follows:
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Explicit-batch network (the flag value is 1)
EXPLICIT_BATCH_FLAG = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH_FLAG)
parser = trt.OnnxParser(network, TRT_LOGGER)
causal_ar_success = parser.parse_from_file(onnx_path)  # returns True on success

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
config.set_flag(trt.BuilderFlag.FP16)

profile = builder.create_optimization_profile()

shapes = {
    .....
}

for order, (name, shape_list) in enumerate(shapes.items()):
    print(order, name, shape_list)
    min_shape, opt_shape, max_shape = shape_list
    profile.set_shape(name, min_shape, opt_shape, max_shape)

config.add_optimization_profile(profile)
engine = builder.build_serialized_network(network, config)
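
One way to see what names and types the parser produced for the layers in this network (a small sketch over the network object above; the names are what per-layer precision settings refer to):

# List every layer in the parsed network; layernorm layers can be identified
# here by name or type before building the engine.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    print(i, layer.name, layer.type)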

@lix19937

  1. I exported the ONNX model using opset_version=18, but the warning still appeared. @lix19937

Do you use trtexec --fp16 --onnx=spec --verbose?

@OswaldoBornemann

No, I use the following code to convert the PyTorch model to ONNX:

import torch

# Export the model to ONNX
torch.onnx.export(
    pytorch_model,               # the model to export
    input_datas,                 # inputs passed to the forward function
    onnx_output_path,            # output ONNX file path
    export_params=True,          # store the trained parameter weights
    opset_version=17,            # ONNX opset version to export to
    do_constant_folding=True,    # run constant folding for optimization
    input_names=in_names,        # input names
    output_names=out_names,      # output names
    dynamic_axes=dynamic_axes,
)
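
To double-check which opset actually ended up in the exported file (a small sketch using the onnx package, reusing onnx_output_path from above):

import onnx

onnx_model = onnx.load(onnx_output_path)
# opset_import lists the opset version per domain; the default domain is the
# one that determines whether the LayerNormalization op (opset >= 17) is used.
for opset in onnx_model.opset_import:
    print(opset.domain or "ai.onnx", opset.version)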

@lix19937

Then use trtexec --fp16 to run.

If the layernorm layers should not run in FP16, use the following:

trtexec --layerPrecisions="layer_name":"fp32" --onnx=spec --fp16 --verbose
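
For the Python conversion code earlier in the thread, a rough equivalent of --layerPrecisions (a sketch, assuming TensorRT 8.6+ where ONNX LayerNormalization is parsed into a NORMALIZATION layer; the "LayerNorm" name-substring fallback is an assumption about how the parser names these layers) would be to set per-layer precisions before building:

# Pin layernorm layers to FP32; run this after parser.parse_from_file(...)
# and before builder.build_serialized_network(...).
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.type == trt.LayerType.NORMALIZATION or "LayerNorm" in layer.name:
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

The OBEY_PRECISION_CONSTRAINTS flag already set in that code makes the builder honor these per-layer settings instead of treating them as hints.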

@OswaldoBornemann

Thank you
