
Support QDQ transformations with com.microsoft.Quantize/Dequantize ops #17127

Merged: 57 commits on Aug 25, 2023

Conversation

@adrianlizarraga (Contributor) commented Aug 12, 2023

Description

  • Enables int32 support for com.microsoft.DequantizeLinear (contrib op)
  • Makes the zero_point input optional for Quantize/Dequantize contrib ops
  • Enables QDQ transformations with the Quantize/Dequantize contrib ops
  • Updates tests: EnsureUniqueDQForNodeUnitTests, QDQTransformerTests, TransposeOptimizerTests
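
As a rough illustration of the zero_point change, here is a plain-NumPy sketch of the Quantize/Dequantize semantics (illustrative only, not the contrib op implementation): when `zero_point` is omitted, the ops behave as if it were 0, and DequantizeLinear additionally accepts int32 input.

```python
import numpy as np

def quantize_linear(x, scale, zero_point=None, qmin=-128, qmax=127):
    # When zero_point is omitted it defaults to 0 -- the optional-input
    # behavior this PR enables for the com.microsoft contrib ops.
    zp = 0 if zero_point is None else zero_point
    return np.clip(np.round(x / scale) + zp, qmin, qmax).astype(np.int32)

def dequantize_linear(q, scale, zero_point=None):
    zp = 0 if zero_point is None else zero_point
    # int32 values are accepted here, mirroring the contrib op's new
    # int32 support for DequantizeLinear.
    return (q.astype(np.int64) - zp) * scale

x = np.array([0.5, -0.25, 1.0])
q = quantize_linear(x, scale=0.01)
# Omitting zero_point is equivalent to passing zero_point=0.
assert np.array_equal(q, quantize_linear(x, 0.01, zero_point=0))
assert np.allclose(dequantize_linear(q, 0.01), x)
```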

Testing

List of tested graph transformations:

  • QDQSelectorActionTransformer
    • qdq_transformer_test.cc
  • QDQS8ToU8Transformer
    • qdq_transformer_test.cc
  • DoubleQDQPairsRemover
    • qdq_transformer_test.cc
  • IdenticalChildrenConsolidation
    • qdq_transformer_test.cc
  • QDQPropagation
    • qdq_transformer_test.cc
  • QDQFinalCleanup
    • qdq_transformer_test.cc
  • ClipQuantFusion
    • qdq_transformer_test.cc
  • ReluQuantFusion
    • qdq_transformer_test.cc
  • EnsureUniqueDQForNodeUnit
    • ensure_unique_dq_for_node_unit_test.cc
  • TransposeOptimizer
    • transpose_optimizer_test.cc
  • CommonSubexpressionElimination
    • graph_transform_test.cc
  • ConstantFolding
    • graph_transform_test.cc
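
For intuition on one of the listed transformations, DoubleQDQPairsRemover: re-quantizing already-quantized values with the same scale and zero-point is a no-op, so the inner DQ→Q pair of a Q→DQ→Q→DQ chain can be dropped. A minimal NumPy sketch of that property (illustrative only, not the optimizer's code):

```python
import numpy as np

QMIN, QMAX = -128, 127  # signed 8-bit range

def q(x, scale, zp):
    # Affine quantization: round, shift by zero-point, clip to range.
    return np.clip(np.round(x / scale) + zp, QMIN, QMAX)

def dq(v, scale, zp):
    return (v - zp) * scale

x = np.linspace(-2.0, 2.0, 17)
s, z = 0.05, 3
once = dq(q(x, s, z), s, z)                      # Q -> DQ
twice = dq(q(dq(q(x, s, z), s, z), s, z), s, z)  # Q -> DQ -> Q -> DQ
# The second Q/DQ pair with identical parameters changes nothing.
assert np.array_equal(once, twice)
```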

Motivation and Context

We need to support mixed 16-bit/8-bit precision QDQ models (microsoft#17015). This PR is the first step toward that goal: the QDQ contrib ops must work with our optimizations/transformations.
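
For context on why 16-bit activations matter, a back-of-the-envelope sketch (not from the PR): with the same affine quantization formula, int16 has a far finer step than int8 over the same range, so round-trip error drops by orders of magnitude.

```python
import numpy as np

def quantize(x, scale, zp, qmin, qmax):
    return np.clip(np.round(x / scale) + zp, qmin, qmax)

def dequantize(q, scale, zp):
    return (q - zp) * scale

x = np.array([0.1234, -0.5678, 0.9, -1.0])
s8, s16 = 1.0 / 127, 1.0 / 32767  # scales covering roughly [-1, 1]
x8 = dequantize(quantize(x, s8, 0, -128, 127), s8, 0)
x16 = dequantize(quantize(x, s16, 0, -32768, 32767), s16, 0)
err8 = np.abs(x - x8).max()    # bounded by s8 / 2, about 3.9e-3
err16 = np.abs(x - x16).max()  # bounded by s16 / 2, about 1.5e-5
assert err16 < err8
```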

@adrianlizarraga adrianlizarraga marked this pull request as ready for review August 14, 2023 17:21
@adrianlizarraga adrianlizarraga requested a review from a team as a code owner August 14, 2023 17:21
skottmckay previously approved these changes Aug 24, 2023
edgchen1 previously approved these changes Aug 24, 2023
@yufenglee (Member) commented:
QuantizeLinear, 1,

I think you are adding int16 support. I don't see the change.


Refers to: onnxruntime/core/graph/contrib_ops/quantization_defs.cc:144 in commit 375d3a2.

@adrianlizarraga (Contributor, Author) commented Aug 25, 2023:

I think you are adding int16 support. I don't see the change.

@yufenglee The work is being broken down into separate/smaller PRs. This specific PR focuses on making sure contrib QDQ ops can be optimized in the same manner as ONNX ops (please refer to the PR description for details).

The next PR (linked in the description) adds int16 support, but I'd like to get this one merged in before starting reviews on it.

@yufenglee (Member) left a comment:

:shipit:

@adrianlizarraga adrianlizarraga merged commit 5a83a67 into main Aug 25, 2023
99 checks passed
@adrianlizarraga adrianlizarraga deleted the adrianl/contrib-qdq-optimizations branch August 25, 2023 16:57
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024

Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
6 participants