
[tuner] Reduce maintenance burden and prepare for more codegen pipelines #453

Open
8 of 13 tasks
kuhar opened this issue Nov 8, 2024 · 4 comments

kuhar commented Nov 8, 2024

This is an uber-issue for making the tuner easier to maintain. The current implementation has a few issues that make the tuner library fragile and prone to getting out of sync with the IREE compiler. Specifically:

  1. There are two ways to (re-)configure executable sources:
    a. By updating the lowering config and translation info in situ. This is used when producing candidate dispatches, with executable benchmarks as the source of truth.
    b. By using the transform dialect library script to match root ops and apply compilation info attributes to them. This is used during the model candidate compilation and benchmarking stage.

    As a result, we have duplicate logic for applying configurations found by the constraint solver. The fix is to write a pass that strips existing configuration from executable sources and then use the transform dialect to re-configure them. This can be done as a separate invocation of iree-opt.

  2. The MLIR processing is mostly string-based. While this allowed us to prototype quickly, it makes the code prone to getting out of sync with the IREE compiler. The lowering config and translation info attributes are considered compiler internals, and there is no stability guarantee as to their exact structure and format. As a result, every time the format changes, we have to update the parsing and printing logic in the tuner to match the new format in the compiler.

    Here, the proposed solution is to expose these key attributes (translation info, compilation info, and MFMA intrinsic info) through Python bindings. We already have bindings for the GPU pipeline options, which can serve as a template for future bindings (see the sketch after this list): Reland #18804 iree-org/iree#18840.

  3. Make it easier to identify 'root ops'. We can make the IREE compiler annotate the root linalg ops with a new attribute that the tuner can use to recognize them, without having to duplicate the compiler logic.

  4. The Configuration representation is modeled after the requirements of the LLVMGPUVectorDistribute pipeline. As a result, the surrounding code makes implicit assumptions about the problem representation. Instead, we should define an interface that allows us to support multiple compilation pipelines, such that the generated SMT constraints are specific to both the pipeline and the dispatch kind. Further, the constraint generation code should be decoupled from the parsing/printing code, so that projects like TKW can use just the constraint generation and benchmarking infra.

  5. Move from two stages of compile-and-benchmark to just one. Two stages made sense for SDXL, where the best isolated dispatch does not necessarily perform best across the whole model, but they may not be necessary, or even sufficiently general, for other applications. This is related to the libtuner.TuningClient class; clients should be able to define their own tuning stages, with libtuner providing the interface to specify the compilation and benchmarking commands.
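
A rough sketch of the direction in (2): the existing GPU pipeline options binding already lets the tuner build this attribute programmatically instead of assembling strings. A minimal example, assuming the iree.compiler package ships the iree_gpu dialect bindings; the keyword name below is illustrative rather than a confirmed signature:

```python
# Minimal sketch for (2), assuming iree.compiler ships the iree_gpu dialect
# bindings (iree-org/iree#18840). The keyword argument name is illustrative;
# check the binding for the exact factory signature.
from iree.compiler import ir
from iree.compiler.dialects import iree_gpu

with ir.Context():
    # Build the GPU pipeline options as a real attribute instead of a string.
    opts = iree_gpu.PipelineOptionsAttr.get(prefetch_shared_memory=True)
    print(opts)  # e.g. #iree_gpu.pipeline_options<prefetch_shared_memory = true>
```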


kuhar commented Nov 8, 2024

cc: @MaheshRavishankar @nithinsubbiah @bangtianliu @Max191

kuhar changed the title from "[tuner] Reduce maintenance burden and prepare for more codegen Pipelines" to "[tuner] Reduce maintenance burden and prepare for more codegen pipelines" on Nov 8, 2024
MaheshRavishankar commented:

Thanks @kuhar. Could you also add a pointer to where the tuner lives today?

The tasks you described are related to making the tuner easy to maintain. One more thing we need to think about is how to make it easy to maintain and update the tuner script. Specifically, the questions to answer are:

  1. We currently use the result of the tuner (the TD script) during compilation. Do we maintain a default script somewhere in the IREE codebase that gets loaded?
  2. How does a developer update the tuner script if they need to (i.e., if they change the lowering config/translation info, could there be a tool that "ports" the configuration over)?
  3. Could we have multiple TD scripts passed to the compiler? My answer would be no, but it might be useful to have a tool to "merge" TD scripts. This could also be the mechanism for updating the tuner script: you create a new tuner script and "update" the existing one.

For now just noting some top-of-mind questions.


kuhar commented Nov 11, 2024

@MaheshRavishankar The tuner lives under the tuner directory in this repo. The questions you listed are very relevant, but they concern the iree project rather than the tuner itself -- I decided to decouple these two sets of concerns. I will answer your questions under an iree issue and add a link here.

bangtianliu added a commit to iree-org/iree that referenced this issue Nov 11, 2024
…ources (#19069)

This PR aims to address the first task in
nod-ai/shark-ai#453: adding an iree-opt
pass that removes configuration from executable sources. The
corresponding test is also included to ensure its correct functionality.

---------

Signed-off-by: Bangtian Liu <[email protected]>
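
A sketch of how the tuner side could drive this step; the helper below is hypothetical, and the actual pass flag comes from the pass added in this PR (intentionally not hard-coded here):

```python
# Hypothetical helper: run the config-stripping step as a standalone iree-opt
# invocation before re-applying configurations via the transform dialect, as
# proposed in task (1). The exact pass flag is supplied by the caller.
import subprocess

def strip_configuration(iree_opt: str, input_mlir: str, output_mlir: str,
                        strip_pass_flag: str) -> None:
    """Strip existing configuration from an executable source with iree-opt."""
    subprocess.run(
        [iree_opt, input_mlir, strip_pass_flag, "-o", output_mlir],
        check=True,
    )
```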
bangtianliu added a commit to iree-org/iree that referenced this issue Nov 14, 2024
…9124)

This PR aims to address the task listed in
nod-ai/shark-ai#453: add a utility
function (`QueryMMAIntrinsics`) to query supported MMA intrinsics.

A new test pass `TestLLVMGPUQueryMMAPass` has been added to validate the
correctness of this utility function, along with a corresponding test to
ensure reliable functionality.

TODO: The function will be exposed to both the C API and Python in a
follow-up PR.

---------

Signed-off-by: Bangtian Liu <[email protected]>
kuhar added the tuner label Nov 15, 2024
kuhar pushed a commit to kuhar/iree that referenced this issue Nov 19, 2024
…ee-org#19124)

This PR aims to address the task listed in
nod-ai/shark-ai#453: add a utility
function (`QueryMMAIntrinsics`) to query supported MMA intrinsics.

A new test pass `TestLLVMGPUQueryMMAPass` has been added to validate the
correctness of this utility function, along with a corresponding test to
ensure reliable functionality.

TODO: The function will be exposed to both the C API and Python in a
follow-up PR.

---------

Signed-off-by: Bangtian Liu <[email protected]>
Signed-off-by: Jakub Kuderski <[email protected]>

kuhar commented Nov 19, 2024

The IREE issue that explains the logistics of how we maintain tuning specs and apply them in the codegen flow is here: iree-org/iree#19214

kuhar pushed a commit that referenced this issue Nov 22, 2024
This PR is relevant to the task in #453: "Use IREE attributes for MFMA
intrinsics in the tuner".

---------

Signed-off-by: Bangtian Liu <[email protected]>
kuhar pushed a commit that referenced this issue Nov 26, 2024
Remove the data class `MfmaIntrinsic` from the codebase, and use IREE
attributes (`iree_gpu.MMAIntrinsic` and `iree_gpu.MMAAttr`) for MFMA
intrinsics in the tuner.

**Motivation for this PR**: The original MLIR processing relies heavily
on string-based operations, making it fragile and prone to breaking with
updates to the IREE compiler. To address this, we leverage these key
attributes directly through the IREE Python bindings, which now expose
them. For more details, refer to [this issue](#453).
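
A minimal sketch of what this looks like on the tuner side, assuming the `iree_gpu` bindings above; the specific intrinsic case is only an example value:

```python
# Sketch, assuming the iree_gpu bindings expose MMAIntrinsic and MMAAttr as
# described above; the chosen intrinsic is just an example value.
from iree.compiler import ir
from iree.compiler.dialects import iree_gpu

with ir.Context():
    intrinsic = iree_gpu.MMAIntrinsic.MFMA_F32_16x16x16_F16
    mma_attr = iree_gpu.MMAAttr.get(intrinsic)
    # The attribute replaces the tuner's old MfmaIntrinsic data class, so the
    # tuner no longer parses or prints intrinsic names as strings.
    print(mma_attr)
```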
kuhar pushed a commit that referenced this issue Nov 28, 2024
…ptionsAttr. (#626)

This PR is relevant to the task in #453: use IREE bindings for compilation
info (incl. lowering_config and translation_info).

Retire the data class `GPUPipelineOptions` and use the Python binding
`iree_gpu.PipelineOptionsAttr` instead.

---------

Signed-off-by: Bangtian Liu <[email protected]>
Groverkss pushed a commit to Groverkss/iree that referenced this issue Dec 1, 2024
…ources (iree-org#19069)

This PR aims to address the first task in
nod-ai/shark-ai#453: adding an iree-opt
pass that removes configuration from executable sources. The
corresponding test is also included to ensure its correct functionality.

---------

Signed-off-by: Bangtian Liu <[email protected]>
Groverkss pushed a commit to Groverkss/iree that referenced this issue Dec 1, 2024
…ee-org#19124)

This PR aims to address the task listed in
nod-ai/shark-ai#453: add a utility
function (`QueryMMAIntrinsics`) to query supported MMA intrinsics.

A new test pass `TestLLVMGPUQueryMMAPass` has been added to validate the
correctness of this utility function, along with a corresponding test to
ensure reliable functionality.

TODO: The function will be exposed to both the C API and Python in a
follow-up PR.

---------

Signed-off-by: Bangtian Liu <[email protected]>
bangtianliu added a commit that referenced this issue Dec 3, 2024
This PR is relevant to the task in #453: use IREE bindings for compilation
info (incl. lowering_config and translation_info).

Remove the data class `ReorderWorkgroupsStrategy` and use the
`lowering_config` binding.

---------

Signed-off-by: Bangtian Liu <[email protected]>
giacs-epic pushed a commit to giacs-epic/iree that referenced this issue Dec 4, 2024
…ources (iree-org#19069)

This PR aims to address the first task in
nod-ai/shark-ai#453: adding an iree-opt
pass that removes configuration from executable sources. The
corresponding test is also included to ensure its correct functionality.

---------

Signed-off-by: Bangtian Liu <[email protected]>
Signed-off-by: Giacomo Serafini <[email protected]>
giacs-epic pushed a commit to giacs-epic/iree that referenced this issue Dec 4, 2024
…ee-org#19124)

This PR aims to address the task listed in
nod-ai/shark-ai#453: add a utility
function (`QueryMMAIntrinsics`) to query supported MMA intrinsics.

A new test pass `TestLLVMGPUQueryMMAPass` has been added to validate the
correctness of this utility function, along with a corresponding test to
ensure reliable functionality.

TODO: The function will be exposed to both the C API and Python in a
follow-up PR.

---------

Signed-off-by: Bangtian Liu <[email protected]>
Signed-off-by: Giacomo Serafini <[email protected]>
bangtianliu added a commit to iree-org/iree that referenced this issue Dec 7, 2024
…9376)

This PR introduces additional property functions to the LoweringConfig
Python binding. These new functions enable direct extraction of the
`workgroup`, `reduction`, `subgroup_m_count`, `subgroup_n_count`, and
`mma_kind` attributes from the lowering config Python binding.

This PR is relevant to the task in
nod-ai/shark-ai#453: use IREE bindings for
compilation info (incl., lowering_config and translation_info).

---------

Signed-off-by: Bangtian Liu <[email protected]>
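
A sketch of how the tuner can consume these, assuming getters that mirror the attribute names above; the exact Python property names are assumptions rather than the confirmed binding API:

```python
# Sketch, assuming iree_gpu.LoweringConfigAttr exposes getters for the
# workgroup/reduction tile sizes, subgroup counts, and MMA kind listed above.
# The property names used here are assumptions, not the confirmed API.
from iree.compiler.dialects import iree_gpu

def summarize_config(config: iree_gpu.LoweringConfigAttr) -> str:
    """Read tuning-relevant fields without re-parsing the printed IR."""
    return (
        f"workgroup={config.workgroup_tile_sizes} "
        f"reduction={config.reduction_tile_sizes} "
        f"subgroups={config.subgroup_count_mn} "
        f"mma={config.mma_kind}"
    )
```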
bangtianliu added a commit that referenced this issue Dec 9, 2024
…ng (#662)

After landing iree-org/iree#19376, all helper
functions related to lowering configuration can be removed. Instead, we
can directly utilize property functions from the LoweringConfig Python
bindings.

This PR is still relevant to the task in
#453: use IREE bindings for
compilation info (incl., lowering_config and translation_info).

---------

Signed-off-by: Bangtian Liu <[email protected]>
bangtianliu added a commit that referenced this issue Dec 11, 2024
This PR is relevant to the task in #453: use IREE bindings for compilation
info (incl. lowering_config and translation_info).

Use `translation_info` from the IREE Python binding.

---------

Signed-off-by: Bangtian Liu <[email protected]>
bangtianliu added a commit that referenced this issue Dec 12, 2024
This PR is relevant to the task in #453: use IREE bindings for compilation
info (incl. lowering_config and translation_info).

Retire the data class `Configuration` and use `compilation_info` from the
IREE Python binding.

Signed-off-by: Bangtian Liu <[email protected]>
monorimet pushed a commit that referenced this issue Dec 13, 2024
This PR is relevant to the task in #453: use IREE bindings for compilation
info (incl. lowering_config and translation_info).

Remove the data class `ReorderWorkgroupsStrategy` and use the
`lowering_config` binding.

---------

Signed-off-by: Bangtian Liu <[email protected]>