Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HSA HAL for AIE #775

Closed
wants to merge 11 commits into from
Closed

HSA HAL for AIE #775

wants to merge 11 commits into from

Conversation

makslevental
Copy link
Collaborator

@makslevental makslevental commented Sep 13, 2024

This is the long-awaited HSA HAL for us (AIE). It should be considered beta - it all works (all the current tests pass, even on Strix) but there are probably still many "sharp edges". In addition, I haven't timed anything so I don't know if it's actually slower or faster than the XRT HAL.

There's a lot here if you haven't worked with IREE's HAL before or HSA before. So ask all the questions - investigating/answering them will help me clarify my own understanding as well.

Only thing to explicitly call out is that currently it's pinned to a branch of ROCr which will be merged as soon as I fix up the tests over there.

Note: only Linux (a current limitation of HSA itself).

Shout out to @eddierichter-amd for fighting in the trenches with me on this.

cc @stellaraccident @powderluv @kumardeepakamd @MaheshRavishankar

@makslevental makslevental force-pushed the makslevental/my-own-hsa-hal-v2 branch 30 times, most recently from d6fb421 to b74dd30 Compare September 16, 2024 02:26
Copy link
Contributor

@josemonsalve2 josemonsalve2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for putting this together, Maksim. This is awesome!

runtime/src/iree-amd-aie/driver/hsa/CMakeLists.txt Outdated Show resolved Hide resolved

// TODO(max): this doesn't go here, should go in pending_actions maybe?
// like shouldn't be done right at dispatch time
// Configure the queue's hardware context.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need the context before submitting anything into the queue? I'm not familiar with pending actions.

One thing I noticed is that, when using the python bindings, if you configured the dispatch early on, and then ran the compilation, there was a bug that would kill the context in the device upon execution of the subprocess (From python). This is because somehow upon the subcommand finishing, it would call an operation that would force a flush in the file descriptor of the driver, and that would kill the context.

So early creation of the context had some implications in that case. But this is really a corner case.

iree_string_view_t identifier{};
iree_hal_hsa_dynamic_symbols_t hsa_symbols{};
int num_aie_agents{};
hsa::hsa_agent_t agents[IREE_HAL_HSA_MAX_DEVICES]{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you currently do not care to much about the GPU side, but any thoughts if what I originally had made sense?

@makslevental makslevental force-pushed the makslevental/my-own-hsa-hal-v2 branch 3 times, most recently from c44d59b to f768298 Compare September 24, 2024 20:05
void PrepareMatmulExecutable() {
IREE_ASSERT_OK(iree_hal_executable_cache_create(
device_, iree_make_cstring_view("default"),
iree_loop_inline(&loop_status_), &executable_cache_));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this loop inline?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makslevental makslevental force-pushed the makslevental/my-own-hsa-hal-v2 branch 5 times, most recently from fec213a to c730870 Compare September 25, 2024 16:39
@makslevental makslevental deleted the makslevental/my-own-hsa-hal-v2 branch October 18, 2024 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants