
Torch was not compiled with flash attention warning #1375

Open

lostmsu opened this issue Sep 14, 2024 · 1 comment

lostmsu (Contributor) commented Sep 14, 2024

This is printed when I call functional.scaled_dot_product_attention:

[W914 13:25:36.000000000 sdp_utils.cpp:555] Warning: 1Torch was not compiled with flash attention. (function operator ())

I'm on Windows with TorchSharp-cuda-windows 0.103.0.
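
For reference, a minimal sketch of the kind of call that triggers the warning for me; the `F` alias, the tensor shapes, and the CUDA device below are illustrative assumptions, not my exact code:

    using TorchSharp;
    using static TorchSharp.torch;
    using F = TorchSharp.torch.nn.functional;

    // Illustrative shapes: (batch, heads, sequence, head_dim).
    var q = randn(new long[] { 1, 8, 128, 64 }, device: CUDA);
    var k = randn(new long[] { 1, 8, 128, 64 }, device: CUDA);
    var v = randn(new long[] { 1, 8, 128, 64 }, device: CUDA);

    // On a build without flash attention this still computes the
    // correct result via a fallback kernel, but prints the warning.
    var y = F.scaled_dot_product_attention(q, k, v);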

travisjj commented

Can you show the actual line of code you used?
Are you getting the warning at runtime or at compile/interpret time?

I don't see this warning when using a CausalSelfAttention layer inside a transformer architecture.

This is the line of code I used:

        // "Flash" attention
        var y = F.scaled_dot_product_attention(q, k, v, is_casual: true);

where q, k, and v are the query, key, and value tensors produced by a causal attention linear layer.
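
For context, here is a sketch of roughly how q, k, and v come out of that layer; the sizes and the `c_attn` projection are illustrative assumptions, not my exact implementation:

    using static TorchSharp.torch;
    using F = TorchSharp.torch.nn.functional;

    // Illustrative sizes.
    long B = 1, T = 128, n_embd = 512, n_head = 8;
    var c_attn = nn.Linear(n_embd, 3 * n_embd);

    var x = randn(new long[] { B, T, n_embd });

    // One projection yields query, key, and value; split the result
    // and reshape each piece to (B, n_head, T, head_dim).
    var qkv = c_attn.forward(x).chunk(3, dim: -1);
    var q = qkv[0].view(B, T, n_head, n_embd / n_head).transpose(1, 2);
    var k = qkv[1].view(B, T, n_head, n_embd / n_head).transpose(1, 2);
    var v = qkv[2].view(B, T, n_head, n_embd / n_head).transpose(1, 2);

    var y = F.scaled_dot_product_attention(q, k, v, is_casual: true);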
