[CIR][CIRGen][Builtin][Neon] Lower neon_vabs_v and neon_vabsq_v #1081

ghehg · 2024-11-07T19:38:21Z

For now this is implemented the same way as OG: call the LLVM AArch64 intrinsic, which eventually becomes an ARM64 instruction.
However, there is clearly an alternative: extend CIR::AbsOp and CIR::FAbsOp to support vector types and lower them only at the LLVM lowering stage, to either LLVM::FAbsOp or LLVM::AbsOp, provided the LLVM dialect does the right thing in TargetLowering and eventually translates them to the LLVM AArch64 intrinsic.

The question is whether it is worth doing it?

Anyway, I put up this diff to get suggestions and ideas.

bcardosolopes · 2024-11-08T20:58:55Z

> The question is whether it is worth doing it?

It's always better to unify and/or map target-specific things to generic ones (easier for optimizers to understand later). As the person working on this for some time, what's your take? When using C++ source that leads to CIR::AbsOp and CIR::FAbsOp with vectors against OG, does the LLVM output get llvm.aarch64.neon.abs.v4i16 and llvm.aarch64.neon.abs.v8i16 if the vector size matches?

ghehg · 2024-11-10T02:10:25Z

> The question is whether it is worth doing it?
>
> It's always better to unify and/or map target-specific things to generic ones (easier for optimizers to understand later). As the person working on this for some time, what's your take?

I'm thinking about the same thing (just using CIR::AbsOp and CIR::FAbsOp, and lowering to intrinsics only later).

> When using C++ source that leads to CIR::AbsOp and CIR::FAbsOp with vectors against OG, does the LLVM output get llvm.aarch64.neon.abs.v4i16 and llvm.aarch64.neon.abs.v8i16 if the vector size matches?

Unfortunately, I just ran an experiment: using LLVM::AbsOp gives us LLVM IR like this:
%3 = call <4 x i16> @llvm.abs.v4i16(<4 x i16> %0, i1 false)
which does not use NEON-specific intrinsics, even for the triple "aarch64-none-linux-android24".

But that might be OK: we could do our own smart TargetLowering in LoweringPrepare and lower vector-typed CIR::AbsOp and CIR::FAbsOp to CIR::LLVMIntrinsicCallOp if the target supports NEON, so we can take advantage of hardware NEON features.
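The target-aware choice proposed in this comment could be sketched roughly as below. This is plain C++, not code against the real CIR or MLIR APIs: `VecTy`, `intSuffix`, `isNeonSized`, `absIntrinsicName`, and the `hasNeon` flag are all hypothetical names introduced for illustration, and only the intrinsic name strings come from this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an integer vector type: lane count and lane width.
struct VecTy {
  unsigned lanes;
  unsigned bits;
};

// Builds the ".v<lanes>i<bits>" suffix seen in the intrinsic names above.
inline std::string intSuffix(VecTy ty) {
  return ".v" + std::to_string(ty.lanes) + "i" + std::to_string(ty.bits);
}

// NEON vector registers hold either 64 or 128 bits in total.
inline bool isNeonSized(VecTy ty) {
  unsigned total = ty.lanes * ty.bits;
  return total == 64 || total == 128;
}

// Picks the intrinsic name a target-aware lowering might emit for integer abs:
// the NEON-specific one when the target has NEON and the vector fits a NEON
// register, otherwise the generic llvm.abs.
inline std::string absIntrinsicName(VecTy ty, bool hasNeon) {
  assert(ty.lanes > 0 && ty.bits > 0 && "expected a non-empty vector type");
  if (hasNeon && isNeonSized(ty))
    return "llvm.aarch64.neon.abs" + intSuffix(ty);
  return "llvm.abs" + intSuffix(ty);
}
```

With these assumptions, `absIntrinsicName({4, 16}, true)` yields the NEON-specific name and `absIntrinsicName({4, 16}, false)` falls back to the generic one. A real LoweringPrepare pass would presumably query the target features rather than take a boolean.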

ghehg · 2024-11-10T02:41:02Z

> Unfortunately, I just ran an experiment: using LLVM::AbsOp gives us LLVM IR like this: %3 = call <4 x i16> @llvm.abs.v4i16(<4 x i16> %0, i1 false), which does not use NEON-specific intrinsics, even for the triple "aarch64-none-linux-android24".
>
> But that might be OK: we could do our own smart TargetLowering in LoweringPrepare and lower vector-typed CIR::AbsOp and CIR::FAbsOp to CIR::LLVMIntrinsicCallOp if the target supports NEON, so we can take advantage of hardware NEON features.

Very interesting.
[Traditional Clang codegen seems to use the NEON abs intrinsic only for the NEON abs builtin](https://godbolt.org/z/v8Ysxdrd1).
But the exact same hardware instruction, abs v0.4s, v0.4s, is generated for both the generic and the NEON intrinsics.
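As a rough illustration of why both paths can end at the same instruction, here is a portable reference model of the lane-wise operation for four 32-bit lanes (the `.4s` arrangement). It is only a sketch: `Int32x4` and `absLanes` are hypothetical names, and the code models the wraparound behaviour of `llvm.abs` when its second operand is `false`, where the absolute value of `INT32_MIN` is `INT32_MIN` again. It is not taken from the linked Godbolt example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4 x i32 vector.
using Int32x4 = std::array<std::int32_t, 4>;

// Lane-wise absolute value with wraparound (no saturation):
// the INT32_MIN lane maps back to INT32_MIN.
inline Int32x4 absLanes(const Int32x4 &v) {
  Int32x4 out{};
  for (std::size_t i = 0; i < v.size(); ++i) {
    // Negate in unsigned arithmetic so the INT32_MIN case is well defined.
    std::uint32_t u = static_cast<std::uint32_t>(v[i]);
    std::uint32_t mag = v[i] < 0 ? 0u - u : u;
    // Converting back is modular (guaranteed since C++20).
    out[i] = static_cast<std::int32_t>(mag);
  }
  return out;
}
```

For an input such as `{-1, 2, -3, INT32_MIN}` the model returns `{1, 2, 3, INT32_MIN}`. A saturating variant (as in the `vqabs` family) would instead clamp the last lane to `INT32_MAX`, which is why the two should not be mapped to the same operation.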

ghehg · 2024-11-11T03:37:31Z

My current plan is to extend AbsOp to take vector types. PR
Next, extend FAbsOp, along with the other FpUnaryOps, to support vector types.
Then we revisit this PR to decide between two options: either just use AbsOp and FAbsOp and lower them to the generic llvm.abs/llvm.fabs, which would give us a chance to optimize them in LLVM passes, or lower them to llvm.aarch64.neon.abs at either the codegen or the lowering stage, which would not be optimized away in the generated assembly/machine code under -O0.

bcardosolopes · 2024-11-11T19:07:20Z

> But, same exact hardware instruction abs v0.4s, v0.4s is generated for both general and neon intrinsics

Nice. As long as the final ASM is the same, it feels like we're good to map all those things to the same higher-level CIR operation.

> Then we visit this PR to decide whether we just use AbsOp and FAbsOp

Sounds good, thanks

ghehg · 2024-11-22T19:35:20Z

back to draft, waiting for other PRs to land.

ghehg marked this pull request as ready for review November 7, 2024 19:40

ghehg requested review from lanza and bcardosolopes as code owners November 7, 2024 19:40

ghehg force-pushed the macM3 branch from 3164ac8 to 496364c Compare November 8, 2024 13:35

smeenai force-pushed the main branch from dbd3e03 to 8176d88 Compare November 22, 2024 18:23

ghehg marked this pull request as draft November 22, 2024 19:34

smeenai force-pushed the main branch from 4aca8d4 to a04cf10 Compare November 23, 2024 06:11

ghehg force-pushed the macM3 branch from 496364c to 028fb7b Compare November 25, 2024 15:32

ghehg closed this Nov 25, 2024

ghehg force-pushed the macM3 branch from 028fb7b to a223ec2 Compare November 25, 2024 15:35

Lower neon_vabs_v, neon_vqabs_v

1b3d5e3

ghehg reopened this Nov 25, 2024

Update CIRGenBuiltinAArch64.cpp

6893164

ghehg marked this pull request as ready for review November 25, 2024 20:53

bcardosolopes approved these changes Nov 25, 2024

bcardosolopes merged commit 41078e9 into llvm:main Nov 25, 2024
8 checks passed
