
[AAELF64] Clarify how addends work in MOVZ, MOVK and ADRP. #271

Merged

Conversation

statham-arm (Contributor)

This brings AAELF64 into line with AAELF32, which already has a similar clarification for the MOVW+MOVT pair. For the instructions which shift their operand left (ADRP, and the shifted MOVZ and MOVK), if the relocation addend is taken from the input value of the immediate field, it is not treated as shifted.

The rationale is that this allows a sequence of related instructions to consistently compute the same value (symbol + small offset), and cooperate to load that value into the target register, one small chunk at a time. For example, this would load mySymbol + 0x123:

  mov  x0, #0x123          ; R_AARCH64_MOVW_UABS_G0_NC(mySymbol)
  movk x0, #0x123, lsl #16 ; R_AARCH64_MOVW_UABS_G1_NC(mySymbol)
  movk x0, #0x123, lsl #32 ; R_AARCH64_MOVW_UABS_G2_NC(mySymbol)
  movk x0, #0x123, lsl #48 ; R_AARCH64_MOVW_UABS_G3(mySymbol)

The existing text made it unclear whether the addends were shifted or not. If they are interpreted as shifted, then nothing useful happens, because the first instruction would load the low 16 bits of mySymbol+0x123, and the second would load the next 16 bits of mySymbol+0x1230000, and so on. This doesn't reliably get you any useful offset from the symbol, because the relocations are processed independently, so that a carry out of the low 16 bits won't be taken into account in the next 16.

If you do need to compute a large offset from the symbol, you have no option but to use SHT_RELA and specify a full 64-bit addend: there's no way to represent that in an SHT_REL setup. But interpreting the SHT_REL addends in the way specified here, you can at least specify small addends successfully.
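
To make the intended processing concrete, here is a minimal C sketch of REL-style processing for the four relocations above (the helper names are my own, and this is an illustration of the rule rather than the specification's pseudocode):

  #include <stdint.h>

  /* MOVZ/MOVK encode imm16 in bits [20:5]. Under this clarification the
     initial addend is that field read as a signed 16-bit value, with no
     shift applied, regardless of the instruction's own LSL amount. */
  static int64_t read_addend(uint32_t insn) {
      return (int16_t)((insn >> 5) & 0xFFFF);
  }

  static uint32_t write_imm16(uint32_t insn, uint16_t imm) {
      return (insn & ~(0xFFFFu << 5)) | ((uint32_t)imm << 5);
  }

  /* group = 0..3 selects which 16-bit slice of S + A this instruction
     loads (G0_NC, G1_NC, G2_NC, G3). Each relocation recomputes the full
     64-bit value, so all four instructions agree on it, carries included. */
  static uint32_t relocate_movw_uabs(uint32_t insn, uint64_t sym, int group) {
      uint64_t value = sym + (uint64_t)read_addend(insn);
      return write_imm16(insn, (uint16_t)(value >> (16 * group)));
  }

With all four immediate fields holding 0x123 as in the example, each call computes the same value mySymbol + 0x123 and writes back a different 16-bit slice of it.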

@smithp35 (Contributor) left a comment


One small comment about the MOV alias, but it isn't a strong opinion.

As I understand it, the sequence

  ADRP x0, symbol + addend
  ADD  x0, x0, :lo12:symbol + addend

would likely limit the size of the addend on the ADRP to the same range [0, 4095] that can be expressed in the ADD/LDR.

I would like to think there's a way of relating addend1 and addend2 so that we always get the right offset within the page.

  ADRP x0, symbol + addend1
  ADD  x0, x0, :lo12:symbol + addend2

Off the top of my head, something like: addend2 is congruent to addend1 (modulo 4096). I'd need to think about that in more detail, as I could easily have that wrong.

If there is such an expression we could add that as a note.

@@ -940,6 +940,13 @@ A ``RELA`` format relocation must be used if the initial addend cannot be encode

There is no PC bias to accommodate in the relocation of a place containing an instruction that formulates a PC-relative address. The program counter reflects the address of the currently executing instruction.

There are two special cases for forming the initial addend of REL-type relocations where the immediate field cannot normally hold small signed integers:

* For relocations processing MOVZ and MOVK instructions, the initial addend is formed by interpreting the 16-bit literal field of the instruction as a 16-bit signed value in the range -32768 <= A < 32768. The interpretation is the same whether or not the instruction applies a left shift to its immediate: the addend is never treated as shifted.
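
As a concrete illustration (my own example, not part of the proposed text): an imm16 field holding 0xFFFD gives an initial addend of A = -3, whether the instruction is movz x0, #0xfffd or movk x0, #0xfffd, lsl #32; the shift only selects which bits of the relocated value the instruction loads, and never scales the addend.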
@smithp35 (Contributor)

Would it be worth including MOV (wide immediate), using the Arm ARM description? For example:

  For relocations processing MOV (wide immediate), MOVZ and MOVK instructions,

MOV (wide immediate) is an alias for MOVZ, so it is already covered by MOVZ and MOVK, but MOV is used a lot in code examples.

@statham-arm (Contributor, Author)

I dithered about that myself. Added a parenthesis.

@statham-arm (Contributor, Author)

Nice thought! I think you're absolutely right: you can make ADRP+ADD consistently generate up to a 19-bit signed offset even though the ADD immediate can only hold 12 bits of it, by just setting the ADD immediate to the low 12 bits. The error in computing the full value during processing of the ADD relocation doesn't matter, because it's only in bits 12 and above, which don't affect the output.
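
A small self-contained check of that argument (my own helper names, with the PC-page bias of a real ADRP omitted for brevity; a sketch, not the specification's processing):

  #include <assert.h>
  #include <stdint.h>

  /* Page(x), as used by the AAELF64 relocation operations. */
  static uint64_t page(uint64_t v) { return v & ~0xFFFULL; }

  int main(void) {
      uint64_t sym = 0x412345, addend = 0x1ABC;  /* addend wider than 12 bits */

      /* ADRP relocation: carries the full addend. */
      uint64_t adrp_part = page(sym + addend);

      /* ADD relocation: only bits [11:0] of its own S + A are encoded, so
         giving it just the low 12 bits of the addend loses nothing. */
      uint64_t add_part = (sym + (addend & 0xFFF)) & 0xFFF;

      /* The error in the ADD's S + A lies entirely in bits >= 12, which the
         ADD immediate never holds, so the combined result is exact. */
      assert(adrp_part + add_part == sym + addend);
      return 0;
  }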

@statham-arm (Contributor, Author)

Oops! The ADRP immediate is 21 bits, not 19. Almost didn't notice that the encoding has two more immediate bits near the top of the word.
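
For reference, a sketch of extracting that field, assuming the standard A64 encoding with immlo in bits [30:29] and immhi in bits [23:5] (my own helper, not spec pseudocode); as I read the change, this unshifted signed value is the initial addend for a REL-type ADRP relocation:

  #include <stdint.h>

  /* ADRP's immediate is immhi:immlo, 21 bits, with the sign bit at bit 20. */
  static int64_t adrp_imm21(uint32_t insn) {
      uint32_t immlo = (insn >> 29) & 0x3;      /* bits [30:29], 2 bits  */
      uint32_t immhi = (insn >> 5) & 0x7FFFF;   /* bits [23:5], 19 bits  */
      uint32_t imm21 = (immhi << 2) | immlo;
      /* Sign-extend from bit 20 by subtracting 2^21 when bit 20 is set. */
      return (int64_t)imm21 - ((int64_t)(imm21 & 0x100000) << 1);
  }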

@rearnsha

rearnsha commented Jul 2, 2024 via email

@Wilco1 (Contributor)

Wilco1 commented Jul 2, 2024

> So the question is, could existing code be using these fields in a different
> way (interpreting the bits as scaled by the field encoding information)? If
> they could, then this change would break existing usage and thus cannot be
> safely made, even if it would have been more useful in retrospect.

As explained, it is not possible to use a non-zero offset with REL MOVZ/MOVK/ADRP/ADD/LDR relocations under the shifted interpretation, since the individual relocations wouldn't all compute the same value.

@smithp35 (Contributor)

smithp35 commented Jul 2, 2024

> So the question is, could existing code be using these fields in a different
> way (interpreting the bits as scaled by the field encoding information)? If
> they could, then this change would break existing usage and thus cannot be
> safely made, even if it would have been more useful in retrospect.

While we can never be 100% certain, we've only got one known user of SHT_REL for AArch64, the Arm legacy assembler armasm, and that switches to RELA for the instructions with scaled addends like ADRP and MOVT/MOVW. Part of the motivation for this change is to correct an incorrect implementation in LLD.

On the use-case side, we can't be 100% certain, but I certainly can't think of a way a compiler could make use of the scaled immediates as relocation addends. I remember from the 32-bit case that we first implemented MOVT/MOVW in SHT_REL using scaled addends and soon ran into runtime failures, which provoked the redefinition to not scale the addends. It may be possible to come up with something in assembly that isn't contrived.

I think this is an area where I'd be comfortable taking the risk.

@smithp35 (Contributor) left a comment

Thanks for the updates; I'm approving this from my side. I suggest we leave this open for a few days for others to comment/object. Please use "Request changes" for any blocking feedback.

@rearnsha

rearnsha commented Jul 2, 2024 via email

@smithp35 (Contributor)

smithp35 commented Jul 2, 2024

It would have to be an expression of the form

  symbol + <immediate> * 2^16

where the symbol is aligned on a 2^16 boundary. The only thing that I can think of in that case is where symbol is the section symbol of a section aligned to at least 2^16, and the section is larger than 2^16 bytes.

Given how rare that use-case is compared to sections that are 4- or 8-byte aligned, I doubt that a potential SHT_REL-only toolchain would implement section symbol + offset like that. I'd expect it to use local anonymous symbols as anchors.

@smithp35 (Contributor)

smithp35 commented Jul 8, 2024

Will be merging this PR by end of UK working day 08/07/2024

@smithp35 (Contributor)

smithp35 commented Jul 8, 2024

End of UK working day. Merging this PR.

@smithp35 merged commit 1a448ed into ARM-software:main Jul 8, 2024
1 check passed
@sallyarmneale left a comment


No changes required.

@statham-arm deleted the aaelf64-reloc-clarification branch August 12, 2024 09:43