From 5cc35e7e5cb8ec4cf46ae3f669946b9633a1034a Mon Sep 17 00:00:00 2001 From: Simon Tatham Date: Mon, 1 Jul 2024 17:19:26 +0100 Subject: [PATCH] [AAELF64] Clarify how addends work in MOVZ, MOVK and ADRP. This brings AAELF64 into line with AAELF32, which already has a similar clarification for the MOVW+MOVT pair. For the instructions which shift their operand left (ADRP, and the shifted MOVZ and MOVK), if the relocation addend is taken from the input value of the immediate field, it is not treated as shifted. The rationale is that this allows a sequence of related instructions to consistently compute the same value (symbol + small offset), and cooperate to load that value into the target register, one small chunk at a time. For example, this would load `mySymbol + 0x123`: mov x0, #0x123 ; R_AARCH64_MOVW_UABS_G0_NC(mySymbol) movk x0, #0x123, lsl #16 ; R_AARCH64_MOVW_UABS_G1_NC(mySymbol) movk x0, #0x123, lsl #32 ; R_AARCH64_MOVW_UABS_G2_NC(mySymbol) movk x0, #0x123, lsl #48 ; R_AARCH64_MOVW_UABS_G3(mySymbol) The existing text made it unclear whether the addends were shifted or not. If they are interpreted as shifted, then nothing useful happens, because the first instruction would load the low 16 bits of `mySymbol+0x123`, and the second would load the next 16 bits of `mySymbol+0x1230000`, and so on. This doesn't reliably get you _any_ useful offset from the symbol, because the relocations are processed independently, so that a carry out of the low 16 bits won't be taken into account in the next 16. If you do need to compute a large offset from the symbol, you have no option but to use SHT_RELA and specify a full 64-bit addend: there's no way to represent that in an SHT_REL setup. But interpreting the SHT_REL addends in the way specified here, you can at least specify _small_ addends successfully. --- aaelf64/aaelf64.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/aaelf64/aaelf64.rst b/aaelf64/aaelf64.rst index c106b47..7fbf12e 100644 --- a/aaelf64/aaelf64.rst +++ b/aaelf64/aaelf64.rst @@ -940,6 +940,15 @@ A ``RELA`` format relocation must be used if the initial addend cannot be encode There is no PC bias to accommodate in the relocation of a place containing an instruction that formulates a PC- relative address. The program counter reflects the address of the currently executing instruction. +There are two special cases for forming the initial addend of REL-type relocations where the immediate field cannot normally hold small signed integers: + +* For relocations processing MOVZ and MOVK instructions (including the "MOV (wide immediate)" alias), the initial addend is formed by interpreting the 16-bit literal field of the instruction as a 16-bit signed value in the range -32768 <= A < 32768. The interpretation is the same whether or not the instruction applies a left shift to its immediate: the addend is never treated as shifted. + +* For relocations processing the ADRP instruction, the initial addend is similarly formed by interpreting the literal field of the instruction as a 21-bit signed integer, in the range -1048576 <= A < 1048576. The ADRP instruction's implicit left shift of 12 bits is not applied. + +These special cases permit a sequence of instructions to each add the same small constant to a symbol's value, and extract separate ranges of bits from the sum, so that the instruction sequence as a whole consistently loads the full result of the addition. + +In the case of a sequence using ADRP followed by a 12-bit ADD to set up the low bits of the offset, you can express an offset up to 1048576 in either direction, by writing the full offset in the ADRP's immediate field, and repeating its low 12 bits in the ADD's immediate field. A linker resolving the R_AARCH64_ADD_ABS_LO12_NC relocation on the ADD will not compute the correct overall 64-bit value, but the error will only be in the higher bits, which are discarded by that relocation. Relocation types ^^^^^^^^^^^^^^^^