decompiler: fix loops with shifted structure offset #6718
The Ghidra decompiler does not display some array accesses as well as it could. For example, if our structures are defined this way:
, and we are given code:
, then decompiler produces this result:
With this fix it should instead produce this result, which is much neater and more readable:
The issue was caused by RulePtrArith not finding the multiplication operation (by 0xc in this case) in the pointer arithmetic expression
ESI + EAX*0x1 + 0x4
(ESI is the shifted structure offset here, EAX is List.Data, and 0x4 is the offset to Element2). By letting RulePtrArith look through CPUI_MULTIEQUAL in certain cases and transform the expression, the multiplier becomes visible and the rule can produce better results. The transformation takes the 0xc out of the -= operator and out of the place where iVar2 is initialized, and duplicates it at every place iVar2 is used. Once the multiplication by 0xc appears directly in the pointer arithmetic expression, RulePtrArith can find it and generate CPUI_PTRADD (addition of a number to a pointer).

Another issue was that ActionNodeJoin created CPUI_MULTIEQUALs whose output type was unknown or undefined. RulePropagateCopy later substituted those outputs for the base pointer inside CPUI_PTRADD expressions, which left the base pointer with an unknown type. RulePtraddUndo then saw a CPUI_PTRADD whose base pointer has a size that does not match the original size with which the CPUI_PTRADD was created, so it converted the CPUI_PTRADD back to CPUI_INT_ADD (addition of two integers). This is fixed by assigning a type to the output of CPUI_MULTIEQUAL in ActionNodeJoin, since the other rules involved appear to be behaving as intended.
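The rescaling described above can be illustrated at the source level. This is only a sketch under assumptions: the `Item` and `List` layouts are hypothetical (chosen so that an element is 0xc bytes with `Element2` at offset 0x4, matching the numbers quoted above), and the function names are invented for illustration. It does not reproduce the PR's original test case or any Ghidra code. `sum_stepped` keeps the 0xc inside the loop variable, which is the shape that hides the multiplier; `sum_indexed` steps by 1 and multiplies at the point of use, which is the shape in which an array access is recognizable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 0xc bytes per element, Element2 at offset 0x4. */
typedef struct {
    int32_t Element1; /* offset 0x0 */
    int32_t Element2; /* offset 0x4 */
    int32_t Element3; /* offset 0x8 */
} Item;

typedef struct {
    Item   *Data;
    int32_t Count;
} List;

_Static_assert(sizeof(Item) == 0xc, "sketch assumes a 0xc-byte element");

/* Before: the loop variable is a byte offset that is initialized with a
 * multiple of 0xc and decremented by 0xc. The address expression is
 * base + off + 0x4, with no visible multiplication. */
int32_t sum_stepped(const List *list)
{
    const unsigned char *base = (const unsigned char *)list->Data;
    size_t off = (size_t)list->Count * sizeof(Item);
    int32_t sum = 0;
    while (off != 0) {
        int32_t v;
        off -= sizeof(Item);
        memcpy(&v, base + off + offsetof(Item, Element2), sizeof v);
        sum += v;
    }
    return sum;
}

/* After: the 0xc is moved out of the initializer and the -= and into the
 * use, so the address is base + i*0xc + 0x4, i.e. Data[i].Element2. */
int32_t sum_indexed(const List *list)
{
    size_t i = (size_t)list->Count;
    int32_t sum = 0;
    while (i != 0) {
        i -= 1;
        sum += list->Data[i].Element2;
    }
    return sum;
}
```

Both functions visit the same addresses in the same order, so the rewrite preserves behavior as long as nothing else modifies the loop variable between iterations.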
EDIT: It seems this "shifted structure offset" rule cannot be applied if the variable holding the shifted offset is stored on the stack and there is a function call in the loop body. If the call is passed any arguments, it could potentially write to that variable: there is no way to know what was passed or whether it is a pointer to that stack location. So the rule only works when the shifted offset variable is in a register.
EDIT: Even if the function is passed no arguments, global variables exist and could point to that stack location. So any call in the loop makes this rule unusable when the offset is stored on the stack.
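A small source-level sketch of the aliasing hazard, under assumptions: `g_alias`, `callee`, and `count_iterations_with_call` are hypothetical names, and in C source the escaping address is visible, whereas at the binary level the decompiler generally cannot tell whether such a pointer exists. The callee takes no arguments, yet it still overwrites the stack-resident offset through a global pointer, so treating the loop variable as "initial value minus 0xc per iteration" would be wrong.

```c
#include <stddef.h>

/* Hypothetical global that may point into some caller's stack frame. */
static int *g_alias = NULL;

/* Takes no arguments, but can still write to the caller's local. */
static void callee(void)
{
    if (g_alias != NULL) {
        *g_alias = 0;
    }
}

int count_iterations_with_call(void)
{
    int off = 3 * 0xc; /* address is taken below, so off lives on the stack */
    int iterations = 0;

    g_alias = &off;
    while (off != 0) {
        off -= 0xc;
        iterations++;
        callee(); /* clobbers off behind the loop's back */
    }
    g_alias = NULL;

    /* Without the call this loop would run 3 times; with it, only once. */
    return iterations;
}
```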