-
The Andes V3 instruction set has an extension named EX9.IT. Usage looks something like:
Enlightening a disassembler about this has 2 annoyances:
Item 1 seems to be a common problem, and can be handled in Ghidra much like tracking other global pointers.
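For readers new to the mechanism, the table lookup behind EX9.IT can be sketched in a few lines. This is only a model of the arithmetic used by the code posted later in this thread (4-byte entries, with the two low bits of ITB masked off); the class and method names are hypothetical, and the ITB value is made up for the example.

```java
// Sketch only: models the EX9.IT table-address arithmetic, not Ghidra or real hardware.
final class Ex9ItLookup {
    // Each instruction-table entry is 4 bytes, as in the code later in the thread.
    static final int ENTRY_LENGTH = 4;

    /** Returns the address of table entry `index`, given the raw ITB register value. */
    static long entryAddress(long itb, long index) {
        long base = itb & ~0b11L; // ignore the two low bits of ITB
        return base + index * ENTRY_LENGTH;
    }

    public static void main(String[] args) {
        long itb = 0x0010_0001L; // hypothetical ITB value with a low bit set
        System.out.printf("entry 0 -> 0x%x%n", entryAddress(itb, 0)); // prints 0x100000
        System.out.printf("entry 5 -> 0x%x%n", entryAddress(itb, 5)); // prints 0x100014
    }
}
```

The disassembler has to know ITB at the EX9.IT site before it can do this lookup, which is why tracking it like a global pointer matters.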
Replies: 6 comments
-
I realized this construct is sort of like an inverted delay slot. However, I don't think Ghidra's existing delay slot behavior can be abused to implement it.
-
I implemented it with callotherfixup dynamic pcode and it seems to work well enough (although it would be nicer if I didn't have to do it this way).
-
The only way I can think of would be having a hidden return address register and context value that flags an indirect execution:
and for every instruction that can potentially be used in the table, have a return check:
I haven't tested this, but regardless, this will only work if the instructions used in the
-
I eventually arrived at something that seems to produce good results, although I think it may highlight some issues in existing Ghidra code. I split out the functionality for decoding the indirectly referenced instructions, so that the injection hook only returns pcode and the extra annotation is performed in an Analyzer:

    package plugin.core.analysis;
    //...
    public class EX9ITDisassembler {
        private static final int INSTRUCTION_TABLE_ENTRY_LENGTH = 4;
        private static final String EX9IT_MNEMONIC = "EX9.IT";

        private Program program;
        private ProgramContext programContext;
        private Language language;
        private Memory memory;
        private Register itbReg;
        private Listing listing;
        private Address zeroAddress;

        public EX9ITDisassembler(Program program) {
            this.program = program;
            memory = program.getMemory();
            language = program.getLanguage();
            programContext = program.getProgramContext();
            listing = program.getListing();
            itbReg = language.getRegister("ITB");
            zeroAddress = program.getAddressFactory().getDefaultAddressSpace().getAddress(0);
        }

        boolean instructionIsEX9IT(Instruction insn) {
            return insn.getMnemonicString().equals(EX9IT_MNEMONIC);
        }

        // Decodes `bytes` (a table entry) as if the instruction were located at disasmAddr
        public PseudoInstruction disassemble(Address disasmAddr, byte[] bytes)
                throws InsufficientBytesException, UnknownInstructionException {
            PseudoDisassemblerContext disassemblerContext = new PseudoDisassemblerContext(programContext);
            MemBuffer memBuffer = new ByteMemBufferImpl(disasmAddr, bytes, language.isBigEndian());
            // check that address is defined in memory
            try {
                memBuffer.getByte(0);
            } catch (Exception e) {
                return null;
            }
            disassemblerContext.flowStart(disasmAddr);
            InstructionPrototype prototype = language.parse(memBuffer, disassemblerContext, false);
            if (prototype == null) {
                return null;
            }
            PseudoInstruction instr;
            try {
                // First, normal decode
                instr = new PseudoInstruction(program, disasmAddr, prototype, memBuffer, disassemblerContext);
                // An EX9.IT inside the table is not allowed:
                // hw would generate Reserved Instruction Exception
                if (instructionIsEX9IT(instr)) {
                    return null;
                }
                // If it's branch, it's decoded as if current pc is 0
                // Must avoid passing program to PseudoInstruction, otherwise it will read from
                // program at the given addr - which we're trying to avoid
                FlowType flowType = prototype.getFlowType(instr);
                if (flowType.isCall() || flowType.isJump()) {
                    instr = new PseudoInstruction(program.getAddressFactory(), zeroAddress, prototype, memBuffer,
                            disassemblerContext);
                }
            } catch (AddressOverflowException e) {
                throw new InsufficientBytesException(
                        "failed to build pseudo instruction at " + disasmAddr + ": " + e.getMessage());
            }
            return instr;
        }

        // Returns the listing instruction at `address` only if it is an EX9.IT
        Instruction getEX9ITInstruction(Address address) {
            Instruction insn = listing.getInstructionAt(address);
            if (insn == null) {
                return null;
            }
            if (!instructionIsEX9IT(insn)) {
                return null;
            }
            return insn;
        }

        // Fetches and decodes the table entry referenced by an EX9.IT instruction
        public PseudoInstruction getITInstruction(Instruction ex9itInsn) {
            Address ex9itAddress = ex9itInsn.getAddress();
            long itOffset = ex9itInsn.getScalar(0).getUnsignedValue() * INSTRUCTION_TABLE_ENTRY_LENGTH;
            // ITB must already be known (tracked) at this address
            BigInteger ITB = programContext.getValue(itbReg, ex9itAddress, false);
            if (ITB == null) {
                return null;
            }
            // Mask off the two low bits of ITB before adding the entry offset
            long memOffset = (ITB.longValue() & ~0b11) + itOffset;
            Address fetchAddress = ex9itAddress.getNewAddress(memOffset);
            byte[] data = new byte[INSTRUCTION_TABLE_ENTRY_LENGTH];
            try {
                if (memory.getBytes(fetchAddress, data) != data.length) {
                    return null;
                }
                return disassemble(ex9itAddress, data);
            } catch (Exception ex) {
                return null;
            }
        }

        public PseudoInstruction getITInstruction(Address ex9itAddress) {
            Instruction ex9Insn = getEX9ITInstruction(ex9itAddress);
            if (ex9Insn == null) {
                return null;
            }
            return getITInstruction(ex9Insn);
        }
    }

    package ghidra.app.util.pcodeInject;
    //...
    public class InjectEX9IT extends InjectPayloadCallother {
        public InjectEX9IT(String sourceName) {
            super(sourceName);
        }

        @Override
        public PcodeOp[] getPcode(Program program, InjectContext con) {
            EX9ITDisassembler disassembler = new EX9ITDisassembler(program);
            PseudoInstruction itInsn = disassembler.getITInstruction(con.baseAddr);
            // Could be bad ITB
            if (itInsn == null) {
                return null;
            }
            // NOTE SymbolicPropogator must be patched to allow STORE pcode ops
            return itInsn.getPcode();
        }
    }

    package plugin.core.analysis;
    //...
    public class AndeStarEX9ITAnalyzer extends AbstractAnalyzer {
        private static final String PROCESSOR_NAME = "AndeStar";
        private static final String NAME = "AndeStar EX9IT Analyzer";
        private static final String DESCRIPTION = "Annotates EX9.IT instructions";
        private static final CodeUnitFormat codeUnitFormat = new CodeUnitFormat(new CodeUnitFormatOptions());

        public AndeStarEX9ITAnalyzer() {
            super(NAME, DESCRIPTION, AnalyzerType.INSTRUCTION_ANALYZER);
            setDefaultEnablement(true);
        }

        @Override
        public boolean canAnalyze(Program program) {
            return program.getLanguage().getProcessor().equals(
                    Processor.findOrPossiblyCreateProcessor(PROCESSOR_NAME));
        }

        @Override
        public boolean added(Program program, AddressSetView set, TaskMonitor monitor, MessageLog log)
                throws CancelledException {
            Listing listing = program.getListing();
            ReferenceManager refMgr = program.getReferenceManager();
            EX9ITDisassembler disassembler = new EX9ITDisassembler(program);
            for (Address addr : set.getAddresses(true)) {
                PseudoInstruction itInsn = disassembler.getITInstruction(addr);
                if (itInsn == null) {
                    continue;
                }
                // Add a comment
                // TODO append to mnemonic instead?
                // itInsn will not have associated program if it's a branch
                if (itInsn.getProgram() != null) {
                    String comment = codeUnitFormat.getRepresentationString(itInsn);
                    listing.setComment(addr, CodeUnit.EOL_COMMENT, comment);
                } else {
                    // dont really need extra comment - ghidra will add one because of the reference
                    // comment = itInsn.getPrimaryReference(0).getToAddress().toString();
                }
                // Copy the references
                refMgr.removeAllReferencesFrom(addr);
                Reference[] refs = itInsn.getReferencesFrom();
                if (refs.length > 0) {
                    CodeUnit cu = listing.getCodeUnitAt(addr);
                    for (Reference ref : refs) {
                        cu.addMnemonicReference(ref.getToAddress(), ref.getReferenceType(), ref.getSource());
                    }
                }
            }
            return true;
        }
    }

The things to highlight:
-
@shuffle2 I was thinking about another way to make this work, or an extension we might add to sleigh parsing to handle something like this. There are two other ways this could work.

One is that the instruction bytes the EXEC.IT instruction uses are fetched in an analyzer and then set in a 4-byte context field. To make this work, all instructions would need to be parsed from the context and not from memory. It might get complicated with variable-sized instructions, but they can actually consume bytes as well. This would be possible with the addition of ":^instruction" type context loading, in conjunction with an analyzer that would set the context correctly and then re-parse the instruction. This would work, but unfortunately all RISC-V instructions would need to be parsed from context, so the tokens that are used to parse from memory would be moved into tokens in the context register.

The other would require a change to sleigh that could load/append bytes to the parse buffer after the IT instruction, most likely in the [ action ] part of the parsing, and then recurse the parsing again. Something like:

    [ inst_buffer = read(space, target, 4); ]

This would append the bytes read from the target location to the parse buffer, or replace the bytes already in it. The target could also be the instruction hardware lookup table implemented as another address space. This might also work for peeking at bytes and matching during the parse without consuming them.

I'm not sure this is the best solution either. We'd need to think about the changes that would be necessary to the sleigh parsing. Also, are you considering submitting a PR, or is the spec available somewhere to take a look?
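To make the second idea more concrete, here is a toy model in plain Java of "replace the parse buffer with bytes read from the table, then parse again". It is not SLEIGH and not AndeStar: the opcode value, the 2-byte indirect form, and all names are invented for illustration. The one behavior borrowed from this thread is that an indirect instruction found inside the table is treated as reserved rather than followed.

```java
import java.util.Arrays;

// Toy model only: the encodings below are invented and are not AndeStar or SLEIGH.
final class ToyIndirectDecoder {
    static final int EX9IT_OPCODE = 0xE9; // hypothetical form: opcode byte, then table index byte
    static final int ENTRY_LENGTH = 4;

    /** Decodes the instruction at `pc`; the indirect form is resolved through the table. */
    static String decode(byte[] memory, int pc, int tableBase) {
        byte[] buf = Arrays.copyOfRange(memory, pc, pc + ENTRY_LENGTH);
        return decodeBytes(buf, memory, tableBase, false);
    }

    private static String decodeBytes(byte[] buf, byte[] memory, int tableBase, boolean indirect) {
        int opcode = buf[0] & 0xFF;
        if (opcode != EX9IT_OPCODE) {
            return String.format("op_%02x", opcode); // any other opcode is an ordinary instruction
        }
        if (indirect) {
            return "reserved"; // an indirect instruction inside the table is not followed
        }
        // Swap the parse buffer for the table entry's bytes and parse once more.
        int entry = tableBase + (buf[1] & 0xFF) * ENTRY_LENGTH;
        byte[] replacement = Arrays.copyOfRange(memory, entry, entry + ENTRY_LENGTH);
        return "ex9.it -> " + decodeBytes(replacement, memory, tableBase, true);
    }

    public static void main(String[] args) {
        byte[] memory = new byte[16];
        memory[0] = (byte) EX9IT_OPCODE; // indirect instruction at pc 0 ...
        memory[1] = 1;                   // ... selecting table entry 1
        memory[12] = 0x42;               // entry 1 lives at tableBase (8) + 1 * 4
        System.out.println(decode(memory, 0, 8)); // prints "ex9.it -> op_42"
    }
}
```

The point of the sketch is that the decoder itself never needs to know where the replacement bytes came from; only the buffer swap does, which is roughly the role the proposed `read(space, target, 4)` action would play.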