Solidity is a high-level programming language: we can read it, but machines can't execute it directly. An Ethereum client comes with the Ethereum Virtual Machine (EVM), a sandboxed runtime environment created specifically for running smart contracts. When we compile Solidity code, the compiler converts it into bytecode, which is the format the EVM executes.
Let’s take a very simple contract as an example:
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.7;

contract Greeting {
    function sayHello() external pure returns (string memory) {
        return "Hello World!";
    }
}
If we compile this code in Remix, we can inspect the compilation output. The "object" field is the compiled code: the hexadecimal representation of the final contract, also known as bytecode. We also see the "opcodes" field. Opcodes are the human-readable names of the EVM's low-level instructions, and every opcode has a corresponding hexadecimal value.
{
"functionDebugData": {},
"generatedSources": [],
"linkReferences": {},
"object": "608060405234801561001057600080fd5b5061017c806100206000396000f3fe608060405234801561001057600080fd5b506004361061002b5760003560e01c8063ef5fb05b14610030575b600080fd5b61003861004e565b60405161004591906100c4565b60405180910390f35b60606040518060400160405280600c81526020017f48656c6c6f20576f726c64210000000000000000000000000000000000000000815250905090565b6000610096826100e6565b6100a081856100f1565b93506100b0818560208601610102565b6100b981610135565b840191505092915050565b600060208201905081810360008301526100de818461008b565b905092915050565b600081519050919050565b600082825260208201905092915050565b60005b83811015610120578082015181840152602081019050610105565b8381111561012f576000848401525b50505050565b6000601f19601f830116905091905056fea26469706673582212204e73b0b87647a66bb4db4c95d39bf565be4977d84c116afdaa89989b258a184064736f6c63430008070033",
"opcodes": "PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH2 0x17C DUP1 PUSH2 0x20 PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN INVALID PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH1 0x4 CALLDATASIZE LT PUSH2 0x2B JUMPI PUSH1 0x0 CALLDATALOAD PUSH1 0xE0 SHR DUP1 PUSH4 0xEF5FB05B EQ PUSH2 0x30 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH2 0x38 PUSH2 0x4E JUMP JUMPDEST PUSH1 0x40 MLOAD PUSH2 0x45 SWAP2 SWAP1 PUSH2 0xC4 JUMP JUMPDEST PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH1 0x60 PUSH1 0x40 MLOAD DUP1 PUSH1 0x40 ADD PUSH1 0x40 MSTORE DUP1 PUSH1 0xC DUP2 MSTORE PUSH1 0x20 ADD PUSH32 0x48656C6C6F20576F726C64210000000000000000000000000000000000000000 DUP2 MSTORE POP SWAP1 POP SWAP1 JUMP JUMPDEST PUSH1 0x0 PUSH2 0x96 DUP3 PUSH2 0xE6 JUMP JUMPDEST PUSH2 0xA0 DUP2 DUP6 PUSH2 0xF1 JUMP JUMPDEST SWAP4 POP PUSH2 0xB0 DUP2 DUP6 PUSH1 0x20 DUP7 ADD PUSH2 0x102 JUMP JUMPDEST PUSH2 0xB9 DUP2 PUSH2 0x135 JUMP JUMPDEST DUP5 ADD SWAP2 POP POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x20 DUP3 ADD SWAP1 POP DUP2 DUP2 SUB PUSH1 0x0 DUP4 ADD MSTORE PUSH2 0xDE DUP2 DUP5 PUSH2 0x8B JUMP JUMPDEST SWAP1 POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 DUP2 MLOAD SWAP1 POP SWAP2 SWAP1 POP JUMP JUMPDEST PUSH1 0x0 DUP3 DUP3 MSTORE PUSH1 0x20 DUP3 ADD SWAP1 POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 JUMPDEST DUP4 DUP2 LT ISZERO PUSH2 0x120 JUMPI DUP1 DUP3 ADD MLOAD DUP2 DUP5 ADD MSTORE PUSH1 0x20 DUP2 ADD SWAP1 POP PUSH2 0x105 JUMP JUMPDEST DUP4 DUP2 GT ISZERO PUSH2 0x12F JUMPI PUSH1 0x0 DUP5 DUP5 ADD MSTORE JUMPDEST POP POP POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x1F NOT PUSH1 0x1F DUP4 ADD AND SWAP1 POP SWAP2 SWAP1 POP JUMP INVALID LOG2 PUSH5 0x6970667358 0x22 SLT KECCAK256 0x4E PUSH20 0xB0B87647A66BB4DB4C95D39BF565BE4977D84C11 PUSH11 0xFDAA89989B258A18406473 PUSH16 0x6C634300080700330000000000000000 ",
"sourceMap": "59:125:0:-:0;;;;;;;;;;;;;;;;;;;"
}
For the hexadecimal value of each opcode, refer to the Ethereum documentation. Check Ethereum opcodes.
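To make the opcode-to-value correspondence concrete, here is a minimal TypeScript sketch. The table is a small hand-picked subset, and `OPCODE_VALUES` and `opcodeHex` are hypothetical names used only for illustration. It relies on the fact that PUSH1..PUSH32, DUP1..DUP16 and SWAP1..SWAP16 occupy contiguous ranges starting at 0x60, 0x80 and 0x90.

```typescript
// Small subset of the EVM opcode table (mnemonic -> byte value).
// Illustrative only: the full table has many more entries.
const OPCODE_VALUES: Record<string, number> = {
  STOP: 0x00,
  ADD: 0x01,
  ISZERO: 0x15,
  CALLVALUE: 0x34,
  CODECOPY: 0x39,
  POP: 0x50,
  MSTORE: 0x52,
  JUMPI: 0x57,
  JUMPDEST: 0x5b,
  RETURN: 0xf3,
  REVERT: 0xfd,
};

// Hypothetical helper: returns the two-digit hex value of a mnemonic.
function opcodeHex(mnemonic: string): string {
  let value: number | undefined;
  const ranged = /^(PUSH|DUP|SWAP)(\d+)$/.exec(mnemonic);
  if (ranged) {
    // PUSHn, DUPn and SWAPn are computed from the start of their range.
    const n = Number(ranged[2]);
    const base = ranged[1] === "PUSH" ? 0x60 : ranged[1] === "DUP" ? 0x80 : 0x90;
    const max = ranged[1] === "PUSH" ? 32 : 16;
    if (n >= 1 && n <= max) value = base + n - 1;
  } else {
    value = OPCODE_VALUES[mnemonic];
  }
  if (value === undefined) throw new Error("Unknown opcode: " + mnemonic);
  return ("0" + value.toString(16)).slice(-2);
}

// The listing starts with PUSH1 0x80 PUSH1 0x40 MSTORE, and the bytecode
// starts with 60 80 60 40 52: PUSH1 is 60 and MSTORE is 52.
console.log(opcodeHex("PUSH1"), opcodeHex("MSTORE"));
```

Comparing the first bytes of "object" with the first entries of "opcodes" in the output above shows the same correspondence.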
The EVM is a stack-based virtual machine. A stack is a last-in, first-out structure; in computer science we call it LIFO.
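As a rough illustration of LIFO behaviour, the sketch below models a tiny stack and walks through what `PUSH1 0x02 PUSH1 0x03 ADD` would do. The `Stack` class and `runAddExample` function are hypothetical, and plain JavaScript numbers stand in for the EVM's 256-bit words, so this is a simplification rather than a faithful EVM model.

```typescript
// Minimal LIFO stack in the spirit of the EVM's operand stack.
// Assumption: numbers replace 256-bit words to keep the example short.
class Stack {
  private items: number[] = [];

  push(value: number): void {
    // The EVM limits its stack to 1024 items.
    if (this.items.length >= 1024) throw new Error("stack overflow");
    this.items.push(value);
  }

  pop(): number {
    const value = this.items.pop();
    if (value === undefined) throw new Error("stack underflow");
    return value;
  }
}

// Simulates: PUSH1 0x02, PUSH1 0x03, ADD
function runAddExample(): number {
  const stack = new Stack();
  stack.push(0x02); // PUSH1 0x02
  stack.push(0x03); // PUSH1 0x03
  const a = stack.pop(); // 0x03 comes off first (last in, first out)
  const b = stack.pop(); // then 0x02
  stack.push(a + b); // ADD pushes the sum back
  return stack.pop();
}

console.log(runAddExample()); // 5
```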
In this project, I developed a bytecode-to-opcode disassembler: it takes the bytecode and produces the opcodes that the EVM will execute. I used TypeScript and developed a test using Jest.
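The core idea of such a disassembler can be sketched in a few lines. This is not the project's actual implementation: `MNEMONICS` and `disassemble` are hypothetical names, and the opcode table is a partial subset, so any byte missing from it is printed as INVALID. The formatting (leading zeros stripped from PUSH data, a truncated PUSH padded with zeros) is chosen to mirror the "opcodes" listing above.

```typescript
// Partial opcode table (byte value -> mnemonic). Assumption: bytes that are
// not listed here are reported as INVALID, even if the real EVM defines them.
const MNEMONICS: { [byte: number]: string } = {
  0x00: "STOP", 0x01: "ADD", 0x15: "ISZERO", 0x34: "CALLVALUE",
  0x39: "CODECOPY", 0x50: "POP", 0x52: "MSTORE", 0x57: "JUMPI",
  0x5b: "JUMPDEST", 0xf3: "RETURN", 0xfd: "REVERT",
};

function disassemble(bytecode: string): string {
  const hex = bytecode.replace(/^0x/, "");
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new Error("bytecode must be an even-length hex string");
  }
  const out: string[] = [];
  let i = 0;
  while (i < hex.length) {
    const byte = parseInt(hex.slice(i, i + 2), 16);
    i += 2;
    if (byte >= 0x60 && byte <= 0x7f) {
      // PUSH1..PUSH32: the next 1..32 bytes are data, not instructions.
      const size = byte - 0x5f;
      let data = hex.slice(i, i + size * 2);
      i += size * 2;
      while (data.length < size * 2) data += "0"; // pad a truncated PUSH
      data = data.replace(/^0+(?=.)/, "").toUpperCase(); // 0010 -> 10
      out.push("PUSH" + size + " 0x" + data);
    } else if (byte >= 0x80 && byte <= 0x8f) {
      out.push("DUP" + (byte - 0x7f));
    } else if (byte >= 0x90 && byte <= 0x9f) {
      out.push("SWAP" + (byte - 0x8f));
    } else {
      out.push(MNEMONICS[byte] || "INVALID");
    }
  }
  return out.join(" ");
}

// First bytes of the Greeting contract's bytecode:
console.log(disassemble("6080604052348015"));
// PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO
```

The important design point is the PUSH branch: without skipping the immediate data bytes, the disassembler would misread data as instructions and lose alignment with the rest of the code.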