Rework IR as a graph of extended basic blocks #104
According to our strategy for extended basic blocks, if the branch instruction that ends the previous basic block is not taken, it is not treated as the end of the basic block. Instead, we keep appending instructions to the previous basic block, creating an extended basic block. However, when we emulate the same block again, the control flow can change, so we need a mechanism to detect this case. The implementation of this approach will be added to |
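The idea above can be sketched as follows. This is a minimal illustration in Python, not the project's code: `Insn`, `build_ebb`, `run_ebb`, the fixed 4-byte instruction size, and the `branch_taken` callback are all assumptions made for the example. A not-taken branch does not end the block during construction, and every branch inside the block is re-checked on later runs so that a changed outcome leaves the block through a side exit.

```python
from dataclasses import dataclass


@dataclass
class Insn:
    pc: int
    is_branch: bool = False
    target: int = 0  # destination when the branch is taken


def build_ebb(program, pc, branch_taken, max_len=64):
    """Collect instructions starting at pc (hypothetical helper).

    program maps pc -> Insn; branch_taken(insn) reports the outcome observed
    during this first emulation. A not-taken branch does not end the block.
    """
    block = []
    while pc in program and len(block) < max_len:
        insn = program[pc]
        block.append(insn)
        if insn.is_branch and branch_taken(insn):
            break  # only a taken branch terminates the extended block
        pc += 4  # assumed fixed-width instructions
    return block


def run_ebb(block, branch_taken):
    """Re-emulate a cached block and return the next pc.

    Each branch is checked again: if one that was not taken at build time
    is taken now, execution leaves the block early at that branch.
    """
    for insn in block:
        if insn.is_branch and branch_taken(insn):
            return insn.target
    return block[-1].pc + 4  # fell off the end of the block
```

With this shape, the check for altered control flow costs one comparison per internal branch, and the rest of the block is simply skipped when a side exit fires.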
Another method we're planning to use to reduce memory usage is to discard basic blocks or extended basic blocks after emulation, rather than adding them to the cache, if they contain fewer instructions than a predetermined threshold. |
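A minimal sketch of that discarding policy, assuming a cache keyed by the block's starting pc. The `BlockCache` class and its `min_insns` parameter are hypothetical names for illustration only.

```python
class BlockCache:
    """Block cache that refuses blocks shorter than a threshold (sketch)."""

    def __init__(self, min_insns):
        self.min_insns = min_insns  # assumed tunable threshold
        self.blocks = {}

    def lookup(self, pc):
        return self.blocks.get(pc)

    def insert(self, pc, insns):
        # Short blocks are emulated once and then dropped instead of cached,
        # trading repeated decoding work for lower memory usage.
        if len(insns) < self.min_insns:
            return False
        self.blocks[pc] = insns
        return True
```

A discarded block has to be decoded again the next time its pc is reached, which is the cost this policy pays for the memory it saves.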
Commit 9f9e41b implements the extended basic block strategy. With extended basic blocks, performance increases by more than 7.6% and memory usage decreases by more than 16%.
Commit 07892d2 implements the strategy of discarding basic blocks that contain few instructions. According to the data, performance declines as we discard more blocks; in exchange, less memory is needed. We think block discarding can be more effective with code-copying and JIT.
|
The results compare the performance of the extended basic block (EBB) and the basic block (BB) on several benchmarks.
dhrystone
nqueens
richards
fib45
|
The results compare the performance of the extended basic block with unconditional jump (EBB) and the basic block (BB) on several benchmarks.
dhrystone
nqueens
richards
fib45
|
Real cases for extended basic block (or super block):
For high level descriptions of libvex IR, check PyVEX. |
Because the previous strategy yields only a limited performance improvement, we propose a new mechanism for implementing extended basic blocks. In the new strategy, any block whose use count exceeds a threshold we set is extended. We extend a basic block by appending the basic blocks on its branch-not-taken and branch-taken paths to it, which lets us emulate a basic block and its branch target with an internal jump rather than an external jump. Commit 3e463e0 implements this strategy. With extended basic blocks, the performance of running
Results
Extended Basic Block Statistics vs Basic Block Statistics
The first statistics image records the number of original basic blocks used more than 100000 times and the number of instructions each contains. We can see that most blocks have few instructions. The second statistics image records the same for extended basic blocks used more than 100000 times. Because we only extend basic blocks that end with a conditional jump instruction, we found that all of the extended basic blocks with fewer than 5 instructions end with an unconditional jump instruction. Nevertheless, extension still resolves the problem that most blocks have few instructions. |
What are your objectives with respect to blocks with few instructions? I mean, we do need solid algorithms rather than unintentional trials. |
Based on our results, we can conclude that the more instructions in a block, the better the performance. However, we do not want the number of instructions in a block to become too large, as this would make it unsuitable for translating to machine code. Moreover, only blocks that are frequently used need to be extended.
We extend a basic block by appending the basic blocks on its branch-not-taken and branch-taken paths to it, which lets us emulate a basic block and its branch target with an internal jump rather than an external jump.
Case 1, Case 2 |
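The threshold-based extension described in this comment could look roughly like the sketch below. The `Block` fields, the `maybe_extend` helper, and the dictionary-based `cache` are assumptions for illustration; the actual commit may be organised differently. Once a block that ends in a conditional branch has been used more often than the threshold, its two successors are attached directly, so the emulator can follow an internal link instead of going back through the block lookup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    pc: int
    taken_pc: Optional[int] = None      # None means no conditional branch
    not_taken_pc: Optional[int] = None
    uses: int = 0
    taken: Optional["Block"] = None     # internal links, set once extended
    not_taken: Optional["Block"] = None


def maybe_extend(block, cache, threshold):
    """Count one use; extend the block if it has become hot.

    Returns True only on the call that actually attaches a successor.
    """
    block.uses += 1
    if block.uses <= threshold:
        return False
    if block.taken_pc is None:
        return False  # only blocks ending in a conditional branch are extended
    if block.taken is not None or block.not_taken is not None:
        return False  # already extended
    block.taken = cache.get(block.taken_pc)
    block.not_taken = cache.get(block.not_taken_pc)
    return block.taken is not None or block.not_taken is not None
```

Only successors that are already present in `cache` get linked here; a real implementation would also have to decide whether to translate a missing successor at that point.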
We might revisit EBB while implementing JIT compiler. At the moment, let's concentrate on #81. |
The intermediate representation (IR) function used in the wip/jit branch is a graph of basic blocks. Extended basic block (EBB; or superblock) is a collection of BBs with one label at the beginning and internal labels, each of which is the target of only one internal jump and no external jumps. EBB may contain internal branch instructions and is closer to how machine code works.
LLVM uses phi instructions in its SSA representation. Cranelift passes arguments to EBBs instead. The two representations are equivalent, but the EBB arguments are better suited to handle EBBs that may contain multiple branches to the same destination block with different arguments. Passing arguments to an EBB looks a lot like passing arguments to a function call, and the register allocator treats them very similarly. Arguments are assigned to registers or stack locations.
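To make the EBB definition concrete, here is a small check written against it. It is only a sketch: the function name `is_ebb` and the edge-list representation of the control flow graph are assumptions, not part of the project. It verifies that every block after the head is the target of exactly one jump, and that this jump comes from inside the candidate EBB.

```python
def is_ebb(members, edges):
    """Check the EBB property for a candidate group of basic blocks.

    members: list of block labels, head first.
    edges: list of (src, dst) pairs for the whole control flow graph.
    """
    inside = set(members)
    for label in members[1:]:
        preds = [src for src, dst in edges if dst == label]
        # An internal label must have exactly one predecessor, and that
        # predecessor must itself belong to the EBB (no external jumps).
        if len(preds) != 1 or preds[0] not in inside:
            return False
    return True
```

For a diamond-shaped graph A -> B, A -> C, B -> D, C -> D, the group [A, B, C] passes, while any group containing D as a non-head member fails because D has two predecessors.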
The definition of Control flow graph (CFG):
Extended basic block
We can identify loops by using dominators: a node A dominates a node B if every path from the entry node to B includes A. The goal of dominators and postdominators is to determine loops in the flowgraph.
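As an illustration of loop detection with dominators, the sketch below uses the classic iterative data-flow formulation and then reports back edges, that is, edges whose destination dominates their source. The function names and the adjacency-dict representation (every node must appear as a key) are assumptions for this example.

```python
def dominators(cfg, entry):
    """Return {node: set of nodes that dominate it} for a CFG.

    cfg maps each node to a list of successors; every node is a key.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | set.intersection(*pred_doms) if pred_doms else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def back_edges(cfg, entry):
    """Edges (src, dst) where dst dominates src; each one marks a loop."""
    dom = dominators(cfg, entry)
    return [(src, dst) for src in cfg for dst in cfg[src] if dst in dom[src]]
```

For example, in a graph where `entry` leads to `A`, `A` to `B`, and `B` branches to either `A` or `exit`, the only back edge is `("B", "A")`, so `A` is the header of the loop.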
Use case: ARMware, an ARMv4 / Compaq iPAQ emulator, has a built-in threaded code engine that caches an EBB (extended basic block) of ARM code to increase execution speed. Furthermore, ARMware has a built-in dynamic compiler that translates an EBB of ARM code into a block of x86 machine code, which increases runtime performance dramatically. The optimization techniques implemented in this dynamic compiler include: