This repository has been archived by the owner on Aug 12, 2022. It is now read-only.

How to map the branch prediction algorithm to the RTL code #14

Open
Grubby-CPU opened this issue Aug 31, 2020 · 2 comments

Comments

@Grubby-CPU

Grubby-CPU commented Aug 31, 2020

I read the branch prediction section; it describes a common technique and is easy to follow.
However, when I tried to map that description onto the RTL in iuq_bp.vhdl, I could not see how the two correspond.

For example, according to Figure D-3 in A2_BGQ.pdf, the BP table has 1024 entries and is indexed by IFAR(50:59). The code below, however, generates only an 8-bit address, which confuses me.

    iu1_bh_ti0gs1_rd_addr(0 to 7) <= (ic_bp_iu1_ifar(52 to 55) xor iu1_gshare(0 to 3)) & ic_bp_iu1_ifar(56 to 59);
    iu1_bh_ti1gs1_rd_addr(0 to 7) <= iu1_tid_enc(0 to 1) & (ic_bp_iu1_ifar(54 to 57) xor iu1_gshare(0 to 3)) & ic_bp_iu1_ifar(58 to 59);
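To make the bit slicing in these two assignments easier to follow, here is a minimal behavioural sketch in Python. It is not taken from the repository. It assumes that `ic_bp_iu1_ifar` is a 62-bit vector numbered `(0 to 61)` with bit 0 most significant, and that `iu1_gshare` and `iu1_tid_enc` can be treated as plain 4-bit and 2-bit integers. The helper `bits` and the function names are hypothetical.

```python
# Behavioural model of the two read-address equations quoted above.
# Assumption: ifar is a (0 to 61) vector, so bit 61 is the least significant bit.
IFAR_LSB = 61


def bits(value, first, last, lsb=IFAR_LSB):
    """Extract value(first to last) using big-endian (bit 0 = MSB) numbering."""
    width = last - first + 1
    return (value >> (lsb - last)) & ((1 << width) - 1)


def rd_addr_ti0gs1(ifar, gshare):
    """(ifar(52 to 55) xor gshare(0 to 3)) & ifar(56 to 59)"""
    hashed = bits(ifar, 52, 55) ^ (gshare & 0xF)
    return (hashed << 4) | bits(ifar, 56, 59)


def rd_addr_ti1gs1(ifar, gshare, tid_enc):
    """tid_enc(0 to 1) & (ifar(54 to 57) xor gshare(0 to 3)) & ifar(58 to 59)"""
    hashed = bits(ifar, 54, 57) ^ (gshare & 0xF)
    return ((tid_enc & 0x3) << 6) | (hashed << 2) | bits(ifar, 58, 59)


# Example: ifar(52 to 61) = 1010 0110 11, all higher bits zero.
example_ifar = 0b1010011011
print(f"{rd_addr_ti0gs1(example_ifar, 0b0011):08b}")
print(f"{rd_addr_ti1gs1(example_ifar, 0b0011, 0b10):08b}")
```

In both variants the result is 8 bits wide. The second variant spends two of those bits on the thread ID, so fewer IFAR bits reach the address.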

I traced the signal "iu1_bh_ti0gs1_rd_addr" and found that it is an input to tri_bht.vhdl. "ary_r_data" seems to be the output of the BP table, but what is "data_out", which depends on INIT_MASK and "r_addr_q(0)"? The code below also confuses me.

    data_out(0 to 7) <= gate(ary_r_data(0 to 7) xor (INIT_MASK(0 to 1) & INIT_MASK(0 to 1) & INIT_MASK(0 to 1) & INIT_MASK(0 to 1)), r_addr_q(0) = '0') or
                        gate(ary_r_data(8 to 15) xor (INIT_MASK(0 to 1) & INIT_MASK(0 to 1) & INIT_MASK(0 to 1) & INIT_MASK(0 to 1)), r_addr_q(0) = '1');
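The `data_out` assignment can be read as a 2-to-1 multiplexer followed by an XOR with a replicated 2-bit mask. The Python sketch below models that reading. It assumes that `gate(v, c)` passes `v` when `c` is true and returns all zeros otherwise, and that bit 0 of each vector is the most significant bit. One plausible reason for the XOR, not confirmed anywhere in this thread, is that it lets an all-zero array read back as a non-zero initial counter value.

```python
def replicate_mask(init_mask):
    """INIT_MASK(0 to 1) concatenated four times into one 8-bit value."""
    m = init_mask & 0x3
    return (m << 6) | (m << 4) | (m << 2) | m


def gate(vector, enable):
    """Assumed behaviour of gate(): pass the vector when enabled, else zero."""
    return vector if enable else 0


def data_out(ary_r_data, r_addr_q0, init_mask):
    """Select one 8-bit half of the 16-bit array word and XOR it with the mask."""
    mask8 = replicate_mask(init_mask)
    upper = (ary_r_data >> 8) & 0xFF  # ary_r_data(0 to 7)
    lower = ary_r_data & 0xFF         # ary_r_data(8 to 15)
    return gate(upper ^ mask8, r_addr_q0 == 0) | gate(lower ^ mask8, r_addr_q0 == 1)


# An all-zero word reads back as the mask repeated four times.
print(f"{data_out(0x0000, 0, 0b10):08b}")
```

Only one of the two `gate` terms is ever non-zero, so the final OR acts as the multiplexer output.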

After the read, the data is processed again in iuq_bp.vhdl:

```vhdl
with ic_bp_iu3_ifar(60 to 61) select
iu3_0_br_hist <= iu3_3_bh_rd_data(0 to 1) when "11",
                 iu3_2_bh_rd_data(0 to 1) when "10",
                 iu3_1_bh_rd_data(0 to 1) when "01",
                 iu3_0_bh_rd_data(0 to 1) when others;

with ic_bp_iu3_ifar(60 to 61) select
iu3_1_br_hist <= iu3_3_bh_rd_data(0 to 1) when "10",
                 iu3_2_bh_rd_data(0 to 1) when "01",
                 iu3_1_bh_rd_data(0 to 1) when others;

with ic_bp_iu3_ifar(60 to 61) select
iu3_2_br_hist <= iu3_3_bh_rd_data(0 to 1) when "01",
                 iu3_2_bh_rd_data(0 to 1) when others;

iu3_3_br_hist <= iu3_3_bh_rd_data(0 to 1);
```
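The four selected assignments above can be tabulated in a few lines of Python. This is only a behavioural sketch: the function name `rotate_br_hist` is hypothetical, `rd_data` stands for the four values `iu3_0_bh_rd_data(0 to 1)` through `iu3_3_bh_rd_data(0 to 1)`, and `offset` stands for `ic_bp_iu3_ifar(60 to 61)` as an integer.

```python
# Per output slot: which rd_data index is chosen for each explicit "when" value.
# Any offset not listed falls through to "when others", i.e. the slot's own index.
SELECT = {
    0: {0b11: 3, 0b10: 2, 0b01: 1},
    1: {0b10: 3, 0b01: 2},
    2: {0b01: 3},
    3: {},
}


def rotate_br_hist(rd_data, offset):
    """Return [iu3_0_br_hist, ..., iu3_3_br_hist] for a given ifar(60 to 61)."""
    return [rd_data[SELECT[slot].get(offset, slot)] for slot in range(4)]


for offset in range(4):
    print(f"{offset:02b}", rotate_br_hist(["rd0", "rd1", "rd2", "rd3"], offset))
```

Put differently, slot `i` takes `rd_data[i + offset]` whenever that index exists and otherwise keeps `rd_data[i]`.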

Can someone explain how to read this code from an architectural point of view? Any hints are welcome.

@openpowerwtf
Collaborator

As a start...

tri_bht uses the array tri_128x16_1r1w_1 (128x16, 1 read, 1 write = 2K bits). There are four 2-bit data outputs of tri_bht. It uses ra(1:7) to read its array, then selects hi/lo data using ra(0). So it acts like a 256x8b access (four entries read at once).
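The access pattern described above (a 128x16 array behaving like a 256x8 one) can be sketched in Python as follows. The sketch assumes that `ra(0)` is the most significant bit of the 8-bit read address, that `ra(0) = 0` selects data bits `(0 to 7)`, and that the four 2-bit outputs are taken left to right from the selected byte. The INIT_MASK XOR is left out here, and all function names are hypothetical.

```python
def split_read_addr(ra):
    """Split ra(0 to 7) into the array row ra(1 to 7) and the hi/lo select ra(0)."""
    half = (ra >> 7) & 0x1  # ra(0), the MSB
    row = ra & 0x7F         # ra(1 to 7)
    return row, half


def unpack_counters(data8):
    """Split an 8-bit value into four 2-bit entries, leftmost pair first."""
    return [(data8 >> shift) & 0x3 for shift in (6, 4, 2, 0)]


def read_bht(array, ra):
    """Read one 16-bit row, pick a byte with ra(0), return four 2-bit entries."""
    row, half = split_read_addr(ra)
    word = array[row]
    data8 = (word >> 8) & 0xFF if half == 0 else word & 0xFF
    return unpack_counters(data8)


array = [0] * 128          # 128 rows of 16 bits
array[5] = 0x1BE4
print(read_bht(array, 0x05), read_bht(array, 0x85))
```

Addresses `0x05` and `0x85` hit the same physical row and differ only in which byte of the row is returned.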

ifar(60:61) drive the 'Branch History Rotate' shown in the figure, which is a partial left shift.

I think the figure implies 4 arrays, which would be 1K x 2b x 4 = 8Kb. I suspect that is wrong (probably changed at some point), and ifar(50:51) are not part of the selection; the address is created like you show, using 52:59. The text says:

> The BHT consists of 1024 2-bit counters.
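The sizes mentioned in this comment can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the thread. Counting the 8-bit read address plus ifar(60:61) as a 10-bit index is an interpretation, not something the documentation states.

```python
ROWS = 128            # tri_128x16_1r1w_1: 128 rows
ROW_BITS = 16         # 16 bits per row
COUNTER_BITS = 2      # each BHT entry is a 2-bit counter

array_bits = ROWS * ROW_BITS                 # capacity of one array
counters = array_bits // COUNTER_BITS        # 2-bit counters that fit in it

read_addresses = 2 ** 8                      # 8-bit read address
counters_per_read = 4                        # four 2-bit outputs per access

figure_bits = 4 * 1024 * COUNTER_BITS        # four 1K x 2b arrays, as the figure implies

index_bits = 8 + 2                           # read address plus ifar(60:61)

print(array_bits, counters, read_addresses * counters_per_read, figure_bits, 2 ** index_bits)
```

Under this reading, the single 2 Kb array already holds the 1024 counters the text mentions, and the figure's four-array layout would be four times larger.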

@Grubby-CPU
Author

Grubby-CPU commented Sep 1, 2020

Thanks! The code is much clearer now.
I think 'Branch History Rotate' handles the case where ifar(60:61) is not aligned, right? For example, if ifar(60:61) is 2'b10, then
"iu3_0_br_hist" is "iu3_2_bh_rd_data(0 to 1)",
"iu3_1_br_hist" is "iu3_3_bh_rd_data(0 to 1)",
"iu3_2_br_hist" is "iu3_2_bh_rd_data" and
"iu3_3_br_hist" is "iu3_3_bh_rd_data".

What does this mean architecturally? It seems to me that iu3_0_br_hist and iu3_1_br_hist pick up the right BHT entries. But do iu3_2_br_hist and iu3_3_br_hist then hold the wrong ones?
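The thread does not answer this question. One possible reading, given here purely as an assumption, is that ifar(60:61) gives the position of the first fetched instruction inside a four-instruction group. In that case only the first `4 - offset` output slots would correspond to real instructions, and the remaining slots would simply pass their own read data through as don't-care values. The Python sketch below classifies each slot under that assumption. The function name and the labels are hypothetical.

```python
def classify_slots(offset):
    """For each output slot return (source rd_data index, kind).

    'shifted' means the slot takes rd_data[slot + offset];
    'pass-through' means slot + offset would exceed 3, so the slot
    keeps rd_data[slot] (assumed here to be a don't-care value).
    """
    result = []
    for slot in range(4):
        if slot + offset <= 3:
            result.append((slot + offset, "shifted"))
        else:
            result.append((slot, "pass-through"))
    return result


for slot, (source, kind) in enumerate(classify_slots(0b10)):
    print(f"iu3_{slot}_br_hist <- iu3_{source}_bh_rd_data ({kind})")
```

For an offset of 2 this reproduces the mapping listed in the example above, with slots 2 and 3 falling into the pass-through group.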
