Issue: floating-point loads may corrupt integer loads #36

Open
flaviens opened this issue Jun 26, 2024 · 2 comments

@flaviens

Hi there!

I encountered an issue that I cannot explain at the moment, although I cannot formally rule out that it is related to my setup.

When I execute the snippet below, I get a mismatch with Spike in registers s1, a0, a1 and a3. The discrepancy disappears when I replace the (unused) floating-point loads with nops. The mismatch is the following:

Expected:

ra: 0xe04421c9224caa89
sp: 0xc61cd5c8311c246e
gp: 0x95b25c9b2351c1d5
tp: 0x1ea3cb34fe8ff474
t0: 0xf46b31db8d0787ed
t1: 0x4edaeb36851a035f
t2: 0xdd823e7ce5bccc33
s0: 0xff7bee40ac9924ee
s1: 0x2f83fdcaed66b4a9
a0: 0xae49d1b5c6b68041
a1: 0xf04cdf3459e50963
a2: 0x5eb964b31312b5c8
a3: 0x74c0e29e81d2a1b0
a4: 0xe96758395029650e
a5: 0x8000012c
a6: 0x6f67b4f4d76e5f94
a7: 0xe9813374733db707
s2: 0xfd5dc052e66d68be
s3: 0xcbeb5aac59031ecf
s4: 0x37b70270175d3ad8
s5: 0xd3384a1992e2b47a
s6: 0x139faa8e8cdb810c
s7: 0xfff3dbc8bf652496

Got:

ra: 0xe04421c9224caa89
sp: 0xc61cd5c8311c246e
gp: 0x95b25c9b2351c1d5
tp: 0x1ea3cb34fe8ff474
t0: 0xf46b31db8d0787ed
t1: 0x4edaeb36851a035f
t2: 0xdd823e7ce5bccc33
s0: 0xff7bee40ac9924ee
s1: 0x74c0e29e81d2a1b0
a0: 0x5eb964b31312b5c8
a1: 0x78c5c22d14a5aef
a2: 0x5eb964b31312b5c8
a3: 0xd3384a1992e2b47a
a4: 0xe96758395029650e
a5: 0x8000012c
a6: 0x6f67b4f4d76e5f94
a7: 0xe9813374733db707
s2: 0xfd5dc052e66d68be
s3: 0xcbeb5aac59031ecf
s4: 0x37b70270175d3ad8
s5: 0xd3384a1992e2b47a
s6: 0x139faa8e8cdb810c
s7: 0xfff3dbc8bf652496
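
The two dumps are easier to compare mechanically. The following is a minimal sketch (not part of the original report) that parses both listings, prints the registers that differ, and checks whether a wrong value happens to be the expected value of another register. The names `parse` and `diff` are only illustrative.

```python
# Register dumps copied verbatim from the "Expected" and "Got" listings above.
EXPECTED = """
ra: 0xe04421c9224caa89
sp: 0xc61cd5c8311c246e
gp: 0x95b25c9b2351c1d5
tp: 0x1ea3cb34fe8ff474
t0: 0xf46b31db8d0787ed
t1: 0x4edaeb36851a035f
t2: 0xdd823e7ce5bccc33
s0: 0xff7bee40ac9924ee
s1: 0x2f83fdcaed66b4a9
a0: 0xae49d1b5c6b68041
a1: 0xf04cdf3459e50963
a2: 0x5eb964b31312b5c8
a3: 0x74c0e29e81d2a1b0
a4: 0xe96758395029650e
a5: 0x8000012c
a6: 0x6f67b4f4d76e5f94
a7: 0xe9813374733db707
s2: 0xfd5dc052e66d68be
s3: 0xcbeb5aac59031ecf
s4: 0x37b70270175d3ad8
s5: 0xd3384a1992e2b47a
s6: 0x139faa8e8cdb810c
s7: 0xfff3dbc8bf652496
"""

# Only four lines differ from EXPECTED; the rest are identical in the issue.
GOT = (EXPECTED
       .replace("s1: 0x2f83fdcaed66b4a9", "s1: 0x74c0e29e81d2a1b0")
       .replace("a0: 0xae49d1b5c6b68041", "a0: 0x5eb964b31312b5c8")
       .replace("a1: 0xf04cdf3459e50963", "a1: 0x78c5c22d14a5aef")
       .replace("a3: 0x74c0e29e81d2a1b0\na4", "a3: 0xd3384a1992e2b47a\na4"))


def parse(dump):
    """Turn 'name: 0xvalue' lines into an ordered dict of integers."""
    regs = {}
    for line in dump.strip().splitlines():
        name, value = line.split(":")
        regs[name.strip()] = int(value, 16)
    return regs


def diff(expected, got):
    """Return (register, expected, got, register whose expected value was seen)."""
    owner = {value: name for name, value in expected.items()}
    return [(name, want, got[name], owner.get(got[name]))
            for name, want in expected.items() if got[name] != want]


REPORT = diff(parse(EXPECTED), parse(GOT))
for name, want, have, source in REPORT:
    hint = f"(expected value of {source})" if source else "(not any register's expected value)"
    print(f"{name}: expected {want:#018x}, got {have:#018x} {hint}")
```

Three of the four wrong values are the expected contents of other registers, which hints at loads returning data from the wrong address rather than random corruption.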

Snippet:

  .section ".text.init","ax",@progbits
  .globl _start
  .align 2
_start:

  li      sp, 1
  slli    sp, sp, 31
  addi    sp, sp, 16
  jr      sp

  lui     t6,0x80000
  mv      t6,t6
  slli    t6,t6,0x20
  srli    t6,t6,0x20
  csrw    medeleg,zero
  csrw    mtvec,zero
  csrw    stvec,zero
  li      ra,31
  csrw    pmpcfg0,ra
  li      ra,1
  slli    ra,zero,0x36
  addi    ra,ra,-1
  csrw    pmpaddr0,ra
  csrw    mcycle,zero
  csrw    minstret,zero
  csrw    mcause,zero
  csrw    mtval,zero
  csrw    mscratch,zero
  csrw    scause,zero
  csrw    stval,zero
  csrw    sscratch,zero
  lui     t4,0x2
  csrw    mstatus,t4
  li      ra,0
  fscsr   ra
  srli    s11,t4,0x1
  srli    t3,t4,0x2
  or      t3,t3,s11
  nop
  csrs    mstatus,t3
  csrs    mstatus,s11
  srli    s10,t4,0x5
  csrs    mstatus,s10
  li      t5,-1
  slli    t5,t5,0x20
  not     t5,t5
  add     s7,zero,t6
  addi    s7,s7,304
  fld     ft0,184(s7)
  fld     ft1,192(s7)
  fld     ft2,200(s7)
  fld     ft3,208(s7)
  fld     ft4,216(s7)
  fld     ft5,224(s7)
  fld     ft6,232(s7)
  fld     ft7,240(s7)
  fld     fs0,248(s7)
  # Replace the floating-point loads with the nops below to make it work
  # nop
  # nop
  # nop
  # nop
  # nop
  # nop
  # nop
  # nop
  # nop
  nop
  ld      ra,0(s7)
  ld      sp,8(s7)
  ld      gp,16(s7)
  ld      tp,24(s7)
  ld      t0,32(s7)
  ld      t1,40(s7)
  ld      t2,48(s7)
  ld      s0,56(s7)
  ld      s1,64(s7)
  ld      a0,72(s7)
  ld      a1,80(s7)
  ld      a2,88(s7)
  ld      a3,96(s7)
  ld      a4,104(s7)
  ld      a5,112(s7)
  ld      a6,120(s7)
  ld      a7,128(s7)
  ld      s2,136(s7)
  ld      s3,144(s7)
  ld      s4,152(s7)
  ld      s5,160(s7)
  ld      s6,168(s7)
  ld      s7,176(s7)
  jal     a5,first_block

  .insn 2, 0x0000           
  .insn 2, 0x0000           
  .insn 2, 0xaa89           
  .insn 2, 0x224c           
  .insn 2, 0x21c9           
  .insn 2, 0xe044           
  .insn 2, 0x246e           
  .insn 2, 0x311c           
  .insn 2, 0xd5c8           
  .insn 2, 0xc61c           
  .insn 2, 0xc1d5           
  .insn 2, 0x2351           
  .insn 4, 0x95b25c9b       
  .insn 2, 0xf474           
  .insn 4, 0xcb34fe8f       
  .insn 4, 0x87ed1ea3       
  .insn 4, 0x31db8d07       
  .insn 4, 0x035ff46b       
  .insn 2, 0x851a           
  .insn 2, 0xeb36           
  .insn 2, 0x4eda           
  .insn 4, 0xe5bccc33       
  .insn 2, 0x3e7c           
  .insn 2, 0xdd82           
  .insn 2, 0x24ee           
  .insn 2, 0xac99           
  .insn 2, 0xee40           
  .insn 4, 0xb4a9ff7b       
  .insn 2, 0xed66           
  .insn 2, 0xfdca           
  .insn 4, 0x80412f83       
  .insn 2, 0xc6b6           
  .insn 2, 0xd1b5           
  .insn 2, 0xae49           
  .insn 4, 0x59e50963       
  .insn 2, 0xdf34           
  .insn 2, 0xf04c           
  .insn 2, 0xb5c8           
  .insn 2, 0x1312           
  .insn 4, 0x5eb964b3       
  .insn 2, 0xa1b0           
  .insn 2, 0x81d2           
  .insn 2, 0xe29e           
  .insn 2, 0x74c0           
  .insn 2, 0x650e           
  .insn 2, 0x5029           
  .insn 2, 0x5839           
  .insn 4, 0x72b0e967       
  .insn 2, 0x168c           
  .insn 2, 0x37c2           
  .insn 2, 0x3520           
  .insn 2, 0x5f94           
  .insn 2, 0xd76e           
  .insn 2, 0xb4f4           
  .insn 4, 0xb7076f67       
  .insn 2, 0x733d           
  .insn 2, 0x3374           
  .insn 2, 0xe981           
  .insn 2, 0x68be           
  .insn 2, 0xe66d           
  .insn 2, 0xc052           
  .insn 2, 0xfd5d           
  .insn 4, 0x59031ecf       
  .insn 2, 0x5aac           
  .insn 4, 0x3ad8cbeb       
  .insn 2, 0x175d           
  .insn 2, 0x0270           
  .insn 4, 0xb47a37b7       
  .insn 2, 0x92e2           
  .insn 2, 0x4a19           
  .insn 2, 0xd338           
  .insn 2, 0x810c           
  .insn 4, 0xaa8e8cdb       
  .insn 6, 0xbf652496139f 
  .insn 2, 0xdbc8           
  .insn 4, 0x21fffff3       
  .insn 2, 0x75d6           
  .insn 2, 0xf5ff           
  .insn 4, 0x3af702cf       
  .insn 2, 0xf45a           
  .insn 2, 0x28f6           
  .insn 2, 0x02f9           
  .insn 2, 0x56ea           
  .insn 2, 0xa4e5           
  .insn 2, 0x5a5a           
  .insn 2, 0xe755           
  .insn 6, 0x8248e288811f  
  .insn 2, 0xd1c5           
  .insn 2, 0xe93d           
  .insn 4, 0xba33fa0b       
  .insn 2, 0x5a28           
  .insn 2, 0x6af5           
  .insn 2, 0xb045           
  .insn 2, 0xe47e           
  .insn 2, 0xc6b4           
  .insn 4, 0xe9405d93       
  .insn 2, 0xafc4           
  .insn 2, 0x3cd1           
  .insn 4, 0xd14a5aef       
  .insn 2, 0x5c22           
  .insn 2, 0x078c           
  .insn 2, 0x95d0           
  .insn 2, 0x342c           
  .insn 2, 0x7571           
  .insn 2, 0x2402           
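
To see where the wrong register values come from, the data block above can be rebuilt as bytes and searched. This is a rough sketch, not part of the original report. It assumes the assembler emits each `.insn` value little-endian, and that `s7` points 4 bytes into the block (`s7` is 0x80000000 + 304 = 0x80000130, while the link value in `a5`, 0x8000012c, is the address right after the `jal`). The helper names are hypothetical.

```python
# The .insn operands from the block above, in order. The number of hex digits
# encodes the size: 4 digits = 2 bytes, 8 digits = 4 bytes, 12 digits = 6 bytes.
DATA = """
0000 0000 aa89 224c 21c9 e044 246e 311c d5c8 c61c c1d5 2351 95b25c9b
f474 cb34fe8f 87ed1ea3 31db8d07 035ff46b 851a eb36 4eda e5bccc33 3e7c dd82
24ee ac99 ee40 b4a9ff7b ed66 fdca 80412f83 c6b6 d1b5 ae49 59e50963 df34 f04c
b5c8 1312 5eb964b3 a1b0 81d2 e29e 74c0 650e 5029 5839 72b0e967 168c 37c2 3520
5f94 d76e b4f4 b7076f67 733d 3374 e981 68be e66d c052 fd5d 59031ecf 5aac
3ad8cbeb 175d 0270 b47a37b7 92e2 4a19 d338 810c aa8e8cdb bf652496139f dbc8
21fffff3 75d6 f5ff 3af702cf f45a 28f6 02f9 56ea a4e5 5a5a e755 8248e288811f
d1c5 e93d ba33fa0b 5a28 6af5 b045 e47e c6b4 e9405d93 afc4 3cd1 d14a5aef 5c22
078c 95d0 342c 7571 2402
"""

BASE = 4  # assumption: s7 (0x80000130) minus start of data (0x8000012c)


def assemble(tokens):
    """Concatenate the tokens as little-endian values of their own width."""
    blob = bytearray()
    for tok in tokens.split():
        blob += int(tok, 16).to_bytes(len(tok) // 2, "little")
    return bytes(blob)


BLOB = assemble(DATA)


def word_at(offset):
    """64-bit little-endian word at the given offset from s7."""
    start = BASE + offset
    return int.from_bytes(BLOB[start:start + 8], "little")


def offset_of(value):
    """s7-relative offset of the first aligned 64-bit word equal to value."""
    for offset in range(0, len(BLOB) - BASE, 8):
        if word_at(offset) == value:
            return offset
    return None


# Wrong values taken from the "Got" dump.
WRONG = {"s1": 0x74c0e29e81d2a1b0, "a0": 0x5eb964b31312b5c8,
         "a1": 0x078c5c22d14a5aef, "a3": 0xd3384a1992e2b47a}
for reg, value in WRONG.items():
    print(f"{reg} holds the word stored at {offset_of(value)}(s7)")
```

Under these assumptions, the wrong values of `s1`, `a0` and `a3` sit at the offsets used by `ld a3`, `ld a2` and `ld s5`, and the wrong value of `a1` sits at offset 240, the slot read by `fld ft7`. That would be consistent with some integer loads receiving the data of a different in-flight load.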


first_block:
  lui     t5,0x0
  addi    t5,t5,16 # 0x10
  fence
  sd      ra,0(t5)
  fence
  sd      sp,0(t5)
  fence
  sd      gp,0(t5)
  fence
  sd      tp,0(t5)
  fence
  sd      t0,0(t5)
  fence
  sd      t1,0(t5)
  fence
  sd      t2,0(t5)
  fence
  sd      s0,0(t5)
  fence
  sd      s1,0(t5)
  fence
  sd      a0,0(t5)
  fence
  sd      a1,0(t5)
  fence
  sd      a2,0(t5)
  fence
  sd      a3,0(t5)
  fence
  sd      a4,0(t5)
  fence
  sd      a5,0(t5)
  fence
  sd      a6,0(t5)
  fence
  sd      a7,0(t5)
  fence
  sd      s2,0(t5)
  fence
  sd      s3,0(t5)
  fence
  sd      s4,0(t5)
  fence
  sd      s5,0(t5)
  fence
  sd      s6,0(t5)
  fence
  sd      s7,0(t5)
  fence
  lui     t5,0x0
  mv      t5,t5
  sd      zero,0(t5) # 0x0
  fence

  # Infinite loop
infiniteloop:
  j infiniteloop

For the address map, I hardcode a dummy configuration in C910_RTL_FACTORY/gen_rtl/mmu/rtl/ct_mmu_sysmap.v as follows:

assign sysmap_comp_hit0 = 1'b1;
assign sysmap_comp_hit1 = 1'b1;
assign sysmap_comp_hit2 = 1'b1;
assign sysmap_comp_hit3 = 1'b1;
assign sysmap_comp_hit4 = 1'b1;
assign sysmap_comp_hit5 = 1'b1;
assign sysmap_comp_hit6 = 1'b1;
assign sysmap_comp_hit7 = 1'b1;

Is this a known issue? Can someone else reproduce it?

Thanks!
Flavien

@flaviens
Author

Hi again! I was able to reduce the code a bit and make a few observations.

Here is the new code, with a similar mismatch:

  .section ".text.init","ax",@progbits
  .globl _start
  .align 2
_start:

  # Make sure we start at 0x80000000 and not at 0x0 in case the memory aliases.
  li      sp,1
  slli    sp,sp,0x1f
  addi    sp,sp,16
  jr      sp
  # Some init
  lui     t6,0x80000
  mv      t6,t6
  slli    t6,t6,0x20
  srli    t6,t6,0x20
  csrw    medeleg,zero
  csrw    mtvec,zero
  csrw    stvec,zero
  lui     t4,0x2
  csrw    mstatus,t4
  li      ra,0
  fscsr   ra
  srli    s10,t4,0x5
  csrs    mstatus,s10
  li      t5,-1
  slli    t5,t5,0x20
  not     t5,t5
  # Load floats and ints. The floats are not used but seem to be necessary to cause the issue.
  add     tp,zero,t6
  addi    tp,tp,128 # 0x80
  fld     ft0,32(tp) # 0x20
  fld     ft1,40(tp) # 0x28
  fld     ft2,48(tp) # 0x30
  fld     ft3,56(tp) # 0x38
  fld     ft4,64(tp) # 0x40
  ld      ra,0(tp) # 0x0
  ld      sp,8(tp) # 0x8
  ld      gp,16(tp) # 0x10
  ld      tp,24(tp) # 0x18
  jal     first_block

  # Some dummy data
  .insn 2, 0x0
  .insn 2, 0x0000
  .insn 2, 0x0000
  .insn 2, 0x0
  .insn 2, 0x3334    
  .insn 2, 0xb56d    
  .insn 4, 0xc9ac9f77
  .insn 2, 0x5db5    
  .insn 4, 0x09df2983
  .insn 2, 0x664e    
  .insn 2, 0x989c    
  .insn 4, 0xfaad7697
  .insn 4, 0x31b4373b
  .insn 2, 0x0000    
  .insn 2, 0x0000
  .insn 2, 0x0000
  .insn 2, 0x0000
  .insn 2, 0x0000
  .insn 2, 0x0000    
  .insn 2, 0x0000    
  .insn 2, 0x0000    
  .insn 2, 0x0000    
  .insn 2, 0x0000    
  .insn 2, 0x0000
  .insn 2, 0x0000
  .insn 2, 0x0000    
  .insn 2, 0x0000    
  .insn 2, 0x0000    
  .insn 2, 0x0000
  .insn 2, 0x0000
  .insn 2, 0x0000    
  .insn 2, 0x0000    

# The alignment of this block matters. For example, an alignment of 0xc1c will show the bug, but not an alignment of 0x1000. 
.section ".text.firstblock","ax",@progbits
.globl _start
.align 2
first_block:
  # Dump the register values
  lui     t5,0x0
  addi    t5,t5,16 # 0x10
  fence
  sd      ra,0(t5)
  fence
  sd      sp,0(t5)
  fence
  sd      gp,0(t5)
  fence
  sd      tp,0(t5)
  fence

# This code here is actually what will be stored into `tp`.
.rept 32
  nop
.endr

  # Add 1 to the value and store it again, to see whether the problem is in the register value or in the store operation.
  # The conclusion is the former.
  addi tp, tp, 1
  sd      tp,0(t5)
  fence

  # Complete the testbench.
  lui     t5,0x0
  mv      t5,t5
  sd      zero,0(t5) # 0x0
  fence

  # Infinite loop
infiniteloop:
  j infiniteloop

Some observations:

  • The issue seems to depend on the alignment of the first_block section.
  • tp ends up holding the encoding of code that follows the store (the nops in first_block), instead of the data loaded from 24(tp).
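
One way to check the second observation on a dumped value is to compare it against the encodings of the instructions that follow the store. The sketch below is only an illustration: the observed `tp` value is not given in this thread, and whether the assembler emits `nop` as the 32-bit `addi x0, x0, 0` (0x00000013) or as the 16-bit `c.nop` (0x0001) depends on the build options, so both are tried. The function name `classify` is hypothetical.

```python
NOP = 0x00000013    # addi x0, x0, 0
C_NOP = 0x0001      # compressed nop, used when the C extension is enabled

# 64-bit patterns a load would return if it read from the nop sled.
TWO_NOPS = NOP | (NOP << 32)
FOUR_C_NOPS = sum(C_NOP << (16 * i) for i in range(4))


def classify(value):
    """Say whether a 64-bit value looks like it was fetched from a run of nops.

    The second dump in the snippet stores tp + 1, so value - 1 is checked too.
    """
    for candidate, note in ((value, ""), (value - 1, " (after addi tp, tp, 1)")):
        if candidate == TWO_NOPS:
            return "two 32-bit nops" + note
        if candidate == FOUR_C_NOPS:
            return "four c.nop" + note
    return None


print(classify(0x0000001300000013))
print(classify(0x0000001300000014))
print(classify(0x31b4373bfaad7697))
```

The last call uses the word that `ld tp,24(tp)` is expected to load according to the dummy data in the snippet, and it is not classified as nops.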

Has anyone observed something similar?

Thanks!
Flavien

@flaviens
Author

Also, there are occurrences that do not involve floating-point operations or floating-point loads.
