
Implement hart state management #22
Open
chiangkd opened this issue Jul 23, 2023 · 9 comments

chiangkd (Collaborator) commented Jul 23, 2023

During emulation, the semu_start() function initializes a single HART with a hartid of 0 and a start_addr of (RAM_SIZE - 1024 * 1024), so only one HART is present throughout the emulation process.

However, HSM (Hart State Management) is a mechanism for managing the state of processor cores and is commonly applied in multi-core processor systems. It involves saving and restoring a HART's register state, program counter, and other contents when switching between tasks or processes.

In a single-core processor system (with only one HART), frequent HART state switching is not required.
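
For reference, a minimal sketch of how semu_start() might be generalized to bring up several harts; hart_init(), hart_t, and the running field are hypothetical names for illustration, not semu's current API:

/* Hedged sketch: generalizing semu_start() to N harts. hart_init(),
 * hart_t, and the 'running' field are illustrative, not semu's API. */
#define BOOT_ADDR (RAM_SIZE - 1024 * 1024)

static void semu_start_harts(hart_t **harts, uint32_t num_harts)
{
    for (uint32_t i = 0; i < num_harts; i++) {
        hart_init(harts[i], /* hartid */ i, /* start_addr */ BOOT_ADDR);
        /* Only hart 0 boots immediately; the others stay stopped until
         * the SBI HSM extension starts them. */
        harts[i]->running = (i == 0);
    }
}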

Expectations:
Implement a multi-core RISC-V processor system that follows the HSM state machine:
[Figure: HSM hart state machine diagram]
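
For reference, the SBI specification's HSM extension enumerates the hart states with the following values (returned by sbi_hart_get_status); a minimal sketch:

/* HSM hart states per the RISC-V SBI specification (HSM extension). */
enum sbi_hsm_state {
    SBI_HSM_STATE_STARTED = 0,
    SBI_HSM_STATE_STOPPED = 1,
    SBI_HSM_STATE_START_PENDING = 2,
    SBI_HSM_STATE_STOP_PENDING = 3,
};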

Validation:
Check cat /proc/cpuinfo; the current output is as follows:

# cat /proc/cpuinfo
processor	: 0
hart		: 0
isa		: rv32ima
mmu		: sv32
mvendorid	: 0x12345678
marchid		: 0x80000001
mimpid		: 0x1
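
Once a second hart comes online, /proc/cpuinfo should print one block per processor, roughly as follows (assuming both harts report the same ISA and MMU strings):

# cat /proc/cpuinfo
processor	: 0
hart		: 0
isa		: rv32ima
mmu		: sv32
...

processor	: 1
hart		: 1
isa		: rv32ima
mmu		: sv32
...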
jserv changed the title from "Initiating the execution of a hart" to "Initiate the execution of a hart" on Jul 24, 2023
chiangkd changed the title from "Initiate the execution of a hart" to "Implement hart state management" on Jul 24, 2023
chiangkd (Collaborator, author) commented Aug 3, 2023

Currently, I am attempting to implement multi-core support in the device tree (.dts) file:

    cpus {
        #address-cells = <1>;
        #size-cells = <0>;
        timebase-frequency = <65000000>;
        cpu0: cpu@0 {
            device_type = "cpu";
            compatible = "riscv";
            reg = <0>;
            riscv,isa = "rv32ima";
            mmu-type = "riscv,rv32";
            cpu0_intc: interrupt-controller {
                #interrupt-cells = <1>;
                interrupt-controller;
                compatible = "riscv,cpu-intc";
            };
        };

+       cpu1: cpu@1 {
+           device_type = "cpu";
+           compatible = "riscv";
+           reg = <1>;
+           riscv,isa = "rv32ima";
+           mmu-type = "riscv,rv32";
+           cpu1_intc: interrupt-controller {
+               #interrupt-cells = <1>;
+               interrupt-controller;
+               compatible = "riscv,cpu-intc";
+           };
+       };
    };

...

    soc {
        #address-cells = <1>;
        #size-cells = <1>;
        compatible = "simple-bus";
        ranges = <0x0 0xF0000000 0x10000000>;
        interrupt-parent = <&plic0>;

        plic0: interrupt-controller@0 {
            #interrupt-cells = <1>;
            #address-cells = <0>;
            compatible = "sifive,plic-1.0.0";
            reg = <0x0000000 0x4000000>;
            interrupt-controller;
            interrupts-extended = 
                <&cpu0_intc 9>,
+               <&cpu1_intc 9>;
            riscv,ndev = <31>;
        };

I also modified the configs/linux.config file. After making the changes, I rebuilt the Image file by running make build-image:

-# CONFIG_SMP is not set
+CONFIG_SMP=y

After the modifications and rebuild, I encountered a kernel oops (load access fault), which then escalated to a kernel panic with the message "not syncing: Attempted to kill the idle task!".

Entire log:

Ready to launch Linux kernel. Please be patient.
failed to allocate TAP device: Operation not permitted
No virtio-net functioned
[    0.000000] Linux version 6.1.42 (aaron@aaron-Lenovo-Y520-15IKBN) (riscv32-buildroot-linux-gnu-gcc.br_real (Buildroot 2023.05.1) 12.3.0, GNU ld (GNU Binutils) 2.39) #2 SMP Tue Aug  1 03:42:11 CST 2023
[    0.000000] Machine model: semu
[    0.000000] earlycon: ns16550 at MMIO 0xf4000000 (options '')
[    0.000000] printk: bootconsole [ns16550] enabled
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] SBI specification v0.3 detected
[    0.000000] SBI implementation ID=0x999 Version=0x1
[    0.000000] SBI TIME extension detected
[    0.000000] SBI SRST extension detected
[    0.000000] SBI HSM extension detected
[    0.000000] riscv: base ISA extensions aim
[    0.000000] riscv: ELF capabilities aim
[    0.000000] percpu: Embedded 10 pages/cpu s11604 r8192 d21164 u40960
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 130048
[    0.000000] Kernel command line: earlycon console=ttyS0
[    0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 490556K/524288K available (3565K kernel code, 8490K rwdata, 4096K rodata, 4135K init, 140K bss, 33732K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=32 to nr_cpu_ids=2.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] riscv-intc: 32 local interrupts mapped
[    0.000000] Oops - load access fault [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.42 #2
[    0.000000] Hardware name: semu (DT)
[    0.000000] epc : plic_toggle+0xe4/0xf8
[    0.000000]  ra : plic_toggle+0x34/0xf8
[    0.000000] epc : c01ff278 ra : c01ff1c8 sp : c1401e30
[    0.000000]  gp : c14546e0 tp : c14073c0 t0 : c1401df0
[    0.000000]  t1 : c1805018 t2 : 00000000 s0 : c1401e50
[    0.000000]  s1 : dfff353c a0 : dfff3544 a1 : 00000002
[    0.000000]  a2 : 00000000 a3 : 00000000 a4 : a0002080
[    0.000000]  a5 : 00000001 a6 : c1c00130 a7 : 00000000
[    0.000000]  s2 : 00000001 s3 : 00000000 s4 : dfff353c
[    0.000000]  s5 : 00000001 s6 : 00000002 s7 : 00000001
[    0.000000]  s8 : c080853c s9 : c0cb5128 s10: 00000009
[    0.000000]  s11: c146fd60 t3 : c1805010 t4 : c1805010
[    0.000000]  t5 : 00000400 t6 : c1401dd0
[    0.000000] status: 00000100 badaddr: a0002080 cause: 00000005
[    0.000000] [<c01ff278>] plic_toggle+0xe4/0xf8
[    0.000000] [<c0415748>] __plic_init.constprop.0+0x41c/0x44c
[    0.000000] [<c04157c0>] plic_init+0x1c/0x2c
[    0.000000] [<c041c074>] of_irq_init+0x26c/0x328
[    0.000000] [<c04151f4>] irqchip_init+0x20/0x30
[    0.000000] [<c0403c10>] init_IRQ+0x18/0x44
[    0.000000] [<c0400e74>] start_kernel+0x544/0x748
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

I think there's a problem with PLIC initialization, but I'm unsure of the exact issue. Any advice or guidance on troubleshooting would be greatly appreciated. Thanks.

jserv (Collaborator) commented Aug 4, 2023

Check the implementation of ladybird's simulator. PLIC should be aware of hart ID and available cores.

chiangkd (Collaborator, author) commented Aug 31, 2023

I followed the implementation provided in ladybird/sim/plic.c to perform the calculations related to the hart context, and also referred to riscv-plic.adoc.
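
For reference, a minimal sketch of the context-indexed register map from riscv-plic.adoc; with the interrupts-extended property above, context i is simply hart i's S-mode context:

/* Per-context PLIC register offsets (riscv-plic.adoc / SiFive PLIC).
 * With "interrupts-extended = <&cpu0_intc 9>, <&cpu1_intc 9>",
 * context i maps to hart i's S-mode context. */
#define PLIC_PRIORITY(src)  (0x000000u + 4 * (src))
#define PLIC_PENDING(word)  (0x001000u + 4 * (word))
#define PLIC_ENABLE(ctx)    (0x002000u + 0x80 * (ctx))
#define PLIC_THRESHOLD(ctx) (0x200000u + 0x1000 * (ctx))
#define PLIC_CLAIM(ctx)     (0x200004u + 0x1000 * (ctx))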

Furthermore, I modified the existing structures to support multicore operation: I renamed the vm_t structure to core_t and redefined vm_t as a structure that accommodates multiple cores.

typedef struct __vm_internal vm_t;

struct __vm_internal {
    uint32_t num_core;
    core_t **core;
};
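
A minimal allocation sketch for the reworked structure; core_new() is an assumed constructor, not semu's actual function:

#include <stdlib.h>

/* Hedged sketch: building a vm_t that owns num_core cores. */
vm_t *vm_new(uint32_t num_core)
{
    vm_t *vm = calloc(1, sizeof(vm_t));
    vm->num_core = num_core;
    vm->core = calloc(num_core, sizeof(core_t *));
    for (uint32_t i = 0; i < num_core; i++)
        vm->core[i] = core_new(i); /* each core records its own hartid */
    return vm;
}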

However, the system logs show that the kernel detects the second CPU but fails to bring it online:

[    0.016639] smp: Bringing up secondary CPUs ...
[    1.037847] CPU1: failed to come online
[    1.038113] smp: Brought up 1 node, 1 CPU
# cat /sys/devices/system/cpu/possible
0-1
# cat /sys/devices/system/cpu/offline
1
# cat /sys/devices/system/cpu/online
0

A similar issue has been reported in riscv-software-src/opensbi#150. However, the problem described there appears to be specific to an older version of OpenSBI.

Further investigation into the kernel sources smpboot.c and cpumask.h reveals the function cpumask_test_cpu, used for testing the status of a CPU within a cpumask:

/**
 * cpumask_test_cpu - test for a cpu in a cpumask
 * @cpu: cpu number (< nr_cpu_ids)
 * @cpumask: the cpumask pointer
 *
 * Returns true if @cpu is set in @cpumask, else returns false
 */

Drawing parallels with the implementation from ladybird/sim/plic.c, a loop iterates through each core and steps it:

/* i iterates from 0 to num_cores - 1 */
uint32_t pc = vm.core[i]->pc;
core_step(vm.core[i], pc);

I've also noticed the function sbi_hsm_hart_wait in sbi_hsm.c, which appears to be related to handling non-coldboot harts. I'm not sure whether this function contributes to the "CPU1: failed to come online" issue.
Is it advisable to have the remaining harts wait in a wfi() state for the sbi_hsm_hart_start() function before proceeding?

jserv (Collaborator) commented Sep 6, 2023

I've also noticed the function sbi_hsm_hart_wait in sbi_hsm.c, which appears to be related to handling non-coldboot harts. I'm not sure whether this function contributes to the "CPU1: failed to come online" issue. Is it advisable to have the remaining harts wait in a wfi() state for the sbi_hsm_hart_start() function before proceeding?

It depends. We can provide a preliminary implementation with some remarks. By the way, riscv-vm handles harts in a straightforward way.
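
As a rough shape for such a preliminary implementation, a hedged sketch of parking a non-boot hart until sbi_hart_start() arrives, following the SBI HSM calling convention (the hart resumes at start_addr with a0 = hartid and a1 = opaque); all field names are assumptions:

/* Hedged sketch: a secondary hart idles until SBI HSM starts it.
 * hsm_state, start_addr, and opaque are illustrative field names. */
while (hart->hsm_state != HSM_START_PENDING)
    wait_for_interrupt(hart); /* emulate wfi until an IPI arrives */

hart->pc = hart->start_addr;  /* set by sbi_hart_start() */
hart->x[10] = hart->hartid;   /* a0 = hartid, per the SBI spec */
hart->x[11] = hart->opaque;   /* a1 = opaque, per the SBI spec */
hart->hsm_state = HSM_STARTED;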

chiangkd (Collaborator, author) commented

Progress is slow but steady.
Currently the PLIC is aware of the hart ID; the next step is to implement ACLINT to handle inter-processor interrupts (IPIs).

jserv (Collaborator) commented Jan 28, 2024

Progress is slow but steady. Currently the PLIC is aware of the hart ID; the next step is to implement ACLINT to handle inter-processor interrupts (IPIs).

Got it. You may check the RISC-V ACLINT MTIMER memory-mapped peripheral as well. RVVM is yet another reference for CLINT and related system emulation. See https://github.com/LekKit/RVVM/blob/staging/src/devices/clint.c
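
For orientation, the classic SiFive CLINT register map that these references implement; ACLINT keeps the same registers but splits them into separate MSWI, MTIMER, and SSWI devices:

/* Classic SiFive CLINT register offsets, per hart 'h'. */
#define CLINT_MSIP(h)     (0x0000u + 4 * (h))  /* M-mode software IPI  */
#define CLINT_MTIMECMP(h) (0x4000u + 8 * (h))  /* 64-bit compare value */
#define CLINT_MTIME       0xBFF8u              /* shared 64-bit timer  */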

jserv (Collaborator) commented Jan 29, 2024

Incidentally, we can start implementing basic CLINT support even before the SMP system emulation is fully operational. For guidance, refer to this minimal implementation. You are also encouraged to submit any relevant pull requests in this regard.

jserv (Collaborator) commented Jan 29, 2024

See also: Multi-Core Architecture for neorv32.

Mes0903 (Collaborator) commented Dec 12, 2024

Based on the CLINT implementation, ACLINT support has been added, along with a new macro SEMU_FEATURE_ACLINT that replaces CLINT for handling IPIs and timer interrupts. The implementation is still incomplete: although the MSWI logic has been written, only SSWI is currently used; a sketch of the SSWI path follows.
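
A minimal sketch of the SSWI side, per the ACLINT specification (writing 1 to a hart's SETSSIP register raises its supervisor software interrupt); the vm_t/core field names are assumptions, not semu's actual layout:

/* Hedged sketch: ACLINT SSWI write handler. One 4-byte SETSSIP register
 * per hart; writing 1 sets the target hart's SSIP bit (S-mode IPI). */
static void sswi_write(vm_t *vm, uint32_t offset, uint32_t value)
{
    uint32_t hartid = offset / 4;
    if (hartid < vm->num_core && (value & 1))
        vm->core[hartid]->sip |= (1u << 1); /* SSIP is bit 1 of sip */
}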

Additionally, in the original CLINT implementation:

  1. In the function clint_reg_write, the variable name lowwer should be corrected to lower. Also, the variables upper and lower are declared as int32_t, while the assignment target value is of type uint32_t, so the types do not match.

  2. In the function clint_update_interrupts, when comparing mtimecmp with mtimer, the function uses hart->time. Although this value should equal mtimer, for semantic correctness mtimer should be used directly in the comparison (a sketch follows below).
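
A hedged sketch of that second fix, assuming hypothetical names for semu's CLINT state and the supervisor timer-pending bit:

/* Hedged sketch: compare mtimer (not hart->time) against mtimecmp when
 * updating the timer interrupt. All field names are assumptions. */
static void clint_update_interrupts(vm_t *vm, clint_state_t *clint)
{
    for (uint32_t i = 0; i < vm->num_core; i++) {
        if (clint->mtimer >= clint->mtimecmp[i])
            vm->core[i]->sip |= (1u << 5);  /* set STIP (bit 5)    */
        else
            vm->core[i]->sip &= ~(1u << 5); /* clear when not due  */
    }
}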

I will create a PR for the ACLINT implementation first. As for the CLINT fixes, I have not yet made any changes. Once I confirm that the fixes are correct, I will open a separate PR for them.
