From 3588df9eff9dac2d933d05b107cf19cbd13d13d4 Mon Sep 17 00:00:00 2001 From: Myrausman Date: Wed, 11 Sep 2024 00:31:40 +0500 Subject: [PATCH 1/5] improved readme Signed-off-by: Myrausman --- README.md | 366 +++++++++++++++++++++++++++++------------------------- 1 file changed, 200 insertions(+), 166 deletions(-) diff --git a/README.md b/README.md index 73c07156..762745ba 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,20 @@ # riscv-opcodes -This repo enumerates standard RISC-V instruction opcodes and control and -status registers. It also contains a script to convert them into several -formats (C, Scala, LaTeX). +This repository enumerates standard RISC-V instruction opcodes and control/status registers. It also contains a script to convert them into various formats (C, Scala, LaTeX). -Artifacts (encoding.h, latex-tables, etc) from this repo are used in other -tools and projects like Spike, PK, RISC-V Manual, etc. +Artifacts like `encoding.h`, `latex-tables`, etc., from this repo are used in tools and projects such as Spike, PK, and the RISC-V Manual. +## Table of Contents +1. [Project Structure](#project-structure) +2. [File Naming Policy](#file-naming-policy) +3. [Encoding Syntax](#encoding-syntax) +4. [Usage](#usage) +5. [Artifact Generation](#artifact-generation) +6. [Adding a New Extension](#adding-a-new-extension) +7. [Debugging](#debugging) +8. [Contributing](#contributing) + +--- ## Project Structure ```bash @@ -14,225 +22,251 @@ tools and projects like Spike, PK, RISC-V Manual, etc. 
├── encoding.h # the template encoding.h file ├── LICENSE # license file ├── Makefile # makefile to generate artifacts -├── parse.py # python file to perform checks on the instructions and generate artifacts +├── parse.py # performs checks and generates artifacts ├── README.md # this file ├── rv* # instruction opcode files └── unratified # contains unratified instruction opcode files ``` - +--- ## File Naming Policy -This project follows a very specific file structure to define the instruction encodings. All files -containing instruction encodings start with the prefix `rv`. These files can either be present in -the root directory (if the instructions have been ratified) or the `unratified` directory. The exact -file-naming policy and location is as mentioned below: +This project follows a specific file naming convention for instruction encodings: + +* **`rv_x`**: Instructions common to both 32-bit and 64-bit modes of extension `X`. +* **`rv32_x`**: Instructions specific to `rv32x` (e.g., `brev8`). +* **`rv64_x`**: Instructions specific to `rv64x` (e.g., `addw`). +* **`rv_x_y`**: Instructions valid when both extensions `X` and `Y` are enabled. Canonical ordering as specified by the RISC-V spec should be followed. +* **`unratified/`**: Contains instruction encodings that are not ratified yet, following the same policy. -1. `rv_x` - contains instructions common within the 32-bit and 64-bit modes of extension X. -2. `rv32_x` - contains instructions present in rv32x only (absent in rv64x e.g.. brev8) -3. `rv64_x` - contains instructions present in rv64x only (absent in rv32x, e.g. addw) -4. `rv_x_y` - contains instructions when both extension X and Y are available/enabled. It is recommended to follow canonical ordering for such file names as specified by the spec. -5. `unratified` - this directory will also contain files similar to the above policies, but will - correspond to instructions which have not yet been ratified. 
+For instructions present in multiple extensions where the spec is vague, the encoding should be placed in the canonically ordered first extension and imported into others using `$import`. -When an instruction is present in multiple extensions and the spec is vague in defining the extension which owns the instruction, the instruction encoding must be placed in the first canonically ordered extension and should be imported(via the `$import` keyword) in the remaining extensions. +--- ## Encoding Syntax +Instruction encoding files in this project use the following syntax: -The encoding syntax uses `$` to indicate keywords. As of now 2 keywords have been identified : `$import` and `$pseudo_op` (described below). The syntax also uses `::` as a means to define the relationship between extension and instruction. `..` is used to defined bit ranges. We use `#` to define comments in the files. All comments must be in a separate line. In-line comments are not supported. +* **Keywords**: `$import` and `$pseudo_op` are keywords used to indicate special operations. +* **Operators**: `::` defines relationships between extensions and instructions; `..` defines bit ranges. +* **Comments**: Use `#` for comments. Inline comments are not supported. -Instruction syntaxes used in this project are broadly categorized into three: +### Instruction Categories -- **regular instructions** :- these are instructions which hold a unique opcode in the encoding space. A very generic syntax guideline - for these instructions is as follows: - ``` - - ``` - where `` is either `` or ``. +1. **Regular Instructions**: Instructions with unique opcodes. 
+ - **Syntax**: ` ` + - **Example**: + ```plaintext + lui rd imm20 6..2=0x0D 1..0=3 + beq bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3 + ``` - Examples: - ``` - lui rd imm20 6..2=0x0D 1..0=3 - beq bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3 - ``` - The bit encodings are usually of 2 types: - - *single bit assignment* : here the value of a single bit is assigned using syntax `=`. For e.g. `6=1` means bit 6 should be 1. Here the value must be 1 or 0. - - *range assignment*: here a range of bits is assigned a value using syntax: `..=`. For e.g. `31..24=0xab`. The value here can be either unsigned integer, hex (0x) or binary (0b). + - **Bit Encoding Types**: + - *Single Bit Assignment*: `=` + - *Range Assignment*: `..=` -- **pseudo_instructions** (a.k.a pseudo\_ops) - These are instructions which are aliases of regular instructions. Their encodings force - certain restrictions over the regular instruction. The syntax for such instructions uses the `$pseudo_op` keyword as follows: - ``` - $pseudo_op :: - ``` - Here the `` specifies the extension which contains the base instruction. `` indicates the name of the instruction - this pseudo-instruction is an alias of. The remaining fields are the same as the regular instruction syntax, where all the args and the fields - of the pseudo instruction are specified. +2. **Pseudo Instructions** (`$pseudo_op`): Aliases for regular instructions with restricted bit encodings. + - **Syntax**: `$pseudo_op :: ` + - ``: Specifies the extension which contains the base instruction. + - ``: Indicates the name of the instruction this pseudo-instruction is an alias of. + - The remaining fields are the same as the regular instruction syntax, where all the arguments and the fields of the pseudo instruction are specified. 
+ - **Example**: + ```plaintext + $pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 14..12=2 6..2=0x1C 1..0=3 + ``` - Example: - ``` - $pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 14..12=2 6..2=0x1C 1..0=3 - ``` - - If a ratified instruction is a pseudo\_op of a regular unratified - instruction, it is recommended to maintain this pseudo\_op relationship i.e. - define the new instruction as a pseudo\_op of the unratified regular - instruction, as this avoids existence of overlapping opcodes for users who are - experimenting with unratified extensions as well. - -- **imported_instructions** - these are instructions which are borrowed from an extension into a new/different extension/sub-extension. Only regular instructions can be imported. Pseudo-op or already imported instructions cannot be imported. Example: - ``` - $import rv32_zkne::aes32esmi - ``` + - **Recommendation**: If a ratified instruction is a `$pseudo_op` of a regular unratified instruction, it is recommended to maintain this `$pseudo_op` relationship. Define the new instruction as a `$pseudo_op` of the unratified regular instruction to avoid overlapping opcodes for users experimenting with unratified extensions. -### RESTRICTIONS +3. **Imported Instructions** (`$import`): Instructions borrowed from another extension. + - These are instructions borrowed from an extension into a new or different extension/sub-extension. Only regular instructions can be imported. Pseudo-ops or already imported instructions cannot be imported. + - **Syntax**: `$import :: ` + - **Example**: + ```plaintext + $import rv32_zkne::aes32esmi + ``` -Following are the restrictions one should keep in mind while defining $pseudo\_ops and $imported\_ops +### Restrictions -- Pseudo-op or already imported instructions cannot be imported again in another file. One should - always import base-instructions only. 
-- While defining a $pseudo\_op, the base-instruction itself cannot be a $pseudo\_op +* Pseudo-ops or already imported instructions cannot be imported again. +* A base instruction for a pseudo-op cannot be a pseudo-op itself. +--- ## Flow for parse.py -The `parse.py` python file is used to perform checks on the current set of instruction encodings and also generates multiple artifacts : latex tables, encoding.h header file, etc. This section will provide a brief overview of the flow within the python file. +The `parse.py` Python file is used to perform checks on the current set of instruction encodings and also generates multiple artifacts: LaTeX tables, `encoding.h` header file, etc. This section provides a brief overview of the flow within the Python file. -To start with, `parse.py` creates a list of all `rv*` files currently checked into the repo (including those inside the `unratified` directory as well). -It then starts parsing each file line by line. In the first pass, we only capture regular instructions and ignore the imported or pseudo instructions. -For each regular instruction, the following checks are performed : +1. **Initial Setup**: + - `parse.py` creates a list of all `rv*` files currently checked into the repo (including those inside the `unratified` directory). + - It starts parsing each file line by line. - - for range-assignment syntax, the *msb* position must be higher than the *lsb* position - - for range-assignment syntax, the value of the range must representable in the space identified by *msb* and *lsb* - - values for the same bit positions should not be defined multiple times. - - All bit positions must be accounted for (either as args or constant value fields) +2. **First Pass - Regular Instructions**: + - Capture only regular instructions and ignore imported or pseudo instructions. 
+ - **Checks performed**: + - For range-assignment syntax, the *msb* (most significant bit) position must be higher than the *lsb* (least significant bit) position. + - The value of the range must be representable in the space identified by *msb* and *lsb*. + - Values for the same bit positions should not be defined multiple times. + - All bit positions must be accounted for (either as arguments or constant value fields). -Once the above checks are passed for a regular instruction, we then create a dictionary for this instruction which contains the following fields: - - encoding : contains a 32-bit string defining the encoding of the instruction. Here `-` is used to represent instruction argument fields - - extension : string indicating which extension/filename this instruction was picked from - - mask : a 32-bit hex value indicating the bits of the encodings that must be checked for legality of that instruction - - match : a 32-bit hex value indicating the values the encoding must take for the bits which are set as 1 in the mask above - - variable_fields : This is list of args required by the instruction + - **Dictionary Creation**: + - Create a dictionary for each instruction with the following fields: + - `encoding`: A 32-bit string defining the encoding of the instruction. `-` is used to represent instruction argument fields. + - `extension`: String indicating which extension/filename this instruction was picked from. + - `mask`: A 32-bit hex value indicating the bits of the encodings that must be checked for legality. + - `match`: A 32-bit hex value indicating the values the encoding must take for the bits which are set as 1 in the mask. + - `variable_fields`: A list of arguments required by the instruction. -The above dictionary elements are added to a main `instr_dict` dictionary under the instruction node. This process continues until all regular -instructions have been processed. In the second pass, we now process the `$pseudo_op` instructions. 
Here, we first check if the *base-instruction* of -this pseudo instruction exists in the relevant extension/filename or not. If it is present, the the remaining part of the syntax undergoes the same -checks as above. Once the checks pass and if the *base-instruction* is not already added to the main `instr_dict` then the pseudo-instruction is added to -the list. In the third, and final, pass we process the imported instructions. + - Add the dictionary elements to a main `instr_dict` dictionary under the instruction node. This process continues until all regular instructions have been processed. -The case where the *base-instruction* for a pseudo-instruction may not be present in the main `instr_dict` after the first pass is if the only a subset -of extensions are being processed such that the *base-instruction* is not included. +3. **Second Pass - Pseudo Instructions**: + - Process `$pseudo_op` instructions. + - **Checks performed**: + - Verify if the *base-instruction* of the pseudo instruction exists in the relevant extension/filename. + - The remaining part of the syntax undergoes the same checks as above. + - If the checks pass and the *base-instruction* is not already added to the main `instr_dict`, then add the pseudo-instruction to the list. +4. **Third Pass - Imported Instructions**: + - Process imported instructions. + +5. **Special Case**: + - If the *base-instruction* for a pseudo-instruction is not present in the main `instr_dict` after the first pass, it may be due to processing only a subset of extensions where the *base-instruction* is not included. ## Artifact Generation and Usage -The following artifacts can be generated using parse.py: +The `parse.py` script can generate the following artifacts: -- instr\_dict.json : This is always generated by parse.py and contains the - entire main dictionary `instr\_dict` in JSON format. Note, in this file the - *dots* in an instruction are replaced with *underscores*. 
In previous - versions of this project the generated file was instr\_dict.yaml. Note that - JSON is a subset of YAML so the file can still be read by any YAML parser. -- encoding.out.h : this is the header file that is used by tools like spike, pk, etc -- instr-table.tex : the latex table of instructions used in the riscv-unpriv spec -- priv-instr-table.tex : the latex table of instruction used in the riscv-priv spec -- inst.chisel : chisel code to decode instructions -- inst.sverilog : system verilog code to decode instructions -- inst.rs : rust code containing mask and match variables for all instructions -- inst.spinalhdl : spinalhdl code to decode instructions -- inst.go : go code to decode instructions +* **`instr_dict.yaml`**: Contains the main dictionary `instr_dict` in YAML format. Note that dots in instruction names are replaced with underscores in this YAML file. +* **`encoding.out.h`**: A header file used by tools such as Spike, PK, etc. +* **`instr-table.tex`**: LaTeX table of instructions for the RISC-V unprivileged specification. +* **`priv-instr-table.tex`**: LaTeX table of instructions for the RISC-V privileged specification. +* **`inst.chisel`**: Chisel code for decoding instructions. +* **`inst.sverilog`**: SystemVerilog code for decoding instructions. +* **`inst.rs`**: Rust code containing mask and match variables for all instructions. +* **`inst.spinalhdl`**: SpinalHDL code for decoding instructions. +* **`inst.go`**: Go code for decoding instructions. -To generate all the above artifacts for all instructions currently checked in, simply run `make` from the root-directory. This should print the following log on the command-line: +### Prerequisites +Ensure you have the required Python dependencies installed. 
Run the following commands: + +```bash +sudo apt-get install python3-pip +pip3 install -r requirements.txt ``` -Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv*', 'unratified/rv*'] -Extensions selected : ['rv*', 'unratified/rv*'] -INFO:: encoding.out.h generated successfully -INFO:: inst.chisel generated successfully -INFO:: inst.spinalhdl generated successfully -INFO:: inst.sverilog generated successfully -INFO:: inst.rs generated successfully -INFO:: inst.go generated successfully -INFO:: instr-table.tex generated successfully -INFO:: priv-instr-table.tex generated successfully -``` +### Generating Artifacts +To generate all artifacts for all instructions currently checked in, run make from the root directory. This will produce the following output: + ```plaintext + Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv*', 'unratified/rv*'] + Extensions selected : ['rv*', 'unratified/rv*'] + INFO:: encoding.out.h generated successfully + INFO:: inst.chisel generated successfully + INFO:: inst.spinalhdl generated successfully + INFO:: inst.sverilog generated successfully + INFO:: inst.rs generated successfully + INFO:: inst.go generated successfully + INFO:: instr-table.tex generated successfully + INFO:: priv-instr-table.tex generated successfully -By default all extensions are enabled. To select only a subset of extensions you can change the `EXTENSIONS` variable of the makefile to contains only the file names of interest. + ``` +### Selecting Specific Extensions +By default, all extensions are enabled. To select a subset of extensions, modify the EXTENSIONS variable in the Makefile to include only the filenames of interest. 
For example if you want only the I and M extensions you can do the following: ```bash make EXTENSIONS='rv*_i rv*_m' ``` -Which will print the following log: - -``` -Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m'] -Extensions selected : ['rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m'] -INFO:: encoding.out.h generated successfully -INFO:: inst.chisel generated successfully -INFO:: inst.spinalhdl generated successfully -INFO:: inst.sverilog generated successfully -INFO:: inst.rs generated successfully -INFO:: inst.go generated successfully -INFO:: instr-table.tex generated successfully -INFO:: priv-instr-table.tex generated successfully +This will produce the following output: + +```plaintext + Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', 'rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m'] + Extensions selected : ['rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m'] + INFO:: encoding.out.h generated successfully + INFO:: inst.chisel generated successfully + INFO:: inst.sverilog generated successfully + INFO:: inst.rs generated successfully + INFO:: instr-table.tex generated successfully + INFO:: priv-instr-table.tex generated successfully ``` +### Generating Specific Artifacts -If you only want a specific artifact you can use one or more of the following targets : `c`, `rust`, `chisel`, `sverilog`, `latex`.
-For example, if you want to generate the `c` based artifact with extensions as shown earlier, you can use the following command: +To generate specific artifacts, use one or more of the following targets: -```bash -./parse.py -c EXTENSIONS='rv*_i rv*_m' -``` -Which will print the following log: +* `c` +* `rust` +* `chisel` +* `sverilog` +* `latex` -``` -Running with args : ['./parse.py', '-c', 'EXTENSIONS=rv*_i rv*_m'] -Extensions selected : ['EXTENSIONS=rv*_i rv*_m'] -INFO:: encoding.out.h generated successfully -``` +### Cleaning Up -or you can also use the `make` command as: +To remove all generated artifacts, use the `clean` target: ```bash -make encoding.out.h EXTENSIONS='rv*_i rv*_m' +make clean ``` -You can use the `clean` target to remove all artifacts. +--- -## Adding a new extension +## Adding a New Extension -To add a new extension of instructions, create an appropriate `rv*` file based on the policy defined in [File Structure](#file-naming-policy). Run `make` from the root directory to ensure that all checks pass and all artifacts are created correctly. A successful run should print the following log on the terminal: +To add a new extension of instructions, follow these steps: -``` -Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv*', 'unratified/rv*'] -Extensions selected : ['rv*', 'unratified/rv*'] -INFO:: encoding.out.h generated successfully -INFO:: inst.chisel generated successfully -INFO:: inst.sverilog generated successfully -INFO:: inst.rs generated successfully -INFO:: instr-table.tex generated successfully -INFO:: priv-instr-table.tex generated successfully -``` +1. **Create the Extension File**: + - Create a new `rv*` file according to the policy defined in the [File Structure](#file-naming-policy). + +2. **Run Checks and Generate Artifacts**: + - From the root directory, run the `make` command to ensure that all checks pass and that all artifacts are generated correctly. 
+ - A successful run will produce the following output: + + ```plaintext + Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv*', 'unratified/rv*'] + Extensions selected : ['rv*', 'unratified/rv*'] + INFO:: encoding.out.h generated successfully + INFO:: inst.chisel generated successfully + INFO:: inst.sverilog generated successfully + INFO:: inst.rs generated successfully + INFO:: instr-table.tex generated successfully + INFO:: priv-instr-table.tex generated successfully + ``` -Create a PR for review. +3. **Submit for Review**: + - Create a pull request (PR) to submit your changes for review. -## Enabling Debug logs in parse.py +Ensure you follow these steps carefully to integrate the new extension properly. +--- -To enable debug logs in parse.py change `level=logging.INFO` to `level=logging.DEBUG` and run the python command. You will now see debug statements on -the terminal like below: + +## How do I find where an instruction is defined? + +You can locate the definition of an instruction using one of the following methods: + +1. **Using `grep`**: + ```bash + grep "^\s*" rv* unratified/rv* + ``` +2. **Using `make`**: + - Run `make` to generate the `instr_dict.json` file. + - Open `instr_dict.json` and search for the instruction. + - The `extension` field in the file indicates which file the instruction was picked from. +--- +## Debugging +To enable debug logs in parse.py: + +1.
Modify the logging level in parse.py: +```python +level=logging.INFO ``` -DEBUG:: Collecting standard instructions first -DEBUG:: Parsing File: ./rv_i -DEBUG:: Processing line: lui rd imm20 6..2=0x0D 1..0=3 -DEBUG:: Processing line: auipc rd imm20 6..2=0x05 1..0=3 -DEBUG:: Processing line: jal rd jimm20 6..2=0x1b 1..0=3 -DEBUG:: Processing line: jalr rd rs1 imm12 14..12=0 6..2=0x19 1..0=3 -DEBUG:: Processing line: beq bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3 -DEBUG:: Processing line: bne bimm12hi rs1 rs2 bimm12lo 14..12=1 6..2=0x18 1..0=3 +Change it to: +```python +level=logging.DEBUG ``` +2. Example debug output: + ```bash + DEBUG:: Parsing File: ./rv_i + DEBUG:: Processing line: lui rd imm20 6..2=0x0D 1..0=3 + ``` +--- +## Contributing + If you wish to contribute to this project: -## How do I find where an instruction is defined? + - Open a pull request (PR) or issue. + - Ensure that all tests pass. + - Follow the repository’s coding guidelines. -You can use `grep "^\s*" rv* unratified/rv*` OR run `make` and open -`instr_dict.json` and search for the instruction you are looking for. Within -that instruction the `extension` field will indicate which file the -instruction was picked from. From cea6f5b8c87124dbe9413575b43f1197cdd9069c Mon Sep 17 00:00:00 2001 From: Myrausman Date: Fri, 13 Sep 2024 18:27:59 +0500 Subject: [PATCH 2/5] example added & formatted Signed-off-by: Myrausman --- README.md | 222 +++++++++++++++++++++++++++++++----------------------- 1 file changed, 129 insertions(+), 93 deletions(-) diff --git a/README.md b/README.md index 762745ba..4758aaa5 100644 --- a/README.md +++ b/README.md @@ -5,6 +5,7 @@ This repository enumerates standard RISC-V instruction opcodes and control/statu Artifacts like `encoding.h`, `latex-tables`, etc., from this repo are used in tools and projects such as Spike, PK, and the RISC-V Manual. ## Table of Contents + 1. [Project Structure](#project-structure) 2. [File Naming Policy](#file-naming-policy) 3. 
[Encoding Syntax](#encoding-syntax) @@ -15,6 +16,7 @@ Artifacts like `encoding.h`, `latex-tables`, etc., from this repo are used in to 8. [Contributing](#contributing) --- + ## Project Structure ```bash @@ -27,16 +29,18 @@ Artifacts like `encoding.h`, `latex-tables`, etc., from this repo are used in to ├── rv* # instruction opcode files └── unratified # contains unratified instruction opcode files ``` + --- + ## File Naming Policy This project follows a specific file naming convention for instruction encodings: -* **`rv_x`**: Instructions common to both 32-bit and 64-bit modes of extension `X`. -* **`rv32_x`**: Instructions specific to `rv32x` (e.g., `brev8`). -* **`rv64_x`**: Instructions specific to `rv64x` (e.g., `addw`). -* **`rv_x_y`**: Instructions valid when both extensions `X` and `Y` are enabled. Canonical ordering as specified by the RISC-V spec should be followed. -* **`unratified/`**: Contains instruction encodings that are not ratified yet, following the same policy. +- **`rv_x`**: Instructions common to both 32-bit and 64-bit modes of extension `X`. +- **`rv32_x`**: Instructions specific to `rv32x` (e.g., `brev8`). +- **`rv64_x`**: Instructions specific to `rv64x` (e.g., `addw`). +- **`rv_x_y`**: Instructions valid when both extensions `X` and `Y` are enabled. Canonical ordering as specified by the RISC-V spec should be followed. +- **`unratified/`**: Contains instruction encodings that are not ratified yet, following the same policy. For instructions present in multiple extensions where the spec is vague, the encoding should be placed in the canonically ordered first extension and imported into others using `$import`. @@ -46,48 +50,53 @@ For instructions present in multiple extensions where the spec is vague, the enc Instruction encoding files in this project use the following syntax: -* **Keywords**: `$import` and `$pseudo_op` are keywords used to indicate special operations. 
-* **Operators**: `::` defines relationships between extensions and instructions; `..` defines bit ranges. -* **Comments**: Use `#` for comments. Inline comments are not supported. +- **Keywords**: `$import` and `$pseudo_op` are keywords used to indicate special operations. +- **Operators**: `::` defines relationships between extensions and instructions; `..` defines bit ranges. +- **Comments**: Use `#` for comments. Inline comments are not supported. ### Instruction Categories 1. **Regular Instructions**: Instructions with unique opcodes. - - **Syntax**: ` ` - - **Example**: - ```plaintext - lui rd imm20 6..2=0x0D 1..0=3 - beq bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3 - ``` - - **Bit Encoding Types**: - - *Single Bit Assignment*: `=` - - *Range Assignment*: `..=` + - **Syntax**: ` ` + - **Example**: + + ```plaintext + lui rd imm20 6..2=0x0D 1..0=3 + beq bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3 + ``` + + - **Bit Encoding Types**: + - _Single Bit Assignment_: `=` + - _Range Assignment_: `..=` 2. **Pseudo Instructions** (`$pseudo_op`): Aliases for regular instructions with restricted bit encodings. - - **Syntax**: `$pseudo_op :: ` - - ``: Specifies the extension which contains the base instruction. - - ``: Indicates the name of the instruction this pseudo-instruction is an alias of. - - The remaining fields are the same as the regular instruction syntax, where all the arguments and the fields of the pseudo instruction are specified. - - **Example**: - ```plaintext - $pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 14..12=2 6..2=0x1C 1..0=3 - ``` - - **Recommendation**: If a ratified instruction is a `$pseudo_op` of a regular unratified instruction, it is recommended to maintain this `$pseudo_op` relationship. Define the new instruction as a `$pseudo_op` of the unratified regular instruction to avoid overlapping opcodes for users experimenting with unratified extensions. 
+ - **Syntax**: `$pseudo_op :: ` + - ``: Specifies the extension which contains the base instruction. + - ``: Indicates the name of the instruction this pseudo-instruction is an alias of. + - The remaining fields are the same as the regular instruction syntax, where all the arguments and the fields of the pseudo instruction are specified. + - **Example**: + + ```plaintext + $pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 14..12=2 6..2=0x1C 1..0=3 + ``` + + - **Recommendation**: If a ratified instruction is a `$pseudo_op` of a regular unratified instruction, it is recommended to maintain this `$pseudo_op` relationship. Define the new instruction as a `$pseudo_op` of the unratified regular instruction to avoid overlapping opcodes for users experimenting with unratified extensions. 3. **Imported Instructions** (`$import`): Instructions borrowed from another extension. - - These are instructions borrowed from an extension into a new or different extension/sub-extension. Only regular instructions can be imported. Pseudo-ops or already imported instructions cannot be imported. - - **Syntax**: `$import :: ` - - **Example**: - ```plaintext - $import rv32_zkne::aes32esmi - ``` + - These are instructions borrowed from an extension into a new or different extension/sub-extension. Only regular instructions can be imported. Pseudo-ops or already imported instructions cannot be imported. + - **Syntax**: `$import :: ` + - **Example**: + ```plaintext + $import rv32_zkne::aes32esmi + ``` ### Restrictions -* Pseudo-ops or already imported instructions cannot be imported again. -* A base instruction for a pseudo-op cannot be a pseudo-op itself. +- Pseudo-ops or already imported instructions cannot be imported again. +- A base instruction for a pseudo-op cannot be a pseudo-op itself. 
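The two restrictions above can be sketched as a small standalone check. This is an illustration only, not code from `parse.py`: the function names, the `rv_bad` file, the `csrrs` line, and the placement of `frflags` in `rv_f` are assumptions made for the example.

```python
# Hypothetical sketch of the $import / $pseudo_op restrictions.
# None of these helpers are taken from parse.py.

def line_kind(tokens):
    """Classify one tokenized encoding line."""
    if tokens[0] == "$import":
        return "import"
    if tokens[0] == "$pseudo_op":
        return "pseudo"
    return "regular"


def defined_name(tokens):
    """Name of the instruction that a line makes available in its file."""
    kind = line_kind(tokens)
    if kind == "import":
        return tokens[1].split("::")[1]
    if kind == "pseudo":
        return tokens[2]
    return tokens[0]


def useful_lines(lines):
    """Yield tokenized lines, skipping blanks and '#' comment lines."""
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            yield line.split()


def check_restrictions(files):
    """files maps a file name to its encoding lines; returns error strings."""
    kinds = {}
    for ext, lines in files.items():
        for tokens in useful_lines(lines):
            kinds[(ext, defined_name(tokens))] = line_kind(tokens)

    errors = []
    for ext, lines in files.items():
        for tokens in useful_lines(lines):
            if line_kind(tokens) == "regular":
                continue
            src_ext, base = tokens[1].split("::")
            base_kind = kinds.get((src_ext, base))
            # Only regular instructions may be imported or aliased.
            if base_kind in ("import", "pseudo"):
                errors.append(
                    f"{ext}: {tokens[0]} {tokens[1]} refers to a "
                    f"{base_kind} line, not a regular instruction"
                )
    return errors


# Illustrative input; file contents are assumed, not copied from the repo.
EXAMPLE = {
    "rv_zicsr": ["csrrs rd rs1 csr 14..12=2 6..2=0x1C 1..0=3"],
    "rv_f": [
        "$pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 "
        "14..12=2 6..2=0x1C 1..0=3"
    ],
    "rv_bad": ["$import rv_f::frflags"],
}

if __name__ == "__main__":
    for message in check_restrictions(EXAMPLE):
        print(message)
```

In this toy input only `rv_bad` is flagged, because it imports a pseudo-op rather than a base instruction. A base instruction that is missing from the processed files is deliberately not reported here, since only a subset of extensions may have been selected.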
+ --- ## Flow for parse.py @@ -95,18 +104,22 @@ Instruction encoding files in this project use the following syntax: The `parse.py` Python file is used to perform checks on the current set of instruction encodings and also generates multiple artifacts: LaTeX tables, `encoding.h` header file, etc. This section provides a brief overview of the flow within the Python file. 1. **Initial Setup**: + - `parse.py` creates a list of all `rv*` files currently checked into the repo (including those inside the `unratified` directory). - It starts parsing each file line by line. 2. **First Pass - Regular Instructions**: + - Capture only regular instructions and ignore imported or pseudo instructions. - **Checks performed**: - - For range-assignment syntax, the *msb* (most significant bit) position must be higher than the *lsb* (least significant bit) position. - - The value of the range must be representable in the space identified by *msb* and *lsb*. + + - For range-assignment syntax, the _msb_ (most significant bit) position must be higher than the _lsb_ (least significant bit) position. + - The value of the range must be representable in the space identified by _msb_ and _lsb_. - Values for the same bit positions should not be defined multiple times. - All bit positions must be accounted for (either as arguments or constant value fields). - **Dictionary Creation**: + - Create a dictionary for each instruction with the following fields: - `encoding`: A 32-bit string defining the encoding of the instruction. `-` is used to represent instruction argument fields. - `extension`: String indicating which extension/filename this instruction was picked from. @@ -117,31 +130,33 @@ The `parse.py` Python file is used to perform checks on the current set of instr - Add the dictionary elements to a main `instr_dict` dictionary under the instruction node. This process continues until all regular instructions have been processed. 3. 
**Second Pass - Pseudo Instructions**: + - Process `$pseudo_op` instructions. - **Checks performed**: - - Verify if the *base-instruction* of the pseudo instruction exists in the relevant extension/filename. + - Verify if the _base-instruction_ of the pseudo instruction exists in the relevant extension/filename. - The remaining part of the syntax undergoes the same checks as above. - - If the checks pass and the *base-instruction* is not already added to the main `instr_dict`, then add the pseudo-instruction to the list. + - If the checks pass and the _base-instruction_ is not already added to the main `instr_dict`, then add the pseudo-instruction to the list. 4. **Third Pass - Imported Instructions**: + - Process imported instructions. 5. **Special Case**: - - If the *base-instruction* for a pseudo-instruction is not present in the main `instr_dict` after the first pass, it may be due to processing only a subset of extensions where the *base-instruction* is not included. + - If the _base-instruction_ for a pseudo-instruction is not present in the main `instr_dict` after the first pass, it may be due to processing only a subset of extensions where the _base-instruction_ is not included. ## Artifact Generation and Usage The `parse.py` script can generate the following artifacts: -* **`instr_dict.yaml`**: Contains the main dictionary `instr_dict` in YAML format. Note that dots in instruction names are replaced with underscores in this YAML file. -* **`encoding.out.h`**: A header file used by tools such as Spike, PK, etc. -* **`instr-table.tex`**: LaTeX table of instructions for the RISC-V unprivileged specification. -* **`priv-instr-table.tex`**: LaTeX table of instructions for the RISC-V privileged specification. -* **`inst.chisel`**: Chisel code for decoding instructions. -* **`inst.sverilog`**: SystemVerilog code for decoding instructions. -* **`inst.rs`**: Rust code containing mask and match variables for all instructions. 
-* **`inst.spinalhdl`**: SpinalHDL code for decoding instructions. -* **`inst.go`**: Go code for decoding instructions. +- **`instr_dict.yaml`**: Contains the main dictionary `instr_dict` in YAML format. Note that dots in instruction names are replaced with underscores in this YAML file. +- **`encoding.out.h`**: A header file used by tools such as Spike, PK, etc. +- **`instr-table.tex`**: LaTeX table of instructions for the RISC-V unprivileged specification. +- **`priv-instr-table.tex`**: LaTeX table of instructions for the RISC-V privileged specification. +- **`inst.chisel`**: Chisel code for decoding instructions. +- **`inst.sverilog`**: SystemVerilog code for decoding instructions. +- **`inst.rs`**: Rust code containing mask and match variables for all instructions. +- **`inst.spinalhdl`**: SpinalHDL code for decoding instructions. +- **`inst.go`**: Go code for decoding instructions. ### Prerequisites @@ -151,22 +166,27 @@ Ensure you have the required Python dependencies installed. Run the following co sudo apt-get install python3-pip pip3 install -r requirements.txt ``` + ### Generating Artifacts + To generate all artifacts for all instructions currently checked in, run make from the root directory. 
This will produce the following output: - ```plaintext - Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv*', 'unratified/rv*'] - Extensions selected : ['rv*', 'unratified/rv*'] - INFO:: encoding.out.h generated successfully - INFO:: inst.chisel generated successfully - INFO:: inst.spinalhdl generated successfully - INFO:: inst.sverilog generated successfully - INFO:: inst.rs generated successfully - INFO:: inst.go generated successfully - INFO:: instr-table.tex generated successfully - INFO:: priv-instr-table.tex generated successfully - ``` +```plaintext +Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv*', 'unratified/rv*'] +Extensions selected : ['rv*', 'unratified/rv*'] +INFO:: encoding.out.h generated successfully +INFO:: inst.chisel generated successfully +INFO:: inst.spinalhdl generated successfully +INFO:: inst.sverilog generated successfully +INFO:: inst.rs generated successfully +INFO:: inst.go generated successfully +INFO:: instr-table.tex generated successfully +INFO:: priv-instr-table.tex generated successfully + +``` + ### Selecting Specific Extensions + By default, all extensions are enabled. To select a subset of extensions, modify the EXTENSIONS variable in the Makefile to include only the filenames of interest. 
For example, to include only the I and M extensions: For example if you want only the I and M extensions you can do the following: @@ -186,15 +206,18 @@ This will produce the following output: INFO:: instr-table.tex generated successfully INFO:: priv-instr-table.tex generated successfully ``` + ### Generating Specific Artifacts To generate specific artifacts, use one or more of the following targets: -* `c` -* `rust` -* `chisel` -* `sverilog` -* `latex` +- `inst.c`,`rs`,`chisel`,`sverilog`, `instr-table.tex` + +For example, if you want to generate using Chisel and Rust, run the following command: + +```bash + make inst.chisel inst.rs +``` ### Cleaning Up @@ -203,35 +226,37 @@ To remove all generated artifacts, use the `clean` target: ```bash make clean ``` + --- ## Adding a New Extension To add a new extension of instructions, follow these steps: -1. **Create the Extension File**: - - Create a new `rv*` file according to the policy defined in the [File Structure](#file-naming-policy). +1. **Create the Extension File**: + +- Create a new `rv*` file according to the policy defined in the [File Structure](#file-naming-policy). 2. **Run Checks and Generate Artifacts**: - - From the root directory, run the `make` command to ensure that all checks pass and that all artifacts are generated correctly. - - A successful run will produce the following output: - - ```plaintext - Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv*', 'unratified/rv*'] - Extensions selected : ['rv*', 'unratified/rv*'] - INFO:: encoding.out.h generated successfully - INFO:: inst.chisel generated successfully - INFO:: inst.sverilog generated successfully - INFO:: inst.rs generated successfully - INFO:: instr-table.tex generated successfully - INFO:: priv-instr-table.tex generated successfully - ``` + +- From the root directory, run the `make` command to ensure that all checks pass and that all artifacts are generated correctly. 
+- A successful run will produce the following output: + + ```plaintext + Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv*', 'unratified/rv*'] + Extensions selected : ['rv*', 'unratified/rv*'] + INFO:: encoding.out.h generated successfully + INFO:: inst.chisel generated successfully + INFO:: inst.sverilog generated successfully + INFO:: inst.rs generated successfully + INFO:: instr-table.tex generated successfully + INFO:: priv-instr-table.tex generated successfully + ``` 3. **Submit for Review**: - Create a pull request (PR) to submit your changes for review. -Ensure you follow these steps carefully to integrate the new extension properly. ---- +## Ensure you follow these steps carefully to integrate the new extension properly. ## How do I find where an instruction is defined? @@ -242,31 +267,42 @@ You can locate the definition of an instruction using one of the following metho grep "^\s*" rv* unratified/rv* ``` 2. **Using `make`**: - - Run make to generate the instr_dict.yaml file. - - Open instr_dict.yaml and search for the instruction. - - The extension field in the file will indicate which file the instruction was picked from. + +- Run make to generate the instr_dict.yaml file. +- Open instr_dict.yaml and search for the instruction. +- The extension field in the file will indicate which file the instruction was picked from. + --- + ## Debugging + To enable debug logs in parse.py: 1. Modify the logging level in parse.py: + ```python level=logging.INFO ``` + Change it to: + ```python level=logging.DEBUG ``` + 2. Example debug output: - ```bash - DEBUG:: Parsing File: ./rv_i - DEBUG:: Processing line: lui rd imm20 6..2=0x0D 1..0=3 - ``` + +```bash +DEBUG:: Parsing File: ./rv_i +DEBUG:: Processing line: lui rd imm20 6..2=0x0D 1..0=3 +``` + --- + ## Contributing - If you wish to contribute to this project: - - Open a pull request (PR) or issue. - - Ensure that all tests pass. - - Follow the repository’s coding guidelines. 
+If you wish to contribute to this project: +- Open a pull request (PR) or issue. +- Ensure that all tests pass. +- Follow the repository’s coding guidelines. From d3909c9fcfc81926440c702ea26954d65d27f147 Mon Sep 17 00:00:00 2001 From: Myrausman Date: Sun, 3 Nov 2024 17:02:41 +0500 Subject: [PATCH 3/5] Added CONTRIBUTING.md Signed-off-by: Myrausman --- CONTRIBUTING.md | 57 +++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 3 +-- 2 files changed, 58 insertions(+), 2 deletions(-) create mode 100644 CONTRIBUTING.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..f4d93235 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,57 @@ +# Contributing to riscv-opcodes + +Welcome, and thank you for your interest in contributing to the `riscv-opcodes` repository! + +## Table of Contents +- [How to Contribute](#how-to-contribute) +- [Setting Up the Project](#setting-up-the-project) +- [Code Standards](#code-standards) +- [Running Tests](#running-tests) +- [Submitting a Pull Request](#submitting-a-pull-request) + +## How to Contribute +- **Report an Issue**: For bugs or improvement suggestions, please open an issue. +- **Submit a Pull Request (PR)**: For fixes, improvements, or new features, submit a PR. It’s recommended to connect your PR with an open issue or prior discussion. + +## Setting Up the Project +### Prerequisites +Ensure you have: +- Python 3.9+ +- Necessary Python dependencies + +### Generating Artifacts +To generate artifacts like `encoding.h` and `inst.rs`, navigate to the project root and run: +```bash +make +``` +To generate specific artifacts like `inst.chisel` and `inst.rs`, run: +```bash +make inst.chisel inst.rs +``` +To clean up generated files: +```bash +make clean +``` +## Code Standards + +- **File Naming**: Follow naming conventions provided in the README. +- **Syntax**: Use correct encoding syntax. +- **Comments**: Use `#` for comments, avoiding inline comments. 
+ +To format code, run: +```bash +pre-commit run --all-files +``` +## Running Tests +Run tests to check encoding files and artifact generation before submitting a PR. + +## Submitting a Pull Request + +- **Branching**: Create a new branch for each PR. +- **Commits**: Use clear and descriptive commit messages. +- **PR Details**: + - Reference any related issues. + - Provide a summary of changes and their purpose. + - Verify that all tests pass. + +Thank you for contributing to the RISC-V community! diff --git a/README.md b/README.md index 4758aaa5..e7a0cee3 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ Artifacts like `encoding.h`, `latex-tables`, etc., from this repo are used in to 5. [Artifact Generation](#artifact-generation) 6. [Adding a New Extension](#adding-a-new-extension) 7. [Debugging](#debugging) -8. [Contributing](#contributing) +8. [Contributing.md](CONTRIBUTING.md) --- @@ -164,7 +164,6 @@ Ensure you have the required Python dependencies installed. Run the following co ```bash sudo apt-get install python3-pip -pip3 install -r requirements.txt ``` ### Generating Artifacts From a6b0b0099a17ec94960656f3bef59c878060cb6b Mon Sep 17 00:00:00 2001 From: Myrausman Date: Sun, 3 Nov 2024 19:08:45 +0500 Subject: [PATCH 4/5] improved readme Signed-off-by: Myrausman --- README.md | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index e7a0cee3..a8b68e25 100644 --- a/README.md +++ b/README.md @@ -148,7 +148,7 @@ The `parse.py` Python file is used to perform checks on the current set of instr The `parse.py` script can generate the following artifacts: -- **`instr_dict.yaml`**: Contains the main dictionary `instr_dict` in YAML format. Note that dots in instruction names are replaced with underscores in this YAML file. +- **`instr_dict.json`**: Contains the main dictionary `instr_dict`, formatted as JSON. In this file, dots in instruction names are replaced with underscores. 
Previously, this file was generated as instr_dict.yaml. Since JSON is a subset of YAML, it can still be read by any YAML parser. - **`encoding.out.h`**: A header file used by tools such as Spike, PK, etc. - **`instr-table.tex`**: LaTeX table of instructions for the RISC-V unprivileged specification. - **`priv-instr-table.tex`**: LaTeX table of instructions for the RISC-V privileged specification. @@ -267,8 +267,8 @@ You can locate the definition of an instruction using one of the following metho ``` 2. **Using `make`**: -- Run make to generate the instr_dict.yaml file. -- Open instr_dict.yaml and search for the instruction. +- Run make to generate the instr_dict.json file. +- Open instr_dict.json and search for the instruction. - The extension field in the file will indicate which file the instruction was picked from. --- @@ -295,13 +295,3 @@ level=logging.DEBUG DEBUG:: Parsing File: ./rv_i DEBUG:: Processing line: lui rd imm20 6..2=0x0D 1..0=3 ``` - ---- - -## Contributing - -If you wish to contribute to this project: - -- Open a pull request (PR) or issue. -- Ensure that all tests pass. -- Follow the repository’s coding guidelines. From ec5c00c50eabb3776438a39143ef386ee8a692d4 Mon Sep 17 00:00:00 2001 From: Myrausman Date: Sat, 23 Nov 2024 17:16:15 +0500 Subject: [PATCH 5/5] improve readme to enhance usage and contribution experience --- CONTRIBUTING.md | 208 +++++++++++++++++++++++------- README.md | 336 ++++++++++++------------------------------------ 2 files changed, 238 insertions(+), 306 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index f4d93235..3eca0cc4 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,57 +1,167 @@ -# Contributing to riscv-opcodes - -Welcome, and thank you for your interest in contributing to the `riscv-opcodes` repository! 
- -## Table of Contents -- [How to Contribute](#how-to-contribute) -- [Setting Up the Project](#setting-up-the-project) -- [Code Standards](#code-standards) -- [Running Tests](#running-tests) -- [Submitting a Pull Request](#submitting-a-pull-request) - -## How to Contribute -- **Report an Issue**: For bugs or improvement suggestions, please open an issue. -- **Submit a Pull Request (PR)**: For fixes, improvements, or new features, submit a PR. It’s recommended to connect your PR with an open issue or prior discussion. - -## Setting Up the Project -### Prerequisites -Ensure you have: -- Python 3.9+ -- Necessary Python dependencies - -### Generating Artifacts -To generate artifacts like `encoding.h` and `inst.rs`, navigate to the project root and run: -```bash -make +# **Contributing to riscv-opcodes** + +Thank you for considering contributing to the **riscv-opcodes** project! +This guide will help you understand the repository structure, coding conventions, and contribution workflow. + +--- + +## **Table of Contents** + +1. [Getting Started](#getting-started) +2. [File Naming Policy](#file-naming-policy) +3. [Encoding Syntax](#encoding-syntax) +4. [Adding a New Extension](#adding-a-new-extension) +5. [Testing and Validation](#testing-and-validation) +6. [Code of Conduct](#code-of-conduct) + +--- + +## **Getting Started** + +1. **Clone the Repository** + ```bash + git clone https://github.com//riscv-opcodes.git + cd riscv-opcodes + ``` + +2. **Set Up Dependencies** + Ensure you have Python installed (version >= 3.7). + Install dependencies using: + ```bash + pip install -r requirements.txt + ``` + +3. **Understand the Project Structure** + Familiarize yourself with the files and folders as explained in the [README.md](README.md). + +4. **Create a Branch** + Before starting your work, create a feature branch: + ```bash + git checkout -b + ``` + +--- + +## **File Naming Policy** + +- **`rv_x`**: Instructions common to both 32-bit and 64-bit modes of extension `X`. 
+- **`rv32_x`**: Instructions specific to `rv32x` (e.g., `brev8`). +- **`rv64_x`**: Instructions specific to `rv64x` (e.g., `addw`). +- **`rv_x_y`**: Instructions valid when both extensions `X` and `Y` are enabled. + - **Canonical Ordering**: Follow the RISC-V specification for ordering extensions (e.g., `rv_zknh` for `Zkn` + `Zknh`). + +- **`unratified/`**: Contains instruction encodings that are not ratified yet, following the same naming policy. + +**Note**: If an instruction belongs to multiple extensions, place the encoding in the canonically ordered first extension and use `$import` to share it. + +--- + +## **Encoding Syntax** + +### **Keywords** +- `$import`: Used for importing instructions from another extension. +- `$pseudo_op`: Used to define pseudo-instructions. + +### **Operators** +- `::`: Defines relationships between extensions and instructions. +- `..`: Defines bit ranges. + +### **Comments** +- Use `#` for comments. Inline comments are not supported. + +### **Regular Instruction Syntax** +```plaintext + ``` -To generate specific artifacts like `inst.chisel` and `inst.rs`, run: -```bash -make inst.chisel inst.rs + +### **Pseudo-Instruction Syntax** +```plaintext +$pseudo_op :: ``` -To clean up generated files: -```bash -make clean + +### **Imported Instruction Syntax** +```plaintext +$import :: ``` -## Code Standards -- **File Naming**: Follow naming conventions provided in the README. -- **Syntax**: Use correct encoding syntax. -- **Comments**: Use `#` for comments, avoiding inline comments. +--- -To format code, run: -```bash -pre-commit run --all-files -``` -## Running Tests -Run tests to check encoding files and artifact generation before submitting a PR. +## **Adding a New Extension** + +1. **Create Encoding Files** + Place your new encoding file in the `extensions/` folder. Use the [File Naming Policy](#file-naming-policy) to name your file appropriately. + +2. 
**Define Instructions** + Follow the [Encoding Syntax](#encoding-syntax) to add your instructions in the new file. + Example: + ```plaintext + lui rd imm20 6..2=0x0D 1..0=3 + ``` + +3. **Import Instructions (if needed)** + If an instruction is shared between extensions, use `$import`. + Example: + ```plaintext + $import rv_zicsr::csrrs + ``` + +4. **Update Dependencies** + If your new extension requires dependencies (e.g., artifacts), update `Makefile` or other relevant scripts. + +--- + +## **Testing and Validation** + +1. **Run the Parser** + After making changes, validate your encodings using: + ```bash + python parse.py + ``` + + Fix any errors or warnings reported by the script. + +2. **Check Artifacts** + Ensure that the artifacts generated (e.g., `encoding.h`, LaTeX tables) include your changes: + ```bash + make + ``` + +3. **Review Your Changes** + Verify that your changes are aligned with the project’s conventions and do not break existing functionality. + +--- + +## **Code of Conduct** + +By contributing to this project, you agree to adhere to the following principles: +- Be respectful and collaborative. +- Provide clear documentation for any new instructions or extensions. +- Report bugs and propose changes in a constructive manner. +- Ensure your contributions comply with the [LICENSE](LICENSE). + +--- + +## **Submitting Your Contribution** + +1. **Commit Your Changes** + Use clear and concise commit messages: + ```bash + git add . + git commit -m "Add " + ``` + +2. **Push Your Branch** + Push your branch to your forked repository: + ```bash + git push origin + ``` + +3. **Create a Pull Request** + Open a pull request from your branch to the `main` branch of this repository. -## Submitting a Pull Request +4. **Address Feedback** + Be prepared to make revisions based on reviewer feedback. -- **Branching**: Create a new branch for each PR. -- **Commits**: Use clear and descriptive commit messages. -- **PR Details**: - - Reference any related issues. 
- - Provide a summary of changes and their purpose. - - Verify that all tests pass. +--- -Thank you for contributing to the RISC-V community! +We look forward to your contributions! 😊 diff --git a/README.md b/README.md index a8b68e25..40e84c89 100644 --- a/README.md +++ b/README.md @@ -1,297 +1,119 @@ -# riscv-opcodes +# **riscv-opcodes** -This repository enumerates standard RISC-V instruction opcodes and control/status registers. It also contains a script to convert them into various formats (C, Scala, LaTeX). - -Artifacts like `encoding.h`, `latex-tables`, etc., from this repo are used in tools and projects such as Spike, PK, and the RISC-V Manual. - -## Table of Contents - -1. [Project Structure](#project-structure) -2. [File Naming Policy](#file-naming-policy) -3. [Encoding Syntax](#encoding-syntax) -4. [Usage](#usage) -5. [Artifact Generation](#artifact-generation) -6. [Adding a New Extension](#adding-a-new-extension) -7. [Debugging](#debugging) -8. [Contributing.md](CONTRIBUTING.md) +This repository enumerates standard RISC-V instruction opcodes and control/status registers. It includes tools to convert these into various formats (e.g., C, Scala, LaTeX) for use in projects like Spike, PK, and the RISC-V Manual. --- -## Project Structure - -```bash -├── constants.py # contains variables, constants and data-structures used in parse.py -├── encoding.h # the template encoding.h file -├── LICENSE # license file -├── Makefile # makefile to generate artifacts -├── parse.py # performs checks and generates artifacts -├── README.md # this file -├── rv* # instruction opcode files -└── unratified # contains unratified instruction opcode files -``` +## **Table of Contents** +1. [Introduction](#introduction) +2. [Project Structure](#project-structure) +3. [Usage](#usage) +4. [Artifact Generation](#artifact-generation) +5. [Resources](#resources) +6. [Contributing](#contributing) +7. 
[License](#license) --- -## File Naming Policy - -This project follows a specific file naming convention for instruction encodings: - -- **`rv_x`**: Instructions common to both 32-bit and 64-bit modes of extension `X`. -- **`rv32_x`**: Instructions specific to `rv32x` (e.g., `brev8`). -- **`rv64_x`**: Instructions specific to `rv64x` (e.g., `addw`). -- **`rv_x_y`**: Instructions valid when both extensions `X` and `Y` are enabled. Canonical ordering as specified by the RISC-V spec should be followed. -- **`unratified/`**: Contains instruction encodings that are not ratified yet, following the same policy. - -For instructions present in multiple extensions where the spec is vague, the encoding should be placed in the canonically ordered first extension and imported into others using `$import`. - ---- - -## Encoding Syntax - -Instruction encoding files in this project use the following syntax: - -- **Keywords**: `$import` and `$pseudo_op` are keywords used to indicate special operations. -- **Operators**: `::` defines relationships between extensions and instructions; `..` defines bit ranges. -- **Comments**: Use `#` for comments. Inline comments are not supported. - -### Instruction Categories - -1. **Regular Instructions**: Instructions with unique opcodes. - - - **Syntax**: ` ` - - **Example**: - - ```plaintext - lui rd imm20 6..2=0x0D 1..0=3 - beq bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3 - ``` - - - **Bit Encoding Types**: - - _Single Bit Assignment_: `=` - - _Range Assignment_: `..=` - -2. **Pseudo Instructions** (`$pseudo_op`): Aliases for regular instructions with restricted bit encodings. - - - **Syntax**: `$pseudo_op :: ` - - ``: Specifies the extension which contains the base instruction. - - ``: Indicates the name of the instruction this pseudo-instruction is an alias of. - - The remaining fields are the same as the regular instruction syntax, where all the arguments and the fields of the pseudo instruction are specified. 
- - **Example**: - - ```plaintext - $pseudo_op rv_zicsr::csrrs frflags rd 19..15=0 31..20=0x001 14..12=2 6..2=0x1C 1..0=3 - ``` - - - **Recommendation**: If a ratified instruction is a `$pseudo_op` of a regular unratified instruction, it is recommended to maintain this `$pseudo_op` relationship. Define the new instruction as a `$pseudo_op` of the unratified regular instruction to avoid overlapping opcodes for users experimenting with unratified extensions. - -3. **Imported Instructions** (`$import`): Instructions borrowed from another extension. - - These are instructions borrowed from an extension into a new or different extension/sub-extension. Only regular instructions can be imported. Pseudo-ops or already imported instructions cannot be imported. - - **Syntax**: `$import :: ` - - **Example**: - ```plaintext - $import rv32_zkne::aes32esmi - ``` - -### Restrictions - -- Pseudo-ops or already imported instructions cannot be imported again. -- A base instruction for a pseudo-op cannot be a pseudo-op itself. +## **Introduction** +The **riscv-opcodes** repository serves as the single source of truth for RISC-V instruction encodings and related metadata. Artifacts generated from this repository, such as `encoding.h` and LaTeX tables, are used in various RISC-V software and documentation projects. --- -## Flow for parse.py - -The `parse.py` Python file is used to perform checks on the current set of instruction encodings and also generates multiple artifacts: LaTeX tables, `encoding.h` header file, etc. This section provides a brief overview of the flow within the Python file. - -1. **Initial Setup**: - - - `parse.py` creates a list of all `rv*` files currently checked into the repo (including those inside the `unratified` directory). - - It starts parsing each file line by line. - -2. **First Pass - Regular Instructions**: - - - Capture only regular instructions and ignore imported or pseudo instructions. 
- - **Checks performed**: - - - For range-assignment syntax, the _msb_ (most significant bit) position must be higher than the _lsb_ (least significant bit) position. - - The value of the range must be representable in the space identified by _msb_ and _lsb_. - - Values for the same bit positions should not be defined multiple times. - - All bit positions must be accounted for (either as arguments or constant value fields). - - - **Dictionary Creation**: - - - Create a dictionary for each instruction with the following fields: - - `encoding`: A 32-bit string defining the encoding of the instruction. `-` is used to represent instruction argument fields. - - `extension`: String indicating which extension/filename this instruction was picked from. - - `mask`: A 32-bit hex value indicating the bits of the encodings that must be checked for legality. - - `match`: A 32-bit hex value indicating the values the encoding must take for the bits which are set as 1 in the mask. - - `variable_fields`: A list of arguments required by the instruction. - - - Add the dictionary elements to a main `instr_dict` dictionary under the instruction node. This process continues until all regular instructions have been processed. - -3. **Second Pass - Pseudo Instructions**: - - - Process `$pseudo_op` instructions. - - **Checks performed**: - - Verify if the _base-instruction_ of the pseudo instruction exists in the relevant extension/filename. - - The remaining part of the syntax undergoes the same checks as above. - - If the checks pass and the _base-instruction_ is not already added to the main `instr_dict`, then add the pseudo-instruction to the list. - -4. **Third Pass - Imported Instructions**: - - - Process imported instructions. - -5. **Special Case**: - - If the _base-instruction_ for a pseudo-instruction is not present in the main `instr_dict` after the first pass, it may be due to processing only a subset of extensions where the _base-instruction_ is not included. 
- -## Artifact Generation and Usage - -The `parse.py` script can generate the following artifacts: - -- **`instr_dict.json`**: Contains the main dictionary `instr_dict`, formatted as JSON. In this file, dots in instruction names are replaced with underscores. Previously, this file was generated as instr_dict.yaml. Since JSON is a subset of YAML, it can still be read by any YAML parser. -- **`encoding.out.h`**: A header file used by tools such as Spike, PK, etc. -- **`instr-table.tex`**: LaTeX table of instructions for the RISC-V unprivileged specification. -- **`priv-instr-table.tex`**: LaTeX table of instructions for the RISC-V privileged specification. -- **`inst.chisel`**: Chisel code for decoding instructions. -- **`inst.sverilog`**: SystemVerilog code for decoding instructions. -- **`inst.rs`**: Rust code containing mask and match variables for all instructions. -- **`inst.spinalhdl`**: SpinalHDL code for decoding instructions. -- **`inst.go`**: Go code for decoding instructions. - -### Prerequisites - -Ensure you have the required Python dependencies installed. Run the following commands: +## **Project Structure** ```bash -sudo apt-get install python3-pip +├── utils/ # Utility scripts for artifact generation +├── extensions/ # Instruction opcode files, organized by extensions +│ ├── rv* # Ratified instructions +│ └── unratified/ # Unratified instruction files +├── LICENSE # Licensing information +├── Makefile # Build script for generating artifacts +├── parse.py # Script to parse and validate encodings +└── README.md # This file ``` -### Generating Artifacts +For detailed guidelines on contributing, see the [Contributing Guidelines](CONTRIBUTING.md). -To generate all artifacts for all instructions currently checked in, run make from the root directory. 
This will produce the following output:
-
-```plaintext
-Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', '-spinalhdl', 'rv*', 'unratified/rv*']
-Extensions selected : ['rv*', 'unratified/rv*']
-INFO:: encoding.out.h generated successfully
-INFO:: inst.chisel generated successfully
-INFO:: inst.spinalhdl generated successfully
-INFO:: inst.sverilog generated successfully
-INFO:: inst.rs generated successfully
-INFO:: inst.go generated successfully
-INFO:: instr-table.tex generated successfully
-INFO:: priv-instr-table.tex generated successfully
-
-```
-
-### Selecting Specific Extensions
-
-By default, all extensions are enabled. To select a subset of extensions, modify the EXTENSIONS variable in the Makefile to include only the filenames of interest. For example, to include only the I and M extensions:
-For example if you want only the I and M extensions you can do the following:
-
-```bash
-make EXTENSIONS='rv*_i rv*_m'
-```
-
-This will produce the following output:
-
-```plaintext
- Running with args : ['./parse.py', '-c', '-go', '-chisel', '-sverilog', '-rust', '-latex', 'rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m']
- Extensions selected : ['rv32_i', 'rv64_i', 'rv_i', 'rv64_m', 'rv_m']
- INFO:: encoding.out.h generated successfully
- INFO:: inst.chisel generated successfully
- INFO:: inst.sverilog generated successfully
- INFO:: inst.rs generated successfully
- INFO:: instr-table.tex generated successfully
- INFO:: priv-instr-table.tex generated successfully
-```
-
-### Generating Specific Artifacts
-
-To generate specific artifacts, use one or more of the following targets:
-
-- `inst.c`,`rs`,`chisel`,`sverilog`, `instr-table.tex`
-
-For example, if you want to generate using Chisel and Rust, run the following command:
-
-```bash
- make inst.chisel inst.rs
-```
+---
-### Cleaning Up
+## **Usage**
-To remove all generated artifacts, use the `clean` target:
+### **Generating Artifacts**
+1. Clone this repository:
+ ```bash
+ git clone https://github.com//riscv-opcodes.git
+ cd riscv-opcodes
+ ```
+2. Run the following command to generate artifacts:
+ ```bash
+ make
+ ```
-```bash
-make clean
-```
+Generated artifacts (e.g., `encoding.h`, LaTeX tables) will appear in the output directory.
 ---
-## Adding a New Extension
+## **Artifact Generation**
-To add a new extension of instructions, follow these steps:
+The following artifacts are generated using `parse.py` or `make`:
-1. **Create the Extension File**:
+| Artifact | Description |
+|-------------------------|-----------------------------------------------------------------------------|
+| `instr_dict.json` | Contains the full dictionary of instruction encodings in JSON format. |
+| `encoding.out.h` | Header file used by tools like Spike and PK. |
+| `instr-table.tex` | LaTeX table for the RISC-V unprivileged spec. |
+| `priv-instr-table.tex` | LaTeX table for the RISC-V privileged spec. |
+| `inst.chisel` | Chisel code for instruction decoding. |
+| `inst.sverilog` | System Verilog code for instruction decoding. |
+| `inst.rs` | Rust code with mask and match variables. |
+| `inst.spinalhdl` | SpinalHDL code for instruction decoding. |
+| `inst.go` | Go code for instruction decoding. |
-- Create a new `rv*` file according to the policy defined in the [File Structure](#file-naming-policy).
+### **Commands**
-2. **Run Checks and Generate Artifacts**:
-
-- From the root directory, run the `make` command to ensure that all checks pass and that all artifacts are generated correctly.
-- A successful run will produce the following output:
-
- ```plaintext
- Running with args : ['./parse.py', '-c', '-chisel', '-sverilog', '-rust', '-latex', 'rv*', 'unratified/rv*']
- Extensions selected : ['rv*', 'unratified/rv*']
- INFO:: encoding.out.h generated successfully
- INFO:: inst.chisel generated successfully
- INFO:: inst.sverilog generated successfully
- INFO:: inst.rs generated successfully
- INFO:: instr-table.tex generated successfully
- INFO:: priv-instr-table.tex generated successfully
+- **Generate All Artifacts**:
+ ```bash
+ make
  ```
-3. **Submit for Review**:
-
- - Create a pull request (PR) to submit your changes for review.
+- **Generate Specific Artifacts**:
+ To generate specific artifacts, modify the `EXTENSIONS` variable in the `Makefile` or use:
+ ```bash
+ make EXTENSIONS='rv*_i rv*_m'
+ ```
-## Ensure you follow these steps carefully to integrate the new extension properly.
+- **Generate Using Specific Targets**:
+ ```bash
+ ./parse.py -c EXTENSIONS='rv*_i rv*_m'
+ ```
-## How do I find where an instruction is defined?
+- **Clean Artifacts**:
+ To remove all generated artifacts, run:
+ ```bash
+ make clean
+ ```
-You can locate the definition of an instruction using one of the following methods:
+---
-1. **Using `grep`**:
- ```bash
- grep "^\s*" rv* unratified/rv*
- ```
-2. **Using `make`**:
+## **Resources**
-- Run make to generate the instr_dict.json file.
-- Open instr_dict.json and search for the instruction.
-- The extension field in the file will indicate which file the instruction was picked from.
+- [RISC-V Official Documentation](https://riscv.org/specifications/)
+- [Spike RISC-V Simulator](https://github.com/riscv-software-src/riscv-isa-sim)
+- [RISC-V Foundation](https://riscv.org/)
 ---
-## Debugging
-
-To enable debug logs in parse.py:
-
-1. Modify the logging level in parse.py:
+## **Contributing**
-```python
-level=logging.INFO
-```
+We welcome contributions! Please read the [Contributing Guidelines](CONTRIBUTING.md) before submitting pull requests.
-Change it to:
+If you're adding a new instruction, extension, or artifact:
+- Follow the **File Naming Policy** and **Encoding Syntax** guidelines in the [Contributing Guidelines](CONTRIBUTING.md).
+- Run the `parse.py` script to validate your changes.
-```python
-level=logging.DEBUG
-```
+---
-2. Example debug output:
+## **License**
-```bash
-DEBUG:: Parsing File: ./rv_i
-DEBUG:: Processing line: lui rd imm20 6..2=0x0D 1..0=3
-```
+This repository is licensed under the [BSD-3-Clause License](LICENSE).