Adding an Instruction to the GNU Assembler


Binutils is a huge piece of code, and new users can often feel lost and out of their depth when navigating it alone. To help ease the shock, in this post we’ll look at the simplest possible change, adding a new basic instruction to an already defined extension, and at how to add a corresponding GNU Assembler (GAS) test.

While the examples and files given are all RISC-V specific, the information is transferable to other architecture ports, although the tables and structures may differ. More information can be found through the binutils project page.

This blog post accompanies our talk at BCS OSSG on 15th October 2020.

To add a simple new instruction, we just need to modify the encoding table in the opcodes library and add tests that use the new instruction.

Modifying Opcode Library Encoding Table

The encoding table for the opcodes library is found in opcodes/riscv-opc.c, in the array riscv_opcodes. An example entry for the multiply instruction mul is shown below. This instruction is part of the M extension. It multiplies two source registers (operand format types s and t) and saves the result in a destination register (operand format type d).

{"mul", 0, INSN_CLASS_M, "d,s,t", MATCH_MUL, MASK_MUL, match_opcode, 0 }

The fields are:

  1. Name of the instruction
  2. Xlen: Use 32 if the instruction exists only in RV32, 64 if it exists only in RV64, or 0 if it is available in both.
  3. Instruction class: The extension the instruction belongs to. A list of possibilities can be found in the enum riscv_insn_class in include/opcode/riscv.h.
  4. Instruction operands: Defines the operands of the instruction. The RISC-V port already defines many operand types, which you can find in gas/config/tc-riscv.c. Custom operands can be used, but they require additional steps which are not covered in this blog post.
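  5. Match and
  6. Mask: Both are defined in include/opcode/riscv-opc.h and are used to pick the instruction out of the bit stream. The mask selects the bits that are fixed for this instruction, and the match gives the values those bits must have.
  7. Match function: The function that applies the match and mask to an encoding. The generic RISC-V function is match_opcode, which uses the logic: (encoding & MASK) == MATCH.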
  8. Pinfo: A collection of bits describing the instruction, notably any relevant hazard information. For example, it can mark the instruction as an alias, a macro or a branch instruction.
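To make the match and mask fields more concrete, here is a minimal, self-contained sketch in C. It is not the binutils code: struct toy_opcode, toy_match, toy_encode and toy_lookup are names made up for this post, and the structure keeps only the fields needed for matching. The MUL values are the ones from include/opcode/riscv-opc.h. The foo instruction is purely hypothetical and is placed in the custom-0 opcode space (major opcode 0x0b), which the RISC-V specification reserves for custom extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct riscv_opcode. Only the fields needed
   for matching are kept; the real structure has more. */
struct toy_opcode {
  const char *name;
  const char *args;   /* operand format string, e.g. "d,s,t" */
  uint32_t match;     /* values of the fixed bits */
  uint32_t mask;      /* which bits are fixed */
};

/* MUL values as defined in include/opcode/riscv-opc.h. */
#define MATCH_MUL 0x02000033u
#define MASK_MUL  0xfe00707fu

/* HYPOTHETICAL instruction "foo": an R-type instruction in the
   custom-0 opcode space (opcode 0x0b) with funct3 = 0 and funct7 = 0.
   It exists only for this example. */
#define MATCH_FOO 0x0000000bu
#define MASK_FOO  0xfe00707fu

const struct toy_opcode toy_opcodes[] = {
  { "mul", "d,s,t", MATCH_MUL, MASK_MUL },
  { "foo", "d,s,t", MATCH_FOO, MASK_FOO },  /* hypothetical entry */
};

/* The test described in the field list: (encoding & MASK) == MATCH. */
int toy_match(const struct toy_opcode *op, uint32_t insn)
{
  assert(op != NULL);
  return (insn & op->mask) == op->match;
}

/* Build an encoding for a "d,s,t" instruction: d goes to rd (bits 11:7),
   s to rs1 (bits 19:15) and t to rs2 (bits 24:20). */
uint32_t toy_encode(const struct toy_opcode *op,
                    unsigned rd, unsigned rs1, unsigned rs2)
{
  assert(op != NULL && rd < 32 && rs1 < 32 && rs2 < 32);
  return op->match
         | ((uint32_t) rd << 7)
         | ((uint32_t) rs1 << 15)
         | ((uint32_t) rs2 << 20);
}

/* Return the name of the first table entry that matches INSN,
   or NULL if none does. */
const char *toy_lookup(uint32_t insn)
{
  size_t i;
  for (i = 0; i < sizeof toy_opcodes / sizeof toy_opcodes[0]; i++)
    if (toy_match(&toy_opcodes[i], insn))
      return toy_opcodes[i].name;
  return NULL;
}
```

With this sketch, toy_encode(&toy_opcodes[0], 10, 11, 12) gives 0x02c58533, the encoding of mul a0,a1,a2 (a0, a1 and a2 are registers x10, x11 and x12), and toy_lookup on that value returns "mul". Because the mask covers only the opcode, funct3 and funct7 bits, the register fields are free to vary without affecting the match.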

Testing

To test the assembler, Binutils has the GAS testsuite. Each test assembles the given input, disassembles the result, and matches the produced disassembly against the expected disassembly. Since the comparison is done by pattern matching, it is possible to use regular expressions in the expected output.

To add a test for RISC-V, add an assembly file (suffix .s), a disassembly file (suffix .d) and, if the test is expected to fail, a failure file (suffix .l) to gas/testsuite/gas/riscv. The files belonging to the same test must all share the same base name.

The disassembly file also defines the options to be passed to the assembler and disassembler to generate the desired output, using header lines such as #as: and #objdump:.
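To illustrate why regular expressions are useful in the expected output, here is a small sketch in C of matching one produced disassembly line against one expected pattern. It uses POSIX extended regular expressions from regex.h as a stand-in; the real testsuite has its own matcher, and both the function name line_matches and the choice to anchor the pattern to the whole line are assumptions of this sketch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if LINE matches PATTERN in full, 0 if it does not, and -1 if
   the pattern is too long or does not compile. POSIX extended regular
   expressions are used here as a stand-in for the testsuite's matcher. */
int line_matches(const char *pattern, const char *line)
{
  char anchored[512];
  regex_t re;
  int n, rc;

  assert(pattern != NULL && line != NULL);

  /* Anchor the pattern so that it has to cover the whole line. */
  n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
  if (n < 0 || (size_t) n >= sizeof anchored)
    return -1;

  if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec(&re, line, 0, NULL, 0);
  regfree(&re);
  return rc == 0;
}
```

For example, the pattern "[ \t]+0:[ \t]+02c58533[ \t]+mul[ \t]+a0,a1,a2" accepts a line for mul a0,a1,a2 at address 0 however the disassembler happens to pad the columns with spaces and tabs, but it rejects a line that shows a different encoding or mnemonic.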

Building

To build just the assembler, use the commands:

mkdir -p ${BUILDPREFIX}/binutils-gdb
cd ${BUILDPREFIX}/binutils-gdb 
../../binutils-gdb/configure \
     --target=riscv32-unknown-elf \
     --prefix=${INSTALLPREFIX}
make all-gas

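BUILDPREFIX and INSTALLPREFIX define your chosen build and install directories respectively. The relative path to configure assumes that the binutils-gdb source tree sits next to the BUILDPREFIX directory, so adjust it if your sources live elsewhere.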

Test the assembler using:

make check-gas

If all tests pass, install the assembler using:

make install-gas

Using the Assembler

Run the assembler with the following command, where test.s is your own assembly file.

riscv32-unknown-elf-as -march=rv32im test.s

Note the use of -march=rv32im to tell the assembler to use instructions from the 32-bit RISC-V base integer instruction set with the M (multiply and divide) extension. The value can be any valid RISC-V architecture string.
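As a rough illustration of how the architecture string relates to the Xlen and instruction class fields of the encoding table, here is a deliberately simplified sketch in C. It is not how GAS parses -march: the functions march_xlen, march_has_ext and xlen_allows are hypothetical names, only single-letter extensions written directly after rv32 or rv64 are recognised, and shorthand such as g is not expanded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 32 or 64 for strings starting with "rv32" or "rv64",
   or 0 if the string has neither prefix. */
int march_xlen(const char *march)
{
  assert(march != NULL);
  if (strncmp(march, "rv32", 4) == 0)
    return 32;
  if (strncmp(march, "rv64", 4) == 0)
    return 64;
  return 0;
}

/* Return 1 if the single-letter extension EXT appears in the run of
   letters after the XLEN. Scanning stops at an underscore or at a
   letter that starts a multi-letter extension name (z, s or x). */
int march_has_ext(const char *march, char ext)
{
  const char *p;

  assert(march != NULL);
  if (march_xlen(march) == 0)
    return 0;
  for (p = march + 4; *p != '\0'; p++) {
    if (*p == '_' || *p == 'z' || *p == 's' || *p == 'x')
      break;
    if (*p == ext)
      return 1;
  }
  return 0;
}

/* Mirror of the Xlen field in the encoding table: 0 means the
   instruction is available for both RV32 and RV64. */
int xlen_allows(int insn_xlen, int target_xlen)
{
  return insn_xlen == 0 || insn_xlen == target_xlen;
}
```

Under these assumptions, march_has_ext("rv32im", 'm') is 1, so an instruction of class M such as mul would be accepted, while march_has_ext("rv32i", 'm') is 0. Since the mul entry has an Xlen of 0, xlen_allows(0, march_xlen("rv32im")) is also 1.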

Conclusion

This has been a brief summary of the minimal steps needed to add an instruction to the GNU assembler. Our talk on the 15th October 2020 went into more detail. If you missed the talk, it is available on our YouTube channel. Hopefully, you feel slightly more confident with adding a simple instruction to RISC-V.

We’ll be going into the subject of extending the entire GNU compiler in more depth at the London RISC-V meetup on Monday 19th November 2020, where you will have an opportunity to ask questions.