[RISCV][MC] Support Assembling 48- and 64-bit Instructions #110022

Merged (8 commits) on Oct 8, 2024
14 changes: 14 additions & 0 deletions llvm/docs/RISCVUsage.rst
@@ -426,6 +426,20 @@ line. This currently applies to the following extensions:

No extensions have experimental intrinsics.

Long (>32-bit) Instruction Support
==================================

RISC-V is a variable-length ISA, but the standard currently only defines 16- and 32-bit instructions. The specification describes longer instruction encodings, but these are not ratified.

The LLVM disassembler, `llvm-objdump`, does use the longer instruction encodings described in the specification to guess the instruction length (up to 176 bits) and will group the disassembly view of encoding bytes correspondingly.

The LLVM integrated assembler for RISC-V supports two different kinds of ``.insn`` directive, for assembling instructions that LLVM does not yet support:

* ``.insn type, args*`` which takes a known instruction type, and a list of fields. You are strongly recommended to use this variant of the directive if your instruction fits an existing instruction type.
* ``.insn [ length , ] encoding`` which takes an (optional) explicit length (in bytes) and a raw encoding for the instruction. When given an explicit length, this variant can encode instructions up to 64 bits long. The encoding part of the directive must be given all bits for the instruction, none are filled in for the user. When used without the optional length, this variant of the directive will use the LSBs of the raw encoding to work out if an instruction is 16 or 32 bits long. LLVM does not infer that an instruction might be longer than 32 bits - in this case, the user must give the length explicitly.

It is strongly recommended to use the ``.insn`` directive for assembling unsupported instructions instead of ``.word`` or ``.hword``, because it will produce the correct mapping symbols to mark the word as an instruction, not data.
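
As an illustration of the length rule described above (this is not part of the patch), here is a minimal C++ sketch of how a length can be derived from the two least-significant bits of a raw encoding when ``.insn`` is given no explicit length. The function name `derivedLengthInBytes` is hypothetical; the rule mirrors the `EncodingDerivedLength` computation in the parser change of this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns the instruction length in
// bytes that is inferred when `.insn` is given only a raw encoding.
// Encodings whose two low bits are 0b11 are treated as 32-bit; everything
// else is treated as 16-bit. Longer lengths are never inferred.
constexpr unsigned derivedLengthInBytes(uint64_t Encoding) {
  return (Encoding & 0b11) == 0b11 ? 4 : 2;
}

// Compile-time checks of the rule on two simple encodings.
static_assert(derivedLengthInBytes(0x13) == 4, "low bits 0b11 select 32 bits");
static_assert(derivedLengthInBytes(0x0010) == 2, "low bits 0b00 select 16 bits");
```

Under this rule `.insn 0xffdf` and `.insn 0xffbf` are both assembled as 32-bit instructions, which agrees with the `.insn 0x4, ...` output expected in `llvm/test/MC/RISCV/insn.s`. A 48- or 64-bit instruction only results when the length is written explicitly.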

Global Pointer (GP) Relaxation and the Small Data Limit
=======================================================

68 changes: 56 additions & 12 deletions llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -41,6 +41,7 @@
#include "llvm/TargetParser/RISCVISAInfo.h"

#include <limits>
#include <optional>

using namespace llvm;

@@ -707,6 +708,8 @@ struct RISCVOperand final : public MCParsedAsmOperand {
bool isUImm16() const { return IsUImm<16>(); }
bool isUImm20() const { return IsUImm<20>(); }
bool isUImm32() const { return IsUImm<32>(); }
bool isUImm48() const { return IsUImm<48>(); }
bool isUImm64() const { return IsUImm<64>(); }

bool isUImm8GE32() const {
int64_t Imm;
@@ -3146,8 +3149,8 @@ bool RISCVAsmParser::parseDirectiveInsn(SMLoc L) {
 StringRef Format;
 SMLoc ErrorLoc = Parser.getTok().getLoc();
 if (Parser.parseIdentifier(Format)) {
-// Try parsing .insn [length], value
-int64_t Length = 0;
+// Try parsing .insn [ length , ] value
+std::optional<int64_t> Length;
 int64_t Value = 0;
 if (Parser.parseIntToken(
 Value, "expected instruction format or an integer constant"))
@@ -3156,25 +3159,66 @@ bool RISCVAsmParser::parseDirectiveInsn(SMLoc L) {
 Length = Value;
 if (Parser.parseIntToken(Value, "expected an integer constant"))
 return true;
+
+if (*Length == 0 || (*Length % 2) != 0)
+return Error(ErrorLoc,
+"instruction lengths must be a non-zero multiple of two");
+
+// TODO: Support Instructions > 64 bits.
+if (*Length > 8)
+return Error(ErrorLoc,
+"instruction lengths over 64 bits are not supported");
 }
+
+// We only derive a length from the encoding for 16- and 32-bit
+// instructions, as the encodings for longer instructions are not frozen in
+// the spec.
+int64_t EncodingDerivedLength = ((Value & 0b11) == 0b11) ? 4 : 2;
+
+if (Length) {
+// Only check the length against the encoding if the length is present and
+// could match
+if ((*Length <= 4) && (*Length != EncodingDerivedLength))
+return Error(ErrorLoc,
+"instruction length does not match the encoding");
+
+if (!isUIntN(*Length * 8, Value))
+return Error(ErrorLoc, "encoding value does not fit into instruction");
+} else {
+if (!isUIntN(EncodingDerivedLength * 8, Value))
+return Error(ErrorLoc, "encoding value does not fit into instruction");
+}

-// TODO: Add support for long instructions
-int64_t RealLength = (Value & 3) == 3 ? 4 : 2;
-if (!isUIntN(RealLength * 8, Value))
-return Error(ErrorLoc, "invalid operand for instruction");
-if (RealLength == 2 && !AllowC)
+if (!AllowC && (EncodingDerivedLength == 2))
 return Error(ErrorLoc, "compressed instructions are not allowed");
-if (Length != 0 && Length != RealLength)
-return Error(ErrorLoc, "instruction length mismatch");

 if (getParser().parseEOL("invalid operand for instruction")) {
 getParser().eatToEndOfStatement();
 return true;
 }

-emitToStreamer(getStreamer(), MCInstBuilder(RealLength == 2 ? RISCV::Insn16
-: RISCV::Insn32)
-.addImm(Value));
+unsigned Opcode;
+if (Length) {
+switch (*Length) {
+case 2:
+Opcode = RISCV::Insn16;
+break;
+case 4:
+Opcode = RISCV::Insn32;
+break;
+case 6:
+Opcode = RISCV::Insn48;
+break;
+case 8:
+Opcode = RISCV::Insn64;
+break;
+default:
+llvm_unreachable("Error should have already been emitted");
+}
+} else
+Opcode = (EncodingDerivedLength == 2) ? RISCV::Insn16 : RISCV::Insn32;
+
+emitToStreamer(getStreamer(), MCInstBuilder(Opcode).addImm(Value));
 return false;
 }
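
The order of the checks in the new `parseDirectiveInsn` code can be easier to follow in isolation. The following is a standalone sketch, not the LLVM implementation: `checkRawInsn` and `fitsInBits` are hypothetical names, `fitsInBits` is only a stand-in for `llvm::isUIntN`, and the function returns the diagnostic text instead of reporting through the parser. It assumes the length and encoding have already been parsed as integers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Stand-in for llvm::isUIntN (an assumption of this sketch): true if Value
// fits in an unsigned field that is Bits wide.
constexpr bool fitsInBits(unsigned Bits, uint64_t Value) {
  return Bits >= 64 || Value < (uint64_t{1} << Bits);
}

// Hypothetical model of the checks applied to `.insn [ length , ] encoding`.
// Returns the diagnostic text, or an empty string if the input is accepted.
std::string checkRawInsn(std::optional<int64_t> Length, uint64_t Value,
                         bool AllowC) {
  if (Length) {
    if (*Length == 0 || *Length % 2 != 0)
      return "instruction lengths must be a non-zero multiple of two";
    if (*Length > 8)
      return "instruction lengths over 64 bits are not supported";
  }

  // Only 16- and 32-bit lengths are derived from the encoding itself.
  const int64_t Derived = (Value & 0b11) == 0b11 ? 4 : 2;

  // An explicit length of 2 or 4 must agree with the derived length.
  if (Length && *Length <= 4 && *Length != Derived)
    return "instruction length does not match the encoding";

  const int64_t Effective = Length.value_or(Derived);
  assert(Effective >= 2 && Effective <= 8 && "length was validated above");
  if (!fitsInBits(static_cast<unsigned>(Effective * 8), Value))
    return "encoding value does not fit into instruction";

  // This check uses the derived length, even when an explicit length is given.
  if (!AllowC && Derived == 2)
    return "compressed instructions are not allowed";

  return {};
}
```

One consequence of this ordering is that, when compressed instructions are disabled, an explicit 48- or 64-bit encoding whose two low bits are not `0b11` is still rejected with "compressed instructions are not allowed". The `.insn 0x6, 0x000000000001` and `.insn 0x8, 0x0000000000000001` cases in `insn-invalid.s` exercise exactly that path.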

2 changes: 2 additions & 0 deletions llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
@@ -309,6 +309,8 @@ enum OperandType : unsigned {
OPERAND_UIMM12,
OPERAND_UIMM16,
OPERAND_UIMM32,
OPERAND_UIMM48,
OPERAND_UIMM64,
OPERAND_ZERO,
OPERAND_SIMM5,
OPERAND_SIMM5_PLUS1,
15 changes: 15 additions & 0 deletions llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -355,6 +355,21 @@ void RISCVMCCodeEmitter::encodeInstruction(const MCInst &MI,
support::endian::write(CB, Bits, llvm::endianness::little);
break;
}
case 6: {
uint64_t Bits = getBinaryCodeForInstr(MI, Fixups, STI) & 0xffff'ffff'ffffu;
SmallVector<char, 8> Encoding;
support::endian::write(Encoding, Bits, llvm::endianness::little);
assert(Encoding[6] == 0 && Encoding[7] == 0 &&
"Unexpected encoding for 48-bit instruction");
Encoding.truncate(6);
CB.append(Encoding);
break;
}
case 8: {
uint64_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);
support::endian::write(CB, Bits, llvm::endianness::little);
break;
}
}

++MCNumEmitted; // Keep track of the # of mi's emitted.
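
The 48-bit case above writes a full 64-bit little-endian value into a scratch buffer and then truncates it to six bytes. As a rough standalone illustration of the resulting byte layout, the sketch below uses a plain byte loop instead of `support::endian::write`. This is a swapped-in technique, and `emitLittleEndian` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: serialises the low `Size` bytes of `Bits` in
// little-endian order, least-significant byte first.
std::vector<uint8_t> emitLittleEndian(uint64_t Bits, unsigned Size) {
  assert(Size <= 8 && "at most 64 bits are supported in this sketch");
  std::vector<uint8_t> Bytes;
  Bytes.reserve(Size);
  for (unsigned I = 0; I < Size; ++I)
    Bytes.push_back(static_cast<uint8_t>(Bits >> (8 * I)));
  return Bytes;
}
```

For example, `emitLittleEndian(0x1f, 6)` yields the bytes `1f 00 00 00 00 00`, which is the encoding that `insn.s` expects for `.insn 6, 0x1f`.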
16 changes: 16 additions & 0 deletions llvm/lib/Target/RISCV/RISCVInstrFormats.td
@@ -266,6 +266,22 @@ class RVInst<dag outs, dag ins, string opcodestr, string argstr,
let Size = 4;
}

class RVInst48<dag outs, dag ins, string opcodestr, string argstr,
list<dag> pattern, InstFormat format>
: RVInstCommon<outs, ins, opcodestr, argstr, pattern, format> {
field bits<48> Inst;
field bits<48> SoftFail = 0;
let Size = 6;
}

class RVInst64<dag outs, dag ins, string opcodestr, string argstr,
list<dag> pattern, InstFormat format>
: RVInstCommon<outs, ins, opcodestr, argstr, pattern, format> {
field bits<64> Inst;
field bits<64> SoftFail = 0;
let Size = 8;
}

// Pseudo instructions
class Pseudo<dag outs, dag ins, list<dag> pattern, string opcodestr = "", string argstr = "">
: RVInst<outs, ins, opcodestr, argstr, pattern, InstFormatPseudo> {
12 changes: 12 additions & 0 deletions llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -235,6 +235,8 @@ def uimm7 : RISCVUImmOp<7>;
def uimm8 : RISCVUImmOp<8>;
def uimm16 : RISCVUImmOp<16>;
def uimm32 : RISCVUImmOp<32>;
def uimm48 : RISCVUImmOp<48>;
def uimm64 : RISCVUImmOp<64>;
def simm12 : RISCVSImmLeafOp<12> {
let MCOperandPredicate = [{
int64_t Imm;
@@ -1135,6 +1137,16 @@ def Insn32 : RVInst<(outs), (ins uimm32:$value), "", "", [], InstFormatOther> {
let Inst{31-0} = value;
let AsmString = ".insn 0x4, $value";
}
def Insn48 : RVInst48<(outs), (ins uimm48:$value), "", "", [], InstFormatOther> {
bits<48> value;
let Inst{47-0} = value;
let AsmString = ".insn 0x6, $value";
}
def Insn64 : RVInst64<(outs), (ins uimm64:$value), "", "", [], InstFormatOther> {
bits<64> value;
let Inst{63-0} = value;
let AsmString = ".insn 0x8, $value";
}
}

// Use InstAliases to match these so that we can combine the insn and format
30 changes: 26 additions & 4 deletions llvm/test/MC/RISCV/insn-invalid.s
@@ -26,8 +26,30 @@

.insn . # CHECK: :[[@LINE]]:7: error: expected instruction format or an integer constant
.insn 0x2, # CHECK: :[[@LINE]]:12: error: expected an integer constant
.insn 0x2, 0xffff # CHECK: :[[@LINE]]:7: error: instruction length mismatch
.insn 0x2, 0xffffffff # CHECK: :[[@LINE]]:7: error: instruction length mismatch
.insn 0xffffffffff # CHECK: :[[@LINE]]:7: error: invalid operand for instruction
.insn 0x0010 # CHECK: :[[@LINE]]:7: error: compressed instructions are not allowed

.insn 0x4, 0x13, 0 # CHECK: :[[@LINE]]:16: error: invalid operand for instruction

.insn 0x2, 0xffff # CHECK: :[[@LINE]]:7: error: instruction length does not match the encoding
.insn 0x2, 0xffffffff # CHECK: :[[@LINE]]:7: error: instruction length does not match the encoding
.insn 0xffffffffff # CHECK: :[[@LINE]]:7: error: encoding value does not fit into instruction

.insn 0x0, 0x0 # CHECK: :[[@LINE]]:7: error: instruction lengths must be a non-zero multiple of two
.insn 0x1, 0xff # CHECK: :[[@LINE]]:7: error: instruction lengths must be a non-zero multiple of two
.insn 10, 0x000007f # CHECK: :[[@LINE]]:7: error: instruction lengths over 64 bits are not supported

.insn 0x2, 0x03 # CHECK: :[[@LINE]]:7: error: instruction length does not match the encoding
.insn 0x2, 0x1f # CHECK: :[[@LINE]]:7: error: instruction length does not match the encoding
.insn 0x2, 0x3f # CHECK: :[[@LINE]]:7: error: instruction length does not match the encoding

.insn 0x4, 0x00000001 # CHECK: :[[@LINE]]:7: error: instruction length does not match the encoding

.insn 0x6, 0x000000000001 # CHECK: :[[@LINE]]:7: error: compressed instructions are not allowed
.insn 0x8, 0x0000000000000001 # CHECK: :[[@LINE]]:7: error: compressed instructions are not allowed

.insn 0x2, 0x10001 # CHECK: :[[@LINE]]:7: error: encoding value does not fit into instruction
.insn 0x4, 0x100000003 # CHECK: :[[@LINE]]:7: error: encoding value does not fit into instruction
.insn 0x6, 0x100000000001f # CHECK: :[[@LINE]]:7: error: encoding value does not fit into instruction
.insn 0x8, 0x1000000000000003f # CHECK: :[[@LINE]]:12: error: expected an integer constant

.insn 0x0010 # CHECK: :[[@LINE]]:7: error: compressed instructions are not allowed
.insn 0x2, 0x0001 # CHECK: :[[@LINE]]:7: error: compressed instructions are not allowed
20 changes: 20 additions & 0 deletions llvm/test/MC/RISCV/insn.s
@@ -164,3 +164,23 @@ target:
# CHECK-ASM: encoding: [0x13,0x00,0x00,0x00]
# CHECK-OBJ: addi zero, zero, 0x0
.insn 0x4, 0x13

# CHECK-ASM: .insn 0x6, 31
# CHECK-ASM: encoding: [0x1f,0x00,0x00,0x00,0x00,0x00]
# CHECK-OBJ: <unknown>
.insn 6, 0x1f

# CHECK-ASM: .insn 0x4, 65503
# CHECK-ASM: encoding: [0xdf,0xff,0x00,0x00]
# CHECK-OBJ: <unknown>
.insn 0xffdf

# CHECK-ASM: .insn 0x8, 63
# CHECK-ASM: encoding: [0x3f,0x00,0x00,0x00,0x00,0x00,0x00,0x00]
# CHECK-OBJ: <unknown>
.insn 8, 0x3f

# CHECK-ASM: .insn 0x4, 65471
# CHECK-ASM: encoding: [0xbf,0xff,0x00,0x00]
# CHECK-OBJ: <unknown>
.insn 0xffbf
2 changes: 1 addition & 1 deletion llvm/test/MC/RISCV/insn_c-invalid.s
@@ -24,4 +24,4 @@
 ## Make fake mnemonics we use to match these in the tablegened asm match table isn't exposed.
 .insn_cr 2, 9, a0, a1 # CHECK: :[[#@LINE]]:1: error: unknown directive

-.insn 0xfffffff0 # CHECK: :[[@LINE]]:7: error: invalid operand for instruction
+.insn 0xfffffff0 # CHECK: :[[@LINE]]:7: error: encoding value does not fit into instruction