RISC-V: Add .insn support

Message ID CA+yXCZBQzKaayVkTp5op_oniqhW5GZodTwvRZYEcA8oBb9VTQA@mail.gmail.com
State New
Series
  • RISC-V: Add .insn support

Commit Message

Kito Cheng March 7, 2018, 9:15 a.m.
This patch makes the RISC-V assembler support a new directive, .insn.  It
lets you write an instruction in an alternate form, just like s/390's .insn
directive [1].

The main purpose of this directive is to make it easier to add new
instructions for experimentation without modifying binutils; it is much
more usable than encoding an instruction with .word.


[1] https://sourceware.org/binutils/docs-2.30/as/s390-Directives.html#index-_002einsn-directive_002c-s390
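
To illustrate what the directive saves over .word: the example from the
documentation in this patch, ".insn r 0x33, 0, 0, a0, a1, a2", names the
R-type fields one by one, whereas .word needs the whole word packed by hand.
A minimal C sketch of that packing, following the R-type layout in the ISA
manual. The helper encode_r_type is a hypothetical name used for illustration
only and is not part of binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of binutils): pack the fields of a
   32-bit R-type instruction as laid out in the RISC-V ISA manual.  */
uint32_t
encode_r_type (uint32_t opcode, uint32_t funct3, uint32_t funct7,
               uint32_t rd, uint32_t rs1, uint32_t rs2)
{
  /* Each field must fit its width.  */
  assert (opcode < 128 && funct3 < 8 && funct7 < 128);
  assert (rd < 32 && rs1 < 32 && rs2 < 32);

  return (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
         | (funct3 << 12) | (rd << 7) | opcode;
}
```

With a0 = x10, a1 = x11 and a2 = x12, encode_r_type (0x33, 0, 0, 10, 11, 12)
gives 0x00c58533, the value one would otherwise have to write as
".word 0x00c58533" for "add a0, a1, a2".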

Comments

Andrew Waterman March 7, 2018, 10:04 a.m. | #1
Hi Kito,

Thanks for contributing this patch.  I did not thoroughly review the
code, but I like the approach.  Jim or Palmer will probably follow up
with additional comments.

I noticed a typo: major opcode 0x6f should be named JAL, not JAR.
(The test case and the documentation also have this typo.)

Andrew

On Wed, Mar 7, 2018 at 1:15 AM, Kito Cheng <kito.cheng@gmail.com> wrote:
> This patch make  RISC-V assembler support new directive: .insn, it
> able to write instruction with another form just like s/390's .insn
> directive.
>
> Main purpose of this directive is to make people easier to add new
> instruction for experimentation without modify binutils, and it's much
> usable than just use .word to encode instruction.
>
>
> [1] https://sourceware.org/binutils/docs-2.30/as/s390-Directives.html#index-_002einsn-directive_002c-s390
Jim Wilson March 7, 2018, 6:21 p.m. | #2
On Wed, Mar 7, 2018 at 2:04 AM, Andrew Waterman <andrew@sifive.com> wrote:
> Thanks for contributing this patch.  I did not thoroughly review the
> code, but I like the approach.  Jim or Palmer will probably follow up
> with additional comments.


I looked at the previous version of the patch a month or so ago, and
the only curious thing I noticed is that the 4-operand instruction
pattern has type I, but in the ISA these are only used for FP
instructions.  I'm not sure if this can be fixed though, since we
don't have any category that covers all FP extensions: F, D, Q.  Plus
someone might want to try using a 4-operand instruction with integer
operands so it is probably reasonable to allow that.

My schedule is very hectic at the moment, as I'm in the middle of
moving to a new home closer to work.  I should be able to find time to
look at this new version of the patch sometime soon.

Jim
Andrew Waterman March 7, 2018, 6:24 p.m. | #3
I think permitting 4-operand integer instructions is fine.

On Wed, Mar 7, 2018 at 10:21 AM, Jim Wilson <jimw@sifive.com> wrote:
> On Wed, Mar 7, 2018 at 2:04 AM, Andrew Waterman <andrew@sifive.com> wrote:
>> Thanks for contributing this patch.  I did not thoroughly review the
>> code, but I like the approach.  Jim or Palmer will probably follow up
>> with additional comments.
>
> I looked at the previous version of the patch a month or so ago, and
> the only curious thing I noticed is that the 4-operand instruction
> pattern has type I, but in the ISA these are only used for FP
> instructions.  I'm not sure if this can be fixed though, since we
> don't have any category that covers all FP extensions: F, D, Q.  Plus
> someone might want to try using a 4-operand instruction with integer
> operands so it is probably reasonable to allow that.
>
> My schedule is very hectic at the moment, as I'm in the middle of
> moving to a new home closer to work.  I should be able to find time to
> look at this new version of the patch sometime soon.
>
> Jim
Kito Cheng March 8, 2018, 4:41 p.m. | #4
Oops, I removed 4-operand format support in this version, but I guess it
might be useful for ISA exploration.  I'll restore it, named the r4
format, in the next version.

I don't support any floating-point instructions in the .insn directive yet;
one reason is that there is no instruction format name defined in the ISA
spec.
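
For reference, the four-operand (R4) layout in the ISA manual puts rs3 in
bits 31:27 and a 2-bit field in bits 26:25.  A sketch of that packing in C.
The name encode_r4_type and its parameter names are hypothetical, and the
sketch says nothing about the operand order the restored r4 format will use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of binutils): pack the fields of a
   32-bit R4-type instruction as laid out in the RISC-V ISA manual.  */
uint32_t
encode_r4_type (uint32_t opcode, uint32_t funct3, uint32_t funct2,
                uint32_t rd, uint32_t rs1, uint32_t rs2, uint32_t rs3)
{
  /* Each field must fit its width.  */
  assert (opcode < 128 && funct3 < 8 && funct2 < 4);
  assert (rd < 32 && rs1 < 32 && rs2 < 32 && rs3 < 32);

  return (rs3 << 27) | (funct2 << 25) | (rs2 << 20) | (rs1 << 15)
         | (funct3 << 12) | (rd << 7) | opcode;
}
```

The register fields are plain 5-bit numbers, so nothing in the layout itself
ties a 4-operand instruction to the floating-point register file.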

On Thu, Mar 8, 2018 at 2:24 AM, Andrew Waterman <andrew@sifive.com> wrote:
> I think permitting 4-operand integer instructions is fine.
>
> On Wed, Mar 7, 2018 at 10:21 AM, Jim Wilson <jimw@sifive.com> wrote:
> > On Wed, Mar 7, 2018 at 2:04 AM, Andrew Waterman <andrew@sifive.com> wrote:
> >> Thanks for contributing this patch.  I did not thoroughly review the
> >> code, but I like the approach.  Jim or Palmer will probably follow up
> >> with additional comments.
> >
> > I looked at the previous version of the patch a month or so ago, and
> > the only curious thing I noticed is that the 4-operand instruction
> > pattern has type I, but in the ISA these are only used for FP
> > instructions.  I'm not sure if this can be fixed though, since we
> > don't have any category that covers all FP extensions: F, D, Q.  Plus
> > someone might want to try using a 4-operand instruction with integer
> > operands so it is probably reasonable to allow that.
> >
> > My schedule is very hectic at the moment, as I'm in the middle of
> > moving to a new home closer to work.  I should be able to find time to
> > look at this new version of the patch sometime soon.
> >
> > Jim

Jim Wilson March 8, 2018, 11:59 p.m. | #5
On Wed, Mar 7, 2018 at 1:15 AM, Kito Cheng <kito.cheng@gmail.com> wrote:
> This patch make  RISC-V assembler support new directive: .insn, it
> able to write instruction with another form just like s/390's .insn
> directive.


I have a bunch of comments, but they are all minor cleanups.  I think
the overall structure of this is fine.

The patch created with "git format-patch" can be mailed with "git send-email".
You can edit the patch after creating it and before emailing, to add info about
the patch, such as what it fixes and how it was tested.

The tc-riscv.c ChangeLog entry is formatted wrong.  Some of the lines have two
extra spaces after the tab.

GNU coding standards says that every declaration should have an explanatory
comment before it.  Most of the new types/variables/functions added to
tc-riscv.c are missing comments.

In opcode_name_list, NMADD and NMSUB are swapped.  JAR should be JAL, as
Andrew mentioned.
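
For reference, the affected entries with both fixes applied, using the major
opcode values from the ISA manual's opcode map.  This is a standalone sketch,
not the patch's code; struct major_opcode, fixed_entries and
lookup_major_opcode are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the patch's opcode_name_t.  */
struct major_opcode
{
  const char *name;
  unsigned int val;
};

/* Only the entries touched by the two fixes are listed here.  */
static const struct major_opcode fixed_entries[] =
{
  {"MADD",  0x43},
  {"MSUB",  0x47},
  {"NMSUB", 0x4b},   /* NMSUB comes before NMADD in the opcode map.  */
  {"NMADD", 0x4f},
  {"JAL",   0x6f},   /* Was spelled JAR.  */
  {NULL, 0}
};

/* Return the major opcode for NAME, or -1 if NAME is not listed.  */
int
lookup_major_opcode (const char *name)
{
  const struct major_opcode *o;

  assert (name != NULL);
  for (o = fixed_entries; o->name != NULL; ++o)
    if (strcmp (o->name, name) == 0)
      return (int) o->val;
  return -1;
}
```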

"value must be 0 ~ 7"
is inconsistent with the other messages that use ... for a range

The first
"bad FUNCT field specifier 'F%c'\n"
should be CF%c for the compressed FUNCT field specifier, and maybe change
the message also to mention CFUNCT or compressed FUNCT or something

The line handling in s_riscv_insn doesn't look right.  It has an explicit
check for \n.  It should be using is_end_of_line similar to s_riscv_option.
Also, ignoring the rest of the characters on the line after parsing a .insn
is wrong.  We should give an error if there are extra non-comment characters
on the line.  This is done by calling demand_empty_rest_of_line, similar to
s_riscv_option.
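
The rule being asked for is: once the operands are parsed, only whitespace
or a comment may remain before the end of the line.  A standalone sketch of
that rule in C.  The helper rest_of_line_is_empty is a hypothetical name, it
assumes '#' is the only comment character and ignores statement separators,
and it is not how gas implements demand_empty_rest_of_line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if nothing but blanks or a '#'
   comment remains at P before the end of the line.  */
bool
rest_of_line_is_empty (const char *p)
{
  assert (p != NULL);

  /* Skip blanks after the last parsed operand.  */
  while (*p == ' ' || *p == '\t')
    ++p;

  /* End of input, end of line, or start of a comment is acceptable.  */
  return *p == '\0' || *p == '\n' || *p == '#';
}
```

Under this rule a trailing " # note" is accepted, while a stray extra
operand such as ", a3" would be reported instead of silently dropped.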

In c-riscv.texi
"instructions formats" should be "instruction formats"

It also has two more references to JAR/jar that should be JAL/jal.

In riscv-opc.c
+{"ci",    "C",  "O2,CF3,d,Cj",        0,    0,  match_opcode, 0 },
+{"ci",    "C",  "O2,CF3,d,Co",        0,    0,  match_opcode, 0 },

Co is all rvc immediate constants, Cj is all rvc immediates except 0.  So there
doesn't seem to be any point to having both.  You only need Co.  I did change
this encoding stuff at one point when working on hint instructions, so maybe
this is just left over from the old support.
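
The redundancy follows from the two operand classes as described above:
every value Cj accepts is also accepted by Co.  A sketch in C, assuming a
6-bit signed immediate (-32...31) purely for illustration.  The functions
accepts_Co, accepts_Cj and cj_adds_coverage are hypothetical stand-ins, not
the gas predicates.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed range for illustration: 6-bit signed immediate.  */
bool
accepts_Co (long v)
{
  return v >= -32 && v <= 31;
}

/* Cj is Co with zero excluded.  */
bool
accepts_Cj (long v)
{
  return v != 0 && accepts_Co (v);
}

/* Return true if some value in [LO, HI] is accepted by Cj but
   rejected by Co, i.e. if the Cj pattern adds any coverage.  */
bool
cj_adds_coverage (long lo, long hi)
{
  long v;

  assert (lo <= hi);
  for (v = lo; v <= hi; ++v)
    if (accepts_Cj (v) && !accepts_Co (v))
      return true;
  return false;
}
```

Because Cj never accepts a value that Co rejects, a second table entry that
differs only in using Cj cannot match anything the Co entry misses.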

The .ciw and .ci patterns are the same in the riscv_insn_types array, but are
documented as being different in the c-riscv.texi file.  In the testcase, you
used addi and li as examples for .ci and .ciw, but they use the same format.
There are two formats, using 3-bit register fields, that are missing.  One takes
one register, one takes 2 registers.  Maybe one of these is what you meant
.ciw to be for?  Or maybe the issue is the rd restrictions on addi/li, but the
.insn support isn't trying to handle them, and I don't think it is reasonable
to expect it to, since these restrictions are due to overlapping opcodes, not
actual instruction format limitations, so I don't think we should worry about
that.  We should just treat addi and li as the same .insn format, and add the
missing two formats.
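
The 3-bit register fields mentioned above can only name x8 through x15
(s0-s1 and a0-a5, as the documentation in this patch notes for rd', rs1' and
rs2').  A sketch of that mapping in C.  The helper rvc_reg3 is a hypothetical
name and is not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper: map a full register number (0..31) to the value
   stored in a 3-bit compressed register field, or return -1 if the
   register cannot be named by such a field.  */
int
rvc_reg3 (unsigned int regno)
{
  assert (regno < 32);

  /* Only x8..x15 are reachable; the field holds regno - 8.  */
  if (regno < 8 || regno > 15)
    return -1;
  return (int) (regno - 8);
}
```

For example, a0 (x10) is stored as 2 and s0 (x8) as 0, while t0 (x5) has no
3-bit encoding at all.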

Jim

Patch

From 97e73c265f713716e4d89ca5634137fa450fc9ce Mon Sep 17 00:00:00 2001
From: Kito Cheng <kito.cheng@gmail.com>
Date: Tue, 13 Feb 2018 16:44:13 +0800
Subject: [PATCH] RISC-V: Add .insn support

gas/ChangeLog
2018-03-07  Kito Cheng  <kito.cheng@gmail.com>

	* config/tc-riscv.c (opcode_name_list): New.
	  (opcode_names_hash): Likewise.
	  (init_opcode_names_hash): Likewise.
	  (opcode_name_lookup): Likewise.
	  (validate_riscv_insn): New argument length, and add new format
	  which used in .insn directive.
	  (md_begin): Refine hash table initialization logic into
	  init_opcode_hash.
	  (init_opcode_hash): New.
	  (my_getOpcodeExpression): Parse opcode name for .insn.
	  (riscv_ip): New argument hash, able to handle .insn directive.
	  (s_riscv_insn): Handler for .insn directive.
	  (riscv_pseudo_table): New entry for .insn.
	* doc/c-riscv.texi: Add documentation for .insn directive.
	* testsuite/gas/riscv/insn.d: Add testcase for .insn directive.
	* testsuite/gas/riscv/insn.s: Likewise.

include/ChangeLog
2018-03-07  Kito Cheng  <kito.cheng@gmail.com>

	* opcode/riscv.h (OP_MASK_FUNCT3): New.
	(OP_SH_FUNCT3): Likewise.
	(OP_MASK_FUNCT7): Likewise.
	(OP_SH_FUNCT7): Likewise.
	(OP_MASK_OP2): Likewise.
	(OP_SH_OP2): Likewise.
	(OP_MASK_CFUNCT4): Likewise.
	(OP_SH_CFUNCT4): Likewise.
	(OP_MASK_CFUNCT3): Likewise.
	(OP_SH_CFUNCT3): Likewise.
	(riscv_insn_types): Likewise.

opcodes/ChangeLog
2018-03-07  Kito Cheng  <kito.cheng@gmail.com>

	* riscv-opc.c (riscv_insn_types): New.
---
 gas/config/tc-riscv.c          | 383 ++++++++++++++++++++++++++++++++++++++---
 gas/doc/c-riscv.texi           | 230 +++++++++++++++++++++++++
 gas/testsuite/gas/riscv/insn.d |  43 +++++
 gas/testsuite/gas/riscv/insn.s |  30 ++++
 include/opcode/riscv.h         |  14 ++
 opcodes/riscv-opc.c            |  21 +++
 6 files changed, 695 insertions(+), 26 deletions(-)
 create mode 100644 gas/testsuite/gas/riscv/insn.d
 create mode 100644 gas/testsuite/gas/riscv/insn.s

diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index f1bc7f9..6d54455 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -221,6 +221,9 @@  riscv_set_arch (const char *s)
 /* Handle of the OPCODE hash table.  */
 static struct hash_control *op_hash = NULL;
 
+/* Handle of the type of .insn hash table.  */
+static struct hash_control *insn_type_hash = NULL;
+
 /* This array holds the chars that always start a comment.  If the
     pre-processor is disabled, these aren't very useful */
 const char comment_chars[] = "#";
@@ -391,6 +394,105 @@  relaxed_branch_length (fragS *fragp, asection *sec, int update)
   return length;
 }
 
+struct opcode_name_t
+{
+  const char *name;
+  unsigned int val;
+};
+
+static const struct opcode_name_t opcode_name_list[] =
+{
+  {"C0",        0x0},
+  {"C1",        0x1},
+  {"C2",        0x2},
+
+  {"LOAD",      0x03},
+  {"LOAD_FP",   0x07},
+  {"CUSTOM_0",  0x0b},
+  {"MISC_MEM",  0x0f},
+  {"OP_IMM",    0x13},
+  {"AUIPC",     0x17},
+  {"OP_IMM_32", 0x1b},
+  /* 48b        0x1f.  */
+
+  {"STORE",     0x23},
+  {"STORE_FP",  0x27},
+  {"CUSTOM_1",  0x2b},
+  {"AMO",       0x2f},
+  {"OP",        0x33},
+  {"LUI",       0x37},
+  {"OP_32",     0x3b},
+  /* 64b        0x3f.  */
+
+  {"MADD",      0x43},
+  {"MSUB",      0x47},
+  {"NMADD",     0x4b},
+  {"NMSUB",     0x4f},
+  {"OP_FP",     0x53},
+  /*reserved    0x57.  */
+  {"CUSTOM_2",  0x5b},
+  /* 48b        0x5f.  */
+
+  {"BRANCH",    0x63},
+  {"JALR",      0x67},
+  /*reserved    0x6b.  */
+  {"JAR",       0x6f},
+  {"SYSTEM",    0x73},
+  /*reserved    0x77.  */
+  {"CUSTOM_3",  0x7b},
+  /* >80b       0x7f.  */
+
+  {NULL, 0}
+};
+
+static struct hash_control *opcode_names_hash = NULL;
+
+static void
+init_opcode_names_hash (void)
+{
+  const char *retval;
+  const struct opcode_name_t *opcode;
+
+  for (opcode = &opcode_name_list[0]; opcode->name != NULL; ++opcode)
+    {
+      retval = hash_insert (opcode_names_hash, opcode->name, (void *)opcode);
+
+      if (retval != NULL)
+	as_fatal (_("internal error: can't hash `%s': %s"),
+		  opcode->name, retval);
+    }
+}
+
+static const struct opcode_name_t *
+opcode_name_lookup (char **s)
+{
+  char *e;
+  char save_c;
+  struct opcode_name_t *o;
+
+  /* Find end of name.  */
+  e = *s;
+  if (is_name_beginner (*e))
+    ++e;
+  while (is_part_of_name (*e))
+    ++e;
+
+  /* Terminate name.  */
+  save_c = *e;
+  *e = '\0';
+
+  o = (struct opcode_name_t *) hash_find (opcode_names_hash, *s);
+
+  /* Advance to next token if one was recognized.  */
+  if (o)
+    *s = e;
+
+  *e = save_c;
+  expr_end = e;
+
+  return o;
+}
+
 struct regname
 {
   const char *name;
@@ -490,13 +592,20 @@  arg_lookup (char **s, const char *const *array, size_t size, unsigned *regnop)
    by the match/mask part of the instruction definition, or by the
    operand list.  */
 static bfd_boolean
-validate_riscv_insn (const struct riscv_opcode *opc)
+validate_riscv_insn (const struct riscv_opcode *opc, int length)
 {
   const char *p = opc->args;
   char c;
   insn_t used_bits = opc->mask;
-  int insn_width = 8 * riscv_insn_length (opc->match);
-  insn_t required_bits = ~0ULL >> (64 - insn_width);
+  int insn_width;
+  insn_t required_bits;
+
+  if (length == 0)
+    insn_width = 8 * riscv_insn_length (opc->match);
+  else
+    insn_width = 8 * length;
+
+  required_bits = ~0ULL >> (64 - insn_width);
 
   if ((used_bits & opc->match) != (opc->match & required_bits))
     {
@@ -538,6 +647,18 @@  validate_riscv_insn (const struct riscv_opcode *opc)
 	  case '>': used_bits |= ENCODE_RVC_IMM (-1U); break;
 	  case 'T': USE_BITS (OP_MASK_CRS2, OP_SH_CRS2); break;
 	  case 'D': USE_BITS (OP_MASK_CRS2S, OP_SH_CRS2S); break;
+	  case 'F': /* funct */
+	    switch (c = *p++)
+	      {
+		case '4': USE_BITS (OP_MASK_CFUNCT4, OP_SH_CFUNCT4); break;
+		case '3': USE_BITS (OP_MASK_CFUNCT3, OP_SH_CFUNCT3); break;
+		default:
+		  as_bad (_("internal: bad RISC-V opcode"
+			    " (unknown operand type `CF%c'): %s %s"),
+			  c, opc->name, opc->args);
+		  return FALSE;
+	      }
+	    break;
 	  default:
 	    as_bad (_("internal: bad RISC-V opcode (unknown operand type `C%c'): %s %s"),
 		    c, opc->name, opc->args);
@@ -555,6 +676,7 @@  validate_riscv_insn (const struct riscv_opcode *opc)
       case 'E':	USE_BITS (OP_MASK_CSR,		OP_SH_CSR);	break;
       case 'I': break;
       case 'R':	USE_BITS (OP_MASK_RS3,		OP_SH_RS3);	break;
+      case 'r': USE_BITS (OP_MASK_RS3,          OP_SH_RS3);     break;
       case 'S':	USE_BITS (OP_MASK_RS1,		OP_SH_RS1);	break;
       case 'U':	USE_BITS (OP_MASK_RS1,		OP_SH_RS1);	/* fallthru */
       case 'T':	USE_BITS (OP_MASK_RS2,		OP_SH_RS2);	break;
@@ -574,6 +696,30 @@  validate_riscv_insn (const struct riscv_opcode *opc)
       case '[': break;
       case ']': break;
       case '0': break;
+      case 'F': /* funct */
+	switch (c = *p++)
+	  {
+	    case '7': USE_BITS (OP_MASK_FUNCT7, OP_SH_FUNCT7); break;
+	    case '3': USE_BITS (OP_MASK_FUNCT3, OP_SH_FUNCT3); break;
+	    default:
+	      as_bad (_("internal: bad RISC-V opcode"
+			" (unknown operand type `F%c'): %s %s"),
+		      c, opc->name, opc->args);
+	    return FALSE;
+	  }
+	break;
+      case 'O': /* opcode */
+	switch (c = *p++)
+	  {
+	    case '4': USE_BITS (OP_MASK_OP, OP_SH_OP); break;
+	    case '2': USE_BITS (OP_MASK_OP2, OP_SH_OP2); break;
+	    default:
+	      as_bad (_("internal: bad RISC-V opcode"
+			" (unknown operand type `O%c'): %s %s"),
+		      c, opc->name, opc->args);
+	     return FALSE;
+	  }
+	break;
       default:
 	as_bad (_("internal: bad RISC-V opcode "
 		  "(unknown operand type `%c'): %s %s"),
@@ -597,52 +743,71 @@  struct percent_op_match
   bfd_reloc_code_real_type reloc;
 };
 
-/* This function is called once, at assembler startup time.  It should set up
-   all the tables, etc. that the MD part of the assembler will need.  */
-
-void
-md_begin (void)
+static struct hash_control *
+init_opcode_hash (const struct riscv_opcode *opcodes,
+		  bfd_boolean insn_directive_p)
 {
   int i = 0;
-  unsigned long mach = xlen == 64 ? bfd_mach_riscv64 : bfd_mach_riscv32;
-
-  if (! bfd_set_arch_mach (stdoutput, bfd_arch_riscv, mach))
-    as_warn (_("Could not set architecture and machine"));
-
-  op_hash = hash_new ();
-
-  while (riscv_opcodes[i].name)
+  int length;
+  struct hash_control *hash = hash_new ();
+  while (opcodes[i].name)
     {
-      const char *name = riscv_opcodes[i].name;
+      const char *name = opcodes[i].name;
       const char *hash_error =
-	hash_insert (op_hash, name, (void *) &riscv_opcodes[i]);
+	hash_insert (hash, name, (void *) &opcodes[i]);
 
       if (hash_error)
 	{
 	  fprintf (stderr, _("internal error: can't hash `%s': %s\n"),
-		   riscv_opcodes[i].name, hash_error);
+		   opcodes[i].name, hash_error);
 	  /* Probably a memory allocation problem?  Give up now.  */
 	  as_fatal (_("Broken assembler.  No assembly attempted."));
 	}
 
       do
 	{
-	  if (riscv_opcodes[i].pinfo != INSN_MACRO)
+	  if (opcodes[i].pinfo != INSN_MACRO)
 	    {
-	      if (!validate_riscv_insn (&riscv_opcodes[i]))
+	      if (insn_directive_p)
+		length = ((name[0] == 'c') ? 2 : 4);
+	      else
+		length = 0; /* Let assembler determine the length. */
+	      if (!validate_riscv_insn (&opcodes[i], length))
 		as_fatal (_("Broken assembler.  No assembly attempted."));
 	    }
+	  else
+	    gas_assert (!insn_directive_p);
 	  ++i;
 	}
-      while (riscv_opcodes[i].name && !strcmp (riscv_opcodes[i].name, name));
+      while (opcodes[i].name && !strcmp (opcodes[i].name, name));
     }
 
+  return hash;
+}
+
+/* This function is called once, at assembler startup time.  It should set up
+   all the tables, etc. that the MD part of the assembler will need.  */
+
+void
+md_begin (void)
+{
+  unsigned long mach = xlen == 64 ? bfd_mach_riscv64 : bfd_mach_riscv32;
+
+  if (! bfd_set_arch_mach (stdoutput, bfd_arch_riscv, mach))
+    as_warn (_("Could not set architecture and machine"));
+
+  op_hash = init_opcode_hash (riscv_opcodes, FALSE);
+  insn_type_hash = init_opcode_hash (riscv_insn_types, TRUE);
+
   reg_names_hash = hash_new ();
   hash_reg_names (RCLASS_GPR, riscv_gpr_names_numeric, NGPR);
   hash_reg_names (RCLASS_GPR, riscv_gpr_names_abi, NGPR);
   hash_reg_names (RCLASS_FPR, riscv_fpr_names_numeric, NFPR);
   hash_reg_names (RCLASS_FPR, riscv_fpr_names_abi, NFPR);
 
+  opcode_names_hash = hash_new ();
+  init_opcode_names_hash ();
+
 #define DECLARE_CSR(name, num) hash_reg_name (RCLASS_CSR, #name, num);
 #define DECLARE_CSR_ALIAS(name, num) DECLARE_CSR(name, num);
 #include "opcode/riscv-opc.h"
@@ -1186,6 +1351,22 @@  my_getSmallExpression (expressionS *ep, bfd_reloc_code_real_type *reloc,
   return reloc_index;
 }
 
+static size_t
+my_getOpcodeExpression (expressionS *ep, bfd_reloc_code_real_type *reloc,
+			char *str, const struct percent_op_match *percent_op)
+{
+  const struct opcode_name_t *o = opcode_name_lookup (&str);
+
+  if (o != NULL)
+    {
+      ep->X_op = O_constant;
+      ep->X_add_number = o->val;
+      return 0;
+    }
+
+  return my_getSmallExpression (ep, reloc, str, percent_op);
+}
+
 /* Detect and handle implicitly zero load-store offsets.  For example,
    "lw t0, (t1)" is shorthand for "lw t0, 0(t1)".  Return TRUE iff such
    an implicit offset was detected.  */
@@ -1211,7 +1392,7 @@  riscv_handle_implicit_zero_offset (expressionS *ep, const char *s)
 
 static const char *
 riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
-	  bfd_reloc_code_real_type *imm_reloc)
+	  bfd_reloc_code_real_type *imm_reloc, struct hash_control *hash)
 {
   char *s;
   const char *args;
@@ -1234,7 +1415,7 @@  riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 	break;
       }
 
-  insn = (struct riscv_opcode *) hash_find (op_hash, str);
+  insn = (struct riscv_opcode *) hash_find (hash, str);
 
   argsStart = s;
   for ( ; insn && insn->name && strcmp (insn->name, str) == 0; insn++)
@@ -1259,7 +1440,12 @@  riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		{
 		  if (!insn->match_func (insn, ip->insn_opcode))
 		    break;
-		  if (riscv_insn_length (insn->match) == 2 && !riscv_opts.rvc)
+
+		  /* For .insn, insn->match and insn->mask are 0.  */
+		  if (riscv_insn_length ((insn->match == 0 && insn->mask == 0)
+					 ? ip->insn_opcode
+					 : insn->match) == 2
+		      && !riscv_opts.rvc)
 		    break;
 		}
 	      if (*s != '\0')
@@ -1468,6 +1654,43 @@  rvc_lui:
 		    break;
 		  INSERT_OPERAND (CRS2, *ip, regno);
 		  continue;
+		case 'F':
+		  switch (*++args)
+		    {
+		      case '4':
+		        if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+			    || imm_expr->X_op != O_constant
+			    || imm_expr->X_add_number < 0
+			    || imm_expr->X_add_number >= 16)
+			  {
+			    as_bad (_("bad value for funct4 field, "
+				      "value must be 0...15"));
+			    break;
+			  }
+
+			INSERT_OPERAND (CFUNCT4, *ip, imm_expr->X_add_number);
+			imm_expr->X_op = O_absent;
+			s = expr_end;
+			continue;
+		      case '3':
+			if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+			    || imm_expr->X_op != O_constant
+			    || imm_expr->X_add_number < 0
+			    || imm_expr->X_add_number >= 8)
+			  {
+			    as_bad (_("bad value for funct3 field, "
+				      "value must be 0 ~ 7"));
+			    break;
+			  }
+			INSERT_OPERAND (CFUNCT3, *ip, imm_expr->X_add_number);
+			imm_expr->X_op = O_absent;
+			s = expr_end;
+			continue;
+		      default:
+			as_bad (_("bad FUNCT field specifier 'F%c'\n"), *args);
+		    }
+		  break;
+
 		default:
 		  as_bad (_("bad RVC field specifier 'C%c'\n"), *args);
 		}
@@ -1712,6 +1935,83 @@  jump:
 	      else
 		*imm_reloc = BFD_RELOC_RISCV_CALL;
 	      continue;
+	    case 'O':
+	      switch (*++args)
+		{
+		case '4':
+		  if (my_getOpcodeExpression (imm_expr, imm_reloc, s, p)
+		      || imm_expr->X_op != O_constant
+		      || imm_expr->X_add_number < 0
+		      || imm_expr->X_add_number >= 128
+		      || (imm_expr->X_add_number & 0x3) != 3)
+		    {
+		      as_bad (_("bad value for opcode field, "
+				"value must be 0...127 and "
+				"lower 2 bits must be 0x3"));
+		      break;
+		    }
+
+		  INSERT_OPERAND (OP, *ip, imm_expr->X_add_number);
+		  imm_expr->X_op = O_absent;
+		  s = expr_end;
+		  continue;
+		case '2':
+		  if (my_getOpcodeExpression (imm_expr, imm_reloc, s, p)
+		      || imm_expr->X_op != O_constant
+		      || imm_expr->X_add_number < 0
+		      || imm_expr->X_add_number >= 3)
+		    {
+		      as_bad (_("bad value for opcode field, "
+				"value must be 0...2"));
+		      break;
+		    }
+
+		  INSERT_OPERAND (OP2, *ip, imm_expr->X_add_number);
+		  imm_expr->X_op = O_absent;
+		  s = expr_end;
+		  continue;
+		default:
+		  as_bad (_("bad Opcode field specifier 'O%c'\n"), *args);
+		}
+	      break;
+
+	    case 'F':
+	      switch (*++args)
+		{
+		case '7':
+		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		      || imm_expr->X_op != O_constant
+		      || imm_expr->X_add_number < 0
+		      || imm_expr->X_add_number >= 128)
+		    {
+		      as_bad (_("bad value for funct7 field, "
+				"value must be 0...127"));
+		      break;
+		    }
+
+		  INSERT_OPERAND (FUNCT7, *ip, imm_expr->X_add_number);
+		  imm_expr->X_op = O_absent;
+		  s = expr_end;
+		  continue;
+		case '3':
+		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		      || imm_expr->X_op != O_constant
+		      || imm_expr->X_add_number < 0
+		      || imm_expr->X_add_number >= 8)
+		    {
+		      as_bad (_("bad value for funct3 field, "
+			        "value must be 0...7"));
+		      break;
+		    }
+
+		  INSERT_OPERAND (FUNCT3, *ip, imm_expr->X_add_number);
+		  imm_expr->X_op = O_absent;
+		  s = expr_end;
+		  continue;
+		default:
+		  as_bad (_("bad FUNCT field specifier 'F%c'\n"), *args);
+		}
+	      break;
 
 	    case 'z':
 	      if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
@@ -1746,7 +2046,7 @@  md_assemble (char *str)
   expressionS imm_expr;
   bfd_reloc_code_real_type imm_reloc = BFD_RELOC_UNUSED;
 
-  const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc);
+  const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc, op_hash);
 
   if (error)
     {
@@ -2614,6 +2914,36 @@  s_riscv_leb128 (int sign)
   return s_leb128 (sign);
 }
 
+static void
+s_riscv_insn (int x ATTRIBUTE_UNUSED)
+{
+  char *str = input_line_pointer;
+  struct riscv_cl_insn insn;
+  expressionS imm_expr;
+  bfd_reloc_code_real_type imm_reloc = BFD_RELOC_UNUSED;
+  char *n = strchr (str, '\n');
+  if (n)
+    *n = '\0';
+
+  const char *error = riscv_ip (str, &insn, &imm_expr,
+				&imm_reloc, insn_type_hash);
+
+  while (*input_line_pointer++);
+
+  if (error)
+    {
+      as_bad ("%s `%s'", error, str);
+    }
+  else
+    {
+      gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
+      append_insn (&insn, &imm_expr, imm_reloc);
+    }
+
+  if (n)
+    *n = '\n';
+}
+
 /* Pseudo-op table.  */
 
 static const pseudo_typeS riscv_pseudo_table[] =
@@ -2628,6 +2958,7 @@  static const pseudo_typeS riscv_pseudo_table[] =
   {"bss", s_bss, 0},
   {"uleb128", s_riscv_leb128, 0},
   {"sleb128", s_riscv_leb128, 1},
+  {"insn", s_riscv_insn, 0},
 
   { NULL, NULL, 0 },
 };
diff --git a/gas/doc/c-riscv.texi b/gas/doc/c-riscv.texi
index 3f327d6..1b98938 100644
--- a/gas/doc/c-riscv.texi
+++ b/gas/doc/c-riscv.texi
@@ -17,6 +17,7 @@ 
 @menu
 * RISC-V-Options::        RISC-V Options
 * RISC-V-Directives::     RISC-V Directives
+* RISC-V-Formats::        RISC-V Instruction Formats
 @end menu
 
 @node RISC-V-Options
@@ -148,4 +149,233 @@  opportunistically relax some code sequences, but sometimes this behavior is not
 desirable.
 @end table
 
+@cindex INSN directives
+@item .insn @var{value}
+@itemx .insn @var{value}
+This directive permits the numeric representation of an instruction
+and makes the assembler insert the operands according to one of the
+instructions formats for @samp{.insn} (@ref{RISC-V-Formats}).
+For example, the instruction @samp{add a0, a1, a2} could be written as
+@samp{.insn r 0x33, 0, 0, a0, a1, a2}.
+
+@end table
+
+@node RISC-V-Formats
+@section Instruction Formats
+@cindex instruction formats, risc-v
+@cindex RISC-V instruction formats
+
+The RISC-V Instruction Set Manual Volume I: User-Level ISA lists 12
+instruction formats where some of the formats have multiple variants.
+For the @samp{.insn} pseudo directive the assembler recognizes some
+of the formats.
+Typically, the most general variant of the instruction format is used
+by the @samp{.insn} directive.
+
+The following table lists the abbreviations used in the table of
+instruction formats:
+
+@display
+@multitable @columnfractions .15 .40
+@item opcode @tab Unsigned immediate or opcode name for 7-bits opcode.
+@item opcode2 @tab Unsigned immediate or opcode name for 2-bits opcode.
+@item func7 @tab Unsigned immediate for 7-bits function code.
+@item func4 @tab Unsigned immediate for 4-bits function code.
+@item func3 @tab Unsigned immediate for 3-bits function code.
+@item rd @tab Destination register number for operand x.
+@item rd' @tab Destination register number for operand x,
+only accept s0-s1, and a0-a5.
+@item rs1 @tab First source register number for operand x.
+@item rs1' @tab First source register number for operand x,
+only accept s0-s1, and a0-a5.
+@item rs2 @tab Second source register number for operand x.
+@item rs2' @tab Second source register number for operand x,
+only accept s0-s1, and a0-a5.
+@item simm12 @tab Sign-extended 12-bit immediate for operand x.
+@item simm20 @tab Sign-extended 20-bit immediate for operand x.
+@item symbol @tab Symbol or label reference for operand x.
+@end multitable
+@end display
+
+The following table lists all available opcode names:
+
+@table @code
+@item C0
+@item C1
+@item C2
+Opcode space for compressed instructions.
+
+@item LOAD
+Opcode space for load instructions.
+
+@item LOAD_FP
+Opcode space for floating-point load instructions.
+
+@item STORE
+Opcode space for store instructions.
+
+@item STORE_FP
+Opcode space for floating-point store instructions.
+
+@item AUIPC
+Opcode space for auipc instruction.
+
+@item LUI
+Opcode space for lui instruction.
+
+@item BRANCH
+Opcode space for branch instructions.
+
+@item JAR
+Opcode space for jar instruction.
+
+@item JALR
+Opcode space for jalr instruction.
+
+@item OP
+Opcode space for ALU instructions.
+
+@item OP_32
+Opcode space for 32-bits ALU instructions.
+
+@item OP_IMM
+Opcode space for ALU with immediate instructions.
+
+@item OP_IMM_32
+Opcode space for 32-bits ALU with immediate instructions.
+
+@item OP_FP
+Opcode space for floating-point operation instructions.
+
+@item MADD
+Opcode space for madd instruction.
+
+@item MSUB
+Opcode space for msub instruction.
+
+@item NMADD
+Opcode space for nmadd instruction.
+
+@item NMSUB
+Opcode space for nmsub instruction.
+
+@item AMO
+Opcode space for atomic memory operation instructions.
+
+@item MISC_MEM
+Opcode space for misc instructions.
+
+@item SYSTEM
+Opcode space for system instructions.
+
+@item CUSTOM_0
+@item CUSTOM_1
+@item CUSTOM_2
+@item CUSTOM_3
+Opcode space for customize instructions.
+
 @end table
+
+An instruction is two or four bytes in length and must be aligned
+on a 2 byte boundary. The first two bits of the instruction specify the
+length of the instruction, 00, 01 and 10 indicates a two byte instruction,
+11 indicates a four byte instruction.
+
+The following table lists the RISC-V instruction formats that are available
+with the @samp{.insn} pseudo directive:
+
+@table @code
+@item R type: .insn r opcode, func3, func7, rd, rs1, rs2
+@verbatim
++-------+-----+-----+-------+----+-------------+
+| func7 | rs2 | rs1 | func3 | rd |      opcode |
++-------+-----+-----+-------+----+-------------+
+31      25    20    15      12   7             0
+@end verbatim
+
+@item I type: .insn i opcode, func3, rd, rs1, simm12
+@verbatim
++-------------+-----+-------+----+-------------+
+|      simm12 | rs1 | func3 | rd |      opcode |
++-------------+-----+-------+----+-------------+
+31            20    15      12   7             0
+@end verbatim
+
+@item S type: .insn s opcode, func3, rd, rs1, simm12
+@verbatim
++--------------+-----+-----+-------+-------------+-------------+
+| simm12[11:5] | rs2 | rs1 | func3 | simm12[4:0] |      opcode |
++--------------+-----+-----+-------+-------------+-------------+
+31             25    20    15      12            7             0
+@end verbatim
+
+@item SB type: .insn sb opcode, func3, rd, rs1, symbol
+@itemx SB type: .insn sb opcode, func3, rd, simm12(rs1)
+@verbatim
++--------------+-----+-----+-------+-------------+-------------+
+| simm21[11:5] | rs2 | rs1 | func3 | simm12[4:0] |      opcode |
++--------------+-----+-----+-------+-------------+-------------+
+31             25    20    15      12            7             0
+@end verbatim
+
+@item U type: .insn u opcode, rd, simm20
+@verbatim
++---------------------------+----+-------------+
+|                    simm20 | rd |      opcode |
++---------------------------+----+-------------+
+31                          12   7             0
+@end verbatim
+
+@item UJ type: .insn uj opcode, rd, symbol
+@verbatim
++------------+--------------+------------+---------------+----+-------------+
+| simm20[20] | simm20[10:1] | simm20[11] | simm20[19:12] | rd |      opcode |
++------------+--------------+------------+---------------+----+-------------+
+31           30             21           20              12   7             0
+@end verbatim
+
+@item CR type: .insn cr opcode, func4, rd, rs2
+@verbatim
++---------+--------+-----+---------+
+|   func4 | rd/rs1 | rs2 | opcode2 |
++---------+--------+-----+---------+
+15        12       7     2         0
+@end verbatim
+
+@item CI type: .insn ci opcode, func3, rd, simm6
+@verbatim
++---------+-----+--------+-----+---------+
+|   func3 | imm | rd/rs1 | imm | opcode2 |
++---------+-----+--------+-----+---------+
+15        13    12       7     2         0
+@end verbatim
+
+@item CIW type: .insn ciw opcode, func3, rd, simm6
+@verbatim
++---------+--------------+-----+---------+
+|   func3 |          imm | rd' | opcode2 |
++---------+--------------+-----+---------+
+15        13             5     2         0
+@end verbatim
+
+@item CB type: .insn cb opcode, func3, rs1, symbol
+@verbatim
++---------+--------+------+--------+---------+
+|   func3 | offset | rs1' | offset | opcode2 |
++---------+--------+------+--------+---------+
+15        13       10     7        2         0
+@end verbatim
+
+@item CJ type: .insn cj opcode, symbol
+@verbatim
++---------+--------------------+---------+
+|   func3 |        jump target | opcode2 |
++---------+--------------------+---------+
+15        13                   2         0
+@end verbatim
+
+
+@end table
+
+For the complete list of all instruction format variants, see
+The RISC-V Instruction Set Manual Volume I: User-Level ISA.
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
new file mode 100644
index 0000000..36d5044
--- /dev/null
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -0,0 +1,43 @@ 
+#as: -march=rv32ic
+#objdump: -dr
+
+.*:[ 	]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <target>:
+[ 	]+0:[ 	]+00c58533[ 	]+add[ 	]+a0,a1,a2
+[ 	]+4:[ 	]+00d58513[ 	]+addi[ 	]+a0,a1,13
+[ 	]+8:[ 	]+00a58567[ 	]+jalr[ 	]+a0,10\(a1\)
+[ 	]+c:[ 	]+00458503[ 	]+lb[ 	]+a0,4\(a1\)
+[ 	]+10:[ 	]+feb508e3[ 	]+beq[ 	]+a0,a1,0 \<target\>
+[	]+10: R_RISCV_BRANCH[	]+target
+[ 	]+14:[ 	]+00a58223[ 	]+sb[ 	]+a0,4\(a1\)
+[ 	]+18:[ 	]+00fff537[ 	]+lui[ 	]+a0,0xfff
+[ 	]+1c:[ 	]+fe5ff56f[ 	]+jal[ 	]+a0,0 \<target\>
+[	]+1c: R_RISCV_JAL[	]+target
+[ 	]+20:[ 	]+0511[ 	]+addi[ 	]+a0,a0,4
+[ 	]+22:[ 	]+852e[ 	]+mv[ 	]+a0,a1
+[ 	]+24:[ 	]+45a9[ 	]+li[ 	]+a1,10
+[ 	]+26:[ 	]+dde9[ 	]+beqz[ 	]+a1,0 \<target\>
+[	]+26: R_RISCV_RVC_BRANCH[	]+target
+[ 	]+28:[ 	]+bfe1[ 	]+j[ 	]+0 \<target\>
+[	]+28: R_RISCV_RVC_JUMP[	]+target
+[ 	]+2a:[ 	]+00c58533[ 	]+add[ 	]+a0,a1,a2
+[ 	]+2e:[ 	]+00d58513[ 	]+addi[ 	]+a0,a1,13
+[ 	]+32:[ 	]+00a58567[ 	]+jalr[ 	]+a0,10\(a1\)
+[ 	]+36:[ 	]+00458503[ 	]+lb[ 	]+a0,4\(a1\)
+[ 	]+3a:[ 	]+fcb503e3[ 	]+beq[ 	]+a0,a1,0 \<target\>
+[	]+3a: R_RISCV_BRANCH[	]+target
+[ 	]+3e:[ 	]+00a58223[ 	]+sb[ 	]+a0,4\(a1\)
+[ 	]+42:[ 	]+00fff537[ 	]+lui[ 	]+a0,0xfff
+[ 	]+46:[ 	]+fbbff56f[ 	]+jal[ 	]+a0,0 \<target\>
+[	]+46: R_RISCV_JAL[	]+target
+[ 	]+4a:[ 	]+0511[ 	]+addi[ 	]+a0,a0,4
+[ 	]+4c:[ 	]+852e[ 	]+mv[ 	]+a0,a1
+[ 	]+4e:[ 	]+45a9[ 	]+li[ 	]+a1,10
+[ 	]+50:[ 	]+d9c5[ 	]+beqz[ 	]+a1,0 \<target\>
+[	]+50: R_RISCV_RVC_BRANCH[	]+target
+[ 	]+52:[ 	]+b77d[ 	]+j[ 	]+0 \<target\>
+[	]+52: R_RISCV_RVC_JUMP[	]+target
diff --git a/gas/testsuite/gas/riscv/insn.s b/gas/testsuite/gas/riscv/insn.s
new file mode 100644
index 0000000..ab30129
--- /dev/null
+++ b/gas/testsuite/gas/riscv/insn.s
@@ -0,0 +1,30 @@ 
+target:
+	.insn r  0x33,  0,  0, a0, a1, a2
+	.insn i  0x13,  0, a0, a1, 13
+	.insn i  0x67,  0, a0, 10(a1)
+	.insn s   0x3,  0, a0, 4(a1)
+	.insn sb 0x63,  0, a0, a1, target
+	.insn sb 0x23,  0, a0, 4(a1)
+	.insn u  0x37, a0, 0xfff
+	.insn uj 0x6f, a0, target
+
+	.insn ci 0x1, 0x0, a0, 4
+	.insn cr 0x2, 0x8, a0, a1
+	.insn ciw 0x1, 0x2, a1, 10
+	.insn cb 0x1, 0x6, a1, target
+	.insn cj 0x1, 0x5, target
+
+	.insn r  OP,  0,  0, a0, a1, a2
+	.insn i  OP_IMM,  0, a0, a1, 13
+	.insn i  JALR,  0, a0, 10(a1)
+	.insn s  LOAD,  0, a0, 4(a1)
+	.insn sb BRANCH,  0, a0, a1, target
+	.insn sb STORE,  0, a0, 4(a1)
+	.insn u  LUI, a0, 0xfff
+	.insn uj JAL, a0, target
+
+	.insn ci C1, 0x0, a0, 4
+	.insn cr C2, 0x8, a0, a1
+	.insn ciw C1, 0x2, a1, 10
+	.insn cb C1, 0x6, a1, target
+	.insn cj C1, 0x5, target
diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h
index b87c719..65ab6417 100644
--- a/include/opcode/riscv.h
+++ b/include/opcode/riscv.h
@@ -223,8 +223,16 @@  static const char * const riscv_pred_succ[16] =
 #define OP_MASK_CSR		0xfff
 #define OP_SH_CSR		20
 
+#define OP_MASK_FUNCT3         0x7
+#define OP_SH_FUNCT3           12
+#define OP_MASK_FUNCT7         0x7f
+#define OP_SH_FUNCT7           25
+
 /* RVC fields.  */
 
+#define OP_MASK_OP2            0x3
+#define OP_SH_OP2              0
+
 #define OP_MASK_CRS2 0x1f
 #define OP_SH_CRS2 2
 #define OP_MASK_CRS1S 0x7
@@ -232,6 +240,11 @@  static const char * const riscv_pred_succ[16] =
 #define OP_MASK_CRS2S 0x7
 #define OP_SH_CRS2S 2
 
+#define OP_MASK_CFUNCT4                0xf
+#define OP_SH_CFUNCT4          12
+#define OP_MASK_CFUNCT3                0x7
+#define OP_SH_CFUNCT3          13
+
 /* ABI names for selected x-registers.  */
 
 #define X_RA 1
@@ -340,5 +353,6 @@  extern const char * const riscv_fpr_names_numeric[NFPR];
 extern const char * const riscv_fpr_names_abi[NFPR];
 
 extern const struct riscv_opcode riscv_opcodes[];
+extern const struct riscv_opcode riscv_insn_types[];
 
 #endif /* _RISCV_H_ */
diff --git a/opcodes/riscv-opc.c b/opcodes/riscv-opc.c
index 4aeb55a..7a7a100 100644
--- a/opcodes/riscv-opc.c
+++ b/opcodes/riscv-opc.c
@@ -741,3 +741,24 @@  const struct riscv_opcode riscv_opcodes[] =
 /* Terminate the list.  */
 {0, 0, 0, 0, 0, 0, 0}
 };
+
+const struct riscv_opcode riscv_insn_types[] =
+{
+/* name,  isa,          operands, match, mask,    match_func, pinfo.  */
+{"r",     "I",  "O4,F3,F7,d,s,t",     0,    0,  match_opcode, 0 },
+{"i",     "I",  "O4,F3,d,s,j",        0,    0,  match_opcode, 0 },
+{"i",     "I",  "O4,F3,d,o(s)",       0,    0,  match_opcode, 0 },
+{"s",     "I",  "O4,F3,d,o(s)",       0,    0,  match_opcode, 0 },
+{"sb",    "I",  "O4,F3,s,t,p",        0,    0,  match_opcode, 0 },
+{"sb",    "I",  "O4,F3,t,q(s)",       0,    0,  match_opcode, 0 },
+{"u",     "I",  "O4,d,u",             0,    0,  match_opcode, 0 },
+{"uj",    "I",  "O4,d,a",             0,    0,  match_opcode, 0 },
+{"cr",    "C",  "O2,CF4,d,CV",        0,    0,  match_opcode, 0 },
+{"ci",    "C",  "O2,CF3,d,Cj",        0,    0,  match_opcode, 0 },
+{"ci",    "C",  "O2,CF3,d,Co",        0,    0,  match_opcode, 0 },
+{"ciw",   "C",  "O2,CF3,d,Co",        0,    0,  match_opcode, 0 },
+{"cb",    "C",  "O2,CF3,Cs,Cp",       0,    0,  match_opcode, 0 },
+{"cj",    "C",  "O2,CF3,Ca",          0,    0,  match_opcode, 0 },
+/* Terminate the list.  */
+{0, 0, 0, 0, 0, 0, 0}
+};
-- 
2.7.4
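
For readers checking the expected encodings in insn.d by hand, here is a
minimal sketch (not part of the patch) of how the R type operands map onto
the instruction word, and of the length rule from the documentation hunk.
The FUNCT3/FUNCT7 values are copied from the riscv.h hunk; the SK_ macros
and both sketch_ functions are hypothetical names that assume the standard
RISC-V positions of opcode, rd, rs1 and rs2. This is not how gas itself
assembles the directive.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions copied from the include/opcode/riscv.h hunk.  */
#define OP_MASK_FUNCT3 0x7
#define OP_SH_FUNCT3   12
#define OP_MASK_FUNCT7 0x7f
#define OP_SH_FUNCT7   25

/* Assumed for this sketch: standard positions of the remaining R type
   fields.  The SK_ names are hypothetical, not taken from the patch.  */
#define SK_MASK_OP  0x7f
#define SK_SH_OP    0
#define SK_MASK_REG 0x1f
#define SK_SH_RD    7
#define SK_SH_RS1   15
#define SK_SH_RS2   20

/* Hypothetical helper: pack the operands of
   ".insn r opcode, func3, func7, rd, rs1, rs2" into a 32-bit word.  */
static uint32_t
sketch_encode_r (uint32_t opcode, uint32_t func3, uint32_t func7,
                 uint32_t rd, uint32_t rs1, uint32_t rs2)
{
  /* Reject values that do not fit in their fields.  */
  assert (opcode <= SK_MASK_OP && func3 <= OP_MASK_FUNCT3
          && func7 <= OP_MASK_FUNCT7);
  assert (rd <= SK_MASK_REG && rs1 <= SK_MASK_REG && rs2 <= SK_MASK_REG);

  return (opcode << SK_SH_OP) | (rd << SK_SH_RD)
         | (func3 << OP_SH_FUNCT3) | (rs1 << SK_SH_RS1)
         | (rs2 << SK_SH_RS2) | (func7 << OP_SH_FUNCT7);
}

/* Hypothetical helper: instruction length in bytes, decided by the
   lowest two bits (11 means four bytes, anything else two bytes).  */
static unsigned
sketch_insn_length (uint32_t insn)
{
  return (insn & 0x3) == 0x3 ? 4 : 2;
}
```

With a0, a1 and a2 being x10, x11 and x12, sketch_encode_r (0x33, 0, 0, 10,
11, 12) yields 0x00c58533, the word insn.d expects for the first line of
insn.s. sketch_insn_length returns 4 for that word and 2 for the compressed
encoding 0x852e.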