PowerPC: Add new xxmr and xxlnot extended mnemonics [committed]

Message ID 52168eb6-0c25-0d80-17cf-1898ad88e990@linux.ibm.com
State New
Headers show
Series
  • PowerPC: Add new xxmr and xxlnot extended mnemonics [committed]
Related show

Commit Message

Nick Alcock via Binutils May 27, 2021, 10:11 p.m.
I committed the following patch to add a couple of extended mnemonics that
should have been defined in the ISA document earlier, but weren't.  These
new extended mnemonics will now show up in the next ISA release (whenever
that is), but there's no need to wait for that.

Peter


opcodes/
	* ppc-opc.c (powerpc_opcodes) <xxmr, xxlnot>: New extended mnemonics.

gas/
	* testsuite/gas/ppc/vsx.d <xxmr, xxlnot>: Add tests.
	* testsuite/gas/ppc/vsx.s: Likewise.

Comments

Nick Alcock via Binutils May 28, 2021, 3:56 a.m. | #1
On Thu, May 27, 2021 at 05:11:41PM -0500, Peter Bergner wrote:
> I committed the following patch to add a couple of extended mnemonics that

> should have been defined in the ISA document earlier, but weren't.  These

> new extended mnemonics will now show up in the next ISA release (whenever

> that is), but there's no need to wait for that.


Should these disassemble to xxlor and xxlnor with -Mraw?  This isn't a
new problem, lots of other instructions don't disassemble to the
underlying machine insn with -Mraw.  For example:

 xxmr 3,4
 xxlnot 3,4
 mr 3,4
 not 3,4
 xxspltd 3,4,0
 xxspltd 3,4,1
 xxmrghd 3,4,5
 xxmrgld 3,4,5
 xxswapd 3,4
 li 3,123
 blt .+4

objdump -d -Mraw displays

   0:	90 24 64 f0 	xxmr    vs3,vs4
   4:	10 25 64 f0 	xxlnot  vs3,vs4
   8:	78 23 83 7c 	mr      r3,r4
   c:	f8 20 83 7c 	not     r3,r4
  10:	50 20 64 f0 	xxspltd vs3,vs4,0
  14:	50 23 64 f0 	xxspltd vs3,vs4,1
  18:	50 28 64 f0 	xxpermdi vs3,vs4,vs5,0
  1c:	50 2b 64 f0 	xxpermdi vs3,vs4,vs5,3
  20:	50 22 64 f0 	xxpermdi vs3,vs4,vs4,2
  24:	7b 00 60 38 	addi    r3,0,123
  28:	04 00 80 41 	bc      12,lt,0x2c

with the first 6 insns not being the real hardware insns.  I think
-Mraw output really ought to be:

   0:	90 24 64 f0 	xxlor   vs3,vs4,vs4
   4:	10 25 64 f0 	xxlnor  vs3,vs4,vs4
   8:	78 23 83 7c 	or      r3,r4,r4
   c:	f8 20 83 7c 	nor     r3,r4,r4
  10:	50 20 64 f0 	xxpermdi vs3,vs4,vs4,0
  14:	50 23 64 f0 	xxpermdi vs3,vs4,vs4,3
  18:	50 28 64 f0 	xxpermdi vs3,vs4,vs5,0
  1c:	50 2b 64 f0 	xxpermdi vs3,vs4,vs5,3
  20:	50 22 64 f0 	xxpermdi vs3,vs4,vs4,2
  24:	7b 00 60 38 	addi    r3,0,123
  28:	04 00 80 41 	bc      12,lt,0x2c

The following hack produces this.  The idea is of course that we can
detect a specialisation involving matching register operands or the
like by simply counting the number of operands.

I'm going to glare at our opcode table a bit more before I commit
this.  :-)

	* ppc-dis.c (lookup_powerpc): When mask is the same, count
	operands to find underlying hardware insn for -Mraw.
	(lookup_prefix): Likewise.

diff --git a/opcodes/ppc-dis.c b/opcodes/ppc-dis.c
index 739195a9910..7473ead9977 100644
--- a/opcodes/ppc-dis.c
+++ b/opcodes/ppc-dis.c
@@ -619,7 +619,10 @@ lookup_powerpc (uint64_t insn, ppc_cpu_t dialect)
 
       /* The raw machine insn is one that is not a specialization.  */
       if (last == NULL
-	  || (last->mask & ~opcode->mask) != 0)
+	  || (last->mask & ~opcode->mask) != 0
+	  || (last->mask == opcode->mask
+	      && (strnlen ((char *) last->operands, sizeof (last->operands))
+		  < strnlen ((char *) opcode->operands, sizeof (opcode->operands)))))
 	last = opcode;
     }
 
@@ -670,7 +673,10 @@ lookup_prefix (uint64_t insn, ppc_cpu_t dialect)
 
       /* The raw machine insn is one that is not a specialization.  */
       if (last == NULL
-	  || (last->mask & ~opcode->mask) != 0)
+	  || (last->mask & ~opcode->mask) != 0
+	  || (last->mask == opcode->mask
+	      && (strnlen ((char *) last->operands, sizeof (last->operands))
+		  < strnlen ((char *) opcode->operands, sizeof (opcode->operands)))))
 	last = opcode;
     }
 

-- 
Alan Modra
Australia Development Lab, IBM
Nick Alcock via Binutils May 28, 2021, 1:29 p.m. | #2
On 5/27/21 10:56 PM, Alan Modra wrote:
> Should these disassemble to xxlor and xxlnor with -Mraw?  This isn't a

> new problem, lots of other instructions don't disassemble to the

> underlying machine insn with -Mraw.  For example:


To be honest, I didn't even know about -Mraw  :-)



> The following hack produces this.  The idea is of course that we can

> detect a specialisation involving matching register operands or the

> like by simply counting the number of operands.

> 

> I'm going to glare at our opcode table a bit more before I commit

> this.  :-)


This probably catches most of the usage, since extended mnemonics
usually have fewer operands then the base mnemonic.  That isn't always
the case though.  Extended mnemonics like subi, subis, etc. and some
of the extended mnemonics for the ridiculous insns have the same
number of operands.  I doubt there are any extended mnemonics which
have more operands than the base mnemonic.

You patch doesn't seem to do anything for the case when they have
the same number of operands.  In that case, we probably defer to
the first one found in the table which (normally, but not always)
is the extended mnemonic.


> 	* ppc-dis.c (lookup_powerpc): When mask is the same, count

> 	operands to find underlying hardware insn for -Mraw.

> 	(lookup_prefix): Likewise.


If this works, I'm all for it.

If that doesn't work, one other option is to add a RAW mask to the
deprecated field in the extended mnemonics. That way, our existing
machinery will automatically disable them.


Peter
Nick Alcock via Binutils May 28, 2021, 2:33 p.m. | #3
On Fri, May 28, 2021 at 08:29:00AM -0500, Peter Bergner wrote:
> > 	* ppc-dis.c (lookup_powerpc): When mask is the same, count

> > 	operands to find underlying hardware insn for -Mraw.

> > 	(lookup_prefix): Likewise.

> 

> If this works, I'm all for it.


It needed lots more tweaking.

diff --git a/opcodes/ppc-dis.c b/opcodes/ppc-dis.c
index 739195a9910..c7ca29997c3 100644
--- a/opcodes/ppc-dis.c
+++ b/opcodes/ppc-dis.c
@@ -575,6 +575,68 @@ skip_optional_operands (const unsigned char *opindex,
   return true;
 }
 
+/* Count the number of operands for OPCODE.  */
+
+static inline size_t
+num_operands (const struct powerpc_opcode *opcode)
+{
+  return strnlen ((char *) opcode->operands, sizeof (opcode->operands));
+}
+
+/* Return TRUE if OP1 is more specialized than OP2 as measured by
+   the opcode mask and operands.  OP1 and OP2 are both valid
+   disassembly for some instruction.  The aim here is to return TRUE
+   if OP1 is an extended opcode for OP2.  */
+
+static bool
+more_specialized (const struct powerpc_opcode *op1,
+		  const struct powerpc_opcode *op2)
+{
+  /* If the mask implies that OP1 has more fixed bits than OP2,
+     then OP1 is more specialized.  */
+  if ((op1->mask & ~op2->mask) != 0)
+    return true;
+
+  if (op1->mask == op2->mask)
+    {
+      size_t n1 = num_operands (op1);
+      size_t n2 = num_operands (op2);
+      /* Choose more operands, or if equal number of operands, an
+	 opcode with fewer extract functions.  */
+      if (n1 < n2)
+	return true;
+      if (n1 == n2)
+	{
+	  int count = 0;
+	  bool ex1 = false;
+	  while (n1--)
+	    {
+	      ex1 = powerpc_operands[op1->operands[n1]].extract != NULL;
+	      if (ex1)
+		count++;
+	      bool ex2 = powerpc_operands[op2->operands[n1]].extract != NULL;
+	      if (ex2)
+		count--;
+	    }
+	  if (count > 0)
+	    return true;
+	  /* There is some trickery here.  If we haven't found a
+	     mis-match in extract functions then return true anyway if
+	     OP1 uses an extract function in its first operand.  This
+	     is for conditional branch instructions like bcctr-,
+	     bcctr+, bcctr where each alternative has an extract
+	     function on the first operand but the last alternative is
+	     correct for -Mraw.
+	     FIXME: It might be better to put an explicit flag on the
+	     alternatives to discard for -Mraw.  That would speed up
+	     -Mraw disassembly too.  */
+	  if (count == 0)
+	    return ex1;
+	}
+    }
+  return false;
+}
+
 /* Find a match for INSN in the opcode table, given machine DIALECT.  */
 
 static const struct powerpc_opcode *
@@ -617,9 +679,14 @@ lookup_powerpc (uint64_t insn, ppc_cpu_t dialect)
       if ((dialect & PPC_OPCODE_RAW) == 0)
 	return opcode;
 
-      /* The raw machine insn is one that is not a specialization.  */
+      /* The raw machine insn is one that is not a specialization.
+	 Note that simply choosing the last insn that matches doesn't
+	 work.  The opcodes table has entries that are just aliases,
+	 (besides power vs. ppc naming) eg. "vctsxs" and "vcfpsxws"
+	 where the first entry should be chosen for -Mraw just as it
+	 is without -Mraw.  */
       if (last == NULL
-	  || (last->mask & ~opcode->mask) != 0)
+	  || more_specialized (last, opcode))
 	last = opcode;
     }
 
@@ -670,7 +737,7 @@ lookup_prefix (uint64_t insn, ppc_cpu_t dialect)
 
       /* The raw machine insn is one that is not a specialization.  */
       if (last == NULL
-	  || (last->mask & ~opcode->mask) != 0)
+	  || more_specialized (last, opcode))
 	last = opcode;
     }

You'll notice the FIXME, which is saying to do..
 
> If that doesn't work, one other option is to add a RAW mask to the

> deprecated field in the extended mnemonics. That way, our existing

> machinery will automatically disable them.


..this.  Painful as all the editing was.

-- 
Alan Modra
Australia Development Lab, IBM
Nick Alcock via Binutils May 28, 2021, 9:48 p.m. | #4
On 5/28/21 9:33 AM, Alan Modra wrote:
> You'll notice the FIXME, which is saying to do..

>  

>> If that doesn't work, one other option is to add a RAW mask to the

>> deprecated field in the extended mnemonics. That way, our existing

>> machinery will automatically disable them.

> 

> ..this.  Painful as all the editing was.


Adding the deprecated field definitely would be the most robust fix,
but yeah, a lot of changes and we'd have to ensure each new added
mnemonic is marked correctly.  I did test adding a RAW field to xxmr
and that was all that was needed to work correctly:


bergner@pike:~$ cat xxmr.s 
	.text
start:
	xxlor 3,4,4
	xxmr 3,4
bergner@pike:~$ /home/bergner/binutils/build/binutils-xxmr-raw/binutils/objdump -d xxmr.o 
0000000000000000 <start>:
   0:	90 24 64 f0 	xxmr    vs3,vs4
   4:	90 24 64 f0 	xxmr    vs3,vs4
bergner@pike:~$ /home/bergner/binutils/build/binutils-xxmr-raw/binutils/objdump -d -Mraw xxmr.o 
0000000000000000 <start>:
   0:	90 24 64 f0 	xxlor   vs3,vs4,vs4
   4:	90 24 64 f0 	xxlor   vs3,vs4,vs4


That said, an automatic option like your patch is probably preferable,
since we wouldn't have to do anything special when adding new mnemonics.
We could use the deprecated field method for any "weird" insns, where
your code somehow cannot correctly identify the extended mnemonic.
If there are even any.



> +  /* If the mask implies that OP1 has more fixed bits than OP2,

> +     then OP1 is more specialized.  */

> +  if ((op1->mask & ~op2->mask) != 0)

> +    return true;


I know this was the original test, but this doesn't really ensure
that op1->mask has more bits set than op2->mask.  It just tells us
that op1->mask has bits that are not set in op2->mask.  It could
be op2->mask has set bits that op1->mask doesn't have too.
That said, this would probably only happen between 2 different
extended mnemonics being compared to each other and I would expect
the base mnemonic to win out over both of them, so we're probably
fine with the test as is.

Thanks for cleaning up after me!

Peter

Patch

diff --git a/opcodes/ppc-opc.c b/opcodes/ppc-opc.c
index 272dc098991..84b885a9ae9 100644
--- a/opcodes/ppc-opc.c
+++ b/opcodes/ppc-opc.c
@@ -8516,6 +8516,7 @@  const struct powerpc_opcode powerpc_opcodes[] = {
 {"xsrsp",	XX2(60,281),	XX2_MASK,    PPCVSX2,	PPCVLE,		{XT6, XB6}},
 {"xsmaxjdp",	XX3(60,144),	XX3_MASK,    PPCVSX3,	PPCVLE,		{XT6, XA6, XB6}},
 {"xsnmsubasp",	XX3(60,145),	XX3_MASK,    PPCVSX2,	PPCVLE,		{XT6, XA6, XB6}},
+{"xxmr",	XX3(60,146),	XX3_MASK,    PPCVSX,	PPCVLE,		{XT6, XAB6}},
 {"xxlor",	XX3(60,146),	XX3_MASK,    PPCVSX,	PPCVLE,		{XT6, XA6, XB6}},
 {"xscvuxdsp",	XX2(60,296),	XX2_MASK,    PPCVSX2,	PPCVLE,		{XT6, XB6}},
 {"xststdcsp",	XX2(60,298),	XX2BFD_MASK, PPCVSX3,	PPCVLE,		{BF, XB6, DCMX}},
@@ -8525,6 +8526,7 @@  const struct powerpc_opcode powerpc_opcodes[] = {
 {"xscvsxdsp",	XX2(60,312),	XX2_MASK,    PPCVSX2,	PPCVLE,		{XT6, XB6}},
 {"xsmaxdp",	XX3(60,160),	XX3_MASK,    PPCVSX,	PPCVLE,		{XT6, XA6, XB6}},
 {"xsnmaddadp",	XX3(60,161),	XX3_MASK,    PPCVSX,	PPCVLE,		{XT6, XA6, XB6}},
+{"xxlnot",	XX3(60,162),	XX3_MASK,    PPCVSX,	PPCVLE,		{XT6, XAB6}},
 {"xxlnor",	XX3(60,162),	XX3_MASK,    PPCVSX,	PPCVLE,		{XT6, XA6, XB6}},
 {"xscvdpuxds",	XX2(60,328),	XX2_MASK,    PPCVSX,	PPCVLE,		{XT6, XB6}},
 {"xscvspdp",	XX2(60,329),	XX2_MASK,    PPCVSX,	PPCVLE,		{XT6, XB6}},
diff --git a/gas/testsuite/gas/ppc/vsx.d b/gas/testsuite/gas/ppc/vsx.d
index 0fbbf75eb93..7aa9f19733c 100644
--- a/gas/testsuite/gas/ppc/vsx.d
+++ b/gas/testsuite/gas/ppc/vsx.d
@@ -170,4 +170,8 @@  Disassembly of section \.text:
 .*:	(7d 0a a6 99|99 a6 0a 7d) 	lxvd2x  vs40,r10,r20
 .*:	(7d 00 a7 99|99 a7 00 7d) 	stxvd2x vs40,0,r20
 .*:	(7d 0a a7 99|99 a7 0a 7d) 	stxvd2x vs40,r10,r20
+.*:	(f1 12 95 17|17 95 12 f1) 	xxlnot  vs40,vs50
+.*:	(f1 12 95 17|17 95 12 f1) 	xxlnot  vs40,vs50
+.*:	(f1 12 94 97|97 94 12 f1) 	xxmr    vs40,vs50
+.*:	(f1 12 94 97|97 94 12 f1) 	xxmr    vs40,vs50
 #pass
diff --git a/gas/testsuite/gas/ppc/vsx.s b/gas/testsuite/gas/ppc/vsx.s
index 716174cdb44..cd02a6e2dc0 100644
--- a/gas/testsuite/gas/ppc/vsx.s
+++ b/gas/testsuite/gas/ppc/vsx.s
@@ -162,3 +162,7 @@  start:
 	lxvd2x     40,10,20
 	stxvd2x    40,0,20
 	stxvd2x    40,10,20
+	xxlnot     40,50
+	xxlnor     40,50,50
+	xxmr       40,50
+	xxlor      40,50,50