[PATCHv2] Handle not explicitly zero terminated strings in merge sections

Message ID VI1PR0701MB286296920A9FB159972A9487E4190@VI1PR0701MB2862.eurprd07.prod.outlook.com
State New
Headers show
Series
  • [PATCHv2] Handle not explicitly zero terminated strings in merge sections
Related show

Commit Message

Bernd Edlinger Sept. 14, 2018, 6:39 p.m.
Hi,

this is an upate of the string-merge section, it is based on the V2-STRING_CST
semantic patch series, which was finally installed yesterday.
It merges single-byte string constants with or without terminating NUL.
The patch has the same Ada and C test cases that were already in the V1 patch.

Thus there are no pre-requisite patches necessary at this time.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.

Comments

Jakub Jelinek Sept. 14, 2018, 7:01 p.m. | #1
On Fri, Sep 14, 2018 at 06:39:38PM +0000, Bernd Edlinger wrote:
> Hi,

> 

> this is an upate of the string-merge section, it is based on the V2-STRING_CST

> semantic patch series, which was finally installed yesterday.

> It merges single-byte string constants with or without terminating NUL.

> The patch has the same Ada and C test cases that were already in the V1 patch.

> 

> Thus there are no pre-requisite patches necessary at this time.

> 

> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.

> Is it OK for trunk?


I believe SHF_MERGE | SHF_STRINGS sections should only ever contain
zero-terminated strings.
See e.g. SCO ELF documentation:
"The size of each element is specified in the section header's sh_entsize field. If the SHF_STRINGS flag is also set, the data elements consist of null-terminated character strings."
http://www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html
https://docs.oracle.com/cd/E23824_01/html/819-0690/ggdlu.html

So, please never put non-zero terminated strings into MS sections.

	Jakub
Bernd Edlinger Sept. 14, 2018, 7:06 p.m. | #2
On 09/14/18 21:01, Jakub Jelinek wrote:
> On Fri, Sep 14, 2018 at 06:39:38PM +0000, Bernd Edlinger wrote:

>> Hi,

>>

>> this is an upate of the string-merge section, it is based on the V2-STRING_CST

>> semantic patch series, which was finally installed yesterday.

>> It merges single-byte string constants with or without terminating NUL.

>> The patch has the same Ada and C test cases that were already in the V1 patch.

>>

>> Thus there are no pre-requisite patches necessary at this time.

>>

>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.

>> Is it OK for trunk?

> 

> I believe SHF_MERGE | SHF_STRINGS sections should only ever contain

> zero-terminated strings.

> See e.g. SCO ELF documentation:

> "The size of each element is specified in the section header's sh_entsize field. If the SHF_STRINGS flag is also set, the data elements consist of null-terminated character strings."

> http://www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html

> https://docs.oracle.com/cd/E23824_01/html/819-0690/ggdlu.html

> 

> So, please never put non-zero terminated strings into MS sections.

> 


Yes, that is of course clear.

The idea is, if the string itself is not zero-terminated
another zero is added, that is only done in at the assembly level.

Strings with embedded zero bytes are not put in the merge section.


Bernd.
Bernd Edlinger Sept. 21, 2018, 11:36 a.m. | #3
Hi Jakub,

are your concerns addressed with this answer, or do you have objections
to this patch?


Thanks
Bernd.


On 09/14/18 21:06, Bernd Edlinger wrote:
> On 09/14/18 21:01, Jakub Jelinek wrote:

>> On Fri, Sep 14, 2018 at 06:39:38PM +0000, Bernd Edlinger wrote:

>>> Hi,

>>>

>>> this is an upate of the string-merge section, it is based on the V2-STRING_CST

>>> semantic patch series, which was finally installed yesterday.

>>> It merges single-byte string constants with or without terminating NUL.

>>> The patch has the same Ada and C test cases that were already in the V1 patch.

>>>

>>> Thus there are no pre-requisite patches necessary at this time.

>>>

>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.

>>> Is it OK for trunk?

>>

>> I believe SHF_MERGE | SHF_STRINGS sections should only ever contain

>> zero-terminated strings.

>> See e.g. SCO ELF documentation:

>> "The size of each element is specified in the section header's sh_entsize field. If the SHF_STRINGS flag is also set, the data elements consist of null-terminated character strings."

>> http://www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html

>> https://docs.oracle.com/cd/E23824_01/html/819-0690/ggdlu.html

>>

>> So, please never put non-zero terminated strings into MS sections.

>>

> 

> Yes, that is of course clear.

> 

> The idea is, if the string itself is not zero-terminated

> another zero is added, that is only done in at the assembly level.

> 

> Strings with embedded zero bytes are not put in the merge section.

> > 

> Bernd.
Jakub Jelinek Sept. 21, 2018, 11:59 a.m. | #4
On Fri, Sep 21, 2018 at 11:36:53AM +0000, Bernd Edlinger wrote:
> are your concerns addressed with this answer, or do you have objections

> to this patch?


No objections, but no time to review your patch either right now.

	Jakub
Jeff Law Oct. 3, 2018, 4:31 p.m. | #5
On 9/14/18 12:39 PM, Bernd Edlinger wrote:
> Hi,

> 

> this is an upate of the string-merge section, it is based on the V2-STRING_CST

> semantic patch series, which was finally installed yesterday.

> It merges single-byte string constants with or without terminating NUL.

> The patch has the same Ada and C test cases that were already in the V1 patch.

> 

> Thus there are no pre-requisite patches necessary at this time.

> 

> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.

> Is it OK for trunk?

> 

> 

> Thanks

> Bernd.

> 

> 

> patch-merge-strings-v2.diff

> 

> gcc:

> 2018-08-27  Bernd Edlinger  <bernd.edlinger@hotmail.de>

> 

> 	* varasm.c (output_constant): Add new parameter merge_strings.

> 	Make strings properly zero terminated in merge string sections.

> 	(mergeable_string_section): Don't fail if the last char is non-zero.

> 	(assemble_variable_contents): Handle merge string sections.

> 	(assemble_variable): Likewise.

> 	(assemble_constant_contents): Likewise.

> 	(output_constant_def_contents): Likewise.

> 

> testsuite:

> 2018-08-27  Bernd Edlinger  <bernd.edlinger@hotmail.de>

> 

> 	* gnat.dg/string_merge1.adb: New test.

> 	* gnat.dg/string_merge2.adb: New test.

> 	* gcc.dg/merge-all-constants-1.c: Adjust test.

> 	* gcc.dg/merge-all-constants-2.c: New test.

Thanks for you patience.


> 

> diff -Npur gcc/varasm.c gcc/varasm.c

> --- gcc/varasm.c	2018-08-26 15:02:43.157905415 +0200

> +++ gcc/varasm.c	2018-08-26 17:57:26.488494866 +0200

> @@ -111,7 +111,8 @@ static int compare_constant (const tree,

>  static void output_constant_def_contents (rtx);

>  static void output_addressed_constants (tree);

>  static unsigned HOST_WIDE_INT output_constant (tree, unsigned HOST_WIDE_INT,

> -					       unsigned int, bool);

> +					       unsigned int, bool,

> +					       bool = false);

Given there's less than 10 call sites in total, avoid defaulting the
argument and just pass it in unconditionally.  Similarly for
assemble_constant_contents which has just a couple calls.



>  static void globalize_decl (tree);

>  static bool decl_readonly_section_1 (enum section_category);

>  #ifdef BSS_SECTION_ASM_OP

> @@ -804,8 +805,8 @@ mergeable_string_section (tree decl ATTR

>        && TREE_CODE (decl) == STRING_CST

>        && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE

>        && align <= 256

> -      && (len = int_size_in_bytes (TREE_TYPE (decl))) > 0

> -      && TREE_STRING_LENGTH (decl) >= len)

> +      && (len = int_size_in_bytes (TREE_TYPE (decl))) >= 0

> +      && TREE_STRING_LENGTH (decl) == len)

Not sure why you want to test for >= 0 here.  > 0 seems sufficient,
though I guess there's no harm in the = 0 case.


> @@ -835,7 +836,7 @@ mergeable_string_section (tree decl ATTR

>  	      if (j == unit)

>  		break;

>  	    }

> -	  if (i == len - unit)

> +	  if (i == len - unit || (unit == 1 && i == len))

>  	    {

>  	      sprintf (name, "%s.str%d.%d", prefix,

>  		       modesize / 8, (int) (align / 8));

> @@ -2316,7 +2317,9 @@ assemble_variable (tree decl, int top_le

>  	switch_to_section (sect);

>        if (align > BITS_PER_UNIT)

>  	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));

> -      assemble_variable_contents (decl, name, dont_output_data);

> +      assemble_variable_contents (decl, name, dont_output_data,

> +				  sect->common.flags & SECTION_MERGE

> +				  && sect->common.flags & SECTION_STRINGS);

I believe our style guidelines would call for parenthesis around the
multi-line expression (where you check SECTION_MERGE and SECTION_STRINGS).


> @@ -3522,10 +3526,13 @@ output_constant_def_contents (rtx symbol

>  		   || (VAR_P (decl) && DECL_IN_CONSTANT_POOL (decl))

>  		   ? DECL_ALIGN (decl)

>  		   : symtab_node::get (decl)->definition_alignment ());

> -      switch_to_section (get_constant_section (exp, align));

> +      section *sect = get_constant_section (exp, align);

> +      switch_to_section (sect);

>        if (align > BITS_PER_UNIT)

>  	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));

> -      assemble_constant_contents (exp, XSTR (symbol, 0), align);

> +      assemble_constant_contents (exp, XSTR (symbol, 0), align,

> +				  sect->common.flags & SECTION_MERGE

> +				  && sect->common.flags & SECTION_STRINGS);

Similarly, you need some parens here.



I think with the changing of the default parm to a normal parm and the
style changes this is OK for the trunk.  Given it's independent of
everything else going on, go ahead and commit it when you've fixed up
those minor issues.

jeff
Andreas Schwab Oct. 5, 2018, 6:15 p.m. | #6
On Sep 14 2018, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:

> diff -Npur gcc/testsuite/gnat.dg/string_merge1.adb gcc/testsuite/gnat.dg/string_merge1.adb

> --- gcc/testsuite/gnat.dg/string_merge1.adb	1970-01-01 01:00:00.000000000 +0100

> +++ gcc/testsuite/gnat.dg/string_merge1.adb	2018-08-26 16:31:12.650271931 +0200

> @@ -0,0 +1,19 @@

> +-- { dg-do compile }

> +-- { dg-options "-O1 -fmerge-all-constants" }

> +

> +procedure String_Merge1 is

> +   procedure Process (X : String);

> +   pragma Import (Ada, Process);

> +begin

> +   Process ("ABCD");

> +end;

> +

> +-- We expect something like:

> +

> +-- .section  .rodata.str1.1,"aMS",@progbits,1

> +-- .LC1:

> +--     .string "ABCD"

> +

> +-- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }

> +-- { dg-final { scan-assembler-times "\\.string" 1 } }

> +-- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }


FAIL: gnat.dg/string_merge1.adb scan-assembler-times \\.string 1

$ grep ABCD string_merge1.s 
        stringz "ABCD"

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Bernd Edlinger Oct. 5, 2018, 7:26 p.m. | #7
On 10/05/18 20:15, Andreas Schwab wrote:
> On Sep 14 2018, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:

> 

>> diff -Npur gcc/testsuite/gnat.dg/string_merge1.adb gcc/testsuite/gnat.dg/string_merge1.adb

>> --- gcc/testsuite/gnat.dg/string_merge1.adb	1970-01-01 01:00:00.000000000 +0100

>> +++ gcc/testsuite/gnat.dg/string_merge1.adb	2018-08-26 16:31:12.650271931 +0200

>> @@ -0,0 +1,19 @@

>> +-- { dg-do compile }

>> +-- { dg-options "-O1 -fmerge-all-constants" }

>> +

>> +procedure String_Merge1 is

>> +   procedure Process (X : String);

>> +   pragma Import (Ada, Process);

>> +begin

>> +   Process ("ABCD");

>> +end;

>> +

>> +-- We expect something like:

>> +

>> +-- .section  .rodata.str1.1,"aMS",@progbits,1

>> +-- .LC1:

>> +--     .string "ABCD"

>> +

>> +-- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }

>> +-- { dg-final { scan-assembler-times "\\.string" 1 } }

>> +-- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }

> 

> FAIL: gnat.dg/string_merge1.adb scan-assembler-times \\.string 1

> 

> $ grep ABCD string_merge1.s

>          stringz "ABCD"

> 


Ah, thanks.

Turns out there are too much variations, like mentioned stringz, and asciz, and
probably lots more here.

But for the purpose of testing the optimization it should be sufficient to look for
".rodata.str" in the assembler.

So I committed the following as obvious:

Index: gnat.dg/string_merge2.adb
===================================================================
--- gnat.dg/string_merge2.adb	(Revision 264887)
+++ gnat.dg/string_merge2.adb	(Revision 264888)
@@ -15,5 +15,3 @@
  --     .string "ABCD"
  
  -- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }
--- { dg-final { scan-assembler-times "\\.string" 1 } }
--- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }
Index: gnat.dg/string_merge1.adb
===================================================================
--- gnat.dg/string_merge1.adb	(Revision 264887)
+++ gnat.dg/string_merge1.adb	(Revision 264888)
@@ -15,5 +15,3 @@
  --     .string "ABCD"
  
  -- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }
--- { dg-final { scan-assembler-times "\\.string" 1 } }
--- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }
Index: ChangeLog
===================================================================
--- ChangeLog	(Revision 264887)
+++ ChangeLog	(Revision 264888)
@@ -1,3 +1,8 @@
+2018-10-05  Bernd Edlinger  <bernd.edlinger@hotmail.de>
+
+	* gnat.dg/string_merge1.adb: Fix test expectations.
+	* gnat.dg/string_merge2.adb: Likewise.
+
  2018-10-05  David Malcolm  <dmalcolm@redhat.com>
  
  	PR c++/56856



Thanks
Bernd.
Rainer Orth Oct. 8, 2018, 11:14 a.m. | #8
Hi Bernd,

> On 10/05/18 20:15, Andreas Schwab wrote:

>> On Sep 14 2018, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:

>> 

>>> diff -Npur gcc/testsuite/gnat.dg/string_merge1.adb

>>> gcc/testsuite/gnat.dg/string_merge1.adb

>>> --- gcc/testsuite/gnat.dg/string_merge1.adb 1970-01-01

>>> 01:00:00.000000000 +0100

>>> +++ gcc/testsuite/gnat.dg/string_merge1.adb 2018-08-26

>>> 16:31:12.650271931 +0200

>>> @@ -0,0 +1,19 @@

>>> +-- { dg-do compile }

>>> +-- { dg-options "-O1 -fmerge-all-constants" }

>>> +

>>> +procedure String_Merge1 is

>>> +   procedure Process (X : String);

>>> +   pragma Import (Ada, Process);

>>> +begin

>>> +   Process ("ABCD");

>>> +end;

>>> +

>>> +-- We expect something like:

>>> +

>>> +-- .section  .rodata.str1.1,"aMS",@progbits,1

>>> +-- .LC1:

>>> +--     .string "ABCD"

>>> +

>>> +-- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }

>>> +-- { dg-final { scan-assembler-times "\\.string" 1 } }

>>> +-- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }

>> 

>> FAIL: gnat.dg/string_merge1.adb scan-assembler-times \\.string 1

>> 

>> $ grep ABCD string_merge1.s

>>          stringz "ABCD"

>> 

>

> Ah, thanks.

>

> Turns out there are too much variations, like mentioned stringz, and asciz, and

> probably lots more here.

>

> But for the purpose of testing the optimization it should be sufficient to look for

> ".rodata.str" in the assembler.

>

> So I committed the following as obvious:


unfortunately that patch is not enough and still shows a fundamental
problem:

* The tests still FAIL on Solaris with /bin/as and Darwin:

FAIL: gcc.dg/merge-all-constants-2.c scan-assembler-not \\\\.rodata[\\n\\r]

FAIL: gnat.dg/string_merge1.adb scan-assembler-times \\\\.rodata\\\\.str 1
FAIL: gnat.dg/string_merge2.adb scan-assembler-times \\\\.rodata\\\\.str 1

  Both targets lack string merging support, so the tests should check
  for that.

* The merge-all-constants-2.c test doesn't FAIL on Solaris/SPARC with
  /bin/as, although it lacks string merging support, too.  The assembler
  output contains

        .section        ".rodata"

  so the pattern currently used to check for .rodata is too
  restrictive.  There is assembler syntax beyond gas on x86 ;-)

The following patch fixes the former issue; tested on
sparc-sun-solaris2.11 with as (no string merging) resp. gas (with string
merging).  Installed on mainline.

Besides, the patch seems to have produced more fallout on Solaris: I see
many new Go testsuite failures on Solaris 10 which probably are related,
and Solaris bootstrap with Ada included is broken due to the extended
usage of string merging.  I'm currently investigating what's going on
there.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-10-08  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	* gcc.dg/merge-all-constants-2.c: Require string_merging support.
	* gnat.dg/string_merge1.adb: Likewise.
	* gnat.dg/string_merge2.adb: Likewise.
# HG changeset patch
# Parent  69da4995a4d01be3b3fb5cbf466648a814da22b1
Require string merging support in gnat.dg/string_merge?.adb etc.

diff --git a/gcc/testsuite/gcc.dg/merge-all-constants-2.c b/gcc/testsuite/gcc.dg/merge-all-constants-2.c
--- a/gcc/testsuite/gcc.dg/merge-all-constants-2.c
+++ b/gcc/testsuite/gcc.dg/merge-all-constants-2.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target string_merging } */
 /* { dg-options "-w -O2 -fmerge-all-constants" } */
 
 const char str1[36] = "0123456789abcdefghijklmnopqrstuvwxyz";
diff --git a/gcc/testsuite/gnat.dg/string_merge1.adb b/gcc/testsuite/gnat.dg/string_merge1.adb
--- a/gcc/testsuite/gnat.dg/string_merge1.adb
+++ b/gcc/testsuite/gnat.dg/string_merge1.adb
@@ -1,4 +1,5 @@
 -- { dg-do compile }
+-- { dg-require-effective-target string_merging }
 -- { dg-options "-O1 -fmerge-all-constants" }
 
 procedure String_Merge1 is
diff --git a/gcc/testsuite/gnat.dg/string_merge2.adb b/gcc/testsuite/gnat.dg/string_merge2.adb
--- a/gcc/testsuite/gnat.dg/string_merge2.adb
+++ b/gcc/testsuite/gnat.dg/string_merge2.adb
@@ -1,4 +1,5 @@
 -- { dg-do compile }
+-- { dg-require-effective-target string_merging }
 -- { dg-options "-O1 -fmerge-all-constants" }
 
 procedure String_Merge2 is
Rainer Orth Oct. 8, 2018, 12:49 p.m. | #9
Hi Bernd,

> Besides, the patch seems to have produced more fallout on Solaris: I see

> many new Go testsuite failures on Solaris 10 which probably are related,

> and Solaris bootstrap with Ada included is broken due to the extended

> usage of string merging.  I'm currently investigating what's going on

> there.


I've filed PR bootstrap/87551 for the libgnat-9.so link failure.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Bernd Edlinger Oct. 8, 2018, 3:04 p.m. | #10
On 10/08/18 13:14, Rainer Orth wrote:
> Hi Bernd,

> 

>> On 10/05/18 20:15, Andreas Schwab wrote:

>>> On Sep 14 2018, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:

>>>

>>>> diff -Npur gcc/testsuite/gnat.dg/string_merge1.adb

>>>> gcc/testsuite/gnat.dg/string_merge1.adb

>>>> --- gcc/testsuite/gnat.dg/string_merge1.adb 1970-01-01

>>>> 01:00:00.000000000 +0100

>>>> +++ gcc/testsuite/gnat.dg/string_merge1.adb 2018-08-26

>>>> 16:31:12.650271931 +0200

>>>> @@ -0,0 +1,19 @@

>>>> +-- { dg-do compile }

>>>> +-- { dg-options "-O1 -fmerge-all-constants" }

>>>> +

>>>> +procedure String_Merge1 is

>>>> +   procedure Process (X : String);

>>>> +   pragma Import (Ada, Process);

>>>> +begin

>>>> +   Process ("ABCD");

>>>> +end;

>>>> +

>>>> +-- We expect something like:

>>>> +

>>>> +-- .section  .rodata.str1.1,"aMS",@progbits,1

>>>> +-- .LC1:

>>>> +--     .string "ABCD"

>>>> +

>>>> +-- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }

>>>> +-- { dg-final { scan-assembler-times "\\.string" 1 } }

>>>> +-- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }

>>>

>>> FAIL: gnat.dg/string_merge1.adb scan-assembler-times \\.string 1

>>>

>>> $ grep ABCD string_merge1.s

>>>           stringz "ABCD"

>>>

>>

>> Ah, thanks.

>>

>> Turns out there are too much variations, like mentioned stringz, and asciz, and

>> probably lots more here.

>>

>> But for the purpose of testing the optimization it should be sufficient to look for

>> ".rodata.str" in the assembler.

>>

>> So I committed the following as obvious:

> 

> unfortunately that patch is not enough and still shows a fundamental

> problem:

> 

> * The tests still FAIL on Solaris with /bin/as and Darwin:

> 

> FAIL: gcc.dg/merge-all-constants-2.c scan-assembler-not \\\\.rodata[\\n\\r]

> 

> FAIL: gnat.dg/string_merge1.adb scan-assembler-times \\\\.rodata\\\\.str 1

> FAIL: gnat.dg/string_merge2.adb scan-assembler-times \\\\.rodata\\\\.str 1

> 

>    Both targets lack string merging support, so the tests should check

>    for that.

> 

> * The merge-all-constants-2.c test doesn't FAIL on Solaris/SPARC with

>    /bin/as, although it lacks string merging support, too.  The assembler

>    output contains

> 

>          .section        ".rodata"

> 

>    so the pattern currently used to check for .rodata is too

>    restrictive.  There is assembler syntax beyond gas on x86 ;-)

> 


For the test that failed with the quotes around .rodata.  I think
instead of looking for a end of line immediately after .rodata, it
would be sufficient to make sure it does not continue with .str, so
could you please try to add something like the following to your patch?

Index: gcc/testsuite/gcc.dg/merge-all-constants-2.c
===================================================================
--- gcc/testsuite/gcc.dg/merge-all-constants-2.c	(revision 264888)
+++ gcc/testsuite/gcc.dg/merge-all-constants-2.c	(working copy)
@@ -5,4 +5,4 @@
  const char str2[37] = "0123456789abcdefghijklmnopqrstuvwxyz";
  const char str3[10] = "0123456789abcdefghijklmnopqrstuvwxyz";
  
-/* { dg-final { scan-assembler-not "\\.rodata\[\n\r\]" } } */
+/* { dg-final { scan-assembler-not "\\.rodata\[^.]" } } */


> The following patch fixes the former issue; tested on

> sparc-sun-solaris2.11 with as (no string merging) resp. gas (with string

> merging).  Installed on mainline.

> 

> Besides, the patch seems to have produced more fallout on Solaris: I see

> many new Go testsuite failures on Solaris 10 which probably are related,

> and Solaris bootstrap with Ada included is broken due to the extended

> usage of string merging.  I'm currently investigating what's going on

> there.

> 

> 	Rainer

>
Eric Botcazou Oct. 8, 2018, 10:03 p.m. | #11
> Besides, the patch seems to have produced more fallout on Solaris: I see

> many new Go testsuite failures on Solaris 10 which probably are related,

> and Solaris bootstrap with Ada included is broken due to the extended

> usage of string merging.  I'm currently investigating what's going on

> there.


I can bootstrap on SPARC/Solaris 10 and 11 (GNU as and system linker) but I 
have regressions on Solaris 11 (and not 10) under the form of SIGBUSes:

                === acats tests ===
FAIL:   c34005g
FAIL:   c35508c
FAIL:   ce3602d
FAIL:   cxa4005
FAIL:   ee3412c

FAIL: gfortran.dg/allocatable_function_5.f90   -O1  execution test
FAIL: gfortran.dg/allocatable_function_5.f90   -O2  execution test
FAIL: gfortran.dg/allocatable_function_5.f90   -O3 -fomit-frame-pointer -
funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/allocatable_function_5.f90   -O3 -g  execution test
FAIL: gfortran.dg/allocatable_function_5.f90   -Os  execution test
FAIL: gfortran.dg/char_allocation_1.f90   -O1  (test for excess errors)
UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O1  compilation failed to 
produce executable
FAIL: gfortran.dg/char_allocation_1.f90   -O2  (test for excess errors)
UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O2  compilation failed to 
produce executable
FAIL: gfortran.dg/char_allocation_1.f90   -O3 -fomit-frame-pointer -funroll-
loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O3 -fomit-frame-pointer -
funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to 
produce executable
FAIL: gfortran.dg/char_allocation_1.f90   -O3 -g  (test for excess errors)
UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O3 -g  compilation failed to 
produce executable
FAIL: gfortran.dg/char_pointer_assign_3.f90   -O2  execution test
FAIL: gfortran.dg/char_pointer_assign_3.f90   -O3 -fomit-frame-pointer -
funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/char_pointer_assign_3.f90   -O3 -g  execution test
FAIL: gfortran.dg/implied_do_io_2.f90   -Os  execution test
FAIL: gfortran.dg/namelist_38.f90   -O2  execution test
FAIL: gfortran.dg/namelist_38.f90   -O3 -fomit-frame-pointer -funroll-loops -
fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/namelist_38.f90   -O3 -g  execution test
FAIL: gfortran.dg/namelist_70.f90   -O1  execution test
FAIL: gfortran.dg/namelist_70.f90   -O2  execution test
FAIL: gfortran.dg/namelist_70.f90   -O3 -fomit-frame-pointer -funroll-loops -
fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/namelist_70.f90   -O3 -g  execution test
FAIL: gfortran.dg/namelist_70.f90   -Os  execution test
FAIL: gfortran.dg/nan_6.f90   -O1  execution test
FAIL: gfortran.dg/realloc_on_assign_4.f03   -O1  execution test
FAIL: gfortran.dg/realloc_on_assign_4.f03   -O2  execution test
FAIL: gfortran.dg/realloc_on_assign_4.f03   -O3 -fomit-frame-pointer -funroll-
loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/realloc_on_assign_4.f03   -O3 -g  execution test
FAIL: gfortran.dg/realloc_on_assign_4.f03   -Os  execution test
FAIL: gfortran.dg/streamio_14.f90   -O2  execution test
FAIL: gfortran.dg/streamio_14.f90   -O3 -fomit-frame-pointer -funroll-loops -
fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/streamio_14.f90   -O3 -g  execution test
FAIL: gfortran.dg/substr_7.f90   -O  (test for excess errors)
UNRESOLVED: gfortran.dg/substr_7.f90   -O  compilation failed to produce 
executable
FAIL: gfortran.dg/widechar_2.f90   -O1  execution test
FAIL: gfortran.dg/widechar_2.f90   -Os  execution test
FAIL: gfortran.fortran-torture/execute/adjustr.f90 execution,  -Os

FAIL: go.test/test/fixedbugs/issue4562.go execution,  -O2 -g 
FAIL: go.test/test/nil.go execution,  -O2 -g

-- 
Eric Botcazou
Rainer Orth Oct. 9, 2018, 11:43 a.m. | #12
Hi Eric,

>> Besides, the patch seems to have produced more fallout on Solaris: I see

>> many new Go testsuite failures on Solaris 10 which probably are related,

>> and Solaris bootstrap with Ada included is broken due to the extended

>> usage of string merging.  I'm currently investigating what's going on

>> there.

>

> I can bootstrap on SPARC/Solaris 10 and 11 (GNU as and system linker) but I 

> have regressions on Solaris 11 (and not 10) under the form of SIGBUSes:


I could bootstrap on Solaris/SPARC with Bernd's patch, too.  The
bootstrap failure only occured on Solaris 11/x86 with gas.

Which version exactly (pkg list entire) of Solaris 11 are you running?
I'm using gas 2.31 and /bin/ld on Solaris 11.4 resp. 11.5 Beta, where
Bernd's patch in PR bootstrap/87551 fixed the remaining regressions.

>                 === acats tests ===

> FAIL:   c34005g

> FAIL:   c35508c

> FAIL:   ce3602d

> FAIL:   cxa4005

> FAIL:   ee3412c

>

> FAIL: gfortran.dg/allocatable_function_5.f90   -O1  execution test

> FAIL: gfortran.dg/allocatable_function_5.f90   -O2  execution test

> FAIL: gfortran.dg/allocatable_function_5.f90   -O3 -fomit-frame-pointer -

> funroll-loops -fpeel-loops -ftracer -finline-functions  execution test

> FAIL: gfortran.dg/allocatable_function_5.f90   -O3 -g  execution test

> FAIL: gfortran.dg/allocatable_function_5.f90   -Os  execution test

> FAIL: gfortran.dg/char_allocation_1.f90   -O1  (test for excess errors)

> UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O1  compilation failed to 

> produce executable

> FAIL: gfortran.dg/char_allocation_1.f90   -O2  (test for excess errors)

> UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O2  compilation failed to 

> produce executable

> FAIL: gfortran.dg/char_allocation_1.f90   -O3 -fomit-frame-pointer -funroll-

> loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)

> UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O3 -fomit-frame-pointer -

> funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to 

> produce executable

> FAIL: gfortran.dg/char_allocation_1.f90   -O3 -g  (test for excess errors)

> UNRESOLVED: gfortran.dg/char_allocation_1.f90   -O3 -g  compilation failed to 

> produce executable


Those are fixed by the patch in PR bootstrap/87551...

> FAIL: gfortran.dg/char_pointer_assign_3.f90   -O2  execution test

> FAIL: gfortran.dg/char_pointer_assign_3.f90   -O3 -fomit-frame-pointer -

> funroll-loops -fpeel-loops -ftracer -finline-functions  execution test

> FAIL: gfortran.dg/char_pointer_assign_3.f90   -O3 -g  execution test

> FAIL: gfortran.dg/implied_do_io_2.f90   -Os  execution test

> FAIL: gfortran.dg/namelist_38.f90   -O2  execution test

> FAIL: gfortran.dg/namelist_38.f90   -O3 -fomit-frame-pointer -funroll-loops -

> fpeel-loops -ftracer -finline-functions  execution test

> FAIL: gfortran.dg/namelist_38.f90   -O3 -g  execution test

> FAIL: gfortran.dg/namelist_70.f90   -O1  execution test

> FAIL: gfortran.dg/namelist_70.f90   -O2  execution test

> FAIL: gfortran.dg/namelist_70.f90   -O3 -fomit-frame-pointer -funroll-loops -

> fpeel-loops -ftracer -finline-functions  execution test

> FAIL: gfortran.dg/namelist_70.f90   -O3 -g  execution test

> FAIL: gfortran.dg/namelist_70.f90   -Os  execution test

> FAIL: gfortran.dg/nan_6.f90   -O1  execution test

> FAIL: gfortran.dg/realloc_on_assign_4.f03   -O1  execution test

> FAIL: gfortran.dg/realloc_on_assign_4.f03   -O2  execution test

> FAIL: gfortran.dg/realloc_on_assign_4.f03   -O3 -fomit-frame-pointer -funroll-

> loops -fpeel-loops -ftracer -finline-functions  execution test

> FAIL: gfortran.dg/realloc_on_assign_4.f03   -O3 -g  execution test

> FAIL: gfortran.dg/realloc_on_assign_4.f03   -Os  execution test

> FAIL: gfortran.dg/streamio_14.f90   -O2  execution test

> FAIL: gfortran.dg/streamio_14.f90   -O3 -fomit-frame-pointer -funroll-loops -

> fpeel-loops -ftracer -finline-functions  execution test

> FAIL: gfortran.dg/streamio_14.f90   -O3 -g  execution test


> FAIL: gfortran.dg/substr_7.f90   -O  (test for excess errors)

> UNRESOLVED: gfortran.dg/substr_7.f90   -O  compilation failed to produce 

> executable


... as is this one.

> FAIL: gfortran.dg/widechar_2.f90   -O1  execution test

> FAIL: gfortran.dg/widechar_2.f90   -Os  execution test

> FAIL: gfortran.fortran-torture/execute/adjustr.f90 execution,  -Os


> FAIL: go.test/test/fixedbugs/issue4562.go execution,  -O2 -g 

> FAIL: go.test/test/nil.go execution,  -O2 -g


These two were present before Bernd's patch.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Bernd Edlinger Oct. 9, 2018, 12:45 p.m. | #13
On 10/03/18 18:31, Jeff Law wrote:
>> -      && (len = int_size_in_bytes (TREE_TYPE (decl))) > 0

>> -      && TREE_STRING_LENGTH (decl) >= len)

>> +      && (len = int_size_in_bytes (TREE_TYPE (decl))) >= 0

>> +      && TREE_STRING_LENGTH (decl) == len)

> Not sure why you want to test for >= 0 here.  > 0 seems sufficient,

> though I guess there's no harm in the = 0 case.

> 


Aehm, cough...

Sorry Jeff, I need to change that back.  It turns out that
completely empty strings don't work right, because of this
check in output_constant:

output_constant (tree exp, unsigned HOST_WIDE_INT size, unsigned int align,
                  bool reverse, bool merge_strings)
{
   enum tree_code code;
   unsigned HOST_WIDE_INT thissize;
   rtx cst;

   if (size == 0 || flag_syntax_only)
     return size;

So while my intention was to add a null-termination for all strings, including empty ones,
this does not work for empty strings, which was diagnosed by the solaris assembler/linker.

However since those empty strings do not use any space, there is no improvement
by merging them in the first place.


Rainer bootstrapped the attached patch successfully.
Is it OK for trunk?


Thanks
Bernd.
2018-10-09  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* varasm.c (mergeable_string_section): Don't try to move zero-length
	strings to the merge section.

Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 264887)
+++ gcc/varasm.c	(working copy)
@@ -804,7 +804,7 @@ mergeable_string_section (tree decl ATTRIBUTE_UNUS
       && TREE_CODE (decl) == STRING_CST
       && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE
       && align <= 256
-      && (len = int_size_in_bytes (TREE_TYPE (decl))) >= 0
+      && (len = int_size_in_bytes (TREE_TYPE (decl))) > 0
       && TREE_STRING_LENGTH (decl) == len)
     {
       scalar_int_mode mode;
Rainer Orth Oct. 9, 2018, 1:05 p.m. | #14
Hi Bernd,

>> * The merge-all-constants-2.c test doesn't FAIL on Solaris/SPARC with

>>    /bin/as, although it lacks string merging support, too.  The assembler

>>    output contains

>> 

>>          .section        ".rodata"

>> 

>>    so the pattern currently used to check for .rodata is too

>>    restrictive.  There is assembler syntax beyond gas on x86 ;-)

>> 

>

> For the test that failed with the quotes around .rodata.  I think

> instead of looking for a end of line immediately after .rodata, it

> would be sufficient to make sure it does not continue with .str, so

> could you please try to add something like the following to your patch?

>

> Index: gcc/testsuite/gcc.dg/merge-all-constants-2.c

> ===================================================================

> --- gcc/testsuite/gcc.dg/merge-all-constants-2.c	(revision 264888)

> +++ gcc/testsuite/gcc.dg/merge-all-constants-2.c	(working copy)

> @@ -5,4 +5,4 @@

>   const char str2[37] = "0123456789abcdefghijklmnopqrstuvwxyz";

>   const char str3[10] = "0123456789abcdefghijklmnopqrstuvwxyz";

>   

> -/* { dg-final { scan-assembler-not "\\.rodata\[\n\r\]" } } */

> +/* { dg-final { scan-assembler-not "\\.rodata\[^.]" } } */


to do this; I've temporarily disabled the string_merging requirement in
the test and ran it on sparc-sun-solaris2.11:

* With as (no string merging), there's

        .section        ".rodata"

  in the output and I get the expected

FAIL: gcc.dg/merge-all-constants-2.c scan-assembler-not \\.rodata[^.]

* With gas however (string merging supported), the output has the usual

        .section        .rodata.str1.8,"aMS",@progbits,1

  and the test PASSes.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Eric Botcazou Oct. 9, 2018, 5:04 p.m. | #15
> Which version exactly (pkg list entire) of Solaris 11 are you running?

> I'm using gas 2.31 and /bin/ld on Solaris 11.4 resp. 11.5 Beta, where

> Bernd's patch in PR bootstrap/87551 fixed the remaining regressions.


Solaris 11.3 with Gas 2.30.

-- 
Eric Botcazou
Rainer Orth Oct. 10, 2018, 11:52 a.m. | #16
Hi Eric,

>> Which version exactly (pkg list entire) of Solaris 11 are you running?

>> I'm using gas 2.31 and /bin/ld on Solaris 11.4 resp. 11.5 Beta, where

>> Bernd's patch in PR bootstrap/87551 fixed the remaining regressions.

>

> Solaris 11.3 with Gas 2.30.


I could now reproduce the regressions on Solaris 11.3 SRU 35.6 (the
latest and last 11.3 update), e.g.

+FAIL: gfortran.dg/allocatable_function_5.f90   -O1  execution test


Program received signal SIGBUS: Access to an undefined portion of a memory object.

Backtrace for this error:

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x00010fe0 in foo (_carg=12, carg=..., .__result=0x215d4 <slen.0>, 
    __result=<optimized out>)
    at /vol/gcc/src/hg/trunk/local/gcc/testsuite/gfortran.dg/allocatable_function_5.f90:41
41          res = carg(1:3)
(gdb) where
#0  0x00010fe0 in foo (_carg=12, carg=..., .__result=0x215d4 <slen.0>, 
    __result=<optimized out>)
    at /vol/gcc/src/hg/trunk/local/gcc/testsuite/gfortran.dg/allocatable_function_5.f90:41
#1  MAIN__ ()
    at /vol/gcc/src/hg/trunk/local/gcc/testsuite/gfortran.dg/allocatable_function_5.f90:22

1: x/i $pc
=> 0x10fe0 <MAIN__+24>: lduh  [ %g1 + 0x3a9 ], %g1
(gdb) p/x $g1
$1 = 0x10800

(gdb) x/7i MAIN__
   0x10fc8 <MAIN__>:    save  %sp, -104, %sp
   0x10fcc <MAIN__+4>:  call  0x213c4 <malloc@got.plt>
   0x10fd0 <MAIN__+8>:  mov  3, %o0
   0x10fd4 <MAIN__+12>: mov  %o0, %i5
   0x10fd8 <MAIN__+16>: sethi  %hi(0x10800), %g1
   0x10fdc <MAIN__+20>: or  %g1, 0x3a9, %g2     ! 0x10ba9
=> 0x10fe0 <MAIN__+24>: lduh  [ %g1 + 0x3a9 ], %g1

(gdb) x/s 0x10800+0x3a9
0x10ba9:        'foo calling \000'

Looking at the .rodata.str1.4 section, I see

$ objdump -s -j .rodata.str1.8 allocatable_function_5.exe

allocatable_function_5.exe:     file format elf32-sparc-sol2

Contents of section .rodata.str1.8:
 10ba8 6d666f6f 2063616c 6c696e67 20000000  mfoo calling ...
 10bb8 666f6f00 00000000 6c687300 00000000  foo.....lhs.....

This string table compression can be disabled with ld -z nocomprstrtab:

       -z nocompstrtab

           Disables the compression of ELF string  tables,  and  comment  sec-
           tions. By default, string compression is applied to SHT_STRTAB sec-
           tions, to SHT_PROGBITS  sections  that  have  their  SHF_MERGE  and
           SHF_STRINGS section flags set, and to comment sections.

This isn't necessary on Solaris 11.4, and Solaris 11.3/x86 isn't
affected as well.  I'm still determining what the best course of action
is: disable string merging support before Solaris 11.4 or enable the
workaround above instead.

sparc-sun-solaris2.11 and i386-pc-solaris2.11 bootstraps with
LD_OPTIONS='-z nocompstrtab' are currently running to check if this
fixes all regressions.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Eric Botcazou Oct. 10, 2018, 2:32 p.m. | #17
> Looking at the .rodata.str1.4 section, I see

> 

> $ objdump -s -j .rodata.str1.8 allocatable_function_5.exe

> 

> allocatable_function_5.exe:     file format elf32-sparc-sol2

> 

> Contents of section .rodata.str1.8:

>  10ba8 6d666f6f 2063616c 6c696e67 20000000  mfoo calling ...

>  10bb8 666f6f00 00000000 6c687300 00000000  foo.....lhs.....

> 

> This string table compression can be disabled with ld -z nocomprstrtab:

> 

>        -z nocompstrtab

> 

>            Disables the compression of ELF string  tables,  and  comment 

> sec- tions. By default, string compression is applied to SHT_STRTAB sec-

> tions, to SHT_PROGBITS  sections  that  have  their  SHF_MERGE  and

> SHF_STRINGS section flags set, and to comment sections.


Thanks for tracking this down!

> This isn't necessary on Solaris 11.4, and Solaris 11.3/x86 isn't

> affected as well.  I'm still determining what the best course of action

> is: disable string merging support before Solaris 11.4 or enable the

> workaround above instead.


Out of curiosity, why isn't it necessary on Solaris 11.4?  Is string 
compression disabled or does it respect alignment on Solaris 11.4?

-- 
Eric Botcazou
Rainer Orth Oct. 10, 2018, 2:36 p.m. | #18
Hi Eric,

>> This isn't necessary on Solaris 11.4, and Solaris 11.3/x86 isn't

>> affected as well.  I'm still determining what the best course of action

>> is: disable string merging support before Solaris 11.4 or enable the

>> workaround above instead.

>

> Out of curiosity, why isn't it necessary on Solaris 11.4?  Is string 

> compression disabled or does it respect alignment on Solaris 11.4?


it's not disabled (I had to disable it when testing an a /bin/as version
with full SHF_MERGE/SHF_STRINGS suppurt recently), so I suspect the
latter.  In S11.4 .rodata and .rodata.str1.8 are merged, with the
alignment of the larger of the two on the output section.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Jeff Law Oct. 10, 2018, 5:58 p.m. | #19
On 10/9/18 6:45 AM, Bernd Edlinger wrote:
> On 10/03/18 18:31, Jeff Law wrote:

>>> -      && (len = int_size_in_bytes (TREE_TYPE (decl))) > 0

>>> -      && TREE_STRING_LENGTH (decl) >= len)

>>> +      && (len = int_size_in_bytes (TREE_TYPE (decl))) >= 0

>>> +      && TREE_STRING_LENGTH (decl) == len)

>> Not sure why you want to test for >= 0 here.  > 0 seems sufficient,

>> though I guess there's no harm in the = 0 case.

>>

> 

> Aehm, cough...

> 

> Sorry Jeff, I need to change that back.  It turns out that

> completely empty strings don't work right, because of this

> check in output_constant:

> 

> output_constant (tree exp, unsigned HOST_WIDE_INT size, unsigned int align,

>                   bool reverse, bool merge_strings)

> {

>    enum tree_code code;

>    unsigned HOST_WIDE_INT thissize;

>    rtx cst;

> 

>    if (size == 0 || flag_syntax_only)

>      return size;

> 

> So while my intention was to add a null-termination for all strings, including empty ones,

> this does not work for empty strings, which was diagnosed by the solaris assembler/linker.

> 

> However since those empty strings do not use any space, there is no improvement

> by merging them in the first place.

> 

> 

> Rainer bootstrapped the attached patch successfully.

> Is it OK for trunk?

OK.  Funny I almost called out the zero size stuff as being pointless to
bother handling in the first place.  Oh well.

Jeff
Bernd Edlinger Oct. 22, 2018, 5:59 p.m. | #20
Hi Rainer,

On 10/9/18 3:05 PM, Rainer Orth wrote:
> Hi Bernd,

> 

>>> * The merge-all-constants-2.c test doesn't FAIL on Solaris/SPARC with

>>>     /bin/as, although it lacks string merging support, too.  The assembler

>>>     output contains

>>>

>>>           .section        ".rodata"

>>>

>>>     so the pattern currently used to check for .rodata is too

>>>     restrictive.  There is assembler syntax beyond gas on x86 ;-)

>>>

>>

>> For the test that failed with the quotes around .rodata.  I think

>> instead of looking for a end of line immediately after .rodata, it

>> would be sufficient to make sure it does not continue with .str, so

>> could you please try to add something like the following to your patch?

>>

>> Index: gcc/testsuite/gcc.dg/merge-all-constants-2.c

>> ===================================================================

>> --- gcc/testsuite/gcc.dg/merge-all-constants-2.c	(revision 264888)

>> +++ gcc/testsuite/gcc.dg/merge-all-constants-2.c	(working copy)

>> @@ -5,4 +5,4 @@

>>    const char str2[37] = "0123456789abcdefghijklmnopqrstuvwxyz";

>>    const char str3[10] = "0123456789abcdefghijklmnopqrstuvwxyz";

>>    

>> -/* { dg-final { scan-assembler-not "\\.rodata\[\n\r\]" } } */

>> +/* { dg-final { scan-assembler-not "\\.rodata\[^.]" } } */

> 

> to do this; I've temporarily disabled the string_merging requirement in

> the test and ran it on sparc-sun-solaris2.11:

> 

> * With as (no string merging), there's

> 

>          .section        ".rodata"

> 

>    in the output and I get the expected

> 

> FAIL: gcc.dg/merge-all-constants-2.c scan-assembler-not \\.rodata[^.]

> 

> * With gas however (string merging supported), the output has the usual

> 

>          .section        .rodata.str1.8,"aMS",@progbits,1

> 

>    and the test PASSes.

> 


which is okay, right?


Bernd.
Rainer Orth Oct. 22, 2018, 6 p.m. | #21
Hi Bernd,

> On 10/9/18 3:05 PM, Rainer Orth wrote:

>> Hi Bernd,

>> 

>>>> * The merge-all-constants-2.c test doesn't FAIL on Solaris/SPARC with

>>>>     /bin/as, although it lacks string merging support, too.  The assembler

>>>>     output contains

>>>>

>>>>           .section        ".rodata"

>>>>

>>>>     so the pattern currently used to check for .rodata is too

>>>>     restrictive.  There is assembler syntax beyond gas on x86 ;-)

>>>>

>>>

>>> For the test that failed with the quotes around .rodata.  I think

>>> instead of looking for a end of line immediately after .rodata, it

>>> would be sufficient to make sure it does not continue with .str, so

>>> could you please try to add something like the following to your patch?

>>>

>>> Index: gcc/testsuite/gcc.dg/merge-all-constants-2.c

>>> ===================================================================

>>> --- gcc/testsuite/gcc.dg/merge-all-constants-2.c	(revision 264888)

>>> +++ gcc/testsuite/gcc.dg/merge-all-constants-2.c	(working copy)

>>> @@ -5,4 +5,4 @@

>>>    const char str2[37] = "0123456789abcdefghijklmnopqrstuvwxyz";

>>>    const char str3[10] = "0123456789abcdefghijklmnopqrstuvwxyz";

>>>    

>>> -/* { dg-final { scan-assembler-not "\\.rodata\[\n\r\]" } } */

>>> +/* { dg-final { scan-assembler-not "\\.rodata\[^.]" } } */

>> 

>> to do this; I've temporarily disabled the string_merging requirement in

>> the test and ran it on sparc-sun-solaris2.11:

>> 

>> * With as (no string merging), there's

>> 

>>          .section        ".rodata"

>> 

>>    in the output and I get the expected

>> 

>> FAIL: gcc.dg/merge-all-constants-2.c scan-assembler-not \\.rodata[^.]

>> 

>> * With gas however (string merging supported), the output has the usual

>> 

>>          .section        .rodata.str1.8,"aMS",@progbits,1

>> 

>>    and the test PASSes.

>> 

>

> which is okay, right?


yes, exactly as expected.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Eric Botcazou Oct. 23, 2018, 7:26 a.m. | #22
> it's not disabled (I had to disable it when testing an a /bin/as version

> with full SHF_MERGE/SHF_STRINGS suppurt recently), so I suspect the

> latter.  In S11.4 .rodata and .rodata.str1.8 are merged, with the

> alignment of the larger of the two on the output section.


OK.  The regressions are still there on Solaris 11.3 for me.

-- 
Eric Botcazou
Rainer Orth Oct. 23, 2018, 7:28 a.m. | #23
Hi Eric,

>> it's not disabled (I had to disable it when testing an a /bin/as version

>> with full SHF_MERGE/SHF_STRINGS suppurt recently), so I suspect the

>> latter.  In S11.4 .rodata and .rodata.str1.8 are merged, with the

>> alignment of the larger of the two on the output section.

>

> OK.  The regressions are still there on Solaris 11.3 for me.


I know: a patch to fix this is almost ready, just needs a final round of
testing.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Eric Botcazou Oct. 23, 2018, 8:13 a.m. | #24
> I know: a patch to fix this is almost ready, just needs a final round of

> testing.


OK, thanks for the information.

-- 
Eric Botcazou
Rainer Orth Oct. 24, 2018, 9:05 a.m. | #25
Hi Eric,

>> I know: a patch to fix this is almost ready, just needs a final round of

>> testing.

>

> OK, thanks for the information.


here's what I've got.  It took me two false starts, unfortunately:

* Initially, I just tried linking with LD_OPTIONS='-z nocompstrtab':

       -z nocompstrtab

           Disables the compression of ELF string  tables,  and  comment  sec-
           tions. By default, string compression is applied to SHT_STRTAB sec-
           tions, to SHT_PROGBITS  sections  that  have  their  SHF_MERGE  and
           SHF_STRINGS section flags set, and to comment sections.

  While that worked, this approach has several disadvantages: if the
  objects are somehow linked without that option, the resulting
  executables will suddenly and mysteriously start to die with SIGBUS.
  It would be far better if the objects themselves are save to use in
  whatever way without relying on a special non-default linker option.
  Besides, it also disable string table compression, so has a far larger
  impact than necessary.

* Next, I tried to disable HAVE_GAS_SHF_MERGE completely before Solaris
  11.4.  This also fixed the regressions you observed.  Unfortunately,
  while this is what's used with as/ld since the Solaris assembler
  doesn't fully support setting SHF_MERGE/SHF_STRINGS yet, when gas is
  in use, hundreds of Go test start to FAIL on both sparc and x86:

runtime stack:
fatal error: DW_FORM_strp out of range in .debug_info at 166762
panic during panic

  The error is from libbacktrace/dwarf.c (read_attribute) which is
  linked into libgo.so.13.  While readelf --debug-dump=info shows no
  problem, dwarfdump -a complains loudly.

  I haven't yet investigated what exactly is happening here.

* Instead I tried an approach that one of the Solaris linker engineers
  suggested which both avoids the need for special linker options and a
  large part of the impact: it just disables string merging for sections
  with alignment > 1 with older (pre-Solaris 11.4) versions of Solaris
  ld.  This one worked just fine and seems the preferred approach.
  While the Solaris 10/SPARC ld didn't show the regressions seen on
  11.3, I've decided to apply the workaround there, too, to avoid
  running into problems if those objects are later relinked on Solaris 11.

Bootstrapped without regressions on i386-pc-solaris2.1[01] and
sparc-sun-solaris2.1[01] (both Solaris 11.3 and 11.4 each) with as/ld,
gas/ld, and x86_64-pc-linux-gnu.

Ok for mainline?

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-10-19  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	* configure.ac (gcc_cv_ld_aligned_shf_merge): New test.
	* configure: Regenerate.
	* config.in: Regenerate.
	* varasm.c (mergeable_string_section): Use readonly_data_section
	if linker doesn't support SHF_MERGE with alignment > 8.
	(mergeable_constant_section): Likewise.
# HG changeset patch
# Parent  01647481afc4b26a304e22ee00c42bc5a9bbafe3
Disable string merging with alignment > 1 before Solaris 11.4/SPARC

diff --git a/gcc/configure.ac b/gcc/configure.ac
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3050,6 +3050,28 @@ AC_DEFINE_UNQUOTED(HAVE_GAS_SHF_MERGE,
   [`if test $gcc_cv_as_shf_merge = yes; then echo 1; else echo 0; fi`],
 [Define 0/1 if your assembler supports marking sections with SHF_MERGE flag.])
 
+gcc_cv_ld_aligned_shf_merge=yes
+case "$target" in
+  # While Solaris 10/SPARC ld isn't affected, disable to avoid problems
+  # relinking on Solaris 11 < 11.4.
+  sparc*-*-solaris2.10*)
+    if test x"$gnu_ld" = xno; then
+      gcc_cv_ld_aligned_shf_merge=no
+    fi
+    ;;
+  # SHF_MERGE support is broken in Solaris ld up to Solaris 11.3/SPARC for
+  # alignment > 1.
+  sparc*-*-solaris2.11*)
+    if test x"$gnu_ld" = xno \
+       && test "$ld_vers_major" -lt 2 && test "$ld_vers_minor" -lt 3159; then
+      gcc_cv_ld_aligned_shf_merge=no
+    fi
+    ;;
+esac
+AC_DEFINE_UNQUOTED(HAVE_LD_ALIGNED_SHF_MERGE,
+  [`if test $gcc_cv_ld_aligned_shf_merge = yes; then echo 1; else echo 0; fi`],
+[Define 0/1 if your linker supports the SHF_MERGE flag with section alignment > 1.])
+
 gcc_GAS_CHECK_FEATURE([stabs directive], gcc_cv_as_stabs_directive, ,,
 [.stabs "gcc2_compiled.",60,0,0,0],,
 [AC_DEFINE(HAVE_AS_STABS_DIRECTIVE, 1,
diff --git a/gcc/varasm.c b/gcc/varasm.c
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -823,6 +823,9 @@ mergeable_string_section (tree decl ATTR
 	  if (align < modesize)
 	    align = modesize;
 
+	  if (!HAVE_LD_ALIGNED_SHF_MERGE && align > 8)
+	    return readonly_data_section;
+
 	  str = TREE_STRING_POINTER (decl);
 	  unit = GET_MODE_SIZE (mode);
 
@@ -861,7 +864,8 @@ mergeable_constant_section (machine_mode
       && known_le (GET_MODE_BITSIZE (mode), align)
       && align >= 8
       && align <= 256
-      && (align & (align - 1)) == 0)
+      && (align & (align - 1)) == 0
+      && (HAVE_LD_ALIGNED_SHF_MERGE ? 1 : align == 8))
     {
       const char *prefix = function_mergeable_rodata_prefix ();
       char *name = (char *) alloca (strlen (prefix) + 30);
Richard Biener Oct. 24, 2018, 9:38 a.m. | #26
On Wed, 24 Oct 2018, Rainer Orth wrote:

> Hi Eric,

> 

> >> I know: a patch to fix this is almost ready, just needs a final round of

> >> testing.

> >

> > OK, thanks for the information.

> 

> here's what I've got.  It took me two false starts, unfortunately:

> 

> * Initially, I just tried linking with LD_OPTIONS='-z nocompstrtab':

> 

>        -z nocompstrtab

> 

>            Disables the compression of ELF string  tables,  and  comment  sec-

>            tions. By default, string compression is applied to SHT_STRTAB sec-

>            tions, to SHT_PROGBITS  sections  that  have  their  SHF_MERGE  and

>            SHF_STRINGS section flags set, and to comment sections.

> 

>   While that worked, this approach has several disadvantages: if the

>   objects are somehow linked without that option, the resulting

>   executables will suddenly and mysteriously start to die with SIGBUS.

>   It would be far better if the objects themselves are save to use in

>   whatever way without relying on a special non-default linker option.

>   Besides, it also disable string table compression, so has a far larger

>   impact than necessary.

> 

> * Next, I tried to disable HAVE_GAS_SHF_MERGE completely before Solaris

>   11.4.  This also fixed the regressions you observed.  Unfortunately,

>   while this is what's used with as/ld since the Solaris assembler

>   doesn't fully support setting SHF_MERGE/SHF_STRINGS yet, when gas is

>   in use, hundreds of Go test start to FAIL on both sparc and x86:

> 

> runtime stack:

> fatal error: DW_FORM_strp out of range in .debug_info at 166762

> panic during panic

> 

>   The error is from libbacktrace/dwarf.c (read_attribute) which is

>   linked into libgo.so.13.  While readelf --debug-dump=info shows no

>   problem, dwarfdump -a complains loudly.

> 

>   I haven't yet investigated what exactly is happening here.

> 

> * Instead I tried an approach that one of the Solaris linker engineers

>   suggested which both avoids the need for special linker options and a

>   large part of the impact: it just disables string merging for sections

>   with alignment > 1 with older (pre-Solaris 11.4) versions of Solaris

>   ld.  This one worked just fine and seems the preferred approach.

>   While the Solaris 10/SPARC ld didn't show the regressions seen on

>   11.3, I've decided to apply the workaround there, too, to avoid

>   running into problems if those objects are later relinked on Solaris 11.

> 

> Bootstrapped without regressions on i386-pc-solaris2.1[01] and

> sparc-sun-solaris2.1[01] (both Solaris 11.3 and 11.4 each) with as/ld,

> gas/ld, and x86_64-pc-linux-gnu.

> 

> Ok for mainline?


LGTM.

Richard.

Patch

gcc:
2018-08-27  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* varasm.c (output_constant): Add new parameter merge_strings.
	Make strings properly zero terminated in merge string sections.
	(mergeable_string_section): Don't fail if the last char is non-zero.
	(assemble_variable_contents): Handle merge string sections.
	(assemble_variable): Likewise.
	(assemble_constant_contents): Likewise.
	(output_constant_def_contents): Likewise.

testsuite:
2018-08-27  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gnat.dg/string_merge1.adb: New test.
	* gnat.dg/string_merge2.adb: New test.
	* gcc.dg/merge-all-constants-1.c: Adjust test.
	* gcc.dg/merge-all-constants-2.c: New test.

diff -Npur gcc/varasm.c gcc/varasm.c
--- gcc/varasm.c	2018-08-26 15:02:43.157905415 +0200
+++ gcc/varasm.c	2018-08-26 17:57:26.488494866 +0200
@@ -111,7 +111,8 @@  static int compare_constant (const tree,
 static void output_constant_def_contents (rtx);
 static void output_addressed_constants (tree);
 static unsigned HOST_WIDE_INT output_constant (tree, unsigned HOST_WIDE_INT,
-					       unsigned int, bool);
+					       unsigned int, bool,
+					       bool = false);
 static void globalize_decl (tree);
 static bool decl_readonly_section_1 (enum section_category);
 #ifdef BSS_SECTION_ASM_OP
@@ -804,8 +805,8 @@  mergeable_string_section (tree decl ATTR
       && TREE_CODE (decl) == STRING_CST
       && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE
       && align <= 256
-      && (len = int_size_in_bytes (TREE_TYPE (decl))) > 0
-      && TREE_STRING_LENGTH (decl) >= len)
+      && (len = int_size_in_bytes (TREE_TYPE (decl))) >= 0
+      && TREE_STRING_LENGTH (decl) == len)
     {
       scalar_int_mode mode;
       unsigned int modesize;
@@ -835,7 +836,7 @@  mergeable_string_section (tree decl ATTR
 	      if (j == unit)
 		break;
 	    }
-	  if (i == len - unit)
+	  if (i == len - unit || (unit == 1 && i == len))
 	    {
 	      sprintf (name, "%s.str%d.%d", prefix,
 		       modesize / 8, (int) (align / 8));
@@ -2117,7 +2118,7 @@  assemble_noswitch_variable (tree decl, c
 
 static void
 assemble_variable_contents (tree decl, const char *name,
-			    bool dont_output_data)
+			    bool dont_output_data, bool merge_strings = false)
 {
   /* Do any machine/system dependent processing of the object.  */
 #ifdef ASM_DECLARE_OBJECT_NAME
@@ -2140,7 +2141,7 @@  assemble_variable_contents (tree decl, c
 	output_constant (DECL_INITIAL (decl),
 			 tree_to_uhwi (DECL_SIZE_UNIT (decl)),
 			 get_variable_align (decl),
-			 false);
+			 false, merge_strings);
       else
 	/* Leave space for it.  */
 	assemble_zeros (tree_to_uhwi (DECL_SIZE_UNIT (decl)));
@@ -2316,7 +2317,9 @@  assemble_variable (tree decl, int top_le
 	switch_to_section (sect);
       if (align > BITS_PER_UNIT)
 	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
-      assemble_variable_contents (decl, name, dont_output_data);
+      assemble_variable_contents (decl, name, dont_output_data,
+				  sect->common.flags & SECTION_MERGE
+				  && sect->common.flags & SECTION_STRINGS);
       if (asan_protected)
 	{
 	  unsigned HOST_WIDE_INT int size
@@ -3471,7 +3474,8 @@  maybe_output_constant_def_contents (stru
    constant's alignment in bits.  */
 
 static void
-assemble_constant_contents (tree exp, const char *label, unsigned int align)
+assemble_constant_contents (tree exp, const char *label, unsigned int align,
+			    bool merge_strings = false)
 {
   HOST_WIDE_INT size;
 
@@ -3481,7 +3485,7 @@  assemble_constant_contents (tree exp, co
   targetm.asm_out.declare_constant_name (asm_out_file, label, exp, size);
 
   /* Output the value of EXP.  */
-  output_constant (exp, size, align, false);
+  output_constant (exp, size, align, false, merge_strings);
 
   targetm.asm_out.decl_end ();
 }
@@ -3522,10 +3526,13 @@  output_constant_def_contents (rtx symbol
 		   || (VAR_P (decl) && DECL_IN_CONSTANT_POOL (decl))
 		   ? DECL_ALIGN (decl)
 		   : symtab_node::get (decl)->definition_alignment ());
-      switch_to_section (get_constant_section (exp, align));
+      section *sect = get_constant_section (exp, align);
+      switch_to_section (sect);
       if (align > BITS_PER_UNIT)
 	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
-      assemble_constant_contents (exp, XSTR (symbol, 0), align);
+      assemble_constant_contents (exp, XSTR (symbol, 0), align,
+				  sect->common.flags & SECTION_MERGE
+				  && sect->common.flags & SECTION_STRINGS);
       if (asan_protected)
 	{
 	  HOST_WIDE_INT size = get_constant_size (exp);
@@ -4838,7 +4845,7 @@  output_constructor (tree, unsigned HOST_
 
 static unsigned HOST_WIDE_INT
 output_constant (tree exp, unsigned HOST_WIDE_INT size, unsigned int align,
-		 bool reverse)
+		 bool reverse, bool merge_strings /* = false */)
 {
   enum tree_code code;
   unsigned HOST_WIDE_INT thissize;
@@ -4966,8 +4973,11 @@  output_constant (tree exp, unsigned HOST
 	case CONSTRUCTOR:
 	  return output_constructor (exp, size, align, reverse, NULL);
 	case STRING_CST:
-	  thissize
-	    = MIN ((unsigned HOST_WIDE_INT)TREE_STRING_LENGTH (exp), size);
+	  thissize = (unsigned HOST_WIDE_INT)TREE_STRING_LENGTH (exp);
+	  if (merge_strings
+	      && (thissize == 0
+		  || TREE_STRING_POINTER (exp) [thissize - 1] != '\0'))
+	    thissize++;
 	  gcc_checking_assert (check_string_literal (exp, size));
 	  assemble_string (TREE_STRING_POINTER (exp), thissize);
 	  break;
diff -Npur gcc/testsuite/gcc.dg/merge-all-constants-1.c gcc/testsuite/gcc.dg/merge-all-constants-1.c
--- gcc/testsuite/gcc.dg/merge-all-constants-1.c	2018-08-16 17:28:11.000000000 +0200
+++ gcc/testsuite/gcc.dg/merge-all-constants-1.c	2018-08-26 16:31:12.650271931 +0200
@@ -1,8 +1,8 @@ 
 /* { dg-do compile } */
 /* { dg-options "-w -O2 -fmerge-all-constants" } */
 
-const char str1[36] = "0123456789abcdefghijklmnopqrstuvwxyz";
+const char str1[36] = "\000123456789abcdefghijklmnopqrstuvwxyz";
 const char str2[38] = "0123456789abcdefghijklmnopqrstuvwxyz";
-const char str3[10] = "0123456789abcdefghijklmnopqrstuvwxyz";
+const char str3[10] = "\000123456789abcdefghijklmnopqrstuvwxyz";
 
-/* { dg-final { scan-assembler-not "\.rodata\.str" } } */
+/* { dg-final { scan-assembler-not "\\.rodata\\.str" } } */
diff -Npur gcc/testsuite/gcc.dg/merge-all-constants-2.c gcc/testsuite/gcc.dg/merge-all-constants-2.c
--- gcc/testsuite/gcc.dg/merge-all-constants-2.c	1970-01-01 01:00:00.000000000 +0100
+++ gcc/testsuite/gcc.dg/merge-all-constants-2.c	2018-08-26 16:31:12.650271931 +0200
@@ -0,0 +1,8 @@ 
+/* { dg-do compile } */
+/* { dg-options "-w -O2 -fmerge-all-constants" } */
+
+const char str1[36] = "0123456789abcdefghijklmnopqrstuvwxyz";
+const char str2[37] = "0123456789abcdefghijklmnopqrstuvwxyz";
+const char str3[10] = "0123456789abcdefghijklmnopqrstuvwxyz";
+
+/* { dg-final { scan-assembler-not "\\.rodata\[\n\r\]" } } */
diff -Npur gcc/testsuite/gnat.dg/string_merge1.adb gcc/testsuite/gnat.dg/string_merge1.adb
--- gcc/testsuite/gnat.dg/string_merge1.adb	1970-01-01 01:00:00.000000000 +0100
+++ gcc/testsuite/gnat.dg/string_merge1.adb	2018-08-26 16:31:12.650271931 +0200
@@ -0,0 +1,19 @@ 
+-- { dg-do compile }
+-- { dg-options "-O1 -fmerge-all-constants" }
+
+procedure String_Merge1 is
+   procedure Process (X : String);
+   pragma Import (Ada, Process);
+begin
+   Process ("ABCD");
+end;
+
+-- We expect something like:
+
+-- .section  .rodata.str1.1,"aMS",@progbits,1
+-- .LC1:
+--     .string "ABCD"
+
+-- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }
+-- { dg-final { scan-assembler-times "\\.string" 1 } }
+-- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }
diff -Npur gcc/testsuite/gnat.dg/string_merge2.adb gcc/testsuite/gnat.dg/string_merge2.adb
--- gcc/testsuite/gnat.dg/string_merge2.adb	1970-01-01 01:00:00.000000000 +0100
+++ gcc/testsuite/gnat.dg/string_merge2.adb	2018-08-26 16:31:12.650271931 +0200
@@ -0,0 +1,19 @@ 
+-- { dg-do compile }
+-- { dg-options "-O1 -fmerge-all-constants" }
+
+procedure String_Merge2 is
+   procedure Process (X : String);
+   pragma Import (Ada, Process);
+begin
+   Process ("ABCD" & Ascii.NUL);
+end;
+
+-- We expect something like:
+
+-- .section  .rodata.str1.1,"aMS",@progbits,1
+-- .LC1:
+--     .string "ABCD"
+
+-- { dg-final { scan-assembler-times "\\.rodata\\.str" 1 } }
+-- { dg-final { scan-assembler-times "\\.string" 1 } }
+-- { dg-final { scan-assembler-times "\"ABCD\"" 1 } }