Fix PR 101453: ICE with optimize and large integer constant

Message ID 1626460525-4914-1-git-send-email-apinski@marvell.com
State New
Headers show
Series
  • Fix PR 101453: ICE with optimize and large integer constant
Related show

Commit Message

David Malcolm via Gcc-patches July 16, 2021, 6:35 p.m.
From: Andrew Pinski <apinski@marvell.com>


The problem is the buffer is too small to hold "-O" and
the interger.  This fixes the problem by use the correct size
instead.

Changes since v1:
* v2: Use HOST_BITS_PER_LONG and just divide by 3 instead of
3.32.

OK? Bootstrapped and tested on x86_64-linux with no regressions.

gcc/c-family/ChangeLog:

	PR c/101453
	* c-common.c (parse_optimize_options): Use the correct
	size for buffer.
---
 gcc/c-family/c-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
1.8.3.1

Comments

David Malcolm via Gcc-patches July 16, 2021, 6:49 p.m. | #1
On July 16, 2021 8:35:25 PM GMT+02:00, apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>From: Andrew Pinski <apinski@marvell.com>

>

>The problem is the buffer is too small to hold "-O" and

>the interger.  This fixes the problem by use the correct size

>instead.

>

>Changes since v1:

>* v2: Use HOST_BITS_PER_LONG and just divide by 3 instead of

>3.32.

>

>OK? Bootstrapped and tested on x86_64-linux with no regressions.


OK. 

Richard. 

>gcc/c-family/ChangeLog:

>

>	PR c/101453

>	* c-common.c (parse_optimize_options): Use the correct

>	size for buffer.

>---

> gcc/c-family/c-common.c | 2 +-

> 1 file changed, 1 insertion(+), 1 deletion(-)

>

>diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c

>index 20ec263..e7a54a5 100644

>--- a/gcc/c-family/c-common.c

>+++ b/gcc/c-family/c-common.c

>@@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)

> 

>       if (TREE_CODE (value) == INTEGER_CST)

> 	{

>-	  char buffer[20];

>+	  char buffer[HOST_BITS_PER_LONG / 3 + 4];

> 	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

> 	  vec_safe_push (optimize_args, ggc_strdup (buffer));

> 	}
Segher Boessenkool July 17, 2021, 9:30 p.m. | #2
Hi!

On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:
> --- a/gcc/c-family/c-common.c

> +++ b/gcc/c-family/c-common.c

> @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)

>  

>        if (TREE_CODE (value) == INTEGER_CST)

>  	{

> -	  char buffer[20];

> +	  char buffer[HOST_BITS_PER_LONG / 3 + 4];

>  	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

>  	  vec_safe_push (optimize_args, ggc_strdup (buffer));

>  	}


This calculation is correct, assuming "value" is non-negative.  You can
easily avoid all of that though:

	  int n = snprintf (0, 0, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  char buffer[n + 1];
	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  vec_safe_push (optimize_args, ggc_strdup (buffer));

This creates a variable length allocation on the stack though.  You can
do better (for some value of "better") by using the known longest value:
(again assuming non-negative):

	  int n = snprintf (0, 0, "-O%ld", LONG_MAX);
	  char buffer[n + 1];
	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  vec_safe_push (optimize_args, ggc_strdup (buffer));

This does not call snprintf, GCC can evaluate that at compile time.
This pattern works well in portable code.

But we can do better still:

	  char *buffer;
	  asprintf (&buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  vec_safe_push (optimize_args, ggc_strdup (buffer));
	  free (buffer);

(asprintf is in libiberty already).  And we could avoid the ggc_strdup
if we made a variant of asprintf that marks the GGC itself (the aprintf
implementation already factors calculating the length needed, it will
be easy to do this).


Segher
David Malcolm via Gcc-patches July 17, 2021, 10:13 p.m. | #3
On Sat, Jul 17, 2021 at 2:31 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>

> Hi!

>

> On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:

> > --- a/gcc/c-family/c-common.c

> > +++ b/gcc/c-family/c-common.c

> > @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)

> >

> >        if (TREE_CODE (value) == INTEGER_CST)

> >       {

> > -       char buffer[20];

> > +       char buffer[HOST_BITS_PER_LONG / 3 + 4];

> >         sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

> >         vec_safe_push (optimize_args, ggc_strdup (buffer));

> >       }

>

> This calculation is correct, assuming "value" is non-negative.  You can

> easily avoid all of that though:


Actually it is still correct even for negative values because we are
adding 4 rather than 3 to make sure there is enough room for the sign
character.
So it takes 11 max characters for a signed 32bit integer
(ceil(log(2^31)) + 1) and add the "-O" part and the null character
gives us 14.  32/3 is 10, and then add 4 gives us 14. This is the
exact amount.
So it takes 20 max characters for a signed 64bit integer
(ceil(log(2^63)) + 1) and add the "-O" part and the null character
gives us 23.  64/3 is 21, and then add 4 gives us 25. There are 2
extra bytes used which is ok.


Thanks,
Andrew


>

>           int n = snprintf (0, 0, "-O%ld", (long) TREE_INT_CST_LOW (value));

>           char buffer[n + 1];

>           sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

>           vec_safe_push (optimize_args, ggc_strdup (buffer));

>

> This creates a variable length allocation on the stack though.  You can

> do better (for some value of "better") by using the known longest value:

> (again assuming non-negative):

>

>           int n = snprintf (0, 0, "-O%ld", LONG_MAX);

>           char buffer[n + 1];

>           sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

>           vec_safe_push (optimize_args, ggc_strdup (buffer));

>

> This does not call snprintf, GCC can evaluate that at compile time.

> This pattern works well in portable code.

>

> But we can do better still:

>

>           char *buffer;

>           asprintf (&buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

>           vec_safe_push (optimize_args, ggc_strdup (buffer));

>           free (buffer);

>

> (asprintf is in libiberty already).  And we could avoid the ggc_strdup

> if we made a variant of asprintf that marks the GGC itself (the aprintf

> implementation already factors calculating the length needed, it will

> be easy to do this).

>

>

> Segher
Segher Boessenkool July 18, 2021, 7:32 a.m. | #4
On Sat, Jul 17, 2021 at 03:13:59PM -0700, Andrew Pinski wrote:
> On Sat, Jul 17, 2021 at 2:31 PM Segher Boessenkool

> <segher@kernel.crashing.org> wrote:

> > On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:

> > > --- a/gcc/c-family/c-common.c

> > > +++ b/gcc/c-family/c-common.c

> > > @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)

> > >

> > >        if (TREE_CODE (value) == INTEGER_CST)

> > >       {

> > > -       char buffer[20];

> > > +       char buffer[HOST_BITS_PER_LONG / 3 + 4];

> > >         sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

> > >         vec_safe_push (optimize_args, ggc_strdup (buffer));

> > >       }

> >

> > This calculation is correct, assuming "value" is non-negative.  You can

> > easily avoid all of that though:

> 

> Actually it is still correct even for negative values because we are

> adding 4 rather than 3 to make sure there is enough room for the sign

> character.


Say your longs have only two bits, one sign and one value (so it is
{-2, -1, 0, 1}).  There you get  char buffer[4]  although you need 5
here if you care about negative numbers: '-', 'O', '-', '1', 0.

But longs are at least 32 bits, of course.  And 14 chars is (just)
enough, because you divide by only 3 now (instead of {}^2 log 10, or
3.32 as before), giving some leeway.

(n/3+4 also fails for longs of sizes 5, 8, or 11 bits, but no more).


This only strengthens my point though: it is much easier and it requires
much less thinking, but most of all it is less error-prone, if you let
the compiler do the tricky bits.  But it is less fun!  :-)


Segher
Andreas Schwab July 18, 2021, 7:49 a.m. | #5
On Jul 18 2021, Segher Boessenkool wrote:

> On Sat, Jul 17, 2021 at 03:13:59PM -0700, Andrew Pinski wrote:

>> On Sat, Jul 17, 2021 at 2:31 PM Segher Boessenkool

>> <segher@kernel.crashing.org> wrote:

>> > On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:

>> > > --- a/gcc/c-family/c-common.c

>> > > +++ b/gcc/c-family/c-common.c

>> > > @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)

>> > >

>> > >        if (TREE_CODE (value) == INTEGER_CST)

>> > >       {

>> > > -       char buffer[20];

>> > > +       char buffer[HOST_BITS_PER_LONG / 3 + 4];

>> > >         sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));

>> > >         vec_safe_push (optimize_args, ggc_strdup (buffer));

>> > >       }

>> >

>> > This calculation is correct, assuming "value" is non-negative.  You can

>> > easily avoid all of that though:

>> 

>> Actually it is still correct even for negative values because we are

>> adding 4 rather than 3 to make sure there is enough room for the sign

>> character.

>

> Say your longs have only two bits, one sign and one value (so it is

> {-2, -1, 0, 1}).  There you get  char buffer[4]  although you need 5

> here if you care about negative numbers: '-', 'O', '-', '1', 0.

>

> But longs are at least 32 bits, of course.  And 14 chars is (just)

> enough, because you divide by only 3 now (instead of {}^2 log 10, or

> 3.32 as before), giving some leeway.

>

> (n/3+4 also fails for longs of sizes 5, 8, or 11 bits, but no more).


/* Bound on length of the string representing an unsigned integer
   value representable in B bits.  log10 (2.0) < 146/485.  The
   smallest value of B where this bound is not tight is 2621.  */
#define INT_BITS_STRLEN_BOUND(b) (((b) * 146 + 484) / 485)

/* Bound on length of the string representing an integer type or expression T.
   Subtract 1 for the sign bit if T is signed, and then add 1 more for
   a minus sign if needed.

   Because _GL_SIGNED_TYPE_OR_EXPR sometimes returns 1 when its argument is
   unsigned, this macro may overestimate the true bound by one byte when
   applied to unsigned types of size 2, 4, 16, ... bytes.  */
#define INT_STRLEN_BOUND(t)                                     \
  (INT_BITS_STRLEN_BOUND (TYPE_WIDTH (t) - _GL_SIGNED_TYPE_OR_EXPR (t)) \
   + _GL_SIGNED_TYPE_OR_EXPR (t))

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Patch

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 20ec263..e7a54a5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5799,7 +5799,7 @@  parse_optimize_options (tree args, bool attr_p)
 
       if (TREE_CODE (value) == INTEGER_CST)
 	{
-	  char buffer[20];
+	  char buffer[HOST_BITS_PER_LONG / 3 + 4];
 	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
 	  vec_safe_push (optimize_args, ggc_strdup (buffer));
 	}