Extend store merging to STRING_CST

Message ID 1958860.vJQVT1cKh3@polaris

Commit Message

Eric Botcazou July 1, 2020, 6:49 a.m.
Hi,

the GIMPLE store merging pass doesn't merge STRING_CSTs in the general case, 
although they are accepted by native_encode_expr; the reason is that the pass 
only works with integral modes, i.e. with chunks whose size is a power of two.

There are two possible ways of extending it to handle STRING_CSTs: 1) lift the 
condition of integral modes and treat STRING_CSTs as other _CST nodes but with 
arbitrary size; 2) implement a specific and separate handling for STRING_CSTs.

The attached patch implements 2) for the following reasons: on the one hand, 
even in Ada where character strings are first-class citizens, cases where 
merging STRING_CSTs with other *_CST nodes would be possible are quite rare in 
practice; on the other hand, string concatenations happen more naturally and 
frequently thanks to the "&" operator, giving rise to merging opportunities.

These opportunities generally occur after inlining, when a string argument 
passed in a subprogram call is a STRING_CST and is concatenated with other 
STRING_CSTs in the called subprogram.  In this case, the concatenation is 
originally expanded by means of a VLA and calls to BUILT_IN_MEMCPY on the 
array, and becomes of fixed length after inlining; in order to avoid having to 
deal with the calls to BUILT_IN_MEMCPY in the GIMPLE store merging pass, the 
patch also adds an optimization to gimple_fold_builtin_memory_op that turns 
calls to BUILT_IN_MEMCPY of STRING_CSTs into direct assignment of STRING_CSTs.
Unfortunately, doing this in the general case discombobulates the C-oriented 
string-related GIMPLE passes of the optimizer, so it is done only for the 
calls to BUILT_IN_MEMCPY that have been generated during gimplification for 
variable-sized assignments, which are thus marked with a new flag.
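
For illustration only (this snippet is not part of the patch, and the function name and buffer size are made up), the Ada pattern described above can be mimicked in C: each fixed-size memcpy of a string literal is foldable into a direct STRING_CST assignment, and adjacent assignments then become candidates for the store merging pass.

```c
#include <string.h>

/* Hypothetical C analogue of the Ada "&" concatenation after inlining:
   several fixed-size copies of string literals into adjacent slices of
   the same buffer.  After the memcpy -> STRING_CST folding, the stores
   are adjacent constant stores that store merging can coalesce.  */
void
concat (char buf[8])
{
  memcpy (buf, "Hell", 4);    /* fixed-size copy of a string literal */
  memcpy (buf + 4, "o!", 3);  /* adjacent copy, includes the NUL */
  buf[7] = '\0';              /* pad the trailing byte */
}
```

At -O2, a compiler implementing this optimization could in principle emit a single 8-byte store for the three stores above instead of two memcpy calls and a byte store.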

Tested on x86-64/Linux, OK for the mainline?


2020-07-01  Eric Botcazou  <ebotcazou@adacore.com>

	* gimple-fold.c (gimple_fold_builtin_memory_op): Fold calls that were
	initially created for the assignment of a variable-sized object.
	* gimple-ssa-store-merging.c (struct merged_store_group): Document
	STRING_CST for rhs_code field.
	Add string_concatenation boolean field.
	(merged_store_group::merged_store_group): Initialize it as well as
	bit_insertion here.
	(merged_store_group::do_merge): Set it upon seeing a STRING_CST.  Also
	set bit_insertion here upon seeing a BIT_INSERT_EXPR.
	(merged_store_group::apply_stores): Clear it for small regions.  Do not
	create a power-of-2-sized buffer if it is still true.  And do not set
	bit_insertion here again.
	(encode_tree_to_bitpos): Deal with BLKmode for the expression.
	(merged_store_group::can_be_merged_into): Deal with STRING_CST.
	(imm_store_chain_info::coalesce_immediate_stores): Set bit_insertion
	to true after changing MEM_REF stores into BIT_INSERT_EXPR stores.
	(count_multiple_uses): Return 0 for STRING_CST.
	(split_group): Do not split the group for a string concatenation.
	(imm_store_chain_info::output_merged_store): Constify and rename some
	local variables.  Build an array type as destination type for a string
	concatenation, as well as a zero mask, and call build_string to build
	the source.
	(lhs_valid_for_store_merging_p): Return true for VIEW_CONVERT_EXPR.
	(pass_store_merging::process_store): Accept STRING_CST on the RHS.
	* gimple.h (gimple_call_alloca_for_var_p): New accessor function.
	* gimplify.c (gimplify_modify_expr_to_memcpy): Set alloca_for_var.
	* tree.h (CALL_ALLOCA_FOR_VAR_P): Document it for BUILT_IN_MEMCPY.


2020-07-01  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/opt87.adb: New test.
	* gnat.dg/opt87_pkg.ad[sb]: New helper.

-- 
Eric Botcazou

Comments

Jeff Law via Gcc-patches July 1, 2020, 7:47 a.m. | #1
On Wed, Jul 1, 2020 at 8:50 AM Eric Botcazou <botcazou@adacore.com> wrote:
>
> [... full patch description quoted ...]
>
> Tested on x86-64/Linux, OK for the mainline?

+         && TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST
+         && tree_int_cst_equal
+            (TYPE_SIZE_UNIT (TREE_TYPE (TREE_OPERAND (src, 0))), len))
+       {

I guess we miss a BITS_PER_UNIT == 8 check here?

+         gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+         gimple_set_vdef (new_stmt, gimple_vdef (stmt));
+         if (gimple_vdef (new_stmt)
+             && TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME)
+           SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt;

you can use gimple_move_vops (new_stmt, stmt) for this.

I wonder if there are objects beside STRING_CSTs that could have their
sizes fixed via inlining, thus, whether you can omit the == STRING_CST
check?  That is, I see this change independent of the store-merging
optimization.

Otherwise the memcpy folding part looks OK to me, I skimmed the
store-merging change and didn't find anything suspicious but I wonder
whether not special-casing STRING_CSTs would actually simplify
the code?

Thanks,
Richard.


Eric Botcazou July 1, 2020, 10:32 a.m. | #2
> +         && TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST
> +         && tree_int_cst_equal
> +            (TYPE_SIZE_UNIT (TREE_TYPE (TREE_OPERAND (src, 0))), len))
> +       {
>
> I guess we miss a BITS_PER_UNIT == 8 check here?


OK, added.

> +         gimple_set_vuse (new_stmt, gimple_vuse (stmt));
> +         gimple_set_vdef (new_stmt, gimple_vdef (stmt));
> +         if (gimple_vdef (new_stmt)
> +             && TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME)
> +           SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt;
>
> you can use gimple_move_vops (new_stmt, stmt) for this.


Indeed, changed.

> I wonder if there are objects beside STRING_CSTs that could have their
> sizes fixed via inlining, thus, whether you can omit the == STRING_CST
> check?  That is, I see this change independent of the store-merging
> optimization.


Will that not interfere with the subsequent cases though?  For STRING_CSTs, 
this is not the case since they are not var_decl_component_p.  I can replace 
STRING_CST with !var_decl_component_p but I'm not sure what we will gain.

> Otherwise the memcpy folding part looks OK to me, I skimmed the
> store-merging change and didn't find anything suspicious but I wonder
> whether not special-casing STRING_CSTs would actually simplify
> the code?


This would mean implementing 1) though and, in particular, rejecting or 
splitting long strings, which is precisely what we do not want to do in Ada.

-- 
Eric Botcazou
Jeff Law via Gcc-patches July 1, 2020, noon | #3
On Wed, Jul 1, 2020 at 12:32 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> > I guess we miss a BITS_PER_UNIT == 8 check here?
>
> OK, added.
>
> > you can use gimple_move_vops (new_stmt, stmt) for this.
>
> Indeed, changed.
>
> > I wonder if there are objects beside STRING_CSTs that could have their
> > sizes fixed via inlining, thus, whether you can omit the == STRING_CST
> > check?  That is, I see this change independent of the store-merging
> > optimization.
>
> Will that not interfere with the subsequent cases though?  For STRING_CSTs,
> this is not the case since they are not var_decl_component_p.  I can replace
> STRING_CST with !var_decl_component_p but I'm not sure what we will gain.


Hmm, that's a good question - so would your patch be replaceable by
simply amending var_decl_component_p by

  (var_decl_component_p (TREE_OPERAND (src, 0))
   || TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST)

?

> > Otherwise the memcpy folding part looks OK to me, I skimmed the
> > store-merging change and didn't find anything suspicious but I wonder
> > whether not special-casing STRING_CSTs would actually simplify
> > the code?
>
> This would mean implementing 1) though and, in particular, rejecting or
> splitting long strings, which is precisely what we do not want to do in Ada.


Oh, indeed, yeah - so the separate handling looks better.  I thought
of cases like merging "x\0" with a subsequent short integer but those may
be already handled since they are power-of-two sized.

Richard.

> --
> Eric Botcazou
Eric Botcazou July 1, 2020, 9:05 p.m. | #4
> Hmm, that's a good question - so would your patch be replaceable by
> simply amending var_decl_component_p by
>
>   (var_decl_component_p (TREE_OPERAND (src, 0))
>    || TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST)
>
> ?


The gimple_call_alloca_for_var_p (stmt) kludge is still needed though and the 
end of the transformation needs to be adjusted, so I'm not sure it's better.

-- 
Eric Botcazou
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 4e3de95d2d2..6d5670c8e06 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -840,6 +840,32 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	    }
 	}
 
+      /* If this was a variable-sized copy operation that has been turned
+	 into a fixed-size string store, typically after inlining, then do
+	 the fixed-size store explicitly.  We might be able to concatenate
+	 several of them later into a single string store.  */
+      if (tree_fits_uhwi_p (len)
+	  && gimple_call_alloca_for_var_p (stmt)
+	  && TREE_CODE (src) == ADDR_EXPR
+	  && TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST
+	  && tree_int_cst_equal
+	     (TYPE_SIZE_UNIT (TREE_TYPE (TREE_OPERAND (src, 0))), len)
+	  && BITS_PER_UNIT == 8)
+	{
+	  tree new_src = TREE_OPERAND (src, 0);
+	  tree new_dest
+	    = fold_build2 (MEM_REF, TREE_TYPE (new_src), dest, off0);
+	  gimple *new_stmt = gimple_build_assign (new_dest, new_src);
+	  gimple_move_vops (new_stmt, stmt);
+	  if (!lhs)
+	    {
+	      gsi_replace (gsi, new_stmt, false);
+	      return true;
+	    }
+	  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+	  goto done;
+	}
+
       if (code == BUILT_IN_MEMMOVE)
 	{
 	  /* Both DEST and SRC must be pointer types.
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 4e3de95d2d2..52c54ac14be 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1002,7 +1002,9 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	      || src_align >= TYPE_ALIGN (desttype)))
 	destvar = fold_build2 (MEM_REF, desttype, dest, off0);
       else if (TREE_CODE (src) == ADDR_EXPR
-	       && var_decl_component_p (TREE_OPERAND (src, 0))
+	       && (var_decl_component_p (TREE_OPERAND (src, 0))
+		   || (TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST
+		       && gimple_call_alloca_for_var_p (stmt)))
 	       && tree_int_cst_equal (TYPE_SIZE_UNIT (srctype), len)
 	       && src_align >= TYPE_ALIGN (srctype)
 	       && (is_gimple_reg_type (srctype)
@@ -1071,19 +1073,34 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	  goto set_vop_and_replace;
 	}
 
-      /* We get an aggregate copy.  Use an unsigned char[] type to
-	 perform the copying to preserve padding and to avoid any issues
-	 with TREE_ADDRESSABLE types or float modes behavior on copying.  */
-      desttype = build_array_type_nelts (unsigned_char_type_node,
-					 tree_to_uhwi (len));
-      srctype = desttype;
-      if (src_align > TYPE_ALIGN (srctype))
-	srctype = build_aligned_type (srctype, src_align);
+      /* We get an aggregate copy.  If the source is a STRING_CST, then
+	 directly use its type to perform the copy.  */
+      if (TREE_CODE (src) == ADDR_EXPR
+	  && TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST
+	  && tree_int_cst_equal (TYPE_SIZE_UNIT (srctype), len))
+	{
+	  desttype = srctype;
+	  srcvar = TREE_OPERAND (src, 0);
+	}
+
+      /* Or else, use an unsigned char[] type to perform the copy in order
+	 to preserve padding and to avoid any issues with TREE_ADDRESSABLE
+	 types or float modes behavior on copying.  */
+      else
+	{
+	  desttype = build_array_type_nelts (unsigned_char_type_node,
+					     tree_to_uhwi (len));
+	  srctype = desttype;
+	  if (src_align > TYPE_ALIGN (srctype))
+	    srctype = build_aligned_type (srctype, src_align);
+	  srcvar = fold_build2 (MEM_REF, srctype, src, off0);
+	}
+
       if (dest_align > TYPE_ALIGN (desttype))
 	desttype = build_aligned_type (desttype, dest_align);
-      new_stmt
-	= gimple_build_assign (fold_build2 (MEM_REF, desttype, dest, off0),
-			       fold_build2 (MEM_REF, srctype, src, off0));
+      destvar = fold_build2 (MEM_REF, desttype, dest, off0);
+      new_stmt = gimple_build_assign (destvar, srcvar);
+
 set_vop_and_replace:
       gimple_move_vops (new_stmt, stmt);
       if (!lhs)
Jeff Law via Gcc-patches July 2, 2020, 8:40 a.m. | #5
On Wed, Jul 1, 2020 at 11:06 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> > Hmm, that's a good question - so would your patch be replaceable by
> > simply amending var_decl_component_p by
> > [...]
>
> The gimple_call_alloca_for_var_p (stmt) kludge is still needed though and the
> end of the transformation needs to be adjusted, so I'm not sure it's better.


Sorry for the ping-pong but why's using a new char[len] type problematic?

That said, I do like p2 more even if we need to special-case STRING_CST
sources at the end for store-merging to be happy.

Thanks,
Richard.

> --

> Eric Botcazou
Eric Botcazou July 2, 2020, 10:27 a.m. | #6
> Sorry for the ping-pong but why's using a new char[len] type problematic?

Because the type may be incompatible with that of the STRING_CST, so we would 
need to build a new STRING_CST with the new type.

> That said, I do like p2 more even if we need to special-case STRING_CST
> sources at the end for store-merging to be happy.


OK.

-- 
Eric Botcazou
Jeff Law via Gcc-patches July 2, 2020, 10:39 a.m. | #7
On Thu, Jul 2, 2020 at 12:27 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> > Sorry for the ping-pong but why's using a new char[len] type problematic?
>
> Because the type may be incompatible with that of the STRING_CST, so we would
> need to build a new STRING_CST with the new type.

I see.

> > That said, I do like p2 more even if we need to special-case STRING_CST
> > sources at the end for store-merging to be happy.
>
> OK.


So this variant combined with the rest of the patch is OK then.

Thanks,
Richard.

> --
> Eric Botcazou
Eric Botcazou July 2, 2020, 7:08 p.m. | #8
> So this variant combined with the rest of the patch is OK then.

Thanks.  It occurred to me that using string_constant might be slightly better 
(it is already used by gimple_fold_builtin_memchr in the same file).


	* gimple-fold.c (gimple_fold_builtin_memory_op): Fold calls that were
	initially created for the assignment of a variable-sized destination
	and whose source is now a string constant.

-- 
Eric Botcazou
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 4e3de95d2d2..72c5e43300a 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -700,7 +700,6 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   gimple *stmt = gsi_stmt (*gsi);
   tree lhs = gimple_call_lhs (stmt);
   tree len = gimple_call_arg (stmt, 2);
-  tree destvar, srcvar;
   location_t loc = gimple_location (stmt);
 
   /* If the LEN parameter is a constant zero or in range where
@@ -741,7 +740,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
     }
   else
     {
-      tree srctype, desttype;
+      tree srctype, desttype, destvar, srcvar, srcoff;
       unsigned int src_align, dest_align;
       tree off0;
       const char *tmp_str;
@@ -991,7 +990,9 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
       /* Choose between src and destination type for the access based
          on alignment, whether the access constitutes a register access
 	 and whether it may actually expose a declaration for SSA rewrite
-	 or SRA decomposition.  */
+	 or SRA decomposition.  Also try to expose a string constant, we
+	 might be able to concatenate several of them later into a single
+	 string store.  */
       destvar = NULL_TREE;
       srcvar = NULL_TREE;
       if (TREE_CODE (dest) == ADDR_EXPR
@@ -1008,7 +1009,16 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	       && (is_gimple_reg_type (srctype)
 		   || dest_align >= TYPE_ALIGN (srctype)))
 	srcvar = fold_build2 (MEM_REF, srctype, src, off0);
-      if (srcvar == NULL_TREE && destvar == NULL_TREE)
+      /* FIXME: Don't transform copies from strings with known original length.
+	 As soon as strlenopt tests that rely on it for passing are adjusted,
+	 this hack can be removed.  */
+      else if (gimple_call_alloca_for_var_p (stmt)
+	       && (srcvar = string_constant (src, &srcoff, NULL, NULL))
+	       && integer_zerop (srcoff)
+	       && tree_int_cst_equal (TYPE_SIZE_UNIT (TREE_TYPE (srcvar)), len)
+	       && dest_align >= TYPE_ALIGN (TREE_TYPE (srcvar)))
+	srctype = TREE_TYPE (srcvar);
+      else
 	return false;
 
       /* Now that we chose an access type express the other side in
@@ -1071,19 +1081,29 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	  goto set_vop_and_replace;
 	}
 
-      /* We get an aggregate copy.  Use an unsigned char[] type to
-	 perform the copying to preserve padding and to avoid any issues
-	 with TREE_ADDRESSABLE types or float modes behavior on copying.  */
-      desttype = build_array_type_nelts (unsigned_char_type_node,
-					 tree_to_uhwi (len));
-      srctype = desttype;
-      if (src_align > TYPE_ALIGN (srctype))
-	srctype = build_aligned_type (srctype, src_align);
+      /* We get an aggregate copy.  If the source is a STRING_CST, then
+	 directly use its type to perform the copy.  */
+      if (TREE_CODE (srcvar) == STRING_CST)
+	  desttype = srctype;
+
+      /* Or else, use an unsigned char[] type to perform the copy in order
+	 to preserve padding and to avoid any issues with TREE_ADDRESSABLE
+	 types or float modes behavior on copying.  */
+      else
+	{
+	  desttype = build_array_type_nelts (unsigned_char_type_node,
+					     tree_to_uhwi (len));
+	  srctype = desttype;
+	  if (src_align > TYPE_ALIGN (srctype))
+	    srctype = build_aligned_type (srctype, src_align);
+	  srcvar = fold_build2 (MEM_REF, srctype, src, off0);
+	}
+
       if (dest_align > TYPE_ALIGN (desttype))
 	desttype = build_aligned_type (desttype, dest_align);
-      new_stmt
-	= gimple_build_assign (fold_build2 (MEM_REF, desttype, dest, off0),
-			       fold_build2 (MEM_REF, srctype, src, off0));
+      destvar = fold_build2 (MEM_REF, desttype, dest, off0);
+      new_stmt = gimple_build_assign (destvar, srcvar);
+
 set_vop_and_replace:
       gimple_move_vops (new_stmt, stmt);
       if (!lhs)
Jeff Law via Gcc-patches July 3, 2020, 11:22 a.m. | #9
On Thu, Jul 2, 2020 at 9:08 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> > So this variant combined with the rest of the patch is OK then.
>
> Thanks.  It occurred to me that using string_constant might be slightly better
> (it is already used by gimple_fold_builtin_memchr in the same file).

OK.

Thanks,
Richard.

>         * gimple-fold.c (gimple_fold_builtin_memory_op): Fold calls that were
>         initially created for the assignment of a variable-sized destination
>         and whose source is now a string constant.
>
> --
> Eric Botcazou

Patch

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 4e3de95d2d2..9a467e80aae 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -840,6 +840,35 @@  gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	    }
 	}
 
+      /* If this was a variable-sized copy operation that has been turned
+	 into a fixed-size string store, typically after inlining, then do
+	 the fixed-size store explicitly.  We might be able to concatenate
+	 several of them later into a single string store.  */
+      if (tree_fits_uhwi_p (len)
+	  && gimple_call_alloca_for_var_p (stmt)
+	  && TREE_CODE (src) == ADDR_EXPR
+	  && TREE_CODE (TREE_OPERAND (src, 0)) == STRING_CST
+	  && tree_int_cst_equal
+	     (TYPE_SIZE_UNIT (TREE_TYPE (TREE_OPERAND (src, 0))), len))
+	{
+	  tree new_src = TREE_OPERAND (src, 0);
+	  tree new_dest
+	    = fold_build2 (MEM_REF, TREE_TYPE (new_src), dest, off0);
+	  gimple *new_stmt = gimple_build_assign (new_dest, new_src);
+	  gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+	  gimple_set_vdef (new_stmt, gimple_vdef (stmt));
+	  if (gimple_vdef (new_stmt)
+	      && TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME)
+	    SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt;
+	  if (!lhs)
+	    {
+	      gsi_replace (gsi, new_stmt, false);
+	      return true;
+	    }
+	  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+	  goto done;
+	}
+
       if (code == BUILT_IN_MEMMOVE)
 	{
 	  /* Both DEST and SRC must be pointer types.
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index 3ab614148a7..bac6dd05e3c 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -1362,9 +1362,9 @@  public:
   unsigned HOST_WIDE_INT bitregion_end;
   gimple *stmt;
   unsigned int order;
-  /* INTEGER_CST for constant stores, MEM_REF for memory copy,
-     BIT_*_EXPR for logical bitwise operation, BIT_INSERT_EXPR
-     for bit insertion.
+  /* INTEGER_CST for constant store, STRING_CST for string store,
+     MEM_REF for memory copy, BIT_*_EXPR for logical bitwise operation,
+     BIT_INSERT_EXPR for bit insertion.
      LROTATE_EXPR if it can be only bswap optimized and
      ops are not really meaningful.
      NOP_EXPR if bswap optimization detected identity, ops
@@ -1440,6 +1440,7 @@  public:
   unsigned int first_order;
   unsigned int last_order;
   bool bit_insertion;
+  bool string_concatenation;
   bool only_constants;
   unsigned int first_nonmergeable_order;
   int lp_nr;
@@ -1650,7 +1651,10 @@  encode_tree_to_bitpos (tree expr, unsigned char *ptr, int bitlen, int bitpos,
     {
       fixed_size_mode mode
 	= as_a <fixed_size_mode> (TYPE_MODE (TREE_TYPE (expr)));
-      byte_size = GET_MODE_SIZE (mode);
+      byte_size
+	= mode == BLKmode
+	? tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (expr)))
+	: GET_MODE_SIZE (mode);
     }
   /* Allocate an extra byte so that we have space to shift into.  */
   byte_size++;
@@ -1807,7 +1811,8 @@  merged_store_group::merged_store_group (store_immediate_info *info)
      width has been finalized.  */
   val = NULL;
   mask = NULL;
-  bit_insertion = false;
+  bit_insertion = info->rhs_code == BIT_INSERT_EXPR;
+  string_concatenation = info->rhs_code == STRING_CST;
   only_constants = info->rhs_code == INTEGER_CST;
   first_nonmergeable_order = ~0U;
   lp_nr = info->lp_nr;
@@ -1860,12 +1865,12 @@  merged_store_group::can_be_merged_into (store_immediate_info *info)
   if (info->rhs_code == stores[0]->rhs_code)
     return true;
 
-  /* BIT_INSERT_EXPR is compatible with INTEGER_CST.  */
+  /* BIT_INSERT_EXPR is compatible with INTEGER_CST if no STRING_CST.  */
   if (info->rhs_code == BIT_INSERT_EXPR && stores[0]->rhs_code == INTEGER_CST)
-    return true;
+    return !string_concatenation;
 
   if (stores[0]->rhs_code == BIT_INSERT_EXPR && info->rhs_code == INTEGER_CST)
-    return true;
+    return !string_concatenation;
 
   /* We can turn MEM_REF into BIT_INSERT_EXPR for bit-field stores, but do it
      only for small regions since this can generate a lot of instructions.  */
@@ -1875,7 +1880,7 @@  merged_store_group::can_be_merged_into (store_immediate_info *info)
       && info->bitregion_start == stores[0]->bitregion_start
       && info->bitregion_end == stores[0]->bitregion_end
       && info->bitregion_end - info->bitregion_start <= MAX_FIXED_MODE_SIZE)
-    return true;
+    return !string_concatenation;
 
   if (stores[0]->rhs_code == MEM_REF
       && (info->rhs_code == INTEGER_CST
@@ -1883,7 +1888,18 @@  merged_store_group::can_be_merged_into (store_immediate_info *info)
       && info->bitregion_start == stores[0]->bitregion_start
       && info->bitregion_end == stores[0]->bitregion_end
       && info->bitregion_end - info->bitregion_start <= MAX_FIXED_MODE_SIZE)
-    return true;
+    return !string_concatenation;
+
+  /* STRING_CST is compatible with INTEGER_CST if no BIT_INSERT_EXPR.  */
+  if (info->rhs_code == STRING_CST
+      && stores[0]->rhs_code == INTEGER_CST
+      && stores[0]->bitsize == CHAR_BIT)
+    return !bit_insertion;
+
+  if (stores[0]->rhs_code == STRING_CST
+      && info->rhs_code == INTEGER_CST
+      && info->bitsize == CHAR_BIT)
+    return !bit_insertion;
 
   return false;
 }
@@ -1932,6 +1948,21 @@  merged_store_group::do_merge (store_immediate_info *info)
       first_order = info->order;
       first_stmt = stmt;
     }
+
+  /* We need to use extraction if there is any bit-field.  */
+  if (info->rhs_code == BIT_INSERT_EXPR)
+    {
+      bit_insertion = true;
+      gcc_assert (!string_concatenation);
+    }
+
+  /* We need to use concatenation if there is any string.  */
+  if (info->rhs_code == STRING_CST)
+    {
+      string_concatenation = true;
+      gcc_assert (!bit_insertion);
+    }
+
   if (info->rhs_code != INTEGER_CST)
     only_constants = false;
 }
@@ -1972,6 +2003,9 @@  merged_store_group::merge_overlapping (store_immediate_info *info)
 bool
 merged_store_group::apply_stores ()
 {
+  store_immediate_info *info;
+  unsigned int i;
+
   /* Make sure we have more than one store in the group, otherwise we cannot
      merge anything.  */
   if (bitregion_start % BITS_PER_UNIT != 0
@@ -1979,16 +2013,23 @@  merged_store_group::apply_stores ()
       || stores.length () == 1)
     return false;
 
-  stores.qsort (sort_by_order);
-  store_immediate_info *info;
-  unsigned int i;
+  buf_size = (bitregion_end - bitregion_start) / BITS_PER_UNIT;
+
+  /* Really do string concatenation for large strings only.  */
+  if (buf_size <= MOVE_MAX)
+    string_concatenation = false;
+
   /* Create a power-of-2-sized buffer for native_encode_expr.  */
-  buf_size = 1 << ceil_log2 ((bitregion_end - bitregion_start) / BITS_PER_UNIT);
+  if (!string_concatenation)
+    buf_size = 1 << ceil_log2 (buf_size);
+
   val = XNEWVEC (unsigned char, 2 * buf_size);
   mask = val + buf_size;
   memset (val, 0, buf_size);
   memset (mask, ~0U, buf_size);
 
+  stores.qsort (sort_by_order);
+
   FOR_EACH_VEC_ELT (stores, i, info)
     {
       unsigned int pos_in_buffer = info->bitpos - bitregion_start;
@@ -2000,14 +2041,9 @@  merged_store_group::apply_stores ()
       else
 	cst = NULL_TREE;
       bool ret = true;
-      if (cst)
-	{
-	  if (info->rhs_code == BIT_INSERT_EXPR)
-	    bit_insertion = true;
-	  else
-	    ret = encode_tree_to_bitpos (cst, val, info->bitsize,
-					 pos_in_buffer, buf_size);
-	}
+      if (cst && info->rhs_code != BIT_INSERT_EXPR)
+	ret = encode_tree_to_bitpos (cst, val, info->bitsize, pos_in_buffer,
+				     buf_size);
       unsigned char *m = mask + (pos_in_buffer / BITS_PER_UNIT);
       if (BYTES_BIG_ENDIAN)
 	clear_bit_region_be (m, (BITS_PER_UNIT - 1
@@ -2029,6 +2065,8 @@  merged_store_group::apply_stores ()
 	      dump_char_array (dump_file, mask, buf_size);
 	      if (bit_insertion)
 		fputs ("  bit insertion is required\n", dump_file);
+	      if (string_concatenation)
+		fputs ("  string concatenation is required\n", dump_file);
 	    }
 	  else
 	    fprintf (dump_file, "Failed to merge stores\n");
@@ -2916,6 +2954,7 @@  imm_store_chain_info::coalesce_immediate_stores ()
 		      infoj->ops[0].val = gimple_assign_rhs1 (infoj->stmt);
 		      infoj->ops[0].base_addr = NULL_TREE;
 		    }
+		  merged_store->bit_insertion = true;
 		}
 	      if ((infof->ops[0].base_addr
 		   ? compatible_load_p (merged_store, info, base_addr, 0)
@@ -3143,6 +3182,7 @@  count_multiple_uses (store_immediate_info *info)
   switch (info->rhs_code)
     {
     case INTEGER_CST:
+    case STRING_CST:
       return 0;
     case BIT_AND_EXPR:
     case BIT_IOR_EXPR:
@@ -3238,13 +3278,14 @@  split_group (merged_store_group *group, bool allow_unaligned_store,
 
   gcc_assert ((size % BITS_PER_UNIT == 0) && (pos % BITS_PER_UNIT == 0));
 
+  /* For bswap framework using sets of stores, all the checking has been done
+     earlier in try_coalesce_bswap and the result always needs to be emitted
+     as a single store.  Likewise for string concatenation.  */
   if (group->stores[0]->rhs_code == LROTATE_EXPR
-      || group->stores[0]->rhs_code == NOP_EXPR)
+      || group->stores[0]->rhs_code == NOP_EXPR
+      || group->string_concatenation)
     {
       gcc_assert (!bzero_first);
-      /* For bswap framework using sets of stores, all the checking
-	 has been done earlier in try_coalesce_bswap and needs to be
-	 emitted as a single store.  */
       if (total_orig)
 	{
 	  /* Avoid the old/new stmt count heuristics.  It should be
@@ -3658,16 +3699,12 @@  invert_op (split_store *split_store, int idx, tree int_type, tree &mask)
 bool
 imm_store_chain_info::output_merged_store (merged_store_group *group)
 {
-  split_store *split_store;
-  unsigned int i;
-  unsigned HOST_WIDE_INT start_byte_pos
+  const unsigned HOST_WIDE_INT start_byte_pos
     = group->bitregion_start / BITS_PER_UNIT;
-
   unsigned int orig_num_stmts = group->stores.length ();
   if (orig_num_stmts < 2)
     return false;
 
-  auto_vec<class split_store *, 32> split_stores;
   bool allow_unaligned_store
     = !STRICT_ALIGNMENT && param_store_merging_allow_unaligned;
   bool allow_unaligned_load = allow_unaligned_store;
@@ -3676,6 +3713,7 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
   unsigned int num_clobber_stmts = 0;
   if (group->stores[0]->rhs_code == INTEGER_CST)
     {
+      unsigned int i;
       FOR_EACH_VEC_ELT (group->stores, i, store)
 	if (gimple_clobber_p (store->stmt))
 	  num_clobber_stmts++;
@@ -3725,7 +3763,10 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
       if ((pass_min & 2) == 0)
 	bzero_first = false;
     }
-  unsigned total_orig, total_new;
+
+  auto_vec<class split_store *, 32> split_stores;
+  split_store *split_store;
+  unsigned total_orig, total_new, i;
   split_group (group, allow_unaligned_store, allow_unaligned_load, bzero_first,
 	       &split_stores, &total_orig, &total_new);
 
@@ -3940,12 +3981,14 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 
   FOR_EACH_VEC_ELT (split_stores, i, split_store)
     {
-      unsigned HOST_WIDE_INT try_size = split_store->size;
-      unsigned HOST_WIDE_INT try_pos = split_store->bytepos;
-      unsigned HOST_WIDE_INT try_bitpos = try_pos * BITS_PER_UNIT;
-      unsigned HOST_WIDE_INT align = split_store->align;
+      const unsigned HOST_WIDE_INT try_size = split_store->size;
+      const unsigned HOST_WIDE_INT try_pos = split_store->bytepos;
+      const unsigned HOST_WIDE_INT try_bitpos = try_pos * BITS_PER_UNIT;
+      const unsigned HOST_WIDE_INT try_align = split_store->align;
+      const unsigned HOST_WIDE_INT try_offset = try_pos - start_byte_pos;
       tree dest, src;
       location_t loc;
+
       if (split_store->orig)
 	{
 	  /* If there is just a single non-clobber constituent store
@@ -3972,12 +4015,20 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 	    orig_stmts.safe_push (info->stmt);
 	  tree offset_type
 	    = get_alias_type_for_stmts (orig_stmts, false, &clique, &base);
+	  tree dest_type;
 	  loc = get_location_for_stmts (orig_stmts);
 	  orig_stmts.truncate (0);
 
-	  tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED);
-	  int_type = build_aligned_type (int_type, align);
-	  dest = fold_build2 (MEM_REF, int_type, addr,
+	  if (group->string_concatenation)
+	    dest_type
+	      = build_array_type_nelts (char_type_node,
+					try_size / BITS_PER_UNIT);
+	  else
+	    {
+	      dest_type = build_nonstandard_integer_type (try_size, UNSIGNED);
+	      dest_type = build_aligned_type (dest_type, try_align);
+	    }
+	  dest = fold_build2 (MEM_REF, dest_type, addr,
 			      build_int_cst (offset_type, try_pos));
 	  if (TREE_CODE (dest) == MEM_REF)
 	    {
@@ -3986,12 +4037,11 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 	    }
 
 	  tree mask;
-	  if (bswap_res)
+	  if (bswap_res || group->string_concatenation)
 	    mask = integer_zero_node;
 	  else
-	    mask = native_interpret_expr (int_type,
-					  group->mask + try_pos
-					  - start_byte_pos,
+	    mask = native_interpret_expr (dest_type,
+					  group->mask + try_offset,
 					  group->buf_size);
 
 	  tree ops[2];
@@ -4002,6 +4052,12 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 	      store_operand_info &op = split_store->orig_stores[0]->ops[j];
 	      if (bswap_res)
 		ops[j] = bswap_res;
+	      else if (group->string_concatenation)
+		{
+		  ops[j] = build_string (try_size / BITS_PER_UNIT,
+					 (const char *) group->val + try_offset);
+		  TREE_TYPE (ops[j]) = dest_type;
+		}
 	      else if (op.base_addr)
 		{
 		  FOR_EACH_VEC_ELT (split_store->orig_stores, k, info)
@@ -4043,8 +4099,7 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 		       warnings in that case.  */
 		    TREE_NO_WARNING (ops[j]) = 1;
 
-		  stmt = gimple_build_assign (make_ssa_name (int_type),
-					      ops[j]);
+		  stmt = gimple_build_assign (make_ssa_name (dest_type), ops[j]);
 		  gimple_set_location (stmt, load_loc);
 		  if (gsi_bb (load_gsi[j]))
 		    {
@@ -4059,10 +4114,10 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 		  ops[j] = gimple_assign_lhs (stmt);
 		  tree xor_mask;
 		  enum tree_code inv_op
-		    = invert_op (split_store, j, int_type, xor_mask);
+		    = invert_op (split_store, j, dest_type, xor_mask);
 		  if (inv_op != NOP_EXPR)
 		    {
-		      stmt = gimple_build_assign (make_ssa_name (int_type),
+		      stmt = gimple_build_assign (make_ssa_name (dest_type),
 						  inv_op, ops[j], xor_mask);
 		      gimple_set_location (stmt, load_loc);
 		      ops[j] = gimple_assign_lhs (stmt);
@@ -4075,9 +4130,8 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 		    }
 		}
 	      else
-		ops[j] = native_interpret_expr (int_type,
-						group->val + try_pos
-						- start_byte_pos,
+		ops[j] = native_interpret_expr (dest_type,
+						group->val + try_offset,
 						group->buf_size);
 	    }
 
@@ -4096,7 +4150,7 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 	      orig_stmts.truncate (0);
 
 	      stmt
-		= gimple_build_assign (make_ssa_name (int_type),
+		= gimple_build_assign (make_ssa_name (dest_type),
 				       split_store->orig_stores[0]->rhs_code,
 				       ops[0], ops[1]);
 	      gimple_set_location (stmt, bit_loc);
@@ -4115,10 +4169,10 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 	      src = gimple_assign_lhs (stmt);
 	      tree xor_mask;
 	      enum tree_code inv_op;
-	      inv_op = invert_op (split_store, 2, int_type, xor_mask);
+	      inv_op = invert_op (split_store, 2, dest_type, xor_mask);
 	      if (inv_op != NOP_EXPR)
 		{
-		  stmt = gimple_build_assign (make_ssa_name (int_type),
+		  stmt = gimple_build_assign (make_ssa_name (dest_type),
 					      inv_op, src, xor_mask);
 		  gimple_set_location (stmt, bit_loc);
 		  if (load_addr[1] == NULL_TREE && gsi_bb (load_gsi[0]))
@@ -4138,17 +4192,17 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 		  gimple_seq_add_stmt_without_update (&seq, stmt);
 		  src = gimple_assign_lhs (stmt);
 		}
-	      if (!useless_type_conversion_p (int_type, TREE_TYPE (src)))
+	      if (!useless_type_conversion_p (dest_type, TREE_TYPE (src)))
 		{
-		  stmt = gimple_build_assign (make_ssa_name (int_type),
+		  stmt = gimple_build_assign (make_ssa_name (dest_type),
 					      NOP_EXPR, src);
 		  gimple_seq_add_stmt_without_update (&seq, stmt);
 		  src = gimple_assign_lhs (stmt);
 		}
-	      inv_op = invert_op (split_store, 2, int_type, xor_mask);
+	      inv_op = invert_op (split_store, 2, dest_type, xor_mask);
 	      if (inv_op != NOP_EXPR)
 		{
-		  stmt = gimple_build_assign (make_ssa_name (int_type),
+		  stmt = gimple_build_assign (make_ssa_name (dest_type),
 					      inv_op, src, xor_mask);
 		  gimple_set_location (stmt, loc);
 		  gimple_seq_add_stmt_without_update (&seq, stmt);
@@ -4206,18 +4260,18 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 		    tem = gimple_build (&seq, loc,
 					RSHIFT_EXPR, TREE_TYPE (tem), tem,
 					build_int_cst (NULL_TREE, -shift));
-		  tem = gimple_convert (&seq, loc, int_type, tem);
+		  tem = gimple_convert (&seq, loc, dest_type, tem);
 		  if (shift > 0)
 		    tem = gimple_build (&seq, loc,
-					LSHIFT_EXPR, int_type, tem,
+					LSHIFT_EXPR, dest_type, tem,
 					build_int_cst (NULL_TREE, shift));
 		  src = gimple_build (&seq, loc,
-				      BIT_IOR_EXPR, int_type, tem, src);
+				      BIT_IOR_EXPR, dest_type, tem, src);
 		}
 
 	  if (!integer_zerop (mask))
 	    {
-	      tree tem = make_ssa_name (int_type);
+	      tree tem = make_ssa_name (dest_type);
 	      tree load_src = unshare_expr (dest);
 	      /* The load might load some or all bits uninitialized,
 		 avoid -W*uninitialized warnings in that case.
@@ -4233,28 +4287,28 @@  imm_store_chain_info::output_merged_store (merged_store_group *group)
 
 	      /* FIXME: If there is a single chunk of zero bits in mask,
 		 perhaps use BIT_INSERT_EXPR instead?  */
-	      stmt = gimple_build_assign (make_ssa_name (int_type),
+	      stmt = gimple_build_assign (make_ssa_name (dest_type),
 					  BIT_AND_EXPR, tem, mask);
 	      gimple_set_location (stmt, loc);
 	      gimple_seq_add_stmt_without_update (&seq, stmt);
 	      tem = gimple_assign_lhs (stmt);
 
 	      if (TREE_CODE (src) == INTEGER_CST)
-		src = wide_int_to_tree (int_type,
+		src = wide_int_to_tree (dest_type,
 					wi::bit_and_not (wi::to_wide (src),
 							 wi::to_wide (mask)));
 	      else
 		{
 		  tree nmask
-		    = wide_int_to_tree (int_type,
+		    = wide_int_to_tree (dest_type,
 					wi::bit_not (wi::to_wide (mask)));
-		  stmt = gimple_build_assign (make_ssa_name (int_type),
+		  stmt = gimple_build_assign (make_ssa_name (dest_type),
 					      BIT_AND_EXPR, src, nmask);
 		  gimple_set_location (stmt, loc);
 		  gimple_seq_add_stmt_without_update (&seq, stmt);
 		  src = gimple_assign_lhs (stmt);
 		}
-	      stmt = gimple_build_assign (make_ssa_name (int_type),
+	      stmt = gimple_build_assign (make_ssa_name (dest_type),
 					  BIT_IOR_EXPR, tem, src);
 	      gimple_set_location (stmt, loc);
 	      gimple_seq_add_stmt_without_update (&seq, stmt);
@@ -4451,6 +4505,7 @@  lhs_valid_for_store_merging_p (tree lhs)
     case BIT_FIELD_REF:
     case COMPONENT_REF:
     case MEM_REF:
+    case VIEW_CONVERT_EXPR:
       return true;
     default:
       return false;
@@ -4701,14 +4756,17 @@  pass_store_merging::process_store (gimple *stmt)
   store_operand_info ops[2];
   if (invalid)
     ;
+  else if (TREE_CODE (rhs) == STRING_CST)
+    {
+      rhs_code = STRING_CST;
+      ops[0].val = rhs;
+    }
   else if (rhs_valid_for_store_merging_p (rhs))
     {
       rhs_code = INTEGER_CST;
       ops[0].val = rhs;
     }
-  else if (TREE_CODE (rhs) != SSA_NAME)
-    invalid = true;
-  else
+  else if (TREE_CODE (rhs) == SSA_NAME)
     {
       gimple *def_stmt = SSA_NAME_DEF_STMT (rhs), *def_stmt1, *def_stmt2;
       if (!is_gimple_assign (def_stmt))
@@ -4820,6 +4878,8 @@  pass_store_merging::process_store (gimple *stmt)
 	  invalid = false;
 	}
     }
+  else
+    invalid = true;
 
   unsigned HOST_WIDE_INT const_bitsize, const_bitpos;
   unsigned HOST_WIDE_INT const_bitregion_start, const_bitregion_end;
diff --git a/gcc/gimple.h b/gcc/gimple.h
index ca7fec6247e..d64c47a7d0d 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -3472,6 +3472,13 @@  gimple_call_alloca_for_var_p (gcall *s)
   return (s->subcode & GF_CALL_ALLOCA_FOR_VAR) != 0;
 }
 
+/* Likewise, but taking a plain gimple statement.  */
+
+static inline bool
+gimple_call_alloca_for_var_p (gimple *s)
+{
+  const gcall *gc = GIMPLE_CHECK2<gcall *> (s);
+  return (gc->subcode & GF_CALL_ALLOCA_FOR_VAR) != 0;
+}
+
 /* If BY_DESCRIPTOR_P is true, GIMPLE_CALL S is an indirect call for which
    pointers to nested function are descriptors instead of trampolines.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 968fae2f1a3..09a30cf69a5 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4329,6 +4329,7 @@  gimplify_modify_expr_to_memcpy (tree *expr_p, tree size, bool want_value,
   t = builtin_decl_implicit (BUILT_IN_MEMCPY);
 
   gs = gimple_build_call (t, 3, to_ptr, from_ptr, size);
+  gimple_call_set_alloca_for_var (gs, true);
 
   if (want_value)
     {
diff --git a/gcc/tree.h b/gcc/tree.h
index a74872f5f3e..cf546ed9491 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -925,7 +925,9 @@  extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define CALL_FROM_THUNK_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
 
 /* In a CALL_EXPR, if the function being called is BUILT_IN_ALLOCA, means that
-   it has been built for the declaration of a variable-sized object.  */
+   it has been built for the declaration of a variable-sized object and, if the
+   function being called is BUILT_IN_MEMCPY, means that it has been built for
+   the assignment of a variable-sized object.  */
 #define CALL_ALLOCA_FOR_VAR_P(NODE) \
   (CALL_EXPR_CHECK (NODE)->base.protected_flag)