coroutines: Don't make duplicate frame copies of awaitables.

Message ID DFA1A106-8D72-4A7F-B296-397872BB0358@sandoe.co.uk
State New
Headers show
Series
  • coroutines: Don't make duplicate frame copies of awaitables.
Related show

Commit Message

Iain Sandoe March 2, 2020, 9:37 a.m.
Hi,

this corrects a thinko that seemed initially to be a missed
optimisation, but turns out to lead to wrong code in some
cases.

tested on x86_64 darwin, linux and powerpc linux
OK for trunk?
thanks
Iain

In general, we need to manage the lifetime of compiler-
generated awaitable instances in the coroutine frame, since
these must persist across suspension points.

However, it is quite possible that the user might provide the
awaitable instances, either as function params or as a local
variable.  We will already generate a frame entry for these as
required.

At present, under this circumstance, we are duplicating these,
awaitable, initialising a second frame copy for them (which we
then subsequently destroy manually after the suspension point). 
That's not efficient - so an undesirable thinko in the first place.
However, there is also an actual bug; if the compiler elects to
elide the copy (which is perfectly legal), it does not have visibility
of the manual management of the post-suspend destruction
- this subsequently leads to double-free errors.

The solution is not to make the second copy (as noted, params
and local vars already have frame copies with managed lifetimes).

gcc/cp/ChangeLog:

2020-03-02  Iain Sandoe  <iain@sandoe.co.uk>

	* coroutines.cc (build_co_await): Do not build frame
	awaitable proxy vars when the co_await expression is
	a function parameter or local var.
	(co_await_expander): Do not initialise a frame var with
	itself.
	(transform_await_expr): Only substitute the awaitable
	frame var if it's needed.
	(register_awaits): Do not make frame copies for param
	or local vars that are awaitables.

gcc/testsuite/ChangeLog:

2020-03-02  Iain Sandoe  <iain@sandoe.co.uk>

	* g++.dg/coroutines/torture/func-params-09-awaitable-parms.C: New test.
	* g++.dg/coroutines/torture/local-var-5-awaitable.C: New test.
---
 gcc/cp/coroutines.cc                          |  89 ++++++++++-----
 .../torture/func-params-09-awaitable-parms.C  | 105 ++++++++++++++++++
 .../torture/local-var-5-awaitable.C           |  73 ++++++++++++
 3 files changed, 241 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/func-params-09-awaitable-parms.C
 create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/local-var-5-awaitable.C

-- 
2.24.1

Comments

Nathan Sidwell March 2, 2020, 2:32 p.m. | #1
On 3/2/20 4:37 AM, Iain Sandoe wrote:
> Hi,

> 

> this corrects a thinko that seemed initially to be a missed

> optimisation, but turns out to lead to wrong code in some

> cases.

> 

> tested on x86_64 darwin, linux and powerpc linux

> OK for trunk?

> thanks

> Iain

> 

> In general, we need to manage the lifetime of compiler-

> generated awaitable instances in the coroutine frame, since

> these must persist across suspension points.

> 

> However, it is quite possible that the user might provide the

> awaitable instances, either as function params or as a local

> variable.  We will already generate a frame entry for these as

> required.

> 

> At present, under this circumstance, we are duplicating these,

> awaitable, initialising a second frame copy for them (which we

> then subsequently destroy manually after the suspension point).

> That's not efficient - so an undesirable thinko in the first place.

> However, there is also an actual bug; if the compiler elects to

> elide the copy (which is perfectly legal), it does not have visibility

> of the manual management of the post-suspend destruction

> - this subsequently leads to double-free errors.

> 

> The solution is not to make the second copy (as noted, params

> and local vars already have frame copies with managed lifetimes).

> 

> gcc/cp/ChangeLog:

> 

> 2020-03-02  Iain Sandoe  <iain@sandoe.co.uk>

> 

> 	* coroutines.cc (build_co_await): Do not build frame

> 	awaitable proxy vars when the co_await expression is

> 	a function parameter or local var.

> 	(co_await_expander): Do not initialise a frame var with

> 	itself.

> 	(transform_await_expr): Only substitute the awaitable

> 	frame var if it's needed.

> 	(register_awaits): Do not make frame copies for param

> 	or local vars that are awaitables.

> 


ok


-- 
Nathan Sidwell

Patch

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index ffc33aa1534..3e06f079787 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -738,8 +738,21 @@  build_co_await (location_t loc, tree a, suspend_point_kind suspend_kind)
   /* To complete the lookups, we need an instance of 'e' which is built from
      'o' according to [expr.await] 3.4.  However, we don't want to materialize
      'e' here (it might need to be placed in the coroutine frame) so we will
-     make a temp placeholder instead.  */
-  tree e_proxy = build_lang_decl (VAR_DECL, NULL_TREE, o_type);
+     make a temp placeholder instead.  If 'o' is a parameter or a local var,
+     then we do not need an additional var (parms and local vars are already
+     copied into the frame and will have lifetimes according to their original
+     scope).  */
+  tree e_proxy = STRIP_NOPS (o);
+  if (INDIRECT_REF_P (e_proxy))
+    e_proxy = TREE_OPERAND (e_proxy, 0);
+  if (TREE_CODE (e_proxy) == PARM_DECL
+      || (TREE_CODE (e_proxy) == VAR_DECL && !DECL_ARTIFICIAL (e_proxy)))
+    e_proxy = o;
+  else
+    {
+      e_proxy = build_lang_decl (VAR_DECL, NULL_TREE, o_type);
+      DECL_ARTIFICIAL (e_proxy) = true;
+    }
 
   /* I suppose we could check that this is contextually convertible to bool.  */
   tree awrd_func = NULL_TREE;
@@ -1452,10 +1465,17 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 				      tf_warning_or_error);
 
   tree stmt_list = NULL;
+  tree t_expr = STRIP_NOPS (expr);
+  tree r;
+  if (t_expr == var)
+    dtor = NULL_TREE;
+  else
+    {
   /* Initialize the var from the provided 'o' expression.  */
-  tree r = build2 (INIT_EXPR, await_type, var, expr);
+    r = build2 (INIT_EXPR, await_type, var, expr);
   r = coro_build_cvt_void_expr_stmt (r, loc);
   append_to_statement_list (r, &stmt_list);
+    }
 
   /* Use the await_ready() call to test if we need to suspend.  */
   tree ready_cond = TREE_VEC_ELT (awaiter_calls, 0); /* await_ready().  */
@@ -1687,20 +1707,26 @@  transform_await_expr (tree await_expr, await_xform_data *xform)
      and an empty pointer for void return.  */
   TREE_OPERAND (await_expr, 0) = ah;
 
-  /* Get a reference to the initial suspend var in the frame.  */
-  tree as_m
-    = lookup_member (coro_frame_type, si->await_field_id,
-		     /*protect=*/1, /*want_type=*/0, tf_warning_or_error);
-  tree as = build_class_member_access_expr (xform->actor_frame, as_m, NULL_TREE,
-					    true, tf_warning_or_error);
+  /* If we have a frame var for the awaitable, get a reference to it.  */
+  proxy_replace data;
+  if (si->await_field_id)
+    {
+      tree as_m
+	 = lookup_member (coro_frame_type, si->await_field_id,
+			  /*protect=*/1, /*want_type=*/0, tf_warning_or_error);
+      tree as = build_class_member_access_expr (xform->actor_frame, as_m,
+						NULL_TREE, true,
+						tf_warning_or_error);
 
-  /* Replace references to the instance proxy with the frame entry now
-     computed.  */
-  proxy_replace data = {TREE_OPERAND (await_expr, 1), as};
-  cp_walk_tree (&await_expr, replace_proxy, &data, NULL);
+      /* Replace references to the instance proxy with the frame entry now
+	 computed.  */
+      data.from = TREE_OPERAND (await_expr, 1);
+      data.to = as;
+      cp_walk_tree (&await_expr, replace_proxy, &data, NULL);
 
-  /* .. and replace.  */
-  TREE_OPERAND (await_expr, 1) = as;
+      /* .. and replace.  */
+      TREE_OPERAND (await_expr, 1) = as;
+    }
 
   /* Now do the self_handle.  */
   data.from = xform->self_h_proxy;
@@ -2643,15 +2669,25 @@  register_awaits (tree *stmt, int *do_subtree ATTRIBUTE_UNUSED, void *d)
      as the counter used for the function-wide await point number.  */
   data->saw_awaits++;
 
-  /* The required field has the same type as the proxy stored in the
-      await expr.  */
-  tree aw_field_type = TREE_TYPE (TREE_OPERAND (aw_expr, 1));
-
-  size_t bufsize = sizeof ("__aw_s.") + 10;
-  char *buf = (char *) alloca (bufsize);
-  snprintf (buf, bufsize, "__aw_s.%d", data->count);
-  tree aw_field_nam
-    = coro_make_frame_entry (data->field_list, buf, aw_field_type, aw_loc);
+  /* If the awaitable is a parm or a local variable, then we already have
+     a frame copy, so don't make a new one.  */
+  tree aw = TREE_OPERAND (aw_expr, 1);
+  tree aw_field_type = TREE_TYPE (aw);
+  tree aw_field_nam = NULL_TREE;
+  if (INDIRECT_REF_P (aw))
+    aw = TREE_OPERAND (aw, 0);
+  if (TREE_CODE (aw) == PARM_DECL
+      || (TREE_CODE (aw) == VAR_DECL && !DECL_ARTIFICIAL (aw)))
+    ; /* Don't make an additional copy.  */
+  else
+    {
+      /* The required field has the same type as the proxy stored in the
+	 await expr.  */
+      char *nam = xasprintf ("__aw_s.%d", data->count);
+      aw_field_nam = coro_make_frame_entry (data->field_list, nam,
+					    aw_field_type, aw_loc);
+      free (nam);
+    }
 
   /* Find out what we have to do with the awaiter's suspend method (this
      determines if we need somewhere to stash the suspend method's handle).
@@ -2671,9 +2707,10 @@  register_awaits (tree *stmt, int *do_subtree ATTRIBUTE_UNUSED, void *d)
     handle_field_nam = NULL_TREE; /* no handle is needed.  */
   else
     {
-      snprintf (buf, bufsize, "__aw_h.%u", data->count);
+      char *nam = xasprintf ("__aw_h.%u", data->count);
       handle_field_nam
-	= coro_make_frame_entry (data->field_list, buf, susp_typ, aw_loc);
+	= coro_make_frame_entry (data->field_list, nam, susp_typ, aw_loc);
+      free (nam);
     }
   register_await_info (aw_expr, aw_field_type, aw_field_nam, susp_typ,
 		       handle_field_nam);
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/func-params-09-awaitable-parms.C b/gcc/testsuite/g++.dg/coroutines/torture/func-params-09-awaitable-parms.C
new file mode 100644
index 00000000000..7d376b91f13
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/func-params-09-awaitable-parms.C
@@ -0,0 +1,105 @@ 
+// { dg-do run }
+
+// Check that we correctly handle params with non-trivial DTORs and
+// use the correct copy/move CTORs.
+
+#include "../coro.h"
+#include <vector>
+
+// boiler-plate for tests of codegen
+#include "../coro1-ret-int-yield-int.h"
+
+int regular = 0;
+int copy = 0;
+int move = 0;
+
+/* This is a more sophisticated awaitable... */
+
+struct FooAwaitable {
+  FooAwaitable(int _v) : value(_v), x(1, _v)
+    {
+      regular++;
+      PRINTF ("FooAwaitable(%d)\n",_v);
+    }
+
+  FooAwaitable(const FooAwaitable& t)
+    {
+      value = t.value;
+      x = t.x;
+      copy++;
+      PRINTF ("FooAwaitable(&), %d\n",value);
+    }
+
+  FooAwaitable(FooAwaitable&& s)
+    {
+      value = s.value;
+      s.value = -1;
+      x = std::move(s.x);
+      s.x = std::vector<int> ();
+      move++;
+      PRINTF ("FooAwaitable(&&), %d\n", value);
+    }
+
+  ~FooAwaitable() {PRINTF ("~FooAwaitable(), %d\n", value);}
+
+  bool await_ready() { return false; }
+  void await_suspend(coro::coroutine_handle<>) {}
+  int await_resume() { return value + x[0];}
+
+  void return_value(int _v) { value = _v;}
+
+  int value;
+  std::vector<int> x;
+};
+
+coro1
+my_coro (FooAwaitable t_lv, FooAwaitable& t_ref, FooAwaitable&& t_rv_ref)
+{
+  PRINT ("my_coro");
+  // We are created suspended, so correct operation depends on
+  // the parms being copied.
+  int sum = co_await t_lv;
+  PRINT ("my_coro 1");
+  sum += co_await t_ref;
+  PRINT ("my_coro 2");
+  sum += co_await t_rv_ref;
+  PRINT ("my_coro 3");
+  co_return sum;
+}
+
+int main ()
+{
+
+  PRINT ("main: create coro1");
+  FooAwaitable thing (4);
+  coro1 x = my_coro (FooAwaitable (1), thing, FooAwaitable (2));
+  PRINT ("main: done ramp");
+
+  if (x.handle.done())
+    abort();
+  x.handle.resume();
+  PRINT ("main: after resume (initial suspend)");
+
+  // now do the three co_awaits.
+  while(!x.handle.done())
+    x.handle.resume();
+  PRINT ("main: after resuming 3 co_awaits");
+
+  /* Now we should have the co_returned value.  */
+  int y = x.handle.promise().get_value();
+  if (y != 14)
+    {
+      PRINTF ("main: wrong result (%d).", y);
+      abort ();
+    }
+
+  if (regular != 3 || copy != 1 || move != 1)
+    {
+      PRINTF ("main: unexpected ctor use (R:%d, C:%d, M:%d)\n",
+	      regular, copy, move);
+      abort ();
+    }
+
+  PRINT ("main: returning");
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/local-var-5-awaitable.C b/gcc/testsuite/g++.dg/coroutines/torture/local-var-5-awaitable.C
new file mode 100644
index 00000000000..7ea00434c87
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/local-var-5-awaitable.C
@@ -0,0 +1,73 @@ 
+//  { dg-do run }
+
+// Test the case where the awaitables are local vars, and therefore already
+// have a frame representation - and should not be copied to a second frame
+// entry (since elision of that copy would break the assumptions made in the
+// management of the lifetime of the awaitable).
+
+#include "../coro.h"
+#include <vector>
+
+// boiler-plate for tests of codegen
+#include "../coro1-ret-int-yield-int.h"
+
+/* Make a non-trivial awaitable.  */
+struct Awaitable
+{
+  int v;
+  std::vector<int> x;
+  Awaitable () : v(0), x(1,0) {PRINTF ("Awaitable()\n");} 
+  Awaitable (int _v) : v(_v), x(1,_v) {PRINTF ("Awaitable(%d)\n",_v);}
+
+  bool await_ready () { return false; }
+  void await_suspend(coro::coroutine_handle<>) {}
+  int await_resume() { return v + x[0];}
+
+  ~Awaitable () {PRINTF ("~Awaitable(%d)\n",v);}
+};
+
+coro1
+my_coro (int start) noexcept
+{
+  PRINT ("my_coro");
+  Awaitable aw0 = Awaitable (start);
+  Awaitable aw1 = Awaitable (4);
+  Awaitable aw2 = Awaitable (10);
+
+  int sum;
+  /* We are started with a suspend_always init suspend expr.  */
+  sum = co_await aw0;
+  PRINT ("my_coro 1");
+  sum += co_await aw1;
+  PRINT ("my_coro 2");
+  sum += co_await aw2;
+  PRINT ("my_coro 3");
+
+  co_return sum;
+}
+
+int main ()
+{
+  PRINT ("main: create my_coro");
+  struct coro1 x = my_coro (7);
+  PRINT ("main: ramp done, resuming init suspend");
+  if (x.handle.done())
+    abort();
+  x.handle.resume();
+
+  // now do the three co_awaits.
+  while(!x.handle.done())
+    x.handle.resume();
+  PRINT ("main: after resuming 3 co_awaits");
+
+  /* Now we should have the co_returned value.  */
+  int y = x.handle.promise().get_value();
+  if (y != 42)
+    {
+      PRINTF ("main: wrong result (%d).", y);
+      abort ();
+    }
+
+  PRINT ("main: returning");
+  return 0;
+}