[C++,coroutines,4/7,v2] Middle end expanders and transforms.

Message ID 905A0299-1DCC-4AC7-BEFF-D9483E3454B5@sandoe.co.uk
State New
Headers show
Series
  • [C++,coroutines,1/7] Common code and base definitions.
Related show

Commit Message

Iain Sandoe Jan. 9, 2020, 12:38 p.m.
Hi Richard,

The SVN commit IDs that address changes to this part of the patchset are noted
in the revised patch header below, for reference.

Richard Biener <richard.guenther@gmail.com> wrote:

> On Sun, Nov 17, 2019 at 11:27 AM Iain Sandoe <iain@sandoe.co.uk> wrote:


>>   Once we have remade the connections to their correct postions, we elide

>>   the labels that the front end inserted.

> 

> Comments inline.


> Do  you actually verify and error for non-INTEGER_CSTs in the

> frontend?  Users can

> inject malicous calls to builtins at their discretion.


Done (it seems that there’s no generic support for such a constraint so
I added a specific validation for this).

>> +           bool dir = wi::to_wide (from) != 0;

>> +           tree vptr = build_pointer_type (void_type_node);

> 

> this is ptr_type_node (also elsewhere)

done throughout.

>> +                               build_int_cst (sizetype, offs));

> 

> size_int  (offs)

done 

>> +           gassign *grpl = gimple_build_assign (lhs, repl);

>> +           gsi_replace (gsi, grpl, true);

>> +           *handled_ops_p = true;

>> +         }

>> +         break;

>> +       case BUILT_IN_CORO_DESTROY:

>> +         call_idx = 1;

>> +         /* FALLTHROUGH */

>> +       case BUILT_IN_CORO_RESUME:

>> +         {

>> +           tree ptr = gimple_call_arg (stmt, 0); /* frame ptr.  */

>> +           tree vptr = build_pointer_type (void_type_node);

>> +           HOST_WIDE_INT psize = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vptr));

>> +           HOST_WIDE_INT offset = call_idx * psize;

>> +           tree fntype = TREE_TYPE (decl);

>> +           tree fntype_ptr = build_pointer_type (fntype);

>> +           tree fntype_ppp = build_pointer_type (fntype_ptr);

>> +           tree indirect = fold_build2 (MEM_REF, fntype_ptr, ptr,

>> +                                        wide_int_to_tree (fntype_ppp, offset));

> 

> You are reading a pointer to the coro_resume builtin here?  You are careful

> with using pointer-to-function types but the offset calculation uses the size

> of void * here. 

> Note that the type of the offset operand in a MEM_REF denotes

> TBAA info, so stores to whatever is at *ptr should better have matching types

> (in this light re-using the function type of the builtin might be iffy).


In fact, in general, the object passed to this builtin will be type-erased.  We will
not know the true type of the “promise object” or the handle.  The only facts that
we can rely on here are that the layout contains:

….. something…
void (*)(void*) [resume]
void (*)(void*) [destroy]
promise: <= we only know that this is its start address, not the type.
…. something….

So we could only say at this stage for sure that the extent would be at least one
byte.

** (note) However, we are never trying to manipulate the promise object at this
stage, when ever we _are_ trying to manipulate it, we will have a complete frame
type as part of the call.

Having said that, perhaps I’m still missing a change that could improve this.

> While wide_int_to_tree seems to match HOST_WIDE_INT the canonical

> way across the code-base is to use build_int_cst (type, hwint).

done.

>> +           tree d_ptr_tmp = make_ssa_name (TYPE_MAIN_VARIANT (vptr));

> 

> No need for TYPE_MAIN_VARIANT here if you use ptr_type_node.

fixed.
> 

>> +           gassign *get_dptr = gimple_build_assign (d_ptr_tmp, indirect);

>> +           gsi_insert_before (gsi, get_dptr, GSI_SAME_STMT);

>> +           tree done = fold_build2 (EQ_EXPR, boolean_type_node, d_ptr_tmp,

>> +                                    wide_int_to_tree (vptr, 0));

> 

> null_pointer_node

fixed

>> +  body = gimple_body (current_function_decl);

>> +  memset (&wi, 0, sizeof (wi));

>> +  walk_gimple_seq_mod (&body, lower_coro_builtin, NULL, &wi);

>> +  gimple_set_body (current_function_decl, body);

> 

> it would be nice to elide the function walk for functions not

> containing any CO* stuff (you noted that below for other parts).

> We can spend a bit in struct function noting functions with

> coroutine code inside and set the bit from frontends or from

> the gimplifier for example.  Since it's behind the flag_coroutines

> paywall this can be addressed as followup.


I did this anyway.

>> +  basic_block bb;

> 

> A comment on why we need two IL walks would be nice.  It looks like you

> could have remembered CO_ACTOR calls in the first walk in a vec<gcall *>

> for example and then just process that?


I fixed this - with a worklist.

>> +  if (changed)

>> +    {

>> +      /* Remove the labels we inserted to map our hidden CFG, this

>> +        avoids confusing block merges when there are also EH labels.  */

>> +      FOR_EACH_BB_FN (bb, cfun)

> 

> An extra walk over all BBs looks a bit excessive?  The label_to_block

> map should be set up so you could walk the set and from there

> affected blocks.


I am concerned that we might need to remove labels for coroutine yield
IFNs that have been removed already - and thus are not “seen” above.
For now, I left this as is with the early exit you noted below.
(I guess it would not be hard to make it a worklist for this too)

>> +       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)

>> +         {

>> +           gimple *stmt = gsi_stmt (gsi);

>> +           if (glabel *glab = dyn_cast<glabel *> (stmt))

>> +             {

>> +               tree rem = gimple_label_label (glab);

>> +               if (to_remove.contains (rem))

>> +                 {

>> +                   gsi_remove (&gsi, true);

>> +                   to_remove.remove (rem);

>> +                   continue; /* We already moved to the next insn.  */

>> +                 }

>> +             }

>> +           gsi_next (&gsi);

> 

> At least you can stop walking a BB if you hit something else

> than a label.  Thus, else break;


done.

>> +/* Optimize (not yet) and lower frame allocation.

>> +

>> +   This is a place-holder for an optimisation to remove unused frame

>> +   entries and re-size the frame to minimum.  */

> 

> I wonder if this will ever be reasonably possible at this point of the

> compilation.  So instead of having this placeholder please do

> the copy through during one of the lowering phases.


I don’t want to side-track the patch discussion at this juncture:

 a) I removed the pass.
 b) and will just note that the coroutine frame is the direct analogue of the
 activation record; it is entirely under the control of the compiler and each
 entry could be considered as independent as any stack frame entry.  Thus
 there cannot be any hidden aliassing. I’d like to discuss in some separate
 thread the opportunities here (it seems to me to be an entirely different
 problem from the general “reorganise aggregate layout” being discussed
 elsewhere).

>>   NEXT_PASS (pass_lower_eh);

>> +  NEXT_PASS (pass_coroutine_lower_builtins);

>>   NEXT_PASS (pass_build_cfg);

>>   NEXT_PASS (pass_warn_function_return);

>> +  NEXT_PASS (pass_coroutine_early_expand_ifns);

> 

> So the two passes above are really close - what's the reason to not

> do coroutine_lower_builtins when the CFG is already built?


The first pass has to run for any function (since the library builtins could and
will be used by non-coroutine functions - e.g. in libstdc++ headers).

The first pass also lowers the IFN_CO_SUSP - which modifies the CFG.

The second pass needs the CFG, actually it would be nice to shift that pass
later in the pipeline to improve the chances of elision of CO_YIELD points via
DCE.

Revised patch below
OK for thrunk now?
thanks
Iain

-- 
2.14.3

Comments

Richard Biener Jan. 9, 2020, 1:49 p.m. | #1
On Thu, Jan 9, 2020 at 1:38 PM Iain Sandoe <iain@sandoe.co.uk> wrote:
>

> Hi Richard,

>

> The SVN commit IDs that address changes to this part of the patchset are noted

> in the revised patch header below, for reference.

>

> Richard Biener <richard.guenther@gmail.com> wrote:

>

> > On Sun, Nov 17, 2019 at 11:27 AM Iain Sandoe <iain@sandoe.co.uk> wrote:

>

> >>   Once we have remade the connections to their correct postions, we elide

> >>   the labels that the front end inserted.

> >

> > Comments inline.

>

> > Do  you actually verify and error for non-INTEGER_CSTs in the

> > frontend?  Users can

> > inject malicous calls to builtins at their discretion.

>

> Done (it seems that there’s no generic support for such a constraint so

> I added a specific validation for this).

>

> >> +           bool dir = wi::to_wide (from) != 0;

> >> +           tree vptr = build_pointer_type (void_type_node);

> >

> > this is ptr_type_node (also elsewhere)

> done throughout.

>

> >> +                               build_int_cst (sizetype, offs));

> >

> > size_int  (offs)

> done

>

> >> +           gassign *grpl = gimple_build_assign (lhs, repl);

> >> +           gsi_replace (gsi, grpl, true);

> >> +           *handled_ops_p = true;

> >> +         }

> >> +         break;

> >> +       case BUILT_IN_CORO_DESTROY:

> >> +         call_idx = 1;

> >> +         /* FALLTHROUGH */

> >> +       case BUILT_IN_CORO_RESUME:

> >> +         {

> >> +           tree ptr = gimple_call_arg (stmt, 0); /* frame ptr.  */

> >> +           tree vptr = build_pointer_type (void_type_node);

> >> +           HOST_WIDE_INT psize = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vptr));

> >> +           HOST_WIDE_INT offset = call_idx * psize;

> >> +           tree fntype = TREE_TYPE (decl);

> >> +           tree fntype_ptr = build_pointer_type (fntype);

> >> +           tree fntype_ppp = build_pointer_type (fntype_ptr);

> >> +           tree indirect = fold_build2 (MEM_REF, fntype_ptr, ptr,

> >> +                                        wide_int_to_tree (fntype_ppp, offset));

> >

> > You are reading a pointer to the coro_resume builtin here?  You are careful

> > with using pointer-to-function types but the offset calculation uses the size

> > of void * here.

> > Note that the type of the offset operand in a MEM_REF denotes

> > TBAA info, so stores to whatever is at *ptr should better have matching types

> > (in this light re-using the function type of the builtin might be iffy).

>

> In fact, in general, the object passed to this builtin will be type-erased.  We will

> not know the true type of the “promise object” or the handle.  The only facts that

> we can rely on here are that the layout contains:

>

> ….. something…

> void (*)(void*) [resume]

> void (*)(void*) [destroy]

> promise: <= we only know that this is its start address, not the type.

> …. something….

>

> So we could only say at this stage for sure that the extent would be at least one

> byte.

>

> ** (note) However, we are never trying to manipulate the promise object at this

> stage, when ever we _are_ trying to manipulate it, we will have a complete frame

> type as part of the call.

>

> Having said that, perhaps I’m still missing a change that could improve this.

>

> > While wide_int_to_tree seems to match HOST_WIDE_INT the canonical

> > way across the code-base is to use build_int_cst (type, hwint).

> done.

>

> >> +           tree d_ptr_tmp = make_ssa_name (TYPE_MAIN_VARIANT (vptr));

> >

> > No need for TYPE_MAIN_VARIANT here if you use ptr_type_node.

> fixed.

> >

> >> +           gassign *get_dptr = gimple_build_assign (d_ptr_tmp, indirect);

> >> +           gsi_insert_before (gsi, get_dptr, GSI_SAME_STMT);

> >> +           tree done = fold_build2 (EQ_EXPR, boolean_type_node, d_ptr_tmp,

> >> +                                    wide_int_to_tree (vptr, 0));

> >

> > null_pointer_node

> fixed

>

> >> +  body = gimple_body (current_function_decl);

> >> +  memset (&wi, 0, sizeof (wi));

> >> +  walk_gimple_seq_mod (&body, lower_coro_builtin, NULL, &wi);

> >> +  gimple_set_body (current_function_decl, body);

> >

> > it would be nice to elide the function walk for functions not

> > containing any CO* stuff (you noted that below for other parts).

> > We can spend a bit in struct function noting functions with

> > coroutine code inside and set the bit from frontends or from

> > the gimplifier for example.  Since it's behind the flag_coroutines

> > paywall this can be addressed as followup.

>

> I did this anyway.

>

> >> +  basic_block bb;

> >

> > A comment on why we need two IL walks would be nice.  It looks like you

> > could have remembered CO_ACTOR calls in the first walk in a vec<gcall *>

> > for example and then just process that?

>

> I fixed this - with a worklist.

>

> >> +  if (changed)

> >> +    {

> >> +      /* Remove the labels we inserted to map our hidden CFG, this

> >> +        avoids confusing block merges when there are also EH labels.  */

> >> +      FOR_EACH_BB_FN (bb, cfun)

> >

> > An extra walk over all BBs looks a bit excessive?  The label_to_block

> > map should be set up so you could walk the set and from there

> > affected blocks.

>

> I am concerned that we might need to remove labels for coroutine yield

> IFNs that have been removed already - and thus are not “seen” above.

> For now, I left this as is with the early exit you noted below.

> (I guess it would not be hard to make it a worklist for this too)

>

> >> +       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)

> >> +         {

> >> +           gimple *stmt = gsi_stmt (gsi);

> >> +           if (glabel *glab = dyn_cast<glabel *> (stmt))

> >> +             {

> >> +               tree rem = gimple_label_label (glab);

> >> +               if (to_remove.contains (rem))

> >> +                 {

> >> +                   gsi_remove (&gsi, true);

> >> +                   to_remove.remove (rem);

> >> +                   continue; /* We already moved to the next insn.  */

> >> +                 }

> >> +             }

> >> +           gsi_next (&gsi);

> >

> > At least you can stop walking a BB if you hit something else

> > than a label.  Thus, else break;

>

> done.

>

> >> +/* Optimize (not yet) and lower frame allocation.

> >> +

> >> +   This is a place-holder for an optimisation to remove unused frame

> >> +   entries and re-size the frame to minimum.  */

> >

> > I wonder if this will ever be reasonably possible at this point of the

> > compilation.  So instead of having this placeholder please do

> > the copy through during one of the lowering phases.

>

> I don’t want to side-track the patch discussion at this juncture:

>

>  a) I removed the pass.

>  b) and will just note that the coroutine frame is the direct analogue of the

>  activation record; it is entirely under the control of the compiler and each

>  entry could be considered as independent as any stack frame entry.  Thus

>  there cannot be any hidden aliassing. I’d like to discuss in some separate

>  thread the opportunities here (it seems to me to be an entirely different

>  problem from the general “reorganise aggregate layout” being discussed

>  elsewhere).


Well, it at least is related in the sense as the compiler might have chosen to
take the address of something (consider an array being part of that frame).
Or mangle the frame accesses in some other interesting way (store-merging).
So you cannot rely on all accesses being "nicely" in their original form.

Note the very same optimization would apply to nested function static chains.
IIRC for Ada we do (or there were patches to do this) try to tear the frame up
in SRA again after inlining a nested function into its caller.  Not sure whether
for coroutines we ever expose such situation.  But also with not inlining
the ABI is completely local for nested functions so we _could_ optimize
it "late" - but run into the very same complications.

> >>   NEXT_PASS (pass_lower_eh);

> >> +  NEXT_PASS (pass_coroutine_lower_builtins);

> >>   NEXT_PASS (pass_build_cfg);

> >>   NEXT_PASS (pass_warn_function_return);

> >> +  NEXT_PASS (pass_coroutine_early_expand_ifns);

> >

> > So the two passes above are really close - what's the reason to not

> > do coroutine_lower_builtins when the CFG is already built?

>

> The first pass has to run for any function (since the library builtins could and

> will be used by non-coroutine functions - e.g. in libstdc++ headers).

>

> The first pass also lowers the IFN_CO_SUSP - which modifies the CFG.

>

> The second pass needs the CFG, actually it would be nice to shift that pass

> later in the pipeline to improve the chances of elision of CO_YIELD points via

> DCE.

>

> Revised patch below

> OK for thrunk now?


Yes.

Thanks,
Richard.

> thanks

> Iain

>

> =======

>

> As described in the covering note, the main part of this is the

> expansion of the library support builtins, these are simple boolean

> or numerical substitutions.  This pass has to run for non-coroutine

> functions, since any 'regular' function may make use of them.

>

> The functionality of implementing an exit from scope without cleanup

> is performed here by lowering an IFN to a gimple goto.

>

> The final part is the expansion of the coroutine IFNs that describe the

> state machine connections to the dispatchers.  This only has to be run

> for functions that are coroutine components.

>

>    In the front end we construct a single actor function that contains

>    the coroutine state machine.

>

>    The actor function has three entry conditions:

>     1. from the ramp, resume point 0 - to initial-suspend.

>     2. when resume () is executed (resume point N).

>     3. from the destroy () shim when that is executed.

>

>    The actor function begins with two dispatchers; one for resume and

>    one for destroy (where the initial entry from the ramp is a special-

>    case of resume point 0).

>

>    Each suspend point and each dispatch entry is marked with an IFN such

>    that we can connect the relevant dispatchers to their target labels.

>

>    So, if we have:

>

>    CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)

>

>    This is await point NUM, and is the final await if FINAL is non-zero.

>    The resume point is RES_LAB, and the destroy point is DEST_LAB.

>

>    We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a

>    CO_ACTOR (NUM+1) in the destroy dispatcher.

>

>    Initially, the intent of keeping the resume and destroy paths together

>    is that the conditionals controlling them are identical, and thus there

>    would be duplication of any optimisation of those paths if the split

>    were earlier.

>

>    Subsequent inlining of the actor (and DCE) is then able to extract the

>    resume and destroy paths as separate functions if that is found

>    profitable by the optimisers.

>

>    Once we have remade the connections to their correct postions, we elide

>    the labels that the front end inserted.

>

> Squashed commits.

>

> r278861 - Address review comments, use ptr_type_node

> r278864 - Address review comments, more standard nodes and APIs

> r278884 - Address review comments, remove co_frame lowering pass.

> r278936 - Address review comments, make a more specific pass gate.

> r279098 - Address review comments, exit early where possible.

> r279405 - Address review comments, more whitespace and comment changes.

> r279726 - Address review comments, remove one iteration through BBs.

> r279740 - Re-indent lower_coro_builtin.

> r279741 - Remove unnecessary control flow, re-indent.

> r279817 - Update copyright year.

> r280035 - Fix trailing whitespace.

>

> gcc/ChangeLog:

>

> 2020-01-09  Iain Sandoe  <iain@sandoe.co.uk>

>

>         * Makefile.in: Add coroutine-passes.o.

>         * coroutine-passes.cc: New file.

>         * function.h (struct GTY function): Add a bit to indicate that the

>         function is a coroutine component.

>         * passes.def: Add pass_coroutine_lower_builtins,

>         pass_coroutine_early_expand_ifns.

>         * tree-pass.h (make_pass_coroutine_lower_builtins): New.

>         (make_pass_coroutine_early_expand_ifns): New.

>

> gcc/cp/ChangeLog:

>

> 2020-01-09  Iain Sandoe  <iain@sandoe.co.uk>

>

>         * decl.c (emit_coro_helper): Set a bit to signal that this is a

>         coroutine component.

>         (finish_function): Likewise.

> ---

>  gcc/Makefile.in         |   1 +

>  gcc/coroutine-passes.cc | 532 ++++++++++++++++++++++++++++++++++++++++++++++++

>  gcc/cp/decl.c           |   6 +

>  gcc/function.h          |   3 +

>  gcc/passes.def          |   2 +

>  gcc/tree-pass.h         |   2 +

>  6 files changed, 546 insertions(+)

>  create mode 100644 gcc/coroutine-passes.cc

>

> diff --git a/gcc/Makefile.in b/gcc/Makefile.in

> index 61b512c9b7..0849ce4826 100644

> --- a/gcc/Makefile.in

> +++ b/gcc/Makefile.in

> @@ -1265,6 +1265,7 @@ OBJS = \

>         compare-elim.o \

>         context.o \

>         convert.o \

> +       coroutine-passes.o \

>         coverage.o \

>         cppbuiltin.o \

>         cppdefault.o \

> diff --git a/gcc/coroutine-passes.cc b/gcc/coroutine-passes.cc

> new file mode 100644

> index 0000000000..d032a392ce

> --- /dev/null

> +++ b/gcc/coroutine-passes.cc

> @@ -0,0 +1,532 @@

> +/* coroutine expansion and optimisation passes.

> +

> +   Copyright (C) 2018-2020 Free Software Foundation, Inc.

> +

> + Contributed by Iain Sandoe <iain@sandoe.co.uk> under contract to Facebook.

> +

> +This file is part of GCC.

> +

> +GCC is free software; you can redistribute it and/or modify it under

> +the terms of the GNU General Public License as published by the Free

> +Software Foundation; either version 3, or (at your option) any later

> +version.

> +

> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY

> +WARRANTY; without even the implied warranty of MERCHANTABILITY or

> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License

> +for more details.

> +

> +You should have received a copy of the GNU General Public License

> +along with GCC; see the file COPYING3.  If not see

> +<http://www.gnu.org/licenses/>.  */

> +

> +#include "config.h"

> +#include "system.h"

> +#include "coretypes.h"

> +#include "backend.h"

> +#include "target.h"

> +#include "tree.h"

> +#include "gimple.h"

> +#include "tree-pass.h"

> +#include "ssa.h"

> +#include "cgraph.h"

> +#include "pretty-print.h"

> +#include "diagnostic-core.h"

> +#include "fold-const.h"

> +#include "internal-fn.h"

> +#include "langhooks.h"

> +#include "gimplify.h"

> +#include "gimple-iterator.h"

> +#include "gimplify-me.h"

> +#include "gimple-walk.h"

> +#include "gimple-fold.h"

> +#include "tree-cfg.h"

> +#include "tree-into-ssa.h"

> +#include "tree-ssa-propagate.h"

> +#include "gimple-pretty-print.h"

> +#include "cfghooks.h"

> +

> +/* Here we:

> +   * lower the internal function that implements an exit from scope.

> +   * expand the builtins that are used to implement the library

> +     interfaces to the coroutine frame.  */

> +

> +static tree

> +lower_coro_builtin (gimple_stmt_iterator *gsi, bool *handled_ops_p,

> +                   struct walk_stmt_info *wi ATTRIBUTE_UNUSED)

> +{

> +  gimple *stmt = gsi_stmt (*gsi);

> +  *handled_ops_p = !gimple_has_substatements (stmt);

> +

> +  if (gimple_code (stmt) != GIMPLE_CALL)

> +    return NULL_TREE;

> +

> +  /* This internal function implements an exit from scope without

> +     performing any cleanups; it jumps directly to the label provided.  */

> +  if (gimple_call_internal_p (stmt)

> +      && gimple_call_internal_fn (stmt) == IFN_CO_SUSPN)

> +    {

> +      tree dest = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);

> +      ggoto *g = gimple_build_goto (dest);

> +      gsi_replace (gsi, g, /* do EH */ false);

> +      *handled_ops_p = true;

> +      return NULL_TREE;

> +    }

> +

> +  tree decl = gimple_call_fndecl (stmt);

> +  if (!decl || !fndecl_built_in_p (decl, BUILT_IN_NORMAL))

> +    return NULL_TREE;

> +

> +  /* The remaining builtins implement the library interfaces to the coro

> +     frame.  */

> +  unsigned call_idx = 0;

> +

> +  switch (DECL_FUNCTION_CODE (decl))

> +    {

> +    default:

> +      break;

> +    case BUILT_IN_CORO_PROMISE:

> +      {

> +       /* If we are discarding this, then skip it; the function has no

> +          side-effects.  */

> +       tree lhs = gimple_call_lhs (stmt);

> +       if (!lhs)

> +         {

> +           gsi_remove (gsi, true);

> +           *handled_ops_p = true;

> +           return NULL_TREE;

> +         }

> +       /* The coro frame starts with two pointers (to the resume and

> +          destroy() functions).  These are followed by the promise which

> +          is aligned as per type [or user attribute].

> +          The input pointer is the first argument.

> +          The promise alignment is the second and the third is a bool

> +          that is true when we are converting from a promise ptr to a

> +          frame pointer, and false for the inverse.  */

> +       tree ptr = gimple_call_arg (stmt, 0);

> +       tree align_t = gimple_call_arg (stmt, 1);

> +       tree from = gimple_call_arg (stmt, 2);

> +       gcc_checking_assert (TREE_CODE (align_t) == INTEGER_CST);

> +       gcc_checking_assert (TREE_CODE (from) == INTEGER_CST);

> +       bool dir = wi::to_wide (from) != 0;

> +       HOST_WIDE_INT promise_align = TREE_INT_CST_LOW (align_t);

> +       HOST_WIDE_INT psize =

> +         TREE_INT_CST_LOW (TYPE_SIZE_UNIT (ptr_type_node));

> +       HOST_WIDE_INT align = TYPE_ALIGN_UNIT (ptr_type_node);

> +       align = MAX (align, promise_align);

> +       psize *= 2; /* Start with two pointers.  */

> +       psize = ROUND_UP (psize, align);

> +       HOST_WIDE_INT offs = dir ? -psize : psize;

> +       tree repl = build2 (POINTER_PLUS_EXPR, ptr_type_node, ptr,

> +                           size_int (offs));

> +       gassign *grpl = gimple_build_assign (lhs, repl);

> +       gsi_replace (gsi, grpl, true);

> +       *handled_ops_p = true;

> +      }

> +      break;

> +    case BUILT_IN_CORO_DESTROY:

> +      call_idx = 1;

> +      /* FALLTHROUGH */

> +    case BUILT_IN_CORO_RESUME:

> +      {

> +       tree ptr = gimple_call_arg (stmt, 0); /* frame ptr.  */

> +       HOST_WIDE_INT psize =

> +         TREE_INT_CST_LOW (TYPE_SIZE_UNIT (ptr_type_node));

> +       HOST_WIDE_INT offset = call_idx * psize;

> +       tree fntype = TREE_TYPE (decl);

> +       tree fntype_ptr = build_pointer_type (fntype);

> +       tree fntype_ppp = build_pointer_type (fntype_ptr);

> +       tree indirect = fold_build2 (MEM_REF, fntype_ptr, ptr,

> +                                    build_int_cst (fntype_ppp, offset));

> +       tree f_ptr_tmp = make_ssa_name (TYPE_MAIN_VARIANT (fntype_ptr));

> +       gassign *get_fptr = gimple_build_assign (f_ptr_tmp, indirect);

> +       gsi_insert_before (gsi, get_fptr, GSI_SAME_STMT);

> +       gimple_call_set_fn (static_cast<gcall *> (stmt), f_ptr_tmp);

> +       *handled_ops_p = true;

> +      }

> +      break;

> +    case BUILT_IN_CORO_DONE:

> +      {

> +       /* If we are discarding this, then skip it; the function has no

> +          side-effects.  */

> +       tree lhs = gimple_call_lhs (stmt);

> +       if (!lhs)

> +         {

> +           gsi_remove (gsi, true);

> +           *handled_ops_p = true;

> +           return NULL_TREE;

> +         }

> +       /* When we're done, the resume fn is set to NULL.  */

> +       tree ptr = gimple_call_arg (stmt, 0); /* frame ptr.  */

> +       tree vpp = build_pointer_type (ptr_type_node);

> +       tree indirect

> +         = fold_build2 (MEM_REF, vpp, ptr, build_int_cst (vpp, 0));

> +       tree d_ptr_tmp = make_ssa_name (ptr_type_node);

> +       gassign *get_dptr = gimple_build_assign (d_ptr_tmp, indirect);

> +       gsi_insert_before (gsi, get_dptr, GSI_SAME_STMT);

> +       tree done = fold_build2 (EQ_EXPR, boolean_type_node, d_ptr_tmp,

> +                                null_pointer_node);

> +       gassign *get_res = gimple_build_assign (lhs, done);

> +       gsi_replace (gsi, get_res, true);

> +       *handled_ops_p = true;

> +      }

> +      break;

> +    }

> +  return NULL_TREE;

> +}

> +

> +/* Main entry point for lowering coroutine FE builtins.  */

> +

> +static unsigned int

> +execute_lower_coro_builtins (void)

> +{

> +  struct walk_stmt_info wi;

> +  gimple_seq body;

> +

> +  body = gimple_body (current_function_decl);

> +  memset (&wi, 0, sizeof (wi));

> +  walk_gimple_seq_mod (&body, lower_coro_builtin, NULL, &wi);

> +  gimple_set_body (current_function_decl, body);

> +

> +  return 0;

> +}

> +

> +namespace {

> +

> +const pass_data pass_data_coroutine_lower_builtins = {

> +  GIMPLE_PASS,          /* type */

> +  "coro-lower-builtins", /* name */

> +  OPTGROUP_NONE,        /* optinfo_flags */

> +  TV_NONE,              /* tv_id */

> +  0,                    /* properties_required */

> +  0,                    /* properties_provided */

> +  0,                    /* properties_destroyed */

> +  0,                    /* todo_flags_start */

> +  0                     /* todo_flags_finish */

> +};

> +

> +class pass_coroutine_lower_builtins : public gimple_opt_pass

> +{

> +public:

> +  pass_coroutine_lower_builtins (gcc::context *ctxt)

> +    : gimple_opt_pass (pass_data_coroutine_lower_builtins, ctxt)

> +  {}

> +

> +  /* opt_pass methods: */

> +  virtual bool gate (function *) { return flag_coroutines; };

> +

> +  virtual unsigned int execute (function *f ATTRIBUTE_UNUSED)

> +  {

> +    return execute_lower_coro_builtins ();

> +  }

> +

> +}; // class pass_coroutine_lower_builtins

> +

> +} // namespace

> +

> +gimple_opt_pass *

> +make_pass_coroutine_lower_builtins (gcc::context *ctxt)

> +{

> +  return new pass_coroutine_lower_builtins (ctxt);

> +}

> +

> +/* Expand the remaining coroutine IFNs.

> +

> +   In the front end we construct a single actor function that contains

> +   the coroutine state machine.

> +

> +   The actor function has three entry conditions:

> +    1. from the ramp, resume point 0 - to initial-suspend.

> +    2. when resume () is executed (resume point N).

> +    3. from the destroy () shim when that is executed.

> +

> +   The actor function begins with two dispatchers; one for resume and

> +   one for destroy (where the initial entry from the ramp is a special-

> +   case of resume point 0).

> +

> +   Each suspend point and each dispatch entry is marked with an IFN such

> +   that we can connect the relevant dispatchers to their target labels.

> +

> +   So, if we have:

> +

> +   CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)

> +

> +   This is await point NUM, and is the final await if FINAL is non-zero.

> +   The resume point is RES_LAB, and the destroy point is DEST_LAB.

> +

> +   We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a

> +   CO_ACTOR (NUM+1) in the destroy dispatcher.

> +

> +   Initially, the intent of keeping the resume and destroy paths together

> +   is that the conditionals controlling them are identical, and thus there

> +   would be duplication of any optimisation of those paths if the split

> +   were earlier.

> +

> +   Subsequent inlining of the actor (and DCE) is then able to extract the

> +   resume and destroy paths as separate functions if that is found

> +   profitable by the optimisers.

> +

> +   Once we have remade the connections to their correct postions, we elide

> +   the labels that the front end inserted.  */

> +

> +static void

> +move_edge_and_update (edge e, basic_block old_bb, basic_block new_bb)

> +{

> +  if (dump_file)

> +    fprintf (dump_file, "redirecting edge from bb %u to bb %u\n", old_bb->index,

> +            new_bb->index);

> +

> +  e = redirect_edge_and_branch (e, new_bb);

> +  if (!e && dump_file)

> +    fprintf (dump_file, "failed to redirect edge ..  \n");

> +

> +  /* Die if we failed.  */

> +  gcc_checking_assert (e);

> +}

> +

> +static unsigned int

> +execute_early_expand_coro_ifns (void)

> +{

> +  /* Don't rebuild stuff unless we have to. */

> +  unsigned int todoflags = 0;

> +  bool changed = false;

> +  /* Some of the possible YIELD points will hopefully have been removed by

> +     earlier optimisations; record the ones that are still present.  */

> +  hash_map<int_hash<HOST_WIDE_INT, -1, -2>, tree> destinations;

> +  /* Labels we added to carry the CFG changes, we need to remove these to

> +     avoid confusing EH.  */

> +  hash_set<tree> to_remove;

> +  /* List of dispatch points to update.  */

> +  auto_vec<gimple_stmt_iterator, 16> actor_worklist;

> +  basic_block bb;

> +  gimple_stmt_iterator gsi;

> +

> +  FOR_EACH_BB_FN (bb, cfun)

> +    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)

> +      {

> +       gimple *stmt = gsi_stmt (gsi);

> +

> +       if (!is_gimple_call (stmt) || !gimple_call_internal_p (stmt))

> +         {

> +           gsi_next (&gsi);

> +           continue;

> +         }

> +       switch (gimple_call_internal_fn (stmt))

> +         {

> +         case IFN_CO_FRAME:

> +           {

> +             /* This internal function is a placeholder for the frame

> +                size.  In principle, we might lower it later (after some

> +                optimisation had reduced the frame size).  At present,

> +                without any such optimisation, we just set it here.  */

> +             tree lhs = gimple_call_lhs (stmt);

> +             tree size = gimple_call_arg (stmt, 0);

> +             /* Right now, this is a trivial operation - copy through

> +                the size computed during initial layout.  */

> +             gassign *grpl = gimple_build_assign (lhs, size);

> +             gsi_replace (&gsi, grpl, true);

> +             gsi_next (&gsi);

> +           }

> +           break;

> +         case IFN_CO_ACTOR:

> +           changed = true;

> +           actor_worklist.safe_push (gsi); /* Save for later.  */

> +           gsi_next (&gsi);

> +           break;

> +         case IFN_CO_YIELD:

> +           {

> +             changed = true;

> +             /* .CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR);

> +                 NUM = await number.

> +                 FINAL = 1 if this is the final_suspend() await.

> +                 RES_LAB = resume point label.

> +                 DEST_LAB = destroy point label.

> +                 FRAME_PTR = is a null pointer with the type of the coro

> +                             frame, so that we can resize, if needed.  */

> +             if (dump_file)

> +               fprintf (dump_file, "saw CO_YIELD in BB %u\n", bb->index);

> +             tree num = gimple_call_arg (stmt, 0); /* yield point.  */

> +             HOST_WIDE_INT idx = TREE_INT_CST_LOW (num);

> +             bool existed;

> +             tree res_tgt = TREE_OPERAND (gimple_call_arg (stmt, 2), 0);

> +             tree &res_dest = destinations.get_or_insert (idx, &existed);

> +             if (existed && dump_file)

> +               {

> +                 fprintf (

> +                   dump_file,

> +                   "duplicate YIELD RESUME point (" HOST_WIDE_INT_PRINT_DEC

> +                   ") ?\n",

> +                   idx);

> +                 print_gimple_stmt (dump_file, stmt, 0, TDF_VOPS|TDF_MEMSYMS);

> +               }

> +             else

> +               res_dest = res_tgt;

> +             tree dst_tgt = TREE_OPERAND (gimple_call_arg (stmt, 3), 0);

> +             tree &dst_dest = destinations.get_or_insert (idx + 1, &existed);

> +             if (existed && dump_file)

> +               {

> +                 fprintf (

> +                   dump_file,

> +                   "duplicate YIELD DESTROY point (" HOST_WIDE_INT_PRINT_DEC

> +                   ") ?\n",

> +                   idx + 1);

> +                 print_gimple_stmt (dump_file, stmt, 0, TDF_VOPS|TDF_MEMSYMS);

> +               }

> +             else

> +               dst_dest = dst_tgt;

> +             to_remove.add (res_tgt);

> +             to_remove.add (dst_tgt);

> +             /* lose the co_yield.  */

> +             gsi_remove (&gsi, true);

> +             stmt = gsi_stmt (gsi); /* next. */

> +             /* lose the copy present at O0.  */

> +             if (is_gimple_assign (stmt))

> +               {

> +                 gsi_remove (&gsi, true);

> +                 stmt = gsi_stmt (gsi);

> +               }

> +             /* Simplify the switch or if following.  */

> +             if (gswitch *gsw = dyn_cast<gswitch *> (stmt))

> +               {

> +                 gimple_switch_set_index (gsw, integer_zero_node);

> +                 fold_stmt (&gsi);

> +               }

> +             else if (gcond *gif = dyn_cast<gcond *> (stmt))

> +               {

> +                 if (gimple_cond_code (gif) == EQ_EXPR)

> +                   gimple_cond_make_true (gif);

> +                 else

> +                   gimple_cond_make_false (gif);

> +                 fold_stmt (&gsi);

> +               }

> +             else if (dump_file)

> +               print_gimple_stmt (dump_file, stmt, 0, TDF_VOPS|TDF_MEMSYMS);

> +             if (gsi_end_p (gsi))

> +               break;

> +             continue;

> +           }

> +         default:

> +           gsi_next (&gsi);

> +           break;

> +         }

> +      }

> +

> +  if (!changed)

> +    {

> +      if (dump_file)

> +       fprintf (dump_file, "coro: nothing to do\n");

> +      return todoflags;

> +    }

> +

> +  while (!actor_worklist.is_empty ())

> +    {

> +      gsi = actor_worklist.pop ();

> +      gimple *stmt = gsi_stmt (gsi);

> +      gcc_checking_assert (is_gimple_call (stmt)

> +                          && gimple_call_internal_p (stmt)

> +                          && gimple_call_internal_fn (stmt) == IFN_CO_ACTOR);

> +      bb = gsi_bb (gsi);

> +      HOST_WIDE_INT idx = TREE_INT_CST_LOW (gimple_call_arg (stmt, 0));

> +      tree *seen = destinations.get (idx);

> +      changed = true;

> +

> +      if (dump_file)

> +       fprintf (dump_file, "saw CO_ACTOR in BB %u\n", bb->index);

> +

> +      if (!seen)

> +       {

> +         /* If we never saw this index, it means that the CO_YIELD

> +         associated was elided during earlier optimisations, so we

> +         don't need to fix up the switch targets.  */

> +         if (dump_file)

> +           fprintf (dump_file, "yield point " HOST_WIDE_INT_PRINT_DEC

> +                    " not used, removing it .. \n",  idx);

> +         gsi_remove (&gsi, true);

> +         release_defs (stmt);

> +       }

> +      else

> +       {

> +         /* So we need to switch the target of this switch case to the

> +            relevant BB.  */

> +         basic_block new_bb = label_to_block (cfun, *seen);

> +         /* We expect the block we're modifying to contain a single

> +            CO_ACTOR() followed by a goto <switch default bb>.  */

> +         gcc_checking_assert (EDGE_COUNT (bb->succs) == 1);

> +         edge e;

> +         edge_iterator ei;

> +         FOR_EACH_EDGE (e, ei, bb->succs)

> +           {

> +             basic_block old_bb = e->dest;

> +             move_edge_and_update (e, old_bb, new_bb);

> +           }

> +         gsi_remove (&gsi, true);

> +       }

> +    }

> +

> +  /* Remove the labels we inserted to map our hidden CFG, this

> +     avoids confusing block merges when there are also EH labels.  */

> +  FOR_EACH_BB_FN (bb, cfun)

> +    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)

> +      {

> +       gimple *stmt = gsi_stmt (gsi);

> +       if (glabel *glab = dyn_cast<glabel *> (stmt))

> +         {

> +           tree rem = gimple_label_label (glab);

> +           if (to_remove.contains (rem))

> +             {

> +               gsi_remove (&gsi, true);

> +               to_remove.remove (rem);

> +               continue; /* We already moved to the next insn.  */

> +             }

> +         }

> +       else

> +         break;

> +       gsi_next (&gsi);

> +      }

> +

> +  /* Changed the CFG.  */

> +  todoflags |= TODO_cleanup_cfg;

> +  return todoflags;

> +}

> +

> +namespace {

> +

> +const pass_data pass_data_coroutine_early_expand_ifns = {

> +  GIMPLE_PASS,             /* type */

> +  "coro-early-expand-ifns", /* name */

> +  OPTGROUP_NONE,           /* optinfo_flags */

> +  TV_NONE,                 /* tv_id */

> +  (PROP_cfg),              /* properties_required */

> +  0,                       /* properties_provided */

> +  0,                       /* properties_destroyed */

> +  0,                       /* todo_flags_start */

> +  0                        /* todo_flags_finish, set this in the fn. */

> +};

> +

> +class pass_coroutine_early_expand_ifns : public gimple_opt_pass

> +{

> +public:

> +  pass_coroutine_early_expand_ifns (gcc::context *ctxt)

> +    : gimple_opt_pass (pass_data_coroutine_early_expand_ifns, ctxt)

> +  {}

> +

> +  /* opt_pass methods: */

> +  virtual bool gate (function *f)

> +    {

> +      return flag_coroutines && f->coroutine_component;

> +    }

> +

> +  virtual unsigned int execute (function *f ATTRIBUTE_UNUSED)

> +  {

> +    return execute_early_expand_coro_ifns ();

> +  }

> +

> +}; // class pass_coroutine_expand_ifns

> +

> +} // namespace

> +

> +gimple_opt_pass *

> +make_pass_coroutine_early_expand_ifns (gcc::context *ctxt)

> +{

> +  return new pass_coroutine_early_expand_ifns (ctxt);

> +}

> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c

> index 41c14001fd..6093613bae 100644

> --- a/gcc/cp/decl.c

> +++ b/gcc/cp/decl.c

> @@ -16802,6 +16802,9 @@ emit_coro_helper (tree helper)

>    cp_fold_function (helper);

>    DECL_CONTEXT (DECL_RESULT (helper)) = helper;

>    BLOCK_SUPERCONTEXT (DECL_INITIAL (helper)) = helper;

> +  /* This function has coroutine IFNs that we should handle in middle

> +     end lowering.  */

> +  cfun->coroutine_component = true;

>    cp_genericize (helper);

>    expand_or_defer_fn (helper);

>  }

> @@ -16858,6 +16861,9 @@ finish_function (bool inline_p)

>           return fndecl;

>         }

>

> +      /* We should handle coroutine IFNs in middle end lowering.  */

> +      cfun->coroutine_component = true;

> +

>        if (use_eh_spec_block (fndecl))

>         finish_eh_spec_block (TYPE_RAISES_EXCEPTIONS

>                               (TREE_TYPE (fndecl)),

> diff --git a/gcc/function.h b/gcc/function.h

> index 496d3f728c..1ee8ed3de5 100644

> --- a/gcc/function.h

> +++ b/gcc/function.h

> @@ -418,6 +418,9 @@ struct GTY(()) function {

>    /* Set when the function was compiled with generation of debug

>       (begin stmt, inline entry, ...) markers enabled.  */

>    unsigned int debug_nonbind_markers : 1;

> +

> +  /* Set if this is a coroutine-related function.  */

> +  unsigned int coroutine_component : 1;

>  };

>

>  /* Add the decl D to the local_decls list of FUN.  */

> diff --git a/gcc/passes.def b/gcc/passes.def

> index b6bd4f3d31..2db4ef4acc 100644

> --- a/gcc/passes.def

> +++ b/gcc/passes.def

> @@ -39,8 +39,10 @@ along with GCC; see the file COPYING3.  If not see

>    NEXT_PASS (pass_lower_tm);

>    NEXT_PASS (pass_refactor_eh);

>    NEXT_PASS (pass_lower_eh);

> +  NEXT_PASS (pass_coroutine_lower_builtins);

>    NEXT_PASS (pass_build_cfg);

>    NEXT_PASS (pass_warn_function_return);

> +  NEXT_PASS (pass_coroutine_early_expand_ifns);

>    NEXT_PASS (pass_expand_omp);

>    NEXT_PASS (pass_warn_printf);

>    NEXT_PASS (pass_walloca, /*strict_mode_p=*/true);

> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h

> index 5ff43572b8..f10f467462 100644

> --- a/gcc/tree-pass.h

> +++ b/gcc/tree-pass.h

> @@ -478,6 +478,8 @@ extern gimple_opt_pass *make_pass_gen_hsail (gcc::context *ctxt);

>  extern gimple_opt_pass *make_pass_warn_nonnull_compare (gcc::context *ctxt);

>  extern gimple_opt_pass *make_pass_sprintf_length (gcc::context *ctxt);

>  extern gimple_opt_pass *make_pass_walloca (gcc::context *ctxt);

> +extern gimple_opt_pass *make_pass_coroutine_lower_builtins (gcc::context *ctxt);

> +extern gimple_opt_pass *make_pass_coroutine_early_expand_ifns (gcc::context *ctxt);

>

>  /* IPA Passes */

>  extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);

> --

> 2.14.3

Patch

=======

As described in the covering note, the main part of this is the
expansion of the library support builtins, these are simple boolean
or numerical substitutions.  This pass has to run for non-coroutine
functions, since any 'regular' function may make use of them.

The functionality of implementing an exit from scope without cleanup
is performed here by lowering an IFN to a gimple goto.

The final part is the expansion of the coroutine IFNs that describe the
state machine connections to the dispatchers.  This only has to be run
for functions that are coroutine components.

   In the front end we construct a single actor function that contains
   the coroutine state machine.

   The actor function has three entry conditions:
    1. from the ramp, resume point 0 - to initial-suspend.
    2. when resume () is executed (resume point N).
    3. from the destroy () shim when that is executed.

   The actor function begins with two dispatchers; one for resume and
   one for destroy (where the initial entry from the ramp is a special-
   case of resume point 0).

   Each suspend point and each dispatch entry is marked with an IFN such
   that we can connect the relevant dispatchers to their target labels.

   So, if we have:

   CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)

   This is await point NUM, and is the final await if FINAL is non-zero.
   The resume point is RES_LAB, and the destroy point is DEST_LAB.

   We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a
   CO_ACTOR (NUM+1) in the destroy dispatcher.

   Initially, the intent of keeping the resume and destroy paths together
   is that the conditionals controlling them are identical, and thus there
   would be duplication of any optimisation of those paths if the split
   were earlier.

   Subsequent inlining of the actor (and DCE) is then able to extract the
   resume and destroy paths as separate functions if that is found
   profitable by the optimisers.

   Once we have remade the connections to their correct postions, we elide
   the labels that the front end inserted.

Squashed commits.

r278861 - Address review comments, use ptr_type_node
r278864 - Address review comments, more standard nodes and APIs
r278884 - Address review comments, remove co_frame lowering pass.
r278936 - Address review comments, make a more specific pass gate.
r279098 - Address review comments, exit early where possible.
r279405 - Address review comments, more whitespace and comment changes.
r279726 - Address review comments, remove one iteration through BBs.
r279740 - Re-indent lower_coro_builtin.
r279741 - Remove unnecessary control flow, re-indent.
r279817 - Update copyright year.
r280035 - Fix trailing whitespace.

gcc/ChangeLog:

2020-01-09  Iain Sandoe  <iain@sandoe.co.uk>

	* Makefile.in: Add coroutine-passes.o.
	* coroutine-passes.cc: New file.
	* function.h (struct GTY function): Add a bit to indicate that the
	function is a coroutine component.
	* passes.def: Add pass_coroutine_lower_builtins,
	pass_coroutine_early_expand_ifns.
	* tree-pass.h (make_pass_coroutine_lower_builtins): New.
	(make_pass_coroutine_early_expand_ifns): New.

gcc/cp/ChangeLog:

2020-01-09  Iain Sandoe  <iain@sandoe.co.uk>

	* decl.c (emit_coro_helper): Set a bit to signal that this is a
	coroutine component.
	(finish_function): Likewise.
---
 gcc/Makefile.in         |   1 +
 gcc/coroutine-passes.cc | 532 ++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/cp/decl.c           |   6 +
 gcc/function.h          |   3 +
 gcc/passes.def          |   2 +
 gcc/tree-pass.h         |   2 +
 6 files changed, 546 insertions(+)
 create mode 100644 gcc/coroutine-passes.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 61b512c9b7..0849ce4826 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1265,6 +1265,7 @@  OBJS = \
 	compare-elim.o \
 	context.o \
 	convert.o \
+	coroutine-passes.o \
 	coverage.o \
 	cppbuiltin.o \
 	cppdefault.o \
diff --git a/gcc/coroutine-passes.cc b/gcc/coroutine-passes.cc
new file mode 100644
index 0000000000..d032a392ce
--- /dev/null
+++ b/gcc/coroutine-passes.cc
@@ -0,0 +1,532 @@ 
+/* coroutine expansion and optimisation passes.
+
+   Copyright (C) 2018-2020 Free Software Foundation, Inc.
+
+ Contributed by Iain Sandoe <iain@sandoe.co.uk> under contract to Facebook.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "cgraph.h"
+#include "pretty-print.h"
+#include "diagnostic-core.h"
+#include "fold-const.h"
+#include "internal-fn.h"
+#include "langhooks.h"
+#include "gimplify.h"
+#include "gimple-iterator.h"
+#include "gimplify-me.h"
+#include "gimple-walk.h"
+#include "gimple-fold.h"
+#include "tree-cfg.h"
+#include "tree-into-ssa.h"
+#include "tree-ssa-propagate.h"
+#include "gimple-pretty-print.h"
+#include "cfghooks.h"
+
+/* Here we:
+   * lower the internal function that implements an exit from scope.
+   * expand the builtins that are used to implement the library
+     interfaces to the coroutine frame.  */
+
+static tree
+lower_coro_builtin (gimple_stmt_iterator *gsi, bool *handled_ops_p,
+		    struct walk_stmt_info *wi ATTRIBUTE_UNUSED)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  *handled_ops_p = !gimple_has_substatements (stmt);
+
+  if (gimple_code (stmt) != GIMPLE_CALL)
+    return NULL_TREE;
+
+  /* This internal function implements an exit from scope without
+     performing any cleanups; it jumps directly to the label provided.  */
+  if (gimple_call_internal_p (stmt)
+      && gimple_call_internal_fn (stmt) == IFN_CO_SUSPN)
+    {
+      tree dest = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
+      ggoto *g = gimple_build_goto (dest);
+      gsi_replace (gsi, g, /* do EH */ false);
+      *handled_ops_p = true;
+      return NULL_TREE;
+    }
+
+  tree decl = gimple_call_fndecl (stmt);
+  if (!decl || !fndecl_built_in_p (decl, BUILT_IN_NORMAL))
+    return NULL_TREE;
+
+  /* The remaining builtins implement the library interfaces to the coro
+     frame.  */
+  unsigned call_idx = 0;
+
+  switch (DECL_FUNCTION_CODE (decl))
+    {
+    default:
+      break;
+    case BUILT_IN_CORO_PROMISE:
+      {
+	/* If we are discarding this, then skip it; the function has no
+	   side-effects.  */
+	tree lhs = gimple_call_lhs (stmt);
+	if (!lhs)
+	  {
+	    gsi_remove (gsi, true);
+	    *handled_ops_p = true;
+	    return NULL_TREE;
+	  }
+	/* The coro frame starts with two pointers (to the resume and
+	   destroy() functions).  These are followed by the promise which
+	   is aligned as per type [or user attribute].
+	   The input pointer is the first argument.
+	   The promise alignment is the second and the third is a bool
+	   that is true when we are converting from a promise ptr to a
+	   frame pointer, and false for the inverse.  */
+	tree ptr = gimple_call_arg (stmt, 0);
+	tree align_t = gimple_call_arg (stmt, 1);
+	tree from = gimple_call_arg (stmt, 2);
+	gcc_checking_assert (TREE_CODE (align_t) == INTEGER_CST);
+	gcc_checking_assert (TREE_CODE (from) == INTEGER_CST);
+	bool dir = wi::to_wide (from) != 0;
+	HOST_WIDE_INT promise_align = TREE_INT_CST_LOW (align_t);
+	HOST_WIDE_INT psize =
+	  TREE_INT_CST_LOW (TYPE_SIZE_UNIT (ptr_type_node));
+	HOST_WIDE_INT align = TYPE_ALIGN_UNIT (ptr_type_node);
+	align = MAX (align, promise_align);
+	psize *= 2; /* Start with two pointers.  */
+	psize = ROUND_UP (psize, align);
+	HOST_WIDE_INT offs = dir ? -psize : psize;
+	tree repl = build2 (POINTER_PLUS_EXPR, ptr_type_node, ptr,
+			    size_int (offs));
+	gassign *grpl = gimple_build_assign (lhs, repl);
+	gsi_replace (gsi, grpl, true);
+	*handled_ops_p = true;
+      }
+      break;
+    case BUILT_IN_CORO_DESTROY:
+      call_idx = 1;
+      /* FALLTHROUGH */
+    case BUILT_IN_CORO_RESUME:
+      {
+	tree ptr = gimple_call_arg (stmt, 0); /* frame ptr.  */
+	HOST_WIDE_INT psize =
+	  TREE_INT_CST_LOW (TYPE_SIZE_UNIT (ptr_type_node));
+	HOST_WIDE_INT offset = call_idx * psize;
+	tree fntype = TREE_TYPE (decl);
+	tree fntype_ptr = build_pointer_type (fntype);
+	tree fntype_ppp = build_pointer_type (fntype_ptr);
+	tree indirect = fold_build2 (MEM_REF, fntype_ptr, ptr,
+				     build_int_cst (fntype_ppp, offset));
+	tree f_ptr_tmp = make_ssa_name (TYPE_MAIN_VARIANT (fntype_ptr));
+	gassign *get_fptr = gimple_build_assign (f_ptr_tmp, indirect);
+	gsi_insert_before (gsi, get_fptr, GSI_SAME_STMT);
+	gimple_call_set_fn (static_cast<gcall *> (stmt), f_ptr_tmp);
+	*handled_ops_p = true;
+      }
+      break;
+    case BUILT_IN_CORO_DONE:
+      {
+	/* If we are discarding this, then skip it; the function has no
+	   side-effects.  */
+	tree lhs = gimple_call_lhs (stmt);
+	if (!lhs)
+	  {
+	    gsi_remove (gsi, true);
+	    *handled_ops_p = true;
+	    return NULL_TREE;
+	  }
+	/* When we're done, the resume fn is set to NULL.  */
+	tree ptr = gimple_call_arg (stmt, 0); /* frame ptr.  */
+	tree vpp = build_pointer_type (ptr_type_node);
+	tree indirect
+	  = fold_build2 (MEM_REF, vpp, ptr, build_int_cst (vpp, 0));
+	tree d_ptr_tmp = make_ssa_name (ptr_type_node);
+	gassign *get_dptr = gimple_build_assign (d_ptr_tmp, indirect);
+	gsi_insert_before (gsi, get_dptr, GSI_SAME_STMT);
+	tree done = fold_build2 (EQ_EXPR, boolean_type_node, d_ptr_tmp,
+				 null_pointer_node);
+	gassign *get_res = gimple_build_assign (lhs, done);
+	gsi_replace (gsi, get_res, true);
+	*handled_ops_p = true;
+      }
+      break;
+    }
+  return NULL_TREE;
+}
+
+/* Main entry point for lowering coroutine FE builtins.  */
+
+static unsigned int
+execute_lower_coro_builtins (void)
+{
+  struct walk_stmt_info wi;
+  gimple_seq body;
+
+  body = gimple_body (current_function_decl);
+  memset (&wi, 0, sizeof (wi));
+  walk_gimple_seq_mod (&body, lower_coro_builtin, NULL, &wi);
+  gimple_set_body (current_function_decl, body);
+
+  return 0;
+}
+
+namespace {
+
+const pass_data pass_data_coroutine_lower_builtins = {
+  GIMPLE_PASS,		 /* type */
+  "coro-lower-builtins", /* name */
+  OPTGROUP_NONE,	 /* optinfo_flags */
+  TV_NONE,		 /* tv_id */
+  0,			 /* properties_required */
+  0,			 /* properties_provided */
+  0,			 /* properties_destroyed */
+  0,			 /* todo_flags_start */
+  0			 /* todo_flags_finish */
+};
+
+class pass_coroutine_lower_builtins : public gimple_opt_pass
+{
+public:
+  pass_coroutine_lower_builtins (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_coroutine_lower_builtins, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *) { return flag_coroutines; };
+
+  virtual unsigned int execute (function *f ATTRIBUTE_UNUSED)
+  {
+    return execute_lower_coro_builtins ();
+  }
+
+}; // class pass_coroutine_lower_builtins
+
+} // namespace
+
+gimple_opt_pass *
+make_pass_coroutine_lower_builtins (gcc::context *ctxt)
+{
+  return new pass_coroutine_lower_builtins (ctxt);
+}
+
+/* Expand the remaining coroutine IFNs.
+
+   In the front end we construct a single actor function that contains
+   the coroutine state machine.
+
+   The actor function has three entry conditions:
+    1. from the ramp, resume point 0 - to initial-suspend.
+    2. when resume () is executed (resume point N).
+    3. from the destroy () shim when that is executed.
+
+   The actor function begins with two dispatchers; one for resume and
+   one for destroy (where the initial entry from the ramp is a special-
+   case of resume point 0).
+
+   Each suspend point and each dispatch entry is marked with an IFN such
+   that we can connect the relevant dispatchers to their target labels.
+
+   So, if we have:
+
+   CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)
+
+   This is await point NUM, and is the final await if FINAL is non-zero.
+   The resume point is RES_LAB, and the destroy point is DEST_LAB.
+
+   We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a
+   CO_ACTOR (NUM+1) in the destroy dispatcher.
+
+   Initially, the intent of keeping the resume and destroy paths together
+   is that the conditionals controlling them are identical, and thus there
+   would be duplication of any optimisation of those paths if the split
+   were earlier.
+
+   Subsequent inlining of the actor (and DCE) is then able to extract the
+   resume and destroy paths as separate functions if that is found
+   profitable by the optimisers.
+
+   Once we have remade the connections to their correct postions, we elide
+   the labels that the front end inserted.  */
+
+static void
+move_edge_and_update (edge e, basic_block old_bb, basic_block new_bb)
+{
+  if (dump_file)
+    fprintf (dump_file, "redirecting edge from bb %u to bb %u\n", old_bb->index,
+	     new_bb->index);
+
+  e = redirect_edge_and_branch (e, new_bb);
+  if (!e && dump_file)
+    fprintf (dump_file, "failed to redirect edge ..  \n");
+
+  /* Die if we failed.  */
+  gcc_checking_assert (e);
+}
+
+static unsigned int
+execute_early_expand_coro_ifns (void)
+{
+  /* Don't rebuild stuff unless we have to. */
+  unsigned int todoflags = 0;
+  bool changed = false;
+  /* Some of the possible YIELD points will hopefully have been removed by
+     earlier optimisations; record the ones that are still present.  */
+  hash_map<int_hash<HOST_WIDE_INT, -1, -2>, tree> destinations;
+  /* Labels we added to carry the CFG changes, we need to remove these to
+     avoid confusing EH.  */
+  hash_set<tree> to_remove;
+  /* List of dispatch points to update.  */
+  auto_vec<gimple_stmt_iterator, 16> actor_worklist;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+      {
+	gimple *stmt = gsi_stmt (gsi);
+
+	if (!is_gimple_call (stmt) || !gimple_call_internal_p (stmt))
+	  {
+	    gsi_next (&gsi);
+	    continue;
+	  }
+	switch (gimple_call_internal_fn (stmt))
+	  {
+	  case IFN_CO_FRAME:
+	    {
+	      /* This internal function is a placeholder for the frame
+		 size.  In principle, we might lower it later (after some
+		 optimisation had reduced the frame size).  At present,
+		 without any such optimisation, we just set it here.  */
+	      tree lhs = gimple_call_lhs (stmt);
+	      tree size = gimple_call_arg (stmt, 0);
+	      /* Right now, this is a trivial operation - copy through
+		 the size computed during initial layout.  */
+	      gassign *grpl = gimple_build_assign (lhs, size);
+	      gsi_replace (&gsi, grpl, true);
+	      gsi_next (&gsi);
+	    }
+	    break;
+	  case IFN_CO_ACTOR:
+	    changed = true;
+	    actor_worklist.safe_push (gsi); /* Save for later.  */
+	    gsi_next (&gsi);
+	    break;
+	  case IFN_CO_YIELD:
+	    {
+	      changed = true;
+	      /* .CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR);
+		  NUM = await number.
+		  FINAL = 1 if this is the final_suspend() await.
+		  RES_LAB = resume point label.
+		  DEST_LAB = destroy point label.
+		  FRAME_PTR = is a null pointer with the type of the coro
+			      frame, so that we can resize, if needed.  */
+	      if (dump_file)
+		fprintf (dump_file, "saw CO_YIELD in BB %u\n", bb->index);
+	      tree num = gimple_call_arg (stmt, 0); /* yield point.  */
+	      HOST_WIDE_INT idx = TREE_INT_CST_LOW (num);
+	      bool existed;
+	      tree res_tgt = TREE_OPERAND (gimple_call_arg (stmt, 2), 0);
+	      tree &res_dest = destinations.get_or_insert (idx, &existed);
+	      if (existed && dump_file)
+		{
+		  fprintf (
+		    dump_file,
+		    "duplicate YIELD RESUME point (" HOST_WIDE_INT_PRINT_DEC
+		    ") ?\n",
+		    idx);
+		  print_gimple_stmt (dump_file, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
+		}
+	      else
+		res_dest = res_tgt;
+	      tree dst_tgt = TREE_OPERAND (gimple_call_arg (stmt, 3), 0);
+	      tree &dst_dest = destinations.get_or_insert (idx + 1, &existed);
+	      if (existed && dump_file)
+		{
+		  fprintf (
+		    dump_file,
+		    "duplicate YIELD DESTROY point (" HOST_WIDE_INT_PRINT_DEC
+		    ") ?\n",
+		    idx + 1);
+		  print_gimple_stmt (dump_file, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
+		}
+	      else
+		dst_dest = dst_tgt;
+	      to_remove.add (res_tgt);
+	      to_remove.add (dst_tgt);
+	      /* lose the co_yield.  */
+	      gsi_remove (&gsi, true);
+	      stmt = gsi_stmt (gsi); /* next. */
+	      /* lose the copy present at O0.  */
+	      if (is_gimple_assign (stmt))
+		{
+		  gsi_remove (&gsi, true);
+		  stmt = gsi_stmt (gsi);
+		}
+	      /* Simplify the switch or if following.  */
+	      if (gswitch *gsw = dyn_cast<gswitch *> (stmt))
+		{
+		  gimple_switch_set_index (gsw, integer_zero_node);
+		  fold_stmt (&gsi);
+		}
+	      else if (gcond *gif = dyn_cast<gcond *> (stmt))
+		{
+		  if (gimple_cond_code (gif) == EQ_EXPR)
+		    gimple_cond_make_true (gif);
+		  else
+		    gimple_cond_make_false (gif);
+		  fold_stmt (&gsi);
+		}
+	      else if (dump_file)
+		print_gimple_stmt (dump_file, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
+	      if (gsi_end_p (gsi))
+		break;
+	      continue;
+	    }
+	  default:
+	    gsi_next (&gsi);
+	    break;
+	  }
+      }
+
+  if (!changed)
+    {
+      if (dump_file)
+	fprintf (dump_file, "coro: nothing to do\n");
+      return todoflags;
+    }
+
+  while (!actor_worklist.is_empty ())
+    {
+      gsi = actor_worklist.pop ();
+      gimple *stmt = gsi_stmt (gsi);
+      gcc_checking_assert (is_gimple_call (stmt)
+			   && gimple_call_internal_p (stmt)
+			   && gimple_call_internal_fn (stmt) == IFN_CO_ACTOR);
+      bb = gsi_bb (gsi);
+      HOST_WIDE_INT idx = TREE_INT_CST_LOW (gimple_call_arg (stmt, 0));
+      tree *seen = destinations.get (idx);
+      changed = true;
+
+      if (dump_file)
+	fprintf (dump_file, "saw CO_ACTOR in BB %u\n", bb->index);
+
+      if (!seen)
+	{
+	  /* If we never saw this index, it means that the CO_YIELD
+	  associated was elided during earlier optimisations, so we
+	  don't need to fix up the switch targets.  */
+	  if (dump_file)
+	    fprintf (dump_file, "yield point " HOST_WIDE_INT_PRINT_DEC
+		     " not used, removing it .. \n",  idx);
+	  gsi_remove (&gsi, true);
+	  release_defs (stmt);
+	}
+      else
+	{
+	  /* So we need to switch the target of this switch case to the
+	     relevant BB.  */
+	  basic_block new_bb = label_to_block (cfun, *seen);
+	  /* We expect the block we're modifying to contain a single
+	     CO_ACTOR() followed by a goto <switch default bb>.  */
+	  gcc_checking_assert (EDGE_COUNT (bb->succs) == 1);
+	  edge e;
+	  edge_iterator ei;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    {
+	      basic_block old_bb = e->dest;
+	      move_edge_and_update (e, old_bb, new_bb);
+	    }
+	  gsi_remove (&gsi, true);
+	}
+    }
+
+  /* Remove the labels we inserted to map our hidden CFG, this
+     avoids confusing block merges when there are also EH labels.  */
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (glabel *glab = dyn_cast<glabel *> (stmt))
+	  {
+	    tree rem = gimple_label_label (glab);
+	    if (to_remove.contains (rem))
+	      {
+		gsi_remove (&gsi, true);
+		to_remove.remove (rem);
+		continue; /* We already moved to the next insn.  */
+	      }
+	  }
+	else
+	  break;
+	gsi_next (&gsi);
+      }
+
+  /* Changed the CFG.  */
+  todoflags |= TODO_cleanup_cfg;
+  return todoflags;
+}
+
+namespace {
+
+const pass_data pass_data_coroutine_early_expand_ifns = {
+  GIMPLE_PASS,		    /* type */
+  "coro-early-expand-ifns", /* name */
+  OPTGROUP_NONE,	    /* optinfo_flags */
+  TV_NONE,		    /* tv_id */
+  (PROP_cfg),		    /* properties_required */
+  0,			    /* properties_provided */
+  0,			    /* properties_destroyed */
+  0,			    /* todo_flags_start */
+  0			    /* todo_flags_finish, set this in the fn. */
+};
+
+class pass_coroutine_early_expand_ifns : public gimple_opt_pass
+{
+public:
+  pass_coroutine_early_expand_ifns (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_coroutine_early_expand_ifns, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *f)
+    {
+      return flag_coroutines && f->coroutine_component;
+    }
+
+  virtual unsigned int execute (function *f ATTRIBUTE_UNUSED)
+  {
+    return execute_early_expand_coro_ifns ();
+  }
+
+}; // class pass_coroutine_expand_ifns
+
+} // namespace
+
+gimple_opt_pass *
+make_pass_coroutine_early_expand_ifns (gcc::context *ctxt)
+{
+  return new pass_coroutine_early_expand_ifns (ctxt);
+}
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 41c14001fd..6093613bae 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -16802,6 +16802,9 @@  emit_coro_helper (tree helper)
   cp_fold_function (helper);
   DECL_CONTEXT (DECL_RESULT (helper)) = helper;
   BLOCK_SUPERCONTEXT (DECL_INITIAL (helper)) = helper;
+  /* This function has coroutine IFNs that we should handle in middle
+     end lowering.  */
+  cfun->coroutine_component = true;
   cp_genericize (helper);
   expand_or_defer_fn (helper);
 }
@@ -16858,6 +16861,9 @@  finish_function (bool inline_p)
 	  return fndecl;
 	}
 
+      /* We should handle coroutine IFNs in middle end lowering.  */
+      cfun->coroutine_component = true;
+
       if (use_eh_spec_block (fndecl))
 	finish_eh_spec_block (TYPE_RAISES_EXCEPTIONS
 			      (TREE_TYPE (fndecl)),
diff --git a/gcc/function.h b/gcc/function.h
index 496d3f728c..1ee8ed3de5 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -418,6 +418,9 @@  struct GTY(()) function {
   /* Set when the function was compiled with generation of debug
      (begin stmt, inline entry, ...) markers enabled.  */
   unsigned int debug_nonbind_markers : 1;
+
+  /* Set if this is a coroutine-related function.  */
+  unsigned int coroutine_component : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/passes.def b/gcc/passes.def
index b6bd4f3d31..2db4ef4acc 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -39,8 +39,10 @@  along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_lower_tm);
   NEXT_PASS (pass_refactor_eh);
   NEXT_PASS (pass_lower_eh);
+  NEXT_PASS (pass_coroutine_lower_builtins);
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
+  NEXT_PASS (pass_coroutine_early_expand_ifns);
   NEXT_PASS (pass_expand_omp);
   NEXT_PASS (pass_warn_printf);
   NEXT_PASS (pass_walloca, /*strict_mode_p=*/true);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 5ff43572b8..f10f467462 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -478,6 +478,8 @@  extern gimple_opt_pass *make_pass_gen_hsail (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_warn_nonnull_compare (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sprintf_length (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_walloca (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_coroutine_lower_builtins (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_coroutine_early_expand_ifns (gcc::context *ctxt);
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);