Add OpenACC 2.6 `serial' construct support

Message ID 20191107095213.11618-1-frederik@codesourcery.com
State New
Headers show
Series
  • Add OpenACC 2.6 `serial' construct support
Related show

Commit Message

Frederik Harwath Nov. 7, 2019, 9:52 a.m.
Hi,
this patch implements the OpenACC 2.6 "serial" construct.
It has been tested by running the testsuite with nvptx-none
offloading on x86_64-pc-linux-gnu.

Best regards,
Frederik
 
-------------------- 8< -------------------

    The `serial' construct (cf. section 2.5.3 of the OpenACC 2.6 standard)
    is equivalent to a `parallel' construct with clauses `num_gangs(1)
     num_workers(1) vector_length(1)' implied.
    These clauses are therefore not supported with the `serial'
    construct. All the remaining clauses accepted with `parallel' are also
    accepted with `serial'.

    The `serial' construct is implemented like `parallel', except for
    hardcoding dimensions rather than taking them from the relevant
    clauses, in `expand_omp_target'.

    Separate codes are used to denote the `serial' construct throughout the
    middle end, even though the mapping of `serial' to an equivalent
    `parallel' construct could have been done in the individual language
    frontends. In particular, this allows to distinguish between `parallel'
    and `serial' in warnings, error messages, dumps etc.

    2019-11-07  Maciej W. Rozycki  <macro@codesourcery.com>
		Tobias Burnus  <tobias@codesourcery.com>
                Frederik Harwath  <frederik@codesourcery.com>

	gcc/
	* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_SERIAL
	enumeration constant.
	(is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
	(is_gimple_omp_offloaded): Likewise.
	* gimplify.c (omp_region_type): Add ORT_ACC_SERIAL enumeration
	constant.  Adjust the value of ORT_NONE accordingly.
	(is_gimple_stmt): Handle OACC_SERIAL.
	(oacc_default_clause): Handle ORT_ACC_SERIAL.
	(gomp_needs_data_present): Likewise.
	(gimplify_adjust_omp_clauses): Likewise.
	(gimplify_omp_workshare): Handle OACC_SERIAL.
	(gimplify_expr): Likewise.
	* omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Add parameter.
	* omp-expand.c (expand_omp_target):
	Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
	(build_omp_regions_1, omp_make_gimple_edges): Likewise.
	* omp-low.c (is_oacc_parallel): Rename function to...
	(is_oacc_parallel_or_serial): ... this.
	Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
	(scan_sharing_clauses): Adjust accordingly.
	(scan_omp_for): Likewise.
	(lower_oacc_head_mark): Likewise.
	(convert_from_firstprivate_int): Likewise.
	(lower_omp_target): Likewise.
	(check_omp_nesting_restrictions): Handle
	GF_OMP_TARGET_KIND_OACC_SERIAL.
	(lower_oacc_reductions): Likewise.
	(lower_omp_target): Likewise.
	* tree.def (OACC_SERIAL): New tree code.
	* tree-pretty-print.c (dump_generic_node): Handle OACC_SERIAL.

	* doc/generic.texi (OpenACC): Document OACC_SERIAL.

	gcc/c-family/
	* c-pragma.h (pragma_kind): Add PRAGMA_OACC_SERIAL enumeration
	constant.
	* c-pragma.c (oacc_pragmas): Add "serial" entry.

	gcc/c/
	* c-parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
	(c_parser_oacc_kernels_parallel): Rename function to...
	(c_parser_oacc_compute): ... this.  Handle PRAGMA_OACC_SERIAL.
	(c_parser_omp_construct): Update accordingly.

	gcc/cp/
	* constexpr.c (potential_constant_expression_1): Handle
	OACC_SERIAL.
	* parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
	(cp_parser_oacc_kernels_parallel): Rename function to...
	(cp_parser_oacc_compute): ... this.  Handle PRAGMA_OACC_SERIAL.
	(cp_parser_omp_construct): Update accordingly.
	(cp_parser_pragma): Handle PRAGMA_OACC_SERIAL.  Fix alphabetic
	order.
	* pt.c (tsubst_expr): Handle OACC_SERIAL.

	gcc/fortran/
	* gfortran.h (gfc_statement): Add ST_OACC_SERIAL_LOOP,
	ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL and ST_OACC_END_SERIAL
	enumeration constants.
	(gfc_exec_op): Add EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL
	enumeration constants.
	* match.h (gfc_match_oacc_serial): New prototype.
	(gfc_match_oacc_serial_loop): Likewise.
	* dump-parse-tree.c (show_omp_node, show_code_node): Handle
	EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
	* match.c (match_exit_cycle): Handle EXEC_OACC_SERIAL_LOOP.
	* openmp.c (OACC_SERIAL_CLAUSES): New macro.
	(gfc_match_oacc_serial_loop): New function.
	(gfc_match_oacc_serial): Likewise.
	(oacc_is_loop): Handle EXEC_OACC_SERIAL_LOOP.
	(resolve_omp_clauses): Handle EXEC_OACC_SERIAL.
	(oacc_code_to_statement): Handle EXEC_OACC_SERIAL and
	EXEC_OACC_SERIAL_LOOP.
	(gfc_resolve_oacc_directive): Likewise.
	* parse.c (decode_oacc_directive) <'s'>: Add case for "serial"
	and "serial loop".
	(next_statement): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL.
	(gfc_ascii_statement): Likewise.  Handle ST_OACC_END_SERIAL_LOOP
	and ST_OACC_END_SERIAL.
	(parse_oacc_structured_block): Handle ST_OACC_SERIAL.
	(parse_oacc_loop): Handle ST_OACC_SERIAL_LOOP and
	ST_OACC_END_SERIAL_LOOP.
	(parse_executable): Handle ST_OACC_SERIAL_LOOP and
	ST_OACC_SERIAL.
	(is_oacc): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
	* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
	* st.c (gfc_free_statement): Likewise.
	* trans-openmp.c (gfc_trans_oacc_construct): Handle
	EXEC_OACC_SERIAL.
	(gfc_trans_oacc_combined_directive): Handle
	EXEC_OACC_SERIAL_LOOP.
	(gfc_trans_oacc_directive): Handle EXEC_OACC_SERIAL_LOOP and
	EXEC_OACC_SERIAL.
	* trans.c (trans_code): Likewise.

	gcc/testsuite/
	* c-c++-common/goacc/serial-dims.c: New test.
	* gfortran.dg/goacc/serial-dims.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: New test.
	* testsuite/libgomp.oacc-fortran/serial-dims-aux.c: New test.
	* testsuite/libgomp.oacc-fortran/serial-dims.f90: New test.
---
 gcc/c-family/c-pragma.c                       |  1 +
 gcc/c-family/c-pragma.h                       |  1 +
 gcc/c/c-parser.c                              | 34 ++++++-
 gcc/cp/constexpr.c                            |  1 +
 gcc/cp/parser.c                               | 35 ++++++-
 gcc/cp/pt.c                                   |  1 +
 gcc/doc/generic.texi                          |  5 +
 gcc/fortran/dump-parse-tree.c                 |  6 ++
 gcc/fortran/gfortran.h                        | 13 +--
 gcc/fortran/match.c                           |  3 +-
 gcc/fortran/match.h                           |  2 +
 gcc/fortran/openmp.c                          | 35 ++++++-
 gcc/fortran/parse.c                           | 30 +++++-
 gcc/fortran/resolve.c                         |  6 ++
 gcc/fortran/st.c                              |  2 +
 gcc/fortran/trans-openmp.c                    | 13 ++-
 gcc/fortran/trans.c                           |  2 +
 gcc/gimple-pretty-print.c                     |  3 +
 gcc/gimple.h                                  |  3 +
 gcc/gimplify.c                                | 20 +++-
 gcc/omp-expand.c                              | 47 ++++++++--
 gcc/omp-low.c                                 | 33 ++++---
 .../c-c++-common/goacc/serial-dims.c          | 12 +++
 .../gfortran.dg/goacc/serial-dims.f90         | 40 ++++++++
 gcc/tree-pretty-print.c                       |  4 +
 gcc/tree.def                                  |  6 ++
 .../libgomp.oacc-c-c++-common/serial-dims.c   | 92 +++++++++++++++++++
 .../libgomp.oacc-fortran/serial-dims-aux.c    | 41 +++++++++
 .../libgomp.oacc-fortran/serial-dims.f90      | 89 ++++++++++++++++++
 29 files changed, 535 insertions(+), 45 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/serial-dims.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/serial-dims.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90

-- 
2.17.1

Comments

Thomas Schwinge Nov. 11, 2019, 11:54 a.m. | #1
Hi Frederik!

On 2019-11-07T10:52:13+0100, Frederik Harwath <frederik@codesourcery.com> wrote:
> this patch implements the OpenACC 2.6 "serial" construct.


Thanks for taking on that one.

> It has been tested by running the testsuite with nvptx-none

> offloading on x86_64-pc-linux-gnu.


This is OK for trunk with the attached "incremental, into Add OpenACC 2.6
`serial' construct support" merged in.  (No need to re-test; I've just
done that.)  In the incremental patch, I'm streamlining some code,
format/handle 'serial' the same as existing 'parallel', etc., plus a few
more things, see my comments in the patch review below.

To record the review effort, please include "Reviewed-by: Thomas Schwinge
<thomas@codesourcery.com>" in the commit log, see
<https://gcc.gnu.org/wiki/Reviewed-by>.


I'm working on an additional patch to handle 'serial' in more cases where
it's wrong to diverge from 'parallel' (this tells us: a lot of testsuite
coverage is missing...), etc.

Thus I'm adding a lot of testsuite coverage.  I'm not asking you to work
on that, as that's not a feasible task for someone who's still new to all
this, to figure out the appropriate tests that should be
augmented/duplicated for 'serial'.  And, coming up with a list for you to
work though, I suppose would be more time consuming for me instead of
just doing it myself.  ;-)

However, you're of course always encouraged to learn from reading such
patches, and ask questions for any things unclear, of course.


>     The `serial' construct (cf. section 2.5.3 of the OpenACC 2.6 standard)

>     is equivalent to a `parallel' construct with clauses `num_gangs(1)

>      num_workers(1) vector_length(1)' implied.


..., and that's how it -- basically -- is implemented, and thus every
usage of 'serial' gets an annoying 'warning: using vector_length (32),
ignoring 1' for nvptx offloading compilation.  I wonder if we should
sinply disable that nvptx back end warning when an 'oacc serial'
attribute is present?  Or, if we should not, to highlight the issue that
I recently filed <https://github.com/OpenACC/openacc-spec/issues/238>
"OpenACC 'serial' construct might not actually be serial", discovered
during this review process?  (Summary: by GCC have a default of
'vector_length (32)', we do get vector parallelism with 'loop vector', or
'routine vector' inside 'serial' regions -- not clear if that's
intentional, and/or correct.)

>     These clauses are therefore not supported with the `serial'

>     construct. All the remaining clauses accepted with `parallel' are also

>     accepted with `serial'.

>

>     The `serial' construct is implemented like `parallel', except for

>     hardcoding dimensions rather than taking them from the relevant

>     clauses, in `expand_omp_target'.


>     Separate codes are used to denote the `serial' construct throughout the

>     middle end, even though the mapping of `serial' to an equivalent

>     `parallel' construct could have been done in the individual language

>     frontends.


Yeah, I'd pointed this out early on, and I still wonder if early
translating 'serial' into 'parallel num_gangs (1) num_workers (1)
vector_length (1)' (if that's really just what it is) would be better?
Would save quite some effort (duplicate all 'parallel' handling for
'serial').  On the other hand, we'd then need a different mechanism for:

>     In particular, this allows to distinguish between `parallel'

>     and `serial' in warnings, error messages, dumps etc.


... that (or just say "compute construct" instead of 'parallel',
'kernels', 'serial').  But we'll eventually want such a more general
mechnisma anyway; <https://gcc.gnu.org/PR65095> "Adapt OpenMP diagnostic
messages for OpenACC".

So -- we've now got that implementation, and we can still clean it up
later on.


> 	* omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Add parameter.


Not anymore.


>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/serial-dims.f90


>  create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c

>  create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90


Thanks to you (and/or Tobias, I suppose) for adding some Fortran
testsuite coversage, because:

> --- a/gcc/fortran/parse.c

> +++ b/gcc/fortran/parse.c

> @@ -683,6 +683,9 @@ decode_oacc_directive (void)

>        matcha ("end parallel loop", gfc_match_omp_eos_error,

>  	      ST_OACC_END_PARALLEL_LOOP);

>        matcha ("end parallel", gfc_match_omp_eos_error, ST_OACC_END_PARALLEL);

> +      matcha ("end serial loop", gfc_match_omp_eos_error,

> +	      ST_OACC_END_SERIAL_LOOP);

> +      matcha ("end serial", gfc_match_omp_eos_error, ST_OACC_END_SERIAL);

>        matcha ("enter data", gfc_match_oacc_enter_data, ST_OACC_ENTER_DATA);

>        matcha ("exit data", gfc_match_oacc_exit_data, ST_OACC_EXIT_DATA);

>        break;


Wow, wow.  I see this has not been present in the og8 and og9 commits of
the OpenACC 'serial' changes.  This tells us: the OpenACC 'serial'
construct has *not at all* been tested with Fortran; any compilation
attempt would've stopped early in the front end:

       25 |   !$acc end serial loop
          |         1
    Error: Unclassifiable OpenACC directive at (1)

       28 |   !$acc end serial
          |         1
    Error: Unclassifiable OpenACC directive at (1)

Thanks for fixing that.


> --- a/gcc/gimple.h

> +++ b/gcc/gimple.h


> @@ -182,6 +182,7 @@ enum gf_mask {

>      GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,

>      GF_OMP_TARGET_KIND_OACC_DECLARE = 10,

>      GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,

> +    GF_OMP_TARGET_KIND_OACC_SERIAL = 12,


That's not wrong, but I've still moved 'GF_OMP_TARGET_KIND_OACC_SERIAL'
next to/after the existing 'GF_OMP_TARGET_KIND_OACC_PARALLEL',
'GF_OMP_TARGET_KIND_OACC_KERNELS' (it's OK to renumber 'enum gf_mask'
items), so that there's (at least some) consistency in the the
'parallel', 'kernels', 'serial' ordering (which is the order they appear
in the current specification), that we shall use unless alphabetical
ordering is used.


> --- a/gcc/omp-low.c

> +++ b/gcc/omp-low.c


> @@ -7518,7 +7526,7 @@ lower_oacc_head_mark (location_t loc, tree ddvar, tree clauses,

>  

>    /* In a parallel region, loops are implicitly INDEPENDENT.  */

>    omp_context *tgt = enclosing_target_ctx (ctx);

> -  if (!tgt || is_oacc_parallel (tgt))

> +  if (!tgt || is_oacc_parallel_or_serial (tgt))

>      tag |= OLF_INDEPENDENT;


I would agree, but from a (very) quick look, I don't think the OpenACC
specification actually says anything on that topic.  Something I'll get
that clarified.


> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/goacc/serial-dims.c

> @@ -0,0 +1,12 @@

> +/* Invalid use of OpenACC parallelism dimensions clauses: num_gangs,

> +   num_workers, vector_length with the serial construct.  */

> +

> +void f(void)

> +{

> +#pragma acc serial num_gangs (1) /* { dg-error "'num_gangs' is not valid for '#pragma acc serial'" } */

> +  ;

> +#pragma acc serial num_workers (1) /* { dg-error "'num_workers' is not valid for '#pragma acc serial'" } */

> +  ;

> +#pragma acc serial vector_length (1) /* { dg-error "'vector_length' is not valid for '#pragma acc serial'" } */

> +  ;

> +}


I've merged that into the existing 'c-c++-common/goacc/parallel-dims-2.c'.

> --- /dev/null

> +++ b/gcc/testsuite/gfortran.dg/goacc/serial-dims.f90

> @@ -0,0 +1,40 @@

> +! Invalid use of OpenACC parallelism dimensions clauses: num_gangs,

> +! num_workers, vector_length with the serial construct.

> +

> +subroutine s()

> +  integer :: i

> +  !$acc parallel

> +  !$acc end parallel

> +

> +  !$acc parallel loop

> +  do i = 1, 5

> +  end do

> +

> +  !$acc parallel loop

> +  do i = 1, 5

> +  end do

> +  !$acc end parallel loop

> +

> +  !$acc serial loop

> +  do i = 1, 5

> +  end do

> +

> +  !$acc serial loop

> +  do i = 1, 5

> +  end do

> +  !$acc end serial loop

> +

> +  !$acc serial

> +  !$acc end serial

> +end subroutine s

> +

> +subroutine f()

> +!$acc serial num_gangs (1)  ! { dg-error "Failed to match clause at" }

> +!$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }

> +

> +!$acc serial num_workers (1)  ! { dg-error "Failed to match clause at" }

> +!$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }

> +

> +!$acc serial vector_length (1)  ! { dg-error "Failed to match clause at" }

> +!$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }

> +end subroutine f


Similarly, for symmetry, moved into (new)
'gfortran.dg/goacc/parallel-dims-2.f90'.


> --- /dev/null

> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c

> @@ -0,0 +1,92 @@

> +/* OpenACC dimensions with the serial construct.  */


This I've merged into the existing
'libgomp.oacc-c-c++-common/parallel-dims.c', instead of duplicating
infrastructure here, and doing some things slightly differently (possibly
due to incorrect divergence between 'serial' and 'parallel' handling, as
I mentioned above, which I shall soon fix).

> +  /* Serial OpenACC constructs must get launched as 1 x 1 x 1.  */

> +  {

> +    int gangs_min, gangs_max;

> +    int workers_min, workers_max;

> +    int vectors_min, vectors_max;

> +    int gangs_actual, workers_actual, vectors_actual;

> +    int i, j, k;

> +

> +    gangs_min = workers_min = vectors_min = INT_MAX;

> +    gangs_max = workers_max = vectors_max = INT_MIN;

> +    gangs_actual = workers_actual = vectors_actual = 1;

> +#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */

> +    {

> +      if (acc_on_device (acc_device_nvidia))

> +	{

> +	  /* The GCC nvptx back end enforces vector_length (32).  */

> +	  vectors_actual = 32;


So, that's actually a good question, whether that is permissible --
that's <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC
'serial' construct might not actually be serial", as mentioned above.

> +	}

> +      else if (!acc_on_device (acc_device_host))

> +	__builtin_abort ();

> +#pragma acc loop gang \

> +  reduction (min: gangs_min, workers_min, vectors_min) \

> +  reduction (max: gangs_max, workers_max, vectors_max)

> +      for (i = 100 * gangs_actual; i > -100 * gangs_actual; i--)

> +#pragma acc loop worker \

> +  reduction (min: gangs_min, workers_min, vectors_min) \

> +  reduction (max: gangs_max, workers_max, vectors_max)

> +	for (j = 100 * workers_actual; j > -100 * workers_actual; j--)

> +#pragma acc loop vector \

> +  reduction (min: gangs_min, workers_min, vectors_min) \

> +  reduction (max: gangs_max, workers_max, vectors_max)

> +	  for (k = 100 * vectors_actual; k > -100 * vectors_actual; k--)

> +	    {

> +	      gangs_min = gangs_max = acc_gang ();

> +	      workers_min = workers_max = acc_worker ();

> +	      vectors_min = vectors_max = acc_vector ();

> +	    }

> +      if (gangs_min != 0 || gangs_max != gangs_actual - 1

> +	  || workers_min != 0 || workers_max != workers_actual - 1

> +	  || vectors_min != 0 || vectors_max != vectors_actual - 1)

> +	__builtin_abort ();

> +    }

> +  }


Per the OpenACC 'loop' directives specified here, that's testing
gang-partitioned, worker-partitioned, vector-partitioned execution mode.

We should also test gang-redundant, worker-single, vector-single
execution mode, which I've added.


The Fortran counter part (thanks for creating that!), I suppose, had not
yet been tested?

> --- /dev/null

> +++ b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c

> @@ -0,0 +1,41 @@

> +/* OpenACC dimensions with the serial construct.  */

> +/* Used by serial-dims.f90.  */


I indicate where this has been copied from.

(Generally, getting rid of these wrapper functions is for another day.)

> +#include <limits.h>

> +#include <openacc.h>

> +#include <gomp-constants.h>

> +

> +/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper

> +   not behaving as expected for -O0.  */

> +#pragma acc routine seq

> +static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()

> +{

> +  if (acc_on_device ((int) acc_device_host))

> +    return 0;

> +  else if (acc_on_device ((int) acc_device_nvidia))

> +    return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);

> +  else

> +    __builtin_abort ();

> +}

> +

> +#pragma acc routine seq

> +static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()

> +{

> +  if (acc_on_device ((int) acc_device_host))

> +    return 0;

> +  else if (acc_on_device ((int) acc_device_nvidia))

> +    return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);

> +  else

> +    __builtin_abort ();

> +}

> +

> +#pragma acc routine seq

> +static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()

> +{

> +  if (acc_on_device ((int) acc_device_host))

> +    return 0;

> +  else if (acc_on_device ((int) acc_device_nvidia))

> +    return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);

> +  else

> +    __builtin_abort ();

> +}


Compilation of 'libgomp.oacc-fortran/serial-dims.f90' fails:

    serial-dims.f90:(.text+0x124): undefined reference to `acc_gang'
    serial-dims.f90:(.text+0x130): undefined reference to `acc_gang'
    serial-dims.f90:(.text+0x13c): undefined reference to `acc_worker'
    serial-dims.f90:(.text+0x148): undefined reference to `acc_worker'
    serial-dims.f90:(.text+0x154): undefined reference to `acc_vector'
    serial-dims.f90:(.text+0x160): undefined reference to `acc_vector'

Have to remove 'static' from 'acc_gang', 'acc_worker', 'acc_vector'.

> --- /dev/null

> +++ b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90

> @@ -0,0 +1,89 @@

> +! OpenACC dimensions with the serial construct.


This needs '{ dg-do run }' for torture testing.

> +

> +! { dg-additional-sources serial-dims-aux.c }

> +! { dg-warning "command line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }


We get:

    FAIL: libgomp.oacc-fortran/serial-dims.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O   (test for warnings, line 4)
    FAIL: libgomp.oacc-fortran/serial-dims.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O  (test for excess errors)

..., with:

    Excess errors:
    cc1: warning: command-line option '-fintrinsic-modules-path=[...]' is valid for Fortran but not for C

That's because that diagnostic doesn't appear on the line where the
'dg-warning' directive is present (line 4).  I changed that to
'dg-prune-output', but I wonder if there's a better way, so that we can
specify to expect/match a diagnostic without line number information -- I
can't remember whether such a thing exists.

However, that still fails: "command[-]line option" typo.  ;-)

> +module acc_routines

> +  implicit none (type, external)

> +

> +  interface

> +    integer function acc_gang() bind(C)

> +      !$acc routine seq

> +    end function acc_gang

> +

> +    integer function acc_worker() bind(C)

> +      !$acc routine seq

> +    end function acc_worker

> +

> +    integer function acc_vector() bind(C)

> +      !$acc routine seq

> +    end function acc_vector

> +  end interface

> +end module acc_routines


With '-Wall', we're told:

       14 |     integer function acc_gang() bind(C)
          |                             1
    Warning: Variable 'acc_gang' at (1) may not be a C interoperable kind but it is BIND(C) [-Wc-binding-type]

       22 |     integer function acc_vector() bind(C)
          |                               1
    Warning: Variable 'acc_vector' at (1) may not be a C interoperable kind but it is BIND(C) [-Wc-binding-type]

       18 |     integer function acc_worker() bind(C)
          |                               1
    Warning: Variable 'acc_worker' at (1) may not be a C interoperable kind but it is BIND(C) [-Wc-binding-type]

I have not yet looked into that.

> +program main

> +  use iso_c_binding

> +  use openacc

> +  use acc_routines

> +  implicit none (type, external)

> +

> +  integer :: gangs_min, gangs_max

> +  integer :: workers_min, workers_max

> +  integer :: vectors_min, vectors_max

> +  integer :: gangs_actual, workers_actual, vectors_actual

> +  integer :: i, j, k

> +

> +  call acc_init (acc_device_default)

> +

> +  ! Serial OpenACC constructs must get launched as 1 x 1 x 1.

> +  gangs_min = huge(gangs_min)

> +  workers_min = huge(workers_min)

> +  vectors_min = huge(vectors_min)

> +  gangs_max = -huge(gangs_max) - 1  ! INT_MIN

> +  workers_max = -huge(gangs_max) - 1

> +  vectors_max = -huge(gangs_max) - 1


Indeed the C/C++ initializes '*_min' variables with 'INT_MAX', and
'*_max' variables with 'INT_MIN'.  Is the above the generic Fortran
counter part for that?

> +  gangs_actual = 1

> +  workers_actual = 1

> +  vectors_actual = 1

> +

> +  !$acc serial ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }

> +    if (acc_on_device (acc_device_nvidia)) then

> +      ! The GCC nvptx back end enforces vector_length (32).

> +      vectors_actual = 32

> +    elseif (acc_on_device (acc_device_gcn)) then

> +      ! AMD GCN relies on the autovectorizer for the vector dimension:

> +      ! the loop below isn't likely to be vectorized, so vectors_actual

> +      ! is effectively 1.

> +      vectors_actual = 1


We're told:

    [...]/libgomp.oacc-fortran/serial-dims.f90:53:41: Error: Symbol 'acc_device_gcn' at (1) has no IMPLICIT type; did you mean 'acc_device_kind'?

AMD GCN offloading support doesn't exist on trunk yet, so removed that
here.

> +    elseif (.not. acc_on_device (acc_device_host)) then

> +      stop 1

> +    end if

> +

> +!$acc loop gang &

> +!$acc & reduction (min: gangs_min, workers_min, vectors_min) &

> +!$acc & reduction (max: gangs_max, workers_max, vectors_max)

> +    do i = 100 * gangs_actual, -99 * gangs_actual, -1

> +!$acc loop worker &

> +!$acc & reduction (min: gangs_min, workers_min, vectors_min) &

> +!$acc & reduction (max: gangs_max, workers_max, vectors_max)

> +      do j = 100 * workers_actual, -99 * workers_actual, -1

> +!$acc loop vector &

> +!$acc & reduction (min: gangs_min, workers_min, vectors_min) &

> +!$acc & reduction (max: gangs_max, workers_max, vectors_max)

> +        do k = 100 * vectors_actual, -99 * vectors_actual, -1

> +          gangs_min = acc_gang ();

> +          gangs_max = acc_gang ();

> +          workers_min = acc_worker ();

> +          workers_max = acc_worker ();

> +          vectors_min = acc_vector ();

> +          vectors_max = acc_vector ();

> +       end do

> +     end do

> +   end do

> +  if (gangs_min /= 0 .or. gangs_max /= gangs_actual - 1 &

> +      .or. workers_min /= 0 .or. workers_max /= workers_actual - 1 &

> +      .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) &

> +    stop 2

> +!$acc end serial

> +

> +end program main



Grüße
 Thomas
From 788b2ec11009e4c36b28834914cb251134c3b761 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>

Date: Sun, 10 Nov 2019 22:33:43 +0100
Subject: [PATCH] incremental, into Add OpenACC 2.6 `serial' construct support

---
 gcc/fortran/match.h                           |   2 +-
 gcc/fortran/openmp.c                          |  16 +--
 gcc/fortran/parse.c                           |   2 +-
 gcc/fortran/trans-openmp.c                    |   8 +-
 gcc/gimple.def                                |   2 +-
 gcc/gimple.h                                  |  12 +-
 gcc/gimplify.c                                |   4 +-
 gcc/omp-expand.c                              |   8 +-
 gcc/omp-low.c                                 |   7 +-
 .../c-c++-common/goacc/parallel-dims-2.c      |  16 ++-
 .../c-c++-common/goacc/serial-dims.c          |  12 --
 .../{serial-dims.f90 => parallel-dims-2.f90}  |  34 ++---
 gcc/tree.h                                    |   3 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c |  73 +++++++++++
 .../libgomp.oacc-c-c++-common/serial-dims.c   |  92 --------------
 ...{serial-dims-aux.c => parallel-dims-aux.c} |  14 +-
 .../libgomp.oacc-fortran/parallel-dims.f90    | 120 ++++++++++++++++++
 .../libgomp.oacc-fortran/serial-dims.f90      |  89 -------------
 18 files changed, 257 insertions(+), 257 deletions(-)
 delete mode 100644 gcc/testsuite/c-c++-common/goacc/serial-dims.c
 rename gcc/testsuite/gfortran.dg/goacc/{serial-dims.f90 => parallel-dims-2.f90} (50%)
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c
 rename libgomp/testsuite/libgomp.oacc-fortran/{serial-dims-aux.c => parallel-dims-aux.c} (67%)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
 delete mode 100644 libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90

diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h
index 954af72f0e07..7f3d356cbe49 100644
--- a/gcc/fortran/match.h
+++ b/gcc/fortran/match.h
@@ -146,9 +146,9 @@ match gfc_match_oacc_kernels (void);
 match gfc_match_oacc_kernels_loop (void);
 match gfc_match_oacc_parallel (void);
 match gfc_match_oacc_parallel_loop (void);
-match gfc_match_oacc_enter_data (void);
 match gfc_match_oacc_serial (void);
 match gfc_match_oacc_serial_loop (void);
+match gfc_match_oacc_enter_data (void);
 match gfc_match_oacc_exit_data (void);
 match gfc_match_oacc_routine (void);
 
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 198facce636d..dc0521b40f0b 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1965,14 +1965,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT	      \
    | OMP_CLAUSE_WAIT)
 #define OACC_SERIAL_CLAUSES \
-  (omp_mask (OMP_CLAUSE_ASYNC) | OMP_CLAUSE_WAIT			      \
-   | OMP_CLAUSE_IF							      \
-   | OMP_CLAUSE_REDUCTION						      \
+  (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_REDUCTION	      \
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
-   | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT				      \
-   | OMP_CLAUSE_DEVICEPTR						      \
-   | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE			      \
-   | OMP_CLAUSE_DEFAULT)
+   | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR	      \
+   | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT	      \
+   | OMP_CLAUSE_WAIT)
 #define OACC_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR  | OMP_CLAUSE_COPY	      \
    | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE		      \
@@ -1986,6 +1983,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
   (OACC_LOOP_CLAUSES | OACC_PARALLEL_CLAUSES)
 #define OACC_KERNELS_LOOP_CLAUSES \
   (OACC_LOOP_CLAUSES | OACC_KERNELS_CLAUSES)
+#define OACC_SERIAL_LOOP_CLAUSES \
+  (OACC_LOOP_CLAUSES | OACC_SERIAL_CLAUSES)
 #define OACC_HOST_DATA_CLAUSES omp_mask (OMP_CLAUSE_USE_DEVICE)
 #define OACC_DECLARE_CLAUSES \
   (omp_mask (OMP_CLAUSE_COPY) | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT	      \
@@ -2050,8 +2049,7 @@ gfc_match_oacc_kernels (void)
 match
 gfc_match_oacc_serial_loop (void)
 {
-  return match_acc (EXEC_OACC_SERIAL_LOOP,
-		    OACC_SERIAL_CLAUSES | OACC_LOOP_CLAUSES);
+  return match_acc (EXEC_OACC_SERIAL_LOOP, OACC_SERIAL_LOOP_CLAUSES);
 }
 
 
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 1a38606682ca..e44cc6971983 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -5119,7 +5119,7 @@ parse_oacc_structured_block (gfc_statement acc_st)
   pop_state ();
 }
 
-/* Parse the statements of OpenACC loop/parallel loop/kernels loop.  */
+/* Parse the statements of OpenACC 'loop', or combined compute 'loop'.  */
 
 static gfc_statement
 parse_oacc_loop (gfc_statement acc_st)
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 573b55b066f3..d9dfcabc65ef 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -3193,8 +3193,9 @@ gfc_trans_omp_code (gfc_code *code, bool force_empty)
   return stmt;
 }
 
-/* Trans OpenACC directives. */
-/* parallel, serial, kernels, data and host_data. */
+/* Translate OpenACC 'parallel', 'kernels', 'serial', 'data', 'host_data'
+   construct. */
+
 static tree
 gfc_trans_oacc_construct (gfc_code *code)
 {
@@ -4020,7 +4021,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
   return gfc_finish_block (&block);
 }
 
-/* Combined OpenACC parallel loop, kernels loop and serial loop. */
+/* Translate combined OpenACC 'parallel loop', 'kernels loop', 'serial loop'
+   construct. */
 
 static tree
 gfc_trans_oacc_combined_directive (gfc_code *code)
diff --git a/gcc/gimple.def b/gcc/gimple.def
index dd64419e8eb6..38c11f41156d 100644
--- a/gcc/gimple.def
+++ b/gcc/gimple.def
@@ -359,7 +359,7 @@ DEFGSCODE(GIMPLE_OMP_SECTIONS_SWITCH, "gimple_omp_sections_switch", GSS_BASE)
 DEFGSCODE(GIMPLE_OMP_SINGLE, "gimple_omp_single", GSS_OMP_SINGLE_LAYOUT)
 
 /* GIMPLE_OMP_TARGET <BODY, CLAUSES, CHILD_FN> represents
-   #pragma acc {kernels,parallel,data,enter data,exit data,update}
+   #pragma acc {kernels,parallel,serial,data,enter data,exit data,update}
    #pragma omp target {,data,update}
    BODY is the sequence of statements inside the construct
    (NULL for some variants).
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 83a449be3643..5a190b1714dc 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -177,12 +177,12 @@ enum gf_mask {
     GF_OMP_TARGET_KIND_EXIT_DATA = 4,
     GF_OMP_TARGET_KIND_OACC_PARALLEL = 5,
     GF_OMP_TARGET_KIND_OACC_KERNELS = 6,
-    GF_OMP_TARGET_KIND_OACC_DATA = 7,
-    GF_OMP_TARGET_KIND_OACC_UPDATE = 8,
-    GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,
-    GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
-    GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
-    GF_OMP_TARGET_KIND_OACC_SERIAL = 12,
+    GF_OMP_TARGET_KIND_OACC_SERIAL = 7,
+    GF_OMP_TARGET_KIND_OACC_DATA = 8,
+    GF_OMP_TARGET_KIND_OACC_UPDATE = 9,
+    GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 10,
+    GF_OMP_TARGET_KIND_OACC_DECLARE = 11,
+    GF_OMP_TARGET_KIND_OACC_HOST_DATA = 12,
     GF_OMP_TEAMS_GRID_PHONY	= 1 << 0,
     GF_OMP_TEAMS_HOST		= 1 << 1,
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 31429d5ac3ba..87a640545141 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -161,7 +161,7 @@ enum omp_region_type
   ORT_ACC_DATA	= ORT_ACC | ORT_TARGET_DATA, /* Data construct.  */
   ORT_ACC_PARALLEL = ORT_ACC | ORT_TARGET,  /* Parallel construct */
   ORT_ACC_KERNELS  = ORT_ACC | ORT_TARGET | 2,  /* Kernels construct.  */
-  ORT_ACC_SERIAL = ORT_ACC | ORT_TARGET | 4,  /* Serial construct.  */
+  ORT_ACC_SERIAL   = ORT_ACC | ORT_TARGET | 4,  /* Serial construct.  */
   ORT_ACC_HOST_DATA = ORT_ACC | ORT_TARGET_DATA | 2,  /* Host data.  */
 
   /* Dummy OpenMP region, used to disable expansion of
@@ -10101,7 +10101,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	      break;
 	    }
 	  decl = OMP_CLAUSE_DECL (c);
-	  /* Data clauses associated with acc parallel reductions must be
+	  /* Data clauses associated with reductions must be
 	     compatible with present_or_copy.  Warn and adjust the clause
 	     if that is not the case.  */
 	  if (ctx->region_type == ORT_ACC_PARALLEL
diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index d242f4e1ae99..6f945011cf5a 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -7914,8 +7914,8 @@ expand_omp_target (struct omp_region *region)
     case GF_OMP_TARGET_KIND_UPDATE:
     case GF_OMP_TARGET_KIND_ENTER_DATA:
     case GF_OMP_TARGET_KIND_EXIT_DATA:
-    case GF_OMP_TARGET_KIND_OACC_KERNELS:
     case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+    case GF_OMP_TARGET_KIND_OACC_KERNELS:
     case GF_OMP_TARGET_KIND_OACC_SERIAL:
     case GF_OMP_TARGET_KIND_OACC_UPDATE:
     case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA:
@@ -8171,8 +8171,8 @@ expand_omp_target (struct omp_region *region)
       start_ix = BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA;
       flags_i |= GOMP_TARGET_FLAG_EXIT_DATA;
       break;
-    case GF_OMP_TARGET_KIND_OACC_KERNELS:
     case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+    case GF_OMP_TARGET_KIND_OACC_KERNELS:
     case GF_OMP_TARGET_KIND_OACC_SERIAL:
       start_ix = BUILT_IN_GOACC_PARALLEL;
       break;
@@ -8938,8 +8938,8 @@ build_omp_regions_1 (basic_block bb, struct omp_region *parent,
 		{
 		case GF_OMP_TARGET_KIND_REGION:
 		case GF_OMP_TARGET_KIND_DATA:
-		case GF_OMP_TARGET_KIND_OACC_KERNELS:
 		case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+		case GF_OMP_TARGET_KIND_OACC_KERNELS:
 		case GF_OMP_TARGET_KIND_OACC_SERIAL:
 		case GF_OMP_TARGET_KIND_OACC_DATA:
 		case GF_OMP_TARGET_KIND_OACC_HOST_DATA:
@@ -9193,8 +9193,8 @@ omp_make_gimple_edges (basic_block bb, struct omp_region **region,
 	{
 	case GF_OMP_TARGET_KIND_REGION:
 	case GF_OMP_TARGET_KIND_DATA:
-	case GF_OMP_TARGET_KIND_OACC_KERNELS:
 	case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+	case GF_OMP_TARGET_KIND_OACC_KERNELS:
 	case GF_OMP_TARGET_KIND_OACC_SERIAL:
 	case GF_OMP_TARGET_KIND_OACC_DATA:
 	case GF_OMP_TARGET_KIND_OACC_HOST_DATA:
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index abf63334ca05..781e7cbf27a2 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -185,7 +185,8 @@ static tree scan_omp_1_op (tree *, int *, void *);
       *handled_ops_p = false; \
       break;
 
-/* Return true if CTX corresponds to an oacc parallel or serial region.  */
+/* Return true if CTX corresponds to an OpenACC 'parallel' or 'serial'
+   region.  */
 
 static bool
 is_oacc_parallel_or_serial (omp_context *ctx)
@@ -2419,7 +2420,7 @@ scan_omp_for (gomp_for *stmt, omp_context *outer_ctx)
 	    if (check && OMP_CLAUSE_OPERAND (c, 0))
 	      error_at (gimple_location (stmt),
 			"argument not permitted on %qs clause in"
-			" OpenACC %<parallel%>", check);
+			" OpenACC %<parallel%> or %<serial%>", check);
 	  }
 
       if (tgt && is_oacc_kernels (tgt))
@@ -11498,7 +11499,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	if (!maybe_lookup_field (var, ctx))
 	  continue;
 
-	/* Don't remap oacc parallel reduction variables, because the
+	/* Don't remap compute constructs' reduction variables, because the
 	   intermediate result must be local to each gang.  */
 	if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 			   && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
diff --git a/gcc/testsuite/c-c++-common/goacc/parallel-dims-2.c b/gcc/testsuite/c-c++-common/goacc/parallel-dims-2.c
index acfbe7ff031a..31c4ee349f2c 100644
--- a/gcc/testsuite/c-c++-common/goacc/parallel-dims-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/parallel-dims-2.c
@@ -1,5 +1,7 @@
-/* Invalid use of OpenACC parallelism dimensions clauses: num_gangs,
-   num_workers, vector_length.  */
+/* Invalid use of OpenACC parallelism dimensions clauses: 'num_gangs',
+   'num_workers', 'vector_length'.  */
+
+/* See also '../../gfortran.dg/goacc/parallel-dims-2.f90'.  */
 
 void f(int i, float f)
 {
@@ -255,4 +257,14 @@ void f(int i, float f)
   vector_length(&f) /* { dg-error "'vector_length' expression must be integral" } */ \
   num_gangs( /* { dg-error "expected (primary-|)expression before end of line" "TODO" { xfail c } } */
   ;
+
+
+  /* The 'serial' construct doesn't allow these at all.  */
+
+#pragma acc serial num_gangs (1) /* { dg-error "'num_gangs' is not valid for '#pragma acc serial'" } */
+  ;
+#pragma acc serial num_workers (1) /* { dg-error "'num_workers' is not valid for '#pragma acc serial'" } */
+  ;
+#pragma acc serial vector_length (1) /* { dg-error "'vector_length' is not valid for '#pragma acc serial'" } */
+  ;
 }
diff --git a/gcc/testsuite/c-c++-common/goacc/serial-dims.c b/gcc/testsuite/c-c++-common/goacc/serial-dims.c
deleted file mode 100644
index 41698d279c98..000000000000
--- a/gcc/testsuite/c-c++-common/goacc/serial-dims.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* Invalid use of OpenACC parallelism dimensions clauses: num_gangs,
-   num_workers, vector_length with the serial construct.  */
-
-void f(void)
-{
-#pragma acc serial num_gangs (1) /* { dg-error "'num_gangs' is not valid for '#pragma acc serial'" } */
-  ;
-#pragma acc serial num_workers (1) /* { dg-error "'num_workers' is not valid for '#pragma acc serial'" } */
-  ;
-#pragma acc serial vector_length (1) /* { dg-error "'vector_length' is not valid for '#pragma acc serial'" } */
-  ;
-}
diff --git a/gcc/testsuite/gfortran.dg/goacc/serial-dims.f90 b/gcc/testsuite/gfortran.dg/goacc/parallel-dims-2.f90
similarity index 50%
rename from gcc/testsuite/gfortran.dg/goacc/serial-dims.f90
rename to gcc/testsuite/gfortran.dg/goacc/parallel-dims-2.f90
index 72b4a8361776..91a5c300a94c 100644
--- a/gcc/testsuite/gfortran.dg/goacc/serial-dims.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/parallel-dims-2.f90
@@ -1,34 +1,15 @@
-! Invalid use of OpenACC parallelism dimensions clauses: num_gangs,
-! num_workers, vector_length with the serial construct.
+! Invalid use of OpenACC parallelism dimensions clauses: 'num_gangs',
+! 'num_workers', 'vector_length'.
 
-subroutine s()
-  integer :: i
-  !$acc parallel
-  !$acc end parallel
+! See also '../../c-c++-common/goacc/parallel-dims-2.c'.
 
-  !$acc parallel loop
-  do i = 1, 5
-  end do
-
-  !$acc parallel loop
-  do i = 1, 5
-  end do
-  !$acc end parallel loop
-
-  !$acc serial loop
-  do i = 1, 5
-  end do
+subroutine f()
+  !TODO 'kernels', 'parallel' testing per '../../c-c++-common/goacc/parallel-dims-2.c'.
+  !TODO This should incorporate some of the testing done in 'sie.f95'.
 
-  !$acc serial loop
-  do i = 1, 5
-  end do
-  !$acc end serial loop
 
-  !$acc serial
-  !$acc end serial
-end subroutine s
+  ! The 'serial' construct doesn't allow these at all.
 
-subroutine f()
 !$acc serial num_gangs (1)  ! { dg-error "Failed to match clause at" }
 !$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }
 
@@ -37,4 +18,5 @@ subroutine f()
 
 !$acc serial vector_length (1)  ! { dg-error "Failed to match clause at" }
 !$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }
+
 end subroutine f
diff --git a/gcc/tree.h b/gcc/tree.h
index a7d39c3a74df..4bec90d9a729 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1622,7 +1622,8 @@ class auto_suppress_location_wrappers
    treatment if OMP_CLAUSE_SIZE is zero.  */
 #define OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION(NODE) \
   TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
-/* Nonzero if this map clause is for an ACC parallel reduction variable.  */
+/* Nonzero if this map clause is for an OpenACC compute construct's reduction
+   variable.  */
 #define OMP_CLAUSE_MAP_IN_REDUCTION(NODE) \
   TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index 7e699f476b21..a5edfc6ca164 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -1,6 +1,8 @@
 /* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
    vector_length.  */
 
+/* See also '../libgomp.oacc-fortran/parallel-dims.f90'.  */
+
 #include <limits.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -45,6 +47,8 @@ int main ()
 {
   acc_init (acc_device_default);
 
+  /* OpenACC parallel construct.  */
+
   /* Non-positive value.  */
 
   /* GR, WS, VS.  */
@@ -478,6 +482,8 @@ int main ()
   }
 
 
+  /* OpenACC kernels construct.  */
+
   /* We can't test parallelized OpenACC kernels constructs in this way: use of
      the acc_gang, acc_worker, acc_vector functions will make the construct
      unparallelizable.  */
@@ -544,5 +550,72 @@ int main ()
   }
 
 
+  /* OpenACC serial construct.  */
+
+  /* GR, WS, VS.  */
+  {
+    int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+    gangs_min = workers_min = vectors_min = INT_MAX;
+    gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+    {
+      for (int i = 100; i > -100; i--)
+	{
+	  gangs_min = gangs_max = acc_gang ();
+	  workers_min = workers_max = acc_worker ();
+	  vectors_min = vectors_max = acc_vector ();
+	}
+    }
+    if (gangs_min != 0 || gangs_max != 1 - 1
+	|| workers_min != 0 || workers_max != 1 - 1
+	|| vectors_min != 0 || vectors_max != 1 - 1)
+      __builtin_abort ();
+  }
+
+  /* Composition of GP, WP, VP.  */
+  {
+    int vectors_actual = 1;  /* Implicit 'vector_length (1)' clause.  */
+    int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+    gangs_min = workers_min = vectors_min = INT_MAX;
+    gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc serial copy (vectors_actual) /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+  copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max)
+    {
+      if (acc_on_device (acc_device_nvidia))
+	{
+	  /* The GCC nvptx back end enforces vector_length (32).  */
+	  /* It's unclear if that's actually permissible here;
+	     <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC
+	     'serial' construct might not actually be serial".  */
+	  vectors_actual = 32;
+	}
+#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      for (int i = 100; i > -100; i--)
+#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	for (int j = 100; j > -100; j--)
+#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	  for (int k = 100 * vectors_actual; k > -100 * vectors_actual; k--)
+	    {
+	      gangs_min = gangs_max = acc_gang ();
+	      workers_min = workers_max = acc_worker ();
+	      vectors_min = vectors_max = acc_vector ();
+	    }
+    }
+    if (acc_get_device_type () == acc_device_nvidia)
+      {
+	if (vectors_actual != 32)
+	  __builtin_abort ();
+      }
+    else
+      if (vectors_actual != 1)
+	__builtin_abort ();
+    if (gangs_min != 0 || gangs_max != 1 - 1
+	|| workers_min != 0 || workers_max != 1 - 1
+	|| vectors_min != 0 || vectors_max != vectors_actual - 1)
+      __builtin_abort ();
+  }
+
+
   return 0;
 }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c
deleted file mode 100644
index bb91c9221f89..000000000000
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c
+++ /dev/null
@@ -1,92 +0,0 @@
-/* OpenACC dimensions with the serial construct.  */
-
-#include <limits.h>
-#include <openacc.h>
-#include <gomp-constants.h>
-
-/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
-   not behaving as expected for -O0.  */
-#pragma acc routine seq
-static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
-{
-  if (acc_on_device ((int) acc_device_host))
-    return 0;
-  else if (acc_on_device ((int) acc_device_nvidia))
-    return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
-  else
-    __builtin_abort ();
-}
-
-#pragma acc routine seq
-static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
-{
-  if (acc_on_device ((int) acc_device_host))
-    return 0;
-  else if (acc_on_device ((int) acc_device_nvidia))
-    return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
-  else
-    __builtin_abort ();
-}
-
-#pragma acc routine seq
-static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
-{
-  if (acc_on_device ((int) acc_device_host))
-    return 0;
-  else if (acc_on_device ((int) acc_device_nvidia))
-    return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
-  else
-    __builtin_abort ();
-}
-
-
-int main ()
-{
-  acc_init (acc_device_default);
-
-  /* Serial OpenACC constructs must get launched as 1 x 1 x 1.  */
-  {
-    int gangs_min, gangs_max;
-    int workers_min, workers_max;
-    int vectors_min, vectors_max;
-    int gangs_actual, workers_actual, vectors_actual;
-    int i, j, k;
-
-    gangs_min = workers_min = vectors_min = INT_MAX;
-    gangs_max = workers_max = vectors_max = INT_MIN;
-    gangs_actual = workers_actual = vectors_actual = 1;
-#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */
-    {
-      if (acc_on_device (acc_device_nvidia))
-	{
-	  /* The GCC nvptx back end enforces vector_length (32).  */
-	  vectors_actual = 32;
-	}
-      else if (!acc_on_device (acc_device_host))
-	__builtin_abort ();
-#pragma acc loop gang \
-  reduction (min: gangs_min, workers_min, vectors_min) \
-  reduction (max: gangs_max, workers_max, vectors_max)
-      for (i = 100 * gangs_actual; i > -100 * gangs_actual; i--)
-#pragma acc loop worker \
-  reduction (min: gangs_min, workers_min, vectors_min) \
-  reduction (max: gangs_max, workers_max, vectors_max)
-	for (j = 100 * workers_actual; j > -100 * workers_actual; j--)
-#pragma acc loop vector \
-  reduction (min: gangs_min, workers_min, vectors_min) \
-  reduction (max: gangs_max, workers_max, vectors_max)
-	  for (k = 100 * vectors_actual; k > -100 * vectors_actual; k--)
-	    {
-	      gangs_min = gangs_max = acc_gang ();
-	      workers_min = workers_max = acc_worker ();
-	      vectors_min = vectors_max = acc_vector ();
-	    }
-      if (gangs_min != 0 || gangs_max != gangs_actual - 1
-	  || workers_min != 0 || workers_max != workers_actual - 1
-	  || vectors_min != 0 || vectors_max != vectors_actual - 1)
-	__builtin_abort ();
-    }
-  }
-
-  return 0;
-}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
similarity index 67%
rename from libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c
rename to libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
index 45c260510c29..b5986f4afef7 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
@@ -1,5 +1,9 @@
-/* OpenACC dimensions with the serial construct.  */
-/* Used by serial-dims.f90.  */
+/* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+   vector_length.  */
+
+/* Copied from '../libgomp.oacc-c-c++-common/parallel-dims.c'.  */
+
+/* Used by 'parallel-dims.f90'.  */
 
 #include <limits.h>
 #include <openacc.h>
@@ -8,7 +12,7 @@
 /* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
    not behaving as expected for -O0.  */
 #pragma acc routine seq
-static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
 {
   if (acc_on_device ((int) acc_device_host))
     return 0;
@@ -19,7 +23,7 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
 }
 
 #pragma acc routine seq
-static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
 {
   if (acc_on_device ((int) acc_device_host))
     return 0;
@@ -30,7 +34,7 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
 }
 
 #pragma acc routine seq
-static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
 {
   if (acc_on_device ((int) acc_device_host))
     return 0;
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
new file mode 100644
index 000000000000..1bfcd6ce0998
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
@@ -0,0 +1,120 @@
+! OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+! vector_length.
+
+! { dg-additional-sources parallel-dims-aux.c }
+! { dg-do run }
+! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
+
+! See also '../libgomp.oacc-c-c++-common/parallel-dims.c'.
+
+module acc_routines
+  implicit none (type, external)
+
+  interface
+    integer function acc_gang() bind(C)
+      !$acc routine seq
+    end function acc_gang
+
+    integer function acc_worker() bind(C)
+      !$acc routine seq
+    end function acc_worker
+
+    integer function acc_vector() bind(C)
+      !$acc routine seq
+    end function acc_vector
+  end interface
+end module acc_routines
+
+program main
+  use iso_c_binding
+  use openacc
+  use acc_routines
+  implicit none (type, external)
+
+  integer :: gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max
+  integer :: vectors_actual
+  integer :: i, j, k
+
+  call acc_init (acc_device_default)
+
+  ! OpenACC parallel construct.
+
+  !TODO
+
+
+  ! OpenACC kernels construct.
+
+  !TODO
+
+
+  ! OpenACC serial construct.
+
+  ! GR, WS, VS.
+
+  gangs_min = huge(gangs_min) ! INT_MAX
+  workers_min = huge(workers_min) ! INT_MAX
+  vectors_min = huge(vectors_min) ! INT_MAX
+  gangs_max = -huge(gangs_max) - 1  ! INT_MIN
+  workers_max = -huge(gangs_max) - 1 ! INT_MIN
+  vectors_max = -huge(gangs_max) - 1 ! INT_MIN
+  !$acc serial &
+  !$acc   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+  do i = 100, -99, -1
+     gangs_min = acc_gang ();
+     gangs_max = acc_gang ();
+     workers_min = acc_worker ();
+     workers_max = acc_worker ();
+     vectors_min = acc_vector ();
+     vectors_max = acc_vector ();
+  end do
+  !$acc end serial
+  if (gangs_min /= 0 .or. gangs_max /= 1 - 1 &
+      .or. workers_min /= 0 .or. workers_max /= 1 - 1 &
+      .or. vectors_min /= 0 .or. vectors_max /= 1 - 1) &
+    stop 1
+
+  ! Composition of GP, WP, VP.
+
+  vectors_actual = 1 ! Implicit 'vector_length (1)' clause.
+  gangs_min = huge(gangs_min) ! INT_MAX
+  workers_min = huge(workers_min) ! INT_MAX
+  vectors_min = huge(vectors_min) ! INT_MAX
+  gangs_max = -huge(gangs_max) - 1  ! INT_MIN
+  workers_max = -huge(gangs_max) - 1 ! INT_MIN
+  vectors_max = -huge(gangs_max) - 1 ! INT_MIN
+  !$acc serial copy (vectors_actual) &
+  !$acc   copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+  if (acc_on_device (acc_device_nvidia)) then
+     ! The GCC nvptx back end enforces vector_length (32).
+     ! It's unclear if that's actually permissible here;
+     ! <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC 'serial'
+     ! construct might not actually be serial".
+   vectors_actual = 32
+  end if
+  !$acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+  do i = 100, -99, -1
+     !$acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+     do j = 100, -99, -1
+        !$acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+        do k = 100 * vectors_actual, -99 * vectors_actual, -1
+           gangs_min = acc_gang ();
+           gangs_max = acc_gang ();
+           workers_min = acc_worker ();
+           workers_max = acc_worker ();
+           vectors_min = acc_vector ();
+           vectors_max = acc_vector ();
+        end do
+     end do
+  end do
+  !$acc end serial
+  if (acc_get_device_type () .eq. acc_device_nvidia) then
+     if (vectors_actual /= 32) stop 2
+  else
+     if (vectors_actual /= 1) stop 3
+  end if
+  if (gangs_min /= 0 .or. gangs_max /= 1 - 1 &
+      .or. workers_min /= 0 .or. workers_max /= 1 - 1 &
+      .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) &
+    stop 4
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90
deleted file mode 100644
index 25c933629045..000000000000
--- a/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90
+++ /dev/null
@@ -1,89 +0,0 @@
-! OpenACC dimensions with the serial construct.
-
-! { dg-additional-sources serial-dims-aux.c }
-! { dg-warning "command line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
-
-module acc_routines
-  implicit none (type, external)
-
-  interface
-    integer function acc_gang() bind(C)
-      !$acc routine seq
-    end function acc_gang
-
-    integer function acc_worker() bind(C)
-      !$acc routine seq
-    end function acc_worker
-
-    integer function acc_vector() bind(C)
-      !$acc routine seq
-    end function acc_vector
-  end interface
-end module acc_routines
-
-program main
-  use iso_c_binding
-  use openacc
-  use acc_routines
-  implicit none (type, external)
-
-  integer :: gangs_min, gangs_max
-  integer :: workers_min, workers_max
-  integer :: vectors_min, vectors_max
-  integer :: gangs_actual, workers_actual, vectors_actual
-  integer :: i, j, k
-
-  call acc_init (acc_device_default)
-
-  ! Serial OpenACC constructs must get launched as 1 x 1 x 1.
-  gangs_min = huge(gangs_min)
-  workers_min = huge(workers_min)
-  vectors_min = huge(vectors_min)
-  gangs_max = -huge(gangs_max) - 1  ! INT_MIN
-  workers_max = -huge(gangs_max) - 1
-  vectors_max = -huge(gangs_max) - 1
-  gangs_actual = 1
-  workers_actual = 1
-  vectors_actual = 1
-
-  !$acc serial ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
-    if (acc_on_device (acc_device_nvidia)) then
-      ! The GCC nvptx back end enforces vector_length (32).
-      vectors_actual = 32
-    elseif (acc_on_device (acc_device_gcn)) then
-      ! AMD GCN relies on the autovectorizer for the vector dimension:
-      ! the loop below isn't likely to be vectorized, so vectors_actual
-      ! is effectively 1.
-      vectors_actual = 1
-    elseif (.not. acc_on_device (acc_device_host)) then
-      stop 1
-    end if
-
-!$acc loop gang &
-!$acc & reduction (min: gangs_min, workers_min, vectors_min) &
-!$acc & reduction (max: gangs_max, workers_max, vectors_max)
-    do i = 100 * gangs_actual, -99 * gangs_actual, -1
-!$acc loop worker &
-!$acc & reduction (min: gangs_min, workers_min, vectors_min) &
-!$acc & reduction (max: gangs_max, workers_max, vectors_max)
-      do j = 100 * workers_actual, -99 * workers_actual, -1
-!$acc loop vector &
-!$acc & reduction (min: gangs_min, workers_min, vectors_min) &
-!$acc & reduction (max: gangs_max, workers_max, vectors_max)
-        do k = 100 * vectors_actual, -99 * vectors_actual, -1
-          gangs_min = acc_gang ();
-          gangs_max = acc_gang ();
-          workers_min = acc_worker ();
-          workers_max = acc_worker ();
-          vectors_min = acc_vector ();
-          vectors_max = acc_vector ();
-       end do
-     end do
-   end do
-  if (gangs_min /= 0 .or. gangs_max /= gangs_actual - 1 &
-      .or. workers_min /= 0 .or. workers_max /= workers_actual - 1 &
-      .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) &
-    stop 2
-!$acc end serial
-
-end program main
-- 
2.17.1

Patch

diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 9fee84b22383..158154ec1294 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1291,6 +1291,7 @@  static const struct omp_pragma_def oacc_pragmas[] = {
   { "loop", PRAGMA_OACC_LOOP },
   { "parallel", PRAGMA_OACC_PARALLEL },
   { "routine", PRAGMA_OACC_ROUTINE },
+  { "serial", PRAGMA_OACC_SERIAL },
   { "update", PRAGMA_OACC_UPDATE },
   { "wait", PRAGMA_OACC_WAIT }
 };
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index e0aa774555a3..bfe681bb430a 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -38,6 +38,7 @@  enum pragma_kind {
   PRAGMA_OACC_LOOP,
   PRAGMA_OACC_PARALLEL,
   PRAGMA_OACC_ROUTINE,
+  PRAGMA_OACC_SERIAL,
   PRAGMA_OACC_UPDATE,
   PRAGMA_OACC_WAIT,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 4f044127a7e2..f5d217d0b7a4 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -16060,6 +16060,11 @@  c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
    # pragma acc parallel oacc-parallel-clause[optseq] new-line
      structured-block
 
+   OpenACC 2.6:
+
+   # pragma acc serial oacc-serial-clause[optseq] new-line
+     structured-block
+
    LOC is the location of the #pragma token.
 */
 
@@ -16096,10 +16101,24 @@  c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_VECTOR_LENGTH)	\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
+#define OACC_SERIAL_CLAUSE_MASK					\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEFAULT)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRIVATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FIRSTPRIVATE)	\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_REDUCTION)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
+
 static tree
-c_parser_oacc_kernels_parallel (location_t loc, c_parser *parser,
-				enum pragma_kind p_kind, char *p_name,
-				bool *if_p)
+c_parser_oacc_compute (location_t loc, c_parser *parser,
+		       enum pragma_kind p_kind, char *p_name, bool *if_p)
 {
   omp_clause_mask mask;
   enum tree_code code;
@@ -16115,6 +16134,11 @@  c_parser_oacc_kernels_parallel (location_t loc, c_parser *parser,
       mask = OACC_PARALLEL_CLAUSE_MASK;
       code = OACC_PARALLEL;
       break;
+    case PRAGMA_OACC_SERIAL:
+      strcat (p_name, " serial");
+      mask = OACC_SERIAL_CLAUSE_MASK;
+      code = OACC_SERIAL;
+      break;
     default:
       gcc_unreachable ();
     }
@@ -20578,9 +20602,9 @@  c_parser_omp_construct (c_parser *parser, bool *if_p)
       break;
     case PRAGMA_OACC_KERNELS:
     case PRAGMA_OACC_PARALLEL:
+    case PRAGMA_OACC_SERIAL:
       strcpy (p_name, "#pragma acc");
-      stmt = c_parser_oacc_kernels_parallel (loc, parser, p_kind, p_name,
-					     if_p);
+      stmt = c_parser_oacc_compute (loc, parser, p_kind, p_name, if_p);
       break;
     case PRAGMA_OACC_LOOP:
       strcpy (p_name, "#pragma acc");
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 20fddc57825a..8c79b0484fce 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -6986,6 +6986,7 @@  potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
     case OMP_DEPOBJ:
     case OACC_PARALLEL:
     case OACC_KERNELS:
+    case OACC_SERIAL:
     case OACC_DATA:
     case OACC_HOST_DATA:
     case OACC_LOOP:
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 7138aebebced..c45bfccf6e61 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -40150,6 +40150,10 @@  cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
    # pragma acc parallel oacc-parallel-clause[optseq] new-line
      structured-block
+
+   OpenACC 2.6:
+
+   # pragma acc serial oacc-serial-clause[optseq] new-line
 */
 
 #define OACC_KERNELS_CLAUSE_MASK					\
@@ -40185,9 +40189,24 @@  cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_VECTOR_LENGTH)       \
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
+#define OACC_SERIAL_CLAUSE_MASK						\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEFAULT)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRIVATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FIRSTPRIVATE)	\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_REDUCTION)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
+
 static tree
-cp_parser_oacc_kernels_parallel (cp_parser *parser, cp_token *pragma_tok,
-				 char *p_name, bool *if_p)
+cp_parser_oacc_compute (cp_parser *parser, cp_token *pragma_tok,
+			char *p_name, bool *if_p)
 {
   omp_clause_mask mask;
   enum tree_code code;
@@ -40203,6 +40222,11 @@  cp_parser_oacc_kernels_parallel (cp_parser *parser, cp_token *pragma_tok,
       mask = OACC_PARALLEL_CLAUSE_MASK;
       code = OACC_PARALLEL;
       break;
+    case PRAGMA_OACC_SERIAL:
+      strcat (p_name, " serial");
+      mask = OACC_SERIAL_CLAUSE_MASK;
+      code = OACC_SERIAL;
+      break;
     default:
       gcc_unreachable ();
     }
@@ -42022,9 +42046,9 @@  cp_parser_omp_construct (cp_parser *parser, cp_token *pragma_tok, bool *if_p)
       break;
     case PRAGMA_OACC_KERNELS:
     case PRAGMA_OACC_PARALLEL:
+    case PRAGMA_OACC_SERIAL:
       strcpy (p_name, "#pragma acc");
-      stmt = cp_parser_oacc_kernels_parallel (parser, pragma_tok, p_name,
-					      if_p);
+      stmt = cp_parser_oacc_compute (parser, pragma_tok, p_name, if_p);
       break;
     case PRAGMA_OACC_LOOP:
       strcpy (p_name, "#pragma acc");
@@ -42691,8 +42715,9 @@  cp_parser_pragma (cp_parser *parser, enum pragma_context context, bool *if_p)
     case PRAGMA_OACC_DATA:
     case PRAGMA_OACC_HOST_DATA:
     case PRAGMA_OACC_KERNELS:
-    case PRAGMA_OACC_PARALLEL:
     case PRAGMA_OACC_LOOP:
+    case PRAGMA_OACC_PARALLEL:
+    case PRAGMA_OACC_SERIAL:
     case PRAGMA_OMP_ATOMIC:
     case PRAGMA_OMP_CRITICAL:
     case PRAGMA_OMP_DISTRIBUTE:
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 8bacb3952ff2..5a0efaa86c8b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -17991,6 +17991,7 @@  tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
 
     case OACC_KERNELS:
     case OACC_PARALLEL:
+    case OACC_SERIAL:
       tmp = tsubst_omp_clauses (OMP_CLAUSES (t), C_ORT_ACC, args, complain,
 				in_decl);
       stmt = begin_omp_parallel ();
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 94e339c15ee8..badaaec38979 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -2388,6 +2388,7 @@  compilation.
 @tindex OACC_KERNELS
 @tindex OACC_LOOP
 @tindex OACC_PARALLEL
+@tindex OACC_SERIAL
 @tindex OACC_UPDATE
 
 All the statements starting with @code{OACC_} represent directives and
@@ -2432,6 +2433,10 @@  See the description of the @code{OMP_FOR} code.
 
 Represents @code{#pragma acc parallel [clause1 @dots{} clauseN]}.
 
+@item OACC_SERIAL
+
+Represents @code{#pragma acc serial [clause1 @dots{} clauseN]}.
+
 @item OACC_UPDATE
 
 Represents @code{#pragma acc update [clause1 @dots{} clauseN]}.
diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 9d7aad19e2f5..253fe15b201d 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1654,6 +1654,8 @@  show_omp_node (int level, gfc_code *c)
     case EXEC_OACC_PARALLEL: name = "PARALLEL"; is_oacc = true; break;
     case EXEC_OACC_KERNELS_LOOP: name = "KERNELS LOOP"; is_oacc = true; break;
     case EXEC_OACC_KERNELS: name = "KERNELS"; is_oacc = true; break;
+    case EXEC_OACC_SERIAL_LOOP: name = "SERIAL LOOP"; is_oacc = true; break;
+    case EXEC_OACC_SERIAL: name = "SERIAL"; is_oacc = true; break;
     case EXEC_OACC_DATA: name = "DATA"; is_oacc = true; break;
     case EXEC_OACC_HOST_DATA: name = "HOST_DATA"; is_oacc = true; break;
     case EXEC_OACC_LOOP: name = "LOOP"; is_oacc = true; break;
@@ -1729,6 +1731,8 @@  show_omp_node (int level, gfc_code *c)
     case EXEC_OACC_PARALLEL:
     case EXEC_OACC_KERNELS_LOOP:
     case EXEC_OACC_KERNELS:
+    case EXEC_OACC_SERIAL_LOOP:
+    case EXEC_OACC_SERIAL:
     case EXEC_OACC_DATA:
     case EXEC_OACC_HOST_DATA:
     case EXEC_OACC_LOOP:
@@ -2918,6 +2922,8 @@  show_code_node (int level, gfc_code *c)
     case EXEC_OACC_PARALLEL:
     case EXEC_OACC_KERNELS_LOOP:
     case EXEC_OACC_KERNELS:
+    case EXEC_OACC_SERIAL_LOOP:
+    case EXEC_OACC_SERIAL:
     case EXEC_OACC_DATA:
     case EXEC_OACC_HOST_DATA:
     case EXEC_OACC_LOOP:
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 920acdafc6b7..e962db59bc59 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -223,7 +223,8 @@  enum gfc_statement
   ST_OACC_END_DATA, ST_OACC_HOST_DATA, ST_OACC_END_HOST_DATA, ST_OACC_LOOP,
   ST_OACC_END_LOOP, ST_OACC_DECLARE, ST_OACC_UPDATE, ST_OACC_WAIT,
   ST_OACC_CACHE, ST_OACC_KERNELS_LOOP, ST_OACC_END_KERNELS_LOOP,
-  ST_OACC_ENTER_DATA, ST_OACC_EXIT_DATA, ST_OACC_ROUTINE,
+  ST_OACC_SERIAL_LOOP, ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL,
+  ST_OACC_END_SERIAL, ST_OACC_ENTER_DATA, ST_OACC_EXIT_DATA, ST_OACC_ROUTINE,
   ST_OACC_ATOMIC, ST_OACC_END_ATOMIC,
   ST_OMP_ATOMIC, ST_OMP_BARRIER, ST_OMP_CRITICAL, ST_OMP_END_ATOMIC,
   ST_OMP_END_CRITICAL, ST_OMP_END_DO, ST_OMP_END_MASTER, ST_OMP_END_ORDERED,
@@ -2572,11 +2573,11 @@  enum gfc_exec_op
   EXEC_BACKSPACE, EXEC_ENDFILE, EXEC_INQUIRE, EXEC_REWIND, EXEC_FLUSH,
   EXEC_FORM_TEAM, EXEC_CHANGE_TEAM, EXEC_END_TEAM, EXEC_SYNC_TEAM,
   EXEC_LOCK, EXEC_UNLOCK, EXEC_EVENT_POST, EXEC_EVENT_WAIT, EXEC_FAIL_IMAGE,
-  EXEC_OACC_KERNELS_LOOP, EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_ROUTINE,
-  EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS, EXEC_OACC_DATA, EXEC_OACC_HOST_DATA,
-  EXEC_OACC_LOOP, EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
-  EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA, EXEC_OACC_ATOMIC,
-  EXEC_OACC_DECLARE,
+  EXEC_OACC_KERNELS_LOOP, EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_SERIAL_LOOP,
+  EXEC_OACC_ROUTINE, EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS, EXEC_OACC_SERIAL,
+  EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP, EXEC_OACC_UPDATE,
+  EXEC_OACC_WAIT, EXEC_OACC_CACHE, EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA,
+  EXEC_OACC_ATOMIC, EXEC_OACC_DECLARE,
   EXEC_OMP_CRITICAL, EXEC_OMP_DO, EXEC_OMP_FLUSH, EXEC_OMP_MASTER,
   EXEC_OMP_ORDERED, EXEC_OMP_PARALLEL, EXEC_OMP_PARALLEL_DO,
   EXEC_OMP_PARALLEL_SECTIONS, EXEC_OMP_PARALLEL_WORKSHARE,
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 4a31080a2856..b5945049de55 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -2860,7 +2860,8 @@  match_exit_cycle (gfc_statement st, gfc_exec_op op)
       && o != NULL
       && o->state == COMP_OMP_STRUCTURED_BLOCK
       && (o->head->op == EXEC_OACC_LOOP
-	  || o->head->op == EXEC_OACC_PARALLEL_LOOP))
+	  || o->head->op == EXEC_OACC_PARALLEL_LOOP
+	  || o->head->op == EXEC_OACC_SERIAL_LOOP))
     {
       int collapse = 1;
       gcc_assert (o->head->next != NULL
diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h
index 611d79646458..954af72f0e07 100644
--- a/gcc/fortran/match.h
+++ b/gcc/fortran/match.h
@@ -147,6 +147,8 @@  match gfc_match_oacc_kernels_loop (void);
 match gfc_match_oacc_parallel (void);
 match gfc_match_oacc_parallel_loop (void);
 match gfc_match_oacc_enter_data (void);
+match gfc_match_oacc_serial (void);
+match gfc_match_oacc_serial_loop (void);
 match gfc_match_oacc_exit_data (void);
 match gfc_match_oacc_routine (void);
 
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index ca3427885457..198facce636d 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1964,6 +1964,15 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT	      \
    | OMP_CLAUSE_WAIT)
+#define OACC_SERIAL_CLAUSES \
+  (omp_mask (OMP_CLAUSE_ASYNC) | OMP_CLAUSE_WAIT			      \
+   | OMP_CLAUSE_IF							      \
+   | OMP_CLAUSE_REDUCTION						      \
+   | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
+   | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT				      \
+   | OMP_CLAUSE_DEVICEPTR						      \
+   | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE			      \
+   | OMP_CLAUSE_DEFAULT)
 #define OACC_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR  | OMP_CLAUSE_COPY	      \
    | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE		      \
@@ -2038,6 +2047,21 @@  gfc_match_oacc_kernels (void)
 }
 
 
+match
+gfc_match_oacc_serial_loop (void)
+{
+  return match_acc (EXEC_OACC_SERIAL_LOOP,
+		    OACC_SERIAL_CLAUSES | OACC_LOOP_CLAUSES);
+}
+
+
+match
+gfc_match_oacc_serial (void)
+{
+  return match_acc (EXEC_OACC_SERIAL, OACC_SERIAL_CLAUSES);
+}
+
+
 match
 gfc_match_oacc_data (void)
 {
@@ -3783,6 +3807,7 @@  oacc_is_loop (gfc_code *code)
 {
   return code->op == EXEC_OACC_PARALLEL_LOOP
 	 || code->op == EXEC_OACC_KERNELS_LOOP
+	 || code->op == EXEC_OACC_SERIAL_LOOP
 	 || code->op == EXEC_OACC_LOOP;
 }
 
@@ -4626,7 +4651,9 @@  resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 				 n->sym->name, name, &n->where);
 		  }
 		if (code
-		    && (oacc_is_loop (code) || code->op == EXEC_OACC_PARALLEL))
+		    && (oacc_is_loop (code)
+			|| code->op == EXEC_OACC_PARALLEL
+			|| code->op == EXEC_OACC_SERIAL))
 		  check_array_not_assumed (n->sym, n->where, name);
 		else if (n->sym->as && n->sym->as->type == AS_ASSUMED_SIZE)
 		  gfc_error ("Assumed size array %qs in %s clause at %L",
@@ -5818,6 +5845,8 @@  oacc_code_to_statement (gfc_code *code)
       return ST_OACC_PARALLEL;
     case EXEC_OACC_KERNELS:
       return ST_OACC_KERNELS;
+    case EXEC_OACC_SERIAL:
+      return ST_OACC_SERIAL;
     case EXEC_OACC_DATA:
       return ST_OACC_DATA;
     case EXEC_OACC_HOST_DATA:
@@ -5826,6 +5855,8 @@  oacc_code_to_statement (gfc_code *code)
       return ST_OACC_PARALLEL_LOOP;
     case EXEC_OACC_KERNELS_LOOP:
       return ST_OACC_KERNELS_LOOP;
+    case EXEC_OACC_SERIAL_LOOP:
+      return ST_OACC_SERIAL_LOOP;
     case EXEC_OACC_LOOP:
       return ST_OACC_LOOP;
     case EXEC_OACC_ATOMIC:
@@ -6163,6 +6194,7 @@  gfc_resolve_oacc_directive (gfc_code *code, gfc_namespace *ns ATTRIBUTE_UNUSED)
     {
     case EXEC_OACC_PARALLEL:
     case EXEC_OACC_KERNELS:
+    case EXEC_OACC_SERIAL:
     case EXEC_OACC_DATA:
     case EXEC_OACC_HOST_DATA:
     case EXEC_OACC_UPDATE:
@@ -6174,6 +6206,7 @@  gfc_resolve_oacc_directive (gfc_code *code, gfc_namespace *ns ATTRIBUTE_UNUSED)
       break;
     case EXEC_OACC_PARALLEL_LOOP:
     case EXEC_OACC_KERNELS_LOOP:
+    case EXEC_OACC_SERIAL_LOOP:
     case EXEC_OACC_LOOP:
       resolve_oacc_loop (code);
       break;
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 15f6bf2937c4..1a38606682ca 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -683,6 +683,9 @@  decode_oacc_directive (void)
       matcha ("end parallel loop", gfc_match_omp_eos_error,
 	      ST_OACC_END_PARALLEL_LOOP);
       matcha ("end parallel", gfc_match_omp_eos_error, ST_OACC_END_PARALLEL);
+      matcha ("end serial loop", gfc_match_omp_eos_error,
+	      ST_OACC_END_SERIAL_LOOP);
+      matcha ("end serial", gfc_match_omp_eos_error, ST_OACC_END_SERIAL);
       matcha ("enter data", gfc_match_oacc_enter_data, ST_OACC_ENTER_DATA);
       matcha ("exit data", gfc_match_oacc_exit_data, ST_OACC_EXIT_DATA);
       break;
@@ -705,6 +708,10 @@  decode_oacc_directive (void)
     case 'r':
       match ("routine", gfc_match_oacc_routine, ST_OACC_ROUTINE);
       break;
+    case 's':
+      matcha ("serial loop", gfc_match_oacc_serial_loop, ST_OACC_SERIAL_LOOP);
+      matcha ("serial", gfc_match_oacc_serial, ST_OACC_SERIAL);
+      break;
     case 'u':
       matcha ("update", gfc_match_oacc_update, ST_OACC_UPDATE);
       break;
@@ -1583,7 +1590,8 @@  next_statement (void)
   case ST_CRITICAL: \
   case ST_OACC_PARALLEL_LOOP: case ST_OACC_PARALLEL: case ST_OACC_KERNELS: \
   case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \
-  case ST_OACC_KERNELS_LOOP: case ST_OACC_ATOMIC
+  case ST_OACC_KERNELS_LOOP: case ST_OACC_SERIAL_LOOP: case ST_OACC_SERIAL: \
+  case ST_OACC_ATOMIC
 
 /* Declaration statements */
 
@@ -2157,6 +2165,18 @@  gfc_ascii_statement (gfc_statement st)
     case ST_OACC_END_KERNELS_LOOP:
       p = "!$ACC END KERNELS LOOP";
       break;
+    case ST_OACC_SERIAL_LOOP:
+      p = "!$ACC SERIAL LOOP";
+      break;
+    case ST_OACC_END_SERIAL_LOOP:
+      p = "!$ACC END SERIAL LOOP";
+      break;
+    case ST_OACC_SERIAL:
+      p = "!$ACC SERIAL";
+      break;
+    case ST_OACC_END_SERIAL:
+      p = "!$ACC END SERIAL";
+      break;
     case ST_OACC_DATA:
       p = "!$ACC DATA";
       break;
@@ -5065,6 +5085,9 @@  parse_oacc_structured_block (gfc_statement acc_st)
     case ST_OACC_KERNELS:
       acc_end_st = ST_OACC_END_KERNELS;
       break;
+    case ST_OACC_SERIAL:
+      acc_end_st = ST_OACC_END_SERIAL;
+      break;
     case ST_OACC_DATA:
       acc_end_st = ST_OACC_END_DATA;
       break;
@@ -5149,6 +5172,7 @@  parse_oacc_loop (gfc_statement acc_st)
     gfc_warning (0, "Redundant !$ACC END LOOP at %C");
   if ((acc_st == ST_OACC_PARALLEL_LOOP && st == ST_OACC_END_PARALLEL_LOOP) ||
       (acc_st == ST_OACC_KERNELS_LOOP && st == ST_OACC_END_KERNELS_LOOP) ||
+      (acc_st == ST_OACC_SERIAL_LOOP && st == ST_OACC_END_SERIAL_LOOP) ||
       (acc_st == ST_OACC_LOOP && st == ST_OACC_END_LOOP))
     {
       gcc_assert (new_st.op == EXEC_NOP);
@@ -5488,6 +5512,7 @@  parse_executable (gfc_statement st)
 
 	case ST_OACC_PARALLEL_LOOP:
 	case ST_OACC_KERNELS_LOOP:
+	case ST_OACC_SERIAL_LOOP:
 	case ST_OACC_LOOP:
 	  st = parse_oacc_loop (st);
 	  if (st == ST_IMPLIED_ENDDO)
@@ -5496,6 +5521,7 @@  parse_executable (gfc_statement st)
 
 	case ST_OACC_PARALLEL:
 	case ST_OACC_KERNELS:
+	case ST_OACC_SERIAL:
 	case ST_OACC_DATA:
 	case ST_OACC_HOST_DATA:
 	  parse_oacc_structured_block (st);
@@ -6544,6 +6570,8 @@  is_oacc (gfc_state_data *sd)
     case EXEC_OACC_PARALLEL:
     case EXEC_OACC_KERNELS_LOOP:
     case EXEC_OACC_KERNELS:
+    case EXEC_OACC_SERIAL_LOOP:
+    case EXEC_OACC_SERIAL:
     case EXEC_OACC_DATA:
     case EXEC_OACC_HOST_DATA:
     case EXEC_OACC_LOOP:
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 218c2edba57c..9b1437d70327 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -10576,6 +10576,8 @@  gfc_resolve_blocks (gfc_code *b, gfc_namespace *ns)
 	case EXEC_OACC_PARALLEL:
 	case EXEC_OACC_KERNELS_LOOP:
 	case EXEC_OACC_KERNELS:
+	case EXEC_OACC_SERIAL_LOOP:
+	case EXEC_OACC_SERIAL:
 	case EXEC_OACC_DATA:
 	case EXEC_OACC_HOST_DATA:
 	case EXEC_OACC_LOOP:
@@ -11527,6 +11529,8 @@  gfc_resolve_code (gfc_code *code, gfc_namespace *ns)
 	    case EXEC_OACC_PARALLEL:
 	    case EXEC_OACC_KERNELS_LOOP:
 	    case EXEC_OACC_KERNELS:
+	    case EXEC_OACC_SERIAL_LOOP:
+	    case EXEC_OACC_SERIAL:
 	    case EXEC_OACC_DATA:
 	    case EXEC_OACC_HOST_DATA:
 	    case EXEC_OACC_LOOP:
@@ -11940,6 +11944,8 @@  start:
 	case EXEC_OACC_PARALLEL:
 	case EXEC_OACC_KERNELS_LOOP:
 	case EXEC_OACC_KERNELS:
+	case EXEC_OACC_SERIAL_LOOP:
+	case EXEC_OACC_SERIAL:
 	case EXEC_OACC_DATA:
 	case EXEC_OACC_HOST_DATA:
 	case EXEC_OACC_LOOP:
diff --git a/gcc/fortran/st.c b/gcc/fortran/st.c
index ee18d7aea8ad..12eed71e3a26 100644
--- a/gcc/fortran/st.c
+++ b/gcc/fortran/st.c
@@ -202,6 +202,8 @@  gfc_free_statement (gfc_code *p)
     case EXEC_OACC_PARALLEL:
     case EXEC_OACC_KERNELS_LOOP:
     case EXEC_OACC_KERNELS:
+    case EXEC_OACC_SERIAL_LOOP:
+    case EXEC_OACC_SERIAL:
     case EXEC_OACC_DATA:
     case EXEC_OACC_HOST_DATA:
     case EXEC_OACC_LOOP:
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 14a3c3e42843..0d5a5a9615d6 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -3141,7 +3141,7 @@  gfc_trans_omp_code (gfc_code *code, bool force_empty)
 }
 
 /* Trans OpenACC directives. */
-/* parallel, kernels, data and host_data. */
+/* parallel, serial, kernels, data and host_data. */
 static tree
 gfc_trans_oacc_construct (gfc_code *code)
 {
@@ -3157,6 +3157,9 @@  gfc_trans_oacc_construct (gfc_code *code)
       case EXEC_OACC_KERNELS:
 	construct_code = OACC_KERNELS;
 	break;
+      case EXEC_OACC_SERIAL:
+	construct_code = OACC_SERIAL;
+	break;
       case EXEC_OACC_DATA:
 	construct_code = OACC_DATA;
 	break;
@@ -3964,7 +3967,8 @@  gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
   return gfc_finish_block (&block);
 }
 
-/* parallel loop and kernels loop. */
+/* Combined OpenACC parallel loop, kernels loop and serial loop. */
+
 static tree
 gfc_trans_oacc_combined_directive (gfc_code *code)
 {
@@ -3982,6 +3986,9 @@  gfc_trans_oacc_combined_directive (gfc_code *code)
       case EXEC_OACC_KERNELS_LOOP:
 	construct_code = OACC_KERNELS;
 	break;
+      case EXEC_OACC_SERIAL_LOOP:
+	construct_code = OACC_SERIAL;
+	break;
       default:
 	gcc_unreachable ();
     }
@@ -5214,9 +5221,11 @@  gfc_trans_oacc_directive (gfc_code *code)
     {
     case EXEC_OACC_PARALLEL_LOOP:
     case EXEC_OACC_KERNELS_LOOP:
+    case EXEC_OACC_SERIAL_LOOP:
       return gfc_trans_oacc_combined_directive (code);
     case EXEC_OACC_PARALLEL:
     case EXEC_OACC_KERNELS:
+    case EXEC_OACC_SERIAL:
     case EXEC_OACC_DATA:
     case EXEC_OACC_HOST_DATA:
       return gfc_trans_oacc_construct (code);
diff --git a/gcc/fortran/trans.c b/gcc/fortran/trans.c
index 2f878f6b1185..d9b278199b75 100644
--- a/gcc/fortran/trans.c
+++ b/gcc/fortran/trans.c
@@ -2137,6 +2137,8 @@  trans_code (gfc_code * code, tree cond)
 	case EXEC_OACC_KERNELS_LOOP:
 	case EXEC_OACC_PARALLEL:
 	case EXEC_OACC_PARALLEL_LOOP:
+	case EXEC_OACC_SERIAL:
+	case EXEC_OACC_SERIAL_LOOP:
 	case EXEC_OACC_ENTER_DATA:
 	case EXEC_OACC_EXIT_DATA:
 	case EXEC_OACC_ATOMIC:
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 2d5ece068053..f59cc2aa3188 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1676,6 +1676,9 @@  dump_gimple_omp_target (pretty_printer *buffer, gomp_target *gs,
     case GF_OMP_TARGET_KIND_OACC_PARALLEL:
       kind = " oacc_parallel";
       break;
+    case GF_OMP_TARGET_KIND_OACC_SERIAL:
+      kind = " oacc_serial";
+      break;
     case GF_OMP_TARGET_KIND_OACC_DATA:
       kind = " oacc_data";
       break;
diff --git a/gcc/gimple.h b/gcc/gimple.h
index cf1f8da5ae24..83a449be3643 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -182,6 +182,7 @@  enum gf_mask {
     GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,
     GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
     GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
+    GF_OMP_TARGET_KIND_OACC_SERIAL = 12,
     GF_OMP_TEAMS_GRID_PHONY	= 1 << 0,
     GF_OMP_TEAMS_HOST		= 1 << 1,
 
@@ -6476,6 +6477,7 @@  is_gimple_omp_oacc (const gimple *stmt)
 	{
 	case GF_OMP_TARGET_KIND_OACC_PARALLEL:
 	case GF_OMP_TARGET_KIND_OACC_KERNELS:
+	case GF_OMP_TARGET_KIND_OACC_SERIAL:
 	case GF_OMP_TARGET_KIND_OACC_DATA:
 	case GF_OMP_TARGET_KIND_OACC_UPDATE:
 	case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA:
@@ -6505,6 +6507,7 @@  is_gimple_omp_offloaded (const gimple *stmt)
 	case GF_OMP_TARGET_KIND_REGION:
 	case GF_OMP_TARGET_KIND_OACC_PARALLEL:
 	case GF_OMP_TARGET_KIND_OACC_KERNELS:
+	case GF_OMP_TARGET_KIND_OACC_SERIAL:
 	  return true;
 	default:
 	  return false;
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 5fa0ba6dda60..94a69643aaab 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -161,6 +161,7 @@  enum omp_region_type
   ORT_ACC_DATA	= ORT_ACC | ORT_TARGET_DATA, /* Data construct.  */
   ORT_ACC_PARALLEL = ORT_ACC | ORT_TARGET,  /* Parallel construct */
   ORT_ACC_KERNELS  = ORT_ACC | ORT_TARGET | 2,  /* Kernels construct.  */
+  ORT_ACC_SERIAL = ORT_ACC | ORT_TARGET | 4,  /* Serial construct.  */
   ORT_ACC_HOST_DATA = ORT_ACC | ORT_TARGET_DATA | 2,  /* Host data.  */
 
   /* Dummy OpenMP region, used to disable expansion of
@@ -5551,6 +5552,7 @@  is_gimple_stmt (tree t)
     case STATEMENT_LIST:
     case OACC_PARALLEL:
     case OACC_KERNELS:
+    case OACC_SERIAL:
     case OACC_DATA:
     case OACC_HOST_DATA:
     case OACC_DECLARE:
@@ -7289,7 +7291,8 @@  oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
       break;
 
     case ORT_ACC_PARALLEL:
-      rkind = "parallel";
+    case ORT_ACC_SERIAL:
+      rkind = ctx->region_type == ORT_ACC_PARALLEL ? "parallel" : "serial";
 
       if (is_private)
 	flags |= GOVD_FIRSTPRIVATE;
@@ -10101,7 +10104,8 @@  gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	  /* Data clauses associated with acc parallel reductions must be
 	     compatible with present_or_copy.  Warn and adjust the clause
 	     if that is not the case.  */
-	  if (ctx->region_type == ORT_ACC_PARALLEL)
+	  if (ctx->region_type == ORT_ACC_PARALLEL
+	      || ctx->region_type == ORT_ACC_SERIAL)
 	    {
 	      tree t = DECL_P (decl) ? decl : TREE_OPERAND (decl, 0);
 	      n = NULL;
@@ -10277,7 +10281,8 @@  gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	  decl = OMP_CLAUSE_DECL (c);
 	  /* OpenACC reductions need a present_or_copy data clause.
 	     Add one if necessary.  Emit error when the reduction is private.  */
-	  if (ctx->region_type == ORT_ACC_PARALLEL)
+	  if (ctx->region_type == ORT_ACC_PARALLEL
+	      || ctx->region_type == ORT_ACC_SERIAL)
 	    {
 	      n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
 	      if (n->value & (GOVD_PRIVATE | GOVD_FIRSTPRIVATE))
@@ -12529,6 +12534,9 @@  gimplify_omp_workshare (tree *expr_p, gimple_seq *pre_p)
     case OACC_PARALLEL:
       ort = ORT_ACC_PARALLEL;
       break;
+    case OACC_SERIAL:
+      ort = ORT_ACC_SERIAL;
+      break;
     case OACC_DATA:
       ort = ORT_ACC_DATA;
       break;
@@ -12612,6 +12620,10 @@  gimplify_omp_workshare (tree *expr_p, gimple_seq *pre_p)
       stmt = gimple_build_omp_target (body, GF_OMP_TARGET_KIND_OACC_PARALLEL,
 				      OMP_CLAUSES (expr));
       break;
+    case OACC_SERIAL:
+      stmt = gimple_build_omp_target (body, GF_OMP_TARGET_KIND_OACC_SERIAL,
+				      OMP_CLAUSES (expr));
+      break;
     case OMP_SECTIONS:
       stmt = gimple_build_omp_sections (body, OMP_CLAUSES (expr));
       break;
@@ -13870,6 +13882,7 @@  gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OACC_DATA:
 	case OACC_KERNELS:
 	case OACC_PARALLEL:
+	case OACC_SERIAL:
 	case OMP_SECTIONS:
 	case OMP_SINGLE:
 	case OMP_TARGET:
@@ -14286,6 +14299,7 @@  gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 		  && code != EH_ELSE_EXPR
 		  && code != OACC_PARALLEL
 		  && code != OACC_KERNELS
+		  && code != OACC_SERIAL
 		  && code != OACC_DATA
 		  && code != OACC_HOST_DATA
 		  && code != OACC_DECLARE
diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index eadff6e50f86..d242f4e1ae99 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -7901,19 +7901,22 @@  expand_omp_target (struct omp_region *region)
   gimple *stmt;
   edge e;
   bool offloaded, data_region;
+  int target_kind;
 
   entry_stmt = as_a <gomp_target *> (last_stmt (region->entry));
+  target_kind = gimple_omp_target_kind (entry_stmt);
   new_bb = region->entry;
 
   offloaded = is_gimple_omp_offloaded (entry_stmt);
-  switch (gimple_omp_target_kind (entry_stmt))
+  switch (target_kind)
     {
     case GF_OMP_TARGET_KIND_REGION:
     case GF_OMP_TARGET_KIND_UPDATE:
     case GF_OMP_TARGET_KIND_ENTER_DATA:
     case GF_OMP_TARGET_KIND_EXIT_DATA:
-    case GF_OMP_TARGET_KIND_OACC_PARALLEL:
     case GF_OMP_TARGET_KIND_OACC_KERNELS:
+    case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+    case GF_OMP_TARGET_KIND_OACC_SERIAL:
     case GF_OMP_TARGET_KIND_OACC_UPDATE:
     case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA:
     case GF_OMP_TARGET_KIND_OACC_DECLARE:
@@ -7944,16 +7947,28 @@  expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
-  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+  switch (target_kind)
     {
+    case GF_OMP_TARGET_KIND_OACC_KERNELS:
       mark_loops_in_oacc_kernels_region (region->entry, region->exit);
 
-      /* Further down, both OpenACC kernels and OpenACC parallel constructs
-	 will be mappted to BUILT_IN_GOACC_PARALLEL, and to distinguish the
-	 two, there is an "oacc kernels" attribute set for OpenACC kernels.  */
+      /* Further down, all OpenACC compute constructs will be mapped to
+	 BUILT_IN_GOACC_PARALLEL, and to distinguish between them, there
+	 is an "oacc kernels" attribute set for OpenACC kernels.  */
       DECL_ATTRIBUTES (child_fn)
 	= tree_cons (get_identifier ("oacc kernels"),
 		     NULL_TREE, DECL_ATTRIBUTES (child_fn));
+      break;
+    case GF_OMP_TARGET_KIND_OACC_SERIAL:
+      /* Further down, all OpenACC compute constructs will be mapped to
+	 BUILT_IN_GOACC_PARALLEL, and to distinguish between them, there
+	 is an "oacc serial" attribute set for OpenACC serial.  */
+      DECL_ATTRIBUTES (child_fn)
+	= tree_cons (get_identifier ("oacc serial"),
+		     NULL_TREE, DECL_ATTRIBUTES (child_fn));
+      break;
+    default:
+      break;
     }
 
   if (offloaded)
@@ -8158,6 +8173,7 @@  expand_omp_target (struct omp_region *region)
       break;
     case GF_OMP_TARGET_KIND_OACC_KERNELS:
     case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+    case GF_OMP_TARGET_KIND_OACC_SERIAL:
       start_ix = BUILT_IN_GOACC_PARALLEL;
       break;
     case GF_OMP_TARGET_KIND_OACC_DATA:
@@ -8352,7 +8368,18 @@  expand_omp_target (struct omp_region *region)
 	args.quick_push (get_target_arguments (&gsi, entry_stmt));
       break;
     case BUILT_IN_GOACC_PARALLEL:
-      oacc_set_fn_attrib (child_fn, clauses, &args);
+      if (lookup_attribute ("oacc serial", DECL_ATTRIBUTES (child_fn)) != NULL)
+	{
+	  tree dims = NULL_TREE;
+	  unsigned int ix;
+
+	  /* For serial constructs we set all dimensions to 1.  */
+	  for (ix = GOMP_DIM_MAX; ix--;)
+	    dims = tree_cons (NULL_TREE, integer_one_node, dims);
+	  oacc_replace_fn_attrib (child_fn, dims);
+	}
+      else
+	oacc_set_fn_attrib (child_fn, clauses, &args);
       tagging = true;
       /* FALLTHRU */
     case BUILT_IN_GOACC_ENTER_EXIT_DATA:
@@ -8911,8 +8938,9 @@  build_omp_regions_1 (basic_block bb, struct omp_region *parent,
 		{
 		case GF_OMP_TARGET_KIND_REGION:
 		case GF_OMP_TARGET_KIND_DATA:
-		case GF_OMP_TARGET_KIND_OACC_PARALLEL:
 		case GF_OMP_TARGET_KIND_OACC_KERNELS:
+		case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+		case GF_OMP_TARGET_KIND_OACC_SERIAL:
 		case GF_OMP_TARGET_KIND_OACC_DATA:
 		case GF_OMP_TARGET_KIND_OACC_HOST_DATA:
 		  break;
@@ -9165,8 +9193,9 @@  omp_make_gimple_edges (basic_block bb, struct omp_region **region,
 	{
 	case GF_OMP_TARGET_KIND_REGION:
 	case GF_OMP_TARGET_KIND_DATA:
-	case GF_OMP_TARGET_KIND_OACC_PARALLEL:
 	case GF_OMP_TARGET_KIND_OACC_KERNELS:
+	case GF_OMP_TARGET_KIND_OACC_PARALLEL:
+	case GF_OMP_TARGET_KIND_OACC_SERIAL:
 	case GF_OMP_TARGET_KIND_OACC_DATA:
 	case GF_OMP_TARGET_KIND_OACC_HOST_DATA:
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index fa76ceba33c6..fb2ddc5f354a 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -185,15 +185,17 @@  static tree scan_omp_1_op (tree *, int *, void *);
       *handled_ops_p = false; \
       break;
 
-/* Return true if CTX corresponds to an oacc parallel region.  */
+/* Return true if CTX corresponds to an oacc parallel or serial region.  */
 
 static bool
-is_oacc_parallel (omp_context *ctx)
+is_oacc_parallel_or_serial (omp_context *ctx)
 {
   enum gimple_code outer_type = gimple_code (ctx->stmt);
   return ((outer_type == GIMPLE_OMP_TARGET)
-	  && (gimple_omp_target_kind (ctx->stmt)
-	      == GF_OMP_TARGET_KIND_OACC_PARALLEL));
+	  && ((gimple_omp_target_kind (ctx->stmt)
+	       == GF_OMP_TARGET_KIND_OACC_PARALLEL)
+	      || (gimple_omp_target_kind (ctx->stmt)
+		  == GF_OMP_TARGET_KIND_OACC_SERIAL)));
 }
 
 /* Return true if CTX corresponds to an oacc kernels region.  */
@@ -1149,7 +1151,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	  goto do_private;
 
 	case OMP_CLAUSE_REDUCTION:
-	  if (is_oacc_parallel (ctx) || is_oacc_kernels (ctx))
+	  if (is_oacc_parallel_or_serial (ctx) || is_oacc_kernels (ctx))
 	    ctx->local_reduction_clauses
 	      = tree_cons (NULL, c, ctx->local_reduction_clauses);
 	  /* FALLTHRU */
@@ -2391,7 +2393,7 @@  scan_omp_for (gomp_for *stmt, omp_context *outer_ctx)
     {
       omp_context *tgt = enclosing_target_ctx (outer_ctx);
 
-      if (!tgt || is_oacc_parallel (tgt))
+      if (!tgt || is_oacc_parallel_or_serial (tgt))
 	for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
 	  {
 	    char const *check = NULL;
@@ -2945,6 +2947,7 @@  check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx)
 		  {
 		  case GF_OMP_TARGET_KIND_OACC_PARALLEL:
 		  case GF_OMP_TARGET_KIND_OACC_KERNELS:
+		  case GF_OMP_TARGET_KIND_OACC_SERIAL:
 		    ok = true;
 		    break;
 
@@ -3393,6 +3396,7 @@  check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx)
 	      stmt_name = "target exit data"; break;
 	    case GF_OMP_TARGET_KIND_OACC_PARALLEL: stmt_name = "parallel"; break;
 	    case GF_OMP_TARGET_KIND_OACC_KERNELS: stmt_name = "kernels"; break;
+	    case GF_OMP_TARGET_KIND_OACC_SERIAL: stmt_name = "serial"; break;
 	    case GF_OMP_TARGET_KIND_OACC_DATA: stmt_name = "data"; break;
 	    case GF_OMP_TARGET_KIND_OACC_UPDATE: stmt_name = "update"; break;
 	    case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA:
@@ -3410,6 +3414,8 @@  check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx)
 	      ctx_stmt_name = "parallel"; break;
 	    case GF_OMP_TARGET_KIND_OACC_KERNELS:
 	      ctx_stmt_name = "kernels"; break;
+	    case GF_OMP_TARGET_KIND_OACC_SERIAL:
+	      ctx_stmt_name = "serial"; break;
 	    case GF_OMP_TARGET_KIND_OACC_DATA: ctx_stmt_name = "data"; break;
 	    case GF_OMP_TARGET_KIND_OACC_HOST_DATA:
 	      ctx_stmt_name = "host_data"; break;
@@ -6711,8 +6717,10 @@  lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 		    break;
 
 		  case GIMPLE_OMP_TARGET:
-		    if (gimple_omp_target_kind (probe->stmt)
-			!= GF_OMP_TARGET_KIND_OACC_PARALLEL)
+		    if ((gimple_omp_target_kind (probe->stmt)
+			 != GF_OMP_TARGET_KIND_OACC_PARALLEL)
+			&& (gimple_omp_target_kind (probe->stmt)
+			    != GF_OMP_TARGET_KIND_OACC_SERIAL))
 		      goto do_lookup;
 
 		    cls = gimple_omp_target_clauses (probe->stmt);
@@ -7518,7 +7526,7 @@  lower_oacc_head_mark (location_t loc, tree ddvar, tree clauses,
 
   /* In a parallel region, loops are implicitly INDEPENDENT.  */
   omp_context *tgt = enclosing_target_ctx (ctx);
-  if (!tgt || is_oacc_parallel (tgt))
+  if (!tgt || is_oacc_parallel_or_serial (tgt))
     tag |= OLF_INDEPENDENT;
 
   if (tag & OLF_TILE)
@@ -11357,6 +11365,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
     case GF_OMP_TARGET_KIND_EXIT_DATA:
     case GF_OMP_TARGET_KIND_OACC_PARALLEL:
     case GF_OMP_TARGET_KIND_OACC_KERNELS:
+    case GF_OMP_TARGET_KIND_OACC_SERIAL:
     case GF_OMP_TARGET_KIND_OACC_UPDATE:
     case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA:
     case GF_OMP_TARGET_KIND_OACC_DECLARE:
@@ -11531,7 +11540,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	break;
 
       case OMP_CLAUSE_FIRSTPRIVATE:
-	if (is_oacc_parallel (ctx))
+	if (is_oacc_parallel_or_serial (ctx))
 	  goto oacc_firstprivate;
 	map_cnt++;
 	var = OMP_CLAUSE_DECL (c);
@@ -11905,7 +11914,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	    break;
 
 	  case OMP_CLAUSE_FIRSTPRIVATE:
-	    if (is_oacc_parallel (ctx))
+	    if (is_oacc_parallel_or_serial (ctx))
 	      goto oacc_firstprivate_map;
 	    ovar = OMP_CLAUSE_DECL (c);
 	    if (omp_is_reference (ovar))
@@ -12439,7 +12448,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gimple_seq fork_seq = NULL;
       gimple_seq join_seq = NULL;
 
-      if (is_oacc_parallel (ctx))
+      if (is_oacc_parallel_or_serial (ctx))
 	{
 	  /* If there are reductions on the offloaded region itself, treat
 	     them as a dummy GANG loop.  */
diff --git a/gcc/testsuite/c-c++-common/goacc/serial-dims.c b/gcc/testsuite/c-c++-common/goacc/serial-dims.c
new file mode 100644
index 000000000000..41698d279c98
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/serial-dims.c
@@ -0,0 +1,12 @@ 
+/* Invalid use of OpenACC parallelism dimensions clauses: num_gangs,
+   num_workers, vector_length with the serial construct.  */
+
+void f(void)
+{
+#pragma acc serial num_gangs (1) /* { dg-error "'num_gangs' is not valid for '#pragma acc serial'" } */
+  ;
+#pragma acc serial num_workers (1) /* { dg-error "'num_workers' is not valid for '#pragma acc serial'" } */
+  ;
+#pragma acc serial vector_length (1) /* { dg-error "'vector_length' is not valid for '#pragma acc serial'" } */
+  ;
+}
diff --git a/gcc/testsuite/gfortran.dg/goacc/serial-dims.f90 b/gcc/testsuite/gfortran.dg/goacc/serial-dims.f90
new file mode 100644
index 000000000000..72b4a8361776
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/serial-dims.f90
@@ -0,0 +1,40 @@ 
+! Invalid use of OpenACC parallelism dimensions clauses: num_gangs,
+! num_workers, vector_length with the serial construct.
+
+subroutine s()
+  integer :: i
+  !$acc parallel
+  !$acc end parallel
+
+  !$acc parallel loop
+  do i = 1, 5
+  end do
+
+  !$acc parallel loop
+  do i = 1, 5
+  end do
+  !$acc end parallel loop
+
+  !$acc serial loop
+  do i = 1, 5
+  end do
+
+  !$acc serial loop
+  do i = 1, 5
+  end do
+  !$acc end serial loop
+
+  !$acc serial
+  !$acc end serial
+end subroutine s
+
+subroutine f()
+!$acc serial num_gangs (1)  ! { dg-error "Failed to match clause at" }
+!$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }
+
+!$acc serial num_workers (1)  ! { dg-error "Failed to match clause at" }
+!$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }
+
+!$acc serial vector_length (1)  ! { dg-error "Failed to match clause at" }
+!$acc end serial  ! { dg-error "Unexpected !.ACC END SERIAL statement" }
+end subroutine f
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 53b3f55a3e6a..1cf7a9121336 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -3223,6 +3223,10 @@  dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       pp_string (pp, "#pragma acc kernels");
       goto dump_omp_clauses_body;
 
+    case OACC_SERIAL:
+      pp_string (pp, "#pragma acc serial");
+      goto dump_omp_clauses_body;
+
     case OACC_DATA:
       pp_string (pp, "#pragma acc data");
       dump_omp_clauses (pp, OACC_DATA_CLAUSES (node), spc, flags);
diff --git a/gcc/tree.def b/gcc/tree.def
index fb6e7344fa6b..e8bb4f37f802 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1095,6 +1095,12 @@  DEFTREECODE (OACC_PARALLEL, "oacc_parallel", tcc_statement, 2)
 
 DEFTREECODE (OACC_KERNELS, "oacc_kernels", tcc_statement, 2)
 
+/* OpenACC - #pragma acc serial [clause1 ... clauseN]
+   Operand 0: OMP_BODY: Code to be executed sequentially.
+   Operand 1: OMP_CLAUSES: List of clauses.  */
+
+DEFTREECODE (OACC_SERIAL, "oacc_serial", tcc_statement, 2)
+
 /* OpenACC - #pragma acc data [clause1 ... clauseN]
    Operand 0: OACC_DATA_BODY: Data construct body.
    Operand 1: OACC_DATA_CLAUSES: List of clauses.  */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c
new file mode 100644
index 000000000000..bb91c9221f89
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/serial-dims.c
@@ -0,0 +1,92 @@ 
+/* OpenACC dimensions with the serial construct.  */
+
+#include <limits.h>
+#include <openacc.h>
+#include <gomp-constants.h>
+
+/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
+   not behaving as expected for -O0.  */
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+  else
+    __builtin_abort ();
+}
+
+
+int main ()
+{
+  acc_init (acc_device_default);
+
+  /* Serial OpenACC constructs must get launched as 1 x 1 x 1.  */
+  {
+    int gangs_min, gangs_max;
+    int workers_min, workers_max;
+    int vectors_min, vectors_max;
+    int gangs_actual, workers_actual, vectors_actual;
+    int i, j, k;
+
+    gangs_min = workers_min = vectors_min = INT_MAX;
+    gangs_max = workers_max = vectors_max = INT_MIN;
+    gangs_actual = workers_actual = vectors_actual = 1;
+#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */
+    {
+      if (acc_on_device (acc_device_nvidia))
+	{
+	  /* The GCC nvptx back end enforces vector_length (32).  */
+	  vectors_actual = 32;
+	}
+      else if (!acc_on_device (acc_device_host))
+	__builtin_abort ();
+#pragma acc loop gang \
+  reduction (min: gangs_min, workers_min, vectors_min) \
+  reduction (max: gangs_max, workers_max, vectors_max)
+      for (i = 100 * gangs_actual; i > -100 * gangs_actual; i--)
+#pragma acc loop worker \
+  reduction (min: gangs_min, workers_min, vectors_min) \
+  reduction (max: gangs_max, workers_max, vectors_max)
+	for (j = 100 * workers_actual; j > -100 * workers_actual; j--)
+#pragma acc loop vector \
+  reduction (min: gangs_min, workers_min, vectors_min) \
+  reduction (max: gangs_max, workers_max, vectors_max)
+	  for (k = 100 * vectors_actual; k > -100 * vectors_actual; k--)
+	    {
+	      gangs_min = gangs_max = acc_gang ();
+	      workers_min = workers_max = acc_worker ();
+	      vectors_min = vectors_max = acc_vector ();
+	    }
+      if (gangs_min != 0 || gangs_max != gangs_actual - 1
+	  || workers_min != 0 || workers_max != workers_actual - 1
+	  || vectors_min != 0 || vectors_max != vectors_actual - 1)
+	__builtin_abort ();
+    }
+  }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c
new file mode 100644
index 000000000000..45c260510c29
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims-aux.c
@@ -0,0 +1,41 @@ 
+/* OpenACC dimensions with the serial construct.  */
+/* Used by serial-dims.f90.  */
+
+#include <limits.h>
+#include <openacc.h>
+#include <gomp-constants.h>
+
+/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
+   not behaving as expected for -O0.  */
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+  else
+    __builtin_abort ();
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90
new file mode 100644
index 000000000000..25c933629045
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/serial-dims.f90
@@ -0,0 +1,89 @@ 
+! OpenACC dimensions with the serial construct.
+
+! { dg-additional-sources serial-dims-aux.c }
+! { dg-warning "command line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
+
+module acc_routines
+  implicit none (type, external)
+
+  interface
+    integer function acc_gang() bind(C)
+      !$acc routine seq
+    end function acc_gang
+
+    integer function acc_worker() bind(C)
+      !$acc routine seq
+    end function acc_worker
+
+    integer function acc_vector() bind(C)
+      !$acc routine seq
+    end function acc_vector
+  end interface
+end module acc_routines
+
+program main
+  use iso_c_binding
+  use openacc
+  use acc_routines
+  implicit none (type, external)
+
+  integer :: gangs_min, gangs_max
+  integer :: workers_min, workers_max
+  integer :: vectors_min, vectors_max
+  integer :: gangs_actual, workers_actual, vectors_actual
+  integer :: i, j, k
+
+  call acc_init (acc_device_default)
+
+  ! Serial OpenACC constructs must get launched as 1 x 1 x 1.
+  gangs_min = huge(gangs_min)
+  workers_min = huge(workers_min)
+  vectors_min = huge(vectors_min)
+  gangs_max = -huge(gangs_max) - 1  ! INT_MIN
+  workers_max = -huge(gangs_max) - 1
+  vectors_max = -huge(gangs_max) - 1
+  gangs_actual = 1
+  workers_actual = 1
+  vectors_actual = 1
+
+  !$acc serial ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+    if (acc_on_device (acc_device_nvidia)) then
+      ! The GCC nvptx back end enforces vector_length (32).
+      vectors_actual = 32
+    elseif (acc_on_device (acc_device_gcn)) then
+      ! AMD GCN relies on the autovectorizer for the vector dimension:
+      ! the loop below isn't likely to be vectorized, so vectors_actual
+      ! is effectively 1.
+      vectors_actual = 1
+    elseif (.not. acc_on_device (acc_device_host)) then
+      stop 1
+    end if
+
+!$acc loop gang &
+!$acc & reduction (min: gangs_min, workers_min, vectors_min) &
+!$acc & reduction (max: gangs_max, workers_max, vectors_max)
+    do i = 100 * gangs_actual, -99 * gangs_actual, -1
+!$acc loop worker &
+!$acc & reduction (min: gangs_min, workers_min, vectors_min) &
+!$acc & reduction (max: gangs_max, workers_max, vectors_max)
+      do j = 100 * workers_actual, -99 * workers_actual, -1
+!$acc loop vector &
+!$acc & reduction (min: gangs_min, workers_min, vectors_min) &
+!$acc & reduction (max: gangs_max, workers_max, vectors_max)
+        do k = 100 * vectors_actual, -99 * vectors_actual, -1
+          gangs_min = acc_gang ();
+          gangs_max = acc_gang ();
+          workers_min = acc_worker ();
+          workers_max = acc_worker ();
+          vectors_min = acc_vector ();
+          vectors_max = acc_vector ();
+       end do
+     end do
+   end do
+  if (gangs_min /= 0 .or. gangs_max /= gangs_actual - 1 &
+      .or. workers_min /= 0 .or. workers_max /= workers_actual - 1 &
+      .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) &
+    stop 2
+!$acc end serial
+
+end program main