RFC: Monitoring old PRs, new dg directives

Message ID 20200728214447.GS3841@redhat.com
State New
Headers show
Series
  • RFC: Monitoring old PRs, new dg directives
Related show

Commit Message

David Malcolm via Gcc-patches July 28, 2020, 9:44 p.m.
In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In
my experience, a good amount of them have already been fixed; my periodical
sweeps always turn up a bunch of PRs that had already been fixed previously.
Sometimes my sweeps are more or less random, but more often than not I'm just
looking for duplicates of an existing PR.  Sometimes the reason the already
fixed PRs are still open is because a PR that was fixed had duplicates that we
didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a
side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming
because often you need to grab the test from the Bugzilla yet again (and
sometimes there are multiple tests).  Even if you find a PR that was fixed, you
still need to bisect the fix and perhaps add the test to our testsuite.  That's
draining and since the number of bugs only increases, never decreases, it is not
sustainable.

So I've started a personal repo where I've gathered dozens of tests and wrote a
script that just compiles every test in the repo and reports if anything
changed.  One line from it:

pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || echo -e "$pr: ${msg_ice}"

This has major drawbacks: you have to remember to run this manually, keep
updating it, and it's yet another repo that people interested in this would
have to clone, but the worst thing is that typically you would only discover
that a patch fixed a different PR long after the patch was committed.  And
quite likely it wasn't even your patch.  We know that finding problems earlier
in the developer workflow reduces costs; if we can catch this before the
original developer commits & pushes the changes, it's cheaper, because the
developer already understands what the patch does.

A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently
by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in
the patch had this effect would probably have been very useful.  More testing
will lead to a better compiler.

Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after
it was reported by a different change.

Or another: https://gcc.gnu.org/91525 where the patch contained a test, but
that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

To alleviate some of these problems, I propose that we introduce a means to our
DejaGNU infrastructure that allows adding tests for old bugs that have not been
fixed yet, and re-introduce the keyword monitored (no longer used for anything
-- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is
tracked in the testsuite.  I don't want any unnecessary moving tests around, so
the tests would go where they would normally go; they have to be reduced and
have proper targets, etc.  Having such tests in the testsuite means that when
something changes, you will know immediately, before you push any changes.

My thinking is that for:

* rejects-valid: use the existing dg-xfail-if
* accepts-valid: use the new dg-accepts-invalid
* ICEs: use the new dg-ice

dg-ice can be used like this:

// { dg-ice "build_over_call" { target c++11 } }

and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the
ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,
you'll get an XPASS + FAIL.  Then you can close the old PR.

Similarly, dg-accepts-invalid:

// { dg-accepts-invalid "PR86500" }

means that if the test still compiles without errors, you'll get a quiet XFAIL.
If we start giving errors, you'll get an XPASS.

If the bug is fixed, simply remove the directive.

The patch implementing these new directives is appended.  Once/if this is
accepted, I can start adding the old tests we have in our Bugzilla.  (I'm
only concerned about the c++ component, if that wasn't already clear.)

The question is what makes the bug "old": is it one year without it being
assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for
every new PR, just the reasonably old ones that are useful/important.  Such
additions should be done in batches, so that we don't have dozens of commits,
each of them merely adding a single test.

We will still have a surfeit of bugs that we've given short shrift to, but
let's at least automate what we can.  The initial addition of the relevant
old(-ish) tests won't of course happen automagically, but it's a price I'm
willing to pay.  My goal here isn't merely to reduce the number of open PRs;
it is to improve the testing of the compiler overall.

Thoughts?

Bootstrapped/regtested on x86_64-pc-linux-gnu.

[PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

gcc/ChangeLog:

	* doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

gcc/testsuite/ChangeLog:

	* lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.
	* lib/prune.exp (prune_ices): New.
	* lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.
---
 gcc/doc/sourcebuild.texi                 | 19 +++++++
 gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-
 gcc/testsuite/lib/prune.exp              |  9 ++++
 gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++
 4 files changed, 134 insertions(+), 2 deletions(-)


base-commit: f3665bd1111c1799c0421490b5e655f977570354
-- 
2.26.2

Comments

David Malcolm via Gcc-patches July 29, 2020, 1:37 a.m. | #1
I'll punt to the the C++ front-end folks to chime in.  Usually we only check in bugs that are fixed, as they are fixed, this is what makes it a regression suite.  Doing this does have advantages, like, the testsuite is small and doesn't have duplicates and doesn't test anything that is known to fail that isn't an actual regression.

Ideally, it would be nice to have a way to test bugs out of bugzilla, and report on those fixed bugs as they are fixed.  If people want to keep the test suite a regression suite, then my counter proposal would be to have a branch with the non-regression bugs on it and then people can checkout and test that branch.  Most of the people don't (saving the testing time, which is handy), and then more sporadically, the old bugs branch can be test and BZ state can be moved along as bugs as fixed.  A run once a week or even once a month would seem to be plenty often.

You in general don't need to check in fixes from bug reports that have been fixed in the past, as those fixes generally already have a test case for the fix that went in with the fix.

As for how old is old, we'd leave that to the contributor of work to decide.  In theory, bugs can be added as soon as they come in, no need to wait.  To the extent that waiting saves work, well, that's a personal choice, for the person doing the work to choose.


Why not just use xfail and xpass?  Seems less work than doing a setup_xfail.  Also, why not just use the existing directives instead of adding new directives?  I'm suspicious of expect_ice and accepts_invalid.  You set them to 1 all the time, but almost never set them to 0?  I'm wondering if it should be more like shouldfail?


On Jul 28, 2020, at 2:44 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 

> In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In

> my experience, a good amount of them have already been fixed; my periodical

> sweeps always turn up a bunch of PRs that had already been fixed previously.

> Sometimes my sweeps are more or less random, but more often than not I'm just

> looking for duplicates of an existing PR.  Sometimes the reason the already

> fixed PRs are still open is because a PR that was fixed had duplicates that we

> didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a

> side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming

> because often you need to grab the test from the Bugzilla yet again (and

> sometimes there are multiple tests).  Even if you find a PR that was fixed, you

> still need to bisect the fix and perhaps add the test to our testsuite.  That's

> draining and since the number of bugs only increases, never decreases, it is not

> sustainable.

> 

> So I've started a personal repo where I've gathered dozens of tests and wrote a

> script that just compiles every test in the repo and reports if anything

> changed.  One line from it:

> 

> pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || echo -e "$pr: ${msg_ice}"

> 

> This has major drawbacks: you have to remember to run this manually, keep

> updating it, and it's yet another repo that people interested in this would

> have to clone, but the worst thing is that typically you would only discover

> that a patch fixed a different PR long after the patch was committed.  And

> quite likely it wasn't even your patch.  We know that finding problems earlier

> in the developer workflow reduces costs; if we can catch this before the

> original developer commits & pushes the changes, it's cheaper, because the

> developer already understands what the patch does.

> 

> A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently

> by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in

> the patch had this effect would probably have been very useful.  More testing

> will lead to a better compiler.

> 

> Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after

> it was reported by a different change.

> 

> Or another: https://gcc.gnu.org/91525 where the patch contained a test, but

> that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

> 

> To alleviate some of these problems, I propose that we introduce a means to our

> DejaGNU infrastructure that allows adding tests for old bugs that have not been

> fixed yet, and re-introduce the keyword monitored (no longer used for anything

> -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is

> tracked in the testsuite.  I don't want any unnecessary moving tests around, so

> the tests would go where they would normally go; they have to be reduced and

> have proper targets, etc.  Having such tests in the testsuite means that when

> something changes, you will know immediately, before you push any changes.

> 

> My thinking is that for:

> 

> * rejects-valid: use the existing dg-xfail-if

> * accepts-valid: use the new dg-accepts-invalid

> * ICEs: use the new dg-ice

> 

> dg-ice can be used like this:

> 

> // { dg-ice "build_over_call" { target c++11 } }

> 

> and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> you'll get an XPASS + FAIL.  Then you can close the old PR.

> 

> Similarly, dg-accepts-invalid:

> 

> // { dg-accepts-invalid "PR86500" }

> 

> means that if the test still compiles without errors, you'll get a quiet XFAIL.

> If we start giving errors, you'll get an XPASS.

> 

> If the bug is fixed, simply remove the directive.

> 

> The patch implementing these new directives is appended.  Once/if this is

> accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> only concerned about the c++ component, if that wasn't already clear.)

> 

> The question is what makes the bug "old": is it one year without it being

> assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for

> every new PR, just the reasonably old ones that are useful/important.  Such

> additions should be done in batches, so that we don't have dozens of commits,

> each of them merely adding a single test.

> 

> We will still have a surfeit of bugs that we've given short shrift to, but

> let's at least automate what we can.  The initial addition of the relevant

> old(-ish) tests won't of course happen automagically, but it's a price I'm

> willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> it is to improve the testing of the compiler overall.

> 

> Thoughts?

> 

> Bootstrapped/regtested on x86_64-pc-linux-gnu.

> 

> [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

> 

> gcc/ChangeLog:

> 

> 	* doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

> 

> gcc/testsuite/ChangeLog:

> 

> 	* lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.

> 	* lib/prune.exp (prune_ices): New.

> 	* lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.

> ---

> gcc/doc/sourcebuild.texi                 | 19 +++++++

> gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-

> gcc/testsuite/lib/prune.exp              |  9 ++++

> gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++

> 4 files changed, 134 insertions(+), 2 deletions(-)

> 

> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi

> index a7a922d84a2..636d21d30dd 100644

> --- a/gcc/doc/sourcebuild.texi

> +++ b/gcc/doc/sourcebuild.texi

> @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the conditions (which are

> the same as for @code{dg-skip-if}) are met.

> @end table

> 

> +@subsubsection Expect the compiler to crash

> +

> +@table @code

> +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> +Expect the compiler to crash with an internal compiler error and return

> +a nonzero exit status if the conditions (which are the same as for

> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> +fixed yet.

> +@end table

> +

> @subsubsection Expect the test executable to fail

> 

> @table @code

> @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.

> 

> @item @{ dg-prune-output @var{regexp} @}

> Prune messages matching @var{regexp} from the test output.

> +

> +@table @code

> +@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> +Expect the compiler to accept the test (even though it should be rejected with

> +a compile-time error), if the conditions (which are the same as for

> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> +fixed yet.

> +@end table

> +

> @end table

> 

> @subsubsection Verify output of the test executable

> diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp

> index 45d97024883..6478eda283b 100644

> --- a/gcc/testsuite/lib/gcc-dg.exp

> +++ b/gcc/testsuite/lib/gcc-dg.exp

> @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {

>     verbose "$target_compile $prog $output_file $compile_type $options" 4

>     set comp_output [$target_compile "$prog" "$output_file" "$compile_type" $options]

> 

> +    global expect_ice

>     # Look for an internal compiler error, which sometimes masks the fact

>     # that we didn't get an expected error message.  XFAIL an ICE via

>     # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }

>     # to avoid a second failure for excess errors.

> -    if [string match "*internal compiler error*" $comp_output] {

> +    # "Error reporting routines re-entered" ICE says "Internal" rather than

> +    # "internal", so match that too.

> +    if [string match {*[Ii]nternal compiler error*} $comp_output] {

> 	upvar 2 name name

> -	fail "$name (internal compiler error)"

> +	if { $expect_ice == 0 } {

> +	  fail "$name (internal compiler error)"

> +	} else {

> +	  # We expected an ICE and we got it.  Emit an XFAIL.

> +	  setup_xfail "*-*-*"

> +	  fail "$name (internal compiler error)"

> +	  clear_xfail "*-*-*"

> +	  # Prune the ICE from the output.

> +	  set comp_output [prune_ices $comp_output]

> +	}

> +    } else {

> +	upvar 2 name name

> +	global accepts_invalid

> +	if { $expect_ice == 1 } {

> +	  # We expected an ICE but we didn't get it.  We want an XPASS, so

> +	  # call setup_xfail to set xfail_flag.

> +	  setup_xfail "*-*-*"

> +	  pass "$name (internal compiler error)"

> +	  clear_xfail "*-*-*"

> +	} elseif { $accepts_invalid == 1 } {

> +	    if [string match {*error: *} $comp_output] {

> +	      # We expected that this test be (wrongly) accepted, but now we have

> +	      # seen error(s).  Issue an XPASS to signal that.

> +	      setup_xfail "*-*-*"

> +	      pass "$name (accepts invalid)"

> +	      clear_xfail "*-*-*"

> +	    } else {

> +	      # This test is still (wrongly) accepted.  Just emit an XFAIL.

> +	      setup_xfail "*-*-*"

> +	      fail "$name (accepts invalid)"

> +	      clear_xfail "*-*-*"

> +	    }

> +	}

>     }

> 

>     if { $do_what == "repo" } {

> diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp

> index 1c776249f1a..58a739684a5 100644

> --- a/gcc/testsuite/lib/prune.exp

> +++ b/gcc/testsuite/lib/prune.exp

> @@ -118,6 +118,15 @@ proc prune_file_path { text } {

>     return $text

> }

> 

> +# Prune internal compiler error messages, including the "Please submit..."

> +# footnote.

> +

> +proc prune_ices { text } {

> +  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for instructions\[^\n\]*" $text "" text

> +  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" $text "" text

> +  return $text

> +}

> +

> # Provide a definition of this if missing (delete after next dejagnu release).

> 

> if { [info procs prune_warnings] == "" } then {

> diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp

> index 2a21424b890..765f3a2e27a 100644

> --- a/gcc/testsuite/lib/target-supports-dg.exp

> +++ b/gcc/testsuite/lib/target-supports-dg.exp

> @@ -495,6 +495,75 @@ proc dg-shouldfail { args } {

>     }

> }

> 

> +# Record whether the compiler is expected (at the moment) to ICE.

> +# Used for tests that test bugs that have not been fixed yet.

> +

> +set expect_ice 0

> +set accepts_invalid 0

> +

> +proc dg-ice { args } {

> +    # Don't bother if we're already skipping the test.

> +    upvar dg-do-what dg-do-what

> +    if { [lindex ${dg-do-what} 1] == "N" } {

> +      return

> +    }

> +

> +    global accepts_invalid

> +    # Can't be combined with dg-accepts-invalid.

> +    if { $accepts_invalid == 1 } {

> +      error "dg-ice: cannot be combined with dg-accepts-invalid"

> +      return

> +    }

> +

> +    global expect_ice

> +

> +    set args [lreplace $args 0 0]

> +    if { [llength $args] > 1 } {

> +	set selector [list target [lindex $args 1]]

> +	if { [dg-process-target-1 $selector] == "S" } {

> +	    # The target matches, now check the flags.

> +	    if [check-flags $args] {

> +		set expect_ice 1

> +	    }

> +	}

> +    } else {

> +	set expect_ice 1

> +    }

> +}

> +

> +# Record whether the compiler should reject the testcase with an error,

> +# but currently doesn't do so.  Used for accepts-invalid bugs.

> +

> +proc dg-accepts-invalid { args } {

> +    # Don't bother if we're already skipping the test.

> +    upvar dg-do-what dg-do-what

> +    if { [lindex ${dg-do-what} 1] == "N" } {

> +      return

> +    }

> +

> +    global expect_ice

> +    # Can't be combined with dg-ice.

> +    if { $expect_ice == 1 } {

> +      error "dg-accepts-invalid: cannot be combined with dg-ice"

> +      return

> +    }

> +

> +    global accepts_invalid

> +

> +    set args [lreplace $args 0 0]

> +    if { [llength $args] > 1 } {

> +	set selector [list target [lindex $args 1]]

> +	if { [dg-process-target-1 $selector] == "S" } {

> +	    # The target matches, now check the flags.

> +	    if [check-flags $args] {

> +		set accepts_invalid 1

> +	    }

> +	}

> +    } else {

> +	set accepts_invalid 1

> +    }

> +}

> +

> # Intercept the call to the DejaGnu version of dg-process-target to

> # support use of an effective-target keyword in place of a list of

> # target triplets to xfail or skip a test.

> 

> base-commit: f3665bd1111c1799c0421490b5e655f977570354

> -- 

> 2.26.2

>
David Malcolm via Gcc-patches July 29, 2020, 3:02 a.m. | #2
On Tue, 2020-07-28 at 17:44 -0400, Marek Polacek via Gcc-patches wrote:
> In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In

> my experience, a good amount of them have already been fixed; my periodical

> sweeps always turn up a bunch of PRs that had already been fixed previously.

> Sometimes my sweeps are more or less random, but more often than not I'm just

> looking for duplicates of an existing PR.  Sometimes the reason the already

> fixed PRs are still open is because a PR that was fixed had duplicates that we

> didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a

> side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming

> because often you need to grab the test from the Bugzilla yet again (and

> sometimes there are multiple tests).  Even if you find a PR that was fixed, you

> still need to bisect the fix and perhaps add the test to our testsuite.  That's

> draining and since the number of bugs only increases, never decreases, it is not

> sustainable.

[ ... ]
Another approach is to add tests for unfixed bugs as XFAILs.  When we see the
test go from XFAIL to XPASS, then we know the bug got fixed.

Anyway, there's certainly room to do something here to make it easier to find
bugs we've already fixed.

jeff
>
Richard Sandiford July 29, 2020, 8:40 a.m. | #3
Thanks for doing this.  +1 for the best fix being to add XFAILing tests
to the main testsute, enabled by default.  I don't see any other realistic
way of ensuring that fixes are matched with PRs at the time that the fix
is made (rather than some time after the fact).

Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> […]

> My thinking is that for:

>

> * rejects-valid: use the existing dg-xfail-if

> * accepts-valid: use the new dg-accepts-invalid

> * ICEs: use the new dg-ice

>

> dg-ice can be used like this:

>

> // { dg-ice "build_over_call" { target c++11 } }

>

> and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> you'll get an XPASS + FAIL.  Then you can close the old PR.


This is long overdue IMO, thanks for adding it.

> Similarly, dg-accepts-invalid:

>

> // { dg-accepts-invalid "PR86500" }

>

> means that if the test still compiles without errors, you'll get a quiet XFAIL.

> If we start giving errors, you'll get an XPASS.

>

> If the bug is fixed, simply remove the directive.

>

> The patch implementing these new directives is appended.  Once/if this is

> accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> only concerned about the c++ component, if that wasn't already clear.)

>

> The question is what makes the bug "old": is it one year without it being

> assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for

> every new PR, just the reasonably old ones that are useful/important.  Such

> additions should be done in batches, so that we don't have dozens of commits,

> each of them merely adding a single test.


IMO it should be OK to add a testcase for any open PR, if someone think
it's useful, regardless of age and without being forced to batch the
commits.  I.e. I think it should come under the “obvious” rule and
people should just use their judgement about when it's appropriate.
Adding XFAILing tests shouldn't disturb anyone else very much.

I guess there's a possibility that some tests happen to pass already
on some targets.  That's more likely with middle-end and backend bugs
rather than frontend stuff though.  Perhaps for those it would make
sense to have a convention in which the failing testcase is restricted
(at the whole-test level) to the targets that the person committing the
testcase has actually tried.  Maybe with a comment on the dg-ice etc.
to remind people to reconsider the main target selector when un-XFAILing
the test.

Richard
David Malcolm via Gcc-patches July 29, 2020, 8:37 p.m. | #4
On Tue, Jul 28, 2020 at 5:45 PM Marek Polacek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> In Bugzilla, for the c++ component, we currently have over 3200 open

> bugs.  In

> my experience, a good amount of them have already been fixed; my periodical

> sweeps always turn up a bunch of PRs that had already been fixed

> previously.

> Sometimes my sweeps are more or less random, but more often than not I'm

> just

> looking for duplicates of an existing PR.  Sometimes the reason the already

> fixed PRs are still open is because a PR that was fixed had duplicates

> that we

> didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as

> a

> side-effect of fixing another PR.  Manual sweeps are tedious and

> time-consuming

> because often you need to grab the test from the Bugzilla yet again (and

> sometimes there are multiple tests).  Even if you find a PR that was

> fixed, you

> still need to bisect the fix and perhaps add the test to our testsuite.

> That's

> draining and since the number of bugs only increases, never decreases, it

> is not

> sustainable.

>

> So I've started a personal repo where I've gathered dozens of tests and

> wrote a

> script that just compiles every test in the repo and reports if anything

> changed.  One line from it:

>

> pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' ||

> echo -e "$pr: ${msg_ice}"

>

> This has major drawbacks: you have to remember to run this manually, keep

> updating it, and it's yet another repo that people interested in this would

> have to clone, but the worst thing is that typically you would only

> discover

> that a patch fixed a different PR long after the patch was committed.  And

> quite likely it wasn't even your patch.  We know that finding problems

> earlier

> in the developer workflow reduces costs; if we can catch this before the

> original developer commits & pushes the changes, it's cheaper, because the

> developer already understands what the patch does.

>

> A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently

> by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in

> the patch had this effect would probably have been very useful.  More

> testing

> will lead to a better compiler.

>

> Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after

> it was reported by a different change.

>

> Or another: https://gcc.gnu.org/91525 where the patch contained a test,

> but

> that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

>

> To alleviate some of these problems, I propose that we introduce a means

> to our

> DejaGNU infrastructure that allows adding tests for old bugs that have not

> been

> fixed yet, and re-introduce the keyword monitored (no longer used for

> anything

> -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is

> tracked in the testsuite.  I don't want any unnecessary moving tests

> around, so

> the tests would go where they would normally go; they have to be reduced

> and

> have proper targets, etc.  Having such tests in the testsuite means that

> when

> something changes, you will know immediately, before you push any changes.

>

> My thinking is that for:

>

> * rejects-valid: use the existing dg-xfail-if

>


Or dg-excess-errors, or xfailed dg-bogus.


> * accepts-valid: use the new dg-accepts-invalid

>


xfailed dg-error should cover this case.


> * ICEs: use the new dg-ice.

>


This seems like a good addition.


> dg-ice can be used like this:

>

> // { dg-ice "build_over_call" { target c++11 } }

>

> and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> you'll get an XPASS + FAIL.  Then you can close the old PR.

>

> Similarly, dg-accepts-invalid:

>

> // { dg-accepts-invalid "PR86500" }

>

> means that if the test still compiles without errors, you'll get a quiet

> XFAIL.

> If we start giving errors, you'll get an XPASS.

>

> If the bug is fixed, simply remove the directive.

>

> The patch implementing these new directives is appended.  Once/if this is

> accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> only concerned about the c++ component, if that wasn't already clear.)

>

> The question is what makes the bug "old": is it one year without it being

> assigned?  6 months?  3 months?  Note: I *don't* propose to add every test

> for

> every new PR, just the reasonably old ones that are useful/important.  Such

> additions should be done in batches, so that we don't have dozens of

> commits,

> each of them merely adding a single test.

>

> We will still have a surfeit of bugs that we've given short shrift to, but

> let's at least automate what we can.  The initial addition of the relevant

> old(-ish) tests won't of course happen automagically, but it's a price I'm

> willing to pay.  My goal here isn't merely to reduce the number of open

> PRs;

> it is to improve the testing of the compiler overall.

>

> Thoughts?

>

> Bootstrapped/regtested on x86_64-pc-linux-gnu.

>

> [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

>

> gcc/ChangeLog:

>

>         * doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

>

> gcc/testsuite/ChangeLog:

>

>         * lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and

> dg-accepts-invalid.

>         * lib/prune.exp (prune_ices): New.

>         * lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.

> ---

>  gcc/doc/sourcebuild.texi                 | 19 +++++++

>  gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-

>  gcc/testsuite/lib/prune.exp              |  9 ++++

>  gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++

>  4 files changed, 134 insertions(+), 2 deletions(-)

>

> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi

> index a7a922d84a2..636d21d30dd 100644

> --- a/gcc/doc/sourcebuild.texi

> +++ b/gcc/doc/sourcebuild.texi

> @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the

> conditions (which are

>  the same as for @code{dg-skip-if}) are met.

>  @end table

>

> +@subsubsection Expect the compiler to crash

> +

> +@table @code

> +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{

> @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> +Expect the compiler to crash with an internal compiler error and return

> +a nonzero exit status if the conditions (which are the same as for

> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not

> been

> +fixed yet.

> +@end table

> +

>  @subsubsection Expect the test executable to fail

>

>  @table @code

> @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.

>

>  @item @{ dg-prune-output @var{regexp} @}

>  Prune messages matching @var{regexp} from the test output.

> +

> +@table @code

> +@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{

> @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> +Expect the compiler to accept the test (even though it should be rejected

> with

> +a compile-time error), if the conditions (which are the same as for

> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not

> been

> +fixed yet.

> +@end table

> +

>  @end table

>

>  @subsubsection Verify output of the test executable

> diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp

> index 45d97024883..6478eda283b 100644

> --- a/gcc/testsuite/lib/gcc-dg.exp

> +++ b/gcc/testsuite/lib/gcc-dg.exp

> @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what

> extra_tool_flags } {

>      verbose "$target_compile $prog $output_file $compile_type $options" 4

>      set comp_output [$target_compile "$prog" "$output_file"

> "$compile_type" $options]

>

> +    global expect_ice

>      # Look for an internal compiler error, which sometimes masks the fact

>      # that we didn't get an expected error message.  XFAIL an ICE via

>      # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*"

> }

>      # to avoid a second failure for excess errors.

> -    if [string match "*internal compiler error*" $comp_output] {

> +    # "Error reporting routines re-entered" ICE says "Internal" rather

> than

> +    # "internal", so match that too.

> +    if [string match {*[Ii]nternal compiler error*} $comp_output] {

>         upvar 2 name name

> -       fail "$name (internal compiler error)"

> +       if { $expect_ice == 0 } {

> +         fail "$name (internal compiler error)"

> +       } else {

> +         # We expected an ICE and we got it.  Emit an XFAIL.

> +         setup_xfail "*-*-*"

> +         fail "$name (internal compiler error)"

> +         clear_xfail "*-*-*"

> +         # Prune the ICE from the output.

> +         set comp_output [prune_ices $comp_output]

> +       }

> +    } else {

> +       upvar 2 name name

> +       global accepts_invalid

> +       if { $expect_ice == 1 } {

> +         # We expected an ICE but we didn't get it.  We want an XPASS, so

> +         # call setup_xfail to set xfail_flag.

> +         setup_xfail "*-*-*"

> +         pass "$name (internal compiler error)"

> +         clear_xfail "*-*-*"

> +       } elseif { $accepts_invalid == 1 } {

> +           if [string match {*error: *} $comp_output] {

> +             # We expected that this test be (wrongly) accepted, but now

> we have

> +             # seen error(s).  Issue an XPASS to signal that.

> +             setup_xfail "*-*-*"

> +             pass "$name (accepts invalid)"

> +             clear_xfail "*-*-*"

> +           } else {

> +             # This test is still (wrongly) accepted.  Just emit an XFAIL.

> +             setup_xfail "*-*-*"

> +             fail "$name (accepts invalid)"

> +             clear_xfail "*-*-*"

> +           }

> +       }

>      }

>

>      if { $do_what == "repo" } {

> diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp

> index 1c776249f1a..58a739684a5 100644

> --- a/gcc/testsuite/lib/prune.exp

> +++ b/gcc/testsuite/lib/prune.exp

> @@ -118,6 +118,15 @@ proc prune_file_path { text } {

>      return $text

>  }

>

> +# Prune internal compiler error messages, including the "Please submit..."

> +# footnote.

> +

> +proc prune_ices { text } {

> +  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for

> instructions\[^\n\]*" $text "" text

> +  regsub -all "(^|\n|')*Internal compiler error:.*for

> instructions\[^\n\]*" $text "" text

> +  return $text

> +}

> +

>  # Provide a definition of this if missing (delete after next dejagnu

> release).

>

>  if { [info procs prune_warnings] == "" } then {

> diff --git a/gcc/testsuite/lib/target-supports-dg.exp

> b/gcc/testsuite/lib/target-supports-dg.exp

> index 2a21424b890..765f3a2e27a 100644

> --- a/gcc/testsuite/lib/target-supports-dg.exp

> +++ b/gcc/testsuite/lib/target-supports-dg.exp

> @@ -495,6 +495,75 @@ proc dg-shouldfail { args } {

>      }

>  }

>

> +# Record whether the compiler is expected (at the moment) to ICE.

> +# Used for tests that test bugs that have not been fixed yet.

> +

> +set expect_ice 0

> +set accepts_invalid 0

> +

> +proc dg-ice { args } {

> +    # Don't bother if we're already skipping the test.

> +    upvar dg-do-what dg-do-what

> +    if { [lindex ${dg-do-what} 1] == "N" } {

> +      return

> +    }

> +

> +    global accepts_invalid

> +    # Can't be combined with dg-accepts-invalid.

> +    if { $accepts_invalid == 1 } {

> +      error "dg-ice: cannot be combined with dg-accepts-invalid"

> +      return

> +    }

> +

> +    global expect_ice

> +

> +    set args [lreplace $args 0 0]

> +    if { [llength $args] > 1 } {

> +       set selector [list target [lindex $args 1]]

> +       if { [dg-process-target-1 $selector] == "S" } {

> +           # The target matches, now check the flags.

> +           if [check-flags $args] {

> +               set expect_ice 1

> +           }

> +       }

> +    } else {

> +       set expect_ice 1

> +    }

> +}

> +

> +# Record whether the compiler should reject the testcase with an error,

> +# but currently doesn't do so.  Used for accepts-invalid bugs.

> +

> +proc dg-accepts-invalid { args } {

> +    # Don't bother if we're already skipping the test.

> +    upvar dg-do-what dg-do-what

> +    if { [lindex ${dg-do-what} 1] == "N" } {

> +      return

> +    }

> +

> +    global expect_ice

> +    # Can't be combined with dg-ice.

> +    if { $expect_ice == 1 } {

> +      error "dg-accepts-invalid: cannot be combined with dg-ice"

> +      return

> +    }

> +

> +    global accepts_invalid

> +

> +    set args [lreplace $args 0 0]

> +    if { [llength $args] > 1 } {

> +       set selector [list target [lindex $args 1]]

> +       if { [dg-process-target-1 $selector] == "S" } {

> +           # The target matches, now check the flags.

> +           if [check-flags $args] {

> +               set accepts_invalid 1

> +           }

> +       }

> +    } else {

> +       set accepts_invalid 1

> +    }

> +}

> +

>  # Intercept the call to the DejaGnu version of dg-process-target to

>  # support use of an effective-target keyword in place of a list of

>  # target triplets to xfail or skip a test.

>

> base-commit: f3665bd1111c1799c0421490b5e655f977570354

> --

> 2.26.2

>

>
David Malcolm via Gcc-patches July 29, 2020, 10 p.m. | #5
I've created a much more rudimentary setup for myself to deal
with the same problem.  I copy tests from Bugzilla, sometimes
with tweaks, and compile them from time to time as I revisit
unresolved bugs.  I've also thought about adding those to
the test suite and marking them XFAIL but I don't think I've
actually done it more than a handful of times.  I was told
adding tests (passing or xfailing) is fine without approval.

I think your proposal to add tests for known failures is a good
idea.  I don't have much of an opinion about extending the test
harness to differentiate other kinds of failures (like ICEs) and
mark them as expected.  I'm not sure I understand the benefit
of adding directives like dg-accepts-invalid over using xfail.

Martin

On 7/28/20 3:44 PM, Marek Polacek via Gcc-patches wrote:
> In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In

> my experience, a good amount of them have already been fixed; my periodical

> sweeps always turn up a bunch of PRs that had already been fixed previously.

> Sometimes my sweeps are more or less random, but more often than not I'm just

> looking for duplicates of an existing PR.  Sometimes the reason the already

> fixed PRs are still open is because a PR that was fixed had duplicates that we

> didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a

> side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming

> because often you need to grab the test from the Bugzilla yet again (and

> sometimes there are multiple tests).  Even if you find a PR that was fixed, you

> still need to bisect the fix and perhaps add the test to our testsuite.  That's

> draining and since the number of bugs only increases, never decreases, it is not

> sustainable.

> 

> So I've started a personal repo where I've gathered dozens of tests and wrote a

> script that just compiles every test in the repo and reports if anything

> changed.  One line from it:

> 

> pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || echo -e "$pr: ${msg_ice}"

> 

> This has major drawbacks: you have to remember to run this manually, keep

> updating it, and it's yet another repo that people interested in this would

> have to clone, but the worst thing is that typically you would only discover

> that a patch fixed a different PR long after the patch was committed.  And

> quite likely it wasn't even your patch.  We know that finding problems earlier

> in the developer workflow reduces costs; if we can catch this before the

> original developer commits & pushes the changes, it's cheaper, because the

> developer already understands what the patch does.

> 

> A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently

> by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in

> the patch had this effect would probably have been very useful.  More testing

> will lead to a better compiler.

> 

> Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after

> it was reported by a different change.

> 

> Or another: https://gcc.gnu.org/91525 where the patch contained a test, but

> that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

> 

> To alleviate some of these problems, I propose that we introduce a means to our

> DejaGNU infrastructure that allows adding tests for old bugs that have not been

> fixed yet, and re-introduce the keyword monitored (no longer used for anything

> -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is

> tracked in the testsuite.  I don't want any unnecessary moving tests around, so

> the tests would go where they would normally go; they have to be reduced and

> have proper targets, etc.  Having such tests in the testsuite means that when

> something changes, you will know immediately, before you push any changes.

> 

> My thinking is that for:

> 

> * rejects-valid: use the existing dg-xfail-if

> * accepts-valid: use the new dg-accepts-invalid

> * ICEs: use the new dg-ice

> 

> dg-ice can be used like this:

> 

> // { dg-ice "build_over_call" { target c++11 } }

> 

> and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> you'll get an XPASS + FAIL.  Then you can close the old PR.

> 

> Similarly, dg-accepts-invalid:

> 

> // { dg-accepts-invalid "PR86500" }

> 

> means that if the test still compiles without errors, you'll get a quiet XFAIL.

> If we start giving errors, you'll get an XPASS.

> 

> If the bug is fixed, simply remove the directive.

> 

> The patch implementing these new directives is appended.  Once/if this is

> accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> only concerned about the c++ component, if that wasn't already clear.)

> 

> The question is what makes the bug "old": is it one year without it being

> assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for

> every new PR, just the reasonably old ones that are useful/important.  Such

> additions should be done in batches, so that we don't have dozens of commits,

> each of them merely adding a single test.

> 

> We will still have a surfeit of bugs that we've given short shrift to, but

> let's at least automate what we can.  The initial addition of the relevant

> old(-ish) tests won't of course happen automagically, but it's a price I'm

> willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> it is to improve the testing of the compiler overall.

> 

> Thoughts?

> 

> Bootstrapped/regtested on x86_64-pc-linux-gnu.

> 

> [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

> 

> gcc/ChangeLog:

> 

> 	* doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

> 

> gcc/testsuite/ChangeLog:

> 

> 	* lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.

> 	* lib/prune.exp (prune_ices): New.

> 	* lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.

> ---

>   gcc/doc/sourcebuild.texi                 | 19 +++++++

>   gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-

>   gcc/testsuite/lib/prune.exp              |  9 ++++

>   gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++

>   4 files changed, 134 insertions(+), 2 deletions(-)

> 

> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi

> index a7a922d84a2..636d21d30dd 100644

> --- a/gcc/doc/sourcebuild.texi

> +++ b/gcc/doc/sourcebuild.texi

> @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the conditions (which are

>   the same as for @code{dg-skip-if}) are met.

>   @end table

>   

> +@subsubsection Expect the compiler to crash

> +

> +@table @code

> +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> +Expect the compiler to crash with an internal compiler error and return

> +a nonzero exit status if the conditions (which are the same as for

> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> +fixed yet.

> +@end table

> +

>   @subsubsection Expect the test executable to fail

>   

>   @table @code

> @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.

>   

>   @item @{ dg-prune-output @var{regexp} @}

>   Prune messages matching @var{regexp} from the test output.

> +

> +@table @code

> +@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> +Expect the compiler to accept the test (even though it should be rejected with

> +a compile-time error), if the conditions (which are the same as for

> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> +fixed yet.

> +@end table

> +

>   @end table

>   

>   @subsubsection Verify output of the test executable

> diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp

> index 45d97024883..6478eda283b 100644

> --- a/gcc/testsuite/lib/gcc-dg.exp

> +++ b/gcc/testsuite/lib/gcc-dg.exp

> @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {

>       verbose "$target_compile $prog $output_file $compile_type $options" 4

>       set comp_output [$target_compile "$prog" "$output_file" "$compile_type" $options]

>   

> +    global expect_ice

>       # Look for an internal compiler error, which sometimes masks the fact

>       # that we didn't get an expected error message.  XFAIL an ICE via

>       # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }

>       # to avoid a second failure for excess errors.

> -    if [string match "*internal compiler error*" $comp_output] {

> +    # "Error reporting routines re-entered" ICE says "Internal" rather than

> +    # "internal", so match that too.

> +    if [string match {*[Ii]nternal compiler error*} $comp_output] {

>   	upvar 2 name name

> -	fail "$name (internal compiler error)"

> +	if { $expect_ice == 0 } {

> +	  fail "$name (internal compiler error)"

> +	} else {

> +	  # We expected an ICE and we got it.  Emit an XFAIL.

> +	  setup_xfail "*-*-*"

> +	  fail "$name (internal compiler error)"

> +	  clear_xfail "*-*-*"

> +	  # Prune the ICE from the output.

> +	  set comp_output [prune_ices $comp_output]

> +	}

> +    } else {

> +	upvar 2 name name

> +	global accepts_invalid

> +	if { $expect_ice == 1 } {

> +	  # We expected an ICE but we didn't get it.  We want an XPASS, so

> +	  # call setup_xfail to set xfail_flag.

> +	  setup_xfail "*-*-*"

> +	  pass "$name (internal compiler error)"

> +	  clear_xfail "*-*-*"

> +	} elseif { $accepts_invalid == 1 } {

> +	    if [string match {*error: *} $comp_output] {

> +	      # We expected that this test be (wrongly) accepted, but now we have

> +	      # seen error(s).  Issue an XPASS to signal that.

> +	      setup_xfail "*-*-*"

> +	      pass "$name (accepts invalid)"

> +	      clear_xfail "*-*-*"

> +	    } else {

> +	      # This test is still (wrongly) accepted.  Just emit an XFAIL.

> +	      setup_xfail "*-*-*"

> +	      fail "$name (accepts invalid)"

> +	      clear_xfail "*-*-*"

> +	    }

> +	}

>       }

>   

>       if { $do_what == "repo" } {

> diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp

> index 1c776249f1a..58a739684a5 100644

> --- a/gcc/testsuite/lib/prune.exp

> +++ b/gcc/testsuite/lib/prune.exp

> @@ -118,6 +118,15 @@ proc prune_file_path { text } {

>       return $text

>   }

>   

> +# Prune internal compiler error messages, including the "Please submit..."

> +# footnote.

> +

> +proc prune_ices { text } {

> +  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for instructions\[^\n\]*" $text "" text

> +  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" $text "" text

> +  return $text

> +}

> +

>   # Provide a definition of this if missing (delete after next dejagnu release).

>   

>   if { [info procs prune_warnings] == "" } then {

> diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp

> index 2a21424b890..765f3a2e27a 100644

> --- a/gcc/testsuite/lib/target-supports-dg.exp

> +++ b/gcc/testsuite/lib/target-supports-dg.exp

> @@ -495,6 +495,75 @@ proc dg-shouldfail { args } {

>       }

>   }

>   

> +# Record whether the compiler is expected (at the moment) to ICE.

> +# Used for tests that test bugs that have not been fixed yet.

> +

> +set expect_ice 0

> +set accepts_invalid 0

> +

> +proc dg-ice { args } {

> +    # Don't bother if we're already skipping the test.

> +    upvar dg-do-what dg-do-what

> +    if { [lindex ${dg-do-what} 1] == "N" } {

> +      return

> +    }

> +

> +    global accepts_invalid

> +    # Can't be combined with dg-accepts-invalid.

> +    if { $accepts_invalid == 1 } {

> +      error "dg-ice: cannot be combined with dg-accepts-invalid"

> +      return

> +    }

> +

> +    global expect_ice

> +

> +    set args [lreplace $args 0 0]

> +    if { [llength $args] > 1 } {

> +	set selector [list target [lindex $args 1]]

> +	if { [dg-process-target-1 $selector] == "S" } {

> +	    # The target matches, now check the flags.

> +	    if [check-flags $args] {

> +		set expect_ice 1

> +	    }

> +	}

> +    } else {

> +	set expect_ice 1

> +    }

> +}

> +

> +# Record whether the compiler should reject the testcase with an error,

> +# but currently doesn't do so.  Used for accepts-invalid bugs.

> +

> +proc dg-accepts-invalid { args } {

> +    # Don't bother if we're already skipping the test.

> +    upvar dg-do-what dg-do-what

> +    if { [lindex ${dg-do-what} 1] == "N" } {

> +      return

> +    }

> +

> +    global expect_ice

> +    # Can't be combined with dg-ice.

> +    if { $expect_ice == 1 } {

> +      error "dg-accepts-invalid: cannot be combined with dg-ice"

> +      return

> +    }

> +

> +    global accepts_invalid

> +

> +    set args [lreplace $args 0 0]

> +    if { [llength $args] > 1 } {

> +	set selector [list target [lindex $args 1]]

> +	if { [dg-process-target-1 $selector] == "S" } {

> +	    # The target matches, now check the flags.

> +	    if [check-flags $args] {

> +		set accepts_invalid 1

> +	    }

> +	}

> +    } else {

> +	set accepts_invalid 1

> +    }

> +}

> +

>   # Intercept the call to the DejaGnu version of dg-process-target to

>   # support use of an effective-target keyword in place of a list of

>   # target triplets to xfail or skip a test.

> 

> base-commit: f3665bd1111c1799c0421490b5e655f977570354

>
Martin Liška July 30, 2020, 9:08 a.m. | #6
Hello.

I support the initiative!
What would be nice to add leading 'PR component/12345'
to a git commit so that these test additions are linked to bugzilla issues.

Martin
David Malcolm via Gcc-patches July 30, 2020, 9:54 a.m. | #7
On Tue, Jul 28, 2020 at 05:44:47PM -0400, Marek Polacek via Gcc-patches wrote:
> We will still have a surfeit of bugs that we've given short shrift to, but

> let's at least automate what we can.  The initial addition of the relevant

> old(-ish) tests won't of course happen automagically, but it's a price I'm

> willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> it is to improve the testing of the compiler overall.

> 

> Thoughts?


Looks useful to me, but I'd think it might be desirable to use separate
directories for those tests, so that it is more obvious that it is a
different category of tests.  Now that we use git, just using git mv
to move them to another place once they are fixed for good (together with
some dg-* directive tweaks) wouldn't be that much work later.

So having gcc.dg/unfixed/ , g++.dg/unfixed/ , c-c++-common/unfixed/
and their torture/ suffixed variants (or better directory name for those)?

	Jakub
David Malcolm via Gcc-patches Aug. 4, 2020, 9:34 p.m. | #8
Hi Mike,

thanks for your comments.

On Tue, Jul 28, 2020 at 06:37:26PM -0700, Mike Stump via Gcc-patches wrote:
> I'll punt to the the C++ front-end folks to chime in.  Usually we only check in bugs that are fixed, as they are fixed, this is what makes it a regression suite.  Doing this does have advantages, like, the testsuite is small and doesn't have duplicates and doesn't test anything that is known to fail that isn't an actual regression.


We also add sanity tests for new language features, for example.  I also like
adding (small) tests for DRs that happen to be already fixed to insure that the
compiler continues to behave as expected.  Avoiding duplicates is a good thing,
obviously, but, with this new scheme, if you fix something and while testing
the fix you find that we already have a test, you can choose not to introduce
a new test.  Whether or not a compiler bug is a regression shouldn't, IMHO, be
a criterion for (not) including the test.

tree(1) in testsuite/g++.dg says 70 directories, 13406 files.  If I go wild and
add 200 new C++ tests, that's ~1.5% increase in the number of tests.  That seems
reasonable.  If it still causes grief for some people, and we go with the idea
of using an unfixed/ directory, we could add GCC_TESTSUITE_TEST_UNFIXED envvar
to enable or disable running such tests.

> Ideally, it would be nice to have a way to test bugs out of bugzilla, and report on those fixed bugs as they are fixed.  If people want to keep the test suite a regression suite, then my counter proposal would be to have a branch with the non-regression bugs on it and then people can checkout and test that branch.  Most of the people don't (saving the testing time, which is handy), and then more sporadically, the old bugs branch can be test and BZ state can be moved along as bugs as fixed.  A run once a week or even once a month would seem to be plenty often.


I'm afraid this would defeat the point of this proposal.  If the additional
tests are only available on a branch, no one is going to use it.  Quite
frankly, I'd just stick with my personal repo (where I can do whatever I want,
include test with #includes, unreduced tests, ...) rather than to bother with
rebasing a branch etc.  Even if someone would actually use such a branch from
time to time, we'd lose the benefit of "left shifting" bugs in the developer
workflow -- you would only notice that your patch had changed something days
or weeks after the commit at which point you likely have lost state on it.

> You in general don't need to check in fixes from bug reports that have been fixed in the past, as those fixes generally already have a test case for the fix that went in with the fix.


I agree, but experience shows that's not always the case.  Just the few PRs
I mentioned in my original mail prove that.  There are patches that change
something as a side-effect only, and, as a developer, you want to be aware
of those side-effects.

> As for how old is old, we'd leave that to the contributor of work to decide.  In theory, bugs can be added as soon as they come in, no need to wait.  To the extent that waiting saves work, well, that's a personal choice, for the person doing the work to choose.


I agree.  I'd leave it up to developers.  Just... be reasonable.  No need to
add every broken nonsense test lurking in Bugzilla.

> Why not just use xfail and xpass?  Seems less work than doing a setup_xfail.  Also, why not just use the existing directives instead of adding new directives?  I'm suspicious of expect_ice and accepts_invalid.  You set them to 1 all the time, but almost never set them to 0?  I'm wondering if it should be more like shouldfail?


Good point, I missed that I could use xfail and xpass.  Fixed.

For accepts_invalid I think we *could* use the existing directives, if you know
where to expect an error.  But for ICEs we currently have nothing that would
work well.  I'll probably drop dg-accepts-invalid completely.

I followed shouldfail's suit when it comes to setting them to 1, so that should
be fine.

Thanks,
Marek

> On Jul 28, 2020, at 2:44 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> > 

> > In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In

> > my experience, a good amount of them have already been fixed; my periodical

> > sweeps always turn up a bunch of PRs that had already been fixed previously.

> > Sometimes my sweeps are more or less random, but more often than not I'm just

> > looking for duplicates of an existing PR.  Sometimes the reason the already

> > fixed PRs are still open is because a PR that was fixed had duplicates that we

> > didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a

> > side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming

> > because often you need to grab the test from the Bugzilla yet again (and

> > sometimes there are multiple tests).  Even if you find a PR that was fixed, you

> > still need to bisect the fix and perhaps add the test to our testsuite.  That's

> > draining and since the number of bugs only increases, never decreases, it is not

> > sustainable.

> > 

> > So I've started a personal repo where I've gathered dozens of tests and wrote a

> > script that just compiles every test in the repo and reports if anything

> > changed.  One line from it:

> > 

> > pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || echo -e "$pr: ${msg_ice}"

> > 

> > This has major drawbacks: you have to remember to run this manually, keep

> > updating it, and it's yet another repo that people interested in this would

> > have to clone, but the worst thing is that typically you would only discover

> > that a patch fixed a different PR long after the patch was committed.  And

> > quite likely it wasn't even your patch.  We know that finding problems earlier

> > in the developer workflow reduces costs; if we can catch this before the

> > original developer commits & pushes the changes, it's cheaper, because the

> > developer already understands what the patch does.

> > 

> > A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently

> > by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in

> > the patch had this effect would probably have been very useful.  More testing

> > will lead to a better compiler.

> > 

> > Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after

> > it was reported by a different change.

> > 

> > Or another: https://gcc.gnu.org/91525 where the patch contained a test, but

> > that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

> > 

> > To alleviate some of these problems, I propose that we introduce a means to our

> > DejaGNU infrastructure that allows adding tests for old bugs that have not been

> > fixed yet, and re-introduce the keyword monitored (no longer used for anything

> > -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is

> > tracked in the testsuite.  I don't want any unnecessary moving tests around, so

> > the tests would go where they would normally go; they have to be reduced and

> > have proper targets, etc.  Having such tests in the testsuite means that when

> > something changes, you will know immediately, before you push any changes.

> > 

> > My thinking is that for:

> > 

> > * rejects-valid: use the existing dg-xfail-if

> > * accepts-valid: use the new dg-accepts-invalid

> > * ICEs: use the new dg-ice

> > 

> > dg-ice can be used like this:

> > 

> > // { dg-ice "build_over_call" { target c++11 } }

> > 

> > and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> > ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> > you'll get an XPASS + FAIL.  Then you can close the old PR.

> > 

> > Similarly, dg-accepts-invalid:

> > 

> > // { dg-accepts-invalid "PR86500" }

> > 

> > means that if the test still compiles without errors, you'll get a quiet XFAIL.

> > If we start giving errors, you'll get an XPASS.

> > 

> > If the bug is fixed, simply remove the directive.

> > 

> > The patch implementing these new directives is appended.  Once/if this is

> > accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> > only concerned about the c++ component, if that wasn't already clear.)

> > 

> > The question is what makes the bug "old": is it one year without it being

> > assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for

> > every new PR, just the reasonably old ones that are useful/important.  Such

> > additions should be done in batches, so that we don't have dozens of commits,

> > each of them merely adding a single test.

> > 

> > We will still have a surfeit of bugs that we've given short shrift to, but

> > let's at least automate what we can.  The initial addition of the relevant

> > old(-ish) tests won't of course happen automagically, but it's a price I'm

> > willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> > it is to improve the testing of the compiler overall.

> > 

> > Thoughts?

> > 

> > Bootstrapped/regtested on x86_64-pc-linux-gnu.

> > 

> > [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

> > 

> > gcc/ChangeLog:

> > 

> > 	* doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

> > 

> > gcc/testsuite/ChangeLog:

> > 

> > 	* lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.

> > 	* lib/prune.exp (prune_ices): New.

> > 	* lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.

> > ---

> > gcc/doc/sourcebuild.texi                 | 19 +++++++

> > gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-

> > gcc/testsuite/lib/prune.exp              |  9 ++++

> > gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++

> > 4 files changed, 134 insertions(+), 2 deletions(-)

> > 

> > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi

> > index a7a922d84a2..636d21d30dd 100644

> > --- a/gcc/doc/sourcebuild.texi

> > +++ b/gcc/doc/sourcebuild.texi

> > @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the conditions (which are

> > the same as for @code{dg-skip-if}) are met.

> > @end table

> > 

> > +@subsubsection Expect the compiler to crash

> > +

> > +@table @code

> > +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> > +Expect the compiler to crash with an internal compiler error and return

> > +a nonzero exit status if the conditions (which are the same as for

> > +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> > +fixed yet.

> > +@end table

> > +

> > @subsubsection Expect the test executable to fail

> > 

> > @table @code

> > @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.

> > 

> > @item @{ dg-prune-output @var{regexp} @}

> > Prune messages matching @var{regexp} from the test output.

> > +

> > +@table @code

> > +@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> > +Expect the compiler to accept the test (even though it should be rejected with

> > +a compile-time error), if the conditions (which are the same as for

> > +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> > +fixed yet.

> > +@end table

> > +

> > @end table

> > 

> > @subsubsection Verify output of the test executable

> > diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp

> > index 45d97024883..6478eda283b 100644

> > --- a/gcc/testsuite/lib/gcc-dg.exp

> > +++ b/gcc/testsuite/lib/gcc-dg.exp

> > @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {

> >     verbose "$target_compile $prog $output_file $compile_type $options" 4

> >     set comp_output [$target_compile "$prog" "$output_file" "$compile_type" $options]

> > 

> > +    global expect_ice

> >     # Look for an internal compiler error, which sometimes masks the fact

> >     # that we didn't get an expected error message.  XFAIL an ICE via

> >     # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }

> >     # to avoid a second failure for excess errors.

> > -    if [string match "*internal compiler error*" $comp_output] {

> > +    # "Error reporting routines re-entered" ICE says "Internal" rather than

> > +    # "internal", so match that too.

> > +    if [string match {*[Ii]nternal compiler error*} $comp_output] {

> > 	upvar 2 name name

> > -	fail "$name (internal compiler error)"

> > +	if { $expect_ice == 0 } {

> > +	  fail "$name (internal compiler error)"

> > +	} else {

> > +	  # We expected an ICE and we got it.  Emit an XFAIL.

> > +	  setup_xfail "*-*-*"

> > +	  fail "$name (internal compiler error)"

> > +	  clear_xfail "*-*-*"

> > +	  # Prune the ICE from the output.

> > +	  set comp_output [prune_ices $comp_output]

> > +	}

> > +    } else {

> > +	upvar 2 name name

> > +	global accepts_invalid

> > +	if { $expect_ice == 1 } {

> > +	  # We expected an ICE but we didn't get it.  We want an XPASS, so

> > +	  # call setup_xfail to set xfail_flag.

> > +	  setup_xfail "*-*-*"

> > +	  pass "$name (internal compiler error)"

> > +	  clear_xfail "*-*-*"

> > +	} elseif { $accepts_invalid == 1 } {

> > +	    if [string match {*error: *} $comp_output] {

> > +	      # We expected that this test be (wrongly) accepted, but now we have

> > +	      # seen error(s).  Issue an XPASS to signal that.

> > +	      setup_xfail "*-*-*"

> > +	      pass "$name (accepts invalid)"

> > +	      clear_xfail "*-*-*"

> > +	    } else {

> > +	      # This test is still (wrongly) accepted.  Just emit an XFAIL.

> > +	      setup_xfail "*-*-*"

> > +	      fail "$name (accepts invalid)"

> > +	      clear_xfail "*-*-*"

> > +	    }

> > +	}

> >     }

> > 

> >     if { $do_what == "repo" } {

> > diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp

> > index 1c776249f1a..58a739684a5 100644

> > --- a/gcc/testsuite/lib/prune.exp

> > +++ b/gcc/testsuite/lib/prune.exp

> > @@ -118,6 +118,15 @@ proc prune_file_path { text } {

> >     return $text

> > }

> > 

> > +# Prune internal compiler error messages, including the "Please submit..."

> > +# footnote.

> > +

> > +proc prune_ices { text } {

> > +  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for instructions\[^\n\]*" $text "" text

> > +  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" $text "" text

> > +  return $text

> > +}

> > +

> > # Provide a definition of this if missing (delete after next dejagnu release).

> > 

> > if { [info procs prune_warnings] == "" } then {

> > diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp

> > index 2a21424b890..765f3a2e27a 100644

> > --- a/gcc/testsuite/lib/target-supports-dg.exp

> > +++ b/gcc/testsuite/lib/target-supports-dg.exp

> > @@ -495,6 +495,75 @@ proc dg-shouldfail { args } {

> >     }

> > }

> > 

> > +# Record whether the compiler is expected (at the moment) to ICE.

> > +# Used for tests that test bugs that have not been fixed yet.

> > +

> > +set expect_ice 0

> > +set accepts_invalid 0

> > +

> > +proc dg-ice { args } {

> > +    # Don't bother if we're already skipping the test.

> > +    upvar dg-do-what dg-do-what

> > +    if { [lindex ${dg-do-what} 1] == "N" } {

> > +      return

> > +    }

> > +

> > +    global accepts_invalid

> > +    # Can't be combined with dg-accepts-invalid.

> > +    if { $accepts_invalid == 1 } {

> > +      error "dg-ice: cannot be combined with dg-accepts-invalid"

> > +      return

> > +    }

> > +

> > +    global expect_ice

> > +

> > +    set args [lreplace $args 0 0]

> > +    if { [llength $args] > 1 } {

> > +	set selector [list target [lindex $args 1]]

> > +	if { [dg-process-target-1 $selector] == "S" } {

> > +	    # The target matches, now check the flags.

> > +	    if [check-flags $args] {

> > +		set expect_ice 1

> > +	    }

> > +	}

> > +    } else {

> > +	set expect_ice 1

> > +    }

> > +}

> > +

> > +# Record whether the compiler should reject the testcase with an error,

> > +# but currently doesn't do so.  Used for accepts-invalid bugs.

> > +

> > +proc dg-accepts-invalid { args } {

> > +    # Don't bother if we're already skipping the test.

> > +    upvar dg-do-what dg-do-what

> > +    if { [lindex ${dg-do-what} 1] == "N" } {

> > +      return

> > +    }

> > +

> > +    global expect_ice

> > +    # Can't be combined with dg-ice.

> > +    if { $expect_ice == 1 } {

> > +      error "dg-accepts-invalid: cannot be combined with dg-ice"

> > +      return

> > +    }

> > +

> > +    global accepts_invalid

> > +

> > +    set args [lreplace $args 0 0]

> > +    if { [llength $args] > 1 } {

> > +	set selector [list target [lindex $args 1]]

> > +	if { [dg-process-target-1 $selector] == "S" } {

> > +	    # The target matches, now check the flags.

> > +	    if [check-flags $args] {

> > +		set accepts_invalid 1

> > +	    }

> > +	}

> > +    } else {

> > +	set accepts_invalid 1

> > +    }

> > +}

> > +

> > # Intercept the call to the DejaGnu version of dg-process-target to

> > # support use of an effective-target keyword in place of a list of

> > # target triplets to xfail or skip a test.

> > 

> > base-commit: f3665bd1111c1799c0421490b5e655f977570354

> > -- 

> > 2.26.2

> >
David Malcolm via Gcc-patches Aug. 4, 2020, 9:37 p.m. | #9
On Tue, Jul 28, 2020 at 09:02:17PM -0600, Jeff Law wrote:
> On Tue, 2020-07-28 at 17:44 -0400, Marek Polacek via Gcc-patches wrote:

> > In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In

> > my experience, a good amount of them have already been fixed; my periodical

> > sweeps always turn up a bunch of PRs that had already been fixed previously.

> > Sometimes my sweeps are more or less random, but more often than not I'm just

> > looking for duplicates of an existing PR.  Sometimes the reason the already

> > fixed PRs are still open is because a PR that was fixed had duplicates that we

> > didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a

> > side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming

> > because often you need to grab the test from the Bugzilla yet again (and

> > sometimes there are multiple tests).  Even if you find a PR that was fixed, you

> > still need to bisect the fix and perhaps add the test to our testsuite.  That's

> > draining and since the number of bugs only increases, never decreases, it is not

> > sustainable.

> [ ... ]

> Another approach is to add tests for unfixed bugs as XFAILs.  When we see the

> test go from XFAIL to XPASS, then we know the bug got fixed.


That's the plan precisely.  XFAILs won't show up in your test summary so won't
clutter the output, whereas XPASSs will be noticeable.

> Anyway, there's certainly room to do something here to make it easier to find

> bugs we've already fixed.


Right, I don't expect that people will start adding unfixed tests by the
hundred.  ;-)

Marek
David Malcolm via Gcc-patches Aug. 4, 2020, 9:47 p.m. | #10
On Wed, Jul 29, 2020 at 09:40:35AM +0100, Richard Sandiford wrote:
> Thanks for doing this.  +1 for the best fix being to add XFAILing tests

> to the main testsute, enabled by default.  I don't see any other realistic

> way of ensuring that fixes are matched with PRs at the time that the fix

> is made (rather than some time after the fact).


I appreciate the feedback!

> Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> > […]

> > My thinking is that for:

> >

> > * rejects-valid: use the existing dg-xfail-if

> > * accepts-valid: use the new dg-accepts-invalid

> > * ICEs: use the new dg-ice

> >

> > dg-ice can be used like this:

> >

> > // { dg-ice "build_over_call" { target c++11 } }

> >

> > and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> > ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> > you'll get an XPASS + FAIL.  Then you can close the old PR.

> 

> This is long overdue IMO, thanks for adding it.

> 

> > Similarly, dg-accepts-invalid:

> >

> > // { dg-accepts-invalid "PR86500" }

> >

> > means that if the test still compiles without errors, you'll get a quiet XFAIL.

> > If we start giving errors, you'll get an XPASS.

> >

> > If the bug is fixed, simply remove the directive.

> >

> > The patch implementing these new directives is appended.  Once/if this is

> > accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> > only concerned about the c++ component, if that wasn't already clear.)

> >

> > The question is what makes the bug "old": is it one year without it being

> > assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for

> > every new PR, just the reasonably old ones that are useful/important.  Such

> > additions should be done in batches, so that we don't have dozens of commits,

> > each of them merely adding a single test.

> 

> IMO it should be OK to add a testcase for any open PR, if someone think

> it's useful, regardless of age and without being forced to batch the

> commits.  I.e. I think it should come under the “obvious” rule and

> people should just use their judgement about when it's appropriate.

> Adding XFAILing tests shouldn't disturb anyone else very much.


Sounds good.  I do think it should be left up to developers, and that such
tests can go in under the obvious rule -- you can hardly break stuff.

My point about the batches was that if you know you're going to add 10 tests,
it's better to add them in 1 squashed commit rather than 10 separate commits.

> I guess there's a possibility that some tests happen to pass already

> on some targets.  That's more likely with middle-end and backend bugs

> rather than frontend stuff though.  Perhaps for those it would make

> sense to have a convention in which the failing testcase is restricted

> (at the whole-test level) to the targets that the person committing the

> testcase has actually tried.  Maybe with a comment on the dg-ice etc.

> to remind people to reconsider the main target selector when un-XFAILing

> the test.


Interesting point.  With my frontend hat on, I hadn't really thought of
this much, but the dg-ice directive allows you to specify the targets and
specific options when to expect an ICE.  So you could run a test everywhere
but only expect an ICE on aarch64.

Thanks,
Marek
David Malcolm via Gcc-patches Aug. 4, 2020, 10:08 p.m. | #11
On Wed, Jul 29, 2020 at 04:37:03PM -0400, Jason Merrill wrote:
> On Tue, Jul 28, 2020 at 5:45 PM Marek Polacek via Gcc-patches <

> gcc-patches@gcc.gnu.org> wrote:

> 

> > In Bugzilla, for the c++ component, we currently have over 3200 open

> > bugs.  In

> > my experience, a good amount of them have already been fixed; my periodical

> > sweeps always turn up a bunch of PRs that had already been fixed

> > previously.

> > Sometimes my sweeps are more or less random, but more often than not I'm

> > just

> > looking for duplicates of an existing PR.  Sometimes the reason the already

> > fixed PRs are still open is because a PR that was fixed had duplicates

> > that we

> > didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as

> > a

> > side-effect of fixing another PR.  Manual sweeps are tedious and

> > time-consuming

> > because often you need to grab the test from the Bugzilla yet again (and

> > sometimes there are multiple tests).  Even if you find a PR that was

> > fixed, you

> > still need to bisect the fix and perhaps add the test to our testsuite.

> > That's

> > draining and since the number of bugs only increases, never decreases, it

> > is not

> > sustainable.

> >

> > So I've started a personal repo where I've gathered dozens of tests and

> > wrote a

> > script that just compiles every test in the repo and reports if anything

> > changed.  One line from it:

> >

> > pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' ||

> > echo -e "$pr: ${msg_ice}"

> >

> > This has major drawbacks: you have to remember to run this manually, keep

> > updating it, and it's yet another repo that people interested in this would

> > have to clone, but the worst thing is that typically you would only

> > discover

> > that a patch fixed a different PR long after the patch was committed.  And

> > quite likely it wasn't even your patch.  We know that finding problems

> > earlier

> > in the developer workflow reduces costs; if we can catch this before the

> > original developer commits & pushes the changes, it's cheaper, because the

> > developer already understands what the patch does.

> >

> > A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently

> > by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in

> > the patch had this effect would probably have been very useful.  More

> > testing

> > will lead to a better compiler.

> >

> > Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after

> > it was reported by a different change.

> >

> > Or another: https://gcc.gnu.org/91525 where the patch contained a test,

> > but

> > that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

> >

> > To alleviate some of these problems, I propose that we introduce a means

> > to our

> > DejaGNU infrastructure that allows adding tests for old bugs that have not

> > been

> > fixed yet, and re-introduce the keyword monitored (no longer used for

> > anything

> > -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is

> > tracked in the testsuite.  I don't want any unnecessary moving tests

> > around, so

> > the tests would go where they would normally go; they have to be reduced

> > and

> > have proper targets, etc.  Having such tests in the testsuite means that

> > when

> > something changes, you will know immediately, before you push any changes.

> >

> > My thinking is that for:

> >

> > * rejects-valid: use the existing dg-xfail-if

> >

> 

> Or dg-excess-errors, or xfailed dg-bogus.


Yeah, whatever works.  With e.g.

int i = ...;  // { dg-xfail-if "" { *-*-* } }

one gets expected failures when we emit an error, and unexpected successes when
we stop emitting and error there.

> > * accepts-valid: use the new dg-accepts-invalid


(Surely I meant accepts-*in*valid here.)

> >

> 

> xfailed dg-error should cover this case.


That works well if you know where to expect an error.  But if you don't, it's
worse.  E.g.,

// { dg-xfail-if "" { *-*-* } }
int i = nothere; // demonstrates something that errors out
// { dg-error "" } didn't know where to put this

only prints unexpected failures, but no unexpected successes.  I guess that's
OK though, at least for now, so I'll drop dg-accepts-invalid.

> > * ICEs: use the new dg-ice.

> >

> 

> This seems like a good addition.


Thanks!

Marek
David Malcolm via Gcc-patches Aug. 4, 2020, 10:16 p.m. | #12
On Wed, Jul 29, 2020 at 04:00:27PM -0600, Martin Sebor wrote:
> I've created a much more rudimentary setup for myself to deal

> with the same problem.  I copy tests from Bugzilla, sometimes

> with tweaks, and compile them from time to time as I revisit

> unresolved bugs.  I've also thought about adding those to

> the test suite and marking them XFAIL but I don't think I've

> actually done it more than a handful of times.  I was told

> adding tests (passing or xfailing) is fine without approval.

> 

> I think your proposal to add tests for known failures is a good

> idea.  I don't have much of an opinion about extending the test

> harness to differentiate other kinds of failures (like ICEs) and

> mark them as expected.  I'm not sure I understand the benefit

> of adding directives like dg-accepts-invalid over using xfail.


Thanks for the feedback.  The benefit of dg-accepts-invalid was that you would
get an XPASS even for a test that should not be accepted, but you didn't know
what line to expect an error on, so you put a dg-error at the end of the test.

That's probably not necessary for the first incarnation of this patch so I've
dropped it.

Thanks,
Marek

> On 7/28/20 3:44 PM, Marek Polacek via Gcc-patches wrote:

> > In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In

> > my experience, a good amount of them have already been fixed; my periodical

> > sweeps always turn up a bunch of PRs that had already been fixed previously.

> > Sometimes my sweeps are more or less random, but more often than not I'm just

> > looking for duplicates of an existing PR.  Sometimes the reason the already

> > fixed PRs are still open is because a PR that was fixed had duplicates that we

> > didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a

> > side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming

> > because often you need to grab the test from the Bugzilla yet again (and

> > sometimes there are multiple tests).  Even if you find a PR that was fixed, you

> > still need to bisect the fix and perhaps add the test to our testsuite.  That's

> > draining and since the number of bugs only increases, never decreases, it is not

> > sustainable.

> > 

> > So I've started a personal repo where I've gathered dozens of tests and wrote a

> > script that just compiles every test in the repo and reports if anything

> > changed.  One line from it:

> > 

> > pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || echo -e "$pr: ${msg_ice}"

> > 

> > This has major drawbacks: you have to remember to run this manually, keep

> > updating it, and it's yet another repo that people interested in this would

> > have to clone, but the worst thing is that typically you would only discover

> > that a patch fixed a different PR long after the patch was committed.  And

> > quite likely it wasn't even your patch.  We know that finding problems earlier

> > in the developer workflow reduces costs; if we can catch this before the

> > original developer commits & pushes the changes, it's cheaper, because the

> > developer already understands what the patch does.

> > 

> > A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently

> > by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in

> > the patch had this effect would probably have been very useful.  More testing

> > will lead to a better compiler.

> > 

> > Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after

> > it was reported by a different change.

> > 

> > Or another: https://gcc.gnu.org/91525 where the patch contained a test, but

> > that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

> > 

> > To alleviate some of these problems, I propose that we introduce a means to our

> > DejaGNU infrastructure that allows adding tests for old bugs that have not been

> > fixed yet, and re-introduce the keyword monitored (no longer used for anything

> > -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is

> > tracked in the testsuite.  I don't want any unnecessary moving tests around, so

> > the tests would go where they would normally go; they have to be reduced and

> > have proper targets, etc.  Having such tests in the testsuite means that when

> > something changes, you will know immediately, before you push any changes.

> > 

> > My thinking is that for:

> > 

> > * rejects-valid: use the existing dg-xfail-if

> > * accepts-valid: use the new dg-accepts-invalid

> > * ICEs: use the new dg-ice

> > 

> > dg-ice can be used like this:

> > 

> > // { dg-ice "build_over_call" { target c++11 } }

> > 

> > and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the

> > ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,

> > you'll get an XPASS + FAIL.  Then you can close the old PR.

> > 

> > Similarly, dg-accepts-invalid:

> > 

> > // { dg-accepts-invalid "PR86500" }

> > 

> > means that if the test still compiles without errors, you'll get a quiet XFAIL.

> > If we start giving errors, you'll get an XPASS.

> > 

> > If the bug is fixed, simply remove the directive.

> > 

> > The patch implementing these new directives is appended.  Once/if this is

> > accepted, I can start adding the old tests we have in our Bugzilla.  (I'm

> > only concerned about the c++ component, if that wasn't already clear.)

> > 

> > The question is what makes the bug "old": is it one year without it being

> > assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for

> > every new PR, just the reasonably old ones that are useful/important.  Such

> > additions should be done in batches, so that we don't have dozens of commits,

> > each of them merely adding a single test.

> > 

> > We will still have a surfeit of bugs that we've given short shrift to, but

> > let's at least automate what we can.  The initial addition of the relevant

> > old(-ish) tests won't of course happen automagically, but it's a price I'm

> > willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> > it is to improve the testing of the compiler overall.

> > 

> > Thoughts?

> > 

> > Bootstrapped/regtested on x86_64-pc-linux-gnu.

> > 

> > [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

> > 

> > gcc/ChangeLog:

> > 

> > 	* doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

> > 

> > gcc/testsuite/ChangeLog:

> > 

> > 	* lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.

> > 	* lib/prune.exp (prune_ices): New.

> > 	* lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.

> > ---

> >   gcc/doc/sourcebuild.texi                 | 19 +++++++

> >   gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-

> >   gcc/testsuite/lib/prune.exp              |  9 ++++

> >   gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++

> >   4 files changed, 134 insertions(+), 2 deletions(-)

> > 

> > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi

> > index a7a922d84a2..636d21d30dd 100644

> > --- a/gcc/doc/sourcebuild.texi

> > +++ b/gcc/doc/sourcebuild.texi

> > @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the conditions (which are

> >   the same as for @code{dg-skip-if}) are met.

> >   @end table

> > +@subsubsection Expect the compiler to crash

> > +

> > +@table @code

> > +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> > +Expect the compiler to crash with an internal compiler error and return

> > +a nonzero exit status if the conditions (which are the same as for

> > +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> > +fixed yet.

> > +@end table

> > +

> >   @subsubsection Expect the test executable to fail

> >   @table @code

> > @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.

> >   @item @{ dg-prune-output @var{regexp} @}

> >   Prune messages matching @var{regexp} from the test output.

> > +

> > +@table @code

> > +@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}

> > +Expect the compiler to accept the test (even though it should be rejected with

> > +a compile-time error), if the conditions (which are the same as for

> > +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been

> > +fixed yet.

> > +@end table

> > +

> >   @end table

> >   @subsubsection Verify output of the test executable

> > diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp

> > index 45d97024883..6478eda283b 100644

> > --- a/gcc/testsuite/lib/gcc-dg.exp

> > +++ b/gcc/testsuite/lib/gcc-dg.exp

> > @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {

> >       verbose "$target_compile $prog $output_file $compile_type $options" 4

> >       set comp_output [$target_compile "$prog" "$output_file" "$compile_type" $options]

> > +    global expect_ice

> >       # Look for an internal compiler error, which sometimes masks the fact

> >       # that we didn't get an expected error message.  XFAIL an ICE via

> >       # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }

> >       # to avoid a second failure for excess errors.

> > -    if [string match "*internal compiler error*" $comp_output] {

> > +    # "Error reporting routines re-entered" ICE says "Internal" rather than

> > +    # "internal", so match that too.

> > +    if [string match {*[Ii]nternal compiler error*} $comp_output] {

> >   	upvar 2 name name

> > -	fail "$name (internal compiler error)"

> > +	if { $expect_ice == 0 } {

> > +	  fail "$name (internal compiler error)"

> > +	} else {

> > +	  # We expected an ICE and we got it.  Emit an XFAIL.

> > +	  setup_xfail "*-*-*"

> > +	  fail "$name (internal compiler error)"

> > +	  clear_xfail "*-*-*"

> > +	  # Prune the ICE from the output.

> > +	  set comp_output [prune_ices $comp_output]

> > +	}

> > +    } else {

> > +	upvar 2 name name

> > +	global accepts_invalid

> > +	if { $expect_ice == 1 } {

> > +	  # We expected an ICE but we didn't get it.  We want an XPASS, so

> > +	  # call setup_xfail to set xfail_flag.

> > +	  setup_xfail "*-*-*"

> > +	  pass "$name (internal compiler error)"

> > +	  clear_xfail "*-*-*"

> > +	} elseif { $accepts_invalid == 1 } {

> > +	    if [string match {*error: *} $comp_output] {

> > +	      # We expected that this test be (wrongly) accepted, but now we have

> > +	      # seen error(s).  Issue an XPASS to signal that.

> > +	      setup_xfail "*-*-*"

> > +	      pass "$name (accepts invalid)"

> > +	      clear_xfail "*-*-*"

> > +	    } else {

> > +	      # This test is still (wrongly) accepted.  Just emit an XFAIL.

> > +	      setup_xfail "*-*-*"

> > +	      fail "$name (accepts invalid)"

> > +	      clear_xfail "*-*-*"

> > +	    }

> > +	}

> >       }

> >       if { $do_what == "repo" } {

> > diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp

> > index 1c776249f1a..58a739684a5 100644

> > --- a/gcc/testsuite/lib/prune.exp

> > +++ b/gcc/testsuite/lib/prune.exp

> > @@ -118,6 +118,15 @@ proc prune_file_path { text } {

> >       return $text

> >   }

> > +# Prune internal compiler error messages, including the "Please submit..."

> > +# footnote.

> > +

> > +proc prune_ices { text } {

> > +  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for instructions\[^\n\]*" $text "" text

> > +  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" $text "" text

> > +  return $text

> > +}

> > +

> >   # Provide a definition of this if missing (delete after next dejagnu release).

> >   if { [info procs prune_warnings] == "" } then {

> > diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp

> > index 2a21424b890..765f3a2e27a 100644

> > --- a/gcc/testsuite/lib/target-supports-dg.exp

> > +++ b/gcc/testsuite/lib/target-supports-dg.exp

> > @@ -495,6 +495,75 @@ proc dg-shouldfail { args } {

> >       }

> >   }

> > +# Record whether the compiler is expected (at the moment) to ICE.

> > +# Used for tests that test bugs that have not been fixed yet.

> > +

> > +set expect_ice 0

> > +set accepts_invalid 0

> > +

> > +proc dg-ice { args } {

> > +    # Don't bother if we're already skipping the test.

> > +    upvar dg-do-what dg-do-what

> > +    if { [lindex ${dg-do-what} 1] == "N" } {

> > +      return

> > +    }

> > +

> > +    global accepts_invalid

> > +    # Can't be combined with dg-accepts-invalid.

> > +    if { $accepts_invalid == 1 } {

> > +      error "dg-ice: cannot be combined with dg-accepts-invalid"

> > +      return

> > +    }

> > +

> > +    global expect_ice

> > +

> > +    set args [lreplace $args 0 0]

> > +    if { [llength $args] > 1 } {

> > +	set selector [list target [lindex $args 1]]

> > +	if { [dg-process-target-1 $selector] == "S" } {

> > +	    # The target matches, now check the flags.

> > +	    if [check-flags $args] {

> > +		set expect_ice 1

> > +	    }

> > +	}

> > +    } else {

> > +	set expect_ice 1

> > +    }

> > +}

> > +

> > +# Record whether the compiler should reject the testcase with an error,

> > +# but currently doesn't do so.  Used for accepts-invalid bugs.

> > +

> > +proc dg-accepts-invalid { args } {

> > +    # Don't bother if we're already skipping the test.

> > +    upvar dg-do-what dg-do-what

> > +    if { [lindex ${dg-do-what} 1] == "N" } {

> > +      return

> > +    }

> > +

> > +    global expect_ice

> > +    # Can't be combined with dg-ice.

> > +    if { $expect_ice == 1 } {

> > +      error "dg-accepts-invalid: cannot be combined with dg-ice"

> > +      return

> > +    }

> > +

> > +    global accepts_invalid

> > +

> > +    set args [lreplace $args 0 0]

> > +    if { [llength $args] > 1 } {

> > +	set selector [list target [lindex $args 1]]

> > +	if { [dg-process-target-1 $selector] == "S" } {

> > +	    # The target matches, now check the flags.

> > +	    if [check-flags $args] {

> > +		set accepts_invalid 1

> > +	    }

> > +	}

> > +    } else {

> > +	set accepts_invalid 1

> > +    }

> > +}

> > +

> >   # Intercept the call to the DejaGnu version of dg-process-target to

> >   # support use of an effective-target keyword in place of a list of

> >   # target triplets to xfail or skip a test.

> > 

> > base-commit: f3665bd1111c1799c0421490b5e655f977570354

> > 

>
David Malcolm via Gcc-patches Aug. 4, 2020, 10:22 p.m. | #13
On Thu, Jul 30, 2020 at 11:08:03AM +0200, Martin Liška wrote:
> Hello.

> 

> I support the initiative!

> What would be nice to add leading 'PR component/12345'

> to a git commit so that these test additions are linked to bugzilla issues.


Thanks!  Yes, it should be clear which test tests a PR that has the monitored
keyword.  That may get lost when adding a lot of tests in one commit, but can
always be clarified in a comment.  Or just grep 'PR component/12345' in the
testsuite; new tests should have this as their first line.

Marek
David Malcolm via Gcc-patches Aug. 4, 2020, 10:33 p.m. | #14
On Thu, Jul 30, 2020 at 11:54:23AM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Jul 28, 2020 at 05:44:47PM -0400, Marek Polacek via Gcc-patches wrote:

> > We will still have a surfeit of bugs that we've given short shrift to, but

> > let's at least automate what we can.  The initial addition of the relevant

> > old(-ish) tests won't of course happen automagically, but it's a price I'm

> > willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> > it is to improve the testing of the compiler overall.

> > 

> > Thoughts?

> 

> Looks useful to me, but I'd think it might be desirable to use separate

> directories for those tests, so that it is more obvious that it is a

> different category of tests.  Now that we use git, just using git mv

> to move them to another place once they are fixed for good (together with

> some dg-* directive tweaks) wouldn't be that much work later.

> 

> So having gcc.dg/unfixed/ , g++.dg/unfixed/ , c-c++-common/unfixed/

> and their torture/ suffixed variants (or better directory name for those)?


Thanks.  I was afraid that it would cause too much friction when you happen
to fix one of the unfixed tests: you will have to find the correct directory
to put the test in and perhaps even rename the test to avoid conflicts with
tests with the same name in the final destination.  But it's also true that
git is much better at moving files, and the extra clarity might be worth the
occasional hassle.  It would also make it easy to skip testing unfixed tests.
dg-ice tests are easy to spot/grep for, but accepts-invalid/rejects-valid are
a different story.

I'll post a v2 patch soon with the unfixed/ dir in mind.

Marek
David Malcolm via Gcc-patches Aug. 4, 2020, 10:33 p.m. | #15
I think the read of the room is that people think it would be generally useful, so let approve the general plan.

So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

For the suggestion to isolate the tests into their own area easily distinguished by filename, I think it is better to choose an original home for them using the existing naming scheme as much as possible, that way when fixed, they are already in the right spot.  We in theory can move them around, but, there is a beauty in having a long term stable name that just doesn't change any.  You can look through time to see all state changes.  git log file.C, show you all the history nicely, and so on.

You can experiment with dg-prms-id and see if that lets you tag in bug report numbers in a more meaningful way.  Anyway, would be good to always include the bug report number in the test case, and in the bug report, add the name of the added test case (so others don't add yet another instance of the bug).

So with that as a backdrop, I think it's reasonable to self-approve additions of this sort to the test suite.  If you have test cases that can go in with existing mechanisms, feel free to start adding them in.

As you find it difficult to express a test using the existing mechanisms, let's talk about those and see if anyone has a good idea on how to express it.  I think ICEs are the most annoying to manage, but, I think excess and prune should be able to handle them.  I think should get an error or warning, or should not get an error or warning are more trivial to manage.

A word of caution, if we produce core files, before you add tons of core file producing test cases, you'll want to submit a, ulimit -c 0 patch that can avoid the issue.  corefile writing is slow and consumes disk.  I can't recall at the moment if the current infrastructure will reliably avoid core files.

If test cases infinitely loop, timeout, consume all available ram, fill the disk, crash the host machine, or do other really nasty stuff, please, let's avoid those for now.  It is mazing host fast testing goes on a 200 thread count box, and how slow it can go if a single test case needs to time out.  If you start with the idea that any individual test case should only take 2-10 seconds, you won't go wrong.
David Malcolm via Gcc-patches Aug. 4, 2020, 10:45 p.m. | #16
On Aug 4, 2020, at 3:08 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 

> That works well if you know where to expect an error.  But if you don't, it's

> worse.  E.g.,

> 

> // { dg-xfail-if "" { *-*-* } }

> int i = nothere; // demonstrates something that errors out

> // { dg-error "" } didn't know where to put this

> 

> only prints unexpected failures, but no unexpected successes.  I guess that's

> OK though, at least for now, so I'll drop dg-accepts-invalid.


There are two cases, either you get an error message that is wrong, and you can use:

  strncpy (p, s, 60);   /* { dg-bogus "-Wstringop-truncation" } */  

or, you don't get an error, but you should:

  A foo(void i = 0);  // { dg-error "incomplete type|invalid use" }       

?  Do you have an example of a specific case that doesn't work?  I'm not sure I'm following.
David Malcolm via Gcc-patches Aug. 4, 2020, 10:53 p.m. | #17
On Aug 4, 2020, at 3:16 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 

> The benefit of dg-accepts-invalid was that you would

> get an XPASS even for a test that should not be accepted, but you didn't know

> what line to expect an error on, so you put a dg-error at the end of the test.


I think for most cases it's easy enough to figure out where the error goes.  I do see the subtly of the dg-accepts-invalid directive now in the harder cases.  A change of state of them by the new error message I'd like to think is enough to get the people to look at the test case and the corresponding bug report.  I'd propose seeing if people don't also push along the bug in that sort of complex case, I think they will.
David Malcolm via Gcc-patches Aug. 5, 2020, 12:54 a.m. | #18
On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:
> I think the read of the room is that people think it would be generally useful, so let approve the general plan.


Cool.

> So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.


At this point I'm only proposing one new directive, dg-ice.  I think we can't
really do without it.  The other one was a matter of convenience.

> For the suggestion to isolate the tests into their own area easily distinguished by filename, I think it is better to choose an original home for them using the existing naming scheme as much as possible, that way when fixed, they are already in the right spot.  We in theory can move them around, but, there is a beauty in having a long term stable name that just doesn't change any.  You can look through time to see all state changes.  git log file.C, show you all the history nicely, and so on.


What you say makes sense.  I'm still pretty wishy-washy about this.

But since git tracks renaming files well (you see the whole history of the
renamed file, even with its old name), I'm currently of the mind that having
a dedicated directory is preferable.

> You can experiment with dg-prms-id and see if that lets you tag in bug report numbers in a more meaningful way.  Anyway, would be good to always include the bug report number in the test case, and in the bug report, add the name of the added test case (so others don't add yet another instance of the bug).


Absolutely, the PR number should be in the test, and the monitored PR ought
to say what test it's covered by.

I've looked at dg-prms-id in dejagnu, but I don't readily see how that
could help us.

> So with that as a backdrop, I think it's reasonable to self-approve additions of this sort to the test suite.  If you have test cases that can go in with existing mechanisms, feel free to start adding them in.


Sounds good.  As I mentioned, they should be of high quality, just like the
test we normally add when fixing bugs.

> As you find it difficult to express a test using the existing mechanisms, let's talk about those and see if anyone has a good idea on how to express it.  I think ICEs are the most annoying to manage, but, I think excess and prune should be able to handle them.  I think should get an error or warning, or should not get an error or warning are more trivial to manage.


I experimented with
// { dg-prune-output ".*internal compiler error.*" }
// { dg-xfail-if "" { *-*-* } }
but it's a mouthful and the results were poor (when the ICE is fixed but we
generate errors instead).  dg-ice is convenient, handles even the different
kind of ICE (when the diagnostic routines were re-entered), and generates
nice XPASSes when the ICE goes away.

I've also played games with dg-regexp but it was too ugly.

(I honestly don't see why new directives are such a big deal, if they're
properly documented.)

> A word of caution, if we produce core files, before you add tons of core file producing test cases, you'll want to submit a, ulimit -c 0 patch that can avoid the issue.  corefile writing is slow and consumes disk.  I can't recall at the moment if the current infrastructure will reliably avoid core files.


I thought we'd already set ulimit -c 0, but I don't see that now.  I definitely
agree that we don't want core dumps.  It probably needs some hack that sets
ulimit -c to 0 when we're running a test in */unfixed/.  :/  Which also argues
for a separate directory for unfixed tests.

> If test cases infinitely loop, timeout, consume all available ram, fill the disk, crash the host machine, or do other really nasty stuff, please, let's avoid those for now.  It is mazing host fast testing goes on a 200 thread count box, and how slow it can go if a single test case needs to time out.  If you start with the idea that any individual test case should only take 2-10 seconds, you won't go wrong.


I agree, but that should be true for all tests.  Really, only the expect-ice
tests are novel.

Marek
David Malcolm via Gcc-patches Aug. 5, 2020, 12:59 a.m. | #19
On Tue, Aug 04, 2020 at 03:53:50PM -0700, Mike Stump wrote:
> On Aug 4, 2020, at 3:16 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> > 

> > The benefit of dg-accepts-invalid was that you would

> > get an XPASS even for a test that should not be accepted, but you didn't know

> > what line to expect an error on, so you put a dg-error at the end of the test.

> 

> I think for most cases it's easy enough to figure out where the error goes.  I do see the subtly of the dg-accepts-invalid directive now in the harder cases.  A change of state of them by the new error message I'd like to think is enough to get the people to look at the test case and the corresponding bug report.  I'd propose seeing if people don't also push along the bug in that sort of complex case, I think they will.


You're probably right.  Let's go ahead without dg-accepts-invalid for now and
perhaps reconsider later.

Marek
Richard Sandiford Aug. 5, 2020, 7:58 a.m. | #20
Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Thu, Jul 30, 2020 at 11:54:23AM +0200, Jakub Jelinek via Gcc-patches wrote:

>> On Tue, Jul 28, 2020 at 05:44:47PM -0400, Marek Polacek via Gcc-patches wrote:

>> > We will still have a surfeit of bugs that we've given short shrift to, but

>> > let's at least automate what we can.  The initial addition of the relevant

>> > old(-ish) tests won't of course happen automagically, but it's a price I'm

>> > willing to pay.  My goal here isn't merely to reduce the number of open PRs;

>> > it is to improve the testing of the compiler overall.

>> > 

>> > Thoughts?

>> 

>> Looks useful to me, but I'd think it might be desirable to use separate

>> directories for those tests, so that it is more obvious that it is a

>> different category of tests.  Now that we use git, just using git mv

>> to move them to another place once they are fixed for good (together with

>> some dg-* directive tweaks) wouldn't be that much work later.

>> 

>> So having gcc.dg/unfixed/ , g++.dg/unfixed/ , c-c++-common/unfixed/

>> and their torture/ suffixed variants (or better directory name for those)?

>

> Thanks.  I was afraid that it would cause too much friction when you happen

> to fix one of the unfixed tests: you will have to find the correct directory

> to put the test in and perhaps even rename the test to avoid conflicts with

> tests with the same name in the final destination.  But it's also true that

> git is much better at moving files, and the extra clarity might be worth the

> occasional hassle.  It would also make it easy to skip testing unfixed tests.

> dg-ice tests are easy to spot/grep for, but accepts-invalid/rejects-valid are

> a different story.


FWIW, I agree with Mike that it might be better to put tests in the
location that they'd have normally.  Some reasons:

- As already mentioned, it'd give more stable names.  That's not an
  issue for things like git log, but it does mean that comparing one
  set of test results with another gives a meaningful XFAIL->PASS
  transition rather than a “removed test” + “added test” transition.
  It's also more convenient when comparing different branches.

- There are a lot of specialised .exp harnesses, and I don't think it
  would be a good idea to encourage all of them to have unfixed/
  variants, or unfixed/ subdirectories.  E.g. vect.exp is not something
  we'd want to duplicate or subclass, but there are others.

- There's a temptation to remove empty directories/harnesses when
  they have no associated tests, so we might oscillate between having
  unfixed/ directories and not.  (The alternative is for the directory
  or harness to stick around based on the fact that it was useful at
  some point in the past.)

- It's inconsistent with existing tests that are already XFAILed for all
  targets.  Having a separate directory for completely-XFAILed tests
  might create a requirement (or the impression of a requirement) to
  move tests to an unfixed/ directory when XFAILing them.

- IMO it draws an artificial distinction between tests that are
  completely XFAILed on all targets and tests that are either partly
  XFAILed for all targets, or XFAILed only for some targets.

Thanks,
Richard
Richard Sandiford Aug. 5, 2020, 8:04 a.m. | #21
Marek Polacek <polacek@redhat.com> writes:
> On Wed, Jul 29, 2020 at 09:40:35AM +0100, Richard Sandiford wrote:

>> I guess there's a possibility that some tests happen to pass already

>> on some targets.  That's more likely with middle-end and backend bugs

>> rather than frontend stuff though.  Perhaps for those it would make

>> sense to have a convention in which the failing testcase is restricted

>> (at the whole-test level) to the targets that the person committing the

>> testcase has actually tried.  Maybe with a comment on the dg-ice etc.

>> to remind people to reconsider the main target selector when un-XFAILing

>> the test.

>

> Interesting point.  With my frontend hat on, I hadn't really thought of

> this much, but the dg-ice directive allows you to specify the targets and

> specific options when to expect an ICE.  So you could run a test everywhere

> but only expect an ICE on aarch64.


Yeah.  But the problem I was thinking of was: whoever adds the test
will only test on a subset of targets.  If the test runs for all targets,
the dg-ice condition has to be exact for all targets too.  Missing out
one target will generate a new FAIL, while adding a target unnecessarily
will generate an XPASS.  So I think the condition has to be applied at
a whole-test level instead, unless the person committing the test is
confident about which targets are and aren't affected.

(The same goes for other directives, dg-ice is just an example.)

Richard
David Malcolm via Gcc-patches Aug. 5, 2020, 12:56 p.m. | #22
On Tue, Aug 04, 2020 at 03:45:11PM -0700, Mike Stump wrote:
> On Aug 4, 2020, at 3:08 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> > 

> > That works well if you know where to expect an error.  But if you don't, it's

> > worse.  E.g.,

> > 

> > // { dg-xfail-if "" { *-*-* } }

> > int i = nothere; // demonstrates something that errors out

> > // { dg-error "" } didn't know where to put this

> > 

> > only prints unexpected failures, but no unexpected successes.  I guess that's

> > OK though, at least for now, so I'll drop dg-accepts-invalid.

> 

> There are two cases, either you get an error message that is wrong, and you can use:

> 

>   strncpy (p, s, 60);   /* { dg-bogus "-Wstringop-truncation" } */  

> 

> or, you don't get an error, but you should:

> 

>   A foo(void i = 0);  // { dg-error "incomplete type|invalid use" }       

> 

> ?  Do you have an example of a specific case that doesn't work?  I'm not sure I'm following.


Sorry for being unclear.  Say you have

// PR c++/55408

struct foo {
    template<int*>
    void bar();
};

template<int*...>
void foo::bar() {}

int main()
{
    extern int i;
    foo {}.bar<&i>();
}

which we wrongly accept.  It might be unclear what line to put that dg-error
on.  So you put it at the end of the file.  Then when we start issuing an error
for the testcase, you will just get a FAIL, not an XPASS.  That might be
confusing if that file isn't in unfixed/.

Maybe it's not a real problem though -- upon inspection you'll find the
dg-error line and just tweak the testcase as needed.

Marek
David Malcolm via Gcc-patches Aug. 5, 2020, 12:59 p.m. | #23
On Wed, Aug 05, 2020 at 09:04:53AM +0100, Richard Sandiford wrote:
> Marek Polacek <polacek@redhat.com> writes:

> > On Wed, Jul 29, 2020 at 09:40:35AM +0100, Richard Sandiford wrote:

> >> I guess there's a possibility that some tests happen to pass already

> >> on some targets.  That's more likely with middle-end and backend bugs

> >> rather than frontend stuff though.  Perhaps for those it would make

> >> sense to have a convention in which the failing testcase is restricted

> >> (at the whole-test level) to the targets that the person committing the

> >> testcase has actually tried.  Maybe with a comment on the dg-ice etc.

> >> to remind people to reconsider the main target selector when un-XFAILing

> >> the test.

> >

> > Interesting point.  With my frontend hat on, I hadn't really thought of

> > this much, but the dg-ice directive allows you to specify the targets and

> > specific options when to expect an ICE.  So you could run a test everywhere

> > but only expect an ICE on aarch64.

> 

> Yeah.  But the problem I was thinking of was: whoever adds the test

> will only test on a subset of targets.  If the test runs for all targets,

> the dg-ice condition has to be exact for all targets too.  Missing out

> one target will generate a new FAIL, while adding a target unnecessarily

> will generate an XPASS.  So I think the condition has to be applied at

> a whole-test level instead, unless the person committing the test is

> confident about which targets are and aren't affected.

> 

> (The same goes for other directives, dg-ice is just an example.)


Ah, got it.  Thanks for the explanation.

Marek
David Malcolm via Gcc-patches Aug. 5, 2020, 1:18 p.m. | #24
On Wed, Aug 05, 2020 at 08:58:40AM +0100, Richard Sandiford wrote:
> Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> > On Thu, Jul 30, 2020 at 11:54:23AM +0200, Jakub Jelinek via Gcc-patches wrote:

> >> On Tue, Jul 28, 2020 at 05:44:47PM -0400, Marek Polacek via Gcc-patches wrote:

> >> > We will still have a surfeit of bugs that we've given short shrift to, but

> >> > let's at least automate what we can.  The initial addition of the relevant

> >> > old(-ish) tests won't of course happen automagically, but it's a price I'm

> >> > willing to pay.  My goal here isn't merely to reduce the number of open PRs;

> >> > it is to improve the testing of the compiler overall.

> >> > 

> >> > Thoughts?

> >> 

> >> Looks useful to me, but I'd think it might be desirable to use separate

> >> directories for those tests, so that it is more obvious that it is a

> >> different category of tests.  Now that we use git, just using git mv

> >> to move them to another place once they are fixed for good (together with

> >> some dg-* directive tweaks) wouldn't be that much work later.

> >> 

> >> So having gcc.dg/unfixed/ , g++.dg/unfixed/ , c-c++-common/unfixed/

> >> and their torture/ suffixed variants (or better directory name for those)?

> >

> > Thanks.  I was afraid that it would cause too much friction when you happen

> > to fix one of the unfixed tests: you will have to find the correct directory

> > to put the test in and perhaps even rename the test to avoid conflicts with

> > tests with the same name in the final destination.  But it's also true that

> > git is much better at moving files, and the extra clarity might be worth the

> > occasional hassle.  It would also make it easy to skip testing unfixed tests.

> > dg-ice tests are easy to spot/grep for, but accepts-invalid/rejects-valid are

> > a different story.

> 

> FWIW, I agree with Mike that it might be better to put tests in the

> location that they'd have normally.  Some reasons:

> 

> - As already mentioned, it'd give more stable names.  That's not an

>   issue for things like git log, but it does mean that comparing one

>   set of test results with another gives a meaningful XFAIL->PASS

>   transition rather than a “removed test” + “added test” transition.

>   It's also more convenient when comparing different branches.

> 

> - There are a lot of specialised .exp harnesses, and I don't think it

>   would be a good idea to encourage all of them to have unfixed/

>   variants, or unfixed/ subdirectories.  E.g. vect.exp is not something

>   we'd want to duplicate or subclass, but there are others.

> 

> - There's a temptation to remove empty directories/harnesses when

>   they have no associated tests, so we might oscillate between having

>   unfixed/ directories and not.  (The alternative is for the directory

>   or harness to stick around based on the fact that it was useful at

>   some point in the past.)

> 

> - It's inconsistent with existing tests that are already XFAILed for all

>   targets.  Having a separate directory for completely-XFAILed tests

>   might create a requirement (or the impression of a requirement) to

>   move tests to an unfixed/ directory when XFAILing them.

> 

> - IMO it draws an artificial distinction between tests that are

>   completely XFAILed on all targets and tests that are either partly

>   XFAILed for all targets, or XFAILed only for some targets.


Those are all great points.  I have to reconsider my position once again ;-).

Perhaps we should add an // UNFIXED comment then to easily disambiguate
between tests that are not fixed.  But I fear that people won't always
use this consistently.

Marek
Nathan Sidwell Aug. 5, 2020, 3:03 p.m. | #25
On 8/4/20 8:54 PM, Marek Polacek via Gcc-patches wrote:
> On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

>> I think the read of the room is that people think it would be generally useful, so let approve the general plan.

> 

> Cool.

> 

>> So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

> 

> At this point I'm only proposing one new directive, dg-ice.  I think we can't

> really do without it.  The other one was a matter of convenience.


I've realized I have a concern.  Grepping (or searching in an editor 
buffer) the log file for 'internal compiler error' to find actual 
regressions is a thing I want to still be able to do (perhaps with 
alternative spelling, I don't care).  I don't want to see the ICEs of 
tests that are expected to ICE.

I think that means there has to be a positive marker on the unexpected 
ICEs, rather than lack of an expected marker on them.

nathan

-- 
Nathan Sidwell
David Malcolm via Gcc-patches Aug. 5, 2020, 7:44 p.m. | #26
On Aug 5, 2020, at 5:56 AM, Marek Polacek <polacek@redhat.com> wrote:
> 

> Sorry for being unclear.  Say you have

> 

> // PR c++/55408

> 

> struct foo {

>    template<int*>

>    void bar();

> };

> 

> template<int*...>

> void foo::bar() {}

> 

> int main()

> {

>    extern int i;

>    foo {}.bar<&i>();

> }

> 

> which we wrongly accept.  It might be unclear


Sure, one might think it goes on line 12...  could be wrong.  But, if one wants to do better, one can run clang on it, an it says that the error goes on line 7.

> what line to put that dg-error

> on.  So you put it at the end of the file.  Then when we start issuing an error

> for the testcase, you will just get a FAIL, not an XPASS.  That might be

> confusing if that file isn't in unfixed/.


The people that fix bugs should be able to sort it out.
David Malcolm via Gcc-patches Aug. 5, 2020, 7:59 p.m. | #27
On Tue, Aug 04, 2020 at 08:54:32PM -0400, Marek Polacek via Gcc-patches wrote:
> On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

> > A word of caution, if we produce core files, before you add tons of core file producing test cases, you'll want to submit a, ulimit -c 0 patch that can avoid the issue.  corefile writing is slow and consumes disk.  I can't recall at the moment if the current infrastructure will reliably avoid core files.

> 

> I thought we'd already set ulimit -c 0, but I don't see that now.  I definitely

> agree that we don't want core dumps.  It probably needs some hack that sets

> ulimit -c to 0 when we're running a test in */unfixed/.  :/  Which also argues

> for a separate directory for unfixed tests.


I realized that the crashing compiler will probably only generate a code dump
when using -dH, so we should be fine.

Marek
David Malcolm via Gcc-patches Aug. 5, 2020, 8:01 p.m. | #28
On Aug 4, 2020, at 5:54 PM, Marek Polacek <polacek@redhat.com> wrote:
>> As you find it difficult to express a test using the existing mechanisms, let's talk about those and see if anyone has a good idea on how to express it.  I think ICEs are the most annoying to manage, but, I think excess and prune should be able to handle them.  I think should get an error or warning, or should not get an error or warning are more trivial to manage.

> 

> I experimented with

> // { dg-prune-output ".*internal compiler error.*" }

> // { dg-xfail-if "" { *-*-* } }

> but it's a mouthful and the results were poor (when the ICE is fixed but we

> generate errors instead).  dg-ice is convenient, handles even the different

> kind of ICE (when the diagnostic routines were re-entered), and generates

> nice XPASSes when the ICE goes away.

> 

> I've also played games with dg-regexp but it was too ugly.

> 

> (I honestly don't see why new directives are such a big deal, if they're

> properly documented.)


I don't see a bogus here?  I think that can't be skipped.
David Malcolm via Gcc-patches Aug. 5, 2020, 11:29 p.m. | #29
On Wed, Aug 05, 2020 at 11:03:08AM -0400, Nathan Sidwell wrote:
> On 8/4/20 8:54 PM, Marek Polacek via Gcc-patches wrote:

> > On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

> > > I think the read of the room is that people think it would be generally useful, so let approve the general plan.

> > 

> > Cool.

> > 

> > > So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

> > 

> > At this point I'm only proposing one new directive, dg-ice.  I think we can't

> > really do without it.  The other one was a matter of convenience.

> 

> I've realized I have a concern.  Grepping (or searching in an editor buffer)

> the log file for 'internal compiler error' to find actual regressions is a

> thing I want to still be able to do (perhaps with alternative spelling, I

> don't care).  I don't want to see the ICEs of tests that are expected to

> ICE.

> 

> I think that means there has to be a positive marker on the unexpected ICEs,

> rather than lack of an expected marker on them.


Hmm, by the log file you mean g++.log?  Currently, if you run a dg-ice test,
and the test still ICEs, the g++.log file (but not the stdout of make
check-c++!) will have:

Executing on host: ... xg++ with options ...
spawn -ignore SIGHUP ... xg++ with options ...
.../foo.C:14:15: internal compiler error: in poplevel_class, at cp/name-lookup.c:4225
<backtrace>
compiler exited with status 1
XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)
PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

Which one of these would you not like to see?

The second one, in parens, can be changed easily.  The first one, the actual
output generated by dejagnu's default_target_compile, I don't think I can
change at all.  g++_target_compile invokes 

  set result [target_compile $source $dest $type $options]

and we have no control over the target_compile proc.

Can you give me more details?  Hopefully we'll work something out that doesn't
break your workflow.

Marek
David Malcolm via Gcc-patches Aug. 6, 2020, 12:27 p.m. | #30
On Wed, Aug 05, 2020 at 01:01:32PM -0700, Mike Stump wrote:
> On Aug 4, 2020, at 5:54 PM, Marek Polacek <polacek@redhat.com> wrote:

> >> As you find it difficult to express a test using the existing mechanisms, let's talk about those and see if anyone has a good idea on how to express it.  I think ICEs are the most annoying to manage, but, I think excess and prune should be able to handle them.  I think should get an error or warning, or should not get an error or warning are more trivial to manage.

> > 

> > I experimented with

> > // { dg-prune-output ".*internal compiler error.*" }

> > // { dg-xfail-if "" { *-*-* } }

> > but it's a mouthful and the results were poor (when the ICE is fixed but we

> > generate errors instead).  dg-ice is convenient, handles even the different

> > kind of ICE (when the diagnostic routines were re-entered), and generates

> > nice XPASSes when the ICE goes away.

> > 

> > I've also played games with dg-regexp but it was too ugly.

> > 

> > (I honestly don't see why new directives are such a big deal, if they're

> > properly documented.)

> 

> I don't see a bogus here?  I think that can't be skipped.


A dg-bogus?  Where?  Putting it on the line where the ICE happens results in
FAILs.  A complete example of what I had in mind:

// PR c++/88003
// { dg-do compile { target c++14 } }
// { dg-prune-output ".*internal compiler error.*" }
// { dg-xfail-if "" { *-*-* } }

auto test() {
  struct O {
    struct N;
  };
  return O();
}

typedef decltype(test()) TN;
struct TN::N {};


I think I'd still prefer a single dg-ice.

Marek
David Malcolm via Gcc-patches Aug. 6, 2020, 12:30 p.m. | #31
On Thu, Aug 06, 2020 at 08:27:05AM -0400, Marek Polacek via Gcc-patches wrote:
> // PR c++/88003

> // { dg-do compile { target c++14 } }

> // { dg-prune-output ".*internal compiler error.*" }

> // { dg-xfail-if "" { *-*-* } }


Would that XPASS if the ICE is fixed though?

	Jakub
David Malcolm via Gcc-patches Aug. 6, 2020, 12:36 p.m. | #32
On Thu, Aug 06, 2020 at 02:30:11PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 06, 2020 at 08:27:05AM -0400, Marek Polacek via Gcc-patches wrote:

> > // PR c++/88003

> > // { dg-do compile { target c++14 } }

> > // { dg-prune-output ".*internal compiler error.*" }

> > // { dg-xfail-if "" { *-*-* } }

> 

> Would that XPASS if the ICE is fixed though?


Yes, but if the ICE is fixed and we issue an error instead, we won't get
an XPASS, just expected failures, so you'd never notice that the ICE is
gone.  Therefore, dg-prune-output + dg-xfail-if is not a viable solution.

Marek
Nathan Sidwell Aug. 6, 2020, 2:01 p.m. | #33
On 8/5/20 7:29 PM, Marek Polacek wrote:
> On Wed, Aug 05, 2020 at 11:03:08AM -0400, Nathan Sidwell wrote:

>> On 8/4/20 8:54 PM, Marek Polacek via Gcc-patches wrote:

>>> On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

>>>> I think the read of the room is that people think it would be generally useful, so let approve the general plan.

>>>

>>> Cool.

>>>

>>>> So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

>>>

>>> At this point I'm only proposing one new directive, dg-ice.  I think we can't

>>> really do without it.  The other one was a matter of convenience.

>>

>> I've realized I have a concern.  Grepping (or searching in an editor buffer)

>> the log file for 'internal compiler error' to find actual regressions is a

>> thing I want to still be able to do (perhaps with alternative spelling, I

>> don't care).  I don't want to see the ICEs of tests that are expected to

>> ICE.

>>

>> I think that means there has to be a positive marker on the unexpected ICEs,

>> rather than lack of an expected marker on them.

> 

> Hmm, by the log file you mean g++.log?  Currently, if you run a dg-ice test,

> and the test still ICEs, the g++.log file (but not the stdout of make

> check-c++!) will have:

> 

> Executing on host: ... xg++ with options ...

> spawn -ignore SIGHUP ... xg++ with options ...

> .../foo.C:14:15: internal compiler error: in poplevel_class, at cp/name-lookup.c:4225

> <backtrace>

> compiler exited with status 1

> XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

> PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

> 

> Which one of these would you not like to see?


Neither of these is solving the issue.  How do I find the ICES that are 
unexpected, without tripping over the ICEs that are expected?

> Can you give me more details?  Hopefully we'll work something out that doesn't

> break your workflow.


sure.
* develop patch
* run testsuite
* observe unexpected ICEs
* load g++.log into editor
* ^sinternal comp
* gets to first unexpected ICE
* debug it.

What does '^sinternal comp' become?  As there could be many expected 
ICEs it'll be painful to determine whether any particular utterance of 
'internal compiler' is expected or not.

nathan

-- 
Nathan Sidwell
David Malcolm via Gcc-patches Aug. 6, 2020, 10:55 p.m. | #34
On Thu, Aug 06, 2020 at 10:01:37AM -0400, Nathan Sidwell wrote:
> On 8/5/20 7:29 PM, Marek Polacek wrote:

> > On Wed, Aug 05, 2020 at 11:03:08AM -0400, Nathan Sidwell wrote:

> > > On 8/4/20 8:54 PM, Marek Polacek via Gcc-patches wrote:

> > > > On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

> > > > > I think the read of the room is that people think it would be generally useful, so let approve the general plan.

> > > > 

> > > > Cool.

> > > > 

> > > > > So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

> > > > 

> > > > At this point I'm only proposing one new directive, dg-ice.  I think we can't

> > > > really do without it.  The other one was a matter of convenience.

> > > 

> > > I've realized I have a concern.  Grepping (or searching in an editor buffer)

> > > the log file for 'internal compiler error' to find actual regressions is a

> > > thing I want to still be able to do (perhaps with alternative spelling, I

> > > don't care).  I don't want to see the ICEs of tests that are expected to

> > > ICE.

> > > 

> > > I think that means there has to be a positive marker on the unexpected ICEs,

> > > rather than lack of an expected marker on them.

> > 

> > Hmm, by the log file you mean g++.log?  Currently, if you run a dg-ice test,

> > and the test still ICEs, the g++.log file (but not the stdout of make

> > check-c++!) will have:

> > 

> > Executing on host: ... xg++ with options ...

> > spawn -ignore SIGHUP ... xg++ with options ...

> > .../foo.C:14:15: internal compiler error: in poplevel_class, at cp/name-lookup.c:4225

> > <backtrace>

> > compiler exited with status 1

> > XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

> > PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

> > 

> > Which one of these would you not like to see?

> 

> Neither of these is solving the issue.  How do I find the ICES that are

> unexpected, without tripping over the ICEs that are expected?

> 

> > Can you give me more details?  Hopefully we'll work something out that doesn't

> > break your workflow.

> 

> sure.

> * develop patch

> * run testsuite

> * observe unexpected ICEs

> * load g++.log into editor

> * ^sinternal comp

> * gets to first unexpected ICE

> * debug it.

> 

> What does '^sinternal comp' become?  As there could be many expected ICEs

> it'll be painful to determine whether any particular utterance of 'internal

> compiler' is expected or not.


That is a problem I don't know how to deal with.  I know how to pass
additional options to the compiler from dejagnu.  I thought maybe I could use
-pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is
ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm
stuck.

Though, you could just grep for '^FAIL.*internal comp', which will find the
first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will
be shown in 'Excess errors:'.  Won't that work?

Marek
David Malcolm via Gcc-patches Aug. 7, 2020, 12:01 a.m. | #35
On Aug 6, 2020, at 7:01 AM, Nathan Sidwell <nathan@acm.org> wrote:
> 

>> XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

>> PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

>> Which one of these would you not like to see?

> 

> Neither of these is solving the issue.  How do I find the ICES that are unexpected, without tripping over the ICEs that are expected?


Don't you already have this issue for current xfailed ICEs?  ^FAIL.*internal compiler error I think finds them, doesn't it?
Nathan Sidwell Aug. 7, 2020, 1:16 p.m. | #36
On 8/6/20 8:01 PM, Mike Stump wrote:
> On Aug 6, 2020, at 7:01 AM, Nathan Sidwell <nathan@acm.org> wrote:

>>

>>> XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

>>> PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

>>> Which one of these would you not like to see?

>>

>> Neither of these is solving the issue.  How do I find the ICES that are unexpected, without tripping over the ICEs that are expected?

> 

> Don't you already have this issue for current xfailed ICEs?  ^FAIL.*internal compiler error I think finds them, doesn't it?



Are there xfailed ICEs?  I've not run into them?

nathan

-- 
Nathan Sidwell
Nathan Sidwell Aug. 7, 2020, 1:21 p.m. | #37
On 8/6/20 6:55 PM, Marek Polacek wrote:
> On Thu, Aug 06, 2020 at 10:01:37AM -0400, Nathan Sidwell wrote:

>> On 8/5/20 7:29 PM, Marek Polacek wrote:

>>> On Wed, Aug 05, 2020 at 11:03:08AM -0400, Nathan Sidwell wrote:

>>>> On 8/4/20 8:54 PM, Marek Polacek via Gcc-patches wrote:

>>>>> On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

>>>>>> I think the read of the room is that people think it would be generally useful, so let approve the general plan.

>>>>>

>>>>> Cool.

>>>>>

>>>>>> So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

>>>>>

>>>>> At this point I'm only proposing one new directive, dg-ice.  I think we can't

>>>>> really do without it.  The other one was a matter of convenience.

>>>>

>>>> I've realized I have a concern.  Grepping (or searching in an editor buffer)

>>>> the log file for 'internal compiler error' to find actual regressions is a

>>>> thing I want to still be able to do (perhaps with alternative spelling, I

>>>> don't care).  I don't want to see the ICEs of tests that are expected to

>>>> ICE.

>>>>

>>>> I think that means there has to be a positive marker on the unexpected ICEs,

>>>> rather than lack of an expected marker on them.

>>>

>>> Hmm, by the log file you mean g++.log?  Currently, if you run a dg-ice test,

>>> and the test still ICEs, the g++.log file (but not the stdout of make

>>> check-c++!) will have:

>>>

>>> Executing on host: ... xg++ with options ...

>>> spawn -ignore SIGHUP ... xg++ with options ...

>>> .../foo.C:14:15: internal compiler error: in poplevel_class, at cp/name-lookup.c:4225

>>> <backtrace>

>>> compiler exited with status 1

>>> XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

>>> PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

>>>

>>> Which one of these would you not like to see?

>>

>> Neither of these is solving the issue.  How do I find the ICES that are

>> unexpected, without tripping over the ICEs that are expected?

>>

>>> Can you give me more details?  Hopefully we'll work something out that doesn't

>>> break your workflow.

>>

>> sure.

>> * develop patch

>> * run testsuite

>> * observe unexpected ICEs

>> * load g++.log into editor

>> * ^sinternal comp

>> * gets to first unexpected ICE

>> * debug it.

>>

>> What does '^sinternal comp' become?  As there could be many expected ICEs

>> it'll be painful to determine whether any particular utterance of 'internal

>> compiler' is expected or not.

> 

> That is a problem I don't know how to deal with.  I know how to pass

> additional options to the compiler from dejagnu.  I thought maybe I could use

> -pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is

> ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm

> stuck.

> 

> Though, you could just grep for '^FAIL.*internal comp', which will find the

> first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will

> be shown in 'Excess errors:'.  Won't that work?


Read what I wrote:
 >> * load g++.log into editor

 >> * ^sinternal comp


are you telling me not to use an editor for this task?  the search is so 
one can get the command line.  Grepping the log file will not do that.

while solveable, this seems to be the equivalent of putting stones in 
ones shoe.  an annoyance one could do without.  I'm sadly not convinced 
the goal is worth it.

nathan
-- 
Nathan Sidwell
David Malcolm via Gcc-patches Aug. 7, 2020, 2:18 p.m. | #38
On Fri, Aug 07, 2020 at 09:21:27AM -0400, Nathan Sidwell wrote:
> On 8/6/20 6:55 PM, Marek Polacek wrote:

> > On Thu, Aug 06, 2020 at 10:01:37AM -0400, Nathan Sidwell wrote:

> > > On 8/5/20 7:29 PM, Marek Polacek wrote:

> > > > On Wed, Aug 05, 2020 at 11:03:08AM -0400, Nathan Sidwell wrote:

> > > > > On 8/4/20 8:54 PM, Marek Polacek via Gcc-patches wrote:

> > > > > > On Tue, Aug 04, 2020 at 03:33:23PM -0700, Mike Stump wrote:

> > > > > > > I think the read of the room is that people think it would be generally useful, so let approve the general plan.

> > > > > > 

> > > > > > Cool.

> > > > > > 

> > > > > > > So, now we are down to the fine details.  Please do see just how far you can stretch the existing mechanisms to cover what you need to do.  I think the existing mechanisms should be able to cover it all; but the devil is in the details and those matter.

> > > > > > 

> > > > > > At this point I'm only proposing one new directive, dg-ice.  I think we can't

> > > > > > really do without it.  The other one was a matter of convenience.

> > > > > 

> > > > > I've realized I have a concern.  Grepping (or searching in an editor buffer)

> > > > > the log file for 'internal compiler error' to find actual regressions is a

> > > > > thing I want to still be able to do (perhaps with alternative spelling, I

> > > > > don't care).  I don't want to see the ICEs of tests that are expected to

> > > > > ICE.

> > > > > 

> > > > > I think that means there has to be a positive marker on the unexpected ICEs,

> > > > > rather than lack of an expected marker on them.

> > > > 

> > > > Hmm, by the log file you mean g++.log?  Currently, if you run a dg-ice test,

> > > > and the test still ICEs, the g++.log file (but not the stdout of make

> > > > check-c++!) will have:

> > > > 

> > > > Executing on host: ... xg++ with options ...

> > > > spawn -ignore SIGHUP ... xg++ with options ...

> > > > .../foo.C:14:15: internal compiler error: in poplevel_class, at cp/name-lookup.c:4225

> > > > <backtrace>

> > > > compiler exited with status 1

> > > > XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

> > > > PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

> > > > 

> > > > Which one of these would you not like to see?

> > > 

> > > Neither of these is solving the issue.  How do I find the ICES that are

> > > unexpected, without tripping over the ICEs that are expected?

> > > 

> > > > Can you give me more details?  Hopefully we'll work something out that doesn't

> > > > break your workflow.

> > > 

> > > sure.

> > > * develop patch

> > > * run testsuite

> > > * observe unexpected ICEs

> > > * load g++.log into editor

> > > * ^sinternal comp

> > > * gets to first unexpected ICE

> > > * debug it.

> > > 

> > > What does '^sinternal comp' become?  As there could be many expected ICEs

> > > it'll be painful to determine whether any particular utterance of 'internal

> > > compiler' is expected or not.

> > 

> > That is a problem I don't know how to deal with.  I know how to pass

> > additional options to the compiler from dejagnu.  I thought maybe I could use

> > -pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is

> > ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm

> > stuck.

> > 

> > Though, you could just grep for '^FAIL.*internal comp', which will find the

> > first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will

> > be shown in 'Excess errors:'.  Won't that work?

> 

> Read what I wrote:

> >> * load g++.log into editor

> >> * ^sinternal comp

> 

> are you telling me not to use an editor for this task?  the search is so one

> can get the command line.  Grepping the log file will not do that.


No, I'm saying that looking for '^FAIL.*internal comp' (in an editor) instead
of just 'internal comp' will get you to the ICEs you care about.  In vim,
'/^FAIL.*in' gets me there.

If that is still too much, we could maybe add a new diagnostic kind, like
DK_EXPECTED_ICE that has different wording than DK_ICE, and have dg-ice
use it via some option/envvar.

> while solveable, this seems to be the equivalent of putting stones in ones

> shoe.  an annoyance one could do without.  I'm sadly not convinced the goal

> is worth it.


If this is an annoyance to people, I'll just go back to my own repo.

Marek
Martin Liška Aug. 10, 2020, 3:04 a.m. | #39
On 8/5/20 12:22 AM, Marek Polacek wrote:
> On Thu, Jul 30, 2020 at 11:08:03AM +0200, Martin Liška wrote:

>> Hello.

>>

>> I support the initiative!

>> What would be nice to add leading 'PR component/12345'

>> to a git commit so that these test additions are linked to bugzilla issues.

> 

> Thanks!  Yes, it should be clear which test tests a PR that has the monitored

> keyword.  That may get lost when adding a lot of tests in one commit, but can

> always be clarified in a comment.


Note that contrib/mklog.py can correctly extract leading 'PR component/12345' lines
from test-cases and add them to a ChangeLog entry.

Martin

> Or just grep 'PR component/12345' in the

> testsuite; new tests should have this as their first line.

> 

> Marek

>
Richard Sandiford Aug. 10, 2020, 8:48 a.m. | #40
Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > > sure.

>> > > * develop patch

>> > > * run testsuite

>> > > * observe unexpected ICEs

>> > > * load g++.log into editor

>> > > * ^sinternal comp

>> > > * gets to first unexpected ICE

>> > > * debug it.

>> > > 

>> > > What does '^sinternal comp' become?  As there could be many expected ICEs

>> > > it'll be painful to determine whether any particular utterance of 'internal

>> > > compiler' is expected or not.

>> > 

>> > That is a problem I don't know how to deal with.  I know how to pass

>> > additional options to the compiler from dejagnu.  I thought maybe I could use

>> > -pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is

>> > ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm

>> > stuck.

>> > 

>> > Though, you could just grep for '^FAIL.*internal comp', which will find the

>> > first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will

>> > be shown in 'Excess errors:'.  Won't that work?

>> 

>> Read what I wrote:

>> >> * load g++.log into editor

>> >> * ^sinternal comp

>> 

>> are you telling me not to use an editor for this task?  the search is so one

>> can get the command line.  Grepping the log file will not do that.

>

> No, I'm saying that looking for '^FAIL.*internal comp' (in an editor) instead

> of just 'internal comp' will get you to the ICEs you care about.  In vim,

> '/^FAIL.*in' gets me there.


Yeah, this works.  I'm not sure whether Nathan's ^s is isearch
or regexp-isearch :-)

> If that is still too much, we could maybe add a new diagnostic kind, like

> DK_EXPECTED_ICE that has different wording than DK_ICE, and have dg-ice

> use it via some option/envvar.


Seems unnecessary given the above.

>> while solveable, this seems to be the equivalent of putting stones in ones

>> shoe.  an annoyance one could do without.  I'm sadly not convinced the goal

>> is worth it.

>

> If this is an annoyance to people, I'll just go back to my own repo.


Please don't!  I'm sure we can find something for Nathan's use case,
if your regexp search isn't enough.

How do things stand with the new directives?  Seems like there are no
objections to dg-ice, but I lost track of the others.  Personally I'm
not interested in any solution that replaces one of your new directives
with two existing directives: one (sub)test should be one directive as
far as possible.  So IMO we should add every directive that doesn't
already have a one-for-one analogue.

Happy to review any updated patch :-)

Thanks,
Richard
Nathan Sidwell Aug. 10, 2020, 12:58 p.m. | #41
On 8/10/20 4:48 AM, Richard Sandiford wrote:
> Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

>>>>> sure.

>>>>> * develop patch

>>>>> * run testsuite

>>>>> * observe unexpected ICEs

>>>>> * load g++.log into editor

>>>>> * ^sinternal comp

>>>>> * gets to first unexpected ICE

>>>>> * debug it.

>>>>>

>>>>> What does '^sinternal comp' become?  As there could be many expected ICEs

>>>>> it'll be painful to determine whether any particular utterance of 'internal

>>>>> compiler' is expected or not.

>>>>

>>>> That is a problem I don't know how to deal with.  I know how to pass

>>>> additional options to the compiler from dejagnu.  I thought maybe I could use

>>>> -pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is

>>>> ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm

>>>> stuck.

>>>>

>>>> Though, you could just grep for '^FAIL.*internal comp', which will find the

>>>> first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will

>>>> be shown in 'Excess errors:'.  Won't that work?

>>>

>>> Read what I wrote:

>>>>> * load g++.log into editor

>>>>> * ^sinternal comp

>>>

>>> are you telling me not to use an editor for this task?  the search is so one

>>> can get the command line.  Grepping the log file will not do that.

>>

>> No, I'm saying that looking for '^FAIL.*internal comp' (in an editor) instead

>> of just 'internal comp' will get you to the ICEs you care about.  In vim,

>> '/^FAIL.*in' gets me there.

> 

> Yeah, this works.  I'm not sure whether Nathan's ^s is isearch

> or regexp-isearch :-)


It was incremental :) it;s kind of nice just to stop typing when it hits 
the first ICE message!  But, I'm not going to object any more, I guess 
I'll live with it. </grumpy old man>

[I'm sure I can figure out some emacs macro binding to DTRT]

nathan

-- 
Nathan Sidwell
David Malcolm via Gcc-patches Aug. 10, 2020, 9:30 p.m. | #42
On Mon, Aug 10, 2020 at 09:48:06AM +0100, Richard Sandiford wrote:
> Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> >> > > sure.

> >> > > * develop patch

> >> > > * run testsuite

> >> > > * observe unexpected ICEs

> >> > > * load g++.log into editor

> >> > > * ^sinternal comp

> >> > > * gets to first unexpected ICE

> >> > > * debug it.

> >> > > 

> >> > > What does '^sinternal comp' become?  As there could be many expected ICEs

> >> > > it'll be painful to determine whether any particular utterance of 'internal

> >> > > compiler' is expected or not.

> >> > 

> >> > That is a problem I don't know how to deal with.  I know how to pass

> >> > additional options to the compiler from dejagnu.  I thought maybe I could use

> >> > -pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is

> >> > ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm

> >> > stuck.

> >> > 

> >> > Though, you could just grep for '^FAIL.*internal comp', which will find the

> >> > first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will

> >> > be shown in 'Excess errors:'.  Won't that work?

> >> 

> >> Read what I wrote:

> >> >> * load g++.log into editor

> >> >> * ^sinternal comp

> >> 

> >> are you telling me not to use an editor for this task?  the search is so one

> >> can get the command line.  Grepping the log file will not do that.

> >

> > No, I'm saying that looking for '^FAIL.*internal comp' (in an editor) instead

> > of just 'internal comp' will get you to the ICEs you care about.  In vim,

> > '/^FAIL.*in' gets me there.

> 

> Yeah, this works.  I'm not sure whether Nathan's ^s is isearch

> or regexp-isearch :-)

> 

> > If that is still too much, we could maybe add a new diagnostic kind, like

> > DK_EXPECTED_ICE that has different wording than DK_ICE, and have dg-ice

> > use it via some option/envvar.

> 

> Seems unnecessary given the above.


Yeah, I would like to avoid adding hacks like that.

> >> while solveable, this seems to be the equivalent of putting stones in ones

> >> shoe.  an annoyance one could do without.  I'm sadly not convinced the goal

> >> is worth it.

> >

> > If this is an annoyance to people, I'll just go back to my own repo.

> 

> Please don't!  I'm sure we can find something for Nathan's use case,

> if your regexp search isn't enough.

> 

> How do things stand with the new directives?  Seems like there are no

> objections to dg-ice, but I lost track of the others.  Personally I'm

> not interested in any solution that replaces one of your new directives

> with two existing directives: one (sub)test should be one directive as

> far as possible.  So IMO we should add every directive that doesn't

> already have a one-for-one analogue.


Thanks a lot.  Here's the latest version of my patch, which only adds dg-ice
at this point.

I do agree with adding useful directives that don't have an existing
variant, or whose effect could only be achieved by a cumbersome combination
of existing directives.  Perhaps we'll realize we need another one down the
line.

Here's the patch.  Tested by adding two same foo.C and foo2.C tests that
ICE, only foo.C has { dg-ice "" }, and running

$ make check-c++ RUNTESTFLAGS=dg.exp=foo*.C

The result is

FAIL: g++.dg/foo2.C  -std=c++14 (internal compiler error)
FAIL: g++.dg/foo2.C  -std=c++14 (test for excess errors)

so only the unexpected ICE is displayed.

So, um, OK?  Mike, would you be able to take a look at this patch?

I guess it wouldn't hurt to mention this in
https://gcc.gnu.org/codingconventions.html#Testsuite

Thanks,

-- >8 --
This patch adds a new DejaGNU directive, dg-ice, as outlined in the
proposal here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550913.html

It means that it's expected that the compiler crashes with an internal
compiler error when compiling test with such a directive.

A minor optimization could be to use -pass-exit-codes and then check for
ICE_EXIT_CODE return code instead of using string match.

gcc/ChangeLog:

	* doc/sourcebuild.texi: Document dg-ice.

gcc/testsuite/ChangeLog:

	* lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice.
	(cleanup-after-saved-dg-test): Reset expect_ice.
	* lib/prune.exp (prune_ices): New.
	* lib/target-supports-dg.exp (dg-ice): New.
---
 gcc/doc/sourcebuild.texi                 | 10 +++++++++
 gcc/testsuite/lib/gcc-dg.exp             | 20 +++++++++++++++--
 gcc/testsuite/lib/prune.exp              |  9 ++++++++
 gcc/testsuite/lib/target-supports-dg.exp | 28 ++++++++++++++++++++++++
 4 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 63216a0daba..967cb135cb4 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the conditions (which are
 the same as for @code{dg-skip-if}) are met.
 @end table
 
+@subsubsection Expect the compiler to crash
+
+@table @code
+@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}
+Expect the compiler to crash with an internal compiler error and return
+a nonzero exit status if the conditions (which are the same as for
+@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been
+fixed yet.
+@end table
+
 @subsubsection Expect the test executable to fail
 
 @table @code
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 45d97024883..e8ad3052657 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -308,13 +308,27 @@ proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {
     verbose "$target_compile $prog $output_file $compile_type $options" 4
     set comp_output [$target_compile "$prog" "$output_file" "$compile_type" $options]
 
+    global expect_ice
     # Look for an internal compiler error, which sometimes masks the fact
     # that we didn't get an expected error message.  XFAIL an ICE via
     # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }
     # to avoid a second failure for excess errors.
-    if [string match "*internal compiler error*" $comp_output] {
+    # "Error reporting routines re-entered" ICE says "Internal" rather than
+    # "internal", so match that too.
+    if [string match {*[Ii]nternal compiler error*} $comp_output] {
 	upvar 2 name name
-	fail "$name (internal compiler error)"
+	if { $expect_ice == 0 } {
+	  fail "$name (internal compiler error)"
+	} else {
+	  # We expected an ICE and we got it.
+	  xfail "$name (internal compiler error)"
+	  # Prune the ICE from the output.
+	  set comp_output [prune_ices $comp_output]
+	}
+    } elseif { $expect_ice == 1 } {
+	upvar 2 name name
+	# We expected an ICE but we didn't get it.
+	xpass "$name (internal compiler error)"
     }
 
     if { $do_what == "repo" } {
@@ -939,6 +953,7 @@ if { [info procs saved-dg-test] == [list] } {
 	global additional_prunes
 	global compiler_conditional_xfail_data
 	global shouldfail
+	global expect_ice
 	global testname_with_flags
 	global set_target_env_var
 	global set_compiler_env_var
@@ -954,6 +969,7 @@ if { [info procs saved-dg-test] == [list] } {
 	set additional_sources_used ""
 	set additional_prunes ""
 	set shouldfail 0
+	set expect_ice 0
 	if [info exists set_target_env_var] {
 	    unset set_target_env_var
 	}
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 1c776249f1a..58a739684a5 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -118,6 +118,15 @@ proc prune_file_path { text } {
     return $text
 }
 
+# Prune internal compiler error messages, including the "Please submit..."
+# footnote.
+
+proc prune_ices { text } {
+  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for instructions\[^\n\]*" $text "" text
+  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" $text "" text
+  return $text
+}
+
 # Provide a definition of this if missing (delete after next dejagnu release).
 
 if { [info procs prune_warnings] == "" } then {
diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp
index 2a21424b890..5bb99f4e8f9 100644
--- a/gcc/testsuite/lib/target-supports-dg.exp
+++ b/gcc/testsuite/lib/target-supports-dg.exp
@@ -495,6 +495,34 @@ proc dg-shouldfail { args } {
     }
 }
 
+# Record whether the compiler is expected (at the moment) to ICE.
+# Used for tests that test bugs that have not been fixed yet.
+
+set expect_ice 0
+
+proc dg-ice { args } {
+    # Don't bother if we're already skipping the test.
+    upvar dg-do-what dg-do-what
+    if { [lindex ${dg-do-what} 1] == "N" } {
+      return
+    }
+
+    global expect_ice
+
+    set args [lreplace $args 0 0]
+    if { [llength $args] > 1 } {
+	set selector [list target [lindex $args 1]]
+	if { [dg-process-target-1 $selector] == "S" } {
+	    # The target matches, now check the flags.
+	    if [check-flags $args] {
+		set expect_ice 1
+	    }
+	}
+    } else {
+	set expect_ice 1
+    }
+}
+
 # Intercept the call to the DejaGnu version of dg-process-target to
 # support use of an effective-target keyword in place of a list of
 # target triplets to xfail or skip a test.

base-commit: e4ced0b60ccb4c944970304cf74f1ee9086e5553
-- 
2.26.2
David Malcolm via Gcc-patches Aug. 10, 2020, 9:36 p.m. | #43
On Mon, Aug 10, 2020 at 08:58:54AM -0400, Nathan Sidwell wrote:
> On 8/10/20 4:48 AM, Richard Sandiford wrote:

> > Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> > > > > > sure.

> > > > > > * develop patch

> > > > > > * run testsuite

> > > > > > * observe unexpected ICEs

> > > > > > * load g++.log into editor

> > > > > > * ^sinternal comp

> > > > > > * gets to first unexpected ICE

> > > > > > * debug it.

> > > > > > 

> > > > > > What does '^sinternal comp' become?  As there could be many expected ICEs

> > > > > > it'll be painful to determine whether any particular utterance of 'internal

> > > > > > compiler' is expected or not.

> > > > > 

> > > > > That is a problem I don't know how to deal with.  I know how to pass

> > > > > additional options to the compiler from dejagnu.  I thought maybe I could use

> > > > > -pass-exit-codes, redirect stderr to /dev/null, and check if the exit code is

> > > > > ICE_EXIT_CODE, but there seems to be no way to do that redirection.  So I'm

> > > > > stuck.

> > > > > 

> > > > > Though, you could just grep for '^FAIL.*internal comp', which will find the

> > > > > first unexpected ICE.  Contrary to the expected ICEs, the unexpected ICEs will

> > > > > be shown in 'Excess errors:'.  Won't that work?

> > > > 

> > > > Read what I wrote:

> > > > > > * load g++.log into editor

> > > > > > * ^sinternal comp

> > > > 

> > > > are you telling me not to use an editor for this task?  the search is so one

> > > > can get the command line.  Grepping the log file will not do that.

> > > 

> > > No, I'm saying that looking for '^FAIL.*internal comp' (in an editor) instead

> > > of just 'internal comp' will get you to the ICEs you care about.  In vim,

> > > '/^FAIL.*in' gets me there.

> > 

> > Yeah, this works.  I'm not sure whether Nathan's ^s is isearch

> > or regexp-isearch :-)

> 

> It was incremental :) it;s kind of nice just to stop typing when it hits the

> first ICE message!  But, I'm not going to object any more, I guess I'll live

> with it. </grumpy old man>


Sorry.

> [I'm sure I can figure out some emacs macro binding to DTRT]


I'm sure you could.  If it becomes painful, I could also add
GCC_TESTSUITE_SKIP_ICE_TESTS (disabled by default) or similar to skip dg-ice
tests.

Marek
David Malcolm via Gcc-patches Aug. 10, 2020, 10:35 p.m. | #44
On Aug 7, 2020, at 6:16 AM, Nathan Sidwell <nathan@acm.org> wrote:
> 

> On 8/6/20 8:01 PM, Mike Stump wrote:

>> On Aug 6, 2020, at 7:01 AM, Nathan Sidwell <nathan@acm.org> wrote:

>>> 

>>>> XFAIL: g++.dg/foo.C  -std=c++17 (internal compiler error)

>>>> PASS: g++.dg/foo.C  -std=c++17 (test for excess errors)

>>>> Which one of these would you not like to see?

>>> 

>>> Neither of these is solving the issue.  How do I find the ICES that are unexpected, without tripping over the ICEs that are expected?

>> Don't you already have this issue for current xfailed ICEs?  ^FAIL.*internal compiler error I think finds them, doesn't it?

> 

> Are there xfailed ICEs?  I've not run into them?


I certainly have seen them in various forms and in various places over the years; though, you do raise the interesting point, it may only be theory if things work well on all the platforms of interest.  So, yeah, adding new ICEs can make the search regex longer in practice.
David Malcolm via Gcc-patches Aug. 10, 2020, 10:58 p.m. | #45
On Aug 10, 2020, at 2:30 PM, Marek Polacek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 

> Thanks a lot.  Here's the latest version of my patch, which only adds dg-ice

> at this point.

> 

> So, um, OK?


Ok.

Patch

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index a7a922d84a2..636d21d30dd 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1172,6 +1172,16 @@  Expect the execute step of a test to fail if the conditions (which are
 the same as for @code{dg-skip-if}) are met.
 @end table
 
+@subsubsection Expect the compiler to crash
+
+@table @code
+@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}
+Expect the compiler to crash with an internal compiler error and return
+a nonzero exit status if the conditions (which are the same as for
+@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been
+fixed yet.
+@end table
+
 @subsubsection Expect the test executable to fail
 
 @table @code
@@ -1234,6 +1244,15 @@  has the same effect as @samp{target}.
 
 @item @{ dg-prune-output @var{regexp} @}
 Prune messages matching @var{regexp} from the test output.
+
+@table @code
+@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}
+Expect the compiler to accept the test (even though it should be rejected with
+a compile-time error), if the conditions (which are the same as for
+@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been
+fixed yet.
+@end table
+
 @end table
 
 @subsubsection Verify output of the test executable
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 45d97024883..6478eda283b 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -308,13 +308,48 @@  proc gcc-dg-test-1 { target_compile prog do_what extra_tool_flags } {
     verbose "$target_compile $prog $output_file $compile_type $options" 4
     set comp_output [$target_compile "$prog" "$output_file" "$compile_type" $options]
 
+    global expect_ice
     # Look for an internal compiler error, which sometimes masks the fact
     # that we didn't get an expected error message.  XFAIL an ICE via
     # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }
     # to avoid a second failure for excess errors.
-    if [string match "*internal compiler error*" $comp_output] {
+    # "Error reporting routines re-entered" ICE says "Internal" rather than
+    # "internal", so match that too.
+    if [string match {*[Ii]nternal compiler error*} $comp_output] {
 	upvar 2 name name
-	fail "$name (internal compiler error)"
+	if { $expect_ice == 0 } {
+	  fail "$name (internal compiler error)"
+	} else {
+	  # We expected an ICE and we got it.  Emit an XFAIL.
+	  setup_xfail "*-*-*"
+	  fail "$name (internal compiler error)"
+	  clear_xfail "*-*-*"
+	  # Prune the ICE from the output.
+	  set comp_output [prune_ices $comp_output]
+	}
+    } else {
+	upvar 2 name name
+	global accepts_invalid
+	if { $expect_ice == 1 } {
+	  # We expected an ICE but we didn't get it.  We want an XPASS, so
+	  # call setup_xfail to set xfail_flag.
+	  setup_xfail "*-*-*"
+	  pass "$name (internal compiler error)"
+	  clear_xfail "*-*-*"
+	} elseif { $accepts_invalid == 1 } {
+	    if [string match {*error: *} $comp_output] {
+	      # We expected that this test be (wrongly) accepted, but now we have
+	      # seen error(s).  Issue an XPASS to signal that.
+	      setup_xfail "*-*-*"
+	      pass "$name (accepts invalid)"
+	      clear_xfail "*-*-*"
+	    } else {
+	      # This test is still (wrongly) accepted.  Just emit an XFAIL.
+	      setup_xfail "*-*-*"
+	      fail "$name (accepts invalid)"
+	      clear_xfail "*-*-*"
+	    }
+	}
     }
 
     if { $do_what == "repo" } {
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 1c776249f1a..58a739684a5 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -118,6 +118,15 @@  proc prune_file_path { text } {
     return $text
 }
 
+# Prune internal compiler error messages, including the "Please submit..."
+# footnote.
+
+proc prune_ices { text } {
+  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for instructions\[^\n\]*" $text "" text
+  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" $text "" text
+  return $text
+}
+
 # Provide a definition of this if missing (delete after next dejagnu release).
 
 if { [info procs prune_warnings] == "" } then {
diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp
index 2a21424b890..765f3a2e27a 100644
--- a/gcc/testsuite/lib/target-supports-dg.exp
+++ b/gcc/testsuite/lib/target-supports-dg.exp
@@ -495,6 +495,75 @@  proc dg-shouldfail { args } {
     }
 }
 
+# Record whether the compiler is expected (at the moment) to ICE.
+# Used for tests that test bugs that have not been fixed yet.
+
+set expect_ice 0
+set accepts_invalid 0
+
+proc dg-ice { args } {
+    # Don't bother if we're already skipping the test.
+    upvar dg-do-what dg-do-what
+    if { [lindex ${dg-do-what} 1] == "N" } {
+      return
+    }
+
+    global accepts_invalid
+    # Can't be combined with dg-accepts-invalid.
+    if { $accepts_invalid == 1 } {
+      error "dg-ice: cannot be combined with dg-accepts-invalid"
+      return
+    }
+
+    global expect_ice
+
+    set args [lreplace $args 0 0]
+    if { [llength $args] > 1 } {
+	set selector [list target [lindex $args 1]]
+	if { [dg-process-target-1 $selector] == "S" } {
+	    # The target matches, now check the flags.
+	    if [check-flags $args] {
+		set expect_ice 1
+	    }
+	}
+    } else {
+	set expect_ice 1
+    }
+}
+
+# Record whether the compiler should reject the testcase with an error,
+# but currently doesn't do so.  Used for accepts-invalid bugs.
+
+proc dg-accepts-invalid { args } {
+    # Don't bother if we're already skipping the test.
+    upvar dg-do-what dg-do-what
+    if { [lindex ${dg-do-what} 1] == "N" } {
+      return
+    }
+
+    global expect_ice
+    # Can't be combined with dg-ice.
+    if { $expect_ice == 1 } {
+      error "dg-accepts-invalid: cannot be combined with dg-ice"
+      return
+    }
+
+    global accepts_invalid
+
+    set args [lreplace $args 0 0]
+    if { [llength $args] > 1 } {
+	set selector [list target [lindex $args 1]]
+	if { [dg-process-target-1 $selector] == "S" } {
+	    # The target matches, now check the flags.
+	    if [check-flags $args] {
+		set accepts_invalid 1
+	    }
+	}
+    } else {
+	set accepts_invalid 1
+    }
+}
+
 # Intercept the call to the DejaGnu version of dg-process-target to
 # support use of an effective-target keyword in place of a list of
 # target triplets to xfail or skip a test.