[RFC] rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

Message ID 20200626201118.zqmdpp4yx3jglrjg@work-tp
State New

Commit Message

Raoni Fassina Firmino via Gcc-patches June 26, 2020, 8:11 p.m.
Hi all,


This is an early draft I'm working on to add fegetround, feclearexcept
and feraiseexcept as builtins on rs6000.  This is my first patch so I
welcome any and all feedback.  Foremost I have some questions to ask as
I got stuck on some problems.


Q1) How to implement a target-specific builtin for a standard C
    function?

More specifically, how do I make gcc use an rs6000 builtin for a
standard C function? Right now, I am getting a double definition of the
builtin.  I don't know if "define" is the right word for it; maybe
"register an implementation"?

The context is that I am creating builtin optimizations for fegetround,
feclearexcept and feraiseexcept.  Early on I discovered that
gcc/builtins.def defines builtins for the whole C library without
actually implementing them, and trying to redefine them in
gcc/config/rs6000/rs6000-builtin.def ends up with a name clash.  So I
implemented the builtins with a suffix in their names and pushed this
problem for later...  And that later time is now.

I tried my best to find something about it in the gcc internals
documentation but I may have missed it.

So this is my question: how do I link the builtin defined in
gcc/builtins.def to my implementation on rs6000? If someone has a
pointer about it or a patch that does it for some other C function (in
any target architecture) that would be great.


Q2) How to fall back to the default behavior of the function call when
    the builtin is not suitable for the parameters?

Here, it is more specifically for feclearexcept and feraiseexcept.  The
builtin should only be used when the input parameter is a constant with
exactly one bit set (to work on only one exception).  Right now, I do
the check correctly and it works (I validate the builtins using a name
suffix to avoid the problem mentioned in Q1), but it aborts when the
input is not valid instead of falling back to a function call.
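
To make the Q2 condition concrete, here is a minimal standalone sketch of
the check in plain C, outside of GCC.  The function names are
hypothetical, and the constant 0x20000000 is taken from the patch, where
the commented-out line suggests it stands for FE_INVALID; treat that
mapping as an assumption.  The leading-zero count mirrors what the patch
passes to mtfsb0/mtfsb1 as the bit number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value rejected by the patch; assumed here to be FE_INVALID on powerpc.  */
#define REJECTED_MASK 0x20000000u

/* True when MASK has exactly one bit set and is not the rejected value,
   i.e. when a single bit-level instruction could handle it.  */
static bool
single_exception_bit_p (uint32_t mask)
{
  return mask != 0
         && (mask & (mask - 1)) == 0
         && mask != REJECTED_MASK;
}

/* Number of leading zero bits in a 32-bit MASK, which is the bit number
   counted from the most significant end.  Returns 32 for a zero mask.  */
static int
leading_zero_count (uint32_t mask)
{
  int n = 0;
  while (n < 32 && (mask & 0x80000000u) == 0)
    {
      mask <<= 1;
      n++;
    }
  return n;
}
```

A caller would test single_exception_bit_p first and only then use
leading_zero_count as the bit number; any other input would take the
ordinary library call.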


Q3) Are the implementations for the builtins more or less in the
    right places?

The first one I did was fegetround and I based it on ppc_get_timebase
and other related builtins, so I used a define_expand in rs6000.md.  But
when I was working on the fe*except builtins I based them on other
builtins and ended up implementing it all in rs6000-call.c, and I am not
sure if there is a canonical way of doing it one way or the other.


o/
Raoni Fassina Firmino

---- 8< ----

These optimizations were originally in glibc, but were removed, and it
was suggested that they would be a good fit as gcc builtins[1].

The associated bug report: PR target/94193

[1] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html
    https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html

Signed-off-by: Raoni Fassina Firmino <raoni@linux.ibm.com>

---
 gcc/config/rs6000/rs6000-builtin.def | 13 ++++++
 gcc/config/rs6000/rs6000-call.c      | 69 ++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000.md          | 18 ++++++++
 3 files changed, 100 insertions(+)

-- 
2.26.2

Comments

Richard Biener via Gcc-patches June 29, 2020, 6:49 a.m. | #1
On Fri, Jun 26, 2020 at 10:12 PM Raoni Fassina Firmino via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
> Hi all,
>
> This is an early draft I'm working on to add fegetround, feclearexcept
> and feraiseexcept as builtins on rs6000.  This is my first patch so I
> welcome any and all feedback.  Foremost I have some questions to ask as
> I got stuck on some problems.
>
> Q1) How to implement a target-specific builtin for a standard C
>     function?
>
> More specifically, how do I make gcc use an rs6000 builtin for a
> standard C function? Right now, I am getting a double definition of the
> builtin.  I don't know if "define" is the right word for it; maybe
> "register an implementation"?
>
> The context is that I am creating builtin optimizations for fegetround,
> feclearexcept and feraiseexcept.  Early on I discovered that
> gcc/builtins.def defines builtins for the whole C library without
> actually implementing them, and trying to redefine them in
> gcc/config/rs6000/rs6000-builtin.def ends up with a name clash.  So I
> implemented the builtins with a suffix in their names and pushed this
> problem for later...  And that later time is now.
>
> I tried my best to find something about it in the gcc internals
> documentation but I may have missed it.
>
> So this is my question: how do I link the builtin defined in
> gcc/builtins.def to my implementation on rs6000? If someone has a
> pointer about it or a patch that does it for some other C function (in
> any target architecture) that would be great.
>
> Q2) How to fall back to the default behavior of the function call when
>     the builtin is not suitable for the parameters?
>
> Here, it is more specifically for feclearexcept and feraiseexcept.  The
> builtin should only be used when the input parameter is a constant with
> exactly one bit set (to work on only one exception).  Right now, I do
> the check correctly and it works (I validate the builtins using a name
> suffix to avoid the problem mentioned in Q1), but it aborts when the
> input is not valid instead of falling back to a function call.
>
> Q3) Are the implementations for the builtins more or less in the
>     right places?
>
> The first one I did was fegetround and I based it on ppc_get_timebase
> and other related builtins, so I used a define_expand in rs6000.md.  But
> when I was working on the fe*except builtins I based them on other
> builtins and ended up implementing it all in rs6000-call.c, and I am not
> sure if there is a canonical way of doing it one way or the other.

GCC already knows the fe* builtins; what GCC does not yet have is
a way for targets to specify custom expansion of them.  So instead
of adding powerpc-specific builtins you should add optabs for the
RTL expansion part.

I'm not sure if the actual choice of macro values for the fe* builtins
needs glue logic or if we want them to be determined statically
by the target configuration - see how we handle folding of
fpclassify.  At least without -frounding-math, fegetround could be
constant-folded to FE_TONEAREST, for which we'd need the
actual value of FE_TONEAREST.

Richard.

>
> o/
> Raoni Fassina Firmino
>
> ---- 8< ----
>
> These optimizations were originally in glibc, but were removed, and it
> was suggested that they would be a good fit as gcc builtins[1].
>
> The associated bug report: PR target/94193
>
> [1] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html
>     https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html
>
> Signed-off-by: Raoni Fassina Firmino <raoni@linux.ibm.com>

Segher Boessenkool June 29, 2020, 8:37 p.m. | #2
Hi!

On Mon, Jun 29, 2020 at 08:49:05AM +0200, Richard Biener via Gcc-patches wrote:
> On Fri, Jun 26, 2020 at 10:12 PM Raoni Fassina Firmino via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> > This is an early draft I'm working on to add fegetround, feclearexcept
> > and feraiseexcept as builtins on rs6000.  This is my first patch so I
> > welcome any and all feedback.  Foremost I have some questions to ask as
> > I got stuck on some problems.

> > Q2) How to fall back to the default behavior of the function call when
> >     the builtin is not suitable for the parameters?


In general, in the expander you can do FAIL in such cases.  For some
patterns that isn't allowed, and you have to copy everything the
standard implementation does to your specialized implementation.  This
of course doesn't scale, and will make you miss all future changes to
the standard implementation.  It is then probably better to do the
work the original implementation skimped on, and *do* allow FAIL.
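
The control flow described here (try the specialized expansion, and on
FAIL let the standard path run) can be modeled in plain C.  This is only
an analogy, not GCC code: status_word stands in for the floating-point
status register, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

static uint32_t status_word;  /* Stand-in for the FP status register.  */
static int generic_calls;     /* Counts how often the fallback path ran.  */

/* Specialized path: handles only a mask with exactly one bit set.
   Returning false plays the role of FAIL in an expander.  */
static bool
try_clear_inline (uint32_t mask)
{
  if (mask == 0 || (mask & (mask - 1)) != 0)
    return false;
  status_word &= ~mask;
  return true;
}

/* Generic path: stands in for the ordinary library function call.  */
static void
clear_generic (uint32_t mask)
{
  generic_calls++;
  status_word &= ~mask;
}

/* Try the specialized path first; fall back when it declines.  */
static void
clear_exceptions (uint32_t mask)
{
  if (!try_clear_inline (mask))
    clear_generic (mask);
}
```

The point of the shape is that the specialized path never has to
duplicate the generic one; it only has to say "not me".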

> > Here, it is more specifically for feclearexcept and feraiseexcept.  The
> > builtin should only be used when the input parameter is a constant with
> > exactly one bit set (to work on only one exception).

rs6000 has exact_log2_cint_operand (and the "N" constraint).

> > Q3) Are the implementations for the builtins more or less in the
> >     right places?
> >
> > The first one I did was fegetround and I based it on ppc_get_timebase
> > and other related builtins, so I used a define_expand in rs6000.md.  But
> > when I was working on the fe*except builtins I based them on other
> > builtins and ended up implementing it all in rs6000-call.c, and I am not
> > sure if there is a canonical way of doing it one way or the other.


Patterns have to go to one of the .md files.  rs6000.md is a fine choice;
but put this somewhere near the other floating point patterns?

> GCC already knows the fe* builtins; what GCC does not yet have is
> a way for targets to specify custom expansion of them.  So instead
> of adding powerpc-specific builtins you should add optabs for the
> RTL expansion part.


Yes.

> > +static rtx
> > +rs6000_expand_feCRexcept_builtin (enum insn_code icode, tree exp, rtx target)

No caMel case please.  "fe" and "cr" do not mean too much here; think of
a nicer name please?  Saving a character or two isn't useful, this is
called in only a few places.

> > +      //|| INTVAL (op0) == FE_INVALID)


Please fix.

> > +;; int __builtin_fegetround()
> > +(define_expand "rs6000_fegetround"
> > +  [(use (match_operand:SI 0 "gpc_reg_operand"))]
> > +  "TARGET_HARD_FLOAT"
> > +{
> > +    rtx tmp_df = gen_reg_rtx (DFmode);
> > +    emit_insn (gen_rs6000_mffsl (tmp_df));
> > +
> > +    rtx tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> > +    rtx tmp_di_2 = gen_reg_rtx (DImode);
> > +    emit_insn (gen_anddi3 (tmp_di_2, tmp_di, GEN_INT (0x3LL)));


Just  GEN_INT (3)  will do fine.
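
For readers following along, the masking step in the quoted expander can
be modeled in plain C: the status register image is read as a 64-bit
value and the two low bits are kept.  The enum values below are assumed
to match the powerpc rounding-mode encodings and are named hypothetically;
check them against the target's fenv.h before relying on them.

```c
#include <stdint.h>

/* Assumed powerpc encodings of the two-bit rounding-mode field.  */
enum
{
  PPC_FE_TONEAREST = 0,
  PPC_FE_TOWARDZERO = 1,
  PPC_FE_UPWARD = 2,
  PPC_FE_DOWNWARD = 3
};

/* Keep only the two low bits of a status register image, as the
   expander does with its AND by 3.  */
static int
rounding_mode_from_fpscr (uint64_t fpscr_image)
{
  return (int) (fpscr_image & 3);
}
```

Since the constant is just 3, no width suffix is needed for the mask.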


Another question.  How do these builtins prevent other FP insns from
being moved (or optimised) "over" them?


Segher
Marc Glisse June 29, 2020, 9:45 p.m. | #3
On Mon, 29 Jun 2020, Segher Boessenkool wrote:

> Another question.  How do these builtins prevent other FP insns from
> being moved (or optimised) "over" them?


At the GIMPLE level they don't. They prevent other function calls from 
moving across, just because function calls where at least one is not pure 
can't cross, but otherwise fenv_access is one big missing feature in gcc. 
I started something last year (and postponed indefinitely for lack of 
time), replacing all FP operations (when the safe mode is enabled) with 
builtins that get expanded by default to insert asm pass-through on the 
arguments and the result.
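
As a rough user-level illustration of the "asm pass-through" idea (not
the unfinished GCC work itself), an empty volatile asm can be placed on
each argument and on the result so the compiler cannot see through the
values.  This is a sketch assuming GCC-style inline asm; opaque and
guarded_add are hypothetical names, and the "+m" constraint is chosen
only because it is portable across targets.

```c
/* Pass X through an empty asm statement.  The "+m" constraint tells the
   compiler the asm may read and modify X in memory, so the value cannot
   be constant-folded or tracked across this point.  */
static inline double
opaque (double x)
{
  __asm__ volatile ("" : "+m" (x));
  return x;
}

/* An addition whose inputs and output are hidden from the optimizer.
   The add depends on the two input barriers and feeds the output
   barrier, so it stays between them.  */
static double
guarded_add (double a, double b)
{
  a = opaque (a);
  b = opaque (b);
  double sum = a + b;
  return opaque (sum);
}
```

This costs a round trip through memory per operand, which is part of
why a real implementation would want compiler support instead.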

-- 
Marc Glisse
Segher Boessenkool June 29, 2020, 10:02 p.m. | #4
On Mon, Jun 29, 2020 at 11:45:41PM +0200, Marc Glisse wrote:
> On Mon, 29 Jun 2020, Segher Boessenkool wrote:
>
> >Another question.  How do these builtins prevent other FP insns from
> >being moved (or optimised) "over" them?
>
> At the GIMPLE level they don't.


And not at RTL level, either.

> They prevent other function calls from
> moving across, just because function calls where at least one is not pure
> can't cross, but otherwise fenv_access is one big missing feature in gcc.
> I started something last year (and postponed indefinitely for lack of
> time), replacing all FP operations (when the safe mode is enabled) with
> builtins that get expanded by default to insert asm pass-through on the
> arguments and the result.


Yes, it is an ancient missing feature, and still very relevant.  Thanks
for any attempt you made / are making / will make to make this better!

My fear is that if we optimise the floating env access better, that then
fewer bad transforms are accidentally prevented :-/


Segher
Joseph Myers June 30, 2020, 12:19 a.m. | #5
On Mon, 29 Jun 2020, Richard Biener via Gcc-patches wrote:

> I'm not sure if the actual choice of macro values for the fe* builtins
> needs glue logic or if we want them to be determined statically
> by the target configuration - see how we handle folding of
> fpclassify.  At least without -frounding-math, fegetround could be
> constant-folded to FE_TONEAREST, for which we'd need the
> actual value of FE_TONEAREST.


In most cases, target architectures have fixed values for the exceptions, 
independent of the target OS, but on SPARC there is OS dependence, which 
the SPARC_LOW_FE_EXCEPT_VALUES macro deals with (for the interface between 
atomic compound assignment and libatomic's __atomic_feraiseexcept).  I'd 
guess they tend to have fixed values for the rounding modes as well.

If GCC had a built-in fegetround that was always expanded inline and 
always knew the correct value for each rounding mode, that would allow 
fixing bug 30569 (making FLT_ROUNDS depend on the rounding mode at runtime 
- note that expanding the FLT_ROUNDS macro mustn't introduce a dependency 
on libm, hence the need to expand inline).  That bug could also be fixed 
incrementally, target by target, if the target-independent code had some 
way to determine whether the target would expand the built-in fegetround 
inline and what the rounding mode values would be.

(I've tended to suppose that defining a separate __builtin_flt_rounds 
would be the way to go for fixing bug 30569, but that could easily wrap 
__builtin_fegetround and return a constant value for targets that can't 
expand __builtin_fegetround inline.)
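
A sketch of the mapping such a wrapper would need, written as ordinary C
rather than as a GCC builtin: it translates a rounding-mode value as
returned by fegetround into the values the C standard assigns to
FLT_ROUNDS (0 toward zero, 1 to nearest, 2 toward positive infinity,
3 toward negative infinity, -1 indeterminable).  The helper name
flt_rounds_from_mode is hypothetical.

```c
#include <fenv.h>

/* Map a rounding mode (one of the FE_* macros from <fenv.h>) to the
   corresponding FLT_ROUNDS value, or -1 if the mode is not recognized.
   Only the macros are used, so nothing here calls into libm.  */
static int
flt_rounds_from_mode (int mode)
{
  switch (mode)
    {
#ifdef FE_TOWARDZERO
    case FE_TOWARDZERO:
      return 0;
#endif
#ifdef FE_TONEAREST
    case FE_TONEAREST:
      return 1;
#endif
#ifdef FE_UPWARD
    case FE_UPWARD:
      return 2;
#endif
#ifdef FE_DOWNWARD
    case FE_DOWNWARD:
      return 3;
#endif
    default:
      return -1;
    }
}
```

A target that cannot expand the rounding-mode read inline would skip the
switch and return the constant 1.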

-- 
Joseph S. Myers
joseph@codesourcery.com
Marc Glisse June 30, 2020, 6:44 a.m. | #6
On Mon, 29 Jun 2020, Richard Biener via Gcc-patches wrote:

> At least without -frounding-math, fegetround could be
> constant-folded to FE_TONEAREST, for which we'd need the
> actual value of FE_TONEAREST.


That will break existing code which, since -frounding-math doesn't work 
for that, protects all FP operations with volatile read/writes or similar 
asm, and then doesn't specify -frounding-math because it doesn't seem 
necessary. I am not saying that code is right, just that it exists.

In a world where we have implemented fenv_access, this kind of folding of 
fegetround could only happen in "#pragma fenv_access off" regions, which 
seems to imply that it would be the front-end's responsibility (although 
it would need help from the back-end to know the default value to fold 
to).

-- 
Marc Glisse

Patch

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 54f750c8384..d5ca15141b1 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2567,12 +2567,25 @@  BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
 BU_SPECIAL_X (RS6000_BUILTIN_MFTB, "__builtin_ppc_mftb",
 	      RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
+BU_SPECIAL_X (RS6000_BUILTIN_FEGETROUND, "__builtin_fegetround",
+	      RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
+
 BU_SPECIAL_X (RS6000_BUILTIN_MFFS, "__builtin_mffs",
 	      RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
 BU_SPECIAL_X (RS6000_BUILTIN_MFFSL, "__builtin_mffsl",
 	      RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
+RS6000_BUILTIN_X (RS6000_BUILTIN_FECLEAREXCEPT, "__builtin_feclearexcept",
+		  RS6000_BTM_ALWAYS,
+		  RS6000_BTC_MISC | RS6000_BTC_UNARY,
+		  CODE_FOR_nothing)
+
+RS6000_BUILTIN_X (RS6000_BUILTIN_FERAISEEXCEPT, "__builtin_feraiseexcept",
+		  RS6000_BTM_ALWAYS,
+		  RS6000_BTC_MISC | RS6000_BTC_UNARY,
+		  CODE_FOR_nothing)
+
 RS6000_BUILTIN_X (RS6000_BUILTIN_MTFSF, "__builtin_mtfsf",
 	          RS6000_BTM_ALWAYS,
 	          RS6000_BTC_MISC | RS6000_BTC_UNARY | RS6000_BTC_VOID,
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 7621d6f5278..af93259e73d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -8533,6 +8533,53 @@  rs6000_expand_zeroop_builtin (enum insn_code icode, rtx target)
 }
 
 
+static rtx
+rs6000_expand_feCRexcept_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx pat;
+  tree arg0 = CALL_EXPR_ARG (exp, 0);
+  rtx op0 = expand_normal (arg0);
+
+  if (icode == CODE_FOR_nothing)
+    /* Builtin not supported on this processor.  */
+    return 0;
+
+  if (rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT)
+    {
+      error ("%<__builtin_feclearexcept%> and "
+	     "%<__builtin_feraiseexcept%> not supported with "
+	     "%<-msoft-float%>");
+      return const0_rtx;
+    }
+
+  /* If we got invalid arguments bail out before generating bad rtl.  */
+  if (arg0 == error_mark_node)
+    return const0_rtx;
+
+  if (!CONST_INT_P (op0)
+      || __builtin_popcount (INTVAL(op0)) != 1
+      || INTVAL (op0) == 0x20000000)
+      //|| INTVAL (op0) == FE_INVALID)
+    {
+      error ("argument 1 must be a constant representing one valid exception number");
+      return const0_rtx;
+    }
+
+  rtx tmp = gen_rtx_CONST_INT (SImode, __builtin_clz (INTVAL(op0)));
+  pat = GEN_FCN (icode) (tmp);
+  if (!pat)
+    return const0_rtx;
+  emit_insn (pat);
+
+  if (target == 0 || GET_MODE (target) != SImode)
+    target = gen_reg_rtx (SImode);
+
+  emit_move_insn (target, GEN_INT (0));
+
+  return target;
+}
+
+
 static rtx
 rs6000_expand_mtfsf_builtin (enum insn_code icode, tree exp)
 {
@@ -11646,6 +11693,15 @@  rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
         rs6000_expand_set_fpscr_drn_builtin (CODE_FOR_rs6000_set_fpscr_drn,
 					     exp);
 
+    case RS6000_BUILTIN_FEGETROUND:
+      return rs6000_expand_zeroop_builtin (CODE_FOR_rs6000_fegetround, target);
+
+    case RS6000_BUILTIN_FECLEAREXCEPT:
+      return rs6000_expand_feCRexcept_builtin (CODE_FOR_rs6000_mtfsb0, exp, target);
+
+    case RS6000_BUILTIN_FERAISEEXCEPT:
+      return rs6000_expand_feCRexcept_builtin (CODE_FOR_rs6000_mtfsb1, exp, target);
+
     case RS6000_BUILTIN_MFFSL:
       return rs6000_expand_zeroop_builtin (CODE_FOR_rs6000_mffsl, target);
 
@@ -12029,6 +12085,19 @@  rs6000_init_builtins (void)
 				      NULL_TREE);
   def_builtin ("__builtin_ppc_mftb", ftype, RS6000_BUILTIN_MFTB);
 
+  ftype = build_function_type_list (intSI_type_node, NULL_TREE);
+  def_builtin ("__builtin_fegetround", ftype, RS6000_BUILTIN_FEGETROUND);
+
+  ftype = build_function_type_list (intSI_type_node,
+				    intSI_type_node,
+				    NULL_TREE);
+  def_builtin ("__builtin_feclearexcept", ftype, RS6000_BUILTIN_FECLEAREXCEPT);
+
+  ftype = build_function_type_list (intSI_type_node,
+				    intSI_type_node,
+				    NULL_TREE);
+  def_builtin ("__builtin_feraiseexcept", ftype, RS6000_BUILTIN_FERAISEEXCEPT);
+
   ftype = build_function_type_list (double_type_node, NULL_TREE);
   def_builtin ("__builtin_mffs", ftype, RS6000_BUILTIN_MFFS);
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6173994797c..f935e7118ef 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13600,6 +13600,24 @@ 
     return "mftb %0";
 })
 
+
+;; int __builtin_fegetround()
+(define_expand "rs6000_fegetround"
+  [(use (match_operand:SI 0 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT"
+{
+    rtx tmp_df = gen_reg_rtx (DFmode);
+    emit_insn (gen_rs6000_mffsl (tmp_df));
+
+    rtx tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
+    rtx tmp_di_2 = gen_reg_rtx (DImode);
+    emit_insn (gen_anddi3 (tmp_di_2, tmp_di, GEN_INT (0x3LL)));
+    rtx tmp_si = gen_reg_rtx (SImode);
+    tmp_si = simplify_gen_subreg (SImode, tmp_di_2, DImode, 0);
+    emit_move_insn (operands[0], tmp_si);
+    DONE;
+})
+
 
 ;; The ISA 3.0 mffsl instruction is a lower latency instruction
 ;; for reading bits [29:31], [45:51] and [56:63] of the FPSCR.