[3/4] libstdc++: More efficient is_leap.

Message ID CAOfgUPj1UwwKs5HjjFW3T=F2+ubzs5AhQPkEyfZoY2s_qh=mBQ@mail.gmail.com
State New
Headers show
Series
  • [1/4] libstdc++: More efficient date from days.
Related show

Commit Message

Kewen.Lin via Gcc-patches Feb. 23, 2021, 1:25 p.m.
This patch reimplements std::chrono::year::is_leap().  Leap year check is
ubiquitously implemented (including here) as:

    y % 4 == 0 && (y % 100 != 0 || y % 400 == 0).

The rationale being that testing divisibility by 4 first implies an earlier
return for 75% of the cases, therefore, avoiding the needless calculations of
y % 100 and y % 400. Although this fact is true, it does not take into account
the cost of branching.  This patch, instead, tests divisibility by 100 first:

    (y % 100 != 0 || y % 400 == 0) && y % 4 == 0.

It is certainly counterintuitive that this could be more efficient since among
the three divisibility tests (4, 100 and 400) the one by 100 is the only one
that can never provide a definitive answer and a second divisibility test (by 4
or 400) is always required. However, measurements [1] in x86_64 suggest this is
3x more efficient!  A possible explanation is that checking divisibility by 100
first implies a split in the execution path with probabilities of (1%, 99%)
rather than (25%, 75%) when divisibility by 4 is checked first.  This decreases
the entropy of the branching distribution which seems to help prediction.

Given that y belongs to [-32767, 32767] [time.cal.year.members], a more
efficient algorithm [2] to check divisibility by 100 is used (instead of
y % 100 != 0).  Measurements suggest that this optimization improves performance
by 20%.

The patch adds a test that exhaustively compares the result of this
implementation with the ubiquitous one for all y in [-32767, 32767]. Although
its completeness, the test completes in a matter of seconds.

References:
[1] https://stackoverflow.com/a/60646967/1137388
[2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16

libstdc++-v3/ChangeLog:

    * include/std/chrono:
    * testsuite/std/time/year/2.cc: New test.
---
 libstdc++-v3/include/std/chrono           | 19 ++++++++-
 libstdc++-v3/testsuite/std/time/year/2.cc | 52 +++++++++++++++++++++++
 2 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/testsuite/std/time/year/2.cc

-- 
2.29.2

Patch

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 7840099d743..b777acdd4a2 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -1597,7 +1597,24 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION

       constexpr bool
       is_leap() const noexcept
-      { return _M_y % 4 == 0 && (_M_y % 100 != 0 || _M_y % 400 == 0); }
+      {
+        // Testing divisibility by 100 first gives better performance, that is,
+        // return (_M_y % 100 != 0 || _M_y % 400 == 0) && _M_y % 4 == 0;
+
+        // It gets even faster if _M_y is in [-536870800, 536870999]
(which is the case here) and
+        // _M_y % 100 is replaced by __is_multiple_of_100 below.
+
+        // References:
+        // [1] https://github.com/cassioneri/calendar
+        // [2]
https://accu.org/journals/overload/28/155/overload155.pdf#page=16
+
+        constexpr uint32_t __multiplier   = 42949673;
+        constexpr uint32_t __bound        = 42949669;
+        constexpr uint32_t __max_dividend = 1073741799;
+        constexpr uint32_t __offset       = __max_dividend / 2 / 100 * 100;
+        const bool __is_multiple_of_100 = __multiplier * (_M_y +
__offset) < __bound;
+        return (!__is_multiple_of_100 || _M_y % 400 == 0) && _M_y % 4 == 0;
+      }

       explicit constexpr
       operator int() const noexcept
diff --git a/libstdc++-v3/testsuite/std/time/year/2.cc
b/libstdc++-v3/testsuite/std/time/year/2.cc
new file mode 100644
index 00000000000..e22101e305a
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/time/year/2.cc
@@ -0,0 +1,52 @@ 
+// { dg-options "-std=gnu++2a" }
+// { dg-do run { target c++2a } }
+
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// Class year [time.cal.year_month_day]
+
+#include <chrono>
+#include <testsuite_hooks.h>
+
+// Slow but clear test for leap year.
+constexpr bool
+is_leap_year(const std::chrono::year& y) noexcept
+{
+  const int n = static_cast<int>(y);
+  return n % 4 == 0 && (n % 100 != 0 || n % 400 == 0);
+}
+
+void test01()
+{
+  using namespace std::chrono;
+
+  year y{-32767};
+  while (y < year{32767}) {
+    VERIFY( y.is_leap() ==  is_leap_year(y) );
+    ++y;
+  }
+
+  // One more for y = 32767.
+  VERIFY( year{32767}.is_leap() == is_leap_year(year{32767}) );
+}
+
+int main()
+{
+  test01();
+  return 0;
+}