Refactor mktime and add the POSIX function timegm

Message ID 20210503083750.1043988-1-rdiezmail-newlib@yahoo.de
State New
Headers show
Series
  • Refactor mktime and add the POSIX function timegm
Related show

Commit Message

R. Diez via Newlib May 3, 2021, 8:37 a.m.
I have manually reconstructed the patch from Andrew Russell posted here:

https://sourceware.org/legacy-ml/newlib/2018/msg00824.html

Is the new implementation of __is_leap_year() right? I would rather keep the old ISLEAP implementation.

R. Diez (1):
  Refactor mktime and add the POSIX function timegm

 newlib/libc/include/time.h   |  3 ++
 newlib/libc/saber            |  1 +
 newlib/libc/time/Makefile.am |  2 +
 newlib/libc/time/Makefile.in |  9 +++++
 newlib/libc/time/local.h     |  2 +
 newlib/libc/time/mktime.c    | 78 +++++++++++++++++++++++++-----------
 newlib/libc/time/timegm.c    | 63 +++++++++++++++++++++++++++++
 7 files changed, 135 insertions(+), 23 deletions(-)
 create mode 100644 newlib/libc/time/timegm.c

-- 
2.31.1

From b935001fcc817efe1187d993c076f65a7988e468 Mon Sep 17 00:00:00 2001
From: "R. Diez" <rdiezmail-newlib@yahoo.de>

Date: Mon, 3 May 2021 10:27:32 +0200
Subject: [PATCH 1/1] Refactor mktime and add the POSIX function timegm

Original author: Andrew Russell <ahrussell@google.com>
Signed-off-by: R. Diez <rdiezmail-newlib@yahoo.de>

---
 newlib/libc/include/time.h   |  3 ++
 newlib/libc/saber            |  1 +
 newlib/libc/time/Makefile.am |  2 +
 newlib/libc/time/Makefile.in |  9 +++++
 newlib/libc/time/local.h     |  2 +
 newlib/libc/time/mktime.c    | 78 +++++++++++++++++++++++++-----------
 newlib/libc/time/timegm.c    | 63 +++++++++++++++++++++++++++++
 7 files changed, 135 insertions(+), 23 deletions(-)
 create mode 100644 newlib/libc/time/timegm.c

-- 
2.31.1

Comments

Brian Inglis May 3, 2021, 4:04 p.m. | #1
On 2021-05-03 02:37, R. Diez via Newlib wrote:
> I have manually reconstructed the patch from Andrew Russell posted here:

> 

> https://sourceware.org/legacy-ml/newlib/2018/msg00824.html

> 

> Is the new implementation of __is_leap_year() right? I would rather keep the old ISLEAP implementation.

> 

> R. Diez (1):

>    Refactor mktime and add the POSIX function timegm

> 

>   newlib/libc/include/time.h   |  3 ++

>   newlib/libc/saber            |  1 +

>   newlib/libc/time/Makefile.am |  2 +

>   newlib/libc/time/Makefile.in |  9 +++++

>   newlib/libc/time/local.h     |  2 +

>   newlib/libc/time/mktime.c    | 78 +++++++++++++++++++++++++-----------

>   newlib/libc/time/timegm.c    | 63 +++++++++++++++++++++++++++++

>   7 files changed, 135 insertions(+), 23 deletions(-)

>   create mode 100644 newlib/libc/time/timegm.c


 >+<<timegm>> is a nonstandard GNU extension to POSIX also present on BSD.


Don't see where *non-inline* static functions and trivial renames are 
improvements over macros.
You removed a cast that may have been added to avoid overflow, probably in 32 
bit implementations, which is a continuing problem in these functions.
It may be unnecessary to add a new file for a small new function.

For patches of this sort, you normally want to avoid trivial changes so the 
substance of the patch stands out.

It may have been better to compare against the upstream BSD (or tzcode) sources 
and include one of those implementations of timegm, with any required changes, 
rather than refactor including trivial changes, making it difficult to rebase 
against the upstream and merge changes in future.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
R. Diez via Newlib May 3, 2021, 4:23 p.m. | #2
> [...]

> Don't see where *non-inline* static functions and trivial renames are improvements over macros.

 > [...]


Like I said, I did not write the code, I only manually reconstructed the last patch version from Andrew Russell.

I was under the impression that the patch had already been reviewed, at least up to a point. It was by no means the first patch version posted.

But, if there is some sort of consensus here, I could rework the patch.


 > It may have been better to compare against the upstream BSD (or tzcode) sources

 > [...]


That means extra work. Do you have any reason to suspect that the existing code is wrong in some way? Or do you know whether the code in Newlib is 
actually supposed to track some BSD or tzcode source? There is no source code comment to that respect, and the code in other libraries will probably 
have diverged by now.

Unless you volunteer and/or specify concrete reasons, I would assume that the existing implementation in Newlib is fine, so a small refactor in order 
to provide timegm() is probably the best solution. Such a patch is also easier to review than bringing in new or further modified code from an 
external source.

Regards,
   rdiez
Brian Inglis May 3, 2021, 4:50 p.m. | #3
On 2021-05-03 10:23, R. Diez wrote:
>> [...]

>> Don't see where *non-inline* static functions and trivial renames are 

>> improvements over macros.

>> [...]


> Like I said, I did not write the code, I only manually reconstructed the last 

> patch version from Andrew Russell.

> I was under the impression that the patch had already been reviewed, at least up 

> to a point. It was by no means the first patch version posted.

> But, if there is some sort of consensus here, I could rework the patch.


>> It may have been better to compare against the upstream BSD (or tzcode) sources

>> [...]


> That means extra work. Do you have any reason to suspect that the existing code 

> is wrong in some way? Or do you know whether the code in Newlib is actually 

> supposed to track some BSD or tzcode source? There is no source code comment to 

> that respect, and the code in other libraries will probably have diverged by now.

> 

> Unless you volunteer and/or specify concrete reasons, I would assume that the 

> existing implementation in Newlib is fine, so a small refactor in order to 

> provide timegm() is probably the best solution. Such a patch is also easier to 

> review than bringing in new or further modified code from an external source.


My comments are for your and the committers' consideration.

I previously commented on patches after comparing some source files against BSD 
(licence) and tzcode (public domain) upstreams and found them line-by-line 
similar mostly: it's been a while and details are now fuzzy.

Some amounts of newlib code are pulled or adapted from BSD sources, as the 
licences allow commercial use, and newlib is used in RTEMS and development 
products offered by hardware and RT OS vendors.
As you may have noticed, there are not always a lot of comments in sources.

Such a substantial patch will also require a BSD licence assignment, but given 
that it is an adaptation of another's work, I am unsure where that leaves it.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]

Patch

diff --git a/newlib/libc/include/time.h b/newlib/libc/include/time.h
index ed6cc70fe..28aa978fc 100644
--- a/newlib/libc/include/time.h
+++ b/newlib/libc/include/time.h
@@ -56,6 +56,9 @@  struct tm
 clock_t	   clock (void);
 double	   difftime (time_t _time2, time_t _time1);
 time_t	   mktime (struct tm *_timeptr);
+#if __BSD_VISIBLE || __SVID_VISIBLE || __GNU_VISIBLE
+time_t    timegm (struct tm *_timeptr);
+#endif // __BSD_VISIBLE || __SVID_VISIBLE || __GNU_VISIBLE
 time_t	   time (time_t *_timer);
 #ifndef _REENT_ONLY
 char	  *asctime (const struct tm *_tblock);
diff --git a/newlib/libc/saber b/newlib/libc/saber
index 154eddf41..bdaef28fe 100644
--- a/newlib/libc/saber
+++ b/newlib/libc/saber
@@ -122,6 +122,7 @@  time/gmtime.c
 time/localtime.c
 time/mktime.c
 time/strftime.c
+time/timegm.c
 
 
 load stdio/fiprintf.c
diff --git a/newlib/libc/time/Makefile.am b/newlib/libc/time/Makefile.am
index be040baec..5c73c3a17 100644
--- a/newlib/libc/time/Makefile.am
+++ b/newlib/libc/time/Makefile.am
@@ -17,6 +17,7 @@  LIB_SOURCES = \
 	lcltime.c	\
 	lcltime_r.c	\
 	mktime.c	\
+	timegm.c	\
 	month_lengths.c \
 	strftime.c  	\
 	strptime.c	\
@@ -54,6 +55,7 @@  CHEWOUT_FILES = \
 	gmtime.def	\
 	lcltime.def	\
 	mktime.def	\
+	timegm.def	\
 	strftime.def	\
 	time.def	\
 	tzlock.def	\
diff --git a/newlib/libc/time/Makefile.in b/newlib/libc/time/Makefile.in
index ce6b6c183..b20ae09c4 100644
--- a/newlib/libc/time/Makefile.in
+++ b/newlib/libc/time/Makefile.in
@@ -80,6 +80,7 @@  am__objects_1 = lib_a-asctime.$(OBJEXT) lib_a-asctime_r.$(OBJEXT) \
 	lib_a-lcltime_r.$(OBJEXT) lib_a-mktime.$(OBJEXT) \
 	lib_a-month_lengths.$(OBJEXT) lib_a-strftime.$(OBJEXT) \
 	lib_a-strptime.$(OBJEXT) lib_a-time.$(OBJEXT) \
+	lib_a-timegm.$(OBJEXT) \
 	lib_a-tzcalc_limits.$(OBJEXT) lib_a-tzlock.$(OBJEXT) \
 	lib_a-tzset.$(OBJEXT) lib_a-tzset_r.$(OBJEXT) \
 	lib_a-tzvars.$(OBJEXT) lib_a-wcsftime.$(OBJEXT)
@@ -91,6 +92,7 @@  am__objects_2 = asctime.lo asctime_r.lo clock.lo ctime.lo ctime_r.lo \
 	difftime.lo gettzinfo.lo gmtime.lo gmtime_r.lo lcltime.lo \
 	lcltime_r.lo mktime.lo month_lengths.lo strftime.lo \
 	strptime.lo time.lo tzcalc_limits.lo tzlock.lo tzset.lo \
+	timegm.lo \
 	tzset_r.lo tzvars.lo wcsftime.lo
 @USE_LIBTOOL_TRUE@am_libtime_la_OBJECTS = $(am__objects_2)
 libtime_la_OBJECTS = $(am_libtime_la_OBJECTS)
@@ -283,6 +285,7 @@  LIB_SOURCES = \
 	strftime.c  	\
 	strptime.c	\
 	time.c		\
+	timegm.c	\
 	tzcalc_limits.c \
 	tzlock.c	\
 	tzset.c		\
@@ -464,6 +467,12 @@  lib_a-mktime.o: mktime.c
 lib_a-mktime.obj: mktime.c
 	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(lib_a_CFLAGS) $(CFLAGS) -c -o lib_a-mktime.obj `if test -f 'mktime.c'; then $(CYGPATH_W) 'mktime.c'; else $(CYGPATH_W) '$(srcdir)/mktime.c'; fi`
 
+lib_a-timegm.o: timegm.c
+	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(lib_a_CFLAGS) $(CFLAGS) -c -o lib_a-timegm.o `test -f 'timegm.c' || echo '$(srcdir)/'`timegm.c
+
+lib_a-timegm.obj: timegm.c
+	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(lib_a_CFLAGS) $(CFLAGS) -c -o lib_a-timegm.obj `if test -f 'timegm.c'; then $(CYGPATH_W) 'timegm.c'; else $(CYGPATH_W) '$(srcdir)/timegm.c'; fi`
+
 lib_a-month_lengths.o: month_lengths.c
 	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(lib_a_CFLAGS) $(CFLAGS) -c -o lib_a-month_lengths.o `test -f 'month_lengths.c' || echo '$(srcdir)/'`month_lengths.c
 
diff --git a/newlib/libc/time/local.h b/newlib/libc/time/local.h
index bfe06e622..557187e0e 100644
--- a/newlib/libc/time/local.h
+++ b/newlib/libc/time/local.h
@@ -39,3 +39,5 @@  void _tzset_unlocked (void);
 void __tz_lock (void);
 void __tz_unlock (void);
 
+time_t __mktime_internal (struct tm *tim_p);
+void __set_tm_wday (long days, struct tm *tim_p);
diff --git a/newlib/libc/time/mktime.c b/newlib/libc/time/mktime.c
index 02032599a..c2717b08c 100644
--- a/newlib/libc/time/mktime.c
+++ b/newlib/libc/time/mktime.c
@@ -11,6 +11,8 @@ 
  * represented, returns the value (time_t) -1.
  *
  * Modifications:	Fixed tm_isdst usage - 27 August 2008 Craig Howland.
+ * 			Refactor code from mktime.c to share internal
+ * 			functions. - 17 July 2018 Andrew Russell.
  */
 
 /*
@@ -44,25 +46,36 @@  ANSI C requires <<mktime>>.
 
 #include <stdlib.h>
 #include <time.h>
+#include <stdint.h>
 #include "local.h"
 
 #define _SEC_IN_MINUTE 60L
 #define _SEC_IN_HOUR 3600L
 #define _SEC_IN_DAY 86400L
 
-static const int DAYS_IN_MONTH[12] =
+static const uint_least8_t DAYS_IN_MONTH[12] =
 {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
 
 #define _DAYS_IN_MONTH(x) ((x == 1) ? days_in_feb : DAYS_IN_MONTH[x])
 
-static const int _DAYS_BEFORE_MONTH[12] =
+static const uint_least16_t _DAYS_BEFORE_MONTH[12] =
 {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};
 
-#define _ISLEAP(y) (((y) % 4) == 0 && (((y) % 100) != 0 || (((y)+1900) % 400) == 0))
-#define _DAYS_IN_YEAR(year) (_ISLEAP(year) ? 366 : 365)
+/* returns either 0 or 1 */
+static int
+__is_leap_year (int year)
+{
+  return (year % 4) == 0 && ((year % 100) != 0 || (((year / 100) & 3) == (-(YEAR_BASE / 100)  & 3)));
+}
+
+static int
+__days_in_year (int year)
+{
+  return __is_leap_year(year) ? 366 : 365;
+}
 
-static void 
-validate_structure (struct tm *tim_p)
+static void
+__validate_tm_structure (struct tm *tim_p)
 {
   div_t res;
   int days_in_feb = 28;
@@ -112,7 +125,7 @@  validate_structure (struct tm *tim_p)
         }
     }
 
-  if (_DAYS_IN_YEAR (tim_p->tm_year) == 366)
+  if (__days_in_year(tim_p->tm_year) == 366)
     days_in_feb = 29;
 
   if (tim_p->tm_mday <= 0)
@@ -124,7 +137,7 @@  validate_structure (struct tm *tim_p)
 	      tim_p->tm_year--;
 	      tim_p->tm_mon = 11;
 	      days_in_feb =
-		((_DAYS_IN_YEAR (tim_p->tm_year) == 366) ?
+	        ((__days_in_year(tim_p->tm_year) == 366) ?
 		 29 : 28);
 	    }
 	  tim_p->tm_mday += _DAYS_IN_MONTH (tim_p->tm_mon);
@@ -140,15 +153,23 @@  validate_structure (struct tm *tim_p)
 	      tim_p->tm_year++;
 	      tim_p->tm_mon = 0;
 	      days_in_feb =
-		((_DAYS_IN_YEAR (tim_p->tm_year) == 366) ?
+	        ((__days_in_year(tim_p->tm_year) == 366) ?
 		 29 : 28);
 	    }
 	}
     }
 }
 
-time_t 
-mktime (struct tm *tim_p)
+void
+__set_tm_wday (long days, struct tm *tim_p)
+{
+  if ((tim_p->tm_wday = (days + 4) % 7) < 0)
+    tim_p->tm_wday += 7;
+}
+
+/* Assumes the time at tim_p is a UTC time and returns its arithmetic representation */
+time_t
+__mktime_internal (struct tm *tim_p)
 {
   time_t tim = 0;
   long days = 0;
@@ -156,7 +177,7 @@  mktime (struct tm *tim_p)
   __tzinfo_type *tz = __gettzinfo ();
 
   /* validate structure */
-  validate_structure (tim_p);
+  __validate_tm_structure (tim_p);
 
   /* compute hours, minutes, seconds */
   tim += tim_p->tm_sec + (tim_p->tm_min * _SEC_IN_MINUTE) +
@@ -165,7 +186,7 @@  mktime (struct tm *tim_p)
   /* compute days in year */
   days += tim_p->tm_mday - 1;
   days += _DAYS_BEFORE_MONTH[tim_p->tm_mon];
-  if (tim_p->tm_mon > 1 && _DAYS_IN_YEAR (tim_p->tm_year) == 366)
+  if (tim_p->tm_mon > 1 && __days_in_year(tim_p->tm_year) == 366)
     days++;
 
   /* compute day of the year */
@@ -178,17 +199,29 @@  mktime (struct tm *tim_p)
   if ((year = tim_p->tm_year) > 70)
     {
       for (year = 70; year < tim_p->tm_year; year++)
-	days += _DAYS_IN_YEAR (year);
+	days += __days_in_year(year);
     }
   else if (year < 70)
     {
       for (year = 69; year > tim_p->tm_year; year--)
-	days -= _DAYS_IN_YEAR (year);
-      days -= _DAYS_IN_YEAR (year);
+	days -= __days_in_year(year);
+      days -= __days_in_year(year);
     }
 
   /* compute total seconds */
-  tim += (time_t)days * _SEC_IN_DAY;
+  tim += (days * _SEC_IN_DAY);
+
+  return tim;
+}
+
+time_t
+mktime (struct tm *tim_p)
+{
+  time_t tim = __mktime_internal(tim_p);
+  long days = tim / SECSPERDAY;
+  int year = tim_p->tm_year;
+  int isdst=0;
+  __tzinfo_type *tz = __gettzinfo ();
 
   TZ_LOCK;
 
@@ -237,7 +270,7 @@  mktime (struct tm *tim_p)
 		  tim_p->tm_sec += diff;
 		  tim += diff;  /* we also need to correct our current time calculation */
 		  int mday = tim_p->tm_mday;
-		  validate_structure (tim_p);
+		  __validate_tm_structure(tim_p);
 		  mday = tim_p->tm_mday - mday;
 		  /* roll over occurred */
 		  if (mday) {
@@ -251,9 +284,9 @@  mktime (struct tm *tim_p)
 		    /* handle yday */
 		    if ((tim_p->tm_yday += mday) < 0) {
 			  --year;
-			  tim_p->tm_yday = _DAYS_IN_YEAR(year) - 1;
+			  tim_p->tm_yday = __days_in_year(year) - 1;
 		    } else {
-			  mday = _DAYS_IN_YEAR(year);
+			  mday = __days_in_year(year);
 			  if (tim_p->tm_yday > (mday - 1))
 				tim_p->tm_yday -= mday;
 		    }
@@ -275,8 +308,7 @@  mktime (struct tm *tim_p)
   tim_p->tm_isdst = isdst;
 
   /* compute day of the week */
-  if ((tim_p->tm_wday = (days + 4) % 7) < 0)
-    tim_p->tm_wday += 7;
-	
+  __set_tm_wday(days, tim_p);
+
   return tim;
 }
diff --git a/newlib/libc/time/timegm.c b/newlib/libc/time/timegm.c
new file mode 100644
index 000000000..ab50de7e1
--- /dev/null
+++ b/newlib/libc/time/timegm.c
@@ -0,0 +1,63 @@ 
+/*
+ * timegm.c
+ * Original Author: A. Russell
+ *
+ * Converts the broken-down time, expressed as UTC time, in the structure
+ * pointed to by tim_p into a calendar time value. The original values of the
+ * tm_wday and tm_yday fields of the structure are ignored, and the original
+ * values of the other fields have no restrictions. On successful completion
+ * the fields of the structure are set to represent the specified calendar
+ * time. Returns the specified calendar time. If the calendar time can not be
+ * represented, returns the value (time_t) -1.  These functions are nonstandard
+ * GNU extensions that are also present on the BSDs.  Avoid their use.
+ * Modifications: Refactored mktime.c to support both mktime and timegm
+                         - 23 July 2018 Andrew Russell.
+ */
+
+/*
+FUNCTION
+<<timegm>>---convert time to arithmetic representation
+
+INDEX
+ timegm
+
+SYNOPSIS
+ #include <time.h>
+ time_t timegm(struct tm *<[timp]>);
+
+DESCRIPTION
+<<timegm>> assumes the time at <[timp]> is a UTC time, and converts
+its representation from the traditional representation defined by
+<<struct tm>> into a representation suitable for arithmetic.
+
+<<timegm>> is the inverse of <<gmtime>>.
+
+RETURNS
+If the contents of the structure at <[timp]> do not form a valid
+calendar time representation, the result is <<-1>>.  Otherwise, the
+result is the time, converted to a <<time_t>> value.
+
+PORTABILITY
+<<timegm>> is a nonstandard GNU extension to POSIX also present on BSD.
+
+<<timegm>> requires no supporting OS subroutines.
+*/
+
+#include <stdlib.h>
+#include <time.h>
+#include "local.h"
+
+time_t
+timegm (struct tm *tim_p)
+{
+  time_t tim = __mktime_internal(tim_p);
+  long days = tim / SECSPERDAY;
+
+  /* set isdst flag to 0 since we are in UTC */
+  tim_p->tm_isdst = 0;
+
+  /* compute day of the week */
+  __set_tm_wday(days, tim_p);
+
+  return tim;
+}