libiberty: Fix and cleanup choose_temp_base()

Message ID f8a7d6a5-bb3b-65a9-f134-b071748c8fe0@gmx.de
State New
Headers show
Series
  • libiberty: Fix and cleanup choose_temp_base()
Related show

Commit Message

Tim Rühsen Nov. 7, 2019, 5:08 p.m.
Hi,

reduced code size by using xasprintf().

Please review the check of mktemp(), which was formerly checking against
0 (regarding glibc man pages that was wrong). I left that check intact
(just in case I miss something) and added the check for an empty string,
as documented in the glibc man pages.

Does it make sense to further replace strlen/malloc/strcpy/strcat
sequences by [x]asprintf in order to reduce source lines and library
(binary) size ? (In the means of "is it appreciated")

A side effect is calming down static analyzers that dislike unbounded
memory accesses and thus warn about strcpy() and strcat().

Regards, Tim

Comments

Palmer Dabbelt via binutils Nov. 7, 2019, 5:43 p.m. | #1
On Thu, Nov 7, 2019 at 9:09 AM Tim Rühsen <tim.ruehsen@gmx.de> wrote:
>

> reduced code size by using xasprintf().

>

> Please review the check of mktemp(), which was formerly checking against

> 0 (regarding glibc man pages that was wrong). I left that check intact

> (just in case I miss something) and added the check for an empty string,

> as documented in the glibc man pages.

>

> Does it make sense to further replace strlen/malloc/strcpy/strcat

> sequences by [x]asprintf in order to reduce source lines and library

> (binary) size ? (In the means of "is it appreciated")

>

> A side effect is calming down static analyzers that dislike unbounded

> memory accesses and thus warn about strcpy() and strcat().


The new code is simpler but it will run slower.  The existing code has
been there for 20 years, I doubt it's buggy.  I don't see a benefit to
making this change.

Ian
Tim Rühsen Nov. 8, 2019, 11:25 a.m. | #2
On 11/7/19 6:43 PM, Ian Lance Taylor wrote:
> On Thu, Nov 7, 2019 at 9:09 AM Tim Rühsen <tim.ruehsen@gmx.de> wrote:

>>

>> reduced code size by using xasprintf().

>>

>> Please review the check of mktemp(), which was formerly checking against

>> 0 (regarding glibc man pages that was wrong). I left that check intact

>> (just in case I miss something) and added the check for an empty string,

>> as documented in the glibc man pages.

>>

>> Does it make sense to further replace strlen/malloc/strcpy/strcat

>> sequences by [x]asprintf in order to reduce source lines and library

>> (binary) size ? (In the means of "is it appreciated")

>>

>> A side effect is calming down static analyzers that dislike unbounded

>> memory accesses and thus warn about strcpy() and strcat().

> 

> The new code is simpler but it will run slower.


The commit does two things (I can split these if you want me to do).

1) Code deduplication

The new code eliminates code duplication. xasprintf("%s%s", ...) is
basically 2xstrlen, 1xmalloc, 2xstrcpy. The old code just does this
*manually* inlined (with a small optimization of dropping strlen for an
8 byte string).

The part with the biggest CPU cycle impact is malloc(), whether you call
 xasprintf() or inline the code - it doesn't change much regarding total
used CPU cycles.

But even much more cycles may be required by mktemp() which goes down
into the kernel (several times), goes through file system code and does
file I/O (which may even block).

So I can't really follow your argument of "will run slower".

While 20 years ago we all where keen at manually inlining and unrolling
everything to spare even a few CPU cycles, the coding paradigm today is
bit different (IMO). It's more shifted towards readability,
maintainability, runtime stability.

But if the project's policy is to keep code just because it's old, I
accept that (but not very happily).


> The existing code has

> been there for 20 years, I doubt it's buggy.  I don't see a benefit to

> making this change.


2) Fixing the check of the return value of mktemp()

The current code calls abort() if mktemp() returns NULL. Now read the
glibc man page of mktemp():

RETURN VALUE
       The  mktemp()  function always returns template.  If a unique
name was created, the last six bytes of template will have been modi‐
       fied in such a way that the resulting name is unique (i.e., does
not exist already) If a unique name could not be created, template
       is made an empty string, and errno is set to indicate the error.

a) mktemp() never returns NULL, so the check is wrong.

b) choose_temp_base() either returns a temporary file name OR an empty
string

c) the documentation of choose_temp_base() does not reflect this
behavior (it says "Return a prefix for temporary file names or NULL if
unable to find one").

d) callers of choose_temp_base() rely on the return value being a
temporary file name (not NULL nor an empty string), e.g. see
./libiberty/pex-msdos.c:185
./binutils/dlltool.c:1309
./binutils/dllwrap.c:354
./binutils/dllwrap.c:820
./binutils/dllwrap.c:1033
./binutils/resrc.c:204
./binutils/resrc.c:318


If you think that the man page (3) of mktemp() if wrong about the return
value, let me know and I dig deeper (into glibc sources).

The documentation of choose_temp_base() is definitely incorrect. If you
agree I extend the commit with a fix (or make a second commit - what you
think is more appropriate).

Regards, Tim
Palmer Dabbelt via binutils Nov. 8, 2019, 1:58 p.m. | #3
On Fri, Nov 8, 2019 at 3:25 AM Tim Rühsen <tim.ruehsen@gmx.de> wrote:
>

> On 11/7/19 6:43 PM, Ian Lance Taylor wrote:

> > On Thu, Nov 7, 2019 at 9:09 AM Tim Rühsen <tim.ruehsen@gmx.de> wrote:

> >>

> >> reduced code size by using xasprintf().

> >>

> >> Please review the check of mktemp(), which was formerly checking against

> >> 0 (regarding glibc man pages that was wrong). I left that check intact

> >> (just in case I miss something) and added the check for an empty string,

> >> as documented in the glibc man pages.

> >>

> >> Does it make sense to further replace strlen/malloc/strcpy/strcat

> >> sequences by [x]asprintf in order to reduce source lines and library

> >> (binary) size ? (In the means of "is it appreciated")

> >>

> >> A side effect is calming down static analyzers that dislike unbounded

> >> memory accesses and thus warn about strcpy() and strcat().

> >

> > The new code is simpler but it will run slower.

>

> The commit does two things (I can split these if you want me to do).

>

> 1) Code deduplication

>

> The new code eliminates code duplication. xasprintf("%s%s", ...) is

> basically 2xstrlen, 1xmalloc, 2xstrcpy. The old code just does this

> *manually* inlined (with a small optimization of dropping strlen for an

> 8 byte string).


The xasprintf function has to parse the format string, twice.  There
is no comparable work in the current function.


> While 20 years ago we all where keen at manually inlining and unrolling

> everything to spare even a few CPU cycles, the coding paradigm today is

> bit different (IMO). It's more shifted towards readability,

> maintainability, runtime stability.

>

> But if the project's policy is to keep code just because it's old, I

> accept that (but not very happily).


Sure, we might write it differently today.  But we probably still
wouldn't use xasprintf.

That said, avoiding code churn is a benefit.  If you have working
code, the only reason to change it is to make it faster or remove
bugs.  If you don't do that, the only meaningful effect of changing
existing working code would be to possibly break it.  If code works,
leave it alone.  This code base is not intended as an example for how
to write good code.  It's intended to work.


> 2) Fixing the check of the return value of mktemp()

>

> The current code calls abort() if mktemp() returns NULL. Now read the

> glibc man page of mktemp():

>

> RETURN VALUE

>        The  mktemp()  function always returns template.  If a unique

> name was created, the last six bytes of template will have been modi‐

>        fied in such a way that the resulting name is unique (i.e., does

> not exist already) If a unique name could not be created, template

>        is made an empty string, and errno is set to indicate the error.

>

> a) mktemp() never returns NULL, so the check is wrong.

>

> b) choose_temp_base() either returns a temporary file name OR an empty

> string

>

> c) the documentation of choose_temp_base() does not reflect this

> behavior (it says "Return a prefix for temporary file names or NULL if

> unable to find one").

>

> d) callers of choose_temp_base() rely on the return value being a

> temporary file name (not NULL nor an empty string), e.g. see

> ./libiberty/pex-msdos.c:185

> ./binutils/dlltool.c:1309

> ./binutils/dllwrap.c:354

> ./binutils/dllwrap.c:820

> ./binutils/dllwrap.c:1033

> ./binutils/resrc.c:204

> ./binutils/resrc.c:318

>

>

> If you think that the man page (3) of mktemp() if wrong about the return

> value, let me know and I dig deeper (into glibc sources).

>

> The documentation of choose_temp_base() is definitely incorrect. If you

> agree I extend the commit with a fix (or make a second commit - what you

> think is more appropriate).


The mktemp function behaves differently on different systems.  See,
for example, https://man.openbsd.org/NetBSD-7.0.1/mktemp.3 .  It would
be fine to fix this code to use mktemp in a better way, but it needs
to be portable.  Perhaps these days we can safely call something other
than mktemp; after all, both man pages recommend not using it.
Thanks.

Ian

Patch

From 72b324037c957e53a15cdbeefc1ff49128f0e965 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim=20R=C3=BChsen?= <tim.ruehsen@gmx.de>
Date: Thu, 7 Nov 2019 17:53:56 +0100
Subject: [PATCH] libiberty: Fix and cleanup choose_temp_base()

*choose-temp.c (choose_temp_base): Simplify code,
fix checking return value of mktemp().
---
 libiberty/ChangeLog     |  5 +++++
 libiberty/choose-temp.c | 17 ++++++-----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 95cb1525f2..7a7809ba47 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,8 @@ 
+2019-11-07  Tim Ruehsen  <tim.ruehsen@gmx.de>
+
+        *choose-temp.c (choose_temp_base): Simplify code,
+        fix checking return value of mktemp().
+
 2019-08-08  Martin Liska  <mliska@suse.cz>
 
 	PR bootstrap/91352
diff --git a/libiberty/choose-temp.c b/libiberty/choose-temp.c
index 72c1b710bd..17bb0f27d4 100644
--- a/libiberty/choose-temp.c
+++ b/libiberty/choose-temp.c
@@ -38,7 +38,6 @@  Boston, MA 02110-1301, USA.  */
 /* Name of temporary file.
    mktemp requires 6 trailing X's.  */
 #define TEMP_FILE "ccXXXXXX"
-#define TEMP_FILE_LEN (sizeof(TEMP_FILE) - 1)
 
 /*
 
@@ -47,7 +46,7 @@  Boston, MA 02110-1301, USA.  */
 Return a prefix for temporary file names or @code{NULL} if unable to
 find one.  The current directory is chosen if all else fails so the
 program is exited if a temporary directory can't be found (@code{mktemp}
-fails).  The buffer for the result is obtained with @code{xmalloc}.
+fails).  The buffer for the result is obtained with @code{xasprintf}.
 
 This function is provided for backwards compatibility only.  Its use is
 not recommended.
@@ -59,16 +58,12 @@  not recommended.
 char *
 choose_temp_base (void)
 {
-  const char *base = choose_tmpdir ();
-  char *temp_filename;
-  int len;
+  char *temp_filename = xasprintf("%s%s", choose_tmpdir (), TEMP_FILE);
 
-  len = strlen (base);
-  temp_filename = XNEWVEC (char, len + TEMP_FILE_LEN + 1);
-  strcpy (temp_filename, base);
-  strcpy (temp_filename + len, TEMP_FILE);
-
-  if (mktemp (temp_filename) == 0)
+  /* mktemp() man page: If a unique name could not be created,
+   * template is made an empty string. */
+  if (mktemp (temp_filename) == NULL || *temp_filename == 0)
     abort ();
+
   return temp_filename;
 }
-- 
2.24.0