Define char16_t, char32_t consistently with uint_least16_t, uint_least32_t (bug 17979)

Message ID alpine.DEB.2.20.1802012036000.13065@digraph.polyomino.org.uk
State New
Headers show
Series
  • Define char16_t, char32_t consistently with uint_least16_t, uint_least32_t (bug 17979)
Related show

Commit Message

Joseph Myers Feb. 1, 2018, 8:36 p.m.
As noted in bug 17979 (and as I noted earlier in
<https://sourceware.org/ml/libc-alpha/2012-02/msg00647.html>), uchar.h
has gratuitously complicated code to determine the types for char16_t
and char32_t, and to reject including that header for pre-C11
compilers not defining __CHAR16_TYPE__ and __CHAR32_TYPE__.  Since
those types are always required to match uint_least16_t and
uint_least32_t, which glibc knows how to define without reference to
such predefined macros, it's safe just to define those types the same
as the *least* types are defined in stdint.h, so allowing the header
to work with (for example) GCC 4.3.

This patch implements that.  bits/types.h is made to define
__int_leastN_t and __uint_leastN_t so the logic for those types can
stay in a single place, and stdint.h is made to use those __*_t to
define the public *_t types.  uchar.h is then made to use
__uint_least16_t and __uint_least32_t to define char16_t and char32_t,
so simplifying the logic there.  A new test is added that verifies the
types chosen for char16_t and char32_t do indeed match the types the
compiler uses for u"" and U"" string literals.

Tested for x86_64.  (I have not tested with any of the older compilers
for which this would actually make a difference to whether you can
include uchar.h.)

2018-02-01  Joseph Myers  <joseph@codesourcery.com>

	[BZ #17979]
	* posix/bits/types.h (__int_least8_t): New typedef.
	(__uint_least8_t): Likewise.
	(__int_least16_t): Likewise.
	(__uint_least16_t): Likewise.
	(__int_least32_t): Likewise.
	(__uint_least32_t): Likewise.
	(__int_least64_t): Likewise.
	(__uint_least64_t): Likewise.
	* sysdeps/generic/stdint.h (int_least8_t): Define using
	__int_least8_t.
	(int_least16_t): Define using __int_least16_t.
	(int_least32_t): Define using __int_least32_t.
	(int_least64_t): Define using __int_least64_t.
	(uint_least8_t): Define using __uint_least8_t.
	(uint_least16_t): Define using __uint_least16_t.
	(uint_least32_t): Define using __uint_least32_t.
	(uint_least64_t): Define using __uint_least64_t.
	* wcsmbs/uchar.h: Include <bits/types.h>.
	(char16_t): Define using __uint_least16_t conditional only on
	[!__USE_ISOCXX11].
	(char32_t): Define using __uint_least32_t conditional only on
	[!__USE_ISOCXX11].
	* wcsmbs/test-char-types.c: New file.
	* wcsmbs/Makefile (tests): Add test-char-types.


-- 
Joseph S. Myers
joseph@codesourcery.com

Comments

Joseph Myers Feb. 6, 2018, 6:01 p.m. | #1
Ping.  This patch 
<https://sourceware.org/ml/libc-alpha/2018-02/msg00032.html> is pending 
review.

-- 
Joseph S. Myers
joseph@codesourcery.com
Adhemerval Zanella Feb. 7, 2018, 6:46 p.m. | #2
On 01/02/2018 18:36, Joseph Myers wrote:
> As noted in bug 17979 (and as I noted earlier in

> <https://sourceware.org/ml/libc-alpha/2012-02/msg00647.html>), uchar.h

> has gratuitously complicated code to determine the types for char16_t

> and char32_t, and to reject including that header for pre-C11

> compilers not defining __CHAR16_TYPE__ and __CHAR32_TYPE__.  Since

> those types are always required to match uint_least16_t and

> uint_least32_t, which glibc knows how to define without reference to

> such predefined macros, it's safe just to define those types the same

> as the *least* types are defined in stdint.h, so allowing the header

> to work with (for example) GCC 4.3.

> 

> This patch implements that.  bits/types.h is made to define

> __int_leastN_t and __uint_leastN_t so the logic for those types can

> stay in a single place, and stdint.h is made to use those __*_t to

> define the public *_t types.  uchar.h is then made to use

> __uint_least16_t and __uint_least32_t to define char16_t and char32_t,

> so simplifying the logic there.  A new test is added that verifies the

> types chosen for char16_t and char32_t do indeed match the types the

> compiler uses for u"" and U"" string literals.

> 

> Tested for x86_64.  (I have not tested with any of the older compilers

> for which this would actually make a difference to whether you can

> include uchar.h.)


Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>


> +

> +#include <uchar.h>

> +

> +const char16_t *s16 = u"";

> +const char32_t *s32 = U"";

> +

> +static int

> +do_test (void)

> +{

> +  /* This is a compilation test.  */

> +  return 0;

> +}

> +

> +#include <support/test-driver.c>


Maybe we could add these kind of tests on build time to catch on cross-compilation,
as:

#define is_compatible(x, T) _Generic((x), T:1, default: 0)
#define check_compatibility(x, t) \
  _Static_assert (is_compatible (x, t), "type not compatible")

  check_compatibility (u"", char16_t *);
  check_compatibility (U"", char32_t *);
Joseph Myers Feb. 7, 2018, 8:30 p.m. | #3
On Wed, 7 Feb 2018, Adhemerval Zanella wrote:

> > +#include <uchar.h>

> > +

> > +const char16_t *s16 = u"";

> > +const char32_t *s32 = U"";

> > +

> > +static int

> > +do_test (void)

> > +{

> > +  /* This is a compilation test.  */

> > +  return 0;

> > +}

> > +

> > +#include <support/test-driver.c>

> 

> Maybe we could add these kind of tests on build time to catch on 

> cross-compilation, as:


The test above is fully functional for cross compilation (provided you 
compile the testsuite, not just glibc itself, of course).

> #define is_compatible(x, T) _Generic((x), T:1, default: 0)

> #define check_compatibility(x, t) \

>   _Static_assert (is_compatible (x, t), "type not compatible")

> 

>   check_compatibility (u"", char16_t *);

>   check_compatibility (U"", char32_t *);


This would be problematic because of -Wwrite-strings meaning the types end 
up as const char16_t * and const char32_t *.  The test as written works 
whether or not the type ends up being const-qualified.  (You don't want a 
test that actually relies on -Wwrite-strings being in use to get the 
tested-for type, because it's not ideal that -Wwrite-strings works by 
changing types of string constants; see GCC bug 61579.)

-- 
Joseph S. Myers
joseph@codesourcery.com
Adhemerval Zanella Feb. 8, 2018, 11:26 a.m. | #4
On 07/02/2018 18:30, Joseph Myers wrote:
> On Wed, 7 Feb 2018, Adhemerval Zanella wrote:

> 

>>> +#include <uchar.h>

>>> +

>>> +const char16_t *s16 = u"";

>>> +const char32_t *s32 = U"";

>>> +

>>> +static int

>>> +do_test (void)

>>> +{

>>> +  /* This is a compilation test.  */

>>> +  return 0;

>>> +}

>>> +

>>> +#include <support/test-driver.c>

>>

>> Maybe we could add these kind of tests on build time to catch on 

>> cross-compilation, as:

> 

> The test above is fully functional for cross compilation (provided you 

> compile the testsuite, not just glibc itself, of course).

> 

>> #define is_compatible(x, T) _Generic((x), T:1, default: 0)

>> #define check_compatibility(x, t) \

>>   _Static_assert (is_compatible (x, t), "type not compatible")

>>

>>   check_compatibility (u"", char16_t *);

>>   check_compatibility (U"", char32_t *);

> 

> This would be problematic because of -Wwrite-strings meaning the types end 

> up as const char16_t * and const char32_t *.  The test as written works 

> whether or not the type ends up being const-qualified.  (You don't want a 

> test that actually relies on -Wwrite-strings being in use to get the 

> tested-for type, because it's not ideal that -Wwrite-strings works by 

> changing types of string constants; see GCC bug 61579.)

> 


Fair enough.

Patch

diff --git a/posix/bits/types.h b/posix/bits/types.h
index bd06e2d..5e22ce4 100644
--- a/posix/bits/types.h
+++ b/posix/bits/types.h
@@ -47,6 +47,16 @@  __extension__ typedef signed long long int __int64_t;
 __extension__ typedef unsigned long long int __uint64_t;
 #endif
 
+/* Smallest types with at least a given width.  */
+typedef __int8_t __int_least8_t;
+typedef __uint8_t __uint_least8_t;
+typedef __int16_t __int_least16_t;
+typedef __uint16_t __uint_least16_t;
+typedef __int32_t __int_least32_t;
+typedef __uint32_t __uint_least32_t;
+typedef __int64_t __int_least64_t;
+typedef __uint64_t __uint_least64_t;
+
 /* quad_t is also 64 bits.  */
 #if __WORDSIZE == 64
 typedef long int __quad_t;
diff --git a/sysdeps/generic/stdint.h b/sysdeps/generic/stdint.h
index 046d8bf..11fb0eb 100644
--- a/sysdeps/generic/stdint.h
+++ b/sysdeps/generic/stdint.h
@@ -40,26 +40,16 @@ 
 /* Small types.  */
 
 /* Signed.  */
-typedef signed char		int_least8_t;
-typedef short int		int_least16_t;
-typedef int			int_least32_t;
-#if __WORDSIZE == 64
-typedef long int		int_least64_t;
-#else
-__extension__
-typedef long long int		int_least64_t;
-#endif
+typedef __int_least8_t int_least8_t;
+typedef __int_least16_t int_least16_t;
+typedef __int_least32_t int_least32_t;
+typedef __int_least64_t int_least64_t;
 
 /* Unsigned.  */
-typedef unsigned char		uint_least8_t;
-typedef unsigned short int	uint_least16_t;
-typedef unsigned int		uint_least32_t;
-#if __WORDSIZE == 64
-typedef unsigned long int	uint_least64_t;
-#else
-__extension__
-typedef unsigned long long int	uint_least64_t;
-#endif
+typedef __uint_least8_t uint_least8_t;
+typedef __uint_least16_t uint_least16_t;
+typedef __uint_least32_t uint_least32_t;
+typedef __uint_least64_t uint_least64_t;
 
 
 /* Fast types.  */
diff --git a/wcsmbs/Makefile b/wcsmbs/Makefile
index 653e289..3ee91d2 100644
--- a/wcsmbs/Makefile
+++ b/wcsmbs/Makefile
@@ -50,7 +50,7 @@  strop-tests :=  wcscmp wcsncmp wmemcmp wcslen wcschr wcsrchr wcscpy wcsnlen \
 tests := tst-wcstof wcsmbs-tst1 tst-wcsnlen tst-btowc tst-mbrtowc \
 	 tst-wcrtomb tst-wcpncpy tst-mbsrtowcs tst-wchar-h tst-mbrtowc2 \
 	 tst-c16c32-1 wcsatcliff tst-wcstol-locale tst-wcstod-nan-locale \
-	 tst-wcstod-round \
+	 tst-wcstod-round test-char-types \
 	 $(addprefix test-,$(strop-tests))
 
 include ../Rules
diff --git a/wcsmbs/test-char-types.c b/wcsmbs/test-char-types.c
new file mode 100644
index 0000000..3123761
--- /dev/null
+++ b/wcsmbs/test-char-types.c
@@ -0,0 +1,31 @@ 
+/* Test char16_t and char32_t types consistent with compiler.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <uchar.h>
+
+const char16_t *s16 = u"";
+const char32_t *s32 = U"";
+
+static int
+do_test (void)
+{
+  /* This is a compilation test.  */
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/wcsmbs/uchar.h b/wcsmbs/uchar.h
index 4d78f70..4d9b2ec 100644
--- a/wcsmbs/uchar.h
+++ b/wcsmbs/uchar.h
@@ -28,20 +28,13 @@ 
 #define __need_size_t
 #include <stddef.h>
 
+#include <bits/types.h>
 #include <bits/types/mbstate_t.h>
 
-#if defined __GNUC__ && !defined __USE_ISOCXX11
-/* Define the 16-bit and 32-bit character types.  Use the information
-   provided by the compiler.  */
-# if !defined __CHAR16_TYPE__ || !defined __CHAR32_TYPE__
-#  if defined __STDC_VERSION__ && __STDC_VERSION__ < 201000L
-#   error "<uchar.h> requires ISO C11 mode"
-#  else
-#   error "definitions of __CHAR16_TYPE__ and/or __CHAR32_TYPE__ missing"
-#  endif
-# endif
-typedef __CHAR16_TYPE__ char16_t;
-typedef __CHAR32_TYPE__ char32_t;
+#ifndef __USE_ISOCXX11
+/* Define the 16-bit and 32-bit character types.  */
+typedef __uint_least16_t char16_t;
+typedef __uint_least32_t char32_t;
 #endif