ctype: Fix bitfield types on 16-bit targets

Message ID 20180720110716.13812-1-sebastian.huber@embedded-brains.de
State Accepted
Commit 46ba1675c457324b0eeef4670a09101ef3f34c50
Headers show
Series
  • ctype: Fix bitfield types on 16-bit targets
Related show

Commit Message

Sebastian Huber July 20, 2018, 11:07 a.m.
This prevents errors like this:

newlib/libc/ctype/categories.c:6:3: error: width of 'first' exceeds its type
   unsigned int first: 24;
   ^

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>

---
 newlib/libc/ctype/categories.c  | 5 +++--
 newlib/libc/ctype/towctrans_l.c | 9 +++++----
 2 files changed, 8 insertions(+), 6 deletions(-)

-- 
2.13.7

Comments

Corinna Vinschen July 20, 2018, 11:57 a.m. | #1
On Jul 20 13:07, Sebastian Huber wrote:
> This prevents errors like this:

> 

> newlib/libc/ctype/categories.c:6:3: error: width of 'first' exceeds its type

>    unsigned int first: 24;

>    ^

> 

> Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>

> ---

>  newlib/libc/ctype/categories.c  | 5 +++--

>  newlib/libc/ctype/towctrans_l.c | 9 +++++----

>  2 files changed, 8 insertions(+), 6 deletions(-)


ACK.  Please push.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Hans-Bernhard Bröker July 20, 2018, 5:37 p.m. | #2
Am 20.07.2018 um 13:07 schrieb Sebastian Huber:
> -  unsigned int first: 24; -  unsigned short delta; +  uint_least32_t

> first: 24; +  uint_least16_t delta;


Unfortunately that makes the code's correctness implementation-defined,
because it now depends on what the underlying implementation's take on
the constraint C99 6.7.2.1p4 is:

	A bit-field shall have a type that is a qualified or unqualified
	version of _Bool, signed int, unsigned int, or some other	
	implementation-defined type.

I.e. compilers are not required to allow uint32_least_t in this place.
Paul Koning July 20, 2018, 5:43 p.m. | #3
> On Jul 20, 2018, at 1:37 PM, Hans-Bernhard Bröker <HBBroeker@t-online.de> wrote:

> 

> Am 20.07.2018 um 13:07 schrieb Sebastian Huber:

>> -  unsigned int first: 24; -  unsigned short delta; +  uint_least32_t

>> first: 24; +  uint_least16_t delta;

> 

> Unfortunately that makes the code's correctness implementation-defined,

> because it now depends on what the underlying implementation's take on

> the constraint C99 6.7.2.1p4 is:

> 

> 	A bit-field shall have a type that is a qualified or unqualified

> 	version of _Bool, signed int, unsigned int, or some other	

> 	implementation-defined type.

> 

> I.e. compilers are not required to allow uint32_least_t in this place.


Yes, but the original code was just as implementation-dependent, because it depended on "int" having at least 24 bits.  And indeed that is not true (or not necessarily true) on 16-bit targets.

Does "qualified" mean "long int" is allowed?  If so, that would be an alternative.  But the proposed patch has the advantage that it explicitly calls for a long-enough type.  

While legally speaking the standard doesn't require that to work, is it plausible that any real world compiler would object?

	paul
Sebastian Huber July 20, 2018, 7:11 p.m. | #4
----- Am 20. Jul 2018 um 19:37 schrieb Hans-Bernhard Bröker HBBroeker@t-online.de:

> Am 20.07.2018 um 13:07 schrieb Sebastian Huber:

>> -  unsigned int first: 24; -  unsigned short delta; +  uint_least32_t

>> first: 24; +  uint_least16_t delta;

> 

> Unfortunately that makes the code's correctness implementation-defined,

> because it now depends on what the underlying implementation's take on

> the constraint C99 6.7.2.1p4 is:

> 

>	A bit-field shall have a type that is a qualified or unqualified

>	version of _Bool, signed int, unsigned int, or some other

>	implementation-defined type.

> 

> I.e. compilers are not required to allow uint32_least_t in this place.


The previous code used short for a bit-field, so this patch makes nothing worse in this respect and fixes a real world regression on 16-bit targets.

Do you know a C99+ compiler which has a problem with such a bit-field?
Sebastian Huber July 20, 2018, 7:17 p.m. | #5
----- Am 20. Jul 2018 um 19:43 schrieb Paul Koning paulkoning@comcast.net:

[...]
> Does "qualified" mean "long int" is allowed?

[...]

No, qualified means const, volatile and restrict.
Hans-Bernhard Bröker July 21, 2018, 1:13 p.m. | #6
Am 20.07.2018 um 21:11 schrieb Sebastian Huber:
> ----- Am 20. Jul 2018 um 19:37 schrieb Hans-Bernhard Bröker HBBroeker@t-online.de:


>> I.e. compilers are not required to allow uint32_least_t in this place.


> The previous code used short for a bit-field, so this patch makes nothing worse in this respect 


That's not quite as certain as you make it sound.  Whether a compiler 
allows "short" for bitfields is entirely unrelated to it allowing 
"uint32_least_t" for them.  All four combinations of outcomes are 
possible (though not equally probable).


> Do you know a C99+ compiler which has a problem with such a bit-field?


No.  But neither do I claim my knowledge of C compilers to be complete.
Brian Inglis July 22, 2018, 6:46 a.m. | #7
On 2018-07-20 11:37, Hans-Bernhard Bröker wrote:
> Am 20.07.2018 um 13:07 schrieb Sebastian Huber:

>> -  unsigned int first: 24; -  unsigned short delta; +  uint_least32_t

>> first: 24; +  uint_least16_t delta;

> Unfortunately that makes the code's correctness implementation-defined,

> because it now depends on what the underlying implementation's take on

> the constraint C99 6.7.2.1p4 is:

>     A bit-field shall have a type that is a qualified or unqualified

>     version of _Bool, signed int, unsigned int, or some other   

>     implementation-defined type.

> i.e. compilers are not required to allow uint32_least_t in this place.


Old code breaks existing implementations; those implementations do support it:
if others have problems, another patch can fix that; please feel free to propose
a better patch.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Andre Vieira (lists) July 26, 2018, 4:05 p.m. | #8
On 20/07/18 12:57, Corinna Vinschen wrote:
> On Jul 20 13:07, Sebastian Huber wrote:

>> This prevents errors like this:

>>

>> newlib/libc/ctype/categories.c:6:3: error: width of 'first' exceeds its type

>>    unsigned int first: 24;

>>    ^

>>

>> Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>

>> ---

>>  newlib/libc/ctype/categories.c  | 5 +++--

>>  newlib/libc/ctype/towctrans_l.c | 9 +++++----

>>  2 files changed, 8 insertions(+), 6 deletions(-)

> 

> ACK.  Please push.

> 

> 

> Thanks,

> Corinna

> 

Hi,

Our aarch64 and arm bare-metal tests are showing failures with gcc's
libstdc++ 22_locale/ctype/to/wchar_t/1.cc after this patch.

src/gcc/libstdc++-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:66: void
test01(): Assertion 'c100 == c00' failed.

I haven't looked further into, just finished bisecting the issue. Any
idea what might be going wrong?

Cheers,
Andre
Brian Inglis July 26, 2018, 8:35 p.m. | #9
On 2018-07-26 10:05, Andre Vieira (lists) wrote:
> On 20/07/18 12:57, Corinna Vinschen wrote:

>> On Jul 20 13:07, Sebastian Huber wrote:

>>> This prevents errors like this:

>>> newlib/libc/ctype/categories.c:6:3: error: width of 'first' exceeds its type

>>>    unsigned int first: 24;

>>>    ^

>>> Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>

>>> ---

>>>  newlib/libc/ctype/categories.c  | 5 +++--

>>>  newlib/libc/ctype/towctrans_l.c | 9 +++++----

>>>  2 files changed, 8 insertions(+), 6 deletions(-)


>> ACK.  Please push.


> Our aarch64 and arm bare-metal tests are showing failures with gcc's 

> libstdc++ 22_locale/ctype/to/wchar_t/1.cc after this patch.

> src/gcc/libstdc++-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:66: void

> test01(): Assertion 'c100 == c00' failed.

> I haven't looked further into, just finished bisecting the issue. 

> Any idea what might be going wrong?

Besides uninformative messages, small s is not being uppercased to capital S;
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:

...
typedef wchar_t char_type;
class gnu_ctype: public std::ctype<char_type> { };

void test01()
{
...
  const char_type c00 = L'S';
  const char_type c10 = L's';

  gnu_ctype gctype;
  char_type c100;
...
  // char_type toupper(char_type c) const
  c100 = gctype.toupper(c10);
  VERIFY( c100 == c00 );
...

which if not overridden by a non-default locale elsewhere may end up at:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_configure_char.cc#l69

  char
  ctype<char>::do_toupper(char __c) const
  {
    int __x = __c;
    return (this->is(ctype_base::lower, __c) ? (__x - 'a' + 'A') : __x);
  }

which depends on the lower case test working correctly, which depends on the
locale table initialization, wherever that's done.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Thomas Wolff July 27, 2018, 4:17 p.m. | #10
Am 26.07.2018 um 22:35 schrieb Brian Inglis:
> On 2018-07-26 10:05, Andre Vieira (lists) wrote:

>> On 20/07/18 12:57, Corinna Vinschen wrote:

>>> On Jul 20 13:07, Sebastian Huber wrote:

>>>> This prevents errors like this:

>>>> newlib/libc/ctype/categories.c:6:3: error: width of 'first' exceeds its type

>>>>     unsigned int first: 24;

>>>>     ^

>>>> Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>

>>>> ---

>>>>   newlib/libc/ctype/categories.c  | 5 +++--

>>>>   newlib/libc/ctype/towctrans_l.c | 9 +++++----

>>>>   2 files changed, 8 insertions(+), 6 deletions(-)

>>> ACK.  Please push.

>> Our aarch64 and arm bare-metal tests are showing failures with gcc's

>> libstdc++ 22_locale/ctype/to/wchar_t/1.cc after this patch.

>> src/gcc/libstdc++-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:66: void

>> test01(): Assertion 'c100 == c00' failed.

>> I haven't looked further into, just finished bisecting the issue.

>> Any idea what might be going wrong?

> Besides uninformative messages, small s is not being uppercased to capital S;

> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:

>

> ...

> typedef wchar_t char_type;

> class gnu_ctype: public std::ctype<char_type> { };

>

> void test01()

> {

> ...

>    const char_type c00 = L'S';

>    const char_type c10 = L's';

>

>    gnu_ctype gctype;

>    char_type c100;

> ...

>    // char_type toupper(char_type c) const

>    c100 = gctype.toupper(c10);

>    VERIFY( c100 == c00 );

> ...

>

> which if not overridden by a non-default locale elsewhere may end up at:

>

> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_configure_char.cc#l69

>

>    char

>    ctype<char>::do_toupper(char __c) const

>    {

>      int __x = __c;

>      return (this->is(ctype_base::lower, __c) ? (__x - 'a' + 'A') : __x);

>    }

>

> which depends on the lower case test working correctly, which depends on the

> locale table initialization, wherever that's done.

Whatever gctype.toupper may be mapped to, the following gives me the 
correct output:
   printf ("%X %X\n", toupper('s'), towupper('s'));
Andre Vieira (lists) July 27, 2018, 4:23 p.m. | #11
On 27/07/18 17:17, Thomas Wolff wrote:
> 

> Am 26.07.2018 um 22:35 schrieb Brian Inglis:

>> On 2018-07-26 10:05, Andre Vieira (lists) wrote:

>>> On 20/07/18 12:57, Corinna Vinschen wrote:

>>>> On Jul 20 13:07, Sebastian Huber wrote:

>>>>> This prevents errors like this:

>>>>> newlib/libc/ctype/categories.c:6:3: error: width of 'first' exceeds

>>>>> its type

>>>>>     unsigned int first: 24;

>>>>>     ^

>>>>> Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>

>>>>> ---

>>>>>   newlib/libc/ctype/categories.c  | 5 +++--

>>>>>   newlib/libc/ctype/towctrans_l.c | 9 +++++----

>>>>>   2 files changed, 8 insertions(+), 6 deletions(-)

>>>> ACK.  Please push.

>>> Our aarch64 and arm bare-metal tests are showing failures with gcc's

>>> libstdc++ 22_locale/ctype/to/wchar_t/1.cc after this patch.

>>> src/gcc/libstdc++-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:66: void

>>> test01(): Assertion 'c100 == c00' failed.

>>> I haven't looked further into, just finished bisecting the issue.

>>> Any idea what might be going wrong?

>> Besides uninformative messages, small s is not being uppercased to

>> capital S;

>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/testsuite/22_locale/ctype/to/wchar_t/1.cc:

>>

>>

>> ...

>> typedef wchar_t char_type;

>> class gnu_ctype: public std::ctype<char_type> { };

>>

>> void test01()

>> {

>> ...

>>    const char_type c00 = L'S';

>>    const char_type c10 = L's';

>>

>>    gnu_ctype gctype;

>>    char_type c100;

>> ...

>>    // char_type toupper(char_type c) const

>>    c100 = gctype.toupper(c10);

>>    VERIFY( c100 == c00 );

>> ...

>>

>> which if not overridden by a non-default locale elsewhere may end up at:

>>

>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_configure_char.cc#l69

>>

>>

>>    char

>>    ctype<char>::do_toupper(char __c) const

>>    {

>>      int __x = __c;

>>      return (this->is(ctype_base::lower, __c) ? (__x - 'a' + 'A') : __x);

>>    }

>>

>> which depends on the lower case test working correctly, which depends

>> on the

>> locale table initialization, wherever that's done.

> Whatever gctype.toupper may be mapped to, the following gives me the

> correct output:

>   printf ("%X %X\n", toupper('s'), towupper('s'));

> 


Sebastian's new patch series
https://sourceware.org/ml/newlib/2018/msg00570.html fixes this issue on Arm.

Cheers,
Andre

Patch

diff --git a/newlib/libc/ctype/categories.c b/newlib/libc/ctype/categories.c
index c237324ec..85328ef2e 100644
--- a/newlib/libc/ctype/categories.c
+++ b/newlib/libc/ctype/categories.c
@@ -1,10 +1,11 @@ 
 #include <wctype.h>
+#include <stdint.h>
 #include "categories.h"
 
 struct _category {
   enum category cat: 8;
-  unsigned int first: 24;
-  unsigned short delta;
+  uint_least32_t first: 24;
+  uint_least16_t delta;
 } __attribute__((packed));
 
 static const struct _category categories[] = {
diff --git a/newlib/libc/ctype/towctrans_l.c b/newlib/libc/ctype/towctrans_l.c
index 9759cf7bc..42085ac78 100644
--- a/newlib/libc/ctype/towctrans_l.c
+++ b/newlib/libc/ctype/towctrans_l.c
@@ -1,6 +1,7 @@ 
 /* Modified (m) 2017 Thomas Wolff: revise Unicode and locale/wchar handling */
 #include <_ansi.h>
 #include <wctype.h>
+#include <stdint.h>
 //#include <errno.h>
 #include "local.h"
 
@@ -35,10 +36,10 @@ 
 enum {TO1, TOLO, TOUP, TOBOTH};
 enum {EVENCAP, ODDCAP};
 static struct caseconv_entry {
-  unsigned int first: 21;
-  unsigned short diff: 8;
-  unsigned char mode: 2;
-  int delta: 17;
+  uint_least32_t first: 21;
+  uint_least8_t diff: 8;
+  uint_least8_t mode: 2;
+  uint_least32_t delta: 17;
 } __attribute__ ((packed))
 caseconv_table [] = {
 #include "caseconv.t"