[2/2] ctype: use less short names in public header

Message ID 20211109012459.30826-1-vapier@gentoo.org
State New
Headers show
Series
  • [1/2] define _COMPILING_NEWLIB for all targets when compiling
Related show

Commit Message

Mike Frysinger Nov. 9, 2021, 1:24 a.m.
We're seeing a build failure in GNU sim code which is using _P locally
but the ctype.h define clashes with it.  Rename these to use the same
symbols that glibc does.  They're a bit more verbose, but seems likely
that we'll have fewer conflicts if glibc isn't seeing them.

However, these shortnames are still used internally by ctype modules
to produce pretty concise source code, so use _COMPILING_NEWLIB to
keep them around when compiling newlib itself where we have better
control over short name conflicts.
---
 newlib/libc/include/ctype.h | 77 ++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 32 deletions(-)

-- 
2.33.0

Comments

Corinna Vinschen Nov. 9, 2021, 11:38 a.m. | #1
On Nov  8 20:24, Mike Frysinger wrote:
> We're seeing a build failure in GNU sim code which is using _P locally

> but the ctype.h define clashes with it.  Rename these to use the same

> symbols that glibc does.  They're a bit more verbose, but seems likely

> that we'll have fewer conflicts if glibc isn't seeing them.

                                     ^^^^^
Mixing newlib and glibc?  That's just a typo, I guess?

> However, these shortnames are still used internally by ctype modules

> to produce pretty concise source code, so use _COMPILING_NEWLIB to

> keep them around when compiling newlib itself where we have better

> control over short name conflicts.


Hmm.  I'm not sure we should really maintain two different sets of
symbols.  I think it would be better to go the entire way and replace
the single letter symbols with the new, more speaking ones throughout.
There are not that many affected files and the change might be done with
sed mostly.

The only exceptions *could* be libc/ctype/ctype_.c and the local
ctype_iso.h and ctype_cp.h headers it includes.  A local definition
of the single letter symbols in ctype_.c would be sufficient then.
But even there we might be better off with the new symbols in the long
run...


Corinna
Mike Frysinger Nov. 10, 2021, 12:18 a.m. | #2
On 09 Nov 2021 12:38, Corinna Vinschen wrote:
> On Nov  8 20:24, Mike Frysinger wrote:

> > We're seeing a build failure in GNU sim code which is using _P locally

> > but the ctype.h define clashes with it.  Rename these to use the same

> > symbols that glibc does.  They're a bit more verbose, but seems likely

> > that we'll have fewer conflicts if glibc isn't seeing them.

>                                      ^^^^^

> Mixing newlib and glibc?  That's just a typo, I guess?


i meant glibc here.  these are the same symbol names that glibc is using,
and it's not seeing conflicts in the wider ecosystem.  so if they aren't
seeing issues, it's likely newlib won't either if it uses the same names.

> > However, these shortnames are still used internally by ctype modules

> > to produce pretty concise source code, so use _COMPILING_NEWLIB to

> > keep them around when compiling newlib itself where we have better

> > control over short name conflicts.

> 

> Hmm.  I'm not sure we should really maintain two different sets of

> symbols.  I think it would be better to go the entire way and replace

> the single letter symbols with the new, more speaking ones throughout.

> There are not that many affected files and the change might be done with

> sed mostly.

> 

> The only exceptions *could* be libc/ctype/ctype_.c and the local

> ctype_iso.h and ctype_cp.h headers it includes.  A local definition

> of the single letter symbols in ctype_.c would be sufficient then.

> But even there we might be better off with the new symbols in the long

> run...


i don't have an opinion.  i can easily run sed on the files and send the
result out.  i figured people would prefer having the condense tables so
they could scan them quicker by eye.

so let me know what you want and i'll do it ;).
-mike
Corinna Vinschen Nov. 10, 2021, 10:56 a.m. | #3
On Nov  9 19:18, Mike Frysinger wrote:
> On 09 Nov 2021 12:38, Corinna Vinschen wrote:

> > On Nov  8 20:24, Mike Frysinger wrote:

> > > We're seeing a build failure in GNU sim code which is using _P locally

> > > but the ctype.h define clashes with it.  Rename these to use the same

> > > symbols that glibc does.  They're a bit more verbose, but seems likely

> > > that we'll have fewer conflicts if glibc isn't seeing them.

> >                                      ^^^^^

> > Mixing newlib and glibc?  That's just a typo, I guess?

> 

> i meant glibc here.  these are the same symbol names that glibc is using,

> and it's not seeing conflicts in the wider ecosystem.  so if they aren't

> seeing issues, it's likely newlib won't either if it uses the same names.

> 

> > > However, these shortnames are still used internally by ctype modules

> > > to produce pretty concise source code, so use _COMPILING_NEWLIB to

> > > keep them around when compiling newlib itself where we have better

> > > control over short name conflicts.

> > 

> > Hmm.  I'm not sure we should really maintain two different sets of

> > symbols.  I think it would be better to go the entire way and replace

> > the single letter symbols with the new, more speaking ones throughout.

> > There are not that many affected files and the change might be done with

> > sed mostly.

> > 

> > The only exceptions *could* be libc/ctype/ctype_.c and the local

> > ctype_iso.h and ctype_cp.h headers it includes.  A local definition

> > of the single letter symbols in ctype_.c would be sufficient then.

> > But even there we might be better off with the new symbols in the long

> > run...

> 

> i don't have an opinion.  i can easily run sed on the files and send the

> result out.  i figured people would prefer having the condense tables so

> they could scan them quicker by eye.

> 

> so let me know what you want and i'll do it ;).


Great.  Let's try the compromise from above: Drop _X etc from the
header, just define them in libc/ctype/ctype_.c.


Thanks,
Corinna
Richard Earnshaw Nov. 23, 2021, 3:09 p.m. | #4
This is wrong and breaks all old versions of C++.

The GNU sim code should not be using reserved names (those starting _) 
in normal source code.  Such names are reserved to the implementation.

R.

On 09/11/2021 01:24, Mike Frysinger wrote:
> We're seeing a build failure in GNU sim code which is using _P locally

> but the ctype.h define clashes with it.  Rename these to use the same

> symbols that glibc does.  They're a bit more verbose, but seems likely

> that we'll have fewer conflicts if glibc isn't seeing them.

> 

> However, these shortnames are still used internally by ctype modules

> to produce pretty concise source code, so use _COMPILING_NEWLIB to

> keep them around when compiling newlib itself where we have better

> control over short name conflicts.

> ---

>   newlib/libc/include/ctype.h | 77 ++++++++++++++++++++++---------------

>   1 file changed, 45 insertions(+), 32 deletions(-)

> 

> diff --git a/newlib/libc/include/ctype.h b/newlib/libc/include/ctype.h

> index 932a567e25db..f2a4368da5d2 100644

> --- a/newlib/libc/include/ctype.h

> +++ b/newlib/libc/include/ctype.h

> @@ -57,14 +57,27 @@ extern int isascii_l (int __c, locale_t __l);

>   extern int toascii_l (int __c, locale_t __l);

>   #endif

>   

> -#define	_U	01

> -#define	_L	02

> -#define	_N	04

> -#define	_S	010

> -#define _P	020

> -#define _C	040

> -#define _X	0100

> -#define	_B	0200

> +enum

> +{

> +  _ISupper = 01,

> +  _ISlower = 02,

> +  _ISdigit = 04,

> +  _ISspace = 010,

> +  _ISpunct = 020,

> +  _IScntrl = 040,

> +  _ISxdigit = 0100,

> +  _ISblank = 0200,

> +};

> +#ifdef _COMPILING_NEWLIB

> +# define _U _ISupper

> +# define _L _ISlower

> +# define _N _ISdigit

> +# define _S _ISspace

> +# define _P _ISpunct

> +# define _C _IScntrl

> +# define _X _ISxdigit

> +# define _B _ISblank

> +#endif

>   

>   /* For C++ backward-compatibility only.  */

>   extern	__IMPORT const char	_ctype_[];

> @@ -89,22 +102,22 @@ const char *__locale_ctype_ptr (void);

>      an out-of-bounds reference on a 64-bit machine.  */

>   #define __ctype_lookup(__c) ((__CTYPE_PTR+sizeof(""[__c]))[(int)(__c)])

>   

> -#define	isalpha(__c)	(__ctype_lookup(__c)&(_U|_L))

> -#define	isupper(__c)	((__ctype_lookup(__c)&(_U|_L))==_U)

> -#define	islower(__c)	((__ctype_lookup(__c)&(_U|_L))==_L)

> -#define	isdigit(__c)	(__ctype_lookup(__c)&_N)

> -#define	isxdigit(__c)	(__ctype_lookup(__c)&(_X|_N))

> -#define	isspace(__c)	(__ctype_lookup(__c)&_S)

> -#define ispunct(__c)	(__ctype_lookup(__c)&_P)

> -#define isalnum(__c)	(__ctype_lookup(__c)&(_U|_L|_N))

> -#define isprint(__c)	(__ctype_lookup(__c)&(_P|_U|_L|_N|_B))

> -#define	isgraph(__c)	(__ctype_lookup(__c)&(_P|_U|_L|_N))

> -#define iscntrl(__c)	(__ctype_lookup(__c)&_C)

> +#define	isalpha(__c)	(__ctype_lookup(__c) & (_ISupper|_ISlower))

> +#define	isupper(__c)	((__ctype_lookup(__c) & (_ISupper|_ISlower)) == _ISupper)

> +#define	islower(__c)	((__ctype_lookup(__c) & (_ISupper|_ISlower)) == _ISlower)

> +#define	isdigit(__c)	(__ctype_lookup(__c) & _ISdigit)

> +#define	isxdigit(__c)	(__ctype_lookup(__c) & (_ISxdigit|_ISdigit))

> +#define	isspace(__c)	(__ctype_lookup(__c) & _ISspace)

> +#define ispunct(__c)	(__ctype_lookup(__c) & _ISpunct)

> +#define isalnum(__c)	(__ctype_lookup(__c) & (_ISupper|_ISlower|_ISdigit))

> +#define isprint(__c)	(__ctype_lookup(__c) & (_ISpunct|_ISupper|_ISlower|_ISdigit|_ISblank))

> +#define	isgraph(__c)	(__ctype_lookup(__c) & (_ISpunct|_ISupper|_ISlower|_ISdigit))

> +#define iscntrl(__c)	(__ctype_lookup(__c) & _IScntrl)

>   

>   #if defined(__GNUC__) && __ISO_C_VISIBLE >= 1999

>   #define isblank(__c) \

>     __extension__ ({ __typeof__ (__c) __x = (__c);		\

> -        (__ctype_lookup(__x)&_B) || (int) (__x) == '\t';})

> +        (__ctype_lookup(__x)&_ISblank) || (int) (__x) == '\t';})

>   #endif

>   

>   #if __POSIX_VISIBLE >= 200809

> @@ -120,22 +133,22 @@ __locale_ctype_ptr_l(locale_t _l)

>   #endif

>   #define __ctype_lookup_l(__c,__l) ((__locale_ctype_ptr_l(__l)+sizeof(""[__c]))[(int)(__c)])

>   

> -#define	isalpha_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_U|_L))

> -#define	isupper_l(__c,__l)	((__ctype_lookup_l(__c,__l)&(_U|_L))==_U)

> -#define	islower_l(__c,__l)	((__ctype_lookup_l(__c,__l)&(_U|_L))==_L)

> -#define	isdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_N)

> -#define	isxdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_X|_N))

> -#define	isspace_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_S)

> -#define ispunct_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_P)

> -#define isalnum_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_U|_L|_N))

> -#define isprint_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_P|_U|_L|_N|_B))

> -#define	isgraph_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_P|_U|_L|_N))

> -#define iscntrl_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_C)

> +#define	isalpha_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower))

> +#define	isupper_l(__c,__l)	((__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower)) == _ISupper)

> +#define	islower_l(__c,__l)	((__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower)) == _ISlower)

> +#define	isdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _ISdigit)

> +#define	isxdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISxdigit|_ISdigit))

> +#define	isspace_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _ISspace)

> +#define ispunct_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _ISpunct)

> +#define isalnum_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower|_ISdigit))

> +#define isprint_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISpunct|_ISupper|_ISlower|_ISdigit|_ISblank))

> +#define	isgraph_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISpunct|_ISupper|_ISlower|_ISdigit))

> +#define iscntrl_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _IScntrl)

>   

>   #if defined(__GNUC__)

>   #define isblank_l(__c, __l) \

>     __extension__ ({ __typeof__ (__c) __x = (__c);		\

> -        (__ctype_lookup_l(__x,__l)&_B) || (int) (__x) == '\t';})

> +        (__ctype_lookup_l(__x,__l)&_ISblank) || (int) (__x) == '\t';})

>   #endif

>   

>   #endif /* __POSIX_VISIBLE >= 200809 */

>
Mike Frysinger Nov. 24, 2021, 4:15 a.m. | #5
On 23 Nov 2021 15:09, Richard Earnshaw wrote:
> This is wrong and breaks all old versions of C++.


this is a bit vague.  it would help if you provided details as to what broke.
i doubt this broke all old versions of C++ everywhere.

i'm guessing you're referring to the GNU C++ (libstdc++) library specifically
and its hardcoding of newlib's internal ctype define names.
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

if you're talking about something else, please state so clearly.

> The GNU sim code should not be using reserved names (those starting _) 

> in normal source code.  Such names are reserved to the implementation.


that's not really a good reason to go pooping all over the namespace.

we can maintain backwards compat here for C++ code fairly easily:

--- a/newlib/libc/include/ctype.h
+++ b/newlib/libc/include/ctype.h
@@ -71,6 +71,16 @@ enum
 
 /* For C++ backward-compatibility only.  */
 extern	__IMPORT const char	_ctype_[];
+#ifdef __cplusplus
+# define _U _ISupper
+# define _L _ISlower
+# define _N _ISdigit
+# define _S _ISspace
+# define _P _ISpunct
+# define _C _IScntrl
+# define _X _ISxdigit
+# define _B _ISblank
+#endif
 
 #ifdef __HAVE_LOCALE_INFO__
 const char *__locale_ctype_ptr (void);

considering the numerical value is part of the ABI, not the name, libstdc++
could have inlined the constant values instead.  i wonder how long of a version
skew is reasonable if we wanted to transition it to the new names to match what
glibc uses.
-mike
Richard Earnshaw Nov. 24, 2021, 10:58 a.m. | #6
On 24/11/2021 04:15, Mike Frysinger wrote:
> On 23 Nov 2021 15:09, Richard Earnshaw wrote:

>> This is wrong and breaks all old versions of C++.

> 

> this is a bit vague.  it would help if you provided details as to what broke.

> i doubt this broke all old versions of C++ everywhere.


My apologies, I did mean to say GNU C++.
> 

> i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

> and its hardcoding of newlib's internal ctype define names.

> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

> 

> if you're talking about something else, please state so clearly.

> 

>> The GNU sim code should not be using reserved names (those starting _) 

>> in normal source code.  Such names are reserved to the implementation.

> 

> that's not really a good reason to go pooping all over the namespace.

> 

It's not 'pooping' on the namespace.  Identifiers starting with _ are
all reserved, one way or another; some for the compiler, some for the
library implementation and some for future changes to the standard.
Applications have no business defining such names.

> we can maintain backwards compat here for C++ code fairly easily:

> 

> --- a/newlib/libc/include/ctype.h

> +++ b/newlib/libc/include/ctype.h

> @@ -71,6 +71,16 @@ enum

>  

>  /* For C++ backward-compatibility only.  */

>  extern	__IMPORT const char	_ctype_[];

> +#ifdef __cplusplus

> +# define _U _ISupper

> +# define _L _ISlower

> +# define _N _ISdigit

> +# define _S _ISspace

> +# define _P _ISpunct

> +# define _C _IScntrl

> +# define _X _ISxdigit

> +# define _B _ISblank

> +#endif



>  

>  #ifdef __HAVE_LOCALE_INFO__

>  const char *__locale_ctype_ptr (void);

> 

> considering the numerical value is part of the ABI, not the name, libstdc++

> could have inlined the constant values instead.  i wonder how long of a version

> skew is reasonable if we wanted to transition it to the new names to match what

> glibc uses.



I think you'll find that the BSD libraries have been using _[ULNSPCXB]
since pretty much forever.  Why do we need to be compatible with glibc?

R.

> -mike

>
Richard Earnshaw Nov. 24, 2021, 11:01 a.m. | #7
On 24/11/2021 10:58, Richard Earnshaw wrote:
> On 24/11/2021 04:15, Mike Frysinger wrote:

>> On 23 Nov 2021 15:09, Richard Earnshaw wrote:

>>> This is wrong and breaks all old versions of C++.

>>

>> this is a bit vague.  it would help if you provided details as to what broke.

>> i doubt this broke all old versions of C++ everywhere.

> 

> My apologies, I did mean to say GNU C++.

>>

>> i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

>> and its hardcoding of newlib's internal ctype define names.

>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

>>

>> if you're talking about something else, please state so clearly.

>>

>>> The GNU sim code should not be using reserved names (those starting _) 

>>> in normal source code.  Such names are reserved to the implementation.

>>

>> that's not really a good reason to go pooping all over the namespace.

>>

> It's not 'pooping' on the namespace.  Identifiers starting with _ are

> all reserved, one way or another; some for the compiler, some for the

> library implementation and some for future changes to the standard.

> Applications have no business defining such names.

> 

>> we can maintain backwards compat here for C++ code fairly easily:

>>

>> --- a/newlib/libc/include/ctype.h

>> +++ b/newlib/libc/include/ctype.h

>> @@ -71,6 +71,16 @@ enum

>>  

>>  /* For C++ backward-compatibility only.  */

>>  extern	__IMPORT const char	_ctype_[];

>> +#ifdef __cplusplus

>> +# define _U _ISupper

>> +# define _L _ISlower

>> +# define _N _ISdigit

>> +# define _S _ISspace

>> +# define _P _ISpunct

>> +# define _C _IScntrl

>> +# define _X _ISxdigit

>> +# define _B _ISblank

>> +#endif

> 

> 

>>  

>>  #ifdef __HAVE_LOCALE_INFO__

>>  const char *__locale_ctype_ptr (void);

>>

>> considering the numerical value is part of the ABI, not the name, libstdc++

>> could have inlined the constant values instead.  i wonder how long of a version

>> skew is reasonable if we wanted to transition it to the new names to match what

>> glibc uses.

> 

> 

> I think you'll find that the BSD libraries have been using _[ULNSPCXB]

> since pretty much forever.  Why do we need to be compatible with glibc?


Eg: http://www.retro11.de/ouxr/43bsd/usr/src/include/ctype.h.html

R.

> 

> R.

> 

>> -mike

>>

>
Jonathan Wakely Nov. 30, 2021, 12:01 p.m. | #8
On 23/11/21 23:15 -0500, Mike Frysinger wrote:
>On 23 Nov 2021 15:09, Richard Earnshaw wrote:

>> This is wrong and breaks all old versions of C++.

>

>this is a bit vague.  it would help if you provided details as to what broke.

>i doubt this broke all old versions of C++ everywhere.

>

>i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

>and its hardcoding of newlib's internal ctype define names.

>https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0


Yes, you were CC'd on the GCC bug slightly before Richard sent his
email to this list:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c16

>if you're talking about something else, please state so clearly.

>

>> The GNU sim code should not be using reserved names (those starting _)

>> in normal source code.  Such names are reserved to the implementation.

>

>that's not really a good reason to go pooping all over the namespace.

>

>we can maintain backwards compat here for C++ code fairly easily:


Yes, or only do that for GCC < 12, as I suggested in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c19

#if defined(__GNUC__) && defined(__cplusplus)
# if __GNUC__ < 12

The libstdc++ code on trunk uses the new _ISupper names.

I have no opinion on how long you should keep such backwards
compatibility around. Whatever time limit you set, at some point it
will make a new newlib release unusable with past G++ versions.

But either way, the GNU sim code needed to be fixed. Your commit msg
for that was misleading:

     Some C libraries export _P symbols in their headers (like older
     newlib and its ctype.h), so use P_ instead to avoid conflicts.

It's irrelevant whether they use it or not: _P is a reserved name
meaning they *could* use it, and so anything outside "the
implementation" MUST NOT use it.

This isn't "that particular name happens to clash with a particular
header", it's "that name causes undefined behaviour, period".
Jonathan Wakely Nov. 30, 2021, 12:18 p.m. | #9
On 09/11/21 19:18 -0500, Mike Frysinger wrote:
>On 09 Nov 2021 12:38, Corinna Vinschen wrote:

>> On Nov  8 20:24, Mike Frysinger wrote:

>> > We're seeing a build failure in GNU sim code which is using _P locally

>> > but the ctype.h define clashes with it.  Rename these to use the same

>> > symbols that glibc does.  They're a bit more verbose, but seems likely

>> > that we'll have fewer conflicts if glibc isn't seeing them.

>>                                      ^^^^^

>> Mixing newlib and glibc?  That's just a typo, I guess?

>

>i meant glibc here.  these are the same symbol names that glibc is using,

>and it's not seeing conflicts in the wider ecosystem.  so if they aren't

>seeing issues, it's likely newlib won't either if it uses the same names.


There's a reason you don't see conflicts with those names:

They are reserved names, just like _P.

Any user code that conflicts with those names has a bug and needs
fixing.
Corinna Vinschen Nov. 30, 2021, 3:14 p.m. | #10
On Nov 30 12:01, Jonathan Wakely wrote:
> On 23/11/21 23:15 -0500, Mike Frysinger wrote:

> > On 23 Nov 2021 15:09, Richard Earnshaw wrote:

> > > This is wrong and breaks all old versions of C++.

> > 

> > this is a bit vague.  it would help if you provided details as to what broke.

> > i doubt this broke all old versions of C++ everywhere.

> > 

> > i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

> > and its hardcoding of newlib's internal ctype define names.

> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

> 

> Yes, you were CC'd on the GCC bug slightly before Richard sent his

> email to this list:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c16

> 

> > if you're talking about something else, please state so clearly.

> > 

> > > The GNU sim code should not be using reserved names (those starting _)

> > > in normal source code.  Such names are reserved to the implementation.

> > 

> > that's not really a good reason to go pooping all over the namespace.

> > 

> > we can maintain backwards compat here for C++ code fairly easily:

> 

> Yes, or only do that for GCC < 12, as I suggested in

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c19

> 

> #if defined(__GNUC__) && defined(__cplusplus)

> # if __GNUC__ < 12

> 

> The libstdc++ code on trunk uses the new _ISupper names.

> 

> I have no opinion on how long you should keep such backwards

> compatibility around. Whatever time limit you set, at some point it

> will make a new newlib release unusable with past G++ versions.


Is there a good reason to revert these patches in newlib?  I see the
problem but I'm unclear on how problematic the change is in real life.


Thanks,
Corinna
Jonathan Wakely Nov. 30, 2021, 5:12 p.m. | #11
On Tue, 30 Nov 2021 at 15:14, Corinna Vinschen <vinschen@redhat.com> wrote:
>

> On Nov 30 12:01, Jonathan Wakely wrote:

> > On 23/11/21 23:15 -0500, Mike Frysinger wrote:

> > > On 23 Nov 2021 15:09, Richard Earnshaw wrote:

> > > > This is wrong and breaks all old versions of C++.

> > >

> > > this is a bit vague.  it would help if you provided details as to what broke.

> > > i doubt this broke all old versions of C++ everywhere.

> > >

> > > i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

> > > and its hardcoding of newlib's internal ctype define names.

> > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

> >

> > Yes, you were CC'd on the GCC bug slightly before Richard sent his

> > email to this list:

> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c16

> >

> > > if you're talking about something else, please state so clearly.

> > >

> > > > The GNU sim code should not be using reserved names (those starting _)

> > > > in normal source code.  Such names are reserved to the implementation.

> > >

> > > that's not really a good reason to go pooping all over the namespace.

> > >

> > > we can maintain backwards compat here for C++ code fairly easily:

> >

> > Yes, or only do that for GCC < 12, as I suggested in

> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c19

> >

> > #if defined(__GNUC__) && defined(__cplusplus)

> > # if __GNUC__ < 12

> >

> > The libstdc++ code on trunk uses the new _ISupper names.

> >

> > I have no opinion on how long you should keep such backwards

> > compatibility around. Whatever time limit you set, at some point it

> > will make a new newlib release unusable with past G++ versions.

>

> Is there a good reason to revert these patches in newlib?  I see the

> problem but I'm unclear on how problematic the change is in real life.


You cannot use newlib from Git to build any released version of GCC.

Is building newlib from Git only supported when using GCC trunk, or is
it supposed to build with e.g. GCC 11.2.0 from July this year? If yes,
then newlib needs changes (whether reverting the change entirely, or
just making another change to restore the old names in addition to the
new ones).
Jonathan Wakely Nov. 30, 2021, 5:15 p.m. | #12
On Tue, 30 Nov 2021 at 17:12, Jonathan Wakely <jwakely@redhat.com> wrote:
>

> On Tue, 30 Nov 2021 at 15:14, Corinna Vinschen <vinschen@redhat.com> wrote:

> >

> > On Nov 30 12:01, Jonathan Wakely wrote:

> > > On 23/11/21 23:15 -0500, Mike Frysinger wrote:

> > > > On 23 Nov 2021 15:09, Richard Earnshaw wrote:

> > > > > This is wrong and breaks all old versions of C++.

> > > >

> > > > this is a bit vague.  it would help if you provided details as to what broke.

> > > > i doubt this broke all old versions of C++ everywhere.

> > > >

> > > > i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

> > > > and its hardcoding of newlib's internal ctype define names.

> > > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

> > >

> > > Yes, you were CC'd on the GCC bug slightly before Richard sent his

> > > email to this list:

> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c16

> > >

> > > > if you're talking about something else, please state so clearly.

> > > >

> > > > > The GNU sim code should not be using reserved names (those starting _)

> > > > > in normal source code.  Such names are reserved to the implementation.

> > > >

> > > > that's not really a good reason to go pooping all over the namespace.

> > > >

> > > > we can maintain backwards compat here for C++ code fairly easily:

> > >

> > > Yes, or only do that for GCC < 12, as I suggested in

> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c19

> > >

> > > #if defined(__GNUC__) && defined(__cplusplus)

> > > # if __GNUC__ < 12

> > >

> > > The libstdc++ code on trunk uses the new _ISupper names.

> > >

> > > I have no opinion on how long you should keep such backwards

> > > compatibility around. Whatever time limit you set, at some point it

> > > will make a new newlib release unusable with past G++ versions.

> >

> > Is there a good reason to revert these patches in newlib?  I see the

> > problem but I'm unclear on how problematic the change is in real life.

>

> You cannot use newlib from Git to build any released version of GCC.

>

> Is building newlib from Git only supported when using GCC trunk, or is


Oops, I mean building *against* newlib from Git, not building newlib
itself. You can still build newlib itself, because it doesn't use C++.
But you can't build a GCC 11.2.0 compiler that uses the latest newlib
from Git as its libc.

> it supposed to build with e.g. GCC 11.2.0 from July this year? If yes,

> then newlib needs changes (whether reverting the change entirely, or

> just making another change to restore the old names in addition to the

> new ones).
Richard Earnshaw Nov. 30, 2021, 5:52 p.m. | #13
On 30/11/2021 17:15, Jonathan Wakely wrote:
> On Tue, 30 Nov 2021 at 17:12, Jonathan Wakely <jwakely@redhat.com> wrote:

>>

>> On Tue, 30 Nov 2021 at 15:14, Corinna Vinschen <vinschen@redhat.com> wrote:

>>>

>>> On Nov 30 12:01, Jonathan Wakely wrote:

>>>> On 23/11/21 23:15 -0500, Mike Frysinger wrote:

>>>>> On 23 Nov 2021 15:09, Richard Earnshaw wrote:

>>>>>> This is wrong and breaks all old versions of C++.

>>>>>

>>>>> this is a bit vague.  it would help if you provided details as to what broke.

>>>>> i doubt this broke all old versions of C++ everywhere.

>>>>>

>>>>> i'm guessing you're referring to the GNU C++ (libstdc++) library specifically

>>>>> and its hardcoding of newlib's internal ctype define names.

>>>>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/config/os/newlib/ctype_base.h;hb=releases/gcc-11.2.0

>>>>

>>>> Yes, you were CC'd on the GCC bug slightly before Richard sent his

>>>> email to this list:

>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c16

>>>>

>>>>> if you're talking about something else, please state so clearly.

>>>>>

>>>>>> The GNU sim code should not be using reserved names (those starting _)

>>>>>> in normal source code.  Such names are reserved to the implementation.

>>>>>

>>>>> that's not really a good reason to go pooping all over the namespace.

>>>>>

>>>>> we can maintain backwards compat here for C++ code fairly easily:

>>>>

>>>> Yes, or only do that for GCC < 12, as I suggested in

>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305#c19

>>>>

>>>> #if defined(__GNUC__) && defined(__cplusplus)

>>>> # if __GNUC__ < 12

>>>>

>>>> The libstdc++ code on trunk uses the new _ISupper names.

>>>>

>>>> I have no opinion on how long you should keep such backwards

>>>> compatibility around. Whatever time limit you set, at some point it

>>>> will make a new newlib release unusable with past G++ versions.

>>>

>>> Is there a good reason to revert these patches in newlib?  I see the

>>> problem but I'm unclear on how problematic the change is in real life.

>>

>> You cannot use newlib from Git to build any released version of GCC.

>>

>> Is building newlib from Git only supported when using GCC trunk, or is

> 

> Oops, I mean building *against* newlib from Git, not building newlib

> itself. You can still build newlib itself, because it doesn't use C++.

> But you can't build a GCC 11.2.0 compiler that uses the latest newlib

> from Git as its libc.

> 

>> it supposed to build with e.g. GCC 11.2.0 from July this year? If yes,

>> then newlib needs changes (whether reverting the change entirely, or

>> just making another change to restore the old names in addition to the

>> new ones).

> 


My concern is that the proposed workaround may break other (probably 
buggy) apps that have been relying on the old BSD internal API for 30 
odd years.  The proposed workaround only solves the issue for G++.

R.
Corinna Vinschen Dec. 2, 2021, 10:27 a.m. | #14
On Nov 30 17:52, Richard Earnshaw wrote:
> On 30/11/2021 17:15, Jonathan Wakely wrote:

> > On Tue, 30 Nov 2021 at 17:12, Jonathan Wakely <jwakely@redhat.com> wrote:

> > > 

> > > On Tue, 30 Nov 2021 at 15:14, Corinna Vinschen <vinschen@redhat.com> wrote:

> > > > Is there a good reason to revert these patches in newlib?  I see the

> > > > problem but I'm unclear on how problematic the change is in real life.

> > > 

> > > You cannot use newlib from Git to build any released version of GCC.

> > > 

> > > Is building newlib from Git only supported when using GCC trunk, or is

> > 

> > Oops, I mean building *against* newlib from Git, not building newlib

> > itself. You can still build newlib itself, because it doesn't use C++.

> > But you can't build a GCC 11.2.0 compiler that uses the latest newlib

> > from Git as its libc.

> > 

> > > it supposed to build with e.g. GCC 11.2.0 from July this year? If yes,

> > > then newlib needs changes (whether reverting the change entirely, or

> > > just making another change to restore the old names in addition to the

> > > new ones).

> > 

> 

> My concern is that the proposed workaround may break other (probably buggy)

> apps that have been relying on the old BSD internal API for 30 odd years.

> The proposed workaround only solves the issue for G++.


I'm inclined to revert 3ba1bd0d9db, given it solves a problem which
isn't actually a problem in newlib, but in an application not following
the standards in terms of reserved symbols.

I'm discussing this with Jeff, stay tuned.


Corinna
Corinna Vinschen Dec. 3, 2021, 9:56 a.m. | #15
On Dec  2 11:27, Corinna Vinschen wrote:
> On Nov 30 17:52, Richard Earnshaw wrote:

> > On 30/11/2021 17:15, Jonathan Wakely wrote:

> > > On Tue, 30 Nov 2021 at 17:12, Jonathan Wakely <jwakely@redhat.com> wrote:

> > > > 

> > > > On Tue, 30 Nov 2021 at 15:14, Corinna Vinschen <vinschen@redhat.com> wrote:

> > > > > Is there a good reason to revert these patches in newlib?  I see the

> > > > > problem but I'm unclear on how problematic the change is in real life.

> > > > 

> > > > You cannot use newlib from Git to build any released version of GCC.

> > > > 

> > > > Is building newlib from Git only supported when using GCC trunk, or is

> > > 

> > > Oops, I mean building *against* newlib from Git, not building newlib

> > > itself. You can still build newlib itself, because it doesn't use C++.

> > > But you can't build a GCC 11.2.0 compiler that uses the latest newlib

> > > from Git as its libc.

> > > 

> > > > it supposed to build with e.g. GCC 11.2.0 from July this year? If yes,

> > > > then newlib needs changes (whether reverting the change entirely, or

> > > > just making another change to restore the old names in addition to the

> > > > new ones).

> > > 

> > 

> > My concern is that the proposed workaround may break other (probably buggy)

> > apps that have been relying on the old BSD internal API for 30 odd years.

> > The proposed workaround only solves the issue for G++.

> 

> I'm inclined to revert 3ba1bd0d9db, given it solves a problem which

> isn't actually a problem in newlib, but in an application not following

> the standards in terms of reserved symbols.

> 

> I'm discussing this with Jeff, stay tuned.


Reverted.


Corinna
Mike Frysinger Dec. 5, 2021, 9:42 a.m. | #16
On 30 Nov 2021 17:52, Richard Earnshaw wrote:
> My concern is that the proposed workaround may break other (probably 

> buggy) apps that have been relying on the old BSD internal API for 30 

> odd years.  The proposed workaround only solves the issue for G++.


those programs, if they exist, don't build on glibc or other POSIX libs
either.  so i doubt such apps really exist in practice.
-mike

Patch

diff --git a/newlib/libc/include/ctype.h b/newlib/libc/include/ctype.h
index 932a567e25db..f2a4368da5d2 100644
--- a/newlib/libc/include/ctype.h
+++ b/newlib/libc/include/ctype.h
@@ -57,14 +57,27 @@  extern int isascii_l (int __c, locale_t __l);
 extern int toascii_l (int __c, locale_t __l);
 #endif
 
-#define	_U	01
-#define	_L	02
-#define	_N	04
-#define	_S	010
-#define _P	020
-#define _C	040
-#define _X	0100
-#define	_B	0200
+enum
+{
+  _ISupper = 01,
+  _ISlower = 02,
+  _ISdigit = 04,
+  _ISspace = 010,
+  _ISpunct = 020,
+  _IScntrl = 040,
+  _ISxdigit = 0100,
+  _ISblank = 0200,
+};
+#ifdef _COMPILING_NEWLIB
+# define _U _ISupper
+# define _L _ISlower
+# define _N _ISdigit
+# define _S _ISspace
+# define _P _ISpunct
+# define _C _IScntrl
+# define _X _ISxdigit
+# define _B _ISblank
+#endif
 
 /* For C++ backward-compatibility only.  */
 extern	__IMPORT const char	_ctype_[];
@@ -89,22 +102,22 @@  const char *__locale_ctype_ptr (void);
    an out-of-bounds reference on a 64-bit machine.  */
 #define __ctype_lookup(__c) ((__CTYPE_PTR+sizeof(""[__c]))[(int)(__c)])
 
-#define	isalpha(__c)	(__ctype_lookup(__c)&(_U|_L))
-#define	isupper(__c)	((__ctype_lookup(__c)&(_U|_L))==_U)
-#define	islower(__c)	((__ctype_lookup(__c)&(_U|_L))==_L)
-#define	isdigit(__c)	(__ctype_lookup(__c)&_N)
-#define	isxdigit(__c)	(__ctype_lookup(__c)&(_X|_N))
-#define	isspace(__c)	(__ctype_lookup(__c)&_S)
-#define ispunct(__c)	(__ctype_lookup(__c)&_P)
-#define isalnum(__c)	(__ctype_lookup(__c)&(_U|_L|_N))
-#define isprint(__c)	(__ctype_lookup(__c)&(_P|_U|_L|_N|_B))
-#define	isgraph(__c)	(__ctype_lookup(__c)&(_P|_U|_L|_N))
-#define iscntrl(__c)	(__ctype_lookup(__c)&_C)
+#define	isalpha(__c)	(__ctype_lookup(__c) & (_ISupper|_ISlower))
+#define	isupper(__c)	((__ctype_lookup(__c) & (_ISupper|_ISlower)) == _ISupper)
+#define	islower(__c)	((__ctype_lookup(__c) & (_ISupper|_ISlower)) == _ISlower)
+#define	isdigit(__c)	(__ctype_lookup(__c) & _ISdigit)
+#define	isxdigit(__c)	(__ctype_lookup(__c) & (_ISxdigit|_ISdigit))
+#define	isspace(__c)	(__ctype_lookup(__c) & _ISspace)
+#define ispunct(__c)	(__ctype_lookup(__c) & _ISpunct)
+#define isalnum(__c)	(__ctype_lookup(__c) & (_ISupper|_ISlower|_ISdigit))
+#define isprint(__c)	(__ctype_lookup(__c) & (_ISpunct|_ISupper|_ISlower|_ISdigit|_ISblank))
+#define	isgraph(__c)	(__ctype_lookup(__c) & (_ISpunct|_ISupper|_ISlower|_ISdigit))
+#define iscntrl(__c)	(__ctype_lookup(__c) & _IScntrl)
 
 #if defined(__GNUC__) && __ISO_C_VISIBLE >= 1999
 #define isblank(__c) \
   __extension__ ({ __typeof__ (__c) __x = (__c);		\
-        (__ctype_lookup(__x)&_B) || (int) (__x) == '\t';})
+        (__ctype_lookup(__x)&_ISblank) || (int) (__x) == '\t';})
 #endif
 
 #if __POSIX_VISIBLE >= 200809
@@ -120,22 +133,22 @@  __locale_ctype_ptr_l(locale_t _l)
 #endif
 #define __ctype_lookup_l(__c,__l) ((__locale_ctype_ptr_l(__l)+sizeof(""[__c]))[(int)(__c)])
 
-#define	isalpha_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_U|_L))
-#define	isupper_l(__c,__l)	((__ctype_lookup_l(__c,__l)&(_U|_L))==_U)
-#define	islower_l(__c,__l)	((__ctype_lookup_l(__c,__l)&(_U|_L))==_L)
-#define	isdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_N)
-#define	isxdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_X|_N))
-#define	isspace_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_S)
-#define ispunct_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_P)
-#define isalnum_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_U|_L|_N))
-#define isprint_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_P|_U|_L|_N|_B))
-#define	isgraph_l(__c,__l)	(__ctype_lookup_l(__c,__l)&(_P|_U|_L|_N))
-#define iscntrl_l(__c,__l)	(__ctype_lookup_l(__c,__l)&_C)
+#define	isalpha_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower))
+#define	isupper_l(__c,__l)	((__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower)) == _ISupper)
+#define	islower_l(__c,__l)	((__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower)) == _ISlower)
+#define	isdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _ISdigit)
+#define	isxdigit_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISxdigit|_ISdigit))
+#define	isspace_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _ISspace)
+#define ispunct_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _ISpunct)
+#define isalnum_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISupper|_ISlower|_ISdigit))
+#define isprint_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISpunct|_ISupper|_ISlower|_ISdigit|_ISblank))
+#define	isgraph_l(__c,__l)	(__ctype_lookup_l(__c,__l) & (_ISpunct|_ISupper|_ISlower|_ISdigit))
+#define iscntrl_l(__c,__l)	(__ctype_lookup_l(__c,__l) & _IScntrl)
 
 #if defined(__GNUC__)
 #define isblank_l(__c, __l) \
   __extension__ ({ __typeof__ (__c) __x = (__c);		\
-        (__ctype_lookup_l(__x,__l)&_B) || (int) (__x) == '\t';})
+        (__ctype_lookup_l(__x,__l)&_ISblank) || (int) (__x) == '\t';})
 #endif
 
 #endif /* __POSIX_VISIBLE >= 200809 */