Change pid_t base type

Message ID 040f00d1-fff3-adf7-4288-02dc1e2d9e3d@provenrun.com
State New
Headers show
Series
  • Change pid_t base type
Related show

Commit Message

Henri Chataing Aug. 1, 2018, 2:02 p.m.
Hello,

I am currently porting newlib to our platform provencore, which uses
64bit wide process identifiers. Everything went well except for that
last point, as pid_t seems to be hardcoded as int.
I feel that it should be possible to modify the type pid_t to use any
arbitrary signed integer.

Can you tell me if there is a reason for not allowing larger pids ?

The attachement contains some modifications I performed in order to be
able to override __pid_t in sys/_types.h.
The patch applies on top of
master:ab640f4cd5605b6675538b196641c46c36c75c64 in the newlib-cygwin
github mirror.

Thanks !

Bests regards,
Henri

Comments

Joel Sherrill Aug. 1, 2018, 2:22 p.m. | #1
http://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/types.h.html

Says pid_t can be no larger than long if I am reading that correctly.

Is POSIX a concern? Or is long 64-bits on all your targets?

--joel

On Wed, Aug 1, 2018, 9:04 AM Henri Chataing <henri.chataing@provenrun.com>
wrote:

> Hello,

>

> I am currently porting newlib to our platform provencore, which uses

> 64bit wide process identifiers. Everything went well except for that

> last point, as pid_t seems to be hardcoded as int.

> I feel that it should be possible to modify the type pid_t to use any

> arbitrary signed integer.

>

> Can you tell me if there is a reason for not allowing larger pids ?

>

> The attachement contains some modifications I performed in order to be

> able to override __pid_t in sys/_types.h.

> The patch applies on top of

> master:ab640f4cd5605b6675538b196641c46c36c75c64 in the newlib-cygwin

> github mirror.

>

> Thanks !

>

> Bests regards,

> Henri

>

>
Eric Blake Aug. 1, 2018, 2:33 p.m. | #2
On 08/01/2018 09:22 AM, Joel Sherrill wrote:
> http://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/types.h.html

> 

> Says pid_t can be no larger than long if I am reading that correctly.


"The implementation shall support one or more programming environments 
in which the widths of blksize_t, pid_t, size_t, ssize_t, suseconds_t, 
and useconds_t are no greater than the width of type long. The names of 
these programming environments can be obtained using the confstr() 
function or the getconf utility."

> 

> Is POSIX a concern? Or is long 64-bits on all your targets?


If long is 32 bits, do you support two separate compilation environments 
(one for POSIX with 32-bit pid_t, the other for full 64-bit pid_t)?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Brian Inglis Aug. 1, 2018, 8:06 p.m. | #3
On 2018-08-01 08:22, Joel Sherrill wrote:
> On Wed, Aug 1, 2018, 9:04 AM Henri Chataing wrote:

>> I am currently porting newlib to our platform provencore, which uses

>> 64bit wide process identifiers. Everything went well except for that

>> last point, as pid_t seems to be hardcoded as int.

>> I feel that it should be possible to modify the type pid_t to use any

>> arbitrary signed integer.

>>

>> Can you tell me if there is a reason for not allowing larger pids ?

>>

>> The attachement contains some modifications I performed in order to be

>> able to override __pid_t in sys/_types.h.

>> The patch applies on top of

>> master:ab640f4cd5605b6675538b196641c46c36c75c64 in the newlib-cygwin

>> github mirror.


> http://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/types.h.html

> Says pid_t can be no larger than long if I am reading that correctly.

> Is POSIX a concern? Or is long 64-bits on all your targets?


Your platform can use 64 bit process identifiers, but those need not, and
probably should not, be exposed to userspace, but kept private in the process or
task struct. Use hashes of those as pids within system defined limits except for
pids 0 (idle) and 1 (init).

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Eric Blake Aug. 1, 2018, 8:20 p.m. | #4
On 08/01/2018 03:06 PM, Brian Inglis wrote:
> On 2018-08-01 08:22, Joel Sherrill wrote:

>> On Wed, Aug 1, 2018, 9:04 AM Henri Chataing wrote:

>>> I am currently porting newlib to our platform provencore, which uses

>>> 64bit wide process identifiers. Everything went well except for that

>>> last point, as pid_t seems to be hardcoded as int.

>>> I feel that it should be possible to modify the type pid_t to use any

>>> arbitrary signed integer.

>>>

>>> Can you tell me if there is a reason for not allowing larger pids ?


As a matter of portable code, for a while, mingw made the mistake of 
declaring a 64-bit pid_t but having 'int getpid()', which led to all 
sorts of messes.  I think they have since fixed it to 32-bit pid_t and 
'pid_t getpid()' as a result of the bug reports raised against their 
headers.

At any rate, it becomes very hard to portably printf() a pid_t value if 
you cannot guarantee that printf("%ld", (long)pid) works out of the box.

>>>

>>> The attachement contains some modifications I performed in order to be

>>> able to override __pid_t in sys/_types.h.

>>> The patch applies on top of

>>> master:ab640f4cd5605b6675538b196641c46c36c75c64 in the newlib-cygwin

>>> github mirror.

> 

>> http://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/types.h.html

>> Says pid_t can be no larger than long if I am reading that correctly.

>> Is POSIX a concern? Or is long 64-bits on all your targets?

> 

> Your platform can use 64 bit process identifiers, but those need not, and

> probably should not, be exposed to userspace, but kept private in the process or

> task struct. Use hashes of those as pids within system defined limits except for

> pids 0 (idle) and 1 (init).


Indeed. Is your 32-bit platform really capable of running more than 2 
billion simultaneous processes even when bounded by a 4G address space? 
Okay, maybe you have some hardware tricks for a single process accessing 
more than 4G of physical memory, but how much more?  Conversely, there's 
the argument that any process, even the most trivial of them, will 
occupy enough space (figure at least 1k per process just for information 
related to tracking the process across task swaps), that you can 
reasonable assure that your hardware limitations will kick in long 
before you can exhaust 32-bit pid_t in use at the same time (pid_t 
wraparound is more likely when one process starts reusing the same id as 
an earlier completed process, particularly when using 16-bit pids, but 
portable programs have learned to be robust to pid reuse).  A 64-bit 
platform might actually be able to have more than 2 billion parallel 
processes, but there, sizeof(pid_t) <= sizeof(long) is still reasonable 
when you have 64-bit long.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Henri Chataing Aug. 27, 2018, 3:10 p.m. | #5
Hello,

sorry for the late reply, I forgot to hit the reply-all on my last mail
it seems ...
I wasn't aware of this limitation. We are working on 32 bit platforms as
well as 64 bits, so no easy way out.
I suppose we could support two compilation environments by deactivating
fork / kill / getpid if pid_t when long is 32bits.
Do you think it is possible anyway to change the library to enable
overwriting the type pid_t ?
Thanks !

Henri Chataing

On 01/08/18 16:33, Eric Blake wrote:
> On 08/01/2018 09:22 AM, Joel Sherrill wrote:

>> http://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/types.h.html

>>

>> Says pid_t can be no larger than long if I am reading that correctly.

>

> "The implementation shall support one or more programming environments

> in which the widths of blksize_t, pid_t, size_t, ssize_t, suseconds_t,

> and useconds_t are no greater than the width of type long. The names

> of these programming environments can be obtained using the confstr()

> function or the getconf utility."

>

>>

>> Is POSIX a concern? Or is long 64-bits on all your targets?

>

> If long is 32 bits, do you support two separate compilation

> environments (one for POSIX with 32-bit pid_t, the other for full

> 64-bit pid_t)?

>

Patch

diff --git a/newlib/libc/include/reent.h b/newlib/libc/include/reent.h
index 2b01fbe..b71a901 100644
--- a/newlib/libc/include/reent.h
+++ b/newlib/libc/include/reent.h
@@ -33,13 +33,13 @@ 
       and do not supply functional interfaces for any of the reentrant
       calls. With this method, the reentrant syscalls are redefined to
       directly call the regular system call without the reentrancy argument.
-      When you do this, specify both -DREENTRANT_SYSCALLS_PROVIDED and 
+      When you do this, specify both -DREENTRANT_SYSCALLS_PROVIDED and
       -DMISSING_SYSCALL_NAMES via newlib_cflags in configure.host and do
       not specify "syscall_dir".
 
    Stubs of the reentrant versions of the syscalls exist in the libc/reent
-   source directory and are provided if REENTRANT_SYSCALLS_PROVIDED isn't 
-   defined.  These stubs call the native system calls: _open, _close, etc. 
+   source directory and are provided if REENTRANT_SYSCALLS_PROVIDED isn't
+   defined.  These stubs call the native system calls: _open, _close, etc.
    if MISSING_SYSCALL_NAMES is *not* defined, otherwise they call the
    non-underscored versions: open, close, etc. when MISSING_SYSCALL_NAMES
    *is* defined.
@@ -51,17 +51,17 @@ 
    keep a separate errno value which is intuitive since the application flow
    cannot check for failure reliably otherwise.
 
-   The reentrant syscalls are either provided by the platform, by the 
-   libc/reent stubs, or in the case of both MISSING_SYSCALL_NAMES and 
+   The reentrant syscalls are either provided by the platform, by the
+   libc/reent stubs, or in the case of both MISSING_SYSCALL_NAMES and
    REENTRANT_SYSCALLS_PROVIDED being defined, the calls are redefined to
    simply call the regular syscalls with no reentrancy struct argument.
 
    A single-threaded application does not need to worry about the reentrancy
-   structure.  It is used internally.  
+   structure.  It is used internally.
 
-   A multi-threaded application needs either to manually manage reentrancy 
+   A multi-threaded application needs either to manually manage reentrancy
    structures or use dynamic reentrancy.
-   
+
    Manually managing reentrancy structures entails calling special reentrant
    versions of newlib functions that have an additional reentrancy argument.
    For example, _printf_r.  By convention, the first argument is the
@@ -76,7 +76,7 @@ 
    to __getreent().  This function needs to be implemented by the platform
    and it is meant to return the reentrancy structure for the current
    thread.  When the regular C functions (e.g. printf) go to call internal
-   routines with the default _REENT structure, they end up calling with 
+   routines with the default _REENT structure, they end up calling with
    the reentrancy structure for the thread.  Thus, application code does not
    need to call the _r routines nor worry about reentrancy structures.  */
 
@@ -139,11 +139,11 @@  struct timezone;
 extern int _close_r (struct _reent *, int);
 extern int _execve_r (struct _reent *, const char *, char *const *, char *const *);
 extern int _fcntl_r (struct _reent *, int, int, int);
-extern int _fork_r (struct _reent *);
+extern __pid_t _fork_r (struct _reent *);
 extern int _fstat_r (struct _reent *, int, struct stat *);
-extern int _getpid_r (struct _reent *);
+extern __pid_t _getpid_r (struct _reent *);
 extern int _isatty_r (struct _reent *, int);
-extern int _kill_r (struct _reent *, int, int);
+extern int _kill_r (struct _reent *, __pid_t, int);
 extern int _link_r (struct _reent *, const char *, const char *);
 extern _off_t _lseek_r (struct _reent *, int, _off_t, int);
 extern int _mkdir_r (struct _reent *, const char *, int);
diff --git a/newlib/libc/include/sys/types.h b/newlib/libc/include/sys/types.h
index 65ff520..5a05af0 100644
--- a/newlib/libc/include/sys/types.h
+++ b/newlib/libc/include/sys/types.h
@@ -1,4 +1,4 @@ 
-/* unified sys/types.h: 
+/* unified sys/types.h:
    start with sef's sysvi386 version.
    merge go32 version -- a few ifdefs.
    h8300hms, h8300xray, and sysvnecv70 disagree on the following types:
@@ -10,8 +10,8 @@ 
    typedef int mode_t;
    typedef int caddr_t;
 
-   however, these aren't "reasonable" values, the sysvi386 ones make far 
-   more sense, and should work sufficiently well (in particular, h8300 
+   however, these aren't "reasonable" values, the sysvi386 ones make far
+   more sense, and should work sufficiently well (in particular, h8300
    doesn't have a stat, and the necv70 doesn't matter.) -- eichin
  */
 
@@ -29,7 +29,7 @@  typedef __uint8_t	u_int8_t;
 #endif
 #if ___int16_t_defined
 typedef __uint16_t	u_int16_t;
-#endif 
+#endif
 #if ___int32_t_defined
 typedef __uint32_t	u_int32_t;
 #endif
diff --git a/newlib/libc/reent/execr.c b/newlib/libc/reent/execr.c
index 59b6122..54b68d2 100644
--- a/newlib/libc/reent/execr.c
+++ b/newlib/libc/reent/execr.c
@@ -29,7 +29,7 @@  extern int errno;
 
 /*
 FUNCTION
-	<<_execve_r>>---Reentrant version of execve	
+	<<_execve_r>>---Reentrant version of execve
 INDEX
 	_execve_r
 
@@ -63,7 +63,7 @@  _execve_r (struct _reent *ptr,
 NEWPAGE
 FUNCTION
 	<<_fork_r>>---Reentrant version of fork
-	
+
 INDEX
 	_fork_r
 
@@ -79,10 +79,10 @@  DESCRIPTION
 
 #ifndef NO_FORK
 
-int
+pid_t
 _fork_r (struct _reent *ptr)
 {
-  int ret;
+  pid_t ret;
 
   errno = 0;
   if ((ret = _fork ()) == -1 && errno != 0)
@@ -96,7 +96,7 @@  _fork_r (struct _reent *ptr)
 NEWPAGE
 FUNCTION
 	<<_wait_r>>---Reentrant version of wait
-	
+
 INDEX
 	_wait_r
 
diff --git a/newlib/libc/reent/signalr.c b/newlib/libc/reent/signalr.c
index 345910e..cb41510 100644
--- a/newlib/libc/reent/signalr.c
+++ b/newlib/libc/reent/signalr.c
@@ -28,7 +28,7 @@  extern int errno;
 /*
 FUNCTION
 	<<_kill_r>>---Reentrant version of kill
-	
+
 INDEX
 	_kill_r
 
@@ -44,7 +44,7 @@  DESCRIPTION
 
 int
 _kill_r (struct _reent *ptr,
-     int pid,
+     pid_t pid,
      int sig)
 {
   int ret;
@@ -59,7 +59,7 @@  _kill_r (struct _reent *ptr,
 NEWPAGE
 FUNCTION
 	<<_getpid_r>>---Reentrant version of getpid
-	
+
 INDEX
 	_getpid_r
 
@@ -76,10 +76,10 @@  DESCRIPTION
 	still must have the reentrant pointer argument.
 */
 
-int
+pid_t
 _getpid_r (struct _reent *ptr)
 {
-  int ret;
+  pid_t ret;
   ret = _getpid ();
   return ret;
 }
diff --git a/newlib/libc/syscalls/sysfork.c b/newlib/libc/syscalls/sysfork.c
index 282f73f..d819c28 100644
--- a/newlib/libc/syscalls/sysfork.c
+++ b/newlib/libc/syscalls/sysfork.c
@@ -7,7 +7,7 @@ 
 #include <reent.h>
 #include <unistd.h>
 
-int
+pid_t
 fork (void)
 {
   return _fork_r (_REENT);
diff --git a/newlib/libc/syscalls/sysgetpid.c b/newlib/libc/syscalls/sysgetpid.c
index 22c7605..fae7a15 100644
--- a/newlib/libc/syscalls/sysgetpid.c
+++ b/newlib/libc/syscalls/sysgetpid.c
@@ -3,7 +3,7 @@ 
 #include <reent.h>
 #include <unistd.h>
 
-int
+pid_t
 getpid (void)
 {
   return _getpid_r (_REENT);
diff --git a/newlib/libc/syscalls/syskill.c b/newlib/libc/syscalls/syskill.c
index 34b9f17..7687604 100644
--- a/newlib/libc/syscalls/syskill.c
+++ b/newlib/libc/syscalls/syskill.c
@@ -4,7 +4,7 @@ 
 #include <signal.h>
 
 int
-kill (int pid,
+kill (pid_t pid,
      int sig)
 {
   return _kill_r (_REENT, pid, sig);