newlib: fix fseek optimization with SEEK_CUR

Message ID 20191109162804.1905160-1-bastien.bouclet@gmail.com
State Accepted
Commit 59362c80e3a02c011fd0ef3d7f07a20098d2a9d5
Series
  • newlib: fix fseek optimization with SEEK_CUR

Commit Message

Bastien Bouclet Nov. 9, 2019, 4:28 p.m.
The call to fflush was invalidating the read buffer, preventing relative
seeks to positions that would have been inside the read buffer from
being optimized. The call to srefill would then re-read mostly the same
data that was initially in the read buffer.
---
 newlib/libc/stdio/fseeko.c     | 31 ++++++-------------------------
 newlib/libc/stdio64/fseeko64.c | 31 ++++++-------------------------
 2 files changed, 12 insertions(+), 50 deletions(-)

-- 
2.24.0
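
To illustrate the effect described in the commit message, here is a minimal toy model, not newlib code. The `toy_stream` struct and the `toy_refill`, `toy_seek_cur` and `toy_count_refills` functions are hypothetical names invented for this sketch. It assumes that flushing a read stream simply discards the buffer, and it refills from the target offset rather than from a block boundary, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy stream: just enough state to show the idea. */
struct toy_stream {
    const char *file;   /* backing "file" contents */
    size_t file_len;    /* length of the backing file */
    char buf[8];        /* read buffer */
    size_t buf_start;   /* file offset of buf[0] */
    size_t buf_len;     /* number of valid bytes in buf */
    size_t pos;         /* logical stream position */
    int refills;        /* how often the backing file was re-read */
};

/* Fill the buffer from the backing file, starting at offset. */
void toy_refill(struct toy_stream *s, size_t offset)
{
    assert(offset <= s->file_len);
    size_t n = s->file_len - offset;
    if (n > sizeof s->buf)
        n = sizeof s->buf;
    memcpy(s->buf, s->file + offset, n);
    s->buf_start = offset;
    s->buf_len = n;
    s->refills++;
}

/* Relative seek. A nonzero flush_first models a flush that discards
   the read buffer before the target position is examined. */
int toy_seek_cur(struct toy_stream *s, long delta, int flush_first)
{
    long target = (long) s->pos + delta;
    if (target < 0 || (size_t) target > s->file_len)
        return -1;
    if (flush_first)
        s->buf_len = 0;
    /* Re-read only when the target lies outside the buffered range. */
    if ((size_t) target < s->buf_start
        || (size_t) target >= s->buf_start + s->buf_len)
        toy_refill(s, (size_t) target);
    s->pos = (size_t) target;
    return 0;
}

/* Count the refills caused by four one-byte forward seeks. */
int toy_count_refills(int flush_first)
{
    static const char data[] = "0123456789abcdef";
    struct toy_stream s = { data, sizeof data - 1, {0}, 0, 0, 0, 0 };
    toy_refill(&s, 0);  /* initial read: buffer covers offsets 0..7 */
    s.refills = 0;
    for (int i = 0; i < 4; i++)
        toy_seek_cur(&s, 1, flush_first);
    return s.refills;
}
```

In this model, `toy_count_refills(0)` reports no refills because every target stays inside the buffer, while `toy_count_refills(1)` reports one refill per seek.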

Comments

Corinna Vinschen Nov. 13, 2019, 10:15 a.m. | #1
Hi Bastien,

On Nov  9 17:28, Bastien Bouclet wrote:
> The call to fflush was invalidating the read buffer, preventing relative
> seeks to positions that would have been inside the read buffer from
> being optimized. The call to srefill would then re-read mostly the same
> data that was initially in the read buffer.


I checked this against upstream BSD versions.  OpenBSD and NetBSD
operate like our code, including the flush, while FreeBSD uses its
internal ftello and never flushed since the repository import back in
1994.

I'm pretty unsure if we can do this.  Apparently the flush op is only
necessary for streams in append mode.  If at all.

Can we be sure this works as desired on append streams as well?

Also, given that this is changing very basic code, nobody is unaffected.
Any input from other folks?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Corinna Vinschen Nov. 18, 2019, 10:11 a.m. | #2
On Nov  9 17:28, Bastien Bouclet wrote:
> The call to fflush was invalidating the read buffer, preventing relative
> seeks to positions that would have been inside the read buffer from
> being optimized. The call to srefill would then re-read mostly the same
> data that was initially in the read buffer.
> ---
>  newlib/libc/stdio/fseeko.c     | 31 ++++++-------------------------
>  newlib/libc/stdio64/fseeko64.c | 31 ++++++-------------------------
>  2 files changed, 12 insertions(+), 50 deletions(-)


Pushed.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Corinna Vinschen Jan. 29, 2020, 6:02 p.m. | #3
On Nov 18 11:11, Corinna Vinschen wrote:
> On Nov  9 17:28, Bastien Bouclet wrote:
> > The call to fflush was invalidating the read buffer, preventing relative
> > seeks to positions that would have been inside the read buffer from
> > being optimized. The call to srefill would then re-read mostly the same
> > data that was initially in the read buffer.
> > ---
> >  newlib/libc/stdio/fseeko.c     | 31 ++++++-------------------------
> >  newlib/libc/stdio64/fseeko64.c | 31 ++++++-------------------------
> >  2 files changed, 12 insertions(+), 50 deletions(-)
>
> Pushed.


Sorry, but I had to revert this patch.  It breaks gnulib's autoconf
test.  The attached conftest.c returns 5, rather than 0 as before
because lseek and ftello return different results.

While this is expected on BSD systems, it's not expected on at least
Linux and Cygwin.  The change therefore breaks backward compatibility
and leads to gnulib wrongly providing its own fflush, fseek and fseeko
implementations when building for newlib/Cygwin.

I attached the gnulib testcase for completeness.

Many thanks to Takashi Yano for figuring this out after the Cygwin
octave build was broken.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
#include <stdio.h>
#include <unistd.h>

int
main ()
{
  FILE *f = fopen ("conftest.txt", "r");
  char buffer[10];
  int fd;
  int c;
  if (f == NULL)
    return 1;
  fd = fileno (f);
  if (fd < 0 || fread (buffer, 1, 5, f) != 5)
    { fclose (f); return 2; }
  /* For deterministic results, ensure f read a bigger buffer.  */
  if (lseek (fd, 0, SEEK_CUR) == 5)
    { fclose (f); return 3; }
  /* POSIX requires fflush-fseek to set file offset of fd.  This fails
     on BSD systems and on mingw.  */
  if (fflush (f) != 0 || fseek (f, 0, SEEK_CUR) != 0)
    { fclose (f); return 4; }
  if (lseek (fd, 0, SEEK_CUR) != 5)
    { fclose (f); return 5; }
  /* Verify behaviour of fflush after ungetc. See
     <http://www.opengroup.org/austin/aardvark/latest/xshbug3.txt>  */
  /* Verify behaviour of fflush after a backup ungetc.  This fails on
     mingw.  */
  c = fgetc (f);
  ungetc (c, f);
  fflush (f);
  if (fgetc (f) != c)
    { fclose (f); return 6; }
  /* Verify behaviour of fflush after a non-backup ungetc.  This fails
     on glibc 2.8 and on BSD systems.  */
  c = fgetc (f);
  ungetc ('@', f);
  fflush (f);
  if (fgetc (f) != c)
    { fclose (f); return 7; }
  fclose (f);
  return 0;
}
hello world
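
The check that produces return code 5 in the test above can be condensed into a small probe that records the three positions instead of returning an exit code. This is only a sketch: `struct positions` and `probe_positions` are hypothetical names, and the path argument is assumed to name a readable file of at least 5 bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Positions observed after reading 5 bytes through stdio. */
struct positions {
    off_t stream_pos;  /* ftello() result: logical stream position */
    off_t fd_before;   /* lseek() on the fd before fflush + fseek */
    off_t fd_after;    /* lseek() on the fd after fflush + fseek */
};

/* Returns 0 on success, -1 if the file cannot be opened or read. */
int probe_positions(const char *path, struct positions *out)
{
    char buffer[10];
    FILE *f;
    int fd;

    assert(path != NULL && out != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    fd = fileno(f);
    if (fd < 0 || fread(buffer, 1, 5, f) != 5) {
        fclose(f);
        return -1;
    }
    out->stream_pos = ftello(f);
    /* stdio usually reads ahead, so the fd offset may exceed 5 here. */
    out->fd_before = lseek(fd, 0, SEEK_CUR);
    if (fflush(f) != 0 || fseek(f, 0, SEEK_CUR) != 0) {
        fclose(f);
        return -1;
    }
    /* The gnulib test expects 5 here; per its comment, BSD systems
       and mingw report a different value. */
    out->fd_after = lseek(fd, 0, SEEK_CUR);
    fclose(f);
    return 0;
}
```

On a system that passes the gnulib check, `fd_after` should equal `stream_pos`; a mismatch corresponds to the conftest returning 5.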

Patch

diff --git a/newlib/libc/stdio/fseeko.c b/newlib/libc/stdio/fseeko.c
index 3e0f9e90b..bbf1af43e 100644
--- a/newlib/libc/stdio/fseeko.c
+++ b/newlib/libc/stdio/fseeko.c
@@ -141,31 +141,12 @@  _fseeko_r (struct _reent *ptr,
   switch (whence)
     {
     case SEEK_CUR:
-      /*
-       * In order to seek relative to the current stream offset,
-       * we have to first find the current stream offset a la
-       * ftell (see ftell for details).
-       */
-      _fflush_r (ptr, fp);   /* may adjust seek offset on append stream */
-      if (fp->_flags & __SOFF)
-	curoff = fp->_offset;
-      else
-	{
-	  curoff = seekfn (ptr, fp->_cookie, (_fpos_t) 0, SEEK_CUR);
-	  if (curoff == -1L)
-	    {
-	      _newlib_flockfile_exit (fp);
-	      return EOF;
-	    }
-	}
-      if (fp->_flags & __SRD)
-	{
-	  curoff -= fp->_r;
-	  if (HASUB (fp))
-	    curoff -= fp->_ur;
-	}
-      else if (fp->_flags & __SWR && fp->_p != NULL)
-	curoff += fp->_p - fp->_bf._base;
+      curoff = _ftello_r(ptr, fp);
+      if (curoff == -1L)
+        {
+          _newlib_flockfile_exit (fp);
+          return EOF;
+        }
 
       offset += curoff;
       whence = SEEK_SET;
diff --git a/newlib/libc/stdio64/fseeko64.c b/newlib/libc/stdio64/fseeko64.c
index 0672086a3..f38005570 100644
--- a/newlib/libc/stdio64/fseeko64.c
+++ b/newlib/libc/stdio64/fseeko64.c
@@ -142,31 +142,12 @@  _fseeko64_r (struct _reent *ptr,
   switch (whence)
     {
     case SEEK_CUR:
-      /*
-       * In order to seek relative to the current stream offset,
-       * we have to first find the current stream offset a la
-       * ftell (see ftell for details).
-       */
-      _fflush_r (ptr, fp);   /* may adjust seek offset on append stream */
-      if (fp->_flags & __SOFF)
-	curoff = fp->_offset;
-      else
-	{
-	  curoff = seekfn (ptr, fp->_cookie, (_fpos64_t) 0, SEEK_CUR);
-	  if (curoff == -1L)
-	    {
-	      _newlib_flockfile_exit(fp);
-	      return EOF;
-	    }
-	}
-      if (fp->_flags & __SRD)
-	{
-	  curoff -= fp->_r;
-	  if (HASUB (fp))
-	    curoff -= fp->_ur;
-	}
-      else if (fp->_flags & __SWR && fp->_p != NULL)
-	curoff += fp->_p - fp->_bf._base;
+      curoff = _ftello64_r(ptr, fp);
+      if (curoff == -1L)
+        {
+          _newlib_flockfile_exit (fp);
+          return EOF;
+        }
 
       offset += curoff;
       whence = SEEK_SET;
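
The arithmetic in the lines removed from the SEEK_CUR case can be sketched in isolation. The function below is a hypothetical stand-in, not newlib code. It assumes that `_r` counts unread bytes still in the buffer, that `_ur` holds the unread count saved while an ungetc buffer is active, and that `_p - _bf._base` is the number of buffered bytes not yet written.

```c
#include <assert.h>

enum toy_mode { TOY_READING, TOY_WRITING };

/* Derive the logical stream position from the underlying file offset.
   unread       - bytes buffered but not yet consumed (assumed role of _r)
   saved_unread - unread count saved by ungetc, 0 if none (assumed role of _ur)
   pending      - bytes buffered but not yet written (assumed _p - _bf._base) */
long toy_logical_offset(long fd_offset, enum toy_mode mode,
                        long unread, long saved_unread, long pending)
{
    assert(unread >= 0 && saved_unread >= 0 && pending >= 0);
    if (mode == TOY_READING)
        /* The underlying offset is ahead of the reader by what is
           still sitting in the buffers. */
        return fd_offset - unread - saved_unread;
    /* The underlying offset lags the writer by the unwritten bytes. */
    return fd_offset + pending;
}
```

For example, with the underlying offset at 12 and 7 unread bytes in the buffer, the logical position is 5, which matches the situation in the gnulib test after reading 5 bytes of a 12-byte file.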