[Python-Dev] file read-ahead with Mac end-of-line

Skip Montanaro skip at pobox.com
Sun Aug 17 23:22:43 EDT 2003


A bug was reported against the csv module, claiming (correctly) that it
does not properly parse files which use Mac line endings.  I tracked the
problem down to an apparent deficiency in readahead_get_line_skip() in
fileobject.c: it assumes that only \n can terminate a line.  The patch
below fixes my csv module problem, but I wonder if it's the correct fix.
Suppose you're using Mac line endings and encounter a \n before a \r?
This function will then return a too-short line.  (Of course, it would
without the patch as well.)
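
For comparison, here is a minimal sketch of a stricter alternative: end the
line at whichever of \n or \r comes first, and treat \r\n as one terminator.
The helper name find_line_end() is hypothetical and is not part of
fileobject.c; it only illustrates the search, not the surrounding read-ahead
buffer handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in fileobject.c): return a pointer just past the
   first line terminator in buf[0..len), or NULL if no terminator is found.
   A "\r\n" pair is treated as a single terminator. */
static const char *
find_line_end(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    const char *cr = memchr(buf, '\r', len);

    if (cr != NULL && (nl == NULL || cr < nl)) {
        /* '\r' comes first; swallow a '\n' that directly follows it */
        if (cr + 1 < buf + len && cr[1] == '\n')
            return cr + 2;
        return cr + 1;
    }
    if (nl != NULL)
        return nl + 1;          /* plain '\n' terminator */
    return NULL;                /* no terminator in this buffer */
}
```

One caveat with this sketch: a \r on the last byte of the buffer may be
followed by a \n in the next read, and the helper cannot see that, so a
caller would have to handle that case itself.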

I don't know how (or if) this should work with universal newline support.
The csv module expects files to be opened in binary mode, so I don't know
whether universal newline support even applies.

In short: does this look like the correct patch, merely closer to the
correct behavior than the current code, or no improvement at all?

Thx,

Skip

cvs diff fileobject.c
Index: fileobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/fileobject.c,v
retrieving revision 2.179
diff -c -r2.179 fileobject.c
*** fileobject.c        18 May 2003 12:56:25 -0000      2.179
--- fileobject.c        18 Aug 2003 03:13:38 -0000
***************
*** 1803,1810 ****
                return (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip);
        bufptr = memchr(f->f_bufptr, '\n', len);
        if (bufptr != NULL) {
!               bufptr++;                       /* Count the '\n' */
                len = bufptr - f->f_bufptr;
                s = (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip+len);
--- 1803,1812 ----
                return (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip);
        bufptr = memchr(f->f_bufptr, '\n', len);
+       if (bufptr == NULL)
+               bufptr = memchr(f->f_bufptr, '\r', len);
        if (bufptr != NULL) {
!               bufptr++;                       /* Count the '\n' or '\r' */
                len = bufptr - f->f_bufptr;
                s = (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip+len);


