[Python-Dev] xreadlines : readlines :: xrange : range
Guido van Rossum
guido@python.org
Thu, 11 Jan 2001 10:23:26 -0500
> [Tim]
> >> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
> >> to keep the file locked until the line was complete, and I
> >> wouldn't be opposed to making life saner on platforms that allow it.
>
> [Guido]
> > Hm... That would be possible, except for one unfortunate detail:
> > _PyString_Resize() may call PyErr_BadInternalCall() which touches
> > thread state.
[Tim]
> FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
> IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
> *exit* path thereafter. We can block/unblock Python threads as often as
> desired between those *file*-locking brackets. The only thing the repeated
> FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
> for multiple readers to get partial lines of the file.
I don't want to call FLOCKFILE while holding the Python lock, as this
means that *if* we're blocked in FLOCKFILE (e.g. we're reading from a
pipe or socket), no other Python thread can run!
> > ...
> > NO, NO NO! Mixing reads and writes on the same stream wasn't what we
> > are locking against at all. (As you've found out, it doesn't even
> > work.)
>
> On Windows, yes, but that still seems to me to be a bug in MS's code. If
> anyone had reported a core dump on any other platform, I'd be more tractable
> <wink> on this point.
Yes, it's a Windows bug.
> > We're only trying to protect against concurrent *reads*.
>
> As above, I believe that we could do a better job of that, then, on
> platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
> but also against .readline() not delivering an intact line from the file.
See above for a reason why I think that's not safe. I think that
applications that want to do this can do their own locking. (They'll
find out soon enough that readline() isn't atomic. :-)
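
A minimal sketch of what such application-level locking might look like.
The LockedReader class and the demo() helper are hypothetical names made
up for illustration, not part of Python; the sketch only assumes the
standard threading and io modules.

```python
import io
import threading


class LockedReader:
    """Hypothetical wrapper: serializes readline() calls on one file object."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def readline(self):
        # Only one thread at a time may be inside readline(), so each
        # caller gets a complete line or the empty string at EOF.
        with self._lock:
            return self._file.readline()


def demo():
    """Read 1000 lines from 4 threads; return True if every line arrived intact."""
    lines = ["line %d\n" % i for i in range(1000)]
    reader = LockedReader(io.StringIO("".join(lines)))
    seen = []
    seen_lock = threading.Lock()

    def worker():
        while True:
            line = reader.readline()
            if not line:
                break
            with seen_lock:
                seen.append(line)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Order across threads is arbitrary, so compare as sorted lists.
    return sorted(seen) == sorted(lines)
```

The lock here belongs to the application, so it is independent of both
the interpreter lock and any platform stream lock discussed above.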
> >> But since FLOCKFILE is in effect, other threads *trying* to write
> >> to the stream we're reading will get blocked anyway. Seems to give us
> >> potential for deadlocks.
>
> > Only if they are holding other locks at the same time.
>
> I'm not being clear, then. Thread X does f.readline(), on a
> HAVE_GETC_UNLOCKED platform. get_line allows other threads to run and
> invokes FLOCKFILE on f->f_fp. get_line's GETC in thread X eventually hits
> the end of the stdio buffer, and does its platform's version of _filbuf.
> _filbuf may wait (depending on the nature of the stream) for more input to
> show up. Simultaneously, thread Y attempts to write some data to f. But
> the *FLOCKFILE* lock prevents it from doing anything with f. So X is
> waiting for Y to write data inside platform _filbuf, but Y is waiting for X
> to release the platform stream lock inside some platform stream-output
> routine (if I'm being clear now, Python locks have nothing to do with this
> scenario: it's the platform stream lock).
I don't think that _filbuf can possibly wait for another thread to
write data to the same stream object. A single stream object doesn't
act like a pipe, even if it is open for simultaneous reading and
writing. So if there's no more data in the file, _filbuf will simply
return with an EOF status, not wait for the data that the other thread
would write.
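
A small illustration of the EOF behaviour described above, seen from
Python on a regular file opened for both reading and writing. This is
only a sketch: read_at_eof() is a hypothetical helper, and current
Python file objects are not necessarily built on C stdio, so it shows
the analogous behaviour rather than _filbuf itself.

```python
import tempfile


def read_at_eof():
    """Write one line to a read/write temp file, then read past the end."""
    with tempfile.TemporaryFile(mode="w+") as f:
        f.write("first line\n")
        f.flush()
        f.seek(0)
        first = f.readline()
        # No more data in the file: this returns the empty string at once
        # instead of waiting for some other writer to add data.
        second = f.readline()
        return first, second
```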
> I think this is purely the user's fault if it happens. Just pointing it out
> as another insecurity we're probably not able to protect users from.
I don't think this can happen.
> > ...
> > Yeah. But this is insane use -- see my comments on SF. It's only
> > worth fixing because it could be used to intentionally crash Python --
> > but there are easier ways...
>
> If it's unique to MS (as I suspect), I see no reason to even consider trying
> to fix it in Python. Unless the Perl Mongers use it to crash Zope <wink>.
OK. It's unique to MS. So close the bug report with a "won't fix"
resolution. There's no point in having bug reports remain open that
we know we can't fix.
--Guido van Rossum (home page: http://www.python.org/~guido/)