[Pythonmac-SIG] Re: [Python-Dev] Import hook to do end-of-line conversion?

Tim Peters tim.one@home.com
Fri, 13 Apr 2001 18:39:46 -0400


[MAL]
> I don't know why this thread led to tweaking stdio -- after all
> we only need a solution for the Python tokenizer ...

[Just]
> Aaaaaaaaaaaargh! ;-) Here we go again: fixing the tokenizer is
> great and all, but then what about all tools that read source
> files line by line? ...

Note that this is why the topic needs a PEP:  nothing here is new; the same
debates recur every time it comes up.

[Aahz]
> ...
> QIO claims that it can be configured to recognize different
> kinds of line endings.

It can be, yes, but in the same sense as Awk/Perl paragraph mode:  you can
tell it to consider any string (not just a single character) as meaning "end
of the line", but it's a *fixed* string per invocation.  What people want
*here* is more the ability to recognize the regular expression

    \r\n?|\n

as ending a line, and QIO can't do that directly (as currently written).  And
MAL probably wants Unicode line-end detection:

    http://www.unicode.org/unicode/reports/tr13/
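
To make the difference concrete, here is a minimal sketch in Python of
splitting on the regular expression above, as opposed to a single fixed
terminator string.  The names LINE_END and split_lines are made up for
illustration; this is not how QIO or the tokenizer works, and it handles only
the three ASCII conventions, not the Unicode line ends from TR13.

```python
import re

# Matches \r\n (DOS), a lone \r (classic Mac), or \n (Unix).
# The \r\n? alternative comes first so \r\n counts as one terminator, not two.
LINE_END = re.compile(r"\r\n?|\n")

def split_lines(text):
    """Return the lines of text without terminators, accepting any of the
    three line-end conventions, even mixed within one string."""
    lines = LINE_END.split(text)
    # A trailing terminator leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

A fixed separator can't do this:  "a\r\nb\rc".split("\r\n") leaves the lone
\r embedded in the second piece, while split_lines treats it as a line end.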

> QIO is claimed to be 2-3 times faster than Python 1.5.2; don't
> know how that compares to 2.x.

The bulk of that was due to QIO avoiding per-character thread locks.  2.1
avoids them too, so most of QIO's speed advantage should be gone now.  But
QIO's internals could certainly be faster than they are.  That's hard to see
because QIO.readline() has so many optional behaviors that the maze of
if-tests obscures the speed-crucial bits.  Perl's line-reading code is a
better model to study:  its speed-crucial inner loop has no non-essential
operations, because Perl makes the *surrounding* code sort out the optional
bits instead of bogging down the loop with them.
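
A rough sketch of that division of labor, in Python rather than C and with
hypothetical names (find_line_ends, read_lines); it is neither Perl's nor
QIO's actual code, just the shape of the idea.  The inner loop only locates
terminators, and the optional behavior (here, a single assumed keepends flag)
is applied by the caller afterwards.

```python
def find_line_ends(buf, sep):
    """Inner loop: locate the end offset of each terminated line.
    No option checks happen in here."""
    ends = []
    pos = buf.find(sep)
    while pos != -1:
        pos += len(sep)          # offset just past the terminator
        ends.append(pos)
        pos = buf.find(sep, pos)
    return ends

def read_lines(buf, sep=b"\n", keepends=True):
    """Surrounding code: validate arguments and sort out the optional bits
    once, outside the scanning loop."""
    if not sep:
        raise ValueError("empty separator")
    lines = []
    start = 0
    for end in find_line_ends(buf, sep):
        lines.append(buf[start:end])
        start = end
    if start < len(buf):
        lines.append(buf[start:])    # final line with no terminator
    if not keepends:
        lines = [ln[:-len(sep)] if ln.endswith(sep) else ln for ln in lines]
    return lines
```

Adding another option then means touching read_lines only; the scanning loop
stays as small as it started.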