An efficient split function
Tim Peters
tim_one at email.msn.com
Mon May 10 23:19:28 EDT 1999
[Andrew M. Kuchling]
> Note that your use of split(/\|/) in Perl requires using the
> regular expression engine, instead of a simple C splitting loop. Try
> using a literal string instead of a regex, as in split('|', ...); that
> will probably even out the speeds.
[William S. Lear]
> Thanks for the suggestion, which I had tried originally, but got
> marginally worse performance than with the regexp. For some reason, I
> did have to do split('\|') instead of split('|'), which I found curious.
Unless Perl has changed a lot since the last time I cared <wink>, the notion
that split will accept a literal string *as* a literal string is an
illusion: string expressions are treated as regexps too, and are *typically*
used when the split pattern varies at runtime. '|' as a regexp means "match the
empty string, or match the empty string", and so will split the line into
characters. This is consistent with your need to spell it '\|' to get what
you wanted. The "marginally worse" performance was also likely an
illusion: the two spellings should have run at the same speed.
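The same alternation-of-empties effect can be seen from Python's own re module, which makes a convenient stand-in here. This is only an analogy, not Perl itself, and it assumes Python 3.7 or later (where re.split splits on empty matches); the sample line is made up for illustration.

```python
import re

line = "a|b|c"  # hypothetical sample record

# Unescaped: "|" is an alternation of two empty patterns, so it matches the
# empty string at every position. Empty pieces at the ends are filtered out.
unescaped = [piece for piece in re.split("|", line) if piece]

# Escaped: matches a literal vertical bar, which is what was wanted.
escaped = re.split(r"\|", line)

# Plain string method: no regexp engine involved at all.
literal = line.split("|")

print(unescaped)
print(escaped)
print(literal)
```

The unescaped pattern breaks the line into single characters, bars included, while the escaped pattern and the plain string split both give the three fields.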
The easiest ways to speed the Python version:
1. Stick the whole thing in a function (local variable access is much cheaper
than global).
2. Read more than one line at a time (e.g. try readlines with a largish
"hint" argument).
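A minimal sketch of both suggestions together, assuming '|'-separated text records. The function name split_fields, the 64 KB hint, and the in-memory sample are illustrative choices, not taken from the original program.

```python
import io


def split_fields(stream, sep="|", hint=64 * 1024):
    """Return one list of fields per line read from `stream`."""
    rows = []
    append = rows.append  # local name: cheaper than a lookup on every pass
    while True:
        # readlines(hint) returns a batch of whole lines totalling roughly
        # `hint` characters or bytes, and an empty list at end of input.
        lines = stream.readlines(hint)
        if not lines:
            break
        for line in lines:
            append(line.rstrip("\n").split(sep))
    return rows


# Hypothetical input; a real run would pass an open file instead.
sample = io.StringIO("a|b|c\nd|e|f\n")
rows = split_fields(sample)
print(rows)
```

Everything the loop touches is a local name, and the file is read in batches rather than one line per call. Whether that beats a given Perl script still has to be measured.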
anything-faster-than-doing-it-by-hand-is-excessive<wink>-ly y'rs - tim