An efficient split function

Dan Schmidt dfan at harmonixmusic.com
Mon May 10 12:26:36 EDT 1999


"Andrew M. Kuchling" <akuchlin at cnri.reston.va.us> writes:

| William S. Lear writes:
|
| >Surprisingly, to me, the Python version far outperformed the Perl
| >version.  Running on 1 million lines of input of 9 fields each, the
| >Python version ran in just under 20 seconds, the Perl version in
| >just under 40 seconds (this on a 400MHz Pentium Linux box).
| 
| 	Note that your use of split(/\|/) in Perl requires using the
| regular expression engine, instead of a simple C splitting loop.
| Try using a literal string instead of a regex, as in split('|',
| ...); that will probably even out the speeds.

The first argument to Perl's split() is a regular expression.  If
it's a string, it'll just get converted into a regexp (except for the
special case ' '; it's Perl, there had to be a special case).  So

 - You actually need to use '\|', not '|', if you're going to use a
   string instead of a regexp, since a bare | is regexp alternation
   (try it and see);

 - '\|' isn't actually any faster than /\|/ (I benchmarked it to
   check).
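
The same distinction shows up on the Python side, where string splitting and regexp splitting are separate calls. What follows is a minimal sketch, not taken from either benchmark; the sample line and the variable names are made up for illustration. It shows that an unescaped "|" handed to the regexp engine is an alternation of two empty patterns (so it matches the empty string, not a pipe), while the escaped form splits the same way as the plain string method.

```python
import re

# Hypothetical sample: nine pipe-delimited fields, like the benchmark input.
line = "a|b|c|d|e|f|g|h|i"

# Plain string splitting: no regexp engine involved.
fields_str = line.split("|")

# Regexp splitting: the pipe has to be escaped to mean a literal pipe.
fields_re = re.split(r"\|", line)

# An unescaped "|" is an alternation of two empty patterns, so it
# matches the empty string at the start of the line, not a pipe.
empty_match = re.search("|", line).group()

# re.escape produces the escaped pattern from the literal delimiter.
escaped = re.escape("|")

print(fields_str == fields_re)  # both approaches give the same fields
print(repr(empty_match))
print(escaped)
```

If the delimiter comes from user input or a config file, running it through re.escape before handing it to re.split avoids the unescaped-pipe surprise.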

-- 
                 Dan Schmidt -> dfan at harmonixmusic.com, dfan at alum.mit.edu
Honest Bob & the                http://www2.thecia.net/users/dfan/
Factory-to-Dealer Incentives -> http://www2.thecia.net/users/dfan/hbob/
          Gamelan Galak Tika -> http://web.mit.edu/galak-tika/www/



