Equivalent of Perl chomp?

Fri Feb 1 15:28:15 EST 2002

On Thu, 31 Jan 2002 18:24:16 -0500, "Steve Holden" <sholden at holdenweb.com> wrote:
[... big chomp test ...]
>
>My results follow. You may find you need to add a couple more decimal places
>to get sensible results, my Thinkpad is a bit of a cronker ...
>
>D:\Steve\Projects\Python>python time1.py
>Endswith 0.00000511
>sliceIt  0.00000434
>
>D:\Steve\Projects\Python>python time1.py
>Endswith 0.00000511
>sliceIt  0.00000186
>
>D:\Steve\Projects\Python>python time1.py
>Endswith 0.00000505
>sliceIt  0.00000209
>
Using CPU ticks (actual RDTSC) and finding the _minimum_ times
(best of 100 in this case; 10 is usually enough)
on a 300 MHz P2:

[12:30] C:\pywk\chompt>python chompt.py
min:   1595 nullTime
min:   3498 endsWithTime, ex nulltime: 1903
min:   2702 sliceItTime, ex nulltime: 1107

[12:30] C:\pywk\chompt>python chompt.py
min:   1598 nullTime
min:   3537 endsWithTime, ex nulltime: 1939
min:   2681 sliceItTime, ex nulltime: 1083

[12:30] C:\pywk\chompt>python chompt.py
min:   1594 nullTime
min:   3541 endsWithTime, ex nulltime: 1947
min:   2679 sliceItTime, ex nulltime: 1085

[12:30] C:\pywk\chompt>python chompt.py
min:   1598 nullTime
min:   3528 endsWithTime, ex nulltime: 1930
min:   2675 sliceItTime, ex nulltime: 1077

The minimums from runs of 100 trials vary very little.
The four-tick spread in the nullTimes is 4/300 of a microsecond (about 13 ns) here ;-)
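
For anyone wanting to try the same approach, here is a rough sketch, not the chompt.py used above. Assumptions: it uses Python 3's time.perf_counter_ns() in place of RDTSC, so the numbers are nanoseconds rather than CPU ticks, and the names min_time, null and report are hypothetical.

```python
import time

def min_time(func, arg, trials=100):
    """Smallest single-shot elapsed time, in nanoseconds, over `trials` calls."""
    best = None
    for _ in range(trials):
        start = time.perf_counter_ns()
        func(arg)
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

def null(s):
    """Baseline: same call overhead, no real work."""
    return s

def report(line="some text\n", trials=100):
    """Print minimum times in the style of the output above (ns, not ticks)."""
    null_time = min_time(null, line, trials)
    print("min: %6d nullTime" % null_time)
    candidates = (
        ("endsWithTime", lambda s: s.endswith("\n")),  # hypothetical stand-in
        ("sliceItTime", lambda s: s[-1:] == "\n"),     # hypothetical stand-in
    )
    for name, func in candidates:
        t = min_time(func, line, trials)
        print("min: %6d %s, ex nulltime: %d" % (t, name, t - null_time))
```

Calling report() prints one nullTime line and one line per candidate. Subtracting the null time removes the cost of the timer calls and the function call itself, which is most of what is being measured at this scale.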

>As you can see there is still some variability but slicing looks like a
>clear winner. Increasing the iteration count might produce convergence on
I think slicing is probably a winner because endswith subsumes more
complex slicing options that have to be tested for, and the argument can be
multicharacter. I would be surprised if single-character string handling
wasn't special-cased a lot for speed, benefiting s[-1:] == '\n'.
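
The two tests being compared can be sketched as follows. The original test code was snipped from this thread, so the function names chomp_endswith and chomp_slice are assumptions, and this strips a single trailing '\n' only.

```python
def chomp_endswith(s):
    """Drop one trailing newline, testing with str.endswith."""
    if s.endswith("\n"):
        return s[:-1]
    return s

def chomp_slice(s):
    """Drop one trailing newline, testing with a one-character slice."""
    # s[-1:] is "" for an empty string, where s[-1] would raise IndexError.
    if s[-1:] == "\n":
        return s[:-1]
    return s

# Both variants should agree, including on the empty string.
for sample in ("line\n", "line", "", "\n", "two\n\n"):
    assert chomp_endswith(sample) == chomp_slice(sample)
```

The slice form s[-1:] rather than the index s[-1] is what makes the comparison safe on empty input.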

>the timings. You might also ponder why sliceIt appear more variable than
>Endswith (tell me if you think of a good reason, I have no clue - this isn't
>a trick question).
I suspect the variability is lumpiness in memory resource access timings, not in the
routines per se. Changing the order may make the other appear more variable. The state
of your OS's memory resources will not be identical each time you start the program.

Also, you can make a shocking difference in results by inserting e.g. an apparently
innocuous print or info-storing code inside a timing loop (but outside of the timed
interval). I strongly suspect this is due to the extra code messing up the CPU cache
for the next iteration.

I like minimum single-shot times to get an idea of what an algorithm
is capable of, which you can't get as well or as quickly with averages.
Then the trick is to make it run that fast most of the time ;-)
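
To see how far the minimum sits from the average, something like the following could be used. It is only a sketch: it relies on the standard timeit and statistics modules, times batches of calls rather than true single shots, and the helper name summarize is hypothetical.

```python
import statistics
import timeit

def summarize(stmt, setup="pass", repeat=10, number=1000):
    """Return (minimum, mean) seconds per call over `repeat` batches."""
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    return min(per_call), statistics.mean(per_call)

lo, avg = summarize("s[-1:] == '\\n'", setup="s = 'some text\\n'")
print("min %.3e s, mean %.3e s" % (lo, avg))
```

The gap between the two figures gives a feel for how much of the average is noise from the rest of the system rather than the code under test.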

Regards,
Bengt Richter