[SciPy-Dev] possible speed-up for arffread
Benjamin Root
ben.root at ou.edu
Tue Jun 15 22:46:21 EDT 2010
Hello,
I was looking at the scipy.io.arff module to see if I could easily shave
some processing time off loading an ARFF file. Profiling a load of a
file with 40,000 floating point numbers pointed me to the safe_float()
function in arffread.py. That function strips whitespace from the string
token and then compares the result to '?' (which is ARFF's missing data
indicator). I found that if one just checks for the '?' character instead,
almost 30% of the processing time of the safe_float() function goes
away.
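
For readers without the patch at hand, here is a rough sketch of the two approaches. It is not the actual arffread.py code: the function names are made up, and reading "a check for the '?' character" as a substring test ('?' in token) is an assumption.

```python
import math
import timeit


def safe_float_strip(token):
    """Strip whitespace first, then compare against the '?' marker."""
    if token.strip() == '?':
        return float('nan')
    return float(token)


def safe_float_check(token):
    """Assumed variant: test for the '?' character directly, no strip()."""
    if '?' in token:
        return float('nan')
    # float() itself tolerates leading and trailing whitespace.
    return float(token)


if __name__ == "__main__":
    # Hypothetical workload: 40,000 tokens with an occasional missing value.
    tokens = [' ? ' if i % 100 == 0 else ' %f ' % i for i in range(40000)]
    for func in (safe_float_strip, safe_float_check):
        elapsed = timeit.timeit(lambda: [func(t) for t in tokens], number=10)
        print(func.__name__, elapsed)
    assert math.isnan(safe_float_check(' ? '))
```

Timings will vary with the Python version and the data, so the script prints measurements rather than promising a particular speed-up.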
In addition, I found a very slight improvement by calculating range(ni)
once and reusing the result in the generator function, rather than
recomputing it each time. Attached is my patch file.
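
The range(ni) change can be sketched along these lines. This is only an illustration under assumptions: the generator names, the comma splitting, and the converters list are hypothetical and not taken from arffread.py. On Python 2, range() builds a new list on every call, which is presumably where the small saving comes from.

```python
def rows_recompute(raw_rows, converters):
    """Yield converted rows, calling range(ni) again for every row."""
    ni = len(converters)
    for raw in raw_rows:
        fields = raw.split(',')
        yield tuple(converters[i](fields[i]) for i in range(ni))


def rows_hoisted(raw_rows, converters):
    """Yield converted rows, reusing a single precomputed range(ni)."""
    ni = len(converters)
    indices = range(ni)  # computed once, outside the per-row loop
    for raw in raw_rows:
        fields = raw.split(',')
        yield tuple(converters[i](fields[i]) for i in indices)
```

Both generators yield the same rows; only the number of range() calls differs.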
It isn't much, but it is noticeable.
Thanks,
Ben Root
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arffread_speedup.patch
Type: text/x-patch
Size: 1289 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20100615/5a1c2d5c/attachment.bin>