need fast parser for comma/space delimited numbers
Les Schaffer
godzilla at netmeg.net
Sat Mar 18 12:09:58 EST 2000
>>>>> ">" == Gordon McMillan <gmcm at hypernet.com> writes:
>> Les Schaffer wants speed:
hmmm.... i am going to get a bad reputation ...
>> We can speed up what you've got, but probably not that much!
your ideas look real good. i will try them first before thinking about
doing a C module.
>> First, use "def __parseIFF(self, str, atoi=string.atoi,
>> atof=string.atof):" and then access those as locals.
this is interesting. is there a difference between
    def __parseIFF(self, str, atoi=string.atoi):
        ...
and
    def __parseIFF(self, str):
        atoi = string.atoi
        ...
i am guessing there is enough difference, never thought about it till
now. i guess the atoi=string.atoi creates a "static" local copy for
this function, the assignment done only once, whereas the
atoi=string.atoi in the body of the def gets executed every stinkin
time, correct?
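That guess matches how Python treats default arguments: the default expression is evaluated once, when the def statement runs, while an assignment in the body runs on every call. Here is a minimal sketch in present-day Python 3 (not the 1.5-era code in this thread). The names lookup, parse_default and parse_body are made up for the illustration, and lookup merely stands in for the string.atoi attribute lookup so that each evaluation can be counted.

```python
# Sketch only: shows when each binding of the converter happens.
calls = []

def lookup():
    """Stand-in for the `string.atoi` lookup; records each time it runs."""
    calls.append(1)
    return int

def parse_default(text, atoi=lookup()):
    # The default was evaluated once, when the def statement ran.
    return atoi(text)

def parse_body(text):
    # This assignment runs again on every call.
    atoi = lookup()
    return atoi(text)

after_def = len(calls)            # lookups so far: just the default
for _ in range(3):
    parse_default("42")
after_default_calls = len(calls)  # calling parse_default adds none
for _ in range(3):
    parse_body("42")
after_body_calls = len(calls)     # each parse_body call adds one
```

Inside the function both versions end up reading a local name; the body version just pays for the global and attribute lookup again on each call.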
>> Second, benchmark against "int" and "float".
okay. i noticed in the Scientific Python modules K. Hinsen uses
something like this
    numb = eval( str )
with str being things like ' 4.235 ', etc. i wonder which is faster?
(thinking out loud)
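One way to settle the question is to time both on the same input. The sketch below is present-day Python 3 and uses the standard timeit module; time_both is a hypothetical helper name, and the numbers it returns will vary by machine. Evaluating the string as Python has to compile it first and will run any expression it is handed, so it would be expected to be the slower and riskier route, but that is worth measuring rather than assuming.

```python
import timeit

text = " 4.235 "

# Both routes give the same number for a plain numeric literal.
via_float = float(text)        # float() ignores surrounding whitespace
via_eval = eval(text.strip())  # eval() runs the string as an expression

def time_both(number=10000):
    """Return (seconds for float, seconds for eval) over `number` conversions."""
    t_float = timeit.timeit(lambda: float(text), number=number)
    t_eval = timeit.timeit(lambda: eval(text.strip()), number=number)
    return t_float, t_eval
```

Calling time_both() returns the two timings as a tuple, which can be compared directly.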
>> First, splitfields is obsolete, use "split".
sheeesh. i read the manual all the time and i constantly confuse which
of them is obsolete and which isn't. Someone toss that splitfields out
the window, please!!!!
>> Second, special case the whitespace case, because that would
>> just be "split(str)". Third, use locals trick.
i think i can swing the special case trick, 'cause code using this class
can know ahead of time if it's csv or whitespace.
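Since the caller knows the format up front, the choice can be made once, outside the loop. A minimal sketch in present-day Python 3, where parse_whitespace, parse_csv and pick_parser are hypothetical names and not part of the class discussed here:

```python
def parse_whitespace(line):
    # split() with no argument splits on any run of whitespace
    return list(map(float, line.split()))

def parse_csv(line):
    # float() tolerates blanks around each field, so no strip is needed
    return list(map(float, line.split(",")))

def pick_parser(is_csv):
    """Choose the parser once, ahead of the loop (hypothetical helper)."""
    return parse_csv if is_csv else parse_whitespace

parse = pick_parser(is_csv=False)
rows = [parse(line) for line in ["1.0 2.5  3", " 4 5e-1\t6 "]]
```

The per-line work then contains no format test at all, only the split and the conversion.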
> For the all floats, all whitespace case, this would just be
> num = map(float, split(strLines[i]))
> and that might get you the speed you want.
okay. is there a big difference between the string.ato[if] and
float/int?
> For the comma case, you might try:
> s = join(split(strLines[i], ','), ' ')
> num = map(float, split(s))
> or
> t = split(strLines[i], ',')
> t = map(strip, t)
> num = map(float, t)
will give'em a try...
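For reference, the two comma variants quoted above look like this in present-day Python 3, where split, join and strip are string methods and map returns an iterator, hence the list() calls. The sample line is invented for the illustration.

```python
line = "1.5, 2.25 ,3,4e2"

# Variant 1: turn commas into blanks, then split on whitespace.
s = " ".join(line.split(","))
num_a = list(map(float, s.split()))

# Variant 2: split on commas, strip each field, convert.
t = line.split(",")
t = map(str.strip, t)
num_b = list(map(float, t))
```

The two differ on malformed input: with an empty field such as "1,,2", variant 1 silently yields two numbers, while variant 2 raises ValueError on the empty string.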
anyone care to take a guesstimate on how much further time i could
save by coding something in C? if i did that, i would write a function
which takes a Python list object (list of strings) and passes back a
pair of Numeric array objects (dependent and independent
variables). so i would cut out all the Python for-looping as well.
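The interface described there can be prototyped in pure Python first, which gives a baseline to time any C version against. This is only a sketch under assumptions: lines_to_xy is a hypothetical name, the standard-library array type stands in for Numeric arrays, and the first column is assumed to be the independent variable and the second the dependent one.

```python
from array import array

def lines_to_xy(lines):
    """Hypothetical interface: list of 'x y' strings -> (x array, y array)."""
    xs = array("d")  # doubles; stand-in for a Numeric array
    ys = array("d")
    for line in lines:
        # assumes the first two whitespace-separated columns are x and y
        x, y = line.split()[:2]
        xs.append(float(x))
        ys.append(float(y))
    return xs, ys

x, y = lines_to_xy(["0.0 1.0", "0.5 1.25", "1.0 2.0"])
```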
many thanks, gordon!
les schaffer