Text Parsing - character at a time...
Jeff Epler
jepler at unpythonic.net
Fri Jul 9 08:12:43 EDT 2004
It's not clear what you mean by claiming that "creating a new string for
every character" is inefficient:
$ timeit 'int()'
100000 loops, best of 3: 1.26 usec per loop
$ timeit 'str()'
1000000 loops, best of 3: 1.28 usec per loop
$ timeit 'chr(0)'
1000000 loops, best of 3: 1.73 usec per loop
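To reproduce numbers like these from inside Python rather than from the shell, something along these lines should work with the standard timeit module. The helper name usec_per_call is made up for this sketch, and the figures will vary with machine and Python version.

```python
import timeit

def usec_per_call(stmt, number=100000):
    """Return the best-of-3 time for one execution of stmt, in microseconds."""
    # usec_per_call is a hypothetical helper for this sketch, not part of timeit.
    best = min(timeit.repeat(stmt, number=number, repeat=3))
    return best / number * 1e6

if __name__ == "__main__":
    for stmt in ("int()", "str()", "chr(0)"):
        print("%-8s %.3f usec per loop" % (stmt, usec_per_call(stmt)))
```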
If your output is a transformation of your input, I'd write
    def transform(input):
        def _transform():
            for c in input:
                yield a string zero or more times
        return ''.join(_transform())
Python should automatically do some nice overallocation tricks to make
this fairly efficient. You could also write
    def transform(input):
        result = []
        for c in input:
            result.append(a string) zero or more times
        return ''.join(result)
and if you care about the absolute fastest code you'll benchmark both of
them.
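As a rough sketch of what that benchmark might look like, here are both approaches filled in with a toy rule (drop spaces, double vowels) that I've invented purely for illustration. The names transform_gen and transform_list are assumptions too. Which one wins will depend on your input and your Python version.

```python
import timeit

def transform_gen(text):
    """Generator version: yield pieces, then join them once."""
    def _pieces():
        for c in text:
            if c == ' ':
                continue      # zero strings for a space
            yield c           # one string for every other character
            if c in 'aeiou':
                yield c       # a second string for a vowel
    return ''.join(_pieces())

def transform_list(text):
    """List version: append pieces, then join them once."""
    result = []
    for c in text:
        if c == ' ':
            continue
        result.append(c)
        if c in 'aeiou':
            result.append(c)
    return ''.join(result)

if __name__ == "__main__":
    sample = "parse me one character at a time " * 1000
    # Both versions must agree before the timings mean anything.
    assert transform_gen(sample) == transform_list(sample)
    for fn in (transform_gen, transform_list):
        best = min(timeit.repeat(lambda: fn(sample), number=20, repeat=3))
        print("%-15s %.4f sec for 20 calls" % (fn.__name__, best))
```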
A common "gotcha" for starting programmers would be to write something
like
    def transform(input):
        result = ''
        for c in input:
            result += a string zero or more times
        return result
because in this case Python won't (currently, anyway) do any clever
overallocation tricks, but instead will do a copy of the partial result
at the site of each +=.
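To see whether this matters on your setup, a sketch like the following puts the += form next to a join-based form. The rule used here (expand tabs to four spaces, drop carriage returns) and the names transform_concat and transform_join are placeholders of mine. How big the gap is, or whether there is one at all, depends on the interpreter and version, so measure rather than take my word for it.

```python
import timeit

def transform_concat(text):
    """Build the result with repeated += on a string."""
    result = ''
    for c in text:
        if c == '\r':
            continue                          # emit nothing for a carriage return
        result += '    ' if c == '\t' else c  # each += may copy the partial result
    return result

def transform_join(text):
    """Collect the pieces in a list and join them once at the end."""
    pieces = []
    for c in text:
        if c == '\r':
            continue
        pieces.append('    ' if c == '\t' else c)
    return ''.join(pieces)

if __name__ == "__main__":
    sample = "one\ttwo\r\nthree\tfour\r\n" * 5000
    # Check that the two forms produce identical output before timing them.
    assert transform_concat(sample) == transform_join(sample)
    for fn in (transform_concat, transform_join):
        best = min(timeit.repeat(lambda: fn(sample), number=10, repeat=3))
        print("%-17s %.4f sec for 10 calls" % (fn.__name__, best))
```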
Jeff