python performance
Terry Reedy
tjreedy at udel.edu
Sun Sep 15 17:56:48 EDT 2002
"Padraig Brady" <padraig at linux.ie> wrote in message
news:3D84E66F.6020701 at linux.ie...
> I was wondering about the performance characteristics
> of python and ran a simple test. The 2 programs
> below are functionally equivalent and just read
> in fields into a list. file.fields contains
> 720 fields in each of 405 lines of the form of
> repeating: oneoneone twotwo "thre e three"
>
> The time to run is shown above each program
> from which I've inferred the following:
>
> 1. The function call version is (6.3%) faster because
> the cumulative cost of parsing the simpler expressions
> and function call overhead is smaller than parsing the
> 1 single complex expression?
Calling functions adds overhead. Running code within a function
reduces the overhead of name access, because local names are looked
up faster than module-level (global) names. See end for a third
version to test.
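
A rough way to see the difference in name access is to look at the
bytecode. The sketch below assumes a modern CPython 3 with the dis
module; the loop and the names module_code, in_function and opnames
are made up for illustration, and exact opcode names vary between
versions.

```python
import dis

def opnames(code):
    """Return the set of opcode names used by a code object or function."""
    return {ins.opname for ins in dis.get_instructions(code)}

# The same loop, once compiled as module-level code ...
module_code = compile(
    "total = 0\nfor i in range(3):\n    total = total + i\n",
    "<module-level>",
    "exec",
)

# ... and once as the body of a function.
def in_function():
    total = 0
    for i in range(3):
        total = total + i
    return total

module_ops = opnames(module_code)
function_ops = opnames(in_function)

# Module-level variables go through dictionary-based name lookups.
uses_name_lookup = any(op in ("LOAD_NAME", "STORE_NAME") for op in module_ops)
# Function locals use indexed "fast" slots instead.
uses_fast_locals = any(
    op.startswith(("LOAD_FAST", "STORE_FAST")) for op in function_ops
)

print("module-level uses name lookups:", uses_name_lookup)
print("function uses fast locals:", uses_fast_locals)
```

Running dis.dis(in_function) prints the full listing if you want to
read the instructions one by one.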
> Or the function is parsed
> only once and doesn't have to be reparsed. This would
> suggest that top level code is parsed for each iteration?
No. All code is parsed only once. See above and below.
> 2. Anyway I thought that parsing effects would be removed by doing the
> parsing only once, i.e. compiling the code to .pyc (I used
> py_compile.compile()). However this makes no difference at all?
> Surely compiling is not just for code obfuscation.
No. Compiling to .pyc only saves the one-time parsing step, and the
one-time parsing of 10 lines of code is extremely fast. If the code
were parsed over again for each input line, then you would notice the
parsing time.
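
To get a feel for how small the one-time cost is, here is a sketch
that compiles a short script once and then runs the resulting code
object. It assumes a modern Python 3 (for time.perf_counter); the
script text and the names source, namespace and rows are made up for
illustration, and the timings will differ from machine to machine.

```python
import time

# A short script held as text, similar in size to the test programs.
source = '''
import re
finder = re.compile('[^ "]+|"[^"]+"')  # unquoted|quoted
rows = []
for line in lines:
    rows.append([f.replace('"', '') for f in finder.findall(line)])
'''

# Parse and compile exactly once.
start = time.perf_counter()
code = compile(source, "<fields>", "exec")
compile_seconds = time.perf_counter() - start

# Run the compiled code; the loop body executes 1000 times
# without being parsed again.
namespace = {"lines": ['oneoneone twotwo "thre e three"'] * 1000}
start = time.perf_counter()
exec(code, namespace)
run_seconds = time.perf_counter() - start

print("compile: %.6fs  run: %.6fs" % (compile_seconds, run_seconds))
```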
> Note I did do the test several times and averaged the results.
> ----------------------------
> 2.514s
> ----------------------------
> #!/usr/bin/env python2.2
> import re
>
> reFieldFinder = re.compile('[^ "]+|"[^"]+"') #unquoted|quoted
> def getFields(line):
>     fields = reFieldFinder.findall(line)
>     return [field.replace('"', '') for field in fields]
>
> for line in open("file.fields").readlines():
>     listLine = getFields(line[:-1])
>
> ----------------------------
> 2.672s
> ----------------------------
> #!/usr/bin/env python2.2
> import re
>
> reFieldFinder = re.compile('[^ "]+|"[^"]+"') #unquoted|quoted
> for line in open("file.fields").readlines():
>     listLine = [field.replace('"', '') for field in
>                 reFieldFinder.findall(line[:-1])]
Try one more test, which runs at function speed without the repeated
calls to getFields():

import re

def mytest():
    reFieldFinder = re.compile('[^ "]+|"[^"]+"') #unquoted|quoted
    for line in open("file.fields").readlines():
        listLine = [field.replace('"', '') for field in
                    reFieldFinder.findall(line[:-1])]

mytest()
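
If you would rather time the variants side by side, something like the
following sketch may help. It assumes a modern Python 3 and builds the
input in memory instead of reading file.fields, so the numbers are not
comparable with the ones quoted above; the names LINES, with_helper
and inlined are made up for illustration.

```python
import re
import timeit

FIELD_RE = re.compile('[^ "]+|"[^"]+"')  # unquoted|quoted

# Stand-in for file.fields: 405 lines of 720 fields each.
LINES = ['oneoneone twotwo "thre e three" ' * 240] * 405

def get_fields(line):
    fields = FIELD_RE.findall(line)
    return [field.replace('"', '') for field in fields]

def with_helper():
    """One function call per line, as in the first program."""
    result = None
    for line in LINES:
        result = get_fields(line)
    return result

def inlined():
    """Everything inside one function, as in mytest()."""
    finder = re.compile('[^ "]+|"[^"]+"')
    result = None
    for line in LINES:
        result = [field.replace('"', '') for field in finder.findall(line)]
    return result

if __name__ == "__main__":
    for fn in (with_helper, inlined):
        best = min(timeit.repeat(fn, number=1, repeat=3))
        print("%-12s %.4fs" % (fn.__name__, best))
```

Taking the minimum of several repeats reduces noise from other
processes on the machine.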
Terry J. Reedy