string.join is abysmally slow

Pekka Pessi Pekka.Pessi at nokia.com
Sun Apr 15 19:46:42 EDT 2001


In message <mailman.987375622.11843.python-list at python.org> Graham Guttocks <graham_guttocks at yahoo.co.nz> writes:
>I've run into a performance problem in one of my functions, and wonder
>if I could get some recommendations on how to speed things up.
...
>I'm using string.join to concatenate the addresses together, separated
>by a `|'.  The problem is that string.join is unacceptably slow in
>this task.  The following program takes 37 seconds on a PIII/700 to
>process a 239-line file!

        If "|".join is slow, use it sparingly?  The real problem seems to be
        that you are rebuilding the regexp on every pass through the loop,
        some 238 times or more, and throwing away all but the last result.
        Join and compile once, after the loop has completed.  You could also
        re.escape() the mail addresses, so that characters such as '.' and
        '+' are matched literally:

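import fileinput
import re

addresses = []  # do not call this "list"; that shadows the built-in
for line in fileinput.input(textfile):  # textfile: path from your program
    line = line.strip()  # strip first, so blank lines become ''
    if line and line[0] != '#':  # skip blank lines and comments
        addresses.append(re.escape(line))

# "address1|address2|address3|addressN", joined and compiled once
reo = re.compile('|'.join(addresses), re.I)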
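
        The same idea as a self-contained sketch that reads from any
        iterable of lines instead of a file.  The function name
        build_address_pattern and the sample addresses are made up for
        illustration; they are not from the original program.  It assumes
        you want None back when there are no addresses, since an empty
        pattern would match every string:

```python
import re


def build_address_pattern(lines):
    """Compile one case-insensitive alternation from an iterable of lines.

    Hypothetical helper for illustration; not part of the original program.
    """
    addresses = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comments.
        if line and not line.startswith('#'):
            # re.escape keeps '.' and '+' in an address literal.
            addresses.append(re.escape(line))
    if not addresses:
        # Assumption: an empty alternation would match everything,
        # so signal "no pattern" instead.
        return None
    # Join and compile exactly once, after the loop has finished.
    return re.compile('|'.join(addresses), re.I)


# Made-up sample input standing in for the lines of the address file.
sample = ["# comment", "", "Foo.Bar@example.com\n", "a+b@example.org\n"]
pattern = build_address_pattern(sample)
```

        The loop only collects and escapes strings; the join and the
        compile each happen once, however many lines the input has.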

        BR,
                                                Pekka


