string.join is abysmally slow

Graham Guttocks graham_guttocks at yahoo.co.nz
Sun Apr 15 18:59:11 EDT 2001


Greetings,

I've run into a performance problem in one of my functions, and wonder
if I could get some recommendations on how to speed things up.

What I'm trying to do is read in a text file containing e-mail
addresses, one per line, and use them to build a regular expression
object in the form "address1|address2|address3|addressN" to search
against.

I'm using string.join to concatenate the addresses together, separated
by a `|'.  The problem is that string.join is unacceptably slow in
this task.  The following program takes 37 seconds on a PIII/700 to
process a 239-line file!

--------------------------------------------------------------------

import fileinput, re, string
list = []

for line in fileinput.input(textfile):
    # Comment or blank line?  (each line still ends in '\n' here)
    if string.strip(line) == '' or line[0] == '#':
        continue
    else:
        list.append(string.strip(line))
        # "address1|address2|address3|addressN"
        regex = string.join(list,'|')
        regex = '"' + regex + '"'
        reo = re.compile(regex, re.I)

--------------------------------------------------------------------
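
One thing that may be worth checking: in the program above, the
string.join and re.compile calls sit inside the for loop, so the
pattern is rebuilt and recompiled once per address. The repeated
compile, rather than the join itself, is a likely source of the
slowdown. Below is a minimal sketch of the same task with the join and
the compile done once, after the loop. The function name
build_address_regex is hypothetical, the literal double quotes around
the pattern are dropped on the assumption that they were only notation,
and re.escape is an added assumption so that characters such as '.' in
an address match literally.

```python
import re

def build_address_regex(lines):
    """Build one case-insensitive pattern from an iterable of lines.

    Hypothetical helper, not part of the original program.
    """
    addresses = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith('#'):
            continue
        # Assumption: addresses should match literally, so escape
        # regex metacharacters such as '.' and '+'.
        addresses.append(re.escape(stripped))
    # Join and compile once, after the loop, not once per line.
    return re.compile('|'.join(addresses), re.I)
```

With the original input this would be called as
build_address_regex(fileinput.input(textfile)). Note that an input with
no addresses at all yields an empty pattern, which matches any string.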

