string.join is abysmally slow
Ixokai
news at myNOSPAM.org
Sun Apr 15 19:29:09 EDT 2001
Hello,
I'd think that there has to be a better way to go about the problem. I
mean, I'd imagine it's slow because strings are immutable; every time you
append something new, it has to allocate a new block of memory of the
proper size, copy everything over, then deallocate the old one if it's not
referenced anymore. This is just a guess, mind you.
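To make the guess above concrete, here is a minimal sketch (modern Python 3
assumed; the function names and sample addresses are made up for
illustration). The first function builds the string piece by piece, which is
the pattern where every step may allocate and copy a new string. The second
uses the join method, which receives the whole list at once. The sketch only
shows that both give the same result; it makes no claim about timings.

```python
def join_by_concatenation(parts, sep="|"):
    """Build the result with repeated '+'.

    Strings are immutable, so each '+' may create a brand-new string
    and copy the old contents into it.
    """
    result = ""
    for index, part in enumerate(parts):
        if index:
            result = result + sep
        result = result + part
    return result


def join_builtin(parts, sep="|"):
    """Let str.join assemble the result from the whole list in one call."""
    return sep.join(parts)


# Hypothetical sample data, not taken from the original file.
addresses = ["a@example.com", "b@example.com", "c@example.com"]
print(join_by_concatenation(addresses) == join_builtin(addresses))
```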
If your file is strictly one address per line, why not 'list =
file.readlines()' then 'if email in list:'? (readlines() keeps the trailing
newline on each line, so strip the lines first or compare against
email + '\n'.) Both operations take a fraction of a second on my
P2 400 MHz. I tested it with a mock-up file of 239 pretend email addresses
I generated.
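A rough sketch of that membership-test approach, assuming modern Python 3.
The helper name load_addresses and the sample data are hypothetical; an
in-memory io.StringIO stands in for the real file so the example runs on its
own. Comment and blank lines are skipped as in the quoted program, and
addresses are lower-cased as a stand-in for the case-insensitive match there.

```python
import io


def load_addresses(lines):
    """Collect stripped, lower-cased addresses, skipping blanks and comments."""
    addresses = set()
    for line in lines:
        line = line.strip()  # drops the trailing newline as well
        if not line or line.startswith("#"):
            continue
        addresses.add(line.lower())
    return addresses


# Hypothetical file contents; a real program would pass an open file instead.
sample = io.StringIO("# comment\nalice@example.com\n\nBob@Example.com\n")
known = load_addresses(sample)
print("bob@example.com" in known)
```

A set is used here rather than a list, so each lookup does not have to scan
every entry; for 239 addresses either would be quick.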
Then again... after loading the entire file into a list of 'lines',
'regex = "|".join(lines)' is also nearly instantaneous for me. What version
of Python are you using?
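For reference, a small sketch of the join-then-compile step in modern
Python 3. The function name build_address_pattern and the sample addresses
are invented for the example. Calling re.escape on each address is an
addition not present in the quoted program; without it, the '.' in an
address would match any character.

```python
import re


def build_address_pattern(addresses):
    """Join escaped addresses with '|' and compile once, case-insensitively."""
    escaped = [re.escape(address) for address in addresses]
    return re.compile("|".join(escaped), re.IGNORECASE)


# Hypothetical sample addresses.
pattern = build_address_pattern(["alice@example.com", "bob@example.com"])
print(bool(pattern.search("From: ALICE@example.com")))
print(bool(pattern.search("alice@exampleXcom")))
```

The join and the compile each happen exactly once, after the whole list has
been built.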
--Stephen
(to reply, remove 'NOSPAM' and replace with 'seraph')
"Graham Guttocks" <graham_guttocks at yahoo.co.nz> wrote in message
news:mailman.987375622.11843.python-list at python.org...
> Greetings,
>
> I've run into a performance problem in one of my functions, and wonder
> if I could get some recommendations on how to speed things up.
>
> What I'm trying to do is read in a textfile containing e-mail
> addresses, one per line, and use them to build a regular expression
> object in the form "address1|address2|address3|addressN" to search
> against.
>
> I'm using string.join to concatenate the addresses together, separated
> by a `|'. The problem is that string.join is unacceptably slow in
> this task. The following program takes 37 seconds on a PIII/700 to
> process a 239-line file!
>
> --------------------------------------------------------------------
>
> import fileinput, re, string
> list = []
>
> for line in fileinput.input(textfile):
>     # Comment or blank line?
>     if line == '' or line[0] in '#':
>         continue
>     else:
>         list.append(string.strip(line))
> # "address1|address2|address3|addressN"
> regex = string.join(list,'|')
> regex = '"' + regex + '"'
> reo = re.compile(regex, re.I)
>
> --------------------------------------------------------------------