basic question: target assignment in for loop

Carl Banks imbosol-1046071041 at
Mon Feb 24 08:34:19 CET 2003

Kawaldeep Grewal wrote:
> hello,
> this may be a faq, and if it is, I would appreciate a pointer in the 
> right direction.
> I'm using python to edit some text/html with the re module. I want to do 
> this:
> html = htmlFile.readlines()
> for line in html:
>       line = re.sub("regexString", functionReturningString, line),
> but python assigns the target by value and not by reference, which (in 
> my mind) breaks the abstraction. So, I have to resort to this:
> html = htmlFile.readlines()
> i = 0
> while i < len(html):
>        html[i] = re.sub("regexString", functionReturningString, html[i])
>        i = i + 1
> this code is decidedly not elegant, and looks very C-ish. As I'm new to 
> python, can anyone tell me whether I'm just confused or that this is the 
> way to do things?

Sometimes it's best to just build up a separate list.  For example:

    oldhtml = htmlFile.readlines()
    newhtml = []
    for line in oldhtml:
        newhtml.append(re.sub("regexString", functionReturningString, line))

You can get the same effect in a single statement with a list
comprehension:

    html = [ re.sub("regexString", functionReturningString, line)
             for line in htmlFile.readlines() ]
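
To make the difference concrete, here is a small self-contained sketch. The sample lines, the tag pattern, and the shout callback are made up for illustration and stand in for your regexString and functionReturningString. It shows that rebinding the loop variable leaves the list alone, while the comprehension builds the list you want:

```python
import re

def shout(match):
    # Hypothetical replacement callback: gets a match object,
    # returns the text to substitute for it.
    return match.group(0).upper()

lines = ["<b>hello</b>\n", "plain text\n", "<i>world</i>\n"]

# Rebinding the name `line` only points that name at a new string;
# the list element it came from is not touched.
for line in lines:
    line = re.sub(r"<[^>]+>", shout, line)

after_loop = list(lines)  # still the original strings

# The list comprehension collects each result into a new list.
new_lines = [re.sub(r"<[^>]+>", shout, line) for line in lines]
```

Strings are immutable, so there is nothing the loop could change in place; the only way to update the list is to store a new string into it or to build a new list.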

And if you would rather use an index, the xrange function is useful:
xrange(n) yields the indices 0 through n-1 one at a time, without
building a list of them first:

    html = htmlFile.readlines()
    for i in xrange(len(html)):
        html[i] = re.sub("regexString", functionReturningString, html[i])
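
If your Python has the enumerate built-in (it was added in Python 2.3), you can get the index and the line together and skip the len() call. A minimal sketch, with made-up sample data and a plain string replacement standing in for your function:

```python
import re

html = ["foo bar\n", "bar foo\n"]  # stand-in for htmlFile.readlines()

# enumerate yields (index, item) pairs, so the loop body can
# assign back into the list without indexing to read the line.
for i, line in enumerate(html):
    html[i] = re.sub("foo", "baz", line)
```

Assigning to html[i] while iterating is safe here because the list never changes length.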

A couple of other points.  It will certainly benefit you speedwise to
use a compiled regular expression.  Do this by using re.compile:

    pattern = re.compile("regexString")
    html = htmlFile.readlines()
    for i in xrange(len(html)):
        html[i] = pattern.sub(functionReturningString, html[i])
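
Here is a runnable sketch of the compiled-pattern version with a function as the replacement. The pattern, the sample lines, and the double callback are assumptions for illustration only:

```python
import re

# Compile once, outside the loop, and reuse the pattern object.
pattern = re.compile(r"\d+")

def double(match):
    # Hypothetical callback: called once per match, must return a string.
    return str(int(match.group(0)) * 2)

html = ["width=10 height=20\n", "no numbers\n"]
html = [pattern.sub(double, line) for line in html]
```

The pattern object's sub method takes the replacement first and the string second; the pattern itself is no longer an argument.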

Second, if your goal is to simply replace one string (or regex) with
another everywhere in a file, and then write it out, this can be done
much more efficiently with one sweep over the entire buffer:

    pattern = re.compile("regexString")
    buf = htmlFile.read()
    buf = pattern.sub(functionReturningString, buf)
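
A self-contained sketch of the whole-buffer approach, including the write. It uses io.StringIO objects as stand-ins for real open files, and the pattern and the name outFile are made up for the example:

```python
import io
import re

# Stand-ins for files opened for reading and writing.
htmlFile = io.StringIO("<p>one</p>\n<p>two</p>\n")
outFile = io.StringIO()

pattern = re.compile(r"</?p>")

buf = htmlFile.read()       # the entire file as one string
buf = pattern.sub("", buf)  # one pass over the whole buffer
outFile.write(buf)
```

One thing to watch for: on a whole buffer, ^ and $ anchor to the buffer rather than to each line, so a line-oriented pattern would need to be compiled with re.MULTILINE.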

