Becoming Unicode Aware

Michael Foord fuzzyman at
Fri Oct 29 09:29:47 CEST 2004

aleaxit at (Alex Martelli) wrote in message news:<1gmd10n.1xt7l7q6ahyacN%aleaxit at>...
> Michael Foord <fuzzyman at> wrote:
>    ...
> > def afunction(setoflines, encoding='ascii'):
> >     for line in setoflines:
> >         if encoding:
> >             line = line.decode(encoding)
> This snippet as posted is a complicated "no-op but raise an error for
> invalidly encoded lines", if it's the whole function.

It isn't the whole function... glad you credit me with
some intelligence ;-)
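
To make the "no-op but raise an error" remark concrete, here is a minimal sketch of the snippet as posted. It assumes Python 3, where the lines are bytes objects and bytes.decode() plays the role that str.decode() played in the original code; the sample data is made up for illustration.

```python
def afunction(setoflines, encoding='ascii'):
    for line in setoflines:
        if encoding:
            # Rebinds the local name 'line' only; the list is not modified.
            line = line.decode(encoding)


lines = [b'hello', b'world']  # illustrative sample data
afunction(lines)
# 'lines' still holds the original bytes objects at this point.

bad = [b'caf\xe9']  # 0xe9 is not valid ASCII
try:
    afunction(bad)
    raised = False
except UnicodeDecodeError:
    raised = True
```

If this were the whole function, its only visible effect would be the UnicodeDecodeError for lines that do not decode.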

> Assuming the so-called setoflines IS not a set but a list (order
> normally matters in such cases), you may rather want:
> def afunction(setoflines, encoding='ascii'):
>     for i, line in enumerate(setoflines):
>         setoflines[i] = line.decode(encoding)
> The removal of the 'if' is just the same advice you were already given;
> if you want to be able to explicitly pass encoding='' to AVOID the
> decode (the whole purpose of the function), just insert a first line
>     if not encoding: return
> rather than repeating the test in the loop.  But the key change is to
> use enumerate to get indices as well as values, and assign into the
> indexing in order to update 'setoflines' in-place; assigning to the
> local variable 'line' (assuming, again, that you didn't snip your code
> w/o a mention of that) is no good.
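
Putting the two suggestions together (the early return and the enumerate-based update) might look like the sketch below. It assumes Python 3 bytes input rather than the Python 2 strings of the original thread, and the sample lists are invented for illustration.

```python
def afunction(setoflines, encoding='ascii'):
    # An empty encoding means "skip the decode entirely".
    if not encoding:
        return
    for i, line in enumerate(setoflines):
        # Assigning through the index updates the caller's list in place.
        setoflines[i] = line.decode(encoding)


lines = [b'spam', b'eggs']  # illustrative sample data
afunction(lines)

untouched = [b'raw']
afunction(untouched, encoding='')  # early return, nothing decoded
```

The test for an empty encoding runs once per call instead of once per line, and the caller sees the decoded values because the list itself is modified.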

The rest of the function (which I didn't show) would actually process
the lines one by one.


> A good alternative might be
>     setoflines[:] = [line.decode(encoding) for line in setoflines]
> assuming again that you want the change to happen in-place.
> Alex
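
The difference between slice assignment and plain assignment can be sketched as follows. This assumes Python 3 bytes input; the function names decode_in_place and decode_rebinding are hypothetical, chosen here only to contrast the two behaviours.

```python
def decode_in_place(setoflines, encoding='ascii'):
    # Slice assignment replaces the contents of the existing list object.
    setoflines[:] = [line.decode(encoding) for line in setoflines]


def decode_rebinding(setoflines, encoding='ascii'):
    # Plain assignment rebinds the local name; the caller's list is unchanged.
    setoflines = [line.decode(encoding) for line in setoflines]
    return setoflines


lines = [b'one', b'two']  # illustrative sample data
alias = lines             # a second reference to the same list
decode_in_place(lines)
# Both 'lines' and 'alias' now see the decoded strings.

other = [b'three']
result = decode_rebinding(other)
# 'other' still holds bytes; only 'result' holds the decoded list.
```

Slice assignment is the right choice when other code holds a reference to the same list and should see the decoded values; returning a new list is simpler when it does not.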
