Why does my list go Unicode by itself?
Martin Hvidberg
Martin at Hvidberg.net
Mon Dec 20 16:08:20 EST 2010
I'm reading a fixed-format text file, line by line. The code is shown
below; I have <snipped> out the parts not related to reading the file.
The only relevant detail left out is lstCutters. It looks like this:
[[1, 9], [11, 21], [23, 48], [50, 59], [61, 96], [98, 123], [125, 150]]
It specifies the first and last character position of each token in the
fixed format of the input line.
All this works fine; I include it only to explain where I'm going.
The function definition is split over more lines than necessary so that
I can monitor the variables step by step.
--- Code start ------
import codecs
<snip>
def CutLine2List(strIn,lstCut):
    strIn = strIn.strip()
    print '>InNextLine>',strIn
    # skip if line is empty
    if len(strIn)<1:
        return False
    lstIn = list()
    for cc in lstCut:
        strSubline =strIn[cc[0]-1:cc[1]-1].strip()
        lstIn.append(strSubline)
        print '>InSubline2>'+lstIn[len(lstIn)-1]+'<'
    del strIn, lstCut,cc
    print '>InReturLst>',lstIn
    return lstIn
<snip>
filIn = codecs.open(
    strFileNameIn,
    mode='r',
    encoding='utf-8',
    errors='strict',
    buffering=1)
for linIn in filIn:
    lstIn = CutLine2List(linIn,lstCutters)
--- Code end ------
A sample output, representing one line from the input file looks like this:
>InNextLine> I 30 2002-12-11 20:01:19.280
563 FANØ
2001-12-12-15.46.12.734502 2001-12-12-15.46.12.734502
>InSubline2>I<
>InSubline2>30<
>InSubline2>2002-12-11 20:01:19.280<
>InSubline2>563<
>InSubline2>FANØ<
>InSubline2>2001-12-12-15.46.12.73450<
>InSubline2>2001-12-12-15.46.12.73450<
>InReturLst> [u'I', u'30', u'2002-12-11 20:01:19.280', u'563',
u'FAN\xd8', u'2001-12-12-15.46.12.73450', u'2001-12-12-15.46.12.73450']
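A side note on the output above: the two timestamp fields end in ...73450,
while the input line shows ...734502. If lstCutters really holds 1-based
first and last positions, the slice strIn[cc[0]-1:cc[1]-1] appears to stop
one character short. The following is only a sketch under that assumption;
cut_fields and the sample line are made up for illustration and are not
part of the original code.

```python
def cut_fields(line, cutters):
    """Slice `line` using 1-based, inclusive (first, last) positions."""
    fields = []
    for first, last in cutters:
        # Python slices exclude the end index, so the 1-based inclusive
        # position `last` already serves as the exclusive end.
        fields.append(line[first - 1:last].strip())
    return fields

# Hypothetical sample line: one token at positions 1-3, one at 5-10.
sample = "abc 123456"
fields = cut_fields(sample, [[1, 3], [5, 10]])

# The same cut with the "-1" on the end index, for comparison.
short_field = sample[5 - 1:10 - 1]
```

Here fields keeps the full second token, while short_field loses its last
character.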
Question:
In the last printout, tagged >InReturLst>, all entries turn into
Unicode. What happens here?
Look for the word 'FANØ'. It changes from 'FANØ' to u'FAN\xd8'.
That's a problem for me, and I don't want it to change like this.
What do I do to stop this behavior?
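A short sketch of what seems to be going on, offered as an assumption
rather than a definitive diagnosis: codecs.open with an encoding returns
Unicode strings, and printing a whole list shows the repr() of each
element, which in Python 2 writes non-ASCII characters as escapes. The
string stored in the list is not changed. The variable names below are
illustrative only.

```python
# u'\xd8' is LATIN CAPITAL LETTER O WITH STROKE (U+00D8); the escape is
# only a way of writing that character, not a different string.
word = u'FAN\xd8'
fields = [u'563', word]

# The element stored in the list is still the same four-character text.
same_text = (fields[1] == word)
length = len(fields[1])

# Printing a list shows repr() of each element; joining (or printing) the
# elements themselves uses the actual text.
joined = u' | '.join(fields)

# Encode explicitly only where bytes are needed, e.g. when writing a file.
as_utf8 = word.encode('utf-8')
```

To display a row without the u'...' notation, print the joined string or
loop over the fields, instead of printing the list object itself.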
Best Regards
Martin