[Tutor] unicode nightmare
Alan Gauld
alan.gauld at btinternet.com
Thu Nov 11 01:46:26 CET 2010
"danielle davout" <danielle.davout at gmail.com> wrote
> I simplify it to
> v = u'\u0eb4'
> X = (1,)
> gen = ((v ,v) for x in X for y in X)
>
> What can be so wrong in this line, or around it, to give the one-line file
> ໄ:ໄ
> where ໄ "is" not u'\u0eb4' but u'\u0ec4' though a direct printing
> looks OK
The code will produce a one-line file containing v twice, separated by a colon.
Now why do you think the character is different?
What have you done to check it?
What do you mean by a direct printing?
print v
maybe?
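A more reliable check than looking at the glyph is to ask Python for the
code points themselves, since the two Lao vowel signs are easy to confuse on
screen. A minimal sketch (the helper name code_points is mine, not something
from your code):

```python
import unicodedata

def code_points(s):
    """Return a (U+XXXX, Unicode name) pair for each character in s."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN")) for ch in s]

v = u'\u0eb4'
print(code_points(v))           # the character you expect
print(code_points(u'\u0ec4'))   # the character you say you are getting
```

Running the same helper on a line read back from the file would tell you
which of the two was really written.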
> To write the file corresponding to my nth generator of my list h I
> use
> def ecrire(n):
>     f = codecs.open("G"+str(n),"w","utf8")
>     for x, tx in h[n]:
>         f.write((x + U":"+ tx))
>         f.write('\n')
Personally I'd use
f.write(U"%s:%s\n" % (x,tx))
but that's largely a matter of style preference I guess.
But why do you have double parens in the first write call?
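To show what I mean by the format-string version, here is a sketch of the
same function. I've used io.open rather than codecs.open, plus a with block
so the file gets closed; the pairs argument and the temporary path are
placeholders standing in for your h[n] and "G"+str(n).

```python
import io
import os
import tempfile

def ecrire(path, pairs):
    """Write each (x, tx) pair as 'x:tx' on its own line, encoded as UTF-8."""
    with io.open(path, "w", encoding="utf8") as f:
        for x, tx in pairs:
            f.write(u"%s:%s\n" % (x, tx))

# Placeholder data: a single pair, like the simplified generator above.
v = u'\u0eb4'
path = os.path.join(tempfile.mkdtemp(), "G0")
ecrire(path, [(v, v)])

# Read the file back so the result can be inspected as code points.
with io.open(path, encoding="utf8") as f:
    content = f.read()
```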
> But in its non simplified form
> h.append( ((x + v + y ,tr[x]+ tr[v]+ tr[y]) for x in CC for y in
> OFC) )
> before I have a chance to write anything in the file G5
> I have got the KeyError: u'\u0ec4'
> yes tr is a dictionary that doesn't have u'\u0ec4' as a key
> but tr[v] is well defined ...
OK, but then the KeyError is legitimate,
which implies that you have bad data in CC or OFC.
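One way to find the offending character is to list everything in your input
sequences that has no entry in tr. A sketch with made-up data, since I don't
know what CC, OFC and tr really contain:

```python
# Made-up stand-ins for the real tr, CC and OFC.
tr = {u'\u0e81': u'k', u'\u0eb4': u'i'}
CC = [u'\u0e81', u'\u0ec4']
OFC = [u'\u0e81']

def missing_keys(chars, table):
    """Return the characters in chars that are not keys of table."""
    return [c for c in chars if c not in table]

# Report the missing characters as code points, not glyphs.
for name, seq in (("CC", CC), ("OFC", OFC)):
    bad = missing_keys(seq, tr)
    print("%s: %s" % (name, ["U+%04X" % ord(c) for c in bad]))
```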
What exactly are you asking?
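One more thing worth ruling out, and this is a guess since I can't see the
code around that line: a generator expression is lazy, so names such as v
are looked up when the generator is consumed, not when it is created. If v
is rebound afterwards (say by a for loop over several vowels), every
generator already stored in h will see the last value. A sketch of the
effect, using a loop I've invented for illustration:

```python
def build(lazy):
    X = (1,)
    h = []
    for v in (u'\u0eb4', u'\u0ec4'):
        if lazy:
            # Generator: v is looked up only when the generator is consumed.
            h.append(((v, v) for x in X for y in X))
        else:
            # List: evaluated immediately, so the current v is used.
            h.append([(v, v) for x in X for y in X])
    # Consume everything only after the loop has finished rebinding v.
    return [list(g) for g in h]

late = build(lazy=True)     # both entries contain the last v, u'\u0ec4'
eager = build(lazy=False)   # each entry contains its own v
print(late == eager)        # False
```

If that turns out to be what is happening, building lists instead of
generators would avoid it.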
--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/