[Tutor] Fw: unicode nightmare
ALAN GAULD
alan.gauld at btinternet.com
Thu Nov 11 09:41:49 CET 2010
Forwarding to the list.
Alan Gauld
Author of the Learn To Program website
http://www.alan-g.me.uk/
----- Forwarded Message ----
> From: danielle davout <danielle.davout at gmail.com>
> To: Alan Gauld <alan.gauld at btinternet.com>
> Sent: Thursday, 11 November, 2010 4:19:22
> Subject: Re: [Tutor] unicode nightmare
>
> First, thanks for answering ... even though I didn't manage to
> formulate my question clearly.
> It already helps ;) I was confident enough that somebody in this
> helpful group would answer my S.O.S.
> that I managed a night of sleep without nightmares, and a night of
> sleep, even a short one, helps too :)
>
> I can now summarize my problem better, and it no longer looks
> like a unicode problem at all.
> (Below, in case it is needed, are some of the lines of code that led me
> to this conclusion.)
> In my script I construct a generator that I want to use later on; for
> this I use what I thought was an auxiliary (convenience) variable, I
> do my business ...
> and when I'm ready to use the generator it has changed "because" in
> the meantime I have reused the same variable name.
> If I go as far as removing the variable altogether later on with
> the statement del(v),
> I get a traceback complaining NameError: global name 'v' is not defined
> If I don't reuse the same name (say, as a quick test, I put an exit()
> before using it)
> I get no complaint.
> If I take care to change the name, no longer using a loop to
> construct all the generators I need, my script gives the results I
> expected.
>
> It looks very puzzling to me and I would like some pointers so that
> one day I can understand what has happened to me :)
> Thanks,
>
>
>
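A minimal sketch of the behaviour described above, using the simplified generator quoted further down. It assumes (this is a guess, not something the post confirms) that v was rebound to u'\u0ec4' somewhere later in the script. Only the outermost iterable of a generator expression is evaluated when the expression is created; the name v is looked up each time an item is produced, so the generator sees whatever v refers to at that moment.

```python
v = u'\u0eb4'
X = (1,)
# Only the first X is evaluated here; v is NOT read yet.
gen = ((v, v) for x in X for y in X)

# Hypothetical later reuse of the same name for another character.
v = u'\u0ec4'

# v is looked up now, while the generator is being consumed.
pairs = list(gen)
print(pairs == [(u'\u0ec4', u'\u0ec4')])  # prints True
```

The generator yields the later value of v, which would match the symptom of u'\u0ec4' appearing in the output file even though print v showed u'\u0eb4' right after the assignment.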
> On Thu, Nov 11, 2010 at 7:46 AM, Alan Gauld <alan.gauld at btinternet.com> wrote:
> >
> > "danielle davout" <danielle.davout at gmail.com> wrote
> >
> >> I simplify it to
> >> v = u'\u0eb4'
> >> X = (1,)
> >> gen = ((v ,v) for x in X for y in X)
> >>
> >> What can be so wrong in this line, or around it, that it gives the
> >> one-line file
> >> ໄ:ໄ
> >> where ໄ "is" not u'\u0eb4' but u'\u0ec4', though a direct printing
> >> looks OK
> >
> > The code will produce a one line file with v repeated twice.
> > Now why do you think the character is different?
> > What have you done to check it?
> >
> > What do you mean by a direct printing?
> >
> > print v
> >
> > maybe?
> yes
> >> To write the file corresponding to my nth generator of my list h I use
> >> def ecrire(n):
> >>     f = codecs.open("G" + str(n), "w", "utf8")
> >>     for x, tx in h[n]:
> >>         f.write((x + U":" + tx))
> >>         f.write('\n')
> >
> > Personally I'd use
> >
> > f.write(U"%s:%s\n" % (x,tx))
> ok
> > but that's largely a matter of style preference I guess.
> > But why do you have double parens in the first print?
> left over from several trials...
> >> But in its non-simplified form
> >> h.append( ((x + v + y, tr[x] + tr[v] + tr[y]) for x in CC for y in OFC) )
> >> before I have a chance to write anything in the file G5
> >> I have got the KeyError: u'\u0ec4'
> >> yes tr is a dictionary that doesn't have u'\u0ec4' as a key
> >> but tr[v] is well defined ...
> >
> > OK, but the error is valid in that case.
> > Which implies that you have bad data in CC.
>
> I'm sorry, I was not very clear on that point... neither CC nor OFC
> contains u'\u0ec4'
> and I have no problem with, for example,
> h.append(((u"ເ" + x + y, tr[x]+e2 +tr[y]) for x in CC for y in OFC))
> If I define tr[u'\u0ec4'] I obtain a result, but it is wrong in the
> first part x + v + y.
> Everything looks as if *the generator was not made with the value of v
> that I believe I gave it*
> > What exactly are you asking?
> If I write :
> 114 v = u'\u0eb4'
> 115 X = (1,)
> 116 h.append( ((v ,v) for x in X for y in X))
> 117 print v
> 118 #exit()
> v prints nicely
> If I remove the last comment, v still prints nicely but I get the traceback
> Traceback (most recent call last):
> File "/home/dan/workspace/lao/src/renaud/tt.py", line 261, in <module>
> ecrire(n)
> File "/home/dan/workspace/lao/src/renaud/tt.py", line 254, in ecrire
> for x, tx in h[n]:
> File "/home/dan/workspace/lao/src/renaud/tt.py", line 116, in <genexpr>
> h.append( ((v ,v) for x in X for y in X))
> NameError: global name 'v' is not defined
>
> Where has my variable v gone? Why is it considered global as soon as
> I use it in a generator?
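A small sketch that reproduces this kind of NameError, under the assumption that the script really does remove v with del before the generator is consumed. The body of a generator expression runs in its own function-like scope. v is not assigned inside that scope, so at module level it is treated as a global name and searched for only when the generator is iterated. If the name no longer exists by then, the lookup fails, and the traceback points back at the line where the generator was written.

```python
v = u'\u0eb4'
X = (1,)
gen = ((v, v) for x in X for y in X)

# Remove the name before anything has been read from the generator.
del v

try:
    list(gen)              # v is looked up here, and it is gone
    outcome = "no error"
except NameError:
    outcome = "NameError"

print(outcome)  # prints NameError
```

The wording of the message differs between Python versions (Python 2 says "global name 'v' is not defined"), but the exception type is the same.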
>
> If in 114 I write
> v = w = u'\u0eb4'
> and then delete w with del(w) and still try to print it, I am only
> told that "name 'w' is not defined"
> Why, by changing the value of v 200 lines below, can I change the
> value of the generator?
> It is definitely not a unicode problem but a global-variable one.
> If I write
> w = u'\u0eb4'
> h.append( ((x + w + y, tr[x] + tr[w] + tr[y]) for x in CC for y in OFC) )
> my file G5 is generated
> ກິກ:kik
> ກິມ:kim
> ກິວ:kiv
> .....
>
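Using a fresh name such as w works because nothing rebinds it later. Two other possible ways to make a generator keep the value v had when it was created are sketched below; this is an illustration, not the code from the original script. The helper name make_pairs is hypothetical. Passing v as a function argument gives the generator its own local binding, and building a list instead of a generator evaluates everything immediately.

```python
def make_pairs(v, X):
    """Return a generator that uses the v passed in, not the module-level v."""
    return ((v, v) for x in X for y in X)

h = []
v = u'\u0eb4'
X = (1,)

# Option 1: v is captured as a function argument.
h.append(make_pairs(v, X))
# Option 2: a list comprehension is evaluated right away.
h.append([(v, v) for x in X for y in X])

# Reusing the name afterwards no longer affects what is stored in h.
v = u'\u0ec4'

results = [list(g) for g in h]
print(results == [[(u'\u0eb4', u'\u0eb4')], [(u'\u0eb4', u'\u0eb4')]])  # prints True
```

The list version costs memory for large inputs, so the helper function may be preferable when the generators are built in a loop and consumed much later.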