[Tutor] unicode problem

Paul Tremblay phthenry@earthlink.net
Wed Apr 30 20:01:32 2003


Now I can't reproduce my problem! I have played around a bit with
unicode. As you discovered below (and as I am only now realizing), you
can concatenate any string with a utf-8 string as long as both strings
are utf-8, or one string is utf-8 and the other has only values below 128.

My code should not have broken. Sax translates everything to utf-8. The
module that broke the string tries to add stuff to the string Sax passed
to it (utf-8), but this stuff has a range below 128. 
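
A minimal sketch of the rule described above. It assumes Python 3, which
this thread predates: there the ASCII decode that older Pythons applied
implicitly during concatenation has to be written out, so the helper
concat() below is hypothetical and only stands in for that implicit step.

```python
# Python 3 sketch: the byte string is decoded as ASCII explicitly,
# mirroring what the interpreter in this thread did behind the scenes.

def concat(text, raw):
    """Append a byte string to a text string, decoding it as ASCII first."""
    return text + raw.decode("ascii")

# Every byte is below 128, so the decode and the concatenation succeed.
ok = concat("\xf6", b"abc")

# A byte of 128 or above cannot be decoded as ASCII, so this raises.
try:
    concat("\xf6", b"\xf6")
    failed = False
except UnicodeError:
    failed = True
```

The second call fails for the same reason as the traceback quoted below:
the byte 0xf6 is outside the ASCII range.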

I think I'll try this in my code, just to make sure:

try:
    line = line + filler + padding + border + "\n"
except UnicodeError:
    filler = convert_to_utf_8_func(filler)
    padding = convert_to_utf_8_func(padding)
    border = convert_to_utf_8_func(border)
    line = line + filler + padding + border + '\n'
    
def convert_to_utf_8_func(my_string):
    # Fill in code. I can read one character at a time, get the
    # value, and use unichr(num), but another thread a few
    # days later has a better way to do this.
    pass
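
One possible way to fill in that stub, sketched for Python 3 (an
assumption; in the Python of this thread the equivalent call would be
unicode(my_string, 'utf-8')). The encoding parameter and the sample
values for filler, padding and border are made up for illustration, and
the sketch assumes the byte strings really are UTF-8.

```python
def convert_to_utf_8_func(my_string, encoding="utf-8"):
    """Return my_string as text, decoding it first if it is a byte string."""
    if isinstance(my_string, bytes):
        # Decode explicitly instead of relying on an ASCII default.
        return my_string.decode(encoding)
    return my_string

# Hypothetical sample values; padding holds the UTF-8 bytes for o-umlaut.
line = "\xf6"
filler = b" "
padding = b"\xc3\xb6"
border = b"|"

line = (line
        + convert_to_utf_8_func(filler)
        + convert_to_utf_8_func(padding)
        + convert_to_utf_8_func(border)
        + "\n")
```

Decoding every piece up front means the concatenation never has to guess
an encoding, so the try/except fallback would no longer be needed.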

Thanks

Paul

On Mon, Apr 28, 2003 at 10:13:30AM -0700, Danny Yoo wrote:

>
> > The error message you report:
> >
> > >  File "/home/paul/lib/python/paul/format_txt.py", line 159, in r_border
> > >     line = line + filler + padding + border + "\n"
> > > UnicodeError: ASCII decoding error: ordinal not in range(128)
> >
> >
> > doesn't smell right to me --- for the life of me, I can't imagine why
> > string concatenation would raise that kind of error.
> 
> 
> Oh.  Never mind.
> 
> ###
> >>> x, y = u'\xf6', '\xf6'
> >>> x + y
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: ASCII decoding error: ordinal not in range(128)
> ###
> 
> 
> Well, at least now we have a test case we can work on.  *grin*
> 
> 
> 
> I think that the concatenation causes Python to promote the second string y
> to a unicode string.  At least, it looks like unicode()ing a
> high-byte character can cause the decoding error:
> 
> ###
> >>> unicode('\xf6')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: ASCII decoding error: ordinal not in range(128)
> ###
> 
> 
> I'm actually not quite sure how to solve this yet; I'm not familiar with
> Unicode at all, so I think I might need to tinker with this problem a bit.

-- 

************************
*Paul Tremblay         *
*phthenry@earthlink.net*
************************