Characters aren't displayed correctly

Philip Semanchuk philip at semanchuk.com
Mon Mar 2 10:22:38 EST 2009


On Mar 2, 2009, at 9:50 AM, Hussein B wrote:

> On Mar 2, 4:31 pm, John Machin <sjmac... at lexicon.net> wrote:
>> On Mar 2, 7:30 pm, Hussein B <hubaghd... at gmail.com> wrote:
>>
>>> On Mar 1, 4:51 pm, Philip Semanchuk <phi... at semanchuk.com> wrote:
>>>> What are you getting out of the database? Is it being converted to
>>>> Unicode correctly, or at all?
>>
>>> I don't know, how to make sure of this point?

Personally, I'd add a debug breakpoint just after extracting the  
characters from the database, like so:

    import pdb
    pdb.set_trace()

When you're stopped at the breakpoint, examine the string you get  
back. Is it what you expect? For instance, is it Unicode?

    isinstance(my_string, unicode)

Or maybe you're expecting a utf-8 encoded string, so examine one of  
the non-ASCII characters. Is it really utf-8 encoded?

 >>> my_string = u"snö".encode("utf-8")
 >>> my_string[0]
's'
 >>> my_string[1]
'n'
 >>> my_string[2]
'\xc3'
 >>> my_string[3]
'\xb6'
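
If you'd rather not eyeball the bytes one at a time, the same check can be sketched as a small helper. This assumes Python 3, where the `unicode` type is spelled `str` and an encoded string is `bytes`; the name `describe` is made up for illustration.

```python
def describe(value):
    """Say whether value is text, UTF-8 decodable bytes, or something else."""
    if isinstance(value, str):
        # Already decoded text (what Python 2 calls unicode).
        return "text"
    if isinstance(value, bytes):
        try:
            value.decode("utf-8")
        except UnicodeDecodeError:
            return "bytes, not valid UTF-8"
        return "bytes, valid UTF-8"
    return "neither text nor bytes"


print(describe("snö"))
print(describe("snö".encode("utf-8")))
print(describe("snö".encode("latin-1")))
```

Keep in mind that decoding without an error only shows the bytes are consistent with UTF-8. Pure ASCII data passes this check no matter which encoding the database really uses.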


Since you feel pretty confident that you're getting what you expect  
out of the database, maybe you want to eliminate that from  
consideration. As a test, construct "by hand" a string that represents  
the email message you're trying to send. If you send that with the  
proper content-type header and you still don't get the results you  
want, then we can all stop discussing the database. Make sense?
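
A hand-built test message might look something like the sketch below. It assumes Python 3 and the standard `email` package; the function name and the example.com addresses are placeholders, and actually sending the message (with `smtplib`, say) is left out.

```python
from email.mime.text import MIMEText


def build_test_message(body, sender, recipient, subject="Test"):
    """Build a plain-text message whose Content-Type header declares UTF-8."""
    # Passing the charset makes MIMEText encode the body and set
    # Content-Type: text/plain; charset="utf-8"
    msg = MIMEText(body, "plain", "utf-8")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    return msg


msg = build_test_message(
    "Hello world", "sender@example.com", "recipient@example.com"
)
print(msg["Content-Type"])
```

Printing `msg.as_string()` shows exactly what would go over the wire, which makes it easy to confirm the header and the encoded body agree before any mail server gets involved.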

Forget about the HTML markup, too. That's just a distraction. Start  
with the simplest problem first, and then add pieces on.

See if you can successfully construct and send an email that says  
"Hello world" in English/ASCII. If that works, change it to Arabic. If  
that works, change the email format to HTML. If that works, start  
pulling the content from the database. If that works, then you're  
done. =)
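
The first three steps can be roughed out locally before anything is sent. This is only a sketch, assuming Python 3 and the standard `email` package; the stage names and test bodies are invented, and the database step is not covered. It checks that each body survives being serialised and parsed again, not that your mail server or mail client handles it correctly.

```python
import email
from email.mime.text import MIMEText

# Hypothetical test bodies, one per stage. The Arabic text says "Hello world".
STAGES = [
    ("ascii text", "plain", "Hello world"),
    ("arabic text", "plain", "مرحبا بالعالم"),
    ("arabic html", "html", "<p>مرحبا بالعالم</p>"),
]


def round_trips(subtype, body):
    """Build a UTF-8 message and check the body survives serialising and parsing."""
    msg = MIMEText(body, subtype, "utf-8")
    parsed = email.message_from_string(msg.as_string())
    return parsed.get_payload(decode=True).decode("utf-8") == body


results = {name: round_trips(subtype, body) for name, subtype, body in STAGES}
for name, ok in results.items():
    print(name, "ok" if ok else "FAILED")
```

The first stage that fails is the one to look at; everything before it can be ruled out.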

bye
Philip