How to display Chinese in a list retrieved from database via python

Mark Tolonen metolone+gmane at
Mon Dec 29 20:19:48 CET 2008

"zxo102" <zxo102 at> wrote in message 
news:7e38e76a-d5ee-41d9-9ed5-73a2e2993733 at
> On Dec 29, 5:06 pm, "Mark Tolonen" <metolone+gm... at> wrote:
>> "zxo102" <zxo... at> wrote in message
>> news:2560a6e0-c103-46d2-aa5a-8604de4d1968 at


>> That said, learn to use Unicode strings by trying the following program,
>> but set the first line to the encoding *your editor* saves files in.  You
>> can use the actual Chinese characters instead of escape codes this way.
>> The encoding used for the source code and the encoding used for the html
>> file don't have to match, but the charset declared in the file and the
>> encoding used to write the file *do* have to match.
>> # coding: utf8
>> import codecs
>> mydict = {}
>> mydict['JUNK'] = [u'中文',u'中文',u'中文']
>> def conv_list2str(value):
>>     # Build a JavaScript array literal from a list of Unicode strings.
>>     return u'["' + u'","'.join(value) + u'"]'
>> f_str = u'''<html><head>
>> <META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
>> <title>test</title>
>> <script language=javascript>
>> var test = %s
>> alert(test[0])
>> alert(test[1])
>> alert(test[2])
>> </script>
>> </head>
>> <body></body></html>'''
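>> s = conv_list2str(mydict['JUNK'])
>> # 'test.html' is an assumed file name; the codec passed here must match
>> # the charset declared in the META tag above.
>> f = codecs.open('test.html', 'w', encoding='gb2312')
>> f.write(f_str % s)
>> f.close()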
>> -Mark
>> P.S.  Python 3.0 makes this easier for what you want to do, because the
>> representation of a dictionary changes.  You'll be able to skip the
>> conv_list2str() function and all strings are Unicode by default.
> Thanks for your comments, Mark. I understand it now. The list (escape
> codes): ['\xd6\xd0\xce\xc4','\xd6\xd0\xce\xc4','\xd6\xd0\xce\xc4'] is
> from a postgresql database with a "select" statement. I will check the
> postgresql database configuration and see if it is possible to return
> ['中文','中文','中文'] directly with a "select" statement.
> ouyang

The trick to working with Unicode is to convert anything read into the
program (from a file, database, etc.) to Unicode characters, manipulate it,
then convert it back to a specific encoding when writing it out.  So if
postgresql is returning gb2312 data, use:

data.decode('gb2312') to get the Unicode equivalent:

>>> '\xd6\xd0\xce\xc4'.decode('gb2312')
u'\u4e2d\u6587'
>>> print '\xd6\xd0\xce\xc4'.decode('gb2312')
中文
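
As a rough sketch of that decode, manipulate, encode cycle, here is how it
might look in Python 3 syntax, where byte strings and text are separate
types.  The byte strings stand in for what the database returned, and the
names rows, decode_rows and encode_rows are made up for illustration.

```python
# Hypothetical stand-in for the raw gb2312 byte strings returned by the query.
rows = [b'\xd6\xd0\xce\xc4', b'\xd6\xd0\xce\xc4', b'\xd6\xd0\xce\xc4']

def decode_rows(raw_rows, encoding='gb2312'):
    """Convert raw byte strings to Unicode text on the way in."""
    return [raw.decode(encoding) for raw in raw_rows]

def encode_rows(text_rows, encoding='gb2312'):
    """Convert Unicode text back to byte strings on the way out."""
    return [text.encode(encoding) for text in text_rows]

text_rows = decode_rows(rows)       # work with Unicode inside the program
joined = ','.join(text_rows)        # manipulate the data as text, not bytes
output = joined.encode('utf-8')     # choose the target encoding only when writing
```

The output encoding does not have to be the one the data arrived in; it only
has to match whatever the consumer of the output expects.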

Google for some Python Unicode tutorials.
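
For the Python 3 route mentioned in the quoted P.S., a minimal sketch might
look like the following.  It is not the program quoted above: json.dumps is
swapped in to build the JavaScript array, and the file name and the variable
names are assumptions chosen for the example.

```python
import json
import os
import tempfile

words = ['中文', '中文', '中文']   # in Python 3 these are Unicode strings already

# ensure_ascii=False keeps the Chinese characters instead of \uXXXX escapes,
# and the JSON array doubles as a JavaScript array literal.
js_array = json.dumps(words, ensure_ascii=False)

page = '''<html><head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
<title>test</title>
<script language=javascript>
var test = %s
alert(test[0])
</script>
</head>
<body></body></html>''' % js_array

# Assumed output location; the encoding passed to open() must match the
# charset declared in the META tag.
path = os.path.join(tempfile.gettempdir(), 'unicode_demo.html')
with open(path, 'w', encoding='gb2312') as f:
    f.write(page)
```

As in the original program, the encoding of the source file is independent
of the encoding of the html file; only the declared charset and the encoding
used to write the file have to agree.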

