Encoding problem when launching Python27 via DOS

Jean-Pierre M pythonrubylang at gmail.com
Mon Apr 11 05:02:26 EDT 2011


Thanks a lot for this quick answer! It is very clear!

To better understand the difference between encoding and decoding, I
found the following website: http://www.evanjones.ca/python-utf8.html
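
As a minimal sketch of that difference (the variable names are only illustrative and do not come from my program): encoding turns a Unicode string into bytes, and decoding turns bytes back into a Unicode string, provided the same codec is used in both directions.

```python
# -*- coding: utf-8 -*-
# Sketch only: names are illustrative, not taken from the original program.

text = u"Log à accents"            # Unicode string: a sequence of characters

# Encoding: characters -> bytes. In UTF-8, "à" takes two bytes.
utf8_bytes = text.encode("utf-8")

# The same characters give different bytes in cp1252, where "à" is one byte.
cp1252_bytes = text.encode("cp1252")

# Decoding: bytes -> characters. The codec must match the one used to encode.
round_trip = utf8_bytes.decode("utf-8")

print(repr(utf8_bytes))
print(repr(cp1252_bytes))
print(round_trip == text)
```

The byte strings differ between the two codecs even though the text is the same, which is why the codec has to be stated explicitly whenever bytes cross a boundary (file, console, network).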

I changed the program as follows:
#!/usr/bin/python
# -*- coding: utf8 -*-
# (no more cp1252: the source file itself is now saved as UTF-8)
'''
Created on 27 déc. 2010

@author: jpmena
'''
from datetime import datetime
import locale
import codecs
import os,sys

class Log(object):
    # Log.log holds the single shared Log instance (see getInstance)
    log=None
    def __init__(self,log_path):
        self.log_path=log_path
        if(os.path.exists(self.log_path)):
            os.remove(self.log_path)
        #self.log=open(self.log_path,'a')
        # codecs.open encodes unicode strings to UTF-8 when writing
        self.log=codecs.open(self.log_path, "a", 'utf-8')

    def getInstance(log_path=None):
        print "encodage systeme:"+sys.getdefaultencoding()
        if Log.log is None:
            if log_path is None:
                log_path=os.path.join(os.getcwd(),'logParDefaut.log')
            Log.log=Log(log_path)
        return Log.log

    getInstance=staticmethod(getInstance)

    def p(self,msg):
        aujour_dhui=datetime.now()
        date_stamp=aujour_dhui.strftime("%d/%m/%y-%H:%M:%S")
        print sys.getdefaultencoding()
        # msg is a UTF-8 byte string: decode it to unicode before formatting
        unicode_str='%s : %s \n'  % (date_stamp,unicode(msg,'utf-8'))
        #unicode_str=msg
        self.log.write(unicode_str)
        return unicode_str

    def close(self):
        self.log.flush()
        self.log.close()
        return self.log_path

if __name__ == '__main__':
    l=Log.getInstance()
    l.p("premier message de Log à accents")
    Log.getInstance().p("second message de Log")
    l.close()

The DOS console output is now:
C:\Documents and Settings\jpmena\Mes documents\VelocityRIF\VelocityTransforms>generationProgrammeSitePublicActuel.cmd
Page de codes active : 1252
encodage systeme:ascii
ascii
encodage systeme:ascii
ascii
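
The "ascii" printed here is, as far as I understand it, the codec Python 2 uses for implicit conversions between byte strings and unicode strings, which is separate from the console code page (1252). A small sketch to compare the different encodings Python reports; describe_encodings is a hypothetical helper, not part of my program, and the values depend on the Python version and platform:

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python reports (hypothetical helper)."""
    return {
        # Codec used for implicit byte string <-> unicode conversions.
        "default": sys.getdefaultencoding(),
        # Encoding of the console attached to stdout; may be None when
        # the output is redirected to a file or a pipe.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding suggested by the user's locale settings.
        "locale": locale.getpreferredencoding(),
    }


for name, value in sorted(describe_encodings().items()):
    print("%s: %s" % (name, value))
```

On a Python 2.7 console set to code page 1252, I would expect "default" to stay at ascii while "stdout" reflects the code page, but I have not checked every configuration.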

And the generated log file now shows the expected result:
11/04/11-10:53:44 : premier message de Log à accents
11/04/11-10:53:44 : second message de Log
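
The same idea can be sketched in a shorter, self-contained form. This is not my actual Log class: write_log_line and the temporary path are hypothetical, and io.open is swapped in for codecs.open on the assumption that the two behave alike for this purpose.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from datetime import datetime


def write_log_line(path, message):
    """Append one time-stamped Unicode line to a UTF-8 file (hypothetical helper)."""
    stamp = datetime.now().strftime("%d/%m/%y-%H:%M:%S")
    line = u"%s : %s\n" % (stamp, message)
    # io.open encodes the Unicode text to UTF-8 bytes on write.
    with io.open(path, "a", encoding="utf-8") as log:
        log.write(line)
    return line


# Temporary location so the sketch does not touch any real log file.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")
write_log_line(log_path, u"premier message de Log à accents")

# Reading back with the same codec restores the accented character.
with io.open(log_path, "r", encoding="utf-8") as log:
    content = log.read()
```

Passing unicode strings to the logger and letting the file object do the encoding keeps the conversion in a single place.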

Thanks.

If you have other links of interest about Unicode encoding and decoding in
Python, they are welcome.

2011/4/10 MRAB <python at mrabarnett.plus.com>

> On 10/04/2011 13:22, Jean-Pierre M wrote:
> > I created a simple program which writes some French text with accents
> > to a Unicode file!
> [snip]
> This line:
>
>
>    l.p("premier message de Log à accents")
>
> passes a bytestring to the method, and inside the method, this line:
>
>
>    unicode_str=u'%s : %s \n'  % (date_stamp,msg.encode(self.charset_log,'replace'))
>
> it tries to encode the bytestring to Unicode.
>
> It's not possible to encode a bytestring, only a Unicode string, so
> Python tries to decode the bytestring using the fallback encoding
> (ASCII) and then encode the result.
>
> Unfortunately, the bytestring isn't ASCII (it contains accented
> characters), so it can't be decoded as ASCII, hence the exception.
>
> BTW, it's probably better to forget about cp1252, etc, and use UTF-8
> instead, and also to use Unicode wherever possible.
> --
> http://mail.python.org/mailman/listinfo/python-list
>
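
The explanation quoted above can be sketched by spelling out the two steps that Python 2 performs implicitly when .encode() is called on a byte string. encode_like_python2 is a hypothetical helper written only for illustration, and cp1252 is assumed as the target codec.

```python
# -*- coding: utf-8 -*-
# Sketch only: spells out the implicit decode-then-encode explicitly.

# A byte string containing a non-ASCII character.
raw = u"Log à accents".encode("utf-8")


def encode_like_python2(byte_string, target="cp1252"):
    """Hypothetical helper: decode with ASCII first, then encode."""
    return byte_string.decode("ascii").encode(target, "replace")


try:
    encode_like_python2(raw)
    failed = False
except UnicodeDecodeError as exc:
    # The non-ASCII bytes cannot be decoded, so the encode step never runs.
    failed = True
    print("decode step failed: %s" % exc.reason)

# Decoding explicitly with the right codec avoids the implicit ASCII step.
fixed = raw.decode("utf-8").encode("cp1252", "replace")
print(repr(fixed))
```

Note that the 'replace' error handler only applies to the encode step, so it does not prevent the failure in the decode step.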