print UTF-8 file with BOM

Kevin Yuan farproc at gmail.com
Fri Dec 23 15:47:30 CET 2005


Sorry, I'm a newbie in Python. I can't help you further; in fact, I don't
know either. :)

2005/12/23, David Xiao <davihigh at gmail.com>:
>
> Hi Kuan:
>
> Thanks a lot! One more question: how should I write this if I want to
> specify a locale other than the current one?
>
> For example, running on a system with a Korean locale and trying to read
> a UTF-8 file that contains Chinese text.
>
> Regards, David
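
One possible way to target an encoding other than the current locale's is to encode the text explicitly before writing it out. The following is only a rough sketch, assuming Python 3; `encode_for_target` is a hypothetical helper name, and "gbk" is just an example target for Chinese text.

```python
import locale

def encode_for_target(text, encoding=None, errors="replace"):
    """Encode text for an explicitly chosen encoding (hypothetical helper)."""
    # Fall back to the system's preferred encoding when none is named.
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    # errors="replace" emits "?" for characters the target encoding cannot
    # represent, instead of raising UnicodeEncodeError.
    return text.encode(encoding, errors)

# Chinese text encoded for a Chinese code page, whatever the system locale is.
gbk_bytes = encode_for_target("中文", "gbk")
```

In Python 3 the resulting bytes could then be written with `sys.stdout.buffer.write(gbk_bytes)`. Whether the console displays them correctly still depends on the terminal's own code page.
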
>
>
>
>
> 2005/12/23, Kevin Yuan <farproc at gmail.com>:
> > import codecs
> > def read_utf8_txt_file (filename):
> >     fileObj = codecs.open( filename, "r", "utf-8" )
> >     content = fileObj.read()
> >     if content.startswith(u"\ufeff"):
> >         content = content[1:]  # exclude the BOM, but only if it is present
> >     print content
> >     fileObj.close()
> >
> > read_utf8_txt_file("e:\\u.txt")
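
As an alternative to slicing off the first character by hand, the "utf-8-sig" codec may be worth a look: it decodes UTF-8 and drops a leading BOM only when one is there. This is a minimal sketch assuming Python 3; `decode_utf8_bytes` and `read_utf8_text` are illustrative names, not part of the code quoted above.

```python
def decode_utf8_bytes(data):
    # "utf-8-sig" removes one leading BOM if present; data without a BOM
    # is decoded as plain UTF-8 and returned unchanged.
    return data.decode("utf-8-sig")

def read_utf8_text(filename):
    # Read the raw bytes, then decode them with BOM handling.
    with open(filename, "rb") as f:
        return decode_utf8_bytes(f.read())
```

With this, a file that has no BOM keeps its first character, which the unconditional `content[1:]` would have dropped.
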
> >
> > 22 Dec 2005 18:12:28 -0800, davihigh at gmail.com < davihigh at gmail.com>:
> > > Hi Friends:
> > >
> > >         fileObj = codecs.open( filename, "r", "utf-8" )
> > >         u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
> > >         print u
> > >
> > > It fails with this error:
> > >         UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
> > >
> > > I want to know how to read a UTF-8 file, convert the text to the
> > > encoding of a specified locale (by default, the current system
> > > locale), and print the resulting string. I would also like the BOM
> > > header to be removed automatically.
> > >
> > > Rgds, David
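
The error above seems to come from the BOM character itself: U+FEFF has no representation in GBK, so encoding for the console fails at position 0. The sketch below, assuming Python 3, checks this and shows that the text encodes once the BOM is stripped; `can_encode` and `strip_bom` are hypothetical helper names.

```python
BOM = "\ufeff"

def can_encode(text, encoding):
    # True if every character in text is representable in the encoding.
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

def strip_bom(text):
    # Remove a single leading BOM character, if there is one.
    return text[1:] if text.startswith(BOM) else text

decoded = BOM + "中文"  # what a plain "utf-8" decode of a BOM-prefixed file yields
```

Here `can_encode(decoded, "gbk")` is false while `can_encode(strip_bom(decoded), "gbk")` is true, which matches the traceback pointing at position 0.
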
> > >