is there a way to get the encoding of python file

Thomas Jollans thomas at jollybox.de
Mon Sep 13 16:56:26 EDT 2010


On Monday 13 September 2010, it occurred to Robert Kern to exclaim:
> On 9/13/10 2:00 PM, Stef Mientki wrote:
> >   On 12-09-2010 19:28, Robert Kern wrote:
> >> On 9/12/10 4:14 AM, Stef Mientki wrote:
> >>>    hello,
> >>> 
> >>> Is it possible to get the encoding of a python file from the first
> >>> source line, (if there's any),
> >>> after importing it ( with '__import__' )
> >>> 
> >>> # -*- coding: windows-1252 -*-
> >> 
> >> The regular expression used to match the encoding declaration is given
> >> here:
> >> 
> >> http://docs.python.org/reference/lexical_analysis.html#encoding-declarations
> > 
> > yes, but then I've to read the first line of the file myself.
> > 
> > In the meantime I found another (better?) solution (I'm using Python
> > 2.6):
> > 
> > 
> > Place these 2 lines at the top of the file
> > # -*- coding: windows-1252 -*-
> > from __future__ import unicode_literals
> > 
> > or these
> > # -*- coding: utf-8 -*-
> > from __future__ import unicode_literals
> > 
> > then you always get the correct unicode string back.
> 
> Ah. I see. You don't actually need to know the encoding; you just want to
> use literals with raw, unescaped characters embedded in them.
> 
> This may interfere with the cases when you need a real str object.

If you assume that unicode_literals from __future__ works, you can also assume
that the b'bytestring' syntax works: both were added in Python 2.6, so a real
str object is still available by writing b'...'.
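
A small sketch of what I mean, not tested beyond the obvious. The names MODULE_SOURCE, text and data are made up for the example, and the module source is kept in a string only so that the __future__ import is certain to be its first statement wherever this gets pasted:

```python
# Source of a tiny module, held in a string so that the __future__ import
# is guaranteed to be the first statement when it is compiled.
MODULE_SOURCE = (
    "from __future__ import unicode_literals\n"
    "text = 'stra\\xdfe'   # plain quotes now give a unicode string\n"
    "data = b'strasse'    # the b prefix still gives a real byte string\n"
)

namespace = {}
exec(compile(MODULE_SOURCE, "<demo>", "exec"), namespace)

text = namespace["text"]
data = namespace["data"]

# On Python 2.6+ these are unicode and str; on Python 3, str and bytes.
print(type(text).__name__)
print(type(data).__name__)
```

In a real file you would of course just put the import at the top and write the literals directly.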

> In
> Python 2.x, if you want a unicode literal, just use one like so: u'ß'. As
> long as the encoding declaration is correct, this will work just fine.
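
As for the original question: reading the declaration yourself is only a few lines. Here is a rough sketch using the regular expression from the language reference Robert linked to. The function name declared_encoding is my own invention, and it only looks at comments on the first two lines, which is a simplification of what the interpreter does (no BOM handling, for one):

```python
import io
import re

# Pattern from the language reference's section on encoding declarations.
CODING_RE = re.compile(br"coding[=:]\s*([-\w.]+)")


def declared_encoding(readline):
    """Return the declared source encoding, or None if there is none.

    `readline` is any callable returning one line of bytes per call,
    e.g. the readline method of a file opened in binary mode.
    """
    for _ in range(2):  # the declaration must be on line 1 or 2
        line = readline()
        if not line.lstrip().startswith(b"#"):
            continue  # only comment lines can carry the declaration
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return None


# Example with an in-memory "file"; a real one would be open(path, "rb").
source = b"# -*- coding: windows-1252 -*-\nprint('hi')\n"
print(declared_encoding(io.BytesIO(source).readline))
```

On Python 3 the standard library does this for you: tokenize.detect_encoding() takes the same kind of readline callable and also handles the BOM.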


