[Python-ideas] Python 3000 TIOBE -3%

Terry Reedy tjreedy at udel.edu
Sat Feb 11 18:43:41 CET 2012

On 2/11/2012 12:00 PM, Masklinn wrote:
> On 2012-02-11, at 17:44 , Terry Reedy wrote:
>> On 2/11/2012 5:47 AM, Paul Moore wrote:
>>> I have a text file, in an unknown encoding (yes, it does happen
>>> to me!) but opening in an editor shows it's mainly-ASCII. I want
>>> to find all the lines starting with a '*'. The simple
>>> with open('myfile.txt') as f:
>>>   for line in f:
>>>     if line.startswith('*'): print(line)
>>> fails with encoding errors. What do I do?
>> Good example. I believe adding ", encoding='latin-1'" to open() is
>> sufficient.
> Why not open the file in binary mode instead? (and replace `'*'` by
> `b'*'` in the startswith call)

When I wrote that response, I thought that 'for line in f' would not 
work for binary-mode files. I then opened IDLE, experimented with 'rb', 
and discovered otherwise. So the remaining issue is how one wants the
bytes of unknown encoding to appear when printed -- as hex escapes, or as
arbitrary but more readable non-ASCII Latin-1 characters.
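
To make the difference concrete, here is a minimal sketch of both approaches. The sample bytes, the temporary file, and the function names are made up for illustration; they are not from Paul's actual file.

```python
import os
import tempfile

# Hypothetical sample data: mostly ASCII, with one non-ASCII byte (0xE9).
SAMPLE = b"* caf\xe9\nplain line\n* second\n"


def starred_lines_binary(path):
    """Read in binary mode; each line is a bytes object."""
    with open(path, 'rb') as f:
        return [line for line in f if line.startswith(b'*')]


def starred_lines_latin1(path):
    """Read as text decoded as latin-1; every byte maps to some character."""
    with open(path, encoding='latin-1') as f:
        return [line for line in f if line.startswith('*')]


def demo():
    """Write SAMPLE to a temporary file and filter it both ways."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(SAMPLE)
        path = tmp.name
    try:
        return starred_lines_binary(path), starred_lines_latin1(path)
    finally:
        os.remove(path)


if __name__ == '__main__':
    as_bytes, as_text = demo()
    for line in as_bytes:
        print(line)           # bytes repr, non-ASCII bytes shown as \x escapes
    for line in as_text:
        print(ascii(line))    # ascii() keeps the output console-safe
```

Printing the bytes lines shows the unknown byte as \xe9, while the latin-1 lines carry it as a single character (U+00E9 here). Whether that character is the one the file's author intended depends on the real encoding, which latin-1 only guesses at.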

Terry Jan Reedy
