[Python-ideas] Python 3000 TIOBE -3%
Terry Reedy
tjreedy at udel.edu
Sat Feb 11 18:43:41 CET 2012
On 2/11/2012 12:00 PM, Masklinn wrote:
>
> On 2012-02-11, at 17:44 , Terry Reedy wrote:
>
>> On 2/11/2012 5:47 AM, Paul Moore wrote:
>>> I have a text file, in an unknown encoding (yes, it does happen
>>> to me!) but opening in an editor shows it's mainly-ASCII. I want
>>> to find all the lines starting with a '*'. The simple
>>>
>>> with open('myfile.txt') as f:
>>>     for line in f:
>>>         if line.startswith('*'): print(line)
>>>
>>> fails with encoding errors. What do I do?
>>
>> Good example. I believe adding ", encoding='latin-1'" to open() is
>> sufficient.
>
> Why not open the file in binary mode instead? (and replace `'*'` by
> `b'*'` in the startswith call)
When I wrote that response, I thought that 'for line in f' would not
work for binary-mode files. I then opened IDLE, experimented with 'rb',
and discovered otherwise. So the remaining issue is how one wants the
bytes in the unknown encoding to appear when printed -- as hex escapes, or as
arbitrary but more readable non-ASCII Latin-1 characters.
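
To make the trade-off concrete, here is a minimal sketch of both approaches side by side. The helper names, the temporary file, and the sample bytes are all made up for illustration; the only assumption about the data is that it is mostly ASCII with a few high bytes, as in Paul's description.

```python
import os
import tempfile


def starred_lines_latin1(path):
    # Text mode: latin-1 maps each byte 0-255 to exactly one code point,
    # so decoding cannot raise, whatever the real encoding is.
    with open(path, encoding='latin-1') as f:
        return [line for line in f if line.startswith('*')]


def starred_lines_binary(path):
    # Binary mode: iterating still yields one line at a time, as bytes,
    # so the prefix passed to startswith() must be bytes as well.
    with open(path, 'rb') as f:
        return [line for line in f if line.startswith(b'*')]


# Hypothetical sample: mostly ASCII, plus one byte (0xE9) of unknown origin.
sample = b'* caf\xe9\nplain line\n* second\n'
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    text_hits = starred_lines_latin1(path)   # list of str
    byte_hits = starred_lines_binary(path)   # list of bytes
finally:
    os.remove(path)
```

Both functions select the same two lines. Printing an item of text_hits shows the unknown byte as whatever Latin-1 character it happens to map to, while printing an item of byte_hits shows the bytes repr with a \xe9 hex escape.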
--
Terry Jan Reedy