[SOLVED] detecting newline character

Thomas 'PointedEars' Lahn PointedEars at web.de
Sun Apr 24 09:35:20 EDT 2011


Daniel Geržo wrote:

> On 23.4.2011 21:18, Thomas 'PointedEars' Lahn wrote:
>> Daniel Geržo wrote:
>>> [f = codecs.open(…, mode='rU', encoding='ascii') and f.newlines]
>>
>> […]
>> The only reason I can think of for this not working ATM comes from the
>> documentation, where it says that 'U' requires Python to be built with
>> universal newline support; that it is *usually* so, but might not be so
>> in your case (but then the question remains: How could it be not None
>> without `encoding' argument?)
> 
> Yes, this is what does not make sense. If I didn't have the universal
> newline support enabled, I wouldn't have the newlines attribute at all.

True.  But it is good to know that one can test for it with
hasattr(fileobj, 'newlines')!

>> <http://docs.python.org/library/codecs.html?highlight=codecs.open#codecs.open>
>> <http://docs.python.org/library/functions.html#open>
>>
>> WFM with and without `encoding' argument in python-2.7.1-8 (CPython),
>> Debian GNU/Linux 6.0.1, Linux 2.6.35.5-pe (custom) SMP i686.
>>
>> Which Python implementation and version are you using on which system?
> 
> This is a standard python installation from MacPorts. System is OS X
> 10.6.7. I have now tried both python 2.7.1 and python 2.6.6 from
> MacPorts and also 2.6.6 on FreeBSD. All fail for me when I set encoding.

I think this discussion, in particular <2838616.PzL39ZcT7Z at PointedEars.de>,
<news:5684911.Hjke4DdEvY at PointedEars.de> and finally
<http://bugs.python.org/issue691291>, provides a good explanation now.

To summarize:

1. From Python 2.6.5-rc1 and Python 2.7-alpha4 forward, codecs.open()
   does not support universal newlines and will ignore any 'U' in its
   `mode' argument when the `encoding' argument is different from None.

2. As a result, file.newlines will be None if it exists.

3. This is by design, fixing a bug back from Python 2.3a.

4. Use another approach.

:)
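
For the record, a minimal sketch of what "another approach" can look like
with io.open(), which does universal newline translation while decoding.
The helper name newlines_seen() and the sample data are made up for
illustration; they are not taken from pysublib.

```python
import io
import os
import tempfile


def newlines_seen(path, encoding="ascii"):
    """Return the newline conventions io.open() saw while decoding `path`."""
    # newline=None (the default) turns on universal newlines translation.
    with io.open(path, mode="r", encoding=encoding, newline=None) as f:
        f.read()           # `newlines' is only filled in as data is read
        return f.newlines  # None, a string, or a tuple of strings


# Hypothetical sample: a tiny subtitle-like file with CRLF line endings.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as raw:
    raw.write(b"1\r\n00:00:01,000 --> 00:00:02,000\r\nHello\r\n")

detected = newlines_seen(sample_path)
os.remove(sample_path)
```

Note that `newlines' stays None until something has actually been read, and
that it becomes a tuple when the file mixes several conventions.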

>> On which system has the "ASCII" file been created and how?  Note that
>> both uploading the file with FTP in ASCII mode and downloading over HTTP
>> might have removed the problem Python has with it.
> 
> Unfortunately I am not 100% sure where I created the file, it was quite
> some time ago, but it was either WinXP, or OS X Leopard. The source code
> can be found at https://bitbucket.org/danger/pysublib/src - I noticed
> the subtitle file tests (e.g. test/test_subripfile.py) are failing for
> me and I have identified the problem with newlines being None after
> calling read().

Well, you have two alternatives now: codecs.open() combined with something
like list(set(re.findall(newlines, f.read()))), where `newlines' is a
pattern such as r'\r\n|\r|\n', or io.open().  You appear to have decided
for `io', so there should not be a problem anymore.
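
In case someone finds this thread later and cannot use `io': a rough sketch
of the regular expression variant.  It assumes the text was read without
newline translation (which is what codecs.open() with an `encoding'
argument gives you); detect_newlines() is a hypothetical helper name.

```python
import re

# Try CRLF first so that "\r\n" is not counted as a lone CR plus a lone LF.
NEWLINE_RE = re.compile(r"\r\n|\r|\n")


def detect_newlines(text):
    """Return the distinct newline sequences found in `text`, sorted."""
    return sorted(set(NEWLINE_RE.findall(text)))


# The text would normally come from something like
#   codecs.open(path, mode='r', encoding='ascii').read()
# A literal is used here to keep the sketch self-contained.
found = detect_newlines("a\r\nb\nc\r")
```

An empty list means the text contains no line terminator at all.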

I wish you good luck with your project, it looks really interesting (I 
remember having written a DVD subtitle script based on gocr in bash a few 
years ago).

-- 
\\//,
PointedEars


