[Python-bugs-list] [ python-Bugs-487277 ] Different behavior of readline()

noreply@sourceforge.net noreply@sourceforge.net
Wed, 12 Dec 2001 04:46:31 -0800


Bugs item #487277, was opened at 2001-11-29 14:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=487277&group_id=5470

Category: Python Interpreter Core
>Group: Feature Request
Status: Open
Resolution: None
Priority: 5
Submitted By: Gustavo Niemeyer (niemeyer)
Assigned to: Nobody/Anonymous (nobody)
Summary: Different behavior of readline()

Initial Comment:
The behavior of readline() has changed on 2.2. This 
should be documented.

Example:

Python 2.1 (#1, Jun 22 2001, 17:13:13)
[GCC 2.95.3 20010315 (release) (conectiva)] on 
linux-i386
Type "copyright", "credits" or "license" for more 
information.
>>> open("/etc").readline()
''

Python 2.2b2+ (#1, Nov 27 2001, 21:39:35)
[GCC 2.95.3 20010315 (release) (conectiva)] on 
linux-ppc
Type "help", "copyright", "credits" or "license" for 
more information.
>>> open("/etc").readline()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 21] Is a directory



----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-12 04:46

Message:
Logged In: YES 
user_id=6380

OK, I buy that argument. I'll take a patch after 2.2. We
really need a "2.3" group here -- for now I've set the group
to feature request.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-12-12 00:06

Message:
Logged In: YES 
user_id=21627

Tim was arguing that open("/") fails on some systems
anyway, since the C library will refuse to open a directory.
So making it fail on Posix as well would be for uniformity's
sake. I think fopen() allows opening a directory by a similar
rationale: if open(2) allows opening it, fopen(3) should do
so as well. That rationale is already broken: open(2) is
needed to read a directory, but you cannot read a directory
if you have a FILE*.

The "used for existence test" argument isn't very
convincing; use access or stat for that (specifically,
os.path.exists). Of course, changing it might break existing
code, just as changing .readline did.
It is clear that posix.open should forward directly to the
system call, instead of performing checks. It is not so
clear that file() should do the same thing.
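
The existence test suggested above can be sketched without opening the
path at all. This is a minimal illustration written for a current
Python 3, which is an assumption since the thread predates it; the
helper name describe_path is hypothetical and not part of any patch
discussed here.

```python
import os
import tempfile


def describe_path(path):
    """Classify a path without opening it (hypothetical helper)."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory"
    # os.access() checks permissions for the real uid/gid; the answer can
    # change before a later open(), so treat it as a hint only.
    if os.access(path, os.R_OK):
        return "readable file"
    return "unreadable file"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "example.txt")
        with open(file_path, "w") as handle:
            handle.write("hello\n")
        print(describe_path(tmp))
        print(describe_path(file_path))
        print(describe_path(os.path.join(tmp, "absent")))
```

Because the check and any later open() are separate steps, code that
goes on to read the file should still be prepared for the open to fail.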

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-11 14:50

Message:
Logged In: YES 
user_id=6380

Are we even sure that open("/") should be disallowed? What
if it is used as an existence test? What is allowed by
open() should be the realm of the stdio fopen() function,
and we shouldn't second-guess it. If fopen() allows us to
open directories, why shouldn't we?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-12-03 06:22

Message:
Logged In: YES 
user_id=21627

Yes, changing it later is fine.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-03 05:03

Message:
Logged In: YES 
user_id=6380

I take it that this needn't be included in 2.2?  Who knows
what it would break.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-12-03 01:02

Message:
Logged In: YES 
user_id=21627

I agree that Python should not allow opening a directory.
Attached is a patch that implements this check; it is
somewhat more involved since it must also check file objects
that are created through PyFile_FromFile etc.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-12-02 12:18

Message:
Logged In: YES 
user_id=31435

Well, my question isn't really about libc but about 
Python:  since Python doesn't expose getdents(2) or readdir
(2), how could it be anything but a mistake for a Python 
user on Linux to try to __builtin__.open() a directory?
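
For context, Python reaches directory contents through os.listdir()
rather than through open(). A minimal sketch, assuming a current
Python 3; the helper name list_entries is hypothetical.

```python
import os
import tempfile


def list_entries(directory):
    """Return the sorted entry names of a directory via os.listdir()."""
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b.txt", "a.txt"):
            with open(os.path.join(tmp, name), "w") as handle:
                handle.write("")
        print(list_entries(tmp))
```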

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-12-02 05:16

Message:
Logged In: YES 
user_id=21627

Opening a directory is the only way, on Unix, to get its
contents. The file descriptor returned is then passed to
getdents(2) or readdir(2) (depending on the OS version).

__builtins__.open doesn't fail because it calls fopen right
away, which doesn't fail because it calls open(2) with just
O_RDONLY and O_LARGEFILE, not with O_DIRECTORY. open(2) will
then only return EISDIR if the directory is opened for
writing. Since this is the behaviour documented in Posix, it
is unlikely that Linux glibc will change.
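
The open(2) behaviour described above can be observed from Python
through os.open(), which forwards to the system call. This is a sketch
under stated assumptions: a current Python 3 on a POSIX-like system
(results differ on Windows), and a hypothetical helper named try_open.

```python
import errno
import os
import tempfile


def try_open(path, flags):
    """Return None on success, or the errno name if os.open() fails."""
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Read-only open of a directory: expected to succeed on Linux.
        print("O_RDONLY:", try_open(tmp, os.O_RDONLY))
        # Requesting write access to a directory: POSIX specifies EISDIR.
        print("O_WRONLY:", try_open(tmp, os.O_WRONLY))
```

The helper reports the symbolic errno name so that the result can be
compared directly with the EISDIR entry in the open(2) man page.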

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-11-30 14:44

Message:
Logged In: YES 
user_id=38388

For the record, this is what I get with Python 2.2b2 on Linux:
>>> f = open('/home/lemburg/')
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 21] Is a directory
>>> f.readline()
''
>>> f.readlines()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 21] Is a directory
>>>

Reading the man-page for open(), it seems that directories play some role (though it's not clear which...):

open() options:
...
       O_DIRECTORY
              If pathname is not a directory, cause the  open  to
              fail.   This  flag is Linux-specific, and was added
              in kernel version 2.1.126, to avoid  denial-of-service
              problems if opendir(3) is called on a FIFO or
              tape device, but should not be used outside of  the
              implementation of opendir.

errno values:
...
       EISDIR pathname refers  to  a  directory  and  the  access
              requested involved writing.
       EACCES The requested access to the file is not allowed, or
              one of the directories in pathname  did  not  allow
              search  (execute)  permission,  or the file did not
              exist yet and write access to the parent  directory
              is not allowed.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-11-30 14:26

Message:
Logged In: YES 
user_id=31435

Does anyone know why opening a directory for reading 
doesn't complain on Linux?  It has always complained on 
Windows.  If readline() is doomed to fail, why do we allow 
opening a directory for reading at all?  OTOH, if it's not 
insane to open a directory for reading on Linux, why does 
readline() complain on Linux?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-11-30 08:07

Message:
Logged In: YES 
user_id=21627

No. I think very few of the changes will ever cause problems
in real software - including this one. I wouldn't be
surprised if this remains the only reported incident caused
by that change.

I *am* saying that for every change that has been made, one
could construct an application that breaks under this
change. This theoretical potential for breakage shouldn't
cause prominent appearance in documentation; everybody who
really wants to see documentation for all changes should
consult the CVS logs.

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2001-11-30 06:15

Message:
Logged In: YES 
user_id=7887

I beg your pardon. Are you saying that there are many 
changes in Python 2.2 that will blow up code written in 
Python 2.1!? Why do we have __future__ then!? Why are we
concerned about being careful with division upgrade when
we have lots of small stuff breaking up *valid* code
written one release ago? And, if this was not enough, 
these changes won't even be documented!?!



----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-11-30 04:21

Message:
Logged In: YES 
user_id=21627

Well, I wouldn't object to anybody documenting this
particular change (even though I won't do that myself, either).

I'm just pointing out that there are many more changes of
this kind, and that it would be an illusion that you could
ever produce a complete list.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-11-30 01:36

Message:
Logged In: YES 
user_id=38388

I disagree: any change which could potentially blow up existing programs should at least be mentioned somewhere 
in the docs, be it the NEWS file, the release page on python.org or (thanks to the great work of Andrew Kuchling) 
the "What's new in Python x.x" docs.

This is simply needed due to the very dynamic way Python works: there's no way to test all paths through a 
program if you want to port it from one Python version to the next, so we'll have to be very careful about these 
changes even if they are clearly bug fixes.

In the mentioned case, I'd say that the programmer was at fault, though: it's so much easier to test a path for being 
a directory than to rely on some obscure method return value and also much safer !
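
The explicit directory test recommended above might look like the
following. It is a minimal sketch for a current Python 3 (an assumption,
since the thread concerns 2.1 and 2.2); the generator name first_lines is
hypothetical and is not taken from the program that broke.

```python
import os
import tempfile


def first_lines(paths):
    """Yield (path, first line) for each path that is not a directory."""
    for path in paths:
        if os.path.isdir(path):
            continue  # explicit check instead of relying on readline()
        try:
            with open(path) as handle:
                yield path, handle.readline()
        except OSError:
            # The path may vanish or change type between check and open.
            continue


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sub = os.path.join(tmp, "subdir")
        os.mkdir(sub)
        note = os.path.join(tmp, "note.txt")
        with open(note, "w") as handle:
            handle.write("hello\n")
        for path, line in first_lines([sub, note]):
            print(path, repr(line))
```

Keeping the except clause alongside the isdir() check means the loop
does not depend on which of open() or readline() reports the error on a
given platform or Python version.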

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-11-30 01:19

Message:
Logged In: YES 
user_id=21627

Any change is user-visible: for any change, I can construct
a program that behaves differently with the change.
Otherwise, the change would be useless (except that it may
add commentary or some such).

In fact, the change *is* documented, namely in the CVS log.
Extracting all those fragments into a single document would
be  time-consuming and pointless: nobody can read through
hundreds of changes.

Your program would have blown up even if there was such a
document. If a change has severe effects on many existing
programs, it should be implemented differently.

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2001-11-29 16:10

Message:
Logged In: YES 
user_id=7887

I think *any* change visible at user space should be 
documented. I discovered this change because a program I 
have never read in my life (in fact, I didn't even know it
was written in python) has blown up in my hands. It was
reading files and safely ignoring directories by testing
readline()'s output.



----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-11-29 15:53

Message:
Logged In: YES 
user_id=21627

I don't think this needs to be documented. The change is a
bug fix, reading from a directory was never supposed to work
(whether in lines or not).

----------------------------------------------------------------------
