[issue14626] os module: use keyword-only arguments for dir_fd and nofollow to reduce function count
Larry Hastings
report at bugs.python.org
Sat May 26 06:54:46 CEST 2012
Larry Hastings <larry at hastings.org> added the comment:
Here's my first pass at a patch. For this patch,
I took the proposal to its logical extreme: I removed
every function in os that was both mildly redundant
with an existing function *and* has been added since
3.2, and moved that functionality to the equivalent
existing function, making it accessible with the use
of keyword-only parameters.
Specifically:
This function has been removed, and instead
| this parameter has been added to
| | this function
| | |
v v v
-------------------------------------
faccessat dir_fd access
faccessat effective_ids access
faccessat follow_symlinks access
fchmodat dir_fd chmod
fchmodat follow_symlinks chmod
fchownat dir_fd chown
fchownat follow_symlinks chown
fexecve fd execve
fgetxattr fd getxattr
flistdir fd listdir
flistxattr fd listxattr
fremovexattr fd removexattr
fsetxattr fd setxattr
fstatat dir_fd stat
futimens fd utime
futimes fd utime
futimesat dir_fd utime
lgetxattr follow_symlinks getxattr
linkat dst_dir_fd link
linkat src_dir_fd link
linkat follow_symlinks link
llistxattr follow_symlinks listxattr
lremovexattr follow_symlinks removexattr
lsetxattr follow_symlinks setxattr
lutimes follow_symlinks utime
mkdirat dir_fd mkdir
mkfifoat dir_fd mkfifoat
mknodat dir_fd mknod
open dir_fd openat
readlinkat dir_fd readlink
renameat dst_dir_fd rename
renameat src_dir_fd rename
symlinkat dir_fd symlink
unlinkat dir_fd unlink
unlinkat remove_directory unlink
utimensat dir_fd utime
utimensat follow_symlinks utime
Additionally, we *could* deprecate this function,
| as I have added this parameter
| | to this function:
| | |
v v v
--------------------------------------
fchdir fd chdir
fchmod fd chmod
fstat fd stat
fstatvfs fd statvfs
lchflags follow_symlinks chflags
lchmod follow_symlinks chmod
fchown fd chown
lchown follow_symlinks chown
lstat follow_symlinks stat
I doubt we'll ever deprecate those functions.
This patch does not deprecate those functions.
I don't propose deprecating those functions.
Notes:
* What do you do on platforms where the functionality isn't available?
I believe it's better to always accept parameters, but throw a
NotImplementedError if the functionality they represent is
unavailable on the current platform. Adding and removing
parameters based on the current platform... that way lies madness.
(It's like scrollbars that pop in and out of existance depending
on whether or not you need them. Better that the scrollbars are
always there, and simply disabled when the content fits in the
current window. Users like a stable foundation under their feet.)
* The patch is... pretty big. But you can divide-and-conquer it
into a series of self-contained parts. First I add path_converter,
then I modify existing functions / remove new functions. Each of
those can be reviewed in isolation.
Also, the new implementations generally follow the same pattern:
initialize
call PyArg_ParseTupleAndKeywords
error out early if user asks for functionality
unavailable on the current platform
ensure that combinations of parameters
(dir_fd, fd, follow_symlinks) are permitted
do actual work, turning on/off advanced functionality
based on configure ifdefs (HAVE_FCHOWNAT etc)
if error, raise exception
compute return value
exit:
cleanup all path variables
return return_value
Here's a list of all the functions I added arguments to:
access chdir chflags chmod chown execve getxattr link listdir
listxattr mkdir mkfifo mknod open readlink removexattr rename
setxattr stat statvfs symlink unlink utime
* The one new bit of technology: a PyArg_ParseTuple "converter"
function called path_converter. Here's its documentation:
/*
* A PyArg_ParseTuple "converter" function
* that handles filesystem paths in the manner
* preferred by the os module.
*
* path_converter accepts (Unicode) strings and their
* subclasses, and bytes and their subclasses. What
* it does with the argument depends on the platform:
*
* * On Windows, if we get a (Unicode) string we
* extract the wchar_t * and return it; if we get
* bytes we extract the char * and return that.
*
* * On all other platforms, strings are encoded
* to bytes using PyUnicode_FSConverter, then we
* extract the char * from the bytes object and
* return that.
*
* Input fields:
* path.nullable
* If nonzero, the path is permitted to be None.
* path.function_name
* If non-NULL, path_converter will use that as the name
* of the function in error messages.
* (If path.argument_name is NULL it omits the function name.)
* path.argument_name
* If non-NULL, path_converter will use that as the name
* of the parameter in error messages.
* (If path.argument_name is NULL it uses "path".)
*
* Output fields:
* path.wide
* Points to the path if it was expressed as Unicode
* and was not encoded. (Only used on Windows.)
* path.narrow
* Points to the path if it was expressed as bytes,
* or it was Unicode and was encoded to bytes.
* path.length
* The length of the path in characters.
* path.object
* The original object passed in.
* path.cleanup
* For internal use only. May point to a temporary object.
* (Pay no attention to the man behind the curtain.)
*
* At most one of path.wide or path.narrow will be non-NULL.
* If path was None and path.nullable was set,
* both path.wide and path.narrow will be NULL,
* and path.length will be 0.
*
* path_converter takes care to not write to the path_t
* unless it's successful. However it must reset the
* "cleanup" field each time it's called.
*
* Use as follows:
* path_t path;
* memset(&path, 0, sizeof(path));
* PyArg_ParseTuple(args, "O&", path_converter, &path);
* // ... use values from path ...
* path_cleanup(&path);
*
* (Note that if PyArg_Parse fails you don't need to call
* path_cleanup(). However it is safe to do so.)
*/
typedef struct {
char *function_name;
char *argument_name;
int nullable;
wchar_t *wide;
char *narrow;
Py_ssize_t length;
PyObject *object;
PyObject *cleanup;
} path_t;
I assert path_converter is Very Useful. It nearly always
reduced the argument processing from three code paths
(two for Windows, one for everyone else) to one. Even if
the rest of the patch isn't accepted I'm sure we'll keep
path_converter.
* In a lot of places I combined together several functions, or
several large blobs of #ifdef'd code, into one large
slightly-shaggy function. Like, Windows code is now often
streamlined in, instead of being a big separate #ifdef. And
instead of three passes at decoding arguments (one for Windows
wide, one for Windows narrow, one for everyone else) there is
generally just one. (Thanks, path_converter!) Again, even
if we don't keep the extra keyword arguments / functionality,
I'm hoping at least some of that cleanup will survive.
* utime is the whale. It obviates *five* other functions
(futimens, futimes, futimesat, lutimes, and utimensat).
I spent a fair amount of time refactoring it. It's so complex,
it was hard to make it anything like readable. I like to
think I reached a local maximum of readability given what I had
to work with, and I like to think that the resulting dog's
breakfast is less unreadable than the previous iteration. Anyway
I'm definitely open to suggestions on how to restructure it
further to enhance readability.
* The tests used to use e.g. hasattr(os, 'futimes') and the like to
determine what functionality was available. In order to LBYL,
I had to resort to e.g. sysconfig.get_config_var('HAVE_FUTIMES').
If users need to LBYL here too, well, it's kind of a problem.
(I hope function signature objects solve this--we're talking
about it.)
* On a very related topic: os.openat() is one of the functions I
removed. And then we have this in Lib/os.py:
if _exists('openat'):
def fwalk(...)
Since I removed os.openat completely, we could longer tell
whether or not the functionality is available. And I couldn't
use my sysconfig.get_config_var trick because it's os.py and
we're bootstrapping the interpreter. So I went with a dirty
hack--uh, I mean, practicality beats purity. posixmodule.c
conditionally exposes a symbol called _HAVE_OPENAT if openat()
is available, and fwalk() is now gated on that.
* A minor semantic change: symlink() now accepts the same arguments
everywhere. Previously it had an extra optional argument only on
Windows ("target_is_directory"). Non-Windows now accepts that too
and ignores it, in much the same way that os.mkdir ignores the mode
on Windows. Also, os.symlink now accepts byte strings for the
paths on Windows. (Even though we'll remove byte string paths
in 3.4, yes?)
* The patch is still sloppy:
* The docstrings are incomplete.
* I haven't touched the docs at all.
* There are > 80 col lines.
Clearly I'll clean up this stuff if the patch has a shot at
going in.
I do think it's an improvement. But I won't check any part of it
in without some consensus (or BFDL ruling).
I look forward to your feedback!
----------
assignee: -> larry
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file25716/larry.os.keyword.arguments.collapse.1.diff
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14626>
_______________________________________
More information about the Python-bugs-list
mailing list