os.path.dirname misleading?

I'm not sure whether to classify this as a bug or a feature request. Recently, I got burned by the fact that, despite the name, dirname() does not return the expected directory portion of a path if you pass it a directory; instead it returns the parent directory, because it uses split. That it uses split is clearly documented and also evident in the source, though both fail to point out the case of passing in a directory path.

"dirname(path)
Return the directory name of pathname path. This is the first half of the pair returned by split(path)."

    # Return the head (dirname) part of a path.
    def dirname(p):
        """Returns the directory component of a pathname"""
        return split(p)[0]

However, to get what I would consider correct behavior based on the function name, the code would need to be:

    def dirname(p):
        """Returns the directory component of a pathname"""
        if isdir(p):
            return p
        else:
            return split(p)[0]

Changing dirname() may in fact break existing code if people expect it to just use split, so a dirname2() function seems called for, but that seems silly, given that dirname should probably be doing an isdir() check.

ka
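To make the split-based behavior concrete, here is a minimal sketch using posixpath so the results are the same on any host. The sample paths are only illustrations. Note that a trailing slash changes the answer, because split then returns an empty tail.

```python
import posixpath

# dirname() is the first element of split(); neither consults the filesystem.
paths = ["/usr/bin", "/usr/bin/", "/usr/bin/python"]
results = {p: posixpath.split(p) for p in paths}

for p, (head, tail) in results.items():
    # dirname and basename are exactly the two halves of split.
    assert posixpath.dirname(p) == head
    assert posixpath.basename(p) == tail
    print(p, "->", repr(head), repr(tail))
```

So "/usr/bin" yields the parent "/usr", while "/usr/bin/" yields "/usr/bin" with an empty basename.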

    Kevin> However, to get what I would consider correct behavior based on
    Kevin> the function name, the code would need to be:

    Kevin> def dirname(p):
    Kevin>     """Returns the directory component of a pathname"""
    Kevin>     if isdir(p):
    Kevin>         return p
    Kevin>     else:
    Kevin>         return split(p)[0]

No can do. On my Mac I could execute:

    >>> import ntpath
    >>> print(ntpath.dirname("C:\\system\\win32"))
    C:\system

Calling isdir() is not an option. Taken another way, "/usr/bin" is a path to a file, so "/usr" is its directory component, and "bin" is its basename:

    >>> import os
    >>> os.path.dirname("/usr/bin")
    '/usr'
    >>> os.path.basename("/usr/bin")
    'bin'

That "/usr/bin" happens to also be a directory is beside the point.

Skip
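The cross-platform point can be sketched as follows: both ntpath and posixpath can be imported on any operating system and only manipulate strings, so the paths need not exist on the machine running the code. The variable names here are just for illustration.

```python
import ntpath
import posixpath

# Pure string manipulation: works for foreign path syntax and for
# paths that do not exist on the host.
win_head = ntpath.dirname("C:\\system\\win32")
posix_head = posixpath.dirname("/usr/bin")
posix_tail = posixpath.basename("/usr/bin")

print(win_head)    # C:\system
print(posix_head)  # /usr
print(posix_tail)  # bin
```

An isdir() check inside dirname would tie the result to whatever happens to be on the local disk, which is what makes it unsuitable here.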

This is the first time I've ever heard of this confusion. dirname is named after the Unix shell function of the same name, which behaves the same way.

I'm not even sure I understand what you expected -- you expected dirname("foo") to return "foo" if foo is a directory? What would be the point of that?

--Guido van Rossum (home page: http://www.python.org/~guido/)

Well, that's news. I never heard of or used dirname in the shell. But with that historical context it makes more sense now.

Yes, I expected to get the directory passed in, based on the function name. In the code in question I don't know whether the path is a directory or a file when I call dirname. I was simply misled by the function name. Looking at this further, I can see that I'm just going to have to create my own directory(path) function: because of how os.path.split behaves, which impacts dirname, I definitely need an isdir() check.

Hmm, I may actually switch to using split(path)[0] and split(path)[-1] (or split(path)[1]) in some cases, since those might be more descriptive of what dirname and basename actually do. Pity the functions aren't named os.path.head and os.path.tail.

Sorry for the confusion,

ka
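A directory(path) helper along the lines described above might look like the sketch below. The name comes from the message, but the body is an assumption, not code from the thread. Unlike os.path.dirname, it consults the filesystem, so its result depends on what exists on disk when it is called.

```python
import os
import tempfile


def directory(path):
    """Return path if it is an existing directory, else its parent.

    Hypothetical helper: relies on os.path.isdir, so it is only
    meaningful for paths on the local filesystem.
    """
    if os.path.isdir(path):
        return path
    return os.path.dirname(path)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    file_path = os.path.join(root, "notes.txt")
    open(file_path, "w").close()

    dir_result = directory(root)            # existing directory: returned as-is
    file_result = directory(file_path)      # existing file: its parent
    missing_result = directory(os.path.join(root, "missing"))  # falls back to dirname
```

A path that does not exist is treated like a file, which may or may not be what a caller wants.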

Kevin Altis <altis@semi-retired.com>:

> Pity the functions aren't named os.path.head and os.path.tail.

It wouldn't be entirely clear what they mean even then -- "head" might mean just the first pathname component.

In a tool I wrote some years ago in Scheme, I called them "filename-directory" and "filename-nondirectory". Which suffered from the same problem, really (they didn't consult the file system either). But it didn't matter, since I was the only person who used them, and *I* knew what they meant. :-)

Maybe they should be called "all_except_the_last_pathname_component" and "last_pathname_component"?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

    Greg> Kevin Altis <altis@semi-retired.com>:
    >> Pity the functions aren't named os.path.head and os.path.tail.

    Greg> It wouldn't be entirely clear what they mean even then -- "head"
    Greg> might mean just the first pathname component.
    ...
    Greg> Maybe they should be called
    Greg> "all_except_the_last_pathname_component" and
    Greg> "last_pathname_component"?

I know, how about car and cdr? ;-)

Skip

participants (4)
- Greg Ewing
- Guido van Rossum
- Kevin Altis
- Skip Montanaro