Re: [Python-ideas] [Python-Dev] os.path.normcase rationale?

Guido van Rossum wrote:
Maybe the API could be called os.path.unnormpath(), since it is in a sense the opposite of normpath() (which removes case) ?
Cute, but not very intuitive. Something like actualpath() might be better -- although that's somewhat arbitrarily different from realpath(). -- Greg

Antoine Pitrou <solipsis@pitrou.net> writes:
Again, why not simply improve realpath()?
Because that already does what it says it does. The behaviour being asked for is distinct from what ‘os.path.normcase’ and ‘os.path.realpath’ are meant to do, so that behaviour belongs in a different place from those two. -- Ben Finney

On Sun, 26 Sep 2010 00:00:57 +1000 Ben Finney <ben+python@benfinney.id.au> wrote:
So what? The behaviour of fetching the canonical name can be added to the behaviour of resolving symlinks. It wouldn't be incompatible with the current behaviour AFAICT. And it would be better than adding yet another function to our ménagerie of path-normalizing functions. We already have abspath(), normpath(), normcase(), realpath() -- all with very descriptive names as you might notice. We don't need another function. Regards Antoine.
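For reference, a small sketch of how two of the functions named above differ. It calls ntpath and posixpath directly (the standard-library modules that back os.path on Windows and on POSIX systems), so the results do not depend on the machine running it. The sample paths are made up for illustration.

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # POSIX path rules, importable on any platform

# normcase() folds case and slashes under Windows rules...
win_case = ntpath.normcase("C:/Foo/Bar.TXT")          # 'c:\\foo\\bar.txt'
# ...while normpath() only collapses separators and ".." and keeps the case.
win_norm = ntpath.normpath("C:/Foo/../Bar.TXT")       # 'C:\\Bar.TXT'

# Under POSIX rules normcase() returns the path unchanged.
posix_case = posixpath.normcase("/Foo/Bar.TXT")       # '/Foo/Bar.TXT'
posix_norm = posixpath.normpath("/Foo//x/../Bar.TXT") # '/Foo/Bar.TXT'

print(win_case, win_norm, posix_case, posix_norm, sep="\n")
```

None of these calls touches the filesystem, which is why none of them can report the spelling actually stored on disk.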

On Sat, Sep 25, 2010 at 7:11 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
There's no need to get all emotional or sarcastic about it. You might have noticed the risks of sarcasm on this list recently. Instead, it should be possible to analyze how realpath() is currently used and see if changing it as desired is likely to break any code. TBH, I am personally on the fence and would like to see an analysis including the current and desired behavior in the following cases:
- Windows
- OS X
- Other Unixoid systems
Also take into account:
- Filesystems whose case behavior is the opposite of the platform default (all three support such filesystems through system configuration and/or mounting)
- Relative paths
- Paths containing symlinks
In any case it is much easier to design and implement the best possible functionality if you don't also have to be backward compatible with an existing function. I think it might be useful to call this new API (let's call it "casefulpath" while we wait for a better name to come to us :-) on a relative path without having the answer turned into an absolute path -- if that's desired it's easy enough to call abspath() or realpath() on the result. -- Guido van Rossum (python.org/~guido)

On Saturday 25 September 2010 at 13:55 -0700, Guido van Rossum wrote:
There's no need to get all emotional or sarcastic about it. You might have noticed the risks of sarcasm on this list recently.
Ironic considering the naming of the language :) Anyway:
realcase() ?

On Sep 25, 2010, at 7:11 AM, Antoine Pitrou wrote:
realpath's docs describe its result as "the canonical path of the specified filename, eliminating any symbolic links encountered in the path (if they are supported by the operating system)". "Canonical" should describe the behavior we're after, with the correct case of the filename as it is actually stored on disk. But isn't realpath modeled after POSIX realpath(3)? realpath(3) doesn't seem to clearly guarantee the original name as stored on disk either. However realpath(3) on OSX 10.6 with case-insensitive HFS+ does return the original name as it was stored. Do any other platforms do this and do we care about maintaining parity with realpath(3)? -- Philip Jenvey
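A minimal sketch of the symlink side of that description, assuming a platform where os.symlink() is available (on Windows it may need extra privileges). The file names "Target.txt" and "alias" are invented for the example. It only shows that realpath() follows the link to the target's stored name; it says nothing about how a wrongly-cased input is treated, which is the open question here.

```python
import os
import tempfile

def resolve_demo():
    """Create a file plus a symlink to it; return (resolved link, expected target)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)            # the temp dir itself may sit behind a symlink
        target = os.path.join(tmp, "Target.txt")
        link = os.path.join(tmp, "alias")
        with open(target, "w"):
            pass                               # an empty file is enough
        os.symlink(target, link)
        return os.path.realpath(link), target  # resolved while the files still exist

resolved, expected = resolve_demo()
print(resolved == expected)
```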

Antoine Pitrou wrote:
So what? The behaviour of fetching the canonical name can be added to the behaviour of resolving symlinks.
Finding the actual name (I wouldn't call it "canonical", since that term could be ambiguous) requires reading the contents of entire directories at each step, which could be noticeably less efficient than what realpath() currently does. Users who only want symlinks expanded might object to that. An option could be added to realpath(), but then we're into constant-parameter territory. -- Greg
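To make the cost concrete, here is a rough sketch of the per-component lookup described above. The name casefulpath() is borrowed from the placeholder used earlier in the thread; the root parameter and the casefold() comparison are assumptions of this sketch, not a proposed design. It handles relative paths only and leaves the result relative.

```python
import os

def casefulpath(path, root=os.curdir):
    """Hypothetical helper: respell each component of a relative `path`
    the way the directory listing under `root` reports it."""
    if os.altsep:
        path = path.replace(os.altsep, os.sep)
    found = []
    directory = root
    for part in path.split(os.sep):
        if part in ("", os.curdir):
            continue
        try:
            names = os.listdir(directory)  # a full directory read for every component
        except OSError:
            names = []                     # missing or unreadable: keep the caller's spelling
        if part not in names:              # prefer an exact match when one exists
            folded = part.casefold()
            part = next((n for n in names if n.casefold() == folded), part)
        found.append(part)
        directory = os.path.join(directory, part)
    return os.sep.join(found) if found else os.curdir
```

Each component costs one os.listdir() call, whereas resolving symlinks needs no directory listing at all, so the extra work grows with both path depth and directory size.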

On Sun, Sep 26, 2010 at 9:02 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Constant parameter territory isn't *necessarily* a bad thing if the number of parameters is sufficiently high. In particular, if you have one basic command (say, "give me the canonical path for this possibly-non-canonical path I already have") with a gazillion different variants (*ahem*), then a single function with well-named boolean parameters (to explain "this is what I really mean by 'canonical path'") is likely to be much easier for people to remember than trying to create a concise-yet-meaningful mnemonic for each variant. So we shouldn't dismiss out of hand the idea of a keyword-only "swiss-army" path normalisation function that can at least be queried via help() if you forget the exact spelling for the various parameters. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Sep 27, 2010 at 8:21 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
How high is high enough? Just in realpath, normpath, normcase we already have 3 options, with the "match the existing case-preserving filename if it exists" variant requested in this discussion making it 4. Supporting platform appropriate Unicode normalisation would make it 5.
Note that I'm not saying the swiss-army function is necessarily the right answer here, but remembering "use os.path.realpath to get canonical filenames" and then having a bunch of flags to enable/disable various aspects of the normalisation (defaulting to the current implementation of only expanding symlinks) fits my brain more easily than remembering the distinctions between the tasks that currently correspond to each function name.
If there really isn't a name that makes sense for the new variant, then maybe adding some constant parameters to one of the existing functions is the way to go. realpath and normpath are the two most likely candidates to use as a basis for such an approach.
If realpath was used as a basis, then it would gain keyword-only parameters along the lines of "expand_links=True", "collapse=False", "lower_case=False", "match_case=False". Setting both lower_case=True and match_case=True would trigger ValueError, but the API with separate boolean flags is easier to use than one with a single tri-state parameter for the case conversion.
If normpath was used as a basis instead, then symlink expansion would remain a separate operation and normpath would gain "collapse=True", "lower_case=False", "match_case=False" as keyword-only parameters.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
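A sketch of what the realpath-based signature could look like, using the flag names and defaults given above. The function name canonical_path() is hypothetical, mapping lower_case to str.lower() is an assumption, and match_case is deliberately left unimplemented because it would need a directory scan per component.

```python
import os

def canonical_path(path, *, expand_links=True, collapse=False,
                   lower_case=False, match_case=False):
    """Hypothetical swiss-army normaliser; every option is keyword-only."""
    if lower_case and match_case:
        raise ValueError("lower_case and match_case are mutually exclusive")
    if match_case:
        # Matching the on-disk spelling is omitted from this sketch.
        raise NotImplementedError("match_case is not implemented here")
    if expand_links:
        path = os.path.realpath(path)   # current realpath() behaviour
    if collapse:
        path = os.path.normpath(path)   # collapse "..", "." and doubled separators
    if lower_case:
        path = path.lower()             # assumption: plain lower-casing
    return path

print(canonical_path("Foo/../BAR", expand_links=False, collapse=True, lower_case=True))
```

Because the flags are keyword-only, a call such as canonical_path(p, True, False) fails with TypeError instead of silently binding positional booleans, which keeps call sites readable.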

participants (7): Antoine Pitrou, Ben Finney, Greg Ewing, Guido van Rossum, MRAB, Nick Coghlan, Philip Jenvey