Responding to points individually to avoid confusing multi-topic threads. :) Andrew Barnert wrote: < snip >
When permissive is False, characters that are generally unsafe are rejected. When permissive is True, only path separator characters are rejected. Generally unsafe characters besides path separators would include things like a leading ".", any non-printing character, any wildcard, piping and redirection characters, etc. I think neither of these is what I’d usually want. I never want to sanitize just pathsep characters without sanitizing all illegal characters. I do often want to sanitize all illegal characters (just \0 and the path sep on POSIX, a larger set that I don’t know by heart on Windows).
Sanitization and validation are not the same thing though. \0 is invalid and will result in an error when passed to a function that attempts to use it to reference a file, so allowing that character to pass through sanitization doesn't constitute an exploitable vulnerability. Having said that, it's usually friendlier to fail sooner rather than later, so maybe it actually does make sense for sanitization to fail for illegal characters as well as for valid, unsafe characters. Hmm. I just realized that ".." and (to a lesser extent) "." are valid path parts but are nevertheless usually not safe to allow.
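
To make the distinction concrete, here is a rough sketch of what a check along these lines might look like. Everything in it is an assumption for illustration only: the name check_path_part, the exact contents of _UNSAFE_CHARS, and the decision to reject \0, "." and ".." even in permissive mode are not part of any existing or agreed API.

```python
import os

# Hypothetical set of "generally unsafe" characters (wildcards, piping,
# redirection). The exact membership is an assumption, not a settled list.
_UNSAFE_CHARS = set("*?[]|<>&;")


def _path_separators():
    """Return the path separator characters for the current platform."""
    seps = {os.sep}
    if os.altsep:  # e.g. "/" on Windows; None on POSIX
        seps.add(os.altsep)
    return seps


def check_path_part(part, permissive=False):
    """Return `part` unchanged if it is acceptable, else raise ValueError.

    Hypothetical helper. Both modes reject path separators, NUL, and the
    special parts "." and "..". Strict mode (permissive=False) additionally
    rejects a leading ".", non-printing characters, and _UNSAFE_CHARS.
    """
    if any(sep in part for sep in _path_separators()):
        raise ValueError("path separator in path part")
    if "\0" in part:
        # Illegal rather than merely unsafe; rejected here to fail early.
        raise ValueError("NUL character in path part")
    if part in (".", ".."):
        # Valid path parts, but usually not safe to allow.
        raise ValueError("relative directory reference")
    if permissive:
        return part
    if part.startswith("."):
        raise ValueError("leading '.' in path part")
    if not part.isprintable():
        raise ValueError("non-printing character in path part")
    if any(ch in _UNSAFE_CHARS for ch in part):
        raise ValueError("unsafe character in path part")
    return part
```

With this sketch, check_path_part(".hidden", permissive=True) returns the name unchanged, while the same call with permissive=False raises ValueError, and ".." is rejected in both modes.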