Sanitizing filename strings across platforms
Jean-Paul Calderone
calderone.jeanpaul at gmail.com
Tue May 31 23:17:50 EDT 2011
On May 31, 10:17 pm, Tim Chase <python.l... at tim.thechases.com> wrote:
> Scenario: file names from potentially untrusted sources may be
> odd and need to be sanitized for the underlying OS.
> On *nix, this generally just means "don't use '/' or \x00 in your
> string", while on Win32, there are a host of verboten characters
> and file-names. Then there's also checking the abspath/normpath
> of the resulting name to make sure it's still in the intended folder.
>
> I've read through [1] and have started to glom together various
> bits from that thread. My current course of action is something like
>
> import string
>
> SACRED_WIN32_FNAMES = set(
>     ['CON', 'PRN', 'CLOCK$', 'AUX', 'NUL'] +
>     ['LPT%i' % i for i in range(32)] +
>     ['COM%i' % i for i in range(32)]
> )
>
> def sanitize_filename(fname):
>     # whitelist of characters allowed to survive
>     sane = set(string.ascii_letters + string.digits + '-_.[]{}()$')
>     results = ''.join(c for c in fname if c in sane)
>     # might have to check sans-extension
>     if results.upper() in SACRED_WIN32_FNAMES:
>         results = "_" + results
>     return results
>
> but if somebody already has war-hardened code they'd be willing
> to share, I'd appreciate any thoughts.
>
There's http://pypi.python.org/pypi/filepath/0.1 (taken from
twisted.python.filepath).
Jean-Paul
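
For anyone who wants to stay in the standard library, here is a minimal sketch of the two checks discussed above: the reserved-name test done "sans-extension", and the realpath check that the result is still inside the intended folder. It is not the filepath package's API. The names RESERVED_WIN32_NAMES, SAFE_CHARS and safe_join are made up for illustration, the reserved list is an assumption (the names from the original post plus COM1-9 and LPT1-9), and it assumes Python 3.4+ for os.path.commonpath.

```python
import os
import string

# Assumed reserved list: the names from the original post, with the
# serial and parallel ports limited to COM1-9 and LPT1-9.
RESERVED_WIN32_NAMES = frozenset(
    ['CON', 'PRN', 'AUX', 'NUL', 'CLOCK$']
    + ['COM%d' % i for i in range(1, 10)]
    + ['LPT%d' % i for i in range(1, 10)]
)

# Same whitelist as the original post, restricted to ASCII letters.
SAFE_CHARS = frozenset(string.ascii_letters + string.digits + '-_.[]{}()$')


def sanitize_filename(fname):
    """Drop characters outside SAFE_CHARS and defuse reserved device names."""
    result = ''.join(c for c in fname if c in SAFE_CHARS)
    # Windows has traditionally matched device names regardless of
    # extension ("NUL.txt" still means NUL), so compare only the part
    # before the first dot.
    stem = result.split('.', 1)[0]
    if stem.upper() in RESERVED_WIN32_NAMES:
        result = '_' + result
    # Hypothetical fallback: names that end up empty or all dots
    # ("", ".", "..") are replaced with a single underscore.
    if not result.strip('.'):
        result = '_'
    return result


def safe_join(base_dir, untrusted_name):
    """Join the sanitized name onto base_dir, refusing anything that escapes."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(
        os.path.join(base, sanitize_filename(untrusted_name)))
    # Defense in depth: even after sanitizing, verify that the resolved
    # path is strictly below base (symlinks are resolved by realpath).
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError('unsafe file name: %r' % (untrusted_name,))
    return candidate
```

With this sketch, sanitize_filename('nul.txt') gives '_nul.txt', and safe_join('uploads', '../../etc/passwd') resolves to a file named '....etcpasswd' inside uploads rather than climbing out of it. A whitelist this strict throws away spaces and all non-ASCII characters, which may or may not be acceptable for your use.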