[Python-Dev] Bytes path support

Nick Coghlan ncoghlan at gmail.com
Thu Aug 21 14:26:56 CEST 2014


On 21 August 2014 12:16, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Nick Coghlan writes:
>
>  > One idea I had along those lines is a surrogatereplace error handler (
>  > http://bugs.python.org/issue22016) that emitted an ASCII question mark for
>  > each smuggled byte, rather than propagating the encoding problem.
>
> Please, don't.
>
> "Smuggled bytes" are not independent events.  They tend to be
> correlated *within* file names, and this handler would generate names
> whose human semantics get lost (and there *are* human semantics,
> otherwise the name would be str(some_counter)).  They tend to be
> correlated across file names, and this handler will generate multiple
> files with the same munged name (and again, the differentiating human
> semantics get lost).
>
> If you don't know the semantics of the intended file names, you can't
> generate good replacement names.  This has to be an application-level
> function, and often requires user intervention to get good names.
>
> If you want to provide helper functions that applications can use to
> clean names explicitly, that might be OK.
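
To make the collision concern above concrete, here is a minimal sketch.
The munge() function is a hypothetical stand-in for the proposed
handler (it is not an existing codec error handler), and the file
names are invented for illustration:

```python
# Two distinct raw file names, as an OS might return them
# (Latin-1 encoded, so not valid UTF-8). Invented for illustration.
raw_names = [b"caf\xe9.txt", b"caf\xe8.txt"]

# Decoding with surrogateescape smuggles each undecodable byte
# through as a lone surrogate in the range U+DC80..U+DCFF.
decoded = [name.decode("utf-8", "surrogateescape") for name in raw_names]


def munge(name):
    """Hypothetical helper: emit '?' for each smuggled byte."""
    return "".join(
        "?" if "\udc80" <= ch <= "\udcff" else ch
        for ch in name
    )


munged = [munge(name) for name in decoded]

# The decoded names are still distinct, but the munged names collide.
print(decoded[0] != decoded[1])
print(munged)
```

Both inputs end up as the same munged name, so the distinction between
the two files is lost.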

Yeah, I was thinking in the context of reproducing sys.stdout's
behaviour in Python 2, but that reproduces the bytes faithfully, so
'surrogateescape' already offers exactly the behaviour we want
(sys.stdout will have surrogateescape enabled by default in 3.5).
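
As a rough sketch of what "reproduces the bytes faithfully" means here,
the following round trip uses only the standard 'surrogateescape' error
handler. The byte string is an invented example, and UTF-8 is assumed
as the encoding on both sides:

```python
# An invented file name containing a byte that is not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns the surrogate back into 0xE9,
# so the original bytes come out unchanged.
restored = text.encode("utf-8", "surrogateescape")

# Without the handler, the smuggled byte is an encoding error.
try:
    text.encode("utf-8")
except UnicodeEncodeError:
    strict_failed = True
else:
    strict_failed = False

print(restored == raw, strict_failed)
```

The strict encode failing is the "encoding problem" that would
otherwise propagate; with surrogateescape enabled on the output stream,
the original bytes are written instead.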

I'll keep pondering the question of possible helper functions in the
"string" module.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

