[Python-Dev] Bytes path related questions for Guido

MRAB python at mrabarnett.plus.com
Tue Aug 26 13:31:19 CEST 2014


On 2014-08-26 03:11, Stephen J. Turnbull wrote:
> Nick Coghlan writes:
>
>   > "purge_surrogate_escapes" was the other term that occurred to me.
>
> "purge" suggests removal, not replacement.  That may be useful too.
>
> neutralize_surrogate_escapes(s, remove=False, replacement='\uFFFD')
>
How about:

     replace_surrogate_escapes(s, replacement='\uFFFD')

If you want them removed, just pass an empty string as the replacement.
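
A minimal sketch of what such a function might look like, purely as an
illustration. The name and signature follow the suggestion above; nothing
like this exists in the standard library, and the assumption here is that
"surrogate escapes" means the lone surrogates U+DC80..U+DCFF that the
"surrogateescape" error handler produces for undecodable bytes:

```python
import re

# Assumption: only the code points U+DC80..U+DCFF count as surrogate
# escapes, since those are what errors='surrogateescape' emits for the
# undecodable bytes 0x80..0xFF.
_SURROGATE_ESCAPES = re.compile('[\udc80-\udcff]')


def replace_surrogate_escapes(s, replacement='\uFFFD'):
    """Return s with every surrogate escape replaced by `replacement`.

    Hypothetical helper, not a stdlib function.
    """
    # A callable is used so that backslashes in `replacement` are taken
    # literally instead of being parsed as a regex template.
    return _SURROGATE_ESCAPES.sub(lambda match: replacement, s)


# A Latin-1 encoded name decoded as UTF-8 with surrogateescape.
name = b'caf\xe9.txt'.decode('utf-8', 'surrogateescape')

marked = replace_surrogate_escapes(name)        # 'caf\ufffd.txt'
removed = replace_surrogate_escapes(name, '')   # 'caf.txt'
```

With this shape a separate "remove" flag is unnecessary, because removal is
just replacement with the empty string.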

> maybe?  (Of course the remove argument is feature creep, so I'm only
> about +0.5 myself.  And the name is long, but I can't think of any
> better synonyms for "make safe" in English right now).
>
>   > Either way, my use case is to filter them out when I *don't* want to
>   > pass them along to other software, but would prefer the Unicode
>   > replacement character to the ASCII question mark created by using the
>   > "replace" filter when encoding.
>
> I think it would be preferable to be unicodely correct here by
> default, since this is a str -> str function.
>
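
For reference, a short illustration of the difference Nick describes,
using only existing codec error handlers. The file name is made up for the
example, and the str-level replacement is done by hand rather than by any
proposed function:

```python
# An undecodable byte smuggled into a str via surrogateescape.
name = b'caf\xe9.txt'.decode('utf-8', 'surrogateescape')

# Encoding with "replace" turns the escape into an ASCII question mark.
lossy = name.encode('utf-8', 'replace')              # b'caf?.txt'

# Encoding with "surrogateescape" restores the original byte.
roundtrip = name.encode('utf-8', 'surrogateescape')  # b'caf\xe9.txt'

# Replacing the escape at the str level first gives U+FFFD instead,
# and the result can then be encoded strictly.
cleaned = name.replace('\udce9', '\ufffd')
encoded = cleaned.encode('utf-8')                    # b'caf\xef\xbf\xbd.txt'
```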


