<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#330033">
<div class="moz-cite-prefix">On 8/26/2014 4:31 AM, MRAB wrote:<br>
</div>
<blockquote cite="mid:53FC7007.2060502@mrabarnett.plus.com"
type="cite">On 2014-08-26 03:11, Stephen J. Turnbull wrote:
<br>
<blockquote type="cite" style="color: #000000;">Nick Coghlan
writes:
<br>
<br>
&gt; "purge_surrogate_escapes" was the other term that
occurred to me.
<br>
<br>
"purge" suggests removal, not replacement. That may be useful
too.
<br>
<br>
neutralize_surrogate_escapes(s, remove=False,
replacement='\uFFFD')
<br>
<br>
</blockquote>
How about:
<br>
<br>
replace_surrogate_escapes(s, replacement='\uFFFD')
<br>
<br>
If you want them removed, just pass an empty string as the
replacement.
<br>
</blockquote>
<br>
Going further, replacement could also be a sequence of 128 characters,
one per escaped byte 0x80-0xFF, to transcode in a single pass; or a
single character, to replace every escape wholesale with some
placeholder; or None (or an empty string) to remove them.<br>
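<br>
A minimal sketch of how that might behave, purely as an illustration:
replace_surrogate_escapes is the hypothetical name from this thread, not
an existing standard-library function, and treating any length-128
replacement as a lookup table for bytes 0x80-0xFF is an assumption made
here (a real API would probably want a separate argument to avoid the
ambiguity with a 128-character replacement string).<br>
<pre>
```python
import re

# The 'surrogateescape' error handler maps each undecodable byte
# 0x80..0xFF to a lone surrogate U+DC80..U+DCFF.
_ESCAPED = re.compile('[\udc80-\udcff]')


def replace_surrogate_escapes(s, replacement='\ufffd'):
    """Hypothetical helper sketched from the thread; not a stdlib API."""
    if replacement is None:
        # None means "remove", same as passing an empty string.
        replacement = ''
    if len(replacement) == 128:
        # Assumption: 128 items form a table indexed by (byte - 0x80).
        return _ESCAPED.sub(
            lambda m: replacement[ord(m.group()) - 0xDC80], s)
    # Otherwise substitute the same text for every escaped byte.
    # A callable is used so backslashes in replacement stay literal.
    return _ESCAPED.sub(lambda m: replacement, s)


# Usage: bytes that are not valid ASCII, decoded with surrogateescape.
s = b'caf\xe9 \xff'.decode('ascii', 'surrogateescape')

# Table that reinterprets the escaped bytes as Latin-1.
latin1_table = bytes(range(0x80, 0x100)).decode('latin-1')

assert replace_surrogate_escapes(s) == 'caf\ufffd \ufffd'
assert replace_surrogate_escapes(s, None) == 'caf '
assert replace_surrogate_escapes(s, latin1_table) == 'caf\xe9 \xff'
```
</pre>
With the table form, the caller effectively picks the 8-bit encoding
after the fact, without a round trip through bytes.<br>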
</body>
</html>