25 Apr
2009
12:44 p.m.
If the bytes are mapped to single lone surrogate codes (half of a pair) instead of the normal pairs (high+low), then I can see that decoding could never be ambiguous and encoding could reproduce the original bytes.
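A minimal sketch of the round trip described above, assuming the scheme under discussion behaves like the `surrogateescape` error handler available in Python 3.1 and later. The helper name `roundtrip` and the sample bytes are illustrative only and do not come from the thread.

```python
def roundtrip(raw: bytes):
    """Decode bytes as UTF-8, escaping undecodable bytes, then encode back."""
    # Each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
    text = raw.decode("utf-8", errors="surrogateescape")
    # Encoding with the same handler turns U+DCNN back into byte 0xNN.
    restored = text.encode("utf-8", errors="surrogateescape")
    return text, restored


# Hypothetical sample: a Latin-1 encoded file name, which is not valid UTF-8.
raw = b"caf\xe9.txt"
text, restored = roundtrip(raw)

print(ascii(text))      # shows the lone surrogate standing in for byte 0xE9
print(restored == raw)  # the original bytes come back unchanged
```

A strict UTF-8 encoder rejects lone surrogates, so a string produced this way cannot be confused with one decoded from valid UTF-8 input; that is what keeps the mapping unambiguous.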
I was confused by Markus Kuhn's original UTF-8b specification. I have now changed the PEP to avoid using PUA characters at all.

Regards,
Martin