Python's handling of unicode surrogates

Ross Ridge rridge at
Sat Apr 21 01:49:09 CEST 2007

Rhamphoryncus  <rhamph at> wrote:
>The only code that will be changed is that which doesn't handle
>surrogates properly.  Some will start working properly.  Some (ie
>random.choice(u'\U00100000\uFFFF')) will fail explicitly (rather than

You're falsely assuming that any code that doesn't support surrogates
is broken.  Supporting surrogates is no more required than supporting
combining characters, right-to-left languages or lower case letters.
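
For anyone following along without a narrow build handy, here is a rough
sketch of what the quoted random.choice example is about.  It is written
for present-day Python 3, where strings are sequences of code points, so it
encodes to UTF-16 to imitate how a narrow build stores the string.  The
helper names utf16_code_units and is_surrogate are made up for this
illustration and are not part of any library.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers.

    Hypothetical helper: imitates how a narrow build would see the string.
    """
    data = text.encode("utf-16-le")
    # Each UTF-16 code unit is one unsigned 16-bit little-endian value.
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def is_surrogate(unit):
    """True if the code unit falls in the surrogate range D800-DFFF."""
    return 0xD800 <= unit <= 0xDFFF

# The string from the quoted example: one character outside the BMP
# followed by one character inside it.
units = utf16_code_units("\U00100000\uFFFF")

# Two code points become three code units; the first two form a pair.
print([hex(u) for u in units])
print([is_surrogate(u) for u in units])
```

Picking one element at random from those three code units can return half
of the pair, which is the case the quoted message expects to fail.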

					Ross Ridge

 l/  //	  Ross Ridge -- The Great HTMU
[oo][oo]  rridge at
 db  //	  
