On 29.08.2014 02:41, Stephen J. Turnbull wrote:
In the process of boning up for my other post in this thread, I noticed the 'surrogatepass' handler.
Is there a real use case for the 'surrogatepass' error handler? It seems like a horrible break in the abstraction. IMHO, if there's a need, the application should handle this. Python shouldn't provide it on encoding as the resulting streams are not Unicode conformant, nor on decoding UTF-16, as conversion of surrogate pairs is a requirement of all Unicode versions since about 1995.
This error handler allows applications to reactivate the Python 2 style behavior of the UTF codecs in Python 3, which allowed reading lone surrogates on input.

Since Python allows working with lone surrogates in Unicode (they are valid code points) and we're using UTF-8 for marshal, we needed a way to make sure that Python 3 also optionally supports working with lone surrogates in such UTF-8 streams (nowadays called CESU-8: http://en.wikipedia.org/wiki/CESU-8).

See http://bugs.python.org/issue3672 and http://bugs.python.org/issue12892 for discussions.

--
Marc-Andre Lemburg
eGenix.com, Aug 29 2014
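To illustrate the behavior described above, here is a minimal sketch for Python 3. It assumes only the standard 'utf-8' codec and the 'surrogatepass' error handler; the variable names are made up for the example. It shows that the strict codec rejects a lone surrogate in both directions, while 'surrogatepass' round-trips it.

```python
# A lone high surrogate: a valid code point, but not encodable by strict UTF-8.
lone = "\ud800"

# Strict encoding is expected to fail.
try:
    lone.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

# With 'surrogatepass' the surrogate is written as a three-byte sequence.
raw = lone.encode("utf-8", "surrogatepass")

# Strict decoding of those bytes is expected to fail as well.
try:
    raw.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

# The same handler is needed on input to get the surrogate back.
roundtrip = raw.decode("utf-8", "surrogatepass")

print(strict_encode_ok, strict_decode_ok, raw, roundtrip == lone)
```

The resulting byte stream is not conformant UTF-8, so this is only suitable for data that stays within applications that agree on the handler, such as the marshal case mentioned above.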