[Python-Dev] bytes type discussion

Wed Feb 15 21:33:10 CET 2006

On 2/14/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Fred L. Drake, Jr. wrote:
>
> > The proper response in this case is often to re-start decoding
> > with the correct encoding, since some of the data extracted so far may have
> > been decoded incorrectly.
>
> If the protocol has been sensibly designed, that shouldn't
> happen, since everything up to the coding marker should
> be ASCII (or some other protocol-defined initial coding).
>
> For protocols that are not sensibly designed (or if you're
> just trying to guess) what you suggest may be needed. But
> it would be good to have a nicer way of going about it
> for when the protocol is sensible.

I think that the implementation of encoding-guessing or
auto-encoding-upgrade techniques should be left out of the standard
library design for now. I know that XML does something like this, but
fortunately we employ dedicated C code to parse XML, so that particular
case should be taken care of without complicating the rest of the
standard I/O library.
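
To illustrate the XML case, here is a minimal sketch. It assumes a
present-day Python 3 and the xml.etree.ElementTree module, which hands
the raw bytes to the expat C parser; the sample document and the
variable names are made up for illustration and do not come from this
thread. The point is that the caller passes undecoded bytes and never
touches the encoding declaration itself.

```python
import xml.etree.ElementTree as ET

# Raw, undecoded input: the encoding is declared inside the document.
# \xe9 is "e with acute accent" in ISO-8859-1 (it is not valid UTF-8 here).
raw = b'<?xml version="1.0" encoding="iso-8859-1"?><name>Caf\xe9</name>'

# The parser reads the declaration and decodes the content itself;
# no guessing or restart logic is needed in the calling code.
root = ET.fromstring(raw)
text = root.text

print(repr(text))
```

The I/O layer only has to deliver bytes; all knowledge of the
declared encoding stays inside the parser.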

As for searching bytes objects, that shouldn't be a problem as long
as the search 'string' is also specified as a bytes object.
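
A small sketch of what that looks like, assuming the bytes type as it
eventually shipped in Python 3 (the sample data and the names data,
marker and pos are invented for the example). Searching works when
both sides are bytes, and mixing in a text string is rejected rather
than silently converted.

```python
# Undecoded data, e.g. as read from a socket or a binary file.
data = b"Content-Type: text/plain; charset=utf-8\r\n\r\nbody"

# The search 'string' is itself a bytes object.
marker = b"charset="

found = marker in data      # membership test, bytes in bytes
pos = data.find(marker)     # index of the first match, or -1

# Slice out the value that follows the marker, up to the end of the line.
end = data.find(b"\r\n", pos)
charset = data[pos + len(marker):end]

# Searching with a text string instead of bytes raises TypeError.
try:
    data.find("charset=")
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(found, charset, mixed_ok)
```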

--
--Guido van Rossum (home page: http://www.python.org/~guido/)