[Python-Dev] bytes type discussion

Stephen J. Turnbull stephen at xemacs.org
Fri Feb 17 07:11:12 CET 2006

>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:

    Guido> I think that the implementation of encoding-guessing or
    Guido> auto-encoding-upgrade techniques should be left out of the
    Guido> standard library design for now.

As far as I can see, little new design is needed.  There's no reason
why an encoding-guesser couldn't be written as a codec that detects
the coding, then dispatches to the appropriate codec.  The only real
issue I know of is that if you ask such a codec "who are you?", there
are two plausible answers: "autoguess" and the codec actually being
used to translate the stream.  If the codec API has no way to report
both, it may need to be generalized.
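
To make the "two answers" point concrete, here is a minimal sketch in
present-day Python.  It is not a proposal for the standard library: the
class AutoGuessDecoder, its "name" and "actual" attributes, and the
BOM-only detection heuristic are all assumptions made up for
illustration.  Only the codecs.BOM_* constants and codecs.decode() are
real library calls.

```python
import codecs

# Hypothetical detection table: (byte-order mark, codec name).
# The UTF-32 entries come first because the UTF-32-LE BOM begins with
# the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


class AutoGuessDecoder:
    """Illustrative guesser: detect the coding, then dispatch."""

    # First answer to "who are you?": the guesser itself.
    name = "autoguess"

    def __init__(self, fallback="utf-8"):
        self.fallback = fallback
        # Second answer: the codec actually used, known only after decode().
        self.actual = None

    def decode(self, data):
        for bom, codec_name in _BOMS:
            if data.startswith(bom):
                self.actual = codec_name
                # Strip the BOM and hand the rest to the real codec.
                return codecs.decode(data[len(bom):], codec_name)
        # No BOM found: fall back to the assumed default codec.
        self.actual = self.fallback
        return codecs.decode(data, self.fallback)
```

A caller that wants both answers reads decoder.name and, after
decoding, decoder.actual.  A real guesser would need a much better
heuristic than BOM sniffing; the point here is only the dispatch and
the two identities.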

    Guido> As far as searching bytes objects, that shouldn't be a
    Guido> problem as long as the search 'string' is also specified as
    Guido> a bytes object.

You do need to be a little careful in the implementation: for example,
"case insensitive" should be meaningless when searching bytes objects,
since bytes carry no character semantics.  This would be especially
important if searching and collation become more Unicode-conformant.
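
As a rough illustration in present-day Python (the helper names
find_bytes and find_text_caseless are hypothetical, and UTF-8 is an
assumed encoding for the sample data): a bytes search is an exact
byte-for-byte match, while a caseless search only makes sense after
decoding to text.

```python
def find_bytes(haystack: bytes, needle: bytes) -> int:
    """Exact byte-for-byte search; there is no notion of case here."""
    return haystack.find(needle)


def find_text_caseless(haystack: str, needle: str) -> int:
    """Caseless search on text, using Unicode case folding.

    The returned index refers to the case-folded haystack, which can
    differ in length from the original for some characters.
    """
    return haystack.casefold().find(needle.casefold())


text = "caf\u00e9 \u00c9"              # small e-acute, then capital E-acute
data = text.encode("utf-8")            # assumed encoding for the example

upper = "\u00c9".encode("utf-8")       # bytes for capital E-acute
lower = "\u00e9".encode("utf-8")       # bytes for small e-acute

# On bytes, the two letters are simply different byte sequences.
upper_pos = find_bytes(data, upper)
lower_pos = find_bytes(data, lower)

# On text, case folding makes the capital match the earlier small letter.
caseless_pos = find_text_caseless(text, "\u00c9")

# bytes.lower() touches only ASCII letters, so it cannot stand in for
# case folding on encoded non-ASCII text.
unchanged = upper.lower() == upper
```

Here the bytes searches find the two letters at different offsets,
while the text search treats them as the same letter.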

School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
