[Python-Dev] Pre-PEP: The "bytes" object
mal at egenix.com
Fri Feb 17 13:03:29 CET 2006
Guido van Rossum wrote:
> On 2/15/06, Neil Schemenauer <nas at arctrix.com> wrote:
>> This could be a replacement for PEP 332. At least I hope it can
>> serve to summarize the previous discussion and help focus on the
>> currently undecided issues.
>> I'm too tired to dig up the rules for assigning it a PEP number.
>> Also, there are probably silly typos, etc. Sorry.
> I may check it in for you, although right now it would be good if we
> had some more feedback.
> I noticed one behavior in your pseudo-code constructor that seems
> questionable: while in the Q&A section you explain why the encoding is
> ignored when the argument is a str instance, in fact you require an
> encoding (and one that's not "ascii") if the str instance contains any
> non-ASCII bytes. So bytes("\xff") would fail, but bytes("\xff",
> "blah") would succeed. I think that's a bit strange -- if you ignore
> the encoding, you should always ignore it. So IMO bytes("\xff") and
> bytes("\xff", "ascii") should both return the same as bytes([255]).
> Also, there's a code path where the initializer is a unicode instance
> and its encode() method is called with None as the argument. I think
> both could be fixed by setting the encoding to
> sys.getdefaultencoding() if it is None and the argument is a unicode
> instance:
>
>     def bytes(initialiser=[], encoding=None):
>         if isinstance(initialiser, basestring):
>             if isinstance(initialiser, unicode):
>                 if encoding is None:
>                     encoding = sys.getdefaultencoding()
>                 initialiser = initialiser.encode(encoding)
>             initialiser = [ord(c) for c in initialiser]
>         elif encoding is not None:
>             raise TypeError("explicit encoding invalid for non-string "
>                             "initialiser")
>         create bytes object and fill with integers from initialiser
>         return bytes object
>
> BTW, for folks who want to experiment, it's quite simple to create a
> working bytes implementation by inheriting from array.array. Here's a
> quick draft (which only takes str instance arguments):
>
>     from array import array
>
>     class bytes(array):
>
>         def __new__(cls, data=None):
>             b = array.__new__(cls, "B")
>             if data is not None:
>                 b.fromstring(data)
>             return b
>         def __str__(self):
>             return self.tostring()
>         def __repr__(self):
>             return "bytes(%s)" % repr(list(self))
>         def __add__(self, other):
>             if isinstance(other, array):
>                 return bytes(super(bytes, self).__add__(other))
>             return NotImplemented
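
The array-based draft quoted above relies on the Python 2 array API. A rough port to an interpreter where str is already Unicode might look like the following. This is only a sketch, not part of the proposal: the class name Bytes is hypothetical (chosen to avoid shadowing any builtin), and frombytes()/tobytes() are the Python 3 spellings of fromstring()/tostring().

```python
from array import array


class Bytes(array):
    """Hypothetical Python 3 port of the array-based draft (illustration only)."""

    def __new__(cls, data=None):
        # Typecode "B" is unsigned char, so every element is an int in range(256).
        b = super().__new__(cls, "B")
        if data is not None:
            # Accepts an 8-bit string (a Python 3 bytes object) only.
            b.frombytes(data)
        return b

    def __repr__(self):
        return "Bytes(%r)" % list(self)

    def __add__(self, other):
        if isinstance(other, array):
            # array concatenation returns a plain array, so re-wrap the result.
            return Bytes(super().__add__(other).tobytes())
        return NotImplemented
```

With this, Bytes(b"ab") + Bytes(b"c") gives another Bytes instance holding the three integer values, which is enough to experiment with the sequence-of-ints behaviour.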
If you want to play around with the migration
to all Unicode in Py3k, start Python with the -U switch and
monkey-patch the builtin str to be an alias for unicode.
Ideally, the bytes type should work under both the Py3k conditions
and the Py2.x default ones.
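
To make the "all Unicode" case concrete, here is a minimal sketch of the constructor rules from the quoted pseudo-code, written for an interpreter where str is Unicode. It rests on several assumptions: make_bytes is a hypothetical name, the existing 8-bit string type stands in for the Py2.x str, and a plain list of integers stands in for the proposed bytes object.

```python
import sys


def make_bytes(initialiser=(), encoding=None):
    """Illustrative only: returns a list of ints in place of a bytes object."""
    if isinstance(initialiser, str):
        # Unicode text must be encoded; fall back to the default encoding.
        if encoding is None:
            encoding = sys.getdefaultencoding()
        initialiser = initialiser.encode(encoding)
    elif isinstance(initialiser, (bytes, bytearray)):
        # 8-bit string: the encoding argument is always ignored.
        pass
    elif encoding is not None:
        raise TypeError("explicit encoding invalid for non-string "
                        "initialiser")
    # Iterating an 8-bit string here already yields integers.
    return list(initialiser)
```

Under these rules make_bytes(b"\xff") and make_bytes(b"\xff", "ascii") return the same value, while a Unicode argument is encoded with the given encoding or, if none is given, with the default one.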