[Python-3000] Should int() and float() accept bytes?
Mark Dickinson
dickinsm at gmail.com
Tue Apr 15 03:58:23 CEST 2008
This is a repeat of a question that came up on the "Decimal(unicode)" thread
a little while ago. I think it needs an answer, so I'm reposting it in its
own thread.
I couldn't find any other previous discussion of this; apologies if I'm
rehashing old issues.
Currently, int() and float() accept bytes instances. For example:
>>> int(bytes([49, 50, 51]))
123
[40381 refs]
>>> int(b'123')
123
[40381 refs]
Philosophically, this seems wrong: it's not clear why bytes([49, 50, 51])
should represent an integer, or even which integer it should represent. If
the byte sequence is meant to be read as an ASCII string, then it really
should be explicitly decoded as such first:
>>> int(b'123'.decode('ascii'))
123
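To illustrate the "which integer" ambiguity, here is a minimal sketch. It
assumes a later Python 3 release (3.2 or newer), where int.from_bytes is
available; that method did not exist when this message was written, and the
variable names are only illustrative.

```python
# The same three bytes can be read as at least three different integers.
data = bytes([49, 50, 51])  # identical to b'123'

# Reading 1: treat the bytes as ASCII digits, decoding explicitly.
as_ascii_digits = int(data.decode('ascii'))

# Reading 2: treat the bytes as a big-endian binary integer.
# Assumes Python 3.2+, where int.from_bytes exists.
as_big_endian = int.from_bytes(data, 'big')

# Reading 3: treat the bytes as a little-endian binary integer.
as_little_endian = int.from_bytes(data, 'little')

print(as_ascii_digits, as_big_endian, as_little_endian)
```

Only the first reading matches what int(bytes) returns in the examples above.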
On the other hand, there's at least some sense in which bytes already
acts as a sort of poor-man's string: witness bytes.lower and friends.
Maybe practicality beats purity here?
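As a small illustration of "bytes.lower and friends", the sketch below shows
a few string-like methods that bytes objects support. The sample values are
made up for the example.

```python
# bytes objects carry several methods that assume ASCII text semantics.
data = b'Hello 123'

lowered = data.lower()          # ASCII letters are lowercased
all_digits = b'123'.isdigit()   # True when every byte is an ASCII digit
fields = data.split()           # splits on ASCII whitespace

print(lowered, all_digits, fields)
```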
What do people think about changing the int() and float() constructors so
that they don't accept bytes?
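A rough pure-Python sketch of the proposed behaviour for int() follows. The
name strict_int is hypothetical and exists only for this illustration; the
actual experiment described below was done in the C sources, not with a
wrapper like this.

```python
def strict_int(value, *args):
    """Hypothetical stand-in for an int() that refuses bytes-like input."""
    # Reject bytes and bytearray outright instead of parsing them as ASCII.
    if isinstance(value, (bytes, bytearray)):
        raise TypeError(
            "strict_int() argument must be a string or a number, not "
            + type(value).__name__
        )
    # Everything else is delegated to the built-in constructor unchanged.
    return int(value, *args)


print(strict_int('123'))                   # str input still works
print(strict_int(b'123'.decode('ascii')))  # bytes must be decoded first
```

With this wrapper, strict_int(b'123') raises TypeError, so callers have to
state the encoding themselves.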
I experimented with removing int(bytes) and int(bytearray) support in
longobject.c's long_new and in PyNumber_Long in abstract.c, to see how much
breakage occurred. The results:
11 tests failed:
test_email test_httplib test_io test_mimetools test_pickle
test_pickletools test_random test_smtplib test_sqlite test_tarfile
test_uu
(random.py needed some patching to get the test suite to run in the first
place.)
None of the breakage looks particularly serious or difficult to fix. I
haven't tried removing float(bytes) support yet.
See also
http://bugs.python.org/issue2483
Mark