Does Python need a '>>>' operator?
Beni Cherniavsky
cben at techunix.technion.ac.il
Mon Jun 10 07:11:31 EDT 2002
On 2002-06-10, Ken Seehof wrote:
> Beni Cherniavsky <cben at tx.technion.ac.il> wrote:
> > I just got another idea: use 0x1234 for 0-filled numbers and 1xABCD for
> > 1-filled ones. That way you impose no restrictions on what follows the
> > prefix and keep backward compatibility. 0xFFFFFFFF stays a 2^n-1
> > _positive_ number, as it should be. The look of 1x is weird at first but
> > it is very logical...
>
> Hey, that's pretty clever. I like it. One objection I can foresee is
> that it does add another little thing to learn. And yes, one must
> consider such a feature as making python more complex if it requires
> a paragraph to be added to the main documentation, even if the feature
> will only be used by "advanced" programmers. The standard argument
> is that all python programmers have to be able to read your code.
>
> However, I don't see this as making life much more complicated for
> beginners, since we already have 0x1234, which is only familiar to
> experienced programmers. People who are not familiar with 0x1234 can
> learn about 1xABCD at the same time. People who are familiar with
> 0x1234 (e.g. from programming in C/C++), can probably handle looking up
> the proposed syntax in the documentation, if they can't guess what it
> means by experimenting in the interactive interpreter.
>
After some thought I discovered that this is actually very consistent:
~1x0123456789ABCDEF == 0xFEDCBA9876543210
One only needs to learn a little about 2's complement to understand:
-1x0123456789ABCDEF == 0xFEDCBA9876543211 # 1 added
if he wishes to...
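
To make the two identities concrete, here is a minimal sketch of how such a literal could be read. parse_filled_hex is a hypothetical helper invented for illustration (Python has no 1x syntax); it assumes that 1x followed by n hex digits means those digits preceded by infinitely many 1 bits, i.e. the digit value minus 16**n.

```python
import string


def parse_filled_hex(text):
    """Read '0x...' as a 0-filled and '1x...' as a 1-filled hex literal.

    Hypothetical helper: the 1x prefix is only a proposal, not Python syntax.
    """
    prefix, digits = text[:2].lower(), text[2:]
    if not digits or any(ch not in string.hexdigits for ch in digits):
        raise ValueError("invalid hex digits: %r" % text)
    value = int(digits, 16)
    if prefix == "0x":
        return value
    if prefix == "1x":
        # Infinitely many leading 1 bits: subtract 16**n for n written digits.
        return value - 16 ** len(digits)
    raise ValueError("unknown prefix: %r" % text)


x = parse_filled_hex("1x0123456789ABCDEF")
print(~x == 0xFEDCBA9876543210)  # True
print(-x == 0xFEDCBA9876543211)  # True
print(parse_filled_hex("1xF"))   # -1
```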
> In particular, I like:
>
> 1xF == -1
>
> Seems more pythonic than 0xFFFFFFFF, since the latter implies knowledge
> that we are using 32 bits. I don't know of any current way to express
> 1 filled binary numbers cleanly.
>
Currently you use the ~ or the - unary operators. So another idea, in the
spirit of repr(), is to print -1 as ~0x0. When you want bit-manipulation,
~ is more natural than -. However I personally like the 1x prefix more.
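
A rough sketch of the ~0x0 printing idea, using only existing facilities. tilde_hex is a made-up name for illustration, not an existing function; the output is a valid expression, so eval() reads it back.

```python
def tilde_hex(n):
    """Print negatives as the complement of a non-negative hex number."""
    if n >= 0:
        return hex(n)
    # ~n is non-negative whenever n is negative, so hex() needs no sign.
    return "~" + hex(~n)


print(tilde_hex(-1))  # ~0x0
print(tilde_hex(-2))  # ~0x1
print(eval(tilde_hex(-12345)) == -12345)  # True
```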
> Unfortunately, I don't see an easy way to clean up this blemish:
>
> >>> 0xffff
> 65535
This is correct behaviour.
> >>> 0xffffffff
> -1
While currently it is like this, it cannot be considered standard. It
depends on the machine's word size so any code relying on it being equal
to -1 is already broken!
> >>> 0xfffffffff
> 68719476735L
>
Correct too.
> I suppose in a perfect world, 0xffffffff would be 4294967295L, but
> we'd have serious compatibility issues with that (note that one
> would use 1xf instead of 0xffffffff to represent -1 in the perfect
> world).
>
> BTW, note that 0xffffffff == 4294967295L would be consistent with
> C/C++ with an unsigned int (32 or more bits). The idea is that
> all numbers 0x... are non-negative, while number 1x... are negative.
>
Sure. That's the idea. To make the 32bit boundary invisible one must
ensure that a given written number represents the same mathematical
integer, on any machine.
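
For illustration: current Python versions read 0xffffffff as 4294967295 on every machine, so code that wants the 32-bit signed interpretation has to ask for it explicitly. The helper below is only a sketch under that assumption; as_signed and its bits parameter are names invented here.

```python
def as_signed(value, bits=32):
    """Reinterpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1       # keep only the requested word size
    if value >> (bits - 1):        # top bit set: the word encodes a negative
        value -= 1 << bits
    return value


print(0xffffffff)                 # 4294967295
print(as_signed(0xffffffff))      # -1
print(as_signed(0xffffffff, 64))  # 4294967295
print(as_signed(0xffff))          # 65535
```

The word size is an explicit argument, so the result no longer depends on the machine running the code.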
> I'm quite certain that any proposed solution (such as issuing a
> warning for 0xXXXXXXXX) will receive flames, so I think I will
> stop here, hoping it's not too late :-)
>
I would propose 1x as a read syntax. ~ and - are already accepted
automagically (since python's only read syntax is the eval syntax). The
only remaining trouble is non-long constants with the msb set. They should
produce a warning anyway, to help people detect word-size sensitivity bugs...
Maybe a __future__ should be provided if serious enough compatibility problems
arise. This point itself is probably not worth a __future__ as the new
semantics of 0x... always positive is compatible with any bug-free code ;)
I do not propose to replace hex() now; one can provide a newhex() [better
name desperately needed, maybe hex1?]. In the long run, it might be
reasonable to support the different combinations in format strings.
We already have (or should have) the following. In C and python1.5 (sadly
there is no up-to-date installation around) %x always treats the data as
unsigned; python1.5 refuses to %x longs but accepts them in hex(). I presume
that's already been fixed:
"%x" % +2 => 2
"%x" % -2 => -2
"%+x" % +2 => +2
"%+x" % -2 => -2
"%04x" % +2 => 0002
"%04x" % -2 => -0002
"%0#4x" % +2 => 0x02
"%0#4x" % -2 => -0x02 # Is this so indeed?
Or maybe it's still always unsigned in '%x' to mirror C? I don't think
that can be kept because the signed->unsigned coercion exposes the word
length; do (x & 0xffff) when you want to coerce, so that your results will
be predictable...
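
As a point of comparison, this is a small sketch of the explicit coercion, together with what a current Python interpreter prints for the signed cases (there, %x is signed and the zero-padded width counts the sign and the 0x prefix). unsigned_hex and its bits parameter are illustrative names, not an existing API.

```python
def unsigned_hex(value, bits=16):
    """Format value as unsigned hex after masking to an explicit word size."""
    mask = (1 << bits) - 1
    return "%x" % (value & mask)


print("%x" % -2)             # -2
print("%04x" % -2)           # -002
print("%0#4x" % 2)           # 0x02
print("%0#4x" % -2)          # -0x2
print(unsigned_hex(-2))      # fffe
print(unsigned_hex(-2, 32))  # fffffffe
```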
Proposed:
"%~x" % +2 => 2
"%~x" % -2 => ~1
"%#~x" % +2 => 0x2
"%#~x" % -2 => 1xE
# Like the proposal by I-don't-remember-who (sorry) but with a determined
# msb! No guessing!
"%0~4x" % +2 => 0x02
"%0~4x" % -2 => ~0x01
"%0#~4x" % +2 => 0x02
"%0#~4x" % -2 => 1xFE
Now this is probably overkill. But one must introduce a new flag symbol:
# is already used for adding the 0x, and it can't be 1 (that would mean a
width). It would also be logical to do 1xE only when # is also specified,
which brought me to all these combinations.
But the two ideas of ~0x1 and 1xE both serve exactly the same purpose so
only one should suffice IMHO. ~0x1 is easy to construct from the existing
facilities, 1xE needs playing around with the output's first char...
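
One possible shape for the 1x output, as a sketch only. The name hex1 is borrowed from the suggestion above, and the width argument is an assumption made here to mimic the padded examples. Rather than patching the first character of hex() output, it computes the digits directly: a negative n written with k digits is n + 16**k, and padding uses F instead of 0.

```python
def hex1(n, width=0):
    """Hypothetical hex(): 0x... for n >= 0, 1-filled 1x... for n < 0.

    width is the minimum total length including the two-character prefix.
    """
    if n >= 0:
        prefix, fill, digits = "0x", "0", "%X" % n
    else:
        # Smallest digit count k such that n + 16**k is non-negative.
        k = 1
        while 16 ** k < -n:
            k += 1
        prefix, fill = "1x", "F"
        digits = "%0*X" % (k, n + 16 ** k)
    # Pad on the left with the fill digit, which does not change the value.
    return prefix + digits.rjust(max(width - 2, 0), fill)


print(hex1(2))      # 0x2
print(hex1(-2))     # 1xE
print(hex1(2, 4))   # 0x02
print(hex1(-2, 4))  # 1xFE
print(hex1(-1))     # 1xF
```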
> - Ken Seehof
>
--
Beni Cherniavsky <cben at tx.technion.ac.il>