[Tutor] built in functions int(),long()+convert.base(r1,r2,num)
Jeff Shannon
jeff@ccvcorp.com
Mon Jun 23 17:33:01 2003
cino hilliard wrote:
>> You cannot convert to any other base than 2 (a series of charges in a
>> set of transistors), but every time that Python shows you the number,
>
>> it'll automatically convert it to base 10 for you, because
>
> How can this be? You just said "You cannot convert to any other base
> than 2
> (a series of charges in a set of transistors)
Right. The computer stores integers in base 2, by creating a pattern of
charges. When it comes time to display that integer on the screen,
however, the display functions convert that pattern of charges into a
string of characters representing that number in base 10.
>> But all of this is beside the point, because, as I said before, an
>> integer doesn't care what base you think it is[...]
>
> This is quite vague. If you type
>
>>>> print 12345
>>>
> 12345
> you get a base 10 number. The print command gives output in decimal
> or base 10.
>
>>>> print 0xFFFF
>>>
> 65535
> Even using hex notation print still outputs base 10.
Yes, because in both cases, the interpreter converts the number you've
typed into an internal integer (which it stores, however briefly, in
binary format), and then sends that integer to the display routines,
which automatically convert it to a string in base 10 for display purposes.
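A small sketch of that round trip, written for a current Python 3 interpreter (where print is a function, unlike the Python 2 sessions quoted in this thread). The variable names are only for illustration.

```python
# Two literals typed in different notations produce the same integer.
a = 65535
b = 0xFFFF

same_value = (a == b)   # True: the stored integer has no "base" of its own

# Conversion to text happens only when a string is asked for.
as_decimal = str(a)     # the decimal string that print() shows
as_hex = hex(a)         # a hexadecimal string for the same integer
as_octal = oct(a)       # an octal string for the same integer

print(same_value, as_decimal, as_hex, as_octal)
```

The base only shows up in the strings; the comparison between the two integers never looks at it.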
> You are conflating the inner workings of Python and the output as the
> same unified thing. Again, the key word is output, not the electrical
> capacitance of transistors.
> BTW, I don't think a transistor is a capacitor but rather a
> semiconductor, or amplifier and switch. A dynamic RAM memory cell has a
> transistor AND a capacitor. The capacitor holds the charge (bit=1) and
> the transistor, controlled by the memory circuitry, releases the
> charge. I picked this up with a Google search. The processor probably
> has both also.
The key is that the output is a separate thing from the storage of (and
the existence of) the number. I may have used the wrong terms for the
various electronic components -- I'm not an electrical engineer, nor do
I desire to become one, and my understanding of the electronics involved
is very abstract. The point is that the internal representation of a
number is different from the string that's shown on the screen when the
number is displayed. Yet they both represent the same number, even
though each representation uses a different base.
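To make the separation concrete, here is a minimal sketch of a display routine that renders one integer as a digit string in any radix from 2 to 36. The name to_base and the DIGITS table are hypothetical, written for this illustration; they are not the convert.base function discussed in this thread.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base(number, radix):
    """Return the digit string for a non-negative integer in radix 2..36."""
    if not 2 <= radix <= len(DIGITS):
        raise ValueError("radix must be between 2 and 36")
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # Peel off the least significant digit in the target radix.
        number, remainder = divmod(number, radix)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))

n = 65535                 # one integer...
print(to_base(n, 2))      # ...shown as binary digits
print(to_base(n, 10))     # ...shown as decimal digits
print(to_base(n, 16))     # ...shown as hexadecimal digits
```

The integer n never changes; only the string built from it does. The built-in int(text, radix) goes the other way, from a digit string back to the integer.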
> Maybe any integer in the range [2, 36], or zero. If radix is zero,
> the proper radix is guessed based on the contents of string;
> Should the parser not guess that the radix is 29 or higher for
> int('JEFFSHANNON',0)?
It guesses by the same rules that it uses to parse numeric literals in
any code, as the docs for int() describe. If you were to start the
interpreter and type JEFFSHANNON, what would you expect? Since there
are no numeric characters in that, and it doesn't use any of the special
indications for octal or hex numbers, this is an error. The parser will
decide that it must be an identifier, and will try to resolve it, giving
a NameError when it finds nothing with that name. Since int() knows
that it's supposed to be a number, but doesn't see any indications that
it *is* a number, it gives an error. It doesn't try to guess that the
radix is the lowest number that can represent the highest-ordinal
character in the string, because the odds that someone really wants that
are insignificant. There are *very* few uses for numbers in an
arbitrary radix; 99.9% of the time, a displayed number should be
represented in binary, octal, decimal, or hexadecimal. Of the tiny
percentage of times that someone actually wants something in some other
radix, almost all of those occurrences are simply examples showing how
different numeric bases work. I have never seen, and can't imagine, a
project for which it's truly useful and practical to represent something
in base 29.
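A short sketch of that behavior, assuming a current Python 3 interpreter (which spells octal literals with a 0o prefix, where the Python 2 of this thread used a bare leading 0). The helper name parse_guessing_radix is made up for the example.

```python
def parse_guessing_radix(text):
    """Return int(text, 0), or None when the text is not a valid literal."""
    try:
        # Radix 0 means: apply the same prefix rules as a literal in code.
        return int(text, 0)
    except ValueError:
        return None

print(parse_guessing_radix("0x1F"))          # hexadecimal prefix is recognized
print(parse_guessing_radix("42"))            # no prefix, so decimal
print(parse_guessing_radix("JEFFSHANNON"))   # no digits, no prefix: rejected

# An unusual radix is still available, but only when asked for explicitly.
explicit = int("JEFFSHANNON", 29)
print(explicit)
```

The last call succeeds because the radix is stated outright; nothing is guessed from the letters in the string.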
> Internal manipulation is a separate issue from display,
> I am not questioning the internals. It is the display or output I am
> interested in.
So why are you having such a difficult time with the concept that the
fact that you're shown a base 10 number is just an attribute of the
display?
>> and there *is* a conversion step in between. The conversion is
>> especially apparent when dealing with floats, because the same float
>> will display differently depending on whether you use str() or repr()
>> (the interpreter uses repr() by default) --
>>
>> >>> repr(0.1)
>> '0.10000000000000001'
>
> Can't this be fixed?
Which part -- that 0.1 can't be represented exactly in binary? No, that
can't be fixed -- it's a well-known limitation of binary floating point
numbers, and is
inherent in the mathematics. The only "fix" would be to completely
redesign every bit of computer floating-point code and hardware in use,
following entirely different principles (true decimal floating point, or
full rational numbers, instead of binary floating point). And even
then, it would only partially solve the problem, because some numbers
simply cannot be represented with a finite (and very limited) number of
digits, regardless of the base that's used to represent them.
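As a rough illustration using the standard decimal and fractions modules (Python 3 syntax; the variable names are only for this example), one can inspect the value the binary float actually holds and see that switching to decimal moves the problem rather than removing it:

```python
from decimal import Decimal
from fractions import Fraction

# The exact value of the binary float nearest to 0.1.
exact = Decimal(0.1)
print(exact)

# The float is close to one tenth, but it is not one tenth.
float_is_one_tenth = (Fraction(0.1) == Fraction(1, 10))
print(float_is_one_tenth)

# A familiar consequence of the rounding.
sum_matches = (0.1 + 0.2 == 0.3)
print(sum_matches)

# Decimal arithmetic has the same kind of limit for other numbers:
# one third has no finite decimal expansion.
third = Decimal(1) / Decimal(3)
third_round_trips = (third * 3 == Decimal(1))
print(third_round_trips)
```

Each of the three comparisons comes out false, for the same underlying reason: a finite number of digits in a fixed base cannot hold every value.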
>
>> >>>
>
> Then why this?
>
>>>> print 0.1
>>>
> 0.1
> if (the interpreter uses repr() by default)
This is because 'print' uses str() by default, and str() rounds a float
to fewer significant digits than repr() does.
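In the Python 2 interpreters of this era, str() formats a float with 12 significant digits and repr() with 17. Later versions changed repr() to print the shortest string that round-trips, so both now show 0.1. The two older displays can still be reproduced with format(), as in this sketch (Python 3 syntax; variable names are illustrative):

```python
x = 0.1

# 17 significant digits: enough to expose the stored binary approximation.
seventeen_digits = format(x, ".17g")

# 12 significant digits: the rounding hides the approximation again.
twelve_digits = format(x, ".12g")

print(seventeen_digits)
print(twelve_digits)
```

Both strings come from the same float; they differ only in how many digits the conversion step keeps.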
>> FFFF = "15"
>> base3.convert(16, 2, FFFF)
>>
>> Now, is this intended to convert 15 to binary (10101, reading "15"
>> as hex), or FFFF to binary (11111111 11111111) ?? There's no way to
>> tell, and Python certainly shouldn't be trying to guess.
>
> Oh no? The Book says
>
>> Maybe any integer in the range [2, 36], or zero. If radix is zero,
>> the proper radix is guessed based on the contents of string;
>
> However, it doesen't work.
That's the case once a string has already been passed into the function.
The important point is that Python must be able to tell whether it's a
string or a numeric literal *before* it's passed into the function, and
before it has any knowledge of *what* function this parameter is
intended for. And anyhow, the way that Python "guesses" the proper
radix is a much simpler and more straightforward procedure than what
you're imagining. Such a "guess" will result in one of three possible
radixes (radices?) -- hexadecimal if the string starts with '0x', octal
if the string starts with '0' and the second character is *not* 'x', or
decimal otherwise. Whichever of these three options it guesses, it will
give an error (invalid literal) if there are characters that are not
appropriate for digits in that particular base.
Note that, by your logic of how these guesses should be done,
int('21',0) would be interpreted as being in trinary (base 3), and would
be equal to 7 decimal. It seems *extremely* unlikely that that would be
the intent, since there is virtually *never* any use for numbers in a
nonstandard base.
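The three-way rule above can be sketched in a few lines. This is an illustration of the rule as described for the Python 2 of this thread, not the interpreter's actual code; guess_radix and parse_like_old_literal are hypothetical names, and treating a lone "0" as decimal is an assumption of the sketch. It is written to run on Python 3.

```python
def guess_radix(text):
    """Pick 16, 8, or 10 from the prefix alone (Python 2 style rule)."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return 16
    if lowered.startswith("0") and len(lowered) > 1:
        return 8
    return 10

def parse_like_old_literal(text):
    """Parse using the guessed radix; invalid digits raise ValueError."""
    return int(text, guess_radix(text))

print(parse_like_old_literal("0xFF"))   # prefix says hexadecimal
print(parse_like_old_literal("017"))    # leading zero says octal
print(parse_like_old_literal("21"))     # otherwise decimal

# Base 3 is only used when it is requested explicitly.
print(int("21", 3))
```

Note that the guess never looks at which digits appear in the string; it looks only at the prefix, and complains afterwards if the digits do not fit.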
>>>>>> convert.base(10,16,2**256-1) See Mom, No quotes
>>>>>> here either!
>>>>>
>>>>>
>>> 0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>>
>>
>> That's because you're using an integer constant, which the parser
>> *can* distinguish from an identifier -- indeed, being able to
>> distinguish between those is why identifiers cannot start with a
>> numeric character.
>
> some more no quotes.
>
>>>> convert.base(10,16,e**e*tan(1)*4)
>>>
> 888E53DBD2D not correct value but it parsed!
>
>>>> convert.base(10,16,convert.numToBase(100,16))
>>>
> 40
That's because, in each of these cases, you're using an expression which
evaluates to a number. The parser separates 'e**e*tan(1)*4' (BTW,
maybe that's the "wrong" value because operator precedence isn't what
you're expecting/intending?) and 'convert.numToBase(100,16)', and
evaluates each of those subexpressions on its own. In both cases, what
reaches convert.base is the resulting value (a float, in the first
case), not the text you typed.
As another point, the first one only works if you've imported e from
math, and then it *is* interpreting e as an identifier.
>>> e
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
NameError: name 'e' is not defined
>>> from math import e
>>> e
2.7182818284590451
>>>
This is precisely why quotes are necessary -- to separate the
mathematical constant e from the hexadecimal digit e from the base-36
digit e.
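A brief sketch of the three readings, in Python 3 syntax (the variable names are only illustrative). The same characters mean different things depending on whether they are quoted and, if quoted, which radix is supplied:

```python
import math

# Unquoted: an identifier, here bound to Euler's constant (a float).
as_name = math.e

# Quoted: a string of digits, meaningless until a radix is supplied.
as_hex = int("1e", 16)      # digits 1 and 14 in base 16
as_base36 = int("1e", 36)   # digits 1 and 14 in base 36

print(as_name, as_hex, as_base36)
```

Without the quotes there would be no way to hand int() the digit string at all; the parser would already have tried to look up a name.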
Indeed, the same applies in your second example -- the lack of quotes is
how Python knows that 'convert.numToBase' represents a function to be
called instead of a numeric constant in some strange radix (that nobody
ever actually uses).
[consolidating a bit, from your other email]
> probably not of any practical value.
> When 64 bit, 128 bit and higher processor chips hit the mainstream you
> may change your opinion if you want to get a base 64 representation.
> My convert.base would do this quickly if we used the { and | as the
> values for 62 and 63 base 10.
-----------------
The number of bits that processors can use has almost zero correlation
with the usefulness of numbers represented in different bases. We
currently have 32-bit processors (in most cases), but that doesn't mean
we're using base 32 numbers for anything. We've used base 16 numbers
since long before 16-bit processors were standard. When 64-bit
processors become standard, humans will *not* learn to read base-64
numbers; we'll simply represent processor words with a longer string of
hexadecimal digits.
I say *almost* zero correlation, because the reason that hexadecimal is
so popular is that a standard byte (8 bits) can be exactly represented
using two hex digits. Every possible 8-bit value can be shown in two
hex digits, and every 2-hex-digit value can be shown in 8 binary digits
(bits). Humans typically find '0xE4' easier to read than '11100100', so
hex makes a convenient shorthand for looking at bit patterns. Note that
this means that 32 bits are equivalent to 8 hex digits, and 64 bits to
16 hex digits. Once upon a time, many mainframes/minicomputers used
9-bit, or 18-bit, or 27-bit words. 9 bits have that same mapping to
three octal digits, so for those machines, octal was the convenient
shorthand. As that type of machine has passed out of favor, octal has
been passing out of favor, too.
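The digit-to-bit grouping can be checked directly. This is a small sketch in Python 3 syntax, with illustrative variable names:

```python
value = 0xE4

# One byte as eight binary digits.
bits = format(value, "08b")

# Each hex digit stands for exactly four of those bits.
nibbles = [bits[i:i + 4] for i in range(0, 8, 4)]
hex_digits = [format(int(nibble, 2), "X") for nibble in nibbles]

# Nine bits split the same way into three octal digits of three bits each.
nine_bit_value = 0b111000101
octal_digits = format(nine_bit_value, "o")

# Word sizes map onto hex digit counts: 32 bits -> 8, 64 bits -> 16.
hex_width_32 = len(format(2**32 - 1, "x"))
hex_width_64 = len(format(2**64 - 1, "x"))

print(bits, nibbles, hex_digits, octal_digits, hex_width_32, hex_width_64)
```

A wider processor word just means a longer hex string, not a new base.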
The point of this diversion is simply to show that unusual bases are
extremely rare, and serve very little practical purpose, which is *why*
the Python interpreter is biased towards a few specific bases.
Jeff Shannon
Technician/Programmer
Credit International