[Tutor] built in functions int(),long()+convert.base(r1,r2,num)

Jeff Shannon jeff@ccvcorp.com
Fri Jun 20 15:26:02 2003


cino hilliard wrote:

>
> Hi Jeff, et al,
> Thank for your input and all your work in this list.
> I have comments below. Some of the examples will use convert.py and 
> base3.py
> Please take no offense at my remarks as I am trying to learn. If i 
> have made an unsupportable
> claim then a correction of that will reinforce my learning. 


I'm going to trim a lot of this, because it's digressing into side 
issues, but...

>> I use quotes because these are symbols (strings) representing 
>> something else (a number) -- but they all represent the *same* 
>> number.  A "pure" number doesn't have a base associated with it --
>
> What is a pure number? Pure base 62 is 6176790 base 10. 


A pure number is a theoretical mathematical construct.  This is getting 
into the realm of number theory, of which I have only the vaguest 
knowledge.  However, it is possible (in number theory notation) to 
express numbers without relying on any base.  The abstract, Platonic 
concept of a number is baseless, but it's impossible to talk about 
numbers in everyday terms without using some implied base, which (for 
almost all humans) is assumed to be base 10 unless otherwise specified. 
(Indeed, as Bob Gailer pointed out, the only reason that we can even 
talk about "base 10" is because we assume that we're looking at a base 
10 number; no matter what base we used, whether binary, octal, decimal, 
or hexadecimal, if it were our "default" base we would, by definition, 
write it as "base 10".  But now we're getting *really* far from the 
actual point...)

> [...]With out a base we cannot express the number. 


This being the point -- you can't *express* a number, but the number 
exists regardless.  It's the same *number* of cows, regardless of what 
words the farmers use to count them.  The number is a Platonic abstraction.

>> int() function (and atoi(), which is really the same thing) will 
>> convert a representation (which has a base associated with it) to the 
>> underlying number (which has no base).
>
>
> Balony. The number converted to is base 10 period. Sure it's binary 
> internally but to the
> viewer it is decimal. That is all the function can do in terms of base 
> conversion. I knew that. 


Once again, I can only point out that it's meaningless to talk about the 
base of an integer.  It's only meaningful to talk about a base in 
reference to a particular representation of that integer, and the same 
integer can be represented in multiple bases (but that doesn't stop it 
from being the *same* integer).  You're insisting that these numbers are 
converted to base 10 even though they're stored in base 2 -- that's 
contradictory.  int() really *does* convert everything to base 2 -- it's 
just that every time Python shows you an integer, it converts it to base 
10 *at the point of display*, because it assumes that this will be 
easier for you to read.
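
To make that concrete, here's a small sketch (written for a current 
Python 3 interpreter, which is an assumption on my part; the exact 
spelling of the octal output differs in older versions).  Four different 
strings in four different bases all produce the *same* integer, and the 
base only reappears when that integer is turned back into a string for 
display:

```python
# Four representations (strings), each read with its own radix.
a = int("255")           # default radix is 10
b = int("ff", 16)        # hexadecimal
c = int("377", 8)        # octal
d = int("11111111", 2)   # binary

# They are all the same integer; no base is attached to the result.
print(a == b == c == d)  # True

# A base only shows up again when we ask for a string to display.
print(str(a))            # decimal representation: 255
print(hex(a))            # hexadecimal representation: 0xff
print(oct(a))            # octal representation
```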

> The issue is the vague definition which by the way is not life 
> threatning. It just ain't clear to me. That
> was my point. Read the thing carefully and you will see it too. Don't 
> keep snipping it when you try to defend it. In uppercase BELOW I give 
> a more precise definition of what python actually does.
> Like I said earlier, it is not life threatning. 


In almost all circumstances, it doesn't hurt anything to treat all 
integers as if they're base 10.  In fact, it's usually easiest to do 
that, because that's the way we humans think.  (At least, it's how 
*most* humans think -- base 10 is dictated by culture and convention, 
not biology, but it seems to be pretty universal...)

And, since you're specifically asking me to not keep snipping....

> atoi( s[, base])
>
> Deprecated since release 2.0. Use the int() built-in function.
> Convert string s to an integer in the given base. The string must 
> consist of one or more digits, optionally 


I'll grant that this description isn't as clear as it could be.  The 
docs for int() are a bit clearer.  Let's look at them again, without 
your interspersed comments for the moment (I'll get back to those).

"""
int(x[, radix])
Convert a string or number to a plain integer. If the argument is a 
string, it must contain a possibly signed decimal number representable 
as a Python integer, possibly embedded in whitespace; this behaves 
identical to string.atoi(x[, radix]). The radix parameter gives the base 
for the conversion and may be any integer in the range [2, 36], or zero. 
If radix is zero, the proper radix is guessed based on the contents of 
string; the interpretation is the same as for integer literals. If radix 
is specified and x is not a string, TypeError is raised. Otherwise, the 
argument may be a plain or long integer or a floating point number. 
Conversion of floating point numbers to integers truncates (towards zero).
"""

Note that there is no mention of "base 10", only that the interpretation 
is, by default, the same as for integer literals (which are assumed to 
be in base 10 unless otherwise indicated).  Note also that radix is not 
used in reference to the integer result, but rather in reference to the 
input parameter, and describes how that parameter (if it's a string) 
should be converted to an integer.  It's significant that radix applies 
*only* when the input parameter is a string -- if x is already a number, 
then the radix is irrelevant because numbers don't have bases, only the 
representations (strings) do.  The fact that every time you see a 
number, it's been converted to a string in base 10, does not mean that 
the underlying number is base 10; it just means that it must be 
represented somehow, and that representation must have a base.
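
A quick sketch of those rules, assuming a current Python 3 interpreter 
(the wording of the error message is whatever your version prints, so 
I don't rely on it here):

```python
# A string argument may carry a sign and surrounding whitespace.
print(int("  -42  "))        # -42

# The radix describes the *string*, not the resulting integer.
print(int("1f", 16))         # 31

# Radix 0 means "guess from the prefix, as for an integer literal".
print(int("0x1f", 0))        # 31

# Floats are truncated towards zero.
print(int(3.9), int(-3.9))   # 3 -3

# A radix makes no sense when x is already a number, so int() refuses.
try:
    int(255, 16)
    radix_on_number_failed = False
except TypeError as exc:
    radix_on_number_failed = True
    print("TypeError:", exc)
```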

>
> int( x[, radix])
>
> Convert a string or number to a plain integer. If the argument is a 
> string, it must contain a possibly
>
> CONVERT A STRING OR NUMBER IN BASE RADIX TO A PLAIN INTEGER IN BASE 
> 10. NOTE THAT YOU
> CANNOT GO THE OTHER WAY - FOR EXAMPLE CONVERT STRING BASE 10 TO BASE 2 
> IS NOT
> SUPPORTED. 


As I mentioned above, base 10 is not specified.  The reason that you 
can't go the "other way" is because there is no other way -- an integer 
has no base.  You can *display* the integer using base 2, if you really 
want, though Python doesn't have a built-in binary() function to 
complement the hex() and oct() functions.  But the integer itself 
doesn't care what base you think it is.
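
If you do want to display an integer in some other base, it only takes 
a few lines.  The helper below is a hypothetical sketch of my own (the 
name to_base and the digit alphabet are assumptions, not anything from 
the standard library or from your convert.py); later Python versions 
did add a bin() built-in, but this works for any radix from 2 to 36:

```python
import string

# Digit symbols for radices up to 36: '0'-'9' followed by 'A'-'Z'.
DIGITS = string.digits + string.ascii_uppercase

def to_base(n, base):
    """Return a string representing the integer n in the given base (2..36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in the range 2..36")
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, remainder = divmod(n, base)   # peel off the lowest digit
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))

n = int("FFFF", 16)      # one integer...
print(to_base(n, 2))     # ...shown in binary
print(to_base(n, 10))    # ...shown in decimal
print(to_base(n, 16))    # ...shown in hexadecimal
```

Note that the function takes an integer and returns a string: the base 
belongs to the output string, not to n.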

> signed decimal number representable as a Python integer, possibly 
> embedded in whitespace; this
> behaves identical to string.atoi(x[, radix]). The radix parameter 
> gives the base for the conversion and
>
>
> THE RADIX IS THE BASE WE ARE CONVERTING THE STRING FROM. THE RESULT 
> WILL ALWAYS
> BE A PYTHON  INTEGER IN BASE 10. YOU CANNOT CONVERT TO ANY OTHER BASE 
> THAN 10. 


You cannot convert to any other base than 2 (a series of charges in a 
set of transistors), but every time that Python shows you the number, 
it'll automatically convert it to base 10 for you, because humans aren't 
very good at interpreting the electrical capacitance of transistors. 
 But all of this is beside the point, because, as I said before, an 
integer doesn't care what base you think it is, it's still the same number.

> Maybe any integer in the range [2, 36], or zero. If radix is zero, the 
> proper radix is guessed based on
> Hmm.., Example of 0 radix errors out
> SyntaxError: invalid syntax
>
>>>> int('JEFFSHANNON',0)
>>>
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
> ValueError: invalid literal for int(): JEFFSHANNON
>
>>>>
> It appears that a sufficient magnitude in radix is required to avoid 
> errors. Radix 29 takes you to letter S,
> The largest letter in JEFFSHANNON. Then radix 29-36 will not error on 
> that input. 


Obviously, you'll have an error if any symbol in the string is 
meaningless for a given base.  This is the same thing as having a '9' in 
a string that you're trying to interpret as octal, or a '5' in a string 
you're trying to interpret as binary.  The symbol is meaningless in that 
base, so the only reasonable thing for Python to do is report it as an 
error.
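
Here's a short sketch of that behaviour (try_int is a hypothetical 
helper of mine that just turns the ValueError into None so the loop can 
keep going).  'S' stands for the digit value 28, so the smallest radix 
in which every symbol of the string is meaningful is 29:

```python
def try_int(text, radix):
    """Return int(text, radix), or None if some symbol is invalid in that radix."""
    try:
        return int(text, radix)
    except ValueError:
        return None

# Radices 16 and 28 fail, because 'S' (digit value 28) is out of range there.
for radix in (16, 28, 29, 36):
    print(radix, try_int("JEFFSHANNON", radix))

# The same rule with more familiar digits.
print(try_int("9", 8))   # None: there is no digit 9 in octal
print(try_int("5", 2))   # None: there is no digit 5 in binary
```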

>> What has you confused is that Python, when asked to show you an
>
> I am not confused at what python does. That is obvious. I am confused 
> in how your manual
> describes it. Ii know what it does. It just doesn't do what you say it 
> is supposed to do
> the way I read the definition. 


But you *are* confused, because you're conflating the step where Python 
shows you a number, with how Python actually keeps track of the number. 
 Internal manipulation is a separate issue from display, and there *is* 
a conversion step in between.  The conversion is especially apparent 
when dealing with floats, because the same float will display 
differently depending on whether you use str() or repr() (the 
interpreter uses repr() by default) --

 >>> repr(0.1)
'0.10000000000000001'
 >>>

Note that the number that the interpreter prints is different from the 
number I put in.  That's an artifact of conversion -- the decimal (base 
10) value is converted to a floating point (base 2) value, and because 
of representation issues the conversion is not perfect.  (Just as 1/3 
cannot be exactly represented in base 10, since 0.3333... is inexact, 
but can be exactly expressed in base 3 where it's 0.1, 1/10 cannot be 
exactly represented in base 2.)  That binary floating point is then 
converted *back* to a decimal value (by repr() ) to be printed to the 
screen.  That same conversion happens with integers, too, although 
integers don't have the conversion errors that fractional values do. 
 But the interpreter *does* use repr() to convert those integers to 
strings that can be displayed on the screen.
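
For anyone reading this on a newer interpreter: current versions of 
Python print the shortest string that round-trips, so repr(0.1) shows 
'0.1' there, but the stored value is the same inexact binary fraction. 
The sketch below (assuming Python 3 and the standard decimal module) 
makes the two conversion steps visible:

```python
from decimal import Decimal

x = 0.1   # decimal text -> binary floating point (inexact)

# Decimal(x) shows the exact value of the stored binary float.
print(Decimal(x))

# Binary float -> decimal text, with 17 significant digits.
print(format(x, ".17g"))   # 0.10000000000000001

# The stored value is not the same number as the decimal 0.1.
print(Decimal(x) == Decimal("0.1"))   # False

# Integers go through the same display conversion, but without error.
print(int(repr(12345)) == 12345)      # True
```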

>>> Question. How can I get convert.base(r1,r2,str) without having to 
>>> type the quotes for the string?
>>> Eg., convert(16,10,FFFF) instead of convert(16,10,"FFFF") = 65535 ?
>>
>>
>> You can't, because the Python parser would have no way of knowing
>
> Of course you can. Change python >>> CONSOLE source to do it like dos. 
> See my examples above. 


The Python parser, as I tried to explain, must deal with a much richer 
and more diverse environment than the DOS commandline parser does. 
 Because there are so many more constructs for the Python parser to deal 
with, there's a lot more room for ambiguity.  This means that in many 
cases, it's necessary for the programmer to put a little more effort 
into resolving that ambiguity.  There are only a couple of ways it makes 
sense to read a commandline parameter, and in fact DOS simply treats 
*every* commandline parameter as a string, and passes that set of 
strings to the program -- there are no other options.  However, a given 
word in a Python script could be interpreted in a wide variety of ways. 
Thus, it's necessary to indicate whether a word is an integer literal 
(consists only of digits, or starts with '0x' and contains only digits 
and letters ABCDEF), an identifier, or a string literal.  DOS doesn't 
need to worry about whether something is an identifier, because it 
*can't* be one; Python needs to have something to distinguish strings 
from identifiers, and so it uses quotes.  This is not something that can 
be changed in Python, though you can use a DOS command line to feed the 
quote-less string into a Python program (as you've done with your script).
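
Roughly speaking, this is what a script sees when it is started from 
the command line.  The sketch below is *not* your base3.py; parse_args 
is a hypothetical helper, and the sample argument list just imitates 
the command line "python base3.py 36 10 JEFFSHANNON":

```python
import sys

def parse_args(argv):
    """Split an argv-style list of strings into (r1, r2, num).

    Every element arrives as a string, so the two radices have to be
    converted with int(); num stays a string on purpose.
    """
    r1, r2, num = argv[1:4]
    return int(r1), int(r2), num

# Whatever was typed on the command line, sys.argv holds only strings.
print(all(isinstance(arg, str) for arg in sys.argv))   # True

# In a real script this would be parse_args(sys.argv).
r1, r2, num = parse_args(["base3.py", "36", "10", "JEFFSHANNON"])
print(r1, r2, repr(num))
print(int(num, r1))   # 70932403861357991
```

No quotes are needed on the command line because nothing there ever 
passes through the Python parser; the quotes in the source code are 
what tell the parser "this is a string".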

> this is dumb. It should look at it as a string. would require some 
> thinking but it should be doable.
> Try the dos example below. Obviouslt python is reading the stdin 
> base3.py and passing the command
> tail to variables r1,r2 and num. Python.exe is smart. it is the >>> 
> console that doesn't understand.
>
> c:\Python23>python base3.py 36 10 JEFFSHANNON         Look mom, no 
> quotes.
> 70932403861357991 


No, it's not the console, it's the parser, which is separate from 
(though used by) the console.  In your example, Python is *not* reading 
stdin (it's reading sys.argv, which is a list of strings) and even if 
it were, stdin is by definition a string.  The commandline parameters to 
a program are parsed by DOS, and then fed to python as a set of strings. 
Standard input is gathered by DOS and then passed on to Python. 
 (Actually, given the world of virtual DOS machines within Windows and 
virtual device drivers, it's more complicated than that, but the point 
remains that commandline parameters and standard input come from the 
operating system, rather than through the Python parser.)  Python 
*isn't* that smart -- it has no idea that you're performing a conversion 
on the final parameter, and that the parameter must therefore be a 
string.  All it can do is follow one simple step at a time, exactly what 
you tell it to do.  And in order to know how to carry out your 
directions, it first breaks everything up into pieces and figures out 
what those pieces are.  It has to know that that final argument to the 
function call is a string before it can understand how to work with it. 
 Here's an example to show why Python *must* have quotes around strings:

FFFF = "15"
base3.convert(16, 2, FFFF)

Now, is this intended to convert hex 15 to binary (10101), or hex FFFF 
to binary (11111111 11111111) ??  There's no way to tell, 
and Python certainly shouldn't be trying to guess.
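
With the quotes in place, the two readings are easy to tell apart.  A 
minimal sketch using only the built-in int() (so it doesn't depend on 
base3.py at all):

```python
FFFF = "15"   # an identifier that happens to be named FFFF

# No quotes: the parser sees an identifier and looks up its value, "15".
from_variable = int(FFFF, 16)
print(from_variable)    # 21

# Quotes: the parser sees a string literal containing the text FFFF.
from_literal = int("FFFF", 16)
print(from_literal)     # 65535
```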

>>>> convert.base(10,16,2**256-1)                 See Mom, No quotes 
>>>> here either!
>>>
> 0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF 


That's because you're using an integer constant, which the parser *can* 
distinguish from an identifier -- indeed, being able to distinguish 
between those is why identifiers cannot start with a numeric character.
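
You can check that rule from within Python itself.  This sketch assumes 
Python 3, where strings have an isidentifier() method:

```python
# A word made of letters is a legal identifier...
print("FFFF".isidentifier())     # True

# ...but anything starting with a digit is not, so the parser can
# safely treat it as the start of a numeric literal.
print("0xFFFF".isidentifier())   # False
print("2fast".isidentifier())    # False

# Written without quotes, 0xFFFF is therefore read as an integer literal.
print(0xFFFF)                    # 65535
```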

Jeff Shannon
Technician/Programmer
Credit International