[Tutor] built in functions int(),long()+convert.base(r1,r2,num)
Jeff Shannon
jeff@ccvcorp.com
Fri Jun 20 15:26:02 2003
cino hilliard wrote:
>
> Hi Jeff, et al,
> Thanks for your input and all your work in this list.
> I have comments below. Some of the examples will use convert.py and
> base3.py
> Please take no offense at my remarks as I am trying to learn. If i
> have made an unsupportable
> claim then a correction of that will reinforce my learning.
I'm going to trim a lot of this, because it's digressing into side
issues, but...
>> I use quotes because these are symbols (strings) representing
>> something else (a number) -- but they all represent the *same*
>> number. A "pure" number doesn't have a base associated with it --
>
> What is a pure number? Pure base 62 is 6176790 base 10.
A pure number is a theoretical mathematical construct. This is getting
into the realm of number theory, of which I only have the vaguest of
knowledge. However, it is possible (in number theory notation) to
express numbers without relying on any base. The abstract, Platonic
concept of a number is baseless, but it's impossible to talk about
numbers in everyday terms without using some implied base, which (for
almost all humans) is assumed to be base 10 unless otherwise specified.
(Indeed, as Bob Gailer pointed out, the only reason that we can even
talk about "base 10" is because we assume that we're looking at a base
10 number; no matter what base we used, whether binary, octal, decimal,
or hexadecimal, if it were our "default" base we would, by definition,
write it as "base 10". But now we're getting *really* far from the
actual point...)
> [...]With out a base we cannot express the number.
This being the point -- you can't *express* a number, but the number
exists regardless. It's the same *number* of cows, regardless of what
words the farmers use to count them. The number is a Platonic abstraction.
>> int() function (and atoi(), which is really the same thing) will
>> convert a representation (which has a base associated with it) to the
>> underlying number (which has no base).
>
>
> Baloney. The number converted to is base 10 period. Sure it's binary
> internally but to the
> viewer it is decimal. That is all the function can do in terms of base
> conversion. I knew that.
Once again, I can only point out that it's meaningless to talk about the
base of an integer. It's only meaningful to talk about a base in
reference to a particular representation of that integer, and the same
integer can be represented in multiple bases (but that doesn't stop it
from being the *same* integer). You're insisting that these numbers are
converted to base 10 even though they're stored in base 2 -- that's
contradictory. int() really *does* convert everything to base 2 -- it's
just that every time Python shows you an integer, it converts it to base
10 *at the point of display*, because it assumes that this will be
easier for you to read.
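A small sketch of that point, using only built-ins as they behave in current Python 3 (the variable names are mine, purely for illustration): three strings in three different bases all produce the same integer, and a base only reappears when that integer is turned back into text.

```python
# Three different representations (strings) of one and the same number.
from_hex = int("FF", 16)
from_dec = int("255", 10)
from_bin = int("11111111", 2)

# The resulting integers are indistinguishable; none of them "remembers" a base.
same = from_hex == from_dec == from_bin

# A base only shows up again when the integer is converted back to text.
as_decimal = str(from_hex)        # the default the interpreter uses for display
as_hex = format(from_hex, "x")
as_octal = format(from_hex, "o")

print(same, as_decimal, as_hex, as_octal)   # True 255 ff 377
```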
> The issue is the vague definition which by the way is not life
> threatening. It just ain't clear to me. That
> was my point. Read the thing carefully and you will see it too. Don't
> keep snipping it when you try to defend it. In uppercase BELOW I give
> a more precise definition of what python actually does.
> Like I said earlier, it is not life threatening.
In almost all circumstances, it doesn't hurt anything to treat all
integers as if they're base 10. In fact, it's usually easiest to do
that, because that's the way we humans think. (At least, it's how
*most* humans think -- base 10 is dictated by culture and convention,
not biology, but it seems to be pretty universal...)
And, since you're specifically asking me to not keep snipping....
> atoi( s[, base])
>
> Deprecated since release 2.0. Use the int() built-in function.
> Convert string s to an integer in the given base. The string must
> consist of one or more digits, optionally
I'll grant that this description isn't as clear as it could be. The
docs for int() are a bit clearer. Let's look at them again, without
your interspersed comments for the moment (I'll get back to those).
"""
int(x[, radix])
Convert a string or number to a plain integer. If the argument is a
string, it must contain a possibly signed decimal number representable
as a Python integer, possibly embedded in whitespace; this behaves
identical to string.atoi(x[, radix]). The radix parameter gives the base
for the conversion and may be any integer in the range [2, 36], or zero.
If radix is zero, the proper radix is guessed based on the contents of
string; the interpretation is the same as for integer literals. If radix
is specified and x is not a string, TypeError is raised. Otherwise, the
argument may be a plain or long integer or a floating point number.
Conversion of floating point numbers to integers truncates (towards zero).
"""
Note that there is no mention of "base 10", only that the interpretation
is, by default, the same as for integer literals (which are assumed to
be in base 10 unless otherwise indicated). Note also that radix is not
used in reference to the integer result, but rather in reference to the
input parameter, and describes how that parameter (if it's a string)
should be converted to an integer. It's significant that radix applies
*only* when the input parameter is a string -- if x is already a number,
then the radix is irrelevant because numbers don't have bases, only the
representations (strings) do. The fact that every time you see a
number, it's been converted to a string in base 10, does not mean that
the underlying number is base 10; it just means that it must be
represented somehow, and that representation must have a base.
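The rules quoted from the docs can be checked directly. This is only a sketch against current Python 3 (where the second argument is called base rather than radix); the variable names are made up for the example.

```python
# The radix describes the *input string*, never the result.
hex_value = int("FFFF", 16)
padded_value = int("  -42  ")       # signed, embedded in whitespace, default radix

# Radix 0: guess the base the same way integer literals are read.
guessed_value = int("0x1F", 0)

# Floating point arguments are truncated towards zero.
truncated = (int(3.9), int(-3.9))

# Giving a radix together with a non-string argument is an error.
try:
    int(255, 16)
except TypeError:
    radix_with_number_fails = True
else:
    radix_with_number_fails = False

print(hex_value, padded_value, guessed_value, truncated, radix_with_number_fails)
```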
>
> int( x[, radix])
>
> Convert a string or number to a plain integer. If the argument is a
> string, it must contain a possibly
>
> CONVERT A STRING OR NUMBER IN BASE RADIX TO A PLAIN INTEGER IN BASE
> 10. NOTE THAT YOU
> CANNOT GO THE OTHER WAY - FOR EXAMPLE CONVERT STRING BASE 10 TO BASE 2
> IS NOT
> SUPPORTED.
As I mentioned above, base 10 is not specified. The reason that you
can't go the "other way" is because there is no other way -- an integer
has no base. You can *display* the integer using base 2, if you really
want, though Python doesn't have a built-in binary() function to
complement the hex() and oct() functions. But the integer itself
doesn't care what base you think it is.
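For displaying an integer in an arbitrary base, a helper along the following lines would do. This is a minimal sketch, not anything from the standard library: to_base and DIGITS are names I've invented here. (Later Python releases, 2.6 onwards, did add a built-in bin() alongside hex() and oct().)

```python
import string

# "0".."9" followed by "A".."Z": enough symbols for bases 2 through 36.
DIGITS = string.digits + string.ascii_uppercase

def to_base(number, base):
    """Return a string representing the integer `number` in `base` (2..36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in the range 2..36")
    if number == 0:
        return "0"
    sign = "-" if number < 0 else ""
    number = abs(number)
    digits = []
    while number:
        # Peel off the least significant digit on each pass.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return sign + "".join(reversed(digits))

print(to_base(65535, 2))    # 1111111111111111
print(to_base(65535, 16))   # FFFF
```

Note that the function takes an integer and returns a string: the base belongs to the output text, not to the number going in.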
> signed decimal number representable as a Python integer, possibly
> embedded in whitespace; this
> behaves identical to string.atoi(x[, radix]). The radix parameter
> gives the base for the conversion and
>
>
> THE RADIX IS THE BASE WE ARE CONVERTING THE STRING FROM. THE RESULT
> WILL ALWAYS
> BE A PYTHON INTEGER IN BASE 10. YOU CANNOT CONVERT TO ANY OTHER BASE
> THAN 10.
You cannot convert to any other base than 2 (a series of charges in a
set of transistors), but every time that Python shows you the number,
it'll automatically convert it to base 10 for you, because humans aren't
very good at interpreting the electrical capacitance of transistors.
But all of this is beside the point, because, as I said before, an
integer doesn't care what base you think it is, it's still the same number.
> Maybe any integer in the range [2, 36], or zero. If radix is zero, the
> proper radix is guessed based on
> Hmm.., Example of 0 radix errors out
> SyntaxError: invalid syntax
>
> >>> int('JEFFSHANNON',0)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ValueError: invalid literal for int(): JEFFSHANNON
> >>>
> It appears that a sufficient magnitude in radix is required to avoid
> errors. Radix 29 takes you to letter S,
> The largest letter in JEFFSHANNON. Then radix 29-36 will not error on
> that input.
Obviously, you'll have an error if any symbol in the string is
meaningless for a given base. This is the same thing as having a '9' in
a string that you're trying to interpret as octal, or a '5' in a string
you're trying to interpret as binary. The symbol is meaningless in that
base, so the only reasonable thing for Python to do is report it as an
error.
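To make that concrete, here is a small sketch (parses_in_base is a helper name I've made up) that asks int() whether a string is acceptable in a given base. In base 36 the letter 'S' has the value 28, so 'JEFFSHANNON' needs a base of at least 29.

```python
def parses_in_base(text, base):
    """Return True if int() accepts `text` as a number written in `base`."""
    try:
        int(text, base)
    except ValueError:
        return False
    return True

print(parses_in_base("9", 8))              # False: '9' is not an octal digit
print(parses_in_base("5", 2))              # False: '5' is not a binary digit
print(parses_in_base("JEFFSHANNON", 28))   # False: 'S' needs a base of 29 or more
print(parses_in_base("JEFFSHANNON", 29))   # True
print(int("JEFFSHANNON", 36))              # 70932403861357991
```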
>> What has you confused is that Python, when asked to show you an
>
> I am not confused at what python does. That is obvious. I am confused
> in how your manual
> describes it. I know what it does. It just doesn't do what you say it
> is supposed to do
> the way I read the definition.
But you *are* confused, because you're conflating the step where Python
shows you a number, with how Python actually keeps track of the number.
Internal manipulation is a separate issue from display, and there *is*
a conversion step in between. The conversion is especially apparent
when dealing with floats, because the same float will display
differently depending on whether you use str() or repr() (the
interpreter uses repr() by default) --
>>> repr(0.1)
'0.10000000000000001'
>>>
Note that the number that the interpreter prints is different from the
number I put in. That's an artifact of conversion -- the decimal (base
10) value is converted to a floating point (base 2) value, and because
of representation issues the conversion is not perfect. (Just as 1/3
cannot be exactly represented in base 10, since 0.3333... is inexact,
but can be exactly expressed in base 3 where it's 0.1, 1/10 cannot be
exactly represented in base 2.) That binary floating point is then
converted *back* to a decimal value (by repr() ) to be printed to the
screen. That same conversion happens with integers, too, although
integers don't have the conversion errors that fractional values do.
But the interpreter *does* use repr() to convert those integers to
strings that can be displayed on the screen.
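The stored value can be inspected exactly with the fractions module. This is a sketch for current Python 3, where repr() picks the shortest string that round-trips and so prints '0.1'; asking for 17 significant digits still reveals the same conversion artifact. The variable names are mine.

```python
from fractions import Fraction

stored = Fraction(0.1)        # the exact value of the binary float
intended = Fraction(1, 10)    # the exact value of one tenth

# The float is only the nearest binary fraction to 1/10, not 1/10 itself.
print(stored == intended)     # False

# Its denominator is a power of two, as any base 2 fraction's must be.
d = stored.denominator
print(d & (d - 1) == 0)       # True

# Seventeen significant digits expose the difference on display.
seventeen_digits = "%.17g" % 0.1
print(seventeen_digits)       # 0.10000000000000001
```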
>>> Question. How can I get convert.base(r1,r2,str) without having to
>>> type the quotes for the string?
>>> Eg., convert(16,10,FFFF) instead of convert(16,10,"FFFF") = 65535 ?
>>
>>
>> You can't, because the Python parser would have no way of knowing
>
> Of course you can. Change python >>> CONSOLE source to do it like dos.
> See my examples above.
The Python parser, as I tried to explain, must deal with a much richer
and more diverse environment than the DOS commandline parser does.
Because there are so many more constructs for the Python parser to deal
with, there's a lot more room for ambiguity. This means that in many
cases, it's necessary for the programmer to put a little more effort
into resolving that ambiguity. There are only a couple of ways it makes
sense to read a commandline parameter, and in fact DOS simply treats
*every* commandline parameter as a string, and passes that set of
strings to the program -- there are no other options. However, a given
word in a Python script could be interpreted in a wide variety of ways.
Thus, it's necessary to indicate whether a word is an integer literal
(consists only of digits, or starts with '0x' and contains only digits
and the letters A-F), an identifier, or a string literal. DOS doesn't
need to worry about whether something is an identifier, because it
*can't* be one; Python needs to have something to distinguish strings
from identifiers, and so it uses quotes. This is not something that can
be changed in Python, though you can use a DOS command line to feed the
quote-less string into a Python program (as you've done with your script).
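The standard tokenize module shows how the three cases are told apart. This is a sketch for Python 3; first_token_kind is a helper name invented for the example.

```python
import io
import tokenize

def first_token_kind(source):
    """Return the tokenizer's name for the first token in `source`."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return tokenize.tok_name[next(tokens).type]

print(first_token_kind("FFFF"))      # NAME   (an identifier)
print(first_token_kind("0xFFFF"))    # NUMBER (an integer literal)
print(first_token_kind('"FFFF"'))    # STRING (a string literal)
```

Without the quotes or the '0x' prefix, the same four letters can only be read as a name to be looked up.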
> this is dumb. It should look at it as a string. would require some
> thinking but it should be doable.
> Try the dos example below. Obviously python is reading the stdin
> base3.py and passing the command
> tail to variables r1,r2 and num. Python.exe is smart. it is the >>>
> console that doesn't understand.
>
> c:\Python23>python base3.py 36 10 JEFFSHANNON Look mom, no
> quotes.
> 70932403861357991
No, it's not the console, it's the parser, which is separate from
(though used by) the console. In your example, Python is *not* reading
stdin (it's reading sys.argv, which is a list of strings) and even if
it were, stdin is by definition a string. The commandline parameters to
a program are parsed by DOS, and then fed to python as a set of strings.
Standard input is gathered by DOS and then passed on to Python.
(Actually, given the world of virtual DOS machines within Windows and
virtual device drivers, it's more complicated than that, but the point
remains that commandline parameters and standard input come from the
operating system, rather than through the Python parser.) Python
*isn't* that smart -- it has no idea that you're performing a conversion
on the final parameter, and that the parameter must therefore be a
string. All it can do is follow one simple step at a time, exactly what
you tell it to do. And in order to know how to carry out your
directions, it first breaks everything up into pieces and figures out
what those pieces are. It has to know that that final argument to the
function call is a string before it can understand how to work with it.
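I haven't seen the inside of base3.py, so the following is only a hypothetical stand-in for it (main and its argument layout are my assumptions). It shows why no quotes are needed on the command line: every element of sys.argv already arrives as a string.

```python
import sys

def main(argv):
    """Hypothetical stand-in for base3.py; argv is [script, r1, r2, num]."""
    # Each argv element is already a str. The operating system split the
    # command line into words; the Python parser never saw them.
    r1, r2, num = int(argv[1]), int(argv[2]), argv[3]
    value = int(num, r1)      # string in base r1 -> integer
    if r2 != 10:
        raise NotImplementedError("this sketch only writes base 10")
    return str(value)         # integer -> string in base 10

if __name__ == "__main__" and len(sys.argv) == 4:
    print(main(sys.argv))
```

Calling main(["base3.py", "36", "10", "JEFFSHANNON"]) returns the string "70932403861357991", matching the DOS example above.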
Here's an example to show why Python *must* have quotes around strings:
FFFF = "15"
base3.convert(16, 2, FFFF)
Now, is this intended to convert 15 to binary (10101, reading "15" as
hex), or FFFF to binary (11111111 11111111)? There's no way to tell,
and Python certainly shouldn't be trying to guess.
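With the quotes in place the two readings stay distinct. A sketch using the built-in int() in place of base3.convert (whose code I don't have), reading both strings as base 16:

```python
FFFF = "15"                       # a variable that happens to be named FFFF

via_identifier = int(FFFF, 16)    # looks up the name, reads "15" as hex
via_literal = int("FFFF", 16)     # reads the four characters FFFF as hex

print(via_identifier, format(via_identifier, "b"))   # 21 10101
print(via_literal, format(via_literal, "b"))         # 65535 1111111111111111
```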
> >>> convert.base(10,16,2**256-1)      See Mom, No quotes here either!
> 0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
That's because you're using an integer constant, which the parser *can*
distinguish from an identifier -- indeed, being able to distinguish
between those is why identifiers cannot start with a numeric character.
Jeff Shannon
Technician/Programmer
Credit International