[Python-ideas] Add a builtin method to 'int' for base/radix conversion

MRAB python at mrabarnett.plus.com
Mon Aug 31 17:47:24 CEST 2009


Masklinn wrote:
> On 31 Aug 2009, at 15:00, Nick Coghlan wrote:
>> Yuvgoog Greenle wrote:
>>> I believe int(s, base) needs an inverse function to allow string
>>> representation with different bases. An example use case is 'hashing' a
>>> counter like video IDs on YouTube: you could use a regular int
>>> internally and publish a shorter base-62 ID for links.
>>>
>>> This subject was discussed about 3.5 years ago:
>>> http://mail.python.org/pipermail/python-dev/2006-January/059789.html
>>>
>>> I opened a feature request ticket:
>>> http://bugs.python.org/issue6783
>>>
>>> Some of the questions that remain:
>>> 1. Whether this should be a method for int or a regular function in a
>>> standard library module like math.
>>> 2. What should the method/function be called? (base_convert, radix, etc)
>>>
>>> What do you guys think?
>>
>> This has been coming up for years and always gets bogged down in a
>> spelling argument (a method on int, a function in the math module and an
>> update to the str.format mini language would be the current contenders).
>>
>> However, most of the real use cases for bases between 2 and 36
>> were dealt with by the addition of binary and octal output to string
>> formatting, so the impetus to do anything about it is now a lot lower.
>>
>> As far as bases between 37 and 62 go, that would involve first getting
>> agreement on extending int() to handle those bases by allowing case
>> sensitive digit parsing. Presumably that would use string lexical
>> ordering so that int('a', 37) > int('A', 37) and int('b', 37) would
>> raise an exception.
>>
>> That would only be intuitive to someone that knows how ASCII based
>> alphanumeric ordering works though.

ASCII? Surely it should be Unicode! :-)
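
To make the ordering Nick describes concrete, here is a minimal sketch. The
name parse_int and the digit string are my own assumptions for illustration;
nothing like this exists in int() today.

```python
import string

# Hypothetical digit alphabet in ASCII order: 0-9, then A-Z, then a-z.
ORDERED_DIGITS = string.digits + string.ascii_uppercase + string.ascii_lowercase

def parse_int(text, base):
    """Case-sensitive parse of a non-negative integer in bases 2 to 62."""
    if not 2 <= base <= len(ORDERED_DIGITS):
        raise ValueError("base must be between 2 and 62")
    if not text:
        raise ValueError("no digits to parse")
    value = 0
    for ch in text:
        digit = ORDERED_DIGITS.find(ch)
        # A symbol is valid only if it lies within the first `base` digits.
        if digit < 0 or digit >= base:
            raise ValueError("invalid digit %r for base %d" % (ch, base))
        value = value * base + digit
    return value

print(parse_int('A', 37))  # 10
print(parse_int('a', 37))  # 36, so 'a' sorts above 'A'
# parse_int('b', 37) raises ValueError: 'b' would be digit 37
```

Note that under this scheme parse_int('a', 16) is also an error, unlike
int('a', 16), which is part of why agreement would be needed first.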

> Or it could be handled via a translation table (needed both ways of 
> course) mapping n indexes to n characters (with n the base you're 
> working with), defaulting to something sane.
> 
The default could cover only bases 2 to 36. Any base > 36 would require
a user-supplied translation table.
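
Roughly what I have in mind, as a sketch only. The function name to_base and
its signature are hypothetical, not a proposed spelling:

```python
import string

# Default table covers bases 2 to 36 only (0-9 then a-z).
DEFAULT_DIGITS = string.digits + string.ascii_lowercase

def to_base(n, base, table=None):
    """Return the string form of integer n in the given base."""
    if table is None:
        if not 2 <= base <= 36:
            raise ValueError("bases outside 2..36 need an explicit table")
        table = DEFAULT_DIGITS
    if base < 2 or len(table) < base:
        raise ValueError("table must supply at least `base` symbols")
    if n == 0:
        return table[0]
    sign = '-' if n < 0 else ''
    n = abs(n)
    digits = []
    while n:
        n, remainder = divmod(n, base)
        digits.append(table[remainder])
    # Remainders come out least significant first, so reverse them.
    return sign + ''.join(reversed(digits))

print(to_base(255, 16))  # ff

# A user-supplied table for base 62 (this particular ordering is an assumption).
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase
print(to_base(62, 62, BASE62))  # 10
```

For bases up to 36 the output round-trips through the existing int(s, base).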

> Though I'm not sure this is of much interest really: even Erlang (which
> provides pretty good base conversion tools: it supports literal integers
> of any base between 2 and 36) doesn't natively support bases beyond 36.
> A library would probably be better for those more contentious (or less
> intuitive) ranges.
> 
It could permit a dict as the translation table when 'decoding' so that
both 'A' and 'a' could be mapped to 10, if necessary.
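
A sketch of the decoding side, again with made-up names (from_base and
CASE_INSENSITIVE are only for illustration):

```python
import string

def from_base(text, base, table):
    """Decode text in the given base; table maps each character to its value."""
    if base < 2:
        raise ValueError("base must be at least 2")
    sign = 1
    if text.startswith('-'):
        sign, text = -1, text[1:]
    if not text:
        raise ValueError("no digits to decode")
    value = 0
    for ch in text:
        if ch not in table:
            raise ValueError("unknown digit %r" % ch)
        digit = table[ch]
        if digit >= base:
            raise ValueError("digit %r is out of range for base %d" % (ch, base))
        value = value * base + digit
    return sign * value

# A dict lets several characters share one value: 'A' and 'a' both map to 10.
CASE_INSENSITIVE = {}
for index, ch in enumerate(string.digits + string.ascii_lowercase):
    CASE_INSENSITIVE[ch] = index
    CASE_INSENSITIVE[ch.upper()] = index

print(from_base('ff', 16, CASE_INSENSITIVE))  # 255
print(from_base('FF', 16, CASE_INSENSITIVE))  # 255
```

Encoding would still need a one-to-one table, since only one of 'A' or 'a'
can be emitted for the value 10.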


