
On Mon, 2 May 2022 at 14:49, Simon de Vlieger <cmdr@supakeen.com> wrote:
Hey there,
The `int()` function lets you specify a base to convert from, for example:
>>> int("foo", 26)
10788
Which is documented as:
The base defaults to 10. Valid bases are 0 and 2-36.
For other common bases functions exist in the base64 module in stdlib.
I often need other bases, or bases with custom alphabets. Doing so involves a bit of code every time, which I think could be generalized, so I'd like to propose `int.to_base()` and `int.from_base()`. These would not supersede or replace any current functionality, but would extend and simplify what is already possible.
The signature(s) I had in mind for now are akin to:
int.from_base(x, alphabet, padding_character)
and
int.to_base(alphabet, padding_character)
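To make the proposed behaviour concrete, here is a minimal sketch written as free functions rather than `int` methods. The names `to_base` and `from_base` mirror the proposal, but the bodies are only one possible reading of it: the base is assumed to be the length of the alphabet, negative numbers are rejected, and `padding_character` is left out because its semantics are not spelled out above.

```python
import string


def to_base(n: int, alphabet: str) -> str:
    """Encode a non-negative integer using the symbols in `alphabet`.

    Assumption: the base is len(alphabet), most significant digit first.
    """
    base = len(alphabet)
    if base < 2:
        raise ValueError("alphabet must contain at least two symbols")
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return alphabet[0]
    digits = []
    while n:
        n, remainder = divmod(n, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def from_base(s: str, alphabet: str) -> int:
    """Decode a string written with the symbols in `alphabet`."""
    base = len(alphabet)
    if base < 2:
        raise ValueError("alphabet must contain at least two symbols")
    if not s:
        raise ValueError("cannot decode an empty string")
    values = {symbol: index for index, symbol in enumerate(alphabet)}
    n = 0
    for symbol in s:
        if symbol not in values:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        n = n * base + values[symbol]
    return n


# The digit set int() itself uses for base 26: 0-9 followed by a-p.
BASE26 = (string.digits + string.ascii_lowercase)[:26]
```

With that alphabet, `from_base("foo", BASE26)` agrees with `int("foo", 26)`, and `to_base()` reverses it.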
Has any discussion on this been had previously (I searched around a bit), and if not would this make a decent PEP?
Let's not go as far as a PEP yet, and figure out a couple of things:
1) What's it like using existing tools?
2) How common is it to need something that's really clunky with existing tools?
The "alternate alphabet" case can be done by base converting and then replacing on the string. It's not the smoothest, so that counts as a bit of clunkiness; but it's also not all THAT common (I can recall doing it for SteamGuard 2FA codes, which are base 26 but avoid confusable digit/letter pairs, and that's about it).
When you say "other bases", do you mean beyond base 36? Do you have use-cases for anything >36 that isn't 64, 85, or 256? If so, how do you currently do this?
The CPython integer type is implemented in C for performance. If that's not a consideration, maybe this would be better done in the base64 module (which is where base 85 also lives), as a general tool for arbitrary ASCIIfication.
Can you link to your codebase where you 'often' do these kinds of conversions? Is it in a performance-critical area?
ChrisA
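For reference, the "replace on the string, then base convert" approach for decoding might look roughly like the sketch below. The `CUSTOM` alphabet and the `decode_custom` helper are hypothetical, chosen only to illustrate a 26-symbol set that avoids confusable characters; they are not claimed to match any particular service.

```python
import string

# Digits that int() understands for base 26: 0-9 followed by a-p.
STANDARD = (string.digits + string.ascii_lowercase)[:26]

# Hypothetical 26-symbol alphabet without easily confused characters.
CUSTOM = "23456789BCDFGHJKMNPQRTVWXY"

# Map each custom symbol to the standard digit in the same position.
DECODE_TABLE = str.maketrans(CUSTOM, STANDARD)


def decode_custom(s: str) -> int:
    """Translate custom symbols to standard digits, then let int() parse.

    Caveat: characters outside CUSTOM pass through translate() unchanged,
    so input should be validated separately if strictness matters.
    """
    return int(s.translate(DECODE_TABLE), 26)
```

Going the other way (integer to string) still needs a hand-written `divmod` loop, since the built-in formatting only covers bases 2, 8, 10 and 16.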