
On Mon, 2 May 2022 at 16:37, Simon de Vlieger <cmdr@supakeen.com> wrote:
On Mon, May 2, 2022, at 7:03 AM, Chris Angelico wrote:
The "alternate alphabet" case can be done by base converting and then replacing on the string. It's not the smoothest, so that counts as a bit of clunkiness; but it's also not all THAT common (I can recall doing it for SteamGuard 2FA codes, which are base 26 but avoid confusable digit/letter pairs, and that's about it).
I've mostly resorted to using str.maketrans and .replace as well.
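For illustration, a minimal sketch of the "convert, then swap the alphabet" approach described above. The helper names (to_base, encode_custom, decode_custom) are hypothetical, and CUSTOM_ALPHABET is an assumed 26-symbol alphabet chosen for the example rather than one taken from any specification. It uses str.translate with a str.maketrans table instead of chained .replace calls, because the two alphabets share characters and a one-pass translation avoids remapping a character twice.

```python
import string

# Conventional digits for bases up to 36: "0-9" followed by "a-z".
STANDARD_DIGITS = string.digits + string.ascii_lowercase

# Assumed example alphabet (26 symbols, no easily confused characters).
CUSTOM_ALPHABET = "23456789BCDFGHJKMNPQRTVWXY"
BASE = len(CUSTOM_ALPHABET)

# One-pass translation tables between the two alphabets.
TO_CUSTOM = str.maketrans(STANDARD_DIGITS[:BASE], CUSTOM_ALPHABET)
FROM_CUSTOM = str.maketrans(CUSTOM_ALPHABET, STANDARD_DIGITS[:BASE])


def to_base(n: int, base: int) -> str:
    """Render a non-negative integer in the given base (2..36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in 2..36")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, remainder = divmod(n, base)
        digits.append(STANDARD_DIGITS[remainder])
    return "".join(reversed(digits))


def encode_custom(n: int) -> str:
    """Base-convert first, then swap in the custom alphabet."""
    return to_base(n, BASE).translate(TO_CUSTOM)


def decode_custom(text: str) -> int:
    """Swap back to standard digits, then let int() do the parsing."""
    return int(text.translate(FROM_CUSTOM), BASE)
```

Decoding gets to reuse int(text, base), which is why only the encoding direction needs a hand-written loop.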
When you say "other bases", do you mean beyond base 36? Do you have use-cases for anything >36 that isn't 64, 85, or 256? If so, how do you currently do this?
Some examples I've encountered over the past year are: Base58, as used in Bitcoin [1], Base45 [2], and Base91.
As far as I can tell, these are all separate algorithms, and they don't really generalise well. Knowing all of the ones mentioned (45, 58, 64, 85, 91, 256), you still wouldn't be able to synthesize a (say) Base 73 encoding. So that suggests to me that these belong (if anywhere) in the base64 module, or perhaps in the codecs module (you can find base64 itself there as well).
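To make the "separate algorithms" point concrete, here is a rough sketch of a Bitcoin-style Base58 encoder. The function names b58encode and b58decode are hypothetical (nothing by that name exists in the standard library), and the sketch assumes the Bitcoin alphabet and its convention of writing each leading zero byte as a leading "1". Unlike Base64, which maps fixed 3-byte groups to 4 characters, this treats the whole input as one big integer, which is part of why the schemes don't fold into a single generic routine.

```python
# Bitcoin's Base58 alphabet: no "0", "O", "I", or "l".
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58, treating the input as one big-endian integer."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, remainder = divmod(n, 58)
        digits.append(B58_ALPHABET[remainder])
    # Leading zero bytes vanish in the integer, so they are re-added as "1"s.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return B58_ALPHABET[0] * pad + "".join(reversed(digits))


def b58decode(text: str) -> bytes:
    """Inverse of b58encode; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip(B58_ALPHABET[0]))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body
```

The repeated divmod on an ever-shrinking big integer makes this quadratic in the input length, which is tolerable for short identifiers but not for bulk data.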
My experience is likely skewed as I do take part in CTFs where obscurity is often part of the deal.
Not familiar with the term CTF in this context, my brain assumes Capture The Flag but maybe that's not it?
The CPython integer type is implemented in C for performance. If that's not a consideration, maybe this would be better done in the base64 module (which is where base 85 also lives), as a general tool for arbitrary ASCIIfication.
For my use cases it hasn't been especially performance critical. The base64 module might be a good place for this to live instead of the integer type.
Perhaps the base64 module is in fact a better place as converting to bytes is likely what's wanted instead of going to/from integer first.
Yeah, that's the other reason - those kinds of encodings are often used for representing long strings, not numbers.
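For reference, a small sketch of what the existing bytes-oriented tools look like today, using only functions that are already in the standard library (base64.b64encode, base64.b85encode, and the "base64" codec). The sample payload is arbitrary.

```python
import base64
import codecs

data = b"hi"  # arbitrary sample payload

# The base64 module works bytes-in, bytes-out; no integer step is involved.
b64 = base64.b64encode(data)
b85 = base64.b85encode(data)

# The same Base64 transform is reachable through the codecs machinery;
# this variant appends a trailing newline to its output.
via_codec = codecs.encode(data, "base64")

# Every encoding round-trips back to the original bytes.
assert base64.b64decode(b64) == data
assert base64.b85decode(b85) == data
assert codecs.decode(via_codec, "base64") == data
```

New encodings added alongside these would presumably follow the same bytes-to-bytes shape.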
Can you link to your codebase where you 'often' do these kinds of conversions? Is it in a performance-critical area?
I can't but it hasn't been performance critical.
Cool. Then I would be inclined to push forward with this as additional functions in the base64 module. Particularly when they have well-known use-cases (you mentioned Bitcoin for Base58, would help if you can cite others).

ChrisA