
On Mon, Sep 21, 2015, at 16:12, Serhiy Storchaka wrote:
> But why are these particular alphabets special? I expect that every application will use the alphabet that matches its needs. One needs decimal digits ('0123456789'), another needs English letters ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'), or letters, digits and underscore, or letters, digits and punctuation, or all safe ASCII characters, or all visually well-distinguished characters. Why token_hex and token_url, but not token_digits, token_letters, token_identifier, token_base32, token_base85, token_html_safe, etc?
Well, for one thing, they're trivial encodings of random bits, which is why passing in nbytes (the number of random bytes) makes sense. Someone else pointed out that this makes it easier to reason about the amount of entropy involved. token_base64 could, in principle, return a string with padding at the end according to base64 rules if you ask for a number of bytes that is not a multiple of three (base64 packs each group of three bytes into four characters). Base85 could likewise, for that matter, but base85 is a less common encoding.
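
To make that concrete, here is a rough sketch of what I mean. It is not the proposed API: the function bodies and the padding_chars helper are my own assumptions for illustration, built only on os.urandom, binascii and base64 from the standard library.

```python
import base64
import binascii
import os


def token_hex(nbytes):
    """Hypothetical sketch: nbytes random bytes as 2 * nbytes hex digits."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")


def token_base64(nbytes):
    """Hypothetical sketch: nbytes random bytes as URL-safe base64, padding kept."""
    return base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii")


def padding_chars(nbytes):
    """Count the trailing '=' characters in a token built from nbytes bytes."""
    token = token_base64(nbytes)
    return len(token) - len(token.rstrip("="))


if __name__ == "__main__":
    # The entropy is always 8 * nbytes bits; only the text length varies.
    for n in (15, 16, 17):
        print(n, len(token_hex(n)), len(token_base64(n)), padding_chars(n))
```

In both cases the caller reasons about nbytes, and the length of the resulting string just follows from the encoding. An implementation could equally well strip the '=' padding; keeping it here is only to show where it appears.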