
On Tue, Sep 22, 2015 at 10:26:13AM +0200, Jonas Wielicki wrote:
> > On 20.09.2015 02:27, Chris Angelico wrote:
> > > Also, if you ask for 4 bytes from token_hex, do you get 4 hex
> > > digits or 8 (four bytes of entropy)?
> > I think the answer there has to be 8. I interpret Tim's reference
> > to "same" as meaning that the intent of token_hex is to call
> > os.urandom(nbytes), then convert the result to a hex string. So the
> > implementation might be as simple as:
> > 
> >     def token_hex(nbytes):
> >         return binascii.hexlify(os.urandom(nbytes))
> > 
> > modulo a call to .decode('ascii') if we want it to return a string.
> > 
> > One obvious question is, how many bytes is enough? Perhaps we
> > should set a default value for nbytes, with the understanding that
> > the default value will increase in the future.
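[Spelling that suggestion out as a runnable sketch. The 32-byte default
below is an illustrative assumption on my part, not something the
proposal fixes; the thread explicitly leaves the value open and expects
it to grow over time.]

```python
import binascii
import os

# Assumed default for illustration only; the proposal deliberately
# leaves the exact value open and anticipates raising it later.
DEFAULT_ENTROPY = 32  # bytes

def token_hex(nbytes=None):
    """Return a random text string of nbytes * 2 hex digits.

    nbytes is the amount of entropy drawn from os.urandom, so asking
    for 4 bytes yields 8 hex digits -- the "answer has to be 8" case.
    """
    if nbytes is None:
        nbytes = DEFAULT_ENTROPY
    return binascii.hexlify(os.urandom(nbytes)).decode('ascii')
```

With this reading, token_hex(4) draws four bytes of entropy and returns
an eight-character string, which answers Chris's question directly.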
> > My personal preference for shed colour: token_bytes returns a
> > bytestring, its length being the number provided. All the others
> > return Unicode strings, their lengths again being the number
> > provided. So they're all text bar the one that explicitly says it's
> > in bytes.
> My personal preference would be for the number of bytes to reflect
> the entropy in the result instead. That would be safer when
> migrating from, e.g., token_url to token_alpha with the base32
> alphabet [1], for example because you want more readable tokens.
> 
> Speaking of which, a token_base32 would probably make sense, too.
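[Jonas's reading could be sketched as follows. token_alpha here is the
hypothetical function he names, not anything proposed upstream; the
point is that when nbytes means entropy, switching alphabets changes
the output *length* while the strength stays constant.]

```python
import math
from random import SystemRandom

_sysrand = SystemRandom()  # backed by os.urandom

def token_alpha(alphabet, nbytes):
    """Hypothetical: a token carrying at least nbytes of entropy,
    drawn from the given alphabet.

    Each character contributes log2(len(alphabet)) bits, so we need
    ceil(8 * nbytes / log2(len(alphabet))) characters in total.
    """
    nchars = math.ceil(8 * nbytes / math.log2(len(alphabet)))
    # SystemRandom.choice avoids the modulo bias a naive
    # "urandom byte % alphabet size" scheme would introduce.
    return ''.join(_sysrand.choice(alphabet) for _ in range(nchars))
```

Under this interpretation, 4 bytes of entropy is 8 characters over the
16-symbol hex alphabet but only 7 over a 32-symbol base32 alphabet, and
a migration between the two keeps the same strength automatically.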
Oh oh, scope creep already! And so it begins... *wink*

What you are referring to isn't the standard base32, which already
exists in the stdlib (in base64.py, together with base16). It is
referred to by its creators as z-base-32, and the reasoning they give
seems sound. It's not intended as a replacement for RFC 3548 base32,
but an alternative.

If the stdlib already included a z-base-32 implementation, I would be
happy to include token_zbase32 in the same spirit as token_base64. But
it doesn't. So first you would have to convince somebody to add
z-base-32 to the standard library.
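[For comparison: a token function over the *standard* base32 alphabet
needs nothing beyond the stdlib, which is exactly the distinction
being drawn -- only z-base-32 would require new encoding code. The
choice to strip padding below is my assumption, not part of any
proposal.]

```python
import base64
import os

def token_base32(nbytes):
    """Sketch of a token function over the standard base32 alphabet
    (A-Z, 2-7), built entirely on base64.b32encode from the stdlib."""
    encoded = base64.b32encode(os.urandom(nbytes)).decode('ascii')
    # The '=' padding carries no entropy and makes tokens awkward to
    # copy around, so drop it (an assumption of this sketch).
    return encoded.rstrip('=')
```

z-base-32, by contrast, permutes the alphabet and changes bit ordering
specifically for human-oriented output, so it genuinely is a separate
codec rather than a wrapper over the existing one.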
[1]: https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
-- Steve