Coding challenge: Optimise a custom string encoding
Alex Willmer
alex at moreati.org.uk
Mon Aug 18 17:27:16 EDT 2014
On Monday, 18 August 2014 21:16:26 UTC+1, Terry Reedy wrote:
> On 8/18/2014 3:16 PM, Alex Willmer wrote:
> > A challenge, just for fun. Can you speed up this function?
>
> You should give a specification here, with examples. You should perhaps
Sorry, the (informal) spec was further down.
> > a custom encoding to store unicode usernames in a config file that only allowed mixed case ascii, digits, underscore, dash, at-sign and plus sign. We also wanted to keep the encoded usernames somewhat human readable.
> > My design was utf-8 and a variant of %-escaping, using the plus symbol. So u'alic€123' would be encoded as b'alic+e2+82+ac123'.
Other examples:
>>> plus_encode(u'alice')
'alice'
>>> plus_encode(u'Bacon & eggs only $19.95')
'Bacon+20+26+20eggs+20only+20+2419+2e95'
>>> plus_encode(u'üｎïｃｏԁë')
'+c3+bc+ef+bd+8e+c3+af+ef+bd+83+ef+bd+8f+d4+81+c3+ab'
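For reference, the examples above are consistent with a straightforward baseline along the following lines. This is only a sketch written for Python 3, not the function from the original post: the name SAFE is made up here, and escaping the plus sign itself (so that the output can be decoded unambiguously) is an assumption.

```python
# Hypothetical baseline for Python 3; not the original function.
# Bytes that pass through unchanged: ASCII letters, digits, '_', '-', '@'.
# Assumption: '+' is the escape character, so it is escaped as well.
SAFE = frozenset(
    b"abcdefghijklmnopqrstuvwxyz"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"0123456789_-@"
)


def plus_encode(text):
    """Encode text as UTF-8, then escape every unsafe byte as +xx."""
    out = []
    for byte in text.encode("utf-8"):  # iterating bytes yields ints
        if byte in SAFE:
            out.append(chr(byte))
        else:
            out.append("+%02x" % byte)  # two lowercase hex digits
    return "".join(out)
```

A loop like this does one membership test and one append per byte, which is the sort of per-byte overhead a faster version would try to remove.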
> You should perhaps be using .maketrans and .translate.
That wouldn't work: maketrans() can only map single bytes to other single bytes. To encode 256 possible source bytes with 66 possible symbols requires a multi-symbol expansion of some or all source bytes.
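One way to get that multi-symbol expansion is a precomputed table with one output string per byte value. The sketch below targets Python 3, and all of its names are illustrative rather than taken from the original post. The second function relies on Python 3's str.translate, which (unlike a byte-to-byte maketrans() table) accepts a dict mapping a code point to a replacement string of any length. Decoding the UTF-8 bytes as Latin-1 first turns each byte into the code point with the same value.

```python
# Sketch for Python 3; names are illustrative, not from the original post.
# Assumption: '+' is escaped too, since it is the escape character.
SAFE = frozenset(
    b"abcdefghijklmnopqrstuvwxyz"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"0123456789_-@"
)

# One entry per possible byte value: safe bytes map to themselves,
# every other byte expands to three symbols (+xx).
TABLE = [chr(b) if b in SAFE else "+%02x" % b for b in range(256)]


def plus_encode_table(text):
    """Look each UTF-8 byte up in the 256-entry table."""
    return "".join([TABLE[b] for b in text.encode("utf-8")])


# str.translate leaves code points that are missing from the dict unchanged,
# so only the unsafe byte values need an entry.
TRANSLATE_MAP = {b: TABLE[b] for b in range(256) if b not in SAFE}


def plus_encode_translate(text):
    """Map bytes to code points 0-255 via Latin-1, then expand in one call."""
    return text.encode("utf-8").decode("latin-1").translate(TRANSLATE_MAP)
```

Both functions should produce the same output for any input. Which is faster likely depends on the interpreter version and on how many bytes need escaping, so it is worth measuring with timeit.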