[Python-ideas] More user-friendly version for string.translate()

Mikhail V mikhailwas at gmail.com
Mon Oct 24 13:39:16 EDT 2016


Hello all,

I would be happy to see a somewhat more general and user-friendly
version of the string.translate function.
It could work this way:
string.newtranslate(file_with_table, Drop=True, Dec=True)

So the parameters:

1. "file_with_table": a text file with a table in the following format:

#[In]    [Out]

97    {65}
98    {66}
99    {67}
100    {}
...
110    {110}


Notes:
All values are decimal or hex (use the Dec parameter to switch the
parsing format).
As it turned out in my last discussion, the majority prefers hex notation,
so my decimal notation is not mainstream here, but both
should be supported.
An empty [Out] value {} means that the character will be deleted.
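
To make the table format concrete, here is a minimal sketch of a parser for it.
The name parse_table and its dec argument are only illustrations of the
proposal, not an existing API. It assumes that lines starting with "#" and the
"..." placeholder are skipped, and it returns a dict that the existing
str.translate already accepts (None means "delete").

```python
def parse_table(text, dec=True):
    """Parse '[In] {Out}' lines into a dict usable with str.translate().

    Hypothetical helper: dec=True reads numbers as decimal, dec=False as hex.
    """
    base = 10 if dec else 16
    table = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment/header lines and the "..." placeholder.
        if not line or line.startswith("#") or line == "...":
            continue
        src, out = line.split(None, 1)
        out = out.strip()
        if not (out.startswith("{") and out.endswith("}")):
            raise ValueError("expected {value} in line: %r" % line)
        inner = out[1:-1].strip()
        # An empty {} means the character is deleted (None for str.translate).
        table[int(src, base)] = int(inner, base) if inner else None
    return table


sample = """\
#[In]    [Out]
97    {65}
98    {66}
99    {67}
100    {}
"""

mapping = parse_table(sample)
print("abcde".translate(mapping))  # ABCe
```

Here "d" (100) is deleted and "e" is left alone, because str.translate keeps
characters that are missing from the mapping.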

2. "Drop=True": sets the default behavior for values
which are NOT in the table.

For Drop=True: all values not defined in the table are set to [Out] = {},
and are deleted.

For Drop=False: all values not defined in the table are set to [Out] = [In], so
they remain as is.
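
The Drop behavior can be sketched on top of the existing str.translate.
DropMissing and newtranslate_sketch are hypothetical names used only for
illustration. The sketch relies on two documented properties of str.translate:
a value of None deletes the character, and a failed lookup leaves the
character unchanged.

```python
class DropMissing(dict):
    """Mapping that reports every ordinal not in the table as 'delete'."""

    def __missing__(self, key):
        return None  # None tells str.translate to delete the character


def newtranslate_sketch(s, table, drop=True):
    """Apply table (ordinal -> ordinal or None) with the proposed Drop flag."""
    if drop:
        # Ordinals missing from the table are deleted.
        return s.translate(DropMissing(table))
    # A plain dict raises KeyError for missing ordinals, so they are kept.
    return s.translate(dict(table))


table = {97: 65, 98: 66, 100: None}
print(newtranslate_sketch("abcde", table, drop=True))   # AB
print(newtranslate_sketch("abcde", table, drop=False))  # ABce
```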

3. "Dec=True": parsing format, decimal (True) or hex (False). I use decimal everywhere.


Further thoughts: for 8-bit strings this should be simple to implement,
I think. For 16-bit strings, of course,
there is the issue of memory usage for lookup tables, but the gurus could
probably optimise it.
E.g. at the parsing stage it is not necessary to build the lookup
table for the whole 16-bit range;
it is enough to cover values up to the largest ordinal present in the table file.
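
A rough sketch of that idea, in pure Python for readability: the lookup list
is only as long as the largest ordinal in the table requires, and anything
beyond it falls back to the Drop default. build_lookup and apply_lookup are
hypothetical names, and a real implementation would presumably be written in C.

```python
def build_lookup(table, drop=True):
    """Build a list indexed by ordinal, sized to the largest [In] value."""
    size = max(table) + 1
    # Default entries: None (delete) for Drop=True, identity for Drop=False.
    lookup = [None] * size if drop else list(range(size))
    for src, out in table.items():
        lookup[src] = out
    return lookup


def apply_lookup(s, lookup, drop=True):
    """Translate s with the list; ordinals past its end use the Drop default."""
    result = []
    for ch in s:
        code = ord(ch)
        if code < len(lookup):
            mapped = lookup[code]
        else:
            mapped = None if drop else code
        if mapped is not None:
            result.append(chr(mapped))
    return "".join(result)


table = {97: 65, 98: 66, 100: None, 110: 110}
lookup = build_lookup(table, drop=True)
print(len(lookup))                           # 111
print(apply_lookup("abcdn\u044f", lookup))   # ABn
```

With this table the list has 111 entries instead of 65536, and the Cyrillic
letter in the example is handled by the out-of-range fallback.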

About the format of the table file: I suppose many users would also want
to define characters directly. I am not sure
if it is really needed, but if so, additional brackets or an escape char
could be used, for example:

a    {A}
\98    {\66}
\99    {\67}

but as I said, I don't like the idea very much, and it would be OK for me to
use numeric values only.
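
For completeness, a sketch of how such a mixed format could be parsed. The
rules here are my assumption, not a settled design: a token starting with a
backslash is a number, any other single character stands for itself, and an
empty {} still means deletion. parse_token and parse_mixed_table are
hypothetical names.

```python
def parse_token(token, dec=True):
    """Turn one table token into an ordinal, or None for an empty value."""
    if token == "":
        return None
    if token.startswith("\\") and len(token) > 1:
        # Escaped form such as \98: a numeric ordinal.
        return int(token[1:], 10 if dec else 16)
    if len(token) == 1:
        # A single literal character stands for itself.
        return ord(token)
    raise ValueError("cannot parse token: %r" % token)


def parse_mixed_table(text, dec=True):
    """Parse lines like 'a {A}' or '\\98 {\\66}' into a str.translate dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, out = line.split(None, 1)
        out = out.strip()
        if not (out.startswith("{") and out.endswith("}")):
            raise ValueError("expected {value} in line: %r" % line)
        table[parse_token(src, dec)] = parse_token(out[1:-1].strip(), dec)
    return table


sample = r"""
a    {A}
\98    {\66}
\99    {\67}
"""

print("abc".translate(parse_mixed_table(sample)))  # ABC
```

One open question with this variant is that "#" can no longer be given as a
literal [In] character while it also marks comment lines.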

That is approximately how I see it.
Feel free to share thoughts or criticise.


Mikhail

