On Mon, Oct 24, 2016 at 07:39:16PM +0200, Mikhail V wrote:
Hello all,
I would be happy to see a somewhat more general and user friendly version of string.translate function. It could work this way: string.newtranslate(file_with_table, Drop=True, Dec=True)
That's an interesting concept for "user friendly". Apart from functions that are actually designed to read files of a particular format, can you think of any built-in functions that take a file as argument?

This is how you would use this "user friendly version of translate":

    path = '/tmp/table'  # hope no other program is using it...
    with open(path, 'w') as f:
        f.write('97 {65}\n')
        f.write('98 {66}\n')
        f.write('99 {67}\n')

    with open(path, 'r') as f:
        new_string = old_string.newtranslate(f, False, True)

Compared to the existing solution:

    new_string = old_string.translate(str.maketrans('abc', 'ABC'))

Mikhail, I appreciate that you have many ideas and want to share them, but try to think about how those ideas would work. The Python standard library is full of really well-designed programming interfaces. You can learn a lot by thinking "what existing function is this like? how does that existing function work?".

str.translate and str.maketrans already exist. Look at how maketrans builds a translation table: it can take two equal-length strings, and map each character in the first to the character at the same position in the second:

    str.maketrans('abc', 'ABC')

Or it can take a mapping (usually a dict) whose keys are either characters or ordinal numbers, and whose values are a new string (not just a single character, but an arbitrary string), an ordinal number, or None (to delete the character):

    str.maketrans({'a': 'A', 98: 66, 0x63: 0x43})

Note the flexibility: you don't need to specify ahead of time whether you are specifying the ordinal value as a decimal, hex, octal or binary value. Any expression that evaluates to a string or an int within the legal range is valid.

That's a good programming interface. Could it be better? Perhaps. I've suggested that maybe translate could automatically call maketrans if given more than one argument. Maybe there's an easier way to just delete unwanted characters. Perhaps there could be a way to say "any character not in the translation table should be dropped".
These are interesting questions.
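
For reference, here is a small sketch of the table forms that maketrans accepts today, plus one possible way to get "drop anything not in the table" with the existing API. The keep_only helper is hypothetical (it is not part of the standard library); it relies on the fact that translate deletes a character when the table lookup returns None, so a defaultdict that returns None for missing keys drops everything unlisted.

```python
from collections import defaultdict

# Form 1: two equal-length strings, mapped position by position.
table_pairs = str.maketrans('abc', 'ABC')

# Form 2: a mapping whose keys are characters or ordinals and whose
# values are strings, ordinals, or None (None deletes the character).
table_mapping = str.maketrans({'a': 'A', 98: 66, 0x63: 0x43, 'd': None})

# Form 3: a third string argument lists characters to delete.
table_delete = str.maketrans('abc', 'ABC', 'xyz')

print('abcd'.translate(table_pairs))     # ABCd
print('abcd'.translate(table_mapping))   # ABC
print('xaybzc'.translate(table_delete))  # ABC


def keep_only(mapping):
    """Hypothetical helper: build a table that drops unlisted characters.

    Missing keys resolve to None via the defaultdict factory, and
    str.translate treats None as "delete this character".
    """
    table = defaultdict(lambda: None)
    table.update(str.maketrans(mapping))
    return table


dropped = 'a-b-c-d'.translate(keep_only({'a': 'A', 'b': 'B', 'c': 'C'}))
print(dropped)  # ABC
```

One side effect of the defaultdict approach is that every character looked up gets stored in the table, so the table grows as it is used.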
Further thoughts: for 8-bit strings this should be simple to implement I think.
I doubt that these new features will be added to bytes as well as strings. For 8-bit byte strings, it is easy enough to generate your own translation and deletion tables -- there are only 256 values to consider.
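
As a rough illustration of how little work that is for bytes, the sketch below builds a translation table with bytes.maketrans and a deletion table by enumerating all 256 byte values. The variable names are only for illustration.

```python
# Translation table: a 256-byte object mapping a->A, b->B, c->C.
table = bytes.maketrans(b'abc', b'ABC')

# Deletion table: every byte value that is NOT one we want to keep.
keep = b'abc'
delete = bytes(b for b in range(256) if b not in keep)

# Bytes listed in `delete` are removed first, then the rest are mapped.
result = b'a-b-c-d'.translate(table, delete)
print(result)  # b'ABC'

# Passing None as the table deletes without translating anything.
stripped = b'a-b-c-d'.translate(None, b'-')
print(stripped)  # b'abcd'
```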
For 16-bit of course there is issue of memory usage for lookup tables, but the gurus could probably optimise it.
There are no 16-bit strings. Unicode is a 21-bit encoding, usually encoded as either a fixed-width sequence of 4-byte code units (UTF-32) or a variable-width sequence of 2-byte (UTF-16) or 1-byte (UTF-8) code units. But it absolutely is not a "16-bit string". [...]
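
To make the code unit sizes concrete, here is a short sketch that counts how many code units a few characters need in each encoding. The code_units helper is hypothetical, written just for this illustration; the little-endian codec names are used so that no byte order mark is added to the output.

```python
import sys

# The largest code point is U+10FFFF, which needs 21 bits.
print(hex(sys.maxunicode))          # 0x10ffff
print(sys.maxunicode.bit_length())  # 21


def code_units(ch):
    """Hypothetical helper: count code units per encoding for one character."""
    return {
        'utf-8': len(ch.encode('utf-8')),            # 1-byte code units
        'utf-16': len(ch.encode('utf-16-le')) // 2,  # 2-byte code units
        'utf-32': len(ch.encode('utf-32-le')) // 4,  # 4-byte code units
    }


# The last character lies outside the 16-bit range, so UTF-16 needs
# two code units (a surrogate pair) to represent it.
for ch in ('$', 'π', '╔', '\U0001F600'):
    print(f'U+{ord(ch):04X}', code_units(ch))
```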
but as said I don't like very much the idea and would be OK for me to use numeric values only.
I think you are very possibly the only Python programmer in the world who thinks that writing decimal ordinal values is more user-friendly than writing the actual character itself. I know I would much rather see $, π or ╔ than 36, 960 or 9556.

-- Steve