Custom converters in str.format() and f-strings

Currently str.format() and f-strings support three converters. A conversion is specified by the "!" character followed by a letter that denotes the converter:

    s: str()
    r: repr()
    a: ascii()

In some cases it would be useful to apply other converters. One obvious example is escaping special characters ("&", "<", ">", "'" and '"') for XML and HTML. Other applications may need other converters, for example escaping special characters (like \xff) for Telnet, escaping non-BMP characters for display in a Tk widget, escaping delimiters and special characters for the shell, or translating to another language.

I do not think it is practical to provide additional standard converters for all of the above cases. But what if we created a special registry, similar to the registries for encodings and error handlers, and allowed users to register custom converters? We could extend str.format() and f-strings to accept arbitrary letters after "!", or maybe even allow multi-character converter names.

This is less important for f-strings because you can use arbitrary expressions, but even in this case f"Hello, {name!x}!" or f"Hello, {name!xml}!" looks better than f"Hello, {html.escape(name)}!" or f"Hello, {x(name)}!".
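For illustration only, the registry idea can be approximated today with a string.Formatter subclass that overrides convert_field(). This is a rough sketch, not the proposed API: register_converter, RegistryFormatter and the "x" and "q" codes are hypothetical names invented here, and it does not change str.format() or f-strings themselves. It is also limited to one-letter codes, because the existing format-string parser reads a single character after "!".

```python
import html
import shlex
import string

# Hypothetical registry mapping a one-letter code to a converter callable.
_converters = {}


def register_converter(code, func):
    """Register func under a single-character conversion code."""
    if len(code) != 1:
        raise ValueError("converter code must be a single character")
    if code in "sra":
        raise ValueError("cannot override the built-in converters s, r and a")
    _converters[code] = func


class RegistryFormatter(string.Formatter):
    """Formatter that consults the registry before the built-in converters."""

    def convert_field(self, value, conversion):
        func = _converters.get(conversion)
        if func is not None:
            return func(value)
        # Falls back to the standard handling of None, "s", "r" and "a";
        # unknown codes still raise ValueError there.
        return super().convert_field(value, conversion)


# Example registrations (the letters are arbitrary choices for this sketch).
register_converter("x", lambda v: html.escape(str(v)))
register_converter("q", lambda v: shlex.quote(str(v)))

fmt = RegistryFormatter()
greeting = fmt.format("Hello, {name!x}!", name="<Tom & Jerry>")
print(greeting)  # Hello, &lt;Tom &amp; Jerry&gt;!
```

With this sketch, fmt.format("ls {0!q}", "a b") would shell-quote the argument, while "!r" and the other standard codes behave as before.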

On Sat, May 04, 2019 at 06:51:39PM +0300, Serhiy Storchaka wrote:
html.escape(name) is mostly self-documenting. !x is a cryptic, arbitrary code. Why !x rather than !e ("escape") or !h ("html") or !g for that matter?

Mini-languages like regexes and format codes are compact but cryptic. They're harder to read, especially when you're reading less common codes. Probably most people could guess that %y in a date format string means "year", but how many people will recognise %G and %j without looking them up?

-- 
Steven

participants (3): Eric V. Smith, Serhiy Storchaka, Steven D'Aprano