I already have a module, which I called cmd_line_fixup (though I think I like defancier better), that I include in a couple of widely used utilities at work. It is run on the command-line options before they are processed, to fix several of these issues. I wrote it because I kept getting emails asking "how do I get the utility to do X?", to which I would dash off a quick reply and get back "that doesn't work". The users of these utilities include developers (who use other languages) and electrical test technicians, and many of them are based in countries where English is not the native language, so explaining "You need to type what I sent you rather than paste it" was challenging.
Potentially such a module could be smarter than my current one. I currently just use a series of replaces to swap out everything, but user code could be using these characters legitimately inside strings or comments, e.g. the copyright symbol.
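For illustration, the "series of replaces" approach on command-line options could look roughly like the sketch below. This is not the actual cmd_line_fixup code; the table contents and the names FANCY_TO_ASCII and fixup_args are made up for the example, and mapping an em dash to a double hyphen is an assumption that suits option parsing rather than text in general.

```python
import sys

# Hypothetical table: the real cmd_line_fixup mapping is not shown in this thread.
FANCY_TO_ASCII = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash (assumed to have started life as a typed "--")
    "\u2212": "-",   # minus sign
    "\u00a0": " ",   # no-break space
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TABLE = str.maketrans(FANCY_TO_ASCII)


def fixup_args(args):
    """Return a copy of args with the fancy characters swapped for ASCII."""
    return [arg.translate(_TABLE) for arg in args]


if __name__ == "__main__":
    # Typical use: clean the options before handing them to the parser.
    print(fixup_args(sys.argv[1:]))
```

Like my current module, this swaps out everything blindly, so it would also rewrite a fancy character that the user really did mean to pass.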
-----Original Message-----
From: Andrew Barnert firstname.lastname@example.org
Sent: 10 May 2020 21:17
To: Steve Barnes GadgetSteve@live.co.uk
Cc: email@example.com
Subject: Re: [Python-ideas] Improve handling of Unicode quotes and hyphens
On May 10, 2020, at 00:11, Steve Barnes GadgetSteve@live.co.uk wrote:
What can be done?
I think there’s another option (in addition to improving SyntaxError, not instead of it):
Add a defancier module to the stdlib. It has functions that take some text and turn smart quotes into plain ASCII quotes, dashes and minuses into ASCII hyphens, etc., or just detect them and produce useful objects and/or text. And it’s a runnable module that can either lint or fix source code.
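As a rough sketch of the "detect them and produce useful objects" half of that idea, something like the following might do. No such stdlib module exists; the names SUSPECTS, find_fancy and defancy are hypothetical, and the table is deliberately tiny.

```python
import unicodedata

# Hypothetical table of suspect characters and their ASCII stand-ins.
SUSPECTS = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2212": "-",  # minus sign
}


def find_fancy(text):
    """Yield (line, column, char, Unicode name, suggestion), 1-based."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECTS:
                yield lineno, col, ch, unicodedata.name(ch), SUSPECTS[ch]


def defancy(text):
    """Return text with every suspect character replaced."""
    return text.translate(str.maketrans(SUSPECTS))


if __name__ == "__main__":
    source = "print(\u201chi\u201d)\n"
    for lineno, col, ch, name, fix in find_fancy(source):
        print(f"{lineno}:{col}: {name} ({ch!r}), suggest {fix!r}")
    print(defancy(source), end="")
```

A linter mode would only report what find_fancy yields, while a fixer mode would write back the result of defancy. Neither function tries to spare characters that sit legitimately inside strings or comments.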
Then instead of telling people who get this SyntaxError “Use a proper editor, and all the code you wrote so far has to be rewritten or fixed manually, and that’ll show you”, we can tell them “Use a proper editor in the future, but meanwhile you can fix your existing script with `python -m defancier -f script.py`”.
And a simple IDE or editor mode that doesn’t want to come up with something better could run defancier on SyntaxError or on open or whenever and show the output in a nice way and offer a single-click fix.
There’s nothing in the stdlib quite like this, but textwrap, tabnanny, 2to3, etc. are vaguely similar precedents.
And it seems like the kind of thing that will evolve on about the right scale for the stdlib—new problems to add to the list come up about once a decade, not every few months or anything.
The place I’d _really_ like this is Pythonista, which does an admirable job fighting iOS text input for me, but it’s not so helpful for fixing up pasted code. (And needless to say, I can’t just get a better editor/IDE; it’s by far the best option for the platform.)
(By the way, the reason I used -f rather than —fix is that I can’t figure out how to get the iPhone Mail.app to not replace double hyphens with an em-dash, or even how to fix it when it does. All of the other fancifier stuff can be worked around pretty easily, but apparently not that one…)