10.05.20 10:09, Steve Barnes wrote:
- Start accepting hyphens as minus & Unicode quotation marks – this would be the ideal answer for pasted code but has a lot of possible things to iron out, such as whether we require that the quotes match and are in the typographically correct order. It is also quite a big & complex change to the Python interpreter.
Two consecutive hyphens can look like a dash, and a "typographer" tool can replace them with a dash, but they have a different meaning than a single minus.
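To make the difference concrete, here is a minimal sketch. It assumes the "typographer" emits an EN DASH and that the interpreter would then map that dash back to a single minus; both steps are hypothetical stand-ins, not how any particular tool or proposal works.

```python
# Hypothetical illustration: EN DASH stands in for whatever dash the
# "typographer" tool would emit.
EN_DASH = "\u2013"

original = "5--3"                            # parsed as 5 - (-3)
typeset = original.replace("--", EN_DASH)    # what a typographer tool might do
normalised = typeset.replace(EN_DASH, "-")   # hypothetical dash -> minus mapping

original_value = eval(original)
normalised_value = eval(normalised)
print(original_value, normalised_value)      # the two values differ
```

The round trip silently turns the expression into `5-3`, so the pasted code still runs but computes something else.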
- Normalise the input to the Python interpreter (at least for these characters and possibly a few others) so that entering or reading from a file S1 = “Double Quoted” becomes S1 = "Double Quoted", etc. – this should be an easier change to the interpreter but, from a purist point of view, could be said to make us as bad as the others because we are not honouring what the user entered.
This is ambiguous. For example, in Ukraine we use pairs of quotation marks « and » or „ and “. But “ is used as an opening quotation mark in English, and » and « are used with the opposite meaning in Swedish. The single low-9 quotation mark ‚ can be confused with a comma, and the single angle quotation marks ‹ and ❮ can be confused with <.
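A small sketch of why the character alone does not tell you its role, using only the standard `unicodedata` module. The selection of marks is just the ones mentioned above.

```python
import unicodedata

# The marks discussed above: “ „ « » ‚ ‹
marks = "\u201c\u201e\u00ab\u00bb\u201a\u2039"

for ch in marks:
    # Print code point, general category and official name of each mark.
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")

# NFKC normalisation does not fold any of these marks into another character,
# so a separate, hand-written mapping would be needed.
unchanged = all(unicodedata.normalize("NFKC", ch) == ch for ch in marks)

# Unicode classifies “ as initial punctuation ("Pi"), yet in the „…“ pair
# it closes the quotation.
left_double_category = unicodedata.category("\u201c")
low9_double_category = unicodedata.category("\u201e")
```

So a rule based on the Unicode category would treat “ as an opener, which is wrong for text quoted as „…“.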
- Change the error message “SyntaxError: invalid character in identifier” to include which character and its Unicode value so that it becomes “SyntaxError: invalid character 0x201c “ in identifier” – this is almost certainly the easiest change and fits well with explicit is better than implicit but still leaves it to the user to correct the erroneous input (which could be argued is both good and bad).
Also, "in identifier" is incorrect in most cases, because the invalid character usually does not look like part of an identifier.
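For illustration, a message that names the character without claiming it is "in identifier" could be built roughly like this. The helper names `first_suspicious_char` and `describe_char` are hypothetical, and the scan is deliberately crude: it does not skip string literals or comments, and it is not how the interpreter's tokenizer works.

```python
import unicodedata

def first_suspicious_char(source):
    """Return the first non-ASCII character that cannot start an identifier.

    Crude assumption: every such character is reported, even inside
    string literals and comments.
    """
    for ch in source:
        if ord(ch) > 127 and not ch.isidentifier():
            return ch
    return None

def describe_char(ch):
    """Format a message with the character, its code point and its name."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"invalid character {ch!r} (U+{ord(ch):04X}, {name})"

line = "S1 = \u201cDouble Quoted\u201d"
bad = first_suspicious_char(line)
if bad is not None:
    print("SyntaxError:", describe_char(bad))
```

The user still has to fix the input, but the message now says exactly which character to look for.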