In order to achieve this ideal, and assuming you'd be keeping backward compatibility (!), you'd have to explain how to support both of these strings:
"Hello\n" r"Hello\n"
A possible solution is:
- At parse time, any string literal is a /raw string/, regardless of
what prefix it has or whether it has a prefix at all.
- The /raw string/ is then passed to user-land in this raw state.
If no prefix is given, it is parsed as a standard string;
otherwise the requested prefix function is applied.
- In case of a user-land raw string (e.g. r"yo"), the prefix
function can be the identity function (e.g. f(x) = x).
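A rough sketch of the steps above, in Python. The names cook, identity, PREFIXES and apply_prefix are made up for illustration, and cook handles only a handful of escapes rather than the full set the real parser knows about:

```python
import re

# Hypothetical, deliberately incomplete escape table (assumption: a real
# implementation would cover \x, \u, octal escapes and so on).
_SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"', "'": "'"}

def cook(raw):
    """Default 'prefix' for unprefixed literals: interpret backslash escapes."""
    def replace(match):
        ch = match.group(1)
        # Unknown escapes are kept verbatim, backslash included.
        return _SIMPLE_ESCAPES.get(ch, "\\" + ch)
    return re.sub(r"\\(.)", replace, raw, flags=re.DOTALL)

def identity(raw):
    """Prefix function for r"...": the raw text is already what we want."""
    return raw

# User-land registry mapping a prefix to its function (hypothetical).
PREFIXES = {"": cook, "r": identity}

def apply_prefix(prefix, raw):
    # 'raw' stands for the uninterpreted text the parser would hand over.
    return PREFIXES[prefix.lower()](raw)

cooked = apply_prefix("", r"Hello\n")    # 6 characters, ends in a newline
kept_raw = apply_prefix("r", r"Hello\n") # 7 characters, backslash then 'n'
```

The point is only that both literals start from the same raw text, and the prefix decides what happens to it afterwards.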
This may not be the ideal solution, but it is a solution.
Greetings,
Göktuğ.
Ned Batchelder
On 5/27/2013 11:51 AM, Haoyi Li wrote:
If-if-if all that works out, you would be able to completely remove the ("b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | "r" | "u" | "R" | "U") from the grammar specification! Not add more to it, remove it! Shifting the specification of all the different string prefixes into a user-land library. I'd say that's a pretty creative way of getting rid of that nasty blob of grammar =D.
In order to achieve this ideal, and assuming you'd be keeping backward compatibility (!), you'd have to explain how to support both of these strings:
"Hello\n" r"Hello\n"
Implicit in your idea is that the plain literal creates a string of some kind, but the r-prefixed string would apply some user-land function to that string. But there is no function you can apply to string literals to make them raw. The r prefix suppresses interpretation that happens in un-prefixed strings. By the time a user-land function gets hold of the string, the interpretation has already been done and information has already been lost.
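A small illustration of the lost information, using nothing beyond ordinary Python literals (the variable names are just for this example). Two different source spellings produce the same interpreted string, so no function of that string can tell which raw text it came from:

```python
# Two different spellings in the source...
a = "Hello\n"
b = "Hello\x0a"

# ...yield the same string once escapes are interpreted.
assert a == b

# The raw forms, however, are different strings.
ra = r"Hello\n"
rb = r"Hello\x0a"
assert ra != rb

# Any user-land f receiving the interpreted value sees identical input for
# both, so it cannot return ra for one and rb for the other.
```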
--Ned.
--
Göktuğ Kayaalp