Is this a bug?

Bengt Richter bokr at accessone.com
Tue May 15 00:11:02 EDT 2001


On Sun, 13 May 2001 15:39:55 -0400, "Tim Peters" <tim.one at home.com>
wrote:

>[Alex Martelli, in reply to Costas Menico]
>> ...
>> I do hope this debate resolves your personal issue that you expressed
>> as "I just don't understand why the parser can't be smart enough about
>> this".  The Python parser is "smart enough" about rawstrings to focus
>> anomalies into ONE, easily documented, easily understood and tested-
>> for issue -- zero is not achievable, and that one was craftily chosen.
>> Rawstrings are optimized for use as regular-expression literals, where
>> there is never any need for them to end with an odd number of
>> backslashes.
>
>Something often overlooked here is the tool issue:  virtually every
>programmer's editor in existence "knows" that double and single quotes
>delimit strings, and that within a string a backslash escapes the character
>immediately following.  The rule for r-strings gives such tools no trouble at
>all:  they may not have deep knowledge of what exactly the string consists
>of, but they know for certain where it begins and ends just by applying "the
>usual" string escape rules.  That was by design too, of course.
>
>Maybe we should introduce d-strings for DOS pathnames <wink>.
>
Actually, what is the purpose of recognizing the backslash as
an escape in raw strings, other than compatibility with C-style
strings that happen to include the delimiter character? Apart
from that, it seems you could treat the backslash as an ordinary
character and simply match delimiters.
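
For reference, here is a small check of how raw strings actually
behave in Python as it stands (not part of the proposal; the helper
name `compiles` is just mine for illustration):

```python
# In a raw string the backslash stays in the value, but it still
# keeps the quote that follows it from closing the literal.
escaped_quote = r'\''
print(len(escaped_quote))  # backslash and quote are both kept


def compiles(source):
    """Return True if `source` is accepted by the Python compiler."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A raw string ending in a single backslash is rejected ...
print(compiles("path = r'c:\\Windows\\'"))
# ... so the usual workaround appends the backslash separately.
print(compiles("path = r'c:\\Windows' '\\\\'"))
```

So the backslash is not really an escape in the value, only in
deciding where the literal ends.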

Being able to include the same kind of quote as the delimiter doesn't
seem that great a benefit when there is a simple strategy using the
alternative delimiter, and the simple cases become no-brainers.
You could also systematically quote as deeply as desired (see below).

	myPath = R'c:\Windows\'

would be no problem. Nor

	test = R"myPath = R'c:\Windows\'"

nor, quoting that whole literal once more (by brute force, isolating
its quote-containing segments with alternating delimiters)

	test2 = R'R"myPath = R' R"'c:\Windows\'" R'"'

(I hope I got that right ;-)
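
A minimal sketch of the rule I have in mind, to check the examples
above mechanically. It assumes a hypothetical `R` prefix, no escape
processing at all, and that adjacent literals are concatenated as
ordinary string literals are; `scan_plain_raw` is a made-up name,
not anything in Python:

```python
def scan_plain_raw(source):
    """Evaluate adjacent R'...' / R"..." literals under the hypothetical
    rule: backslash is ordinary, the next matching quote ends the literal."""
    pieces = []
    i = 0
    n = len(source)
    while i < n:
        if source[i].isspace():
            i += 1          # whitespace between adjacent literals
            continue
        if source[i] not in "Rr" or i + 1 >= n or source[i + 1] not in "'\"":
            raise ValueError("expected R' or R\" at offset %d" % i)
        quote = source[i + 1]
        end = source.find(quote, i + 2)   # first matching quote closes it
        if end == -1:
            raise ValueError("unterminated literal at offset %d" % i)
        pieces.append(source[i + 2:end])
        i = end + 1
    return "".join(pieces)


# The three literals from the text, spelled as ordinary Python strings.
my_path_src = "R'c:\\Windows\\'"
test_src = 'R"myPath = R\'c:\\Windows\\\'"'
test2_src = "R'R\"myPath = R' " + "R\"'c:\\Windows\\'\" " + "R'\"'"

print(scan_plain_raw(my_path_src))
print(scan_plain_raw(test_src))
print(scan_plain_raw(test2_src))
```

Under those assumptions the value of test2 comes out as the source
text of the literal assigned to test, which is what I was after.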

Some tools might not like it, I guess, as you point out.
Another possibility might be to allow alternate delimiters
specified just inside otherwise normal-looking quotes, e.g.,

	myPath = D'#c:\Windows\#'

This would allow quoting anything except the terminating pair (the
chosen delimiter immediately followed by the quote), e.g.,

	test = D'#This contains "'" and "#" and possibly "'#",
	and '\n' is ok, but not "#" immediately followed by "'".#'
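
A rough sketch of how such a literal could be scanned, assuming the
delimiter is the single character right after the opening quote
(`scan_d_string` and the `D` prefix are hypothetical):

```python
def scan_d_string(source):
    """Scan one D'<delim>...<delim>' literal at the start of `source`.

    Returns (body, rest): the quoted text and whatever follows the literal.
    """
    if len(source) < 3 or source[0] not in "Dd" or source[1] not in "'\"":
        raise ValueError("expected D' or D\" followed by a delimiter")
    quote = source[1]
    delim = source[2]              # assumed: exactly one character
    terminator = delim + quote     # e.g. #' closes a D'# literal
    end = source.find(terminator, 3)
    if end == -1:
        raise ValueError("no closing %r found" % terminator)
    return source[3:end], source[end + len(terminator):]


body, rest = scan_d_string("D'#c:\\Windows\\#'")
print(body)

# The one thing the body cannot contain is the terminating pair:
# a '#' written with single quotes ends the literal early.
early_body, leftover = scan_d_string("D'#a '#' b#'")
print(repr(early_body), repr(leftover))
```

The scanner never looks at backslashes, so a trailing backslash is
no different from any other character.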

That could cover a lot of possibilities. And then there's MIME-style
bulk delimiting that might be adapted ...

	test = M'--mime style delim--:<delimited stuff ...
	...
	end of delimited stuff>:--mime style delim--'

as an example, using (') and (:) to delimit the delimiter ;-)
I suppose if you wanted a pretty bulletproof delimiter you could
use the md5 digest of the quoted part as a MIME-style delimiter.
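
A sketch of the digest idea, using the layout of the example above
with the hex md5 of the payload as the delimiter text. The `wrap`
and `unwrap` names, the exact framing, and the UTF-8 encoding are
my assumptions, nothing standard:

```python
import hashlib


def wrap(payload):
    """Quote `payload` using its own md5 hex digest as the delimiter."""
    digest = hashlib.md5(payload.encode("utf-8")).hexdigest()
    return "M'--%s--:%s:--%s--'" % (digest, payload, digest)


def unwrap(literal):
    """Recover the payload from a literal produced by wrap()."""
    prefix = "M'--"
    if not literal.startswith(prefix):
        raise ValueError("not an M'-- literal")
    head_end = literal.find("--:", len(prefix))
    if head_end == -1:
        raise ValueError("opening delimiter is not terminated")
    digest = literal[len(prefix):head_end]
    closing = ":--%s--'" % digest
    body_start = head_end + len("--:")
    body_end = literal.find(closing, body_start)
    if body_end == -1:
        raise ValueError("closing delimiter not found")
    payload = literal[body_start:body_end]
    # The digest doubles as a check that the right closing was matched.
    if hashlib.md5(payload.encode("utf-8")).hexdigest() != digest:
        raise ValueError("digest does not match payload")
    return payload


text = "c:\\Windows\\ with 'both' \"quotes\" and :-- bits"
nested = wrap(wrap(text))          # quoting a quoted literal
print(unwrap(unwrap(nested)) == text)
```

Since the inner and outer payloads differ, their digests differ as
well (barring an md5 collision), so the inner closing delimiter is
not mistaken for the outer one and nesting works to any depth.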

(XML should do something like that to fix the inability of CDATA
sections to be used for nested quoting, IMHO)



