RFC PEP candidate: q'<delim>'quoted<delim> ?

Bengt Richter bokr at oz.net
Thu Mar 7 05:14:13 EST 2002


On Tue, 05 Mar 2002 12:00:56 -0800, Jeff Shannon <jeff at ccvcorp.com> wrote:

>
>
>Bengt Richter wrote:
>
>> >Making Python as gibberish as Perl is. And all that only to
>> >have Windows path be written without double-\
>> Not 'only'. I said 'also' ;-)  Perhaps my choice of '|' delimiter triggered
>> your 'gibberish as Perl' detector?
>>
>> I could have written
>>
>>     q'###'c:\foo\bar\###
>> or
>>     q'[quoting delimiter]'c:\foo\bar\[quoting delimiter]
>>
>> just as well for this one.
>
>I still don't like it.  It's very difficult for me to see at a glance, what's part of
>the string and what is part of the delimiter, especially with the leading delimiter
>being quoted and the trailing one not quoted.  It looks unbalanced, it looks ungainly,
>and, to me, it just plain looks ugly.  I'd expect the above example to be equivalent to
>"'c:\\foo\\bar\\", with a leading mismatched single-quote... Your earlier example of

I understand the reaction, and I had considered defining the delimiter with the quotes
included. Would you prefer the following?

    q'###'c:\foo\bar\'###'
or
    q'__quoting delimiter (incl quotes)__'c:\foo\bar\'__quoting delimiter (incl quotes)__'

But then I had to wonder whether using alternative quotes should imply the identical
usage at both ends w.r.t. the quote marks. I.e.,
    q'###'c:\foo\bar\'###'
    q"###"c:\foo\bar\"###"
    q'''###'''c:\foo\bar\'''###'''
    q"""###"""c:\foo\bar\"""###"""

All that is very cluttered and ugly. The major use of q' would probably actually be
in the Q' variation, and you could do the above pretty cleanly as:

    s = Q'
###'
c:\foo\bar\
###

Note that the delimiter is '\n###\n' in the above, so there is no \n in the
quoted string. I think this would be an easy pattern to use. To quote large unknown
things, you just choose something safe in place of ###, and if you don't want to
clip off the last \n, use Q'###' with no \n in front of the ###.
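
A minimal sketch of the q' and Q' rules as an ordinary Python function, for trying
the idea out in an existing interpreter. The name parse_quoted and its
(content, rest) return value are made up for illustration, and only a single
' or " around the delimiter definition is handled (no triple quotes).

```python
def parse_quoted(src):
    """Emulate the proposed q'<delim>'...<delim> and Q'<delim>'... forms.

    Hypothetical helper: `src` is source text starting at the q or Q prefix.
    Returns (content, rest_of_source).
    """
    prefix, quote = src[0], src[1]
    if prefix not in "qQ" or quote not in "'\"":
        raise ValueError("expected q or Q followed by a quote character")
    close = src.index(quote, 2)      # closing quote of the delimiter definition
    delim = src[2:close]
    start = close + 1
    if prefix == "Q":
        # Q variant: the character right after the closing quote is used up
        # as the last character of the delimiter.
        delim += src[start]
        start += 1
    if not delim:
        raise ValueError("empty delimiter")
    end = src.index(delim, start)    # first occurrence ends the content
    return src[start:end], src[end + len(delim):]


# q form: the content runs from just after the closing quote to the next ###
content, rest = parse_quoted(r"q'###'c:\foo\bar\### + tail")
assert content == "c:\\foo\\bar\\" and rest == " + tail"

# Q form: the delimiter is '\n###\n', so no \n ends up in the content
src = "Q'\n###'\n" + "c:\\foo\\bar\\" + "\n###\n" + "next_statement"
content, rest = parse_quoted(src)
assert content == "c:\\foo\\bar\\" and rest == "next_statement"
```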

Note that you could use Q'"""' in place of the leading """ in existing code, to
allow you to put the first line of quoted text on the next line, without getting
a leading \n. I.e.,

    s = Q'"""'
First line.
...
Last line.
"""
# this comment immediately follows the quoting delimiter

is equivalent (assuming """ quotes ok) to

    s = r"""First line.
...
Last line.
"""# this comment immediately follows the quoting delimiter

(i.e., you have to account for the delimiter actually being '"""\n' -- cf. M' below)
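
The equivalence can be checked by hand in current Python. This is only a sketch:
the variable names are invented here, and the Q rule (the character after the
closing quote becomes the last character of the delimiter) is applied manually.

```python
# Text that follows  Q'"""'  in the source, starting with the newline that
# the Q rule uses up as the last character of the delimiter.
after_prefix = '\nFirst line.\n...\nLast line.\n"""\n# this comment follows'

delimiter = '"""' + after_prefix[0]          # '"""\n'
content, found, rest = after_prefix[1:].partition(delimiter)

# The same value written as an ordinary raw triple-quoted string.
expected = r"""First line.
...
Last line.
"""

assert found == delimiter
assert content == expected
assert rest.startswith("# this comment")     # comment directly follows the delimiter
```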


>cutting & pasting left me totally confused until  I spent a minute sorting through it.
>
A minute seems not too bad ;-) I.e., you wouldn't have to re-think it to use it as
a pattern for arbitrary quoting, I don't think. I only used the example text because
it was a heavy mix of quoting that you could not quote with triple or double quotes.
I thought q' and Q' to be pretty straightforward, once the syntax is grasped.

Can you think of a better way to quote an arbitrary sequence of characters within
a program text?

>At least to me, this seems totally unclear and totally nonintuitive.  It vastly

How does it seem if you go along the steps I took?:

 1. You have an arbitrary sequence of characters that are to be the value of a string.
 2. The sequence may contain both ''' and """ and may even end with \ and it must be unchanged.
 3. (2) means you need a different delimiter than " or ' or """ or '''.
 4. Using a string as a delimiter (like MIME or <<XYZ in Perl & shells etc) seems viable, whereas
    no fixed delimiters can nest without counting and symmetry rules (which contradicts the
    definition of unrestricted text), so yet another thing like XML's <![CDATA[ ... ]]> won't do.
 5. Python has a way to define a string, but not a way to indicate that it should
    be used as a delimiter.
 6. (5) suggests using a Python string to define the delimiter string
 7. (5) suggests that the string-as-delimiter needs to be distinguished from others
 8. raw and unicode strings use a quote prefix to distinguish themselves from others
 9. (5)+(8) suggests using an alternate prefix to define string-as-delimiter: I chose q and Q for quote
10. The actual content string must start somewhere after the delimiter string is defined
11. The obvious place for (10) is the next character after the final quote of the _delimiter_ string.
12. Using an otherwise ordinary Python raw string as a delimiter means the quotes are not included
13. (12) means the postfixed delimiter does not have quotes around it, unless you alter the delimiter
    string definition rules to include them.
14. There is an ugliness in using triple quotes to quote multiple lines of text with no leading
    empty line, since = """the text of the first line
doesn't line up with the text of the following lines.
"""
15. I thought of Q' to allow lining up all quoted text lines in a block by using the first character
    following Q'xxx' as the last character of the delimiter (thus using it up and allowing the real
    quoted text to start on the next line). Alternatively, we could replace the second ' after Q
    and delimit the delimiter-string with ' on the front and \n at the end, including neither.
    Perhaps that would be cleaner for ordinary multiline quotes (and change the prefix to M)? E.g.,

    print M'XXX
First line.
Second line.
XXX
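
Steps 1-4 rely on being able to pick a delimiter string that is safe for the text
at hand. A possible sketch of that choice follows; safe_delimiter and its
numbering scheme are assumptions made for illustration, not part of the proposal.
Note that it is not enough for the delimiter to be absent from the text: the end
of the text must not run together with the start of the delimiter either.

```python
import itertools


def safe_delimiter(text, base="###"):
    """Return a delimiter whose first occurrence in text + delimiter
    is exactly at the end of `text` (hypothetical helper)."""
    for n in itertools.count():
        candidate = base if n == 0 else f"{base}{n}"
        # find() returns the FIRST match, so this rejects candidates that
        # occur inside the text or straddle the text/delimiter boundary.
        if (text + candidate).find(candidate) == len(text):
            return candidate


# Contains both ''' and """ and ends with a backslash (steps 1 and 2).
text = 'mix of \'\'\' and """ ending with \\'
delim = safe_delimiter(text)
quoted = text + delim

# Scanning for the first occurrence of the delimiter recovers the text unchanged.
assert quoted[:quoted.find(delim)] == text
```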


This would let you do the ugly Windows path string even more cleanly:
    s =  M'
###
c:\foo\bar\
###

It's getting cleaner looking, don't you think?
(The delimiter is '\n###' above, from source "M'\n###\n<content>\n###" ).
(delim-string delimiters)-->                   ^     ^^
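
A rough emulation of the M' variant in current Python. It rests on one assumption
beyond the text above: the delimiter must have at least one character, so a
newline directly after M' is taken as part of the delimiter (as in the '\n###'
case) and the next newline ends it. The name parse_m_literal is hypothetical.

```python
def parse_m_literal(src):
    """Emulate the proposed M'<delim>\\n<content><delim> form.

    Hypothetical helper: `src` is source text starting at M'.
    Returns (content, rest_of_source).
    """
    if not src.startswith("M'"):
        raise ValueError("expected M followed by a single quote")
    # Assumption: search from index 3 so the delimiter is never empty.
    nl = src.index("\n", 3)
    delim = src[2:nl]                # neither the ' nor the closing \n is included
    start = nl + 1
    end = src.index(delim, start)    # first occurrence ends the content
    return src[start:end], src[end + len(delim):]


# print M'XXX example: the delimiter is 'XXX'
content, rest = parse_m_literal("M'XXX\nFirst line.\nSecond line.\nXXX\n")
assert content == "First line.\nSecond line.\n"

# Windows path example: the delimiter is '\n###'
content, rest = parse_m_literal("M'\n###\n" + "c:\\foo\\bar\\" + "\n###")
assert content == "c:\\foo\\bar\\"
```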

>multiplies the possibilities for writing hard-to-read code, while providing a real
I don't buy this, unless someone is being perverse in using it, or hasn't thought
of really clean examples yet ;-).

The point is to make the few places where it IS needed simple and clean. The point
of posting it for discussion is to tease out better alternatives and/or usage patterns
that really address the problems without declaring them non-problems.
C'mon, how is the print M'XXX example above hard to digest? ;-)

    M'"""
This could be a doc-string with
no leading \n and looking as a
block the way it would print.
"""

>benefit in relatively few situations.  Considering how often triple-quotes appear in
>text files, I can only imagine this being needed when trying to programmatically
>generate Python code, which doesn't seem to be a terribly common task.  I'd prefer to
>find an alternate solution for the specific case of that task, which doesn't require
>changing the core language and creating so much potential for ugly code in every other
I don't see where an added capability, which does not alter interpretation of existing
code, has "potential for ugly code" -- unless someone is abusing the capability.

>task that can be done with Python.
>
How would you handle the problem as stated (cf. 1-15 above)?

I appreciate the comments. I think they have led me to better examples, and possible
variations on the original idea. If someone has better variations or alternative
better solutions, I'd like to hear them. Thank you.

Regards,
Bengt Richter



