[Python-Dev] Why both r'' and R'', u'' and U''?

Ka-Ping Yee ping@lfw.org
Sun, 14 Jan 2001 04:38:42 -0800 (PST)


Sorry, I'm being forgetful -- could someone please refresh my memory:

Was there a good reason for allowing both lowercase and capital 'r'
as a prefix for raw-strings?  I assume that the availability of both
r'' and R'' is what led to having both u'' and U''.  Is there any
good reason for that either?

This just seems to lead to ambiguity and unneeded complexity:
more cases in tokenize.py, more cases in tokenize.c, more work
for IDLE, and more annoyance when searching for u' in your editor.
(I was about to fix the lack of u'' support in tokenize.py and
that made me think about this.)
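
To make the tokenizer cost concrete, here is a rough sketch of matching
the start of a string literal. This is NOT the pattern tokenize.py
actually uses; the names QUOTES, opener_pattern, any_case and lower_only
are made up for illustration, and the lowercase-only prefix is just a
hypothetical alternative for comparison.

```python
import re

# Triple quotes are listed first so they win over single quotes.
QUOTES = ["'''", '"""', "'", '"']

# Illustrative prefix patterns (assumptions, not taken from tokenize.py).
PREFIX_ANY_CASE = r"[uU]?[rR]?"      # accepts u, U, r, R, ur, uR, Ur, UR
PREFIX_LOWER_ONLY = r"(?:ur|u|r)?"   # accepts only u, r, ur

def opener_pattern(prefix_regex):
    """Compile a regex matching an optional prefix followed by a quote."""
    quote_alt = "|".join(re.escape(q) for q in QUOTES)
    return re.compile(prefix_regex + "(?:" + quote_alt + ")")

any_case = opener_pattern(PREFIX_ANY_CASE)
lower_only = opener_pattern(PREFIX_LOWER_ONLY)

for text in ["UR'''abc", 'ur"abc', "r'abc"]:
    print(text, bool(any_case.match(text)), bool(lower_only.match(text)))
```

Every tool that recognizes string literals needs some equivalent of the
any-case pattern, and a plain text search for u' still misses U'.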

What happened to TOOWTDI?

Would you believe we now have 36 different ways of starting a string [1]:

    '      "      '''    """
    r'     r"     r'''   r"""
    u'     u"     u'''   u"""
    ur'    ur"    ur'''  ur"""
    R'     R"     R'''   R"""
    U'     U"     U'''   U"""
    uR'    uR"    uR'''  uR"""
    Ur'    Ur"    Ur'''  Ur"""
    UR'    UR"    UR'''  UR"""
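
For anyone who wants to check the count, the table above can be
regenerated with a short sketch like the following. It only builds the
openers as text and does not evaluate them, since not every prefix is
accepted by every Python version. The names lowercase_prefixes,
case_variants and openers are made up for illustration.

```python
from itertools import product

# Prefixes from the table above; "" stands for an unprefixed literal.
lowercase_prefixes = ["", "r", "u", "ur"]
quotes = ["'", '"', "'''", '"""']

def case_variants(prefix):
    """Return every upper/lower-case spelling of prefix."""
    choices = [sorted({ch.lower(), ch.upper()}) for ch in prefix]
    return sorted("".join(combo) for combo in product(*choices))

# Collect all spellings: 1 (empty) + 2 (r) + 2 (u) + 4 (ur).
all_prefixes = sorted(
    {variant for p in lowercase_prefixes for variant in case_variants(p)}
)

# Pair every prefix with every quote style.
openers = [p + q for p in all_prefixes for q in quotes]

print(len(all_prefixes), "prefixes,", len(openers), "openers")
```

That gives 9 prefixes and 36 openers; restricting all_prefixes to the
lowercase spellings leaves 4 prefixes and 16 openers.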

Would it be outrageous to suggest deprecating the last five rows?


-- ?!ng

[1] We started with 4.  Perl has (by my count) 381 ways of starting
    a string literal, so we're halfway there, logarithmically speaking.
    Perl has 757 if you count the fancier operators qx, qw, s, and tr.