String handling bug in Python

Tim Peters tim.one at comcast.net
Fri Apr 26 16:31:49 EDT 2002


[Stephen Ferg]
> ...
> I would describe the situation this way... Python's "raw" strings
> aren't truly raw at all.  Python does not have any capability for
> TRULY raw strings -- that is, for string literals that are not escaped
> in any way.

That's already true of raw strings today -- there are no "escape sequences"
in Python's raw strings.

> Escaping is still performed even with "raw" strings.

This isn't so.  r'\\' == "\\\\" and r'\'' == "\\'" and r"\"" == "\\\"" and
so on.  What you see is exactly what you get, no exceptions.  In return,
some strings can't be spelled *at all* using Python's raw-string notation,
and you appear to have the same problem with your scheme (read on).
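
The equalities above can be checked directly at an interactive prompt. This
is only a minimal sketch; the names `checks` and `lengths` are made up for
illustration and are not part of the original message.

```python
# Each pair is (raw literal, equivalent ordinary literal).
# In the raw form every backslash is kept as an ordinary character.
checks = [
    (r'\\', "\\\\"),   # two backslashes
    (r'\'', "\\'"),    # backslash followed by a single quote
    (r"\"", "\\\""),   # backslash followed by a double quote
]

for raw, cooked in checks:
    assert raw == cooked
    print(repr(raw), len(raw))

# Every one of these strings is exactly two characters long.
lengths = [len(raw) for raw, _ in checks]
```

Note that the backslash before the quote stays in the string: it keeps the
quote from ending the literal, but it is not removed from the result.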

> So the term "raw" is misleading.
>
> I would like to see Python support TRULY raw string literals -- string
> literals that are not escaped in any way.  Perhaps they could be added
> with a prefix of z (for zero escaping?), so that
>
>      z"\" == "\\"
>      z"\'" == "\\\'"

The first one can't be spelled at all with a raw string today.  The second
one can be, and indeed the same way you spelled it, as r"\'".  What do you
do in your scheme if you want a z-string to contain a single double-quote?
Don't brush that off, because it's the heart of the problem:  you need
*some* way to detect where the "TRULY raw string literal" ends.  Whatever
way you pick to spell that then becomes "a problem" if you want to include
the very same spelling *in* the raw string literal.
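
The terminator problem can be seen by asking Python to compile the two
spellings under discussion. This is a rough sketch: the helper `compiles`
and the variable names are hypothetical, and the source text is built with
chr(92) (a backslash) only to keep the example itself readable.

```python
BS = chr(92)  # a single backslash character

def compiles(source):
    """Return True if `source` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Source text r"\" : the closing quote is taken as part of the string,
# so the literal never ends and the compiler rejects it.
lone_backslash_src = 'r"' + BS + '"'

# Source text r"\'" : fine, and the value is backslash + single quote.
backslash_quote_src = 'r"' + BS + "'" + '"'

can_spell_lone_backslash = compiles(lone_backslash_src)
can_spell_backslash_quote = compiles(backslash_quote_src)

# A lone backslash has to be written as an ordinary (non-raw) literal.
lone_backslash = "\\"
```

Any other choice of terminator moves the problem rather than removing it:
whatever character sequence ends the literal is the one sequence that cannot
appear unmodified inside it.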

fortran-hollerith-strings-didn't-have-this-problem-but-people-got-
    tired-of-counting-how-many-characters-they-needed-ly y'rs  - tim





