Embedding a literal "\u" in a unicode raw string.

rmano romano.giannetti at gmail.com
Mon Feb 25 17:45:54 EST 2008


On Feb 25, 11:27 pm, "Martin v. Löwis" <mar... at v.loewis.de> wrote:
> > Raw
> > should be raw...
>
> Right. IMO, this is just a plain design mistake in the Python Unicode
> handling. Unfortunately, there was discussion about this specific issue
> in the past, and the proponent of the status quo always defended it,
> with the rationale (IIUC) that a) without that, you can't put arbitrary
> Unicode characters into a string, and b) the semantics of \u in Java and
> C is so that \u gets processed even before tokenization even starts, and
> it should be the same in Python.

Well, I do not know Java, but C, AFAIK, has no raw strings, so you have
to use double backslashes there anyway. Raw strings are a handy shorthand
when you can generate the characters with your keyboard, and this
asymmetry rather defeats the purpose.
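
To make the asymmetry concrete, here is a minimal sketch. It runs on Python 3, where raw strings do not process \u, so it only emulates the Python 2 behaviour: I am assuming the "raw_unicode_escape" codec is a fair stand-in for how a ur"..." literal body gets decoded, and decode_like_py2_raw_unicode is a made-up helper name, not anything from the standard library.

```python
# Sketch only: runs on Python 3, where raw strings no longer process \u.
# The "raw_unicode_escape" codec is used as a stand-in for how a Python 2
# ur"..." literal body is decoded (an assumption made for illustration).

def decode_like_py2_raw_unicode(source_text):
    """Hypothetical helper: read text the way a ur"..." body would be read."""
    return source_text.encode("ascii").decode("raw_unicode_escape")

# Ordinary backslash sequences stay literal, as a raw string promises.
newline_like = decode_like_py2_raw_unicode("\\n")    # backslash followed by 'n'

# A \uXXXX sequence is still collapsed into a single character.
euro_like = decode_like_py2_raw_unicode("\\u20ac")   # the euro sign

print(len(newline_like), len(euro_like))
```

So "\n" survives as two characters while "\u20ac" does not, which is exactly the "raw should be raw" complaint.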

Is this already decided, or is it possible to lobby for a change? :-)

Thanks,
       Romano

BTW, I think 2to3.py should warn when it finds a raw string (not unicode)
with \u in it. I tried it and it seems to ignore the problem...
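
Something along these lines is what I have in mind for the warning. This is only a rough sketch using the standard tokenize module; it is not how 2to3 is implemented, and find_raw_strings_with_backslash_u is a hypothetical name of my own.

```python
import io
import tokenize

def find_raw_strings_with_backslash_u(source):
    """Hypothetical check: list raw, non-unicode literals containing \\u or \\U."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        literal = tok.string
        # The prefix is whatever precedes the first quote character.
        quote_pos = min(i for i in (literal.find('"'), literal.find("'")) if i != -1)
        prefix = literal[:quote_pos].lower()
        body = literal[quote_pos:]
        # Flag raw literals without a 'u' prefix whose body contains \u or \U.
        if "r" in prefix and "u" not in prefix and "\\u" in body.lower():
            hits.append((tok.start[0], literal))
    return hits

# Three sample lines: a raw string with \u, a raw regex, a normal string.
sample = 'a = r"\\u20ac"\nb = r"\\d+"\nc = "\\u20ac"\n'
for line_no, literal in find_raw_strings_with_backslash_u(sample):
    print("line %d: raw string with \\u: %s" % (line_no, literal))
```

Only the first sample line should be reported; the raw regex and the ordinary string are left alone.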


