[Tutor] Puzzled again

Wed Aug 3 22:30:13 CEST 2011

On Wed, Aug 3, 2011 at 12:10, Peter Otten <__peter__ at web.de> wrote:
> Richard D. Moores wrote:
>
>> On Wed, Aug 3, 2011 at 10:11, Peter Otten <__peter__ at web.de> wrote:
>>> Richard D. Moores wrote:
>>>
>>>> I wrote before that I had pasted the function (convertPath()) from my
>>>> initial post into mycalc.py because I had accidentally deleted it from
>>>> mycalc.py. And that there was no problem importing it from mycalc.
>>>> Well, I was mistaken (for a reason too tedious to go into). There WAS
>>>> a problem, the same one as before.
>>>
>>> Dave was close, but Steven hit the nail on the head: the string
>>> r"C:\Users\Dick\..." is fine on its own, but inside the docstring it is
>>> not a raw string within another string; it is just a sequence of
>>> characters that is part of the outer string. As such, \U marks the
>>> beginning of a special way to define a Unicode code point:
>>>
>>>>>> "\U00000041"
>>> 'A'
>>>
>>> As "sers\Dic", the eight characters following the \U in your docstring,
>>> are not a valid hexadecimal number, you get an error message.
>>>
>>> The solution is standard procedure: escape the backslash or use a
>>> raw string:
>>>
>>> Wrong:
>>>
>>>>>> """yadda r"C:\Users\Dick\..." yadda"""
>>> File "<stdin>", line 1
>>> SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
>>> position 10-12: truncated \UXXXXXXXX escape
>>>
>>> Correct:
>>>
>>>>>> """yadda r"C:\\Users\\Dick\\..." yadda"""
>>> 'yadda r"C:\\Users\\Dick\\..." yadda'
>>>
>>> Also correct:
>>>
>>>>>> r"""yadda r"C:\Users\Dick\..." yadda"""
>>> 'yadda r"C:\\Users\\Dick\\..." yadda'
>>
>> Here's from my last post:
>>
>> ====================================
>> Now I edit it back to its original problem form:
>>
>> def convertPath(path):
>>    """
>>    Given a path with backslashes, return that path with forward slashes.
>>
>>    By Steven D'Aprano  07/31/2011 on Tutor list
>>    >>> path = r'C:\Users\Dick\Desktop\Documents\Notes\College Notes.rtf'
>>    >>> convertPath(path)
>>    'C:/Users/Dick/Desktop/Documents/Notes/College Notes.rtf'
>>    """
>>    import os.path
>>    separator = os.path.sep
>>    if separator != '/':
>>        path = path.replace(os.path.sep, '/')
>>    return path
>>
>> and get
>>
>> C:\Windows\System32>python
>> Python 3.2.1 (default, Jul 10 2011, 20:02:51) [MSC v.1500 64 bit
>> (AMD64)] on win32
>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> import mycalc2
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in <module>
>>  File "C:\Python32\lib\site-packages\mycalc2.py", line 10
>>    """
>> SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
>> in position 144-146: truncated \UXXXXXXXX escape
>>
>> Using HxD, I find that the bytes at offsets 144-146 are 20, 54, 75
>> (hex), or the <space>, 'T', 'u' of " Tutor". A screenshot of HxD with
>> this version of mycalc2.py open in it is at
>> <http://www.rcblue.com/images/HxD.jpg>. You can see that I believe the
>> offset integers are base-10 ints. I do hope that's correct, or I've
>> done a lot of work for naught.
>> ====================================
>>
>> So have I not used HxD correctly (my first time using a hex editor)?
>> If I have used it correctly, why do the reported problem offsets of
>> 144-146 correspond to such innocuous things as <space>, 'T' and 'u',
>> which come BEFORE the problems you and Steven point out?
>
> Come on, put that r before the docstring

Yes! That works! Now I've got it. As I told Steven, I didn't understand
that suggestion before, partly because I had the """ on a separate line,
and partly because I was confused by the "yadda"s.

> def convertPath(path):
>    r"""
>    Given a path with backslashes, return that path with forward slashes.
>    ...
>
> and see the problem go away.

It did!
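
For the archive, here is a minimal sketch of the difference the r prefix
makes. It is not the real mycalc2.py: the function body is simplified to a
plain backslash replacement so it behaves the same on any platform, and the
module source is held in a string (bad_source, good_source and compiles are
names made up for this illustration) so that the broken version is only
compiled on demand.

```python
# Module source with a NON-raw docstring. The doubled backslashes here are
# only needed because the source itself is written as a Python string; the
# text handed to compile() contains single backslashes, as in a real file.
bad_source = (
    'def convertPath(path):\n'
    '    """\n'
    "    >>> convertPath(r'C:\\Users\\Dick')\n"
    "    'C:/Users/Dick'\n"
    '    """\n'
    "    return path.replace('\\\\', '/')\n"
)

# The only change: an r in front of the opening triple quote.
good_source = bad_source.replace('"""', 'r"""', 1)


def compiles(source):
    """Return True if source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(bad_source))   # the \U in the docstring is rejected
print(compiles(good_source))  # the raw docstring is accepted

# Run the good version and call the function it defines.
namespace = {}
exec(compile(good_source, "<sketch>", "exec"), namespace)
converted = namespace["convertPath"](r"C:\Users\Dick")
print(converted)
```

The r inside the docstring (on the path) does nothing for the compiler; only
the prefix on the docstring itself matters.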

> The numbers reported by Python are byte offsets
> within the string literal:

Ah, that explains the mystery. Thanks, Peter.
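
To convince myself, here is a rough sketch of the two ways of counting,
using Peter's shortened "yadda" literal rather than my real file. The
surrounding two-line module and the variable names are invented for the
example. The end of the range Python reports seems to vary between versions,
so the sketch only works out where the \U starts and prints the message for
comparison.

```python
# A tiny module held in a string; the text passed to compile() contains
# single backslashes, just as a file on disk would.
source = 'x = 1\ndoc = """yadda r"C:\\Users\\Dick\\..." yadda"""\n'

# Where the literal's contents begin: just past the opening triple quote.
literal_start = source.index('"""') + 3

# Offset of the offending backslash counted from the start of the file,
# which is what a hex editor shows.
escape_in_file = source.index("\\U")

# The same backslash counted from the start of the literal's contents,
# which is the kind of offset the error message refers to.
offset_in_literal = escape_in_file - literal_start

message = None
try:
    compile(source, "<sketch>", "exec")
except SyntaxError as exc:
    message = exc.msg

print("offset in file:   ", escape_in_file)
print("offset in literal:", offset_in_literal)
print("Python says:      ", message)
```

So the offsets I looked up in HxD were counted from the wrong starting
point: they should have been counted from the opening quotes of the
docstring, not from the top of the file.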

Dick