[Tutor] Puzzled again

Steven D'Aprano steve at pearwood.info
Wed Aug 3 21:16:59 CEST 2011


Richard D. Moores wrote:
> I wrote before that I had pasted the function (convertPath()) from my
> initial post into mycalc.py because I had accidentally deleted it from
> mycalc.py. And that there was no problem importing it from mycalc.
> Well, I was mistaken (for a reason too tedious to go into). There WAS
> a problem, the same one as before.

Richard, too much information! Please try to be more concise in your 
posts. Try to find the *simplest* code that demonstrates the problem.

E.g. instead of a 14 line function (including docstring) you should be 
able to demonstrate the problem with a TWO liner:

def func():
     """ text r'aaa\Userswxyzabcd' """

In Python 3, you will get a SyntaxError complaining about a truncated 
\UXXXXXXXX escape. This comes back to what I said earlier: r' inside 
another string does not start a raw string. (In much the same way that [ 
inside a string does not create a list.) \U introduces a Unicode 
escape, which must be followed by exactly eight hex digits. Since 
docstrings are Unicode in Python 3, and serswxyz are not valid hex 
digits, you get an error.
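
If you want to see that error without creating a file, here is a minimal 
sketch for Python 3. It hands the two-liner to the built-in compile() as 
text. The names source and error_message are only for illustration.

```python
# The outer r''' ... ''' keeps the backslash literal in *this* script,
# so compile() receives exactly the two-liner shown above.
source = r'''
def func():
    """ text r'aaa\Userswxyzabcd' """
'''

try:
    compile(source, "<example>", "exec")
    error_message = None
except SyntaxError as err:
    # The inner docstring is not raw, so \U is read as an escape.
    error_message = str(err)

print(error_message)
```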

To fix, you need to either make the whole docstring a raw string:

     r""" text r'aaa\Userswxyzabcd' """

or you need to escape the backslashes:

     """ text r'aaa\\Userswxyzabcd' """


You don't even need to use import to work with this. You can just copy 
and paste the string into an interactive interpreter:

# Python 2:
 >>> s = """ text r'aaa\Userswxyzabcd' """
 >>> print s
  text r'aaa\Userswxyzabcd'


# Python 3:
 >>> s = """ text r'aaa\Userswxyzabcd' """
   File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 11-13: truncated \UXXXXXXXX escape

Remember: function docstrings are just ordinary strings. All the rules 
for strings apply equally to docstrings.
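
To convince yourself of that, here is a small sketch (Python 3, assuming 
the interpreter is not run with -OO, which strips docstrings). It 
compares a function's docstring with the same literal assigned to a 
plain variable:

```python
def func():
    r""" text r'aaa\Userswxyzabcd' """

# The same literal, assigned to an ordinary name instead.
s = r""" text r'aaa\Userswxyzabcd' """

# The docstring is a plain str object and compares equal to s.
print(type(func.__doc__))
print(func.__doc__ == s)
```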


You also report (in tedious detail!) apparent discrepancies between the 
offsets reported by Python when importing the file, and the actual 
content of the file at those offsets. I'm not even going to attempt to 
diagnose that. Without access to the exact source files, not text copied 
and pasted into an email or a pastebin, there is no hope of diagnosing it. 
It would just be a major waste of time and energy.

The most likely cause is that the version of the file you are importing 
is not the same as the version of the file you have open in the hex editor.

E.g. suppose you open the file in the hex editor. Then you edit the file 
in your text editor, and save changes. Then you import it in Python, 
which reports the offsets according to the version of the file on disk. 
You look at the hex editor, but you're seeing the *old* version, before 
the changes.
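
One way to rule out a stale view is to read the bytes straight from disk 
at the moment you check. This is only a sketch: bytes_near is a 
hypothetical helper of mine, and the temporary file stands in for your 
real module.

```python
import os
import tempfile

def bytes_near(path, offset, width=8):
    """Return the bytes within `width` of `offset`, read fresh from disk."""
    with open(path, "rb") as f:
        data = f.read()
    start = max(0, offset - width)
    return data[start:offset + width]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")  # stand-in for the real file
    with open(path, "wb") as f:
        f.write(b"x = 1\ny = 2\n")
    before = bytes_near(path, 4, width=2)

    # Simulate saving an edit from the text editor.
    with open(path, "wb") as f:
        f.write(b"x = 9\ny = 2\n")
    after = bytes_near(path, 4, width=2)

print(before, after)
```

Because the helper re-reads the file on every call, it always reflects 
what is on disk now, which is what Python sees when it imports.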

Unless the problem is *reproducible*, i.e. you can repeat it at will, 
there's no way of knowing for sure what happened. (Unless you have a 
time machine and can go back in time to see exactly what you did.)

Don't lose any sleep over this sort of thing. We've all done it. My 
favourite is when I'm making incremental edits to a file, but forget to 
reload() changes in the interactive interpreter correctly. Python ends 
up finding an error on (let's say) line 42, but by the time the error is 
reported, line 42 on disk has changed and so the traceback prints a 
completely irrelevant line.

If you *can* reproduce the error at will, e.g. like this:

(1) Take a source file in a known state;
(2) Import the file into a *freshly started* Python interpreter;
(3) Python reports an error on some line;
(4) but the line doesn't seem to have anything to do with that error

and repeat it as often as you like, then it would be worth sharing the 
file for further investigation, by attaching it to an email, not copying 
and pasting the text from it.
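
Steps (1) to (3) can be scripted, so that every attempt really does use 
a freshly started interpreter. A rough sketch for Python 3.7 or later 
follows. import_in_fresh_interpreter and broken_example.py are 
hypothetical names, and it relies on python -c putting the working 
directory on the import path.

```python
import os
import subprocess
import sys
import tempfile

def import_in_fresh_interpreter(directory, module_name):
    """Import module_name in a brand-new Python process started in directory."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        cwd=directory,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr

# A source file in a known state: the two-liner with the bad escape.
broken_source = r'''
def func():
    """ text r'aaa\Userswxyzabcd' """
'''

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "broken_example.py"), "w") as f:
        f.write(broken_source)
    returncode, stderr = import_in_fresh_interpreter(tmp, "broken_example")

print(returncode)
print(stderr)
```

If the same traceback comes back on every run, the problem is 
reproducible and worth reporting.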

Disclaimer: due to the way Python's parser works, sometimes a missing 
parenthesis or bracket will cause a syntax error on the line *following* 
the erroneous line. This is a known limitation.
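
Here is a small sketch of that effect. The helper name reported_line is 
mine. Which line gets reported depends on the Python version, so don't 
rely on the exact number it prints.

```python
def reported_line(source):
    """Return the line number Python blames for a syntax error, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno
    return None

# The parenthesis opened on line 1 is never closed.
source = "x = (1 + 2\ny = 3\n"
line = reported_line(source)
print(line)
```

Some versions blame line 2 even though the mistake is on line 1, which 
is why it pays to look at the line *before* the one in the traceback too.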



-- 
Steven

