[Python-Dev] status of development documentation

Sat Dec 24 16:34:41 CET 2005

Tim Peters wrote:
> [Neal]
> 
>>Hmmm, I thought others were running the tests on Windows too.  There
>>was one report on Nov 22 about running Purify on Windows 2k (subject:
>>ast status, memory leaks, etc).  He had problems with a stack overflow
>>in test_compile.  He was going to disable the test and re-run.  I
>>never heard back though.  Based on that info, I would guess that
>>test_builtin was working on Win 2k on Nov 22.
> 
> 
> I wouldn't assume that.  My "nobody" was wrt the universe of Python
> developers, not users; folks like Trent and MarkH and you and me. 
> Without "normal" baseline test experience, users don't understand what
> they're seeing, and so their reports can be highly misleading.  You
> can trust that while I don't understand what I'm seeing here either,
> at least I told the absolute truth about it and didn't hold back
> critical details ;-)
> 
> That said, I was hoping to do some Python work over Thanksgiving week,
> but was mortally discouraged on the Nov 19-20 weekend by all the test
> failures I saw.  So I'm pretty sure (but not certain) that
> test_builtin was failing then.
> 
> 
>>>(A parenthetical question:  is there a reason you don't pass -uall to
>>>regrtest.py?)
> 
> 
>>It's calling make test.  I thought about calling regrtest.py instead
>>and doing as you suggest.  Is there a benefit to running make test?
> 
> 
> You're asking a Windows guy about make:  bad career move ;-)
> 
> 
>>I know it runs with and without -O.  I guess it's only machine time, I
>>could run make test and regrtest.py -uall.
> 
> 
> -uall is helpful in finding bugs.  One thing in particular here is
> that test_compiler runs only a tiny subset of its full test unless an
> appropriate -u flag is given.
> 
> 
>>>On WinXP Pro SP2 today, passing -uall, and after fixing all the MS
>>>compiler warnings that have crept in:
>>>
>>>251 tests OK.
>>>12 tests failed:
>>>    test_builtin test_coding test_compiler test_pep263
>>>    test_univnewlines test_urllib test_urllib2 test_urllibnet
>>>    test_userlist test_wave test_whichdb test_zipfile
>>>1 skip unexpected on win32:
>>>    test_xml_etree_c
> 
> 
>>Ouch!  I might be to blame for at least some of those. :-(
> 
> 
> I'm sure it's not as bad as it looks.  For example, test_coding and
> (the -uall) test_compiler fail for exactly the same reason.  For
> another, when a test fails on Windows, it sometimes leaves a "@test"
> file or directory behind, which causes a cascade of bogus failures
> later.  For example, test_userlist, test_wave, test_whichdb, and
> test_zipfile all pass when run alone here.  Others probably do too.
> 
> ...
> 
>>Do you know if the tests were broken before the AST merge (about Oct
>>22 I think)?
> 
> 
> I don't know.  I'm getting more evidence that most (if not all) of the
> failures are due to compile-time parsing screwups, so the AST merge is
> a prime suspect.
> 
> Is it possible that generated parse tables (whatever) are out-of-date
> on a Windows box?  There are no makefiles here, and the Windows build
> has always relied on Unix-heads to check in correct generated parser
> files.
> 
> 
>>>The code up to the first failure is short:
>>>
>>>        bom = '\xef\xbb\xbf'
>>>        compile(bom + 'print 1\n', '', 'exec')
>>>
>>>Curiously, that sequence doesn't blow up under the released Windows
>>>Python 2.4.2, so somebody broke something here since then ...
> 
> 
>>There were a bunch of changes to Parser/tokenizer.c to handle error
>>conditions.  Those go back to Oct 1.  I don't *think* those would
>>cause these, but I'm not sure.
>>
>>Sorry, I don't know any more.  I guess you might have to binary search
>>by date to try and find the problem.
> 
> 
> That's just silly ;-)  What I need is someone who understands what
> this code is _supposed_ to do, so we can fix it; finding the checkin
> that caused it isn't nearly as interesting.  Windows has an excellent
> debug-build debugger and I can easily step through the code.  But I
> have no idea why compiling a string starting with  '\xef\xbb\xbf'
> should _not_ be a syntax error -- it looks like a syntax error to me.
> 
> And when I step thru the code, it looks like a syntax error to the
> parser too.  It peels off the first character (\xef), and says "syntax
> error" at that point:
> 
> Py_CompileStringFlags ->
> PyParser_ASTFromString ->
> PyParser_ParseStringFlagsFilename ->
> parsetok ->
> PyTokenizer_Get
> 
> That sets `a` to point at the start of the string, `b` to point at the
> second character, and returns type==51.  Then `len` is set to 1, 
> `str` is malloc'ed to hold 2 bytes, and `str` is filled in with
> "\xef\x00" (the first byte of the input, as a NUL-terminated C
> string).
> 
> PyParser_AddToken then calls classify(), which falls off the end of
> its last loop and returns -1:  syntax error.
> 
> So how it gets there is really quite straightforward.  The problem for
> me is that I have no idea why it _should_ do something other than
> that, let alone what that may be.

PEP 263:

"""
     To aid with platforms such as Windows, which add Unicode BOM marks
     to the beginning of Unicode files, the UTF-8 signature
     '\xef\xbb\xbf' will be interpreted as 'utf-8' encoding as well
     (even if no magic encoding comment is given).
"""
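
A minimal sketch of the behaviour the PEP describes. It assumes a
current Python 3 interpreter rather than the 2.x trunk discussed in this
thread, so the source is a bytes object and the statement is an
assignment instead of `print 1`. The filename "<bom-example>" and the
variable names are purely illustrative.

```python
import codecs
import io
import tokenize

# The UTF-8 signature (BOM) from PEP 263: b'\xef\xbb\xbf'
bom = codecs.BOM_UTF8

# Source bytes that start with the signature and carry no coding comment.
source = bom + b"x = 1\n"

# The tokenize module reports the encoding it infers from the signature.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# compile() should accept the same bytes instead of raising SyntaxError
# on the first byte of the signature.
namespace = {}
exec(compile(source, "<bom-example>", "exec"), namespace)

print(encoding, namespace["x"])
```

If the leading signature were handed to the parser as ordinary input,
compile() would fail on the very first byte, which matches the trace
quoted above.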


> This needs someone who knows
> something :-)