[Python-Dev] status of development documentation

Tim Peters tim.peters at gmail.com
Sat Dec 24 16:19:57 CET 2005


[Neal]
> Hmmm, I thought others were running the tests on Windows too.  There
> was one report on Nov 22 about running Purify on Windows 2k (subject:
> ast status, memory leaks, etc).  He had problems with a stack overflow
> in test_compile.  He was going to disable the test and re-run.  I
> never heard back though.  Based on that info, I would guess that
> test_builtin was working on Win 2k on Nov 22.

I wouldn't assume that.  My "nobody" was wrt the universe of Python
developers, not users; folks like Trent and MarkH and you and me. 
Without "normal" baseline test experience, users don't understand what
they're seeing, and so their reports can be highly misleading.  You
can trust that while I don't understand what I'm seeing here either,
at least I told the absolute truth about it and didn't hold back
critical details ;-)

That said, I was hoping to do some Python work over Thanksgiving week,
but was mortally discouraged on the Nov 19-20 weekend by all the test
failures I saw.  So I'm pretty sure (but not certain) that
test_builtin was failing then.

>> (A parenthetical question:  is there a reason you don't pass -uall to
>> regrtest.py?)

> It's calling make test.  I thought about calling regrtest.py instead
> and doing as you suggest.  Is there a benefit to running make test?

You're asking a Windows guy about make:  bad career move ;-)

> I know it runs with and without -O.  I guess it's only machine time, I
> could run make test and regrtest.py -uall.

-uall is helpful in finding bugs.  One thing in particular here is
that test_compiler runs only a tiny subset of its full test unless an
appropriate -u flag is given.

>> On WinXP Pro SP2 today, passing -uall, and after fixing all the MS
>> compiler warnings that have crept in:
>>
>> 251 tests OK.
>> 12 tests failed:
>>     test_builtin test_coding test_compiler test_pep263
>>     test_univnewlines test_urllib test_urllib2 test_urllibnet
>>     test_userlist test_wave test_whichdb test_zipfile
>> 1 skip unexpected on win32:
>>     test_xml_etree_c

> Ouch!  I might be to blame for at least some of those. :-(

I'm sure it's not as bad as it looks.  For example, test_coding and
(the -uall) test_compiler fail for exactly the same reason.  For
another, when a test fails on Windows, it sometimes leaves a "@test"
file or directory behind, which causes a cascade of bogus failures
later.  For example, test_userlist, test_wave, test_whichdb, and
test_zipfile all pass when run alone here.  Others probably do too.
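
One way to rule out that cascade before re-running a suspect test is to
clear any stale scratch entry first.  The following is only a minimal
sketch, assuming the leftover really is named "@test" and sits in the
directory the tests are run from; `remove_leftover` and `LEFTOVER_NAME`
are hypothetical names, not part of regrtest.py.

```python
import os
import shutil

# Assumption: the scratch name is "@test" in the current directory,
# as described above; adjust if your checkout uses another name.
LEFTOVER_NAME = "@test"


def remove_leftover(path=LEFTOVER_NAME):
    """Delete a stale scratch file or directory.

    Returns True if something was removed, False if nothing was there.
    """
    # A real directory (not a symlink to one) needs a recursive delete.
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
        return True
    # lexists() also catches dangling symlinks, which exists() would miss.
    if os.path.lexists(path):
        os.remove(path)
        return True
    return False
```

Calling `remove_leftover()` between individual test runs should make it
easier to tell a genuine failure from one caused by an earlier test's
debris.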

...
>
> Do you know if the tests were broken before the AST merge (about Oct
> 22 I think)?

I don't know.  I'm getting more evidence that most (if not all) of the
failures are due to compile-time parsing screwups, so the AST merge is
a prime suspect.

Is it possible that generated parse tables (whatever) are out-of-date
on a Windows box?  There are no makefiles here, and the Windows build
has always relied on Unix-heads to check in correct generated parser
files.

>> The code up to the first failure is short:
>>
>>         bom = '\xef\xbb\xbf'
>>         compile(bom + 'print 1\n', '', 'exec')
>>
>> Curiously, that sequence doesn't blow up under the released Windows
>> Python 2.4.2, so somebody broke something here since then ...

> There were a bunch of changes to Parser/tokenizer.c to handle error
> conditions.  Those go back to Oct 1.  I don't *think* those would
> cause these, but I'm not sure.
>
> Sorry, I don't know any more.  I guess you might have to binary search
> by date to try and find the problem.

That's just silly ;-)  What I need is someone who understands what
this code is _supposed_ to do, so we can fix it; finding the checkin
that caused it isn't nearly as interesting.  Windows has an excellent
debug-build debugger and I can easily step through the code.  But I
have no idea why compiling a string starting with '\xef\xbb\xbf'
should _not_ be a syntax error -- it looks like a syntax error to me.

And when I step thru the code, it looks like a syntax error to the
parser too.  It peels off the first character (\xef), and says "syntax
error" at that point:

Py_CompileStringFlags ->
PyParser_ASTFromString ->
PyParser_ParseStringFlagsFilename ->
parsetok ->
PyTokenizer_Get

That sets `a` to point at the start of the string, `b` to point at the
second character, and returns type==51.  Then `len` is set to 1, 
`str` is malloc'ed to hold 2 bytes, and `str` is filled in with
"\xef\x00" (the first byte of the input, as a NUL-terminated C
string).

PyParser_AddToken then calls classify(), which falls off the end of
its last loop and returns -1:  syntax error.

So how it gets there is really quite straightforward.  The problem for
me is that I have no idea why it _should_ do something other than
that, let alone what that may be.  This needs someone who knows
something :-)
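
For reference, the three bytes in the failing test are the UTF-8 byte
order mark, which PEP 263 says should be read as a "utf-8" encoding
signature and not as part of the program text.  The sketch below
illustrates that under stated assumptions: it is written for Python 3,
not the 2.x interpreter discussed in this thread, and `strip_utf8_bom`
is a hypothetical helper that only shows the idea of peeling off the
signature; it is not the tokenizer's actual code.

```python
import codecs

# The prefix used by the failing test is exactly the UTF-8 BOM.
bom = b'\xef\xbb\xbf'
assert bom == codecs.BOM_UTF8


def strip_utf8_bom(source):
    """Hypothetical helper: drop a leading UTF-8 BOM from a byte string."""
    if source.startswith(codecs.BOM_UTF8):
        return source[len(codecs.BOM_UTF8):]
    return source


# Python 3's compile() accepts byte-string source that starts with the
# BOM, so this is expected to compile without raising SyntaxError.
code = compile(bom + b'x = 1\n', '<string>', 'exec')
namespace = {}
exec(code, namespace)
```

If the signature is handled as PEP 263 describes, the parser never sees
\xef as a token, which is presumably what the test expects.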

