[Python-bugs-list] [ python-Bugs-766669 ] Consistent GPF on exit

SourceForge.net noreply@sourceforge.net
Sun, 06 Jul 2003 18:29:39 -0700


Bugs item #766669, was opened at 2003-07-06 07:32
Message generated for change (Comment added) made by grittkmg
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=766669&group_id=5470

Category: Windows
Group: Python 2.2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Kurt Grittner (grittkmg)
Assigned to: Nobody/Anonymous (nobody)
Summary: Consistent GPF on exit

Initial Comment:
Using the following script (clt.py):

import os, time, sys
from socket import *              
from Tkinter import *

class App:
    def __init__(self, master):
        self.win = Toplevel(master)
        self.button = Button(self.win, text="QUIT",
                             fg="red", command=self.goaway)
        self.button.pack(side=LEFT)
        self.hi_there = Button(self.win, text="Hello",
                               command=self.say_hi)
        self.hi_there.pack(side=LEFT)
        #self.serverHost = 'localhost'      
        self.serverHost = '192.168.1.12'      
        self.serverPort = 50007            
        self.sockobj = socket(AF_INET, SOCK_STREAM)      
        self.sockobj.connect((self.serverHost, self.serverPort))

    def say_hi(self):
        self.message = ['Hello network world']   
        for line in self.message:
            self.sockobj.send(line)  
        data = self.sockobj.recv(1024)
        print 'Client received:', `data`

    def goaway(self):
        self.sockobj.shutdown(0)
        self.sockobj.close()
        self.win.destroy()

def NewClient():
    App(root)

def PgmExit():
    root.quit()

root = Tk()
Button(root, text='New Client', command=NewClient).pack()
Button(root, text='Quit All Clients', command=PgmExit).pack()
root.mainloop()

If you press 'Quit' first, then there are no errors.

If you press 'New Client' even once (whether or not there 
is an echo server to talk to) the program runs the other 
Toplevel windows seemingly without problems, but then 
when you press 'Quit' it crashes at the following place:

/* Vc98\Crt\Src\Crtexe.c */

#ifdef WPRFLAG
            __winitenv = envp;
            mainret = wmain(argc, argv, envp);
#else  /* WPRFLAG */
            __initenv = envp;
            mainret = main(argc, argv, envp);
#endif  /* WPRFLAG */

#endif  /* _WINMAIN_ */

/* ------------------------------------------------------------------------ */
/* Error executing following line */
            exit(mainret);
/* ------------------------------------------------------------------------ */

/* OS error dump
PYTHON_D caused an invalid page fault in
module KERNEL32.DLL at 017f:bff88396.
Registers:
EAX=c00309c4 CS=017f EIP=bff88396 EFLGS=00210216
EBX=0062ffec SS=0187 ESP=0052fecc EBP=00530044
ECX=00000000 DS=0187 ESI=00000000 FS=10e7
EDX=bff76855 ES=0187 EDI=bff79060 GS=0000
Bytes at CS:EIP:
53 56 57 8b 75 10 8b 38 33 db 85 f6 75 2d 8d b5 
Stack dump:

/* Vc98\Crt\Src\Crt0dat.c */
/* First debug run */
1020ACE0   jmp         doexit+0B6h (1020acf6)
384:          }
385:
386:
387:          _C_Exit_Done = TRUE;
1020ACE2   mov         dword ptr [__C_Exit_Done (10264728)],1
388:
389:          ExitProcess(code);
1020ACEC   mov         ecx,dword ptr [code]
1020ACEF   push        ecx

/* ------------------------------------------------------------------------ */
/* Error on the following call */
1020ACF0   call        dword ptr [__imp__ExitProcess@4 (10256020)]
/* ------------------------------------------------------------------------ */

/* Second debug run */
PYTHON_D caused an invalid page fault in
module KERNEL32.DLL at 017f:bff88396.
Registers:
EAX=c00309c4 CS=017f EIP=bff88396 EFLGS=00210216
EBX=0062ffec SS=0187 ESP=0052fecc EBP=00530044
ECX=00000000 DS=0187 ESI=00000000 FS=1b3f
EDX=bff76855 ES=0187 EDI=bff79060 GS=0000
Bytes at CS:EIP:
53 56 57 8b 75 10 8b 38 33 db 85 f6 75 2d 8d b5 
Stack dump:


390:
391:
392:  }
1020ACF6   mov         esp,ebp
1020ACF8   pop         ebp
1020ACF9   ret

/* Vc98\Crt\Src\Crtexe.c - Continued */

*/

        }
        __except ( _XcptFilter(GetExceptionCode(), GetExceptionInformation()) )
        {
            /*
             * Should never reach here
             */
            _exit( GetExceptionCode() );

        } /* end of try - except */

}

It seems odd that everything else in the program works 
fine right up to ExitProcess.  The error occurs on multiple 
Windows machines.  It does NOT occur on Linux running the 
same code base.  Also, many other multi-threaded sample 
programs from other authors exhibit the same 'crash on 
exit' syndrome.

Thanks for your attention,
Kurt


----------------------------------------------------------------------

>Comment By: Kurt Grittner (grittkmg)
Date: 2003-07-06 20:29

Message:
Logged In: YES 
user_id=816888

I have a working fix for this problem, but the code is a fairly 
ugly hack. (A terrible thing to do to such a well organized 
codebase).  

1. I exported a fini_socket() function from the PYD DLL.
2. I cloned the dynamic lookup function to check for the 
optional fini_socket() name as well as the init_socket().  If 
found, I remember it in a global.
3. In PyImport_Cleanup() I call the fini_socket() routine (if 
present) before these modules start getting freed.
4. In NTInit() I add to a counter instead of using atexit().
5. fini_socket() calls NTcleanup(), which simply calls 
WSACleanup 'counter' times to release the winsock DLL.
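
The five steps above can be modelled in a few lines of Python.  This is 
only a toy sketch of the ordering, not the actual C change (which is 
not included in this report): FakeWinsock stands in for the Winsock 
DLL, and the names FakeWinsock, SocketModule and import_cleanup are 
hypothetical.

```python
# Toy model only. FakeWinsock stands in for the real Winsock DLL and
# every name here is hypothetical; the real change was made in C.


class FakeWinsock:
    """Tracks startup/cleanup pairing and whether the DLL is loaded."""

    def __init__(self):
        self.loaded = True
        self.refs = 0

    def startup(self):
        self.refs += 1

    def cleanup(self):
        if not self.loaded:
            raise RuntimeError("cleanup called after the DLL was unloaded")
        self.refs -= 1


class SocketModule:
    """Models steps 4 and 5: count initialisations, release them in fini."""

    def __init__(self, winsock):
        self.winsock = winsock
        self.counter = 0

    def nt_init(self):
        # Step 4: bump a counter instead of registering an atexit handler.
        self.winsock.startup()
        self.counter += 1

    def fini_socket(self):
        # Step 5: call cleanup once per recorded startup.
        while self.counter > 0:
            self.winsock.cleanup()
            self.counter -= 1


def import_cleanup(module, winsock):
    # Step 3: run the module's fini hook before anything is unloaded.
    module.fini_socket()
    winsock.loaded = False


winsock = FakeWinsock()
module = SocketModule(winsock)
module.nt_init()
module.nt_init()
import_cleanup(module, winsock)
print(winsock.refs, module.counter, winsock.loaded)
```

Running the sketch prints "0 0 False": every startup is balanced by a 
cleanup while the stand-in DLL is still loaded.  Calling 
winsock.cleanup() after that point raises, which is the model's 
analogue of cleaning up too late.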

This works, so now I can go on with my Python study.  If 
anyone wants my hack code, just email me.

Kurt

----------------------------------------------------------------------

Comment By: Kurt Grittner (grittkmg)
Date: 2003-07-06 15:57

Message:
Logged In: YES 
user_id=816888

The minimum required to cause this GPF is the following:
C:\ptest>python
Python 2.3b2 (#43, Jun 29 2003, 16:43:04) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from socket import *
>>> s=socket(AF_INET, SOCK_STREAM)
>>><Ctrl-Z>
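
The interactive steps above could also be automated by running the two 
statements in a child interpreter and checking its exit status.  A 
minimal sketch, assuming a Python 3 interpreter (the report itself used 
2.2.3 and 2.3b2) and assuming that a crash during shutdown shows up as 
a nonzero status; the helper name exits_cleanly is hypothetical.

```python
import subprocess
import sys

# The two statements from the interactive session, as one child script.
REPRO = "from socket import *\ns = socket(AF_INET, SOCK_STREAM)\n"


def exits_cleanly(code=REPRO):
    """Run code in a fresh interpreter; True if it exited with status 0."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode == 0


if __name__ == "__main__":
    print("clean exit:", exits_cleanly())
```

Running the child in a separate process keeps any shutdown crash out of 
the process doing the checking, so the check can be repeated across 
interpreter builds.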

I have found the problem in the source, though I don't know 
how to fix it.  WSAStartup() is called as a DLL export, but the 
WSACleanup() is called as an atexit().  This is too late to call 
it.  It must be called before the DLL unloads, or the 
WSAStartup should be moved to the process level.  I don't 
know the program well enough to attempt a hack, but I did 
verify that it is the WSACleanup line running from the CRT 
code that is killing things (every time you even create a 
socket).

Notice, this is using the latest version too.

Kurt



----------------------------------------------------------------------

Comment By: Kurt Grittner (grittkmg)
Date: 2003-07-06 13:41

Message:
Logged In: YES 
user_id=816888

Hi Tim,

Here is some more info.  I stripped out all references to 
socket, and saved as clt2.clw to force it to load with 
pythonw.exe.  Things worked perfectly.  Here is the script:

import os, time, sys
#from socket import *              
from Tkinter import *

class App:
    def __init__(self, master):
        self.win = Toplevel(master)
        self.button = Button(self.win, text="QUIT", fg="red",
                             command=self.goaway)
        self.button.pack(side=LEFT)
        self.hi_there = Button(self.win, text="Hello",
                               command=self.say_hi)
        self.hi_there.pack(side=LEFT)
        #self.serverHost = 'localhost'      
        self.serverHost = '192.168.1.12'      
        self.serverPort = 50007            
        #self.sockobj = socket(AF_INET, SOCK_STREAM)      
        #self.sockobj.connect((self.serverHost, self.serverPort))

    def say_hi(self):
        self.message = ['Hello network world']   
        #for line in self.message:
        #    self.sockobj.send(line)  
        #data = self.sockobj.recv(1024)
        print 'Client received:', `data`

    def goaway(self):
        #self.sockobj.shutdown(0)
        #self.sockobj.close()
        self.win.destroy()

def NewClient():
    App(root)

def PgmExit():
    root.quit()
    
root = Tk()
Button(root, text='New Client', command=NewClient).pack()
Button(root, text='Quit All Clients', command=PgmExit).pack()
root.mainloop()

So next, I did the reverse... I took out all references to Tkinter 
and saved it as clt3.clw (again to use pythonw.exe).  It blew 
sky high again.  Here's the script:

import os, time, sys
from socket import *              

class App:
    def __init__(self, master):
        self.serverHost = '192.168.1.12'      
        self.serverPort = 50007            
        self.sockobj = socket(AF_INET, SOCK_STREAM)      
        self.sockobj.connect((self.serverHost, self.serverPort))   

    def say_hi(self):
        self.message = ['Hello network world']   
        for line in self.message:
            self.sockobj.send(line)  
        data = self.sockobj.recv(1024)
        print 'Client received:', `data`

    def goaway(self):
        self.sockobj.shutdown(0)
        self.sockobj.close()

x = App(0)
x.say_hi()
time.sleep(1)
y = App(1)
y.say_hi()
time.sleep(1)
x.goaway()
y.goaway()
time.sleep(1)

Here's the output from the linux-based echo server:

ns1:/www/python # python select-server.py
select-server loop starting
Connect: ('192.168.1.20', 2382) 135755344
        got Hello network world on 135755344
Connect: ('192.168.1.20', 2383) 135744224
        got Hello network world on 135744224
        got  on 135755344
        got  on 135744224

Here's the (now familiar) GPF output:

PYTHONW caused an invalid page fault in
module KERNEL32.DLL at 017f:bff7b9a6.
Registers:
EAX=00000000 CS=017f EIP=bff7b9a6 EFLGS=00000246
EBX=007961a0 SS=0187 ESP=0095fba8 EBP=0095fbe8
ECX=0095fbf8 DS=0187 ESI=10013168 FS=19d7
EDX=0079d670 ES=0187 EDI=0079e7b0 GS=356e
Bytes at CS:EIP:
ff 76 04 e8 13 89 ff ff 5e c2 04 00 56 8b 74 24 
Stack dump:
007961a0 10007eb0 10013168 7800b317 0079eb50 0079d670 
00000000 007a2c98 00000000 81dafb90 00794641 760028e8 
0079d67c 0079d670 007961a0 0079d67c 

And here's the netstat output from right after the crash:

C:\WINDOWS>netstat /a

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    amd2000:2339           AMD2000:0              LISTENING
  TCP    amd2000:135            AMD2000:0              LISTENING
  TCP    amd2000:1243           AMD2000:0              LISTENING
  TCP    amd2000:1025           AMD2000:0              LISTENING
  TCP    amd2000:1699           AMD2000:0              LISTENING
  TCP    amd2000:2339           www.mlgames.org:22     ESTABLISHED
  TCP    amd2000:2382           www.mlgames.org:50007  TIME_WAIT
  TCP    amd2000:2383           www.mlgames.org:50007  TIME_WAIT
  TCP    amd2000:137            AMD2000:0              LISTENING
  TCP    amd2000:138            AMD2000:0              LISTENING
  TCP    amd2000:nbsession      AMD2000:0              LISTENING
  TCP    amd2000:1243           www.mlgames.org:22     ESTABLISHED
  UDP    amd2000:1699           *:*
  UDP    amd2000:nbname         *:*
  UDP    amd2000:nbdatagram     *:*

C:\WINDOWS>


Notice the TIME_WAIT connections to port 50007.

It looks like this is a 'socket' library problem.  I also tried 
adding 'del x' and 'del y' and a long delay to the end of the 
script to see if things would change... but no.  Same explosion.

So, to answer your question.  It happens on both python and 
pythonw, and it happens without loading Tkinter.  I suppose I 
can try the latest development version as you suggested.

Kurt

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-07-06 11:34

Message:
Logged In: YES 
user_id=31435

Unassigned.  Didn't see any problem on Win98SE, under 2.2.3 
or 2.3b2, using either of the self.serverHost assignments.

If you can, please try under 2.3b2.  On Windows that ships 
with the current version of Tcl/Tk (8.4.3), and some kinds of 
Tcl/Tk Windows shutdown races are said to be fixed in 8.4.3.

Question:  How do you start this program?  With python.exe 
or with pythonw.exe?  The only known workaround for some 
kinds of previous Tcl/Tk shutdown races was to start the 
program with pythonw.exe.  So if you haven't tried 
pythonw.exe, try it and see whether the problem goes away.  
If it does, it's almost certainly a bug in Tcl/Tk.

----------------------------------------------------------------------
