[DB-SIG] MySQL lost connections

Chris Cogdon chris at cogdon.org
Fri Mar 5 11:12:49 EST 2004

On Mar 5, 2004, at 07:10, Lloyd Kvam wrote:

> I have been getting Python program failures involving lost connections
> to a MySQL database.  The server is a Redhat (9) Linux box.  The clients
> are running Win2K.  To research the issue, I've been running tcpdump on
> the server to trace the packet flow.  The following text describes what
> I found from tcpdump.  My guess is that this is really some kind of
> Windows problem, but I have not had any success getting information on
> that front.  I'd appreciate any thoughts or suggestions.  Thanks.
> At 11:54:01 we have the PC acknowledging successful receipt of the
> previous packet from the database.  I assume the PC operator heads off
> to lunch at this point, leaving the program running.
> Then at 12:59:02 the PC tries to open a new connection from port 3648
> rather than 1497, the port it had been using.  The database accepts the
> new connection.
> At 12:59:03 the PC resets (closes) the new connection.  I believe that
> this is when the program realized it had lost its working connection
> and aborted.
> At 12:59:09 a new connection from port 3650 is established.  I believe
> this is from the program being restarted.  3650 was still working when
> I wrote this email.
> So, what's going on???
> My guess is that Windows is terminating the idle connection even though
> the program is still running!!!  When the program tries to use the
> connection that it had never closed, Windows attempts to open a new
> connection.  However, that new connection never established the login
> with the database, so it doesn't work.  The program now aborts because
> it has lost its working database connection.  Once the program is
> restarted, we are back to normal.
> I will try to research whether this idea can possibly be correct.

This type of problem is typical if you have something like NAT running. 
I "don't do windows" anymore, so I don't know if Win2K has any kind of 
internal NAT going on... but... if there's an external NAT box between 
your client and server, then that seems like the most likely culprit.

To fix: change the NAT to have a much larger timeout, or... change your 
client program so that it will properly reconnect to the database if it 
notices that the connection is dead.
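
As a rough illustration of the second option, here is a minimal sketch of a
wrapper that reconnects and retries once when a query fails.  The class name
ReconnectingConnection and its parameters are made up for this example; it
only assumes a DB-API style connection (cursor(), execute(), fetchall(),
close()).  With MySQLdb you would presumably pass a callable that wraps
MySQLdb.connect(...) and use MySQLdb.OperationalError as the error to retry
on, but check that against the driver you actually use.

```python
import time


class ReconnectingConnection:
    """Hypothetical helper: wrap a DB-API style connection and
    reconnect when a query fails because the link has gone away."""

    def __init__(self, connect, retry_errors=(Exception,), retries=1, delay=0.0):
        # connect: zero-argument callable returning a fresh connection.
        # retry_errors: exception types treated as "connection is dead".
        self._connect = connect
        self._retry_errors = retry_errors
        self._retries = retries
        self._delay = delay
        self._conn = connect()

    def execute(self, sql, params=()):
        """Run a statement and return all rows, reconnecting on failure."""
        attempt = 0
        while True:
            try:
                cursor = self._conn.cursor()
                cursor.execute(sql, params)
                return cursor.fetchall()
            except self._retry_errors:
                if attempt >= self._retries:
                    raise  # give up and let the caller see the error
                attempt += 1
                time.sleep(self._delay)
                try:
                    self._conn.close()  # discard the dead connection
                except Exception:
                    pass  # it may already be unusable; ignore
                self._conn = self._connect()
```

Note that blindly re-running a statement is only safe when it is not part of
a half-finished transaction, since anything uncommitted on the old
connection is lost when it drops.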

    ("`-/")_.-'"``-._        Chris Cogdon <chris at cogdon.org>
     . . `; -._    )-;-,_`)
    (v_,)'  _  )`-.\  ``-'
   _.- _..-_/ / ((.'
((,.-'   ((,/   fL
