[issue13070] segmentation fault in pure-python multi-threaded server
report at bugs.python.org
Fri Sep 30 00:42:47 CEST 2011
New submission from Victor Semionov <vsemionov at gmail.com>:
I'm developing a multi-threaded TCP server and have been seeing segmentation faults with Python 3.2 on Linux and 3.2.2 on Windows. This happens when using only pure-Python libraries, so I believe the problem is in the interpreter. The issue is very easy to reproduce with my code, but I think it is obscure, because I have not been able to reproduce it with a smaller program.
Here's what happens. The server accepts TCP connections, and creates a thread for every new connection. When the client sends a request, the server initiates its own TCP connection to a database. If any socket IO operation fails by raising a socket error (e.g. the database is down), those errors are caught by the calling code, and it gracefully terminates the thread. However, when the next client connects and sends a request, even if the server-initiated connections are successfully established, the interpreter crashes a bit later during the processing of the client's request (I think during IO operations).
Strangely, this does not occur if the thread recovers and does not terminate after catching an exception (as is the case with failed Redis connections). Also, I was able to port my program to Python 2.7, and it did not crash there.
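The threading pattern described above can be sketched roughly as follows. This is not wordbase's actual code: `query_backend`, `handle_client`, `serve`, and the text after the 420 status are hypothetical names chosen for illustration, and the sketch only shows the shape of the error path (catch the socket error, reply, let the thread end). It is written for a current Python 3; on 3.2 the exception to catch would be `socket.error`, which became an alias of `OSError` in 3.3.

```python
import socket
import threading


def query_backend(addr, payload, timeout=1.0):
    """Open a fresh connection to the backend, send payload, return one reply chunk."""
    with socket.create_connection(addr, timeout=timeout) as backend:
        backend.sendall(payload)
        return backend.recv(4096)


def handle_client(conn, backend_addr):
    """Serve one client. On any socket error, report 420 and let the thread end."""
    with conn:
        try:
            request = conn.recv(4096)
            if not request:
                return
            conn.sendall(query_backend(backend_addr, request))
        except OSError:
            # Backend down or I/O failure: answer with an error status.
            # The wording after "420" is a placeholder, not wordbase's message.
            try:
                conn.sendall(b"420 server temporarily unavailable\r\n")
            except OSError:
                pass
    # Returning here terminates the per-connection thread.


def serve(listener, backend_addr):
    """Accept connections forever, one thread per client."""
    while True:
        try:
            conn, _peer = listener.accept()
        except OSError:
            break  # listener was closed
        worker = threading.Thread(
            target=handle_client, args=(conn, backend_addr), daemon=True
        )
        worker.start()
```

In the report, the crash shows up on a later connection, after an earlier handler thread has gone through the `except` branch and exited.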
To reproduce, you will need pg8000, which is a pure-Python DB-API driver. You will need to get my program, wordbase, from the Mercurial repository at https://bitbucket.org/vsemionov/wordbase (changeset 31c6554e67ee) and edit src/wordbase/db/pgsql.py. Change "import psycopg2 as dbapi" to "import pg8000.dbapi as dbapi". This is just to ensure that no C-based library is used. Steps to reproduce:
0. Ensure PostgreSQL is not running.
1. Start wordbase with src/wordbase/wordbase.py -f <conf_file>. Use the path to the provided sample conf file at src/wordbase/wordbase.conf. By default you'll need to be root to be able to create the log file.
2. Connect a client with "telnet localhost 2628" and enter "d hello". This should fail with status 420. Reconnect and repeat the same step a couple of times. The interpreter usually crashes after repeating this step.
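Step 2 can also be scripted instead of typed into telnet. The following is a rough sketch, assuming the server listens on localhost:2628 as in the steps above; `send_define` and `repeat_define` are hypothetical helper names, and the sketch simply collects whatever the server sends (greeting included) until the connection closes or goes quiet.

```python
import socket


def send_define(word="hello", host="localhost", port=2628, timeout=2.0):
    """Connect, send "d <word>", and return everything the server sends back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall("d {}\r\n".format(word).encode("ascii"))
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # server went quiet; keep what was received so far
    return b"".join(chunks).decode("utf-8", "replace")


def repeat_define(attempts=5, **kwargs):
    """Reconnect and repeat the request, stopping once the server is unreachable."""
    results = []
    for _ in range(attempts):
        try:
            results.append(send_define(**kwargs))
        except OSError as exc:
            # A refused or reset connection suggests the server process died.
            results.append("connection failed: {}".format(exc))
            break
    return results
```

Calling `repeat_define(5)` against a running wordbase instance and printing the results shows at which attempt, if any, the server stops answering.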
I'm providing the interpreter's backtrace, obtained from Python 3.2 on Linux. It is attached in a separate file.
If you need any other information, please let me know.
components: IO, Interpreter Core
title: segmentation fault in pure-python multi-threaded server
versions: Python 3.2
Added file: http://bugs.python.org/file23270/backtrace
Python tracker <report at bugs.python.org>