[Python-bugs-list] [Bug #110639] test_fork1 hangs with 1.6a2 on Linux (PR#296)

noreply@sourceforge.net noreply@sourceforge.net
Mon, 14 Aug 2000 06:50:47 -0700


Bug #110639, was updated on 2000-Jul-31 14:09
Here is a current snapshot of the bug.

Project: Python
Category: Core
Status: Closed
Resolution: Works For Me
Bug Group: 3rd Party
Priority: 7
Summary: test_fork1 hangs with 1.6a2 on Linux (PR#296)

Details: Jitterbug-Id: 296
Submitted-By: oli@andrich.net
Date: Thu, 13 Apr 2000 15:18:07 -0400 (EDT)
Version: 1.6a2
OS: Linux Mandrake 2.2.14-mdksmp


Hi,

I am currently evaluating Python 1.6 on my machine in order to be
up to date with the Linux distribution as soon as 1.6 final is released.

The test suite runs fine, except for one test that fails completely: test_fork1.
When I run the test, it occasionally runs fine (very seldom). Most of
the time it hangs and uses nearly 100% of CPU time. The loop in the child
process completes successfully; n is 0, as it ought to be. But the
os.waitpid call hangs indefinitely. This is strange,
because I can't reproduce this in an equivalent C code snippet. And without
creating the threads, it works fine and doesn't hang at all.
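
A minimal sketch of the pattern described above, written in present-day
Python rather than 1.6a2: start some worker threads, fork, let the child exit
at once, and wait for it in the parent. This is not the actual test_fork1
source; the names NUM_THREADS, worker, and fork_with_threads, and the choice of
four threads, are assumptions made for illustration. Recent Python versions may
emit a DeprecationWarning when a multi-threaded process forks.

```python
import os
import threading
import time

NUM_THREADS = 4  # assumption: any small number of worker threads will do


def worker(stop_event):
    # Keep the thread alive until the parent asks it to stop.
    while not stop_event.is_set():
        time.sleep(0.01)


def fork_with_threads():
    """Fork while worker threads are running; return True if the child is reaped cleanly."""
    stop_event = threading.Event()
    threads = [threading.Thread(target=worker, args=(stop_event,))
               for _ in range(NUM_THREADS)]
    for t in threads:
        t.start()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately, skipping all interpreter cleanup.
            os._exit(0)
        # Parent: blocking wait. This is the call reported to hang.
        reaped, status = os.waitpid(pid, 0)
        return (reaped == pid
                and os.WIFEXITED(status)
                and os.WEXITSTATUS(status) == 0)
    finally:
        # Only the parent reaches this point; stop and join the workers.
        stop_event.set()
        for t in threads:
            t.join()
```

On a system without the problem, fork_with_threads() should return True
promptly; a hang in os.waitpid would match the behaviour reported here.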

Any idea? I am running a 2.2.14 Linux kernel on an SMP machine (non-SMP machines show
the same behaviour), with glibc 2.1.3.

Bye, Oliver



====================================================================
Audit trail:
Thu Apr 13 15:55:33 2000	fdrake	sent reply 1
Thu Apr 13 15:55:45 2000	fdrake	moved from incoming to open

Follow-Ups:

Date: 2000-Jul-31 14:09
By: none

Comment:
From: Fred L. Drake, Jr. <bugs-py@python.org>
Subject: Re: test_fork1 hangs with 1.6a2 on Linux (PR#296)
Date: Thu Apr 13 15:55:33 2000

Oliver,
  We're aware of it, but haven't figured out the exact problem yet.  If you can
provide additional information on the observed behavior, that would be good.  I
get three different behaviors on a Mandrake 7.1 SMP machine: works, segfaults,
and blocks.  I'm suspecting a pthread implementation problem, but haven't had
enough time to really dig into it sufficiently to be sure.


  -Fred

-------------------------------------------------------

Date: 2000-Jul-31 14:09
By: none

Comment:
From: Oliver Andrich <oli@rz-online.net>
Subject: Re: test_fork1 hangs with 1.6a2 on Linux (PR#296)
Date: Thu, 13 Apr 2000 22:26:08 +0200

Hi,

Here is what I have discovered so far:

    - The normal optimizations for Mandrake packages result in a sure segfault.
    - The normal Python optimizations (just calling make) cause hangs; it only
      sometimes works.

A look at ps ax shows that there are 6 active python "processes" (the
parent process and 5 threads) and one defunct python process (the child). So
the child terminates correctly, as I already mentioned. But os.waitpid
doesn't discover that the child has already exited.
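
One way to tell "the child is still running" apart from "the child has exited
but is never reaped" is to poll with os.WNOHANG instead of blocking. The
following is a hedged sketch in present-day Python; the helper name
wait_with_timeout and its timeout and interval parameters are hypothetical and
are not part of the test suite.

```python
import os
import time


def wait_with_timeout(pid, timeout=5.0, interval=0.05):
    """Poll for the child's exit.

    Returns the exit code, a negative signal number if the child was
    killed by a signal, or None if it was not reaped before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        # WNOHANG makes waitpid return (0, 0) if the child has not exited yet.
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

If this returned None while ps already listed the child as defunct, that would
point at the parent's wait path rather than at the child.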

Things I will do tonight:

    - write a complete C version of the test code
    - compile python against another thread library
    - compile python against the most recent glibc snapshot
    - compile python on a RedHat 6.1 system

Hopefully I will get some more insights. Sadly, I am not good at C debugging, as I
am a Python coder by choice and a C coder who wrote his last C code five
years ago (a real program, not just a Python extension ;-).

Best regards,

    Oliver 



-------------------------------------------------------

Date: 2000-Jul-31 14:09
By: none

Comment:
From: Oliver Andrich <oli@rz-online.net>
Subject: Re: test_fork1 hangs with 1.6a2 on Linux (PR#296)
Date: Thu, 13 Apr 2000 23:38:08 +0200

Hi,

I just checked the following two things. I patched the threading to work with
GNU Pth and got the same results. When I run the test script with Python 1.5.2
on the same machine, that is, with the same glibc and so on, it works fine. I
compared the threading code of Python 1.5.2 and 1.6a2 and it doesn't seem to
differ at all. The same goes for the posix code, as far as it is relevant for the test.

I am a little bit puzzled by this.

Best regards,

    Oliver



-------------------------------------------------------

Date: 2000-Aug-11 06:49
By: fdrake

Comment:
I do not currently have ready access to an SMP machine, so I can't tell if this is fixed or not.  I cannot reproduce this using a uniprocessor running Mandrake 7.1 (kernel 2.2.15-4mdk).

This bug may also be related to the use of the GNU Pth pthreads implementation; we should find out which threading library was being used.
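
As a rough illustration of how the threading library could be identified from
a present-day interpreter (these facilities postdate Python 1.6, so this is an
assumption about a modern setup, not something available to the reporter), the
sketch below reads sys.thread_info and, where glibc provides it, the
CS_GNU_LIBPTHREAD_VERSION configuration string. The function name
describe_thread_library is hypothetical.

```python
import os
import sys


def describe_thread_library():
    """Collect what the running interpreter reports about its thread library."""
    info = {
        "name": sys.thread_info.name,        # e.g. "pthread"
        "version": sys.thread_info.version,  # may be None if unknown
    }
    try:
        # glibc-specific; other platforms may not know this name.
        info["libpthread"] = os.confstr("CS_GNU_LIBPTHREAD_VERSION")
    except (AttributeError, ValueError, OSError):
        info["libpthread"] = None
    return info


if __name__ == "__main__":
    print(describe_thread_library())
```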
-------------------------------------------------------

Date: 2000-Aug-14 06:50
By: fdrake

Comment:
This does not appear to be a problem in the current code base, and may be an artefact of using the GNU Pth thread library; I cannot reproduce this on either a single or dual processor machine.

If you can still get this behavior from the CVS version of Python, please ask us to reopen this bug.
-------------------------------------------------------

For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=110639&group_id=5470