[Python-bugs-list] [Bug #113785] SIGSEGV in PyDict_SetItem

noreply@sourceforge.net noreply@sourceforge.net
Thu, 25 Jan 2001 23:11:09 -0800


Bug #113785, was updated on 2000-Sep-07 01:51
Here is a current snapshot of the bug.

Project: Python
Category: Python Interpreter Core
Status: Closed
Resolution: Works For Me
Bug Group: 3rd Party
Priority: 5
Submitted by: nobody
Assigned to : gvanrossum
Summary: SIGSEGV in PyDict_SetItem

Details: On a PC running FreeBSD 3.5
Python 1.5.2 (#1, Aug 23 2000, 10:14:09)  [GCC 2.95 19990728 (release)] on freebsd3
I installed Python to run a filter that comes with inn 2.3.0 (Usenet news).

Sometimes the filter written in Python hangs with the following (from gdb):
Program received signal SIGSEGV, Segmentation fault.
0x810465b in PyDict_SetItem (op=0x83c6a80, key=0x83d4020, value=0x0)
    at dictobject.c:375
375             Py_INCREF(value);

I suspect that the Python filter may be 
#
# $Id: filter_innd.py,v 1.2 1999/09/23 14:24:23 kondou Exp $
#
# This is a sample filter for the Python innd hook.
#
# For details, see the file README.python_hook that came with INN.
#

import re
from string import *

# This looks weird, but creating and interning these strings should
# let us get faster access to header keys (which innd also interns) by
# losing some strcmps under the covers.
Approved = intern("Approved");           Control = intern("Control")
Date = intern("Date");                   Distribution = intern("Distribution")
Expires = intern("Expires");             From = intern("From")
Lines = intern("Lines");                 Message_ID = intern("Message-ID")
Newsgroups = intern("Newsgroups");       Path = intern("Path")
Reply_To = intern("Reply-To");           Sender = intern("Sender")
Subject = intern("Subject");             Supersedes = intern("Supersedes")
Bytes = intern("Bytes");                 Also_Control = intern("Also-Control")
References = intern("References");       Xref = intern("Xref")
Keywords = intern("Keywords");           X_Trace = intern("X-Trace")
NNTP_Posting_Host = intern("NNTP-Posting-Host")
Followup_To = intern("Followup-To");     Organization = intern("Organization")
Content_Type = intern("Content-Type");   Content_Base = intern("Content-Base")
Content_Disposition = intern("Content-Disposition")
X_Newsreader = intern("X-Newsreader");   X_Mailer = intern("X-Mailer")
X_Newsposter = intern("X-Newsposter")
X_Cancelled_By = intern("X-Cancelled-By")
X_Canceled_By = intern("X-Canceled-By"); Cancel_Key = intern("Cancel-Key")
__BODY__ = intern("__BODY__");           __LINES__ = intern("__LINES__")


class InndFilter:
    """Provide filtering callbacks to innd."""

    def __init__(self):
        """This runs every time the filter is loaded or reloaded.

        This is a good place to initialize variables and precompile
        regular expressions, or maybe reload stats from disk.
        """
        self.re_newrmgroup = re.compile('(?:new|rm)group\s')
        self.re_obsctl = re.compile('(?:sendsys|version|uuname)')
        # msgid  pattern from a once-common spambot.
        self.re_none44 = re.compile('none\d+\.yet>')
        # There is a mad newgrouper who likes to meow.
        self.re_meow = re.compile("^Meow\!", re.M)
        # One of my silly addresses.
        self.re_fluffymorph = re.compile("andruQ@myremarQ.coM", re.I)

    def filter_before_reload(self):
        """Runs just before the filter gets reloaded.

        You can use this method to save state information to be
        restored by the __init__() method or down in the main module.
        """
        syslog('notice', "filter_before_reload executing...")

    def filter_close(self):
        """Runs when innd exits.

        You can use this method to save state information to be
        restored by the __init__() method or down in the main module.
        """
        syslog('notice', "filter_close running, bye!")

    def filter_messageid(self, msgid):
        """Filter articles just by their message IDs.

        This method interacts with the IHAVE and CHECK NNTP commands.
        If you return a non-empty string here, the offered article
        will be refused before you ever have to waste any bandwidth
        looking at it.  This is not foolproof, so you should do your
        ID checks both here and in filter_art.  (TAKETHIS does not
        offer the ID for examination, and a TAKETHIS isn't always
        preceded by a CHECK.)
        """
        return ""               # deactivate the samples.
        
        if self.re_none44.search(msgid):
            return "But I don't like spam!"
        if msgid[0:8] == '<cancel.':
            return "I don't do cybercancels."

    def filter_art(self, art):
        """Decide whether to keep offered articles.

        art is a dictionary with a bunch of headers, the article's
        body, and innd's reckoning of the line count.  Items not
        in the article will have a value of None.

        The available headers are the ones listed near the top of
        innd/art.c.  At this writing, they are:

            Approved, Control, Date, Distribution, Expires, From,
            Lines, Message-ID, Newsgroups, Path, Reply-To, Sender,
            Subject, Supersedes, Bytes, Also-Control, References,
            Xref, Keywords, X-Trace, NNTP-Posting-Host, Followup-To,
            Organization, Content-Type, Content-Base,
            Content-Disposition, X-Newsreader, X-Mailer, X-Newsposter,
            X-Cancelled-By, X-Canceled-By and Cancel-Key.

        The body is the buffer in art['__BODY__'] and the INN-reckoned
        line count is held as an integer in art['__LINES__'].  (The
        Lines: header is often generated by the poster, and large
        differences can be a good indication of a corrupt article.)

        If you want to keep an article, return None or "".  If you
        want to reject, return a non-empty string.  The rejection
        string will appear in transfer and posting response banners,
        and local posters will see them if their messages are
        rejected.
        """
        return ""               # deactivate the samples.

        # catch bad IDs from articles fed with TAKETHIS but no CHECK.
        idcheck = self.filter_messageid(art[Message_ID])
        if idcheck:
            return idcheck

        # There are some control messages we don't want to process or
        # forward to other sites.
        try:
            if art[Control] is not None:
                if self.re_newrmgroup.match(art[Control]):
                    if self.re_meow.search(art[__BODY__]):
                        return "The fake tale meows again."
                    if art[Distribution] == buffer('mxyzptlk'):
                        return "Evil control message from the 10th dimension"
                if self.re_obsctl.match(art[Control]):
                    return "Obsolete control message"

            # If you don't know, you don't want to know.
            if self.re_fluffymorph.search(art[From]):
                return "No, you may NOT meow."
        except:
            # sys.exc_info is a function; call it to get the exception value.
            syslog('n', str(sys.exc_info()[1]))

    def filter_mode(self, oldmode, newmode, reason):
        """Capture server events and do something useful.

        When the admin throttles or pauses innd (and lets it go
        again), this method will be called.  oldmode is the state we
        just left, and newmode is where we are going.  reason is
        usually just a comment string.

        The possible values of newmode and oldmode are the four
        strings 'running', 'paused', 'throttled' and 'unknown'.
        Actually 'unknown' shouldn't happen, it's there in case
        feeping creatures invade innd.
        """
        syslog('notice', 'state change from %s to %s - %s'
               % (oldmode, newmode, reason))



"""
Okay, that's the end of our class definition.  What follows is the
stuff you need to do to get it all working inside innd.
"""

# This import must succeed, or your filter won't work.  I'll repeat
# that: You MUST import INN.
from INN import *


#   Some of the stuff below is gratuitous, just demonstrating how the
#   INN.syslog call works.  That first thingy tells the Unix syslogger
#   what severity to use; you can abbreviate down to one letter and
#   it's case insensitive.  Available levels are (in increasing levels
#   of seriousness) Debug, Info, Notice, Warning, Err, Crit, and
#   Alert.  If you provide any other string, it will be defaulted to
#   Notice.  You'll find the entries in the same log files innd itself
#   uses, with an 'innd: python:' prefix.
#
#   The native Python syslog module seems to clash with INN, so use
#   INN's.  Oh yeah -- you may notice that stdout and stderr have been
#   redirected to /dev/null -- if you want to print stuff, open your
#   own files.

try:
    import sys
except Exception, errmsg:
    syslog('Error', "import boo-boo: " + errmsg[0])


#     If you want to do something special when the server first starts
#     up, this is how to find out when it's time.

if 'spamfilter' not in dir():
    syslog ('n', "First load, so I can do initialization stuff.")
    #  You could unpickle a saved hash here, so that your hard-earned
    #  spam scores aren't lost whenever you shut down innd.
else:
    syslog ('NoTicE', "I'm just reloading, so skip the formalities.")


#  Finally, here is how we get our class on speaking terms with innd.
#  The hook is refreshed on every reload, so that you can change the
#  methods on a running server.  Don't forget to test your changes
#  before reloading!
spamfilter = InndFilter()
try:
    set_filter_hook(spamfilter)
    syslog('n', "spamfilter successfully hooked into INN")
except Exception, errmsg:
    syslog('e', "Cannot obtain INN hook for spamfilter: %s" % errmsg[0])



And the data would be something like:
        takethis <967664873.12710.1.nnrp-02.9e98fa8b@news.demon.co.uk>\r
        Path: news.nectec.or.th!news.loxinfo.co.th!news-out.cwix.com!newsfeed.\
        cwix.com!newsfeeds.belnet.be!news.belnet.be!newsgate.cistron.nl!bullse\
        ye.news.demon.net!demon!news.demon.co.uk!demon!inert.demon.co.uk!not-f\
        or-mail\r
        From: inert@inert.demon.co.uk\r
        Newsgroups: rec.music.industrial\r
        Subject: *** CLUB NOIR ***  31/8/00\r
        Date: Wed, 30 Aug 2000 19:47:53 GMT\r
        Organization: inertia\r
        Message-ID: <967664873.12710.1.nnrp-02.9e98fa8b@news.demon.co.uk>\r
        NNTP-Posting-Host: inert.demon.co.uk\r
        X-NNTP-Posting-Host: inert.demon.co.uk:158.152.250.139\r
        X-Trace: news.demon.co.uk 967664873 nnrp-02:12710 NO-IDENT inert.demon\
        .co.uk:158.152.250.139\r
        X-Complaints-To: abuse@demon.net\r
        X-Mailer: Mozilla 1.22 (Windows; I; 16bit)\r
        MIME-Version: 1.0\r
        Lines: 0\r
        Xref: news.nectec.or.th rec.music.industrial:137401\r
        \r
        .\r
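
For orientation: the filter_art docstring above says innd hands the filter a
dictionary of headers, with None for any header the article lacks, plus
'__BODY__' and '__LINES__'. The following is only a rough sketch of what that
dictionary might look like for this article. It is written for a current
Python 3 rather than the 1.5.2 in this report; build_art, HEADERS (a shortened
list) and RAW are hypothetical names for illustration, and the real innd builds
the dictionary in C and may pass different object types.

```python
# Hypothetical illustration only; this is not INN's code.
HEADERS = ["Approved", "Control", "From", "Newsgroups",
           "Subject", "Message-ID", "Lines"]  # shortened subset of the real list

RAW = (
    "From: inert@inert.demon.co.uk\r\n"
    "Newsgroups: rec.music.industrial\r\n"
    "Subject: *** CLUB NOIR ***  31/8/00\r\n"
    "Message-ID: <967664873.12710.1.nnrp-02.9e98fa8b@news.demon.co.uk>\r\n"
    "Lines: 0\r\n"
    "\r\n"
)


def build_art(raw, wanted=HEADERS):
    """Return a dict shaped like the one filter_art() is documented to get."""
    head, _, body = raw.partition("\r\n\r\n")
    art = dict.fromkeys(wanted)          # every wanted header starts as None
    for line in head.split("\r\n"):
        name, sep, value = line.partition(": ")
        if sep and name in art:          # ignore headers outside the list
            art[name] = value
    art["__BODY__"] = body
    art["__LINES__"] = body.count("\r\n")
    return art


art = build_art(RAW)
```

Here art["Control"] and art["Approved"] stay None because the article has no
such headers, which is the case the filter guards against with
"if art[Control] is not None".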

As a workaround, I can disable Python filtering, and then inn works fine.
But even if the filter were not doing what it is supposed to do, I presume
Python should not SIGSEGV.
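
To make that expectation concrete: in pure Python code a dictionary value is
always an object (None included), so mistakes normally show up as exceptions.
The value=0x0 in the gdb trace is a C NULL pointer, which is not the same thing
as None and would normally only reach PyDict_SetItem from C code calling the
API. A minimal sketch, assuming a current Python 3 rather than 1.5.2 (the names
art and outcome are just for illustration):

```python
# Sketch for a current Python 3 interpreter; the report itself concerns 1.5.2.
art = {}

# None is a real object, so storing it as a value is perfectly safe.
art["Control"] = None

# A missing key raises KeyError instead of crashing the interpreter.
try:
    art["Message-ID"]
except KeyError as exc:
    outcome = "KeyError: %s" % exc
```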

Besides, Python installed without problems, and the make test result is:
42 tests OK.
19 tests skipped: test_al test_audioop test_bsddb test_cd test_cl
test_crypt test_dbm test_dl test_gdbm test_gl test_gzip test_imageop
test_imgfile test_nis test_rgbimg test_sunaudiodev test_thread test_timing
test_zlib

For more info: on@cs.ait.ac.th

Thank you,

Olivier

Follow-Ups:

Date: 2001-Jan-25 23:11
By: meowing

Comment:
Not a Python bug.  An off-by-1 error had crept into innd; reported to
inn-bugs and fixed in December 2000.

FWIW, INN embeds Python, and the INN module is a C extension built into
innd.
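
Judging only from the filter source quoted in the report, the names it takes
from INN are syslog and set_filter_hook. For anyone wanting to exercise a
filter outside innd, a hypothetical stand-in module might look like the sketch
below. It assumes a current Python 3; make_fake_inn, DummyFilter and the hook
attribute are invented for illustration, and the real module is C code inside
innd that may export more.

```python
import sys
import types


def make_fake_inn(log):
    """Build a stand-in INN module exposing only syslog and set_filter_hook."""
    fake = types.ModuleType("INN")

    def syslog(level, message):
        # Record the call instead of talking to the system logger.
        log.append((level, message))

    def set_filter_hook(instance):
        # Remember the filter object so a test can call its methods later.
        fake.hook = instance

    fake.syslog = syslog
    fake.set_filter_hook = set_filter_hook
    fake.hook = None
    return fake


log = []
# Register the stand-in before the filter does its own import of INN.
sys.modules["INN"] = make_fake_inn(log)

from INN import syslog, set_filter_hook


class DummyFilter:
    """Placeholder filter; a real test would load the filter under test."""

    def filter_art(self, art):
        return ""


set_filter_hook(DummyFilter())
syslog("n", "hooked")
```

This cannot reproduce the segfault itself, since the faulty code lived in innd,
but it lets the Python side of a filter be checked in isolation.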

-------------------------------------------------------

Date: 2000-Sep-12 16:52
By: gvanrossum

Comment:
I'm closing this because I have not enough information, and I have no way
to reproduce the problem. If you provide more information I'll gladly
reopen it.

Some questions:
- How does inn invoke Python? Does it embed Python or does it run Python as
a subprocess? Is there any INN specific extension C code?
- What's in the INN module?

-------------------------------------------------------

Date: 2000-Sep-07 15:04
By: jhylton

Comment:
Please do triage on this bug.
-------------------------------------------------------

For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=113785&group_id=5470