[Python-Dev] getcode() function in pyexpat.c

Martin v. Loewis martin@mira.cs.tu-berlin.de
Mon, 22 Jan 2001 19:36:16 +0100


> A few comments to explain this highly stylized and macro-laden code
> would be appreciated.

I probably can't do that before 2.1a1, but I promise to suggest
something right afterwards.

In general, the macro magic is designed to make the many expat
callbacks available to Python. RC_HANDLER (for return code) is the
most general template; VOID_HANDLER and INT_HANDLER are common
specializations. At the core of RC_HANDLER, a tuple is built and
a Python function is called.
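
As a rough analogy only (this is Python, not the C macros, and every
name below is made up for illustration), the template idea looks like
this: one general factory builds the argument tuple and calls the
registered callback, and two thin specializations fix how the result
is treated.

```python
def rc_handler(name, build_args, convert_result, default):
    """General template (hypothetical analogue of RC_HANDLER): build the
    argument tuple, call the callback registered under `name`, and
    convert its result into a return code."""
    def handler(callbacks, *raw):
        func = callbacks.get(name)
        if func is None:
            # No Python callback registered for this event.
            return default
        args = build_args(*raw)          # the tuple of Python arguments
        return convert_result(func(*args))
    handler.__name__ = name
    return handler


def void_handler(name, build_args):
    """Specialization whose result is ignored (cf. VOID_HANDLER)."""
    return rc_handler(name, build_args, lambda result: None, None)


def int_handler(name, build_args):
    """Specialization whose result is converted to an int (cf. INT_HANDLER)."""
    return rc_handler(name, build_args, int, 0)


# Illustrative handlers; the argument shapes are assumptions of this sketch.
character_data = void_handler("CharacterData",
                              lambda data, length: (data[:length],))
not_standalone = int_handler("NotStandalone", lambda: ())

received = []
callbacks = {"CharacterData": received.append, "NotStandalone": lambda: 1}
character_data(callbacks, "hello world", 5)
status = not_standalone(callbacks)
```

The point of the single template is that argument building and the
call itself live in one place, so a change such as call_with_frame
only has to be made once.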

The code used to do PyEval_CallObject right inside the macro; the
call_with_frame feature is new compared to 2.0. It solves the specific
problem of incomprehensible tracebacks.

In a typical SAX application, the user code calls
expatreader.ExpatParser.parse, which in turn calls 

            self._parser.Parse(data, isFinal)

Now, in 2.0, a common problem was a traceback ending in

            self._parser.Parse(data, isFinal)
TypeError: not enough arguments; expected 4, got 2

Everybody assumes a problem in the call to Parse; the real problem is
in the call to the callback inside RC_HANDLER, which tried to call,
with two arguments, a user's function that expected four.

2.1 would improve this slightly on its own, writing

            self._parser.Parse(data, isFinal)
TypeError: characters() takes exactly 4 arguments (2 given)

With the call_with_frame code, you get

  File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 81, in feed
    self._parser.Parse(data, isFinal)
  File "pyexpat.c", line 379, in CharacterData
TypeError: characters() takes exactly 4 arguments (2 given)

So that tells you that it is the CharacterData handler that invokes
characters(). You are right that the frame object is not used
otherwise; it is just there to make a nice traceback.
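
To see the effect from the Python side, here is a minimal sketch,
assuming a current CPython with the standard xml.parsers.expat module.
The characters() callback and the parse_and_report() helper are
hypothetical; the callback deliberately has the wrong signature. The
TypeError still surfaces from Parse, and the traceback entries can be
listed to check whether an extra entry for the C-level handler shows
up (exact file names and entries depend on the interpreter and build).

```python
import traceback
import xml.parsers.expat


def characters(data, start, length, extra):
    """Hypothetical user callback that wrongly expects four arguments;
    pyexpat passes a CharacterDataHandler only the text."""


def parse_and_report(xml_text):
    """Parse xml_text and return (error message, traceback entries)."""
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = characters
    try:
        parser.Parse(xml_text, True)
    except TypeError as exc:
        # Each entry is (file name, function name), outermost call first.
        frames = [(frame.filename, frame.name)
                  for frame in traceback.extract_tb(exc.__traceback__)]
        return str(exc), frames
    return None, []


message, frames = parse_and_report("<doc>some text</doc>")
print(message)
for filename, name in frames:
    print(filename, name)
```

The first entry is the Python code that called Parse; any entry after
it comes from the handler machinery and not from user code.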

> I simply don't understand what's going on -- and I'm deeply
> suspicious that it is the source of whatever problems Tim is seeing
> with test_sax.

I thought so, too, at first; it turned out that the problem was
elsewhere.

Regards,
Martin