Python in C
skip at pobox.com
Tue Dec 30 03:32:45 CET 2008
thmpsn> 1. Can anyone explain to me what kind of program structuring
thmpsn> technique (which paradigm, etc) CPython uses? How do modules
thmpsn> interact together? What conventions does it use?
It's quite object-oriented once you understand how things are done. Take a
look, for example, at the implementation of floating point numbers:
BTW, as a person who hasn't really written a stitch of C++ in about 10 years
I personally find the CPython implementation to be one of the most
well-organized large pieces of code I have ever encountered. It's much
easier to read (to me) than any significant piece of C++ code I have ever
tried to read.
Here are a few things which might help you understand the code structure:
* The Python parser is generated from a specification. Look in
.../Parser and .../Grammar/Grammar.
* The code for most objects generally has a single entry point. For
floating point objects it's _PyFloat_Init. The leading underscore
tells the world "we needed to export this symbol, but keep your hands
off it". Objects like floats and ints tend to have several other
exported functions (search for "Py" at the beginning of a line) which
are used by module writers. Objects implemented as extension modules
(look in .../Modules/*.c) have a single (static) entry point,
init_<mod>. The runtime dlopen's the so/dll file and calls that function.
* Since C doesn't offer transparent method tables, the table is explicit in an
object's code. Referring again to the float object code, look for the
type specifier ("PyFloat_Type") and the method "dict"
("float_methods"). Note the comments at the end of the lines defining
PyFloat_Type. They describe the use of each slot. Float objects
aren't sequences so the tp_as_sequence is NULL. Similarly, objects
such as lists which don't implement numeric methods have a NULL
tp_as_number.
* The byte code compiler is in Python/compile.c.
* The runtime interpreter is in Python/ceval.c.
* Nothing requires a module to implement classes or types. Look at
Python/sysmodule.c and Modules/mathmodule.c for two examples. Both
modules export many functions but define no new types.
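The slot tables mentioned in the list above can also be observed from the Python side. This is only a rough sketch, assuming CPython 3, where a filled-in C slot shows up as a special method on the type. The helper name has_special is made up for this example and is not part of CPython.

```python
def has_special(tp, name):
    """Return True if the type or one of its bases defines the special method."""
    return any(name in vars(klass) for klass in tp.__mro__)

# float fills in its numeric table but leaves tp_as_sequence NULL,
# so it has __add__ but no __len__.
float_numeric = has_special(float, "__add__")
float_sequence = has_special(float, "__len__")

# list fills in its sequence table but has no numeric table,
# so it has __len__ but no __sub__. (list's __add__ exists, but it
# comes from the sequence concatenation slot, not a numeric one.)
list_sequence = has_special(list, "__len__")
list_subtract = has_special(list, "__sub__")

print(float_numeric, float_sequence, list_sequence, list_subtract)
```

Checking vars() along the MRO rather than using hasattr() keeps the lookup limited to what the type objects themselves define.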
Part of the complexity you might be stumbling on is due to the fact that
Python is a mature application and so many parts of the implementation have
been fine-tuned. Python "eats its own dog food", so for example the dict
implementation you use in your scripts is also used by the runtime virtual
machine to implement namespaces of all sorts (instance, class and module
dicts, for example). It has been heavily optimized. Take a look at
Objects/dictnotes.txt. Also, observations about the data use of many Python
programs (and the C runtime itself) have led to a number of optimizations
such as the int and float free lists and the custom object allocator
implemented in Objects/obmalloc.c.
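The point about dicts backing namespaces can be checked from Python itself. The following is a small sketch, assuming CPython 3. The Point class is a made-up example, not something from the CPython source.

```python
import math
import types


class Point:
    """Hypothetical class used only for illustration."""
    scale = 2

    def __init__(self, x):
        self.x = x


p = Point(3)

module_ns = vars(math)    # module namespace: a plain dict
class_ns = vars(Point)    # class namespace: a read-only proxy around a dict
instance_ns = vars(p)     # instance namespace: a plain dict

# Writing to the instance dict has the same effect as setting an attribute.
instance_ns["y"] = 4

print(type(module_ns).__name__, type(class_ns).__name__, type(instance_ns).__name__)
print(p.y, class_ns["scale"], module_ns["pi"] == math.pi)
```

The class namespace is exposed through a proxy so that scripts cannot bypass the type machinery, but underneath it is the same dict implementation.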
thmpsn> 2. Have there been any suggestions in the past to rewrite
thmpsn> Python's mainstream implementation in C++ (or why wasn't it
thmpsn> done this way from the beginning)?
C++ was not nearly widely enough available when Python was first written in
the late-80s/early-90s. Today there is no particular reason to rewrite it.
If you want to incorporate externally written C++ code into Python you can
do that either manually or using tools such as SWIG, SIP or Boost.Python.
Skip Montanaro - skip at pobox.com - http://smontanaro.dyndns.org/