[Python-porting] [RELEASED] six 1.1
Barry Warsaw
barry at python.org
Wed Nov 23 21:59:15 CET 2011
On Nov 23, 2011, at 10:55 AM, Benjamin Peterson wrote:
>2011/11/23 Barry Warsaw <barry at python.org>:
>> both the Python and C levels. I'll write up the details (e.g. __next__()
>> vs. next()) hopefully today, I've also found a few more traps and tricks for
>> extension modules. I wonder if you have any interest in adding some C level
>> portability helpers.
>
>You mean like a header file with macros for PyInt -> PyLong/PyString
>-> PyUnicode etc?
There are a bunch of little things I've found helpful while porting
dbus-python. I think some at least would be generally useful for extension
modules. Here's a quick summary (so far :). I should note first that I only
care about Python 2.6, 2.7, and 3.2. I think there was only one case where
2.6 didn't have what I needed.
I tried to reduce the number of #ifdefs in the code by converting some things
that can be made common between the two versions.
- In Python 2, I always #include <bytesobject.h> and unilaterally change all
PyString names to PyBytes names. That reduces a lot of the ugliness.
- I changed all the reprs to return unicodes in both Python versions instead
of conditionally continuing to return strings in Python 2. That reduced
another source of noise, but I had to use a little trick with
PyUnicode_FromFormat(). The reprs in this package embed the repr of the
parent class, but you don't know whether that will be a bytes (under Python
2) or a unicode (under Python 3). It was fairly ugly to ifdef around this,
so instead of using either the %s or %U codes wrapped in macros, I use the
%V code. Now, I'm not sure if that was added for this purpose, but it sure
is handy. The call sites look something like this now:
PyObject *parent_repr = (<baseclass>.tp_repr)(self);
PyObject *my_repr = PyUnicode_FromFormat("...%V...", REPRV(parent_repr));
and the macro looks like this:
#define REPRV(obj) \
(PyUnicode_Check(obj) ? (obj) : NULL), \
(PyUnicode_Check(obj) ? NULL : PyBytes_AS_STRING(obj))
I suppose technically this could crash if the parent repr (erroneously)
returned a non-string, but in my case, that won't happen because the base
classes are standard Python types, or otherwise well-controlled.
Additional compatibility macros and functions:
- I really dislike writing "#if PY_MAJOR_VERSION >= 3" all over the place, so
I define the following macro to make the version test easier:
#if PY_MAJOR_VERSION >= 3
#define PY3K
#endif
Now all I need are "#ifdef PY3K" sprinkles. Okay, maybe it's a minor
savings, but I've found it helpful.
- dbus defines subclasses of PyInts and PyLongs. When porting to Python 3,
all of these have to become subclasses of PyLongs, however for some of
them, the exact hierarchy doesn't matter so much, so I've switched them to
use PyLongObjects.
Python 3.0 had a <intobject.h> compatibility header which I think would
have been nice, but that's gone in Python 3.2.
In Python 2, PyLongObject isn't defined unless you also #include
<longintrepr.h>. <Python.h> isn't enough.
- The extension module interns a couple of strings. In Python 2 this is
PyString_InternFromString while in Python 3 it's
PyUnicode_InternFromString. I have the following macro for this:
#ifdef PY3K
#define INTERN PyUnicode_InternFromString
#else
#define INTERN PyString_InternFromString
#endif
- There are several places where PyArg_Parse*() wants to get a char*.
Under Python 2, these just provide "s" codes and get passed a PyString.
Under Python 3, I decided to allow either a bytes object or a utf-8 encoded
unicode, but I always want to coerce it to a bytes internally, making it
easy to extract the char*.
I decided to switch the "s" codes to O& codes and add the following
converter function:
#ifdef PY3K
#define RETURN_CLEANUP Py_CLEANUP_SUPPORTED
#else
#define RETURN_CLEANUP 1
#endif
int
dbus_parse_bytes(PyObject *object, void *address)
{
    PyObject *bytes;
    Py_ssize_t size;
    void *data;

    if (!object) {
        /* This is Python having a parse error, so free our reference. */
        Py_CLEAR(*(PyObject **)address);
        return 1;
    }
    if (PyBytes_Check(object)) {
        bytes = object;
        Py_INCREF(bytes);
    }
    else {
        if (!(bytes = PyUnicode_AsUTF8String(object)))
            return 0;
    }
    /* Embedded NULs are not allowed in dbus. */
    size = PyBytes_GET_SIZE(bytes);
    data = PyBytes_AS_STRING(bytes);
    if (size != (Py_ssize_t)strlen(data)) {
        PyErr_SetString(PyExc_TypeError, "embedded NUL character");
        Py_DECREF(bytes);
        return 0;
    }
    *(PyObject**)address = bytes;
    return RETURN_CLEANUP;
}
I think there's a potential for leaking these args under Python 2 when
subsequent parse codes fail, because Py_CLEANUP_SUPPORTED isn't defined.
I'm not sure there's anything that can be done about it, so hopefully it's
rare enough not to matter in practice.
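For anyone who finds the C hard to follow, here is a rough Python 3 model of
what the converter above accepts and rejects. It is only an illustration, not
code from dbus-python: the name parse_bytes is made up, and it ignores the
reference counting and cleanup protocol that the C version has to handle.

```python
def parse_bytes(obj):
    """Hypothetical Python 3 model of the dbus_parse_bytes() C converter."""
    if isinstance(obj, bytes):
        # Bytes are taken as-is, like the PyBytes_Check() branch.
        data = obj
    elif isinstance(obj, str):
        # Unicode is coerced to UTF-8 bytes, like PyUnicode_AsUTF8String().
        data = obj.encode("utf-8")
    else:
        raise TypeError("expected bytes or str")
    # The C code compares the object size against strlen(); checking for a
    # NUL byte anywhere in the data is the equivalent test here.
    if b"\0" in data:
        raise TypeError("embedded NUL character")
    return data
```

Either way the caller ends up with a bytes object, which is what makes pulling
out the char* straightforward on the C side.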
Things I haven't macro'd around:
- PyCapsule vs PyCObject; I just #ifdef around the whole block of code.
- A number of places want to check if something's a PyInt or a PyLong. The
PyInt checks can't be performed under Python 3, so I have some rather ugly
#ifdefs sprinkled in various conditionals. (I suppose I could no-op
PyInt_Check under Python 3).
- Py_TPFLAGS_HAVE_WEAKREFS doesn't exist in Python 3 so I have to ifdef
around setting the flags. It might be nice if that was no-op'd in the
compatibility header.
- The changes to module inits are just a pain. I'm not sure there's really
anything you can do to make it nicer. The C porting guides both on
python.org and on python3porting.com provide some strategies, and I rolled
my own slightly different approach based on those examples.
I did define this:
#ifdef PY3K
#define RETURN_INITERROR return NULL
#else
#define RETURN_INITERROR return
#endif
just to make error condition returns a little easier to write.
- Py_BuildValue() does not have a "y" code in Python 2, so you basically have
to ifdef around that.
A few more things I ran across at the Python level:
- For the Python code, I wanted to avoid 2to3, and was mainly successful with
some liberal sprinkling of sys.version_info.major checks, and __future__
imports (e.g. print_function, unicode_literals, and absolute_import).
Many of these might be nicer with your six module.
- Long literals (i.e. trailing 'L's) are a pain.
- Metaclasses are a huge pain because the Python 3 syntax prevents
compilation in Python 2, so you can't use sys.version_info.major checks
alone. Looks like six has a nice helper for this; I ended up using exec,
but I think both cases would be rather painful if the derived class were
anything more than a `pass` in the body.
- iteritems() and friends are a pain. In my case, I think they just weren't
very useful, so I switched everything back to items() and such.
- Similarly with xrange().
- isSequenceType() is gone in Python 3.
- Dealing with __next__() vs. next() methods.
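To make a couple of the items above concrete, here is a minimal sketch of
source that should behave the same on Python 2 and 3 without 2to3. It is one
possible approach, not what dbus-python or six actually does: Meta, Base,
Derived and Countdown are made-up names. The metaclass is applied by calling
it directly, which avoids both the Python 3-only class syntax and exec, and
the iterator defines __next__() with a next alias for Python 2.

```python
class Meta(type):
    """Hypothetical metaclass that tags every class it creates."""
    def __new__(mcs, name, bases, namespace):
        cls = super(Meta, mcs).__new__(mcs, name, bases, namespace)
        cls.tagged = True
        return cls


# Calling the metaclass directly is valid syntax in both major versions.
# Subclasses of Base then pick up Meta through normal inheritance, so the
# derived class can have an arbitrarily large body.
Base = Meta("Base", (object,), {})


class Derived(Base):
    def hello(self):
        return "hello"


class Countdown(object):
    """Iterator yielding start, start - 1, ..., 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 looks up next() rather than __next__().
    next = __next__
```

With this, list(Countdown(3)) works through whichever method the running
interpreter looks up, and Derived gets its metaclass without any
version-specific syntax.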
Anyway, that's everything I kept notes on.
Cheers,
-Barry