[Python-Dev] [Python-checkins] r43041 - python/trunk/Modules/_ctypes/cfield.c

M.-A. Lemburg mal at egenix.com
Mon Mar 20 20:29:55 CET 2006


Martin v. Löwis wrote:
> Fernando Perez wrote:
>> So I think M.A. is right on the money here with his statement.  Unless you
>> consider the Linux/64bit camp insignificant.  But if that is the case, it
>> might be worth putting a big statement in the 2.5 release notes indicating
>> "there is a good chance, if you use third party extensions and a 64 bit OS,
>> that this won't work for you".  Which will mean that a fraction of the
>> scientific userbase (a big, enthusiastic and growing community of python
>> users) will have to stick to 2.4.
> 
> It's more intricate than that. If certain extensions are of interest to
> the scientific community, I certainly expect the authors of them to fix
> them.
> 
> However, to "fix" the extension can still be done in different levels:
> you can easily fix it to not crash, by just honoring all compiler
> warnings that gcc produces. That doesn't mean you fully support
> Py_ssize_t, since you might not support collections with more than
> 2**31 elements. In these cases, the extension will not crash, but it
> will compute incorrect results if you ever encounter such a collection.

Sorry Martin, but such advice is simply not good enough.

You can't expect people to go chasing compiler warnings to
"fix" their code without at the same time giving them a
complete list of the APIs that changed - a list which is
needed for the manual inspection that is still required.

I really don't understand why you put so much effort into
trying to argue that the ssize_t patch isn't going to break
extensions or that fixing compiler warnings is good enough.

The interface to the Python API should not be a compiler
giving you warnings; it should be a document where users
(extension authors) can read about the changes, check
their code using grep, an editor or some other tool, and
then *know* that their code works, rather than relying on
compiler warnings which only *might* catch all the cases
where breakage occurs.
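
To make that concrete, here are two fragments - function names
made up - which a compiler will happily accept (the second one
after a well-meant cast) and which still break on a 64-bit
platform:

    #include <Python.h>

    /* Two hypothetical "fixed" fragments that still break on LP64
     * platforms, where Py_ssize_t is 8 bytes and int is 4.         */

    /* 1) No warning at all: the implicit Py_ssize_t -> int
     *    conversion is silent with default gcc flags, but truncates
     *    the length of any list with more than 2**31 elements.     */
    static int
    count_items(PyObject *list)
    {
        int n = PyList_GET_SIZE(list);   /* should be Py_ssize_t */
        return n;
    }

    /* 2) Warning silenced by a cast: PyString_AsStringAndSize()
     *    now writes a full Py_ssize_t through the pointer,
     *    overwriting whatever sits next to 'len' on the stack.     */
    static int
    payload_size(PyObject *obj)
    {
        char *data;
        int len;                         /* should be Py_ssize_t */

        if (PyString_AsStringAndSize(obj, &data,
                                     (Py_ssize_t *)&len) < 0)
            return -1;
        return len;
    }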

Like Fernando said, 64-bit is very quickly becoming the
de facto standard for Unix systems these days. It is becoming
hard to buy systems that don't support 64-bit and, of course,
people are eager to use the capabilities of the systems they
bought (whether they need them or not). If you look at the
market for root servers as a second example of 64-bit users,
next to the scientific community, several big hosting
companies have switched over to 64-bit completely. You can't
even get a 32-bit Linux install on those machines, because
they use custom kernels patched for their particular hardware.

Ignoring this is just foolish.

It is currently not even possible to get an overview of the
changed APIs. I consider such an overview documentation which
it is the responsibility of the patch authors to provide
(just like it always is).

Please provide such a list.

> Even on all these AMD64 Linux machines, this is still fairly unlikely,
> since you need considerably more than 16GiB main memory to do something
> useful on such collections - very few machines have that much memory
> today. Again, people who have such machines and need to run Python
> programs on that much data need to make sure that the extensions
> work for them. Sticking with 2.4 is not an option for these people,
> since 2.4 doesn't support such large collections anyway.
> 
> To really fix such extensions, I recommend building with either
> Microsoft C or Intel C. The Microsoft compiler is available free
> of charge, but runs on Windows only. It gives good warnings, and
> if you fix them all (in a careful way), your code should fully
> support 64 bits. Likewise for the Intel compiler: it is available
> for free only for a month, but it runs on Linux as well.
> 
> OTOH, I'm still doubtful how many extensions will be really affected
> by the change in the first place. Your code *breaks* with the change
> only if you implement the sequence or buffer protocols. I'm doubtful
> that this is an issue for most applications, since many extensions
> (I believe) work without implementing these protocols.
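
For the extensions that do implement them, the breakage is
subtle enough: the old int-based slot functions still compile
(with at most a warning that a cast makes go away), but they
misreport lengths on 64-bit builds. Roughly, with made-up
names:

    #include <Python.h>

    /* Old-style slot written for Python 2.4, where sq_length
     * returned an int.  In 2.5 the slot type is lenfunc, which
     * returns Py_ssize_t; on a 64-bit build the interpreter reads
     * a 64-bit return value, so the upper half is whatever
     * happened to be left in the register.                        */
    static int
    mytype_length(PyObject *self)    /* should return Py_ssize_t */
    {
        return 42;                   /* hypothetical length */
    }

    static PySequenceMethods mytype_as_sequence = {
        (lenfunc)mytype_length,      /* sq_length - the cast hides
                                        the compiler warning */
        /* ... remaining slots omitted ... */
    };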

You know that it's not only the sequence and buffer protocols
that changed.

If you grep through Include, you find these references to
output variables which are going to cause breakage regardless
of whether you use 64-bit or not, simply because the function
writes into memory owned by the caller (a sketch of the
required caller-side fix follows the listing):

./dictobject.h:
-- PyAPI_FUNC(int) PyDict_Next(
--      PyObject *mp, Py_ssize_t *pos, PyObject **key, PyObject **value);
./pyerrors.h:
-- PyAPI_FUNC(int) PyUnicodeEncodeError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_GetStart(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeEncodeError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeDecodeError_GetEnd(PyObject *, Py_ssize_t *);
-- PyAPI_FUNC(int) PyUnicodeTranslateError_GetEnd(PyObject *, Py_ssize_t *);
./sliceobject.h:
-- PyAPI_FUNC(int) PySlice_GetIndices(PySliceObject *r, Py_ssize_t length,
--                                    Py_ssize_t *start, Py_ssize_t *stop,
--                                    Py_ssize_t *step);
-- PyAPI_FUNC(int) PySlice_GetIndicesEx(PySliceObject *r, Py_ssize_t length,
--                                      Py_ssize_t *start, Py_ssize_t *stop,
--                                      Py_ssize_t *step, Py_ssize_t *slicelength);
./stringobject.h:
-- PyAPI_FUNC(int) PyString_AsStringAndSize(
--     register PyObject *obj,  /* string or Unicode object */
--     register char **s,               /* pointer to buffer variable */
--     register Py_ssize_t *len /* pointer to length variable or NULL
--                                 (only possible for 0-terminated
--                                 strings) */
--     );
./abstract.h:
--      PyAPI_FUNC(int) PyObject_AsCharBuffer(PyObject *obj,
--                                        const char **buffer,
--                                        Py_ssize_t *buffer_len);
--      PyAPI_FUNC(int) PyObject_AsReadBuffer(PyObject *obj,
--                                        const void **buffer,
--                                        Py_ssize_t *buffer_len);
--      PyAPI_FUNC(int) PyObject_AsWriteBuffer(PyObject *obj,
--                                         void **buffer,
--                                         Py_ssize_t *buffer_len);
./unicodeobject.h:
-- PyAPI_FUNC(PyObject*) PyUnicode_DecodeUTF8Stateful(
--     const char *string,      /* UTF-8 encoded string */
--     Py_ssize_t length,               /* size of string */
--     const char *errors,              /* error handling */
--     Py_ssize_t *consumed             /* bytes consumed */
--     );
./ceval.h:
-- PyAPI_FUNC(int) _PyEval_SliceIndex(PyObject *, Py_ssize_t *);
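
For every one of these, callers have to switch the output
variable over to Py_ssize_t. The usual way to keep such code
building against 2.4 as well is a compatibility block along
the lines of what PEP 353 suggests - a sketch, with a
hypothetical caller:

    #include <Python.h>
    #include <limits.h>

    /* Compatibility block along the lines of PEP 353, so the same
     * source builds against Python 2.4 (no Py_ssize_t) and 2.5.   */
    #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
    typedef int Py_ssize_t;
    #define PY_SSIZE_T_MAX INT_MAX
    #define PY_SSIZE_T_MIN INT_MIN
    #endif

    /* Hypothetical caller of one of the APIs listed above: the
     * output variable must be a Py_ssize_t, not an int, because
     * the callee writes a full Py_ssize_t through the pointer.    */
    static int
    dump_buffer_len(PyObject *obj)
    {
        const char *buf;
        Py_ssize_t buf_len;              /* was: int buf_len; */

        if (PyObject_AsCharBuffer(obj, &buf, &buf_len) < 0)
            return -1;
        /* ... use buf / buf_len ... */
        return 0;
    }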

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 20 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

