Mailman 3 Stabilizing the C API of 2.6 and 3.0 - Python-Dev

Stabilizing the C API of 2.6 and 3.0

Christian Heimes

May 25, 2008

2:59 p.m.

Hello!

The first set of betas of Python 2.6 and 3.0 is fast approaching. I'd like to grab the final chance to clean up the C API of 2.6 and 3.0. I know, I know, I have brought up the topic twice in the past. But this time I mean it for real! :]

Last time Guido said:

---
I think it can actually be simplified. I think maintaining binary compatibility between 2.6 and earlier versions is hopeless anyway, so we might as well just rename PyString to PyBytes in 2.6 and 3.0, and have an extra set of macros so that code using PyString needs to be recompiled but not otherwise touched. E.g.

    typedef { ... } PyBytesObject;
    #define PyStringObject PyBytesObject

    ... PyString_Type;
    #define PyBytes_Type PyString_Type

<etc>
---

I'd like to follow Guido's advice and change the code as follows:

* replace PyBytes_ with PyByteArray_
* replace PyString_ with PyBytes_
* rename bytesobject.[ch] to bytearrayobject.[ch]
* rename stringobject.[ch] to bytesobject.[ch]
* add a new file stringobject.h which contains the aliases PyString_ -> PyBytes_

Christian
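As a rough illustration of the macro idea in Guido's quote, the real definitions could carry the new names while the old spellings survive only as preprocessor aliases. This is a minimal sketch, not CPython code: every `Demo*` name and `legacy_length` are invented stand-ins for the PyString/PyBytes symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the renamed object type. */
typedef struct {
    size_t size;
    char data[32];
} DemoBytesObject;

/* The real implementations carry the new names... */
size_t DemoBytes_Size(const DemoBytesObject *b)
{
    return b->size;
}

void DemoBytes_FromString(DemoBytesObject *b, const char *s)
{
    size_t n;

    assert(b != NULL && s != NULL);
    n = strlen(s);
    if (n >= sizeof b->data)
        n = sizeof b->data - 1;   /* truncate: this is only a demo buffer */
    memcpy(b->data, s, n);
    b->data[n] = '\0';
    b->size = n;
}

/* ...and the old spellings exist only for the preprocessor. */
#define DemoStringObject      DemoBytesObject
#define DemoString_Size       DemoBytes_Size
#define DemoString_FromString DemoBytes_FromString

/* "Old" code compiles untouched, but it has to be recompiled, because
   no symbol named DemoString_Size exists in the object file. */
size_t legacy_length(const char *s)
{
    DemoStringObject obj;

    DemoString_FromString(&obj, s);
    return DemoString_Size(&obj);
}
```

With this direction of aliasing, source compatibility is kept while binary compatibility is not, which matches the "needs to be recompiled but not otherwise touched" wording in the quote.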

Brett Cannon

May 2008

10:02 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On Sun, May 25, 2008 at 7:59 AM, Christian Heimes <lists@cheimes.de> wrote:

...

+1 from me. -Brett

Benjamin Peterson

12:35 p.m.

On Sun, May 25, 2008 at 9:59 AM, Christian Heimes <lists@cheimes.de> wrote:

...

+1 Do you need any help?

...

Christian

-- Cheers, Benjamin Peterson "There's no place like 127.0.0.1."

Christian Heimes

1:43 p.m.

Benjamin Peterson schrieb:

...

I've renamed the functions and modules. Can you help me with updating the C API docs? In Python 2.6 the docs must still use PyString but you can add a note that PyBytes_ works, too. Christian

Andrew MacIntyre

1:10 p.m.

Christian Heimes wrote:

...

On the subject of stabilising the API, I assigned issue 2862 to you concerning tidying up freelist management interfaces for ints and floats (http://bugs.python.org/issue2862). Note that the patch in issue 2862 is essentially orthogonal to the patch in issue 2039, although any int/float freelist implementation changes would require amendments.

Additionally, I notice that not all of the types with free lists have grown routines to clear them - dicts, lists and sets are missing these routines. I will add a patch for these in the next few days if no-one else gets there first.

On the subject of issue 2039, I've come to the view that "explicit is better than implicit" applies to the freelist management, and with the addition of freelist clearing routines called from gc.collect() I see little reason to pursue bounding of freelist sizes (and would suggest removal of existing bounding code in those freelist implementations that currently have it).

I have also come to the view that pymalloc's automatic attempts to return empty arenas to the OS should be changed to an on-demand cleaning, called after all other cleanup in gc.collect(). Returning arenas, while not expensive in general, is nonetheless not free (in performance terms).

--
-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370
        andymac@pcug.org.au (alt)            |        Belconnen ACT 2616
Web:    http://www.andymac.org/              |        Australia
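For readers unfamiliar with the free list pattern under discussion, here is a minimal sketch of an unbounded free list with an explicit clearing routine of the kind a collector pass could call. The type and the names `demo_new`, `demo_dealloc` and `demo_clear_freelist` are hypothetical; the real int and float implementations are more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object type used only for this illustration. */
typedef struct demo_obj {
    struct demo_obj *next_free;   /* link used only while on the free list */
    double value;
} demo_obj;

static demo_obj *free_list = NULL;   /* unbounded: no size cap is enforced */

demo_obj *demo_new(double value)
{
    demo_obj *op = free_list;

    if (op != NULL) {
        free_list = op->next_free;   /* reuse a cached block */
    } else {
        op = malloc(sizeof *op);
        if (op == NULL)
            return NULL;
    }
    op->next_free = NULL;
    op->value = value;
    return op;
}

void demo_dealloc(demo_obj *op)
{
    assert(op != NULL);
    op->next_free = free_list;   /* cache the block instead of freeing it */
    free_list = op;
}

/* Explicit clearing hook; returns the number of blocks released. */
int demo_clear_freelist(void)
{
    int released = 0;

    while (free_list != NULL) {
        demo_obj *op = free_list;
        free_list = op->next_free;
        free(op);
        released++;
    }
    return released;
}
```

The design choice argued for above is that memory is handed back only when the clearing routine is called explicitly, rather than by capping the list length on every deallocation.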

M.-A. Lemburg

1:29 p.m.

On 2008-05-25 16:59, Christian Heimes wrote:

...

Since this is a major break in the Python C API, please make sure that you bump the Python C API level used for module imports. Most imports will fail anyway at the link stage, since PyString_* APIs are probably the most used C APIs in Python extensions.

One detail I'm worried about is the change of the type name, since that is sometimes used in object serialization or proxy implementations.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 26 2008)

...

2008-07-07: EuroPython 2008, Vilnius, Lithuania 41 days to go :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611
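To make the "C API level" request above concrete: an extension records the API number it was compiled against, and the interpreter can compare that number with its own at import time. The sketch below is only a hypothetical model of such a check; `DEMO_API_VERSION` and `demo_check_api` are invented names and values, not CPython's.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical API level the "interpreter" was built with. */
#define DEMO_API_VERSION 2

/* Returns 1 when the module's recorded level matches, otherwise writes
   a warning into msg and returns 0. */
int demo_check_api(const char *module, int module_api,
                   char *msg, size_t msg_len)
{
    assert(module != NULL && msg != NULL && msg_len > 0);
    if (module_api == DEMO_API_VERSION) {
        msg[0] = '\0';
        return 1;
    }
    snprintf(msg, msg_len,
             "API version mismatch for module %s: "
             "interpreter has %d, module has %d",
             module, DEMO_API_VERSION, module_api);
    return 0;
}
```

Bumping the level only helps for modules that reach this check; as noted above, a module that references symbols that no longer exist would already fail at the link stage.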

Christian Heimes

1:40 p.m.

M.-A. Lemburg schrieb:

...

Most imports will fail anyway at the link stage, since PyString_* APIs are probably the most used C APIs in Python extensions.

I think you have missed an important point. In Python 2.6 the names stay the same for the linker. Although the functions are now called PyBytes_Egg, they are redefined to PyString_Egg by a second header file. In Python 2.6 the renaming of PyString is purely for consistency with the new Python 3.0 names. The names for PyString stay the same for external code like the library and extension modules. PyBytes -> PyByteArray is a different story, though.
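A small sketch may help show why the linker-visible names do not change under this scheme. The names below (`DemoBytes_Size`, `DemoString_Size` and the helper macros) are made up for illustration and only model the mechanism described here, under the assumption that the aliasing header maps each new spelling to the old one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the aliasing header: the new spelling expands to the old one. */
#define DemoBytes_Size DemoString_Size

/* Helpers that turn a macro-expanded identifier into a string literal. */
#define DEMO_STR(x) #x
#define DEMO_XSTR(x) DEMO_STR(x)

/* The source is written with the new name, but after preprocessing the
   function that actually gets defined and exported is DemoString_Size. */
size_t DemoBytes_Size(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* What the compiler sees when core code spells the new name. */
const char *demo_compiled_name(void)
{
    return DEMO_XSTR(DemoBytes_Size);
}

/* Models an existing extension that only knows the old spelling. */
size_t demo_old_extension(const char *s)
{
    return DemoString_Size(s);
}
```

Because the preprocessor rewrites the new spelling before compilation, only the old name ends up in the symbol table, so code built against the old names keeps resolving.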

...

One detail I'm worried about is the change of the type name, since that is sometimes used in object serialization or proxy implementations.

The type names aren't changed either. They are still "str" and "bytearray" in Python 2.6.

(moved down)

...

Since this is a major break in the Python C API, please make sure that you bump the Python C API level used for module imports.

Do you still think it's necessary to bump up the C API version level? Christian

M.-A. Lemburg

3:03 p.m.

On 2008-05-26 15:40, Christian Heimes wrote:

...

M.-A. Lemburg schrieb:

...
Most imports will fail anyway at the link stage, since PyString_* APIs are probably the most used C APIs in Python extensions.

I think you have missed an important point. In Python 2.6 the names stay the same for the linker. Although the functions are now called PyBytes_Egg, they are redefined to PyString_Egg by a second header file.

In Python 2.6 the renaming of PyString is purely for consistency with the new Python 3.0 names. The names for PyString stay the same for external code like the library and extension modules.

Isn't that an awfully confusing approach?

Wouldn't it be better to keep PyString APIs and definitions in stringobject.c|h and only add a new bytesobject.h header file that #defines the PyBytes APIs in terms of PyString APIs? That maintains backwards compatibility and allows Python internals to use the new API names.

With your approach, you've basically backported the confusing notion in Py3k that str() maps to PyUnicode, only that in Py2 str() will now map to PyBytes.

You'd have to add an alias bytes -> str to the builtins to at least reduce the confusion a bit. However, that's bound to cause even more problems, since people will start using bytes() instead of str() in Py2 applications and as a result they won't run in older Python versions anymore. The same problem applies to Py2 extension writers who wish to support older Python releases as well.

...

PyBytes -> PyByteArray is a different story, though.

PyBytes was new in 2.6 anyway, so there's no breakage there.

...

...
One detail I'm worried about is the change of the type name, since that is sometimes used in object serialization or proxy implementations.

The type names aren't changed either. They are still "str" and "bytearray" in Python 2.6.

Good.

...

(moved down)

...
Since this is a major break in the Python C API, please make sure that you bump the Python C API level used for module imports.

Do you still think it's necessary to bump up the C API version level?

Yes, but please let's first discuss this some more. I don't think that the timing was right... you started this thread just yesterday and the patches are already checked in.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 26 2008)

...

...
...
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

Christian Heimes

9:34 p.m.

M.-A. Lemburg schrieb:

...

Isn't that an awfully confusing approach?

Wouldn't it be better to keep PyString APIs and definitions in stringobject.c|h and only add a new bytesobject.h header file that #defines the PyBytes APIs in terms of PyString APIs? That maintains backwards compatibility and allows Python internals to use the new API names.

With your approach, you've basically backported the confusing notion in Py3k that str() maps to PyUnicode, only that in Py2 str() will now map to PyBytes.

The last time I brought up the topic, I had a lengthy discussion with Guido. At first I wanted to rename the API in Python 3.0 only. Guido argued that it's going to cause too many merge conflicts. He then suggested the approach I implemented today.

I find the approach less confusing than your suggestion and my initial idea. The internal API names are consistent for Python 2.6 and 3.0. The byte string C API is prefixed PyBytes and the unicode C API is prefixed PyUnicode. A core developer just has to remember that 'str' is a byte string in 2.x but a unicode object in 3.0.

Extension developers don't have to worry at all. The ABI and external API are mostly the same and still expose the 'str' functions as PyString.

...

You'd have to add an alias bytes -> str to the builtins to at least reduce the confusion a bit.

Python 2.6 already has an alias bytes -> str

...

Yes, but please let's first discuss this some more. I don't think that the timing was right.... you started this thread just yesterday and the patches are already checked in.

I'm sorry if I was too hasty for you. I got +1 from a couple of developers and it's basically Guido's suggestion. Christian

M.-A. Lemburg

10:10 a.m.

On 2008-05-26 23:34, Christian Heimes wrote:

...

M.-A. Lemburg schrieb:

...
Isn't that an awfully confusing approach?

Wouldn't it be better to keep PyString APIs and definitions in stringobject.c|h and only add a new bytesobject.h header file that #defines the PyBytes APIs in terms of PyString APIs? That maintains backwards compatibility and allows Python internals to use the new API names.

With your approach, you've basically backported the confusing notion in Py3k that str() maps to PyUnicode, only that in Py2 str() will now map to PyBytes.

The last time I brought up the topic, I had a lengthy discussion with Guido. At first I wanted to rename the API in Python 3.0 only. Guido argued that it's going to cause too many merge conflicts. He then suggested the approach I implemented today.

That's the same argument that came up in the module renaming discussion.

I have a feeling that we should be looking for better merge tools, rather than implementing code changes that cause more trouble than good, just because our existing tools aren't smart enough.

Wouldn't it be possible to have a 2to3.py converter take the 2.x code (including the C code), convert it and then apply any changes to the 3.x branch?

This wouldn't be merging in the classical sense, it would be automated forward porting.

...

I find the approach less confusing than your suggestion and my initial idea.

I disagree on that. Renaming old APIs to use the new names by adding a header file with #define <oldname> <newname> is standard practice. Renaming the old APIs in the source code and undoing the renaming with a header file is not.

...

The internal API names are consistent for Python 2.6 and 3.0. The byte string C API is prefixed PyBytes and the unicode C API is prefixed PyUnicode. A core developer just has to remember that 'str' is a byte string in 2.x but a unicode object in 3.0.

So you've solved part of the problem for 3.x by moving the naming mixup back to 2.x.

...

Extension developers don't have to worry at all. The ABI and external API is mostly the same and still exposes the 'str' functions as PyString.

Well, yes, but only due to a preprocessor hack that turns the names used in bytesobject.c back into names you'd normally look for in stringobject.c. And all this, just because Subversion can't handle merging of symbol renaming.

...

...
You'd have to add an alias bytes -> str to the builtins to at least reduce the confusion a bit.

Python 2.6 already has an alias bytes -> str

...
Yes, but please let's first discuss this some more. I don't think that the timing was right.... you started this thread just yesterday and the patches are already checked in.

I'm sorry if I was too hasty for you. I got +1 from a couple of developers and it's basically Guido's suggestion.

Please discuss any changes to the 2.x code base on python-dev. Such major changes do need more discussion and possibly a PEP as well.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 27 2008)

...

...
...

M.-A. Lemburg

10:12 a.m.

I'm beginning to wonder whether I'm the only one who cares about the Python 2.x branch not getting cluttered up with artifacts caused by a broken forward merge strategy.

How can it be that we allow major C API changes such as the renaming of the PyString APIs to go into the trunk without discussion or a PEP? We're having lengthy discussions about the addition of a single method to an object, but such major changes just go in like that and nobody seems to really care.

Puzzled,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 28 2008)

...


Paul Moore

12:29 p.m.

On 28/05/2008, M.-A. Lemburg <mal@egenix.com> wrote:

...

I care, but I struggle to understand the implications and/or what is being proposed in many cases. Recent examples are the ABC backports and the current thread (string C API). I simply don't follow the issues well enough to comment.

...

Christian has raised this a couple of times, but there has been little discussion. I suspect that this is because there is not enough clarity over the practical consequences. A PEP may help here, but I'm not sure how much - it could spark discussion, but would anyone actually end up any better informed?

...

I suspect deadline pressure and burnout are involved here. In all honesty, there's been little or no work done on the C API, which is just as much in need of review and possible cleanup for 3.0 as the language. It's as close as makes no difference to too late now - does that mean we've lost the chance? Paul.

M.-A. Lemburg

12:59 p.m.

On 2008-05-28 14:29, Paul Moore wrote:

...

Thanks, so I'm not the only one :-)

...

Probably, yes. The reason is that if you have a PEP, more people are likely to review it and make comments. If you start a discussion with a general subject line which then results in lots of little sub-threads, important aspects of the discussion are likely to go unnoticed in the noise.

...

Perhaps, but the C API is certainly not used by as many people as the Python front-end, and changes to the C API also have much deeper consequences due to the API being written in C rather than Python.

Overall, I don't think there's a lot to clean up in the C API. Perhaps remove a few of those '...Ex()' APIs that were introduced to extend the original APIs and maybe remove or free up a few type slots that are no longer needed, but that's about it.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 28 2008)

...

Bill Janssen

5:08 p.m.

...

I share your concern. Seems to me that perhaps (not sure, but perhaps) the rush to back-port from 3.x, and the concern about minimizing pain of moving from 2.x to 3.x, has become the tail wagging the dog. Bill

M.-A. Lemburg

2:51 p.m.

On 2008-05-28 19:08, Bill Janssen wrote:

...

Indeed. If the need to be able to forward merge changes from the 2.x trunk to the 3.x branch is the only reason for the current approach, then we need to find a better procedure for getting patches to 2.x forwarded to 3.x.

I believe that everyone is aware that 3.x breaks things and that's fine. However, the reason for introducing such breakage in 3.x is that users have the option to decide whether and when to switch to the new major version.

Being able to play with 3.x features in 2.x is nice, but I wouldn't really consider those essential for 2.x. It certainly doesn't warrant causing major problems in the 2.x releases. The module renaming backport was one example (which was undone again), the C API renaming is another.

I expect more such features to be backported from 3.x to 2.x (even though I don't really think it's worth the trouble) and since this always means that changes have to be applied in two worlds, we'll need a better process for getting changes in one major release ported to the other.

Simply tweaking 2.x into shape so that the rather simple-minded SVN merge command works isn't a good enough procedure for this.

That's why I suggested using an intermediate form or branch for the merging - one that implements 2.x with all renaming and syntax fixing applied. This would:

* reduce the number of merge conflicts since the renaming would already have happened
* reduce the patch sizes that have to be applied to 3.x in order to stay in sync with 2.x
* result in a tool chain that makes it easier for all Python users to port their code to 3.x
* simplify renaming or reorg of modules, functions, methods and C APIs without requiring major changes on either side

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 29 2008)

...

Gregory P. Smith

8:47 p.m.

On Wed, May 28, 2008 at 3:12 AM, M.-A. Lemburg <mal@egenix.com> wrote:

...

I do not consider it a C API change. The API and ABI have not changed. Old code still compiles. Old binaries still dynamically load and work fine. (I just confirmed this by importing a couple of python2.4 .so files into my non-debug build of 2.6 trunk.)

All of the PyString APIs are the real implementations in 2.x and are still there. We only switched to using their PyBytes equivalent names within the Python trunk code base. Are you objecting to our own code switching to use a different name even though the actual underlying API and ABI haven't changed?

I suppose to people reading the code and going against old reference books it could be confusing, but they've got to get used to the new names somehow and sometime.

I strongly support changes like this one that make the life of porting C code forwards and backwards between 2.x and 3.x easier without breaking compatibility with earlier 2.x versions, because that is going to be a serious pain for all of us otherwise.

-gps

M.-A. Lemburg

3:22 p.m.

On 2008-05-28 22:47, Gregory P. Smith wrote:

...

Well, first of all, it is a change in the C API: APIs have different names now, they live in different files, the Python documentation doesn't apply anymore, books have to be updated, programmers trained, etc. etc. That's fine for 3.x, it's not for 2.x.

Second, if you leave out the "ease merging" argument, all of this is not really necessary in 2.x. If you absolutely want to have PyBytes APIs in 2.x, then you can *add* them, without removing the PyString APIs. We have done that on a smaller scale a couple of times in the past (turned functions into macros or vice-versa).

And finally, the "merge" argument itself is not really all that strong. It's just a matter of getting the procedure corrected. Then you can rename and restructure as much as you want in 3.x - without affecting the stability and maturity of the 2.x branch.

I suspect more of these backports will happen, so we had better get things done right now instead of putting Python's reputation as a stable and mature programming language at risk.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 29 2008)

...

Christian Heimes

3:45 p.m.

M.-A. Lemburg schrieb:

...

No, that's not correct. The 2.x API is still the same. I've only changed the internal code.

...

The PyString methods are still available and the official API for dealing with str objects in 2.x.

...

I'm volunteering to revert my changes if you are volunteering to keep the Python 2.x series in sync with the 3.x series.

Christian

M.-A. Lemburg

6:08 p.m.

Christian, so far you have not responded to any of the suggestions made on this thread, only defended your checkin. That's not very helpful in getting to some conclusion.

* What's so hard about going with a proper, standard solution that doesn't involve using your preprocessor hack?
* Why can't we have both PyString *and* PyBytes exposed in 2.x, with one redirecting to the other?
* Why should the 2.x code base turn to hacks, just because 3.x wants to restructure itself?
* Why aren't you even considering my proposed solution for this whole renaming and reorg problem?

BTW: Is there some PEP or wiki page explaining how you actually implement the merging from 2.x to 3.x? I'm still under the assumption that you're only using svnmerge.py for this and doing straight merging from the trunk to the branch.

Not sure how others feel about it, but if the only option you would feel comfortable with is not having the 3.x renaming backported, then I'd rather go with that, really. It's easy enough to add a header file to map PyString APIs to PyBytes if you want to port an extension to 3.x.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 29 2008)

...

Nick Coghlan

10:57 p.m.

M.-A. Lemburg wrote:

...

* Why can't we have both PyString *and* PyBytes exposed in 2.x, with one redirecting to the other ?

We do have that - the PyString_* names still work perfectly fine in 2.x. They just won't be used in the Python core codebase anymore - everything in the Python core will use either PyBytes_* or PyUnicode_* regardless of which branch (2.x or 3.x) you're working on. I think that's a good thing for ease of maintenance in the future, even if it takes people a while to get their heads around it right now.

...

* Why should the 2.x code base turn to hacks, just because 3.x wants to restructure itself ?

With the better explanation from Greg of what the checked in approach achieves (i.e. preserving exact ABI compatibility for PyString_*, while allowing PyBytes_* to be used at the source code level), I don't see what has been done as being any more of a hack than the possibly more common "#define <oldname> <newname>" (which *would* break binary compatibility).

The only things that I think would tidy it up further would be to:

- include an explanation of the approach and its effects on API and ABI backward and forward compatibility within 2.x and between 2.x and 3.x in stringobject.h
- expose the PyBytes_* functions to the linker in 2.6 as well as 3.0

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

Gregory P. Smith

7:45 a.m.

On Thu, May 29, 2008 at 3:57 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:

...

Yes, that is the only complaint I believe I really see left at this point. It is easy enough to fix. Change the current stringobject.h "#define PyBytes_Foo PyString_Foo" approach into a .c file that defines one-line stub functions for all PyString_Foo() functions to call actual PyBytes_Foo() functions. I'd even go so far as to put the one-line alternate name stubs in the Objects/bytesobject.c and .h file right next to the PyBytes_Foo() method definitions so that it's clear from reading a single file that they are the same thing.

The performance implications of this are minor all things considered (a single absolute jmp given a good compiler) and regardless of what we do should only apply to extension modules, not the core.

If we do the above in trunk will this thread end?

I'm personally not really clear on why we need PyBytes_Foo to show up in the -binary- ABI in 2.6. The #defines are enough for me but I'm happy to make this compromise. No 2.x books, documentation or literature will be invalidated by the changes regardless.

-gps
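The stub idea proposed here can be sketched as follows. This is an assumed shape only, using invented `Demo*` names rather than the real PyString/PyBytes functions: the implementation lives under the new name and the old name is a separate one-line function, so both exist as real symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Real implementation under the new name (hypothetical stand-in). */
size_t DemoBytes_Size(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* One-line stub kept right next to it: the old name is a real function,
   so it is visible to the linker and to dynamic lookups as well. */
size_t DemoString_Size(const char *s)
{
    return DemoBytes_Size(s);
}

typedef size_t (*demo_size_fn)(const char *);

/* Both spellings can be taken by address as separate functions. */
int demo_names_are_distinct(void)
{
    demo_size_fn new_name = DemoBytes_Size;
    demo_size_fn old_name = DemoString_Size;

    return new_name != old_name;
}
```

The cost is one extra call (or jump) for callers of the old name, which is the trade-off mentioned in the message above.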

M.-A. Lemburg

8:37 a.m.

On 2008-05-30 00:57, Nick Coghlan wrote:

...

Sorry, I probably wasn't clear enough: Why can't we have both PyString *and* PyBytes exposed as C APIs (i.e. visible in code and in the linker) in 2.x, with one redirecting to the other?

...

Which is what I was suggesting all along; sorry if I wasn't clear enough on that.

The standard approach is that you provide #define redirects from the old APIs to the new ones (which are then picked up by the compiler) *and* add function wrappers to the same effect (to make linkers, dynamic load APIs such as ctypes and debuggers happy).

Example from pythonrun.h|c:
---------------------------

    /* Use macros for a bunch of old variants */
    #define PyRun_String(str, s, g, l) PyRun_StringFlags(str, s, g, l, NULL)

    /* Deprecated C API functions still provided for binary compatibility */
    #undef PyRun_String
    PyAPI_FUNC(PyObject *)
    PyRun_String(const char *str, int s, PyObject *g, PyObject *l)
    {
        return PyRun_StringFlags(str, s, g, l, NULL);
    }

I still believe that we should *not* make "ease of merging" the primary motivation for backporting changes in 3.x to 2.x. Software design should not be guided by restrictions in the tool chain, if not absolutely necessary.

The main argument for a backport needs to be general usefulness to the 2.x users, IMHO... just like any other feature that makes it into 2.x.

If merging is difficult then this needs to be addressed, but there are more options to that than always going back to the original 2.x trunk code. I've given a few suggestions on how this could be approached in other emails on this thread.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 30 2008)
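The macro-plus-wrapper pattern from the pythonrun example can be tried out without the Python headers. The following is a self-contained sketch with invented names (`Demo_Run`, `Demo_RunFlags`); it mirrors the structure of the example above but is not taken from CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Newer, more general entry point (hypothetical). */
int Demo_RunFlags(int value, const int *flags)
{
    return flags != NULL ? value + *flags : value;
}

/* Header side: the old call is a macro, so freshly compiled code
   calls the new function directly. */
#define Demo_Run(value) Demo_RunFlags((value), NULL)

int demo_via_macro(int value)
{
    return Demo_Run(value);   /* expands to Demo_RunFlags((value), NULL) */
}

/* Source side: drop the macro and define a real function of the same
   name, so that a symbol exists for already compiled callers. */
#undef Demo_Run
int Demo_Run(int value)
{
    return Demo_RunFlags(value, NULL);
}

int demo_via_symbol(int value)
{
    int (*fn)(int) = Demo_Run;   /* address of the real wrapper function */

    assert(fn != NULL);
    return fn(value);
}
```

Recompiled code pays no extra call because it goes through the macro, while the wrapper keeps the old name resolvable for binaries and dynamic lookups.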

...

Gregory P. Smith

June 2008

11:30 p.m.

On Fri, May 30, 2008 at 1:37 AM, M.-A. Lemburg <mal@egenix.com> wrote:

...

On 2008-05-30 00:57, Nick Coghlan wrote:

...
M.-A. Lemburg wrote:

...
* Why can't we have both PyString *and* PyBytes exposed in 2.x, with one redirecting to the other ?

We do have that - the PyString_* names still work perfectly fine in 2.x. They just won't be used in the Python core codebase anymore - everything in the Python core will use either PyBytes_* or PyUnicode_* regardless of which branch (2.x or 3.x) you're working on. I think that's a good thing for ease of maintenance in the future, even if it takes people a while to get their heads around it right now.

Sorry, I probably wasn't clear enough:

Why can't we have both PyString *and* PyBytes exposed as C APIs (i.e. visible in code and in the linker) in 2.x, with one redirecting to the other?

...
...
* Why should the 2.x code base turn to hacks, just because 3.x wants to restructure itself ?

With the better explanation from Greg of what the checked in approach achieves (i.e. preserving exact ABI compatibility for PyString_*, while allowing PyBytes_* to be used at the source code level), I don't see what has been done as being any more of a hack than the possibly more common "#define <oldname> <newname>" (which *would* break binary compatibility).

The only things that I think would tidy it up further would be to: - include an explanation of the approach and its effects on API and ABI backward and forward compatibility within 2.x and between 2.x and 3.x in stringobject.h - expose the PyBytes_* functions to the linker in 2.6 as well as 3.0

Which is what I was suggesting all along; sorry if I wasn't clear enough on that.

The standard approach is that you provide #define redirects from the old APIs to the new ones (which are then picked up by the compiler) *and* add function wrappers to the same effect (to make linkers, dynamic load APIs such as ctypes, and debuggers happy).

Example from pythonrun.h|c:
---------------------------

/* Use macros for a bunch of old variants */
#define PyRun_String(str, s, g, l) PyRun_StringFlags(str, s, g, l, NULL)

/* Deprecated C API functions still provided for binary compatibility */
#undef PyRun_String
PyAPI_FUNC(PyObject *)
PyRun_String(const char *str, int s, PyObject *g, PyObject *l)
{
    return PyRun_StringFlags(str, s, g, l, NULL);
}
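The macro-plus-wrapper pattern quoted above can be tried outside of CPython. The following is only a minimal sketch: the names demo_bytes_size and demo_string_size are hypothetical stand-ins and not part of any Python header. It shows the old name working both as a compile-time redirect and as a real function whose address can be taken, which is what a linker or ctypes would need.

```c
#include <stddef.h>
#include <string.h>

/* ---- what a header might declare (hypothetical names) ---- */

/* New-style API: the real implementation. */
size_t demo_bytes_size(const char *s);

/* Old-style API: a source-level redirect picked up by recompiled code. */
#define demo_string_size(s) demo_bytes_size(s)

/* ---- what the .c file might define ---- */

size_t demo_bytes_size(const char *s)
{
    return s ? strlen(s) : 0;
}

/* Drop the macro so that a real symbol with the old name can be defined
   for binaries and tools that look the name up at link or load time. */
#undef demo_string_size
size_t demo_string_size(const char *s)
{
    return demo_bytes_size(s);
}
```

Code that includes only the header part compiles calls to the old name straight into calls to the new one, so the wrapper is never entered from such code. That is also why a breakpoint set on the wrapper would not fire for recompiled callers.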

Okay, how about this? http://codereview.appspot.com/1521 Using that patch, both PyString_ and PyBytes_ APIs are available using function stubs similar to the above. I opted to define the stub functions right next to the ones they were stubbing rather than putting them all at the end of the file or in another file but they could be moved if someone doesn't like them that way.

...

I am not the one doing the merging or working on merge tools so I'll leave this up to those that are. -gps

M.-A. Lemburg

12:33 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On 2008-06-02 01:30, Gregory P. Smith wrote:

...

Okay, how about this? http://codereview.appspot.com/1521

Using that patch, both PyString_ and PyBytes_ APIs are available using function stubs similar to the above. I opted to define the stub functions right next to the ones they were stubbing rather than putting them all at the end of the file or in another file but they could be moved if someone doesn't like them that way.

Thanks. I was working on a similar patch. Looks like you beat me to it.

The only thing I'm not sure about is having the wrappers in the same file - this is likely to cause merge conflicts when doing direct merging and even with an automated renaming approach, the extra code would be easier to remove if it were e.g. at the end of the file or even better: in a separate file.

My patch worked slightly differently: it adds wrappers PyString* that forward calls to the PyBytes* APIs and they all live in stringobject.c. stringobject.h then also provides aliases so that recompiled extensions pick up the new API names.

While working on my patch I ran into an issue that I haven't been able to resolve: the wrapper functions got optimized away by the linker and even though they appear in libpython2.6.a, they don't end up in the python binary itself.

As a result, importing Python 2.5 extension modules in the resulting 2.6 binary still fails with an unresolved PyString symbol.

Please check whether that's the case for your patch as well.

...

I'm not sure whether there are any specific merge tools around - apart from the 2to3.py script. There also doesn't seem to be any documentation on the merge process itself (at least nothing that Google can find in the PEPs), so it's difficult to make any suggestions. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 02 2008)


Gregory P. Smith

10:21 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On Mon, Jun 2, 2008 at 5:33 AM, M.-A. Lemburg <mal@egenix.com> wrote:

...

I think that is going to happen no matter which approach is used (yours or mine) unless we force some included code to call each of the stubs (needlessly inefficient). One way to do that is to reference them all from a section of code called conditionally based upon an always false condition that the compiler and linker can never predetermine is false so that it cannot be eliminated as dead code.

Given that, should we bother? I don't think we really need PyBytes_ to show up in the binary ABI for 2.x even if that is how we write the calls in the python internals code. The argument that debugging is easier if you can just set a breakpoint on what you read may be true, but including stub functions doesn't help here: most of the time the calls are compiled under the alternate name using #defines, so a breakpoint set on the stub name will not actually trigger.

API-wise we're really providing the PyBytes* names to make module authors' work of writing code that targets 2.6 and 3.x easier, but isn't it reasonable for authors to just be told that they're just #defined aliases for PyString*? There is no goal, nor should there be, of a module binary compiled against 2.x loading and working in 3.x.

I expect most module authors, code generators and such will want to target Python 2.x earlier than 2.6 as well, so should we provide PyBytes_ names as a public API in 2.6 at all? (regardless of whether we use the PyBytes names internally for any reason)

-gps
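The "always false condition" idea can be sketched as below. This is an illustration under assumptions, not what any patch in this thread did: demo_stub_a, demo_stub_b and demo_keep_stubs_alive are hypothetical names, and whether the referenced functions really survive into a final binary still depends on the toolchain and link options in use.

```c
#include <stddef.h>

/* Hypothetical stubs standing in for the compatibility wrappers. */
int demo_stub_a(int x) { return x + 1; }
int demo_stub_b(int x) { return x * 2; }

/* A volatile flag must be loaded at run time, so the compiler cannot
   prove that the guarded branch is dead. Nothing ever sets it. */
static volatile int demo_never_true = 0;

typedef int (*demo_stub_fn)(int);

/* Meant to be called once from code that is always linked in.
   Returns the number of stubs actually invoked, which stays 0. */
int demo_keep_stubs_alive(void)
{
    static const demo_stub_fn stubs[] = { demo_stub_a, demo_stub_b };
    int called = 0;

    if (demo_never_true) {
        for (size_t i = 0; i < sizeof stubs / sizeof stubs[0]; i++) {
            (void)stubs[i](0);
            called++;
        }
    }
    return called;
}
```

The cost is one volatile load per call of demo_keep_stubs_alive, plus a table of pointers that must be kept in sync with the list of stubs by hand.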

Guido van Rossum

11:09 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

I will freely admit that I haven't followed this thread in any detail, but if it were up to me, I'd have the 2.6 internal code use PyString (as both what the linker sees and what the human reads in the source code) and the 3.0 code use PyBytes for the same thing. Let the merges be damned -- most changes to 2.6 these days seem to be blocked explicitly from being merged anyway. I'd prefer the 2.6 code base to stay true to 2.x, and the 3.0 code base start afresh where it makes sense. We should reindent more of the 3.0 code base to use 4-space-indents in C code too.

I would also add macros that map the PyBytes_* APIs to PyString_*, but I would not start using these internally except in code newly written for 2.6 and intended to be "in the spirit of 3.0". IOW use PyString for 8-bit strings containing text, and PyBytes for 8-bit strings containing binary data. For 8-bit strings that could contain either text or data, I'd use PyString, in the spirit of 2.x.

--Guido

On Mon, Jun 2, 2008 at 3:21 PM, Gregory P. Smith <greg@krypto.org> wrote:

...

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

Gregory P. Smith

11:29 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On Mon, Jun 2, 2008 at 4:09 PM, Guido van Rossum <guido@python.org> wrote:

...

I will freely admit that I haven't followed this thread in any detail, but if it were up to me, I'd have the 2.6 internal code use PyString

...

Should we read this as a BDFL pronouncement and make it so?

All that would mean change-wise is that trunk r63675, as well as possibly r63672 and r63677, would need to be rolled back, and this whole discussion over whether such a big change should have happened would become moot.

I would also add macros that map the PyBytes_* APIs to PyString_*, but

...

M.-A. Lemburg

9:19 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On 2008-06-03 01:29, Gregory P. Smith wrote:

...

I would certainly welcome reverting the change.

All that's needed to support the PyBytes API in 2.x is a set of #defines that map the new APIs to the PyString names. That's a clean and easily understandable solution.

Programmers interested in the code for a PyString API can then still look up the code in stringobject.c, e.g. to find out how a certain special case is handled or to check the ref counting - just like they did for years.

Developers who want to start differentiating between mixed byte/text data and bytes-only can start using PyBytes for byte data.

...

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 06 2008)


Gregory P. Smith

5:20 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On Fri, Jun 6, 2008 at 2:19 AM, M.-A. Lemburg <mal@egenix.com> wrote:

...

Okay, I've reverted r63675 in trunk revision r64048. That leaves all of the python modules and internals using PyString_ API names instead of PyBytes_ API names, as they were before. PyBytes_ #defines exist for the appropriate PyString methods in case anyone wants to use those.

Programmers interested in the code

...

The files still exist with the new names: bytesobject.c instead of stringobject.c. Those renames were done in the other CLs I mentioned, which have not yet been reverted. The current state seems a bit odd because they depend on the #defines to cause method definitions to be the PyString_ names instead of the PyBytes_ names.

...

M.-A. Lemburg

8:44 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On 2008-06-09 07:20, Gregory P. Smith wrote:

...

Thanks.

...

Please restore the original state, ie. PyString APIs live in stringobject.h and stringobject.c. bytesobject.h should then have the #defines for PyBytes APIs, pointing them to the PyString names (basically what's currently in stringobject.h).
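The layout requested here (implementation under the old names, plus a header of #defines for the new names) can be sketched in miniature. Everything below is a hypothetical stand-in written for illustration: DemoStringObject and the DemoString_*/DemoBytes_* names are not the real CPython declarations, and the real bytesobject.h covers far more functions.

```c
#include <stdlib.h>
#include <string.h>

/* ---- stand-in for stringobject.h / stringobject.c ---- */

typedef struct {
    size_t size;
    char  *data;
} DemoStringObject;

/* Create an object holding a copy of s; returns NULL on failure. */
DemoStringObject *DemoString_FromString(const char *s)
{
    DemoStringObject *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->size = strlen(s);
    o->data = malloc(o->size + 1);
    if (o->data == NULL) {
        free(o);
        return NULL;
    }
    memcpy(o->data, s, o->size + 1);
    return o;
}

size_t DemoString_Size(const DemoStringObject *o) { return o->size; }
const char *DemoString_AsString(const DemoStringObject *o) { return o->data; }

void DemoString_Free(DemoStringObject *o)
{
    if (o != NULL) {
        free(o->data);
        free(o);
    }
}

/* ---- stand-in for bytesobject.h: new names point at the old ones ---- */

#define DemoBytesObject       DemoStringObject
#define DemoBytes_FromString  DemoString_FromString
#define DemoBytes_Size        DemoString_Size
#define DemoBytes_AsString    DemoString_AsString
#define DemoBytes_Free        DemoString_Free
```

With this arrangement only the DemoString_* symbols exist in the compiled object; the DemoBytes_* spellings are visible to the preprocessor alone, which matches the "just #defined aliases" position taken earlier in the thread.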

...

Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 09 2008)


Gregory P. Smith

3:42 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On Mon, Jun 9, 2008 at 1:44 AM, M.-A. Lemburg <mal@egenix.com> wrote:

...

All done as of r64105.

M.-A. Lemburg

8:52 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On 2008-06-11 05:42, Gregory P. Smith wrote:

...

Thank you ! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 11 2008)


Antoine Pitrou

10:39 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

Guido van Rossum <guido <at> python.org> writes:

...

Is there any reason reindenting shouldn't be done for 2.6 too? (apart from "staying true to 2.x" :-)) Antoine.

Georg Brandl

10:48 a.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

Antoine Pitrou schrieb:

...

It would make svn blame useless, for a start. (SVN could really use a feature to exclude certain revisions from showing up in svn blame.) Georg

Guido van Rossum

2:29 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On Tue, Jun 3, 2008 at 3:48 AM, Georg Brandl <g.brandl@gmx.net> wrote:

...

What he said. And "staying true to 2.x" is not a bad rationale. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/)

M.-A. Lemburg

5:43 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

On 2008-06-03 01:09, Guido van Rossum wrote:

...

+1

Let's work on better merge tools that edit the trunk code base into shape for a 3.x checkin. Using automated tools for this is likely going to lower the probability of bugs introduced due to unnoticed merge conflicts and in the end is also going to be a benefit to everyone wanting to maintain a single code base for both targets.

Perhaps we could revive the old Tools/scripts/fixcid.py that was used for the 1.4->1.5 renaming ?!

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 03 2008)


Lisandro Dalcin

4:17 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

Are you completely sure about adding those guys: PyBytes_InternXXX ???

On 6/1/08, Gregory P. Smith <greg@krypto.org> wrote:

...

-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594

Gregory P. Smith

10:22 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

-cc: python-3000

I believe those APIs are already there in the existing interface. Why does that concern you?

On Mon, Jun 2, 2008 at 9:17 AM, Lisandro Dalcin <dalcinl@gmail.com> wrote:

...

Are you completely sure about adding those guys: PyBytes_InternXXX ???


Christian Heimes

May 2008

12:02 p.m.

M.-A. Lemburg schrieb:

...

I have a feeling that we should be looking for better merge tools, rather than implement code changes that cause more trouble than do good, just because our existing tools aren't smart enough.

We don't have better tools at hand. I don't think we'll get any tools in time or change the VCS right before a major release.

...

Wouldn't it be possible to have a 2to3.py converter take the 2.x code (including the C code), convert it and then apply any changes to the 3.x branch ?

Such a converter would be nice for 3rd party code but it's not an option for the core. In the past few months I've merged a lot of code from trunk to py3k. A 2to3 C converter doesn't help with merge conflicts. Naming differences make any merge more painful

...

...
I find the approach less confusing than your suggestion and my initial idea.

I disagree on that.

Renaming old APIs to use the new names by adding a header file with #define <oldname> <newname> is standard practice.

Renaming the old APIs in the source code and undoing the renaming with a header file is not.

I wasn't talking about standard practice here. I talked about less confusion for core developers. My approach doesn't split our internal API in two. And by the way it *is* a standard approach for Python. Guido told me that the same approach was used during the 1.x to 2.0 migration.

...

And all this, just because Subversion can't handle merging of symbol renaming.

As I said earlier we don't have better tools at our disposal. We have to make some compromises. Sometimes practicality beats purity.

...

Please discuss any changes of the 2.x code base on python-dev.

Such major changes do need more discussion and possibly a PEP as well.

In the last few months I started at least three topics about the C API renaming. It's in the thread "2.6 and 3.0 tasks" http://permalink.gmane.org/gmane.comp.python.devel/93016

Christian

M.-A. Lemburg

12:47 p.m.

New subject: PyString -> PyBytes C API renaming (Stabilizing the C API of 2.6 and 3.0)

On 2008-05-28 14:02, Christian Heimes wrote:

...

M.-A. Lemburg schrieb:

...
I have a feeling that we should be looking for better merge tools, rather than implement code changes that cause more trouble than do good, just because our existing tools aren't smart enough.

We don't have better tools at hand. I don't think we'll get any tools in time or change the VCS right before a major release.

...
Wouldn't it be possible to have a 2to3.py converter take the 2.x code (including the C code), convert it and then apply any changes to the 3.x branch ?

Such a converter would be nice for 3rd party code but it's not an option for the core. In the past few months I've merged a lot of code from trunk to py3k. A 2to3 C converter doesn't help with merge conflicts. Naming differences make any merge more painful

I was suggesting to not use SVN to merge changes directly, but to instead use an intermediate step in the process:

Init:

1. grab the latest trunk
2. apply a 2to3 converter to the Python code and the C code, applying any renaming that may be necessary
3. save this converted version in a separate branch merge-branch

Update:

1. check out the merge-branch, grab the latest trunk and 3.x branch
2. apply a 2to3 converter to the Python code and the C code, applying any renaming that may be necessary
3. copy the files over your working copy of the merge-branch
4. create a diff on the merge-branch
5. apply the diffs to 3.x branch, resolving any conflicts as necessary

This doesn't require new tools (except for some C renaming support in the 2to3 tool). It only changes the procedure. We'd basically follow our own suggestions w/r to porting to 3.x, which is to make changes in the 2.x code, apply 2to3 and then apply remaining fixes there.

I'm suggesting this, since 3.x is likely to introduce more Python stdlib and C API changes. The process would likely also make a lot of other changes more easily manageable and reduce the overall merge conflicts.
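The "C renaming support" that step 2 relies on could be as simple as a prefix rewrite on identifiers. The function below is a rough sketch under stated assumptions: demo_rename_prefix is a hypothetical helper, not part of 2to3 or fixcid.py, it assumes the old prefix ends in an identifier character such as '_', and it does not treat string literals or comments specially.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int demo_is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Copy src to dst, rewriting identifiers that start with old_prefix so
   that they start with new_prefix instead. A match only counts at the
   start of an identifier, so "MyPyString_X" is left alone.
   Returns 0 on success, -1 if dst is too small. */
int demo_rename_prefix(const char *src, const char *old_prefix,
                       const char *new_prefix, char *dst, size_t dst_size)
{
    size_t old_len = strlen(old_prefix);
    size_t new_len = strlen(new_prefix);
    size_t out = 0;
    int at_ident_start = 1;  /* previous char was not an identifier char */

    while (*src != '\0') {
        if (at_ident_start && strncmp(src, old_prefix, old_len) == 0) {
            if (out + new_len >= dst_size)
                return -1;
            memcpy(dst + out, new_prefix, new_len);
            out += new_len;
            src += old_len;
            at_ident_start = 0;  /* we are now inside the identifier */
            continue;
        }
        if (out + 1 >= dst_size)
            return -1;
        at_ident_start = !demo_is_ident_char(*src);
        dst[out++] = *src++;
    }
    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

Calling it with "PyString_" and "PyBytes_" on each line of a C file would give the mechanical part of the conversion; anything that cannot be expressed as a prefix swap would still need manual fixes on the merge branch.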

...

...
...
I find the approach less confusing than your suggestion and my initial idea.

I disagree on that.

Renaming old APIs to use the new names by adding a header file with #define <oldname> <newname> is standard practice.

Renaming the old APIs in the source code and undoing the renaming with a header file is not.
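To make the conventional direction concrete, here is a minimal sketch. The `DemoString_`/`DemoBytes_` names are hypothetical stand-ins for `PyString_`/`PyBytes_` and are not the real CPython functions. The implementation carries the new name, and a compatibility macro of the form `#define <oldname> <newname>` lets code written against the old name compile unchanged.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a renamed API: the function is implemented
   and exported under its NEW name. */
size_t DemoBytes_Size(const char *s)
{
    return strlen(s);
}

/* Compatibility header content: #define <oldname> <newname> */
#define DemoString_Size DemoBytes_Size

/* Legacy caller, written against the old name. It only needs to be
   recompiled; the preprocessor turns the call into DemoBytes_Size(s). */
size_t legacy_caller(const char *s)
{
    return DemoString_Size(s);
}
```

In this arrangement the symbol in the library is the new name, and the old name exists only at the source level for callers that have not been updated yet.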

I wasn't talking about standard practice here. I talked about less confusion for core developers. My approach doesn't split our internal API in two.

No, but it does apply a well hidden renaming which will cause confusion when using a debugger to trace calls in C code. If you use PyBytes APIs, you expect to find PyBytes functions in the libs and also set breakpoints on these. With the renaming we don't have two sets of APIs (old and new) exposed in the lib, like what we normally do when applying changes to API names.
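The debugger concern can be illustrated with a small sketch of the reversed alias. Again, the `Demo` names are hypothetical stand-ins rather than the real API, and `compiled_symbol_name` is a helper invented for this example. The source is written with the new name, but the macro maps it back to the old one before the compiler sees it, so the old name is what ends up in the object file.

```c
#include <stddef.h>
#include <string.h>

/* Reversed alias: the NEW name is mapped back to the OLD name. */
#define DemoBytes_Size DemoString_Size

/* Reads as DemoBytes_Size in the source, but after preprocessing the
   compiler, linker and debugger only ever see DemoString_Size. */
size_t DemoBytes_Size(const char *s)
{
    return strlen(s);
}

/* Two-level stringification: the argument is macro-expanded first,
   then turned into a string literal. */
#define STRINGIFY(x) #x
#define EXPAND_AND_STRINGIFY(x) STRINGIFY(x)

/* Hypothetical helper: returns the identifier the compiler actually
   used for the function defined above. */
const char *compiled_symbol_name(void)
{
    return EXPAND_AND_STRINGIFY(DemoBytes_Size);
}
```

Here `compiled_symbol_name()` returns "DemoString_Size", which is the name a breakpoint would have to be set on, even though every call site in the source says `DemoBytes_Size`.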

...

And by the way it *is* a standard approach for Python. Guido told me that the same approach was used during the 1.x to 2.0 migration.

There was no API change between 1.6 and 2.0. You are probably talking about the great renaming between 1.4 and 1.5. That was different, since it changes almost all C APIs in Python. And it used the standard practice... from rename2.h in Python 1.5:

    /* This file contains a bunch of #defines that make it possible to use
       "old style" names (e.g. object) with the new style Python source
       distribution. */

    #define True Py_True
    #define False Py_False
    #define None Py_None

i.e. #define <oldname> <newname>

...

...
And all this, just because Subversion can't handle merging of symbol renaming.

As I said earlier, we don't have better tools at our disposal. We have to make some compromises. Sometimes practicality beats purity.

See above.

...

...
Please discuss any changes of the 2.x code base on python-dev.

Such major changes do need more discussion and possibly a PEP as well.

In the last few months I started at least three topics about the C API renaming. It's in the thread "2.6 and 3.0 tasks" http://permalink.gmane.org/gmane.comp.python.devel/93016

Thanks. I stopped reading that thread after Guido's reply in http://comments.gmane.org/gmane.comp.python.devel/92541

It would really help if subject lines were more specific. This thread also uses a much too general subject line (which is why I changed it).

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 28 2008)


Nick Coghlan

1:43 p.m.

New subject: [Python-3000] PyString -> PyBytes C API renaming (Stabilizing the C API of 2.6 and 3.0)

M.-A. Lemburg wrote:

...

This is what I expected to see in stringobject.h, along with some code in stringobject.c to allow the linker to see the old names *as well as* the new names. At the moment, all the code appears to be using the new names, but stringobject.h implicitly converts the new names back to the old names - so trying to use ctypes to retrieve the PyBytes_* functions from the Python DLL will fail. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
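One way to let the linker see both sets of names, sketched here with hypothetical `Demo` stand-ins rather than the actual stringobject.c code, is to export the old name as a real function that forwards to the new one instead of defining it as a macro.

```c
#include <stddef.h>
#include <string.h>

/* New name: the real implementation. */
size_t DemoBytes_Size(const char *s)
{
    return strlen(s);
}

/* Old name: a real exported function, not a macro, so tools that look
   symbols up by name (linkers, debuggers, dlsym or ctypes) can find
   both DemoBytes_Size and DemoString_Size in the library. */
size_t DemoString_Size(const char *s)
{
    return DemoBytes_Size(s);
}
```

The cost is one extra call per use of the old name. A header could still add `#define DemoString_Size DemoBytes_Size` for source-level callers, as long as the forwarding function is compiled in a file that does not see that macro.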

Jesus Cea

7:34 a.m.

New subject: [Python-3000] PyString -> PyBytes C API renaming (Stabilizing the C API of 2.6 and 3.0)

M.-A. Lemburg wrote:

| If you use PyBytes APIs, you expect to find PyBytes functions in
| the libs and also set breakpoints on these.

Very good point.

-- Jesus Cea Avion jcea@jcea.es - http://www.jcea.es/

Lisandro Dalcin

1:39 p.m.

New subject: [Python-3000] Stabilizing the C API of 2.6 and 3.0

Christian, some weeks ago I posted some observations about the status of the PyNumberMethods API. The thread link is below; it did not receive much attention.

http://mail.python.org/pipermail/python-3000/2008-May/013594.html

To summarize that post:

* 'nb_nonzero' was renamed to 'nb_bool'
* 'nb_inplace_divide' was removed
* 'nb_hex', 'nb_oct', and 'nb_coerce' are there, but they are unused

IMHO, the PyNumberMethods struct should either be left as in Py2 or be cleaned up, that is, all unused slots should be removed.

On 5/25/08, Christian Heimes <lists@cheimes.de> wrote:

...
