[Python-3000] Looking for advice on PEP 3101 implementation details

Eric Smith eric+python-dev at trueblade.com
Fri Aug 17 22:14:31 CEST 2007


Guido van Rossum wrote:
> On 8/17/07, Eric Smith <eric+python-dev at trueblade.com> wrote:
>> I'm refactoring the sandbox implementation, and I need to add the code
>> that parses the standard format specifiers to py3k.  Since strings,
>> ints, and floats share the same format specifiers, I want to have only a
>> single parser.
> 
> Really? Strings support only a tiny subset of the numeric
> mini-language (only [-]N[.N]).

I think strings are:
[[fill]align][width][.precision][type]

ints are:
[[fill]align][sign][width][type]

floats are the full thing:
[[fill]align][sign][width][.precision][type]

They seem similar enough that a single parser would make sense.  Is it 
acceptable to put this parser in unicodeobject.c, and have it callable by 
floatobject.c and longobject.c?  I'm okay with that, I just want to make 
sure I'm not violating some convention that objects don't call into each 
other's implementation files.
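
To make the "single parser" idea concrete, here is a minimal sketch in Python rather than C, purely for brevity. It parses the full float form once and then rejects the fields that strings and ints do not accept. The names FormatSpec and parse_spec, and the exact align, sign and type character sets, are assumptions made for this illustration and are not taken from the sandbox code.

```python
import re
from typing import NamedTuple, Optional

# Grammar assumed from the "full thing" above:
#   [[fill]align][sign][width][.precision][type]
# The character sets for align, sign and type are assumptions of this sketch.
_SPEC_RE = re.compile(
    r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[-+ ])?
    (?P<width>\d+)?
    (?:\.(?P<precision>\d+))?
    (?P<type>[A-Za-z%])?
    \Z
    """,
    re.VERBOSE | re.DOTALL,
)


class FormatSpec(NamedTuple):
    """Hypothetical parse result shared by all three types."""
    fill: Optional[str]
    align: Optional[str]
    sign: Optional[str]
    width: Optional[int]
    precision: Optional[int]
    type: Optional[str]


# Which fields each kind of object accepts, per the three forms listed above.
_ALLOWED = {
    "str": {"fill", "align", "width", "precision", "type"},
    "int": {"fill", "align", "sign", "width", "type"},
    "float": set(FormatSpec._fields),
}


def parse_spec(spec: str, kind: str) -> FormatSpec:
    """Parse spec once, then reject fields that kind does not support."""
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError("invalid format specifier: %r" % spec)
    fields = match.groupdict()
    for name, value in fields.items():
        if value is not None and name not in _ALLOWED[kind]:
            raise ValueError("%s not allowed for %s: %r" % (name, kind, spec))
    # Convert the numeric fields from text to int.
    for name in ("width", "precision"):
        if fields[name] is not None:
            fields[name] = int(fields[name])
    return FormatSpec(**fields)
```

With this shape, the per-type callers differ only in the kind they pass, e.g. parse_spec("*>+10.3f", "float") versus parse_spec("10.4s", "str").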

>> My first question is:  where should this parser code live?  Should I
>> create a file Python/format.c, or is there a better place?  Should the
>> .h file be Include/format.h?
> 
> Is it only callable from C? Or is it also callable from Python? If so,
> how would Python access it?

I think the parser only needs to be callable from C.

>> I also need C code that is called by str.format and is also used by
>> the Formatter implementation.
>>
>> So my second question is:  should I create a Modules/_format.c for this
>> code?  And why do some of these modules have leading underscores?  Is it
>> a problem if str.format uses code in Modules/_format.c?  Where would the
>> .h file for this code go, if str.format (implemented in unicodeobject.c)
>> needs to get access to it?
>>
>> Thanks for your help, and ongoing patience with a Python internals
>> newbie (but C/C++ veteran).
> 
> Unless the plan is for it to be importable from Python, it should not
> live in Modules. Modules with an underscore are typically imported
> only by a "wrapper" .py module (e.g. _hashlib.c vs. hashlib.py).
> Modules without an underscore are for direct import (though there are
> a few legacy exceptions, e.g. socket.c should really be _socket.c).
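
The wrapper convention described above can be sketched roughly as follows. The helper name import_accelerator is hypothetical and exists only for this illustration. It looks up the underscore-prefixed module behind a public name and returns None when there isn't one.

```python
import importlib


def import_accelerator(public_name):
    """Return the module named "_" + public_name, or None if it is absent.

    Hypothetical helper: it only illustrates the naming convention, in which
    a pure-Python wrapper module imports its underscore-prefixed counterpart.
    """
    try:
        return importlib.import_module("_" + public_name)
    except ImportError:
        return None
```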

The PEP calls for a string.Formatter class that is subclassable in 
Python code.  I was originally thinking that this class would be written 
in Python, but now I'm not so sure.  Let me digest your answers here and 
I'll re-read the PEP, and see where it takes me.
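
As a rough illustration of what "subclassable in Python code" could mean in practice, the sketch below overrides format_field on string.Formatter as that class exists in released Python 3. The subclass UpperFormatter and its trailing "U" type code are invented for this example and are not part of the PEP.

```python
import string


class UpperFormatter(string.Formatter):
    """Hypothetical subclass: a trailing 'U' in the spec upper-cases the field."""

    def format_field(self, value, format_spec):
        if format_spec.endswith("U"):
            # Strip the invented type code, format as a string, then upper-case.
            text = super().format_field(str(value), format_spec[:-1])
            return text.upper()
        # Everything else goes through the standard formatting path.
        return super().format_field(value, format_spec)
```

For instance, UpperFormatter().format("{0:>6U}|{1:03d}", "ab", 7) pads and upper-cases the first field and leaves the second to the built-in int formatting.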

> Putting it in Modules makes it harder to access from C, as those
> modules are dynamically loaded. If you can't put it in floatobject.c,
> and it's not for import, you could create a new file under Python/.

Okay.  Thanks for the help.

Eric.


