Hello everybody,

As I won't read my mail during the whole next week I thought it might be helpful if I attempted some kind of summary. It collects numerous ideas posted here (recently or not) and a bit of personal feelings. I hope it might help in getting some distance over what we are actually trying to do with our "standard object space". I am afraid it is all very vague, but should be kept in mind before we actually design something new there.

The std/ directory is the "space of compliant Python object implementations", as opposed to non-compliant object spaces which do some more funny things while the interpreter follows the bytecode. It collects various implementations for the same user types, which are based on some lower-level abstractions.

The abstractions that are used are explicitly described in each file, for example: N-bit or machine-sized signed or unsigned integers; memory blocks with explicit management; bitfield objects; various hints (e.g. "this implementation should not be used with refcounts because it has many circularities"); or even which "RPython level" we use (if we develop more and more complex translators), including what basic type operations we allow.

Then each low-level abstraction itself has a reference implementation in Python, like Christian's r_int; this should be put in a file that can easily be associated with alternate implementations of the same concept, the ones that can be used by the translator (like the description of the C 'int' type and the tedious complexities of overflow detection in C). These low-level implementations can also explicitly depend on other low-level implementations: for example, "C lists with stored length" could be implemented in terms of malloc-style memory blocks -- as opposed to "lists as Java arrays", which don't go lower-level because the length is always stored in Java arrays.
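To make the idea of a "reference implementation in Python" of a low-level abstraction more concrete, here is a minimal sketch of an N-bit signed integer with overflow detection. The class name `SignedIntSketch` and its interface are assumptions made for illustration; this is not Christian's actual r_int.

```python
class SignedIntSketch(object):
    """Hypothetical reference implementation of an N-bit signed integer."""

    def __init__(self, value, bits=32):
        self.bits = bits
        self.minval = -(1 << (bits - 1))
        self.maxval = (1 << (bits - 1)) - 1
        # Python integers are unbounded, so the range check is explicit.
        if not self.minval <= value <= self.maxval:
            raise OverflowError("%d does not fit in %d signed bits"
                                % (value, bits))
        self.value = value

    def _check_same_width(self, other):
        if other.bits != self.bits:
            raise TypeError("mixing integers of different widths")

    def __add__(self, other):
        self._check_same_width(other)
        # Compute exactly, then let the constructor detect overflow.
        # A C backend would have to test for overflow without a wider type.
        return SignedIntSketch(self.value + other.value, self.bits)

    def __mul__(self, other):
        self._check_same_width(other)
        return SignedIntSketch(self.value * other.value, self.bits)

    def __repr__(self):
        return "SignedIntSketch(%d, bits=%d)" % (self.value, self.bits)
```

Running under plain CPython, such a class lets higher-level code be tested for overflow behaviour before any translator exists; a translator would replace it with the matching C-level description.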
"Above" all these files, we have several compliant object spaces, whose purpose is to link a selection of implementations together. For testing purposes object spaces should be easy to build and run in Python, thanks to the Python reference implementations of low-level abstractions. On the other hand, the "C object space" links some or all the files which are (indirectly) based on C-implemented low-level abstractions only; this is what the translator will input. Which files an object space depends on should be specified declaratively, in such a way that we can actually use this information to implement the "wrap", "newlist", "newstring",... methods of the object space. These are the functions that build W_XxxObject instances of the basic types from scratch. Let me stress again that the maximum flexibility seems to be to allow each concept to be implemented in possibly several ways above lower-level concepts, including new ones if the existing ones don't suit you. The arguments on the mailing list show nicely enough that there is just no single better way to implement things, so we should just allow them all to coexist. After that, *other* files are responsible for putting the pieces together -- explicitely (listing the files to include), implicitely (automatically searching for implementations that meet some requirements), or even dynamically at run-time to optimize the performances (not easy, but at least it should be *possible*)!  is a departure from one of Python's moto: "there is one and only one obvious way to do something". Well, I assume that I won't shock people around here if I say I never liked this one :-) A bientôt, Armin.
Armin Rigo wrote:
As I won't read my mail during the whole next week I thought it might be helpful if I attempted some kind of summary. It collects numerous ideas posted here (recently or not) and a bit of personal feelings. I hope it might help in getting some distance over what we are actually trying to do with our "standard object space". I am afraid it is all very vague, but should be kept in mind before we actually design something new there.
[snipped all the goodies]

Thanks for the nice summary. I see that you might have different opinions, but you accept that there is the possibility to allow different opinions to coexist, since we are not aiming for the one-size-fits-all concept that CPython is restricted to by nature. I think this is a wise decision, since it allows us to continue without having to fight about peanuts:

Assuming that I insist on naked memory arrays, neglecting any built-in array operations, I can build a worthwhile list implementation. At the same time, you may prefer to assume more builtin flexibility and implement arrays in Java style, either on top of my primitive assembly-like layer with ease, or just by ignoring it. Both ways are OK for me. I would just like to write my implementation now, in the way I think about it, and I'm open to either wrapping it later on top, or to modifying it for "the new standard" or whatever. I just don't think that this will change the "meat of the implementation", which is not just wrappers around builtin lists, but implements hairy stuff like sorting, dynamic allocation size decisions, and more.

Well, because you are not available next week, please let me ask some questions, in order to avoid too much diversification and to reduce later work on the way towards the not yet well-defined target:

Assuming I would implement a tuple/list type, what would you prefer: an implementation that uses lists just to denote a malloc'ed piece of memory, therefore explicitly maintaining actlen and maxlen, or more of the notation of a Java array (which is my guess)? Which are the allowed basic operations assumed to be available?

If there were two different implementations, one in terms of plain memory chunks, and one in terms of Java arrays, how would they be split? Is there one interface file in std, which configures itself by some global options and some local include, or would there be a std_c and a java folder?
As Stephan pointed out in private emails, he actually is in favor of not doing a long implementation at all, but to link against an external library. This is personal taste which we should provide space for. Where do you think this space should live? And if I prefer to do "real longs", since I'm the earth-bound, hack-all-or-nothing guy, where should I put my stuff? Should I create more specific folder names? Do the STD objects generally turn themselves into just interfaces, which redirect to something else by some means of config files?

Assuming that I'm going to implement some basic string type, do you prefer to have this done using builtin Python strings? Of course, I'd use them for constants, like the C strings, but then, when I implement them as basic types, I'm thinking of building them either upon array.array("c","abc") (less likely, since array is not favored by Guido, and seems not to support unicode), or upon some restricted list class (ouch!), again. This list class would be meant as a plain memory interface, allowing no list-level operations like "+", len, append or such, just creation, indexing and element assignment, and the implementation would pretty much resemble the current C implementation. So the underlying primitive list would be restricted to hold single chars and None only.

This is, as stated in earlier, less coherent emails, just a way to express what the primitive type *cannot* do, in order to let the restricted base list object be interpreted with error checking by CPython, *and* be an easy target for a simple compiler to C. The W_Stringobject class would take care of everything a string can do, by providing every string function that CPython has as well. If I implement things this way, I think it will be more than easy to generate C code, even for a simple mind like mine.
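The restricted list described above might look roughly like the following sketch: a block that supports only creation, indexing and element assignment, holds single chars or None, and leaves everything else (including the length) to the string class built on top. The names `CharBlock` and `W_StringSketch` and their methods are assumptions for illustration, not the implementation being proposed.

```python
class CharBlock(object):
    """Hypothetical plain-memory interface: no len, no '+', no append."""

    def __init__(self, size):
        self._cells = [None] * size

    def _check_index(self, index):
        if not 0 <= index < len(self._cells):
            raise IndexError("block index out of range")

    def __getitem__(self, index):
        self._check_index(index)
        return self._cells[index]

    def __setitem__(self, index, char):
        self._check_index(index)
        # Only single characters or None may be stored.
        if char is not None and not (isinstance(char, str) and len(char) == 1):
            raise TypeError("only single chars or None allowed")
        self._cells[index] = char


class W_StringSketch(object):
    """String object that keeps its own length, as C code would."""

    def __init__(self, length):
        self.length = length          # the block itself cannot report it
        self.block = CharBlock(length)

    @classmethod
    def from_constant(cls, text):
        w_str = cls(len(text))
        for i in range(len(text)):
            w_str.block[i] = text[i]
        return w_str

    def concat(self, w_other):
        # Allocate a new block and copy both operands element by element.
        w_result = W_StringSketch(self.length + w_other.length)
        for i in range(self.length):
            w_result.block[i] = self.block[i]
        for j in range(w_other.length):
            w_result.block[self.length + j] = w_other.block[j]
        return w_result

    def unwrap(self):
        return "".join([self.block[i] for i in range(self.length)])
```

Because `CharBlock` defines neither `__len__` nor `__add__`, CPython itself rejects `len(block)` and `block + block`, which is the error checking the restriction is meant to provide.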
If you are uneasy with my way of tackling this, I have no problem if somebody rewrites the stuff in a slightly higher-level style, or if (s)he wraps it up, claiming "this is the Java level implementation". No problem. But allowing for this allows me to write what I want to write, right now. All I'm asking for is where I should put it: in which directory, with which file naming conventions. I think it's more efficient to just let me keep going in some way than to let me get stuck in some general interface/principle discussion, which is neither my point nor my strength.

There is another issue concerning builtins and similar stuff, for which we also still have no coherent interface, and I'm going to propose a simple-minded approach, just to get people like me started. Will write that up in another thread.

thanks for your patience with the ancient hacker -- chris

-- 
Christian Tismer             :^)   mailto:email@example.com
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/
Hello Christian,

In short, write whatever implementation you like most. Using lists as very dumb malloc-style blocks of memory is fine. As you said, we can always add intermediate layers later if necessary.

On Sat, Mar 01, 2003 at 05:59:38AM +0100, Christian Tismer wrote:
Which are the allowed basic operations assumed to be available?
Only the ones for which you can easily think about a C translation. For example, the list += operator could represent the realloc() primitive.
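As a rough illustration of this rule, the sketch below uses a Python list purely as a memory block, tracks actlen and maxlen explicitly, and resizes only through "+=", which stands in for realloc(). The class name `W_ListSketch`, the initial size of 4 and the doubling policy are all assumptions made for the example.

```python
class W_ListSketch(object):
    """Hypothetical list object over a raw block with explicit lengths."""

    def __init__(self):
        self.actlen = 0                    # number of slots in use
        self.maxlen = 4                    # number of slots allocated
        self.mem = [None] * self.maxlen    # plays the role of malloc()

    def append(self, w_item):
        if self.actlen == self.maxlen:
            # "+=" is the only resizing operation used on the block;
            # a translator could map it to realloc().
            grow = self.maxlen             # doubling is an assumed policy
            self.mem += [None] * grow
            self.maxlen += grow
        self.mem[self.actlen] = w_item
        self.actlen += 1

    def getitem(self, index):
        # Bounds are checked against actlen, never against len(self.mem).
        if not 0 <= index < self.actlen:
            raise IndexError("list index out of range")
        return self.mem[index]
```

Every operation here is plain indexing, element assignment or the "+=" resize, so each has an obvious C counterpart.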
would they be split? Is there one interface file in std, which configures itself by some global options and some local include, or would there be a std_c and a java folder?
I don't know exactly, but I think we should put all this into the same std/ folder because the Java-style implementation can also be used to emit C given the proper translator, so that the boundaries are fuzzy. Maybe you can create subdirectories like std/malloc/ that contain the low-level concepts. In all cases write the dependencies explicitly (we'll see at that point which syntax could be used for that).
of not doing a long implementation at all, but to link against an external library. This is personal taste which we should provide space for. Where do you think this space should live?
This could be done in std/xxxlong.py if you are using the library xxx, for which we should also find a syntax.

Not sure about strings and builtins yet. The meat of the implementation of strings shouldn't change too much anyway, so I suggest you just try them with any low-level representation you like (strings are fine, if you manage never to need to mutate them).

A bientôt,

Armin.