Hello Christian,

Whoa, I am afraid this thread is going in the wrong direction! As Paolo points out, this is getting confused. There are already several different levels at which things are working, and Stephan and you are adding yet another one. I don't say this is necessarily wrong, but things are not very clear.

The W_IntObject class itself is supposed to be one possible implementation of Python's integers, described from a high-level point of view. The whole point is in that word "high-level". We (or at least I) want the objspace/std/* files to be translatable to whatever actual low-level language we like, and not only C -- this is why the bitfield metaphor is not necessarily a good thing.

When you talk about r_int and similarly r_float or even r_long wrappers, you want to make it clear that you mean a "machine-level" plain object. But actually that was the whole point of RPython. The C translation of all the interpreter-level code will produce machine-level "int"s. In other words there is no deep reason to use a special wrapping mechanism for ints, floats, lists, whatever, to implement the W_XxxObject classes, because we will need *exactly the same* low-level objects to implement just *everything else* in the interpreter! For example, when the interpreter manipulates the current bytecode position, it manipulates what looks like Python integers and strings:

    opcode = ord(self.bytecode[nextposition])
    nextposition += 1

but we want this code to be translated into efficient C code, too. The last thing we want is to have to rewrite *all* this code with r_ints!

So the *only* point of r_ints, as far as I can tell, is to have explicit control over bit width and overflow detection. There is no built-in Python type "integer with n bits" that generates the correct OverflowErrors. And there is no unsigned integer type, hence the need for r_uint. But apart from that, just use plain integers and it is fine!
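
To make that narrow role concrete, here is a minimal sketch of what explicit bit width and overflow detection could look like next to ordinary interpreter code that uses plain integers. The names checked_add, unsigned_add and next_opcode, the 32-bit width, and the wrap-around behaviour of the unsigned case are all assumptions made for illustration; this is not the actual r_int or r_uint code.

```python
# Hypothetical sketch: the helper names and the 32-bit width are assumptions.
BITS = 32
MIN_INT = -(1 << (BITS - 1))
MAX_INT = (1 << (BITS - 1)) - 1


def checked_add(x, y):
    """Signed add that raises OverflowError outside the BITS-bit range."""
    result = x + y
    if not (MIN_INT <= result <= MAX_INT):
        raise OverflowError("integer addition overflows %d bits" % BITS)
    return result


def unsigned_add(x, y):
    """Unsigned add; wrapping modulo 2**BITS is an assumed behaviour."""
    return (x + y) & ((1 << BITS) - 1)


def next_opcode(bytecode, position):
    """Interpreter bookkeeping written with plain Python ints and strings."""
    opcode = ord(bytecode[position])
    return opcode, position + 1
```

Only the first two helpers need to know about bit widths; next_opcode stays ordinary Python and leaves the choice of machine representation to the translator.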

Relying on plain Python objects is the whole purpose of writing our pypy in Python, isn't it? Creating r_long, r_float, r_list... looks like we are reinventing our own language.

The confusion that may have led to this point is that we never explicitly said what objects we would allow in RPython. Remember, RPython is supposed to be Python with a few restrictions whose goal is *only* to ease the job of type inference and later translation. But the *existing* high-level constructs of Python must be kept! If we don't use them, we are back again at the C level and there is no point in writing pypy in Python.

A couple of more specific points now...

On Wed, Feb 26, 2003 at 07:46:12PM +0100, Christian Tismer wrote:
> So, W_ListObject would have some fields like
>
>     self.objarray   # r_objectarray
>     self.maxlen     # r_uint
>     self.actlen     # r_uint
>
> It has been suggested to not use maxlen, since we could use len(self.objarray), but I believe this is wrong to do. If we are modelling primitive arrays, then they don't support len() at all.

I feel exactly the opposite. Just *use* real Python lists all the way, including most of their normal operations. This is nice because they describe high-level operations. In my view a W_ListObject would have only two fields:

* self.objarray   # a real Python list of wrapped items
* self.length     # the actual list length

with self.length <= len(self.objarray). The self.objarray list grows by more than one item at a time, to amortize the cost of adding a single element at a time, just like what the current CPython implementation does. All this works fine, is normal pure Python code, and can be expected to translate to an efficient C implementation in which lists are just translated into malloc'ed blocks of PyObject* pointers. The point is that we still have a nice Pythonic high-level implementation of W_ListObject but nevertheless can translate it into an efficient low-level implementation.

Another point I would like to raise is that we don't have to give types to every variable in pypy, like enforcing that a slot contains an r_int. This is just unPythonic. *All* types are expected to be computed during type inference. So don't even say that W_ListObject.objarray is a list of wrapped objects or Nones -- this can be figured out automatically.

...from Paolo's reply:
> In other words... is not the *W_IntObject with r_int* one of the *possible* choices that pypy can make for representing (to target code emission!) a plain python int (c)? Instead of r_int, for example, I can choose to use a tagged representation of an int, and write a t_int that checks the overflow and such problems (and where to check arithmetic?)...(d).

This is right. Consider for example the case of *two* different implementations (that would both appear as having the 'int' type to users). Say that one is like Christian's W_IntObject+r_int, and the other one can only encode small, tagged integers. The choice to use one or both representations in an actual C implementation must be made by the RPython-to-C translator, and not in the object space. For example, if we want to produce something very similar to the current CPython, then we have no use for small, tagged integers. The question is thus, "how do we express things to allow for this?"

Similarly, we may provide different implementations for lists, dictionaries, whatever; we may even consider that Python's "long" type is an unneeded hack, for long integers could be just another implementation for the 'int' type, which goes very much in the direction that Python seems to go with the recent automatic conversions of overflows to longs.

The original intent of classes like W_IntObject was "one class, one implementation", and I think that we must stick to that idea because these classes are what the multiple dispatch routines use. I don't have a clear and complete answer for the rest of the question "how do we express things to allow for this?".

I hope that this e-mail has clarified some points. Disagreement is welcome. I apologize to Christian and Stephan because it seems that we might have to reorganize the xxxobject.py sources, although I'm not sure yet how. In an effort to go in that direction I'd like to add that nothing has been done yet about:

* built-in methods (like list.append); the StdObjSpace.xxx.register() trick only works for built-in operators
* non-built-in operators and methods, e.g. implementing something like long_long_add in application-space (longobject_app.py).

A bientôt,

Armin.
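
Coming back to the two-field W_ListObject layout described earlier (a real Python list plus a separate length), a minimal sketch could look like the following. The method names append and getitem, the None padding, and the growth formula are assumptions chosen for illustration; only the two fields and the invariant self.length <= len(self.objarray) come from the discussion above.

```python
class W_ListObject:
    """Hypothetical sketch of a list object backed by an over-allocated list."""

    def __init__(self):
        self.objarray = []   # real Python list; unused slots hold None
        self.length = 0      # actual list length, <= len(self.objarray)

    def append(self, w_item):
        if self.length == len(self.objarray):
            # Grow by several slots at once to amortize repeated appends.
            # The exact formula is an assumption, not CPython's.
            extra = (self.length >> 3) + 4
            self.objarray += [None] * extra
        self.objarray[self.length] = w_item
        self.length += 1

    def getitem(self, index):
        # Bounds are checked against the logical length, not the capacity.
        if not 0 <= index < self.length:
            raise IndexError("list index out of range")
        return self.objarray[index]
```

Everything here is ordinary Python list manipulation, so nothing in the class has to state a machine-level type for its slots.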