From Oliphant.Travis at mayo.edu Wed Aug 9 18:18:02 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Wed, 9 Aug 2000 17:18:02 -0500 (CDT)
Subject: [Numpy-discussion] ANNOUNCE (Numeric): Some RPMS and a GIST interface I use regularly.
Message-ID:

Just letting interested people know that I've made RPMs of the Python interface to all of Lapack originally written by Doug Heisterkamp and modified by Konrad Hinsen (Konrad, you may have already made these --- please let me know if you have).

(Under Numerical Python RPMS at)
http://oliphant.netpedia.net

Also, I've released a module I use regularly for interactive plotting with Gist. It makes it much more "MATLAB" like:

    from Mplot import *
    mplot(x,y,'r-',x2,y2,'b:')
    legend(['Line1','Line2'])  # will prompt you for a place to put it.
    title("Title")
    xlabel("some label")
    ylabel("other label")

You can also do subplots with the module using gist.plsys to change between subplots.

I use it every day and thought others might be interested in it. It's not well documented at this point, though.

Needed files:
http://oliphant.netpedia.net/packages/Mplot.py
http://oliphant.netpedia.net/packages/write_style.py

The latter file allows one to construct a gist style file from a Python nested dictionary, which is needed for changing the color and style of the axis system.

-Travis

From jhauser at ifm.uni-kiel.de Mon Aug 14 08:43:13 2000
From: jhauser at ifm.uni-kiel.de (Janko Hauser)
Date: Mon, 14 Aug 2000 14:43:13 +0200 (CEST)
Subject: [Numpy-discussion] VSIPL?
Message-ID: <14743.59745.249584.750915@ifm.uni-kiel.de>

Just by chance I found the link to an ANSI-C library which looks interesting in the light of the reimplementation of NumPy. The link is:

http://www.vsipl.org/

I cannot decide whether the design decisions made by the authors of this library or this spec are good or not. But they mention a lot of topics that come up on this list from time to time, like different views of the data space, gather/scatter, interfacing to Fortran.
Are there people on the list who know this library and the possible problems or merits of it? This library was used as a positive design example in a discussion about the GSL.

Regarding the redesign, I often get confused, because there are so many libraries which all implement their own vector/matrix/tensor definitions, so the question is whether the new NumPy should have more characteristics of an interface, so that the underlying numeric engine could be changed. Would it help in such a case to wrap a library as a first proof of concept?

'ly __Janko

From humberto at hpcf.upr.edu Tue Aug 15 14:32:40 2000
From: humberto at hpcf.upr.edu (Humberto Ortiz)
Date: Tue, 15 Aug 2000 14:32:40 -0400
Subject: [Numpy-discussion] ANNOUNCE (Numeric): Some RPMS and a GIST interface I use regularly.
In-Reply-To: Your message of "Wed, 09 Aug 2000 17:18:02 EST."
Message-ID: <200008151832.OAA16266@mail.hpcf.upr.edu>

Oliphant.Travis at mayo.edu said:
> Also, I've released a module I use regularly for interactive plotting
> with Gist. It makes it much more "MATLAB" like:

These use your gist version 11 rpms, right? What's the status of the gist package in the Release 15 series? I've got some arrays in numpy that I'm writing to a text file and plotting in yorick.

--
Humberto Ortiz Zuazaga
Visualization Specialist/Programmer
UPR High Performance Computing facility
http://www.hpcf.upr.edu/

From Oliphant.Travis at mayo.edu Tue Aug 15 14:41:10 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Tue, 15 Aug 2000 13:41:10 -0500 (CDT)
Subject: [Numpy-discussion] ANNOUNCE (Numeric): Some RPMS and a GISTinterface I use regularly.
In-Reply-To: <200008151832.OAA16266@mail.hpcf.upr.edu>
Message-ID:

>
> Oliphant.Travis at mayo.edu said:
> > Also, I've released a module I use regularly for interactive plotting
> > with Gist. It makes it much more "MATLAB" like:
>
> These use your gist version 11 rpms, right? What's the status of the gist
> package in the Release 15 series? I've got some arrays in numpy that I'm
> writing to a text file and plotting in yorick.

There have been no changes that I know of to the gist interface since Release 11, so those are current. Eventually the RPMs will need to be recompiled, but that has not happened yet.

Writing to a text file sounds a little bit indirect, but if it works for you...

Best wishes,

-Travis

From hinsen at cnrs-orleans.fr Mon Aug 21 11:01:20 2000
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Mon, 21 Aug 2000 17:01:20 +0200
Subject: [Numpy-discussion] VSIPL?
In-Reply-To: <14743.59745.249584.750915@ifm.uni-kiel.de> (message from Janko Hauser on Mon, 14 Aug 2000 14:43:13 +0200 (CEST))
References: <14743.59745.249584.750915@ifm.uni-kiel.de>
Message-ID: <200008211501.RAA01633@chinon.cnrs-orleans.fr>

> Just by chance I found the link to an ANSI-C library which looks
> interesting in the light of the reimplementation of NumPy. The link is:

Is anyone working on a reimplementation?

> data space, gather/scatter, interfacing to Fortran. Are there people
> on the list who know this library and the possible problems or
> merits of it? This library was used as a positive design example in a
> discussion about the GSL.

Not me, but at first sight it does look reasonable. The main benefit of using this in NumPy would be the possibility of substituting optimized implementations for the reference implementation in C.

> definitions, so the question is whether the new NumPy should have more
> characteristics of an interface, so that the underlying numeric engine
> could be changed. Would it help in such a case to wrap a library as a

To be useful in C modules, at least the data layout must be documented and stable. Given that there are at least two popular layouts (C style and Fortran style), it is difficult to accommodate all existing array libraries.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From Oliphant.Travis at mayo.edu Tue Aug 22 14:37:41 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Tue, 22 Aug 2000 13:37:41 -0500 (CDT)
Subject: [Numpy-discussion] History and Why NumPy2 (long)
Message-ID:

Greetings to all interested in Numerical Python,

My purpose in writing this somewhat long post is to inform interested parties as to where NumPy is going and how far it has gone. I'm doing this in order to coordinate interest and try to summarize some of the recent conversations I've had with other interested people.

There are a significant handful of people who are very interested in where Numerical Python is going. All of these people are very bright and have distinct desires for the future of Numerical Python which come from quite diverse experience. This intelligence and diversity brings tremendous strength (both current and potential) to the community and has made Numerical Python an extremely useful tool.

Of course, these benefits do not come cheaply: there is quite a bit of disagreement about how things should be done --- mostly due to the fact that people use Numerical Python for different things. Fortunately, this disagreement is not insurmountable provided people are willing to compromise a little syntactic sugar here and there.

Numerical Python users have been enjoying the flexibility and power of the underlying programming language for several years.
The price we must pay for using a language that is not wholly dedicated to Numerical pursuits is that we must cooperate with other users of the core language who have interests entirely different from our own. Since Numerical programming is rarely "strictly numerical," what we gain is access to the work they do in improving Python's stock of library tools.

When I was introduced to the Numerical Python system, some of the results of this compromise were a little annoying to me --- somewhat like the whitespace rule. What I found, however, was that my annoyance gave way to elation as I realized that the non-numeric objects and toolkits were extremely beneficial to me in my numeric work: regular expressions, serving graphs from a website, writing translators for various files and formats, etc.

With that introduction, I'll give a brief history of Numerical Python (please forgive me if I have neglected important contributors).

Numerical Python started from the work of Jim Hugunin (which he used as part of his Oral Examination at MIT). He posted an announcement of his proposal in August of 1995 based on the Matrix Object previously presented by Jim Fulton.

Early discussions of the work can be found at http://www.python.org/pipermail/matrix-sig/ which presents very interesting reading since many of the topics people still talk about were hashed out even back then. Konrad Hinsen, Paul Dubois, David Ascher, and Jim Fulton were all early contributors. Jim Fulton's work and connections to Guido van Rossum enabled many of the early changes (extended slicing, complex numbers, ellipses) to get into Python itself. Guido was also part of the early discussion. Konrad Hinsen contributed a significant amount of code to the current version of Numeric Python as well.

Jim Hugunin released version 0.2 in December of 1995 and followed the release-early, release-often model for several months to get Numerical Python into a working state.
It is obvious that he spent many hours writing code (time which NumPy2 contributors have not been able to duplicate). One thing that led to some stall in Numerical Python's development is that Jim Hugunin left the project to concentrate on JPython. Paul Dubois picked up the task of project administrator and has done an admirable job, including securing resources to get the current documentation written. David Ascher wrote the bulk of that important resource.

Personally, I started using Numerical Python after scouring the Net for something to replace MATLAB, which had become burdensome for me under the weight of large data volumes and inefficient memory handling. I started using Numerical Python in the Spring of 1998 (a relative late-comer) but I have used it actively ever since. I started releasing packages at that time to increase the number of toolboxes available to the Numerical Python programmer, as I was quite happy with the language itself (after I got over the initial annoyances). I've released many pieces of code since then which I personally use quite regularly. Most of these can be found at http://oliphant.netpedia.net

Naturally my contributions have been in areas where I had a personal need, but they have enabled me to understand the Numerical Python source code enough to feel confident in modifying it.

With that bit of history let's get into why NumPy2:

Guido van Rossum has expressed willingness to include multidimensional arrays in the Python core. The source of this willingness appears to be a general respect for the community of users who use Python for Numerical programming (although he himself is not one of those users). There is already a useful one-dimensional array object distributed with Python which, however, does not support any operations. Some of its features were borrowed for the current Numerical Python.
Last year, I suggested that the PIL and Numeric Python work more closely together (since an image is conceptually just a 2-D (or 3-D for color) Numeric Python array). /F from pythonware responded by saying that until Numeric Python was a part of Python itself he saw no reason to modify the PIL.

I took the bait and, after pondering why 4 years had elapsed without Numerical Python getting into Python itself, I contacted Guido and Paul to start the ball rolling.

Guido's response was that those familiar with the code said it was too ugly and unwieldy to put into Python. The code is just too hard to modify and understand. Evidently, since there are only a handful of people of the hundreds that use it who submit bug patches or feature enhancements, this must be true. Those who do understand how it works have a hard time finding time to make needed changes --- the intrinsic cooperation problem with volunteer time that is not funded (or contributed to) by those who make use of the results.

Guido was kind enough to provide me with some design documents for an implementation of multidimensional arrays that he had worked out.

Thinking I would be in graduate school for longer than I am going to be, I set about trying to clean up Numerical Python with the intent of getting it into the Python core. As part of this effort I conducted a survey of current Numerical Python users to find out their interests. The survey and its results are available at the sourceforge site for Numerical Python. Basically the results indicate that most people agree on some important features (like arbitrary indexing into arrays), but disagree on some details (copy vs. reference and automatic casting rules being the most memorable).

While the results were useful, a simple comment made by one of the survey participants made a significant impression on me: "the C-code is too inflexible and hard to change."
This is essentially the problem that Paul Dubois had identified and which was keeping Numerical Python out of the Python core.

At the same time I had been doing some work with implementing a sparse matrix package for Python by wrapping some compiled C and Fortran code into a Python class I'd constructed. The results were very encouraging and made me realize that the same technique could be used to make Numerical Python much more flexible and easier to extend while retaining its significant speed benefits.

I decided to make a new implementation of Numerical Python where the underlying objects (the array and ufunc objects) are not extension types but true Python classes. This would allow significant benefits in terms of flexibility and modifiability with a small memory-overhead loss and an indeterminate speed change (it will likely be faster under some usages and slightly slower in others). I also wanted to add more types (unsigned types, boolean, and potentially others).

While making this change, I realized that another way out of the "type-class" dichotomy (along with ExtensionClasses) is to not make new types at all. If all types were really ExtensionClasses and all new types had to be as well, this could effectively solve the problem from the Python user perspective as well.

A noble effort at making Numerical Python an Extension Class was undertaken by David Ascher last year. His work became the ill-fated Numerical Python 12. I rather liked his work, but there were some very hard to trace bugs in the implementation, and the C-code was still hard to modify. Another problem (that must be dealt with in the new implementation as well) is the significant amount of code that has been written to the old C-API.

This finally brings us to the state of Numerical Python. I've been working on this implementation on and off for six months (mostly off), but have worked out many of the design details.
Since my time is currently limited for the next 3 months, I wanted to let others know of the status to encourage involvement. We have a window here to get this next version of Numeric into Python 2.1, but the window will probably close sometime in January, so there is some urgency.

In the next installment, I will outline the design of Numerical Python 2 and some of its goals.

-Travis Oliphant

From Oliphant.Travis at mayo.edu Tue Aug 22 17:06:06 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Tue, 22 Aug 2000 16:06:06 -0500 (CDT)
Subject: [Numpy-discussion] NumPy2 design
In-Reply-To:
Message-ID:

As a followup to my previous post, here is a discussion and overview of the current NumPy2 design as I have it in my head and partially implemented in the numpy2 module on the CVS tree at numpy's sourceforge site.

The design of NumPy2 is quite simple and tries to balance speed with flexibility and modifiability. An outline of the design follows (the names can change, they are only for reference at the moment).

Three classes replacing the current C structures:

ArrayType --- replaces PyArray_Descr   # not implemented yet
Ufunc     --- replaces PyUfuncObject
NDArray   --- replaces PyArrayObject

The purpose of each class is to encapsulate interfaces to allow code re-use for similar operations.

NDArray   # This is implemented except for the
          # operations (ufuncs)
========================

The most concrete class is the NDArray class --- it just needs coding to make it happen. The other classes still need some design work to efficiently handle mixed type operations and additions of new types to the system.

The NDArray base class gives an N-Dimensional array interpretation to a Python buffer (a segment of memory, an m-mapped file, a PIL image, etc.). It provides this interpretation with the following special attributes:

self.rank --- the dimension of the array (hard-coded changeable limit of 10).
self._data --- a buffer object pointing to the data
self._structure --- a buffer object pointing to an array of INTEGERS which holds the dimensions and strides information (INTEGER is a platform-dependent type #defined in compiled code)
self._descr --- Python class describing the type.

To interact seamlessly with the C-API and be recognized as "an array", all subclasses must either export an __array__ method which creates a suitable NDArray or not interfere with these provided attributes. Note that the same data segment can be viewed in several different ways.

The NDArray will have default implementations for the numeric operations that will resemble the current implementation. But it will be easy to subclass the array to handle these operations as you'd like without losing the ability to use the data in that array in extension modules which assume array inputs.

Two other attributes are worth mentioning:

self.CONTIGUOUS   # this can be determined from the _structure information
                  # but it is useful to keep a flag around indicating the
                  # status. This tells you whether or not you can
                  # walk through the entire array an element at a
                  # time with a single for loop.

self.FORTRANVIEW  # This basically indicates how the array will view its
                  # shape when asked and indexed (This does not change the
                  # _structure information). An array of "shape"
                  # (10,3,5,7) when FORTRANVIEW is 0 will be an array
                  # of "shape" (7,5,3,10) when FORTRANVIEW is 1

ArrayType:   # This is not implemented yet.
===============================

This class is to replace the PyArray_Descr structure in current Numerical Python. As a result, it must contain the information:

self.name ---- some kind of object to identify it (a string)
self.elsize ---- size of an item of this kind.
self.cast ---- a dictionary of compiled functions with at least one entry called to cast this type to at least one other type
self.getitem ---- (Compiled) function
self.setitem ---- (Compiled) function
self.zeros ---- needed for the zeros command to include arrays of Python Objects. It points to the representation of zero for this type.

Currently, the above is not implemented yet. What is implemented is a module _arraytypes which exposes to Python the PyArray_Descr structure so that it can be used. The idea of adding new types to Numerical Python without having to change all of the code is appealing to me, however.

Ufunc:   # This is partly implemented
==========================================

There are two ideas here. I've partly implemented the first one, which I'll explain. The second was presented to me by Paul Barrett.

Ufuncs are encapsulations of the N-D looping construct and the broadcasting rules of Numerical Python. The N-D looping construct is limited to the fixed but arbitrary 10 dimensions as given above for C code but can be arbitrary if a Python function is called at each iteration.

I explain Ufuncs in a piece called Ufuncexplain.txt which is on the CVS tree. Here is a quote that explains broadcasting rules:

1) If input arrays do not have the same rank, the array with lower rank will be prepended with ones until ranks agree.

2) If an input array has length one along a dimension, then "duplicate" the elements along that dimension so that input shapes agree.

Example:

A is an array of shape (10,)
B is an array of shape (3,10)

A * B will return an array of shape (3,10):
  - A is interpreted as shape (1,10)
  - the columns of A are "broadcast" across the rows of B

Thus the output is (3,10): [ A*B[0]
                             A*B[1]
                             A*B[2] ]

Note that A is not actually extended to a (3,10) array. It merely behaves as if it had been.

The element-wise math operators are implemented using Ufuncs.
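
The two broadcasting rules quoted above can be sketched in a few lines of plain Python that work on shape tuples only. This is an illustration, not code from the numpy2 module: the name broadcast_shape is hypothetical, and raising ValueError for incompatible shapes is an assumption, since the quote does not say what happens in that case.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the output shape for two input shapes under the two rules above.

    Hypothetical helper for illustration; it only computes shapes and never
    touches array data.
    """
    rank = max(len(shape_a), len(shape_b))
    # Rule 1: prepend ones to the lower-rank shape until the ranks agree.
    a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    b = (1,) * (rank - len(shape_b)) + tuple(shape_b)
    result = []
    for n_a, n_b in zip(a, b):
        # Rule 2: a length-one dimension behaves as if it were duplicated
        # to match the other input along that dimension.
        if n_a == n_b or n_b == 1:
            result.append(n_a)
        elif n_a == 1:
            result.append(n_b)
        else:
            # Assumption: mismatched lengths are an error.
            raise ValueError("cannot broadcast %r with %r" % (shape_a, shape_b))
    return tuple(result)


# The example from the quote: A has shape (10,), B has shape (3,10).
print(broadcast_shape((10,), (3, 10)))  # (3, 10)
```

As in the quote, nothing is copied here: the shape (10,) is only treated as (1,10) while the output shape is worked out.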
SpecialFuncs is a Python package at http://oliphant.netpedia.net that implements a whole range of special functions using the Ufunc formalism. I also include in that package a general arraymap function which can turn any Python function into a broadcasting "ufunc-mimic". This code does not have the rank 10 limit on the number of dimensions on the inputs -- but it might be slower than the current implementation.

My current implementation assumes that the Ufunc instantiator will provide two functions: a select function and a compute function (either of these can be in C or Python).

The select function and the compute function work together. The select function determines the type of the outputs based on the input types, while the compute function takes the inputs and outputs (and their types) and computes the output. This is done on either an entire block of memory (optimized ufuncs) or one-element-at-a-time (unoptimized ufuncs --- Python coded ufuncs for example).

This allows for efficient coding and the possibility of mixed type arithmetic with a more complicated creation process. It also may be hard to add new types and have them function as you'd like without modifying other already-defined ufuncs. But I know this idea will work and I can see my way through it. I've already implemented an "addition" function using this method.

Another idea that has been presented is to instantiate a Ufunc with only one function that is entered into a "dispatch table" or dictionary of functions keyed by the ArrayType class, much like the Multimethod approach that has been discussed. I like this idea, but I do not see the details (I haven't thought about it too much) and do not know if we can actually make it work --- Frankly, I think we can and it will result in a better system. What hasn't been thought through is exactly what is entered into the "function" dictionary and when it is called.
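
To make the "dispatch table" idea concrete, here is a minimal sketch under loud assumptions: the class DispatchUfunc and its register method are hypothetical names, built-in Python types stand in for the not-yet-implemented ArrayType class, the functions act on scalars rather than array blocks, and raising TypeError for a missing entry is just one possible choice for a question the post leaves open.

```python
class DispatchUfunc:
    """Hypothetical ufunc-like object that picks its compute function
    from a dictionary keyed by the types of its inputs."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # maps a tuple of input types to a compute function

    def register(self, types, func):
        # Enter one function into the dispatch table.
        self.table[tuple(types)] = func

    def __call__(self, *args):
        key = tuple(type(arg) for arg in args)
        try:
            func = self.table[key]
        except KeyError:
            # Assumption: a missing entry is an error. Casting to a
            # registered combination would be another option.
            raise TypeError("%s: no entry for types %r" % (self.name, key))
        return func(*args)


add = DispatchUfunc("add")
add.register((int, int), lambda x, y: x + y)
add.register((float, float), lambda x, y: x + y)
add.register((int, float), lambda x, y: float(x) + y)

print(add(2, 3))    # 5
print(add(2, 0.5))  # 2.5
```

The table here has only three entries, which also shows why such a table will generally be sparse.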
Some have suggested that it is called "immediately" upon ufunc call, but this would eliminate the benefits of the encapsulated broadcasting rules. An alternative would be to call it after the "broadcasting rule encapsulation" has been done. In other words, use it as a replacement for the current array of functions in the Ufunc implementation. With the appropriate mix of C-modules and Python code I think this could be done quite elegantly.

The other issue that has to be worked out (again) is that obviously this table will not be filled out in every case (for a function with 10 inputs and 10 outputs with 16 types we are talking 16^20 different entries in the table, and it will be sparse), so what is done when there is not an entry for a particular combination will require some thought. I think we could allow multiple behaviors according to how some attribute of the ufunc (casting, exception raising, etc.) is set.

Many people might fear this would inhibit code re-use but I have not seen convincing examples.

So, that is a brief overview of the state of things. It doesn't try to cover everything, but it should give you enough of a perspective to understand the code that is on the CVS tree under module numpy2.

I have been using C for the compiled code because it is easy to interface to, it has the widest platform support, and Python itself is written in C.

Anybody with specific questions (including offers to help) can feel free to contact me or post to this list.

Thanks,

Travis Oliphant

From Oliphant.Travis at mayo.edu Wed Aug 23 11:38:26 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Wed, 23 Aug 2000 10:38:26 -0500 (CDT)
Subject: [Numpy-discussion] Explanation of some terms
In-Reply-To:
Message-ID:

For the benefit of those who may be unfamiliar with ways to add new functionality I will try to briefly summarize. More information can be found in the documentation and in the books that have been written about Python.
There are two (three) ways to add a new object to Python: using an extension type and defining a class. The fact that there are two distinct ways to add new objects is often called the type-class dichotomy. It is a goal of Py3K to somehow eliminate this distinction. Another way to add new behavior that I'll explain is to make the type an "extension class." Making the type a "subtype" of this fancy type gives a possible direction for unifying types and classes.

Types
===============================

"Types" are more fundamental to the language and must be added using compiled code (all of the types I've seen are in straight C since you don't really buy anything by using C++ as Python itself is written in C). You can investigate the type of an object from within python by using the command type:

>>> type(a)  # prints the "type" of object a

There are many types defined in the Python core such as integers, floats, complex, lists, tuples, dictionaries, etc.

Python allows you to make new types. These must be made in C (maybe C++ but again I don't think the extra complexity buys you anything since Python is in C). A new type is a PyTypeObject basically filled with function pointers and arrays of function pointers to handle the various operations one might do on the new type. This PyTypeObject is coupled with a C-structure containing the "data" for the new type. This data C structure lists PyObject_HEAD as its first member and then whatever other data is necessary.

Making a new type is thus a matter of creating these two C structures and filling in the TypeObject table with function pointers to handle various operations (getting and setting attributes, treating the type as an abstract number, sequence, or mapping, or printing the object).
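
As a small illustration of the type command mentioned above, the following sketch lists the type of a few core objects. It assumes a current Python 3 interpreter, where the printed names differ in form from what a 2000-era interpreter showed; the variable name samples is just for this example.

```python
# A few objects of the core types named above.
samples = [1, 1.5, 2j, [1, 2], (1, 2), {"a": 1}]

for obj in samples:
    # type(obj) is the type object; __name__ is its short name.
    print(repr(obj), "->", type(obj).__name__)
```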
Python has an abstract object interface on the C level, so that if a type has a "number" interface (operations) it can be used like a number, if it has a "sequence" interface it can be indexed like a sequence, or if it has a "mapping" interface it can be indexed like a dictionary.

Classes
======================

A Python Class is at the C level just another "type." There are actually two "types" associated with a Python class: an instance type and a class type. An instance of a class is the instance type. So every instance of any class has the same "type."

What this means on a C level is that there is one more layer of indirection for each "operation" in Python when the type is "class". The Python interpreter goes through the "class type" to see what to do and finds the appropriate C function from that PyTypeObject Method table. This C function does a dictionary lookup using the special method names and executes the Python function associated with that name for the particular instance (which may call back into a compiled extension module to do the actual work).

This level of indirection gives a great deal of dynamic flexibility since classes can be subclassed and attributes can be added dynamically, but there will be a performance hit which won't be noticeable except inside Python iteration loops.

So in reality there is no "type"-"class" dichotomy. Everything is a type. It's just that classes are dynamic types which allow you to define Python functions to implement the "method table." The reason for the dichotomy is that classes are so useful, that people really like them, and use them quite a bit, so that the other static types seem quite rigid in comparison.

Extension Classes
==================================

This is another fancy, dynamic "type" not distributed in the Python Core but developed by Digital Creations (the Zope people) in order to let C programmers "subclass" types.
I'm not an expert on these as I've never really used them, but as far as I can tell they bring the idea of "dynamic types" to the C programmer. This is accomplished by making all types just subtypes of the extension class "type". One way to understand the result is by understanding what the type command tells you about your new "extension class". It will tell you that it's of type "extension class." So, dynamic typing is again implemented with another layer of indirection where the fixed special C functions of the extension class "type" call out to your particular set of registered C functions. The difference is that the indirection is all handled in C.

So those are the choices for implementing new behavior in Python.

Currently, Numerical Python is implemented as a new "type" which defines all of these interfaces. The mapping interface handles "extended slicing," the "sequence" interface allows the array to return something when len() is called for example, and the "number" interface implements the operators. Actually, two new types are defined: a "ufunc" type and an "array" type. All of the operators are implemented as instances of the "ufunc" type. The "ufunc" essentially encapsulates the "casting and broadcasting" rules associated with elementwise operations. The ufunc is not well-understood by most non-developers I've talked to since most people don't instantiate their own ufuncs (which must be instantiated in C).

The code works and is fast, but it can be hard to extend and there are pieces that are poorly documented and hard to understand. For example, nobody has reworked the "extended slicing" syntax to enable arbitrary-index slicing, despite many people who would like that feature (actually I've heard that John Bernard did finally write some code to do that but I've never seen it and it's not there now).
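
The three interfaces named above can be seen from the Python side with a small class. This is only a sketch, assuming a current Python 3 interpreter; Probe is a hypothetical stand-in that records how it is used and is not the Numerical Python array type, which implements these interfaces in C.

```python
class Probe:
    """Hypothetical stand-in for an array type; it only reports how it is used."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def __getitem__(self, index):
        # "mapping" interface: an extended slice arrives as one index object.
        return index

    def __len__(self):
        # "sequence" interface: here, the length of the first dimension.
        return self.shape[0]

    def __add__(self, other):
        # "number" interface: the + operator.
        return "add(%r, %r)" % (self.shape, other.shape)


a = Probe((10, 3, 5))
print(a[1:3, ..., ::2])  # (slice(1, 3, None), Ellipsis, slice(None, None, 2))
print(len(a))            # 10
print(a + a)             # add((10, 3, 5), (10, 3, 5))
```

The tuple of slice objects and Ellipsis is what an array implementation has to interpret in order to support extended slicing.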
As mentioned before, David Ascher made the necessary changes to make Numerical Python of type "extension class" which, among other things, allowed the type to be "subclassed" from within Python. I thought this was a nice solution and we'd have to hear from him as to what went wrong. The only trouble I had with it is that the C-API changed slightly in that Arrays were no longer of type Array_Type and code that depended on it would break (the same is true of any redesign making Python arrays a class). We'd have to hear from him as to what other problems he saw. It still doesn't solve the problem of maintainability of the C-code base, but it definitely gave a more flexible result to the Python user.

Perhaps retrofitting the ExtensionClass solution with an enhanced C-API would be a better solution. We really need David's input on that suggestion...

The idea I've put forward is to make the object "classes" but I would support the "extension class" solution as well. Regardless of how it is implemented, we still need to design the appropriate "objects" (arraytype, NDArray, Ufunc) and how they interact with each other, as well as a suitable C-API so that they work together seamlessly.

I hope this helps some readers who are less familiar with extending Python.

DISCLAIMER: I am not the world's expert on these issues but I do have some experience, so take what lessons you may.

Best wishes,

Travis Oliphant

From Oliphant.Travis at mayo.edu Thu Aug 24 17:29:50 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Thu, 24 Aug 2000 16:29:50 -0500 (CDT)
Subject: [Numpy-discussion] Why I wrote the previous posts.
In-Reply-To: <200008242105.OAA31079@lists.sourceforge.net>
Message-ID:

For those wondering why I wrote the previous posts (and didn't bother to correct the spelling mistakes :-) ) it is due to a request from Greg Wilson who is bending the ear of other contributors to Numerical software that are not in the Python community.
He asked that I quickly hash out some words on Numerical Python and where it is going for him --- so I did, and crossposted what I wrote to this list since I thought some might be interested in my position and goals.

--Travis

From pauldubois at home.com Sat Aug 26 23:49:51 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Sat, 26 Aug 2000 20:49:51 -0700
Subject: [Numpy-discussion] Announce: Pyfort 6.0
Message-ID:

There is a new Pyfort on SourceForge. Type pyfort with no arguments to get usage info.

Incompatible change required: Remove the module/end module pair of statements from your .pyf

Documentation: in pdf and HTML at http://pyfortran.sourceforge.net. See the project page at http://sourceforge.net/projects/pyfortran for downloading Pyfort-6.0.tar.gz.

The new version builds and installs extensions in just one line. For example, if you have created an input file mydiagnostic.pyf, using Fortran library libmydiags.a located in directory /myhouse, then

    pyfort -i -l mydiags -L /myhouse mydiagnostic.pyf

makes mydiagnostic.so and installs it in Python's site-packages directory (assuming you have write permission there).

It is now much easier to figure out how to add a "compiler id" for a new platform. I wouldn't say it is completely easy, but it is "easier". I hope people will post patches to fortran_compiler.py to add their favorites. I have only tested the Linux g77 / pgf77 combinations. pgf90 (Portland Group F90, Linux), g77alpha (g77 on an alpha chip) and solaris (Sun Solaris Fortran) "should" work.

Numeric Python, and 0.9 or better Distutils are assumed; see documentation for details.

From pete at shinners.org Sun Aug 27 04:49:51 2000
From: pete at shinners.org (Pete Shinners)
Date: Sun, 27 Aug 2000 01:49:51 -0700
Subject: [Numpy-discussion] NumPy2 design
References:
Message-ID: <001801c01003$c4793d00$0200a8c0@home>

quick easy question here. it seems like the current development efforts are going into numpy2.
i can see why this is very important to try to make the python2.1 cutoff date in january.

in the mean time, is there a planned release for the current numpy that will take advantage of python 2.0 features? (i'm mainly looking at the assignment operators, array1 += array2)

i would assume there is, but i haven't heard any talk going on for this. the python 2.0 beta is coming up in one week (pending further delays). is there a timeframe for a new numpy that is ready to take advantage of 2.0 ?

From jack at oratrix.nl Mon Aug 28 07:09:02 2000
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 28 Aug 2000 13:09:02 +0200
Subject: [Numpy-discussion] lapack_lite
Message-ID: <20000828110902.9E0EF303181@snelboot.oratrix.nl>

The lapack_lite directory is empty in the sourceforge repository. And according to the build docs it shouldn't be. Could whoever has the contents please check them in, or (if they should come from a different place) please tell me where to find them?

Right now I can't build from the repository, and I'd like to get Numeric working with the latest release for MacPython 2.0.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++

From pauldubois at home.com Mon Aug 28 07:56:56 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Mon, 28 Aug 2000 04:56:56 -0700
Subject: [Numpy-discussion] lapack_lite
In-Reply-To: <20000828110902.9E0EF303181@snelboot.oratrix.nl>
Message-ID:

That would be me, and it is fixed now. I had done a cvs add, and created a directory and put stuff in it, but I think I moved it to a different name temporarily for a test and didn't put it back, so when I did the commit it was missed. I also had missed adding one file and needed to repair the setup files for a three-digit distutils version - a stupid error on my part.
Jack, I will add you to the developer list so you can check in your macisms.

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Jack
> Jansen
> Sent: Monday, August 28, 2000 4:09 AM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] lapack_lite
>
>
> The lapack_lite directory is empty in the sourceforge repository. And
> according to the build docs it shouldn't be. Could whoever has
> the contents
> please check them in, or (if they should come from a different
> place) please
> tell me where to find them?
>
> Right now I can't build from the repository, and I'd like to get Numeric
> working with the latest release for MacPython 2.0.
> --
> Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
> Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to
> your sig ++++
> www.oratrix.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> http://lists.sourceforge.net/mailman/listinfo/numpy-discussion

From jack at oratrix.nl Mon Aug 28 08:30:01 2000
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 28 Aug 2000 14:30:01 +0200
Subject: [Numpy-discussion] lapack_lite
In-Reply-To: Message by "Paul F. Dubois" , Mon, 28 Aug 2000 04:56:56 -0700 ,
Message-ID: <20000828123001.EC7B4303181@snelboot.oratrix.nl>

> That would be me, and it is fixed now. I had done a cvs add, and created a
> directory and put stuff in it, but I think I moved it to a different name
> temporarily for a test and didn't put it back, so when I did the commit it
> was missed.

Thanks!

> Jack, I will add you to the developer list so you can check in your macisms.

Okay, great.
I'll probably stick with Mac subdirectories on the toplevel and lapack-lite level which will contain the mac project files (and possibly any config header files or so, but I think Numeric didn't need them last time around).

I've also started ansifying the numeric stuff for Python 2.0 (mainly taking out the Py_PROTO and Py_FPROTO macros, which have disappeared) but I won't check that in right now (I don't know whether you want to do a Numeric-for-Python-1.6 distribution, and I don't remember whether 1.6 was fully ansified), but let me know when/if you want this checked in.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From managan at llnl.gov Tue Aug 29 15:08:30 2000
From: managan at llnl.gov (Rob Managan)
Date: Tue, 29 Aug 2000 12:08:30 -0700
Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #90 - 3 msgs
In-Reply-To: <200008281930.MAA18250@lists.sourceforge.net>
References: <200008281930.MAA18250@lists.sourceforge.net>
Message-ID:

When trying to build the 16.0 release on the Mac I found that I had to modify fftpackmodule.c so that line 3 was

    #include "arrayobject.h"

instead of

    #include "Numeric/arrayobject.h"

What was the reason for the Numeric in the first place? It seems that arrayobject.h is now in the Include directory anyway.
--
*-*-*-*-*-*-*-*-*-*-**-*-*-*-*-*-*-*-*-*-*-
Rob Managan LLNL ph: 925-423-0903
P.O. Box 808, L-095 FAX: 925-422-3389
Livermore, CA 94551-0808

From Oliphant.Travis at mayo.edu Thu Aug 31 15:20:22 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Thu, 31 Aug 2000 14:20:22 -0500 (CDT)
Subject: [Numpy-discussion] ANNOUNCE: New algorithms added to optimize.py
In-Reply-To: <8om8cg$3um$1@nnrp1.deja.com>
References: <8om8cg$3um$1@nnrp1.deja.com>
Message-ID:

This is a heads up to those interested in Python-implemented optimization algorithms.
I've updated my optimize.py module to version 0.3 by including two new unconstrained optimization algorithms to minimize a function of many variables -- one which implements a quasi-Newton algorithm (BFGS) and another which implements a practical Newton's algorithm using conjugate gradients to invert the Hessian (approximated if not provided). The license is very liberal it's available at http://oliphant.netpedia.net/packages/optimize.py Here are the docstrings: >>> print optimize.fminBFGS.__doc__ xopt = fminBFGS(f, fprime, x0, args=(), avegtol=1e-5, maxiter=None, fulloutput=0, printmessg=1) Optimize the function, f, whose gradient is given by fprime using the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) See Wright, and Nocedal 'Numerical Optimization', 1999, pg. 198. >>> print optimize.fminNCG.__doc__ xopt = fminNCG(f, fprime, x0, fhess_p=None, args=(), avextol=1e-5, maxiter=None, fulloutput=0, printmessg=1) Optimize the function, f, whose gradient is given by fprime using the Newton-CG method. fhess_p must compute the hessian times an arbitrary vector. If it is not given, finite-differences on fprime are used to compute it. See Wright, and Nocedal 'Numerical Optimization', 1999, pg. 140. From Oliphant.Travis at mayo.edu Wed Aug 9 18:18:02 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Wed, 9 Aug 2000 17:18:02 -0500 (CDT) Subject: [Numpy-discussion] ANNOUNCE (Numeric): Some RPMS and a GIST interface I use regularly. Message-ID: Just letting interested people know that I've made RPM's of the Python interface to all of Lapack originally written by Doug Heisterkamp and modified by Konrad Hinsen (Konrad, you may have already made these --- please let me know if you have). (Under Numerical Python RPMS at) http://oliphant.netpedia.net Also, I've released a module I use regularly for interactive plotting with Gist. 
It makes it much more "MATLAB" like: from Mplot import * mplot(x,y,'r-',x2,y2,'b:') legend(['Line1',Line2']) # will prompt you for a place to put it. title("Title") xlabel("some label") ylabel("other label") You can also do subplots with the module using gist.plsys to change between subplots. I use it everyday and thought others might be interested in it. It's not well documented at this point, though. Needed files: http://oliphant.netpedia.net/packages/Mplot.py http://oliphant.netpedia.net/packages/write_style.py The latter file allows one to construct a gist style file from a Python nested dictionary which is needed for changing the color and style of the axis system. -Travis From jhauser at ifm.uni-kiel.de Mon Aug 14 08:43:13 2000 From: jhauser at ifm.uni-kiel.de (Janko Hauser) Date: Mon, 14 Aug 2000 14:43:13 +0200 (CEST) Subject: [Numpy-discussion] VSIPL? Message-ID: <14743.59745.249584.750915@ifm.uni-kiel.de> Just by chance I found the link to an ANSI-C library which looks interesting in the light of the reimplementation of NumPy. The link is: http://www.vsipl.org/ I can not decide, if the design decisions made by the authors of this library or this spec are good or not. But they mention a lot of topics coming up on this list from time to time, like different views of the data space, gather/scatter, interfacing to Fortran. Are there people on the list, which do know this library and the possible problems or merits of it? This library was used as an positive design example in a discussion about the GSL. Regarding the redesign, I often get confused, because there are so many libraries, which do all implement there own vector/matrix/tensor definitions, so the question is, if the new NumPy should have more characteristics of an interface, so that the underlying numeric engine could be changed. Would it help in such a case to wrap a library as a first proof of concept? 
'ly __Janko From humberto at hpcf.upr.edu Tue Aug 15 14:32:40 2000 From: humberto at hpcf.upr.edu (Humberto Ortiz) Date: Tue, 15 Aug 2000 14:32:40 -0400 Subject: [Numpy-discussion] ANNOUNCE (Numeric): Some RPMS and a GIST interface I use regularly. In-Reply-To: Your message of "Wed, 09 Aug 2000 17:18:02 EST." Message-ID: <200008151832.OAA16266@mail.hpcf.upr.edu> Oliphant.Travis at mayo.edu said: > Also, I've released a module I use regularly for interactive plotting > with Gist. It makes it much more "MATLAB" like: These use your gist version 11 rpms, right? What's the status of the gist pakage in the Release 15 series? I've got some arrays in numpy that I'm writing to a text file and plotting in yorick. -- Humberto Ortiz Zuazaga Visualization Specialist/Programmer UPR High Performance Computing facility http://www.hpcf.upr.edu/ From Oliphant.Travis at mayo.edu Tue Aug 15 14:41:10 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Tue, 15 Aug 2000 13:41:10 -0500 (CDT) Subject: [Numpy-discussion] ANNOUNCE (Numeric): Some RPMS and a GISTinterface I use regularly. In-Reply-To: <200008151832.OAA16266@mail.hpcf.upr.edu> Message-ID: > > Oliphant.Travis at mayo.edu said: > > Also, I've released a module I use regularly for interactive plotting > > with Gist. It makes it much more "MATLAB" like: > > These use your gist version 11 rpms, right? What's the status of the gist > pakage in the Release 15 series? I've got some arrays in numpy that I'm > writing to a text file and plotting in yorick. There have been no changes that I know of to the gist interface since Release 11 and so those are current. Eventually, the RPM's will need to be recompiled, but that has not happened yet. Writing to a text file sounds a little bit indirect but if it works for you... Best wishes, -Travis From hinsen at cnrs-orleans.fr Mon Aug 21 11:01:20 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon, 21 Aug 2000 17:01:20 +0200 Subject: [Numpy-discussion] VSIPL? 
In-Reply-To: <14743.59745.249584.750915@ifm.uni-kiel.de> (message from Janko Hauser on Mon, 14 Aug 2000 14:43:13 +0200 (CEST)) References: <14743.59745.249584.750915@ifm.uni-kiel.de> Message-ID: <200008211501.RAA01633@chinon.cnrs-orleans.fr> > Just by chance I found the link to an ANSI-C library which looks > interesting in the light of the reimplementation of NumPy. The link is: Is anyone working on a reimplementation? > data space, gather/scatter, interfacing to Fortran. Are there people > on the list, which do know this library and the possible problems or > merits of it? This library was used as an positive design example in a > discussion about the GSL. Not me, but at first sight it does look reasonable. The main benefit for using this in NumPy would be the possibility of substituting optimized implementations for the reference implementation in C. > definitions, so the question is, if the new NumPy should have more > characteristics of an interface, so that the underlying numeric engine > could be changed. Would it help in such a case to wrap a library as a To be useful in C modules, at least the data layout must be documented and stable. Given that there are at least two popular layouts (C style and Fortran style), it is difficult to accomodate all existing array libraries. Konrad. 
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Oliphant.Travis at mayo.edu Tue Aug 22 14:37:41 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Tue, 22 Aug 2000 13:37:41 -0500 (CDT) Subject: [Numpy-discussion] History and Why NumPy2 (long) Message-ID: Greetings to all interested in Numerical Python, My purpose in writing this somewhat long post is to inform interested parties as to where NumPy is going and how far it has gone. I'm doing this in order to coordinate interest and try to summarize some of the recent conversations I've had with other interested people. There are a significant handful of people who are very interested in where Numerical Python is going. All of these people are very bright and have distinct desires for the future of Numerical Python which come from quite diverse experience. This intelligence and diversity brings tremendous strength (both current and potential) to the community and has made Numerical Python an extremely useful tool. Of course, these benefits do not come cheaply: there is quite a bit of disagreement about how things should be done --- mostly due to the fact that people use Numerical Python for different things. Fortunately, this disagreement is not insurmountable provided people are willing to compromise a little syntatic sugar here and there. Numerical Python users have been enjoying the flexibility and power of the underlying programming language for several years. 
The price we must pay for using a language that is not wholly dedicated to Numerical pursuits, is that we must cooperate with other users of the core language who have interests entirely different than our own. Since Numerical programming is rarely "strictly numerical," what we gain is access to the work they do in improving Python's stock of library tools. When I was introduced to Numerical Python system, some of the results of this compromise were a little annoying to me --- somewhat like the whitespace rule. What I found, however, was that my annoyance gave way to elation as I realized that the non-numeric objects and toolkits where extremely beneficial to me in my numeric work: regular expressions, serving graphs from a website, writing translators for various files and formats, etc. With that introduction, I'll give a brief history of Numerical Python (please forgive me if I have neglected important contributors). Numerical Python started from the work of Jim Hugunin (which he used as part of his Oral Examination at MIT). He posted an announcement of his proposal in August of 1995 based on the Matrix Object previsouly presented by Jim Fulton. Early discussions of the work can be found at http://www.python.org/pipermail/matrix-sig/ which presents very interested reading since many of the topics peole still talk about were hashed even back then. Konrad Hinsen, Paul Dubois, David Ascher, and Jim Fulton were all early contributors. Jim Fulton's work and connections to Guido Van Rossum enabled many of the early changes (extended slicing, complex numbers, ellipses) to get into Python itself. Guido was also part of the early discussion. Konrad Hinsen contributed a significant amount of code to the current version of Numeric Python as well. Jim Hugunin released version 0.2 in December of 1995 and followed the release early, release often model for several months to get Numerical Python into a working state. 
It is obvious that he spent many hours writing code (time which NumPy2 contributors have not been able to duplicate). One thing that led to some stall in Numerical Python's development is that Jim Hugunin left the project to concentrate on JPython. Paul Dubois picked up the task of project administrator and has done an admirable job, including securing resources to get the current documentation written. David Ascher wrote the bulk of that important resource. Personally, I started using Numerical Python after scouring the Net for something to replace MATLAB for me which had become burdensome under the weight of large data volumes and inefficient memory handling. I started using Numerical Python in the Spring of 1998 ( a relative late-comer ) but I have used it actively ever since. I started releasing packages at that time to increase the number of toolboxes available to the Numerical Python programmer as I was quite happy with the language itself (after I got over the initial annoyances). I've released many pieces of code since then which I personally use quite regularly. Most of these can be found at http://oliphant.netpedia.net Naturally my contributions have been in areas where I had a personal need, but they have enabled me to understand the Numerical Python source code enough to feel confident in modifying it. With that bit of history let's get into why NumPy2: Guido Van Rossum has expressed willingness to include multidimensional arrays into the Python core. The source of this willingness appears to be a general respect for the community of users who use Python for Numerical programming (although he himself is not one of those users). There is already a useful one-dimensional array object distributed with Python which, however, does not support any operations. Some of it's features where borrowed for the current Numerical Python. 
Last year, I suggested that the PIL and Numeric Python work more closely together (since an image is conceptually just a 2-D (or 3-D for color) Numeric Python array). /F from pythonware responded by saying that until Numeric Python was a part of Python itself he saw no reason to modify the PIL. I took the bait and after pondering why 4 years had elapsed without Numerical Python getting into Python itself, I contacted Guido and Paul to start the ball rolling. Guido's response was that those familiar with the code said it was too ugly and unwieldy to put into Python. The code is just too hard to modify and understand. Evidently, since there are only a handful of people of the hundreds that use it who submit bug patches, or feature enhancements, this must be true. Those who do understand how it works have a hard time finding time to make needed changes --- the intrinsic cooperation problem with volunteer time that is not funded (or contributed to) by those who make use of the results. Guido was kind enough to provide me with some design documents for an implementation of multidimensional arrays that he had worked out. Thinking I would be in graduate school for longer than I am going to be, I set about trying to clean up Numerical Python with the intent of getting it into the Python core. As part of this effort I conducted a survey of current Numerical Python users to find out their interests. The survey and it's results are available at the sourceforge site for Numerical Python. Basically the results indicate that most people agree on some important features (like arbitrary indexing into arrays), but disagree on some details (copy vs. reference and automatic casting rules being the most memorable). While the results were useful, a simple comment made by one of the survey participants made a significant impression on me: "the C-code is too inflexible and hard to change." 
This is essentially the problem that Paul Dubois had identified and which was keeping Numerical Python out of the Python core. At the same time I had been doing some work with implementing a sparse matrix package for Python by wrapping some compiled C and Fortran code into a Python class I'd constructed. The results were very encouraging and made me realize that the same technique could be used to make Numerical Python much more flexible and easier to extend while retaining it's significant speed benefits. I decided to make a new implementation of Numerical Python where the underlying objects (the array and ufunc objects) are not extension types but true Python classes. This would allow significant benefits in terms of flexibility and modifiability with a small memory-overhead loss and an indeterminate speed change (it will likely be faster under some usages and slightly slower in others). I also wanted to add more types (unsigned types, boolean, and potentially others). While making this change, I realized that another way out of the "type-class" dichotomy (along with ExtensionClasses) is to not make new types at all. If all types were really ExtensionClasses and all new types had to be as well, this could effectively solve the problem from the Python user perspective as well. An noble effort at making Numerical Python an Extension Class was undertaken by David Ascher last year. His work became the ill-fated Numerical Python 12. I rather liked his work, but there were some very hard to trace bugs in the implementation, and the C-code was still hard to modify. Another problem (that must be dealt with with the new implementation as well) is the significant amount of code that has been written to the old C-API. This finally brings us to the state of Numerical Python. I've been working on this implementation on and off for six months (mostly off), but have worked out many of the design details. 
Since my time is currently limited for the next 3 months, I wanted to let others know of the status to encourage involvement. We have a window here to get this next version of Numeric into Python 2.1, but the window will probably close sometime in January, so there is some urgency. In the next installment, I will outline the design of Numerical Python 2 and some of it's goals. -Travis Oliphant From Oliphant.Travis at mayo.edu Tue Aug 22 17:06:06 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Tue, 22 Aug 2000 16:06:06 -0500 (CDT) Subject: [Numpy-discussion] NumPy2 design In-Reply-To: Message-ID: As a followup to my previous post here is a discussion and overview of the current NumPy2 design as I have it in my head and partially implemented in the numpy2 module on the CVS tree at numpy's sourceforge site. The design of NumPy2 is quite simple and tries to balance speed with flexibility and modifiability. An outline of the design follows (the names can change, they are only reference at the moment) Three classes replacing the current C structures: ArrayType --- replaces PyArray_Descr # not implemented yet Ufunc --- replaces PyUfuncObject NDArray --- replaces PyArrayObject The purpose of each class is to encapsulate interfaces to allow code re-use for similar operations. NDArray # This is implemented except for the # operations (ufuncs) ======================== The most concrete class is the NDArray class --- it just needs coding to make it happen. The other classes still need some design work to efficiently handle mixed type operations and additions of new types to the system. The NDArray base class gives an N-Dimensional array interpretation to a Python buffer (a segment of memory, an m-mapped file, a PIL image, etc.). It provides this interpretation with three special attributes: self.rank --- the dimension of the array (hard-coded changeable limit of 10). 
self._data --- a buffer object pointing to the data self._structure --- a buffer object pointing to an array of INTEGERS which holds the dimensions and strides information (INTEGER) is a platform-dependent type #defined in compiled code self._descr --- Python class describing the type. To interact seamlessly with the C-API and be recognized as "an array" all subclasses must either export an __array__ method which creates a suitable NDArray or not interfere with these provided attributes. Note that the same data segment can be viewed in several different ways. The NDArray will have default implementations for the numeric operations that will resemble the current implementation. But, it will be easy to subclass the array to handle these operations as you'd like without losing the ability to use the data in that array in extension modules which assume array inputs. Two other attributes are worth mentioning: self.CONTIGUOUS # this can be determined from the _structure information # but it is useful to keep a flag around indicating the # status. This tells you whether or not you can # walk through the entire array an element at a # time with a single for loop. self.FORTRANVIEW # This basically indicates how the array will view it's # shape when asked and indexed (This does not change the # _structure information). An array of "shape" # (10,3,5,7) when # FORTRANVIEW is 0 will be an array of "shape" (7,5,3,10) # when FORTRANVIEW is 1 ArrayType: # This is not implemented yet. =============================== This class is to replace the PyArray_Descr structure in current Numerical Python. As a result, it must contain the information: self.name ---- some kind of object to identify it (a string) self.elsize ---- size of an item of this kind. 
self.cast ---- a dictionary of compiled functions with at least one entry called to cast this type to at least one other type self.getitem ---- (Compiled) function self.setitem ---- (Compiled) function self.zeros ---- needed for the zeros command to include arrays of Python Objects. It points to the representation of zero for this type. Currently, the above is not implemented, yet. What is implemented is a module _arraytypes which exposes to Python the PyArray_Descr structure so that it can be used. The idea of adding new types to Numerical Python without having to change all of the code is appealing to me, however. Ufunc: # This is partly implemented ========================================== There are two ideas here. I've partly implemented the first one which I'll explain. The second was presented to me by Paul Barrett. Ufunc's are encapsulations of the N-D looping construct and the broadcasting rules of Numerical Python. The N-D looping construct is limited to the fixed but arbitrary 10 dimensions as given above for C code but can be arbitrary if a Python function is called at each iteration. I explain Ufuncs in a piece called Ufuncexplain.txt which is on the CVS tree. Here is a quote that explains broadcasting rules: 1) If input arrays do not have the same rank. The array with lower rank will be prepended with ones until ranks agree. 2) If an input array has length one, then "duplicate" the elements along that dimension so that input shapes agree. Example: A is an array of shape (10,) B is an array of shape (3,10) A * B will return an array of shape (3,10): - A is interpreted as shape (1,10) - the columns of A are "broadcast" across the rows of B Thus the output is (3,10): [ A*B[0] A*B[1] A*B[2] ] Note that A is not actually extended to a (3,10) array. It merely behaves as if it had been. The element-wise math operators are implemented using Ufuncs. 
SpecialFuncs is a Python package at http://oliphant.netpedia.net that impelements a whole range of special functions using the Ufunc formalism. I also include in that package a general arraymap function which can turn any Python function into a broadcasting "ufunc-mimic" This code does not have the rank 10 limit on the number of dimensions on the inputs -- but it might be slower than the current implementation. My current implementation assumes that the Ufunc instantiator will provide two functions: a select function and a compute function (either of these can be in C or Python). The select function and the compute function work together. The select function determines the type of the outputs based on the input types, while the compute function takes the inputs and outputs (and their types) and computes the ouptut. This is done on either an entire block of memory (optimized ufuncs) or one-element-at-a-time (unoptimized ufuncs---Python coded ufuncs for example). This allows for efficient coding and the possibility of mixed type arithmetic with a more complicated creation process. It also may be hard to add new types and have them function as you'd like without modifying others already-defined ufuncs. But, I know this idea will work and I can see my way through it. I've already implemented an "addition" function using this method. Another idea that has been presented is to instantiate a Ufunc with only one function that is entered into a "dispatch table" or dictionary of functions keyed by the ArrayType class much like the Multimethod approach that has been discussed. I like this idea, but I do not see the details (I haven't thought about it too much) and do not know if we can actually make it work --- Frankly, I think we can and it will result in a better system. What hasn't been thought through is exactly what is entered into the "function" dictionary and when is it called. 
Some have suggested that it is called "immediately" upon ufunc call, but this would eliminate the benefits of the encapsulated broadcasting rules. An alternative would be to call it after the "broadcasting rule encapsulation" as been done. In other words use it as a replacement for the current array of functions in the Ufunc implementation. With the appropriate mix of C-modules and Python code I think this could be done quite elegantly. The other issue that has to be worked out (again) is that obviously this table will not be filled out in every case (for a function with 10 inputs and 10 outputs with 16 types we are talking 16^20 different entries in the table and will be sparse) so what is done when there is not an entry for a particular combination will require some thought. I think we could allow multiple behaviors according to some attribute of the ufunc (casting, exception raising, etc.,) is set. Many people might fear this would inhibit code re-use but I have not seen convincing examples. So, that is a brief overview of the state of things. It doesn't try to cover everything, but it should give you enough of a perspective to understand the code that is on the CVS tree under module numpy2. I have been using C for the compiled code because it is easy to interface to and it has the widest platform support and because Python istelf is written in C. Anybody with specific questions (including offers to help) can feel free to contact me or post to this list. Thanks, Travis Oliphant From Oliphant.Travis at mayo.edu Wed Aug 23 11:38:26 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Wed, 23 Aug 2000 10:38:26 -0500 (CDT) Subject: [Numpy-discussion] Explanation of some terms In-Reply-To: Message-ID: For the benefit of those who may be unfamiliar with ways to add new functionality I will try to briefly summarize. More information can be found in the documentation and in the books that have been written about Python. 
There are two (three) ways to add a new object to Python: using an extension type and defining a class. The fact that there are two distinct ways to add new objects is often called the type-class dichotomy. It is a goal of Py3K to somehow eliminate this distinction. Another way to add new behavior that I'll explain is to make the type an "extension class." Making the type a "subtype" of this fancy type gives a possible direction for unifying types and classes.

Types
===============================

"Types" are more fundamental to the language and must be added using compiled code (all of the types I've seen are in straight C since you don't really buy anything by using C++ as Python itself is written in C). You can investigate the type of an object from within Python by using the command type:

>>> type(a) # prints the "type" of object a

There are many types defined in the Python core such as integers, floats, complex, lists, tuples, dictionaries, etc. Python allows you to make new types. These must be made in C (maybe C++ but again I don't think the extra complexity buys you anything since Python is in C.)

A new type is a PyTypeObject basically filled with function pointers and arrays of function pointers to handle the various operations one might do on the new type. This PyTypeObject is coupled with a C-structure containing the "data" for the new type. This data C structure lists PyObject_HEAD as its first member and then whatever other data is necessary. Making a new type is thus a matter of creating these two C structures and filling in the TypeObject table with function pointers to handle various operations (getting and setting attributes, treating the type as an abstract number, sequence, or mapping, or printing the object).
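The operations listed above (number, sequence, mapping, printing) have Python-level counterparts in the special method names. The following is only a rough illustration written as a Python class rather than as a C type; the Vec class is made up for the example, and it assumes a current Python 3 interpreter, where type() reports the class itself.

```python
# Illustrative only: a toy container whose special methods play the role of
# the number, sequence/mapping, and printing operations described above.
# "Vec" is a made-up name, not part of Numerical Python.

class Vec:
    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):      # "number" interface: v + w
        return Vec(a + b for a, b in zip(self.values, other.values))

    def __len__(self):             # "sequence" interface: len(v)
        return len(self.values)

    def __getitem__(self, index):  # indexing: v[i] or v[i:j]
        return self.values[index]

    def __repr__(self):            # printing the object
        return "Vec(%r)" % (self.values,)


v = Vec([1, 2, 3]) + Vec([10, 20, 30])
print(type(v).__name__)  # Vec
print(len(v), v[0], v)   # 3 11 Vec([11, 22, 33])
```

A type written in C would fill the corresponding slots of its PyTypeObject with C function pointers instead of defining these methods.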
Python uses an abstract object interface on the C level, so that if a type has a "number" interface (operations) it can be used like a number, if it has a "sequence" interface it can be indexed like a sequence, or if it has a "mapping" interface it can be indexed like a dictionary.

Classes
======================

A Python Class is at the C level just another "type." There are actually two "types" associated with a Python class: an instance type and a class type. An instance of a class is of the instance type. So every instance of any class has the same "type."

What this means on a C level is that there is one more layer of indirection for each "operation" in Python when the type is "class". The Python interpreter goes through the "class type" to see what to do and finds the appropriate C function from that PyTypeObject method table. This C function does a dictionary lookup using the special method names and executes the Python function associated with that name for the particular instance (which may call back into a compiled extension module to do the actual work).

This level of indirection gives a great deal of dynamic flexibility since classes can be subclassed and attributes can be added dynamically, but there will be a performance hit which won't be noticeable except inside Python iteration loops.

So in reality there is no "type"-"class" dichotomy. Everything is a type. It's just that classes are dynamic types which allow you to define Python functions to implement the "method table". The reason for the dichotomy is that classes are so useful, that people really like them, and use them quite a bit so that the other static types seem quite rigid in comparison.

Extension Classes
==================================

This is another fancy, dynamic "type" not distributed in the Python Core but developed by Digital Creations (the Zope people) in order to let C programmers "subclass" types.
I'm not an expert on these as I've never really used them but as far as I can tell they bring the idea of "dynamic types" to the C programmer. This is accomplished by making all types just subtypes of the extension class "type". One way to understand the result is by understanding what the type command tells you about your new "extension class". It will tell you that it's of type "extension class." So, dynamic typing is again implemented with another layer of indirection where the fixed special C functions of the extension class "type" call out to your particular set of registered C functions. The difference is that the indirection is all handled in C.

So those are the choices for implementing new behavior in Python. Currently, Numerical Python is implemented as a new "type" which defines all of these interfaces. The mapping interface handles "extended slicing," the "sequence" interface allows the array to return something when len() is called for example, and the "number" interface implements the operators.

Actually, two new types are defined: a "ufunc" type and an "array" type. All of the operators are implemented as instances of the "ufunc" type. The "ufunc" essentially encapsulates the "casting and broadcasting" rules associated with elementwise operations. The ufunc is not well understood by most non-developers I've talked to, since most people don't instantiate their own ufuncs (which must be instantiated in C).

The code works and is fast, but it can be hard to extend and there are pieces that are poorly documented and hard to understand. For example, nobody has reworked the "extended slicing" syntax to enable arbitrary-index slicing, despite many people who would like that feature (actually I've heard that John Bernard did finally write some code to do that but I've never seen it and it's not there now).
As mentioned before, David Ascher made the necessary changes to make Numerical Python of type "extension class" which, among other things, allowed the type to be "subclassed" from within Python. I thought this was a nice solution and we'd have to hear from him as to what went wrong. The only trouble I had with it is that the C-API changed slightly in that Arrays were no longer of type Array_Type and code that depended on it would break (the same is true of any redesign making Python arrays a class). We'd have to hear from him as to what other problems he saw. It still doesn't solve the problem of maintainability of the C-code base, but it definitely gave a more flexible result to the Python user. Perhaps retrofitting the ExtensionClass solution with an enhanced C-API would be a better solution. We really need David's input on that suggestion...

The idea I've put forward is to make the objects "classes" but I would support the "extension class" solution as well. Regardless of how it is implemented, we still need to design the appropriate "objects" (arraytype, NDArray, Ufunc) and how they interact with each other, as well as a suitable C-API so that they work together seamlessly.

I hope this helps some readers who are less familiar with extending Python.

DISCLAIMER: I am not the world's expert on these issues but I do have some experience, so take what lessons you may.

Best wishes,

Travis Oliphant

From Oliphant.Travis at mayo.edu Thu Aug 24 17:29:50 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Thu, 24 Aug 2000 16:29:50 -0500 (CDT)
Subject: [Numpy-discussion] Why I wrote the previous posts.
In-Reply-To: <200008242105.OAA31079@lists.sourceforge.net>
Message-ID:

For those wondering why I wrote the previous posts (and didn't bother to correct the spelling mistakes :-) ) it is due to a request from Greg Wilson who is bending the ear of other contributors to Numerical software who are not in the Python community.
He asked that I quickly hash out some words on Numerical Python and where it is going for him --- so I did and crossposted what I wrote to this list since I thought some might be interested in my position and goals.

--Travis

From pauldubois at home.com Sat Aug 26 23:49:51 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Sat, 26 Aug 2000 20:49:51 -0700
Subject: [Numpy-discussion] Announce: Pyfort 6.0
Message-ID:

There is a new Pyfort on SourceForge. Type pyfort with no arguments to get usage info.

Incompatible change required: Remove the module/end module pair of statements from your .pyf

Documentation: in pdf and HTML at http://pyfortran.sourceforge.net. See the project page at http://sourceforge.net/projects/pyfortran for downloading Pyfort-6.0.tar.gz.

The new version builds and installs extensions in just one line. For example, if you have created an input file mydiagnostic.pyf, using Fortran library libmydiags.a located in directory /myhouse, then

pyfort -i -l mydiags -L /myhouse mydiagnostic.pyf

makes mydiagnostic.so and installs it in Python's site-packages directory (assuming you have write permission there).

It is now much easier to figure out how to add a "compiler id" for a new platform. I wouldn't say it is completely easy, but it is "easier". I hope people will post patches to fortran_compiler.py to add their favorites. I have only tested the Linux g77 / pgf77 combinations. pgf90 (Portland Group F90, Linux), g77alpha (g77 on an alpha chip), and solaris (Sun Solaris Fortran) "should" work.

Numeric Python and Distutils 0.9 or better are assumed; see documentation for details.

From pete at shinners.org Sun Aug 27 04:49:51 2000
From: pete at shinners.org (Pete Shinners)
Date: Sun, 27 Aug 2000 01:49:51 -0700
Subject: [Numpy-discussion] NumPy2 design
References:
Message-ID: <001801c01003$c4793d00$0200a8c0@home>

quick easy question here. it seems like the current development efforts are going into numpy2.
i can see why this is very important to try to make the python2.1 cutoff date in january.

in the mean time, is there a planned release for the current numpy that will take advantage of python 2.0 features? (i'm mainly looking at the assignment operators, array1 += array2)

i would assume there is, but i haven't heard any talk going on for this. the python 2.0 beta is coming up in one week (pending further delays). is there a timeframe for a new numpy that is ready to take advantage of 2.0 ?

From jack at oratrix.nl Mon Aug 28 07:09:02 2000
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 28 Aug 2000 13:09:02 +0200
Subject: [Numpy-discussion] lapack_lite
Message-ID: <20000828110902.9E0EF303181@snelboot.oratrix.nl>

The lapack_lite directory is empty in the sourceforge repository. And according to the build docs it shouldn't be. Could whoever has the contents please check them in, or (if they should come from a different place) please tell me where to find them?

Right now I can't build from the repository, and I'd like to get Numeric working with the latest release for MacPython 2.0.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++

From pauldubois at home.com Mon Aug 28 07:56:56 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Mon, 28 Aug 2000 04:56:56 -0700
Subject: [Numpy-discussion] lapack_lite
In-Reply-To: <20000828110902.9E0EF303181@snelboot.oratrix.nl>
Message-ID:

That would be me, and it is fixed now. I had done a cvs add, and created a directory and put stuff in it, but I think I moved it to a different name temporarily for a test and didn't put it back, so when I did the commit it was missed. I also had missed adding one file and needed to repair the setup files for a three-digit distutils version - a stupid error on my part.
Jack, I will add you to the developer list so you can check in your macisms.

From jack at oratrix.nl Mon Aug 28 08:30:01 2000
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 28 Aug 2000 14:30:01 +0200
Subject: [Numpy-discussion] lapack_lite
In-Reply-To: Message by "Paul F. Dubois" , Mon, 28 Aug 2000 04:56:56 -0700 ,
Message-ID: <20000828123001.EC7B4303181@snelboot.oratrix.nl>

> That would be me, and it is fixed now. I had done a cvs add, and created a
> directory and put stuff in it, but I think I moved it to a different name
> temporarily for a test and didn't put it back, so when I did the commit it
> was missed.

Thanks!

> Jack, I will add you to the developer list so you can check in your macisms.

Okay, great.
I'll probably stick with Mac subdirectories on the toplevel and lapack-lite level which will contain the mac project files (and possibly any config header files or so, but I think Numeric didn't need them last time around).

I've also started ansifying the numeric stuff for Python 2.0 (mainly taking out the Py_PROTO and Py_FPROTO macros, which have disappeared) but I won't check that in right now (I don't know whether you want to do a Numeric-for-Python-1.6 distribution, and I don't remember whether 1.6 was fully ansified), but let me know when/if you want this checked in.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From managan at llnl.gov Tue Aug 29 15:08:30 2000
From: managan at llnl.gov (Rob Managan)
Date: Tue, 29 Aug 2000 12:08:30 -0700
Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #90 - 3 msgs
In-Reply-To: <200008281930.MAA18250@lists.sourceforge.net>
References: <200008281930.MAA18250@lists.sourceforge.net>
Message-ID:

When trying to build the 16.0 release on the Mac I found that I had to modify fftpackmodule.c so that line 3 was

#include "arrayobject.h"

instead of

#include "Numeric/arrayobject.h"

What was the reason for the Numeric in the first place? It seems that arrayobject.h is now in the Include directory anyway.
--
*-*-*-*-*-*-*-*-*-*-**-*-*-*-*-*-*-*-*-*-*-
Rob Managan LLNL ph: 925-423-0903
P.O. Box 808, L-095 FAX: 925-422-3389
Livermore, CA 94551-0808

From Oliphant.Travis at mayo.edu Thu Aug 31 15:20:22 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Thu, 31 Aug 2000 14:20:22 -0500 (CDT)
Subject: [Numpy-discussion] ANNOUNCE: New algorithms added to optimize.py
In-Reply-To: <8om8cg$3um$1@nnrp1.deja.com>
References: <8om8cg$3um$1@nnrp1.deja.com>
Message-ID:

This is a heads up to those interested in Python-implemented optimization algorithms.
I've updated my optimize.py module to version 0.3 by including two new unconstrained optimization algorithms to minimize a function of many variables -- one which implements a quasi-Newton algorithm (BFGS) and another which implements a practical Newton's algorithm using conjugate gradients to invert the Hessian (approximated if not provided).

The license is very liberal. It's available at http://oliphant.netpedia.net/packages/optimize.py

Here are the docstrings:

>>> print optimize.fminBFGS.__doc__
xopt = fminBFGS(f, fprime, x0, args=(), avegtol=1e-5, maxiter=None, fulloutput=0, printmessg=1)

Optimize the function, f, whose gradient is given by fprime using the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS). See Wright, and Nocedal 'Numerical Optimization', 1999, pg. 198.

>>> print optimize.fminNCG.__doc__
xopt = fminNCG(f, fprime, x0, fhess_p=None, args=(), avextol=1e-5, maxiter=None, fulloutput=0, printmessg=1)

Optimize the function, f, whose gradient is given by fprime using the Newton-CG method. fhess_p must compute the hessian times an arbitrary vector. If it is not given, finite-differences on fprime are used to compute it. See Wright, and Nocedal 'Numerical Optimization', 1999, pg. 140.
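
The fminNCG docstring says that finite differences on fprime are used when fhess_p is not given. A minimal sketch of what such a Hessian-times-vector approximation could look like follows. This is not the code from optimize.py: the function name approx_fhess_p, the forward-difference formula, and the step size eps are assumptions chosen for illustration, and plain Python lists stand in for arrays.

```python
# Sketch of a finite-difference Hessian-vector product. This is NOT taken
# from optimize.py; approx_fhess_p and its eps default are hypothetical.

def approx_fhess_p(fprime, x, p, eps=1e-6):
    """Approximate H(x) * p as (fprime(x + eps*p) - fprime(x)) / eps."""
    g0 = fprime(x)
    x_step = [xi + eps * pi for xi, pi in zip(x, p)]
    g1 = fprime(x_step)
    return [(b - a) / eps for a, b in zip(g0, g1)]


# Test problem: f(x) = x0**2 + 3*x0*x1 + 2*x1**2, whose Hessian is
# [[2, 3], [3, 4]] everywhere.
def fprime(x):
    return [2 * x[0] + 3 * x[1], 3 * x[0] + 4 * x[1]]


hp = approx_fhess_p(fprime, [1.0, -2.0], [1.0, 1.0])
print(hp)  # close to [5.0, 7.0], the exact Hessian times [1, 1]
```

The appeal of this form is that the full Hessian is never built: each product costs one extra gradient evaluation, which is all a conjugate-gradient inner loop needs.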