From beach@verinet.com Sat Oct 4 00:04:21 1997
From: beach@verinet.com (David J. C. Beach)
Date: Fri, 03 Oct 1997 17:04:21 -0600
Subject: [MATRIX-SIG] Printing BUG?
Message-ID: <343579F5.83CE8C18@verinet.com>

To he that hath knowledge of the array-printing code...

I believe I've found a bug in NumPy involving its formatting and printing
of arrays. Here's a very simple example of how to recreate the bug:

-----------------------------------------------------
Python 1.4 (Oct 3 1997) [GCC 2.7.2.2.f.2]
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> from Numeric import *
>>> x = array([[1.496e11, 0, 0], [0, -2.925e4, 0]], Float)
>>> print x
[[  1.49600000e+11   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00  -2.92500000e+04   0.00000000e+00]]
>>> print x[0]
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python1.4/NumPy/Numeric.py", line 116, in array_str
    return array2string(a, max_line_width, precision, suppress_small, ' ', 0)
  File "/usr/local/lib/python1.4/NumPy/ArrayPrinter.py", line 46, in array2string
    format, item_length = _floatFormat(data, precision, suppress_small)
  File "/usr/local/lib/python1.4/NumPy/ArrayPrinter.py", line 118, in _floatFormat
    max_str_len = len(str(int(max_val))) + precision + 2
OverflowError: float too large to convert
------------------------------------------------------

I really don't know much about the printing code, but it doesn't seem
that 1.49e+11 should be too large a number to be converted (to a
string, I assume). And besides, why does it work when I print x, and
not when I print x[0]?

Anyhow, this bug is creating problems for me because I'm working with,
well, astronomical numbers (no pun), and sometimes when I evaluate
certain arrays or portions of them, I get the OverflowError shown here.

Does anybody else have this problem? Is there an easy workaround for
this? (Other than simply never printing arrays?)

Thanks.

Dave

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________

From beach@verinet.com Sat Oct 4 01:08:20 1997
From: beach@verinet.com (David J. C. Beach)
Date: Fri, 03 Oct 1997 18:08:20 -0600
Subject: [MATRIX-SIG] Changing shelve...
Message-ID: <343588F4.6D8ECEA2@verinet.com>

Developers,

The Numeric module includes its own special versions of Pickler and
Unpickler which inherit from those defined in the pickle module, but add
the functionality of Numeric.array pickling. This is all well and good,
but it's still pretty troublesome if you're working with the shelve
module. shelve.Shelf likes to directly call pickle.Pickler and
pickle.Unpickler, not those versions found in Numeric.

I'd like to propose a change to shelve which allows the user to say
something like:

    import Numeric
    import shelve
    y = shelve.open("my_shelf_file", pickler=Numeric.Pickler,
                    unpickler=Numeric.Unpickler)
    y['some_key'] = Numeric.array([[1,2,3],[4,5,6]])

...and so on. In fact, I've created a modified version of shelve for
exactly this purpose. The keywords pickler= and unpickler= must be set
to classes which are compatible with those defined in pickle. Also, the
keywords must be given explicitly; they're not positional parameters. I
hoped that this would avoid any confusion when calling shelve.open()...

Anyhow, I'm including my modified version of shelve.py at the end of
this message. (I hope that doesn't go against the nature of this list!)
Are there any chances of something like this being folded into the 1.5
release?

-----------myshelve.py-------------------
"""Manage shelves of pickled objects.

A "shelf" is a persistent, dictionary-like object.  The difference
with dbm databases is that the values (not the keys!) in a shelf can
be essentially arbitrary Python objects -- anything that the "pickle"
module can handle.
This includes most class instances, recursive data
types, and objects containing lots of shared sub-objects.  The keys
are ordinary strings.

To summarize the interface (key is a string, data is an arbitrary
object):

        import shelve
        d = shelve.open(filename) # open, with (g)dbm filename -- no suffix

        d[key] = data   # store data at key (overwrites old data if
                        # using an existing key)
        data = d[key]   # retrieve data at key (raise KeyError if no
                        # such key)
        del d[key]      # delete data stored at key (raises KeyError
                        # if no such key)
        flag = d.has_key(key)   # true if the key exists
        list = d.keys() # a list of all existing keys (slow!)

        d.close()       # close it

Dependent on the implementation, closing a persistent dictionary may
or may not be necessary to flush changes to disk.

--------------------------------

Modified by Dave Beach to add a Shelf.values() method, and to allow
use of an arbitrary Pickler and Unpickler, not necessarily those
included in the pickle module.  The added parameters to
Shelf.__init__(), BsdDbShelf.__init__(), DbfilenameShelf.__init__(),
and open() are intentionally added with the **kw feature so that they
can't be "accidentally" used by an unsuspecting programmer (i.e., the
keywords "pickler=" and "unpickler=" are required explicitly).
"""

import pickle
import StringIO

# modified lines of code are marked in the right hand margin ______________
#                                                                          `
#                                                                           \
#                                                                            V

class Shelf:

    """Base class for shelf implementations.

    This is initialized with a dictionary-like object.
    See the module's __doc__ string for an overview of the interface.
    """

    pickler = [pickle.Pickler]
    unpickler = [pickle.Unpickler]

    def __init__(self, dict, **kw):                                          #
        self.dict = dict
        for key in kw.keys():                                                #
            if key == 'pickler':                                             #
                self.pickler = [kw['pickler']]                               #
            elif key == 'unpickler':                                         #
                self.unpickler = [kw['unpickler']]                           #
            else: raise AttributeError, key                                  #

    def keys(self):
        return self.dict.keys()

    # this implementation can be a memory pig for a very large Shelf
    # a design similar to xrange would probably work better
    def values(self):                                                        #
        values = []                                                          #
        for key in self.keys():                                              #
            values.append(self[key])                                         #
        return values                                                        #

    def __len__(self):
        return len(self.dict)

    def has_key(self, key):
        return self.dict.has_key(key)

    def __getitem__(self, key):
        f = StringIO.StringIO(self.dict[key])
        return self.unpickler[0](f).load()                                   #

    def __setitem__(self, key, value):
        f = StringIO.StringIO()
        p = self.pickler[0](f)                                               #
        p.dump(value)
        self.dict[key] = f.getvalue()

    def __delitem__(self, key):
        del self.dict[key]

    def close(self):
        if hasattr(self.dict, 'close'):
            self.dict.close()
        self.dict = None

    def __del__(self):
        self.close()


class BsdDbShelf(Shelf):

    """Shelf implementation using the "BSD" db interface.

    The actual database is opened using one of the "bsddb" module's
    "open" routines (i.e. bsddb.hashopen, bsddb.btopen or bsddb.rnopen).

    This class is initialized with the database object
    returned from one of the bsddb open functions.

    See the module's __doc__ string for an overview of the interface.
    """

    def __init__(self, dict, **kw):                                          #
        # default to the standard classes, so that Shelf.__init__ never
        # receives None when a keyword is omitted
        pickler = pickle.Pickler                                             #
        unpickler = pickle.Unpickler                                         #
        for key in kw.keys():                                                #
            if key == 'pickler': pickler = kw['pickler']                     #
            elif key == 'unpickler': unpickler = kw['unpickler']             #
            else: raise AttributeError, key                                  #
        Shelf.__init__(self, dict, pickler=pickler, unpickler=unpickler)     #

    # the cursor methods below use the configured unpickler as well,
    # so that they agree with __getitem__

    def set_location(self, key):
        (key, value) = self.dict.set_location(key)
        f = StringIO.StringIO(value)
        return (key, self.unpickler[0](f).load())

    def next(self):
        (key, value) = self.dict.next()
        f = StringIO.StringIO(value)
        return (key, self.unpickler[0](f).load())

    def previous(self):
        (key, value) = self.dict.previous()
        f = StringIO.StringIO(value)
        return (key, self.unpickler[0](f).load())

    def first(self):
        (key, value) = self.dict.first()
        f = StringIO.StringIO(value)
        return (key, self.unpickler[0](f).load())

    def last(self):
        (key, value) = self.dict.last()
        f = StringIO.StringIO(value)
        return (key, self.unpickler[0](f).load())


class DbfilenameShelf(Shelf):

    """Shelf implementation using the "anydbm" generic dbm interface.

    This is initialized with the filename for the dbm database.
    See the module's __doc__ string for an overview of the interface.
    """

    def __init__(self, filename, flag='c', **kw):                            #
        import anydbm
        pickler = pickle.Pickler                                             #
        unpickler = pickle.Unpickler                                         #
        for key in kw.keys():                                                #
            if key == 'pickler': pickler = kw['pickler']                     #
            elif key == 'unpickler': unpickler = kw['unpickler']             #
            else: raise AttributeError, key                                  #
        Shelf.__init__(self, anydbm.open(filename, flag),                    #
                       pickler=pickler, unpickler=unpickler)                 #


def open(filename, flag='c', **kw):                                          #
    """Open a persistent dictionary for reading and writing.

    Argument is the filename for the dbm database.
    See the module's __doc__ string for an overview of the interface.
    """

    pickler = pickle.Pickler                                                 #
    unpickler = pickle.Unpickler                                             #
    for key in kw.keys():                                                    #
        if key == 'pickler': pickler = kw['pickler']                         #
        elif key == 'unpickler': unpickler = kw['unpickler']                 #
        else: raise AttributeError, key                                      #
    return DbfilenameShelf(filename, flag,                                   #
                           pickler=pickler, unpickler=unpickler)             #

From hinsen@ibs.ibs.fr Sat Oct 4 07:36:53 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Sat, 4 Oct 1997 08:36:53 +0200
Subject: [MATRIX-SIG] Printing BUG?
In-Reply-To: <343579F5.83CE8C18@verinet.com> (beach@verinet.com)
Message-ID: <199710040636.IAA31025@lmspc1.ibs.fr>

> File "/usr/local/lib/python1.4/NumPy/ArrayPrinter.py", line 118, in
> _floatFormat
>     max_str_len = len(str(int(max_val))) + precision + 2
> OverflowError: float too large to convert
> ------------------------------------------------------
>
> I really don't know much about the printing code, but it doesn't seem
> that 1.49e+11 should be too large a number to be converted (to a
> string, I assume). And besides, why does it work when I print x, and
> not when I print x[0]?

What fails is the conversion to an integer, done to determine the
length of the mantissa. That ought to be changed to something more
tolerant of large numbers, of course. But as a quick fix, change
line 101 of ArrayPrinter.py:

        if max_val >= 1.e12:

Reducing the limit will cause big numbers to be printed in exponential
format rather than crashing.
--
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From hinsen@ibs.ibs.fr Sat Oct 4 07:38:37 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Sat, 4 Oct 1997 08:38:37 +0200
Subject: [MATRIX-SIG] Changing shelve...
In-Reply-To: <343588F4.6D8ECEA2@verinet.com> (beach@verinet.com)
Message-ID: <199710040638.IAA31033@lmspc1.ibs.fr>

> The Numeric module includes its own special versions of Pickler and
> Unpickler which inherit from those defined in the pickle module, but add
> the functionality of Numeric.array pickling. This is all well and good,
> but it's still pretty troublesome if you're working with the shelve
> module. shelve.Shelf likes to directly call pickle.Pickler and
> pickle.Unpickler, not those versions found in Numeric.

And that's only one of the problems with this approach. Another one is
that it works only as long as there is no more than one
"specialization" of pickle.

The fundamental problem is pickle itself, which in its present form
does not permit the extension to new data types other than by
subclassing. This might change in later versions.
--
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From jhauser@ifm.uni-kiel.de Sat Oct 4 10:23:45 1997
From: jhauser@ifm.uni-kiel.de (Janko Hauser)
Date: Sat, 4 Oct 1997 11:23:45 +0200 (CEST)
Subject: [MATRIX-SIG] Changing shelve...
In-Reply-To: <199710040638.IAA31033@lmspc1.ibs.fr>
References: <343588F4.6D8ECEA2@verinet.com> <199710040638.IAA31033@lmspc1.ibs.fr>
Message-ID:

As far as I understand pickle is rather slow, not very memory
conserving and at the end not portable between different
applications. Wouldn't it be good to have a simple routine to save raw
Numpy-arrays in a portable (machine and application) way?

I think I can build something like that on top of the netcdf
module. The main drawback of this data format is that it is not easy
to append to a file. But I don't know if HDF is more flexible in this
regard. PDB from LLNL can do this but other applications can't read this
without extensions.

Besides, also pickle can't append, right?

By the way, thanks Konrad for the inclusion of negative indexing in the
netcdfmodule. Are there any other new features?

__Janko

From jim@digicool.com Sat Oct 4 14:09:34 1997
From: jim@digicool.com (Jim Fulton)
Date: Sat, 04 Oct 1997 09:09:34 -0400
Subject: [MATRIX-SIG] Changing shelve...
References: <199710040638.IAA31033@lmspc1.ibs.fr>
Message-ID: <3436400E.6962@digicool.com>

Konrad Hinsen wrote:
>
> > The Numeric module includes its own special versions of Pickler and
> > Unpickler which inherit from those defined in the pickle module, but add
> > the functionality of Numeric.array pickling. This is all well and good,
> > but it's still pretty troublesome if you're working with the shelve
> > module. shelve.Shelf likes to directly call pickle.Pickler and
> > pickle.Unpickler, not those versions found in Numeric.
>
> And that's only one of the problems with this approach. Another one is
> that it works only as long as there is no more than one
> "specialization" of pickle.

True.

> The fundamental problem is pickle itself, which in its present form
> does not permit the extension to new data types other than by
> subclassing. This might change in later versions.

Actually, the latest pickle and cPickle provide a generalized way of
handling new types without subclassing.  This involves a protocol
for "reducing" custom types to standard types and works very well
for many applications.

Unfortunately, as you have pointed out to me in private email,
the new approach does not work well for objects like arrays because
it is too inefficient to reduce *VERY LARGE* arrays to Python objects.

I've thought a lot about this, and even began to implement an idea
or two, but haven't been satisfied with any approach so far.
The major difficulty is handling arrays that are soooo big, that it isn't
good enough to marshal their data to a portable string format in memory.
I'd guess that many people have arrays that are small enough that they
can afford to marshal the array to a string.  In such cases the current
reduce mechanism can work quite well.

One idea I had was to define a new pickle type "temporary file".
Basically, you could create a temporary file object, write your data to
it, hopefully in a portable format, and then pickle the temporary file.
The pickling machinery would be prepared to pickle and unpickle the
temporary file without reading the contents into memory.  This would
involve an extra copy operation when pickling and unpickling, which
you objected to.

Another option would be to define a CObject-like object that would
allow a block of memory to be pickled and unpickled. This would allow
very fast pickling and unpickling but with a loss of portability.

Hm....What about a special picklable type that would wrap:

  - A void pointer,
  - An object that contains the void pointer,
  - a size, and
  - a type code.

So, an array's __reduce__ method would construct one of these special
objects and the pickling machinery would be prepared to pickle and
unpickle the object efficiently *and* portably.  This last idea takes
advantage of an assumption that we want to pickle a block of memory that
contains objects of some constant known C type.

I think this last idea can work.  Anybody want to volunteer
to help me make it work? (I have so little time these days. :-()

Jim

From jim@digicool.com Sat Oct 4 14:17:41 1997
From: jim@digicool.com (Jim Fulton)
Date: Sat, 04 Oct 1997 09:17:41 -0400
Subject: [MATRIX-SIG] Changing shelve...
References: <343588F4.6D8ECEA2@verinet.com> <199710040638.IAA31033@lmspc1.ibs.fr>
Message-ID: <343641F5.50B8@digicool.com>

Janko Hauser wrote:
>
> As far as I understand pickle is rather slow,

cPickle is orders of magnitude faster than pickle

> not very memory conserving

Hm, well, currently for arrays, you would have to marshal to a string
in memory, temporarily doubling your memory usage.

> and at the end not portable between different applications.

It's very portable, assuming that the applications are all written
in Python. :-)  Pickles are certainly portable from machine to machine.

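[Editorial sketch, not part of the original thread.] The "marshal the
array to a string, then pickle" route discussed above can be illustrated
roughly as follows in present-day Python 3. This is only a sketch under
stated assumptions: the standard-library array.array stands in for a
Numeric array, and reduce_block and rebuild_block are hypothetical helper
names, not part of pickle or Numeric. Note that tobytes() makes a full
in-memory copy, which is the temporary doubling of memory use mentioned
above.

```python
import array
import pickle
import sys


def reduce_block(values):
    """Reduce an array.array to standard, picklable Python types."""
    # tobytes() copies the data, so memory use roughly doubles here.
    return (values.typecode, sys.byteorder, values.tobytes())


def rebuild_block(typecode, byteorder, raw):
    """Inverse of reduce_block; swaps bytes if the writer's order differs."""
    values = array.array(typecode)
    values.frombytes(raw)
    if byteorder != sys.byteorder:
        values.byteswap()
    return values


# Hypothetical sample data, borrowed from the printing example in this thread.
original = array.array('d', [1.496e11, 0.0, -2.925e4])
payload = pickle.dumps(reduce_block(original))
restored = rebuild_block(*pickle.loads(payload))
print(restored == original)
```

The byte-order tag covers only part of the portability question; item
sizes for some type codes (for example 'l') also differ between
platforms, so a real implementation would need to record more than this.
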
> Wouldn't it be good to have a simple routine to save raw
> Numpy-arrays in a portable (machine and application) way?

Yes.

> I think I can build something like that on top of the netcdf
> module. The main drawback of this data format is that it is not easy
> to append to a file. But I don't know if HDF is more flexible in this
> regard. PDB from LLNL can do this but other applications can't read this
> without extensions.
>
> Besides, also pickle can't append, right?

Wrong. In fact, I suppose if you wanted to write a pickle file
that only contained a single array, you could write the file in multiple
pickles, avoiding the large memory usage.

For that matter, if you were writing simple objects (like arrays)
to a pickle file, it would be easy to write pickling and unpickling
routines for other non-Python applications.  I imagine that when
we commercialize our pickle-based OODBMS, we'll provide some
picklers and unpicklers for other languages.

Jim

From beach@verinet.com Sat Oct 4 20:52:36 1997
From: beach@verinet.com (David J. C. Beach)
Date: Sat, 04 Oct 1997 13:52:36 -0600
Subject: [MATRIX-SIG] Changing shelve...
References: <199710040638.IAA31033@lmspc1.ibs.fr> <3436400E.6962@digicool.com>
Message-ID: <34369E84.C8FB84C7@verinet.com>

Jim Fulton wrote:

> I think this last idea can work.  Anybody want to volunteer
> to help me make it work? (I have so little time these days. :-()

Well, I'd like to help, but I don't know very much about the internals of
pickle.  (I've only been learning about Python for the last four months
or so, and only in my spare time -- that is -- I bought Mark Lutz's book
last spring and started programming in Python for the first time then.
Worse, I know virtually nothing of the C API for Python yet...)

Would a change like this be likely to become a standard for 1.5?
It would seem a shame to make a design change like this, and then to
only offer it in a module.

From what I've heard and read about pickling objects in Python so far, it
seems like what we really want is:

1) many pickle routines to be implemented in C (cPickle, I guess, for
   speed)
2) pickle will use the __reduce__ method to get necessary information
   for pickling a particular class.
3) pickle should pickle functions and classes (it should automagically
   call marshal if/when necessary)
4) pickle should give the option of packaging a class along with an
   instance (so that the receiver of a pickled object can automatically
   get the bytecode for the class along with it)

Options 3 and 4 have security implications, of course, but I think there
are many situations in a trusted environment where it might make sense to
move code from one machine to another along with class data, and other
Python objects.

Does this sound reasonable?

Dave

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

From beach@verinet.com Sat Oct 4 20:58:00 1997
From: beach@verinet.com (David J. C. Beach)
Date: Sat, 04 Oct 1997 13:58:00 -0600
Subject: [MATRIX-SIG] Printing BUG?
References: <199710040636.IAA31025@lmspc1.ibs.fr>
Message-ID: <34369FC8.469E1E32@verinet.com>

Konrad Hinsen wrote:

> Reducing the limit will cause big numbers to be printed in exponential
> format rather than crashing.

Thanks.  That fixed things.  What I still don't understand is why it
worked when I was printing x, and failed when I was printing x[0].
Shouldn't the conversion to integer have failed in both places?

Dave

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

From hinsenk@PLGCN.UMontreal.CA Mon Oct 6 17:22:03 1997
From: hinsenk@PLGCN.UMontreal.CA (Hinsen Konrad)
Date: Mon, 6 Oct 1997 12:22:03 -0400
Subject: [MATRIX-SIG] Printing BUG?
In-Reply-To: "David J. C. Beach"'s message of Sat, 04 Oct 1997 13:58:00 -0600 <34369FC8.469E1E32@verinet.com>
Message-ID: <199710061622.MAA14680@esi22.ESI.UMontreal.CA>

   Thanks.  That fixed things.  What I still don't understand is why it
   worked when I was printing x, and failed when I was printing x[0].
   Shouldn't the conversion to integer have failed in both places?

In the other case exponential format was already selected due
to a different criterion (large difference between the numbers).

-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-76.88.54.94
41, Ave. des Martyrs                   | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From hinsenk@PLGCN.UMontreal.CA Mon Oct 6 18:34:09 1997
From: hinsenk@PLGCN.UMontreal.CA (Hinsen Konrad)
Date: Mon, 6 Oct 1997 13:34:09 -0400
Subject: [MATRIX-SIG] Changing shelve...
In-Reply-To: Jim Fulton's message of Sat, 04 Oct 1997 09:09:34 -0400 <3436400E.6962@digicool.com>
Message-ID: <199710061734.NAA22474@esi22.ESI.UMontreal.CA>

   Actually, the latest pickle and cPickle provide a generalized way of
   handling new types without subclassing.
   This involves a protocol
   for "reducing" custom types to standard types and works very well
   for many applications.

But this won't be standard before Python 1.5. OK, that's soon enough
I hope...

   The major difficulty is handling arrays that are soooo big, that it isn't
   good enough to marshal their data to a portable string format in memory.
   I'd guess that many people have arrays that are small enough that they
   can afford to marshal the array to a string.  In such cases the current
   reduce mechanism can work quite well.

I agree that most arrays will be small in practice. But that won't
help the people who do have large arrays, e.g. myself ;-)

   Hm....What about a special picklable type that would wrap:

     - A void pointer,
     - An object that contains the void pointer,
     - a size, and
     - a type code.

   So, an array's __reduce__ method would construct one of these special
   objects and the pickling machinery would be prepared to pickle and
   unpickle the object efficiently *and* portably.  This last idea takes
   advantage of an assumption that we want to pickle a block of memory that
   contains objects of some constant known C type.

Unfortunately the array data space does not have to be contiguous.
But it would be possible to turn an array into a *sequence* of
such "binary data" objects. That sounds like a good idea...

   I think this last idea can work.  Anybody want to volunteer
   to help me make it work? (I have so little time these days. :-()

I happen to suffer from exactly the same problem :-(

-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-76.88.54.94
41, Ave. des Martyrs                   | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From amullhau@ix.netcom.com Tue Oct 7 02:20:37 1997
From: amullhau@ix.netcom.com (Andrew P. Mullhaupt)
Date: Mon, 06 Oct 1997 21:20:37 -0400
Subject: [MATRIX-SIG] Changing shelve...
In-Reply-To: <199710061734.NAA22474@esi22.ESI.UMontreal.CA>
References:
Message-ID: <3.0.1.32.19971006212037.008f75e0@popd.netcruiser>

At 01:34 PM 10/6/97 -0400, Hinsen Konrad wrote:
>
>But this won't be standard before Python 1.5. OK, that's soon enough
>I hope...
>
>I agree that most arrays will be small in practice. But that won't
>help the people who do have large arrays, e.g. myself ;-)

Me too.

Later,
Andrew Mullhaupt

From jhauser@ifm.uni-kiel.de Tue Oct 7 08:22:57 1997
From: jhauser@ifm.uni-kiel.de (Janko Hauser)
Date: Tue, 7 Oct 1997 09:22:57 +0200 (CEST)
Subject: [MATRIX-SIG] Re: Printing BUG?
Message-ID:

Here is another addition to ArrayPrinter. There is already the option
to change the separator between numbers in ArrayPrinter.array2string.
I have added the option bracket=[1/0], which changes the output so
that there are no surrounding brackets. This is sometimes useful for
ASCII export.

Are there any objections to this approach?

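[Editorial sketch, not part of the original message.] To make the
proposal concrete, here is a rough illustration in present-day Python 3
of what a bracket switch means for ASCII export. It is a sketch under
stated assumptions, not the ArrayPrinter implementation: rows_to_text
and its parameters are hypothetical, and it handles only a plain list of
equal-length rows rather than general Numeric arrays.

```python
def rows_to_text(rows, separator=' ', bracket=True):
    """Format equal-length numeric rows, with or without brackets."""
    cells = [['%g' % value for value in row] for row in rows]
    # Pad every cell to the widest one so that columns line up.
    width = max(len(cell) for row in cells for cell in row)
    l_br, r_br = ('[', ']') if bracket else ('', '')
    lines = [l_br + separator.join(cell.rjust(width) for cell in row) + r_br
             for row in cells]
    if bracket:
        # Continuation rows get one space so they sit under the outer '['.
        return '[' + '\n '.join(lines) + ']'
    return '\n'.join(lines)


data = [[1, 2, 3], [4, 50, 6]]
print(rows_to_text(data))
print(rows_to_text(data, separator=', ', bracket=False))
```

With bracket turned off and a comma separator, each row becomes one
plain line of delimited values, which is the form most other programs
can read directly.
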
__Janko

# ArrayPrinter
# Array printing function
#
# Written by Konrad Hinsen
# last revision: 1996-3-13
# modified by Jim Hugunin 1997-3-3 for repr's and str's (and other details)
# modified by Janko Hauser 1997-10-7 to exclude the brackets and added
# the patch of Konrad Hinsen for printing of big numbers
#
import sys
from fast_umath import *
import Numeric

def array2string(a, max_line_width = None, precision = None,
                 suppress_small = None, separator=' ',
                 array_output=0, bracket = 1):
    if len(a.shape) == 0:
        return str(a[0])

    if multiply.reduce(a.shape) == 0:
        return "zeros(%s, '%s')" % (a.shape, a.typecode())

    if max_line_width is None:
        try:
            max_line_width = sys.output_line_width
        except AttributeError:
            max_line_width = 77
    if precision is None:
        try:
            precision = sys.float_output_precision
        except AttributeError:
            precision = 8
    if suppress_small is None:
        try:
            suppress_small = sys.float_output_suppress_small
        except AttributeError:
            suppress_small = 0
    data = Numeric.ravel(a)
    type = a.typecode()
    items_per_line = a.shape[-1]
    if type == 'b' or type == '1' or type == 's' or type == 'i' \
       or type == 'l':
        max_str_len = max(len(str(maximum.reduce(data))),
                          len(str(minimum.reduce(data))))
        format = '%' + str(max_str_len) + 'd'
        item_length = max_str_len
        format_function = lambda x, f = format: _formatInteger(x, f)
    elif type == 'f' or type == 'd':
        format, item_length = _floatFormat(data, precision, suppress_small)
        format_function = lambda x, f = format: _formatFloat(x, f)
    elif type == 'F' or type == 'D':
        real_format, real_item_length = _floatFormat(data.real, precision,
                                                     suppress_small, sign=0)
        imag_format, imag_item_length = _floatFormat(data.imaginary, precision,
                                                     suppress_small, sign=1)
        item_length = real_item_length + imag_item_length + 3
        format_function = lambda x, f1 = real_format, f2 = imag_format: \
                          _formatComplex(x, f1, f2)
    elif type == 'c':
        item_length = 1
        format_function = lambda x: x
    elif type == 'O':
        item_length = max(map(lambda x: len(str(x)), data))
        format_function = _formatGeneral
    else:
        return str(a)
    final_spaces = (type != 'c')
    item_length = item_length+len(separator)
    line_width = item_length*items_per_line - final_spaces
    if line_width > max_line_width:
        indent = 6
        if indent == item_length:
            indent = 8
        items_first = (max_line_width+final_spaces)/item_length
        if items_first < 1: items_first = 1
        items_continuation = (max_line_width+final_spaces-indent)/item_length
        if items_continuation < 1: items_continuation = 1
        line_width = max(item_length*items_first,
                         item_length*items_continuation+indent) - final_spaces
        number_of_lines = 1 + (items_per_line-items_first +
                               items_continuation-1)/items_continuation
        line_format = (number_of_lines, items_first, items_continuation,
                       indent, line_width, separator)
    else:
        line_format = (1, items_per_line, 0, 0, line_width, separator)
    lst = _arrayToString(a, format_function, len(a.shape), line_format,
                         6*array_output, 0, bracket=bracket)[:-1]
    if array_output:
        if a.typecode() in ['l', 'd', 'D']:
            return "array(%s)" % lst
        else:
            return "array(%s,'%s')" % (lst, a.typecode())
    else:
        return lst

def _floatFormat(data, precision, suppress_small, sign = 0):
    exp_format = 0
    non_zero = abs(Numeric.compress(not_equal(data, 0), data))
    if len(non_zero) == 0:
        max_val = 0.
        min_val = 0.
    else:
        max_val = maximum.reduce(non_zero)
        min_val = minimum.reduce(non_zero)
        if max_val >= 1.e12:
            exp_format = 1
        if not suppress_small and (min_val < 0.0001
                                   or max_val/min_val > 1000.):
            exp_format = 1
    if exp_format:
        large_exponent = 0 < min_val < 1e-99 or max_val >= 1e100
        max_str_len = 8 + precision + large_exponent
        if sign: format = '%+'
        else: format = '%'
        format = format + str(max_str_len) + '.' + str(precision) + 'e'
        if large_exponent: format = format + '3'
        item_length = max_str_len
    else:
        format = '%.' + str(precision) + 'f'
        precision = min(precision, max(tuple(map(lambda x, p=precision,
                                                 f=format: _digits(x,p,f),
                                                 data))))
        max_str_len = len(str(int(max_val))) + precision + 2
        if sign: format = '%#+'
        else: format = '%#'
        format = format + str(max_str_len) + '.' + str(precision) + 'f'
        item_length = max_str_len
    return (format, item_length)

def _digits(x, precision, format):
    s = format % x
    zeros = len(s)
    while s[zeros-1] == '0': zeros = zeros-1
    return precision-len(s)+zeros

# bracket needs a default value here, because it follows other
# arguments that have defaults
def _arrayToString(a, format_function, rank, line_format,
                   base_indent=0, indent_first=1, bracket=1):

    if bracket:
        l_br = '['
        r_br = ']'
    else:
        l_br = ''
        r_br = ''

    if rank == 0:
        return str(a[0])
    elif rank == 1:
        s = ''
        s0 = l_br
        items = line_format[1]
        if indent_first:
            indent = base_indent
        else:
            indent = 0
        index = 0
        for j in range(line_format[0]):
            s = s + indent * ' '+s0
            for i in range(items):
                s = s + format_function(a[index])+line_format[-1]
                index = index + 1
                if index == a.shape[0]: break
            if s[-1] == ' ': s = s[:-1]
            s = s + '\n'
            items = line_format[2]
            indent = line_format[3]+base_indent
            s0 = ''
        s = s[:-len(line_format[-1])]+r_br+'\n'
    else:
        if indent_first:
            s = ' '*base_indent+l_br
        else:
            s = l_br
        # pass bracket down, so that inner dimensions follow the same choice
        for i in range(a.shape[0]-1):
            s = s + _arrayToString(a[i], format_function, rank-1, line_format,
                                   base_indent+1, indent_first=i!=0,
                                   bracket=bracket)
            s = s[:-1]+line_format[-1][:-1]+'\n'
        s = s + _arrayToString(a[a.shape[0]-1], format_function,
                               rank-1, line_format, base_indent+1,
                               bracket=bracket)
        s = s[:-1]+r_br+'\n'
    return s

def _formatInteger(x, format):
    return format % x

def _formatFloat(x, format, strip_zeros = 1):
    if format[-1] == '3':
        format = format[:-1]
        s = format % x
        third = s[-3]
        if third == '+' or third == '-':
            s = s[1:-2] + '0' + s[-2:]
    elif format[-1] == 'f':
        s = format % x
        if strip_zeros:
            zeros = len(s)
            while s[zeros-1] == '0': zeros = zeros-1
            s = s[:zeros] + (len(s)-zeros)*' '
    else:
        s = format % x
    return s

def _formatComplex(x, real_format, imag_format):
    r = _formatFloat(x.real, real_format)
    i = _formatFloat(x.imag, imag_format, 0)
    if imag_format[-1] == 'f':
        zeros = len(i)
        while zeros > 2 and i[zeros-1] == '0': zeros = zeros-1
        i = i[:zeros] + 'j' + (len(i)-zeros)*' '
    else:
        i = i + 'j'
    return r + i

def _formatGeneral(x):
    return str(x) + ' '

if __name__ == '__main__':
    a = Numeric.arange(10)
    b = Numeric.array([a, a+10, a+20])
    c = Numeric.array([b,b+100, b+200])
    print array2string(a)
    print array2string(b)
    print array2string(sin(c), separator=', ', array_output=1)
    print array2string(sin(c)+1j*cos(c), separator=', ', array_output=1)
    print array2string(Numeric.array([[],[]]))

From hinsen@ibs.ibs.fr Fri Oct 10 15:59:50 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Fri, 10 Oct 1997 16:59:50 +0200
Subject: [MATRIX-SIG] Changing shelve...
In-Reply-To: (message from Janko Hauser on Sat, 4 Oct 1997 11:23:45 +0200 (CEST))
Message-ID: <199710101459.QAA23076@lmspc1.ibs.fr>

> As far as I understand pickle is rather slow, not very memory
> conserving and at the end not portable between different
> applications. Wouldn't it be good to have a simple routine to save raw
> Numpy-arrays in a portable (machine and application) way?

You mean non-Python? I guess netCDF is as close as you can get
to a portable format for that use.

> I think I can build something like that on top of the netcdf
> module. The main drawback of this data format is that it is not easy
> to append to a file. But I don't know if HDF is more flexible in this

Why? You can always add arrays to an existing netCDF file. You just
have to take care of name conflicts somehow, both for the arrays
and the dimensions.

> Besides, also pickle can't append, right?

It's not difficult; you just have to use the "conversion to string"
version and do the file write yourself.

> By the way, thanks Konrad for the inclusion of negative indexing in the
> netcdfmodule. Are there any other new features?


Bug fixes, a C API for string I/O, and some optimizations. I think
I should start keeping track of changes, and use version numbers...

--
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From maz@pap.univie.ac.at Sun Oct 26 21:25:48 1997
From: maz@pap.univie.ac.at (Rupert Mazzucco)
Date: Sun, 26 Oct 1997 22:25:48 +0100 (MET)
Subject: [MATRIX-SIG] random number generator?
Message-ID:

Hello,

I'm trying to do some simulation with Monte-Carlo methods,
and I'm a bit confused by the various random number modules.
There is rand, random, whrandom, and ranlib & RandomArray from
NumPy. Which one should I use and why are there others?

Thank you.

Regards,
Rupert Mazzucco

PS: I notice there is not much traffic here, lately. Does that
imply anything about the status of the NumPy project?

PPS: Is there really no += operator in Python?

From beach@verinet.com Sun Oct 26 23:18:58 1997
From: beach@verinet.com (David J. C. Beach)
Date: Sun, 26 Oct 1997 16:18:58 -0700
Subject: [MATRIX-SIG] random number generator?
References:
Message-ID: <3453CFE1.6782A603@verinet.com>

Rupert Mazzucco wrote:

(Sorry, I can't help you with your main question as I haven't had any
need for random numbers in Python -- yet... Good luck, though.)

> PS: I notice there is not much traffic here, lately.
> Does that imply anything about the status of the NumPy project?

Yeah it's been pretty dead this week.

> PPS: Is there really no += operator in Python?

Yup, no += operator. Python's designers (Guido and others) opted not to
use many of the "fancy" operators from C and C++ that make it so easy to
get yourself in trouble... Among these, +=, -=, *=, /=, ++, and --.

One might argue that writing a=a+1 is a bit more work than writing a+=1
or a++, but then again, Python is a whole lot easier to read and is more
consistent (IMHO) than either C or C++. Having more operators is not
necessarily a good thing.

Dave

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

From dubois1@llnl.gov Mon Oct 27 15:04:33 1997
From: dubois1@llnl.gov (Paul F. Dubois)
Date: Mon, 27 Oct 1997 07:04:33 -0800
Subject: [MATRIX-SIG] random number generator?
Message-ID: <9710271504.AA01411@icf.llnl.gov.llnl.gov>

If your project is large and you may have need for independent random
number streams then consider URNG. This is part of the LLNL distribution
available at ftp-icf.llnl.gov/pub/python. Generally in our simulations
we don't want turning on or off some piece of physics to change the
answer some other piece of physics is getting. URNG allows you to create
a separate state for each piece of physics. The design is explained in
my book "Object Technology for Scientific Computing". But to answer the
obvious question, the algorithm is efficient; it does not, for example,
save the state after every call.

----------
> From: Rupert Mazzucco
> To: matrix-sig@python.org
> Subject: [MATRIX-SIG] random number generator?
> Date: Sunday, October 26, 1997 1:25 PM
>
> Hello,
>
> I'm trying to do some simulation with Monte-Carlo methods,
> and I'm a bit confused by the various random number modules.
> There is rand, random, whrandom, and ranlib & RandomArray from
> NumPy. Which one should I use and why are there others?
>
> Thank you.
>
> Regards,
> Rupert Mazzucco
>
> PS: I notice there is not much traffic here, lately. Does that
> imply anything about the status of the NumPy project?
>
> PPS: Is there really no += operator in Python?

From maz@pap.univie.ac.at Mon Oct 27 18:14:52 1997
From: maz@pap.univie.ac.at (Rupert Mazzucco)
Date: Mon, 27 Oct 1997 19:14:52 +0100 (MET)
Subject: [MATRIX-SIG] random number generator?
In-Reply-To:
Message-ID:

On Sun, 26 Oct 1997, W.T. Bridgman wrote:

> When generating multiple numbers for my time-series work, I start with
>
> RandomArray.seed(num1,num2) # initialize the seed
>
> then
>
> binlist=RandomArray.randint(0,npoints,totalcounts)
>
> to generate a uniformly distributed list of bin numbers (between 0 and
> npoints) for constructing a Poisson-distributed time series of totalcounts.

Thank you :-)
I'm now trying to do something like

>>> RandomArray.seed()
>>> def new_angle():
>>>
>>>     x1 = pi
>>>     x2 = 1
>>>     # putting the seed procedure here doesn't help
>>>     while( x2 > phase_function( x1 )):
>>>         x1 = RandomArray.uniform( -pi, pi )
>>>         x2 = RandomArray.uniform( 0, max_phase )
>>>
>>>     return x1

to get random angles with a frequency distribution that approximates
phase_function (rejection method). The angles I get, however, seem to
be correlated, which manifests as horizontal stripes in the frequency
distribution.

I think the method as such is ok. (At least it worked with Perl,
setting the seed with Perl's TrulyRandom module.)

Any suggestions?
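
For reference, a minimal sketch of the rejection method described above,
written against the standard-library random module of a present-day
Python 3 rather than RandomArray (an assumption; it says nothing about
how RandomArray itself behaves). phase_function below is a made-up
stand-in, since the real one is not shown in this thread, and MAX_PHASE
is its upper bound on [-pi, pi].

```python
import math
import random


def phase_function(theta):
    """Hypothetical stand-in for the phase function (not given in the thread)."""
    return 1.0 + math.cos(theta) ** 2


MAX_PHASE = 2.0  # upper bound of the stand-in phase_function on [-pi, pi]


def new_angle(rng):
    """Draw one angle in [-pi, pi] by the rejection method."""
    while True:
        x1 = rng.uniform(-math.pi, math.pi)  # candidate angle
        x2 = rng.uniform(0.0, MAX_PHASE)     # uniform height under the bound
        if x2 <= phase_function(x1):         # accept points under the curve
            return x1


rng = random.Random(12345)  # seed once, outside the sampling loop
angles = [new_angle(rng) for _ in range(1000)]
```

Both candidates are drawn before the first acceptance test, so the
starting values never leak into the result, and the generator is seeded
once rather than inside new_angle.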

Thanks in advance,
Rupert Mazzucco

From maz@pap.univie.ac.at Mon Oct 27 18:22:29 1997
From: maz@pap.univie.ac.at (Rupert Mazzucco)
Date: Mon, 27 Oct 1997 19:22:29 +0100 (MET)
Subject: [MATRIX-SIG] random number generator?
In-Reply-To: <3453CFE1.6782A603@verinet.com>
Message-ID:

On Sun, 26 Oct 1997, David J. C. Beach wrote:

> One might argue that writing a=a+1 is a bit more work than writing a+=1

Actually I was typing

>>> frequency[angle] = frequency[angle] + 1

when I felt the need for += ;-)

Regards,
Rupert

From gbrown@cau.edu Wed Oct 29 14:11:41 1997
From: gbrown@cau.edu (Raymond Brown)
Date: Wed, 29 Oct 1997 09:11:41 -0500
Subject: [MATRIX-SIG] Fast Fourier Transform
Message-ID: <3457441D.1989@star.cau.edu>

I do not understand how to compute the fft of a building in terms of
frequency, time domain, etc... Could you explain how this can be done.

From furnish@acl.lanl.gov Wed Oct 29 16:02:17 1997
From: furnish@acl.lanl.gov (Geoffrey Furnish)
Date: Wed, 29 Oct 1997 09:02:17 -0700 (MST)
Subject: [MATRIX-SIG] random number generator?
In-Reply-To: <3453CFE1.6782A603@verinet.com>
References: <3453CFE1.6782A603@verinet.com>
Message-ID: <199710291602.JAA00946@steam.acl.lanl.gov>

David J. C. Beach writes:
> > PPS: Is there really no += operator in Python?
>
> Yup, no += operator. Python's designers (Guido and others) opted not to
> use many of the "fancy" operators from C and C++ that make it so easy to
> get yourself in trouble... Among these, +=, -=, *=, /=, ++, and --.
>
> One might argue that writing a=a+1 is a bit more work than writing a+=1
> or a++, but then again, Python is a whole lot easier to read and is more
> consistent (IMHO) than either C or C++. Having more operators is not
> necessarily a good thing.

Speed is the most important counter argument, imo, and one which ought
to be of considerable concern to NumPy customers. I have previously
done timing tests between loops containing expressions like:

    a[i] = a[i] + x[i];

versus

    a[i] += x[i];

in C and C++. The += form was as much as 20% faster on some tests on
some architectures.

Frankly, for a language littered with "self." on nearly every line, it
is very hard for me to buy that "a = a + 1" is syntactically more
beautiful. It certainly is slower.

--
Geoffrey Furnish                email: furnish@lanl.gov
LANL CIC-19 POOMA/RadTran       phone: 505-665-4529     fax: 505-665-7880

"Software complexity is an artifact of implementation."    -Dan Quinlan

From hinsen@ibs.ibs.fr Wed Oct 29 16:13:12 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Wed, 29 Oct 1997 17:13:12 +0100
Subject: [MATRIX-SIG] Fast Fourier Transform
In-Reply-To: <3457441D.1989@star.cau.edu> (message from Raymond Brown on Wed, 29 Oct 1997 09:11:41 -0500)
Message-ID: <199710291613.RAA04678@lmspc1.ibs.fr>

> I do not understand how to compute the fft of a building in terms of
> frequency, time domain, etc... Could you explain how this can be done.

The FFT of a building? Sorry, that doesn't make sense to me. I suppose
you should start with an introduction to discrete Fourier transforms;
there are plenty of them in any decent library. Numerical Recipes has
a chapter on Fourier transforms, which is a good starting point. Once
you know what exactly you need, we can then help you to get it done
in Python.

--
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From beach@verinet.com Wed Oct 29 18:04:06 1997
From: beach@verinet.com (David J. C. Beach)
Date: Wed, 29 Oct 1997 11:04:06 -0700
Subject: [MATRIX-SIG] random number generator?
References: <3453CFE1.6782A603@verinet.com> <199710291602.JAA00946@steam.acl.lanl.gov>
Message-ID: <34577A96.23FAD443@verinet.com>

Geoffrey Furnish wrote:

> Speed is the most important counter argument, imo, and one which ought
> to be of considerable concern to NumPy customers. I have previously
> done timing tests between loops containing expressions like:
>
>     a[i] = a[i] + x[i];
>
> versus
>
>     a[i] += x[i];
>
> in C and C++. The += form was as much as 20% faster on some tests on
> some architectures.
>
> Frankly, for a language littered with "self." on nearly every line, it
> is very hard for me to buy that "a = a + 1" is syntactically more
> beautiful. It certainly is slower.

There's nothing inherently slower about "a = a + 1" than there is about
"a += 1". The difference is in how the compiler interprets them. Now,
any good optimizing C compiler should transform "a = a + 1" into
"a += 1" automatically. (Were you using optimizations?) Perhaps the
Python byte code compiler would only look up the lvalue for a (or a[i])
once. This would make "a = a + 1" just as fast as "a += 1". The point
here is that you're confusing a language difference for a performance
difference.
The language is what you type, but the performance depends on how the
compiler/interpreter transforms that language into machine instructions.

Come to think of it, I'm pretty sure that FORTRAN users (not that I like
the language) don't have either a += or a ++ operator, and I'd be pretty
willing to bet that your a[i] = a[i] + x[i] test on a good optimizing
FORTRAN compiler would outperform the C version of a[i] += x[i]. You
might give it a try.

As for the littered with "self" argument, I'm assuming that you're
complaining because you're needing to type "self.a[i] = self.a[i] + 1"
instead of "self.a[i] += 1". Well, I rather like the "self" prefix for
object attributes because it makes it crystal clear that you're setting
an attribute of the object instead of some other variable. I find that
it's easier to read other people's code when this self "littered" style
is employed. And I could be wrong, but I doubt I'm the only one.

And I think there's room for some fair comparison here. How do you like
the complete lack of standards for container classes in C/C++? Sometimes
people use Rogue Wave's classes, sometimes they use STL, sometimes they
write their own, sometimes they use their "in-house" class library. But
they're all different: different ways to get elements, slices, append,
insert, etc. I'll grant you in an instant that C++, as a
machine-compiled language, runs faster than Python byte-compiled code,
but it sure seems to lack any consistency. In C++, templates are
generally a mess, there are six different kinds of inheritance (public
virtual, protected virtual, private virtual, public, protected, and
private), compilation tends to be a slow process, and you get all the
advantages (AND DISADVANTAGES) of a strongly typed language. (In Python,
I was able to use the same runge-kutta4 function for a single-equation
system and a multiple-equation system just by virtue of whether I passed
it floats or matrices.
You could get that same behavior from C++, but you'd have to go well out
of your way to spoof dynamic typing, or simply write multiple versions
of runge-kutta4.)

I could go on, but this email is already too long as it is. Even if you
don't like the lack of a '+=' (most languages lack this feature which is
really only in C, C++, JAVA, etc.) I'm pretty willing to bet that the
Python language as a whole more than stands up to C or C++.

Dave

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

From mclay@nist.gov Wed Oct 29 17:05:56 1997
From: mclay@nist.gov (Michael McLay)
Date: Wed, 29 Oct 1997 12:05:56 -0500
Subject: [MATRIX-SIG] random number generator?
In-Reply-To: <199710291602.JAA00946@steam.acl.lanl.gov>
References: <3453CFE1.6782A603@verinet.com> <199710291602.JAA00946@steam.acl.lanl.gov>
Message-ID: <199710291705.MAA15003@fermi.eeel.nist.gov>

Geoffrey Furnish writes:
> David J. C. Beach writes:
> > > PPS: Is there really no += operator in Python?
> >
> > Yup, no += operator. Python's designers (Guido and others) opted not to
> > use many of the "fancy" operators from C and C++ that make it so easy to
> > get yourself in trouble... Among these, +=, -=, *=, /=, ++, and --.
> >
> > One might argue that writing a=a+1 is a bit more work than writing a+=1
> > or a++, but then again, Python is a whole lot easier to read and is more
> > consistent (IMHO) than either C or C++. Having more operators is not
> > necessarily a good thing.
>
> Speed is the most important counter argument, imo, and one which ought
> to be of considerable concern to NumPy customers. I have previously
> done timing tests between loops containing expressions like:
>
>     a[i] = a[i] + x[i];
>
> versus
>
>     a[i] += x[i];
>
> in C and C++.
> The += form was as much as 20% faster on some tests on
> some architectures.

Couldn't the performance also be gained by optimizing the parser to
recognize this idiom and avoid making the temporary variable?

The syntax addition has been brought up before. Didn't Guido say he
would consider adding the patches if someone did the work of writing the
code to add +=, -=, *=, and /= to the language? It's not his priority so
someone else needs to become inspired. I think the ++ and -- operators
should be avoided.

> Frankly, for a language littered with "self." on nearly every line, it
> is very hard for me to buy that "a = a + 1" is syntactically more
> beautiful. It certainly is slower.

It is not a matter of beauty, but clarity. a += 1 is familiar
notation to C programmers, but it may not be immediately apparent what
this statement would mean if you are a Fortran programmer or a
non-programmer who is using Python for scripting. It also borders on
feature creep.

I tried searching for += in the mail archive and I wasn't able to locate
relevant messages because of the usual problem of using symbols instead
of names.

From beach@verinet.com Wed Oct 29 18:33:31 1997
From: beach@verinet.com (David J. C. Beach)
Date: Wed, 29 Oct 1997 11:33:31 -0700
Subject: [MATRIX-SIG] Fast Fourier Transform
References: <199710291613.RAA04678@lmspc1.ibs.fr>
Message-ID: <3457817B.62C8BB8F@verinet.com>

Konrad Hinsen wrote:

> > I do not understand how to compute the fft of a building in terms of
> > frequency, time domain, etc... Could you explain how this can be done.
>
> The FFT of a building? Sorry, that doesn't make sense to me. I suppose
> you should start with an introduction to discrete Fourier transforms;
> there are plenty of them in any decent library.
> Numerical Recipes has a chapter on Fourier transforms, which is a
> good starting point. Once you know what exactly you need, we can then
> help you to get it done in Python.
> --

Come on.... the wave function for a building! You know... that *one*
wave function that perfectly describes the state and structure of the
entire building. We'll take an fft of that. This is child's play!

Dave

(just kidding)

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

From hinsen@ibs.ibs.fr Wed Oct 29 17:54:16 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Wed, 29 Oct 1997 18:54:16 +0100
Subject: [MATRIX-SIG] NumPy in the Python 1.5 era
Message-ID: <199710291754.SAA05810@lmspc1.ibs.fr>

With Python 1.5 approaching slowly but steadily, it is time to discuss
the future of NumPy as well. At IPC6 it was mentioned that LLNL might
take over the responsibility for further NumPy development, but others
are more qualified to comment on that - is anyone willing to do so?

With the new package feature in 1.5, we could turn NumPy into a package
again (veterans may remember that one alpha release used ni-based
packages). This obviously raises the question of compatibility. My
suggestion is to keep the current modules, but provide an alternate
arrangement as a package, and add eventual new modules only to that
package. In other words, we would still have module/package Numeric with
all its functions, but the other modules (LinearAlgebra etc.) would be
available as modules within Numeric, as well as stand-alone modules for
compatibility. This would cost only minor modifications plus a few links
(no duplicate code). Any comments?

Even more important is the installation problem.
It's still too difficult for many users, and the installation of other
C modules using NumPy currently depends on the way NumPy was installed.

It is possible (and I even volunteer to do it!) to convert NumPy into
a clean extension that can be installed in the same way on all Unix
systems supporting dynamic libraries, and the C API would then also be
accessible in the same way on all systems. Basically it involves
making the C API accessible via a pointer array that other modules get
by a standard import. This involves, however, a compulsory addition
(of one line) to all C extension modules using the C API. In my
opinion this is preferable to the current mess. Any other opinions?

Once NumPy loses its "special" status, one could think about providing
an automatic installation program, such that users would essentially
download the tar archive, unpack it, and type two commands, one for
compilation and one (as root) for installation. It could hardly be
easier than that.

I can't say much about non-Unix systems, but I suppose that for
Windows and Mac a binary distribution would be the best solution.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From hinsen@ibs.ibs.fr Wed Oct 29 18:17:06 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Wed, 29 Oct 1997 19:17:06 +0100
Subject: [MATRIX-SIG] random number generator?
In-Reply-To: <199710291705.MAA15003@fermi.eeel.nist.gov> (message from Michael McLay on Wed, 29 Oct 1997 12:05:56 -0500)
Message-ID: <199710291817.TAA05894@lmspc1.ibs.fr>

> > Speed is the most important counter argument, imo, and one which ought
> > to be of considerable concern to NumPy customers. I have previously
> > done timing tests between loops containing expressions like:
> >
> >     a[i] = a[i] + x[i];
> >
> > versus
> >
> >     a[i] += x[i];
> >
> > in C and C++. The += form was as much as 20% faster on some tests on
> > some architectures.
>
> Couldn't the performance also be gained by optimizing the parser to
> recognize this idiom and avoid making the temporary variable?

It seems you are all thinking too much about compiled C code. Python
is very different!

First question: what should the semantics of a += 1 be? There are three
solutions:

1) Just syntactic sugar for a = a + 1.

2) a = a+1, but with whatever stands for a only evaluated once.

3) Another operation which modifies the *object* that is the value of a.

The first one obviously makes no difference in execution speed. The
second one could be faster, but have a different effect if the
evaluation of a has side effects (e.g. x[n] could have side effects
if x is an instance of a Python class with a weird __getitem__).
Only the third interpretation would offer substantial speedups,
but it is hardly useful, because the most important arithmetic
types (numbers) are immutable.

> It is not a matter of beauty, but clarity. a += 1 is familiar
> notation to C programmers, but it may not be immediately apparent what
> this statement would mean if you are a Fortran programmer or a
> non-programmer who is using Python for scripting. It also borders on

True, but to my eyes a+=1 would be clearer than a = a + 1 if a is
a sufficiently complicated expression.
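
A small sketch of the three readings listed above, written for a
present-day Python 3 interpreter (an assumption; the Python 1.4/1.5
under discussion differs in details). It deliberately avoids any +=
syntax and spells each reading out by hand. The names index, calls, x,
a and alias are made up for the illustration.

```python
calls = []


def index():
    """Index expression with a side effect: records each evaluation."""
    calls.append("index")
    return 0


x = [10]

# Reading 1: plain sugar. The target expression is written out twice,
# so its side effect happens twice.
x[index()] = x[index()] + 1
evaluations_sugar = len(calls)

# Reading 2: evaluate the target expression once, then read and write it.
calls.clear()
i = index()
x[i] = x[i] + 1
evaluations_once = len(calls)

# Reading 3: modify the object itself. This only works for mutable
# values; a list stands in for an array here.
a = [1, 2, 3]
alias = a
a.extend([4])      # in place: every other reference sees the change

b = 1
alias_b = b
b = b + 1          # numbers are immutable: b is rebound, alias_b is not
```

Readings 1 and 2 leave x with the same contents here; they differ only
in how often the index expression runs, which is exactly the case where
side effects would make the two readings observable.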
Even Fortran programmers might agree in the end ;-)

Anyway, I doubt anyone wants this feature so much as to be willing to
implement it ;-)
--
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From krodgers@tdyryan.com Wed Oct 29 19:05:32 1997
From: krodgers@tdyryan.com (Kevin Rodgers)
Date: Wed, 29 Oct 1997 11:05:32 -0800
Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?)
In-Reply-To: <199710291705.MAA15003@fermi.eeel.nist.gov>
References: <199710291602.JAA00946@steam.acl.lanl.gov> <3453CFE1.6782A603@verinet.com> <199710291602.JAA00946@steam.acl.lanl.gov>
Message-ID: <3.0.1.32.19971029110532.00529480@gate.tdyryan.com>

One should remember the adage that code is only written ONCE, but is read
(by humans) MANY TIMES. The C shortcut operators are an abomination for
code legibility, and contribute mightily to the "write-only" nature of C.
IMHO. <0.01 grin> And I suspect that if you have code in which the speed
difference between, say, "i = i + 1" and "i++" (and as has been mentioned,
for compiled languages, any self-respecting optimizing compiler will make
the speed difference nonexistent) is significant, you either (a) don't know
what significant means, or (b) have much worse problems than you possibly
realize! (OK, 0.1 grin this time)
----------------------------------------------------------------------
Kevin Rodgers     Teledyne Ryan Aeronautical     krodgers@tdyryan.com
"This one goes up to eleven."
--  Nigel Tufnel
----------------------------------------------------------------------

From busby@icf.llnl.gov Wed Oct 29 19:12:56 1997
From: busby@icf.llnl.gov (L. Busby)
Date: Wed, 29 Oct 97 11:12:56 PST
Subject: [MATRIX-SIG] NumPy in the Python 1.5 era
Message-ID: <9710291912.AA00249@icf.llnl.gov.llnl.gov>

[ Konrad Hinsen said ]
KH> With Python 1.5 approaching slowly but steadily, it is time to discuss
KH> the future of NumPy as well. At IPC6 it was mentioned that LLNL might
KH> take over the responsibility for further NumPy development, but others
KH> are more qualified to comment on that - is anyone willing to do so?

Yes, LLNL did volunteer, and Jim Hugunin accepted. We haven't actually
transferred any code yet - Jim is doing some cleanup first. In a previous
message to Jim, I said:

    Basically, here's what we (LLNL) propose to do:
      Give the code a home in a CVS repository (with access to selected
        outside developers as needed).
      Keep the release package current with Python's conventions,
        makefiles, et cetera.
      Keep the released code at python.org up to date with respect to
        our copy.
      Implement changes as per discussion by the matrix-sig.
      Protect existing and any future copyrights.

    In the near term, we'll plan to add support for the new floating
    point exception package in Python-1.5, and we'll take a look at how
    best to package NumPy given the new package support in 1.5. I'll be
    doing some of this work myself, so I can't really commit to a lot
    more until I've had a chance to review and shuffle my priorities a
    bit.

We Python programmers in X-Division at LLNL absolutely intend to expose
NumPy to the widest possible circle of interested developers in the
matrix-sig.
We also have a lot of real big problems to solve with NumPy, so we plan
to be strong custodians of it with respect to stability, performance,
documentation, and so on. Although we have a lot of ideas for its future
improvement, they generally need to be worked out in the SIG. We would
especially appreciate ideas now about how best to do source code
management for an international community of programmers.

KH> With the new package feature in 1.5, we could turn NumPy into a package
KH> again (veterans may remember that one alpha release used ni-based packages).
KH> This obviously raises the question of compatibility. My suggestion is
KH> to keep the current modules, but provide an alternate arrangement as
KH> a package, and add eventual new modules only to that package. In other
KH> words, we would still have module/package Numeric with all its functions,
KH> but the other modules (LinearAlgebra etc.) would be available as
KH> modules within Numeric, as well as stand-alone modules for compatibility.
KH> This would cost only minor modifications plus a few links (no duplicate
KH> code). Any comments?

My only comment is that I'm not very wedded to backward compatibility.
If NumPy can be presented as a Python-1.5 package in a neat and generic
fashion, I would be perfectly happy if that were the *only* (Unix)
distribution format.

KH> Even more important is the installation problem. It's still too difficult
KH> for many users, and the installation of other C modules using NumPy
KH> currently depends on the way NumPy was installed.
KH>
KH> It is possible (and I even volunteer to do it!) to convert NumPy into
KH> a clean extension that can be installed in the same way on all Unix
KH> systems supporting dynamic libraries, and the C API would then also be
KH> accessible in the same way on all systems. Basically it involves
KH> making the C API accessible via a pointer array that other modules get
KH> by a standard import.
KH> This involves, however, a compulsory addition (of one line) to all C
KH> extension modules using the C API. In my opinion this is preferable
KH> to the current mess. Any other opinions?
KH>
KH> Once NumPy loses its "special" status, one could think about providing
KH> an automatic installation program, such that users would essentially
KH> download the tar archive, unpack it, and type two commands, one for
KH> compilation and one (as root) for installation. It could hardly be
KH> easier than that.

I like that part about volunteering. I haven't really studied the new
1.5 package facility, so can't comment on it yet. Your idea for exposing
the C API to other dynamically loaded modules sounds plausible.

KH> I can't say much about non-Unix systems, but I suppose that for
KH> Windows and Mac a binary distribution would be the best solution.

Ok by me.

From furnish@acl.lanl.gov Wed Oct 29 19:27:43 1997
From: furnish@acl.lanl.gov (Geoffrey Furnish)
Date: Wed, 29 Oct 1997 12:27:43 -0700 (MST)
Subject: [MATRIX-SIG] random number generator?
In-Reply-To: <34577A96.23FAD443@verinet.com>
References: <3453CFE1.6782A603@verinet.com> <199710291602.JAA00946@steam.acl.lanl.gov> <34577A96.23FAD443@verinet.com>
Message-ID: <199710291927.MAA01336@steam.acl.lanl.gov>

David J. C. Beach writes:
> Geoffrey Furnish wrote:
>
> > Speed is the most important counter argument, imo, and one which ought
> > to be of considerable concern to NumPy customers. I have previously
> > done timing tests between loops containing expressions like:
> >
> >     a[i] = a[i] + x[i];
> >
> > versus
> >
> >     a[i] += x[i];
> >
> > in C and C++. The += form was as much as 20% faster on some tests on
> > some architectures.
> >
> > Frankly, for a language littered with "self."
on nearly every line, it
> > is very hard for me to buy that "a = a + 1" is syntactically more
> > beautiful.  It certainly is slower.
>
> There's nothing inherently slower about "a = a + 1" than there is
> about "a += 1".  The difference is in how the compiler interprets
> them.  Now, any good optimizing C compiler should transform "a = a
> + 1" into "a += 1" automatically.  (Were you using optimizations?)

Yes, there is a difference.  "a = a + 1" evaluates a twice; "a += 1"
does it only once.  That's why it's faster, and yes, my tests were with
maximal optimization.  The fact that this result is a little
surprising is THE reason I shared it with you.  I'm not confused--I
have already done this test, and am aware of the surprising fact of
the matter.  I brought it up in this forum because one could presume
that IF NumPy had a += operator at the Python language level, its
underlying C implementation would use the same operator, and hence
would stand to run faster.  (Not just the use of the operator itself
here, but also the issue of temporary objects that Mike alluded to.)

Now, whether it is fair for an optimizing compiler to reinterpret
things in the way you suggest is a topic that is out of my league.
What I know of this issue is that there are a variety of fairly
surprising limitations that compiler writers must operate under.  For
example, in C++, you cannot assume that a != b <==> !(a == b).
Evidently there are efforts on the part of certain influential
individuals in the community to lobby for certain "common sense"
semantic interpretations that would make life easier for compiler
optimizers, but this is outstanding at this time.  Could Python do it?
Sure.  There is no standards body to bicker with...

> Perhaps the Python byte code compiler would only look-up the lvalue
> for a (or a[i]) once.  This would make "a = a + 1" just as fast as
> "a += 1".  The point here is that you're confusing a language
> difference for a performance difference.
The language is what you
> type, but the performance depends on how the compiler/interpreter
> transforms that language into machine instructions.

I'm not actually as confused as you think. :-)

BTW, besides the operator evaluation semantics, there is also the
(possibly larger?) issue of temporary proliferation on binary
operators.  This has resulted in very interesting research in the C++
community into techniques for doing array/matrix math without
producing temporaries from binary operators.  This has resulted in
speed-ups as large as 10x for some applications.  If you're
interested, you could inspect the literature on "expression
templates".

> Come to think of it, I'm pretty sure that FORTRAN users (not that I
> like the language) don't have either a += or a ++ operator, and I'd
> be pretty willing to bet that your a[i] = a[i] + x[i] test on a
> good optimizing FORTRAN compiler would outperform the C version of
> a[i] += x[i].

Think apples to apples.  Fortran optimizers operate under vastly
different semantic requirements than C or C++ do.

> You might give it a try.

Thanks.

> As for the littered with "self" argument, I'm assuming that you're
> complaining because you're needing to type "self.a[i] = self.a[i] +
> 1" instead of "self.a[i] += 1".  Well, I rather like the "self"
> prefix for object attributes because it makes it crystal clear that
> you're setting an attribute of the object instead of some other
> variable.  I find that it's easier to read other people's code when
> this self "littered" style is employed.  And I could be wrong, but
> I doubt I'm the only one.

I am aware that there is a spectrum of opinions, and that many, maybe
even most people who use Python, prefer "self.".  I also know I am not
alone in finding it really annoying, and I have met people who cite it
as a principal reason for not wanting to use Python.  Oh well.  I just
grin and bear it.
But I don't personally buy the argument that we should keep fancy
operators out of Python because it keeps the language looking pretty,
or because the hoomenfligutz operator in C++ was error prone.  I have
never been "bitten" by ++, for instance, and cannot for the life of me
even imagine what on earth these comments could be referring to.  If
someone would like to enlighten me about the perils of ++ and -- in
private email, I would be interested in hearing it.

> And I think there's room for some fair comparison here.  How do you
> like the complete lack of standards for container classes in
> C/C++?  Sometimes people use Rogue Wave's classes, sometimes they
> use STL, sometimes they write their own, sometimes they use their
> "in-house" class library.  But they're all different: different
> ways to get elements, slices, append, insert, etc.  I'll grant you
> in an instant that C++, as a machine-compiled language, runs faster
> than Python byte-compiled code, but it sure seems to lack any
> consistency.  In C++, templates are generally a mess, there are six
> different kinds of inheritance (public virtual, protected virtual,
> private virtual, public, protected, and private), compilation tends
> to be a slow process, and you get all the advantages (AND
> DISADVANTAGES) of a strongly typed language.  (In Python, I was able
> to use the same function for a runge-kutta4 in python for a single
> equation system, and a multiple equation system just by virtue of
> whether I passed it floats or matrices.  You could get that same
> behavior from C++, but you'd have to go well out of your way to
> spoof dynamic typing, or simply write multiple versions of
> runge-kutta4.)

This is by now way, way off topic for this email list.  I will respond
to this portion of the thread one time only.

C: There are no container classes, period.  Or any other kind of
classes.  I try not to use C if I can help it.

C++: I do not agree with your assessment of the market realities.
Every project I've been associated with for the past 18 months has
been fully committed to using the STL.  Your remark has merit from a
historical standpoint, but not much relevance from the standpoint of a
current assessment of the state of C++.  Saying that "[C++] sure seems
to lack any consistency" is patently absurd.  If you would like to be
made aware of some of the modernities of C++, there is a great article
"Foundations for a Native C++ Style" by Koenig and Stroustrup in a
recent SW Practice and Experience.

The rest of your comments above are sufficiently undirected that I
can't formulate a good response.  I'm glad you have rk in python that
pleases you.  My own view is that there is room for both compiled and
interpreted languages in the solution space for large systems
programming projects, and that is precisely why I chair the Python C++
sig.

> I'm pretty willing to bet that the Python language as a whole more
> than stands up to C or C++.

I certainly do not view Python as a credible threat to C++.  Nor do I
view C++ as a credible threat to Python.  That's why I use them
together.

--
Geoffrey Furnish                    email: furnish@lanl.gov
LANL XTM Radiation Transport/POOMA  phone: 505-665-4529  fax: 505-665-7880

"Software complexity is an artifact of implementation."  -Dan Quinlan

From pas@xis.xerox.com Wed Oct 29 19:48:37 1997
From: pas@xis.xerox.com (Perry A. Stoll)
Date: Wed, 29 Oct 1997 11:48:37 PST
Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?)
In-Reply-To: <3.0.1.32.19971029110532.00529480@gate.tdyryan.com>
Message-ID: <199710291948.OAA13584@terminator.xis.xerox.com>

> The C shortcut operators are an abomination for
> code legibility, and contribute mightily to the "write-only" nature of C.

I disagree about the drastic effects of using shortcut operators.
They express a notion very concisely and they are quite easy to parse
visually.  I'm not going to take the extreme view of "we've just *got*
to add <<= to python!", but I would certainly use a few (+=, -=, *=,
and possibly /=) if they were there.

We've been here before on this list.  Take the following example:
which gets across the notion that all we want to do is increment a
variable?

(1)
  self.obj_list[self.obj_offset].count = self.obj_list[self.obj_offst].count + 1

--or--

  obj = self.obj_list[self.obj_offset]
  obj.count = obj.count + 1

--or--

  self.obj_list[self.obj_offset].count += 1

How about we drop the topic until someone provides patches?

-Perry

BTW, the typo in (1) is on purpose... :)

On 29 Oct , Kevin Rodgers wrote:
> One should remember the adage that code is only written ONCE, but is read
> (by humans) MANY TIMES.  The C shortcut operators are an abomination for
> code legibility, and contribute mightily to the "write-only" nature of C.
> IMHO. <0.01 grin>  And I suspect that if you have code in which the speed
> difference between, say, "i = i + 1" and "i++" (and as has been mentioned,
> for compiled languages, any self-respecting optimizing compiler will make
> the speed difference nonexistent) is significant, you either (a) don't know
> what significant means, or (b) have much worse problems than you possibly
> realize!  (OK, 0.1 grin this time)
> ----------------------------------------------------------------------
> Kevin Rodgers     Teledyne Ryan Aeronautical     krodgers@tdyryan.com
> "This one goes up to eleven."
-- Nigel Tufnel
> ----------------------------------------------------------------------

From jhauser@ifm.uni-kiel.de Wed Oct 29 20:06:06 1997
From: jhauser@ifm.uni-kiel.de (Janko Hauser)
Date: Wed, 29 Oct 1997 21:06:06 +0100 (CET)
Subject: [MATRIX-SIG] Re: NumPy in the Python 1.5 era
In-Reply-To: <199710291754.SAA05810@lmspc1.ibs.fr>
References: <199710291754.SAA05810@lmspc1.ibs.fr>
Message-ID:

If there are plans for a new version of NumPy (will this be a final
version?) I would suggest that the documentation needs an update
too.  Especially for interactive use it would be good to have more
doc-strings in the code itself.  I volunteer to start with the
doc-strings.  I will mainly use the currently written information and
put it in the appropriate places.

Should something like gendoc be used for this?  Is there a performance
hit if there are doc-strings in often-used functions?

I know that David Ascher has some more documentation; should this be
integrated into the distribution?  Are there any other big libraries
which should be included?

Finally, I was not at the IPC6.  Is there some new information about
future plans regarding NumPy?

__Janko

From motteler@laura.llnl.gov Wed Oct 29 20:25:03 1997
From: motteler@laura.llnl.gov (Zane C. Motteler)
Date: Wed, 29 Oct 1997 12:25:03 -0800 (PST)
Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?)
In-Reply-To: <199710291948.OAA13584@terminator.xis.xerox.com>
Message-ID:

Just a quick response to Perry Stoll's example,

self.obj_list[self.obj_offset].count = self.obj_list[self.obj_offst].count + 1

It is only necessary to type something like self.obj_list[self.obj_offset].count
once.  Don't we all have a mouse and "copy/paste" capability?  There are
ways to obviate all that typing.  Also, it is possible in Python to alias
complicated expressions and possibly give them more easily readable
names, e.g.

current_object = self.obj_list[self.obj_offset]

Incidentally, I "moused" all of the above expressions; I didn't type them.

Cheers

Zane

-----------------------------------------------------------------------------
Zane C. Motteler, Ph. D.                  ___________ALSO:_________________
Computer Scientist                        | Professor Emeritus of Computer|
Lawrence Livermore National Laboratory    |   Science and Engineering     |
P O Box 808, L-472                        | California Polytechnic State  |
Livermore, CA 94551-9900                  | University, San Luis Obispo   |
510/423-2143                              ---------------------------------
FAX 510/423-9969
zcm@llnl.gov or motteler@laura.llnl.gov or zmottel@calpoly.edu

From jhauser@ifm.uni-kiel.de Wed Oct 29 20:40:26 1997
From: jhauser@ifm.uni-kiel.de (Janko Hauser)
Date: Wed, 29 Oct 1997 21:40:26 +0100 (CET)
Subject: [MATRIX-SIG] random number generator?
In-Reply-To: <199710291927.MAA01336@steam.acl.lanl.gov>
References: <3453CFE1.6782A603@verinet.com>
	<199710291602.JAA00946@steam.acl.lanl.gov>
	<34577A96.23FAD443@verinet.com>
	<199710291927.MAA01336@steam.acl.lanl.gov>
Message-ID:

Uh, there are some points in this argument I really don't understand.
But as far as I can see (I should be slapped for this sentence :0),
NumPy already has something like a+=1 (if the speed gain comes from
avoiding the copy of a):

a=add(a,1,a)

I must admit that this is not nice looking, but it is very useful with
large arrays.  But it may be that I don't see far, although I sit here
with glasses on my nose.

__Janko

PS: Hasn't Phil Austin mentioned some efforts to connect Python with
Blitz++, an expression-template-rich C++ library?

From barrett@compass.gsfc.nasa.gov Wed Oct 29 20:40:12 1997
From: barrett@compass.gsfc.nasa.gov (Paul Barrett)
Date: Wed, 29 Oct 1997 15:40:12 -0500
Subject: [MATRIX-SIG] NumPy in the Python 1.5 era
In-Reply-To: <199710291754.SAA05810@lmspc1.ibs.fr>
References: <199710291754.SAA05810@lmspc1.ibs.fr>
Message-ID: <199710292040.PAA28442@compass.gsfc.nasa.gov>

Konrad Hinsen writes:
 > With the new package feature in 1.5, we could turn NumPy into a package
 > again (veterans may remember that one alpha release used ni-based packages).
 > This obviously raises the question of compatibility. My suggestion is
 > to keep the current modules, but provide an alternate arrangement as
 > a package, and add eventual new modules only to that package. In other
 > words, we would still have module/package Numeric with all its functions,
 > but the other modules (LinearAlgebra etc.) would be available as
 > modules within Numeric, as well as stand-alone modules for compatibility.
 > This would cost only minor modifications plus a few links (no duplicate
 > code). Any comments?
 >

Lately, I too have felt that NumPy needs some restructuring to provide
a cleaner appearance and make it a bona fide package.  I haven't had
the time, though, to look at the details in order to provide any
definite recommendations.
My feeling is that NumPy should be pared down to the minimum number of
array operations and that any other operations that have been added
for ease of use should be placed in another module.  This, in my naive
opinion, would then define the NumPy array class and probably make it
easier for the Matrix-SIG to discuss and develop other specialized
stand-alone modules, such as the Linear Algebra module.

I have also thought about enhancing the array module to include
heterogeneous arrays, where the rightmost index is defined by a
structure format.  This would provide improved performance for many
applications.  For example, I am currently working with photon event
data that is stored in a 2x2 array, where each row of the data is a
structure containing positional, temporal, and spectral information.
It would be nice to do quick calculations on slices of rows and
columns.  I think this has been mentioned before as an enhancement to
the array module.  It doesn't appear to require any major changes to
the package, though a few operations may not work on all indexes.

I would also like the Matrix-SIG to decide on floating point
exceptions.  Jim Hugunin and I discussed this briefly at the IPC6.  I
suggested that we settle on the IEEE FPE, since it seems to me that
all processors will eventually conform to this standard and that most
do already, e.g. x86, PPC, Alpha, MIPS, etc. (Bye, Bye VAX).  We
could then define (or let the user define) one of the many NAN values
to indicate a NULL data type.

 >
 > Even more important is the installation problem. It's still too difficult
 > for many users, and the installation of other C modules using NumPy
 > currently depends on the way NumPy was installed.
 >

I'm all for this!

--
Paul Barrett  -  Astrophysicist  -  Universities Space Research Association
Compton Observatory Science Support Center
NASA/Goddard SFC     phone: 301-286-1108             Spam, spam, spam,
Code 660.1,          FAX:   301-286-1681             eggs, spam, and
Greenbelt,MD 20771   barrett@compass.gsfc.nasa.gov   Python!
http://lheawww.gsfc.nasa.gov/users/barrett/CV.html _______________ MATRIX-SIG - SIG on Matrix Math for Python send messages to: matrix-sig@python.org administrivia to: matrix-sig-request@python.org _______________ From dubois1@llnl.gov Wed Oct 29 21:04:24 1997 From: dubois1@llnl.gov (Paul F. Dubois) Date: Wed, 29 Oct 1997 13:04:24 -0800 Subject: [MATRIX-SIG] Re: NumPy in the Python 1.5 era Message-ID: <9710292103.AA01239@icf.llnl.gov.llnl.gov> We (LLNL) are taking over responsibility for it, and that will include doing the documentation. Now, if we could only figure out how it works... (:->. ---------- > From: Janko Hauser > To: matrix-sig@python.org > Subject: [MATRIX-SIG] Re: NumPy in the Python 1.5 era > Date: Wednesday, October 29, 1997 12:06 PM > > If there are plans for a new version of NumPy (will this be a final > version?) I would suggest that the documentation needs an update > to. Especially for interactiv use it would be good to have more > doc-strings in the code itself. I volunteer to start with the > doc-strings. I will manly use the current written information and put > it in the appropiate places. > > Should something like gendoc used for this? Is there a perfomance hit > if there are the doc-strings in often used functions? > > I know that David Ascher has some more documentation, should this be > integrated in the distribution? Are there any other big librarys, > which should be included? > > At the end, I was not at the IPC6. Is there some new information about > future plans regarding NumPy? > > __Janko > > > _______________ > MATRIX-SIG - SIG on Matrix Math for Python > > send messages to: matrix-sig@python.org > administrivia to: matrix-sig-request@python.org > _______________ _______________ MATRIX-SIG - SIG on Matrix Math for Python send messages to: matrix-sig@python.org administrivia to: matrix-sig-request@python.org _______________ From dubois1@llnl.gov Wed Oct 29 21:22:47 1997 From: dubois1@llnl.gov (Paul F. 
Dubois)
Date: Wed, 29 Oct 1997 13:22:47 -0800
Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?)
Message-ID: <9710292122.AA01372@icf.llnl.gov.llnl.gov>

Python and C are quite different; with a += 1 in C you are definitely
adding one to the storage location for a.  In Python, a = a + 1 is
short for "bind the name a to the result of evaluating a+1".  But a+1
may or may not have anything to do with adding one to something; it
depends on a.  Assuming you implemented a += 1 by adding another
handler to the dispatch mechanism, you would then have raised the ugly
spectre that a += 1 and a = a + 1 could be totally different.  Anyway,
by the time you get done with the dispatch I'm skeptical there would
be any time saved worth having.  So the only real case for it in
Python is as syntactic sugar for a = a + 1.

C++ does indeed have the ability to make a = a + 1 and a++ and a += 1 all
come out differently.  Pardon me if I do not consider this a good thing.

From johann@physics.berkeley.edu Wed Oct 29 21:33:38 1997
From: johann@physics.berkeley.edu (Johann Hibschman)
Date: Wed, 29 Oct 1997 13:33:38 -0800 (PST)
Subject: [MATRIX-SIG] C++/Blitz++ comments
In-Reply-To:
Message-ID:

On Wed, 29 Oct 1997, Janko Hauser wrote:

> PS: Hasn't Phil Austin mentioned some efforts to connect Python with
> Blitz++, a expression-template rich C++ library?

That would be interesting, but I'm not quite sure how such a linkage
could work, since templates are a compile-time phenomenon.  Does anyone
know anything more about this?

In any case, both of my major NumPy/C++ projects have been through
SWIG.  I've just started using the TNT C++ library (from NIST).  With
the right SWIG typemaps, passing NumPy arrays to and from the TNT
Vectors is completely seamless, only requiring a copy or two.

- Johann

---
Johann A.
Hibschman | Grad student in Physics, working in Astronomy. johann@physics.berkeley.edu | Probing pulsar pair production processes. _______________ MATRIX-SIG - SIG on Matrix Math for Python send messages to: matrix-sig@python.org administrivia to: matrix-sig-request@python.org _______________ From pas@xis.xerox.com Wed Oct 29 21:43:00 1997 From: pas@xis.xerox.com (Perry A. Stoll) Date: Wed, 29 Oct 1997 13:43:00 PST Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generat or?) In-Reply-To: Message-ID: <199710292143.QAA15047@terminator.xis.xerox.com> On 29 Oct , Zane C. Motteler wrote: > Just a quick response to Perry Stool's example, > > self.obj_list[self.obj_offset].count = self.obj_list[self.obj_offst].count + 1 > > It is only necessary to type something like self.obj_list[self.obj_offset].count > once. Don't we all have a mouse and "copy/paste" capability? Well, of course we do (that's how I did it for that example!) but... my feeling is that you're doomed once you start down the dark path of cut/paste while coding... ;) Actually, I was just trying to get some complex expression that wouldn't be easily deconstructed at first glance. The point was to ask : which says "add one to this thing" in the most clear way? > There are > ways to obviate all that typing. Also, it is possible in Python to alias > complicated expressions and possibly give them more easily readable > names, e. g. > > current_object = self.obj_list[self.obj_offset] I believe that was the second alternative in my example. Again, if all I want to do is "add one" to something, does aliasing the object express that more clearly than foo += 1? -Perry _______________ MATRIX-SIG - SIG on Matrix Math for Python send messages to: matrix-sig@python.org administrivia to: matrix-sig-request@python.org _______________ From jac@lanl.gov Wed Oct 29 22:11:21 1997 From: jac@lanl.gov (James A. 
Crotinger)
Date: Wed, 29 Oct 1997 15:11:21 -0700
Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?)
In-Reply-To: <9710292122.AA01372@icf.llnl.gov.llnl.gov>
Message-ID: <3.0.3.32.19971029151121.006a9ee0@cic-mail.lanl.gov>

At 01:22 PM 10/29/97 -0800, Paul F. Dubois wrote:
>C++ does indeed have the ability to make a = a + 1 and a++ and a += 1 all
>come out differently. Pardon me if I do not consider this a good thing.

Fortran has the ability to have:

   a = add_one(a)
   a = sum(a,1)
   call increment_by_one(a)

all come out differently as well.  One can do stupid things in any
programming language. 8-)

  Jim

From beach@verinet.com Thu Oct 30 00:08:52 1997
From: beach@verinet.com (David J. C. Beach)
Date: Wed, 29 Oct 1997 17:08:52 -0700
Subject: [MATRIX-SIG] random number generator?
Message-ID: <3457D014.98FBEE6E@verinet.com>

Message-ID: <3457CF32.95F4DEF2@verinet.com>
Date: Wed, 29 Oct 1997 17:05:06 -0700
From: "David J. C. Beach"
Organization: Colorado State University
To: Geoffrey Furnish
Subject: Re: [MATRIX-SIG] random number generator?
References: <3453CFE1.6782A603@verinet.com>
	<199710291602.JAA00946@steam.acl.lanl.gov>
	<34577A96.23FAD443@verinet.com>
	<199710291927.MAA01336@steam.acl.lanl.gov>

Geoffrey Furnish wrote:

Regarding your argument that a += 1 is inherently faster than a = a +
1, I urge you again to examine FORTRAN.
ANY good FORTRAN compiler WILL make that optimization.  I promise you
that there's nothing inherently faster about one than the other.  If
your C and C++ compilers aren't doing that for you, it could be for
one of two reasons:

1) the compiler you are using is optimization shy.

2) C has a long history of forcing programmers to make optimizations
themselves.  Look at the keyword 'register', and the new 'inline'
directive in C++.  Look at the ++, --, +=, *=, /=, -=, etc. operators.
These all make the language bigger, and encourage the programmer to
optimize his/her code using language tricks.  This in no way improves
readability, and optimizations of this type are something that can
(and should) be done by the compiler, NOT the programmer.

> > I'm pretty willing to bet that the Python language as a whole more
> > than stands up to C or C++.
>
> I certainly do not view Python as a credible threat to C++.  Nor do I
> view C++ as a credible threat to Python.  That's why I use them
> together.

No no no....  Python will probably never be a threat to C++.
Obviously Python is an interpreted language and C++ is a compiled
language.  These two different tools can often be complementary, as you
pointed out.  HOWEVER, I do feel that C++ is just plain gross when it
comes to syntax and consistency.  Maybe if I had money to shell out,
I'd see the brighter side of things, but all I've had to work with is
the g++ compiler.  I've seen the Tools.h++ library, the STL library,
and others, and they all differ.

After 40k lines for my last (and maybe last ever) programming project
in C++, I can't say I really think much of the language.  It seems
like a hog butchering to me.  Maybe six different types of
inheritance, and the ability to make the compiler perform arbitrary
computations during compilation (a la expression templates) are really
cool features, but I'm starting to think that C++ just is too big for
its britches.
You can say that C++ when used with STL is beautifully consistent, and
I'd be more inclined to believe you, but the fact remains that most of
the C++ out there doesn't use STL, but uses some other crap.  It's
going to take a long time for any of that to change.  If you have to
interface your program with any other C or older C++ tools, you can
take that consistency and flush it down the toilet.  Consistency means
everybody using similar mechanisms to get similar behavior, and, in
that light, C++ really just isn't consistent.

I also feel strongly that the addition of too many operators can
change a language for the worse.  I've found Python's readability to
be a refreshing change from C++, and I believe that Guido et al.
exercised a careful and necessary caution when deciding what operators
to add to the language.  As it was brought up, code is written only
once and read many times.  Donald Knuth knew what he was doing when he
invented WEB to create TeX.  I believe that Guido exhibited similar
wisdom when he designed Python.

Dave

--
David J. C. Beach
Colorado State University
mailto:beach@verinet.com
http://www.verinet.com/~beach

From turner@blueskystudios.com Wed Oct 29 23:15:57 1997
From: turner@blueskystudios.com (John Turner)
Date: Wed, 29 Oct 1997 18:15:57 -0500
Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?)
In-Reply-To: <3.0.3.32.19971029151121.006a9ee0@cic-mail.lanl.gov>
References: <9710292122.AA01372@icf.llnl.gov.llnl.gov>
	<3.0.3.32.19971029151121.006a9ee0@cic-mail.lanl.gov>
Message-ID: <199710292315.SAA05372@oneida>

James A. Crotinger writes:

 > At 01:22 PM 10/29/97 -0800, Paul F. Dubois wrote:
 >
 > > C++ does indeed have the ability to make a = a + 1 and a++ and
 > > a += 1 all come out differently.
Pardon me if I do not consider
 > > this a good thing.
 >
 > Fortran has the ability to have:
 >
 >    a = add_one(a)
 >    a = sum(a,1)
 >    call increment_by_one(a)
 >
 > all come out differently as well.  One can do stupid things in any
 > programming language. 8-)

And with F90 you can overload intrinsic operators (e.g. +) and/or
assignment (=), so you can blow away chunks of your lower limbs almost
as large as with C++.

However, you are only allowed to extend the meaning of an intrinsic
operator, not change its meaning for intrinsic data types.

However, this is becoming increasingly unrelated to NumPy and the
original issue...

--
John A. Turner                      mailto:turner@blueskystudios.com
Senior Research Associate           http://www.blueskystudios.com
Blue Sky | VIFX                     http://www.vifx.com
One South Road, Harrison, NY 10528  http://www.lanl.gov/home/turner
Phone: 914-381-8400                 Fax: 914-381-9790/1

From amullhau@ix.netcom.com Thu Oct 30 00:51:24 1997
From: amullhau@ix.netcom.com (Andrew P. Mullhaupt)
Date: Wed, 29 Oct 1997 19:51:24 -0500
Subject: [MATRIX-SIG] NumPy in the Python 1.5 era
In-Reply-To: <199710292040.PAA28442@compass.gsfc.nasa.gov>
References: <199710291754.SAA05810@lmspc1.ibs.fr>
	<199710291754.SAA05810@lmspc1.ibs.fr>
Message-ID: <3.0.1.32.19971029195124.009704e0@popd.netcruiser>

At 03:40 PM 10/29/97 -0500, Paul Barrett wrote:

>I would also like the Matrix-SIG to decide on floating point
>exceptions.  Jim Hugunin and I discussed this briefly at the IPC6.  I
>suggested that we settle on the IEEE FPE, since it seems to me that
>all processors will eventually conform to this standard and that most
>do already, e.g. x86, PPC, Alpha, MIPS, etc. (Bye, Bye VAX).  We
>could then define (or let the user define) one of the many NAN values
>to indicate a NULL data type.

Absolutely.  I second this.
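
For readers unfamiliar with the proposal quoted above, here is a
minimal sketch of using an IEEE NaN as a "no data" marker.  It assumes
present-day Python 3 on an IEEE 754 platform, not Python 1.4/1.5 or
any Numeric API; the helper name mean_ignoring_missing is hypothetical
and chosen only for illustration.

```python
import math

# A quiet NaN; assumes the platform's floats follow IEEE 754.
nan = float("nan")

# NaN is the only float value that compares unequal to itself, which
# is why an explicit test (math.isnan) is needed to detect it.
assert nan != nan
assert math.isnan(nan)


def mean_ignoring_missing(values):
    """Average the entries that are not NaN; return NaN if none remain.

    Hypothetical helper, not part of Numeric or NumPy.
    """
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return float("nan")
    return sum(present) / len(present)


samples = [1.0, nan, 3.0]          # middle entry is "missing"
result = mean_ignoring_missing(samples)
all_missing = mean_ignoring_missing([nan, nan])
```

The point of the sketch is only that a NaN entry can travel through an
array of floats and be skipped explicitly; which NaN bit pattern to
reserve, and how exceptions are raised, is the open question in the
thread.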
Later, Andrew Mullhaupt _______________ MATRIX-SIG - SIG on Matrix Math for Python send messages to: matrix-sig@python.org administrivia to: matrix-sig-request@python.org _______________ From phil@geog.ubc.ca Thu Oct 30 03:51:14 1997 From: phil@geog.ubc.ca (Phil Austin) Date: Wed, 29 Oct 1997 19:51:14 -0800 Subject: [MATRIX-SIG] C++/Blitz++ comments In-Reply-To: References:

Message-ID: <199710300351.TAA00795@curlew.geog.ubc.ca> >>>>> "JH" == Johann Hibschman writes: JH> On Wed, 29 Oct 1997, Janko Hauser wrote: >> PS: Hasn't Phil Austin mentioned some efforts to connect Python >> with Blitz++, a expression-template rich C++ library? JH> That would be interesting, but I'm not quite sure how such a JH> linkage could work, since templates are a compile-time JH> phenomenon. Does anyone know anything more about this? JH> In any case, both of my major NumPy/C++ projects have been JH> through SWIG. I've just started using the TNT C++ library JH> (from NIST). With the right SWIG typemaps, passing NumPy JH> arrays to and from the TNT Vectors is completely seamless, JH> only requiring a copy or two. That's all we're doing as well--we circulated some SWIG typemaps that used MV++ several months ago (with an error or two in those examples, unfortunately). We were waiting to migrate the examples from MV++ to TNT when Roldan Pozo adds a constructor that takes a C pointer and the array dimensions (eliminating that array copy). Blitz has this constructor in the latest release. The other good news is that egcs (http://www.cygnus.com/egcs/) now compiles the Blitz example suite, so we won't have to increase our KCC licence count to move everyone to Blitz. Phil _______________ MATRIX-SIG - SIG on Matrix Math for Python send messages to: matrix-sig@python.org administrivia to: matrix-sig-request@python.org _______________ From hinsen@ibs.ibs.fr Thu Oct 30 11:11:00 1997 From: hinsen@ibs.ibs.fr (Konrad Hinsen) Date: Thu, 30 Oct 1997 12:11:00 +0100 Subject: Shortcut operators (was Re: [MATRIX-SIG] random number generator?) In-Reply-To: <3.0.1.32.19971029110532.00529480@gate.tdyryan.com> (message from Kevin Rodgers on Wed, 29 Oct 1997 11:05:32 -0800) Message-ID: <199710301111.MAA08230@lmspc1.ibs.fr> > One should remember the adage that code is only written ONCE, but is read > (by humans) MANY TIMES. 
> The C shortcut operators are an abomination for code legibility, and
> contribute mightily to the "write-only" nature of C.

I know this is a frequently cited argument, but I don't agree. Think
about how you describe a step in an algorithm in plain English. Do you
say "increment a by one" or "set a to the sum of a and one"? I'd say
the notion of "modifying the value of a variable" is a sufficiently
basic and widespread one to be worth expressing in a programming
language.

But although I like these operators in C and C++, I wouldn't want them
in Python. To my mind they indicate "changing the value of an object",
which is not a meaningful operation on Python number types. In a
reference-based language like Python, a+=1 would have to be defined in
terms of standard assignment, and would thereby lose the
straightforward meaning of "increment a".

-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From hinsen@ibs.ibs.fr Thu Oct 30 11:26:19 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Thu, 30 Oct 1997 12:26:19 +0100
Subject: [MATRIX-SIG] NumPy in the Python 1.5 era
In-Reply-To: <9710291912.AA00249@icf.llnl.gov.llnl.gov> (busby@icf.llnl.gov)
Message-ID: <199710301126.MAA08269@lmspc1.ibs.fr>

> KH> With the new package feature in 1.5, we could turn NumPy into a package
> KH> again (veterans may remember that one alpha release used ni-based packages).
> KH> This obviously raises the question of compatibility.
> KH> My suggestion is to keep the current modules, but provide an alternate
> KH> arrangement as a package, and add eventual new modules only to that
> KH> package. In other words, we would still have module/package Numeric
> KH> with all its functions, but the other modules (LinearAlgebra etc.)
> KH> would be available as modules within Numeric, as well as stand-alone
> KH> modules for compatibility. This would cost only minor modifications
> KH> plus a few links (no duplicate code). Any comments?
>
> My only comment is that I'm not very wedded to backward compatibility.
> If NumPy can be presented as a Python-1.5 package in a neat and generic
> fashion, I would be perfectly happy if that were the *only* (Unix)
> distribution format.

Python 1.5 packages are not really a distribution format. The name
"package" is a bit confusing since it has been used for two distinct
concepts:

1) A distribution unit of modules.
2) A hierarchy of modules.

NumPy already is a package in the first sense, and I think everyone
agrees that making this package easier to install and use is a Good
Thing. This would have no (well, almost no) effect on code that relies
on it.

Turning NumPy into a hierarchy of modules (rather than a set of
parallel modules) is something else, and would require a change in all
other modules and scripts using NumPy. I am not a big fan of backwards
compatibility either, but a new release that would break most existing
code would probably find little acceptance. That's why I propose, at
least for a transition period, to make modules available both at the
top level (for compatibility) and as submodules of Numeric (for new
development).

> I like that part about volunteering. I haven't really studied the
> new 1.5 package facility, so can't comment on it yet. Your idea for
> exposing the C API to other dynamically loaded modules sounds plausible.

And it's totally independent of the package question - in fact, it
involves major changes only in the C header files.
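
As a rough illustration of the transition arrangement (one module
reachable both as a submodule and at the top level, with no duplicated
code), here is a minimal sketch in present-day Python rather than 1.5.
The names DemoNumeric and DemoLinearAlgebra, the trace function and the
helpers are hypothetical stand-ins, and a one-line re-export module is
used in place of the file-system links mentioned above, so this shows
the idea only and not the actual NumPy layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_tree(root: Path) -> None:
    """Write a toy package plus a top-level compatibility module under root."""
    pkg = root / "DemoNumeric"
    pkg.mkdir()
    # The hierarchy: new code imports DemoNumeric.LinearAlgebra.
    (pkg / "__init__.py").write_text("")
    (pkg / "LinearAlgebra.py").write_text(
        "def trace(m):\n"
        "    return sum(m[i][i] for i in range(len(m)))\n"
    )
    # The compatibility module: old code keeps importing the top-level
    # name, which only re-exports the submodule's public names.
    (root / "DemoLinearAlgebra.py").write_text(
        "from DemoNumeric.LinearAlgebra import *\n"
    )


def load_both():
    """Import the module by both routes and return (new_style, old_style)."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_tree(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            new_style = importlib.import_module("DemoNumeric.LinearAlgebra")
            old_style = importlib.import_module("DemoLinearAlgebra")
        finally:
            sys.path.remove(tmp)
    return new_style, old_style
```

Both import routes end up at the same function object, so existing
scripts and new package-style code share a single implementation.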
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From hinsen@ibs.ibs.fr Thu Oct 30 11:34:25 1997
From: hinsen@ibs.ibs.fr (Konrad Hinsen)
Date: Thu, 30 Oct 1997 12:34:25 +0100
Subject: [MATRIX-SIG] NumPy in the Python 1.5 era
In-Reply-To: <199710292040.PAA28442@compass.gsfc.nasa.gov> (message from Paul Barrett on Wed, 29 Oct 1997 15:40:12 -0500)
Message-ID: <199710301134.MAA08290@lmspc1.ibs.fr>

> definite recommendations. My feeling is that NumPy should be pared
> down to the minimum number of array operations and that any other
> operations that have been added for ease of use should be placed in
> another module. This, in my naive opinion, would then define the
> NumPy array class and probably make it easier for the Matrix-SIG to
> discuss and develop other specialized stand-alone modules, such as the
> Linear Algebra module.

I doubt we could ever agree on a "minimum set" of array operations. It
depends too much on what you are doing. But one possible hierarchical
structure would certainly be to have just the array constructor in the
top level module and everything else in submodules.

> I have also thought about enhancing the array module to include
> heterogeneous arrays, where the rightmost index is defined by a
> structure format. This would provide improved performance for many

This is one example of array specializations.
One reasonable approach to this problem would be to use the
ExtensionClass system and thereby make arrays subclassable in C and
Python.

> I would also like the Matrix-SIG to decide on floating point
> exceptions. Jim Hugunin and I discussed this briefly at the IPC6. I
> suggested that we settle on the IEEE FPE, since it seems to me that
> all processors will eventually conform to this standard and that most
> do already, e.g. x86, PPC, Alpha, MIPS, etc. (Bye, Bye VAX). We
> could then define (or let the user define) one of the many NAN values
> to indicate a NULL data type.

I agree that IEEE is the only reasonable choice. We could still try to
make NumPy work without IEEE, but then also without exception handling.

-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen@ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------