From e.maryniak at pobox.com Mon Jul 1 01:48:01 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Mon Jul 1 01:48:01 2002 Subject: [Numpy-discussion] Numarray: minor feature requests (setup.py and version info) In-Reply-To: <3D1F0839.2090802@stsci.edu> References: <3D1F0839.2090802@stsci.edu> Message-ID: <200207011047.25000.e.maryniak@pobox.com> On Sunday 30 June 2002 15:31, Todd Miller wrote: > Perry Greenfield wrote: > >... > >>2. Because I'm running two versions of Python (because Zope > >> and a lot of Zope/C products depend on a particular version) > >> the 'development' Python is installed in /usr/local/bin > >> (whereas SuSE's python is in /usr/bin). > >> It probably wouldn't do any harm if the manual would include > >> a hint at the '--prefix' option and mention an alternative > >> Python installation like: > >> > >> /usr/local/bin/python ./setup.py install --prefix=/usr/local > > > >Good idea. > > I'm actually surprised that this is necessary. I was under the > impression that the distutils pick reasonable defaults simply based on > the python that is running. In your case, I would expect numarray to > install to /usr/local/lib/pythonX.Y/site-packages without specifying any > prefix. What happens on SuSE? Yes, you're probably right. On SuSE I tested it out on my own machine ('test server'), because I did not want to do it on the production server. It runs Python 2.2.1 exclusively. I remembered that I had to do this in a previous Numeric installation, where 1.5.2 and 2.1 were running side-by-side (and at that time I also had to install distutils manually). So, yes, it may not be an issue (anymore) for at least recent Pythons if you call the Python explicitly like '/usr/local/bin/python ./setup.py' and '/usr/bin/python ./setup.py' (on SuSE python goes to /usr/bin). > > >>... Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. It said 'Insert disk #3', but only two will fit. From hinsen at cnrs-orleans.fr Mon Jul 1 08:48:10 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Jul 1 08:48:10 2002 Subject: [Numpy-discussion] Scientific Python 2.4 Message-ID: <200207011543.g61FhHL25160@chinon.cnrs-orleans.fr> Scientific Python 2.4 --------------------- Scientific Python is a module library for scientific computing. In this collection you will find modules that cover basic geometry (vectors, tensors, transformations, vector and tensor fields), quaternions, automatic derivatives, (linear) interpolation, polynomials, elementary statistics, nonlinear least-squares fits, unit calculations and conversions, Fortran-compatible data formatting, 3D visualization via VRML, two Tk widgets for simple line plots and 3D wireframe models. Scientific Python also contains Python interfaces to the netCDF library (implementing a portable binary format for large arrays) and the Message Passing Interface, the most widely used communications library for parallel computers. Version 2.4 of Scientific Python has just been released. In addition to numerous small improvements and bug fixes, it contains - the high-level parallelization module Scientific.BSP - an interface to the parallelization library BSPlib (see www.bsp-worldwide.org for details) - autoregressive models for time series in Scientific.Signals.Models The BSP parallelization module was designed to facilitate development and testing of parallel programs.
Its main features are: - communication can handle almost any Python object - deadlocks are impossible by design - possibility to implement distributed data classes that can be used transparently by parallel applications - an interactive parallel interpreter that can be used inside Emacs (and perhaps other Python development environments) in order to provide an interactive parallel programming environment - parallel programs run as serial monoprocessor code on any Python installation with no changes and usually negligeable loss of performance - no need to maintain a separate serial version A tutorial on BSP programming with Python is available at the Web site and included in the distribution. For more information and for downloading, see http://dirac.cnrs-orleans.fr/ScientificPython or http://starship.python.net/crew/hinsen/scientific.html -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Mon Jul 1 16:41:28 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 1 16:41:28 2002 Subject: [Numpy-discussion] Numarray: minor feature requests (setup.py and version info) In-Reply-To: <200207011047.25000.e.maryniak@pobox.com> Message-ID: <002601c22158$90f7e900$0c01a8c0@NICKLEBY> distutils installs into the python used to run the setup.py by using the sys.exec_prefix and sys.prefix. You would not normally need to use any option unless you are trying to install something "off to the side" because, for example, you don't have write permission in that python's site-packages directory. From paul at pfdubois.com Mon Jul 1 16:50:57 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 1 16:50:57 2002 Subject: [Numpy-discussion] words that must not be spoken In-Reply-To: <200206262047.00731.e.maryniak@pobox.com> Message-ID: <002701c22159$ca4fd270$0c01a8c0@NICKLEBY> > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Eric Maryniak In the midst of a discussion Eric wrote: > > ... > shouldn't Convolve, for > orthogonality, be named > Convolve2? (cuz who knows, numarray's Convolve may be backported > to Numeric in the future, for comparative testing etc.). Use of the phrase "backported to Numeric" will result in your subscription to numpy-discussion being cancelled. (:-> No backporting is ever going to happen. This is a short one-way street or there is no purpose to travel on it. I am just back from Europython and had a chance to talk to a lot of users and have some thoughts which I will share with all of you shortly. However, since I just had to fill out a form and where it said "Date" I looked at my watch and wrote the time 11/16, I conclude that I have jet lag and can't trust myself to be lucid yet. From jae at zhar.net Mon Jul 8 02:49:01 2002 From: jae at zhar.net (John Eikenberry) Date: Mon Jul 8 02:49:01 2002 Subject: [Numpy-discussion] Optimization advice Message-ID: <20020708094805.GA370@kosh.zhar.net> I'm working on an influence map [1] for game civil [2]. I have a working version, but as a real numeric newbie I thought I'd bounce it off the people here before calling it done. I'm basically looking for an easy to understand but fast influence spreading algorithm. 
I've read that this algorithm is similar to those used to predict fire spreading or heat transfer in metal if that helps. The attached code is setup for a hex based map and the functions to take this into accounts (shift_hex_up,shift_hex_down) are probably the most naive. The others being only slight modifications of those in the life.py example. Its not really commented but its short and hopefully should be readily understandable. I've only included the base influence map class and its associated functions. If you'd like a version you can run, I can send you a .tgz setup to run in place (for *nix systems). Thanks in advance for any advice or opinions. [1] An influence map is used commonly in strategic war games. It is a simple means of capturing the areas on the game map that one side is strong vs the other side. Read the first post in this thread for a good description: http://www.gameai.com/influ.thread.html [2] Civil is a cross-platform, turn-based, networked strategy game, developed using Python, PyGame and SDL--allowing players to take part in scenarios set during the American Civil war. http://civil.sourceforge.net/ -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "They who can give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety." --B. Franklin -------------- next part -------------- # /usr/bin/env python from Numeric import * factor = array(6.).astype(Float16) edge_mod = array(0.66).astype(Float16) class InfluenceMap: def __init__(self,hex_map): self.map_size = map_size = hex_map.size self._iterations = (map_size[0] + map_size[1])/4 self.hex_map = hex_map # weightmap == influence map self.weightmap = zeros((map_size[0],map_size[1]),Float16) # constmap = initial state with constraints/constants self.constmap = zeros((map_size[0],map_size[1]),Float16) def step(self,iterations=None): constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations while iterations: # spread the influence # diamond_h neighbors = _shift_up(weightmap)/factor neighbors += _shift_left(weightmap)/factor neighbors += _shift_right(weightmap)/factor neighbors += _shift_down(weightmap)/factor neighbors += _shift_hex_up(weightmap)/factor neighbors += _shift_hex_down(weightmap)/factor # constrain initial points to prevent overheating putmask(neighbors,constmap,constmap) weightmap = neighbors iterations -= 1 self.weightmap = weightmap def shift_up(cells): return concatenate((cells[1:], cells[-1:]*edge_mod)) def shift_down(cells): return concatenate((cells[:1]*edge_mod, cells[:-1])) def shift_left(cells): return transpose(shift_up(transpose(cells))) def shift_right(cells): return transpose(shift_down(transpose(cells))) # for array layout def shift_hex_up(cells): neighbors = array(cells) # add to odd cell rows [1::2] neighbors[1::2] = shift_left(shift_up(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_up(cells))[::2] return neighbors def shift_hex_down(cells): neighbors = array(cells) # odd cell rows [1::2] neighbors[1::2] = shift_left(shift_down(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_down(cells))[::2] return neighbors From dubois1 at llnl.gov Mon Jul 8 09:10:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Mon Jul 8 09:10:04 2002 Subject: [Numpy-discussion] Caution -- // not standard Message-ID: <1026144543.13905.3.camel@ldorritt> I have run into several cases of this on different open-source projects, 
the latest being an incorrect change in Numeric's arrayobject.c: the use of // to start a comment. Many contributors who work only with Linux have come to believe that this works with other C compilers, which is not true. This construct comes from C++. Please avoid this construct when contributing changes or patches to Numeric. From bsder at mail.allcaps.org Mon Jul 8 12:03:09 2002 From: bsder at mail.allcaps.org (Andrew P. Lentvorski) Date: Mon Jul 8 12:03:09 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <1026144543.13905.3.camel@ldorritt> Message-ID: <20020708114304.T66456-100000@mail.allcaps.org> Actually, // is standard C99 released December 1, 1999 as ISO/IEC 9899:1999. It also has support for variable length arrays, a complex number type and a bunch of *portable* stuff for getting at numerical information (limits, floating-point environment) rather than nasty compiler specific hacks. ( See: http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9x.htm ) Many of these extensions are specifically for the numerical community. I would recommend taking up the issue of non-standards compliance with your compiler vendor. -a On 8 Jul 2002, Paul Dubois wrote: > I have run into several cases of this on different open-source projects, > the latest being an incorrect change in Numeric's arrayobject.c: the use > of // to start a comment. Many contributors who work only with Linux > have come to believe that this works with other C compilers, which is > not true. This construct comes from C++. Please avoid this construct > when contributing changes or patches to Numeric. From paul at pfdubois.com Mon Jul 8 12:38:05 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 8 12:38:05 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <20020708114304.T66456-100000@mail.allcaps.org> Message-ID: <001101c226b6$da5f3090$0c01a8c0@NICKLEBY> Thank you for the clarification. Unfortunately, "my" compiler vendor is the set of all compiler vendors that users of Numeric have, and we have to restrict ourselves to what works. I misspoke when I said it was "not standard"; I should have said, "doesn't work everywhere". > -----Original Message----- > From: Andrew P. Lentvorski [mailto:bsder at mail.allcaps.org] > Sent: Monday, July 08, 2002 12:02 PM > To: Paul Dubois > Cc: numpy-discussion at lists.sourceforge.net > Subject: Re: [Numpy-discussion] Caution -- // not standard > > > Actually, // is standard C99 released December 1, 1999 as > ISO/IEC 9899:1999. > > It also has support for variable length arrays, a complex > number type and a bunch of *portable* stuff for getting at > numerical information (limits, floating-point environment) > rather than nasty compiler specific hacks. ( See: > http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9> x.htm ) > > Many > of these extensions are specifically for the > numerical community. > > I would recommend taking up the issue of non-standards > compliance with your compiler vendor. > > -a > > On 8 Jul 2002, Paul Dubois wrote: > > > I have run into several cases of this on different open-source > > projects, the latest being an incorrect change in Numeric's > > arrayobject.c: the use of // to start a comment. Many > contributors who > > work only with Linux have come to believe that this works > with other C > > compilers, which is not true. This construct comes from C++. Please > > avoid this construct when contributing changes or patches > to Numeric. 
> > From jae at zhar.net Tue Jul 16 00:12:13 2002 From: jae at zhar.net (John Eikenberry) Date: Tue Jul 16 00:12:13 2002 Subject: [Numpy-discussion] Optimization advice In-Reply-To: <20020708094805.GA370@kosh.zhar.net> References: <20020708094805.GA370@kosh.zhar.net> Message-ID: <20020716070554.GB363@kosh.zhar.net> After getting some advice off the pygame list I think I have a pretty good version of my influence map now. I thought someone on this list might be interested or at least someone checking the archives. The new and improved code is around 6-7x faster. The main gain was obtained by converting all the array functions to slice notation and eliminating most of the needless copying of arrays. The new version is attached and is much better commented. It is also unabridged, as it was pointed out that it wasn't entirely clear what was going on in the last (edited) version. Hopefully things are more obvious in this one. Anyways... I just hating leaving a thread without a conclusion. Hope someone finds this useful. -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "They who can give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety." --B. Franklin -------------- next part -------------- # /usr/bin/env python # # John Eikenberry from Numeric import * from types import * FACTOR = array(6.).astype(Float32) EDGE_MOD = array(0.66).astype(Float32) ONE = array(1.).astype(Float32) ZERO = array(0.).astype(Float32) class InfluenceMap: """ There are 2 primary ways to setup the influence map, either might be useful depending on your needs. The first is to recreate the map each 'turn' the second is to keep the map around and just update it each turn. The first way is simple and easy to understand, both in terms of tweaking and later analysis. The second gives the map a sense of time and allows for fewer iterations of the spreading algorithm per 'turn'. Setting up the map to for one or the other of these is a matter of tweaking the code. There are 3 main bits of code which are described below and indicated via comments in the code. First some terminology: - weightmap stores the current influence map - neighbors is used as the memory buffer to calculate a the influence spreading - constmap contains a map with only the unit's scores present - when I refer to a 'multi-turn map' I mean using one instance of the influence map throughout the game without resetting it. [1] neighbors *= ZERO At the end of each iteraction, the neighbors take on the values of the weightmap from the previous step. This will reset those values to zero. This has a 1% performance hit. [2] putmask(neighbors,constmap,constmap) This keeps the values of the units hexes constant through all iterations. This results in about a 40% performance hit. This needs improvement. [3] setDecayRate([float]) This is meant to be used with a multi-turn map. It sets the floating point value (N>0.0<1.0)which is used on the map each turn to modify the current map before the influence spreading. No performance hit. If just [1] used then it will cause all influence values to decend toward zero. Not sure what this would be useful for, just documenting the effect. If [1] is not used (commented out) then the map values will never balance out, rising with each iteration. This is fine if you plan on resetting the influence map each turn. Allowing you to tweak the number of iterations to get the level of values you want. 
But it would cause problem with a multi-turn map unless [3] is used to keep this in check. Using [2] without [1] will accellerate the rising of the values described above. It will also lead to more variation amoung the influence values when using fewer iterations. High peaks and steep sides. Using neither [1] nor [2] the peaks are much lower. If [1] and [2] are both used the map will always attain a point of balance no matter how many iterations are run. This is desirable for maps used throughout the entire game (multi-turn maps) for obvious reasons. Given the effect of [1] this also limits the need for [3] as the influence values in areas of the map where units are no longer present will naturally decrease. Though the decay rate may still be useful for tweaking this. """ _decay_rate = None def __init__(self,hex_map): """ hex_map is the in game (civl) map object """ self.map_size = map_size = hex_map.size ave_size = (map_size[0] + map_size[1])/2 self._iterations = ave_size/2 # is the hex_map useful for anything other than size? self.hex_map = hex_map # weightmap == influence map self.weightmap = weightmap = zeros((map_size[0],map_size[1]),Float32) # constmap == initial unit locations self.constmap = zeros((map_size[0],map_size[1]),Float32) def setUnitMap(self,units): """ Put unit scores on map -units is a list of (x,y,score) tuples where x,y are map coordinates and score is the units influence modifier """ weightmap = self.weightmap constmap = self.constmap constmap *= ZERO # mayby use the hex_map here to get terrain effects? for (x,y,score) in units: weightmap[x,y] = score constmap[x,y]=score def setInterations(self,iterations): """ Set number of times through the influence spreading loop """ assert type(iterations) == IntType, "Bad arg type: setIterations([int])" self._iterations = iterations # [3] above def setDecayRate(self,rate): """ Set decay rate for a multi-turn map. """ assert type(rate) == FloatType, "Bad arg type: setDecayRate([float])" self._decay_rate = array(rate).astype(Float32) def reset(self): """ Reset an existing map back to zeros """ map_size = self.map_size self.weightmap = zeros((map_size[0],map_size[1]),Float32) def step(self,iterations=None): """ One set of loops through influence spreading algorithm """ # save lookup time constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations # decay rate can be used when the map is kept over duration of game, # instead of a new one each turn. the old values are retained, # degrading slowly over time. this allows for fewer iterations per turn # and gives sense of time to the map. its experimental at this point. if self._decay_rate: weightmap = weightmap * self._decay_rate # It might be possible to pre-allocate the memory for neighbors in the # init method. But I'm not sure how to update that pre-allocated array. 
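        # (Editorial note, not part of the original posting: one way to update a
        # pre-allocated buffer in Numeric is slice assignment. A hypothetical
        # attribute created in __init__ as
        #     self._neighbors = zeros((map_size[0], map_size[1]), Float32)
        # could be refreshed here with
        #     self._neighbors[:,:] = weightmap
        # which copies values into the existing array instead of allocating a
        # new one each call.)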
neighbors = weightmap.copy() # spread the influence while iterations: # [1] in notes above # neighbors *= ZERO # diamond_hex layout neighbors[:-1,:] += weightmap[1:,:] # shift up neighbors[1:,:] += weightmap[:-1,:] # shift down neighbors[:,:-1] += weightmap[:,1:] # shift left neighbors[:,1:] += weightmap[:,:-1] # shift right neighbors[1::2][:-1,:-1] += weightmap[::2][1:,1:] # hex up (even) neighbors[1::2][:,:-1] += weightmap[::2][:,1:] # hex down (even) neighbors[::2][:,1:] += weightmap[1::2][:,:-1] # hex up (odd) neighbors[::2][1:,1:] += weightmap[1::2][:-1,:-1] # hex down (odd) # keep influence values balanced neighbors *= (ONE/FACTOR) # [2] above - maintain scores in unit hexes # putmask(neighbors,constmap,constmap) # 'putmask' adds almost 40% to the overhead. There should be a # faster way. A little testing seems to show that this problem is # related to the usage of floats for the map values. # prepare for next iteration weightmap,neighbors = neighbors,weightmap iterations -= 1 # save for next turn self.weightmap = weightmap From paul at pfdubois.com Thu Jul 18 14:47:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Jul 18 14:47:02 2002 Subject: [Numpy-discussion] [ANNOUNCE] Pyfort 8.0 Message-ID: <001f01c22ea4$89ff4f40$0b01a8c0@NICKLEBY> Pyfort 8.0 has been released at SourceForge (sf.net/projects/pyfortran) Version 8 This version contains a new facility for making and installing projects. Old compile lines will still work, but will produce an equivalent .pfp file that you could use in the future. Included is a Tkinter-based GUI editor for the project files. However, the format of the files is simple and they could be edited with a text editor as well. There is improved support for installing Pyfort and the modules it creates in a location other than inside Python. See README. This version does change the installation location for an extension. Therefore, you should remove the files of any previous installation from your Python. Yes, this is annoying. That is why we are doing it, so that we can have an "uninstall" command. A new "windows" subdirectory has been added, containing an example of how to use Pyfort on Windows with Visual Fortran. Thanks to Reinhold Niesner. Testing of, and advice about, this are needed from Windows users. The pyfort script itself is also now installed as a .bat script for win32. Support for Mac OSX (Darwin) added. From biesingert at yahoo.com Fri Jul 19 01:13:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Fri Jul 19 01:13:03 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 Message-ID: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Hi, when I try to install NumPy on Mac OS X.1.5, it fails on this error: .... cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so /usr/bin/ld: -undefined error must be used when -twolevel_namespace is in effect error: command 'cc' failed with exit status 1 ~/Python/Numeric-21.3 % cc cc: No input files I had thought to submit this to the developers section of the list but could not find the way to subscribe to it ;-) If somehow had a running version of NumPy with for Mac OSX http://tony.lownds.com/macosx, I would appreciate it. Thanks everyone for their help! Regards, Thomas __________________________________________________ Do You Yahoo!? Yahoo! 
Autos - Get free new car price quotes http://autos.yahoo.com From rob at pythonemproject.com Fri Jul 19 05:35:04 2002 From: rob at pythonemproject.com (rob) Date: Fri Jul 19 05:35:04 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 References: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: <3D3806B3.F2BE1C1A@pythonemproject.com> Thomas Biesinger wrote: > > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect > error: command 'cc' failed with exit status 1 > ~/Python/Numeric-21.3 % cc > cc: No input files > > I had thought to submit this to the developers section of the list > but could not find the way to subscribe to it ;-) > > If somehow had a running version of NumPy with for Mac OSX > http://tony.lownds.com/macosx, I would appreciate it. > > Thanks everyone for their help! > > Regards, > Thomas > > __________________________________________________ > Do You Yahoo!? > Yahoo! Autos - Get free new car price quotes > http://autos.yahoo.com > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion Hi Thomas, sorry I don't have the expertise to help you with your question. I am wondering if you are using one of Apple's new G4 machines? I'm curious about the floating point performance of those chips. If you ever get Numpy working, I have a routine that I use for a benchmark, a Norton-Summerfeld ground (antenna) simulation routine that I could send to you. The record for me is 120s on a P4 1.8Ghz at work, but I'm sure the new Xeons would beat that, and maybe the new Athlons. My 1.2Ghz DDR Athlon is much slower than the P4, but the clock speeds are so much different. Rob. -- ----------------------------- The Numeric Python EM Project www.pythonemproject.com From welch at cs.unc.edu Fri Jul 19 05:52:01 2002 From: welch at cs.unc.edu (Greg Welch) Date: Fri Jul 19 05:52:01 2002 Subject: FW: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <200207191053.g6JArGbE017359@wren.cs.unc.edu> Message-ID: Thomas, I have (recently) built Numeric 21.3 on multiple OS X 10.1.5 platforms, and have had no problems that I know of. I am using Python 2.3a0 but had also built Numeric w/ earlier versions of Python too. All platforms have the April 2002 developer tools update. I just noticed that your compile line shows the use of cc, as opposed to gcc. Here is the corresponding compile line for 21.3 on my powerbook (Python 2.3a0): gcc -bundle -bundle_loader /usr/local/bin/python build/temp.darwin-5.5-Power Macintosh-2.3/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.3/arrayobject.o build/temp.darwin-5.5-PowerMacintosh-2.3/ufuncobject.o -o build/lib.darwin-5.5-Power Macintosh-2.3/_numpy.so --Greg -------------- next part -------------- An embedded message was scrubbed... 
From: unknown sender Subject: no subject Date: no date Size: 38 URL: From Jack.Jansen at oratrix.com Fri Jul 19 14:17:02 2002 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Fri Jul 19 14:17:02 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: On vrijdag, juli 19, 2002, at 10:12 , Thomas Biesinger wrote: > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect Thomas, as of MacOSX 10.1 the link step needs either the -flat_namespace option, or the -bundle_loader option. But: this has been fixed in both Python 2.2.1 and Python 2.3a0 (the CVS tree). Are you by any chance still running Python 2.2 (which predates OSX 10.1, and therefore two-level namespaces, and therefore the right linker invocations, which distutils reads from Python's own Makefile). If you're running 2.2: please upgrade and try again. If you're running 2.2.1 or later: let me know and I'll try and think of what questions I should ask you to debug this:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From paul at pfdubois.com Mon Jul 22 16:14:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 22 16:14:03 2002 Subject: [Numpy-discussion] Numarray design announcement Message-ID: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> At numpy.sf.net you will find a posting from Perry Greenfield and I detailing the design decisions we have taken with respect to Numarray. What follows is the text of that message without the formatting.
We ask for your understanding about those decisions that differ from the ones you might prefer. Numarray's Design Paul F. Dubois and Perry Greenfield Numarray is the new implementation of the Numeric Python extension. It is our intention that users will change as rapidly as possible to the new module when we decide it is ready. The present Numeric Python team will cease supporting Numeric after a short transition period. During recent months there has been a lot of discussion about Numarray and whether or not it should differ from Numeric in certain ways. We have reviewed this lengthy discussion and come to some conclusions about what we plan to do. The discussion has been valuable in that it took a whole new "generation" back through the considerations that the "founding fathers" debated when Numeric Python was designed. There are literally tens of thousands of Numerical Python users. These users may represent only a tiny percentage of potential users but they are real users today with real code that they have written, and breaking that code would represent real harm to real people. Most of the issues discussed recently were discussed at length when Numeric was first designed. Some decisions taken then represent a choice that was simply a choice among valid alternatives. Nevertheless, the choice was made, and to arbitrarily now make a different choice would be difficult to justify. In arguing about Python's indentation, we often see heart-felt arguments from opponents who have sincere reasons for feeling as they do. However, many of the pitfalls they point to do not seem to actually occur in real life very often. We feel the same way about many arguments about Numeric Python. The view / copy argument, for example, claims that beginners will make errors with view semantics. Well, some do, but not very often, and not twice. It is just one of many differences that users need to adapt to when learning an entity-object model such as Python's when they are used to variable semantics such as in Fortran or C. Similarly, we do not receive massive reports of confusion about differing default values for the axis keyword -- there was a rationale for the way it is now, and although one could propose a different rationale for a different choice, it would be just a choice. Decisions Numarray will have the same Python interface as Numeric except for the exceptions discussed below. 1. The Numarray C API includes a compatibility layer consisting of some of the members of the Numeric C API. For details on compatibility at the C level see http://telia.dl.sourceforge.net/sourceforge/numpy/numarray.pdf , pdf pages 78-81. Since no formal decision was ever made about what parts of the Numeric C header file were actually intended to be publicly available, do not expect complete emulation. Numarray's current view of arrays in C, using either native or emulation C-APIs, is that array data can be mutated, but array properties cannot. Thus, an existing Numeric extension function which tries to change the shape or strides of an array in C is more of a porting challenge, possibly requiring a python wrapper. Depending on what kind of optimization we do, this restriction might be lifted. For the Numeric extensions already ported to Numarray (RandomArray, LinearAlgebra, FFT), none of this was an issue. 2. Currently, if the result of an index operation x[i] results in a scalar result, the result is converted to a similar Python type. For example, the result of array([1,2,3])[1] is the Python integer 2. 
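(Editorial illustration, not part of the original announcement: a minimal Numeric session showing the current behaviour just described; the printed output is approximate.)

---cut---
from Numeric import array

x = array([1, 2, 3])
s = x[1]
print s        # prints: 2
print type(s)  # prints: <type 'int'> -- a plain Python scalar, not an array,
               # so code that indexes an array cannot assume it still has an array
---cut---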
This will be changed so that the result of an index operation on a Numarray array is always a Numarray array. Scalar results will become rank-zero arrays (i.e., shape () ). 3. Currently, binary operations involving Numeric arrays and Python scalars uses the precision of the Python scalar to help determine the precision of the result. In Numarray, the precision of the array will have precedence in determining the precision of the outcome. Full details are available in the Numarray documention. 4. The Numarray version of MA will no longer have copy semantics on indexing but instead will be consistent with Numarray. (The decision to make MA differ in this regards was due to a need for CDAT to be backward compatible with a local variant of Numeric; the CDAT user community no longer feels this was necessary). Some explanation about the scalar change is in order. Currently, much coding in Numeric-based applications must be devoted to handling the fact that after an index operation, the programmer can not assume that the result is an array. So, what are the consequences of change? A rank-zero array will interact as expected with most other parts of Python. When it does not, the most likely result is a type error. For example, let x = array([1,2,3]). Then [1,2,3][x[0]] currently produces the result 2. With the change, it would produce a type error unless a change is made to the Python core (currently under discussion). But x[x[0]] would still work because we have control of that. In short, we do not think this change will break much code and it will prevent the writing of more code that is either broken or difficult to write correctly. From pete at shinners.org Mon Jul 22 17:36:12 2002 From: pete at shinners.org (Pete Shinners) Date: Mon Jul 22 17:36:12 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: <3D3CA3B1.7010708@shinners.org> Paul F Dubois wrote: > Numarray's Design > Paul F. Dubois and Perry Greenfield a very nice design, for a lot of challenging decisions > Numarray's current view of arrays in C, using either native or > emulation C-APIs, is that array data can be mutated, but array > properties cannot. Thus, an existing Numeric extension function > which tries to change the shape or strides of an array in C is > more of a porting challenge, possibly requiring a python wrapper. i have a c extension that does this, but only during "creation time" of the array. i'm hoping there can be some way to do this from C. i need to create a new array from a block of numbers that aren't contiguous... /* roughly snipped code */ dim[0] = myimg->w; dim[1] = myimg->h; dim[2] = 3; /*r,g,b*/ array = PyArray_FromDimsAndData(3, dim, PyArray_UBYTE, startpixel); array->flags = OWN_DIMENSIONS|OWN_STRIDES; array->strides[2] = pixelstep; array->strides[1] = myimg->pitch; array->strides[0] = myimg->format->BytesPerPixel; array->base = myimg_object; note this data is image data, and i am "reorienting" it so that the first index is X and the second index is Y. plus i need to account for an image pitch, where the rows are not exactly the same width as the number of pixels. also, i am also changing the "base" field, since the data for this array lives inside another image object of course, once the array is created, i pass it off to the user and never touch these fields again, so perhaps something like this will work in the new numarray? 
if not, i'm eager to start my petition for a "PyArray_FromDimsAndDataAndStrides" function, and also a way to assign the "base" as well. i'm looking forward to the new numarray, looks very exciting. From biesingert at yahoo.com Mon Jul 22 23:54:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Mon Jul 22 23:54:03 2002 Subject: [Numpy-discussion] Summary to NumPy on Mac OS 10.1.5 Message-ID: <20020723065343.73589.qmail@web14106.mail.yahoo.com> From e.maryniak at pobox.com Tue Jul 23 09:19:04 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 09:19:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug Message-ID: <200206261833.29702.e.maryniak@pobox.com> Dear crunchers, According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) (with /my/ emphasis): The seed() function takes two integers and sets the two seeds of the random number generator to those values. If the default values of 0 are used for /both/ x and y, then a seed is generated from the current time, providing a /pseudo-random/ seed. Note: in numarray, the RandomArray2 package is provided but it's description is not (yet) included in the numarray manual. I have some questions about this: 1. The implementation of seed(), which is, by the way, identical both in Numeric's RandomArray.py and numarray's RandomArray2.py seems to contradict it's usage description: ---cut--- def seed(x=0,y=0): """seed(x, y), set the seed using the integers x, y; Set a random one from clock if y == 0 """ if type (x) != IntType or type (y) != IntType : raise ArgumentError, "seed requires integer arguments." if y == 0: import time t = time.time() ndigits = int(math.log10(t)) base = 10**(ndigits/2) x = int(t/base) y = 1 + int(t%base) ranlib.set_seeds(x,y) ---cut--- Shouldn't the second 'if' be: if x == 0 and y == 0: With the current implementation: - 'seed(3)' will actually use the clock for seeding - it is impossible to specify 0's (0,0) as seed: it might be better to use None as default values? 2. With the current time.time() based default seeding, I wonder if you can call that, from a mathematical point of view, pseudo-random: ---cut--- $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> numarray.__version__ '0.3.5' >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 1027434978.406238 (102743, 4979) 1027434979.400319 (102743, 4980) 1027434980.400316 (102743, 4981) 1027434981.40031 (102743, 4982) 1027434982.400308 (102743, 4983) ---cut--- It is incremental, and if you use default seeding within one (1) second, you get the same seed: ---cut--- >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(0.1) ... print ... 1027436537.066677 (102743, 6538) 1027436537.160303 (102743, 6538) 1027436537.260363 (102743, 6538) 1027436537.360299 (102743, 6538) 1027436537.460363 (102743, 6538) ---cut--- 3. I wonder what the design philosophy is behind the decision to use 'mathematically suspect' seeding as default behavior. Apart from the fact that it can hardly be called 'random', I also have the following problems with it: - The RandomArray2 module initializes with 'seed()' itself, too. 
Reload()'s of RandomArray2, which might occur outside the control of the user, will thus override explicit user's seeding. Or am I seeing ghosts here? - When doing repeated run's of one's neural net simulations that each take less than a second, one will get identical streams of random numbers, despite seed()'ing each time. Not quite what you would expect or want. - From a purist software engineering point of view, I don't think automagical default behavior is desirable: one wants programs to be deterministic and produce reproducible behavior/output. If you use default seed()'ing now and re-run your program/model later with identical parameters, you will get different output. In Eiffel, object attributes are always initialized, and you will almost never have irreproducible runs. I found that this is a good thing for reproducing ones bugs, too ;-) To summarize, my recommendation would be to use None default arguments and use, when no user arguments are supplied, a hard (built-in) seed tuple, like (1,1) or whatever. Sometimes a paper on a random number generator suggests seeds (like 4357 for the MersenneTwister), but of course, a good random number generator should behave well independently of the initial seed/seed-tuple. I may be completely mistaken here (I'm not an expert on random number theory), but the random number generators (Ahrens, et. al) seem 'old'? After some studying, we decided to use the Mersenne Twister: http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html http://www.math.keio.ac.jp/~matumoto/emt.html PDF article: http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator", ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, January pp.3-30 1998 There are some Python wrappers and it has good performance as well. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Hail Caesar! We, who are about to dine, salad you. From jmiller at stsci.edu Tue Jul 23 11:56:04 2002 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 23 11:56:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <200206261833.29702.e.maryniak@pobox.com> Message-ID: <3D3DA67E.308@stsci.edu> Eric Maryniak wrote: >Dear crunchers, > >According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) >(with /my/ emphasis): > > The seed() function takes two integers and sets the two seeds > of the random number generator to those values. If the default > values of 0 are used for /both/ x and y, then a seed is generated > from the current time, providing a /pseudo-random/ seed. > >Note: in numarray, the RandomArray2 package is provided but it's >description is not (yet) included in the numarray manual. > >I have some questions about this: > >1. The implementation of seed(), which is, by the way, identical > both in Numeric's RandomArray.py and numarray's RandomArray2.py > seems to contradict it's usage description: > The 2 in RandomArray2 is there to support side-by-side testing with Numeric, not to imply something new and improved. The point of providing RandomArray2 is to provide a migration path for current Numeric users. To that end, RandomArray2 should be functionally identical to RandomArray. That should not, however, discourage you from writing a new and improved random number package for numarray. 
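(Editorial sketch, not part of Todd's reply: one minimal shape such an add-on package could take -- an array-returning wrapper over Python's standard 'random' module. The function name, and the use of Numeric calls rather than numarray's then-settling API, are illustrative assumptions only.)

---cut---
import random
import Numeric

def random_array(shape, seed=None):
    """Uniform [0, 1) samples with the given shape; reproducible when seeded."""
    if seed is not None:
        random.seed(seed)          # an explicit seed gives a deterministic stream
    n = 1
    for dim in shape:              # total number of elements
        n = n * dim
    values = [random.random() for i in range(n)]
    return Numeric.reshape(Numeric.array(values, Numeric.Float), shape)
---cut---

Called as random_array((3, 3), seed=42) it returns the same 3x3 array on every run; with the seed omitted it simply inherits whatever state the 'random' module is in.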
> > >---cut--- >def seed(x=0,y=0): > """seed(x, y), set the seed using the integers x, y; > Set a random one from clock if y == 0 > """ > if type (x) != IntType or type (y) != IntType : > raise ArgumentError, "seed requires integer arguments." > if y == 0: > import time > t = time.time() > ndigits = int(math.log10(t)) > base = 10**(ndigits/2) > x = int(t/base) > y = 1 + int(t%base) > ranlib.set_seeds(x,y) >---cut--- > > Shouldn't the second 'if' be: > > if x == 0 and y == 0: > > With the current implementation: > > - 'seed(3)' will actually use the clock for seeding > - it is impossible to specify 0's (0,0) as seed: it might be > better to use None as default values? > >2. With the current time.time() based default seeding, I wonder > if you can call that, from a mathematical point of view, > pseudo-random: > >---cut--- >$ python >Python 2.2.1 (#1, Jun 25 2002, 20:45:02) >[GCC 2.95.3 20010315 (SuSE)] on linux2 >Type "help", "copyright", "credits" or "license" for more information. > >>>>from numarray import * >>>>from RandomArray2 import * >>>>import time >>>>numarray.__version__ >>>> >'0.3.5' > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(1) >... print >... >1027434978.406238 >(102743, 4979) > >1027434979.400319 >(102743, 4980) > >1027434980.400316 >(102743, 4981) > >1027434981.40031 >(102743, 4982) > >1027434982.400308 >(102743, 4983) >---cut--- > > It is incremental, and if you use default seeding within > one (1) second, you get the same seed: > >---cut--- > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(0.1) >... print >... >1027436537.066677 >(102743, 6538) > >1027436537.160303 >(102743, 6538) > >1027436537.260363 >(102743, 6538) > >1027436537.360299 >(102743, 6538) > >1027436537.460363 >(102743, 6538) >---cut--- > >3. I wonder what the design philosophy is behind the decision > to use 'mathematically suspect' seeding as default behavior. > Using time for a seed is fairly common. Since it's an implementation detail, I doubt anyone would object if you can suggest a better default seed. > > Apart from the fact that it can hardly be called 'random', I also > have the following problems with it: > > - The RandomArray2 module initializes with 'seed()' itself, too. > Reload()'s of RandomArray2, which might occur outside the > control of the user, will thus override explicit user's seeding. > Or am I seeing ghosts here? > Overriding a user's explicit seed as a result of a reload sounds correct to me. All of the module's top level statements are re-executed during a reload. > > - When doing repeated run's of one's neural net simulations that > each take less than a second, one will get identical streams of > random numbers, despite seed()'ing each time. > Not quite what you would expect or want. > This is easy enough to work around: don't seed or re-seed. If you then need to make multiple simulation runs, make a separate module and call your simulation like: import simulation RandomArray2.seed(something_deterministic, something_else_deterministic) for i in range(number_of_runs): simulation.main() > > - From a purist software engineering point of view, I don't think > automagical default behavior is desirable: one wants programs to > be deterministic and produce reproducible behavior/output. > I don't know. I think by default, random numbers *should be* random. 
> > If you use default seed()'ing now and re-run your program/model > later with identical parameters, you will get different output. > When you care about this, you need to set the seed to something deterministic. > > In Eiffel, object attributes are always initialized, and you will > almost never have irreproducible runs. I found that this is a good > thing for reproducing ones bugs, too ;-) > This sounds like a good design principle, but I don't see anything in RandomArray2 which is keeping you from doing this now. > >To summarize, my recommendation would be to use None default arguments >and use, when no user arguments are supplied, a hard (built-in) seed >tuple, like (1,1) or whatever. > Unless there is a general outcry from the rest of the community, I think the (existing) numarray extensions (RandomArray2, LinearAlgebra2, FFT2) should try to stay functionally identical with Numeric. > >Sometimes a paper on a random number generator suggests seeds (like 4357 >for the MersenneTwister), but of course, a good random number generator >should behave well independently of the initial seed/seed-tuple. >I may be completely mistaken here (I'm not an expert on random number >theory), but the random number generators (Ahrens, et. al) seem 'old'? >After some studying, we decided to use the Mersenne Twister: > An array enabled version might make a good add-on package for numarray. > > > http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html > http://www.math.keio.ac.jp/~matumoto/emt.html > >PDF article: > > http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf > > M. Matsumoto and T. Nishimura, > "Mersenne Twister: A 623-dimensionally equidistributed uniform > pseudorandom number generator", > ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, > January pp.3-30 1998 > >There are some Python wrappers and it has good performance as well. > >Bye-bye, > >Eric > Bye, Todd From e.maryniak at pobox.com Tue Jul 23 13:03:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 13:03:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3DA67E.308@stsci.edu> References: <200206261833.29702.e.maryniak@pobox.com> <3D3DA67E.308@stsci.edu> Message-ID: <200207232202.04104.e.maryniak@pobox.com> On Tuesday 23 July 2002 20:54, Todd Miller wrote: > Eric Maryniak wrote: > >... > That should not, however, discourage you from writing a new and improved > random number package for numarray. Yes, thank you :-) > >... > >3. I wonder what the design philosophy is behind the decision > > to use 'mathematically suspect' seeding as default behavior. > > Using time for a seed is fairly common. Since it's an implementation > detail, I doubt anyone would object if you can suggest a better default > seed. Well, as said, a fixed seed, provided by the class implementation and therefore 'good', instead of a not-so-random 'random' seed. And imho it would be better not to (only) use the clock, but a /dev/random kinda thing. Personally, I find the RNG setup much more appealing: there the default is: standard_generator = CreateGenerator(-1) where seed < 0 ==> Use the default initial seed value. seed = 0 ==> Set a "random" value for the seed from the system clock. seed > 0 ==> Set seed directly (32 bits only). And indeed 'void Mixranf(int *s,u32 s48[2])' uses a built-in constant as initial seed value (actually, two). >... 
> > If you use default seed()'ing now and re-run your program/model > > later with identical parameters, you will get different output. > > When you care about this, you need to set the seed to something > deterministic. Naturally, but how do I know what a 'good' seed is (or indeed it's type, range, etc.)? I just would like, as e.g. RNG does, let the number generator take care of this... (or at least provide the option to) >... In the programs I've seen so far, including a lot of ours ahem, usually a program (simulation) is run multiple times with the same parameters and, in our case for neural nets, seeded each time with a clock generated seed and then the different simulations are compared and checked if they are similar or sensitive to chaotic influences. But I don't think this is the proper way to do this. My point is, I guess, that the sequence of these clock-generated seeds itself is not random, because (as for RandomArray) the generated numbers are clearly not random. Better, and reproducible, would be to start the first simulation with a supplied seed, get the seed and pickle after the first run and use the pickled seed for run 2 etc. or indeed have a kind of master script (as you suggest) that manages this. That way you would start with one seed only and are not re-seeding for each run. Because if the clock-seeds are not truly random, you will a much greater change of cycles in your overall sequence of numbers. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. VME ERROR 37022: Hierarchic name syntax invalid taking into account starting points defined by initial context. From paul at pfdubois.com Tue Jul 23 13:14:05 2002 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jul 23 13:14:05 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207232202.04104.e.maryniak@pobox.com> Message-ID: <3D36139400005515@mta08.san.yahoo.com> RandomArray got a "special" position as part of Numeric simply by historical accident in being there first. I think in the conversion to Numarray we will be able to remove such things from the "core" and make more of a marketplace of equals for the "addons". As it is now there is some implication that somehow one is "better" than the other, which is unjustified either mathematically or in the sense of design. RNG's design is based on my experience with large codes needing many independent streams. The mathematics is from a well-tested Cray algorithm. I'm sure it could use fluffing up but a good case can be made for it. From gb at cs.unc.edu Tue Jul 23 14:24:03 2002 From: gb at cs.unc.edu (Gary Bishop) Date: Tue Jul 23 14:24:03 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? Message-ID: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> The example given for real_fft in the FFT section of the Sept 7, 2001 Numpy manual makes no sense to me. The text says >>> x = cos(arange(30.0)/30.0*2*pi) >>> print real_fft(x) [ -1. +0.j 13.69406641+2.91076367j -0.91354546-0.40673664j -0.80901699-0.58778525j -0.66913061-0.74314483j -0.5 -0.8660254j -0.30901699-0.95105652j -0.10452846-0.9945219j 0.10452846-0.9945219j 0.30901699-0.95105652j 0.5 -0.8660254j 0.66913061-0.74314483j 0.80901699-0.58778525j 0.91354546-0.40673664j 0.9781476 -0.20791169j 1. +0.j ] But surely x is a single cycle of a cosine wave and should have a very sensible and simple FT. Namely [0, 1, 0, 0, 0, ...] 
Indeed, running the example using Numeric and FFT produces, within rounding error, exactly what I would expect. Why the non-intuitive (and wrong) result in the example text? gb From dubois1 at llnl.gov Tue Jul 23 14:32:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Tue Jul 23 14:32:04 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? In-Reply-To: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> References: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> Message-ID: <1027459879.8212.2.camel@ldorritt> The person who wrote the manual cut and pasted from running the code. I think there was a bug in FFT at the time. (:-> On Tue, 2002-07-23 at 14:23, Gary Bishop wrote: > The example given for real_fft in the FFT section of the Sept 7, 2001 > Numpy manual makes no sense to me. The text says > > >>> x = cos(arange(30.0)/30.0*2*pi) > >>> print real_fft(x) > [ -1. +0.j 13.69406641+2.91076367j > -0.91354546-0.40673664j -0.80901699-0.58778525j > -0.66913061-0.74314483j -0.5 -0.8660254j > -0.30901699-0.95105652j -0.10452846-0.9945219j > 0.10452846-0.9945219j 0.30901699-0.95105652j > 0.5 -0.8660254j 0.66913061-0.74314483j > 0.80901699-0.58778525j 0.91354546-0.40673664j > 0.9781476 -0.20791169j 1. +0.j ] > > But surely x is a single cycle of a cosine wave and should have a very > sensible and simple FT. Namely [0, 1, 0, 0, 0, ...] > > Indeed, running the example using Numeric and FFT produces, within > rounding error, exactly what I would expect. > > Why the non-intuitive (and wrong) result in the example text? > > gb > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From e.maryniak at pobox.com Wed Jul 24 09:24:14 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 09:24:14 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D36139400005515@mta08.san.yahoo.com> References: <3D36139400005515@mta08.san.yahoo.com> Message-ID: <200207241823.42218.e.maryniak@pobox.com> On Tuesday 23 July 2002 22:15, paul at pfdubois.com wrote: > RandomArray got a "special" position as part of Numeric simply by > historical accident in being there first. I think in the conversion to > Numarray we will be able to remove such things from the "core" and make > more of a marketplace of equals for the "addons". As it is now there is > some implication that somehow one is "better" than the other, which is > unjustified either mathematically or in the sense of design. > > RNG's design is based on my experience with large codes needing many > independent streams. The mathematics is from a well-tested Cray algorithm. > I'm sure it could use fluffing up but a good case can be made for it. A famous quote from Linus is "Nice idea. Now show me the code." Perhaps a detailed example makes my problem clearer, because as it is now, RNG and RandomArray2 are not orthogonal in design, in the sense that RNG's default seed is fixed and RandomArray's is automagical (clock), not reproducible and mathematically suspect, which I think is not good for the more naive Python user. 
Below I will give intended usage in a provocative way, but please don't take me too seriously (I know, I don't ;-) Let's say you have a master shell script that runs a neural net paradigm (size 20x20) 10 times, each time with the same parameters, to see if it's stable or chaotic, i.e. does not 'converge' c.q. outcome depends on initial values (it should not be chaotic, but this should always be checked). run10.sh tracelink.py 20 20 inputpat.dat > hippocamp01.out ... 8 more ... tracelink.py 20 20 inputpat.dat > hippocamp10.out tracelink.py ... import numarray, RandomArray2 _or_ RNG ... # Case 1: RandomArray2 # User uses default clock seed, which is the same # during 1 second (see my previous posting). # ignlgi(void)'s seeds 1234567890L,123456789L # are _not_ used (see com.c). RandomArray2.seed() # But if omitted, RandomArray2.py does it, too. ... calculations ... other program outcome _only_ if program runs > 1 second, ... otherwise the others will have the same result. # Case 2: RNG # A 'standard_generator = CreateGenerator(-1)' is automatically done. # seed < 0 ==> Use the default initial seed value. # seed = 0 ==> Set a "random" value for the seed from system clock. # seed > 0 ==> Set seed directly (32 bits only). # Thus, the fixed seeds used are 0,0 (see Mixranf() in ranf.c). ... calculations ... all 10 programs have the same outcome when using ranf(), ... because it always starts the same seed, the sequence is always: ... 0.58011364857958725, 0.95051273498076583, 0.78637142533060356 etc. The problem with RandomArray's seed is, that it is not truly random itself. In it's current (time.time based) implementation it is linearly auto incrementing every second, and therefore suffers from auto-correlation. Moreover, in the above example, if 10 separate .py runs complete in 1 second they'll all have the same seed (and outcome). This is not what the user, if accustomed to clock seeding, would expect. But if the seed is different each time, a problem is that runs are not reproducible. Let's say that run hippocamp06.out produced some strange output: now unless the user saved the seed (with get_seed), it can never be reproduced. Therefore, I think RNG's design is better and should be applied to RandomArray2, too, because RandomArray2's seeding is flawed anyways. A user should be aware of proper seeding, agreed, and now will be: when doing multiple identical runs, the same (and thus reproducible) output will result and so the user is made aware of the fact that, as an example, he or she should seed or pickle it between runs. So my suggestion would be to re-implement RandomArray2.seed(x=0,y=0) as follows: if either the x or y seed: seed < 0 ==> Use the default initial seed value. seed = None ==> Set a "random" value for the seed from the system clock. seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- But: I realize that this is different behavior from Python's standard random and whrandom, where no arg or None uses the clock. 
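As an aside, the bookkeeping Eric describes earlier in the thread (seed the first run, then pickle the seed so later runs are reproducible) only takes a few lines with the existing RandomArray2 interface. This is a minimal sketch; the file name seed.pkl and the surrounding script structure are made up:

import os, pickle
import RandomArray2

SEED_FILE = 'seed.pkl'              # hypothetical location for the saved seed
if os.path.exists(SEED_FILE):
    x, y = pickle.load(open(SEED_FILE))
    RandomArray2.seed(x, y)         # later runs: restore the recorded seeds
else:
    RandomArray2.seed()             # first run: current clock-based default
    pickle.dump(RandomArray2.get_seed(), open(SEED_FILE, 'w'))
# ... simulation code using RandomArray2 goes here ...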
But, if that behavior is kept for RandomArray2 (and RNG should then be adapted, too) then I'd urge at least to use a better initial seed. In certain applications, e.g. generating session id's in crypto programs, non-predictability of initial seeds is crucial. But if you have a look at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks like an art in itself. So perhaps RNG's 'clock code' should replace RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will not have the 1-second problem. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Just because you're not paranoid, that doesn't mean that they're not after you. From Chris.Barker at noaa.gov Wed Jul 24 10:01:06 2002 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Jul 24 10:01:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> Message-ID: <3D3ECEEE.6BAF4CC2@noaa.gov> Just to add my $.02: I disagree with Eric about what the default behaviour should be. Every programming language/environment I have ever used uses some kind of "random" seed by default. When I want reproducible results (which I often do for testing) I can specify a seed. I find the the most useful behaviour. As Eric points out, it is not trivial to generate a "random" seed (from the time, or whatever), so it doesn't make sense to burdon the nieve user with this chore. Therefore, I strongly support keeping the default behaviour of a "random" seed. Eric Maryniak wrote: > then I'd urge at least to use a better initial seed. > In certain applications, e.g. generating session id's in crypto programs, > non-predictability of initial seeds is crucial. But if you have a look > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks > like an art in itself. So perhaps RNG's 'clock code' should replace > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will > not have the 1-second problem. This I agree with: a better default initial seed would be great. As someone said, "show me the code!". I don't imagine anyone would object to improving this. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From e.maryniak at pobox.com Wed Jul 24 10:29:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 10:29:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3ECEEE.6BAF4CC2@noaa.gov> References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> <3D3ECEEE.6BAF4CC2@noaa.gov> Message-ID: <200207241928.07366.e.maryniak@pobox.com> On Wednesday 24 July 2002 17:59, Chris Barker wrote: > Just to add my $.02: > > I disagree with Eric about what the default behaviour should be. Every > programming language/environment I have ever used uses some kind of > "random" seed by default. When I want reproducible results (which I > often do for testing) I can specify a seed. I find the the most useful > behaviour. As Eric points out, it is not trivial to generate a "random" > seed (from the time, or whatever), so it doesn't make sense to burdon > the nieve user with this chore. 
> > Therefore, I strongly support keeping the default behaviour of a > "random" seed. In that case, and if that is the general consensus, RNG should be adapted: it now uses a fixed seed by default (and not a clock generated one). > Eric Maryniak wrote: > > then I'd urge at least to use a better initial seed. > > In certain applications, e.g. generating session id's in crypto programs, > > non-predictability of initial seeds is crucial. But if you have a look > > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it > > looks like an art in itself. So perhaps RNG's 'clock code' should replace > > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus > > will not have the 1-second problem. > > This I agree with: a better default initial seed would be great. As > someone said, "show me the code!". I don't imagine anyone would object > to improving this. The source is in Mixranf(), file Numerical/Packages/RNG/Src/ranf.c (when checked out with CVS), but it may be a good idea to check it with Python's own random/whrandom code (which I don't have at hand -- it may be more recent and/or portable for other OSes). By the way, I realized in my code 'fix' for RandomArray2.seed(x=None,y=None) that I already anticipated this and that the default behavior is _not_ to use a fixed seed ;-) : if either the x or y seed: seed < 0 ==> Use the default initial seed value. seed = None ==> Set a "random" value for the seed from clock (default) seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: # This would be the best, but is problematic under Windows/Mac. import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative # So best is to use Mixranf() from RNG/Src/ranf.c here. elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Unix was a trademark of AT&T. AT&T is a modem test command. From peter.chang at nottingham.ac.uk Wed Jul 24 11:08:06 2002 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Wed Jul 24 11:08:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: Just to stick my oar in: I think Eric's preference is predicated by the lousiness (or otherwise?) of RandomArray's seeding mechanism. The random sequences generated by incremental seeds should, by design, be uncorrelated thus allowing the use of the system clock as a seed source. If you're running lots of simulations (as I do with Monte Carlos, though not in numpy) using PRNGs, the last thing you want is the task to find a (pseudo) random source of seeds. Using /dev/random is not particularly portable; the system clock is much easier to obtain and is fine as long as your iteration cycle is longer than its resolution. 
Peter From paul at pfdubois.com Wed Jul 24 23:09:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Jul 24 23:09:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> I'm not going to change the default seed on RNG. Existing users have the right to stability, and not to have things change because someone thinks a certain choice among several reasonable ones is better than the one previously made. There is the further issue here of RNG being advertised as similar to Cray's ranf() and that similarity extends to this default. Not to mention that for many purposes the current default is quite useful. From e.maryniak at pobox.com Thu Jul 25 06:02:03 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Thu Jul 25 06:02:03 2002 Subject: [Numpy-discussion] Numarray: Summary (seeding): personal code and manual suggestions on initial seeding in module RNG and RandomArray(2) In-Reply-To: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> References: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> Message-ID: <200207251501.47126.e.maryniak@pobox.com> Dear crunchers, Please see my personal thoughts on the past discussion about initial seeds some paragraphs down below, where I'd like to list concrete code and manual enhancements aimed at providing users with a clear understanding of it's usage (and pitfalls e.g. w/r to cryptographic applications)... ==> Suggestions for code and manual changes w/r to initial seeding (down below) But first a response to Paul's earlier message: On Thursday 25 July 2002 08:08, Paul F Dubois wrote: > I'm not going to change the default seed on RNG. Existing users have the > right to stability, and not to have things change because someone thinks > a certain choice among several reasonable ones is better than the one > previously made. Well, I wasn't aware of the fact that things were completely set in stone for Numarray solely for backward compatibilty. It was my impression that numarray and it's accompanying xx2 packages were also open for redesign. I agree stability is important, but numarray already breaks with Numeric in other aspects so why should RNG (RNG2 in numarray?) or other packages not be? It's more a matter of well documenting changes I think. Users switching to numarray will already have to take into account some changes and verify their code. It's not that I "think a certain choice among several reasonable ones is better" [although my favorite is still a fixed seed, as in RNG, for reasons of reproducibility in later re-runs of Monte Carlo's that are not possible now, because the naive user, using a clock seed, may not have saved the initial seed with get_seed], but that the different packages, i.c. RNG (RNG2 to be?) and RandomArray2, should be orthogonal in this respect. I.e. the same, so 'default always an automagical (clock whatever) random initial seed _or_ a fixed one'. Orthogonality is a very common and accepted design principle in computing science and for good reasons (usability). Users changing from one PRNG to another (and using the default seed) would otherwise be unwelcomely surprised by a sudden change in behavior of their program. I try to give logical arguments and real code examples in this discussion and fail to see in Paul's reaction where I'm wrong. By the way: in Python 2.1 alpha 2 seeding changed, too: """ - random.py's seed() function is new. 
For bit-for-bit compatibility with prior releases, use the whseed function instead. The new seed function addresses two problems: (1) The old function couldn't produce more than about 2**24 distinct internal states; the new one about 2**45 (the best that can be done in the Wichmann-Hill generator). (2) The old function sometimes produced identical internal states when passed distinct integers, and there was no simple way to predict when that would happen; the new one guarantees to produce distinct internal states for all arguments in [0, 27814431486576L). """ > There is the further issue here of RNG being advertised as similar to > Cray's ranf() and that similarity extends to this default. Not to > mention that for many purposes the current default is quite useful. Perhaps I'm mistaken here, but RNG/Lib/__init__.py does (-1 -> uses fixed internal seed): standard_generator = CreateGenerator(-1) and: def ranf(): "ranf() = a random number from the standard generator." return standard_generator.ranf() And indeed Mixranf in RNG/Src/ranf.c does set them to 0: ... if(*s < 0){ /* Set default initial value */ s48[0] = s48[1] = 0; Setranf(s48); Getranf(s48); And this code, or I'm missing the point, uses a standard generator from RNG, which demonstrates the same sequence of initial seeds in re-runs (note that it does not suffer from the "1-second problem" as RandomArray2 does, see the Appendix below for a demonstration of that, because RNG uses milliseconds). Note that 'ranf()' is listed in chapter 18 in Module RNG as one of the 'Generator objects': $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> Ok, now then my own (and possibly biased) personal summary of the past discussions and concrete code and manual recommendations: ==> Suggestions for code and manual changes w/r to initial seeding Conclusions: 1. Default initial seeding should be random (and not fixed). This is the general consensus and while it may not win the beauty contest in purist software engineering circles, it also is the default behavior in Python's own Random/WHRandom modules. URL: http://web.pydoc.org/2.2/random.html => Recommendations: - Like Python's random/whrandom module, default arguments to seed() should not be 0, but None, and this triggers the default behavior which is to use a random initial seed (ideally 'truly' random from e.g. /dev/random or otherwise clock or whatever based), because: o better usability: users changing from Python's own random to numarray's random facilities will find familiar seed() usage semantics o often 0 itself can be a legal seed (although the MersenneTwister does not recommend it) - Like RNG provide support for using a built-in fixed seed by supplying negative seeds to seed(), rationale: o support for reproducible re-runs of Monte Carlo's without having to specify ones own initial seed o usability: naive users may not know a 'good' seed is, like: can it be 0 or must it be >0, what is the maximum, etc. - See my suggested code fix for RandomArray2.seed() in the Appendix below. - Likewise, in RNG: o CreateGenerator (s, ...) 
should be changed to CreateGenerator (s=None) Also note Python's own: def create_generators(num, delta, firstseed=None) from random (random.py), url: http://web.pydoc.org/2.2/random.html o RNG's code should be changed from testing on 0 to testing on None first (which results in using the clock), then on < 0 (use built-in seed), and then using the user provided seed (which is thus >= 0, and hence can also be 0) o 'standard_generator = CreateGenerator(-1)' should be changed to 'standard_generator = CreateGenerator() and results in using the clock - Put some explicit warnings in the numarray manual, that the seeding of numarray's packages should _not_ be used in those parts of software where unpredictability of seeds is important, such as for example, cryptographical software for creating session keys, TCP sequence numbers etc. Attacks on crypto software usually center around these issues. Ideally, a /dev/random should be used, but with the current system clock based implementation, the seeds are not random, because the clock does not have deci-nanosecond precision (10**10 ~= 2**32) yet ;-) Appendix -------- ** 1. "1-second problem" with RandomArray2: $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> import sys >>> sys.version '2.2.1 (#1, Jun 25 2002, 20:45:02) \n[GCC 2.95.3 20010315 (SuSE)]' >>> numarray.__version__ '0.3.5' >>> for i in range(3): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 1027591910.9043469 (102759, 1911) 1027591911.901091 (102759, 1912) 1027591912.901088 (102759, 1913) >>> for i in range(3): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(0.3) ... print ... 1027591966.260392 (102759, 1967) 1027591966.5510809 (102759, 1967) 1027591966.851079 (102759, 1967) Note that Python (at least 2.2.1) own random() suffers much less from this (on my 450 MHz machine, every 10-th millisecond or so the seed will be different): $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from random import * >>> import time >>> >>> for i in range(3): ... print long(time.time() * 256) ... 263065231349 263065231349 263065231349 >>> for i in range(3): ... print long(time.time() * 256) ... time.sleep(.00001) ... 263065240314 263065240315 263065240317 By the way, Python's own random.seed() also suffers from this, but on a 10th-millisecond level (on my 450 Mhz i586 at least). For the implementation of seed() see Lib/random.py, basically a 'long(time.time()' is used: $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from random import * >>> import time >>> for i in range(3): ... print long(time.time() * 256) ... 263065231349 263065231349 263065231349 >>> for i in range(3): ... print long(time.time() * 256) ... time.sleep(.00001) ... 263065240314 263065240315 263065240317 2. Proposed re-implementation of RandomArray2.seed(): def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y: x or y is None (or not specified): A random seed is used which in the current implementation may be based on the system's clock. Warning: do not this seed in software where the initial seed may not be predictable, such as for example, in cryptographical software for creating session keys. x < 0 or y < 0: Use the module's fixed built-in seed which is the tuple (1234567890L, 123456789L) (or whatever) x >= 0 and y >= 0 Use the seeds specified by the user. 
(Note: some random number generators do not recommend using 0) Note: based on Python 2.2.1's random.seed(a=None). ADAPTED for _2_ seeds as required by ranlib.set_seeds(x,y) """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: try: # This would be the best, but is problematic under Windows/Mac. # To my knowledge there isn't a portable lib_randdevice yet. # As GPG, OpenSSH and OpenSSL's code show, getting entropy # under Windows is problematic. # However, Python 2.2.1's socketmodule does wrap the ssl code. import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative except: # Use Mixranf() from RNG/Src/ranf.c here or, perhaps better, # use Python 2.2.1's code? At least it looks simpler and does not # have the platform dependency's and has possibly met wider testing # (and why not re-use code? ;-) # For Python 2.2.1's random.seed(a=None), see url: # http://web.pydoc.org/2.2/random.html # and file Lib/random.py. # Do note, however, that on my 450 Mhz machine, the statement # 'long(time.time() * 256)' will generate the same values # within a tenth of a millisecond (see Appendix #1 for a code # example). This can be fixed by doing a time.sleep(0.001). # See my #EM# comment. # Naturally this code needs to be adapted for ranlib's # generator, because this code uses the Wichmann-Hill generator. ---cut: Wichmann-Hill--- def seed(self, a=None): """Initialize internal state from hashable object. None or no argument seeds from current time. If a is not None or an int or long, hash(a) is used instead. If a is an int or long, a is used directly. Distinct values between 0 and 27814431486575L inclusive are guaranteed to yield distinct internal states (this guarantee is specific to the default Wichmann-Hill generator). """ if a is None: # Initialize from current time import time a = long(time.time() * 256) #EM# Guarantee unique a's between subsequent call's of seed() #EM# by sleeping one millisecond. This should not be harmful, #EM# because ordinarily, seed() will only be called once or so #EM# in a program. time.sleep(0.001) if type(a) not in (type(3), type(3L)): a = hash(a) a, x = divmod(a, 30268) a, y = divmod(a, 30306) a, z = divmod(a, 30322) self._seed = int(x)+1, int(y)+1, int(z)+1 ---cut: Wichmann-Hill--- elif x < 0 or y < 0: x = 1234567890L # or any other suitable 0 - 2**32-1 y = 123456789L ranlib.set_seeds(x,y) 3. Mersenne Twister, another PRNG: Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. In a grocery store, the Real Programmer is the one who insists on running the cans past the laser checkout scanner himself, because he never could trust keypunch operators to get it right the first time. From aureli at ipk.fhg.de Thu Jul 25 09:51:06 2002 From: aureli at ipk.fhg.de (Aureli Soria Frisch) Date: Thu Jul 25 09:51:06 2002 Subject: [Numpy-discussion] index method for array objects? In-Reply-To: References: <20020621133705.A15296@idi.ntnu.no> Message-ID: Hi all, Has someone implemented a function for arrays that behaves like the index(*) method for lists (it should then consider something like a tolerance parameter). I suppose it could be maybe done with array.tolist() and list.index(), but have someone implemented something more elegant/array-based? 
Thanks in advance Aureli PD: (*) index receive a value as an argument and retunrs the index of the list member equal to this value... -- ################################# Aureli Soria Frisch Fraunhofer IPK Dept. Pattern Recognition post: Pascalstr. 8-9, 10587 Berlin, Germany e-mail: aureli at ipk.fhg.de fon: +49 30 39006-143 fax: +49 30 3917517 web: http://vision.fhg.de/~aureli/web-aureli_en.html ################################# From jmiller at stsci.edu Thu Jul 25 10:15:03 2002 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 25 10:15:03 2002 Subject: [Numpy-discussion] index method for array objects? References: <20020621133705.A15296@idi.ntnu.no> Message-ID: <3D4031C2.3090607@stsci.edu> Aureli Soria Frisch wrote: > Hi all, > > Has someone implemented a function for arrays that behaves like the > index(*) method for lists (it should then consider something like a > tolerance parameter). > > I suppose it could be maybe done with array.tolist() and list.index(), > but have someone implemented something more elegant/array-based? > > Thanks in advance > > Aureli > > PD: (*) index receive a value as an argument and retunrs the index of > the list member equal to this value... I think the basics of what you're looking for are something like: def index(a, b, eps): return nonzero(abs(a-b) < eps) which should return all indices at which the absolute value of the difference between elements of a and b differ by less than eps. e.g.: >>> import Numeric >>> index(Numeric.arange(10,20), 15, 1e-5) array([5]) Todd -- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 From magnus at hetland.org Thu Jul 25 12:12:11 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:12:11 2002 Subject: [Numpy-discussion] Spectral approximation/DFT Message-ID: <20020725211111.A27670@idi.ntnu.no> Hi! Sorry to ask what is probably a really clueless question -- if there are any obvious sources of information about this, I'd be happy to go there and find this out for myself... :] Anyway; I'm trying to produce a graph to illustrate a time sequence indexing method, which relies on extracting the low-frequent Fourier coefficients and indexing a vector consisting of those. The graph should contain the original time sequence, and one reconstructed from the Fourier coefficients. Since it is reconstructed from only the low-frequent coefficients (perhaps 10-20 coefficients), it will look wavy and sinus'y. Now... I'm no expert in signal processing (or the specifics of FFT/DFT etc.), and I can't seem to make the FFT module do exactly what I want here... It seems that using fft(seq).real extracts the coefficients I'm after (though I'm not sure whether the imaginary components ought to figure in the equation somehow...) But no matter how I use inverse_fft or inverse_real_fft it seems I have to supply a number of coefficients equal to the sequence I want to approximate -- otherwise there will be a huge offset between them. Why is this so? Shouldn't the first coefficient take care of such an offset? Perhaps inverse_fft isn't doing what I think it is? If I haven't expressed myself clearly, I'd be happy to elaborate... (For those who might be interested, the approach is described in the paper found at http://citeseer.nj.nec.com/307308.html with a figure of the type I'm trying to produce at page 5.) 
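For what it's worth, the kind of low-frequency approximation described above can be sketched as follows (this is not code from the paper; the test signal and the number of coefficients kept are made up). The point is to take the full-length transform and zero out the high-frequency bins before inverting, so the scaling stays consistent with the original sequence length and the mean (coefficient 0) is preserved:

from Numeric import arange, sin, pi
from FFT import real_fft, inverse_real_fft

n = 100
t = arange(n)/float(n)
x = sin(2*pi*t) + 0.4*sin(2*pi*5*t) + 0.1*sin(2*pi*23*t)   # made-up test sequence

k = 10                         # keep only the k lowest-frequency coefficients
c = real_fft(x)                # length n/2 + 1
c[k:] = 0                      # discard everything above frequency k-1
smooth = inverse_real_fft(c)   # wavy, low-frequency approximation of x, length n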
Anyway, thanks for any help :) -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From magnus at hetland.org Thu Jul 25 12:16:21 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:16:21 2002 Subject: [Numpy-discussion] A probable solution... Message-ID: <20020725211534.A27914@idi.ntnu.no> After posting to the list (sorry about that ;) a possible solution occurred to me... To get an approximation, I used fft(seq, 10) and then inverted that using inverse_fft(signature, 100)... I guess that fouled up the scale of things -- when I use fft(seq, 100)[:10] to get the signature, it seems that everything works just fine... Even though this _seems_ to do the right thing, I just wanted to make sure that I'm not doing something weird here... -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From a.schmolck at gmx.net Thu Jul 25 15:18:04 2002 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu Jul 25 15:18:04 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: "Paul F Dubois" writes: > > During recent months there has been a lot of discussion about Numarray > and whether or not it should differ from Numeric in certain ways. We > have reviewed this lengthy discussion and come to some conclusions about > what we plan to do. The discussion has been valuable in that it took a > whole new "generation" back through the considerations that the > "founding fathers" debated when Numeric Python was designed. [...] > Decisions > > Numarray will have the same Python interface as Numeric except for the > exceptions discussed below. [...] > 2. Currently, if the result of an index operation x[i] results in a > scalar result, the result is converted to a similar Python type. For > example, the result of array([1,2,3])[1] is the Python integer 2. This > will be changed so that the result of an index operation on a Numarray > array is always a Numarray array. Scalar results will become rank-zero > arrays (i.e., shape () ). > [...] > > 4. The Numarray version of MA will no longer have copy semantics on > indexing but instead will be consistent with Numarray. (The decision to > make MA differ in this regards was due to a need for CDAT to be backward > compatible with a local variant of Numeric; the CDAT user community no > longer feels this was necessary). [...] As one of the people who argued for interface changes in numarray (mainly copy semantics for slicing), let me say that I welcome this announcement which clarifies many issues. Although I still believe that copy behavior would be preferable in principle, I think that continuity and backwards compatibility to Numeric is a sufficient reason to stick to the old behavior (now that numarray strives to be largely compatible) [1]. In a similar vain I also greatly welcome the change to view semantics in MA, because I feel that internal consistency is vital. Apart from being a heavy Numeric user, these interface issues are also quite important to me because I have been working for some time on a fully-featured matrix [2] class which I wanted to be both a) compatible to Numeric and numarray (so that it would ideally make no difference to the user which of the 2 libraries he'd be using as a "backend" to the matrix class). b) consistent in usage to numarray's interface wherever feasible (i.e. not too much of a compromise on usability). 
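Since much of the discussion above turns on view versus copy semantics for slicing, a two-line illustration of the behaviour that is being kept (Numeric-style reference slicing) may help; the expected result is shown as a comment rather than pasted from a session:

import Numeric
a = Numeric.array([1, 2, 3, 4])
b = a[1:3]        # a view onto a's data, not a copy
b[0] = 99
print a           # -> [ 1 99  3  4] : the slice shares data with a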
This turned out to be much more of a hassle than I would have anticipated, because contrary to what the compatibility section of the manual seemed to suggest I found numarray to be incompatible in a variety of ways (even making it impossible to write *forward* compatible code without writing additional wrapping functions). Just as an example, there was no simple way that would work across both versions to do something as common as creating e.g. an int array (with both parameter names and positions differing): Numeric (21): array(sequence, typecode=None, copy=1, savespace=0) numarray (0.3.3?) : array(buffer=None, shape=None, type=None) As for b) this obviously turned out to be a moving target, but I hope that now the final shape of things is getting reasonably clear and I'm now for example determined to have view slicing behavior for my matrix class, too. Nonetheless, for me a few issues still remain. Most importantly, numarray doesn't provide the same degree of polymorphism as Numeric. One of the chief reasons given as to why Numeric's design is based around functions rather than methods is that it enables greater generality (e.g. allowing one to ``sum`` over all sorts of sequence types). Consequently the role of methods and attributes was largely limited to functionality that only made sense for array objects and special methods. This is more than just a neat convenience -- because of the resulting polymorphism it is easy to write fairly general code and define new kinds of numeric classes that can seamlessly be passed to Numeric functions (e.g. one can also ``sum`` Matrix'es). I find it highly undesirable that numarray apparently doesn't follow this design rationale and that the division of labour between functions and methods/attributes has been blurred (or so it appears to me -- maybe this is some lack of insight on my part). That numarray versions before 0.3.4 were missing functions such as ``shape`` (which is also quite handy for other sequence types) was largely an inconvenience, but the fact that numarray functions generally only operate on scalars, ``tuple``s and ``list``s (apart from, obviously, numarray.array's) is in my eyes a significant shortcoming. In contrast, Numeric functions would operate on any type that had an __array__ method to return an array representation of itself. The explicit checking for a type that numarray uses (via constructs à la type(a) == types.ListType) flies in the face of standard python sensibilities, places arbitrary limits on the kinds of objects that numarray users can conveniently work with, and places a significant hurdle in the way of creating new kinds of numerical objects. For example, the design of my matrix class depends on the fact that Numeric functions also accept objects with __array__ methods (such as my matrix class). Even if I invested the substantial amount of work that would be needed to redesign a less general version that wouldn't rely on this property, one of the key virtues of my class, namely the ability to transparently replace Numeric.array's in most cases where they are used as matrices, would be lost. These two reasons would presumably be sufficient for me not to switch to numarray if I can at all avoid it, so I really hope that numarray will also grow an __array__ protocol or something equivalent. This is the only point that is really vital to me, but there are others that I'd rather see reconsidered.
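The polymorphism Alexander is referring to can be made concrete with a toy class; the class and its data are invented, but the mechanism is the __array__ hook he describes, which lets Numeric functions accept the object as if it were an array:

import Numeric

class Wrapped:
    "Toy container that Numeric functions can digest via __array__."
    def __init__(self, data):
        self.data = data
    def __array__(self, typecode=None):
        # called by Numeric whenever an array representation is needed
        if typecode is None:
            return Numeric.array(self.data)
        return Numeric.array(self.data, typecode)

w = Wrapped([1.0, 2.0, 3.0])
print Numeric.sum(w)        # 6.0 -- works because of the __array__ hook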
As I said, I liked the division of labor between functions and methods/attributes in Numeric and the motivations behind it, as far as I understand them. numarray arrays, however, have grown methods like ``argsort`` and ``diagonal`` that seem somewhat unmotivated to me (and some of which cause problems for my matrix class). Similarly, why is there e.g. a ``.rank`` attribute but a ``.type()`` method? If anything one would expect type to be an attribute and rank a method, since the type is actually some intrinsic property that needs to be stored (and could even be plausibly assigned to, with results like an ``astype`` call) whereas ``size`` and ``rank`` have no "real" existence as they are only computed from the shape and modifying them makes no sense. TMTOWTDI is the road to perl, so I'd really prefer to avoid duplicate functionality a la ``rank(a)`` and ``a.rank`` and generally reserve attributes and methods for array-specific functionality. One area where TMTOWTDI seems to have run amok (several ways to do something but IMHO all broken) is flattened representations of arrays. All these expressions aim to produce a flattened version of ``a``: ``ravel(a)``, ``a.ravel()``, ``a.getflat()``/ ``a.flat``. `Aim` in this context is some sort of euphemism -- the only one for which it is possible to determine at compile time that it will do anything apart from raising an exception is ``ravel(a)`` -- not that one could know *what* it will do before the code is actually run (return a flattened copy of a or a flattened view), but never mind. Yuck. I think this really needs fixing (deprecating, rather than removing or changing incompatibly, where felt necessary). Something else, which I however consider less important: is it really necessary to have both 'type' and 'typecode'? Wouldn't it be enough to just stick with typecode, along the following lines (potentially issuing deprecation warnings where appropriate): a.typecode() returns a type object (e.g. Float32). array([1,2,3], typecode=Float32) behaves the same as array([1,2,3], typecode='d') Float32 etc. are already defined in Numeric so it's easy to write forward-compatible code and although hunting down instances of if a.typecode() == 'd': presumably wouldn't be that difficult, incompatibility could most likely almost be eliminated by making ``Float32 == 'd'`` return true. Sticking to the old name typecode also has the advantage that it is fairly unique and unambiguous (just try grep'ing for type vs. typecode). I must say that apart from the switch to type objects, I don't fully understand the differences in numeric types in old Numeric and numarray and the motivation behind them. As far as I can see the emphasis with Numeric was to stay flexible with respect to different hardware and increasing word sizes (i.e. to only guarantee minimum precision) and to provide some reasonable "default" size for each type (e.g. `Float` being a double precision [3]). This approach is maybe somewhat similar to the python core (floats and ints can have different sizes, depending on the underlying platform). In numarray the emphasis seems to have shifted to guaranteeing the actual size in memory (if in a few years' time most calculations are done with 128bit precision then that's maybe not such a good idea, but I have no clue how likely this is to happen). Is this shift of emphasis also responsible for the decision to have indexing operations always return arrays rather than scalars (including ones defined by numarray in cases where there is no plain-python equivalent)?
Will all other functions (e.g. min) continue to return scalars? [BTW can anyone explain to me the difference between Int and Int32 (typecodes 'i' and 'l')?] Anyway, my apologies if I come across as too negative or if some the points are misinformed. I really think that the recent changes to numarray and this announcment are great step forward to a smooth transition of the whole community from Numeric to numarray which will play an important role in consolidating python's role in the scientific computing. night, alex Footnotes: [1] I think it might be beneficial, however, to add an explicitly note to the manual that alerts users to the fact that small slices can keep alive very large arrays, because I am under the impression that this is not immediately obvious to everyone and can cause puzzling problems. [2] I moaned on this list some months ago that doing linear algebra with Numeric array's was often cumbersome and inefficient (and the Matrix class that already comes with Numeric is rather limited). My (currently alpha) matrix class attempts to address these issues and also provides a much more flexible 'plugable' output formating (matlab-like, amongst others, which I guess many people will find much more readable; but the standard array-like formating is also available). [3] As an aside: maybe ``type="Float"`` in numarray should therefore *not* be equivalent to ``type=Float32`` but to ``type=Float64``, given that these strings seem to just be there for backwards compatibility? -- Alexander Schmolck Postgraduate Research Student Department of Computer Science University of Exeter A.Schmolck at gmx.net http://www.dcs.ex.ac.uk/people/aschmolc/ From victor at idaccr.org Tue Jul 30 06:43:06 2002 From: victor at idaccr.org (Victor S. Miller) Date: Tue Jul 30 06:43:06 2002 Subject: [Numpy-discussion] Sparse matrices Message-ID: I had noticed that Travis Oliphant had a sparse.py package, but it no longer is available (clicking on the link gives a "404"). I have a particular kind of sparse matrix that I'd like to use to give vector matrix multiplies. In particular, it's an n x n matrix which has at most k (which is small, usually 2 or 3) non-zeros in each row which are in consecutive locations. I have this encoded as an n x k matrix, the i-th row gives the non-zero values in the i-th row of the big matrix, and an n long vector of indices -- the i-th element gives the starting position in the i-th row. When I want to multiply this matrix by a row vector v on the left. To do the multiplication I do the following: # loc is the location vector n = matrix.shape[0] mm = reshape(v,(-1,1))*matrix w = zeros((n+m),v.typecode()) for i in range(mm.shape[0]): w[loc[i]:loc[i]+matrix.shape[1]] += w[i] w = w[:n] I would like to be able to replace the loop with some Numeric operations. Is there a trick to do this? Note that the n that I'm using is around 100000, so that storing the full matrix is out of the question (and multiplying by that matrix would be extremely inefficient, anyway). -- Victor S. Miller | " ... Meanwhile, those of us who can compute can hardly victor at idaccr.org | be expected to keep writing papers saying 'I can do the CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed 08540 USA | what editor would publish them?" -- Oliver Atkin From victor at idaccr.org Tue Jul 30 08:29:06 2002 From: victor at idaccr.org (Victor S. 
Miller) Date: Tue Jul 30 08:29:06 2002 Subject: [Numpy-discussion] Sparse matrices In-Reply-To: (victor@idaccr.org's message of "Tue, 30 Jul 2002 09:42:13 -0400") References: Message-ID: Sorry, I had a typo in the program. It should be: # M is n by k, and represents a sparse n by n matrix A # the non-zero entries of row i of A start in column loc[i] # and are the i-th row of M in locations loc[i]:loc[i]+k # loc is the location vector n,k = M.shape mm = reshape(v,(-1,1))*M w = zeros((n+m),v.typecode()) # is there a trick to replace the loop below? for i in range(mm.shape[0]): w[loc[i]:loc[i]+k] += mm[i] w = w[:n] -- Victor S. Miller | " ... Meanwhile, those of us who can compute can hardly victor at idaccr.org | be expected to keep writing papers saying 'I can do the CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed 08540 USA | what editor would publish them?" -- Oliver Atkin From jochen at unc.edu Tue Jul 30 09:24:02 2002 From: jochen at unc.edu (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Tue Jul 30 09:24:02 2002 Subject: [Numpy-discussion] Sparse matrices In-Reply-To: References: Message-ID: On Tue, 30 Jul 2002 09:42:13 -0400 Victor S Miller wrote: Victor> I had noticed that Travis Oliphant had a sparse.py package, Victor> but it no longer is available (clicking on the link gives a Victor> "404"). It's part of scipy now. Greetings, Jochen -- University of North Carolina phone: +1-919-962-4403 Department of Chemistry phone: +1-919-962-1579 Venable Hall CB#3290 (Kenan C148) fax: +1-919-843-6041 Chapel Hill, NC 27599, USA GnuPG key: 44BCCD8E
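One possible trick for the loop Victor asks about, under an assumption his description does not guarantee: if, for each fixed column offset j, the target positions loc + j are all distinct (true for a strictly banded matrix where loc is strictly increasing), the loop over the n rows can be replaced by a short loop over the k columns using take/put. This is only a sketch and the function name is made up; if indices can repeat within a column, the original loop or a real sparse package such as the SciPy one mentioned above is the safer route.

from Numeric import reshape, zeros, take, put

def banded_left_multiply(v, M, loc):
    # computes w = v * A for the n x k banded encoding described above
    # (sketch only; assumes loc + j has no repeated entries for each fixed j)
    n, k = M.shape
    mm = reshape(v, (-1, 1)) * M
    w = zeros(n + k, v.typecode())
    for j in range(k):
        idx = loc + j
        put(w, idx, take(w, idx) + mm[:, j])   # accumulate column j in one shot
    return w[:n]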
Franklin -------------- next part -------------- # /usr/bin/env python from Numeric import * factor = array(6.).astype(Float16) edge_mod = array(0.66).astype(Float16) class InfluenceMap: def __init__(self,hex_map): self.map_size = map_size = hex_map.size self._iterations = (map_size[0] + map_size[1])/4 self.hex_map = hex_map # weightmap == influence map self.weightmap = zeros((map_size[0],map_size[1]),Float16) # constmap = initial state with constraints/constants self.constmap = zeros((map_size[0],map_size[1]),Float16) def step(self,iterations=None): constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations while iterations: # spread the influence # diamond_h neighbors = _shift_up(weightmap)/factor neighbors += _shift_left(weightmap)/factor neighbors += _shift_right(weightmap)/factor neighbors += _shift_down(weightmap)/factor neighbors += _shift_hex_up(weightmap)/factor neighbors += _shift_hex_down(weightmap)/factor # constrain initial points to prevent overheating putmask(neighbors,constmap,constmap) weightmap = neighbors iterations -= 1 self.weightmap = weightmap def shift_up(cells): return concatenate((cells[1:], cells[-1:]*edge_mod)) def shift_down(cells): return concatenate((cells[:1]*edge_mod, cells[:-1])) def shift_left(cells): return transpose(shift_up(transpose(cells))) def shift_right(cells): return transpose(shift_down(transpose(cells))) # for array layout def shift_hex_up(cells): neighbors = array(cells) # add to odd cell rows [1::2] neighbors[1::2] = shift_left(shift_up(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_up(cells))[::2] return neighbors def shift_hex_down(cells): neighbors = array(cells) # odd cell rows [1::2] neighbors[1::2] = shift_left(shift_down(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_down(cells))[::2] return neighbors From dubois1 at llnl.gov Mon Jul 8 09:10:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Mon Jul 8 09:10:04 2002 Subject: [Numpy-discussion] Caution -- // not standard Message-ID: <1026144543.13905.3.camel@ldorritt> I have run into several cases of this on different open-source projects, the latest being an incorrect change in Numeric's arrayobject.c: the use of // to start a comment. Many contributors who work only with Linux have come to believe that this works with other C compilers, which is not true. This construct comes from C++. Please avoid this construct when contributing changes or patches to Numeric. From bsder at mail.allcaps.org Mon Jul 8 12:03:09 2002 From: bsder at mail.allcaps.org (Andrew P. Lentvorski) Date: Mon Jul 8 12:03:09 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <1026144543.13905.3.camel@ldorritt> Message-ID: <20020708114304.T66456-100000@mail.allcaps.org> Actually, // is standard C99 released December 1, 1999 as ISO/IEC 9899:1999. It also has support for variable length arrays, a complex number type and a bunch of *portable* stuff for getting at numerical information (limits, floating-point environment) rather than nasty compiler specific hacks. ( See: http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9x.htm ) Many of these extensions are specifically for the numerical community. I would recommend taking up the issue of non-standards compliance with your compiler vendor. -a On 8 Jul 2002, Paul Dubois wrote: > I have run into several cases of this on different open-source projects, > the latest being an incorrect change in Numeric's arrayobject.c: the use > of // to start a comment. 
Many contributors who work only with Linux > have come to believe that this works with other C compilers, which is > not true. This construct comes from C++. Please avoid this construct > when contributing changes or patches to Numeric. From paul at pfdubois.com Mon Jul 8 12:38:05 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 8 12:38:05 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <20020708114304.T66456-100000@mail.allcaps.org> Message-ID: <001101c226b6$da5f3090$0c01a8c0@NICKLEBY> Thank you for the clarification. Unfortunately, "my" compiler vendor is the set of all compiler vendors that users of Numeric have, and we have to restrict ourselves to what works. I misspoke when I said it was "not standard"; I should have said, "doesn't work everywhere". > -----Original Message----- > From: Andrew P. Lentvorski [mailto:bsder at mail.allcaps.org] > Sent: Monday, July 08, 2002 12:02 PM > To: Paul Dubois > Cc: numpy-discussion at lists.sourceforge.net > Subject: Re: [Numpy-discussion] Caution -- // not standard > > > Actually, // is standard C99 released December 1, 1999 as > ISO/IEC 9899:1999. > > It also has support for variable length arrays, a complex > number type and a bunch of *portable* stuff for getting at > numerical information (limits, floating-point environment) > rather than nasty compiler specific hacks. ( See: > http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9> x.htm ) > > Many > of these extensions are specifically for the > numerical community. > > I would recommend taking up the issue of non-standards > compliance with your compiler vendor. > > -a > > On 8 Jul 2002, Paul Dubois wrote: > > > I have run into several cases of this on different open-source > > projects, the latest being an incorrect change in Numeric's > > arrayobject.c: the use of // to start a comment. Many > contributors who > > work only with Linux have come to believe that this works > with other C > > compilers, which is not true. This construct comes from C++. Please > > avoid this construct when contributing changes or patches > to Numeric. > > From jae at zhar.net Tue Jul 16 00:12:13 2002 From: jae at zhar.net (John Eikenberry) Date: Tue Jul 16 00:12:13 2002 Subject: [Numpy-discussion] Optimization advice In-Reply-To: <20020708094805.GA370@kosh.zhar.net> References: <20020708094805.GA370@kosh.zhar.net> Message-ID: <20020716070554.GB363@kosh.zhar.net> After getting some advice off the pygame list I think I have a pretty good version of my influence map now. I thought someone on this list might be interested or at least someone checking the archives. The new and improved code is around 6-7x faster. The main gain was obtained by converting all the array functions to slice notation and eliminating most of the needless copying of arrays. The new version is attached and is much better commented. It is also unabridged, as it was pointed out that it wasn't entirely clear what was going on in the last (edited) version. Hopefully things are more obvious in this one. Anyways... I just hating leaving a thread without a conclusion. Hope someone finds this useful. -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "They who can give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety." --B. 
Franklin -------------- next part -------------- # /usr/bin/env python # # John Eikenberry from Numeric import * from types import * FACTOR = array(6.).astype(Float32) EDGE_MOD = array(0.66).astype(Float32) ONE = array(1.).astype(Float32) ZERO = array(0.).astype(Float32) class InfluenceMap: """ There are 2 primary ways to setup the influence map, either might be useful depending on your needs. The first is to recreate the map each 'turn' the second is to keep the map around and just update it each turn. The first way is simple and easy to understand, both in terms of tweaking and later analysis. The second gives the map a sense of time and allows for fewer iterations of the spreading algorithm per 'turn'. Setting up the map to for one or the other of these is a matter of tweaking the code. There are 3 main bits of code which are described below and indicated via comments in the code. First some terminology: - weightmap stores the current influence map - neighbors is used as the memory buffer to calculate a the influence spreading - constmap contains a map with only the unit's scores present - when I refer to a 'multi-turn map' I mean using one instance of the influence map throughout the game without resetting it. [1] neighbors *= ZERO At the end of each iteraction, the neighbors take on the values of the weightmap from the previous step. This will reset those values to zero. This has a 1% performance hit. [2] putmask(neighbors,constmap,constmap) This keeps the values of the units hexes constant through all iterations. This results in about a 40% performance hit. This needs improvement. [3] setDecayRate([float]) This is meant to be used with a multi-turn map. It sets the floating point value (N>0.0<1.0)which is used on the map each turn to modify the current map before the influence spreading. No performance hit. If just [1] used then it will cause all influence values to decend toward zero. Not sure what this would be useful for, just documenting the effect. If [1] is not used (commented out) then the map values will never balance out, rising with each iteration. This is fine if you plan on resetting the influence map each turn. Allowing you to tweak the number of iterations to get the level of values you want. But it would cause problem with a multi-turn map unless [3] is used to keep this in check. Using [2] without [1] will accellerate the rising of the values described above. It will also lead to more variation amoung the influence values when using fewer iterations. High peaks and steep sides. Using neither [1] nor [2] the peaks are much lower. If [1] and [2] are both used the map will always attain a point of balance no matter how many iterations are run. This is desirable for maps used throughout the entire game (multi-turn maps) for obvious reasons. Given the effect of [1] this also limits the need for [3] as the influence values in areas of the map where units are no longer present will naturally decrease. Though the decay rate may still be useful for tweaking this. """ _decay_rate = None def __init__(self,hex_map): """ hex_map is the in game (civl) map object """ self.map_size = map_size = hex_map.size ave_size = (map_size[0] + map_size[1])/2 self._iterations = ave_size/2 # is the hex_map useful for anything other than size? 
self.hex_map = hex_map # weightmap == influence map self.weightmap = weightmap = zeros((map_size[0],map_size[1]),Float32) # constmap == initial unit locations self.constmap = zeros((map_size[0],map_size[1]),Float32) def setUnitMap(self,units): """ Put unit scores on map -units is a list of (x,y,score) tuples where x,y are map coordinates and score is the units influence modifier """ weightmap = self.weightmap constmap = self.constmap constmap *= ZERO # mayby use the hex_map here to get terrain effects? for (x,y,score) in units: weightmap[x,y] = score constmap[x,y]=score def setInterations(self,iterations): """ Set number of times through the influence spreading loop """ assert type(iterations) == IntType, "Bad arg type: setIterations([int])" self._iterations = iterations # [3] above def setDecayRate(self,rate): """ Set decay rate for a multi-turn map. """ assert type(rate) == FloatType, "Bad arg type: setDecayRate([float])" self._decay_rate = array(rate).astype(Float32) def reset(self): """ Reset an existing map back to zeros """ map_size = self.map_size self.weightmap = zeros((map_size[0],map_size[1]),Float32) def step(self,iterations=None): """ One set of loops through influence spreading algorithm """ # save lookup time constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations # decay rate can be used when the map is kept over duration of game, # instead of a new one each turn. the old values are retained, # degrading slowly over time. this allows for fewer iterations per turn # and gives sense of time to the map. its experimental at this point. if self._decay_rate: weightmap = weightmap * self._decay_rate # It might be possible to pre-allocate the memory for neighbors in the # init method. But I'm not sure how to update that pre-allocated array. neighbors = weightmap.copy() # spread the influence while iterations: # [1] in notes above # neighbors *= ZERO # diamond_hex layout neighbors[:-1,:] += weightmap[1:,:] # shift up neighbors[1:,:] += weightmap[:-1,:] # shift down neighbors[:,:-1] += weightmap[:,1:] # shift left neighbors[:,1:] += weightmap[:,:-1] # shift right neighbors[1::2][:-1,:-1] += weightmap[::2][1:,1:] # hex up (even) neighbors[1::2][:,:-1] += weightmap[::2][:,1:] # hex down (even) neighbors[::2][:,1:] += weightmap[1::2][:,:-1] # hex up (odd) neighbors[::2][1:,1:] += weightmap[1::2][:-1,:-1] # hex down (odd) # keep influence values balanced neighbors *= (ONE/FACTOR) # [2] above - maintain scores in unit hexes # putmask(neighbors,constmap,constmap) # 'putmask' adds almost 40% to the overhead. There should be a # faster way. A little testing seems to show that this problem is # related to the usage of floats for the map values. # prepare for next iteration weightmap,neighbors = neighbors,weightmap iterations -= 1 # save for next turn self.weightmap = weightmap From paul at pfdubois.com Thu Jul 18 14:47:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Jul 18 14:47:02 2002 Subject: [Numpy-discussion] [ANNOUNCE] Pyfort 8.0 Message-ID: <001f01c22ea4$89ff4f40$0b01a8c0@NICKLEBY> Pyfort 8.0 has been released at SourceForge (sf.net/projects/pyfortran) Version 8 This version contains a new facility for making and installing projects. Old compile lines will still work, but will produce an equivalent .pfp file that you could use in the future. Included is a Tkinter-based GUI editor for the project files. However, the format of the files is simple and they could be edited with a text editor as well. 
There is improved support for installing Pyfort and the modules it creates in a location other than inside Python. See README. This version does change the installation location for an extension. Therefore, you should remove the files of any previous installation from your Python. Yes, this is annoying. That is why we are doing it, so that we can have an "uninstall" command. A new "windows" subdirectory has been added, containing an example of how to use Pyfort on Windows with Visual Fortran. Thanks to Reinhold Niesner. Testing of, and advice about, this are needed from Windows users. The pyfort script itself is also now installed as a .bat script for win32. Support for Mac OSX (Darwin) added. From biesingert at yahoo.com Fri Jul 19 01:13:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Fri Jul 19 01:13:03 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 Message-ID: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Hi, when I try to install NumPy on Mac OS X.1.5, it fails on this error: .... cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so /usr/bin/ld: -undefined error must be used when -twolevel_namespace is in effect error: command 'cc' failed with exit status 1 ~/Python/Numeric-21.3 % cc cc: No input files I had thought to submit this to the developers section of the list but could not find the way to subscribe to it ;-) If somehow had a running version of NumPy with for Mac OSX http://tony.lownds.com/macosx, I would appreciate it. Thanks everyone for their help! Regards, Thomas __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com From rob at pythonemproject.com Fri Jul 19 05:35:04 2002 From: rob at pythonemproject.com (rob) Date: Fri Jul 19 05:35:04 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 References: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: <3D3806B3.F2BE1C1A@pythonemproject.com> Thomas Biesinger wrote: > > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect > error: command 'cc' failed with exit status 1 > ~/Python/Numeric-21.3 % cc > cc: No input files > > I had thought to submit this to the developers section of the list > but could not find the way to subscribe to it ;-) > > If somehow had a running version of NumPy with for Mac OSX > http://tony.lownds.com/macosx, I would appreciate it. > > Thanks everyone for their help! > > Regards, > Thomas > > __________________________________________________ > Do You Yahoo!? > Yahoo! Autos - Get free new car price quotes > http://autos.yahoo.com > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. 
> http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion Hi Thomas, sorry I don't have the expertise to help you with your question. I am wondering if you are using one of Apple's new G4 machines? I'm curious about the floating point performance of those chips. If you ever get Numpy working, I have a routine that I use for a benchmark, a Norton-Summerfeld ground (antenna) simulation routine that I could send to you. The record for me is 120s on a P4 1.8Ghz at work, but I'm sure the new Xeons would beat that, and maybe the new Athlons. My 1.2Ghz DDR Athlon is much slower than the P4, but the clock speeds are so much different. Rob. -- ----------------------------- The Numeric Python EM Project www.pythonemproject.com From welch at cs.unc.edu Fri Jul 19 05:52:01 2002 From: welch at cs.unc.edu (Greg Welch) Date: Fri Jul 19 05:52:01 2002 Subject: FW: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <200207191053.g6JArGbE017359@wren.cs.unc.edu> Message-ID: Thomas, I have (recently) built Numeric 21.3 on multiple OS X 10.1.5 platforms, and have had no problems that I know of. I am using Python 2.3a0 but had also built Numeric w/ earlier versions of Python too. All platforms have the April 2002 developer tools update. I just noticed that your compile line shows the use of cc, as opposed to gcc. Here is the corresponding compile line for 21.3 on my powerbook (Python 2.3a0): gcc -bundle -bundle_loader /usr/local/bin/python build/temp.darwin-5.5-Power Macintosh-2.3/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.3/arrayobject.o build/temp.darwin-5.5-PowerMacintosh-2.3/ufuncobject.o -o build/lib.darwin-5.5-Power Macintosh-2.3/_numpy.so --Greg -------------- next part -------------- An embedded message was scrubbed... From: unknown sender Subject: no subject Date: no date Size: 38 URL: From biesingert at yahoo.com Fri Jul 19 04:12:31 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Fri, 19 Jul 2002 01:12:31 -0700 (PDT) Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 Message-ID: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Hi, when I try to install NumPy on Mac OS X.1.5, it fails on this error: .... cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so /usr/bin/ld: -undefined error must be used when -twolevel_namespace is in effect error: command 'cc' failed with exit status 1 ~/Python/Numeric-21.3 % cc cc: No input files I had thought to submit this to the developers section of the list but could not find the way to subscribe to it ;-) If somehow had a running version of NumPy with for Mac OSX http://tony.lownds.com/macosx, I would appreciate it. Thanks everyone for their help! Regards, Thomas __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com ------------------------------------------------------- This sf.net email is sponsored by:ThinkGeek Welcome to geek heaven. 
http://thinkgeek.com/sf _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion --B_3109913412_427129-- From Jack.Jansen at oratrix.com Fri Jul 19 14:17:02 2002 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Fri Jul 19 14:17:02 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: On vrijdag, juli 19, 2002, at 10:12 , Thomas Biesinger wrote: > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect Thomas, as of MacOSX 10.1 the link step needs either the -flat_namespace option, or the -bundle_loader option. But: this has been fixed in both Python 2.2.1 and Python 2.3a0 (the CVS tree). Are you by any chance still running Python 2.2 (which predates OSX 10.1, and therefore two-level namespaces, and therefore the right linker invocations, which distutils reads from Python's own Makefile). If you're running 2.2: please upgrade and try again. If you're running 2.2.1 or later: let me know and I'll try and think of what questions I should ask you to debug this:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From paul at pfdubois.com Mon Jul 22 16:14:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 22 16:14:03 2002 Subject: [Numpy-discussion] Numarray design announcement Message-ID: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> At numpy.sf.net you will find a posting from Perry Greenfield and I detailing the design decisions we have taken with respect to Numarray. What follows is the text of that message without the formatting. We ask for your understanding about those decisions that differ from the ones you might prefer. Numarray's Design Paul F. Dubois and Perry Greenfield Numarray is the new implementation of the Numeric Python extension. It is our intention that users will change as rapidly as possible to the new module when we decide it is ready. The present Numeric Python team will cease supporting Numeric after a short transition period. During recent months there has been a lot of discussion about Numarray and whether or not it should differ from Numeric in certain ways. We have reviewed this lengthy discussion and come to some conclusions about what we plan to do. The discussion has been valuable in that it took a whole new "generation" back through the considerations that the "founding fathers" debated when Numeric Python was designed. There are literally tens of thousands of Numerical Python users. These users may represent only a tiny percentage of potential users but they are real users today with real code that they have written, and breaking that code would represent real harm to real people. Most of the issues discussed recently were discussed at length when Numeric was first designed. Some decisions taken then represent a choice that was simply a choice among valid alternatives. Nevertheless, the choice was made, and to arbitrarily now make a different choice would be difficult to justify. 
In arguing about Python's indentation, we often see heart-felt arguments from opponents who have sincere reasons for feeling as they do. However, many of the pitfalls they point to do not seem to actually occur in real life very often. We feel the same way about many arguments about Numeric Python. The view / copy argument, for example, claims that beginners will make errors with view semantics. Well, some do, but not very often, and not twice. It is just one of many differences that users need to adapt to when learning an entity-object model such as Python's when they are used to variable semantics such as in Fortran or C. Similarly, we do not receive massive reports of confusion about differing default values for the axis keyword -- there was a rationale for the way it is now, and although one could propose a different rationale for a different choice, it would be just a choice. Decisions Numarray will have the same Python interface as Numeric except for the exceptions discussed below. 1. The Numarray C API includes a compatibility layer consisting of some of the members of the Numeric C API. For details on compatibility at the C level see http://telia.dl.sourceforge.net/sourceforge/numpy/numarray.pdf , pdf pages 78-81. Since no formal decision was ever made about what parts of the Numeric C header file were actually intended to be publicly available, do not expect complete emulation. Numarray's current view of arrays in C, using either native or emulation C-APIs, is that array data can be mutated, but array properties cannot. Thus, an existing Numeric extension function which tries to change the shape or strides of an array in C is more of a porting challenge, possibly requiring a python wrapper. Depending on what kind of optimization we do, this restriction might be lifted. For the Numeric extensions already ported to Numarray (RandomArray, LinearAlgebra, FFT), none of this was an issue. 2. Currently, if the result of an index operation x[i] results in a scalar result, the result is converted to a similar Python type. For example, the result of array([1,2,3])[1] is the Python integer 2. This will be changed so that the result of an index operation on a Numarray array is always a Numarray array. Scalar results will become rank-zero arrays (i.e., shape () ). 3. Currently, binary operations involving Numeric arrays and Python scalars uses the precision of the Python scalar to help determine the precision of the result. In Numarray, the precision of the array will have precedence in determining the precision of the outcome. Full details are available in the Numarray documention. 4. The Numarray version of MA will no longer have copy semantics on indexing but instead will be consistent with Numarray. (The decision to make MA differ in this regards was due to a need for CDAT to be backward compatible with a local variant of Numeric; the CDAT user community no longer feels this was necessary). Some explanation about the scalar change is in order. Currently, much coding in Numeric-based applications must be devoted to handling the fact that after an index operation, the programmer can not assume that the result is an array. So, what are the consequences of change? A rank-zero array will interact as expected with most other parts of Python. When it does not, the most likely result is a type error. For example, let x = array([1,2,3]). Then [1,2,3][x[0]] currently produces the result 2. 
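To make decisions 2 and 3 concrete, here is a small sketch run against current Numeric; the numarray side appears only in the comments and is taken from the decisions above, not from running numarray:

from Numeric import array, Float32

x = array([1, 2, 3])
print type(x[1])            # Numeric today: <type 'int'>, a Python scalar
print [10, 20, 30][x[1]]    # works, because x[1] is a plain Python int
# Under decision 2, x[1] becomes a rank-zero array with shape (), so code
# that hands it to things expecting a Python int may need attention.

a = array([1.0, 2.0], Float32)
b = a * 2.0
print b.typecode()          # Numeric today: 'd' (the Python float scalar wins)
# Under decision 3, the array's precision wins and the result stays Float32.

The type error mentioned next is exactly the list-indexing case in the second print, once the index is no longer a Python int.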
With the change, it would produce a type error unless a change is made to the Python core (currently under discussion). But x[x[0]] would still work because we have control of that. In short, we do not think this change will break much code and it will prevent the writing of more code that is either broken or difficult to write correctly. From pete at shinners.org Mon Jul 22 17:36:12 2002 From: pete at shinners.org (Pete Shinners) Date: Mon Jul 22 17:36:12 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: <3D3CA3B1.7010708@shinners.org> Paul F Dubois wrote: > Numarray's Design > Paul F. Dubois and Perry Greenfield a very nice design, for a lot of challenging decisions > Numarray's current view of arrays in C, using either native or > emulation C-APIs, is that array data can be mutated, but array > properties cannot. Thus, an existing Numeric extension function > which tries to change the shape or strides of an array in C is > more of a porting challenge, possibly requiring a python wrapper. i have a c extension that does this, but only during "creation time" of the array. i'm hoping there can be some way to do this from C. i need to create a new array from a block of numbers that aren't contiguous... /* roughly snipped code */ dim[0] = myimg->w; dim[1] = myimg->h; dim[2] = 3; /*r,g,b*/ array = PyArray_FromDimsAndData(3, dim, PyArray_UBYTE, startpixel); array->flags = OWN_DIMENSIONS|OWN_STRIDES; array->strides[2] = pixelstep; array->strides[1] = myimg->pitch; array->strides[0] = myimg->format->BytesPerPixel; array->base = myimg_object; note this data is image data, and i am "reorienting" it so that the first index is X and the second index is Y. plus i need to account for an image pitch, where the rows are not exactly the same width as the number of pixels. also, i am also changing the "base" field, since the data for this array lives inside another image object of course, once the array is created, i pass it off to the user and never touch these fields again, so perhaps something like this will work in the new numarray? if not, i'm eager to start my petition for a "PyArray_FromDimsAndDataAndStrides" function, and also a way to assign the "base" as well. i'm looking forward to the new numarray, looks very exciting. From biesingert at yahoo.com Mon Jul 22 23:54:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Mon Jul 22 23:54:03 2002 Subject: [Numpy-discussion] Summary to NumPy on Mac OS 10.1.5 Message-ID: <20020723065343.73589.qmail@web14106.mail.yahoo.com> From e.maryniak at pobox.com Tue Jul 23 09:19:04 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 09:19:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug Message-ID: <200206261833.29702.e.maryniak@pobox.com> Dear crunchers, According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) (with /my/ emphasis): The seed() function takes two integers and sets the two seeds of the random number generator to those values. If the default values of 0 are used for /both/ x and y, then a seed is generated from the current time, providing a /pseudo-random/ seed. Note: in numarray, the RandomArray2 package is provided but it's description is not (yet) included in the numarray manual. I have some questions about this: 1. 
The implementation of seed(), which is, by the way, identical both in Numeric's RandomArray.py and numarray's RandomArray2.py seems to contradict it's usage description: ---cut--- def seed(x=0,y=0): """seed(x, y), set the seed using the integers x, y; Set a random one from clock if y == 0 """ if type (x) != IntType or type (y) != IntType : raise ArgumentError, "seed requires integer arguments." if y == 0: import time t = time.time() ndigits = int(math.log10(t)) base = 10**(ndigits/2) x = int(t/base) y = 1 + int(t%base) ranlib.set_seeds(x,y) ---cut--- Shouldn't the second 'if' be: if x == 0 and y == 0: With the current implementation: - 'seed(3)' will actually use the clock for seeding - it is impossible to specify 0's (0,0) as seed: it might be better to use None as default values? 2. With the current time.time() based default seeding, I wonder if you can call that, from a mathematical point of view, pseudo-random: ---cut--- $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> numarray.__version__ '0.3.5' >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 1027434978.406238 (102743, 4979) 1027434979.400319 (102743, 4980) 1027434980.400316 (102743, 4981) 1027434981.40031 (102743, 4982) 1027434982.400308 (102743, 4983) ---cut--- It is incremental, and if you use default seeding within one (1) second, you get the same seed: ---cut--- >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(0.1) ... print ... 1027436537.066677 (102743, 6538) 1027436537.160303 (102743, 6538) 1027436537.260363 (102743, 6538) 1027436537.360299 (102743, 6538) 1027436537.460363 (102743, 6538) ---cut--- 3. I wonder what the design philosophy is behind the decision to use 'mathematically suspect' seeding as default behavior. Apart from the fact that it can hardly be called 'random', I also have the following problems with it: - The RandomArray2 module initializes with 'seed()' itself, too. Reload()'s of RandomArray2, which might occur outside the control of the user, will thus override explicit user's seeding. Or am I seeing ghosts here? - When doing repeated run's of one's neural net simulations that each take less than a second, one will get identical streams of random numbers, despite seed()'ing each time. Not quite what you would expect or want. - From a purist software engineering point of view, I don't think automagical default behavior is desirable: one wants programs to be deterministic and produce reproducible behavior/output. If you use default seed()'ing now and re-run your program/model later with identical parameters, you will get different output. In Eiffel, object attributes are always initialized, and you will almost never have irreproducible runs. I found that this is a good thing for reproducing ones bugs, too ;-) To summarize, my recommendation would be to use None default arguments and use, when no user arguments are supplied, a hard (built-in) seed tuple, like (1,1) or whatever. Sometimes a paper on a random number generator suggests seeds (like 4357 for the MersenneTwister), but of course, a good random number generator should behave well independently of the initial seed/seed-tuple. 
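Expressed as code, and as a sketch only rather than RandomArray2's actual implementation, that recommendation could look like the helper below; it just computes the pair instead of calling ranlib.set_seeds, so it reads independently of the module internals:

def choose_seeds(x=None, y=None):
    """Return an (x, y) seed pair following the recommendation above:
    both None -> use a fixed, documented default pair, so that bare
                 calls are reproducible,
    both given -> use them as-is, zeros included,
    only one given -> refuse, rather than silently ignoring it.
    In RandomArray2 the pair would then go to ranlib.set_seeds(x, y).
    """
    if x is None and y is None:
        x, y = 1, 1            # any fixed, documented pair would do
    elif x is None or y is None:
        raise ValueError("specify both seeds or neither")
    return x, y

With this rule a bare choose_seeds() is reproducible across runs, choose_seeds(0, 0) really does seed with zeros, and choose_seeds(3) raises an error instead of quietly falling back to the clock.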
I may be completely mistaken here (I'm not an expert on random number theory), but the random number generators (Ahrens, et. al) seem 'old'? After some studying, we decided to use the Mersenne Twister: http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html http://www.math.keio.ac.jp/~matumoto/emt.html PDF article: http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator", ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, January pp.3-30 1998 There are some Python wrappers and it has good performance as well. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Hail Caesar! We, who are about to dine, salad you. From jmiller at stsci.edu Tue Jul 23 11:56:04 2002 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 23 11:56:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <200206261833.29702.e.maryniak@pobox.com> Message-ID: <3D3DA67E.308@stsci.edu> Eric Maryniak wrote: >Dear crunchers, > >According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) >(with /my/ emphasis): > > The seed() function takes two integers and sets the two seeds > of the random number generator to those values. If the default > values of 0 are used for /both/ x and y, then a seed is generated > from the current time, providing a /pseudo-random/ seed. > >Note: in numarray, the RandomArray2 package is provided but it's >description is not (yet) included in the numarray manual. > >I have some questions about this: > >1. The implementation of seed(), which is, by the way, identical > both in Numeric's RandomArray.py and numarray's RandomArray2.py > seems to contradict it's usage description: > The 2 in RandomArray2 is there to support side-by-side testing with Numeric, not to imply something new and improved. The point of providing RandomArray2 is to provide a migration path for current Numeric users. To that end, RandomArray2 should be functionally identical to RandomArray. That should not, however, discourage you from writing a new and improved random number package for numarray. > > >---cut--- >def seed(x=0,y=0): > """seed(x, y), set the seed using the integers x, y; > Set a random one from clock if y == 0 > """ > if type (x) != IntType or type (y) != IntType : > raise ArgumentError, "seed requires integer arguments." > if y == 0: > import time > t = time.time() > ndigits = int(math.log10(t)) > base = 10**(ndigits/2) > x = int(t/base) > y = 1 + int(t%base) > ranlib.set_seeds(x,y) >---cut--- > > Shouldn't the second 'if' be: > > if x == 0 and y == 0: > > With the current implementation: > > - 'seed(3)' will actually use the clock for seeding > - it is impossible to specify 0's (0,0) as seed: it might be > better to use None as default values? > >2. With the current time.time() based default seeding, I wonder > if you can call that, from a mathematical point of view, > pseudo-random: > >---cut--- >$ python >Python 2.2.1 (#1, Jun 25 2002, 20:45:02) >[GCC 2.95.3 20010315 (SuSE)] on linux2 >Type "help", "copyright", "credits" or "license" for more information. > >>>>from numarray import * >>>>from RandomArray2 import * >>>>import time >>>>numarray.__version__ >>>> >'0.3.5' > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(1) >... print >... 
>1027434978.406238 >(102743, 4979) > >1027434979.400319 >(102743, 4980) > >1027434980.400316 >(102743, 4981) > >1027434981.40031 >(102743, 4982) > >1027434982.400308 >(102743, 4983) >---cut--- > > It is incremental, and if you use default seeding within > one (1) second, you get the same seed: > >---cut--- > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(0.1) >... print >... >1027436537.066677 >(102743, 6538) > >1027436537.160303 >(102743, 6538) > >1027436537.260363 >(102743, 6538) > >1027436537.360299 >(102743, 6538) > >1027436537.460363 >(102743, 6538) >---cut--- > >3. I wonder what the design philosophy is behind the decision > to use 'mathematically suspect' seeding as default behavior. > Using time for a seed is fairly common. Since it's an implementation detail, I doubt anyone would object if you can suggest a better default seed. > > Apart from the fact that it can hardly be called 'random', I also > have the following problems with it: > > - The RandomArray2 module initializes with 'seed()' itself, too. > Reload()'s of RandomArray2, which might occur outside the > control of the user, will thus override explicit user's seeding. > Or am I seeing ghosts here? > Overriding a user's explicit seed as a result of a reload sounds correct to me. All of the module's top level statements are re-executed during a reload. > > - When doing repeated run's of one's neural net simulations that > each take less than a second, one will get identical streams of > random numbers, despite seed()'ing each time. > Not quite what you would expect or want. > This is easy enough to work around: don't seed or re-seed. If you then need to make multiple simulation runs, make a separate module and call your simulation like: import simulation RandomArray2.seed(something_deterministic, something_else_deterministic) for i in range(number_of_runs): simulation.main() > > - From a purist software engineering point of view, I don't think > automagical default behavior is desirable: one wants programs to > be deterministic and produce reproducible behavior/output. > I don't know. I think by default, random numbers *should be* random. > > If you use default seed()'ing now and re-run your program/model > later with identical parameters, you will get different output. > When you care about this, you need to set the seed to something deterministic. > > In Eiffel, object attributes are always initialized, and you will > almost never have irreproducible runs. I found that this is a good > thing for reproducing ones bugs, too ;-) > This sounds like a good design principle, but I don't see anything in RandomArray2 which is keeping you from doing this now. > >To summarize, my recommendation would be to use None default arguments >and use, when no user arguments are supplied, a hard (built-in) seed >tuple, like (1,1) or whatever. > Unless there is a general outcry from the rest of the community, I think the (existing) numarray extensions (RandomArray2, LinearAlgebra2, FFT2) should try to stay functionally identical with Numeric. > >Sometimes a paper on a random number generator suggests seeds (like 4357 >for the MersenneTwister), but of course, a good random number generator >should behave well independently of the initial seed/seed-tuple. >I may be completely mistaken here (I'm not an expert on random number >theory), but the random number generators (Ahrens, et. al) seem 'old'? 
>After some studying, we decided to use the Mersenne Twister: > An array enabled version might make a good add-on package for numarray. > > > http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html > http://www.math.keio.ac.jp/~matumoto/emt.html > >PDF article: > > http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf > > M. Matsumoto and T. Nishimura, > "Mersenne Twister: A 623-dimensionally equidistributed uniform > pseudorandom number generator", > ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, > January pp.3-30 1998 > >There are some Python wrappers and it has good performance as well. > >Bye-bye, > >Eric > Bye, Todd From e.maryniak at pobox.com Tue Jul 23 13:03:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 13:03:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3DA67E.308@stsci.edu> References: <200206261833.29702.e.maryniak@pobox.com> <3D3DA67E.308@stsci.edu> Message-ID: <200207232202.04104.e.maryniak@pobox.com> On Tuesday 23 July 2002 20:54, Todd Miller wrote: > Eric Maryniak wrote: > >... > That should not, however, discourage you from writing a new and improved > random number package for numarray. Yes, thank you :-) > >... > >3. I wonder what the design philosophy is behind the decision > > to use 'mathematically suspect' seeding as default behavior. > > Using time for a seed is fairly common. Since it's an implementation > detail, I doubt anyone would object if you can suggest a better default > seed. Well, as said, a fixed seed, provided by the class implementation and therefore 'good', instead of a not-so-random 'random' seed. And imho it would be better not to (only) use the clock, but a /dev/random kinda thing. Personally, I find the RNG setup much more appealing: there the default is: standard_generator = CreateGenerator(-1) where seed < 0 ==> Use the default initial seed value. seed = 0 ==> Set a "random" value for the seed from the system clock. seed > 0 ==> Set seed directly (32 bits only). And indeed 'void Mixranf(int *s,u32 s48[2])' uses a built-in constant as initial seed value (actually, two). >... > > If you use default seed()'ing now and re-run your program/model > > later with identical parameters, you will get different output. > > When you care about this, you need to set the seed to something > deterministic. Naturally, but how do I know what a 'good' seed is (or indeed it's type, range, etc.)? I just would like, as e.g. RNG does, let the number generator take care of this... (or at least provide the option to) >... In the programs I've seen so far, including a lot of ours ahem, usually a program (simulation) is run multiple times with the same parameters and, in our case for neural nets, seeded each time with a clock generated seed and then the different simulations are compared and checked if they are similar or sensitive to chaotic influences. But I don't think this is the proper way to do this. My point is, I guess, that the sequence of these clock-generated seeds itself is not random, because (as for RandomArray) the generated numbers are clearly not random. Better, and reproducible, would be to start the first simulation with a supplied seed, get the seed and pickle after the first run and use the pickled seed for run 2 etc. or indeed have a kind of master script (as you suggest) that manages this. That way you would start with one seed only and are not re-seeding for each run. 
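That bookkeeping takes only a few lines. A sketch, assuming numarray's RandomArray2, assuming that get_seed() reports the generator's current two-integer state (as the interpreter sessions above suggest), and using a file name chosen purely for illustration:

import os
import pickle
import RandomArray2

SEED_FILE = "mc_seed.pkl"     # illustrative name only

# before the run: pick up where the previous run left off, if there was one
if os.path.exists(SEED_FILE):
    f = open(SEED_FILE, "rb")
    x, y = pickle.load(f)
    f.close()
    RandomArray2.seed(x, y)
else:
    RandomArray2.seed(1, 1)   # first run: one explicit, documented seed

# ... the simulation itself goes here ...

# after the run: record the generator state for the next run in the series
f = open(SEED_FILE, "wb")
pickle.dump(RandomArray2.get_seed(), f)
f.close()

Run 1 seeds explicitly and every later run picks up where the previous one stopped, so the whole series behaves like one long, reproducible stream; deleting the file starts a new series.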
Because if the clock-seeds are not truly random, you will a much greater change of cycles in your overall sequence of numbers. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. VME ERROR 37022: Hierarchic name syntax invalid taking into account starting points defined by initial context. From paul at pfdubois.com Tue Jul 23 13:14:05 2002 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jul 23 13:14:05 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207232202.04104.e.maryniak@pobox.com> Message-ID: <3D36139400005515@mta08.san.yahoo.com> RandomArray got a "special" position as part of Numeric simply by historical accident in being there first. I think in the conversion to Numarray we will be able to remove such things from the "core" and make more of a marketplace of equals for the "addons". As it is now there is some implication that somehow one is "better" than the other, which is unjustified either mathematically or in the sense of design. RNG's design is based on my experience with large codes needing many independent streams. The mathematics is from a well-tested Cray algorithm. I'm sure it could use fluffing up but a good case can be made for it. From gb at cs.unc.edu Tue Jul 23 14:24:03 2002 From: gb at cs.unc.edu (Gary Bishop) Date: Tue Jul 23 14:24:03 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? Message-ID: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> The example given for real_fft in the FFT section of the Sept 7, 2001 Numpy manual makes no sense to me. The text says >>> x = cos(arange(30.0)/30.0*2*pi) >>> print real_fft(x) [ -1. +0.j 13.69406641+2.91076367j -0.91354546-0.40673664j -0.80901699-0.58778525j -0.66913061-0.74314483j -0.5 -0.8660254j -0.30901699-0.95105652j -0.10452846-0.9945219j 0.10452846-0.9945219j 0.30901699-0.95105652j 0.5 -0.8660254j 0.66913061-0.74314483j 0.80901699-0.58778525j 0.91354546-0.40673664j 0.9781476 -0.20791169j 1. +0.j ] But surely x is a single cycle of a cosine wave and should have a very sensible and simple FT. Namely [0, 1, 0, 0, 0, ...] Indeed, running the example using Numeric and FFT produces, within rounding error, exactly what I would expect. Why the non-intuitive (and wrong) result in the example text? gb From dubois1 at llnl.gov Tue Jul 23 14:32:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Tue Jul 23 14:32:04 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? In-Reply-To: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> References: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> Message-ID: <1027459879.8212.2.camel@ldorritt> The person who wrote the manual cut and pasted from running the code. I think there was a bug in FFT at the time. (:-> On Tue, 2002-07-23 at 14:23, Gary Bishop wrote: > The example given for real_fft in the FFT section of the Sept 7, 2001 > Numpy manual makes no sense to me. The text says > > >>> x = cos(arange(30.0)/30.0*2*pi) > >>> print real_fft(x) > [ -1. +0.j 13.69406641+2.91076367j > -0.91354546-0.40673664j -0.80901699-0.58778525j > -0.66913061-0.74314483j -0.5 -0.8660254j > -0.30901699-0.95105652j -0.10452846-0.9945219j > 0.10452846-0.9945219j 0.30901699-0.95105652j > 0.5 -0.8660254j 0.66913061-0.74314483j > 0.80901699-0.58778525j 0.91354546-0.40673664j > 0.9781476 -0.20791169j 1. +0.j ] > > But surely x is a single cycle of a cosine wave and should have a very > sensible and simple FT. 
Namely [0, 1, 0, 0, 0, ...] > > Indeed, running the example using Numeric and FFT produces, within > rounding error, exactly what I would expect. > > Why the non-intuitive (and wrong) result in the example text? > > gb > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From e.maryniak at pobox.com Wed Jul 24 09:24:14 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 09:24:14 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D36139400005515@mta08.san.yahoo.com> References: <3D36139400005515@mta08.san.yahoo.com> Message-ID: <200207241823.42218.e.maryniak@pobox.com> On Tuesday 23 July 2002 22:15, paul at pfdubois.com wrote: > RandomArray got a "special" position as part of Numeric simply by > historical accident in being there first. I think in the conversion to > Numarray we will be able to remove such things from the "core" and make > more of a marketplace of equals for the "addons". As it is now there is > some implication that somehow one is "better" than the other, which is > unjustified either mathematically or in the sense of design. > > RNG's design is based on my experience with large codes needing many > independent streams. The mathematics is from a well-tested Cray algorithm. > I'm sure it could use fluffing up but a good case can be made for it. A famous quote from Linus is "Nice idea. Now show me the code." Perhaps a detailed example makes my problem clearer, because as it is now, RNG and RandomArray2 are not orthogonal in design, in the sense that RNG's default seed is fixed and RandomArray's is automagical (clock), not reproducible and mathematically suspect, which I think is not good for the more naive Python user. Below I will give intended usage in a provocative way, but please don't take me too seriously (I know, I don't ;-) Let's say you have a master shell script that runs a neural net paradigm (size 20x20) 10 times, each time with the same parameters, to see if it's stable or chaotic, i.e. does not 'converge' c.q. outcome depends on initial values (it should not be chaotic, but this should always be checked). run10.sh tracelink.py 20 20 inputpat.dat > hippocamp01.out ... 8 more ... tracelink.py 20 20 inputpat.dat > hippocamp10.out tracelink.py ... import numarray, RandomArray2 _or_ RNG ... # Case 1: RandomArray2 # User uses default clock seed, which is the same # during 1 second (see my previous posting). # ignlgi(void)'s seeds 1234567890L,123456789L # are _not_ used (see com.c). RandomArray2.seed() # But if omitted, RandomArray2.py does it, too. ... calculations ... other program outcome _only_ if program runs > 1 second, ... otherwise the others will have the same result. # Case 2: RNG # A 'standard_generator = CreateGenerator(-1)' is automatically done. # seed < 0 ==> Use the default initial seed value. # seed = 0 ==> Set a "random" value for the seed from system clock. # seed > 0 ==> Set seed directly (32 bits only). # Thus, the fixed seeds used are 0,0 (see Mixranf() in ranf.c). ... calculations ... all 10 programs have the same outcome when using ranf(), ... because it always starts the same seed, the sequence is always: ... 
0.58011364857958725, 0.95051273498076583, 0.78637142533060356 etc. The problem with RandomArray's seed is, that it is not truly random itself. In it's current (time.time based) implementation it is linearly auto incrementing every second, and therefore suffers from auto-correlation. Moreover, in the above example, if 10 separate .py runs complete in 1 second they'll all have the same seed (and outcome). This is not what the user, if accustomed to clock seeding, would expect. But if the seed is different each time, a problem is that runs are not reproducible. Let's say that run hippocamp06.out produced some strange output: now unless the user saved the seed (with get_seed), it can never be reproduced. Therefore, I think RNG's design is better and should be applied to RandomArray2, too, because RandomArray2's seeding is flawed anyways. A user should be aware of proper seeding, agreed, and now will be: when doing multiple identical runs, the same (and thus reproducible) output will result and so the user is made aware of the fact that, as an example, he or she should seed or pickle it between runs. So my suggestion would be to re-implement RandomArray2.seed(x=0,y=0) as follows: if either the x or y seed: seed < 0 ==> Use the default initial seed value. seed = None ==> Set a "random" value for the seed from the system clock. seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- But: I realize that this is different behavior from Python's standard random and whrandom, where no arg or None uses the clock. But, if that behavior is kept for RandomArray2 (and RNG should then be adapted, too) then I'd urge at least to use a better initial seed. In certain applications, e.g. generating session id's in crypto programs, non-predictability of initial seeds is crucial. But if you have a look at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks like an art in itself. So perhaps RNG's 'clock code' should replace RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will not have the 1-second problem. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Just because you're not paranoid, that doesn't mean that they're not after you. From Chris.Barker at noaa.gov Wed Jul 24 10:01:06 2002 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Jul 24 10:01:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> Message-ID: <3D3ECEEE.6BAF4CC2@noaa.gov> Just to add my $.02: I disagree with Eric about what the default behaviour should be. Every programming language/environment I have ever used uses some kind of "random" seed by default. When I want reproducible results (which I often do for testing) I can specify a seed. I find the the most useful behaviour. 
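That usage pattern is short enough to show in full; a sketch against RandomArray2, with arbitrary seed values:

import RandomArray2

RandomArray2.seed(1234, 5678)     # explicit seed: reproducible stream
a = RandomArray2.random((2, 3))

RandomArray2.seed(1234, 5678)     # same seed again
b = RandomArray2.random((2, 3))

assert a.tolist() == b.tolist()   # the two streams are identical

RandomArray2.seed()               # no arguments: "random" clock-based seed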
As Eric points out, it is not trivial to generate a "random" seed (from the time, or whatever), so it doesn't make sense to burdon the nieve user with this chore. Therefore, I strongly support keeping the default behaviour of a "random" seed. Eric Maryniak wrote: > then I'd urge at least to use a better initial seed. > In certain applications, e.g. generating session id's in crypto programs, > non-predictability of initial seeds is crucial. But if you have a look > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks > like an art in itself. So perhaps RNG's 'clock code' should replace > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will > not have the 1-second problem. This I agree with: a better default initial seed would be great. As someone said, "show me the code!". I don't imagine anyone would object to improving this. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From e.maryniak at pobox.com Wed Jul 24 10:29:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 10:29:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3ECEEE.6BAF4CC2@noaa.gov> References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> <3D3ECEEE.6BAF4CC2@noaa.gov> Message-ID: <200207241928.07366.e.maryniak@pobox.com> On Wednesday 24 July 2002 17:59, Chris Barker wrote: > Just to add my $.02: > > I disagree with Eric about what the default behaviour should be. Every > programming language/environment I have ever used uses some kind of > "random" seed by default. When I want reproducible results (which I > often do for testing) I can specify a seed. I find the the most useful > behaviour. As Eric points out, it is not trivial to generate a "random" > seed (from the time, or whatever), so it doesn't make sense to burdon > the nieve user with this chore. > > Therefore, I strongly support keeping the default behaviour of a > "random" seed. In that case, and if that is the general consensus, RNG should be adapted: it now uses a fixed seed by default (and not a clock generated one). > Eric Maryniak wrote: > > then I'd urge at least to use a better initial seed. > > In certain applications, e.g. generating session id's in crypto programs, > > non-predictability of initial seeds is crucial. But if you have a look > > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it > > looks like an art in itself. So perhaps RNG's 'clock code' should replace > > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus > > will not have the 1-second problem. > > This I agree with: a better default initial seed would be great. As > someone said, "show me the code!". I don't imagine anyone would object > to improving this. The source is in Mixranf(), file Numerical/Packages/RNG/Src/ranf.c (when checked out with CVS), but it may be a good idea to check it with Python's own random/whrandom code (which I don't have at hand -- it may be more recent and/or portable for other OSes). By the way, I realized in my code 'fix' for RandomArray2.seed(x=None,y=None) that I already anticipated this and that the default behavior is _not_ to use a fixed seed ;-) : if either the x or y seed: seed < 0 ==> Use the default initial seed value. 
seed = None ==> Set a "random" value for the seed from clock (default) seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: # This would be the best, but is problematic under Windows/Mac. import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative # So best is to use Mixranf() from RNG/Src/ranf.c here. elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Unix was a trademark of AT&T. AT&T is a modem test command. From peter.chang at nottingham.ac.uk Wed Jul 24 11:08:06 2002 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Wed Jul 24 11:08:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: Just to stick my oar in: I think Eric's preference is predicated by the lousiness (or otherwise?) of RandomArray's seeding mechanism. The random sequences generated by incremental seeds should, by design, be uncorrelated thus allowing the use of the system clock as a seed source. If you're running lots of simulations (as I do with Monte Carlos, though not in numpy) using PRNGs, the last thing you want is the task to find a (pseudo) random source of seeds. Using /dev/random is not particularly portable; the system clock is much easier to obtain and is fine as long as your iteration cycle is longer than its resolution. Peter From paul at pfdubois.com Wed Jul 24 23:09:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Jul 24 23:09:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> I'm not going to change the default seed on RNG. Existing users have the right to stability, and not to have things change because someone thinks a certain choice among several reasonable ones is better than the one previously made. There is the further issue here of RNG being advertised as similar to Cray's ranf() and that similarity extends to this default. Not to mention that for many purposes the current default is quite useful. From e.maryniak at pobox.com Thu Jul 25 06:02:03 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Thu Jul 25 06:02:03 2002 Subject: [Numpy-discussion] Numarray: Summary (seeding): personal code and manual suggestions on initial seeding in module RNG and RandomArray(2) In-Reply-To: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> References: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> Message-ID: <200207251501.47126.e.maryniak@pobox.com> Dear crunchers, Please see my personal thoughts on the past discussion about initial seeds some paragraphs down below, where I'd like to list concrete code and manual enhancements aimed at providing users with a clear understanding of it's usage (and pitfalls e.g. w/r to cryptographic applications)... 
==> Suggestions for code and manual changes w/r to initial seeding (down below) But first a response to Paul's earlier message: On Thursday 25 July 2002 08:08, Paul F Dubois wrote: > I'm not going to change the default seed on RNG. Existing users have the > right to stability, and not to have things change because someone thinks > a certain choice among several reasonable ones is better than the one > previously made. Well, I wasn't aware of the fact that things were completely set in stone for Numarray solely for backward compatibilty. It was my impression that numarray and it's accompanying xx2 packages were also open for redesign. I agree stability is important, but numarray already breaks with Numeric in other aspects so why should RNG (RNG2 in numarray?) or other packages not be? It's more a matter of well documenting changes I think. Users switching to numarray will already have to take into account some changes and verify their code. It's not that I "think a certain choice among several reasonable ones is better" [although my favorite is still a fixed seed, as in RNG, for reasons of reproducibility in later re-runs of Monte Carlo's that are not possible now, because the naive user, using a clock seed, may not have saved the initial seed with get_seed], but that the different packages, i.c. RNG (RNG2 to be?) and RandomArray2, should be orthogonal in this respect. I.e. the same, so 'default always an automagical (clock whatever) random initial seed _or_ a fixed one'. Orthogonality is a very common and accepted design principle in computing science and for good reasons (usability). Users changing from one PRNG to another (and using the default seed) would otherwise be unwelcomely surprised by a sudden change in behavior of their program. I try to give logical arguments and real code examples in this discussion and fail to see in Paul's reaction where I'm wrong. By the way: in Python 2.1 alpha 2 seeding changed, too: """ - random.py's seed() function is new. For bit-for-bit compatibility with prior releases, use the whseed function instead. The new seed function addresses two problems: (1) The old function couldn't produce more than about 2**24 distinct internal states; the new one about 2**45 (the best that can be done in the Wichmann-Hill generator). (2) The old function sometimes produced identical internal states when passed distinct integers, and there was no simple way to predict when that would happen; the new one guarantees to produce distinct internal states for all arguments in [0, 27814431486576L). """ > There is the further issue here of RNG being advertised as similar to > Cray's ranf() and that similarity extends to this default. Not to > mention that for many purposes the current default is quite useful. Perhaps I'm mistaken here, but RNG/Lib/__init__.py does (-1 -> uses fixed internal seed): standard_generator = CreateGenerator(-1) and: def ranf(): "ranf() = a random number from the standard generator." return standard_generator.ranf() And indeed Mixranf in RNG/Src/ranf.c does set them to 0: ... if(*s < 0){ /* Set default initial value */ s48[0] = s48[1] = 0; Setranf(s48); Getranf(s48); And this code, or I'm missing the point, uses a standard generator from RNG, which demonstrates the same sequence of initial seeds in re-runs (note that it does not suffer from the "1-second problem" as RandomArray2 does, see the Appendix below for a demonstration of that, because RNG uses milliseconds). 
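For comparison, RNG already lets a user pick either behaviour per generator; a sketch using only the CreateGenerator() semantics quoted above:

import RNG

fixed = RNG.CreateGenerator(-1)    # seed < 0: built-in default seed, so a
                                   # fresh interpreter repeats the same stream
clocked = RNG.CreateGenerator(0)   # seed == 0: seed taken from the system clock
chosen = RNG.CreateGenerator(42)   # seed > 0: explicit, reproducible seed

print fixed.ranf(), clocked.ranf(), chosen.ranf()

The module-level standard_generator is simply the first of these, which is why every fresh session repeats 0.58011364857958725 and so on.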
Note that 'ranf()' is listed in chapter 18 in Module RNG as one of the 'Generator objects': $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> Ok, now then my own (and possibly biased) personal summary of the past discussions and concrete code and manual recommendations: ==> Suggestions for code and manual changes w/r to initial seeding Conclusions: 1. Default initial seeding should be random (and not fixed). This is the general consensus and while it may not win the beauty contest in purist software engineering circles, it also is the default behavior in Python's own Random/WHRandom modules. URL: http://web.pydoc.org/2.2/random.html => Recommendations: - Like Python's random/whrandom module, default arguments to seed() should not be 0, but None, and this triggers the default behavior which is to use a random initial seed (ideally 'truly' random from e.g. /dev/random or otherwise clock or whatever based), because: o better usability: users changing from Python's own random to numarray's random facilities will find familiar seed() usage semantics o often 0 itself can be a legal seed (although the MersenneTwister does not recommend it) - Like RNG provide support for using a built-in fixed seed by supplying negative seeds to seed(), rationale: o support for reproducible re-runs of Monte Carlo's without having to specify ones own initial seed o usability: naive users may not know a 'good' seed is, like: can it be 0 or must it be >0, what is the maximum, etc. - See my suggested code fix for RandomArray2.seed() in the Appendix below. - Likewise, in RNG: o CreateGenerator (s, ...) should be changed to CreateGenerator (s=None) Also note Python's own: def create_generators(num, delta, firstseed=None) from random (random.py), url: http://web.pydoc.org/2.2/random.html o RNG's code should be changed from testing on 0 to testing on None first (which results in using the clock), then on < 0 (use built-in seed), and then using the user provided seed (which is thus >= 0, and hence can also be 0) o 'standard_generator = CreateGenerator(-1)' should be changed to 'standard_generator = CreateGenerator() and results in using the clock - Put some explicit warnings in the numarray manual, that the seeding of numarray's packages should _not_ be used in those parts of software where unpredictability of seeds is important, such as for example, cryptographical software for creating session keys, TCP sequence numbers etc. Attacks on crypto software usually center around these issues. Ideally, a /dev/random should be used, but with the current system clock based implementation, the seeds are not random, because the clock does not have deci-nanosecond precision (10**10 ~= 2**32) yet ;-) Appendix -------- ** 1. "1-second problem" with RandomArray2: $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> import sys >>> sys.version '2.2.1 (#1, Jun 25 2002, 20:45:02) \n[GCC 2.95.3 20010315 (SuSE)]' >>> numarray.__version__ '0.3.5' >>> for i in range(3): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 
1027591910.9043469
(102759, 1911)

1027591911.901091
(102759, 1912)

1027591912.901088
(102759, 1913)

>>> for i in range(3):
...     time.time()
...     RandomArray2.seed()
...     RandomArray2.get_seed()
...     time.sleep(0.3)
...     print
... 
1027591966.260392
(102759, 1967)

1027591966.5510809
(102759, 1967)

1027591966.851079
(102759, 1967)

Note that Python's own random.seed() (at least in 2.2.1) suffers much less
from this: on my 450 MHz i586 the seed only repeats within a tenth of a
millisecond or so. For the implementation of seed() see Lib/random.py;
basically 'long(time.time() * 256)' is used:

$ python
Python 2.2.1 (#1, Jun 25 2002, 20:45:02)
...
>>> from random import *
>>> import time
>>> for i in range(3):
...     print long(time.time() * 256)
... 
263065231349
263065231349
263065231349
>>> for i in range(3):
...     print long(time.time() * 256)
...     time.sleep(.00001)
... 
263065240314
263065240315
263065240317

2. Proposed re-implementation of RandomArray2.seed():

def seed(x=None,y=None):
    """seed(x, y), set the seed using the integers x, y:

    x or y is None (or not specified):
        A random seed is used, which in the current implementation may be
        based on the system's clock. Warning: do not use this seeding in
        software where the initial seed must not be predictable, such as,
        for example, cryptographic software for creating session keys.
    x < 0 or y < 0:
        Use the module's fixed built-in seed, which is the tuple
        (1234567890L, 123456789L) (or whatever).
    x >= 0 and y >= 0:
        Use the seeds specified by the user. (Note: some random number
        generators do not recommend using 0.)

    Note: based on Python 2.2.1's random.seed(a=None), ADAPTED for _2_
    seeds as required by ranlib.set_seeds(x,y).
    """
    if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) :
        raise ArgumentError, "seed requires integer arguments (or None)."
    if x == None or y == None:
        try:
            # This would be the best, but is problematic under Windows/Mac.
            # To my knowledge there isn't a portable lib_randdevice yet.
            # As GPG, OpenSSH and OpenSSL's code show, getting entropy
            # under Windows is problematic.
            # However, Python 2.2.1's socketmodule does wrap the ssl code.
            import dev_random_device           # uses /dev/random or equivalent
            x = dev_random_device.nextvalue()  # egd.sf.net is a user space
            y = dev_random_device.nextvalue()  # alternative
        except:
            # Use Mixranf() from RNG/Src/ranf.c here or, perhaps better,
            # use Python 2.2.1's code? At least it looks simpler, does not
            # have the platform dependencies, and has possibly seen wider
            # testing (and why not re-use code? ;-)
            # For Python 2.2.1's random.seed(a=None), see url:
            #   http://web.pydoc.org/2.2/random.html
            # and file Lib/random.py.
            # Do note, however, that on my 450 MHz machine, the statement
            # 'long(time.time() * 256)' will generate the same values
            # within a tenth of a millisecond (see Appendix #1 for a code
            # example). This can be fixed by doing a time.sleep(0.001).
            # See my #EM# comment.
            # Naturally this code needs to be adapted for ranlib's
            # generator, because this code uses the Wichmann-Hill generator.
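            # Another possibility (a sketch, not part of the original
            # proposal): mix the clock with os.getpid() and a module-level
            # counter, so that two seeds taken within the same clock tick
            # still differ and no sleep is needed, e.g.:
            #
            #     import os, time
            #     _seed_counter = [0]
            #     def _clock_seed():
            #         _seed_counter[0] = _seed_counter[0] + 1
            #         return long(time.time() * 256) + os.getpid() + \
            #                _seed_counter[0]
            #
            # (os.getpid() is available on both Unix and Windows.)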
---cut: Wichmann-Hill---
    def seed(self, a=None):
        """Initialize internal state from hashable object.

        None or no argument seeds from current time.

        If a is not None or an int or long, hash(a) is used instead.

        If a is an int or long, a is used directly. Distinct values
        between 0 and 27814431486575L inclusive are guaranteed to yield
        distinct internal states (this guarantee is specific to the
        default Wichmann-Hill generator).
        """

        if a is None:
            # Initialize from current time
            import time
            a = long(time.time() * 256)
            #EM# Guarantee unique a's between subsequent calls of seed()
            #EM# by sleeping one millisecond. This should not be harmful,
            #EM# because ordinarily, seed() will only be called once or so
            #EM# in a program.
            time.sleep(0.001)

        if type(a) not in (type(3), type(3L)):
            a = hash(a)

        a, x = divmod(a, 30268)
        a, y = divmod(a, 30306)
        a, z = divmod(a, 30322)
        self._seed = int(x)+1, int(y)+1, int(z)+1
---cut: Wichmann-Hill---

    elif x < 0 or y < 0:
        x = 1234567890L   # or any other suitable 0 - 2**32-1
        y = 123456789L
    ranlib.set_seeds(x,y)

3. Mersenne Twister, another PRNG:

Bye-bye,
Eric

-- 
Eric Maryniak
WWW homepage: http://pobox.com/~e.maryniak/
Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL.

In a grocery store, the Real Programmer is the one who insists on running
the cans past the laser checkout scanner himself, because he never could
trust keypunch operators to get it right the first time.

From aureli at ipk.fhg.de  Thu Jul 25 09:51:06 2002
From: aureli at ipk.fhg.de (Aureli Soria Frisch)
Date: Thu Jul 25 09:51:06 2002
Subject: [Numpy-discussion] index method for array objects?
In-Reply-To: 
References: <20020621133705.A15296@idi.ntnu.no>
Message-ID: 

Hi all,

Has someone implemented a function for arrays that behaves like the
index(*) method for lists (it should then take something like a tolerance
parameter)?

I suppose it could maybe be done with array.tolist() and list.index(), but
has someone implemented something more elegant/array-based?

Thanks in advance

Aureli

PD: (*) index receives a value as an argument and returns the index of the
list member equal to this value...

-- 
#################################
Aureli Soria Frisch

Fraunhofer IPK
Dept. Pattern Recognition

post: Pascalstr. 8-9, 10587 Berlin, Germany
e-mail: aureli at ipk.fhg.de
fon: +49 30 39006-143
fax: +49 30 3917517

web: http://vision.fhg.de/~aureli/web-aureli_en.html
#################################

From jmiller at stsci.edu  Thu Jul 25 10:15:03 2002
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jul 25 10:15:03 2002
Subject: [Numpy-discussion] index method for array objects?
References: <20020621133705.A15296@idi.ntnu.no>
Message-ID: <3D4031C2.3090607@stsci.edu>

Aureli Soria Frisch wrote:

> Hi all,
>
> Has someone implemented a function for arrays that behaves like the
> index(*) method for lists (it should then take something like a
> tolerance parameter)?
>
> I suppose it could maybe be done with array.tolist() and list.index(),
> but has someone implemented something more elegant/array-based?
>
> Thanks in advance
>
> Aureli
>
> PD: (*) index receives a value as an argument and returns the index of
> the list member equal to this value...

I think the basics of what you're looking for are something like:

def index(a, b, eps):
    return nonzero(abs(a-b) < eps)

which should return all indices at which the absolute difference between
the elements of a and b is less than eps.
e.g.: >>> import Numeric >>> index(Numeric.arange(10,20), 15, 1e-5) array([5]) Todd -- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 From magnus at hetland.org Thu Jul 25 12:12:11 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:12:11 2002 Subject: [Numpy-discussion] Spectral approximation/DFT Message-ID: <20020725211111.A27670@idi.ntnu.no> Hi! Sorry to ask what is probably a really clueless question -- if there are any obvious sources of information about this, I'd be happy to go there and find this out for myself... :] Anyway; I'm trying to produce a graph to illustrate a time sequence indexing method, which relies on extracting the low-frequent Fourier coefficients and indexing a vector consisting of those. The graph should contain the original time sequence, and one reconstructed from the Fourier coefficients. Since it is reconstructed from only the low-frequent coefficients (perhaps 10-20 coefficients), it will look wavy and sinus'y. Now... I'm no expert in signal processing (or the specifics of FFT/DFT etc.), and I can't seem to make the FFT module do exactly what I want here... It seems that using fft(seq).real extracts the coefficients I'm after (though I'm not sure whether the imaginary components ought to figure in the equation somehow...) But no matter how I use inverse_fft or inverse_real_fft it seems I have to supply a number of coefficients equal to the sequence I want to approximate -- otherwise there will be a huge offset between them. Why is this so? Shouldn't the first coefficient take care of such an offset? Perhaps inverse_fft isn't doing what I think it is? If I haven't expressed myself clearly, I'd be happy to elaborate... (For those who might be interested, the approach is described in the paper found at http://citeseer.nj.nec.com/307308.html with a figure of the type I'm trying to produce at page 5.) Anyway, thanks for any help :) -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From magnus at hetland.org Thu Jul 25 12:16:21 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:16:21 2002 Subject: [Numpy-discussion] A probable solution... Message-ID: <20020725211534.A27914@idi.ntnu.no> After posting to the list (sorry about that ;) a possible solution occurred to me... To get an approximation, I used fft(seq, 10) and then inverted that using inverse_fft(signature, 100)... I guess that fouled up the scale of things -- when I use fft(seq, 100)[:10] to get the signature, it seems that everything works just fine... Even though this _seems_ to do the right thing, I just wanted to make sure that I'm not doing something weird here... -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From a.schmolck at gmx.net Thu Jul 25 15:18:04 2002 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu Jul 25 15:18:04 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: "Paul F Dubois" writes: > > During recent months there has been a lot of discussion about Numarray > and whether or not it should differ from Numeric in certain ways. We > have reviewed this lengthy discussion and come to some conclusions about > what we plan to do. The discussion has been valuable in that it took a > whole new "generation" back through the considerations that the > "founding fathers" debated when Numeric Python was designed. [...] 
> Decisions
>
> Numarray will have the same Python interface as Numeric except for the
> exceptions discussed below.
[...]
> 2. Currently, if the result of an index operation x[i] results in a
> scalar result, the result is converted to a similar Python type. For
> example, the result of array([1,2,3])[1] is the Python integer 2. This
> will be changed so that the result of an index operation on a Numarray
> array is always a Numarray array. Scalar results will become rank-zero
> arrays (i.e., shape () ).
>
[...]
>
> 4. The Numarray version of MA will no longer have copy semantics on
> indexing but instead will be consistent with Numarray. (The decision to
> make MA differ in this regards was due to a need for CDAT to be backward
> compatible with a local variant of Numeric; the CDAT user community no
> longer feels this was necessary).
[...]

As one of the people who argued for interface changes in numarray (mainly
copy semantics for slicing), let me say that I welcome this announcement,
which clarifies many issues. Although I still believe that copy behavior
would be preferable in principle, I think that continuity and backwards
compatibility with Numeric is a sufficient reason to stick to the old
behavior (now that numarray strives to be largely compatible) [1]. In a
similar vein I also greatly welcome the change to view semantics in MA,
because I feel that internal consistency is vital.

Apart from being a heavy Numeric user, these interface issues are also
quite important to me because I have been working for some time on a
fully-featured matrix [2] class which I wanted to be both

a) compatible with Numeric and numarray (so that it would ideally make no
   difference to the user which of the 2 libraries he'd be using as a
   "backend" to the matrix class).

b) consistent in usage with numarray's interface wherever feasible (i.e.
   not too much of a compromise on usability).

This turned out to be much more of a hassle than I would have anticipated,
because, contrary to what the compatibility section of the manual seemed
to suggest, I found numarray to be incompatible in a variety of ways (even
making it impossible to write *forward* compatible code without writing
additional wrapping functions). Just as an example, there was no simple
way that would work across both versions to do something as common as
creating e.g. an int array (both parameter names and positions differ):

 Numeric (21):       array(sequence, typecode=None, copy=1, savespace=0)
 numarray (0.3.3?) : array(buffer=None, shape=None, type=None)

As for b), this obviously turned out to be a moving target, but I hope
that the final shape of things is now getting reasonably clear, and I'm
now, for example, determined to have view slicing behavior for my matrix
class, too.

Nonetheless, for me a few issues still remain. Most importantly, numarray
doesn't provide the same degree of polymorphism as Numeric. One of the
chief reasons given for basing Numeric's design around functions rather
than methods is that it enables greater generality (e.g. allowing one to
``sum`` over all sorts of sequence types). Consequently the role of
methods and attributes was largely limited to functionality that only made
sense for array objects, plus special methods. This is more than just a
neat convenience -- because of the resulting polymorphism it is easy to
write fairly general code and to define new kinds of numeric classes that
can seamlessly be passed to Numeric functions (e.g. one can also ``sum``
Matrix objects).
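(To make the point concrete, here is a small sketch -- not code from the
original post -- of the kind of polymorphism meant above: any object that
can produce an array representation of itself via an __array__ method can
be handed straight to Numeric's functions. The Celsius class and its
values are made up purely for illustration.)

    import Numeric

    class Celsius:
        "Toy container that knows how to present itself as an array."
        def __init__(self, values):
            self.values = list(values)
        def __array__(self, typecode=None):
            # Numeric's array-conversion machinery falls back on this.
            return Numeric.array(self.values, 'd')

    temps = Celsius([21.5, 19.0, 23.5])
    print Numeric.sum(temps)          # 64.0 -- sum() never sees the class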
I find it highly undesirable that numarray apparently doesn't follow this
design rationale, and that the division of labour between functions and
methods/attributes has been blurred (or so it appears to me -- maybe this
is some lack of insight on my part). That numarray versions before 0.3.4
were missing functions such as ``shape`` (which is also quite handy for
other sequence types) was largely an inconvenience, but the fact that
numarray functions generally only operate on scalars, ``tuple``s and
``list``s (apart, obviously, from numarray arrays) is in my eyes a
significant shortcoming. In contrast, Numeric functions would operate on
any type that had an __array__ method to return an array representation of
itself. The explicit type checking that numarray uses (via constructs à la
type(a) == types.ListType) flies in the face of standard Python
sensibilities, places arbitrary limits on the kinds of objects that
numarray users can conveniently work with, and puts a significant hurdle
in the way of creating new kinds of numerical objects. For example, the
design of my matrix class depends on the fact that Numeric functions also
accept objects with __array__ methods (such as my matrix class). Even if I
invested the substantial amount of work that would be needed to redesign a
less general version that wouldn't rely on this property, one of the key
virtues of my class -- namely the ability to transparently replace Numeric
arrays in most cases where they are used as matrices -- would be lost.
These two reasons would presumably be sufficient for me not to switch to
numarray if I can at all avoid it, so I really hope that numarray will
also grow an __array__ protocol or something equivalent.

This is the only point that is really vital to me, but there are others
that I'd rather see reconsidered. As I said, I liked the division of labor
between functions and methods/attributes in Numeric and the motivations
behind it, as far as I understand them. numarray arrays, however, have
grown methods like ``argsort`` and ``diagonal`` that seem somewhat
unmotivated to me (and some of which cause problems for my matrix class).
Similarly, why is there, e.g., a ``.rank`` attribute but a ``.type()``
method? If anything one would expect type to be an attribute and rank a
method, since the type is actually some intrinsic property that needs to
be stored (and could even plausibly be assigned to, with results like an
``astype`` call), whereas ``size`` and ``rank`` have no "real" existence:
they are only computed from the shape, and modifying them makes no sense.
TMTOWTDI is the road to perl, so I'd really prefer to avoid duplicate
functionality à la ``rank(a)`` and ``a.rank`` and generally reserve
attributes and methods for array-specific functionality.

One area where TMTOWTDI seems to have run amok (several ways to do
something, but IMHO all broken) is flattened representations of arrays.
All these expressions aim to produce a flattened version of ``a``:

  ``ravel(a)``, ``a.ravel()``, ``a.getflat()``/ ``a.flat``

`Aim` in this context is some sort of euphemism -- the only one for which
it is possible to determine at compile time that it will do anything apart
from raising an exception is ``ravel(a)`` -- not that one could know
*what* it will do before the code is actually run (return a flattened copy
of a or a flattened view), but never mind. Yuck. I think this really needs
fixing (deprecating, rather than removing or changing incompatibly, where
felt necessary).
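(For anyone who wants to check what a given flattening spelling actually
does in their installation, a quick test -- a sketch, not from the
original post; the point here is the test itself, since the answer differs
between spellings and between Numeric and numarray:)

    import Numeric

    def flatten_shares_data(flatten):
        "Return true if flatten(a) is a view onto a rather than a copy."
        a = Numeric.zeros((3, 4), 'd')
        f = flatten(a)
        f[0] = 99.0                   # poke the flattened result...
        return a[0][0] == 99.0        # ...and see whether a changed too

    print flatten_shares_data(Numeric.ravel)
    print flatten_shares_data(lambda a: a.flat)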
Something else, which I consider less important: is it really necessary to
have both 'type' and 'typecode'? Wouldn't it be enough to just stick with
typecode, along the following lines (potentially issuing deprecation
warnings where appropriate):

  a.typecode() returns a type object (e.g. Float32).

  array([1,2,3], typecode=Float32) behaves the same as
  array([1,2,3], typecode='d')

Float32 etc. are already defined in Numeric, so it's easy to write
forward-compatible code, and although hunting down instances of

  if a.typecode() == 'd':

presumably wouldn't be that difficult, incompatibility could most likely
almost be eliminated by making ``Float32 == 'd'`` return true. Sticking to
the old name typecode also has the advantage that it is fairly unique and
unambiguous (just try grep'ing for type vs. typecode).

I must say that, apart from the switch to type objects, I don't fully
understand the differences between the numeric types in old Numeric and
numarray, or the motivation behind them. As far as I can see, the emphasis
with Numeric was to stay flexible with respect to different hardware and
increasing word sizes (i.e. to only guarantee minimum precision) and to
provide some reasonable "default" size for each type (e.g. `Float` being
double precision [3]). This approach is maybe somewhat similar to the
Python core (floats and ints can have different sizes, depending on the
underlying platform). In numarray the emphasis seems to have shifted to
guaranteeing the actual size in memory (if in a few years' time most
calculations are done with 128-bit precision, then that's maybe not such a
good idea, but I have no clue how likely this is to happen). Is this shift
of emphasis also responsible for the decision to have indexing operations
always return arrays rather than scalars (including ones defined by
numarray in cases where there is no plain-Python equivalent)? Will all
other functions (e.g. min) continue to return scalars? [BTW can anyone
explain to me the difference between Int and Int32 (typecodes 'i' and
'l')?]

Anyway, my apologies if I come across as too negative or if some of the
points are misinformed. I really think that the recent changes to numarray
and this announcement are a great step forward toward a smooth transition
of the whole community from Numeric to numarray, which will play an
important role in consolidating Python's role in scientific computing.

night,

alex

Footnotes: 
[1]  I think it might be beneficial, however, to add an explicit note to
     the manual that alerts users to the fact that small slices can keep
     alive very large arrays, because I am under the impression that this
     is not immediately obvious to everyone and can cause puzzling
     problems.

[2]  I moaned on this list some months ago that doing linear algebra with
     Numeric arrays was often cumbersome and inefficient (and the Matrix
     class that already comes with Numeric is rather limited). My
     (currently alpha) matrix class attempts to address these issues and
     also provides much more flexible 'pluggable' output formatting
     (matlab-like, amongst others, which I guess many people will find
     much more readable; but the standard array-like formatting is also
     available).

[3]  As an aside: maybe ``type="Float"`` in numarray should therefore
     *not* be equivalent to ``type=Float32`` but to ``type=Float64``,
     given that these strings seem to just be there for backwards
     compatibility?
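(An illustrative aside, not from the original post: the signature clash
quoted earlier is exactly the sort of thing a small wrapper has to paper
over today. Whether numarray's ``type`` argument accepts the same values
as Numeric's ``typecode`` is part of the open question, which is why the
fallback below is so blunt:)

    def typed_array(seq, t):
        "Create an array of element type t with whichever package is in use."
        try:
            return array(seq, typecode=t)   # Numeric spelling
        except TypeError:
            return array(seq, type=t)       # numarray spelling

    a = typed_array([1, 2, 3], Float32)     # assumes `array` and `Float32`
                                            # come from the package's
                                            # `from ... import *`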
-- 
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/

From victor at idaccr.org  Tue Jul 30 06:43:06 2002
From: victor at idaccr.org (Victor S. Miller)
Date: Tue Jul 30 06:43:06 2002
Subject: [Numpy-discussion] Sparse matrices
Message-ID: 

I had noticed that Travis Oliphant had a sparse.py package, but it no
longer is available (clicking on the link gives a "404"). I have a
particular kind of sparse matrix that I'd like to use for vector-matrix
multiplies. In particular, it's an n x n matrix which has at most k
non-zeros in each row (k is small, usually 2 or 3), and they sit in
consecutive locations. I have this encoded as an n x k matrix whose i-th
row gives the non-zero values in the i-th row of the big matrix, plus an
n-long vector of indices whose i-th element gives the starting position in
the i-th row. I want to multiply this matrix by a row vector v on the
left. To do the multiplication I do the following:

# loc is the location vector
n = matrix.shape[0]
mm = reshape(v,(-1,1))*matrix
w = zeros((n+m),v.typecode())
for i in range(mm.shape[0]):
    w[loc[i]:loc[i]+matrix.shape[1]] += w[i]
w = w[:n]

I would like to be able to replace the loop with some Numeric operations.
Is there a trick to do this? Note that the n that I'm using is around
100000, so storing the full matrix is out of the question (and multiplying
by that matrix would be extremely inefficient, anyway).

-- 
	Victor S. Miller     | " ... Meanwhile, those of us who can compute can hardly
	victor at idaccr.org    | be expected to keep writing papers saying 'I can do the
	CCR, Princeton, NJ   | following useless calculation in 2 seconds', and indeed
	    08540 USA        | what editor would publish them?"  -- Oliver Atkin

From victor at idaccr.org  Tue Jul 30 08:29:06 2002
From: victor at idaccr.org (Victor S. Miller)
Date: Tue Jul 30 08:29:06 2002
Subject: [Numpy-discussion] Sparse matrices
In-Reply-To: (victor@idaccr.org's message of "Tue, 30 Jul 2002 09:42:13 -0400")
References: 
Message-ID: 

Sorry, I had a typo in the program.  It should be:

# M is n by k, and represents a sparse n by n matrix A
# the non-zero entries of row i of A start in column loc[i]
# and are the i-th row of M in locations loc[i]:loc[i]+k
# loc is the location vector
n,k = M.shape
mm = reshape(v,(-1,1))*M
w = zeros((n+m),v.typecode())
# is there a trick to replace the loop below?
for i in range(mm.shape[0]):
    w[loc[i]:loc[i]+k] += mm[i]
w = w[:n]

-- 
	Victor S. Miller     | " ... Meanwhile, those of us who can compute can hardly
	victor at idaccr.org    | be expected to keep writing papers saying 'I can do the
	CCR, Princeton, NJ   | following useless calculation in 2 seconds', and indeed
	    08540 USA        | what editor would publish them?"  -- Oliver Atkin

From jochen at unc.edu  Tue Jul 30 09:24:02 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Tue Jul 30 09:24:02 2002
Subject: [Numpy-discussion] Sparse matrices
In-Reply-To: 
References: 
Message-ID: 

On Tue, 30 Jul 2002 09:42:13 -0400 Victor S Miller wrote:

Victor> I had noticed that Travis Oliphant had a sparse.py package,
Victor> but it no longer is available (clicking on the link gives a
Victor> "404").

It's part of scipy now.

Greetings,
Jochen
-- 
University of North Carolina                   phone: +1-919-962-4403
Department of Chemistry                        phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)              fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                     GnuPG key: 44BCCD8E
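(An editorial footnote on Victor's loop, not from the thread itself: the
usual trick is to loop over the k columns instead of the n rows. The
sketch below additionally assumes loc[i] == i -- i.e. the band of row i
starts on the diagonal -- and takes the original's ``n+m`` scratch length
to be ``n+k``; row i then contributes v[i]*M[i,j] to w[i+j], so the
length-n Python loop collapses into k slice additions:)

    from Numeric import reshape, zeros

    def banded_vecmat(v, M):
        "Compute v*A for the n x k band encoding M, assuming loc[i] == i."
        n, k = M.shape
        mm = reshape(v, (-1, 1)) * M        # mm[i,j] = v[i] * M[i,j]
        w = zeros(n + k, mm.typecode())
        for j in range(k):                  # k is 2 or 3, so this is cheap
            w[j:j+n] = w[j:j+n] + mm[:, j]
        return w[:n]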