From e.maryniak at pobox.com Mon Jul 1 01:48:01 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Mon Jul 1 01:48:01 2002 Subject: [Numpy-discussion] Numarray: minor feature requests (setup.py and version info) In-Reply-To: <3D1F0839.2090802@stsci.edu> References: <3D1F0839.2090802@stsci.edu> Message-ID: <200207011047.25000.e.maryniak@pobox.com> On Sunday 30 June 2002 15:31, Todd Miller wrote: > Perry Greenfield wrote: > >... > >>2. Because I'm running two versions of Python (because Zope > >> and a lot of Zope/C products depend on a particular version) > >> the 'development' Python is installed in /usr/local/bin > >> (whereas SuSE's python is in /usr/bin). > >> It probably wouldn't do any harm if the manual would include > >> a hint at the '--prefix' option and mention an alternative > >> Python installation like: > >> > >> /usr/local/bin/python ./setup.py install --prefix=/usr/local > > > >Good idea. > > I'm actually surprised that this is necessary. I was under the > impression that the distutils pick reasonable defaults simply based on > the python that is running. In your case, I would expect numarray to > install to /usr/local/lib/pythonX.Y/site-packages without specifying any > prefix. What happens on SuSE? Yes, you're probably right. On SuSE I tested it out on my own machine ('test server'), because I did not want to do it on the production server. It runs Python 2.2.1 exclusively. I remembered that I had to do this in a previous Numeric installation, where 1.5.2 and 2.1 were running side-by-side (and at that time I also had to install distutils manually). So, yes, it may not be an issue (anymore) for at least recent Pythons if you call the Python explicitly like '/usr/local/bin/python ./setup.py' and '/usr/bin/python ./setup.py' (on SuSE python goes to /usr/bin). > > >>... Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. It said 'Insert disk #3', but only two will fit. From hinsen at cnrs-orleans.fr Mon Jul 1 08:48:10 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Jul 1 08:48:10 2002 Subject: [Numpy-discussion] Scientific Python 2.4 Message-ID: <200207011543.g61FhHL25160@chinon.cnrs-orleans.fr> Scientific Python 2.4 --------------------- Scientific Python is a module library for scientific computing. In this collection you will find modules that cover basic geometry (vectors, tensors, transformations, vector and tensor fields), quaternions, automatic derivatives, (linear) interpolation, polynomials, elementary statistics, nonlinear least-squares fits, unit calculations and conversions, Fortran-compatible data formatting, 3D visualization via VRML, two Tk widgets for simple line plots and 3D wireframe models. Scientific Python also contains Python interfaces to the netCDF library (implementing a portable binary format for large arrays) and the Message Passing Interface, the most widely used communications library for parallel computers. Version 2.4 of Scientific Python has just been released. In addition to numerous small improvements and bug fixes, it contains - the high-level parallelization module Scientific.BSP - an interface to the parallelization library BSPlib (see www.bsp-worldwide.org for details) - autoregressive models for time series in Scientific.Signals.Models The BSP parallelization module was designed to facilitate development and testing of parallel programs.
Its main features are: - communication can handle almost any Python object - deadlocks are impossible by design - possibility to implement distributed data classes that can be used transparently by parallel applications - an interactive parallel interpreter that can be used inside Emacs (and perhaps other Python development environments) in order to provide an interactive parallel programming environment - parallel programs run as serial monoprocessor code on any Python installation with no changes and usually negligeable loss of performance - no need to maintain a separate serial version A tutorial on BSP programming with Python is available at the Web site and included in the distribution. For more information and for downloading, see http://dirac.cnrs-orleans.fr/ScientificPython or http://starship.python.net/crew/hinsen/scientific.html -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Mon Jul 1 16:41:28 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 1 16:41:28 2002 Subject: [Numpy-discussion] Numarray: minor feature requests (setup.py and version info) In-Reply-To: <200207011047.25000.e.maryniak@pobox.com> Message-ID: <002601c22158$90f7e900$0c01a8c0@NICKLEBY> distutils installs into the python used to run the setup.py by using the sys.exec_prefix and sys.prefix. You would not normally need to use any option unless you are trying to install something "off to the side" because, for example, you don't have write permission in that python's site-packages directory. From paul at pfdubois.com Mon Jul 1 16:50:57 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 1 16:50:57 2002 Subject: [Numpy-discussion] words that must not be spoken In-Reply-To: <200206262047.00731.e.maryniak@pobox.com> Message-ID: <002701c22159$ca4fd270$0c01a8c0@NICKLEBY> > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Eric Maryniak In the midst of a discussion Eric wrote: > > ... > shouldn't Convolve, for > orthogonality, be named > Convolve2? (cuz who knows, numarray's Convolve may be backported > to Numeric in the future, for comparative testing etc.). Use of the phrase "backported to Numeric" will result in your subscription to numpy-discussion being cancelled. (:-> No backporting is ever going to happen. This is a short one-way street or there is no purpose to travel on it. I am just back from Europython and had a chance to talk to a lot of users and have some thoughts which I will share with all of you shortly. However, since I just had to fill out a form and where it said "Date" I looked at my watch and wrote the time 11/16, I conclude that I have jet lag and can't trust myself to be lucid yet. From jae at zhar.net Mon Jul 8 02:49:01 2002 From: jae at zhar.net (John Eikenberry) Date: Mon Jul 8 02:49:01 2002 Subject: [Numpy-discussion] Optimization advice Message-ID: <20020708094805.GA370@kosh.zhar.net> I'm working on an influence map [1] for game civil [2]. I have a working version, but as a real numeric newbie I thought I'd bounce it off the people here before calling it done. I'm basically looking for an easy to understand but fast influence spreading algorithm. 
I've read that this algorithm is similar to those used to predict fire spreading or heat transfer in metal if that helps. The attached code is setup for a hex based map and the functions to take this into accounts (shift_hex_up,shift_hex_down) are probably the most naive. The others being only slight modifications of those in the life.py example. Its not really commented but its short and hopefully should be readily understandable. I've only included the base influence map class and its associated functions. If you'd like a version you can run, I can send you a .tgz setup to run in place (for *nix systems). Thanks in advance for any advice or opinions. [1] An influence map is used commonly in strategic war games. It is a simple means of capturing the areas on the game map that one side is strong vs the other side. Read the first post in this thread for a good description: http://www.gameai.com/influ.thread.html [2] Civil is a cross-platform, turn-based, networked strategy game, developed using Python, PyGame and SDL--allowing players to take part in scenarios set during the American Civil war. http://civil.sourceforge.net/ -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "They who can give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety." --B. Franklin -------------- next part -------------- # /usr/bin/env python from Numeric import * factor = array(6.).astype(Float16) edge_mod = array(0.66).astype(Float16) class InfluenceMap: def __init__(self,hex_map): self.map_size = map_size = hex_map.size self._iterations = (map_size[0] + map_size[1])/4 self.hex_map = hex_map # weightmap == influence map self.weightmap = zeros((map_size[0],map_size[1]),Float16) # constmap = initial state with constraints/constants self.constmap = zeros((map_size[0],map_size[1]),Float16) def step(self,iterations=None): constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations while iterations: # spread the influence # diamond_h neighbors = _shift_up(weightmap)/factor neighbors += _shift_left(weightmap)/factor neighbors += _shift_right(weightmap)/factor neighbors += _shift_down(weightmap)/factor neighbors += _shift_hex_up(weightmap)/factor neighbors += _shift_hex_down(weightmap)/factor # constrain initial points to prevent overheating putmask(neighbors,constmap,constmap) weightmap = neighbors iterations -= 1 self.weightmap = weightmap def shift_up(cells): return concatenate((cells[1:], cells[-1:]*edge_mod)) def shift_down(cells): return concatenate((cells[:1]*edge_mod, cells[:-1])) def shift_left(cells): return transpose(shift_up(transpose(cells))) def shift_right(cells): return transpose(shift_down(transpose(cells))) # for array layout def shift_hex_up(cells): neighbors = array(cells) # add to odd cell rows [1::2] neighbors[1::2] = shift_left(shift_up(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_up(cells))[::2] return neighbors def shift_hex_down(cells): neighbors = array(cells) # odd cell rows [1::2] neighbors[1::2] = shift_left(shift_down(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_down(cells))[::2] return neighbors From dubois1 at llnl.gov Mon Jul 8 09:10:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Mon Jul 8 09:10:04 2002 Subject: [Numpy-discussion] Caution -- // not standard Message-ID: <1026144543.13905.3.camel@ldorritt> I have run into several cases of this on different open-source projects, 
the latest being an incorrect change in Numeric's arrayobject.c: the use of // to start a comment. Many contributors who work only with Linux have come to believe that this works with other C compilers, which is not true. This construct comes from C++. Please avoid this construct when contributing changes or patches to Numeric. From bsder at mail.allcaps.org Mon Jul 8 12:03:09 2002 From: bsder at mail.allcaps.org (Andrew P. Lentvorski) Date: Mon Jul 8 12:03:09 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <1026144543.13905.3.camel@ldorritt> Message-ID: <20020708114304.T66456-100000@mail.allcaps.org> Actually, // is standard C99 released December 1, 1999 as ISO/IEC 9899:1999. It also has support for variable length arrays, a complex number type and a bunch of *portable* stuff for getting at numerical information (limits, floating-point environment) rather than nasty compiler specific hacks. ( See: http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9x.htm ) Many of these extensions are specifically for the numerical community. I would recommend taking up the issue of non-standards compliance with your compiler vendor. -a On 8 Jul 2002, Paul Dubois wrote: > I have run into several cases of this on different open-source projects, > the latest being an incorrect change in Numeric's arrayobject.c: the use > of // to start a comment. Many contributors who work only with Linux > have come to believe that this works with other C compilers, which is > not true. This construct comes from C++. Please avoid this construct > when contributing changes or patches to Numeric. From paul at pfdubois.com Mon Jul 8 12:38:05 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 8 12:38:05 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <20020708114304.T66456-100000@mail.allcaps.org> Message-ID: <001101c226b6$da5f3090$0c01a8c0@NICKLEBY> Thank you for the clarification. Unfortunately, "my" compiler vendor is the set of all compiler vendors that users of Numeric have, and we have to restrict ourselves to what works. I misspoke when I said it was "not standard"; I should have said, "doesn't work everywhere". > -----Original Message----- > From: Andrew P. Lentvorski [mailto:bsder at mail.allcaps.org] > Sent: Monday, July 08, 2002 12:02 PM > To: Paul Dubois > Cc: numpy-discussion at lists.sourceforge.net > Subject: Re: [Numpy-discussion] Caution -- // not standard > > > Actually, // is standard C99 released December 1, 1999 as > ISO/IEC 9899:1999. > > It also has support for variable length arrays, a complex > number type and a bunch of *portable* stuff for getting at > numerical information (limits, floating-point environment) > rather than nasty compiler specific hacks. ( See: > http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9> x.htm ) > > Many > of these extensions are specifically for the > numerical community. > > I would recommend taking up the issue of non-standards > compliance with your compiler vendor. > > -a > > On 8 Jul 2002, Paul Dubois wrote: > > > I have run into several cases of this on different open-source > > projects, the latest being an incorrect change in Numeric's > > arrayobject.c: the use of // to start a comment. Many > contributors who > > work only with Linux have come to believe that this works > with other C > > compilers, which is not true. This construct comes from C++. Please > > avoid this construct when contributing changes or patches > to Numeric. 
> > From jae at zhar.net Tue Jul 16 00:12:13 2002 From: jae at zhar.net (John Eikenberry) Date: Tue Jul 16 00:12:13 2002 Subject: [Numpy-discussion] Optimization advice In-Reply-To: <20020708094805.GA370@kosh.zhar.net> References: <20020708094805.GA370@kosh.zhar.net> Message-ID: <20020716070554.GB363@kosh.zhar.net> After getting some advice off the pygame list I think I have a pretty good version of my influence map now. I thought someone on this list might be interested or at least someone checking the archives. The new and improved code is around 6-7x faster. The main gain was obtained by converting all the array functions to slice notation and eliminating most of the needless copying of arrays. The new version is attached and is much better commented. It is also unabridged, as it was pointed out that it wasn't entirely clear what was going on in the last (edited) version. Hopefully things are more obvious in this one. Anyways... I just hating leaving a thread without a conclusion. Hope someone finds this useful. -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "They who can give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety." --B. Franklin -------------- next part -------------- # /usr/bin/env python # # John Eikenberry from Numeric import * from types import * FACTOR = array(6.).astype(Float32) EDGE_MOD = array(0.66).astype(Float32) ONE = array(1.).astype(Float32) ZERO = array(0.).astype(Float32) class InfluenceMap: """ There are 2 primary ways to setup the influence map, either might be useful depending on your needs. The first is to recreate the map each 'turn' the second is to keep the map around and just update it each turn. The first way is simple and easy to understand, both in terms of tweaking and later analysis. The second gives the map a sense of time and allows for fewer iterations of the spreading algorithm per 'turn'. Setting up the map to for one or the other of these is a matter of tweaking the code. There are 3 main bits of code which are described below and indicated via comments in the code. First some terminology: - weightmap stores the current influence map - neighbors is used as the memory buffer to calculate a the influence spreading - constmap contains a map with only the unit's scores present - when I refer to a 'multi-turn map' I mean using one instance of the influence map throughout the game without resetting it. [1] neighbors *= ZERO At the end of each iteraction, the neighbors take on the values of the weightmap from the previous step. This will reset those values to zero. This has a 1% performance hit. [2] putmask(neighbors,constmap,constmap) This keeps the values of the units hexes constant through all iterations. This results in about a 40% performance hit. This needs improvement. [3] setDecayRate([float]) This is meant to be used with a multi-turn map. It sets the floating point value (N>0.0<1.0)which is used on the map each turn to modify the current map before the influence spreading. No performance hit. If just [1] used then it will cause all influence values to decend toward zero. Not sure what this would be useful for, just documenting the effect. If [1] is not used (commented out) then the map values will never balance out, rising with each iteration. This is fine if you plan on resetting the influence map each turn. Allowing you to tweak the number of iterations to get the level of values you want. 
But it would cause problem with a multi-turn map unless [3] is used to keep this in check. Using [2] without [1] will accellerate the rising of the values described above. It will also lead to more variation amoung the influence values when using fewer iterations. High peaks and steep sides. Using neither [1] nor [2] the peaks are much lower. If [1] and [2] are both used the map will always attain a point of balance no matter how many iterations are run. This is desirable for maps used throughout the entire game (multi-turn maps) for obvious reasons. Given the effect of [1] this also limits the need for [3] as the influence values in areas of the map where units are no longer present will naturally decrease. Though the decay rate may still be useful for tweaking this. """ _decay_rate = None def __init__(self,hex_map): """ hex_map is the in game (civl) map object """ self.map_size = map_size = hex_map.size ave_size = (map_size[0] + map_size[1])/2 self._iterations = ave_size/2 # is the hex_map useful for anything other than size? self.hex_map = hex_map # weightmap == influence map self.weightmap = weightmap = zeros((map_size[0],map_size[1]),Float32) # constmap == initial unit locations self.constmap = zeros((map_size[0],map_size[1]),Float32) def setUnitMap(self,units): """ Put unit scores on map -units is a list of (x,y,score) tuples where x,y are map coordinates and score is the units influence modifier """ weightmap = self.weightmap constmap = self.constmap constmap *= ZERO # mayby use the hex_map here to get terrain effects? for (x,y,score) in units: weightmap[x,y] = score constmap[x,y]=score def setInterations(self,iterations): """ Set number of times through the influence spreading loop """ assert type(iterations) == IntType, "Bad arg type: setIterations([int])" self._iterations = iterations # [3] above def setDecayRate(self,rate): """ Set decay rate for a multi-turn map. """ assert type(rate) == FloatType, "Bad arg type: setDecayRate([float])" self._decay_rate = array(rate).astype(Float32) def reset(self): """ Reset an existing map back to zeros """ map_size = self.map_size self.weightmap = zeros((map_size[0],map_size[1]),Float32) def step(self,iterations=None): """ One set of loops through influence spreading algorithm """ # save lookup time constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations # decay rate can be used when the map is kept over duration of game, # instead of a new one each turn. the old values are retained, # degrading slowly over time. this allows for fewer iterations per turn # and gives sense of time to the map. its experimental at this point. if self._decay_rate: weightmap = weightmap * self._decay_rate # It might be possible to pre-allocate the memory for neighbors in the # init method. But I'm not sure how to update that pre-allocated array. 
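        # (Editorial note, not part of the original posting: one way to update a
        # pre-allocated buffer in Numeric is slice assignment. A hypothetical
        # attribute created in __init__ as
        #     self._neighbors = zeros((map_size[0], map_size[1]), Float32)
        # could be refreshed here with
        #     self._neighbors[:,:] = weightmap
        # which copies values into the existing array instead of allocating a
        # new one each call.)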
neighbors = weightmap.copy() # spread the influence while iterations: # [1] in notes above # neighbors *= ZERO # diamond_hex layout neighbors[:-1,:] += weightmap[1:,:] # shift up neighbors[1:,:] += weightmap[:-1,:] # shift down neighbors[:,:-1] += weightmap[:,1:] # shift left neighbors[:,1:] += weightmap[:,:-1] # shift right neighbors[1::2][:-1,:-1] += weightmap[::2][1:,1:] # hex up (even) neighbors[1::2][:,:-1] += weightmap[::2][:,1:] # hex down (even) neighbors[::2][:,1:] += weightmap[1::2][:,:-1] # hex up (odd) neighbors[::2][1:,1:] += weightmap[1::2][:-1,:-1] # hex down (odd) # keep influence values balanced neighbors *= (ONE/FACTOR) # [2] above - maintain scores in unit hexes # putmask(neighbors,constmap,constmap) # 'putmask' adds almost 40% to the overhead. There should be a # faster way. A little testing seems to show that this problem is # related to the usage of floats for the map values. # prepare for next iteration weightmap,neighbors = neighbors,weightmap iterations -= 1 # save for next turn self.weightmap = weightmap From paul at pfdubois.com Thu Jul 18 14:47:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Jul 18 14:47:02 2002 Subject: [Numpy-discussion] [ANNOUNCE] Pyfort 8.0 Message-ID: <001f01c22ea4$89ff4f40$0b01a8c0@NICKLEBY> Pyfort 8.0 has been released at SourceForge (sf.net/projects/pyfortran) Version 8 This version contains a new facility for making and installing projects. Old compile lines will still work, but will produce an equivalent .pfp file that you could use in the future. Included is a Tkinter-based GUI editor for the project files. However, the format of the files is simple and they could be edited with a text editor as well. There is improved support for installing Pyfort and the modules it creates in a location other than inside Python. See README. This version does change the installation location for an extension. Therefore, you should remove the files of any previous installation from your Python. Yes, this is annoying. That is why we are doing it, so that we can have an "uninstall" command. A new "windows" subdirectory has been added, containing an example of how to use Pyfort on Windows with Visual Fortran. Thanks to Reinhold Niesner. Testing of, and advice about, this are needed from Windows users. The pyfort script itself is also now installed as a .bat script for win32. Support for Mac OSX (Darwin) added. From biesingert at yahoo.com Fri Jul 19 01:13:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Fri Jul 19 01:13:03 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 Message-ID: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Hi, when I try to install NumPy on Mac OS X.1.5, it fails on this error: .... cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so /usr/bin/ld: -undefined error must be used when -twolevel_namespace is in effect error: command 'cc' failed with exit status 1 ~/Python/Numeric-21.3 % cc cc: No input files I had thought to submit this to the developers section of the list but could not find the way to subscribe to it ;-) If somehow had a running version of NumPy with for Mac OSX http://tony.lownds.com/macosx, I would appreciate it. Thanks everyone for their help! Regards, Thomas __________________________________________________ Do You Yahoo!? Yahoo! 
Autos - Get free new car price quotes http://autos.yahoo.com From rob at pythonemproject.com Fri Jul 19 05:35:04 2002 From: rob at pythonemproject.com (rob) Date: Fri Jul 19 05:35:04 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 References: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: <3D3806B3.F2BE1C1A@pythonemproject.com> Thomas Biesinger wrote: > > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect > error: command 'cc' failed with exit status 1 > ~/Python/Numeric-21.3 % cc > cc: No input files > > I had thought to submit this to the developers section of the list > but could not find the way to subscribe to it ;-) > > If somehow had a running version of NumPy with for Mac OSX > http://tony.lownds.com/macosx, I would appreciate it. > > Thanks everyone for their help! > > Regards, > Thomas > > __________________________________________________ > Do You Yahoo!? > Yahoo! Autos - Get free new car price quotes > http://autos.yahoo.com > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion Hi Thomas, sorry I don't have the expertise to help you with your question. I am wondering if you are using one of Apple's new G4 machines? I'm curious about the floating point performance of those chips. If you ever get Numpy working, I have a routine that I use for a benchmark, a Norton-Summerfeld ground (antenna) simulation routine that I could send to you. The record for me is 120s on a P4 1.8Ghz at work, but I'm sure the new Xeons would beat that, and maybe the new Athlons. My 1.2Ghz DDR Athlon is much slower than the P4, but the clock speeds are so much different. Rob. -- ----------------------------- The Numeric Python EM Project www.pythonemproject.com From welch at cs.unc.edu Fri Jul 19 05:52:01 2002 From: welch at cs.unc.edu (Greg Welch) Date: Fri Jul 19 05:52:01 2002 Subject: FW: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <200207191053.g6JArGbE017359@wren.cs.unc.edu> Message-ID: Thomas, I have (recently) built Numeric 21.3 on multiple OS X 10.1.5 platforms, and have had no problems that I know of. I am using Python 2.3a0 but had also built Numeric w/ earlier versions of Python too. All platforms have the April 2002 developer tools update. I just noticed that your compile line shows the use of cc, as opposed to gcc. Here is the corresponding compile line for 21.3 on my powerbook (Python 2.3a0): gcc -bundle -bundle_loader /usr/local/bin/python build/temp.darwin-5.5-Power Macintosh-2.3/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.3/arrayobject.o build/temp.darwin-5.5-PowerMacintosh-2.3/ufuncobject.o -o build/lib.darwin-5.5-Power Macintosh-2.3/_numpy.so --Greg -------------- next part -------------- An embedded message was scrubbed... 
From: unknown sender Subject: no subject Date: no date Size: 38 URL: From Jack.Jansen at oratrix.com Fri Jul 19 14:17:02 2002 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Fri Jul 19 14:17:02 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: On vrijdag, juli 19, 2002, at 10:12 , Thomas Biesinger wrote: > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect Thomas, as of MacOSX 10.1 the link step needs either the -flat_namespace option, or the -bundle_loader option. But: this has been fixed in both Python 2.2.1 and Python 2.3a0 (the CVS tree). Are you by any chance still running Python 2.2 (which predates OSX 10.1, and therefore two-level namespaces, and therefore the right linker invocations, which distutils reads from Python's own Makefile). If you're running 2.2: please upgrade and try again. If you're running 2.2.1 or later: let me know and I'll try and think of what questions I should ask you to debug this:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From paul at pfdubois.com Mon Jul 22 16:14:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 22 16:14:03 2002 Subject: [Numpy-discussion] Numarray design announcement Message-ID: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> At numpy.sf.net you will find a posting from Perry Greenfield and I detailing the design decisions we have taken with respect to Numarray. What follows is the text of that message without the formatting.
We ask for your understanding about those decisions that differ from the ones you might prefer. Numarray's Design Paul F. Dubois and Perry Greenfield Numarray is the new implementation of the Numeric Python extension. It is our intention that users will change as rapidly as possible to the new module when we decide it is ready. The present Numeric Python team will cease supporting Numeric after a short transition period. During recent months there has been a lot of discussion about Numarray and whether or not it should differ from Numeric in certain ways. We have reviewed this lengthy discussion and come to some conclusions about what we plan to do. The discussion has been valuable in that it took a whole new "generation" back through the considerations that the "founding fathers" debated when Numeric Python was designed. There are literally tens of thousands of Numerical Python users. These users may represent only a tiny percentage of potential users but they are real users today with real code that they have written, and breaking that code would represent real harm to real people. Most of the issues discussed recently were discussed at length when Numeric was first designed. Some decisions taken then represent a choice that was simply a choice among valid alternatives. Nevertheless, the choice was made, and to arbitrarily now make a different choice would be difficult to justify. In arguing about Python's indentation, we often see heart-felt arguments from opponents who have sincere reasons for feeling as they do. However, many of the pitfalls they point to do not seem to actually occur in real life very often. We feel the same way about many arguments about Numeric Python. The view / copy argument, for example, claims that beginners will make errors with view semantics. Well, some do, but not very often, and not twice. It is just one of many differences that users need to adapt to when learning an entity-object model such as Python's when they are used to variable semantics such as in Fortran or C. Similarly, we do not receive massive reports of confusion about differing default values for the axis keyword -- there was a rationale for the way it is now, and although one could propose a different rationale for a different choice, it would be just a choice. Decisions Numarray will have the same Python interface as Numeric except for the exceptions discussed below. 1. The Numarray C API includes a compatibility layer consisting of some of the members of the Numeric C API. For details on compatibility at the C level see http://telia.dl.sourceforge.net/sourceforge/numpy/numarray.pdf , pdf pages 78-81. Since no formal decision was ever made about what parts of the Numeric C header file were actually intended to be publicly available, do not expect complete emulation. Numarray's current view of arrays in C, using either native or emulation C-APIs, is that array data can be mutated, but array properties cannot. Thus, an existing Numeric extension function which tries to change the shape or strides of an array in C is more of a porting challenge, possibly requiring a python wrapper. Depending on what kind of optimization we do, this restriction might be lifted. For the Numeric extensions already ported to Numarray (RandomArray, LinearAlgebra, FFT), none of this was an issue. 2. Currently, if the result of an index operation x[i] results in a scalar result, the result is converted to a similar Python type. For example, the result of array([1,2,3])[1] is the Python integer 2. 
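(Editorial illustration, not part of the original announcement: a minimal Numeric session showing the current behaviour just described; the printed output is approximate.)

---cut---
from Numeric import array

x = array([1, 2, 3])
s = x[1]
print s        # prints: 2
print type(s)  # prints: <type 'int'> -- a plain Python scalar, not an array,
               # so code that indexes an array cannot assume it still has an array
---cut---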
This will be changed so that the result of an index operation on a Numarray array is always a Numarray array. Scalar results will become rank-zero arrays (i.e., shape () ). 3. Currently, binary operations involving Numeric arrays and Python scalars uses the precision of the Python scalar to help determine the precision of the result. In Numarray, the precision of the array will have precedence in determining the precision of the outcome. Full details are available in the Numarray documention. 4. The Numarray version of MA will no longer have copy semantics on indexing but instead will be consistent with Numarray. (The decision to make MA differ in this regards was due to a need for CDAT to be backward compatible with a local variant of Numeric; the CDAT user community no longer feels this was necessary). Some explanation about the scalar change is in order. Currently, much coding in Numeric-based applications must be devoted to handling the fact that after an index operation, the programmer can not assume that the result is an array. So, what are the consequences of change? A rank-zero array will interact as expected with most other parts of Python. When it does not, the most likely result is a type error. For example, let x = array([1,2,3]). Then [1,2,3][x[0]] currently produces the result 2. With the change, it would produce a type error unless a change is made to the Python core (currently under discussion). But x[x[0]] would still work because we have control of that. In short, we do not think this change will break much code and it will prevent the writing of more code that is either broken or difficult to write correctly. From pete at shinners.org Mon Jul 22 17:36:12 2002 From: pete at shinners.org (Pete Shinners) Date: Mon Jul 22 17:36:12 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: <3D3CA3B1.7010708@shinners.org> Paul F Dubois wrote: > Numarray's Design > Paul F. Dubois and Perry Greenfield a very nice design, for a lot of challenging decisions > Numarray's current view of arrays in C, using either native or > emulation C-APIs, is that array data can be mutated, but array > properties cannot. Thus, an existing Numeric extension function > which tries to change the shape or strides of an array in C is > more of a porting challenge, possibly requiring a python wrapper. i have a c extension that does this, but only during "creation time" of the array. i'm hoping there can be some way to do this from C. i need to create a new array from a block of numbers that aren't contiguous... /* roughly snipped code */ dim[0] = myimg->w; dim[1] = myimg->h; dim[2] = 3; /*r,g,b*/ array = PyArray_FromDimsAndData(3, dim, PyArray_UBYTE, startpixel); array->flags = OWN_DIMENSIONS|OWN_STRIDES; array->strides[2] = pixelstep; array->strides[1] = myimg->pitch; array->strides[0] = myimg->format->BytesPerPixel; array->base = myimg_object; note this data is image data, and i am "reorienting" it so that the first index is X and the second index is Y. plus i need to account for an image pitch, where the rows are not exactly the same width as the number of pixels. also, i am also changing the "base" field, since the data for this array lives inside another image object of course, once the array is created, i pass it off to the user and never touch these fields again, so perhaps something like this will work in the new numarray? 
if not, i'm eager to start my petition for a "PyArray_FromDimsAndDataAndStrides" function, and also a way to assign the "base" as well. i'm looking forward to the new numarray, looks very exciting. From biesingert at yahoo.com Mon Jul 22 23:54:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Mon Jul 22 23:54:03 2002 Subject: [Numpy-discussion] Summary to NumPy on Mac OS 10.1.5 Message-ID: <20020723065343.73589.qmail@web14106.mail.yahoo.com> From e.maryniak at pobox.com Tue Jul 23 09:19:04 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 09:19:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug Message-ID: <200206261833.29702.e.maryniak@pobox.com> Dear crunchers, According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) (with /my/ emphasis): The seed() function takes two integers and sets the two seeds of the random number generator to those values. If the default values of 0 are used for /both/ x and y, then a seed is generated from the current time, providing a /pseudo-random/ seed. Note: in numarray, the RandomArray2 package is provided but it's description is not (yet) included in the numarray manual. I have some questions about this: 1. The implementation of seed(), which is, by the way, identical both in Numeric's RandomArray.py and numarray's RandomArray2.py seems to contradict it's usage description: ---cut--- def seed(x=0,y=0): """seed(x, y), set the seed using the integers x, y; Set a random one from clock if y == 0 """ if type (x) != IntType or type (y) != IntType : raise ArgumentError, "seed requires integer arguments." if y == 0: import time t = time.time() ndigits = int(math.log10(t)) base = 10**(ndigits/2) x = int(t/base) y = 1 + int(t%base) ranlib.set_seeds(x,y) ---cut--- Shouldn't the second 'if' be: if x == 0 and y == 0: With the current implementation: - 'seed(3)' will actually use the clock for seeding - it is impossible to specify 0's (0,0) as seed: it might be better to use None as default values? 2. With the current time.time() based default seeding, I wonder if you can call that, from a mathematical point of view, pseudo-random: ---cut--- $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> numarray.__version__ '0.3.5' >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 1027434978.406238 (102743, 4979) 1027434979.400319 (102743, 4980) 1027434980.400316 (102743, 4981) 1027434981.40031 (102743, 4982) 1027434982.400308 (102743, 4983) ---cut--- It is incremental, and if you use default seeding within one (1) second, you get the same seed: ---cut--- >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(0.1) ... print ... 1027436537.066677 (102743, 6538) 1027436537.160303 (102743, 6538) 1027436537.260363 (102743, 6538) 1027436537.360299 (102743, 6538) 1027436537.460363 (102743, 6538) ---cut--- 3. I wonder what the design philosophy is behind the decision to use 'mathematically suspect' seeding as default behavior. Apart from the fact that it can hardly be called 'random', I also have the following problems with it: - The RandomArray2 module initializes with 'seed()' itself, too. 
Reload()'s of RandomArray2, which might occur outside the control of the user, will thus override explicit user's seeding. Or am I seeing ghosts here? - When doing repeated run's of one's neural net simulations that each take less than a second, one will get identical streams of random numbers, despite seed()'ing each time. Not quite what you would expect or want. - From a purist software engineering point of view, I don't think automagical default behavior is desirable: one wants programs to be deterministic and produce reproducible behavior/output. If you use default seed()'ing now and re-run your program/model later with identical parameters, you will get different output. In Eiffel, object attributes are always initialized, and you will almost never have irreproducible runs. I found that this is a good thing for reproducing ones bugs, too ;-) To summarize, my recommendation would be to use None default arguments and use, when no user arguments are supplied, a hard (built-in) seed tuple, like (1,1) or whatever. Sometimes a paper on a random number generator suggests seeds (like 4357 for the MersenneTwister), but of course, a good random number generator should behave well independently of the initial seed/seed-tuple. I may be completely mistaken here (I'm not an expert on random number theory), but the random number generators (Ahrens, et. al) seem 'old'? After some studying, we decided to use the Mersenne Twister: http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html http://www.math.keio.ac.jp/~matumoto/emt.html PDF article: http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator", ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, January pp.3-30 1998 There are some Python wrappers and it has good performance as well. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Hail Caesar! We, who are about to dine, salad you. From jmiller at stsci.edu Tue Jul 23 11:56:04 2002 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 23 11:56:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <200206261833.29702.e.maryniak@pobox.com> Message-ID: <3D3DA67E.308@stsci.edu> Eric Maryniak wrote: >Dear crunchers, > >According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) >(with /my/ emphasis): > > The seed() function takes two integers and sets the two seeds > of the random number generator to those values. If the default > values of 0 are used for /both/ x and y, then a seed is generated > from the current time, providing a /pseudo-random/ seed. > >Note: in numarray, the RandomArray2 package is provided but it's >description is not (yet) included in the numarray manual. > >I have some questions about this: > >1. The implementation of seed(), which is, by the way, identical > both in Numeric's RandomArray.py and numarray's RandomArray2.py > seems to contradict it's usage description: > The 2 in RandomArray2 is there to support side-by-side testing with Numeric, not to imply something new and improved. The point of providing RandomArray2 is to provide a migration path for current Numeric users. To that end, RandomArray2 should be functionally identical to RandomArray. That should not, however, discourage you from writing a new and improved random number package for numarray. 
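(Editorial sketch, not part of Todd's reply: one minimal shape such an add-on package could take -- an array-returning wrapper over Python's standard 'random' module. The function name, and the use of Numeric calls rather than numarray's then-settling API, are illustrative assumptions only.)

---cut---
import random
import Numeric

def random_array(shape, seed=None):
    """Uniform [0, 1) samples with the given shape; reproducible when seeded."""
    if seed is not None:
        random.seed(seed)          # an explicit seed gives a deterministic stream
    n = 1
    for dim in shape:              # total number of elements
        n = n * dim
    values = [random.random() for i in range(n)]
    return Numeric.reshape(Numeric.array(values, Numeric.Float), shape)
---cut---

Called as random_array((3, 3), seed=42) it returns the same 3x3 array on every run; with the seed omitted it simply inherits whatever state the 'random' module is in.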
> > >---cut--- >def seed(x=0,y=0): > """seed(x, y), set the seed using the integers x, y; > Set a random one from clock if y == 0 > """ > if type (x) != IntType or type (y) != IntType : > raise ArgumentError, "seed requires integer arguments." > if y == 0: > import time > t = time.time() > ndigits = int(math.log10(t)) > base = 10**(ndigits/2) > x = int(t/base) > y = 1 + int(t%base) > ranlib.set_seeds(x,y) >---cut--- > > Shouldn't the second 'if' be: > > if x == 0 and y == 0: > > With the current implementation: > > - 'seed(3)' will actually use the clock for seeding > - it is impossible to specify 0's (0,0) as seed: it might be > better to use None as default values? > >2. With the current time.time() based default seeding, I wonder > if you can call that, from a mathematical point of view, > pseudo-random: > >---cut--- >$ python >Python 2.2.1 (#1, Jun 25 2002, 20:45:02) >[GCC 2.95.3 20010315 (SuSE)] on linux2 >Type "help", "copyright", "credits" or "license" for more information. > >>>>from numarray import * >>>>from RandomArray2 import * >>>>import time >>>>numarray.__version__ >>>> >'0.3.5' > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(1) >... print >... >1027434978.406238 >(102743, 4979) > >1027434979.400319 >(102743, 4980) > >1027434980.400316 >(102743, 4981) > >1027434981.40031 >(102743, 4982) > >1027434982.400308 >(102743, 4983) >---cut--- > > It is incremental, and if you use default seeding within > one (1) second, you get the same seed: > >---cut--- > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(0.1) >... print >... >1027436537.066677 >(102743, 6538) > >1027436537.160303 >(102743, 6538) > >1027436537.260363 >(102743, 6538) > >1027436537.360299 >(102743, 6538) > >1027436537.460363 >(102743, 6538) >---cut--- > >3. I wonder what the design philosophy is behind the decision > to use 'mathematically suspect' seeding as default behavior. > Using time for a seed is fairly common. Since it's an implementation detail, I doubt anyone would object if you can suggest a better default seed. > > Apart from the fact that it can hardly be called 'random', I also > have the following problems with it: > > - The RandomArray2 module initializes with 'seed()' itself, too. > Reload()'s of RandomArray2, which might occur outside the > control of the user, will thus override explicit user's seeding. > Or am I seeing ghosts here? > Overriding a user's explicit seed as a result of a reload sounds correct to me. All of the module's top level statements are re-executed during a reload. > > - When doing repeated run's of one's neural net simulations that > each take less than a second, one will get identical streams of > random numbers, despite seed()'ing each time. > Not quite what you would expect or want. > This is easy enough to work around: don't seed or re-seed. If you then need to make multiple simulation runs, make a separate module and call your simulation like: import simulation RandomArray2.seed(something_deterministic, something_else_deterministic) for i in range(number_of_runs): simulation.main() > > - From a purist software engineering point of view, I don't think > automagical default behavior is desirable: one wants programs to > be deterministic and produce reproducible behavior/output. > I don't know. I think by default, random numbers *should be* random. 
> > If you use default seed()'ing now and re-run your program/model > later with identical parameters, you will get different output. > When you care about this, you need to set the seed to something deterministic. > > In Eiffel, object attributes are always initialized, and you will > almost never have irreproducible runs. I found that this is a good > thing for reproducing ones bugs, too ;-) > This sounds like a good design principle, but I don't see anything in RandomArray2 which is keeping you from doing this now. > >To summarize, my recommendation would be to use None default arguments >and use, when no user arguments are supplied, a hard (built-in) seed >tuple, like (1,1) or whatever. > Unless there is a general outcry from the rest of the community, I think the (existing) numarray extensions (RandomArray2, LinearAlgebra2, FFT2) should try to stay functionally identical with Numeric. > >Sometimes a paper on a random number generator suggests seeds (like 4357 >for the MersenneTwister), but of course, a good random number generator >should behave well independently of the initial seed/seed-tuple. >I may be completely mistaken here (I'm not an expert on random number >theory), but the random number generators (Ahrens, et. al) seem 'old'? >After some studying, we decided to use the Mersenne Twister: > An array enabled version might make a good add-on package for numarray. > > > http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html > http://www.math.keio.ac.jp/~matumoto/emt.html > >PDF article: > > http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf > > M. Matsumoto and T. Nishimura, > "Mersenne Twister: A 623-dimensionally equidistributed uniform > pseudorandom number generator", > ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, > January pp.3-30 1998 > >There are some Python wrappers and it has good performance as well. > >Bye-bye, > >Eric > Bye, Todd From e.maryniak at pobox.com Tue Jul 23 13:03:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 13:03:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3DA67E.308@stsci.edu> References: <200206261833.29702.e.maryniak@pobox.com> <3D3DA67E.308@stsci.edu> Message-ID: <200207232202.04104.e.maryniak@pobox.com> On Tuesday 23 July 2002 20:54, Todd Miller wrote: > Eric Maryniak wrote: > >... > That should not, however, discourage you from writing a new and improved > random number package for numarray. Yes, thank you :-) > >... > >3. I wonder what the design philosophy is behind the decision > > to use 'mathematically suspect' seeding as default behavior. > > Using time for a seed is fairly common. Since it's an implementation > detail, I doubt anyone would object if you can suggest a better default > seed. Well, as said, a fixed seed, provided by the class implementation and therefore 'good', instead of a not-so-random 'random' seed. And imho it would be better not to (only) use the clock, but a /dev/random kinda thing. Personally, I find the RNG setup much more appealing: there the default is: standard_generator = CreateGenerator(-1) where seed < 0 ==> Use the default initial seed value. seed = 0 ==> Set a "random" value for the seed from the system clock. seed > 0 ==> Set seed directly (32 bits only). And indeed 'void Mixranf(int *s,u32 s48[2])' uses a built-in constant as initial seed value (actually, two). >... 
> > If you use default seed()'ing now and re-run your program/model > > later with identical parameters, you will get different output. > > When you care about this, you need to set the seed to something > deterministic. Naturally, but how do I know what a 'good' seed is (or indeed it's type, range, etc.)? I just would like, as e.g. RNG does, let the number generator take care of this... (or at least provide the option to) >... In the programs I've seen so far, including a lot of ours ahem, usually a program (simulation) is run multiple times with the same parameters and, in our case for neural nets, seeded each time with a clock generated seed and then the different simulations are compared and checked if they are similar or sensitive to chaotic influences. But I don't think this is the proper way to do this. My point is, I guess, that the sequence of these clock-generated seeds itself is not random, because (as for RandomArray) the generated numbers are clearly not random. Better, and reproducible, would be to start the first simulation with a supplied seed, get the seed and pickle after the first run and use the pickled seed for run 2 etc. or indeed have a kind of master script (as you suggest) that manages this. That way you would start with one seed only and are not re-seeding for each run. Because if the clock-seeds are not truly random, you will a much greater change of cycles in your overall sequence of numbers. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. VME ERROR 37022: Hierarchic name syntax invalid taking into account starting points defined by initial context. From paul at pfdubois.com Tue Jul 23 13:14:05 2002 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jul 23 13:14:05 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207232202.04104.e.maryniak@pobox.com> Message-ID: <3D36139400005515@mta08.san.yahoo.com> RandomArray got a "special" position as part of Numeric simply by historical accident in being there first. I think in the conversion to Numarray we will be able to remove such things from the "core" and make more of a marketplace of equals for the "addons". As it is now there is some implication that somehow one is "better" than the other, which is unjustified either mathematically or in the sense of design. RNG's design is based on my experience with large codes needing many independent streams. The mathematics is from a well-tested Cray algorithm. I'm sure it could use fluffing up but a good case can be made for it. From gb at cs.unc.edu Tue Jul 23 14:24:03 2002 From: gb at cs.unc.edu (Gary Bishop) Date: Tue Jul 23 14:24:03 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? Message-ID: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> The example given for real_fft in the FFT section of the Sept 7, 2001 Numpy manual makes no sense to me. The text says >>> x = cos(arange(30.0)/30.0*2*pi) >>> print real_fft(x) [ -1. +0.j 13.69406641+2.91076367j -0.91354546-0.40673664j -0.80901699-0.58778525j -0.66913061-0.74314483j -0.5 -0.8660254j -0.30901699-0.95105652j -0.10452846-0.9945219j 0.10452846-0.9945219j 0.30901699-0.95105652j 0.5 -0.8660254j 0.66913061-0.74314483j 0.80901699-0.58778525j 0.91354546-0.40673664j 0.9781476 -0.20791169j 1. +0.j ] But surely x is a single cycle of a cosine wave and should have a very sensible and simple FT. Namely [0, 1, 0, 0, 0, ...] 
Indeed, running the example using Numeric and FFT produces, within rounding error, exactly what I would expect. Why the non-intuitive (and wrong) result in the example text? gb From dubois1 at llnl.gov Tue Jul 23 14:32:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Tue Jul 23 14:32:04 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? In-Reply-To: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> References: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> Message-ID: <1027459879.8212.2.camel@ldorritt> The person who wrote the manual cut and pasted from running the code. I think there was a bug in FFT at the time. (:-> On Tue, 2002-07-23 at 14:23, Gary Bishop wrote: > The example given for real_fft in the FFT section of the Sept 7, 2001 > Numpy manual makes no sense to me. The text says > > >>> x = cos(arange(30.0)/30.0*2*pi) > >>> print real_fft(x) > [ -1. +0.j 13.69406641+2.91076367j > -0.91354546-0.40673664j -0.80901699-0.58778525j > -0.66913061-0.74314483j -0.5 -0.8660254j > -0.30901699-0.95105652j -0.10452846-0.9945219j > 0.10452846-0.9945219j 0.30901699-0.95105652j > 0.5 -0.8660254j 0.66913061-0.74314483j > 0.80901699-0.58778525j 0.91354546-0.40673664j > 0.9781476 -0.20791169j 1. +0.j ] > > But surely x is a single cycle of a cosine wave and should have a very > sensible and simple FT. Namely [0, 1, 0, 0, 0, ...] > > Indeed, running the example using Numeric and FFT produces, within > rounding error, exactly what I would expect. > > Why the non-intuitive (and wrong) result in the example text? > > gb > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From e.maryniak at pobox.com Wed Jul 24 09:24:14 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 09:24:14 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D36139400005515@mta08.san.yahoo.com> References: <3D36139400005515@mta08.san.yahoo.com> Message-ID: <200207241823.42218.e.maryniak@pobox.com> On Tuesday 23 July 2002 22:15, paul at pfdubois.com wrote: > RandomArray got a "special" position as part of Numeric simply by > historical accident in being there first. I think in the conversion to > Numarray we will be able to remove such things from the "core" and make > more of a marketplace of equals for the "addons". As it is now there is > some implication that somehow one is "better" than the other, which is > unjustified either mathematically or in the sense of design. > > RNG's design is based on my experience with large codes needing many > independent streams. The mathematics is from a well-tested Cray algorithm. > I'm sure it could use fluffing up but a good case can be made for it. A famous quote from Linus is "Nice idea. Now show me the code." Perhaps a detailed example makes my problem clearer, because as it is now, RNG and RandomArray2 are not orthogonal in design, in the sense that RNG's default seed is fixed and RandomArray's is automagical (clock), not reproducible and mathematically suspect, which I think is not good for the more naive Python user. 
Below I will give intended usage in a provocative way, but please don't take me too seriously (I know, I don't ;-) Let's say you have a master shell script that runs a neural net paradigm (size 20x20) 10 times, each time with the same parameters, to see if it's stable or chaotic, i.e. does not 'converge' c.q. outcome depends on initial values (it should not be chaotic, but this should always be checked). run10.sh tracelink.py 20 20 inputpat.dat > hippocamp01.out ... 8 more ... tracelink.py 20 20 inputpat.dat > hippocamp10.out tracelink.py ... import numarray, RandomArray2 _or_ RNG ... # Case 1: RandomArray2 # User uses default clock seed, which is the same # during 1 second (see my previous posting). # ignlgi(void)'s seeds 1234567890L,123456789L # are _not_ used (see com.c). RandomArray2.seed() # But if omitted, RandomArray2.py does it, too. ... calculations ... other program outcome _only_ if program runs > 1 second, ... otherwise the others will have the same result. # Case 2: RNG # A 'standard_generator = CreateGenerator(-1)' is automatically done. # seed < 0 ==> Use the default initial seed value. # seed = 0 ==> Set a "random" value for the seed from system clock. # seed > 0 ==> Set seed directly (32 bits only). # Thus, the fixed seeds used are 0,0 (see Mixranf() in ranf.c). ... calculations ... all 10 programs have the same outcome when using ranf(), ... because it always starts the same seed, the sequence is always: ... 0.58011364857958725, 0.95051273498076583, 0.78637142533060356 etc. The problem with RandomArray's seed is, that it is not truly random itself. In it's current (time.time based) implementation it is linearly auto incrementing every second, and therefore suffers from auto-correlation. Moreover, in the above example, if 10 separate .py runs complete in 1 second they'll all have the same seed (and outcome). This is not what the user, if accustomed to clock seeding, would expect. But if the seed is different each time, a problem is that runs are not reproducible. Let's say that run hippocamp06.out produced some strange output: now unless the user saved the seed (with get_seed), it can never be reproduced. Therefore, I think RNG's design is better and should be applied to RandomArray2, too, because RandomArray2's seeding is flawed anyways. A user should be aware of proper seeding, agreed, and now will be: when doing multiple identical runs, the same (and thus reproducible) output will result and so the user is made aware of the fact that, as an example, he or she should seed or pickle it between runs. So my suggestion would be to re-implement RandomArray2.seed(x=0,y=0) as follows: if either the x or y seed: seed < 0 ==> Use the default initial seed value. seed = None ==> Set a "random" value for the seed from the system clock. seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- But: I realize that this is different behavior from Python's standard random and whrandom, where no arg or None uses the clock. 
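As an aside, the bookkeeping Eric describes earlier in the thread (seed the first run, then pickle the seed so later runs are reproducible) only takes a few lines with the existing RandomArray2 interface. This is a minimal sketch; the file name seed.pkl and the surrounding script structure are made up:

import os, pickle
import RandomArray2

SEED_FILE = 'seed.pkl'              # hypothetical location for the saved seed
if os.path.exists(SEED_FILE):
    x, y = pickle.load(open(SEED_FILE))
    RandomArray2.seed(x, y)         # later runs: restore the recorded seeds
else:
    RandomArray2.seed()             # first run: current clock-based default
    pickle.dump(RandomArray2.get_seed(), open(SEED_FILE, 'w'))
# ... simulation code using RandomArray2 goes here ...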
But, if that behavior is kept for RandomArray2 (and RNG should then be adapted, too) then I'd urge at least to use a better initial seed. In certain applications, e.g. generating session id's in crypto programs, non-predictability of initial seeds is crucial. But if you have a look at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks like an art in itself. So perhaps RNG's 'clock code' should replace RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will not have the 1-second problem. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Just because you're not paranoid, that doesn't mean that they're not after you. From Chris.Barker at noaa.gov Wed Jul 24 10:01:06 2002 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Jul 24 10:01:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> Message-ID: <3D3ECEEE.6BAF4CC2@noaa.gov> Just to add my $.02: I disagree with Eric about what the default behaviour should be. Every programming language/environment I have ever used uses some kind of "random" seed by default. When I want reproducible results (which I often do for testing) I can specify a seed. I find the the most useful behaviour. As Eric points out, it is not trivial to generate a "random" seed (from the time, or whatever), so it doesn't make sense to burdon the nieve user with this chore. Therefore, I strongly support keeping the default behaviour of a "random" seed. Eric Maryniak wrote: > then I'd urge at least to use a better initial seed. > In certain applications, e.g. generating session id's in crypto programs, > non-predictability of initial seeds is crucial. But if you have a look > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks > like an art in itself. So perhaps RNG's 'clock code' should replace > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will > not have the 1-second problem. This I agree with: a better default initial seed would be great. As someone said, "show me the code!". I don't imagine anyone would object to improving this. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From e.maryniak at pobox.com Wed Jul 24 10:29:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 10:29:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3ECEEE.6BAF4CC2@noaa.gov> References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> <3D3ECEEE.6BAF4CC2@noaa.gov> Message-ID: <200207241928.07366.e.maryniak@pobox.com> On Wednesday 24 July 2002 17:59, Chris Barker wrote: > Just to add my $.02: > > I disagree with Eric about what the default behaviour should be. Every > programming language/environment I have ever used uses some kind of > "random" seed by default. When I want reproducible results (which I > often do for testing) I can specify a seed. I find the the most useful > behaviour. As Eric points out, it is not trivial to generate a "random" > seed (from the time, or whatever), so it doesn't make sense to burdon > the nieve user with this chore. 
> > Therefore, I strongly support keeping the default behaviour of a > "random" seed. In that case, and if that is the general consensus, RNG should be adapted: it now uses a fixed seed by default (and not a clock generated one). > Eric Maryniak wrote: > > then I'd urge at least to use a better initial seed. > > In certain applications, e.g. generating session id's in crypto programs, > > non-predictability of initial seeds is crucial. But if you have a look > > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it > > looks like an art in itself. So perhaps RNG's 'clock code' should replace > > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus > > will not have the 1-second problem. > > This I agree with: a better default initial seed would be great. As > someone said, "show me the code!". I don't imagine anyone would object > to improving this. The source is in Mixranf(), file Numerical/Packages/RNG/Src/ranf.c (when checked out with CVS), but it may be a good idea to check it with Python's own random/whrandom code (which I don't have at hand -- it may be more recent and/or portable for other OSes). By the way, I realized in my code 'fix' for RandomArray2.seed(x=None,y=None) that I already anticipated this and that the default behavior is _not_ to use a fixed seed ;-) : if either the x or y seed: seed < 0 ==> Use the default initial seed value. seed = None ==> Set a "random" value for the seed from clock (default) seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: # This would be the best, but is problematic under Windows/Mac. import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative # So best is to use Mixranf() from RNG/Src/ranf.c here. elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Unix was a trademark of AT&T. AT&T is a modem test command. From peter.chang at nottingham.ac.uk Wed Jul 24 11:08:06 2002 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Wed Jul 24 11:08:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: Just to stick my oar in: I think Eric's preference is predicated by the lousiness (or otherwise?) of RandomArray's seeding mechanism. The random sequences generated by incremental seeds should, by design, be uncorrelated thus allowing the use of the system clock as a seed source. If you're running lots of simulations (as I do with Monte Carlos, though not in numpy) using PRNGs, the last thing you want is the task to find a (pseudo) random source of seeds. Using /dev/random is not particularly portable; the system clock is much easier to obtain and is fine as long as your iteration cycle is longer than its resolution. 
Peter From paul at pfdubois.com Wed Jul 24 23:09:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Jul 24 23:09:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> I'm not going to change the default seed on RNG. Existing users have the right to stability, and not to have things change because someone thinks a certain choice among several reasonable ones is better than the one previously made. There is the further issue here of RNG being advertised as similar to Cray's ranf() and that similarity extends to this default. Not to mention that for many purposes the current default is quite useful. From e.maryniak at pobox.com Thu Jul 25 06:02:03 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Thu Jul 25 06:02:03 2002 Subject: [Numpy-discussion] Numarray: Summary (seeding): personal code and manual suggestions on initial seeding in module RNG and RandomArray(2) In-Reply-To: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> References: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> Message-ID: <200207251501.47126.e.maryniak@pobox.com> Dear crunchers, Please see my personal thoughts on the past discussion about initial seeds some paragraphs down below, where I'd like to list concrete code and manual enhancements aimed at providing users with a clear understanding of it's usage (and pitfalls e.g. w/r to cryptographic applications)... ==> Suggestions for code and manual changes w/r to initial seeding (down below) But first a response to Paul's earlier message: On Thursday 25 July 2002 08:08, Paul F Dubois wrote: > I'm not going to change the default seed on RNG. Existing users have the > right to stability, and not to have things change because someone thinks > a certain choice among several reasonable ones is better than the one > previously made. Well, I wasn't aware of the fact that things were completely set in stone for Numarray solely for backward compatibilty. It was my impression that numarray and it's accompanying xx2 packages were also open for redesign. I agree stability is important, but numarray already breaks with Numeric in other aspects so why should RNG (RNG2 in numarray?) or other packages not be? It's more a matter of well documenting changes I think. Users switching to numarray will already have to take into account some changes and verify their code. It's not that I "think a certain choice among several reasonable ones is better" [although my favorite is still a fixed seed, as in RNG, for reasons of reproducibility in later re-runs of Monte Carlo's that are not possible now, because the naive user, using a clock seed, may not have saved the initial seed with get_seed], but that the different packages, i.c. RNG (RNG2 to be?) and RandomArray2, should be orthogonal in this respect. I.e. the same, so 'default always an automagical (clock whatever) random initial seed _or_ a fixed one'. Orthogonality is a very common and accepted design principle in computing science and for good reasons (usability). Users changing from one PRNG to another (and using the default seed) would otherwise be unwelcomely surprised by a sudden change in behavior of their program. I try to give logical arguments and real code examples in this discussion and fail to see in Paul's reaction where I'm wrong. By the way: in Python 2.1 alpha 2 seeding changed, too: """ - random.py's seed() function is new. 
For bit-for-bit compatibility with prior releases, use the whseed function instead. The new seed function addresses two problems: (1) The old function couldn't produce more than about 2**24 distinct internal states; the new one about 2**45 (the best that can be done in the Wichmann-Hill generator). (2) The old function sometimes produced identical internal states when passed distinct integers, and there was no simple way to predict when that would happen; the new one guarantees to produce distinct internal states for all arguments in [0, 27814431486576L). """ > There is the further issue here of RNG being advertised as similar to > Cray's ranf() and that similarity extends to this default. Not to > mention that for many purposes the current default is quite useful. Perhaps I'm mistaken here, but RNG/Lib/__init__.py does (-1 -> uses fixed internal seed): standard_generator = CreateGenerator(-1) and: def ranf(): "ranf() = a random number from the standard generator." return standard_generator.ranf() And indeed Mixranf in RNG/Src/ranf.c does set them to 0: ... if(*s < 0){ /* Set default initial value */ s48[0] = s48[1] = 0; Setranf(s48); Getranf(s48); And this code, or I'm missing the point, uses a standard generator from RNG, which demonstrates the same sequence of initial seeds in re-runs (note that it does not suffer from the "1-second problem" as RandomArray2 does, see the Appendix below for a demonstration of that, because RNG uses milliseconds). Note that 'ranf()' is listed in chapter 18 in Module RNG as one of the 'Generator objects': $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> Ok, now then my own (and possibly biased) personal summary of the past discussions and concrete code and manual recommendations: ==> Suggestions for code and manual changes w/r to initial seeding Conclusions: 1. Default initial seeding should be random (and not fixed). This is the general consensus and while it may not win the beauty contest in purist software engineering circles, it also is the default behavior in Python's own Random/WHRandom modules. URL: http://web.pydoc.org/2.2/random.html => Recommendations: - Like Python's random/whrandom module, default arguments to seed() should not be 0, but None, and this triggers the default behavior which is to use a random initial seed (ideally 'truly' random from e.g. /dev/random or otherwise clock or whatever based), because: o better usability: users changing from Python's own random to numarray's random facilities will find familiar seed() usage semantics o often 0 itself can be a legal seed (although the MersenneTwister does not recommend it) - Like RNG provide support for using a built-in fixed seed by supplying negative seeds to seed(), rationale: o support for reproducible re-runs of Monte Carlo's without having to specify ones own initial seed o usability: naive users may not know a 'good' seed is, like: can it be 0 or must it be >0, what is the maximum, etc. - See my suggested code fix for RandomArray2.seed() in the Appendix below. - Likewise, in RNG: o CreateGenerator (s, ...) 
should be changed to CreateGenerator (s=None) Also note Python's own: def create_generators(num, delta, firstseed=None) from random (random.py), url: http://web.pydoc.org/2.2/random.html o RNG's code should be changed from testing on 0 to testing on None first (which results in using the clock), then on < 0 (use built-in seed), and then using the user provided seed (which is thus >= 0, and hence can also be 0) o 'standard_generator = CreateGenerator(-1)' should be changed to 'standard_generator = CreateGenerator() and results in using the clock - Put some explicit warnings in the numarray manual, that the seeding of numarray's packages should _not_ be used in those parts of software where unpredictability of seeds is important, such as for example, cryptographical software for creating session keys, TCP sequence numbers etc. Attacks on crypto software usually center around these issues. Ideally, a /dev/random should be used, but with the current system clock based implementation, the seeds are not random, because the clock does not have deci-nanosecond precision (10**10 ~= 2**32) yet ;-) Appendix -------- ** 1. "1-second problem" with RandomArray2: $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> import sys >>> sys.version '2.2.1 (#1, Jun 25 2002, 20:45:02) \n[GCC 2.95.3 20010315 (SuSE)]' >>> numarray.__version__ '0.3.5' >>> for i in range(3): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 1027591910.9043469 (102759, 1911) 1027591911.901091 (102759, 1912) 1027591912.901088 (102759, 1913) >>> for i in range(3): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(0.3) ... print ... 1027591966.260392 (102759, 1967) 1027591966.5510809 (102759, 1967) 1027591966.851079 (102759, 1967) Note that Python (at least 2.2.1) own random() suffers much less from this (on my 450 MHz machine, every 10-th millisecond or so the seed will be different): $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from random import * >>> import time >>> >>> for i in range(3): ... print long(time.time() * 256) ... 263065231349 263065231349 263065231349 >>> for i in range(3): ... print long(time.time() * 256) ... time.sleep(.00001) ... 263065240314 263065240315 263065240317 By the way, Python's own random.seed() also suffers from this, but on a 10th-millisecond level (on my 450 Mhz i586 at least). For the implementation of seed() see Lib/random.py, basically a 'long(time.time()' is used: $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from random import * >>> import time >>> for i in range(3): ... print long(time.time() * 256) ... 263065231349 263065231349 263065231349 >>> for i in range(3): ... print long(time.time() * 256) ... time.sleep(.00001) ... 263065240314 263065240315 263065240317 2. Proposed re-implementation of RandomArray2.seed(): def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y: x or y is None (or not specified): A random seed is used which in the current implementation may be based on the system's clock. Warning: do not this seed in software where the initial seed may not be predictable, such as for example, in cryptographical software for creating session keys. x < 0 or y < 0: Use the module's fixed built-in seed which is the tuple (1234567890L, 123456789L) (or whatever) x >= 0 and y >= 0 Use the seeds specified by the user. 
(Note: some random number generators do not recommend using 0) Note: based on Python 2.2.1's random.seed(a=None). ADAPTED for _2_ seeds as required by ranlib.set_seeds(x,y) """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: try: # This would be the best, but is problematic under Windows/Mac. # To my knowledge there isn't a portable lib_randdevice yet. # As GPG, OpenSSH and OpenSSL's code show, getting entropy # under Windows is problematic. # However, Python 2.2.1's socketmodule does wrap the ssl code. import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative except: # Use Mixranf() from RNG/Src/ranf.c here or, perhaps better, # use Python 2.2.1's code? At least it looks simpler and does not # have the platform dependency's and has possibly met wider testing # (and why not re-use code? ;-) # For Python 2.2.1's random.seed(a=None), see url: # http://web.pydoc.org/2.2/random.html # and file Lib/random.py. # Do note, however, that on my 450 Mhz machine, the statement # 'long(time.time() * 256)' will generate the same values # within a tenth of a millisecond (see Appendix #1 for a code # example). This can be fixed by doing a time.sleep(0.001). # See my #EM# comment. # Naturally this code needs to be adapted for ranlib's # generator, because this code uses the Wichmann-Hill generator. ---cut: Wichmann-Hill--- def seed(self, a=None): """Initialize internal state from hashable object. None or no argument seeds from current time. If a is not None or an int or long, hash(a) is used instead. If a is an int or long, a is used directly. Distinct values between 0 and 27814431486575L inclusive are guaranteed to yield distinct internal states (this guarantee is specific to the default Wichmann-Hill generator). """ if a is None: # Initialize from current time import time a = long(time.time() * 256) #EM# Guarantee unique a's between subsequent call's of seed() #EM# by sleeping one millisecond. This should not be harmful, #EM# because ordinarily, seed() will only be called once or so #EM# in a program. time.sleep(0.001) if type(a) not in (type(3), type(3L)): a = hash(a) a, x = divmod(a, 30268) a, y = divmod(a, 30306) a, z = divmod(a, 30322) self._seed = int(x)+1, int(y)+1, int(z)+1 ---cut: Wichmann-Hill--- elif x < 0 or y < 0: x = 1234567890L # or any other suitable 0 - 2**32-1 y = 123456789L ranlib.set_seeds(x,y) 3. Mersenne Twister, another PRNG: Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. In a grocery store, the Real Programmer is the one who insists on running the cans past the laser checkout scanner himself, because he never could trust keypunch operators to get it right the first time. From aureli at ipk.fhg.de Thu Jul 25 09:51:06 2002 From: aureli at ipk.fhg.de (Aureli Soria Frisch) Date: Thu Jul 25 09:51:06 2002 Subject: [Numpy-discussion] index method for array objects? In-Reply-To: References: <20020621133705.A15296@idi.ntnu.no> Message-ID: Hi all, Has someone implemented a function for arrays that behaves like the index(*) method for lists (it should then consider something like a tolerance parameter). I suppose it could be maybe done with array.tolist() and list.index(), but have someone implemented something more elegant/array-based? 
Thanks in advance Aureli PD: (*) index receive a value as an argument and retunrs the index of the list member equal to this value... -- ################################# Aureli Soria Frisch Fraunhofer IPK Dept. Pattern Recognition post: Pascalstr. 8-9, 10587 Berlin, Germany e-mail: aureli at ipk.fhg.de fon: +49 30 39006-143 fax: +49 30 3917517 web: http://vision.fhg.de/~aureli/web-aureli_en.html ################################# From jmiller at stsci.edu Thu Jul 25 10:15:03 2002 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 25 10:15:03 2002 Subject: [Numpy-discussion] index method for array objects? References: <20020621133705.A15296@idi.ntnu.no> Message-ID: <3D4031C2.3090607@stsci.edu> Aureli Soria Frisch wrote: > Hi all, > > Has someone implemented a function for arrays that behaves like the > index(*) method for lists (it should then consider something like a > tolerance parameter). > > I suppose it could be maybe done with array.tolist() and list.index(), > but have someone implemented something more elegant/array-based? > > Thanks in advance > > Aureli > > PD: (*) index receive a value as an argument and retunrs the index of > the list member equal to this value... I think the basics of what you're looking for are something like: def index(a, b, eps): return nonzero(abs(a-b) < eps) which should return all indices at which the absolute value of the difference between elements of a and b differ by less than eps. e.g.: >>> import Numeric >>> index(Numeric.arange(10,20), 15, 1e-5) array([5]) Todd -- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 From magnus at hetland.org Thu Jul 25 12:12:11 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:12:11 2002 Subject: [Numpy-discussion] Spectral approximation/DFT Message-ID: <20020725211111.A27670@idi.ntnu.no> Hi! Sorry to ask what is probably a really clueless question -- if there are any obvious sources of information about this, I'd be happy to go there and find this out for myself... :] Anyway; I'm trying to produce a graph to illustrate a time sequence indexing method, which relies on extracting the low-frequent Fourier coefficients and indexing a vector consisting of those. The graph should contain the original time sequence, and one reconstructed from the Fourier coefficients. Since it is reconstructed from only the low-frequent coefficients (perhaps 10-20 coefficients), it will look wavy and sinus'y. Now... I'm no expert in signal processing (or the specifics of FFT/DFT etc.), and I can't seem to make the FFT module do exactly what I want here... It seems that using fft(seq).real extracts the coefficients I'm after (though I'm not sure whether the imaginary components ought to figure in the equation somehow...) But no matter how I use inverse_fft or inverse_real_fft it seems I have to supply a number of coefficients equal to the sequence I want to approximate -- otherwise there will be a huge offset between them. Why is this so? Shouldn't the first coefficient take care of such an offset? Perhaps inverse_fft isn't doing what I think it is? If I haven't expressed myself clearly, I'd be happy to elaborate... (For those who might be interested, the approach is described in the paper found at http://citeseer.nj.nec.com/307308.html with a figure of the type I'm trying to produce at page 5.) 
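For what it's worth, the kind of low-frequency approximation described above can be sketched as follows (this is not code from the paper; the test signal and the number of coefficients kept are made up). The point is to take the full-length transform and zero out the high-frequency bins before inverting, so the scaling stays consistent with the original sequence length and the mean (coefficient 0) is preserved:

from Numeric import arange, sin, pi
from FFT import real_fft, inverse_real_fft

n = 100
t = arange(n)/float(n)
x = sin(2*pi*t) + 0.4*sin(2*pi*5*t) + 0.1*sin(2*pi*23*t)   # made-up test sequence

k = 10                         # keep only the k lowest-frequency coefficients
c = real_fft(x)                # length n/2 + 1
c[k:] = 0                      # discard everything above frequency k-1
smooth = inverse_real_fft(c)   # wavy, low-frequency approximation of x, length n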
Anyway, thanks for any help :) -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From magnus at hetland.org Thu Jul 25 12:16:21 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:16:21 2002 Subject: [Numpy-discussion] A probable solution... Message-ID: <20020725211534.A27914@idi.ntnu.no> After posting to the list (sorry about that ;) a possible solution occurred to me... To get an approximation, I used fft(seq, 10) and then inverted that using inverse_fft(signature, 100)... I guess that fouled up the scale of things -- when I use fft(seq, 100)[:10] to get the signature, it seems that everything works just fine... Even though this _seems_ to do the right thing, I just wanted to make sure that I'm not doing something weird here... -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From a.schmolck at gmx.net Thu Jul 25 15:18:04 2002 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu Jul 25 15:18:04 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: "Paul F Dubois" writes: > > During recent months there has been a lot of discussion about Numarray > and whether or not it should differ from Numeric in certain ways. We > have reviewed this lengthy discussion and come to some conclusions about > what we plan to do. The discussion has been valuable in that it took a > whole new "generation" back through the considerations that the > "founding fathers" debated when Numeric Python was designed. [...] > Decisions > > Numarray will have the same Python interface as Numeric except for the > exceptions discussed below. [...] > 2. Currently, if the result of an index operation x[i] results in a > scalar result, the result is converted to a similar Python type. For > example, the result of array([1,2,3])[1] is the Python integer 2. This > will be changed so that the result of an index operation on a Numarray > array is always a Numarray array. Scalar results will become rank-zero > arrays (i.e., shape () ). > [...] > > 4. The Numarray version of MA will no longer have copy semantics on > indexing but instead will be consistent with Numarray. (The decision to > make MA differ in this regards was due to a need for CDAT to be backward > compatible with a local variant of Numeric; the CDAT user community no > longer feels this was necessary). [...] As one of the people who argued for interface changes in numarray (mainly copy semantics for slicing), let me say that I welcome this announcement which clarifies many issues. Although I still believe that copy behavior would be preferable in principle, I think that continuity and backwards compatibility to Numeric is a sufficient reason to stick to the old behavior (now that numarray strives to be largely compatible) [1]. In a similar vain I also greatly welcome the change to view semantics in MA, because I feel that internal consistency is vital. Apart from being a heavy Numeric user, these interface issues are also quite important to me because I have been working for some time on a fully-featured matrix [2] class which I wanted to be both a) compatible to Numeric and numarray (so that it would ideally make no difference to the user which of the 2 libraries he'd be using as a "backend" to the matrix class). b) consistent in usage to numarray's interface wherever feasible (i.e. not too much of a compromise on usability). 
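Since much of the discussion above turns on view versus copy semantics for slicing, a two-line illustration of the behaviour that is being kept (Numeric-style reference slicing) may help; the expected result is shown as a comment rather than pasted from a session:

import Numeric
a = Numeric.array([1, 2, 3, 4])
b = a[1:3]        # a view onto a's data, not a copy
b[0] = 99
print a           # -> [ 1 99  3  4] : the slice shares data with a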
This turned out to be much more of a hassle than I would have anticipated, because contrary to what the compatibility section of the manual seemed to suggest I found numarray to be incompatible in a variety of ways (even making it impossible to write *forward* compatible code without writing additional wrapping functions). Just as an example, there was no simple way that would work across both versions to do something as common as creating e.g. an int array (with both parameter names and positions differing): Numeric (21): array(sequence, typecode=None, copy=1, savespace=0) numarray (0.3.3?) : array(buffer=None, shape=None, type=None) As for b) this obviously turned out to be a moving target, but I hope that now the final shape of things is getting reasonably clear and I'm now for example determined to have view slicing behavior for my matrix class, too. Nonetheless, for me a few issues still remain. Most importantly, numarray doesn't provide the same degree of polymorphism as Numeric. One of the chief reasons given as to why Numeric's design is based around functions rather than methods is that it enables greater generality (e.g. allowing one to ``sum`` over all sorts of sequence types). Consequently the role of methods and attributes was largely limited to functionality that only made sense for array objects and special methods. This is more than just a neat convenience -- because of the resulting polymorphism it is easy to write fairly general code and define new kinds of numeric classes that can seamlessly be passed to Numeric functions (e.g. one can also ``sum`` Matrix'es). I find it highly undesirable that numarray apparently doesn't follow this design rationale and that the division of labour between functions and methods/attributes has been blurred (or so it appears to me -- maybe this is some lack of insight on my part). That numarray versions before 0.3.4 were missing functions such as ``shape`` (which is also quite handy for other sequence types) was largely an inconvenience, but the fact that numarray functions generally only operate on scalars, ``tuple``s and ``list``s (apart from, obviously, numarray.array's) is in my eyes a significant shortcoming. In contrast, Numeric functions would operate on any type that had an __array__ method to return an array representation of itself. The explicit checking for a type that numarray uses (via constructs à la type(a) == types.ListType) flies in the face of standard python sensibilities, places arbitrary limits on the kinds of objects that numarray users can conveniently work with, and places a significant hurdle in the way of creating new kinds of numerical objects. For example, the design of my matrix class depends on the fact that Numeric functions also accept objects with __array__ methods (such as my matrix class). Even if I invested the substantial amount of work that would be needed to redesign a less general version that wouldn't rely on this property, one of the key virtues of my class, namely the ability to transparently replace Numeric.array's in most cases where they are used as matrices, would be lost. These two reasons would presumably be sufficient for me not to switch to numarray if I can at all avoid it, so I really hope that numarray will also grow an __array__ protocol or something equivalent. This is the only point that is really vital to me, but there are others that I'd rather see reconsidered.
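The polymorphism Alexander is referring to can be made concrete with a toy class; the class and its data are invented, but the mechanism is the __array__ hook he describes, which lets Numeric functions accept the object as if it were an array:

import Numeric

class Wrapped:
    "Toy container that Numeric functions can digest via __array__."
    def __init__(self, data):
        self.data = data
    def __array__(self, typecode=None):
        # called by Numeric whenever an array representation is needed
        if typecode is None:
            return Numeric.array(self.data)
        return Numeric.array(self.data, typecode)

w = Wrapped([1.0, 2.0, 3.0])
print Numeric.sum(w)        # 6.0 -- works because of the __array__ hook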
As I said, I liked the division of labor between functions and methods/attributes in Numeric and the motivations behind it, as far as I understand them. numarray arrays, however, have grown methods like ``argsort`` and ``diagonal`` that seem somewhat unmotivated to me (and some of which cause problems for my matrix class). Similarly, why is there e.g. a ``.rank`` attribute but a ``.type()`` method? If anything one would expect type to be an attribute and rank a method, since the type is actually some intrinsic property that needs to be stored (and could even be plausibly assigned to, with results like an ``astype`` call) whereas ``size`` and ``rank`` have no "real" existence as they are only computed from the shape and modifying them makes no sense. TMTOWTDI is the road to perl, so I'd really prefer to avoid duplicate functionality a la ``rank(a)`` and ``a.rank`` and generally reserve attributes and methods for array-specific functionality. One area where TMTOWTDI seems to have run amok (several ways to do something but IMHO all broken) is flattened representations of arrays. All these expressions aim to produce a flattened version of ``a``: ``ravel(a)``, ``a.ravel()``, ``a.getflat()``/ ``a.flat``. `Aim` in this context is some sort of euphemism -- the only one for which it is possible to determine at compile time that it will do anything apart from raising an exception is ``ravel(a)`` -- not that one could know *what* it will do before the code is actually run (return a flattened copy of a or a flattened view), but never mind. Yuck. I think this really needs fixing (deprecating, rather than removing or changing incompatibly, where felt necessary). Something else, which I however consider less important: is it really necessary to have both 'type' and 'typecode'? Wouldn't it be enough to just stick with typecode, along the following lines (potentially issuing deprecation warnings where appropriate): a.typecode() returns a type object (e.g. Float32). array([1,2,3], typecode=Float32) behaves the same as array([1,2,3], typecode='d') Float32 etc. are already defined in Numeric so it's easy to write forward-compatible code and although hunting down instances of if a.typecode() == 'd': presumably wouldn't be that difficult, incompatibility could most likely almost be eliminated by making ``Float32 == 'd'`` return true. Sticking to the old name typecode also has the advantage that it is fairly unique and unambiguous (just try grep'ing for type vs. typecode). I must say that apart from the switch to type objects, I don't fully understand the differences in numeric types in old Numeric and numarray and the motivation behind them. As far as I can see the emphasis with Numeric was to stay flexible with respect to different hardware and increasing word sizes (i.e. to only guarantee minimum precision) and to provide some reasonable "default" size for each type (e.g. `Float` being a double precision [3]). This approach is maybe somewhat similar to the python core (floats and ints can have different sizes, depending on the underlying platform). In numarray the emphasis seems to have shifted to guaranteeing the actual size in memory (if in a few years' time most calculations are done with 128bit precision then that's maybe not such a good idea, but I have no clue how likely this is to happen). Is this shift of emphasis also responsible for the decision to have indexing operations always return arrays rather than scalars (including ones defined by numarray in cases where there is no plain-python equivalent)?
Will all other functions (e.g. min) continue to return scalars? [BTW can anyone explain to me the difference between Int and Int32 (typecodes 'i' and 'l')?] Anyway, my apologies if I come across as too negative or if some the points are misinformed. I really think that the recent changes to numarray and this announcment are great step forward to a smooth transition of the whole community from Numeric to numarray which will play an important role in consolidating python's role in the scientific computing. night, alex Footnotes: [1] I think it might be beneficial, however, to add an explicitly note to the manual that alerts users to the fact that small slices can keep alive very large arrays, because I am under the impression that this is not immediately obvious to everyone and can cause puzzling problems. [2] I moaned on this list some months ago that doing linear algebra with Numeric array's was often cumbersome and inefficient (and the Matrix class that already comes with Numeric is rather limited). My (currently alpha) matrix class attempts to address these issues and also provides a much more flexible 'plugable' output formating (matlab-like, amongst others, which I guess many people will find much more readable; but the standard array-like formating is also available). [3] As an aside: maybe ``type="Float"`` in numarray should therefore *not* be equivalent to ``type=Float32`` but to ``type=Float64``, given that these strings seem to just be there for backwards compatibility? -- Alexander Schmolck Postgraduate Research Student Department of Computer Science University of Exeter A.Schmolck at gmx.net http://www.dcs.ex.ac.uk/people/aschmolc/ From victor at idaccr.org Tue Jul 30 06:43:06 2002 From: victor at idaccr.org (Victor S. Miller) Date: Tue Jul 30 06:43:06 2002 Subject: [Numpy-discussion] Sparse matrices Message-ID: I had noticed that Travis Oliphant had a sparse.py package, but it no longer is available (clicking on the link gives a "404"). I have a particular kind of sparse matrix that I'd like to use to give vector matrix multiplies. In particular, it's an n x n matrix which has at most k (which is small, usually 2 or 3) non-zeros in each row which are in consecutive locations. I have this encoded as an n x k matrix, the i-th row gives the non-zero values in the i-th row of the big matrix, and an n long vector of indices -- the i-th element gives the starting position in the i-th row. When I want to multiply this matrix by a row vector v on the left. To do the multiplication I do the following: # loc is the location vector n = matrix.shape[0] mm = reshape(v,(-1,1))*matrix w = zeros((n+m),v.typecode()) for i in range(mm.shape[0]): w[loc[i]:loc[i]+matrix.shape[1]] += w[i] w = w[:n] I would like to be able to replace the loop with some Numeric operations. Is there a trick to do this? Note that the n that I'm using is around 100000, so that storing the full matrix is out of the question (and multiplying by that matrix would be extremely inefficient, anyway). -- Victor S. Miller | " ... Meanwhile, those of us who can compute can hardly victor at idaccr.org | be expected to keep writing papers saying 'I can do the CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed 08540 USA | what editor would publish them?" -- Oliver Atkin From victor at idaccr.org Tue Jul 30 08:29:06 2002 From: victor at idaccr.org (Victor S. 
Miller) Date: Tue Jul 30 08:29:06 2002 Subject: [Numpy-discussion] Sparse matrices In-Reply-To: (victor@idaccr.org's message of "Tue, 30 Jul 2002 09:42:13 -0400") References: Message-ID: Sorry, I had a typo in the program. It should be: # M is n by k, and represents a sparse n by n matrix A # the non-zero entries of row i of A start in column loc[i] # and are the i-th row of M in locations loc[i]:loc[i]+k # loc is the location vector n,k = M.shape mm = reshape(v,(-1,1))*M w = zeros((n+m),v.typecode()) # is there a trick to replace the loop below? for i in range(mm.shape[0]): w[loc[i]:loc[i]+k] += mm[i] w = w[:n] -- Victor S. Miller | " ... Meanwhile, those of us who can compute can hardly victor at idaccr.org | be expected to keep writing papers saying 'I can do the CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed 08540 USA | what editor would publish them?" -- Oliver Atkin From jochen at unc.edu Tue Jul 30 09:24:02 2002 From: jochen at unc.edu (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Tue Jul 30 09:24:02 2002 Subject: [Numpy-discussion] Sparse matrices In-Reply-To: References: Message-ID: On Tue, 30 Jul 2002 09:42:13 -0400 Victor S Miller wrote: Victor> I had noticed that Travis Oliphant had a sparse.py package, Victor> but it no longer is available (clicking on the link gives a Victor> "404"). It's part of scipy now. Greetings, Jochen -- University of North Carolina phone: +1-919-962-4403 Department of Chemistry phone: +1-919-962-1579 Venable Hall CB#3290 (Kenan C148) fax: +1-919-843-6041 Chapel Hill, NC 27599, USA GnuPG key: 44BCCD8E
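One possible trick for the loop Victor asks about, under an assumption his description does not guarantee: if, for each fixed column offset j, the target positions loc + j are all distinct (true for a strictly banded matrix where loc is strictly increasing), the loop over the n rows can be replaced by a short loop over the k columns using take/put. This is only a sketch and the function name is made up; if indices can repeat within a column, the original loop or a real sparse package such as the SciPy one mentioned above is the safer route.

from Numeric import reshape, zeros, take, put

def banded_left_multiply(v, M, loc):
    # computes w = v * A for the n x k banded encoding described above
    # (sketch only; assumes loc + j has no repeated entries for each fixed j)
    n, k = M.shape
    mm = reshape(v, (-1, 1)) * M
    w = zeros(n + k, v.typecode())
    for j in range(k):
        idx = loc + j
        put(w, idx, take(w, idx) + mm[:, j])   # accumulate column j in one shot
    return w[:n]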
Franklin -------------- next part -------------- # /usr/bin/env python from Numeric import * factor = array(6.).astype(Float16) edge_mod = array(0.66).astype(Float16) class InfluenceMap: def __init__(self,hex_map): self.map_size = map_size = hex_map.size self._iterations = (map_size[0] + map_size[1])/4 self.hex_map = hex_map # weightmap == influence map self.weightmap = zeros((map_size[0],map_size[1]),Float16) # constmap = initial state with constraints/constants self.constmap = zeros((map_size[0],map_size[1]),Float16) def step(self,iterations=None): constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations while iterations: # spread the influence # diamond_h neighbors = _shift_up(weightmap)/factor neighbors += _shift_left(weightmap)/factor neighbors += _shift_right(weightmap)/factor neighbors += _shift_down(weightmap)/factor neighbors += _shift_hex_up(weightmap)/factor neighbors += _shift_hex_down(weightmap)/factor # constrain initial points to prevent overheating putmask(neighbors,constmap,constmap) weightmap = neighbors iterations -= 1 self.weightmap = weightmap def shift_up(cells): return concatenate((cells[1:], cells[-1:]*edge_mod)) def shift_down(cells): return concatenate((cells[:1]*edge_mod, cells[:-1])) def shift_left(cells): return transpose(shift_up(transpose(cells))) def shift_right(cells): return transpose(shift_down(transpose(cells))) # for array layout def shift_hex_up(cells): neighbors = array(cells) # add to odd cell rows [1::2] neighbors[1::2] = shift_left(shift_up(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_up(cells))[::2] return neighbors def shift_hex_down(cells): neighbors = array(cells) # odd cell rows [1::2] neighbors[1::2] = shift_left(shift_down(cells))[1::2] # even cell rows [::2] neighbors[::2] = shift_right(shift_down(cells))[::2] return neighbors From dubois1 at llnl.gov Mon Jul 8 09:10:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Mon Jul 8 09:10:04 2002 Subject: [Numpy-discussion] Caution -- // not standard Message-ID: <1026144543.13905.3.camel@ldorritt> I have run into several cases of this on different open-source projects, the latest being an incorrect change in Numeric's arrayobject.c: the use of // to start a comment. Many contributors who work only with Linux have come to believe that this works with other C compilers, which is not true. This construct comes from C++. Please avoid this construct when contributing changes or patches to Numeric. From bsder at mail.allcaps.org Mon Jul 8 12:03:09 2002 From: bsder at mail.allcaps.org (Andrew P. Lentvorski) Date: Mon Jul 8 12:03:09 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <1026144543.13905.3.camel@ldorritt> Message-ID: <20020708114304.T66456-100000@mail.allcaps.org> Actually, // is standard C99 released December 1, 1999 as ISO/IEC 9899:1999. It also has support for variable length arrays, a complex number type and a bunch of *portable* stuff for getting at numerical information (limits, floating-point environment) rather than nasty compiler specific hacks. ( See: http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9x.htm ) Many of these extensions are specifically for the numerical community. I would recommend taking up the issue of non-standards compliance with your compiler vendor. -a On 8 Jul 2002, Paul Dubois wrote: > I have run into several cases of this on different open-source projects, > the latest being an incorrect change in Numeric's arrayobject.c: the use > of // to start a comment. 
Many contributors who work only with Linux > have come to believe that this works with other C compilers, which is > not true. This construct comes from C++. Please avoid this construct > when contributing changes or patches to Numeric. From paul at pfdubois.com Mon Jul 8 12:38:05 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 8 12:38:05 2002 Subject: [Numpy-discussion] Caution -- // not standard In-Reply-To: <20020708114304.T66456-100000@mail.allcaps.org> Message-ID: <001101c226b6$da5f3090$0c01a8c0@NICKLEBY> Thank you for the clarification. Unfortunately, "my" compiler vendor is the set of all compiler vendors that users of Numeric have, and we have to restrict ourselves to what works. I misspoke when I said it was "not standard"; I should have said, "doesn't work everywhere". > -----Original Message----- > From: Andrew P. Lentvorski [mailto:bsder at mail.allcaps.org] > Sent: Monday, July 08, 2002 12:02 PM > To: Paul Dubois > Cc: numpy-discussion at lists.sourceforge.net > Subject: Re: [Numpy-discussion] Caution -- // not standard > > > Actually, // is standard C99 released December 1, 1999 as > ISO/IEC 9899:1999. > > It also has support for variable length arrays, a complex > number type and a bunch of *portable* stuff for getting at > numerical information (limits, floating-point environment) > rather than nasty compiler specific hacks. ( See: > http://std.dkuug.dk/JTC1/SC22/WG14/www/newinc9> x.htm ) > > Many > of these extensions are specifically for the > numerical community. > > I would recommend taking up the issue of non-standards > compliance with your compiler vendor. > > -a > > On 8 Jul 2002, Paul Dubois wrote: > > > I have run into several cases of this on different open-source > > projects, the latest being an incorrect change in Numeric's > > arrayobject.c: the use of // to start a comment. Many > contributors who > > work only with Linux have come to believe that this works > with other C > > compilers, which is not true. This construct comes from C++. Please > > avoid this construct when contributing changes or patches > to Numeric. > > From jae at zhar.net Tue Jul 16 00:12:13 2002 From: jae at zhar.net (John Eikenberry) Date: Tue Jul 16 00:12:13 2002 Subject: [Numpy-discussion] Optimization advice In-Reply-To: <20020708094805.GA370@kosh.zhar.net> References: <20020708094805.GA370@kosh.zhar.net> Message-ID: <20020716070554.GB363@kosh.zhar.net> After getting some advice off the pygame list I think I have a pretty good version of my influence map now. I thought someone on this list might be interested or at least someone checking the archives. The new and improved code is around 6-7x faster. The main gain was obtained by converting all the array functions to slice notation and eliminating most of the needless copying of arrays. The new version is attached and is much better commented. It is also unabridged, as it was pointed out that it wasn't entirely clear what was going on in the last (edited) version. Hopefully things are more obvious in this one. Anyways... I just hating leaving a thread without a conclusion. Hope someone finds this useful. -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "They who can give up essential liberty to purchase a little temporary safety, deserve neither liberty nor safety." --B. 
Franklin -------------- next part -------------- # /usr/bin/env python # # John Eikenberry from Numeric import * from types import * FACTOR = array(6.).astype(Float32) EDGE_MOD = array(0.66).astype(Float32) ONE = array(1.).astype(Float32) ZERO = array(0.).astype(Float32) class InfluenceMap: """ There are 2 primary ways to setup the influence map, either might be useful depending on your needs. The first is to recreate the map each 'turn' the second is to keep the map around and just update it each turn. The first way is simple and easy to understand, both in terms of tweaking and later analysis. The second gives the map a sense of time and allows for fewer iterations of the spreading algorithm per 'turn'. Setting up the map to for one or the other of these is a matter of tweaking the code. There are 3 main bits of code which are described below and indicated via comments in the code. First some terminology: - weightmap stores the current influence map - neighbors is used as the memory buffer to calculate a the influence spreading - constmap contains a map with only the unit's scores present - when I refer to a 'multi-turn map' I mean using one instance of the influence map throughout the game without resetting it. [1] neighbors *= ZERO At the end of each iteraction, the neighbors take on the values of the weightmap from the previous step. This will reset those values to zero. This has a 1% performance hit. [2] putmask(neighbors,constmap,constmap) This keeps the values of the units hexes constant through all iterations. This results in about a 40% performance hit. This needs improvement. [3] setDecayRate([float]) This is meant to be used with a multi-turn map. It sets the floating point value (N>0.0<1.0)which is used on the map each turn to modify the current map before the influence spreading. No performance hit. If just [1] used then it will cause all influence values to decend toward zero. Not sure what this would be useful for, just documenting the effect. If [1] is not used (commented out) then the map values will never balance out, rising with each iteration. This is fine if you plan on resetting the influence map each turn. Allowing you to tweak the number of iterations to get the level of values you want. But it would cause problem with a multi-turn map unless [3] is used to keep this in check. Using [2] without [1] will accellerate the rising of the values described above. It will also lead to more variation amoung the influence values when using fewer iterations. High peaks and steep sides. Using neither [1] nor [2] the peaks are much lower. If [1] and [2] are both used the map will always attain a point of balance no matter how many iterations are run. This is desirable for maps used throughout the entire game (multi-turn maps) for obvious reasons. Given the effect of [1] this also limits the need for [3] as the influence values in areas of the map where units are no longer present will naturally decrease. Though the decay rate may still be useful for tweaking this. """ _decay_rate = None def __init__(self,hex_map): """ hex_map is the in game (civl) map object """ self.map_size = map_size = hex_map.size ave_size = (map_size[0] + map_size[1])/2 self._iterations = ave_size/2 # is the hex_map useful for anything other than size? 
self.hex_map = hex_map # weightmap == influence map self.weightmap = weightmap = zeros((map_size[0],map_size[1]),Float32) # constmap == initial unit locations self.constmap = zeros((map_size[0],map_size[1]),Float32) def setUnitMap(self,units): """ Put unit scores on map -units is a list of (x,y,score) tuples where x,y are map coordinates and score is the units influence modifier """ weightmap = self.weightmap constmap = self.constmap constmap *= ZERO # mayby use the hex_map here to get terrain effects? for (x,y,score) in units: weightmap[x,y] = score constmap[x,y]=score def setInterations(self,iterations): """ Set number of times through the influence spreading loop """ assert type(iterations) == IntType, "Bad arg type: setIterations([int])" self._iterations = iterations # [3] above def setDecayRate(self,rate): """ Set decay rate for a multi-turn map. """ assert type(rate) == FloatType, "Bad arg type: setDecayRate([float])" self._decay_rate = array(rate).astype(Float32) def reset(self): """ Reset an existing map back to zeros """ map_size = self.map_size self.weightmap = zeros((map_size[0],map_size[1]),Float32) def step(self,iterations=None): """ One set of loops through influence spreading algorithm """ # save lookup time constmap = self.constmap weightmap = self.weightmap if not iterations: iterations = self._iterations # decay rate can be used when the map is kept over duration of game, # instead of a new one each turn. the old values are retained, # degrading slowly over time. this allows for fewer iterations per turn # and gives sense of time to the map. its experimental at this point. if self._decay_rate: weightmap = weightmap * self._decay_rate # It might be possible to pre-allocate the memory for neighbors in the # init method. But I'm not sure how to update that pre-allocated array. neighbors = weightmap.copy() # spread the influence while iterations: # [1] in notes above # neighbors *= ZERO # diamond_hex layout neighbors[:-1,:] += weightmap[1:,:] # shift up neighbors[1:,:] += weightmap[:-1,:] # shift down neighbors[:,:-1] += weightmap[:,1:] # shift left neighbors[:,1:] += weightmap[:,:-1] # shift right neighbors[1::2][:-1,:-1] += weightmap[::2][1:,1:] # hex up (even) neighbors[1::2][:,:-1] += weightmap[::2][:,1:] # hex down (even) neighbors[::2][:,1:] += weightmap[1::2][:,:-1] # hex up (odd) neighbors[::2][1:,1:] += weightmap[1::2][:-1,:-1] # hex down (odd) # keep influence values balanced neighbors *= (ONE/FACTOR) # [2] above - maintain scores in unit hexes # putmask(neighbors,constmap,constmap) # 'putmask' adds almost 40% to the overhead. There should be a # faster way. A little testing seems to show that this problem is # related to the usage of floats for the map values. # prepare for next iteration weightmap,neighbors = neighbors,weightmap iterations -= 1 # save for next turn self.weightmap = weightmap From paul at pfdubois.com Thu Jul 18 14:47:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Jul 18 14:47:02 2002 Subject: [Numpy-discussion] [ANNOUNCE] Pyfort 8.0 Message-ID: <001f01c22ea4$89ff4f40$0b01a8c0@NICKLEBY> Pyfort 8.0 has been released at SourceForge (sf.net/projects/pyfortran) Version 8 This version contains a new facility for making and installing projects. Old compile lines will still work, but will produce an equivalent .pfp file that you could use in the future. Included is a Tkinter-based GUI editor for the project files. However, the format of the files is simple and they could be edited with a text editor as well. 
There is improved support for installing Pyfort and the modules it creates in a location other than inside Python. See README. This version does change the installation location for an extension. Therefore, you should remove the files of any previous installation from your Python. Yes, this is annoying. That is why we are doing it, so that we can have an "uninstall" command. A new "windows" subdirectory has been added, containing an example of how to use Pyfort on Windows with Visual Fortran. Thanks to Reinhold Niesner. Testing of, and advice about, this are needed from Windows users. The pyfort script itself is also now installed as a .bat script for win32. Support for Mac OSX (Darwin) added. From biesingert at yahoo.com Fri Jul 19 01:13:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Fri Jul 19 01:13:03 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 Message-ID: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Hi, when I try to install NumPy on Mac OS X.1.5, it fails on this error: .... cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so /usr/bin/ld: -undefined error must be used when -twolevel_namespace is in effect error: command 'cc' failed with exit status 1 ~/Python/Numeric-21.3 % cc cc: No input files I had thought to submit this to the developers section of the list but could not find the way to subscribe to it ;-) If somehow had a running version of NumPy with for Mac OSX http://tony.lownds.com/macosx, I would appreciate it. Thanks everyone for their help! Regards, Thomas __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com From rob at pythonemproject.com Fri Jul 19 05:35:04 2002 From: rob at pythonemproject.com (rob) Date: Fri Jul 19 05:35:04 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 References: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: <3D3806B3.F2BE1C1A@pythonemproject.com> Thomas Biesinger wrote: > > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect > error: command 'cc' failed with exit status 1 > ~/Python/Numeric-21.3 % cc > cc: No input files > > I had thought to submit this to the developers section of the list > but could not find the way to subscribe to it ;-) > > If somehow had a running version of NumPy with for Mac OSX > http://tony.lownds.com/macosx, I would appreciate it. > > Thanks everyone for their help! > > Regards, > Thomas > > __________________________________________________ > Do You Yahoo!? > Yahoo! Autos - Get free new car price quotes > http://autos.yahoo.com > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. 
> http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion Hi Thomas, sorry I don't have the expertise to help you with your question. I am wondering if you are using one of Apple's new G4 machines? I'm curious about the floating point performance of those chips. If you ever get Numpy working, I have a routine that I use for a benchmark, a Norton-Summerfeld ground (antenna) simulation routine that I could send to you. The record for me is 120s on a P4 1.8Ghz at work, but I'm sure the new Xeons would beat that, and maybe the new Athlons. My 1.2Ghz DDR Athlon is much slower than the P4, but the clock speeds are so much different. Rob. -- ----------------------------- The Numeric Python EM Project www.pythonemproject.com From welch at cs.unc.edu Fri Jul 19 05:52:01 2002 From: welch at cs.unc.edu (Greg Welch) Date: Fri Jul 19 05:52:01 2002 Subject: FW: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <200207191053.g6JArGbE017359@wren.cs.unc.edu> Message-ID: Thomas, I have (recently) built Numeric 21.3 on multiple OS X 10.1.5 platforms, and have had no problems that I know of. I am using Python 2.3a0 but had also built Numeric w/ earlier versions of Python too. All platforms have the April 2002 developer tools update. I just noticed that your compile line shows the use of cc, as opposed to gcc. Here is the corresponding compile line for 21.3 on my powerbook (Python 2.3a0): gcc -bundle -bundle_loader /usr/local/bin/python build/temp.darwin-5.5-Power Macintosh-2.3/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.3/arrayobject.o build/temp.darwin-5.5-PowerMacintosh-2.3/ufuncobject.o -o build/lib.darwin-5.5-Power Macintosh-2.3/_numpy.so --Greg -------------- next part -------------- An embedded message was scrubbed... From: unknown sender Subject: no subject Date: no date Size: 38 URL: From biesingert at yahoo.com Fri Jul 19 04:12:31 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Fri, 19 Jul 2002 01:12:31 -0700 (PDT) Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 Message-ID: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Hi, when I try to install NumPy on Mac OS X.1.5, it fails on this error: .... cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so /usr/bin/ld: -undefined error must be used when -twolevel_namespace is in effect error: command 'cc' failed with exit status 1 ~/Python/Numeric-21.3 % cc cc: No input files I had thought to submit this to the developers section of the list but could not find the way to subscribe to it ;-) If somehow had a running version of NumPy with for Mac OSX http://tony.lownds.com/macosx, I would appreciate it. Thanks everyone for their help! Regards, Thomas __________________________________________________ Do You Yahoo!? Yahoo! Autos - Get free new car price quotes http://autos.yahoo.com ------------------------------------------------------- This sf.net email is sponsored by:ThinkGeek Welcome to geek heaven. 
http://thinkgeek.com/sf _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion --B_3109913412_427129-- From Jack.Jansen at oratrix.com Fri Jul 19 14:17:02 2002 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Fri Jul 19 14:17:02 2002 Subject: [Numpy-discussion] NumPy on Mac OS 10.1.5 In-Reply-To: <20020719081231.1850.qmail@web14103.mail.yahoo.com> Message-ID: On vrijdag, juli 19, 2002, at 10:12 , Thomas Biesinger wrote: > Hi, > > when I try to install NumPy on Mac OS X.1.5, it fails on this error: > > .... > cc -bundle -undefined suppress build/temp.darwin-5.5-Power Macintosh- > 2.1/_numpymodule.o build/temp.darwin-5.5-Power Macintosh-2.1/ > arrayobject.o build/temp.darwin-5.5-Power Macintosh-2.1/ufuncobject.o - > o build/lib.darwin-5.5-Power Macintosh-2.1/_numpy.so > /usr/bin/ld: -undefined error must be used when -twolevel_namespace is > in effect Thomas, as of MacOSX 10.1 the link step needs either the -flat_namespace option, or the -bundle_loader option. But: this has been fixed in both Python 2.2.1 and Python 2.3a0 (the CVS tree). Are you by any chance still running Python 2.2 (which predates OSX 10.1, and therefore two-level namespaces, and therefore the right linker invocations, which distutils reads from Python's own Makefile). If you're running 2.2: please upgrade and try again. If you're running 2.2.1 or later: let me know and I'll try and think of what questions I should ask you to debug this:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From paul at pfdubois.com Mon Jul 22 16:14:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Jul 22 16:14:03 2002 Subject: [Numpy-discussion] Numarray design announcement Message-ID: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> At numpy.sf.net you will find a posting from Perry Greenfield and I detailing the design decisions we have taken with respect to Numarray. What follows is the text of that message without the formatting. We ask for your understanding about those decisions that differ from the ones you might prefer. Numarray's Design Paul F. Dubois and Perry Greenfield Numarray is the new implementation of the Numeric Python extension. It is our intention that users will change as rapidly as possible to the new module when we decide it is ready. The present Numeric Python team will cease supporting Numeric after a short transition period. During recent months there has been a lot of discussion about Numarray and whether or not it should differ from Numeric in certain ways. We have reviewed this lengthy discussion and come to some conclusions about what we plan to do. The discussion has been valuable in that it took a whole new "generation" back through the considerations that the "founding fathers" debated when Numeric Python was designed. There are literally tens of thousands of Numerical Python users. These users may represent only a tiny percentage of potential users but they are real users today with real code that they have written, and breaking that code would represent real harm to real people. Most of the issues discussed recently were discussed at length when Numeric was first designed. Some decisions taken then represent a choice that was simply a choice among valid alternatives. Nevertheless, the choice was made, and to arbitrarily now make a different choice would be difficult to justify. 
In arguing about Python's indentation, we often see heart-felt arguments from opponents who have sincere reasons for feeling as they do. However, many of the pitfalls they point to do not seem to actually occur in real life very often. We feel the same way about many arguments about Numeric Python. The view / copy argument, for example, claims that beginners will make errors with view semantics. Well, some do, but not very often, and not twice. It is just one of many differences that users need to adapt to when learning an entity-object model such as Python's when they are used to variable semantics such as in Fortran or C. Similarly, we do not receive massive reports of confusion about differing default values for the axis keyword -- there was a rationale for the way it is now, and although one could propose a different rationale for a different choice, it would be just a choice. Decisions Numarray will have the same Python interface as Numeric except for the exceptions discussed below. 1. The Numarray C API includes a compatibility layer consisting of some of the members of the Numeric C API. For details on compatibility at the C level see http://telia.dl.sourceforge.net/sourceforge/numpy/numarray.pdf , pdf pages 78-81. Since no formal decision was ever made about what parts of the Numeric C header file were actually intended to be publicly available, do not expect complete emulation. Numarray's current view of arrays in C, using either native or emulation C-APIs, is that array data can be mutated, but array properties cannot. Thus, an existing Numeric extension function which tries to change the shape or strides of an array in C is more of a porting challenge, possibly requiring a python wrapper. Depending on what kind of optimization we do, this restriction might be lifted. For the Numeric extensions already ported to Numarray (RandomArray, LinearAlgebra, FFT), none of this was an issue. 2. Currently, if the result of an index operation x[i] results in a scalar result, the result is converted to a similar Python type. For example, the result of array([1,2,3])[1] is the Python integer 2. This will be changed so that the result of an index operation on a Numarray array is always a Numarray array. Scalar results will become rank-zero arrays (i.e., shape () ). 3. Currently, binary operations involving Numeric arrays and Python scalars uses the precision of the Python scalar to help determine the precision of the result. In Numarray, the precision of the array will have precedence in determining the precision of the outcome. Full details are available in the Numarray documention. 4. The Numarray version of MA will no longer have copy semantics on indexing but instead will be consistent with Numarray. (The decision to make MA differ in this regards was due to a need for CDAT to be backward compatible with a local variant of Numeric; the CDAT user community no longer feels this was necessary). Some explanation about the scalar change is in order. Currently, much coding in Numeric-based applications must be devoted to handling the fact that after an index operation, the programmer can not assume that the result is an array. So, what are the consequences of change? A rank-zero array will interact as expected with most other parts of Python. When it does not, the most likely result is a type error. For example, let x = array([1,2,3]). Then [1,2,3][x[0]] currently produces the result 2. 
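To make decisions 2 and 3 concrete, here is a small sketch run against current Numeric; the numarray side appears only in the comments and is taken from the decisions above, not from running numarray:

from Numeric import array, Float32

x = array([1, 2, 3])
print type(x[1])            # Numeric today: <type 'int'>, a Python scalar
print [10, 20, 30][x[1]]    # works, because x[1] is a plain Python int
# Under decision 2, x[1] becomes a rank-zero array with shape (), so code
# that hands it to things expecting a Python int may need attention.

a = array([1.0, 2.0], Float32)
b = a * 2.0
print b.typecode()          # Numeric today: 'd' (the Python float scalar wins)
# Under decision 3, the array's precision wins and the result stays Float32.

The type error mentioned next is exactly the list-indexing case in the second print, once the index is no longer a Python int.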
With the change, it would produce a type error unless a change is made to the Python core (currently under discussion). But x[x[0]] would still work because we have control of that. In short, we do not think this change will break much code and it will prevent the writing of more code that is either broken or difficult to write correctly. From pete at shinners.org Mon Jul 22 17:36:12 2002 From: pete at shinners.org (Pete Shinners) Date: Mon Jul 22 17:36:12 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: <3D3CA3B1.7010708@shinners.org> Paul F Dubois wrote: > Numarray's Design > Paul F. Dubois and Perry Greenfield a very nice design, for a lot of challenging decisions > Numarray's current view of arrays in C, using either native or > emulation C-APIs, is that array data can be mutated, but array > properties cannot. Thus, an existing Numeric extension function > which tries to change the shape or strides of an array in C is > more of a porting challenge, possibly requiring a python wrapper. i have a c extension that does this, but only during "creation time" of the array. i'm hoping there can be some way to do this from C. i need to create a new array from a block of numbers that aren't contiguous... /* roughly snipped code */ dim[0] = myimg->w; dim[1] = myimg->h; dim[2] = 3; /*r,g,b*/ array = PyArray_FromDimsAndData(3, dim, PyArray_UBYTE, startpixel); array->flags = OWN_DIMENSIONS|OWN_STRIDES; array->strides[2] = pixelstep; array->strides[1] = myimg->pitch; array->strides[0] = myimg->format->BytesPerPixel; array->base = myimg_object; note this data is image data, and i am "reorienting" it so that the first index is X and the second index is Y. plus i need to account for an image pitch, where the rows are not exactly the same width as the number of pixels. also, i am also changing the "base" field, since the data for this array lives inside another image object of course, once the array is created, i pass it off to the user and never touch these fields again, so perhaps something like this will work in the new numarray? if not, i'm eager to start my petition for a "PyArray_FromDimsAndDataAndStrides" function, and also a way to assign the "base" as well. i'm looking forward to the new numarray, looks very exciting. From biesingert at yahoo.com Mon Jul 22 23:54:03 2002 From: biesingert at yahoo.com (Thomas Biesinger) Date: Mon Jul 22 23:54:03 2002 Subject: [Numpy-discussion] Summary to NumPy on Mac OS 10.1.5 Message-ID: <20020723065343.73589.qmail@web14106.mail.yahoo.com> From e.maryniak at pobox.com Tue Jul 23 09:19:04 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 09:19:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug Message-ID: <200206261833.29702.e.maryniak@pobox.com> Dear crunchers, According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) (with /my/ emphasis): The seed() function takes two integers and sets the two seeds of the random number generator to those values. If the default values of 0 are used for /both/ x and y, then a seed is generated from the current time, providing a /pseudo-random/ seed. Note: in numarray, the RandomArray2 package is provided but it's description is not (yet) included in the numarray manual. I have some questions about this: 1. 
The implementation of seed(), which is, by the way, identical both in Numeric's RandomArray.py and numarray's RandomArray2.py seems to contradict it's usage description: ---cut--- def seed(x=0,y=0): """seed(x, y), set the seed using the integers x, y; Set a random one from clock if y == 0 """ if type (x) != IntType or type (y) != IntType : raise ArgumentError, "seed requires integer arguments." if y == 0: import time t = time.time() ndigits = int(math.log10(t)) base = 10**(ndigits/2) x = int(t/base) y = 1 + int(t%base) ranlib.set_seeds(x,y) ---cut--- Shouldn't the second 'if' be: if x == 0 and y == 0: With the current implementation: - 'seed(3)' will actually use the clock for seeding - it is impossible to specify 0's (0,0) as seed: it might be better to use None as default values? 2. With the current time.time() based default seeding, I wonder if you can call that, from a mathematical point of view, pseudo-random: ---cut--- $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> numarray.__version__ '0.3.5' >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 1027434978.406238 (102743, 4979) 1027434979.400319 (102743, 4980) 1027434980.400316 (102743, 4981) 1027434981.40031 (102743, 4982) 1027434982.400308 (102743, 4983) ---cut--- It is incremental, and if you use default seeding within one (1) second, you get the same seed: ---cut--- >>> for i in range(5): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(0.1) ... print ... 1027436537.066677 (102743, 6538) 1027436537.160303 (102743, 6538) 1027436537.260363 (102743, 6538) 1027436537.360299 (102743, 6538) 1027436537.460363 (102743, 6538) ---cut--- 3. I wonder what the design philosophy is behind the decision to use 'mathematically suspect' seeding as default behavior. Apart from the fact that it can hardly be called 'random', I also have the following problems with it: - The RandomArray2 module initializes with 'seed()' itself, too. Reload()'s of RandomArray2, which might occur outside the control of the user, will thus override explicit user's seeding. Or am I seeing ghosts here? - When doing repeated run's of one's neural net simulations that each take less than a second, one will get identical streams of random numbers, despite seed()'ing each time. Not quite what you would expect or want. - From a purist software engineering point of view, I don't think automagical default behavior is desirable: one wants programs to be deterministic and produce reproducible behavior/output. If you use default seed()'ing now and re-run your program/model later with identical parameters, you will get different output. In Eiffel, object attributes are always initialized, and you will almost never have irreproducible runs. I found that this is a good thing for reproducing ones bugs, too ;-) To summarize, my recommendation would be to use None default arguments and use, when no user arguments are supplied, a hard (built-in) seed tuple, like (1,1) or whatever. Sometimes a paper on a random number generator suggests seeds (like 4357 for the MersenneTwister), but of course, a good random number generator should behave well independently of the initial seed/seed-tuple. 
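Expressed as code, and as a sketch only rather than RandomArray2's actual implementation, that recommendation could look like the helper below; it just computes the pair instead of calling ranlib.set_seeds, so it reads independently of the module internals:

def choose_seeds(x=None, y=None):
    """Return an (x, y) seed pair following the recommendation above:
    both None -> use a fixed, documented default pair, so that bare
                 calls are reproducible,
    both given -> use them as-is, zeros included,
    only one given -> refuse, rather than silently ignoring it.
    In RandomArray2 the pair would then go to ranlib.set_seeds(x, y).
    """
    if x is None and y is None:
        x, y = 1, 1            # any fixed, documented pair would do
    elif x is None or y is None:
        raise ValueError("specify both seeds or neither")
    return x, y

With this rule a bare choose_seeds() is reproducible across runs, choose_seeds(0, 0) really does seed with zeros, and choose_seeds(3) raises an error instead of quietly falling back to the clock.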
I may be completely mistaken here (I'm not an expert on random number theory), but the random number generators (Ahrens, et. al) seem 'old'? After some studying, we decided to use the Mersenne Twister: http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html http://www.math.keio.ac.jp/~matumoto/emt.html PDF article: http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator", ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, January pp.3-30 1998 There are some Python wrappers and it has good performance as well. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Hail Caesar! We, who are about to dine, salad you. From jmiller at stsci.edu Tue Jul 23 11:56:04 2002 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 23 11:56:04 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <200206261833.29702.e.maryniak@pobox.com> Message-ID: <3D3DA67E.308@stsci.edu> Eric Maryniak wrote: >Dear crunchers, > >According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) >(with /my/ emphasis): > > The seed() function takes two integers and sets the two seeds > of the random number generator to those values. If the default > values of 0 are used for /both/ x and y, then a seed is generated > from the current time, providing a /pseudo-random/ seed. > >Note: in numarray, the RandomArray2 package is provided but it's >description is not (yet) included in the numarray manual. > >I have some questions about this: > >1. The implementation of seed(), which is, by the way, identical > both in Numeric's RandomArray.py and numarray's RandomArray2.py > seems to contradict it's usage description: > The 2 in RandomArray2 is there to support side-by-side testing with Numeric, not to imply something new and improved. The point of providing RandomArray2 is to provide a migration path for current Numeric users. To that end, RandomArray2 should be functionally identical to RandomArray. That should not, however, discourage you from writing a new and improved random number package for numarray. > > >---cut--- >def seed(x=0,y=0): > """seed(x, y), set the seed using the integers x, y; > Set a random one from clock if y == 0 > """ > if type (x) != IntType or type (y) != IntType : > raise ArgumentError, "seed requires integer arguments." > if y == 0: > import time > t = time.time() > ndigits = int(math.log10(t)) > base = 10**(ndigits/2) > x = int(t/base) > y = 1 + int(t%base) > ranlib.set_seeds(x,y) >---cut--- > > Shouldn't the second 'if' be: > > if x == 0 and y == 0: > > With the current implementation: > > - 'seed(3)' will actually use the clock for seeding > - it is impossible to specify 0's (0,0) as seed: it might be > better to use None as default values? > >2. With the current time.time() based default seeding, I wonder > if you can call that, from a mathematical point of view, > pseudo-random: > >---cut--- >$ python >Python 2.2.1 (#1, Jun 25 2002, 20:45:02) >[GCC 2.95.3 20010315 (SuSE)] on linux2 >Type "help", "copyright", "credits" or "license" for more information. > >>>>from numarray import * >>>>from RandomArray2 import * >>>>import time >>>>numarray.__version__ >>>> >'0.3.5' > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(1) >... print >... 
>1027434978.406238 >(102743, 4979) > >1027434979.400319 >(102743, 4980) > >1027434980.400316 >(102743, 4981) > >1027434981.40031 >(102743, 4982) > >1027434982.400308 >(102743, 4983) >---cut--- > > It is incremental, and if you use default seeding within > one (1) second, you get the same seed: > >---cut--- > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(0.1) >... print >... >1027436537.066677 >(102743, 6538) > >1027436537.160303 >(102743, 6538) > >1027436537.260363 >(102743, 6538) > >1027436537.360299 >(102743, 6538) > >1027436537.460363 >(102743, 6538) >---cut--- > >3. I wonder what the design philosophy is behind the decision > to use 'mathematically suspect' seeding as default behavior. > Using time for a seed is fairly common. Since it's an implementation detail, I doubt anyone would object if you can suggest a better default seed. > > Apart from the fact that it can hardly be called 'random', I also > have the following problems with it: > > - The RandomArray2 module initializes with 'seed()' itself, too. > Reload()'s of RandomArray2, which might occur outside the > control of the user, will thus override explicit user's seeding. > Or am I seeing ghosts here? > Overriding a user's explicit seed as a result of a reload sounds correct to me. All of the module's top level statements are re-executed during a reload. > > - When doing repeated run's of one's neural net simulations that > each take less than a second, one will get identical streams of > random numbers, despite seed()'ing each time. > Not quite what you would expect or want. > This is easy enough to work around: don't seed or re-seed. If you then need to make multiple simulation runs, make a separate module and call your simulation like: import simulation RandomArray2.seed(something_deterministic, something_else_deterministic) for i in range(number_of_runs): simulation.main() > > - From a purist software engineering point of view, I don't think > automagical default behavior is desirable: one wants programs to > be deterministic and produce reproducible behavior/output. > I don't know. I think by default, random numbers *should be* random. > > If you use default seed()'ing now and re-run your program/model > later with identical parameters, you will get different output. > When you care about this, you need to set the seed to something deterministic. > > In Eiffel, object attributes are always initialized, and you will > almost never have irreproducible runs. I found that this is a good > thing for reproducing ones bugs, too ;-) > This sounds like a good design principle, but I don't see anything in RandomArray2 which is keeping you from doing this now. > >To summarize, my recommendation would be to use None default arguments >and use, when no user arguments are supplied, a hard (built-in) seed >tuple, like (1,1) or whatever. > Unless there is a general outcry from the rest of the community, I think the (existing) numarray extensions (RandomArray2, LinearAlgebra2, FFT2) should try to stay functionally identical with Numeric. > >Sometimes a paper on a random number generator suggests seeds (like 4357 >for the MersenneTwister), but of course, a good random number generator >should behave well independently of the initial seed/seed-tuple. >I may be completely mistaken here (I'm not an expert on random number >theory), but the random number generators (Ahrens, et. al) seem 'old'? 
>After some studying, we decided to use the Mersenne Twister: > An array enabled version might make a good add-on package for numarray. > > > http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html > http://www.math.keio.ac.jp/~matumoto/emt.html > >PDF article: > > http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf > > M. Matsumoto and T. Nishimura, > "Mersenne Twister: A 623-dimensionally equidistributed uniform > pseudorandom number generator", > ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, > January pp.3-30 1998 > >There are some Python wrappers and it has good performance as well. > >Bye-bye, > >Eric > Bye, Todd From e.maryniak at pobox.com Tue Jul 23 13:03:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Tue Jul 23 13:03:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3DA67E.308@stsci.edu> References: <200206261833.29702.e.maryniak@pobox.com> <3D3DA67E.308@stsci.edu> Message-ID: <200207232202.04104.e.maryniak@pobox.com> On Tuesday 23 July 2002 20:54, Todd Miller wrote: > Eric Maryniak wrote: > >... > That should not, however, discourage you from writing a new and improved > random number package for numarray. Yes, thank you :-) > >... > >3. I wonder what the design philosophy is behind the decision > > to use 'mathematically suspect' seeding as default behavior. > > Using time for a seed is fairly common. Since it's an implementation > detail, I doubt anyone would object if you can suggest a better default > seed. Well, as said, a fixed seed, provided by the class implementation and therefore 'good', instead of a not-so-random 'random' seed. And imho it would be better not to (only) use the clock, but a /dev/random kinda thing. Personally, I find the RNG setup much more appealing: there the default is: standard_generator = CreateGenerator(-1) where seed < 0 ==> Use the default initial seed value. seed = 0 ==> Set a "random" value for the seed from the system clock. seed > 0 ==> Set seed directly (32 bits only). And indeed 'void Mixranf(int *s,u32 s48[2])' uses a built-in constant as initial seed value (actually, two). >... > > If you use default seed()'ing now and re-run your program/model > > later with identical parameters, you will get different output. > > When you care about this, you need to set the seed to something > deterministic. Naturally, but how do I know what a 'good' seed is (or indeed it's type, range, etc.)? I just would like, as e.g. RNG does, let the number generator take care of this... (or at least provide the option to) >... In the programs I've seen so far, including a lot of ours ahem, usually a program (simulation) is run multiple times with the same parameters and, in our case for neural nets, seeded each time with a clock generated seed and then the different simulations are compared and checked if they are similar or sensitive to chaotic influences. But I don't think this is the proper way to do this. My point is, I guess, that the sequence of these clock-generated seeds itself is not random, because (as for RandomArray) the generated numbers are clearly not random. Better, and reproducible, would be to start the first simulation with a supplied seed, get the seed and pickle after the first run and use the pickled seed for run 2 etc. or indeed have a kind of master script (as you suggest) that manages this. That way you would start with one seed only and are not re-seeding for each run. 
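That bookkeeping takes only a few lines. A sketch, assuming numarray's RandomArray2, assuming that get_seed() reports the generator's current two-integer state (as the interpreter sessions above suggest), and using a file name chosen purely for illustration:

import os
import pickle
import RandomArray2

SEED_FILE = "mc_seed.pkl"     # illustrative name only

# before the run: pick up where the previous run left off, if there was one
if os.path.exists(SEED_FILE):
    f = open(SEED_FILE, "rb")
    x, y = pickle.load(f)
    f.close()
    RandomArray2.seed(x, y)
else:
    RandomArray2.seed(1, 1)   # first run: one explicit, documented seed

# ... the simulation itself goes here ...

# after the run: record the generator state for the next run in the series
f = open(SEED_FILE, "wb")
pickle.dump(RandomArray2.get_seed(), f)
f.close()

Run 1 seeds explicitly and every later run picks up where the previous one stopped, so the whole series behaves like one long, reproducible stream; deleting the file starts a new series.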
Because if the clock-seeds are not truly random, you will a much greater change of cycles in your overall sequence of numbers. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. VME ERROR 37022: Hierarchic name syntax invalid taking into account starting points defined by initial context. From paul at pfdubois.com Tue Jul 23 13:14:05 2002 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jul 23 13:14:05 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207232202.04104.e.maryniak@pobox.com> Message-ID: <3D36139400005515@mta08.san.yahoo.com> RandomArray got a "special" position as part of Numeric simply by historical accident in being there first. I think in the conversion to Numarray we will be able to remove such things from the "core" and make more of a marketplace of equals for the "addons". As it is now there is some implication that somehow one is "better" than the other, which is unjustified either mathematically or in the sense of design. RNG's design is based on my experience with large codes needing many independent streams. The mathematics is from a well-tested Cray algorithm. I'm sure it could use fluffing up but a good case can be made for it. From gb at cs.unc.edu Tue Jul 23 14:24:03 2002 From: gb at cs.unc.edu (Gary Bishop) Date: Tue Jul 23 14:24:03 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? Message-ID: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> The example given for real_fft in the FFT section of the Sept 7, 2001 Numpy manual makes no sense to me. The text says >>> x = cos(arange(30.0)/30.0*2*pi) >>> print real_fft(x) [ -1. +0.j 13.69406641+2.91076367j -0.91354546-0.40673664j -0.80901699-0.58778525j -0.66913061-0.74314483j -0.5 -0.8660254j -0.30901699-0.95105652j -0.10452846-0.9945219j 0.10452846-0.9945219j 0.30901699-0.95105652j 0.5 -0.8660254j 0.66913061-0.74314483j 0.80901699-0.58778525j 0.91354546-0.40673664j 0.9781476 -0.20791169j 1. +0.j ] But surely x is a single cycle of a cosine wave and should have a very sensible and simple FT. Namely [0, 1, 0, 0, 0, ...] Indeed, running the example using Numeric and FFT produces, within rounding error, exactly what I would expect. Why the non-intuitive (and wrong) result in the example text? gb From dubois1 at llnl.gov Tue Jul 23 14:32:04 2002 From: dubois1 at llnl.gov (Paul Dubois) Date: Tue Jul 23 14:32:04 2002 Subject: [Numpy-discussion] Bug in Numpy FFT reference? In-Reply-To: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> References: <200207232123.g6NLN6bE004136@wren.cs.unc.edu> Message-ID: <1027459879.8212.2.camel@ldorritt> The person who wrote the manual cut and pasted from running the code. I think there was a bug in FFT at the time. (:-> On Tue, 2002-07-23 at 14:23, Gary Bishop wrote: > The example given for real_fft in the FFT section of the Sept 7, 2001 > Numpy manual makes no sense to me. The text says > > >>> x = cos(arange(30.0)/30.0*2*pi) > >>> print real_fft(x) > [ -1. +0.j 13.69406641+2.91076367j > -0.91354546-0.40673664j -0.80901699-0.58778525j > -0.66913061-0.74314483j -0.5 -0.8660254j > -0.30901699-0.95105652j -0.10452846-0.9945219j > 0.10452846-0.9945219j 0.30901699-0.95105652j > 0.5 -0.8660254j 0.66913061-0.74314483j > 0.80901699-0.58778525j 0.91354546-0.40673664j > 0.9781476 -0.20791169j 1. +0.j ] > > But surely x is a single cycle of a cosine wave and should have a very > sensible and simple FT. 
Namely [0, 1, 0, 0, 0, ...] > > Indeed, running the example using Numeric and FFT produces, within > rounding error, exactly what I would expect. > > Why the non-intuitive (and wrong) result in the example text? > > gb > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From e.maryniak at pobox.com Wed Jul 24 09:24:14 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 09:24:14 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D36139400005515@mta08.san.yahoo.com> References: <3D36139400005515@mta08.san.yahoo.com> Message-ID: <200207241823.42218.e.maryniak@pobox.com> On Tuesday 23 July 2002 22:15, paul at pfdubois.com wrote: > RandomArray got a "special" position as part of Numeric simply by > historical accident in being there first. I think in the conversion to > Numarray we will be able to remove such things from the "core" and make > more of a marketplace of equals for the "addons". As it is now there is > some implication that somehow one is "better" than the other, which is > unjustified either mathematically or in the sense of design. > > RNG's design is based on my experience with large codes needing many > independent streams. The mathematics is from a well-tested Cray algorithm. > I'm sure it could use fluffing up but a good case can be made for it. A famous quote from Linus is "Nice idea. Now show me the code." Perhaps a detailed example makes my problem clearer, because as it is now, RNG and RandomArray2 are not orthogonal in design, in the sense that RNG's default seed is fixed and RandomArray's is automagical (clock), not reproducible and mathematically suspect, which I think is not good for the more naive Python user. Below I will give intended usage in a provocative way, but please don't take me too seriously (I know, I don't ;-) Let's say you have a master shell script that runs a neural net paradigm (size 20x20) 10 times, each time with the same parameters, to see if it's stable or chaotic, i.e. does not 'converge' c.q. outcome depends on initial values (it should not be chaotic, but this should always be checked). run10.sh tracelink.py 20 20 inputpat.dat > hippocamp01.out ... 8 more ... tracelink.py 20 20 inputpat.dat > hippocamp10.out tracelink.py ... import numarray, RandomArray2 _or_ RNG ... # Case 1: RandomArray2 # User uses default clock seed, which is the same # during 1 second (see my previous posting). # ignlgi(void)'s seeds 1234567890L,123456789L # are _not_ used (see com.c). RandomArray2.seed() # But if omitted, RandomArray2.py does it, too. ... calculations ... other program outcome _only_ if program runs > 1 second, ... otherwise the others will have the same result. # Case 2: RNG # A 'standard_generator = CreateGenerator(-1)' is automatically done. # seed < 0 ==> Use the default initial seed value. # seed = 0 ==> Set a "random" value for the seed from system clock. # seed > 0 ==> Set seed directly (32 bits only). # Thus, the fixed seeds used are 0,0 (see Mixranf() in ranf.c). ... calculations ... all 10 programs have the same outcome when using ranf(), ... because it always starts the same seed, the sequence is always: ... 
0.58011364857958725, 0.95051273498076583, 0.78637142533060356 etc. The problem with RandomArray's seed is, that it is not truly random itself. In it's current (time.time based) implementation it is linearly auto incrementing every second, and therefore suffers from auto-correlation. Moreover, in the above example, if 10 separate .py runs complete in 1 second they'll all have the same seed (and outcome). This is not what the user, if accustomed to clock seeding, would expect. But if the seed is different each time, a problem is that runs are not reproducible. Let's say that run hippocamp06.out produced some strange output: now unless the user saved the seed (with get_seed), it can never be reproduced. Therefore, I think RNG's design is better and should be applied to RandomArray2, too, because RandomArray2's seeding is flawed anyways. A user should be aware of proper seeding, agreed, and now will be: when doing multiple identical runs, the same (and thus reproducible) output will result and so the user is made aware of the fact that, as an example, he or she should seed or pickle it between runs. So my suggestion would be to re-implement RandomArray2.seed(x=0,y=0) as follows: if either the x or y seed: seed < 0 ==> Use the default initial seed value. seed = None ==> Set a "random" value for the seed from the system clock. seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- But: I realize that this is different behavior from Python's standard random and whrandom, where no arg or None uses the clock. But, if that behavior is kept for RandomArray2 (and RNG should then be adapted, too) then I'd urge at least to use a better initial seed. In certain applications, e.g. generating session id's in crypto programs, non-predictability of initial seeds is crucial. But if you have a look at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks like an art in itself. So perhaps RNG's 'clock code' should replace RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will not have the 1-second problem. Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Just because you're not paranoid, that doesn't mean that they're not after you. From Chris.Barker at noaa.gov Wed Jul 24 10:01:06 2002 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Jul 24 10:01:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> Message-ID: <3D3ECEEE.6BAF4CC2@noaa.gov> Just to add my $.02: I disagree with Eric about what the default behaviour should be. Every programming language/environment I have ever used uses some kind of "random" seed by default. When I want reproducible results (which I often do for testing) I can specify a seed. I find the the most useful behaviour. 
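That usage pattern is short enough to show in full; a sketch against RandomArray2, with arbitrary seed values:

import RandomArray2

RandomArray2.seed(1234, 5678)     # explicit seed: reproducible stream
a = RandomArray2.random((2, 3))

RandomArray2.seed(1234, 5678)     # same seed again
b = RandomArray2.random((2, 3))

assert a.tolist() == b.tolist()   # the two streams are identical

RandomArray2.seed()               # no arguments: "random" clock-based seed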
As Eric points out, it is not trivial to generate a "random" seed (from the time, or whatever), so it doesn't make sense to burdon the nieve user with this chore. Therefore, I strongly support keeping the default behaviour of a "random" seed. Eric Maryniak wrote: > then I'd urge at least to use a better initial seed. > In certain applications, e.g. generating session id's in crypto programs, > non-predictability of initial seeds is crucial. But if you have a look > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it looks > like an art in itself. So perhaps RNG's 'clock code' should replace > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus will > not have the 1-second problem. This I agree with: a better default initial seed would be great. As someone said, "show me the code!". I don't imagine anyone would object to improving this. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From e.maryniak at pobox.com Wed Jul 24 10:29:02 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Wed Jul 24 10:29:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <3D3ECEEE.6BAF4CC2@noaa.gov> References: <3D36139400005515@mta08.san.yahoo.com> <200207241823.42218.e.maryniak@pobox.com> <3D3ECEEE.6BAF4CC2@noaa.gov> Message-ID: <200207241928.07366.e.maryniak@pobox.com> On Wednesday 24 July 2002 17:59, Chris Barker wrote: > Just to add my $.02: > > I disagree with Eric about what the default behaviour should be. Every > programming language/environment I have ever used uses some kind of > "random" seed by default. When I want reproducible results (which I > often do for testing) I can specify a seed. I find the the most useful > behaviour. As Eric points out, it is not trivial to generate a "random" > seed (from the time, or whatever), so it doesn't make sense to burdon > the nieve user with this chore. > > Therefore, I strongly support keeping the default behaviour of a > "random" seed. In that case, and if that is the general consensus, RNG should be adapted: it now uses a fixed seed by default (and not a clock generated one). > Eric Maryniak wrote: > > then I'd urge at least to use a better initial seed. > > In certain applications, e.g. generating session id's in crypto programs, > > non-predictability of initial seeds is crucial. But if you have a look > > at GPG's or OpenSSL's source for a PRNG (especially for Windows), it > > looks like an art in itself. So perhaps RNG's 'clock code' should replace > > RandomArray2's: it uses microseconds (in gettimeofday), too, and thus > > will not have the 1-second problem. > > This I agree with: a better default initial seed would be great. As > someone said, "show me the code!". I don't imagine anyone would object > to improving this. The source is in Mixranf(), file Numerical/Packages/RNG/Src/ranf.c (when checked out with CVS), but it may be a good idea to check it with Python's own random/whrandom code (which I don't have at hand -- it may be more recent and/or portable for other OSes). By the way, I realized in my code 'fix' for RandomArray2.seed(x=None,y=None) that I already anticipated this and that the default behavior is _not_ to use a fixed seed ;-) : if either the x or y seed: seed < 0 ==> Use the default initial seed value. 
seed = None ==> Set a "random" value for the seed from clock (default) seeds >= 0 ==> Set seed directly (32 bits only). and en-passant do a better job than clock-based seeding: ---cut--- def seed(x=None,y=None): """seed(x, y), set the seed using the integers x, y; ... """ if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) : raise ArgumentError, "seed requires integer arguments (or None)." if x == None or y == None: # This would be the best, but is problematic under Windows/Mac. import dev_random_device # uses /dev/random or equivalent x = dev_random_device.nextvalue() # egd.sf.net is a user space y = dev_random_device.nextvalue() # alternative # So best is to use Mixranf() from RNG/Src/ranf.c here. elif x < 0 or y < 0: x = 1234567890L y = 123456789L ranlib.set_seeds(x,y) ---cut--- Bye-bye, Eric -- Eric Maryniak WWW homepage: http://pobox.com/~e.maryniak/ Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL. Unix was a trademark of AT&T. AT&T is a modem test command. From peter.chang at nottingham.ac.uk Wed Jul 24 11:08:06 2002 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Wed Jul 24 11:08:06 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: Just to stick my oar in: I think Eric's preference is predicated by the lousiness (or otherwise?) of RandomArray's seeding mechanism. The random sequences generated by incremental seeds should, by design, be uncorrelated thus allowing the use of the system clock as a seed source. If you're running lots of simulations (as I do with Monte Carlos, though not in numpy) using PRNGs, the last thing you want is the task to find a (pseudo) random source of seeds. Using /dev/random is not particularly portable; the system clock is much easier to obtain and is fine as long as your iteration cycle is longer than its resolution. Peter From paul at pfdubois.com Wed Jul 24 23:09:02 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Jul 24 23:09:02 2002 Subject: [Numpy-discussion] Numarray: question on RandomArray2.seed(x=0, y=0) system clock default and possible bug In-Reply-To: <200207241928.07366.e.maryniak@pobox.com> Message-ID: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> I'm not going to change the default seed on RNG. Existing users have the right to stability, and not to have things change because someone thinks a certain choice among several reasonable ones is better than the one previously made. There is the further issue here of RNG being advertised as similar to Cray's ranf() and that similarity extends to this default. Not to mention that for many purposes the current default is quite useful. From e.maryniak at pobox.com Thu Jul 25 06:02:03 2002 From: e.maryniak at pobox.com (Eric Maryniak) Date: Thu Jul 25 06:02:03 2002 Subject: [Numpy-discussion] Numarray: Summary (seeding): personal code and manual suggestions on initial seeding in module RNG and RandomArray(2) In-Reply-To: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> References: <001201c233a1$a2616bc0$0a01a8c0@NICKLEBY> Message-ID: <200207251501.47126.e.maryniak@pobox.com> Dear crunchers, Please see my personal thoughts on the past discussion about initial seeds some paragraphs down below, where I'd like to list concrete code and manual enhancements aimed at providing users with a clear understanding of it's usage (and pitfalls e.g. w/r to cryptographic applications)... 
==> Suggestions for code and manual changes w/r to initial seeding (down below) But first a response to Paul's earlier message: On Thursday 25 July 2002 08:08, Paul F Dubois wrote: > I'm not going to change the default seed on RNG. Existing users have the > right to stability, and not to have things change because someone thinks > a certain choice among several reasonable ones is better than the one > previously made. Well, I wasn't aware of the fact that things were completely set in stone for Numarray solely for backward compatibilty. It was my impression that numarray and it's accompanying xx2 packages were also open for redesign. I agree stability is important, but numarray already breaks with Numeric in other aspects so why should RNG (RNG2 in numarray?) or other packages not be? It's more a matter of well documenting changes I think. Users switching to numarray will already have to take into account some changes and verify their code. It's not that I "think a certain choice among several reasonable ones is better" [although my favorite is still a fixed seed, as in RNG, for reasons of reproducibility in later re-runs of Monte Carlo's that are not possible now, because the naive user, using a clock seed, may not have saved the initial seed with get_seed], but that the different packages, i.c. RNG (RNG2 to be?) and RandomArray2, should be orthogonal in this respect. I.e. the same, so 'default always an automagical (clock whatever) random initial seed _or_ a fixed one'. Orthogonality is a very common and accepted design principle in computing science and for good reasons (usability). Users changing from one PRNG to another (and using the default seed) would otherwise be unwelcomely surprised by a sudden change in behavior of their program. I try to give logical arguments and real code examples in this discussion and fail to see in Paul's reaction where I'm wrong. By the way: in Python 2.1 alpha 2 seeding changed, too: """ - random.py's seed() function is new. For bit-for-bit compatibility with prior releases, use the whseed function instead. The new seed function addresses two problems: (1) The old function couldn't produce more than about 2**24 distinct internal states; the new one about 2**45 (the best that can be done in the Wichmann-Hill generator). (2) The old function sometimes produced identical internal states when passed distinct integers, and there was no simple way to predict when that would happen; the new one guarantees to produce distinct internal states for all arguments in [0, 27814431486576L). """ > There is the further issue here of RNG being advertised as similar to > Cray's ranf() and that similarity extends to this default. Not to > mention that for many purposes the current default is quite useful. Perhaps I'm mistaken here, but RNG/Lib/__init__.py does (-1 -> uses fixed internal seed): standard_generator = CreateGenerator(-1) and: def ranf(): "ranf() = a random number from the standard generator." return standard_generator.ranf() And indeed Mixranf in RNG/Src/ranf.c does set them to 0: ... if(*s < 0){ /* Set default initial value */ s48[0] = s48[1] = 0; Setranf(s48); Getranf(s48); And this code, or I'm missing the point, uses a standard generator from RNG, which demonstrates the same sequence of initial seeds in re-runs (note that it does not suffer from the "1-second problem" as RandomArray2 does, see the Appendix below for a demonstration of that, because RNG uses milliseconds). 
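For comparison, RNG already lets a user pick either behaviour per generator; a sketch using only the CreateGenerator() semantics quoted above:

import RNG

fixed = RNG.CreateGenerator(-1)    # seed < 0: built-in default seed, so a
                                   # fresh interpreter repeats the same stream
clocked = RNG.CreateGenerator(0)   # seed == 0: seed taken from the system clock
chosen = RNG.CreateGenerator(42)   # seed > 0: explicit, reproducible seed

print fixed.ranf(), clocked.ranf(), chosen.ranf()

The module-level standard_generator is simply the first of these, which is why every fresh session repeats 0.58011364857958725 and so on.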
Note that 'ranf()' is listed in chapter 18 in Module RNG as one of the 'Generator objects': $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RNG import * >>> for i in range(3): ... standard_generator.ranf() ... 0.58011364857958725 0.95051273498076583 0.78637142533060356 >>> Ok, now then my own (and possibly biased) personal summary of the past discussions and concrete code and manual recommendations: ==> Suggestions for code and manual changes w/r to initial seeding Conclusions: 1. Default initial seeding should be random (and not fixed). This is the general consensus and while it may not win the beauty contest in purist software engineering circles, it also is the default behavior in Python's own Random/WHRandom modules. URL: http://web.pydoc.org/2.2/random.html => Recommendations: - Like Python's random/whrandom module, default arguments to seed() should not be 0, but None, and this triggers the default behavior which is to use a random initial seed (ideally 'truly' random from e.g. /dev/random or otherwise clock or whatever based), because: o better usability: users changing from Python's own random to numarray's random facilities will find familiar seed() usage semantics o often 0 itself can be a legal seed (although the MersenneTwister does not recommend it) - Like RNG provide support for using a built-in fixed seed by supplying negative seeds to seed(), rationale: o support for reproducible re-runs of Monte Carlo's without having to specify ones own initial seed o usability: naive users may not know a 'good' seed is, like: can it be 0 or must it be >0, what is the maximum, etc. - See my suggested code fix for RandomArray2.seed() in the Appendix below. - Likewise, in RNG: o CreateGenerator (s, ...) should be changed to CreateGenerator (s=None) Also note Python's own: def create_generators(num, delta, firstseed=None) from random (random.py), url: http://web.pydoc.org/2.2/random.html o RNG's code should be changed from testing on 0 to testing on None first (which results in using the clock), then on < 0 (use built-in seed), and then using the user provided seed (which is thus >= 0, and hence can also be 0) o 'standard_generator = CreateGenerator(-1)' should be changed to 'standard_generator = CreateGenerator() and results in using the clock - Put some explicit warnings in the numarray manual, that the seeding of numarray's packages should _not_ be used in those parts of software where unpredictability of seeds is important, such as for example, cryptographical software for creating session keys, TCP sequence numbers etc. Attacks on crypto software usually center around these issues. Ideally, a /dev/random should be used, but with the current system clock based implementation, the seeds are not random, because the clock does not have deci-nanosecond precision (10**10 ~= 2**32) yet ;-) Appendix -------- ** 1. "1-second problem" with RandomArray2: $ python Python 2.2.1 (#1, Jun 25 2002, 20:45:02) ... >>> from numarray import * >>> from RandomArray2 import * >>> import time >>> import sys >>> sys.version '2.2.1 (#1, Jun 25 2002, 20:45:02) \n[GCC 2.95.3 20010315 (SuSE)]' >>> numarray.__version__ '0.3.5' >>> for i in range(3): ... time.time() ... RandomArray2.seed() ... RandomArray2.get_seed() ... time.sleep(1) ... print ... 
1027591910.9043469
(102759, 1911)

1027591911.901091
(102759, 1912)

1027591912.901088
(102759, 1913)

>>> for i in range(3):
...     time.time()
...     RandomArray2.seed()
...     RandomArray2.get_seed()
...     time.sleep(0.3)
...     print
... 
1027591966.260392
(102759, 1967)

1027591966.5510809
(102759, 1967)

1027591966.851079
(102759, 1967)

Note that Python's own random.seed() (at least in 2.2.1) suffers much less
from this: on my 450 MHz i586 the seed only repeats within a tenth of a
millisecond or so. For the implementation of seed() see Lib/random.py;
basically 'long(time.time() * 256)' is used:

$ python
Python 2.2.1 (#1, Jun 25 2002, 20:45:02)
...
>>> from random import *
>>> import time
>>> for i in range(3):
...     print long(time.time() * 256)
... 
263065231349
263065231349
263065231349
>>> for i in range(3):
...     print long(time.time() * 256)
...     time.sleep(.00001)
... 
263065240314
263065240315
263065240317

2. Proposed re-implementation of RandomArray2.seed():

def seed(x=None,y=None):
    """seed(x, y), set the seed using the integers x, y:

    x or y is None (or not specified):
        A random seed is used, which in the current implementation may be
        based on the system's clock. Warning: do not use this seeding in
        software where the initial seed must not be predictable, such as,
        for example, cryptographic software for creating session keys.
    x < 0 or y < 0:
        Use the module's fixed built-in seed, which is the tuple
        (1234567890L, 123456789L) (or whatever).
    x >= 0 and y >= 0:
        Use the seeds specified by the user. (Note: some random number
        generators do not recommend using 0.)

    Note: based on Python 2.2.1's random.seed(a=None), ADAPTED for _2_
    seeds as required by ranlib.set_seeds(x,y).
    """
    if (x != None and type (x) != IntType) or (y != None and type (y) != IntType) :
        raise ArgumentError, "seed requires integer arguments (or None)."
    if x == None or y == None:
        try:
            # This would be the best, but is problematic under Windows/Mac.
            # To my knowledge there isn't a portable lib_randdevice yet.
            # As GPG, OpenSSH and OpenSSL's code show, getting entropy
            # under Windows is problematic.
            # However, Python 2.2.1's socketmodule does wrap the ssl code.
            import dev_random_device           # uses /dev/random or equivalent
            x = dev_random_device.nextvalue()  # egd.sf.net is a user space
            y = dev_random_device.nextvalue()  # alternative
        except:
            # Use Mixranf() from RNG/Src/ranf.c here or, perhaps better,
            # use Python 2.2.1's code? At least it looks simpler, does not
            # have the platform dependencies, and has possibly seen wider
            # testing (and why not re-use code? ;-)
            # For Python 2.2.1's random.seed(a=None), see url:
            #   http://web.pydoc.org/2.2/random.html
            # and file Lib/random.py.
            # Do note, however, that on my 450 MHz machine, the statement
            # 'long(time.time() * 256)' will generate the same values
            # within a tenth of a millisecond (see Appendix #1 for a code
            # example). This can be fixed by doing a time.sleep(0.001).
            # See my #EM# comment.
            # Naturally this code needs to be adapted for ranlib's
            # generator, because this code uses the Wichmann-Hill generator.
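            # Another possibility (a sketch, not part of the original
            # proposal): mix the clock with os.getpid() and a module-level
            # counter, so that two seeds taken within the same clock tick
            # still differ and no sleep is needed, e.g.:
            #
            #     import os, time
            #     _seed_counter = [0]
            #     def _clock_seed():
            #         _seed_counter[0] = _seed_counter[0] + 1
            #         return long(time.time() * 256) + os.getpid() + \
            #                _seed_counter[0]
            #
            # (os.getpid() is available on both Unix and Windows.)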
---cut: Wichmann-Hill---
    def seed(self, a=None):
        """Initialize internal state from hashable object.

        None or no argument seeds from current time.

        If a is not None or an int or long, hash(a) is used instead.

        If a is an int or long, a is used directly. Distinct values
        between 0 and 27814431486575L inclusive are guaranteed to yield
        distinct internal states (this guarantee is specific to the
        default Wichmann-Hill generator).
        """

        if a is None:
            # Initialize from current time
            import time
            a = long(time.time() * 256)
            #EM# Guarantee unique a's between subsequent calls of seed()
            #EM# by sleeping one millisecond. This should not be harmful,
            #EM# because ordinarily, seed() will only be called once or so
            #EM# in a program.
            time.sleep(0.001)

        if type(a) not in (type(3), type(3L)):
            a = hash(a)

        a, x = divmod(a, 30268)
        a, y = divmod(a, 30306)
        a, z = divmod(a, 30322)
        self._seed = int(x)+1, int(y)+1, int(z)+1
---cut: Wichmann-Hill---

    elif x < 0 or y < 0:
        x = 1234567890L   # or any other suitable 0 - 2**32-1
        y = 123456789L
    ranlib.set_seeds(x,y)

3. Mersenne Twister, another PRNG:

Bye-bye,
Eric

-- 
Eric Maryniak
WWW homepage: http://pobox.com/~e.maryniak/
Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL.

In a grocery store, the Real Programmer is the one who insists on running
the cans past the laser checkout scanner himself, because he never could
trust keypunch operators to get it right the first time.

From aureli at ipk.fhg.de  Thu Jul 25 09:51:06 2002
From: aureli at ipk.fhg.de (Aureli Soria Frisch)
Date: Thu Jul 25 09:51:06 2002
Subject: [Numpy-discussion] index method for array objects?
In-Reply-To: 
References: <20020621133705.A15296@idi.ntnu.no>
Message-ID: 

Hi all,

Has someone implemented a function for arrays that behaves like the
index(*) method for lists (it should then take something like a tolerance
parameter)?

I suppose it could maybe be done with array.tolist() and list.index(), but
has someone implemented something more elegant/array-based?

Thanks in advance

Aureli

PD: (*) index receives a value as an argument and returns the index of the
list member equal to this value...

-- 
#################################
Aureli Soria Frisch

Fraunhofer IPK
Dept. Pattern Recognition

post: Pascalstr. 8-9, 10587 Berlin, Germany
e-mail: aureli at ipk.fhg.de
fon: +49 30 39006-143
fax: +49 30 3917517

web: http://vision.fhg.de/~aureli/web-aureli_en.html
#################################

From jmiller at stsci.edu  Thu Jul 25 10:15:03 2002
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jul 25 10:15:03 2002
Subject: [Numpy-discussion] index method for array objects?
References: <20020621133705.A15296@idi.ntnu.no>
Message-ID: <3D4031C2.3090607@stsci.edu>

Aureli Soria Frisch wrote:

> Hi all,
>
> Has someone implemented a function for arrays that behaves like the
> index(*) method for lists (it should then take something like a
> tolerance parameter)?
>
> I suppose it could maybe be done with array.tolist() and list.index(),
> but has someone implemented something more elegant/array-based?
>
> Thanks in advance
>
> Aureli
>
> PD: (*) index receives a value as an argument and returns the index of
> the list member equal to this value...

I think the basics of what you're looking for are something like:

def index(a, b, eps):
    return nonzero(abs(a-b) < eps)

which should return all indices at which the absolute difference between
the elements of a and b is less than eps.
e.g.: >>> import Numeric >>> index(Numeric.arange(10,20), 15, 1e-5) array([5]) Todd -- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 From magnus at hetland.org Thu Jul 25 12:12:11 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:12:11 2002 Subject: [Numpy-discussion] Spectral approximation/DFT Message-ID: <20020725211111.A27670@idi.ntnu.no> Hi! Sorry to ask what is probably a really clueless question -- if there are any obvious sources of information about this, I'd be happy to go there and find this out for myself... :] Anyway; I'm trying to produce a graph to illustrate a time sequence indexing method, which relies on extracting the low-frequent Fourier coefficients and indexing a vector consisting of those. The graph should contain the original time sequence, and one reconstructed from the Fourier coefficients. Since it is reconstructed from only the low-frequent coefficients (perhaps 10-20 coefficients), it will look wavy and sinus'y. Now... I'm no expert in signal processing (or the specifics of FFT/DFT etc.), and I can't seem to make the FFT module do exactly what I want here... It seems that using fft(seq).real extracts the coefficients I'm after (though I'm not sure whether the imaginary components ought to figure in the equation somehow...) But no matter how I use inverse_fft or inverse_real_fft it seems I have to supply a number of coefficients equal to the sequence I want to approximate -- otherwise there will be a huge offset between them. Why is this so? Shouldn't the first coefficient take care of such an offset? Perhaps inverse_fft isn't doing what I think it is? If I haven't expressed myself clearly, I'd be happy to elaborate... (For those who might be interested, the approach is described in the paper found at http://citeseer.nj.nec.com/307308.html with a figure of the type I'm trying to produce at page 5.) Anyway, thanks for any help :) -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From magnus at hetland.org Thu Jul 25 12:16:21 2002 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu Jul 25 12:16:21 2002 Subject: [Numpy-discussion] A probable solution... Message-ID: <20020725211534.A27914@idi.ntnu.no> After posting to the list (sorry about that ;) a possible solution occurred to me... To get an approximation, I used fft(seq, 10) and then inverted that using inverse_fft(signature, 100)... I guess that fouled up the scale of things -- when I use fft(seq, 100)[:10] to get the signature, it seems that everything works just fine... Even though this _seems_ to do the right thing, I just wanted to make sure that I'm not doing something weird here... -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org From a.schmolck at gmx.net Thu Jul 25 15:18:04 2002 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu Jul 25 15:18:04 2002 Subject: [Numpy-discussion] Numarray design announcement References: <000001c231d5$6afe4900$0b01a8c0@NICKLEBY> Message-ID: "Paul F Dubois" writes: > > During recent months there has been a lot of discussion about Numarray > and whether or not it should differ from Numeric in certain ways. We > have reviewed this lengthy discussion and come to some conclusions about > what we plan to do. The discussion has been valuable in that it took a > whole new "generation" back through the considerations that the > "founding fathers" debated when Numeric Python was designed. [...] 
> Decisions
>
> Numarray will have the same Python interface as Numeric except for the
> exceptions discussed below.
[...]
> 2. Currently, if the result of an index operation x[i] results in a
> scalar result, the result is converted to a similar Python type. For
> example, the result of array([1,2,3])[1] is the Python integer 2. This
> will be changed so that the result of an index operation on a Numarray
> array is always a Numarray array. Scalar results will become rank-zero
> arrays (i.e., shape () ).
>
[...]
>
> 4. The Numarray version of MA will no longer have copy semantics on
> indexing but instead will be consistent with Numarray. (The decision to
> make MA differ in this regards was due to a need for CDAT to be backward
> compatible with a local variant of Numeric; the CDAT user community no
> longer feels this was necessary).
[...]

As one of the people who argued for interface changes in numarray (mainly
copy semantics for slicing), let me say that I welcome this announcement,
which clarifies many issues. Although I still believe that copy behavior
would be preferable in principle, I think that continuity and backwards
compatibility with Numeric is a sufficient reason to stick to the old
behavior (now that numarray strives to be largely compatible) [1]. In a
similar vein I also greatly welcome the change to view semantics in MA,
because I feel that internal consistency is vital.

Apart from being a heavy Numeric user, these interface issues are also
quite important to me because I have been working for some time on a
fully-featured matrix [2] class which I wanted to be both

a) compatible with Numeric and numarray (so that it would ideally make no
   difference to the user which of the 2 libraries he'd be using as a
   "backend" to the matrix class).

b) consistent in usage with numarray's interface wherever feasible (i.e.
   not too much of a compromise on usability).

This turned out to be much more of a hassle than I would have anticipated,
because, contrary to what the compatibility section of the manual seemed
to suggest, I found numarray to be incompatible in a variety of ways (even
making it impossible to write *forward* compatible code without writing
additional wrapping functions). Just as an example, there was no simple
way that would work across both versions to do something as common as
creating e.g. an int array (both parameter names and positions differ):

 Numeric (21):       array(sequence, typecode=None, copy=1, savespace=0)
 numarray (0.3.3?) : array(buffer=None, shape=None, type=None)

As for b), this obviously turned out to be a moving target, but I hope
that the final shape of things is now getting reasonably clear, and I'm
now, for example, determined to have view slicing behavior for my matrix
class, too.

Nonetheless, for me a few issues still remain. Most importantly, numarray
doesn't provide the same degree of polymorphism as Numeric. One of the
chief reasons given for basing Numeric's design around functions rather
than methods is that it enables greater generality (e.g. allowing one to
``sum`` over all sorts of sequence types). Consequently the role of
methods and attributes was largely limited to functionality that only made
sense for array objects, plus special methods. This is more than just a
neat convenience -- because of the resulting polymorphism it is easy to
write fairly general code and to define new kinds of numeric classes that
can seamlessly be passed to Numeric functions (e.g. one can also ``sum``
Matrix objects).
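(To make the point concrete, here is a small sketch -- not code from the
original post -- of the kind of polymorphism meant above: any object that
can produce an array representation of itself via an __array__ method can
be handed straight to Numeric's functions. The Celsius class and its
values are made up purely for illustration.)

    import Numeric

    class Celsius:
        "Toy container that knows how to present itself as an array."
        def __init__(self, values):
            self.values = list(values)
        def __array__(self, typecode=None):
            # Numeric's array-conversion machinery falls back on this.
            return Numeric.array(self.values, 'd')

    temps = Celsius([21.5, 19.0, 23.5])
    print Numeric.sum(temps)          # 64.0 -- sum() never sees the class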
I find it highly undesirable that numarray apparently doesn't follow this
design rationale, and that the division of labour between functions and
methods/attributes has been blurred (or so it appears to me -- maybe this
is some lack of insight on my part). That numarray versions before 0.3.4
were missing functions such as ``shape`` (which is also quite handy for
other sequence types) was largely an inconvenience, but the fact that
numarray functions generally only operate on scalars, ``tuple``s and
``list``s (apart, obviously, from numarray arrays) is in my eyes a
significant shortcoming. In contrast, Numeric functions would operate on
any type that had an __array__ method to return an array representation of
itself. The explicit type checking that numarray uses (via constructs à la
type(a) == types.ListType) flies in the face of standard Python
sensibilities, places arbitrary limits on the kinds of objects that
numarray users can conveniently work with, and puts a significant hurdle
in the way of creating new kinds of numerical objects. For example, the
design of my matrix class depends on the fact that Numeric functions also
accept objects with __array__ methods (such as my matrix class). Even if I
invested the substantial amount of work that would be needed to redesign a
less general version that wouldn't rely on this property, one of the key
virtues of my class -- namely the ability to transparently replace Numeric
arrays in most cases where they are used as matrices -- would be lost.
These two reasons would presumably be sufficient for me not to switch to
numarray if I can at all avoid it, so I really hope that numarray will
also grow an __array__ protocol or something equivalent.

This is the only point that is really vital to me, but there are others
that I'd rather see reconsidered. As I said, I liked the division of labor
between functions and methods/attributes in Numeric and the motivations
behind it, as far as I understand them. numarray arrays, however, have
grown methods like ``argsort`` and ``diagonal`` that seem somewhat
unmotivated to me (and some of which cause problems for my matrix class).
Similarly, why is there, e.g., a ``.rank`` attribute but a ``.type()``
method? If anything one would expect type to be an attribute and rank a
method, since the type is actually some intrinsic property that needs to
be stored (and could even plausibly be assigned to, with results like an
``astype`` call), whereas ``size`` and ``rank`` have no "real" existence:
they are only computed from the shape, and modifying them makes no sense.
TMTOWTDI is the road to perl, so I'd really prefer to avoid duplicate
functionality à la ``rank(a)`` and ``a.rank`` and generally reserve
attributes and methods for array-specific functionality.

One area where TMTOWTDI seems to have run amok (several ways to do
something, but IMHO all broken) is flattened representations of arrays.
All these expressions aim to produce a flattened version of ``a``:

  ``ravel(a)``, ``a.ravel()``, ``a.getflat()``/ ``a.flat``

`Aim` in this context is some sort of euphemism -- the only one for which
it is possible to determine at compile time that it will do anything apart
from raising an exception is ``ravel(a)`` -- not that one could know
*what* it will do before the code is actually run (return a flattened copy
of a or a flattened view), but never mind. Yuck. I think this really needs
fixing (deprecating, rather than removing or changing incompatibly, where
felt necessary).
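(For anyone who wants to check what a given flattening spelling actually
does in their installation, a quick test -- a sketch, not from the
original post; the point here is the test itself, since the answer differs
between spellings and between Numeric and numarray:)

    import Numeric

    def flatten_shares_data(flatten):
        "Return true if flatten(a) is a view onto a rather than a copy."
        a = Numeric.zeros((3, 4), 'd')
        f = flatten(a)
        f[0] = 99.0                   # poke the flattened result...
        return a[0][0] == 99.0        # ...and see whether a changed too

    print flatten_shares_data(Numeric.ravel)
    print flatten_shares_data(lambda a: a.flat)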
Something else, which I consider less important: is it really necessary to
have both 'type' and 'typecode'? Wouldn't it be enough to just stick with
typecode, along the following lines (potentially issuing deprecation
warnings where appropriate):

  a.typecode() returns a type object (e.g. Float32).

  array([1,2,3], typecode=Float32) behaves the same as
  array([1,2,3], typecode='d')

Float32 etc. are already defined in Numeric, so it's easy to write
forward-compatible code, and although hunting down instances of

  if a.typecode() == 'd':

presumably wouldn't be that difficult, incompatibility could most likely
almost be eliminated by making ``Float32 == 'd'`` return true. Sticking to
the old name typecode also has the advantage that it is fairly unique and
unambiguous (just try grep'ing for type vs. typecode).

I must say that, apart from the switch to type objects, I don't fully
understand the differences between the numeric types in old Numeric and
numarray, or the motivation behind them. As far as I can see, the emphasis
with Numeric was to stay flexible with respect to different hardware and
increasing word sizes (i.e. to only guarantee minimum precision) and to
provide some reasonable "default" size for each type (e.g. `Float` being
double precision [3]). This approach is maybe somewhat similar to the
Python core (floats and ints can have different sizes, depending on the
underlying platform). In numarray the emphasis seems to have shifted to
guaranteeing the actual size in memory (if in a few years' time most
calculations are done with 128-bit precision, then that's maybe not such a
good idea, but I have no clue how likely this is to happen). Is this shift
of emphasis also responsible for the decision to have indexing operations
always return arrays rather than scalars (including ones defined by
numarray in cases where there is no plain-Python equivalent)? Will all
other functions (e.g. min) continue to return scalars? [BTW can anyone
explain to me the difference between Int and Int32 (typecodes 'i' and
'l')?]

Anyway, my apologies if I come across as too negative or if some of the
points are misinformed. I really think that the recent changes to numarray
and this announcement are a great step forward toward a smooth transition
of the whole community from Numeric to numarray, which will play an
important role in consolidating Python's role in scientific computing.

night,

alex

Footnotes: 
[1]  I think it might be beneficial, however, to add an explicit note to
     the manual that alerts users to the fact that small slices can keep
     alive very large arrays, because I am under the impression that this
     is not immediately obvious to everyone and can cause puzzling
     problems.

[2]  I moaned on this list some months ago that doing linear algebra with
     Numeric arrays was often cumbersome and inefficient (and the Matrix
     class that already comes with Numeric is rather limited). My
     (currently alpha) matrix class attempts to address these issues and
     also provides much more flexible 'pluggable' output formatting
     (matlab-like, amongst others, which I guess many people will find
     much more readable; but the standard array-like formatting is also
     available).

[3]  As an aside: maybe ``type="Float"`` in numarray should therefore
     *not* be equivalent to ``type=Float32`` but to ``type=Float64``,
     given that these strings seem to just be there for backwards
     compatibility?
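(An illustrative aside, not from the original post: the signature clash
quoted earlier is exactly the sort of thing a small wrapper has to paper
over today. Whether numarray's ``type`` argument accepts the same values
as Numeric's ``typecode`` is part of the open question, which is why the
fallback below is so blunt:)

    def typed_array(seq, t):
        "Create an array of element type t with whichever package is in use."
        try:
            return array(seq, typecode=t)   # Numeric spelling
        except TypeError:
            return array(seq, type=t)       # numarray spelling

    a = typed_array([1, 2, 3], Float32)     # assumes `array` and `Float32`
                                            # come from the package's
                                            # `from ... import *`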
-- 
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/

From victor at idaccr.org  Tue Jul 30 06:43:06 2002
From: victor at idaccr.org (Victor S. Miller)
Date: Tue Jul 30 06:43:06 2002
Subject: [Numpy-discussion] Sparse matrices
Message-ID: 

I had noticed that Travis Oliphant had a sparse.py package, but it no
longer is available (clicking on the link gives a "404"). I have a
particular kind of sparse matrix that I'd like to use for vector-matrix
multiplies. In particular, it's an n x n matrix which has at most k
non-zeros in each row (k is small, usually 2 or 3), and they sit in
consecutive locations. I have this encoded as an n x k matrix whose i-th
row gives the non-zero values in the i-th row of the big matrix, plus an
n-long vector of indices whose i-th element gives the starting position in
the i-th row. I want to multiply this matrix by a row vector v on the
left. To do the multiplication I do the following:

# loc is the location vector
n = matrix.shape[0]
mm = reshape(v,(-1,1))*matrix
w = zeros((n+m),v.typecode())
for i in range(mm.shape[0]):
    w[loc[i]:loc[i]+matrix.shape[1]] += w[i]
w = w[:n]

I would like to be able to replace the loop with some Numeric operations.
Is there a trick to do this? Note that the n that I'm using is around
100000, so storing the full matrix is out of the question (and multiplying
by that matrix would be extremely inefficient, anyway).

-- 
	Victor S. Miller     | " ... Meanwhile, those of us who can compute can hardly
	victor at idaccr.org    | be expected to keep writing papers saying 'I can do the
	CCR, Princeton, NJ   | following useless calculation in 2 seconds', and indeed
	    08540 USA        | what editor would publish them?"  -- Oliver Atkin

From victor at idaccr.org  Tue Jul 30 08:29:06 2002
From: victor at idaccr.org (Victor S. Miller)
Date: Tue Jul 30 08:29:06 2002
Subject: [Numpy-discussion] Sparse matrices
In-Reply-To: (victor@idaccr.org's message of "Tue, 30 Jul 2002 09:42:13 -0400")
References: 
Message-ID: 

Sorry, I had a typo in the program.  It should be:

# M is n by k, and represents a sparse n by n matrix A
# the non-zero entries of row i of A start in column loc[i]
# and are the i-th row of M in locations loc[i]:loc[i]+k
# loc is the location vector
n,k = M.shape
mm = reshape(v,(-1,1))*M
w = zeros((n+m),v.typecode())
# is there a trick to replace the loop below?
for i in range(mm.shape[0]):
    w[loc[i]:loc[i]+k] += mm[i]
w = w[:n]

-- 
	Victor S. Miller     | " ... Meanwhile, those of us who can compute can hardly
	victor at idaccr.org    | be expected to keep writing papers saying 'I can do the
	CCR, Princeton, NJ   | following useless calculation in 2 seconds', and indeed
	    08540 USA        | what editor would publish them?"  -- Oliver Atkin

From jochen at unc.edu  Tue Jul 30 09:24:02 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Tue Jul 30 09:24:02 2002
Subject: [Numpy-discussion] Sparse matrices
In-Reply-To: 
References: 
Message-ID: 

On Tue, 30 Jul 2002 09:42:13 -0400 Victor S Miller wrote:

Victor> I had noticed that Travis Oliphant had a sparse.py package,
Victor> but it no longer is available (clicking on the link gives a
Victor> "404").

It's part of scipy now.

Greetings,
Jochen
-- 
University of North Carolina                   phone: +1-919-962-4403
Department of Chemistry                        phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)              fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                     GnuPG key: 44BCCD8E
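(An editorial footnote on Victor's loop, not from the thread itself: the
usual trick is to loop over the k columns instead of the n rows. The
sketch below additionally assumes loc[i] == i -- i.e. the band of row i
starts on the diagonal -- and takes the original's ``n+m`` scratch length
to be ``n+k``; row i then contributes v[i]*M[i,j] to w[i+j], so the
length-n Python loop collapses into k slice additions:)

    from Numeric import reshape, zeros

    def banded_vecmat(v, M):
        "Compute v*A for the n x k band encoding M, assuming loc[i] == i."
        n, k = M.shape
        mm = reshape(v, (-1, 1)) * M        # mm[i,j] = v[i] * M[i,j]
        w = zeros(n + k, mm.typecode())
        for j in range(k):                  # k is 2 or 3, so this is cheap
            w[j:j+n] = w[j:j+n] + mm[:, j]
        return w[:n]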