From tim.hochberg at ieee.org Thu May 1 00:32:01 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 30 Apr 2008 21:32:01 -0700 Subject: [Numpy-discussion] untenable matrix behavior in SVN In-Reply-To: References: <48174FF5.50005@noaa.gov> Message-ID: On Wed, Apr 30, 2008 at 8:16 PM, Anne Archibald wrote: > 2008/4/30 Charles R Harris : > > > Some operations on stacks of small matrices are easy to get, for > instance, > > +,-,*,/, and matrix multiply. The last is the interesting one. If A and B > > are stacks of matrices with the same number of dimensions with the > matrices > > stored in the last two indices, then > > > > sum(A[...,:,:,newaxis]*B[...,newaxis,:,:], axis=-2) > > > > is the matrix-wise multiplication of the two stacks. If B is replaced by > a > > stack of 1D vectors, x, it is even simpler: > > > > sum(A[...,:,:]*x[...,newaxis,:], axis=-1) > > > > This doesn't go through BLAS, but for large stacks of small matrices it > > might be even faster than BLAS because BLAS is kinda slow for small > > matrices. > > Yes and no. For the first operation, you have to create a temporary > that is larger than either of the two input arrays. These invisible > (potentially) gigantic temporaries are the sort of thing that puzzle > users when as their problem size grows they suddenly find they hit a > massive slowdown because it starts swapping to disk, and then a > failure because the temporary can't be allocated. This is one reason > we have dot() and tensordot() even though they can be expressed like > this. (The other is of course that it lets us use optimized BLAS.) > > > This rather misses the point of Timothy Hochberg's suggestion (as I > understood it), though: yes, you can write the basic operations in > numpy, in a more or less efficient fashion. But it would be very > valuable for arrays to have some kind of metadata that let them keep > track of which dimensions represented simple array storage and which > represented components of a linear algebra object. Such metadata could > make it possible to use, say, dot() as if it were a binary ufunc > taking two matrices. That is, you could feed it two arrays of > matrices, which it would broadcast to the same shape if necessary, and > then it would compute the elementwise matrix product. > > The question I have is, what is the right mathematical model for > describing these > arrays-some-of-whose-dimensions-represent-linear-algebra-objects? > > > One idea is for each dimension to be flagged as one of "replication", > "vector", or "covector". A column vector might then be a rank-1 vector > array, a row vector might be a rank-1 covector array, a linear > operator might be a rank-2 object with one covector and one vector > dimension, a bilinear form might be a rank-2 object with two covector > dimensions. Dimensions designed for holding repetitions would be > flagged as such, so that (for example) an image might be an array of > shape (N,M,3) of types ("replication","replication","vector"); then to > apply a color-conversion matrix one would simply use dot() (or "*" I > suppose). without too much concern for which index was which. The > problem is, while this formalism sounds good to me, with a background > in differential geometry, if you only ever work in spaces with a > canonical metric, the distinction between vector and covector may seem > peculiar and be unhelpful. > > Implementing such a thing need not be too difficult: start with a new > subclass of ndarray which keeps a tuple of dimension types. 
Come up > with an adequate set of operations on them, and implement them in > terms of numpy's functions, taking advantage of the extra information > about each axis. A few operations that spring to mind: > > * Addition: it doesn't make sense to add vectors and covectors; raise > an exception. Otherwise addition is always elementwise anyway. (How > hard should addition work to match up corresponding dimensions?) > * Multiplication: elementwise across "repetition" axes, it combines > vector axes with corresponding covector axes to get some kind of > generalized matrix product. (How is "corresponding" defined?) > * Division: mostly doesn't make sense unless you have an array of > scalars (I suppose it could compute matrix inverses?) > * Exponentiation: very limited (though I suppose matrix powers could > be implemented if the shapes are right) > * Change of basis: this one is tricky because not all dimensions need > come from the same vector space > * Broadcasting: the rules may become a bit byzantine... > * Dimension metadata fiddling > > Is this a useful abstraction? It seems like one might run into trouble > when dealing with arrays whose dimensions represent vectors from > unrelated spaces. Thanks Anne. That is exactly what I had in mind. Alas, every time I sit down to try to prototype some code, it collapses under its own weight. I'm becoming warmer to an extended version of the row/col/matrix idea just because its simpler to understand. It would be just as you describe above but would support only four cases: 1. scalar: all axes are 'replication' axes 2. vector: last axes is a 'vector' all other replication. 3. covector: last axes is a 'covector' all other replication 4. matrix: last two axes are covector and vector respectively. Others are replication. Or something like that. It's basically the row/col/matrix formulation with stacking. I suspect in practice it gives us most of the power of the full proposal with less complexity (for the user anyway). Then again, if the implementation turns out not to be that bad, one could always implement some convenience types on top of it to make it usable for the masses with the fully general version available for the stout of heart. Regards, -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists.20.chth at xoxy.net Thu May 1 00:37:18 2008 From: lists.20.chth at xoxy.net (ctw) Date: Thu, 1 May 2008 00:37:18 -0400 Subject: [Numpy-discussion] ndarray subclassing Message-ID: Hi! I ran into some strange (at least to me) issues with sublasses of ndarray. The following minimal class definition illustrates the problem: ==================================================== import numpy as np class TestArray(np.ndarray): def __new__(cls, data, info=None, dtype=None, copy=False): subarr = np.array(data, dtype=dtype, copy=copy) subarr = subarr.view(cls) return subarr def __array_finalize__(self,obj): print "self: ",self.shape print "obj: ",obj.shape ===================================================== When I run this code interactively with IPython and then generate TestArray instances, __array_finalize__ seems to get called when printing out arrays with more than 1 dimension and self.shape seems to drop a dimension. Everything works fine if the array has just 1 dimension: In [3]: x = TestArray(np.arange(5)) self: (5,) obj: (5,) In [4]: x Out[4]: TestArray([0, 1, 2, 3, 4]) This is all expected behavior. 
However things change when the array is 2-D: In [5]: x = TestArray(np.zeros((2,3))) self: (2, 3) obj: (2, 3) In [6]: x Out[6]: self: (3,) obj: (2, 3) self: (3,) obj: (2, 3) TestArray([[ 0., 0., 0.], [ 0., 0., 0.]]) Now when printing out the array, __array_finalize__ seems to get called twice and each time self seems to only refer to one row of the array. Can anybody explain what is going on and why? This behavior seems to lead to problems when the __array_finalize__ method performs checks on the shape of the array. In the matrix class this seems to be circumvented with a special _getitem flag that bypasses the shape checks in __array_finalize__ and an analogous solution works for my class, too. However, I'm still puzzled by this behavior and am hoping that somebody here can shed some light on it. Thanks! CTW From oliphant at enthought.com Thu May 1 00:41:22 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 30 Apr 2008 23:41:22 -0500 Subject: [Numpy-discussion] recarray fun In-Reply-To: <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> Message-ID: <481949F2.5040506@enthought.com> St?fan van der Walt wrote: > 2008/4/30 Christopher Barker : > >> St?fan van der Walt wrote: >> > That's the way, or just rgba_image.view(numpy.int32). >> >> ah -- interestingly, I tried: >> >> rgba_image.view(dtype=numpy.int32) >> >> and got: >> >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: view() takes no keyword arguments >> >> Since it is optional, shouldn't it be keyword argument? >> > > Thanks, fixed in r5115. > > This was too hasty. I had considered this before. The problem with this is that the object can be either a type object or a data-type object. You can use view to both re-cast a numpy array as another subtype or as another data-type. So, please revert the change until a better solution is posted. -Travis From charlesr.harris at gmail.com Thu May 1 00:41:27 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 30 Apr 2008 22:41:27 -0600 Subject: [Numpy-discussion] untenable matrix behavior in SVN In-Reply-To: References: <48174FF5.50005@noaa.gov> Message-ID: On Wed, Apr 30, 2008 at 9:16 PM, Anne Archibald wrote: > 2008/4/30 Charles R Harris : > > > Some operations on stacks of small matrices are easy to get, for > instance, > > +,-,*,/, and matrix multiply. The last is the interesting one. If A and > B > > are stacks of matrices with the same number of dimensions with the > matrices > > stored in the last two indices, then > > > > sum(A[...,:,:,newaxis]*B[...,newaxis,:,:], axis=-2) > > > > is the matrix-wise multiplication of the two stacks. If B is replaced by > a > > stack of 1D vectors, x, it is even simpler: > > > > sum(A[...,:,:]*x[...,newaxis,:], axis=-1) > > > > This doesn't go through BLAS, but for large stacks of small matrices it > > might be even faster than BLAS because BLAS is kinda slow for small > > matrices. > > Yes and no. For the first operation, you have to create a temporary > that is larger than either of the two input arrays. These invisible > (potentially) gigantic temporaries are the sort of thing that puzzle > users when as their problem size grows they suddenly find they hit a > massive slowdown because it starts swapping to disk, and then a > failure because the temporary can't be allocated. 
This is one reason > we have dot() and tensordot() even though they can be expressed like > this. (The other is of course that it lets us use optimized BLAS.) > But it is interesting that you can multiply stacks of matrices that way, is it not? I haven't seen it mentioned elsewhere. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu May 1 00:45:02 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 30 Apr 2008 23:45:02 -0500 Subject: [Numpy-discussion] ndarray subclassing In-Reply-To: References: Message-ID: <48194ACE.3020103@enthought.com> ctw wrote: > Hi! > > I ran into some strange (at least to me) issues with sublasses of > ndarray. The following minimal class definition illustrates the > problem: > > ==================================================== > > import numpy as np > class TestArray(np.ndarray): > def __new__(cls, data, info=None, dtype=None, copy=False): > subarr = np.array(data, dtype=dtype, copy=copy) > subarr = subarr.view(cls) > return subarr > > def __array_finalize__(self,obj): > print "self: ",self.shape > print "obj: ",obj.shape > > ===================================================== > > When I run this code interactively with IPython and then generate > TestArray instances, __array_finalize__ seems to get called when > printing out arrays with more than 1 dimension and self.shape seems to > drop a dimension. Everything works fine if the array has just 1 > dimension: > > In [3]: x = TestArray(np.arange(5)) > self: (5,) > obj: (5,) > > In [4]: x > Out[4]: TestArray([0, 1, 2, 3, 4]) > > This is all expected behavior. > However things change when the array is 2-D: > > In [5]: x = TestArray(np.zeros((2,3))) > self: (2, 3) > obj: (2, 3) > > In [6]: x > Out[6]: self: (3,) > obj: (2, 3) > self: (3,) > obj: (2, 3) > > TestArray([[ 0., 0., 0.], > [ 0., 0., 0.]]) > You are just seeing the result of __repr__. The printing code works by accessing slices of the array. These slices create new instances of your TestArray class which have a smaller number of dimensions. That's all. Is the printing code causing you other kinds of problems? -Travis From Chris.Barker at noaa.gov Thu May 1 01:00:26 2008 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Wed, 30 Apr 2008 22:00:26 -0700 Subject: [Numpy-discussion] recarray fun In-Reply-To: <481949F2.5040506@enthought.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> Message-ID: <48194E6A.8080403@noaa.gov> Travis E. Oliphant wrote: > St?fan van der Walt wrote: >> 2008/4/30 Christopher Barker : >>> Since it is optional, shouldn't it be keyword argument? >>> >> Thanks, fixed in r5115. >> >> > This was too hasty. I had considered this before. > > The problem with this is that the object can be either a type object or > a data-type object. You can use view to both re-cast a numpy array as > another subtype or as another data-type. Is the issue here that this is a slightly different meaning than the "dtype" argument everywhere else? Frankly, that doesn't bother me -- it is a superset of the functionality, is it not? Though I guess I don't really understand quite what the difference is between a subtype and a data-type. Or a type object vs. a datatype object. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From lists.20.chth at xoxy.net Thu May 1 01:15:55 2008 From: lists.20.chth at xoxy.net (ctw) Date: Thu, 1 May 2008 01:15:55 -0400 Subject: [Numpy-discussion] ndarray subclassing In-Reply-To: References: Message-ID: On Thu, May 1, 2008, Travis E. Oliphant wrote: > You are just seeing the result of __repr__. The printing code works by > accessing slices of the array. These slices create new instances of > your TestArray class which have a smaller number of dimensions. That's all. Ahh, that makes sense. Thanks so much for the quick reply! > Is the printing code causing you other kinds of problems? The problem was that my __array_finalize__ code did some checking on the shape of the array. In short, one of the attributes of my class is a list that has one entry for each dimension. __array_finalize__ checks to make sure this list length and the number of dimensions match and it throws an exception if they don't. This leads to an exception being thrown whenever I try to print the contents of an array with more than 1 dimensions, because I haven't yet implemented a __getitem__ method that adjusts this attribute when slices are taken. I didn't realize that this was what's going on and was very puzzled by the results. It looks like I can make things work by using _getitem flag as is done in the matrix class. Thanks again for clearing this up! I think it would be great if somebody with write access to the wiki could make a note of this on the sublasses page: http://www.scipy.org/Subclasses CTW From peridot.faceted at gmail.com Thu May 1 01:39:51 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 1 May 2008 01:39:51 -0400 Subject: [Numpy-discussion] recarray fun In-Reply-To: <481949F2.5040506@enthought.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> Message-ID: 2008/5/1 Travis E. Oliphant : > St?fan van der Walt wrote: > > 2008/4/30 Christopher Barker : > > > >> St?fan van der Walt wrote: > >> > That's the way, or just rgba_image.view(numpy.int32). > >> > >> ah -- interestingly, I tried: > >> > >> rgba_image.view(dtype=numpy.int32) > >> > >> and got: > >> > >> Traceback (most recent call last): > >> File "", line 1, in > >> TypeError: view() takes no keyword arguments > >> > >> Since it is optional, shouldn't it be keyword argument? > >> > > > > Thanks, fixed in r5115. > > > > > This was too hasty. I had considered this before. > > The problem with this is that the object can be either a type object or > a data-type object. You can use view to both re-cast a numpy array as > another subtype or as another data-type. > > So, please revert the change until a better solution is posted. If we're going to support keyword arguments, shouldn't those two options be *different* keywords (dtype and ndarray_subclass, say)? Then a single non-keyword argument tells numpy to guess which one you wanted... 
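For concreteness, the two cases being disambiguated look like this (a small throwaway example; the array contents don't matter):

import numpy as np
a = np.zeros((2, 2), dtype=np.float32)
b = a.view(np.int32)    # a data-type object: reinterpret the same bytes as int32
m = a.view(np.matrix)   # a type object: same data, re-cast as an ndarray subclass
print b.dtype, type(m)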
Anne From stefan at sun.ac.za Thu May 1 04:57:43 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 1 May 2008 10:57:43 +0200 Subject: [Numpy-discussion] recarray fun In-Reply-To: <481949F2.5040506@enthought.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> Message-ID: <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> 2008/5/1 Travis E. Oliphant : > St?fan van der Walt wrote: > > 2008/4/30 Christopher Barker : > > > >> St?fan van der Walt wrote: > >> > That's the way, or just rgba_image.view(numpy.int32). > >> > >> ah -- interestingly, I tried: > >> > >> rgba_image.view(dtype=numpy.int32) > >> > >> and got: > >> > >> Traceback (most recent call last): > >> File "", line 1, in > >> TypeError: view() takes no keyword arguments > >> > >> Since it is optional, shouldn't it be keyword argument? > >> > > > > Thanks, fixed in r5115. > > > > > This was too hasty. I had considered this before. > > The problem with this is that the object can be either a type object or > a data-type object. You can use view to both re-cast a numpy array as > another subtype or as another data-type. > > So, please revert the change until a better solution is posted. OK, I see your point. I'm working on a patch that does the following: def view(type_or_dtype=None, dtype=None, type=None): if type_or_dtype: if dtype: raise ValueError("Cannot specify dtype twice") if type: raise ValueError("Cannot specify type twice") if isinstance(type_or_dtype,py_type): type = type_or_dtype if isinstance(type_or_dtype,numpy_dtype): dtype = type_or_dtype return x.view(type=type).view(dtype=dtype) Would that be a satisfying solution? I'll be back around 21:00 SAST to attend to the matter. Regards St?fan From oliphant at enthought.com Thu May 1 10:03:32 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 01 May 2008 09:03:32 -0500 Subject: [Numpy-discussion] recarray fun In-Reply-To: <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> Message-ID: <4819CDB4.1020209@enthought.com> St?fan van der Walt wrote: > 2008/5/1 Travis E. Oliphant : > >> St?fan van der Walt wrote: >> > 2008/4/30 Christopher Barker : >> > >> >> St?fan van der Walt wrote: >> >> > That's the way, or just rgba_image.view(numpy.int32). >> >> >> >> ah -- interestingly, I tried: >> >> >> >> rgba_image.view(dtype=numpy.int32) >> >> >> >> and got: >> >> >> >> Traceback (most recent call last): >> >> File "", line 1, in >> >> TypeError: view() takes no keyword arguments >> >> >> >> Since it is optional, shouldn't it be keyword argument? >> >> >> > >> > Thanks, fixed in r5115. >> > >> > >> This was too hasty. I had considered this before. >> >> The problem with this is that the object can be either a type object or >> a data-type object. You can use view to both re-cast a numpy array as >> another subtype or as another data-type. >> >> So, please revert the change until a better solution is posted. >> > > OK, I see your point. 
I'm working on a patch that does the following: > > def view(type_or_dtype=None, dtype=None, type=None): > if type_or_dtype: > if dtype: > raise ValueError("Cannot specify dtype twice") > if type: > raise ValueError("Cannot specify type twice") > > if isinstance(type_or_dtype,py_type): > type = type_or_dtype > > if isinstance(type_or_dtype,numpy_dtype): > dtype = type_or_dtype > > return x.view(type=type).view(dtype=dtype) > > Would that be a satisfying solution? I'll be back around 21:00 SAST > to attend to the matter. > Yes, I think that would work. You need to do some checking for type=None and dtype=None as well, though. That way, the first argument would continue to work as now but be labeled correctly, but it would also support dtype= and type= keywords. -Travis From Chris.Barker at noaa.gov Thu May 1 15:20:01 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 01 May 2008 12:20:01 -0700 Subject: [Numpy-discussion] recarray fun In-Reply-To: <4819CDB4.1020209@enthought.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> <4819CDB4.1020209@enthought.com> Message-ID: <481A17E1.5000605@noaa.gov> Travis E. Oliphant wrote: >> def view(type_or_dtype=None, dtype=None, type=None): > Yes, I think that would work. Is there a way to deprecate this for future API-incompatible versions? It's better than non keywords, but a bit ugly. Maybe we should have a Wiki page for "stuff we'd like to change, but won't until major API breakage is otherwise occurring" -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stefan at sun.ac.za Thu May 1 15:29:17 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 1 May 2008 21:29:17 +0200 Subject: [Numpy-discussion] recarray fun In-Reply-To: <4819CDB4.1020209@enthought.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> <4819CDB4.1020209@enthought.com> Message-ID: <9457e7c80805011229o7fcfad75wad201018966c0de7@mail.gmail.com> 2008/5/1 Travis E. Oliphant : > > OK, I see your point. I'm working on a patch that does the following: > > > > def view(type_or_dtype=None, dtype=None, type=None): > > if type_or_dtype: > > if dtype: > > raise ValueError("Cannot specify dtype twice") > > if type: > > raise ValueError("Cannot specify type twice") > > > > if isinstance(type_or_dtype,py_type): > > type = type_or_dtype > > > > if isinstance(type_or_dtype,numpy_dtype): > > dtype = type_or_dtype > > > > return x.view(type=type).view(dtype=dtype) > > > > Would that be a satisfying solution? I'll be back around 21:00 SAST > > to attend to the matter. > > > > Yes, I think that would work. You need to do some checking for > type=None and dtype=None as well, though. > > That way, the first argument would continue to work as now but be > labeled correctly, but it would also support dtype= and type= keywords. Please review http://projects.scipy.org/scipy/numpy/changeset/5117. 
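With keywords in place, the calls from this thread would read roughly like the following (an illustrative sketch of the intended usage; exactly how combinations of the two keywords are handled is what the changeset has to settle):

import numpy as np
rgba_image = np.zeros((10, 10, 4), dtype=np.uint8)
as_int = rgba_image.view(dtype=np.int32)   # pack the four uint8 bands into one int32
as_mat = np.eye(3).view(type=np.matrix)    # same data, re-cast to the matrix subclass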
Thanks St?fan From lxander.m at gmail.com Thu May 1 15:33:54 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Thu, 1 May 2008 15:33:54 -0400 Subject: [Numpy-discussion] very simple iteration question. In-Reply-To: <20080430190930.GG1497@phare.normalesup.org> References: <9a771bf70804300111s749747b8we8a4393a3c68fd02@mail.gmail.com> <4818A2C1.5010208@noaa.gov> <4818C128.7030908@noaa.gov> <20080430190930.GG1497@phare.normalesup.org> Message-ID: <525f23e80805011233v719febfcp21e52fe1cd414efa@mail.gmail.com> On Wed, Apr 30, 2008 at 3:09 PM, Gael Varoquaux wrote: > On Wed, Apr 30, 2008 at 11:57:44AM -0700, Christopher Barker wrote: > > I think I still like the idea of an iterator (or maybe making rollaxis a > > method?), but this works pretty well. > > Generally, in object oriented programming, you expect a method like > rollaxis to modify an object inplace. At least that would be my > expectation. Ga?l, I'm not sure where you learned this expectation, as I don't think (but I could be wrong -- so educate me!) it is universal for OOP nor encouraged for Python in particular. OOP promotes encapsulation by providing mechanisms for objects to publish an interface and thereby hide the details of their inner workings, but I don't think there is a common OOP philosophy that disallows methods returning views and requires all transformations be performed in-place. A design philosophy like you've espoused is tractable in languages that support overloading, like C++, but it is not tractable in Python (at least not cleanly within the design of the base language). How do you create a function that returns a "flat" iterator of a container generically? In C++, each container would overload the flat function. In Python, your only hope is to ensure that the flat function only requires a standardized interface to a container object, but that would likely depend on the standardized interface including some low level iteration method from which to build more complex iterators. Even in C++ where such a design philosophy is possible at some level, objects often expose const iterators of various sorts as methods. Finally, Python has plenty of counter-examples to the maxim: all the string methods because strings are designed to be immutable, set object methods (even for mutable sets), various iterators for dicts, etc. On this topic. I would love to see numpy evolve a moderately generic array interface so that we could write stand-alone functions that work generically with ndarrays, as well as "masked arrays" and "sparse arrays" so that there was "one dot-product to rule them all", so to speak. Right now, you can't use functions like numpy.dot on a numpy.ma.MaskedArray, for example. Regards, Alex From oliphant at enthought.com Thu May 1 15:55:56 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 01 May 2008 14:55:56 -0500 Subject: [Numpy-discussion] recarray fun In-Reply-To: <9457e7c80805011229o7fcfad75wad201018966c0de7@mail.gmail.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> <4819CDB4.1020209@enthought.com> <9457e7c80805011229o7fcfad75wad201018966c0de7@mail.gmail.com> Message-ID: <481A204C.6060907@enthought.com> > Please review http://projects.scipy.org/scipy/numpy/changeset/5117. > Stefan, I don't think we really need the dtype_or_type keyword. 
It seems that we could just check the first argument (dtype) to see if it is a subtype of the ndarray and assume that it is type= in that case. -Travis From aisaac at american.edu Thu May 1 16:02:22 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 1 May 2008 16:02:22 -0400 Subject: [Numpy-discussion] recarray fun In-Reply-To: <481A17E1.5000605@noaa.gov> References: <4818B1F7.904@noaa.gov><9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com><4818D16F.7090502@noaa.gov><9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com><481949F2.5040506@enthought.com><9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com><4819CDB4.1020209@enthought.com> <481A17E1.5000605@noaa.gov> Message-ID: On Thu, 01 May 2008, Christopher Barker apparently wrote: > Maybe we should have a Wiki page for "stuff we'd like to change, but > won't until major API breakage is otherwise occurring" Perhaps would suffice? Cheers, Alan From oliphant at enthought.com Thu May 1 16:03:17 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 01 May 2008 15:03:17 -0500 Subject: [Numpy-discussion] recarray fun In-Reply-To: <9457e7c80805011229o7fcfad75wad201018966c0de7@mail.gmail.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> <4819CDB4.1020209@enthought.com> <9457e7c80805011229o7fcfad75wad201018966c0de7@mail.gmail.com> Message-ID: <481A2205.7040105@enthought.com> St?fan van der Walt wrote: > 2008/5/1 Travis E. Oliphant : > >> > OK, I see your point. I'm working on a patch that does the following: >> > >> > def view(type_or_dtype=None, dtype=None, type=None): >> > if type_or_dtype: >> > if dtype: >> > raise ValueError("Cannot specify dtype twice") >> > if type: >> > raise ValueError("Cannot specify type twice") >> > >> > if isinstance(type_or_dtype,py_type): >> > type = type_or_dtype >> > >> > if isinstance(type_or_dtype,numpy_dtype): >> > dtype = type_or_dtype >> > >> > return x.view(type=type).view(dtype=dtype) >> > >> > Would that be a satisfying solution? I'll be back around 21:00 SAST >> > to attend to the matter. >> > >> >> Yes, I think that would work. You need to do some checking for >> type=None and dtype=None as well, though. >> >> That way, the first argument would continue to work as now but be >> labeled correctly, but it would also support dtype= and type= keywords. >> > > Please review http://projects.scipy.org/scipy/numpy/changeset/5117. > > Check out http://projects.scipy.org/scipy/numpy/changeset/5119 -Travis From stefan at sun.ac.za Thu May 1 18:36:26 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 2 May 2008 00:36:26 +0200 Subject: [Numpy-discussion] recarray fun In-Reply-To: <481A2205.7040105@enthought.com> References: <4818B1F7.904@noaa.gov> <9457e7c80804301220p327be027sb0845b3e125fe31a@mail.gmail.com> <4818D16F.7090502@noaa.gov> <9457e7c80804301511g51b43d56p70c57f2f12ddf516@mail.gmail.com> <481949F2.5040506@enthought.com> <9457e7c80805010157t637bb639x51cb37d4d2067682@mail.gmail.com> <4819CDB4.1020209@enthought.com> <9457e7c80805011229o7fcfad75wad201018966c0de7@mail.gmail.com> <481A2205.7040105@enthought.com> Message-ID: <9457e7c80805011536q7e15deb8s10ed7a018a81104b@mail.gmail.com> 2008/5/1 Travis E. Oliphant : > St?fan van der Walt wrote: > > > > 2008/5/1 Travis E. 
Oliphant : > > > >> > OK, I see your point. I'm working on a patch that does the following: > >> > > >> > def view(type_or_dtype=None, dtype=None, type=None): > >> > if type_or_dtype: > >> > if dtype: > >> > raise ValueError("Cannot specify dtype twice") > >> > if type: > >> > raise ValueError("Cannot specify type twice") > >> > > >> > if isinstance(type_or_dtype,py_type): > >> > type = type_or_dtype > >> > > >> > if isinstance(type_or_dtype,numpy_dtype): > >> > dtype = type_or_dtype > >> > > >> > return x.view(type=type).view(dtype=dtype) > >> > > >> > Would that be a satisfying solution? I'll be back around 21:00 SAST > >> > to attend to the matter. > >> > > >> > >> Yes, I think that would work. You need to do some checking for > >> type=None and dtype=None as well, though. > >> > >> That way, the first argument would continue to work as now but be > >> labeled correctly, but it would also support dtype= and type= keywords. > >> > > > > Please review http://projects.scipy.org/scipy/numpy/changeset/5117. > > > > > Check out > > http://projects.scipy.org/scipy/numpy/changeset/5119 I think that's fine. It doesn't support weird combinations like x.view(np.matrix,dtype=np.int32) but people probably shouldn't do that anyway. Cheers St?fan From hoytak at gmail.com Thu May 1 21:49:52 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Thu, 1 May 2008 18:49:52 -0700 Subject: [Numpy-discussion] types as functions convert 1 elm arrays to scalars In-Reply-To: References: <4db580fd0804282328t167bda30v4a220a92c1cd32a0@mail.gmail.com> Message-ID: <4db580fd0805011849l5da35c47h7c9e2585f57e719a@mail.gmail.com> To be honest, this doesn't seem justifiable. Where it got me is interfacing with c-code that expected a 1d array, and I was calling it with arrays of varying length. I was using this to ensure the proper typing; however, when the array was length 1, the program crashed... Should I file a bug report? --Hoyt On Mon, Apr 28, 2008 at 11:51 PM, Charles R Harris wrote: > > > > On Tue, Apr 29, 2008 at 12:28 AM, Hoyt Koepke wrote: > > Hello, > > > > I have a quick question that I'm hoping will improve my numpy > > understanding. I noticed some behavior when using float64 to convert > > a matrix type that I didn't expect: > > > > > > In [35]: b1 = array([1.0]) > > > > In [36]: float64(b1) > > Out[36]: 1.0 > > > > In [37]: b2 = array([1.0, 2.0]) > > > > In [38]: float64(b2) > > Out[38]: array([ 1., 2.]) > > > > > > I didn't expect calling float64 would convert b1 to a scalar. Seems > > like an inconsistency. I assume this is intentional, as someone would > > have noticed it a long time ago if not, so could someone explain the > > reasoning behind it? (or point me to a source that will help?) > > > > It's inconsistent and looks like a bug: > > In [4]: float32(array([[[1]]])) > Out[4]: array([[[ 1.]]], dtype=float32) > > In [5]: float64(array([[[1]]])) > Out[5]: 1.0 > > Float64 is a bit special because it starts as the python float. Maybe Travis > can say what the differences are. 
> > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From charlesr.harris at gmail.com Thu May 1 22:00:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 1 May 2008 20:00:18 -0600 Subject: [Numpy-discussion] types as functions convert 1 elm arrays to scalars In-Reply-To: <4db580fd0805011849l5da35c47h7c9e2585f57e719a@mail.gmail.com> References: <4db580fd0804282328t167bda30v4a220a92c1cd32a0@mail.gmail.com> <4db580fd0805011849l5da35c47h7c9e2585f57e719a@mail.gmail.com> Message-ID: On Thu, May 1, 2008 at 7:49 PM, Hoyt Koepke wrote: > To be honest, this doesn't seem justifiable. > > Where it got me is interfacing with c-code that expected a 1d array, > and I was calling it with arrays of varying length. I was using this > to ensure the proper typing; however, when the array was length 1, the > program crashed... > > Should I file a bug report? > > I already did, it's ticket #764. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoytak at gmail.com Thu May 1 22:11:30 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Thu, 1 May 2008 19:11:30 -0700 Subject: [Numpy-discussion] types as functions convert 1 elm arrays to scalars In-Reply-To: References: <4db580fd0804282328t167bda30v4a220a92c1cd32a0@mail.gmail.com> <4db580fd0805011849l5da35c47h7c9e2585f57e719a@mail.gmail.com> Message-ID: <4db580fd0805011911p414e7118pfefcb6606ae4a52c@mail.gmail.com> Okay, thanks! I didn't check. --Hoyt On Thu, May 1, 2008 at 7:00 PM, Charles R Harris wrote: > > > > On Thu, May 1, 2008 at 7:49 PM, Hoyt Koepke wrote: > > To be honest, this doesn't seem justifiable. > > > > Where it got me is interfacing with c-code that expected a 1d array, > > and I was calling it with arrays of varying length. I was using this > > to ensure the proper typing; however, when the array was length 1, the > > program crashed... > > > > Should I file a bug report? > > > > > > I already did, it's ticket #764. > > Chuck > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From Chris.Barker at noaa.gov Fri May 2 00:58:34 2008 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Thu, 01 May 2008 21:58:34 -0700 Subject: [Numpy-discussion] Broadcasting question Message-ID: <481A9F7A.5040101@noaa.gov> Hi all, I have a n X m X 3 array, and I a n X M array. I want to assign the values in the n X m to all three of the slices in the bigger array: A1 = np.zeros((5,4,3)) A2 = np.ones((5,4)) A1[:,:,0] = A2 A1[:,:,1] = A2 A1[:,:,2] = A2 However,it seems I should be able to broadcast that, so I don't have to repeat myself, but I can't figure out how. The full question: I have a (w,h) array of ints, representing a grey scale image. I'm trying to convert that into a RBG image. I've played with defining the image array as a rec array or a h,w,3 uint8 array. 
neither one lets me assign the three bands to the same value in one step: import numpy as np w = 2 h = 2 grey = np.arange(w*h).reshape((w,h)) RGBrec = np.dtype({'r':(np.uint8,0), 'g':(np.uint8,1), 'b':(np.uint8,2), }) ImageRGB = np.zeros((w,h), dtype=RGBrec) ImageInt = ImageRGB.view((np.uint8, 3)) ## neither of these work: #ImageRGB[:,:] = grey #ImageInt[:,:] = grey print ImageRGB How else might I do this? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Fri May 2 01:18:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 1 May 2008 23:18:42 -0600 Subject: [Numpy-discussion] Broadcasting question In-Reply-To: <481A9F7A.5040101@noaa.gov> References: <481A9F7A.5040101@noaa.gov> Message-ID: On Thu, May 1, 2008 at 10:58 PM, Chris.Barker wrote: > Hi all, > > I have a n X m X 3 array, and I a n X M array. I want to assign the > values in the n X m to all three of the slices in the bigger array: > > A1 = np.zeros((5,4,3)) > A2 = np.ones((5,4)) > A1[:,:,0] = A2 > A1[:,:,1] = A2 > A1[:,:,2] = A2 > A1 += A2[:,:,newaxis] is one way. Or you could just start by stacking copies of A2 and swapping axes around: In [5]: A1 = array([A2.T]*3).T In [6]: A1.shape Out[6]: (4, 5, 3) Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Chris.Barker at noaa.gov Fri May 2 12:25:50 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 02 May 2008 09:25:50 -0700 Subject: [Numpy-discussion] Broadcasting question In-Reply-To: References: <481A9F7A.5040101@noaa.gov> Message-ID: <481B408E.9020604@noaa.gov> Charles R Harris wrote: > A1 += A2[:,:,newaxis] is one way. Exactly what I was looking for -- thanks Charles (and Ann) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dalcinl at gmail.com Fri May 2 13:49:18 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 2 May 2008 14:49:18 -0300 Subject: [Numpy-discussion] a possible way to implement a plogin system In-Reply-To: <1209611026.23684.19.camel@bbc8> References: <1209611026.23684.19.camel@bbc8> Message-ID: On 5/1/08, David Cournapeau wrote: > On Wed, 2008-04-30 at 16:44 -0300, Lisandro Dalcin wrote: > > David, in order to put clear what I was proposing to you in previous > > mail regarding to implementing plugin systems for numpy, please take a > > look at the attached tarball. > > > Thanks for looking at this Lisandro. > > The problem I see with the approach of struct vs direct function > pointers is of ABI compatibility: it is easy to mess things up with > structures. That's a very valid point. > There is the advantage of using only one dlsym (or > equivalent) with the struct, which may be much faster than using > hundreds of dlsym for each function. Linux does not seem to have > problems with that, but mac os X for example felt slow when I tried > doing this for several thousand functions. I did not go really far on > that, though. I don't really see why using a struct is cleaner, though. > That's really the same thing (function pointers), in both cases the name > namespace pollution will be the same, and in both cases there will be a > need to generate source code. Perhaps cleaner is not the right word. I actually believe that is far more portable, regarding the oddities of dlopening in different platforms. > Concerning the loading mechanism, I don't understand the point of using > PyCObject_Import. By quickly looking at the code of the function, the > plugin needs to be a python module in that case, which is not needed in > our case; I don't like the fact that it is not documented either. Well, that is the same mechanism NumPy uses for exporting its C API to extensions modules. Look at generated header '__multiarray_api.h' in function '_import_array()' in your numpy installation. This is the standard, portable, and (as Python doc says) recommended way to expose C API's from extensions modules. > The code for dlopen/dlclose is really short. For each platform, it is > like 50 lines of code, and we have a better control on what we can do > (which may be needed; for example, you want want to load symbols from a > dll A, but the symbols are only in dll B, which dll A depends on; that's > a problem that does not happen for python extensions, I think; I don't > really know to be honest). Please note that all my concerns about recommending you not to use dlopen, is just to save you from future headaches!!!. See yourself at row 17 of table here . And that table does not cover stuff like RTLD_GLOBAL flags to dlopen (or equivalent). 
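For what it's worth, the single-entry-point pattern is easy to see from the Python side (this is only inspection; the real work happens in C):

import numpy.core.multiarray as multiarray
# NumPy exposes its whole C API through one module attribute; the generated
# _import_array() fetches this one object and unpacks the table of function
# pointers from it, rather than resolving each symbol separately.
print type(multiarray._ARRAY_API)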
-- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From rshepard at appl-ecosys.com Fri May 2 15:49:36 2008 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Fri, 2 May 2008 12:49:36 -0700 (PDT) Subject: [Numpy-discussion] Combining Sigmoid Curves Message-ID: When I last visited I was given excellent advice about Gaussian and other bell-shaped curves. Upon further reflection I realized that the Gaussian curves will not do; the curve does need to have y=0.0 at each end. I tried to apply a Beta distribution, but I cannot correlate the alpha and beta parameters with the curve characteristics that I have. What will work (I call it a pi curve) is a matched pair of sigmoid curves, the ascending curve on the left and the descending curve on the right. Using the Boltzmann function for these I can calculate and plot each individually, but I'm having difficulty in appending the x and y values across the entire range. This is where I would appreciate your assistance. The curve parameters passed to the function are the left end, right end, and midpoint. The inflection point (where y = 0.5) is half-way between the ends and the midpoint. What I have in the function is this: def piCurve(ll, lr, mid): flexL = (mid - ll)/2.0 flexR = (lr - mid)/2.0 tau = (mid - ll)/10.0 x = [] y = [] xL = nx.arange(ll,mid,0.1) for i in xL: x.append(xL[i]) yL = 1.0 / (1.0 + nx.exp(-(xL-flexL)/tau)) for j in yL: y.append(yL[j]) xR = nx.arange(mid,lr,0.1) for i in xR: x.append(xR[i]) yR = 1 - (1.0 / (1.0 + nx.exp(-(xR-flexR)/tau))) for j in yR: y.append(yR[j]) appData.plotX = x appData.plotY = y Python complains about adding to the list: yL = 1.0 / (1.0 + nx.exp(-(x-flexL)/tau)) TypeError: unsupported operand type(s) for -: 'list' and 'float' What is the appropriate way to generate two sigmoid curves so that the x values range from the left end to the right end and the y values rise from 0.0 to 1.0 at the midpoint, then lower to 0.0 again? Rich From amcmorl at gmail.com Fri May 2 16:07:23 2008 From: amcmorl at gmail.com (Angus McMorland) Date: Fri, 2 May 2008 16:07:23 -0400 Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: 2008/5/2 Rich Shepard : > When I last visited I was given excellent advice about Gaussian and other > bell-shaped curves. Upon further reflection I realized that the Gaussian > curves will not do; the curve does need to have y=0.0 at each end. > > I tried to apply a Beta distribution, but I cannot correlate the alpha and > beta parameters with the curve characteristics that I have. > > What will work (I call it a pi curve) is a matched pair of sigmoid curves, > the ascending curve on the left and the descending curve on the right. Using > the Boltzmann function for these I can calculate and plot each individually, > but I'm having difficulty in appending the x and y values across the entire > range. This is where I would appreciate your assistance. > > The curve parameters passed to the function are the left end, right end, > and midpoint. The inflection point (where y = 0.5) is half-way between the > ends and the midpoint. 
> > What I have in the function is this: > > def piCurve(ll, lr, mid): > flexL = (mid - ll)/2.0 > flexR = (lr - mid)/2.0 > tau = (mid - ll)/10.0 > > x = [] > y = [] > > xL = nx.arange(ll,mid,0.1) > for i in xL: > x.append(xL[i]) > yL = 1.0 / (1.0 + nx.exp(-(xL-flexL)/tau)) > for j in yL: > y.append(yL[j]) > > xR = nx.arange(mid,lr,0.1) > for i in xR: > x.append(xR[i]) > yR = 1 - (1.0 / (1.0 + nx.exp(-(xR-flexR)/tau))) > for j in yR: > y.append(yR[j]) > > appData.plotX = x > appData.plotY = y > > Python complains about adding to the list: > > yL = 1.0 / (1.0 + nx.exp(-(x-flexL)/tau)) > TypeError: unsupported operand type(s) for -: 'list' and 'float' > > What is the appropriate way to generate two sigmoid curves so that the x > values range from the left end to the right end and the y values rise from > 0.0 to 1.0 at the midpoint, then lower to 0.0 again? How about multiplying two Boltzmann terms together, ala: f(x) = 1/(1+exp(-(x-flex1)/tau1)) * 1/(1+exp((x-flex2)/tau2)) You'll find if your two flexion points get too close together, the peak will drop below the maximum for each individual curve, but the transition will be continuous. Angus. -- AJC McMorland, PhD candidate Physiology, University of Auckland (Nearly) post-doctoral research fellow Neurobiology, University of Pittsburgh From rshepard at appl-ecosys.com Fri May 2 16:21:09 2008 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Fri, 2 May 2008 13:21:09 -0700 (PDT) Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: On Fri, 2 May 2008, Angus McMorland wrote: > How about multiplying two Boltzmann terms together, ala: > > f(x) = 1/(1+exp(-(x-flex1)/tau1)) * 1/(1+exp((x-flex2)/tau2)) > You'll find if your two flexion points get too close together, the peak > will drop below the maximum for each individual curve, but the transition > will be continuous. Angus, With an x range from 0.0-100.0 (and the flexion points at 25.0 and 75.0), the above formula provides a nice bell-shaped curve from x=0.0 to x=50.0, and a maximum y of only 0.25 rather than 2.0. Modifying the above so the second term is subtracted from 1 before the multiplication, or by negating the exponent in the second term, yields only the first half: the ascending 'S' curve from 0-50. Thanks, Rich -- Richard B. Shepard, Ph.D. | Integrity Credibility Applied Ecosystem Services, Inc. | Innovation Voice: 503-667-4517 Fax: 503-667-8863 From Chris.Barker at noaa.gov Fri May 2 16:24:01 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 02 May 2008 13:24:01 -0700 Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: <481B7861.8000102@noaa.gov> Rich, this could use some serious vectorization/numpyification! Poke around the scipy Wiki and whatever other tutorials you can find -- you'll be glad you did. A hint: When you are writing a loop like: > for i in xL: > x.append(xL[i]) You should be doing array operations! Specifics: xL = nx.arange(ll,mid,0.1) # you've now created an array of your X values for i in xL: x.append(xL[i]) #this dumps them into a list - why not keep them an array? # if you really need an array (you don't here), then you can use: xL.tolist() That's why you get this error: TypeError: unsupported operand type(s) for -: 'list' and 'float' you're trying to do array operations on a list. Also, you probably want np.linspace, rather than arange, it's a better option for floats. 
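For instance, with a made-up 0-to-1 interval the difference is easy to see:

import numpy as np
# arange with a float step stops short of the right endpoint (and the step
# accumulates rounding error), while linspace hits both ends exactly.
print np.arange(0.0, 1.0, 0.1)[-1]     # 0.9, not 1.0
print np.linspace(0.0, 1.0, 11)[-1]    # exactly 1.0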
Here is my version: import numpy as np ll, lr, mid = (-10, 10, 0) flexL = (mid - ll)/2.0 flexR = (lr - mid)/2.0 tau = (mid - ll)/10.0 xL = np.linspace(ll, mid, 5) yL = 1.0 / (1.0 + np.exp(-(xL-flexL)/tau)) print xL print yL xR = np.linspace(mid, lr, 5) yR = 1 - (1.0 / (1.0 + np.exp(-(xR-flexR)/tau))) print xR print yR # now put them together: x = np.hstack((xL, xR[1:])) # don't want to duplicate the midpoint y = np.hstack((yL, yR[1:])) print x print y Though it doesn't look like the numbers are right. Also, you don't need to create the separate left and right arraysand put them together, slicing gives a view, so you could do: numpoints = 11 # should be an odd number to get the midpoint x = linspace(ll,lr, numpoints) y = zeros_like(x) xL = x[:numpoints/2] yL = y[:numpoints/2] yL[:] = 1.0 / (1.0 + np.exp(-(xL-flexL)/tau)) you could also use something like: np.where(x<0, .....) left as an exercise for the reader... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rshepard at appl-ecosys.com Fri May 2 16:32:09 2008 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Fri, 2 May 2008 13:32:09 -0700 (PDT) Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: <481B7861.8000102@noaa.gov> References: <481B7861.8000102@noaa.gov> Message-ID: On Fri, 2 May 2008, Christopher Barker wrote: > this could use some serious vectorization/numpyification! Poke around the > scipy Wiki and whatever other tutorials you can find -- you'll be glad you > did. A hint: > > When you are writing a loop like: >> for i in xL: >> x.append(xL[i]) > > You should be doing array operations! Chris, Good suggestions. I need to append the two numpy arrays so I return all x values in a single list and all y values in another list. I'll read the numpy book again. > That's why you get this error: > TypeError: unsupported operand type(s) for -: 'list' and 'float' > > you're trying to do array operations on a list. Ah, so! > Also, you probably want np.linspace, rather than arange, it's a better > option for floats. arange() has worked just fine in other functions. > # now put them together: > x = np.hstack((xL, xR[1:])) # don't want to duplicate the midpoint > y = np.hstack((yL, yR[1:])) > Also, you don't need to create the separate left and right arraysand put > them together, slicing gives a view, so you could do: I assume that there's a way to mate the two curves in a single equation. I've not been able to find what that is. Thanks, Rich From pwang at enthought.com Fri May 2 16:37:54 2008 From: pwang at enthought.com (Peter Wang) Date: Fri, 2 May 2008 15:37:54 -0500 Subject: [Numpy-discussion] [IT] Maintenance and scheduled downtime this evening and weekend Message-ID: Hi everyone, This evening and this weekend, we will be doing a major overhaul of Enthought's internal network infrastructure. We will be cleaning up a large amount of legacy structure and transitioning to a more maintainable, better documented configuration. We have planned the work so that externally-facing servers will experience a minimum of downtime. In the event of unforeseen difficulties that cause outages to extend beyond the times given below, we will update the network status page, located at http://dr0.enthought.com/status/ . This page will remain available and be unaffected by any network outages. 
Downtimes: Friday May 2, 2008, 8pm - 10pm Central time (= 9pm - 11pm Eastern time) (= 1am - 3am UTC Sat. May 3) Saturday May 2, 2008, 10am - 11am Central time (= 11am - 12 noon Eastern time) (= 3pm - 4pm UTC) To reach us during the downtime, please use the contact information provided on the network status page. Please let me know if you have any questions or concerns. We will send out another email once the outage is complete. Thanks for your patience! -Peter From peridot.faceted at gmail.com Fri May 2 18:03:32 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 2 May 2008 18:03:32 -0400 Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: 2008/5/2 Rich Shepard : > What will work (I call it a pi curve) is a matched pair of sigmoid curves, > the ascending curve on the left and the descending curve on the right. Using > the Boltzmann function for these I can calculate and plot each individually, > but I'm having difficulty in appending the x and y values across the entire > range. This is where I would appreciate your assistance. It's better not to work point-by-point, appending things, when working with numpy. Ideally you could find a formula which just produced the right curve, and then you'd apply it to the input vector and get the output vector all at once. For example for a Gaussian (which you're not using, I know), you'd write a function something like def f(x): return np.exp(-x**2) and then call it like: f(np.linspace(-1,1,1000)). This is efficient and clear. It does not seem to me that the logistic function, 1/(1+np.exp(x)) does quite what you want. How about using the cosine? def f(left, right, x): scaled_x = (x-(right+left)/2)/((right-left)/2) return (1+np.cos((np.pi/2) * scaled_x))/2 exactly zero at both endpoints, exactly one at the midpoint, inflection points midway between, where the value is 1/2. If you want to normalize it so that the area underneath is one, that's easy to do. More generally, the trick of producing a scaled_x as above lets you move any function anywhere you like. If you want the peak not to be at the midpoint of the interval, you will need to do something a litle more clever, perhaps choosing a different function, scaling the x values with a quadratic polynomial so that (left, middle, right) becomes (-1,0,1), or using a piecewise function. Good luck, Anne From rshepard at appl-ecosys.com Fri May 2 18:20:31 2008 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Fri, 2 May 2008 15:20:31 -0700 (PDT) Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: On Fri, 2 May 2008, Anne Archibald wrote: > It's better not to work point-by-point, appending things, when working > with numpy. Ideally you could find a formula which just produced the right > curve, and then you'd apply it to the input vector and get the output > vector all at once. Anne, That's been my goal. :-) > How about using the cosine? > > def f(left, right, x): > scaled_x = (x-(right+left)/2)/((right-left)/2) > return (1+np.cos((np.pi/2) * scaled_x))/2 > > exactly zero at both endpoints, exactly one at the midpoint, > inflection points midway between, where the value is 1/2. If you want > to normalize it so that the area underneath is one, that's easy to do. > More generally, the trick of producing a scaled_x as above lets you > move any function anywhere you like. This looks like a pragmatic solution. When I print scaled_x (using left = 0.0 and right = 100.0), the values range from -1.0 to +0.998. 
So, I need to figure out the scale_x that sets the end points at 0 and 100. Thanks very much, Rich From peridot.faceted at gmail.com Fri May 2 18:27:05 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 2 May 2008 18:27:05 -0400 Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: 2008/5/2 Rich Shepard : > On Fri, 2 May 2008, Anne Archibald wrote: > > > It's better not to work point-by-point, appending things, when working > > with numpy. Ideally you could find a formula which just produced the right > > curve, and then you'd apply it to the input vector and get the output > > vector all at once. > > Anne, > > That's been my goal. :-) > > > > How about using the cosine? > > > > def f(left, right, x): > > scaled_x = (x-(right+left)/2)/((right-left)/2) > > return (1+np.cos((np.pi/2) * scaled_x))/2 > > > > exactly zero at both endpoints, exactly one at the midpoint, > > inflection points midway between, where the value is 1/2. If you want > > to normalize it so that the area underneath is one, that's easy to do. > > > More generally, the trick of producing a scaled_x as above lets you > > move any function anywhere you like. > > This looks like a pragmatic solution. When I print scaled_x (using left = > 0.0 and right = 100.0), the values range from -1.0 to +0.998. So, I need to > figure out the scale_x that sets the end points at 0 and 100. No, no. You *want* scaled_x to range from -1 to 1. (The 0.998 is because you didn't include the endpoint, 100.) The function I gave, (1+np.cos((np.pi/2) * scaled_x))/2, takes [-1, 1] to a nice bump-shaped function. If you feed in numbers from 0 to 100 as x, they get transformed to scaled_x, and you feed them to the function, getting a result that goes from 0 at x=0 to 1 at x=50 to 0 at x=100. Anne From eads at soe.ucsc.edu Fri May 2 18:45:03 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Fri, 02 May 2008 15:45:03 -0700 Subject: [Numpy-discussion] [SciPy-dev] [IT] Maintenance and scheduled downtime this evening and weekend In-Reply-To: References: Message-ID: <481B996F.9020504@soe.ucsc.edu> Will this effect SVN and Trac access? Thanks! Damian Peter Wang wrote: > Hi everyone, > > This evening and this weekend, we will be doing a major overhaul of > Enthought's internal network infrastructure. We will be cleaning up a > large amount of legacy structure and transitioning to a more > maintainable, better documented configuration. From pwang at enthought.com Fri May 2 18:50:55 2008 From: pwang at enthought.com (Peter Wang) Date: Fri, 2 May 2008 17:50:55 -0500 Subject: [Numpy-discussion] [SciPy-dev] [IT] Maintenance and scheduled downtime this evening and weekend In-Reply-To: <481B996F.9020504@soe.ucsc.edu> References: <481B996F.9020504@soe.ucsc.edu> Message-ID: <6B53627F-13CD-4671-8C30-1217264A0BC9@enthought.com> On May 2, 2008, at 5:45 PM, Damian Eads wrote: > Will this effect SVN and Trac access? > Thanks! > Damian Yes, this affects svn, trac, web, mailing lists,...everything, because we will be working on the underlying network infrastructure. -Peter From Chris.Barker at noaa.gov Fri May 2 18:59:55 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 02 May 2008 15:59:55 -0700 Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: References: Message-ID: <481B9CEB.1010107@noaa.gov> Anne Archibald wrote: > 2008/5/2 Rich Shepard : > No, no. You *want* scaled_x to range from -1 to 1. Why not just scale to -pi to pi right there? 
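For what it's worth, here is a small sketch of the raised-cosine bump with the scaling folded straight into radians, along the lines of that suggestion (the name bump is illustrative only; note that the factor has to be the full np.pi rather than np.pi/2 for the endpoints to come out exactly zero):

import numpy as np

def bump(left, right, x):
    # map [left, right] onto [-pi, pi], then take the raised cosine
    scaled_x = np.pi * (x - (right + left) / 2.0) / ((right - left) / 2.0)
    return (1.0 + np.cos(scaled_x)) / 2.0

x = np.linspace(0.0, 100.0, 101)   # linspace includes the right endpoint
y = bump(0.0, 100.0, x)            # 0 at x=0 and x=100, 1 at x=50
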
(The 0.998 is because you didn't include the endpoint, 100.) Which is why you want linspace, rather than arange. Really, trust me on this! If the cosine curve isn't right for you, a little create use of np.where would let you do what you want in maybe another line or two of code. Something like: y = np.where( x < 0 , Leftfun(x), Rightfun(x) ) or just compute one side and then flip it to make the other side: y = np.zeros_like(x) y[center:] = fun(x[center:]) y[:center] = y[center+1:][::-1] # don't want the center point twice if it's symetric anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rshepard at appl-ecosys.com Fri May 2 19:10:15 2008 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Fri, 2 May 2008 16:10:15 -0700 (PDT) Subject: [Numpy-discussion] Combining Sigmoid Curves In-Reply-To: <481B9CEB.1010107@noaa.gov> References: <481B9CEB.1010107@noaa.gov> Message-ID: On Fri, 2 May 2008, Christopher Barker wrote: > Why not just scale to -pi to pi right there? Dunno, Chris. As I wrote to Anne (including a couple of files and the resulting plot), it's been almost three decades since I dealt with the math underlying distribution functions. > Which is why you want linspace, rather than arange. Really, trust me on > this! I see the difference in the book. Didn't know about linspace(), but adding True brings the end point to 100.0 > If the cosine curve isn't right for you, a little create use of np.where > would let you do what you want in maybe another line or two of code. > Something like: > > y = np.where( x < 0 , Leftfun(x), Rightfun(x) ) > > or just compute one side and then flip it to make the other side: > > y = np.zeros_like(x) > y[center:] = fun(x[center:]) > y[:center] = y[center+1:][::-1] # don't want the center point twice > > if it's symetric anyway. I'll look into this over the weekend (after upgrading my machines to Slackware-12.1). I can get nice sigmoid curves with the Boltzmann function. This thread started when I asked how to combine the two into a single curve with one set of x,y points. Much appreciated, Rich -- Richard B. Shepard, Ph.D. | Integrity Credibility Applied Ecosystem Services, Inc. | Innovation Voice: 503-667-4517 Fax: 503-667-8863 From oliphant at enthought.com Fri May 2 20:07:51 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 02 May 2008 19:07:51 -0500 Subject: [Numpy-discussion] Ruby's NMatrix and NVector Message-ID: <481BACD7.9050708@enthought.com> http://narray.rubyforge.org/matrix-e.html It seems they've implemented some of what Tim is looking for, in particular. Perhaps there is information to be gleaned from what they are doing. It looks promising.. -Travis From kwgoodman at gmail.com Fri May 2 20:24:53 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 17:24:53 -0700 Subject: [Numpy-discussion] Faster Message-ID: How can I make this function faster? It removes the i-th row and column from an array. 
def cut(x, i): idx = range(x.shape[0]) idx.remove(i) y = x[idx,:] y = y[:,idx] return y >> import numpy as np >> x = np.random.rand(500,500) >> timeit cut(x, 100) 100 loops, best of 3: 16.8 ms per loop From charlesr.harris at gmail.com Fri May 2 20:38:20 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 2 May 2008 18:38:20 -0600 Subject: [Numpy-discussion] Faster In-Reply-To: References: Message-ID: On Fri, May 2, 2008 at 6:24 PM, Keith Goodman wrote: > How can I make this function faster? It removes the i-th row and > column from an array. > Why do you want to do that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri May 2 20:47:32 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 17:47:32 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: Message-ID: On Fri, May 2, 2008 at 5:38 PM, Charles R Harris wrote: > On Fri, May 2, 2008 at 6:24 PM, Keith Goodman wrote: > > How can I make this function faster? It removes the i-th row and > > column from an array. > > > > Why do you want to do that? Single linkage clustering; x is the distance matrix. From kwgoodman at gmail.com Fri May 2 20:51:15 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 17:51:15 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: Message-ID: On Fri, May 2, 2008 at 5:47 PM, Keith Goodman wrote: > > On Fri, May 2, 2008 at 5:38 PM, Charles R Harris > wrote: > > On Fri, May 2, 2008 at 6:24 PM, Keith Goodman wrote: > > > How can I make this function faster? It removes the i-th row and > > > column from an array. > > > > > > > Why do you want to do that? > > Single linkage clustering; x is the distance matrix. Here's the full code if you are interested. I haven't used it yet other than running the test and test2 so it may be full of bugs. 
import time import numpy as np class Cluster: "Single linkage hierarchical clustering" def __init__(self, dist, label=None, debug=False): """ dist Distance matrix, NxN numpy array label Labels of each row of the distance matrix, list of N items, default is range(N) """ assert dist.shape[0] == dist.shape[1], 'dist must be square (nxn)' assert (np.abs(dist - dist.T) < 1e-8).all(), 'dist must be symmetric' if label is None: label = range(dist.shape[0]) assert dist.shape[0] == len(label), 'dist and label must match in size' self.c = [[[z] for z in label]] self.label = label self.dist = dist self.debug = debug def run(self): for level in xrange(len(self.label) - 1): i, j = self.min_dist() self.join(i, j) def join(self, i, j): assert i != j, 'Cannot combine a cluster with itself' # Join labels new = list(self.c[-1]) new[i] = new[i] + new[j] new.pop(j) self.c.append(new) # Join distance matrix self.dist[:,i] = self.dist[:,[i,j]].min(1) self.dist[i,:] = self.dist[:,i] idx = range(self.dist.shape[1]) idx.remove(j) self.dist = self.dist[:,idx] self.dist = self.dist[idx,:] # Debug output if self.debug: print print len(self.c) - 1 print 'Clusters' print self.c[-1] print 'Distance' print self.dist def min_dist(self): dist = self.dist + 1e10 * np.eye(self.dist.shape[0]) i, j = np.where(dist == dist.min()) return i[0], j[0] def test(): # Example from # home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html label = ['BA', 'FI', 'MI', 'NA', 'RM', 'TO'] dist = np.array([[0, 662, 877, 255, 412, 996], [662, 0, 295, 468, 268, 400], [877, 295, 0, 754, 564, 138], [255, 468, 754, 0, 219, 869], [412, 268, 564, 219, 0, 669], [996, 400, 138, 869, 669, 0 ]]) clust = Cluster(dist, label, debug=True) clust.run() def test2(n): x = np.random.rand(n,n) x = x + x.T c = Cluster(x) t1 = time.time() c.run() t2 = time.time() print 'n = %d took %0.2f seconds' % (n, t2-t1) From robert.kern at gmail.com Fri May 2 21:05:52 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 2 May 2008 20:05:52 -0500 Subject: [Numpy-discussion] Faster In-Reply-To: References: Message-ID: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> On Fri, May 2, 2008 at 7:24 PM, Keith Goodman wrote: > How can I make this function faster? It removes the i-th row and > column from an array. > > def cut(x, i): > idx = range(x.shape[0]) > idx.remove(i) > y = x[idx,:] > y = y[:,idx] > return y > > >> import numpy as np > >> x = np.random.rand(500,500) > >> timeit cut(x, 100) > 100 loops, best of 3: 16.8 ms per loop I can get a ~20% improvement with the following: In [8]: %timeit cut(x, 100) 10 loops, best of 3: 21.6 ms per loop In [9]: def mycut(x, i): ...: A = x[:i,:i] ...: B = x[:i,i+1:] ...: C = x[i+1:,:i] ...: D = x[i+1:,i+1:] ...: return hstack([vstack([A,C]),vstack([B,D])]) ...: In [10]: %timeit mycut(x, 100) 10 loops, best of 3: 17.3 ms per loop The hstack(vstack, vstack) seems to be somewhat better than vstack(hstack, hstack), at least for these sizes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Fri May 2 21:16:05 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 18:16:05 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 6:05 PM, Robert Kern wrote: > On Fri, May 2, 2008 at 7:24 PM, Keith Goodman wrote: > > > > How can I make this function faster? It removes the i-th row and > > column from an array. > > > > def cut(x, i): > > idx = range(x.shape[0]) > > idx.remove(i) > > y = x[idx,:] > > y = y[:,idx] > > return y > > > > >> import numpy as np > > >> x = np.random.rand(500,500) > > >> timeit cut(x, 100) > > 100 loops, best of 3: 16.8 ms per loop > > I can get a ~20% improvement with the following: > > In [8]: %timeit cut(x, 100) > 10 loops, best of 3: 21.6 ms per loop > > In [9]: def mycut(x, i): > ...: A = x[:i,:i] > ...: B = x[:i,i+1:] > ...: C = x[i+1:,:i] > ...: D = x[i+1:,i+1:] > ...: return hstack([vstack([A,C]),vstack([B,D])]) > ...: > > In [10]: %timeit mycut(x, 100) > 10 loops, best of 3: 17.3 ms per loop > > The hstack(vstack, vstack) seems to be somewhat better than > vstack(hstack, hstack), at least for these sizes. Wow. I never would have come up with that. And I probably never will. Original code: >> cluster.test2(500) n = 500 took 5.28 seconds Your improvement: >> cluster.test2(500) n = 500 took 3.52 seconds Much more than a 20% improvement when used in the larger program. Thank you. From charlesr.harris at gmail.com Fri May 2 21:29:51 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 2 May 2008 19:29:51 -0600 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 7:16 PM, Keith Goodman wrote: > On Fri, May 2, 2008 at 6:05 PM, Robert Kern wrote: > > On Fri, May 2, 2008 at 7:24 PM, Keith Goodman > wrote: > > > > > > > How can I make this function faster? It removes the i-th row and > > > column from an array. > > > > > > def cut(x, i): > > > idx = range(x.shape[0]) > > > idx.remove(i) > > > y = x[idx,:] > > > y = y[:,idx] > > > return y > > > > > > >> import numpy as np > > > >> x = np.random.rand(500,500) > > > >> timeit cut(x, 100) > > > 100 loops, best of 3: 16.8 ms per loop > > > > I can get a ~20% improvement with the following: > > > > In [8]: %timeit cut(x, 100) > > 10 loops, best of 3: 21.6 ms per loop > > > > In [9]: def mycut(x, i): > > ...: A = x[:i,:i] > > ...: B = x[:i,i+1:] > > ...: C = x[i+1:,:i] > > ...: D = x[i+1:,i+1:] > > ...: return hstack([vstack([A,C]),vstack([B,D])]) > > ...: > > > > In [10]: %timeit mycut(x, 100) > > 10 loops, best of 3: 17.3 ms per loop > > > > The hstack(vstack, vstack) seems to be somewhat better than > > vstack(hstack, hstack), at least for these sizes. > > Wow. I never would have come up with that. And I probably never will. > > Original code: > >> cluster.test2(500) > n = 500 took 5.28 seconds > > Your improvement: > >> cluster.test2(500) > n = 500 took 3.52 seconds > > Much more than a 20% improvement when used in the larger program. Thank > you. Isn't the lengthy part finding the distance between clusters? I can think of several ways to do that, but I think you will get a real speedup by doing that in c or c++. I have a module made in boost python that holds clusters and returns a list of lists containing their elements. 
Clusters are joined by joining any two elements, one from each. It wouldn't take much to add a distance function, but you could use the list of indices in each cluster to pull a subset out of the distance matrix and then find the minimum function in that. This also reminds me of Huffman codes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri May 2 22:02:44 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 19:02:44 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 6:29 PM, Charles R Harris wrote: > Isn't the lengthy part finding the distance between clusters? I can think > of several ways to do that, but I think you will get a real speedup by doing > that in c or c++. I have a module made in boost python that holds clusters > and returns a list of lists containing their elements. Clusters are joined > by joining any two elements, one from each. It wouldn't take much to add a > distance function, but you could use the list of indices in each cluster to > pull a subset out of the distance matrix and then find the minimum function > in that. This also reminds me of Huffman codes. You're right. Finding the distance is slow. Is there any way to speed up the function below? It returns the row and column indices of the min value of the NxN array x. def dist(x): x = x + 1e10 * np.eye(x.shape[0]) i, j = np.where(x == x.min()) return i[0], j[0] >> x = np.random.rand(500,500) >> timeit dist(x) 100 loops, best of 3: 14.1 ms per loop If the clustering gives me useful results, I'll ask you about your boost code. I'll also take a look at Damian Eads's scipy-cluster. From charlesr.harris at gmail.com Fri May 2 22:21:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 2 May 2008 20:21:55 -0600 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 8:02 PM, Keith Goodman wrote: > On Fri, May 2, 2008 at 6:29 PM, Charles R Harris > wrote: > > Isn't the lengthy part finding the distance between clusters? I can > think > > of several ways to do that, but I think you will get a real speedup by > doing > > that in c or c++. I have a module made in boost python that holds > clusters > > and returns a list of lists containing their elements. Clusters are > joined > > by joining any two elements, one from each. It wouldn't take much to add > a > > distance function, but you could use the list of indices in each cluster > to > > pull a subset out of the distance matrix and then find the minimum > function > > in that. This also reminds me of Huffman codes. > > You're right. Finding the distance is slow. Is there any way to speed > up the function below? It returns the row and column indices of the > min value of the NxN array x. > > def dist(x): > x = x + 1e10 * np.eye(x.shape[0]) x += x + diag(ones(x.shape[0])*1e10 would be faster. > i, j = np.where(x == x.min()) > return i[0], j[0] > i = x.argmin() j = i % x.shape[0] i = i / x.shape[0] But I wouldn't worry about speed yet if you are just trying things out. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Fri May 2 22:25:01 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 2 May 2008 21:25:01 -0500 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> On Fri, May 2, 2008 at 9:02 PM, Keith Goodman wrote: > On Fri, May 2, 2008 at 6:29 PM, Charles R Harris > > wrote: > > > Isn't the lengthy part finding the distance between clusters? I can think > > of several ways to do that, but I think you will get a real speedup by doing > > that in c or c++. I have a module made in boost python that holds clusters > > and returns a list of lists containing their elements. Clusters are joined > > by joining any two elements, one from each. It wouldn't take much to add a > > distance function, but you could use the list of indices in each cluster to > > pull a subset out of the distance matrix and then find the minimum function > > in that. This also reminds me of Huffman codes. > > You're right. Finding the distance is slow. Is there any way to speed > up the function below? It returns the row and column indices of the > min value of the NxN array x. > > def dist(x): > x = x + 1e10 * np.eye(x.shape[0]) > i, j = np.where(x == x.min()) > > return i[0], j[0] Assuming x is contiguous and you can modify x in-place: In [1]: from numpy import * In [2]: def dist(x): ...: x = x + 1e10 * eye(x.shape[0]) ...: i, j = where(x == x.min()) ...: return i[0], j[0] ...: In [3]: def symmdist(N): ...: x = random.rand(N, N) ...: x = x + x.T ...: x.flat[::N+1] = 0 ...: return x ...: In [4]: symmdist(5) Out[4]: array([[ 0. , 0.87508654, 1.11691704, 0.80366071, 0.57966808], [ 0.87508654, 0. , 1.5521685 , 1.74010886, 0.52156877], [ 1.11691704, 1.5521685 , 0. , 1.22725396, 1.04101992], [ 0.80366071, 1.74010886, 1.22725396, 0. , 1.94577965], [ 0.57966808, 0.52156877, 1.04101992, 1.94577965, 0. ]]) In [5]: def kerndist(x): ...: N = x.shape[0] ...: x.flat[::N+1] = x.max() ...: ij = argmin(x.flat) ...: i, j = divmod(ij, N) ...: return i, j ...: In [10]: x = symmdist(500) In [15]: %timeit dist(x) 10 loops, best of 3: 19.9 ms per loop In [16]: %timeit kerndist(x) 100 loops, best of 3: 4.38 ms per loop -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri May 2 22:36:53 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 2 May 2008 20:36:53 -0600 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 8:02 PM, Keith Goodman wrote: > On Fri, May 2, 2008 at 6:29 PM, Charles R Harris > wrote: > > Isn't the lengthy part finding the distance between clusters? I can > think > > of several ways to do that, but I think you will get a real speedup by > doing > > that in c or c++. I have a module made in boost python that holds > clusters > > and returns a list of lists containing their elements. Clusters are > joined > > by joining any two elements, one from each. It wouldn't take much to add > a > > distance function, but you could use the list of indices in each cluster > to > > pull a subset out of the distance matrix and then find the minimum > function > > in that. This also reminds me of Huffman codes. > > You're right. Finding the distance is slow. 
Is there any way to speed > up the function below? It returns the row and column indices of the > min value of the NxN array x. > > def dist(x): > x = x + 1e10 * np.eye(x.shape[0]) > i, j = np.where(x == x.min()) > return i[0], j[0] > > >> x = np.random.rand(500,500) > >> timeit dist(x) > 100 loops, best of 3: 14.1 ms per loop > > If the clustering gives me useful results, I'll ask you about your > boost code. I'll also take a look at Damian Eads's scipy-cluster. That package looks nice. I think your time would be better spent learning how to use it than in rolling your own routines. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri May 2 23:20:04 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 20:20:04 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 7:25 PM, Robert Kern wrote: > On Fri, May 2, 2008 at 9:02 PM, Keith Goodman wrote: > > You're right. Finding the distance is slow. Is there any way to speed > > up the function below? It returns the row and column indices of the > > min value of the NxN array x. > > > > def dist(x): > > x = x + 1e10 * np.eye(x.shape[0]) > > i, j = np.where(x == x.min()) > > > > return i[0], j[0] > > Assuming x is contiguous and you can modify x in-place: > > > In [1]: from numpy import * > > In [2]: def dist(x): > ...: x = x + 1e10 * eye(x.shape[0]) > ...: i, j = where(x == x.min()) > > ...: return i[0], j[0] > ...: > > In [3]: def symmdist(N): > ...: x = random.rand(N, N) > ...: x = x + x.T > ...: x.flat[::N+1] = 0 > ...: return x > ...: > > In [4]: symmdist(5) > Out[4]: > array([[ 0. , 0.87508654, 1.11691704, 0.80366071, 0.57966808], > [ 0.87508654, 0. , 1.5521685 , 1.74010886, 0.52156877], > [ 1.11691704, 1.5521685 , 0. , 1.22725396, 1.04101992], > [ 0.80366071, 1.74010886, 1.22725396, 0. , 1.94577965], > [ 0.57966808, 0.52156877, 1.04101992, 1.94577965, 0. ]]) > > In [5]: def kerndist(x): > ...: N = x.shape[0] > ...: x.flat[::N+1] = x.max() > ...: ij = argmin(x.flat) > ...: i, j = divmod(ij, N) > ...: return i, j > ...: > > In [10]: x = symmdist(500) > > In [15]: %timeit dist(x) > 10 loops, best of 3: 19.9 ms per loop > > In [16]: %timeit kerndist(x) > 100 loops, best of 3: 4.38 ms per loop This change and your previous one cut the run time from 5.28 to 2.23 seconds. Thank you. From kwgoodman at gmail.com Fri May 2 23:21:45 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 20:21:45 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 7:36 PM, Charles R Harris wrote: > On Fri, May 2, 2008 at 8:02 PM, Keith Goodman wrote: > > If the clustering gives me useful results, I'll ask you about your > > boost code. I'll also take a look at Damian Eads's scipy-cluster. > > That package looks nice. I think your time would be better spent learning > how to use it than in rolling your own routines. Yeah, but this is more fun. 
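For reference, the pieces of this thread can be put together into one small helper (closest_pair is an illustrative name, not from the thread; the builtin is divmod). It masks the diagonal in place, so pass a copy if the original matrix is still needed:

import numpy as np

def closest_pair(x):
    # x: symmetric distance matrix; the diagonal is overwritten here
    n = x.shape[0]
    x.flat[::n + 1] = x.max()      # mask the diagonal without building eye(n)
    i, j = divmod(x.argmin(), n)   # flat argmin of the whole array -> (row, col)
    return i, j

d = np.random.rand(500, 500)
d = d + d.T                        # make it symmetric
print(closest_pair(d))
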
From kwgoodman at gmail.com Fri May 2 23:48:27 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 2 May 2008 20:48:27 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 7:25 PM, Robert Kern wrote: > In [5]: def kerndist(x): > ...: N = x.shape[0] > ...: x.flat[::N+1] = x.max() > ...: ij = argmin(x.flat) > ...: i, j = divmod(ij, N) > ...: return i, j I replaced ij = argmin(x.flat) with x.argmin() (they're the same in this context, right?) for a slight speed up. Now I'm down to 1.9 seconds. >> timeit argmin(x.flat) 10000 loops, best of 3: 24.8 ?s per loop >> timeit argmin(x) 100000 loops, best of 3: 9.19 ?s per loop >> timeit x.argmin() 100000 loops, best of 3: 8.06 ?s per loop OK, enough way-too-early optimization. But I'll definately add i, j = modiv(x.argmin(), x.shape[0]) to my toolbelt. From robert.kern at gmail.com Fri May 2 23:54:11 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 2 May 2008 22:54:11 -0500 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> Message-ID: <3d375d730805022054v43cb4bady27b9d82f63689cdb@mail.gmail.com> On Fri, May 2, 2008 at 10:48 PM, Keith Goodman wrote: > On Fri, May 2, 2008 at 7:25 PM, Robert Kern wrote: > > In [5]: def kerndist(x): > > ...: N = x.shape[0] > > ...: x.flat[::N+1] = x.max() > > ...: ij = argmin(x.flat) > > ...: i, j = divmod(ij, N) > > ...: return i, j > > I replaced > > ij = argmin(x.flat) > > with > > x.argmin() > > (they're the same in this context, right?) for a slight speed up. Now > I'm down to 1.9 seconds. Yes, they're the same. I forgot that .argmin() flattened by default. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Sat May 3 01:56:10 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 03 May 2008 14:56:10 +0900 Subject: [Numpy-discussion] a possible way to implement a plogin system In-Reply-To: References: <1209611026.23684.19.camel@bbc8> Message-ID: <481BFE7A.9070207@ar.media.kyoto-u.ac.jp> Lisandro Dalcin wrote: > > Perhaps cleaner is not the right word. I actually believe that is far > more portable, regarding the oddities of dlopening in different > platforms. I am sorry, I don't understand why a struct is less sensitive to platform issues than function pointers when dynamically loading them. Could you give more details ? > > Well, that is the same mechanism NumPy uses for exporting its C API to > extensions modules. Look at generated header '__multiarray_api.h' in > function '_import_array()' in your numpy installation. This is the > standard, portable, and (as Python doc says) recommended way to expose > C API's from extensions modules. Yes, but this is not for exposing C python extensions ! This is for pure C plugins, and as such, we have different requirements; in particular, we don't need to hide api (the struct thing is partly useful because iso C does not have a namespace concept outside the compilation unit with static). 
Again, the use cases I have in mind are multiple implementations of the same core functions, selected at runtime (sse, sse2, mmx, etc...). Adding all the python mechanism just to load symbols does not sound like simplification to me: using the python C API is heavy, you have to start thinking about all kind of issues (ref counting, etc...). > > Please note that all my concerns about recommending you not to use > dlopen, is just to save you from future headaches!!! I implemented and tested the API for mac os x, windows and dlopen which covers 99 % of our users I think. Note that we can copy the actual code in python sources if we want to. > . See yourself at > row 17 of table here > . And that > table does not cover stuff like RTLD_GLOBAL flags to dlopen (or > equivalent). I do not pretend to know about all the platforms idiosyncrasies, but I am aware of the different aspects. In particular, note that most of the table is related to building, which I already took care of when starting numscons :) Also, I don't think we will use the system outside a few majors platforms (typically, I don'tsee anyone implementing ufunc core loops in alpha assembly for True64 :) ). cheers, David From wilson.t.thompson at gmail.com Sat May 3 02:34:19 2008 From: wilson.t.thompson at gmail.com (wilson) Date: Fri, 2 May 2008 23:34:19 -0700 (PDT) Subject: [Numpy-discussion] svd and eigh Message-ID: <2efdb340-ab7f-44ef-ad6b-c1278f5a7f06@p25g2000pri.googlegroups.com> I am trying out the eigenvectors related functions in numpy.linalg.I came across some portions where i have doubts. 1). i have an array X if i calculate L=dot(X,X.transpose()) can L be called the covariance matrix of X?I read so in a paper by Turk&Pentland(equation 3 i think) can someone clarify this ? 2). i tried to find eigenvectors using svd() and eigh() functions evects1,evals,vt=svd(L,0) evals,evects2=eigh(L) and sorted both evects1 and evects2 in the descending order of their evals here i find that evects1 and evects2 have same values but some of the values differ in their signs.why is this? can anyone explain tia W From nwagner at iam.uni-stuttgart.de Sat May 3 02:40:38 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 03 May 2008 08:40:38 +0200 Subject: [Numpy-discussion] svd and eigh In-Reply-To: <2efdb340-ab7f-44ef-ad6b-c1278f5a7f06@p25g2000pri.googlegroups.com> References: <2efdb340-ab7f-44ef-ad6b-c1278f5a7f06@p25g2000pri.googlegroups.com> Message-ID: On Fri, 2 May 2008 23:34:19 -0700 (PDT) wilson wrote: > I am trying out the eigenvectors related functions in >numpy.linalg.I > came across some portions where i have doubts. > 1). > i have an array X > if i calculate L=dot(X,X.transpose()) > can L be called the covariance matrix of X?I read so in >a paper by > Turk&Pentland(equation 3 i think) > can someone clarify this ? > > 2). > i tried to find eigenvectors using svd() and eigh() >functions > evects1,evals,vt=svd(L,0) > > evals,evects2=eigh(L) > > and sorted both evects1 and evects2 in the descending >order of their > evals > here i find that evects1 and evects2 have same values >but some of the > values differ in their signs.why is this? 
> > can anyone explain > tia > W > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion http://en.wikipedia.org/wiki/Singular_value_decomposition http://en.wikipedia.org/wiki/Eigenvector http://en.wikipedia.org/wiki/Covariance_matrix HTH, Nils From hoytak at gmail.com Sat May 3 02:51:33 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Fri, 2 May 2008 23:51:33 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: Message-ID: <4db580fd0805022351y6e3fe1b9xaad02e8cc1ee5787@mail.gmail.com> You know, for linkage clustering and BHC, I've found it a lot easier to work with an intermediate 1d map of indices and never resize the distance matrix. I then just remove one element from this map at each iteration, which is a LOT faster than removing a column and a row from a matrix. if idxmap were the map, you would be working with X[idxmap[i], idxmap[j] ] instead of X[i, j]. Also, you could even use a python list in this case, as they are a lot better for deletions than an array. --Hoyt On Fri, May 2, 2008 at 5:47 PM, Keith Goodman wrote: > > On Fri, May 2, 2008 at 5:38 PM, Charles R Harris > wrote: > > On Fri, May 2, 2008 at 6:24 PM, Keith Goodman wrote: > > > How can I make this function faster? It removes the i-th row and > > > column from an array. > > > > > > > Why do you want to do that? > > Single linkage clustering; x is the distance matrix. > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From bryan at cole.uklinux.net Sat May 3 06:43:28 2008 From: bryan at cole.uklinux.net (Bryan Cole) Date: Sat, 03 May 2008 11:43:28 +0100 Subject: [Numpy-discussion] very simple iteration question. In-Reply-To: <20080430190930.GG1497@phare.normalesup.org> References: <9a771bf70804300111s749747b8we8a4393a3c68fd02@mail.gmail.com> <4818A2C1.5010208@noaa.gov> <4818C128.7030908@noaa.gov> <20080430190930.GG1497@phare.normalesup.org> Message-ID: <1209811408.19044.5.camel@pc2.cole.uklinux.net> On Wed, 2008-04-30 at 21:09 +0200, Gael Varoquaux wrote: > On Wed, Apr 30, 2008 at 11:57:44AM -0700, Christopher Barker wrote: > > I think I still like the idea of an iterator (or maybe making rollaxis a > > method?), but this works pretty well. > > Generally, in object oriented programming, you expect a method like > rollaxis to modify an object inplace. At least that would be my > expectation. BTW. rollaxis isn't a method. I was completely unaware of this function. Learned something new today... BC > > Ga?l From robince at gmail.com Sat May 3 06:59:30 2008 From: robince at gmail.com (Robin) Date: Sat, 3 May 2008 11:59:30 +0100 Subject: [Numpy-discussion] python memory use Message-ID: Hi, I am starting to push the limits of the available memory and I'd like to understand a bit better how Python handles memory... If I try to allocate something too big for the available memory I often get a MemoryError exception. However, in other situations, Python memory use continues to grow until the machine falls over. I was hoping to understand the difference between those cases. From what I've read Python never returns memory to the OS (is this right?) 
so the second case, python is holding on to memory that it isn't really using (for objects that have been destroyed). I guess my question is why doesn't it reuse the memory freed from object deletions instead of requesting more - and even then when requesting more, why does it continue until the machine falls over and not cause a MemoryError? While investigating this I found this script: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474 which does wonders for my code. I was wondering if this function should be included in Numpy as it seems to provide an important feature, or perhaps an entry on the wiki (in Cookbook section?) Thanks, Robin From wilson.t.thompson at gmail.com Sat May 3 07:39:40 2008 From: wilson.t.thompson at gmail.com (wilson) Date: Sat, 3 May 2008 04:39:40 -0700 (PDT) Subject: [Numpy-discussion] svd and eigh In-Reply-To: References: <2efdb340-ab7f-44ef-ad6b-c1278f5a7f06@p25g2000pri.googlegroups.com> Message-ID: <498b0005-fb81-4ab7-995c-871d0529d4f4@w8g2000prd.googlegroups.com> thanks for the links.. but why the different signs for entries in eigenvectors? is it a library specific thing? shouldn't they be identical? W From matthieu.brucher at gmail.com Sat May 3 08:03:53 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 3 May 2008 14:03:53 +0200 Subject: [Numpy-discussion] svd and eigh In-Reply-To: <498b0005-fb81-4ab7-995c-871d0529d4f4@w8g2000prd.googlegroups.com> References: <2efdb340-ab7f-44ef-ad6b-c1278f5a7f06@p25g2000pri.googlegroups.com> <498b0005-fb81-4ab7-995c-871d0529d4f4@w8g2000prd.googlegroups.com> Message-ID: Hi, The opposite of an eigenvector is an eigenvector as well, with the same eigenvalue. Depending on the algorithm, both can be returned. Matthieu 2008/5/3 wilson : > thanks for the links.. > but why the different signs for entries in eigenvectors? is it a > library specific thing? shouldn't they be identical? > W > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at cheimes.de Sat May 3 08:09:43 2008 From: lists at cheimes.de (Christian Heimes) Date: Sat, 03 May 2008 14:09:43 +0200 Subject: [Numpy-discussion] python memory use In-Reply-To: References: Message-ID: Robin schrieb: > If I try to allocate something too big for the available memory I > often get a MemoryError exception. However, in other situations, > Python memory use continues to grow until the machine falls over. I > was hoping to understand the difference between those cases. From what > I've read Python never returns memory to the OS (is this right?) so > the second case, python is holding on to memory that it isn't really > using (for objects that have been destroyed). I guess my question is > why doesn't it reuse the memory freed from object deletions instead of > requesting more - and even then when requesting more, why does it > continue until the machine falls over and not cause a MemoryError? Your assumption isn't correct. Python releases memory. For small objects Python uses its own memory allocation system as explained in http://svn.python.org/projects/python/trunk/Objects/obmalloc.c . 
For integer and floats uses a separate block allocation schema. Christian From kwgoodman at gmail.com Sat May 3 09:48:34 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 3 May 2008 06:48:34 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <4db580fd0805022351y6e3fe1b9xaad02e8cc1ee5787@mail.gmail.com> References: <4db580fd0805022351y6e3fe1b9xaad02e8cc1ee5787@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 11:51 PM, Hoyt Koepke wrote: > You know, for linkage clustering and BHC, I've found it a lot easier > to work with an intermediate 1d map of indices and never resize the > distance matrix. I then just remove one element from this map at each > iteration, which is a LOT faster than removing a column and a row from > a matrix. if idxmap were the map, you would be working with > X[idxmap[i], idxmap[j] ] instead of X[i, j]. Also, you could even use > a python list in this case, as they are a lot better for deletions > than an array. I thought of replacing the row and column with 1e10 instead of deleting it. But your idea is better. If I use lists then I bet Psyco will speed things up. Thanks for the tip. From strawman at astraw.com Sat May 3 12:27:25 2008 From: strawman at astraw.com (Andrew Straw) Date: Sat, 03 May 2008 09:27:25 -0700 Subject: [Numpy-discussion] python memory use In-Reply-To: References: Message-ID: <481C926D.6090900@astraw.com> Robin wrote: > Hi, > > I am starting to push the limits of the available memory and I'd like > to understand a bit better how Python handles memory... > This is why I switched to 64 bit linux and never looked back. > If I try to allocate something too big for the available memory I > often get a MemoryError exception. However, in other situations, > Python memory use continues to grow until the machine falls over. I > was hoping to understand the difference between those cases. I don't know what "falls over" mean. It could be that you're getting swap death -- the kernel starts attempting to use virtual memory (hard disk) for some of the RAM. This would be characterized by your CPU use dropping to near-zero, your hard disk grinding away, and your swap space use increasing. The MemoryError simply means that Python made a request for memory that the kernel didn't grant. There's something else you might run into -- the maximum memory size of a process before the kernel kills that process. On linux i686, IIRC this limit is 3 GB. I'm not sure why you get different behavior on different runs. FWIW, with 64 bit linux the worst that happens to me now is swap death, which can be forestalled by adding lots of RAM. > From what > I've read Python never returns memory to the OS (is this right?) so > the second case, python is holding on to memory that it isn't really > using (for objects that have been destroyed). I guess my question is > why doesn't it reuse the memory freed from object deletions instead of > requesting more - and even then when requesting more, why does it > continue until the machine falls over and not cause a MemoryError? > It's hard to say without knowing what your code does. A first guess is that you're allocating lots of memory without allowing it to be freed. Specifically, you may have references to objects which you no longer need, and you should eliminate those references and allow them to be garbage collected. In some cases, circular references can be hard for python to detect, so you might want to play around with the gc module and judicious use of the del statement. 
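A minimal illustration of that advice (the variable names and sizes here are made up for the example):

import gc
import numpy as np

data = np.random.rand(2000, 2000)   # a large intermediate, roughly 30 MB
result = data.mean(axis=0)          # keep only the reduced result
del data                            # drop the big reference so the buffer can be freed
gc.collect()                        # sweep up anything caught in reference cycles
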
Note also that IPython keeps references to past results by default (the history). > While investigating this I found this script: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474 > which does wonders for my code. I was wondering if this function > should be included in Numpy as it seems to provide an important > feature, or perhaps an entry on the wiki (in Cookbook section?) > I don't think it belongs in numpy per se, and I'm not sure of the necessity of a spot on the scipy cookbook given that it's in the python cookbook. Perhaps more useful would be starting a page called "MemoryIssues" on the scipy wiki -- I imagine this subject, as a whole, is of particular interest for many in the numpy/scipy crowd. Certainly adding a link and description to that recipe would be useful in that context. But please, feel free to add to or edit the wiki as you see fit -- if you think something will be useful, by all means, go ahead and do it. I think there are enough eyes on the wiki that it's fairly self-regulating. -Andrew From Chris.Barker at noaa.gov Sat May 3 20:05:51 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sat, 03 May 2008 17:05:51 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> Message-ID: <481CFDDF.8050309@noaa.gov> Robert Kern wrote: > I can get a ~20% improvement with the following: > In [9]: def mycut(x, i): > ...: A = x[:i,:i] > ...: B = x[:i,i+1:] > ...: C = x[i+1:,:i] > ...: D = x[i+1:,i+1:] > ...: return hstack([vstack([A,C]),vstack([B,D])]) Might it be a touch faster to built the final array first, then fill it: def mycut(x, i): r,c = x.shape out = np.empty((r-1, c-1), dtype=x.dtype) out[:i,:i] = x[:i,:i] out[:i,i:] = x[:i,i+1:] out[i:,:i] = x[i+1:,:i] out[i:,i+1:] = x[i+1:,i+1:] return out totally untested. That should save the creation of two temporaries. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From robert.kern at gmail.com Sat May 3 20:20:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 3 May 2008 19:20:34 -0500 Subject: [Numpy-discussion] Faster In-Reply-To: <481CFDDF.8050309@noaa.gov> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: <3d375d730805031720y4699a02fl7e4a29f98f118b@mail.gmail.com> On Sat, May 3, 2008 at 7:05 PM, Christopher Barker wrote: > Robert Kern wrote: > > I can get a ~20% improvement with the following: > > > In [9]: def mycut(x, i): > > ...: A = x[:i,:i] > > ...: B = x[:i,i+1:] > > ...: C = x[i+1:,:i] > > ...: D = x[i+1:,i+1:] > > ...: return hstack([vstack([A,C]),vstack([B,D])]) > > Might it be a touch faster to built the final array first, then fill it: > > def mycut(x, i): > r,c = x.shape > out = np.empty((r-1, c-1), dtype=x.dtype) > out[:i,:i] = x[:i,:i] > out[:i,i:] = x[:i,i+1:] > out[i:,:i] = x[i+1:,:i] > out[i:,i+1:] = x[i+1:,i+1:] > return out > > totally untested. > > That should save the creation of two temporaries. After fixing the last statement to "out[i:,i:] = ...", yes, much faster. About a factor of 5 with N=500. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Sat May 3 20:31:05 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 3 May 2008 17:31:05 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <481CFDDF.8050309@noaa.gov> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: On Sat, May 3, 2008 at 5:05 PM, Christopher Barker wrote: > Robert Kern wrote: > > I can get a ~20% improvement with the following: > > > > In [9]: def mycut(x, i): > > ...: A = x[:i,:i] > > ...: B = x[:i,i+1:] > > ...: C = x[i+1:,:i] > > ...: D = x[i+1:,i+1:] > > ...: return hstack([vstack([A,C]),vstack([B,D])]) > > Might it be a touch faster to built the final array first, then fill it: > > def mycut(x, i): > r,c = x.shape > out = np.empty((r-1, c-1), dtype=x.dtype) > out[:i,:i] = x[:i,:i] > out[:i,i:] = x[:i,i+1:] > out[i:,:i] = x[i+1:,:i] > out[i:,i+1:] = x[i+1:,i+1:] > return out > > totally untested. > > That should save the creation of two temporaries. Initializing the array makes sense. And it is super fast: >> timeit mycut(x, 6) 100 loops, best of 3: 7.48 ms per loop >> timeit mycut2(x, 6) 1000 loops, best of 3: 1.5 ms per loop The time it takes to cluster went from about 1.9 seconds to 0.7 seconds! Thank you. When I run the single linkage clustering on my data I get one big cluster and a bunch of tiny clusters. So I need to try a different linkage method. Average linkage sounds good, but it sounds hard to code. From hoytak at gmail.com Sat May 3 20:56:15 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Sat, 3 May 2008 17:56:15 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: <4db580fd0805031756g50995555k93f37fadf9b4dc8b@mail.gmail.com> You could also try complete linkage, where you merge two clusters based on the farthest distance between points in two clusters instead of the smallest. This will tend to get clusters of equal size (which isn't always ideal, either). However, it also uses sufficient statistics, so it will be trivial to change your code to use that merge criteria if you want to try it. --Hoyt On Sat, May 3, 2008 at 5:31 PM, Keith Goodman wrote: > On Sat, May 3, 2008 at 5:05 PM, Christopher Barker > wrote: > > > Robert Kern wrote: > > > I can get a ~20% improvement with the following: > > > > > > > In [9]: def mycut(x, i): > > > ...: A = x[:i,:i] > > > ...: B = x[:i,i+1:] > > > ...: C = x[i+1:,:i] > > > ...: D = x[i+1:,i+1:] > > > ...: return hstack([vstack([A,C]),vstack([B,D])]) > > > > Might it be a touch faster to built the final array first, then fill it: > > > > def mycut(x, i): > > r,c = x.shape > > out = np.empty((r-1, c-1), dtype=x.dtype) > > out[:i,:i] = x[:i,:i] > > out[:i,i:] = x[:i,i+1:] > > out[i:,:i] = x[i+1:,:i] > > out[i:,i+1:] = x[i+1:,i+1:] > > return out > > > > totally untested. > > > > That should save the creation of two temporaries. > > Initializing the array makes sense. And it is super fast: > > >> timeit mycut(x, 6) > 100 loops, best of 3: 7.48 ms per loop > >> timeit mycut2(x, 6) > 1000 loops, best of 3: 1.5 ms per loop > > The time it takes to cluster went from about 1.9 seconds to 0.7 > seconds! Thank you. > > When I run the single linkage clustering on my data I get one big > cluster and a bunch of tiny clusters. So I need to try a different > linkage method. Average linkage sounds good, but it sounds hard to > code. 
> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From kwgoodman at gmail.com Sat May 3 21:24:16 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 3 May 2008 18:24:16 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <4db580fd0805031756g50995555k93f37fadf9b4dc8b@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> <4db580fd0805031756g50995555k93f37fadf9b4dc8b@mail.gmail.com> Message-ID: On Sat, May 3, 2008 at 5:56 PM, Hoyt Koepke wrote: > You could also try complete linkage, where you merge two clusters > based on the farthest distance between points in two clusters instead > of the smallest. This will tend to get clusters of equal size (which > isn't always ideal, either). However, it also uses sufficient > statistics, so it will be trivial to change your code to use that > merge criteria if you want to try it. Thanks for the tip. The cluster sizes are much more reasonable now. From ndbecker2 at gmail.com Sun May 4 05:11:56 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 04 May 2008 05:11:56 -0400 Subject: [Numpy-discussion] strict aliasing? Message-ID: Is it safe to compile numpy with gcc 'strict aliasing'? From F.Boonstra at inter.nl.net Sun May 4 05:52:44 2008 From: F.Boonstra at inter.nl.net (Folkert Boonstra) Date: Sun, 04 May 2008 11:52:44 +0200 Subject: [Numpy-discussion] Learn about numpy Message-ID: <481D876C.1070001@inter.nl.net> With a python background but new to numpy, I have the following. Suppose I have a 2-D array and I want to apply a function to each element. The function needs to access the direct neighbouring elements in order to set a new value for the element. How would I do that in the most efficient way with numpy? Currently I have a uint8 array (self.bufbw) filled with 0 and 1 elements: def applyRule(self, rule): for xy in self.xydims: rule(xy) def rule(self, xy): x = xy[0]; y = xy[1] sum = self.bufbw[x-1:x+2, y-1:y+2].sum() \ - self.bufbw[x-1,y-1] - self.bufbw[x+1,y-1] \ - self.bufbw[x-1,y+1] - self.bufbw[x+1,y+1] if sum == 1: self.bufbw[x,y] = 1 else: self.bufbw[x,y] = 0 I have looked at the documentation online but couldn't find another faster solution yet. Does anyone want to share some ideas on a faster solution with numpy? Thanks, Folkert From nadavh at visionsense.com Sun May 4 06:13:00 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 4 May 2008 13:13:00 +0300 Subject: [Numpy-discussion] Learn about numpy References: <481D876C.1070001@inter.nl.net> Message-ID: <710F2847B0018641891D9A216027636029C132@ex3.envision.co.il> What you do here is a convolution with 0 1 0 1 1 1 0 1 0 kernel, and thresholding, you can use numpy.numarray.nd_image package: import numpy.numarray.nd_image as NI . . . ker = array([[0,1,0], [1,1,1],[0,1,0]]) result = (NI.convolve(self.bufbw, ker) == 1).astype(uint8) for nore general cases you can use the function generic_filter in the same package. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Folkert Boonstra ????: ? 04-???-08 12:52 ??: numpy-discussion at scipy.org ????: [Numpy-discussion] Learn about numpy With a python background but new to numpy, I have the following. 
Suppose I have a 2-D array and I want to apply a function to each element. The function needs to access the direct neighbouring elements in order to set a new value for the element. How would I do that in the most efficient way with numpy? Currently I have a uint8 array (self.bufbw) filled with 0 and 1 elements: def applyRule(self, rule): for xy in self.xydims: rule(xy) def rule(self, xy): x = xy[0]; y = xy[1] sum = self.bufbw[x-1:x+2, y-1:y+2].sum() \ - self.bufbw[x-1,y-1] - self.bufbw[x+1,y-1] \ - self.bufbw[x-1,y+1] - self.bufbw[x+1,y+1] if sum == 1: self.bufbw[x,y] = 1 else: self.bufbw[x,y] = 0 I have looked at the documentation online but couldn't find another faster solution yet. Does anyone want to share some ideas on a faster solution with numpy? Thanks, Folkert _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3493 bytes Desc: not available URL: From malkarouri at yahoo.co.uk Sun May 4 09:39:04 2008 From: malkarouri at yahoo.co.uk (Muhammad Alkarouri) Date: Sun, 4 May 2008 14:39:04 +0100 (BST) Subject: [Numpy-discussion] python memory use In-Reply-To: Message-ID: <302023.10217.qm@web27907.mail.ukl.yahoo.com> --- Robin wrote: [...] > While investigating this I found this script: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474 > which does wonders for my code. I was wondering if this function > should be included in Numpy as it seems to provide an important > feature, or perhaps an entry on the wiki (in Cookbook section?) I am the author of the mentioned recipe, and the reason I have written it is similar to your situation. I would add, however that ideally there shouldn't be such a problem but in reality there is. I have no clue why. As Christian said, Python does release memory. There was a problem before Python 2.5 as I understand, but the memory manager was patched (see http://evanjones.ca/python-memory-part3.html) and now I personally don't use Python <2.5 for that reason. The new manager helped, but still I face that problem, so I wrote the recipe. --- Andrew wrote: [...] > It's hard to say without knowing what your code does. A first guess is > that you're allocating lots of memory without allowing it to be freed. > Specifically, you may have references to objects which you no longer > need, and you should eliminate those references and allow them to be > garbage collected. In some cases, circular references can be hard for > python to detect, so you might want to play around with the gc module > and judicious use of the del statement. Note also that IPython keeps > references to past results by default (the history). Sound advice, specially the part about iPython which is often overlooked. I would have to say I have tried to play a lot with the gc module, calling gc.collect / enable / disable / playing with thresholds. In practice it helps a little but not much. In my experience, it is more likely in numpy code using only arrays of numbers to have references/views to arrays that you do not need than to have circular references. I haven't looked at the internals of gc, obmalloc or any other Python code. What happens to me is usually the machine starts to use virtual memory, slowing the whole computation a lot. 
I wonder if your algorithm that needs allocation of huge memory to cause a MemoryError can be modified to avoid that. I have found that to be the case is some situations. As an example, for PCA you might find depending on you matrix size the use of the transpose or other algorithms more suitable -- I ended up using http://folk.uio.no/henninri/pca_module. While I am of course partial to the fate of the cookbook recipe, I also feel that it doesn't directly belong in numpy -- it should be useful for other Pythonistas. May be in numpy, python proper somewhere, or one of the parallel processing libraries. I agree that a wiki page will be more beneficial -- though not sure what else should be there. Regards, Muhammad Alkarouri __________________________________________________________ Sent from Yahoo! Mail. A Smarter Email http://uk.docs.yahoo.com/nowyoucan.html From lists at informa.tiker.net Sun May 4 10:28:39 2008 From: lists at informa.tiker.net (Andreas =?iso-8859-1?q?Kl=F6ckner?=) Date: Sun, 4 May 2008 10:28:39 -0400 Subject: [Numpy-discussion] strict aliasing? In-Reply-To: References: Message-ID: <200805041028.45645.lists@informa.tiker.net> On Sonntag 04 Mai 2008, Neal Becker wrote: > Is it safe to compile numpy with gcc 'strict aliasing'? It seems that numpy (and most other Python-related C code) would constantly be casting back and forth between PyObject * and PyArrayObject * (and others). Does strict aliasing allow that, as long as each struct member is only ever accessed through one type? Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From tim.hochberg at ieee.org Sun May 4 10:40:40 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 4 May 2008 07:40:40 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: On Sat, May 3, 2008 at 5:31 PM, Keith Goodman wrote: > On Sat, May 3, 2008 at 5:05 PM, Christopher Barker > wrote: > > Robert Kern wrote: > > > I can get a ~20% improvement with the following: > > > > > > > In [9]: def mycut(x, i): > > > ...: A = x[:i,:i] > > > ...: B = x[:i,i+1:] > > > ...: C = x[i+1:,:i] > > > ...: D = x[i+1:,i+1:] > > > ...: return hstack([vstack([A,C]),vstack([B,D])]) > > > > Might it be a touch faster to built the final array first, then fill it: > > > > def mycut(x, i): > > r,c = x.shape > > out = np.empty((r-1, c-1), dtype=x.dtype) > > out[:i,:i] = x[:i,:i] > > out[:i,i:] = x[:i,i+1:] > > out[i:,:i] = x[i+1:,:i] > > out[i:,i+1:] = x[i+1:,i+1:] > > return out > > > > totally untested. > > > > That should save the creation of two temporaries. > > Initializing the array makes sense. And it is super fast: > > >> timeit mycut(x, 6) > 100 loops, best of 3: 7.48 ms per loop > >> timeit mycut2(x, 6) > 1000 loops, best of 3: 1.5 ms per loop If you don't need the old array after the cut, I think that you could use the input array as the output array and then take a slice, saving a temporary and one-quarter of your assignments (on average). Something like. def destructive_cut(x, i): # Untested out = x[:-1,:-1] out[:i,i:] = x[:i,i+1:] out[i:,:i] = x[i+1:,:i] out[i:,i:] = x[i+1:,i+1:] return out If you were really clever, you could take different initial slices based on i so that you always skipped at least one-quarter of the assignments, but that's probably not worth the effort. 
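A quick sanity check of the in-place idea, assuming the destructive_cut defined just above (the comparison against np.delete is mine, not from the thread; remember that x itself is clobbered):

import numpy as np

x = np.random.rand(5, 5)
reference = np.delete(np.delete(x, 2, axis=0), 2, axis=1)  # slow but obviously correct
y = destructive_cut(x, 2)
assert np.allclose(y, reference)
assert y.base is x   # y is just a view; x's buffer has been reused
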
> > The time it takes to cluster went from about 1.9 seconds to 0.7 > seconds! Thank you. > > When I run the single linkage clustering on my data I get one big > cluster and a bunch of tiny clusters. So I need to try a different > linkage method. Average linkage sounds good, but it sounds hard to > code. > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sun May 4 11:05:27 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 4 May 2008 08:05:27 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: On Sun, May 4, 2008 at 7:40 AM, Timothy Hochberg wrote: > If you don't need the old array after the cut, I think that you could use > the input array as the output array and then take a slice, saving a > temporary and one-quarter of your assignments (on average). Something like. > > def destructive_cut(x, i): # Untested > out = x[:-1,:-1] > > out[:i,i:] = x[:i,i+1:] > out[i:,:i] = x[i+1:,:i] > out[i:,i:] = x[i+1:,i+1:] > return out That's a nice improvement. >> timeit mycut2(x, 6) 100 loops, best of 3: 1.54 ms per loop >> timeit destructive_cut(x, 6) 1000 loops, best of 3: 657 ?s per loop Why is it so slow to copy data or create an empty array? Where is the bottleneck? In this case I would guess, but I know nothing about it, that the array is already in the cpu's cache, so it is fast to read. But to make a copy it needs to write to ram and that is slow? From hoytak at gmail.com Sun May 4 11:11:42 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Sun, 4 May 2008 08:11:42 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: <4db580fd0805040811y5e5eec00h8eeb683fb3121944@mail.gmail.com> Another realization after looking at your code again -- you are doing extra work recomputing the minimum distance over d at each step. Since you are only updating one column and one row, you can have a length n array that gives the minimum (or max) distance for each column, then, after merging, you update the entry corresponding to the new column and then update the others only if the row you're updating has that minimum value in it. Then, when scanning for the min dist, you only need to scan O(n) rows. This improves the runtime of your algorithm from O(n^3) to O(n^2). You will definitely notice that. 
--Hoyt On Sun, May 4, 2008 at 7:40 AM, Timothy Hochberg wrote: > > > > > On Sat, May 3, 2008 at 5:31 PM, Keith Goodman wrote: > > > > On Sat, May 3, 2008 at 5:05 PM, Christopher Barker > > wrote: > > > > > Robert Kern wrote: > > > > I can get a ~20% improvement with the following: > > > > > > > > > > In [9]: def mycut(x, i): > > > > ...: A = x[:i,:i] > > > > ...: B = x[:i,i+1:] > > > > ...: C = x[i+1:,:i] > > > > ...: D = x[i+1:,i+1:] > > > > ...: return hstack([vstack([A,C]),vstack([B,D])]) > > > > > > Might it be a touch faster to built the final array first, then fill > it: > > > > > > def mycut(x, i): > > > r,c = x.shape > > > out = np.empty((r-1, c-1), dtype=x.dtype) > > > out[:i,:i] = x[:i,:i] > > > out[:i,i:] = x[:i,i+1:] > > > out[i:,:i] = x[i+1:,:i] > > > out[i:,i+1:] = x[i+1:,i+1:] > > > return out > > > > > > totally untested. > > > > > > That should save the creation of two temporaries. > > > > Initializing the array makes sense. And it is super fast: > > > > >> timeit mycut(x, 6) > > 100 loops, best of 3: 7.48 ms per loop > > >> timeit mycut2(x, 6) > > 1000 loops, best of 3: 1.5 ms per loop > > If you don't need the old array after the cut, I think that you could use > the input array as the output array and then take a slice, saving a > temporary and one-quarter of your assignments (on average). Something like. > > def destructive_cut(x, i): # Untested > out = x[:-1,:-1] > > out[:i,i:] = x[:i,i+1:] > out[i:,:i] = x[i+1:,:i] > out[i:,i:] = x[i+1:,i+1:] > return out > > If you were really clever, you could take different initial slices based on > i so that you always skipped at least one-quarter of the assignments, but > that's probably not worth the effort. > > > > > > > > > > The time it takes to cluster went from about 1.9 seconds to 0.7 > > seconds! Thank you. > > > > When I run the single linkage clustering on my data I get one big > > cluster and a bunch of tiny clusters. So I need to try a different > > linkage method. Average linkage sounds good, but it sounds hard to > > code. > > > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > . __ > . |-\ > . > . tim.hochberg at ieee.org > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From hoytak at gmail.com Sun May 4 11:14:17 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Sun, 4 May 2008 08:14:17 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <4db580fd0805040811y5e5eec00h8eeb683fb3121944@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> <4db580fd0805040811y5e5eec00h8eeb683fb3121944@mail.gmail.com> Message-ID: <4db580fd0805040814m10e2a7cdt3cb57b161f7def71@mail.gmail.com> > and then update the others only if the row you're updating has that > minimum value in it. Then, when scanning for the min dist, you only > need to scan O(n) rows. Sorry, let me clarify -- Update the entries corresponding to entries in the row you're updating if they are the same as the minimum distance; in that case you need to rescan the row. Also, sorry, you only need to scan O(n) entries in your cached array.
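For what it's worth, here is a rough sketch of the bookkeeping Hoyt describes (not his or Keith's actual code; the function and variable names are made up, and the per-row update loop is left as plain Python):

import numpy as np

def cluster_order(d, linkage='single'):
    # Keep a cached per-row nearest-neighbour distance so that each merge
    # scans O(n) cached values instead of the full n x n matrix.
    d = d.astype(float).copy()
    n = d.shape[0]
    combine = np.minimum if linkage == 'single' else np.maximum
    big = d.max() + 1.0
    d.flat[::n + 1] = big                 # keep the diagonal out of the minima
    active = np.ones(n, dtype=bool)
    row_min = d.min(axis=1)               # the cache
    merges = []
    for _ in range(n - 1):
        i = int(np.argmin(np.where(active, row_min, np.inf)))
        j = int(np.argmin(d[i]))          # partner achieving row_min[i]
        merges.append((i, j, d[i, j]))
        old_i, old_j = d[:, i].copy(), d[:, j].copy()
        d[i, :] = combine(d[i, :], d[j, :])   # merge cluster j into cluster i
        d[:, i] = d[i, :]
        d[i, i] = big
        d[j, :] = big                         # retire cluster j
        d[:, j] = big
        active[j] = False
        row_min[j] = np.inf
        row_min[i] = d[i].min()
        # Other rows changed only in columns i and j, so a full rescan is
        # needed only when a row's cached minimum sat in one of those columns.
        for k in np.flatnonzero(active):
            if k == i:
                continue
            if row_min[k] in (old_i[k], old_j[k]):
                row_min[k] = d[k].min()       # occasional O(n) rescan
            else:
                row_min[k] = min(row_min[k], d[k, i])
    return merges

On the small tutorial matrix discussed later in this thread this gives the single-linkage merge distances 138, 219, 255, 268, 295.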
From kwgoodman at gmail.com Sun May 4 11:46:11 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 4 May 2008 08:46:11 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <4db580fd0805040814m10e2a7cdt3cb57b161f7def71@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> <4db580fd0805040811y5e5eec00h8eeb683fb3121944@mail.gmail.com> <4db580fd0805040814m10e2a7cdt3cb57b161f7def71@mail.gmail.com> Message-ID: On Sun, May 4, 2008 at 8:14 AM, Hoyt Koepke wrote: > > and then update the others only if the row you're updating has that > > minimum value in it. Then, when scanning for the min dist, you only > > need to scan O(n) rows. > > Sorry, let me clarify -- Update the entries corresponding to entries > in the row you're updating if they are the same as the minimum > distance; in that case you need to rescan the row. Also, ySorry, ou > only need to scan O(n) entries in your cached array. If I understand your improvement, I can speed up dist.min() in i, j = np.where(dist == dist.min()) return i[0], j[0] since it is faster to find the min in a n-element array than in a nxn array. But the new code, thanks to Robert, is ij = x.argmin() i, j = divmod(ij, N) Would a 1d array of the column minimums still help? On a separate note, the distance matrix is symmetric so I could fill the lower half of the distance matrix with large values and, after updating a column skip the step of keeping the matrix symmetric by copying the column into the row (d[i,:] = d[:,i]). Does d[i,:] = d[:,i] make a copy? Here's what I have so far. It's fast for my needs. import time import numpy as np class Cluster: "Single linkage hierarchical clustering" def __init__(self, dist, label=None, linkage='single', debug=False): """ dist Distance matrix, NxN numpy array label Labels of each row of the distance matrix, list of N items, default is range(N) """ assert dist.shape[0] == dist.shape[1], 'dist must be square (nxn)' assert (np.abs(dist - dist.T) < 1e-8).all(), 'dist must be symmetric' if label is None: label = range(dist.shape[0]) assert dist.shape[0] == len(label), 'dist and label must match in size' msg = 'linkage must be single or complete' assert linkage in ('single', 'complete'), msg self.c = [[[z] for z in label]] self.label = label self.linkage = linkage self.dist = dist self.debug = debug def run(self): for level in xrange(len(self.label) - 1): i, j = self.min_dist() self.join(i, j) def join(self, i, j): assert i != j, 'Cannot combine a cluster with itself' # Join labels new = list(self.c[-1]) new[i] = new[i] + new[j] new.pop(j) self.c.append(new) # Join distance matrix if self.linkage == 'single': self.dist[:,i] = self.dist[:,[i,j]].min(1) elif self.linkage == 'complete': self.dist[:,i] = self.dist[:,[i,j]].max(1) else: raise NotImplementedError, 'Unknown linkage method' self.dist[i,:] = self.dist[:,i] # A faster verion of this code... # idx = range(self.dist.shape[1]) # idx.remove(j) # self.dist = self.dist[:,idx] # self.dist = self.dist[idx,:] # ...is this... out = self.dist[:-1,:-1] out[:i,i:] = self.dist[:i,i+1:] out[i:,:i] = self.dist[i+1:,:i] out[i:,i:] = self.dist[i+1:,i+1:] self.dist = out # Debug output if self.debug: print print len(self.c) - 1 print 'Clusters' print self.c[-1] print 'Distance' print self.dist def min_dist(self): # A faster version of this code... 
# dist = self.dist + 1e10 * np.eye(self.dist.shape[0]) # i, j = np.where(dist == dist.min()) # return i[0], j[0] # ...is this: x = self.dist N = x.shape[0] # With complete linkage the min distance was sometimes on the diagonal # I think it occured on the last merge (one cluster). So I added + 1. x.flat[::N+1] = x.max() + 1 ij = x.argmin() i, j = divmod(ij, N) return i, j def test(): # Example from # home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html label = ['BA', 'FI', 'MI', 'NA', 'RM', 'TO'] dist = np.array([[0, 662, 877, 255, 412, 996], [662, 0, 295, 468, 268, 400], [877, 295, 0, 754, 564, 138], [255, 468, 754, 0, 219, 869], [412, 268, 564, 219, 0, 669], [996, 400, 138, 869, 669, 0 ]]) clust = Cluster(dist, label, linkage='single', debug=True) clust.run() def test2(n): x = np.random.rand(n,n) x = x + x.T c = Cluster(x) t1 = time.time() c.run() t2 = time.time() print 'n = %d took %0.2f seconds' % (n, t2-t1) From kwgoodman at gmail.com Sun May 4 12:20:32 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 4 May 2008 09:20:32 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> Message-ID: On Fri, May 2, 2008 at 7:25 PM, Robert Kern wrote: > Assuming x is contiguous and you can modify x in-place: > > > In [1]: from numpy import * > > In [2]: def dist(x): > ...: x = x + 1e10 * eye(x.shape[0]) > ...: i, j = where(x == x.min()) > > ...: return i[0], j[0] > ...: > > In [3]: def symmdist(N): > ...: x = random.rand(N, N) > ...: x = x + x.T > ...: x.flat[::N+1] = 0 > ...: return x > ...: > > In [4]: symmdist(5) > Out[4]: > array([[ 0. , 0.87508654, 1.11691704, 0.80366071, 0.57966808], > [ 0.87508654, 0. , 1.5521685 , 1.74010886, 0.52156877], > [ 1.11691704, 1.5521685 , 0. , 1.22725396, 1.04101992], > [ 0.80366071, 1.74010886, 1.22725396, 0. , 1.94577965], > [ 0.57966808, 0.52156877, 1.04101992, 1.94577965, 0. ]]) > > In [5]: def kerndist(x): > ...: N = x.shape[0] > ...: x.flat[::N+1] = x.max() > ...: ij = argmin(x.flat) > ...: i, j = divmod(ij, N) > ...: return i, j > ...: > > In [10]: x = symmdist(500) > > In [15]: %timeit dist(x) > 10 loops, best of 3: 19.9 ms per loop > > In [16]: %timeit kerndist(x) > 100 loops, best of 3: 4.38 ms per loop I added i, j = divmod(x.argmin(), x.shape[0]) to http://scipy.org/PerformanceTips From charlesr.harris at gmail.com Sun May 4 12:25:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 4 May 2008 10:25:46 -0600 Subject: [Numpy-discussion] strict aliasing? In-Reply-To: References: Message-ID: On Sun, May 4, 2008 at 3:11 AM, Neal Becker wrote: > Is it safe to compile numpy with gcc 'strict aliasing'? > No! And depending on the compiler version you might find whole bits of code disappearing during optimization without warning, yielding fantastic benchmarks but questionable results. The linux kernel won't compile correctly with that flag either, and it has been a longstanding cause of contention with gnu that it is the default. If you search the mailing lists you can find Linus making some nasty comments about it. As far as I can tell, strict aliasing assumes that pointers are only cast between types of the same length. This is a problem in code that casts pointers with abandon between all types. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pwang at enthought.com Sun May 4 13:12:01 2008 From: pwang at enthought.com (Peter Wang) Date: Sun, 4 May 2008 12:12:01 -0500 Subject: [Numpy-discussion] [IT] (updated) Maintenance and scheduled downtime this evening and weekend In-Reply-To: References: Message-ID: Hi everyone, We will need to do some more on the network today, Sunday May 4, from 1pm to 3pm Central time. (This is 2pm-4pm Eastern, 6pm-8pm UTC.) This affects the main Enthought and Scipy.org servers, including SVN, Trac, the mailing lists, and the web site. As usual, we don't anticipate the services being down for the entire time, but there may be intermittent connectivity issues during that time. Please try to avoid editing any Trac wikis during this time, since you may lose your changes. We will try to complete the work as quickly as possible, and we will send a status update when the scheduled work as been completed. Please check http://dr0.enthought.com/status for status updates. On behalf of the IT crew, thanks again for your patience and bearing with us. We recognize that these outages are inconvenient but they are a critical part of the transition to a new infrastructure that better supports the growing needs of the Scipy open-source community. Please let us know if you have any questions or concerns! -Peter From pav at iki.fi Sun May 4 20:17:20 2008 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 05 May 2008 03:17:20 +0300 Subject: [Numpy-discussion] MoinMoin <-> docstrings gateway Message-ID: <1209946640.13399.6.camel@localhost.localdomain> Hi, Some time ago there was discussion about MoinMoin <-> docstrings gateway. Did it produce some results? Anyway, I threw a bit of code together. There's something between a proof-of-concept and final product running now on my desktop machine. You can play with it here: http://pvx.homeip.net/pTSc0V/TestWiki Is there interest to move forward with this? Have fun (if at all possible), Pauli -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From millman at berkeley.edu Sun May 4 20:40:41 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 4 May 2008 19:40:41 -0500 Subject: [Numpy-discussion] MoinMoin <-> docstrings gateway In-Reply-To: <1209946640.13399.6.camel@localhost.localdomain> References: <1209946640.13399.6.camel@localhost.localdomain> Message-ID: On Sun, May 4, 2008 at 7:17 PM, Pauli Virtanen wrote: > Anyway, I threw a bit of code together. There's something between a > proof-of-concept and final product running now on my desktop machine. > You can play with it here: > > http://pvx.homeip.net/pTSc0V/TestWiki > > Is there interest to move forward with this? Very cool. I don't have time to look at this now, but I hope someone else will. This would be a very useful feature. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From eads at lanl.gov Sun May 4 20:59:17 2008 From: eads at lanl.gov (Damian R. Eads) Date: Sun, 4 May 2008 18:59:17 -0600 (MDT) Subject: [Numpy-discussion] Faster In-Reply-To: References: Message-ID: <12424.128.165.0.81.1209949157.squirrel@webmail.lanl.gov> Hi, Looks like a fun discussion: it's too bad for me I did not join it earlier. My first try at scipy-cluster was completely in Python. 
Like you, I also tried to find the most efficient way to transform the distance matrix when joining two clusters. Eventually my data sets became big enough that I decided to write these parts in C. I don't think my Python joining code was efficient as yours. I tried out your first test and I am a little confused at the output. In [107]: goodman.test() 1 Clusters [['BA'], ['FI'], ['MI', 'TO'], ['NA'], ['RM']] Distance [[997 662 255 412 996] [662 997 468 268 400] [255 468 997 219 869] [412 268 219 997 669] [996 400 869 669 997]] 2 Clusters [['BA'], ['FI'], ['MI', 'TO', 'NA'], ['RM']] Distance [[998 662 412 996] [662 998 268 400] [412 268 998 669] [996 400 669 998]] 3 Clusters [['BA'], ['FI', 'MI', 'TO', 'NA'], ['RM']] Distance [[999 412 996] [412 999 669] [996 669 999]] 4 Clusters [['BA', 'FI', 'MI', 'TO', 'NA'], ['RM']] Distance [[1000 669] [ 669 1000]] 5 Clusters [['BA', 'FI', 'MI', 'TO', 'NA', 'RM']] Distance [[1001]] The first step is right, singletons 2 and 5 (starting at 0) should be joined since they have a minimum distance of 138. Let's look at their corresponding rows in the distance matrix. In [101]: DM[[2,5],:] Out[101]: array([[ 877., 295., 10000., 754., 564., 138.], [ 996., 400., 138., 869., 669., 10000.]]) These two rows, rows 2 and 5, are all that we need to form the row for the newly joined cluster in the distance matrix. If we just take the minimum for each column we obtain, In [102]: q=DM[[2,5],:].min(axis=0) Out[102]: array([ 877., 295., 138., 754., 564., 138.]) so the row for the cluster should be the row above with the 2 and 5'th row removed. Roughly, there should be a row in the distance matrix with the following values but I don't see them in your output of the transformed distance matrix. In [103]: q[q != 138] Out[103]: array([ 877., 295., 754., 564.]) Since 295 is the minimum distance between this newly joined cluster and any other singleton, it should not be chosen for the second iteration since singletons 3 and 4 are closer to another with a distance of 219. So after iteration 2, you should get [['BA'], ['FI'], ['MI', 'TO'], ['NA', 'RM']]. Recall that the distance matrix transformation forms a new distance matrix using only values from the previous distance matrix. So, at any iteration, the values in the distance matrix should be a subset of the values in the original distance matrix, eliminating the distance entries of the clusters formed. If we look at the minimum distances in the original distance matrix in rank order, we have 138, 219, 255, 268, 295. Thus, we might expect the minimum distances found at each iteration to be these values, and they are in this case, but I don't have a mathematical proof that it works in general. If I run your distance matrix through hcluster.single, I get the following linkage matrix. Each row represents a non-singleton cluster. The first two columns are the indices of the clusters joined (non-singletons have an index >= n), the third column is their distance, and the fourth, the number of singletons belonging to the cluster. 
array([[ 2., 5., 138., 2.], [ 3., 4., 219., 2.], [ 0., 7., 255., 3.], [ 1., 8., 268., 4.], [ 6., 9., 295., 6.]]) You may wish to look at the dendrogram since it is easier to interpret, http://www.soe.ucsc.edu/~eads/goodman-dendro.png In [105]: lbls Out[105]: ['BA', 'FI', 'MI', 'NA', 'RM', 'TO'] In [106]: hcluster.dendrogram(Z, labels=lbls) FWIW, MATLAB returns equivalent output, >> linkage(dm2, 'single') ans = 3 6 138 4 5 219 1 8 255 2 9 268 7 10 295 I tried running your second test, and you'll see C might give you a better performance speed-up, which is not surprising. Roughly, what I'm doing in C is I'm only storing the upper triangular of the distance matrix. An array of double*'s (double **) refers to each row of this triangle. To eliminate a row, I simply remove the entry in the double ** array. To remove a column, I shift the values over in each row of the triangle. I'm not sure if this is the best approach but it is certainly more efficient than vectorized python, in terms of both memory and computation. In [107]: goodman.test2(1000) n = 1000 took 22.10 seconds In [108]: n=1000 In [109]: uu=numpy.random.rand(n*(n-1)/2) In [110]: tic = time.time(); hcluster.single(uu); toc = time.time(); print toc-tic Out[110]: 4.57607889175 Damian Keith Goodman wrote: > On Fri, May 2, 2008 at 7:25 PM, Robert Kern wrote: >> Assuming x is contiguous and you can modify x in-place: >> >> >> In [1]: from numpy import * >> >> In [2]: def dist(x): >> ...: x = x + 1e10 * eye(x.shape[0]) >> ...: i, j = where(x == x.min()) >> >> ...: return i[0], j[0] >> ...: >> >> In [3]: def symmdist(N): >> ...: x = random.rand(N, N) >> ...: x = x + x.T >> ...: x.flat[::N+1] = 0 >> ...: return x >> ...: >> >> In [4]: symmdist(5) >> Out[4]: >> array([[ 0. , 0.87508654, 1.11691704, 0.80366071, 0.57966808], >> [ 0.87508654, 0. , 1.5521685 , 1.74010886, 0.52156877], >> [ 1.11691704, 1.5521685 , 0. , 1.22725396, 1.04101992], >> [ 0.80366071, 1.74010886, 1.22725396, 0. , 1.94577965], >> [ 0.57966808, 0.52156877, 1.04101992, 1.94577965, 0. ]]) >> >> In [5]: def kerndist(x): >> ...: N = x.shape[0] >> ...: x.flat[::N+1] = x.max() >> ...: ij = argmin(x.flat) >> ...: i, j = divmod(ij, N) >> ...: return i, j >> ...: >> >> In [10]: x = symmdist(500) >> >> In [15]: %timeit dist(x) >> 10 loops, best of 3: 19.9 ms per loop >> >> In [16]: %timeit kerndist(x) >> 100 loops, best of 3: 4.38 ms per loop > > I added > > i, j = divmod(x.argmin(), x.shape[0]) > > to > > http://scipy.org/PerformanceTips From kwgoodman at gmail.com Sun May 4 21:53:44 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 4 May 2008 18:53:44 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: <12424.128.165.0.81.1209949157.squirrel@webmail.lanl.gov> References: <12424.128.165.0.81.1209949157.squirrel@webmail.lanl.gov> Message-ID: On Sun, May 4, 2008 at 5:59 PM, Damian R. Eads wrote: > Hi, > > Looks like a fun discussion: it's too bad for me I did not join it > earlier. My first try at scipy-cluster was completely in Python. Like you, > I also tried to find the most efficient way to transform the distance > matrix when joining two clusters. Eventually my data sets became big > enough that I decided to write these parts in C. I don't think my Python > joining code was efficient as yours. > > I tried out your first test and I am a little confused at the output. Thank you for catching that bug. It was introduced on the last speed up. I pasted code from the list that used i as the index. But in the cluster code I should delete the j-th row and column. 
After fixing that the output matches http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html I should convert that to a unit test. > You may wish to look at the dendrogram since it is easier to interpret, > http://www.soe.ucsc.edu/~eads/goodman-dendro.png Nice. > I tried running your second test, and you'll see C might give you a better > performance speed-up, which is not surprising. Roughly, what I'm doing in > C is I'm only storing the upper triangular of the distance matrix. An > array of double*'s (double **) refers to each row of this triangle. To > eliminate a row, I simply remove the entry in the double ** array. To > remove a column, I shift the values over in each row of the triangle. I'm > not sure if this is the best approach but it is certainly more efficient > than vectorized python, in terms of both memory and computation. > > In [107]: goodman.test2(1000) > n = 1000 took 22.10 seconds > In [108]: n=1000 > In [109]: uu=numpy.random.rand(n*(n-1)/2) > In [110]: tic = time.time(); hcluster.single(uu); toc = time.time(); print > toc-tic > Out[110]: > 4.57607889175 To be (almost) within a factor of 5 is great! But if I get anything close to useful from my code, I'll move over to your industrial strength code. Here's the bug-fixed version: import time import numpy as np class Cluster: "Single linkage hierarchical clustering" def __init__(self, dist, label=None, linkage='single', debug=False): """ dist Distance matrix, NxN numpy array label Labels of each row of the distance matrix, list of N items, default is range(N) """ assert dist.shape[0] == dist.shape[1], 'dist must be square (nxn)' assert (np.abs(dist - dist.T) < 1e-8).all(), 'dist must be symmetric' if label is None: label = range(dist.shape[0]) assert dist.shape[0] == len(label), 'dist and label must match in size' msg = 'linkage must be single or complete' assert linkage in ('single', 'complete'), msg self.c = [[[z] for z in label]] self.label = label self.linkage = linkage self.dist = dist self.debug = debug def run(self): for level in xrange(len(self.label) - 1): i, j = self.min_dist() self.join(i, j) def join(self, i, j): assert i != j, 'Cannot combine a cluster with itself' # Join labels new = list(self.c[-1]) new[i] = new[i] + new[j] new.pop(j) self.c.append(new) # Join distance matrix if self.linkage == 'single': self.dist[:,i] = self.dist[:,[i,j]].min(1) elif self.linkage == 'complete': self.dist[:,i] = self.dist[:,[i,j]].max(1) else: raise NotImplementedError, 'Unknown linkage method' self.dist[i,:] = self.dist[:,i] # A faster verion of this code... #idx = range(self.dist.shape[1]) #idx.remove(j) #self.dist = self.dist[:,idx] #self.dist = self.dist[idx,:] # ...is this... out = self.dist[:-1,:-1] out[:j,j:] = self.dist[:j,j+1:] out[j:,:j] = self.dist[j+1:,:j] out[j:,j:] = self.dist[j+1:,j+1:] self.dist = out # Debug output if self.debug: print print len(self.c) - 1 print 'Clusters' print self.c[-1] print 'Distance' print self.dist def min_dist(self): # A faster version of this code... # dist = self.dist + 1e10 * np.eye(self.dist.shape[0]) # i, j = np.where(dist == dist.min()) # return i[0], j[0] # ...is this: x = self.dist N = x.shape[0] # With complete linkage the min distance was sometimes on the diagonal # I think it occured on the last merge (one cluster). So I added + 1. 
x.flat[::N+1] = x.max() + 1 ij = x.argmin() i, j = divmod(ij, N) return i, j def test(): # Example from # home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html label = ['BA', 'FI', 'MI', 'NA', 'RM', 'TO'] dist = np.array([[0, 662, 877, 255, 412, 996], [662, 0, 295, 468, 268, 400], [877, 295, 0, 754, 564, 138], [255, 468, 754, 0, 219, 869], [412, 268, 564, 219, 0, 669], [996, 400, 138, 869, 669, 0 ]]) clust = Cluster(dist, label, linkage='single', debug=True) clust.run() def test2(n): x = np.random.rand(n,n) x = x + x.T c = Cluster(x) t1 = time.time() c.run() t2 = time.time() print 'n = %d took %0.2f seconds' % (n, t2-t1) From Chris.Barker at noaa.gov Mon May 5 00:22:32 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sun, 04 May 2008 21:22:32 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <481CFDDF.8050309@noaa.gov> Message-ID: <481E8B88.4050805@noaa.gov> Keith Goodman wrote: > Why is it so slow to copy data I think the speed of copying data is highly dependent on what's in cache, but in any case, much faster than: > create an empty array? creating a new array requires a memory allocation, which is apparently a bunch slower than moving data around -- not that I have any idea why! I note about this last "destructive_cut" method -- it keeps using a view to the original array's data block. That may not be an issue at all, but it means that if you are starting with a big array, and removing a bit at time, you'll maintain that big 'ol chunk of data in memory, even when your working set is small. Also the resulting arrays aren't contiguous, which again, may or may not matter. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From david at ar.media.kyoto-u.ac.jp Mon May 5 01:16:24 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 05 May 2008 14:16:24 +0900 Subject: [Numpy-discussion] strict aliasing? In-Reply-To: References: Message-ID: <481E9828.7010109@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > As far as I can tell, strict aliasing assumes that pointers are only > cast between types of the same length. Strictly speaking, strict aliasing just says that locations pointed by pointers do not alias. If you use two pointers of different types, that's one case where the compiler will always assume they do not alias. And this breaks heavily I think in numpy, where a lot of casting is done, since internally a data buffer is a char*. Converting from char* to any other type pointer often breaks the strict aliasing rule: http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_aliasing.html#cast_to_char_pointer Using numscons with numpy trunk: CFLAGS="-fstrict-aliasing -Wstrict-aliasing" python setupscons.py scons certainly generate tons of warnings, and -Wstrict-aliasing does only catch the most common ones. 
It is pretty safe to say it is not safe at all :) cheers, David From gael.varoquaux at normalesup.org Mon May 5 01:48:00 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 5 May 2008 07:48:00 +0200 Subject: [Numpy-discussion] MoinMoin <-> docstrings gateway In-Reply-To: <1209946640.13399.6.camel@localhost.localdomain> References: <1209946640.13399.6.camel@localhost.localdomain> Message-ID: <20080505054800.GC8593@phare.normalesup.org> On Mon, May 05, 2008 at 03:17:20AM +0300, Pauli Virtanen wrote: > Some time ago there was discussion about MoinMoin <-> docstrings > gateway. Did it produce some results? My girlfriend, Emmanuelle, (Cced, I am not sure she follows this mailing list) has been working on this, with some progress. > Anyway, I threw a bit of code together. There's something between a > proof-of-concept and final product running now on my desktop machine. > You can play with it here: > http://pvx.homeip.net/pTSc0V/TestWiki Sweet. Some comments: * A lot of the docstrings are not valid rst. This is not your fault, but we will had to fix this in the long run. * I would prefer if the main page was broken up into one per function. I know this is the way it is in the actual wiki layout, but I think it would be better if it was presented this way to the user. Anyway, this is debatable. * Emmanuelle has functions to read from the wiki and write to it from a remote client. I am not sure how well they work, but it would be nice not to require a login and rights on the server to generate patches. > Is there interest to move forward with this? There is. With Emmanuelle and Stefan van der Waalt, who has also been following the project, we were considering using a webapp running with turbogears to move forward. They would know better what the status is. Congratulations for that. Let us hope you can join forces with the other team working on that to bring this project to its success. Cheers, Ga?l From charlesr.harris at gmail.com Mon May 5 02:52:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 5 May 2008 00:52:57 -0600 Subject: [Numpy-discussion] strict aliasing? In-Reply-To: <481E9828.7010109@ar.media.kyoto-u.ac.jp> References: <481E9828.7010109@ar.media.kyoto-u.ac.jp> Message-ID: On Sun, May 4, 2008 at 11:16 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > As far as I can tell, strict aliasing assumes that pointers are only > > cast between types of the same length. > > Strictly speaking, strict aliasing just says that locations pointed by > pointers do not alias. If you use two pointers of different types, > that's one case where the compiler will always assume they do not alias. Interesting article. I note that it is OK to alias pointers to the signed and unsigned versions of integer types, which is where I must have picked up my notions about length. I don't recall seeing any major bit of software that didn't use the -fno-strict-aliasing flag, as casting pointers around is one of the things C is(was) all about. So I was a bit surprised that Mike was recommending not doing so, although making the choice on a file by file basis might be useful for the optimizer. But the assumption of pointers and aliasing is so built into the whole C outlook that I wouldn't be surprised if any large program developed obscure bugs when compiled with the strict-aliasing flag. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Mon May 5 02:58:57 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 05 May 2008 15:58:57 +0900 Subject: [Numpy-discussion] strict aliasing? In-Reply-To: References: <481E9828.7010109@ar.media.kyoto-u.ac.jp> Message-ID: <481EB031.1080001@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > Interesting article. I note that it is OK to alias pointers to the > signed and unsigned versions of integer types, which is where I must > have picked up my notions about length. I don't recall seeing any > major bit of software that didn't use the -fno-strict-aliasing flag, > as casting pointers around is one of the things C is(was) all about. > So I was a bit surprised that Mike was recommending not doing so, > although making the choice on a file by file basis might be useful for > the optimizer. Well, as for the case of Linus' comments on LKML, you have to take things into context, I think: this is an article about programming the Cell CPU on the PS3. And if there is one domain where performance matters a lot, it is the video game domain, and the complexity of modern video games is several orders higher than most HPC softwares (the 80/20 rule is mostly a fallacy in video games). So everything counts. C99 introduces several rules related to aliasing, as mentioned in the article, so I am not sure you can say that C point is about aliasing pointers. Seeing what happens with strict aliasing on a file per file basis would be interesting (for example, in the ufunc code: at this point, there is no aliasing possible anymore I think). It is also mentioned that a lot of code can be made "strict aliasing" safe: I don't know how far we could go in numpy. It would be quite difficult, I guess, because you would have to use special code path depending on whether there is potential aliasing or not. cheers, David From pwang at enthought.com Mon May 5 05:44:23 2008 From: pwang at enthought.com (Peter Wang) Date: Mon, 5 May 2008 04:44:23 -0500 Subject: [Numpy-discussion] [IT] Weekend outage complete In-Reply-To: References: Message-ID: <4364B874-7A19-4F77-88D1-F8855058CF1E@enthought.com> Hi everyone, The downtime took a little longer than expected (perhaps that is to be expected?), but everything should be back up and running now. Mail, web, SVN, and Trac for scipy.org and enthought.com are all functional. The mail server is working through some backlogged mail but that should clear up in a few hours. Thanks again for your patience during this upgrade process. We're in much better shape now to continue improving our network without causing such intrusive outages in the future. Please let me know if you have any questions or comments. On behalf of the IT crew, Thanks, Peter From F.Boonstra at inter.nl.net Mon May 5 06:33:10 2008 From: F.Boonstra at inter.nl.net (Folkert Boonstra) Date: Mon, 05 May 2008 12:33:10 +0200 Subject: [Numpy-discussion] Learn about numpy In-Reply-To: <710F2847B0018641891D9A216027636029C132@ex3.envision.co.il> References: <481D876C.1070001@inter.nl.net> <710F2847B0018641891D9A216027636029C132@ex3.envision.co.il> Message-ID: <481EE266.7000300@inter.nl.net> Nadav Horesh schreef: > What you do here is a convolution with > > 0 1 0 > 1 1 1 > 0 1 0 > > kernel, and thresholding, you can use numpy.numarray.nd_image package: > > import numpy.numarray.nd_image as NI > . > . > . 
> ker = array([[0,1,0], [1,1,1],[0,1,0]]) > result = (NI.convolve(self.bufbw, ker) == 1).astype(uint8) > > for nore general cases you can use the function generic_filter in the same package. > > Nadav. > > Thanks, that works ok! However if instead of uint8, uint32 values are used, the result only contains zeros. Or am I doing something wrong? Folkert import numpy import numpy.numarray.nd_image as NI B = numpy.zeros((5,5), dtype=numpy.uint8) C = numpy.zeros((5,5), dtype=numpy.uint32) DC = 4278190280 LC = 4278241280 B[:] = 0 B[1,1] = 1 B[2,2] = 1 C[:] = DC C[1,1] = LC C[2,2] = LC ker01 = numpy.array([[0,1,0], \ [1,1,1], \ [0,1,0]]) kerCC = numpy.array([[C[0,0],C[1,1],C[0,0]], \ [C[1,1],C[1,1],C[1,1]], \ [C[0,0],C[1,1],C[0,0]]]).astype(numpy.uint32) r1 = NI.convolve(B, ker01).astype(numpy.uint8) r2 = (NI.convolve(B, ker01) == 1).astype(numpy.uint8) r3 = NI.convolve(C, kerCC).astype(numpy.uint32) r4 = (NI.convolve(C, kerCC) == C[0,0]).astype(numpy.uint32) From bala.biophysics at gmail.com Mon May 5 07:14:37 2008 From: bala.biophysics at gmail.com (Bala subramanian) Date: Mon, 5 May 2008 16:44:37 +0530 Subject: [Numpy-discussion] numpy in RHEL4 Message-ID: <288df32a0805050414m61b3442aq40b71babbddc77b8@mail.gmail.com> Dear friends, I am trying to install numpy version numpy-1.0.4 in RHEL 4. My python version is 2.3.4. While installation, it throws me the following error and stops. Kindly write me how to get rid of this. Thanks, Bala ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- /usr/bin/ld: skipping incompatible /usr/lib/liblapack.so when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/liblapack.a when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../liblapack.so when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../liblapack.a when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/liblapack.so when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/liblapack.a when searching for -llapack /usr/bin/ld: cannot find -llapack collect2: ld returned 1 exit status /usr/bin/ld: skipping incompatible /usr/lib/liblapack.so when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/liblapack.a when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../liblapack.so when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../liblapack.a when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/liblapack.so when searching for -llapack /usr/bin/ld: skipping incompatible /usr/lib/liblapack.a when searching for -llapack /usr/bin/ld: cannot find -llapack collect2: ld returned 1 exit status error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared build/temp.linux-x86_64-2.3/numpy/linalg/lapack_litemodule.o -L/usr/lib -llapack -lblas -lg2c -o build/lib.linux-x86_64-2.3/numpy/linalg/lapack_lite.so" failed with exit status 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon May 5 08:44:51 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 05 May 2008 21:44:51 +0900 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? 
Message-ID: <481F0143.3030700@ar.media.kyoto-u.ac.jp> Hi, While working again on the fftpack module, to clean things up and speed some backends (in particular fftw3, which is really sub-optimal right now), I remembered how much unaligned data pointer in numpy arrays hurt performances. So I would like to relaunch the discussion on aligned allocators and default alignement for numpy arrays : http://www.mail-archive.com/numpy-discussion at scipy.org/msg04005.html Basically, what I have in mind is, in a first step (for numpy 1.2): - define functions to allocate on a given alignement - make PyMemData_NEW 16 byte aligned by default (to be compatible with SSE and co). The problem was, and still is, realloc. It is not possible to implement realloc with malloc/free (in a portable way), and as such, it is not possible to have an aligned realloc. In numpy, we can always replace realloc by malloc/free, because we know the size of the old block: would deprecating PyMemData_RENEW and replacing them by PyMemeData_NEW/PyMemData_FREE be possible, such as to make all numpy arrays follow a default alignement ? There are only a few of them in numpy (6 of them), 0 in scipy, and I guess extensions never really used them ? cheers, David From F.Boonstra at inter.nl.net Mon May 5 12:17:04 2008 From: F.Boonstra at inter.nl.net (Folkert Boonstra) Date: Mon, 05 May 2008 18:17:04 +0200 Subject: [Numpy-discussion] Learn about numpy In-Reply-To: <481EE266.7000300@inter.nl.net> References: <481D876C.1070001@inter.nl.net> <710F2847B0018641891D9A216027636029C132@ex3.envision.co.il> <481EE266.7000300@inter.nl.net> Message-ID: <481F3300.8060301@inter.nl.net> Folkert Boonstra schreef: > Nadav Horesh schreef: > >> What you do here is a convolution with >> >> 0 1 0 >> 1 1 1 >> 0 1 0 >> >> kernel, and thresholding, you can use numpy.numarray.nd_image package: >> >> import numpy.numarray.nd_image as NI >> . >> . >> . >> ker = array([[0,1,0], [1,1,1],[0,1,0]]) >> result = (NI.convolve(self.bufbw, ker) == 1).astype(uint8) >> >> for nore general cases you can use the function generic_filter in the same package. >> >> Nadav. >> >> >> > Thanks, that works ok! > > However if instead of uint8, uint32 values are used, the result only > contains zeros. > Or am I doing something wrong? > Folkert > > import numpy > import numpy.numarray.nd_image as NI > > B = numpy.zeros((5,5), dtype=numpy.uint8) > C = numpy.zeros((5,5), dtype=numpy.uint32) > > DC = 4278190280 > LC = 4278241280 > > B[:] = 0 > B[1,1] = 1 > B[2,2] = 1 > C[:] = DC > C[1,1] = LC > C[2,2] = LC > > ker01 = numpy.array([[0,1,0], \ > [1,1,1], \ > [0,1,0]]) > kerCC = numpy.array([[C[0,0],C[1,1],C[0,0]], \ > [C[1,1],C[1,1],C[1,1]], \ > [C[0,0],C[1,1],C[0,0]]]).astype(numpy.uint32) > > r1 = NI.convolve(B, ker01).astype(numpy.uint8) > r2 = (NI.convolve(B, ker01) == 1).astype(numpy.uint8) > r3 = NI.convolve(C, kerCC).astype(numpy.uint32) > r4 = (NI.convolve(C, kerCC) == C[0,0]).astype(numpy.uint32) > It should be: r5 = NI.convolve(C, ker01).astype(numpy.uint32) which results in: [[4211082216 4211133216 4211082216 4211082216 4211082216] [4211133216 4211133216 4211184216 4211082216 4211082216] [4211082216 4211184216 4211133216 4211133216 4211082216] [4211082216 4211082216 4211133216 4211082216 4211082216] [4211082216 4211082216 4211082216 4211082216 4211082216]] Now I have to find out how convolve works in order to understand why these values are generated. Are there some good examples / documentation as you know? 
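For what it's worth, the puzzling numbers in r5 can be reproduced with ordinary integer arithmetic. The sketch below is not part of the original exchange and assumes scipy.ndimage is available (its convolve behaves like the nd_image convolve used above); it uses only the values quoted in the example:

import numpy as np
from scipy.ndimage import convolve

DC = 4278190280                        # background value from the example above
C = np.full((5, 5), DC, dtype=np.uint32)
ker = np.array([[0, 1, 0],
                [1, 1, 1],
                [0, 1, 0]])

# The kernel adds the centre pixel and its four neighbours.  Five copies of DC
# do not fit in 32 bits, so a uint32 result wraps around modulo 2**32:
print(5 * DC)                                                # 21390951400
print(5 * DC % 2**32)                                        # 4211082216, the background value in r5
print(np.full(5, DC, dtype=np.uint32).sum(dtype=np.uint32))  # 4211082216 again

# Accumulating in a wider type gives the unwrapped neighbour sums:
print(convolve(C.astype(np.int64), ker)[2, 2])               # 21390951400

So the convolution is doing the expected neighbour sum; the surprise comes from doing the arithmetic in uint32.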
I found a EECE253_07_Convolution.pdf with lecture notes on convolution for image processing. Thanks, Folkert From tim.hochberg at ieee.org Mon May 5 12:59:46 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 5 May 2008 09:59:46 -0700 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? In-Reply-To: <481F0143.3030700@ar.media.kyoto-u.ac.jp> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> Message-ID: On Mon, May 5, 2008 at 5:44 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Hi, > > While working again on the fftpack module, to clean things up and > speed some backends (in particular fftw3, which is really sub-optimal > right now), I remembered how much unaligned data pointer in numpy arrays > hurt performances. So I would like to relaunch the discussion on aligned > allocators and default alignement for numpy arrays : > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg04005.html > > Basically, what I have in mind is, in a first step (for numpy 1.2): > - define functions to allocate on a given alignement > - make PyMemData_NEW 16 byte aligned by default (to be compatible > with SSE and co). > > The problem was, and still is, realloc. It is not possible to implement > realloc with malloc/free (in a portable way), and as such, it is not > possible to have an aligned realloc. > > In numpy, we can always replace realloc by malloc/free, because we know > the size of the old block: would deprecating PyMemData_RENEW and > replacing them by PyMemeData_NEW/PyMemData_FREE be possible, such as to > make all numpy arrays follow a default alignement ? There are only a few > of them in numpy (6 of them), 0 in scipy, and I guess extensions never > really used them ? I don't think you would want to do this in the core of PyArray_FromIter; presumably realloc can sometimes reuse the existing pointer and save on allocating a new chunk of memory. Since there are lots of allocations in fromiter, this could potentially be a big performance hit. (At least I think so, realloc has always been kinda voodoo to me). One could use PyMemData_NEW/PyMemData_FREE in the final allocation to make sure that the data is alligned, we allready do a realloc there to dump any extra space. Or, possibly better, one could choose which allocation strategy to use here depending on whether the data was alligned or not. > > > cheers, > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon May 5 13:11:12 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 May 2008 12:11:12 -0500 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? In-Reply-To: <481F0143.3030700@ar.media.kyoto-u.ac.jp> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> On Mon, May 5, 2008 at 7:44 AM, David Cournapeau wrote: > In numpy, we can always replace realloc by malloc/free, because we know > the size of the old block: would deprecating PyMemData_RENEW and > replacing them by PyMemeData_NEW/PyMemData_FREE be possible, such as to > make all numpy arrays follow a default alignement ? There are only a few > of them in numpy (6 of them), 0 in scipy, and I guess extensions never > really used them ? 
I am in favor of at least trying this out. We will have to have a set of benchmarks to make sure we haven't hurt the current uses of PyMemData_RENEW which Tim points out. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon May 5 13:17:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 May 2008 12:17:00 -0500 Subject: [Numpy-discussion] numpy in RHEL4 In-Reply-To: <288df32a0805050414m61b3442aq40b71babbddc77b8@mail.gmail.com> References: <288df32a0805050414m61b3442aq40b71babbddc77b8@mail.gmail.com> Message-ID: <3d375d730805051017o21e6944ct132cfaec39e18ed@mail.gmail.com> On Mon, May 5, 2008 at 6:14 AM, Bala subramanian wrote: > Dear friends, > > I am trying to install numpy version numpy-1.0.4 in RHEL 4. My python > version is 2.3.4. While installation, it throws me the following error and > stops. Kindly write me how to get rid of this. The files /usr/lib/liblapack.a and /usr/lib/liblapack.so are probably 32-bit instead of 64-bit. You can (probably) check this using the command $ file /usr/lib/liblapack.so Locate the 64-bit versions of liblapack and libblas and make sure that you have the correct directory in your site.cfg file. The numpy source tree contains a commented site.cfg.example file for you to start with. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rowen at cesmail.net Mon May 5 13:19:40 2008 From: rowen at cesmail.net (Russell E. Owen) Date: Mon, 05 May 2008 10:19:40 -0700 Subject: [Numpy-discussion] numpy masked array oddity Message-ID: The object returned by maskedArray.compressed() appears to be a normal numpy array (based on repr output), but in reality it has some surprising differences: import numpy a = numpy.arange(10, dtype=int) b = numpy.zeros(10) b[1] = 1 b[3] = 1 ma = numpy.core.ma.array(a, mask=b, dtype=float) print ma # [0.0 -- 2.0 -- 4.0 5.0 6.0 7.0 8.0 9.0] c = ma.compressed() print repr(c) # array([ 0. 2. 4. 5. 6. 7. 8. 9.]) c.sort() #Traceback (most recent call last): # File "", line 1, in # File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-pac kages/#numpy/core/ma.py", line 2132, in not_implemented # raise NotImplementedError, "not yet implemented for numpy.ma arrays" #NotImplementedError: not yet implemented for numpy.ma arrays d = numpy.array(c) d.sort() # this works fine, as expected Why is "c" in the example above not just a regular numpy array? It is not a "live" view (based on a quick test), which seems sensible to me. I've worked around the problem by making a copy (d in the example above), but it seems most unfortunate to have to copy the data twice. -- Russsell From nadavh at visionsense.com Mon May 5 13:15:35 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 5 May 2008 20:15:35 +0300 Subject: [Numpy-discussion] Learn about numpy References: <481D876C.1070001@inter.nl.net> <710F2847B0018641891D9A216027636029C132@ex3.envision.co.il> <481EE266.7000300@inter.nl.net> <481F3300.8060301@inter.nl.net> Message-ID: <710F2847B0018641891D9A216027636029C139@ex3.envision.co.il> I think you have a problem of overflow in r5: You may better use utin64 instead of uint32. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? 
Folkert Boonstra ????: ? 05-???-08 19:17 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] Learn about numpy Folkert Boonstra schreef: > Nadav Horesh schreef: > >> What you do here is a convolution with >> >> 0 1 0 >> 1 1 1 >> 0 1 0 >> >> kernel, and thresholding, you can use numpy.numarray.nd_image package: >> >> import numpy.numarray.nd_image as NI >> . >> . >> . >> ker = array([[0,1,0], [1,1,1],[0,1,0]]) >> result = (NI.convolve(self.bufbw, ker) == 1).astype(uint8) >> >> for nore general cases you can use the function generic_filter in the same package. >> >> Nadav. >> >> >> > Thanks, that works ok! > > However if instead of uint8, uint32 values are used, the result only > contains zeros. > Or am I doing something wrong? > Folkert > > import numpy > import numpy.numarray.nd_image as NI > > B = numpy.zeros((5,5), dtype=numpy.uint8) > C = numpy.zeros((5,5), dtype=numpy.uint32) > > DC = 4278190280 > LC = 4278241280 > > B[:] = 0 > B[1,1] = 1 > B[2,2] = 1 > C[:] = DC > C[1,1] = LC > C[2,2] = LC > > ker01 = numpy.array([[0,1,0], \ > [1,1,1], \ > [0,1,0]]) > kerCC = numpy.array([[C[0,0],C[1,1],C[0,0]], \ > [C[1,1],C[1,1],C[1,1]], \ > [C[0,0],C[1,1],C[0,0]]]).astype(numpy.uint32) > > r1 = NI.convolve(B, ker01).astype(numpy.uint8) > r2 = (NI.convolve(B, ker01) == 1).astype(numpy.uint8) > r3 = NI.convolve(C, kerCC).astype(numpy.uint32) > r4 = (NI.convolve(C, kerCC) == C[0,0]).astype(numpy.uint32) > It should be: r5 = NI.convolve(C, ker01).astype(numpy.uint32) which results in: [[4211082216 4211133216 4211082216 4211082216 4211082216] [4211133216 4211133216 4211184216 4211082216 4211082216] [4211082216 4211184216 4211133216 4211133216 4211082216] [4211082216 4211082216 4211133216 4211082216 4211082216] [4211082216 4211082216 4211082216 4211082216 4211082216]] Now I have to find out how convolve works in order to understand why these values are generated. Are there some good examples / documentation as you know? I found a EECE253_07_Convolution.pdf with lecture notes on convolution for image processing. Thanks, Folkert _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3885 bytes Desc: not available URL: From robert.kern at gmail.com Mon May 5 13:33:20 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 May 2008 12:33:20 -0500 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: References: Message-ID: <3d375d730805051033w1c3e084eu5d2e3f300e897693@mail.gmail.com> On Mon, May 5, 2008 at 12:19 PM, Russell E. Owen wrote: > The object returned by maskedArray.compressed() appears to be a normal > numpy array (based on repr output), but in reality it has some > surprising differences: > > import numpy > a = numpy.arange(10, dtype=int) > b = numpy.zeros(10) > b[1] = 1 > b[3] = 1 > ma = numpy.core.ma.array(a, mask=b, dtype=float) > print ma > # [0.0 -- 2.0 -- 4.0 5.0 6.0 7.0 8.0 9.0] > c = ma.compressed() > print repr(c) > # array([ 0. 2. 4. 5. 6. 7. 8. 
9.]) > c.sort() > #Traceback (most recent call last): > # File "", line 1, in > # File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-pac > kages/#numpy/core/ma.py", line 2132, in not_implemented > # raise NotImplementedError, "not yet implemented for numpy.ma arrays" > #NotImplementedError: not yet implemented for numpy.ma arrays > d = numpy.array(c) > d.sort() > # this works fine, as expected > > Why is "c" in the example above not just a regular numpy array? It is > not a "live" view (based on a quick test), which seems sensible to me. > I've worked around the problem by making a copy (d in the example > above), but it seems most unfortunate to have to copy the data twice. I don't know the reason why it's not an ndarray, but you don't have to copy the data again to get one: c = ma.compressed().view(numpy.ndarray) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Mon May 5 13:46:55 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 05 May 2008 10:46:55 -0700 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: <3d375d730805051033w1c3e084eu5d2e3f300e897693@mail.gmail.com> References: <3d375d730805051033w1c3e084eu5d2e3f300e897693@mail.gmail.com> Message-ID: <481F480F.8060801@noaa.gov> Robert Kern wrote: > I don't know the reason why it's not an ndarray, but you don't have to > copy the data again to get one: > > c = ma.compressed().view(numpy.ndarray) would: c - numpy.asarray(ma.compressed()) work too? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Mon May 5 14:15:43 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 5 May 2008 14:15:43 -0400 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: References: Message-ID: <200805051415.44363.pgmdevlist@gmail.com> On Monday 05 May 2008 13:19:40 Russell E. Owen wrote: > The object returned by maskedArray.compressed() appears to be a normal > numpy array (based on repr output), but in reality it has some > surprising differences: Russell: * I assume you're not using the latest version of numpy, are you ? If you were, the .sort() method would work. * Currently, the output of MaskedArray.compressed() is indeed a MaskedArray, where the missing values are skipped. If you need a regular ndarray, just a view as Robert suggested. Christopher's suggestion is equivalent. * An alternative would be to force the output of MaskedArray.compressed() to type(MaskedArray._baseclass), where the _baseclass attribute is the class of the underlying array: usually it's only ndarray, but it can be any subclass. Changing this behavior would not break anything in TimeSeries. * I need to fix a bug in compressed when the underlying array is a matrix: I can take care of the alternative at the same time. What are the opinions on that matter ? From cournape at gmail.com Mon May 5 14:25:48 2008 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 May 2008 03:25:48 +0900 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? 
In-Reply-To: References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220805051125k6eb70d6aw3bc22598db5dd239@mail.gmail.com> On Tue, May 6, 2008 at 1:59 AM, Timothy Hochberg wrote: > I don't think you would want to do this in the core of PyArray_FromIter; > presumably realloc can sometimes reuse the existing pointer and save on > allocating a new chunk of memory. Since there are lots of allocations in > fromiter, this could potentially be a big performance hit. Looking at PyArray_FromIter, there is already a log(n) behavior for allocation, so I am not sure it would hurt so much: I would guess that realloc often need to allocate a new block if the size is the double of the former pointer. That's how realloc seems to work on FreeBSD, at least: http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/malloc.c?rev=1.171;content-type=text%2Fplain > (At least I think > so, realloc has always been kinda voodoo to me). Once malloc is implemented, I would guess realloc to be simple, no ? For each block of memory in the malloc chain, just check whereas there is enough size to reuse the same block and adjacent block if any, otherwise, free + malloc + copy, non ? One could use > PyMemData_NEW/PyMemData_FREE in the final allocation to make sure that the > data is alligned, we allready do a realloc there to dump any extra space. I guess that would be the easiest: directly use realloc in the loop, and use PyDataMem_NEW/FREE at the end if necessary. Using realloc is no problem if the buffer is always freed in the same function. > Or, possibly better, one could choose which allocation strategy to use here > depending on whether the data was alligned or not. I mentioned step 1, because there would be a step 2 (but maybe only numpy 2): extending the C api functions to create arrays such as asking for an explicit alignment is possible. David From peridot.faceted at gmail.com Mon May 5 14:25:55 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 5 May 2008 14:25:55 -0400 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? In-Reply-To: <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> Message-ID: 2008/5/5 Robert Kern : > On Mon, May 5, 2008 at 7:44 AM, David Cournapeau > wrote: > > > In numpy, we can always replace realloc by malloc/free, because we know > > the size of the old block: would deprecating PyMemData_RENEW and > > replacing them by PyMemeData_NEW/PyMemData_FREE be possible, such as to > > make all numpy arrays follow a default alignement ? There are only a few > > of them in numpy (6 of them), 0 in scipy, and I guess extensions never > > really used them ? > > I am in favor of at least trying this out. We will have to have a set > of benchmarks to make sure we haven't hurt the current uses of > PyMemData_RENEW which Tim points out. realloc() may not be a performance win anyway. Allocating new memory is quite fast, copying data is quite fast, and in-place realloc() tends to cause memory fragmentation - take an arena full of 1024-byte blocks and suddenly make one of them 256 bytes long and you haven't gained anything at all in most malloc() implementations. So yes, benchmarks. Anne From cournape at gmail.com Mon May 5 14:30:11 2008 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 May 2008 03:30:11 +0900 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? 
In-Reply-To: <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> Message-ID: <5b8d13220805051130l7740a21ahb5fd540ff0ff4719@mail.gmail.com> On Tue, May 6, 2008 at 2:11 AM, Robert Kern wrote: > > I am in favor of at least trying this out. We will have to have a set > of benchmarks to make sure we haven't hurt the current uses of > PyMemData_RENEW which Tim points out. What would be a good stress test for PyArray_FromIter ? I think using array iterator was the main hotspot for reading large arff files in scipy.io.arff, would that be enough ? cheers, David From robert.kern at gmail.com Mon May 5 14:58:22 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 May 2008 13:58:22 -0500 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? In-Reply-To: <5b8d13220805051130l7740a21ahb5fd540ff0ff4719@mail.gmail.com> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> <5b8d13220805051130l7740a21ahb5fd540ff0ff4719@mail.gmail.com> Message-ID: <3d375d730805051158v4c6262e3me953a7c5a401d5ba@mail.gmail.com> On Mon, May 5, 2008 at 1:30 PM, David Cournapeau wrote: > On Tue, May 6, 2008 at 2:11 AM, Robert Kern wrote: > > > > I am in favor of at least trying this out. We will have to have a set > > of benchmarks to make sure we haven't hurt the current uses of > > PyMemData_RENEW which Tim points out. > > What would be a good stress test for PyArray_FromIter ? I think using > array iterator was the main hotspot for reading large arff files in > scipy.io.arff, would that be enough ? Since there are only 6 places where PyMemData_RENEW is used, all 6 uses should be benchmarked. I would prefer a more targeted benchmark so we know exactly what we are measuring. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Mon May 5 15:10:56 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 05 May 2008 09:10:56 -1000 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: <200805051415.44363.pgmdevlist@gmail.com> References: <200805051415.44363.pgmdevlist@gmail.com> Message-ID: <481F5BC0.5020705@hawaii.edu> Pierre GM wrote: > On Monday 05 May 2008 13:19:40 Russell E. Owen wrote: >> The object returned by maskedArray.compressed() appears to be a normal >> numpy array (based on repr output), but in reality it has some >> surprising differences: > > Russell: > > * I assume you're not using the latest version of numpy, are you ? If you > were, the .sort() method would work. He is clearly using the older version; it is accessed via numpy.core.ma. > > * Currently, the output of MaskedArray.compressed() is indeed a MaskedArray, > where the missing values are skipped. If you need a regular ndarray, just a > view as Robert suggested. Christopher's suggestion is equivalent. > > * An alternative would be to force the output of MaskedArray.compressed() to > type(MaskedArray._baseclass), where the _baseclass attribute is the class of > the underlying array: usually it's only ndarray, but it can be any subclass. > Changing this behavior would not break anything in TimeSeries. This alternative makes sense to me--I expect most use cases would be most efficient with compressed yielding a plain ndarray. 
I don't see any gain in keeping it as a masked array, and having to manually convert it. (I don't see how the _baseclass conversion would work with the baseclass as a matrix, though.) Eric > > * I need to fix a bug in compressed when the underlying array is a matrix: I > can take care of the alternative at the same time. What are the opinions on > that matter ? > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Mon May 5 15:23:03 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 5 May 2008 15:23:03 -0400 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: <481F5BC0.5020705@hawaii.edu> References: <200805051415.44363.pgmdevlist@gmail.com> <481F5BC0.5020705@hawaii.edu> Message-ID: <200805051523.04224.pgmdevlist@gmail.com> On Monday 05 May 2008 15:10:56 Eric Firing wrote: > Pierre GM wrote: > > * An alternative would be to force the output of MaskedArray.compressed() > > to type(MaskedArray._baseclass), where the _baseclass attribute is the > > class of the underlying array: usually it's only ndarray, but it can be > > any subclass. Changing this behavior would not break anything in > > TimeSeries. > > This alternative makes sense to me--I expect most use cases would be > most efficient with compressed yielding a plain ndarray. I don't see > any gain in keeping it as a masked array, and having to manually convert > it. (I don't see how the _baseclass conversion would work with the > baseclass as a matrix, though.) In fact, it's straightforward: - ravel the _data part to get a type(_baseclass) object - use .compress on the _data part, using logical_not(mask) as the condition. When you have a matrix as _baseclass, the result will be a ravelled version of the initial matrix. But yes, it makes indeed more sense not to have a MaskedArray in output. SVN5126 should now work as discussed. > Eric > > > * I need to fix a bug in compressed when the underlying array is a > > matrix: I can take care of the alternative at the same time. What are the > > opinions on that matter ? > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From efiring at hawaii.edu Mon May 5 15:35:35 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 05 May 2008 09:35:35 -1000 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: <200805051523.04224.pgmdevlist@gmail.com> References: <200805051415.44363.pgmdevlist@gmail.com> <481F5BC0.5020705@hawaii.edu> <200805051523.04224.pgmdevlist@gmail.com> Message-ID: <481F6187.10401@hawaii.edu> Pierre GM wrote: > On Monday 05 May 2008 15:10:56 Eric Firing wrote: >> Pierre GM wrote: >>> * An alternative would be to force the output of MaskedArray.compressed() >>> to type(MaskedArray._baseclass), where the _baseclass attribute is the >>> class of the underlying array: usually it's only ndarray, but it can be >>> any subclass. Changing this behavior would not break anything in >>> TimeSeries. >> This alternative makes sense to me--I expect most use cases would be >> most efficient with compressed yielding a plain ndarray. 
I don't see >> any gain in keeping it as a masked array, and having to manually convert >> it. (I don't see how the _baseclass conversion would work with the >> baseclass as a matrix, though.) > > In fact, it's straightforward: > - ravel the _data part to get a type(_baseclass) object > - use .compress on the _data part, using logical_not(mask) as the condition. > When you have a matrix as _baseclass, the result will be a ravelled version of > the initial matrix. What I meant was that I don't see that such a ravelled version of a matrix would be likely to make sense in a linear algebra context, so leaving it as a matrix is likely to cause confusion rather than convenience. Still, it would be consistent, so I am not objecting to it. Eric From kwgoodman at gmail.com Mon May 5 15:56:29 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 5 May 2008 12:56:29 -0700 Subject: [Numpy-discussion] Debian Lenny switches to Python 2.5 default Message-ID: I'm the click of a botton away from changing the python default on my Debian Lenny system from 2.4 to 2.5. Has anyone experienced any numpy issues after the switch? From amcmorl at gmail.com Mon May 5 16:13:45 2008 From: amcmorl at gmail.com (Angus McMorland) Date: Mon, 5 May 2008 16:13:45 -0400 Subject: [Numpy-discussion] Debian Lenny switches to Python 2.5 default In-Reply-To: References: Message-ID: 2008/5/5 Keith Goodman : > I'm the click of a botton away from changing the python default on my > Debian Lenny system from 2.4 to 2.5. Has anyone experienced any numpy > issues after the switch? All normal here so far, with most of a day's use. All numpy tests pass, and I get four failures in scipy (which also upgraded to 0.6.0-11 this morning): 2 in lapack/tests/esv_tests, and 2 umfpack-related. A. -- AJC McMorland, PhD candidate Physiology, University of Auckland (Nearly) post-doctoral research fellow Neurobiology, University of Pittsburgh From pgmdevlist at gmail.com Mon May 5 15:44:26 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 5 May 2008 15:44:26 -0400 Subject: [Numpy-discussion] numpy masked array oddity In-Reply-To: <481F6187.10401@hawaii.edu> References: <200805051523.04224.pgmdevlist@gmail.com> <481F6187.10401@hawaii.edu> Message-ID: <200805051544.26689.pgmdevlist@gmail.com> On Monday 05 May 2008 15:35:35 Eric Firing wrote: > What I meant was that I don't see that such a ravelled version of a > matrix would be likely to make sense in a linear algebra context, so > leaving it as a matrix is likely to cause confusion rather than > convenience. Still, it would be consistent, so I am not objecting to it. I understand and concur: ravelling a matrix always brings surprises. On a side note, .compressed() isn't the method recommended to get rid of missing values in a 2D array: there are the compress_rows and compress_cols functions for that. In any case, I doubt that regular matrix users combine their matrices with missing data... From robert.kern at gmail.com Mon May 5 16:02:30 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 May 2008 15:02:30 -0500 Subject: [Numpy-discussion] Debian Lenny switches to Python 2.5 default In-Reply-To: References: Message-ID: <3d375d730805051302m7bfbfe95t610cb90677a90044@mail.gmail.com> On Mon, May 5, 2008 at 2:56 PM, Keith Goodman wrote: > I'm the click of a botton away from changing the python default on my > Debian Lenny system from 2.4 to 2.5. Has anyone experienced any numpy > issues after the switch? 2.4 -> 2.5 in general shouldn't be a problem. 
If you are on a 64-bit system, you will finally be able to mmap very large files. I don't (and wouldn't) know of any Lenny-specific issues, though. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From thrabe at burnham.org Mon May 5 19:33:39 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Mon, 5 May 2008 16:33:39 -0700 Subject: [Numpy-discussion] Compilation problems - bizzare Message-ID: Hi all, currently, I am writing a box of modular functions for exchanging python & matlab objects (nd arrays in particular). I am facing an odd problem which I can not explain to myself: I use PyArg_ParseTuple(args, "O!s",&PyArray_Type, &array,&na) for parsing the array and a string. This function call works perfectly well when called from a static function used for extending python. However, using the call above in another function encapsulating the call above yields a segmentation fault -> python crash -> anger , irritation ... : ) Same with PyArray_FromDimsAndData(dimensions,size,(const char)p.first,(char*)value) Did somebody ever encounter this? By the way, I get a compilation warning /home/global/python32/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:944: warning: 'int _import_array()' defined but not used Thank you in advance for your help, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon May 5 19:38:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 May 2008 18:38:25 -0500 Subject: [Numpy-discussion] Compilation problems - bizzare In-Reply-To: References: Message-ID: <3d375d730805051638p227b53c9j157025bce182f434@mail.gmail.com> On Mon, May 5, 2008 at 6:33 PM, Thomas Hrabe wrote: > > Hi all, > > currently, I am writing a box of modular functions for exchanging python & > matlab objects (nd arrays in particular). > I am facing an odd problem which I can not explain to myself: > > I use > PyArg_ParseTuple(args, "O!s",&PyArray_Type, &array,&na) > for parsing the array and a string. > This function call works perfectly well when called from a static function > used for extending python. > However, using the call above in another function encapsulating the call > above yields a segmentation fault -> python crash -> anger , irritation ... > : ) > Same with > PyArray_FromDimsAndData(dimensions,size,(const char)p.first,(char*)value) > > Did somebody ever encounter this? I haven't. Can you run this under a debugger to give us a backtrace at the site of the crash? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon May 5 19:50:39 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 5 May 2008 17:50:39 -0600 Subject: [Numpy-discussion] Compilation problems - bizzare In-Reply-To: References: Message-ID: On Mon, May 5, 2008 at 5:33 PM, Thomas Hrabe wrote: > Hi all, > > currently, I am writing a box of modular functions for exchanging python & > matlab objects (nd arrays in particular). > I am facing an odd problem which I can not explain to myself: > > I use > PyArg_ParseTuple(args, "O!s",&PyArray_Type, &array,&na) > for parsing the array and a string. 
> This function call works perfectly well when called from a static function > used for extending python. > However, using the call above in another function encapsulating the call > above yields a segmentation fault -> python crash -> anger , irritation ... > : ) > Same with > PyArray_FromDimsAndData(dimensions,size,(const char)p.first,(char*)value) > > Did somebody ever encounter this? > > By the way, I get a compilation warning > /home/global/python32/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:944: > warning: 'int _import_array()' defined but not used > > Thank you in advance for your help, > Thomas > Could you attach a code snippet that reproduces the problem? What version of numpy are you using? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From vel.accel at gmail.com Mon May 5 20:19:10 2008 From: vel.accel at gmail.com (dieter h) Date: Mon, 5 May 2008 20:19:10 -0400 Subject: [Numpy-discussion] MoinMoin <-> docstrings gateway In-Reply-To: <20080505054800.GC8593@phare.normalesup.org> References: <1209946640.13399.6.camel@localhost.localdomain> <20080505054800.GC8593@phare.normalesup.org> Message-ID: <1e52e0880805051719l6892cff3ta766dbdc8a4ed4d0@mail.gmail.com> On Mon, May 5, 2008 at 1:48 AM, Gael Varoquaux wrote: > On Mon, May 05, 2008 at 03:17:20AM +0300, Pauli Virtanen wrote: > > Some time ago there was discussion about MoinMoin <-> docstrings > > gateway. Did it produce some results? > > My girlfriend, Emmanuelle, (Cced, I am not sure she follows this mailing > list) has been working on this, with some progress. > > > > Anyway, I threw a bit of code together. There's something between a > > proof-of-concept and final product running now on my desktop machine. > > You can play with it here: > > > http://pvx.homeip.net/pTSc0V/TestWiki > > Sweet. Some comments: > > * A lot of the docstrings are not valid rst. This is not your fault, but > we will had to fix this in the long run. I humbly suggest Sphinx[1] as the generating tool and markup rather than just regular ReST. I'm in the process of converting all of my local documentation to this engine's format. As it's maturing, I find Sphinx to be on a path of being an ideal referencing tool (considering my desire for dialog and effort, towards standardizing documentation in the floss community. Improving pinpoint referencing is one of future focus for me). For example, NumpyBook is much more accessible to me now than it was in .pdf. I'll be sending Travis a copy when I'm finished finalizing the formating, after the just now automated translation. When he's ready to release numpybook to the public domain, he may consider it useful. Btw, I think an excellent model of documentation topology/formatting is that for the Qt toolkit (online). Take a look if your not familiar with it. I might add their docs, options of cross-referenced api code as well as internals for code junkies. Are there others out there as focussed as I am on the 'science' of useful documentation. [1] http://sphinx.pocoo.org/index.html > > * I would prefer if the main page was broken up into one per function. I > know this is the way it is in the actual wiki layout, but I think it > would be better if it was presented this way to the user. Anyway, this > is debatable. > > * Emmanuelle has functions to read from the wiki and write to it from a > remote client. I am not sure how well they work, but it would be nice > not to require a login and rights on the server to generate patches. 
> > > > Is there interest to move forward with this? > > There is. With Emmanuelle and Stefan van der Waalt, who has also been > following the project, we were considering using a webapp running with > turbogears to move forward. They would know better what the status is. > > Congratulations for that. Let us hope you can join forces with the other > team working on that to bring this project to its success. > > Cheers, > > Ga?l > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From eads at soe.ucsc.edu Sun May 4 17:49:41 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sun, 04 May 2008 14:49:41 -0700 Subject: [Numpy-discussion] Faster In-Reply-To: References: <3d375d730805021805g23b21e5enb79a1f8d70f85f13@mail.gmail.com> <3d375d730805021925w65348471u67f170b7f8e3e032@mail.gmail.com> Message-ID: <481E2F75.80205@soe.ucsc.edu> Hi, Looks like a fun discussion: it's too bad for me I did not join it earlier. My first try at scipy-cluster was completely in Python. Like you, I also tried to find the most efficient way to transform the distance matrix when joining two clusters. Eventually my data sets became big enough that I decided to write these parts in C. I don't think my Python joining code was efficient as yours. I tried out your first test and I am a little confused at the output. In [107]: goodman.test() 1 Clusters [['BA'], ['FI'], ['MI', 'TO'], ['NA'], ['RM']] Distance [[997 662 255 412 996] [662 997 468 268 400] [255 468 997 219 869] [412 268 219 997 669] [996 400 869 669 997]] 2 Clusters [['BA'], ['FI'], ['MI', 'TO', 'NA'], ['RM']] Distance [[998 662 412 996] [662 998 268 400] [412 268 998 669] [996 400 669 998]] 3 Clusters [['BA'], ['FI', 'MI', 'TO', 'NA'], ['RM']] Distance [[999 412 996] [412 999 669] [996 669 999]] 4 Clusters [['BA', 'FI', 'MI', 'TO', 'NA'], ['RM']] Distance [[1000 669] [ 669 1000]] 5 Clusters [['BA', 'FI', 'MI', 'TO', 'NA', 'RM']] Distance [[1001]] The first step is right, singletons 2 and 5 (starting at 0) should be joined since they have a minimum distance of 138. Let's look at their corresponding rows in the distance matrix. In [101]: DM[[2,5],:] Out[101]: array([[ 877., 295., 10000., 754., 564., 138.], [ 996., 400., 138., 869., 669., 10000.]]) These two rows, rows 2 and 5, are all that we need to form the row for the newly joined cluster in the distance matrix. If we just take the minimum for each column we obtain, In [102]: q=DM[[2,5],:].min(axis=0) Out[102]: array([ 877., 295., 138., 754., 564., 138.]) so the row for the cluster should be the row above with the 2 and 5'th row removed. Roughly, there should be a row in the distance matrix with the following values but I don't see one in your output. In [103]: q[q != 138] Out[103]: array([ 877., 295., 754., 564.]) Since 295 is the minimum distance between this newly joined cluster and any other singleton, it should not be chosen for the second iteration since singletons 3 and 4 are closer to another with a distance of 219. So after iteration 2, you should get [['BA'], ['FI'], ['MI', 'TO'], ['NA', 'RM']]. Recall that the distance matrix transformation forms a new distance matrix using only values from the previous distance matrix. So, at any iteration, the values in the distance matrix should be a subset of the values in the original distance matrix, eliminating the distance entries of the clusters formed. 
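A minimal NumPy sketch of the single-linkage update step described above. The helper below is an assumption for illustration only (it is not part of scipy-cluster, whose C code works on a condensed upper-triangular matrix); it assumes a full symmetric distance matrix whose diagonal holds a large sentinel value, as in the quoted test code.

import numpy as np

def single_linkage_step(D, labels):
    # One merge of naive single-linkage clustering.
    # D: full symmetric distance matrix with a large sentinel on the diagonal.
    n = D.shape[0]
    # Closest pair (i, j); the global minimum is off-diagonal thanks to the sentinel.
    i, j = divmod(D.argmin(), n)
    if i > j:
        i, j = j, i
    # Single-linkage update: the merged cluster's row is the element-wise
    # minimum of the two joined rows.
    merged = np.minimum(D[i], D[j])
    keep = [k for k in range(n) if k not in (i, j)]
    newD = np.empty((len(keep) + 1, len(keep) + 1))
    newD[:-1, :-1] = D[np.ix_(keep, keep)]   # distances among the untouched clusters
    newD[:-1, -1] = merged[keep]             # distances to the new merged cluster
    newD[-1, :-1] = merged[keep]
    newD[-1, -1] = D.max()                   # keep a large sentinel on the diagonal
    new_labels = [labels[k] for k in keep] + [labels[i] + labels[j]]
    return newD, new_labels

Repeating this step n-1 times on the quoted 6x6 matrix should reproduce the merge distances 138, 219, 255, 268, 295 shown in the linkage matrix that follows.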
If we look at the minimum distances in the original distance matrix in rank order, we have 138, 219, 255, 268, 295. Thus, we might expect the minimum distances found at each iteration to be these values, and they are in this case, but I don't have a mathematical proof that it works in general. If I run your distance matrix through hcluster.single, I get the following linkage matrix. The third column is the distance between the clusters joined, and the first two columns are the indices of the clusters joined (non-singletons have an index >= n). array([[ 2., 5., 138., 2.], [ 3., 4., 219., 2.], [ 0., 7., 255., 3.], [ 1., 8., 268., 4.], [ 6., 9., 295., 6.]]) I've attached the dendrogram, since it is easier to interpret. In [105]: lbls Out[105]: ['BA', 'FI', 'MI', 'NA', 'RM', 'TO'] In [106]: hcluster.dendrogram(Z, labels=lbls) I tried running your second test, and you'll see C might give you a better performance speed-up (not surprising). Roughly, what I'm doing in C is I'm only storing the upper triangular of the distance matrix. An array of double*'s (double **) refers to each row of this triangle. To eliminate a row, I simply remove the entry in the double ** array. To remove a column, I shift the values over in each non-removed row. I'm not sure if this is the best approach but it is certainly more efficient than what can be achieved in Python. In [107]: hcluster.goodman.test2(1000) n = 1000 took 22.10 seconds In [108]: n=1000 In [109]: uu=numpy.random.rand(n*(n-1)/2) In [110]: tic = time.time(); hcluster.single(uu); toc = time.time(); print toc-tic Out[110]: 4.57607889175 Damian Keith Goodman wrote: > On Fri, May 2, 2008 at 7:25 PM, Robert Kern wrote: >> Assuming x is contiguous and you can modify x in-place: >> >> >> In [1]: from numpy import * >> >> In [2]: def dist(x): >> ...: x = x + 1e10 * eye(x.shape[0]) >> ...: i, j = where(x == x.min()) >> >> ...: return i[0], j[0] >> ...: >> >> In [3]: def symmdist(N): >> ...: x = random.rand(N, N) >> ...: x = x + x.T >> ...: x.flat[::N+1] = 0 >> ...: return x >> ...: >> >> In [4]: symmdist(5) >> Out[4]: >> array([[ 0. , 0.87508654, 1.11691704, 0.80366071, 0.57966808], >> [ 0.87508654, 0. , 1.5521685 , 1.74010886, 0.52156877], >> [ 1.11691704, 1.5521685 , 0. , 1.22725396, 1.04101992], >> [ 0.80366071, 1.74010886, 1.22725396, 0. , 1.94577965], >> [ 0.57966808, 0.52156877, 1.04101992, 1.94577965, 0. ]]) >> >> In [5]: def kerndist(x): >> ...: N = x.shape[0] >> ...: x.flat[::N+1] = x.max() >> ...: ij = argmin(x.flat) >> ...: i, j = divmod(ij, N) >> ...: return i, j >> ...: >> >> In [10]: x = symmdist(500) >> >> In [15]: %timeit dist(x) >> 10 loops, best of 3: 19.9 ms per loop >> >> In [16]: %timeit kerndist(x) >> 100 loops, best of 3: 4.38 ms per loop > > I added > > i, j = divmod(x.argmin(), x.shape[0]) > > to > > http://scipy.org/PerformanceTips -------------- next part -------------- A non-text attachment was scrubbed... Name: example.png Type: image/png Size: 11623 bytes Desc: not available URL: From david at ar.media.kyoto-u.ac.jp Tue May 6 03:18:31 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 06 May 2008 16:18:31 +0900 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? 
In-Reply-To: <3d375d730805051158v4c6262e3me953a7c5a401d5ba@mail.gmail.com> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> <3d375d730805051011r346f73a0kf469a9748d905442@mail.gmail.com> <5b8d13220805051130l7740a21ahb5fd540ff0ff4719@mail.gmail.com> <3d375d730805051158v4c6262e3me953a7c5a401d5ba@mail.gmail.com> Message-ID: <48200647.4010606@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > Since there are only 6 places where PyMemData_RENEW is used, all 6 > uses should be benchmarked. I would prefer a more targeted benchmark > so we know exactly what we are measuring. > Ok, I started a new branch for aligned allocator: http://projects.scipy.org/scipy/numpy/browser/branches/aligned_alloca For now, I've only commited: - a new bunch of tests for all functions using PyDataMem_RENEW (in numpy/core/tests/test_renew.py) - If NOUSE_PYDATAMEM_RENEW is defined, PyDataMEM_RENEW is not used: instead, SYS_REALLOC is used in loops (this is defined to realloc), and when outside a loop, _fake_realloc is used, which does not use realloc, but use PyDataMem_NEW/FREE. I got a small slowdown, but I am not sure it is really relevant: the figures change almost as much between several runs as between runs of different implementations. I am not convinced about the quality of my tests either. cheers, David From millman at berkeley.edu Tue May 6 04:40:15 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 6 May 2008 01:40:15 -0700 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 Message-ID: Hey, The trunk is in pretty good shape and it is about time that I put out an official release. So tomorrow (in a little over twelve hours) I am going to create a 1.1.x branch and the trunk will be officially open for 1.2 development. If there are no major issues that show up at the last minute, I will tag 1.1.0 twenty-four hours after I branch. As soon as I tag the release I will ask the David and Chris to create the official Windows and Mac binaries. If nothing goes awry, you can expect the official release announcement by Monday, May 12th. In order to help me with the final touches, would everyone look over the release notes one last time: http://projects.scipy.org/scipy/numpy/milestone/1.1.0 Please let me know if there are any important omissions or errors ASAP. Also, there are four open tickets that I would like everyone to take a brief look at: http://projects.scipy.org/scipy/numpy/ticket/551 http://projects.scipy.org/scipy/numpy/ticket/605 http://projects.scipy.org/scipy/numpy/ticket/748 http://projects.scipy.org/scipy/numpy/ticket/760 I am not going to hold up the release any longer, but it would be nice if a few (or all) of these remaining tickets could be closed. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From peridot.faceted at gmail.com Tue May 6 05:19:52 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 6 May 2008 05:19:52 -0400 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? In-Reply-To: <481F0143.3030700@ar.media.kyoto-u.ac.jp> References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> Message-ID: 2008/5/5 David Cournapeau : > Basically, what I have in mind is, in a first step (for numpy 1.2): > - define functions to allocate on a given alignement > - make PyMemData_NEW 16 byte aligned by default (to be compatible > with SSE and co). > > The problem was, and still is, realloc. 
It is not possible to implement > realloc with malloc/free (in a portable way), and as such, it is not > possible to have an aligned realloc. How much does this matter? I mean, what if you simply left all the reallocs as-is? The arrays that resulted from reallocs would not typically be aligned, but we cannot in any case expect all arrays to be aligned. Or, actually, depending on how you manage the alignment, it may be possible to write a portable aligned realloc. As I understand it, a portable aligned malloc allocates extra memory, then adds something to the pointer to make the memory arena aligned. free() must then recover the original pointer and call free() on it - though this can presumably be avoided if garbage collection is smart enough. realloc() also needs to recover the pointer to call the underlying realloc(). One answer is to ensure that at least one byte of padding is always done at the beginning, so that the number of pad bytes used on an aligned pointer p is accessible as ((uint8*)p)[-1]. Other schemes are available; I have a peculiar affection for the pure-python solution of constructing a view. In any of them, it seems to me, if free() can recover the original pointer, so can realloc()... I don't think any numpy function can assume that its input arrays are aligned, since users may like to pass, say, mmap()ed hunks of a file, over which numpy has no control of the alignment. In this case, then, how much of a problem is it if a few stray realloc()s produce non-aligned arrays? Anne From bala.biophysics at gmail.com Tue May 6 05:21:03 2008 From: bala.biophysics at gmail.com (Bala subramanian) Date: Tue, 6 May 2008 14:51:03 +0530 Subject: [Numpy-discussion] numpy in RHEL4 In-Reply-To: <3d375d730805051017o21e6944ct132cfaec39e18ed@mail.gmail.com> References: <288df32a0805050414m61b3442aq40b71babbddc77b8@mail.gmail.com> <3d375d730805051017o21e6944ct132cfaec39e18ed@mail.gmail.com> Message-ID: <288df32a0805060221y7f4aef6fjc9efd3200178e81c@mail.gmail.com> Dear Robert, Thank you. But i am trying to install it in a 32-bit machine only. In that case, why dose it require 64 bit libraries. Bala On Mon, May 5, 2008 at 10:47 PM, Robert Kern wrote: > On Mon, May 5, 2008 at 6:14 AM, Bala subramanian > wrote: > > Dear friends, > > > > I am trying to install numpy version numpy-1.0.4 in RHEL 4. My python > > version is 2.3.4. While installation, it throws me the following error > and > > stops. Kindly write me how to get rid of this. > > The files /usr/lib/liblapack.a and /usr/lib/liblapack.so are probably > 32-bit instead of 64-bit. You can (probably) check this using the > command > > $ file /usr/lib/liblapack.so > > Locate the 64-bit versions of liblapack and libblas and make sure that > you have the correct directory in your site.cfg file. The numpy source > tree contains a commented site.cfg.example file for you to start with. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vincent.noel at gmail.com Tue May 6 05:33:35 2008 From: vincent.noel at gmail.com (Vincent Noel) Date: Tue, 6 May 2008 11:33:35 +0200 Subject: [Numpy-discussion] cannot edit page on scipy-dev wiki Message-ID: Hello all, I wanted to fix the formatting problems on the wiki page http://scipy.org/scipy/numpy/wiki/MaskedArrayApiChanges, so I followed the instructions on http://scipy.org/scipy/numpy/wiki (which state "In order to edit wiki pages or create and edit tickets, you need to register first.") But even after getting a registered login, I still can't edit the wiki pages. Do you need to request some extra authorization? Thanks a lot, Cheers Vincent Noel From F.Boonstra at inter.nl.net Tue May 6 05:36:15 2008 From: F.Boonstra at inter.nl.net (Folkert Boonstra) Date: Tue, 06 May 2008 11:36:15 +0200 Subject: [Numpy-discussion] Learn about numpy In-Reply-To: <710F2847B0018641891D9A216027636029C139@ex3.envision.co.il> References: <481D876C.1070001@inter.nl.net> <710F2847B0018641891D9A216027636029C132@ex3.envision.co.il> <481EE266.7000300@inter.nl.net> <481F3300.8060301@inter.nl.net> <710F2847B0018641891D9A216027636029C139@ex3.envision.co.il> Message-ID: <4820268F.20701@inter.nl.net> Nadav Horesh schreef: > I think you have a problem of overflow in r5: You may better use utin64 instead of uint32. > > Nadav. > > Nadav, My problems were due to trying to do two things at once. The code below does what I want and it is very fast. I see the power of numpy now: import numpy as NU import numpy.numarray.nd_image as NI BW = NU.zeros((5,5), dtype=NU.uint8) AR = NU.zeros((5,5), dtype=NU.uint32) DA = NU.zeros((5,5), dtype=NU.uint32) LA = NU.zeros((5,5), dtype=NU.uint32) DC = 4278190280 LC = 4278241280 BW[1,1] = 1 BW[2,2] = 1 AR[:] = DC AR[1,1] = LC AR[2,2] = LC DA[:] = DC LA[:] = LC kernel = NU.array([[0,1,0], \ [1,1,1], \ [0,1,0]]).astype(NU.uint8) print "AR=\n", AR #convolve r1 = NI.convolve(BW, kernel).astype(NU.uint8) r2 = (NI.convolve(BW, kernel) == 1).astype(NU.uint8) print "r1=\n",r1 print "r2=\n",r2 # create 32-bit array using results from convolve as mask AR = NU.array(DA, copy=1) NU.putmask(AR, r2, LA) print "AR=\n", AR ------------------------------------------------------------------------------ From david at ar.media.kyoto-u.ac.jp Tue May 6 05:47:15 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 06 May 2008 18:47:15 +0900 Subject: [Numpy-discussion] Deprecating PyDataMem_RENEW ? In-Reply-To: References: <481F0143.3030700@ar.media.kyoto-u.ac.jp> Message-ID: <48202923.8010602@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > > How much does this matter? I mean, what if you simply left all the > reallocs as-is? The arrays that resulted from reallocs would not > typically be aligned, but we cannot in any case expect all arrays to > be aligned. The problem would be the interaction between the aligned allocator and realloc. If we do it by "ourselves", then it is no problem, but we cannot do it if we rely on posix_memalign/memalign. I was not precise enough when I mentioned portability: I meant that it was not possible to implement realloc in terms of malloc/free portably, and as such, it is not possible to implement an aligned realloc with memalign/free. Also, I have not found any indication that the pointer given by posix_memalign can be fed to realloc: depending on the implementation, it cannot be freed... Maybe the solution is to never use posix_memalign... 
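For the "pure-python solution of constructing a view" mentioned earlier in this thread, a minimal sketch of the over-allocate-and-offset idea (the function name and defaults are assumptions; this only illustrates the alignment trick and is unrelated to the C-level PyDataMem_NEW changes being discussed):

import numpy as np

def aligned_empty(shape, dtype=np.float64, align=16):
    # Over-allocate by `align` extra bytes, then take a view that starts
    # at the first address satisfying the requested alignment. The
    # oversized buffer stays alive as the base of the returned view, so
    # freeing works as usual and no original pointer has to be recovered.
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    buf = np.empty(nbytes + align, dtype=np.uint8)
    offset = (-buf.ctypes.data) % align    # bytes to skip to reach alignment
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)

a = aligned_empty((100, 3), align=16)
assert a.ctypes.data % 16 == 0

This sketch does not address the realloc problem raised above: resizing would still mean allocating a new buffer and copying.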
> As I understand it, a > portable aligned malloc allocates extra memory, then adds something to > the pointer to make the memory arena aligned. Yes, that's how it works in fftw, and the implementation I have for numpy is derived from fftw code. > I don't think any numpy function can assume that its input arrays are > aligned, since users may like to pass, say, mmap()ed hunks of a file, > over which numpy has no control of the alignment. In this case, then, > how much of a problem is it if a few stray realloc()s produce > non-aligned arrays? There are a few realloc, but they are used in core routines (PyArray_FromIter, PyArray_Resize). The point is to get aligned arrays most of the time, and avoiding losing alignement unless it is too complicated to do otherwise. cheers, David From mariusnijhuis at gmail.com Tue May 6 08:15:29 2008 From: mariusnijhuis at gmail.com (Marius Nijhuis) Date: Tue, 6 May 2008 14:15:29 +0200 Subject: [Numpy-discussion] "Segmentation fault (core dumped)" as result of matrix multiplication Message-ID: <64268c6e0805060515g18b4cc2au524185f194aab45e@mail.gmail.com> Hello, I encountered the error "Segmentation fault (core dumped)" during a rather standard multiplication, without excessive memory us. This looks likes a bug to me? I am using Python 2.5, Numpy 1.0.4 under Ubuntu 7.10. Here is what I am doing: i have two arrays, points1 and points2. points1.shape=(n,k) points2.shape=(m,k) The problem only happens if n or m (or both) are 1. The arrays are then still 2-dimensional. k is usually in the order of 60. The code: p1Mat=asmatrix(points1) p2Mat=asmatrix(points2).T ab=p1Mat*p2Mat At the third line, python crashes back to terminal with the text "Segmentation fault (core dumped)" Tracing further the error happens at line 106 of numpy/core/defmatrix.py If I understand correctly, the __mul__ method of p1Mat calls N.dot(self, asmatrix(other)) and the problem occurs in asmatrix(other) , i.e. before actual multiplication is done. To be more precise, at some point the code calls __array_finalize__ twice, nested, and the error happens on the second return from it. I am not 100% sure what is done here, but it looks as if the asmatrix command I gave myself is somehow conflicting with the asmatrix command given by __mul__ I have replaced my code with: p1Mat=sp.matrix(points1) p2Mat=sp.matrix(points2).T ab=p1Mat*p2Mat and this works fine. I hope this information is useful to someone? Marius -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue May 6 08:28:32 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 06 May 2008 15:28:32 +0300 Subject: [Numpy-discussion] "Segmentation fault (core dumped)" as result of matrix multiplication In-Reply-To: <64268c6e0805060515g18b4cc2au524185f194aab45e@mail.gmail.com> References: <64268c6e0805060515g18b4cc2au524185f194aab45e@mail.gmail.com> Message-ID: <1210076912.26372.3.camel@localhost> ti, 2008-05-06 kello 14:15 +0200, Marius Nijhuis kirjoitti: > Hello, > > I encountered the error "Segmentation fault (core dumped)" during a > rather standard multiplication, without excessive memory us. This > looks likes a bug to me? > I am using Python 2.5, Numpy 1.0.4 under Ubuntu 7.10. > > Here is what I am doing: i have two arrays, points1 and points2. > > points1.shape=(n,k) > points2.shape=(m,k) > > The problem only happens if n or m (or both) are 1. The arrays are > then still 2-dimensional. k is usually in the order of 60. 
If you got the arrays by unpickling them and you are using a SSE2 enabled Atlas, this sounds like bug #551 (http://scipy.org/scipy/numpy/ticket/551#comment:22) to me. Is this the case? Can you reproduce a segmentation fault with the simple C-only test case? -- Pauli Virtanen From aisaac at american.edu Tue May 6 08:40:06 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 6 May 2008 08:40:06 -0400 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: On Tue, 6 May 2008, Jarrod Millman apparently wrote: > open tickets that I would like everyone to take a brief > look at: > http://projects.scipy.org/scipy/numpy/ticket/760 My understanding is that my patch, which would give a deprecation warning, was rejected in favor of the patch specified at the end of Travis's message here: At least that's how I thought things were resolved. I expected Travis's patch would be part of the 1.1 release. Cheers, Alan From Andy.cheesman at bristol.ac.uk Tue May 6 12:31:01 2008 From: Andy.cheesman at bristol.ac.uk (Andy Cheesman) Date: Tue, 06 May 2008 17:31:01 +0100 Subject: [Numpy-discussion] Tests for empty arrays Message-ID: <482087C5.8060807@bristol.ac.uk> Hi nice numpy people I was wondering if anyone could shed some light on how to distinguish an empty array of a given shape and an zeros array of the same dimensions. Thanks Andy From tim.hochberg at ieee.org Tue May 6 12:43:13 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 6 May 2008 09:43:13 -0700 Subject: [Numpy-discussion] Tests for empty arrays In-Reply-To: <482087C5.8060807@bristol.ac.uk> References: <482087C5.8060807@bristol.ac.uk> Message-ID: On Tue, May 6, 2008 at 9:31 AM, Andy Cheesman wrote: > Hi nice numpy people > > I was wondering if anyone could shed some light on how to distinguish an > empty array of a given shape and an zeros array of the same dimensions. An empty array is just uninitialized, while a zeros array is initialized to zeros. Short of checking whether the zeros array is all zeros, which only tells you that it looks like it was coming from zeros; it still could have created by by empty or zeros or some other method. Why do you need to know? If the array is coming from an unknown source, why not just use a.fill(0) to force everything to be zero and start from a known state? > > > Thanks > Andy > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue May 6 12:44:33 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 6 May 2008 09:44:33 -0700 Subject: [Numpy-discussion] labeled array Message-ID: I'm trying to design a labeled array class. A labeled array contains a 2d array and two lists. One list labels the rows of the array (e.g. variable names) and another list labels the columns of the array (e.g. dates). You can sum (or multiply, divide, subtract, etc.) two labeled arrays that have different shapes. The shape of the sum will be NxM where N is the number of labeled rows the two arrays have in common and M is the number of labeled columns the two arrays have in common. Does something like this already exist? If not, should I be building this on top of the array class? I'm new to OO programming so I don't know what I mean by "on top of". 
Anyway, here's a prototype: Oh, one last question. Does anyone see a way to speed up the __align method? The __align method reduces two labeled arrays to the shape the sum (or product etc) will have. import numpy as np class Larry: "Meet Larry, he's a labeled 2d array" def __init__(self, x, row, col): """ x len(row) by len(col) 2d array row list of row names, such as variable names, no duplicates col list of column names, such as dates, no duplicates """ lrow = len(row) lcol = len(col) assert x.shape[0] == lrow, 'Number of x rows must equal length of row' assert x.shape[1] == lcol, 'Number of x columns must equal length of col' assert len(frozenset(row)) == lrow, 'row elements must be unique' assert len(frozenset(col)) == lcol, 'col elements must be unique' self.x = x self.row = row self.col = col def log(self): "In-place log" np.log(self.x, self.x) def exp(self): "In-place exp" np.exp(self.x, self.x) def sum(self, axis=None, dtype=None): "Returns sum of x" return self.x.sum(axis, dtype) def cumsum(self, axis=1, dtype=None): "In-place cumsum over axis 0 or 1" assert axis in (0,1), 'axis must be 0 or 1' self.x = self.x.cumsum(axis, dtype, out=self.x) def __neg__(self): x = -self.x.copy() row = list(self.row) col = list(self.col) return Larry(x, row, col) def __pos__(self): pass def abs(self): "In-place absolute value of x" self.x = np.absolute(self.x, self.x) def __abs__(self): x = np.abs(self.x.copy()) row = list(self.row) col = list(self.col) return Larry(x, row, col) def __eq__(self, other): row = sorted(self.row) == sorted(other.row) col = sorted(self.col) == sorted(other.col) x = (self.x == other.x).all() if row & col & x: return True else: return False def __add__(self, other): x, y, row, col = self.__align(other) x += y return Larry(x, row, col) def __sub__(self, other): x, y, row, col = self.__align(other) x -= y return Larry(x, row, col) def __div__(self, other): x, y, row, col = self.__align(other) x /= y return Larry(x, row, col) def __mul__(self, other): x, y, row, col = self.__align(other) x *= y return Larry(x, row, col) def __align(self, other): row = list(frozenset(self.row) & frozenset(other.row)) col = list(frozenset(self.col) & frozenset(other.col)) row.sort() col.sort() x = np.zeros((len(row), len(col))) ridx = [self.row.index(i) for i in row] cidx = [self.col.index(i) for i in col] x += self.x[np.ix_(ridx, cidx)] ridx = [other.row.index(i) for i in row] cidx = [other.col.index(i) for i in col] y = other.x[np.ix_(ridx, cidx)] return x, y, row, col def example(): x = np.array([[1, np.nan], [3, 4]]) row = ['one', 'three'] col = [1, 2] a = Larry(x, row, col) x = np.array([[5, 6, 7], [8, 9, 10], [11, 12, 13]]) row = ['one', 'two', 'three'] col = [1, 2, 3] b = Larry(x, row, col) c = a + b print print 'a.row' print a.row print print 'a.col' print a.col print print 'a.x' print a.x print print '----------' print print 'b.row' print b.row print print 'b.col' print b.col print print 'b.x' print b.x print print '----------' print 'c = a + b' print '----------' print print 'c.row' print c.row print print 'c.col' print c.col print print 'c.x' print c.x From peridot.faceted at gmail.com Tue May 6 12:45:01 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 6 May 2008 12:45:01 -0400 Subject: [Numpy-discussion] Tests for empty arrays In-Reply-To: <482087C5.8060807@bristol.ac.uk> References: <482087C5.8060807@bristol.ac.uk> Message-ID: 2008/5/6 Andy Cheesman : > I was wondering if anyone could shed some light on how to distinguish an > empty array of a given shape and an 
zeros array of the same dimensions. An "empty" array, that is, an array returned by the function empty(), just means an uninitialized array. The only difference between it and an array returned by zeros() is the values in the array: from zeros() they are, obviously, all zero, while from empty() they can be anything, including zero(). In practice they tend to be crazy numbers of order 1e300 or 1e-300, but one can't count on this. As a general rule, I recommend against using empty() on your first pass through the code. Only once your code is working and you have tests running should you consider switching to empty() for speed reasons. In fact, if you want to use empty() down the road, it may make sense to initialize your array to zeros()/0., so that if you ever use the values, the NaNs will propagate and become obvious. Unfortunately, some CPUs implement NaNs in software, so there can be a huge speed hit. Anne From kwgoodman at gmail.com Tue May 6 12:53:44 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 6 May 2008 09:53:44 -0700 Subject: [Numpy-discussion] Tests for empty arrays In-Reply-To: References: <482087C5.8060807@bristol.ac.uk> Message-ID: On Tue, May 6, 2008 at 9:45 AM, Anne Archibald wrote: > In fact, if you want to use empty() down the road, it may > make sense to initialize your array to zeros()/0., so that if you ever > use the values, the NaNs will propagate and become obvious. Numpy has ones and zeros. Could we add a nans? I often initialize using x = nan * ones((n ,m)). But if it's in a loop, I'll avoid one copy by doing x = np.ones((n, m)) x *= np.nan To many on the list using nans for missing values is like chewing gum you found on the sidewalk. But I use it all the time so I'd use a nans. From tim.hochberg at ieee.org Tue May 6 13:03:38 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 6 May 2008 10:03:38 -0700 Subject: [Numpy-discussion] Tests for empty arrays In-Reply-To: References: <482087C5.8060807@bristol.ac.uk> Message-ID: On Tue, May 6, 2008 at 9:53 AM, Keith Goodman wrote: > On Tue, May 6, 2008 at 9:45 AM, Anne Archibald > wrote: > > In fact, if you want to use empty() down the road, it may > > make sense to initialize your array to zeros()/0., so that if you ever > > use the values, the NaNs will propagate and become obvious. > > Numpy has ones and zeros. Could we add a nans? > > I often initialize using x = nan * ones((n ,m)). But if it's in a > loop, I'll avoid one copy by doing > > x = np.ones((n, m)) > x *= np.nan > > To many on the list using nans for missing values is like chewing gum > you found on the sidewalk. But I use it all the time so I'd use a > nans. Why don't you just roll your own? >>> def nans(shape, dtype=float): ... a = np.empty(shape, dtype) ... a.fill(np.nan) ... return a ... >>> nans([3,4]) array([[ NaN, NaN, NaN, NaN], [ NaN, NaN, NaN, NaN], [ NaN, NaN, NaN, NaN]]) -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue May 6 13:14:41 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 6 May 2008 10:14:41 -0700 Subject: [Numpy-discussion] Tests for empty arrays In-Reply-To: References: <482087C5.8060807@bristol.ac.uk> Message-ID: On Tue, May 6, 2008 at 10:03 AM, Timothy Hochberg wrote: > Why don't you just roll your own? > > >>> def nans(shape, dtype=float): > ... a = np.empty(shape, dtype) > ... a.fill(np.nan) > ... return a > ... 
> >>> nans([3,4]) > array([[ NaN, NaN, NaN, NaN], > [ NaN, NaN, NaN, NaN], > [ NaN, NaN, NaN, NaN]]) I learn a lot from this list. I didn't know about fill. Looks like it is much faster than adding nan. >> timeit nans0((500,500)) 10 loops, best of 3: 30.5 ms per loop >> timeit nans1((500,500)) 1000 loops, best of 3: 956 ?s per loop def nans0(shape, dtype=float): a = np.ones(shape, dtype) a += np.nan return a def nans1(shape, dtype=float): a = np.empty(shape, dtype) a.fill(np.nan) No need to roll my own. I'll smoke yours. return a From ctw at cogsci.info Tue May 6 13:15:16 2008 From: ctw at cogsci.info (Christoph T. Weidemann) Date: Tue, 6 May 2008 13:15:16 -0400 Subject: [Numpy-discussion] labeled array Message-ID: On Tue, May 6, 2008 at 1:00 PM, "Keith Goodman" wrote: > I'm trying to design a labeled array class. A labeled array contains a > 2d array and two lists. One list labels the rows of the array (e.g. > variable names) and another list labels the columns of the array (e.g. > dates). I'm working on a dimensioned data class that seems to be a more general case of what you are looking for. It's a subclass of ndarray, and contains an attribute which specifies each dimension (this works for any number of dimensions, not just 2D). This data structure is not currently in a state where it's useful yet, but this will hopefully change soon. From basti.kr at gmail.com Tue May 6 15:03:14 2008 From: basti.kr at gmail.com (Sebastian =?ISO-8859-1?Q?Kr=E4mer?=) Date: Tue, 06 May 2008 21:03:14 +0200 Subject: [Numpy-discussion] __float__ is not called when instance can not evaluated. Message-ID: <1210100594.6081.32.camel@basti-desktop> Hi all, I'm currently working on a function that converts a sympy ?(http://code.google.com/p/sympy) expression to a lambda-function. In this lambda-function all sympy builtin functions are replaced by numpy functions, since they are faster. Now it may happen that users pass sympy-symbols like pi to these lambda functions and so it is possible that numpy-functions get these symbols. The same functionality is implemented using python's math module, and it works because the math functions call the __float__ method and therefor get a number they can work with. However, numpy doesn't do this, it only looks if there is a method with the same name as the called function. e.g: >> numpy.cos(sympy.pi) : cos whereas: >> math.cos(sympy.pi) -1.0 Would it be possible to change numpys behaviour so that x.__float__() is tried after x.cos() has failed? Or are there any other possible solutions? Thanks in advance, Sebastian From kwgoodman at gmail.com Tue May 6 15:22:05 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 6 May 2008 12:22:05 -0700 Subject: [Numpy-discussion] What is .T? Message-ID: What is .T? It looks like an attribute, behaves like a method, and smells like magic. I'd like to add it to my class but don't no where to begin. From robert.kern at gmail.com Tue May 6 15:28:24 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 May 2008 14:28:24 -0500 Subject: [Numpy-discussion] What is .T? In-Reply-To: References: Message-ID: <3d375d730805061228l3f996769t8ed3b86a7893bded@mail.gmail.com> On Tue, May 6, 2008 at 2:22 PM, Keith Goodman wrote: > What is .T? It looks like an attribute, behaves like a method, and > smells like magic. I'd like to add it to my class but don't no where > to begin. It is a property. It returns the transpose of the array. 
If you had a .transpose() method on your class already, you could do (in Python 2.4+) @property def T(self): return self.transpose() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Tue May 6 15:37:57 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 6 May 2008 12:37:57 -0700 Subject: [Numpy-discussion] What is .T? In-Reply-To: <3d375d730805061228l3f996769t8ed3b86a7893bded@mail.gmail.com> References: <3d375d730805061228l3f996769t8ed3b86a7893bded@mail.gmail.com> Message-ID: On Tue, May 6, 2008 at 12:28 PM, Robert Kern wrote: > > On Tue, May 6, 2008 at 2:22 PM, Keith Goodman wrote: > > What is .T? It looks like an attribute, behaves like a method, and > > smells like magic. I'd like to add it to my class but don't no where > > to begin. > > It is a property. It returns the transpose of the array. If you had a > .transpose() method on your class already, you could do (in Python > 2.4+) > > @property > def T(self): > return self.transpose() That works very nicely. Thank you. From robert.kern at gmail.com Tue May 6 15:40:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 May 2008 14:40:14 -0500 Subject: [Numpy-discussion] numpy in RHEL4 In-Reply-To: <288df32a0805060221y7f4aef6fjc9efd3200178e81c@mail.gmail.com> References: <288df32a0805050414m61b3442aq40b71babbddc77b8@mail.gmail.com> <3d375d730805051017o21e6944ct132cfaec39e18ed@mail.gmail.com> <288df32a0805060221y7f4aef6fjc9efd3200178e81c@mail.gmail.com> Message-ID: <3d375d730805061240u764e201bq61cb9e49440490e7@mail.gmail.com> On Tue, May 6, 2008 at 4:21 AM, Bala subramanian wrote: > Dear Robert, > Thank you. But i am trying to install it in a 32-bit machine only. In that > case, why dose it require 64 bit libraries. Well, judging from the paths on the command line, Python thinks it is on a 64-bit machine: build/temp.linux-x86_64-2.3/ How did you build Python? If you didn't build it, where did you get it from? You can check what kind of platform it thinks it is on with the following: >>> import platform >>> platform.architecture() ('64bit', 'ELF') >>> platform.processor() 'x86_64' -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue May 6 16:13:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 May 2008 15:13:02 -0500 Subject: [Numpy-discussion] cannot edit page on scipy-dev wiki In-Reply-To: References: Message-ID: <3d375d730805061313y135141afw9db450155dd7ad8b@mail.gmail.com> On Tue, May 6, 2008 at 4:33 AM, Vincent Noel wrote: > Hello all, > > I wanted to fix the formatting problems on the wiki page > http://scipy.org/scipy/numpy/wiki/MaskedArrayApiChanges, so I followed > the instructions on http://scipy.org/scipy/numpy/wiki (which state "In > order to edit wiki pages or create and edit tickets, you need to > register first.") > But even after getting a registered login, I still can't edit the wiki > pages. Do you need to request some extra authorization? I have added the WIKI_CREATE and WIKI_MODIFY permissions to all authenticated users. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Tue May 6 17:09:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 6 May 2008 15:09:57 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: On Tue, May 6, 2008 at 6:40 AM, Alan G Isaac wrote: > On Tue, 6 May 2008, Jarrod Millman apparently wrote: > > open tickets that I would like everyone to take a brief > > look at: > > http://projects.scipy.org/scipy/numpy/ticket/760 > > My understanding is that my patch, which would give > a deprecation warning, was rejected in favor of the patch > specified at the end of Travis's message here: > http://projects.scipy.org/pipermail/numpy-discussion/2008-April/033315.html > > > > At least that's how I thought things were resolved. > I expected Travis's patch would be part of the 1.1 release. > I think someone needs to step up and make the change, but first it needs to be blessed by Travis and some of the folks on the Matrix indexing thread. The change might also entail removing some workarounds in the spots indicated by Travis. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Tue May 6 17:17:54 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 6 May 2008 17:17:54 -0400 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: 2008/5/6 Charles R Harris : > > On Tue, May 6, 2008 at 6:40 AM, Alan G Isaac wrote: > > > > On Tue, 6 May 2008, Jarrod Millman apparently wrote: > > > open tickets that I would like everyone to take a brief > > > look at: > > > http://projects.scipy.org/scipy/numpy/ticket/760 > > > > My understanding is that my patch, which would give > > a deprecation warning, was rejected in favor of the patch > > specified at the end of Travis's message here: > > > > > > > At least that's how I thought things were resolved. > > I expected Travis's patch would be part of the 1.1 release. > > > > I think someone needs to step up and make the change, but first it needs to > be blessed by Travis and some of the folks on the Matrix indexing thread. > The change might also entail removing some workarounds in the spots > indicated by Travis. It has my vote, as a peripheral participant in the matrix indexing thread; in fact it's what some of us thought we were agreeing to earlier. Anne From bsouthey at gmail.com Tue May 6 17:48:31 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 6 May 2008 16:48:31 -0500 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: Hi, I think Ticket 605 (Incorrect behaviour of numpy.histogram) can be closed. With regards to Ticket 706 (scalar indexing of matrices -> deprecation warning) I think it should not be a blocker now but should apply the next version. There were many different issues (and threads) raised in the context of this often circular discussion. My conclusion was that the Matrix class needs a clearer statement of the expected behavior under different circumstances not just indexing. I don't have Atlas installed to evaluate Ticket 551 and Ticket 748 is outside my knowledge. Regards Bruce . On Tue, May 6, 2008 at 3:40 AM, Jarrod Millman wrote: > Hey, > > The trunk is in pretty good shape and it is about time that I put out > an official release. So tomorrow (in a little over twelve hours) I am > going to create a 1.1.x branch and the trunk will be officially open > for 1.2 development. 
If there are no major issues that show up at the > last minute, I will tag 1.1.0 twenty-four hours after I branch. As > soon as I tag the release I will ask the David and Chris to create the > official Windows and Mac binaries. If nothing goes awry, you can > expect the official release announcement by Monday, May 12th. > > In order to help me with the final touches, would everyone look over > the release notes one last time: > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > Please let me know if there are any important omissions or errors ASAP. > > Also, there are four open tickets that I would like everyone to take a > brief look at: > http://projects.scipy.org/scipy/numpy/ticket/551 > http://projects.scipy.org/scipy/numpy/ticket/605 > http://projects.scipy.org/scipy/numpy/ticket/748 > http://projects.scipy.org/scipy/numpy/ticket/760 > I am not going to hold up the release any longer, but it would be nice > if a few (or all) of these remaining tickets could be closed. > > Thanks, > > -- > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From e.howick at irl.cri.nz Tue May 6 17:50:50 2008 From: e.howick at irl.cri.nz (Eleanor) Date: Tue, 6 May 2008 21:50:50 +0000 (UTC) Subject: [Numpy-discussion] lexsort Message-ID: >>> a = numpy.array([[1,2,6], [2,2,8], [2,1,7],[1,1,5]]) >>> a array([[1, 2, 6], [2, 2, 8], [2, 1, 7], [1, 1, 5]]) >>> indices = numpy.lexsort(a.T) >>> a.T.take(indices,axis=-1).T array([[1, 1, 5], [1, 2, 6], [2, 1, 7], [2, 2, 8]]) The above does what I want, equivalent to sorting on column A then column B in Excel, but al the transposes are ungainly. I've stared at it a while but can't come up with a more elegant solution. Any ideas? cheers Eleanor From peridot.faceted at gmail.com Tue May 6 18:35:26 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 6 May 2008 18:35:26 -0400 Subject: [Numpy-discussion] lexsort In-Reply-To: References: Message-ID: 2008/5/6 Eleanor : > >>> a = numpy.array([[1,2,6], [2,2,8], [2,1,7],[1,1,5]]) > >>> a > array([[1, 2, 6], > [2, 2, 8], > [2, 1, 7], > [1, 1, 5]]) > >>> indices = numpy.lexsort(a.T) > >>> a.T.take(indices,axis=-1).T > array([[1, 1, 5], > [1, 2, 6], > [2, 1, 7], > [2, 2, 8]]) > > > The above does what I want, equivalent to sorting on column A then > column B in Excel, but al the transposes are ungainly. I've stared at it a while > but can't come up with a more elegant solution. Any ideas? It appears that lexsort is broken in several ways, and its docstring is misleading. First of all, this code is not doing quite what you describe. The primary key here is the [5,6,7,8] column, followed by the middle and then by the first. This is almost exactly the opposite of what you describe (and of what I expected). To get this to sort the way you describe, the clearest way is to write a sequence: In [34]: indices = np.lexsort( (a[:,1],a[:,0]) ) In [35]: a[indices,:] Out[35]: array([[1, 1, 5], [1, 2, 6], [2, 1, 7], [2, 2, 8]]) In other words,sort over a[:,1], then sort again over a[:,0], making a[:,0] the primary key. 
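Spelled out with a tiny made-up example (the values are chosen only to show which key wins):

>>> import numpy as np
>>> col0 = np.array([2, 1, 1, 2])        # intended primary key
>>> col1 = np.array([1, 2, 1, 2])        # intended secondary key
>>> order = np.lexsort((col1, col0))     # the key passed *last* is the primary one
>>> col0[order]
array([1, 1, 2, 2])
>>> col1[order]
array([1, 2, 1, 2])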
I have used "fancy indexing" to pull the array into the right order, but it can also be done with take(): In [40]: np.take(a,indices,axis=0) Out[40]: array([[1, 1, 5], [1, 2, 6], [2, 1, 7], [2, 2, 8]]) As for why I say lexsort() is broken, well, it simply returns 0 for higher-rank arrays (rather than either sorting or raising an exception), it raises an exception when handed axis=None rather than flattening as the docstring claims, and whatever the axis argument is supposed to do, it doesn't seem to do it: In [44]: np.lexsort(a,axis=0) Out[44]: array([1, 0, 2]) In [45]: np.lexsort(a,axis=-1) Out[45]: array([1, 0, 2]) In [46]: np.lexsort(a,axis=1) --------------------------------------------------------------------------- Traceback (most recent call last) /home/peridot/ in () : axis(=1) out of bounds Anne From e.howick at irl.cri.nz Tue May 6 21:14:59 2008 From: e.howick at irl.cri.nz (Eleanor) Date: Wed, 7 May 2008 01:14:59 +0000 (UTC) Subject: [Numpy-discussion] lexsort References: Message-ID: Anne Archibald gmail.com> writes: > > It appears that lexsort is broken in several ways, and its docstring > is misleading. > > First of all, this code is not doing quite what you describe. The > primary key here is the [5,6,7,8] column, followed by the middle and > then by the first. This is almost exactly the opposite of what you > want. Ouch! That would've got me bad. >>>a = numpy.array([[1,2,60], [2,2,800], [2,1,7],[1,1,50]]) >>> a array([[ 1, 2, 60], [ 2, 2, 800], [ 2, 1, 7], [ 1, 1, 50]]) >>> a[numpy.lexsort( (a[:,1],a[:,0]) ),:] array([[ 1, 1, 50], [ 1, 2, 60], [ 2, 1, 7], [ 2, 2, 800]]) This is correct versus >>> a[numpy.lexsort(a.T),:] array([[ 2, 1, 7], [ 1, 1, 50], [ 1, 2, 60], [ 2, 2, 800]]) which isn't. Thanks very much Eleanor From millman at berkeley.edu Wed May 7 02:12:09 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 6 May 2008 23:12:09 -0700 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: I have just created the 1.1.x branch: http://projects.scipy.org/scipy/numpy/changeset/5134 In about 24 hours I will tag the 1.1.0 release from the branch. At this point only critical bug fixes should be applied to the branch. The trunk is now open for 1.2 development. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From mariusnijhuis at gmail.com Wed May 7 03:19:32 2008 From: mariusnijhuis at gmail.com (Marius Nijhuis) Date: Wed, 7 May 2008 09:19:32 +0200 Subject: [Numpy-discussion] "Segmentation fault (core dumped)" as re: segmentation fault Message-ID: <64268c6e0805070019w30af38bam669ac0840844df7b@mail.gmail.com> Hi, The array I was using is indeed from a pickle. I ran the c example, and got the following: $ LD_PRELOAD=/usr/lib/sse2/libcblas.so.3.0 ./sse2-crash-test good_align: ok bad_align: Segmentation fault (core dumped) $ LD_PRELOAD=/usr/lib/libcblas.so.3.0 ./sse2-crash-test good_align: ok bad_align: ok So it looks as if this is indeed the problem. At least it explains why my code ran well originally, I used either fresh arrays or ran on Windows, without ATLAS. This is one obscure bug! Thanks for the help. For the moment is there anything I can do to keep it from reoccurring in other contexts? I am copying the arrays now before use, but odds are I will forget that one time. Is using scipy.io instead of cPickle going to improve the situation? 
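For now the workaround I have settled on looks roughly like this (a sketch only: the helper name and file name are made up, and it assumes the crash really is down to the alignment of the unpickled buffer):

import cPickle
import numpy as np

def load_copied(path):
    f = open(path, 'rb')
    try:
        a = cPickle.load(f)
    finally:
        f.close()
    # The extra copy puts the data into a buffer NumPy allocated itself,
    # which in practice comes back suitably aligned for the SSE2 ATLAS
    # routines, unlike the buffer that comes back from cPickle.
    return np.asarray(a).copy()

a = load_copied('data.pkl')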
Thanks anyway, Marius -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbsnyder at gmail.com Wed May 7 09:30:49 2008 From: jbsnyder at gmail.com (James Snyder) Date: Wed, 7 May 2008 08:30:49 -0500 Subject: [Numpy-discussion] malloc failures on 10.5.2 w/ apple python2.5.1 Message-ID: <33644d3c0805070630y1d810263w385571a7e0eeb0aa@mail.gmail.com> Hi - I'm not sure if this is a bug or an error on my part. I've grabbed the latest subversion truck and did a build and install on 10.5.2, and I'm getting malloc errors when running the tests. Sorry about the revision being lopped off the Numpy version, I'm current up to r5133, but I'm using git-svn :-) Uname: Darwin roeder.local 9.2.2 Darwin Kernel Version 9.2.2: Tue Mar 4 21:17:34 PST 2008; root:xnu-1228.4.31~1/RELEASE_I386 i386 >>> numpy.test(10,1) Numpy is installed in /Library/Python/2.5/site-packages/numpy-1.1.0.dev-py2.5-macosx-10.5-i386.egg/numpy Numpy version 1.1.0.dev Python version 2.5.1 (r251:54863, Feb 4 2008, 21:48:13) [GCC 4.0.1 (Apple Inc. build 5465)] Found 10/10 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 280/280 tests for numpy.core.multiarray Found 69/69 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 6/6 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 93/93 tests for numpy.ma.core Found 14/14 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ ......................................................................................................................................................................................................................................................................................................................................................................................................................................Python(1867) malloc: *** error for object 0x161c430: pointer being reallocated was not allocated *** set a breakpoint in malloc_error_break to debug EPython(1867) malloc: *** error for object 0x161c560: pointer being reallocated was not allocated *** set a breakpoint in malloc_error_break to debug EPython(1867) malloc: *** error for object 0x161c560: pointer being reallocated was not allocated *** set a breakpoint in malloc_error_break to debug 
E...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... ====================================================================== ERROR: test_lengths (numpy.core.tests.test_numeric.TestFromiter) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Python/2.5/site-packages/numpy-1.1.0.dev-py2.5-macosx-10.5-i386.egg/numpy/core/tests/test_numeric.py", line 217, in test_lengths a = fromiter(self.makegen(), int) MemoryError: cannot allocate array memory ====================================================================== ERROR: test_types (numpy.core.tests.test_numeric.TestFromiter) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Python/2.5/site-packages/numpy-1.1.0.dev-py2.5-macosx-10.5-i386.egg/numpy/core/tests/test_numeric.py", line 208, in test_types ai32 = fromiter(self.makegen(), int32) MemoryError: cannot allocate array memory ====================================================================== ERROR: test_values (numpy.core.tests.test_numeric.TestFromiter) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Python/2.5/site-packages/numpy-1.1.0.dev-py2.5-macosx-10.5-i386.egg/numpy/core/tests/test_numeric.py", line 230, in test_values a = fromiter(self.makegen(), int) MemoryError: cannot allocate array memory ---------------------------------------------------------------------- Ran 991 tests in 1.603s FAILED (errors=3) -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From fullung at gmail.com Wed May 7 09:42:00 2008 From: fullung at gmail.com (Albert Strasheim) Date: Wed, 7 May 2008 15:42:00 +0200 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: <5eec5f300805070642l1871e6bbrf82e7ce7250c0605@mail.gmail.com> Hello, On Wed, May 7, 2008 at 8:12 AM, Jarrod Millman wrote: > I have just created the 1.1.x branch: > http://projects.scipy.org/scipy/numpy/changeset/5134 > In about 24 hours I will tag the 1.1.0 release from the branch. At > this point only critical bug fixes should be applied to the branch. > The trunk is now open for 1.2 development. Might be a bit late now, but are you (or is someone) still Valgrinding NumPy on a semi-regular basis, or at least before a release? 
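Locally, something along these lines is roughly what I have in mind (the suppressions file is the Misc/valgrind-python.supp that ships in the CPython source tree, so the path will need adjusting for your checkout):

valgrind --tool=memcheck --leak-check=yes \
    --suppressions=Misc/valgrind-python.supp \
    python -c "import numpy; numpy.test()"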
Cheers, Albert From fullung at gmail.com Wed May 7 09:59:38 2008 From: fullung at gmail.com (Albert Strasheim) Date: Wed, 7 May 2008 15:59:38 +0200 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: <5eec5f300805070642l1871e6bbrf82e7ce7250c0605@mail.gmail.com> References: <5eec5f300805070642l1871e6bbrf82e7ce7250c0605@mail.gmail.com> Message-ID: <5eec5f300805070659r6bc8c870oaef706b15e0fcaf7@mail.gmail.com> Hello, On Wed, May 7, 2008 at 3:42 PM, Albert Strasheim wrote: > Might be a bit late now, but are you (or is someone) still Valgrinding > NumPy on a semi-regular basis, or at least before a release? Even better: let the Buildbot do it. You should see some Valgrind output appearing in the testing output of the Linux_x86_Fedora_Py2.6 builder soon. I'll probably have to tune things a bit to make it go green/red in the right situations. I'll also update the Linux_x86_64_Fedora builder later. Regards, Albert From stefan at sun.ac.za Wed May 7 11:06:16 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 7 May 2008 17:06:16 +0200 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: <5eec5f300805070659r6bc8c870oaef706b15e0fcaf7@mail.gmail.com> References: <5eec5f300805070642l1871e6bbrf82e7ce7250c0605@mail.gmail.com> <5eec5f300805070659r6bc8c870oaef706b15e0fcaf7@mail.gmail.com> Message-ID: <9457e7c80805070806l5f29d0ceke31ba098c060700d@mail.gmail.com> 2008/5/7 Albert Strasheim : > Even better: let the Buildbot do it. > > You should see some Valgrind output appearing in the testing output of > the Linux_x86_Fedora_Py2.6 builder soon. I'll probably have to tune > things a bit to make it go green/red in the right situations. Fantastic -- that's really handy. Thanks! Cheers St?fan From kwgoodman at gmail.com Wed May 7 11:29:09 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 7 May 2008 08:29:09 -0700 Subject: [Numpy-discussion] Numpy and google-analytics Message-ID: I noticed that scipy.org uses google-analytics. Does the numpy project need it? It is even on the numpy trac which to me gives new meaning to trac. From doutriaux1 at llnl.gov Wed May 7 13:35:17 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 07 May 2008 10:35:17 -0700 Subject: [Numpy-discussion] bug in oldnumeric.ma Message-ID: <4821E855.1060909@llnl.gov> The following code works with numpy.ma but not numpy.oldnumeric.ma, obviously it shouldn't have missing values... Note that replacing 2. 
with 2 (int) works, dtype does not seem to matter import numpy.oldnumeric.ma as MA,numpy s = MA.array([ 12.16271591, 11.19478798, 10.27440453, 9.60334778, 9.2451086, 9.13312435, 9.11890984, 9.03033924, 8.7346344, 8.18558788, 7.43921041, 6.62947559, 5.90856123, 5.37711906, 5.03489971, 4.77586126, 4.4189887 , 3.76522899, 2.65106893, 0.99725384, -1.1513288, -3.58740163, -5.93836021, -7.72085428, -8.4840517, -7.98519516, -6.31602335, -3.89117384, -1.29867446, 0.90583926, 2.3683548, 3.00160909, 2.94496846, 2.44637036, 1.7288276, 0.89646715, -0.07413431, -1.25960362, -2.67903948, -4.22316313, -5.66368866, -6.74747849, -7.31541586, -7.37950087, -7.12536144, -6.82764053, -6.74864483, -7.04398584, -7.73052883, -8.70125961, -9.77933311, -10.78447914, -11.59126091, -12.1706562, -12.60001087, -13.03606987, -13.64736843 , -14.51338005, -15.52330303, -16.33303833, -16.46606064, -15.52666664, -13.45486546, -10.63810158, -7.79133797, -5.66280842, -4.72991323, -5.03621674, -6.21381569, -7.6610961, -8.78764057, -9.22090816, -8.91826916, -8.14496899, -7.33739614, -6.91325617, -7.09938431, -7.84141493, -8.829319, -9.63222504, -9.88238335, -9.44395733, -8.4723177, -7.35424185, -6.54324722, -6.36360407, -6.85566664, -7.74132967, -8.51024342, -8.62310696, -7.73065567, -5.82352972, -3.24726033, -0.5881027 , 1.51040995, 2.52599192, 2.19137073, 0.56473863, -1.97977543, -4.84694195, -7.40427446, -9.2083149, -10.16563416, -10.53237438, -10.75102997, -11.21533489, -12.09201908, -13.28770447, -14.54858589, -15.62684536, -16.40594101, -16.91949844, -17.27025032, -17.51157188, -17.57112694, -17.25679016, -16.32791138, -14.58930111, -11.96431351, -8.52671432, -4.49569035, -0.20288418, 3.96036053, 7.60605049, 10.41992378, 12.21882153, 12.98544502, 12.87247849],dtype='f') s2c=MA.power(s,2.) print MA.count(s) print MA.count(s2c) From peridot.faceted at gmail.com Wed May 7 13:44:34 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 19:44:34 +0200 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: 2008/5/7 Jarrod Millman : > I have just created the 1.1.x branch: > http://projects.scipy.org/scipy/numpy/changeset/5134 > In about 24 hours I will tag the 1.1.0 release from the branch. At > this point only critical bug fixes should be applied to the branch. > The trunk is now open for 1.2 development. I committed Travis' matrix indexing patch (plus a basic test) to 1.1.x. Hope that's okay. (All tests pass.) Anne From charlesr.harris at gmail.com Wed May 7 14:09:13 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 May 2008 12:09:13 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: On Wed, May 7, 2008 at 11:44 AM, Anne Archibald wrote: > 2008/5/7 Jarrod Millman : > > I have just created the 1.1.x branch: > > http://projects.scipy.org/scipy/numpy/changeset/5134 > > In about 24 hours I will tag the 1.1.0 release from the branch. At > > this point only critical bug fixes should be applied to the branch. > > The trunk is now open for 1.2 development. > > I committed Travis' matrix indexing patch (plus a basic test) to > 1.1.x. Hope that's okay. (All tests pass.) > You should put it in the 1.2 branch also, then. I think fancy indexing and tolist, which Travis mentioned as having matrix workarounds, need to be checked to see if they still work correctly for matrices with the patch. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From efiring at hawaii.edu Wed May 7 14:15:59 2008 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 07 May 2008 08:15:59 -1000 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <4821E855.1060909@llnl.gov> References: <4821E855.1060909@llnl.gov> Message-ID: <4821F1DF.3010108@hawaii.edu> Charles Doutriaux wrote: > The following code works with numpy.ma but not numpy.oldnumeric.ma, No, this is a bug in numpy.ma also; power is broken: In [1]:import numpy as np In [2]:x = np.ma.array([-1.1]) In [3]:x**2.0 ### This works Out[3]: masked_array(data = [1.21], mask = [False], fill_value=1e+20) In [4]:np.ma.power(x, 2.0) ### This doesn't Out[4]: masked_array(data = [--], mask = [ True], fill_value=1e+20) Here is the code in ma/core.py, which is masking the output for negative inputs: def power(a, b, third=None): """Computes a**b elementwise. Masked values are set to 1. """ if third is not None: raise MAError, "3-argument power not supported." ma = getmask(a) mb = getmask(b) m = mask_or(ma, mb) fa = getdata(a) fb = getdata(b) if fb.dtype.char in typecodes["Integer"]: return masked_array(umath.power(fa, fb), m) md = make_mask((fa < 0), shrink=True) #### wrong m = mask_or(m, md) if m is nomask: return masked_array(umath.power(fa, fb)) else: fa = fa.copy() fa[(fa < 0)] = 1 #### wrong return masked_array(umath.power(fa, fb), m) Eric From peridot.faceted at gmail.com Wed May 7 14:47:18 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 20:47:18 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <4821F1DF.3010108@hawaii.edu> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> Message-ID: 2008/5/7 Eric Firing : > Charles Doutriaux wrote: > > The following code works with numpy.ma but not numpy.oldnumeric.ma, > > No, this is a bug in numpy.ma also; power is broken: While it's tempting to just call power() and mask out any NaNs that result, that's going to be a problem if people have their environments set to raise exceptions on the production of NaNs. Is it an adequate criterion to check (a<0) & (round(b)==b)? We have to be careful: In [16]: np.array([-1.0])**(2.0**128) Warning: invalid value encountered in power Out[16]: array([ nan]) 2.0**128 cannot be distinguished from nearby non-integral values, so this is reasonable behaviour (and a weird corner case), but In [23]: np.round(2.0**128) == 2.0**128 Out[23]: True Anne From pgmdevlist at gmail.com Wed May 7 15:12:18 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 7 May 2008 19:12:18 +0000 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> Message-ID: <200805071912.19427.pgmdevlist@gmail.com> All, Yes, there is a problem with ma.power: masking negative data should be restricted to the case of an exponent between -1. and 1. only, don't you think ? On Wednesday 07 May 2008 18:47:18 Anne Archibald wrote: > 2008/5/7 Eric Firing : > > Charles Doutriaux wrote: > > > The following code works with numpy.ma but not numpy.oldnumeric.ma, > > > > No, this is a bug in numpy.ma also; power is broken: > > While it's tempting to just call power() and mask out any NaNs that > result, that's going to be a problem if people have their environments > set to raise exceptions on the production of NaNs. Is it an adequate > criterion to check (a<0) & (round(b)==b)? 
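As a rough sketch of what that criterion might look like (illustrative only, array inputs only, and certainly not the real ma.power):

import numpy as np
from numpy import ma

def power_sketch(a, b):
    fa = ma.getdata(a).astype(float)
    fb = ma.getdata(b).astype(float)
    m = ma.mask_or(ma.getmask(a), ma.getmask(b))
    # only a negative base combined with a non-integer exponent is flagged
    bad = (fa < 0) & (fb != np.round(fb))
    m = ma.mask_or(m, ma.make_mask(bad, shrink=True))
    fa[bad] = 1.0    # placeholder value so umath.power stays warning-free
    return ma.masked_array(np.power(fa, fb), mask=m)

x = ma.array([-1.1, 2.0, -3.0])
p2 = power_sketch(x, 2.0)    # integer-valued exponent: nothing is masked
ph = power_sketch(x, 0.5)    # fractional exponent: the negative entries come back masked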
We have to be careful: > > In [16]: np.array([-1.0])**(2.0**128) > Warning: invalid value encountered in power > Out[16]: array([ nan]) > > 2.0**128 cannot be distinguished from nearby non-integral values, so > this is reasonable behaviour (and a weird corner case), but > > In [23]: np.round(2.0**128) == 2.0**128 > Out[23]: True > > Anne > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From kwgoodman at gmail.com Wed May 7 16:24:20 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 7 May 2008 13:24:20 -0700 Subject: [Numpy-discussion] Is a string a scalar? Message-ID: >> np.isscalar('string') True From robert.kern at gmail.com Wed May 7 16:37:37 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 May 2008 15:37:37 -0500 Subject: [Numpy-discussion] Is a string a scalar? In-Reply-To: References: Message-ID: <3d375d730805071337p501dcc58t186741d460b8217c@mail.gmail.com> On Wed, May 7, 2008 at 3:24 PM, Keith Goodman wrote: > >> np.isscalar('string') > True Either option would cause someone to complain. It's not a cut-and-dry issue. However, since strings can be atomic elements through the various '|S' dtypes, and we already have rules to special-case strings as atomic, "numpy.isscalar('string') == False" would be more inconsistent. In [1]: from numpy import * In [2]: array(['one', 'two']) Out[2]: array(['one', 'two'], dtype='|S3') -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Wed May 7 16:38:22 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 22:38:22 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <200805071912.19427.pgmdevlist@gmail.com> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> Message-ID: 2008/5/7 Pierre GM : > All, > Yes, there is a problem with ma.power: masking negative data should be > restricted to the case of an exponent between -1. and 1. only, don't you > think ? No, there's a problem with any fractional exponent (with even denominator): x**(3/2) == (x**3)**(1/2). And of course in a floating-point world, you can't really ask whether the denominator is even or not. So any non-integer power is trouble. The draconian approach would be to simply disallow negative numbers to be raised to exponents of type float, but that's going to annoy a lot of people who have integers which happen to be represented in floats. (Representing integers in floats is exact up to some fairly large value.) So the first question is "how do we recognize floats with integer values?". Unfortunately that's not the real problem. The real problem is "how do we predict when power() is going to produce a NaN?" Anne From chanley at stsci.edu Wed May 7 16:51:33 2008 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 07 May 2008 16:51:33 -0400 Subject: [Numpy-discussion] no longer receiving numpy-tickets messages Message-ID: <48221655.7010505@stsci.edu> Hi, I've noticed that I am no longer receiving message from the numpy-ticket distribution list. This includes messages for tickets I have submitted in addition to tickets created by others. 
Chris -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From robert.kern at gmail.com Wed May 7 16:54:03 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 May 2008 15:54:03 -0500 Subject: [Numpy-discussion] no longer receiving numpy-tickets messages In-Reply-To: <48221655.7010505@stsci.edu> References: <48221655.7010505@stsci.edu> Message-ID: <3d375d730805071354t2fc7d4c2gd8fa572efb263078@mail.gmail.com> On Wed, May 7, 2008 at 3:51 PM, Christopher Hanley wrote: > Hi, > > I've noticed that I am no longer receiving message from the numpy-ticket > distribution list. This includes messages for tickets I have submitted > in addition to tickets created by others. I'll pass this along. I haven't gotten any since May 3, either. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Wed May 7 16:55:30 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 7 May 2008 20:55:30 +0000 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: References: <4821E855.1060909@llnl.gov> <200805071912.19427.pgmdevlist@gmail.com> Message-ID: <200805072055.30250.pgmdevlist@gmail.com> On Wednesday 07 May 2008 20:38:22 Anne Archibald wrote: > 2008/5/7 Pierre GM : > > All, > > Yes, there is a problem with ma.power: masking negative data should be > > restricted to the case of an exponent between -1. and 1. only, don't you > > think ? > > No, there's a problem with any fractional exponent (with even > denominator): x**(3/2) == (x**3)**(1/2). Argh. Good point. > The real > problem is "how do we predict when power() is going to produce a NaN?" An alternative would be to forget about it: let power() output NaNs, and fix them afterwards with fix_invalid. From kwgoodman at gmail.com Wed May 7 16:58:03 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 7 May 2008 13:58:03 -0700 Subject: [Numpy-discussion] Is a string a scalar? In-Reply-To: <3d375d730805071337p501dcc58t186741d460b8217c@mail.gmail.com> References: <3d375d730805071337p501dcc58t186741d460b8217c@mail.gmail.com> Message-ID: On Wed, May 7, 2008 at 1:37 PM, Robert Kern wrote: > On Wed, May 7, 2008 at 3:24 PM, Keith Goodman wrote: > > >> np.isscalar('string') > > True > > Either option would cause someone to complain. It's not a > cut-and-dry issue. However, since strings can be atomic elements > through the various '|S' dtypes, and we already have rules to > special-case strings as atomic, "numpy.isscalar('string') == False" > would be more inconsistent. BTW, I noticed that defmatrix.py uses isscalar (from numeric import isscalar) and N.isscalar (import numpric as N). Each is used only one time. It confused me a little at first. But that's not saying much. From robert.kern at gmail.com Wed May 7 17:04:32 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 May 2008 16:04:32 -0500 Subject: [Numpy-discussion] Is a string a scalar? In-Reply-To: References: <3d375d730805071337p501dcc58t186741d460b8217c@mail.gmail.com> Message-ID: <3d375d730805071404o5eb25808w3f5213cbfc848420@mail.gmail.com> On Wed, May 7, 2008 at 3:58 PM, Keith Goodman wrote: > On Wed, May 7, 2008 at 1:37 PM, Robert Kern wrote: > > On Wed, May 7, 2008 at 3:24 PM, Keith Goodman wrote: > > > >> np.isscalar('string') > > > True > > > > Either option would cause someone to complain. 
It's not a > > cut-and-dry issue. However, since strings can be atomic elements > > through the various '|S' dtypes, and we already have rules to > > special-case strings as atomic, "numpy.isscalar('string') == False" > > would be more inconsistent. > > BTW, I noticed that defmatrix.py uses isscalar (from numeric import > isscalar) and N.isscalar (import numpric as N). Each is used only one > time. It confused me a little at first. But that's not saying much. Different authors at different times. It should be cleaned up. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wright at esrf.fr Wed May 7 17:08:07 2008 From: wright at esrf.fr (Jonathan Wright) Date: Wed, 07 May 2008 23:08:07 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> Message-ID: <48221A37.80507@esrf.fr> Anne Archibald wrote: > 2008/5/7 Pierre GM : > >> All, >> Yes, there is a problem with ma.power: masking negative data should be >> restricted to the case of an exponent between -1. and 1. only, don't you >> think ? >> > > No, there's a problem with any fractional exponent (with even > denominator): x**(3/2) == (x**3)**(1/2). And of course in a > floating-point world, you can't really ask whether the denominator is > even or not. So any non-integer power is trouble. > > The draconian approach would be to simply disallow negative numbers to > be raised to exponents of type float, but that's going to annoy a lot > of people who have integers which happen to be represented in floats. > (Representing integers in floats is exact up to some fairly large > value.) So the first question is "how do we recognize floats with > integer values?". Unfortunately that's not the real problem. The real > problem is "how do we predict when power() is going to produce a NaN?" > > Anne > Is there a rule against squaring away the negatives? def not_your_normal_pow( x, y ): return exp( log( power( x, 2) ) * y / 2 ) Which still needs some work for x==0. Jon From peridot.faceted at gmail.com Wed May 7 17:22:19 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 23:22:19 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <48221A37.80507@esrf.fr> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> <48221A37.80507@esrf.fr> Message-ID: 2008/5/7 Jonathan Wright : > Is there a rule against squaring away the negatives? > > def not_your_normal_pow( x, y ): return exp( log( power( x, 2) ) * y / 2 ) > > Which still needs some work for x==0. Well, it means (-1.)**(3.) becomes 1., which is probably not what the user expected... Anne From peridot.faceted at gmail.com Wed May 7 17:23:53 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 23:23:53 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <200805072055.30250.pgmdevlist@gmail.com> References: <4821E855.1060909@llnl.gov> <200805071912.19427.pgmdevlist@gmail.com> <200805072055.30250.pgmdevlist@gmail.com> Message-ID: 2008/5/7 Pierre GM : > On Wednesday 07 May 2008 20:38:22 Anne Archibald wrote: > > 2008/5/7 Pierre GM : > > > All, > > > Yes, there is a problem with ma.power: masking negative data should be > > > restricted to the case of an exponent between -1. and 1. 
only, don't you > > > think ? > > > > No, there's a problem with any fractional exponent (with even > > denominator): x**(3/2) == (x**3)**(1/2). > > Argh. Good point. > > > > The real > > problem is "how do we predict when power() is going to produce a NaN?" > An alternative would be to forget about it: let power() output NaNs, and fix > them afterwards with fix_invalid. Tempting, but the user may have used seterr() to arrange that exceptions are raised when this happens, which is going to put a spanner in the works. (And temporarily changing seterr() is problematic in a multithreaded context...) Anne From charlesr.harris at gmail.com Wed May 7 17:36:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 May 2008 15:36:11 -0600 Subject: [Numpy-discussion] no longer receiving numpy-tickets messages In-Reply-To: <3d375d730805071354t2fc7d4c2gd8fa572efb263078@mail.gmail.com> References: <48221655.7010505@stsci.edu> <3d375d730805071354t2fc7d4c2gd8fa572efb263078@mail.gmail.com> Message-ID: On Wed, May 7, 2008 at 2:54 PM, Robert Kern wrote: > On Wed, May 7, 2008 at 3:51 PM, Christopher Hanley > wrote: > > Hi, > > > > I've noticed that I am no longer receiving message from the > numpy-ticket > > distribution list. This includes messages for tickets I have submitted > > in addition to tickets created by others. > > I'll pass this along. I haven't gotten any since May 3, either. > I thought the list was just slow ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From thrabe at burnham.org Wed May 7 18:13:47 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Wed, 7 May 2008 15:13:47 -0700 Subject: [Numpy-discussion] Embedding Python - data exchange && no Boost Message-ID: Hi all, I am writing an embedded python application in C/C++ with the following features: 1. The user is able to execute python commands (works fine with PyRun_SimpleString) 2. I want the user to be able to load C objects from C into the current python interpreter, so that the user is able to manipulate the python copies with python tools -> no go. I first wrote a function within a module which is loaded after Py_Initialize(), returning an PyObject. I thought it might be possible to share a static variable between the module and the C application, where the module function could access the static variable, convert it to a PyObject and provide it to the interpreter. Think of this procedure in terms of python calls: import PA; //imports the module in the embedded interpreter a = PA.set() ; //access the static object, create a PyObject, return it as a The later python calls should be able to manipulate a and then return it back to C by setting the static object again like PA.get(a) ; //or similar However, it turned out that the static object in the interpreter differs from the one in the C program -> other adresses in memory, so that I can not share the memory space. Does anybody know of a solution for such a problem? Any tips? I do not use Boost because the program is supposed to process numpy's nd-arrays and, furthermore, must remain independent of additional libraries such as Boost. I'd use it otherwise... Thank you in advance for your help, Thomas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peridot.faceted at gmail.com Wed May 7 19:31:30 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 19:31:30 -0400 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: 2008/5/7 Charles R Harris : > On Wed, May 7, 2008 at 11:44 AM, Anne Archibald > wrote: > > 2008/5/7 Jarrod Millman : > > > > > I have just created the 1.1.x branch: > > > http://projects.scipy.org/scipy/numpy/changeset/5134 > > > In about 24 hours I will tag the 1.1.0 release from the branch. At > > > this point only critical bug fixes should be applied to the branch. > > > The trunk is now open for 1.2 development. > > > > I committed Travis' matrix indexing patch (plus a basic test) to > > 1.1.x. Hope that's okay. (All tests pass.) > > > > You should put it in the 1.2 branch also, then. I think fancy indexing and > tolist, which Travis mentioned as having matrix workarounds, need to be > checked to see if they still work correctly for matrices with the patch. Ah. Good point. I did find a bug - x[:,0] doesn't do what you'd expect. Best not release without either backing out my change. I'm still trying to track down what's up. Guess our test suite leaves something to be desired. Anne From charlesr.harris at gmail.com Wed May 7 20:14:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 May 2008 18:14:22 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: On Wed, May 7, 2008 at 5:31 PM, Anne Archibald wrote: > 2008/5/7 Charles R Harris : > > > On Wed, May 7, 2008 at 11:44 AM, Anne Archibald < > peridot.faceted at gmail.com> > > wrote: > > > 2008/5/7 Jarrod Millman : > > > > > > > I have just created the 1.1.x branch: > > > > http://projects.scipy.org/scipy/numpy/changeset/5134 > > > > In about 24 hours I will tag the 1.1.0 release from the branch. At > > > > this point only critical bug fixes should be applied to the branch. > > > > The trunk is now open for 1.2 development. > > > > > > I committed Travis' matrix indexing patch (plus a basic test) to > > > 1.1.x. Hope that's okay. (All tests pass.) > > > > > > > You should put it in the 1.2 branch also, then. I think fancy indexing > and > > tolist, which Travis mentioned as having matrix workarounds, need to be > > checked to see if they still work correctly for matrices with the patch. > > Ah. Good point. I did find a bug - x[:,0] doesn't do what you'd > expect. Best not release without either backing out my change. I'm > still trying to track down what's up. Returns a column matrix here, using 1.2.0.dev5143 after Travis's commits. What should it do? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed May 7 20:17:45 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 20:17:45 -0400 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: 2008/5/7 Anne Archibald : > 2008/5/7 Charles R Harris : > > > On Wed, May 7, 2008 at 11:44 AM, Anne Archibald > > wrote: > > > 2008/5/7 Jarrod Millman : > > > > > > > I have just created the 1.1.x branch: > > > > http://projects.scipy.org/scipy/numpy/changeset/5134 > > > > In about 24 hours I will tag the 1.1.0 release from the branch. At > > > > this point only critical bug fixes should be applied to the branch. > > > > The trunk is now open for 1.2 development. > > > > > > I committed Travis' matrix indexing patch (plus a basic test) to > > > 1.1.x. 
Hope that's okay. (All tests pass.) > > > > > > > You should put it in the 1.2 branch also, then. I think fancy indexing and > > tolist, which Travis mentioned as having matrix workarounds, need to be > > checked to see if they still work correctly for matrices with the patch. > > Ah. Good point. I did find a bug - x[:,0] doesn't do what you'd > expect. Best not release without either backing out my change. I'm > still trying to track down what's up. Okay, more tests committed. Travis has removed the .tolist() workaround (which, it turns out, doesn't actually depend on __getitem__). Not sure what needs to happen for fancy indexing - there's some alarming jiggery-pokery involving self._getitem that I don't understand. But these are not bugfixes anyway. Anne From peridot.faceted at gmail.com Wed May 7 20:19:47 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 20:19:47 -0400 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: 2008/5/7 Charles R Harris : > > On Wed, May 7, 2008 at 5:31 PM, Anne Archibald > wrote: > > Ah. Good point. I did find a bug - x[:,0] doesn't do what you'd > > expect. Best not release without either backing out my change. I'm > > still trying to track down what's up. > > Returns a column matrix here, using 1.2.0.dev5143 after Travis's commits. > What should it do? Oh, uh, my bad. I messed up my test code, and panicked thinking Jarrod was about to release a borken 1.1.0 and it was going to be All My Fault. That said, the only explicit test of .tolist() that I could find is the one I just wrote... Anne From charlesr.harris at gmail.com Wed May 7 20:23:26 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 May 2008 18:23:26 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: On Wed, May 7, 2008 at 6:17 PM, Anne Archibald wrote: > 2008/5/7 Anne Archibald : > > 2008/5/7 Charles R Harris : > > > > > On Wed, May 7, 2008 at 11:44 AM, Anne Archibald < > peridot.faceted at gmail.com> > > > wrote: > > > > 2008/5/7 Jarrod Millman : > > > > > > > > > I have just created the 1.1.x branch: > > > > > http://projects.scipy.org/scipy/numpy/changeset/5134 > > > > > In about 24 hours I will tag the 1.1.0 release from the branch. > At > > > > > this point only critical bug fixes should be applied to the > branch. > > > > > The trunk is now open for 1.2 development. > > > > > > > > I committed Travis' matrix indexing patch (plus a basic test) to > > > > 1.1.x. Hope that's okay. (All tests pass.) > > > > > > > > > > You should put it in the 1.2 branch also, then. I think fancy > indexing and > > > tolist, which Travis mentioned as having matrix workarounds, need to > be > > > checked to see if they still work correctly for matrices with the > patch. > > > > Ah. Good point. I did find a bug - x[:,0] doesn't do what you'd > > expect. Best not release without either backing out my change. I'm > > still trying to track down what's up. > > Okay, more tests committed. Travis has removed the .tolist() > workaround (which, it turns out, doesn't actually depend on > __getitem__). Not sure what needs to happen for fancy indexing - > there's some alarming jiggery-pokery involving self._getitem that I > don't understand. But these are not bugfixes anyway. Heh, I just added some tests to 1.2 before closing ticket #707. They should probably be merged with yours. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peridot.faceted at gmail.com Wed May 7 20:26:11 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 7 May 2008 20:26:11 -0400 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: 2008/5/7 Charles R Harris : > > Heh, I just added some tests to 1.2 before closing ticket #707. They should > probably be merged with yours. Seems a shame: Ran 1000 tests in 5.329s Such a nice round number! Anne From charlesr.harris at gmail.com Wed May 7 20:28:52 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 May 2008 18:28:52 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: On Wed, May 7, 2008 at 6:19 PM, Anne Archibald wrote: > 2008/5/7 Charles R Harris : > > > > On Wed, May 7, 2008 at 5:31 PM, Anne Archibald < > peridot.faceted at gmail.com> > > wrote: > > > Ah. Good point. I did find a bug - x[:,0] doesn't do what you'd > > > expect. Best not release without either backing out my change. I'm > > > still trying to track down what's up. > > > > Returns a column matrix here, using 1.2.0.dev5143 after Travis's > commits. > > What should it do? > > Oh, uh, my bad. I messed up my test code, and panicked thinking Jarrod > was about to release a borken 1.1.0 and it was going to be All My > Fault. > I busted everything not so long ago by committing some borked files. I couldn't find any easy way to revert the depository and fixing things up was a pain. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournapeau at cslab.kecl.ntt.co.jp Wed May 7 22:16:36 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 08 May 2008 11:16:36 +0900 Subject: [Numpy-discussion] malloc failures on 10.5.2 w/ apple python2.5.1 In-Reply-To: <33644d3c0805070630y1d810263w385571a7e0eeb0aa@mail.gmail.com> References: <33644d3c0805070630y1d810263w385571a7e0eeb0aa@mail.gmail.com> Message-ID: <1210212996.17866.1.camel@bbc8> On Wed, 2008-05-07 at 08:30 -0500, James Snyder wrote: > Hi - > > I'm not sure if this is a bug or an error on my part. I've grabbed > the latest subversion truck and did a build and install on 10.5.2, and > I'm getting malloc errors when running the tests. Sorry about the > revision being lopped off the Numpy version, I'm current up to r5133, > but I'm using git-svn :-) Not that I think it is really likely to be the error, but we never know: I have been working on a branch to play with different allocators, but no code has landed into the trunk. Maybe git-svn did not take code only from the trunk ? Could you try again with svn ? cheers, David From charlesr.harris at gmail.com Wed May 7 23:12:35 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 May 2008 21:12:35 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: Hi Jarrod, On Tue, May 6, 2008 at 2:40 AM, Jarrod Millman wrote: > Hey, > > The trunk is in pretty good shape and it is about time that I put out > an official release. So tomorrow (in a little over twelve hours) I am > going to create a 1.1.x branch and the trunk will be officially open > for 1.2 development. If there are no major issues that show up at the > last minute, I will tag 1.1.0 twenty-four hours after I branch. As > soon as I tag the release I will ask the David and Chris to create the > official Windows and Mac binaries. If nothing goes awry, you can > expect the official release announcement by Monday, May 12th. 
> > In order to help me with the final touches, would everyone look over > the release notes one last time: > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > Please let me know if there are any important omissions or errors ASAP. > Scalar indexing of matrices has changed, 1D arrays are now returned instead of matrices. This has to be documented in the release notes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjhnson at gmail.com Thu May 8 03:26:23 2008 From: tjhnson at gmail.com (T J) Date: Thu, 8 May 2008 00:26:23 -0700 Subject: [Numpy-discussion] Log Arrays Message-ID: Hi, For precision reasons, I almost always need to work with arrays whose elements are log values. My thought was that it would be really neat to have a 'logarray' class implemented in C or as a subclass of the standard array class. Here is a sample of how I'd like to work with these objects: >>> x = array([-2,-2,-3], base=2) >>> y = array([-1,-2,-inf], base=2) >>> z = x + y >>> z array([-0.415037499279, -1.0, -3]) >>> z = x * y >>> z array([-3, -4, -inf]) >>> z[:2].sum() -2.41503749928 This would do a lot for the code I write....and some of the numerical stability issues would handled more appropriately. For example, the logsum function is frequently handled as: log(x + y) == log(x) + log(1 + exp(log(y) - log(x)) ) when log(x) > log(y). So the goal of the above function is to return log(x + y) using only the logarithms of x and y. Does this sound like a good idea? From tjhnson at gmail.com Thu May 8 03:30:18 2008 From: tjhnson at gmail.com (T J) Date: Thu, 8 May 2008 00:30:18 -0700 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: Message-ID: On Thu, May 8, 2008 at 12:26 AM, T J wrote: > > >>> x = array([-2,-2,-3], base=2) > >>> y = array([-1,-2,-inf], base=2) > >>> z = x + y > >>> z > array([-0.415037499279, -1.0, -3]) > >>> z = x * y > >>> z > array([-3, -4, -inf]) > >>> z[:2].sum() > -2.41503749928 > Whoops.... s/array/logarray/ From berthe.loic at gmail.com Thu May 8 07:06:17 2008 From: berthe.loic at gmail.com (LB) Date: Thu, 8 May 2008 04:06:17 -0700 (PDT) Subject: [Numpy-discussion] First steps with f2py and first problems... Message-ID: <6261e86c-96c0-430b-b046-b92716ff9a77@m45g2000hsb.googlegroups.com> Hi, I've tried to follow the example given at : http://www.scipy.org/Cookbook/Theoretical_Ecology/Hastings_and_Powell but I've got errors when compiling the fortran file : ---------------------------------errors -------------------------------------------------- 12:53 loic:~ % f2py -c -m hastings hastings.f90 --fcompiler=gnu95 running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands -- compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands -- fcompiler options running build_src building extension "hastings" sources f2py options: [] f2py:> /tmp/tmpDRL9Gh/src.linux-i686-2.5/hastingsmodule.c creating /tmp/tmpDRL9Gh creating /tmp/tmpDRL9Gh/src.linux-i686-2.5 Reading fortran codes... Reading file 'hastings.f90' (format:free) Post-processing... Block: hastings Block: model Block: fweb Post-processing (stage 2)... Block: hastings Block: unknown_interface Block: model Block: fweb Building modules... Building module "hastings"... Constructing F90 module support for "model"... Variables: a1 a2 b1 b2 d2 d1 Constructing wrapper function "model.fweb"... 
yprime = fweb(y,t) Wrote C/API module "hastings" to file "/tmp/tmpDRL9Gh/src.linux- i686-2.5/hastingsmodule.c" Traceback (most recent call last): File "/usr/bin/f2py", line 26, in main() File "/usr/lib/python2.5/site-packages/numpy/f2py/f2py2e.py", line 558, in main run_compile() File "/usr/lib/python2.5/site-packages/numpy/f2py/f2py2e.py", line 545, in run_compile setup(ext_modules = [ext]) File "/usr/lib/python2.5/site-packages/numpy/distutils/core.py", line 176, in setup return old_setup(**new_attr) File "/usr/lib/python2.5/distutils/core.py", line 151, in setup dist.run_commands() File "/usr/lib/python2.5/distutils/dist.py", line 974, in run_commands self.run_command(cmd) File "/usr/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/usr/lib/python2.5/distutils/command/build.py", line 113, in run self.run_command(cmd_name) File "/usr/lib/python2.5/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/usr/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/usr/lib/python2.5/site-packages/numpy/distutils/command/ build_src.py", line 130, in run self.build_sources() File "/usr/lib/python2.5/site-packages/numpy/distutils/command/ build_src.py", line 147, in build_sources self.build_extension_sources(ext) File "/usr/lib/python2.5/site-packages/numpy/distutils/command/ build_src.py", line 256, in build_extension_sources sources = self.f2py_sources(sources, ext) File "/usr/lib/python2.5/site-packages/numpy/distutils/command/ build_src.py", line 513, in f2py_sources ['-m',ext_name]+f_sources) File "/usr/lib/python2.5/site-packages/numpy/f2py/f2py2e.py", line 367, in run_main ret=buildmodules(postlist) File "/usr/lib/python2.5/site-packages/numpy/f2py/f2py2e.py", line 319, in buildmodules dict_append(ret[mnames[i]],rules.buildmodule(modules[i],um)) File "/usr/lib/python2.5/site-packages/numpy/f2py/rules.py", line 1222, in buildmodule for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): TypeError: cannot concatenate 'str' and 'list' objects zsh: exit 1 f2py -c -m hastings hastings.f90 --fcompiler=gnu95 --------------------------------- configuration------------------------------------- I'm using debian testing, and I got the following information at the bottom of `f2py -h` : Version: 2_4422 numpy Version: 1.0.4 Requires: Python 2.3 or higher. License: NumPy license (see LICENSE.txt in the NumPy source code) Have you got any clue to solve this pb ? -- LB From pearu at cens.ioc.ee Thu May 8 07:20:06 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 8 May 2008 14:20:06 +0300 (EEST) Subject: [Numpy-discussion] First steps with f2py and first problems... In-Reply-To: <6261e86c-96c0-430b-b046-b92716ff9a77@m45g2000hsb.googlegroups.com> References: <6261e86c-96c0-430b-b046-b92716ff9a77@m45g2000hsb.googlegroups.com> Message-ID: <59720.129.240.176.119.1210245606.squirrel@cens.ioc.ee> On Thu, May 8, 2008 2:06 pm, LB wrote: > Hi, > > I've tried to follow the example given at : > http://www.scipy.org/Cookbook/Theoretical_Ecology/Hastings_and_Powell > but I've got errors when compiling the fortran file : > > 12:53 loic:~ % f2py -c -m hastings hastings.f90 --fcompiler=gnu95 ... > File "/usr/lib/python2.5/site-packages/numpy/f2py/rules.py", line > 1222, in buildmodule > for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): > TypeError: cannot concatenate 'str' and 'list' objects > zsh: exit 1 f2py -c -m hastings hastings.f90 --fcompiler=gnu95 ... > Have you got any clue to solve this pb ? 
This issue is fixed in SVN. So, either use numpy from svn, or wait a bit until numpy 1.0.5 is released, or change the line #1222 in numpy/f2py/rules.py to for l in ('\n\n'.join(funcwrappers2)+'\n').split('\n'): HTH, Pearu From charlesr.harris at gmail.com Thu May 8 09:20:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 07:20:19 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: Message-ID: On Thu, May 8, 2008 at 1:26 AM, T J wrote: > Hi, > > For precision reasons, I almost always need to work with arrays whose > elements are log values. My thought was that it would be really neat > to have a 'logarray' class implemented in C or as a subclass of the > standard array class. Here is a sample of how I'd like to work with > these objects: > Floating point numbers are essentially logs to base 2, i.e., integer exponent and mantissa between 1 and 2. What does using the log buy you? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 8 09:32:36 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 07:32:36 -0600 Subject: [Numpy-discussion] Problems with Trac Message-ID: There seem to be several problems with the Trac system. 1) Submitting/modifying tickets yields 500 internal server error, even though the changes are made. This leads to duplicate tickets. 2) Mail isn't sent to the numpy-ticket mailing list. 3) It still always takes two tries to log in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu May 8 11:20:20 2008 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2008 00:20:20 +0900 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: Message-ID: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> On Thu, May 8, 2008 at 10:20 PM, Charles R Harris wrote: > > > Floating point numbers are essentially logs to base 2, i.e., integer > exponent and mantissa between 1 and 2. What does using the log buy you? Precision, of course. I am not sure I understand the notation base = 2, but doing computation in the so called log-domain is a must in many statistical computations. In particular, in machine learning with large datasets, it is common to have some points whose pdf is extremely small, and well below the precision of double. Typically, internally, the computation of my EM toolbox are done in the log domain, and use the logsumexp trick to compute likelihood given some data: a = np.array([-1000., -1001.]) np.log(np.sum(np.exp(a))) -> -inf -1000 + np.log(np.sum([1 + np.exp(-1)])) -> correct result Where you use log(exp(x) + exp(y)) = x + log(1 + exp(y-x)). It is useful when x and y are in the same range bu far from 0, which happens a lot practically in many machine learning algorithms (EM, SVM, etc... everywhere you need to compute likelihood of densities from the exponential family, which covers most practical cases of parametric estimation) From teoliphant at gmail.com Thu May 8 11:30:36 2008 From: teoliphant at gmail.com (Travis Oliphant) Date: Thu, 08 May 2008 10:30:36 -0500 Subject: [Numpy-discussion] Test Message-ID: <48231C9C.1060703@gmail.com> This is a test to see if the list is working for me. 
-teo From charlesr.harris at gmail.com Thu May 8 12:04:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 10:04:28 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 9:20 AM, David Cournapeau wrote: > On Thu, May 8, 2008 at 10:20 PM, Charles R Harris > wrote: > > > > > > Floating point numbers are essentially logs to base 2, i.e., integer > > exponent and mantissa between 1 and 2. What does using the log buy you? > > Precision, of course. I am not sure I understand the notation base = > 2, lg(x) = ln(x)/ln(2) > but doing computation in the so called log-domain is a must in many > statistical computations. In particular, in machine learning with > large datasets, it is common to have some points whose pdf is > extremely small, and well below the precision of double. < 1e-308 ? > Typically, > internally, the computation of my EM toolbox are done in the log > domain, and use the logsumexp trick to compute likelihood given some > data: > Yes, logs can be useful there, but I still fail to see any precision advantage. As I say, to all intents and purposes, IEEE floating point *is* a logarithm. You will see that if you look at how log is implemented in hardware. I'm less sure of the C floating point library because it needs to be portable. > > a = np.array([-1000., -1001.]) > np.log(np.sum(np.exp(a))) -> -inf > -1000 + np.log(np.sum([1 + np.exp(-1)])) -> correct result > What realistic probability is in the range exp(-1000) ? > > Where you use log(exp(x) + exp(y)) = x + log(1 + exp(y-x)). It is > useful when x and y are in the same range bu far from 0, which happens > a lot practically in many machine learning algorithms (EM, SVM, etc... > everywhere you need to compute likelihood of densities from the > exponential family, which covers most practical cases of parametric > estimation) > If you have a hammer... It's portable, but there are wasted cpu cycles in there. If speed was important, I suspect you could do better writing a low level function that assumed IEEE doubles and twiddled the bits. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Thu May 8 12:08:32 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 12:08:32 -0400 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: 2008/5/8 David Cournapeau : > On Thu, May 8, 2008 at 10:20 PM, Charles R Harris > wrote: > > > > > > Floating point numbers are essentially logs to base 2, i.e., integer > > exponent and mantissa between 1 and 2. What does using the log buy you? > > Precision, of course. I am not sure I understand the notation base = > 2, but doing computation in the so called log-domain is a must in many > statistical computations. In particular, in machine learning with > large datasets, it is common to have some points whose pdf is > extremely small, and well below the precision of double. Typically, > internally, the computation of my EM toolbox are done in the log > domain, and use the logsumexp trick to compute likelihood given some > data: I'm not sure I'd describe this as precision, exactly; it's an issue of numerical range. 
But yes, I've come across this while doing maximum-likelihood fitting, and a coworker ran into it doing Bayesian statistics. It definitely comes up. Is "logarray" really the way to handle it, though? it seems like you could probably get away with providing a logsum ufunc that did the right thing. I mean, what operations does one want to do on logarrays? add -> logsum subtract -> ? multiply -> add mean -> logsum/N median -> median exponentiate to recover normal-space values -> exp str -> ? I suppose numerical integration is also valuable, so it would help to have a numerical integrator that was reasonably smart about working with logs. (Though really it's just a question of rescaling and exponentiating, I think.) A "logarray" type would help by keeping track of the fact that its contents were in log space, and would make expressions a little less cumbersome, I guess. How much effort would it take to write it so that it got all the corner cases right? Anne From peridot.faceted at gmail.com Thu May 8 12:11:42 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 12:11:42 -0400 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: 2008/5/8 Charles R Harris : > > What realistic probability is in the range exp(-1000) ? Well, I ran into it while doing a maximum-likelihood fit - my early guesses had exceedingly low probabilities, but I needed to know which way the probabilities were increasing. Anne From charlesr.harris at gmail.com Thu May 8 12:25:37 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 10:25:37 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 10:11 AM, Anne Archibald wrote: > 2008/5/8 Charles R Harris : > > > > What realistic probability is in the range exp(-1000) ? > > Well, I ran into it while doing a maximum-likelihood fit - my early > guesses had exceedingly low probabilities, but I needed to know which > way the probabilities were increasing. > The number of bosons in the universe is only on the order of 1e-42. Exp(-1000) may be convenient, but as a probability it is a delusion. The hypothesis "none of the above" would have a much larger prior. But to expand on David's computation... If the numbers are stored without using logs, i.e., as the exponentials, then the sum is of the form: x_1*2**y_1 + ... + x_i*2**y_i Where 1<= x_j < 2 and both x_i and y_i are available. When the numbers are all of the same general magnitude you get essentially the same result as David's formula by simply by dividing out the first value. Chuck Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu May 8 12:42:50 2008 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2008 01:42:50 +0900 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: <5b8d13220805080942q78bee416qc77e1b5632bb4abb@mail.gmail.com> On Fri, May 9, 2008 at 1:04 AM, Charles R Harris wrote: > < 1e-308 ? Yes, all the time. I mean, if it was not, why people would bother with long double and co ? Why denormal would exist ? I don't consider the comparison with the number of particules to be really relevant here. We are talking about implementation problems. 
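For reference, a minimal sketch of the logsumexp computation being discussed: factor out the largest term so the remaining exponentials stay in range. Plain NumPy, written for clarity rather than speed; the function name and the axis handling are illustrative, not an existing NumPy routine.

import numpy as np

def logsumexp(a, axis=None):
    # log(sum(exp(a))) computed without leaving the log domain:
    # subtract the largest element before exponentiating.
    a = np.asarray(a, dtype=float)
    m = a.max(axis=axis)
    if axis is None:
        shifted = a - m
    else:
        shifted = a - np.expand_dims(m, axis)
    return m + np.log(np.sum(np.exp(shifted), axis=axis))

a = np.array([-1000., -1001.])
print np.log(np.sum(np.exp(a)))   # -inf: exp() underflows to 0
print logsumexp(a)                # approx -999.687, the correct value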
> > Yes, logs can be useful there, but I still fail to see any precision > advantage. As I say, to all intents and purposes, IEEE floating point *is* a > logarithm. You will see that if you look at how log is implemented in > hardware. I'm less sure of the C floating point library because it needs to > be portable. > >> >> a = np.array([-1000., -1001.]) >> np.log(np.sum(np.exp(a))) -> -inf >> -1000 + np.log(np.sum([1 + np.exp(-1)])) -> correct result > > What realistic probability is in the range exp(-1000) ? Realistic as significant, none of course. Realistic as it happens in computation ? Certainly. Typically, when you do clustering, and you have distant clusters, when you compute the probabilities of the points from one cluster relatively to another one, you can quickly get several units from the mean. Adds a small variance, and you can quickly get (x-mu)**2/sigma**2 around 1000. You cannot just clip to 0, specially in online settings. > > If you have a hammer... It's portable, but there are wasted cpu cycles in > there. If speed was important, I suspect you could do better writing a low > level function that assumed IEEE doubles and twiddled the bits. When you call a function in python, you waste thousand cycles at every call. Yet, you use python, and not assembly :) The above procedure is extremely standard, and used in all robust implementations of machine learning algorithms I am aware of, it is implemented in HTK, a widely used toolkit for HMM for speech recognition for example. Twiddling bits is all fun, but it takes time and is extremely error prone. Also, I don't see what kind of method you have in mind here, exactly: how would you do a logsumexp algorithm with bit twiddling ? cheers, David From cournape at gmail.com Thu May 8 12:52:19 2008 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2008 01:52:19 +0900 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: <5b8d13220805080952k24cab959tfea7c382e9116d18@mail.gmail.com> On Fri, May 9, 2008 at 1:25 AM, Charles R Harris wrote: > > > But to expand on David's computation... If the numbers are stored without > using logs, i.e., as the exponentials, then the sum is of the form: > > x_1*2**y_1 + ... + x_i*2**y_i You missed the part on parametric models: in parametric settings, your x_i are often exponential, so it makes sense to compute in the log domain (you don't compute more log/exp than the naive implementation). Of course, if we were not interested in log x_i in the first place, the thing would not have made any sense. David From charlesr.harris at gmail.com Thu May 8 12:54:45 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 10:54:45 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <5b8d13220805080942q78bee416qc77e1b5632bb4abb@mail.gmail.com> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <5b8d13220805080942q78bee416qc77e1b5632bb4abb@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 10:42 AM, David Cournapeau wrote: > On Fri, May 9, 2008 at 1:04 AM, Charles R Harris > wrote: > > > < 1e-308 ? > > Yes, all the time. I mean, if it was not, why people would bother with > long double and co ? Why denormal would exist ? I don't consider the > comparison with the number of particules to be really relevant here. > We are talking about implementation problems. > > > > > Yes, logs can be useful there, but I still fail to see any precision > > advantage. 
As I say, to all intents and purposes, IEEE floating point > *is* a > > logarithm. You will see that if you look at how log is implemented in > > hardware. I'm less sure of the C floating point library because it needs > to > > be portable. > > > >> > >> a = np.array([-1000., -1001.]) > >> np.log(np.sum(np.exp(a))) -> -inf > >> -1000 + np.log(np.sum([1 + np.exp(-1)])) -> correct result > > > > What realistic probability is in the range exp(-1000) ? > > Realistic as significant, none of course. Realistic as it happens in > computation ? Certainly. Typically, when you do clustering, and you > have distant clusters, when you compute the probabilities of the > points from one cluster relatively to another one, you can quickly get > several units from the mean. Adds a small variance, and you can > quickly get (x-mu)**2/sigma**2 around 1000. > Yes, and Gaussians are a delusion beyond a few sigma. One of my pet peeves. If you have more than 8 standard deviations, then something is fundamentally wrong in the concept and formulation. It is more likely that a particle sized blackhole has whacked out some component of the experiment. > You cannot just clip to 0, specially in online settings. > > > > > If you have a hammer... It's portable, but there are wasted cpu cycles in > > there. If speed was important, I suspect you could do better writing a > low > > level function that assumed IEEE doubles and twiddled the bits. > > When you call a function in python, you waste thousand cycles at every > call. Yet, you use python, and not assembly :) The above procedure is > extremely standard, and used in all robust implementations of machine > learning algorithms I am aware of, it is implemented in HTK, a widely > used toolkit for HMM for speech recognition for example. > > Twiddling bits is all fun, but it takes time and is extremely error > prone. Also, I don't see what kind of method you have in mind here, > exactly: how would you do a logsumexp algorithm with bit twiddling ? > You are complaining of inadequate range, but that is what scale factors are for. Why compute exponentials and logs when all you need to do is store an exponent in an integer. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 8 12:56:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 11:56:00 -0500 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> On Thu, May 8, 2008 at 11:25 AM, Charles R Harris wrote: > > On Thu, May 8, 2008 at 10:11 AM, Anne Archibald > wrote: >> >> 2008/5/8 Charles R Harris : >> > >> > What realistic probability is in the range exp(-1000) ? >> >> Well, I ran into it while doing a maximum-likelihood fit - my early >> guesses had exceedingly low probabilities, but I needed to know which >> way the probabilities were increasing. > > The number of bosons in the universe is only on the order of 1e-42. > Exp(-1000) may be convenient, but as a probability it is a delusion. The > hypothesis "none of the above" would have a much larger prior. When you're running an optimizer over a PDF, you will be stuck in the region of exp(-1000) for a substantial amount of time before you get to the peak. If you don't use the log representation, you will never get to the peak because all of the gradient information is lost to floating point error. 
You can consult any book on computational statistics for many more examples. This is a long-established best practice in statistics. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu May 8 12:56:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 10:56:42 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <5b8d13220805080952k24cab959tfea7c382e9116d18@mail.gmail.com> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <5b8d13220805080952k24cab959tfea7c382e9116d18@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 10:52 AM, David Cournapeau wrote: > On Fri, May 9, 2008 at 1:25 AM, Charles R Harris > wrote: > > > > > > > But to expand on David's computation... If the numbers are stored without > > using logs, i.e., as the exponentials, then the sum is of the form: > > > > x_1*2**y_1 + ... + x_i*2**y_i > > You missed the part on parametric models: in parametric settings, your > x_i are often exponential, I'm talking IEEE floating point. The x_i are never exponentials. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 8 13:02:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 11:02:19 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 10:56 AM, Robert Kern wrote: > On Thu, May 8, 2008 at 11:25 AM, Charles R Harris > wrote: > > > > On Thu, May 8, 2008 at 10:11 AM, Anne Archibald < > peridot.faceted at gmail.com> > > wrote: > >> > >> 2008/5/8 Charles R Harris : > >> > > >> > What realistic probability is in the range exp(-1000) ? > >> > >> Well, I ran into it while doing a maximum-likelihood fit - my early > >> guesses had exceedingly low probabilities, but I needed to know which > >> way the probabilities were increasing. > > > > The number of bosons in the universe is only on the order of 1e-42. > > Exp(-1000) may be convenient, but as a probability it is a delusion. The > > hypothesis "none of the above" would have a much larger prior. > > When you're running an optimizer over a PDF, you will be stuck in the > region of exp(-1000) for a substantial amount of time before you get > to the peak. If you don't use the log representation, you will never > get to the peak because all of the gradient information is lost to > floating point error. You can consult any book on computational > statistics for many more examples. This is a long-established best > practice in statistics. > But IEEE is already a log representation. You aren't gaining precision, you are gaining more bits in the exponent at the expense of fewer bits in the mantissa, i.e., less precision. As I say, it may be convenient, but if cpu cycles matter, it isn't efficient. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Thu May 8 13:05:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 12:05:48 -0500 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: <3d375d730805081005i5d58d77eseb770f7a59cec4ed@mail.gmail.com> On Thu, May 8, 2008 at 12:02 PM, Charles R Harris wrote: > > On Thu, May 8, 2008 at 10:56 AM, Robert Kern wrote: >> >> On Thu, May 8, 2008 at 11:25 AM, Charles R Harris >> wrote: >> > >> > On Thu, May 8, 2008 at 10:11 AM, Anne Archibald >> > >> > wrote: >> >> >> >> 2008/5/8 Charles R Harris : >> >> > >> >> > What realistic probability is in the range exp(-1000) ? >> >> >> >> Well, I ran into it while doing a maximum-likelihood fit - my early >> >> guesses had exceedingly low probabilities, but I needed to know which >> >> way the probabilities were increasing. >> > >> > The number of bosons in the universe is only on the order of 1e-42. >> > Exp(-1000) may be convenient, but as a probability it is a delusion. The >> > hypothesis "none of the above" would have a much larger prior. >> >> When you're running an optimizer over a PDF, you will be stuck in the >> region of exp(-1000) for a substantial amount of time before you get >> to the peak. If you don't use the log representation, you will never >> get to the peak because all of the gradient information is lost to >> floating point error. You can consult any book on computational >> statistics for many more examples. This is a long-established best >> practice in statistics. > > But IEEE is already a log representation. You aren't gaining precision, you > are gaining more bits in the exponent at the expense of fewer bits in the > mantissa, i.e., less precision. *YES*. As David pointed out, many of these PDFs are in exponential form. Most of the meaningful variation is in the exponent, not the mantissa. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nadavh at visionsense.com Thu May 8 13:06:57 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 8 May 2008 20:06:57 +0300 Subject: [Numpy-discussion] Log Arrays References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> Is the 80 bits float (float96 on IA32, float128 on AMD64) isn't enough? It has a 64 bits mantissa and can represent numbers up to nearly 1E(+-)5000. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Charles R Harris ????: ? 08-???-08 19:25 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] Log Arrays On Thu, May 8, 2008 at 10:11 AM, Anne Archibald wrote: > 2008/5/8 Charles R Harris : > > > > What realistic probability is in the range exp(-1000) ? > > Well, I ran into it while doing a maximum-likelihood fit - my early > guesses had exceedingly low probabilities, but I needed to know which > way the probabilities were increasing. > The number of bosons in the universe is only on the order of 1e-42. Exp(-1000) may be convenient, but as a probability it is a delusion. The hypothesis "none of the above" would have a much larger prior. But to expand on David's computation... If the numbers are stored without using logs, i.e., as the exponentials, then the sum is of the form: x_1*2**y_1 + ... 
+ x_i*2**y_i Where 1<= x_j < 2 and both x_i and y_i are available. When the numbers are all of the same general magnitude you get essentially the same result as David's formula by simply by dividing out the first value. Chuck Chuck -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3794 bytes Desc: not available URL: From cournape at gmail.com Thu May 8 13:10:34 2008 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2008 02:10:34 +0900 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <5b8d13220805080942q78bee416qc77e1b5632bb4abb@mail.gmail.com> Message-ID: <5b8d13220805081010j49bf24f9xbc4f4ca5049ef64f@mail.gmail.com> On Fri, May 9, 2008 at 1:54 AM, Charles R Harris wrote: > Yes, and Gaussians are a delusion beyond a few sigma. One of my pet peeves. > If you have more than 8 standard deviations, then something is fundamentally > wrong in the concept and formulation. If you have a mixture of Gaussian, and the components are not all mostly overlapping, you will get those ranges, and nothing is wrong in the formulation. I mean, it is not like EM algorithms are untested things and totally new. It is used in many different fields, and all its successful implementations use the logsumexp trick. Look at here for the formula involved: http://en.wikipedia.org/wiki/Expectation-maximization_algorithm If you need to compute log (exp (-1000) + exp(-1001)), how would you do ? If you do it the naive way, you have -inf, and it propagates across all your computation quickly. -inf instead of -1000 seems like a precision win to me. of course you are trading precision for range, but when you are out of range for your number representation, the tradeoff is not a loss anymore. It is really like denormal: they are less precise than normal format *for the usual range*, but in the range where denormal are used, they are much more precise; they are actually infinitely more precise, since the normal representation would be 0 :) David From cournape at gmail.com Thu May 8 13:18:15 2008 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2008 02:18:15 +0900 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> Message-ID: <5b8d13220805081018x392b6f16i5e21d51e13f043db@mail.gmail.com> On Fri, May 9, 2008 at 2:06 AM, Nadav Horesh wrote: > Is the 80 bits float (float96 on IA32, float128 on AMD64) isn't enough? It has a 64 bits mantissa and can represent numbers up to nearly 1E(+-)5000. It only make the problem happen later, I think. If you have a GMM with million of samples of high dimension with many clusters, any "linear" representation will fail I think. In a sense, the IEEE format is not adequate for that kind of computation. David From focke at slac.stanford.edu Thu May 8 13:31:40 2008 From: focke at slac.stanford.edu (Warren Focke) Date: Thu, 8 May 2008 10:31:40 -0700 (PDT) Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: On Thu, 8 May 2008, Charles R Harris wrote: > On Thu, May 8, 2008 at 10:11 AM, Anne Archibald > wrote: > >> 2008/5/8 Charles R Harris : >>> >>> What realistic probability is in the range exp(-1000) ? 
>> >> Well, I ran into it while doing a maximum-likelihood fit - my early >> guesses had exceedingly low probabilities, but I needed to know which >> way the probabilities were increasing. >> > > The number of bosons in the universe is only on the order of 1e-42. > Exp(-1000) may be convenient, but as a probability it is a delusion. The > hypothesis "none of the above" would have a much larger prior. You might like to think so. Sadly, not. If you're doing a least-square (or any other maximum-likelihood) fit to 2000 data points, exp(-1000) is the highest probability you can reasonably hope for. For a good fit. Chi-square is -2*ln(P). In the course of doing the fit, you will evaluate many parameter sets which are bad fits, and the probablility will be much lower. This has no real effect on the current discussion, but: The number of bosons in the universe (or any subset thereof) is not well-defined. It's not just a question of not knowing the number; there really is no answer to that question (well, ok, 'mu'). It's like asking which slit the particle went through in a double-slit interference experiment. The question is incorrect. Values <<1 will never be tenable, but I suspect that the minus sign was a typo. The estimates I hear for the number of baryons (protons, atoms) are ~ 1e80. w From peridot.faceted at gmail.com Thu May 8 13:46:43 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 19:46:43 +0200 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: 2008/5/8 Charles R Harris : > > On Thu, May 8, 2008 at 10:56 AM, Robert Kern wrote: > > > > When you're running an optimizer over a PDF, you will be stuck in the > > region of exp(-1000) for a substantial amount of time before you get > > to the peak. If you don't use the log representation, you will never > > get to the peak because all of the gradient information is lost to > > floating point error. You can consult any book on computational > > statistics for many more examples. This is a long-established best > > practice in statistics. > > But IEEE is already a log representation. You aren't gaining precision, you > are gaining more bits in the exponent at the expense of fewer bits in the > mantissa, i.e., less precision. As I say, it may be convenient, but if cpu > cycles matter, it isn't efficient. Efficiency is not the point here. IEEE floats simply cannot represent the difference between exp(-1000) and exp(-1001). This difference can matter in many contexts. For example, suppose I have observed a source in X-rays and I want to fit a blackbody spectrum. I have, say, hundreds of spectral bins, with a number of photons in each. For any temperature T and amplitude A, I can compute the distribution of photons in each bin (Poisson with mean depending on T and A), and obtain the probability of obtaining the observed number of photons in each bin. Even if a blackbody with temperature T and amplitude A is a perfect fit I should expect this number to be very small, since the chance of obtaining *exactly* this sequence of integers is quite small. But when I start the process I need to guess a T and an A and evauate the probability. If my values are far off, the probability will almost certainly be lower than exp(-1000). 
But I need to know whether, if I increase T, this probability will increase or decrease, so I cannot afford to treat it as zero, or throw up my hands and say "This is smaller than one over the number of baryons in the universe! My optimization problem doesn't make any sense!". I could also point out that frequently when one obtains these crazy numbers, one is not working with probabilities at all, but probability densities. A probability density of exp(-1000) means nothing special. Finally, the fact that *you* don't think this is a useful technique doesn't affect the fact that there is a substantial community of users who use it daily and who would like some support for it in scipy. Anne From charlesr.harris at gmail.com Thu May 8 13:53:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 11:53:43 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: <5b8d13220805081018x392b6f16i5e21d51e13f043db@mail.gmail.com> References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> <5b8d13220805081018x392b6f16i5e21d51e13f043db@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 11:18 AM, David Cournapeau wrote: > On Fri, May 9, 2008 at 2:06 AM, Nadav Horesh > wrote: > > Is the 80 bits float (float96 on IA32, float128 on AMD64) isn't enough? > It has a 64 bits mantissa and can represent numbers up to nearly 1E(+-)5000. > > It only make the problem happen later, I think. If you have a GMM with > million of samples of high dimension with many clusters, any "linear" > representation will fail I think. In a sense, the IEEE format is not > adequate for that kind of computation. > David, what you are using is a log(log(x)) representation internally. IEEE is *not* linear, it is logarithmic. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 8 13:57:19 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 12:57:19 -0500 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> <5b8d13220805081018x392b6f16i5e21d51e13f043db@mail.gmail.com> Message-ID: <3d375d730805081057i272c004fnf4f751b33d666ce1@mail.gmail.com> On Thu, May 8, 2008 at 12:53 PM, Charles R Harris wrote: > > On Thu, May 8, 2008 at 11:18 AM, David Cournapeau > wrote: >> >> On Fri, May 9, 2008 at 2:06 AM, Nadav Horesh >> wrote: >> > Is the 80 bits float (float96 on IA32, float128 on AMD64) isn't enough? >> > It has a 64 bits mantissa and can represent numbers up to nearly 1E(+-)5000. >> >> It only make the problem happen later, I think. If you have a GMM with >> million of samples of high dimension with many clusters, any "linear" >> representation will fail I think. In a sense, the IEEE format is not >> adequate for that kind of computation. > > David, what you are using is a log(log(x)) representation internally. IEEE > is *not* linear, it is logarithmic. *YES*. That is precisely the point. I want 53 bits devoted to the "x" part of "exp(-x)". The straight IEEE representation is not logarithmic *enough*. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Thu May 8 14:00:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 12:00:28 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 11:31 AM, Warren Focke wrote: > > > On Thu, 8 May 2008, Charles R Harris wrote: > > > On Thu, May 8, 2008 at 10:11 AM, Anne Archibald < > peridot.faceted at gmail.com> > > wrote: > > > >> 2008/5/8 Charles R Harris : > >>> > >>> What realistic probability is in the range exp(-1000) ? > >> > >> Well, I ran into it while doing a maximum-likelihood fit - my early > >> guesses had exceedingly low probabilities, but I needed to know which > >> way the probabilities were increasing. > >> > > > > The number of bosons in the universe is only on the order of 1e-42. > > Exp(-1000) may be convenient, but as a probability it is a delusion. The > > hypothesis "none of the above" would have a much larger prior. > > You might like to think so. Sadly, not. > > If you're doing a least-square (or any other maximum-likelihood) fit to > 2000 > data points, exp(-1000) is the highest probability you can reasonably hope > for. > For a good fit. Chi-square is -2*ln(P). In the course of doing the fit, > you > will evaluate many parameter sets which are bad fits, and the probablility > will > be much lower. > > This has no real effect on the current discussion, but: > > The number of bosons in the universe (or any subset thereof) is not > well-defined. It's not just a question of not knowing the number; there > really > is no answer to that question (well, ok, 'mu'). It's like asking which > slit the > particle went through in a double-slit interference experiment. The > question is > incorrect. Values <<1 will never be tenable, but I suspect that the minus > sign > was a typo. The estimates I hear for the number of baryons (protons, > atoms) are > ~ 1e80. > Say, mostly photons. Temperature (~2.7 K) determines density, multiply by volume. But I meant baryons and the last number I saw was about 1e42. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu May 8 14:25:20 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 8 May 2008 20:25:20 +0200 Subject: [Numpy-discussion] Changes to matrix class broke scipy Message-ID: <9457e7c80805081125l195f2819y1ab8b7b67709226f@mail.gmail.com> Developers, take note: http://thread.gmane.org/gmane.comp.python.scientific.user/16297 From peridot.faceted at gmail.com Thu May 8 14:39:58 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 20:39:58 +0200 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> <5b8d13220805081018x392b6f16i5e21d51e13f043db@mail.gmail.com> Message-ID: 2008/5/8 Charles R Harris : > > David, what you are using is a log(log(x)) representation internally. IEEE > is *not* linear, it is logarithmic. As Robert Kern says, yes, this is exactly what the OP and all the rest of us want. But it's a strange thing to say that IEEE is logarithmic - "2.3*10**1" is not exactly logarithmic, since the "2.3" is not the logarithm of anything. IEEE floats work the same way, which is important, since it means they can exactly represent integers of moderate size. 
For example, 257 is represented as sign 0, exponent 135, (implied leading 1).00000001b. The exponent is indeed the integer part of the log base 2 of the value, up to some fiddling, but the mantissa is not a logarithm of any kind. Anyway, all this is immaterial. The point is, in spite of the fact that floating-point numbers can represent a very wide range of numbers, there are some important contexts in which this range is not wide enough. One could in principle store an additional power of two in an accompanying integer, but this would be less efficient in terms of space and time, and more cumbersome, when for the usual situations where this is applied, simply taking the logarithm works fine. Anne From charlesr.harris at gmail.com Thu May 8 14:53:08 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 12:53:08 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <710F2847B0018641891D9A216027636029C140@ex3.envision.co.il> <5b8d13220805081018x392b6f16i5e21d51e13f043db@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 12:39 PM, Anne Archibald wrote: > 2008/5/8 Charles R Harris : > > > > David, what you are using is a log(log(x)) representation internally. > IEEE > > is *not* linear, it is logarithmic. > > As Robert Kern says, yes, this is exactly what the OP and all the rest > of us want. > > But it's a strange thing to say that IEEE is logarithmic - "2.3*10**1" > is not exactly logarithmic, since the "2.3" is not the logarithm of > anything. IEEE floats work the same way, which is important, since it > means they can exactly represent integers of moderate size. For > example, 257 is represented as > > sign 0, exponent 135, (implied leading 1).00000001b. > > The exponent is indeed the integer part of the log base 2 of the > value, up to some fiddling, but the mantissa is not a logarithm of any > kind. > First order in the Taylors series. The log computation uses the fact that it has small curvature over the range 1..2 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 8 15:01:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 13:01:07 -0600 Subject: [Numpy-discussion] Changes to matrix class broke scipy In-Reply-To: <9457e7c80805081125l195f2819y1ab8b7b67709226f@mail.gmail.com> References: <9457e7c80805081125l195f2819y1ab8b7b67709226f@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 12:25 PM, St?fan van der Walt wrote: > Developers, take note: > > http://thread.gmane.org/gmane.comp.python.scientific.user/16297 > _______________________________________________ > Yep, not unexpected. I suppose the real question is whether is should have been in 1.1 or 1.2, but if we are making the change we have to bite the bullet sometime. So I can understand the irritation, even anger, up front. Now let's see how hard the needed changes are. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
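A quick way to check the 257 example above is to pull the IEEE single-precision fields apart with the struct module; this is only an illustration of the layout, nothing NumPy-specific.

import struct

bits = struct.unpack('>I', struct.pack('>f', 257.0))[0]
sign     = bits >> 31                # 0
exponent = (bits >> 23) & 0xff       # 135, i.e. a scale of 2**(135 - 127) = 2**8
fraction = bits & 0x7fffff           # 32768 == 2**15: the bit worth 2**-8 is set
print sign, exponent, fraction
# reconstructed value: (1 + fraction / 2.0**23) * 2**(exponent - 127)
#                    = 1.00390625 * 256 = 257.0
# The exponent field is (roughly) the integer part of log2 of the value,
# while the fraction field is a plain binary fraction, not a logarithm.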
URL: From charlesr.harris at gmail.com Thu May 8 15:12:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 13:12:23 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 11:46 AM, Anne Archibald wrote: > 2008/5/8 Charles R Harris : > > > > On Thu, May 8, 2008 at 10:56 AM, Robert Kern > wrote: > > > > > > When you're running an optimizer over a PDF, you will be stuck in the > > > region of exp(-1000) for a substantial amount of time before you get > > > to the peak. If you don't use the log representation, you will never > > > get to the peak because all of the gradient information is lost to > > > floating point error. You can consult any book on computational > > > statistics for many more examples. This is a long-established best > > > practice in statistics. > > > > But IEEE is already a log representation. You aren't gaining precision, > you > > are gaining more bits in the exponent at the expense of fewer bits in the > > mantissa, i.e., less precision. As I say, it may be convenient, but if > cpu > > cycles matter, it isn't efficient. > > Efficiency is not the point here. IEEE floats simply cannot represent > the difference between exp(-1000) and exp(-1001). This difference can > matter in many contexts. > > For example, suppose I have observed a source in X-rays and I want to > fit a blackbody spectrum. I have, say, hundreds of spectral bins, with > a number of photons in each. For any temperature T and amplitude A, I > can compute the distribution of photons in each bin (Poisson with mean > depending on T and A), and obtain the probability of obtaining the > observed number of photons in each bin. Even if a blackbody with > temperature T and amplitude A is a perfect fit I should expect this > number to be very small, since the chance of obtaining *exactly* this > sequence of integers is quite small. But when I start the process I > need to guess a T and an A and evauate the probability. If my values > are far off, the probability will almost certainly be lower than > exp(-1000). But I need to know whether, if I increase T, this > probability will increase or decrease, so I cannot afford to treat it > as zero, or throw up my hands and say "This is smaller than one over > the number of baryons in the universe! My optimization problem doesn't > make any sense!". > You want the covariance of the parameters of the fit, which will be much more reasonable. And if not, then you have outliers. Mostly, folks are looking for the probability of classes of things, in this case the class of curves you are fitting, the probability of any give sequence, which approaches to zero, is a much less interest. So the errors in the parameters of the model matter, the probability of essentially unique sequence much less so. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Thu May 8 15:18:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 14:18:14 -0500 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: <3d375d730805081218q23440cb4l1ec04636cf7939c8@mail.gmail.com> On Thu, May 8, 2008 at 2:12 PM, Charles R Harris wrote: > > On Thu, May 8, 2008 at 11:46 AM, Anne Archibald > wrote: >> >> 2008/5/8 Charles R Harris : >> > >> > On Thu, May 8, 2008 at 10:56 AM, Robert Kern >> > wrote: >> > > >> > > When you're running an optimizer over a PDF, you will be stuck in the >> > > region of exp(-1000) for a substantial amount of time before you get >> > > to the peak. If you don't use the log representation, you will never >> > > get to the peak because all of the gradient information is lost to >> > > floating point error. You can consult any book on computational >> > > statistics for many more examples. This is a long-established best >> > > practice in statistics. >> > >> > But IEEE is already a log representation. You aren't gaining precision, >> > you >> > are gaining more bits in the exponent at the expense of fewer bits in >> > the >> > mantissa, i.e., less precision. As I say, it may be convenient, but if >> > cpu >> > cycles matter, it isn't efficient. >> >> Efficiency is not the point here. IEEE floats simply cannot represent >> the difference between exp(-1000) and exp(-1001). This difference can >> matter in many contexts. >> >> For example, suppose I have observed a source in X-rays and I want to >> fit a blackbody spectrum. I have, say, hundreds of spectral bins, with >> a number of photons in each. For any temperature T and amplitude A, I >> can compute the distribution of photons in each bin (Poisson with mean >> depending on T and A), and obtain the probability of obtaining the >> observed number of photons in each bin. Even if a blackbody with >> temperature T and amplitude A is a perfect fit I should expect this >> number to be very small, since the chance of obtaining *exactly* this >> sequence of integers is quite small. But when I start the process I >> need to guess a T and an A and evauate the probability. If my values >> are far off, the probability will almost certainly be lower than >> exp(-1000). But I need to know whether, if I increase T, this >> probability will increase or decrease, so I cannot afford to treat it >> as zero, or throw up my hands and say "This is smaller than one over >> the number of baryons in the universe! My optimization problem doesn't >> make any sense!". > > You want the covariance of the parameters of the fit, which will be much > more reasonable. And if not, then you have outliers. Mostly, folks are > looking for the probability of classes of things, in this case the class of > curves you are fitting, the probability of any give sequence, which > approaches to zero, is a much less interest. So the errors in the parameters > of the model matter, the probability of essentially unique sequence much > less so. Except that when you are doing practical computations, you will often be computing likelihoods as part of the larger computation. Yes, the final probabilities that you interpret won't be exp(-1000) (and if they are, then 0 is close enough), but the calculations *in between* do require that exp(-1000) and exp(-1001) be distinguishable from each other. 
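A concrete sketch of the kind of intermediate quantity involved: the log-likelihood of Poisson-binned data under a deliberately poor model, computed directly in log space. The probability itself would underflow to 0.0, but its logarithm is an ordinary, perfectly usable double. The function name and the fake data are purely illustrative (assumes NumPy and scipy.special).

import numpy as np
from scipy.special import gammaln

def poisson_loglike(counts, means):
    # sum of log Poisson(counts[i] | means[i]) = sum(n*log(mu) - mu - log(n!))
    counts = np.asarray(counts, dtype=float)
    means = np.asarray(means, dtype=float)
    return np.sum(counts * np.log(means) - means - gammaln(counts + 1))

counts = np.random.poisson(5.0, size=200)            # fake observed spectrum
ll = poisson_loglike(counts, 50.0 * np.ones(200))    # model far from the data
print ll            # something like -7000: fine in log space
print np.exp(ll)    # 0.0: the probability itself is not representable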
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tjhnson at gmail.com Thu May 8 15:55:19 2008 From: tjhnson at gmail.com (T J) Date: Thu, 8 May 2008 12:55:19 -0700 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: On 5/8/08, Anne Archibald wrote: > Is "logarray" really the way to handle it, though? it seems like you > could probably get away with providing a logsum ufunc that did the > right thing. I mean, what operations does one want to do on logarrays? > > add -> logsum > subtract -> ? > multiply -> add > mean -> logsum/N > median -> median > exponentiate to recover normal-space values -> exp > str -> ? > That's about it, as far as my usage goes. Additionally, I would also benefit from: logdot logouter In addition to the elementwise operations, it would be nice to have logsum along an axis logprod along an axis cumlogsum cumlogprod Whether these are through additional ufuncs or through a subclass is not so much of an issue for me---either would be a huge improvement to the current situation. One benefit of a subclass, IMO, is that it maintains the feel of non-log arrays. That is, when I want to multiply to logarrays, I just do x*y, rather than x+y....but I can understand arguments that this might not be desirable. From charlesr.harris at gmail.com Thu May 8 16:11:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 14:11:01 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 11:46 AM, Anne Archibald wrote: > 2008/5/8 Charles R Harris : > > > > On Thu, May 8, 2008 at 10:56 AM, Robert Kern > wrote: > > > > > > When you're running an optimizer over a PDF, you will be stuck in the > > > region of exp(-1000) for a substantial amount of time before you get > > > to the peak. If you don't use the log representation, you will never > > > get to the peak because all of the gradient information is lost to > > > floating point error. You can consult any book on computational > > > statistics for many more examples. This is a long-established best > > > practice in statistics. > > > > But IEEE is already a log representation. You aren't gaining precision, > you > > are gaining more bits in the exponent at the expense of fewer bits in the > > mantissa, i.e., less precision. As I say, it may be convenient, but if > cpu > > cycles matter, it isn't efficient. > > Efficiency is not the point here. IEEE floats simply cannot represent > the difference between exp(-1000) and exp(-1001). This difference can > matter in many contexts. > > For example, suppose I have observed a source in X-rays and I want to > fit a blackbody spectrum. I have, say, hundreds of spectral bins, with > a number of photons in each. For any temperature T and amplitude A, I > can compute the distribution of photons in each bin (Poisson with mean > depending on T and A), and obtain the probability of obtaining the > observed number of photons in each bin. 
Even if a blackbody with > temperature T and amplitude A is a perfect fit I should expect this This is an interesting problem, partly because I was one of the first guys to use synthetic spectra (NO, OH), to fit temperatures to spectral data. But also because it might look like the Poisson matters. But probably not, the curve fitting effectively aggregates data and the central limit theorem kicks in, so that the normal least squares approach will probably work. Also, if the number of photons in a bin is much more that about 10, the computation of the Poisson distribution probably uses a Gaussian, with perhaps a small adjustment for the tail. So I'm curious, did you find any significant difference trying both methods? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Thu May 8 16:14:36 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 22:14:36 +0200 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: 2008/5/8 T J : > On 5/8/08, Anne Archibald wrote: > > Is "logarray" really the way to handle it, though? it seems like you > > could probably get away with providing a logsum ufunc that did the > > right thing. I mean, what operations does one want to do on logarrays? > > > > add -> logsum > > subtract -> ? > > multiply -> add > > mean -> logsum/N > > median -> median > > exponentiate to recover normal-space values -> exp > > str -> ? > > > > That's about it, as far as my usage goes. Additionally, I would also > benefit from: > > logdot > logouter > > In addition to the elementwise operations, it would be nice to have > > logsum along an axis > logprod along an axis > cumlogsum > cumlogprod > > Whether these are through additional ufuncs or through a subclass is > not so much of an issue for me---either would be a huge improvement to > the current situation. One benefit of a subclass, IMO, is that it > maintains the feel of non-log arrays. That is, when I want to > multiply to logarrays, I just do x*y, rather than x+y....but I can > understand arguments that this might not be desirable. Well, how about a two-step process? Let's come up with a nice implementation of each of the above, then we can discuss whether a subclass is warranted (and at that point each operation on the subclass will be very simple to implement). Most of them could be implemented almost for free by providing a ufunc that did logsum(); then logsum along an axis becomes logsum.reduce(), and cumlogsum becomes logsum.accumulate(). logprod is of course just add. logouter is conveniently and efficiently written as logprod.outer. logdot can be written in terms of logsum() and logprod() at the cost of a sizable temporary, but should really have its own implementation. It might be possible to test this by using vectorize() on a single-element version of logsum(). This would be slow but if vectorize() makes a real ufunc object should provide all the usual handy methods (reduce, accumulate, outer, etcetera). Alternatively, since logsum can be written in terms of existing ufuncs, it should be possible to implement a class providing all the ufunc methods by hand without too too much pain. 
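Following the two-step idea above, a rough sketch of the first step: a scalar logsum wrapped into a genuine ufunc. np.vectorize returns a wrapper class rather than a ufunc, but np.frompyfunc does produce a real (object-dtype, hence slow) ufunc, so reduce/accumulate/outer come along for free. Names here are illustrative only.

import numpy as np
from math import log, exp

def _logsum2(x, y):
    # log(exp(x) + exp(y)) for two scalars, done stably
    if x < y:
        x, y = y, x
    return x + log(1.0 + exp(y - x))

logsum = np.frompyfunc(_logsum2, 2, 1)   # a true ufunc, object dtype

a = np.array([-1000., -1001., -1002.], dtype=object)
print logsum.reduce(a)           # "logsum along an axis"
print logsum.accumulate(a)       # "cumlogsum"
print logsum.outer(a, a)         # "logouter" of two log-vectors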
Anne From focke at slac.stanford.edu Thu May 8 16:37:46 2008 From: focke at slac.stanford.edu (Warren Focke) Date: Thu, 8 May 2008 13:37:46 -0700 (PDT) Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: On Thu, 8 May 2008, Charles R Harris wrote: > On Thu, May 8, 2008 at 11:46 AM, Anne Archibald > wrote: > >> 2008/5/8 Charles R Harris : >>> >>> On Thu, May 8, 2008 at 10:56 AM, Robert Kern >> wrote: >>>> >>>> When you're running an optimizer over a PDF, you will be stuck in the >>>> region of exp(-1000) for a substantial amount of time before you get >>>> to the peak. If you don't use the log representation, you will never >>>> get to the peak because all of the gradient information is lost to >>>> floating point error. You can consult any book on computational >>>> statistics for many more examples. This is a long-established best >>>> practice in statistics. >>> >>> But IEEE is already a log representation. You aren't gaining precision, >> you >>> are gaining more bits in the exponent at the expense of fewer bits in the >>> mantissa, i.e., less precision. As I say, it may be convenient, but if >> cpu >>> cycles matter, it isn't efficient. >> >> Efficiency is not the point here. IEEE floats simply cannot represent >> the difference between exp(-1000) and exp(-1001). This difference can >> matter in many contexts. >> >> For example, suppose I have observed a source in X-rays and I want to >> fit a blackbody spectrum. I have, say, hundreds of spectral bins, with >> a number of photons in each. For any temperature T and amplitude A, I >> can compute the distribution of photons in each bin (Poisson with mean >> depending on T and A), and obtain the probability of obtaining the >> observed number of photons in each bin. Even if a blackbody with >> temperature T and amplitude A is a perfect fit I should expect this > > > This is an interesting problem, partly because I was one of the first guys > to use synthetic spectra (NO, OH), to fit temperatures to spectral data. But > also because it might look like the Poisson matters. But probably not, the > curve fitting effectively aggregates data and the central limit theorem > kicks in, so that the normal least squares approach will probably work. > Also, if the number of photons in a bin is much more that about 10, the > computation of the Poisson distribution probably uses a Gaussian, with > perhaps a small adjustment for the tail. So I'm curious, did you find any > significant difference trying both methods? I can't address Anne's results, but I've certainly found it to make a difference in my work, and it's pretty standard in high-energy astronomy. The central limit theorem does not get you out of having to use the right PDF to compare data to model. It does mean that, if you have enough events, the PDF of the fitness function is fairly chi-squarish, so much of the logic developed for least-squares fitting still applies (like for finding confidence intervals on the fitted parameters). Using Poisson statistics instead of a Gaussian approximation is actually pretty easy, see Cash (Astrophysical Journal, Part 1, vol. 228, Mar. 15, 1979, p. 939-947 http://adsabs.harvard.edu/abs/1979ApJ...228..939C) w From peridot.faceted at gmail.com Thu May 8 18:28:02 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 18:28:02 -0400 Subject: [Numpy-discussion] Why are ufunc docstrings useless? 
Message-ID: Hi, I frequently use functions like np.add.reduce and np.add.outer, but their docstrings are totally uninformative. Would it be possible to provide proper docstrings for these ufunc methods? They need not be specific to np.add; just an explanation of what arguments to give (for example) reduce() (presumably it takes an axis= argument? what is the default behaviour?) and what it does would help. Thanks, Anne From robert.kern at gmail.com Thu May 8 18:45:26 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 17:45:26 -0500 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: References: Message-ID: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> On Thu, May 8, 2008 at 5:28 PM, Anne Archibald wrote: > Hi, > > I frequently use functions like np.add.reduce and np.add.outer, but > their docstrings are totally uninformative. Would it be possible to > provide proper docstrings for these ufunc methods? They need not be > specific to np.add; just an explanation of what arguments to give (for > example) reduce() (presumably it takes an axis= argument? what is the > default behaviour?) and what it does would help. Sure. The place to add them would be in the PyMethodDef ufunc_methods array on line 3953 of numpy/core/src/ufuncobject.c. Just extend each of the rows with a char* containing the docstring contents. A literal "string" will suffice. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Thu May 8 18:45:44 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 08 May 2008 18:45:44 -0400 Subject: [Numpy-discussion] online (1-shot) calculation of variance (complex) Message-ID: I saw some links on 1-pass recursive calculation of mean/variance. When I tried the algorithms, it did not seem to give correct results for complex values. Anyone know how to correctly implement this? From robert.kern at gmail.com Thu May 8 18:49:07 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 17:49:07 -0500 Subject: [Numpy-discussion] online (1-shot) calculation of variance (complex) In-Reply-To: References: Message-ID: <3d375d730805081549g2267a481qbe5032f7a5583dca@mail.gmail.com> On Thu, May 8, 2008 at 5:45 PM, Neal Becker wrote: > I saw some links on 1-pass recursive calculation of mean/variance. When I > tried the algorithms, it did not seem to give correct results for complex > values. Anyone know how to correctly implement this? Well, exactly what did you try? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Thu May 8 18:54:42 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 08 May 2008 18:54:42 -0400 Subject: [Numpy-discussion] online (1-shot) calculation of variance (complex) References: <3d375d730805081549g2267a481qbe5032f7a5583dca@mail.gmail.com> Message-ID: Robert Kern wrote: > On Thu, May 8, 2008 at 5:45 PM, Neal Becker wrote: >> I saw some links on 1-pass recursive calculation of mean/variance. When >> I tried the algorithms, it did not seem to give correct results for >> complex >> values. Anyone know how to correctly implement this? > > Well, exactly what did you try? 
> See Algorithm III here: http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance From ndbecker2 at gmail.com Thu May 8 19:05:35 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 08 May 2008 19:05:35 -0400 Subject: [Numpy-discussion] online (1-shot) calculation of variance (complex) References: <3d375d730805081549g2267a481qbe5032f7a5583dca@mail.gmail.com> Message-ID: Robert Kern wrote: > On Thu, May 8, 2008 at 5:45 PM, Neal Becker wrote: >> I saw some links on 1-pass recursive calculation of mean/variance. When >> I tried the algorithms, it did not seem to give correct results for >> complex >> values. Anyone know how to correctly implement this? > > Well, exactly what did you try? > Here's my python translation: It seems to give a variance that just converges to 0 given a vector of gaussian r.v.: class stat2 (object): def __init__(self): self.n = 0 self._mean = 0 self.M2 = 0 def __iadd__(self, x): self.n += 1 delta = x - self._mean self._mean += delta/self.n self.M2 += delta*(x - self._mean) # This expression uses the new value of mean def mean(self): return self._mean def var(self): return self.M2/(self.n - 1) From robert.kern at gmail.com Thu May 8 19:16:03 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 18:16:03 -0500 Subject: [Numpy-discussion] online (1-shot) calculation of variance (complex) In-Reply-To: References: <3d375d730805081549g2267a481qbe5032f7a5583dca@mail.gmail.com> Message-ID: <3d375d730805081616n69465e4cm61279cb2381e5094@mail.gmail.com> On Thu, May 8, 2008 at 6:05 PM, Neal Becker wrote: > Robert Kern wrote: > >> On Thu, May 8, 2008 at 5:45 PM, Neal Becker wrote: >>> I saw some links on 1-pass recursive calculation of mean/variance. When >>> I tried the algorithms, it did not seem to give correct results for >>> complex >>> values. Anyone know how to correctly implement this? >> >> Well, exactly what did you try? >> > Here's my python translation: > It seems to give a variance that just converges to 0 given a vector of > gaussian r.v.: > > class stat2 (object): > def __init__(self): > self.n = 0 > self._mean = 0 > self.M2 = 0 > > def __iadd__(self, x): > self.n += 1 > delta = x - self._mean > self._mean += delta/self.n > self.M2 += delta*(x - self._mean) # This expression uses the > new value of mean You may not be able to convert this one to apply to complex numbers. The recurrence relation may not hold. In the two-pass algorithm for complex numbers, remember that you are summing (x[i] - mean).conj * (x[i] - mean) each of which is real. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From doutriaux1 at llnl.gov Thu May 8 20:38:37 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 08 May 2008 17:38:37 -0700 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: References: Message-ID: <48239D0D.2090706@llnl.gov> I don't think it is reasonable to say the trunk is in good shape when the power function does not work... Just my thoughts... C. Charles R Harris wrote: > Hi Jarrod, > > On Tue, May 6, 2008 at 2:40 AM, Jarrod Millman > wrote: > > Hey, > > The trunk is in pretty good shape and it is about time that I put out > an official release. So tomorrow (in a little over twelve hours) I am > going to create a 1.1.x branch and the trunk will be officially open > for 1.2 development. 
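Returning to the one-pass variance question above: the recurrence does carry over to
complex samples if the squared term is accumulated as conj(delta)*(x - mean), which is
real, along the same lines Robert suggests for the two-pass form. An untested sketch
(class and attribute names are only illustrative; it also returns self from __iadd__
so that "acc += x" keeps rebinding acc to the accumulator):

class cstat2(object):
    def __init__(self):
        self.n = 0
        self._mean = 0.0
        self.M2 = 0.0

    def __iadd__(self, x):
        self.n += 1
        delta = x - self._mean
        self._mean += delta / self.n
        # conj(delta)*(x - new_mean) is real and non-negative, so M2 stays
        # a real running sum of the squared deviations |x_i - mean|**2
        self.M2 += (delta.conjugate() * (x - self._mean)).real
        return self

    def mean(self):
        return self._mean

    def var(self):
        return self.M2 / (self.n - 1)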
If there are no major issues that show up at the > last minute, I will tag 1.1.0 twenty-four hours after I branch. As > soon as I tag the release I will ask the David and Chris to create the > official Windows and Mac binaries. If nothing goes awry, you can > expect the official release announcement by Monday, May 12th. > > In order to help me with the final touches, would everyone look over > the release notes one last time: > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > Please let me know if there are any important omissions or errors > ASAP. > > > Scalar indexing of matrices has changed, 1D arrays are now returned > instead of matrices. This has to be documented in the release notes. > > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Thu May 8 20:56:03 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 18:56:03 -0600 Subject: [Numpy-discussion] Status Report for NumPy 1.1.0 In-Reply-To: <48239D0D.2090706@llnl.gov> References: <48239D0D.2090706@llnl.gov> Message-ID: On Thu, May 8, 2008 at 6:38 PM, Charles Doutriaux wrote: > I don't think it is reasonable to say the trunk is in good shape when > the power function does not work... > > Just my thoughts... > Is that ticket #301? What are you suggesting it do? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 8 22:22:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 20:22:43 -0600 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> <3d375d730805080956g11871630s9651cc13b15bed11@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 2:37 PM, Warren Focke wrote: > > > On Thu, 8 May 2008, Charles R Harris wrote: > > > On Thu, May 8, 2008 at 11:46 AM, Anne Archibald < > peridot.faceted at gmail.com> > > wrote: > > > >> 2008/5/8 Charles R Harris : > >>> > >>> On Thu, May 8, 2008 at 10:56 AM, Robert Kern > >> wrote: > >>>> > >>>> When you're running an optimizer over a PDF, you will be stuck in the > >>>> region of exp(-1000) for a substantial amount of time before you get > >>>> to the peak. If you don't use the log representation, you will never > >>>> get to the peak because all of the gradient information is lost to > >>>> floating point error. You can consult any book on computational > >>>> statistics for many more examples. This is a long-established best > >>>> practice in statistics. > >>> > >>> But IEEE is already a log representation. You aren't gaining precision, > >> you > >>> are gaining more bits in the exponent at the expense of fewer bits in > the > >>> mantissa, i.e., less precision. As I say, it may be convenient, but if > >> cpu > >>> cycles matter, it isn't efficient. > >> > >> Efficiency is not the point here. IEEE floats simply cannot represent > >> the difference between exp(-1000) and exp(-1001). This difference can > >> matter in many contexts. > >> > >> For example, suppose I have observed a source in X-rays and I want to > >> fit a blackbody spectrum. I have, say, hundreds of spectral bins, with > >> a number of photons in each. 
For any temperature T and amplitude A, I > >> can compute the distribution of photons in each bin (Poisson with mean > >> depending on T and A), and obtain the probability of obtaining the > >> observed number of photons in each bin. Even if a blackbody with > >> temperature T and amplitude A is a perfect fit I should expect this > > > > > > This is an interesting problem, partly because I was one of the first > guys > > to use synthetic spectra (NO, OH), to fit temperatures to spectral data. > But > > also because it might look like the Poisson matters. But probably not, > the > > curve fitting effectively aggregates data and the central limit theorem > > kicks in, so that the normal least squares approach will probably work. > > Also, if the number of photons in a bin is much more that about 10, the > > computation of the Poisson distribution probably uses a Gaussian, with > > perhaps a small adjustment for the tail. So I'm curious, did you find any > > significant difference trying both methods? > > I can't address Anne's results, but I've certainly found it to make a > difference > in my work, and it's pretty standard in high-energy astronomy. The central > limit theorem does not get you out of having to use the right PDF to > compare > data to model. It does mean that, if you have enough events, the PDF of > the > fitness function is fairly chi-squarish, so much of the logic developed for > least-squares fitting still applies (like for finding confidence intervals > on > the fitted parameters). Using Poisson statistics instead of a Gaussian > approximation is actually pretty easy, see Cash (Astrophysical Journal, > Part 1, > vol. 228, Mar. 15, 1979, p. 939-947 > http://adsabs.harvard.edu/abs/1979ApJ...228..939C) > Interesting paper. I've also been playing with a 64 bit floating format with a 31 bit offset binary exponent and 33 bit mantissa with a hidden bit, which can hold the probability of any give sequence of 2**30 - 1 coin tosses, or about 1e-323228496. It looks to be pretty efficient for multiplication and compare, and functions like log and exp aren't hard to do. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Thu May 8 22:52:08 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 22:52:08 -0400 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> Message-ID: 2008/5/8 Robert Kern : > On Thu, May 8, 2008 at 5:28 PM, Anne Archibald > wrote: >> Hi, >> >> I frequently use functions like np.add.reduce and np.add.outer, but >> their docstrings are totally uninformative. Would it be possible to >> provide proper docstrings for these ufunc methods? They need not be >> specific to np.add; just an explanation of what arguments to give (for >> example) reduce() (presumably it takes an axis= argument? what is the >> default behaviour?) and what it does would help. > > Sure. The place to add them would be in the PyMethodDef ufunc_methods > array on line 3953 of numpy/core/src/ufuncobject.c. Just extend each > of the rows with a char* containing the docstring contents. A literal > "string" will suffice. Thanks! Done add, reduce, outer, and reduceat. What about __call__? 
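For reference, the behaviour those method docstrings need to describe (reduce takes an
axis argument that defaults to 0) can be illustrated with a few doctest-style lines,
typed from memory and worth verifying in a live session:

>>> import numpy as np
>>> np.add.reduce(np.arange(10))        # same as np.arange(10).sum()
45
>>> a = np.arange(6).reshape(2, 3)
>>> np.add.reduce(a)                    # default axis=0
array([3, 5, 7])
>>> np.add.reduce(a, axis=1)
array([ 3, 12])
>>> np.add.outer(np.arange(3), np.arange(3))
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])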
Anne From robert.kern at gmail.com Thu May 8 23:07:38 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 22:07:38 -0500 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> Message-ID: <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> On Thu, May 8, 2008 at 9:52 PM, Anne Archibald wrote: > Thanks! Done add, reduce, outer, and reduceat. What about __call__? If anyone knows enough to explicitly request a docstring from __call__, they already know what it does. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu May 8 23:12:03 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 21:12:03 -0600 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 9:07 PM, Robert Kern wrote: > On Thu, May 8, 2008 at 9:52 PM, Anne Archibald > wrote: > > > Thanks! Done add, reduce, outer, and reduceat. What about __call__? > > If anyone knows enough to explicitly request a docstring from > __call__, they already know what it does. > > -- > It's easier/better to do this in numpy/add_newdocs.py. For example: In [14]: from numpy.lib import add_newdoc as add In [15]: add('numpy.core','ufunc',('reduce','hello world')) In [16]: ufunc.reduce.__doc__ Out[16]: 'hello world' You don't have to clutter up the c sources and you can use """ """, instead of putting all those newlines in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 8 23:16:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 22:16:18 -0500 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> Message-ID: <3d375d730805082016q45aa1674j9231585adf963843@mail.gmail.com> On Thu, May 8, 2008 at 10:12 PM, Charles R Harris wrote: > It's easier/better to do this in numpy/add_newdocs.py. For example: > > In [14]: from numpy.lib import add_newdoc as add > > In [15]: add('numpy.core','ufunc',('reduce','hello world')) > > In [16]: ufunc.reduce.__doc__ > Out[16]: 'hello world' > > You don't have to clutter up the c sources and you can use """ """, instead > of putting all those newlines in. Ah good. I didn't realize it would handle builtin type methods. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu May 8 23:17:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 21:17:42 -0600 Subject: [Numpy-discussion] Why are ufunc docstrings useless? 
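Building on the add_newdoc example above, a fuller entry in numpy/add_newdocs.py might
read as follows; the docstring text is only a sketch, and the exact keyword list should
be checked against the C source before committing:

from numpy.lib import add_newdoc

add_newdoc('numpy.core', 'ufunc', ('reduce',
    """reduce(array, axis=0, dtype=None, out=None)

    Reduce array's dimension by one by applying the ufunc along the
    given axis; for example, add.reduce(a, axis=0) is equivalent to
    a.sum(axis=0).
    """))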
In-Reply-To: References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 9:12 PM, Charles R Harris wrote: > > > On Thu, May 8, 2008 at 9:07 PM, Robert Kern wrote: > >> On Thu, May 8, 2008 at 9:52 PM, Anne Archibald >> wrote: >> >> > Thanks! Done add, reduce, outer, and reduceat. What about __call__? >> >> If anyone knows enough to explicitly request a docstring from >> __call__, they already know what it does. >> >> -- >> > > It's easier/better to do this in numpy/add_newdocs.py. For example: > > In [14]: from numpy.lib import add_newdoc as add > > In [15]: add('numpy.core','ufunc',('reduce','hello world')) > > In [16]: ufunc.reduce.__doc__ > Out[16]: 'hello world' > > You don't have to clutter up the c sources and you can use """ """, instead > of putting all those newlines in. > Also, not all c compilers will join strings on separate lines. You need to add explicit continuation backslashes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Thu May 8 23:39:08 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 8 May 2008 23:39:08 -0400 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> Message-ID: 2008/5/8 Robert Kern : > On Thu, May 8, 2008 at 9:52 PM, Anne Archibald > wrote: > >> Thanks! Done add, reduce, outer, and reduceat. What about __call__? > > If anyone knows enough to explicitly request a docstring from > __call__, they already know what it does. How exactly are they to find out? It does take additional arguments, for example dtype and out - I think. Also, help(np.add) displays the the object, its docstring, its methods, and all their docstrings. So it provides a way to get a docstring out of __call__ without having to know what it is. Anne From strawman at astraw.com Thu May 8 23:51:21 2008 From: strawman at astraw.com (Andrew Straw) Date: Thu, 08 May 2008 20:51:21 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache Message-ID: <4823CA39.7090203@astraw.com> I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times. Is there an implementation of such an algorithm that works easilty with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm? Thanks, Andrew From robert.kern at gmail.com Thu May 8 23:51:51 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 22:51:51 -0500 Subject: [Numpy-discussion] Why are ufunc docstrings useless? 
In-Reply-To: References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> Message-ID: <3d375d730805082051y3203eb7ey9a32ccec7efb9ef1@mail.gmail.com> On Thu, May 8, 2008 at 10:39 PM, Anne Archibald wrote: > 2008/5/8 Robert Kern : >> On Thu, May 8, 2008 at 9:52 PM, Anne Archibald >> wrote: >> >>> Thanks! Done add, reduce, outer, and reduceat. What about __call__? >> >> If anyone knows enough to explicitly request a docstring from >> __call__, they already know what it does. > > How exactly are they to find out? It does take additional arguments, > for example dtype and out - I think. That should be recorded in the ufunc's main docstring, e.g. numpy.add.__doc__, since that is what people will actually be calling. No one will explicitly call numpy.add.__call__(x,y). > Also, help(np.add) displays the the object, its docstring, its > methods, and all their docstrings. So it provides a way to get a > docstring out of __call__ without having to know what it is. Meh. All it can usefully say is "Refer to the main docstring." Which is more or less what it currently says. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Thu May 8 23:55:56 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 22:55:56 -0500 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <4823CA39.7090203@astraw.com> References: <4823CA39.7090203@astraw.com> Message-ID: <3d375d730805082055o71d24359m31f8f63cec2bce71@mail.gmail.com> On Thu, May 8, 2008 at 10:51 PM, Andrew Straw wrote: > I've got a big element array (25 million int64s) that searchsorted() > takes a long time to grind through. After a bit of digging in the > literature and the numpy source code, I believe that searchsorted() is > implementing a classic binary search, Yes. > which is pretty bad in terms of > cache misses. There are several modern implementations of binary search > which arrange items in memory such that cache misses are much more rare. > Clearly making such an indexing arrangement would take time, but in my > particular case, I can spare the time to create an index if searching > was faster, since I'd make the index once but do the searching many times. > > Is there an implementation of such an algorithm that works easilty with > numpy? Also, can you offer any advice, suggestions, and comments to me > if I attempted to implement such an algorithm? I'm no help. You seem to know more than I do. Sadly, the first few Google hits I get for "binary search minimize cache misses" are patents. I don't know what the substantive content of those patents are; I have a strict policy of not reading patents. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Fri May 9 00:30:26 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 22:30:26 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <3d375d730805082055o71d24359m31f8f63cec2bce71@mail.gmail.com> References: <4823CA39.7090203@astraw.com> <3d375d730805082055o71d24359m31f8f63cec2bce71@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 9:55 PM, Robert Kern wrote: > On Thu, May 8, 2008 at 10:51 PM, Andrew Straw wrote: > > I've got a big element array (25 million int64s) that searchsorted() > > takes a long time to grind through. After a bit of digging in the > > literature and the numpy source code, I believe that searchsorted() is > > implementing a classic binary search, > > Yes. > > > which is pretty bad in terms of > > cache misses. There are several modern implementations of binary search > > which arrange items in memory such that cache misses are much more rare. > > Clearly making such an indexing arrangement would take time, but in my > > particular case, I can spare the time to create an index if searching > > was faster, since I'd make the index once but do the searching many > times. > > > > Is there an implementation of such an algorithm that works easilty with > > numpy? Also, can you offer any advice, suggestions, and comments to me > > if I attempted to implement such an algorithm? > > I'm no help. You seem to know more than I do. Sadly, the first few > Google hits I get for "binary search minimize cache misses" are > patents. I don't know what the substantive content of those patents > are; I have a strict policy of not reading patents. > I would be interested in adding such a thing if it wasn't patent encumbered. A good start would be a prototype in python to show how it all went together and whether it needed a separate indexing/lookup function or could be fit into the current setup. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Fri May 9 00:31:47 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 9 May 2008 00:31:47 -0400 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: <3d375d730805082051y3203eb7ey9a32ccec7efb9ef1@mail.gmail.com> References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> <3d375d730805082051y3203eb7ey9a32ccec7efb9ef1@mail.gmail.com> Message-ID: 2008/5/8 Robert Kern : > On Thu, May 8, 2008 at 10:39 PM, Anne Archibald > wrote: >> 2008/5/8 Robert Kern : >>> >>> If anyone knows enough to explicitly request a docstring from >>> __call__, they already know what it does. >> >> How exactly are they to find out? It does take additional arguments, >> for example dtype and out - I think. > > That should be recorded in the ufunc's main docstring, e.g. > numpy.add.__doc__, since that is what people will actually be calling. > No one will explicitly call numpy.add.__call__(x,y). > >> Also, help(np.add) displays the the object, its docstring, its >> methods, and all their docstrings. So it provides a way to get a >> docstring out of __call__ without having to know what it is. > > Meh. All it can usefully say is "Refer to the main docstring." Which > is more or less what it currently says. So is the custom that double-underscore methods get documented in the class docstring and normal methods get documented in their own docstrings? 
Anne From robert.kern at gmail.com Fri May 9 00:46:32 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 23:46:32 -0500 Subject: [Numpy-discussion] Why are ufunc docstrings useless? In-Reply-To: References: <3d375d730805081545q795099bbya6bda85378fdff86@mail.gmail.com> <3d375d730805082007x30de907cseb0d9fecc611515b@mail.gmail.com> <3d375d730805082051y3203eb7ey9a32ccec7efb9ef1@mail.gmail.com> Message-ID: <3d375d730805082146pf6f5a2fj7c71ab1d35aec4fa@mail.gmail.com> On Thu, May 8, 2008 at 11:31 PM, Anne Archibald wrote: > 2008/5/8 Robert Kern : >> On Thu, May 8, 2008 at 10:39 PM, Anne Archibald >> wrote: >>> 2008/5/8 Robert Kern : >>>> >>>> If anyone knows enough to explicitly request a docstring from >>>> __call__, they already know what it does. >>> >>> How exactly are they to find out? It does take additional arguments, >>> for example dtype and out - I think. >> >> That should be recorded in the ufunc's main docstring, e.g. >> numpy.add.__doc__, since that is what people will actually be calling. >> No one will explicitly call numpy.add.__call__(x,y). >> >>> Also, help(np.add) displays the the object, its docstring, its >>> methods, and all their docstrings. So it provides a way to get a >>> docstring out of __call__ without having to know what it is. >> >> Meh. All it can usefully say is "Refer to the main docstring." Which >> is more or less what it currently says. > > So is the custom that double-underscore methods get documented in the > class docstring and normal methods get documented in their own > docstrings? No. Objects whose main purpose is to be callable should document their call signature in the main docstring since that's where the objects they are mimiccing (functions) have their docstring. In this case, there is also a technical reason: you can't override add.__call__.__doc__, just ufunc.__call__.__doc__, but the call signatures vary between ufuncs. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri May 9 01:06:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 23:06:16 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: References: <4823CA39.7090203@astraw.com> <3d375d730805082055o71d24359m31f8f63cec2bce71@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 10:30 PM, Charles R Harris wrote: > > > On Thu, May 8, 2008 at 9:55 PM, Robert Kern wrote: > >> On Thu, May 8, 2008 at 10:51 PM, Andrew Straw >> wrote: >> > I've got a big element array (25 million int64s) that searchsorted() >> > takes a long time to grind through. After a bit of digging in the >> > literature and the numpy source code, I believe that searchsorted() is >> > implementing a classic binary search, >> >> Yes. >> >> > which is pretty bad in terms of >> > cache misses. There are several modern implementations of binary search >> > which arrange items in memory such that cache misses are much more rare. >> > Clearly making such an indexing arrangement would take time, but in my >> > particular case, I can spare the time to create an index if searching >> > was faster, since I'd make the index once but do the searching many >> times. >> > >> > Is there an implementation of such an algorithm that works easilty with >> > numpy? Also, can you offer any advice, suggestions, and comments to me >> > if I attempted to implement such an algorithm? 
>> >> I'm no help. You seem to know more than I do. Sadly, the first few >> Google hits I get for "binary search minimize cache misses" are >> patents. I don't know what the substantive content of those patents >> are; I have a strict policy of not reading patents. >> > > I would be interested in adding such a thing if it wasn't patent > encumbered. A good start would be a prototype in python to show how it all > went together and whether it needed a separate indexing/lookup function or > could be fit into the current setup. > One way I can think of doing this is to have two indices. One is the usual sorted list, the second consists of, say, every 1024'th entry in the first list. Then search the second list first to find the part of the first list to search. That won't get you into the very best cache, but it could buy you a factor of 2x-4x in speed. It's sort of splitting the binary tree into two levels. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 9 01:11:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 May 2008 23:11:48 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: References: <4823CA39.7090203@astraw.com> <3d375d730805082055o71d24359m31f8f63cec2bce71@mail.gmail.com> Message-ID: On Thu, May 8, 2008 at 11:06 PM, Charles R Harris wrote: > > > On Thu, May 8, 2008 at 10:30 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, May 8, 2008 at 9:55 PM, Robert Kern >> wrote: >> >>> On Thu, May 8, 2008 at 10:51 PM, Andrew Straw >>> wrote: >>> > I've got a big element array (25 million int64s) that searchsorted() >>> > takes a long time to grind through. After a bit of digging in the >>> > literature and the numpy source code, I believe that searchsorted() is >>> > implementing a classic binary search, >>> >>> Yes. >>> >>> > which is pretty bad in terms of >>> > cache misses. There are several modern implementations of binary search >>> > which arrange items in memory such that cache misses are much more >>> rare. >>> > Clearly making such an indexing arrangement would take time, but in my >>> > particular case, I can spare the time to create an index if searching >>> > was faster, since I'd make the index once but do the searching many >>> times. >>> > >>> > Is there an implementation of such an algorithm that works easilty with >>> > numpy? Also, can you offer any advice, suggestions, and comments to me >>> > if I attempted to implement such an algorithm? >>> >>> I'm no help. You seem to know more than I do. Sadly, the first few >>> Google hits I get for "binary search minimize cache misses" are >>> patents. I don't know what the substantive content of those patents >>> are; I have a strict policy of not reading patents. >>> >> >> I would be interested in adding such a thing if it wasn't patent >> encumbered. A good start would be a prototype in python to show how it all >> went together and whether it needed a separate indexing/lookup function or >> could be fit into the current setup. >> > > One way I can think of doing this is to have two indices. One is the usual > sorted list, the second consists of, say, every 1024'th entry in the first > list. Then search the second list first to find the part of the first list > to search. That won't get you into the very best cache, but it could buy you > a factor of 2x-4x in speed. It's sort of splitting the binary tree into two > levels. 
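A rough pure-Python prototype of that two-level scheme (untested; the function names,
the step size, and the assumption that values is an array are only illustrative):

import numpy as np

def build_coarse_index(sorted_arr, step=1024):
    # every step-th entry of the big sorted array; this small coarse
    # index is what should stay resident in cache
    return sorted_arr[::step]

def indexed_searchsorted(sorted_arr, coarse, values, step=1024):
    # first locate the coarse block each value falls in, then binary
    # search only within that block of the big array
    blocks = np.searchsorted(coarse, values, side='left') - 1
    blocks = np.clip(blocks, 0, len(coarse) - 1)
    out = np.empty(len(values), dtype=np.intp)
    for i in range(len(values)):
        lo = int(blocks[i]) * step
        hi = min(lo + step, len(sorted_arr))
        out[i] = lo + np.searchsorted(sorted_arr[lo:hi], values[i])
    return out

Sorting the query values first, as suggested below, keeps successive lookups inside
the same block.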
> You could even do it with your data from python, generating the second list and calling searchsorted multiple times. If you are searching for a bunch of values, it's probably good to also sort them first so they bunch together in the same part of the big list. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Fri May 9 03:43:54 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 9 May 2008 00:43:54 -0700 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <200805071912.19427.pgmdevlist@gmail.com> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> Message-ID: On Wed, May 7, 2008 at 12:12 PM, Pierre GM wrote: > Yes, there is a problem with ma.power: masking negative data should be > restricted to the case of an exponent between -1. and 1. only, don't you > think ? Charles Doutriaux has suggested that 1.1.0 shouldn't be released until this bug is fixed. I am inclined to somewhat agree with this, but would like to know if others agree with this sentiment. Please respond with a +1 or -1 to making this a 1.1.0 blocker. I also not sure how closely this issue is related with this open ticket: http://projects.scipy.org/scipy/numpy/ticket/301 I would appreciate it if someone could clarify this issue. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From efiring at hawaii.edu Fri May 9 04:23:14 2008 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 08 May 2008 22:23:14 -1000 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> Message-ID: <482409F2.4060903@hawaii.edu> Jarrod Millman wrote: > On Wed, May 7, 2008 at 12:12 PM, Pierre GM wrote: >> Yes, there is a problem with ma.power: masking negative data should be >> restricted to the case of an exponent between -1. and 1. only, don't you >> think ? > > Charles Doutriaux has suggested that 1.1.0 shouldn't be released until > this bug is fixed. I am inclined to somewhat agree with this, but > would like to know if others agree with this sentiment. Please > respond with a +1 or -1 to making this a 1.1.0 blocker. > > I also not sure how closely this issue is related with this open > ticket: http://projects.scipy.org/scipy/numpy/ticket/301 > I would appreciate it if someone could clarify this issue. It is a completely different issue; #301 applies to numpy.power with integer arguments including a negative power; Charles Doutriaux pointed out a bug in numpy.ma.power (and in the oldnumeric.ma) that masked the result if the power was negative. Pierre GM fixed this in r5137, so that the result is masked only if the first argument is negative and the power is between -1 and 1. Eric From falted at pytables.org Fri May 9 05:09:19 2008 From: falted at pytables.org (Francesc Alted) Date: Fri, 9 May 2008 11:09:19 +0200 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <4823CA39.7090203@astraw.com> References: <4823CA39.7090203@astraw.com> Message-ID: <200805091109.20425.falted@pytables.org> A Friday 09 May 2008, Andrew Straw escrigu?: > I've got a big element array (25 million int64s) that searchsorted() > takes a long time to grind through. 
After a bit of digging in the > literature and the numpy source code, I believe that searchsorted() > is implementing a classic binary search, which is pretty bad in terms > of cache misses. There are several modern implementations of binary > search which arrange items in memory such that cache misses are much > more rare. Clearly making such an indexing arrangement would take > time, but in my particular case, I can spare the time to create an > index if searching was faster, since I'd make the index once but do > the searching many times. > > Is there an implementation of such an algorithm that works easilty > with numpy? Also, can you offer any advice, suggestions, and comments > to me if I attempted to implement such an algorithm? Well, if you can afford extra space for the hashes you can always use a dictionary for doing the lookups. In pure Python they are around 3x faster (for arrays of 8 millions of elements) than binary searches. If your space is tight, you can build an extension (for example in Pyrex) for doing binary search for your specific type (int64), for an small improvement. Finally, if you combine this approach with what is suggesting Charles Harris (i.e. creating several levels of caches, but not more than two, which in my experience works best), you can have pretty optimal lookups with relatively low space overhead. See this thread for a discussion of the binary/hash lookup approaches: http://mail.python.org/pipermail/python-list/2007-November/465900.html Hope that helps, -- Francesc Alted From bsouthey at gmail.com Fri May 9 09:36:31 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 09 May 2008 08:36:31 -0500 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <200805091109.20425.falted@pytables.org> References: <4823CA39.7090203@astraw.com> <200805091109.20425.falted@pytables.org> Message-ID: <4824535F.3000807@gmail.com> Hi, I don't know if it helps, but Ulrich Drepper had a 9 part series about memory in Linux Weekly News (http://lwn.net). You can under all 9 linked under his name in the Guest archives (http://lwn.net/Archives/GuestIndex/) as not all are linked together. The first one is: http://lwn.net/Articles/250967/ What every programmer should know about memory, Part 1 (September 21, 2007) In part 5 there was a comment on 'Cache-oblivious algorithms' by *akapoor*: "i guess it's worth mentioning harald-prokop's 1999 thesis on "cache oblivious algorithms" (http://citeseer.ist.psu.edu/prokop99cacheobliviou.html)." Regards Bruce Francesc Alted wrote: > A Friday 09 May 2008, Andrew Straw escrigu?: > >> I've got a big element array (25 million int64s) that searchsorted() >> takes a long time to grind through. After a bit of digging in the >> literature and the numpy source code, I believe that searchsorted() >> is implementing a classic binary search, which is pretty bad in terms >> of cache misses. There are several modern implementations of binary >> search which arrange items in memory such that cache misses are much >> more rare. Clearly making such an indexing arrangement would take >> time, but in my particular case, I can spare the time to create an >> index if searching was faster, since I'd make the index once but do >> the searching many times. >> >> Is there an implementation of such an algorithm that works easilty >> with numpy? Also, can you offer any advice, suggestions, and comments >> to me if I attempted to implement such an algorithm? 
>> > > Well, if you can afford extra space for the hashes you can always use a > dictionary for doing the lookups. In pure Python they are around 3x > faster (for arrays of 8 millions of elements) than binary searches. If > your space is tight, you can build an extension (for example in Pyrex) > for doing binary search for your specific type (int64), for an small > improvement. Finally, if you combine this approach with what is > suggesting Charles Harris (i.e. creating several levels of caches, but > not more than two, which in my experience works best), you can have > pretty optimal lookups with relatively low space overhead. > > See this thread for a discussion of the binary/hash lookup approaches: > > http://mail.python.org/pipermail/python-list/2007-November/465900.html > > Hope that helps, > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From teoliphant at gmail.com Fri May 9 09:43:04 2008 From: teoliphant at gmail.com (Travis Oliphant) Date: Fri, 09 May 2008 08:43:04 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change Message-ID: <482454E8.8060606@gmail.com> Hi all, I'm having trouble emailing this list from work, so I'm using a different email address. After Nathan Bell's recent complaints, I'm a bit more uncomfortable with the matrix change to scalar indexing. It does and will break code in possibly hard-to-track down ways. Also, Nathan has been a *huge* contributor to the Sparse matrix in scipy and so I value his opinion about the NumPy matrix. One of my goals is to have those two objects work together a bit more seamlessly. So, I think we need to: 1) Add a warning to scalar access 2) Back-out the change and fix all the places where NumPy assumes incorrectly that the number of dimensions reduce on PySequence_GetItem. Opinions? -Travis From ndbecker2 at gmail.com Fri May 9 09:53:54 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 09 May 2008 09:53:54 -0400 Subject: [Numpy-discussion] problem with factorial? Message-ID: I noticed my fact function: from scipy.special import gamma def fact(x): return gamma (x+1) Wasn't working. Then I see: gamma Looks like there's a conflict in scipy over the name 'gamma' (I guess this was pulled in later in my script when I did 'from pylab import *') From charlesr.harris at gmail.com Fri May 9 10:00:51 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 May 2008 08:00:51 -0600 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <482454E8.8060606@gmail.com> References: <482454E8.8060606@gmail.com> Message-ID: On Fri, May 9, 2008 at 7:43 AM, Travis Oliphant wrote: > > Hi all, > > I'm having trouble emailing this list from work, so I'm using a > different email address. > > After Nathan Bell's recent complaints, I'm a bit more uncomfortable with > the matrix change to scalar indexing. It does and will break code in > possibly hard-to-track down ways. Also, Nathan has been a *huge* > contributor to the Sparse matrix in scipy and so I value his opinion > about the NumPy matrix. One of my goals is to have those two objects > work together a bit more seamlessly. > > So, I think we need to: > > 1) Add a warning to scalar access > 2) Back-out the change and fix all the places where NumPy assumes > incorrectly that the number of dimensions reduce on PySequence_GetItem. > -1. That said, the basic mistake is probably making Matrix a subclass of ndarray, as it fails the "is a" test. 
There really aren't that many places where inheritance is the right choice and numpy itself wasn't designed as a base class: it lacks a specification of what functions can be "virtual" and is probably too big. I vote that we bring Nathan into the conversation and see how upset he really is. Speaking for myself, I sometimes get angry upfront when specifications change unexpectedly underfoot but then settle down and find that it isn't all that bad. Being caught by surprise is probably half the problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Fri May 9 10:07:37 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 9 May 2008 10:07:37 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <482454E8.8060606@gmail.com> References: <482454E8.8060606@gmail.com> Message-ID: On Fri, 09 May 2008, Travis Oliphant apparently wrote: > I think we need to: > 1) Add a warning to scalar access > 2) Back-out the change and fix all the places where NumPy assumes incorrectly that the number of dimensions reduce on PySequence_GetItem. > Opinions? Point of information: it looks like Nathan already made the needed fixes, and the changes made were in my opinion not at all obscure and indeed were rather minor. (Which does not deny they were needed.) I am trying to advocate either way; just supplying information. Cheers, Alan Isaac From charlesr.harris at gmail.com Fri May 9 10:06:15 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 May 2008 08:06:15 -0600 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Fri, May 9, 2008 at 8:00 AM, Charles R Harris wrote: > > > On Fri, May 9, 2008 at 7:43 AM, Travis Oliphant > wrote: > >> >> Hi all, >> >> I'm having trouble emailing this list from work, so I'm using a >> different email address. >> >> After Nathan Bell's recent complaints, I'm a bit more uncomfortable with >> the matrix change to scalar indexing. It does and will break code in >> possibly hard-to-track down ways. Also, Nathan has been a *huge* >> contributor to the Sparse matrix in scipy and so I value his opinion >> about the NumPy matrix. One of my goals is to have those two objects >> work together a bit more seamlessly. >> >> So, I think we need to: >> >> 1) Add a warning to scalar access >> 2) Back-out the change and fix all the places where NumPy assumes >> incorrectly that the number of dimensions reduce on PySequence_GetItem. >> > > -1. > > That said, the basic mistake is probably making Matrix a subclass of > ndarray, as it fails the "is a" test. There really aren't that many places > where inheritance is the right choice and numpy itself wasn't designed as a > base class: it lacks a specification of what functions can be "virtual" and > is probably too big. > > I vote that we bring Nathan into the conversation and see how upset he > really is. Speaking for myself, I sometimes get angry upfront when > specifications change unexpectedly underfoot but then settle down and find > that it isn't all that bad. Being caught by surprise is probably half the > problem. > Let me add that backing it out of 1.1 might not be a bad idea, it may be a change to soon and at the last minute at that. But I would like to see it in 1.2. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Fri May 9 10:14:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 May 2008 08:14:11 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <4823CA39.7090203@astraw.com> References: <4823CA39.7090203@astraw.com> Message-ID: On Thu, May 8, 2008 at 9:51 PM, Andrew Straw wrote: > I've got a big element array (25 million int64s) that searchsorted() > takes a long time to grind through. After a bit of digging in the > literature and the numpy source code, I believe that searchsorted() is > implementing a classic binary search, which is pretty bad in terms of > cache misses. There are several modern implementations of binary search > which arrange items in memory such that cache misses are much more rare. > Clearly making such an indexing arrangement would take time, but in my > particular case, I can spare the time to create an index if searching > was faster, since I'd make the index once but do the searching many times. > > Is there an implementation of such an algorithm that works easilty with > numpy? Also, can you offer any advice, suggestions, and comments to me > if I attempted to implement such an algorithm? > What sort of algorithm is best also depends on the use. If you have a 25e6 sized table that you want to interpolate through with another set of 25e6 indices, then binary search is the wrong approach. In that case you really want to start from the last position and search forward with increasing steps to bracket the next value. Basically, binary search is order n*ln(m), where n is the size of the index list and m the size of the table. The sequential way is nearly n + m, which will be much better if n and m are of comparable size. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From roberto at dealmeida.net Fri May 9 10:16:01 2008 From: roberto at dealmeida.net (Roberto De Almeida) Date: Fri, 9 May 2008 11:16:01 -0300 Subject: [Numpy-discussion] problem with factorial? In-Reply-To: References: Message-ID: <10c662fe0805090716x1d0a5156ob0ad8c1c9cb2de5b@mail.gmail.com> On Fri, May 9, 2008 at 10:53 AM, Neal Becker wrote: > I noticed my fact function: > from scipy.special import gamma > def fact(x): > return gamma (x+1) > > Looks like there's a conflict in scipy over the name 'gamma' (I guess this > was pulled in later in my script when I did 'from pylab import *') So don't do it, then. :) Or, from scipy.special import gamma as gamma_ --Rob From peridot.faceted at gmail.com Fri May 9 10:43:46 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 9 May 2008 10:43:46 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <482454E8.8060606@gmail.com> References: <482454E8.8060606@gmail.com> Message-ID: 2008/5/9 Travis Oliphant : > After Nathan Bell's recent complaints, I'm a bit more uncomfortable with > the matrix change to scalar indexing. It does and will break code in > possibly hard-to-track down ways. Also, Nathan has been a *huge* > contributor to the Sparse matrix in scipy and so I value his opinion > about the NumPy matrix. One of my goals is to have those two objects > work together a bit more seamlessly. > > So, I think we need to: > > 1) Add a warning to scalar access > 2) Back-out the change and fix all the places where NumPy assumes > incorrectly that the number of dimensions reduce on PySequence_GetItem. > > Opinions? This is certainly the conservative approach. 
How much code is broken by this, compared to (say) the amount broken by the disappearance of numpy.core.ma? Is this our biggest single API breakage? I do agree that we should be paying attention to people who are actually using matrices, so I won't enter a vote. Anne From bsouthey at gmail.com Fri May 9 10:51:54 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 09 May 2008 09:51:54 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: <4824650A.9080902@gmail.com> Charles R Harris wrote: > > > On Fri, May 9, 2008 at 7:43 AM, Travis Oliphant > wrote: > > > Hi all, > > I'm having trouble emailing this list from work, so I'm using a > different email address. > > After Nathan Bell's recent complaints, I'm a bit more > uncomfortable with > the matrix change to scalar indexing. It does and will break code in > possibly hard-to-track down ways. Also, Nathan has been a *huge* > contributor to the Sparse matrix in scipy and so I value his opinion > about the NumPy matrix. One of my goals is to have those two objects > work together a bit more seamlessly. > > So, I think we need to: > > 1) Add a warning to scalar access > 2) Back-out the change and fix all the places where NumPy assumes > incorrectly that the number of dimensions reduce on > PySequence_GetItem. > > > -1. > > That said, the basic mistake is probably making Matrix a subclass of > ndarray, as it fails the "is a" test. There really aren't that many > places where inheritance is the right choice and numpy itself wasn't > designed as a base class: it lacks a specification of what functions > can be "virtual" and is probably too big. > > I vote that we bring Nathan into the conversation and see how upset he > really is. Speaking for myself, I sometimes get angry upfront when > specifications change unexpectedly underfoot but then settle down and > find that it isn't all that bad. Being caught by surprise is probably > half the problem. > > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Hi, +1 The prime reason is not whether or not it is a bad/good idea but because the actual change was introduced so late in the development of 1.0.5/1.1 process. A lesser reason is that gives people like Nathan time to change their code to match the pending release. Unfortunately the other problem with this change is that any user now has to be careful of which NumPy version is being used. The result is that backwards compatibility is now broken in what was originally going to be a minor release. Following Tim Hochberg's email on changing histogram, I think that for at least NumPy version 1.1 that scalar indexing should provide a 'PendingDeprecationWarning' or a 'DeprecationWarning' with the actual change happening in 1.2 or later. 
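A sketch of what that warning could look like (hypothetical: the real change would live
in numpy's own matrix.__getitem__, and the isinstance(index, int) check below ignores
numpy integer scalars for brevity):

import warnings
import numpy as np

class warning_matrix(np.matrix):
    # keep the old matrix return type for now, but tell users that the
    # behaviour of scalar indexing is going to change
    def __getitem__(self, index):
        if isinstance(index, int):
            warnings.warn("scalar indexing of a matrix will return a "
                          "1-D array in a future release",
                          PendingDeprecationWarning, stacklevel=2)
        return np.matrix.__getitem__(self, index)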
Regards Bruce From charlesr.harris at gmail.com Fri May 9 10:56:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 May 2008 08:56:28 -0600 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <4824650A.9080902@gmail.com> References: <482454E8.8060606@gmail.com> <4824650A.9080902@gmail.com> Message-ID: On Fri, May 9, 2008 at 8:51 AM, Bruce Southey wrote: > Charles R Harris wrote: > > > > > > On Fri, May 9, 2008 at 7:43 AM, Travis Oliphant > > wrote: > > > > > > Hi all, > > > > I'm having trouble emailing this list from work, so I'm using a > > different email address. > > > > After Nathan Bell's recent complaints, I'm a bit more > > uncomfortable with > > the matrix change to scalar indexing. It does and will break code > in > > possibly hard-to-track down ways. Also, Nathan has been a *huge* > > contributor to the Sparse matrix in scipy and so I value his opinion > > about the NumPy matrix. One of my goals is to have those two objects > > work together a bit more seamlessly. > > > > So, I think we need to: > > > > 1) Add a warning to scalar access > > 2) Back-out the change and fix all the places where NumPy assumes > > incorrectly that the number of dimensions reduce on > > PySequence_GetItem. > > > > > > -1. > > > > That said, the basic mistake is probably making Matrix a subclass of > > ndarray, as it fails the "is a" test. There really aren't that many > > places where inheritance is the right choice and numpy itself wasn't > > designed as a base class: it lacks a specification of what functions > > can be "virtual" and is probably too big. > > > > I vote that we bring Nathan into the conversation and see how upset he > > really is. Speaking for myself, I sometimes get angry upfront when > > specifications change unexpectedly underfoot but then settle down and > > find that it isn't all that bad. Being caught by surprise is probably > > half the problem. > > > > Chuck > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > Hi, > +1 > > The prime reason is not whether or not it is a bad/good idea but because > the actual change was introduced so late in the development of 1.0.5/1.1 > process. A lesser reason is that gives people like Nathan time to change > their code to match the pending release. Unfortunately the other problem > with this change is that any user now has to be careful of which NumPy > version is being used. The result is that backwards compatibility is now > broken in what was originally going to be a minor release. > Of course, if Nathan has already made the changes we will drive him crazy if we back them out now ;) I note that the thread on scipy ended pretty quickly, so I didn't get the impression there was a lot of resistance. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Fri May 9 11:05:18 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 9 May 2008 17:05:18 +0200 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: <9457e7c80805090805n47d43c12h5c397032b1ee411b@mail.gmail.com> 2008/5/9 Anne Archibald : > How much code is broken by this, compared to (say) the amount broken > by the disappearance of numpy.core.ma? 
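For concreteness, the change being debated amounts to this (indicative only):

>>> m = np.matrix([[1, 2], [3, 4]])
>>> m[0]                 # old behaviour
matrix([[1, 2]])
>>> m[0]                 # behaviour after the SVN change under discussion
array([1, 2])

so code that relied on m[0] still being two-dimensional is what breaks.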
Is this our biggest single API > breakage? You shouldn't import from core -- we never advertised that API. As far as I recall, numpy.ma was always available. Regards St?fan From aisaac at american.edu Fri May 9 11:28:55 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 9 May 2008 11:28:55 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com><4824650A.9080902@gmail.com> Message-ID: On Fri, 9 May 2008, Charles R Harris apparently wrote: > if Nathan has already made the changes we will drive him > crazy if we back them out now Since I mentioned Nathan's changes, I wish to clarify something. I have no idea what Nathan's views are, but as I recall them, it looked to me that his changes would be robust to backing out. He could be asked about this. Again, I am *not* trying to advocate backing out. > I note that the thread on scipy ended pretty quickly, so > I didn't get the impression there was a lot of resistance. I agree with this. Cheers, Alan From bsouthey at gmail.com Fri May 9 11:27:51 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 09 May 2008 10:27:51 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> <4824650A.9080902@gmail.com> Message-ID: <48246D77.9070400@gmail.com> Charles R Harris wrote: > > > On Fri, May 9, 2008 at 8:51 AM, Bruce Southey > wrote: > > Charles R Harris wrote: > > > > > > On Fri, May 9, 2008 at 7:43 AM, Travis Oliphant > > > >> wrote: > > > > > > Hi all, > > > > I'm having trouble emailing this list from work, so I'm using a > > different email address. > > > > After Nathan Bell's recent complaints, I'm a bit more > > uncomfortable with > > the matrix change to scalar indexing. It does and will > break code in > > possibly hard-to-track down ways. Also, Nathan has been a > *huge* > > contributor to the Sparse matrix in scipy and so I value his > opinion > > about the NumPy matrix. One of my goals is to have those > two objects > > work together a bit more seamlessly. > > > > So, I think we need to: > > > > 1) Add a warning to scalar access > > 2) Back-out the change and fix all the places where NumPy > assumes > > incorrectly that the number of dimensions reduce on > > PySequence_GetItem. > > > > > > -1. > > > > That said, the basic mistake is probably making Matrix a subclass of > > ndarray, as it fails the "is a" test. There really aren't that many > > places where inheritance is the right choice and numpy itself > wasn't > > designed as a base class: it lacks a specification of what functions > > can be "virtual" and is probably too big. > > > > I vote that we bring Nathan into the conversation and see how > upset he > > really is. Speaking for myself, I sometimes get angry upfront when > > specifications change unexpectedly underfoot but then settle > down and > > find that it isn't all that bad. Being caught by surprise is > probably > > half the problem. > > > > Chuck > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > Hi, > +1 > > The prime reason is not whether or not it is a bad/good idea but > because > the actual change was introduced so late in the development of > 1.0.5/1.1 > process. 
A lesser reason is that gives people like Nathan time to > change > their code to match the pending release. Unfortunately the other > problem > with this change is that any user now has to be careful of which NumPy > version is being used. The result is that backwards compatibility > is now > broken in what was originally going to be a minor release. > > > Of course, if Nathan has already made the changes we will drive him > crazy if we back them out now ;) I note that the thread on scipy ended > pretty quickly, so I didn't get the impression there was a lot of > resistance. > Sure all the related threads ended abruptly perhaps related to the fact that Travis suggested the change :-) Bruce From tim.hochberg at ieee.org Fri May 9 11:36:41 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 9 May 2008 08:36:41 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <482454E8.8060606@gmail.com> References: <482454E8.8060606@gmail.com> Message-ID: On Fri, May 9, 2008 at 6:43 AM, Travis Oliphant wrote: > > Hi all, > > I'm having trouble emailing this list from work, so I'm using a > different email address. > > After Nathan Bell's recent complaints, I'm a bit more uncomfortable with > the matrix change to scalar indexing. It does and will break code in > possibly hard-to-track down ways. Also, Nathan has been a *huge* > contributor to the Sparse matrix in scipy and so I value his opinion > about the NumPy matrix. One of my goals is to have those two objects > work together a bit more seamlessly. > > So, I think we need to: > > 1) Add a warning to scalar access > 2) Back-out the change and fix all the places where NumPy assumes > incorrectly that the number of dimensions reduce on PySequence_GetItem. +0 My personal opinion is that current matrix class is pretty useless and the change won't help much from my point of view. My preference would be to leave the matrix class alone, design a new matrix class, with a different name, for 1.2 and then deprecate the old matrix class. Piecemeal fixing of the matrix class is going to break someone's code and doesn't really get us where we want to go. Of course, since I think the matrix class is kind of useless, I'm not terribly invested in this change one way or another. > > > Opinions? > > -Travis > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From wright at esrf.fr Fri May 9 11:56:40 2008 From: wright at esrf.fr (Jonathan Wright) Date: Fri, 09 May 2008 17:56:40 +0200 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: <48247438.3040003@esrf.fr> Timothy Hochberg wrote: > +0 > > My personal opinion is that current matrix class is pretty useless and > the change won't help much from my point of view. 
My preference would > be to leave the matrix class alone, design a new matrix class, with a > different name, for 1.2 and then deprecate the old matrix class +1 From charlesr.harris at gmail.com Fri May 9 12:06:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 May 2008 10:06:00 -0600 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <48247438.3040003@esrf.fr> References: <482454E8.8060606@gmail.com> <48247438.3040003@esrf.fr> Message-ID: On Fri, May 9, 2008 at 9:56 AM, Jonathan Wright wrote: > Timothy Hochberg wrote: > > +0 > > > > My personal opinion is that current matrix class is pretty useless and > > the change won't help much from my point of view. My preference would > > be to leave the matrix class alone, design a new matrix class, with a > > different name, for 1.2 and then deprecate the old matrix class > +1 > The problem here is the workarounds in the numpy code, which we would have to maintain. I am against messing with the numpy code just to accommodate a matrix class that shouldn't have inherited from ndarray in the first place. So I am OK with backing out the changes as long as we also leave all the bugs in place. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Fri May 9 12:07:07 2008 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 9 May 2008 11:07:07 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Fri, May 9, 2008 at 9:07 AM, Alan G Isaac wrote: > > Point of information: it looks like Nathan already made the > needed fixes, and the changes made were in my opinion not at > all obscure and indeed were rather minor. (Which does not > deny they were needed.) > That's correct, the necessary changes to scipy.sparse were not very substantial. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From wnbell at gmail.com Fri May 9 12:09:37 2008 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 9 May 2008 11:09:37 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> <4824650A.9080902@gmail.com> Message-ID: On Fri, May 9, 2008 at 10:28 AM, Alan G Isaac wrote: > > Since I mentioned Nathan's changes, I wish to clarify > something. I have no idea what Nathan's views are, but as > I recall them, it looked to me that his changes would be > robust to backing out. That should be true. 
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From wnbell at gmail.com Fri May 9 12:12:32 2008 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 9 May 2008 11:12:32 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> <4824650A.9080902@gmail.com> Message-ID: On Fri, May 9, 2008 at 9:56 AM, Charles R Harris wrote: > > Of course, if Nathan has already made the changes we will drive him crazy if > we back them out now This shouldn't be a problem, scipy.sparse should work with either Thanks for your concern though :) -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From stefan at sun.ac.za Fri May 9 12:13:11 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 9 May 2008 18:13:11 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <482409F2.4060903@hawaii.edu> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> <482409F2.4060903@hawaii.edu> Message-ID: <9457e7c80805090913ubc33759w11d4b128d2cae693@mail.gmail.com> 2008/5/9 Eric Firing : > Pierre GM fixed this in r5137, so that the result is masked only if the > first argument is negative and the power is between -1 and 1. Why should the output be masked only when the power is between -1 and 1? Other powers also produce nan's. Either way, the docstring should be updated: Computes a**b elementwise. Masked values are set to 1. Regards St?fan From vallis.35530053 at bloglines.com Fri May 9 12:17:50 2008 From: vallis.35530053 at bloglines.com (Michele Vallisneri) Date: Fri, 9 May 2008 09:17:50 -0700 Subject: [Numpy-discussion] Problem with numpy and distutils on OS X References: Message-ID: <38EDE004-2D9F-4CA0-855D-9EE4E8D69045@bloglines.com> I'm writing a standard distutils setup.py to compile a Python C extension on OS X 10.4, and I need to specify a few special compiler options to enable vector CPU extension (altivec and SSE on i686 and PPC respectively). This compromises the generation of universal binaries, because these options are CPU-specific, and cannot be passed to gcc together with "-arch ppc -arch i386". I'm happy with generating a nonuniversal extension, which I can do (at least with distutils 2.4.4) by specifying an extra "-arch i386" compiler option. Distutils has some Darwin-specific code that catches that, and takes out the "-arch ppc" option that would be there by default. All well, but then in my setup.py I also import numpy (to find out where its C includes are). If I do that, the behavior of distutils changes, and the "-arch ppc" option is not taken out, so compilation fails again. So I have questions: - Is this an acceptable behavior for numpy to have? Should it modify other modules? I understand that there is a numpy.distutils, but shouldn't I have a choice to use it explicitly, and get the old distutils behavior by using that namespace? - Is there a way to avoid or disable this interference? - Finally, is there a way to compile universal binaries with CPU- specific options? My python is 2.4.4. My numpy is version 1.0.4. gcc is 4.0.1. Thanks a lot! 
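For concreteness, a stripped-down sketch of the kind of setup.py at issue (the module name, source file and vector flags are placeholders; on numpy 1.0.4 even the bare "import numpy" below appears to be what pulls in the distutils patching, as comes up later in the thread):

import platform
import numpy
from distutils.core import setup, Extension

# illustrative, CPU-specific flags -- exactly the sort of thing that
# cannot be combined with the "-arch ppc -arch i386" universal build
if platform.machine() in ('i386', 'i686', 'x86_64'):
    vector_flags = ['-msse', '-msse2']
else:
    vector_flags = ['-faltivec']

ext = Extension('fastmodule',                        # hypothetical name
                sources=['fastmodule.c'],            # hypothetical source
                include_dirs=[numpy.get_include()],  # numpy C headers
                extra_compile_args=vector_flags)

setup(name='fastmodule', version='0.1', ext_modules=[ext])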
Michele From efiring at hawaii.edu Fri May 9 12:55:39 2008 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 09 May 2008 06:55:39 -1000 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <9457e7c80805090913ubc33759w11d4b128d2cae693@mail.gmail.com> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> <482409F2.4060903@hawaii.edu> <9457e7c80805090913ubc33759w11d4b128d2cae693@mail.gmail.com> Message-ID: <4824820B.9090207@hawaii.edu> Stefan, (and Jarrod and Pierre) (Context for anyone new to the thread: the subject is slightly misleading, because the bug is/was present in both oldnumeric.ma and numpy.ma; the discussion of fix pertains to the latter only.) Regarding your objections to r5137: good point. I wondered about that. I think the function should look like this (although it might be possible to speed up the implementation for the most common case): def power(a, b, third=None): """Computes a**b elementwise. Where a is negative and b has a non-integer value, the result is masked. """ if third is not None: raise MAError, "3-argument power not supported." ma = getmask(a) mb = getmask(b) m = mask_or(ma, mb) fa = getdata(a) fb = getdata(b) if fb.dtype.char in typecodes["Integer"]: return masked_array(umath.power(fa, fb), m) md = make_mask((fb != fb.astype(int)) & (fa < 0), shrink=True) m = mask_or(m, md) if m is nomask: return masked_array(umath.power(fa, fb)) else: fa = fa.copy() fa[m] = 1 return masked_array(umath.power(fa, fb), m) I don't have time right now to turn this into a proper patch, complete with test case, but if no one else can do it sooner then I can probably do it in the next day or two. Here is a quick partial example of the behavior of the version above: In [1]:import numpy as np In [2]:xx = np.ma.array([-2.2, -2.2, 2.2, 2.2], mask = [True, False, False, True]) In [3]:xx Out[3]: masked_array(data = [-- -2.2 2.2 --], mask = [ True False False True], fill_value=1e+20) In [4]:np.ma.power(xx, 2.0) Out[4]: masked_array(data = [-- 4.84 4.84 --], mask = [ True False False True], fill_value=1e+20) In [5]:np.ma.power(xx, -2.0) Out[5]: masked_array(data = [-- 0.206611570248 0.206611570248 --], mask = [ True False False True], fill_value=1e+20) In [6]:np.ma.power(xx, -2.1) Out[6]: masked_array(data = [-- -- 0.190946793699 --], mask = [ True True False True], fill_value=1e+20) Eric St?fan van der Walt wrote: > 2008/5/9 Eric Firing : >> Pierre GM fixed this in r5137, so that the result is masked only if the >> first argument is negative and the power is between -1 and 1. > > Why should the output be masked only when the power is between -1 and > 1? Other powers also produce nan's. Either way, the docstring should > be updated: > > Computes a**b elementwise. > > Masked values are set to 1. > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Fri May 9 13:06:28 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 9 May 2008 13:06:28 -0400 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <4824820B.9090207@hawaii.edu> References: <4821E855.1060909@llnl.gov> <9457e7c80805090913ubc33759w11d4b128d2cae693@mail.gmail.com> <4824820B.9090207@hawaii.edu> Message-ID: <200805091306.28826.pgmdevlist@gmail.com> On Friday 09 May 2008 12:55:39 Eric Firing wrote: > Stefan, (and Jarrod and Pierre) > Regarding your objections to r5137: good point. 
I wondered about that. > I think the function should look like this (although it might be > possible to speed up the implementation for the most common case): OK, on it. From aisaac at american.edu Fri May 9 13:31:47 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 9 May 2008 13:31:47 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Fri, 9 May 2008, Timothy Hochberg apparently wrote: > I think the matrix class is kind of useless, In my field (economics), it has been a great way to introduce students to NumPy. I suggest that this reason alone makes it far from "useless". I also personally find it convenient for linear algebra. (I think you have stressed that this is just due to a dearth of operators, but that is the current situation.) Cheers, Alan PS In contrast to those supporting more ambitious and complex proposals, I think Travis's patch fixes almost everything that needs to be fixed. From aisaac at american.edu Fri May 9 13:31:50 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 9 May 2008 13:31:50 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com><48247438.3040003@esrf.fr> Message-ID: On Fri, 9 May 2008, Charles R Harris apparently wrote: > I am against messing with the numpy code just to > accommodate a matrix class that shouldn't have inherited > from ndarray in the first place. So I am OK with backing > out the changes as long as we also leave all the bugs in > place. That's how we got here in the first place, I think. Trading off problems in current behavior vs. the possibility that other code (like Nathan's) might rely on that bad behavior. Uncomfortable either way. One piece of good news: Nathan's fixes were easy, so one way forward is not looking too rocky. fwiw, Alan From kwgoodman at gmail.com Fri May 9 13:52:08 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 9 May 2008 10:52:08 -0700 Subject: [Numpy-discussion] A matrix user in the land of arrays Message-ID: The recently proposed changes to the matrix class was the final push I needed to begin slowly porting my package from matrices to arrays. But I'm already stuck in the first stage (all new modules must use arrays). Here's a toy example of iterating over columns of a matrix: x is a nxm matrix y is a nx1 matrix for j in xrange(x.shape[1]): idx = where(y > scalar)[0] x[idx,j] = scalar If x and y become 2d arrays, the code still works. But what confuses me is how to generalize it to work for both 2d and 1d arrays. From wnbell at gmail.com Fri May 9 13:53:40 2008 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 9 May 2008 12:53:40 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> <48247438.3040003@esrf.fr> Message-ID: On Fri, May 9, 2008 at 12:31 PM, Alan G Isaac wrote: > > That's how we got here in the first place, I think. > Trading off problems in current behavior vs. the possibility > that other code (like Nathan's) might rely on that bad behavior. > Uncomfortable either way. We'll, I think we've established that the possibility of breaking people's code is 1 :) > One piece of good news: Nathan's fixes were easy, > so one way forward is not looking too rocky. True, but scipy.sparse makes fairly limited use of matrix and I have 386 unittests to tell me what broke. 
End-users might spend considerably longer sorting out the problem, particularly if they don't know what they're looking for. Personally, I would not have thought a 1.0 -> 1.1 version bump would break something like this. Yes, we can put caveats in the release notes, but how many numpy.matrix users read those? Some aspects of the change are still broken. It's probably annoying to end users when they pass a matrix into a function and get an ndarray back out. Now matrix indexing is itself guilty of this offense. Furthermore, some users probably *do* rely on the lack of dimension reduction because that's one of the few differences between the matrix and ndarray. Alan, I don't fundamentally disagree with your positions on the deficiencies/quirks of matrices in numpy. However, it's completely inappropriate to plug one hole while creating others, especially in a minor release. I suspect that if we surveyed end-users we'd find that "my code still works" is a much higher priority than "A[0][0] now does what I expect". IMO scalar indexing should raise a warning (as you first suggested) in the 1.1 release. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From robert.kern at gmail.com Fri May 9 14:23:59 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 9 May 2008 13:23:59 -0500 Subject: [Numpy-discussion] A matrix user in the land of arrays In-Reply-To: References: Message-ID: <3d375d730805091123g6104d14ao2d17f246d21393c5@mail.gmail.com> On Fri, May 9, 2008 at 12:52 PM, Keith Goodman wrote: > The recently proposed changes to the matrix class was the final push I > needed to begin slowly porting my package from matrices to arrays. But > I'm already stuck in the first stage (all new modules must use > arrays). > > Here's a toy example of iterating over columns of a matrix: > > x is a nxm matrix > y is a nx1 matrix > > for j in xrange(x.shape[1]): > idx = where(y > scalar)[0] > x[idx,j] = scalar > > If x and y become 2d arrays, the code still works. But what confuses > me is how to generalize it to work for both 2d and 1d arrays. Use atleast_2d(x) to get a 1xm array, then use your 2D code on it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri May 9 14:35:38 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 9 May 2008 13:35:38 -0500 Subject: [Numpy-discussion] Problem with numpy and distutils on OS X In-Reply-To: <38EDE004-2D9F-4CA0-855D-9EE4E8D69045@bloglines.com> References: <38EDE004-2D9F-4CA0-855D-9EE4E8D69045@bloglines.com> Message-ID: <3d375d730805091135s796fa7cam29d767678ba0e47e@mail.gmail.com> On Fri, May 9, 2008 at 11:17 AM, Michele Vallisneri wrote: > I'm writing a standard distutils setup.py to compile a Python C > extension on OS X 10.4, and I need to specify a few special compiler > options to enable vector CPU extension (altivec and SSE on i686 and > PPC respectively). This compromises the generation of universal > binaries, because these options are CPU-specific, and cannot be > passed to gcc together with "-arch ppc -arch i386". > > I'm happy with generating a nonuniversal extension, which I can do > (at least with distutils 2.4.4) by specifying an extra "-arch i386" > compiler option. Distutils has some Darwin-specific code that catches > that, and takes out the "-arch ppc" option that would be there by > default. 
> > All well, but then in my setup.py I also import numpy (to find out > where its C includes are). If I do that, the behavior of distutils > changes, and the "-arch ppc" option is not taken out, so compilation > fails again. > > So I have questions: > > - Is this an acceptable behavior for numpy to have? No. We've seen this before, and I thought we fixed it, but perhaps not. numpy.distutils does monkeypatch distutils, but you shouldn't get that unless if you import numpy.distutils. > Should it modify > other modules? I understand that there is a numpy.distutils, but > shouldn't I have a choice to use it explicitly, and get the old > distutils behavior by using that namespace? Yes, you should be able to avoid this by avoiding importing numpy.distutils. However, I don't see why you are getting this. Just importing numpy and calling numpy.get_include() does not bring in numpy.distutils, at least not with the SVN version of numpy (the 1.1.x branch rather than the 1.2.x trunk). >>> import sys >>> old = set(sys.modules) >>> import numpy >>> numpy.get_include() '/Users/rkern/svn/numpy/numpy/core/include' >>> new = set(sys.modules) >>> [x for x in (new - old) if 'distutils' in x] [] >>> You might try something similar at the top of your setup.py script to see if 1.0.4 does something different. > - Is there a way to avoid or disable this interference? Possibly upgrade to the 1.1.x branch. Check it out from here: http://svn.scipy.org/svn/numpy/branches/1.1.x > - Finally, is there a way to compile universal binaries with CPU- > specific options? Not to my knowledge, no. There might be a gcc option to use in extra_compile_args, but you will have to check the man page for it. If you find one, please post it, since I am interested in having such a capability myself. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Fri May 9 14:41:08 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 9 May 2008 11:41:08 -0700 Subject: [Numpy-discussion] A matrix user in the land of arrays In-Reply-To: <3d375d730805091123g6104d14ao2d17f246d21393c5@mail.gmail.com> References: <3d375d730805091123g6104d14ao2d17f246d21393c5@mail.gmail.com> Message-ID: On Fri, May 9, 2008 at 11:23 AM, Robert Kern wrote: > On Fri, May 9, 2008 at 12:52 PM, Keith Goodman wrote: >> The recently proposed changes to the matrix class was the final push I >> needed to begin slowly porting my package from matrices to arrays. But >> I'm already stuck in the first stage (all new modules must use >> arrays). >> >> Here's a toy example of iterating over columns of a matrix: >> >> x is a nxm matrix >> y is a nx1 matrix >> >> for j in xrange(x.shape[1]): >> idx = where(y > scalar)[0] >> x[idx,j] = scalar >> >> If x and y become 2d arrays, the code still works. But what confuses >> me is how to generalize it to work for both 2d and 1d arrays. > > Use atleast_2d(x) to get a 1xm array, then use your 2D code on it. That looks good. But at the end of the function I'll have to convert back to a 1d array if the input is 1d np.whence_you_came_from(x) I guess there is no way to not test for the shape. 
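A sketch of how the shape handling can be pushed into the function so the caller never tests for it (the helper name and the thresholding rule are invented for illustration; the key point is that atleast_2d returns a view, so in-place writes reach the original array):

import numpy as np

def cap_columns(x, scalar):
    # hypothetical helper: set entries greater than `scalar` to `scalar`,
    # column by column, accepting either a 1d or a 2d array
    x2 = np.atleast_2d(x)                 # view for 1d input, x itself for 2d
    for j in range(x2.shape[1]):
        idx = np.where(x2[:, j] > scalar)[0]
        x2[idx, j] = scalar               # writes propagate back to x
    return x                              # same shape the caller passed in

a = np.arange(6.).reshape(3, 2)
cap_columns(a, 3.0)        # 2d case
b = np.arange(5.)
cap_columns(b, 2.0)        # 1d case, no shape test needed by the caller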
From robert.kern at gmail.com Fri May 9 14:43:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 9 May 2008 13:43:34 -0500 Subject: [Numpy-discussion] A matrix user in the land of arrays In-Reply-To: References: <3d375d730805091123g6104d14ao2d17f246d21393c5@mail.gmail.com> Message-ID: <3d375d730805091143o4a8ef442r733353ad4cee38d9@mail.gmail.com> On Fri, May 9, 2008 at 1:41 PM, Keith Goodman wrote: > That looks good. But at the end of the function I'll have to convert > back to a 1d array if the input is 1d > > np.whence_you_came_from(x) > > I guess there is no way to not test for the shape. Well, in this case, since you are modifying the data in-place, just return the original array. x2 = atleast_2d(x) x2[i,j] = something return x -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri May 9 14:44:14 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 May 2008 12:44:14 -0600 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> <48247438.3040003@esrf.fr> Message-ID: On Fri, May 9, 2008 at 10:06 AM, Charles R Harris wrote: > > > On Fri, May 9, 2008 at 9:56 AM, Jonathan Wright wrote: > >> Timothy Hochberg wrote: >> > +0 >> > >> > My personal opinion is that current matrix class is pretty useless and >> > the change won't help much from my point of view. My preference would >> > be to leave the matrix class alone, design a new matrix class, with a >> > different name, for 1.2 and then deprecate the old matrix class >> +1 >> > > The problem here is the workarounds in the numpy code, which we would have > to maintain. I am against messing with the numpy code just to accommodate a > matrix class that shouldn't have inherited from ndarray in the first place. > So I am OK with backing out the changes as long as we also leave all the > bugs in place. > So to start a different line here, what properties *should* a base class have? I would posit that the scalar types, strided memory, ufuncs, and broadcasting would be the fundamental properties. The strided memory and ufuncs, and perhaps broadcasting, probably can't be directly used by sparse arrays, so this may be too general already. Things like constructors (array, matrix), display (print), and operator choices (+,-,*, indexing), should not. The problem with operators is that we want Python to do the parsing and want to use sequences and such. One possibility here is to provide separate functions with different names to replace PySequence_GetItem and PySequence_Length internally in numpy, that is, we no longer consider arrays as nested Python sequences. Instead, we provide a standalone function to translate nested sequences to arrays, which is pretty much what we have now. So we don't just special case matrices, we special case arrays in general and only use PySequence_GetItem internally for genuine Python sequences. >From this common base we can then derive ndarray which defines __getitem__ to use the new functions and can also derive matrices which define __getitem__ differently. I'm just thinking out loud here where folks might notice to start a different thread. This doesn't solve the problem of what matrices should be, but it does remove the problem of decreasing dimensions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kwgoodman at gmail.com Fri May 9 14:49:16 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 9 May 2008 11:49:16 -0700 Subject: [Numpy-discussion] A matrix user in the land of arrays In-Reply-To: <3d375d730805091143o4a8ef442r733353ad4cee38d9@mail.gmail.com> References: <3d375d730805091123g6104d14ao2d17f246d21393c5@mail.gmail.com> <3d375d730805091143o4a8ef442r733353ad4cee38d9@mail.gmail.com> Message-ID: On Fri, May 9, 2008 at 11:43 AM, Robert Kern wrote: > On Fri, May 9, 2008 at 1:41 PM, Keith Goodman wrote: >> That looks good. But at the end of the function I'll have to convert >> back to a 1d array if the input is 1d >> >> np.whence_you_came_from(x) >> >> I guess there is no way to not test for the shape. > > Well, in this case, since you are modifying the data in-place, just > return the original array. > > x2 = atleast_2d(x) > x2[i,j] = something > return x Nice! That was the beauty of atleast_2d---it returns a view. I see now that views are not just for speed. From vallis.35530053 at bloglines.com Fri May 9 15:32:00 2008 From: vallis.35530053 at bloglines.com (vallis.35530053 at bloglines.com) Date: 9 May 2008 19:32:00 -0000 Subject: [Numpy-discussion] Problem with numpy and distutils on OS X Message-ID: <1210361520.3816136710.28005.sendItem@bloglines.com> Thanks, Robert. Indeed, numpy 1.0.4 does some "monkeypatching" (see the transcript below). Interestingly, 1.0.3 did not, so I'm hoping that 1.0.5 may not also. (I'd rather stay with released version, since I distribute my code to colleagues, and cannot impose too many conditions on them.) In the meantime I'll get the numpy include directory information some other way in my setup.py. I will also investigate if gcc could be smarter about enabling vectorization (perhaps in a version newer than 4.0.1?), and let you know. Cheers, Michele --- Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> old = set(sys.modules) >>> import numpy >>> numpy.get_include() '/Users/vallis/lib/python2.5/site-packages/numpy-1.0.4-py2.5-macosx-10.3-fat.egg/numpy/core/include' >>> new = set(sys.modules) >>> [x for x in (new - old) if 'distutils' in x] ['numpy.distutils.misc_util', 'distutils', 'numpy.distutils.re', 'numpy.distutils.numpy', 'distutils.dep_util', 'numpy.distutils.ccompiler', 'distutils.types', 'numpy.distutils.exec_command', 'numpy.distutils.tempfile', 'distutils.re', 'numpy.distutils.info', 'numpy.distutils.distutils', 'distutils.log', 'distutils.copy', 'distutils.version', 'numpy.distutils', 'distutils.sysconfig', 'numpy.distutils.curses', 'numpy.distutils.copy', 'numpy.distutils.sys', 'numpy.distutils.__version__', 'distutils.os', 'distutils.ccompiler', 'numpy.distutils.string', 'distutils.spawn', 'distutils.sys', 'distutils.dir_util', 'numpy.distutils.unixccompiler', 'distutils.util', 'distutils.string', 'numpy.distutils.new', 'numpy.distutils.imp', 'numpy.distutils.__config__', 'distutils.distutils', 'distutils.unixccompiler', 'numpy.distutils.log', 'numpy.distutils.os', 'numpy.distutils.glob', 'numpy.distutils.atexit', 'distutils.errors', 'distutils.file_util'] --- Discussion of Numerical Python wrote: > > I'm writing a standard distutils setup.py to compile a Python C > > extension on OS X 10.4, and I need to specify a few special compiler > > options to enable vector CPU extension (altivec and SSE on i686 and > > PPC respectively). 
This compromises the generation of universal > > binaries, because these options are CPU-specific, and cannot be > > passed to gcc together with "-arch ppc -arch i386". > > > > I'm happy with generating a nonuniversal extension, which I can do > > (at least with distutils 2.4.4) by specifying an extra "-arch i386" > > compiler option. Distutils has some Darwin-specific code that catches > > that, and takes out the "-arch ppc" option that would be there by > > default. > > > > All well, but then in my setup.py I also import numpy (to find out > > where its C includes are). If I do that, the behavior of distutils > > changes, and the "-arch ppc" option is not taken out, so compilation > > fails again. > > > > So I have questions: > > > > - Is this an acceptable behavior for numpy to have? > > No. We've seen this before, and I thought we fixed it, but perhaps > not. numpy.distutils does monkeypatch distutils, but you shouldn't get > that unless if you import numpy.distutils. > > > Should it modify > > other modules? I understand that there is a numpy.distutils, but > > shouldn't I have a choice to use it explicitly, and get the old > > distutils behavior by using that namespace? > > Yes, you should be able to avoid this by avoiding importing > numpy.distutils. However, I don't see why you are getting this. Just > importing numpy and calling numpy.get_include() does not bring in > numpy.distutils, at least not with the SVN version of numpy (the 1.1.x > branch rather than the 1.2.x trunk). > > >>> import sys > >>> old = set(sys.modules) > >>> import numpy > >>> numpy.get_include() > '/Users/rkern/svn/numpy/numpy/core/include' > >>> new = set(sys.modules) > >>> [x for x in (new - old) if 'distutils' in x] > [] > >>> > > You might try something similar at the top of your setup.py script to > see if 1.0.4 does something different. > > > - Is there a way to avoid or disable this interference? > > Possibly upgrade to the 1.1.x branch. Check it out from here: > > http://svn.scipy.org/svn/numpy/branches/1.1.x > > > - Finally, is there a way to compile universal binaries with CPU- > > specific options? > > Not to my knowledge, no. There might be a gcc option to use in > extra_compile_args, but you will have to check the man page for it. If > you find one, please post it, since I am interested in having such a > capability myself. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Fri May 9 15:58:31 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 9 May 2008 14:58:31 -0500 Subject: [Numpy-discussion] Problem with numpy and distutils on OS X In-Reply-To: <1210361520.3816136710.28005.sendItem@bloglines.com> References: <1210361520.3816136710.28005.sendItem@bloglines.com> Message-ID: <3d375d730805091258p38422130hb319b5541a43f6c3@mail.gmail.com> On Fri, May 9, 2008 at 2:32 PM, wrote: > Thanks, Robert. Indeed, numpy 1.0.4 does some "monkeypatching" (see the transcript > below). Interestingly, 1.0.3 did not, so I'm hoping that 1.0.5 may not also. > (I'd rather stay with released version, since I distribute my code to colleagues, > and cannot impose too many conditions on them.) 
1.1.0 (which was supposed to be 1.0.5 until recently) should be out shortly. I can indeed verify that numpy.distutils does get imported in 1.0.4 but not in 1.1.0, so I think we've already fixed the problem on our end. Ah. I found the source of the problem. numpy.testing.numpytest, which gets imported when you import numpy, uses a utility function from numpy.distutils in 1.0.4, but we changed all that for 1.1.0. > In the meantime I'll get > the numpy include directory information some other way in my setup.py. I will > also investigate if gcc could be smarter about enabling vectorization (perhaps > in a version newer than 4.0.1?), and let you know. Possibly, but 4.0.1 is the standard compiler on OS X. You probably can't rely on anything else being installed. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Fri May 9 16:12:03 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 9 May 2008 22:12:03 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <4824820B.9090207@hawaii.edu> References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> <482409F2.4060903@hawaii.edu> <9457e7c80805090913ubc33759w11d4b128d2cae693@mail.gmail.com> <4824820B.9090207@hawaii.edu> Message-ID: 2008/5/9 Eric Firing : > Stefan, (and Jarrod and Pierre) > > (Context for anyone new to the thread: the subject is slightly > misleading, because the bug is/was present in both oldnumeric.ma and > numpy.ma; the discussion of fix pertains to the latter only.) > > Regarding your objections to r5137: good point. I wondered about that. > I think the function should look like this (although it might be > possible to speed up the implementation for the most common case): [...] > md = make_mask((fb != fb.astype(int)) & (fa < 0), shrink=True) Unfortunately this isn't quite the right condition: In [18]: x = 2.**35; numpy.array([-1.])**x; numpy.array(x).astype(int)==x Out[18]: array([ 1.]) Out[18]: False Switching to int64 seems to help: In [27]: x = 2.**62; numpy.array([-1.])**x; numpy.array(x).astype(numpy.int64)==x Out[27]: array([ 1.]) Out[27]: True This seems to work, but may be platform-dependent: 2**62+1 cannot be represented as an IEEE float, so whether pow() successfully deals with it may be different for machines that don't work with 80-bit floating-point internally. A suspenders-and-belt approach would check for NaNs and add them to the mask, but this still doesn't cover the case where the user has numpy set to raise exceptions any time NaNs are generated. Anne From efiring at hawaii.edu Fri May 9 17:13:02 2008 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 09 May 2008 11:13:02 -1000 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: References: <4821E855.1060909@llnl.gov> <4821F1DF.3010108@hawaii.edu> <200805071912.19427.pgmdevlist@gmail.com> <482409F2.4060903@hawaii.edu> <9457e7c80805090913ubc33759w11d4b128d2cae693@mail.gmail.com> <4824820B.9090207@hawaii.edu> Message-ID: <4824BE5E.4060409@hawaii.edu> Anne Archibald wrote: > 2008/5/9 Eric Firing : >> Stefan, (and Jarrod and Pierre) >> >> (Context for anyone new to the thread: the subject is slightly >> misleading, because the bug is/was present in both oldnumeric.ma and >> numpy.ma; the discussion of fix pertains to the latter only.) >> >> Regarding your objections to r5137: good point. 
I wondered about that. >> I think the function should look like this (although it might be >> possible to speed up the implementation for the most common case): > [...] >> md = make_mask((fb != fb.astype(int)) & (fa < 0), shrink=True) > > Unfortunately this isn't quite the right condition: > > In [18]: x = 2.**35; numpy.array([-1.])**x; numpy.array(x).astype(int)==x > Out[18]: array([ 1.]) > Out[18]: False > > Switching to int64 seems to help: > > In [27]: x = 2.**62; numpy.array([-1.])**x; > numpy.array(x).astype(numpy.int64)==x > Out[27]: array([ 1.]) > Out[27]: True > > This seems to work, but may be platform-dependent: 2**62+1 cannot be > represented as an IEEE float, so whether pow() successfully deals with > it may be different for machines that don't work with 80-bit > floating-point internally. > > A suspenders-and-belt approach would check for NaNs and add them to > the mask, but this still doesn't cover the case where the user has > numpy set to raise exceptions any time NaNs are generated. There may not be a perfect solution, but I suspect your suggestion to use int64 is more than good enough to get things going for a 1.1 release. The docstring could note the limitation. If it is established that the calculation will fail for a power outside some domain, then such a domain check could be added to the mask. Eric > > Anne > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Fri May 9 17:23:07 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 9 May 2008 17:23:07 -0400 Subject: [Numpy-discussion] Power domain (was Re: bug in oldnumeric.ma) In-Reply-To: <4824BE5E.4060409@hawaii.edu> References: <4821E855.1060909@llnl.gov> <4824BE5E.4060409@hawaii.edu> Message-ID: <200805091723.08940.pgmdevlist@gmail.com> On Friday 09 May 2008 17:13:02 Eric Firing wrote: > Anne Archibald wrote: > > 2008/5/9 Eric Firing : > >> md = make_mask((fb != fb.astype(int)) & (fa < 0), shrink=True) > > > > Unfortunately this isn't quite the right condition: > > > > In [18]: x = 2.**35; numpy.array([-1.])**x; numpy.array(x).astype(int)==x > > Out[18]: array([ 1.]) > > Out[18]: False > > > > Switching to int64 seems to help: > There may not be a perfect solution, but I suspect your suggestion to > use int64 is more than good enough to get things going for a 1.1 > release. The docstring could note the limitation. If it is established > that the calculation will fail for a power outside some domain, then > such a domain check could be added to the mask. Interestingly, I notice that MaskedArray misses a __pow__ method: it uses the ndarray.__pow__ method instead, that may outputs NaNs. In other terms, I gonna have to code MaskedArray.__pow__, following Eric' example, with int64 instead of int. 
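A minimal sketch of that idea (not the eventual numpy.ma code, and the helper name is made up): mask the entries where the base is negative and the exponent is not exactly representable as an integer, doing the integer test with int64 so that large even powers such as 2.**35 are not masked by mistake.

import numpy as np
import numpy.ma as ma

def pow_with_domain_mask(a, b):
    fa, fb = ma.getdata(a), ma.getdata(b)
    m = ma.mask_or(ma.getmask(a), ma.getmask(b))
    # negative base with a non-integer exponent would give NaN
    bad = (fa < 0) & (fb != fb.astype(np.int64))
    m = ma.mask_or(m, ma.make_mask(bad, shrink=True))
    fa = np.where(bad, 1, fa)      # keep the ufunc from generating NaNs there
    return ma.masked_array(np.power(fa, fb), mask=m)

# first entry masked (negative base, non-integer power); second is 2.2**2.1
print(pow_with_domain_mask(np.array([-2.2, 2.2]), np.array([2.1, 2.1])))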
But, why don't we compare: abs(np.array(b).astype(int)-b).max() References: <4821E855.1060909@llnl.gov> <4824BE5E.4060409@hawaii.edu> <200805091723.08940.pgmdevlist@gmail.com> Message-ID: <4824D40D.4000207@hawaii.edu> Pierre GM wrote: > On Friday 09 May 2008 17:13:02 Eric Firing wrote: >> Anne Archibald wrote: >>> 2008/5/9 Eric Firing : >>>> md = make_mask((fb != fb.astype(int)) & (fa < 0), shrink=True) >>> Unfortunately this isn't quite the right condition: >>> >>> In [18]: x = 2.**35; numpy.array([-1.])**x; numpy.array(x).astype(int)==x >>> Out[18]: array([ 1.]) >>> Out[18]: False >>> >>> Switching to int64 seems to help: > >> There may not be a perfect solution, but I suspect your suggestion to >> use int64 is more than good enough to get things going for a 1.1 >> release. The docstring could note the limitation. If it is established >> that the calculation will fail for a power outside some domain, then >> such a domain check could be added to the mask. > > Interestingly, I notice that MaskedArray misses a __pow__ method: it uses the > ndarray.__pow__ method instead, that may outputs NaNs. In other terms, I > gonna have to code MaskedArray.__pow__, following Eric' example, with int64 > instead of int. > But, why don't we compare: > abs(np.array(b).astype(int)-b).max() instead ? At which point will b be considered sufficiently close to an integer > that x**b won't return NaN ? I don't think the .max() part of that is right; the test needs to be element-wise, and turned into a mask. It is also not clear to me that the test would actually catch all the cases where x**b would return NaN. It seems like some strategic re-thinking may be needed in the long run, if not immediately. There is a wide range of combinations of arguments that will trigger invalid results, whether Inf or NaN. The only way to trap and mask all of these is to use masked_invalid after the calculation, and this only works if the user has not disabled nan output. I have not checked recently, but based on earlier strategy discussions, I suspect that numpy.ma is already strongly depending on the availability of nan and inf output to prevent exceptions being raised upon invalid calculations. Maybe this should simply be considered a requirement for the use of ma. The point of my original suggestion was that it would make power work as a user might reasonably expect under sane conditions that are likely to arise in practice. Under extreme conditions, it would leave unmasked nans or infs, or would raise an exception if that were the specified handling mode for invalid calculations. Eric From kwgoodman at gmail.com Fri May 9 19:27:01 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 9 May 2008 16:27:01 -0700 Subject: [Numpy-discussion] clip out Message-ID: Is there a reason why clip doesn't take out as an input? It seems to work when I added it. From aisaac at american.edu Fri May 9 19:31:12 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 9 May 2008 19:31:12 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com><48247438.3040003@esrf.fr> Message-ID: On Fri, 9 May 2008, Nathan Bell apparently wrote: > I don't fundamentally disagree with your positions on the > deficiencies/quirks of matrices in numpy. However, it's > completely inappropriate to plug one hole while creating > others I think we have to be careful with that argument. The relative size of holes can matter. > especially in a minor release. 
That is a separate question, on which I am not expressing an opinion. Cheers, Alan From pgmdevlist at gmail.com Fri May 9 19:31:37 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 9 May 2008 19:31:37 -0400 Subject: [Numpy-discussion] Power domain (was Re: bug in oldnumeric.ma) In-Reply-To: <4824D40D.4000207@hawaii.edu> References: <4821E855.1060909@llnl.gov> <200805091723.08940.pgmdevlist@gmail.com> <4824D40D.4000207@hawaii.edu> Message-ID: <200805091931.38450.pgmdevlist@gmail.com> On Friday 09 May 2008 18:45:33 Eric Firing wrote: > I don't think the .max() part of that is right; the test needs to be > element-wise, and turned into a mask. Quite right. I was being overzealous... > It is also not clear to me that the test would actually catch all the > cases where x**b would return NaN. Oh, probably not, but it's close enough: raise an exception if you have a negative number and an exponent that is significantly different froman integer. > It seems like some strategic re-thinking may be needed in the long run, > if not immediately. There is a wide range of combinations of arguments > that will trigger invalid results, whether Inf or NaN. Mmh, I forgot about the zero case with negative integers: right now, inf is returned. Should be easy enough to make a (x The only way to > trap and mask all of these is to use masked_invalid after the > calculation, and this only works if the user has not disabled nan > output. We'll agree that's a rather quick-and-dirty patch, not a real fix... > I have not checked recently, but based on earlier strategy > discussions, I suspect that numpy.ma is already strongly depending on > the availability of nan and inf output to prevent exceptions being > raised upon invalid calculations. Maybe this should simply be > considered a requirement for the use of ma. I wouldn't say strongly. In most cases, potential NaNs/Infs are trapped beforehand. Paul, Sasha and the other original developers had introduced DomainedOperation classes that are quite useful for that. masked_invalid may be used locally in scipy.stats.mstats as a quick fix... In our case, we need for a**b: - trap the case [a==0] >>> (a>> (a<0) & (abs(b-b.astype(int)) References: Message-ID: <3d375d730805091639y4f9f32b4s8aac038d7ff9ab8b@mail.gmail.com> On Fri, May 9, 2008 at 6:27 PM, Keith Goodman wrote: > Is there a reason why clip doesn't take out as an input? Oversight. The out= argument was added to the .clip() method relatively recently. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Fri May 9 20:01:06 2008 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 09 May 2008 14:01:06 -1000 Subject: [Numpy-discussion] Power domain (was Re: bug in oldnumeric.ma) In-Reply-To: <200805091931.38450.pgmdevlist@gmail.com> References: <4821E855.1060909@llnl.gov> <200805091723.08940.pgmdevlist@gmail.com> <4824D40D.4000207@hawaii.edu> <200805091931.38450.pgmdevlist@gmail.com> Message-ID: <4824E5C2.10809@hawaii.edu> Pierre GM wrote: > On Friday 09 May 2008 18:45:33 Eric Firing wrote: >> I don't think the .max() part of that is right; the test needs to be >> element-wise, and turned into a mask. > > Quite right. I was being overzealous... > >> It is also not clear to me that the test would actually catch all the >> cases where x**b would return NaN. 
> > Oh, probably not, but it's close enough: raise an exception if you have a > negative number and an exponent that is significantly different froman > integer. > >> It seems like some strategic re-thinking may be needed in the long run, >> if not immediately. There is a wide range of combinations of arguments >> that will trigger invalid results, whether Inf or NaN. > > Mmh, I forgot about the zero case with negative integers: right now, inf is > returned. Should be easy enough to make a (x >> The only way to >> trap and mask all of these is to use masked_invalid after the >> calculation, and this only works if the user has not disabled nan >> output. > > We'll agree that's a rather quick-and-dirty patch, not a real fix... > >> I have not checked recently, but based on earlier strategy >> discussions, I suspect that numpy.ma is already strongly depending on >> the availability of nan and inf output to prevent exceptions being >> raised upon invalid calculations. Maybe this should simply be >> considered a requirement for the use of ma. > > I wouldn't say strongly. In most cases, potential NaNs/Infs are trapped > beforehand. Paul, Sasha and the other original developers had introduced > DomainedOperation classes that are quite useful for that. masked_invalid may > be used locally in scipy.stats.mstats as a quick fix... > > In our case, we need for a**b: > - trap the case [a==0] >>>> (a - trap the case [(a<0) and real exponent too different from an integer] >>>> (a<0) & (abs(b-b.astype(int)) > Am I missing anything else ? > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From kwgoodman at gmail.com Fri May 9 20:03:49 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 9 May 2008 17:03:49 -0700 Subject: [Numpy-discussion] clip out In-Reply-To: <3d375d730805091639y4f9f32b4s8aac038d7ff9ab8b@mail.gmail.com> References: <3d375d730805091639y4f9f32b4s8aac038d7ff9ab8b@mail.gmail.com> Message-ID: On Fri, May 9, 2008 at 4:39 PM, Robert Kern wrote: > On Fri, May 9, 2008 at 6:27 PM, Keith Goodman wrote: >> Is there a reason why clip doesn't take out as an input? > > Oversight. The out= argument was added to the .clip() method > relatively recently. Oh. I didn't know it was available as a method. Great, I'll use that. From peridot.faceted at gmail.com Sat May 10 00:44:57 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 10 May 2008 00:44:57 -0400 Subject: [Numpy-discussion] Power domain (was Re: bug in oldnumeric.ma) In-Reply-To: <4824D40D.4000207@hawaii.edu> References: <4821E855.1060909@llnl.gov> <4824BE5E.4060409@hawaii.edu> <200805091723.08940.pgmdevlist@gmail.com> <4824D40D.4000207@hawaii.edu> Message-ID: 2008/5/9 Eric Firing : > It seems like some strategic re-thinking may be needed in the long run, > if not immediately. There is a wide range of combinations of arguments > that will trigger invalid results, whether Inf or NaN. The only way to > trap and mask all of these is to use masked_invalid after the > calculation, and this only works if the user has not disabled nan > output. I have not checked recently, but based on earlier strategy > discussions, I suspect that numpy.ma is already strongly depending on > the availability of nan and inf output to prevent exceptions being > raised upon invalid calculations. Maybe this should simply be > considered a requirement for the use of ma. 
I think in principle the right answer is to simply run whatever underlying function, and mask any NaNs or Infs in the output. This may be a problem when it comes to seterr - is the behaviour of seterr() even defined in a multithreaded context? Can it be made thread-local? Surely hardware must support thread-local floating-point error flags. If seterr() works this way, then surely the right answer in ma is to use a try/finally block to turn off exceptions and warnings, and clean up NaNs after the fact. Anne From wright at esrf.fr Sat May 10 03:54:34 2008 From: wright at esrf.fr (Jonathan Wright) Date: Sat, 10 May 2008 09:54:34 +0200 Subject: [Numpy-discussion] Power domain (was Re: bug in oldnumeric.ma) In-Reply-To: References: <4821E855.1060909@llnl.gov> <4824BE5E.4060409@hawaii.edu> <200805091723.08940.pgmdevlist@gmail.com> <4824D40D.4000207@hawaii.edu> Message-ID: <482554BA.5090308@esrf.fr> Anne Archibald wrote: > 2008/5/9 Eric Firing : > > >> It seems like some strategic re-thinking may be needed in the long run, >> if not immediately. > I think in principle the right answer is to simply run whatever > underlying function, and mask any NaNs or Infs in the output. Did you already rule out promoting to complex based on the type information of the power args? I see that already: >> power( -3.0 , arange(2.8, 3.2, 0.1) ) Warning: invalid value encountered in power array([ NaN, NaN, -27., NaN, NaN]) >>> power( -3.0*(1+0j) , arange(2.8, 3.2, 0.1) ) array([-17.53465227+12.73967059j, -23.00689255 +7.47539254j, -27.00000000 +0.j , -28.66039788 -9.31232777j, -27.21107252-19.77000142j]) This gives a continuous function. So, multiply by (1+0j) and mask according to the .imag parts of the answers. It would have made more sense to have power(real, real) return complex, but of course that may break some code. Jon From millman at berkeley.edu Sat May 10 05:01:00 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 10 May 2008 02:01:00 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <482454E8.8060606@gmail.com> References: <482454E8.8060606@gmail.com> Message-ID: Hello, I have obviously decided to delay tagging 1.1.0 until we resolve this. I didn't realize that numpy matrices were used in scipy or I would have brought this up before, but whatever matrix change we make in 1.1 has to work with the scipy 0.6. Unfortunately, I can't check myself at the moment; but I should be able to look into it tomorrow. Would someone please check that scipy 0.6 still works with the 1.1 branch? And let me know ASAP. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Sat May 10 05:35:12 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 10 May 2008 02:35:12 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Fri, May 9, 2008 at 8:36 AM, Timothy Hochberg wrote: > My personal opinion is that current matrix class is pretty useless and the > change won't help much from my point of view. My preference would be to > leave the matrix class alone, design a new matrix class, with a different > name, for 1.2 and then deprecate the old matrix class. Piecemeal fixing of > the matrix class is going to break someone's code and doesn't really get us > where we want to go. Sort of +1. 
I am not a user of the numpy matrix, but I am increasingly feeling that we should leave them alone for the 1.1 release. This change was more or less rushed through at the end of the 1.1 development cycle due to both the feeling that there was an eminent and unique opportunity for API breakage and a near universal agreement that the current matrix behavior was less than ideal. However, we will be releasing 1.2 in a few months and it will be entirely possible to break the matrices API then if there is an agreement that it still needs to be done. It seems that the longer this discussion has gone on, the more alternative "fixes" that are proposed and discussed. Basically almost everyone who joins the discussion suggests yet a new fix. Also not all the NumPy/SciPy core developers have been following this discussion and I want to make sure everyone has the opportunity to become engaged in this conversation before we settle on even a minor change this late in the development cycle. Moreover, I think that given the numerous bug fixes that have gone into this release, it would be a shame if this somewhat last minute API break turns out to introduce even on new bug that we don't notice before the release. Finally, I would like to apologize for not proposing a more sensible process earlier for introducing API breaking code (I am learning as I go). As we turn our attention to developing 1.2, I am going to provide more formal phases to the development process. Any code that breaks API will be required to be introduced and committed during the beginning of the development cycle. And not in the very last phase of the development cycle when we should be focused solely on critical bugs, major regression, documentation, and extensive testing. So unless there are major objections, I am going to back out the matrices changes in the 1.1 branch. I will leave whatever changes have occurred in the trunk. Sorry if you feel I am overstepping my "authority". If so, please let me know. My goal is not to cut short discussion so much as to move the 1.1.0 release forward, so that we can get the numerous bug fixes and improvements to our users while also letting the developers start working on the next release (1.2) as soon as possible. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Sat May 10 05:47:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 10 May 2008 04:47:34 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: <3d375d730805100247u451d3a0dqb3d7396804df5dcf@mail.gmail.com> On Sat, May 10, 2008 at 4:35 AM, Jarrod Millman wrote: > However, we will be releasing 1.2 in a few months and it will be > entirely possible to break the matrices API then if there is an > agreement that it still needs to be done. Please, let's have a firm policy of having a DeprecationWarning for at least one revision of 1.x before breaking something in 1.x+1. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From millman at berkeley.edu Sat May 10 10:54:37 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 10 May 2008 07:54:37 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <3d375d730805100247u451d3a0dqb3d7396804df5dcf@mail.gmail.com> References: <482454E8.8060606@gmail.com> <3d375d730805100247u451d3a0dqb3d7396804df5dcf@mail.gmail.com> Message-ID: On Sat, May 10, 2008 at 2:47 AM, Robert Kern wrote: > On Sat, May 10, 2008 at 4:35 AM, Jarrod Millman wrote: > >> However, we will be releasing 1.2 in a few months and it will be >> entirely possible to break the matrices API then if there is an >> agreement that it still needs to be done. > > Please, let's have a firm policy of having a DeprecationWarning for at > least one revision of 1.x before breaking something in 1.x+1. +1 I agree. Let's pretend I said "... it will be entirely possible to a DeprecationWarning for the matrices API with the 1.2 release and switching to the a new API in 1.3 (if there is an agreement as to what the new behavior should be before the 1.2 release) ...". -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From aisaac at american.edu Sat May 10 11:08:15 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 10 May 2008 11:08:15 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, 10 May 2008, Jarrod Millman wrote: > unless there are major objections, I am going to back out > the matrices changes in the 1.1 branch. If these are backed out, will some kind of deprecation warning be added for scalar indexing, as Travis suggested? Robert's request seems in accord with this. Cheers, Alan From kwgoodman at gmail.com Sat May 10 11:14:39 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 10 May 2008 08:14:39 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 8:08 AM, Alan G Isaac wrote: > On Sat, 10 May 2008, Jarrod Millman wrote: >> unless there are major objections, I am going to back out >> the matrices changes in the 1.1 branch. > > If these are backed out, will some kind of deprecation > warning be added for scalar indexing, as Travis suggested? > Robert's request seems in accord with this. Shouldn't a deprecation warning explain what the future behavior will be? Is there a firm consensus on what that behavior will be? From stefan at sun.ac.za Sat May 10 11:31:39 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 10 May 2008 17:31:39 +0200 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <4823CA39.7090203@astraw.com> References: <4823CA39.7090203@astraw.com> Message-ID: <9457e7c80805100831j3bef38f6rb5158213e4df7165@mail.gmail.com> Hi Andrew 2008/5/9 Andrew Straw : > I've got a big element array (25 million int64s) that searchsorted() > takes a long time to grind through. After a bit of digging in the > literature and the numpy source code, I believe that searchsorted() is > implementing a classic binary search, which is pretty bad in terms of > cache misses. There are several modern implementations of binary search > which arrange items in memory such that cache misses are much more rare. 
> Clearly making such an indexing arrangement would take time, but in my > particular case, I can spare the time to create an index if searching > was faster, since I'd make the index once but do the searching many times. > > Is there an implementation of such an algorithm that works easilty with > numpy? Also, can you offer any advice, suggestions, and comments to me > if I attempted to implement such an algorithm? If found Francesc Altet's Pyrex implementation at http://mail.python.org/pipermail/python-list/2007-November/466503.html I modified it for use with Cython and added some tests: https://code.launchpad.net/~stefanv/+junk/my_bisect That may be a good starting point for further experimentation. As it is, it is already about 10 times faster than the built-in version (since I can assume we're working with int64's, so no special type checking is done). Regards St?fan From alan.mcintyre at gmail.com Sat May 10 12:02:15 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sat, 10 May 2008 12:02:15 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: <1d36917a0805100902t56473394v5b68d00af60857e4@mail.gmail.com> On Sat, May 10, 2008 at 11:14 AM, Keith Goodman wrote: > Shouldn't a deprecation warning explain what the future behavior will > be? Is there a firm consensus on what that behavior will be? For what the opinion of an interested observer is worth: I honestly can't tell whether there's a consensus or not. If there is, I'm not quite sure where I'd look to get the details of the 'official' behavior (but that might just be due to my lack of familiarity with the way things get documented for NumPy). From aisaac at american.edu Sat May 10 12:24:00 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 10 May 2008 12:24:00 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: >> On Sat, 10 May 2008, Jarrod Millman wrote: >>> unless there are major objections, I am going to back out >>> the matrices changes in the 1.1 branch. > On Sat, May 10, 2008 at 8:08 AM, Alan G Isaac > wrote: >> If these are backed out, will some kind of deprecation >> warning be added for scalar indexing, as Travis >> suggested? Robert's request seems in accord with this. On Sat, 10 May 2008, Keith Goodman apparently wrote: > Shouldn't a deprecation warning explain what the future > behavior will be? I do not think so. I think the warning should say: "use x[0,:] instead of x[0] to return row 0 as a matrix." > Is there a firm consensus on what that behavior will be? My understanding of the plan: - step 1: scalar indexing and iteration returns 1d arrays. - possible step 2: scalar indexing and iteration return "oriented" 1d-array-like objects ("vectors") suitable for linear algebra. Personally, I only care that step 1 is achieved. Cheers, Alan Isaac From kwgoodman at gmail.com Sat May 10 12:53:25 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 10 May 2008 09:53:25 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 9:24 AM, Alan G Isaac wrote: > On Sat, 10 May 2008, Keith Goodman apparently wrote: >> Shouldn't a deprecation warning explain what the future >> behavior will be? > > I do not think so. I think the warning should say: > "use x[0,:] instead of x[0] to return row 0 as a matrix." That would confuse me. 
I would think it meant that I have to use x[0,:] right now. Maybe you could add something that explains that this is in the future? Something like: "In future versions x[0] will not return row 0 as a matrix, but x[0,:] will continue to return a matrix"? From charlesr.harris at gmail.com Sat May 10 12:55:53 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 10 May 2008 10:55:53 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <9457e7c80805100831j3bef38f6rb5158213e4df7165@mail.gmail.com> References: <4823CA39.7090203@astraw.com> <9457e7c80805100831j3bef38f6rb5158213e4df7165@mail.gmail.com> Message-ID: On Sat, May 10, 2008 at 9:31 AM, St?fan van der Walt wrote: > Hi Andrew > > 2008/5/9 Andrew Straw : > > I've got a big element array (25 million int64s) that searchsorted() > > takes a long time to grind through. After a bit of digging in the > > literature and the numpy source code, I believe that searchsorted() is > > implementing a classic binary search, which is pretty bad in terms of > > cache misses. There are several modern implementations of binary search > > which arrange items in memory such that cache misses are much more rare. > > Clearly making such an indexing arrangement would take time, but in my > > particular case, I can spare the time to create an index if searching > > was faster, since I'd make the index once but do the searching many > times. > > > > Is there an implementation of such an algorithm that works easilty with > > numpy? Also, can you offer any advice, suggestions, and comments to me > > if I attempted to implement such an algorithm? > > If found Francesc Altet's Pyrex implementation at > > http://mail.python.org/pipermail/python-list/2007-November/466503.html > > I modified it for use with Cython and added some tests: > > https://code.launchpad.net/~stefanv/+junk/my_bisect > > That may be a good starting point for further experimentation. As it > is, it is already about 10 times faster than the built-in version. The built in version is in c, but not type specific. It could be moved to the generated code base easily enough. The slow part is here if (compare(parr + elsize*imid, pkey, key) < 0) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sat May 10 13:50:23 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 10 May 2008 10:50:23 -0700 Subject: [Numpy-discussion] ix_ with matrix input Message-ID: Would it break a numpy design principle to allow ix_ to take 1xn and nx1 matrices as input? Here's the use case I had in mind: >> import numpy.matlib as mp >> x = mp.asmatrix(mp.arange(9).reshape(3,3)) >> ridx = x.sum(1) > 3 >> cidx = x.sum(0) > 9 >> x[mp.ix_(ridx, cidx)] --------------------------------------------------------------------------- ValueError: Cross index must be 1 dimensional Workaround (convert to arrays): >> ridx = x.A.sum(1) > 3 >> cidx = x.A.sum(0) > 9 >> x[mp.ix_(ridx, cidx)] From millman at berkeley.edu Sat May 10 15:31:41 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 10 May 2008 12:31:41 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 8:14 AM, Keith Goodman wrote: >> If these are backed out, will some kind of deprecation >> warning be added for scalar indexing, as Travis suggested? >> Robert's request seems in accord with this. 
> > Shouldn't a deprecation warning explain what the future behavior will > be? Is there a firm consensus on what that behavior will be? I agree that a deprecation warning needs to explain the future behavior and don't believe we have agreed on what it should be yet. I don't personally have an opinion on the what the new behavior should be at this point. But I don't think it makes sense to add deprecation warnings at this point--unless we know exactly what it is that we will be doing in the future. So while it could be argued that it would be useful to say "use x[0,:] instead of x[0] to return row 0 as a matrix" in the 1.1 release, my sense is that we will still need to replace that DeprecationWarning in 1.2 with a new DeprecationWarning saying something like "in 1.3 x[0] will return .... and if you want .... you will need to call x[0,:]". I don't think we should have two different DeprecationWarnings in back to back releases and believe that we should wait to include the warning until we decide what the new behavior will be in 1.2. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From aisaac at american.edu Sat May 10 16:07:03 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 10 May 2008 16:07:03 -0400 Subject: [Numpy-discussion] warn of matrix change? In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, 10 May 2008, Jarrod Millman apparently wrote: > I don't think it makes sense to add deprecation warnings > at this point--unless we know exactly what it is that we > will be doing in the future. My last comment on this ... As a user, I would want to know *now* that the behavior of scalar indexing is targeted for a change, especially since I can easily write my code *now* to avoid being affected by this. E.g., suppose instead of incorporating Travis's patch (which I support) NumPy adopted Gael's proposal (and Tim's?) to raise an IndexError for scalar indexes. In either case, code written with non-scalar indexes will continue to work as expected, but code written with scalar indexes will often break. So it only seems fair to include a warning now. Cheers, Alan Isaac From peridot.faceted at gmail.com Sat May 10 16:05:38 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 10 May 2008 16:05:38 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: 2008/5/10 Jarrod Millman : > On Sat, May 10, 2008 at 8:14 AM, Keith Goodman wrote: >>> If these are backed out, will some kind of deprecation >>> warning be added for scalar indexing, as Travis suggested? >>> Robert's request seems in accord with this. >> >> Shouldn't a deprecation warning explain what the future behavior will >> be? Is there a firm consensus on what that behavior will be? > > I agree that a deprecation warning needs to explain the future > behavior and don't believe we have agreed on what it should be yet. > > I don't personally have an opinion on the what the new behavior should > be at this point. But I don't think it makes sense to add deprecation > warnings at this point--unless we know exactly what it is that we will > be doing in the future. 
So while it could be argued that it would be > useful to say "use x[0,:] instead of x[0] to return row 0 as a matrix" > in the 1.1 release, my sense is that we will still need to replace > that DeprecationWarning in 1.2 with a new DeprecationWarning saying > something like "in 1.3 x[0] will return .... and if you want .... you > will need to call x[0,:]". I don't think we should have two different > DeprecationWarnings in back to back releases and believe that we > should wait to include the warning until we decide what the new > behavior will be in 1.2. Regrettably I have to agree that we can't introduce a DeprecationWarning until 1.2. I don't agree that x[0,:] should return a matrix! It should be the one-dimensional object it looks like. See Keith Goodman's recent message - http://projects.scipy.org/pipermail/numpy-discussion/2008-May/033726.html - for an example of why representing 1D objects as 2D objects is trouble. I am of the opinion that matrix indexing should be identical to array indexing, insofar as that is possible. I don't expect my opinion to prevail, but the point is that we do not even have enough consensus to agree on a recommendation to go in the DeprecationWarning. Alas. Anne From wnbell at gmail.com Sat May 10 16:21:33 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 10 May 2008 15:21:33 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 3:05 PM, Anne Archibald wrote: > > I don't expect my opinion to prevail, but the point is that we do not > even have enough consensus to agree on a recommendation to go in the > DeprecationWarning. Alas. > Would you object to raising a general Warning with a message like the following? "matrix indexing of the form x[0] is ambiguous, consider the explicit format x[0,:]" -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From efiring at hawaii.edu Sat May 10 16:30:19 2008 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 10 May 2008 10:30:19 -1000 Subject: [Numpy-discussion] ticket 788: possible blocker Message-ID: <482605DB.9090709@hawaii.edu> Jarrod et al., I just ran into a nasty bug, described in ticket 788: under some circumstances, which I don't understand, the astype method fails to return a copy and returns the original array instead. It causes bizarre behavior in basemap (Jeff Whitaker's mapping toolkit for matplotlib), which is where I stumbled over it. The ticket includes a pickled array that illustrates the behavior with 1.2.0.dev5150--at least on my system (ubuntu feisty, 32-bit). Eric From peridot.faceted at gmail.com Sat May 10 16:37:18 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 10 May 2008 16:37:18 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: 2008/5/10 Nathan Bell : > On Sat, May 10, 2008 at 3:05 PM, Anne Archibald > wrote: >> >> I don't expect my opinion to prevail, but the point is that we do not >> even have enough consensus to agree on a recommendation to go in the >> DeprecationWarning. Alas. > > Would you object to raising a general Warning with a message like the following? > > "matrix indexing of the form x[0] is ambiguous, consider the explicit > format x[0,:]" Well, since I at least have proposed changing the latter as well, the warning doesn't actually help with deprecation. 
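To make the suggestion above concrete, a sketch of what emitting such a warning from a matrix subclass could look like; warnmatrix is an assumed name, and nothing like this has been committed anywhere:

```python
import warnings
import numpy as np

class warnmatrix(np.matrix):
    # Sketch: warn on scalar indexing like x[0], but leave the result unchanged.
    def __getitem__(self, index):
        if isinstance(index, int):
            warnings.warn("matrix indexing of the form x[0] is ambiguous, "
                          "consider the explicit format x[0,:]",
                          Warning, stacklevel=2)
        return np.matrix.__getitem__(self, index)

# x = warnmatrix('1 2; 3 4'); x[0]  -> warns, and still returns the first row as a 2-d matrix
```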
It does reduce the amount of beginner foot-shooting (as in x[0][0]), I guess, so I'm not opposed to it. Anne From millman at berkeley.edu Sat May 10 17:30:38 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 10 May 2008 14:30:38 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 1:21 PM, Nathan Bell wrote: > Would you object to raising a general Warning with a message like the following? > > "matrix indexing of the form x[0] is ambiguous, consider the explicit > format x[0,:]" -1 I am not certain that there is universal agreement as to what x[0] or x[0,:] should return. Moreover, I don't think the phrase "ambiguous" is entirely correct. My understanding is that x[0] has a specified behavior at this point and that in a later release it will still have a specified behavior. But there is some discussion as to whether we should change that specific behavior. So it isn't that it is ambiguous exactly, but that we may change its meaning. I guess it might be reasonable to say something like "the developers are currently discussing changing the meaning of x[0] (and perhaps even x[0,:]), but we haven't agreed on how, or even if, we will change it." [If we are going to say that, we might want to suggest that our users consider using arrays--not that I am arguing for that :)] I don't think it is worth focusing much more effort on trying to come up with some language that we all agree on to indicate that we are discussing changing this in the future. We have made considerable progress in preparing for 1.1.0; we have nearly doubled the number of tests, fixed nearly 200 bugs, greatly improved the quality and coverage of our documentation, vastly improved MaskedArrays, fixed the histogram, etc. I think it may be time to release what we have at this point and start working on 1.2. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From tim.hochberg at ieee.org Sat May 10 17:35:51 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sat, 10 May 2008 14:35:51 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 1:37 PM, Anne Archibald wrote: > 2008/5/10 Nathan Bell : > > On Sat, May 10, 2008 at 3:05 PM, Anne Archibald > > wrote: > >> > >> I don't expect my opinion to prevail, but the point is that we do not > >> even have enough consensus to agree on a recommendation to go in the > >> DeprecationWarning. Alas. > > > > Would you object to raising a general Warning with a message like the > following? > > > > "matrix indexing of the form x[0] is ambiguous, consider the explicit > > format x[0,:]" > > Well, since I at least have proposed changing the latter as well, the > warning doesn't actually help with deprecation. It does reduce the > amount of beginner foot-shooting (as in x[0][0]), I guess, so I'm not > opposed to it. > Please, let's just leave the current matrix class alone. Any change sufficient to make matrix not terrible, will break everyone's code. Instead, the goal should be build a new matrix class (say newmatrix) where we can start from scratch. We can build it and keep it alpha for a while so we can break the interface as needed while we get experience with it and then, maybe, transition matrix to oldmatrx and newmatrix to matrix over a couple of releases. 
Or just choose a reasonable name for matrix to begin with (ndmatrix?) and then deprecate the current matrix class when the new one is fully baked. Trying to slowly morph matrix into something usable is just going to break code again and again and just be a giant headache in general. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sat May 10 17:50:47 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 10 May 2008 14:50:47 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: On Sat, May 10, 2008 at 2:35 PM, Timothy Hochberg wrote: > Please, let's just leave the current matrix class alone. Any change > sufficient to make matrix not terrible, will break everyone's code. Instead, > the goal should be build a new matrix class (say newmatrix) where we can > start from scratch. We can build it and keep it alpha for a while so we can > break the interface as needed while we get experience with it and then, > maybe, transition matrix to oldmatrx and newmatrix to matrix over a couple > of releases. Or just choose a reasonable name for matrix to begin with > (ndmatrix?) and then deprecate the current matrix class when the new one is > fully baked. > > Trying to slowly morph matrix into something usable is just going to break > code again and again and just be a giant headache in general. +1 That sounds ideal to me. What's the first step? A design spec on the wiki? From efiring at hawaii.edu Sat May 10 18:09:00 2008 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 10 May 2008 12:09:00 -1000 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <482605DB.9090709@hawaii.edu> References: <482605DB.9090709@hawaii.edu> Message-ID: <48261CFC.2020601@hawaii.edu> I have added a patch to the ticket. I believe it fixes the problem. It required mirroring a very complicated logical expression from PyArray_CastToType in array_cast. I suspect that for readability, this expression should be encapsulated somewhere as a function, with a signature like int PyArray_EquivDescr(PyArray_Descr *descr1, PyArray_Descr *descr2) returning True if they are equivalent. I don't understand the conditions well enough to be confident about this, however. Eric Eric Firing wrote: > Jarrod et al., > > I just ran into a nasty bug, described in ticket 788: under some > circumstances, which I don't understand, the astype method fails to > return a copy and returns the original array instead. It causes bizarre > behavior in basemap (Jeff Whitaker's mapping toolkit for matplotlib), > which is where I stumbled over it. > > The ticket includes a pickled array that illustrates the behavior with > 1.2.0.dev5150--at least on my system (ubuntu feisty, 32-bit). 
> > Eric > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From peridot.faceted at gmail.com Sat May 10 18:28:17 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 10 May 2008 18:28:17 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: 2008/5/10 Timothy Hochberg : > On Sat, May 10, 2008 at 1:37 PM, Anne Archibald > wrote: >> >> 2008/5/10 Nathan Bell : >> > On Sat, May 10, 2008 at 3:05 PM, Anne Archibald >> > wrote: >> >> >> >> I don't expect my opinion to prevail, but the point is that we do not >> >> even have enough consensus to agree on a recommendation to go in the >> >> DeprecationWarning. Alas. >> > >> > Would you object to raising a general Warning with a message like the >> > following? >> > >> > "matrix indexing of the form x[0] is ambiguous, consider the explicit >> > format x[0,:]" >> >> Well, since I at least have proposed changing the latter as well, the >> warning doesn't actually help with deprecation. It does reduce the >> amount of beginner foot-shooting (as in x[0][0]), I guess, so I'm not >> opposed to it. > > Please, let's just leave the current matrix class alone. Any change > sufficient to make matrix not terrible, will break everyone's code. Instead, > the goal should be build a new matrix class (say newmatrix) where we can > start from scratch. We can build it and keep it alpha for a while so we can > break the interface as needed while we get experience with it and then, > maybe, transition matrix to oldmatrx and newmatrix to matrix over a couple > of releases. Or just choose a reasonable name for matrix to begin with > (ndmatrix?) and then deprecate the current matrix class when the new one is > fully baked. > > Trying to slowly morph matrix into something usable is just going to break > code again and again and just be a giant headache in general. +1 I think something along the lines of the Numerical Ruby code you pointed to some time ago is a good idea, that is, an "array of matrices" object and an "array of vectors" object. There will be considerable challenges - distinguishing the two kinds of index manipulations on arrays of matrices (those that transpose every matrix in the array, for eample, and those that rearrange the array of matrices). Since teaching is one of the main current applications of matrices, these should be presented in as non-painful a way as possible. Anyway, I think I am in favour of returning the 1.1 matrix behaviour to its 1.0 form and releasing the thing. Anne From gael.varoquaux at normalesup.org Sun May 11 10:09:18 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 11 May 2008 16:09:18 +0200 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> <48247438.3040003@esrf.fr> Message-ID: <20080511140918.GC28418@phare.normalesup.org> On Fri, May 09, 2008 at 12:53:40PM -0500, Nathan Bell wrote: > True, but scipy.sparse makes fairly limited use of matrix and I have > 386 unittests to tell me what broke. End-users might spend > considerably longer sorting out the problem, particularly if they > don't know what they're looking for. Personally, I would not have > thought a 1.0 -> 1.1 version bump would break something like this. > Yes, we can put caveats in the release notes, but how many > numpy.matrix users read those? Exact. 
Don't break existing code, or users will hate you, and eventually stop using your code. In real life, many of the domain-specific libraries still use numeric or numarray. People like to write code, and forget about it for 10 years. If you change APIs, they cannot do this. > Alan, I don't fundamentally disagree with your positions on the > deficiencies/quirks of matrices in numpy. However, it's completely > inappropriate to plug one hole while creating others, especially in a > minor release. I suspect that if we surveyed end-users we'd find > that "my code still works" is a much higher priority than "A[0][0] now > does what I expect". +1. Plugging one annoyance to create another one (IMHO worse), and in addition breaking backward compatibility seems utterly wrong to me. Gaël From gael.varoquaux at normalesup.org Sun May 11 10:22:31 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 11 May 2008 16:22:31 +0200 Subject: [Numpy-discussion] MoinMoin <-> docstrings gateway In-Reply-To: <1e52e0880805051719l6892cff3ta766dbdc8a4ed4d0@mail.gmail.com> References: <1209946640.13399.6.camel@localhost.localdomain> <20080505054800.GC8593@phare.normalesup.org> <1e52e0880805051719l6892cff3ta766dbdc8a4ed4d0@mail.gmail.com> Message-ID: <20080511142231.GD28418@phare.normalesup.org> On Mon, May 05, 2008 at 08:19:10PM -0400, dieter h wrote: > I humbly suggest Sphinx[1] as the generating tool and markup rather > than just regular ReST. I'm in the process of converting all of my > local documentation to this engine's format. I agree with you that Sphinx rocks hard. Ipython, sympy and Mayavi are now using it, and it makes great docs. However I would absolutely avoid using in docstrings something that is not pure ReST. There are many applications, formatters, wiki engines, web apps, ... that can render ReST. You make your life much easier if you stick with ReST. For your interest, with Mayavi, we have the docstrings in ReST, and are generating a Sphinx document using them. My 2 cents, Gaël From gael.varoquaux at normalesup.org Sun May 11 10:53:46 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 11 May 2008 16:53:46 +0200 Subject: [Numpy-discussion] Log Arrays In-Reply-To: References: <5b8d13220805080820g77c053eybf6f5450fd192ba4@mail.gmail.com> Message-ID: <20080511145346.GE28418@phare.normalesup.org> On Thu, May 08, 2008 at 10:04:28AM -0600, Charles R Harris wrote: > What realistic probability is in the range exp(-1000) ? I recently tried to do Fisher information estimation of a very noisy experiment. For this I needed to calculate the norm of the derivative of the probability over the entire state space as a function of a hidden parameter. My probabilities were very small over most of the phase space, and calculating the derivative in these regions was very numerically unstable. However I could not drop these regions as they were contributing significantly to the Fisher information. I could never get clean results (everything was very noisy) and I gave up. This was just to say that extremely small numbers, and their difference, can be very significant, and that there is a strong use case for logarithmic operations (actually I should probably have worked a bit on my formulas to express them all in logarithm form, but I ran out of time). 
Ga?l From aisaac at american.edu Sun May 11 12:26:49 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 11 May 2008 12:26:49 -0400 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <20080511140918.GC28418@phare.normalesup.org> References: <482454E8.8060606@gmail.com><48247438.3040003@esrf.fr> <20080511140918.GC28418@phare.normalesup.org> Message-ID: On Sun, 11 May 2008, Gael Varoquaux apparently wrote: > Pluging one annoyance to create another one (IMHO worse), > and in addition breaking backward compatibility seems > utterly wrong to me. I'm a little puzzled by this phrasing. As Anne pointed out, examples are accumulating that there is a *fundamental* problem with matrix handling of scalar indexing. I agree with this. It is not just an "annoyance". It keeps affecting code that tries to handle both matrices and arrays in a generic way. Your phrasing suggests that the only solution is to live with this forever, always cobbling new workarounds, unless backward compatability can be ensured for more sensible behavior, which it pretty clearly cannot. Is that your current stance? I suspect part of the problem is that "backward compatability" is being interpreted in terms of discoverable behavior, rather than in terms of documented behavior. Alan PS Earlier I thought you favored raising an error in response to scalar indexing of matrices... From gael.varoquaux at normalesup.org Sun May 11 12:42:01 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 11 May 2008 18:42:01 +0200 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <20080511140918.GC28418@phare.normalesup.org> Message-ID: <20080511164201.GG28418@phare.normalesup.org> On Sun, May 11, 2008 at 12:26:49PM -0400, Alan G Isaac wrote: > As Anne pointed out, examples are accumulating that there is > a *fundamental* problem with matrix handling of scalar > indexing. I agree with this. It is not just an > "annoyance". It keeps affecting code that tries to handle > both matrices and arrays in a generic way. I don't care, personally. This is a problem. I agree with you. Breaking existing code is a major disturbance. It as to be weighted with the gains. The solution of adding another bug elsewhere to plug this problem is not good to me. This is why I favor the proposal #1 on your list of propsal http://www.scipy.org/MatrixIndexing , because it introduces the minimal amount of changes to the interfaces. > Your phrasing suggests that the only solution is to live > with this forever, always cobbling new workarounds, unless > backward compatability can be ensured for more sensible > behavior, which it pretty clearly cannot. Is that your > current stance? No, but I am pretty close to this. > I suspect part of the problem is that "backward > compatability" is being interpreted in terms of discoverable > behavior, rather than in terms of documented behavior. Not at all. It is to be interpreted in terms of "I have a few dozens kilolines of code left by a student who work with numpy 1.0, if I upgrade numpy, will they still work?". I do realize the limits of freezing the behavior of software: bugware. I am not for a complete freeze, I am for a very well-thought move forward, that introduces a minimum of breakage. 
Ga?l From aisaac at american.edu Sun May 11 13:16:44 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 11 May 2008 13:16:44 -0400 Subject: [Numpy-discussion] promises, promises In-Reply-To: <20080511164201.GG28418@phare.normalesup.org> References: <20080511140918.GC28418@phare.normalesup.org> <20080511164201.GG28418@phare.normalesup.org> Message-ID: >> I suspect part of the problem is that "backward >> compatability" is being interpreted in terms of discoverable >> behavior, rather than in terms of documented behavior. On Sun, 11 May 2008, Gael Varoquaux wrote: > Not at all. It is to be interpreted in terms of "I have > a few dozens kilolines of code left by a student who work > with numpy 1.0, if I upgrade numpy, will they still > work?". That seems like the same thing to me. What I mean is that part of the problem is that the handling of scalar indexing is being treated as part of the API contract simply because it was discoverable behavior. To be specific: I do not recall any place in the NumPy Book where this behavior is promised. When behavior is not promised, one feels fewer regrets in changing it. Moving forward, I hope we will find a way to be more careful about making a minimum of promises. Cheers, Alan From gael.varoquaux at normalesup.org Sun May 11 13:34:44 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 11 May 2008 19:34:44 +0200 Subject: [Numpy-discussion] promises, promises In-Reply-To: References: <20080511164201.GG28418@phare.normalesup.org> Message-ID: <20080511173444.GL28418@phare.normalesup.org> On Sun, May 11, 2008 at 01:16:44PM -0400, Alan G Isaac wrote: > That seems like the same thing to me. What I mean is that > part of the problem is that the handling of scalar indexing > is being treated as part of the API contract simply because > it was discoverable behavior. OK, I didn't understand you properly. > Moving forward, I hope we will find a way to be more careful > about making a minimum of promises. You are touching here a delicate part of software development. Every major library faces this problem at some point. I don't have an answer to this problem, unfortunately. I think it is good to keep it in mind, though. Ga?l From peridot.faceted at gmail.com Sun May 11 14:04:13 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 11 May 2008 14:04:13 -0400 Subject: [Numpy-discussion] Writing new ufuncs Message-ID: Hi, Suppose I have a C function, double logsum(double a, double b); What is needed to produce a ufunc object whose elementwise operation is done by calling this function? Also, is there a way to take a python function and automatically make a ufunc out of it? (No, vectorize doesn't implement reduce(), accumulate(), reduceat(), or outer().) Thanks, Anne From alan.mcintyre at gmail.com Sun May 11 14:12:22 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sun, 11 May 2008 14:12:22 -0400 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: References: Message-ID: <1d36917a0805111112n409233a0ob5b5b00bcb2c764a@mail.gmail.com> On Sun, May 11, 2008 at 2:04 PM, Anne Archibald wrote: > Also, is there a way to take a python function and automatically make > a ufunc out of it? (No, vectorize doesn't implement reduce(), > accumulate(), reduceat(), or outer().) I've not used it, but have a look at numpy.frompyfunc; its docstring suggests it turns arbitrary Python functions into ufuncs. 
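A minimal sketch of the frompyfunc route for the logsum example above; the stable log-add formula and the names logsum/ulogsum are illustrative assumptions, not Anne's actual C routine. The resulting ufunc does support reduce(), accumulate() and outer(), but its results come back as object arrays:

```python
import numpy as np
from math import exp, log1p

def logsum(a, b):
    # log(exp(a) + exp(b)), computed without underflow/overflow
    hi, lo = (a, b) if a >= b else (b, a)
    return hi + log1p(exp(lo - hi))

ulogsum = np.frompyfunc(logsum, 2, 1)   # 2 inputs, 1 output

x = np.array([-1000.0, -1001.0, -1002.0])
total = ulogsum.reduce(x)               # reduce/accumulate/outer all work
print(float(total))                     # about -999.59; naive log(exp(x).sum()) underflows to -inf
# Caveat: elementwise results are object arrays; cast with
# np.asarray(ulogsum(x, x), dtype=float) when a float array is needed.
```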
From peridot.faceted at gmail.com Sun May 11 14:17:23 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 11 May 2008 14:17:23 -0400 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: <1d36917a0805111112n409233a0ob5b5b00bcb2c764a@mail.gmail.com> References: <1d36917a0805111112n409233a0ob5b5b00bcb2c764a@mail.gmail.com> Message-ID: 2008/5/11 Alan McIntyre : > On Sun, May 11, 2008 at 2:04 PM, Anne Archibald > wrote: >> Also, is there a way to take a python function and automatically make >> a ufunc out of it? (No, vectorize doesn't implement reduce(), >> accumulate(), reduceat(), or outer().) > > I've not used it, but have a look at numpy.frompyfunc; its docstring > suggests it turns arbitrary Python functions into ufuncs. Ah, thanks. Shame it produces object arrays though. Anne From kwgoodman at gmail.com Sun May 11 15:01:08 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 11 May 2008 12:01:08 -0700 Subject: [Numpy-discussion] A new matrix class Message-ID: The most basic, and the most contentious, design decision of a new matrix class is matrix indexing. There seems to be two camps: 1. The matrix class should be more like the array class. In particular x[0,:] should return a 1d array or a 1d array like object that contains the orientation (row or column) as an attribute and x[0] should return a 1d array. (Is x.sum(1) also a 1d array like object?) 2. A matrix is a matrix: all operations on a matrix, including indexing, should return a matrix or a scalar. Does that describe the two approaches to matrix indexing? Are there other approaches? From charlesr.harris at gmail.com Sun May 11 15:44:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 11 May 2008 13:44:11 -0600 Subject: [Numpy-discussion] A new matrix class In-Reply-To: References: Message-ID: On Sun, May 11, 2008 at 1:01 PM, Keith Goodman wrote: > The most basic, and the most contentious, design decision of a new > matrix class is matrix indexing. There seems to be two camps: > > 1. The matrix class should be more like the array class. In particular > x[0,:] should return a 1d array or a 1d array like object that > contains the orientation (row or column) as an attribute and x[0] > should return a 1d array. (Is x.sum(1) also a 1d array like object?) > > 2. A matrix is a matrix: all operations on a matrix, including > indexing, should return a matrix or a scalar. > > Does that describe the two approaches to matrix indexing? Are there > other approaches? > _________ Pretty well, I think. The thing about 2) is that ndarray routines break if they can't treat arrays as nested sequences, i.e. scalar indexing needs to return an array of one less dimension. So the matrix class shouldn't subclass ndarray in that case, but rather should use an ndarray as a component. More code to write, but such is life. 3) Everything is an array. I think Matlab treats scalars as 1x1 arrays. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sun May 11 16:06:22 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 11 May 2008 13:06:22 -0700 Subject: [Numpy-discussion] A new matrix class In-Reply-To: References: Message-ID: On Sun, May 11, 2008 at 12:44 PM, Charles R Harris wrote: > On Sun, May 11, 2008 at 1:01 PM, Keith Goodman wrote: >> >> The most basic, and the most contentious, design decision of a new >> matrix class is matrix indexing. There seems to be two camps: >> >> 1. The matrix class should be more like the array class. 
In particular >> x[0,:] should return a 1d array or a 1d array like object that >> contains the orientation (row or column) as an attribute and x[0] >> should return a 1d array. (Is x.sum(1) also a 1d array like object?) >> >> 2. A matrix is a matrix: all operations on a matrix, including >> indexing, should return a matrix or a scalar. >> >> Does that describe the two approaches to matrix indexing? Are there >> other approaches? >> _________ > > Pretty well, I think. The thing about 2) is that ndarray routines break if > they can't treat arrays as nested sequences, i.e. scalar indexing needs to > return an array of one less dimension. So the matrix class shouldn't > subclass ndarray in that case, but rather should use an ndarray as a > component. More code to write, but such is life. > > 3) Everything is an array. I think Matlab treats scalars as 1x1 arrays. So #3 is behave in a similar way to octave/matlab? In octave/matlab 1x1 matrices are handled in special ways, e.g. (nxm) * (1x1) is allowed. From the perspective of the user a 1x1 matrix is a scalar. From kwgoodman at gmail.com Sun May 11 16:39:49 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 11 May 2008 13:39:49 -0700 Subject: [Numpy-discussion] A new matrix class In-Reply-To: References: Message-ID: I added a wiki page: http://scipy.org/NewMatrixSpec From robert.kern at gmail.com Sun May 11 22:35:03 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 May 2008 21:35:03 -0500 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: References: Message-ID: <3d375d730805111935m5a4c4dacv964fd37495a86dc4@mail.gmail.com> On Sun, May 11, 2008 at 1:04 PM, Anne Archibald wrote: > Hi, > > Suppose I have a C function, > double logsum(double a, double b); > What is needed to produce a ufunc object whose elementwise operation > is done by calling this function? Basically, you need 3 arrays: functions implementing the type-specific inner loops, void* extra data to pass to these functions, and an array of arrays containing the type signatures of the ufunc. In numpy, we already have generic implementations of the loop functions for common combinations of types. In your case, for a binary function taking two doubles and returning a double, we have PyUFunc_dd_d(). As its extra void* data, it takes a function pointer that actually implements the element-wise operation. So lets start making the arrays: static char logsum_sigs[] = { NPY_FLOAT64, NPY_FLOAT64, NPY_FLOAT64 }; static PyUFuncGenericFunction logsum_functions[] = {PyUFunc_dd_d}; static void* logsum_data[] = {logsum}; static char logsum_doc[] = "Some kind of docstring."; Now in your initmodule() function, we will call PyUFunc FromFuncAndData() with this information. PyMODINIT_FUNC initmodule(void) { PyObject *m, *f, *d; m = Py_InitModule3("module", module_methods, "Some new ufuncs.\n" ); if (m == NULL) return; d = PyModule_GetDict(m); if (d == NULL) return; import_array(); import_umath(); f = PyUFunc_FromFuncAndData( logsum_functions, logsum_data, logsum_sigs, 1, // The number of type signatures. 2, // The number of inputs. 1, // The number of outputs. PyUFunc_None, // The identity element for reduction. // No good one to use for this function, // unfortunately. "logsum", // The name of the ufunc. logsum_doc, 0 // Dummy for API backwards compatibility. 
); PyDict_SetItemString(d, "logsum", f); Py_DECREF(f); } Since double functions can easily be used on float32's, too, you can also add PyUFunc_ff_f to the logsum_functions array and NPY_FLOAT32, NPY_FLOAT32, NPY_FLOAT32, to the corresponding place in the logsum_sigs array. Just bump the number of signatures up to 2. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Sun May 11 22:50:29 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 11 May 2008 22:50:29 -0400 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: <3d375d730805111935m5a4c4dacv964fd37495a86dc4@mail.gmail.com> References: <3d375d730805111935m5a4c4dacv964fd37495a86dc4@mail.gmail.com> Message-ID: 2008/5/11 Robert Kern : > Basically, you need 3 arrays: functions implementing the type-specific > inner loops, void* extra data to pass to these functions, and an array > of arrays containing the type signatures of the ufunc. In numpy, we > already have generic implementations of the loop functions for common > combinations of types. In your case, for a binary function taking two > doubles and returning a double, we have PyUFunc_dd_d(). As its extra > void* data, it takes a function pointer that actually implements the > element-wise operation. So lets start making the arrays: Great! Thanks! Is it possible to provide a specialized implementation of reduce()? (Since reduce() can be implemented more efficiently than doing it pairwise.) > PyUFunc_None, // The identity element for reduction. > // No good one to use for this function, > // unfortunately. Is it safe to use minus infinity, or is this going to give people all kinds of warnings if they have seterr() set? Anne From cournapeau at cslab.kecl.ntt.co.jp Sun May 11 22:59:27 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 12 May 2008 11:59:27 +0900 Subject: [Numpy-discussion] Going toward time-based release ? Message-ID: <1210561167.17972.17.camel@bbc8> Hi, I would like to know how people feel about going toward a time-based release process for numpy (and scipy). By time-based release, I mean: - releases of numpy are time-based, not feature based. - a precise schedule is fixed, and the release manager(s) try to enforce this schedule. Why ? I already suggested the idea a few months ago, and I relaunch the idea because believe the recent masked array + matrix issues could have been somewhat avoided with such a process (from a release point of view, of course). With a time-based release, there is a period where people can write to the release branch, try new features, and a freeze period where only bug fixes are allowed (and normally, no api changes are allowed). Also, time-based releases are by definition predictable, and as such, it is easier to plan upgrades for users, and to plan breaks for developers (for example, if we release say every 3 months, we would allow one or two releases to warn about future incompatible changes, before breaking them for real: people would know it means 6 months to change their code). The big drawback is of course someone has to do the job. I like the way bzr developers do it; every new release, someone else volunteer to do the release, so it is not always the same who do the boring job. Do other people see this suggestion as useful ? 
If yes, we would have to decide on: - a release period (3 months sounds like a reasonable period to me ?) - a schedule within a release (api breaks would only be allowed in the first month, code addition would be allowed up to two months, and only bug fixes the last month, for example). - who does the process (if nobody steps in, I would volunteer for the first round, if only for seeing how/if it works). cheers, David From robert.kern at gmail.com Sun May 11 23:05:39 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 May 2008 22:05:39 -0500 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: References: <3d375d730805111935m5a4c4dacv964fd37495a86dc4@mail.gmail.com> Message-ID: <3d375d730805112005r76190250m40617341597e401b@mail.gmail.com> On Sun, May 11, 2008 at 9:50 PM, Anne Archibald wrote: > 2008/5/11 Robert Kern : > >> Basically, you need 3 arrays: functions implementing the type-specific >> inner loops, void* extra data to pass to these functions, and an array >> of arrays containing the type signatures of the ufunc. In numpy, we >> already have generic implementations of the loop functions for common >> combinations of types. In your case, for a binary function taking two >> doubles and returning a double, we have PyUFunc_dd_d(). As its extra >> void* data, it takes a function pointer that actually implements the >> element-wise operation. So lets start making the arrays: > > Great! Thanks! > > Is it possible to provide a specialized implementation of reduce()? > (Since reduce() can be implemented more efficiently than doing it > pairwise.) I don't think so, no. >> PyUFunc_None, // The identity element for reduction. >> // No good one to use for this function, >> // unfortunately. > > Is it safe to use minus infinity, or is this going to give people all > kinds of warnings if they have seterr() set? Perhaps, but ufuncs only allow 0 or 1 for this value, currently. Also, I was wrong about using PyUFunc_ff_f. Instead, use PyUFunc_ff_f_As_dd_d. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun May 11 23:09:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 May 2008 22:09:04 -0500 Subject: [Numpy-discussion] promises, promises In-Reply-To: References: <20080511140918.GC28418@phare.normalesup.org> <20080511164201.GG28418@phare.normalesup.org> Message-ID: <3d375d730805112009n3487eebq8df6ab77048b08e0@mail.gmail.com> On Sun, May 11, 2008 at 12:16 PM, Alan G Isaac wrote: > To be specific: I do not recall any place in the NumPy Book > where this behavior is promised. It's promised in the docstring! """ A matrix is a specialized 2-d array that retains it's 2-d nature through operations """ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Sun May 11 23:14:55 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 11 May 2008 20:14:55 -0700 Subject: [Numpy-discussion] promises, promises In-Reply-To: <3d375d730805112009n3487eebq8df6ab77048b08e0@mail.gmail.com> References: <20080511140918.GC28418@phare.normalesup.org> <20080511164201.GG28418@phare.normalesup.org> <3d375d730805112009n3487eebq8df6ab77048b08e0@mail.gmail.com> Message-ID: On Sun, May 11, 2008 at 8:09 PM, Robert Kern wrote: > On Sun, May 11, 2008 at 12:16 PM, Alan G Isaac wrote: > > To be specific: I do not recall any place in the NumPy Book > > where this behavior is promised. > > It's promised in the docstring! > > """ A matrix is a specialized 2-d array that retains > it's 2-d nature through operations > """ Amen! (There's a typo in the doc string. "it's" should be "its".) From fperez.net at gmail.com Mon May 12 00:12:50 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 11 May 2008 21:12:50 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <1210561167.17972.17.camel@bbc8> References: <1210561167.17972.17.camel@bbc8> Message-ID: On Sun, May 11, 2008 at 7:59 PM, David Cournapeau wrote: > Hi, > > I would like to know how people feel about going toward a time-based > release process for numpy (and scipy). By time-based release, I mean: > - releases of numpy are time-based, not feature based. > - a precise schedule is fixed, and the release manager(s) try to > enforce this schedule. Just as a data point, Brian and I just discussed proposing the same idea to the ipython crowd. We've been in semi-stalled mode for a couple of months (life hit us hard and at the same time, and we just had too many balls up in the air at the same time). I'm offline this week, but in a few days we'll float the idea by the others on the ipython list to see what they think. I should note that the ipython circumstances are very different: numpy has a vastly larger team and is not in any way stalled. Hence consider the above just an anecdotal reference of a 'cousin' project, the specifics of numpy are different and I understand that. Cheers, f FWIW, I'm actually +1 on the idea for numpy as well, but my vote doesn't count because I don't really contribute almost any code directly. From mike.ressler at alum.mit.edu Mon May 12 00:14:08 2008 From: mike.ressler at alum.mit.edu (Mike Ressler) Date: Sun, 11 May 2008 21:14:08 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <1210561167.17972.17.camel@bbc8> References: <1210561167.17972.17.camel@bbc8> Message-ID: <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> On Sun, May 11, 2008 at 7:59 PM, David Cournapeau wrote: > Hi, > > I would like to know how people feel about going toward a time-based > release process for numpy (and scipy). -1 I'm just a common user, but please, no. The big Linux distros do this and it drives me nuts. Just when things are finally beginning to settle down, they throw another big, buggy release out the door because they are trying to meet some ridiculous 6 month cycle. "We" haven't even gotten the major distros to dump Numeric-24 yet (something else depends on it - pygtk maybe?), though most offer numpy as an option; what kind of disaster will we have with new numpy releases every 3 or 6 months? Numpy doesn't (and probably shouldn't) change that rapidly. I would really hope that the core numpy is solid, stable, and predictable. 
I like major and minor version numbers that indicate a major change is coming. If it's only a minor version number change, I know that I can update it safely and go merrily computing on my way. "2008.04" doesn't tell me if it is a minor bug fix or a major blow to my sanity. So, please, if at all possible, keep the feature numbering. Thanks for hearing me out. Mike -- mike.ressler at alum.mit.edu From millman at berkeley.edu Mon May 12 00:26:09 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 11 May 2008 21:26:09 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <1210561167.17972.17.camel@bbc8> References: <1210561167.17972.17.camel@bbc8> Message-ID: On Sun, May 11, 2008 at 7:59 PM, David Cournapeau wrote: > I would like to know how people feel about going toward a time-based > release process for numpy (and scipy). By time-based release, I mean: > - releases of numpy are time-based, not feature based. > - a precise schedule is fixed, and the release manager(s) try to > enforce this schedule. > Do other people see this suggestion as useful ? If yes, we would have to > decide on: > - a release period (3 months sounds like a reasonable period to me ?) > - a schedule within a release (api breaks would only be allowed in the > first month, code addition would be allowed up to two months, and only > bug fixes the last month, for example). > - who does the process (if nobody steps in, I would volunteer for the > first round, if only for seeing how/if it works). +1 I was basically thinking about trying this for NumPy 1.2.0, which I have been suggesting should be out by the end of August. I am happy to serve as the release manager for 1.2. We also need to get a release of SciPy out between NumPy 1.1 and 1.2. I hadn't brought this up yet, since I have been hoping to get out NumPy 1.1 before starting to plan other releases. But I think we are ready to release NumPy this week, so now is the time to start this discussion. Despite wanting to only take 3 months for 1.2, I believe that in the future it would make more sense to have a minor release of NumPy or SciPy every three months: NumPy 1.1 (May) SciPy 0.7 (July) NumPy 1.2 (August) SciPy 0.8 (November) NumPy 1.3 (February) etc. NumPy will need to work with the last released SciPy (i.e., 1.1 will need to work with 0.6) and SciPy can use features from the last NumPy release (i.e., 0.7 can depend on 1.1). Of course, we need to avoid breaking the API as much as possible. If we decided that it is necessary to break the API, we should give our users as much notice as possible. The one caveat to this is that you may recall I have tried to start having a 3 month release cycle ever since I took over release management last summer. I was almost able to do it for the first release of NumPy and SciPy. But since November of last year, I have been struggling to get out NumPy 1.1, which was originally scheduled for early February. One of the issues that I have faced in trying use the 3 month release cycle is that most of us have very busy schedules. But I think that we are gaining new developers, which should help alleviate this problem. I would be very happy to move to a time-based release schedule and would be very happy to help make this happen. I have been thinking about this for sometime, so I have a few ideas about how to make this happen. I would be very appreciative if you would be willing to work with me to get 1.2 out by the end of August. 
Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From cournapeau at cslab.kecl.ntt.co.jp Mon May 12 00:35:58 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 12 May 2008 13:35:58 +0900 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> Message-ID: <1210566958.17972.65.camel@bbc8> On Sun, 2008-05-11 at 21:14 -0700, Mike Ressler wrote: > > I'm just a common user, but please, no. The big Linux distros do this > and it drives me nuts. Just when things are finally beginning to > settle down, they throw another big, buggy release out the door > because they are trying to meet some ridiculous 6 month cycle. "We" > haven't even gotten the major distros to dump Numeric-24 yet > (something else depends on it - pygtk maybe?), though most offer numpy > as an option; what kind of disaster will we have with new numpy > releases every 3 or 6 months? I don't see what you mean by disaster. The point of having a time-based release is to avoid bugs caused by putting new, untested things late in the release (like what is happening for 1.1 with ma/matrix), and to be able to plan those changes. I personally would really like to avoid the recent matrix thing: just when 1.1 was about to be released, several months late already, a new change which broke many things was introduced. If we keep doing that, things will become unmanageable. > > Numpy doesn't (and probably shouldn't) change that rapidly. I would > really hope that the core numpy is solid, stable, and predictable. I > like major and minor version numbers that indicate a major change is > coming. If it's only a minor version number change, I know that I can > update it safely and go merrily computing on my way. "2008.04" doesn't > tell me if it is a minor bug fix or a major blow to my sanity. Time-based release has nothing to do with the way we version releases per-se: we would still call every numpy release with the current versioning. Also, make the release process predictable is exactly the point of time-based release: it means we can have a policy for api changes, and code freeze policies (which really should not change that often as has happened the last few months). cheers, David From millman at berkeley.edu Mon May 12 00:41:07 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 11 May 2008 21:41:07 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> Message-ID: On Sun, May 11, 2008 at 9:14 PM, Mike Ressler wrote: > I'm just a common user, but please, no. The big Linux distros do this > and it drives me nuts. Just when things are finally beginning to > settle down, they throw another big, buggy release out the door > because they are trying to meet some ridiculous 6 month cycle. "We" > haven't even gotten the major distros to dump Numeric-24 yet > (something else depends on it - pygtk maybe?), though most offer numpy > as an option; what kind of disaster will we have with new numpy > releases every 3 or 6 months? This is a somewhat separate issue. 
I completely agree that we should take a very conservative approach to changing our API, but I would still like to see regular releases even if there was no API change. So it would be fine for me to have a maintenance or minor release every three months. We should also have very few (if any) API changes if possible. That said, I expect to see more changes to SciPy than NumPy.

> Numpy doesn't (and probably shouldn't) change that rapidly. I would > really hope that the core numpy is solid, stable, and predictable. I > like major and minor version numbers that indicate a major change is > coming. If it's only a minor version number change, I know that I can > update it safely and go merrily computing on my way. "2008.04" doesn't > tell me if it is a minor bug fix or a major blow to my sanity.

We use the standard major.minor.maintenance numbering scheme and will do so regardless of whether we move from a feature-based to a time-based release cycle. Your points are completely relevant and I entirely agree with you. The one point to be clarified is that minor numbers signify that there may be some API breakage. The traditional meaning is something like:
1. a change in the maintenance number means just bug-fixes, better documentation, and possibly some very trivial feature addition.
2. a change in the minor number means there may be very minor API breakage, major new features, bug-fixes, and better documentation.
3. a change in the major number means there may be major API changes.
A minor number should require very trivial modification of existing code, while a major number may signify that users may need to rewrite their code. However, it is never a good idea to break the API unless the alternative is worse. Most of the developers are extremely reluctant to break users' code.

> So, please, if at all possible, keep the feature numbering. Thanks for > hearing me out.

Thanks for giving us your input. Without knowing what our users want, we run the risk of becoming irrelevant, so please don't hesitate to pitch in where and whenever possible. Of course, we can always use help with code, documentation, and testing as well.

Thanks,
-- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/

From cournapeau at cslab.kecl.ntt.co.jp Mon May 12 00:42:42 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 12 May 2008 13:42:42 +0900 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: <1210567362.17972.73.camel@bbc8>

On Sun, 2008-05-11 at 21:26 -0700, Jarrod Millman wrote: > > The one caveat to this is that you may recall I have tried to start > having a 3 month release cycle ever since I took over release > management last summer. I was almost able to do it for the first > release of NumPy and SciPy. But since November of last year, I have > been struggling to get out NumPy 1.1, which was originally scheduled > for early February.

My impression, but maybe I am missing something, is that the release slipped because everybody added new code.
IMHO, the main advantage of time-based release is to be able to say no to new code just before release, and to be able to say that an api breaks cannot happen between two releases: any change needs at least N releases in between with warnings, and we know what N means because N * release period is the time you have to make changes if you want to stay up to date. I personally really did not like what happened with ma and matrix in the 1.1, and I would like to avoid this in the future. It is already pretty bad to break code in a minor release, but to do it without warnings is really something which should never happen again IMHO. cheers, David From robert.kern at gmail.com Mon May 12 00:42:44 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 May 2008 23:42:44 -0500 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> Message-ID: <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> On Sun, May 11, 2008 at 11:14 PM, Mike Ressler wrote: > Numpy doesn't (and probably shouldn't) change that rapidly. I would > really hope that the core numpy is solid, stable, and predictable. I > like major and minor version numbers that indicate a major change is > coming. If it's only a minor version number change, I know that I can > update it safely and go merrily computing on my way. "2008.04" doesn't > tell me if it is a minor bug fix or a major blow to my sanity. > > So, please, if at all possible, keep the feature numbering. Thanks for > hearing me out. We will be keeping the major.minor.micro version numbering. David is not proposing to change that. What he is suggesting is that we try to make these releases come out on a given schedule. The semantics of major.minor.micro for numpy got confused because of an early mistake (IMO) wherein we designated 1.1 as a release which would allow code breakage. I would very much like to get away from that and follow Python's model of only doing bugfixes/doc-enhancements in micro releases, new features in minor releases and code breakage never (or if we absolutely must, only after one full minor release with a deprecation warning). I think it is certainly feasible to roll out the bugfix releases on a fixed schedule. I am less certain about the feature releases; I'm not sure we gain much by it. Having a feature freeze policy is sufficient, IMO. Once we've decided that we have accumulated enough features for a release, we declare a feature freeze for a fixed period of time, test/build/bugfix, then release. I think a fixed recurring time period may simply encourage rushing features in without testing. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Mon May 12 01:27:30 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 11 May 2008 22:27:30 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <1210567362.17972.73.camel@bbc8> References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> Message-ID: On Sun, May 11, 2008 at 9:42 PM, David Cournapeau wrote: > My impression, but maybe I am missing something, is that the release > slipped because everybody added new code. 
If we have a strict policy to > say no new code, only bug fixes at least for 2 weeks before a release > (and even no changes at all for a period), the release process becomes > much easier, no ?

Yes and no. The addition of MaskedArrays, which was merged in February, has been one of the issues that has delayed the release. So it would probably have been a good idea to release 1.0.5 before the merge, but the trunk had several other blockers at that point. In early March you may remember that there were several problems showing up on buildbot that were unrelated to MaskedArrays. The matrices issue has been more of a non-issue than all the list traffic might suggest. I wasn't waiting for it to be resolved before releasing, and now that I am ready to release, I plan to leave matrices just as they were (including not adding any DeprecationWarnings). I could be wrong, but I am pretty certain that there have mostly been bugfixes, tests, and documentation since MaskedArrays was merged in February, excepting, perhaps, the histogram change.

> IMHO, the main advantage of time-based release is to be able to say no > to new code just before release, and to be able to say that an api > breaks cannot happen between two releases: any change needs at least N > releases in between with warnings, and we know what N means because N * > release period is the time you have to make changes if you want to stay > up to date.

I agree, but we also have the problem that we don't have adequate tests. I don't think we realized the extent to which MaskedArrays necessitated code rewrites until Charles Doutriaux pointed out how much difficulty the change was causing to their code base. So, and I know David agrees, in addition to a time-based release, we need to make a better effort to improve our unit tests. I think the main thing we should focus on for 1.2 is moving to nose as our testing framework and writing many more unit tests.

> I personally really did not like what happened with ma and matrix in the > 1.1, and I would like to avoid this in the future. It is already pretty > bad to break code in a minor release, but to do it without warnings is > really something which should never happen again IMHO.

I agree with your concern about MaskedArrays, but I also hope that the new implementation has fixed more issues than it has caused. I have to admit that I didn't really understand how big an impact the ma changes would have. I should have paid closer attention and raised more of an alarm. Sorry, mea culpa. Despite the concern that has been raised because of the ma change, I am still confident that this release is extremely solid--there have been an amazing number of bugfixes, new tests, and new documentation. I believe we are moving forward and am happy that there is a push to do even better.

Just to be clear, there aren't going to be any changes to matrices in the 1.1 release. I am not even going to allow warnings, because I haven't been convinced that there is anything close to an agreement about what, if anything, is going to change. It is also clear that if there is any change to matrices, there will have to be a staged migration. There may be a new matrix class released in 1.2 along with the old one. We can talk about the details of that change later, since there are some very strong feelings about this one issue. I don't want this discussion to get hijacked like my email thread for finalizing the 1.1 release.

PS. David, you are now officially my release co-deputy for the 1.2 release for August.
We will be in 'feature-freeze" on August 1st and will have a coding sprint during the summer conference to finalize the release. The major changes that I anticipate are: 1. moving to the nose testing framework 2. major revamping of the documentation thanks to Stefan and Joe. 3. the planned changes to median and histogram 4. possibly a technology preview of a new matrix class. 5. and a move to a time-based release After the 1.2 release we will revisit the time- vs. feature-based release systems. I have been planning to move to a time-based release and with David's help, I am confident that we can pull it off. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From peridot.faceted at gmail.com Mon May 12 01:37:38 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 12 May 2008 01:37:38 -0400 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: <3d375d730805112005r76190250m40617341597e401b@mail.gmail.com> References: <3d375d730805111935m5a4c4dacv964fd37495a86dc4@mail.gmail.com> <3d375d730805112005r76190250m40617341597e401b@mail.gmail.com> Message-ID: 2008/5/11 Robert Kern : > Perhaps, but ufuncs only allow 0 or 1 for this value, currently. That's a shame, minus infinity is the identity for maximum too. > Also, I was wrong about using PyUFunc_ff_f. Instead, use PyUFunc_ff_f_As_dd_d. Hmm. Well, I tried implementing logsum(), and it turns out neither PyUFunc_dd_d nor any of the other functions - mostly flagged as being part of the ufunc API and described in the automatically-generated ufunc_api.txt - are exported. All are static and not described in any header I could find. That said, I know scipy defines its own ufuncs, so it must either reimplement these or have some way I hadn't thought of to get at them. I've attached a patch to add the ufunc logsum to numpy. It's a bit nasty for several reasons: * I wasn't sure where to put it, so I put it in _compiled_base.c in numpy/lib/src. I was very hesitant to touch umathmodule.c.src with its sui generis macro language. * I'm not sure what to do about log1p - it seems to be available in spite of HAVE_LOG1P not being defined. In any case if it doesn't exist it seems crazy to implement it again here. * since PyUFunc_dd_d does not seem to be exported, I just cut-n-pasted its code here. Obviously not a solution, but what are extension writers supposed to do? Do we want a ufunc version of logsum() in numpy? I got the impression that in spite of some people's doubts as to its utility there was a strong contingent of users who do want it. This plus a logdot() would cover most of the operations one might want to do on numbers in log representation. (Maybe subtraction?). It could be written in pure python, for the most part, and reduce() would save a few exps and logs at the cost of a temporary, but accumulate() remains a challenge. (I don't expect many people feel the need for reduceat()). Anne P.S. it was surprisingly difficult to persuade numpy to find my tests. I suppose that's what the proposed transition to nose is for? -A -------------- next part -------------- A non-text attachment was scrubbed... Name: logsum.patch Type: text/x-diff Size: 5654 bytes Desc: not available URL: From charlesr.harris at gmail.com Mon May 12 01:42:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 11 May 2008 23:42:07 -0600 Subject: [Numpy-discussion] Going toward time-based release ? 
In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: On Sun, May 11, 2008 at 10:26 PM, Jarrod Millman wrote: > On Sun, May 11, 2008 at 7:59 PM, David Cournapeau > wrote: > > I would like to know how people feel about going toward a > time-based > > release process for numpy (and scipy). By time-based release, I mean: > > - releases of numpy are time-based, not feature based. > > - a precise schedule is fixed, and the release manager(s) try to > > enforce this schedule. > > > Do other people see this suggestion as useful ? If yes, we would have to > > decide on: > > - a release period (3 months sounds like a reasonable period to > me ?) > > - a schedule within a release (api breaks would only be allowed > in the > > first month, code addition would be allowed up to two months, and only > > bug fixes the last month, for example). > > - who does the process (if nobody steps in, I would volunteer for > the > > first round, if only for seeing how/if it works). > > +1 > I was basically thinking about trying this for NumPy 1.2.0, which I > have been suggesting should be out by the end of August. I am happy > to serve as the release manager for 1.2. We also need to get a > release of SciPy out between NumPy 1.1 and 1.2. I hadn't brought this > up yet, since I have been hoping to get out NumPy 1.1 before starting > to plan other releases. But I think we are ready to release NumPy > this week, so now is the time to start this discussion. > > Despite wanting to only take 3 months for 1.2, I believe that in the > future it would make more sense to have a minor release of NumPy or > SciPy every three months: > > NumPy 1.1 (May) > SciPy 0.7 (July) > NumPy 1.2 (August) > SciPy 0.8 (November) > NumPy 1.3 (February) > etc. > I dunno. I tend to agree with Mike Ressler that Numpy really shouldn't change very quickly. We can add new features now and then, or move more stuff to type specific functions, but those aren't really user visible. Most work, I think, should go into tests, cleaning up the code base and documenting it, and fixing bugs. I also wonder how good a position we are in if Travis decides that there are other things he would rather spend his time on. Documentation and clean code will help with that. So I really don't see that we need much in the way of scheduled releases except to crack the whip on us lazy souls who let things ride until we don't have a choice. If we do have scheduled realeases, then I think it will also be necessary to draw up a plan for each release, i.e., cleanup and document ufuncobject.c.src, and so on. Chuck > > NumPy will need to work with the last released SciPy (i.e., 1.1 will > need to work with 0.6) and SciPy can use features from the last NumPy > release (i.e., 0.7 can depend on 1.1). > > Of course, we need to avoid breaking the API as much as possible. If > we decided that it is necessary to break the API, we should give our > users as much notice as possible. > > The one caveat to this is that you may recall I have tried to start > having a 3 month release cycle ever since I took over release > management last summer. I was almost able to do it for the first > release of NumPy and SciPy. But since November of last year, I have > been struggling to get out NumPy 1.1, which was originally scheduled > for early February. > > One of the issues that I have faced in trying use the 3 month release > cycle is that most of us have very busy schedules. But I think that > we are gaining new developers, which should help alleviate this > problem. 
I would be very happy to move to a time-based release > schedule and would be very happy to help make this happen. I have > been thinking about this for sometime, so I have a few ideas about how > to make this happen. I would be very appreciative if you would be > willing to work with me to get 1.2 out by the end of August. > > Thanks, > > -- > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Mon May 12 01:43:59 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 11 May 2008 22:43:59 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> Message-ID: On Sun, May 11, 2008 at 9:42 PM, Robert Kern wrote: > The semantics of major.minor.micro for numpy got confused because of > an early mistake (IMO) wherein we designated 1.1 as a release which > would allow code breakage. I disagree. I didn't realize that the ma change was going to break anyone's code. Once I understood that it did I decided to call the release 1.1.0 rather than 1.0.5. I think that it is reasonable to allow some API breakage in a minor release, but that there is no reason to encourage it and we should absolutely not require it. The problem as I see it is that we were abusing the maintenance releases to add a significant number of features, rather than just simply fixing bugs. > I would very much like to get away from > that and follow Python's model of only doing bugfixes/doc-enhancements > in micro releases, new features in minor releases and code breakage > never (or if we absolutely must, only after one full minor release > with a deprecation warning). +1 I absolutely agree. David and I will help make sure this is the case moving forward. At some point, if no one else gets to it, I will write this up more formally and make sure it is available on the scipy website. > I think it is certainly feasible to roll out the bugfix releases on a > fixed schedule. I am less certain about the feature releases; I'm not > sure we gain much by it. Having a feature freeze policy is sufficient, > IMO. Once we've decided that we have accumulated enough features for a > release, we declare a feature freeze for a fixed period of time, > test/build/bugfix, then release. I think a fixed recurring time period > may simply encourage rushing features in without testing. I am in favor of a compromise. We aim for a time-based release and are realistic about what new features we accept for a release. We institute a soft-feature freeze two months before the release and a hard feature freeze one month before the release. We can let the release slip a few weeks, but no more. To assure that our releases our solid, we may have to pull features before the release. In order to facilitate this, each major new feature will have an assigned lead who is responsible for overseeing the features development. 
In addition to helping guide the feature's development, the assigned lead will be required to remove the feature if necessary. The lead will be required to evaluate the features progress at the two month, one month, and release mark. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Mon May 12 02:03:59 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 01:03:59 -0500 Subject: [Numpy-discussion] Writing new ufuncs In-Reply-To: References: <3d375d730805111935m5a4c4dacv964fd37495a86dc4@mail.gmail.com> <3d375d730805112005r76190250m40617341597e401b@mail.gmail.com> Message-ID: <3d375d730805112303k13ccd821pad4e629de1bcf605@mail.gmail.com> On Mon, May 12, 2008 at 12:37 AM, Anne Archibald wrote: > 2008/5/11 Robert Kern : > >> Perhaps, but ufuncs only allow 0 or 1 for this value, currently. > > That's a shame, minus infinity is the identity for maximum too. > >> Also, I was wrong about using PyUFunc_ff_f. Instead, use PyUFunc_ff_f_As_dd_d. > > Hmm. Well, I tried implementing logsum(), and it turns out neither > PyUFunc_dd_d nor any of the other functions - mostly flagged as being > part of the ufunc API and described in the automatically-generated > ufunc_api.txt - are exported. All are static and not described in any > header I could find. That said, I know scipy defines its own ufuncs, > so it must either reimplement these or have some way I hadn't thought > of to get at them. They are not exported as symbols. They are "exported" to other extension modules by #defining them to an element in an array just like the rest of the numpy C API. import_ufunc() sets up all of those #defines. They are automatically generated into the file __ufunc_api.h. > I've attached a patch to add the ufunc logsum to numpy. It's a bit > nasty for several reasons: > * I wasn't sure where to put it, so I put it in _compiled_base.c in > numpy/lib/src. I was very hesitant to touch umathmodule.c.src with its > sui generis macro language. The place to add it would be in code_generators/generate_umath.py, which generates __umath_generated.c. > * I'm not sure what to do about log1p - it seems to be available in > spite of HAVE_LOG1P not being defined. In any case if it doesn't exist > it seems crazy to implement it again here. Then maybe our test for HAVE_LOG1P is not correct. I don't think we can rely on its omnipresence, though. > * since PyUFunc_dd_d does not seem to be exported, I just cut-n-pasted > its code here. Obviously not a solution, but what are extension > writers supposed to do? See above. > Do we want a ufunc version of logsum() in numpy? I got the impression > that in spite of some people's doubts as to its utility there was a > strong contingent of users who do want it. Well, many of us were simply contending Chuck's contention that one shouldn't need the log representation. I would probably toss it into scipy.special, myself. > This plus a logdot() would > cover most of the operations one might want to do on numbers in log > representation. (Maybe subtraction?). It could be written in pure > python, for the most part, and reduce() would save a few exps and logs > at the cost of a temporary, but accumulate() remains a challenge. (I > don't expect many people feel the need for reduceat()). > > Anne > > P.S. it was surprisingly difficult to persuade numpy to find my tests. > I suppose that's what the proposed transition to nose is for? -A Yes. 
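
For anyone following the logsum discussion who just wants the numerics: the element-wise operation such a ufunc computes can be sketched in pure Python. This is only an illustration of the algorithm, not the C patch Anne attached, and the names here are made up:

import numpy as np

def logaddexp(a, b):
    # log(exp(a) + exp(b)), factoring out the larger argument so that
    # exp() is only ever applied to a non-positive number.
    a, b = max(a, b), min(a, b)
    if b == -np.inf:
        # adding "zero" in log space; -inf is the identity element
        # that the current ufunc machinery cannot express.
        return a
    return a + np.log1p(np.exp(b - a))

def logsum(values):
    # reduce() of the scalar operation above: log(sum(exp(values))).
    result = -np.inf
    for v in values:
        result = logaddexp(result, v)
    return result

# e.g. logsum(np.log([1.0, 2.0, 3.0])) is close to np.log(6.0)

A C inner loop would do the same per-element arithmetic; as noted earlier in the thread, a dedicated reduce() could save some of the repeated exp/log calls.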
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Mon May 12 02:17:42 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 11 May 2008 23:17:42 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: On Sun, May 11, 2008 at 10:42 PM, Charles R Harris wrote: > I dunno. I tend to agree with Mike Ressler that Numpy really shouldn't > change very quickly. We can add new features now and then, or move more > stuff to type specific functions, but those aren't really user visible. Most > work, I think, should go into tests, cleaning up the code base and > documenting it, and fixing bugs. I also wonder how good a position we are in > if Travis decides that there are other things he would rather spend his time > on. Documentation and clean code will help with that. So I really don't see > that we need much in the way of scheduled releases except to crack the whip > on us lazy souls who let things ride until we don't have a choice. If we do > have scheduled realeases, then I think it will also be necessary to draw up > a plan for each release, i.e., cleanup and document ufuncobject.c.src, and > so on. I don't think there is anyone who believes that NumPy should change very quickly. Let's try and drop that discussion. I am worried that if people keep bring up this straw man argument that our users will get the impression that we are preparing to break a lot of their code. Besides that changes to MA, there has been very few discussions about changing code. The discussions that have happened have been met with numerous comments by several developers that we need to be extremely conservative about making API changes. Even for the matrices discussion it seems that very few people are pushing for us to make any changes and of those that are most are thinking carefully about how to cause as little code breakage as possible. I know that just recently some matrices changes were made that have caused some problems, but I am going to rollback those changes. I think the main issue is that we need to have more regular releases of NumPy and SciPy. For NumPy you are completely correct that most (if not all) of the work should be focused on "tests, cleaning up the code base and documenting it, and fixing bugs." The overriding reason that necessitates a 1.2 release at the end of August is the move to the nose testing framework. I feel that this change is extremely important and want it to take place as quickly as possible. Personally, I don't have any other major changes that I will champion getting into 1.2. The only other major change that has been put forth so far is the matrix change, which I personally don't think should change the default behavior in the 1.2 release. I also really like your suggestion to draw up plans for each release. And I would love it if you would be willing to take the lead on something for 1.2 like "cleaning up and documenting ufuncobject.c.src." I would encourage you to start thinking about whether you could commit to something like that and, if so, creating a ticket for it briefly describing your plan. The other thing to consider is how best to coordinate NumPy and SciPy releases. There are occasionally times when we need to add or fix things in a NumPy release before the next SciPy release. 
While we shouldn't expect to see many changes in NumPy, I believe that there is a *slightly* greater tolerance for changes in SciPy. As you will recall, the whole MA change arose from my push to get rid of scipy.sandbox. While in retrospect the change could have been handled better, it is nice that that code has been removed from scipy.sandbox. I also am extremely happy with all the time and effort that Pierre spent on writing the MaskedArray. I want to make it absolutely clear that all the good, hard work was his and that the transition mistakes were mine, not his. I also want to thank Stefan who spent an entire night preparing for the merge during the December sprint (the time difference between Berkeley and South Africa makes working at the same time logistically difficult for him). Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Mon May 12 02:31:50 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 01:31:50 -0500 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> Message-ID: <3d375d730805112331x7ff2eba1va9852f787d2eb013@mail.gmail.com> On Mon, May 12, 2008 at 12:43 AM, Jarrod Millman wrote: > On Sun, May 11, 2008 at 9:42 PM, Robert Kern wrote: >> The semantics of major.minor.micro for numpy got confused because of >> an early mistake (IMO) wherein we designated 1.1 as a release which >> would allow code breakage. > > I disagree. I didn't realize that the ma change was going to break > anyone's code. Once I understood that it did I decided to call the > release 1.1.0 rather than 1.0.5. That's not what I'm referring to. Before 1.0 was released, people were worried about the stability of the API, and Travis said that 1.0.x would maintain API compatibility, but that 1.1, "maybe a year from now," would be the opportunity to make API changes again. I think that statement colored development substantially. We had to add features to 1.0.x releases instead of just bugfixes because 1.1 was going to be this big thing. > I think that it is reasonable to > allow some API breakage in a minor release, but that there is no > reason to encourage it and we should absolutely not require it. The > problem as I see it is that we were abusing the maintenance releases > to add a significant number of features, rather than just simply > fixing bugs. > >> I would very much like to get away from >> that and follow Python's model of only doing bugfixes/doc-enhancements >> in micro releases, new features in minor releases and code breakage >> never (or if we absolutely must, only after one full minor release >> with a deprecation warning). > > +1 > I absolutely agree. You keep saying that you agree with me, but then you keep talking about allowing "minor code breakage" which I do not agree with. I don't think I am communicating the aversion to code breakage that I am trying to espouse. While strictly speaking, there is a conceivable reading of the words you use ("I think that it is reasonable to allow some API breakage in a minor release" for example) which I can agree with, I think we're meaning slightly different things. In particular, I think we're disagreeing on a meta level. I do agree that occasionally we will be presented with times when breaking an API is warranted. 
As a prediction of future, factual events, we see eye to eye. However, I think we will make better decisions about when breakage is warranted if we treat them as _a priori_ *un*reasonable. Ultimately, we may need to break APIs, but we should feel bad about it. And I mean viscerally bad. It is too easy to convince one's self that the new API is better and cleaner and that "on balance" making the break is better than not doing so. The problem is that the thing that we need to balance this against, the frustration of our users who see their code break and all of *their* users, is something we inherently can not know. Somehow, we need to internalize this frustration ourselves. Treating the breaking of code as inherently unreasonable, and refusing to accept excuses *even if they are right on the merits* is the only way I know how to do this. The scales of reason occasionally needs a well-placed thumb of unreason. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon May 12 02:39:40 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 May 2008 00:39:40 -0600 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: On Mon, May 12, 2008 at 12:17 AM, Jarrod Millman wrote: > On Sun, May 11, 2008 at 10:42 PM, Charles R Harris > wrote: > > I dunno. I tend to agree with Mike Ressler that Numpy really shouldn't > > change very quickly. We can add new features now and then, or move more > > stuff to type specific functions, but those aren't really user visible. > Most > > work, I think, should go into tests, cleaning up the code base and > > documenting it, and fixing bugs. I also wonder how good a position we > are in > > if Travis decides that there are other things he would rather spend his > time > > on. Documentation and clean code will help with that. So I really don't > see > > that we need much in the way of scheduled releases except to crack the > whip > > on us lazy souls who let things ride until we don't have a choice. If > we do > > have scheduled realeases, then I think it will also be necessary to draw > up > > a plan for each release, i.e., cleanup and document ufuncobject.c.src, > and > > so on. > > I don't think there is anyone who believes that NumPy should change > very quickly. Let's try and drop that discussion. I am worried that > if people keep bring up this straw man argument that our users will > get the impression that we are preparing to break a lot of their code. > Besides that changes to MA, there has been very few discussions about > changing code. The discussions that have happened have been met with > numerous comments by several developers that we need to be extremely > conservative about making API changes. Even for the matrices > discussion it seems that very few people are pushing for us to make > any changes and of those that are most are thinking carefully about > how to cause as little code breakage as possible. I know that just > recently some matrices changes were made that have caused some > problems, but I am going to rollback those changes. > > I think the main issue is that we need to have more regular releases > of NumPy and SciPy. 
For NumPy you are completely correct that most > (if not all) of the work should be focused on "tests, cleaning up the > code base and documenting it, and fixing bugs." The overriding reason > that necessitates a 1.2 release at the end of August is the move to > the nose testing framework. I feel that this change is extremely > important and want it to take place as quickly as possible. > Personally, I don't have any other major changes that I will champion > getting into 1.2. The only other major change that has been put forth > so far is the matrix change, which I personally don't think should > change the default behavior in the 1.2 release. > > I also really like your suggestion to draw up plans for each release. > And I would love it if you would be willing to take the lead on > something for 1.2 like "cleaning up and documenting > ufuncobject.c.src." I would encourage you to start thinking about > whether you could commit to something like that and, if so, creating a > ticket for it briefly describing your plan. > As a start, I think we can just reindent all the c code to Python3.0 specs, and change the style of if and for loops like so: if (yes) { } else { } And put space around operators in for loops, i.e., for (i = 0; i < 10; i++) { whatever; needed; } instead of for (i=0;i<10;i++){whatever;needed;} and don't do this if ((a=foo(arg))==0) barf; intervening_code; do_stuff_with_a; Which makes the spot where a gets initialized almost invisible. It is amazing how much easier the code is to read and understand with these simple changes. But if we plan tasks for the release, then we also have to assign people to the task. That is where it gets sticky. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon May 12 02:46:54 2008 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Sun, 11 May 2008 23:46:54 -0700 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: References: <482454E8.8060606@gmail.com> Message-ID: <4827E7DE.8050607@noaa.gov> Could we add a "from __future__ import something" along with a deprecation warning? This could be used for Tim's "new matrix" class, or any other API change. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cournapeau at cslab.kecl.ntt.co.jp Mon May 12 02:59:09 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 12 May 2008 15:59:09 +0900 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: <1210575549.22981.7.camel@bbc8> On Mon, 2008-05-12 at 00:39 -0600, Charles R Harris wrote: > > Which makes the spot where a gets initialized almost invisible. It is > amazing how much easier the code is to read and understand with these > simple changes. But if we plan tasks for the release, then we also > have to assign people to the task. That is where it gets sticky. That's exactly why I am suggesting a time-based release; that's the kind of stuff I want to see in numpy (< 2.* ) and scipy (< 1.*). The way I saw things was: it is ok to do this kind of changes, but only N days before the official release. After this point, the only (and I really mean only) reason is a critical bug. 
It is really easy to think "hey, let's do this one line change, that can't possibly break anything, right ?", and two days after: "hey, this breaks on windows because VS does not recognize this code". If between the two days, you have released numpy, you're screwed. Doing this with a time schedule is easier, I think. FWIW, I am doing exactly this for scipy.fftpack right now, and that's a lot of relatively boring work (fftpack is a bit special because it has a big list of possible combination dependencies with different code paths, and testing all of them is painful, but doing this in numpy.core is not that much easier). cheers, David From millman at berkeley.edu Mon May 12 03:08:22 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 00:08:22 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <3d375d730805112331x7ff2eba1va9852f787d2eb013@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> <3d375d730805112331x7ff2eba1va9852f787d2eb013@mail.gmail.com> Message-ID: On Sun, May 11, 2008 at 11:31 PM, Robert Kern wrote: > That's not what I'm referring to. Before 1.0 was released, people were > worried about the stability of the API, and Travis said that 1.0.x > would maintain API compatibility, but that 1.1, "maybe a year from > now," would be the opportunity to make API changes again. I think that > statement colored development substantially. We had to add features to > 1.0.x releases instead of just bugfixes because 1.1 was going to be > this big thing. Ah... I see. >>> I would very much like to get away from >>> that and follow Python's model of only doing bugfixes/doc-enhancements >>> in micro releases, new features in minor releases and code breakage >>> never (or if we absolutely must, only after one full minor release >>> with a deprecation warning). >> >> +1 >> I absolutely agree. To clarify, in particular, I was agreeing that if we absolutely must break code we should only do it after a full minor release with a deprecation warning. I don't want a repeat of what happened with MA. > You keep saying that you agree with me, but then you keep talking > about allowing "minor code breakage" which I do not agree with. I > don't think I am communicating the aversion to code breakage that I am > trying to espouse. While strictly speaking, there is a conceivable > reading of the words you use ("I think that it is reasonable to allow > some API breakage in a minor release" for example) which I can agree > with, I think we're meaning slightly different things. In particular, > I think we're disagreeing on a meta level. I do agree that > occasionally we will be presented with times when breaking an API is > warranted. As a prediction of future, factual events, we see eye to > eye. Are you saying that the changes to histogram and median should require waiting until 2.0--several years from now? When I say that we may allow minor API breakage this is the kind of thing I mean. I think that both instances are very reasonable and clean up minor warts. I also think that in both cases the implementation plan is reasonable. They first provide the new functionality as an option in a new minor release, then in the subsequent release switch the new functionality to the default but leave the old behavior accessable. The MA situation was handled slightly differently, which is unfortunate. 
I think that it is a reasonable change to make in a minor release, but we should have provided a warning first. At least, we are providing the old code still. Do you think that we should revisit this issue? Since we are importing the new MA from a different location maybe we should move the old code back to it's original location. Would it be reasonable to have: from numpy import ma --> new code from numpy.core import ma --> old code w/ a deprecation warning Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From stefan at sun.ac.za Mon May 12 03:09:39 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 12 May 2008 09:09:39 +0200 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> Message-ID: <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> 2008/5/12 Jarrod Millman : > I agree, but we also have the problem that we don't have adequate > tests. I don't think we realized the extent to which MaskedArrays > necessitated code rewrites until Charles Doutriaux pointed out how > much difficulty the change was causing to their code base. There's a valuable lesson to be learnt here: unit tests provide a contract between the developer and the user. When I did the MaskedArray merge, I made very sure that we strictly stuck to our contract -- the unit tests for numpy.core.ma (which ran without failure). Unfortunately, according to Charles' experience, that contract was inadequate. We shouldn't be caught with our pants down like that. The matplotlib guys used Pierre's maskedarray for a while before we did the merge, so we had good reason to believe that it was a vast improvement (and it was). I agree that the best policy would have been to make a point release right before the merge, but realistically, if we had to wait for such a release, the new maskedarrays would still not have been merged. Which brings me to my next point: we have very limited (human) resources, but releasing frequently is paramount. To what extent can we automate the release process? I've asked this question before, but I haven't had a clear answer: are the packages currently built automatically? Why don't we get the buildbots to produce nightly snapshot packages, so that when we tell users "try the latest SVN version, it's been fixed there" it doesn't send them into a dark depression? As for the NumPy unit tests: I have placed coverage reports online (http://mentat.za.net/numpy/coverage). This only covers Python (not extension) code, but having that part 100% tested is not impossible, nor would it take that much effort. The much more important issue is having the C extensions tested, and if anyone can figure out a way to get gcov to generate those coverage reports, I'd be in the seventh heaven. Thus far, the only way I know of is to build one large, static Python binary that includes numpy. Memory errors: Albert Strasheim recently changed his build client config to run Valgrind on the NumPy code. Surprise, surprise -- we introduced new memory errors since the last release. In the future, when *any* changes are made to the the C code: a) Add a unit test for the change, unless the test already exists (and I suggest we *strongly* enforce this policy). b) Document your change if it is not immediately clear what it does. 
b) Run the test suite through Valgrind, or if you're not on a linux platform, look at the buildbot (http://buildbot.scipy.org) output. Finally, our patch acceptance process is poor. It would be good if we could have a more formal system for reviewing incoming and *also* our own patches. I know Ondrej Certik had a review board in place for Sympy at some stage, so we could ask him what their experience was. So, +1 for more frequent releases, +1 for more tests and +1 for good developer discipline. Regards St?fan From millman at berkeley.edu Mon May 12 03:28:47 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 00:28:47 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> Message-ID: On Mon, May 12, 2008 at 12:09 AM, St?fan van der Walt wrote: > Which brings me to my next point: we have very limited (human) > resources, but releasing frequently is paramount. To what extent can > we automate the release process? I've asked this question before, but > I haven't had a clear answer: are the packages currently built > automatically? Why don't we get the buildbots to produce nightly > snapshot packages, so that when we tell users "try the latest SVN > version, it's been fixed there" it doesn't send them into a dark > depression? The packages aren't built automatically. We just changed the package build process. As of now David C. is building the Windows binaries and Chris B. is building the Mac OS X binaries. In terms of getting the releases out, this is not a very important problem. The main issue is getting the code stable enough to make a release. However, your suggestion to have nightly binaries autogenerated would be more useful than telling users to try the latest SVN. I assume that we should be able to have these binaries auto-generated, David and Chris should be able to provide a more detailed answer to this than I can. However, my concern with this is that, I believe, we currently need some oversight to ensure that the binaries don't have silly problems. Regardless of whether we automate the process or still have manual aspect to the process, it is a good idea to start creating binaries for release candidates or test releases. > Memory errors: Albert Strasheim recently changed his build client > config to run Valgrind on the NumPy code. Surprise, surprise -- we > introduced new memory errors since the last release. In the future, > when *any* changes are made to the the C code: > > a) Add a unit test for the change, unless the test already exists (and > I suggest we *strongly* enforce this policy). > b) Document your change if it is not immediately clear what it does. > b) Run the test suite through Valgrind, or if you're not on a linux > platform, look at the buildbot (http://buildbot.scipy.org) output. I would be happy to see a policy like this adopted. You have my vote. > Finally, our patch acceptance process is poor. It would be good if we > could have a more formal system for reviewing incoming and *also* our > own patches. I know Ondrej Certik had a review board in place for > Sympy at some stage, so we could ask him what their experience was. We need to be very careful not to make the process for contributing code too burdensome. 
The number of developers is increasing and our development community is growing; I don't want to see this process reversed. If we go this route, we may need better tools for creating and reviewing patches. I would recommend taking it slow. There are a number of suggestions for improving our development process. I would prefer changing only one or just a few aspects at a time. That way we can more easily understand what effect these changes have on our development process. This is another argument for increasing the frequency of releases. It would give us more of an opportunity to review and improve the process. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pav at iki.fi Mon May 12 04:49:50 2008 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 12 May 2008 11:49:50 +0300 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> Message-ID: <1210582190.26372.297.camel@localhost> ma, 2008-05-12 kello 09:09 +0200, St?fan van der Walt kirjoitti: [clip] > As for the NumPy unit tests: I have placed coverage reports online > (http://mentat.za.net/numpy/coverage). This only covers Python (not > extension) code, but having that part 100% tested is not impossible, > nor would it take that much effort. Small thing: would it be possible to use only two colors in the figleaf output, eg. uncovered code red, everything else black. The black bits in between are now slightly distracting. -- Pauli Virtanen From Joris.DeRidder at ster.kuleuven.be Mon May 12 08:31:52 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Mon, 12 May 2008 14:31:52 +0200 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <1210561167.17972.17.camel@bbc8> References: <1210561167.17972.17.camel@bbc8> Message-ID: On 12 May 2008, at 04:59, David Cournapeau wrote: > Also, time-based releases are by definition predictable, and > as such, it is easier to plan upgrades for users As long as it does not imply that users have to upgrade every 3 months, because for some users this is impossible and/or undesirable. By 'upgrading' I'm not only referring to numpy/scipy, but also to external packages based on numpy/scipy. As Mike, I'm a bit sceptic about the whole idea. The current way doesn't seem broken, so why fix it? The argument that "time-based releases avoids bugs caused by putting new untested things late in the release" doesn't sound very convincing to me. Isn't this a discipline issue? To me it seems that one can avoid this with feature-based releases too. In fact won't the time-pressure of time-based releases increase the tendency to include untested things? Just the 2 cents of a user, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From david at ar.media.kyoto-u.ac.jp Mon May 12 08:41:29 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 12 May 2008 21:41:29 +0900 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: <48283AF9.801@ar.media.kyoto-u.ac.jp> Joris De Ridder wrote: > > As long as it does not imply that users have to upgrade every 3 > months, because for some users this is impossible and/or undesirable. 
> By 'upgrading' I'm not only referring to numpy/scipy, but also to > external packages based on numpy/scipy. As I said, time-based release do not imply that we will break something every release. This is really an orthogonal issue. I certainly hope that the recent ma/matrix thing won't happen again. If it were me, I would have never accepted the ma or any matrix change in the numpy 1.* timespan. Concerning packages which depend on numpy, there is not much we can do: if they depend on new features on numpy, you will have to upgrade, but this has nothing to do with the release process. > > As Mike, I'm a bit sceptic about the whole idea. The current way > doesn't seem broken, so why fix it? If the recent events do not show that something went wrong, I don't know what will :) numpy 1.0.4 was released 6 months ago, and numpy 1.1 has slipped for a long time now. If there is no schedule, it is easy to keep adding code, specially when the release approaches ("I want to see my changes in the next release, let's push some code just before the release"). The plan really is to have a code freeze and hard freeze between each release, and time-based releases are more natural for that IMO, because contributors can plan and synchronize more easily. Maybe it won't work, but it worths a try; As Robert mentioned it before, code freeze and hard code freeze is what matters. cheers, David From aisaac at american.edu Mon May 12 09:15:50 2008 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 12 May 2008 09:15:50 -0400 Subject: [Numpy-discussion] promises, promises In-Reply-To: <3d375d730805112009n3487eebq8df6ab77048b08e0@mail.gmail.com> References: <20080511140918.GC28418@phare.normalesup.org><20080511164201.GG28418@phare.normalesup.org> <3d375d730805112009n3487eebq8df6ab77048b08e0@mail.gmail.com> Message-ID: > On Sun, May 11, 2008 at 12:16 PM, Alan G Isaac wrote: >> To be specific: I do not recall any place in the NumPy Book >> where this behavior is promised. On Sun, 11 May 2008, Robert Kern apparently wrote: > It's promised in the docstring! """ A matrix is > a specialized 2-d array that retains it's 2-d nature > through operations > """ I guess I would say that is too ambiguous. And wrong:: >>> x = np.mat('1 2;3 4') >>> x[0,0] 1 Cheers, Alan Isaac From robert.kern at gmail.com Mon May 12 11:19:43 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 10:19:43 -0500 Subject: [Numpy-discussion] Uncomfortable with matrix change In-Reply-To: <4827E7DE.8050607@noaa.gov> References: <482454E8.8060606@gmail.com> <4827E7DE.8050607@noaa.gov> Message-ID: <3d375d730805120819g7834755etc7232ab4ad10fe53@mail.gmail.com> On Mon, May 12, 2008 at 1:46 AM, Chris.Barker wrote: > > Could we add a "from __future__ import something" along with a > deprecation warning? That's a Python language feature. It is not available to us. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Mon May 12 11:56:23 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 12 May 2008 11:56:23 -0400 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <4821E855.1060909@llnl.gov> References: <4821E855.1060909@llnl.gov> Message-ID: <200805121156.24319.pgmdevlist@gmail.com> All, I fixed the power function in numpy.ma following Anne's suggestion: compute first, mask the problems afterwards. 
It's a quick and dirty fix that crashes if the user has set its error system to raise an exception on invalid (np.seterr(invalid='raise')), but it works otherwise and keeps subclasses (such as TimeSeries). I will have to modify the .__pow__ method so that ma.power is called: right now, a**b calls ndarray(a).__pow__(b), which may yield NaNs and Infs. What should I do with oldnumeric.ma.power ? Try to fix it the same way, or leave the bug ? I'm not that enthusiastic to have to debug the old package, but if it's part of the job... From stefan at sun.ac.za Mon May 12 12:04:24 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 12 May 2008 18:04:24 +0200 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <200805121156.24319.pgmdevlist@gmail.com> References: <4821E855.1060909@llnl.gov> <200805121156.24319.pgmdevlist@gmail.com> Message-ID: <9457e7c80805120904t6e6bbaebu70ef32d407ba2004@mail.gmail.com> 2008/5/12 Pierre GM : > I fixed the power function in numpy.ma following Anne's suggestion: compute > first, mask the problems afterwards. It's a quick and dirty fix that crashes > if the user has set its error system to raise an exception on invalid > (np.seterr(invalid='raise')), but it works otherwise and keeps subclasses > (such as TimeSeries). > I will have to modify the .__pow__ method so that ma.power is called: right > now, a**b calls ndarray(a).__pow__(b), which may yield NaNs and Infs. > What should I do with oldnumeric.ma.power ? Try to fix it the same way, or > leave the bug ? I'm not that enthusiastic to have to debug the old package, > but if it's part of the job... We should leave the oldnumeric.ma package alone. Even if its `power` is broken, some packages may depend on it. We'll provide it in 1.2 for backward compatibility, and get rid of it after 1.3. Regards St?fan From pgmdevlist at gmail.com Mon May 12 12:12:00 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 12 May 2008 12:12:00 -0400 Subject: [Numpy-discussion] bug in oldnumeric.ma In-Reply-To: <9457e7c80805120904t6e6bbaebu70ef32d407ba2004@mail.gmail.com> References: <4821E855.1060909@llnl.gov> <200805121156.24319.pgmdevlist@gmail.com> <9457e7c80805120904t6e6bbaebu70ef32d407ba2004@mail.gmail.com> Message-ID: <200805121212.00835.pgmdevlist@gmail.com> On Monday 12 May 2008 12:04:24 St?fan van der Walt wrote: > 2008/5/12 Pierre GM : > > What should I do with oldnumeric.ma.power ? Try to fix it the same > > way, or leave the bug ? I'm not that enthusiastic to have to debug the > > old package, but if it's part of the job... > > We should leave the oldnumeric.ma package alone. Even if its `power` > is broken, some packages may depend on it. We'll provide it in 1.2 > for backward compatibility, and get rid of it after 1.3. OK then, I prefer that... Additional question: when raising to a power in place, NaNs/Infs can show up in the _data part: should I set those invalid data to fill_value or not ? From robert.kern at gmail.com Mon May 12 12:18:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 11:18:18 -0500 Subject: [Numpy-discussion] Going toward time-based release ? 
In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> <3d375d730805112331x7ff2eba1va9852f787d2eb013@mail.gmail.com> Message-ID: <3d375d730805120918p786cf4daj26c55a2b1f5f68f0@mail.gmail.com> On Mon, May 12, 2008 at 2:08 AM, Jarrod Millman wrote: > Are you saying that the changes to histogram and median should require > waiting until 2.0--several years from now? When I say that we may > allow minor API breakage this is the kind of thing I mean. I think > that both instances are very reasonable and clean up minor warts. I > also think that in both cases the implementation plan is reasonable. > They first provide the new functionality as an option in a new minor > release, then in the subsequent release switch the new functionality > to the default but leave the old behavior accessable. I think they are tolerable. Like I said, I don't think our yardstick should be "reasonableness". Reason is too easily fooled in these instances. > The MA situation was handled slightly differently, which is > unfortunate. I think that it is a reasonable change to make in a > minor release, but we should have provided a warning first. At least, > we are providing the old code still. Do you think that we should > revisit this issue? Since we are importing the new MA from a > different location maybe we should move the old code back to it's > original location. Would it be reasonable to have: > from numpy import ma --> new code > from numpy.core import ma --> old code w/ a deprecation warning I think that might be better, yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Mon May 12 12:47:35 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 09:47:35 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: On Mon, May 12, 2008 at 5:31 AM, Joris De Ridder wrote: > As long as it does not imply that users have to upgrade every 3 > months, because for some users this is impossible and/or undesirable. > By 'upgrading' I'm not only referring to numpy/scipy, but also to > external packages based on numpy/scipy. I can't imagine why anyone would have to upgrade. Could you explain under what circumstances you would see having to upgrade just because their is a new upstream release? I think I must be misunderstanding your concern. > As Mike, I'm a bit sceptic about the whole idea. The current way > doesn't seem broken, so why fix it? The argument that "time-based > releases avoids bugs caused by putting new untested things late in the > release" doesn't sound very convincing to me. Isn't this a discipline > issue? To me it seems that one can avoid this with feature-based > releases too. In fact won't the time-pressure of time-based releases > increase the tendency to include untested things? As a data point, I have to say that I view the current process as a bit broken. I have been very frustrated trying to get this latest release out and if you remember the reason I took over release management last summer was that the current release of NumPy and SciPy didn't work together for several months prior to me releasing 1.0.3.1 and 0.5.2.1. 
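(To make the numpy.core.ma idea above concrete, a minimal compatibility shim could look like the sketch below. This is an illustration of the proposal, not code that exists in NumPy, and re-exporting from numpy.oldnumeric.ma is an assumption about where the old implementation would live:

    """Hypothetical numpy/core/ma.py shim for the deprecation proposal."""
    import warnings

    warnings.warn(
        "numpy.core.ma is the old, unmaintained masked-array "
        "implementation; please migrate to numpy.ma.",
        DeprecationWarning,
        stacklevel=2,
    )

    # Keep existing 'from numpy.core import ma' imports working.
    from numpy.oldnumeric.ma import *

With something like this in place, old imports would keep returning the old module, just with a DeprecationWarning the first time it is imported.)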
I am a little surprised that anyone would believe that the current system isn't broken. Now I don't think that that means the solution is to move to a time-based release schedule necessarily. But since I was hoping to do something like a time-based release for 1.2 anyway, I am happy to try it and see how it works. The concern is obviously not to include untested things; but no one wants to do that. If we get to the end of the 3 months and can't release a stable, well-tested release by using the trunk or by discarding some features, I think we would just consider the experiment a failure. There is nothing to force us to release untested code. If you look at the people in favor of this, you will find they are also some of the most adamant voices about including unit tests for all new code as well. So I don't believe any of us will require that we release 1.2 at the end of three months regardless of the quality. The more likely scenario is that those of us supporting this idea will be able to use the time-based release cycle as a mechanism to ensure high-quality, well-tested code is produced. Again it this experiment fails, we should know before we actually make the release. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From centos at scratchspace.com Mon May 12 12:50:49 2008 From: centos at scratchspace.com (Chris Miller) Date: Mon, 12 May 2008 09:50:49 -0700 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 Message-ID: <48287569.5080002@scratchspace.com> Hello, I'm having trouble getting the python-numpy RPM to build under CentOS 4.6. I've already built and installed atlas, lapack3 and refblas3 from the CentOS 5 source RPMS, but numpy won't build correctly. Although the ultimate error may be unrelated to Atlas, clearly not all the atlas dependencies are being satisfied. Only the .so files are supplied by the RPMS, but it seems that some additional lib* files may be needed as well. Side notes; * gcc-g77 is not installed (apparently causes problems), gfortran is installed from the gcc4 RPM. * I ran "export LD_LIBRARY_PATH=/usr/lib/atlas/sse2" * I created the missing directories in /var/tmp/python-numpy-1.0.4-build but it's really not clear that this is really the problem * The numpy SPEC file shows a dependency on lapack3 < 3.1, but rpmbuild doesn't complain about this, and it seems everyone uses lapack 3.0.x * Upgrading to CentOS 5 is not an option Can anyone provide any insight as to what exactly is missing from Atlas and how to correctly resolve this dependency? Any other tips are appreciated. rpmbuild output below. Chris # rpmbuild -vvv -bi SPECS/python-numpy.spec Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.7373 + umask 022 + cd /usr/src/redhat/BUILD + cd /usr/src/redhat/BUILD + rm -rf numpy-1.0.4 + /bin/gzip -dc /usr/src/redhat/SOURCES/numpy-1.0.4.tar.gz + tar -xf - + STATUS=0 + '[' 0 -ne 0 ']' + cd numpy-1.0.4 ++ /usr/bin/id -u + '[' 0 = 0 ']' + /bin/chown -Rhf root . ++ /usr/bin/id -u + '[' 0 = 0 ']' + /bin/chgrp -Rhf root . + /bin/chmod -Rf a+rX,u+w,g-w,o-w . + exit 0 Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.7373 + umask 022 + cd /usr/src/redhat/BUILD + cd numpy-1.0.4 + export 'CFLAGS=-O2 -g -march=i386 -mcpu=i686 -fPIC' + CFLAGS='-O2 -g -march=i386 -mcpu=i686 -fPIC' + python setup.py config_fc --fcompiler=gnu95 build Running from numpy source directory. 
non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2_4422 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /usr/lib/atlas/sse2 libraries f77blas,cblas,atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/atlas libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib NOT AVAILABLE /usr/src/redhat/BUILD/numpy-1.0.4/numpy/distutils/system_info.py:1340: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in /usr/local/lib FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas/sse2 libraries lapack_atlas not found in /usr/lib/atlas/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas libraries lapack_atlas not found in /usr/lib/atlas libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /usr/lib/atlas/sse2 libraries lapack_atlas not found in /usr/lib/atlas/sse2 libraries f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/atlas libraries lapack_atlas not found in /usr/lib/atlas libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_info NOT AVAILABLE /usr/src/redhat/BUILD/numpy-1.0.4/numpy/distutils/system_info.py:1247: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. 
warnings.warn(AtlasNotFoundError.__doc__) lapack_info: libraries lapack not found in /usr/local/lib FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running build_src building py_modules sources creating build creating build/src.linux-i686-2.3 creating build/src.linux-i686-2.3/numpy creating build/src.linux-i686-2.3/numpy/distutils building extension "numpy.core.multiarray" sources creating build/src.linux-i686-2.3/numpy/core Generating build/src.linux-i686-2.3/numpy/core/config.h customize Gnu95FCompiler Found executable /usr/bin/gfortran customize Gnu95FCompiler using config C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-I/usr/include/python2.3 -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -L/usr/local/lib -L/usr/lib -o _configtest _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -o _configtest _configtest.o(.text+0x28): In function `main': /usr/src/redhat/BUILD/numpy-1.0.4/_configtest.c:5: undefined reference to `exp' collect2: ld returned 1 exit status _configtest.o(.text+0x28): In function `main': /usr/src/redhat/BUILD/numpy-1.0.4/_configtest.c:5: undefined reference to `exp' collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! 
removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -o _configtest success! 
removing: _configtest.c _configtest.o _configtest File: build/src.linux-i686-2.3/numpy/core/config.h /* #define SIZEOF_SHORT 2 */ /* #define SIZEOF_INT 4 */ /* #define SIZEOF_LONG 4 */ /* #define SIZEOF_FLOAT 4 */ /* #define SIZEOF_DOUBLE 8 */ #define SIZEOF_LONG_DOUBLE 12 #define SIZEOF_PY_INTPTR_T 4 /* #define SIZEOF_LONG_LONG 8 */ #define SIZEOF_PY_LONG_LONG 8 /* #define CHAR_BIT 8 */ #define NPY_ALLOW_THREADS 0 #define MATHLIB m #define HAVE_LONGDOUBLE_FUNCS #define HAVE_FLOAT_FUNCS #define HAVE_LOG1P #define HAVE_EXPM1 #define HAVE_INVERSE_HYPERBOLIC #define HAVE_INVERSE_HYPERBOLIC_FLOAT #define HAVE_INVERSE_HYPERBOLIC_LONGDOUBLE #define HAVE_ISNAN #define HAVE_ISINF #define HAVE_RINT #define PyOS_ascii_strtod strtod EOF adding 'build/src.linux-i686-2.3/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_array_api.py adding 'build/src.linux-i686-2.3/numpy/core/__multiarray_api.h' to sources. creating build/src.linux-i686-2.3/numpy/core/src conv_template:> build/src.linux-i686-2.3/numpy/core/src/scalartypes.inc adding 'build/src.linux-i686-2.3/numpy/core/src' to include_dirs. conv_template:> build/src.linux-i686-2.3/numpy/core/src/arraytypes.inc numpy.core - nothing done with h_files = ['build/src.linux-i686-2.3/numpy/core/src/scalartypes.inc', 'build/src.linux-i686-2.3/numpy/core/src/arraytypes.inc', 'build/src.linux-i686-2.3/numpy/core/config.h', 'build/src.linux-i686-2.3/numpy/core/__multiarray_api.h'] building extension "numpy.core.umath" sources adding 'build/src.linux-i686-2.3/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_ufunc_api.py adding 'build/src.linux-i686-2.3/numpy/core/__ufunc_api.h' to sources. conv_template:> build/src.linux-i686-2.3/numpy/core/src/umathmodule.c adding 'build/src.linux-i686-2.3/numpy/core/src' to include_dirs. numpy.core - nothing done with h_files = ['build/src.linux-i686-2.3/numpy/core/src/scalartypes.inc', 'build/src.linux-i686-2.3/numpy/core/src/arraytypes.inc', 'build/src.linux-i686-2.3/numpy/core/config.h', 'build/src.linux-i686-2.3/numpy/core/__ufunc_api.h'] building extension "numpy.core._sort" sources adding 'build/src.linux-i686-2.3/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_array_api.py adding 'build/src.linux-i686-2.3/numpy/core/__multiarray_api.h' to sources. conv_template:> build/src.linux-i686-2.3/numpy/core/src/_sortmodule.c numpy.core - nothing done with h_files = ['build/src.linux-i686-2.3/numpy/core/config.h', 'build/src.linux-i686-2.3/numpy/core/__multiarray_api.h'] building extension "numpy.core.scalarmath" sources adding 'build/src.linux-i686-2.3/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_array_api.py adding 'build/src.linux-i686-2.3/numpy/core/__multiarray_api.h' to sources. executing numpy/core/code_generators/generate_ufunc_api.py adding 'build/src.linux-i686-2.3/numpy/core/__ufunc_api.h' to sources. 
conv_template:> build/src.linux-i686-2.3/numpy/core/src/scalarmathmodule.c numpy.core - nothing done with h_files = ['build/src.linux-i686-2.3/numpy/core/config.h', 'build/src.linux-i686-2.3/numpy/core/__multiarray_api.h', 'build/src.linux-i686-2.3/numpy/core/__ufunc_api.h'] building extension "numpy.core._dotblas" sources building extension "numpy.lib._compiled_base" sources building extension "numpy.numarray._capi" sources building extension "numpy.fft.fftpack_lite" sources building extension "numpy.linalg.lapack_lite" sources creating build/src.linux-i686-2.3/numpy/linalg adding 'numpy/linalg/lapack_litemodule.c' to sources. building extension "numpy.random.mtrand" sources creating build/src.linux-i686-2.3/numpy/random C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: _configtest.c gcc -pthread _configtest.o -o _configtest _configtest failure. removing: _configtest.c _configtest.o _configtest building data_files sources running build_py creating build/lib.linux-i686-2.3 creating build/lib.linux-i686-2.3/numpy copying numpy/setup.py -> build/lib.linux-i686-2.3/numpy copying numpy/matlib.py -> build/lib.linux-i686-2.3/numpy copying numpy/version.py -> build/lib.linux-i686-2.3/numpy copying numpy/dual.py -> build/lib.linux-i686-2.3/numpy copying numpy/__init__.py -> build/lib.linux-i686-2.3/numpy copying numpy/_import_tools.py -> build/lib.linux-i686-2.3/numpy copying numpy/add_newdocs.py -> build/lib.linux-i686-2.3/numpy copying numpy/ctypeslib.py -> build/lib.linux-i686-2.3/numpy copying build/src.linux-i686-2.3/numpy/__config__.py -> build/lib.linux-i686-2.3/numpy creating build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/setup.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/ccompiler.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/lib2def.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/exec_command.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/intelccompiler.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/mingw32ccompiler.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/cpuinfo.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/__init__.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/unixccompiler.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/conv_template.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/__version__.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/info.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/environment.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/misc_util.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/line_endings.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/from_template.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/system_info.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/core.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/log.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/extension.py -> build/lib.linux-i686-2.3/numpy/distutils copying numpy/distutils/interactive.py -> 
build/lib.linux-i686-2.3/numpy/distutils copying build/src.linux-i686-2.3/numpy/distutils/__config__.py -> build/lib.linux-i686-2.3/numpy/distutils creating build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/install.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/egg_info.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/config_compiler.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/build.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/__init__.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/bdist_rpm.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/sdist.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/install_headers.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/build_clib.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/build_py.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/build_scripts.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/build_src.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/install_data.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/config.py -> build/lib.linux-i686-2.3/numpy/distutils/command copying numpy/distutils/command/build_ext.py -> build/lib.linux-i686-2.3/numpy/distutils/command creating build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/vast.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/sun.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/lahey.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/nag.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/pg.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/__init__.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/none.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/g95.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/absoft.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/gnu.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/intel.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/mips.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/hpux.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/compaq.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/ibm.py -> build/lib.linux-i686-2.3/numpy/distutils/fcompiler creating build/lib.linux-i686-2.3/numpy/testing copying numpy/testing/setup.py -> build/lib.linux-i686-2.3/numpy/testing copying numpy/testing/parametric.py -> build/lib.linux-i686-2.3/numpy/testing copying numpy/testing/utils.py -> build/lib.linux-i686-2.3/numpy/testing copying numpy/testing/__init__.py -> build/lib.linux-i686-2.3/numpy/testing copying numpy/testing/info.py -> 
build/lib.linux-i686-2.3/numpy/testing copying numpy/testing/numpytest.py -> build/lib.linux-i686-2.3/numpy/testing creating build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/setup.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/f2py_testing.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/capi_maps.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/cfuncs.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/auxfuncs.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/rules.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/common_rules.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/__init__.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/__svn_version__.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/func2subr.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/diagnose.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/use_rules.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/__version__.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/crackfortran.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/info.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/cb_rules.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/f90mod_rules.py -> build/lib.linux-i686-2.3/numpy/f2py copying numpy/f2py/f2py2e.py -> build/lib.linux-i686-2.3/numpy/f2py creating build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/setup.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/nary.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/wrapper_base.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/api.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/__init__.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/py_wrap_type.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/main.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/py_wrap_subprogram.py -> build/lib.linux-i686-2.3/numpy/f2py/lib copying numpy/f2py/lib/py_wrap.py -> build/lib.linux-i686-2.3/numpy/f2py/lib creating build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/parsefortran.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/block_statements.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/pattern_tools.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/sourceinfo.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/Fortran2003.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/api.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/utils.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/__init__.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/test_Fortran2003.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/splitline.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/base_classes.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/statements.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/test_parser.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/typedecl_statements.py -> 
build/lib.linux-i686-2.3/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/readfortran.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/parser creating build/lib.linux-i686-2.3/numpy/f2py/lib/extgen copying numpy/f2py/lib/extgen/py_support.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/extgen copying numpy/f2py/lib/extgen/base.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/extgen copying numpy/f2py/lib/extgen/setup_py.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/extgen copying numpy/f2py/lib/extgen/utils.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/extgen copying numpy/f2py/lib/extgen/__init__.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/extgen copying numpy/f2py/lib/extgen/c_support.py -> build/lib.linux-i686-2.3/numpy/f2py/lib/extgen creating build/lib.linux-i686-2.3/numpy/core copying numpy/core/setup.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/ma.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/arrayprint.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/numeric.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/_internal.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/defmatrix.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/__init__.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/__svn_version__.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/records.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/info.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/numerictypes.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/defchararray.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/memmap.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/fromnumeric.py -> build/lib.linux-i686-2.3/numpy/core copying numpy/core/code_generators/generate_array_api.py -> build/lib.linux-i686-2.3/numpy/core creating build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/setup.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/ufunclike.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/polynomial.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/getlimits.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/index_tricks.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/function_base.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/utils.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/twodim_base.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/shape_base.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/__init__.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/user_array.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/type_check.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/info.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/machar.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/convdtype.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/arraysetops.py -> build/lib.linux-i686-2.3/numpy/lib copying numpy/lib/scimath.py -> build/lib.linux-i686-2.3/numpy/lib creating build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/misc.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/fix_default_axis.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/setup.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/ma.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/alter_code1.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying 
numpy/oldnumeric/rng_stats.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/alter_code2.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/matrix.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/compat.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/ufuncs.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/__init__.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/random_array.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/array_printer.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/linear_algebra.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/user_array.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/typeconv.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/precision.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/mlab.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/fft.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/rng.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/arrayfns.py -> build/lib.linux-i686-2.3/numpy/oldnumeric copying numpy/oldnumeric/functions.py -> build/lib.linux-i686-2.3/numpy/oldnumeric creating build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/setup.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/ma.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/alter_code1.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/util.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/alter_code2.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/convolve.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/matrix.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/compat.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/ufuncs.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/nd_image.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/__init__.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/random_array.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/linear_algebra.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/image.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/mlab.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/fft.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/numerictypes.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/session.py -> build/lib.linux-i686-2.3/numpy/numarray copying numpy/numarray/functions.py -> build/lib.linux-i686-2.3/numpy/numarray creating build/lib.linux-i686-2.3/numpy/fft copying numpy/fft/setup.py -> build/lib.linux-i686-2.3/numpy/fft copying numpy/fft/helper.py -> build/lib.linux-i686-2.3/numpy/fft copying numpy/fft/__init__.py -> build/lib.linux-i686-2.3/numpy/fft copying numpy/fft/info.py -> build/lib.linux-i686-2.3/numpy/fft copying numpy/fft/fftpack.py -> build/lib.linux-i686-2.3/numpy/fft creating build/lib.linux-i686-2.3/numpy/linalg copying numpy/linalg/setup.py -> build/lib.linux-i686-2.3/numpy/linalg copying numpy/linalg/linalg.py -> build/lib.linux-i686-2.3/numpy/linalg copying numpy/linalg/__init__.py -> build/lib.linux-i686-2.3/numpy/linalg copying 
numpy/linalg/info.py -> build/lib.linux-i686-2.3/numpy/linalg creating build/lib.linux-i686-2.3/numpy/random copying numpy/random/setup.py -> build/lib.linux-i686-2.3/numpy/random copying numpy/random/__init__.py -> build/lib.linux-i686-2.3/numpy/random copying numpy/random/info.py -> build/lib.linux-i686-2.3/numpy/random running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext customize Gnu95FCompiler customize Gnu95FCompiler using build_ext building 'numpy.core.multiarray' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3 creating build/temp.linux-i686-2.3/numpy creating build/temp.linux-i686-2.3/numpy/core creating build/temp.linux-i686-2.3/numpy/core/src compile options: '-Ibuild/src.linux-i686-2.3/numpy/core/src -Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: numpy/core/src/multiarraymodule.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/numpy/core/src/multiarraymodule.o -lm -lm -o build/lib.linux-i686-2.3/numpy/core/multiarray.so building 'numpy.core.umath' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3/build creating build/temp.linux-i686-2.3/build/src.linux-i686-2.3 creating build/temp.linux-i686-2.3/build/src.linux-i686-2.3/numpy creating build/temp.linux-i686-2.3/build/src.linux-i686-2.3/numpy/core creating build/temp.linux-i686-2.3/build/src.linux-i686-2.3/numpy/core/src compile options: '-Ibuild/src.linux-i686-2.3/numpy/core/src -Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: build/src.linux-i686-2.3/numpy/core/src/umathmodule.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/build/src.linux-i686-2.3/numpy/core/src/umathmodule.o -lm -o build/lib.linux-i686-2.3/numpy/core/umath.so building 'numpy.core._sort' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: build/src.linux-i686-2.3/numpy/core/src/_sortmodule.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/build/src.linux-i686-2.3/numpy/core/src/_sortmodule.o -lm -o build/lib.linux-i686-2.3/numpy/core/_sort.so building 'numpy.core.scalarmath' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC compile options: '-Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: build/src.linux-i686-2.3/numpy/core/src/scalarmathmodule.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/build/src.linux-i686-2.3/numpy/core/src/scalarmathmodule.o -lm -o build/lib.linux-i686-2.3/numpy/core/scalarmath.so building 'numpy.lib._compiled_base' extension compiling C sources C 
compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3/numpy/lib creating build/temp.linux-i686-2.3/numpy/lib/src compile options: '-Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: numpy/lib/src/_compiled_base.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/numpy/lib/src/_compiled_base.o -o build/lib.linux-i686-2.3/numpy/lib/_compiled_base.so building 'numpy.numarray._capi' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3/numpy/numarray compile options: '-Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: numpy/numarray/_capi.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/numpy/numarray/_capi.o -o build/lib.linux-i686-2.3/numpy/numarray/_capi.so building 'numpy.fft.fftpack_lite' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3/numpy/fft compile options: '-Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: numpy/fft/fftpack_litemodule.c gcc: numpy/fft/fftpack.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/numpy/fft/fftpack_litemodule.o build/temp.linux-i686-2.3/numpy/fft/fftpack.o -o build/lib.linux-i686-2.3/numpy/fft/fftpack_lite.so building 'numpy.linalg.lapack_lite' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3/numpy/linalg compile options: '-DNO_ATLAS_INFO=1 -Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: numpy/linalg/lapack_litemodule.c /usr/bin/gfortran -Wall -Wall -shared build/temp.linux-i686-2.3/numpy/linalg/lapack_litemodule.o -L/usr/lib -llapack -lblas -lgfortran -o build/lib.linux-i686-2.3/numpy/linalg/lapack_lite.so building 'numpy.random.mtrand' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -D_GNU_SOURCE -fPIC -O2 -g -march=i386 -mcpu=i686 -fPIC -fPIC creating build/temp.linux-i686-2.3/numpy/random creating build/temp.linux-i686-2.3/numpy/random/mtrand compile options: '-Inumpy/core/include -Ibuild/src.linux-i686-2.3/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.3 -c' gcc: numpy/random/mtrand/mtrand.c gcc: numpy/random/mtrand/randomkit.c gcc: numpy/random/mtrand/initarray.c gcc: numpy/random/mtrand/distributions.c gcc -pthread -shared -O2 -g -march=i386 -mcpu=i686 -fPIC build/temp.linux-i686-2.3/numpy/random/mtrand/mtrand.o build/temp.linux-i686-2.3/numpy/random/mtrand/randomkit.o build/temp.linux-i686-2.3/numpy/random/mtrand/initarray.o build/temp.linux-i686-2.3/numpy/random/mtrand/distributions.o -lm -o build/lib.linux-i686-2.3/numpy/random/mtrand.so running 
build_scripts creating build/scripts.linux-i686-2.3 Creating build/scripts.linux-i686-2.3/f2py adding 'build/scripts.linux-i686-2.3/f2py' to scripts changing mode of build/scripts.linux-i686-2.3/f2py from 644 to 755 + exit 0 Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.28086 + umask 022 + cd /usr/src/redhat/BUILD + cd numpy-1.0.4 + /usr/lib/rpm/brp-compress + /usr/lib/rpm/brp-strip + /usr/lib/rpm/brp-strip-static-archive + /usr/lib/rpm/brp-strip-comment-note Processing files: python-numpy-1.0.4-4.1 error: File not found by glob: /var/tmp/python-numpy-1.0.4-build/usr/bin/* error: File not found by glob: /var/tmp/python-numpy-1.0.4-build/usr/lib/python*/site-packages/numpy error: File not found by glob: /var/tmp/python-numpy-1.0.4-build/usr/lib/python*/site-packages/numpy*.egg-info RPM build errors: File not found by glob: /var/tmp/python-numpy-1.0.4-build/usr/bin/* File not found by glob: /var/tmp/python-numpy-1.0.4-build/usr/lib/python*/site-packages/numpy File not found by glob: /var/tmp/python-numpy-1.0.4-build/usr/lib/python*/site-packages/numpy*.egg-info From millman at berkeley.edu Mon May 12 12:52:26 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 09:52:26 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <3d375d730805120918p786cf4daj26c55a2b1f5f68f0@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <268febdf0805112114i372fb787odf965fd00e9dc519@mail.gmail.com> <3d375d730805112142x240b66ey99e0488f748626d@mail.gmail.com> <3d375d730805112331x7ff2eba1va9852f787d2eb013@mail.gmail.com> <3d375d730805120918p786cf4daj26c55a2b1f5f68f0@mail.gmail.com> Message-ID: On Mon, May 12, 2008 at 9:18 AM, Robert Kern wrote: > > The MA situation was handled slightly differently, which is > > unfortunate. I think that it is a reasonable change to make in a > > minor release, but we should have provided a warning first. At least, > > we are providing the old code still. Do you think that we should > > revisit this issue? Since we are importing the new MA from a > > different location maybe we should move the old code back to it's > > original location. Would it be reasonable to have: > > from numpy import ma --> new code > > from numpy.core import ma --> old code w/ a deprecation warning > > I think that might be better, yes. I would like to hear from more people about whether they think this would be a useful solution to the current complaints about the new MA implementation. In particular, I would like to hear from at least the following people please: Travis, Stefan, Pierre, and David. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Mon May 12 12:59:21 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 11:59:21 -0500 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <48287569.5080002@scratchspace.com> References: <48287569.5080002@scratchspace.com> Message-ID: <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> On Mon, May 12, 2008 at 11:50 AM, Chris Miller wrote: > > Hello, > I'm having trouble getting the python-numpy RPM to build under > CentOS 4.6. I've already built and installed atlas, lapack3 and > refblas3 from the CentOS 5 source RPMS, but numpy won't build > correctly. Although the ultimate error may be unrelated to Atlas, > clearly not all the atlas dependencies are being satisfied. 
Only the > .so files are supplied by the RPMS, but it seems that some > additional lib* files may be needed as well. > > Side notes; > > * gcc-g77 is not installed (apparently causes problems), gfortran is > installed from the gcc4 RPM. > * I ran "export LD_LIBRARY_PATH=/usr/lib/atlas/sse2" > * I created the missing directories in > /var/tmp/python-numpy-1.0.4-build but it's really not clear that > this is really the problem > * The numpy SPEC file shows a dependency on lapack3 < 3.1, but > rpmbuild doesn't complain about this, and it seems everyone uses > lapack 3.0.x > * Upgrading to CentOS 5 is not an option > > Can anyone provide any insight as to what exactly is missing from > Atlas and how to correctly resolve this dependency? Any other tips > are appreciated. rpmbuild output below. > running build_scripts > creating build/scripts.linux-i686-2.3 > Creating build/scripts.linux-i686-2.3/f2py > adding 'build/scripts.linux-i686-2.3/f2py' to scripts > changing mode of build/scripts.linux-i686-2.3/f2py from 644 to 755 > + exit 0 > Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.28086 > + umask 022 > + cd /usr/src/redhat/BUILD > + cd numpy-1.0.4 > + /usr/lib/rpm/brp-compress > + /usr/lib/rpm/brp-strip > + /usr/lib/rpm/brp-strip-static-archive > + /usr/lib/rpm/brp-strip-comment-note > Processing files: python-numpy-1.0.4-4.1 > error: File not found by glob: > /var/tmp/python-numpy-1.0.4-build/usr/bin/* > error: File not found by glob: > /var/tmp/python-numpy-1.0.4-build/usr/lib/python*/site-packages/numpy > error: File not found by glob: > /var/tmp/python-numpy-1.0.4-build/usr/lib/python*/site-packages/numpy*.egg-info It looks like nothing actually executed "python setup.py install" to put the files into /var/tmp/python-numpy-1.0.4-build/. I suspect a problem in the spec file. Unfortunately, I am not familiar with building RPMs, so I don't know where you got the spec file from or where you can get a good one. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chanley at stsci.edu Mon May 12 13:02:53 2008 From: chanley at stsci.edu (Christopher Hanley) Date: Mon, 12 May 2008 13:02:53 -0400 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> Message-ID: <4828783D.9010600@stsci.edu> Jarrod Millman wrote: > > I can't imagine why anyone would have to upgrade. Could you explain > under what circumstances you would see having to upgrade just because > their is a new upstream release? I think I must be misunderstanding > your concern. > One circumstance in which you would need to upgrade is if you distribute software with a numpy dependency. If your user base upgrades to the latest numpy release, and that latest release breaks your code, you will have unhappy users. -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From mike.ressler at alum.mit.edu Mon May 12 13:07:42 2008 From: mike.ressler at alum.mit.edu (Mike Ressler) Date: Mon, 12 May 2008 10:07:42 -0700 Subject: [Numpy-discussion] Going toward time-based release ? 
In-Reply-To: <48283AF9.801@ar.media.kyoto-u.ac.jp> References: <1210561167.17972.17.camel@bbc8> <48283AF9.801@ar.media.kyoto-u.ac.jp> Message-ID: <268febdf0805121007j23f77033g9ff0e92dd746f4eb@mail.gmail.com> On Mon, May 12, 2008 at 5:41 AM, David Cournapeau wrote: > Joris De Ridder wrote: > > As Mike, I'm a bit sceptic about the whole idea. The current way > > doesn't seem broken, so why fix it? > > If the recent events do not show that something went wrong, I don't know > what will :)... > > The plan really is to have a code freeze and hard freeze between each > release, and time-based releases are more natural for that IMO, because > contributors can plan and synchronize more easily. Maybe it won't work, > but it worths a try; As Robert mentioned it before, code freeze and hard > code freeze is what matters. I agree with this, so I think I'll have to apologize for misunderstanding the precise intent of the original message. Code freezing and plenty of testing to a schedule are a good thing. My only concern is that people don't feel compelled to push out a new release simply because the calendar rolled over (Fedora cough; Ubuntu, cough, cough). If there is a compelling feature set or bug fix, then by all means, set the schedule and go for it. Just don't fire up a release solely to show signs of life. Mike P.S. I'll take that median change in 1.1, wink :-) -- mike.ressler at alum.mit.edu From centos at scratchspace.com Mon May 12 13:29:51 2008 From: centos at scratchspace.com (Chris Miller) Date: Mon, 12 May 2008 10:29:51 -0700 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> Message-ID: <48287E8F.7020101@scratchspace.com> Robert Kern wrote: > It looks like nothing actually executed "python setup.py install" to > put the files into /var/tmp/python-numpy-1.0.4-build/. I suspect a > problem in the spec file. Unfortunately, I am not familiar with > building RPMs, so I don't know where you got the spec file from or > where you can get a good one. You're right about that. There was a OS flavor conditional that was not being evaluated properly. I commented these out and nailed the proper commands and it compiled. I'm still concerned about the missing library warnings, but I'll give it a try and see if it works properly. Thanks! Chris From millman at berkeley.edu Mon May 12 13:48:55 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 10:48:55 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <200805121330.26274.pgmdevlist@gmail.com> References: <1210561167.17972.17.camel@bbc8> <3d375d730805120918p786cf4daj26c55a2b1f5f68f0@mail.gmail.com> <200805121330.26274.pgmdevlist@gmail.com> Message-ID: On Mon, May 12, 2008 at 10:30 AM, Pierre GM wrote: > numpy.oldnumeric.ma is not bug-free, however: Charles recently pointed out a > problem with power, that failed to work properly with a float exponent. We > fixed the problem in numpy.ma, but left the old package unmodified. In short, > numpy.oldnumeric.ma is no longer supported, and left for convenience. My question was whether there was some way we could, by making some very minor changes, make the transition more gradual. Specifically, I was suggesting two things: 1. It seemed to me that a lot of the users of the old ma implementation called it from np.core. 
Since the new implementation doesn't get called from there, would it make sense to have the old implementation reside in np.core rather than in np.oldnumeric? 2. Should we add an import warning to the old implementation explaining the change. Something like "np.core.ma (or np.oldnumeric.ma) is deprecated. It is no longer being supported, so it will no longer receive bug fixes. Please consider using np.ma. In 1.3, np.core.ma is moving to np.oldnumeric.ma. Any comments on whether this would be helpful or useful. Is there anything else we should consider to ease the pains caused by this transition? Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Mon May 12 13:56:41 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 10:56:41 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <4828783D.9010600@stsci.edu> References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> Message-ID: On Mon, May 12, 2008 at 10:02 AM, Christopher Hanley wrote: > One circumstance in which you would need to upgrade is if you distribute > software with a numpy dependency. If your user base upgrades to the > latest numpy release, and that latest release breaks your code, you will > have unhappy users. I see, the issue is whether you(plural) will need to update your code base to support your users who may have updated to a new NumPy/SciPy release. This concern really goes to whether we should ever break code with our releases, which is orthogonal to whether we should try using a time-based release cycle. It is very clear that our users are not happy with the amount of API breaks in 1.1. All I can say, is that I am sorry that the current release is going to break some code bases out there. I am trying to figure out if there is a way to mitigate the problems caused by this release and would be happy to hear comments about how we could best reduce the problems caused by this release. In particular, it would be useful if I could get some feedback on my suggestion about the MA transition. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Mon May 12 14:05:35 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 13:05:35 -0500 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <48287E8F.7020101@scratchspace.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> Message-ID: <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> On Mon, May 12, 2008 at 12:29 PM, Chris Miller wrote: > Robert Kern wrote: > > > It looks like nothing actually executed "python setup.py install" to > > put the files into /var/tmp/python-numpy-1.0.4-build/. I suspect a > > problem in the spec file. Unfortunately, I am not familiar with > > building RPMs, so I don't know where you got the spec file from or > > where you can get a good one. > > You're right about that. There was a OS flavor conditional that was > not being evaluated properly. I commented these out and nailed the > proper commands and it compiled. > > I'm still concerned about the missing library warnings, but I'll > give it a try and see if it works properly. Can you show us $ ls -l /usr/lib/atlas/sse2 ? 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Mon May 12 14:02:44 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 12 May 2008 14:02:44 -0400 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <200805121330.26274.pgmdevlist@gmail.com> Message-ID: <200805121402.45046.pgmdevlist@gmail.com> On Monday 12 May 2008 13:48:55 you wrote: > 1. It seemed to me that a lot of the users of the old ma > implementation called it from np.core. Since the new implementation > doesn't get called from there, would it make sense to have the old > implementation reside in np.core rather than in np.oldnumeric? That sounds like a good idea if np.oldnumeric is to disappear very soon. If not, maybe we could have np.core.ma points to np.oldnumeric.ma (or just do a "from np.oldnumeric.ma import *") > 2. Should we add an import warning to the old implementation > explaining the change. Something like "np.core.ma (or > np.oldnumeric.ma) is deprecated. It is no longer being supported, so > it will no longer receive bug fixes. Please consider using np.ma. In > 1.3, np.core.ma is moving to np.oldnumeric.ma. Sounds good. > Any comments on whether this would be helpful or useful. Is there > anything else we should consider to ease the pains caused by this > transition? A wiki page on www.scipy.org where we would (1) describe the API changes (it's now hidden in the DeveloperZone); (2) suggest some simple solutions to common problems. The (1) is easy to do, the (2) would be updated when needed. From centos at scratchspace.com Mon May 12 14:10:19 2008 From: centos at scratchspace.com (Chris Miller) Date: Mon, 12 May 2008 11:10:19 -0700 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> Message-ID: <4828880B.3030805@scratchspace.com> Robert Kern wrote: > Can you show us > > $ ls -l /usr/lib/atlas/sse2 # ls -l /usr/lib/atlas/sse2 total 3868 -rw-r--r-- 1 root root 3705240 May 11 13:02 libblas.so.3.0 -rw-r--r-- 1 root root 243466 May 11 13:02 liblapack.so.3.0 Chris From efiring at hawaii.edu Mon May 12 14:13:29 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 08:13:29 -1000 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <48261CFC.2020601@hawaii.edu> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> Message-ID: <482888C9.1030105@hawaii.edu> To close out this thread: With r5155 Travis fixed the problem, so the ticket is closed. Thank you! Eric Eric Firing wrote: > I have added a patch to the ticket. I believe it fixes the problem. 
It From robert.kern at gmail.com Mon May 12 14:25:52 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 13:25:52 -0500 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <4828880B.3030805@scratchspace.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> Message-ID: <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> On Mon, May 12, 2008 at 1:10 PM, Chris Miller wrote: > Robert Kern wrote: > > > Can you show us > > > > $ ls -l /usr/lib/atlas/sse2 > > # ls -l /usr/lib/atlas/sse2 > total 3868 > -rw-r--r-- 1 root root 3705240 May 11 13:02 libblas.so.3.0 > -rw-r--r-- 1 root root 243466 May 11 13:02 liblapack.so.3.0 Okay, can you also do the following: $ ls -l /usr/lib/lib{lapack,f77blas,cblas,atlas}.* $ ls -l /usr/lib/atlas/lib{lapack,f77blas,cblas,atlas}.* Is there an atlas-devel RPM that you also need to install? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ellisonbg.net at gmail.com Mon May 12 15:00:31 2008 From: ellisonbg.net at gmail.com (Brian Granger) Date: Mon, 12 May 2008 13:00:31 -0600 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <1210561167.17972.17.camel@bbc8> References: <1210561167.17972.17.camel@bbc8> Message-ID: <6ce0ac130805121200r4b25013ap82720bd71b94edaf@mail.gmail.com> Hi, As Fernando mentioned, we are considering moving to a time-based released process with IPython1. Obviously, IPython1 is a very different project than numpy, but I figured it might be useful to state some of the other reasons we are thinking about going this direction: 1. It stops feature creep - "oh, just this one more thing and then we will release" We really struggle with this in IPython1. 2. It is a way of saying to the community regularly "this project is not dead, we are fixing bugs and adding new features all the time." For those of us who follow the lists, this is not as big of a deal, but _many_ users are not on the lists. When they see that there hasn't been a release in 6 months, what do they think? This is marketing. 3. It gets bug fixes and better tested code into users hands sooner. Think of all the bugs that have been fixes since 1.0.4 was released in November! 4. It gives developers regular deadlines which (hopefully) motivates them to contribute code more regularly. An aside (this holds true for Numpy, Scipy and IPython): I have noticed that all of these projects are very slow to bump up the version numbers (for example, IPython0 is only at 0.8.x). I think this is unfortunate because it leaves us with no choice but to have API breakage + huge new features in minor releases. Why not be more bold and start the up the version numbers more quickly to reflect that lots of code is being written? My theory - we are all perfectionists and we are afraid to release NumPy 2.0/SciPy1.0/IPython0-1.0 because we know the code is not perfect yet. I think Sage (they are > 3.0 at this point and started long after scipy/numpy/ipython) is a great counterexample of a project that is not afraid to bump the version numbers up quickly to reflect fast moving code. My two cents. 
Cheers, Brian On Sun, May 11, 2008 at 8:59 PM, David Cournapeau wrote: > Hi, > > I would like to know how people feel about going toward a time-based > release process for numpy (and scipy). By time-based release, I mean: > - releases of numpy are time-based, not feature based. > - a precise schedule is fixed, and the release manager(s) try to > enforce this schedule. > > Why ? I already suggested the idea a few months ago, and I relaunch the > idea because believe the recent masked array + matrix issues could have > been somewhat avoided with such a process (from a release point of view, > of course). With a time-based release, there is a period where people > can write to the release branch, try new features, and a freeze period > where only bug fixes are allowed (and normally, no api changes are > allowed). Also, time-based releases are by definition predictable, and > as such, it is easier to plan upgrades for users, and to plan breaks for > developers (for example, if we release say every 3 months, we would > allow one or two releases to warn about future incompatible changes, > before breaking them for real: people would know it means 6 months to > change their code). > > The big drawback is of course someone has to do the job. I like the way > bzr developers do it; every new release, someone else volunteer to do > the release, so it is not always the same who do the boring job. > > Do other people see this suggestion as useful ? If yes, we would have to > decide on: > - a release period (3 months sounds like a reasonable period to me ?) > - a schedule within a release (api breaks would only be allowed in the > first month, code addition would be allowed up to two months, and only > bug fixes the last month, for example). > - who does the process (if nobody steps in, I would volunteer for the > first round, if only for seeing how/if it works). > > cheers, > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From efiring at hawaii.edu Mon May 12 15:05:25 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 09:05:25 -1000 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> Message-ID: <482894F5.7040307@hawaii.edu> Jarrod Millman wrote: [...] > > It is very clear that our users are not happy with the amount of API > breaks in 1.1. All I can say, is that I am sorry that the current > release is going to break some code bases out there. I am trying to > figure out if there is a way to mitigate the problems caused by this > release and would be happy to hear comments about how we could best > reduce the problems caused by this release. In particular, it would > be useful if I could get some feedback on my suggestion about the MA > transition. Jarrod, As one who pushed for the MA transition, I appreciate your suggestion. It may have one unintended consequence, however, that may cause more trouble than it saves: it may lead to more situations where the ma versions are unintentionally mixed. This will probably work if an old_ma array ends up as input for a new_ma function, but the reverse often will not work correctly. 
(Pierre GM tried to make his MaskedArray Here is an illustration (untested, but should show the problem): module1.py: ----------- import numpy.core.ma as ma # with your suggestion, imports old_ma def dummy(arr): return ma.sin(arr) script.py: ----------- from module1 import dummy from pylab import * # now ma is new_ma x = ma.array([1,2,3], mask=[True, False, False]) plot(dummy(x)) The earlier strategy, of putting the old version solely in oldnumeric, is somewhat less likely to cause the problem because it requires anyone wanting the old version to deliberately select it--so at least the person is then aware that something has changed. Eric From centos at scratchspace.com Mon May 12 15:19:06 2008 From: centos at scratchspace.com (Chris Miller) Date: Mon, 12 May 2008 12:19:06 -0700 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> Message-ID: <4828982A.1040708@scratchspace.com> Robert Kern wrote: > Okay, can you also do the following: > > $ ls -l /usr/lib/lib{lapack,f77blas,cblas,atlas}.* > $ ls -l /usr/lib/atlas/lib{lapack,f77blas,cblas,atlas}.* -rw-r--r-- 1 root root 5083708 May 9 16:10 /usr/lib/liblapack.a lrwxrwxrwx 1 root root 18 May 11 10:47 /usr/lib/liblapack.so -> liblapack.so.3.0.0 lrwxrwxrwx 1 root root 18 May 11 10:47 /usr/lib/liblapack.so.3 -> liblapack.so.3.0.0 -rw-r--r-- 1 root root 3831921 May 9 16:10 /usr/lib/liblapack.so.3.0.0 On the f77 stuff, I read specifically to delete the gcc-f77 package as that conflicts with gcc-gfortran. The latter we are using is actually gcc4 not gcc3 (fyi). I see in Atlas that G77 is defined as gfortran, so this may not be an issue. > Is there an atlas-devel RPM that you also need to install? The Source RPM from the ashigabou repository does not generate a -devel RPM. I used the CentOS5 SRPMS here : http://download.opensuse.org/repositories/home:/ashigabou/CentOS_5/src/ Looks like the devel generation stuff in the spec file is incomplete and commented out. I did try to install Atlas from source at one point, but numpy still had issues finding libraries. 
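One way to see what numpy's build machinery actually detects, independent of the RPM packaging, is to query numpy.distutils directly. A short sketch (output varies by machine; an empty dict just means system_info did not find that library in the directories it searched):

    import numpy as np
    np.show_config()                      # what the installed numpy was linked against

    from numpy.distutils.system_info import get_info
    print(get_info('atlas'))              # what system_info can locate right now
    print(get_info('lapack_opt'))         # the LAPACK configuration numpy would pick up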
Chris From robert.kern at gmail.com Mon May 12 15:29:38 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 14:29:38 -0500 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <4828982A.1040708@scratchspace.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> <4828982A.1040708@scratchspace.com> Message-ID: <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> On Mon, May 12, 2008 at 2:19 PM, Chris Miller wrote: > Robert Kern wrote: > > > Okay, can you also do the following: > > > > $ ls -l /usr/lib/lib{lapack,f77blas,cblas,atlas}.* > > $ ls -l /usr/lib/atlas/lib{lapack,f77blas,cblas,atlas}.* > > -rw-r--r-- 1 root root 5083708 May 9 16:10 /usr/lib/liblapack.a > lrwxrwxrwx 1 root root 18 May 11 10:47 /usr/lib/liblapack.so > -> liblapack.so.3.0.0 > lrwxrwxrwx 1 root root 18 May 11 10:47 /usr/lib/liblapack.so.3 > -> liblapack.so.3.0.0 > -rw-r--r-- 1 root root 3831921 May 9 16:10 /usr/lib/liblapack.so.3.0.0 > > On the f77 stuff, I read specifically to delete the gcc-f77 package > as that conflicts with gcc-gfortran. The latter we are using is > actually gcc4 not gcc3 (fyi). I see in Atlas that G77 is defined as > gfortran, so this may not be an issue. It's still called libf77blas regardless. Well, you don't have ATLAS installed. Or if you do, it's an extremely weird installation of ATLAS. > > Is there an atlas-devel RPM that you also need to install? > > The Source RPM from the ashigabou repository does not generate a > -devel RPM. I used the CentOS5 SRPMS here : > > http://download.opensuse.org/repositories/home:/ashigabou/CentOS_5/src/ > > Looks like the devel generation stuff in the spec file is incomplete > and commented out. I did try to install Atlas from source at one > point, but numpy still had issues finding libraries. Can you show me a list of files in the RPM that you installed? I don't know the rpm command off-hand. Since these are David Cournapeau's RPMs, perhaps he can chime in, here. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Mon May 12 15:38:28 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 09:38:28 -1000 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <482894F5.7040307@hawaii.edu> References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> <482894F5.7040307@hawaii.edu> Message-ID: <48289CB4.9090902@hawaii.edu> This is a patch for the previous message; I got distracted and failed to finish a sentence. Eric Firing wrote: [...] > versions are unintentionally mixed. This will probably work if an > old_ma array ends up as input for a new_ma function, but the reverse > often will not work correctly. (Pierre GM tried to make his MaskedArray handle old-style ma arrays gracefully, but there were no changes to old ma to make the reverse true.) 
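For concreteness, the import-time warning proposed earlier in the thread need not be more than a few lines. A minimal sketch -- the module location and wording are illustrative only, and whether it re-exports the old or the new implementation is exactly the policy question being debated:

    # Hypothetical contents of the deprecated import location (e.g. numpy/core/ma.py).
    import warnings

    warnings.warn(
        "importing masked arrays from this location is deprecated; "
        "please use numpy.ma (the old, unmaintained implementation "
        "remains available as numpy.oldnumeric.ma)",
        DeprecationWarning,
        stacklevel=2,
    )

    # Re-export whichever implementation the release policy settles on.
    from numpy.ma import *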
From centos at scratchspace.com Mon May 12 15:39:43 2008 From: centos at scratchspace.com (Chris Miller) Date: Mon, 12 May 2008 12:39:43 -0700 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> <4828982A.1040708@scratchspace.com> <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> Message-ID: <48289CFF.90408@scratchspace.com> Robert Kern wrote: > On Mon, May 12, 2008 at 2:19 PM, Chris Miller wrote: >> On the f77 stuff, I read specifically to delete the gcc-f77 package >> as that conflicts with gcc-gfortran. The latter we are using is >> actually gcc4 not gcc3 (fyi). I see in Atlas that G77 is defined as >> gfortran, so this may not be an issue. > > It's still called libf77blas regardless. Well, you don't have ATLAS > installed. Or if you do, it's an extremely weird installation of > ATLAS. > > Can you show me a list of files in the RPM that you installed? I don't > know the rpm command off-hand. Since these are David Cournapeau's > RPMs, perhaps he can chime in, here. I think you hit the nail on the head, we need the -devel package which would contain the files in /usr/lib. All this RPM has is the shared objects : # rpm -ql atlas /usr/lib/atlas/sse2/libblas.so.3.0 /usr/lib/atlas/sse2/liblapack.so.3.0 David, do you have an Atlas spec file that builds the -devel package? Chris From stefan at sun.ac.za Mon May 12 15:57:03 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 12 May 2008 21:57:03 +0200 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <482894F5.7040307@hawaii.edu> References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> <482894F5.7040307@hawaii.edu> Message-ID: <9457e7c80805121257mdea15br5cddcefeca015f56@mail.gmail.com> 2008/5/12 Eric Firing : > The earlier strategy, of putting the old version solely in oldnumeric, > is somewhat less likely to cause the problem because it requires anyone > wanting the old version to deliberately select it--so at least the > person is then aware that something has changed. I think that's a good idea. If I recall correctly, numpy.core.ma had been exposed as numpy.ma, and some code uses it. Having an old and a new version in places where previously only the old version used to be won't solve the problem. The two MaskedArray modules are (supposed to be) API compatible, and we shouldn't see any breakage (Charles' experience was unfortunate, but that's been sorted out now); therefore, I'd expose the new masked arrays as numpy.ma and add a warning to the release message, which also refers anyone with problems to numpy.oldnumeric.ma (which, btw, has a number of bugs itself, which were discovered while coding the new masked arrays). Regards St?fan From efiring at hawaii.edu Mon May 12 16:01:32 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 10:01:32 -1000 Subject: [Numpy-discussion] Going toward time-based release ? 
In-Reply-To: <6ce0ac130805121200r4b25013ap82720bd71b94edaf@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <6ce0ac130805121200r4b25013ap82720bd71b94edaf@mail.gmail.com> Message-ID: <4828A21C.4010709@hawaii.edu> Brian Granger wrote: > Hi, > > As Fernando mentioned, we are considering moving to a time-based > released process with IPython1. Obviously, IPython1 is a very > different project than numpy, but I figured it might be useful to > state some of the other reasons we are thinking about going this > direction: > > 1. It stops feature creep - "oh, just this one more thing and then we > will release" We really struggle with this in IPython1. > > 2. It is a way of saying to the community regularly "this project is > not dead, we are fixing bugs and adding new features all the time." > For those of us who follow the lists, this is not as big of a deal, > but _many_ users are not on the lists. When they see that there > hasn't been a release in 6 months, what do they think? This is > marketing. > > 3. It gets bug fixes and better tested code into users hands sooner. > Think of all the bugs that have been fixes since 1.0.4 was released in > November! This is a major point, and it has related aspects: 3a) The only way to get code fully tested is to get it out and in use. People have to try it in real life. There are severe limits to what unit testing can accomplish. A combination of something like daily (or weekly) builds and more frequent scheduled releases would facilitate more effective testing, and would give people more time to find out what works and what breaks, what we need to fix and what they will need to change. 3b) The above is one of the reasons I favored merging the new MA implementation--it needed more exposure, and there seemed to be no other mechanism to get it--but the other is that given the uncertainty regarding release timing it looked like practically "now or never". With a release cycle and a policy for changes in place, the merge could have been planned better and started earlier. I am very sympathetic to the argument for stability, but it has to be balanced against the need for genuine improvement and maintenance, and against the developers' time burden associated with maintaining multiple versions, or support for multiple versions. > > 4. It gives developers regular deadlines which (hopefully) motivates > them to contribute code more regularly. > > An aside (this holds true for Numpy, Scipy and IPython): I have > noticed that all of these projects are very slow to bump up the > version numbers (for example, IPython0 is only at 0.8.x). I think > this is unfortunate because it leaves us with no choice but to have > API breakage + huge new features in minor releases. Why not be more > bold and start the up the version numbers more quickly to reflect that > lots of code is being written? My theory - we are all perfectionists > and we are afraid to release NumPy 2.0/SciPy1.0/IPython0-1.0 because > we know the code is not perfect yet. I think Sage (they are > 3.0 at > this point and started long after scipy/numpy/ipython) is a great > counterexample of a project that is not afraid to bump the version > numbers up quickly to reflect fast moving code. My two cents. All good points. Thank you. Eric > > Cheers, > > Brian > > > On Sun, May 11, 2008 at 8:59 PM, David Cournapeau > wrote: >> Hi, >> >> I would like to know how people feel about going toward a time-based >> release process for numpy (and scipy). 
By time-based release, I mean: >> - releases of numpy are time-based, not feature based. >> - a precise schedule is fixed, and the release manager(s) try to >> enforce this schedule. >> >> Why ? I already suggested the idea a few months ago, and I relaunch the >> idea because believe the recent masked array + matrix issues could have >> been somewhat avoided with such a process (from a release point of view, >> of course). With a time-based release, there is a period where people >> can write to the release branch, try new features, and a freeze period >> where only bug fixes are allowed (and normally, no api changes are >> allowed). Also, time-based releases are by definition predictable, and >> as such, it is easier to plan upgrades for users, and to plan breaks for >> developers (for example, if we release say every 3 months, we would >> allow one or two releases to warn about future incompatible changes, >> before breaking them for real: people would know it means 6 months to >> change their code). >> >> The big drawback is of course someone has to do the job. I like the way >> bzr developers do it; every new release, someone else volunteer to do >> the release, so it is not always the same who do the boring job. >> >> Do other people see this suggestion as useful ? If yes, we would have to >> decide on: >> - a release period (3 months sounds like a reasonable period to me ?) >> - a schedule within a release (api breaks would only be allowed in the >> first month, code addition would be allowed up to two months, and only >> bug fixes the last month, for example). >> - who does the process (if nobody steps in, I would volunteer for the >> first round, if only for seeing how/if it works). >> >> cheers, >> >> David >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Mon May 12 16:34:19 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 12 May 2008 22:34:19 +0200 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <482888C9.1030105@hawaii.edu> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> Message-ID: <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> 2008/5/12 Eric Firing : > To close out this thread: > > With r5155 Travis fixed the problem, so the ticket is closed. Strange, when I look over that patch, my keyboard automatically holds in Shift and starts typing 1 through 9: (*&@#*&@#&%@#(*&)@# I ask you with tears in my eyes: where is the regression test? St?fan From millman at berkeley.edu Mon May 12 16:39:24 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 13:39:24 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <9457e7c80805121257mdea15br5cddcefeca015f56@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> <482894F5.7040307@hawaii.edu> <9457e7c80805121257mdea15br5cddcefeca015f56@mail.gmail.com> Message-ID: On Mon, May 12, 2008 at 12:57 PM, St?fan van der Walt wrote: > I think that's a good idea. If I recall correctly, numpy.core.ma had > been exposed as numpy.ma, and some code uses it. 
Having an old and a > new version in places where previously only the old version used to be > won't solve the problem. The two MaskedArray modules are (supposed to > be) API compatible, and we shouldn't see any breakage (Charles' > experience was unfortunate, but that's been sorted out now); > therefore, I'd expose the new masked arrays as numpy.ma and add a > warning to the release message, which also refers anyone with problems > to numpy.oldnumeric.ma (which, btw, has a number of bugs itself, > which were discovered while coding the new masked arrays). Do you think that the release notes all ready capture this: http://projects.scipy.org/scipy/numpy/milestone/1.1.0 If not, please feel free to suggest some additional language. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Mon May 12 16:42:46 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 12 May 2008 13:42:46 -0700 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <482894F5.7040307@hawaii.edu> References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> <482894F5.7040307@hawaii.edu> Message-ID: On Mon, May 12, 2008 at 12:05 PM, Eric Firing wrote: > As one who pushed for the MA transition, I appreciate your suggestion. > It may have one unintended consequence, however, that may cause more > trouble than it saves: it may lead to more situations where the ma > versions are unintentionally mixed. This will probably work if an > old_ma array ends up as input for a new_ma function, but the reverse > often will not work correctly. Good point. I now agree that it is best to avoid having the old code accessible via np.core.ma. If the old code is only accessible via np.oldnumeric, then I don't see any reason to add a deprecation warning since it is kind of implied by placing it in np.oldnumeric. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From centos at scratchspace.com Mon May 12 16:44:40 2008 From: centos at scratchspace.com (Chris Miller) Date: Mon, 12 May 2008 13:44:40 -0700 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> <4828982A.1040708@scratchspace.com> <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> Message-ID: <4828AC38.2010409@scratchspace.com> Robert, I looked in the build directory from Atlas and it is indeed generating the additional libraries. I copied them over to /usr/lib and rebuilt the numpy RPM. Indeed the behavior is different, and numpy sees the libraries. Of course some new oddities have entered the equation. Specifically this : ********************************************************************* Lapack library (from ATLAS) is probably incomplete: size of /usr/lib/liblapack.so is 3742k (expected >4000k) Follow the instructions in the KNOWN PROBLEMS section of the file numpy/INSTALL.txt. 
********************************************************************* The INSTALL file is missing :-( I wonder if this is really a problem or just the result of stripping the binary (assuming RPM stripped it). I rebuilt lapack3 but the binary is still under 4MB. Let me know what you think. Below is the rest of the relevant output. BTW, thanks for your responsiveness to this issue. Chris F2PY Version 2_4422 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/lib'] language = c lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas libraries lapack_atlas not found in /usr/lib/atlas libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info Setting PTATLAS=ATLAS /usr/src/redhat/BUILD/numpy-1.0.4/numpy/distutils/system_info.py:955: UserWarning: ********************************************************************* Lapack library (from ATLAS) is probably incomplete: size of /usr/lib/liblapack.so is 3742k (expected >4000k) Follow the instructions in the KNOWN PROBLEMS section of the file numpy/INSTALL.txt. ********************************************************************* warnings.warn(message) Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/lib'] language = c From robert.kern at gmail.com Mon May 12 17:27:44 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 16:27:44 -0500 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <4828AC38.2010409@scratchspace.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> <4828982A.1040708@scratchspace.com> <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> <4828AC38.2010409@scratchspace.com> Message-ID: <3d375d730805121427w6053bd01xd9f783c548cf9610@mail.gmail.com> On Mon, May 12, 2008 at 3:44 PM, Chris Miller wrote: > Robert, > I looked in the build directory from Atlas and it is indeed > generating the additional libraries. I copied them over to /usr/lib > and rebuilt the numpy RPM. Indeed the behavior is different, and > numpy sees the libraries. Of course some new oddities have entered > the equation. 
Specifically this : > > ********************************************************************* > Lapack library (from ATLAS) is probably incomplete: > size of /usr/lib/liblapack.so is 3742k (expected >4000k) > > Follow the instructions in the KNOWN PROBLEMS section of the file > numpy/INSTALL.txt. > ********************************************************************* > > The INSTALL file is missing :-( I wonder if this is really a problem > or just the result of stripping the binary (assuming RPM stripped > it). I rebuilt lapack3 but the binary is still under 4MB. Let me > know what you think. Hmm. That text got copied over from scipy without thinking way back in the day. Anyways, see this: http://svn.scipy.org/svn/scipy/trunk/INSTALL.txt Basically, ATLAS only provides a few optimized LAPACK functions. To get a complete LAPACK, you need a FORTRAN LAPACK built first, then replace the relevant object files with the ATLAS-optimized versions. But you're probably correct that it's just symbol stripping. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at enthought.com Mon May 12 17:57:26 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 12 May 2008 16:57:26 -0500 Subject: [Numpy-discussion] Test from work Message-ID: <4828BD46.9030108@enthought.com> This is a test -teo From peridot.faceted at gmail.com Mon May 12 18:30:24 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 12 May 2008 18:30:24 -0400 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> <482894F5.7040307@hawaii.edu> Message-ID: 2008/5/12 Jarrod Millman : > On Mon, May 12, 2008 at 12:05 PM, Eric Firing wrote: >> As one who pushed for the MA transition, I appreciate your suggestion. >> It may have one unintended consequence, however, that may cause more >> trouble than it saves: it may lead to more situations where the ma >> versions are unintentionally mixed. This will probably work if an >> old_ma array ends up as input for a new_ma function, but the reverse >> often will not work correctly. > > Good point. I now agree that it is best to avoid having the old code > accessible via np.core.ma. If the old code is only accessible via > np.oldnumeric, then I don't see any reason to add a deprecation > warning since it is kind of implied by placing it in np.oldnumeric. Does it make sense to make the *new* code available in numpy.core.ma, optionally with a DeprecationWarning? I realize it was supposedly never advertised as being there, but given the state of numpy documentation many users will have found it there; at the least matplotlib did. Anne From efiring at hawaii.edu Mon May 12 18:52:42 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 12:52:42 -1000 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <4828783D.9010600@stsci.edu> <482894F5.7040307@hawaii.edu> Message-ID: <4828CA3A.7070207@hawaii.edu> Anne Archibald wrote: > 2008/5/12 Jarrod Millman : >> On Mon, May 12, 2008 at 12:05 PM, Eric Firing wrote: >>> As one who pushed for the MA transition, I appreciate your suggestion. 
>>> It may have one unintended consequence, however, that may cause more >>> trouble than it saves: it may lead to more situations where the ma >>> versions are unintentionally mixed. This will probably work if an >>> old_ma array ends up as input for a new_ma function, but the reverse >>> often will not work correctly. >> Good point. I now agree that it is best to avoid having the old code >> accessible via np.core.ma. If the old code is only accessible via >> np.oldnumeric, then I don't see any reason to add a deprecation >> warning since it is kind of implied by placing it in np.oldnumeric. > > Does it make sense to make the *new* code available in numpy.core.ma, > optionally with a DeprecationWarning? I realize it was supposedly > never advertised as being there, but given the state of numpy > documentation many users will have found it there; at the least > matplotlib did. If it can be done cleanly with a DeprecationWarning, then I don't see any problem offhand. Eric > > Anne From efiring at hawaii.edu Mon May 12 19:00:55 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 13:00:55 -1000 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> Message-ID: <4828CC27.5080906@hawaii.edu> St?fan van der Walt wrote: > 2008/5/12 Eric Firing : >> To close out this thread: >> >> With r5155 Travis fixed the problem, so the ticket is closed. > > Strange, when I look over that patch, my keyboard automatically holds > in Shift and starts typing 1 through 9: > > (*&@#*&@#&%@#(*&)@# > > I ask you with tears in my eyes: where is the regression test? Stefan, Maybe Travis could whip one up instantly, but I can't; I never figured out what is the difference between the dtypes of the ndarray that triggered the bug and the more common ones that don't, so I simply don't know how to make a test case. I'm sure I *could* figure it out, but it might take quite a bit of time--and in the grand scheme of things, I don't think that would be time particularly well-spent. I can't make it a priority. Eric From robert.kern at gmail.com Mon May 12 19:04:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 May 2008 18:04:34 -0500 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <4828CC27.5080906@hawaii.edu> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> Message-ID: <3d375d730805121604s7db969d9p2bdb74ca1bbd1710@mail.gmail.com> On Mon, May 12, 2008 at 6:00 PM, Eric Firing wrote: > St?fan van der Walt wrote: > > 2008/5/12 Eric Firing : > >> To close out this thread: > >> > >> With r5155 Travis fixed the problem, so the ticket is closed. > > > > Strange, when I look over that patch, my keyboard automatically holds > > in Shift and starts typing 1 through 9: > > > > (*&@#*&@#&%@#(*&)@# > > > > I ask you with tears in my eyes: where is the regression test? > > Stefan, > > Maybe Travis could whip one up instantly, but I can't; I never figured > out what is the difference between the dtypes of the ndarray that > triggered the bug and the more common ones that don't, so I simply don't > know how to make a test case. 
I'm sure I *could* figure it out, but it > might take quite a bit of time--and in the grand scheme of things, I > don't think that would be time particularly well-spent. I can't make it > a priority. The pickled array should be sufficient. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Mon May 12 19:34:57 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 12 May 2008 13:34:57 -1000 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <3d375d730805121604s7db969d9p2bdb74ca1bbd1710@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <3d375d730805121604s7db969d9p2bdb74ca1bbd1710@mail.gmail.com> Message-ID: <4828D421.6000908@hawaii.edu> Robert Kern wrote: > On Mon, May 12, 2008 at 6:00 PM, Eric Firing wrote: >> St?fan van der Walt wrote: >> > 2008/5/12 Eric Firing : >> >> To close out this thread: >> >> >> >> With r5155 Travis fixed the problem, so the ticket is closed. >> > >> > Strange, when I look over that patch, my keyboard automatically holds >> > in Shift and starts typing 1 through 9: >> > >> > (*&@#*&@#&%@#(*&)@# >> > >> > I ask you with tears in my eyes: where is the regression test? >> >> Stefan, >> >> Maybe Travis could whip one up instantly, but I can't; I never figured >> out what is the difference between the dtypes of the ndarray that >> triggered the bug and the more common ones that don't, so I simply don't >> know how to make a test case. I'm sure I *could* figure it out, but it >> might take quite a bit of time--and in the grand scheme of things, I >> don't think that would be time particularly well-spent. I can't make it >> a priority. > > The pickled array should be sufficient. > I did not realize it was OK to include test data files in the tests subdirectory, but now I see that there is one, testdata.fits. Eric From david at ar.media.kyoto-u.ac.jp Mon May 12 21:30:19 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 13 May 2008 10:30:19 +0900 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <48287569.5080002@scratchspace.com> References: <48287569.5080002@scratchspace.com> Message-ID: <4828EF2B.4030503@ar.media.kyoto-u.ac.jp> Chris Miller wrote: > Hello, > I'm having trouble getting the python-numpy RPM to build under > CentOS 4.6. I've already built and installed atlas, lapack3 and > refblas3 from the CentOS 5 source RPMS, but numpy won't build > correctly. Although the ultimate error may be unrelated to Atlas, > clearly not all the atlas dependencies are being satisfied. Only the > .so files are supplied by the RPMS, but it seems that some > additional lib* files may be needed as well. The problem is that rpm is a mess, and each distribution needs special casing because they can't agree on what to put where. If you look at the rpm spec file, you will see that it is special cased for CENTOS/RHEL. I used version 5 because that's the only one available on the build system. You could change that to version 4 and see what happens. 
cheers, David From david at ar.media.kyoto-u.ac.jp Mon May 12 21:38:38 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 13 May 2008 10:38:38 +0900 Subject: [Numpy-discussion] Installing numpy/scipy on CentOS 4 In-Reply-To: <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> References: <48287569.5080002@scratchspace.com> <3d375d730805120959s7f8e805dhae768fda0f84be3b@mail.gmail.com> <48287E8F.7020101@scratchspace.com> <3d375d730805121105g181a91f7yf2135495af75c27f@mail.gmail.com> <4828880B.3030805@scratchspace.com> <3d375d730805121125j7d5554efr4ec1e7a97522a243@mail.gmail.com> <4828982A.1040708@scratchspace.com> <3d375d730805121229u4170bea6j202f2cc97ee7230c@mail.gmail.com> Message-ID: <4828F11E.7070800@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > It's still called libf77blas regardless. Well, you don't have ATLAS > installed. Or if you do, it's an extremely weird installation of > ATLAS. It is weird, but there is a rationale to the weirdness :) It is basically imposible to build a rpm for ATLAS, because every build produces a different binary. Because there is no atlas binary, I cannot depend on it, and so numpy/scipy depends on netlib blas/lapack. For people who still want to use atlas, I provided a source rpm that people can use if they want. Instead of building atlas the "normal" way, it builds full blas and lapack, which are drop-in replacements for netlib blas/lapack, while using atlas optimization. It means that atlas specific optimizations in numpy/scipy won't be used, but well, that's better than nothing, and is certainly faster for many operations than netlib. It works pretty well on opensuse 10, fedora 6, 7, and 8, as well as RHEL 5, both 32 and 64 bits so I would be surprised if there was a problem specific to centos related to atlas. All this is put on the wiki. I am open to suggestion for better instructions: http://www.scipy.org/Installing_SciPy/Linux cheers, David From oliphant at enthought.com Mon May 12 22:04:13 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 12 May 2008 21:04:13 -0500 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <4828CC27.5080906@hawaii.edu> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> Message-ID: <4828F71D.6090108@enthought.com> Eric Firing wrote: > St?fan van der Walt wrote: > >> 2008/5/12 Eric Firing : >> >>> To close out this thread: >>> >>> With r5155 Travis fixed the problem, so the ticket is closed. >>> >> Strange, when I look over that patch, my keyboard automatically holds >> in Shift and starts typing 1 through 9: >> >> (*&@#*&@#&%@#(*&)@# >> >> I ask you with tears in my eyes: where is the regression test? >> > > Stefan, > > Maybe Travis could whip one up instantly, but I can't; I think Stefan is asking me, not you. I don't think you should feel any sense of guilt. I was the one who closed the ticket sans regression test. I tend to still be of the opinion that a bug fix without a regression test is better than no bug fix at all. Obviously, whether future changes actually fix a bug (without introducing new ones) is a strong argument for regression tests, and I gratefully accept all tests submitted. 
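A sketch of the minimal shape such a regression test can take, using the pickled array mentioned above -- the file name and location here are hypothetical, and the exercised operation simply stands in for whatever the original array actually broke:

    import os
    import pickle

    from numpy.testing import assert_array_equal

    def test_ticket_788_regression():
        # Array pickled from the original bug report (hypothetical path/name).
        path = os.path.join(os.path.dirname(__file__), 'tests', 'ticket788.pkl')
        arr = pickle.load(open(path, 'rb'))
        # Re-running the previously failing code path is the real test;
        # the assertion only confirms the round trip stays consistent.
        assert_array_equal(arr, arr.copy())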
-Travis From stefan at sun.ac.za Tue May 13 02:56:35 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 13 May 2008 08:56:35 +0200 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <4828F71D.6090108@enthought.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> Message-ID: <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> Hi Travis 2008/5/13 Travis E. Oliphant : > I think Stefan is asking me, not you. I don't think you should feel > any sense of guilt. I was the one who closed the ticket sans > regression test. I tend to still be of the opinion that a bug fix > without a regression test is better than no bug fix at all. I suppose people may be getting tired of me singing the same tune all the time, but I wouldn't do it if I weren't deeply convinced of the improvement in software quality resulting from good test coverage. This is maybe even more true for NumPy than for other packages, illustrated by the recent discussion on masked arrays. The unit tests act as a contract between ourselves and our users, and if this contract is lacking (or missing!), we cannot guarantee that APIs or even functionality will remain unbroken. It may be best if we could formalise the policy around this, so that I can either keep quiet or expect a regression test with every check-in. > Obviously, whether future changes actually fix a bug (without > introducing new ones) is a strong argument for regression tests, and I > gratefully accept all tests submitted. I believe we should not only be accepting tests -- we should be writing them, too. Regards St?fan From pav at iki.fi Tue May 13 04:37:23 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 13 May 2008 11:37:23 +0300 Subject: [Numpy-discussion] Trac internal error Message-ID: <1210667843.26372.386.camel@localhost> Hi, Something seems to be wrong with the Trac: http://scipy.org/scipy/numpy/timeline Internal Error Ticket changes event provider (TicketModule) failed: SubversionException: ("Can't open file '/home/scipy/svn/numpy/db/revprops/5159': Permission denied", 13) -- Pauli Virtanen From millman at berkeley.edu Tue May 13 05:26:00 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 13 May 2008 02:26:00 -0700 Subject: [Numpy-discussion] Trac internal error In-Reply-To: <1210667843.26372.386.camel@localhost> References: <1210667843.26372.386.camel@localhost> Message-ID: On Tue, May 13, 2008 at 1:37 AM, Pauli Virtanen wrote: > Something seems to be wrong with the Trac: Yes, I know. I already sent an email to Peter Wang. There are a few files whose ownership got changed and it will be fixed by the morning. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From haase at msg.ucsf.edu Tue May 13 05:31:33 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 13 May 2008 11:31:33 +0200 Subject: [Numpy-discussion] "API change" or "stuff we'd like to change" was: recarray fun Message-ID: On Thu, May 1, 2008 at 10:02 PM, Alan G Isaac wrote: > On Thu, 01 May 2008, Christopher Barker apparently wrote: > > Maybe we should have a Wiki page for "stuff we'd like to change, but > > won't until major API breakage is otherwise occurring" > > > Perhaps > would suffice? 
> > Cheers, > Alan > I would like to pick up on this comment: The mentioned wiki page contains currently links to only 3 sub-pages: 1. MatrixIndexing 2. RuntimeOptimization 3. SolversProposal I did not look into any of these, but they all sound like "rather big thinks" I understood the proposed new > > Wiki page for "stuff we'd like to change, but > > won't until major API breakage is otherwise occurring" as a summary of many more "smaller" things, such as * np.resize vs. arr.resize need unified behavior, and additional argument to allow old behavior Cheers, Sebastian Haase From millman at berkeley.edu Tue May 13 05:50:41 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 13 May 2008 02:50:41 -0700 Subject: [Numpy-discussion] "API change" or "stuff we'd like to change" was: recarray fun In-Reply-To: References: Message-ID: On Tue, May 13, 2008 at 2:31 AM, Sebastian Haase wrote: > as a summary of many more "smaller" things, such as > * np.resize vs. arr.resize need unified behavior, and additional > argument to allow old behavior I think you are asking about whether you can start discussing smaller changes for unifying the call signatures of different functions like what we did with median. If so, the answer is yes. You can just create a new page and start listing the instances where you find this kind of situation. I personally would like to see all those kinds of things cleaned up at some point, although some things may have to wait until a distant, major release like 2.0 or greater. Regardless it would be useful if you could gather examples of these kinds of things. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From matthieu.brucher at gmail.com Tue May 13 06:46:15 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 13 May 2008 12:46:15 +0200 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> Message-ID: > > As for the NumPy unit tests: I have placed coverage reports online > (http://mentat.za.net/numpy/coverage). This only covers Python (not > extension) code, but having that part 100% tested is not impossible, > nor would it take that much effort. The much more important issue is > having the C extensions tested, and if anyone can figure out a way to > get gcov to generate those coverage reports, I'd be in the seventh > heaven. Thus far, the only way I know of is to build one large, > static Python binary that includes numpy. > Hi, I tried something similar with figleaf on my own code, but it seems that every test that is decorated with @raises is not tested. Does your coverage include these tests ? I didn't change much of your script to do it. I tried the pinocchio nose extension as well, but it seems it is not compatible with figleaf anymore :| (there is a thread on the testing ML, but no answers so far :|) Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan at sun.ac.za Tue May 13 09:02:52 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 13 May 2008 15:02:52 +0200 Subject: [Numpy-discussion] Going toward time-based release ? In-Reply-To: References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> Message-ID: <9457e7c80805130602s5192fef4w73fd41eeb22cb044@mail.gmail.com> Hi Matthieu 2008/5/13 Matthieu Brucher : > > As for the NumPy unit tests: I have placed coverage reports online > > (http://mentat.za.net/numpy/coverage). This only covers Python (not > > extension) code, but having that part 100% tested is not impossible, > > nor would it take that much effort. The much more important issue is > > having the C extensions tested, and if anyone can figure out a way to > > get gcov to generate those coverage reports, I'd be in the seventh > > heaven. Thus far, the only way I know of is to build one large, > > static Python binary that includes numpy. > > > I tried something similar with figleaf on my own code, but it seems that > every test that is decorated with @raises is not tested. Does your coverage > include these tests ? I didn't change much of your script to do it. > I tried the pinocchio nose extension as well, but it seems it is not > compatible with figleaf anymore :| (there is a thread on the testing ML, but > no answers so far :|) I don't know about the "@raises" parameter. Those aren't used in NumPy at the moment. I also don't know which mechanism figleaf uses to track coverage, but if you investigte the issue further, please keep us up to date. Cheers St?fan From lists.20.chth at xoxy.net Tue May 13 09:08:18 2008 From: lists.20.chth at xoxy.net (ctw) Date: Tue, 13 May 2008 09:08:18 -0400 Subject: [Numpy-discussion] Array finalize call from std/var? Message-ID: Hi! Does anybody here know what's going on in the situation described in this ticket? http://scipy.org/scipy/numpy/ticket/791 Basically, it seems that __array_finalize__ is called after calls to most functions (such as mean), but not after calls to std or var. Thanks! CTW From pgmdevlist at gmail.com Tue May 13 11:40:01 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 13 May 2008 11:40:01 -0400 Subject: [Numpy-discussion] Array finalize call from std/var? In-Reply-To: References: Message-ID: <200805131140.01516.pgmdevlist@gmail.com> On Tuesday 13 May 2008 09:08:18 ctw wrote: > Basically, it seems that __array_finalize__ is called after calls to > most functions (such as mean), but not after calls to std or var. Mmh, it's not exactly how I understand: in fact, __array_finalize__ seems to be called an extra time with the ndarray as argument: in multiarraymodule.c, L934: ret = PyArray_View((PyAO *)obj1, NULL, self->ob_type); So, the question is: why this last View ? why use self.ob_type instead of ref.ob_type ? From matthieu.brucher at gmail.com Tue May 13 11:43:33 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 13 May 2008 17:43:33 +0200 Subject: [Numpy-discussion] Going toward time-based release ? 
In-Reply-To: <9457e7c80805130602s5192fef4w73fd41eeb22cb044@mail.gmail.com> References: <1210561167.17972.17.camel@bbc8> <1210567362.17972.73.camel@bbc8> <9457e7c80805120009o5d11c454rd69f2101c6da6738@mail.gmail.com> <9457e7c80805130602s5192fef4w73fd41eeb22cb044@mail.gmail.com> Message-ID: > > > > I tried something similar with figleaf on my own code, but it seems that > > every test that is decorated with @raises is not tested. Does your > coverage > > include these tests ? I didn't change much of your script to do it. > > I tried the pinocchio nose extension as well, but it seems it is not > > compatible with figleaf anymore :| (there is a thread on the testing ML, > but > > no answers so far :|) > > I don't know about the "@raises" parameter. Those aren't used in > NumPy at the moment. I also don't know which mechanism figleaf uses > to track coverage, but if you investigte the issue further, please > keep us up to date. > It seems I've made the mistake. I added the type of exception that could be raised and now figleaf tells me that the code is tested. An remaining annoyance is the __init__.py file that is always not 100% tested if you have a blanck line, or a file with __all__ that is not imported with from module import * A tleast, it shows that every line of code is tested at least once, even if every path is not tested ;) Thanks for the example with numpy ;) Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue May 13 12:12:06 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 13 May 2008 11:12:06 -0500 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> Message-ID: <4829BDD6.4060905@enthought.com> St?fan van der Walt wrote: > Hi Travis > > 2008/5/13 Travis E. Oliphant : > >> I think Stefan is asking me, not you. I don't think you should feel >> any sense of guilt. I was the one who closed the ticket sans >> regression test. I tend to still be of the opinion that a bug fix >> without a regression test is better than no bug fix at all. >> > > I suppose people may be getting tired of me singing the same tune all > the time, but I wouldn't do it if I weren't deeply convinced of the > improvement in software quality resulting from good test coverage. > This is maybe even more true for NumPy than for other packages, > illustrated by the recent discussion on masked arrays. The unit tests > act as a contract between ourselves and our users, and if this > contract is lacking (or missing!), we cannot guarantee that APIs or > even functionality will remain unbroken. It may be best if we could > formalise the policy around this, so that I can either keep quiet or > expect a regression test with every check-in. > I'm going to strongly oppose any "formalization" of such a policy. 
I think these sorts of things are best handled by reminding people of the benefits of tests and documentation and setting a good example rather than rigidly holding to a "policy" that does not encompass all situations equally well. Some bug fixes are valuable even if they don't have a unit test associated with them (such as IMHO the one under discussion here), especially if creating the unit test would take additional time that the bug fixer doesn't have and would lead to the bug not being fixed. Process does not create quality, people do. Processes, like unit testing, can help, but usually have their own sets of flaws as well (how often is the "bug" actually a unit-test bug and how often do we get a false sense of security because of the lack of code coverage). The unit test system is far from perfect and there is quite a bit of overhead in constructing them given the current framework (yes, I know we are fixing that...) NumPy would not exist if I had followed the process you seem to want now. I'm happy to improve the number of "unit-tests" written where I can, but I'm not going to be pigeon-holed into following a "process" that *requires* a unit-test for every check in, and I can't in good conscience push such a policy on others. Besides, having a "test-per-checkin" is not the proper mapping in my mind. I'd rather see whole check-ins devoted to testing large pieces of code rather than spend all unit-test foo on a rigid policy of "regression" testing each check-in. -Travis From alan.mcintyre at gmail.com Tue May 13 13:14:18 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 13 May 2008 13:14:18 -0400 Subject: [Numpy-discussion] gcov testing of C extensions (was Going toward time-based release ?) Message-ID: <1d36917a0805131014v26f35bebpc8205768117204ba@mail.gmail.com> On Mon, May 12, 2008 at 3:09 AM, St?fan van der Walt wrote: > The much more important issue is > having the C extensions tested, and if anyone can figure out a way to > get gcov to generate those coverage reports, I'd be in the seventh > heaven. Thus far, the only way I know of is to build one large, > static Python binary that includes numpy. Has anybody managed to build numpy into a static binary? It looks like somebody did it recently for some of the Python standard extensions: http://code.google.com/p/google-highly-open-participation-psf/issues/detail?id=217 Unless it's already been tried and deemed not possible, I'll have a go at doing this with numpy and post the outcome. From robert.kern at gmail.com Tue May 13 13:25:15 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 13 May 2008 12:25:15 -0500 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <4829BDD6.4060905@enthought.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> Message-ID: <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant wrote: > Besides, having a "test-per-checkin" is not the proper mapping in my > mind. I'd rather see whole check-ins devoted to testing large pieces > of code rather than spend all unit-test foo on a rigid policy of > "regression" testing each check-in. St?fan is proposing "test-per-bugfix", not "test-per-checkin". That is eminently feasible. 
You need to do some kind of testing to be sure that you actually fixed the problem. It is simply *not* *that* *hard* to write that in unit test form. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue May 13 13:43:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 13 May 2008 11:43:43 -0600 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> Message-ID: On Tue, May 13, 2008 at 11:25 AM, Robert Kern wrote: > On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant > wrote: > > > Besides, having a "test-per-checkin" is not the proper mapping in my > > mind. I'd rather see whole check-ins devoted to testing large pieces > > of code rather than spend all unit-test foo on a rigid policy of > > "regression" testing each check-in. > > St?fan is proposing "test-per-bugfix", not "test-per-checkin". That is > eminently feasible. You need to do some kind of testing to be sure > that you actually fixed the problem. It is simply *not* *that* *hard* > to write that in unit test form. > I'll add that every time I've tried to write comprehensive tests for subsystems while cleaning up code, undiscovered bugs show up: complex arccos using the wrong branch, some logical operators having the wrong signature. I expect the first has been there since numeric prehistory, and the second for several years. Those aren't subtle bugs, they are just bugs. Now that development has slowed down, I think it is time to start working on comprehensive tests that verify that numpy works as supposed. This will also help us pin down what the specs for various functions actually are, something that can be a bit nebulous at times. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue May 13 16:08:27 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 13 May 2008 15:08:27 -0500 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> Message-ID: <4829F53B.5030404@enthought.com> Robert Kern wrote: > On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant > wrote: > > >> Besides, having a "test-per-checkin" is not the proper mapping in my >> mind. I'd rather see whole check-ins devoted to testing large pieces >> of code rather than spend all unit-test foo on a rigid policy of >> "regression" testing each check-in. >> > > St?fan is proposing "test-per-bugfix", not "test-per-checkin". That is > eminently feasible. 
You need to do some kind of testing to be sure > that you actually fixed the problem. It is simply *not* *that* *hard* > to write that in unit test form. > That is not true. You *don't* need to do testing to be sure you actually fixed the problem in some cases.... Looking at the code is enough. Like the case we are talking about. -Travis From oliphant at enthought.com Tue May 13 16:09:20 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 13 May 2008 15:09:20 -0500 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> Message-ID: <4829F570.4030107@enthought.com> Charles R Harris wrote: > > > On Tue, May 13, 2008 at 11:25 AM, Robert Kern > wrote: > > On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant > > wrote: > > > Besides, having a "test-per-checkin" is not the proper mapping > in my > > mind. I'd rather see whole check-ins devoted to testing large > pieces > > of code rather than spend all unit-test foo on a rigid policy of > > "regression" testing each check-in. > > St?fan is proposing "test-per-bugfix", not "test-per-checkin". That is > eminently feasible. You need to do some kind of testing to be sure > that you actually fixed the problem. It is simply *not* *that* *hard* > to write that in unit test form. > > > I'll add that every time I've tried to write comprehensive tests for > subsystems while cleaning up code, undiscovered bugs show up: complex > arccos using the wrong branch, some logical operators having the wrong > signature. I expect the first has been there since numeric prehistory, > and the second for several years. Those aren't subtle bugs, they are > just bugs. > > Now that development has slowed down, I think it is time to start > working on comprehensive tests that verify that numpy works as > supposed. This will also help us pin down what the specs for various > functions actually are, something that can be a bit nebulous at times. I'm completely supportive of this kind of effort. Let's go write unit tests and stop harassing people who are fixing bugs. -Travis From erik.nugent at gmail.com Tue May 13 16:53:36 2008 From: erik.nugent at gmail.com (Erik Nugent) Date: Tue, 13 May 2008 14:53:36 -0600 Subject: [Numpy-discussion] numpy with icc Message-ID: <97db231b0805131353g66327a60i1160afa0dd6330c0@mail.gmail.com> I am trying to build numpy with intel icc and mkl. I don't understand a lot of what I am doing. I have followed the the suggestions of what to add/change in site.cfg intelccompiler.py and system_info.py from several different posts or google searches. i am not sure i have done any of this right. here are the contents of the files. 
----------------------------------------------------------------------------------
site.cfg:

[DEFAULT]
library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t,/usr/lib64,/usr/local/lib64,/usr/local/python2.5.2-intel/lib,/usr/lib,/usr/local/lib
include_dirs = /opt/intel/mkl/10.0.3.020/include,/usr/include,/usr/local/include,/usr/local/python2.5.2-intel/include

[mkl]
include_dirs = /opt/intel/mkl/10.0.3.020/include
library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t
lapack_libs = mkl_lapack

[lapack_src]
libraries=mkl_lapack,mkl,guide

[lapack_info]
libraries=mkl_lapack,mkl,guide

-------------------------------------------------------------------------------
intelccompiler.py:

from distutils.unixccompiler import UnixCCompiler
from numpy.distutils.exec_command import find_executable

class IntelCCompiler(UnixCCompiler):
    """ A modified Intel compiler compatible with an gcc built Python. """
    compiler_type = 'intel'
    cc_exe = 'icc -g -O3 -w -fPIC -parallel -ipo -xT -axT'

    def __init__ (self, verbose=0, dry_run=0, force=0):
        UnixCCompiler.__init__ (self, verbose, dry_run, force)
        compiler = self.cc_exe
        self.set_executables(compiler=compiler,
                             compiler_so=compiler,
                             compiler_cxx=compiler,
                             linker_exe=compiler,
                             linker_so=compiler + ' -shared')

class IntelItaniumCCompiler(IntelCCompiler):
    compiler_type = 'intele'

    # On Itanium, the Intel Compiler used to be called ecc, let's search for
    # it (now it's also icc, so ecc is last in the search).
    for cc_exe in map(find_executable,['icc','ecc']):
        if cc_exe:
            break

----------------------------------------------------------------
system_info.py:

....
class lapack_mkl_info(mkl_info):

    def calc_info(self):
        mkl = get_info('mkl')
        if not mkl:
            return
        if sys.platform == 'win32':
            lapack_libs = self.get_libs('lapack_libs',['mkl_lapack'])
        else:
            lapack_libs = self.get_libs('lapack_libs',['mkl_lapack'])

        info = {'libraries': lapack_libs}
        dict_append(info,**mkl)
        self.set_info(**info)
...
------------------------------------------------------------------------ i then use this command to compile: /usr/local/python2.5.2-intel/bin/python setup.py config --compiler=intel config_fc --fcompiler=intel \ --opt='-fPIC -O3 -w -axT -xT' install > build.out build.out has this in it: F2PY Version 2_4422 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /opt/intel/mkl/10.0.3.020/lib/em64t NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS NOT AVAILABLE atlas_blas_info: NOT AVAILABLE blas_info: NOT AVAILABLE blas_src_info: NOT AVAILABLE NOT AVAILABLE lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /opt/intel/mkl/10.0.3.020/lib/em64t NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: numpy.distutils.system_info.atlas_info NOT AVAILABLE lapack_info: NOT AVAILABLE lapack_src_info: NOT AVAILABLE NOT AVAILABLE running config running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running install running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running build_src building py_modules sources creating build creating build/src.linux-x86_64-2.5 creating build/src.linux-x86_64-2.5/numpy creating build/src.linux-x86_64-2.5/numpy/distutils building extension "numpy.core.multiarray" sources creating build/src.linux-x86_64-2.5/numpy/core Generating build/src.linux-x86_64-2.5/numpy/core/config.h Found executable /opt/intel/cce/10.1.015/bin/icc Could not locate executable ecc customize IntelFCompiler Found executable /opt/intel/fce/10.1.015/bin/ifort C compiler: icc -g -O3 -w -fPIC -parallel -ipo -xT -axT ...... it seems to compile and install fine.... then i start python and try to run numpy.test(). that is where i am stuck. this is what happens: # pwd /usr/local/python2.5.2-intel/bin # ./python Python 2.5.2 (r252:60911, May 13 2008, 11:22:16) [GCC Intel(R) C++ gcc 4.1 mode] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Numpy is installed in /usr/local/python2.5.2-intel/lib/python2.5/site-packages/numpy Numpy version 1.0.4 Python version 2.5.2 (r252:60911, May 13 2008, 11:22:16) [GCC Intel(R) C++ gcc 4.1 mode] Found 10/10 tests for numpy.core.defmatrix Found 36/36 tests for numpy.core.ma Found 223/223 tests for numpy.core.multiarray Found 65/65 tests for numpy.core.numeric Found 31/31 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 6/6 tests for numpy.core.scalarmath Found 14/14 tests for numpy.core.umath Found 4/4 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 1/1 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 9/9 tests for numpy.lib.arraysetops Found 46/46 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 4/4 tests for numpy.lib.index_tricks Found 3/3 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 40/40 tests for numpy.linalg Found 2/2 tests for numpy.random Found 0/0 tests for __main__ MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/em64t/: cannot read file data: Is a directory # not sure how to get past this... 
although, I am sure, if numpy is not working right then i will not be able to go on and compile scipy..... Any help would be great.... From efiring at hawaii.edu Tue May 13 17:39:09 2008 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 13 May 2008 11:39:09 -1000 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <4829F53B.5030404@enthought.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> <4829F53B.5030404@enthought.com> Message-ID: <482A0A7D.8070807@hawaii.edu> Travis E. Oliphant wrote: > Robert Kern wrote: >> On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant >> wrote: >> >> >>> Besides, having a "test-per-checkin" is not the proper mapping in my >>> mind. I'd rather see whole check-ins devoted to testing large pieces >>> of code rather than spend all unit-test foo on a rigid policy of >>> "regression" testing each check-in. >>> >> St?fan is proposing "test-per-bugfix", not "test-per-checkin". That is >> eminently feasible. You need to do some kind of testing to be sure >> that you actually fixed the problem. It is simply *not* *that* *hard* >> to write that in unit test form. >> > That is not true. You *don't* need to do testing to be sure you > actually fixed the problem in some cases.... Looking at the code is > enough. Like the case we are talking about. I agree that this one was pretty obvious, and the value of a dedicated test is questionable, but I added a test patch and data file to the ticket anyway. Of course the test needed to be tested, and I have done half of that: I verified that it passes now (although some others don't). I had already done this manually to verify my original suggested patch, and then again to verify Travis's actual revision as soon as I saw it. I have not gone back to verify that the test correctly identifies the original problem; but it is the same procedure I used to track the problem down in the first place. I hope this leaves the various hackles reasonably smooth. One last question for Travis: Is there a reason why PyArray_EquivTypes is *not* used in PyArray_CastToType? If so, a comment in the code might be helpful. Eric From stefan at sun.ac.za Tue May 13 18:40:39 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 14 May 2008 00:40:39 +0200 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <4829F53B.5030404@enthought.com> References: <482605DB.9090709@hawaii.edu> <48261CFC.2020601@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> <4829F53B.5030404@enthought.com> Message-ID: <9457e7c80805131540n253a7ba1n28e2a22987d2fc8a@mail.gmail.com> 2008/5/13 Travis E. Oliphant : > Robert Kern wrote: > > On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant > > wrote: > > > > > >> Besides, having a "test-per-checkin" is not the proper mapping in my > >> mind. 
I'd rather see whole check-ins devoted to testing large pieces > >> of code rather than spend all unit-test foo on a rigid policy of > >> "regression" testing each check-in. > >> > > > > St?fan is proposing "test-per-bugfix", not "test-per-checkin". That is > > eminently feasible. You need to do some kind of testing to be sure > > that you actually fixed the problem. It is simply *not* *that* *hard* > > to write that in unit test form. > > > That is not true. You *don't* need to do testing to be sure you > actually fixed the problem in some cases.... Looking at the code is > enough. Like the case we are talking about. That is where we disagree: looking at the code is simply not enough. It is fine if you know where to look, and if you look exactly there every day -- but who does that? That is precisely why we have tests -- to automate this inspection. You know this code base extremely well, but what may seem obvious to you won't be to another developer a year down the line, and that person needs your assistance now already. When a person works on a ticket, he become intimately familiar with its contents. That means that, where it would take that person a couple of minutes to write a test, it would take another developer many more, because the other developer first needs to gain all the relevant background knowledge. For the case in question, Eric provided test data. In order to fix the bug, it was needed to run that code; so whether it was pasted it into the test suite or whether it was run from a terminal wouldn't have made much of a difference in developer time. It would, however, have provided us with a guarantee that we conquered that bug once, and more importantly, for all. > Let's go write unit > tests and stop harassing people who are fixing bugs. It is not my intention to harass you or anyone else (and I apologise if it came across that way) for fixing bugs. In fact, I am (incredibly) grateful for these efforts. The reasons I am unhappy is because a) I have no guarantee that any bug was fixed and b) I could be stupid enough to break that functionality tonight and no-one would be any the wiser > NumPy would not exist if I had followed the process you seem to want > now. I'm not sure I agree. The tests were written anyway -- no-one codes without trying their code. I'm simply arguing that those trials should end up somewhere useful. If Numeric was developed from a test driven perspective we could, for example, have avoided breaking *any* of the maskedarray functionality in the merge. Similarly, we could have caught some of these memory errors that keep popping up a long time ago. I remain of the opinion that the benefits of decent testing completely overwhelms the tiny effort it takes to put into place. Regards St?fan From stefan at sun.ac.za Tue May 13 18:47:52 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 14 May 2008 00:47:52 +0200 Subject: [Numpy-discussion] gcov testing of C extensions (was Going toward time-based release ?) In-Reply-To: <1d36917a0805131014v26f35bebpc8205768117204ba@mail.gmail.com> References: <1d36917a0805131014v26f35bebpc8205768117204ba@mail.gmail.com> Message-ID: <9457e7c80805131547m6de6a56dr3678b6aeb9ec608e@mail.gmail.com> 2008/5/13 Alan McIntyre : > On Mon, May 12, 2008 at 3:09 AM, St?fan van der Walt wrote: > > The much more important issue is > > having the C extensions tested, and if anyone can figure out a way to > > get gcov to generate those coverage reports, I'd be in the seventh > > heaven. 
Thus far, the only way I know of is to build one large, > > static Python binary that includes numpy. > > Has anybody managed to build numpy into a static binary? It looks > like somebody did it recently for some of the Python standard > extensions: > > http://code.google.com/p/google-highly-open-participation-psf/issues/detail?id=217 > > Unless it's already been tried and deemed not possible, I'll have a go > at doing this with numpy and post the outcome. Unless anybody points out a technical reason not to, please do have a go at it -- the results are eagerly anticipated! Regards St?fan From charlesr.harris at gmail.com Tue May 13 18:59:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 13 May 2008 16:59:19 -0600 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <9457e7c80805131540n253a7ba1n28e2a22987d2fc8a@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <482888C9.1030105@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> <4829F53B.5030404@enthought.com> <9457e7c80805131540n253a7ba1n28e2a22987d2fc8a@mail.gmail.com> Message-ID: On Tue, May 13, 2008 at 4:40 PM, St?fan van der Walt wrote: > 2008/5/13 Travis E. Oliphant : > > Robert Kern wrote: > > > On Tue, May 13, 2008 at 11:12 AM, Travis E. Oliphant > > > wrote: > > > > > > > > >> Besides, having a "test-per-checkin" is not the proper mapping in > my > > >> mind. I'd rather see whole check-ins devoted to testing large > pieces > > >> of code rather than spend all unit-test foo on a rigid policy of > > >> "regression" testing each check-in. > > >> > > > > > > St?fan is proposing "test-per-bugfix", not "test-per-checkin". That > is > > > eminently feasible. You need to do some kind of testing to be sure > > > that you actually fixed the problem. It is simply *not* *that* *hard* > > > to write that in unit test form. > > > > > That is not true. You *don't* need to do testing to be sure you > > actually fixed the problem in some cases.... Looking at the code is > > enough. Like the case we are talking about. > > That is where we disagree: looking at the code is simply not enough. > It is fine if you know where to look, and if you look exactly there > every day -- but who does that? That is precisely why we have tests > -- to automate this inspection. > Stefan, sometimes the fix really is clear and a test is like closing the barn door after the horse has bolted. Sometimes it isn't even clear *how* to test. I committed one fix and omitted a test because I couldn't think of anything really reasonable. I think concentrating on unit tests is more productive in the long run because we will find *new* bugs, and if done right they will also cover spots where old bugs were found. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Tue May 13 19:08:31 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 13 May 2008 23:08:31 +0000 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: References: <482605DB.9090709@hawaii.edu> <9457e7c80805121334s22602758tedc354f1960964de@mail.gmail.com> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> <4829F53B.5030404@enthought.com> <9457e7c80805131540n253a7ba1n28e2a22987d2fc8a@mail.gmail.com> Message-ID: <1e2af89e0805131608r1f5fec8fyb935f4da8e209f38@mail.gmail.com> Hi, > Stefan, sometimes the fix really is clear and a test is like closing the > barn door after the horse has bolted. Sometimes it isn't even clear *how* to > test. I committed one fix and omitted a test because I couldn't think of > anything really reasonable. I think concentrating on unit tests is more > productive in the long run because we will find *new* bugs, and if done > right they will also cover spots where old bugs were found. I must say that I have certainly (correctly) fixed a bug, and then broken the code somewhere else resulting in the same effect as the original bug, and missed it because I didn't put in a test the first time. I do agree (with everyone else I think) that it's a very good habit to get into to submit a test with every fix, no matter how obvious. Best, Matthew From strawman at astraw.com Tue May 13 19:59:12 2008 From: strawman at astraw.com (Andrew Straw) Date: Tue, 13 May 2008 16:59:12 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <4823CA39.7090203@astraw.com> References: <4823CA39.7090203@astraw.com> Message-ID: <482A2B50.6010005@astraw.com> Thanks for all the comments on my original question. I was more offline than intended after I sent it until now, so I'm sorry I wasn't immediately able to participate in the discussion. Anyhow, after working on this a bit more, I came up with a few implementations of search algorithms doing just what I needed with the same interface available using bazaar and launchpad at http://launchpad.net/~astraw/+junk/fastsearch (MIT license). I have attached the output of the plot_comparisons.py benchmarking script to this email (note that this benchmarking is pretty crude). For the problem I originally wrote about, I get what a nearly unbelievable speedup of ~250x using the fastsearch.downsamp.DownSampledPreSearcher class, which is very similar in spirit to Charles' suggestion. It takes 1000 values from the original array to create a new first-level array that is itself localized in memory and points to a more localized region of the full original array. Also, I get a similar (though slightly slower) result using AVL trees using the fastsearch.avlsearch.AvlSearcher class, which uses pyavl ( http://sourceforge.net/projects/pyavl ). Using the benchmarking code included in the bzr branch, I don't get anything like this speedup (e.g. the attached figure), so I'm not sure exactly what's going on at this point, but I'm not going to argue with a 250x speedup, so the fastsearch.downsamp code is now being put to use in one of my projects. Stefan -- I think your code simply implements the classic binary search -- I don't see how it will reduce cache misses. Anyhow, perhaps someone will find the above useful. 
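To make the two-level idea concrete, here is a minimal sketch of a down-sampled pre-search. This is not the fastsearch code from the branch above; the class name, the step of 1000, and the scalar-only lookup are illustrative assumptions.

import numpy as np

class DownsampledPreSearcher(object):
    """Two-level lookup in a large sorted 1-D array: a small coarse index
    (every step-th element) stays cache-resident and narrows each query to
    one short slice of the full array before the final binary search."""

    def __init__(self, data, step=1000):
        self.data = data                   # full sorted 1-D array
        self.step = step
        self.coarse = data[::step].copy()  # small, contiguous first-level index

    def searchsorted(self, v):
        # Coarse search picks the only block of the full array that can hold v.
        block = np.searchsorted(self.coarse, v)
        lo = max(block - 1, 0) * self.step
        hi = min(block * self.step + 1, len(self.data))
        # Fine search touches only that short, cache-friendly slice.
        return lo + np.searchsorted(self.data[lo:hi], v)

For any sorted data, DownsampledPreSearcher(data).searchsorted(v) returns the same index as np.searchsorted(data, v) with the default left side, but each lookup binary-searches only the small coarse array plus at most step+1 elements of data instead of the whole 25-million-element array.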
I guess it would still be a substantial amount of work to make a numpy-types-aware implementation of AVL trees or similar algorithms. These sorts of binary search trees seem like the right way to solve this problem and thus there might be an interesting project in this. I imagine that a numpy-types-aware Cython might make such implementation significantly easier and still blazingly fast compared to the binary search implemented in searchsorted() given today's cached memory architectures. -Andrew Andrew Straw wrote: > I've got a big element array (25 million int64s) that searchsorted() > takes a long time to grind through. After a bit of digging in the > literature and the numpy source code, I believe that searchsorted() is > implementing a classic binary search, which is pretty bad in terms of > cache misses. There are several modern implementations of binary search > which arrange items in memory such that cache misses are much more rare. > Clearly making such an indexing arrangement would take time, but in my > particular case, I can spare the time to create an index if searching > was faster, since I'd make the index once but do the searching many times. > > Is there an implementation of such an algorithm that works easilty with > numpy? Also, can you offer any advice, suggestions, and comments to me > if I attempted to implement such an algorithm? > > Thanks, > Andrew > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: fastsearch.png Type: image/png Size: 14205 bytes Desc: not available URL: From charlesr.harris at gmail.com Tue May 13 20:38:37 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 13 May 2008 18:38:37 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482A2B50.6010005@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> Message-ID: On Tue, May 13, 2008 at 5:59 PM, Andrew Straw wrote: > Thanks for all the comments on my original question. I was more offline > than intended after I sent it until now, so I'm sorry I wasn't > immediately able to participate in the discussion. > > Anyhow, after working on this a bit more, I came up with a few > implementations of search algorithms doing just what I needed with the > same interface available using bazaar and launchpad at > http://launchpad.net/~astraw/+junk/fastsearch(MIT license). I have > attached the output of the plot_comparisons.py benchmarking script to > this email (note that this benchmarking is pretty crude). > > For the problem I originally wrote about, I get what a nearly > unbelievable speedup of ~250x using the > fastsearch.downsamp.DownSampledPreSearcher class, which is very similar > in spirit to Charles' suggestion. It takes 1000 values from the original > array to create a new first-level array that is itself localized in > memory and points to a more localized region of the full original array. > Also, I get a similar (though slightly slower) result using AVL trees > using the fastsearch.avlsearch.AvlSearcher class, which uses pyavl ( > http://sourceforge.net/projects/pyavl ). > > Using the benchmarking code included in the bzr branch, I don't get > anything like this speedup (e.g. 
the attached figure), so I'm not sure > exactly what's going on at this point, but I'm not going to argue with a > 250x speedup, so the fastsearch.downsamp code is now being put to use in > one of my projects. > > Stefan -- I think your code simply implements the classic binary search > -- I don't see how it will reduce cache misses. > > Anyhow, perhaps someone will find the above useful. I guess it would > still be a substantial amount of work to make a numpy-types-aware > implementation of AVL trees or similar algorithms. These sorts of binary > search trees seem like the right way to solve this problem and thus > there might be an interesting project in this. I imagine that a > numpy-types-aware Cython might make such implementation significantly > easier and still blazingly fast compared to the binary search > implemented in searchsorted() given today's cached memory architectures. > That's pretty amazing, but I don't understand the graph. The DownSampled search looks like the worst. Are the curves mislabled? Are the axis correct? I'm assuming smaller is better here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From strawman at astraw.com Tue May 13 21:08:18 2008 From: strawman at astraw.com (Andrew Straw) Date: Tue, 13 May 2008 18:08:18 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> Message-ID: <482A3B82.8060808@astraw.com> Charles R Harris wrote: > > > On Tue, May 13, 2008 at 5:59 PM, Andrew Straw > wrote: > > Thanks for all the comments on my original question. I was more > offline > than intended after I sent it until now, so I'm sorry I wasn't > immediately able to participate in the discussion. > > Anyhow, after working on this a bit more, I came up with a few > implementations of search algorithms doing just what I needed with the > same interface available using bazaar and launchpad at > http://launchpad.net/~astraw/+junk/fastsearch > (MIT license). I > have > attached the output of the plot_comparisons.py benchmarking script to > this email (note that this benchmarking is pretty crude). > > For the problem I originally wrote about, I get what a nearly > unbelievable speedup of ~250x using the > fastsearch.downsamp.DownSampledPreSearcher class, which is very > similar > in spirit to Charles' suggestion. It takes 1000 values from the > original > array to create a new first-level array that is itself localized in > memory and points to a more localized region of the full original > array. > Also, I get a similar (though slightly slower) result using AVL trees > using the fastsearch.avlsearch.AvlSearcher class, which uses pyavl ( > http://sourceforge.net/projects/pyavl ). > > Using the benchmarking code included in the bzr branch, I don't get > anything like this speedup (e.g. the attached figure), so I'm not sure > exactly what's going on at this point, but I'm not going to argue > with a > 250x speedup, so the fastsearch.downsamp code is now being put to > use in > one of my projects. > > Stefan -- I think your code simply implements the classic binary > search > -- I don't see how it will reduce cache misses. > > Anyhow, perhaps someone will find the above useful. I guess it would > still be a substantial amount of work to make a numpy-types-aware > implementation of AVL trees or similar algorithms. 
These sorts of > binary > search trees seem like the right way to solve this problem and thus > there might be an interesting project in this. I imagine that a > numpy-types-aware Cython might make such implementation significantly > easier and still blazingly fast compared to the binary search > implemented in searchsorted() given today's cached memory > architectures. > > > That's pretty amazing, but I don't understand the graph. The > DownSampled search looks like the worst. Are the curves mislabled? Are > the axis correct? I'm assuming smaller is better here. The lines are labeled properly -- the graph is inconsistent with the findings on my real data (not shown), which is what I meant with "Using the benchmarking code included in the bzr branch, I don't get anything like this speedup (e.g. the attached figure)". My guess is that the BinarySearcher climbs terribly under some usage pattern that isn't being exhibited with this test. I'm really not sure yet what is the important difference with my real data and these synthetic data. I will keep the list posted as I find out more. Clearly, on the synthetic data for the benchmark, the BinarySearcher does pretty well when N items is large. This is quite contrary to my theory about cache misses being the root of my problem with the binary search, so I don't understand it at the moment, but certainly both the of the other searchers perform better on my real data. I will post any new insights as I continue to work on this... -Andrew From millman at berkeley.edu Tue May 13 21:39:35 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 13 May 2008 18:39:35 -0700 Subject: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2 Message-ID: Hello, Well it has become obvious that I branched way too soon. My bad. So I removed the branch: http://projects.scipy.org/scipy/numpy/changeset/5163 And updated the version.py to reflect that the trunk is still for 1.1 development: http://projects.scipy.org/scipy/numpy/changeset/5164 Sorry for the confusion. This shouldn't cause any problems, I took a quick look and: 1. I didn't see anything that had been added to the trunk that shouldn't go into 1.1.0 2. I didn't see anything that had been added to the branch that wasn't also in the trunk I have made mistakes in the past (present case in point), so I would appreciate if everyone would verify their recent changes as well. There are still a few little things that need to be done before 1.1.0 is released, so I am going to wait a few more days. Then instead of creating a branch, I will tag a release candidate from the trunk and ask for a short code freeze. I will ask the community for wide testing of the release candidate. I also hope that we can create binaries of the release candidates, so we can test them. If the release candidate looks good, it will become the official 1.1.0 release. At that point I will update the trunk's version.py for 1.1.1 development (a minor, bug-fix only release). At that point we will only allow bug-fixes, tests, and improved documentation to be committed to the trunk. During this phase we can continue the conversation about what should go into the 1.2 release and the early work will be done in a development branch. Matthew Brett has all ready agreed to create a branch to migrate NumPy to the nose testing framework (like he did for SciPy recently). If reasonable I will release 1.1.1 from the trunk within a month of the 1.1.0 release. Just to reiterate, 1.1.1 will *only* include bug fixes, tests, and documentation. 
Once 1.1.1 is released and the 1.2 development branch has stabilized (e.g., the move to nose is complete) we will move the 1.2 development to the trunk and the 1.1.2 development will move to a branch. I expect this to happen before the end of June. About one month after moving the 1.2 development to the trunk, I will ask everyone to review the status of the new features/changes that have happened on the trunk. I expect that at that time, we will stop trying to add new features and will enter a feature freeze. At the end of this review I will tag a beta of the 1.2.0 release and ask David and Chris to create Mac and Windows binaries so that we can get the benefit of a much wider testing audience. I expect this to happen before the end of July. I will organize a 1.2 bug-fixing sprint at the end of the SciPy conference. Hopefully, the 1.2 release can be announced at the end or shortly after the conference Of course, we have to start by getting 1.1.0 out first! Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From wnbell at gmail.com Tue May 13 21:44:01 2008 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 13 May 2008 20:44:01 -0500 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482A2B50.6010005@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> Message-ID: On Tue, May 13, 2008 at 6:59 PM, Andrew Straw wrote: > easier and still blazingly fast compared to the binary search > implemented in searchsorted() given today's cached memory architectures. Andrew, I looked at your code and I don't quite understand something. Why are you looking up single values? Is it not the case that the overhead of several Python calls completely dominates the actual cost of performing a binary search (e.g. in C code)? For instance, the 'v' argument of searchsorted(a,v) can be an array: from scipy import * haystack = rand(1e7) needles = haystack[:10000].copy() haystack.sort() timeit searchsorted(haystack, needles) 100 loops, best of 3: 12.7 ms per loop Which seems to be much faster than: timeit for k in needles: x = searchsorted(haystack,k) 10 loops, best of 3: 43.8 ms per loop The other thing to consider is that a reasonably smart CPU cache manager may retain the first few levels that are explored in a binary search tree. Clearly you could speed this up by increasing the branching factor of the tree, or using a fast index into the large array. However, I think that these effects are being masked by Python overhead in your tests. I whipped up a weave implementation of searchsorted() that uses the STL. It clocks in at 9.72ms per loop, so I think NumPy's searchsorted() is fairly good. 
import scipy
import scipy.weave

def searchsorted2(a,v):
    N_a = len(a)
    N_v = len(v)
    indices = scipy.empty(N_v, dtype='intc')
    code = """
    for(int i = 0; i < N_v; i++){
        indices(i) = lower_bound(&a(0), &a(N_a), v(i)) - &a(0);
    }
    """
    err = scipy.weave.inline(code, ['a','v','N_a', 'N_v','indices'],
                             type_converters = scipy.weave.converters.blitz,
                             compiler = 'gcc',
                             support_code = '#include <algorithm>')
    return indices

-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From strawman at astraw.com Tue May 13 22:00:17 2008 From: strawman at astraw.com (Andrew Straw) Date: Tue, 13 May 2008 19:00:17 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> Message-ID: <482A47B1.4080009@astraw.com> Nathan Bell wrote: > On Tue, May 13, 2008 at 6:59 PM, Andrew Straw wrote: > >> easier and still blazingly fast compared to the binary search >> implemented in searchsorted() given today's cached memory architectures. >> > > Andrew, I looked at your code and I don't quite understand something. > Why are you looking up single values? > Hi Nathan, The Python overhead was nothing compared to the speed problems I was having... Now I'm quite sure that some optimization could go a little further. Nevertheless, for my motivating use case, it wouldn't be trivial to vectorize this, and a "little further" in this case is too little to justify the investment of my time at the moment. -Andrew From david at ar.media.kyoto-u.ac.jp Tue May 13 21:49:11 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 14 May 2008 10:49:11 +0900 Subject: [Numpy-discussion] gcov testing of C extensions (was Going toward time-based release ?) In-Reply-To: <9457e7c80805131547m6de6a56dr3678b6aeb9ec608e@mail.gmail.com> References: <1d36917a0805131014v26f35bebpc8205768117204ba@mail.gmail.com> <9457e7c80805131547m6de6a56dr3678b6aeb9ec608e@mail.gmail.com> Message-ID: <482A4517.6030204@ar.media.kyoto-u.ac.jp> Stéfan van der Walt wrote: > > Unless anybody points out a technical reason not to, please do have a > go at it -- the results are eagerly anticipated! > For numpy, it may be doable, but for scipy, it will be difficult, I think. In particular, statically linking c++ code is not easy. Is gcov the only open source code coverage tool for C/C++ ? cheers, David From charlesr.harris at gmail.com Wed May 14 00:39:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 13 May 2008 22:39:48 -0600 Subject: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2 In-Reply-To: References: Message-ID: On Tue, May 13, 2008 at 7:39 PM, Jarrod Millman wrote: > Hello, > > Well it has become obvious that I branched way too soon. My bad. So > I removed the branch: > http://projects.scipy.org/scipy/numpy/changeset/5163 > And updated the version.py to reflect that the trunk is still for 1.1 > development: > http://projects.scipy.org/scipy/numpy/changeset/5164 > Sorry for the confusion. > > This shouldn't cause any problems, I took a quick look and: > 1. I didn't see anything that had been added to the trunk that > shouldn't go into 1.1.0 > 2. I didn't see anything that had been added to the branch that wasn't > also in the trunk > I was getting ready to add a big code cleanup, so you lucked out ;) Let's get this release out as quickly as possible once masked arrays are ready to go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From millman at berkeley.edu Wed May 14 00:53:25 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 13 May 2008 21:53:25 -0700 Subject: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2 In-Reply-To: References: Message-ID: On Tue, May 13, 2008 at 9:39 PM, Charles R Harris wrote: > I was getting ready to add a big code cleanup, so you lucked out ;) Let's > get this release out as quickly as possible once masked arrays are ready to > go. What is left to do for the masked arrays? Just want to make sure I haven't missed something. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From efiring at hawaii.edu Wed May 14 01:03:56 2008 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 13 May 2008 19:03:56 -1000 Subject: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2 In-Reply-To: References: Message-ID: <482A72BC.30003@hawaii.edu> Jarrod Millman wrote: > On Tue, May 13, 2008 at 9:39 PM, Charles R Harris > wrote: >> I was getting ready to add a big code cleanup, so you lucked out ;) Let's >> get this release out as quickly as possible once masked arrays are ready to >> go. > > What is left to do for the masked arrays? Just want to make sure I > haven't missed something. > > Thanks, > Masked arrays: the only thing I know of as a possibility is the suggestion that the new code be accessible with a DeprecationWarning from numpy.core as well as from numpy. I'm neutral on this, and will not try to implement it. All is not yet well in matrix land; the end of the numpy.test() output is: ====================================================================== ERROR: check_array_from_matrix_list (numpy.core.tests.test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 194, in check_array_from_matrix_list x = array([a, a]) ValueError: setting an array element with a sequence. 
====================================================================== FAIL: check_dimesions (numpy.core.tests.test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 190, in check_dimesions assert_equal(x.ndim, 1) File "/usr/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 145, in assert_equal assert desired == actual, msg AssertionError: Items are not equal: ACTUAL: 2 DESIRED: 1 ---------------------------------------------------------------------- Ran 1004 tests in 1.534s FAILED (failures=1, errors=1) Out[2]: In [3]:numpy.__version__ Out[3]:'1.1.0.dev5164' Eric From millman at berkeley.edu Wed May 14 02:00:34 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 13 May 2008 23:00:34 -0700 Subject: [Numpy-discussion] Trac internal error In-Reply-To: <1210667843.26372.386.camel@localhost> References: <1210667843.26372.386.camel@localhost> Message-ID: On Tue, May 13, 2008 at 1:37 AM, Pauli Virtanen wrote: > Something seems to be wrong with the Trac: > > http://scipy.org/scipy/numpy/timeline > > Internal Error > Ticket changes event provider (TicketModule) failed: > > SubversionException: ("Can't open file > '/home/scipy/svn/numpy/db/revprops/5159': Permission denied", 13) Although this one sorted itself by morning, I kept running into permissions issues in various places today. I finally went ahead and changed some things and I hope these permissions issues will disappear. If anyone else runs into something like this in the next few days, please let me know. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From matthieu.brucher at gmail.com Wed May 14 02:00:49 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 14 May 2008 08:00:49 +0200 Subject: [Numpy-discussion] ticket 788: possible blocker In-Reply-To: <1e2af89e0805131608r1f5fec8fyb935f4da8e209f38@mail.gmail.com> References: <482605DB.9090709@hawaii.edu> <4828CC27.5080906@hawaii.edu> <4828F71D.6090108@enthought.com> <9457e7c80805122356v3fd7533el499ef8d0b73cfc89@mail.gmail.com> <4829BDD6.4060905@enthought.com> <3d375d730805131025j72b58ad3n31d8850e81135425@mail.gmail.com> <4829F53B.5030404@enthought.com> <9457e7c80805131540n253a7ba1n28e2a22987d2fc8a@mail.gmail.com> <1e2af89e0805131608r1f5fec8fyb935f4da8e209f38@mail.gmail.com> Message-ID: 2008/5/14 Matthew Brett : > Hi, > > > Stefan, sometimes the fix really is clear and a test is like closing the > > barn door after the horse has bolted. Sometimes it isn't even clear *how* > to > > test. I committed one fix and omitted a test because I couldn't think of > > anything really reasonable. I think concentrating on unit tests is more > > productive in the long run because we will find *new* bugs, and if done > > right they will also cover spots where old bugs were found. > > I must say that I have certainly (correctly) fixed a bug, and then > broken the code somewhere else resulting in the same effect as the > original bug, and missed it because I didn't put in a test the first > time. I do agree (with everyone else I think) that it's a very good > habit to get into to submit a test with every fix, no matter how > obvious. 
> > Best, > I agree as well, what may be obvious to someone is not for someone else, and there are many examples where I thought the code did this but in fact did that (and I saw it regularly in my courses with some students). Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From cburns at berkeley.edu Wed May 14 02:18:06 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Tue, 13 May 2008 23:18:06 -0700 Subject: [Numpy-discussion] how to use masked arrays Message-ID: <764e38540805132318n4af82683xad2cde9cf6d77508@mail.gmail.com> I'm finding it difficult to tell which methods/operations respect the mask and which do not, in masked arrays. mydata.filled returns a copy of the data (in a numpy array) with all masked elements set to the fill_value. So, masked respected, but data returned as a new data-type when what I wanted was to set all masked values in the array to the same value. mydata.fill however modifies the data array in-place, modifies all values regardless of the mask, and leaves the mask unchanged. Assignment (mydata[:] = 10) sets all values in the slice and updates the mask. Basic methods respect the mask, like mydata.mean(), but np.asarray ignores the mask. Example ------------ In [32]: mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0]) In [34]: mydata Out[34]: masked_array(data = [-- 1 -- 3 -- 5], mask = [ True False True False True False], fill_value=999999) In [35]: mydata.filled(np.nan) Out[35]: array([0, 1, 0, 3, 0, 5]) In [36]: mydata.fill(np.nan) In [37]: mydata Out[37]: masked_array(data = [-- 0 -- 0 -- 0], mask = [ True False True False True False], fill_value=999999) In [38]: mydata.data Out[38]: array([0, 0, 0, 0, 0, 0]) In [48]: mydata[:] = 456 In [49]: mydata Out[49]: masked_array(data = [456 456 456 456 456 456], mask = [False False False False False False], fill_value=999999) In [53]: mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0]) In [54]: mydata.mean() Out[54]: 3.0 In [55]: np.asarray(mydata) Out[55]: array([0, 1, 2, 3, 4, 5]) In summary, is there a tutorial that would show how to use masked arrays? Because at this point I'm confused and don't know how to use them. Google yields this out of data doc: http://numpy.scipy.org/numpydoc/numpy-22.html Thanks! -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From oliphant at enthought.com Wed May 14 03:47:50 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 14 May 2008 02:47:50 -0500 Subject: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2 In-Reply-To: <482A72BC.30003@hawaii.edu> References: <482A72BC.30003@hawaii.edu> Message-ID: <482A9926.6040006@enthought.com> Eric Firing wrote: > Jarrod Millman wrote: > >> On Tue, May 13, 2008 at 9:39 PM, Charles R Harris >> wrote: >> >>> I was getting ready to add a big code cleanup, so you lucked out ;) Let's >>> get this release out as quickly as possible once masked arrays are ready to >>> go. >>> >> What is left to do for the masked arrays? Just want to make sure I >> haven't missed something. >> >> Thanks, >> >> > > Masked arrays: the only thing I know of as a possibility is the > suggestion that the new code be accessible with a DeprecationWarning > from numpy.core as well as from numpy. 
I'm neutral on this, and will > not try to implement it. > > All is not yet well in matrix land; the end of the numpy.test() output is: > > ====================================================================== > ERROR: check_array_from_matrix_list > (numpy.core.tests.test_defmatrix.TestNewScalarIndexing) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", > line 194, in check_array_from_matrix_list > x = array([a, a]) > ValueError: setting an array element with a sequence. > > ====================================================================== > FAIL: check_dimesions > (numpy.core.tests.test_defmatrix.TestNewScalarIndexing) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", > line 190, in check_dimesions > assert_equal(x.ndim, 1) > File "/usr/local/lib/python2.5/site-packages/numpy/testing/utils.py", > line 145, in assert_equal > assert desired == actual, msg > AssertionError: > Items are not equal: > ACTUAL: 2 > DESIRED: 1 > > ---------------------------------------------------------------------- > Ran 1004 tests in 1.534s > > FAILED (failures=1, errors=1) > Out[2]: > I'm aware of these errors. I will try to clean them up soon by fixing the dimension reduction assumption mistakes. -Travis From Garry.Willgoose at newcastle.edu.au Wed May 14 01:27:33 2008 From: Garry.Willgoose at newcastle.edu.au (Garry Willgoose) Date: Wed, 14 May 2008 15:27:33 +1000 Subject: [Numpy-discussion] mac osx10.5 and external library crash (possible f2py/numpy problem?) Message-ID: I've just moved all my stuff from a Intel Imac on 10.4.11 to a Macpro on 10.5.2. On both machines I have the same universal activestate python (both 2.5.1 and 2.5.2.2 give the same problem). I have some codes in fortran from which I build a shared library using f2py from numpy. Now when I import the library as built on osx10.4 it all works fine but when I import library as built on 10.5 I get the following error message willgoose-macpro:system garrywillgoose$ python ActivePython 2.5.1.1 (ActiveState Software Inc.) based on Python 2.5.1 (r251:54863, May 1 2007, 17:40:00) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import tsimdtm Fatal Python error: Interpreter not initialized (version mismatch?) Abort trap Its only my libraries that are the problem. Now also (1) if I move this library back to my 10.4 machine it works fine, and (2) if I move the library built on my 10.4 machine over to 10.5 it also works fine. Its only the library built on 10.5 and run on 10.5 that is the problem. Re f2py it appears to be independent of fortran compiler (g95, gfortran and intel fortran all give same result). I upgraded to the latest version of activestate 2.5.2 but same problem. OSX 10.4 has xcode 2.4 installed, 10.5 has 3.0 installed. Reinstalled all the fortran compilers and the numpy support library ... still the same. Any ideas (1) what the error means in the first place and (2) what I should do? I've cross-posted this on the Python-mac and numpy discussion since it appears to be a OSX/numpy/f2py interaction. 
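In case it helps anyone reproduce this, the smallest test I can think of is to build and import a trivial f2py module directly on the 10.5 machine, which should separate "any f2py-built module crashes" from "only my library crashes". This is just a sketch (the routine and module names are made up, and I am assuming numpy.f2py.compile is available in this numpy version):

import numpy.f2py

src = """
      subroutine addone(x, n)
      integer n, i
      double precision x(n)
cf2py intent(in,out) x
      do i = 1, n
        x(i) = x(i) + 1d0
      end do
      end
"""

# Build a module called 'crashtest' in the current directory with whatever
# Fortran compiler f2py finds (g95, gfortran or ifort).
numpy.f2py.compile(src, modulename='crashtest', verbose=1)

import crashtest
import numpy as np
print crashtest.addone(np.zeros(3))   # expect [ 1.  1.  1.] if the build is healthy

If even that import dies with "Fatal Python error: Interpreter not initialized", then it is the f2py/Python combination on 10.5 rather than anything in my own libraries.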
==================================================================== Prof Garry Willgoose, Australian Professorial Fellow in Environmental Engineering, Director, Centre for Climate Impact Management (C2IM), School of Engineering, The University of Newcastle, Callaghan, 2308 Australia. Centre webpage: www.c2im.org.au Phone: (International) +61 2 4921 6050 (Tues-Fri AM); +61 2 6545 9574 (Fri PM-Mon) FAX: (International) +61 2 4921 6991 (Uni); +61 2 6545 9574 (personal and Telluric) Env. Engg. Secretary: (International) +61 2 4921 6042 email: garry.willgoose at newcastle.edu.au; g.willgoose at telluricresearch.com email-for-life: garry.willgoose at alum.mit.edu personal webpage: www.telluricresearch.com/garry ==================================================================== "Do not go where the path may lead, go instead where there is no path and leave a trail" Ralph Waldo Emerson ==================================================================== From pav at iki.fi Wed May 14 05:10:16 2008 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 14 May 2008 12:10:16 +0300 Subject: [Numpy-discussion] Trac internal error In-Reply-To: References: <1210667843.26372.386.camel@localhost> Message-ID: <1210756216.2854.6.camel@localhost> ti, 2008-05-13 kello 23:00 -0700, Jarrod Millman kirjoitti: > On Tue, May 13, 2008 at 1:37 AM, Pauli Virtanen wrote: > > Something seems to be wrong with the Trac: > > > > http://scipy.org/scipy/numpy/timeline > > > > Internal Error > > Ticket changes event provider (TicketModule) failed: > > > > SubversionException: ("Can't open file > > '/home/scipy/svn/numpy/db/revprops/5159': Permission denied", 13) > > Although this one sorted itself by morning, I kept running into > permissions issues in various places today. I finally went ahead and > changed some things and I hope these permissions issues will > disappear. If anyone else runs into something like this in the next > few days, please let me know. There's a permission error here: http://scipy.org/scipy/numpy/wiki/CodingStyleGuidelines -- Pauli Virtanen From charlesr.harris at gmail.com Wed May 14 06:26:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 14 May 2008 04:26:55 -0600 Subject: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2 In-Reply-To: <482A9926.6040006@enthought.com> References: <482A72BC.30003@hawaii.edu> <482A9926.6040006@enthought.com> Message-ID: On Wed, May 14, 2008 at 1:47 AM, Travis E. Oliphant wrote: > Eric Firing wrote: > > Jarrod Millman wrote: > > > >> On Tue, May 13, 2008 at 9:39 PM, Charles R Harris > >> wrote: > >> > >>> I was getting ready to add a big code cleanup, so you lucked out ;) > Let's > >>> get this release out as quickly as possible once masked arrays are > ready to > >>> go. > >>> > >> What is left to do for the masked arrays? Just want to make sure I > >> haven't missed something. > >> > >> Thanks, > >> > >> > > > > Masked arrays: the only thing I know of as a possibility is the > > suggestion that the new code be accessible with a DeprecationWarning > > from numpy.core as well as from numpy. I'm neutral on this, and will > > not try to implement it. 
> > > > All is not yet well in matrix land; the end of the numpy.test() output > is: > > > > ====================================================================== > > ERROR: check_array_from_matrix_list > > (numpy.core.tests.test_defmatrix.TestNewScalarIndexing) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > > "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", > > line 194, in check_array_from_matrix_list > > x = array([a, a]) > > ValueError: setting an array element with a sequence. > > > > ====================================================================== > > FAIL: check_dimesions > > (numpy.core.tests.test_defmatrix.TestNewScalarIndexing) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > > "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", > > line 190, in check_dimesions > > assert_equal(x.ndim, 1) > > File "/usr/local/lib/python2.5/site-packages/numpy/testing/utils.py", > > line 145, in assert_equal > > assert desired == actual, msg > > AssertionError: > > Items are not equal: > > ACTUAL: 2 > > DESIRED: 1 > > > > ---------------------------------------------------------------------- > > Ran 1004 tests in 1.534s > > > > FAILED (failures=1, errors=1) > > Out[2]: > > > > I'm aware of these errors. I will try to clean them up soon by fixing > the dimension reduction assumption mistakes. > Nah, leave it. You not only have to fix the descent in dimensions, you will have to fix PyArray_From DescAndData (or whatever it is). I vote for simply noting that x = array([a, a]) won't work, and keep it around as a reminder of what a horrible mistake the current Matrix implementation is. The code is difficult as it is, and if we keep this up it will start looking like windows with 15 years of compatibility crud. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From strawman at astraw.com Wed May 14 10:09:26 2008 From: strawman at astraw.com (Andrew Straw) Date: Wed, 14 May 2008 07:09:26 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482A3B82.8060808@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> Message-ID: <482AF296.8040208@astraw.com> > I will post any new insights as I continue to work on this... > OK, I save isolated a sample of my data that illustrates the terrible performance with the binarysearch. I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5 in case anyone wants to have a look themselves. Here's an example of the type of benchmark I've been running: import fastsearch.downsamp import fastsearch.binarysearch import tables h5=tables.openFile('framenumbers.h5',mode='r') framenumbers=h5.root.framenumbers.read() keys=h5.root.keys.read() h5.close() def bench( implementation ): for key in keys: implementation.index( key ) downsamp = fastsearch.downsamp.DownSampledPreSearcher( framenumbers ) binary = fastsearch.binarysearch.BinarySearcher( framenumbers ) # The next two lines are IPython-specific, and the 2nd takes a looong time: %timeit bench(downsamp) %timeit bench(binary) Running the above gives: In [14]: %timeit bench(downsamp) 10 loops, best of 3: 64 ms per loop In [15]: %timeit bench(binary) 10 loops, best of 3: 184 s per loop Quite a difference (a factor of about 3000)! 
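For anyone who wants to poke at the general shape of this comparison without the fastsearch modules or the 191 MB data file, here is a stripped-down, pure-NumPy sketch. The array sizes and contents below are invented and have nothing to do with the real framenumbers data, and it only compares one vectorized searchsorted() call against calling searchsorted() once per key in a Python loop -- a baseline for what plain binary search should cost, not a reimplementation of DownSampledPreSearcher:

import time
import numpy as np

# Invented stand-in for the real dataset: a sorted array of "frame numbers"
# and a subset of them reused as keys (values and keys share the same dtype).
framenumbers = np.cumsum(np.random.randint(1, 10, size=2**20))
keys = framenumbers[::1000].copy()

t0 = time.time()
idx_vec = np.searchsorted(framenumbers, keys)                # one call for all keys
t1 = time.time()
idx_loop = [np.searchsorted(framenumbers, k) for k in keys]  # one call per key
t2 = time.time()

print "vectorized: %.4f s   per-key loop: %.4f s" % (t1 - t0, t2 - t1)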
At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys. From strawman at astraw.com Wed May 14 10:17:10 2008 From: strawman at astraw.com (Andrew Straw) Date: Wed, 14 May 2008 07:17:10 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482AF296.8040208@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> <482AF296.8040208@astraw.com> Message-ID: <482AF466.5040306@astraw.com> Andrew Straw wrote: > I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5 Ahh, forgot to mention a potentially important point -- this data file is 191 MB. From dblubaugh at belcan.com Wed May 14 11:20:57 2008 From: dblubaugh at belcan.com (Blubaugh, David A.) Date: Wed, 14 May 2008 11:20:57 -0400 Subject: [Numpy-discussion] HASH TABLES IN PYTHON Message-ID: <27CC3060AF71DA40A5DC85F7D5B70F3803845510@AWMAIL04.belcan.com> To Whom It May Concern, I was wondering if anyone has ever worked with hash tables within the Python Programming language? I will need to utilize this ability for quick numerical calculations. Thank You, David Blubaugh This e-mail transmission contains information that is confidential and may be privileged. It is intended only for the addressee(s) named above. If you receive this e-mail in error, please do not read, copy or disseminate it in any manner. If you are not the intended recipient, any disclosure, copying, distribution or use of the contents of this information is prohibited. Please reply to the message immediately by informing the sender that the message was misdirected. After replying, please erase it from your computer system. Your assistance in correcting this error is appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Wed May 14 11:30:01 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 14 May 2008 11:30:01 -0400 Subject: [Numpy-discussion] how to use masked arrays In-Reply-To: <764e38540805132318n4af82683xad2cde9cf6d77508@mail.gmail.com> References: <764e38540805132318n4af82683xad2cde9cf6d77508@mail.gmail.com> Message-ID: <200805141130.01605.pgmdevlist@gmail.com> On Wednesday 14 May 2008 02:18:06 Christopher Burns wrote: > I'm finding it difficult to tell which methods/operations respect the > mask and which do not, in masked arrays. Christopher, Unfortunately, there's no tutorial yet. Perhaps could you get one started on the scipy wiki ? I'm afraid I won't have time to do it myself, but I'd be more than happy to fill the gaps. To answer some of your questions: >>>import numpy as np, numpy.ma as ma >>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0]) * If you want to access the underlying data directly, these two commands are (almost) equivalent [1]: >>>mydata._data >>>mydata.view(np.ndarray) Note that you lose the mask information, and that the values that were masked can be bogus. * If you want to get a copy of the underlying data with masked values set to "myvalue", use .filled(myvalue). >>>mydata.filled(-999) array([-99, 1, -99, 3, -99, 5]) If you don't use any argument, ".filled" uses the "fill_value" attribute, whose value depends on the dtype: >>>mydata.fill_value 999999 >>>mydata.filled() array([999999, 1, 999999, 3, 999999, 5]) Note that the argument of ".filled" is casted to the dtype of mydata. 
>>>mydata.dtype dtype('int64') >>>mydata.filled(np.pi) array([3, 1, 3, 3, 3, 5]) That can be a problem if you wanted to use NaNs as filling values (a bad idea in itself): >>>mydata.filled(np.nan) array([0, 1, 0, 3, 0, 5]) Here, you don't have the NaNs you expected because NaNs are for floats, not integers. * Because masked arrays inherit from ndarrays, there's also a "fill" method available: this one acts directly on the ._data part, but setting all the values at once. The mask is preserved. >>>mydata.fill(-999) >>>print mydata [-- -999 -- -999 -- -999] You could achieve the same result with this command >>>mydata.flat = -999 * Assigning a value to a slice of mydata will modify the mask: >>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0]) >>>mydata[:2] = -999 >>>print mydata [-999 -999 -- 3 -- 5] >>>mydata[-2:] = ma.masked >>>print mydata [-999 -999 -- 3 -- --] * If you want to make sure you don't unmask data by mistake with slice assignments, set the ._hardmask attribute to True (it is set to False by default) >>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0], hard_mask=True) >>>mydata[:2] = -999 >>>print mydata [-- -999 -- 3 -- 5] You can change the value of ._hardmask either directly, or with the soften_mask() and harden_mask() methods * > Basic methods respect the mask, like mydata.mean(), but np.asarray > ignores the mask. Yes, np.asarray(x) is equivalent to np.array(x, copy=False, subok=False). If you want to keep the mask, use np.asanyarray, which is equivalent to np.array(x, copy=False, subok=True) [2] >>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0]) >>>print mydata.mean() 3.0 >>>print np.asarray(mydata).mean() 2.5 >>>print np.asanyarray(mydata).mean() 3.0 >>>print np.mean(mydata) 3.0 On the last command, np.mean(mydta) tries first to access the .mean method of mydata: if mydata hand't such a method, it would be equivalent to np.asarray(mydata).mean() Hope it helps, don't hesitate to ask for more details/explanations. Specific examples are always easier. I'm looking forward to your wiki page ;) P. [1] Almost: mydata._data is in fact a shortcut to mydata.view(mydata._baseclass), where ._baseclass is the class of the underlying data. For example >>>mxdata=ma.array(np.matrix([[1,2,],[3,4,]]),mask=[[1,0],[0,0]]) >>>print mxdata._baseclass >>>print type(mxdata._data) >>>print type(mxdata.view(np.ndarray)) [2] Note that np.asanyarray returns a masked array in numpy.ma only, not in previous implementations. From robert.kern at gmail.com Wed May 14 12:44:50 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 May 2008 11:44:50 -0500 Subject: [Numpy-discussion] HASH TABLES IN PYTHON In-Reply-To: <27CC3060AF71DA40A5DC85F7D5B70F3803845510@AWMAIL04.belcan.com> References: <27CC3060AF71DA40A5DC85F7D5B70F3803845510@AWMAIL04.belcan.com> Message-ID: <3d375d730805140944jc3af630r368ddaf5b702bd9c@mail.gmail.com> On Wed, May 14, 2008 at 10:20 AM, Blubaugh, David A. wrote: > To Whom It May Concern, > > I was wondering if anyone has ever worked with hash tables within the Python > Programming language? I will need to utilize this ability for quick > numerical calculations. Yes. Python dicts are hash tables. PS: Please do not post with ALL CAPS. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From efiring at hawaii.edu Wed May 14 13:19:55 2008 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 14 May 2008 07:19:55 -1000 Subject: [Numpy-discussion] how to use masked arrays In-Reply-To: <200805141130.01605.pgmdevlist@gmail.com> References: <764e38540805132318n4af82683xad2cde9cf6d77508@mail.gmail.com> <200805141130.01605.pgmdevlist@gmail.com> Message-ID: <482B1F3B.8070105@hawaii.edu> Pierre GM wrote: [...] > > * If you want to access the underlying data directly, these two commands are > (almost) equivalent [1]: >>>> mydata._data >>>> mydata.view(np.ndarray) Shouldn't the former be discouraged, on the grounds that a leading underscore, by Python convention, indicates an attribute that is not part of the public API, but is instead part of the potentially changeable implementation? Eric From pgmdevlist at gmail.com Wed May 14 13:33:34 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 14 May 2008 13:33:34 -0400 Subject: [Numpy-discussion] how to use masked arrays In-Reply-To: <482B1F3B.8070105@hawaii.edu> References: <764e38540805132318n4af82683xad2cde9cf6d77508@mail.gmail.com> <200805141130.01605.pgmdevlist@gmail.com> <482B1F3B.8070105@hawaii.edu> Message-ID: <200805141333.35298.pgmdevlist@gmail.com> On Wednesday 14 May 2008 13:19:55 Eric Firing wrote: > Pierre GM wrote: > > (almost) equivalent [1]: > >>>> mydata._data > >>>> mydata.view(np.ndarray) > > Shouldn't the former be discouraged, on the grounds that a leading > underscore, by Python convention, indicates an attribute that is not > part of the public API, but is instead part of the potentially > changeable implementation? Eric, * Please keep the note [1] in mind: the two commands are NOT equivalent: the former outputs a subclass of ndarray (when appropriate), the latter a regular ndarray. * You can use mydata.data to achieve the same result as mydata._data. In practice, both _data and data are properties, without a fset method and a with fget= lambda x:x.view(x._baseclass). I'm not very comfortable with using .data myself, it looks a bit awkward (personal taste), and it may let a user think that the readbuffer object is accessed (when in fact, it's mydata.data.data...) * The syntax ._data is required for backwards compatibility (that was the data portion of the old MaskedArray object). So is ._mask * You can also use the getdata(mydata) function: it returns the ._data part of a masked array or the argument as a ndarray, depending which is available. From soren.skou.nielsen at gmail.com Wed May 14 14:48:38 2008 From: soren.skou.nielsen at gmail.com (=?ISO-8859-1?Q?S=F8ren_Nielsen?=) Date: Wed, 14 May 2008 20:48:38 +0200 Subject: [Numpy-discussion] Extending an ndarray Message-ID: Hi, I've loaded an image into a ndarray. I'd like to extend the ndarray with a border of zeros all around the ndarray.. does anyone here know how to do this? Thanks, Soren -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.mcintyre at gmail.com Wed May 14 15:02:46 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 14 May 2008 15:02:46 -0400 Subject: [Numpy-discussion] Extending an ndarray In-Reply-To: References: Message-ID: <1d36917a0805141202u5dcc3475m2dc916cd0d9ba701@mail.gmail.com> Here's one way (probably not the most efficient or elegant): # example original array a=arange(1,26).reshape(5,5) # place copy of 'a' into upper left corner of a larger array of zeros b=zeros((10,10)) b[:5,:5]=a On Wed, May 14, 2008 at 2:48 PM, S?ren Nielsen wrote: > Hi, > > I've loaded an image into a ndarray. I'd like to extend the ndarray with a > border of zeros all around the ndarray.. does anyone here know how to do > this? > > Thanks, > Soren > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Wed May 14 15:43:13 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 14 May 2008 13:43:13 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482AF296.8040208@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> <482AF296.8040208@astraw.com> Message-ID: On Wed, May 14, 2008 at 8:09 AM, Andrew Straw wrote: > > > I will post any new insights as I continue to work on this... > > > OK, I save isolated a sample of my data that illustrates the terrible > performance with the binarysearch. I have uploaded it as a pytables file > to http://astraw.com/framenumbers.h5 in case anyone wants to have a look > themselves. Here's an example of the type of benchmark I've been running: > > import fastsearch.downsamp > import fastsearch.binarysearch > import tables > > h5=tables.openFile('framenumbers.h5',mode='r') > framenumbers=h5.root.framenumbers.read() > keys=h5.root.keys.read() > h5.close() > > def bench( implementation ): > for key in keys: > implementation.index( key ) > > downsamp = fastsearch.downsamp.DownSampledPreSearcher( framenumbers ) > binary = fastsearch.binarysearch.BinarySearcher( framenumbers ) > > # The next two lines are IPython-specific, and the 2nd takes a looong time: > > %timeit bench(downsamp) > %timeit bench(binary) > > > > Running the above gives: > > In [14]: %timeit bench(downsamp) > 10 loops, best of 3: 64 ms per loop > > In [15]: %timeit bench(binary) > > 10 loops, best of 3: 184 s per loop > > Quite a difference (a factor of about 3000)! At this point, I haven't > delved into the dataset to see what makes it so pathological -- > performance is nowhere near this bad for the binary search algorithm > with other sets of keys. > It can't be that bad Andrew, something else is going on. And 191 MB isn's *that* big, I expect it should bit in memory with no problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Wed May 14 15:46:37 2008 From: david.huard at gmail.com (David Huard) Date: Wed, 14 May 2008 15:46:37 -0400 Subject: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag Message-ID: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> Hi, On fedora 8, the docstrings of f2py generated extensions are strangely missing. On Ubuntu, the same modules do have the docstrings. The problem, as reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ? 
Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From soren.skou.nielsen at gmail.com Wed May 14 15:52:28 2008 From: soren.skou.nielsen at gmail.com (=?ISO-8859-1?Q?S=F8ren_Nielsen?=) Date: Wed, 14 May 2008 21:52:28 +0200 Subject: [Numpy-discussion] Extending an ndarray In-Reply-To: <1d36917a0805141202u5dcc3475m2dc916cd0d9ba701@mail.gmail.com> References: <1d36917a0805141202u5dcc3475m2dc916cd0d9ba701@mail.gmail.com> Message-ID: Thanks alan, that works! Soren On Wed, May 14, 2008 at 9:02 PM, Alan McIntyre wrote: > Here's one way (probably not the most efficient or elegant): > > # example original array > a=arange(1,26).reshape(5,5) > > # place copy of 'a' into upper left corner of a larger array of zeros > b=zeros((10,10)) > b[:5,:5]=a > > > On Wed, May 14, 2008 at 2:48 PM, S?ren Nielsen > wrote: > > Hi, > > > > I've loaded an image into a ndarray. I'd like to extend the ndarray with > a > > border of zeros all around the ndarray.. does anyone here know how to do > > this? > > > > Thanks, > > Soren > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From strawman at astraw.com Wed May 14 16:00:37 2008 From: strawman at astraw.com (Andrew Straw) Date: Wed, 14 May 2008 13:00:37 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> <482AF296.8040208@astraw.com> Message-ID: <482B44E5.9090504@astraw.com> Charles R Harris wrote: > > > On Wed, May 14, 2008 at 8:09 AM, Andrew Straw > wrote: > > > > Quite a difference (a factor of about 3000)! At this point, I haven't > delved into the dataset to see what makes it so pathological -- > performance is nowhere near this bad for the binary search algorithm > with other sets of keys. > > > It can't be that bad Andrew, something else is going on. And 191 MB > isn's *that* big, I expect it should bit in memory with no problem. I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas? From robert.kern at gmail.com Wed May 14 16:07:11 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 May 2008 15:07:11 -0500 Subject: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag In-Reply-To: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> References: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> Message-ID: <3d375d730805141307g10b460a4s4913c20ef26a076f@mail.gmail.com> On Wed, May 14, 2008 at 2:46 PM, David Huard wrote: > Hi, > > On fedora 8, the docstrings of f2py generated extensions are strangely > missing. On Ubuntu, the same modules do have the docstrings. The problem, as > reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which > is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ? There is no string "FORTIFY_SOURCE" anywhere in the numpy codebase. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david.huard at gmail.com Wed May 14 16:20:51 2008 From: david.huard at gmail.com (David Huard) Date: Wed, 14 May 2008 16:20:51 -0400 Subject: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag In-Reply-To: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> References: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> Message-ID: <91cf711d0805141320g53ae0c71ncce9ec76033d4965@mail.gmail.com> I filed a patch that seems to do the trick in ticket #792. 2008/5/14 David Huard : > Hi, > > On fedora 8, the docstrings of f2py generated extensions are strangely > missing. On Ubuntu, the same modules do have the docstrings. The problem, as > reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which > is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ? > > Thanks, > > David > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej at certik.cz Wed May 14 17:06:31 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 14 May 2008 23:06:31 +0200 Subject: [Numpy-discussion] let's use patch review Message-ID: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> Hi, I read the recent flamebate about unittests, formal procedures for a commit etc. and it was amusing. :) I think Stefan is right about the unit tests. I also think that Travis is right that there is no formal procedure that can assure what we want. I think that a solution is a patch review. Every big/succesful project does it. And the workflow as I see it is this: 1) Travis will fix a bug, and submit it to a patch review. If he is busy, that's the only thing he will do 2) Someone else reviews it. Stefan will be the one who will always point out missing tests. 3) There needs to be a common consensus that the patch is ok to go in. 4) when the patch is reviewed and ok to go in, anyone with a commit access will commit it. I think it's as simple as that. Sometimes no one has enought time to write a proper test, yet someone has a free minute to fix a bug. Then I think it's ok to put the code in, as I think it's good to fix a bug now. However, the issue is definitely not closed and the bug is not fixed (!) until someone writes a proper test. I.e. putting code in that is not tested, however it doesn't break things, is imho ok, as it will not hurt anyone and it will temporarily fix a bug (but of course the code will be broken at some point in the future, if there is no test for it). Now, the problem is that all patch review tools sucks in some way. Currently the most promissing is the one from Guido here: http://code.google.com/p/rietveld/ it's opensource, you can run it on your server, or use it online here: http://codereview.appspot.com/ I suggest you to read the docs how to use it, I am still learning it. Also it works fine for svn, but not for Mercurial, so we are not using it in SymPy yet. So to also do some work besides just talk, I started with this issue: http://projects.scipy.org/scipy/numpy/ticket/788 and submitted the code (not my code though:) in there for a review here: http://codereview.appspot.com/953 and added some comments. So what do you think? Ondrej P.S. efiring, my comments are real questions to your patch. 
:) From robert.kern at gmail.com Wed May 14 17:24:44 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 May 2008 16:24:44 -0500 Subject: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag In-Reply-To: <91cf711d0805141320g53ae0c71ncce9ec76033d4965@mail.gmail.com> References: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> <91cf711d0805141320g53ae0c71ncce9ec76033d4965@mail.gmail.com> Message-ID: <3d375d730805141424t2a7775efuebd50d4f85be4727@mail.gmail.com> On Wed, May 14, 2008 at 3:20 PM, David Huard wrote: > I filed a patch that seems to do the trick in ticket #792. I don't think this is the right approach. The problem isn't that _FORTIFY_SOURCE is set to 2 but that f2py is doing (probably) bad things that trip these buffer overflow checks. IIRC, Pearu wasn't on the f2py mailing list at the time this came up; please try him again. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From thrabe at burnham.org Wed May 14 18:42:05 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Wed, 14 May 2008 15:42:05 -0700 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndData Segmentation Fault Message-ID: Hi all, the PyArray_FromDimsAndData is still cousing me headaches. Is there anybody out there finding the error of the following code? #include "Python.h" #include int main(int argc,char** argv) { int dimensions = 2; void* value = malloc(sizeof(double)*100); int* size = (int*)malloc(sizeof(int)*2); size[0] = 10; size[1] = 10; for(int i=0;i<100;i++) ((double*)value)[i] = 1.0; for(int i=0;i<100;i++) printf("%e ",((double*)value)[i]); printf("\n%d %d\n",dimensions,size[0]); PyArray_FromDimsAndData(dimensions,size,NPY_DOUBLELTR,(char*)value); //TROUBLE HERE return 0; } I allway get a segmentation fault at the PyArray_FromDimsAndData call. I want to create copies of c arrays, copy them into a running python interpreter as nd-arrays and modify them with some python functions. If I did this in a module, I would have to call the import_array(); function, I know. However, this is all outside of any module and when I add it before PyArray_FromDimsAndData I get the following compilation error: src/test.cpp:24: error: return-statement with no value, in function returning 'int' Does anybody have a clue? Best, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed May 14 19:23:30 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 May 2008 18:23:30 -0500 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndData Segmentation Fault In-Reply-To: References: Message-ID: <3d375d730805141623h4ffa2b06qdd433133731b6f9b@mail.gmail.com> On Wed, May 14, 2008 at 5:42 PM, Thomas Hrabe wrote: > > Hi all, > > the PyArray_FromDimsAndData is still cousing me headaches. > > Is there anybody out there finding the error of the following code? 
> > #include "Python.h" > #include > int main(int argc,char** argv) > { > > int dimensions = 2; > > void* value = malloc(sizeof(double)*100); > > int* size = (int*)malloc(sizeof(int)*2); > > size[0] = 10; > size[1] = 10; > > for(int i=0;i<100;i++) > ((double*)value)[i] = 1.0; > > for(int i=0;i<100;i++) > printf("%e ",((double*)value)[i]); > > printf("\n%d %d\n",dimensions,size[0]); > PyArray_FromDimsAndData(dimensions,size,NPY_DOUBLELTR,(char*)value); > //TROUBLE HERE > > return 0; > } > > I allway get a segmentation fault at the PyArray_FromDimsAndData call. > I want to create copies of c arrays, copy them into a running python > interpreter as nd-arrays and modify them with some python functions. > > If I did this in a module, I would have to call the > import_array(); > function, I know. However, this is all outside of any module and when I add > it before PyArray_FromDimsAndData I get the following compilation error: > src/test.cpp:24: error: return-statement with no value, in function > returning 'int' You can't use numpy outside of Python. Put your code into a Python extension module. I can explain the proximate causes of the error messages if you really want, but they aren't really relevant. The ultimate problem is that you aren't in a Python extension module. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Wed May 14 19:27:36 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 15 May 2008 01:27:36 +0200 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndData Segmentation Fault In-Reply-To: References: Message-ID: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> Hi Thomas 2008/5/15 Thomas Hrabe : > PyArray_FromDimsAndData(dimensions,size,NPY_DOUBLELTR,(char*)value); > //TROUBLE HERE I didn't know a person could write a stand-alone program using NumPy this way (can you?); but what I do know is that FromDimsAndData is deprecated, and that it can be replaced here by PyArray_SimpleNewFromData(dimensions, size, NPY_CDOUBLE, value); where npy_intp* size = malloc(sizeof(npy_intp)*2); Regards St?fan From charlesr.harris at gmail.com Wed May 14 19:27:58 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 14 May 2008 17:27:58 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482B44E5.9090504@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> <482AF296.8040208@astraw.com> <482B44E5.9090504@astraw.com> Message-ID: On Wed, May 14, 2008 at 2:00 PM, Andrew Straw wrote: > Charles R Harris wrote: > > > > > > On Wed, May 14, 2008 at 8:09 AM, Andrew Straw > > wrote: > > > > > > > > Quite a difference (a factor of about 3000)! At this point, I haven't > > delved into the dataset to see what makes it so pathological -- > > performance is nowhere near this bad for the binary search algorithm > > with other sets of keys. > > > > > > It can't be that bad Andrew, something else is going on. And 191 MB > > isn's *that* big, I expect it should bit in memory with no problem. > I agree the performance difference seems beyond what one would expect > due to cache misses alone. I'm at a loss to propose other explanations, > though. Ideas? I just searched for 2**25/10 keys in a 2**25 array of reals. It took less than a second when vectorized. 
In a python loop it took about 7.7 seconds. The only thing I can think of is that the search isn't getting any cpu cycles for some reason. How much memory is it using? Do you have any nans and such in the data? Chuck > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thrabe at burnham.org Wed May 14 19:40:27 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Wed, 14 May 2008 16:40:27 -0700 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentation Fault References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> Message-ID: >I didn't know a person could write a stand-alone program using NumPy >this way (can you?) Well, this is possible when you embed python and use the "simple" objects such as ints, strings, .... Why should it be impossible to do it for numpy then? My plan is to send multidimensional arrays from C to python and to apply some python specific functions to them. -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 2866 bytes Desc: not available URL: From efiring at hawaii.edu Wed May 14 19:58:24 2008 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 14 May 2008 13:58:24 -1000 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> Message-ID: <482B7CA0.4080408@hawaii.edu> Ondrej Certik wrote: > Hi, > > I read the recent flamebate about unittests, formal procedures for a > commit etc. and it was amusing. :) > I think Stefan is right about the unit tests. I also think that Travis > is right that there is no formal procedure that can assure what we > want. > > I think that a solution is a patch review. Every big/succesful project > does it. And the workflow as I see it is this: Are you sure numpy is big enough that a formal mechanism is needed--for everyone? It makes good sense for my (rare) patches to be reviewed, but shouldn't some of the core developers be allowed to simply get on with it? As it is, my patches can easily be reviewed because I don't have commit access. > > 1) Travis will fix a bug, and submit it to a patch review. If he is > busy, that's the only thing he will do > 2) Someone else reviews it. Stefan will be the one who will always > point out missing tests. That we can agree on! > 3) There needs to be a common consensus that the patch is ok to go in. What does that mean? How does one know when there is a consensus? > 4) when the patch is reviewed and ok to go in, anyone with a commit > access will commit it. But it has to be a specific person in each case, not "anyone". > > I think it's as simple as that. > > Sometimes no one has enought time to write a proper test, yet someone > has a free minute to fix a bug. Then I think it's ok to put the code > in, as I think it's good to fix a bug now. However, How does that fit with the workflow above? Does Travis commit the bugfix, or not? > the issue is definitely not closed and the bug is not fixed (!) until > someone writes a proper test. I.e. 
putting code in that is not tested, > however it doesn't break things, is imho ok, as it will not hurt > anyone and it will temporarily fix a bug (but of course the code will > be broken at some point in the future, if there is no test for it). That is overstating the case; for 788, for example, no one in his right mind would undo the one-line correction that Travis made. Chances are, there will be all sorts of breakage and foulups and the revelation of new bugs in the future--but not another instance that would be caught by the test for 788. > [...] > http://codereview.appspot.com/953 > > and added some comments. So what do you think? Looks like it could be useful. I replied to the comments. I haven't read the docs, and I don't know what the next step is when a revision of the patch is in order, as it is in this case. Eric > > Ondrej > > P.S. efiring, my comments are real questions to your patch. :) > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Wed May 14 20:12:52 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 May 2008 19:12:52 -0500 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentation Fault In-Reply-To: References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> Message-ID: <3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> On Wed, May 14, 2008 at 6:40 PM, Thomas Hrabe wrote: >>I didn't know a person could write a stand-alone program using NumPy >>this way (can you?) > > Well, this is possible when you embed python and use the "simple" objects such as ints, strings, .... > Why should it be impossible to do it for numpy then? numpy exposes its API as a pointer to an array which contains function pointers. import_array() imports the extension module, accesses the PyCObject that contains this pointer, and sets a global pointer appropriately. There are #defines macros to emulate the functions by dereferencing the appropriate element of the array and calling it with the given macro arguments. The reason you get the error about returning nothing when the return type of main() is declared int is because this macro is only intended to work inside of an initmodule() function of an extension module, whose return type is void. import_array() includes error handling logic and will return if there is an error. You get the segfault without import_array() because all of the functions you try to call are trying to dereference an array which has not been initialized. > My plan is to send multidimensional arrays from C to python and to apply some python specific functions to them. Well, first you need to call Py_Initialize() to start the VM. Otherwise, you can't import numpy to begin with. I guess you could write a "void load_numpy(void)" function which just exists to call import_array(). Just be sure to check the exception state appropriately after it returns. But for the most part, it's much better to drive your C code using Python than the other around. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From thrabe at burnham.org Wed May 14 20:26:06 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Wed, 14 May 2008 17:26:06 -0700 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentationFault References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> <3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> Message-ID: >But for the most part, it's much better to drive your C code using >Python than the other around. True, the other way arround works fine. Will investigate further tomorrow, because its a must have feature... Thanks -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 2861 bytes Desc: not available URL: From ondrej at certik.cz Wed May 14 21:31:14 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 15 May 2008 03:31:14 +0200 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <482B7CA0.4080408@hawaii.edu> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <482B7CA0.4080408@hawaii.edu> Message-ID: <85b5c3130805141831q53bccc7eo828b1611fa4b8135@mail.gmail.com> On Thu, May 15, 2008 at 1:58 AM, Eric Firing wrote: > Ondrej Certik wrote: >> Hi, >> >> I read the recent flamebate about unittests, formal procedures for a >> commit etc. and it was amusing. :) >> I think Stefan is right about the unit tests. I also think that Travis >> is right that there is no formal procedure that can assure what we >> want. >> >> I think that a solution is a patch review. Every big/succesful project >> does it. And the workflow as I see it is this: > > Are you sure numpy is big enough that a formal mechanism is needed--for > everyone? It makes good sense for my (rare) patches to be reviewed, but > shouldn't some of the core developers be allowed to simply get on with > it? As it is, my patches can easily be reviewed because I don't have > commit access. It's not me who should make this decision, as I have contributed in total maybe 1 patch to numpy. I am just suggesting it as a possibility. The numpy developers are free to choose the workflow that suits them best. But if you are asking for my own opinion -- yes, I think all code should be reviewed, including core developers, that's for example what Sage does (or what we do in SymPy), because that brings other people to comment on it, to help find bugs, to get familiar with it and also everyone involved learns something from it. Simply google why patch review is useful to get arguments for doing it. > >> >> 1) Travis will fix a bug, and submit it to a patch review. If he is >> busy, that's the only thing he will do >> 2) Someone else reviews it. Stefan will be the one who will always >> point out missing tests. > > That we can agree on! > >> 3) There needs to be a common consensus that the patch is ok to go in. > > What does that mean? How does one know when there is a consensus? That everyone involved in the discussion agrees it should go in as it is. I am sure you can recognize very easily if there is not a concensus. > >> 4) when the patch is reviewed and ok to go in, anyone with a commit >> access will commit it. > > But it has to be a specific person in each case, not "anyone". Those, who have commit access are definitely not anyone. Only those, who have showed they can be trusted not to break things. > >> >> I think it's as simple as that. >> >> Sometimes no one has enought time to write a proper test, yet someone >> has a free minute to fix a bug. 
Then I think it's ok to put the code >> in, as I think it's good to fix a bug now. However, > > How does that fit with the workflow above? Does Travis commit the > bugfix, or not? Both is possible. Some projects require all patches to go through a review and I personally think it's a good idea. > >> the issue is definitely not closed and the bug is not fixed (!) until >> someone writes a proper test. I.e. putting code in that is not tested, >> however it doesn't break things, is imho ok, as it will not hurt >> anyone and it will temporarily fix a bug (but of course the code will >> be broken at some point in the future, if there is no test for it). > That is overstating the case; for 788, for example, no one in his right > mind would undo the one-line correction that Travis made. Chances are, > there will be all sorts of breakage and foulups and the revelation of > new bugs in the future--but not another instance that would be caught by > the test for 788. That's what the patch review is for -- people will comment on this and if a consencus is made that a test is not necessary, ok. >> > [...] >> http://codereview.appspot.com/953 >> >> and added some comments. So what do you think? > Looks like it could be useful. I replied to the comments. I haven't > read the docs, and I don't know what the next step is when a revision of > the patch is in order, as it is in this case. It seems only the owner of the issue (in this case me, because I uploaded your code) can add new patches to that issue. So simply start a new issue and upload it there. If there were more revisions from you, it would look like this: http://codereview.appspot.com/970 Ondrej From cournapeau at cslab.kecl.ntt.co.jp Wed May 14 22:22:23 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 15 May 2008 11:22:23 +0900 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <482B7CA0.4080408@hawaii.edu> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <482B7CA0.4080408@hawaii.edu> Message-ID: <1210818143.14512.5.camel@bbc8> On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote: > > What does that mean? How does one know when there is a consensus? There can be a system to make this automatic. For example, the code is never commited directly to svn, but to a gatekeeper, and people vote by an email command to say if they want the patch in; when the total number of votes is above some threshold, the gatekeeper commit the patch. David From strawman at astraw.com Wed May 14 22:50:01 2008 From: strawman at astraw.com (Andrew Straw) Date: Wed, 14 May 2008 19:50:01 -0700 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> <482AF296.8040208@astraw.com> <482B44E5.9090504@astraw.com> Message-ID: <482BA4D9.509@astraw.com> Aha, I've found the problem -- my values were int64 and my keys were uint64. Switching to the same data type immediately fixes the issue! It's not a memory cache issue at all. Perhaps searchsorted() should emit a warning if the keys require casting... I can't believe how bad the hit was. -Andrew Charles R Harris wrote: > > > On Wed, May 14, 2008 at 2:00 PM, Andrew Straw > wrote: > > Charles R Harris wrote: > > > > > > On Wed, May 14, 2008 at 8:09 AM, Andrew Straw > > > >> wrote: > > > > > > > > Quite a difference (a factor of about 3000)! 
At this point, > I haven't > > delved into the dataset to see what makes it so pathological -- > > performance is nowhere near this bad for the binary search > algorithm > > with other sets of keys. > > > > > > It can't be that bad Andrew, something else is going on. And 191 MB > > isn's *that* big, I expect it should bit in memory with no problem. > I agree the performance difference seems beyond what one would expect > due to cache misses alone. I'm at a loss to propose other > explanations, > though. Ideas? > > > I just searched for 2**25/10 keys in a 2**25 array of reals. It took > less than a second when vectorized. In a python loop it took about 7.7 > seconds. The only thing I can think of is that the search isn't > getting any cpu cycles for some reason. How much memory is it using? > Do you have any nans and such in the data? From pearu at cens.ioc.ee Thu May 15 04:13:43 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 15 May 2008 10:13:43 +0200 Subject: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag In-Reply-To: <3d375d730805141424t2a7775efuebd50d4f85be4727@mail.gmail.com> References: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> <91cf711d0805141320g53ae0c71ncce9ec76033d4965@mail.gmail.com> <3d375d730805141424t2a7775efuebd50d4f85be4727@mail.gmail.com> Message-ID: <482BF0B7.6020406@cens.ioc.ee> Robert Kern wrote: > On Wed, May 14, 2008 at 3:20 PM, David Huard wrote: >> I filed a patch that seems to do the trick in ticket #792. > > I don't think this is the right approach. The problem isn't that > _FORTIFY_SOURCE is set to 2 but that f2py is doing (probably) bad > things that trip these buffer overflow checks. IIRC, Pearu wasn't on > the f2py mailing list at the time this came up; please try him again. I was able to reproduce the bug on a debian system. The fix with a comment on what was causing the bug, is in svn: http://scipy.org/scipy/numpy/changeset/5173 I should warn that the bug fix does not have unittests because: 1) testing the bug requires Fortran compiler that for NumPy is an optional requirement. 2) I have tested the fix with two different setups that should cover all possible configurations. 3) In the case of problems with the fix, users should notice it immediately. 4) I have carefully read the patch before committing. Regards, Pearu From david.huard at gmail.com Thu May 15 08:41:52 2008 From: david.huard at gmail.com (David Huard) Date: Thu, 15 May 2008 08:41:52 -0400 Subject: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag In-Reply-To: <482BF0B7.6020406@cens.ioc.ee> References: <91cf711d0805141246u4f801445w2bb0a045c1ff70de@mail.gmail.com> <91cf711d0805141320g53ae0c71ncce9ec76033d4965@mail.gmail.com> <3d375d730805141424t2a7775efuebd50d4f85be4727@mail.gmail.com> <482BF0B7.6020406@cens.ioc.ee> Message-ID: <91cf711d0805150541r5bbfe2e4ibd1a01d88bddb7f1@mail.gmail.com> Works for me, Thanks David 2008/5/15 Pearu Peterson : > > > Robert Kern wrote: > > On Wed, May 14, 2008 at 3:20 PM, David Huard > wrote: > >> I filed a patch that seems to do the trick in ticket #792. > > > > I don't think this is the right approach. The problem isn't that > > _FORTIFY_SOURCE is set to 2 but that f2py is doing (probably) bad > > things that trip these buffer overflow checks. IIRC, Pearu wasn't on > > the f2py mailing list at the time this came up; please try him again. > > I was able to reproduce the bug on a debian system. 
The fix with > a comment on what was causing the bug, is in svn: > > http://scipy.org/scipy/numpy/changeset/5173 > > I should warn that the bug fix does not have unittests because: > 1) testing the bug requires Fortran compiler that for NumPy is > an optional requirement. > 2) I have tested the fix with two different setups that should cover > all possible configurations. > 3) In the case of problems with the fix, users should notice it > immediately. > 4) I have carefully read the patch before committing. > > Regards, > Pearu > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Thu May 15 09:08:11 2008 From: david.huard at gmail.com (David Huard) Date: Thu, 15 May 2008 09:08:11 -0400 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <1210818143.14512.5.camel@bbc8> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <482B7CA0.4080408@hawaii.edu> <1210818143.14512.5.camel@bbc8> Message-ID: <91cf711d0805150608y30d29a05v7c682d3452e90995@mail.gmail.com> 2008/5/14 David Cournapeau : > On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote: > > > > What does that mean? How does one know when there is a consensus? > > There can be a system to make this automatic. For example, the code is > never commited directly to svn, but to a gatekeeper, and people vote by > an email command to say if they want the patch in; when the total number > of votes is above some threshold, the gatekeeper commit the patch. > There is about 5 commits/day, I don't think it's a good idea to wait for a vote on each one of them. > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej at certik.cz Thu May 15 11:32:37 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 15 May 2008 17:32:37 +0200 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <91cf711d0805150608y30d29a05v7c682d3452e90995@mail.gmail.com> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <482B7CA0.4080408@hawaii.edu> <1210818143.14512.5.camel@bbc8> <91cf711d0805150608y30d29a05v7c682d3452e90995@mail.gmail.com> Message-ID: <85b5c3130805150832p64865c16s10cba82b3f32fe95@mail.gmail.com> On Thu, May 15, 2008 at 3:08 PM, David Huard wrote: > 2008/5/14 David Cournapeau : >> >> On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote: >> > >> > What does that mean? How does one know when there is a consensus? >> >> There can be a system to make this automatic. For example, the code is >> never commited directly to svn, but to a gatekeeper, and people vote by >> an email command to say if they want the patch in; when the total number >> of votes is above some threshold, the gatekeeper commit the patch. > > There is about 5 commits/day, I don't think it's a good idea to wait for a > vote on each one of them. Me neither. I think just one reviewer is enough. 
Ondrej From pearu at cens.ioc.ee Thu May 15 11:49:27 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 15 May 2008 18:49:27 +0300 (EEST) Subject: [Numpy-discussion] let's use patch review In-Reply-To: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> Message-ID: <49662.88.90.135.57.1210866567.squirrel@cens.ioc.ee> On Thu, May 15, 2008 12:06 am, Ondrej Certik wrote: > Hi, > > I read the recent flamebate about unittests, formal procedures for a > commit etc. and it was amusing. :) > I think Stefan is right about the unit tests. I also think that Travis > is right that there is no formal procedure that can assure what we > want. > > I think that a solution is a patch review. I am -0.8 on it because the number of numpy core developers is just too small for the patch review to be effective - there is not enough reviewers who are qualified to review low-level code. The number of core developers can be defined as a number of developers who have ever been owners of numpy tickets. It seems that the number is less than 10. Note also that may be only few of them can work full time on numpy. For adding new features, the patch review system can be reasonable though. My 2 cents, Pearu From Glen.Mabey at swri.org Thu May 15 12:28:54 2008 From: Glen.Mabey at swri.org (Glen W. Mabey) Date: Thu, 15 May 2008 11:28:54 -0500 Subject: [Numpy-discussion] failure building numpy using icc In-Reply-To: <20080429214309.GA11807@swri.org> References: <20080228192138.GA21482@swri.org> <3d375d730803010017r1505e28fwd68bc554060b5ba3@mail.gmail.com> <20080429152151.GI23710@swri.org> <20080429155032.GA4718@swri.org> <20080429171911.GB4718@swri.org> <20080429214309.GA11807@swri.org> Message-ID: <20080515162854.GA10951@swri.org> On Tue, Apr 29, 2008 at 04:43:09PM -0500, Glen W. Mabey wrote: > Isn't that cool? I can only assume that it is a compiler bug and I will > have to upgrade to a newer version of icc (I'm using 10.0.025, actually > it's cce). > > After I do that, I'll post again if I have trouble. Just to follow up, compiling python with 10.1.015 resolved the problem. Glen From falted at pytables.org Thu May 15 13:46:46 2008 From: falted at pytables.org (Francesc Alted) Date: Thu, 15 May 2008 19:46:46 +0200 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> Message-ID: <200805151946.46386.falted@pytables.org> A Wednesday 14 May 2008, Ondrej Certik escrigu?: > Hi, > > I read the recent flamebate about unittests, formal procedures for a > commit etc. and it was amusing. :) > I think Stefan is right about the unit tests. I also think that > Travis is right that there is no formal procedure that can assure > what we want. [snip] For what is worth, Ivan and me were using patch peer review for more than a year in the PyTables project, and, although I was quite reluctant to adopt it at the begining, the reality is that it resulted in *much* better code quality to be added to the repository. Here it is the strategy ww used: 0. A ticket is opened explaining the feature to add or the thing to fix. 1. The ticket taker (i.e. the responsible of adding the new code) creates a new branch (properly labeled) for the desired modification/patch. The ticket is updated with a link to the new branch and the ownership of the ticket is transferred to the taker. 2. 
Once the modification is in the new branch, the ticket is updated explaining the actions done, and the ownership of the ticket transferred to the reviewer. 3. The peer reviews the code, and write suggestions in the same ticket about the new code. If the reviewer doesn't feel the need to suggest anything, he will write this in the ticket also. The ticket is transferred to the original author. 4. The original author should revise the suggestions of his peer, and if more actions are needed, he should address them. After this, he will transfer the ticket to the peer for a new review. 5. Phase 3 and 4 are repeated until an agreement is held by both parts, and the discussion remains in the ticket (one can temporarily tranfer part of the ticket discussion to the mailing list, if necessary). 6. After the agreement, the original author commits the patch in the temporary code branch to the affected branches (normally trunk and the stable branch of the project) and removes the temporary branch. The author has to tell explicitely in the ticket to which branches he has applied the new patch. 7. The ticket is closed. Of course, this works great with a pair programming paradigm (as was the case for PyTables), but for a project as NumPy there are more developers than just a pair, so you should decide how to choose the reviewer. One possibility is to form pairs by affinity, so that they can act normally together. Another possibility would be to force all the developers to subscribe to the ticket mailing list, and, for each ticket that requires a peer review, a call should be sent in order to gather a reviewer (who can offer as a volunteer by adding a note to the ticket, for example). I don't need to say that this procedure was not used for small or trivial changes (that were fixed directly), but only when the issue was important enough to deserve the attention of the mate. My two cents, -- Francesc Alted From charlesr.harris at gmail.com Thu May 15 15:26:50 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 15 May 2008 13:26:50 -0600 Subject: [Numpy-discussion] searchsorted() and memory cache In-Reply-To: <482BA4D9.509@astraw.com> References: <4823CA39.7090203@astraw.com> <482A2B50.6010005@astraw.com> <482A3B82.8060808@astraw.com> <482AF296.8040208@astraw.com> <482B44E5.9090504@astraw.com> <482BA4D9.509@astraw.com> Message-ID: On Wed, May 14, 2008 at 8:50 PM, Andrew Straw wrote: > Aha, I've found the problem -- my values were int64 and my keys were > uint64. Switching to the same data type immediately fixes the issue! > It's not a memory cache issue at all. > > Perhaps searchsorted() should emit a warning if the keys require > casting... I can't believe how bad the hit was. > I think it used to fail and that was fixed not so long ago. Was it casting the keys or was it casting the big sorted array? The latter would certainly slow things down and would be a bug, and a bug of the sort that might have slipped in when the casting was added. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
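For what it's worth, a quick way to see whether the mixed key/value dtypes are the culprit is to time the same lookup with matched and mismatched key dtypes. A minimal sketch follows; the array sizes are arbitrary, and how much the mismatched case suffers will depend on which operand ends up being cast and on the numpy version:

import time
import numpy as np

sorted_vals = np.arange(2**22, dtype=np.int64)                 # big, already-sorted array
keys_mismatched = np.arange(0, 2**22, 1000, dtype=np.uint64)   # dtype differs from sorted_vals
keys_matched = keys_mismatched.astype(np.int64)                # cast the keys up front instead

for keys in (keys_matched, keys_mismatched):
    t0 = time.time()
    np.searchsorted(sorted_vals, keys)
    print keys.dtype, time.time() - t0

If the second case is dramatically slower, the cast (possibly of the large sorted array rather than the keys) is almost certainly where the time goes.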
URL: From david at ar.media.kyoto-u.ac.jp Thu May 15 22:57:21 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 16 May 2008 11:57:21 +0900 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <91cf711d0805150608y30d29a05v7c682d3452e90995@mail.gmail.com> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <482B7CA0.4080408@hawaii.edu> <1210818143.14512.5.camel@bbc8> <91cf711d0805150608y30d29a05v7c682d3452e90995@mail.gmail.com> Message-ID: <482CF811.40007@ar.media.kyoto-u.ac.jp> David Huard wrote: > > There is about 5 commits/day, I don't think it's a good idea to wait > for a vote on each one of them. There is definitely a balance to find, and I am not convinced it would work well with subversion (it really makes sense to have those review with merge request, not per commit). For example, in scons, they have a fairly heavy review process which IMO prevents it from getting more contribution. In bzr, it works pretty well (they use a gateway system), but most main developers are paid for it. Having a somewhat official review process would also help solving one of my problem with trac: when someone sends a patch on trac, we don't know it, we have to look for it, and some of them are lost/duplicated. Requesting unit tests for new contributors is too much, I think, I hate it for other projects where I am less involved than numpy/scipy. But say I have 20 minutes to spend on reviewing patches: with a system which a list of available patches, it would be easy to do so. Maybe it is possible with trac and I just missed it. cheers, David From peridot.faceted at gmail.com Thu May 15 23:39:36 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 15 May 2008 23:39:36 -0400 Subject: [Numpy-discussion] let's use patch review In-Reply-To: <200805151946.46386.falted@pytables.org> References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <200805151946.46386.falted@pytables.org> Message-ID: 2008/5/15 Francesc Alted : > I don't need to say that this procedure was not used for small or > trivial changes (that were fixed directly), but only when the issue was > important enough to deserve the attention of the mate. I think here's the rub: when I hear "patch review system" it sounds to me like an obstacle course for getting code into the software. Maybe it's justified, but I think at the moment there are many many things that are just awaiting a little bit of attention from someone who knows the code. A patch review system overapplied would multiply that number by rather a lot. How about a purely-optional patch review system? I've submitted patches I wanted reviewed before they went in the trunk. As it was, I didn't have SVN access, so I just posted them to trac or emailed them to somebody, who then pondered and committed them. But a patch review system - provided people were promptly reviewing patches - would have fit the bill nicely. How frequently does numpy receive patches that warrant review? The zillion little doc fixes don't, even moderate-sized patches from experienced developers probably don't warrant review. So in the last while, what are changes that needed review, and what happened to them? 
* financial code - email to the list, discussion, eventual inclusion * matrix change - bug filed, substantial discussion, confusion about consensus committed, further dicussion, consensus committed, users complain, patch backed out and issue placed on hold * MA change - developed independently, discussed on mailing list, committed * histogram change - filed as bug, discussed on mailing list, committed * median change - discussed on mailing list, committed * .npy file format - discussed and implemented at a sprint, committed Did I miss any major ones? Of course, svn log will give you a list of minor fixes in the last few months. It seems to me like the review process at the moment is just "discuss it on the mailing list". Tools to facilitate that would be valuable; it would be handy to be able to point to a particular version of the code somewhere on the Web (rather than just in patches attached to email), for example. Anne From david at ar.media.kyoto-u.ac.jp Fri May 16 00:45:18 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 16 May 2008 13:45:18 +0900 Subject: [Numpy-discussion] let's use patch review In-Reply-To: References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <200805151946.46386.falted@pytables.org> Message-ID: <482D115E.8000301@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > I think here's the rub: when I hear "patch review system" it sounds to > me like an obstacle course for getting code into the software. Maybe > it's justified, but I think at the moment there are many many things > that are just awaiting a little bit of attention from someone who > knows the code. A patch review system overapplied would multiply that > number by rather a lot. > I think in this discussion, it is easy to see the drawbacks, and not seeing the advantages (better code, etc...). > How about a purely-optional patch review system? I've submitted > patches I wanted reviewed before they went in the trunk. As it was, I > didn't have SVN access, so I just posted them to trac or emailed them > to somebody, who then pondered and committed them. But a patch review > system - provided people were promptly reviewing patches - would have > fit the bill nicely. > It is not much, but I've just created a trac report to see all (open) tickets with an attachment. What would be good would be to create a new ticket type, like patch, such as instead of seeing all attachments (including build log as now), we see real attachments. I know next to nothing about databases and how complicated it would be to do it in trac, but I know other projects using trac have it, so it is doable. cheers, David From stefan at sun.ac.za Fri May 16 02:50:41 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 16 May 2008 08:50:41 +0200 Subject: [Numpy-discussion] let's use patch review In-Reply-To: References: <85b5c3130805141406h3abc4fd0t1d34a4b1bf1b23af@mail.gmail.com> <200805151946.46386.falted@pytables.org> Message-ID: <9457e7c80805152350t8384be6j9298ee37f49f82a3@mail.gmail.com> 2008/5/16 Anne Archibald : > How frequently does numpy receive patches that warrant review? The > zillion little doc fixes don't, even moderate-sized patches from > experienced developers probably don't warrant review. Those moderately-sized patches are the ones that need review, especially. 
Review provides useful information on a couple of levels: a) Motivation -- why do we want/need this patch b) Functionality -- does it do what the developer intended it to c) Implementation -- is it written according to current best practices Level (a) is normally discussed on the mailing list, if needed. Level (b) is covered by unit tests, *if* those were written. Then, level (c) is where the main advantage lies: we can learn from one another how to develop better code. I am somewhat split in two on this one. I love the idea of patch review; it undoubtedly raises the quality of the codebase. That said, it comes at a cost in developer time, and I'm not sure we have that luxury (we don't have a Michael Abshoff, unfortunately). Making it optional might be a good compromise, although the person who wrote a patch isn't the best one to judge whether it should be reviewed (of course, we all think our code is good!). Regards St?fan From millman at berkeley.edu Fri May 16 03:20:08 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 16 May 2008 03:20:08 -0400 Subject: [Numpy-discussion] Tagging 1.1rc1 in about 12 hours Message-ID: Hello, I believe that we have now addressed everything that was holding up the 1.1.0 release, so I will be tagging the 1.1.0rc1 in about 12 hours. Please be extremely conservative and careful about any commits you make to the trunk until we officially release 1.1.0 (now may be a good time to spend some effort on SciPy). Once I tag the release candidate I will ask both David and Chris to create Windows and Mac binaries. I will give everyone a few days to test the release candidate and binaries thoroughly. If everything looks good, the release candidate will become the official release. Once I tag 1.1.0, I will open the trunk for 1.1.1 development. Any development for 1.2 will have to occur on a new branch. I also plan to spend sometime once 1.1.0 is released discussing with the community what we want included in 1.2. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pearu at cens.ioc.ee Fri May 16 03:38:36 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri, 16 May 2008 09:38:36 +0200 Subject: [Numpy-discussion] Tagging 1.1rc1 in about 12 hours In-Reply-To: References: Message-ID: <482D39FC.4080303@cens.ioc.ee> Jarrod Millman wrote: > Hello, > > I believe that we have now addressed everything that was holding up > the 1.1.0 release, so I will be tagging the 1.1.0rc1 in about 12 > hours. Please be extremely conservative and careful about any commits > you make to the trunk until we officially release 1.1.0 (now may be a > good time to spend some effort on SciPy). Once I tag the release > candidate I will ask both David and Chris to create Windows and Mac > binaries. I will give everyone a few days to test the release > candidate and binaries thoroughly. If everything looks good, the > release candidate will become the official release. > > Once I tag 1.1.0, I will open the trunk for 1.1.1 development. Any > development for 1.2 will have to occur on a new branch. I am working with the ticket 752 at the moment and I would probably not want to commit my work to 1.1.0 at this time, so I shall commit when trunk is open as 1.1.1. My question regarding branching: how the changes from 1.1.1 will end up into 1.2 branch? 
Thanks, Pearu From opossumnano at gmail.com Fri May 16 04:01:42 2008 From: opossumnano at gmail.com (Tiziano Zito) Date: Fri, 16 May 2008 10:01:42 +0200 Subject: [Numpy-discussion] ANN: MDP 2.3 released! Message-ID: <20080516080142.GC24462@diamond.bccn-berlin> Dear NumPy and SciPy users, we are proud to announce release 2.3 of the Modular toolkit for Data Processing (MDP): a Python data processing framework. The base of readily available algorithms includes Principal Component Analysis (PCA and NIPALS), four flavors of Independent Component Analysis (CuBICA, FastICA, TDSEP, and JADE), Slow Feature Analysis, Independent Slow Feature Analysis, Gaussian Classifiers, Growing Neural Gas, Fisher Discriminant Analysis, Factor Analysis, Restricted Boltzmann Machine, and many more. What's new in version 2.3? -------------------------- - Enhanced PCA nodes (with SVD, automatic dimensionality reduction, and iterative algorithms). - A complete implementation of the FastICA algorithm. - JADE and TDSEP nodes for more fun with ICA. - Restricted Boltzmann Machine nodes. - The new subpackage "hinet" allows combining nodes in arbitrary feed-forward network architectures with a HTML visualization tool. - The tutorial has been updated with a section on hierarchical networks. - MDP integrated into the official Debian repository as "python-mdp". - A bunch of bug-fixes. Resources --------- Download: http://sourceforge.net/project/showfiles.php?group_id=116959 Homepage: http://mdp-toolkit.sourceforge.net Mailing list: http://sourceforge.net/mail/?group_id=116959 -- Pietro Berkes Gatsby Computational Neuroscience Unit UCL London, United Kingdom Niko Wilbert Institute for Theoretical Biology Humboldt-University Berlin, Germany Tiziano Zito Bernstein Center for Computational Neuroscience Humboldt-University Berlin, Germany From matthew.brett at gmail.com Fri May 16 05:35:25 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 16 May 2008 10:35:25 +0100 Subject: [Numpy-discussion] [SciPy-dev] [IT] Weekend outage complete In-Reply-To: <4364B874-7A19-4F77-88D1-F8855058CF1E@enthought.com> References: <4364B874-7A19-4F77-88D1-F8855058CF1E@enthought.com> Message-ID: <1e2af89e0805160235k239d0804h2cc8906521e1a0ac@mail.gmail.com> Hi, I hope you're the right person to ask about this - sorry if not. I have just noticed that our (neuroimaging.scipy.org) wiki link no longer works: http://projects.scipy.org/neuroimaging/ni/wiki gives a 502 proxy error: Proxy Error The proxy server received an invalid response from an upstream server. The proxy server could not handle the request GET /neuroimaging/ni/wiki. Reason: DNS lookup failure for: neuroimaging.scipy.org I don't think we've changed any setup - is it possible there is an error at the server end? Thanks a lot, Matthew From pwang at enthought.com Fri May 16 08:04:37 2008 From: pwang at enthought.com (Peter Wang) Date: Fri, 16 May 2008 08:04:37 -0400 Subject: [Numpy-discussion] [SciPy-dev] [IT] Weekend outage complete In-Reply-To: <1e2af89e0805160235k239d0804h2cc8906521e1a0ac@mail.gmail.com> References: <4364B874-7A19-4F77-88D1-F8855058CF1E@enthought.com> <1e2af89e0805160235k239d0804h2cc8906521e1a0ac@mail.gmail.com> Message-ID: <2A7512CA-2EF1-4631-BE64-2ADBFAE931F9@enthought.com> On May 16, 2008, at 5:35 AM, Matthew Brett wrote: > Hi, > I hope you're the right person to ask about this - sorry if not. 
> I have just noticed that our (neuroimaging.scipy.org) wiki link no > longer works: > http://projects.scipy.org/neuroimaging/ni/wiki > gives a 502 proxy error: > Proxy Error > The proxy server received an invalid response from an upstream server. > The proxy server could not handle the request GET /neuroimaging/ni/ > wiki. > Reason: DNS lookup failure for: neuroimaging.scipy.org Hi Matthew, Thanks for reporting this. Indeed, this was affecting some other subdomains as well, but I have just now fixed it. -Peter From haase at msg.ucsf.edu Fri May 16 09:32:58 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri, 16 May 2008 15:32:58 +0200 Subject: [Numpy-discussion] Ruby benchmark -- numpy is slower.... was: Re: Ruby's NMatrix and NVector Message-ID: Hi, can someone comment on these timing numbers ? http://narray.rubyforge.org/bench.html.en Is the current numpy faster ? Cheers, Sebastian Haase On Sat, May 3, 2008 at 2:07 AM, Travis E. Oliphant wrote: > > http://narray.rubyforge.org/matrix-e.html > > It seems they've implemented some of what Tim is looking for, in > particular. Perhaps there is information to be gleaned from what they > are doing. It looks promising.. > > -Travis > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From david at ar.media.kyoto-u.ac.jp Fri May 16 09:31:00 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 16 May 2008 22:31:00 +0900 Subject: [Numpy-discussion] Ruby benchmark -- numpy is slower.... was: Re: Ruby's NMatrix and NVector In-Reply-To: References: Message-ID: <482D8C94.7040607@ar.media.kyoto-u.ac.jp> Sebastian Haase wrote: > Hi, > can someone comment on these timing numbers ? > http://narray.rubyforge.org/bench.html.en > > Is the current numpy faster ? > It is hard to know without getting the same machine or having the benchmark sources. But except for add, all other operations rely on underlying blas/lapack (only matrix operations do if you have no cblas), so I am a bit surprised by the results. FWIW, doing 100 x "c = a + b" with 1e6 elements on a PIV prescott @ 3.2 Ghz is about 2 sec, and I count numpy start: import numpy as np a = np.random.randn(1e6) b = np.random.randn(1e6) for i in range(100): a + b And np.dot(a, b) for 3 iterations and 500x500 takes 0.5 seconds (again taking into account numpy import), but what you really do here is benchmarking your underlying BLAS (if numpy.dot does use BLAS, again, which it does at least when built with ATLAS). cheers, David From peridot.faceted at gmail.com Fri May 16 11:00:51 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 16 May 2008 11:00:51 -0400 Subject: [Numpy-discussion] Ruby benchmark -- numpy is slower.... was: Re: Ruby's NMatrix and NVector In-Reply-To: <482D8C94.7040607@ar.media.kyoto-u.ac.jp> References: <482D8C94.7040607@ar.media.kyoto-u.ac.jp> Message-ID: 2008/5/16 David Cournapeau : > Sebastian Haase wrote: >> Hi, >> can someone comment on these timing numbers ? >> http://narray.rubyforge.org/bench.html.en >> >> Is the current numpy faster ? >> > > It is hard to know without getting the same machine or having the > benchmark sources. But except for add, all other operations rely on > underlying blas/lapack (only matrix operations do if you have no cblas), > so I am a bit surprised by the results. 
> > FWIW, doing 100 x "c = a + b" with 1e6 elements on a PIV prescott @ 3.2 > Ghz is about 2 sec, and I count numpy start: > > import numpy as np > > a = np.random.randn(1e6) > b = np.random.randn(1e6) > > for i in range(100): > a + b > > And np.dot(a, b) for 3 iterations and 500x500 takes 0.5 seconds (again > taking into account numpy import), but what you really do here is > benchmarking your underlying BLAS (if numpy.dot does use BLAS, again, > which it does at least when built with ATLAS). There are four benchmarks: add, multiply, dot, and solve. dot and solve use BLAS, and for them numpy ruby and octave are comparable. Add and multiply are much slower in numpy, but they are implemented in numpy itself. Exactly why add and multiply are slower is an interesting question - loop overhead? striding? cache behaviour? Anne From cournape at gmail.com Fri May 16 11:39:56 2008 From: cournape at gmail.com (David Cournapeau) Date: Sat, 17 May 2008 00:39:56 +0900 Subject: [Numpy-discussion] Ruby benchmark -- numpy is slower.... was: Re: Ruby's NMatrix and NVector In-Reply-To: References: <482D8C94.7040607@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220805160839ne11e411n8db39cf2690b6577@mail.gmail.com> On Sat, May 17, 2008 at 12:00 AM, Anne Archibald wrote: > > There are four benchmarks: add, multiply, dot, and solve. dot and > solve use BLAS, and for them numpy ruby and octave are comparable. Add > and multiply are much slower in numpy, but they are implemented in > numpy itself. The benchmark was done in 2005, and we do not know how it was done (no source). I don't know anything about ruby (that's my first ruby "program") but: cat > test1.py require "narray" a = NArray.float(1e6).fill(0) b = NArray.float(1e6).fill(0) for i in 1..200 a + b end EOF cat > test1.py import numpy as np a = np.zeros(1e6) b = np.zeros(1e6) for i in range(200): a + b EOF Give me extremely close results (now on my macbook with a core 2 duo). One nice thing with narray is the speed when loading it (10 x faster), but that may well be because narray is much smaller than numpy. cheers, David From sdb at cloud9.net Fri May 16 12:23:49 2008 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 16 May 2008 12:23:49 -0400 (EDT) Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? Message-ID: Hi guys, Just a quick note. I've been playing with NumPy again, looking at corner cases of function evaluation. I noticed this: In [66]: numpy.sign(numpy.nan) Out[66]: 0.0 IMO, the output should be NaN, not zero. If you agree, then I'll be happy to file a bug in the NumPy tracker. Or if somebody feels like pointing me to the place where this is implemented, I can submit a patch. (I grepped through the source for "sign" to see if I could figure it out, but it occurs so frequently that it will take longer than 5 min to sort it all out.) Before I did anything, however, I thought I would solicit the opinions of other folks in the NumPy community about the proper behavior of numpy.sign(numpy.nan). Cheers, Stuart Brorson Interactive Supercomputing, inc. 135 Beaver Street | Waltham | MA | 02452 | USA http://www.interactivesupercomputing.com/ From sdb at cloud9.net Fri May 16 12:47:59 2008 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 16 May 2008 12:47:59 -0400 (EDT) Subject: [Numpy-discussion] numpy.arccos(numpy.inf)???? Message-ID: Hi -- Sorry to be a pest with corner cases, but I found another one. In this case, if you try to take the arccos of numpy.inf in the context of a complex array, you get a bogus return (IMO). 
Like this: In [147]: R = numpy.array([1, numpy.inf]) In [148]: numpy.arccos(R) Warning: invalid value encountered in arccos Out[148]: array([ 0., NaN]) In [149]: C = numpy.array([1+1j, numpy.inf]) In [150]: numpy.arccos(C) Warning: invalid value encountered in arccos Out[150]: array([ 0.90455689-1.06127506j, NaN -Infj]) The arccos(numpy.inf) in the context of a real array is OK, but taking arcocs(numpy.inf) in the context of a complex array should return NaN + NaNj, IMO. Thoughts? Stuart Brorson Interactive Supercomputing, inc. 135 Beaver Street | Waltham | MA | 02452 | USA http://www.interactivesupercomputing.com/ From nripunsredar at gmail.com Fri May 16 13:31:22 2008 From: nripunsredar at gmail.com (Nripun Sredar) Date: Fri, 16 May 2008 12:31:22 -0500 Subject: [Numpy-discussion] svd in numpy Message-ID: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> I have a sparse matrix 416x52. I tried to factorize this matrix using svd from numpy. But it didn't produce a result and looked like it is in an infinite loop. I tried a similar operation using random numbers in the matrix. Even this is in an infinite loop. Did anyone else face a similar problem? Can anyone please give some suggestions? -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Fri May 16 13:37:48 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 16 May 2008 19:37:48 +0200 Subject: [Numpy-discussion] Strange behaviour of linalg.svd() and linalg.eigh() In-Reply-To: References: Message-ID: Hi, I tried using Matlab with the same matrix and its eig() function. It can diagonalize the matrix with a correct result, which is not the case for linalg.eigh(). Strange. Matthieu 2008/4/17 Matthieu Brucher : > Hi, > > Ive implemented the classical MultiDimensional Scaling for the scikit learn > using both functions. Their behavior surprised me for "big" arrays (10000 by > 10000, symmetric as it is a similarity matrix). > linalg.svd() raises a memory error because it tries to allocate a > (7000000,) array (in fact bigger than that !). This is strange because the > test was made on a 64bits Linux, so memory should not have been a problem. > linalg.eigh() fails to diagonalize the matrix, it gives me NaN as a result, > and this is not very useful. > A direct optimization of the underlying cost function can give me an > adequate solution. > > I cannot attach the matrix file (more than 700MB when pickled), but if > anyone has a clue, I'll be glad. > > Matthieu > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri May 16 13:56:41 2008 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 16 May 2008 20:56:41 +0300 Subject: [Numpy-discussion] Ruby benchmark -- numpy is slower.... 
was: Re: Ruby's NMatrix and NVector In-Reply-To: <5b8d13220805160839ne11e411n8db39cf2690b6577@mail.gmail.com> References: <482D8C94.7040607@ar.media.kyoto-u.ac.jp> <5b8d13220805160839ne11e411n8db39cf2690b6577@mail.gmail.com> Message-ID: <1210960601.7745.27.camel@localhost.localdomain> la, 2008-05-17 kello 00:39 +0900, David Cournapeau kirjoitti: > On Sat, May 17, 2008 at 12:00 AM, Anne Archibald > wrote: > > > > > There are four benchmarks: add, multiply, dot, and solve. dot and > > solve use BLAS, and for them numpy ruby and octave are comparable. Add > > and multiply are much slower in numpy, but they are implemented in > > numpy itself. > > The benchmark was done in 2005, and we do not know how it was done (no > source). I don't know anything about ruby (that's my first ruby > "program") but: [clip] The benchmark sources are in Narray's source directory. I took a look and my conclusion is that the benchmark is simply flawed: for Ruby, only user time is counted, while for Python, both user and system times are counted. The code uses Python's time.clock() which according to the documentation returns the CPU time (apparently user + system). On the Ruby side it uses Process.times.utime which is the elapsed user time. Running the original tests as they are in NArray 0.5.9 yields (I took representative ones from several runs. Eyeballing, the std between runs appeared of the order of 0.1...0.2s): ### Numeric 24.2 (24.2-8ubuntu2) ### Narray 0.5.9 (0.5.9-2) ### numpy 1.0.4 (1:1.0.4-6ubuntu3) ### ### All of these from Ubuntu 8.04 packages. $ time ruby mul.rb a = NArray.float(1000000): [ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, ... ] b = NArray.float(1000000): [ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, ... ] calculating c = a*b ... Time: 3.05 sec real 0m5.039s user 0m3.116s sys 0m1.564s Obviously, the reported time here is the user time only! $ time python mul.py # the old Numeric a.typecode: d , a.shape: (1000000,) b.typecode: d , b.shape: (1000000,) calculating c = a*b ... Time: 6.020 sec real 0m6.999s user 0m4.308s sys 0m2.164s Whereas here it must be the sum of the user and system times! Running tests for numpy and fixed time counting for Ruby: $ time python mul_numpy.py ?# the new numpy a.typecode: float64 , a.shape: (1000000,) b.typecode: float64 , b.shape: (1000000,) calculating c = a*b ... Time: 4.580 sec real 0m5.774s user 0m3.352s sys 0m1.996s $ time ruby mul_correct.rb # using T.times.utime + T.times.stime a = NArray.float(1000000): [ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, ... ] b = NArray.float(1000000): [ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, ... ] calculating c = a*b ... Time: 4.57 sec real 0m5.045s user 0m3.060s sys 0m1.620s I think this shows that there is no discernible difference between the performance of numpy and Ruby's NArray. Even though the performance of numpy and NArray is indeed better than that of Numeric, the difference is not as large as the original benchmark led to believe. Benchmark files attached, in case someone wants to contest my analysis. -- Pauli Virtanen -------------- next part -------------- A non-text attachment was scrubbed... Name: mul.py Type: text/x-python Size: 270 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mul.rb Type: application/x-ruby Size: 134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
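To see how much the choice of clock matters here, the same loop can be timed with all three measures from the Python side. A small sketch along the lines of the attached files, with the sizes taken from the benchmark page:

import os, time
import numpy as np

a = np.zeros(1000000)
b = np.zeros(1000000)

u0, s0 = os.times()[:2]    # user and system CPU time so far
c0 = time.clock()
w0 = time.time()
for i in range(100):
    a * b
u1, s1 = os.times()[:2]

print "user time      :", u1 - u0
print "user + system  :", (u1 - u0) + (s1 - s0)
print "time.clock()   :", time.clock() - c0
print "wall-clock time:", time.time() - w0

Comparing a user-only number on the Ruby side against a user-plus-system number on the Python side, as the original benchmark apparently did, skews the result by roughly the system time spent allocating and touching the temporaries.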
Name: mul_correct.rb Type: application/x-ruby Size: 142 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mul_numpy.py Type: text/x-python Size: 266 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mybench.py Type: text/x-python Size: 313 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mybench.rb Type: application/x-ruby Size: 476 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mybench_correct.rb Type: application/x-ruby Size: 508 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mybench_numpy.py Type: text/x-python Size: 307 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From bsouthey at gmail.com Fri May 16 14:00:47 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 16 May 2008 13:00:47 -0500 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> Message-ID: <482DCBCF.7060104@gmail.com> Nripun Sredar wrote: > I have a sparse matrix 416x52. I tried to factorize this matrix using > svd from numpy. But it didn't produce a result and looked like it is > in an infinite loop. > I tried a similar operation using random numbers in the matrix. Even > this is in an infinite loop. > Did anyone else face a similar problem? > Can anyone please give some suggestions? > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Hi, Please ensure that you have the latest version of numpy (either 1.1 when it gets released or one from the svn) - see Ticket 627: http://www.scipy.org/scipy/numpy/ticket/627 Bruce From robert.kern at gmail.com Fri May 16 14:23:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 16 May 2008 13:23:55 -0500 Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? In-Reply-To: References: Message-ID: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> On Fri, May 16, 2008 at 11:23 AM, Stuart Brorson wrote: > Hi guys, > > Just a quick note. I've been playing with NumPy again, looking at > corner cases of function evaluation. I noticed this: > > In [66]: numpy.sign(numpy.nan) > Out[66]: 0.0 > > IMO, the output should be NaN, not zero. > > If you agree, then I'll be happy to file a bug in the NumPy tracker. > Or if somebody feels like pointing me to the place where this is > implemented, I can submit a patch. (I grepped through the source for > "sign" to see if I could figure it out, but it occurs so frequently > that it will take longer than 5 min to sort it all out.) > > Before I did anything, however, I thought I would solicit the opinions > of other folks in the NumPy community about the proper behavior of > numpy.sign(numpy.nan). You're probably right. I would like to see what other systems do before changing it, though. The implementation is actually in a #define macro in umathmodule.c.src. 
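In the meantime a user-level workaround is simple enough; a small sketch (nan_aware_sign is just an illustrative helper, not a numpy function):

import numpy as np

def nan_aware_sign(x):
    # Illustrative helper, not part of numpy: keep NaNs as NaN instead of 0.
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), np.nan, np.sign(x))

print nan_aware_sign([-3.0, 0.0, 2.0, np.nan])   # NaN stays NaN instead of becoming 0.0
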
Look for _SIGN1 (and _SIGNC if you want to clean up the complex versions, too). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri May 16 14:37:08 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 16 May 2008 13:37:08 -0500 Subject: [Numpy-discussion] numpy.arccos(numpy.inf)???? In-Reply-To: References: Message-ID: <3d375d730805161137v7e9d1533l84232901035ec36@mail.gmail.com> On Fri, May 16, 2008 at 11:47 AM, Stuart Brorson wrote: > Hi -- > > Sorry to be a pest with corner cases, but I found another one. > > In this case, if you try to take the arccos of numpy.inf in the > context of a complex array, you get a bogus return (IMO). Like this: > > In [147]: R = numpy.array([1, numpy.inf]) > > In [148]: numpy.arccos(R) > Warning: invalid value encountered in arccos > Out[148]: array([ 0., NaN]) > > In [149]: C = numpy.array([1+1j, numpy.inf]) > > In [150]: numpy.arccos(C) > Warning: invalid value encountered in arccos > Out[150]: array([ 0.90455689-1.06127506j, NaN -Infj]) > > The arccos(numpy.inf) in the context of a real array is OK, but taking > arcocs(numpy.inf) in the context of a complex array should return > NaN + NaNj, IMO. > > Thoughts? Hmm, this works fine on OS X. This may be a problem with one of the lower-level math functions which we defer to the platform. In [1]: from numpy import * In [2]: arccos(nan+nan*1j) Out[2]: (nan+nanj) In [3]: arccos(nan+0j) Out[3]: (nan+nanj) In [4]: arccos(nan) Out[4]: nan In [5]: arccos([1.0+0j, nan]) Out[5]: array([ 0. -0.j, NaN NaNj]) The implementations of the complex versions are in umathmodule.c.src (for the expanded versions, see umathmodule.c after building); they are all prefixed with "nc_". E.g. the following are calling nc_acos() for doubles. Here is the source: static void nc_acos(cdouble *x, cdouble *r) { nc_prod(x,x,r); nc_diff(&nc_1, r, r); nc_sqrt(r, r); nc_prodi(r, r); nc_sum(x, r, r); nc_log(r, r); nc_prodi(r, r); nc_neg(r, r); return; /* return nc_neg(nc_prodi(nc_log(nc_sum(x,nc_prod(nc_i, nc_sqrt(nc_diff(nc_1,nc_prod(x,x)))))))); */ } I suspect the problem comes from the nc_log() which calls your platform's atan2() for the imaginary part of the result. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri May 16 14:39:01 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 16 May 2008 13:39:01 -0500 Subject: [Numpy-discussion] numpy.arccos(numpy.inf)???? In-Reply-To: <3d375d730805161137v7e9d1533l84232901035ec36@mail.gmail.com> References: <3d375d730805161137v7e9d1533l84232901035ec36@mail.gmail.com> Message-ID: <3d375d730805161139n2aa48715i6603b2febd607ab9@mail.gmail.com> On Fri, May 16, 2008 at 1:37 PM, Robert Kern wrote: > On Fri, May 16, 2008 at 11:47 AM, Stuart Brorson wrote: >> Hi -- >> >> Sorry to be a pest with corner cases, but I found another one. >> >> In this case, if you try to take the arccos of numpy.inf in the >> context of a complex array, you get a bogus return (IMO). 
Like this: >> >> In [147]: R = numpy.array([1, numpy.inf]) >> >> In [148]: numpy.arccos(R) >> Warning: invalid value encountered in arccos >> Out[148]: array([ 0., NaN]) >> >> In [149]: C = numpy.array([1+1j, numpy.inf]) >> >> In [150]: numpy.arccos(C) >> Warning: invalid value encountered in arccos >> Out[150]: array([ 0.90455689-1.06127506j, NaN -Infj]) >> >> The arccos(numpy.inf) in the context of a real array is OK, but taking >> arcocs(numpy.inf) in the context of a complex array should return >> NaN + NaNj, IMO. >> >> Thoughts? > > Hmm, this works fine on OS X. Sorry, I'm an idiot. I get the same results as you when I read the message correctly. In [2]: arccos(inf+0j) Out[2]: (nan+-infj) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Fri May 16 15:27:10 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 16 May 2008 12:27:10 -0700 Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? In-Reply-To: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> References: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> Message-ID: On Fri, May 16, 2008 at 11:23 AM, Robert Kern wrote: > On Fri, May 16, 2008 at 11:23 AM, Stuart Brorson wrote: >> In [66]: numpy.sign(numpy.nan) >> Out[66]: 0.0 >> >> IMO, the output should be NaN, not zero. > > You're probably right. I would like to see what other systems do > before changing it, though. octave:1> sign(nan) ans = NaN From robert.kern at gmail.com Fri May 16 15:29:36 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 16 May 2008 14:29:36 -0500 Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? In-Reply-To: References: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> Message-ID: <3d375d730805161229u1299b057k99c4934f21610587@mail.gmail.com> On Fri, May 16, 2008 at 2:27 PM, Keith Goodman wrote: > On Fri, May 16, 2008 at 11:23 AM, Robert Kern wrote: >> On Fri, May 16, 2008 at 11:23 AM, Stuart Brorson wrote: >>> In [66]: numpy.sign(numpy.nan) >>> Out[66]: 0.0 >>> >>> IMO, the output should be NaN, not zero. >> >> You're probably right. I would like to see what other systems do >> before changing it, though. > > octave:1> sign(nan) > ans = NaN Works for me. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sdb at cloud9.net Fri May 16 15:32:25 2008 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 16 May 2008 15:32:25 -0400 (EDT) Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? In-Reply-To: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> References: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> Message-ID: Hi -- >> In [66]: numpy.sign(numpy.nan) >> Out[66]: 0.0 >> >> IMO, the output should be NaN, not zero. > You're probably right. I would like to see what other systems do > before changing it, though. Here's what Matlab does: >> A = nan A = NaN >> sign(A) ans = NaN >> B = [1, nan] B = 1 NaN >> sign(B) ans = 1 NaN Cheers, Stuart Brorson Interactive Supercomputing, inc. 
135 Beaver Street | Waltham | MA | 02452 | USA http://www.interactivesupercomputing.com/ From peridot.faceted at gmail.com Fri May 16 17:57:28 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 16 May 2008 23:57:28 +0200 Subject: [Numpy-discussion] numpy.arccos(numpy.inf)???? In-Reply-To: References: Message-ID: 2008/5/16 Stuart Brorson : > Hi -- > > Sorry to be a pest with corner cases, but I found another one. > > In this case, if you try to take the arccos of numpy.inf in the > context of a complex array, you get a bogus return (IMO). Like this: > > In [147]: R = numpy.array([1, numpy.inf]) > > In [148]: numpy.arccos(R) > Warning: invalid value encountered in arccos > Out[148]: array([ 0., NaN]) > > In [149]: C = numpy.array([1+1j, numpy.inf]) > > In [150]: numpy.arccos(C) > Warning: invalid value encountered in arccos > Out[150]: array([ 0.90455689-1.06127506j, NaN -Infj]) > > The arccos(numpy.inf) in the context of a real array is OK, but taking > arcocs(numpy.inf) in the context of a complex array should return > NaN + NaNj, IMO. > > Thoughts? Is this so bad? lim x->inf arccos(x) = -j inf Granted this is only when x approaches infinity along the positive real axis... Anne From bblais at bryant.edu Fri May 16 20:34:08 2008 From: bblais at bryant.edu (Brian Blais) Date: Fri, 16 May 2008 20:34:08 -0400 Subject: [Numpy-discussion] question about optimizing Message-ID: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Hello, I have a custom array, which contains custom objects (I give a stripped down example below), and I want to loop over all of the elements of the array and call a method of the object. I can do it like: a=MyArray((5,5),MyObject,10) for obj in a.flat: obj.update() but I was wondering if there is a faster way, especially if obj.update is a cython-ed function. I was thinking something like apply_along_axis, but without having an input array at all. Is there a better way to do this? While I'm asking, is there a better way to overload ndarray than what I am doing below? I tried to follow code I found online, but the examples of this are few and far between. thanks you for any help! Brian Blais -- Brian Blais bblais at bryant.edu http://web.bryant.edu/~bblais from numpy import ndarray,prod,array class MyObject(object): def __init__(self,value): self.value=value def update(self): self.value*=2 def __repr__(self): return "My value is %d." % self.value class MyArray(ndarray): def __new__(subtype, shape,obj, *args,**kwargs): if isinstance(shape,int): N=shape shape=(shape,) else: N=prod(shape) objs=[] for i in range(N): objs.append(obj(*args,**kwargs)) arr = array(objs, dtype=None, copy=False) arr = arr.view(subtype) arr.shape=shape return arr if __name__=="__main__": a=MyArray((5,5),MyObject,10) for obj in a.flat: obj.update() -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Fri May 16 20:48:04 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 16 May 2008 20:48:04 -0400 Subject: [Numpy-discussion] question about optimizing In-Reply-To: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: 2008/5/16 Brian Blais : > I have a custom array, which contains custom objects (I give a stripped down > example below), and I want to loop over all of the elements of the array and > call a method of the object. 
I can do it like: > a=MyArray((5,5),MyObject,10) > > for obj in a.flat: > obj.update() > but I was wondering if there is a faster way, especially if obj.update is a > cython-ed function. I was thinking something like apply_along_axis, but > without having an input array at all. > Is there a better way to do this? While I'm asking, is there a better way > to overload ndarray than what I am doing below? I tried to follow code I > found online, but the examples of this are few and far between. Unfortunately, the loop overhead isn't very big compared to the overhead of method dispatch, so there's no way you're going to make things fast. For convenience you can do something like update = vectorize(lambda object: object.update()) and then later update(a) But if you really want things to go quickly, I think the best approach is to take advantage of the main feature of arrays: they hold homogeneous items efficiently. So use the array to store the contents of your object, and put any special behaviour in the array object. For example, if I wanted to efficiently compute with arrays of numbers carrying units, I would attach a unit to the array as a whole, and have the array store doubles. With some cleverness, you can even have accesses to the array return a freshly-created object whose contents are based on the values you looked up in the array. But storing actual python objects in an array is probably not a good idea. Anne From millman at berkeley.edu Sat May 17 00:09:07 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 16 May 2008 21:09:07 -0700 Subject: [Numpy-discussion] Tagging 1.1rc1 in about 12 hours In-Reply-To: <482D39FC.4080303@cens.ioc.ee> References: <482D39FC.4080303@cens.ioc.ee> Message-ID: On Fri, May 16, 2008 at 12:38 AM, Pearu Peterson wrote: > I am working with the ticket 752 at the moment and I would probably > not want to commit my work to 1.1.0 at this time, so I shall commit > when trunk is open as 1.1.1. That sounds reasonable. > My question regarding branching: how the changes from 1.1.1 will end up > into 1.2 branch? The branches will need to be merged back into the trunk. I know that branches are a bit difficult to deal with in subversion, but until we decided to move to distributed version control system. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From jh at physics.ucf.edu Sat May 17 01:45:32 2008 From: jh at physics.ucf.edu (Joe Harrington) Date: Sat, 17 May 2008 01:45:32 -0400 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 Message-ID: NUMPY/SCIPY DOCUMENTATION MARATHON 2008 As we all know, the state of the numpy and scipy reference documentation (aka the docstrings) is best described as "incomplete". Most functions have docstrings shorter than 5 lines, whereas our competitors IDL and Matlab usually have a concise and well-written page or two per function. The (wonderful) categorized list of functions is very new and isn't included in the package yet. There isn't even a "Getting Started"-type of document you can hand a new user so they can dive right in. Documentation tools are limited to plain-text paginators, while our competition enjoys HTML-based documents with formulae, images, search capability, and cross linking. Tales of woe abound. A university class switched to Numpy and got hopelessly bogged down because students couldn't find out how to call the functions. 
A developer looked something up while giving a presentation and the words "Blah, Blah, Blah" stared down at the audience in response. To head off another pedagogical meltdown, the University of Central Florida has hired Stefan van der Walt full time to coordinate a community documentation effort to write reference documentation and tools. The project starts now and continues through the summer. The goals: 1. Produce complete docstrings for all numpy functions and as much of scipy as possible, 2. Produce an 8-15 page Getting Started tutorial that is not discipline-specific, 3. Write reference sections on topics in numpy, such as slicing and the use principles of the modules, 4. Complete a first edition, in both PDF and HTML, of a NumPy Reference Manual, and 5. Check everything into the sources by 1 August 2008 so that the Packaging Team can cut a release and have it available in time for Fall 2008 classes. Even Stefan could not document the hundreds of functions that need it by himself, and in any case such a large contribution requires community review. To make it easy for everyone to contribute, Pauli Virtanen and Emmanuelle Guillart have provided a wiki system for editing reference documentation. The idea was developed by Fernando Perez, Stefan, and Gael Varoquaux. We encourage community members to write, review, and proofread reference pages on this wiki. Stefan will check updates into the sources roughly weekly. Near the end of the project, we will put these wiki pages through a vetting process and then check them into the sources a final time for a release hopefully to occur in early August. Meanwhile, Perry Greenfield has taken the lead on on task 3, writing reference docs for things that currently don't have docstrings, such as basic concepts like slicing. We have proposed two small extensions to the current docstring format, for images (to be used sparingly) and indexing. These appear in updated versions of the doc standard, which are linked from the wiki frontpage. Please take a look and comment on these if you like. All docstrings will remain readable in plain text, but we are now generating a full reference guide in PDF and HTML (you guessed it, linked from the wiki). These are searchable formats. There are several ways you can help: 1. Write some docstrings on the wiki! Many people can do this, many more than can write code for the package itself. However, you must know numpy, the function group, and the function you are writing well. You should be familiar with the concept of a reference page and write in that concise style. We'll do tutorial docs in another project at a later date. See the instructions on the wiki for guidelines and format. 2. Review others' docstrings and leave comments on their wiki pages. 3. Proofread docstrings. Make sure they are correct, complete, and concise. Fix grammar. 4. Write examples ("doctests"). Even if you are not a top-notch English writer, you can help by producing a code snippet of a few lines that demonstrates a function. It is fine for them to go into the docstring templates before the actual text. 5. Write a new help function that optionally produces ASCII or points the user's PDF or HTML reader to the right page (either local or global). 6. If you are in a position to hire someone, such as a knowledgeable student or short-term consultant, hire them to work on the tasks above for the summer. We can provide supervision to them or guidance to you if you like. 
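To give a feel for the target, here is a rough sketch of what a finished reference docstring might look like under the current standard; the function and its wording are made up purely for illustration:

import numpy as np

def clip_to_unit(x):
    """
    Clip the values of an array to the interval [0, 1].

    Parameters
    ----------
    x : array_like
        Input data.

    Returns
    -------
    out : ndarray
        Array of the same shape as `x`, with values below 0 replaced by 0
        and values above 1 replaced by 1.

    See Also
    --------
    clip : Clip values to an arbitrary interval.

    Examples
    --------
    >>> clip_to_unit([-0.5, 0.25, 2.0])
    array([ 0.  ,  0.25,  1.  ])

    """
    return np.clip(x, 0.0, 1.0)

The Examples section doubles as a doctest, which is exactly the kind of snippet item 4 above asks for.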
The home for this project is here: http://scipy.org/Developer_Zone/DocMarathon2008 This is not a sprint. It is a marathon, and this time we are going to finish. We hope you will join us! --jh-- and Stefan and Perry and Pauli and Emmanuelle...and you! Joe Harrington Stefan van der Walt Perry Greenfield Pauli Virtanen Emmanuelle Guillart ...and you! From david at ar.media.kyoto-u.ac.jp Sat May 17 03:34:29 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 17 May 2008 16:34:29 +0900 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> Message-ID: <482E8A85.6010606@ar.media.kyoto-u.ac.jp> Nripun Sredar wrote: > I have a sparse matrix 416x52. I tried to factorize this matrix using > svd from numpy. But it didn't produce a result and looked like it is > in an infinite loop. > I tried a similar operation using random numbers in the matrix. Even > this is in an infinite loop. > Did anyone else face a similar problem? > Can anyone please give some suggestions? Are you on windows ? What is the CPU on your machine ? I suspect this is caused by windows binaries which shipped blas/lapack without support for "old" CPU. cheers, David From sdb at cloud9.net Sat May 17 07:55:24 2008 From: sdb at cloud9.net (Stuart Brorson) Date: Sat, 17 May 2008 07:55:24 -0400 (EDT) Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? In-Reply-To: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> References: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> Message-ID: >> In [66]: numpy.sign(numpy.nan) >> Out[66]: 0.0 >> >> IMO, the output should be NaN, not zero. > The implementation is actually in a #define macro in > umathmodule.c.src. Look for _SIGN1 (and _SIGNC if you want to clean up > the complex versions, too). OK, I submitted a patch. #794 in the tracker. Stuart Brorson Interactive Supercomputing, inc. 135 Beaver Street | Waltham | MA | 02452 | USA http://www.interactivesupercomputing.com From jh at physics.ucf.edu Sat May 17 10:22:31 2008 From: jh at physics.ucf.edu (Joe Harrington) Date: Sat, 17 May 2008 10:22:31 -0400 Subject: [Numpy-discussion] [SciPy-user] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: (ryanlists@gmail.com) References: Message-ID: Ryan writes: > This is very good news. I will find some way to get involved. Great! Please dive right in, and sign up on the Developer_Zone page so we can keep track of who's involved. One thing I forgot to mention in my too-wordy announcement was that discussion of documentation is on the scipy-dev mailing list. We had to pick one spot and decided that since we are going after scipy as soon as numpy is done, we'd like to use that list rather than numpy-discussion. We also wanted to keep it on a development list rather than polluting the new users' discussion space. --jh-- From lists at cheimes.de Sat May 17 10:37:53 2008 From: lists at cheimes.de (Christian Heimes) Date: Sat, 17 May 2008 16:37:53 +0200 Subject: [Numpy-discussion] numpy.arccos(numpy.inf)???? In-Reply-To: References: Message-ID: Stuart Brorson schrieb: > Hi -- > > Sorry to be a pest with corner cases, but I found another one. [...] Mark and I spent a *lot* of time in fixing those edge cases in Python 2.6 and 3.0. We used the C99 standard as template. I recommend that you look at our code. 
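As a minimal sketch of the corner cases in question, assuming C99 Annex F semantics (the exact output depends on the numpy build, the platform libm, and the error-state settings):

import numpy as np

print np.arccos(np.inf)   # outside the domain of acos; expected: nan
print np.arccos(2.0)      # likewise outside [-1, 1]: nan
print np.sign(np.nan)     # ideally nan rather than 0.0, see ticket #794
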
Christian From lists at informa.tiker.net Sat May 17 10:48:40 2008 From: lists at informa.tiker.net (Andreas =?iso-8859-1?q?Kl=F6ckner?=) Date: Sat, 17 May 2008 10:48:40 -0400 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: Message-ID: <200805171048.56390.lists@informa.tiker.net> On Samstag 17 Mai 2008, Joe Harrington wrote: > To head off another pedagogical meltdown, the University of Central > Florida has hired Stefan van der Walt full time to coordinate a > community documentation effort to write reference documentation and > tools. This is truly excellent news. One question though: I didn't see Travis's Numpy book mentioned at all in your writeup, so I am wondering where its role in the doc effort is. Its home page states that it will be opened on Sep 1, 2008, apparently in time for classes, and it already provides parts of what you propose. Mainly: while we need to respect Travis's copyright, a duplication of the massive effort that went into the book hardly seems sensible. One initial question is therefore: Is it OK to copy material out of the book and into other parts of the documentation? Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From zoho.vignochi at gmail.com Sat May 17 11:02:32 2008 From: zoho.vignochi at gmail.com (Zoho Vignochi) Date: Sat, 17 May 2008 15:02:32 +0000 (UTC) Subject: [Numpy-discussion] checking element types in array Message-ID: hello: I am writing my own version of a dot product. Simple enough this way: def dot_r(a, b): return sum( x*y for (x,y) in izip(a, b) ) However if both a and b are complex we need: def dot_c(a, b): return sum( x*y for (x,y) in izip(a.conjugate(), b) ).real I would like to combine these so that I need only one function which detects which formula based on argument types. So I thought that something like: def dot(a,b): if isinstance(a.any(), complex) and isinstance(b.any(), complex): return sum( x*y for (x,y) in izip(a.conjugate(), b) ).real else: return sum( x*y for (x,y) in izip(a, b) ) And it doesn't work because I obviously have the syntax for checking element types incorrect. So my real question is: What is the best way to check if any of the elements in an array are complex? Thank you, Zoho From aisaac at american.edu Sat May 17 11:52:35 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 17 May 2008 11:52:35 -0400 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On Fri, 16 May 2008, Anne Archibald apparently wrote: > storing actual python objects in an array is probably not > a good idea I have been wondering what people use object arrays for. I have been guessing that it is for indexing convenience? Are there other core motivations? Alan Isaac From charlesr.harris at gmail.com Sat May 17 12:48:40 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 10:48:40 -0600 Subject: [Numpy-discussion] Tagging 1.1rc1 in about 12 hours In-Reply-To: References: Message-ID: On Fri, May 16, 2008 at 1:20 AM, Jarrod Millman wrote: > Hello, > > I believe that we have now addressed everything that was holding up > the 1.1.0 release, so I will be tagging the 1.1.0rc1 in about 12 > hours. 
Please be extremely conservative and careful about any commits > you make to the trunk until we officially release 1.1.0 (now may be a > good time to spend some effort on SciPy). Once I tag the release > candidate I will ask both David and Chris to create Windows and Mac > binaries. I will give everyone a few days to test the release > candidate and binaries thoroughly. If everything looks good, the > release candidate will become the official release. > > Once I tag 1.1.0, I will open the trunk for 1.1.1 development. You mean bug fixes. No development should happen in 1.1.1 > Any development for 1.2 will have to occur on a new branch. So open the new branch already. I've got stuff that's been stuck in the queue for over a month waiting for the release and the longer it waits the more likely it is that that someone will change something and it all becomes a merge nightmare. I don't want to let that stuff hang fire until late summer. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 17 12:58:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 10:58:09 -0600 Subject: [Numpy-discussion] Tagging 1.1rc1 in about 12 hours In-Reply-To: References: Message-ID: On Sat, May 17, 2008 at 10:48 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Fri, May 16, 2008 at 1:20 AM, Jarrod Millman > wrote: > >> Hello, >> >> I believe that we have now addressed everything that was holding up >> the 1.1.0 release, so I will be tagging the 1.1.0rc1 in about 12 >> hours. Please be extremely conservative and careful about any commits >> you make to the trunk until we officially release 1.1.0 (now may be a >> good time to spend some effort on SciPy). Once I tag the release >> candidate I will ask both David and Chris to create Windows and Mac >> binaries. I will give everyone a few days to test the release >> candidate and binaries thoroughly. If everything looks good, the >> release candidate will become the official release. >> >> Once I tag 1.1.0, I will open the trunk for 1.1.1 development. > > > You mean bug fixes. No development should happen in 1.1.1 > > And no documentation updates, either. Version 1.1.1 should *only* be for egregious bugs, everything else should go into 1.2. Trying to keep things synchronized will become a complete nightmare otherwise. I would even suggest that 1.1.1 not be the trunk, it should be a branch of 1.1. After release, 1.1 is history and time marches on. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 17 13:39:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 11:39:42 -0600 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On Sat, May 17, 2008 at 9:52 AM, Alan G Isaac wrote: > On Fri, 16 May 2008, Anne Archibald apparently wrote: > > storing actual python objects in an array is probably not > > a good idea > > I have been wondering what people use object arrays for. > I have been guessing that it is for indexing convenience? > Are there other core motivations? > You can always define an object array of matrices, which solves Tim's problem of matrix stacks, albeit in not the most efficient manner and not the easiest thing to specify due to the current limitations of array(...). 
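A minimal sketch of the workaround just described, with made-up shapes: build an empty object array and fill it by hand, which sidesteps array(...)'s shape guessing:

import numpy as np

# A "stack" of four 2x3 matrices stored in an object array.
stack = np.empty(4, dtype=object)
for i in range(4):
    stack[i] = np.matrix(np.ones((2, 3)) * i)

# Arithmetic is dispatched element by element to the stored matrices,
# so this scales every matrix in the stack at once.
scaled = stack * 2.0
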
I do think it would be nice to have arrays or arrays, but this needs one more array type so that one can do something like array(list_of_arrays, dtype=matrix((2,3),float)), i.e., we could use a fancier dtype. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat May 17 13:43:47 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 17 May 2008 19:43:47 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <200805171048.56390.lists@informa.tiker.net> References: <200805171048.56390.lists@informa.tiker.net> Message-ID: <9457e7c80805171043g5f774691of65863f3a63785e7@mail.gmail.com> Hi Andreas 2008/5/17 Andreas Kl?ckner : > On Samstag 17 Mai 2008, Joe Harrington wrote: >> To head off another pedagogical meltdown, the University of Central >> Florida has hired Stefan van der Walt full time to coordinate a >> community documentation effort to write reference documentation and >> tools. > > This is truly excellent news. One question though: I didn't see Travis's Numpy > book mentioned at all in your writeup, so I am wondering where its role in > the doc effort is. Its home page states that it will be opened on Sep 1, > 2008, apparently in time for classes, and it already provides parts of what > you propose. Travis has generously pemitted us to use any part of his book in the documentation. We shall be making use of his kind offer! As far as I am aware, the code to his book will be released at SciPy 2008 (http://conference.scipy.org). Regards St?fan From pearu at cens.ioc.ee Sat May 17 14:05:26 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 17 May 2008 21:05:26 +0300 (EEST) Subject: [Numpy-discussion] Tagging 1.1rc1 in about 12 hours In-Reply-To: References: Message-ID: <50661.88.90.135.57.1211047526.squirrel@cens.ioc.ee> On Sat, May 17, 2008 7:48 pm, Charles R Harris wrote: > On Fri, May 16, 2008 at 1:20 AM, Jarrod Millman > wrote: >> Once I tag 1.1.0, I will open the trunk for 1.1.1 development. ... >> Any development for 1.2 will have to occur on a new branch. > > So open the new branch already. I am waiting it too. At least, give another time target for 1.1.0. (ticket 752 has a patch ready and waiting for a commit, if 1.1.0 is going to wait another few days, the commit to 1.1.0 should be safe). Pearu From bblais at bryant.edu Sat May 17 14:22:02 2008 From: bblais at bryant.edu (Brian Blais) Date: Sat, 17 May 2008 14:22:02 -0400 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On May 17, 2008, at May 17:11:52 AM, Alan G Isaac wrote: > On Fri, 16 May 2008, Anne Archibald apparently wrote: >> storing actual python objects in an array is probably not >> a good idea > > I have been wondering what people use object arrays for. > I have been guessing that it is for indexing convenience? at least for me, that was the motivation. I am trying to build a simulation framework for part of the brain, which requires connected layers of nodes. A layer is either a 1D or 2D structure of nodes, with each node a relatively complex beast. Rather than reinvent the indexing (1D, 2D, slicing, etc...), I just inherited from ndarray. I thought, after the fact, that some numpy functions on arrays would help speed up the code, which consists mostly of calling an update function on all nodes, passing each them an input vector. 
I wasn't sure if there would be any speed up for this, compared to for n in self.flat: n.update(input_vector) From the response, the answer seems to be no, and that I should stick with the python loops for clarity. But also, the words of Anne Archibald, makes me think that I have made a bad choice by inheriting from ndarray, although I am not sure what a convenient alternative would be. bb -- Brian Blais bblais at bryant.edu http://web.bryant.edu/~bblais -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 17 14:23:05 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 12:23:05 -0600 Subject: [Numpy-discussion] numpy.sign(numpy.nan)????? In-Reply-To: References: <3d375d730805161123p55c0e80epe3ed0a4997b2cc13@mail.gmail.com> Message-ID: On Sat, May 17, 2008 at 5:55 AM, Stuart Brorson wrote: > >> In [66]: numpy.sign(numpy.nan) > >> Out[66]: 0.0 > >> > >> IMO, the output should be NaN, not zero. > > > The implementation is actually in a #define macro in > > umathmodule.c.src. Look for _SIGN1 (and _SIGNC if you want to clean up > > the complex versions, too). > > OK, I submitted a patch. #794 in the tracker. > Let's wait until the 1.2 branch opens so as to avoid merge collisions. I'll fix these things up at that point. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jh at physics.ucf.edu Sat May 17 14:29:48 2008 From: jh at physics.ucf.edu (Joe Harrington) Date: Sat, 17 May 2008 14:29:48 -0400 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: (numpy-discussion-request@scipy.org) References: Message-ID: > I didn't see Travis's Numpy book mentioned at all in your writeup, so > I am wondering where its role in the doc effort is. > Is it OK to copy material out of the book and into > other parts of the documentation? No worries, Travis is on board here. We included him and others on the Steering Committee in planning this effort. Travis's book overlaps the current effort to some extent. The function descriptions in the book are the numpy docstrings, such as they currently exist. The docstrings are open source software and the book is a work derived from them. The current effort is essentially to fill in the docstrings to the full expectation of professional reference documentation. If you compare the docstring example on the wiki (for multivariate_normal) with the current page for that function, you'll see the difference. The multivariate_normal docstring is actually pretty good among current docstrings, but even for this function we're aiming for a big change. Collected, the new docstrings will make a reference manual very much like those you'll find for other scientific languages, with similar format for the pages. The choice of a ReST-based docstring format some time ago was to support producing such a manual. The rest of Travis's book is still critical information and we're not contemplating replacing it at this point. Much of it is on the technical end, and our goal is to address the general user, particularly students learning to do data analysis, so I think even the eventual User Guide, whatever form it takes, will not encroach on its technical focus. Of course, he's welcome to include the improved docstrings in his book if he wants to (as is anyone), or to exclude them and make a tighter book aimed at extension programmers, or whatever. Let's continue discussion on scipy-dev, just to keep it all in one place. 
--jh-- From charlesr.harris at gmail.com Sat May 17 14:40:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 12:40:49 -0600 Subject: [Numpy-discussion] Trac mailing still broken. Message-ID: Trac still fails to send me mail when tickets are added or changed. Do I need to resubscribe? I also have to log in twice to log in. That is an easy work around, but still, it has been that way for a couple of months and it should be fixable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Sat May 17 14:58:20 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 17 May 2008 14:58:20 -0400 Subject: [Numpy-discussion] checking element types in array In-Reply-To: References: Message-ID: 2008/5/17 Zoho Vignochi : > hello: > > I am writing my own version of a dot product. Simple enough this way: > > def dot_r(a, b): > return sum( x*y for (x,y) in izip(a, b) ) > > However if both a and b are complex we need: > > def dot_c(a, b): > return sum( x*y for (x,y) in izip(a.conjugate(), b) ).real As you probably realize, this will be vastly slower (ten to a hundred times) than the built-in function dot() in numpy. > I would like to combine these so that I need only one function which > detects which formula based on argument types. So I thought that > something like: > > def dot(a,b): > if isinstance(a.any(), complex) and isinstance(b.any(), complex): > return sum( x*y for (x,y) in izip(a.conjugate(), b) ).real > else: > return sum( x*y for (x,y) in izip(a, b) ) > > And it doesn't work because I obviously have the syntax for checking > element types incorrect. So my real question is: What is the best way to > check if any of the elements in an array are complex? numpy arrays are efficient, among other reasons, because they have homogeneous types. So all the elements in an array are the same type. (Yes, this means if you have an array of numbers only one of which happens to be complex, you have to represent them all as complex numbers whose imaginary part happens to be zero.) So if A is an array A.dtype is the type of its elements. numpy provides two convenience functions for checking whether an array is complex, depending on what you want: iscomplex checks whether each element has a nonzero imaginary part and returns an array representing the element-by-element answer; so any(iscomplex(A)) will be true if any element of A has a nonzero imaginary part. iscomplexobj checks whether the array has a complex data type. This is much much faster, but of course it may happen that all the imaginary parts happen to be zero; if you want to treat this array as real, you must use iscomplex. Anne Anne From peridot.faceted at gmail.com Sat May 17 15:02:54 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 17 May 2008 15:02:54 -0400 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: 2008/5/17 Charles R Harris : > > > On Sat, May 17, 2008 at 9:52 AM, Alan G Isaac wrote: >> >> On Fri, 16 May 2008, Anne Archibald apparently wrote: >> > storing actual python objects in an array is probably not >> > a good idea >> >> I have been wondering what people use object arrays for. >> I have been guessing that it is for indexing convenience? >> Are there other core motivations? 
> > You can always define an object array of matrices, which solves Tim's > problem of matrix stacks, albeit in not the most efficient manner and not > the easiest thing to specify due to the current limitations of array(...). I > do think it would be nice to have arrays or arrays, but this needs one more > array type so that one can do something like array(list_of_arrays, > dtype=matrix((2,3),float)), i.e., we could use a fancier dtype. I think if you're going to be writing code to do this, it would be better not to use object arrays. After all, there no reason the underlysing storage for an array of matrices shouldn't be one big block of memory rather than a lot of scattered python objects. Anne From charlesr.harris at gmail.com Sat May 17 15:13:05 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 13:13:05 -0600 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On Sat, May 17, 2008 at 1:02 PM, Anne Archibald wrote: > 2008/5/17 Charles R Harris : > > > > > > On Sat, May 17, 2008 at 9:52 AM, Alan G Isaac > wrote: > >> > >> On Fri, 16 May 2008, Anne Archibald apparently wrote: > >> > storing actual python objects in an array is probably not > >> > a good idea > >> > >> I have been wondering what people use object arrays for. > >> I have been guessing that it is for indexing convenience? > >> Are there other core motivations? > > > > You can always define an object array of matrices, which solves Tim's > > problem of matrix stacks, albeit in not the most efficient manner and not > > the easiest thing to specify due to the current limitations of > array(...). I > > do think it would be nice to have arrays or arrays, but this needs one > more > > array type so that one can do something like array(list_of_arrays, > > dtype=matrix((2,3),float)), i.e., we could use a fancier dtype. > > I think if you're going to be writing code to do this, it would be > better not to use object arrays. After all, there no reason the > underlysing storage for an array of matrices shouldn't be one big > block of memory rather than a lot of scattered python objects. > Exactly, which is why I suggested an extension to dtype so that it could also specify arraya. In that way matrix multiplication on matrix stacks would simply go over to an element wise matrix multiplication. In that case, dtype would also be nestable so that one could have arrays of arrays of arrays ... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 17 15:17:04 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 13:17:04 -0600 Subject: [Numpy-discussion] Trac mailing still broken. In-Reply-To: References: Message-ID: On Sat, May 17, 2008 at 12:40 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Trac still fails to send me mail when tickets are added or changed. Do I > need to resubscribe? I also have to log in twice to log in. That is an easy > work around, but still, it has been that way for a couple of months and it > should be fixable. > I note that the mail archive also stops at May 3. Apparently the mailing list is stopped. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peridot.faceted at gmail.com Sat May 17 15:18:15 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 17 May 2008 15:18:15 -0400 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: 2008/5/17 Brian Blais : > at least for me, that was the motivation. I am trying to build a simulation > framework for part of the brain, which requires connected layers of nodes. > A layer is either a 1D or 2D structure of nodes, with each node a > relatively complex beast. Rather than reinvent the indexing (1D, 2D, > slicing, etc...), I just inherited from ndarray. I thought, after the fact, > that some numpy functions on arrays would help speed up the code, which > consists mostly of calling an update function on all nodes, passing each > them an input vector. I wasn't sure if there would be any speed up for > this, compared to > for n in self.flat: > n.update(input_vector) > From the response, the answer seems to be no, and that I should stick with > the python loops for clarity. But also, the words of Anne Archibald, makes > me think that I have made a bad choice by inheriting from ndarray, although > I am not sure what a convenient alternative would be. Well, it doesn't exist yet, but a handy tool would be a factory function "ArrayOf"; you would pass it a class, and it would produce a subclass of ndarray designed to contain that class. That is, the underlying storage would be a record array, but the getitem and setitem would automatically handle conversion to and from the class you supplied it, where appropriate. myarray = ArrayOf(Node,dtype=...) A = myarray.array([Node(...), Node(...), Node(...)]) n = A[1] A[2] = Node(...) A.C.update() # python-loop-based update of all elements You could also design it so that it was easy to derive a class from it, since that's probably the best way to handle vectorized methods: class myarray(ArrayOf(Node, dtype=...)): def update(self): self.underlying["node_attribute"] += 1 I should say, if you can get away with treating your nodes more like C structures and writing (possibly vectorized) functions to act on them, you can avoid all this mumbo jumbo: node_dtype = [("node_attribute",np.int),("weight", np.float)] A = np.zeros(10,dtype=node_dtype) def nodes_update(A): A["node_attribute"] += 1 Anne From charlesr.harris at gmail.com Sat May 17 15:45:53 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 13:45:53 -0600 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On Sat, May 17, 2008 at 1:18 PM, Anne Archibald wrote: > 2008/5/17 Brian Blais : > > > at least for me, that was the motivation. I am trying to build a > simulation > > framework for part of the brain, which requires connected layers of > nodes. > > A layer is either a 1D or 2D structure of nodes, with each node a > > relatively complex beast. Rather than reinvent the indexing (1D, 2D, > > slicing, etc...), I just inherited from ndarray. I thought, after the > fact, > > that some numpy functions on arrays would help speed up the code, which > > consists mostly of calling an update function on all nodes, passing each > > them an input vector. I wasn't sure if there would be any speed up for > > this, compared to > > for n in self.flat: > > n.update(input_vector) > > From the response, the answer seems to be no, and that I should stick > with > > the python loops for clarity. 
But also, the words of Anne Archibald, > makes > > me think that I have made a bad choice by inheriting from ndarray, > although > > I am not sure what a convenient alternative would be. > > Well, it doesn't exist yet, but a handy tool would be a factory > function "ArrayOf"; you would pass it a class, and it would produce a > subclass of ndarray designed to contain that class. Subclasses should generally be avoided unless they satisfy the "is a" criterion, which I don't think a matrix stack does. That is to say, a subclass should behave as an ndarray in *all* ways except for added functionality. All we want is an item type with dimensions. That is, the > underlying storage would be a record array, but the getitem and > setitem would automatically handle conversion to and from the class > you supplied it, where appropriate. I don't think getitem and setitem should be overloaded in a subclass, that is where the matrix class went wrong. Those methods should not be considered "virtual". So if you have an array of special elements, then you are allowed a[0][i, j], but you aren't allowed a[0, i, j]. I think this has the virtue of sense and consistency: you can overload the operators of the type anyway you please, but you can't overload the operators of ndarray. I suppose we could specify which operators in ndarray *could* be considered virtual. That, after all, is what designing a base class is all about. But we haven't done that. > > myarray = ArrayOf(Node,dtype=...) > A = myarray.array([Node(...), Node(...), Node(...)]) > n = A[1] > A[2] = Node(...) > A.C.update() # python-loop-based update of all elements > > You could also design it so that it was easy to derive a class from > it, since that's probably the best way to handle vectorized methods: > > class myarray(ArrayOf(Node, dtype=...)): > def update(self): > self.underlying["node_attribute"] += 1 > > > I should say, if you can get away with treating your nodes more like C > structures and writing (possibly vectorized) functions to act on them, > you can avoid all this mumbo jumbo: > Yes, what we want is something like an object array with efficient contiguous storage. Record types are a step in that direction, they just aren't complete enough. Hmm, this is an argument for using methods instead of functions, so that you could sort on the columns of the matrices in a stack by doing something like a[...].sort(axis=1). > node_dtype = [("node_attribute",np.int),("weight", np.float)] > A = np.zeros(10,dtype=node_dtype) > > def nodes_update(A): > A["node_attribute"] += 1 > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryan at cole.uklinux.net Sat May 17 16:09:21 2008 From: bryan at cole.uklinux.net (Bryan Cole) Date: Sat, 17 May 2008 21:09:21 +0100 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: <1211054960.8831.5.camel@pc2.cole.uklinux.net> > > > From the response, the answer seems to be no, and that I should stick > with the python loops for clarity. But also, the words of Anne > Archibald, makes me think that I have made a bad choice by inheriting > from ndarray, although I am not sure what a convenient alternative > would be. It may be worth bearing in mind that indexing a python list is faster than a ndarray (mostly because ndarrays must support many more features), so if you don't require (or exploit) the features of ndarrays (i.e. 
you don't need fancy indexing or element-wise operations), you may be better off with lists. BC > > > > > > > bb > > > > > > -- > Brian Blais > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat May 17 16:18:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 14:18:43 -0600 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On Sat, May 17, 2008 at 1:45 PM, Charles R Harris wrote: > > > On Sat, May 17, 2008 at 1:18 PM, Anne Archibald > wrote: > >> 2008/5/17 Brian Blais : >> >> > at least for me, that was the motivation. I am trying to build a >> simulation >> > framework for part of the brain, which requires connected layers of >> nodes. >> > A layer is either a 1D or 2D structure of nodes, with each node a >> > relatively complex beast. Rather than reinvent the indexing (1D, 2D, >> > slicing, etc...), I just inherited from ndarray. I thought, after the >> fact, >> > that some numpy functions on arrays would help speed up the code, which >> > consists mostly of calling an update function on all nodes, passing each >> > them an input vector. I wasn't sure if there would be any speed up for >> > this, compared to >> > for n in self.flat: >> > n.update(input_vector) >> > From the response, the answer seems to be no, and that I should stick >> with >> > the python loops for clarity. But also, the words of Anne Archibald, >> makes >> > me think that I have made a bad choice by inheriting from ndarray, >> although >> > I am not sure what a convenient alternative would be. >> >> Well, it doesn't exist yet, but a handy tool would be a factory >> function "ArrayOf"; you would pass it a class, and it would produce a >> subclass of ndarray designed to contain that class. > > > Subclasses should generally be avoided unless they satisfy the "is a" > criterion, which I don't think a matrix stack does. That is to say, a > subclass should behave as an ndarray in *all* ways except for added > functionality. All we want is an item type with dimensions. > Base classes also tend to have limited functionality that will be common to all derived types. The object type in Python has only a few methods and attributes: In [4]: dir(object) Out[4]: ['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__'] And overloading any of these is likely to cause trouble. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 17 16:20:42 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 17 May 2008 15:20:42 -0500 Subject: [Numpy-discussion] Trac mailing still broken. In-Reply-To: References: Message-ID: <3d375d730805171320m45d4c668p3328b9ad310bd983@mail.gmail.com> On Sat, May 17, 2008 at 1:40 PM, Charles R Harris wrote: > Trac still fails to send me mail when tickets are added or changed. Do I > need to resubscribe? I also have to log in twice to log in. That is an easy > work around, but still, it has been that way for a couple of months and it > should be fixable. Yes, we know, thank you. Unfortunately, our IT staff is entirely overloaded right now. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sat May 17 16:25:47 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 17 May 2008 15:25:47 -0500 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: <3d375d730805171325s571510eevb52fd83839283611@mail.gmail.com> On Sat, May 17, 2008 at 3:18 PM, Charles R Harris wrote: > Base classes also tend to have limited functionality that will be common to > all derived types. The object type in Python has only a few methods and > attributes: > > In [4]: dir(object) > Out[4]: > ['__class__', > '__delattr__', > '__doc__', > '__getattribute__', > '__hash__', > '__init__', > '__new__', > '__reduce__', > '__reduce_ex__', > '__repr__', > '__setattr__', > '__str__'] > > And overloading any of these is likely to cause trouble. Nonsense. *Most* of those are intended to be overloaded. Especially on object. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 17 16:28:08 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 14:28:08 -0600 Subject: [Numpy-discussion] question about optimizing In-Reply-To: <3d375d730805171325s571510eevb52fd83839283611@mail.gmail.com> References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> <3d375d730805171325s571510eevb52fd83839283611@mail.gmail.com> Message-ID: On Sat, May 17, 2008 at 2:25 PM, Robert Kern wrote: > On Sat, May 17, 2008 at 3:18 PM, Charles R Harris > wrote: > > Base classes also tend to have limited functionality that will be common > to > > all derived types. The object type in Python has only a few methods and > > attributes: > > > > In [4]: dir(object) > > Out[4]: > > ['__class__', > > '__delattr__', > > '__doc__', > > '__getattribute__', > > '__hash__', > > '__init__', > > '__new__', > > '__reduce__', > > '__reduce_ex__', > > '__repr__', > > '__setattr__', > > '__str__'] > > > > And overloading any of these is likely to cause trouble. > > Nonsense. *Most* of those are intended to be overloaded. Especially on > object. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sat May 17 16:33:45 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 14:33:45 -0600 Subject: [Numpy-discussion] question about optimizing In-Reply-To: <3d375d730805171325s571510eevb52fd83839283611@mail.gmail.com> References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> <3d375d730805171325s571510eevb52fd83839283611@mail.gmail.com> Message-ID: On Sat, May 17, 2008 at 2:25 PM, Robert Kern wrote: > On Sat, May 17, 2008 at 3:18 PM, Charles R Harris > wrote: > > Base classes also tend to have limited functionality that will be common > to > > all derived types. The object type in Python has only a few methods and > > attributes: > > > > In [4]: dir(object) > > Out[4]: > > ['__class__', > > '__delattr__', > > '__doc__', > > '__getattribute__', > > '__hash__', > > '__init__', > > '__new__', > > '__reduce__', > > '__reduce_ex__', > > '__repr__', > > '__setattr__', > > '__str__'] > > > > And overloading any of these is likely to cause trouble. > > Nonsense. *Most* of those are intended to be overloaded. Especially on > object. > My bad. What I should have said is, if you change the semantics of these methods and attributes, then things will go wrong. The Python documentation says as much. For instance, if __new__ returns the time of day. So my point would be: those things intended to be overloading are well defined and what they do is part of the contract. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 17 16:36:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 17 May 2008 14:36:09 -0600 Subject: [Numpy-discussion] Trac mailing still broken. In-Reply-To: <3d375d730805171320m45d4c668p3328b9ad310bd983@mail.gmail.com> References: <3d375d730805171320m45d4c668p3328b9ad310bd983@mail.gmail.com> Message-ID: On Sat, May 17, 2008 at 2:20 PM, Robert Kern wrote: > On Sat, May 17, 2008 at 1:40 PM, Charles R Harris > wrote: > > Trac still fails to send me mail when tickets are added or changed. Do I > > need to resubscribe? I also have to log in twice to log in. That is an > easy > > work around, but still, it has been that way for a couple of months and > it > > should be fixable. > > Yes, we know, thank you. Unfortunately, our IT staff is entirely > overloaded right now. > You're welcome. Didn't want it to slip through the cracks. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 17 16:36:20 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 17 May 2008 15:36:20 -0500 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> <3d375d730805171325s571510eevb52fd83839283611@mail.gmail.com> Message-ID: <3d375d730805171336t55755d8aubf90d35fcb4c6cf8@mail.gmail.com> On Sat, May 17, 2008 at 3:33 PM, Charles R Harris wrote: > > On Sat, May 17, 2008 at 2:25 PM, Robert Kern wrote: >> >> On Sat, May 17, 2008 at 3:18 PM, Charles R Harris >> wrote: >> > Base classes also tend to have limited functionality that will be common >> > to >> > all derived types. 
The object type in Python has only a few methods and >> > attributes: >> > >> > In [4]: dir(object) >> > Out[4]: >> > ['__class__', >> > '__delattr__', >> > '__doc__', >> > '__getattribute__', >> > '__hash__', >> > '__init__', >> > '__new__', >> > '__reduce__', >> > '__reduce_ex__', >> > '__repr__', >> > '__setattr__', >> > '__str__'] >> > >> > And overloading any of these is likely to cause trouble. >> >> Nonsense. *Most* of those are intended to be overloaded. Especially on >> object. > > My bad. What I should have said is, if you change the semantics of these > methods and attributes, then things will go wrong. The Python documentation > says as much. For instance, if __new__ returns the time of day. So my point > would be: those things intended to be overloading are well defined and what > they do is part of the contract. Gotcha. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robwesterfield at gmail.com Sat May 17 16:49:19 2008 From: robwesterfield at gmail.com (Robert Westerfield) Date: Sat, 17 May 2008 22:49:19 +0200 Subject: [Numpy-discussion] Getting pylint to recognize numpy Message-ID: <8989d7aa0805171349r73a0b2eagdf2873129a177148@mail.gmail.com> Hi, I am unable to get the incredibly useful pylint (0.14) to recognize numpy. Could anyone help me out here please? Thank you! -Rob 1) Confirming that python picks up numpy: D:\src>python DummyModule.py [1 2 3] 2) Running pylint: D:\src>pylint -rn -iy DummyModule No config file found, using default configuration ************* Module DummyModule E0611: 5: No name 'array' in module 'numpy' 3) Contents of DummyModule.py: """ Dummy doc-string """ from numpy import array if __name__ == '__main__': Z = array([1, 2, 3]) print Z From peridot.faceted at gmail.com Sat May 17 16:54:03 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 17 May 2008 16:54:03 -0400 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: 2008/5/17 Anne Archibald : > 2008/5/17 Brian Blais : > >> at least for me, that was the motivation. I am trying to build a simulation >> framework for part of the brain, which requires connected layers of nodes. >> A layer is either a 1D or 2D structure of nodes, with each node a >> relatively complex beast. Rather than reinvent the indexing (1D, 2D, >> slicing, etc...), I just inherited from ndarray. I thought, after the fact, >> that some numpy functions on arrays would help speed up the code, which >> consists mostly of calling an update function on all nodes, passing each >> them an input vector. I wasn't sure if there would be any speed up for >> this, compared to >> for n in self.flat: >> n.update(input_vector) >> From the response, the answer seems to be no, and that I should stick with >> the python loops for clarity. But also, the words of Anne Archibald, makes >> me think that I have made a bad choice by inheriting from ndarray, although >> I am not sure what a convenient alternative would be. > > Well, it doesn't exist yet, but a handy tool would be a factory > function "ArrayOf"; you would pass it a class, and it would produce a > subclass of ndarray designed to contain that class. 
That is, the > underlying storage would be a record array, but the getitem and > setitem would automatically handle conversion to and from the class > you supplied it, where appropriate. > > myarray = ArrayOf(Node,dtype=...) > A = myarray.array([Node(...), Node(...), Node(...)]) > n = A[1] > A[2] = Node(...) > A.C.update() # python-loop-based update of all elements > > You could also design it so that it was easy to derive a class from > it, since that's probably the best way to handle vectorized methods: > > class myarray(ArrayOf(Node, dtype=...)): > def update(self): > self.underlying["node_attribute"] += 1 So just as an experiment I implemented some of this: http://www.scipy.org/Cookbook/Obarray Anne From robert.kern at gmail.com Sat May 17 16:57:58 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 17 May 2008 15:57:58 -0500 Subject: [Numpy-discussion] Getting pylint to recognize numpy In-Reply-To: <8989d7aa0805171349r73a0b2eagdf2873129a177148@mail.gmail.com> References: <8989d7aa0805171349r73a0b2eagdf2873129a177148@mail.gmail.com> Message-ID: <3d375d730805171357p61f68f55s2880e4d94fb1185a@mail.gmail.com> On Sat, May 17, 2008 at 3:49 PM, Robert Westerfield wrote: > Hi, > > I am unable to get the incredibly useful pylint (0.14) to recognize numpy. > > Could anyone help me out here please? I'm afraid I can't help you. I don't use pylint (too slow for my uses). Have you tried asking on the pylint mailing list? http://lists.logilab.org/mailman/listinfo/python-projects They might have a better idea of the limitations of pylint wrt extension modules and possible workarounds. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robince at gmail.com Sat May 17 16:59:27 2008 From: robince at gmail.com (Robin) Date: Sat, 17 May 2008 21:59:27 +0100 Subject: [Numpy-discussion] question about optimizing In-Reply-To: References: <00C1B8EB-E75A-4421-B43A-A1E6FC60B714@bryant.edu> Message-ID: On Sat, May 17, 2008 at 7:22 PM, Brian Blais wrote: > at least for me, that was the motivation. I am trying to build a simulation > framework for part of the brain, which requires connected layers of nodes. > A layer is either a 1D or 2D structure of nodes, with each node a > relatively complex beast. Rather than reinvent the indexing (1D, 2D, > slicing, etc...), I just inherited from ndarray. I thought, after the fact, > that some numpy functions on arrays would help speed up the code, which > consists mostly of calling an update function on all nodes, passing each > them an input vector. I wasn't sure if there would be any speed up for > this, compared to > for n in self.flat: > n.update(input_vector) > From the response, the answer seems to be no, and that I should stick with > the python loops for clarity. But also, the words of Anne Archibald, makes > me think that I have made a bad choice by inheriting from ndarray, although > I am not sure what a convenient alternative would be. > > > bb Hello, It depends on what you are doing but to really exploit the performance gains of numpy it can be better to change the way you are doing the model to have arrays representing the various properties of each node seperately. For example, instead of having each node with a voltage and a number of channels node.V node.C (list of n channel conductances), you have big arrays holding the values for all of them. 
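A minimal sketch of that layout, with purely illustrative names and an invented update rule; the shapes follow the description that continues below:

import numpy as np

n_nodes, max_channels = 100, 8
V = np.zeros(n_nodes)                  # one voltage per node
C = np.zeros((n_nodes, max_channels))  # conductances, zero-padded per node
W = np.zeros((n_nodes, n_nodes))       # all-to-all connection weights

def update(V, C, W, dt=0.1):
    # One vectorised step over every node at once; the per-node Python
    # loop disappears.  The update rule itself is made up for this sketch.
    I = np.dot(W, V)                   # total input to each node
    V += dt * (I - V * C.sum(axis=1))
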
So V would be an array 1xn_nodes and C would be an array of n_nodes x max number of channels (some of them could be zero if not all nodes have the same number of channels). Another global type array n_nodes x n_nodes could hold all the connections... Then instead of updating each node individually you can update all nodes together with vectorised operations utilising ATLAS etc. The indexing can get a bit complicated and it may not be possible depending on how your nodes interact (but usually even if a node depends on values of other nodes it does so only at the previous timestep so you can store the full previous state and reference it in the update function). Just a suggestion - it was much more efficient for me to do it this way with integrate and fire type neural networks... Also I hope I've clearly expressed what I mean - it's getting late here. Cheers Robin From pgmdevlist at gmail.com Sat May 17 17:16:12 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Sat, 17 May 2008 17:16:12 -0400 Subject: [Numpy-discussion] Getting pylint to recognize numpy In-Reply-To: <3d375d730805171357p61f68f55s2880e4d94fb1185a@mail.gmail.com> References: <8989d7aa0805171349r73a0b2eagdf2873129a177148@mail.gmail.com> <3d375d730805171357p61f68f55s2880e4d94fb1185a@mail.gmail.com> Message-ID: <200805171716.13761.pgmdevlist@gmail.com> On Saturday 17 May 2008 16:57:58 Robert Kern wrote: > On Sat, May 17, 2008 at 3:49 PM, Robert Westerfield > > wrote: > > Hi, > > > > I am unable to get the incredibly useful pylint (0.14) to recognize > > numpy. mmh, works great on my machine... pylint 0.14.0, astng 0.17.2, common 0.21.2 Python 2.4.4 (#1, Apr 8 2008, 13:36:41) [GCC 4.1.2 (Gentoo 4.1.2 p1.0.1)] AAMOF, I use eclipse+pydev, that supports pylint. No particular problems either... From robwesterfield at gmail.com Sat May 17 17:40:58 2008 From: robwesterfield at gmail.com (Robert Westerfield) Date: Sat, 17 May 2008 23:40:58 +0200 Subject: [Numpy-discussion] Getting pylint to recognize numpy In-Reply-To: <200805171716.13761.pgmdevlist@gmail.com> References: <8989d7aa0805171349r73a0b2eagdf2873129a177148@mail.gmail.com> <3d375d730805171357p61f68f55s2880e4d94fb1185a@mail.gmail.com> <200805171716.13761.pgmdevlist@gmail.com> Message-ID: <8989d7aa0805171440rc71f014tca15336ff9315492@mail.gmail.com> On 5/17/08, Pierre GM wrote: > On Saturday 17 May 2008 16:57:58 Robert Kern wrote: > > On Sat, May 17, 2008 at 3:49 PM, Robert Westerfield > > > > wrote: > > > Hi, > > > > > > I am unable to get the incredibly useful pylint (0.14) to recognize > > > numpy. > > mmh, works great on my machine... > pylint 0.14.0, > astng 0.17.2, common 0.21.2 > Python 2.4.4 (#1, Apr 8 2008, 13:36:41) > [GCC 4.1.2 (Gentoo 4.1.2 p1.0.1)] > > AAMOF, I use eclipse+pydev, that supports pylint. No particular problems > either... Thanks Pierre - I am using the same versions for all packages on Windows (I have tried a newer common as well). Did you (or Gentoo) do anything particular with either pylint's configuration file or Eclipse/PyDev to have it pick up numpy? I've asked on logilab's lists as well. Kind regards, Rob. 
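One stopgap sometimes used while this kind of false positive gets sorted out is to silence the message locally in the affected module. The disable-msg pragma below is the old-style pylint spelling and is an assumption on my part, not something verified against this exact pylint 0.14 setup:

""" Dummy doc-string """
from numpy import array  # pylint: disable-msg=E0611

if __name__ == '__main__':
    Z = array([1, 2, 3])
    print Z
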
> _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From maitai02 at excite.com Sat May 17 18:55:49 2008 From: maitai02 at excite.com (Jose Martin) Date: Sat, 17 May 2008 18:55:49 -0400 (EDT) Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C Message-ID: <20080517225549.5140C91FB5@xprdmxin.myway.com> Hi, I'd like to access a C function from python, and the function takes input/output arrays. I'd probably use SWIG to do the interface to the C code. I found 2 options: -NumPtr module, to access Numeric arrays as pointers http://www.penzilla.net/tutorials/python/numptr/ - numpy.i, a SWIG interface file for NumPy that defines typemaps http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/swig/doc/numpy_swig.html I'm not sure if there is significant differences between the 2 options (besides using either NumPy or Numeric). Does numpy.i interface file use pointers to access NumPy arrays? or does it make a copy of the array to pass it to/from the C function? I'm new to programming and I'd like to make sure of this. I need to use in C very large arrays frequently, so I want to avoid making copies of it, because speed will be an important factor. Thanks in advance! _______________________________________________ Join Excite! - http://www.excite.com The most personalized portal on the Web! From dagss at student.matnat.uio.no Sat May 17 19:42:57 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 18 May 2008 01:42:57 +0200 (CEST) Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <20080517225549.5140C91FB5@xprdmxin.myway.com> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> Message-ID: <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> Jose Martin wrote: > > Hi, I'd like to access a C function from python, and the function takes > input/output arrays. I'd probably use SWIG to do the interface to the C > code. I found 2 options: > -NumPtr module, to access Numeric arrays as pointers > http://www.penzilla.net/tutorials/python/numptr/ > - numpy.i, a SWIG interface file for NumPy that defines typemaps > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/swig/doc/numpy_swig.html > > I'm not sure if there is significant differences between the 2 options > (besides using either NumPy or Numeric). Does numpy.i interface file use > pointers to access NumPy arrays? or does it make a copy of the array to > pass it to/from the C function? I'll use the opportunity to recommend Cython, http://cython.org. (Disclaimer: I am not objective as I am very involved with Cython development, in particular I'll spend this summer improving NumPy support in it. You should listen to others on SWIG.) Cython is a different approach from SWIG (see http://wiki.cython.org/WrappingCorCpp; in particular SWIG uses more layers of indirection). To wrap a function, say, dealing with a 2-D float64 NumPy array, you'd write the following Cython code (Cython is a seperate language, sort of a mix of Python and C): import numpy cimport numpy import_array() cdef extern from "your_c_header.h": int your_func(numpy.float64* buf, unsigned int width, unsigned int height) def call_your_func(numpy.ndarray arr): if arr.ndim != 2: raise ValueError("Need 2D array") # ... and so on. Better make sure array is C-contiguous too! 
# (or convert it) return your_func(arr.data, arr.shape[0], arr.shape[1]) After Cython compilation you get a C file, and after compiling the C file you can import it into Python like a normal module and call "call_your_func" directly. That's an overview. Basically, "arr" behaves exactly like a Python object, except for when accessing the fields in which case values from the underlying C implementation will be returned instead. More info can be found on cython.org etc. (or ask!). Dag Sverre PS. You'll need an "numpy.pxd" (put it in the same directory, the "cimport" pulls it in). Here's mine: cdef extern from "Python.h": ctypedef int Py_intptr_t cdef extern from "numpy/arrayobject.h": ctypedef class numpy.ndarray [object PyArrayObject]: cdef char *data cdef int ndim "nd" cdef Py_intptr_t *shape "dimensions" cdef Py_intptr_t *strides cdef void import_array() ctypedef float npy_float64 ctypedef float npy_float80 ctypedef int npy_int8 ctypedef int npy_uint8 ctypedef int npy_int32 ctypedef int npy_uint32 ctypedef int npy_int64 ctypedef int npy_uint64 ... From wnbell at gmail.com Sat May 17 20:24:19 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 17 May 2008 19:24:19 -0500 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <20080517225549.5140C91FB5@xprdmxin.myway.com> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> Message-ID: On Sat, May 17, 2008 at 5:55 PM, Jose Martin wrote: > > Hi, I'd like to access a C function from python, and the function takes input/output arrays. I'd probably use SWIG to do the interface to the C code. I found 2 options: > -NumPtr module, to access Numeric arrays as pointers > http://www.penzilla.net/tutorials/python/numptr/ > - numpy.i, a SWIG interface file for NumPy that defines typemaps > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/swig/doc/numpy_swig.html > > I'm not sure if there is significant differences between the 2 options (besides using either NumPy or Numeric). Does numpy.i interface file use pointers to access NumPy arrays? or does it make a copy of the array to pass it to/from the C function? > > I'm new to programming and I'd like to make sure of this. I need to use in C very large arrays frequently, so I want to avoid making copies of it, because speed will be an important factor. > I'm fairly confident that numpy.i *will not* copy an array unless it is necessary to do so. Nice contiguous arrays should be passed through efficiently. I use SWIG extensively in scipy.sparse: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools One nice aspect of SWIG is that you can have it automatically dispatch the right function. For instance, if you have foo(float * arr) and foo(double * arr) then you can use SWIG typemaps pick the correct function to use. You can also make SWIG upcast to the appropriate types. For example, if in Python you passed an int array and a double array to: foo(double * arr1, double * arr2) then you can have SWIG automatically upcast the int array to double and delete it before returning to Python. 
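A pure-numpy sketch of the copy-only-when-necessary behaviour described above; this is a Python-level analogue of what an input-array typemap arranges in C, not the actual numpy.i code:

import numpy as np

def as_double_vector(obj):
    # Hand the input back untouched when it is already a contiguous
    # float64 array; otherwise build a converted temporary, which is
    # the upcast-the-int-array case mentioned above.
    return np.ascontiguousarray(obj, dtype=np.float64)

a = np.zeros(10, dtype=np.float64)
b = np.arange(10, dtype=np.int32)
print as_double_vector(a) is a   # True: no copy is made
print as_double_vector(b) is b   # False: converted into a temporary
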
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From wnbell at gmail.com Sat May 17 20:39:58 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 17 May 2008 19:39:58 -0500 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> Message-ID: On Sat, May 17, 2008 at 6:42 PM, Dag Sverre Seljebotn wrote: > > Cython is a different approach from SWIG (see > http://wiki.cython.org/WrappingCorCpp; in particular SWIG uses more layers > of indirection). > >From the link: "[SWIG] Can wrap almost any C and C++ code, including templates etc. Disadvantage is that it produces a C file, this compiles to .so, but then it also produces a Python wrapper on top of this .so file. So it's messy and it's slow. Also SWIG is not only targeting Python, but other languages as well." I really wish that people didn't spread FUD about SWIG. For reference, here's the "messy and slow" Python wrapper for a function in scipy.sparse: 52 def expandptr(*args): 53 """expandptr(int n_row, int Ap, int Bi)""" 54 return _csr.expandptr(*args) From: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools/csr.py#L52 I understand that scipy.sparse does not use all features of SWIG (which may produce uglier wrappers), but I think the above case is a fair example of what to expect when wrapping typical C/C++ libraries. More disingenuous FUD here: http://www.sagemath.org/doc/html/prog/node36.html -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From robert.kern at gmail.com Sat May 17 20:48:44 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 17 May 2008 19:48:44 -0500 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> Message-ID: <3d375d730805171748p20ffc16as20a09281351f617d@mail.gmail.com> On Sat, May 17, 2008 at 7:39 PM, Nathan Bell wrote: > More disingenuous FUD here: http://www.sagemath.org/doc/html/prog/node36.html For the purposes to which SWIG was applied in that case, the findings are accurate. The wording is overly general, though; it doesn't talk about other use cases for which the findings are not applicable. When the amount of computation in C/C++ is fairly small (multiplying 2 GMP integers), and the Python-level wrapper function needs to be called many times, then the overhead of the various SWIG layers can become significant. When the amount of computation in C/C++ is fairly large (matmult on sparse matrices), the Python function call overhead will probably be insignificant. Making a blanket statement either way is incorrect. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From wnbell at gmail.com Sat May 17 21:13:12 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 17 May 2008 20:13:12 -0500 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <3d375d730805171748p20ffc16as20a09281351f617d@mail.gmail.com> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> <3d375d730805171748p20ffc16as20a09281351f617d@mail.gmail.com> Message-ID: On Sat, May 17, 2008 at 7:48 PM, Robert Kern wrote: > For the purposes to which SWIG was applied in that case, the findings > are accurate. IMO it's deliberately misleading. The following three layers are spurious and have no analog on the Cython stack: Python code to provide a clean interface Handcode C++ Integer class GMP's C++ Interface A more honest comparison would list 3 layers for SWIG vs. 2 layers for Cython. I don't have a hard time believing that Cython is a better choice for fine-grained access to C/C++ code. However contrived scenarios like the above don't inspire my confidence either. I have yet to see a benchmark that reveals the claimed benefits of Cython. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From robert.kern at gmail.com Sat May 17 21:43:36 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 17 May 2008 20:43:36 -0500 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> <3d375d730805171748p20ffc16as20a09281351f617d@mail.gmail.com> Message-ID: <3d375d730805171843q679387a7v8563dfab8cbb59c6@mail.gmail.com> On Sat, May 17, 2008 at 8:13 PM, Nathan Bell wrote: > On Sat, May 17, 2008 at 7:48 PM, Robert Kern wrote: >> For the purposes to which SWIG was applied in that case, the findings >> are accurate. > > IMO it's deliberately misleading. The following three layers are > spurious and have no analog on the Cython stack: > Python code to provide a clean interface > Handcode C++ Integer class > GMP's C++ Interface If you want to end up with a class, that's more or less what you would do in SWIG. Some bits in Python, because they're dealing with Python types and exceptions and such. Some bits in C++ because you need to touch C/C++ structures. Just wrapping the functions in SWIG and then making the class in pure Python can sometimes work, but often you need to make a C++ class sitting on top of your library. Except for possibly the GMP C++ layer (perhaps the handwritten C++ class could just have used the GMP C API, I don't know), they're not spurious. All of that functionality that was implemented in those layers were implemented in the single Cython layer. > A more honest comparison would list 3 layers for SWIG vs. 2 layers for Cython. > > I don't have a hard time believing that Cython is a better choice for > fine-grained access to C/C++ code. However contrived scenarios like > the above don't inspire my confidence either. It was not contrived. It's production code. It was a real and perfectly valid use case. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From ellisonbg.net at gmail.com Sat May 17 22:30:06 2008 From: ellisonbg.net at gmail.com (Brian Granger) Date: Sat, 17 May 2008 20:30:06 -0600 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> Message-ID: <6ce0ac130805171930l3d30dc44q8c7c87fce8b72a83@mail.gmail.com> >> Cython is a different approach from SWIG (see >> http://wiki.cython.org/WrappingCorCpp; in particular SWIG uses more layers >> of indirection). >> > > >From the link: > "[SWIG] Can wrap almost any C and C++ code, including templates etc. > Disadvantage is that it produces a C file, this compiles to .so, but > then it also produces a Python wrapper on top of this .so file. So > it's messy and it's slow. Also SWIG is not only targeting Python, but > other languages as well." > > I really wish that people didn't spread FUD about SWIG. For > reference, here's the "messy and slow" Python wrapper for a function > in scipy.sparse: I have taken the liberty of making the Cython wiki a little more specific and fair to SWIG. Here is the changed text: SWIG is one of the oldest and most mature methods of wrapping C or C++ code into Python (SWIG works for other target languages as well). It can wrap almost any C and C++ code, including complex templated C++ code. If you have a large (so that hand wrapping is prohibitive) and course grained API, SWIG will likely be your best choice. However, SWIG does have a number or disadvantages compared with Cython. First, SWIG produces a C file, which gets compiled to a .so, but then it also produces a Python wrapper on top of this .so file. For fine grained APIs (where not much is done in each C/C++ call), the overhead of this additional Python wrapper can be significant. Second, with SWIG, the Python wrappers are written for you, so if their design is not exactly what you want, you end up doing more work to create your final Python API. Please correct any new errors I have introduced. Cheers, Brian From wnbell at gmail.com Sat May 17 22:35:04 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 17 May 2008 21:35:04 -0500 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <6ce0ac130805171930l3d30dc44q8c7c87fce8b72a83@mail.gmail.com> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> <6ce0ac130805171930l3d30dc44q8c7c87fce8b72a83@mail.gmail.com> Message-ID: On Sat, May 17, 2008 at 9:30 PM, Brian Granger wrote: > > Please correct any new errors I have introduced. > Thanks Brian, I think that's a fair representation. Minor typo "course grained" -> "coarse-grained" -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From ellisonbg.net at gmail.com Sat May 17 22:35:58 2008 From: ellisonbg.net at gmail.com (Brian Granger) Date: Sat, 17 May 2008 20:35:58 -0600 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <20080517225549.5140C91FB5@xprdmxin.myway.com> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> Message-ID: <6ce0ac130805171935j7ba2468che24d273e668d6b6d@mail.gmail.com> Jose, As you can see, people have different preferences for wrapping C/C++ code. I should also mention that one of the easiest methods if numpy arrays are involved is ctypes. numpy arrays already have more-or-less built-in support for talking to ctypes. 
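(A minimal sketch of the ctypes route; the shared library name and the C function signature below are invented for illustration and are not taken from any of the packages discussed here:)

import ctypes
import numpy as np
from numpy.ctypeslib import ndpointer

# Assumed C function in an assumed library: void scale(double *x, int n, double factor)
lib = ctypes.CDLL("./libexample.so")
lib.scale.restype = None
lib.scale.argtypes = [ndpointer(ctypes.c_double, flags="C_CONTIGUOUS"),
                      ctypes.c_int, ctypes.c_double]

x = np.arange(10, dtype=np.float64)
lib.scale(x, x.size, 2.0)    # the C code works directly on x's buffer, no copy is made

The ndpointer argtype also makes ctypes raise a TypeError if you pass an array of the wrong dtype or memory layout, instead of handing the C code a bad pointer.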
Details are available here: http://www.scipy.org/Cookbook/Ctypes The good news is that there are _many_ good options. I also know that the numpy.i file for handling numpy arrays in SWIG is of high quality and has decent documentation. Brian On Sat, May 17, 2008 at 4:55 PM, Jose Martin wrote: > > Hi, I'd like to access a C function from python, and the function takes input/output arrays. I'd probably use SWIG to do the interface to the C code. I found 2 options: > -NumPtr module, to access Numeric arrays as pointers > http://www.penzilla.net/tutorials/python/numptr/ > - numpy.i, a SWIG interface file for NumPy that defines typemaps > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/swig/doc/numpy_swig.html > > I'm not sure if there is significant differences between the 2 options (besides using either NumPy or Numeric). Does numpy.i interface file use pointers to access NumPy arrays? or does it make a copy of the array to pass it to/from the C function? > > I'm new to programming and I'd like to make sure of this. I need to use in C very large arrays frequently, so I want to avoid making copies of it, because speed will be an important factor. > > Thanks in advance! > > > > _______________________________________________ > Join Excite! - http://www.excite.com > The most personalized portal on the Web! > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From ellisonbg.net at gmail.com Sat May 17 22:37:02 2008 From: ellisonbg.net at gmail.com (Brian Granger) Date: Sat, 17 May 2008 20:37:02 -0600 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: References: <20080517225549.5140C91FB5@xprdmxin.myway.com> <49685.193.157.243.12.1211067777.squirrel@webmail.uio.no> <6ce0ac130805171930l3d30dc44q8c7c87fce8b72a83@mail.gmail.com> Message-ID: <6ce0ac130805171937j4a018376u611c84be5252e5d7@mail.gmail.com> On Sat, May 17, 2008 at 8:35 PM, Nathan Bell wrote: > On Sat, May 17, 2008 at 9:30 PM, Brian Granger wrote: >> >> Please correct any new errors I have introduced. >> > > Thanks Brian, I think that's a fair representation. > > Minor typo "course grained" -> "coarse-grained" Fixed > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From wfspotz at sandia.gov Sat May 17 23:27:09 2008 From: wfspotz at sandia.gov (Bill Spotz) Date: Sat, 17 May 2008 21:27:09 -0600 Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C In-Reply-To: <20080517225549.5140C91FB5@xprdmxin.myway.com> References: <20080517225549.5140C91FB5@xprdmxin.myway.com> Message-ID: Just to make sure the original question gets answered, yes, numpy.i avoids copies as much as possible. A special case is when your C code provides you with a view of its internal data and does not require any memory to be allocated by the (python) user. This can be dangerous, but if it is your use case, be sure to use the ARGOUTVIEW_* typemaps. Oh, and Brian's description of SWIG is an eminently fair one. AND, if NumPtr is only for Numeric, you should know that Numeric is no longer developed or supported. On May 17, 2008, at 4:55 PM, Jose Martin wrote: > Hi, I'd like to access a C function from python, and the function > takes input/output arrays. 
I'd probably use SWIG to do the interface > to the C code. I found 2 options: > -NumPtr module, to access Numeric arrays as pointers > http://www.penzilla.net/tutorials/python/numptr/ > - numpy.i, a SWIG interface file for NumPy that defines typemaps > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/swig/doc/numpy_swig.html > > I'm not sure if there is significant differences between the 2 > options (besides using either NumPy or Numeric). Does numpy.i > interface file use pointers to access NumPy arrays? or does it make > a copy of the array to pass it to/from the C function? > > I'm new to programming and I'd like to make sure of this. I need to > use in C very large arrays frequently, so I want to avoid making > copies of it, because speed will be an important factor. > > Thanks in advance! ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From millman at berkeley.edu Sun May 18 00:01:16 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 17 May 2008 21:01:16 -0700 Subject: [Numpy-discussion] 1.1.0rc1 tagged Message-ID: Please test the release candidate: svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1 Also please review the release notes: http://projects.scipy.org/scipy/numpy/milestone/1.1.0 I am going to ask Chris and David to create Windows and Mac binaries, which I hope they will have time to create ASAP. Sorry that it has taken me so long, I am on vacation with my family and am having a difficult time getting on my computer. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From matt at snapbug.geek.nz Sun May 18 03:04:29 2008 From: matt at snapbug.geek.nz (Matt Crane) Date: Sun, 18 May 2008 19:04:29 +1200 Subject: [Numpy-discussion] Numpify this? Message-ID: Hey, I'm new to numpy but not new to python or programming in general. I was wondering if there's a way of using numpy to do the following or whether I've got what I've got and that's as good as it's going to get. I have two 2d arrays and I want to create another 2d array that contains the values from the 2nd column of the first two arrays where the values in the 1st column match. To elaborate with an example - if I had an array a: array([[2834, 1], [3282, 3], [6850, 2], [9458, 2]]) and an array b: array([[2834, 3], [3282, 5], [4444, 5], [9458, 3], [9999, 4], [11111, 5], [12345, 1]]) then I'd want the result to be array([[1, 3], # from 2834 [3, 5], # from 3282 [2, 3]]) # from 9458 This is what I have at the moment: results = [] while aind < amax and bind < bmax: if a[aind, 0] < b[bind, 0]: aind += 1 elif a[aind, 0] > b[bind, 0]: bind += 1 else: results.append([a[aind, 1], b[bind, 1]]) aind += 1 bind += 1 results = array(results) Where aind = bind = 0, amax = a.shape[0] and bmax = b.shape[0]. Any tips/pointers/speedups? Cheers, Matt From robert.kern at gmail.com Sun May 18 03:19:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 18 May 2008 02:19:18 -0500 Subject: [Numpy-discussion] Numpify this? In-Reply-To: References: Message-ID: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> On Sun, May 18, 2008 at 2:04 AM, Matt Crane wrote: > Hey, > > I'm new to numpy but not new to python or programming in general. 
I > was wondering if there's a way of using numpy to do the following or > whether I've got what I've got and that's as good as it's going to > get. > > I have two 2d arrays and I want to create another 2d array that > contains the values from the 2nd column of the first two arrays where > the values in the 1st column match. To elaborate with an example - if > I had an array a: > > array([[2834, 1], [3282, 3], [6850, 2], [9458, 2]]) > and an array b: > > array([[2834, 3], [3282, 5], [4444, 5], [9458, 3], [9999, 4], [11111, > 5], [12345, 1]]) > > then I'd want the result to be > > array([[1, 3], # from 2834 > [3, 5], # from 3282 > [2, 3]]) # from 9458 Are the matching rows always going to be the same row in each? I.e. you want rows i such that a[i,0]==b[i,0] rather than trying to find all i,j such that a[i,0]==b[j,0]? If so, then I would do the following: In [1]: from numpy import * In [2]: a = array([[2834, 1], [3282, 3], [6850, 2], [9458, 2]]) In [3]: b = array([[2834, 3], [3282, 5], [4444, 5], [9458, 3], [9999, 4], [11111, ...: 5], [12345, 1]]) In [4]: minlength = min(a.shape[0], b.shape[0]) In [5]: matching = nonzero(a[:minlength,0] == b[:minlength,0])[0] In [6]: matching Out[6]: array([0, 1, 3]) In [7]: column_stack([a[matching,1], b[matching,1]]) Out[7]: array([[1, 3], [3, 5], [2, 3]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matt at snapbug.geek.nz Sun May 18 03:59:43 2008 From: matt at snapbug.geek.nz (Matt Crane) Date: Sun, 18 May 2008 19:59:43 +1200 Subject: [Numpy-discussion] Numpify this? In-Reply-To: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> Message-ID: On Sun, May 18, 2008 at 7:19 PM, Robert Kern wrote: > Are the matching rows always going to be the same row in each? I.e. > you want rows i such that a[i,0]==b[i,0] rather than trying to find > all i,j such that a[i,0]==b[j,0]? > > If so, then I would do the following: > > > In [1]: from numpy import * > > In [2]: a = array([[2834, 1], [3282, 3], [6850, 2], [9458, 2]]) > > In [3]: b = array([[2834, 3], [3282, 5], [4444, 5], [9458, 3], [9999, > 4], [11111, > ...: 5], [12345, 1]]) > > In [4]: minlength = min(a.shape[0], b.shape[0]) > > In [5]: matching = nonzero(a[:minlength,0] == b[:minlength,0])[0] > > In [6]: matching > Out[6]: array([0, 1, 3]) > > In [7]: column_stack([a[matching,1], b[matching,1]]) > Out[7]: > array([[1, 3], > [3, 5], > [2, 3]]) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Sorry, I should have mentioned that no, the matching rows won't always be in the same position. From robert.kern at gmail.com Sun May 18 04:08:35 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 18 May 2008 03:08:35 -0500 Subject: [Numpy-discussion] Numpify this? 
In-Reply-To: References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> Message-ID: <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> On Sun, May 18, 2008 at 2:59 AM, Matt Crane wrote: > Sorry, I should have mentioned that no, the matching rows won't always > be in the same position. Okay, then it's just a little bit more complicated. In [18]: from numpy import * In [19]: a = array([[1, 10], [2, 20], [3, 30], [3, 40], [4, 50]]) In [20]: b = array([[2, 60], [1, 70], [5, 80], [6, 90], [7, 100], [3, 110]]) In [21]: m = a[:,0] == b[:,0][:,newaxis] In [22]: m Out[22]: array([[False, True, False, False, False], [ True, False, False, False, False], [False, False, False, False, False], [False, False, False, False, False], [False, False, False, False, False], [False, False, True, True, False]], dtype=bool) In [23]: i, j = nonzero(m) In [24]: i Out[24]: array([0, 1, 5, 5]) In [25]: j Out[25]: array([1, 0, 2, 3]) In [26]: column_stack([a[j,1], b[i,1]]) Out[26]: array([[ 20, 60], [ 10, 70], [ 30, 110], [ 40, 110]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matt at snapbug.geek.nz Sun May 18 04:29:59 2008 From: matt at snapbug.geek.nz (Matt Crane) Date: Sun, 18 May 2008 20:29:59 +1200 Subject: [Numpy-discussion] Numpify this? In-Reply-To: <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> Message-ID: On Sun, May 18, 2008 at 8:08 PM, Robert Kern wrote: > Okay, then it's just a little bit more complicated. Thanks, and that's going to be faster - the method that I posted is linear in terms of the length of the two lists? Given that the values in the first column are monotonically increasing (again something I should have mentioned -- I blame a lack of caffeine) - could we make it faster? Thanks, for everything up to this point though. Matt From robert.kern at gmail.com Sun May 18 04:52:57 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 18 May 2008 03:52:57 -0500 Subject: [Numpy-discussion] Numpify this? In-Reply-To: References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> Message-ID: <3d375d730805180152u7bab9e2lc7595a82da102b27@mail.gmail.com> On Sun, May 18, 2008 at 3:29 AM, Matt Crane wrote: > On Sun, May 18, 2008 at 8:08 PM, Robert Kern wrote: >> Okay, then it's just a little bit more complicated. > > Thanks, and that's going to be faster - the method that I posted is > linear in terms of the length of the two lists? It depends on the sizes. > Given that the values > in the first column are monotonically increasing (again something I > should have mentioned -- I blame a lack of caffeine) - could we make > it faster? Are there repeats? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matt at snapbug.geek.nz Sun May 18 05:02:44 2008 From: matt at snapbug.geek.nz (Matt Crane) Date: Sun, 18 May 2008 21:02:44 +1200 Subject: [Numpy-discussion] Numpify this? 
In-Reply-To: <3d375d730805180152u7bab9e2lc7595a82da102b27@mail.gmail.com> References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> <3d375d730805180152u7bab9e2lc7595a82da102b27@mail.gmail.com> Message-ID: On Sun, May 18, 2008 at 8:52 PM, Robert Kern wrote: > It depends on the sizes. The sizes could range from 3 to 240000 with an average of around 5500. > Are there repeats? No, no repeats in the first column. I'm going to go get a cup of coffee before I forget to leave out any potentially vital information again. It's going to be a long day. Thanks, Matt From robert.kern at gmail.com Sun May 18 05:11:39 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 18 May 2008 04:11:39 -0500 Subject: [Numpy-discussion] Numpify this? In-Reply-To: References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> <3d375d730805180152u7bab9e2lc7595a82da102b27@mail.gmail.com> Message-ID: <3d375d730805180211s7c8d8d97w3ce6b6be5e204441@mail.gmail.com> On Sun, May 18, 2008 at 4:02 AM, Matt Crane wrote: > On Sun, May 18, 2008 at 8:52 PM, Robert Kern wrote: >> It depends on the sizes. > The sizes could range from 3 to 240000 with an average of around 5500. A 240000x240000 boolean matrix will probably be too slow. >> Are there repeats? > No, no repeats in the first column. Great! So let's use searchsorted() to find potential indices where the two first columns are equal. We pull out the values at those indices and actually do the comparison to get a boolean mask where there is an equality. Do both a.searchsorted(b) and b.searchsorted(a) to get the appropriate masks on b and a respectively. The number of True elements will be the same for both. Now just apply the masks to the second columns. In [20]: a = array([[2, 10], [4, 20], [6, 30], [8, 40], [10, 50]]) In [21]: b = array([[2, 60], [3, 70], [4, 80], [5, 90], [8, 100], [10, 110]]) In [22]: a[b[b[:,0].searchsorted(a[:,0]),0] == a[:,0], 1] Out[22]: array([10, 20, 40, 50]) In [23]: b[a[a[:,0].searchsorted(b[:,0]),0] == b[:,0], 1] Out[23]: array([ 60, 80, 100, 110]) In [24]: column_stack([Out[22], Out[23]]) Out[24]: array([[ 10, 60], [ 20, 80], [ 40, 100], [ 50, 110]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Sun May 18 05:14:20 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 18 May 2008 05:14:20 -0400 Subject: [Numpy-discussion] Numpify this? In-Reply-To: References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> <3d375d730805180152u7bab9e2lc7595a82da102b27@mail.gmail.com> Message-ID: 2008/5/18 Matt Crane : > On Sun, May 18, 2008 at 8:52 PM, Robert Kern wrote: >> Are there repeats? > No, no repeats in the first column. > > I'm going to go get a cup of coffee before I forget to leave out any > potentially vital information again. It's going to be a long day. It can be done, though I had to be kind of devious. My solution might not even be O(n log n), depending on how mergesort is implemented: def find_matches(A,B): """Find positions A_idx and B_idx so that A[A_idx]==B[B_idx] A and B should be sorted arrays with no repeats; A_idx and B_idx will be all the places they agree. 
>>> import numpy as np >>> A = np.array([1,2,4,5,7]) >>> B = np.array([1,3,5,9]) >>> find_matches(A,B) array([0,3]), array([0,2]) """ AB = np.concatenate((A,B)) idx = np.argsort(np.concatenate((A,B)),kind="mergesort") sorted = AB[idx] pairs = np.where(np.diff(sorted)==0)[0] A_pairs = idx[pairs] B_pairs = idx[pairs+1]-len(A) if np.any(A_pairs>=len(A)): raise ValueError, "first array contains a repeated element" if np.any(B_pairs<0): raise ValueError, "second array contains a repeated element" return A_pairs, B_pairs The idea is that diff is not a bad way to find repeats in a sequence; so concatenate the two sequences and sort them (if you're lucky mergesort will be fast because they're individually sorted, but in any case we need stability). Of course, you have to take the pairs in the sorted sequence and figure out where they came from, but fortunately you don't even need to invert the permutation returned by argsort (do we have a tool to invert permutations, or should one just use argsort again?). Anne From rex at nosyntax.com Sun May 18 06:13:38 2008 From: rex at nosyntax.com (rex) Date: Sun, 18 May 2008 03:13:38 -0700 Subject: [Numpy-discussion] numpy with icc Message-ID: <20080518101338.GC30321@nosyntax.net> > I am trying to build numpy with intel icc and mkl. I don't understand > a lot of what I am doing. Me, too. I have built it with icc & MKL several times in the past, but cannot build the numpy svn with MKL now. I can build it with icc and no MKL, and it passes all the tests with no errors. I've interleaved my setup with yours to make it easy to compare them. > ---------------------------------------------------------------------------------- > site.cfg: > [DEFAULT] > library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t,/usr/lib64,/usr/local/lib64,/usr/local/python2.5.2-intel/lib,/usr/lib,/usr/local/lib > include_dirs = /opt/intel/mkl/10.0.3.020/include,/usr/include,/usr/local/include,/usr/local/python2.5.2-intel/include > [mkl] > include_dirs = /opt/intel/mkl/10.0.3.020/include > library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t > lapack_libs = mkl_lapack > [lapack_src] > libraries=mkl_lapack,mkl,guide > [lapack_info] > libraries=mkl_lapack,mkl,guide > ------------------------------------------------------------------------------- [DEFAULT] library_dirs = /opt/intel/mkl/10.0.3.020/lib/32 include_dirs = /opt/intel/mkl/10.0.3.020/10.0.3.020/include [mkl] library_dirs = /opt/intel/mkl/10.0.3.020/lib/32 lapack_libs = mkl, mkl_lapack, guide #my reading of the Intel MKL docs suggests that vml does not need to be explicitly added, but that mkl_lapack does. > intelccompiler.py: > from distutils.unixccompiler import UnixCCompiler > from numpy.distutils.exec_command import find_executable > class IntelCCompiler(UnixCCompiler): > """ A modified Intel compiler compatible with an gcc built Python. > """ > compiler_type = 'intel' > cc_exe = 'icc -g -O3 -w -fPIC -parallel -ipo -xT -axT' cc_exe = 'icc -msse3 -fast' #adjust to suit your cpu I think some of those flags can be omitted: drop -g, and -fast implies several other flags, e.g., -xT, IIRC. 
(I'm building on a Core 2 Duo) > def __init__ (self, verbose=0, dry_run=0, force=0): > UnixCCompiler.__init__ (self, verbose,dry_run, force) > compiler = self.cc_exe > self.set_executables(compiler=compiler, > compiler_so=compiler, > compiler_cxx=compiler, > linker_exe=compiler, > linker_so=compiler + ' -shared') > class IntelItaniumCCompiler(IntelCCompiler): > compiler_type = 'intele' > # On Itanium, the Intel Compiler used to be called ecc, let's search for > # it (now it's also icc, so ecc is last in the search). > for cc_exe in map(find_executable,['icc','ecc']): > if cc_exe: > break > ---------------------------------------------------------------- > system_info.py: > .... > class lapack_mkl_info(mkl_info): > def calc_info(self): > mkl = get_info('mkl') > if not mkl: > return > if sys.platform == 'win32': > lapack_libs = self.get_libs('lapack_libs',['mkl_lapack']) > else: > lapack_libs = self.get_libs('lapack_libs',['mkl_lapack']) > info = {'libraries': lapack_libs} > dict_append(info,**mkl) > self.set_info(**info) > ... > ------------------------------------------------------------------------ I made the same change in the above. It was originally: if sys.platform == 'win32': lapack_libs = self.get_libs('lapack_libs',['mkl_lapack']) else: lapack_libs = self.get_libs('lapack_libs',['mkl_lapack32','mkl_lapack64]') Which does not work because all 3 libs are now named mkl_lapack. I made another change in system.info.py: class mkl_info(system_info): section = 'mkl' dir_env_var = 'MKL' _lib_mkl = ['mkl','mkl_lapack','guide'] It originally had 'vml' instead of 'mkl_lapack' > i then use this command to compile: > /usr/local/python2.5.2-intel/bin/python setup.py config > --compiler=intel config_fc --fcompiler=intel \ > --opt='-fPIC -O3 -w -axT -xT' install > build.out I used: numpy# python setup.py config --compiler=intel build_clib \ --compiler=intel build_ext --compiler=intel install \ --prefix=/usr/local> build51 > build.out has this in it: > F2PY Version 2_4422 > blas_opt_info: > blas_mkl_info: > libraries mkl,vml,guide not found in /opt/intel/mkl/10.0.3.020/lib/em64t > NOT AVAILABLE > atlas_blas_threads_info: > Setting PTATLAS=ATLAS > NOT AVAILABLE > atlas_blas_info: > NOT AVAILABLE > blas_info: > NOT AVAILABLE > blas_src_info: > NOT AVAILABLE > NOT AVAILABLE > lapack_opt_info: > lapack_mkl_info: > mkl_info: > libraries mkl,vml,guide not found in /opt/intel/mkl/10.0.3.020/lib/em64t > NOT AVAILABLE > NOT AVAILABLE > atlas_threads_info: > Setting PTATLAS=ATLAS > numpy.distutils.system_info.atlas_threads_info > NOT AVAILABLE > atlas_info: > numpy.distutils.system_info.atlas_info > NOT AVAILABLE > lapack_info: > NOT AVAILABLE > lapack_src_info: > NOT AVAILABLE > NOT AVAILABLE build53 (I used svn 5183) has: F2PY Version 2_5183 blas_opt_info: blas_mkl_info: FOUND: libraries = ['mkl', 'mkl_lapack', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = [] FOUND: libraries = ['mkl', 'mkl_lapack', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = [] lapack_opt_info: lapack_mkl_info: mkl_info: FOUND: libraries = ['mkl', 'mkl_lapack', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = [] FOUND: libraries = ['mkl', 'mkl_lapack', 'guide', 'mkl', 'mkl_lapack', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] 
include_dirs = [] FOUND: libraries = ['mkl', 'mkl_lapack', 'guide', 'mkl', 'mkl_lapack', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] running config running build_clib [...] > running config > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running install > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running build_src > building py_modules sources > creating build > creating build/src.linux-x86_64-2.5 > creating build/src.linux-x86_64-2.5/numpy > creating build/src.linux-x86_64-2.5/numpy/distutils > building extension "numpy.core.multiarray" sources > creating build/src.linux-x86_64-2.5/numpy/core > Generating build/src.linux-x86_64-2.5/numpy/core/config.h > Found executable /opt/intel/cce/10.1.015/bin/icc > Could not locate executable ecc > customize IntelFCompiler > Found executable /opt/intel/fce/10.1.015/bin/ifort > C compiler: icc -g -O3 -w -fPIC -parallel -ipo -xT -axT > ...... > it seems to compile and install fine.... then i start python and try > to run numpy.test(). that is where i am stuck. this is what happens: > # pwd > /usr/local/python2.5.2-intel/bin > # ./python > Python 2.5.2 (r252:60911, May 13 2008, 11:22:16) > [GCC Intel(R) C++ gcc 4.1 mode] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy > >>> numpy.test() > Numpy is installed in > /usr/local/python2.5.2-intel/lib/python2.5/site-packages/numpy > Numpy version 1.0.4 > Python version 2.5.2 (r252:60911, May 13 2008, 11:22:16) [GCC Intel(R) > C++ gcc 4.1 mode] > Found 10/10 tests for numpy.core.defmatrix > Found 36/36 tests for numpy.core.ma > Found 223/223 tests for numpy.core.multiarray > Found 65/65 tests for numpy.core.numeric > Found 31/31 tests for numpy.core.numerictypes > Found 12/12 tests for numpy.core.records > Found 6/6 tests for numpy.core.scalarmath > Found 14/14 tests for numpy.core.umath > Found 4/4 tests for numpy.ctypeslib > Found 5/5 tests for numpy.distutils.misc_util > Found 1/1 tests for numpy.fft.fftpack > Found 3/3 tests for numpy.fft.helper > Found 9/9 tests for numpy.lib.arraysetops > Found 46/46 tests for numpy.lib.function_base > Found 5/5 tests for numpy.lib.getlimits > Found 4/4 tests for numpy.lib.index_tricks > Found 3/3 tests for numpy.lib.polynomial > Found 49/49 tests for numpy.lib.shape_base > Found 15/15 tests for numpy.lib.twodim_base > Found 43/43 tests for numpy.lib.type_check > Found 1/1 tests for numpy.lib.ufunclike > Found 40/40 tests for numpy.linalg > Found 2/2 tests for numpy.random > Found 0/0 tests for __main__ > MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/em64t/: cannot read > file data: Is a directory > # I get the same error: python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0.dev5183 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/32/: cannot read file data: Is a directory > not sure how to get past this... although, I am sure, if numpy is not > working right then i will not be able to go on and compile scipy..... I don't understand how MKL got linked into your system. Whenever I get no messages of the type: FOUND: libraries = ['mkl', 'mkl_lapack', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = [] in the build log, MKL doesn't get linked. 
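(A complementary check from Python itself, assuming the build and install completed: numpy records the library configuration it was built with, and printing that record shows at a glance whether the mkl entries made it in.)

import numpy
numpy.__config__.show()    # prints the blas/lapack sections (libraries, library_dirs) recorded at build time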
If it IS linked it will show up here: # ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so linux-gate.so.1 => (0xffffe000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb7e62000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb7c65000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb7c01000) libmkl_lapack.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_lapack.so (0xb7736000) libguide.so => /opt/intel/cc/10.1.015/lib/libguide.so (0xb76d4000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb76a8000) libimf.so => /opt/intel/cc/10.1.015/lib/libimf.so (0xb7478000) libsvml.so => /opt/intel/cc/10.1.015/lib/libsvml.so (0xb73a7000) libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7382000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7377000) libintlc.so.5 => /opt/intel/cc/10.1.015/lib/libintlc.so.5 (0xb7334000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7202000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb71fe000) /lib/ld-linux.so.2 (0x80000000) From david at ar.media.kyoto-u.ac.jp Sun May 18 06:14:35 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 18 May 2008 19:14:35 +0900 Subject: [Numpy-discussion] numpy.distutils: building a f2py in a subdir Message-ID: <4830018B.6090904@ar.media.kyoto-u.ac.jp> Hi, I would like to be able to build a f2py extension in a subdir with distutils, that is: config.add_extension('foo/bar', source = ['foo/bar.pyf']) But it does not work right now because of the way numpy.distutils finds the name of the extension. Replacing: ext_name = extension.name.split('.')[-1] by ext_name = os.path.basename(extension.name.split('.')[-1]) Seems to make it work. Could that break anything in numpy.distutils ? I don't see how, but I don't want to touch distutils without being sure it won't, cheers, David From robert.kern at gmail.com Sun May 18 06:38:27 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 18 May 2008 05:38:27 -0500 Subject: [Numpy-discussion] numpy.distutils: building a f2py in a subdir In-Reply-To: <4830018B.6090904@ar.media.kyoto-u.ac.jp> References: <4830018B.6090904@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730805180338n27a5c441w3b4b9594553024e2@mail.gmail.com> On Sun, May 18, 2008 at 5:14 AM, David Cournapeau wrote: > Hi, > > I would like to be able to build a f2py extension in a subdir with > distutils, that is: > > config.add_extension('foo/bar', source = ['foo/bar.pyf']) > > But it does not work right now because of the way numpy.distutils finds > the name of the extension. Is foo a subpackage and the extension is supposed to be imported as parent.foo.bar (assuming the setup.py is for the "parent" package)? If so, you want this: config.add_extension('foo.bar', source=['foo/bar.pyf']) If the source is just in a subdirectory, but bar.so should be imported as "parent.bar", then you want this: config.add_extension('bar', source=['foo/bar.pyf']) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From david at ar.media.kyoto-u.ac.jp Sun May 18 06:30:52 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 18 May 2008 19:30:52 +0900 Subject: [Numpy-discussion] numpy.distutils: building a f2py in a subdir In-Reply-To: <3d375d730805180338n27a5c441w3b4b9594553024e2@mail.gmail.com> References: <4830018B.6090904@ar.media.kyoto-u.ac.jp> <3d375d730805180338n27a5c441w3b4b9594553024e2@mail.gmail.com> Message-ID: <4830055C.6070700@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > config.add_extension('foo.bar', source=['foo/bar.pyf']) > Duh, should have thought about that. thanks, David From pearu at cens.ioc.ee Sun May 18 06:49:00 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sun, 18 May 2008 13:49:00 +0300 (EEST) Subject: [Numpy-discussion] numpy.distutils: building a f2py in a subdir In-Reply-To: <4830018B.6090904@ar.media.kyoto-u.ac.jp> References: <4830018B.6090904@ar.media.kyoto-u.ac.jp> Message-ID: <58903.88.90.135.57.1211107740.squirrel@cens.ioc.ee> On Sun, May 18, 2008 1:14 pm, David Cournapeau wrote: > Hi, > > I would like to be able to build a f2py extension in a subdir with > distutils, that is: > > config.add_extension('foo/bar', source = ['foo/bar.pyf']) A safe approach would be to create a foo/setup.py that contains config.add_extension('bar', source = ['bar.pyf']) and in the parent setup.py add config.add_subpackage('foo') (you might also need creating foo/__init__.py). > But it does not work right now because of the way numpy.distutils finds > the name of the extension. Replacing: > > ext_name = extension.name.split('.')[-1] > > by > > ext_name = os.path.basename(extension.name.split('.')[-1]) > > Seems to make it work. Could that break anything in numpy.distutils ? I > don't see how, but I don't want to touch distutils without being sure it > won't, The change should not break anything that already works because in distutils extension name is assumed to contain names joined with a dot. If distutils works with / in extension name, then I think it is because by an accident. I'd recommend checking this also on a windows system before changing numpy.distutils, not sure if it works or not there.. Pearu From rex at nosyntax.com Sun May 18 07:20:37 2008 From: rex at nosyntax.com (rex) Date: Sun, 18 May 2008 04:20:37 -0700 Subject: [Numpy-discussion] 1.1.0rc1 tagged Message-ID: <20080518112037.GD30321@nosyntax.net> Jarrod Millman wrote: >Please test the release candidate: >svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1 With icc & MKL it fails to find the MKL libraries. 
site.cfg: ---------------------------------------------------------- [DEFAULT] library_dirs = /opt/intel/mkl/10.0.3.020/lib/32 include_dirs = /opt/intel/mkl/10.0.3.020/10.0.3.020/include [mkl] library_dirs = /opt/intel/mkl/10.0.3.020/lib/32 lapack_libs = mkl, mkl_lapack, guide ---------------------------------------------------------- 1.1.0rc1# python setup.py config --compiler=intel build_clib \ --compiler=intel build_ext --compiler=intel install \ --prefix=/usr/local> build0 cat build0 F2PY Version 2_5188 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /opt/intel/mkl/10.0.3.020/lib/32 NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /opt/intel/mkl/10.0.3.020/lib/32 NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /opt/intel/mkl/10.0.3.020/lib/32 NOT AVAILABLE blas_info: libraries blas not found in /opt/intel/mkl/10.0.3.020/lib/32 NOT AVAILABLE blas_src_info: NOT AVAILABLE NOT AVAILABLE lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /opt/intel/mkl/10.0.3.020/lib/32 NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /opt/intel/mkl/10.0.3.020/lib/32 libraries lapack_atlas not found in /opt/intel/mkl/10.0.3.020/lib/32 numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /opt/intel/mkl/10.0.3.020/lib/32 libraries lapack_atlas not found in /opt/intel/mkl/10.0.3.020/lib/32 numpy.distutils.system_info.atlas_info NOT AVAILABLE lapack_info: libraries lapack not found in /opt/intel/mkl/10.0.3.020/lib/32 NOT AVAILABLE lapack_src_info: NOT AVAILABLE NOT AVAILABLE running config [...] ------------------------------------------------------------------------------ Without MKL linked it runs the tests perfectly after being compiled with icc: ----------------------------------------------------------- python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ ............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ---------------------------------------------------------------------- Ran 1004 tests in 0.977s OK ---------------------------------------------------------------------- However, if distutils/system_info.py is hacked to make it find MKL, it builds w/o error, but fails the test: $ python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/32/: cannot read file data: Is a directory ------------------------------------------------------------------------------------ As far as MKL goes, we're making negative progress -- I used to be able to compile a 'good' numpy with icc & MKL that would pass the unit tests. Now, just with icc, but not with MKL. FWIW, there are two changes to system_info.py that are required to link MKL. class lapack_mkl_info(mkl_info): def calc_info(self): mkl = get_info('mkl') if not mkl: return if sys.platform == 'win32': lapack_libs = self.get_libs('lapack_libs',['mkl_lapack']) else: lapack_libs = self.get_libs('lapack_libs',['mkl_lapack32','mkl_lapack64']) info = {'libraries': lapack_libs} dict_append(info,**mkl) self.set_info(**info) This will fail because all 3 MKL versions (since version 9.x) have a common name: mkl_lapack The other change is vml. From reading the Intel examples, I think (but could be wrong) that vml doesn't need to be explicitly linked to get its functionality, but mkl_lapack does. This requires changing: class mkl_info(system_info): section = 'mkl' dir_env_var = 'MKL' _lib_mkl = ['mkl','vml','guide'] to: class mkl_info(system_info): section = 'mkl' dir_env_var = 'MKL' _lib_mkl = ['mkl','mkl_lapack','guide'] After these two changes are made, MKL is used (with my site.cfg), but the result is a numpy that fails the unit tests spectacularly. -rex -- "Generally intelligence has no effect on conclusions, which are glandularly determined. It just rationalizes hormonal inevitabilities." 
--Fred Reed From zachary.pincus at yale.edu Sun May 18 07:38:46 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Sun, 18 May 2008 13:38:46 +0200 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <482E8A85.6010606@ar.media.kyoto-u.ac.jp> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> <482E8A85.6010606@ar.media.kyoto-u.ac.jp> Message-ID: On May 17, 2008, at 9:34 AM, David Cournapeau wrote: > Nripun Sredar wrote: >> I have a sparse matrix 416x52. I tried to factorize this matrix using >> svd from numpy. But it didn't produce a result and looked like it is >> in an infinite loop. >> I tried a similar operation using random numbers in the matrix. Even >> this is in an infinite loop. >> Did anyone else face a similar problem? >> Can anyone please give some suggestions? > > Are you on windows ? What is the CPU on your machine ? I suspect > this is > caused by windows binaries which shipped blas/lapack without support > for > "old" CPU. I have seen this issue as well, on Windows XP running on a Core2 Duo. (But... it was a virtualized environment with VirtualBox, so I don't know if that disables the SSE features.) Anyhow, this was with the latest windows binaries, I think (numpy-1.0.4.win32-py2.5.msi), and had the same issue: infinite loop with 100% processor doing a SVD. (Non-sparse array, though.) Zach From maitai02 at excite.com Sun May 18 08:19:37 2008 From: maitai02 at excite.com (Jose Martin) Date: Sun, 18 May 2008 08:19:37 -0400 (EDT) Subject: [Numpy-discussion] NumPtr vs NumPy.i to access C Message-ID: <20080518121937.7E8008B320@xprdmxin.myway.com> Thanks everyone for all the comments! It helped to understand better the advantages/disadvantages of the various options to interact with C. Jose. --- On Sat 05/17, Bill Spotz < wfspotz at sandia.gov > wrote: Just to make sure the original question gets answered, yes, numpy.i avoids copies as much as possible. A special case is when your C code provides you with a view of its internal data and does not require any memory to be allocated by the (python) user. This can be dangerous, but if it is your use case, be sure to use the ARGOUTVIEW_* typemaps. Oh, and Brian's description of SWIG is an eminently fair one. AND, if NumPtr is only for Numeric, you should know that Numeric is no longer developed or supported. _______________________________________________ Join Excite! - http://www.excite.com The most personalized portal on the Web! From steve at shrogers.com Sun May 18 09:16:32 2008 From: steve at shrogers.com (Steven H. Rogers) Date: Sun, 18 May 2008 07:16:32 -0600 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: Message-ID: <48302C30.2090502@shrogers.com> Joe Harrington wrote: > NUMPY/SCIPY DOCUMENTATION MARATHON 2008 > ... > 5. Write a new help function that optionally produces ASCII or points > the user's PDF or HTML reader to the right page (either local or > global). > I can work on this. 
Fernando suggested this at the IPython sprint in Boulder last year, so I've given it some thought and started a wiki page: http://ipython.scipy.org/moin/Developer_Zone/SearchDocs Regards, Steve From cournape at gmail.com Sun May 18 09:28:34 2008 From: cournape at gmail.com (David Cournapeau) Date: Sun, 18 May 2008 22:28:34 +0900 Subject: [Numpy-discussion] numpy with icc In-Reply-To: <20080518101338.GC30321@nosyntax.net> References: <20080518101338.GC30321@nosyntax.net> Message-ID: <5b8d13220805180628sd0f9b4ckbeb35d653a2ea489@mail.gmail.com> On Sun, May 18, 2008 at 7:13 PM, rex wrote: > >> I am trying to build numpy with intel icc and mkl. I don't understand >> a lot of what I am doing. > > Me, too. I have built it with icc & MKL several times in the past, > but cannot build the numpy svn with MKL now. I can build it with > icc and no MKL, and it passes all the tests with no errors. > I have not tried with icc, but the following works for me with the last mkl (I have only tried numpy). [mkl] library_dirs = /home/david/intel/mkl/10.0.1.014/lib/32 lapack_libs = mkl_lapack mkl_libs = mkl, guide (of course, adapt the library_dirs accordingly). All test pass. I have updated the site.cfg.example in numpy. But really, if Intel keeps changing its library names, there is not much we can do. cheers, David From pav at iki.fi Sun May 18 09:28:47 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 18 May 2008 16:28:47 +0300 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <48302C30.2090502@shrogers.com> References: <48302C30.2090502@shrogers.com> Message-ID: <1211117327.8207.37.camel@localhost.localdomain> Hi, su, 2008-05-18 kello 07:16 -0600, Steven H. Rogers kirjoitti: > Joe Harrington wrote: > > NUMPY/SCIPY DOCUMENTATION MARATHON 2008 > > ... > > 5. Write a new help function that optionally produces ASCII or points > > the user's PDF or HTML reader to the right page (either local or > > global). > > > I can work on this. Fernando suggested this at the IPython sprint in > Boulder last year, so I've given it some thought and started a wiki page: > http://ipython.scipy.org/moin/Developer_Zone/SearchDocs In Numpy SVN/1.1 there is a function "lookfor" that searches the docstrings for a substring (no stemming etc. is done). Similar "%lookfor" magic command got accepted into IPython0 as an extension ipy_lookfor.py. Improvements to these would be surely appreciated. I think that also Sphinx supports searching, so that the generated HTML docs [1] are searchable, as is the generated PDF output. Pauli .. [1] http://mentat.za.net/numpy/refguide/ So far, this preview contains only docs for ndarray, though. From zoho.vignochi at gmail.com Sun May 18 09:59:10 2008 From: zoho.vignochi at gmail.com (Zoho Vignochi) Date: Sun, 18 May 2008 13:59:10 +0000 (UTC) Subject: [Numpy-discussion] checking element types in array References: Message-ID: On Sat, 17 May 2008 14:58:20 -0400, Anne Archibald wrote: > numpy arrays are efficient, among other reasons, because they have > homogeneous types. So all the elements in an array are the same type. > (Yes, this means if you have an array of numbers only one of which > happens to be complex, you have to represent them all as complex numbers > whose imaginary part happens to be zero.) So if A is an array A.dtype is > the type of its elements. 
> > numpy provides two convenience functions for checking whether an array > is complex, depending on what you want: > > iscomplex checks whether each element has a nonzero imaginary part and > returns an array representing the element-by-element answer; so > any(iscomplex(A)) will be true if any element of A has a nonzero > imaginary part. > > iscomplexobj checks whether the array has a complex data type. This is > much much faster, but of course it may happen that all the imaginary > parts happen to be zero; if you want to treat this array as real, you > must use iscomplex. > > Anne > > > Anne Thank you for the explanation. I knew there would be a speed penalty but the current numpy dot doesn't work as expected with mpf or mpc just yet and so I had to write my own. Your explanation helped as I decided to treat all numbers as complex and just implemented the complex version. Thanks, Zoho From markbak at gmail.com Sun May 18 11:37:42 2008 From: markbak at gmail.com (mark) Date: Sun, 18 May 2008 08:37:42 -0700 (PDT) Subject: [Numpy-discussion] arbitrary precision arrays in numpy? Message-ID: <6d9f9639-1b1a-4ccd-8fd4-99b6f50d118c@2g2000hsn.googlegroups.com> Hello list - I could not find an option for arbitrary precision arrays in numpy. Did anybody implement this? I would like to use something like 80 digits precision. Thanks, Mark From strawman at astraw.com Sun May 18 11:57:13 2008 From: strawman at astraw.com (Andrew Straw) Date: Sun, 18 May 2008 08:57:13 -0700 Subject: [Numpy-discussion] 1.1.0rc1 tagged In-Reply-To: References: Message-ID: <483051D9.20000@astraw.com> Jarrod Millman wrote: > Please test the release candidate: > svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1 > Thanks, Jarrod. I have packaged SVN trunk from r5189 and made a Debian source package (based on a slightly old version the Debian Python Modules Team's numpy package with all patches removed) and Ubuntu Hardy (8.04) binary packages. These are available at: http://debs.astraw.com/hardy/ In particular, you can just grab the .deb for your architecture for Ubuntu Hardy: * i386: http://debs.astraw.com/hardy/python-numpy_1.1.0~dev5189-0ads1_i386.deb * amd64: http://debs.astraw.com/hardy/python-numpy_1.1.0~dev5189-0ads1_amd64.deb All numpy, tests pass on both architectures in my hands, and I shall begin testing my various codes with this release. Other Ubunteers who don't want to bother compiling from source are also welcome to try these packages. (I chose the trunk rather than the RC tag because my understanding is that fixes to the final 1.1.0 are going to the trunk, and David Cournapeau has made a couple commits. Also, I released after I packaged this up that I forgot to touch the date in debian/changelog -- apologies!) -Andrew From charlesr.harris at gmail.com Sun May 18 12:06:52 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 18 May 2008 10:06:52 -0600 Subject: [Numpy-discussion] arbitrary precision arrays in numpy? In-Reply-To: <6d9f9639-1b1a-4ccd-8fd4-99b6f50d118c@2g2000hsn.googlegroups.com> References: <6d9f9639-1b1a-4ccd-8fd4-99b6f50d118c@2g2000hsn.googlegroups.com> Message-ID: Hi Mark, On Sun, May 18, 2008 at 9:37 AM, mark wrote: > Hello list - > > I could not find an option for arbitrary precision arrays in numpy. > Did anybody implement this? > > I would like to use something like 80 digits precision. > No, we don't have this. What do you need it for? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
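For the 80-digit request above, numpy itself has no arbitrary-precision dtype, but values from a multiple-precision package can be stored in an object array. A rough sketch, assuming the mpmath package is installed (mpmath is a separate pure-Python library, not part of numpy):

import numpy as np
from mpmath import mp, mpf, sqrt

mp.dps = 80                                  # work with roughly 80 significant decimal digits
a = np.array([mpf(1)/3, sqrt(mpf(2))], dtype=object)
print a * 2                                  # elementwise ops fall back to mpf arithmetic

As noted above, dot() and similar routines may not behave as expected with mpf/mpc elements, so an object array like this is a fallback for precision-critical pieces rather than a drop-in replacement.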
URL: From rex at nosyntax.com Sun May 18 15:14:38 2008 From: rex at nosyntax.com (rex) Date: Sun, 18 May 2008 12:14:38 -0700 Subject: [Numpy-discussion] numpy with icc In-Reply-To: <5b8d13220805180628sd0f9b4ckbeb35d653a2ea489@mail.gmail.com> Message-ID: <20080518191438.GE30321@nosyntax.net> David wrote: > I have not tried with icc, but the following works for me with the > last mkl (I have only tried numpy). > > [mkl] > library_dirs = /home/david/intel/mkl/10.0.1.014/lib/32 > lapack_libs = mkl_lapack > mkl_libs = mkl, guide > > (of course, adapt the library_dirs accordingly). All test pass. I have > updated the site.cfg.example in numpy. But really, if Intel keeps > changing its library names, there is not much we can do. The last relevant MKL library name change I'm aware of occurred when MKL 9.X was released in 2006: "mkl_lapackXX.so" was changed to "mkl_lapack.so" in all 3 cases. And, since the material below shows mkl_lapack need not be in site.cfg, the name change should not matter. I tried your (locally adapted) site.cfg with no changes to distutils. /usr/local/src/1.1.0rc1/site.cfg: ----------------------------------------------------- [mkl] library_dirs = /opt/intel/mkl/mkl/10.0.3.020/lib/32 lapack_libs = mkl_lapack mkl_libs = mkl, guide ----------------------------------------------------- Build command: python setup.py config --compiler=intel build_clib \ --compiler=intel build_ext --compiler=intel install \ --prefix=/usr/local> build3 #less build3 ---------------------------------------------------- F2PY Version 2_5188 blas_opt_info: blas_mkl_info: FOUND: libraries = ['mkl', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/local/include', '/usr/include'] FOUND: libraries = ['mkl', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/local/include', '/usr/include'] lapack_opt_info: lapack_mkl_info: mkl_info: FOUND: libraries = ['mkl', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/local/include', '/usr/include'] FOUND: libraries = ['mkl_lapack', 'mkl', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/local/include', '/usr/include'] FOUND: libraries = ['mkl_lapack', 'mkl', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] [...] 
copying site.cfg -> /usr/local/lib/python2.5/site-packages/numpy/distutils/ copying build/src.linux-i686-2.5/numpy/core/numpyconfig.h -> /usr/local/lib/python2.5/site-packages/numpy/core/include/numpy copying build/src.linux-i686-2.5/numpy/core/__multiarray_api.h -> /usr/local/lib/python2.5/site-packages/numpy/core/include/numpy copying build/src.linux-i686-2.5/numpy/core/multiarray_api.txt -> /usr/local/lib/python2.5/site-packages/numpy/core/include/numpy copying build/src.linux-i686-2.5/numpy/core/__ufunc_api.h -> /usr/local/lib/python2.5/site-packages/numpy/core/include/numpy copying build/src.linux-i686-2.5/numpy/core/ufunc_api.txt -> /usr/local/lib/python2.5/site-packages/numpy/core/include/numpy running install_egg_info ------------------------------------------------------------- ldd shows MKL was linked: # ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so linux-gate.so.1 => (0xffffe000) libmkl_lapack.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_lapack.so (0xb7ae1000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb79a3000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb77a6000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb7742000) libguide.so => /opt/intel/cc/10.1.015/lib/libguide.so (0xb76e0000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb76b4000) libimf.so => /opt/intel/cc/10.1.015/lib/libimf.so (0xb7484000) libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb745f000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7454000) libintlc.so.5 => /opt/intel/cc/10.1.015/lib/libintlc.so.5 (0xb7411000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb72e0000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb72db000) /lib/ld-linux.so.2 (0x80000000) Interestingly, the above shows libmkl_lapack.so is being used even though it is not in the site.cfg. Apparently, mkl and guide are sufficient in site.cfg. $ python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/32/: cannot read file data: Is a directory To test for an interaction between icc and MKL, I built with gcc (after removing the 1.1.0rc1/build directory): # python setup.py config build_clib build_ext install \ --prefix=/usr/local > build4 $ python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/32/: cannot read file data: Is a directory So why do Erik and I get failures (with both gcc & icc) when MKL is used and you don't? From jbsnyder at gmail.com Sun May 18 15:56:24 2008 From: jbsnyder at gmail.com (James Snyder) Date: Sun, 18 May 2008 14:56:24 -0500 Subject: [Numpy-discussion] Which to test: 1.1.x or 1.1.0rc1? Message-ID: <33644d3c0805181256g2ccfad1q1f814b919d726c59@mail.gmail.com> Hi - I've been running out of trunk recently, and I've noted that an rc release has appeared and the 1.1.x branch has been regenerated. Which would be most helpful to provide feedback from? >From the branch (1.1.1x) - test results on Mac OS X 10.5.2, built for universal, using apple Python, looks all clean. In the past I've needed to build universal binaries, not sure if that is still the case, but things behave more happily if I do. CFLAGS="-O -g -isysroot /Developer/SDKs/MacOSX10.5.sdk -arch ppc -arch ppc64 -arch i386 -arch x86_64" LDFLAGS="-arch ppc -arch ppc64 -arch i386 -arch x86_64" Thanks for the fabulous work guys! It's great to have an open and python-based alternative for scientific computing. In [2]: numpy.test() Numpy is installed in /Library/Python/2.5/site-packages/numpy Numpy version 1.1.0.dev5142 Python version 2.5.1 (r251:54863, Feb 4 2008, 21:48:13) [GCC 4.0.1 (Apple Inc. 
build 5465)] Found 15/15 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 281/281 tests for numpy.core.multiarray Found 69/69 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 93/93 tests for numpy.ma.core Found 14/14 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ .................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... ---------------------------------------------------------------------- Ran 996 tests in 1.784s OK Out[2]: -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From markbak at gmail.com Sun May 18 16:07:32 2008 From: markbak at gmail.com (mark) Date: Sun, 18 May 2008 13:07:32 -0700 (PDT) Subject: [Numpy-discussion] arbitrary precision arrays in numpy? In-Reply-To: References: <6d9f9639-1b1a-4ccd-8fd4-99b6f50d118c@2g2000hsn.googlegroups.com> Message-ID: I need it for a numerical back transformation from Laplace space. I found mpmath, which I think will do the trick Mark On May 18, 6:06 pm, "Charles R Harris" wrote: > Hi Mark, > > On Sun, May 18, 2008 at 9:37 AM, mark wrote: > > Hello list - > > > I could not find an option for arbitrary precision arrays in numpy. > > Did anybody implement this? > > > I would like to use something like 80 digits precision. > > No, we don't have this. What do you need it for? > > Chuck > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... 
at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From rex at nosyntax.com Sun May 18 17:38:04 2008 From: rex at nosyntax.com (rex linuxuser) Date: Sun, 18 May 2008 14:38:04 -0700 Subject: [Numpy-discussion] numpy with icc In-Reply-To: <20080518191438.GE30321@nosyntax.net> References: <5b8d13220805180628sd0f9b4ckbeb35d653a2ea489@mail.gmail.com> <20080518191438.GE30321@nosyntax.net> Message-ID: <5636a7270805181438j2d9cb2fj949257a5f8a7b240@mail.gmail.com> David, how do these environment variables compare with yours? Are you sure MKL is being used? Adjusted for your local path, what does the ldd command below show? ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so linux-gate.so.1 => (0xffffe000) libmkl_lapack.so => /opt/intel/mkl/ 10.0.3.020/lib/32/libmkl_lapack.so (0xb7ae4000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb79a6000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb77a9000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb7745000) libguide.so => /opt/intel/mkl/10.0.3.020/lib/32/libguide.so(0xb76e3000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb76b7000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7586000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7582000) /lib/ld-linux.so.2 (0x80000000) Relevant lines from .bashrc: source /opt/intel/cc/10.1.015/bin/iccvars.sh source /opt/intel/fc/10.1.015/bin/ifortvars.sh source /opt/intel/mkl/10.0.3.020/tools/environment/mklvars32.sh export PYTHONPATH=/usr/local/lib/python2.5/site-packages:/usr/lib/python2.5 The above statements set relevant environment variables: CPATH=/opt/intel/mkl/10.0.3.020/include FPATH=/opt/intel/mkl/10.0.3.020/include DYLD_LIBRARY_PATH=/opt/intel/fc/10.1.015/lib:/opt/intel/cc/10.1.015/lib INCLUDE=/opt/intel/mkl/10.0.3.020/include LD_LIBRARY_PATH=/opt/intel/mkl/ 10.0.3.020/lib/32:/opt/intel/fc/10.1.015/lib:/opt/intel/cc/10.1.015/lib LIBRARY_PATH=/opt/intel/mkl/10.0.3.020/lib/32 MANPATH=/opt/intel/mkl/ 10.0.3.020/man:/opt/intel/fc/10.1.015/man:/opt/intel/cc/10.1.015/m an:/opt/intel/cc/10.1.015/man:/usr/local/man:/usr/local/share/man:/usr/share/man MKLROOT=/opt/intel/mkl/10.0.3.020 PYTHONPATH=/usr/local/lib/python2.5/site-packages:/usr/lib/python2.5 Are we having fun yet? :( -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournapeau at cslab.kecl.ntt.co.jp Sun May 18 21:59:10 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 19 May 2008 10:59:10 +0900 Subject: [Numpy-discussion] numpy with icc In-Reply-To: <20080518191438.GE30321@nosyntax.net> References: <20080518191438.GE30321@nosyntax.net> Message-ID: <1211162350.28902.6.camel@bbc8> On Sun, 2008-05-18 at 12:14 -0700, rex wrote: > > The last relevant MKL library name change I'm aware of occurred > when MKL 9.X was released in 2006: > No, they heavily changed how to link against mkl in 10. There is a whole chapter about it in the releases notes. 
> ldd shows MKL was linked: > > # ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so > linux-gate.so.1 => (0xffffe000) > libmkl_lapack.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_lapack.so (0xb7ae1000) > /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb79a3000) > /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb77a6000) > /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb7742000) > libguide.so => /opt/intel/cc/10.1.015/lib/libguide.so (0xb76e0000) > libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb76b4000) > libimf.so => /opt/intel/cc/10.1.015/lib/libimf.so (0xb7484000) > libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb745f000) > libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7454000) > libintlc.so.5 => /opt/intel/cc/10.1.015/lib/libintlc.so.5 (0xb7411000) > libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb72e0000) > libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb72db000) > /lib/ld-linux.so.2 (0x80000000) > I don't have the same thing at all. You have many more libraries than I have, and that's because you were using the intel compiler, I think (libguide is from intel cc, imf is also used by icc and ifort). Remove the installer numpy AND the build directory, and retry building numpy with the mkl. > Interestingly, the above shows libmkl_lapack.so is being used even > though it is not in the site.cfg. Apparently, mkl and guide are > sufficient in site.cfg. I am not sure I understand: there is a mkl_lapack section in the site.cfg, and you need it. > So why do Erik and I get failures (with both gcc & icc) when MKL > is used and you don't? I don't know. I would ask Intel about this error if the above does not work, maybe you did not install it correctly, or there was a bug in your version (my version is a bit more recent, I downloaded it a few days ago). cheers, David From matt at snapbug.geek.nz Sun May 18 22:24:39 2008 From: matt at snapbug.geek.nz (Matt Crane) Date: Mon, 19 May 2008 14:24:39 +1200 Subject: [Numpy-discussion] Numpify this? In-Reply-To: References: <3d375d730805180019y244be423kb881efd83046e3de@mail.gmail.com> <3d375d730805180108p387f751dxfb0b193f9bd385e7@mail.gmail.com> <3d375d730805180152u7bab9e2lc7595a82da102b27@mail.gmail.com> Message-ID: On Sun, May 18, 2008 at 9:14 PM, Anne Archibald wrote: > 2008/5/18 Matt Crane : >> On Sun, May 18, 2008 at 8:52 PM, Robert Kern wrote: >>> Are there repeats? >> No, no repeats in the first column. >> >> I'm going to go get a cup of coffee before I forget to leave out any >> potentially vital information again. It's going to be a long day. > > It can be done, though I had to be kind of devious. My solution might > not even be O(n log n), depending on how mergesort is implemented: > Although this O(n log n) solution is written in numpy - and let's assume that the mergesort is implemented to give that. That's O(n log n) where n is the combined sizes of the arrays - although given that they are already sorted it might be straight linear because it only has to merge them? I know for sure that the solution that I originally posted was linear in terms of the combined sizes. While it might be slow because it's written in straight python - it perhaps might be quicker if one were to use scipy.weave blitz/inline? I think I might leave the discussion here until I can get some benchmarking on the different alternatives - thanks all. 
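If the underlying task is matching the sorted, repeat-free first columns of the two arrays against each other (an assumption, since the original problem statement is not quoted here), one linear-ish candidate worth adding to the benchmark is searchsorted. A sketch with made-up key columns:

import numpy as np

a_keys = np.array([1, 3, 4, 7, 9])           # hypothetical sorted first column of array A
b_keys = np.array([2, 3, 7, 8])              # hypothetical sorted first column of array B

idx = a_keys.searchsorted(b_keys)            # insertion points of b_keys within a_keys
idx = idx.clip(0, len(a_keys) - 1)           # guard against running off the end
mask = a_keys[idx] == b_keys                 # which b_keys actually occur in a_keys
print b_keys[mask]                           # -> [3 7]

Timing this against the merge-based version on realistic sizes should settle the question.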
Matt From millman at berkeley.edu Sun May 18 22:47:19 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 18 May 2008 19:47:19 -0700 Subject: [Numpy-discussion] Which to test: 1.1.x or 1.1.0rc1? In-Reply-To: <33644d3c0805181256g2ccfad1q1f814b919d726c59@mail.gmail.com> References: <33644d3c0805181256g2ccfad1q1f814b919d726c59@mail.gmail.com> Message-ID: On Sun, May 18, 2008 at 12:56 PM, James Snyder wrote: > I've been running out of trunk recently, and I've noted that an rc release > has appeared and the 1.1.x branch has been regenerated. > > Which would be most helpful to provide feedback from? Hmmm. I deleted the 1.1.x branch and it doesn't appear to exist anymore. How did you get it? Please test the 1.1.0rc1: http://projects.scipy.org/scipy/numpy/browser/tags/1.1.0rc1 -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From rex at nosyntax.com Mon May 19 05:58:45 2008 From: rex at nosyntax.com (rex) Date: Mon, 19 May 2008 02:58:45 -0700 Subject: [Numpy-discussion] numpy with icc/MKL Message-ID: <20080519095845.GA25442@nosyntax.net> > No, they heavily changed how to link against mkl in 10. There is a whole > chapter about it in the releases notes. Yes, I read it, but it appears to me that the new layer libraries are an option, and that the legacy link format still works. From chapter 3: "Pure layered libraries give more flexibility to choose the appropriate combination of libraries but do not have backward compatibility by library names in link lines. Dummy libraries are introduced to provide backward compatibility with earlier version of Intel MKL, which did not use layered libraries. Dummy libraries do not contain any functionality, but only dependencies on a set of layered libraries. Placed in a link line, dummy libraries enable omitting dependent layered libraries, which will be linked automatically. Dummy libraries contain dependency on the following layered libraries (default principle): - Interface: Intel, LP64 - Threading: Intel compiled - Computational: the computation library. So, if you employ the above interface and use OpenMP* threading provided by the Intel compiler, you may not change your link lines." > I don't have the same thing at all. You have many more libraries than I > have, and that's because you were using the intel compiler, I think > (libguide is from intel cc, imf is also used by icc and ifort). >> Remove the installer numpy AND the build directory, and retry building >> numpy with the mkl. I always remove the build directory (if I forget the much faster compilation reminds me). Do you mean remove the installed numpy? Did that, built numpy again, and it fails numpy.test() exactly as before. I changed site.cfg to: [mkl] library_dirs = /opt/intel/mkl/10.0.3.020/lib/32 lapack_libs = mkl_lapack mkl_libs = mkl, mkl_vml_p4m, guide (the vml is appropriate for my CPU). 
The build log shows: F2PY Version 2_5189 blas_opt_info: blas_mkl_info: FOUND: libraries = ['mkl', 'mkl_vml_p4m', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/opt/intel/mkl/10.0.3.020/include'] FOUND: libraries = ['mkl', 'mkl_vml_p4m', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/opt/intel/mkl/10.0.3.020/include'] lapack_opt_info: lapack_mkl_info: mkl_info: FOUND: libraries = ['mkl', 'mkl_vml_p4m', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/opt/intel/mkl/10.0.3.020/include'] FOUND: libraries = ['mkl_lapack', 'mkl', 'mkl_vml_p4m', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/opt/intel/mkl/10.0.3.020/include'] FOUND: libraries = ['mkl_lapack', 'mkl', 'mkl_vml_p4m', 'guide', 'pthread'] library_dirs = ['/opt/intel/mkl/10.0.3.020/lib/32'] So it's finding the vector math library, but checking with ldd shows that of all the *.so libs in site-packages/numpy only lapack_lite.so is using vml. Any thoughts on why none of the other libs are using it? ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so linux-gate.so.1 => (0xffffe000) libmkl_lapack.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_lapack.so (0xb7a48000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb790a000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb770d000) /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb76a9000) libmkl_vml_p4m.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_vml_p4m.so (0xb73bd000) libguide.so => /opt/intel/mkl/10.0.3.020/lib/32/libguide.so (0xb735a000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb732f000) libimf.so => /opt/intel/fc/10.1.015/lib/libimf.so (0xb70ff000) libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb70da000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb70cf000) libintlc.so.5 => /opt/intel/fc/10.1.015/lib/libintlc.so.5 (0xb708c000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb6f5a000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb6f56000) /lib/ld-linux.so.2 (0x80000000) As you can see from the above, it uses libmkl_vml_p4m, but it's the only *.so that does. It would be interesting to see the output of these commands below on a box running BlAS and Lapack to see how many of the *.so libs use them. ldd /usr/local/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so ldd /usr/local/lib/python2.5/site-packages/numpy/lib/_compiled_base.so ldd /usr/local/lib/python2.5/site-packages/numpy//core/multiarray.so ldd /usr/local/lib/python2.5/site-packages/numpy//core/_dotblas.so ldd /usr/local/lib/python2.5/site-packages/numpy//core/_sort.so ldd /usr/local/lib/python2.5/site-packages/numpy/core/scalarmath.so ldd /usr/local/lib/python2.5/site-packages/numpy/core/umath.so ldd /usr/local/lib/python2.5/site-packages/numpy/random/mtrand.so ldd /usr/local/lib/python2.5/site-packages/numpy/numarray/_capi.so > > Interestingly, the above shows libmkl_lapack.so is being used even > > though it is not in the site.cfg. Apparently, mkl and guide are > > sufficient in site.cfg. > I am not sure I understand: there is a mkl_lapack section in the > site.cfg, and you need it. Sorry, I'm going blind from fatigue. :( > > So why do Erik and I get failures (with both gcc & icc) when MKL > > is used and you don't? > I don't know. 
I would ask Intel about this error if the above does not > work, maybe you did not install it correctly, or there was a bug in your > version (my version is a bit more recent, I downloaded it a few days > ago). In your list post you show mkl/10.0.1.014/. I'm using 10.0.3.020, but I've tried an older 10.0.X version with no better luck. BTW, my build runs, i.e., I can run some programs that use NumPy. I have posted to the Intel MKL forum. No responses yet. -rex I am not goin' to buy my kids an encyclopedia. Let them walk to school the way I did. From p.e.creasey.00 at googlemail.com Mon May 19 06:23:22 2008 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Mon, 19 May 2008 11:23:22 +0100 Subject: [Numpy-discussion] Generalised inner product Message-ID: <6be8b94a0805190323h71c116f2wc7b2e345ff29c47b@mail.gmail.com> Hi, Does numpy have some sort of generalised inner product? For example I have arrays a.shape = (5,6,7) b.shape = (8,7,9,10) and I want to perform a product over the 3rd axis of a and the 2nd of b, i.e. c[i,j,k,l,m] = sum (over x) of a[i,j,x] * b[k,x,l,m] I guess I could do it with swapaxes and numpy.dot or numpy.inner but I wondered if there was a general function. Thanks, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbsnyder at gmail.com Mon May 19 08:16:23 2008 From: jbsnyder at gmail.com (James Snyder) Date: Mon, 19 May 2008 07:16:23 -0500 Subject: [Numpy-discussion] Which to test: 1.1.x or 1.1.0rc1? In-Reply-To: References: <33644d3c0805181256g2ccfad1q1f814b919d726c59@mail.gmail.com> Message-ID: <33644d3c0805190516i95c2b6t646f66138d535300@mail.gmail.com> I've been using git-svn, so I suppose I'm pulling the last rev that was in 1.1.x. Checked out the RC, looks like there are more unit tests, but they all still pass for me: In [2]: numpy.test() Numpy is installed in /Library/Python/2.5/site-packages/numpy Numpy version 1.1.0.dev5142 Python version 2.5.1 (r251:54863, Feb 4 2008, 21:48:13) [GCC 4.0.1 (Apple Inc. 
build 5465)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ ............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ---------------------------------------------------------------------- Ran 1004 tests in 1.342s OK Out[2]: On Sun, May 18, 2008 at 9:47 PM, Jarrod Millman wrote: > On Sun, May 18, 2008 at 12:56 PM, James Snyder wrote: >> I've been running out of trunk recently, and I've noted that an rc release >> has appeared and the 1.1.x branch has been regenerated. >> >> Which would be most helpful to provide feedback from? > > Hmmm. I deleted the 1.1.x branch and it doesn't appear to exist > anymore. How did you get it? 
> > Please test the 1.1.0rc1: > http://projects.scipy.org/scipy/numpy/browser/tags/1.1.0rc1 > > > -- > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com From charlesr.harris at gmail.com Mon May 19 09:03:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 May 2008 07:03:09 -0600 Subject: [Numpy-discussion] Generalised inner product In-Reply-To: <6be8b94a0805190323h71c116f2wc7b2e345ff29c47b@mail.gmail.com> References: <6be8b94a0805190323h71c116f2wc7b2e345ff29c47b@mail.gmail.com> Message-ID: On Mon, May 19, 2008 at 4:23 AM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > Hi, > > Does numpy have some sort of generalised inner product? For example I have > arrays > > a.shape = (5,6,7) > b.shape = (8,7,9,10) > > and I want to perform a product over the 3rd axis of a and the 2nd of b, > i.e. > > c[i,j,k,l,m] = sum (over x) of a[i,j,x] * b[k,x,l,m] > > I guess I could do it with swapaxes and numpy.dot or numpy.inner but I > wondered if there was a general function. > Try tensordot. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon May 19 08:58:03 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 19 May 2008 21:58:03 +0900 Subject: [Numpy-discussion] numpy with icc/MKL In-Reply-To: <20080519095845.GA25442@nosyntax.net> References: <20080519095845.GA25442@nosyntax.net> Message-ID: <4831795B.40303@ar.media.kyoto-u.ac.jp> rex wrote: > > I always remove the build directory (if I forget the much faster > compilation reminds me). Do you mean remove the installed numpy? > Yes. > Did that, built numpy again, and it fails numpy.test() exactly as before. > > I changed site.cfg to: > > [mkl] > library_dirs = /opt/intel/mkl/10.0.3.020/lib/32 > lapack_libs = mkl_lapack > mkl_libs = mkl, mkl_vml_p4m, guide > Please follow exactly my instruction, otherwise, we cannot compare what we are doing: use exactly the same site.cfg as me. > > ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so > linux-gate.so.1 => (0xffffe000) > libmkl_lapack.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_lapack.so (0xb7a48000) > /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb790a000) > /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb770d000) > /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb76a9000) > libmkl_vml_p4m.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_vml_p4m.so (0xb73bd000) > libguide.so => /opt/intel/mkl/10.0.3.020/lib/32/libguide.so (0xb735a000) > libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb732f000) > libimf.so => /opt/intel/fc/10.1.015/lib/libimf.so (0xb70ff000) > libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb70da000) > libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb70cf000) > libintlc.so.5 => /opt/intel/fc/10.1.015/lib/libintlc.so.5 (0xb708c000) > libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb6f5a000) > libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb6f56000) > /lib/ld-linux.so.2 (0x80000000) > You still have libraries linked to intel compilers (libintlc, libimf). If you remove both the build directory and installed numpy, I don't why those are used. 
As a last resort, can you put aside /opt/intel/fc temporally ? > As you can see from the above, it uses libmkl_vml_p4m, but it's > the only *.so that does. It would be interesting to see the output > of these commands below on a box running BlAS and Lapack to see how many > of the *.so libs use them. > > ldd /usr/local/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so > ldd /usr/local/lib/python2.5/site-packages/numpy/lib/_compiled_base.so > ldd /usr/local/lib/python2.5/site-packages/numpy//core/multiarray.so > ldd /usr/local/lib/python2.5/site-packages/numpy//core/_dotblas.so > ldd /usr/local/lib/python2.5/site-packages/numpy//core/_sort.so > ldd /usr/local/lib/python2.5/site-packages/numpy/core/scalarmath.so > ldd /usr/local/lib/python2.5/site-packages/numpy/core/umath.so > ldd /usr/local/lib/python2.5/site-packages/numpy/random/mtrand.so > ldd /usr/local/lib/python2.5/site-packages/numpy/numarray/_capi.so > Only _dotblas and lapack_lite use blas/lapack on numpy, AFAIK. They may be linked to other extensions, but won't be used. > In your list post you show mkl/10.0.1.014/. I'm using 10.0.3.020, > but I've tried an older 10.0.X version with no better luck. BTW, > my build runs, i.e., I can run some programs that use NumPy. > > You are not using the free version, right ? The problem is that the MKL error has no clear meaning, and google did not return anything meaningful. Maybe your environment is corrupted in some way, because something is likely to influence how MKL is initialized. But what exactly, I have no idea. cheers, David From ndbecker2 at gmail.com Mon May 19 09:20:21 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 19 May 2008 09:20:21 -0400 Subject: [Numpy-discussion] noncentral_chisquare buglet? Message-ID: def noncentral_chisquare(self, df, nonc, size=None): """Noncentral Chi^2 distribution. noncentral_chisquare(df, nonc, size=None) -> random values """ cdef ndarray odf, ononc cdef double fdf, fnonc fdf = PyFloat_AsDouble(df) fnonc = PyFloat_AsDouble(nonc) if not PyErr_Occurred(): if fdf <= 1: raise ValueError("df <= 0") << > > c[i,j,k,l,m] = sum (over x) of a[i,j,x] * b[k,x,l,m] > > > > Try tensordot. > > Chuck That was exactly what I needed. Thanks! Peter From orest.kozyar at gmail.com Mon May 19 10:34:37 2008 From: orest.kozyar at gmail.com (Orest Kozyar) Date: Mon, 19 May 2008 10:34:37 -0400 Subject: [Numpy-discussion] Slicing a numpy array and getting the "complement" Message-ID: Given a slice, such as s_[..., :-2:], is it possible to take the complement of this slice? Specifically, s_[..., ::-2]. I have a series of 2D arrays that I need to split into two subarrays via slicing where the members of the second array are all the members leftover from the slice. The problem is that the slice itself will vary, and could be anything such as s_[..., 1:4:] or s_[..., 1:-4:], etc, so I'm wondering if there's a straightforward idiom or routine in Numpy that would facilitate taking the complement of a slice? I've looked around the docs, and have not had much luck. Thanks! Orest From rex at nosyntax.com Mon May 19 10:35:06 2008 From: rex at nosyntax.com (rex) Date: Mon, 19 May 2008 07:35:06 -0700 Subject: [Numpy-discussion] numpy with icc/MKL In-Reply-To: <4831795B.40303@ar.media.kyoto-u.ac.jp> Message-ID: <20080519143506.GA26876@nosyntax.net> > Please follow exactly my instruction, otherwise, we cannot compare what > we are doing: use exactly the same site.cfg as me. 
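A footnote to the generalised inner product exchange above: the tensordot suggestion, written out for the shapes in that question, is just

import numpy as np

a = np.random.rand(5, 6, 7)
b = np.random.rand(8, 7, 9, 10)

# c[i,j,k,l,m] = sum over x of a[i,j,x] * b[k,x,l,m]
c = np.tensordot(a, b, axes=([2], [1]))
print c.shape                                # (5, 6, 8, 9, 10)

with no explicit swapaxes needed.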
OK, I used the same MKL version you did (10.0.1.014), the same site.cfg, and set .bashrc to do: source /opt/intel/mkl/10.0.1.014/tools/environment/mklvars32.sh and compiled 1.1.0rc1 with: python setup.py config build_clib build_ext install \ prefix=/usr/local/ > buildXX Then I ran: # python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ ............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ---------------------------------------------------------------------- Ran 1004 tests in 1.429s OK >>> So it works. :) ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so linux-gate.so.1 => (0xffffe000) libmkl_lapack.so => /opt/intel/mkl/10.0.1.014/lib/32/libmkl_lapack.so (0xb7a53000) /opt/intel/mkl/10.0.1.014/lib/32/libmkl_intel.so (0xb791f000) /opt/intel/mkl/10.0.1.014/lib/32/libmkl_intel_thread.so (0xb7735000) /opt/intel/mkl/10.0.1.014/lib/32/libmkl_core.so (0xb76d9000) libguide.so => /opt/intel/mkl/10.0.1.014/lib/32/libguide.so (0xb767f000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7653000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7522000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb751e000) /lib/ld-linux.so.2 (0x80000000) No more links to icc libs. 
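Besides ldd, a quick check from inside the interpreter shows what the installed numpy was built against, and a crude timing hints at whether an optimized BLAS is actually being hit. A sketch, assuming numpy.show_config() is present in this build and treating the timing only as a rough indicator:

import time
import numpy as np

np.show_config()                             # lists the blas/lapack libraries found at build time

a = np.random.rand(1000, 1000)
t0 = time.time()
np.dot(a, a)
print "1000x1000 dot: %.2f s" % (time.time() - t0)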
>> ldd /usr/local/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so >> linux-gate.so.1 => (0xffffe000) >> libmkl_lapack.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_lapack.so (0xb7a48000) >> /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel.so (0xb790a000) >> /opt/intel/mkl/10.0.3.020/lib/32/libmkl_intel_thread.so (0xb770d000) >> /opt/intel/mkl/10.0.3.020/lib/32/libmkl_core.so (0xb76a9000) >> libmkl_vml_p4m.so => /opt/intel/mkl/10.0.3.020/lib/32/libmkl_vml_p4m.so (0xb73bd000) >> libguide.so => /opt/intel/mkl/10.0.3.020/lib/32/libguide.so (0xb735a000) >> libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb732f000) >> libimf.so => /opt/intel/fc/10.1.015/lib/libimf.so (0xb70ff000) >> libintlc.so.5 => /opt/intel/fc/10.1.015/lib/libintlc.so.5 (0xb708c000) > You still have libraries linked to intel compilers (libintlc, libimf). > If you remove both the build directory and installed numpy, I don't why > those are used. As a last resort, can you put aside /opt/intel/fc > temporally ? They are linked because that build was with the Intel compiler. As you can see from the new ldd above, those links are not there when I build with gcc. > You are not using the free version, right ? The problem is that the MKL > error has no clear meaning, and google did not return anything > meaningful. Maybe your environment is corrupted in some way, because > something is likely to influence how MKL is initialized. But what > exactly, I have no idea. AFAIK, there is one MKL version for each release. The only difference is the licensing. Next step is to try icc instead of gcc, and if that works, try the latest MKL (10.0.3.020). Thanks, -rex -- I almost stole another tagline. I'm so ashamed. From nripunsredar at gmail.com Mon May 19 11:13:27 2008 From: nripunsredar at gmail.com (Nripun Sredar) Date: Mon, 19 May 2008 10:13:27 -0500 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <482E8A85.6010606@ar.media.kyoto-u.ac.jp> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> <482E8A85.6010606@ar.media.kyoto-u.ac.jp> Message-ID: <7671f4e40805190813n6027e4f3s58c47dea7b50c605@mail.gmail.com> I am running on Windows Xp, Intel Xeon CPU. I'd like to fill in a few more things here. If I send 0 in the second and third argument of svd then I get the singular_values, but if its 1 then the problem persists. I've tried this on sparse and non-sparse matrices. This is with the latest windows binaries numpy-1.0.4.win32-py2.5.msi. On Sat, May 17, 2008 at 2:34 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Nripun Sredar wrote: > > I have a sparse matrix 416x52. I tried to factorize this matrix using > > svd from numpy. But it didn't produce a result and looked like it is > > in an infinite loop. > > I tried a similar operation using random numbers in the matrix. Even > > this is in an infinite loop. > > Did anyone else face a similar problem? > > Can anyone please give some suggestions? > > Are you on windows ? What is the CPU on your machine ? I suspect this is > caused by windows binaries which shipped blas/lapack without support for > "old" CPU. > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
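On the "0 in the second and third argument" point above: in numpy.linalg.svd those arguments are full_matrices and compute_uv, so passing 0 for the third one skips computing u and vt and returns only the singular values, which matches the report that only that variant completes. A small sketch using the 416x52 shape from the original post (random values, not the actual data):

import numpy as np

a = np.random.rand(416, 52)

s = np.linalg.svd(a, full_matrices=0, compute_uv=0)   # singular values only
u, s, vt = np.linalg.svd(a, full_matrices=0)          # thin SVD: u is (416, 52), vt is (52, 52)
print s.shape                                          # (52,)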
URL: From rex at nosyntax.com Mon May 19 11:22:40 2008 From: rex at nosyntax.com (rex) Date: Mon, 19 May 2008 08:22:40 -0700 Subject: [Numpy-discussion] numpy with icc/MKL In-Reply-To: <20080519143506.GA26876@nosyntax.net> Message-ID: <20080519152240.GB26876@nosyntax.net> > Next step is to try icc instead of gcc, and if that works, try > the latest MKL (10.0.3.020). OK, either I've got a corrupted copy of MKL 10.0.3.020, or it has a problem. Building with icc & MKL 10.0.1.014 works. Erik, are you reading this? If so, roll back to MKL 10.0.014 and it should work, both with gcc and icc. root at c2d0:/usr/local/src/1.1.0rc1# python setup.py config --compiler=intel build_clib --compiler=intel build_ext --compiler=intel install --prefix=/usr/local/ >build25 Running from numpy source directory. root at c2d0:/usr/local/src/1.1.0rc1# cd .. root at c2d0:/usr/local/src# python Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ 
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ---------------------------------------------------------------------- Ran 1004 tests in 1.041s OK >>> From bsouthey at gmail.com Mon May 19 11:31:10 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 19 May 2008 10:31:10 -0500 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <7671f4e40805190813n6027e4f3s58c47dea7b50c605@mail.gmail.com> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> <482E8A85.6010606@ar.media.kyoto-u.ac.jp> <7671f4e40805190813n6027e4f3s58c47dea7b50c605@mail.gmail.com> Message-ID: <48319D3E.1030901@gmail.com> Nripun Sredar wrote: > I am running on Windows Xp, Intel Xeon CPU. I'd like to fill in a few > more things here. If I send 0 in the second and third argument of svd > then I get the singular_values, but if its 1 then the problem > persists. I've tried this on sparse and non-sparse matrices. This is > with the latest windows binaries numpy-1.0.4.win32-py2.5.msi. > > > > > On Sat, May 17, 2008 at 2:34 AM, David Cournapeau > > > wrote: > > Nripun Sredar wrote: > > I have a sparse matrix 416x52. I tried to factorize this matrix > using > > svd from numpy. But it didn't produce a result and looked like it is > > in an infinite loop. > > I tried a similar operation using random numbers in the matrix. Even > > this is in an infinite loop. > > Did anyone else face a similar problem? > > Can anyone please give some suggestions? > > Are you on windows ? What is the CPU on your machine ? I suspect > this is > caused by windows binaries which shipped blas/lapack without > support for > "old" CPU. > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Hi, The issue relates to the closed ticket 627: http://www.scipy.org/scipy/numpy/ticket/627 Please update to the latest Numpy 1.1 (rc1 is available) or, as a temporary measure, use the installer: http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-superpack-python2.5.exe That uses numpy version 1.0.5.dev5008 which should solve you problem. 
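When upgrading, it is also worth confirming which copy of numpy actually gets imported, since a stale install earlier on the path will hide the fix. A trivial check:

import numpy
print numpy.__version__                      # e.g. 1.1.0rc1, or 1.0.5.dev5008 for the superpack
print numpy.__file__                         # shows which installed copy is in use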
Regards Bruce From nripunsredar at gmail.com Mon May 19 12:17:57 2008 From: nripunsredar at gmail.com (Nripun Sredar) Date: Mon, 19 May 2008 11:17:57 -0500 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <48319D3E.1030901@gmail.com> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> <482E8A85.6010606@ar.media.kyoto-u.ac.jp> <7671f4e40805190813n6027e4f3s58c47dea7b50c605@mail.gmail.com> <48319D3E.1030901@gmail.com> Message-ID: <7671f4e40805190917w6c9ca6cfr3029d74bc9d26b00@mail.gmail.com> Thank You.. The problem is resolved On Mon, May 19, 2008 at 10:31 AM, Bruce Southey wrote: > Nripun Sredar wrote: > > I am running on Windows Xp, Intel Xeon CPU. I'd like to fill in a few > > more things here. If I send 0 in the second and third argument of svd > > then I get the singular_values, but if its 1 then the problem > > persists. I've tried this on sparse and non-sparse matrices. This is > > with the latest windows binaries numpy-1.0.4.win32-py2.5.msi. > > > > > > > > > > On Sat, May 17, 2008 at 2:34 AM, David Cournapeau > > > > > wrote: > > > > Nripun Sredar wrote: > > > I have a sparse matrix 416x52. I tried to factorize this matrix > > using > > > svd from numpy. But it didn't produce a result and looked like it > is > > > in an infinite loop. > > > I tried a similar operation using random numbers in the matrix. > Even > > > this is in an infinite loop. > > > Did anyone else face a similar problem? > > > Can anyone please give some suggestions? > > > > Are you on windows ? What is the CPU on your machine ? I suspect > > this is > > caused by windows binaries which shipped blas/lapack without > > support for > > "old" CPU. > > > > cheers, > > > > David > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > Hi, > The issue relates to the closed ticket 627: > http://www.scipy.org/scipy/numpy/ticket/627 > > Please update to the latest Numpy 1.1 (rc1 is available) or, as a > temporary measure, use the installer: > > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-superpack-python2.5.exe > > > That uses numpy version 1.0.5.dev5008 which should solve you problem. > > Regards > Bruce > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Mon May 19 12:20:22 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 19 May 2008 12:20:22 -0400 Subject: [Numpy-discussion] Slicing a numpy array and getting the "complement" In-Reply-To: References: Message-ID: 2008/5/19 Orest Kozyar : > Given a slice, such as s_[..., :-2:], is it possible to take the > complement of this slice? Specifically, s_[..., ::-2]. I have a > series of 2D arrays that I need to split into two subarrays via > slicing where the members of the second array are all the members > leftover from the slice. 
The problem is that the slice itself will > vary, and could be anything such as s_[..., 1:4:] or s_[..., 1:-4:], > etc, so I'm wondering if there's a straightforward idiom or routine in > Numpy that would facilitate taking the complement of a slice? I've > looked around the docs, and have not had much luck. If you are using boolean indexing, of course complements are easy (just use ~). But if you want slice indexing, so that you get views, sometimes the complement cannot be expressed as a slice: for example: A = np.arange(10) A[2:4] The complement of A[2:4] is np.concatenate((A[:2],A[4:])). Things become even more complicated if you start skipping elements. If you don't mind fancy indexing, you can convert your index arrays into boolean form: complement = A==A complement[idx] = False Anne From erik.nugent at gmail.com Mon May 19 12:22:54 2008 From: erik.nugent at gmail.com (Erik Nugent) Date: Mon, 19 May 2008 10:22:54 -0600 Subject: [Numpy-discussion] numpy with icc/MKL In-Reply-To: <20080519152240.GB26876@nosyntax.net> References: <20080519143506.GA26876@nosyntax.net> <20080519152240.GB26876@nosyntax.net> Message-ID: <97db231b0805190922u46ab8242m825422ea5bdd8579@mail.gmail.com> I'm here... i am rolling back now and will post my results... e On Mon, May 19, 2008 at 9:22 AM, rex wrote: >> Next step is to try icc instead of gcc, and if that works, try >> the latest MKL (10.0.3.020). > > OK, either I've got a corrupted copy of MKL 10.0.3.020, or it has > a problem. Building with icc & MKL 10.0.1.014 works. > > Erik, are you reading this? If so, roll back to MKL 10.0.014 and it > should work, both with gcc and icc. > > root at c2d0:/usr/local/src/1.1.0rc1# python setup.py config --compiler=intel build_clib --compiler=intel build_ext --compiler=intel install --prefix=/usr/local/ >build25 > Running from numpy source directory. > root at c2d0:/usr/local/src/1.1.0rc1# cd .. > root at c2d0:/usr/local/src# python > Python 2.5 (release25-maint, Dec 9 2006, 14:35:53) > [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
>>>> import numpy >>>> numpy.test() > Numpy is installed in /usr/local/lib/python2.5/site-packages/numpy > Numpy version 1.1.0rc1 > Python version 2.5 (release25-maint, Dec 9 2006, 14:35:53) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] > Found 18/18 tests for numpy.core.defmatrix > Found 3/3 tests for numpy.core.memmap > Found 283/283 tests for numpy.core.multiarray > Found 70/70 tests for numpy.core.numeric > Found 36/36 tests for numpy.core.numerictypes > Found 12/12 tests for numpy.core.records > Found 7/7 tests for numpy.core.scalarmath > Found 16/16 tests for numpy.core.umath > Found 5/5 tests for numpy.ctypeslib > Found 5/5 tests for numpy.distutils.misc_util > Found 2/2 tests for numpy.fft.fftpack > Found 3/3 tests for numpy.fft.helper > Found 24/24 tests for numpy.lib._datasource > Found 10/10 tests for numpy.lib.arraysetops > Found 1/1 tests for numpy.lib.financial > Found 0/0 tests for numpy.lib.format > Found 53/53 tests for numpy.lib.function_base > Found 5/5 tests for numpy.lib.getlimits > Found 6/6 tests for numpy.lib.index_tricks > Found 15/15 tests for numpy.lib.io > Found 1/1 tests for numpy.lib.machar > Found 4/4 tests for numpy.lib.polynomial > Found 49/49 tests for numpy.lib.shape_base > Found 15/15 tests for numpy.lib.twodim_base > Found 43/43 tests for numpy.lib.type_check > Found 1/1 tests for numpy.lib.ufunclike > Found 89/89 tests for numpy.linalg > Found 94/94 tests for numpy.ma.core > Found 15/15 tests for numpy.ma.extras > Found 7/7 tests for numpy.random > Found 16/16 tests for numpy.testing.utils > Found 0/0 tests for __main__ > .............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. > .............. > ---------------------------------------------------------------------- > Ran 1004 tests in 1.041s > > OK > >>>> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Erik Nugent System Administrator Department of Computer Science The University of Montana - Missoula (406) 243-2812 V (406) 243-5139 F From peck at spss.com Mon May 19 12:33:41 2008 From: peck at spss.com (Peck, Jon) Date: Mon, 19 May 2008 11:33:41 -0500 Subject: [Numpy-discussion] Noncentral chi square In-Reply-To: Message-ID: <5CFEFDB5226CB54CBB4328B9563A12EE0940DE9E@hqemail2.spss.com> Message: 1 Date: Mon, 19 May 2008 09:20:21 -0400 From: Neal Becker Subject: [Numpy-discussion] noncentral_chisquare buglet? 
To: numpy-discussion at scipy.org Message-ID: Content-Type: text/plain; charset=us-ascii def noncentral_chisquare(self, df, nonc, size=None): """Noncentral Chi^2 distribution. noncentral_chisquare(df, nonc, size=None) -> random values """ cdef ndarray odf, ononc cdef double fdf, fnonc fdf = PyFloat_AsDouble(df) fnonc = PyFloat_AsDouble(nonc) if not PyErr_Occurred(): if fdf <= 1: raise ValueError("df <= 0") << I think this message should be "df <= 1"? [>>>Peck, Jon] Isn't it rather that the message is correct but the test is wrong? Shouldn't it be if fdf <= 0 ? -Jon Peck From pgmdevlist at gmail.com Mon May 19 12:33:09 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 19 May 2008 12:33:09 -0400 Subject: [Numpy-discussion] Cookbook/Documentation Message-ID: <200805191233.10380.pgmdevlist@gmail.com> All, * I've just noticed that the page describing RecordArrays (http://www.scipy.org/RecordArrays) is not listed under the Cookbook: should this be changed ? Shouldn't there be at least a link in the documentation page ? * Same problem with Subclasses (http://www.scipy.org/Subclasses) * I was eventually considering writing down some basic docs for MaskedArrays: should I create a page under the Cookbook ? Elsewhere ? * Congrats for the DocMarathon initiative ! The 3 points I've just raised would fit nicely with objective #3 (reference sections): what's the plan for that ? Any specific directions to follow ? From robert.kern at gmail.com Mon May 19 12:47:52 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 11:47:52 -0500 Subject: [Numpy-discussion] Slicing a numpy array and getting the "complement" In-Reply-To: References: Message-ID: <3d375d730805190947t1c882bf9g5124f2ad4bff3af2@mail.gmail.com> On Mon, May 19, 2008 at 9:34 AM, Orest Kozyar wrote: > Given a slice, such as s_[..., :-2:], is it possible to take the > complement of this slice? Specifically, s_[..., ::-2]. Hmm, that doesn't look like the complement. Did you mean s_[..., -2:] and s_[..., :-2]? > I have a > series of 2D arrays that I need to split into two subarrays via > slicing where the members of the second array are all the members > leftover from the slice. The problem is that the slice itself will > vary, and could be anything such as s_[..., 1:4:] or s_[..., 1:-4:], > etc, so I'm wondering if there's a straightforward idiom or routine in > Numpy that would facilitate taking the complement of a slice? I've > looked around the docs, and have not had much luck. In general, for any given slice, there may not be a slice giving the complement. For example, the complement of arange(6)[1:4] should be array([0,4,5]), but there is no slice which can make that. Things get even more difficult with start:stop:step slices let alone simultaneous multidimensional slices. Can you be more specific as to exactly the variety of slices you need to support? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From robert.kern at gmail.com Mon May 19 12:50:30 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 11:50:30 -0500 Subject: [Numpy-discussion] Noncentral chi square In-Reply-To: <5CFEFDB5226CB54CBB4328B9563A12EE0940DE9E@hqemail2.spss.com> References: <5CFEFDB5226CB54CBB4328B9563A12EE0940DE9E@hqemail2.spss.com> Message-ID: <3d375d730805190950s8197bcayc26bc60a713d493b@mail.gmail.com> On Mon, May 19, 2008 at 11:33 AM, Peck, Jon wrote: > Message: 1 > Date: Mon, 19 May 2008 09:20:21 -0400 > From: Neal Becker > Subject: [Numpy-discussion] noncentral_chisquare buglet? > To: numpy-discussion at scipy.org > Message-ID: > Content-Type: text/plain; charset=us-ascii > > def noncentral_chisquare(self, df, nonc, size=None): > """Noncentral Chi^2 distribution. > > noncentral_chisquare(df, nonc, size=None) -> random values > """ > cdef ndarray odf, ononc > cdef double fdf, fnonc > fdf = PyFloat_AsDouble(df) > fnonc = PyFloat_AsDouble(nonc) > if not PyErr_Occurred(): > if fdf <= 1: > raise ValueError("df <= 0") << > I think this message should be "df <= 1"? > > [>>>Peck, Jon] > Isn't it rather that the message is correct but the test is wrong? Shouldn't it be > if fdf <= 0 ? Yes, you are correct. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Mon May 19 13:02:13 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 19 May 2008 19:02:13 +0200 Subject: [Numpy-discussion] Cookbook/Documentation In-Reply-To: <200805191233.10380.pgmdevlist@gmail.com> References: <200805191233.10380.pgmdevlist@gmail.com> Message-ID: <9457e7c80805191002v4659cd6g5d2ba6e6be4e0134@mail.gmail.com> Hi Pierre 2008/5/19 Pierre GM : > * I've just noticed that the page describing RecordArrays > (http://www.scipy.org/RecordArrays) is not listed under the Cookbook: should > this be changed ? Shouldn't there be at least a link in the documentation > page ? How about we add those pages to a CookBookCategory and auto-generate the Cookbook (like I've done with ProposedEnhancements)? > * I was eventually considering writing down some basic docs for MaskedArrays: > should I create a page under the Cookbook ? Elsewhere ? That's a good place for now. Use ReST (on the wiki, use {{{#!rst }}}), then we can incorporate your work into the user guide later. Regards St?fan From ndbecker2 at gmail.com Mon May 19 13:52:56 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 19 May 2008 13:52:56 -0400 Subject: [Numpy-discussion] 1.1.0rc1 tagged References: Message-ID: Jarrod Millman wrote: > Please test the release candidate: > svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1 > > Also please review the release notes: > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > > I am going to ask Chris and David to create Windows and Mac binaries, > which I hope they will have time to create ASAP. > > Sorry that it has taken me so long, I am on vacation with my family > and am having a difficult time getting on my computer. > > Thanks, > Built OK on Fedora F9 x86_64 using ['lapack', 'f77blas', 'cblas', 'atlas'] Used rpmbuild with slightly modified version of fedora 9 spec file. 
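As a quick sanity check of the sort used in the test reports in this thread, the following confirms which numpy is actually being picked up and runs its test suite (a sketch; the version string and path will of course differ per install):

    import numpy
    print(numpy.__version__)   # e.g. '1.1.0rc1'
    print(numpy.__file__)      # which installation is on the path
    numpy.test()               # the bundled test suite, as in the reports above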
From jbsnyder at gmail.com Mon May 19 14:08:49 2008 From: jbsnyder at gmail.com (James Snyder) Date: Mon, 19 May 2008 13:08:49 -0500 Subject: [Numpy-discussion] Quick Question about Optimization Message-ID: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Hi - First off, I know that optimization is evil, and I should make sure that everything works as expected prior to bothering with squeezing out extra performance, but the situation is that this particular block of code works, but it is about half as fast with numpy as in matlab, and I'm wondering if there's a better approach than what I'm doing. I have a chunk of code, below, that generally iterates over 2000 iterations, and the vectors that are being worked on at a given step generally have ~14000 elements in them. In matlab, doing pretty much exactly the same thing takes about 6-7 seconds, always around 13-14 with numpy on the same machine. I've gotten this on Linux & Mac OS X. self.aff_input has the bulk of the data in it (2000x14000 array), and the various steps are for computing the state of some afferent neurons (if there's any interest, here's a paper that includes the model: Brandman, R. and Nelson ME (2002) A simple model of long-term spike train regularization. Neural Computation 14, 1575-1597.) I've imported numpy as np. Is there anything in practice here that could be done to speed this up? I'm looking more for general numpy usage tips, that I can use while writing further code and not so things that would be obscure or difficult to maintain in the future. Also, the results of this are a binary array, I'm wondering if there's anything more compact for expressing than using 8 bits to represent each single bit. I've poked around, but I haven't come up with any clean and unhackish ideas :-) Thanks! I can provide the rest of the code if needed, but it's basically just filling some vectors with random and empty data and initializing a few things. for n in range(0,time_milliseconds): self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:] self.v = self.u + self.sigma * np.random.standard_normal(size=(1,self.naff)) self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta) idx_spk = np.where(self.v>=self.theta) self.S[n,idx_spk] = 1 self.theta[idx_spk] = self.theta[idx_spk] + self.b self.prev_u = self.u self.prev_theta = self.theta -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com From orest.kozyar at gmail.com Mon May 19 14:21:33 2008 From: orest.kozyar at gmail.com (Orest Kozyar) Date: Mon, 19 May 2008 14:21:33 -0400 Subject: [Numpy-discussion] Slicing a numpy array and getting the "complement" Message-ID: > If you don't mind fancy indexing, you can convert your index arrays > into boolean form: > complement = A==A > complement[idx] = False This actually would work perfectly for my purposes. I don't really need super-fancy indexing. >> Given a slice, such as s_[..., :-2:], is it possible to take the >> complement of this slice? Specifically, s_[..., ::-2]. > > Hmm, that doesn't look like the complement. Did you mean s_[..., -2:] > and s_[..., :-2]? Whoops, yes you're right. > In general, for any given slice, there may not be a slice giving the > complement. For example, the complement of arange(6)[1:4] should be > array([0,4,5]), but there is no slice which can make that. Things get > even more difficult with start:stop:step slices let alone simultaneous > multidimensional slices. 
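A minimal sketch of the boolean-mask complement Anne suggested, for an arbitrary slice along the last axis (the array contents and the particular slice below are placeholders):

    import numpy as np

    A = np.arange(30).reshape(5, 6)
    sl = np.s_[..., 1:4]              # whatever slice is being applied

    mask = np.zeros(A.shape, dtype=bool)
    mask[sl] = True                   # elements picked out by the slice

    selected = A[mask]                # the sliced members
    leftover = A[~mask]               # the complement -- everything else

Both results come back as flattened copies, which matches the caveat that fancy indexing returns copies rather than views.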
Can you be more specific as to exactly the > variety of slices you need to support? I think Anne's solution will work well for what I need to do. Thanks! From david.huard at gmail.com Mon May 19 14:39:04 2008 From: david.huard at gmail.com (David Huard) Date: Mon, 19 May 2008 14:39:04 -0400 Subject: [Numpy-discussion] 1.1.0rc1 tagged In-Reply-To: References: Message-ID: <91cf711d0805191139j1ebaee62t84ce00a71a0a4a8f@mail.gmail.com> Ticket 793 has a patch, submitted by Alan McIntyre, waiting for review from someone C-API-wise. Cheers, David 2008/5/19 Neal Becker : > Jarrod Millman wrote: > > > Please test the release candidate: > > svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1 > > > > Also please review the release notes: > > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > > > > I am going to ask Chris and David to create Windows and Mac binaries, > > which I hope they will have time to create ASAP. > > > > Sorry that it has taken me so long, I am on vacation with my family > > and am having a difficult time getting on my computer. > > > > Thanks, > > > Built OK on Fedora F9 x86_64 using > ['lapack', 'f77blas', 'cblas', 'atlas'] > > Used rpmbuild with slightly modified version of fedora 9 spec file. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robince at gmail.com Mon May 19 14:53:57 2008 From: robince at gmail.com (Robin) Date: Mon, 19 May 2008 19:53:57 +0100 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Message-ID: On Mon, May 19, 2008 at 7:08 PM, James Snyder wrote: > > for n in range(0,time_milliseconds): > self.u = self.expfac_m * self.prev_u + > (1-self.expfac_m) * self.aff_input[n,:] > self.v = self.u + self.sigma * > np.random.standard_normal(size=(1,self.naff)) > self.theta = self.expfac_theta * self.prev_theta - > (1-self.expfac_theta) > > idx_spk = np.where(self.v>=self.theta) > > self.S[n,idx_spk] = 1 > self.theta[idx_spk] = self.theta[idx_spk] + self.b > > self.prev_u = self.u > self.prev_theta = self.theta Hello, The only thoughts I had were that depending on how 'vectorised' random.standard_normal is, it might be better to calculate a big block of random data outside the loop and index it in the same way as the aff_input. Not certain if the indexing would be faster than the function call but it could be worth a try if you have enough memory. The other thing is there are a lot of self.'s there. I don't have a lot of practicle experience, but I've read ( http://wiki.python.org/moin/PythonSpeed/PerformanceTips#head-aa6c07c46a630a2fa10bd6502510e532806f1f62 ) that . based lookups are slower than local variables so another thing to try would be to rebind everything to a local variable outside the loop: u = self.u v = self.v etc. which although a bit unsightly actually can make the inner loop more readable and might speed things up. The only other thing is be careful with things like this when translating from matlab: > self.prev_u = self.u since this is a reference not a copy of the data. 
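A small illustration of that reference-versus-copy point, with throwaway values:

    import numpy as np

    u = np.zeros(5)
    prev_u = u            # plain assignment: both names refer to the same array
    u[:2] = 10
    print(prev_u)         # the change shows up through prev_u as well

    prev_u = u.copy()     # an explicit copy stays independent
    u[:] = -1
    print(prev_u)         # unaffected by the later assignment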
This is OK because when you recreate u as a product it creates a new object, but if you changed u in another way ie self.u[:100] = 10 then self.prev_u would still be pointing to the same array and also reflect those changes. In this case it doesn't look like you explicitly need the prev_ values so it's possible you could do the v and theta updates in place (although I'm not sure if that's quicker) u *= expfac_m u += (1-expfac_m)*aff_input.. etc. Of course you can also take the (1-)'s outside of the loop although again I'm not sure how much difference it would make. So sorry I can't give any concrete advise but I hope I've given some ideas... Cheers Robin From robince at gmail.com Mon May 19 14:57:25 2008 From: robince at gmail.com (Robin) Date: Mon, 19 May 2008 19:57:25 +0100 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Message-ID: Also you could use xrange instead of range... Again, not sure of the size of the effect but it seems to be recommended by the docstring. Robin From hoytak at gmail.com Mon May 19 15:31:07 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Mon, 19 May 2008 12:31:07 -0700 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Message-ID: <4db580fd0805191231wd36def9rdb626f18f05f92c5@mail.gmail.com> > for n in range(0,time_milliseconds): > self.u = self.expfac_m * self.prev_u + > (1-self.expfac_m) * self.aff_input[n,:] > self.v = self.u + self.sigma * > np.random.standard_normal(size=(1,self.naff)) > self.theta = self.expfac_theta * self.prev_theta - > (1-self.expfac_theta) > > idx_spk = np.where(self.v>=self.theta) > > self.S[n,idx_spk] = 1 > self.theta[idx_spk] = self.theta[idx_spk] + self.b > > self.prev_u = self.u > self.prev_theta = self.theta Copying elements into array objects that already exist will always be faster than creating a new object with separate data. However, in this case, you don't need to do any copying or creation if you use a flip-flopping index to handle keeping track of the previous. If I drop the selfs, you can translate the above into (untested): curidx = 0 # prev will be 2-curidx u = empty( (2, naff) ) v = empty( naff ) theta = empty( (2, naff) ) stdnormrvs = np.random.standard_normal(size=(time_milliseconds,naff) ) for n in xrange(time_milliseconds): u[curidx, :] = expfac_m * u[2-curidx, :] + (1-expfac_m) * aff_input[n,:] v[:] = u[curidx, :] + sigma * stdnormrvs[n, :] theta[curidx, :] = expfac_theta * theta[2-curidx] - (1-expfac_theta) idx_spk = np.where(v >= theta) S[n,idx_spk] = 1 theta[curidx, idx_spk] += b # Flop to handle previous stuff curidx = 2 - curidx This should give you a substantial speedup. Also, I have to say that this is begging for weave.blitz, which compiles such statements using templated c++ code to avoid temporaries. 
It doesn't work on all systems, but if it does in your case, here's what your code might look like: import scipy.weave as wv curidx = 0 u = empty( (2, naff) ) v = empty( (1, naff) ) theta = empty( (2, naff) ) stdnormrvs = np.random.standard_normal(size=(time_milliseconds,naff) ) for n in xrange(time_milliseconds): wv.blitz("u[curidx, :] = expfac_m * u[2-curidx, :] + (1-expfac_m) * aff_input[n,:]") wv.blitz("v[:] = u[curidx, :] + sigma * stdnormrvs[n, :]") wv.blitz("theta[curidx, :] = expfac_theta * theta[2-curidx] - (1-expfac_theta)") idx_spk = np.where(v >= theta) S[n,idx_spk] = 1 theta[curidx, idx_spk] += b # Flop to handle previous stuff curidx = 2 - curidx -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ On Mon, May 19, 2008 at 11:08 AM, James Snyder wrote: > Hi - > > First off, I know that optimization is evil, and I should make sure > that everything works as expected prior to bothering with squeezing > out extra performance, but the situation is that this particular block > of code works, but it is about half as fast with numpy as in matlab, > and I'm wondering if there's a better approach than what I'm doing. > > I have a chunk of code, below, that generally iterates over 2000 > iterations, and the vectors that are being worked on at a given step > generally have ~14000 elements in them. > > In matlab, doing pretty much exactly the same thing takes about 6-7 > seconds, always around 13-14 with numpy on the same machine. I've > gotten this on Linux & Mac OS X. > > self.aff_input has the bulk of the data in it (2000x14000 array), and > the various steps are for computing the state of some afferent neurons > (if there's any interest, here's a paper that includes the model: > Brandman, R. and Nelson ME (2002) A simple model of long-term spike > train regularization. Neural Computation 14, 1575-1597.) > > I've imported numpy as np. > > Is there anything in practice here that could be done to speed this > up? I'm looking more for general numpy usage tips, that I can use > while writing further code and not so things that would be obscure or > difficult to maintain in the future. > > Also, the results of this are a binary array, I'm wondering if there's > anything more compact for expressing than using 8 bits to represent > each single bit. I've poked around, but I haven't come up with any > clean and unhackish ideas :-) > > Thanks! > > I can provide the rest of the code if needed, but it's basically just > filling some vectors with random and empty data and initializing a few > things. 
> > for n in range(0,time_milliseconds): > self.u = self.expfac_m * self.prev_u + > (1-self.expfac_m) * self.aff_input[n,:] > self.v = self.u + self.sigma * > np.random.standard_normal(size=(1,self.naff)) > self.theta = self.expfac_theta * self.prev_theta - > (1-self.expfac_theta) > > idx_spk = np.where(self.v>=self.theta) > > self.S[n,idx_spk] = 1 > self.theta[idx_spk] = self.theta[idx_spk] + self.b > > self.prev_u = self.u > self.prev_theta = self.theta > > -- > James Snyder > Biomedical Engineering > Northwestern University > jbsnyder at gmail.com > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From peridot.faceted at gmail.com Mon May 19 15:33:15 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 19 May 2008 15:33:15 -0400 Subject: [Numpy-discussion] Slicing a numpy array and getting the "complement" In-Reply-To: References: Message-ID: 2008/5/19 Orest Kozyar : >> If you don't mind fancy indexing, you can convert your index arrays >> into boolean form: >> complement = A==A >> complement[idx] = False > > This actually would work perfectly for my purposes. I don't really > need super-fancy indexing. Heh. Actually "fancy indexing" is numpy-speak for indexing with anything that's not an integer or a slice. In this case, indexing with a boolean array is "fancy indexing". The reason we make this distinction is that with slices, the new array you get is actually just a reference to the original array (so you can modify the original array through it). With fancy indexing, the new array you get is actually a copy. (Assigning to fancy-indexed arrays is handled specially in __setitem__, so it works.) Anne From cburns at berkeley.edu Mon May 19 15:39:09 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Mon, 19 May 2008 12:39:09 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test Message-ID: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> I've built a Mac binary for the 1.1 release candidate. Mac users, please test it from: https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg This is for the MacPython installed from python.org. Thanks, Chris On Sat, May 17, 2008 at 9:01 PM, Jarrod Millman wrote: > Please test the release candidate: > svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1 > > Also please review the release notes: > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > > I am going to ask Chris and David to create Windows and Mac binaries, > which I hope they will have time to create ASAP. > > Sorry that it has taken me so long, I am on vacation with my family > and am having a difficult time getting on my computer. 
> > Thanks, > > -- > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robince at gmail.com Mon May 19 15:53:37 2008 From: robince at gmail.com (Robin) Date: Mon, 19 May 2008 20:53:37 +0100 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <4db580fd0805191231wd36def9rdb626f18f05f92c5@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4db580fd0805191231wd36def9rdb626f18f05f92c5@mail.gmail.com> Message-ID: Hi, I think my understanding is somehow incomplete... It's not clear to me why (simplified case) a[curidx,:] = scalar * a[2-curidx,:] should be faster than a = scalar * b In both cases I thought the scalar multiplication results in a new array (new memory allocated) and then the difference between copying that result into the existing array u[curix,:] or reassining the reference u to that result should be very similar? If anything I would have thought the direct assignment would be quicker since then there is no copying. What am I missing? > This should give you a substantial speedup. Also, I have to say that > this is begging for weave.blitz, which compiles such statements using > templated c++ code to avoid temporaries. It doesn't work on all > systems, but if it does in your case, here's what your code might look > like: If you haven't seen it this page gives useful examples of methods to speed up python code (incuding weave.blitz), which has Hoyt says would be ideal in this case: http://scipy.org/PerformancePython Robin From peridot.faceted at gmail.com Mon May 19 15:55:14 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 19 May 2008 15:55:14 -0400 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Message-ID: 2008/5/19 James Snyder : > First off, I know that optimization is evil, and I should make sure > that everything works as expected prior to bothering with squeezing > out extra performance, but the situation is that this particular block > of code works, but it is about half as fast with numpy as in matlab, > and I'm wondering if there's a better approach than what I'm doing. > > I have a chunk of code, below, that generally iterates over 2000 > iterations, and the vectors that are being worked on at a given step > generally have ~14000 elements in them. With arrays this size, I wouldn't worry about python overhead - things like range versus xrange or self lookups. > Is there anything in practice here that could be done to speed this > up? I'm looking more for general numpy usage tips, that I can use > while writing further code and not so things that would be obscure or > difficult to maintain in the future. Try using a profiler to find which steps are using most of your time. With such a simple function it may not be very informative, but it's worth a try. > Also, the results of this are a binary array, I'm wondering if there's > anything more compact for expressing than using 8 bits to represent > each single bit. 
I've poked around, but I haven't come up with any > clean and unhackish ideas :-) There's a tradeoff between compactness and speed here. The *fastest* is probably one boolean per 32-bit integer. It sounds awful, I know, but most modern CPUs have to work harder to access bytes individually than they do to access them four at a time. On the other hand, cache performance can make a huge difference, so compactness might actually amount to speed. I don't think numpy has a packed bit array data type (which is a shame, but would require substantial implementation effort). > I can provide the rest of the code if needed, but it's basically just > filling some vectors with random and empty data and initializing a few > things. It would kind of help, since it would make it clearer what's a scalar and what's an array, and what the dimensions of the various arrays are. > for n in range(0,time_milliseconds): > self.u = self.expfac_m * self.prev_u + > (1-self.expfac_m) * self.aff_input[n,:] > self.v = self.u + self.sigma * > np.random.standard_normal(size=(1,self.naff)) You can use "scale" to rescale the random numbers on creation; that'll save you a temporary. > self.theta = self.expfac_theta * self.prev_theta - > (1-self.expfac_theta) > > idx_spk = np.where(self.v>=self.theta) You can probably skip the "where"; the result of the expression self.v>=self.theta is a boolean array, which you can use directly for indexing. > self.S[n,idx_spk] = 1 > self.theta[idx_spk] = self.theta[idx_spk] + self.b += here might speed things up, not just in terms of temporaries but by saving a fancy-indexing operation. > self.prev_u = self.u > self.prev_theta = self.theta Anne From efiring at hawaii.edu Mon May 19 15:58:34 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 19 May 2008 09:58:34 -1000 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Message-ID: <4831DBEA.5040104@hawaii.edu> Robin wrote: > Also you could use xrange instead of range... > > Again, not sure of the size of the effect but it seems to be > recommended by the docstring. No, it is going away in Python 3.0, and its only real benefit is a memory saving in extreme cases. From the Python library docs: "The advantage of xrange() over range() is minimal (since xrange() still has to create the values when asked for them) except when a very large range is used on a memory-starved machine or when all of the range's elements are never used (such as when the loop is usually terminated with break)." Eric From hoytak at gmail.com Mon May 19 16:04:46 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Mon, 19 May 2008 13:04:46 -0700 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4db580fd0805191231wd36def9rdb626f18f05f92c5@mail.gmail.com> Message-ID: <4db580fd0805191304o17496c62o110e202364bbe4ba@mail.gmail.com> On Mon, May 19, 2008 at 12:53 PM, Robin wrote: > Hi, > > I think my understanding is somehow incomplete... It's not clear to me > why (simplified case) > > a[curidx,:] = scalar * a[2-curidx,:] > should be faster than > a = scalar * b > > In both cases I thought the scalar multiplication results in a new > array (new memory allocated) and then the difference between copying > that result into the existing array u[curix,:] or reassining the > reference u to that result should be very similar? 
> > If anything I would have thought the direct assignment would be > quicker since then there is no copying. > > What am I missing? Actually, I think you are correct. My bad. I was mainly thinking in terms of weave.blitz, where it would make a difference, then translating back... --Hoyt +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From tgrav at mac.com Mon May 19 16:20:30 2008 From: tgrav at mac.com (Tommy Grav) Date: Mon, 19 May 2008 16:20:30 -0400 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> Message-ID: On May 19, 2008, at 3:39 PM, Christopher Burns wrote: > I've built a Mac binary for the 1.1 release candidate. Mac users, > please test it from: > > https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg > > This is for the MacPython installed from python.org. > > Thanks, > Chris I tried this build on my PPC running 10.5.2. It works with two failed tests given below. [*****:~] tgrav% python ActivePython 2.5.1.1 (ActiveState Software Inc.) based on Python 2.5.1 (r251:54863, May 1 2007, 17:40:00) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin Type "help", "copyright", "credits" or "license" for more information. [*****:~] tgrav% python -c 'import numpy; numpy.test()' Numpy is installed in /Library/Frameworks/Python.framework/Versions/ 2.5/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5.1 (r251:54863, May 1 2007, 17:40:00) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] [Test log snipped] ====================================================================== FAIL: test_basic (numpy.core.tests.test_multiarray.TestView) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 843, in test_basic assert_array_equal(y, [67305985, 134678021]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([16909060, 84281096]) y: array([ 67305985, 134678021]) ====================================================================== FAIL: test_keywords (numpy.core.tests.test_multiarray.TestView) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 852, in test_keywords assert_array_equal(y,[[513]]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([[258]], dtype=int16) y: array([[513]]) 
---------------------------------------------------------------------- Ran 1004 tests in 2.569s FAILED (failures=2) From robert.kern at gmail.com Mon May 19 16:35:06 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 15:35:06 -0500 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> Message-ID: <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> On Mon, May 19, 2008 at 3:20 PM, Tommy Grav wrote: > ====================================================================== > FAIL: test_basic (numpy.core.tests.test_multiarray.TestView) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ > python2.5/site-packages/numpy/core/tests/test_multiarray.py", line > 843, in test_basic > assert_array_equal(y, [67305985, 134678021]) > File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ > python2.5/site-packages/numpy/testing/utils.py", line 248, in > assert_array_equal > verbose=verbose, header='Arrays are not equal') > File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ > python2.5/site-packages/numpy/testing/utils.py", line 240, in > assert_array_compare > assert cond, msg > AssertionError: > Arrays are not equal > > (mismatch 100.0%) > x: array([16909060, 84281096]) > y: array([ 67305985, 134678021]) Endianness issues. Probably bugs in the code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon May 19 16:38:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 15:38:04 -0500 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> Message-ID: <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> On Mon, May 19, 2008 at 3:35 PM, Robert Kern wrote: > Endianness issues. Probably bugs in the code. By which I meant "test code". numpy itself is fine and is working correctly. The tests themselves incorrectly assume little-endianness. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon May 19 16:42:06 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 15:42:06 -0500 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> Message-ID: <3d375d730805191342h44cf3663y8242848e5ae2616f@mail.gmail.com> On Mon, May 19, 2008 at 3:38 PM, Robert Kern wrote: > On Mon, May 19, 2008 at 3:35 PM, Robert Kern wrote: >> Endianness issues. Probably bugs in the code. > > By which I meant "test code". numpy itself is fine and is working > correctly. The tests themselves incorrectly assume little-endianness. And now fixed on the trunk. 
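A sketch of why the expected values in that test were byte-order dependent, reconstructed from the numbers in the failure report rather than from the test source itself:

    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.int8)
    y = x.view(np.int32)

    # little-endian machines read bytes 1,2,3,4 as 0x04030201:
    #   y == [ 67305985, 134678021]
    # big-endian PowerPC reads the same bytes as 0x01020304:
    #   y == [ 16909060,  84281096]
    # Hard-coding one of those expectations is what made the test endian-dependent.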
I believe that's the correct place to fix bugs for 1.1.0 at this time. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tgrav at mac.com Mon May 19 16:42:06 2008 From: tgrav at mac.com (Tommy Grav) Date: Mon, 19 May 2008 16:42:06 -0400 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> Message-ID: On May 19, 2008, at 4:38 PM, Robert Kern wrote: > On Mon, May 19, 2008 at 3:35 PM, Robert Kern > wrote: >> Endianness issues. Probably bugs in the code. > > By which I meant "test code". numpy itself is fine and is working > correctly. The tests themselves incorrectly assume little-endianness. I am just a "silent" newbie of the numpy list, so I hope that someone will put this in as a ticket if it is warranted :) Cheers Tommy From Chris.Barker at noaa.gov Mon May 19 16:53:08 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 19 May 2008 13:53:08 -0700 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> Message-ID: <4831E8B4.5040408@noaa.gov> Anne Archibald wrote: > 2008/5/19 James Snyder : >> I can provide the rest of the code if needed, but it's basically just >> filling some vectors with random and empty data and initializing a few >> things. > > It would kind of help, since it would make it clearer what's a scalar > and what's an array, and what the dimensions of the various arrays > are. It would also help if you provided a complete example (as little code as possible), so we could try out and time our ideas before suggesting them. >> np.random.standard_normal(size=(1,self.naff)) Anyone know how fast this is compared to Matlab? That could be the difference right there. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cburns at berkeley.edu Mon May 19 16:58:39 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Mon, 19 May 2008 13:58:39 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> Message-ID: <764e38540805191358n3439a6e6sabfa0067c5cb17de@mail.gmail.com> Thanks Tommy! Robert has already committed a fix. On Mon, May 19, 2008 at 1:42 PM, Tommy Grav wrote: > > On May 19, 2008, at 4:38 PM, Robert Kern wrote: > >> On Mon, May 19, 2008 at 3:35 PM, Robert Kern >> wrote: >>> Endianness issues. Probably bugs in the code. >> >> By which I meant "test code". numpy itself is fine and is working >> correctly. The tests themselves incorrectly assume little-endianness. 
> > I am just a "silent" newbie of the numpy list, so I hope that someone > will put this in as a ticket if it is warranted :) > > Cheers > Tommy > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From charlesr.harris at gmail.com Mon May 19 18:27:32 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 May 2008 16:27:32 -0600 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <4831E8B4.5040408@noaa.gov> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> Message-ID: On Mon, May 19, 2008 at 2:53 PM, Christopher Barker wrote: > Anne Archibald wrote: > > 2008/5/19 James Snyder : > >> I can provide the rest of the code if needed, but it's basically just > >> filling some vectors with random and empty data and initializing a few > >> things. > > > > It would kind of help, since it would make it clearer what's a scalar > > and what's an array, and what the dimensions of the various arrays > > are. > > It would also help if you provided a complete example (as little code as > possible), so we could try out and time our ideas before suggesting them. > > >> np.random.standard_normal(size=(1,self.naff)) > > Anyone know how fast this is compared to Matlab? That could be the > difference right there. > The latest versions of Matlab use the ziggurat method to generate random normals and it is faster than the method used in numpy. I have ziggurat code at hand, but IIRC, Robert doesn't trust the method ;) I don't know if it would actually speed things up, though. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon May 19 18:36:41 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 17:36:41 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> Message-ID: <3d375d730805191536n9917a31xcad76d63a7f09170@mail.gmail.com> On Mon, May 19, 2008 at 5:27 PM, Charles R Harris wrote: > The latest versions of Matlab use the ziggurat method to generate random > normals and it is faster than the method used in numpy. I have ziggurat code > at hand, but IIRC, Robert doesn't trust the method ;) Well, I outlined the tests that would satisfy me, but I don't think you ever responded. http://projects.scipy.org/pipermail/scipy-dev/2005-December/004405.html which references http://projects.scipy.org/pipermail/scipy-dev/2005-December/004400.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From jbsnyder at gmail.com Mon May 19 19:30:41 2008 From: jbsnyder at gmail.com (James Snyder) Date: Mon, 19 May 2008 18:30:41 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <4831E8B4.5040408@noaa.gov> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> Message-ID: <33644d3c0805191630g45e824c8mb2edbb0f30af03ed@mail.gmail.com> Separating the response into 2 emails, here's the aspect that comes from implementations of random: In short, that's part of the difference. I ran these a few times to check for consistency. MATLAB (R2008a: tic for i = 1:2000 a = randn(1,13857); end toc Runtime: ~0.733489 s NumPy (1.1.0rc1): import numpy as np import time t1 = time.time() for n in xrange(0,2000): a = np.random.standard_normal(size=(1,14000)) t2 = time.time() print 'Runtime: %1.3f s' % ((t2-t1)) Runtime: ~2.716 s On Mon, May 19, 2008 at 3:53 PM, Christopher Barker wrote: > Anne Archibald wrote: >> 2008/5/19 James Snyder : >>> I can provide the rest of the code if needed, but it's basically just >>> filling some vectors with random and empty data and initializing a few >>> things. >> >> It would kind of help, since it would make it clearer what's a scalar >> and what's an array, and what the dimensions of the various arrays >> are. > > It would also help if you provided a complete example (as little code as > possible), so we could try out and time our ideas before suggesting them. > >>> np.random.standard_normal(size=(1,self.naff)) > > Anyone know how fast this is compared to Matlab? That could be the > difference right there. > > -Chris > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com From charlesr.harris at gmail.com Mon May 19 19:39:59 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 May 2008 17:39:59 -0600 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <3d375d730805191536n9917a31xcad76d63a7f09170@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <3d375d730805191536n9917a31xcad76d63a7f09170@mail.gmail.com> Message-ID: On Mon, May 19, 2008 at 4:36 PM, Robert Kern wrote: > On Mon, May 19, 2008 at 5:27 PM, Charles R Harris > wrote: > > The latest versions of Matlab use the ziggurat method to generate random > > normals and it is faster than the method used in numpy. I have ziggurat > code > > at hand, but IIRC, Robert doesn't trust the method ;) > > Well, I outlined the tests that would satisfy me, but I don't think > you ever responded. > > http://projects.scipy.org/pipermail/scipy-dev/2005-December/004405.html > which references > http://projects.scipy.org/pipermail/scipy-dev/2005-December/004400.html > It's been tested in the literature. It's basically just sampling a Gaussian but done so most of the tests for being under the curve are trivial and there are few misses, i.e., the Gaussian is covered with a stack (ziggurat) of slices of equal areas, each slice is randomly chosen, then the position along the slice is randomly chosen. 
Most of those last points will be under the curve except at the ends, and it is those last that require computation. However, like all sampling it has to be carefully implemented and the samples are discretized differently than for the current way. Floats are strange that way because they are on a log scale. The tails will be fine, the real question is how much precision you want when doubles are returned, i.e., how fine the discetization of the resulting samples should be. The same method also works for the exponential distribution. I don't feel this is a pressing issue, when I need fast normals I use my own code. But if we are in competition with Matlab maybe we should give it a shot. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon May 19 19:52:27 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 18:52:27 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <3d375d730805191536n9917a31xcad76d63a7f09170@mail.gmail.com> Message-ID: <3d375d730805191652r4a6f945et7b7ddb6ca0b3458c@mail.gmail.com> On Mon, May 19, 2008 at 6:39 PM, Charles R Harris wrote: > > On Mon, May 19, 2008 at 4:36 PM, Robert Kern wrote: >> >> On Mon, May 19, 2008 at 5:27 PM, Charles R Harris >> wrote: >> > The latest versions of Matlab use the ziggurat method to generate random >> > normals and it is faster than the method used in numpy. I have ziggurat >> > code >> > at hand, but IIRC, Robert doesn't trust the method ;) >> >> Well, I outlined the tests that would satisfy me, but I don't think >> you ever responded. >> >> http://projects.scipy.org/pipermail/scipy-dev/2005-December/004405.html >> which references >> http://projects.scipy.org/pipermail/scipy-dev/2005-December/004400.html > > It's been tested in the literature. And it happened to fail such tests. Hence the Doornik paper which improves Marsaglia's method to pass the appropriate tests. Consequently, I want to see the tests performed on the actual implementation before using it. It's a complicated algorithm that is *demonstrably* easy to get wrong. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jbsnyder at gmail.com Mon May 19 19:55:45 2008 From: jbsnyder at gmail.com (James Snyder) Date: Mon, 19 May 2008 18:55:45 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <4831E8B4.5040408@noaa.gov> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> Message-ID: <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> On to the code, here's a current implementation, attached. I make no claims about it being great code, I've modified it so that there is a weave version and a sans-weave version. Many of the suggestions make things a bit faster. The weave version bombs out with a rather long log, which can be found at: http://pastebin.com/m79699c04 I can tell it's failing for the second weave.blitz line, but I don't understand why exactly. What does this mean?: error: no match for call to '(blitz::FastArrayIterator) (const blitz::TinyVector&)' Also note, I'm not asking to match MATLAB performance. It'd be nice, but again I'm just trying to put together decent, fairly efficient numpy code. 
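Pulling the thread's suggestions together (draw the noise up front, update in place, index with the boolean mask directly), a hedged sketch of the inner loop; the sizes and parameter values below are placeholders since the full script isn't reproduced here:

    import numpy as np

    T, naff = 2000, 14000                 # placeholder sizes
    expfac_m, expfac_theta = 0.9, 0.9     # placeholder constants
    sigma, b = 0.1, 1.0

    aff_input = np.zeros((T, naff))       # stands in for the real afferent input
    S = np.zeros((T, naff), dtype=np.uint8)
    u = np.zeros(naff)
    theta = np.zeros(naff)

    # all the Gaussian deviates in one call instead of one call per iteration
    noise = sigma * np.random.standard_normal((T, naff))

    for n in range(T):
        u *= expfac_m                     # in-place: reuses the existing buffer
        u += (1 - expfac_m) * aff_input[n, :]
        v = u + noise[n, :]
        theta *= expfac_theta
        theta -= (1 - expfac_theta)
        spiked = v >= theta               # boolean mask, no where() needed
        S[n, spiked] = 1
        theta[spiked] += b

Because u and theta are updated in place, the separate prev_u / prev_theta copies of the original version are no longer needed.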
On Mon, May 19, 2008 at 3:53 PM, Christopher Barker wrote: > Anne Archibald wrote: >> 2008/5/19 James Snyder : >>> I can provide the rest of the code if needed, but it's basically just >>> filling some vectors with random and empty data and initializing a few >>> things. >> >> It would kind of help, since it would make it clearer what's a scalar >> and what's an array, and what the dimensions of the various arrays >> are. > > It would also help if you provided a complete example (as little code as > possible), so we could try out and time our ideas before suggesting them. > >>> np.random.standard_normal(size=(1,self.naff)) > > Anyone know how fast this is compared to Matlab? That could be the > difference right there. > > -Chris > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com -------------- next part -------------- A non-text attachment was scrubbed... Name: np_afftest.py Type: text/x-python Size: 3795 bytes Desc: not available URL: From charlesr.harris at gmail.com Mon May 19 20:30:15 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 May 2008 18:30:15 -0600 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <3d375d730805191652r4a6f945et7b7ddb6ca0b3458c@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <3d375d730805191536n9917a31xcad76d63a7f09170@mail.gmail.com> <3d375d730805191652r4a6f945et7b7ddb6ca0b3458c@mail.gmail.com> Message-ID: On Mon, May 19, 2008 at 5:52 PM, Robert Kern wrote: > On Mon, May 19, 2008 at 6:39 PM, Charles R Harris > wrote: > > > > On Mon, May 19, 2008 at 4:36 PM, Robert Kern > wrote: > >> > >> On Mon, May 19, 2008 at 5:27 PM, Charles R Harris > >> wrote: > >> > The latest versions of Matlab use the ziggurat method to generate > random > >> > normals and it is faster than the method used in numpy. I have > ziggurat > >> > code > >> > at hand, but IIRC, Robert doesn't trust the method ;) > >> > >> Well, I outlined the tests that would satisfy me, but I don't think > >> you ever responded. > >> > >> > http://projects.scipy.org/pipermail/scipy-dev/2005-December/004405.html > >> which references > >> > http://projects.scipy.org/pipermail/scipy-dev/2005-December/004400.html > > > > It's been tested in the literature. > > And it happened to fail such tests. Hence the Doornik paper which > improves Marsaglia's method to pass the appropriate tests. Exactly. Doornik was more careful about using independent samples and also used a better random number generator (MWC8222), not exactly rocket science. Believe it or not, I had read Doornik's paper before I did my implementation. I also used a better ziggurat, IMHO, than Marsaglia and Doornik. MWC8222 is also about twice as fast on AMD hardware as the Mersenne Twister, but does require more careful initialization. On Pentium V they are about they are about the same. I haven't benchmarked either on Core2. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Mon May 19 20:36:54 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 19:36:54 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> Message-ID: <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> On Mon, May 19, 2008 at 6:55 PM, James Snyder wrote: > Also note, I'm not asking to match MATLAB performance. It'd be nice, > but again I'm just trying to put together decent, fairly efficient > numpy code. I can cut the time by about a quarter by just using the boolean mask directly instead of using where().

for n in range(0,time_milliseconds):
    u = expfac_m * prev_u + (1-expfac_m) * aff_input[n,:]
    v = u + sigma * stdnormrvs[n, :]
    theta = expfac_theta * prev_theta - (1-expfac_theta)

    mask = (v >= theta)

    S[n,np.squeeze(mask)] = 1
    theta[mask] += b

    prev_u = u
    prev_theta = theta

There aren't any good line-by-line profiling tools in Python, but you can fake it by making a local function for each line:

def f1():
    u = expfac_m * prev_u + (1-expfac_m) * aff_input[n,:]
    return u
def f2():
    v = u + sigma * stdnormrvs[n, :]
    return v
def f3():
    theta = expfac_theta * prev_theta - (1-expfac_theta)
    return theta
def f4():
    mask = (v >= theta)
    return mask
def f5():
    S[n,np.squeeze(mask)] = 1
def f6():
    theta[mask] += b

# Run Standard, Unoptimized Model
for n in range(0,time_milliseconds):
    u = f1()
    v = f2()
    theta = f3()
    mask = f4()
    f5()
    f6()

    prev_u = u
    prev_theta = theta

I get f6() as being the biggest bottleneck, followed by the general time spent in the loop (about the same), followed by f5(), f1(), and f3() (each about half of f6()), followed by f2() (about half of f5()). f4() is negligible. Masked operations are inherently slow. They mess up CPU's branch prediction. Worse, the use of iterators in that part of the code frustrates compilers' attempts to optimize that away in the case of contiguous arrays. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon May 19 20:41:46 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 19:41:46 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <3d375d730805191536n9917a31xcad76d63a7f09170@mail.gmail.com> <3d375d730805191652r4a6f945et7b7ddb6ca0b3458c@mail.gmail.com> Message-ID: <3d375d730805191741o5473de90pad9c82580d1839ee@mail.gmail.com> On Mon, May 19, 2008 at 7:30 PM, Charles R Harris wrote: > > On Mon, May 19, 2008 at 5:52 PM, Robert Kern wrote: >> >> On Mon, May 19, 2008 at 6:39 PM, Charles R Harris >> wrote: >> > >> > On Mon, May 19, 2008 at 4:36 PM, Robert Kern >> > wrote: >> >> >> >> On Mon, May 19, 2008 at 5:27 PM, Charles R Harris >> >> wrote: >> >> > The latest versions of Matlab use the ziggurat method to generate >> >> > random >> >> > normals and it is faster than the method used in numpy. I have >> >> > ziggurat >> >> > code >> >> > at hand, but IIRC, Robert doesn't trust the method ;) >> >> >> >> Well, I outlined the tests that would satisfy me, but I don't think >> >> you ever responded. 
>> >> >> >> >> >> http://projects.scipy.org/pipermail/scipy-dev/2005-December/004405.html >> >> which references >> >> >> >> http://projects.scipy.org/pipermail/scipy-dev/2005-December/004400.html >> > >> > It's been tested in the literature. >> >> And it happened to fail such tests. Hence the Doornik paper which >> improves Marsaglia's method to pass the appropriate tests. > > Exactly. Doornik was more careful about using independent samples and also > used a better random number generator (MWC8222), not exactly rocket > science. Believe it or not, I had read Doornik's paper before I did my > implementation. Good! I believe this is the first time you've mentioned it. > I also used a better ziggurat, IMHO, than Marsaglia and > Doornik. Ah. HOs need testing. dieharder is probably de rigeur these days, and it will accept input from a file. http://www.phy.duke.edu/~rgb/General/dieharder.php -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Mon May 19 21:14:58 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 19 May 2008 15:14:58 -1000 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> Message-ID: <48322612.1000206@hawaii.edu> Robert Kern wrote: > On Mon, May 19, 2008 at 6:55 PM, James Snyder wrote: >> Also note, I'm not asking to match MATLAB performance. It'd be nice, >> but again I'm just trying to put together decent, fairly efficient >> numpy code. > > I can cut the time by about a quarter by just using the boolean mask > directly instead of using where(). > > for n in range(0,time_milliseconds): > u = expfac_m * prev_u + (1-expfac_m) * aff_input[n,:] > v = u + sigma * stdnormrvs[n, :] > theta = expfac_theta * prev_theta - (1-expfac_theta) > > mask = (v >= theta) > > S[n,np.squeeze(mask)] = 1 > theta[mask] += b > > prev_u = u > prev_theta = theta > > > There aren't any good line-by-line profiling tools in Python, but you > can fake it by making a local function for each line: > > def f1(): > u = expfac_m * prev_u + (1-expfac_m) * aff_input[n,:] > return u > def f2(): > v = u + sigma * stdnormrvs[n, :] > return v > def f3(): > theta = expfac_theta * prev_theta - (1-expfac_theta) > return theta > def f4(): > mask = (v >= theta) > return mask > def f5(): > S[n,np.squeeze(mask)] = 1 > def f6(): > theta[mask] += b > > # Run Standard, Unoptimized Model > for n in range(0,time_milliseconds): > u = f1() > v = f2() > theta = f3() > mask = f4() > f5() > f6() > > prev_u = u > prev_theta = theta > > I get f6() as being the biggest bottleneck, followed by the general > time spent in the loop (about the same), followed by f5(), f1(), and > f3() (each about half of f6()), followed by f2() (about half of f5()). > f4() is negligible. > > Masked operations are inherently slow. They mess up CPU's branch > prediction. Worse, the use of iterators in that part of the code > frustrates compilers' attempts to optimize that away in the case of > contiguous arrays. 
> f6 can be sped up more than a factor of 2 by using putmask: In [10]:xx = np.random.rand(100000) In [11]:mask = xx > 0.5 In [12]:timeit xx[mask] += 2.34 100 loops, best of 3: 4.06 ms per loop In [14]:timeit np.putmask(xx, mask, xx+2.34) 1000 loops, best of 3: 1.4 ms per loop I think that xx += 2.34*mask will be similarly quick, but I can't get ipython timeit to work with it. Eric From david at ar.media.kyoto-u.ac.jp Mon May 19 21:27:34 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 10:27:34 +0900 Subject: [Numpy-discussion] svd in numpy In-Reply-To: <48319D3E.1030901@gmail.com> References: <7671f4e40805161031g5b8c118ds23aaa567b4e66627@mail.gmail.com> <482E8A85.6010606@ar.media.kyoto-u.ac.jp> <7671f4e40805190813n6027e4f3s58c47dea7b50c605@mail.gmail.com> <48319D3E.1030901@gmail.com> Message-ID: <48322906.1000302@ar.media.kyoto-u.ac.jp> Bruce Southey wrote: > Nripun Sredar wrote: > >> I am running on Windows Xp, Intel Xeon CPU. I'd like to fill in a few >> more things here. If I send 0 in the second and third argument of svd >> then I get the singular_values, but if its 1 then the problem >> persists. I've tried this on sparse and non-sparse matrices. This is >> with the latest windows binaries numpy-1.0.4.win32-py2.5.msi. >> There was a problem with the way those binaries were built, depending on your CPU. Hopefully, the new binary for 1.1.0 will not have this problem anymore. When available, please test it and report whether it is working or not for you, thanks, David From david at ar.media.kyoto-u.ac.jp Mon May 19 21:54:48 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 10:54:48 +0900 Subject: [Numpy-discussion] Is it ok to create a tool directory in numpy svn (for building tools, etc...) Message-ID: <48322F68.7030802@ar.media.kyoto-u.ac.jp> Hi, To build numpy binaries, I have some pretty boring python scripts, and I think it would be useful to have them somewhere in numpy trunk (for example in tools). Does anyone have something against it ? cheers, David From david at ar.media.kyoto-u.ac.jp Mon May 19 22:15:20 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 11:15:20 +0900 Subject: [Numpy-discussion] 1.1.0rc1, Win32 Installer: please test it Message-ID: <48323438.8080501@ar.media.kyoto-u.ac.jp> Hi, Sorry for the delay, but it is now ready: numpy "superpack" installers for numpy 1.1.0rc1: http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.4.exe (Python 2.4 binaries are not there yet). This binary should work on any (32 bits) CPU on windows, and in particular should solve the recurring problem of segfault/hangs on older CPU with previous binary releases. I used a fairly heavy compression scheme (lzma), because it cut the size ~ 30 %. If it is a problem, please let me know, cheers, David From charlesr.harris at gmail.com Mon May 19 22:41:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 May 2008 20:41:18 -0600 Subject: [Numpy-discussion] Is it ok to create a tool directory in numpy svn (for building tools, etc...) 
In-Reply-To: <48322F68.7030802@ar.media.kyoto-u.ac.jp> References: <48322F68.7030802@ar.media.kyoto-u.ac.jp> Message-ID: On Mon, May 19, 2008 at 7:54 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Hi, > > To build numpy binaries, I have some pretty boring python scripts, > and I think it would be useful to have them somewhere in numpy trunk > (for example in tools). Does anyone have something against it ? > Hey, you can always delete it later. I say go for it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon May 19 22:41:43 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 11:41:43 +0900 Subject: [Numpy-discussion] Is it ok to create a tool directory in numpy svn (for building tools, etc...) In-Reply-To: References: <48322F68.7030802@ar.media.kyoto-u.ac.jp> Message-ID: <48323A67.6050300@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, May 19, 2008 at 7:54 PM, David Cournapeau > > > wrote: > > Hi, > > To build numpy binaries, I have some pretty boring python scripts, > and I think it would be useful to have them somewhere in numpy trunk > (for example in tools). Does anyone have something against it ? > > > Hey, you can always delete it later. I say go for it. I've just deleted by accident my build script, if that's not a sign.... cheers, David From robert.kern at gmail.com Mon May 19 23:00:15 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 May 2008 22:00:15 -0500 Subject: [Numpy-discussion] Is it ok to create a tool directory in numpy svn (for building tools, etc...) In-Reply-To: <48322F68.7030802@ar.media.kyoto-u.ac.jp> References: <48322F68.7030802@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730805192000j6c164d20ye2b516ee95afb1fc@mail.gmail.com> On Mon, May 19, 2008 at 8:54 PM, David Cournapeau wrote: > Hi, > > To build numpy binaries, I have some pretty boring python scripts, > and I think it would be useful to have them somewhere in numpy trunk > (for example in tools). Does anyone have something against it ? Nope. Go for it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jbsnyder at gmail.com Mon May 19 23:26:26 2008 From: jbsnyder at gmail.com (James Snyder) Date: Mon, 19 May 2008 22:26:26 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> Message-ID: <33644d3c0805192026h5dbbb210hf06d5e3cc5fc04b4@mail.gmail.com> I've done a little profiling with cProfile as well as with dtrace since the bindings exist in mac os x, and you can use a lot of the d scripts that apply to python, so previously I've found that the np.random call and the where (in the original code) were heavy hitters as far as amount of time consumed. The time has now been shaved down to ~9 seconds with this suggestion from the original 13-14s, with the inclusing of Eric Firing's suggestions. 
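(The profiling itself was nothing fancy -- roughly the following, where run_model() is just a stand-in name for the actual simulation entry point and the output file name is made up:)

import cProfile
import pstats

# Run the whole simulation under the profiler and dump stats to a file.
cProfile.run('run_model()', 'np_afftest.prof')

# Print the ten most expensive calls, sorted by cumulative time.
stats = pstats.Stats('np_afftest.prof')
stats.sort_stats('cumulative').print_stats(10)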
This is without scipy.weave, which at the moment I can't get to work for all lines, and when I just replace one of them sucessfully it seems to run more slowly, I assume because it is converting data back and forth. Quick question regarding the pointer abstraction that's going on, the following seems to work: np.putmask(S[n,:],np.squeeze(mask),1) with that section of S being worked on. Is it safe to assume in most cases while working with NumPy that without additional operations, aside from indexing, that a reference rather than a copy is being passed? It certainly seems like this sort of thing, including stuff like: u = self.u v = self.v theta = self.theta ... without having to repack those data into self later, since u,v,theta are just references to the existing data saves on code and whatnot, but I'm a little worried about not being explicit. Are there any major pitfalls to be aware of? It sounds like if I do: f = a[n,:] I get a reference, but if I did something like g = a[n,:]*2 I would get a copy. Thanks guys. This is definitely useful, especially in combination with using PyPar on my dual core system I'm getting pretty good performance :-) If anyone has any clues on why scipy.weave is blowing (http://pastebin.com/m79699c04) using the code I attached, I wouldn't mind knowing. Most of the times I've attempted using weave, I've been a little baffled about why things aren't working. I also don't have a sense for whether all numpy functions should be "weavable" or if it's just general array operations that can be put through weave. I know this is the numpy list, so I can take things over to the scipy list if that's more appropriate. On Mon, May 19, 2008 at 7:36 PM, Robert Kern wrote: > On Mon, May 19, 2008 at 6:55 PM, James Snyder wrote: >> Also note, I'm not asking to match MATLAB performance. It'd be nice, >> but again I'm just trying to put together decent, fairly efficient >> numpy code. > > I can cut the time by about a quarter by just using the boolean mask > directly instead of using where(). > > for n in range(0,time_milliseconds): > u = expfac_m * prev_u + (1-expfac_m) * aff_input[n,:] > v = u + sigma * stdnormrvs[n, :] > theta = expfac_theta * prev_theta - (1-expfac_theta) > > mask = (v >= theta) > > S[n,np.squeeze(mask)] = 1 > theta[mask] += b > > prev_u = u > prev_theta = theta > > > There aren't any good line-by-line profiling tools in Python, but you > can fake it by making a local function for each line: > > def f1(): > u = expfac_m * prev_u + (1-expfac_m) * aff_input[n,:] > return u > def f2(): > v = u + sigma * stdnormrvs[n, :] > return v > def f3(): > theta = expfac_theta * prev_theta - (1-expfac_theta) > return theta > def f4(): > mask = (v >= theta) > return mask > def f5(): > S[n,np.squeeze(mask)] = 1 > def f6(): > theta[mask] += b > > # Run Standard, Unoptimized Model > for n in range(0,time_milliseconds): > u = f1() > v = f2() > theta = f3() > mask = f4() > f5() > f6() > > prev_u = u > prev_theta = theta > > I get f6() as being the biggest bottleneck, followed by the general > time spent in the loop (about the same), followed by f5(), f1(), and > f3() (each about half of f6()), followed by f2() (about half of f5()). > f4() is negligible. > > Masked operations are inherently slow. They mess up CPU's branch > prediction. Worse, the use of iterators in that part of the code > frustrates compilers' attempts to optimize that away in the case of > contiguous arrays. 
> > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com From pearu at cens.ioc.ee Tue May 20 01:29:29 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue, 20 May 2008 08:29:29 +0300 (EEST) Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> Message-ID: <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> CC: numpy-discussion because of other reactions on the subject. On Tue, May 20, 2008 1:26 am, Robert Kern wrote: > Is this an important bugfix? If not, can you hold off until 1.1.0 is > released? The patch fixes a long existing and unreported bug in f2py - I think the bug was introduced when Python defined min and max functions. I learned about the bug when reading a manuscript about f2py. Such bugs should not end up in a paper demonstrating f2py inability to process certain features as it would have not been designed to do so. So, I'd consider the bugfix important. On the other hand, the patch does not affect numpy users who do not use f2py, in any way. So, it is not important for numpy users, in general. Hmm, I also thought that the trunk is open for development, even though r5198 is only fixing a bug (and I do not plan to develop f2py in numpy further, just fix bugs and maintain it). If the release process is going to take for weeks and is locking the trunk, may be the release candidates should live in a separate branch? Pearu From stefan at sun.ac.za Tue May 20 02:56:27 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 08:56:27 +0200 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> Message-ID: <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> Hi Pearu 2008/5/20 Pearu Peterson : > CC: numpy-discussion because of other reactions on the subject. > > On Tue, May 20, 2008 1:26 am, Robert Kern wrote: >> Is this an important bugfix? If not, can you hold off until 1.1.0 is >> released? > > The patch fixes a long existing and unreported bug in f2py - I think > the bug was introduced when Python defined min and max functions. > I learned about the bug when reading a manuscript about f2py. Such bugs > should not end up in a paper demonstrating f2py inability to process > certain > features as it would have not been designed to do so. So, I'd consider > the bugfix important. > > On the other hand, the patch does not affect numpy users who do not > use f2py, in any way. So, it is not important for numpy users, in general. Many f2py users currently get their version via NumPy, I assume. > Hmm, I also thought that the trunk is open for development, even though > r5198 is only fixing a bug (and I do not plan to develop f2py in numpy > further, just fix bugs and maintain it). 
If the release process > is going to take for weeks and is locking the trunk, may be the > release candidates should live in a separate branch? If the patch a) Fixes an important bug and b) has unit tests to ensure it does what it is supposed to then I'd be +1 for applying. It looks like there are some tests included; to which degree do they cover the bugfix, and do we have tests to make sure that f2py still functions correctly? I'd like to make sure I understood Jarrod's message from earlier this week: 1) Release candidate branch is tagged -- development continues on trunk 2) Release candidate is tested 3) Bug-fixes are back-ported to the release candidate as necessary 4) Release is made Another version I've seen starts with: 1) Release candidate branch is tagged -- no one touches trunk except for bug-fixes Which is it? I want to know where the docstring changes should go. Regards St?fan From millman at berkeley.edu Tue May 20 03:19:05 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 00:19:05 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <3d375d730805191342h44cf3663y8242848e5ae2616f@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <3d375d730805191335x35c6701bgaad5f8cc40a4045e@mail.gmail.com> <3d375d730805191338g58f79ba3vddb09be4eb93c38d@mail.gmail.com> <3d375d730805191342h44cf3663y8242848e5ae2616f@mail.gmail.com> Message-ID: On Mon, May 19, 2008 at 1:42 PM, Robert Kern wrote: > And now fixed on the trunk. I believe that's the correct place to fix > bugs for 1.1.0 at this time. Yes, the trunk is where fixes should go. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pearu at cens.ioc.ee Tue May 20 03:40:01 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue, 20 May 2008 09:40:01 +0200 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> Message-ID: <48328051.1060709@cens.ioc.ee> St?fan van der Walt wrote: > Hi Pearu > > 2008/5/20 Pearu Peterson : >> CC: numpy-discussion because of other reactions on the subject. >> >> On Tue, May 20, 2008 1:26 am, Robert Kern wrote: >>> Is this an important bugfix? If not, can you hold off until 1.1.0 is >>> released? >> The patch fixes a long existing and unreported bug in f2py - I think >> the bug was introduced when Python defined min and max functions. >> I learned about the bug when reading a manuscript about f2py. Such bugs >> should not end up in a paper demonstrating f2py inability to process >> certain >> features as it would have not been designed to do so. So, I'd consider >> the bugfix important. >> >> On the other hand, the patch does not affect numpy users who do not >> use f2py, in any way. So, it is not important for numpy users, in general. > > Many f2py users currently get their version via NumPy, I assume. Yes. There is no other place. >> Hmm, I also thought that the trunk is open for development, even though >> r5198 is only fixing a bug (and I do not plan to develop f2py in numpy >> further, just fix bugs and maintain it). 
If the release process >> is going to take for weeks and is locking the trunk, may be the >> release candidates should live in a separate branch? > > If the patch > > a) Fixes an important bug and > b) has unit tests to ensure it does what it is supposed to > > then I'd be +1 for applying. It looks like there are some tests > included; to which degree do they cover the bugfix, and do we have > tests to make sure that f2py still functions correctly? Note that in past f2py was developed using a bit different model compared to what we require for numpy currently. The g3 f2py development will be carried out out of numpy tree but using numpy development model. Switching numpy.f2py to numpy development model requires substantial changes, most dramatic one would be to remove f2py;). A realistic change would require working out a way to test automatically generated extension modules. Since this requires existence of C and *Fortran* compilers, then the test runner must be able to detect the existence of compilers in order to decide whether to run such tests or not. So I beg to be flexible with f2py related commits for now. Most changes are tested by me to ensure that f2py works correctly (as a minimum, after changing f2py, I always test f2py against scipy). Some changes may need users feedback because I may not have access to all commercial Fortran compilers that numpy.distutis aims at supporting. This development model has not broke f2py in past as far as I have been concerned. If you will disallow such bug fixes in future, it would mean that maintaining f2py in numpy will be practically stopped. I am not sure that we would want that either. Pearu From david at ar.media.kyoto-u.ac.jp Tue May 20 03:37:50 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 16:37:50 +0900 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <48328051.1060709@cens.ioc.ee> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> <48328051.1060709@cens.ioc.ee> Message-ID: <48327FCE.3000201@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > So I beg to be flexible with f2py related commits for now. Why not creating a branch for the those changes, and applying only critical bug fixes to the trunk ? cheers, David From millman at berkeley.edu Tue May 20 03:53:04 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 00:53:04 -0700 Subject: [Numpy-discussion] Current state of the trunk and release Message-ID: On Fri, May 16, 2008 at 12:20 AM, Jarrod Millman wrote: > I believe that we have now addressed everything that was holding up > the 1.1.0 release, so I will be tagging the 1.1.0rc1 in about 12 > hours. Please be extremely conservative and careful about any commits > you make to the trunk until we officially release 1.1.0 (now may be a > good time to spend some effort on SciPy). Once I tag the release > candidate I will ask both David and Chris to create Windows and Mac > binaries. I will give everyone a few days to test the release > candidate and binaries thoroughly. If everything looks good, the > release candidate will become the official release. > > Once I tag 1.1.0, I will open the trunk for 1.1.1 development. Any > development for 1.2 will have to occur on a new branch. 
I also plan > to spend sometime once 1.1.0 is released discussing with the community > what we want included in 1.2. Since there seems to be some confusion about what should be happening and where, I wanted to clarify the situation. There is currently no branch for 1.1.x; the trunk is still officially the 1.1.x "branch". I have tagged a 1.1.0rc1 off the trunk for testing. A development freeze is in effect on the trunk for a few days. I plan to release 1.1.0 officially be this Friday unless something ugly shows up. If you need to work on NumPy, feel free to create a branch: http://projects.scipy.org/scipy/numpy/wiki/MakingBranches I know this is frustrating for some of you, but please just bear with me for a few days and help me try and get a stable 1.1.0 out as fast as possible. The only things that should be committed to the trunk now should be trivial bug-fixes to specific issues found by the release candidate. A good example of the kind of change I intended on the trunk right now is Robert Kern's fix to two tests that were incorrectly assuming little-endianness: http://projects.scipy.org/scipy/numpy/changeset/5196 In fact, this is exactly the type of thing I was hoping we might uncover by creating binaries for the release candidate. So if you want to help get NumPy 1.1.0, please test the release candidate as well as the release candidate binaries. Also please use restraint in making new commits. I don't have time to police ever commit, so I am asking everyone to just use their best judgment. Please be patient, the release will be out very soon. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pearu at cens.ioc.ee Tue May 20 04:00:05 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue, 20 May 2008 10:00:05 +0200 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <48327FCE.3000201@ar.media.kyoto-u.ac.jp> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> <48328051.1060709@cens.ioc.ee> <48327FCE.3000201@ar.media.kyoto-u.ac.jp> Message-ID: <48328505.6010606@cens.ioc.ee> David Cournapeau wrote: > Pearu Peterson wrote: >> So I beg to be flexible with f2py related commits for now. > > Why not creating a branch for the those changes, and applying only > critical bug fixes to the trunk ? How do you define a critical bug? Critical to whom? f2py changes are never critical to numpy users who do not use f2py. I have stated before that I am not developing numpy.f2py any further. This also means that any changes to f2py should be essentially bug fixes. Creating a branch for bug fixes is a waste of time, imho. If somebody is willing to maintain the branch, that is, periodically sync the branch with the trunk and vice-versa, then I don't mind. Pearu From david at ar.media.kyoto-u.ac.jp Tue May 20 03:50:12 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 16:50:12 +0900 Subject: [Numpy-discussion] Current state of the trunk and release In-Reply-To: References: Message-ID: <483282B4.8000001@ar.media.kyoto-u.ac.jp> Jarrod Millman wrote: > > I know this is frustrating for some of you, but please just bear with > me for a few days and help me try and get a stable 1.1.0 out as fast > as possible. 
The only things that should be committed to the trunk > now should be trivial bug-fixes to specific issues found by the > release candidate. > Ok, that should have been more explicit before, I think, because it was not obvious at all. I did a few commits: should I revert them ? It is too late for this release, but why didn't you create a 1.1.0 branch ? That way, only the release manager has to do the work :) More seriously, I do think it is more logical to branch the trunk for a release, especially in the svn trunk/branches/tags model, and I think we should follow this model for the next releases. cheers, David From david at ar.media.kyoto-u.ac.jp Tue May 20 05:03:05 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 18:03:05 +0900 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <48328505.6010606@cens.ioc.ee> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> <48328051.1060709@cens.ioc.ee> <48327FCE.3000201@ar.media.kyoto-u.ac.jp> <48328505.6010606@cens.ioc.ee> Message-ID: <483293C9.9040108@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > f2py changes are never critical to numpy users who do not use f2py. > No, but they are to scipy users if f2py cannot build scipy. > I have stated before that I am not developing numpy.f2py any further. > This also means that any changes to f2py should be essentially bug > fixes. Creating a branch for bug fixes is a waste of time, imho. > I was speaking about creating a branch for the unit tests changes you were talking about, that is things which could potentially break a lot of configurations. Is the new f2py available for users ? If yes, you should tell f2py users to use this, and just do not care at all about numpy.f2py anymore, except for critical bugs. Maintaining two versions of the same software is always a PITA, so if you don't want to spend time on it, just focus on the new f2py (as long as numpy.f2py can build scipy, of course). cheers, David From millman at berkeley.edu Tue May 20 05:47:10 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 02:47:10 -0700 Subject: [Numpy-discussion] Current state of the trunk and release In-Reply-To: <483282B4.8000001@ar.media.kyoto-u.ac.jp> References: <483282B4.8000001@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, May 20, 2008 at 12:50 AM, David Cournapeau wrote: > Ok, that should have been more explicit before, I think, because it was > not obvious at all. I did a few commits: should I revert them ? Sorry I wasn't more clear; I think we could have avoided a lot of confusion if I had done a better job explaining what I meant. Anyway, I don't think there is a need to revert any of the changes that have gone in already. I took a quick look at every change that has been made over the last few days and I didn't see anything that I believe will break anything. If anyone notices something that I have overlooked which should be removed, please let me know ASAP. Also let's be extremely careful about committing to the trunk over the next two to three days. I didn't branch because when I branched for 1.1.x two weeks ago: http://projects.scipy.org/scipy/numpy/changeset/5134 I ended up having to delete it a week later: http://projects.scipy.org/scipy/numpy/changeset/5163 So, I tried something different this time. 
I actually would prefer to follow the conventional svn trunk/branches/tags model--as long as everyone else is willing to follow it. After all, I actually voted for it before I vote against it. I will send out an email momentarily proposing something closer to this. Please respond to my next email and not this one. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Tue May 20 05:59:49 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 02:59:49 -0700 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development Message-ID: Hello everyone, In response to some of the recent discussion (and confusion) surrounding my plan for the trunk and 1.2 development over the next month or two, I propose the following: 1. Stay with the "original" plan until Thursday, May 22nd. That is, all commits to the trunk should be extremely trivial, bug-fixes for issues that specifically arise during from the release candidate. (I think that there is no need to revert any changes that have happened on the trunk up to this point, but let's be more conservative going forward over the next 2 days.) 2. On Thursday, May 22nd, I will create a 1.1.x branch off the trunk. I will tag 1.1.0 off the branch. The trunk will become open for development of the 1.2.x series. Commits to the 1.1.x branch should only be relatively trivial bug-fixes. Ideally, 1.1.1 will be practically identical to 1.1.0 with only a handful of important bug-fixes. More specifically, this means that 1. There will be no new features added to the 1.1.x series. 2. There will be only very minor documentation fixes to the 1.1.x series. Only if the documentation has something so incorrect that it is considered a bug will it be updated. In particular, most of Stefan's work will go into 1.2.x. 3. There will be only very trivial bug-fixes in the 1.1.x series. I would expect no more than, say, 10 bug-fixes in a given micro/maintenance release. This is just a rule of thumb; but given our current development size, I would prefer to see most of our effort focused on the 1.2.x series. Given that 1.2.0 will be released by the end of August, this shouldn't cause any major problems for our users. Moreover, it will help ensure that if they upgrade from 1.1.x to 1.1.x+1 that there will be almost no chance that their code will break. Commits to the trunk (1.2.x) should follow these rules: 1. Documentation fixes are allowed and strongly encouraged. 2. Bug-fixes are strongly encouraged. 3. Do not break backwards compatibility. 4. New features are permissible. 5. New tests are highly desirable. 6. If you add a new feature, it must have tests. 7. If you fix a bug, it must have tests. If you want to break a rule, don't. If you feel you absolutely have to, please don't--but feel free send an email to the list explain your problem. In addition to these rules for 1.2 development, let me remind everyone that we have all ready agreed that 1.2 will: 1. Use the nose testing framework. 2. Require Python 2.4 or greater. This means we have built-in decorators, set objects, generators, etc. 3. Contain some planned changes to median and histogram. I hope this is more clear than the numerous partial emails I sent out before. Again let me apologize for the earlier confusion. Please let me know if this is a workable plan for you. 
In particular, let me know it there is some aspect of this that you simply refuse to agree to in at least principle. Also at this point let's focus on the overall picture. Namely, that (1) the trunk is currently is a relatively hard freeze and that (2) I will create a 1.1.x branch on Thursday and open the trunk for 1.2 development. The other details should be viewed as explanatory narrative that can be refined later. Unless there are major objections to this proposal, I will take some time later this week to make this information available on the wiki. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From stefan at sun.ac.za Tue May 20 06:20:04 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 12:20:04 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: Message-ID: <9457e7c80805200320s5e364940xe28d72c87a7fd55c@mail.gmail.com> 2008/5/20 Jarrod Millman : > In response to some of the recent discussion (and confusion) > surrounding my plan for the trunk and 1.2 development over the next > month or two, I propose the following: Thank you for the clarification, Jarrod. Your plan is sound, I'm on board. Regards St?fan From pearu at cens.ioc.ee Tue May 20 06:21:32 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue, 20 May 2008 13:21:32 +0300 (EEST) Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <483293C9.9040108@ar.media.kyoto-u.ac.jp> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> <48328051.1060709@cens.ioc.ee> <48327FCE.3000201@ar.media.kyoto-u.ac.jp> <48328505.6010606@cens.ioc.ee> <483293C9.9040108@ar.media.kyoto-u.ac.jp> Message-ID: <38753.129.240.222.155.1211278892.squirrel@cens.ioc.ee> On Tue, May 20, 2008 12:03 pm, David Cournapeau wrote: > Pearu Peterson wrote: >> f2py changes are never critical to numpy users who do not use f2py. >> > No, but they are to scipy users if f2py cannot build scipy. Well, I know pretty well what f2py features scipy uses and what could break scipy build. So, don't worry about that. >> I have stated before that I am not developing numpy.f2py any further. >> This also means that any changes to f2py should be essentially bug >> fixes. Creating a branch for bug fixes is a waste of time, imho. >> > I was speaking about creating a branch for the unit tests changes you > were talking about, that is things which could potentially break a lot > of configurations. A branch for the unit tests changes is of course reasonable. > Is the new f2py available for users ? If yes,.. No, it is far from being usable now. The numpy.f2py and g3 f2py are completely different software. The changeset was fixing a bug in numpy.f2py, it has nothing to do with g3 f2py. 
amazing-how-paranoiac-is-todays-numpy/scipy-development'ly yours, Pearu From millman at berkeley.edu Tue May 20 06:36:41 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 03:36:41 -0700 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> Message-ID: On Mon, May 19, 2008 at 10:29 PM, Pearu Peterson wrote: > On Tue, May 20, 2008 1:26 am, Robert Kern wrote: >> Is this an important bugfix? If not, can you hold off until 1.1.0 is >> released? > > The patch fixes a long existing and unreported bug in f2py - I think > the bug was introduced when Python defined min and max functions. > I learned about the bug when reading a manuscript about f2py. Such bugs > should not end up in a paper demonstrating f2py inability to process > certain features as it would have not been designed to do so. So, I'd consider > the bugfix important. I have been struggling to try and get a stable release out since February and every time I think that the release is almost ready some piece of code changes that requires me to delay. While overall the code has continuously improved over this period, I think it is time to get these improvements to our users. That said, I am willing to leave this change on the trunk, but please refrain from making any more changes until we release 1.1.0. I know it can be frustrating, but, I believe, this is the first time I have asked the community to not make commits to the trunk since I started handling releases almost a year ago. The freeze has only been in effect since Saturday and will last less than one week in total. I would have preferred if you could have made this change during any one of the other 51 weeks of the year. > Hmm, I also thought that the trunk is open for development, even though > r5198 is only fixing a bug (and I do not plan to develop f2py in numpy > further, just fix bugs and maintain it). If the release process > is going to take for weeks and is locking the trunk, may be the > release candidates should live in a separate branch? Sorry for the confusion, I had asked that everyone "be extremely conservative and careful about any commits you make to the trunk until we officially release 1.1.0, " which is still pretty much the rule of thumb. I have been keeping the 1.1.0 milestone page up-to-date with regard to my planned release date; but I should have highlighted the date in my email. The main reason that this is happening on the trunk is that about two weeks ago I created a 1.1.x branch, but I didn't think all the bug-fixes for the 1.1.0 release were being made to the branch and the branch and the trunk got out of synch enough that it was difficult for me to merge the fixes in the trunk into the branch, so I deleted the branch and declared the trunk to again be where 1.1.x development occurred. I fully intend to release 1.1.0 by the end of the week. I also intend to create a 1.1.x maintenance branch at that point, so the trunk will be open for 1.2 development. As long as you are only going to be adding bug-fixes to numpy.f2py, I think that you should be able to use the trunk for this purpose once I create the 1.1.x branch. 
Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Tue May 20 06:39:47 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 03:39:47 -0700 Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: <48328505.6010606@cens.ioc.ee> References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> <9457e7c80805192356x59710f41pd90f547c860ddd82@mail.gmail.com> <48328051.1060709@cens.ioc.ee> <48327FCE.3000201@ar.media.kyoto-u.ac.jp> <48328505.6010606@cens.ioc.ee> Message-ID: On Tue, May 20, 2008 at 1:00 AM, Pearu Peterson wrote: > How do you define a critical bug? Critical to whom? I know that the definition of a "critical bug" is somewhat ill-defined, but I think that "a long existing and unreported bug" probably wouldn't fall into the category of "critical bug". -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pearu at cens.ioc.ee Tue May 20 06:47:56 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue, 20 May 2008 13:47:56 +0300 (EEST) Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: Message-ID: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> On Tue, May 20, 2008 12:59 pm, Jarrod Millman wrote: > Commits to the trunk (1.2.x) should follow these rules: > > 1. Documentation fixes are allowed and strongly encouraged. > 2. Bug-fixes are strongly encouraged. > 3. Do not break backwards compatibility. > 4. New features are permissible. > 5. New tests are highly desirable. > 6. If you add a new feature, it must have tests. > 7. If you fix a bug, it must have tests. > > If you want to break a rule, don't. If you feel you absolutely have > to, please don't--but feel free send an email to the list explain your > problem. ... > In particular, let me know it there is some aspect of this that > you simply refuse to agree to in at least principle. Since you asked, I have a problem with the rule 7 when applying it to packages like numpy.distutils and numpy.f2py, for instance. Do you realize that there exists bugs/features for which unittests cannot be written in principle? An example: say, a compiler vendor changes a flag of the new version of the compiler so that numpy.distutils is not able to detect the compiler or it uses wrong flags for the new compiler when compiling sources. Often, the required fix is trivial to find and apply, also just reading the code one can easily verify that the patch does not break anything. However, to write a unittest covering such a change would mean that one needs to ship also the corresponding compiler to the unittest directory. This is nonsense, of course. I can find other similar examples that have needed attention and changes to numpy.distutils and numpy.f2py in past and I know that few are coming up. 
Pearu From pearu at cens.ioc.ee Tue May 20 06:50:05 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue, 20 May 2008 13:50:05 +0300 (EEST) Subject: [Numpy-discussion] [Numpy-svn] r5198 - trunk/numpy/f2py In-Reply-To: References: <20080519221044.CBF9E39C9CE@scipy.org> <3d375d730805191526s62993da4ja04817bd5a0b5b2a@mail.gmail.com> <54635.88.89.123.83.1211261369.squirrel@cens.ioc.ee> Message-ID: <60776.129.240.222.155.1211280605.squirrel@cens.ioc.ee> On Tue, May 20, 2008 1:36 pm, Jarrod Millman wrote: > On Mon, May 19, 2008 at 10:29 PM, Pearu Peterson > wrote: >> On Tue, May 20, 2008 1:26 am, Robert Kern wrote: >>> Is this an important bugfix? If not, can you hold off until 1.1.0 is >>> released? >> >> The patch fixes a long existing and unreported bug in f2py - I think >> the bug was introduced when Python defined min and max functions. >> I learned about the bug when reading a manuscript about f2py. Such bugs >> should not end up in a paper demonstrating f2py inability to process >> certain features as it would have not been designed to do so. So, I'd >> consider >> the bugfix important. > > I have been struggling to try and get a stable release out since > February and every time I think that the release is almost ready some > piece of code changes that requires me to delay. While overall the > code has continuously improved over this period, I think it is time to > get these improvements to our users. > > That said, I am willing to leave this change on the trunk, but please > refrain from making any more changes until we release 1.1.0. I know > it can be frustrating, but, I believe, this is the first time I have > asked the community to not make commits to the trunk since I started > handling releases almost a year ago. The freeze has only been in > effect since Saturday and will last less than one week in total. I > would have preferred if you could have made this change during any one > of the other 51 weeks of the year. Please, go ahead. I'll not commit non-critical changes until the trunk is open again. Pearu From david at ar.media.kyoto-u.ac.jp Tue May 20 07:06:34 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 20 May 2008 20:06:34 +0900 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: <4832B0BA.7020500@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > > Since you asked, I have a problem with the rule 7 when applying > it to packages like numpy.distutils and numpy.f2py, for instance. > Although Jarrod did not mention it, I think everybody agrees that distutils is not really testable, by its nature and design. But that's even more a reason to be careful when changing it just before a release. I don't see why f2py would not be testable by nature, though. A big part of it is parsing fortran, right ? > Often, the required fix > is trivial to find and apply, also just reading the code one can > easily verify that the patch does not break anything. IMHO, the most useful aspect of unit tests is not testing that the fix works, but being sure that if it breaks again, it will be detected (regression test). This is specially important when refactoring. 
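(By way of illustration only -- a regression test for a fixed bug can be as small as the following; the names and values here are made up, not from any real ticket:)

import numpy as np
from numpy.testing import assert_array_almost_equal

def test_regression_hypothetical_ticket():
    # Suppose a bug once made this operation return wrong values for
    # some input.  Keeping a test like this around means the bug cannot
    # silently come back during later refactoring.
    x = np.array([1.0, 2.0, 3.0])
    assert_array_almost_equal(np.cumsum(x), [1.0, 3.0, 6.0])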
David From stefan at sun.ac.za Tue May 20 07:26:34 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 13:26:34 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: <9457e7c80805200426v11a82b2y316c90debb1bbda7@mail.gmail.com> 2008/5/20 Pearu Peterson : >> 7. If you fix a bug, it must have tests. > > Since you asked, I have a problem with the rule 7 when applying > it to packages like numpy.distutils and numpy.f2py, for instance. > > Do you realize that there exists bugs/features for which unittests cannot > be written in principle? An example: say, a compiler vendor changes > a flag of the new version of the compiler so that numpy.distutils > is not able to detect the compiler or it uses wrong flags for the > new compiler when compiling sources. Often, the required fix > is trivial to find and apply, also just reading the code one can > easily verify that the patch does not break anything. However, to > write a unittest covering such a change would mean that one needs > to ship also the corresponding compiler to the unittest directory. > This is nonsense, of course. I can find other similar examples > that have needed attention and changes to numpy.distutils and > numpy.f2py in past and I know that few are coming up. My earlier message, regarding your patch to trunk, was maybe not clearly phrased. I didn't want to start a unit-testing discussion, and was simply trying to say "if we apply patches now, we should make sure they work". You did that, so I was happy. I've been the main transgressor in applying new changes to trunk; sorry about the misunderstanding, Jarrod. As for your comment above: yes, writing unit tests is hard. As you mentioned, sometimes changes are trivial, difficult to test and you can see they work. If the code was already exercised in the test suite, I would be less worried about such trivial changes. Then, at least, we know that the code is executed every time the test suite runs, so if a person forgot to close a parenthesis or a string quote, it would be picked up. The case I describe above is exceptional (but it does occur much more frequently in f2py and distutils). Still, I would not say that those tests are impossible to write, just that they require thinking outside the box. For example, it would be quite feasible to have a set of small Python scripts that pretend to be compilers, to assist in asserting that the flags we think we pass are the ones that reach the compiler. Similarly, a fake directory tree can be created to help verify that `site.cfg` is correctly parsed and applied. You are right: some factors are out of our control, but we need to test for every piece of functionality that isn't. Regards St?fan From millman at berkeley.edu Tue May 20 08:11:57 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 05:11:57 -0700 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: On Tue, May 20, 2008 at 3:47 AM, Pearu Peterson wrote: > On Tue, May 20, 2008 12:59 pm, Jarrod Millman wrote: > >> Commits to the trunk (1.2.x) should follow these rules: >> >> 1. Documentation fixes are allowed and strongly encouraged. >> 2. Bug-fixes are strongly encouraged. >> 3. 
Do not break backwards compatibility. >> 4. New features are permissible. >> 5. New tests are highly desirable. >> 6. If you add a new feature, it must have tests. >> 7. If you fix a bug, it must have tests. >> >> If you want to break a rule, don't. If you feel you absolutely have >> to, please don't--but feel free send an email to the list explain your >> problem. > ... >> In particular, let me know it there is some aspect of this that >> you simply refuse to agree to in at least principle. > > Since you asked, I have a problem with the rule 7 when applying > it to packages like numpy.distutils and numpy.f2py, for instance. I obviously knew this would be controversial. Personally, I would prefer if we boldly state that tests are required. I know that this "rule" will be broken occasionally and may not even always make sense. We could change the language to be something like "if you a fix a bug, there should be tests". Saying "you must" means that you're breaking a rule when you don't. Importantly I am not proposing that we have some new enforcement mechanism; I am happy to leave this to Stefan and whoever else wants to join him. (Thanks Stefan--you have taken on a tough job!) However, let's not worry too much about this at this point. Let's get 1.1.0 out. I think everyone agrees that unit tests are a good idea, some are more passionate than others. We need to figure out how best to increase the number of tests, but we are doing a great job currently. NumPy 1.0.4 had around 686 tests and the trunk now has roughly 996 tests. For now, let's officially consider rules 6 and 7 to have question marks at the end of them. Once 1.1.0 is out and we have started developing 1.2, we can start this conversation again. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From steve at shrogers.com Tue May 20 08:14:43 2008 From: steve at shrogers.com (Steven H. Rogers) Date: Tue, 20 May 2008 06:14:43 -0600 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <1211117327.8207.37.camel@localhost.localdomain> References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> Message-ID: <4832C0B3.8040503@shrogers.com> Pauli Virtanen wrote: > Hi, > > su, 2008-05-18 kello 07:16 -0600, Steven H. Rogers kirjoitti: > >> Joe Harrington wrote: >> >>> NUMPY/SCIPY DOCUMENTATION MARATHON 2008 >>> ... >>> 5. Write a new help function that optionally produces ASCII or points >>> the user's PDF or HTML reader to the right page (either local or >>> global). >>> >>> >> I can work on this. Fernando suggested this at the IPython sprint in >> Boulder last year, so I've given it some thought and started a wiki page: >> http://ipython.scipy.org/moin/Developer_Zone/SearchDocs >> > > In Numpy SVN/1.1 there is a function "lookfor" that searches the > docstrings for a substring (no stemming etc. is done). Similar > "%lookfor" magic command got accepted into IPython0 as an extension > ipy_lookfor.py. Improvements to these would be surely appreciated. > > I think that also Sphinx supports searching, so that the generated HTML > docs [1] are searchable, as is the generated PDF output. > > Pauli > > > .. [1] http://mentat.za.net/numpy/refguide/ > So far, this preview contains only docs for ndarray, though. > > Thanks Pauli. Looking at these. 
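For anyone who hasn't tried it yet, the SVN lookfor function is just a keyword search over the docstrings -- something along these lines (output omitted here):

import numpy as np

# Prints a relevance-sorted list of numpy functions whose docstrings
# mention the phrase, with the first line of each docstring.
np.lookfor('eigenvalue')

# The search can also be pointed at a particular module.
np.lookfor('solve', module='numpy.linalg')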
# Steve From stefan at sun.ac.za Tue May 20 08:30:23 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 14:30:23 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <4832C0B3.8040503@shrogers.com> References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> Message-ID: <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> 2008/5/20 Steven H. Rogers : >> .. [1] http://mentat.za.net/numpy/refguide/ >> So far, this preview contains only docs for ndarray, though. The reference guide has been updated to contain the entire numpy. Once we've applied indexing tags to functions, those will be sorted in a more coherent manner. Also, the math role and directive, i.e. :math:`\int_0^\infty` and .. math:: \int_0^\infty now render correctly. This is achieved using mathml in the xhtml files (so you need to install a mathml plugin if you use Internet Explorer). For en example, see "bartlett" (use the index to find it, quicksearch is currently broken). Regards St?fan From aisaac at american.edu Tue May 20 09:07:02 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 20 May 2008 09:07:02 -0400 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> References: <48302C30.2090502@shrogers.com><1211117327.8207.37.camel@localhost.localdomain><4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> Message-ID: On Tue, 20 May 2008, St?fan van der Walt apparently wrote: > Also, the math role and directive, i.e. > :math:`\int_0^\infty` and > .. math:: \int_0^\infty > now render correctly. Is this being done with Jens's writers? If not, I'd like to know how. Thank you, Alan Isaac PS There is currently active discussion by the docutils developers about implementing moving the math role and directive into docutils. Discovered issues could be usefully shared right now! From stefan at sun.ac.za Tue May 20 09:25:56 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 15:25:56 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> Message-ID: <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> Hi Alan Yes, the one discussed in this thread: http://groups.google.com/group/sphinx-dev/browse_thread/thread/ef74352b9f196002/0e257bc8c116f73f I've only had to make one change so far, to parse '*' as in 'A^*' (patch attached). Unfortunately, the author chose incomprehensible variable names like 'mo', 'mi', and 'mn', so I'm not sure I fixed it the right way. I'd be very glad if these directives became part of docutils -- I'd be glad if you could monitor that situation for us. Regards St?fan 2008/5/20 Alan G Isaac : > On Tue, 20 May 2008, St?fan van der Walt apparently wrote: >> Also, the math role and directive, i.e. >> :math:`\int_0^\infty` and >> .. math:: \int_0^\infty >> now render correctly. > > Is this being done with Jens's writers? > If not, I'd like to know how. > > Thank you, > Alan Isaac > > PS There is currently active discussion by the docutils > developers about implementing moving the math role and > directive into docutils. 
Discovered issues could be > usefully shared right now! -------------- next part -------------- A non-text attachment was scrubbed... Name: mathml.patch Type: application/octet-stream Size: 411 bytes Desc: not available URL: From gary.pajer at gmail.com Tue May 20 09:32:37 2008 From: gary.pajer at gmail.com (Gary Pajer) Date: Tue, 20 May 2008 09:32:37 -0400 Subject: [Numpy-discussion] data exchange format Message-ID: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> I want to store data in a way that can be read by a C or Matlab program. Not too much data, not too complicated: a dozen or so floats, a few integers, a few strings, and a (3, x) numpy array where typically 500 < x < 30000. I was about to create my own format for storage when it occurred to me that I might want to use XML or some other standard format. Like JSON, perhaps. Can anyone comment, esp relating to numpy implementation issues, or offer suggestions? Thanks, Gary From igorsyl at gmail.com Tue May 20 09:33:03 2008 From: igorsyl at gmail.com (Igor Sylvester) Date: Tue, 20 May 2008 08:33:03 -0500 Subject: [Numpy-discussion] building issue in windows Message-ID: Hi. I have mingw and Visual Studio installed on my computer. I am following the building instructions posted in [1]. I explicitly tell setup.py to use mingw by passing the argument --compiler=mingw32. However, setuptools is using Visual Studio anyways. has anyone encountered this problem? I am putting the mingw bin directory in my PATH without avail. Is there extra configuration needed for a system with both compilers installed? Thank you for your help. Igor [1] http://www.scipy.org/Installing_SciPy/Windows -------------- next part -------------- An HTML attachment was scrubbed... URL: From beckers at orn.mpg.de Tue May 20 10:26:31 2008 From: beckers at orn.mpg.de (Gabriel J.L. Beckers) Date: Tue, 20 May 2008 16:26:31 +0200 Subject: [Numpy-discussion] data exchange format In-Reply-To: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> References: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> Message-ID: <1211293591.12803.7.camel@gabriel-desktop> PyTables is an efficient way of doing it (http://www.pytables.org). You essentially write data to a HDF5 file, which is portable and can be read in Matlab or in a C program (using the HDF5 library). Gabriel On Tue, 2008-05-20 at 09:32 -0400, Gary Pajer wrote: > I want to store data in a way that can be read by a C or Matlab program. > > Not too much data, not too complicated: a dozen or so floats, a few > integers, a few strings, and a (3, x) numpy array where typically 500 > < x < 30000. > > I was about to create my own format for storage when it occurred to me > that I might want to use XML or some other standard format. Like > JSON, perhaps. Can anyone comment, esp relating to numpy > implementation issues, or offer suggestions? 
> > Thanks, > Gary > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From rpyle at post.harvard.edu Tue May 20 10:27:16 2008 From: rpyle at post.harvard.edu (Robert Pyle) Date: Tue, 20 May 2008 10:27:16 -0400 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> Message-ID: <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> Hi all, On May 19, 2008, at 3:39 PM, Christopher Burns wrote: > I've built a Mac binary for the 1.1 release candidate. Mac users, > please test it from: > > https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg > > This is for the MacPython installed from python.org. From System Profiler --- Hardware Overview: Model Name: MacBook Pro Model Identifier: MacBookPro3,1 Processor Name: Intel Core 2 Duo Running 10.5.2 Uneventful installation, tests as follows --- >>> np.test() Numpy is installed in /Library/Frameworks/Python.framework/Versions/ 2.5/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 (Apple Computer, Inc. build 5363)] --- skipping details --- Ran 1004 tests in 1.939s OK ----- Thanks to all for the hard work. Bob Pyle From frankb.mail at gmail.com Tue May 20 10:29:01 2008 From: frankb.mail at gmail.com (Francis Bitonti) Date: Tue, 20 May 2008 10:29:01 -0400 Subject: [Numpy-discussion] no mail Message-ID: <22887860805200729r64b93491p20406dc6390b31d7@mail.gmail.com> Can I please be removed from this mailing list -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglist.honeypot at gmail.com Tue May 20 10:35:56 2008 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Tue, 20 May 2008 10:35:56 -0400 Subject: [Numpy-discussion] no mail In-Reply-To: <22887860805200729r64b93491p20406dc6390b31d7@mail.gmail.com> References: <22887860805200729r64b93491p20406dc6390b31d7@mail.gmail.com> Message-ID: <8347B7CA-1F30-422B-A658-A0308CA6F188@gmail.com> > Can I please be removed from this mailing list There is a link at the bottom of every email for the numpy discussion list: > http://projects.scipy.org/mailman/listinfo/numpy-discussion At the bottom of that page, you'll find what you're looking for, where it says "To unsubscribe from Numpy-discussion ..." -steve From schut at sarvision.nl Tue May 20 10:34:16 2008 From: schut at sarvision.nl (Vincent Schut) Date: Tue, 20 May 2008 16:34:16 +0200 Subject: [Numpy-discussion] first recarray steps Message-ID: Hi, I'm trying to get into recarrays. Unfortunately documentation is a bit on the short side... Lets say I have a rgb image of arbitrary size, as a normal ndarray (that's what my image reading lib gives me). Thus shape is (3,ysize,xsize), dtype = int8. How would I convert/view this as a recarray of shape (ysize, xsize) with the first dimension split up into 'r', 'g', 'b' fields? No need for 'x' and 'y' fields. I tried creating a numpy dtype {names: ('r','g','b'), formats: (numpy.int8,)*3}, but when I try to raw_img.view(rgb_dtype) I get: "ValueError: new type not compatible with array." Now this probably should not be too difficult, but I just don't see it... Thanks, Vincent. 
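A minimal sketch of the conversion being asked about here, to go with the replies that follow. The shapes and names below are invented for illustration; the key point is that, roughly speaking, ndarray.view reinterprets the last, contiguous axis, so the colour channels have to be moved there before a 3-byte record dtype can be applied:

import numpy as np

raw = np.zeros((3, 4, 5), dtype=np.uint8)   # hypothetical channel-first image
rgb = np.dtype([('r', np.uint8), ('g', np.uint8), ('b', np.uint8)])

pix = np.ascontiguousarray(raw.transpose(1, 2, 0))   # shape (4, 5, 3), channels last; this copies
rec = pix.view(rgb)                  # three uint8 values per pixel become one record
rec = rec.reshape(pix.shape[:2])     # drop any leftover length-1 axis -> shape (4, 5)
# rec['r'], rec['g'] and rec['b'] each have shape (4, 5)

The copy made by ascontiguousarray plays the same role as the reshape of the transposed array in the reply below.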
From stefan at sun.ac.za Tue May 20 11:31:33 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 17:31:33 +0200 Subject: [Numpy-discussion] first recarray steps In-Reply-To: References: Message-ID: <9457e7c80805200831w38d02e00yb3faca84dbb34e3f@mail.gmail.com> Hi Vincent 2008/5/20 Vincent Schut : > Hi, I'm trying to get into recarrays. Unfortunately documentation is a > bit on the short side... > > Lets say I have a rgb image of arbitrary size, as a normal ndarray > (that's what my image reading lib gives me). Thus shape is > (3,ysize,xsize), dtype = int8. How would I convert/view this as a > recarray of shape (ysize, xsize) with the first dimension split up into > 'r', 'g', 'b' fields? No need for 'x' and 'y' fields. First, you need to flatten the array so you have one (r,g,b) element per row. Say you have x with shape (3, 4, 4): x = x.T.reshape((-1,3)) Then you can view it with your new dtype: dt = np.dtype([('r',np.int8),('g',np.int8),('b',np.int8)]) x = x.view(dt) Then you must reshape it back to your original pixel arrangement: x = x.reshape((4,4)).T Or you can do it all in one go: x.T.reshape((-1,x.shape[0])).view(dt).reshape(x.shape[1:]).T Maybe someone else comes up with an easier way. Cheers St?fan From charlesr.harris at gmail.com Tue May 20 11:37:02 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 09:37:02 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: On Tue, May 20, 2008 at 6:11 AM, Jarrod Millman wrote: > On Tue, May 20, 2008 at 3:47 AM, Pearu Peterson wrote: > > On Tue, May 20, 2008 12:59 pm, Jarrod Millman wrote: > > > >> Commits to the trunk (1.2.x) should follow these rules: > >> > >> 1. Documentation fixes are allowed and strongly encouraged. > >> 2. Bug-fixes are strongly encouraged. > >> 3. Do not break backwards compatibility. > >> 4. New features are permissible. > >> 5. New tests are highly desirable. > >> 6. If you add a new feature, it must have tests. > >> 7. If you fix a bug, it must have tests. > >> > >> If you want to break a rule, don't. If you feel you absolutely have > >> to, please don't--but feel free send an email to the list explain your > >> problem. > > ... > >> In particular, let me know it there is some aspect of this that > >> you simply refuse to agree to in at least principle. > > > > Since you asked, I have a problem with the rule 7 when applying > > it to packages like numpy.distutils and numpy.f2py, for instance. > > I obviously knew this would be controversial. Personally, I would > prefer if we boldly state that tests are required. I know that this > "rule" will be broken occasionally and may not even always make sense. > We could change the language to be something like "if you a fix a > bug, there should be tests". Saying "you must" means that you're > breaking a rule when you don't. Importantly I am not proposing that > we have some new enforcement mechanism; I am happy to leave this to > Stefan and whoever else wants to join him. (Thanks Stefan--you have > taken on a tough job!) > > However, let's not worry too much about this at this point. Let's get > 1.1.0 out. I think everyone agrees that unit tests are a good idea, > some are more passionate than others. We need to figure out how best > to increase the number of tests, but we are doing a great job > currently. NumPy 1.0.4 had around 686 tests and the trunk now has > roughly 996 tests. 
> > For now, let's officially consider rules 6 and 7 to have question > marks at the end of them. Once 1.1.0 is out and we have started > developing 1.2, we can start this conversation again. > Two of the buildbots are showing problems, probably installation related, but it would be nice to see all green before the release. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Tue May 20 11:48:02 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 08:48:02 -0700 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: On Tue, May 20, 2008 at 8:37 AM, Charles R Harris wrote: > Two of the buildbots are showing problems, probably installation related, > but it would be nice to see all green before the release. Absolutely. Thanks for catching this. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From Chris.Barker at noaa.gov Tue May 20 12:08:26 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 20 May 2008 09:08:26 -0700 Subject: [Numpy-discussion] first recarray steps In-Reply-To: References: Message-ID: <4832F77A.8020804@noaa.gov> Vincent Schut wrote: > Lets say I have a rgb image of arbitrary size, as a normal ndarray > (that's what my image reading lib gives me). Thus shape is > (3,ysize,xsize), dtype = int8. How would I convert/view this as a > recarray of shape (ysize, xsize) with the first dimension split up into > 'r', 'g', 'b' fields? No need for 'x' and 'y' fields. Take a look in this list for a thread entitled "recarray fun" about a month ago -- you'll find some more discussion of approaches. Also, if you image data is rgb, usually, that's a (width, height, 3) array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs may give you that, I'm not sure. Also, you probably want a uint8 dtype, giving you 0-255 for each byte. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From v.gkinis at rug.nl Tue May 20 12:07:12 2008 From: v.gkinis at rug.nl (Vasileios Gkinis) Date: Tue, 20 May 2008 18:07:12 +0200 Subject: [Numpy-discussion] question on NumPy NaN Message-ID: <4832F730.7070701@rug.nl> -------- Original Message -------- Subject: question on NumPy NaN Date: Tue, 20 May 2008 18:03:00 +0200 From: Vasileios Gkinis To: numpy-discussion at scipy.org Hi all, I have a question concerning nan in NumPy. Lets say i have an array of sample measurements a = array((2,4,nan)) in NumPy calculating the mean of the elements in array a looks like: >>> a = array((2,4,nan)) >>> a array([ 2., 4., NaN]) >>> mean(a) nan What if i simply dont want nan to propagate and get something that would look like: >>> a = array((2,4,nan)) >>> a array([ 2., 4., NaN]) >>> mean(a) 3. Cheers Vasilis -- ------------------------------------------------------------ Vasileios Gkinis, PhD Student Centre for Ice and Climate Niels Bohr Institute Juliane Maries Vej 30, room 321 DK-2100 Copenhagen Denmark Office: +45 35325913 v.gkinis at gfy.ku.dk -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peridot.faceted at gmail.com Tue May 20 12:11:38 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 20 May 2008 12:11:38 -0400 Subject: [Numpy-discussion] question on NumPy NaN In-Reply-To: <4832F730.7070701@rug.nl> References: <4832F730.7070701@rug.nl> Message-ID: 2008/5/20 Vasileios Gkinis : > I have a question concerning nan in NumPy. > Lets say i have an array of sample measurements > a = array((2,4,nan)) > in NumPy calculating the mean of the elements in array a looks like: > >>>> a = array((2,4,nan)) >>>> a > array([ 2., 4., NaN]) >>>> mean(a) > nan > > What if i simply dont want nan to propagate and get something that would > look like: > >>>> a = array((2,4,nan)) >>>> a > array([ 2., 4., NaN]) >>>> mean(a) > 3. For more elaborate handling of missing data, look into "masked arrays", in numpy.ma. They are designed to deal with exactly this sort of thing. Anne From gary.pajer at gmail.com Tue May 20 12:11:47 2008 From: gary.pajer at gmail.com (Gary Pajer) Date: Tue, 20 May 2008 12:11:47 -0400 Subject: [Numpy-discussion] data exchange format In-Reply-To: <1211293591.12803.7.camel@gabriel-desktop> References: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> <1211293591.12803.7.camel@gabriel-desktop> Message-ID: <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> On Tue, May 20, 2008 at 10:26 AM, Gabriel J.L. Beckers wrote: > PyTables is an efficient way of doing it (http://www.pytables.org). You > essentially write data to a HDF5 file, which is portable and can be read > in Matlab or in a C program (using the HDF5 library). > > Gabriel I thought about that. It seems to have much more than I need, so I wonder if it's got more overhead / less speed / more complex API than I need. But big isn't necessarily bad, but it might be. Is pytables overkill? > > On Tue, 2008-05-20 at 09:32 -0400, Gary Pajer wrote: >> I want to store data in a way that can be read by a C or Matlab program. >> >> Not too much data, not too complicated: a dozen or so floats, a few >> integers, a few strings, and a (3, x) numpy array where typically 500 >> < x < 30000. >> >> I was about to create my own format for storage when it occurred to me >> that I might want to use XML or some other standard format. Like >> JSON, perhaps. Can anyone comment, esp relating to numpy >> implementation issues, or offer suggestions? >> >> Thanks, >> Gary >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue May 20 12:18:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 10:18:16 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: On Tue, May 20, 2008 at 9:48 AM, Jarrod Millman wrote: > On Tue, May 20, 2008 at 8:37 AM, Charles R Harris > wrote: > > Two of the buildbots are showing problems, probably installation related, > > but it would be nice to see all green before the release. > > Absolutely. Thanks for catching this. > It would be good if we could find a PPC to add to the buildbot in order to catch endianess problems. SPARC might also do for this. 
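To spell out the numpy.ma approach suggested above for the earlier question about ignoring NaNs in a mean, a small sketch using the values from that question:

>>> import numpy as np
>>> a = np.array((2.0, 4.0, np.nan))
>>> m = np.ma.masked_array(a, mask=np.isnan(a))
>>> m.mean()   # -> 3.0; the NaN entry is left out of the average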
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Tue May 20 12:18:34 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 18:18:34 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: <9457e7c80805200918s588c15dclfd6937376aeba3e0@mail.gmail.com> 2008/5/20 Charles R Harris : > Two of the buildbots are showing problems, probably installation related, > but it would be nice to see all green before the release. Thanks, fixed in SVN. Regards St?fan From kwgoodman at gmail.com Tue May 20 12:24:36 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 20 May 2008 09:24:36 -0700 Subject: [Numpy-discussion] question on NumPy NaN In-Reply-To: References: <4832F730.7070701@rug.nl> Message-ID: On Tue, May 20, 2008 at 9:11 AM, Anne Archibald wrote: > 2008/5/20 Vasileios Gkinis : > >> I have a question concerning nan in NumPy. >> Lets say i have an array of sample measurements >> a = array((2,4,nan)) >> in NumPy calculating the mean of the elements in array a looks like: >> >>>>> a = array((2,4,nan)) >>>>> a >> array([ 2., 4., NaN]) >>>>> mean(a) >> nan >> >> What if i simply dont want nan to propagate and get something that would >> look like: >> >>>>> a = array((2,4,nan)) >>>>> a >> array([ 2., 4., NaN]) >>>>> mean(a) >> 3. > > For more elaborate handling of missing data, look into "masked > arrays", in numpy.ma. They are designed to deal with exactly this sort > of thing. Or np.nansum(a) / np.isfinite(a).sum() A nanmean would be nice to have in numpy. From stefan at sun.ac.za Tue May 20 12:30:30 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 18:30:30 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> Message-ID: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> 2008/5/20 Charles R Harris : > > > On Tue, May 20, 2008 at 9:48 AM, Jarrod Millman > wrote: >> >> On Tue, May 20, 2008 at 8:37 AM, Charles R Harris >> wrote: >> > Two of the buildbots are showing problems, probably installation >> > related, >> > but it would be nice to see all green before the release. >> >> Absolutely. Thanks for catching this. > > It would be good if we could find a PPC to add to the buildbot in order to > catch endianess problems. > SPARC might also do for this. Absolutely! If anybody has access to such a machine and is willing to let a build-slave run on it, please let me know and I'll send you the necessary configuration files. Regards St?fan From charlesr.harris at gmail.com Tue May 20 12:31:54 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 10:31:54 -0600 Subject: [Numpy-discussion] data exchange format In-Reply-To: <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> References: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> <1211293591.12803.7.camel@gabriel-desktop> <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> Message-ID: On Tue, May 20, 2008 at 10:11 AM, Gary Pajer wrote: > On Tue, May 20, 2008 at 10:26 AM, Gabriel J.L. Beckers > wrote: > > PyTables is an efficient way of doing it (http://www.pytables.org). You > > essentially write data to a HDF5 file, which is portable and can be read > > in Matlab or in a C program (using the HDF5 library). 
> > > > Gabriel > > I thought about that. It seems to have much more than I need, so I > wonder if it's got more overhead / less speed / more complex API than > I need. But big isn't necessarily bad, but it might be. Is pytables > overkill? > PyTables is a nice bit of software and is worth getting familiar with if you want portable data. It will solve issues of endianess, annotation, and data organization, which can all be important, especially if your data sits around for a while and you forget exactly what's in it. Both Matlab and IDL support reading HDF5 files. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mabshoff at googlemail.com Tue May 20 11:57:05 2008 From: mabshoff at googlemail.com (Michael Abshoff) Date: Tue, 20 May 2008 17:57:05 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> Message-ID: <4832F4D1.3010308@googlemail.com> St?fan van der Walt wrote: > 2008/5/20 Charles R Harris : > >> On Tue, May 20, 2008 at 9:48 AM, Jarrod Millman >> wrote: >> >>> On Tue, May 20, 2008 at 8:37 AM, Charles R Harris >>> wrote: >>> >>>> Two of the buildbots are showing problems, probably installation >>>> related, >>>> but it would be nice to see all green before the release. >>>> >>> Absolutely. Thanks for catching this. >>> >> It would be good if we could find a PPC to add to the buildbot in order to >> catch endianess problems. >> SPARC might also do for this. >> > > Absolutely! If anybody has access to such a machine and is willing to > let a build-slave run on it, please let me know and I'll send you the > necessary configuration files. > > Hi Stefan, I got access to Solaris 9/Sparc and Solaris 10/Sparc and am certainly willing to help out. > Regards > St?fan > Cheers, Michael > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Tue May 20 12:53:52 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 10:53:52 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> Message-ID: On Tue, May 20, 2008 at 10:30 AM, St?fan van der Walt wrote: > 2008/5/20 Charles R Harris : > > > > > > On Tue, May 20, 2008 at 9:48 AM, Jarrod Millman > > wrote: > >> > >> On Tue, May 20, 2008 at 8:37 AM, Charles R Harris > >> wrote: > >> > Two of the buildbots are showing problems, probably installation > >> > related, > >> > but it would be nice to see all green before the release. > >> > >> Absolutely. Thanks for catching this. > > > > It would be good if we could find a PPC to add to the buildbot in order > to > > catch endianess problems. > > SPARC might also do for this. > > Absolutely! If anybody has access to such a machine and is willing to > let a build-slave run on it, please let me know and I'll send you the > necessary configuration files. > The current Mac machine seems to have a configuration problem. If anyone out there has an SGI machine we could use that too. I heart the buildbot. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdh2358 at gmail.com Tue May 20 12:57:03 2008 From: jdh2358 at gmail.com (John Hunter) Date: Tue, 20 May 2008 11:57:03 -0500 Subject: [Numpy-discussion] numpy.save bug on solaris x86 w/ nans and objects Message-ID: <88e473830805200957y4fd14c35ue52d8c9bd775a5ed@mail.gmail.com> I have a record array w/ dates (O4) and floats. If some of these floats are NaN, np.save crashes (on my solaris platform but not on a linux machine I tested on). Here is the code that produces the bug: In [1]: pwd Out[1]: '/home/titan/johnh/python/svn/matplotlib/matplotlib/examples/data' In [2]: import matplotlib.mlab as mlab In [3]: import numpy as np In [4]: r = mlab.csv2rec('aapl.csv') In [5]: r.dtype Out[5]: dtype([('date', '|O4'), ('open', '", line 1, in ? File "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/io.py", line 158, in save format.write_array(fid, arr) File "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/format.py", line 272, in write_array cPickle.dump(array, fp, protocol=2) SystemError: frexp() result out of range In [9]: np.__version__ Out[9]: '1.2.0.dev5136' In [10]: !uname -a SunOS flag 5.10 Generic_118855-15 i86pc i386 i86pc From cburns at berkeley.edu Tue May 20 12:57:37 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Tue, 20 May 2008 09:57:37 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> Message-ID: <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> Reminder to please test the installer. We already discovered a couple endian bugs on PPC, which is good, but we'd like to verify the release candidate on several more machines before the 1.1.0 tag on Thursday. It only takes a few minutes and you get the added bonus of having a current install of numpy. :) Thank you, Chris On Tue, May 20, 2008 at 7:27 AM, Robert Pyle wrote: > Hi all, > > On May 19, 2008, at 3:39 PM, Christopher Burns wrote: > >> I've built a Mac binary for the 1.1 release candidate. Mac users, >> please test it from: >> >> https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg >> >> This is for the MacPython installed from python.org. > > > From System Profiler --- > Hardware Overview: > > Model Name: MacBook Pro > Model Identifier: MacBookPro3,1 > Processor Name: Intel Core 2 Duo > > Running 10.5.2 > > Uneventful installation, tests as follows --- > > >>> np.test() > Numpy is installed in /Library/Frameworks/Python.framework/Versions/ > 2.5/lib/python2.5/site-packages/numpy > Numpy version 1.1.0rc1 > Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 > (Apple Computer, Inc. build 5363)] > > --- skipping details --- > > Ran 1004 tests in 1.939s > > OK > > > ----- > > Thanks to all for the hard work. > > Bob Pyle > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From beckers at orn.mpg.de Tue May 20 13:06:29 2008 From: beckers at orn.mpg.de (Gabriel J.L. 
Beckers) Date: Tue, 20 May 2008 19:06:29 +0200 Subject: [Numpy-discussion] data exchange format In-Reply-To: <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> References: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> <1211293591.12803.7.camel@gabriel-desktop> <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> Message-ID: <1211303189.6744.36.camel@gabriel-desktop> I am not exactly an expert on data storage, but I use PyTables a lot for all kinds of scientific data sets and am very happy with it. Indeed it has many advanced capabilities; so it may seem overkill at first glance. But ?for simple tasks such as the one you describe the api is simple; indeed I also use it for small data sets because it is such a quick way of storing data in a portable way. Regarding speed and overhead: I don't know in general what the penalties or gains are for very small files. On my system an empty file is 1032 bytes, and if I fill it with an array of 3 by 30000 random float64's it is 723080. Not so bad. Just try it out yourself: >>> import numpy, tables >>> ta = numpy.random.random((3,30000)) >>> f = tables.openFile('test.h5','w') >>> f.createArray('/','testarray',ta) >>> f.close() ?With most real data file size can be smaller because you have the option of enabling compression. But I must admit that I haven't tried reading HDF5 in Matlab or C (and never will); I know it is possible, but I don't know how difficult it is. Cheers, Gabriel On Tue, 2008-05-20 at 12:11 -0400, Gary Pajer wrote: > On Tue, May 20, 2008 at 10:26 AM, Gabriel J.L. Beckers > wrote: > > PyTables is an efficient way of doing it (http://www.pytables.org). You > > essentially write data to a HDF5 file, which is portable and can be read > > in Matlab or in a C program (using the HDF5 library). > > > > Gabriel > > I thought about that. It seems to have much more than I need, so I > wonder if it's got more overhead / less speed / more complex API than > I need. But big isn't necessarily bad, but it might be. Is pytables > overkill? > > > > > > On Tue, 2008-05-20 at 09:32 -0400, Gary Pajer wrote: > >> I want to store data in a way that can be read by a C or Matlab program. > >> > >> Not too much data, not too complicated: a dozen or so floats, a few > >> integers, a few strings, and a (3, x) numpy array where typically 500 > >> < x < 30000. > >> > >> I was about to create my own format for storage when it occurred to me > >> that I might want to use XML or some other standard format. Like > >> JSON, perhaps. Can anyone comment, esp relating to numpy > >> implementation issues, or offer suggestions? 
> >> > >> Thanks, > >> Gary > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Tue May 20 13:08:54 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 19:08:54 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> Message-ID: <9457e7c80805201008p6ea77ad0jf688b61c42eae88b@mail.gmail.com> 2008/5/20 Charles R Harris : > The current Mac machine seems to have a configuration problem. If anyone out > there has an SGI machine we could use that too. I heart the buildbot. I mailed Barry, he'll reset it for us. I'm glad the buildbot makes you happy, Charles :) Regards St?fan From charlesr.harris at gmail.com Tue May 20 13:13:02 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 11:13:02 -0600 Subject: [Numpy-discussion] numpy.save bug on solaris x86 w/ nans and objects In-Reply-To: <88e473830805200957y4fd14c35ue52d8c9bd775a5ed@mail.gmail.com> References: <88e473830805200957y4fd14c35ue52d8c9bd775a5ed@mail.gmail.com> Message-ID: On Tue, May 20, 2008 at 10:57 AM, John Hunter wrote: > I have a record array w/ dates (O4) and floats. If some of these > floats are NaN, np.save crashes (on my solaris platform but not on a > linux machine I tested on). Here is the code that produces the bug: > > In [1]: pwd > Out[1]: '/home/titan/johnh/python/svn/matplotlib/matplotlib/examples/data' > > In [2]: import matplotlib.mlab as mlab > > In [3]: import numpy as np > > In [4]: r = mlab.csv2rec('aapl.csv') > > In [5]: r.dtype > Out[5]: dtype([('date', '|O4'), ('open', ' ('low', ' ' > In [6]: r.close[100:] = np.nan > > In [7]: r.close > Out[7]: array([ 124.63, 127.46, 129.4 , ..., NaN, NaN, NaN]) > > In [8]: np.save('mydata.npy', r) > ------------------------------------------------------------ > Traceback (most recent call last): > File "", line 1, in ? > File "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/io.py", > line 158, in save > format.write_array(fid, arr) > File > "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/lib/format.py", > line 272, in write_array > cPickle.dump(array, fp, protocol=2) > SystemError: frexp() result out of range > > > In [9]: np.__version__ > Out[9]: '1.2.0.dev5136' > > In [10]: !uname -a > SunOS flag 5.10 Generic_118855-15 i86pc i386 i86pc Looks like we need to add a test for this before release. But I'm off to work. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hetland at tamu.edu Tue May 20 13:18:31 2008 From: hetland at tamu.edu (Rob Hetland) Date: Tue, 20 May 2008 19:18:31 +0200 Subject: [Numpy-discussion] data exchange format In-Reply-To: <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> References: <88fe22a0805200632w3278a257r85d1c095d84bf7f9@mail.gmail.com> <1211293591.12803.7.camel@gabriel-desktop> <88fe22a0805200911r67b52586l730876126703b5ff@mail.gmail.com> Message-ID: <8380FFC1-1F79-4BFA-B198-FBF4CFC11F1E@tamu.edu> On May 20, 2008, at 6:11 PM, Gary Pajer wrote: > I thought about that. It seems to have much more than I need, so I > wonder if it's got more overhead / less speed / more complex API than > I need. But big isn't necessarily bad, but it might be. Is pytables > overkill? I use netCDF (which uses the HDF5 libraries netCDF4). NetCDF is good for large, gridded datasets, where the grid does not change in time. For my datasets (numerical ocean models), this format is perfect. HDF is more general and flexible, but a bit more complex. Take a look at the netcdf4-python.googlecode.com package, if you are interested. -Rob From hetland at tamu.edu Tue May 20 13:19:54 2008 From: hetland at tamu.edu (Rob Hetland) Date: Tue, 20 May 2008 19:19:54 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> Message-ID: <8690E1C2-45F0-4F12-9779-1F9031B645C9@tamu.edu> I would like to help, but it's not clear to me exactly how to do that from the wiki. What are the steps? -Rob From jdh2358 at gmail.com Tue May 20 13:25:14 2008 From: jdh2358 at gmail.com (John Hunter) Date: Tue, 20 May 2008 12:25:14 -0500 Subject: [Numpy-discussion] numpy.save bug on solaris x86 w/ nans and objects In-Reply-To: References: <88e473830805200957y4fd14c35ue52d8c9bd775a5ed@mail.gmail.com> Message-ID: <88e473830805201025w58400697ya9b3bec866532474@mail.gmail.com> On Tue, May 20, 2008 at 12:13 PM, Charles R Harris wrote: > Looks like we need to add a test for this before release. But I'm off to > work. Here's a simpler example in case you want to wrap it in a test harness: import datetime import numpy as np r = np.rec.fromarrays([ [datetime.date(2007,1,1), datetime.date(2007,1,2), datetime.date(2007,1,2)], [.1, .2, np.nan], ], names='date,value') np.save('mytest.npy', r) From stefan at sun.ac.za Tue May 20 13:30:55 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 19:30:55 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <8690E1C2-45F0-4F12-9779-1F9031B645C9@tamu.edu> References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> <8690E1C2-45F0-4F12-9779-1F9031B645C9@tamu.edu> Message-ID: <9457e7c80805201030u68768afeie7e19b2b3c34401f@mail.gmail.com> Hi Rob Which of the instructions are not clear? We'd like to make this as accessible as possible. In order to start editing, you need to complete step 5, which is to register on the wiki and send us your UserName. 
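A rough sketch of the netCDF route suggested above for the data exchange question, using the netcdf4-python package; the file name, variable names and metadata below are invented, and the exact API of the 2008-era package may differ slightly:

import numpy as np
from netCDF4 import Dataset

data = np.random.random((3, 500))            # stand-in for the (3, x) array
nc = Dataset('exchange.nc', 'w')             # hypothetical output file
nc.createDimension('comp', 3)
nc.createDimension('n', data.shape[1])
var = nc.createVariable('samples', 'f8', ('comp', 'n'))
var[:] = data                                # write the array
nc.title = 'scalars and strings can be stored as attributes'
nc.close()

Reading the file back from C or Matlab then goes through the standard netCDF/HDF5 libraries mentioned elsewhere in this thread.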
Regards St?fan 2008/5/20 Rob Hetland : > > I would like to help, but it's not clear to me exactly how to do that > from the wiki. What are the steps? > > -Rob From Chris.Barker at noaa.gov Tue May 20 13:40:26 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 20 May 2008 10:40:26 -0700 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> Message-ID: <48330D0A.7020605@noaa.gov> Charles R Harris wrote: > Absolutely! If anybody has access to such a machine and is willing to > let a build-slave run on it, please let me know and I'll send you the > necessary configuration files. > > The current Mac machine seems to have a configuration problem. I've got a PPC Mac that I may be able to use -- any particular ports need to be open -- we've got pretty draconian firewall rules. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue May 20 13:42:17 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 20 May 2008 10:42:17 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> Message-ID: <48330D79.3050701@noaa.gov> Christopher Burns wrote: > Reminder to please test the installer. Dual G5 PPC mac, OS-X 10.4.11 python2.5 from python.org > We already discovered a couple endian bugs on PPC, Got those same bugs, otherwise not issues. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Tue May 20 13:46:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 May 2008 12:46:48 -0500 Subject: [Numpy-discussion] building issue in windows In-Reply-To: References: Message-ID: <3d375d730805201046j4edbabbbh33755f268db512c0@mail.gmail.com> On Tue, May 20, 2008 at 8:33 AM, Igor Sylvester wrote: > Hi. > I have mingw and Visual Studio installed on my computer. I am following the > building instructions posted in [1]. I explicitly tell setup.py to use > mingw by passing the argument --compiler=mingw32. However, setuptools is > using Visual Studio anyways. has anyone encountered this problem? Strictly speaking, the --compiler flag goes on the config and build_ext commands. There is some logic that accepts it on the build command, too, and transfers it over to the build_ext command, but it is possibly buggy. I would make a setup.cfg file (place it next to the setup.py) with the following contents: [config] compiler=mingw32 [build_ext] compiler=mingw32 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From thrabe at burnham.org Tue May 20 13:47:20 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Tue, 20 May 2008 10:47:20 -0700 Subject: [Numpy-discussion] dimension aligment Message-ID: Hi all, just a simple question regarding the alignment of dimensions: given a 3d array a = numpy.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]],[[13,14,15],[16,17,18]],[[19,20,21],[22,23,24]]]) a.shape returns (4,2,3) so I assume the first digit is the 3rd dimension, second is 2nd dim and third is the first. how is the data aligned in memory now? according to the strides it should be 1,2,3,4,5,6,7,8,9,10,... right? if I had an array of more dimensions, the first digit returned by shape should always be the highest dim. feel free to confirm / correct my assumptions best thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From wright at esrf.fr Tue May 20 13:51:22 2008 From: wright at esrf.fr (Jonathan Wright) Date: Tue, 20 May 2008 19:51:22 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: Message-ID: <48330F9A.6010804@esrf.fr> Joe Harrington wrote: > NUMPY/SCIPY DOCUMENTATION MARATHON 2008 > On the wiki it says: "Writers should be fluent in English" In case someone is working on the dynamic docstring magic, is this a good moment to mention "internationalisation" and "world domination" in the same sentence? -Jon From charlesr.harris at gmail.com Tue May 20 13:56:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 11:56:23 -0600 Subject: [Numpy-discussion] dimension aligment In-Reply-To: References: Message-ID: On Tue, May 20, 2008 at 11:47 AM, Thomas Hrabe wrote: > > Hi all, > > just a simple question regarding the alignment of dimensions: > > > given a 3d array > a = > numpy.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]],[[13,14,15],[16,17,18]],[[19,20,21],[22,23,24]]]) > a.shape > returns (4,2,3) > > so I assume the first digit is the 3rd dimension, second is 2nd dim and > third is the first. > Only if you count from the right. I would call the first digit the first dimension. > > how is the data aligned in memory now? > according to the strides it should be > 1,2,3,4,5,6,7,8,9,10,... > right? > Like a C array, contiguous and the rightmost dimension varies fastest. > > if I had an array of more dimensions, the first digit returned by shape > should always be the highest dim. > Yes. Athough first is less ambiguous than highest Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From thrabe at burnham.org Tue May 20 13:58:08 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Tue, 20 May 2008 10:58:08 -0700 Subject: [Numpy-discussion] dimension aligment References: Message-ID: I am wandering what shape would be like in C would it be {4;2;3} or {3;2;4} ? so shape[0] == 4 or shape[0] == 3 -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Charles R Harris Sent: Tue 5/20/2008 10:56 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] dimension aligment On Tue, May 20, 2008 at 11:47 AM, Thomas Hrabe wrote: > > Hi all, > > just a simple question regarding the alignment of dimensions: > > > given a 3d array > a = > numpy.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]],[[13,14,15],[16,17,18]],[[19,20,21],[22,23,24]]]) > a.shape > returns (4,2,3) > > so I assume the first digit is the 3rd dimension, second is 2nd dim and > third is the first. > Only if you count from the right. 
I would call the first digit the first dimension. > > how is the data aligned in memory now? > according to the strides it should be > 1,2,3,4,5,6,7,8,9,10,... > right? > Like a C array, contiguous and the rightmost dimension varies fastest. > > if I had an array of more dimensions, the first digit returned by shape > should always be the highest dim. > Yes. Athough first is less ambiguous than highest Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tgrav at mac.com Tue May 20 14:04:41 2008 From: tgrav at mac.com (Tommy Grav) Date: Tue, 20 May 2008 14:04:41 -0400 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> Message-ID: Powerbook G4 with 10.5.2 and Activestate Python 2.5.1.1, no problems beyond the two endian test failures Cheers Tommy On May 20, 2008, at 12:57 PM, Christopher Burns wrote: > Reminder to please test the installer. We already discovered a couple > endian bugs on PPC, which is good, but we'd like to verify the release > candidate on several more machines before the 1.1.0 tag on Thursday. > > It only takes a few minutes and you get the added bonus of having a > current install of numpy. :) > > Thank you, > Chris From peridot.faceted at gmail.com Tue May 20 14:04:46 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 20 May 2008 20:04:46 +0200 Subject: [Numpy-discussion] dimension aligment In-Reply-To: References: Message-ID: 2008/5/20 Thomas Hrabe : > given a 3d array > a = > numpy.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]],[[13,14,15],[16,17,18]],[[19,20,21],[22,23,24]]]) > a.shape > returns (4,2,3) > > so I assume the first digit is the 3rd dimension, second is 2nd dim and > third is the first. > > how is the data aligned in memory now? > according to the strides it should be > 1,2,3,4,5,6,7,8,9,10,... > right? > > if I had an array of more dimensions, the first digit returned by shape > should always be the highest dim. You are basically right, but this is a surprisingly subtle issue for numpy. A numpy array is basically a block of memory and some description. One piece of that description is the type of data it contains (i.e., how to interpret each chunk of memory) for example int32, float64, etc. Another is the sizes of all the various dimensions. A third piece, which makes many of the things numpy does possible, is the "strides". The way numpy works is that basically it translates A[i,j,k] into a lookup of the item in the memory block at position i*strides[0]+j*strides[1]+k*strides[2] This means, if you have an array A and you want every second element (A[::2]), all numpy needs to do is hand you back a new array pointing to the same data block, but with strides[0] doubled. Similarly if you want to transpose a two-dimensional array, all it needs to do is exchange strides[0] and strides[1]; no data need be moved. This means, though, that if you are handed a numpy array, the elements can be arranged in memory in quite a complicated fashion. Sometimes this is no problem - you can always use the strides to find it all. But sometimes you need the data arranged in a particular way. numpy defines two particular ways: "C contiguous" and "FORTRAN contiguous". 
"C contiguous" arrays are what you describe, and they're what numpy produces by default; they are arranged so that the rightmost index has the smallest stride. "FORTRAN contiguous" arrays are arranged the other way around; the leftmost index has the smallest stride. (This is how FORTRAN arrays are arranged in memory.) There is also a special case: the reshape() function changes the shape of the array. It has an "order" argument that describes not how the elements are arranged in memory but how you want to think of the elements as arranged in memory for the reshape operation. Anne From hetland at tamu.edu Tue May 20 14:05:45 2008 From: hetland at tamu.edu (Rob Hetland) Date: Tue, 20 May 2008 20:05:45 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805201030u68768afeie7e19b2b3c34401f@mail.gmail.com> References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> <8690E1C2-45F0-4F12-9779-1F9031B645C9@tamu.edu> <9457e7c80805201030u68768afeie7e19b2b3c34401f@mail.gmail.com> Message-ID: On May 20, 2008, at 7:30 PM, St?fan van der Walt wrote: > ...and send us your UserName. This is the part I skipped over... I registered, and wondered why everything was not editable. -Rob ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From hetland at tamu.edu Tue May 20 14:06:30 2008 From: hetland at tamu.edu (Rob Hetland) Date: Tue, 20 May 2008 20:06:30 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805201030u68768afeie7e19b2b3c34401f@mail.gmail.com> References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> <8690E1C2-45F0-4F12-9779-1F9031B645C9@tamu.edu> <9457e7c80805201030u68768afeie7e19b2b3c34401f@mail.gmail.com> Message-ID: On May 20, 2008, at 7:30 PM, St?fan van der Walt wrote: > and send us your UserName. Oh, and my username is RobHetland -Rob ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From cburns at berkeley.edu Tue May 20 14:17:55 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Tue, 20 May 2008 11:17:55 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <48330D79.3050701@noaa.gov> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> <48330D79.3050701@noaa.gov> Message-ID: <764e38540805201117g39c088c6s9802dab221991f15@mail.gmail.com> Great! I'm glad to see we have several PPC's in testing also. On Tue, May 20, 2008 at 10:42 AM, Christopher Barker wrote: > Christopher Burns wrote: >> Reminder to please test the installer. > > Dual G5 PPC mac, OS-X 10.4.11 python2.5 from python.org > >> We already discovered a couple endian bugs on PPC, > > Got those same bugs, otherwise not issues. 
> > -Chris > From cburns at berkeley.edu Tue May 20 14:20:34 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Tue, 20 May 2008 11:20:34 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> Message-ID: <764e38540805201120h61e07010u20a24794abbb77a9@mail.gmail.com> Hey Tommy, Does ActiveState install python in the same location as python.org? cburns@~ 10:35:05 $ which python /Library/Frameworks/Python.framework/Versions/Current/bin/python On Tue, May 20, 2008 at 11:04 AM, Tommy Grav wrote: > Powerbook G4 with 10.5.2 and Activestate Python 2.5.1.1, no problems > beyond the two endian test failures > > Cheers > Tommy > From tgrav at mac.com Tue May 20 14:28:58 2008 From: tgrav at mac.com (Tommy Grav) Date: Tue, 20 May 2008 14:28:58 -0400 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805201120h61e07010u20a24794abbb77a9@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> <764e38540805201120h61e07010u20a24794abbb77a9@mail.gmail.com> Message-ID: Yes it does put python in that location as it should ;o) Cheers Tommy On May 20, 2008, at 2:20 PM, Christopher Burns wrote: > Hey Tommy, > > Does ActiveState install python in the same location as python.org? > > cburns@~ 10:35:05 $ which python > /Library/Frameworks/Python.framework/Versions/Current/bin/python > > > On Tue, May 20, 2008 at 11:04 AM, Tommy Grav wrote: >> Powerbook G4 with 10.5.2 and Activestate Python 2.5.1.1, no problems >> beyond the two endian test failures >> >> Cheers >> Tommy >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From cburns at berkeley.edu Tue May 20 14:34:14 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Tue, 20 May 2008 11:34:14 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <56D3A638-445E-449B-B20E-4946749BE87D@post.harvard.edu> <764e38540805200957s74dd6afauff04a658a636bf7b@mail.gmail.com> <764e38540805201120h61e07010u20a24794abbb77a9@mail.gmail.com> Message-ID: <764e38540805201134i6fc76d14m473c43082fcde8ec@mail.gmail.com> Good to know. Thanks! On Tue, May 20, 2008 at 11:28 AM, Tommy Grav wrote: > Yes it does put python in that location as it should ;o) > > Cheers > Tommy > > On May 20, 2008, at 2:20 PM, Christopher Burns wrote: > >> Hey Tommy, >> >> Does ActiveState install python in the same location as python.org? 
>> >> cburns@~ 10:35:05 $ which python >> /Library/Frameworks/Python.framework/Versions/Current/bin/python >> >> >> On Tue, May 20, 2008 at 11:04 AM, Tommy Grav wrote: >>> Powerbook G4 with 10.5.2 and Activestate Python 2.5.1.1, no problems >>> beyond the two endian test failures >>> >>> Cheers >>> Tommy >>> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From bsouthey at gmail.com Tue May 20 14:57:12 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 20 May 2008 13:57:12 -0500 Subject: [Numpy-discussion] 1.1.0rc1, Win32 Installer: please test it In-Reply-To: <48323438.8080501@ar.media.kyoto-u.ac.jp> References: <48323438.8080501@ar.media.kyoto-u.ac.jp> Message-ID: Hi, No installation or test errors on my AMD Athlon XP 2100 running Win XP and Python 2.5 Bruce On Mon, May 19, 2008 at 9:15 PM, David Cournapeau wrote: > Hi, > > Sorry for the delay, but it is now ready: numpy "superpack" > installers for numpy 1.1.0rc1: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.4.exe > > (Python 2.4 binaries are not there yet). This binary should work on any > (32 bits) CPU on windows, and in particular should solve the recurring > problem of segfault/hangs on older CPU with previous binary releases. > > I used a fairly heavy compression scheme (lzma), because it cut the size > ~ 30 %. If it is a problem, please let me know, > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From thrabe at burnham.org Tue May 20 15:01:00 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Tue, 20 May 2008 12:01:00 -0700 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentationFault References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> <3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> Message-ID: After all, I figured how to create an numpy in C with the help below. If called in C, import_array() but actually _import_array() successfully creates all the instances needed for the array. However, once I call this function from another environment such as Matlab, PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray"); in __import_array() returns NULL, because numpy.core.multiarray is not found? Do you think it might depend on the path settings? As I said, the code works fine from plain C but its odd from within Matlab. Best, Thomas -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Robert Kern Sent: Wed 5/14/2008 5:12 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentationFault On Wed, May 14, 2008 at 6:40 PM, Thomas Hrabe wrote: >>I didn't know a person could write a stand-alone program using NumPy >>this way (can you?) 
> > Well, this is possible when you embed python and use the "simple" objects such as ints, strings, .... > Why should it be impossible to do it for numpy then? numpy exposes its API as a pointer to an array which contains function pointers. import_array() imports the extension module, accesses the PyCObject that contains this pointer, and sets a global pointer appropriately. There are #defines macros to emulate the functions by dereferencing the appropriate element of the array and calling it with the given macro arguments. The reason you get the error about returning nothing when the return type of main() is declared int is because this macro is only intended to work inside of an initmodule() function of an extension module, whose return type is void. import_array() includes error handling logic and will return if there is an error. You get the segfault without import_array() because all of the functions you try to call are trying to dereference an array which has not been initialized. > My plan is to send multidimensional arrays from C to python and to apply some python specific functions to them. Well, first you need to call Py_Initialize() to start the VM. Otherwise, you can't import numpy to begin with. I guess you could write a "void load_numpy(void)" function which just exists to call import_array(). Just be sure to check the exception state appropriately after it returns. But for the most part, it's much better to drive your C code using Python than the other around. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue May 20 15:03:22 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 20 May 2008 22:03:22 +0300 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: <48302C30.2090502@shrogers.com> <1211117327.8207.37.camel@localhost.localdomain> <4832C0B3.8040503@shrogers.com> <9457e7c80805200530s55351c8eyb4f1e06b98f2e271@mail.gmail.com> <9457e7c80805200625x38cfaf70nef78cc3be6893e43@mail.gmail.com> <8690E1C2-45F0-4F12-9779-1F9031B645C9@tamu.edu> <9457e7c80805201030u68768afeie7e19b2b3c34401f@mail.gmail.com> Message-ID: <1211310202.6076.1.camel@localhost.localdomain> ti, 2008-05-20 kello 20:06 +0200, Rob Hetland kirjoitti: > On May 20, 2008, at 7:30 PM, St?fan van der Walt wrote: > > > and send us your UserName. > > > Oh, and my username is RobHetland You're in now. Regards, Pauli From charlesr.harris at gmail.com Tue May 20 15:10:25 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 13:10:25 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <48330D0A.7020605@noaa.gov> References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> Message-ID: Can we close ticket 605, Incorrect Behavior of Numpy Histogram? http://projects.scipy.org/scipy/numpy/ticket/605 Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From millman at berkeley.edu Tue May 20 15:16:09 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 12:16:09 -0700 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <60773.129.240.222.155.1211280476.squirrel@cens.ioc.ee> <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> Message-ID: On Tue, May 20, 2008 at 12:10 PM, Charles R Harris wrote: > Can we close ticket 605, Incorrect Behavior of Numpy Histogram? > http://projects.scipy.org/scipy/numpy/ticket/605 Yes, but we need to create a new ticket for 1.2 detailing the planned changes. If no one else gets to it, I will do so tonight. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From charlesr.harris at gmail.com Tue May 20 15:19:06 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 13:19:06 -0600 Subject: [Numpy-discussion] Squeak, squeak. Trac mailing list still broken. Message-ID: Robert, The dead mailer is a PITA and becoming a major bottleneck to bugfixing/development. So I am putting on my Romex suit and wading into the fires of your irritation to raise the issue once more. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue May 20 15:23:24 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 May 2008 14:23:24 -0500 Subject: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentationFault In-Reply-To: References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> <3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> Message-ID: <3d375d730805201223i48b83fbdy9736bfea04353093@mail.gmail.com> On Tue, May 20, 2008 at 2:01 PM, Thomas Hrabe wrote: > After all, I figured how to create an numpy in C with the help below. > > If called in C, import_array() but actually _import_array() successfully > creates all the instances needed for the array. > However, once I call this function from another environment such as Matlab, > PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray"); > in __import_array() returns NULL, because numpy.core.multiarray is not > found? Something like that. Call PyErr_Print() do display the full traceback so you can find out what the actual problem is. > Do you think it might depend on the path settings? You've called Py_Initialize() before you do anything else with Py* functions, right? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From hoytak at gmail.com Tue May 20 15:24:26 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Tue, 20 May 2008 12:24:26 -0700 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805192026h5dbbb210hf06d5e3cc5fc04b4@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> <33644d3c0805192026h5dbbb210hf06d5e3cc5fc04b4@mail.gmail.com> Message-ID: <4db580fd0805201224p6517b620of1575f169f7b98d6@mail.gmail.com> > The time has now been shaved down to ~9 seconds with this suggestion > from the original 13-14s, with the inclusing of Eric Firing's > suggestions. 
This is without scipy.weave, which at the moment I can't > get to work for all lines, and when I just replace one of them > sucessfully it seems to run more slowly, I assume because it is > converting data back and forth. There's a nontrivial constant time cost to using scipy.weave.blitz, but it isn't copying the data. Thus it will slow you down on smaller arrays, but you'll notice quite a speed-up on much larger ones. I should have mentioned that earlier -- I assumed your arrays were really large. > Are there any major pitfalls to be aware of? It sounds like if I do: > f = a[n,:] I get a reference, but if I did something like g = a[n,:]*2 > I would get a copy. Well, if you do f = a[n, :], you would get a view, another object that shares the data in memory with a but is a separate object. > If anyone has any clues on why scipy.weave is blowing > (http://pastebin.com/m79699c04) using the code I attached, I wouldn't > mind knowing. Most of the times I've attempted using weave, I've been > a little baffled about why things aren't working. I also don't have a > sense for whether all numpy functions should be "weavable" or if it's > just general array operations that can be put through weave. Sorry I didn't get back to you earlier on this -- I was a bit busy yesterday. It looks like weave.blitz isn't working on your second line because you're not explicitly putting slices in some of the dimensions, In numpy v[0:2] works for 1,2,3,4,.... dimensions, but for a 2d array in blitz you have to use v[0:2,:], 3d v[0:2,:,:]. It's a bit more picky. I think that's the problem with your second line -- try replacing v[:] with v[0,:] and theta[1-curidx] with theta[1-curidx, :]. (I may have missed some others.) weave.blitz is currently limited to just array operations... it doesn't really support the numpy functions. Hope that helps a little.... -- Hoyt -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From charlesr.harris at gmail.com Tue May 20 15:26:04 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 13:26:04 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> Message-ID: On Tue, May 20, 2008 at 1:16 PM, Jarrod Millman wrote: > On Tue, May 20, 2008 at 12:10 PM, Charles R Harris > wrote: > > Can we close ticket 605, Incorrect Behavior of Numpy Histogram? > > http://projects.scipy.org/scipy/numpy/ticket/605 > > Yes, but we need to create a new ticket for 1.2 detailing the planned > changes. If no one else gets to it, I will do so tonight. > How about ticket 770? Did Roberts endianess fixes cover that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue May 20 15:33:01 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 May 2008 14:33:01 -0500 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> Message-ID: <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> On Tue, May 20, 2008 at 2:26 PM, Charles R Harris wrote: > How about ticket 770? Did Roberts endianess fixes cover that? Yes. 
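As a concrete illustration of the weave.blitz slicing rule Hoyt describes above, here is a minimal sketch (the array names, shapes and the dt constant are made up for the example, and scipy.weave must be installed):

import numpy as np
from scipy import weave

n = 1000
theta = np.zeros((2, n))
v = np.zeros((4, n))
curidx = 0
dt = 0.01

# plain numpy accepts an implicit trailing slice
theta[1 - curidx] = theta[curidx] + dt * v[0]

# weave.blitz wants every dimension sliced out explicitly
weave.blitz("theta[1 - curidx, :] = theta[curidx, :] + dt * v[0, :]")

The first weave.blitz call compiles the expression, so the speed-up only shows up on repeated calls with reasonably large arrays.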
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mattknox.ca at gmail.com Tue May 20 15:31:20 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Tue, 20 May 2008 19:31:20 +0000 (UTC) Subject: [Numpy-discussion] 1.1.0rc1, Win32 Installer: please test it References: <48323438.8080501@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau ar.media.kyoto-u.ac.jp> writes: > > Hi, > > Sorry for the delay, but it is now ready: numpy "superpack" > installers for numpy 1.1.0rc1: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.4.exe > > (Python 2.4 binaries are not there yet). This binary should work on any > (32 bits) CPU on windows, and in particular should solve the recurring > problem of segfault/hangs on older CPU with previous binary releases. > > I used a fairly heavy compression scheme (lzma), because it cut the size > ~ 30 %. If it is a problem, please let me know, > > cheers, > > David > installed fine and all tests ran successfully on my machine. Machine specs: - Vista 32 bit - Python 2.5 - Intel Core 2 duo 2ghz From thrabe at burnham.org Tue May 20 16:36:25 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Tue, 20 May 2008 13:36:25 -0700 Subject: [Numpy-discussion] embeddedPyArray_FromDimsAndDataSegmentationFault References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com><3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> <3d375d730805201223i48b83fbdy9736bfea04353093@mail.gmail.com> Message-ID: Thats what PyErr_Print() prints. Python is initialised, for sure Traceback (most recent call last): File "/usr/global/python32/lib/python2.4/site-packages/numpy/__init__.py", lin e 34, in ? import testing File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/__init__. py", line 3, in ? from numpytest import * File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/numpytest .py", line 8, in ? import unittest File "/usr/global/python32/lib/python2.4/unittest.py", line 51, in ? import time ImportError: /usr/global/python32/lib/python2.4/lib-dynload/time.so: undefined s ymbol: PyExc_IOErro why does it work in C and not in C started within Matlab? -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Robert Kern Sent: Tue 5/20/2008 12:23 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] embeddedPyArray_FromDimsAndDataSegmentationFault On Tue, May 20, 2008 at 2:01 PM, Thomas Hrabe wrote: > After all, I figured how to create an numpy in C with the help below. > > If called in C, import_array() but actually _import_array() successfully > creates all the instances needed for the array. > However, once I call this function from another environment such as Matlab, > PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray"); > in __import_array() returns NULL, because numpy.core.multiarray is not > found? Something like that. Call PyErr_Print() do display the full traceback so you can find out what the actual problem is. > Do you think it might depend on the path settings? You've called Py_Initialize() before you do anything else with Py* functions, right? 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 4176 bytes Desc: not available URL: From robert.kern at gmail.com Tue May 20 16:52:20 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 May 2008 15:52:20 -0500 Subject: [Numpy-discussion] embeddedPyArray_FromDimsAndDataSegmentationFault In-Reply-To: References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> <3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> <3d375d730805201223i48b83fbdy9736bfea04353093@mail.gmail.com> Message-ID: <3d375d730805201352g7119d7cbgd1173545f77d5f3@mail.gmail.com> On Tue, May 20, 2008 at 3:36 PM, Thomas Hrabe wrote: > Thats what PyErr_Print() prints. > Python is initialised, for sure > > Traceback (most recent call last): > File "/usr/global/python32/lib/python2.4/site-packages/numpy/__init__.py", lin > e 34, in ? > import testing > File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/__init__. > py", line 3, in ? > from numpytest import * > File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/numpytest > .py", line 8, in ? > import unittest > File "/usr/global/python32/lib/python2.4/unittest.py", line 51, in ? > import time > ImportError: /usr/global/python32/lib/python2.4/lib-dynload/time.so: undefined s > ymbol: PyExc_IOErro > > why does it work in C and not in C started within Matlab? It depends on how you linked everything. Presumably, you linked in libpython24 for the program but not the MEX. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Tue May 20 17:05:42 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 20 May 2008 23:05:42 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <48330F9A.6010804@esrf.fr> References: <48330F9A.6010804@esrf.fr> Message-ID: <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> 2008/5/20 Jonathan Wright : > Joe Harrington wrote: >> NUMPY/SCIPY DOCUMENTATION MARATHON 2008 >> > On the wiki it says: "Writers should be fluent in English" > > In case someone is working on the dynamic docstring magic, is this a > good moment to mention "internationalisation" and "world domination" in > the same sentence? I think we'll stick to English for now (I don't think I have the motivation to do an Afrikaans translation!). As for internationali(s/z)ation, we'll see who writes the most docstrings. 
In a fortuitous twist of events, I find myself able to read American as well :) Cheers St?fan From pav at iki.fi Tue May 20 17:10:45 2008 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 21 May 2008 00:10:45 +0300 Subject: [Numpy-discussion] (SPAM) Re: embeddedPyArray_FromDimsAndDataSegmentationFault In-Reply-To: References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com> <3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com> <3d375d730805201223i48b83fbdy9736bfea04353093@mail.gmail.com> Message-ID: <1211317845.6076.5.camel@localhost.localdomain> ti, 2008-05-20 kello 13:36 -0700, Thomas Hrabe kirjoitti: > Thats what PyErr_Print() prints. > Python is initialised, for sure > > Traceback (most recent call last): > File "/usr/global/python32/lib/python2.4/site-packages/numpy/__init__.py", lin > e 34, in ? > import testing > File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/__init__. > py", line 3, in ? > from numpytest import * > File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/numpytest > py", line 8, in ? > import unittest > File "/usr/global/python32/lib/python2.4/unittest.py", line 51, in ? > import time > ImportError: /usr/global/python32/lib/python2.4/lib-dynload/time.so: undefined s > ymbol: PyExc_IOErro > > why does it work in C and not in C started within Matlab? It's probably that it has something to do with the way Matlab loads its extension libraries: http://www.iki.fi/pav/software/pythoncall/index.html#compilation-important -- Pauli Virtanen From pwang at enthought.com Tue May 20 17:20:34 2008 From: pwang at enthought.com (Peter Wang) Date: Tue, 20 May 2008 16:20:34 -0500 Subject: [Numpy-discussion] Squeak, squeak. Trac mailing list still broken. In-Reply-To: References: Message-ID: <1732F311-0243-441A-98FC-B5588EFB6829@enthought.com> Hey Chuck, I was able to create a ticket just now and received a notification email about it. Can you try modifying or updating a ticket and see if you get notifications? -Peter On May 20, 2008, at 2:19 PM, Charles R Harris wrote: > Robert, > > The dead mailer is a PITA and becoming a major bottleneck to > bugfixing/development. > So I am putting on my Romex suit and wading into the fires of your > irritation to raise the > issue once more. > > Chuck > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Tue May 20 17:43:30 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 15:43:30 -0600 Subject: [Numpy-discussion] Squeak, squeak. Trac mailing list still broken. In-Reply-To: <1732F311-0243-441A-98FC-B5588EFB6829@enthought.com> References: <1732F311-0243-441A-98FC-B5588EFB6829@enthought.com> Message-ID: On Tue, May 20, 2008 at 3:20 PM, Peter Wang wrote: > Hey Chuck, I was able to create a ticket just now and received a > notification email about it. Can you try modifying or updating a > ticket and see if you get notifications? > > -Peter > No joy. The Scipy/Numpy ticket mail archives also end on May 3. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thrabe at burnham.org Tue May 20 18:23:54 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Tue, 20 May 2008 15:23:54 -0700 Subject: [Numpy-discussion] (SPAM) Re:embeddedPyArray_FromDimsAndDataSegmentationFault References: <9457e7c80805141627x28001d27gf01d1c923ef871ea@mail.gmail.com><3d375d730805141712l5a7bc32ap615432fad2a2e9bb@mail.gmail.com><3d375d730805201223i48b83fbdy9736bfea04353093@mail.gmail.com> <1211317845.6076.5.camel@localhost.localdomain> Message-ID: Hi Pauli my makefile compiles with the following options /usr/global/matlab/bin/mex -cxx -g -I./include/ -I/home/global/python32/include/python2.4 -I/home/global/python32/lib/python2.4/site-packages/numpy/core/include/ -DLIBPYTHON=\"/usr/global/python32/lib/libpython2.4.so\" .... -lpython2.4 With this options compilation is processed but many symbols remain unsolved undefined symbol: PyExc_RuntimeError (./bin/mp.mexglx) .... When I use -lpython2.4 the linker woun't find the lib /usr/bin/ld: cannot find -lpython2.4 But I bet you have managed to solve this problem. By the way, what Matlab version did you use? Best, Thomas -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Pauli Virtanen Sent: Tue 5/20/2008 2:10 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] (SPAM) Re:embeddedPyArray_FromDimsAndDataSegmentationFault ti, 2008-05-20 kello 13:36 -0700, Thomas Hrabe kirjoitti: > Thats what PyErr_Print() prints. > Python is initialised, for sure > > Traceback (most recent call last): > File "/usr/global/python32/lib/python2.4/site-packages/numpy/__init__.py", lin > e 34, in ? > import testing > File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/__init__. > py", line 3, in ? > from numpytest import * > File "/usr/global/python32/lib/python2.4/site-packages/numpy/testing/numpytest > py", line 8, in ? > import unittest > File "/usr/global/python32/lib/python2.4/unittest.py", line 51, in ? > import time > ImportError: /usr/global/python32/lib/python2.4/lib-dynload/time.so: undefined s > ymbol: PyExc_IOErro > > why does it work in C and not in C started within Matlab? It's probably that it has something to do with the way Matlab loads its extension libraries: http://www.iki.fi/pav/software/pythoncall/index.html#compilation-important -- Pauli Virtanen _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From wright at esrf.fr Tue May 20 18:55:54 2008 From: wright at esrf.fr (Jonathan Wright) Date: Wed, 21 May 2008 00:55:54 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> Message-ID: <483356FA.5000502@esrf.fr> St?fan van der Walt wrote: > As for internationali(s/z)ation, we'll see who writes the most > docstrings. Indeed. There are some notes on the OLPC wiki at http://wiki.laptop.org/go/Python_i18n It seems to be just a question of adding at the top of add_newdocs.py from gettext import gettext as _ ... and putting the docstrings in a _() function call, although perhaps I miss something important, like a performance hit? This would catch everything in add_newdocs at least. It seems like a relatively minor change if you are overhauling anyway? 
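A minimal sketch of that idea; everything below is illustrative only (there is no numpy message catalog, and add_newdoc here is a stand-in for the helper that add_newdocs.py actually uses):

from gettext import gettext as _

docdict = {}

def add_newdoc(name, doc):
    # stand-in for the real helper: the docstring passes through gettext,
    # so a translation catalog could substitute a localized version
    docdict[name] = _(doc)

add_newdoc('zeros',
    "Return a new array of the given shape and type, filled with zeros.")

def mean(a, axis=None):
    # wrapping an ordinary docstring in _() stops the compiler from
    # treating it as the docstring, so it has to be attached explicitly
    pass
mean.__doc__ = _("Compute the arithmetic mean along the given axis.")

With no catalog installed, _() simply returns its argument, so the overhead is one catalog lookup per wrapped string at import time.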
Jon From robert.kern at gmail.com Tue May 20 19:04:22 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 May 2008 18:04:22 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <483356FA.5000502@esrf.fr> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> Message-ID: <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> On Tue, May 20, 2008 at 5:55 PM, Jonathan Wright wrote: > St?fan van der Walt wrote: > > As for internationali(s/z)ation, we'll see who writes the most > > docstrings. > > Indeed. There are some notes on the OLPC wiki at > > http://wiki.laptop.org/go/Python_i18n > > It seems to be just a question of adding at the top of add_newdocs.py > > from gettext import gettext as _ > > ... and putting the docstrings in a _() function call, although perhaps > I miss something important, like a performance hit? Possibly a significant one. This could affect startup times, which I am hesitant to make worse. > This would catch > everything in add_newdocs at least. It seems like a relatively minor > change if you are overhauling anyway? add_newdocs() could do that, but the usual function docstrings can't. The rule is that if the first statement in a function is a literal string, then the compiler will assign it func.__doc__. Expressions are just treated as expressions in the function body and have no affect on func.__doc__. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Tue May 20 20:51:04 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 17:51:04 -0700 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it Message-ID: On Mon, May 19, 2008 at 7:15 PM, David Cournapeau wrote: > Sorry for the delay, but it is now ready: numpy "superpack" > installers for numpy 1.1.0rc1: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.4.exe > > (Python 2.4 binaries are not there yet). This binary should work on any > (32 bits) CPU on windows, and in particular should solve the recurring > problem of segfault/hangs on older CPU with previous binary releases. Hello, Please test the Windows binaries. So far I have only seen two testers. I can't tag the release until I know that our binary installers work on a wide variety of Windows machines. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Tue May 20 20:52:12 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 20 May 2008 17:52:12 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> Message-ID: On Mon, May 19, 2008 at 12:39 PM, Christopher Burns wrote: > I've built a Mac binary for the 1.1 release candidate. Mac users, > please test it from: > > https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg > > This is for the MacPython installed from python.org. Hello, Please test the Mac binaries. 
I can't tag the release until I know that our binary installers work on a wide variety of Mac machines. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From jbsnyder at gmail.com Tue May 20 21:01:37 2008 From: jbsnyder at gmail.com (James Snyder) Date: Tue, 20 May 2008 20:01:37 -0500 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <4db580fd0805201224p6517b620of1575f169f7b98d6@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> <33644d3c0805192026h5dbbb210hf06d5e3cc5fc04b4@mail.gmail.com> <4db580fd0805201224p6517b620of1575f169f7b98d6@mail.gmail.com> Message-ID: <33644d3c0805201801t47091d2frdbb5f10e720a3d51@mail.gmail.com> > > Well, if you do f = a[n, :], you would get a view, another object that > shares the data in memory with a but is a separate object. OK, so it is a new object, with the properties of the slice it references, but if I write anything to it, it will consistently go back to the same spot in the original array. In general, if I work on that, and don't do something that allocates a new set of memory locations for that, it will reference the same memory location. If I do: a = np.zeros((20,30)) b = a[2,:] b += 1 # will add 1 to the original slice b.resize will fail... b = np.zeros((1,30)) # allocates new memory and disconnects the view The appropriate way to zero out the original memory locations would be to do something like b *= 0? Is there any way to force a writeback to the original view so long as the dimensions of what is being assigned to b is the same as the original? Or, is there a way to, say, enable a warking if I'm dropping a view? > Sorry I didn't get back to you earlier on this -- I was a bit busy > yesterday. It looks like weave.blitz isn't working on your second > line because you're not explicitly putting slices in some of the > dimensions, In numpy v[0:2] works for 1,2,3,4,.... dimensions, but > for a 2d array in blitz you have to use v[0:2,:], 3d v[0:2,:,:]. It's > a bit more picky. I think that's the problem with your second line > -- try replacing v[:] with v[0,:] and theta[1-curidx] with > theta[1-curidx, :]. (I may have missed some others.) OK, that seems to do it. I still actually get better performance (subsequent runs after compilation) with the straight numpy code. Strangely, I'm also getting that the flip/flop method is running a bit slower than having the separate prev_ variables. aff_input is rather large (~2000x14000), but the state vectors are only 14000 (or that x2 w/ flipflopping for some), each. Is there slowdown maybe because it is doing those 3 lines of blitz operations then doing a bunch of python numpy? Either way, It seems like I've got pretty good performance as well as a handle on using weave.blitz in the future. 
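A compact sketch of the view behaviour being discussed, using the same shapes as the example above:

import numpy as np

a = np.zeros((20, 30))
b = a[2, :]        # a view onto row 2 of a; no data is copied
b += 1             # writes through the view, so a[2] is now all ones

b = np.zeros(30)   # this only rebinds the name b; a is untouched

row = a[2, :]
row[:] = 0         # assigning into the slice writes back into a
# row.fill(0) and row.flat = 0 do the same job (see the follow-up below)

There is no warning when a name stops referring to a view; rebinding is an ordinary Python assignment and numpy never sees it.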
-jsnyder -- James Snyder Biomedical Engineering Northwestern University jbsnyder at gmail.com PGP: http://fanplastic.org/key.txt From efiring at hawaii.edu Tue May 20 21:16:07 2008 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 20 May 2008 15:16:07 -1000 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805201801t47091d2frdbb5f10e720a3d51@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> <33644d3c0805192026h5dbbb210hf06d5e3cc5fc04b4@mail.gmail.com> <4db580fd0805201224p6517b620of1575f169f7b98d6@mail.gmail.com> <33644d3c0805201801t47091d2frdbb5f10e720a3d51@mail.gmail.com> Message-ID: <483377D7.4000003@hawaii.edu> James Snyder wrote: >> Well, if you do f = a[n, :], you would get a view, another object that >> shares the data in memory with a but is a separate object. > > OK, so it is a new object, with the properties of the slice it > references, but if I write anything to it, it will consistently go > back to the same spot in the original array. > > In general, if I work on that, and don't do something that allocates a > new set of memory locations for that, it will reference the same > memory location. > > If I do: > a = np.zeros((20,30)) > b = a[2,:] > > b += 1 # will add 1 to the original slice > > b.resize will fail... > > b = np.zeros((1,30)) # allocates new memory and disconnects the view > > The appropriate way to zero out the original memory locations would be > to do something like b *= 0? or, from fastest to slowest: b.fill(0) b.flat = 0 b[:] = 0 Eric From david at ar.media.kyoto-u.ac.jp Tue May 20 21:12:05 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 May 2008 10:12:05 +0900 Subject: [Numpy-discussion] question on NumPy NaN In-Reply-To: References: <4832F730.7070701@rug.nl> Message-ID: <483376E5.6010804@ar.media.kyoto-u.ac.jp> Keith Goodman wrote: > Or > > np.nansum(a) / np.isfinite(a).sum() > > A nanmean would be nice to have in numpy. > nanmean, nanstd and nanmedian are available in scipy, though. cheers, David From david at ar.media.kyoto-u.ac.jp Tue May 20 22:14:03 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 May 2008 11:14:03 +0900 Subject: [Numpy-discussion] Shouldn't numpy.core modules avoid calling numpy.lib ? Message-ID: <4833856B.9010307@ar.media.kyoto-u.ac.jp> Hi, I noticed that some functions in numpy.core call functions in numpy.lib. Shouldn't this be avoided as much as possible, to avoid potential circular import, dependencies, etc... ? cheers, David From aisaac at american.edu Tue May 20 22:39:42 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 20 May 2008 22:39:42 -0400 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it In-Reply-To: References: Message-ID: On Tue, 20 May 2008, Jarrod Millman apparently wrote: > Please test the Windows binaries. What tests would you like? :: >>> np.test(level=1) Numpy is installed in C:\Python25\lib\site-packages\numpy Numpy version 1.1.0rc1 Python version 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.13 ... OK This is on an old machine (Pentium 3 (no SSE 2)) running Win 2000. 
Cheers, Alan From charlesr.harris at gmail.com Tue May 20 22:36:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 20:36:12 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> Message-ID: Curious bug on Stefan's Ubuntu build client: ImportError: /usr/lib/atlas/libblas.so.3gf: undefined symbol: _gfortran_st_write_done make[1]: *** [test] Error 1 Anyone know what that is about? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 20 22:42:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 20:42:17 -0600 Subject: [Numpy-discussion] Shouldn't numpy.core modules avoid calling numpy.lib ? In-Reply-To: <4833856B.9010307@ar.media.kyoto-u.ac.jp> References: <4833856B.9010307@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, May 20, 2008 at 8:14 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Hi, > > I noticed that some functions in numpy.core call functions in > numpy.lib. Shouldn't this be avoided as much as possible, to avoid > potential circular import, dependencies, etc... ? > Probably not. But numpy/lib looks like a basement closet to me, anyway. What functions are getting called? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mabshoff at googlemail.com Tue May 20 21:57:31 2008 From: mabshoff at googlemail.com (Michael Abshoff) Date: Wed, 21 May 2008 03:57:31 +0200 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> Message-ID: <4833818B.4090802@googlemail.com> Charles R Harris wrote: Hi Chuck, > Curious bug on Stefan's Ubuntu build client: > > ImportError: /usr/lib/atlas/libblas.so.3gf: undefined symbol: _gfortran_st_write_done > make[1]: *** [test] Error 1 > Anyone know what that is about? > You need to link -lgfortran since the blas you use was compiled with gfortran. > Chuck > Cheers, Michael > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From david at ar.media.kyoto-u.ac.jp Tue May 20 22:32:26 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 May 2008 11:32:26 +0900 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> Message-ID: <483389BA.3020307@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > Curious bug on Stefan's Ubuntu build client: > > ImportError: /usr/lib/atlas/libblas.so.3gf: undefined symbol: _gfortran_st_write_done > make[1]: *** [test] Error 1 Unfortunately not curious and very well known: I bet the client is configured to build with g77; libblas.so.3gf uses the gfortran ABI (which is incompatible with the g77 ABI). Any library with 3gf is the gfortran ABI. 
Two possible solutions: - forget about gfortran, and install atlas3* (g77 ABI) instead of libatlas* (gfortran ABI) - use gfortran in the build The first one is safer: I cannot find any concrete information on the Ubuntu side of things, but I believe g77 is still the default ABI. (Debian) Lenny uses gfortran: http://wiki.debian.org/GfortranTransition David From david at ar.media.kyoto-u.ac.jp Tue May 20 22:34:38 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 May 2008 11:34:38 +0900 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it In-Reply-To: References: Message-ID: <48338A3E.30307@ar.media.kyoto-u.ac.jp> Alan G Isaac wrote: > What tests would you like? :: > > >>> np.test(level=1) > Numpy is installed in C:\Python25\lib\site-packages\numpy > Numpy version 1.1.0rc1 > Python version 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.13 > ... > OK > > > This is on an old machine (Pentium 3 (no SSE 2)) running Win 2000. > Joy, no more crashing on machines without SSE2 :) That's exactly the point of this 'new' installer. cheers, David From david at ar.media.kyoto-u.ac.jp Tue May 20 22:39:05 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 May 2008 11:39:05 +0900 Subject: [Numpy-discussion] Shouldn't numpy.core modules avoid calling numpy.lib ? In-Reply-To: References: <4833856B.9010307@ar.media.kyoto-u.ac.jp> Message-ID: <48338B49.5000705@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Tue, May 20, 2008 at 8:14 PM, David Cournapeau > > > wrote: > > Hi, > > I noticed that some functions in numpy.core call functions in > numpy.lib. Shouldn't this be avoided as much as possible, to avoid > potential circular import, dependencies, etc... ? > > > Probably not. Probably not avoided, or should probably not be called ? > But numpy/lib looks like a basement closet to me, anyway. What > functions are getting called? I can see at least one: numpy.lib.issubtype in core/defmatrix.py, called once. I am trying to see why importing numpy is slow, and those circular import make the thing difficult to understand (or maybe I am just too new to dtrace to understand how to use it effectively here). cheers, David From charlesr.harris at gmail.com Tue May 20 22:58:36 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 20:58:36 -0600 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: <483389BA.3020307@ar.media.kyoto-u.ac.jp> References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> <483389BA.3020307@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, May 20, 2008 at 8:32 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > Curious bug on Stefan's Ubuntu build client: > > > > ImportError: /usr/lib/atlas/libblas.so.3gf: undefined symbol: > _gfortran_st_write_done > > make[1]: *** [test] Error 1 > > Unfortunately not curious and very well known: I bet the client is > configured to build with g77; libblas.so.3gf uses the gfortran ABI > (which is incompatible with the g77 ABI). Any library with 3gf is the > gfortran ABI. > > Two possible solutions: > - forget about gfortran, and install atlas3* (g77 ABI) instead of > libatlas* (gfortran ABI) > - use gfortran in the build > > The first one is safer: I cannot find any concrete information on the > Ubuntu side of things, but I believe g77 is still the default ABI. 
> (Debian) Lenny uses gfortran: > > http://wiki.debian.org/GfortranTransition > Ah... Is this going to be a problem with Debian installs? I just ran Stefan's buildbot too see if it was working yet. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 20 23:02:59 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 May 2008 21:02:59 -0600 Subject: [Numpy-discussion] Shouldn't numpy.core modules avoid calling numpy.lib ? In-Reply-To: <48338B49.5000705@ar.media.kyoto-u.ac.jp> References: <4833856B.9010307@ar.media.kyoto-u.ac.jp> <48338B49.5000705@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, May 20, 2008 at 8:39 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > > On Tue, May 20, 2008 at 8:14 PM, David Cournapeau > > > > > wrote: > > > > Hi, > > > > I noticed that some functions in numpy.core call functions in > > numpy.lib. Shouldn't this be avoided as much as possible, to avoid > > potential circular import, dependencies, etc... ? > > > > > > Probably not. > > Probably not avoided, or should probably not be called ? > Just that it looks cleaner to me if stuff in numpy/core doesn't call special libraries. I don't know that it is a problem. Maybe some utility functions should be moved into core? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue May 20 22:51:35 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 May 2008 11:51:35 +0900 Subject: [Numpy-discussion] Branching 1.1.x and starting 1.2.x development In-Reply-To: References: <9457e7c80805200930m367a90c0y4a11daca674d2ce@mail.gmail.com> <48330D0A.7020605@noaa.gov> <3d375d730805201233p3773beb7ked79295e81f36c3f@mail.gmail.com> <483389BA.3020307@ar.media.kyoto-u.ac.jp> Message-ID: <48338E37.8070205@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > Ah... Is this going to be a problem with Debian installs? No, I meant that Lenny has a clear rule for transition, whereas I could not find such information for Ubuntu. Since the transition happened during the Hardy developement time-frame, I don't know what Ubuntu did. Maybe I am missing something, but I found the distribution of two ABI at the same time extremely annoying in Ubuntu (typically, debian has a package liblapack_pic, very useful to build a shared atlas by yourself, but contrary to all other fortran libraries - renamed for the gfortran transition - this one got gfortran ABI without name change). cheers, David From kwgoodman at gmail.com Tue May 20 23:31:43 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 20 May 2008 20:31:43 -0700 Subject: [Numpy-discussion] question on NumPy NaN In-Reply-To: <483376E5.6010804@ar.media.kyoto-u.ac.jp> References: <4832F730.7070701@rug.nl> <483376E5.6010804@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, May 20, 2008 at 6:12 PM, David Cournapeau wrote: > Keith Goodman wrote: >> Or >> >> np.nansum(a) / np.isfinite(a).sum() >> >> A nanmean would be nice to have in numpy. >> > > nanmean, nanstd and nanmedian are available in scipy, though. Thanks for pointing that out. Studying nanmedian, which is twice as fast as my for-loop implementation, taught me about compress and apply_along_axis. 
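For reference, a small sketch of the NaN-aware pieces mentioned in this thread (assuming the scipy.stats location for the scipy versions):

import numpy as np
from scipy import stats

a = np.array([1.0, 2.0, np.nan, 4.0])

# hand-rolled mean that ignores NaNs, as suggested earlier in the thread
m = np.nansum(a) / np.isfinite(a).sum()

# ready-made versions in scipy
print stats.nanmean(a), stats.nanstd(a), stats.nanmedian(a)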
>> import numpy.matlib as mp >> from numpy.matlib import where >> timeit x[0, where(x.A > 0.5)[1]] 10000 loops, best of 3: 60.8 ?s per loop >> timeit x.compress(x.A.ravel() > 0.5) 10000 loops, best of 3: 44.5 ?s per loop Am I missing something obvious or is 'sort' unnecessary in _nanmedian? Perhaps it is left over from a time when _nanmedian did not call median. def _nanmedian(arr1d): # This only works on 1d arrays """Private function for rank a arrays. Compute the median ignoring Nan. :Parameters: arr1d : rank 1 ndarray input array :Results: m : float the median.""" cond = 1-np.isnan(arr1d) x = np.sort(np.compress(cond,arr1d,axis=-1)) if x.size == 0: return np.nan return median(x) From josef.pktd at gmail.com Wed May 21 01:07:01 2008 From: josef.pktd at gmail.com (joep) Date: Tue, 20 May 2008 22:07:01 -0700 (PDT) Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it In-Reply-To: <48338A3E.30307@ar.media.kyoto-u.ac.jp> References: <48338A3E.30307@ar.media.kyoto-u.ac.jp> Message-ID: <24836592-12b4-4f5c-8b9a-373aaaaa4dc8@34g2000hsh.googlegroups.com> Testing the windows installers On an 8 years old Windows 2000 with Intel processor (which ?) no sse, installed numpy-1/1/0rc1-nosse.exe * with Python 2.4.3 (no ctypes) - numpy.test(): 1001 test OK, no errors or failures - numpy.test(): 1271 tests, errors=12, failures=1 no crash, some previous numpy crashed python * with Python 2.5.1 - numpy.test(): 1004 test OK, no errors or failures - numpy.test(): 1277 tests, errors=12, failures=1 On Windows XP with Intel Pentium M on notebook with SSE2, installed numpy-1.1.0rc1-sse2.exe {{{ >>> import numpy >>> numpy.test() Numpy is installed in C:\Programs\Python25\lib\site-packages\numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Int el)] ... Ran 1004 tests in 1.469s OK }}} running all tests: {{{ Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test(all=True) Numpy is installed in C:\Programs\Python25\lib\site-packages\numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Int el)] ... Ran 1277 tests in 5.890s FAILED (failures=1, errors=12) }}} if I run numpy.test() or numpy.test(level=1) again, it does not go back to the original smaller number (1004) of tests, instead it still picks up some of the tests with errors {{{ >>> numpy.test(level=1) Numpy is installed in C:\Programs\Python25\lib\site-packages\numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Int el)] .... 
Ran 1021 tests in 1.500s FAILED (errors=10) }}} most errors are similar to (as in an open ticket) File "C:\Programs\Python25\Lib\site-packages\numpy\ma\tests \test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array So, installer works well on these two windows computers Josef From josef.pktd at gmail.com Wed May 21 01:11:04 2008 From: josef.pktd at gmail.com (joep) Date: Tue, 20 May 2008 22:11:04 -0700 (PDT) Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it In-Reply-To: <24836592-12b4-4f5c-8b9a-373aaaaa4dc8@34g2000hsh.googlegroups.com> References: <48338A3E.30307@ar.media.kyoto-u.ac.jp> <24836592-12b4-4f5c-8b9a-373aaaaa4dc8@34g2000hsh.googlegroups.com> Message-ID: <4489925d-28fa-4698-b816-a453af683ad2@27g2000hsf.googlegroups.com> joep wrote: > Testing the windows installers > > On an 8 years old Windows 2000 with Intel processor (which ?) no sse, > installed numpy-1/1/0rc1-nosse.exe > * with Python 2.4.3 (no ctypes) > - numpy.test(): 1001 test OK, no errors or failures typo: missed with option all=True: - numpy.test(all=True): 1271 tests, errors=12, failures=1 > no crash, some previous numpy crashed python > * with Python 2.5.1 > - numpy.test(): 1004 test OK, no errors or failures typo: missed with option all=True > - numpy.test(all=True): 1277 tests, errors=12, failures=1 > From pav at iki.fi Wed May 21 02:27:16 2008 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 21 May 2008 09:27:16 +0300 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> Message-ID: <1211351236.15406.138.camel@localhost> ti, 2008-05-20 kello 18:04 -0500, Robert Kern kirjoitti: > On Tue, May 20, 2008 at 5:55 PM, Jonathan Wright wrote: > > St?fan van der Walt wrote: > > > As for internationali(s/z)ation, we'll see who writes the most > > > docstrings. > > > > Indeed. There are some notes on the OLPC wiki at > > > > http://wiki.laptop.org/go/Python_i18n > > > > It seems to be just a question of adding at the top of add_newdocs.py > > > > from gettext import gettext as _ > > > > ... and putting the docstrings in a _() function call, although perhaps > > I miss something important, like a performance hit? > > Possibly a significant one. This could affect startup times, which I > am hesitant to make worse. > > > This would catch > > everything in add_newdocs at least. It seems like a relatively minor > > change if you are overhauling anyway? > > add_newdocs() could do that, but the usual function docstrings can't. > The rule is that if the first statement in a function is a literal > string, then the compiler will assign it func.__doc__. Expressions are > just treated as expressions in the function body and have no affect on > func.__doc__. I think it would be quite straightforward to write a function that crawled over the numpy namespace and dynamically replaced __doc__ with gettextized versions. The user could call this function to switch the language and the reference manual authors could call it to produce localized versions of the manual. Moreover, we already have tools to extract English docstrings from numpy and producing .pot files for gettext could be done. 
I think i18n of numpy (or any other Python module) is technically not as far out as it initially seems! ?(This is assuming that there are no objects there that don't allow changing their __doc__.) But I believe that in practice, we really had better to concentrate on improving the documentation in English before thinking about spending much effort on i18n or l10n. -- Pauli Virtanen From robert.kern at gmail.com Wed May 21 02:33:23 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 01:33:23 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <1211351236.15406.138.camel@localhost> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> Message-ID: <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> On Wed, May 21, 2008 at 1:27 AM, Pauli Virtanen wrote: > I think it would be quite straightforward to write a function that > crawled over the numpy namespace and dynamically replaced __doc__ with > gettextized versions. The user could call this function to switch the > language and the reference manual authors could call it to produce > localized versions of the manual. Moreover, we already have tools to > extract English docstrings from numpy and producing .pot files for > gettext could be done. I think i18n of numpy (or any other Python > module) is technically not as far out as it initially seems! Yes, that sounds (technically) feasible. > ?(This is assuming that there are no objects there that don't allow > changing their __doc__.) > > But I believe that in practice, we really had better to concentrate on > improving the documentation in English before thinking about spending > much effort on i18n or l10n. Yup. Sounds about right. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From schut at sarvision.nl Wed May 21 02:48:25 2008 From: schut at sarvision.nl (Vincent Schut) Date: Wed, 21 May 2008 08:48:25 +0200 Subject: [Numpy-discussion] first recarray steps In-Reply-To: <4832F77A.8020804@noaa.gov> References: <4832F77A.8020804@noaa.gov> Message-ID: Christopher Barker wrote: > > Vincent Schut wrote: >> Lets say I have a rgb image of arbitrary size, as a normal ndarray >> (that's what my image reading lib gives me). Thus shape is >> (3,ysize,xsize), dtype = int8. How would I convert/view this as a >> recarray of shape (ysize, xsize) with the first dimension split up into >> 'r', 'g', 'b' fields? No need for 'x' and 'y' fields. > > Take a look in this list for a thread entitled "recarray fun" about a > month ago -- you'll find some more discussion of approaches. Well, actually that thread was my inspiration to take a closer look into recarrays... > > Also, if you image data is rgb, usually, that's a (width, height, 3) > array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) > array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs > may give you that, I'm not sure. My data is. In fact, this is a simplification of my situation; I'm processing satellite data, which usually has more (and other) bands than just rgb. But the data is definitely in shape (bands, y, x). > > Also, you probably want a uint8 dtype, giving you 0-255 for each byte. Same story. 
In fact, in this case it's int16, but can actually be any data type, even floats, even complex. But thanks for the thoughts :-) > > -Chris > > > From robert.kern at gmail.com Wed May 21 02:57:08 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 01:57:08 -0500 Subject: [Numpy-discussion] first recarray steps In-Reply-To: References: <4832F77A.8020804@noaa.gov> Message-ID: <3d375d730805202357jb26b33dme121b463d6b42286@mail.gmail.com> On Wed, May 21, 2008 at 1:48 AM, Vincent Schut wrote: > Christopher Barker wrote: >> Also, if you image data is rgb, usually, that's a (width, height, 3) >> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) >> array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs >> may give you that, I'm not sure. > > My data is. In fact, this is a simplification of my situation; I'm > processing satellite data, which usually has more (and other) bands than > just rgb. But the data is definitely in shape (bands, y, x). I don't think record arrays will help you much, then. Individual records need to be contiguous (bar padding). You can't interleave them. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From schut at sarvision.nl Wed May 21 03:03:10 2008 From: schut at sarvision.nl (Vincent Schut) Date: Wed, 21 May 2008 09:03:10 +0200 Subject: [Numpy-discussion] first recarray steps In-Reply-To: <3d375d730805202357jb26b33dme121b463d6b42286@mail.gmail.com> References: <4832F77A.8020804@noaa.gov> <3d375d730805202357jb26b33dme121b463d6b42286@mail.gmail.com> Message-ID: Robert Kern wrote: > On Wed, May 21, 2008 at 1:48 AM, Vincent Schut wrote: >> Christopher Barker wrote: > >>> Also, if you image data is rgb, usually, that's a (width, height, 3) >>> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) >>> array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs >>> may give you that, I'm not sure. >> My data is. In fact, this is a simplification of my situation; I'm >> processing satellite data, which usually has more (and other) bands than >> just rgb. But the data is definitely in shape (bands, y, x). > > I don't think record arrays will help you much, then. Individual > records need to be contiguous (bar padding). You can't interleave > them. > Hmm, that was just what I was wondering about, when reading Stefan's reply. So in fact, recarrays aren't just another way to view some data, no matter in what shape it is. So his solution: x.T.reshape((-1,x.shape[0])).view(dt).reshape(x.shape[1:]).T won't work, than. Or, at least, won't give me a view on my original dat, but would give me a recarray with a copy of my data. I guess I was misled by this text on the recarray wiki page: "We would like to represent a small colour image. The image is two pixels high and two pixels wide. Each pixel has a red, green and blue colour component, which is represented by a 32-bit floating point number between 0 and 1. Intuitively, we could represent the image as a 3x2x2 array, where the first dimension represents the color, and the last two the pixel positions, i.e. " Note the "3x2x2", which suggested imho that this would work with an image with (bands,y,x) shape, not with (x,y,bands) shape. But I understand that it's not shape, but internal representation in memory (contiguous or not, C/Fortran, etc) that matters? 
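To make the memory-layout point concrete, a small sketch (band count, shape and dtype are arbitrary): a pixel-interleaved copy can be viewed as records, but the band-sequential original cannot be viewed that way without the copy.

import numpy as np

bands, ny, nx = 3, 2, 2
a = np.arange(bands * ny * nx, dtype=np.int16).reshape(bands, ny, nx)  # (bands, y, x)

dt = np.dtype([('r', np.int16), ('g', np.int16), ('b', np.int16)])

# pixel-interleaved (y, x, bands): the fields of one pixel are adjacent in
# memory, so viewing them as records works...
interleaved = np.ascontiguousarray(a.transpose(1, 2, 0))
rec = interleaved.reshape(-1).view(dt).reshape(ny, nx)

# ...but ascontiguousarray had to copy, so rec shares memory with that copy,
# not with the band-sequential array a.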
I know I can change the wiki text, but I'm afraid I still don't feel confident on this matter... From robert.kern at gmail.com Wed May 21 03:23:19 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 02:23:19 -0500 Subject: [Numpy-discussion] first recarray steps In-Reply-To: References: <4832F77A.8020804@noaa.gov> <3d375d730805202357jb26b33dme121b463d6b42286@mail.gmail.com> Message-ID: <3d375d730805210023x17f245d6vb5ba73bb6c8e27c7@mail.gmail.com> On Wed, May 21, 2008 at 2:03 AM, Vincent Schut wrote: > Robert Kern wrote: >> On Wed, May 21, 2008 at 1:48 AM, Vincent Schut wrote: >>> Christopher Barker wrote: >> >>>> Also, if you image data is rgb, usually, that's a (width, height, 3) >>>> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) >>>> array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs >>>> may give you that, I'm not sure. >>> My data is. In fact, this is a simplification of my situation; I'm >>> processing satellite data, which usually has more (and other) bands than >>> just rgb. But the data is definitely in shape (bands, y, x). >> >> I don't think record arrays will help you much, then. Individual >> records need to be contiguous (bar padding). You can't interleave >> them. >> > Hmm, that was just what I was wondering about, when reading Stefan's > reply. So in fact, recarrays aren't just another way to view some data, > no matter in what shape it is. > > So his solution: > x.T.reshape((-1,x.shape[0])).view(dt).reshape(x.shape[1:]).T won't work, > than. Or, at least, won't give me a view on my original dat, but would > give me a recarray with a copy of my data. Right. > I guess I was misled by this text on the recarray wiki page: > > "We would like to represent a small colour image. The image is two > pixels high and two pixels wide. Each pixel has a red, green and blue > colour component, which is represented by a 32-bit floating point number > between 0 and 1. > > Intuitively, we could represent the image as a 3x2x2 array, where the > first dimension represents the color, and the last two the pixel > positions, i.e. " > > Note the "3x2x2", which suggested imho that this would work with an > image with (bands,y,x) shape, not with (x,y,bands) shape. Yes, the tutorial goes on to use record arrays as a view onto an (x,y,bands) array and also make a (bands,x,y) view from that, too. That is, in fact, quite a confusing presentation of the subject. Now, there is a way to use record arrays here; it's a bit ugly but can be quite useful when parsing data formats. Each item in the record can also be an array. So let's pretend we have a (3,nx,ny) RGB array. nbands, nx, ny = a.shape dtype = numpy.dtype([ ('r', a.dtype, [nx, ny]), ('g', a.dtype, [nx, ny]), ('b', a.dtype, [nx, ny]), ]) # The flatten() is necessary to pre-empt numpy from # trying to do too much interpretation of a's shape. rec = a.flatten().view(dtype) print rec['r'] print rec['g'] print rec['b'] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From hetland at tamu.edu Wed May 21 03:27:10 2008 From: hetland at tamu.edu (Rob Hetland) Date: Wed, 21 May 2008 09:27:10 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> Message-ID: Should we add a general discussion section to the Wiki? I would just do this, but it seems like a fundamental enough addition that I though I would suggest it first. The rational is that there are some stylistic questions that are not covered in the example. For instance, I think that the See Also section should have a format like this: See Also -------- function : One line description from function's docstring Longer description here over potentially many lines. Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. next_function : One line description from next_function's docstring Longer description here over potentially many lines. Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. This will parse better (as the line with the semicolon is bold, the next lines are not). Also, would it be possible to put function and next_function in double back-ticks, so that they are referenced, like modules? That way they will might be clickable in a html version of the documentation. -Rob From mexicalex at yahoo.com Wed May 21 03:32:32 2008 From: mexicalex at yahoo.com (Alexandra Geddes) Date: Wed, 21 May 2008 00:32:32 -0700 (PDT) Subject: [Numpy-discussion] Outputting arrays. Message-ID: <256985.68098.qm@web51501.mail.re2.yahoo.com> Hi. 1. Is there a module or other code to write arrays to databases (they want access databases)? 2. How can i write 2D arrays to textfiles with labels on the rows and columns? thanks! alex. From schut at sarvision.nl Wed May 21 03:33:23 2008 From: schut at sarvision.nl (Vincent Schut) Date: Wed, 21 May 2008 09:33:23 +0200 Subject: [Numpy-discussion] first recarray steps In-Reply-To: <3d375d730805210023x17f245d6vb5ba73bb6c8e27c7@mail.gmail.com> References: <4832F77A.8020804@noaa.gov> <3d375d730805202357jb26b33dme121b463d6b42286@mail.gmail.com> <3d375d730805210023x17f245d6vb5ba73bb6c8e27c7@mail.gmail.com> Message-ID: Robert Kern wrote: > On Wed, May 21, 2008 at 2:03 AM, Vincent Schut wrote: >> Robert Kern wrote: >>> On Wed, May 21, 2008 at 1:48 AM, Vincent Schut wrote: >>>> Christopher Barker wrote: >>>>> Also, if you image data is rgb, usually, that's a (width, height, 3) >>>>> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) >>>>> array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs >>>>> may give you that, I'm not sure. >>>> My data is. In fact, this is a simplification of my situation; I'm >>>> processing satellite data, which usually has more (and other) bands than >>>> just rgb. But the data is definitely in shape (bands, y, x). >>> I don't think record arrays will help you much, then. 
Individual >>> records need to be contiguous (bar padding). You can't interleave >>> them. >>> >> Hmm, that was just what I was wondering about, when reading Stefan's >> reply. So in fact, recarrays aren't just another way to view some data, >> no matter in what shape it is. >> >> So his solution: >> x.T.reshape((-1,x.shape[0])).view(dt).reshape(x.shape[1:]).T won't work, >> than. Or, at least, won't give me a view on my original dat, but would >> give me a recarray with a copy of my data. > > Right. > >> I guess I was misled by this text on the recarray wiki page: >> >> "We would like to represent a small colour image. The image is two >> pixels high and two pixels wide. Each pixel has a red, green and blue >> colour component, which is represented by a 32-bit floating point number >> between 0 and 1. >> >> Intuitively, we could represent the image as a 3x2x2 array, where the >> first dimension represents the color, and the last two the pixel >> positions, i.e. " >> >> Note the "3x2x2", which suggested imho that this would work with an >> image with (bands,y,x) shape, not with (x,y,bands) shape. > > Yes, the tutorial goes on to use record arrays as a view onto an > (x,y,bands) array and also make a (bands,x,y) view from that, too. > That is, in fact, quite a confusing presentation of the subject. > > Now, there is a way to use record arrays here; it's a bit ugly but can > be quite useful when parsing data formats. Each item in the record can > also be an array. So let's pretend we have a (3,nx,ny) RGB array. > > nbands, nx, ny = a.shape > dtype = numpy.dtype([ > ('r', a.dtype, [nx, ny]), > ('g', a.dtype, [nx, ny]), > ('b', a.dtype, [nx, ny]), > ]) > > # The flatten() is necessary to pre-empt numpy from > # trying to do too much interpretation of a's shape. > rec = a.flatten().view(dtype) > print rec['r'] > print rec['g'] > print rec['b'] > Ah, now that is clarifying! Thanks a lot. I'll do some experiments to see whether this way of viewing my data is useful to me (in a sense that making may code more readable is already very useful). Cheers, Vincent. From robert.kern at gmail.com Wed May 21 03:40:41 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 02:40:41 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> Message-ID: <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> On Wed, May 21, 2008 at 2:27 AM, Rob Hetland wrote: > > Should we add a general discussion section to the Wiki? Discussion should happen here on the mailing list instead of the wiki. But please, let's not rehash discussions which have already happened (like this one). They simply have no way of coming to a conclusion and cannot improve matters enough to justify the effort expended on the discussion. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From stefan at sun.ac.za Wed May 21 04:08:57 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 21 May 2008 10:08:57 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> Message-ID: <9457e7c80805210108k2257026fk821978c9bc00bf1e@mail.gmail.com> Hi Rob 2008/5/21 Rob Hetland : > See Also > -------- > function : One line description from function's docstring > Longer description here over potentially many lines. Lorem ipsum > dolor sit amet, > consectetur adipisicing elit, sed do eiusmod tempor incididunt ut > labore et dolore > magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation > ullamco laboris nisi > ut aliquip ex ea commodo consequat. > next_function : One line description from next_function's docstring > Longer description here over potentially many lines. Lorem ipsum > dolor sit amet, > consectetur adipisicing elit, sed do eiusmod tempor incididunt ut > labore et dolore > magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation > ullamco laboris nisi > ut aliquip ex ea commodo consequat. > > > This will parse better (as the line with the semicolon is bold, the > next lines are not). Also, would it be possible to put function and > next_function in double back-ticks, so that they are referenced, like > modules? That way they will might be clickable in a html version of > the documentation. When generating the reference guide, I parse all the numpy docstrings and re-generate a document enhanced with Sphinx markup. In this document, functions in the See Also clause are "clickable". I have support for two formats: See Also ------------ function_a, function_b, function_c function_d : relation to current function Don't worry if it doesn't look perfect on the wiki; the reference guide will be rendered correctly. Regards St?fan From hetland at tamu.edu Wed May 21 04:26:27 2008 From: hetland at tamu.edu (Rob Hetland) Date: Wed, 21 May 2008 10:26:27 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> Message-ID: <11D40EB5-263B-4FB3-896A-4FD0F2094F56@tamu.edu> On May 21, 2008, at 9:40 AM, Robert Kern wrote: > But please, let's not rehash discussions which have already happened > (like this one). I didn't mean to suggest rehashing the documentation format. I agree that this has been discussed enough. Rather, sometimes it's not clear to me how to apply the existing standard. 'See Also' was a case where the style guidelines seem sparse. My suggestion, I guess, was more to clarify than to change. -Rob ---- Rob Hetland, Associate Professor Dept. 
of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From gnurser at googlemail.com Wed May 21 04:47:19 2008 From: gnurser at googlemail.com (George Nurser) Date: Wed, 21 May 2008 09:47:19 +0100 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> Message-ID: <1d1e6ea70805210147p4c6e3db4y34d76e253ccdd267@mail.gmail.com> 2008/5/21 Jarrod Millman : > On Mon, May 19, 2008 at 12:39 PM, Christopher Burns wrote: >> I've built a Mac binary for the 1.1 release candidate. Mac users, >> please test it from: >> >> https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg >> >> This is for the MacPython installed from python.org. > > Hello, > > Please test the Mac binaries. I can't tag the release until I know > that our binary installers work on a wide variety of Mac machines. Works for me. Intel, MBP, 10.5.2. >>> import numpy >>> numpy.__version__ '1.1.0rc1' >>> numpy.test(10) ran 1005 tests in 2.290s OK George Nurser. From stefan at sun.ac.za Wed May 21 05:30:21 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 21 May 2008 11:30:21 +0200 Subject: [Numpy-discussion] Ticket #798: `piecewise` exposes raw memory Message-ID: <9457e7c80805210230j4e1a63b8u9479164d7c231c18@mail.gmail.com> Referring to http://scipy.org/scipy/numpy/ticket/798 `piecewise` uses `empty` to allocate output memory. If the conditions do not sufficiently cover the output, then raw memory is returned, e.g., {{{ import numpy as np np.piecewise([0,1,2],[True,False,False],[1]) }}} A patch which addresses the issue is available here for review: http://codereview.appspot.com/1105 Documentation is being updated on the wiki. From ondrej at certik.cz Wed May 21 05:53:15 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 21 May 2008 11:53:15 +0200 Subject: [Numpy-discussion] Ticket #798: `piecewise` exposes raw memory In-Reply-To: <9457e7c80805210230j4e1a63b8u9479164d7c231c18@mail.gmail.com> References: <9457e7c80805210230j4e1a63b8u9479164d7c231c18@mail.gmail.com> Message-ID: <85b5c3130805210253r2795c015v3d689d01305c061a@mail.gmail.com> On Wed, May 21, 2008 at 11:30 AM, St?fan van der Walt wrote: > Referring to > http://scipy.org/scipy/numpy/ticket/798 > > `piecewise` uses `empty` to allocate output memory. If the conditions > do not sufficiently cover the output, then raw memory is returned, > e.g., > > {{{ > import numpy as np > np.piecewise([0,1,2],[True,False,False],[1]) > }}} > > A patch which addresses the issue is available here for review: > > http://codereview.appspot.com/1105 > > Documentation is being updated on the wiki. I'd like to invite everyone to take part in the review. It's fun, it's just talking, no coding. :) Ondrej From aisaac at american.edu Wed May 21 08:29:48 2008 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 21 May 2008 08:29:48 -0400 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it In-Reply-To: <4489925d-28fa-4698-b816-a453af683ad2@27g2000hsf.googlegroups.com> References: <48338A3E.30307@ar.media.kyoto-u.ac.jp> <24836592-12b4-4f5c-8b9a-373aaaaa4dc8@34g2000hsh.googlegroups.com> <4489925d-28fa-4698-b816-a453af683ad2@27g2000hsf.googlegroups.com> Message-ID: On Tue, 20 May 2008, joep apparently wrote: > missed with option all=True: Yes, I also see this with numpy.test(all=True). (Same old machine (no SSE2) running Win 2000.) 
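Going back to ticket #798 above, a short sketch of why the current piecewise behaviour exposes raw memory, using the same inputs as the report. The zeros-based variant at the end is only one way to make uncovered elements defined; whether the patch under review does this or something else is not shown in this thread.

import numpy as np

x = np.array([0, 1, 2])
cond = np.array([True, False, False])

# Simplified view of the current implementation: the output starts as
# empty(), and only elements selected by a condition are ever written,
# so the remaining elements hold whatever bytes happened to be there.
out = np.empty(x.shape, dtype=x.dtype)
out[cond] = 1
print out          # out[1] and out[2] are uninitialised

# Starting from zeros() makes every element defined.
out = np.zeros(x.shape, dtype=x.dtype)
out[cond] = 1
print out          # [1 0 0]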
Alan ====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 104, in assert_equal desired.tolist(), File "C:\Python25\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 185, in test_set_elemen ts assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. 
---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "C:\Python25\Lib\site-packages\numpy\ma\mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 312, in test_fromrecord s assert_equal_records(pa,mpa) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 333, in test_fromrecord s_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_old_ma.py", line 657, in check_testUfuncRe gression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. 
---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_subclassing.py", line 86, in check_data_su bclassing assert_equal(xmsub._data, xsub) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "C:\Python25\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== FAIL: Ticket #652 ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\linalg\tests\test_regression.py", line 32, in test_eig_b uild assert_array_almost_equal(va, rva) File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 255, in assert_array_almost_equa l header='Arrays are not almost equal') File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 46.1538461538%) x: array([ 1.03221168e+02 +0.00000000e+00j, -1.91843603e+01 +0.00000000e+00j, 1.82126812e+01 +0.00000000e+00j,... y: array([ 1.03221168e+02 +0.j , -1.91843603e+01 +0.j , -6.04004526e-01+15.84422474j, -6.04004526e-01-15.84422474j, -1.13692929e+01 +0.j , -6.57612485e-01+10.41755503j,... ---------------------------------------------------------------------- Ran 1277 tests in 16.640s FAILED (failures=1, errors=12) >>> From mattknox.ca at gmail.com Wed May 21 08:36:22 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Wed, 21 May 2008 12:36:22 +0000 (UTC) Subject: [Numpy-discussion] 1.1.0rc1, Win32 Installer: please test it (test errors) References: <48323438.8080501@ar.media.kyoto-u.ac.jp> Message-ID: > installed fine and all tests ran successfully on my machine. I spoke too soon. I didn't know about the "all" parameter in numpy.test and just ran if with the default before. When I specify all=True, I get 12 errors. Most of which seem to be related to a problem with calling the "tolist" method. See output below... 
>>> numpy.test(all=True) Numpy is installed in C:\Python25\lib\site-packages\numpy Numpy version 1.1.0rc1 Python version 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] Found 18/18 tests for numpy.core.tests.test_defmatrix Found 3/3 tests for numpy.core.tests.test_errstate Found 3/3 tests for numpy.core.tests.test_memmap Found 283/283 tests for numpy.core.tests.test_multiarray Found 70/70 tests for numpy.core.tests.test_numeric Found 36/36 tests for numpy.core.tests.test_numerictypes Found 12/12 tests for numpy.core.tests.test_records Found 140/140 tests for numpy.core.tests.test_regression Found 7/7 tests for numpy.core.tests.test_scalarmath Found 2/2 tests for numpy.core.tests.test_ufunc Found 16/16 tests for numpy.core.tests.test_umath Found 63/63 tests for numpy.core.tests.test_unicode Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu Found 5/5 tests for numpy.distutils.tests.test_misc_util Found 2/2 tests for numpy.fft.tests.test_fftpack Found 3/3 tests for numpy.fft.tests.test_helper Found 10/10 tests for numpy.lib.tests.test_arraysetops Found 1/1 tests for numpy.lib.tests.test_financial Found 53/53 tests for numpy.lib.tests.test_function_base Found 5/5 tests for numpy.lib.tests.test_getlimits Found 6/6 tests for numpy.lib.tests.test_index_tricks Found 15/15 tests for numpy.lib.tests.test_io Found 1/1 tests for numpy.lib.tests.test_machar Found 4/4 tests for numpy.lib.tests.test_polynomial Found 1/1 tests for numpy.lib.tests.test_regression Found 49/49 tests for numpy.lib.tests.test_shape_base Found 15/15 tests for numpy.lib.tests.test_twodim_base Found 43/43 tests for numpy.lib.tests.test_type_check Found 1/1 tests for numpy.lib.tests.test_ufunclike Found 24/24 tests for numpy.lib.tests.test__datasource Found 89/89 tests for numpy.linalg.tests.test_linalg Found 3/3 tests for numpy.linalg.tests.test_regression Found 94/94 tests for numpy.ma.tests.test_core Found 15/15 tests for numpy.ma.tests.test_extras Found 17/17 tests for numpy.ma.tests.test_mrecords Found 36/36 tests for numpy.ma.tests.test_old_ma Found 4/4 tests for numpy.ma.tests.test_subclassing Found 7/7 tests for numpy.tests.test_random Found 16/16 tests for numpy.testing.tests.test_utils Found 5/5 tests for numpy.tests.test_ctypeslib .............................................................................. .............................................................................. .............................................................................. .............................................................................. .............................................................................. .............................................................................. ............Ignoring "Python was built with Visual S tudio 2003; extensions must be built with a compiler than can generate compatible binaries. Visual Studio 2003 was not found on this system. If you have Cygwin installed, you can try compiling with MingW32, by passing "-c mingw32" to setup.py." (one should fix me in fcompiler/compaq.py) .............................................................................. .............................................................................. .............................................................................. ..F........................................................................... ....................................EEE.EEEE...E..EE.......................... .......E....E.............................. 
====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 104, in assert_equal desired.tolist(), File "C:\Python25\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 185, in test_set_elements assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. 
---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "C:\Python25\Lib\site-packages\numpy\ma\mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 312, in test_fromrecords assert_equal_records(pa,mpa) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 333, in test_fromrecords_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_old_ma.py", line 657, in check_testUfuncRegression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. 
---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\ma\tests\test_subclassing.py", line 86, in check_data_subclassing assert_equal(xmsub._data, xsub) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "C:\Python25\Lib\site-packages\numpy\ma\testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "C:\Python25\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== FAIL: Ticket #652 ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\linalg\tests\test_regression.py", line 32, in test_eig_build assert_array_almost_equal(va, rva) File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 255, in assert_array_almost_equal header='Arrays are not almost equal') File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 53.8461538462%) x: array([ 1.03221168e+02 +0.j , -1.91843603e+01 +0.j , 1.82126812e+01 +0.j , -1.13692929e+01 +0.j , -6.04004526e-01+15.84422474j, -6.04004526e-01-15.84422474j,... y: array([ 1.03221168e+02 +0.j , -1.91843603e+01 +0.j , -6.04004526e-01+15.84422474j, -6.04004526e-01-15.84422474j, -1.13692929e+01 +0.j , -6.57612485e-01+10.41755503j,... ---------------------------------------------------------------------- Ran 1277 tests in 2.028s FAILED (failures=1, errors=12) From emanuele at relativita.com Wed May 21 08:39:17 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 21 May 2008 14:39:17 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? Message-ID: <483417F5.1040705@relativita.com> Dear all, I need to speed up this function (a little example follows): ------ import numpy as N def distance_matrix(data1,data2,weights): rows = data1.shape[0] columns = data2.shape[0] dm = N.zeros((rows,columns)) for i in range(rows): for j in range(columns): dm[i,j] = ((data1[i,:]-data2[j,:])**2*weights).sum() pass pass return dm size1 = 4 size2 = 3 dimensions = 2 data1 = N.random.rand(size1,dimensions) data2 = N.random.rand(size2,dimensions) weights = N.random.rand(dimensions) dm = distance_matrix(data1,data2,weights) print dm ------------------ The distance_matrix function computes the weighted (squared) euclidean distances between each pair of vectors from two sets (data1, data2). The previous naive algorithm is extremely slow for my standard use, i.e., when size1 and size2 are in the order of 1000 or more. It can be improved using N.subtract.outer: def distance_matrix_faster(data1,data2,weights): rows = data1.shape[0] columns = data2.shape[0] dm = N.zeros((rows,columns)) for i in range(data1.shape[1]): dm += N.subtract.outer(data1[:,i],data2[:,i])**2*weights[i] pass return dm This algorithm becomes slow when dimensions (i.e., data1.shape[1]) is big (i.e., >1000), due to the Python loop. In order to speed it up, I guess that N.subtract.outer could be used on the full matrices instead of one column at a time. 
But then there is a memory issue: 'outer' allocates too much memory since it stores all possible combinations along all dimensions. This is clearly unnecessary. Is there a NumPy way to avoid all Python loops and without wasting too much memory? As a comparison I coded the same algorithm in C through weave (inline): it is _much_ faster and requires just the memory to store the result. But I'd prefer not using C or weave if possible. Thanks in advance for any help, Emanuele From matthieu.brucher at gmail.com Wed May 21 08:43:25 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 21 May 2008 14:43:25 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? In-Reply-To: <483417F5.1040705@relativita.com> References: <483417F5.1040705@relativita.com> Message-ID: Hi, Bill Baxter proposed a version of this problem some months ago on this ML. I use it regularly and it is fast enough for me. Matthieu 2008/5/21 Emanuele Olivetti : > Dear all, > > I need to speed up this function (a little example follows): > ------ > import numpy as N > def distance_matrix(data1,data2,weights): > rows = data1.shape[0] > columns = data2.shape[0] > dm = N.zeros((rows,columns)) > for i in range(rows): > for j in range(columns): > dm[i,j] = ((data1[i,:]-data2[j,:])**2*weights).sum() > pass > pass > return dm > > size1 = 4 > size2 = 3 > dimensions = 2 > data1 = N.random.rand(size1,dimensions) > data2 = N.random.rand(size2,dimensions) > weights = N.random.rand(dimensions) > dm = distance_matrix(data1,data2,weights) > print dm > ------------------ > The distance_matrix function computes the weighted (squared) euclidean > distances between each pair of vectors from two sets (data1, data2). > The previous naive algorithm is extremely slow for my standard use, > i.e., when size1 and size2 are in the order of 1000 or more. It can be > improved using N.subtract.outer: > > def distance_matrix_faster(data1,data2,weights): > rows = data1.shape[0] > columns = data2.shape[0] > dm = N.zeros((rows,columns)) > for i in range(data1.shape[1]): > dm += N.subtract.outer(data1[:,i],data2[:,i])**2*weights[i] > pass > return dm > > This algorithm becomes slow when dimensions (i.e., data1.shape[1]) is > big (i.e., >1000), due to the Python loop. In order to speed it up, I guess > that N.subtract.outer could be used on the full matrices instead of one > column at a time. But then there is a memory issue: 'outer' allocates > too much memory since it stores all possible combinations along all > dimensions. This is clearly unnecessary. > > Is there a NumPy way to avoid all Python loops and without wasting > too much memory? As a comparison I coded the same algorithm in > C through weave (inline): it is _much_ faster and requires just > the memory to store the result. But I'd prefer not using C or weave > if possible. > > Thanks in advance for any help, > > > Emanuele > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From hetland at tamu.edu Wed May 21 09:17:13 2008 From: hetland at tamu.edu (Rob Hetland) Date: Wed, 21 May 2008 15:17:13 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? 
In-Reply-To: <483417F5.1040705@relativita.com> References: <483417F5.1040705@relativita.com> Message-ID: <2D3AB6C5-A5BA-45B2-8ED2-2F2A8324F3C2@tamu.edu> I think you want something like this: x1 = x1 * weights[np.newaxis,:] x2 = x2 * weights[np.newaxis,:] x1 = x1[np.newaxis, :, :] x2 = x2[:, np.newaxis, :] distance = np.sqrt( ((x1 - x2)**2).sum(axis=-1) ) x1 and x2 are arrays with size of (npoints, ndimensions), and npoints can be different for each array. I'm not sure I did your weights right, but that part shouldn't be so difficult. On May 21, 2008, at 2:39 PM, Emanuele Olivetti wrote: > Dear all, > > I need to speed up this function (a little example follows): > ------ > import numpy as N > def distance_matrix(data1,data2,weights): > rows = data1.shape[0] > columns = data2.shape[0] > dm = N.zeros((rows,columns)) > for i in range(rows): > for j in range(columns): > dm[i,j] = ((data1[i,:]-data2[j,:])**2*weights).sum() > pass > pass > return dm > > size1 = 4 > size2 = 3 > dimensions = 2 > data1 = N.random.rand(size1,dimensions) > data2 = N.random.rand(size2,dimensions) > weights = N.random.rand(dimensions) > dm = distance_matrix(data1,data2,weights) > print dm > ------------------ > The distance_matrix function computes the weighted (squared) euclidean > distances between each pair of vectors from two sets (data1, data2). > The previous naive algorithm is extremely slow for my standard use, > i.e., when size1 and size2 are in the order of 1000 or more. It can be > improved using N.subtract.outer: > > def distance_matrix_faster(data1,data2,weights): > rows = data1.shape[0] > columns = data2.shape[0] > dm = N.zeros((rows,columns)) > for i in range(data1.shape[1]): > dm += N.subtract.outer(data1[:,i],data2[:,i])**2*weights[i] > pass > return dm > > This algorithm becomes slow when dimensions (i.e., data1.shape[1]) is > big (i.e., >1000), due to the Python loop. In order to speed it up, > I guess > that N.subtract.outer could be used on the full matrices instead of > one > column at a time. But then there is a memory issue: 'outer' allocates > too much memory since it stores all possible combinations along all > dimensions. This is clearly unnecessary. > > Is there a NumPy way to avoid all Python loops and without wasting > too much memory? As a comparison I coded the same algorithm in > C through weave (inline): it is _much_ faster and requires just > the memory to store the result. But I'd prefer not using C or weave > if possible. > > Thanks in advance for any help, > > > Emanuele > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From tgrav at mac.com Wed May 21 09:21:31 2008 From: tgrav at mac.com (Tommy Grav) Date: Wed, 21 May 2008 09:21:31 -0400 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Mac Installer: please test it In-Reply-To: References: <48338A3E.30307@ar.media.kyoto-u.ac.jp> <24836592-12b4-4f5c-8b9a-373aaaaa4dc8@34g2000hsh.googlegroups.com> <4489925d-28fa-4698-b816-a453af683ad2@27g2000hsf.googlegroups.com> Message-ID: <363EDA16-B5FD-474D-9331-F92E2C8D7EB1@mac.com> Doing the same on a the Mac installer also returns 3 failures and 12 errors with all=True. Installer works fine though :) [skathi:~] tgrav% python ActivePython 2.5.1.1 (ActiveState Software Inc.) 
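As a complement to the broadcasting recipe in the distance_matrix thread above, which materialises a (size2, size1, dimensions) intermediate, the expansion sum_k w_k*(a_ik - b_jk)**2 = sum_k w_k*a_ik**2 + sum_k w_k*b_jk**2 - 2*sum_k w_k*a_ik*b_jk can be evaluated with a single matrix product and allocates little beyond the (size1, size2) result. This is presumably the kind of solution Matthieu refers to; weighted_sqdist is an illustrative name, not code from the thread.

import numpy as np

def weighted_sqdist(data1, data2, weights):
    # data1: (size1, dim), data2: (size2, dim), weights: (dim,)
    aa = np.dot(data1 ** 2, weights)            # (size1,)        sum_k w_k a_ik^2
    bb = np.dot(data2 ** 2, weights)            # (size2,)        sum_k w_k b_jk^2
    cross = np.dot(data1 * weights, data2.T)    # (size1, size2)  sum_k w_k a_ik b_jk
    return aa[:, np.newaxis] + bb[np.newaxis, :] - 2.0 * cross

data1 = np.random.rand(4, 2)
data2 = np.random.rand(3, 2)
weights = np.random.rand(2)
dm = weighted_sqdist(data1, data2, weights)
# spot-check one entry against the direct formula from the original post
print np.allclose(dm[1, 2], ((data1[1] - data2[2]) ** 2 * weights).sum())

Rounding can leave near-zero entries slightly negative, so clipping at zero is worth adding before taking any square roots.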
based on Python 2.5.1 (r251:54863, May 1 2007, 17:40:00) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test(all=True) Numpy is installed in /Library/Frameworks/Python.framework/Versions/ 2.5/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5.1 (r251:54863, May 1 2007, 17:40:00) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] ...................................................../Library/ Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/ numpy/core/ma.py:609: UserWarning: Cannot automatically convert masked array to numeric because data is masked in one or more locations. warnings.warn("Cannot automatically convert masked array to "\ F....................................................................... ........................................................................ ........................................................................ .......................................................................F F......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................EEE.EEEE...E..EE.................................E....E.............................. 
====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 104, in assert_equal desired.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 185, in test_set_elements assert_equal(mbase._fieldmask.tolist(), 
RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 312, in test_fromrecords assert_equal_records(pa,mpa) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 333, in test_fromrecords_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 657, in check_testUfuncRegression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/tests/test_subclassing.py", line 86, in check_data_subclassing assert_equal(xmsub._data, xsub) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== FAIL: check_testUfuncRegression (numpy.core.tests.test_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/core/tests/test_ma.py", line 692, in check_testUfuncRegression self.failUnless(eqmask(ur.mask, mr.mask)) AssertionError ====================================================================== FAIL: test_basic (numpy.core.tests.test_multiarray.TestView) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 843, in test_basic assert_array_equal(y, [67305985, 134678021]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([16909060, 84281096]) y: array([ 67305985, 134678021]) ====================================================================== FAIL: test_keywords (numpy.core.tests.test_multiarray.TestView) 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 852, in test_keywords assert_array_equal(y,[[513]]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([[258]], dtype=int16) y: array([[513]]) ---------------------------------------------------------------------- Ran 1313 tests in 3.741s FAILED (failures=3, errors=12) >>> From emanuele at relativita.com Wed May 21 09:39:27 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 21 May 2008 15:39:27 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? In-Reply-To: References: <483417F5.1040705@relativita.com> Message-ID: <4834260F.30301@relativita.com> Matthieu Brucher wrote: > Hi, > > Bill Baxter proposed a version of this problem some months ago on this > ML. I use it regularly and it is fast enough for me. > Excellent. Exactly what I was looking for. Thanks, Emanuele From bsouthey at gmail.com Wed May 21 09:47:35 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 21 May 2008 08:47:35 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> Message-ID: <483427F7.70808@gmail.com> Hi, I would like to throw out the following idea with no obligations: IF people have the time and energy while writing the documentation, can they also test that the function is doing what it is expected? Also related to this is developing appropriate tests if these are not covered or at least provide a file of code used in evaluating the functionality. The precedence for this that a number of Linux kernel bugs have been identified this way. Thanks Bruce From gnurser at googlemail.com Wed May 21 09:58:37 2008 From: gnurser at googlemail.com (George Nurser) Date: Wed, 21 May 2008 14:58:37 +0100 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Mac Installer: please test it In-Reply-To: <363EDA16-B5FD-474D-9331-F92E2C8D7EB1@mac.com> References: <48338A3E.30307@ar.media.kyoto-u.ac.jp> <24836592-12b4-4f5c-8b9a-373aaaaa4dc8@34g2000hsh.googlegroups.com> <4489925d-28fa-4698-b816-a453af683ad2@27g2000hsf.googlegroups.com> <363EDA16-B5FD-474D-9331-F92E2C8D7EB1@mac.com> Message-ID: <1d1e6ea70805210658l37b63e55t5efcb9ce8b69e7@mail.gmail.com> Hmm. I also get some problems with test(all=True) 2 failures (though they look spurious to me) + 18 errors. Intel MBP, 10.5.2, macPython 2.5.2, apple gcc 4.0.1 George Nurser. >>> numpy.test(all=True) Numpy is installed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 (Apple Computer, Inc. 
build 5363)] .... ====================================================================== ERROR: Ticket #396 ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", line 602, in check_poly1d_nan_roots self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 661, in __getattr__ return roots(self.coeffs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 124, in roots roots = _eigvals(A) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 40, in _eigvals return eigvals(arg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 478, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 150, in eig a1 = asarray_chkfinite(a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py", line 527, in asarray_chkfinite raise ValueError, "array must not contain infs or NaNs" ValueError: array must not contain infs or NaNs ====================================================================== ERROR: Ticket #396 ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", line 602, in check_poly1d_nan_roots self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 661, in __getattr__ return roots(self.coeffs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 124, in roots roots = _eigvals(A) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 37, in _eigvals return eigvals(arg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 478, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 150, in eig a1 = asarray_chkfinite(a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py", line 527, in asarray_chkfinite raise ValueError, "array must not contain infs or NaNs" ValueError: array must not contain infs or NaNs ====================================================================== ERROR: Tests the confidence intervals of the trimmed mean. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 43, in test_trimmedmeanci assert_equal(numpy.round(trimmed_mean_ci(data,0.2),1), [561.8, 630.6]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", line 200, in trimmed_mean_ci tstde = trimmed_stde(data, proportiontocut=proportiontocut, axis=axis) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mstats.py", line 210, in trimmed_stde return _trimmed_stde_1D(data.ravel(), proportiontocut) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mstats.py", line 204, in _trimmed_stde_1D winstd = winsorized.stdu() AttributeError: 'MaskedArray' object has no attribute 'stdu' ====================================================================== ERROR: test_hdquantiles (numpy.ma.tests.test_morestats.TestQuantiles) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 97, in test_hdquantiles hdq = hdquantiles_sd(data,[0.25, 0.5, 0.75]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", line 168, in hdquantiles_sd result = _hdsd_1D(data.compressed(), p) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", line 144, in _hdsd_1D xsorted = numpy.sort(data.compressed()) AttributeError: 'numpy.ndarray' object has no attribute 'compressed' ====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 104, in assert_equal desired.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 185, in test_set_elements assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 312, in test_fromrecords assert_equal_records(pa,mpa) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 333, in test_fromrecords_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests the trimmed mean standard error. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mstats.py", line 144, in test_trimmed_stde assert_almost_equal(trimmed_stde(data,0.2), 56.1, 1) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mstats.py", line 210, in trimmed_stde return _trimmed_stde_1D(data.ravel(), proportiontocut) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mstats.py", line 204, in _trimmed_stde_1D winstd = winsorized.stdu() AttributeError: 'MaskedArray' object has no attribute 'stdu' ====================================================================== ERROR: Tests the Winsorization of the data. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mstats.py", line 150, in test_winsorization assert_almost_equal(winsorize(data).varu(), 21551.4, 1) AttributeError: 'MaskedArray' object has no attribute 'varu' ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 657, in check_testUfuncRegression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_subclassing.py", line 86, in check_data_subclassing assert_equal(xmsub._data, xsub) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== FAIL: Tests the Marits-Jarrett estimator ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 36, in test_mjci assert_almost_equal(mjci(data),[55.76819,45.84028,198.8788],5) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 134, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 227, in assert_array_almost_equal header='Arrays are not almost equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 193, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 33.3333333333%) x: array([ 55.76818915, 45.84027529, 198.8787528 ]) y: array([ 55.76819, 45.84028, 198.8788 ]) ====================================================================== FAIL: Test quantiles 1D - w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mstats.py", line 60, in test_1d_mask [24.833333, 50.0, 75.166666]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 134, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 227, in assert_array_almost_equal header='Arrays are not almost equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 193, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 66.6666666667%) x: array([ 24.83333333, 50. , 75.16666667]) y: array([ 24.833333, 50. 
, 75.166666]) ---------------------------------------------------------------------- Ran 1292 tests in 2.053s FAILED (failures=2, errors=18) From pgmdevlist at gmail.com Wed May 21 10:17:19 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 21 May 2008 10:17:19 -0400 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Mac Installer: please test it In-Reply-To: <1d1e6ea70805210658l37b63e55t5efcb9ce8b69e7@mail.gmail.com> References: <363EDA16-B5FD-474D-9331-F92E2C8D7EB1@mac.com> <1d1e6ea70805210658l37b63e55t5efcb9ce8b69e7@mail.gmail.com> Message-ID: <200805211017.19687.pgmdevlist@gmail.com> Mmh, wait a minute: * There shouldn't be any mstats.py nor morestats.py in numpy.ma any longer: I moved the packages to scipy.stats along their respective unittests. * The corresponding tests work fine on my machine... * The RuntimeError that shows up w/ mrecords is a recent problem, not related to numpy.ma: something changed with the .tolist() method. From stefan at sun.ac.za Wed May 21 10:20:12 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 21 May 2008 16:20:12 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <483427F7.70808@gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> <483427F7.70808@gmail.com> Message-ID: <9457e7c80805210720i5592190cq2049d15a141691f0@mail.gmail.com> Hi Bruce 2008/5/21 Bruce Southey : > I would like to throw out the following idea with no obligations: > IF people have the time and energy while writing the documentation, can > they also test that the function is doing what it is expected? > Also related to this is developing appropriate tests if these are not > covered or at least provide a file of code used in evaluating the > functionality. We are adding examples (read: doctests) to every function, which serve as unit tests at the same time. In writing these, we do come across bugs (like http://projects.scipy.org/scipy/numpy/ticket/798), for which tickets are filed. This is a documentation drive, though, so the examples are illustrative; we don't aim to write exhaustive unit tests that cover all corner cases. That said, any person who wishes to contribute unit tests is most welcome to do so. I can guarantee that your patches will be applied speedily :) Regards St?fan From emanuele at relativita.com Wed May 21 10:28:51 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 21 May 2008 16:28:51 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? In-Reply-To: <2D3AB6C5-A5BA-45B2-8ED2-2F2A8324F3C2@tamu.edu> References: <483417F5.1040705@relativita.com> <2D3AB6C5-A5BA-45B2-8ED2-2F2A8324F3C2@tamu.edu> Message-ID: <483431A3.60603@relativita.com> Rob Hetland wrote: > I think you want something like this: > > x1 = x1 * weights[np.newaxis,:] > x2 = x2 * weights[np.newaxis,:] > > x1 = x1[np.newaxis, :, :] > x2 = x2[:, np.newaxis, :] > distance = np.sqrt( ((x1 - x2)**2).sum(axis=-1) ) > > x1 and x2 are arrays with size of (npoints, ndimensions), and npoints > can be different for each array. I'm not sure I did your weights > right, but that part shouldn't be so difficult. 
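(For reference, a minimal runnable sketch of the broadcasting approach quoted above. The function name is made up for illustration, the per-dimension weights are assumed to multiply the squared differences, and note that the (n2, n1, d) temporary can get large:)

import numpy as np

def weighted_distance_matrix_bcast(x1, x2, weights):
    # x1: (n1, d), x2: (n2, d), weights: (d,)
    # diff has shape (n2, n1, d), so this only pays off for modest sizes.
    diff = x1[np.newaxis, :, :] - x2[:, np.newaxis, :]
    return np.sqrt((weights * diff ** 2).sum(axis=-1))

x1 = np.random.rand(4, 3)
x2 = np.random.rand(5, 3)
w = np.array([1.0, 2.0, 0.5])
print(weighted_distance_matrix_bcast(x1, x2, w).shape)   # (5, 4)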
> > Weights seem not right but anyway here is the solution adapted from Bill Baxter's : def distance_matrix_final(data1,data2,weights): data1w = data1*weights dm = (data1w*data1).sum(1)[:,None]-2*N.dot(data1w,data2.T)+(data2*data2*weights).sum(1) dm[dm<0] = 0 return dm This solution is super-fast, stable and use little memory. It is based on the fact that: (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w For size1=size2=dimensions=1000 requires ~0.6sec. to compute on my dual core duo. It is 2 order of magnitude faster than my previous solution, but 1-2 order of magnitude slower than using C with weave.inline. Definitely good enough for me. Emanuele From chanley at stsci.edu Wed May 21 10:33:15 2008 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 21 May 2008 10:33:15 -0400 Subject: [Numpy-discussion] [Fwd: Re: [NumPy] #770: numpy.core.tests.test_multiarray.TestView failures on big-endian machines] Message-ID: <483432AB.6040405@stsci.edu> Just forwarding this to the main list since the Trac mailer still seems to be broken. Chris -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 -------------- next part -------------- An embedded message was scrubbed... From: "NumPy" Subject: Re: [NumPy] #770: numpy.core.tests.test_multiarray.TestView failures on big-endian machines Date: Wed, 21 May 2008 13:59:54 -0000 Size: 9879 URL: From gnurser at googlemail.com Wed May 21 11:11:24 2008 From: gnurser at googlemail.com (George Nurser) Date: Wed, 21 May 2008 16:11:24 +0100 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Mac Installer: please test it In-Reply-To: <200805211017.19687.pgmdevlist@gmail.com> References: <363EDA16-B5FD-474D-9331-F92E2C8D7EB1@mac.com> <1d1e6ea70805210658l37b63e55t5efcb9ce8b69e7@mail.gmail.com> <200805211017.19687.pgmdevlist@gmail.com> Message-ID: <1d1e6ea70805210811g10c61734sa3049b93fd95dc53@mail.gmail.com> 2008/5/21 Pierre GM : > Mmh, wait a minute: > * There shouldn't be any mstats.py nor morestats.py in numpy.ma any longer: I > moved the packages to scipy.stats along their respective unittests. Right. I hadn't deleted the previous /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy After doing that, I get just 12 errors; no failures. The errors are related to * ma/tests/test_mrecords.py * ma/tests/test_old_ma.py * ma/tests/test_subclassing.py -George. Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 (Apple Computer, Inc. build 5363)] ... .......................................................................EEE.EEEE...E..EE.................................E....E.............................. 
====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 104, in assert_equal desired.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 185, in test_set_elements assert_equal(mbase._fieldmask.tolist(), RuntimeError: 
array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 312, in test_fromrecords assert_equal_records(pa,mpa) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 333, in test_fromrecords_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 657, in check_testUfuncRegression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_subclassing.py", line 86, in check_data_subclassing assert_equal(xmsub._data, xsub) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ---------------------------------------------------------------------- Ran 1277 tests in 1.624s FAILED (errors=12) From pf_moore at yahoo.co.uk Wed May 21 11:13:30 2008 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Wed, 21 May 2008 16:13:30 +0100 Subject: [Numpy-discussion] URGENT: Re: 1.1.0rc1, Win32 Installer: please test it In-Reply-To: References: Message-ID: Jarrod Millman wrote: > Please test the Windows binaries. So far I have only seen two > testers. I can't tag the release until I know that our binary > installers work on a wide variety of Windows machines. For what it's worth, I got this: System information for \\GANDALF: Uptime: 0 days 0 hours 55 minutes 8 seconds Kernel version: Microsoft Windows XP, Multiprocessor Free Product type: Professional Product version: 5.1 Service pack: 3 Kernel build number: 2600 Registered organization: Registered owner: Gustav Install date: 05/12/2006, 14:22:51 Activation status: Error reading status IE version: 7.0000 System root: C:\WINDOWS Processors: 2 Processor speed: 2.4 GHz Processor type: AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ Physical memory: 2046 MB Video driver: NVIDIA GeForce 7950 GT The installet installed the SSE3 version. Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test(all=True) Numpy is installed in C:\Apps\Python\lib\site-packages\numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] Found 18/18 tests for numpy.core.tests.test_defmatrix Found 3/3 tests for numpy.core.tests.test_errstate Found 3/3 tests for numpy.core.tests.test_memmap Found 283/283 tests for numpy.core.tests.test_multiarray Found 70/70 tests for numpy.core.tests.test_numeric Found 36/36 tests for numpy.core.tests.test_numerictypes Found 12/12 tests for numpy.core.tests.test_records Found 140/140 tests for numpy.core.tests.test_regression Found 7/7 tests for numpy.core.tests.test_scalarmath Found 2/2 tests for numpy.core.tests.test_ufunc Found 16/16 tests for numpy.core.tests.test_umath Found 63/63 tests for numpy.core.tests.test_unicode Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu Found 5/5 tests for numpy.distutils.tests.test_misc_util Found 2/2 tests for numpy.fft.tests.test_fftpack Found 3/3 tests for numpy.fft.tests.test_helper Found 10/10 tests for numpy.lib.tests.test_arraysetops Found 1/1 tests for numpy.lib.tests.test_financial Found 53/53 tests for numpy.lib.tests.test_function_base Found 5/5 tests for numpy.lib.tests.test_getlimits Found 6/6 tests for numpy.lib.tests.test_index_tricks Found 15/15 tests for numpy.lib.tests.test_io Found 1/1 tests for numpy.lib.tests.test_machar Found 4/4 tests for numpy.lib.tests.test_polynomial Found 1/1 tests for numpy.lib.tests.test_regression Found 49/49 tests for numpy.lib.tests.test_shape_base Found 15/15 tests for numpy.lib.tests.test_twodim_base Found 43/43 tests for numpy.lib.tests.test_type_check Found 1/1 tests for numpy.lib.tests.test_ufunclike Found 24/24 tests for numpy.lib.tests.test__datasource Found 89/89 tests for numpy.linalg.tests.test_linalg Found 3/3 tests for numpy.linalg.tests.test_regression Found 94/94 tests for numpy.ma.tests.test_core Found 15/15 tests for numpy.ma.tests.test_extras Found 17/17 tests for numpy.ma.tests.test_mrecords Found 36/36 tests for numpy.ma.tests.test_old_ma Found 4/4 tests for numpy.ma.tests.test_subclassing Found 7/7 tests for numpy.tests.test_random Found 16/16 tests for numpy.testing.tests.test_utils Found 5/5 tests for numpy.tests.test_ctypeslib ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Ignoring "Python was built with Visual Studio version 7.1, and extensions need to be built with the same version of the compiler, but it isn't installed." 
(one should fix me in fcompiler/compaq.py) ....................................................................................................................................................................................................................................................................................................................................................F...............................................................................................................EEE.EEEE...E..EE.................................E....E.............................. ====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 104, in assert_equal desired.tolist(), File "C:\Apps\Python\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 185, in test_set_elements 
assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "C:\Apps\Python\Lib\site-packages\numpy\ma\mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 312, in test_fromrecords assert_equal_records(pa,mpa) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 333, in test_fromrecords_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_old_ma.py", line 657, in check_testUfuncRegression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. 
---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_subclassing.py", line 86, in check_data_subclassing assert_equal(xmsub._data, xsub) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "C:\Apps\Python\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== FAIL: Ticket #652 ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Apps\Python\Lib\site-packages\numpy\linalg\tests\test_regression.py", line 32, in test_eig_build assert_array_almost_equal(va, rva) File "C:\Apps\Python\Lib\site-packages\numpy\testing\utils.py", line 255, in assert_array_almost_equal header='Arrays are not almost equal') File "C:\Apps\Python\Lib\site-packages\numpy\testing\utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 53.8461538462%) x: array([ 1.03221168e+02 +0.j , -1.91843603e+01 +0.j , 1.82126812e+01 +0.j , -1.13692929e+01 +0.j , -6.04004526e-01+15.84422474j, -6.04004526e-01-15.84422474j,... y: array([ 1.03221168e+02 +0.j , -1.91843603e+01 +0.j , -6.04004526e-01+15.84422474j, -6.04004526e-01-15.84422474j, -1.13692929e+01 +0.j , -6.57612485e-01+10.41755503j,... ---------------------------------------------------------------------- Ran 1277 tests in 2.531s FAILED (failures=1, errors=12) >>> From pgmdevlist at gmail.com Wed May 21 11:28:29 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 21 May 2008 11:28:29 -0400 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors Message-ID: <200805211128.30336.pgmdevlist@gmail.com> All, Most of the errors that are reported in 1.1.0rc1 are related to the .tolist() method in numpy.ma, such as : ############################ ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): ? ?File "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line 62, in test_get ? ? ?assert_equal(getattr(mbase,field), mbase[field]) ? ?File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line 104, in assert_equal ? ? ?desired.tolist(), ? ?File "C:\Apps\Python\Lib\site-packages\numpy\ma\core.py", line 2552, in tolist ? ? ?result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ##################################### Note that the method seems to work, still: for example, the following command gives the proper output, without RuntimeError python -c "import numpy as np, numpy.ma as ma; x=ma.array(np.random.rand(5), mask=[1,0,0,0,0]); print x.tolist()" The problem looks quite recent, and not related to numpy.ma itself: what changed recently in the .tolist() method of ndarrays ? Why do we get these RuntimeErrors ? 
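(For context, a minimal sketch of what the masked-array tolist() in the tracebacks does: fill the masked entries, then delegate to the plain ndarray tolist(). The array, mask, and fill value below are made up for illustration:)

import numpy as np
import numpy.ma as ma

x = ma.array(np.arange(6).reshape(2, 3), mask=[[0, 1, 0], [0, 0, 1]])

# MaskedArray.tolist() in numpy.ma.core is essentially
#     result = self.filled().tolist()
# i.e. replace masked entries with the fill value and hand the plain
# ndarray to the C-level tolist() that produces the RuntimeError above.
print(x.filled(-1).tolist())    # [[0, -1, 2], [3, 4, -1]]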
From charlesr.harris at gmail.com Wed May 21 11:35:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 09:35:12 -0600 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: <200805211128.30336.pgmdevlist@gmail.com> References: <200805211128.30336.pgmdevlist@gmail.com> Message-ID: On Wed, May 21, 2008 at 9:28 AM, Pierre GM wrote: > All, > Most of the errors that are reported in 1.1.0rc1 are related to the > .tolist() > method in numpy.ma, such as : > ############################ > ERROR: Tests fields retrieval > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "C:\Apps\Python\Lib\site-packages\numpy\ma\tests\test_mrecords.py", line > 62, in test_get > assert_equal(getattr(mbase,field), mbase[field]) > File "C:\Apps\Python\Lib\site-packages\numpy\ma\testutils.py", line > 104, in assert_equal > desired.tolist(), > File "C:\Apps\Python\Lib\site-packages\numpy\ma\core.py", line 2552, > in tolist > result = self.filled().tolist() > RuntimeError: array_item not returning smaller-dimensional array > ##################################### > > Note that the method seems to work, still: for example, the following > command > gives the proper output, without RuntimeError > > python -c "import numpy as np, numpy.ma as ma; > x=ma.array(np.random.rand(5), > mask=[1,0,0,0,0]); print x.tolist()" > > The problem looks quite recent, and not related to numpy.ma itself: what > changed recently in the .tolist() method of ndarrays ? Why do we get these > RuntimeErrors ? > > I expect it comes from the matrix churn. The tolist method was one of those with a workaround for the failure to reduce dimensions. I'll look at it later if someone doesn't beat me to it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Wed May 21 11:39:32 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 21 May 2008 11:39:32 -0400 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> Message-ID: <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> On Wed, May 21, 2008 at 11:35 AM, Charles R Harris wrote: > On Wed, May 21, 2008 at 9:28 AM, Pierre GM wrote: >> The problem looks quite recent, and not related to numpy.ma itself: what >> changed recently in the .tolist() method of ndarrays ? Why do we get these >> RuntimeErrors ? > > I expect it comes from the matrix churn. The tolist method was one of those > with a workaround for the failure to reduce dimensions. I'll look at it > later if someone doesn't beat me to it. There's some commentary and a patch on NumPy ticket 793 on this issue: http://scipy.org/scipy/numpy/ticket/793 Hope it's helpful. Alan From pgmdevlist at gmail.com Wed May 21 11:56:12 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 21 May 2008 11:56:12 -0400 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> References: <200805211128.30336.pgmdevlist@gmail.com> <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> Message-ID: <200805211156.13493.pgmdevlist@gmail.com> On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: > There's some commentary and a patch on NumPy ticket 793 on this issue: > > http://scipy.org/scipy/numpy/ticket/793 OK, thanks a lot ! That's a C problem, then... 
From alan.mcintyre at gmail.com Wed May 21 12:07:19 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 21 May 2008 12:07:19 -0400 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: <200805211156.13493.pgmdevlist@gmail.com> References: <200805211128.30336.pgmdevlist@gmail.com> <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> <200805211156.13493.pgmdevlist@gmail.com> Message-ID: <1d36917a0805210907i3b5dc98ew209bf2956b23148c@mail.gmail.com> On Wed, May 21, 2008 at 11:56 AM, Pierre GM wrote: > On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: >> There's some commentary and a patch on NumPy ticket 793 on this issue: >> >> http://scipy.org/scipy/numpy/ticket/793 > > OK, thanks a lot ! That's a C problem, then... It's probably worth mentioning that I'm not that familiar with all the innards of NumPy yet, so take my comments and patch on that issue with a (fairly large) grain of salt. ;) From bsouthey at gmail.com Wed May 21 12:57:49 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 21 May 2008 11:57:49 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805210720i5592190cq2049d15a141691f0@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> <483427F7.70808@gmail.com> <9457e7c80805210720i5592190cq2049d15a141691f0@mail.gmail.com> Message-ID: <4834548D.108@gmail.com> St?fan van der Walt wrote: > Hi Bruce > > 2008/5/21 Bruce Southey : > >> I would like to throw out the following idea with no obligations: >> IF people have the time and energy while writing the documentation, can >> they also test that the function is doing what it is expected? >> Also related to this is developing appropriate tests if these are not >> covered or at least provide a file of code used in evaluating the >> functionality. >> > > We are adding examples (read: doctests) to every function, which serve > as unit tests at the same time. In writing these, we do come across > bugs (like http://projects.scipy.org/scipy/numpy/ticket/798), for > which tickets are filed. This is a documentation drive, though, so > the examples are illustrative; we don't aim to write exhaustive unit > tests that cover all corner cases. > > That said, any person who wishes to contribute unit tests is most > welcome to do so. I can guarantee that your patches will be applied > speedily :) > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > Hi Excellent as I did not see any mention of this on the web page. 
Bruce From robert.kern at gmail.com Wed May 21 13:38:32 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 12:38:32 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <11D40EB5-263B-4FB3-896A-4FD0F2094F56@tamu.edu> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <3d375d730805210040w5aa2529dmc58b29d3ec9a3e13@mail.gmail.com> <11D40EB5-263B-4FB3-896A-4FD0F2094F56@tamu.edu> Message-ID: <3d375d730805211038geb37e56m28077c3ac9ee26ad@mail.gmail.com> On Wed, May 21, 2008 at 3:26 AM, Rob Hetland wrote: > > On May 21, 2008, at 9:40 AM, Robert Kern wrote: > >> But please, let's not rehash discussions which have already happened >> (like this one). > > I didn't mean to suggest rehashing the documentation format. I agree > that this has been discussed enough. > > Rather, sometimes it's not clear to me how to apply the existing > standard. 'See Also' was a case where the style guidelines seem > sparse. My suggestion, I guess, was more to clarify than to change. Okey-dokey. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed May 21 13:44:20 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 21 May 2008 10:44:20 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> Message-ID: <48345F74.3050704@noaa.gov> Jarrod Millman wrote: >> please test it from: >> >> https://cirl.berkeley.edu/numpy/numpy-1.1.0rc1-py2.5-macosx10.5.dmg > Please test the Mac binaries. I can't tag the release until I know > that our binary installers work on a wide variety of Mac machines. Has there been a new build since the endian bug in the tests was fixed? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed May 21 13:49:08 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 11:49:08 -0600 Subject: [Numpy-discussion] scipy dependency in numpy? Message-ID: Failed importing /usr/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py: No module named scipy.stats.distributions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed May 21 13:51:23 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 12:51:23 -0500 Subject: [Numpy-discussion] scipy dependency in numpy? In-Reply-To: References: Message-ID: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> On Wed, May 21, 2008 at 12:49 PM, Charles R Harris wrote: > Failed importing > /usr/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py: No module > named scipy.stats.distributions. This has been removed in SVN. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From Chris.Barker at noaa.gov Wed May 21 14:23:21 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 21 May 2008 11:23:21 -0700 Subject: [Numpy-discussion] Quick Question about Optimization In-Reply-To: <33644d3c0805201801t47091d2frdbb5f10e720a3d51@mail.gmail.com> References: <33644d3c0805191108r64e847cfwd68e00d2642b5500@mail.gmail.com> <4831E8B4.5040408@noaa.gov> <33644d3c0805191655td515272v85983e1012e2e5c1@mail.gmail.com> <3d375d730805191736w49ec680ek8deba6d4a2215390@mail.gmail.com> <33644d3c0805192026h5dbbb210hf06d5e3cc5fc04b4@mail.gmail.com> <4db580fd0805201224p6517b620of1575f169f7b98d6@mail.gmail.com> <33644d3c0805201801t47091d2frdbb5f10e720a3d51@mail.gmail.com> Message-ID: <48346899.50100@noaa.gov> James Snyder wrote: > b = np.zeros((1,30)) # allocates new memory and disconnects the view This is really about how python works, not how numpy works: np.zeros() -- creates a new array with all zeros in it -- that's the whole point. b = Something -- binds the name "b" to the Something object. Name binding will never, ever, change the object the name used to be bound to. This has nothing to do with whether the object formally know as "b" is referencing the data from another array. This is a nice write up of the concept of name binding in Python: http://python.net/crew/mwh/hacks/objectthink.html -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed May 21 14:28:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 12:28:34 -0600 Subject: [Numpy-discussion] scipy dependency in numpy? In-Reply-To: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 11:51 AM, Robert Kern wrote: > On Wed, May 21, 2008 at 12:49 PM, Charles R Harris > wrote: > > Failed importing > > /usr/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py: No > module > > named scipy.stats.distributions. > > This has been removed in SVN. > Not in 1.1.0.dev5211, which is the latest. Where are people making these fixes? Note that ticket 770 has also been reopened. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Wed May 21 14:38:21 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 21 May 2008 11:38:21 -0700 Subject: [Numpy-discussion] scipy dependency in numpy? In-Reply-To: References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 11:28 AM, Charles R Harris wrote: > Not in 1.1.0.dev5211, which is the latest. It isn't in the trunk. Maybe you have the old file still installed. Please try removing the installed files and reinstall. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From charlesr.harris at gmail.com Wed May 21 14:38:59 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 12:38:59 -0600 Subject: [Numpy-discussion] scipy dependency in numpy? 
In-Reply-To: References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 12:28 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Wed, May 21, 2008 at 11:51 AM, Robert Kern > wrote: > >> On Wed, May 21, 2008 at 12:49 PM, Charles R Harris >> wrote: >> > Failed importing >> > /usr/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py: No >> module >> > named scipy.stats.distributions. >> >> This has been removed in SVN. >> > > Not in 1.1.0.dev5211, which is the latest. Where are people making these > fixes? Note that ticket 770 has also been reopened. > OK, I had to delete numpy from the site-packages and reinstall. Can we make the install do this? Otherwise we will end up with bogus error reports. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Wed May 21 14:37:28 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 21 May 2008 14:37:28 -0400 Subject: [Numpy-discussion] scipy dependency in numpy? In-Reply-To: References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> Message-ID: <200805211437.29319.pgmdevlist@gmail.com> On Wednesday 21 May 2008 14:28:34 Charles R Harris wrote: > On Wed, May 21, 2008 at 11:51 AM, Robert Kern wrote: > > On Wed, May 21, 2008 at 12:49 PM, Charles R Harris > > > > wrote: > > > Failed importing > > > /usr/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py: No > > > > module > > > > > named scipy.stats.distributions. > > > > This has been removed in SVN. > > Not in 1.1.0.dev5211, which is the latest. ??? http://scipy.org/scipy/numpy/changeset/5078 From peridot.faceted at gmail.com Wed May 21 14:44:31 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 21 May 2008 14:44:31 -0400 Subject: [Numpy-discussion] first recarray steps In-Reply-To: References: <4832F77A.8020804@noaa.gov> Message-ID: 2008/5/21 Vincent Schut : > Christopher Barker wrote: >> >> Also, if you image data is rgb, usually, that's a (width, height, 3) >> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) >> array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs >> may give you that, I'm not sure. > > My data is. In fact, this is a simplification of my situation; I'm > processing satellite data, which usually has more (and other) bands than > just rgb. But the data is definitely in shape (bands, y, x). You may find your life becomes easier if you transpose the data in memory. This can make a big difference to efficiency. Years ago I was working with enormous (by the standards of the day) MATLAB files on disk, storing complex data. The way (that version of) MATLAB represented complex data was the way you describe: matrix of real parts, matrix of imaginary parts. This meant that to draw a single pixel, the disk needed to seek twice... depending on what sort of operations you're doing, transposing your data so that each pixel is all in one place may improve cache coherency as well as making the use of record arrays possible. Anne From charlesr.harris at gmail.com Wed May 21 15:03:10 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 13:03:10 -0600 Subject: [Numpy-discussion] scipy dependency in numpy? 
In-Reply-To: <200805211437.29319.pgmdevlist@gmail.com> References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> <200805211437.29319.pgmdevlist@gmail.com> Message-ID: On Wed, May 21, 2008 at 12:37 PM, Pierre GM wrote: > On Wednesday 21 May 2008 14:28:34 Charles R Harris wrote: > > On Wed, May 21, 2008 at 11:51 AM, Robert Kern > wrote: > > > On Wed, May 21, 2008 at 12:49 PM, Charles R Harris > > > > > > wrote: > > > > Failed importing > > > > /usr/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py: No > > > > > > module > > > > > > > named scipy.stats.distributions. > > > > > > This has been removed in SVN. > > > > Not in 1.1.0.dev5211, which is the latest. > And why this in ma/tests/test_core.py set_local_path() from test_old_ma import * restore_path() and test_core then proceeds to shadow all the test classes in test_old_ma. That's just silly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cburns at berkeley.edu Wed May 21 15:12:06 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Wed, 21 May 2008 12:12:06 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <48345F74.3050704@noaa.gov> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <48345F74.3050704@noaa.gov> Message-ID: <764e38540805211212t5ad55624m9136a0a5ab4ba4fd@mail.gmail.com> On Wed, May 21, 2008 at 10:44 AM, Christopher Barker wrote: > Has there been a new build since the endian bug in the tests was fixed? > > -Chris Nope. I figured that would be included in the 1.1.1 release. -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From Chris.Barker at noaa.gov Wed May 21 15:21:11 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 21 May 2008 12:21:11 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <764e38540805211212t5ad55624m9136a0a5ab4ba4fd@mail.gmail.com> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <48345F74.3050704@noaa.gov> <764e38540805211212t5ad55624m9136a0a5ab4ba4fd@mail.gmail.com> Message-ID: <48347627.8050001@noaa.gov> Christopher Burns wrote: > Nope. I figured that would be included in the 1.1.1 release. It seems a few bugs have been found and fixed. It would be nice to put out another release candidate with those fixes at some point. Anyway: OS-X 10.4.11 Dual g5 PPC: FAILED (failures=3, errors=12) I think these are all the ma errors and endian errors already identified, but here are the details: >>> numpy.test(all=True) Numpy is installed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) [GCC 4.0.1 (Apple Computer, Inc. 
build 5367)] Found 18/18 tests for numpy.core.tests.test_defmatrix Found 3/3 tests for numpy.core.tests.test_errstate Found 36/36 tests for numpy.core.tests.test_ma Found 3/3 tests for numpy.core.tests.test_memmap Found 283/283 tests for numpy.core.tests.test_multiarray Found 70/70 tests for numpy.core.tests.test_numeric Found 36/36 tests for numpy.core.tests.test_numerictypes Found 12/12 tests for numpy.core.tests.test_records Found 140/140 tests for numpy.core.tests.test_regression Found 2/2 tests for numpy.core.tests.test_reshape Found 7/7 tests for numpy.core.tests.test_scalarmath Found 2/2 tests for numpy.core.tests.test_ufunc Found 16/16 tests for numpy.core.tests.test_umath Found 63/63 tests for numpy.core.tests.test_unicode Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu Found 5/5 tests for numpy.distutils.tests.test_misc_util Found 2/2 tests for numpy.fft.tests.test_fftpack Found 3/3 tests for numpy.fft.tests.test_helper Found 24/24 tests for numpy.lib.tests.test__datasource Found 10/10 tests for numpy.lib.tests.test_arraysetops Found 1/1 tests for numpy.lib.tests.test_financial Found 53/53 tests for numpy.lib.tests.test_function_base Found 5/5 tests for numpy.lib.tests.test_getlimits Found 6/6 tests for numpy.lib.tests.test_index_tricks Found 15/15 tests for numpy.lib.tests.test_io Found 1/1 tests for numpy.lib.tests.test_machar Found 4/4 tests for numpy.lib.tests.test_polynomial Found 1/1 tests for numpy.lib.tests.test_regression Found 49/49 tests for numpy.lib.tests.test_shape_base Found 15/15 tests for numpy.lib.tests.test_twodim_base Found 43/43 tests for numpy.lib.tests.test_type_check Found 1/1 tests for numpy.lib.tests.test_ufunclike Found 89/89 tests for numpy.linalg.tests.test_linalg Found 3/3 tests for numpy.linalg.tests.test_regression Found 94/94 tests for numpy.ma.tests.test_core Found 15/15 tests for numpy.ma.tests.test_extras Found 17/17 tests for numpy.ma.tests.test_mrecords Found 36/36 tests for numpy.ma.tests.test_old_ma Found 4/4 tests for numpy.ma.tests.test_subclassing Found 7/7 tests for numpy.tests.test_random Found 16/16 tests for numpy.testing.tests.test_utils Found 5/5 tests for numpy.tests.test_ctypeslib ...................................................../Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/ma.py:608: UserWarning: Cannot automatically convert masked array to numeric because data is masked in one or more locations. warnings.warn("Cannot automatically convert masked array to "\ F..............................................................................................................................................................................................................................................................................................FF............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
...........................................................................................................................................................................................EEE.EEEE...E..EE.................................E....E.............................. ====================================================================== ERROR: Test creation by view ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 51, in test_byview assert_equal_records(mbase._data, base._data.view(recarray)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test filling the array ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 258, in test_filled assert_equal(mrecfilled['c'], np.array(('one','two','N/A'), dtype='|S8')) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests fields retrieval ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 62, in test_get assert_equal(getattr(mbase,field), mbase[field]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 104, in assert_equal desired.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test pickling ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 243, in test_pickling assert_equal_records(mrec_._data, mrec._data) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_elements (numpy.ma.tests.test_mrecords.TestMRecords) 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 185, in test_set_elements assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests setting fields. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 104, in test_set_fields assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: test_set_mask (numpy.ma.tests.test_mrecords.TestMRecords) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 142, in test_set_mask assert_equal(mbase._fieldmask.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test tolist. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 269, in test_tolist assert_equal(mrec.tolist(), File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/mrecords.py", line 474, in tolist result = narray(self.filled().tolist(), dtype=object) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Test construction from records. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 312, in test_fromrecords assert_equal_records(pa,mpa) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 74, in assert_equal_records assert_equal(getattr(a,f), getattr(b,f)) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 103, in assert_equal return _assert_equal_on_sequences(actual.tolist(), RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: Tests construction from records w/ mask. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mrecords.py", line 333, in test_fromrecords_wmask _mrec = fromrecords(nrec.tolist(), dtype=ddtype, mask=[0,1,0,]) RuntimeError: array_item not returning smaller-dimensional array ====================================================================== ERROR: check_testUfuncRegression (numpy.ma.tests.test_old_ma.TestUfuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 657, in check_testUfuncRegression uf = getattr(umath, f) NameError: global name 'umath' is not defined ====================================================================== ERROR: Tests whether the subclass is kept. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_subclassing.py", line 86, in check_data_subclassing assert_equal(xmsub._data, xsub) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 106, in assert_equal return assert_array_equal(actual, desired, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 201, in assert_array_equal header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 185, in assert_array_compare reduced = reduced.tolist() File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 2552, in tolist result = self.filled().tolist() RuntimeError: array_item not returning smaller-dimensional array ====================================================================== FAIL: check_testUfuncRegression (numpy.core.tests.test_ma.test_ufuncs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_ma.py", line 692, in check_testUfuncRegression self.failUnless(eqmask(ur.mask, mr.mask)) AssertionError ====================================================================== FAIL: test_basic (numpy.core.tests.test_multiarray.TestView) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 843, in test_basic assert_array_equal(y, [67305985, 134678021]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([16909060, 84281096]) y: array([ 67305985, 134678021]) ====================================================================== FAIL: test_keywords (numpy.core.tests.test_multiarray.TestView) 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 852, in test_keywords assert_array_equal(y,[[513]]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([[258]], dtype=int16) y: array([[513]]) ---------------------------------------------------------------------- Ran 1315 tests in 5.155s FAILED (failures=3, errors=12) -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed May 21 15:25:21 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 21 May 2008 12:25:21 -0700 Subject: [Numpy-discussion] Outputting arrays. In-Reply-To: <256985.68098.qm@web51501.mail.re2.yahoo.com> References: <256985.68098.qm@web51501.mail.re2.yahoo.com> Message-ID: <48347721.3090800@noaa.gov> Alexandra Geddes wrote: > 1. Is there a module or other code to write arrays to databases (they want access databases)? I don't think there is any way to write an access database file from Python except using com on Windows. > 2. How can i write 2D arrays to textfiles with labels on the rows and columns? I'm not sure it there is an out of the box way, but it's easy to write by hand, just loop through the array. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed May 21 15:28:40 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 14:28:40 -0500 Subject: [Numpy-discussion] scipy dependency in numpy? In-Reply-To: References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> Message-ID: <3d375d730805211228o1d86de1eh958544ad1049e627@mail.gmail.com> On Wed, May 21, 2008 at 1:38 PM, Charles R Harris wrote: > OK, I had to delete numpy from the site-packages and reinstall. Can we make > the install do this? Otherwise we will end up with bogus error reports. That's not really feasible in distutils, no. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From berthe.loic at gmail.com Wed May 21 15:33:31 2008 From: berthe.loic at gmail.com (LB) Date: Wed, 21 May 2008 12:33:31 -0700 (PDT) Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: References: Message-ID: <73ba4f6b-26c4-46c7-8d1e-9a5dca42d152@x41g2000hsb.googlegroups.com> This is really a great news, and it seems very promising according to the first pages of the Wiki that I've seen. It's perhaps not the right place to say this, but I was wondering what you would thinking about adding labels or category to the descriptions of each function ? 
I think It would really help newcomers to search around the 400 functions of numpy if they could use some labels to precise theirs thoughts. Until now, the most efficient why of finding the numpy function that fits my needds what to search after some words in the numpy example list page. This was not always very fast and labels like "array creation", "shape manipulation", "index operation", "arithmetic", ...etc, could simplify this kind of search. -- LB From tim.hochberg at ieee.org Wed May 21 15:39:32 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 21 May 2008 12:39:32 -0700 Subject: [Numpy-discussion] Outputting arrays. In-Reply-To: <256985.68098.qm@web51501.mail.re2.yahoo.com> References: <256985.68098.qm@web51501.mail.re2.yahoo.com> Message-ID: On Wed, May 21, 2008 at 12:32 AM, Alexandra Geddes wrote: > Hi. > > 1. Is there a module or other code to write arrays to databases (they want > access databases)? If you have $$, I think you can use mxODBC. Otherwise, I believe that you have to use COM as Chris suggested. > > > 2. How can i write 2D arrays to textfiles with labels on the rows and > columns? I'm not sure it will do what you want, but you might want to look at the csv module. > > > thanks! > alex. > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Wed May 21 15:40:53 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 21 May 2008 12:40:53 -0700 Subject: [Numpy-discussion] scipy dependency in numpy? In-Reply-To: References: <3d375d730805211051s5fd9b172m2c77e60b223e3128@mail.gmail.com> <200805211437.29319.pgmdevlist@gmail.com> Message-ID: On Wed, May 21, 2008 at 12:03 PM, Charles R Harris wrote: > And why this in ma/tests/test_core.py > > set_local_path() > from test_old_ma import * > restore_path() > > and test_core then proceeds to shadow all the test classes in test_old_ma. > That's just silly. Let's just leave this for the moment. As soon as we get 1.1.0 out, we are going to switch to the nose testing framework in the 1.2 development series. We can clean up the tests there. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From peridot.faceted at gmail.com Wed May 21 15:46:50 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 21 May 2008 15:46:50 -0400 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <73ba4f6b-26c4-46c7-8d1e-9a5dca42d152@x41g2000hsb.googlegroups.com> References: <73ba4f6b-26c4-46c7-8d1e-9a5dca42d152@x41g2000hsb.googlegroups.com> Message-ID: 2008/5/21 LB : > This is really a great news, and it seems very promising according to > the first pages of the Wiki that I've seen. > > It's perhaps not the right place to say this, but I was wondering what > you would thinking about adding labels or category to the descriptions > of each function ? I think It would really help newcomers to search > around the 400 functions of numpy if they could use some labels to > precise theirs thoughts. You're right that some kind of categorization would be very valuable. I think that this is planned, but the doc marathon organizers can presumably give a more detailed answer. 
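As a rough illustration of how such labels could be turned into a categorised listing automatically, here is a sketch; the 'Category:' docstring line it looks for is invented purely for this example and is not part of the current docstring standard:

import numpy as np

def categorised_index(module):
    # Collect {category label: [names]} from any docstring that carries
    # a (hypothetical) line such as
    #     Category: array creation, shape manipulation
    index = {}
    for name in dir(module):
        doc = getattr(getattr(module, name), '__doc__', None) or ''
        for line in doc.splitlines():
            line = line.strip()
            if line.lower().startswith('category:'):
                for label in line[len('category:'):].split(','):
                    index.setdefault(label.strip(), []).append(name)
    return index

# e.g. whatever ends up filed under "array creation"
# (empty until docstrings actually carry such annotations)
print categorised_index(np).get('array creation', [])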
> Until now, the most efficient why of finding the numpy function that > fits my needds what to search after some words in the numpy example > list page. This was not always very fast and labels like "array > creation", "shape manipulation", "index operation", > "arithmetic", ...etc, could simplify this kind of search. In the short term, there's the numpy functions by category page on the wiki: http://www.scipy.org/Numpy_Functions_by_Category Of course, this page is already incomplete, and nobody is systematically updating it. Really the right solution is what you propose, annotating each function and then automatically generating such a list. Anne From stefan at sun.ac.za Wed May 21 15:49:15 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 21 May 2008 21:49:15 +0200 Subject: [Numpy-discussion] NumPy Documentation Marathon progress Message-ID: <9457e7c80805211249v85f1caen76b0461cf3984c2@mail.gmail.com> Hi all, I'd like to thank everyone who has responded so positively to the NumPy Documentation Marathon. We now have more than 10 contributors enlisted, including (in order of signing up): - Pauli Virtanen - Emmanuelle Gouillart - Joe Harrington - Johann Cohen-Tanugi - Alan Jackson - Tim Cera - Anne Archibald - David Cournapeau - Neil Martinsen Burrell - Rob Hetland We have already updated more than 50 docstrings (not all of them major rewrites, but we're working on it). I'd like to invite any interested parties to join our efforts on http://sd-2116.dedibox.fr/doc It's easy: create an account on the wiki, familiarise yourself with the docstring standard, and start editing. When the source trunk re-opens for commits after the 1.1 release, I plan to upload the changes from the wiki. Thereafter, I will synchronise them on a weekly or twice-weekly basis. The reference guide is also shaping up nicely. To bring you up to speed: I have written a parser for the current docstring standard, which is used to convert the docstrings from the source tree to a stand-alone document. This document is rendered using Sphinx, the application developed to generate the Python 2.6 documentation. You can see the progress on the reference guide here, in HTML or PDF: http://mentat.za.net/numpy/refguide http://mentat.za.net/numpy/refguide/NumPy.pdf - Mathematical formulas now render correctly (using mathml in the .xhtml files, and LaTeX in the PDF). If you can't read the mathml (see for example the `bartlett` function) you may need to install additional fonts (e.g., http://www.cs.unb.ca/~bremner/docs/mathml-debian-firefox/). For internet explorer, a separate plugin (mathplayer) is required. - I'm aware that the HTML-search is broken -- we'll fix that soon. So, that's where we are now. A lot of the organisation has been done, and there is an editing framework in place (thanks Pauli, Emmanuelle!). Now, we just need to write some content! Thanks again for all your contributions, and here's to many more! 
Regards St?fan From stefan at sun.ac.za Wed May 21 15:51:24 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 21 May 2008 21:51:24 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <73ba4f6b-26c4-46c7-8d1e-9a5dca42d152@x41g2000hsb.googlegroups.com> References: <73ba4f6b-26c4-46c7-8d1e-9a5dca42d152@x41g2000hsb.googlegroups.com> Message-ID: <9457e7c80805211251w38d1e40cq4551617d2c54ef68@mail.gmail.com> Hi LB 2008/5/21 LB : > This is really a great news, and it seems very promising according to > the first pages of the Wiki that I've seen. > > It's perhaps not the right place to say this, but I was wondering what > you would thinking about adding labels or category to the descriptions > of each function ? I think It would really help newcomers to search > around the 400 functions of numpy if they could use some labels to > precise theirs thoughts. If you take a look at the new docstring standard, we've introduced a new `index` tag, that can be used as follows: .. index:: :refguide: trigonometry, ufunc We still need to decide on which categories to use, but the markup is there. > Until now, the most efficient why of finding the numpy function that > fits my needds what to search after some words in the numpy example > list page. This was not always very fast and labels like "array > creation", "shape manipulation", "index operation", > "arithmetic", ...etc, could simplify this kind of search. Watch this space! Cheers St?fan From cburns at berkeley.edu Wed May 21 16:10:42 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Wed, 21 May 2008 13:10:42 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test In-Reply-To: <48347627.8050001@noaa.gov> References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com> <48345F74.3050704@noaa.gov> <764e38540805211212t5ad55624m9136a0a5ab4ba4fd@mail.gmail.com> <48347627.8050001@noaa.gov> Message-ID: <764e38540805211310x20871633mc04ee7658c537ae1@mail.gmail.com> You're right, I'll put out a new rc. Sorry, I didn't see the other emails this morning and assumed the only errors were the endian issues in the test code. Apparently these are still an issue though, so I'll look into that. Chris On Wed, May 21, 2008 at 12:21 PM, Christopher Barker wrote: > Christopher Burns wrote: >> Nope. I figured that would be included in the 1.1.1 release. > > It seems a few bugs have been found and fixed. It would be nice to put > out another release candidate with those fixes at some point. Anyway: > From aisaac at american.edu Wed May 21 16:27:07 2008 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 21 May 2008 16:27:07 -0400 Subject: [Numpy-discussion] Outputting arrays. In-Reply-To: <256985.68098.qm@web51501.mail.re2.yahoo.com> References: <256985.68098.qm@web51501.mail.re2.yahoo.com> Message-ID: On Wed, 21 May 2008, Alexandra Geddes apparently wrote: > 2. How can i write 2D arrays to textfiles with labels on the rows and columns? 
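For the second question, a minimal hand-rolled writer along the lines suggested earlier in the thread (just loop through the array) might look like this; the data, labels and file name below are made up for illustration:

import numpy as np

data = np.arange(6).reshape(2, 3)
row_labels = ['r1', 'r2']
col_labels = ['a', 'b', 'c']

out = open('table.txt', 'w')
# header: an empty corner cell, then the column labels
out.write('\t'.join([''] + col_labels) + '\n')
for label, row in zip(row_labels, data):
    # one row label followed by that row's values
    out.write('\t'.join([label] + [str(v) for v in row]) + '\n')
out.close()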
http://code.google.com/p/econpy/source/browse/trunk/utilities/text.py hth, Alan Isaac From ondrej at certik.cz Wed May 21 16:26:23 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 21 May 2008 22:26:23 +0200 Subject: [Numpy-discussion] Ticket #798: `piecewise` exposes raw memory In-Reply-To: <85b5c3130805210253r2795c015v3d689d01305c061a@mail.gmail.com> References: <9457e7c80805210230j4e1a63b8u9479164d7c231c18@mail.gmail.com> <85b5c3130805210253r2795c015v3d689d01305c061a@mail.gmail.com> Message-ID: <85b5c3130805211326v3c407274m46fe9e90ae3ba646@mail.gmail.com> On Wed, May 21, 2008 at 11:53 AM, Ondrej Certik wrote: > On Wed, May 21, 2008 at 11:30 AM, St?fan van der Walt wrote: >> Referring to >> http://scipy.org/scipy/numpy/ticket/798 >> >> `piecewise` uses `empty` to allocate output memory. If the conditions >> do not sufficiently cover the output, then raw memory is returned, >> e.g., >> >> {{{ >> import numpy as np >> np.piecewise([0,1,2],[True,False,False],[1]) >> }}} >> >> A patch which addresses the issue is available here for review: >> >> http://codereview.appspot.com/1105 >> >> Documentation is being updated on the wiki. > > I'd like to invite everyone to take part in the review. It's fun, it's > just talking, no coding. :) Thanks Robert for taking part in the review: http://codereview.appspot.com/1105/diff/22/122#newcode566 That's the way to go. Ondrej From wfspotz at sandia.gov Wed May 21 16:34:33 2008 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 21 May 2008 14:34:33 -0600 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL Message-ID: <62713CCA-B1B4-402D-9B71-18F483C8A2AB@sandia.gov> I am running into a problem with a numpy-compatible extension module that I develop, and I believe it has to do with PY_ARRAY_UNIQUE_SYMBOL. I set PY_ARRAY_UNIQUE_SYMBOL to "PyTrilinos". On my machine (Mac OS X), the module loads and works properly. Another user, however (on Ubuntu), gets the following: ImportError: Failure linking new module: /usr/local/lib/python2.4/ site- packages/PyTrilinos/_Epetra.so: Symbol not found: _PyTrilinos Referenced from: /usr/local/lib/libpytrilinos.dylib Expected in: dynamic lookup On my machine, I see: $ nm libpytrilinos.dylib | grep PyTrilinos U _PyTrilinos ... and I'm not sure where the symbol actually IS defined. I don't understand enough about how PY_ARRAY_UNIQUE_SYMBOL is actually used to hunt this down, especially through another user on a machine I don't have access to. Any ideas on what I should be looking for? Thanks ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From charlesr.harris at gmail.com Wed May 21 16:39:51 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 14:39:51 -0600 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: <1d36917a0805210907i3b5dc98ew209bf2956b23148c@mail.gmail.com> References: <200805211128.30336.pgmdevlist@gmail.com> <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> <200805211156.13493.pgmdevlist@gmail.com> <1d36917a0805210907i3b5dc98ew209bf2956b23148c@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 10:07 AM, Alan McIntyre wrote: > On Wed, May 21, 2008 at 11:56 AM, Pierre GM wrote: > > On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: > >> There's some commentary and a patch on NumPy ticket 793 on this issue: > >> > >> http://scipy.org/scipy/numpy/ticket/793 > > > > OK, thanks a lot ! That's a C problem, then... 
> > It's probably worth mentioning that I'm not that familiar with all the > innards of NumPy yet, so take my comments and patch on that issue with > a (fairly large) grain of salt. ;) This was introduced by Travis in r5138 as part of the matrix changes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 21 17:10:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 15:10:18 -0600 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> <200805211156.13493.pgmdevlist@gmail.com> <1d36917a0805210907i3b5dc98ew209bf2956b23148c@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 2:39 PM, Charles R Harris wrote: > > > On Wed, May 21, 2008 at 10:07 AM, Alan McIntyre > wrote: > >> On Wed, May 21, 2008 at 11:56 AM, Pierre GM wrote: >> > On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: >> >> There's some commentary and a patch on NumPy ticket 793 on this issue: >> >> >> >> http://scipy.org/scipy/numpy/ticket/793 >> > >> > OK, thanks a lot ! That's a C problem, then... >> >> It's probably worth mentioning that I'm not that familiar with all the >> innards of NumPy yet, so take my comments and patch on that issue with >> a (fairly large) grain of salt. ;) > > > This was introduced by Travis in r5138 as part of the matrix changes. > Alan's change looks a bit iffy to me because the looser check would pass things like matrices and there would be no check for decreasing dimensions to throw an error. So I think the safest thing for 1.1 is to back out Travis' change and rethink this for 1.2. Chuck > > Chuck > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed May 21 17:24:15 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 21 May 2008 16:24:15 -0500 Subject: [Numpy-discussion] NumPy Documentation Marathon progress In-Reply-To: <9457e7c80805211249v85f1caen76b0461cf3984c2@mail.gmail.com> References: <9457e7c80805211249v85f1caen76b0461cf3984c2@mail.gmail.com> Message-ID: <483492FF.3070801@enthought.com> St?fan van der Walt wrote: > Hi all, > > I'd like to thank everyone who has responded so positively to the > NumPy Documentation Marathon. We now have more than 10 contributors > enlisted, including (in order of signing up): > > - Pauli Virtanen > - Emmanuelle Gouillart > - Joe Harrington > - Johann Cohen-Tanugi > - Alan Jackson > - Tim Cera > - Anne Archibald > - David Cournapeau > - Neil Martinsen Burrell > - Rob Hetland > > > http://mentat.za.net/numpy/refguide > http://mentat.za.net/numpy/refguide/NumPy.pdf > This is looking great.... Hearty thanks to all those who are helping. -Travis From mattknox.ca at gmail.com Wed May 21 17:40:01 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Wed, 21 May 2008 21:40:01 +0000 (UTC) Subject: [Numpy-discussion] Outputting arrays. References: <256985.68098.qm@web51501.mail.re2.yahoo.com> Message-ID: > 1. Is there a module or other code to write arrays to databases (they want access databases)? There are three python odbc modules (mxODBC, ceODBC, pyodbc), all of which should allow you to connect to access databases. I've played around with all three and my personal favourite is ceODBC (which is probably the least well known too). 
Here is a quick summary of the three: mxODBC: pros: fast, reliable cons: not free, uses mx.DateTime objects for dates instead of standard python datetime pyodbc: pros: free cons: last time I used it, it was quite buggy. It's slow for some things. Uses a wierd custom "row" object instead of standard tuples in the results. ceODBC: pros: free, fast, reliable cons: doesn't seem to be very actively developed, but the existing code is very good Now, as for how you use these to store arrays in a db... I have a brief tutorial on the timeseries wiki (http://scipy.org/scipy/scikits/wiki/TimeSeries) for working with timeseries objects and relational databases. The same general approach works for standard arrays. The key is to make use of the executemany method in the database modules for inserting, and use the python zip function a lot and the tolist method of arrays. Note that if you are inserting a one dimensional array, you'll need to do "myarray.reshape((myarray.size, 1)).tolist()" to pass it as a parameter to executemany (so you get a list of "rows" instead of just a single list of numbers). If you need more detail, let me know and I can make a fuller example. - Matt From charlesr.harris at gmail.com Wed May 21 17:57:30 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 15:57:30 -0600 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> <1d36917a0805210839s3760dd7p6ff0ad559b3cdaef@mail.gmail.com> <200805211156.13493.pgmdevlist@gmail.com> <1d36917a0805210907i3b5dc98ew209bf2956b23148c@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 3:10 PM, Charles R Harris wrote: > > > On Wed, May 21, 2008 at 2:39 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, May 21, 2008 at 10:07 AM, Alan McIntyre >> wrote: >> >>> On Wed, May 21, 2008 at 11:56 AM, Pierre GM >>> wrote: >>> > On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: >>> >> There's some commentary and a patch on NumPy ticket 793 on this issue: >>> >> >>> >> http://scipy.org/scipy/numpy/ticket/793 >>> > >>> > OK, thanks a lot ! That's a C problem, then... >>> >>> It's probably worth mentioning that I'm not that familiar with all the >>> innards of NumPy yet, so take my comments and patch on that issue with >>> a (fairly large) grain of salt. ;) >> >> >> This was introduced by Travis in r5138 as part of the matrix changes. >> > > Alan's change looks a bit iffy to me because the looser check would pass > things like matrices and there would be no check for decreasing dimensions > to throw an error. So I think the safest thing for 1.1 is to back out > Travis' change and rethink this for 1.2. > > Chuck > I backed out r5138 and, with a few test fixes, everything passes. However, there is now a warning exposed in the masked array tests: /usr/lib/python2.5/site-packages/numpy/ma/core.py:1357: UserWarning: MaskedArray.__setitem__ on fields: The mask is NOT affected! warnings.warn("MaskedArray.__setitem__ on fields: " Pierre? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Wed May 21 18:28:01 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 21 May 2008 15:28:01 -0700 Subject: [Numpy-discussion] matrices and __radd__ Message-ID: I have a class that stores some of its data in a matrix. I can't figure out how to do right adds with a matrix. 
Here's a toy example: class Myclass(object): def __init__(self, x, a): self.x = x # numpy matrix self.a = a # some attribute, say, an integer def __add__(self, other): # Assume other is a numpy matrix return Myclass(self.x + other, self.a += 1) def __radd__(self, other): print other >> from myclass import Myclass >> import numpy.matlib as mp >> m = Myclass(mp.zeros((2,2)), 1) >> x = mp.asmatrix(range(4)).reshape(2,2) >> radd = x + m 0 1 2 3 The matrix.__add__ sends one element at a time. That sounds slow. Do I have to grab the corresponding element of self.x and add it to the element passed in by matrix.__add__? Or is there a better way? From charlesr.harris at gmail.com Wed May 21 18:49:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 16:49:00 -0600 Subject: [Numpy-discussion] Shouldn't test(all=1) be the default? Message-ID: Several problems would have been caught early on if test(all=1) had been the default. Is there a reason it is not so? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 21 18:54:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 16:54:34 -0600 Subject: [Numpy-discussion] One remaining problem in tests of masked arrays. Message-ID: The recent fixes have removed almost all the problems that were exposed by running test(all=True). There remains just this warning: /usr/lib/python2.5/site-packages/numpy/ma/core.py:1357: UserWarning: MaskedArray.__setitem__ on fields: The mask is NOT affected! warnings.warn("MaskedArray.__setitem__ on fields: " It looks like this was uncovered when other test failures were removed. Any ideas? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed May 21 19:00:13 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 18:00:13 -0500 Subject: [Numpy-discussion] Shouldn't test(all=1) be the default? In-Reply-To: References: Message-ID: <3d375d730805211600u4fac976fyc7060dadab5490f@mail.gmail.com> On Wed, May 21, 2008 at 5:49 PM, Charles R Harris wrote: > Several problems would have been caught early on if test(all=1) had been the > default. Is there a reason it is not so? This dates back a long ways. Presumably, important tests were to be associated with module names, and extra tests (like benchmarks) weren't. But I really don't know the entire thinking behind it. We should probably turn it to True for the rest of the 1.1.y series. Anyways, all of this is getting replaced in 1.2 and is already gone in scipy; the practice of naming test files like the modules they test won't be required anymore. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed May 21 19:07:43 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 18:07:43 -0500 Subject: [Numpy-discussion] matrices and __radd__ In-Reply-To: References: Message-ID: <3d375d730805211607k4d9fe254g9f5dcb174a8352d3@mail.gmail.com> On Wed, May 21, 2008 at 5:28 PM, Keith Goodman wrote: > I have a class that stores some of its data in a matrix. I can't > figure out how to do right adds with a matrix. 
Here's a toy example: > > class Myclass(object): > > def __init__(self, x, a): > self.x = x # numpy matrix > self.a = a # some attribute, say, an integer > > def __add__(self, other): > # Assume other is a numpy matrix > return Myclass(self.x + other, self.a += 1) > > def __radd__(self, other): > print other > >>> from myclass import Myclass >>> import numpy.matlib as mp >>> m = Myclass(mp.zeros((2,2)), 1) >>> x = mp.asmatrix(range(4)).reshape(2,2) >>> radd = x + m > 0 > 1 > 2 > 3 > > The matrix.__add__ sends one element at a time. That sounds slow. Well, what's actually going on here is this: ndarray.__add__() looks at m and decides that it doesn't look like anything it can make an array from. However, it does have an __add__() method, so it assumes that it is intended to be a scalar. It uses broadcasting to treat it as if it were an object array of the shape of x with each element identical. Then it adds together the two arrays element-wise. Each element-wise addition triggers the MyClass.__radd__() call. > Do I > have to grab the corresponding element of self.x and add it to the > element passed in by matrix.__add__? Or is there a better way? There probably is, but not being familiar with your actual use case, I'm not sure what it would be. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed May 21 19:15:45 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 18:15:45 -0500 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL In-Reply-To: <62713CCA-B1B4-402D-9B71-18F483C8A2AB@sandia.gov> References: <62713CCA-B1B4-402D-9B71-18F483C8A2AB@sandia.gov> Message-ID: <3d375d730805211615o5b045b4fh81af55ce4d55cb68@mail.gmail.com> On Wed, May 21, 2008 at 3:34 PM, Bill Spotz wrote: > I am running into a problem with a numpy-compatible extension module > that I develop, and I believe it has to do with PY_ARRAY_UNIQUE_SYMBOL. > > I set PY_ARRAY_UNIQUE_SYMBOL to "PyTrilinos". Why? My understanding is also limited, but it does not appear to me to be something one should change. > On my machine (Mac OS > X), the module loads and works properly. Another user, however (on > Ubuntu), gets the following: > > ImportError: Failure linking new module: /usr/local/lib/python2.4/ > site- > packages/PyTrilinos/_Epetra.so: Symbol not found: _PyTrilinos > Referenced from: /usr/local/lib/libpytrilinos.dylib > Expected in: dynamic lookup ??? How did an Ubuntu user get a hold of a .dylib? > On my machine, I see: > > $ nm libpytrilinos.dylib | grep PyTrilinos > U _PyTrilinos > ... > > and I'm not sure where the symbol actually IS defined. It gets generated into __multiarray_api.h: #if defined(PY_ARRAY_UNIQUE_SYMBOL) #define PyArray_API PY_ARRAY_UNIQUE_SYMBOL #endif #if defined(NO_IMPORT) || defined(NO_IMPORT_ARRAY) extern void **PyArray_API; #else #if defined(PY_ARRAY_UNIQUE_SYMBOL) void **PyArray_API; #else static void **PyArray_API=NULL; #endif #endif How did this symbol end up in a .dylib rather than a Python extension module? I think it might be difficult to arrange for a non-extension dynamic library to call into numpy. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Wed May 21 19:17:28 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 21 May 2008 16:17:28 -0700 Subject: [Numpy-discussion] matrices and __radd__ In-Reply-To: <3d375d730805211607k4d9fe254g9f5dcb174a8352d3@mail.gmail.com> References: <3d375d730805211607k4d9fe254g9f5dcb174a8352d3@mail.gmail.com> Message-ID: On Wed, May 21, 2008 at 4:07 PM, Robert Kern wrote: > On Wed, May 21, 2008 at 5:28 PM, Keith Goodman wrote: >> I have a class that stores some of its data in a matrix. I can't >> figure out how to do right adds with a matrix. Here's a toy example: >> >> class Myclass(object): >> >> def __init__(self, x, a): >> self.x = x # numpy matrix >> self.a = a # some attribute, say, an integer >> >> def __add__(self, other): >> # Assume other is a numpy matrix >> return Myclass(self.x + other, self.a += 1) >> >> def __radd__(self, other): >> print other >> >>>> from myclass import Myclass >>>> import numpy.matlib as mp >>>> m = Myclass(mp.zeros((2,2)), 1) >>>> x = mp.asmatrix(range(4)).reshape(2,2) >>>> radd = x + m >> 0 >> 1 >> 2 >> 3 >> >> The matrix.__add__ sends one element at a time. That sounds slow. > > Well, what's actually going on here is this: ndarray.__add__() looks > at m and decides that it doesn't look like anything it can make an > array from. However, it does have an __add__() method, so it assumes > that it is intended to be a scalar. It uses broadcasting to treat it > as if it were an object array of the shape of x with each element > identical. Then it adds together the two arrays element-wise. Each > element-wise addition triggers the MyClass.__radd__() call. Oh, broadcasting. OK that makes sense. >> Do I >> have to grab the corresponding element of self.x and add it to the >> element passed in by matrix.__add__? Or is there a better way? > > There probably is, but not being familiar with your actual use case, > I'm not sure what it would be. From http://projects.scipy.org/pipermail/numpy-discussion/2006-December/025075.html I see that the trick is to add __array_priority__ = 10 to my class. class Myclass(object): __array_priority__ = 10 def __init__(self, x, a): self.x = x # numpy matrix self.a = a # some attribute, say, an integer def __add__(self, other): # Assume other is a numpy matrix return Myclass(self.x + other, 2*self.a) __radd__ = __add__ >> from myclass import Myclass >> import numpy.matlib as mp >> m = Myclass(mp.zeros((2,2)), 1) >> x = mp.asmatrix(range(4)).reshape(2,2) >> radd = x + m >> radd.a 2 >> radd.x matrix([[ 0., 1.], [ 2., 3.]]) From wfspotz at sandia.gov Wed May 21 19:31:17 2008 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 21 May 2008 17:31:17 -0600 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL In-Reply-To: <3d375d730805211615o5b045b4fh81af55ce4d55cb68@mail.gmail.com> References: <62713CCA-B1B4-402D-9B71-18F483C8A2AB@sandia.gov> <3d375d730805211615o5b045b4fh81af55ce4d55cb68@mail.gmail.com> Message-ID: On May 21, 2008, at 5:15 PM, Robert Kern wrote: > On Wed, May 21, 2008 at 3:34 PM, Bill Spotz > wrote: >> I am running into a problem with a numpy-compatible extension module >> that I develop, and I believe it has to do with >> PY_ARRAY_UNIQUE_SYMBOL. >> >> I set PY_ARRAY_UNIQUE_SYMBOL to "PyTrilinos". > > Why? My understanding is also limited, but it does not appear to me to > be something one should change. My understanding is that if your extension module requires more than one source file, then this macro needs to be set to something reasonably unique to your project. 
And that NO_IMPORT_ARRAY needs to be set in those files that do not call import_array(). >> On my machine (Mac OS >> X), the module loads and works properly. Another user, however (on >> Ubuntu), gets the following: >> >> ImportError: Failure linking new module: /usr/local/lib/python2.4/ >> site- >> packages/PyTrilinos/_Epetra.so: Symbol not found: _PyTrilinos >> Referenced from: /usr/local/lib/libpytrilinos.dylib >> Expected in: dynamic lookup > > ??? How did an Ubuntu user get a hold of a .dylib? I could have this wrong. He is asking me questions about Mac OS X and Ubuntu simultaneously. But going back through his emails, I thought I had this right. >> On my machine, I see: >> >> $ nm libpytrilinos.dylib | grep PyTrilinos >> U _PyTrilinos >> ... >> >> and I'm not sure where the symbol actually IS defined. > > It gets generated into __multiarray_api.h: > > #if defined(PY_ARRAY_UNIQUE_SYMBOL) > #define PyArray_API PY_ARRAY_UNIQUE_SYMBOL > #endif > > #if defined(NO_IMPORT) || defined(NO_IMPORT_ARRAY) > extern void **PyArray_API; > #else > #if defined(PY_ARRAY_UNIQUE_SYMBOL) > void **PyArray_API; > #else > static void **PyArray_API=NULL; > #endif > #endif > > > How did this symbol end up in a .dylib rather than a Python extension > module? I think it might be difficult to arrange for a non-extension > dynamic library to call into numpy. It is designed so that the extension modules (plural -- there are many of them) link to the dylib that has the symbol. The code in this dylib is only ever accessed through those extension modules. ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From thrabe at burnham.org Wed May 21 19:39:32 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Wed, 21 May 2008 16:39:32 -0700 Subject: [Numpy-discussion] 1.1.0rc1 OSX Installer - please test References: <764e38540805191239w41924efew333f72ad05a76ea7@mail.gmail.com><48345F74.3050704@noaa.gov><764e38540805211212t5ad55624m9136a0a5ab4ba4fd@mail.gmail.com> <48347627.8050001@noaa.gov> Message-ID: iBook G4 osX 10.5.2 hope this helps! Numpy is installed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy Numpy version 1.1.0rc1 Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 (Apple Computer, Inc. 
build 5363)] Found 18/18 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 283/283 tests for numpy.core.multiarray Found 70/70 tests for numpy.core.numeric Found 36/36 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 24/24 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 1/1 tests for numpy.lib.financial Found 0/0 tests for numpy.lib.format Found 53/53 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 6/6 tests for numpy.lib.index_tricks Found 15/15 tests for numpy.lib.io Found 1/1 tests for numpy.lib.machar Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 89/89 tests for numpy.linalg Found 94/94 tests for numpy.ma.core Found 15/15 tests for numpy.ma.extras Found 7/7 tests for numpy.random Found 16/16 tests for numpy.testing.utils Found 0/0 tests for __main__ .............................................................................................................................................................................................................................................................................................................FF............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
====================================================================== FAIL: test_basic (numpy.core.tests.test_multiarray.TestView) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 843, in test_basic assert_array_equal(y, [67305985, 134678021]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([16909060, 84281096]) y: array([ 67305985, 134678021]) ====================================================================== FAIL: test_keywords (numpy.core.tests.test_multiarray.TestView) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_multiarray.py", line 852, in test_keywords assert_array_equal(y,[[513]]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 248, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 100.0%) x: array([[258]], dtype=int16) y: array([[513]]) ---------------------------------------------------------------------- Ran 1004 tests in 6.159s FAILED (failures=2) >>> -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 4148 bytes Desc: not available URL: From pgmdevlist at gmail.com Wed May 21 19:56:23 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 21 May 2008 19:56:23 -0400 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> Message-ID: <200805211956.24031.pgmdevlist@gmail.com> On Wednesday 21 May 2008 17:57:30 Charles R Harris wrote: > On Wed, May 21, 2008 at 3:10 PM, Charles R Harris > > > wrote: > > On Wed, May 21, 2008 at 2:39 PM, Charles R Harris < > > > > charlesr.harris at gmail.com> wrote: > >> On Wed, May 21, 2008 at 10:07 AM, Alan McIntyre > >> > >> > >> wrote: > >>> On Wed, May 21, 2008 at 11:56 AM, Pierre GM > >>> > >>> wrote: > >>> > On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: > >>> >> There's some commentary and a patch on NumPy ticket 793 on this > >>> >> issue: > >>> >> > >>> >> http://scipy.org/scipy/numpy/ticket/793 > >>> > > >>> > OK, thanks a lot ! That's a C problem, then... > >>> > >>> It's probably worth mentioning that I'm not that familiar with all the > >>> innards of NumPy yet, so take my comments and patch on that issue with > >>> a (fairly large) grain of salt. ;) > >> > >> This was introduced by Travis in r5138 as part of the matrix changes. > > > > Alan's change looks a bit iffy to me because the looser check would pass > > things like matrices and there would be no check for decreasing > > dimensions to throw an error. 
So I think the safest thing for 1.1 is to > > back out Travis' change and rethink this for 1.2. > > > > Chuck > > I backed out r5138 and, with a few test fixes, everything passes. However, > there is now a warning exposed in the masked array tests: > > /usr/lib/python2.5/site-packages/numpy/ma/core.py:1357: UserWarning: > MaskedArray.__setitem__ on fields: The mask is NOT affected! > warnings.warn("MaskedArray.__setitem__ on fields: " > > Pierre? Well, that's a warning all right, not an error. With "exotic" dtypes (understand, not int,bool,float or complex but named fields), setting a field doesn't affect the mask, hence the warning. The reason behind this behavior is that the mask of a MaskedArray is only a boolean-array: you have (at most) one boolean per element/record. Therefore, you can mask/unmask a full record, but not a specific field. Masking particular fields is possible with MaskedRecords, however, but with an overhead that wasn't worth putting in MaskedArray. Because I've been bitten a couple of times by this mechanism, I figured that a warning would be the easiest way to remember that MaskedArrays don't handle records very well. So, nothing to worry about. From charlesr.harris at gmail.com Wed May 21 20:12:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 18:12:43 -0600 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: <200805211956.24031.pgmdevlist@gmail.com> References: <200805211128.30336.pgmdevlist@gmail.com> <200805211956.24031.pgmdevlist@gmail.com> Message-ID: On Wed, May 21, 2008 at 5:56 PM, Pierre GM wrote: > On Wednesday 21 May 2008 17:57:30 Charles R Harris wrote: > > On Wed, May 21, 2008 at 3:10 PM, Charles R Harris > > > > > > wrote: > > > On Wed, May 21, 2008 at 2:39 PM, Charles R Harris < > > > > > > charlesr.harris at gmail.com> wrote: > > >> On Wed, May 21, 2008 at 10:07 AM, Alan McIntyre > > >> > > >> > > >> wrote: > > >>> On Wed, May 21, 2008 at 11:56 AM, Pierre GM > > >>> > > >>> wrote: > > >>> > On Wednesday 21 May 2008 11:39:32 Alan McIntyre wrote: > > >>> >> There's some commentary and a patch on NumPy ticket 793 on this > > >>> >> issue: > > >>> >> > > >>> >> http://scipy.org/scipy/numpy/ticket/793 > > >>> > > > >>> > OK, thanks a lot ! That's a C problem, then... > > >>> > > >>> It's probably worth mentioning that I'm not that familiar with all > the > > >>> innards of NumPy yet, so take my comments and patch on that issue > with > > >>> a (fairly large) grain of salt. ;) > > >> > > >> This was introduced by Travis in r5138 as part of the matrix changes. > > > > > > Alan's change looks a bit iffy to me because the looser check would > pass > > > things like matrices and there would be no check for decreasing > > > dimensions to throw an error. So I think the safest thing for 1.1 is to > > > back out Travis' change and rethink this for 1.2. > > > > > > Chuck > > > > I backed out r5138 and, with a few test fixes, everything passes. > However, > > there is now a warning exposed in the masked array tests: > > > > /usr/lib/python2.5/site-packages/numpy/ma/core.py:1357: UserWarning: > > MaskedArray.__setitem__ on fields: The mask is NOT affected! > > warnings.warn("MaskedArray.__setitem__ on fields: " > > > > Pierre? > > Well, that's a warning all right, not an error. > > With "exotic" dtypes (understand, not int,bool,float or complex but named > fields), setting a field doesn't affect the mask, hence the warning. 
> > The reason behind this behavior is that the mask of a MaskedArray is only a > boolean-array: you have (at most) one boolean per element/record. > Therefore, > you can mask/unmask a full record, but not a specific field. Masking > particular fields is possible with MaskedRecords, however, but with an > overhead that wasn't worth putting in MaskedArray. > Because I've been bitten a couple of times by this mechanism, I figured > that a > warning would be the easiest way to remember that MaskedArrays don't handle > records very well. > > So, nothing to worry about. > Should we disable the warning for the tests? It's a bit unnerving and likely to generate mail calling attention to it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed May 21 20:24:05 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 19:24:05 -0500 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> <200805211956.24031.pgmdevlist@gmail.com> Message-ID: <3d375d730805211724i56c71e9ds2c7853e125cef84c@mail.gmail.com> On Wed, May 21, 2008 at 7:12 PM, Charles R Harris wrote: > Should we disable the warning for the tests? It's a bit unnerving and likely > to generate mail calling attention to it. Yes. Known warnings should be explicitly silenced in tests. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Wed May 21 20:24:47 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 21 May 2008 17:24:47 -0700 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> <200805211956.24031.pgmdevlist@gmail.com> Message-ID: On Wed, May 21, 2008 at 5:12 PM, Charles R Harris wrote: > Should we disable the warning for the tests? It's a bit unnerving and likely > to generate mail calling attention to it. +1 I tend to agree that it is disconcerting to have warnings pop up in the tests. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pgmdevlist at gmail.com Wed May 21 20:23:39 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 21 May 2008 20:23:39 -0400 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: References: <200805211128.30336.pgmdevlist@gmail.com> <200805211956.24031.pgmdevlist@gmail.com> Message-ID: <200805212023.39599.pgmdevlist@gmail.com> > Should we disable the warning for the tests? It's a bit unnerving and > likely to generate mail calling attention to it. I don't have a problem with that. Is there some kind of trapping mechanism we can use ? A la "fail_unless_raise" ? From robert.kern at gmail.com Wed May 21 20:34:28 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 19:34:28 -0500 Subject: [Numpy-discussion] 1.1.0rc1 RuntimeErrors In-Reply-To: <200805212023.39599.pgmdevlist@gmail.com> References: <200805211128.30336.pgmdevlist@gmail.com> <200805211956.24031.pgmdevlist@gmail.com> <200805212023.39599.pgmdevlist@gmail.com> Message-ID: <3d375d730805211734u619030ddy2fba081f6b3d723d@mail.gmail.com> On Wed, May 21, 2008 at 7:23 PM, Pierre GM wrote: > >> Should we disable the warning for the tests? It's a bit unnerving and >> likely to generate mail calling attention to it. 
> > I don't have a problem with that. Is there some kind of trapping mechanism we > can use ? You can filter warnings using warning.filterwarnings() or warning.simplefilter(). http://docs.python.org/dev/library/warnings > A la "fail_unless_raise" ? I'm not sure I see the connection. Do you need to test that the warning is issued? If so, then make a filter with the action 'error' and use self.assertRaises(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed May 21 22:48:45 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 20:48:45 -0600 Subject: [Numpy-discussion] Fix to #789 maybe not right. Message-ID: David, I'm not sure that fix is completely correct. The out keyword is funny and I'm not what the specs are supposed to be, but generally the output is cast rather than an error raised. We need an official spec here because the documentation of this feature is essentially random. Note that the shapes don't really have to match, either. In [1]: x = ones(5) In [2]: out = ones(5, dtype=int8) In [3]: cumsum(x, out=out) Out[3]: array([1, 2, 3, 4, 5], dtype=int8) In [4]: out = empty((5,1)) In [5]: cumsum(x, out=out) Out[5]: array([[ 1.], [ 2.], [ 3.], [ 4.], [ 5.]]) OTOH, out = empty((1,5)) doesn't work but doesn't raise an error. Confused? Me too. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 21 23:02:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 21:02:17 -0600 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: References: Message-ID: On Wed, May 21, 2008 at 8:48 PM, Charles R Harris wrote: > David, > > I'm not sure that fix is completely correct. The out keyword is funny and > I'm not what the specs are supposed to be, but generally the output is cast > rather than an error raised. We need an official spec here because the > documentation of this feature is essentially random. Note that the shapes > don't really have to match, either. > > In [1]: x = ones(5) > > In [2]: out = ones(5, dtype=int8) > > In [3]: cumsum(x, out=out) > Out[3]: array([1, 2, 3, 4, 5], dtype=int8) > > In [4]: out = empty((5,1)) > > In [5]: cumsum(x, out=out) > Out[5]: > array([[ 1.], > [ 2.], > [ 3.], > [ 4.], > [ 5.]]) > > OTOH, out = empty((1,5)) doesn't work but doesn't raise an error. > Confused? Me too. > And I'm not sure self->desc needs its reference count decremented, PyArray_FromArray is one of those vicious, nasty functions with side effects and might decrement the count itself. Note that the reference count is not decremented elsewhere. It's a capital offense to write such functions, but there it is. On a lesser note, there is trailing whitespace on every new line except one. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Wed May 21 22:56:35 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 11:56:35 +0900 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: References: Message-ID: <4834E0E3.5050000@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > David, > > I'm not sure that fix is completely correct. 
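[Editorial aside] To make Robert's warning-filter suggestion above concrete, here is a small sketch. The warning-raising function is a hypothetical stand-in for the MaskedArray field assignment discussed in that thread; only warnings.filterwarnings, warnings.simplefilter and assertRaises come from his description.

import warnings
import unittest

def touches_masked_fields():
    # hypothetical stand-in for the MaskedArray field assignment that warns
    warnings.warn("MaskedArray.__setitem__ on fields: The mask is NOT affected!",
                  UserWarning)

# Silence the known warning globally (e.g. at the top of the test module):
warnings.filterwarnings('ignore',
                        message="MaskedArray.__setitem__ on fields.*")

# Or turn it into an error inside a test, so assertRaises can check it is issued:
class TestFieldWarning(unittest.TestCase):
    def test_warning_is_issued(self):
        with warnings.catch_warnings():
            warnings.simplefilter('error', UserWarning)
            self.assertRaises(UserWarning, touches_masked_fields)

if __name__ == '__main__':
    unittest.main()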
The out keyword is funny > and I'm not what the specs are supposed to be, but generally the > output is cast rather than an error raised. I think the out argument is one of this thing which is rather a mess right now in numpy. The functions which accept it do not always mention it, and as you say, there are various different behaviour. What are the uses of the out argument ? The obvious one is saving memory, but are there others ? Automatic casting would break the memory saving. > We need an official spec here because the documentation of this > feature is essentially random. Note that the shapes don't really have > to match, either. > > In [1]: x = ones(5) > > In [2]: out = ones(5, dtype=int8) > > In [3]: cumsum(x, out=out) > Out[3]: array([1, 2, 3, 4, 5], dtype=int8) I don't feel like this should work (because of the dtype). > > In [4]: out = empty((5,1)) > > In [5]: cumsum(x, out=out) > Out[5]: > array([[ 1.], > [ 2.], > [ 3.], > [ 4.], > [ 5.]]) > > OTOH, out = empty((1,5)) doesn't work but doesn't raise an error. Not working and no error is obviously wrong. Note that after playing a bit with this, I got segfaults when exciting python (I am working on reproducint it). cheers, David From david at ar.media.kyoto-u.ac.jp Wed May 21 23:03:20 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 12:03:20 +0900 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: References: Message-ID: <4834E278.2040406@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > And I'm not sure self->desc needs its reference count decremented, > PyArray_FromArray is one of those vicious, nasty functions with side > effects and might decrement the count itself. might ? What do you mean by might decrement ? If the call to PyAarray_FromArray fails, no reference are stolen, right ? > Note that the reference count is not decremented elsewhere. It's a > capital offense to write such functions, but there it is. > > On a lesser note, there is trailing whitespace on every new line > except one. This is easier to fix :) cheers, David From oliphant at enthought.com Wed May 21 23:24:30 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 21 May 2008 22:24:30 -0500 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: <4834E278.2040406@ar.media.kyoto-u.ac.jp> References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> Message-ID: <4834E76E.9080007@enthought.com> David Cournapeau wrote: > Charles R Harris wrote: > >> And I'm not sure self->desc needs its reference count decremented, >> PyArray_FromArray is one of those vicious, nasty functions with side >> effects and might decrement the count itself. >> > > might ? What do you mean by might decrement ? If the call to > PyAarray_FromArray fails, no reference are stolen, right ? > No, that's not right. The reference is stolen if it fails as well. This is true of all descriptor data-types. Perhaps it is weird, but it was a lot easier to retro-fit Numeric PyArray_Descr as a Python object that way. -Travis From charlesr.harris at gmail.com Wed May 21 23:26:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 21:26:34 -0600 Subject: [Numpy-discussion] Fix to #789 maybe not right. 
In-Reply-To: <4834E278.2040406@ar.media.kyoto-u.ac.jp> References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> Message-ID: On Wed, May 21, 2008 at 9:03 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > > And I'm not sure self->desc needs its reference count decremented, > > PyArray_FromArray is one of those vicious, nasty functions with side > > effects and might decrement the count itself. > > might ? What do you mean by might decrement ? If the call to > PyAarray_FromArray fails, no reference are stolen, right ? > Maybe, maybe not. We need to look at the function and document it. I think these sort of functions are an invitation to reference counting bugs, of which valgrind says we have several. Really, all the increments and decrements should be inside PyArray_FromArray, but calls to this function are scattered all over. > > Note that the reference count is not decremented elsewhere. It's a > > capital offense to write such functions, but there it is. > > > > On a lesser note, there is trailing whitespace on every new line > > except one. > > This is easier to fix :) > Not that I think we are in any worse shape after the fix than before, but I do think we should pretty much leave things alone until 1.1 out the door and leave future fixes to 1.1.1 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed May 21 23:32:50 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 21 May 2008 22:32:50 -0500 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> Message-ID: <4834E962.2070803@enthought.com> Charles R Harris wrote: > Really, all the increments and decrements should be inside > PyArray_FromArray, but calls to this function are scattered all over. I don't understand what you mean by this statement. All functions that return an object and take a PyArray_Descr object steal a reference to the descriptor (even if it fails). That's the rule. -Travis From charlesr.harris at gmail.com Wed May 21 23:43:54 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 21:43:54 -0600 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: <4834E962.2070803@enthought.com> References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> <4834E962.2070803@enthought.com> Message-ID: On Wed, May 21, 2008 at 9:32 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > Really, all the increments and decrements should be inside > > PyArray_FromArray, but calls to this function are scattered all over. > I don't understand what you mean by this statement. All functions > that return an object and take a PyArray_Descr object steal a reference > to the descriptor (even if it fails). That's the rule. > Why should it not increment the reference itself? Note that calls to this function are normally preceded by incrementing the reference, probably because one wants to keep it around. I think it would be clearer to have the rule: you increment it, you decrement it. That way everything is in one obvious place and you don't have to concern yourself with what happens inside PyArray_FromArray. Functions with side effects are almost always a bad idea and lead to bugs in practice. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Wed May 21 23:31:39 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 12:31:39 +0900 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: <4834E76E.9080007@enthought.com> References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> <4834E76E.9080007@enthought.com> Message-ID: <4834E91B.5060105@ar.media.kyoto-u.ac.jp> Travis E. Oliphant wrote: > No, that's not right. The reference is stolen if it fails as well. > This is true of all descriptor data-types. Perhaps it is weird, but it > was a lot easier to retro-fit Numeric PyArray_Descr as a Python object > that way. > Thanks for the clarification. I fixed the code accordingly, cheers, David From david at ar.media.kyoto-u.ac.jp Wed May 21 23:35:12 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 12:35:12 +0900 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: <4834E91B.5060105@ar.media.kyoto-u.ac.jp> References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> <4834E76E.9080007@enthought.com> <4834E91B.5060105@ar.media.kyoto-u.ac.jp> Message-ID: <4834E9F0.4080409@ar.media.kyoto-u.ac.jp> David Cournapeau wrote: > > Thanks for the clarification. I fixed the code accordingly, > Ok, you beat me :) cheers, David From charlesr.harris at gmail.com Thu May 22 00:26:27 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 22:26:27 -0600 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: <4834E0E3.5050000@ar.media.kyoto-u.ac.jp> References: <4834E0E3.5050000@ar.media.kyoto-u.ac.jp> Message-ID: On Wed, May 21, 2008 at 8:56 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > David, > > > > I'm not sure that fix is completely correct. The out keyword is funny > > and I'm not what the specs are supposed to be, but generally the > > output is cast rather than an error raised. > > I think the out argument is one of this thing which is rather a mess > right now in numpy. The functions which accept it do not always mention > it, and as you say, there are various different behaviour. > > What are the uses of the out argument ? The obvious one is saving > memory, but are there others ? Automatic casting would break the memory > saving. > I've been contemplating this and I think you are right. The out parameter is for those special bits that need efficiency and should be as simple as possible. That means: same type and shape as the normal output, no appeal. This is especially so as a reference to the output is what the function returns when out is present and the return type should not depend on the type of the out parameter. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 22 00:42:15 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 22:42:15 -0600 Subject: [Numpy-discussion] Closing some tickets. Message-ID: All, Can we close ticket #117 and add Pearu's comment to the FAQ? http://projects.scipy.org/scipy/numpy/ticket/117 Can someone with MSVC 2005 check if we can close ticket #164? http://projects.scipy.org/scipy/numpy/ticket/164 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 22 00:47:45 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 May 2008 23:47:45 -0500 Subject: [Numpy-discussion] Closing some tickets. 
In-Reply-To: References: Message-ID: <3d375d730805212147v69372a8fxa2ec657f46fb9e90@mail.gmail.com> On Wed, May 21, 2008 at 11:42 PM, Charles R Harris wrote: > All, > > Can we close ticket #117 and add Pearu's comment to the FAQ? > http://projects.scipy.org/scipy/numpy/ticket/117 Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at enthought.com Thu May 22 00:55:45 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 21 May 2008 23:55:45 -0500 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> <4834E962.2070803@enthought.com> Message-ID: <4834FCD1.5040908@enthought.com> Charles R Harris wrote: > > > On Wed, May 21, 2008 at 9:32 PM, Travis E. Oliphant > > wrote: > > Charles R Harris wrote: > > Really, all the increments and decrements should be inside > > PyArray_FromArray, but calls to this function are scattered all > over. > I don't understand what you mean by this statement. All functions > that return an object and take a PyArray_Descr object steal a > reference > to the descriptor (even if it fails). That's the rule. > > > Why should it not increment the reference itself? Note that calls to > this function are normally preceded by incrementing the reference, > probably because one wants to keep it around. I wouldn't say normally. I would say sometimes. Normally you create a reference to the data-type and want PyArray_FromAny and friends to steal it (i.e. PyArray_DescrFromType). I wouldn't call stealing a reference count a side effect, but even if you want to call it that, it can't really change without a *huge* re-working effort for almost zero gain. You would also have to re-work all the macros that take type numbers and construct data-type objects for passing to these functions. I don't see the benefit at all. > I think it would be clearer to have the rule: you increment it, you > decrement it. Maybe, but Python's own C-API doesn't always follow that rule, there are functions that steal references. Remember, PyArray_Descr was retrofitted as a Python Object. It didn't use to be one. This steal rule was the cleanest I could come up with --- i.e. it wasn't an idle decision. It actually makes some sense because the returned array is going to "own" the reference count to the data-type object (just like setting to a list). -Travis From charlesr.harris at gmail.com Thu May 22 01:09:10 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 May 2008 23:09:10 -0600 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: <4834FCD1.5040908@enthought.com> References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> <4834E962.2070803@enthought.com> <4834FCD1.5040908@enthought.com> Message-ID: On Wed, May 21, 2008 at 10:55 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > > > > > On Wed, May 21, 2008 at 9:32 PM, Travis E. Oliphant > > > wrote: > > > > Charles R Harris wrote: > > > Really, all the increments and decrements should be inside > > > PyArray_FromArray, but calls to this function are scattered all > > over. > > I don't understand what you mean by this statement. All functions > > that return an object and take a PyArray_Descr object steal a > > reference > > to the descriptor (even if it fails). That's the rule. > > > > > > Why should it not increment the reference itself? 
Note that calls to > > this function are normally preceded by incrementing the reference, > > probably because one wants to keep it around. > I wouldn't say normally. I would say sometimes. > > Normally you create a reference to the data-type and want > PyArray_FromAny and friends to steal it (i.e. PyArray_DescrFromType). > > I wouldn't call stealing a reference count a side effect, but even if > you want to call it that, it can't really change without a *huge* > re-working effort for almost zero gain. You would also have to re-work > all the macros that take type numbers and construct data-type objects > for passing to these functions. I don't see the benefit at all. I agree with all that, which is why I'm not advocating a change. But it does raise the bar for working with the C code and I think the current case is an example of that. > > I think it would be clearer to have the rule: you increment it, you > > decrement it. > Maybe, but Python's own C-API doesn't always follow that rule, there are > functions that steal references. Remember, PyArray_Descr was > retrofitted as a Python Object. It didn't use to be one. This steal > rule was the cleanest I could come up with --- i.e. it wasn't an idle > decision. > I realize that Python does this too, I also note that getting reference counting right is one of the more difficult aspects of writing Python extension code. The more programmers have to know and keep in mind, the more likely they are to make mistakes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu May 22 01:16:55 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 22 May 2008 00:16:55 -0500 Subject: [Numpy-discussion] Fix to #789 maybe not right. In-Reply-To: References: <4834E278.2040406@ar.media.kyoto-u.ac.jp> <4834E962.2070803@enthought.com> <4834FCD1.5040908@enthought.com> Message-ID: <483501C7.40704@enthought.com> Charles R Harris wrote: > > I agree with all that, which is why I'm not advocating a change. But > it does raise the bar for working with the C code and I think the > current case is an example of that. > Yes it does. I also agree that reference counting is the hardest part of coding with the Python C-API. > > > > I think it would be clearer to have the rule: you increment it, you > > decrement it. > Maybe, but Python's own C-API doesn't always follow that rule, > there are > functions that steal references. Remember, PyArray_Descr was > retrofitted as a Python Object. It didn't use to be one. This steal > rule was the cleanest I could come up with --- i.e. it wasn't an idle > decision. > > > I realize that Python does this too, I also note that getting > reference counting right is one of the more difficult aspects of > writing Python extension code. The more programmers have to know and > keep in mind, the more likely they are to make mistakes. > Agreed. -teo From matthieu.brucher at gmail.com Thu May 22 02:58:39 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 22 May 2008 08:58:39 +0200 Subject: [Numpy-discussion] Closing some tickets. In-Reply-To: References: Message-ID: Hi, Is there an official support for MSVC 2005 ? Last time I tried to compile Python with it, it couldn't build extension. If MSVC 2005 is not officially supported, at least by Python itself, I'm not sure Numpy can. Matthieu 2008/5/22 Charles R Harris : > All, > > Can we close ticket #117 and add Pearu's comment to the FAQ? 
> http://projects.scipy.org/scipy/numpy/ticket/117 > > Can someone with MSVC 2005 check if we can close ticket #164? > http://projects.scipy.org/scipy/numpy/ticket/164 > > Chuck > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu May 22 03:02:27 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 16:02:27 +0900 Subject: [Numpy-discussion] Closing some tickets. In-Reply-To: References: Message-ID: <48351A83.1070206@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > Hi, > > Is there an official support for MSVC 2005 ? Last time I tried to > compile Python with it, it couldn't build extension. If MSVC 2005 is > not officially supported, at least by Python itself, I'm not sure > Numpy can. Python 2.5 used 2003 (VS 7 ? I am always confused by their version numbering scheme), and Python 2.6/3.0 will use 2008 (VS 9 ?) AFAIK. I don't think it makes sense to support a compiler which is not used for official binaries and is already superseded by a newer version. cheers, David From schut at sarvision.nl Thu May 22 03:35:01 2008 From: schut at sarvision.nl (Vincent Schut) Date: Thu, 22 May 2008 09:35:01 +0200 Subject: [Numpy-discussion] first recarray steps In-Reply-To: References: <4832F77A.8020804@noaa.gov> Message-ID: Anne Archibald wrote: > 2008/5/21 Vincent Schut : >> Christopher Barker wrote: >>> Also, if you image data is rgb, usually, that's a (width, height, 3) >>> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height) >>> array, then that's rrrrrrr....gggggggg......bbbbbbbb. Some image libs >>> may give you that, I'm not sure. >> My data is. In fact, this is a simplification of my situation; I'm >> processing satellite data, which usually has more (and other) bands than >> just rgb. But the data is definitely in shape (bands, y, x). > > You may find your life becomes easier if you transpose the data in > memory. This can make a big difference to efficiency. Years ago I was > working with enormous (by the standards of the day) MATLAB files on > disk, storing complex data. The way (that version of) MATLAB > represented complex data was the way you describe: matrix of real > parts, matrix of imaginary parts. This meant that to draw a single > pixel, the disk needed to seek twice... depending on what sort of > operations you're doing, transposing your data so that each pixel is > all in one place may improve cache coherency as well as making the use > of record arrays possible. > > Anne Anne, thanks for the thoughts. In most cases, you'll probably be right. In this case, however, it won't give me much (if any) speedup, maybe even slowdown. Satellite images often are stored on disk in a band sequential manner. The library I use for IO is GDAL, which is a higly optimized c library for reading/writing almost any kind of satellite data type. It also features an internal caching mechanism. And it gives me my data as (y, x, bands). I'm not reading single pixels anyway. 
The amounts of data I have to process (enormous, even by the standards of today ;-)) require me to do this in chunks, in parallel, even on different cores/cpu's/computers. Every chunk usually is (chunkYSize, chunkXSize, allBands) with xsize and ysize being not so small (think from 64^2 to 1024^2) so that pretty much eliminates any performance issues regarding the data on disk. Furthermore, having to process on multiple computers forces me to have my data on networked storage. The latency and transfer rate of the network will probably eliminate any small speedup because my drive has to do less seeks... Now for the recarray part, that would indeed ease my life a bit :) However, having to transpose the data in memory on every read and write does not sound very attractive. It will spoil cycles, and memory, and be asking for bugs. I can live without recarrays, for sure. I only hoped they might make my live a bit easier and my code a bit more readable, without too much effort. Well, they won't, apparently... I'll just go on like I did before this little excercise. Thanks all for the inputs. Cheers, Vincent. From schut at sarvision.nl Thu May 22 03:45:32 2008 From: schut at sarvision.nl (Vincent Schut) Date: Thu, 22 May 2008 09:45:32 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? In-Reply-To: <483431A3.60603@relativita.com> References: <483417F5.1040705@relativita.com> <2D3AB6C5-A5BA-45B2-8ED2-2F2A8324F3C2@tamu.edu> <483431A3.60603@relativita.com> Message-ID: Emanuele Olivetti wrote: > > This solution is super-fast, stable and use little memory. > It is based on the fact that: > (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w > > For size1=size2=dimensions=1000 requires ~0.6sec. to compute > on my dual core duo. It is 2 order of magnitude faster than my > previous solution, but 1-2 order of magnitude slower than using > C with weave.inline. > > Definitely good enough for me. > > > Emanuele Reading this thread, I remembered having tried scipy's sandbox.rbf (radial basis function) to interpolate a pretty large, multidimensional dataset, to fill in the missing data points. This however failed soon with out-of-memory errors, which, if I remember correctly, came from the pretty straightforward distance calculation between the different data points that is used in this package. Being no math wonder, I assumed that there simply was no simple way to calculate distances without using much memory, and ended my rbf experiments. To make a story short: correct me if I am wrong, but might it be an idea to use the above solution in scipy.sandbox.rbf? Vincent. From david at ar.media.kyoto-u.ac.jp Thu May 22 05:17:14 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 18:17:14 +0900 Subject: [Numpy-discussion] Numpy and scipy icons ? Message-ID: <48353A1A.5080206@ar.media.kyoto-u.ac.jp> Hi, Where can I find numpy and scipy icons, preferably in a vector format ? 
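[Editorial aside] Returning to the distance_matrix thread above, Emanuele's expansion trick ((x-y)^2*w = x*x*w - 2*x*y*w + y*y*w) can be written out as follows. The array shapes, the weights and the correctness check are illustrative; the point is that no (n, m, ndim) temporary is ever built.

import numpy as np

def weighted_sq_distances(x, y, w):
    # Squared weighted distances between the rows of x (n, d) and y (m, d),
    # using sum_d w_d*(x_d - y_d)^2 = x.x*w - 2*x.y*w + y.y*w per pair.
    xx = np.dot(x * x, w)            # (n,)
    yy = np.dot(y * y, w)            # (m,)
    xy = np.dot(x * w, y.T)          # (n, m)
    return xx[:, None] - 2.0 * xy + yy[None, :]

# small correctness check against the naive broadcast version
x = np.random.rand(5, 3)
y = np.random.rand(4, 3)
w = np.array([1.0, 2.0, 0.5])
d2 = weighted_sq_distances(x, y, w)
naive = ((x[:, None, :] - y[None, :, :])**2 * w).sum(axis=-1)
assert np.allclose(d2, naive)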
cheers, David From pav at iki.fi Thu May 22 05:37:14 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 May 2008 12:37:14 +0300 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805210108k2257026fk821978c9bc00bf1e@mail.gmail.com> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <9457e7c80805210108k2257026fk821978c9bc00bf1e@mail.gmail.com> Message-ID: <1211449034.27052.81.camel@localhost> ke, 2008-05-21 kello 10:08 +0200, St?fan van der Walt kirjoitti: [clip] > > This will parse better (as the line with the semicolon is bold, the > > next lines are not). Also, would it be possible to put function and > > next_function in double back-ticks, so that they are referenced, like > > modules? That way they will might be clickable in a html version of > > the documentation. > > When generating the reference guide, I parse all the numpy docstrings > and re-generate a document enhanced with Sphinx markup. In this > document, functions in the See Also clause are "clickable". I have > support for two formats: > > See Also > ------------ > function_a, function_b, function_c > function_d : relation to current function > > Don't worry if it doesn't look perfect on the wiki; the reference > guide will be rendered correctly. Should the function names in the See also section also include the namespace prefix, ie. numpy.function_a numpy.function_b Or should we "assume from numpy import *" or "import numpy as np"? I think it'd be useful to clarify this in the documentation standard and in example.py, also for the Examples section. (Btw, the Docstrings/Example appears to be out-of-date wrt. this.) Pauli From stefan at sun.ac.za Thu May 22 05:56:25 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 22 May 2008 11:56:25 +0200 Subject: [Numpy-discussion] Numpy and scipy icons ? In-Reply-To: <48353A1A.5080206@ar.media.kyoto-u.ac.jp> References: <48353A1A.5080206@ar.media.kyoto-u.ac.jp> Message-ID: <9457e7c80805220256m50bb3b8ewc3249c4e20c66826@mail.gmail.com> Hi David The icons are attached to the scipy.org main page as .png's. Travis Vaught drew the vector versions, IIRC. Regards St?fan 2008/5/22 David Cournapeau : > Hi, > > Where can I find numpy and scipy icons, preferably in a vector format ? > > cheers, > > David From hetland at tamu.edu Thu May 22 05:56:21 2008 From: hetland at tamu.edu (Rob Hetland) Date: Thu, 22 May 2008 11:56:21 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? In-Reply-To: References: <483417F5.1040705@relativita.com> <2D3AB6C5-A5BA-45B2-8ED2-2F2A8324F3C2@tamu.edu> <483431A3.60603@relativita.com> Message-ID: <319DE361-9FA2-47BA-B3B2-D2F0E5BEE077@tamu.edu> On May 22, 2008, at 9:45 AM, Vincent Schut wrote: > Reading this thread, I remembered having tried scipy's sandbox.rbf > (radial basis function) to interpolate a pretty large, > multidimensional > dataset, to fill in the missing data points. This however failed soon > with out-of-memory errors, which, if I remember correctly, came from > the > pretty straightforward distance calculation between the different data > points that is used in this package. Being no math wonder, I assumed > that there simply was no simple way to calculate distances without > using > much memory, and ended my rbf experiments. 
> > To make a story short: correct me if I am wrong, but might it be an > idea > to use the above solution in scipy.sandbox.rbf? Yes, this would be a very good substitution. Not only does it use less memory, but in my quick tests it is about as fast or faster. Really, though, both are pretty quick. There will still be memory limitations, but you only need to store a matrix of (N, M), instead of (NDIM, N, M), so for many dimensions there will be big memory improvements. Probably only small improvements for 3 dimensions or less. I'm not sure where rbf lives anymore -- it's not in scikits. I have my own version (parts of which got folded into the old scipy.sandbox version), that I would be willing to share if there is interest. Really, though, the rbf toolbox will not be limited by the memory of the distance matrix. Later on, you need to do a large linear algebra 'solve', like this: r = norm(x, x) # The distances between all of the ND points to each other. A = psi(r) # where psi is some divergent function, often the multiquadratic function : sqrt((self.epsilon*r)**2 + 1) coefs = linalg.solve(A, data) # where data is the length of x, one data point for each spatial point. # to find the interpolated data points at xi ri = norm(xi, x) Ai = psi(ri) di = dot(Ai, coefs) All in all, it is the 'linalg.solve' that kills you. -Rob ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From hetland at tamu.edu Thu May 22 06:00:38 2008 From: hetland at tamu.edu (Rob Hetland) Date: Thu, 22 May 2008 12:00:38 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <1211449034.27052.81.camel@localhost> References: <48330F9A.6010804@esrf.fr> <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <9457e7c80805210108k2257026fk821978c9bc00bf1e@mail.gmail.com> <1211449034.27052.81.camel@localhost> Message-ID: <8415617F-5BC1-499D-9B1D-708DD5C292BE@tamu.edu> On May 22, 2008, at 11:37 AM, Pauli Virtanen wrote: > Or should we "assume from numpy import *" or "import numpy as np"? I Although a good case could probably be made for all three (*, np, numpy), I think that if "import numpy as np" is to be put forward as the standard coding style, the examples should use this as well. -Rob ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From david at ar.media.kyoto-u.ac.jp Thu May 22 06:14:42 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 May 2008 19:14:42 +0900 Subject: [Numpy-discussion] Numpy and scipy icons ? In-Reply-To: <9457e7c80805220256m50bb3b8ewc3249c4e20c66826@mail.gmail.com> References: <48353A1A.5080206@ar.media.kyoto-u.ac.jp> <9457e7c80805220256m50bb3b8ewc3249c4e20c66826@mail.gmail.com> Message-ID: <48354792.80706@ar.media.kyoto-u.ac.jp> St?fan van der Walt wrote: > Hi David > > The icons are attached to the scipy.org main page as .png's. Travis > Vaught drew the vector versions, IIRC. > I can find logo for scipy conf, bug, etc... but no "normal" icon, nor any numpy icon. Would it be possible to put the icon (vector version) somewhere in subversion ? Would be useful for say installers, etc... 
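[Editorial aside] Rob's outline above can be filled in as a runnable sketch. The function names, the epsilon default and the smoke test are illustrative and are not the sandbox code itself; the structure (distance matrix, multiquadric basis, linalg.solve for the coefficients, dot for evaluation) follows his description, and it is the solve that dominates the cost.

import numpy as np

def distances(x, y):
    # pairwise Euclidean distances without an (n, m, d) temporary
    d2 = (x*x).sum(1)[:, None] - 2.0*np.dot(x, y.T) + (y*y).sum(1)[None, :]
    return np.sqrt(np.maximum(d2, 0.0))     # clip tiny round-off negatives

def multiquadric(r, epsilon=1.0):
    return np.sqrt((epsilon*r)**2 + 1.0)

def rbf_fit(x, data, epsilon=1.0):
    A = multiquadric(distances(x, x), epsilon)
    return np.linalg.solve(A, data)          # the expensive step

def rbf_eval(xi, x, coefs, epsilon=1.0):
    Ai = multiquadric(distances(xi, x), epsilon)
    return np.dot(Ai, coefs)

# smoke test: the interpolant should reproduce the data at the nodes
x = np.random.rand(20, 2)
data = np.sin(x[:, 0]) + x[:, 1]
coefs = rbf_fit(x, data)
assert np.allclose(rbf_eval(x, x, coefs), data)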
cheers, David From charlesr.harris at gmail.com Thu May 22 06:45:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 04:45:49 -0600 Subject: [Numpy-discussion] Numpy and scipy icons ? In-Reply-To: <48354792.80706@ar.media.kyoto-u.ac.jp> References: <48353A1A.5080206@ar.media.kyoto-u.ac.jp> <9457e7c80805220256m50bb3b8ewc3249c4e20c66826@mail.gmail.com> <48354792.80706@ar.media.kyoto-u.ac.jp> Message-ID: On Thu, May 22, 2008 at 4:14 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > St?fan van der Walt wrote: > > Hi David > > > > The icons are attached to the scipy.org main page as .png's. Travis > > Vaught drew the vector versions, IIRC. > > > > I can find logo for scipy conf, bug, etc... but no "normal" icon, nor > any numpy icon. Would it be possible to put the icon (vector version) > somewhere in subversion ? Would be useful for say installers, etc... > Travis Vaught is the man: http://article.gmane.org/gmane.comp.python.numeric.general/18495 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From schut at sarvision.nl Thu May 22 06:52:08 2008 From: schut at sarvision.nl (Vincent Schut) Date: Thu, 22 May 2008 12:52:08 +0200 Subject: [Numpy-discussion] distance_matrix: how to speed up? In-Reply-To: <319DE361-9FA2-47BA-B3B2-D2F0E5BEE077@tamu.edu> References: <483417F5.1040705@relativita.com> <2D3AB6C5-A5BA-45B2-8ED2-2F2A8324F3C2@tamu.edu> <483431A3.60603@relativita.com> <319DE361-9FA2-47BA-B3B2-D2F0E5BEE077@tamu.edu> Message-ID: Rob Hetland wrote: > On May 22, 2008, at 9:45 AM, Vincent Schut wrote: > > > Really, though, the rbf toolbox will not be limited by the memory of > the distance matrix. Later on, you need to do a large linear algebra > 'solve', like this: > > > r = norm(x, x) # The distances between all of the ND points to each > other. > A = psi(r) # where psi is some divergent function, often the > multiquadratic function : sqrt((self.epsilon*r)**2 + 1) > coefs = linalg.solve(A, data) # where data is the length of x, one > data point for each spatial point. > > # to find the interpolated data points at xi > ri = norm(xi, x) > Ai = psi(ri) > di = dot(Ai, coefs) > > > All in all, it is the 'linalg.solve' that kills you. Ah, indeed, my memory was faulty, I'm afraid. It was in this phase that it halted, not in the distance calculations. Vincent. > > -Rob > > ---- > Rob Hetland, Associate Professor > Dept. of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 From stefan at sun.ac.za Thu May 22 07:04:33 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 22 May 2008 13:04:33 +0200 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <8415617F-5BC1-499D-9B1D-708DD5C292BE@tamu.edu> References: <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <9457e7c80805210108k2257026fk821978c9bc00bf1e@mail.gmail.com> <1211449034.27052.81.camel@localhost> <8415617F-5BC1-499D-9B1D-708DD5C292BE@tamu.edu> Message-ID: <9457e7c80805220404ha58f2f0o461d3299ce664283@mail.gmail.com> It looks like there is significant interest in using "np" instead of "numpy" in the examples (i.e. we expect the user to do "import numpy as np" before trying code snippets). 
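[Editorial aside] As an illustration of the docstring conventions being discussed (the two See Also formats and the np abbreviation in Examples), here is a hypothetical docstring; the function itself is just a stand-in.

import numpy as np

def clip_demo(a, a_min, a_max):
    """
    Clip the values of `a` to the interval [`a_min`, `a_max`].

    See Also
    --------
    numpy.minimum, numpy.maximum
    numpy.clip : the real implementation of this operation

    Examples
    --------
    >>> import numpy as np
    >>> clip_demo(np.arange(5), 1, 3)
    array([1, 1, 2, 3, 3])
    """
    return np.minimum(np.maximum(a, a_min), a_max)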
Would anybody who objects to using "np" raise it now, so that we can bury this issue? Regards St?fan 2008/5/22 Rob Hetland : > > On May 22, 2008, at 11:37 AM, Pauli Virtanen wrote: > >> Or should we "assume from numpy import *" or "import numpy as np"? I > > > Although a good case could probably be made for all three (*, np, > numpy), I think that if "import numpy as np" is to be put forward as > the standard coding style, the examples should use this as well. > > -Rob > > ---- > Rob Hetland, Associate Professor > Dept. of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From andrea.gavana at gmail.com Thu May 22 07:12:22 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 22 May 2008 12:12:22 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations Message-ID: Hi All, I am building some 3D grids for visualization starting from a much bigger grid. I build these grids by satisfying certain conditions on x, y, z coordinates of their cells: up to now I was using VTK to perform this operation, but VTK is slow as a turtle, so I thought to use numpy to get the cells I am interested in. Basically, for every cell I have the coordinates of its center point (centroids), named xCent, yCent and zCent. These values are stored in numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, yCent and zCent with 10,000 values in them). What I'd like to do is: # Filter cells which do not satisfy Z requirements: zReq = zMin <= zCent <= zMax # After that, filter cells which do not satisfy Y requirements, # but apply this filter only on cells who satisfy the above condition: yReq = yMin <= yCent <= yMax # After that, filter cells which do not satisfy X requirements, # but apply this filter only on cells who satisfy the 2 above conditions: xReq = xMin <= xCent <= xMax I'd like to end up with a vector of indices which tells me which are the cells in the original grid that satisfy all 3 conditions. I know that something like this: zReq = zMin <= zCent <= zMax Can not be done directly in numpy, as the first statement executed returns a vector of boolean. Also, if I do something like: zReq1 = numpy.nonzero(zCent <= zMax) zReq2 = numpy.nonzero(zCent[zReq1] >= zMin) I lose the original indices of the grid, as in the second statement zCent[zReq1] has no more the size of the original grid but it has already been filtered out. Is there anything I could try in numpy to get what I am looking for? Sorry if the description is not very clear :-D Thank you very much for your suggestions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From stefan at sun.ac.za Thu May 22 07:29:59 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 22 May 2008 13:29:59 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: Message-ID: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> Hi Andrea 2008/5/22 Andrea Gavana : > I am building some 3D grids for visualization starting from a much > bigger grid. I build these grids by satisfying certain conditions on > x, y, z coordinates of their cells: up to now I was using VTK to > perform this operation, but VTK is slow as a turtle, so I thought to > use numpy to get the cells I am interested in. 
> Basically, for every cell I have the coordinates of its center point > (centroids), named xCent, yCent and zCent. These values are stored in > numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, > yCent and zCent with 10,000 values in them). What I'd like to do is: You clearly have a large dataset, otherwise speed wouldn't have been a concern to you. You can do your operation in one pass over the data, and I'd suggest you try doing that with Cython or Ctypes. If you need an example on how to access data using those methods, let me know. Of course, it *can* be done using NumPy (maybe not in one pass), but thinking in terms of for-loops is sometimes easier, and immediately takes you to a highly optimised execution time. Cheers St?fan From andrea.gavana at gmail.com Thu May 22 07:37:42 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 22 May 2008 12:37:42 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> Message-ID: Hi Stefan & All, On Thu, May 22, 2008 at 12:29 PM, St?fan van der Walt wrote: > Hi Andrea > > 2008/5/22 Andrea Gavana : >> I am building some 3D grids for visualization starting from a much >> bigger grid. I build these grids by satisfying certain conditions on >> x, y, z coordinates of their cells: up to now I was using VTK to >> perform this operation, but VTK is slow as a turtle, so I thought to >> use numpy to get the cells I am interested in. >> Basically, for every cell I have the coordinates of its center point >> (centroids), named xCent, yCent and zCent. These values are stored in >> numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, >> yCent and zCent with 10,000 values in them). What I'd like to do is: > > You clearly have a large dataset, otherwise speed wouldn't have been a > concern to you. You can do your operation in one pass over the data, > and I'd suggest you try doing that with Cython or Ctypes. If you need > an example on how to access data using those methods, let me know. > > Of course, it *can* be done using NumPy (maybe not in one pass), but > thinking in terms of for-loops is sometimes easier, and immediately > takes you to a highly optimised execution time. First of all, thank you for your answer. I know next to nothing about Cython and very little about Ctypes, but it would be nice to have an example on how to use them to speed up the operations. Actually, I don't really know if my dataset is "large", as I work normally with xCent, yCent and zCent vectors of about 100,000-300,000 elements in them. However, all the other operations I do with numpy on these vectors are pretty fast (reshaping, re-casting, min(), max() and so on). So I believe that also a pure numpy solution might perform well enough for my needs: but I am really no expert in numpy, so please forgive any mistake I'm doing :-D. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From falted at pytables.org Thu May 22 08:04:57 2008 From: falted at pytables.org (Francesc Alted) Date: Thu, 22 May 2008 14:04:57 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: Message-ID: <200805221404.58348.falted@pytables.org> A Thursday 22 May 2008, Andrea Gavana escrigu?: > Hi All, > > I am building some 3D grids for visualization starting from a > much bigger grid. 
I build these grids by satisfying certain > conditions on x, y, z coordinates of their cells: up to now I was > using VTK to perform this operation, but VTK is slow as a turtle, so > I thought to use numpy to get the cells I am interested in. > Basically, for every cell I have the coordinates of its center point > (centroids), named xCent, yCent and zCent. These values are stored in > numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, > yCent and zCent with 10,000 values in them). What I'd like to do is: > > # Filter cells which do not satisfy Z requirements: > zReq = zMin <= zCent <= zMax > > # After that, filter cells which do not satisfy Y requirements, > # but apply this filter only on cells who satisfy the above > condition: > > yReq = yMin <= yCent <= yMax > > # After that, filter cells which do not satisfy X requirements, > # but apply this filter only on cells who satisfy the 2 above > conditions: > > xReq = xMin <= xCent <= xMax > > I'd like to end up with a vector of indices which tells me which are > the cells in the original grid that satisfy all 3 conditions. I know > that something like this: > > zReq = zMin <= zCent <= zMax > > Can not be done directly in numpy, as the first statement executed > returns a vector of boolean. Also, if I do something like: > > zReq1 = numpy.nonzero(zCent <= zMax) > zReq2 = numpy.nonzero(zCent[zReq1] >= zMin) > > I lose the original indices of the grid, as in the second statement > zCent[zReq1] has no more the size of the original grid but it has > already been filtered out. > > Is there anything I could try in numpy to get what I am looking for? > Sorry if the description is not very clear :-D > > Thank you very much for your suggestions. I don't know if this is what you want, but you can get the boolean arrays separately, do the intersection and finally get the interesting values (by using fancy indexing) or coordinates (by using .nonzero()). Here it is an example: In [105]: a = numpy.arange(10,20) In [106]: c1=(a>=13)&(a<=17) In [107]: c2=(a>=14)&(a<=18) In [109]: all=c1&c2 In [110]: a[all] Out[110]: array([14, 15, 16, 17]) # the values In [111]: all.nonzero() Out[111]: (array([4, 5, 6, 7]),) # the coordinates Hope that helps, -- Francesc Alted From aisaac at american.edu Thu May 22 08:21:19 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 22 May 2008 08:21:19 -0400 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: Message-ID: On Thu, 22 May 2008, Andrea Gavana apparently wrote: > # Filter cells which do not satisfy Z requirements: > zReq = zMin <= zCent <= zMax This seems to raise a question: should numpy arrays support this standard Python idiom? Cheers, Alan Isaac From amcmorl at gmail.com Thu May 22 08:27:18 2008 From: amcmorl at gmail.com (Angus McMorland) Date: Thu, 22 May 2008 08:27:18 -0400 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: Message-ID: 2008/5/22 Andrea Gavana : > Hi All, > > I am building some 3D grids for visualization starting from a much > bigger grid. I build these grids by satisfying certain conditions on > x, y, z coordinates of their cells: up to now I was using VTK to > perform this operation, but VTK is slow as a turtle, so I thought to > use numpy to get the cells I am interested in. > Basically, for every cell I have the coordinates of its center point > (centroids), named xCent, yCent and zCent. 
These values are stored in > numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, > yCent and zCent with 10,000 values in them). What I'd like to do is: > > # Filter cells which do not satisfy Z requirements: > zReq = zMin <= zCent <= zMax > > # After that, filter cells which do not satisfy Y requirements, > # but apply this filter only on cells who satisfy the above condition: > > yReq = yMin <= yCent <= yMax > > # After that, filter cells which do not satisfy X requirements, > # but apply this filter only on cells who satisfy the 2 above conditions: > > xReq = xMin <= xCent <= xMax > > I'd like to end up with a vector of indices which tells me which are > the cells in the original grid that satisfy all 3 conditions. I know > that something like this: > > zReq = zMin <= zCent <= zMax > > Can not be done directly in numpy, as the first statement executed > returns a vector of boolean. Also, if I do something like: > > zReq1 = numpy.nonzero(zCent <= zMax) > zReq2 = numpy.nonzero(zCent[zReq1] >= zMin) > > I lose the original indices of the grid, as in the second statement > zCent[zReq1] has no more the size of the original grid but it has > already been filtered out. > > Is there anything I could try in numpy to get what I am looking for? > Sorry if the description is not very clear :-D > > Thank you very much for your suggestions. How about (as a pure numpy solution): valid = (z >= zMin) & (z <= zMax) valid[valid] &= (y[valid] >= yMin) & (y[valid] <= yMax) valid[valid] &= (x[valid] >= xMin) & (x[valid] <= xMax) inds = valid.nonzero() ? -- AJC McMorland, PhD candidate Physiology, University of Auckland (Nearly) post-doctoral research fellow Neurobiology, University of Pittsburgh From andrea.gavana at gmail.com Thu May 22 08:53:55 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 22 May 2008 13:53:55 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <200805221404.58348.falted@pytables.org> References: <200805221404.58348.falted@pytables.org> Message-ID: Hi Francesc & All, On Thu, May 22, 2008 at 1:04 PM, Francesc Alted wrote: > I don't know if this is what you want, but you can get the boolean > arrays separately, do the intersection and finally get the interesting > values (by using fancy indexing) or coordinates (by using .nonzero()). > Here it is an example: > > In [105]: a = numpy.arange(10,20) > > In [106]: c1=(a>=13)&(a<=17) > > In [107]: c2=(a>=14)&(a<=18) > > In [109]: all=c1&c2 > > In [110]: a[all] > Out[110]: array([14, 15, 16, 17]) # the values > > In [111]: all.nonzero() > Out[111]: (array([4, 5, 6, 7]),) # the coordinates Thank you for this suggestion! I had forgotten that this worked in numpy :-( . I have written a couple of small functions to test your method and my method (hopefully I did it correctly for both). On my computer (Toshiba Notebook 2.00 GHz, Windows XP SP2, 1GB Ram, Python 2.5, numpy 1.0.3.1), your solution is about 30 times faster than mine (implemented when I didn't know about multiple boolean operations in numpy). 
This is my code: # Begin Code import numpy from timeit import Timer # Number of cells in my original grid nCells = 150000 # Define some constraints for X, Y, Z xMin, xMax = 250.0, 700.0 yMin, yMax = 1000.0, 1900.0 zMin, zMax = 120.0, 300.0 # Generate random centroids for the cells xCent = 1000.0*numpy.random.rand(nCells) yCent = 2500.0*numpy.random.rand(nCells) zCent = 400.0*numpy.random.rand(nCells) def MultipleBoolean1(): """ Andrea's solution, slow :-( .""" xReq_1 = numpy.nonzero(xCent >= xMin) xReq_2 = numpy.nonzero(xCent <= xMax) yReq_1 = numpy.nonzero(yCent >= yMin) yReq_2 = numpy.nonzero(yCent <= yMax) zReq_1 = numpy.nonzero(zCent >= zMin) zReq_2 = numpy.nonzero(zCent <= zMax) xReq = numpy.intersect1d_nu(xReq_1, xReq_2) yReq = numpy.intersect1d_nu(yReq_1, yReq_2) zReq = numpy.intersect1d_nu(zReq_1, zReq_2) xyReq = numpy.intersect1d_nu(xReq, yReq) xyzReq = numpy.intersect1d_nu(xyReq, zReq) def MultipleBoolean2(): """ Francesc's's solution, Much faster :-) .""" xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ (yCent >= yMin) & (yCent <= yMax) & \ (zCent >= zMin) & (zCent <= zMax) xyzReq = numpy.nonzero(xyzReq)[0] if __name__ == "__main__": trial = 10 t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1") print "\n\nAndrea's Solution: %0.8g Seconds/Trial"%(t.timeit(number=trial)/trial) t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2") print "Francesc's Solution: %0.8g Seconds/Trial\n"%(t.timeit(number=trial)/trial) # End Code And I get this timing on my PC: Andrea's Solution: 0.34946193 Seconds/Trial Francesc's Solution: 0.011288139 Seconds/Trial If I implemented everything correctly, this is an amazing improvement. Thank you to everyone who provided suggestions, and thanks to the list :-D Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From bsouthey at gmail.com Thu May 22 09:33:15 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 22 May 2008 08:33:15 -0500 Subject: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008 In-Reply-To: <9457e7c80805220404ha58f2f0o461d3299ce664283@mail.gmail.com> References: <9457e7c80805201405i55e3d1c3n86a27aea36da41ac@mail.gmail.com> <483356FA.5000502@esrf.fr> <3d375d730805201604p26186f52tc604ceb19f7831b8@mail.gmail.com> <1211351236.15406.138.camel@localhost> <3d375d730805202333o4b0a6588rec0f493b9d79f16a@mail.gmail.com> <9457e7c80805210108k2257026fk821978c9bc00bf1e@mail.gmail.com> <1211449034.27052.81.camel@localhost> <8415617F-5BC1-499D-9B1D-708DD5C292BE@tamu.edu> <9457e7c80805220404ha58f2f0o461d3299ce664283@mail.gmail.com> Message-ID: <4835761B.6010704@gmail.com> St?fan van der Walt wrote: > It looks like there is significant interest in using "np" instead of > "numpy" in the examples (i.e. we expect the user to do "import numpy > as np" before trying code snippets). > > Would anybody who objects to using "np" raise it now, so that we can > bury this issue? > > Regards > St?fan > > 2008/5/22 Rob Hetland : > >> On May 22, 2008, at 11:37 AM, Pauli Virtanen wrote: >> >> >>> Or should we "assume from numpy import *" or "import numpy as np"? I >>> >> Although a good case could probably be made for all three (*, np, >> numpy), I think that if "import numpy as np" is to be put forward as >> the standard coding style, the examples should use this as well. >> >> -Rob >> >> ---- >> Rob Hetland, Associate Professor >> Dept. 
of Oceanography, Texas A&M University >> http://pong.tamu.edu/~rob >> phone: 979-458-0096, fax: 979-845-6331 >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > Hi, I prefer using 'import numpy' over 'import numpy as np'. But as long as each example has 'import numpy as np' included then I have no objections. The main reason for this is that the block of code can easily be copied and pasted to run a complete entity. Also this type of implicit assumption often get missed because these assumptions are often far from the example (missed in web searches) or overlooked as the reader doesn't think that part was important. Regards Bruce From oliphant at enthought.com Thu May 22 11:07:11 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 22 May 2008 10:07:11 -0500 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: Message-ID: <48358C1F.3090904@enthought.com> Alan G Isaac wrote: > On Thu, 22 May 2008, Andrea Gavana apparently wrote: > >> # Filter cells which do not satisfy Z requirements: >> zReq = zMin <= zCent <= zMax >> > > This seems to raise a question: > should numpy arrays support this standard Python idiom? > It would be nice, but alas it requires a significant change to Python first to give us the hooks to modify. (We need the 'and' and 'or' operations to return "vectors" instead of just numbers as they do now). There is a PEP to allow this, but it has not received much TLC as of late. The difficulty in the implementation is supporting "short-circuited" evaluation. -Travis From bsouthey at gmail.com Thu May 22 11:10:12 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 22 May 2008 10:10:12 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types Message-ID: <48358CD4.5010108@gmail.com> Hi, Is it bug if different NumPy types have different attributes? Based on prior discussion, 'complex', 'float' and 'int' are Python types and others are NumPy types. Consequently 'complex', 'float' and 'int' do not inherit from NumPy. However, an element from array created using dtype=numpy.float has the numpy.float64 type. So this is really a documentation issue than an implementation issue. Also different NumPy types have different attributes, for example, 'float64' contains attributes (eg __coerce__) that are not present in 'float32' and 'float128' (these two have the same attributes). This can cause attribute errors in somewhat contrived examples that probably are unlikely to appear in practice because of the casting involved in array creation. The 'uint' types all seem to have the same attributes so do not have these issues. 
import numpy len(dir(float)) # 47 len(dir(numpy.float)) # 47 len(dir(numpy.float32)) # 131 len(dir(numpy.float64)) # 135 len(dir(numpy.float128)) # 131 len(dir(int)) # 54 len(dir(numpy.int)) # 54 len(dir(numpy.int0)) # 135 len(dir(numpy.int16)) # 132 len(dir(numpy.int32)) # 132 len(dir(numpy.int64)) # 135 len(dir(numpy.int8)) # 132 print (numpy.float64(1234).size) # 1 print (numpy.float(1234).size) ''' prints error: Traceback (most recent call last): File "", line 1, in AttributeError: 'float' object has no attribute 'size' ''' Regards Bruce From josef.pktd at gmail.com Thu May 22 11:11:10 2008 From: josef.pktd at gmail.com (joep) Date: Thu, 22 May 2008 08:11:10 -0700 (PDT) Subject: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution Message-ID: <4cdacb1d-b91d-46f3-b854-e500dd7e54ec@l42g2000hsc.googlegroups.com> Hi, I was just looking around at the new numpy documentation and got a xhtml parsing error on the page (with Firefox): http://mentat.za.net/numpy/refguide/random.xhtml#index-29351 The offending line contains $X pprox prod_{i=1}^{k}{x^{lpha_i-1}_i}$< in the docstring of the dirichlet distribution the corresponding line in the source at http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/mtrand/mtrand.pyx is .. math:: X \\approx \\prod_{i=1}^{k}{x^{\\alpha_i-1}_i} (I have no idea, why it seems not to parse \\a correctly). When looking for this, I found that the Dirichlet distribution is missing from the new Docstring Wiki, http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random Then I saw that Dirichlet is also missing in __all__ in http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/info.py As a consequence numpy.lookfor does not find Dirichlet >>> numpy.lookfor('dirichlet') Search results for 'dirichlet' ------------------------------ >>> import numpy.random >>> dir(numpy.random) contains dirichlet >>> numpy.random.__all__ does not contain dirichlet. To me this seems to be a documentation bug. Josef From oliphant at enthought.com Thu May 22 11:24:58 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 22 May 2008 10:24:58 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: <48358CD4.5010108@gmail.com> References: <48358CD4.5010108@gmail.com> Message-ID: <4835904A.3040807@enthought.com> Bruce Southey wrote: > Hi, > Is it bug if different NumPy types have different attributes? > I don't think so, other than perhaps we should not have the Python types in the numpy namespace. numpy.float is just __builtin__.float which is a Python type not a NumPy data-type object. numpy.float64 inherits from numpy.float however. -Travis From bioinformed at gmail.com Thu May 22 11:59:37 2008 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Thu, 22 May 2008 11:59:37 -0400 Subject: [Numpy-discussion] Fancier indexing Message-ID: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> After poking around for a bit, I was wondering if there was a faster method for the following: # Array of index values 0..n items = numpy.array([0,3,2,1,4,2],dtype=int) # Count the number of occurrences of each index counts = numpy.zeros(5, dtype=int) for i in items: counts[i] += 1 In my real code, 'items' contain up to a million values and this loop will be in a performance critical area of code. If there is no simple solution, I can trivially code this using the C-API. Thanks, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
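A minimal loop-free sketch of the kind of counting Kevin describes, using numpy.bincount (the function the replies below converge on); the sample data is taken from his message:

>>> import numpy
>>> items = numpy.array([0, 3, 2, 1, 4, 2], dtype=int)
>>> numpy.bincount(items)    # result[i] is the number of times i occurs in items
array([1, 1, 2, 1, 1])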
URL: From kwgoodman at gmail.com Thu May 22 12:08:30 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 22 May 2008 09:08:30 -0700 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs wrote: > After poking around for a bit, I was wondering if there was a faster method > for the following: > > # Array of index values 0..n > items = numpy.array([0,3,2,1,4,2],dtype=int) > > # Count the number of occurrences of each index > counts = numpy.zeros(5, dtype=int) > for i in items: > counts[i] += 1 > > In my real code, 'items' contain up to a million values and this loop will > be in a performance critical area of code. If there is no simple solution, > I can trivially code this using the C-API. How big is n? If it is much smaller than a million then loop over that instead. From bioinformed at gmail.com Thu May 22 12:13:55 2008 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Thu, 22 May 2008 12:13:55 -0400 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: <2e1434c10805220913v168fea34h49deb7a346dd54bc@mail.gmail.com> On Thu, May 22, 2008 at 12:08 PM, Keith Goodman wrote: > How big is n? If it is much smaller than a million then loop over that > instead. > n is always relatively small, but I'd rather not do: for i in range(n): counts[i] = (items==i).sum() If that was the best alternative, I'd just bite the bullet and code this in C. Thanks, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Thu May 22 12:15:14 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 22 May 2008 09:15:14 -0700 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 9:08 AM, Keith Goodman wrote: > On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs > wrote: >> After poking around for a bit, I was wondering if there was a faster method >> for the following: >> >> # Array of index values 0..n >> items = numpy.array([0,3,2,1,4,2],dtype=int) >> >> # Count the number of occurrences of each index >> counts = numpy.zeros(5, dtype=int) >> for i in items: >> counts[i] += 1 >> >> In my real code, 'items' contain up to a million values and this loop will >> be in a performance critical area of code. If there is no simple solution, >> I can trivially code this using the C-API. > > How big is n? If it is much smaller than a million then loop over that instead. Or how about using a list instead: >> items = [0,3,2,1,4,2] >> uitems = frozenset(items) >> count = [items.count(i) for i in uitems] >> count [1, 1, 2, 1, 1] From mark.miller at usu.edu Thu May 22 12:15:17 2008 From: mark.miller at usu.edu (Mark Miller) Date: Thu, 22 May 2008 09:15:17 -0700 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: You're just trying to do this...correct? 
>>> import numpy >>> items = numpy.array([0,3,2,1,4,2],dtype=int) >>> unique = numpy.unique(items) >>> unique array([0, 1, 2, 3, 4]) >>> counts=numpy.histogram(items,unique) >>> counts (array([1, 1, 2, 1, 1]), array([0, 1, 2, 3, 4])) >>> counts[0] array([1, 1, 2, 1, 1]) >>> On Thu, May 22, 2008 at 9:08 AM, Keith Goodman wrote: > On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs > wrote: > > After poking around for a bit, I was wondering if there was a faster > method > > for the following: > > > > # Array of index values 0..n > > items = numpy.array([0,3,2,1,4,2],dtype=int) > > > > # Count the number of occurrences of each index > > counts = numpy.zeros(5, dtype=int) > > for i in items: > > counts[i] += 1 > > > > In my real code, 'items' contain up to a million values and this loop > will > > be in a performance critical area of code. If there is no simple > solution, > > I can trivially code this using the C-API. > > How big is n? If it is much smaller than a million then loop over that > instead. > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Thu May 22 12:16:57 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 22 May 2008 09:16:57 -0700 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 9:15 AM, Keith Goodman wrote: > On Thu, May 22, 2008 at 9:08 AM, Keith Goodman wrote: >> On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs >> wrote: >>> After poking around for a bit, I was wondering if there was a faster method >>> for the following: >>> >>> # Array of index values 0..n >>> items = numpy.array([0,3,2,1,4,2],dtype=int) >>> >>> # Count the number of occurrences of each index >>> counts = numpy.zeros(5, dtype=int) >>> for i in items: >>> counts[i] += 1 >>> >>> In my real code, 'items' contain up to a million values and this loop will >>> be in a performance critical area of code. If there is no simple solution, >>> I can trivially code this using the C-API. >> >> How big is n? If it is much smaller than a million then loop over that instead. > > Or how about using a list instead: > >>> items = [0,3,2,1,4,2] >>> uitems = frozenset(items) >>> count = [items.count(i) for i in uitems] >>> count > [1, 1, 2, 1, 1] Oh, I see, so uitems should be range(n) From robince at gmail.com Thu May 22 12:22:16 2008 From: robince at gmail.com (Robin) Date: Thu, 22 May 2008 17:22:16 +0100 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs wrote: > After poking around for a bit, I was wondering if there was a faster method > for the following: > > # Array of index values 0..n > items = numpy.array([0,3,2,1,4,2],dtype=int) > > # Count the number of occurrences of each index > counts = numpy.zeros(5, dtype=int) > for i in items: > counts[i] += 1 > > In my real code, 'items' contain up to a million values and this loop will > be in a performance critical area of code. If there is no simple solution, > I can trivially code this using the C-API. 
I would use bincount: count = bincount(items) should be all you need: In [192]: items = [0,3,2,1,4,2] In [193]: bincount(items) Out[193]: array([1, 1, 2, 1, 1]) In [194]: bincount? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: bincount(x,weights=None) Return the number of occurrences of each value in x. x must be a list of non-negative integers. The output, b[i], represents the number of times that i is found in x. If weights is specified, every occurrence of i at a position p contributes weights[p] instead of 1. See also: histogram, digitize, unique. Robin From kwgoodman at gmail.com Thu May 22 12:34:46 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 22 May 2008 09:34:46 -0700 Subject: [Numpy-discussion] Fancier indexing In-Reply-To: References: <2e1434c10805220859h7d9dbe05sfb0c621f543f5d64@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 9:22 AM, Robin wrote: > On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs > wrote: >> After poking around for a bit, I was wondering if there was a faster method >> for the following: >> >> # Array of index values 0..n >> items = numpy.array([0,3,2,1,4,2],dtype=int) >> >> # Count the number of occurrences of each index >> counts = numpy.zeros(5, dtype=int) >> for i in items: >> counts[i] += 1 >> >> In my real code, 'items' contain up to a million values and this loop will >> be in a performance critical area of code. If there is no simple solution, >> I can trivially code this using the C-API. > > I would use bincount: > count = bincount(items) > should be all you need: I guess bincount is *little* faster: >> items = mp.random.randint(0, 100, (1000000,)) >> timeit mp.bincount(items) 100 loops, best of 3: 4.05 ms per loop >> items = items.tolist() >> timeit [items.count(i) for i in range(100)] 10 loops, best of 3: 2.91 s per loop From josef.pktd at gmail.com Thu May 22 12:51:25 2008 From: josef.pktd at gmail.com (joep) Date: Thu, 22 May 2008 09:51:25 -0700 (PDT) Subject: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution In-Reply-To: <4cdacb1d-b91d-46f3-b854-e500dd7e54ec@l42g2000hsc.googlegroups.com> References: <4cdacb1d-b91d-46f3-b854-e500dd7e54ec@l42g2000hsc.googlegroups.com> Message-ID: <7135f262-7fd6-4985-b6e5-56602d367939@l42g2000hsc.googlegroups.com> On May 22, 11:11 am, joep wrote: > Hi, > When looking for this, I found that the Dirichlet distribution is > missing from the new Docstring Wiki,http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random Actually, a search on the wiki finds dirichlet in http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/dirichlet I found random/mtrand only through the search, it doesn't seem to be linked from anywhere Is it intentional that function that are imported inside numpy might have the same docstring assigned to several different wiki pages, and might get edited on different pages? Since all distribution (except for dirichlet) are included in numpy.random.__all__, these distribution show up on two different pages, e.g. 
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/poisson and http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/poisson So except for the strange parsing of the dirichlet docstring, this is a problem with numpy: numpy.random.__all__ as defined in numpy/random/info.py does not expose Dirichlet Josef From stefan at sun.ac.za Thu May 22 13:26:33 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 22 May 2008 19:26:33 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> Message-ID: <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> Hi Andrea 2008/5/22 Andrea Gavana : >> You clearly have a large dataset, otherwise speed wouldn't have been a >> concern to you. You can do your operation in one pass over the data, >> and I'd suggest you try doing that with Cython or Ctypes. If you need >> an example on how to access data using those methods, let me know. >> >> Of course, it *can* be done using NumPy (maybe not in one pass), but >> thinking in terms of for-loops is sometimes easier, and immediately >> takes you to a highly optimised execution time. > > First of all, thank you for your answer. I know next to nothing about > Cython and very little about Ctypes, but it would be nice to have an > example on how to use them to speed up the operations. Actually, I > don't really know if my dataset is "large", as I work normally with > xCent, yCent and zCent vectors of about 100,000-300,000 elements in > them. Just to clarify things in my mind: is VTK *that* slow? I find that surprising, since it is written in C or C++. Regards St?fan From pav at iki.fi Thu May 22 13:30:21 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 May 2008 20:30:21 +0300 Subject: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution In-Reply-To: <7135f262-7fd6-4985-b6e5-56602d367939@l42g2000hsc.googlegroups.com> References: <4cdacb1d-b91d-46f3-b854-e500dd7e54ec@l42g2000hsc.googlegroups.com> <7135f262-7fd6-4985-b6e5-56602d367939@l42g2000hsc.googlegroups.com> Message-ID: <1211477421.8271.15.camel@localhost.localdomain> to, 2008-05-22 kello 09:51 -0700, joep kirjoitti: > > On May 22, 11:11 am, joep wrote: > > Hi, > > > When looking for this, I found that the Dirichlet distribution is > > missing from the new Docstring Wiki,http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random > > Actually, a search on the wiki finds dirichlet in > http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/dirichlet > > I found random/mtrand only through the search, it doesn't seem to be > linked from anywhere It's not in __all__ of numpy.random, and is stripped from the content list because of that. > Is it intentional that function that are imported inside numpy might > have the same docstring assigned to several different wiki pages, and > might get edited on different pages? > > Since all distribution (except for dirichlet) are included in > numpy.random.__all__, these distribution show up on two different > pages, e.g. > http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/poisson > and > http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/poisson ? It is not intentional. And for the majority of cases this does not happen, and I can fix this for numpy.random.mtrand. Thanks for reporting. 
Pauli From thrabe at burnham.org Thu May 22 14:22:15 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Thu, 22 May 2008 11:22:15 -0700 Subject: [Numpy-discussion] osX leopard linker setting Message-ID: Hi, does anybody know the linker settings for python c modules on osX ? i have the original xcode3 tools installed -> gcc 4.0.1 I use '-bundle -flat_namespace' for linking now, but I get ld: can't insert lazy pointers, __dyld section not found for inferred architecture ppc Does anybody know of flags working for the linker? Thank you, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From thrabe at burnham.org Thu May 22 14:23:14 2008 From: thrabe at burnham.org (Thomas Hrabe) Date: Thu, 22 May 2008 11:23:14 -0700 Subject: [Numpy-discussion] osX leopard linker setting References: Message-ID: By the way, whats a lazy pointer anyway? -----Urspr?ngliche Nachricht----- Von: numpy-discussion-bounces at scipy.org im Auftrag von Thomas Hrabe Gesendet: Do 22.05.2008 11:22 An: numpy-discussion at scipy.org Betreff: [Numpy-discussion] osX leopard linker setting Hi, does anybody know the linker settings for python c modules on osX ? i have the original xcode3 tools installed -> gcc 4.0.1 I use '-bundle -flat_namespace' for linking now, but I get ld: can't insert lazy pointers, __dyld section not found for inferred architecture ppc Does anybody know of flags working for the linker? Thank you, Thomas -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 2834 bytes Desc: not available URL: From robert.kern at gmail.com Thu May 22 14:26:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 22 May 2008 13:26:02 -0500 Subject: [Numpy-discussion] osX leopard linker setting In-Reply-To: References: Message-ID: <3d375d730805221126t5ad790dahf7cebd6c00230dff@mail.gmail.com> On Thu, May 22, 2008 at 1:22 PM, Thomas Hrabe wrote: > Hi, > > does anybody know the linker settings for python c modules on osX ? > i have the original xcode3 tools installed -> gcc 4.0.1 Just use distutils (or numpy.distutils). It will take care of the linker flags for you. If you really can't use distutils for some reason, take a look at the flags that are added for modules that do build with distutils. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Thu May 22 14:28:53 2008 From: josef.pktd at gmail.com (joep) Date: Thu, 22 May 2008 11:28:53 -0700 (PDT) Subject: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution In-Reply-To: <1211477421.8271.15.camel@localhost.localdomain> References: <4cdacb1d-b91d-46f3-b854-e500dd7e54ec@l42g2000hsc.googlegroups.com> <7135f262-7fd6-4985-b6e5-56602d367939@l42g2000hsc.googlegroups.com> <1211477421.8271.15.camel@localhost.localdomain> Message-ID: On May 22, 1:30 pm, Pauli Virtanen wrote: > to, 2008-05-22 kello 09:51 -0700, joep kirjoitti: > > ? > It is not intentional. And for the majority of cases this does not > happen, and I can fix this for numpy.random.mtrand. Thanks for > reporting. > I was looking some more at the __all__ statements and trying to figure out what the system/idea behind the imports and exposure of functions at different places is. I did not find any other full duplication as with mtrand so far. 
However, when I do a search on the DocWiki for example for arccos (or log, log10, exp, tan,...), I see it 9 times, and it is not clear which ones refer to the same docstring and where several imports of the same function are picked up separately, and which ones refer to actually different functions in the source. numpy.lookfor('arccos') yields 3 results, with 3 different doc strings, the other 6 might be duplicates. http://sd-2116.dedibox.fr/doc/Docstrings/numpy/lib/scimath/arccos has the most informative docstring In numpy it is exposed as ``numpy.emath.arccos`` A recommendation for docstring editing might be to verify duplicates and copy doc strings if the function is (almost) duplicated or triplicated in the numpy source and possibly cross link different versions. When I start from the DocWiki front page, I seem to be able to follow links only to one version of any docstring, but any search leads to the multiple exposer of the same function. Josef From lhyatt at gmail.com Thu May 22 14:29:21 2008 From: lhyatt at gmail.com (Lewis Hyatt) Date: Thu, 22 May 2008 18:29:21 +0000 (UTC) Subject: [Numpy-discussion] workaround for searchsorted with strings? Message-ID: Hello- I see from this thread: http://article.gmane.org/gmane.comp.python.numeric.general/18746/ that searchsorted does not work correctly with strings. Is there a workaround, though, that I can use with 1.0.4 until there is a new official numpy release that includes the fix mentioned in the reference above? Using the latest SVN version is not an option for me. My understanding was that searchsorted works OK if the strings are all the same data type, but that does not appear to be the case: p >>> x=array(['0', '1', '2', '12']) p >>> y=array(['0', '0', '2', '3', '123']) p >>> x.searchsorted(y) array([0, 0, 0, 2, 0]) p >>> x.astype(y.dtype).searchsorted(y) array([0, 0, 2, 4, 2]) I understand that the first call to searchsorted fails because y has type S3 and x has type S2. But it seems that changing the type of x produces still incorrect (albeit) different results. Is there something similar I can do to make this work for now? Thanks very much. -Lewis From lhyatt at gmail.com Thu May 22 14:36:03 2008 From: lhyatt at gmail.com (Lewis Hyatt) Date: Thu, 22 May 2008 18:36:03 +0000 (UTC) Subject: [Numpy-discussion] workaround for searchsorted with strings? References: Message-ID: Oh sorry, my example was dumb, never mind. It looks like this way does work after all. Can someone please confirm for me, though, that the workaround I am using (just changing to the wider string type) is reliable? Thanks, sorry for the noise. -Lewis From charlesr.harris at gmail.com Thu May 22 14:38:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 12:38:17 -0600 Subject: [Numpy-discussion] workaround for searchsorted with strings? In-Reply-To: References: Message-ID: On Thu, May 22, 2008 at 12:29 PM, Lewis Hyatt wrote: > Hello- > > I see from this thread: > http://article.gmane.org/gmane.comp.python.numeric.general/18746/ > > that searchsorted does not work correctly with strings. Is there a > workaround, > though, that I can use with 1.0.4 until there is a new official numpy > release > that includes the fix mentioned in the reference above? Using the latest > SVN > version is not an option for me. 
> > My understanding was that searchsorted works OK if the strings are all the > same > data type, but that does not appear to be the case: > > p >>> x=array(['0', '1', '2', '12']) > p >>> y=array(['0', '0', '2', '3', '123']) > p >>> x.searchsorted(y) > array([0, 0, 0, 2, 0]) > p >>> x.astype(y.dtype).searchsorted(y) > array([0, 0, 2, 4, 2]) > > I understand that the first call to searchsorted fails because y has type > S3 and > x has type S2. But it seems that changing the type of x produces still > incorrect (albeit) different results. Is there something similar I can do > to > make this work for now? Thanks very much. > The x array is not sorted. Try x = array(['0', '1', '12', '2']) -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 22 14:39:49 2008 From: josef.pktd at gmail.com (joep) Date: Thu, 22 May 2008 11:39:49 -0700 (PDT) Subject: [Numpy-discussion] buglet: Dirichlet missing in numpy.random.__all__ as defined in numpy/random/info.py Message-ID: The Dirichlet distribution is missing in __all__ in http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/info.py As a consequence numpy.lookfor does not find Dirichlet >>> numpy.lookfor('dirichlet') Search results for 'dirichlet' ------------------------------ >>> import numpy.random >>> dir(numpy.random) contains dirichlet >>> numpy.random.__all__ does not contain dirichlet. looks like a tiny bug. Josef (this is kind of a duplicate email, but I didn't want it to get lost in the DocWiki discussion) From robert.kern at gmail.com Thu May 22 14:46:13 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 22 May 2008 13:46:13 -0500 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> Message-ID: <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> On Thu, May 22, 2008 at 12:26 PM, St?fan van der Walt wrote: > Just to clarify things in my mind: is VTK *that* slow? I find that > surprising, since it is written in C or C++. Performance can depend more on the design of the code than the implementation language. There are several places in VTK which are slower than they strictly could be because VTK exposes data primarily through abstract interfaces and only sometimes expose underlying data structure for faster processing. Quite sensibly, they implement the general form first. It's much the same with parts of numpy. The iterator abstraction lets you work on arbitrarily strided arrays, but for contiguous arrays, just using the pointer lets you, and the compiler, optimize your code more. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
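A minimal sketch of that workaround: put the array being searched into lexicographic order and cast both operands to a common string width before calling searchsorted. The exact output shown here assumes a NumPy build whose string comparison handles the trailing padding correctly (see Chuck's caveat further down):

>>> import numpy
>>> x = numpy.sort(numpy.array(['0', '1', '2', '12']))   # lexicographic order: ['0', '1', '12', '2']
>>> y = numpy.array(['0', '0', '2', '3', '123'])
>>> x.astype(y.dtype).searchsorted(y)
array([0, 0, 3, 4, 3])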
-- Umberto Eco From pav at iki.fi Thu May 22 14:50:54 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 May 2008 21:50:54 +0300 Subject: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution In-Reply-To: References: <4cdacb1d-b91d-46f3-b854-e500dd7e54ec@l42g2000hsc.googlegroups.com> <7135f262-7fd6-4985-b6e5-56602d367939@l42g2000hsc.googlegroups.com> <1211477421.8271.15.camel@localhost.localdomain> Message-ID: <1211482254.8271.20.camel@localhost.localdomain> to, 2008-05-22 kello 11:28 -0700, joep kirjoitti: [clip] > However, when I do a search on the DocWiki for example for arccos (or > log, log10, exp, tan,...), I see it 9 times, and it is not clear which > ones refer to the same docstring and where several imports of the same > function are picked up separately, and which ones refer to actually > different functions in the source. [clip] > A recommendation for docstring editing might be to verify duplicates > and copy doc strings if the function is (almost) duplicated or > triplicated in the numpy source and possibly cross link different > versions. This is a problem with the tool on handling extension objects and Pyrex-generated classes, and the editors shouldn't have to concern themselves with it. I'll fix it and remove any unedited duplicates from the wiki. Pauli From charlesr.harris at gmail.com Thu May 22 15:00:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 13:00:22 -0600 Subject: [Numpy-discussion] workaround for searchsorted with strings? In-Reply-To: References: Message-ID: On Thu, May 22, 2008 at 12:36 PM, Lewis Hyatt wrote: > Oh sorry, my example was dumb, never mind. It looks like this way does work > after all. Can someone please confirm for me, though, that the workaround I > am > using (just changing to the wider string type) is reliable? Thanks, sorry > for > the noise. > You can still have problems because the numpy strings will be filled out with zeros. The string compare in 1.0.4 doesn't handle zeros correctly and this might cause some problems. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 22 15:09:11 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 22 May 2008 14:09:11 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: <4835904A.3040807@enthought.com> References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> Message-ID: <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant wrote: > Bruce Southey wrote: >> Hi, >> Is it bug if different NumPy types have different attributes? >> > I don't think so, other than perhaps we should not have the Python types > in the numpy namespace. > > numpy.float is just __builtin__.float which is a Python type not a > NumPy data-type object. > > numpy.float64 inherits from numpy.float however. And I believe this is the cause of the difference between the attributes of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0 and int64 on your presumably 64-bit platform. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Thu May 22 15:12:50 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 13:12:50 -0600 Subject: [Numpy-discussion] C API Message-ID: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Thu May 22 15:16:23 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 22 May 2008 20:16:23 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> Message-ID: Hi All, On Thu, May 22, 2008 at 7:46 PM, Robert Kern wrote: > On Thu, May 22, 2008 at 12:26 PM, St?fan van der Walt wrote: >> Just to clarify things in my mind: is VTK *that* slow? I find that >> surprising, since it is written in C or C++. > > Performance can depend more on the design of the code than the > implementation language. There are several places in VTK which are > slower than they strictly could be because VTK exposes data primarily > through abstract interfaces and only sometimes expose underlying data > structure for faster processing. Quite sensibly, they implement the > general form first. Yes, Robert is perfectly right. VTK is quite handy in most of the situations, but in this case I had to recursively apply 3 thresholds (each one for X, Y and Z respectively) and the threshold construction (initialization) and its execution were much slower than my (sloppy) numpy result. Compared to the solution Francesc posted, the VTK approach simply disappears. By the way, about the solution Francesc posted: xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ (yCent >= yMin) & (yCent <= yMax) & \ (zCent >= zMin) & (zCent <= zMax) xyzReq = numpy.nonzero(xyzReq)[0] Do you think is there any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. Thank you very much for your sugestions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." 
http://xoomer.alice.it/infinity77/ From bsouthey at gmail.com Thu May 22 15:32:32 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 22 May 2008 14:32:32 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> Message-ID: Hi, Thanks very much for the confirmation. Bruce On Thu, May 22, 2008 at 2:09 PM, Robert Kern wrote: > On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant > wrote: >> Bruce Southey wrote: >>> Hi, >>> Is it bug if different NumPy types have different attributes? >>> >> I don't think so, other than perhaps we should not have the Python types >> in the numpy namespace. >> >> numpy.float is just __builtin__.float which is a Python type not a >> NumPy data-type object. >> >> numpy.float64 inherits from numpy.float however. > > And I believe this is the cause of the difference between the > attributes of numpy.float32/numpy.float128 and numpy.float64. Same > deal with int0 and int64 on your presumably 64-bit platform. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Thu May 22 15:40:41 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 22 May 2008 12:40:41 -0700 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> Message-ID: <4835CC39.1050508@noaa.gov> Andrea Gavana wrote: > By the way, about the solution Francesc posted: > > xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ > (yCent >= yMin) & (yCent <= yMax) & \ > (zCent >= zMin) & (zCent <= zMax) > > xyzReq = numpy.nonzero(xyzReq)[0] > > Do you think is there any chance that a C extension (or something > similar) could be faster? yep -- if I've be got this right, the above creates 7 temporary arrays. creating that many and pushing the data in and out of memory can be pretty slow for large arrays. In C, C++, Cython or Fortran, you can just do one loop, and one output array. It should be much faster for the big arrays. > Otherwise, I will try and implement it in Fortran and wrap it with > f2py, assuming I am able to do it correctly and the overhead of > calling an external extension is not killing the execution time. nope, that's one function call for the whole thing, negligible. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Thu May 22 15:46:45 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 13:46:45 -0600 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 1:32 PM, Bruce Southey wrote: > Hi, > Thanks very much for the confirmation. > > Bruce > > On Thu, May 22, 2008 at 2:09 PM, Robert Kern > wrote: > > On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant > > wrote: > >> Bruce Southey wrote: > >>> Hi, > >>> Is it bug if different NumPy types have different attributes? > >>> > >> I don't think so, other than perhaps we should not have the Python types > >> in the numpy namespace. > >> > >> numpy.float is just __builtin__.float which is a Python type not a > >> NumPy data-type object. > >> > >> numpy.float64 inherits from numpy.float however. > > > > And I believe this is the cause of the difference between the > > attributes of numpy.float32/numpy.float128 and numpy.float64. Same > > deal with int0 and int64 on your presumably 64-bit platform. > > It also leads to various inconsistencies: In [1]: float32(array([[1]])) Out[1]: array([[ 1.]], dtype=float32) In [2]: float64(array([[1]])) Out[2]: 1.0 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 22 15:59:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 22 May 2008 14:59:00 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> Message-ID: <3d375d730805221259x18d30dafp4ba39a7e96782876@mail.gmail.com> On Thu, May 22, 2008 at 2:46 PM, Charles R Harris wrote: > It also leads to various inconsistencies: > > In [1]: float32(array([[1]])) > Out[1]: array([[ 1.]], dtype=float32) > > In [2]: float64(array([[1]])) > Out[2]: 1.0 Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wnbell at gmail.com Thu May 22 16:36:26 2008 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 22 May 2008 15:36:26 -0500 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 2:16 PM, Andrea Gavana wrote: > By the way, about the solution Francesc posted: > > xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ > (yCent >= yMin) & (yCent <= yMax) & \ > (zCent >= zMin) & (zCent <= zMax) > You could implement this with inplace operations to save memory: xyzReq = (xCent >= xMin) xyzReq &= (xCent <= xMax) xyzReq &= (yCent >= yMin) xyzReq &= (yCent <= yMax) xyzReq &= (zCent >= zMin) xyzReq &= (zCent <= zMax) > Do you think is there any chance that a C extension (or something > similar) could be faster? Or something else using weave? 
I understand > that this solution is already highly optimized as it uses the power of > numpy with the logic operations in Python, but I was wondering if I > can make it any faster A C implementation would certainly be faster, perhaps 5x faster, due to short-circuiting the AND operations and the fact that you'd only pass over the data once. OTOH I'd be very surprised if this is the slowest part of your application. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From orest.kozyar at gmail.com Thu May 22 16:52:22 2008 From: orest.kozyar at gmail.com (Orest Kozyar) Date: Thu, 22 May 2008 16:52:22 -0400 Subject: [Numpy-discussion] numpy.ndarray.astype conversion Message-ID: The following line: array([6.95e-5]).astype('S') returns: array(['6'], dtype='|S1') I realize that I can get it to return the correct string representation if I specify 'S12', etc, but anything below 'S9' results in an incorrect string representation of the number. Is this expected behavior? Thanks! Orest From charlesr.harris at gmail.com Thu May 22 17:03:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 15:03:19 -0600 Subject: [Numpy-discussion] numpy.ndarray.astype conversion In-Reply-To: References: Message-ID: On Thu, May 22, 2008 at 2:52 PM, Orest Kozyar wrote: > The following line: array([6.95e-5]).astype('S') > returns: array(['6'], dtype='|S1') > > I realize that I can get it to return the correct string > representation if I specify 'S12', etc, but anything below 'S9' > results in an incorrect string representation of the number. Is this > expected behavior? > Heh, I don't know what it should do. Issue a warning, maybe? What are you trying to do? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu May 22 17:23:18 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 22 May 2008 23:23:18 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> Message-ID: <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> Hi Andrea 2008/5/22 Andrea Gavana : > By the way, about the solution Francesc posted: > > xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ > (yCent >= yMin) & (yCent <= yMax) & \ > (zCent >= zMin) & (zCent <= zMax) > > xyzReq = numpy.nonzero(xyzReq)[0] > > Do you think is there any chance that a C extension (or something > similar) could be faster? Or something else using weave? I understand > that this solution is already highly optimized as it uses the power of > numpy with the logic operations in Python, but I was wondering if I > can make it any faster: on my PC, the algorithm runs in 0.01 seconds, > more or less, for 150,000 cells, but today I encountered a case in > which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( > Otherwise, I will try and implement it in Fortran and wrap it with > f2py, assuming I am able to do it correctly and the overhead of > calling an external extension is not killing the execution time. I wrote a quick proof of concept (no guarantees). You can find it here (download using bzr, http://bazaar-vcs.org, or just grab the files with your web browser): https://code.launchpad.net/~stefanv/+junk/xyz 1. Install Cython if you haven't already 2. Run "python setup.py build_ext -i" to build the C extension 3. 
Use the code, e.g., import xyz out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5, array([2.0, 4.0, 6.0]), 2, 4, array([-1.0, -2.0, -4.0]), -3, -2) In the above case, out is [False, True, False]. Regards St?fan From bsouthey at gmail.com Thu May 22 17:25:23 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 22 May 2008 16:25:23 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: <3d375d730805221259x18d30dafp4ba39a7e96782876@mail.gmail.com> References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> <3d375d730805221259x18d30dafp4ba39a7e96782876@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 2:59 PM, Robert Kern wrote: > On Thu, May 22, 2008 at 2:46 PM, Charles R Harris > wrote: >> It also leads to various inconsistencies: >> >> In [1]: float32(array([[1]])) >> Out[1]: array([[ 1.]], dtype=float32) >> >> In [2]: float64(array([[1]])) >> Out[2]: 1.0 > > Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > So, should these return an error if the argument is an ndarray object, a list or similar? Otherwise, int, float and string type of arguments would be okay under the assumption that people would like variable precision scalars. Bruce From oliphant at enthought.com Thu May 22 17:34:43 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 22 May 2008 16:34:43 -0500 Subject: [Numpy-discussion] C API In-Reply-To: References: Message-ID: <4835E6F3.7050400@enthought.com> Charles R Harris wrote: > All, > > I added a function to array_api_order.txt and apparently this changed > the order of the pointers in the API, which caused ctypes to segfault > until I removed the build directory and did a complete rebuild. It > seems to me that if we really want to make adding these API functions > safe, then we should only have one list instead of the current two. > This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. -Travis From charlesr.harris at gmail.com Thu May 22 17:55:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 15:55:48 -0600 Subject: [Numpy-discussion] C API In-Reply-To: <4835E6F3.7050400@enthought.com> References: <4835E6F3.7050400@enthought.com> Message-ID: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > All, > > > > I added a function to array_api_order.txt and apparently this changed > > the order of the pointers in the API, which caused ctypes to segfault > > until I removed the build directory and did a complete rebuild. It > > seems to me that if we really want to make adding these API functions > > safe, then we should only have one list instead of the current two. > > This looks to require some mods to the build system. What do folks think? > Yes, or a simple solution is to only append to one of the lists. At > the very least, we should mark the array_api_order as not appendable. 
> That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge this lists. At the beginning of 1.2, of course. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu May 22 18:12:53 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 22 May 2008 15:12:53 -0700 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> Message-ID: <4835EFE5.3010708@noaa.gov> St?fan van der Walt wrote: > I wrote a quick proof of concept (no guarantees). Thanks for the example -- I like how Cython understands ndarrays! It looks like this code would break if x,y,and z are not C-contiguous -- should there be a check for that? -Chris > here (download using bzr, http://bazaar-vcs.org, or just grab the > files with your web browser): > > https://code.launchpad.net/~stefanv/+junk/xyz -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From andrea.gavana at gmail.com Thu May 22 18:19:22 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 22 May 2008 23:19:22 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <4835CC39.1050508@noaa.gov> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> <4835CC39.1050508@noaa.gov> Message-ID: Hi Chris and All, On Thu, May 22, 2008 at 8:40 PM, Christopher Barker wrote: > Andrea Gavana wrote: >> By the way, about the solution Francesc posted: >> >> xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ >> (yCent >= yMin) & (yCent <= yMax) & \ >> (zCent >= zMin) & (zCent <= zMax) >> >> xyzReq = numpy.nonzero(xyzReq)[0] >> >> Do you think is there any chance that a C extension (or something >> similar) could be faster? > > yep -- if I've be got this right, the above creates 7 temporary arrays. > creating that many and pushing the data in and out of memory can be > pretty slow for large arrays. > > In C, C++, Cython or Fortran, you can just do one loop, and one output > array. It should be much faster for the big arrays. Well, I have implemented it in 2 ways in Fortran, and actually the Fortran solutions are slower than the numpy one (2 and 3 times slower respectively). I attach the source code of the timing code and the 5 implementations I have at the moment (I have included Nathan's implementation, which is as fast as Francesc's one but it has the advantage of saving memory). The timing I get on my home PC are: Andrea's Solution: 0.42807561 Seconds/Trial Francesc's Solution: 0.018297884 Seconds/Trial Fortran Solution 1: 0.035862072 Seconds/Trial Fortran Solution 2: 0.029822338 Seconds/Trial Nathan's Solution: 0.018930507 Seconds/Trial Maybe my fortran coding is sloppy but I don't really know fortran so well to implement it better... Thank you so much to everybody for your suggestions so far :-D Andrea. 
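Nathan's memory-saving code is not quoted in the thread, but the idea can be sketched as follows (a minimal sketch, not necessarily his exact implementation, with made-up sample data standing in for the real grids): build a single boolean mask and fold each of the six tests into it with in-place &=, so that only one extra boolean temporary is alive at a time while the final indices come out the same as with the chained-& expression.

import numpy

# Stand-in data; the real xCent/yCent/zCent are assumed to be 1D arrays
# of cell-centre coordinates.
xCent = numpy.random.random(150000)
yCent = numpy.random.random(150000)
zCent = numpy.random.random(150000)
xMin, xMax = 0.2, 0.8
yMin, yMax = 0.2, 0.8
zMin, zMax = 0.2, 0.8

# One boolean mask; each comparison creates one temporary that is
# immediately folded into the mask in place.
mask = xCent >= xMin
mask &= (xCent <= xMax)
mask &= (yCent >= yMin)
mask &= (yCent <= yMax)
mask &= (zCent >= zMin)
mask &= (zCent <= zMax)

xyzReq = numpy.nonzero(mask)[0]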
"Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: MultipleBoolean.py URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MultipleBoolean3.f90 Type: application/octet-stream Size: 569 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MultipleBoolean4.f90 Type: application/octet-stream Size: 631 bytes Desc: not available URL: From andrea.gavana at gmail.com Thu May 22 18:22:21 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 22 May 2008 23:22:21 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> Message-ID: Hi Stefan, On Thu, May 22, 2008 at 10:23 PM, St?fan van der Walt wrote: > Hi Andrea > > 2008/5/22 Andrea Gavana : >> By the way, about the solution Francesc posted: >> >> xyzReq = (xCent >= xMin) & (xCent <= xMax) & \ >> (yCent >= yMin) & (yCent <= yMax) & \ >> (zCent >= zMin) & (zCent <= zMax) >> >> xyzReq = numpy.nonzero(xyzReq)[0] >> >> Do you think is there any chance that a C extension (or something >> similar) could be faster? Or something else using weave? I understand >> that this solution is already highly optimized as it uses the power of >> numpy with the logic operations in Python, but I was wondering if I >> can make it any faster: on my PC, the algorithm runs in 0.01 seconds, >> more or less, for 150,000 cells, but today I encountered a case in >> which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( >> Otherwise, I will try and implement it in Fortran and wrap it with >> f2py, assuming I am able to do it correctly and the overhead of >> calling an external extension is not killing the execution time. > > I wrote a quick proof of concept (no guarantees). You can find it > here (download using bzr, http://bazaar-vcs.org, or just grab the > files with your web browser): > > https://code.launchpad.net/~stefanv/+junk/xyz > > 1. Install Cython if you haven't already > 2. Run "python setup.py build_ext -i" to build the C extension > 3. Use the code, e.g., > > import xyz > out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5, > array([2.0, 4.0, 6.0]), 2, 4, > array([-1.0, -2.0, -4.0]), -3, -2) > > In the above case, out is [False, True, False]. Thank you very much for this! I am going to try it and time it, comparing it with the other implementations. I think I need to study a bit your code as I know almost nothing about Cython :-D Thank you! Andrea. "Imagination Is The Only Weapon In The War Against Reality." 
http://xoomer.alice.it/infinity77/ From robert.kern at gmail.com Thu May 22 19:07:33 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 22 May 2008 18:07:33 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> <3d375d730805221259x18d30dafp4ba39a7e96782876@mail.gmail.com> Message-ID: <3d375d730805221607q734983cai70fc87c93eb2930d@mail.gmail.com> On Thu, May 22, 2008 at 4:25 PM, Bruce Southey wrote: > On Thu, May 22, 2008 at 2:59 PM, Robert Kern wrote: >> On Thu, May 22, 2008 at 2:46 PM, Charles R Harris >> wrote: >>> It also leads to various inconsistencies: >>> >>> In [1]: float32(array([[1]])) >>> Out[1]: array([[ 1.]], dtype=float32) >>> >>> In [2]: float64(array([[1]])) >>> Out[2]: 1.0 >> >> Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). > > So, should these return an error if the argument is an ndarray object, > a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. > Otherwise, int, float and string type of arguments would be okay under > the assumption that people would like variable precision scalars. Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu May 22 19:20:50 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 17:20:50 -0600 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: <3d375d730805221607q734983cai70fc87c93eb2930d@mail.gmail.com> References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> <3d375d730805221259x18d30dafp4ba39a7e96782876@mail.gmail.com> <3d375d730805221607q734983cai70fc87c93eb2930d@mail.gmail.com> Message-ID: On Thu, May 22, 2008 at 5:07 PM, Robert Kern wrote: > On Thu, May 22, 2008 at 4:25 PM, Bruce Southey wrote: > > On Thu, May 22, 2008 at 2:59 PM, Robert Kern > wrote: > >> On Thu, May 22, 2008 at 2:46 PM, Charles R Harris > >> wrote: > >>> It also leads to various inconsistencies: > >>> > >>> In [1]: float32(array([[1]])) > >>> Out[1]: array([[ 1.]], dtype=float32) > >>> > >>> In [2]: float64(array([[1]])) > >>> Out[2]: 1.0 > >> > >> Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). > > > > So, should these return an error if the argument is an ndarray object, > > a list or similar? > > I think it was originally put in as a feature, but given the > inconsistency and the long-standing alternatives, I would deprecate > its use for converting array dtypes. But that's just my opinion. > I agree. Having too many ways to do things just makes for headaches. Should we schedule in a deprecation for anything other than scalars and strings. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Thu May 22 19:22:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 17:22:07 -0600 Subject: [Numpy-discussion] C API In-Reply-To: References: <4835E6F3.7050400@enthought.com> Message-ID: On Thu, May 22, 2008 at 3:55 PM, Charles R Harris wrote: > > > On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant < > oliphant at enthought.com> wrote: > >> Charles R Harris wrote: >> > All, >> > >> > I added a function to array_api_order.txt and apparently this changed >> > the order of the pointers in the API, which caused ctypes to segfault >> > until I removed the build directory and did a complete rebuild. It >> > seems to me that if we really want to make adding these API functions >> > safe, then we should only have one list instead of the current two. >> > This looks to require some mods to the build system. What do folks >> think? >> Yes, or a simple solution is to only append to one of the lists. At >> the very least, we should mark the array_api_order as not appendable. >> > > That doesn't work unless I change the tag from OBJECT_API to > MULTIARRAY_API. Do these tags really matter? Maybe we should just replace > them with API and merge this lists. At the beginning of 1.2, of course. > This doesn't look to hard to do. How about a unified NUMPY_API list? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu May 22 20:02:43 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 23 May 2008 02:02:43 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> Message-ID: <9457e7c80805221702y1c200264jeeac835f96dc11e1@mail.gmail.com> Hi Andrea 2008/5/23 Andrea Gavana : > Thank you very much for this! I am going to try it and time it, > comparing it with the other implementations. I think I need to study a > bit your code as I know almost nothing about Cython :-D That won't be necessary -- the Fortran-implementation is guaranteed to win! Just to make sure, I timed it anyway (on somewhat larger arrays): Francesc's Solution: 0.062797403 Seconds/Trial Fortran Solution 1: 0.050316906 Seconds/Trial Fortran Solution 2: 0.052595496 Seconds/Trial Nathan's Solution: 0.055562282 Seconds/Trial Cython Solution: 0.06250751 Seconds/Trial Nathan's version runs over the data 6 times, and still does better than the Pyrex version. I don't know why! But, hey, this algorithm is parallelisable! Wait, no, it's bedtime. Regards St?fan From oliphant at enthought.com Thu May 22 20:36:57 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 22 May 2008 19:36:57 -0500 Subject: [Numpy-discussion] C API In-Reply-To: References: <4835E6F3.7050400@enthought.com> Message-ID: <483611A9.3040407@enthought.com> Charles R Harris wrote: > > > On Thu, May 22, 2008 at 3:55 PM, Charles R Harris > > wrote: > > > > On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant > > wrote: > > Charles R Harris wrote: > > All, > > > > I added a function to array_api_order.txt and apparently > this changed > > the order of the pointers in the API, which caused ctypes to > segfault > > until I removed the build directory and did a complete > rebuild. 
It > > seems to me that if we really want to make adding these API > functions > > safe, then we should only have one list instead of the > current two. > > This looks to require some mods to the build system. What do > folks think? > Yes, or a simple solution is to only append to one of the > lists. At > the very least, we should mark the array_api_order as not > appendable. > > > That doesn't work unless I change the tag from OBJECT_API to > MULTIARRAY_API. Do these tags really matter? Maybe we should just > replace them with API and merge this lists. At the beginning of > 1.2, of course. > > > This doesn't look to hard to do. How about a unified NUMPY_API list? That's fine with me. I can't remember why there were 2 separate lists. -Travis From oliphant at enthought.com Thu May 22 20:38:10 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 22 May 2008 19:38:10 -0500 Subject: [Numpy-discussion] Different attributes for NumPy types In-Reply-To: References: <48358CD4.5010108@gmail.com> <4835904A.3040807@enthought.com> <3d375d730805221209m6358606csecebbf9f68c52c84@mail.gmail.com> <3d375d730805221259x18d30dafp4ba39a7e96782876@mail.gmail.com> <3d375d730805221607q734983cai70fc87c93eb2930d@mail.gmail.com> Message-ID: <483611F2.8010801@enthought.com> Charles R Harris wrote: > > > On Thu, May 22, 2008 at 5:07 PM, Robert Kern > wrote: > > On Thu, May 22, 2008 at 4:25 PM, Bruce Southey > wrote: > > On Thu, May 22, 2008 at 2:59 PM, Robert Kern > > wrote: > >> On Thu, May 22, 2008 at 2:46 PM, Charles R Harris > >> > > wrote: > >>> It also leads to various inconsistencies: > >>> > >>> In [1]: float32(array([[1]])) > >>> Out[1]: array([[ 1.]], dtype=float32) > >>> > >>> In [2]: float64(array([[1]])) > >>> Out[2]: 1.0 > >> > >> Okay, so don't do that. Always use x.astype(dtype) or > asarray(x, dtype). > > > > So, should these return an error if the argument is an ndarray > object, > > a list or similar? > > I think it was originally put in as a feature, but given the > inconsistency and the long-standing alternatives, I would deprecate > its use for converting array dtypes. But that's just my opinion. > > > I agree. Having too many ways to do things just makes for headaches. > Should we schedule in a deprecation for anything other than scalars > and strings. I don't have a strong opinion either way. -Travis From twaite at berkeley.edu Thu May 22 21:19:22 2008 From: twaite at berkeley.edu (Tom Waite) Date: Thu, 22 May 2008 18:19:22 -0700 Subject: [Numpy-discussion] triangular matrix fill Message-ID: I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops and the inner loop upper limit is the outer loop current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C the code is: for(i = 0; i < N; ++i){ for(j = 0; j < i; ++j){ Matrix[i*N + j] = V1[i] * V2[j]; } } Thanks Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 22 21:45:30 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 19:45:30 -0600 Subject: [Numpy-discussion] C API In-Reply-To: <483611A9.3040407@enthought.com> References: <4835E6F3.7050400@enthought.com> <483611A9.3040407@enthought.com> Message-ID: On Thu, May 22, 2008 at 6:36 PM, Travis E. 
Oliphant wrote: > Charles R Harris wrote: > > > > > > On Thu, May 22, 2008 at 3:55 PM, Charles R Harris > > > wrote: > > > > > > > > On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant > > > wrote: > > > > Charles R Harris wrote: > > > All, > > > > > > I added a function to array_api_order.txt and apparently > > this changed > > > the order of the pointers in the API, which caused ctypes to > > segfault > > > until I removed the build directory and did a complete > > rebuild. It > > > seems to me that if we really want to make adding these API > > functions > > > safe, then we should only have one list instead of the > > current two. > > > This looks to require some mods to the build system. What do > > folks think? > > Yes, or a simple solution is to only append to one of the > > lists. At > > the very least, we should mark the array_api_order as not > > appendable. > > > > > > That doesn't work unless I change the tag from OBJECT_API to > > MULTIARRAY_API. Do these tags really matter? Maybe we should just > > replace them with API and merge this lists. At the beginning of > > 1.2, of course. > > > > > > This doesn't look to hard to do. How about a unified NUMPY_API list? > That's fine with me. I can't remember why there were 2 separate lists. > OK. Another question, why do __ufunc_api.h and __multiarray_api.h have double underscores prefixes? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 22 22:07:24 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 22 May 2008 20:07:24 -0600 Subject: [Numpy-discussion] triangular matrix fill In-Reply-To: References: Message-ID: On Thu, May 22, 2008 at 7:19 PM, Tom Waite wrote: > I have a question on filling a lower triangular matrix using numpy. This > is essentially having two loops and the inner loop upper limit is the > outer loop current index. In the inner loop I have a vector being > multiplied by a constant set in the outer loop. For a matrix N*N in size, > the C the code is: > > for(i = 0; i < N; ++i){ > for(j = 0; j < i; ++j){ > Matrix[i*N + j] = V1[i] * V2[j]; > } > } > > You can use numpy.outer(V1,V2) and just ignore everything on and above the diagonal. In [1]: x = arange(3) In [2]: y = arange(3,6) In [3]: outer(x,y) Out[3]: array([[ 0, 0, 0], [ 3, 4, 5], [ 6, 8, 10]]) You can mask the upper part if you want: In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3)) Out[16]: array([[0, 0, 0], [3, 0, 0], [6, 8, 0]]) Or you could use fromfunction directly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 22 22:13:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 22 May 2008 21:13:25 -0500 Subject: [Numpy-discussion] triangular matrix fill In-Reply-To: References: Message-ID: <3d375d730805221913g5079424fxc51495ea2168a10b@mail.gmail.com> On Thu, May 22, 2008 at 9:07 PM, Charles R Harris wrote: > > On Thu, May 22, 2008 at 7:19 PM, Tom Waite wrote: >> >> I have a question on filling a lower triangular matrix using numpy. This >> is essentially having two loops and the inner loop upper limit is the >> outer loop current index. In the inner loop I have a vector being >> multiplied by a constant set in the outer loop. For a matrix N*N in size, >> the C the code is: >> >> for(i = 0; i < N; ++i){ >> for(j = 0; j < i; ++j){ >> Matrix[i*N + j] = V1[i] * V2[j]; >> } >> } >> > > You can use numpy.outer(V1,V2) and just ignore everything on and above the > diagonal. 
> > In [1]: x = arange(3) > > In [2]: y = arange(3,6) > > In [3]: outer(x,y) > Out[3]: > array([[ 0, 0, 0], > [ 3, 4, 5], > [ 6, 8, 10]]) > > You can mask the upper part if you want: > > In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3)) > Out[16]: > array([[0, 0, 0], > [3, 0, 0], > [6, 8, 0]]) > > Or you could use fromfunction directly. Or numpy.tril(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri May 23 02:10:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 May 2008 00:10:18 -0600 Subject: [Numpy-discussion] Buildbot errors. Message-ID: The python 2.6 buildbots are showing 5 failures that are being hidden by valgrind. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 23 02:25:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 May 2008 00:25:23 -0600 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: On Fri, May 23, 2008 at 12:10 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > The python 2.6 buildbots are showing 5 failures that are being hidden by > valgrind. > They seem to have fixed themselves, they were probably related to the API addition I made, then moved. However, it is a bad thing that the errors were covered up by Valgrind and not reported. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Fri May 23 05:36:07 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Fri, 23 May 2008 10:36:07 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805221702y1c200264jeeac835f96dc11e1@mail.gmail.com> References: <9457e7c80805220429i20ba12b1lbea7c8f0d8fe5e71@mail.gmail.com> <9457e7c80805221026q20be77a3q3f9f463455c982bd@mail.gmail.com> <3d375d730805221146h3a7b5ad5s6484ea750e7dc426@mail.gmail.com> <9457e7c80805221423g3b23454cu3e9e29099c6e77fb@mail.gmail.com> <9457e7c80805221702y1c200264jeeac835f96dc11e1@mail.gmail.com> Message-ID: Hi Stefan & All, On Fri, May 23, 2008 at 1:02 AM, St?fan van der Walt wrote: > Hi Andrea > > 2008/5/23 Andrea Gavana : >> Thank you very much for this! I am going to try it and time it, >> comparing it with the other implementations. I think I need to study a >> bit your code as I know almost nothing about Cython :-D > > That won't be necessary -- the Fortran-implementation is guaranteed to win! > > Just to make sure, I timed it anyway (on somewhat larger arrays): > > Francesc's Solution: 0.062797403 Seconds/Trial > Fortran Solution 1: 0.050316906 Seconds/Trial > Fortran Solution 2: 0.052595496 Seconds/Trial > Nathan's Solution: 0.055562282 Seconds/Trial > Cython Solution: 0.06250751 Seconds/Trial > > Nathan's version runs over the data 6 times, and still does better > than the Pyrex version. I don't know why! > > But, hey, this algorithm is parallelisable! Wait, no, it's bedtime. Thank you so much for testing, and thanks to the list for the kind help and suggestions. In any case, after all these different implementations, I think the only way to make it faster is to progressively reduce the size of the vectors on which I make the inequality tests, while somehow keeping the original vectors indices. 
Let me explain with a example: # step1 will be a vector with nCells elements, filled with True # or False values step1 = xCent >= xMin # step2 will be a vector with nonzero(step1) cells, filled # with True or False values step2 = xCent[step1] <= xMax # step3 will be a vector with nonzero(step2) cells, filled # with True or False values step3 = yCent[step2] >= yMin And so on. The probelm with this approach is that I lose the original indices for which I want all the inequality tests to succeed: for example, consider the following: >>> xMin, xMax = 3, 7 >>> xCent = numpy.arange(10) >>> step1 = xCent >= xMin >>> numpy.nonzero(step1)[0] array([3, 4, 5, 6, 7, 8, 9]) >>> step2 = xCent[step1] <= xMax >>> numpy.nonzero(step2)[0] array([0, 1, 2, 3, 4]) Which are no more the indices of the original vector as I shrunk down xCent to xCent[step1], and the real indices should be: >>> realStep = (xCent >= xMin) & (xCent <= xMax) >>> numpy.nonzero(realStep)[0] array([0, 1, 2, 3, 4, 5, 6, 7]) So, now the question is. If I iteratively shrink down the vectors as I did before, is there any way to get back the original indices for which all the conditions are satisfied? Sorry if this looks like a dummy/noob question, is just I am not sure on how to implement it. Thank you very much for your help. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From charlesr.harris at gmail.com Fri May 23 10:28:02 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 May 2008 08:28:02 -0600 Subject: [Numpy-discussion] SQUEEK, SQUEEK, trac ticket mailing list still busted. Message-ID: I've unsubscribed, resubscribed, changed addresses, and nothing works. Wassup with that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 23 10:35:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 May 2008 08:35:12 -0600 Subject: [Numpy-discussion] SQUEEK, SQUEEK, trac ticket mailing list still busted. In-Reply-To: References: Message-ID: On Fri, May 23, 2008 at 8:28 AM, Charles R Harris wrote: > I've unsubscribed, resubscribed, changed addresses, and nothing works. > Wassup with that? > Since the list works for Peter, I'm going to guess that something is filtering the messages into limbo somewhere along the line. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Fri May 23 11:16:11 2008 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Fri, 23 May 2008 16:16:11 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations Message-ID: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> Hi Andrea, 2008/5/23 "Andrea Gavana" : > And so on. The probelm with this approach is that I lose the original > indices for which I want all the inequality tests to succeed: To have the original indices you just need to re-index your indices, as it were idx = flatnonzero(xCent >= xMin) idx = idx[flatnonzero(xCent[idx] <= xMax)] idx = idx[flatnonzero(yCent[idx] >= yMin)] idx = idx[flatnonzero(yCent[idx] <= yMax)] ... (I haven't tested this code, apologies for bugs) However, there is a performance penalty for doing all this re-indexing (I once fell afoul of this), and if these conditions "mostly" evaluate to True you can often be better off with one of the solutions already suggested. 
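A quick runnable check of the re-indexing idea, using Andrea's arange(10) example from the previous message (just a sketch, with the variable names from the thread): each step returns indices expressed in terms of the original array, so the final idx matches the combined-mask answer.

import numpy
from numpy import flatnonzero

xCent = numpy.arange(10.0)
xMin, xMax = 3, 7

idx = flatnonzero(xCent >= xMin)             # indices into the original array
idx = idx[flatnonzero(xCent[idx] <= xMax)]   # still indices into the original array

print idx                                              # [3 4 5 6 7]
print flatnonzero((xCent >= xMin) & (xCent <= xMax))   # [3 4 5 6 7], the same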
Regards, Peter From andrea.gavana at gmail.com Fri May 23 11:50:49 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Fri, 23 May 2008 16:50:49 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> References: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> Message-ID: Hi Peter & All, On Fri, May 23, 2008 at 4:16 PM, Peter Creasey wrote: > Hi Andrea, > > 2008/5/23 "Andrea Gavana" : >> And so on. The probelm with this approach is that I lose the original >> indices for which I want all the inequality tests to succeed: > > To have the original indices you just need to re-index your indices, as it were > > idx = flatnonzero(xCent >= xMin) > idx = idx[flatnonzero(xCent[idx] <= xMax)] > idx = idx[flatnonzero(yCent[idx] >= yMin)] > idx = idx[flatnonzero(yCent[idx] <= yMax)] > ... > (I haven't tested this code, apologies for bugs) > > However, there is a performance penalty for doing all this re-indexing > (I once fell afoul of this), and if these conditions "mostly" evaluate > to True you can often be better off with one of the solutions already > suggested. Thank you for your answer. I have tried your suggestion, and the performances are more or less comparable with the other NumPy implementations (yours is roughly 1.2 times slower than the others), but I do gain some advantage when the subgrids are very small (i.e., most of the values in the first array are already False). I'll go and implement your solution when I have many small subgrids in my model. Thank you! Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From kwgoodman at gmail.com Fri May 23 13:16:53 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 23 May 2008 10:16:53 -0700 Subject: [Numpy-discussion] Any and all NaNs Message-ID: I'm writing unit tests for a module that contains matrices. I was surprised that these are True: >> import numpy.matlib as mp >> x = mp.matrix([[mp.nan]]) >> x.any() True >> x.all() True My use case is (x == y).all() where x and y are the same matrix except that x contains one NaN. Certianly x and y are not equal. >> x = mp.asmatrix(range(4)).reshape(2,2) >> y = mp.asmatrix(range(4)).reshape(2,2) >> x[0,0] = mp.nan >> (x == y).all() True From kwgoodman at gmail.com Fri May 23 13:22:09 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 23 May 2008 10:22:09 -0700 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: References: Message-ID: On Fri, May 23, 2008 at 10:16 AM, Keith Goodman wrote: > I'm writing unit tests for a module that contains matrices. I was > surprised that these are True: > >>> import numpy.matlib as mp >>> x = mp.matrix([[mp.nan]]) >>> x.any() > True >>> x.all() > True > > My use case is (x == y).all() where x and y are the same matrix except > that x contains one NaN. Certianly x and y are not equal. > >>> x = mp.asmatrix(range(4)).reshape(2,2) >>> y = mp.asmatrix(range(4)).reshape(2,2) >>> x[0,0] = mp.nan >>> (x == y).all() > True Sorry. Ignore the last example. I used integers. Here's the example with floats: >> x = 1.0 * mp.asmatrix(range(4)).reshape(2,2) >> y = 1.0 * mp.asmatrix(range(4)).reshape(2,2) >> x[0,0] = mp.nan >> x matrix([[ NaN, 1.], [ 2., 3.]]) >> (x == y).all() False But the first example >> x = mp.matrix([[mp.nan]]) >> x matrix([[ NaN]]) >> x.all() True >> x.any() True is still surprising. 
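(As Robert Kern explains in the reply below, any() and all() on non-boolean data test each element against zero, and NaN != 0, so a lone NaN counts as True.) For the unit-test use case, one NaN-aware comparison is sketched here; the isnan-based check is an addition for illustration, not something proposed in the thread. It treats two entries as matching when they are equal or when both are NaN.

import numpy as np

x = np.matrix([[np.nan, 1.0], [2.0, 3.0]])
y = np.matrix([[np.nan, 1.0], [2.0, 3.0]])

# Equal, or NaN in both positions.
print ((x == y) | (np.isnan(x) & np.isnan(y))).all()   # True

z = y.copy()
z[0, 0] = 0.0
print ((x == z) | (np.isnan(x) & np.isnan(z))).all()   # False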
From robert.kern at gmail.com Fri May 23 14:44:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 May 2008 13:44:14 -0500 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: References: Message-ID: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> On Fri, May 23, 2008 at 12:22 PM, Keith Goodman wrote: > But the first example > >>> x = mp.matrix([[mp.nan]]) >>> x > matrix([[ NaN]]) >>> x.all() > True >>> x.any() > True > > is still surprising. On non-boolean arrays, .all() and .any() check each element to see if it is not equal to 0. NaN != 0. Returning False would be just as wrong. If there were a Maybe in addition to True and False, then perhaps that would be worth changing, but I don't see a reason to change the rule as it is. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Fri May 23 15:00:01 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 23 May 2008 12:00:01 -0700 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> References: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> Message-ID: On Fri, May 23, 2008 at 11:44 AM, Robert Kern wrote: > On Fri, May 23, 2008 at 12:22 PM, Keith Goodman wrote: > >> But the first example >> >>>> x = mp.matrix([[mp.nan]]) >>>> x >> matrix([[ NaN]]) >>>> x.all() >> True >>>> x.any() >> True >> >> is still surprising. > > On non-boolean arrays, .all() and .any() check each element to see if > it is not equal to 0. NaN != 0. Returning False would be just as > wrong. If there were a Maybe in addition to True and False, then > perhaps that would be worth changing, but I don't see a reason to > change the rule as it is. That makes sense. Hopefully it will find its way into the doc string. If I want NaNs to be False I can always do (x == x).all() instead of x.all() From mark.miller at usu.edu Fri May 23 17:00:36 2008 From: mark.miller at usu.edu (Mark Miller) Date: Fri, 23 May 2008 14:00:36 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? Message-ID: To anyone who can help: I recently got around to installing numpy 1.04 over an older version (numpy 1.04dev3982) on a Windows Vista machine. Since then, I have been unable to compile some of my extensions using f2py. I also tested a fresh install of numpy 1.04 on a new XP machine that has never seen Python and am getting the same messages. Here's the relevant bits, I think. 
Wrote C/API module "pickparents" to file "c:\users\mark\appdata\local\temp\tmpiuxw9j\src.win32-2.5/pickparentsmodule.c" Traceback (most recent call last): File "C:\python25\scripts\f2py.py", line 26, in main() File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 558, in main run_compile() File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 545, in run_compile setup(ext_modules = [ext]) File "C:\Python25\lib\site-packages\numpy\distutils\core.py", line 176, in setup return old_setup(**new_attr) File "C:\Python25\lib\distutils\core.py", line 151, in setup dist.run_commands() File "C:\Python25\lib\distutils\dist.py", line 974, in run_commands self.run_command(cmd) File "C:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "C:\Python25\lib\distutils\command\build.py", line 112, in run self.run_command(cmd_name) File "C:\Python25\lib\distutils\cmd.py", line 333, in run_command self.distribution.run_command(command) File "C:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 130, in run self.build_sources() File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 147, in build_sources self.build_extension_sources(ext) File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 256, in build_extension_sources sources = self.f2py_sources(sources, ext) File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 513, in f2py_sources ['-m',ext_name]+f_sources) File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 367, in run_main ret=buildmodules(postlist) File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 319, in buildmodules dict_append(ret[mnames[i]],rules.buildmodule(modules[i],um)) File "C:\Python25\lib\site-packages\numpy\f2py\rules.py", line 1222, in buildmodule for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): TypeError: cannot concatenate 'str' and 'list' objects Any thoughts? Please let me know if more information is needed to troubleshoot. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Fri May 23 17:31:30 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 23 May 2008 14:31:30 -0700 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent Message-ID: Hello, I plan to branch 1.1.x and tag 1.1.0 later today. As of now, please consider the trunk in an near absolute freeze. If you feel that there is some unimaginably important change that must take place before we branch and tag, please send an email to the mailing list including your proposed patch and an explanation of why it is absolutely necessary. If the list overwhelmingly agrees with you, I will consider applying the patch before branching. If there is any reason why I shouldn't branch right now, please let me know ASAP! Thanks to everyone who put so much time and effort into this release. The trunk has seen tremendous improvements in terms of bug-fixing, increased testing and improved documentation. I am looking forward to a whole host of improvements over the summer. Once I create the new branch, I will designate the trunk ready for 1.2 development. Despite the increased minor number, please remain cautious with your changes. As always any changes to the trunk shouldn't break it. If you have more experimental work that you want to try, please create a branch. 
Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From markperrymiller at gmail.com Fri May 23 17:32:11 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 14:32:11 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? Message-ID: To anyone who can help: I recently got around to installing numpy 1.04 over an older version (numpy 1.04dev3982) on a Windows Vista machine. Since then, I have been unable to compile some of my extensions using f2py. I also tested a fresh install of numpy 1.04 on a new XP machine that has never seen Python and am getting the same messages. Here's the relevant bits, I think. Wrote C/API module "pickparents" to file "c:\users\mark\appdata\local \temp\tmpiuxw9j\src.win32-2.5/pickparentsmodule.c" Traceback (most recent call last): File "C:\python25\scripts\f2py.py", line 26, in main() File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 558, in main run_compile() File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 545, in run_compile setup(ext_modules = [ext]) File "C:\Python25\lib\site-packages\numpy\distutils\core.py", line 176, in setup return old_setup(**new_attr) File "C:\Python25\lib\distutils\core.py", line 151, in setup dist.run_commands() File "C:\Python25\lib\distutils\dist.py", line 974, in run_commands self.run_command(cmd) File "C:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "C:\Python25\lib\distutils\command\build.py", line 112, in run self.run_command(cmd_name) File "C:\Python25\lib\distutils\cmd.py", line 333, in run_command self.distribution.run_command(command) File "C:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 130, in run self.build_sources() File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 147, in build_sources self.build_extension_sources(ext) File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 256, in build_extension_sources sources = self.f2py_sources(sources, ext) File "C:\Python25\lib\site-packages\numpy\distutils\command\build_src.py", line 513, in f2py_sources ['-m',ext_name]+f_sources) File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 367, in run_main ret=buildmodules(postlist) File "C:\Python25\lib\site-packages\numpy\f2py\f2py2e.py", line 319, in buildmodules dict_append(ret[mnames[i]],rules.buildmodule(modules[i],um)) File "C:\Python25\lib\site-packages\numpy\f2py\rules.py", line 1222, in buildmodule for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): TypeError: cannot concatenate 'str' and 'list' objects Any thoughts? Please let me know if more information is needed to troubleshoot. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri May 23 17:45:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 May 2008 16:45:18 -0500 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: References: Message-ID: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> On Fri, May 23, 2008 at 4:00 PM, Mark Miller wrote: > File "C:\Python25\lib\site-packages\numpy\f2py\rules.py", line 1222, in > buildmodule > for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): > TypeError: cannot concatenate 'str' and 'list' objects > > > Any thoughts? 
Please let me know if more information is needed to > troubleshoot. This is a bug that was fixed in SVN r4335. http://projects.scipy.org/scipy/numpy/changeset/4335 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From markperrymiller at gmail.com Fri May 23 17:48:47 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 14:48:47 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> Message-ID: Super...I'll give it a try. Or should I just wait for the numpy 1.1 release? thanks, -Mark On Fri, May 23, 2008 at 2:45 PM, Robert Kern wrote: > On Fri, May 23, 2008 at 4:00 PM, Mark Miller wrote: > > > File "C:\Python25\lib\site-packages\numpy\f2py\rules.py", line 1222, in > > buildmodule > > for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): > > TypeError: cannot concatenate 'str' and 'list' objects > > > > > > Any thoughts? Please let me know if more information is needed to > > troubleshoot. > > This is a bug that was fixed in SVN r4335. > > http://projects.scipy.org/scipy/numpy/changeset/4335 > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri May 23 18:01:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 May 2008 17:01:04 -0500 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> Message-ID: <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> On Fri, May 23, 2008 at 4:48 PM, Mark Miller wrote: > Super...I'll give it a try. Or should I just wait for the numpy 1.1 > release? Probably. You can get a binary installer for the release candidate here: http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From markperrymiller at gmail.com Fri May 23 18:48:02 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 15:48:02 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> Message-ID: Thank you...getting much closer now. My current issue is this message: running build_ext error: don't know how to compile C/C++ code on platform 'nt' with 'g95' compiler. Any help? Again, sorry to pester. I'm just pretty unfamiliar with these things. Once I get environmental variables set up, I rarely need to fiddle with them again. So I don't have a specific feel for what might be happening here. 
thanks, -Mark On Fri, May 23, 2008 at 3:01 PM, Robert Kern wrote: > On Fri, May 23, 2008 at 4:48 PM, Mark Miller > wrote: > > Super...I'll give it a try. Or should I just wait for the numpy 1.1 > > release? > > Probably. You can get a binary installer for the release candidate here: > > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri May 23 18:50:28 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 May 2008 17:50:28 -0500 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> Message-ID: <3d375d730805231550l49d1682dj6e0378f10c33c5e5@mail.gmail.com> On Fri, May 23, 2008 at 5:48 PM, Mark Miller wrote: > Thank you...getting much closer now. > > My current issue is this message: > > running build_ext > error: don't know how to compile C/C++ code on platform 'nt' with 'g95' > compiler. > > Any help? What command line are you using? Do you have a setup.cfg or pydistutils.cfg file that you are using? Can you show us the full output? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From markperrymiller at gmail.com Fri May 23 18:51:22 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 15:51:22 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> Message-ID: Ignore last message: I seem to have figured out the next environmental variable that needed to be set. Still some lingering issues, but I'll work on them some more before pestering here again. thanks, -Mark On Fri, May 23, 2008 at 3:48 PM, Mark Miller wrote: > Thank you...getting much closer now. > > My current issue is this message: > > running build_ext > error: don't know how to compile C/C++ code on platform 'nt' with 'g95' > compiler. > > Any help? > > Again, sorry to pester. I'm just pretty unfamiliar with these things. > Once I get environmental variables set up, I rarely need to fiddle with them > again. So I don't have a specific feel for what might be happening here. > > thanks, > > -Mark > > > > > > On Fri, May 23, 2008 at 3:01 PM, Robert Kern > wrote: > >> On Fri, May 23, 2008 at 4:48 PM, Mark Miller >> wrote: >> > Super...I'll give it a try. Or should I just wait for the numpy 1.1 >> > release? >> >> Probably. 
You can get a binary installer for the release candidate here: >> >> >> http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From markperrymiller at gmail.com Fri May 23 18:59:43 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 15:59:43 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: <3d375d730805231550l49d1682dj6e0378f10c33c5e5@mail.gmail.com> References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> <3d375d730805231550l49d1682dj6e0378f10c33c5e5@mail.gmail.com> Message-ID: In this case, I am just using the Windows command prompt. I do not have a setup.cfg or pydistutils.cfg file. I did create a file in Python25\Lib\distutils called distutils.cfg containing 2 lines: [build] compiler = mingw32 That took care of the previous message. I am currently getting a 'failed with exit status 1' message, that for the life of me I can't remember what causes it. I have attached the full (albeit tedius) output from an attempt, if someone is willing to wade through it. -Mark On Fri, May 23, 2008 at 3:50 PM, Robert Kern wrote: > On Fri, May 23, 2008 at 5:48 PM, Mark Miller > wrote: > > Thank you...getting much closer now. > > > > My current issue is this message: > > > > running build_ext > > error: don't know how to compile C/C++ code on platform 'nt' with 'g95' > > compiler. > > > > Any help? > > What command line are you using? Do you have a setup.cfg or > pydistutils.cfg file that you are using? Can you show us the full > output? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: error message.txt URL: From robert.kern at gmail.com Fri May 23 19:05:47 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 May 2008 18:05:47 -0500 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> <3d375d730805231550l49d1682dj6e0378f10c33c5e5@mail.gmail.com> Message-ID: <3d375d730805231605l6daad21cl4bc9aa366f8abb2e@mail.gmail.com> On Fri, May 23, 2008 at 5:59 PM, Mark Miller wrote: > In this case, I am just using the Windows command prompt. I do not have a > setup.cfg or pydistutils.cfg file. I did create a file in > Python25\Lib\distutils called distutils.cfg containing 2 lines: > > [build] > compiler = mingw32 > > That took care of the previous message. 
I am currently getting a 'failed > with exit status 1' message, that for the life of me I can't remember what > causes it. > > I have attached the full (albeit tedius) output from an attempt, if someone > is willing to wade through it. The important line is this one: ld: dllcrt2.o: No such file: No such file or directory This looks like a problem with g95. Either it is misconfigured or we aren't passing it the right flags. Can you check to see if there is a dllcrt2.o file somewhere in your g95 installation? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From markperrymiller at gmail.com Fri May 23 19:12:45 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 16:12:45 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: <3d375d730805231605l6daad21cl4bc9aa366f8abb2e@mail.gmail.com> References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> <3d375d730805231550l49d1682dj6e0378f10c33c5e5@mail.gmail.com> <3d375d730805231605l6daad21cl4bc9aa366f8abb2e@mail.gmail.com> Message-ID: It appears to be there: dllcrt2.o in g95\lib. I'll re-install g95 to see if it helps. I'll also give gfortran in the meantime too. -Mark On Fri, May 23, 2008 at 4:05 PM, Robert Kern wrote: > On Fri, May 23, 2008 at 5:59 PM, Mark Miller > wrote: > > In this case, I am just using the Windows command prompt. I do not have > a > > setup.cfg or pydistutils.cfg file. I did create a file in > > Python25\Lib\distutils called distutils.cfg containing 2 lines: > > > > [build] > > compiler = mingw32 > > > > That took care of the previous message. I am currently getting a 'failed > > with exit status 1' message, that for the life of me I can't remember > what > > causes it. > > > > I have attached the full (albeit tedius) output from an attempt, if > someone > > is willing to wade through it. > > The important line is this one: > > ld: dllcrt2.o: No such file: No such file or directory > > This looks like a problem with g95. Either it is misconfigured or we > aren't passing it the right flags. Can you check to see if there is a > dllcrt2.o file somewhere in your g95 installation? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From markperrymiller at gmail.com Fri May 23 19:26:01 2008 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 23 May 2008 16:26:01 -0700 Subject: [Numpy-discussion] f2py errors: any help interpreting? In-Reply-To: References: <3d375d730805231445u635bf170l3a6fae5f0beb4410@mail.gmail.com> <3d375d730805231501h564d030eod7e7b886f20c517b@mail.gmail.com> <3d375d730805231550l49d1682dj6e0378f10c33c5e5@mail.gmail.com> <3d375d730805231605l6daad21cl4bc9aa366f8abb2e@mail.gmail.com> Message-ID: gfortran is doing the trick. Must be a g95 misconfiguration or some other thing that I have no ability to comprehend. Thanks for the tip about the buggy numpy 1.04. That seemed to be the most serious hurdle. 
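For anyone retracing this later: the working combination boils down to using mingw32 for the C side and gfortran for the Fortran side. A hedged sketch of the equivalent one-shot command line (the module and source names are taken from the earlier traceback and are otherwise assumptions):

f2py -c --fcompiler=gnu95 --compiler=mingw32 -m pickparents pickparents.f90

Here gnu95 is the numpy.distutils name for gfortran, and --compiler=mingw32 does on the command line what the [build] compiler = mingw32 entry in distutils.cfg does globally.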
-Mark On Fri, May 23, 2008 at 4:12 PM, Mark Miller wrote: > It appears to be there: dllcrt2.o in g95\lib. > > I'll re-install g95 to see if it helps. I'll also give gfortran in the > meantime too. > > -Mark > > > On Fri, May 23, 2008 at 4:05 PM, Robert Kern > wrote: > >> On Fri, May 23, 2008 at 5:59 PM, Mark Miller >> wrote: >> > In this case, I am just using the Windows command prompt. I do not have >> a >> > setup.cfg or pydistutils.cfg file. I did create a file in >> > Python25\Lib\distutils called distutils.cfg containing 2 lines: >> > >> > [build] >> > compiler = mingw32 >> > >> > That took care of the previous message. I am currently getting a >> 'failed >> > with exit status 1' message, that for the life of me I can't remember >> what >> > causes it. >> > >> > I have attached the full (albeit tedius) output from an attempt, if >> someone >> > is willing to wade through it. >> >> The important line is this one: >> >> ld: dllcrt2.o: No such file: No such file or directory >> >> This looks like a problem with g95. Either it is misconfigured or we >> aren't passing it the right flags. Can you check to see if there is a >> dllcrt2.o file somewhere in your g95 installation? >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 01:14:33 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 May 2008 23:14:33 -0600 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: References: Message-ID: On Fri, May 23, 2008 at 3:31 PM, Jarrod Millman wrote: > Hello, > > I plan to branch 1.1.x and tag 1.1.0 later today. As of now, please > consider the trunk in an near absolute freeze. If you feel that there > is some unimaginably important change that must take place before we > branch and tag, please send an email to the mailing list including > your proposed patch and an explanation of why it is absolutely > necessary. If the list overwhelmingly agrees with you, I will > consider applying the patch before branching. If there is any reason > why I shouldn't branch right now, please let me know ASAP! > > Thanks to everyone who put so much time and effort into this release. > The trunk has seen tremendous improvements in terms of bug-fixing, > increased testing and improved documentation. I am looking forward to > a whole host of improvements over the summer. Once I create the new > branch, I will designate the trunk ready for 1.2 development. Despite > the increased minor number, please remain cautious with your changes. > As always any changes to the trunk shouldn't break it. If you have > more experimental work that you want to try, please create a branch. > Will you just kick the d*mn thing out the door? TIA Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From millman at berkeley.edu Sat May 24 02:16:33 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 23 May 2008 23:16:33 -0700 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: References: Message-ID: On Fri, May 23, 2008 at 10:14 PM, Charles R Harris wrote: > Will you just kick the d*mn thing out the door? I would be happy to my only concern is that I would like to avoid releasing something that is broken. Can I safely ignore the 5 buildbot failures that you pointed out yesterday? ====================================================================== FAIL: test_divide (test_errstate.TestErrstate) ---------------------------------------------------------------------- Traceback (most recent call last): File "", line 38, in test_divide AssertionError ====================================================================== FAIL: test_invalid (test_errstate.TestErrstate) ---------------------------------------------------------------------- Traceback (most recent call last): File "", line 24, in test_invalid AssertionError ====================================================================== FAIL: test_divideerr (numpy.core.tests.test_numeric.TestSeterr) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/buildbot/numpy/b2/numpy-install/lib/python2.6/site-packages/numpy/core/tests/test_numeric.py", line 196, in test_divideerr self.fail() AssertionError ====================================================================== FAIL: test_divide (numpy.core.tests.test_errstate.TestErrstate) ---------------------------------------------------------------------- Traceback (most recent call last): File "", line 38, in test_divide AssertionError ====================================================================== FAIL: test_invalid (numpy.core.tests.test_errstate.TestErrstate) ---------------------------------------------------------------------- Traceback (most recent call last): File "", line 24, in test_invalid AssertionError If those failures aren't real, I will go ahead and branch. Sorry that this process has been so difficult and long. We can talk about whether there is a better way to do releases after I finalize 1.1.0. It may make sense to start rotating release management as David suggested. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Sat May 24 02:25:26 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 23 May 2008 23:25:26 -0700 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: On Thu, May 22, 2008 at 11:25 PM, Charles R Harris wrote: > On Fri, May 23, 2008 at 12:10 AM, Charles R Harris > wrote: >> >> The python 2.6 buildbots are showing 5 failures that are being hidden by >> valgrind. > > They seem to have fixed themselves, they were probably related to the API > addition I made, then moved. However, it is a bad thing that the errors were > covered up by Valgrind and not reported. I didn't understand what you meant by this yesterday, but I see what you are talking about now: http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio You said they fixed themselves, but the failures are in the most recent buildbot reports. 
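For anyone trying to reproduce them locally, the failing checks boil down to roughly the following (a minimal sketch, not the actual test code; the real tests live in test_errstate.py and test_numeric.py):

    import numpy as np

    # ask numpy to raise FloatingPointError instead of quietly returning inf/nan
    old = np.seterr(divide='raise', invalid='raise')
    try:
        try:
            np.array([1.0]) / np.array([0.0])   # should raise (divide by zero)
        except FloatingPointError:
            print "divide raised, as expected"
        try:
            np.sqrt(-np.arange(3.0))            # should raise (invalid operation)
        except FloatingPointError:
            print "invalid raised, as expected"
    finally:
        np.seterr(**old)                        # restore the previous error state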
This is the only thing I am concerned about before branching, so hopefully someone can look at this and let me know whether the failures are indeed fixed. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From david at ar.media.kyoto-u.ac.jp Sat May 24 02:16:29 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 24 May 2008 15:16:29 +0900 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: References: Message-ID: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> Jarrod Millman wrote: > > I would be happy to my only concern is that I would like to avoid > releasing something that is broken. Can I safely ignore the 5 > buildbot failures that you pointed out yesterday? > Where do those errors appear ? I don't see them on the builbot. Are they 2.6 specific ? If yes, I would say ignore them, because 2.6 is not released yet, and is scheduled for september. cheers, David From millman at berkeley.edu Sat May 24 02:35:06 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 23 May 2008 23:35:06 -0700 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> References: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> Message-ID: On Fri, May 23, 2008 at 11:16 PM, David Cournapeau wrote: > Where do those errors appear ? I don't see them on the builbot. Are they > 2.6 specific ? If yes, I would say ignore them, because 2.6 is not > released yet, and is scheduled for september. You can see them here: http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio You have to search for them, because as Chuck pointed out it seems valgrind seems to be hiding them. We should figure out how to avoid this at some point down the road as well. I am not sure whether or not this is 2.6 specific or not. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From david at ar.media.kyoto-u.ac.jp Sat May 24 02:28:55 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 24 May 2008 15:28:55 +0900 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: References: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> Message-ID: <4837B5A7.3070908@ar.media.kyoto-u.ac.jp> Jarrod Millman wrote: > > You can see them here: > http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio > http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio > > You have to search for them, because as Chuck pointed out it seems > valgrind seems to be hiding them. We should figure out how to avoid > this at some point down the road as well. > > I am not sure whether or not this is 2.6 specific or not. > I don't see any error with r5226 on Ubuntu (32 bits). So either it is fedora specific, 64 bits specific, or python 2.6 specific (or not being exclusive). Which 2.6 version is used ? I am testing now with python 2.6 alpha 3 on Ubuntu 32 bits, and I can also do it in 64 bits. FC would take a lot of time for me, so someone with Fedora would be more suitable to do it. I too think we should make the release ASAP, I am tired of it (as you certainly are :) ). 
cheers, David From peridot.faceted at gmail.com Sat May 24 03:28:33 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 24 May 2008 03:28:33 -0400 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: 2008/5/24 Jarrod Millman : > On Thu, May 22, 2008 at 11:25 PM, Charles R Harris > wrote: >> On Fri, May 23, 2008 at 12:10 AM, Charles R Harris >> wrote: >>> >>> The python 2.6 buildbots are showing 5 failures that are being hidden by >>> valgrind. >> >> They seem to have fixed themselves, they were probably related to the API >> addition I made, then moved. However, it is a bad thing that the errors were >> covered up by Valgrind and not reported. > > I didn't understand what you meant by this yesterday, but I see what > you are talking about now: > http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio > http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio > > You said they fixed themselves, but the failures are in the most > recent buildbot reports. This is the only thing I am concerned about > before branching, so hopefully someone can look at this and let me > know whether the failures are indeed fixed. They do not appear on my machine (a pentium-M running Ubuntu). I should point out that they are only actually three distinct errors, because one of the test suites is being run twice.They are not exactly subtle tests; they're checking that seterr induces the raising of exceptions from np.arange(3)/0, np.sqrt(-np.arange(3)), and np.array([1.])/np.array([0.]). The tests pass on the x86_64 machine I have access to (a multiprocessor Opteron running Knoppix of all things). That is, it only has python 2.4 (don't ask) so the tests can't be run, but running the same tests by hand produces the expected results. This particular feature - seterr - is the sort of thing an overaggressive optimizer can easily butcher, though, so it could easily be the result of the particular configuration on the buildbot machine. I think somebody with access to the buildbot machine needs to see what's going on. In particular: does a manually-compiled numpy exhibit the problem? How clean does the buildbot make its environment? Do the functions behave correctly from an interactive session? Do other seterr conditions have the same problem? Anne P.S. Please ignore the alarming-looking buildbot failure; it is due to operator headspace. -A From david at ar.media.kyoto-u.ac.jp Sat May 24 03:27:05 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 24 May 2008 16:27:05 +0900 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: <4837C349.9030404@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > > They do not appear on my machine (a pentium-M running Ubuntu). I > should point out that they are only actually three distinct errors, > because one of the test suites is being run twice.They are not exactly > subtle tests; they're checking that seterr induces the raising of > exceptions from np.arange(3)/0, np.sqrt(-np.arange(3)), and > np.array([1.])/np.array([0.]). > (resending, I did not see that build logs were so big) Here are the build logs for 3 configurations (all on r5226): - Ubuntu hardy on 32 bits + python2.6 - Ubuntu hardy on 64 bits + python2.6 (inside vmware) - RHEL 5 64 bits + python2.5 None of them show the error. 
But RHEL (which is arguably the nearest to FC) do not have all the seterr tests, I don't know why: test_divide is not run with RHEL, only test_dividerr is). Can the error with seterr linked to CPU fpu state flag ? cheers, David -------------- next part -------------- A non-text attachment was scrubbed... Name: logs.tbz2 Type: application/x-bzip-compressed-tar Size: 10293 bytes Desc: not available URL: From david at ar.media.kyoto-u.ac.jp Sat May 24 03:31:55 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 24 May 2008 16:31:55 +0900 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: <4837C46B.1090102@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > This particular > feature - seterr - is the sort of thing an overaggressive optimizer > can easily butcher, though, so it could easily be the result of the > particular configuration on the buildbot machine. gcc claims to be IEEE compliant at all level of optimizations (-O*): """ -ffast-math: Sets -fno-math-errno, -funsafe-math-optimizations, -fno-trapping-math, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans and fcx-limited-range. This option causes the preprocessor macro "__FAST_MATH__" to be defined. This option should never be turned on by any -O option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions. """ So I don't think that's the problem. cheers, David From millman at berkeley.edu Sat May 24 04:48:02 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 24 May 2008 01:48:02 -0700 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: <4837B5A7.3070908@ar.media.kyoto-u.ac.jp> References: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> <4837B5A7.3070908@ar.media.kyoto-u.ac.jp> Message-ID: On Fri, May 23, 2008 at 11:28 PM, David Cournapeau wrote: > I too think we should make the release ASAP, I am tired of it (as you > certainly are :) ). Absolutely! I just created the 1.1.x maintenance branch: http://projects.scipy.org/scipy/numpy/changeset/5227 The trunk is now open for 1.2 development: http://projects.scipy.org/scipy/numpy/changeset/5228 I know several people have code for 1.2 who have been patiently waiting for me. Please feel free to go ahead and commit your code to the trunk now. Alan McIntyre officially starts his Google Summer of Code project on Monday. One of his first tasks will be to help convert NumPy to the nose testing framework. That will probably start in a branch and then get merged all at once just like Matthew Brett did for SciPy. Also Stefan should feel free to start committing all the documentation work to the trunk now. If no one has any more input on the recent buildbot failures, I will tag 1.1.0 from the branch tomorrow morning. Chris Burns is on vacation (it is memorial day in the States) until Monday, so he won't be able to create the Mac binaries until then. Most likely, since David is around the Windows binary will be available shortly after I tag the release. I will announce the NumPy 1.1.0 release late Monday night or early on Tuesday. Thanks to everyone who worked so hard getting this release out. Sorry that the release management was so uneven and frustrating. I will be happy to discuss what could have been done differently or better. 
I have tried to keep the release notes up-to-date as we went; but if you have any time, now would be a great time for a final pass: http://projects.scipy.org/scipy/numpy/milestone/1.1.0 Finally, I have also updated Trac so that new tickets default to the 1.2 milestone. If you create a ticket for 1.1.1, please be sure to change the milestone from the default. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From nwagner at iam.uni-stuttgart.de Sat May 24 08:54:04 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 24 May 2008 14:54:04 +0200 Subject: [Numpy-discussion] numpy.test() (failures=2, errors=3) Message-ID: Hi all, I found two failures and three errors wrt numpy.test() numpy.__version__ '1.2.0.dev5228' ====================================================================== ERROR: Ticket #396 ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/core/tests/test_regression.py", line 602, in check_poly1d_nan_roo self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") File "/usr/lib64/python2.5/unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/polynomial.py", line 661, in __getattr__ return roots(self.coeffs) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/polynomial.py", line 124, in roots roots = _eigvals(A) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/polynomial.py", line 40, in _eigvals return eigvals(arg) File "/usr/local/lib64/python2.5/site-packages/scipy/linalg/decomp.py", line 478, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File "/usr/local/lib64/python2.5/site-packages/scipy/linalg/decomp.py", line 150, in eig a1 = asarray_chkfinite(a) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/function_base.py", line 527, in asarray_chkfinite raise ValueError, "array must not contain infs or NaNs" ValueError: array must not contain infs or NaNs ====================================================================== ERROR: Ticket #396 ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/core/tests/test_regression.py", line 602, in check_poly1d_nan_roo self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") File "/usr/lib64/python2.5/unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/polynomial.py", line 661, in __getattr__ return roots(self.coeffs) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/polynomial.py", line 124, in roots roots = _eigvals(A) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/polynomial.py", line 37, in _eigvals return eigvals(arg) File "/usr/local/lib64/python2.5/site-packages/scipy/linalg/decomp.py", line 478, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File "/usr/local/lib64/python2.5/site-packages/scipy/linalg/decomp.py", line 150, in eig a1 = asarray_chkfinite(a) File "/usr/local/lib64/python2.5/site-packages/numpy/lib/function_base.py", line 527, in asarray_chkfinite raise ValueError, "array must not contain infs or NaNs" ValueError: array must not contain infs or NaNs ====================================================================== ERROR: test_hdquantiles 
(numpy.ma.tests.test_morestats.TestQuantiles) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 97, in test_hdquantiles hdq = hdquantiles_sd(data,[0.25, 0.5, 0.75]) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/morestats.py", line 168, in hdquantiles_sd result = _hdsd_1D(data.compressed(), p) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/morestats.py", line 144, in _hdsd_1D xsorted = numpy.sort(data.compressed()) AttributeError: 'numpy.ndarray' object has no attribute 'compressed' ====================================================================== FAIL: Tests the Marits-Jarrett estimator ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 36, in test_mjci assert_almost_equal(mjci(data),[55.76819,45.84028,198.8788],5) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 134, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 227, in assert_array_almost_equal header='Arrays are not almost equal') File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 193, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 33.3333333333%) x: array([ 55.76818915, 45.84027529, 198.8787528 ]) y: array([ 55.76819, 45.84028, 198.8788 ]) ====================================================================== FAIL: Test quantiles 1D - w/ mask. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_mstats.py", line 61, in test_1d_mask [24.833333, 50.0, 75.166666]) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 134, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 227, in assert_array_almost_equal header='Arrays are not almost equal') File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 193, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 66.6666666667%) x: array([ 24.83333333, 50. , 75.16666667]) y: array([ 24.833333, 50. , 75.166666]) ---------------------------------------------------------------------- Ran 1290 tests in 1.993s FAILED (failures=2, errors=3) From mattknox.ca at gmail.com Sat May 24 10:46:40 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Sat, 24 May 2008 14:46:40 +0000 (UTC) Subject: [Numpy-discussion] numpy.test() (failures=2, errors=3) References: Message-ID: > ====================================================================== > ERROR: test_hdquantiles > (numpy.ma.tests.test_morestats.TestQuantiles) > ---------------------------------------------------------------------- You have some kind of franken-build going on there. test_morestats has long since been deleted from svn. Try wiping out your numpy installation and starting from scratch. - Matt From charlesr.harris at gmail.com Sat May 24 10:47:13 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 08:47:13 -0600 Subject: [Numpy-discussion] Buildbot errors. 
In-Reply-To: References: Message-ID: On Sat, May 24, 2008 at 12:25 AM, Jarrod Millman wrote: > On Thu, May 22, 2008 at 11:25 PM, Charles R Harris > wrote: > > On Fri, May 23, 2008 at 12:10 AM, Charles R Harris > > wrote: > >> > >> The python 2.6 buildbots are showing 5 failures that are being hidden by > >> valgrind. > > > > They seem to have fixed themselves, they were probably related to the > API > > addition I made, then moved. However, it is a bad thing that the errors > were > > covered up by Valgrind and not reported. > > I didn't understand what you meant by this yesterday, but I see what > you are talking about now: > > http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio > > http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio > > You said they fixed themselves, but the failures are in the most > recent buildbot reports. This is the only thing I am concerned about > before branching, so hopefully someone can look at this and let me > know whether the failures are indeed fixed. > That's actually a bit different, I saw the 5 failures after the main sequence of tests and those are OK now. Valgrind seems to run things twice and the new failures are in the valgrind tests, I missed those. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 10:49:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 08:49:12 -0600 Subject: [Numpy-discussion] numpy.test() (failures=2, errors=3) In-Reply-To: References: Message-ID: On Sat, May 24, 2008 at 6:54 AM, Nils Wagner wrote: > Hi all, > > I found two failures and three errors wrt numpy.test() > > numpy.__version__ > > '1.2.0.dev5228' > Hi Nils, can you try a clean install? Remove the build directory and maybe the numpy folder in site-packages and test to see if the errors are still there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 10:53:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 08:53:22 -0600 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: References: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> <4837B5A7.3070908@ar.media.kyoto-u.ac.jp> Message-ID: On Sat, May 24, 2008 at 2:48 AM, Jarrod Millman wrote: > On Fri, May 23, 2008 at 11:28 PM, David Cournapeau > wrote: > > I too think we should make the release ASAP, I am tired of it (as you > > certainly are :) ). > > Absolutely! I just created the 1.1.x maintenance branch: > http://projects.scipy.org/scipy/numpy/changeset/5227 > > The trunk is now open for 1.2 development: > http://projects.scipy.org/scipy/numpy/changeset/5228 > I know several people have code for 1.2 who have been patiently > waiting for me. Please feel free to go ahead and commit your code to > the trunk now. Alan McIntyre officially starts his Google Summer of > Code project on Monday. One of his first tasks will be to help > convert NumPy to the nose testing framework. That will probably start > in a branch and then get merged all at once just like Matthew Brett > did for SciPy. Also Stefan should feel free to start committing all > the documentation work to the trunk now. > > If no one has any more input on the recent buildbot failures, I will > tag 1.1.0 from the branch tomorrow morning. 
Chris Burns is on > vacation (it is memorial day in the States) until Monday, so he won't > be able to create the Mac binaries until then. Most likely, since > David is around the Windows binary will be available shortly after I > tag the release. > > I will announce the NumPy 1.1.0 release late Monday night or early on > Tuesday. Thanks to everyone who worked so hard getting this release > out. Sorry that the release management was so uneven and frustrating. > I will be happy to discuss what could have been done differently or > better. > > I have tried to keep the release notes up-to-date as we went; but if > you have any time, now would be a great time for a final pass: > http://projects.scipy.org/scipy/numpy/milestone/1.1.0 > Because the default testing level is now all=True some old tests no longer in numpy are also run if they are hanging around in site-packages. So I think we should have a note telling folks to start with a clean install, i.e., they should delete the previous install from site-packages. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 11:03:14 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 09:03:14 -0600 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: On Sat, May 24, 2008 at 8:47 AM, Charles R Harris wrote: > > > On Sat, May 24, 2008 at 12:25 AM, Jarrod Millman > wrote: > >> On Thu, May 22, 2008 at 11:25 PM, Charles R Harris >> wrote: >> > On Fri, May 23, 2008 at 12:10 AM, Charles R Harris >> > wrote: >> >> >> >> The python 2.6 buildbots are showing 5 failures that are being hidden >> by >> >> valgrind. >> > >> > They seem to have fixed themselves, they were probably related to the >> API >> > addition I made, then moved. However, it is a bad thing that the errors >> were >> > covered up by Valgrind and not reported. >> >> I didn't understand what you meant by this yesterday, but I see what >> you are talking about now: >> >> http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio >> >> http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio >> >> You said they fixed themselves, but the failures are in the most >> recent buildbot reports. This is the only thing I am concerned about >> before branching, so hopefully someone can look at this and let me >> know whether the failures are indeed fixed. >> > > That's actually a bit different, I saw the 5 failures after the main > sequence of tests and those are OK now. Valgrind seems to run things twice > and the new failures are in the valgrind tests, I missed those. > I take that back, I got confused looking through the output. The errors are the same and only seem to happen when valgrind runs the tests. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 11:12:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 09:12:01 -0600 Subject: [Numpy-discussion] David, please check. Message-ID: David, I merged the OBJECT_API and MULTIARRAY_API lists and fixed the SConstruct file in numpy/core, but you should check if I got it right. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From orest.kozyar at gmail.com Sat May 24 12:12:18 2008 From: orest.kozyar at gmail.com (Orest Kozyar) Date: Sat, 24 May 2008 09:12:18 -0700 (PDT) Subject: [Numpy-discussion] numpy.ndarray.astype conversion In-Reply-To: References: Message-ID: > > The following line: array([6.95e-5]).astype('S') > > returns: array(['6'], dtype='|S1') > I don't know what it should do. Issue a warning, maybe? > What are you trying to do? At the very least a warning would not hurt. When I first started using the numpy arrays it looked like conversion to string would ensure that the minimum string length needed to accurately represent the numbers in the array would be used. It wasn't until I noticed some odd outliers in my plotted data that I realized what was going on -- when exponential notation is used, the conversion gets truncated. I use numpy arrays to store data from SQL queries, and this data sometimes has mixed types (string, number, etc.). I now realize that I was setting up the arrays wrong and should have been using dtype=object instead. It just seems like the above might be something that might "bite" new users of numpy unintentionally. Orest From nwagner at iam.uni-stuttgart.de Sat May 24 12:14:45 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 24 May 2008 18:14:45 +0200 Subject: [Numpy-discussion] numpy.test() (failures=2, errors=3) In-Reply-To: References: Message-ID: On Sat, 24 May 2008 08:49:12 -0600 "Charles R Harris" wrote: > On Sat, May 24, 2008 at 6:54 AM, Nils Wagner > > wrote: > >> Hi all, >> >> I found two failures and three errors wrt numpy.test() >> >> numpy.__version__ >> >> '1.2.0.dev5228' >> > > > Hi Nils, can you try a clean install? Remove the build >directory and maybe > the numpy folder in site-packages and test to see if the >errors are still > there. > > Chuck Sorry for the noise. I have removed another site-package numpy directory. Works for me now. Thank you. Cheers Nils From charlesr.harris at gmail.com Sat May 24 13:35:25 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 11:35:25 -0600 Subject: [Numpy-discussion] SciPy page needs link to the FAQ Message-ID: That's all. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Sat May 24 13:48:50 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sat, 24 May 2008 18:48:50 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> Message-ID: Hi All, On Fri, May 23, 2008 at 4:50 PM, Andrea Gavana wrote: > Hi Peter & All, > > On Fri, May 23, 2008 at 4:16 PM, Peter Creasey wrote: >> Hi Andrea, >> >> 2008/5/23 "Andrea Gavana" : >>> And so on. The probelm with this approach is that I lose the original >>> indices for which I want all the inequality tests to succeed: >> >> To have the original indices you just need to re-index your indices, as it were >> >> idx = flatnonzero(xCent >= xMin) >> idx = idx[flatnonzero(xCent[idx] <= xMax)] >> idx = idx[flatnonzero(yCent[idx] >= yMin)] >> idx = idx[flatnonzero(yCent[idx] <= yMax)] >> ... >> (I haven't tested this code, apologies for bugs) >> >> However, there is a performance penalty for doing all this re-indexing >> (I once fell afoul of this), and if these conditions "mostly" evaluate >> to True you can often be better off with one of the solutions already >> suggested. > > Thank you for your answer. 
I have tried your suggestion, and the > performances are more or less comparable with the other NumPy > implementations (yours is roughly 1.2 times slower than the others), > but I do gain some advantage when the subgrids are very small (i.e., > most of the values in the first array are already False). I'll go and > implement your solution when I have many small subgrids in my model. I have added a few more implementations for this issue. One think that stroke me was that I could actually use sorted arrays for my xCent, yCent and zCent vectors, so I came up with a solution which uses numpy.searchsorted but the speed is more or less as it was without sorting. The specific function is this: def MultipleBoolean13(): """ Numpy solution 5 (Andrea). """ searchsorted = numpy.searchsorted indxStart, indxEnd = searchsorted(xCentS, [xMin, xMax]) indyStart, indyEnd = searchsorted(yCentS, [yMin, yMax]) indzStart, indzEnd = searchsorted(zCentS, [zMin, zMax]) xInd = numpy.zeros((nCells), dtype=numpy.bool) yInd = xInd.copy() zInd = xInd.copy() xInd[xIndices[indxStart:indxEnd]] = True yInd[yIndices[indyStart:indyEnd]] = True zInd[zIndices[indzStart:indzEnd]] = True xyzReq = numpy.nonzero(xInd & yInd & zInd)[0] Where xCentS, yCentS and zCentS are the sorted arrays. I don't see any easy way to optimize this method, so I'd like to know if it is possible to code it better than I did. I have done some testing and timings of all the solutions we came up until now (12 implementations), and this is what I get (please notice the nice work or ascii art :-D :-D ): ******************* * SUMMARY RESULTS * ******************* --------------------------------------------------------------------- Number Of Cells: 50000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | --------------------------------------------------------------------- 1 NumPy 5 (Andrea) 0.00562803 1.00000 2 NumPy 1 (Francesc) 0.00563365 1.00100 3 NumPy 2 (Nathan) 0.00577322 1.02580 4 NumPy 4 (Nathan-Vector) 0.00580577 1.03158 5 Fortran 6 (James) 0.00660514 1.17361 6 Fortran 3 (Alex) 0.00709856 1.26129 7 Fortran 5 (James) 0.00748726 1.33035 8 Fortran 2 (Mine) 0.00748816 1.33051 9 Fortran 1 (Mine) 0.00775906 1.37864 10 Fortran 4 {Michael) 0.00777685 1.38181 11 NumPy 3 (Peter) 0.01253662 2.22753 12 Cython (Stefan) 0.01597804 2.83901 --------------------------------------------------------------------- --------------------------------------------------------------------- Number Of Cells: 100000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | --------------------------------------------------------------------- 1 NumPy 5 (Andrea) 0.01080372 1.00000 2 NumPy 2 (Nathan) 0.01109147 1.02663 3 NumPy 1 (Francesc) 0.01114189 1.03130 4 NumPy 4 (Nathan-Vector) 0.01214118 1.12380 5 Fortran 6 (James) 0.01351264 1.25074 6 Fortran 5 (James) 0.01368450 1.26665 7 Fortran 3 (Alex) 0.01373010 1.27087 8 Fortran 2 (Mine) 0.01415306 1.31002 9 Fortran 1 (Mine) 0.01425558 1.31951 10 Fortran 4 {Michael) 0.01443192 1.33583 11 NumPy 3 (Peter) 0.02463268 2.28002 12 Cython (Stefan) 0.04298108 3.97836 --------------------------------------------------------------------- --------------------------------------------------------------------- Number Of Cells: 150000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | 
--------------------------------------------------------------------- 1 NumPy 1 (Francesc) 0.01613255 1.00000 2 NumPy 5 (Andrea) 0.01619734 1.00402 3 NumPy 2 (Nathan) 0.01647855 1.02145 4 NumPy 4 (Nathan-Vector) 0.01779452 1.10302 5 Fortran 3 (Alex) 0.02064676 1.27982 6 Fortran 2 (Mine) 0.02382278 1.47669 7 Fortran 4 {Michael) 0.02404563 1.49050 8 Fortran 5 (James) 0.02734487 1.69501 9 Fortran 6 (James) 0.02762538 1.71240 10 Fortran 1 (Mine) 0.03028402 1.87720 11 NumPy 3 (Peter) 0.03625735 2.24746 12 Cython (Stefan) 0.07515276 4.65845 --------------------------------------------------------------------- --------------------------------------------------------------------- Number Of Cells: 200000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | --------------------------------------------------------------------- 1 NumPy 5 (Andrea) 0.02187359 1.00000 2 NumPy 1 (Francesc) 0.02309221 1.05571 3 NumPy 2 (Nathan) 0.02323452 1.06222 4 NumPy 4 (Nathan-Vector) 0.02378610 1.08743 5 Fortran 3 (Alex) 0.02792134 1.27649 6 Fortran 2 (Mine) 0.03119301 1.42606 7 Fortran 4 {Michael) 0.03221007 1.47256 8 Fortran 5 (James) 0.03584257 1.63862 9 Fortran 6 (James) 0.03627464 1.65838 10 Fortran 1 (Mine) 0.04048422 1.85083 11 NumPy 3 (Peter) 0.04765184 2.17851 12 Cython (Stefan) 0.09927396 4.53853 --------------------------------------------------------------------- --------------------------------------------------------------------- Number Of Cells: 250000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | --------------------------------------------------------------------- 1 NumPy 5 (Andrea) 0.02651608 1.00000 2 NumPy 1 (Francesc) 0.02864898 1.08044 3 NumPy 2 (Nathan) 0.02933160 1.10618 4 NumPy 4 (Nathan-Vector) 0.02960466 1.11648 5 Fortran 3 (Alex) 0.03427299 1.29254 6 Fortran 2 (Mine) 0.03848291 1.45130 7 Fortran 4 {Michael) 0.03919467 1.47815 8 Fortran 6 (James) 0.04430585 1.67091 9 Fortran 5 (James) 0.04765620 1.79726 10 Fortran 1 (Mine) 0.04914228 1.85330 11 NumPy 3 (Peter) 0.05981760 2.25590 12 Cython (Stefan) 0.12797177 4.82619 --------------------------------------------------------------------- --------------------------------------------------------------------- Number Of Cells: 300000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | --------------------------------------------------------------------- 1 NumPy 5 (Andrea) 0.03365729 1.00000 2 NumPy 1 (Francesc) 0.03470315 1.03107 3 NumPy 2 (Nathan) 0.03519601 1.04572 4 NumPy 4 (Nathan-Vector) 0.03529574 1.04868 5 Fortran 3 (Alex) 0.04167365 1.23818 6 Fortran 2 (Mine) 0.04653522 1.38262 7 Fortran 4 {Michael) 0.04711222 1.39976 8 Fortran 6 (James) 0.05287650 1.57103 9 Fortran 5 (James) 0.05686667 1.68958 10 Fortran 1 (Mine) 0.05913682 1.75703 11 NumPy 3 (Peter) 0.07183072 2.13418 12 Cython (Stefan) 0.15876811 4.71720 --------------------------------------------------------------------- ****************************** * ELAPSED TIME: 64.828 Seconds * ****************************** Just in case someone is interested in it, I attach the source code of the implementations I have so far. Thank you vey much for your suggestions! Andrea. "Imagination Is The Only Weapon In The War Against Reality." 
http://xoomer.alice.it/infinity77/ -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: MultipleBoolean.py URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MultipleBooleanCython.pyx Type: application/octet-stream Size: 1157 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy.pxi Type: application/octet-stream Size: 3554 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: python.pxi Type: application/octet-stream Size: 647 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MultipleBooleanFortran.f90 Type: application/octet-stream Size: 3621 bytes Desc: not available URL: From charlesr.harris at gmail.com Sat May 24 13:56:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 11:56:19 -0600 Subject: [Numpy-discussion] Closing ticket #164 and Windows XP 64 on AMD hardware. Message-ID: Hi All, I was about to close ticket #164 when I found this thread from almost a year ago: http://thread.gmane.org/gmane.comp.python.numeric.general/16296/focus=16302. Does anyone know if these issues have all been resolved? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat May 24 15:11:58 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 24 May 2008 21:11:58 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> Message-ID: <9457e7c80805241211m1318264fu7a837afbe3ba5041@mail.gmail.com> Hi Andrea 2008/5/24 Andrea Gavana : > Number Of Cells: 50000 > --------------------------------------------------------------------- > | Rank | Method Name | Execution Time | Relative Slowness | > --------------------------------------------------------------------- > 1 NumPy 5 (Andrea) 0.00562803 1.00000 > 2 NumPy 1 (Francesc) 0.00563365 1.00100 > 3 NumPy 2 (Nathan) 0.00577322 1.02580 > 4 NumPy 4 (Nathan-Vector) 0.00580577 1.03158 > 5 Fortran 6 (James) 0.00660514 1.17361 > 6 Fortran 3 (Alex) 0.00709856 1.26129 > 7 Fortran 5 (James) 0.00748726 1.33035 > 8 Fortran 2 (Mine) 0.00748816 1.33051 > 9 Fortran 1 (Mine) 0.00775906 1.37864 > 10 Fortran 4 {Michael) 0.00777685 1.38181 > 11 NumPy 3 (Peter) 0.01253662 2.22753 > 12 Cython (Stefan) 0.01597804 2.83901 > --------------------------------------------------------------------- When you bench the Cython code, you'll have to take out the Python calls (for checking dtype etc.), otherwise you're comparing apples and oranges. After I tweaked it, it ran roughly the same time as Francesc's version. But like I mentioned before, the Fortran results should trump all, so what is going on here? Regards St?fan From peridot.faceted at gmail.com Sat May 24 15:33:32 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 24 May 2008 15:33:32 -0400 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: 2008/5/24 Charles R Harris : > > I take that back, I got confused looking through the output. The errors are > the same and only seem to happen when valgrind runs the tests. Sounds like maybe valgrind is not IEEE clean: """ As of version 3.0.0, Valgrind has the following limitations in its implementation of x86/AMD64 floating point relative to IEEE754. 
""" + """ Numeric exceptions in FP code: IEEE754 defines five types of numeric exception that can happen: invalid operation (sqrt of negative number, etc), division by zero, overflow, underflow, inexact (loss of precision). For each exception, two courses of action are defined by IEEE754: either (1) a user-defined exception handler may be called, or (2) a default action is defined, which "fixes things up" and allows the computation to proceed without throwing an exception. Currently Valgrind only supports the default fixup actions. Again, feedback on the importance of exception support would be appreciated. When Valgrind detects that the program is trying to exceed any of these limitations (setting exception handlers, rounding mode, or precision control), it can print a message giving a traceback of where this has happened, and continue execution. This behaviour used to be the default, but the messages are annoying and so showing them is now disabled by default. Use --show-emwarns=yes to see them. """ Anne From peridot.faceted at gmail.com Sat May 24 15:35:54 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 24 May 2008 15:35:54 -0400 Subject: [Numpy-discussion] 1.1.x branch and 1.1.0 tag imminent In-Reply-To: References: <4837B2BD.7050807@ar.media.kyoto-u.ac.jp> Message-ID: 2008/5/24 Jarrod Millman : > On Fri, May 23, 2008 at 11:16 PM, David Cournapeau > wrote: >> Where do those errors appear ? I don't see them on the builbot. Are they >> 2.6 specific ? If yes, I would say ignore them, because 2.6 is not >> released yet, and is scheduled for september. > > You can see them here: > http://buildbot.scipy.org/builders/Linux_x86_64_Fedora/builds/486/steps/shell_2/logs/stdio > http://buildbot.scipy.org/builders/Linux_x86_Fedora_Py2.6/builds/461/steps/shell_2/logs/stdio > > You have to search for them, because as Chuck pointed out it seems > valgrind seems to be hiding them. We should figure out how to avoid > this at some point down the road as well. > > I am not sure whether or not this is 2.6 specific or not. No. These are a known limitation of valgrind when dealing with floating-point. Other than emailing the valgrind developers to tell them that yes, somebody cares about this, I think they can be safely ignored (thank goodness). Anne From charlesr.harris at gmail.com Sat May 24 16:08:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 14:08:46 -0600 Subject: [Numpy-discussion] Buildbot errors. In-Reply-To: References: Message-ID: On Sat, May 24, 2008 at 1:33 PM, Anne Archibald wrote: > 2008/5/24 Charles R Harris : > > > > I take that back, I got confused looking through the output. The errors > are > > the same and only seem to happen when valgrind runs the tests. > > Sounds like maybe valgrind is not IEEE clean: > > """ > As of version 3.0.0, Valgrind has the following limitations in its > implementation of x86/AMD64 floating point relative to IEEE754. > """ > + > """ > Numeric exceptions in FP code: IEEE754 defines five types of numeric > exception that can happen: invalid operation (sqrt of negative number, > etc), division by zero, overflow, underflow, inexact (loss of > precision). > > For each exception, two courses of action are defined by IEEE754: > either (1) a user-defined exception handler may be called, or (2) a > default action is defined, which "fixes things up" and allows the > computation to proceed without throwing an exception. > > Currently Valgrind only supports the default fixup actions. 
Again, > feedback on the importance of exception support would be appreciated. > > When Valgrind detects that the program is trying to exceed any of > these limitations (setting exception handlers, rounding mode, or > precision control), it can print a message giving a traceback of where > this has happened, and continue execution. This behaviour used to be > the default, but the messages are annoying and so showing them is now > disabled by default. Use --show-emwarns=yes to see them. > """ > Thanks for following that up, Anne. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 20:04:05 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 18:04:05 -0600 Subject: [Numpy-discussion] David, please check. In-Reply-To: References: Message-ID: On Sat, May 24, 2008 at 9:12 AM, Charles R Harris wrote: > David, > > I merged the OBJECT_API and MULTIARRAY_API lists and fixed the SConstruct > file in numpy/core, but you should check if I got it right. > I've since renamed generate_array_api to generate_numpy_api which led to some mods in scons_support.py, so you might want to check that also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 21:31:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 19:31:19 -0600 Subject: [Numpy-discussion] ufunc oddities Message-ID: Hi All, I'm writing tests for ufuncs and turned up some oddities: In [4]: degrees(True) Out[4]: 57.29578 In [5]: radians(True) Out[5]: 0.017453292 In [6]: sin(True) Out[6]: 0.84147096 Do we want numeric functions to apply to booleans? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 21:36:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 19:36:57 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: Message-ID: On Sat, May 24, 2008 at 7:31 PM, Charles R Harris wrote: > Hi All, > > I'm writing tests for ufuncs and turned up some oddities: > > In [4]: degrees(True) > Out[4]: 57.29578 > > In [5]: radians(True) > Out[5]: 0.017453292 > > In [6]: sin(True) > Out[6]: 0.84147096 > > Do we want numeric functions to apply to booleans? > Some more: In [15]: x Out[15]: array([ True, False], dtype=bool) In [16]: floor_divide(x,True) Out[16]: array([1, 0], dtype=int8) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 24 21:47:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 20:47:00 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: Message-ID: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> On Sat, May 24, 2008 at 8:31 PM, Charles R Harris wrote: > Hi All, > > I'm writing tests for ufuncs and turned up some oddities: > > In [4]: degrees(True) > Out[4]: 57.29578 > > In [5]: radians(True) > Out[5]: 0.017453292 > > In [6]: sin(True) > Out[6]: 0.84147096 > > Do we want numeric functions to apply to booleans? I don't see a good reason to prevent it. They are just 0 and 1 under the covers and behave like it everywhere else (e.g. True + True == 2 and the very useful boolean_mask.sum()). 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 24 22:09:06 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 20:09:06 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 7:47 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 8:31 PM, Charles R Harris > wrote: > > Hi All, > > > > I'm writing tests for ufuncs and turned up some oddities: > > > > In [4]: degrees(True) > > Out[4]: 57.29578 > > > > In [5]: radians(True) > > Out[5]: 0.017453292 > > > > In [6]: sin(True) > > Out[6]: 0.84147096 > > > > Do we want numeric functions to apply to booleans? > > I don't see a good reason to prevent it. They are just 0 and 1 under > the covers and behave like it everywhere else (e.g. True + True == 2 > and the very useful boolean_mask.sum()). > True + True == 1 In [5]: x + x Out[5]: array([ True, True], dtype=bool) In [6]: (x + x).astype(int) Out[6]: array([1, 1]) That is how the inner loop is implemented. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 24 22:25:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 21:25:34 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> Message-ID: <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> On Sat, May 24, 2008 at 9:09 PM, Charles R Harris wrote: > > On Sat, May 24, 2008 at 7:47 PM, Robert Kern wrote: >> >> On Sat, May 24, 2008 at 8:31 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > I'm writing tests for ufuncs and turned up some oddities: >> > >> > In [4]: degrees(True) >> > Out[4]: 57.29578 >> > >> > In [5]: radians(True) >> > Out[5]: 0.017453292 >> > >> > In [6]: sin(True) >> > Out[6]: 0.84147096 >> > >> > Do we want numeric functions to apply to booleans? >> >> I don't see a good reason to prevent it. They are just 0 and 1 under >> the covers and behave like it everywhere else (e.g. True + True == 2 >> and the very useful boolean_mask.sum()). > > True + True == 1 No, True + True == 2. Try it. We might have made boolean arrays behave differently than Python bool objects, but that's not what I wrote. > In [5]: x + x > Out[5]: array([ True, True], dtype=bool) > > In [6]: (x + x).astype(int) > Out[6]: array([1, 1]) > > That is how the inner loop is implemented. Fine. Internally, boolean arrays operated with boolean arrays with a boolean output work slightly differently than Python bool objects (which always act like integers). However, ufuncs like sin(), floor_divide(), etc. convert the argument to a dtype they can accept so True -> 1.0 or True -> uint8(1) and the ufunc goes on it's merry way. That's fine. Leave it alone. I don't think it's a problem, much less one worth trying to solve. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Sat May 24 22:28:40 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 24 May 2008 19:28:40 -0700 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 7:09 PM, Charles R Harris wrote: > > > On Sat, May 24, 2008 at 7:47 PM, Robert Kern wrote: >> >> On Sat, May 24, 2008 at 8:31 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > I'm writing tests for ufuncs and turned up some oddities: >> > >> > In [4]: degrees(True) >> > Out[4]: 57.29578 >> > >> > In [5]: radians(True) >> > Out[5]: 0.017453292 >> > >> > In [6]: sin(True) >> > Out[6]: 0.84147096 >> > >> > Do we want numeric functions to apply to booleans? >> >> I don't see a good reason to prevent it. They are just 0 and 1 under >> the covers and behave like it everywhere else (e.g. True + True == 2 >> and the very useful boolean_mask.sum()). > > True + True == 1 > > In [5]: x + x > Out[5]: array([ True, True], dtype=bool) > > In [6]: (x + x).astype(int) > Out[6]: array([1, 1]) > > That is how the inner loop is implemented. I think it's interesting how python and numpy bools behave differently. >> x = np.array([True, True], dtype=bool) >> x[0] + x[1] True >> x[0] & x[1] True >> >> x = [True, True] >> x[0] + x[1] 2 >> x[0] & x[1] True From robert.kern at gmail.com Sat May 24 22:36:49 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 21:36:49 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> Message-ID: <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> On Sat, May 24, 2008 at 9:28 PM, Keith Goodman wrote: > I think it's interesting how python and numpy bools behave differently. > >>> x = np.array([True, True], dtype=bool) >>> x[0] + x[1] > True >>> x[0] & x[1] > True >>> >>> x = [True, True] >>> x[0] + x[1] > 2 >>> x[0] & x[1] > True The difference arises straightforwardly from the principle that numpy tries not to upcast when you do an operation on two arrays of the same dtype; True+True==True is of somewhat more use than True+True==False. Python bools are just ints subclasses to give a nice string representation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 24 22:37:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 20:37:57 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 8:25 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 9:09 PM, Charles R Harris > wrote: > > > > On Sat, May 24, 2008 at 7:47 PM, Robert Kern > wrote: > >> > >> On Sat, May 24, 2008 at 8:31 PM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > I'm writing tests for ufuncs and turned up some oddities: > >> > > >> > In [4]: degrees(True) > >> > Out[4]: 57.29578 > >> > > >> > In [5]: radians(True) > >> > Out[5]: 0.017453292 > >> > > >> > In [6]: sin(True) > >> > Out[6]: 0.84147096 > >> > > >> > Do we want numeric functions to apply to booleans? > >> > >> I don't see a good reason to prevent it. 
They are just 0 and 1 under > >> the covers and behave like it everywhere else (e.g. True + True == 2 > >> and the very useful boolean_mask.sum()). > > > > True + True == 1 > > No, True + True == 2. Try it. We might have made boolean arrays behave > differently than Python bool objects, but that's not what I wrote. > Robert, the C code in the inner loop is generated with /**begin repeat #TYPE=(BOOL, BYTE,UBYTE,SHORT,USHORT,INT,UINT,LONG,ULONG,LONGLONG,ULONGLONG,FLOAT,DOUBLE,LONGDOUBLE)*2# #OP=||, +*13, ^, -*13# #kind=add*14, subtract*14# #typ=(Bool, byte, ubyte, short, ushort, int, uint, long, ulong, longlong, ulonglong, float, double, longdouble)*2# */ Note that || is not the same as +. Also note that subtract is implemented as xor. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 24 22:40:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 21:40:14 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> Message-ID: <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> On Sat, May 24, 2008 at 9:37 PM, Charles R Harris wrote: > > > On Sat, May 24, 2008 at 8:25 PM, Robert Kern wrote: >> >> On Sat, May 24, 2008 at 9:09 PM, Charles R Harris >> wrote: >> > >> > On Sat, May 24, 2008 at 7:47 PM, Robert Kern >> > wrote: >> >> >> >> On Sat, May 24, 2008 at 8:31 PM, Charles R Harris >> >> wrote: >> >> > Hi All, >> >> > >> >> > I'm writing tests for ufuncs and turned up some oddities: >> >> > >> >> > In [4]: degrees(True) >> >> > Out[4]: 57.29578 >> >> > >> >> > In [5]: radians(True) >> >> > Out[5]: 0.017453292 >> >> > >> >> > In [6]: sin(True) >> >> > Out[6]: 0.84147096 >> >> > >> >> > Do we want numeric functions to apply to booleans? >> >> >> >> I don't see a good reason to prevent it. They are just 0 and 1 under >> >> the covers and behave like it everywhere else (e.g. True + True == 2 >> >> and the very useful boolean_mask.sum()). >> > >> > True + True == 1 >> >> No, True + True == 2. Try it. We might have made boolean arrays behave >> differently than Python bool objects, but that's not what I wrote. > > Robert, the C code in the inner loop is generated with > > /**begin repeat > > #TYPE=(BOOL, > BYTE,UBYTE,SHORT,USHORT,INT,UINT,LONG,ULONG,LONGLONG,ULONGLONG,FLOAT,DOUBLE,LONGDOUBLE)*2# > #OP=||, +*13, ^, -*13# > #kind=add*14, subtract*14# > #typ=(Bool, byte, ubyte, short, ushort, int, uint, long, ulong, longlong, > ulonglong, float, double, longdouble)*2# > */ > > Note that || is not the same as +. Also note that subtract is implemented as > xor. I'm not sure why you're showing me numpy C code. I am talking about the Python bools True and False. $ python Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> True + True 2 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Sat May 24 22:45:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 20:45:09 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 8:40 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 9:37 PM, Charles R Harris > wrote: > > > > > > On Sat, May 24, 2008 at 8:25 PM, Robert Kern > wrote: > >> > >> On Sat, May 24, 2008 at 9:09 PM, Charles R Harris > >> wrote: > >> > > >> > On Sat, May 24, 2008 at 7:47 PM, Robert Kern > >> > wrote: > >> >> > >> >> On Sat, May 24, 2008 at 8:31 PM, Charles R Harris > >> >> wrote: > >> >> > Hi All, > >> >> > > >> >> > I'm writing tests for ufuncs and turned up some oddities: > >> >> > > >> >> > In [4]: degrees(True) > >> >> > Out[4]: 57.29578 > >> >> > > >> >> > In [5]: radians(True) > >> >> > Out[5]: 0.017453292 > >> >> > > >> >> > In [6]: sin(True) > >> >> > Out[6]: 0.84147096 > >> >> > > >> >> > Do we want numeric functions to apply to booleans? > >> >> > >> >> I don't see a good reason to prevent it. They are just 0 and 1 under > >> >> the covers and behave like it everywhere else (e.g. True + True == 2 > >> >> and the very useful boolean_mask.sum()). > >> > > >> > True + True == 1 > >> > >> No, True + True == 2. Try it. We might have made boolean arrays behave > >> differently than Python bool objects, but that's not what I wrote. > > > > Robert, the C code in the inner loop is generated with > > > > /**begin repeat > > > > #TYPE=(BOOL, > > > BYTE,UBYTE,SHORT,USHORT,INT,UINT,LONG,ULONG,LONGLONG,ULONGLONG,FLOAT,DOUBLE,LONGDOUBLE)*2# > > #OP=||, +*13, ^, -*13# > > #kind=add*14, subtract*14# > > #typ=(Bool, byte, ubyte, short, ushort, int, uint, long, ulong, > longlong, > > ulonglong, float, double, longdouble)*2# > > */ > > > > Note that || is not the same as +. Also note that subtract is implemented > as > > xor. > > I'm not sure why you're showing me numpy C code. I am talking about > the Python bools True and False. Because I'm talking about ufuncs. The original question was about ufuncs and, since array booleans are not treated as numbers for ordinary arithmetic, the question was when *do* we treat them as numbers. I'm a bugger for consistency and booleans aren't consistent. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sat May 24 22:46:04 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 24 May 2008 19:46:04 -0700 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 7:36 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 9:28 PM, Keith Goodman wrote: >> I think it's interesting how python and numpy bools behave differently. 
>> >>>> x = np.array([True, True], dtype=bool) >>>> x[0] + x[1] >> True >>>> x[0] & x[1] >> True >>>> >>>> x = [True, True] >>>> x[0] + x[1] >> 2 >>>> x[0] & x[1] >> True > > The difference arises straightforwardly from the principle that numpy > tries not to upcast when you do an operation on two arrays of the same > dtype; True+True==True is of somewhat more use than True+True==False. > Python bools are just ints subclasses to give a nice string > representation. Sounds like there is no perfect solution. I like it the way it is but these are differences I never noticed. >> x = np.array([True, True], dtype=bool) >> x.sum() 2 >> x[0] + x[1] True From robert.kern at gmail.com Sat May 24 22:48:27 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 21:48:27 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: <3d375d730805241948y302bdee4lcdb01dc237074bfc@mail.gmail.com> On Sat, May 24, 2008 at 9:46 PM, Keith Goodman wrote: > On Sat, May 24, 2008 at 7:36 PM, Robert Kern wrote: >> On Sat, May 24, 2008 at 9:28 PM, Keith Goodman wrote: >>> I think it's interesting how python and numpy bools behave differently. >>> >>>>> x = np.array([True, True], dtype=bool) >>>>> x[0] + x[1] >>> True >>>>> x[0] & x[1] >>> True >>>>> >>>>> x = [True, True] >>>>> x[0] + x[1] >>> 2 >>>>> x[0] & x[1] >>> True >> >> The difference arises straightforwardly from the principle that numpy >> tries not to upcast when you do an operation on two arrays of the same >> dtype; True+True==True is of somewhat more use than True+True==False. >> Python bools are just ints subclasses to give a nice string >> representation. > > Sounds like there is no perfect solution. I like it the way it is but > these are differences I never noticed. > >>> x = np.array([True, True], dtype=bool) >>> x.sum() > 2 Yes, the default accumulator dtype for integer types is at least the size of the native int type, so we don't have the situation of "bool+bool=bool". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 24 22:57:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 20:57:23 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805241948y302bdee4lcdb01dc237074bfc@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> <3d375d730805241948y302bdee4lcdb01dc237074bfc@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 8:48 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 9:46 PM, Keith Goodman > wrote: > > On Sat, May 24, 2008 at 7:36 PM, Robert Kern > wrote: > >> On Sat, May 24, 2008 at 9:28 PM, Keith Goodman > wrote: > >>> I think it's interesting how python and numpy bools behave differently. > >>> > >>>>> x = np.array([True, True], dtype=bool) > >>>>> x[0] + x[1] > >>> True > >>>>> x[0] & x[1] > >>> True > >>>>> > >>>>> x = [True, True] > >>>>> x[0] + x[1] > >>> 2 > >>>>> x[0] & x[1] > >>> True > >> > >> The difference arises straightforwardly from the principle that numpy > >> tries not to upcast when you do an operation on two arrays of the same > >> dtype; True+True==True is of somewhat more use than True+True==False. 
> >> Python bools are just ints subclasses to give a nice string > >> representation. > > > > Sounds like there is no perfect solution. I like it the way it is but > > these are differences I never noticed. > > > >>> x = np.array([True, True], dtype=bool) > >>> x.sum() > > 2 > > Yes, the default accumulator dtype for integer types is at least the > size of the native int type, so we don't have the situation of > "bool+bool=bool". > How about In [14]: x += 5 In [15]: x Out[15]: array([ True, True], dtype=bool) In [16]: x.tostring() Out[16]: '\x01\x01' In [17]: x + 5 Out[17]: array([6, 6]) In [18]: (x + 5).dtype Out[18]: dtype('int32') In [19]: (x.astype(int8) + 5).dtype Out[19]: dtype('int8') I have to write tests for 64 of these buggers and some poor sod has to write the documentation. All these inconsistencies are going to drive both of us mad. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 24 22:59:24 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 21:59:24 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> Message-ID: <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> On Sat, May 24, 2008 at 9:45 PM, Charles R Harris wrote: >> I'm not sure why you're showing me numpy C code. I am talking about >> the Python bools True and False. > > Because I'm talking about ufuncs. The original question was about ufuncs > and, since array booleans are not treated as numbers for ordinary > arithmetic, the question was when *do* we treat them as numbers. I'm a > bugger for consistency and booleans aren't consistent. And I brought up the Python bools because there is a relationship that numpy has with the rest of Python's types that we need to keep in mind. I think it boils down to this: when we have places where bools are operated with bools and expect to output bools, we can decide what the rules are. For the arithmetic operations +-*, there are reasonable Boolean logic interpretations that we can apply. When upcasting is requested, either implicitly or explicitly, they turn into 1s and 0s. All of the behaviors you think are inconsistent just arise from the consistent application of that simple rule. Since it *is* sometimes useful to treat booleans as numerical 1s and 0s, I think it would be a mistake to prevent it. Sure, it doesn't really make sense to do sin(True), but if you want to preserve the useful behaviors, you will just end up with a big list of special cases and make behavior difficult to reason about. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
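For anyone following along, a minimal sketch of the rule Robert describes above; the dtypes and values are from an interactive session of roughly this era, and the exact default integer width is platform dependent:

import numpy as np

a = np.array([True, True])

# bool combined with bool stays bool: '+' acts like logical OR, '*' like AND
(a + a).dtype          # dtype('bool'); values array([ True,  True])

# once an integer is involved, True/False are treated as 1/0 and the result upcasts
a + 5                  # array([6, 6]), an integer array

# plain Python bools are an int subclass, so they upcast immediately
True + True            # 2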
-- Umberto Eco From robert.kern at gmail.com Sat May 24 23:01:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 22:01:02 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> <3d375d730805241948y302bdee4lcdb01dc237074bfc@mail.gmail.com> Message-ID: <3d375d730805242001m6de1a774y421d015f1a4dfe52@mail.gmail.com> On Sat, May 24, 2008 at 9:57 PM, Charles R Harris wrote: > How about > > In [14]: x += 5 > > In [15]: x > Out[15]: array([ True, True], dtype=bool) Output = bool. > In [16]: x.tostring() > Out[16]: '\x01\x01' > > > In [17]: x + 5 > Out[17]: array([6, 6]) Output != bool. > In [18]: (x + 5).dtype > Out[18]: dtype('int32') > > In [19]: (x.astype(int8) + 5).dtype > Out[19]: dtype('int8') > > I have to write tests for 64 of these buggers and some poor sod has to write > the documentation. All these inconsistencies are going to drive both of us > mad. They aren't inconsistent in the slightest. They follow from a very simple rule. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 24 23:02:59 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:02:59 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 8:59 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 9:45 PM, Charles R Harris > wrote: > >> I'm not sure why you're showing me numpy C code. I am talking about > >> the Python bools True and False. > > > > Because I'm talking about ufuncs. The original question was about ufuncs > > and, since array booleans are not treated as numbers for ordinary > > arithmetic, the question was when *do* we treat them as numbers. I'm a > > bugger for consistency and booleans aren't consistent. > > And I brought up the Python bools because there is a relationship that > numpy has with the rest of Python's types that we need to keep in > mind. > > I think it boils down to this: when we have places where bools are > operated with bools and expect to output bools, we can decide what the > rules are. For the arithmetic operations +-*, there are reasonable > Boolean logic interpretations that we can apply. > > When upcasting is requested, either implicitly or explicitly, they > turn into 1s and 0s. All of the behaviors you think are inconsistent > just arise from the consistent application of that simple rule. Since > it *is* sometimes useful to treat booleans as numerical 1s and 0s, I > think it would be a mistake to prevent it. Sure, it doesn't really > make sense to do sin(True), but if you want to preserve the useful > behaviors, you will just end up with a big list of special cases and > make behavior difficult to reason about. > So what about the rule that the array type takes precedence over the scalar type? That is broken for booleans. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Sat May 24 23:06:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 22:06:55 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> Message-ID: <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> On Sat, May 24, 2008 at 10:02 PM, Charles R Harris wrote: > So what about the rule that the array type takes precedence over the scalar > type? That is broken for booleans. Yes, and if it wasn't an intentional special case (I don't recall discussing it on the list, but it might have been), then it's a bug and suitable for changing. The other behaviors are intentional and thus not suitable for changing. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sat May 24 23:11:03 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 22:11:03 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> Message-ID: <3d375d730805242011k3eae5b95rae7f48e21d28366d@mail.gmail.com> On Sat, May 24, 2008 at 10:06 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 10:02 PM, Charles R Harris > wrote: >> So what about the rule that the array type takes precedence over the scalar >> type? That is broken for booleans. > > Yes, and if it wasn't an intentional special case (I don't recall > discussing it on the list, but it might have been), then it's a bug > and suitable for changing. The other behaviors are intentional and > thus not suitable for changing. Nope, I'm wrong. I just doublechecked the manual. bool_ is separate from number on the tree of dtypes, so bool_ + int_ is a cross-kind operation and the scalarness of the int_ is not a concern. E.g. array([1, 2]) + 3.0 has similar behavior. This is not an inconsistency. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Sat May 24 23:12:38 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:12:38 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:06 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 10:02 PM, Charles R Harris > wrote: > > So what about the rule that the array type takes precedence over the > scalar > > type? That is broken for booleans. > > Yes, and if it wasn't an intentional special case (I don't recall > discussing it on the list, but it might have been), then it's a bug > and suitable for changing. The other behaviors are intentional and > thus not suitable for changing. > So how do we know which is which? How does one write a test for an unspecified behavior? And it is clear that add.reduce breaks the rules for booleans. The result is that the boolean add function doesn't in fact *have* a reduce method. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 23:15:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:15:55 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805242011k3eae5b95rae7f48e21d28366d@mail.gmail.com> References: <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242011k3eae5b95rae7f48e21d28366d@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:11 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 10:06 PM, Robert Kern > wrote: > > On Sat, May 24, 2008 at 10:02 PM, Charles R Harris > > wrote: > >> So what about the rule that the array type takes precedence over the > scalar > >> type? That is broken for booleans. > > > > Yes, and if it wasn't an intentional special case (I don't recall > > discussing it on the list, but it might have been), then it's a bug > > and suitable for changing. The other behaviors are intentional and > > thus not suitable for changing. > > Nope, I'm wrong. I just doublechecked the manual. bool_ is separate > from number on the tree of dtypes, so bool_ + int_ is a cross-kind > operation and the scalarness of the int_ is not a concern. E.g. > array([1, 2]) + 3.0 has similar behavior. This is not an > inconsistency. > You are confusing promotion between kinds with promotion within kinds. So in this view, bool is not an integer type. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
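A small illustration of the kind-based promotion being debated here; treat the integer dtypes below as indicative rather than exact, since the default integer width depends on the platform:

import numpy as np

# within a kind, the array dtype wins over a Python scalar
a = np.array([1, 2], dtype=np.int8)
(a + 3).dtype                     # int8

# bool_ is its own kind, so bool-array + int-scalar is a cross-kind
# operation and upcasts, just as int-array + float-scalar does
b = np.array([True, False])
(b + 3).dtype                     # a default integer dtype, not bool
(np.array([1, 2]) + 3.0).dtype    # float64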
URL: From robert.kern at gmail.com Sat May 24 23:17:19 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 22:17:19 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242011k3eae5b95rae7f48e21d28366d@mail.gmail.com> Message-ID: <3d375d730805242017u52113000h52baf59213e3b5eb@mail.gmail.com> On Sat, May 24, 2008 at 10:15 PM, Charles R Harris wrote: > > On Sat, May 24, 2008 at 9:11 PM, Robert Kern wrote: >> >> On Sat, May 24, 2008 at 10:06 PM, Robert Kern >> wrote: >> > On Sat, May 24, 2008 at 10:02 PM, Charles R Harris >> > wrote: >> >> So what about the rule that the array type takes precedence over the >> >> scalar >> >> type? That is broken for booleans. >> > >> > Yes, and if it wasn't an intentional special case (I don't recall >> > discussing it on the list, but it might have been), then it's a bug >> > and suitable for changing. The other behaviors are intentional and >> > thus not suitable for changing. >> >> Nope, I'm wrong. I just doublechecked the manual. bool_ is separate >> from number on the tree of dtypes, so bool_ + int_ is a cross-kind >> operation and the scalarness of the int_ is not a concern. E.g. >> array([1, 2]) + 3.0 has similar behavior. This is not an >> inconsistency. > > You are confusing promotion between kinds with promotion within kinds. So in > this view, bool is not an integer type. Take a look at the tree. bool_ does not descend from integer. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From aisaac at american.edu Sat May 24 23:21:55 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 24 May 2008 23:21:55 -0400 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com><3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, 24 May 2008, Keith Goodman apparently wrote: >>> x = np.array([True, True], dtype=bool) >>> x.sum() > 2 If you want bools, change the accumulator dtype:: >>> x.sum(dtype=bool) True Cheers, Alan Isaac From robert.kern at gmail.com Sat May 24 23:18:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 24 May 2008 22:18:18 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> Message-ID: <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> On Sat, May 24, 2008 at 10:12 PM, Charles R Harris wrote: > > On Sat, May 24, 2008 at 9:06 PM, Robert Kern wrote: >> >> On Sat, May 24, 2008 at 10:02 PM, Charles R Harris >> wrote: >> > So what about the rule that the array type takes precedence over the >> > scalar >> > type? That is broken for booleans. >> >> Yes, and if it wasn't an intentional special case (I don't recall >> discussing it on the list, but it might have been), then it's a bug >> and suitable for changing. The other behaviors are intentional and >> thus not suitable for changing. 
> > So how do we know which is which? How does one write a test for an > unspecified behavior? And it is clear that add.reduce breaks the rules for > booleans. It breaks it for every integer type, in fact. In [11]: add.reduce(array([1]*257, dtype=uint8)) Out[11]: 257 I thought we only did the accumulator-dtype changing for the .sum() method, but I was wrong. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 24 23:24:24 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:24:24 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:21 PM, Alan G Isaac wrote: > On Sat, 24 May 2008, Keith Goodman apparently wrote: > >>> x = np.array([True, True], dtype=bool) > >>> x.sum() > > 2 > > > > If you want bools, change the accumulator dtype:: > > >>> x.sum(dtype=bool) Shouldn't that be the other way round? If you want integers, do x.sum(dtype=int). Ints don't sum in float64 by default. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 24 23:32:39 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:32:39 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> References: <3d375d730805241925o6e90a28ci68ff6419f697542e@mail.gmail.com> <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:18 PM, Robert Kern wrote: > On Sat, May 24, 2008 at 10:12 PM, Charles R Harris > wrote: > > > > On Sat, May 24, 2008 at 9:06 PM, Robert Kern > wrote: > >> > >> On Sat, May 24, 2008 at 10:02 PM, Charles R Harris > >> wrote: > >> > So what about the rule that the array type takes precedence over the > >> > scalar > >> > type? That is broken for booleans. > >> > >> Yes, and if it wasn't an intentional special case (I don't recall > >> discussing it on the list, but it might have been), then it's a bug > >> and suitable for changing. The other behaviors are intentional and > >> thus not suitable for changing. > > > > So how do we know which is which? How does one write a test for an > > unspecified behavior? And it is clear that add.reduce breaks the rules > for > > booleans. > > It breaks it for every integer type, in fact. > > In [11]: add.reduce(array([1]*257, dtype=uint8)) > Out[11]: 257 > > I thought we only did the accumulator-dtype changing for the .sum() > method, but I was wrong. > I think the sum behavior has gone through changes over the last year and that it is time to write down how things are supposed to work. So that we can test that they actually work that way. This means specifying and checking the type of all the outputs. And there are exceptions here also In [33]: x = ones(2,dtype=int8) In [34]: x.sum().dtype Out[34]: dtype('int32') In [35]: x = ones(2,dtype=int64) In [36]: x.sum().dtype Out[36]: dtype('int64') So we have a default accumulator unless the precision is greater than the default. 
Unless the numbers are floats In [37]: x = ones(2,dtype=float32) In [38]: x.sum().dtype Out[38]: dtype('float32') But the accumalator is of the same kind unless the kind is boolean, in which case it is integer. Clear as a bell. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Sat May 24 23:35:18 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 24 May 2008 22:35:18 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 10:24 PM, Charles R Harris wrote: > > > Shouldn't that be the other way round? If you want integers, do > x.sum(dtype=int). Ints don't sum in float64 by default. > The default behavior (x.sum() -> int) is more useful than (x.sum() -> bool) since x.any() already exists. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From charlesr.harris at gmail.com Sat May 24 23:40:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:40:18 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:35 PM, Nathan Bell wrote: > On Sat, May 24, 2008 at 10:24 PM, Charles R Harris > wrote: > > > > > > Shouldn't that be the other way round? If you want integers, do > > x.sum(dtype=int). Ints don't sum in float64 by default. > > > > The default behavior (x.sum() -> int) is more useful than (x.sum() -> > bool) since x.any() already exists. > The question is consistency. A programmer should just have to remember a few simple rules, not a host of special cases. It makes things easier to learn and the code easier to understand because the intent is always made clear. Designing to whatever happens to be convenient at the moment leads to a mess and trying to document all the oddities is a PITA. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Sat May 24 23:40:26 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 24 May 2008 22:40:26 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 10:32 PM, Charles R Harris wrote: > > But the accumalator is of the same kind unless the kind is boolean, in which > case it is integer. Clear as a bell. > I believe the rule is that any integer type smaller than the machine word size is effectively upcast. IMO this is desirable since small types are very likely to overflow. 
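A concrete example of the accumulator rule Nathan describes, assuming the behavior discussed in this thread where sums of narrow integer types accumulate in the native integer:

import numpy as np

x = np.ones(300, dtype=np.uint8)
x.sum()                   # 300: the uint8 data is accumulated in a wider integer
np.add.reduce(x)          # 300 as well
x.sum(dtype=np.uint8)     # 44: forcing the narrow accumulator wraps modulo 256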
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From charlesr.harris at gmail.com Sat May 24 23:48:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 21:48:11 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241940g30d59172jaea83092db0bfc1b@mail.gmail.com> <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:40 PM, Nathan Bell wrote: > On Sat, May 24, 2008 at 10:32 PM, Charles R Harris > wrote: > > > > But the accumalator is of the same kind unless the kind is boolean, in > which > > case it is integer. Clear as a bell. > > > > I believe the rule is that any integer type smaller than the machine > word size is effectively upcast. > > IMO this is desirable since small types are very likely to overflow. > You can't overflow in modular arithmetic, which is how numpy is supposed to work. Try In [51]: x Out[51]: array([2147483647, 2147483647]) In [52]: x.sum() Out[52]: -2 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Sat May 24 23:50:26 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 24 May 2008 22:50:26 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 10:40 PM, Charles R Harris wrote: > > The question is consistency. A programmer should just have to remember a few > simple rules, not a host of special cases. It makes things easier to learn > and the code easier to understand because the intent is always made clear. > Designing to whatever happens to be convenient at the moment leads to a mess > and trying to document all the oddities is a PITA. > Sometimes "do what I expect" is better than rigid consistency. I would argue that avoiding common overflow cases is more important than preserving the dtype when summing. Anyway, the point is moot. There's no way to change x.sum() without breaking lots of code. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From wnbell at gmail.com Sun May 25 00:00:12 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 24 May 2008 23:00:12 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 10:48 PM, Charles R Harris wrote: > > You can't overflow in modular arithmetic, which is how numpy is supposed to > work. Try > > In [51]: x > Out[51]: array([2147483647, 2147483647]) > > In [52]: x.sum() > Out[52]: -2 > I would call that an overflow. Have you considered that other people might have a different notion of "how numpy is supposed to work"? 
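For reference, the wraparound above and the usual way around it; the first result assumes the accumulator is also 32 bits, as in Chuck's session (on platforms whose default integer is 64-bit the plain sum may already be safe):

import numpy as np

x = np.array([2**31 - 1, 2**31 - 1], dtype=np.int32)
x.sum()                   # -2 when accumulated in 32 bits, as shown above
x.sum(dtype=np.int64)     # 4294967294: widen the accumulator explicitly
x.sum(dtype=float)        # 4294967294.0: or accumulate in floating point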
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From charlesr.harris at gmail.com Sun May 25 00:00:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 22:00:49 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 9:50 PM, Nathan Bell wrote: > On Sat, May 24, 2008 at 10:40 PM, Charles R Harris > wrote: > > > > The question is consistency. A programmer should just have to remember a > few > > simple rules, not a host of special cases. It makes things easier to > learn > > and the code easier to understand because the intent is always made > clear. > > Designing to whatever happens to be convenient at the moment leads to a > mess > > and trying to document all the oddities is a PITA. > > > > Sometimes "do what I expect" is better than rigid consistency. I > would argue that avoiding common overflow cases is more important than > preserving the dtype when summing. > > Anyway, the point is moot. There's no way to change x.sum() without > breaking lots of code. > Certainly, and the change to the default accumulator was made to avoid the common expectation of no wraparound. But it is time to document these things and write tests, and the tests will engrave the current behavior in stone. Which is why we need to be sure that the current behavior is in fact correct and not an overlooked booboo. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Sun May 25 00:08:57 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 25 May 2008 00:08:57 -0400 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com><3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: > On Sat, May 24, 2008 at 9:21 PM, Alan G Isaac wrote: >> If you want bools, change the accumulator dtype:: >> >>> x.sum(dtype=bool) On Sat, 24 May 2008, Charles R Harris apparently wrote: > Shouldn't that be the other way round? If you want integers, do > x.sum(dtype=int). Ints don't sum in float64 by default. I was not taking a normative position. Just illustrating the current behavior. I think both sides have a point in this discussion. It sounds like the behavior you like will be costly (in time) to implement, so it would have to be substantially better to be worth doing. I personally like it better, but I worry new users will be startled. (I suppose I would address this by always having the default accumulator be ``float``, figuring that anyone who does not like that will know what to do about it.) Cheers, Alan From charlesr.harris at gmail.com Sun May 25 00:06:32 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 22:06:32 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241959t348bf494o6e44f18162419e3f@mail.gmail.com> <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 10:00 PM, Nathan Bell wrote: > On Sat, May 24, 2008 at 10:48 PM, Charles R Harris > wrote: > > > > You can't overflow in modular arithmetic, which is how numpy is supposed > to > > work. Try > > > > In [51]: x > > Out[51]: array([2147483647, 2147483647]) > > > > In [52]: x.sum() > > Out[52]: -2 > > > > I would call that an overflow. 
> > Have you considered that other people might have a different notion of > "how numpy is supposed to work"? > So, please tell me how numpy is supposed to work. Write as much as you please. If you are so moved, why not write the tests for all 64 ufuncs for all types and combinations and verify that they are all correct as specified and raise errors when they should. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun May 25 00:14:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 22:14:57 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: On Sat, May 24, 2008 at 10:08 PM, Alan G Isaac wrote: > > On Sat, May 24, 2008 at 9:21 PM, Alan G Isaac > wrote: > >> If you want bools, change the accumulator dtype:: > >> >>> x.sum(dtype=bool) > > > On Sat, 24 May 2008, Charles R Harris apparently wrote: > > Shouldn't that be the other way round? If you want integers, do > > x.sum(dtype=int). Ints don't sum in float64 by default. > > > I was not taking a normative position. > Just illustrating the current behavior. > > I think both sides have a point in this discussion. > It sounds like the behavior you like will be costly (in > time) to implement, so it would have to be substantially > better to be worth doing. I personally like it better, > but I worry new users will be startled. (I suppose > I would address this by always having the default > accumulator be ``float``, figuring that anyone who does not > like that will know what to do about it.) > It used to be stay in type and has been changed, and I don't disagree with that, it was discussed on the list. Nevertheless, booleans are different, both their own kind and integers. But my problem is not convenience, my problem is the very inconvenient one of writing comprehensive tests, and for that the desired behavior has to be specified; it can't simply be taken as whatever currently happens. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordan at math.ucsb.edu Sun May 25 01:29:10 2008 From: jordan at math.ucsb.edu (jordan at math.ucsb.edu) Date: Sat, 24 May 2008 22:29:10 -0700 (PDT) Subject: [Numpy-discussion] Dot in C extension Message-ID: <50952.71.102.223.154.1211693350.squirrel@mail.math.ucsb.edu> Hi all, I'm trying to write a Gauss-Seidel function in C++. The function works however it is too slow because I'm not using any acceleration for the vector multiplication. I'm not really sure how to access the dot function in my extension, nor what all the arguments are for. Is this the right function to use (found in ndarrayobject.h): typedef void (PyArray_DotFunc)(void *, npy_intp, void *, npy_intp, void *, npy_intp, void *); I guess the voids are array objects, the two to be dotted and the output. What's the fourth? Thanks all! Cheers From charlesr.harris at gmail.com Sun May 25 01:56:30 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 24 May 2008 23:56:30 -0600 Subject: [Numpy-discussion] Not all typecodes have names Message-ID: ? 
bool b signed char h short i integer l long integer q long long integer p p ---- B unsigned char H unsigned short I unsigned integer L unsigned long integer Q unsigned long long integer P P ---- f single precision d double precision g long precision F complex single precision D complex double precision G complex long double precision S string U unicode V void O object The typecodes come from typecode['All'] and the typenames from typename(). What are the names that should go with p and P? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun May 25 02:12:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 00:12:55 -0600 Subject: [Numpy-discussion] Dot in C extension In-Reply-To: <50952.71.102.223.154.1211693350.squirrel@mail.math.ucsb.edu> References: <50952.71.102.223.154.1211693350.squirrel@mail.math.ucsb.edu> Message-ID: On Sat, May 24, 2008 at 11:29 PM, wrote: > Hi all, > > I'm trying to write a Gauss-Seidel function in C++. The function works > however it is too slow because I'm not using any acceleration for the > vector multiplication. I'm not really sure how to access the dot function > in my extension, nor what all the arguments are for. > > Is this the right function to use (found in ndarrayobject.h): > > typedef void (PyArray_DotFunc)(void *, npy_intp, void *, npy_intp, void *, > npy_intp, void *); > > I guess the voids are array objects, the two to be dotted and the output. > What's the fourth? > It's ignored, so 0 (C++) should do. static void @name at _dot(char *ip1, intp is1, char *ip2, intp is2, char *op, intp n, void *ignore) { register @out@ tmp=(@out@)0; register intp i; for(i=0;i From charlesr.harris at gmail.com Sun May 25 02:29:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 00:29:16 -0600 Subject: [Numpy-discussion] Dot in C extension In-Reply-To: References: <50952.71.102.223.154.1211693350.squirrel@mail.math.ucsb.edu> Message-ID: On Sun, May 25, 2008 at 12:12 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, May 24, 2008 at 11:29 PM, wrote: > >> Hi all, >> >> I'm trying to write a Gauss-Seidel function in C++. The function works >> however it is too slow because I'm not using any acceleration for the >> vector multiplication. I'm not really sure how to access the dot function >> in my extension, nor what all the arguments are for. >> >> Is this the right function to use (found in ndarrayobject.h): >> >> typedef void (PyArray_DotFunc)(void *, npy_intp, void *, npy_intp, void *, >> npy_intp, void *); >> >> I guess the voids are array objects, the two to be dotted and the output. >> What's the fourth? >> > > It's ignored, so 0 (C++) should do. > > static void > @name at _dot(char *ip1, intp is1, char *ip2, intp is2, char *op, intp n, > void *ignore) > { > register @out@ tmp=(@out@)0; > register intp i; > for(i=0;i tmp += (@out@)(*((@type@ *)ip1)) * \ > (@out@)(*((@type@ *)ip2)); > } > *((@type@ *)op) = (@type@) tmp; > } > > Note that the function may call BLAS in practice, but you can figure the > use of the arguments from the above. Ignore the @type@ sort of stuff, it's > replaced by real types by the code generator. > I'm not sure how you get to these functions, which are type specific, someone else will have to supply that answer. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
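Back on the Gauss-Seidel question: a rough pure-NumPy sketch of the kind of sweep being described, leaning on dot() for the row/vector products instead of a hand-written C loop. The function and argument names are made up for illustration and are not part of any existing API:

import numpy as np

def gauss_seidel_sweep(A, b, x):
    # One in-place Gauss-Seidel sweep. The two partial row products go
    # through np.dot, so the inner arithmetic runs in compiled code
    # (and through BLAS where NumPy is linked against it).
    n = b.shape[0]
    for i in range(n):
        s = np.dot(A[i, :i], x[:i]) + np.dot(A[i, i + 1:], x[i + 1:])
        x[i] = (b[i] - s) / A[i, i]
    return x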
URL: From robert.kern at gmail.com Sun May 25 04:02:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 25 May 2008 03:02:00 -0500 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805241847o4b773b52jdfff35363760a5d8@mail.gmail.com> <3d375d730805241936r60563bet6de4c76245369ea@mail.gmail.com> Message-ID: <3d375d730805250102k792fcaban795b36422d6c36ff@mail.gmail.com> On Sat, May 24, 2008 at 11:14 PM, Charles R Harris wrote: > It used to be stay in type and has been changed, and I don't disagree with > that, it was discussed on the list. Nevertheless, booleans are different, > both their own kind and integers. But my problem is not convenience, my > problem is the very inconvenient one of writing comprehensive tests, and for > that the desired behavior has to be specified; it can't simply be taken as > whatever currently happens. Fine. I have verified that the current behaviors you have mentioned are intended. If you want a more concise specification of the special cases, it is this: add.reduce() (and consequently sum()) is the special case. In that context, bool_s are treated as integers, and the integer dtypes plus bool_ use the default integer dtype for the accumulator to forestall overflow in the most common usage. Everything else should follow the generic rules. When bool_s are operated with bool_s, +*- take on Boolean algebraic meanings. When a bool_ is cast to another dtype, True->1 and False->0. bool_ is not part of the integer "kind" so the generic cross-kind rules apply when operations combine bool_s with integers. Does that help? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun May 25 04:04:49 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 25 May 2008 03:04:49 -0500 Subject: [Numpy-discussion] Not all typecodes have names In-Reply-To: References: Message-ID: <3d375d730805250104n1b7e619bx2dae232d3d24b794@mail.gmail.com> On Sun, May 25, 2008 at 12:56 AM, Charles R Harris wrote: > ? bool > b signed char > h short > i integer > l long integer > q long long integer > p p ---- > B unsigned char > H unsigned short > I unsigned integer > L unsigned long integer > Q unsigned long long integer > P P ---- > f single precision > d double precision > g long precision > F complex single precision > D complex double precision > G complex long double precision > S string > U unicode > V void > O object > > The typecodes come from typecode['All'] and the typenames from typename(). > What are the names that should go with p and P? I believe they correspond to C's ssize_t and size_t respectively. I don't know if those are good names, per se, in this context. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
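For anyone poking at this interactively, the pieces Chuck mentions can be inspected directly; the last line is a guess at how 'p' and 'P' show up in practice, and it matches Robert's ssize_t/size_t reading:

import numpy as np

np.typecodes['All']             # the full string of one-character codes
np.typename('f')                # 'single precision'
np.typename('b')                # 'signed char'
np.dtype('p'), np.dtype('P')    # pointer-sized signed/unsigned integers (intp/uintp)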
-- Umberto Eco From andrea.gavana at gmail.com Sun May 25 05:22:37 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 25 May 2008 10:22:37 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805241211m1318264fu7a837afbe3ba5041@mail.gmail.com> References: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> <9457e7c80805241211m1318264fu7a837afbe3ba5041@mail.gmail.com> Message-ID: Hi Stefan & All, On Sat, May 24, 2008 at 8:11 PM, St?fan van der Walt wrote: > Hi Andrea > > 2008/5/24 Andrea Gavana : >> Number Of Cells: 50000 >> --------------------------------------------------------------------- >> | Rank | Method Name | Execution Time | Relative Slowness | >> --------------------------------------------------------------------- >> 1 NumPy 5 (Andrea) 0.00562803 1.00000 >> 2 NumPy 1 (Francesc) 0.00563365 1.00100 >> 3 NumPy 2 (Nathan) 0.00577322 1.02580 >> 4 NumPy 4 (Nathan-Vector) 0.00580577 1.03158 >> 5 Fortran 6 (James) 0.00660514 1.17361 >> 6 Fortran 3 (Alex) 0.00709856 1.26129 >> 7 Fortran 5 (James) 0.00748726 1.33035 >> 8 Fortran 2 (Mine) 0.00748816 1.33051 >> 9 Fortran 1 (Mine) 0.00775906 1.37864 >> 10 Fortran 4 {Michael) 0.00777685 1.38181 >> 11 NumPy 3 (Peter) 0.01253662 2.22753 >> 12 Cython (Stefan) 0.01597804 2.83901 >> --------------------------------------------------------------------- > > When you bench the Cython code, you'll have to take out the Python > calls (for checking dtype etc.), otherwise you're comparing apples and > oranges. After I tweaked it, it ran roughly the same time as > Francesc's version. But like I mentioned before, the Fortran results > should trump all, so what is going on here? I thought I had removed the Python checks from the Cython code (if you look at the attached files), but maybe I haven't removed them all... about Fortran, I have no idea: I have 6 different implementations in Fortran, and they are all slower than the pure NumPy ones. I don't know if I can optimiza them further (I have asked to a Fortran newsgroup too, but no faster solution has arisen). I am not even sure if the defaults f2py compiler options are already on "maximum optimization" for Fortran. Does anyone know if this is true? Maybe Pearu can shed some light on this issue... Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From david at ar.media.kyoto-u.ac.jp Sun May 25 05:28:52 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 25 May 2008 18:28:52 +0900 Subject: [Numpy-discussion] Controlling the way a module is imported Message-ID: <48393154.2000209@ar.media.kyoto-u.ac.jp> Hi, This is not numpy specific, but I need it for numpy/scipy. More specifically, I would like to be able to have one module interface which load imp1, imp2, imp3, etc... depending on some options. I see two obvious solutions: monkey patching, and file configuration, but I try to avoid the former, and there is no mechanism for the later in scipy. I read that import hooks could be used for this purpose, but I don't quite understand the pep: http://www.python.org/dev/peps/pep-0302/ Has anyone played with it ? 
cheers, David From robert.kern at gmail.com Sun May 25 05:57:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 25 May 2008 04:57:25 -0500 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <48393154.2000209@ar.media.kyoto-u.ac.jp> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730805250257y1c7faddaufa17554c11c19db4@mail.gmail.com> On Sun, May 25, 2008 at 4:28 AM, David Cournapeau wrote: > Hi, > > This is not numpy specific, but I need it for numpy/scipy. More > specifically, I would like to be able to have one module interface which > load imp1, imp2, imp3, etc... depending on some options. Can you be more specific? > I see two > obvious solutions: monkey patching, and file configuration, but I try to > avoid the former, and there is no mechanism for the later in scipy. I > read that import hooks could be used for this purpose, but I don't quite > understand the pep: > > http://www.python.org/dev/peps/pep-0302/ > > Has anyone played with it ? Avoid custom import hooks at all costs. They are very fragile and interfere with each other. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Sun May 25 06:02:49 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 25 May 2008 19:02:49 +0900 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <3d375d730805250257y1c7faddaufa17554c11c19db4@mail.gmail.com> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> <3d375d730805250257y1c7faddaufa17554c11c19db4@mail.gmail.com> Message-ID: <48393949.4070008@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > Can you be more specific? Sure: in my branch to refactor fftpack, every non default backend (that is everything but fftpack) is a separate python module, which implements some fft functions, and is 'importable'. So in scipy.fftpack, I have a function which: - tries to find one of the backend from a list (by using __import__) - if no backend is found, just use the default backend - if one is found, pull the functions from the found backend, and use the default backend as a fallback for non available functions (each backend can implement a different subset of fftpack). This is done in the following prototype: http://projects.scipy.org/scipy/scipy/browser/branches/refactor_fft/scipy/fftpack/common.py I would like to add the possibility to control which backend is used: the problem is how to communicate the name of the backend to scipy.fftpack in a persistent way (such as if you want to use mkl, it will remember it at the next python session), and in a dynamic way (such as you can decide which one to use before importing scipy.fftpack, or even changing the backend on the fly). thanks, David From zachary.pincus at yale.edu Sun May 25 11:25:32 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Sun, 25 May 2008 11:25:32 -0400 Subject: [Numpy-discussion] build outside C code with scons within numpy distutils Message-ID: Hello all, I've been following David's work making numpy build with scons with some interest. I have a quick question about how one might use scons/ numpy distutils from an outside project. Specifically, I have a package that uses numpy and numpy.distutils to built itself. 
Unfortunately, there are some pure-C libraries that I call using ctypes, and as these libraries are are not python extensions, it is hard to get distutils to build them happily on all platforms. From what I understand, one of the benefits of having scons available was that building and packaging these sort of libraries would become easier. Is that the case? And if so, what would be the best way to add the scons stuff to a simple setup.py file written for numpy.distutils? Thanks, Zach From peridot.faceted at gmail.com Sun May 25 13:30:08 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 25 May 2008 13:30:08 -0400 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: 2008/5/25 Charles R Harris : > So, please tell me how numpy is supposed to work. Write as much as you > please. If you are so moved, why not write the tests for all 64 ufuncs for > all types and combinations and verify that they are all correct as specified > and raise errors when they should. This sounds like something that will be easier with nose (if I understand nose correctly). Since the ufuncs are so uniform, it seems like the zillion tests you are proposing writing should be expressed as a list of ufuncs with some type annotations, a list of types to run them against, and a programmatic expression of the type rules Robert is proposing. While you can do this with the current test framework, you get one test rather than many, and the failure report is not nearly so valuable. Anne From wright at esrf.fr Sun May 25 13:38:43 2008 From: wright at esrf.fr (Jonathan Wright) Date: Sun, 25 May 2008 19:38:43 +0200 Subject: [Numpy-discussion] should abs gives negative at end of integer range? Message-ID: <4839A423.3040800@esrf.fr> This one comes up in a Java puzzler, but applies equally to numpy. http://www.youtube.com/watch?v=wDN_EYUvUq0 >>> import numpy, sys >>> abs(numpy.array([-sys.maxint-1],numpy.int)) > 0 array([False], dtype=bool) >>> abs(numpy.array([-129,-128,-127],numpy.int8)) > 0 array([ True, False, True], dtype=bool) ... etc. Sort of surprising that abs gives something negative. Is this the intended behaviour as covered by a unit test and doc already? -Jon From charlesr.harris at gmail.com Sun May 25 14:12:36 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 12:12:36 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review Message-ID: Hi All, Here is the current behavior of the ufuncs and some comments. They don't yet cover mixed types for binary functions, but when they do we will see things like: In [7]: power(True,10) Out[7]: array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) Which looks suspect ;) 1) Help strings on ufuncs don't work. This seems to be a problem with the help function, as printing the relevant __doc__ works fine. The docstrings are currently defined in code_generators/generate_umath.py and add_newdoc doesn't seem to work for them. 2) Complex divmod(), // and % are deprecated, should we make them raise errors? 3) The current behavior of remainder for complex is bizarre. Nor does it raise a deprecation warning. 4) IMHO, absolute('?') should return 'b' 5) Negative applied to '?' is equivalent to not. This gives me mixed feelings; the same functionality is covered by invert and logical_not. 
6) The fmod ufunc applied to complex returns AttributeError. Shouldn't it be a TypeError? 7) Should degrees and radians work on complex? Hey, they work on booleans and it's just scaling. conjugate in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , absolute in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , f , d , g , negative in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , sign in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , invert in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , err, err, err, err, err, err, degrees in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, radians in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, arccos in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , arccosh in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , arcsin in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , arcsinh in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , arctan in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , arctanh in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , cos in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , sin in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , tan in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , cosh in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , sinh in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , tanh in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , exp in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , expm1 in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , log in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , log10 in: ? 
, b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , log1p in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , sqrt in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , ceil in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, floor in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, fabs in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, rint in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , isnan in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , isinf in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , isfinite in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , signbit in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , err, err, err, left_shift in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , err, err, err, err, err, err, right_shift in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , err, err, err, err, err, err, add in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , subtract in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , multiply in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , divide in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , floor_divide in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , true_divide in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , F , D , G , fmod in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , err, err, err, power in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , greater in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , greater_equal in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , less in: ? 
, b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , less_equal in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , equal in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , not_equal in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , logical_and in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , logical_not in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , logical_or in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , logical_xor in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , maximum in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , minimum in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , bitwise_and in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , err, err, err, err, err, err, bitwise_or in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , err, err, err, err, err, err, bitwise_xor in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: ? , b , h , i , l , q , B , H , I , L , Q , err, err, err, err, err, err, arctan2 in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, remainder in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: b , b , h , i , l , q , B , H , I , L , Q , f , d , g , O , O , err, hypot in: ? , b , h , i , l , q , B , H , I , L , Q , f , d , g , F , D , G , out: f , f , f , d , d , d , f , f , d , d , d , f , d , g , err, err, err, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sun May 25 14:13:15 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 25 May 2008 11:13:15 -0700 Subject: [Numpy-discussion] isnan surprise Message-ID: >> x = np.array([1.0]) >> np.isnan(x) array([False], dtype=bool) # <----- Expected >> np.isnan(x,x) array([ 0.]) # <----- Surprise (to me) The same happens with isfinite, isinf, etc. 
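A minimal sketch of the workaround this implies (not from the original post; it assumes only that the ufunc casts its result to the dtype of the supplied output array, which is confirmed further down the thread): hand the ufunc a boolean buffer instead of a float one.

    import numpy as np

    x = np.array([1.0, np.nan])
    out = np.empty(x.shape, dtype=bool)  # boolean output buffer
    np.isnan(x, out)                     # returns array([False,  True], dtype=bool)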
My use case (self.x is an array): def isnan(self): y = self.copy() np.isnan(y.x, y.x) return y Then when I try to do myobj[myobj.isnan()] = 0 I get (since myobj.isnan() are floats) IndexError: arrays used as indices must be of integer (or boolean) type From kwgoodman at gmail.com Sun May 25 14:17:59 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 25 May 2008 11:17:59 -0700 Subject: [Numpy-discussion] isnan surprise In-Reply-To: References: Message-ID: On Sun, May 25, 2008 at 11:13 AM, Keith Goodman wrote: >>> x = np.array([1.0]) >>> np.isnan(x) > array([False], dtype=bool) # <----- Expected >>> np.isnan(x,x) > array([ 0.]) # <----- Surprise (to me) I guess this is not surprising since I'm asking isnan to put the answer in a float array. From charlesr.harris at gmail.com Sun May 25 14:19:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 12:19:55 -0600 Subject: [Numpy-discussion] should abs gives negative at end of integer range? In-Reply-To: <4839A423.3040800@esrf.fr> References: <4839A423.3040800@esrf.fr> Message-ID: On Sun, May 25, 2008 at 11:38 AM, Jonathan Wright wrote: > This one comes up in a Java puzzler, but applies equally to numpy. > > http://www.youtube.com/watch?v=wDN_EYUvUq0 > > >>> import numpy, sys > >>> abs(numpy.array([-sys.maxint-1],numpy.int)) > 0 > array([False], dtype=bool) > >>> abs(numpy.array([-129,-128,-127],numpy.int8)) > 0 > array([ True, False, True], dtype=bool) > > ... etc. Sort of surprising that abs gives something negative. Is this > the intended behaviour as covered by a unit test and doc already? > I don't think it's specified. However In [1]: x = array([-128], dtype=int8) In [2]: -x Out[2]: array([-128], dtype=int8) Because that's how two's complement operates. In two's complement the positive and negative values aren't symmetric. OTOH, abs has to return a positive value. Hmm..., we could return the corresponding unsigned type in this case, but folks might not like changing the type, either. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun May 25 14:28:06 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 12:28:06 -0600 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: On Sun, May 25, 2008 at 11:30 AM, Anne Archibald wrote: > 2008/5/25 Charles R Harris : > > > So, please tell me how numpy is supposed to work. Write as much as you > > please. If you are so moved, why not write the tests for all 64 ufuncs > for > > all types and combinations and verify that they are all correct as > specified > > and raise errors when they should. > > This sounds like something that will be easier with nose (if I > understand nose correctly). Since the ufuncs are so uniform, The loops are easy enough, the needed information is in the ufuncs: nin,nout, etc. It is the results that need to be specified because they aren't, in fact, uniform. And all the promotion rules and such also need to be specified and checked. So we need to know what is supposed to happen and pinning that down is the problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Sun May 25 14:42:13 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 25 May 2008 21:42:13 +0300 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: References: Message-ID: <1211740934.11579.27.camel@localhost.localdomain> su, 2008-05-25 kello 12:12 -0600, Charles R Harris kirjoitti: [clip] > 1) Help strings on ufuncs don't work. This seems to be a problem with the help function, as > printing the relevant __doc__ works fine. The docstrings are currently defined in > code_generators/generate_umath.py and add_newdoc doesn't seem to work for them. Yes, this is a problem with the help function. In pydoc.py in the Python standard library: def doc(...): ... if not (inspect.ismodule(object) or inspect.isclass(object) or inspect.isroutine(object) or inspect.isgetsetdescriptor(object) or inspect.ismemberdescriptor(object) or isinstance(object, property)): # If the passed object is a piece of data or an instance, # document its available methods instead of its value. object = type(object) ... Is it possible to make one of the above conditions True for ufuncs? -- Pauli Virtanen -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From charlesr.harris at gmail.com Sun May 25 14:43:39 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 12:43:39 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: References: Message-ID: On Sun, May 25, 2008 at 12:12 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > Here is the current behavior of the ufuncs and some comments. They don't > yet cover mixed types for binary functions, > but when they do we will see things like: > > In [7]: power(True,10) > Out[7]: > array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, > 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) > > Which looks suspect ;) > > 1) Help strings on ufuncs don't work. This seems to be a problem with the > help function, as > printing the relevant __doc__ works fine. The docstrings are currently > defined in > code_generators/generate_umath.py and add_newdoc doesn't seem to work > for them. > > 2) Complex divmod(), // and % are deprecated, should we make them raise > errors? > > 3) The current behavior of remainder for complex is bizarre. Nor does it > raise a deprecation warning. > > 4) IMHO, absolute('?') should return 'b' > > 5) Negative applied to '?' is equivalent to not. This gives me mixed > feelings; the same functionality > is covered by invert and logical_not. > > 6) The fmod ufunc applied to complex returns AttributeError. Shouldn't it > be a TypeError? > > 7) Should degrees and radians work on complex? Hey, they work on booleans > and it's just scaling. > Let me add that I have mixed feelings about overloading *any* of the arithmetic operators for booleans, I think they should all promote the kind. But I can see some virtue in the current behavior when one wants to write out a complicated boolean expression. However, overloading - with ^ violates the symmetry of the latter operator. Nor will -(a - b) be equivalent to b - a. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun May 25 14:55:52 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 12:55:52 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: <1211740934.11579.27.camel@localhost.localdomain> References: <1211740934.11579.27.camel@localhost.localdomain> Message-ID: On Sun, May 25, 2008 at 12:42 PM, Pauli Virtanen wrote: > su, 2008-05-25 kello 12:12 -0600, Charles R Harris kirjoitti: > [clip] > > 1) Help strings on ufuncs don't work. This seems to be a problem with the > help function, as > > printing the relevant __doc__ works fine. The docstrings are currently > defined in > > code_generators/generate_umath.py and add_newdoc doesn't seem to work > for them. > > Yes, this is a problem with the help function. In pydoc.py in the Python > standard library: > > def doc(...): > ... > if not (inspect.ismodule(object) or > inspect.isclass(object) or > inspect.isroutine(object) or > inspect.isgetsetdescriptor(object) or > inspect.ismemberdescriptor(object) or > isinstance(object, property)): > # If the passed object is a piece of data or an instance, > # document its available methods instead of its value. > object = type(object) > ... > > Is it possible to make one of the above conditions True for ufuncs? > I don't see why not, we could have numpy.doc. Hmm, it should probably be changed in ipython also. Something like In [5]: isinstance(sin, numpy.ufunc) Out[5]: True might do the trick. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun May 25 15:59:45 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 25 May 2008 21:59:45 +0200 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: References: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> <9457e7c80805241211m1318264fu7a837afbe3ba5041@mail.gmail.com> Message-ID: <9457e7c80805251259l7f3deeaalb13b19d002f37a9a@mail.gmail.com> Hi Andrea 2008/5/25 Andrea Gavana : >> When you bench the Cython code, you'll have to take out the Python >> calls (for checking dtype etc.), otherwise you're comparing apples and >> oranges. After I tweaked it, it ran roughly the same time as >> Francesc's version. But like I mentioned before, the Fortran results >> should trump all, so what is going on here? > > I thought I had removed the Python checks from the Cython code (if you > look at the attached files), but maybe I haven't removed them all... > about Fortran, I have no idea: I have 6 different implementations in > Fortran, and they are all slower than the pure NumPy ones. I don't > know if I can optimiza them further (I have asked to a Fortran > newsgroup too, but no faster solution has arisen). I am not even sure > if the defaults f2py compiler options are already on "maximum > optimization" for Fortran. Does anyone know if this is true? Maybe > Pearu can shed some light on this issue... Here are the timings on my machine. You were right -- you did remove the type checks in the Cython code. It turns out the bottleneck was the "for i in range" loop. I was under the impression that loop was correctly optimised; I'll have a chat with the Cython developers. If I change it to "for i in xrange" or "for i from 0 <= i < n" the speed improves a lot. I used gfortran 4.2.1, and as you can see below, the Fortran implementation beat all the others. 
--------------------------------------------------------------------- Number Of Cells: 300000 --------------------------------------------------------------------- | Rank | Method Name | Execution Time | Relative Slowness | --------------------------------------------------------------------- 1 Fortran 5 (James) 0.01854820 1.00000 2 Fortran 6 (James) 0.01882849 1.01511 3 Fortran 1 (Mine) 0.01917751 1.03393 4 Fortran 4 {Michael) 0.01927021 1.03893 5 Fortran 2 (Mine) 0.01937311 1.04447 6 NumPy 4 (Nathan-Vector) 0.02008982 1.08311 7 NumPy 2 (Nathan) 0.02046990 1.10361 8 NumPy 5 (Andrea) 0.02108521 1.13678 9 NumPy 1 (Francesc) 0.02211959 1.19255 10 Cython (Stefan) 0.02235680 1.20533 11 Fortran 3 (Alex) 0.02486629 1.34063 12 NumPy 3 (Peter) 0.05020461 2.70671 --------------------------------------------------------------------- Regards St?fan From charlesr.harris at gmail.com Sun May 25 16:01:44 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 14:01:44 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: References: <1211740934.11579.27.camel@localhost.localdomain> Message-ID: On Sun, May 25, 2008 at 12:55 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sun, May 25, 2008 at 12:42 PM, Pauli Virtanen wrote: > >> su, 2008-05-25 kello 12:12 -0600, Charles R Harris kirjoitti: >> [clip] >> > 1) Help strings on ufuncs don't work. This seems to be a problem with >> the help function, as >> > printing the relevant __doc__ works fine. The docstrings are >> currently defined in >> > code_generators/generate_umath.py and add_newdoc doesn't seem to work >> for them. >> >> Yes, this is a problem with the help function. In pydoc.py in the Python >> standard library: >> >> def doc(...): >> ... >> if not (inspect.ismodule(object) or >> inspect.isclass(object) or >> inspect.isroutine(object) or >> inspect.isgetsetdescriptor(object) or >> inspect.ismemberdescriptor(object) or >> isinstance(object, property)): >> # If the passed object is a piece of data or an instance, >> # document its available methods instead of its value. >> object = type(object) >> ... >> >> Is it possible to make one of the above conditions True for ufuncs? >> > > I don't see why not, we could have numpy.doc. Hmm, it should probably be > changed in ipython also. Something like > > In [5]: isinstance(sin, numpy.ufunc) > Out[5]: True > > might do the trick. > As to fixing things so that default python help works, hmmm. I don't know how to do that. The functions in the inspect module all use python types, and I'm not sure how to make numpy.ufunc recognizable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun May 25 16:14:38 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 25 May 2008 22:14:38 +0200 Subject: [Numpy-discussion] ufunc oddities In-Reply-To: References: <3d375d730805242006y19023c3ey282f10c07db58b56@mail.gmail.com> <3d375d730805242018o49af33c4o109fe9d2c72acc80@mail.gmail.com> Message-ID: <9457e7c80805251314s69142cd1x162c288a49e61b98@mail.gmail.com> 2008/5/25 Anne Archibald : > 2008/5/25 Charles R Harris : > >> So, please tell me how numpy is supposed to work. Write as much as you >> please. If you are so moved, why not write the tests for all 64 ufuncs for >> all types and combinations and verify that they are all correct as specified >> and raise errors when they should. 
> > This sounds like something that will be easier with nose (if I > understand nose correctly). Since the ufuncs are so uniform, it seems > like the zillion tests you are proposing writing should be expressed > as a list of ufuncs with some type annotations, a list of types to run > them against, and a programmatic expression of the type rules Robert > is proposing. While you can do this with the current test framework, > you get one test rather than many, and the failure report is not > nearly so valuable. We do have ParametricTestCase, but yes -- nose makes this very easy. Regards St?fan From robert.kern at gmail.com Sun May 25 16:25:49 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 25 May 2008 15:25:49 -0500 Subject: [Numpy-discussion] isnan surprise In-Reply-To: References: Message-ID: <3d375d730805251325u6b1da1datdb4ed2759f7d4dd4@mail.gmail.com> On Sun, May 25, 2008 at 1:17 PM, Keith Goodman wrote: > On Sun, May 25, 2008 at 11:13 AM, Keith Goodman wrote: >>>> x = np.array([1.0]) >>>> np.isnan(x) >> array([False], dtype=bool) # <----- Expected >>>> np.isnan(x,x) >> array([ 0.]) # <----- Surprise (to me) > > I guess this is not surprising since I'm asking isnan to put the > answer in a float array. Correct. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun May 25 16:29:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 25 May 2008 15:29:25 -0500 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: References: Message-ID: <3d375d730805251329p49fde3d9kec1df3244297343e@mail.gmail.com> On Sun, May 25, 2008 at 1:12 PM, Charles R Harris wrote: > Hi All, > > Here is the current behavior of the ufuncs and some comments. They don't yet > cover mixed types for binary functions, > but when they do we will see things like: > > In [7]: power(True,10) > Out[7]: > array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, > 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) > > Which looks suspect ;) Very much so. >>> from numpy import * >>> power(True, 10) 1 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sun May 25 16:39:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 14:39:09 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: <3d375d730805251329p49fde3d9kec1df3244297343e@mail.gmail.com> References: <3d375d730805251329p49fde3d9kec1df3244297343e@mail.gmail.com> Message-ID: On Sun, May 25, 2008 at 2:29 PM, Robert Kern wrote: > On Sun, May 25, 2008 at 1:12 PM, Charles R Harris > wrote: > > Hi All, > > > > Here is the current behavior of the ufuncs and some comments. They don't > yet > > cover mixed types for binary functions, > > but when they do we will see things like: > > > > In [7]: power(True,10) > > Out[7]: > > array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, > > 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) > > > > Which looks suspect ;) > > Very much so. > > >>> from numpy import * > >>> power(True, 10) > 1 > Ah, it was a PyLab thing. I needed to update to the latest. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun May 25 16:43:53 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 14:43:53 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: <3d375d730805251329p49fde3d9kec1df3244297343e@mail.gmail.com> References: <3d375d730805251329p49fde3d9kec1df3244297343e@mail.gmail.com> Message-ID: On Sun, May 25, 2008 at 2:29 PM, Robert Kern wrote: > On Sun, May 25, 2008 at 1:12 PM, Charles R Harris > wrote: > > Hi All, > > > > Here is the current behavior of the ufuncs and some comments. They don't > yet > > cover mixed types for binary functions, > > but when they do we will see things like: > > > > In [7]: power(True,10) > > Out[7]: > > array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, > > 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) > > > > Which looks suspect ;) > > Very much so. > > >>> from numpy import * > >>> power(True, 10) > 1 > What about the signatures? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From discerptor at gmail.com Sun May 25 17:35:01 2008 From: discerptor at gmail.com (Joshua Lippai) Date: Sun, 25 May 2008 17:35:01 -0400 Subject: [Numpy-discussion] numpy.test error: test_hdquantiles Message-ID: <9911419a0805251435q646ce201i54ffb3922d18e236@mail.gmail.com> I seem to be getting a few errors and failures with the current numpy SVN (1.2.0.dev5236). I get this output with numpy.test(1,10): ERROR: Ticket #396 ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", line 602, in check_poly1d_nan_roots self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 661, in __getattr__ return roots(self.coeffs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 124, in roots roots = _eigvals(A) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 40, in _eigvals return eigvals(arg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 478, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 150, in eig a1 = asarray_chkfinite(a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py", line 527, in asarray_chkfinite raise ValueError, "array must not contain infs or NaNs" ValueError: array must not contain infs or NaNs ====================================================================== ERROR: Ticket #396 ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", line 602, in check_poly1d_nan_roots self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", line 320, in failUnlessRaises 
callableObj(*args, **kwargs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 661, in __getattr__ return roots(self.coeffs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 124, in roots roots = _eigvals(A) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 37, in _eigvals return eigvals(arg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 478, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", line 150, in eig a1 = asarray_chkfinite(a) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py", line 527, in asarray_chkfinite raise ValueError, "array must not contain infs or NaNs" ValueError: array must not contain infs or NaNs ====================================================================== ERROR: test_hdquantiles (numpy.ma.tests.test_morestats.TestQuantiles) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 97, in test_hdquantiles hdq = hdquantiles_sd(data,[0.25, 0.5, 0.75]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", line 168, in hdquantiles_sd result = _hdsd_1D(data.compressed(), p) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", line 144, in _hdsd_1D xsorted = numpy.sort(data.compressed()) AttributeError: 'numpy.ndarray' object has no attribute 'compressed' ====================================================================== FAIL: Tests the Marits-Jarrett estimator ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", line 36, in test_mjci assert_almost_equal(mjci(data),[55.76819,45.84028,198.8788],5) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 134, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 227, in assert_array_almost_equal header='Arrays are not almost equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 193, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 33.3333333333%) x: array([ 55.76818915, 45.84027529, 198.8787528 ]) y: array([ 55.76819, 45.84028, 198.8788 ]) ====================================================================== FAIL: Test quantiles 1D - w/ mask. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mstats.py", line 61, in test_1d_mask [24.833333, 50.0, 75.166666]) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 134, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 227, in assert_array_almost_equal header='Arrays are not almost equal') File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", line 193, in assert_array_compare assert cond, msg AssertionError: Arrays are not almost equal (mismatch 66.6666666667%) x: array([ 24.83333333, 50. , 75.16666667]) y: array([ 24.833333, 50. , 75.166666]) ---------------------------------------------------------------------- Ran 1301 tests in 1.501s FAILED (failures=2, errors=3) >>> numpy.__version__ '1.2.0.dev5236' >>> scipy.__version__ '0.7.0.dev4386' I get that the failures are just NumPy being anal about the accuracy and rounding habits, but the errors concern me. The first two are the result of a ticket that's only supposed to pop up when SciPy isn't installed (which it is, and I even imported it before running the test). I assume those will go away by some strange voodoo magic on their own eventually because this has happened to me before with that particular ticket. The third error concerns me, though. Is anyone else getting an error with test_hdquantiles? From charlesr.harris at gmail.com Sun May 25 17:52:26 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 15:52:26 -0600 Subject: [Numpy-discussion] numpy.test error: test_hdquantiles In-Reply-To: <9911419a0805251435q646ce201i54ffb3922d18e236@mail.gmail.com> References: <9911419a0805251435q646ce201i54ffb3922d18e236@mail.gmail.com> Message-ID: On Sun, May 25, 2008 at 3:35 PM, Joshua Lippai wrote: > I seem to be getting a few errors and failures with the current numpy > SVN (1.2.0.dev5236). 
I get this output with numpy.test(1,10): > > ERROR: Ticket #396 > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", > line 602, in check_poly1d_nan_roots > self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", > line 320, in failUnlessRaises > callableObj(*args, **kwargs) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", > line 661, in __getattr__ > return roots(self.coeffs) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", > line 124, in roots > roots = _eigvals(A) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", > line 40, in _eigvals > return eigvals(arg) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", > line 478, in eigvals > return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", > line 150, in eig > a1 = asarray_chkfinite(a) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py", > line 527, in asarray_chkfinite > raise ValueError, "array must not contain infs or NaNs" > ValueError: array must not contain infs or NaNs > > ====================================================================== > ERROR: Ticket #396 > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", > line 602, in check_poly1d_nan_roots > self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,"r") > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", > line 320, in failUnlessRaises > callableObj(*args, **kwargs) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", > line 661, in __getattr__ > return roots(self.coeffs) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", > line 124, in roots > roots = _eigvals(A) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py", > line 37, in _eigvals > return eigvals(arg) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", > line 478, in eigvals > return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py", > line 150, in eig > a1 = asarray_chkfinite(a) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py", > line 527, in asarray_chkfinite > raise ValueError, "array must not contain infs or NaNs" > ValueError: array must not contain infs or NaNs > > ====================================================================== > ERROR: test_hdquantiles (numpy.ma.tests.test_morestats.TestQuantiles) > ---------------------------------------------------------------------- > 
Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", > line 97, in test_hdquantiles > hdq = hdquantiles_sd(data,[0.25, 0.5, 0.75]) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", > line 168, in hdquantiles_sd > result = _hdsd_1D(data.compressed(), p) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/morestats.py", > line 144, in _hdsd_1D > xsorted = numpy.sort(data.compressed()) > AttributeError: 'numpy.ndarray' object has no attribute 'compressed' > > ====================================================================== > FAIL: Tests the Marits-Jarrett estimator > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_morestats.py", > line 36, in test_mjci > assert_almost_equal(mjci(data),[55.76819,45.84028,198.8788],5) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", > line 134, in assert_almost_equal > return assert_array_almost_equal(actual, desired, decimal, err_msg) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", > line 227, in assert_array_almost_equal > header='Arrays are not almost equal') > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", > line 193, in assert_array_compare > assert cond, msg > AssertionError: > Arrays are not almost equal > > (mismatch 33.3333333333%) > x: array([ 55.76818915, 45.84027529, 198.8787528 ]) > y: array([ 55.76819, 45.84028, 198.8788 ]) > > ====================================================================== > FAIL: Test quantiles 1D - w/ mask. > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/tests/test_mstats.py", > line 61, in test_1d_mask > [24.833333, 50.0, 75.166666]) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", > line 134, in assert_almost_equal > return assert_array_almost_equal(actual, desired, decimal, err_msg) > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", > line 227, in assert_array_almost_equal > header='Arrays are not almost equal') > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/testutils.py", > line 193, in assert_array_compare > assert cond, msg > AssertionError: > Arrays are not almost equal > > (mismatch 66.6666666667%) > x: array([ 24.83333333, 50. , 75.16666667]) > y: array([ 24.833333, 50. , 75.166666]) > > ---------------------------------------------------------------------- > Ran 1301 tests in 1.501s > > FAILED (failures=2, errors=3) > > >>> numpy.__version__ > '1.2.0.dev5236' > >>> scipy.__version__ > '0.7.0.dev4386' > > I get that the failures are just NumPy being anal about the accuracy > and rounding habits, but the errors concern me. The first two are the > result of a ticket that's only supposed to pop up when SciPy isn't > installed (which it is, and I even imported it before running the > test). 
I assume those will go away by some strange voodoo magic on > their own eventually because this has happened to me before with that > particular ticket. The third error concerns me, though. Is anyone else > getting an error with test_hdquantiles? > ______ Try deleting site-packages/numpy and the build directory and reinstalling. The current default testing level is greater than it was and some old tests may be hanging about. What platform are you on? I get 1353 tests run, which is quite a bit larger than the 1301 your output shows. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Sun May 25 18:03:42 2008 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 25 May 2008 23:03:42 +0100 Subject: [Numpy-discussion] Multiple Boolean Operations In-Reply-To: <9457e7c80805251259l7f3deeaalb13b19d002f37a9a@mail.gmail.com> References: <6be8b94a0805230816j300bafe2l1df3d8a5a8a027a9@mail.gmail.com> <9457e7c80805241211m1318264fu7a837afbe3ba5041@mail.gmail.com> <9457e7c80805251259l7f3deeaalb13b19d002f37a9a@mail.gmail.com> Message-ID: Hi Stefan & All, On Sun, May 25, 2008 at 8:59 PM, St?fan van der Walt wrote: > Hi Andrea > > 2008/5/25 Andrea Gavana : >>> When you bench the Cython code, you'll have to take out the Python >>> calls (for checking dtype etc.), otherwise you're comparing apples and >>> oranges. After I tweaked it, it ran roughly the same time as >>> Francesc's version. But like I mentioned before, the Fortran results >>> should trump all, so what is going on here? >> >> I thought I had removed the Python checks from the Cython code (if you >> look at the attached files), but maybe I haven't removed them all... >> about Fortran, I have no idea: I have 6 different implementations in >> Fortran, and they are all slower than the pure NumPy ones. I don't >> know if I can optimiza them further (I have asked to a Fortran >> newsgroup too, but no faster solution has arisen). I am not even sure >> if the defaults f2py compiler options are already on "maximum >> optimization" for Fortran. Does anyone know if this is true? Maybe >> Pearu can shed some light on this issue... > > Here are the timings on my machine. You were right -- you did remove > the type checks in the Cython code. It turns out the bottleneck was > the "for i in range" loop. I was under the impression that loop was > correctly optimised; I'll have a chat with the Cython developers. If > I change it to "for i in xrange" or "for i from 0 <= i < n" the speed > improves a lot. > > I used gfortran 4.2.1, and as you can see below, the Fortran > implementation beat all the others. 
> > --------------------------------------------------------------------- > Number Of Cells: 300000 > --------------------------------------------------------------------- > | Rank | Method Name | Execution Time | Relative Slowness | > --------------------------------------------------------------------- > 1 Fortran 5 (James) 0.01854820 1.00000 > 2 Fortran 6 (James) 0.01882849 1.01511 > 3 Fortran 1 (Mine) 0.01917751 1.03393 > 4 Fortran 4 {Michael) 0.01927021 1.03893 > 5 Fortran 2 (Mine) 0.01937311 1.04447 > 6 NumPy 4 (Nathan-Vector) 0.02008982 1.08311 > 7 NumPy 2 (Nathan) 0.02046990 1.10361 > 8 NumPy 5 (Andrea) 0.02108521 1.13678 > 9 NumPy 1 (Francesc) 0.02211959 1.19255 > 10 Cython (Stefan) 0.02235680 1.20533 > 11 Fortran 3 (Alex) 0.02486629 1.34063 > 12 NumPy 3 (Peter) 0.05020461 2.70671 > --------------------------------------------------------------------- Thank you for the tests you have run. I have run mine with Compaq Visual Fortran 6.6 on Windows XP, and I assume I can not use gfortran with MS visual studio 2003 (which is the compiler Python is built with). So the only option I have is to try Intel Visual Fortran, which I will do tomorrow, or stick with the numpy implementation as it looks like there is nothing more I can do (even with sorted arrays or using another approach, and I can't think of anything else) to speed up my problem. A big thank you to everyone in this list, you have been very kind and helpful. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From cournapeau at cslab.kecl.ntt.co.jp Sun May 25 20:38:36 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 26 May 2008 09:38:36 +0900 Subject: [Numpy-discussion] build outside C code with scons within numpy distutils In-Reply-To: References: Message-ID: <1211762316.19514.7.camel@bbc8> On Sun, 2008-05-25 at 11:25 -0400, Zachary Pincus wrote: > Specifically, I have a package that uses numpy and numpy.distutils to > built itself. Unfortunately, there are some pure-C libraries that I > call using ctypes, and as these libraries are are not python > extensions, it is hard to get distutils to build them happily on all > platforms. You can take a look at the examples in numscons sources (sources/tests/ctypesext). In the setup.py, you call config.add_sconscript('SConstruct') And in the scons script, you do: from numscons import GetNumpyEnvironment env = GetNumpyEnvironment(ARGUMENTS) env.NumpyCtypes('foo', source = ['foo.c']) Note that although this has not changed since it existed, I do not guarantee API stability at this point. cheers, David From cournape at gmail.com Sun May 25 21:44:33 2008 From: cournape at gmail.com (David Cournapeau) Date: Mon, 26 May 2008 10:44:33 +0900 Subject: [Numpy-discussion] David, please check. In-Reply-To: References: Message-ID: <5b8d13220805251844t4da31871m961bf98b68d1730d@mail.gmail.com> On Sun, May 25, 2008 at 9:04 AM, Charles R Harris wrote: > > > I've since renamed generate_array_api to generate_numpy_api which led to > some mods in scons_support.py, so you might want to check that also. > Looks OK to me, thanks, David From zachary.pincus at yale.edu Sun May 25 21:58:44 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Sun, 25 May 2008 21:58:44 -0400 Subject: [Numpy-discussion] build outside C code with scons within numpy distutils In-Reply-To: <1211762316.19514.7.camel@bbc8> References: <1211762316.19514.7.camel@bbc8> Message-ID: <200EFE6A-0C53-4DE4-83CE-2B2A11BFFEE5@yale.edu> Thanks for the tips! 
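For completeness, a rough sketch of the setup.py side that the above suggests, with the non-extension C library delegated to the scons script; the package name 'mypkg' is a placeholder, and it assumes numscons is installed and an SConstruct file sits next to setup.py:

    # setup.py (sketch)
    def configuration(parent_package='', top_path=None):
        from numpy.distutils.misc_util import Configuration
        config = Configuration('mypkg', parent_package, top_path)
        # hand the pure-C, ctypes-loaded library over to the scons build
        config.add_sconscript('SConstruct')
        return config

    if __name__ == '__main__':
        from numpy.distutils.core import setup
        setup(configuration=configuration)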
This is very helpful. >> Specifically, I have a package that uses numpy and numpy.distutils to >> built itself. Unfortunately, there are some pure-C libraries that I >> call using ctypes, and as these libraries are are not python >> extensions, it is hard to get distutils to build them happily on all >> platforms. > > You can take a look at the examples in numscons sources > (sources/tests/ctypesext). > > In the setup.py, you call config.add_sconscript('SConstruct') > > And in the scons script, you do: > > from numscons import GetNumpyEnvironment > env = GetNumpyEnvironment(ARGUMENTS) > > env.NumpyCtypes('foo', source = ['foo.c']) > > Note that although this has not changed since it existed, I do not > guarantee API stability at this point. > > cheers, > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sun May 25 21:59:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 25 May 2008 19:59:28 -0600 Subject: [Numpy-discussion] David, please check. In-Reply-To: <5b8d13220805251844t4da31871m961bf98b68d1730d@mail.gmail.com> References: <5b8d13220805251844t4da31871m961bf98b68d1730d@mail.gmail.com> Message-ID: On Sun, May 25, 2008 at 7:44 PM, David Cournapeau wrote: > On Sun, May 25, 2008 at 9:04 AM, Charles R Harris > wrote: > > > > > > I've since renamed generate_array_api to generate_numpy_api which led to > > some mods in scons_support.py, so you might want to check that also. > > > > Looks OK to me, > Good, thanks for checking it out. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 12:29:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 10:29:17 -0600 Subject: [Numpy-discussion] Is this a bug? Message-ID: I vaguely recall this generated an array from all the characters. In [1]: array('123', dtype='c') Out[1]: array('1', dtype='|S1') Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From twaite at berkeley.edu Mon May 26 13:06:33 2008 From: twaite at berkeley.edu (Tom Waite) Date: Mon, 26 May 2008 10:06:33 -0700 Subject: [Numpy-discussion] triangular matrix fill In-Reply-To: <3d375d730805221913g5079424fxc51495ea2168a10b@mail.gmail.com> References: <3d375d730805221913g5079424fxc51495ea2168a10b@mail.gmail.com> Message-ID: Thanks Robert and Chuck. The matrix calculation is a bottleneck computation and that is my interest in only doing the lower triangular part of the outer product. This is part of a nonlinear brainwarp for the NIPY project. For a typical anatomical MRI volume size this is a 768 x 768 matrix that is inside three nested loops that walk the volume. This matrix is actually the curvature matrix computation and is "inspired" by the Marquardt routine (mrqcof) in Numerical Recipes. Having this lower triangular outer product in numpy could be of value for scipy.optimize down the road if Levenberg-Marquardt is added. Is this something I should post to be added to numpy or write my own extension code as this might be too specialized? Tom On Thu, May 22, 2008 at 7:13 PM, Robert Kern wrote: > On Thu, May 22, 2008 at 9:07 PM, Charles R Harris > wrote: > > > > On Thu, May 22, 2008 at 7:19 PM, Tom Waite wrote: > >> > >> I have a question on filling a lower triangular matrix using numpy. 
This > >> is essentially having two loops and the inner loop upper limit is the > >> outer loop current index. In the inner loop I have a vector being > >> multiplied by a constant set in the outer loop. For a matrix N*N in > size, > >> the C the code is: > >> > >> for(i = 0; i < N; ++i){ > >> for(j = 0; j < i; ++j){ > >> Matrix[i*N + j] = V1[i] * V2[j]; > >> } > >> } > >> > > > > You can use numpy.outer(V1,V2) and just ignore everything on and above > the > > diagonal. > > > > In [1]: x = arange(3) > > > > In [2]: y = arange(3,6) > > > > In [3]: outer(x,y) > > Out[3]: > > array([[ 0, 0, 0], > > [ 3, 4, 5], > > [ 6, 8, 10]]) > > > > You can mask the upper part if you want: > > > > In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3)) > > Out[16]: > > array([[0, 0, 0], > > [3, 0, 0], > > [6, 8, 0]]) > > > > Or you could use fromfunction directly. > > Or numpy.tril(). > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Mon May 26 14:35:04 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 26 May 2008 14:35:04 -0400 Subject: [Numpy-discussion] Building universal (fat) OS X binaries with numpy distutils Message-ID: <276AF350-48C5-42F2-844B-C7B0FE36DDB6@yale.edu> Hello all, I'm wondering if anyone could let me know what the current "best practices" are for building universal python extensions on OS X with numpy's distutils and fortran code. Currently, I've been doing what this message suggests: http://mail.python.org/pipermail/pythonmac-sig/2007-June/018986.html That is, get gfortran from http://r.research.att.com/tools/ , copy libgfortran.a to somewhere different like ~/staticlibs/, and then do this to build: export LDFLAGS="-undefined dynamic_lookup -bundle -arch i386 -arch ppc -Wl,-search_paths_first" python setup.py config_fc --fcompiler=gnu95 --arch="-arch i386 -arch ppc" build_ext -L ~/staticlibs/ build Is this still the best bet? Also, how best should one get python itself to compile as universal? (For py2app purposes...) Thanks, Zach From charlesr.harris at gmail.com Mon May 26 14:42:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 12:42:00 -0600 Subject: [Numpy-discussion] triangular matrix fill In-Reply-To: References: <3d375d730805221913g5079424fxc51495ea2168a10b@mail.gmail.com> Message-ID: On Mon, May 26, 2008 at 11:06 AM, Tom Waite wrote: > Thanks Robert and Chuck. > > The matrix calculation is a bottleneck computation and that is my interest > in only doing the lower triangular part of the outer product. This is part > of a nonlinear brainwarp for the NIPY project. For a typical anatomical MRI > volume size this is a 768 x 768 matrix that is inside three nested loops > that walk the volume. This matrix is actually the curvature matrix > computation and is "inspired" by the Marquardt routine (mrqcof) in Numerical > Recipes. Having this lower triangular outer product in numpy could be of > value for scipy.optimize down the road if Levenberg-Marquardt is added. 
Is > this something I should post to be added to numpy or write my own extension > code as this might be too specialized? > > Tom > I think your own extension code is the place to start. If something like this goes into scipy, then it should probably be part of a larger package containing other relevant optimizations. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 15:00:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 13:00:23 -0600 Subject: [Numpy-discussion] Probable errors in type promotions in binary ufuncs. In-Reply-To: References: Message-ID: On Mon, May 26, 2008 at 12:38 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Continuing the ufunc saga, here are some probable error in the type > promotions. > The data is best viewed in a fixed spaced font, so you might want to use an > html aware mail client. > > The output type of the following functions should be symmetric in the input > types, but aren't. > Note the columns/rows for the l,L types. > I note that on my machine (32 bit) i and l are the same length. So the first part might only be an error in the sense that the associated dtype.char should be the same in these two cases. If instead this reflects a difference in the underlying c type, then this needs to be checked on 64 bit machines and with different compilers, so I've attached the script I've been using. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signatures.py Type: text/x-python Size: 3399 bytes Desc: not available URL: From timmichelsen at gmx-topmail.de Mon May 26 15:28:08 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 26 May 2008 21:28:08 +0200 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <48393154.2000209@ar.media.kyoto-u.ac.jp> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> Message-ID: Hello, I posted a similar question to Tutor: loading modules only when needed and PEP 008 - http://article.gmane.org/gmane.comp.python.tutor/46969 I wanted to use this for thridparty libs that I deliver with my application. Would like to see how you gonna solve it. Kind regards, Timmie From robert.kern at gmail.com Mon May 26 15:44:13 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 May 2008 14:44:13 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: Message-ID: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> On Mon, May 26, 2008 at 11:29 AM, Charles R Harris wrote: > I vaguely recall this generated an array from all the characters. > > In [1]: array('123', dtype='c') > Out[1]: > array('1', > dtype='|S1') When was the last time it did otherwise? This behavior is a consequence of treating strings as scalars rather than containers of characters. I believe we settled on this behavior before 1.0. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Mon May 26 15:46:24 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 May 2008 14:46:24 -0500 Subject: [Numpy-discussion] Building universal (fat) OS X binaries with numpy distutils In-Reply-To: <276AF350-48C5-42F2-844B-C7B0FE36DDB6@yale.edu> References: <276AF350-48C5-42F2-844B-C7B0FE36DDB6@yale.edu> Message-ID: <3d375d730805261246x6d89a0e3oe378dba0cbb7cc99@mail.gmail.com> On Mon, May 26, 2008 at 1:35 PM, Zachary Pincus wrote: > Hello all, > > I'm wondering if anyone could let me know what the current "best > practices" are for building universal python extensions on OS X with > numpy's distutils and fortran code. > > Currently, I've been doing what this message suggests: > http://mail.python.org/pipermail/pythonmac-sig/2007-June/018986.html > > That is, get gfortran from http://r.research.att.com/tools/ , copy > libgfortran.a to somewhere different like ~/staticlibs/, and then do > this to build: > > export LDFLAGS="-undefined dynamic_lookup -bundle -arch i386 -arch ppc > -Wl,-search_paths_first" > > python setup.py config_fc --fcompiler=gnu95 --arch="-arch i386 -arch > ppc" > build_ext -L ~/staticlibs/ build > > Is this still the best bet? Yes. Have you had any problems with it? > Also, how best should one get python > itself to compile as universal? (For py2app purposes...) Just us the binary from www.python.org. Is this not possible for your use case? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon May 26 15:52:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 13:52:00 -0600 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> References: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> Message-ID: On Mon, May 26, 2008 at 1:44 PM, Robert Kern wrote: > On Mon, May 26, 2008 at 11:29 AM, Charles R Harris > wrote: > > I vaguely recall this generated an array from all the characters. > > > > In [1]: array('123', dtype='c') > > Out[1]: > > array('1', > > dtype='|S1') > > When was the last time it did otherwise? This behavior is a > consequence of treating strings as scalars rather than containers of > characters. I believe we settled on this behavior before 1.0. > The 'c' type is special, it is a left over compatibility type for numeric. It would, I think, have been several months ago that it behaved differently. Maybe I should check out a version from before Travis's latest fixes for matrix types went in, because there used to be an exception in the code for the 'c' type. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 16:03:27 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 14:03:27 -0600 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> Message-ID: On Mon, May 26, 2008 at 1:52 PM, Charles R Harris wrote: > > > On Mon, May 26, 2008 at 1:44 PM, Robert Kern > wrote: > >> On Mon, May 26, 2008 at 11:29 AM, Charles R Harris >> wrote: >> > I vaguely recall this generated an array from all the characters. >> > >> > In [1]: array('123', dtype='c') >> > Out[1]: >> > array('1', >> > dtype='|S1') >> >> When was the last time it did otherwise? 
This behavior is a >> consequence of treating strings as scalars rather than containers of >> characters. I believe we settled on this behavior before 1.0. >> > > The 'c' type is special, it is a left over compatibility type for numeric. > It would, I think, have been several months ago that it behaved differently. > Maybe I should check out a version from before Travis's latest fixes for > matrix types went in, because there used to be an exception in the code for > the 'c' type. > It works the same in r5101, so it looks like it hasn't changed. What I vaguely remembered was the whole string being treated as a sequence of characters, but evidently that is not the case. Probably I remembered the opposite of the case from looking at the code back when. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon May 26 16:13:15 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 May 2008 15:13:15 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> Message-ID: <3d375d730805261313m609b5283mac64ca59e223867@mail.gmail.com> On Mon, May 26, 2008 at 3:03 PM, Charles R Harris wrote: > > On Mon, May 26, 2008 at 1:52 PM, Charles R Harris > wrote: >> >> >> On Mon, May 26, 2008 at 1:44 PM, Robert Kern >> wrote: >>> >>> On Mon, May 26, 2008 at 11:29 AM, Charles R Harris >>> wrote: >>> > I vaguely recall this generated an array from all the characters. >>> > >>> > In [1]: array('123', dtype='c') >>> > Out[1]: >>> > array('1', >>> > dtype='|S1') >>> >>> When was the last time it did otherwise? This behavior is a >>> consequence of treating strings as scalars rather than containers of >>> characters. I believe we settled on this behavior before 1.0. >> >> The 'c' type is special, it is a left over compatibility type for numeric. >> It would, I think, have been several months ago that it behaved differently. >> Maybe I should check out a version from before Travis's latest fixes for >> matrix types went in, because there used to be an exception in the code for >> the 'c' type. > > It works the same in r5101, so it looks like it hasn't changed. What I > vaguely remembered was the whole string being treated as a sequence of > characters, but evidently that is not the case. Probably I remembered the > opposite of the case from looking at the code back when. numpy 1.0 had the behaviour you describe. >>> import numpy >>> numpy.__version__ '1.0' >>> numpy.array('123', dtype='c') array(['1', '2', '3'], dtype='|S1') -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon May 26 16:15:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 May 2008 15:15:02 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <3d375d730805261313m609b5283mac64ca59e223867@mail.gmail.com> References: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> <3d375d730805261313m609b5283mac64ca59e223867@mail.gmail.com> Message-ID: <3d375d730805261315g26807f07gce3d8cd3aaab04c@mail.gmail.com> On Mon, May 26, 2008 at 3:13 PM, Robert Kern wrote: > numpy 1.0 had the behaviour you describe. 
> > >>>> import numpy >>>> numpy.__version__ > '1.0' >>>> numpy.array('123', dtype='c') > array(['1', '2', '3'], > dtype='|S1') Of course, this has its own inconsistencies: >>> numpy.dtype('c') dtype('|S1') >>> numpy.array('123', dtype='|S1') array('1', dtype='|S1') -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dalcinl at gmail.com Mon May 26 16:21:00 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 26 May 2008 17:21:00 -0300 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <48393154.2000209@ar.media.kyoto-u.ac.jp> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> Message-ID: David, I've implemented something like this for petsc4py. Basically, I want to be able to build extension modules accessing PETSc for different configuration options (for example, optimized versus debug), when using PETSc, it is normal to select the build variant by a $PETSC_ARCH environ var. All this is implemented in pure Python code, and tanking advantage of the built-in 'imp' module. I believe it is robust enough, I had never received any report of this causing trouble from petsc4py users out there. If you think this fits your needs, then I'll help you. Buth then I'll need to know a bit more about the directory tree structure of your package and your extension modules. On 5/25/08, David Cournapeau wrote: > Hi, > > This is not numpy specific, but I need it for numpy/scipy. More > specifically, I would like to be able to have one module interface which > load imp1, imp2, imp3, etc... depending on some options. I see two > obvious solutions: monkey patching, and file configuration, but I try to > avoid the former, and there is no mechanism for the later in scipy. I > read that import hooks could be used for this purpose, but I don't quite > understand the pep: > > http://www.python.org/dev/peps/pep-0302/ > > Has anyone played with it ? > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From charlesr.harris at gmail.com Mon May 26 16:25:13 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 14:25:13 -0600 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <3d375d730805261315g26807f07gce3d8cd3aaab04c@mail.gmail.com> References: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> <3d375d730805261313m609b5283mac64ca59e223867@mail.gmail.com> <3d375d730805261315g26807f07gce3d8cd3aaab04c@mail.gmail.com> Message-ID: On Mon, May 26, 2008 at 2:15 PM, Robert Kern wrote: > On Mon, May 26, 2008 at 3:13 PM, Robert Kern > wrote: > > > numpy 1.0 had the behaviour you describe. 
> > > > > >>>> import numpy > >>>> numpy.__version__ > > '1.0' > >>>> numpy.array('123', dtype='c') > > array(['1', '2', '3'], > > dtype='|S1') > > Of course, this has its own inconsistencies: > > >>> numpy.dtype('c') > dtype('|S1') > >>> numpy.array('123', dtype='|S1') > array('1', > dtype='|S1') > Since it is a compatibility type, we should probably check to be sure what it is supposed to do. I think Travis would be the one to ask. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 17:17:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 15:17:28 -0600 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: <3d375d730805261244w37171205n5a278c8e0be8ea3@mail.gmail.com> <3d375d730805261313m609b5283mac64ca59e223867@mail.gmail.com> <3d375d730805261315g26807f07gce3d8cd3aaab04c@mail.gmail.com> Message-ID: On Mon, May 26, 2008 at 2:25 PM, Charles R Harris wrote: > > > On Mon, May 26, 2008 at 2:15 PM, Robert Kern > wrote: > >> On Mon, May 26, 2008 at 3:13 PM, Robert Kern >> wrote: >> >> > numpy 1.0 had the behaviour you describe. >> > >> > >> >>>> import numpy >> >>>> numpy.__version__ >> > '1.0' >> >>>> numpy.array('123', dtype='c') >> > array(['1', '2', '3'], >> > dtype='|S1') >> >> Of course, this has its own inconsistencies: >> >> >>> numpy.dtype('c') >> dtype('|S1') >> >>> numpy.array('123', dtype='|S1') >> array('1', >> dtype='|S1') >> > > Since it is a compatibility type, we should probably check to be sure what > it is supposed to do. I think Travis would be the one to ask. > It's a bug introduced in r5080 by, ahem, yours truly. And I thought I had it fixed. Off to get it right. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 19:31:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 17:31:17 -0600 Subject: [Numpy-discussion] Possible errors in type promotion in binary ufuncs Message-ID: Attached as a zip file. The mailing list is still stuck in the microzone, 40kb limit. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: binary-types.txt.zip Type: application/zip Size: 1793 bytes Desc: not available URL: From zachary.pincus at yale.edu Mon May 26 19:39:50 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 26 May 2008 19:39:50 -0400 Subject: [Numpy-discussion] Building universal (fat) OS X binaries with numpy distutils In-Reply-To: <3d375d730805261246x6d89a0e3oe378dba0cbb7cc99@mail.gmail.com> References: <276AF350-48C5-42F2-844B-C7B0FE36DDB6@yale.edu> <3d375d730805261246x6d89a0e3oe378dba0cbb7cc99@mail.gmail.com> Message-ID: <2D115A51-6E66-454B-BB63-A0E9F0E4C9B4@yale.edu> > On Mon, May 26, 2008 at 1:35 PM, Zachary Pincus > wrote: >> Hello all, >> >> I'm wondering if anyone could let me know what the current "best >> practices" are for building universal python extensions on OS X with >> numpy's distutils and fortran code. 
>> >> Currently, I've been doing what this message suggests: >> http://mail.python.org/pipermail/pythonmac-sig/2007-June/018986.html >> >> That is, get gfortran from http://r.research.att.com/tools/ , copy >> libgfortran.a to somewhere different like ~/staticlibs/, and then do >> this to build: >> >> export LDFLAGS="-undefined dynamic_lookup -bundle -arch i386 -arch >> ppc >> -Wl,-search_paths_first" >> >> python setup.py config_fc --fcompiler=gnu95 --arch="-arch i386 -arch >> ppc" >> build_ext -L ~/staticlibs/ build >> >> Is this still the best bet? > > Yes. Have you had any problems with it? No problems, but I just wanted to make sure that there wasn't now a simpler way already rolled in to the distutils or something, and to also make sure that there weren't some previously-discovered subtle problems with that method. >> Also, how best should one get python >> itself to compile as universal? (For py2app purposes...) > > Just us the binary from www.python.org. Is this not possible for > your use case? I can, but I often build Python myself and I was just wondering how one does it. I can check elsewhere about this, though -- it's a bit off-topic for this list, I guess. Thanks, Zach From robert.kern at gmail.com Mon May 26 20:09:12 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 May 2008 19:09:12 -0500 Subject: [Numpy-discussion] Building universal (fat) OS X binaries with numpy distutils In-Reply-To: <2D115A51-6E66-454B-BB63-A0E9F0E4C9B4@yale.edu> References: <276AF350-48C5-42F2-844B-C7B0FE36DDB6@yale.edu> <3d375d730805261246x6d89a0e3oe378dba0cbb7cc99@mail.gmail.com> <2D115A51-6E66-454B-BB63-A0E9F0E4C9B4@yale.edu> Message-ID: <3d375d730805261709s567a44aas6b2f9b5dea4b13b4@mail.gmail.com> On Mon, May 26, 2008 at 6:39 PM, Zachary Pincus wrote: > [Robert Kern wrote:] >> On Mon, May 26, 2008 at 1:35 PM, Zachary Pincus >> Also, how best should one get python >>> itself to compile as universal? (For py2app purposes...) >> >> Just us the binary from www.python.org. Is this not possible for >> your use case? > > I can, but I often build Python myself and I was just wondering how > one does it. I can check elsewhere about this, though -- it's a bit > off-topic for this list, I guess. It's easiest to use the script Mac/BuildScript/build-installer.py. I had to make a few modifications to the copy in the Python-2.5.2 tarball to update the URLs of the dependencies and such, but I expect the one from SVN trunk should be up to date. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zachary.pincus at yale.edu Mon May 26 22:01:40 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 26 May 2008 22:01:40 -0400 Subject: [Numpy-discussion] Building universal (fat) OS X binaries with numpy distutils In-Reply-To: <3d375d730805261709s567a44aas6b2f9b5dea4b13b4@mail.gmail.com> References: <276AF350-48C5-42F2-844B-C7B0FE36DDB6@yale.edu> <3d375d730805261246x6d89a0e3oe378dba0cbb7cc99@mail.gmail.com> <2D115A51-6E66-454B-BB63-A0E9F0E4C9B4@yale.edu> <3d375d730805261709s567a44aas6b2f9b5dea4b13b4@mail.gmail.com> Message-ID: > It's easiest to use the script Mac/BuildScript/build-installer.py. I > had to make a few modifications to the copy in the Python-2.5.2 > tarball to update the URLs of the dependencies and such, but I expect > the one from SVN trunk should be up to date. Thanks for the pointer! 
I really appreciate the helpfulness of all of the denizens of this list on issues large and small, and especially Robert. Thanks again. Zach From charlesr.harris at gmail.com Mon May 26 23:03:29 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:03:29 -0600 Subject: [Numpy-discussion] Binary ufuncs: bitwise operators Message-ID: Here are the bitwise operators. Note that combinations of signed and unsigned types lead to promotion to larger types. I don't think that is right, especially as the high bits will be zeroed anyway when the unsigned number is sign extended. Note also the object type used to promote the 'Q' (unsigned long long) type. On my machine i and l are the same precision, but different c-types, so it would also make me feel better if the rows/cols for l,L used the same types, making the functions symmetric in the arguments. bitwise_and not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? , b , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 b | b , b , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 h | h , h , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 i | i , i , i , i , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , err, err, err, err, err, err, in2 l | l , l , l , l , l , l , q , l , q , q , O , err, err, err, err, err, err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , err, err, err, err, err, err, in2 q | q , q , q , q , q , q , q , q , q , q , O , err, err, err, err, err, err, in2 Q | Q , O , Q , O , Q , O , Q , O , Q , O , Q , err, err, err, err, err, err, in2 f | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 d | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 g | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 F | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 D | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, bitwise_or not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? 
, b , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 b | b , b , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 h | h , h , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 i | i , i , i , i , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , err, err, err, err, err, err, in2 l | l , l , l , l , l , l , q , l , q , q , O , err, err, err, err, err, err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , err, err, err, err, err, err, in2 q | q , q , q , q , q , q , q , q , q , q , O , err, err, err, err, err, err, in2 Q | Q , O , Q , O , Q , O , Q , O , Q , O , Q , err, err, err, err, err, err, in2 f | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 d | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 g | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 F | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 D | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, bitwise_xor not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? , b , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 b | b , b , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 h | h , h , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 i | i , i , i , i , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , err, err, err, err, err, err, in2 l | l , l , l , l , l , l , q , l , q , q , O , err, err, err, err, err, err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , err, err, err, err, err, err, in2 q | q , q , q , q , q , q , q , q , q , q , O , err, err, err, err, err, err, in2 Q | Q , O , Q , O , Q , O , Q , O , Q , O , Q , err, err, err, err, err, err, in2 f | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 d | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 g | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 F | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 D | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:07:56 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:07:56 -0600 Subject: [Numpy-discussion] Binary ufuncs: shift operators Message-ID: I think these operators should preserve the type of the first argument, except for booleans, instead of attempting a promotion to a larger type. 
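As a quick, concrete check of one cell (the uint8/uint64 pair below is just an example), the promoted result type is easy to query directly:

import numpy as np

# One cell of the tables below, queried directly.  If the shifts preserved
# the first argument's type this would stay uint8 ('B'); the left_shift
# table below reports uint64 ('Q') for this pair instead.
a = np.ones(1, dtype=np.uint8)
b = np.ones(1, dtype=np.uint64)
print(np.left_shift(a, b).dtype.char)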
Note again the Object output types in the col/row for the Q type. left_shift not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | b , b , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 b | b , b , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 h | h , h , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 i | i , i , i , i , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , err, err, err, err, err, err, in2 l | l , l , l , l , l , l , q , l , q , q , O , err, err, err, err, err, err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , err, err, err, err, err, err, in2 q | q , q , q , q , q , q , q , q , q , q , O , err, err, err, err, err, err, in2 Q | Q , O , Q , O , Q , O , Q , O , Q , O , Q , err, err, err, err, err, err, in2 f | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 d | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 g | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 F | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 D | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, right_shift not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | b , b , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 b | b , b , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 h | h , h , h , h , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , err, err, err, err, err, err, in2 i | i , i , i , i , i , i , q , i , q , q , O , err, err, err, err, err, err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , err, err, err, err, err, err, in2 l | l , l , l , l , l , l , q , l , q , q , O , err, err, err, err, err, err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , err, err, err, err, err, err, in2 q | q , q , q , q , q , q , q , q , q , q , O , err, err, err, err, err, err, in2 Q | Q , O , Q , O , Q , O , Q , O , Q , O , Q , err, err, err, err, err, err, in2 f | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 d | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 g | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 F | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 D | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
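For reference, a minimal sketch of the kind of loop that can produce tables like these (the signatures.py script attached earlier presumably does something similar; the typecode string and the use of 1-element arrays rather than scalars are just illustrative choices):

import numpy as np

# Sketch of a result-type table generator for binary ufuncs.  Typecodes
# follow the ordering used in the tables above; 1-element arrays are used
# instead of scalars, which may matter for the promotion rules.
codes = '?bBhHiIlLqQfdgFDG'

def promotion_table(ufunc):
    lines = ['in1   ' + ' , '.join(codes)]
    for c2 in codes:
        row = []
        for c1 in codes:
            try:
                out = ufunc(np.ones(1, dtype=c1), np.ones(1, dtype=c2))
                row.append(out.dtype.char)
            except Exception:
                row.append('err')
        lines.append('in2 %s | %s' % (c2, ' , '.join(row)))
    return '\n'.join(lines)

print(promotion_table(np.left_shift))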
URL: From charlesr.harris at gmail.com Mon May 26 23:13:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:13:43 -0600 Subject: [Numpy-discussion] Binary ufuncs: true_divide Message-ID: Why does the combination of H and b output a double instead of a float? Should all integer combinations output doubles? true_divide symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 b | f , f , f , f , d , d , d , d , d , d , d , f , d , g , F , D , G , in2 B | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 h | f , f , f , f , d , d , d , d , d , d , d , f , d , g , F , D , G , in2 H | f , d , f , d , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 i | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 I | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 l | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 L | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 q | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 Q | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:15:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:15:48 -0600 Subject: [Numpy-discussion] Binary ufuncs: fmod Message-ID: Not sure if this should be symmetric or not. I think it should have the type of the second argument. fmod not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? 
| b , b , B , h , H , i , I , i , I , q , Q , f , d , g , err, err, err, in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , err, err, err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , err, err, err, in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , err, err, err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , err, err, err, in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , err, err, err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , err, err, err, in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , err, err, err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , err, err, err, in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , err, err, err, in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , err, err, err, in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , err, err, err, in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , err, err, err, in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , err, err, err, in2 F | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 D | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:17:32 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:17:32 -0600 Subject: [Numpy-discussion] Binary ufuncs: power Message-ID: I suspect this should preserve the type of the first argument for positive integer powers. I think this is an excellent function for flame bait and I offer it up as such. power not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | b , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
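To make the flame bait concrete (the dtype pair here is only an example): keeping the first argument's type keeps results compact, but it lets integer powers overflow silently, which is presumably why opinions differ:

import numpy as np

# Per the power table above, uint8 ** uint8 stays uint8 ('B'), so 10**3
# cannot be represented and the result wraps modulo 256 rather than being
# 1000.  Promoting to a wider type avoids that, at the cost of upcasts.
a = np.array([10], dtype=np.uint8)
b = np.array([3], dtype=np.uint8)
print(np.power(a, b).dtype.char)   # 'B' per the table above
print(np.power(a, b))              # wrapped value, not 1000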
URL: From charlesr.harris at gmail.com Mon May 26 23:21:53 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:21:53 -0600 Subject: [Numpy-discussion] Binary ufuncs: remainder Message-ID: Note the two segfaults and the Object output types, they should be cured if we raise errors for all complex types. I suspect this function should preserve the type of the second argument. remainder not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | b , b , B , h , H , i , I , i , I , q , Q , f , d , g , O , O , err, in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , O , O , err, in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , O , O , err, in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , O , O , err, in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , O , O , err, in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , O , O , err, in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , O , O , err, in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , O , O , err, in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , O , O , err, in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , O , O , err, in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , O , O , err, in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , O , O , err, in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , O , O , err, in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , seg, seg, err, in2 F | O , O , O , O , O , O , O , O , O , O , O , O , O , err, O , O , err, in2 D | O , O , O , O , O , O , O , O , O , O , O , O , O , err, O , O , err, in2 G | err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, err, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:31:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:31:28 -0600 Subject: [Numpy-discussion] Binary ufuncs: floor_divide Message-ID: Should b/Q produce a double? The result will fit nicely in b. How about Q/b, would a warning be more appropriate than promotion to double? floor_divide not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? 
| b , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:34:44 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:34:44 -0600 Subject: [Numpy-discussion] Binary ufuncs: divide Message-ID: Again, note that b/Q will fit in b, so we don't need to produce a double here. divide not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | b , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:38:29 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:38:29 -0600 Subject: [Numpy-discussion] Binary ufuncs: maximum Message-ID: Note that max('b','Q') will fit in Q, do we want to produce a double instead? divide not symmetric in1 ? 
, b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | b , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 26 23:41:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:41:16 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum Message-ID: Note that min('b','Q') will fit in b. Do we want to produce a double in this case? minimum not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
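For reference on the signed/unsigned mixing that keeps coming up in these tables: no fixed-width integer dtype covers both the int8 ('b') and uint64 ('Q') ranges, which is presumably why double ('d') appears for the 'b'/'Q' mixes even where, as for minimum, the actual value would fit in the smaller type.

import numpy as np

# int8 can be negative while uint64 exceeds the int64 range, so no numpy
# integer type holds every possible result of mixing them.
print(np.iinfo(np.int8).min)     # -128
print(np.iinfo(np.uint64).max)   # 18446744073709551615
print(np.iinfo(np.int64).max)    # 9223372036854775807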
URL: From charlesr.harris at gmail.com Mon May 26 23:50:31 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 May 2008 21:50:31 -0600 Subject: [Numpy-discussion] Binary ufuncs: add, subtract, multiply Message-ID: Is it appropriate to produce a double for mixed 'Q','b' types? Or should we raise a warning instead. There is a loss of precision going to doubles anyway, and that will also often be the case for the lower precision types when multiplying. I would also feel more comfortable if the output types were symmetric in the underlying c-type, even though l,i are the same precision on my machine. That might make things easier for folks using c-types or trying to embed numpy. add not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , subtract not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? 
, b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , multiply not symmetric in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , g , F , D , G , -------------------------------------------------------------------------------------- in2 ? | ? , b , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 b | b , b , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 B | B , h , B , h , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 h | h , h , h , h , i , i , q , i , q , q , d , f , d , g , F , D , G , in2 H | H , i , H , i , H , i , I , i , I , q , Q , f , d , g , F , D , G , in2 i | i , i , i , i , i , i , q , i , q , q , d , d , d , g , D , D , G , in2 I | I , q , I , q , I , q , I , q , I , q , Q , d , d , g , D , D , G , in2 l | l , l , l , l , l , l , q , l , q , q , d , d , d , g , D , D , G , in2 L | L , q , L , q , L , q , L , q , L , q , Q , d , d , g , D , D , G , in2 q | q , q , q , q , q , q , q , q , q , q , d , d , d , g , D , D , G , in2 Q | Q , d , Q , d , Q , d , Q , d , Q , d , Q , d , d , g , D , D , G , in2 f | f , f , f , f , f , d , d , d , d , d , d , f , d , g , F , D , G , in2 d | d , d , d , d , d , d , d , d , d , d , d , d , d , g , D , D , G , in2 g | g , g , g , g , g , g , g , g , g , g , g , g , g , g , G , G , G , in2 F | F , F , F , F , F , D , D , D , D , D , D , F , D , G , F , D , G , in2 D | D , D , D , D , D , D , D , D , D , D , D , D , D , G , D , D , G , in2 G | G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , G , Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue May 27 00:01:06 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 27 May 2008 13:01:06 +0900 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: References: <48393154.2000209@ar.media.kyoto-u.ac.jp> Message-ID: <483B8782.10303@ar.media.kyoto-u.ac.jp> Lisandro Dalcin wrote: > David, I've implemented something like this for petsc4py. > > Basically, I want to be able to build extension modules accessing > PETSc for different configuration options (for example, optimized > versus debug), when using PETSc, it is normal to select the build > variant by a $PETSC_ARCH environ var. 
> Yes, that's another solution I forgot to mention. > If you think this fits your needs, then I'll help you. Buth then I'll > need to know a bit more about the directory tree structure of your > package and your extension modules. > Thanks, but I already have the code to dynamically load a module from a string. My problem is how to control it, and how we want to do this kind of things. Up to know, scipy have never depended on env variables or configuration files, so I don't want to introduce this kind of things without discussing it first. cheers, David From robert.kern at gmail.com Tue May 27 02:14:13 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 May 2008 01:14:13 -0500 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <483B8782.10303@ar.media.kyoto-u.ac.jp> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> <483B8782.10303@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730805262314u2611a53ek14ac6f04eeb3458c@mail.gmail.com> On Mon, May 26, 2008 at 11:01 PM, David Cournapeau wrote: > Lisandro Dalcin wrote: >> David, I've implemented something like this for petsc4py. >> >> Basically, I want to be able to build extension modules accessing >> PETSc for different configuration options (for example, optimized >> versus debug), when using PETSc, it is normal to select the build >> variant by a $PETSC_ARCH environ var. > > Yes, that's another solution I forgot to mention. > >> If you think this fits your needs, then I'll help you. Buth then I'll >> need to know a bit more about the directory tree structure of your >> package and your extension modules. > > Thanks, but I already have the code to dynamically load a module from a > string. My problem is how to control it, and how we want to do this kind > of things. Up to know, scipy have never depended on env variables or > configuration files, so I don't want to introduce this kind of things > without discussing it first. There are actually a couple of places, like scipy.misc.imshow(). I tend to dislike libraries automatically reading configuration (including environment variables). However, imshow() is not really a library function so much as a function for the interactive interpreter, so I don't mind so much. I would definitely *not* use environment variables as the sole control of FFT backend selection. If I have to restart my IPython session just because I forgot to set the right environment variable, I will be very unhappy. For FFTs, I would probably keep the backend functions in a module-level list. On import, the package would try to import the backends it knows could be bundled in scipy.fftpack and insert them into the list. Probably, these would just be the optimized versions; the default FFTPACK versions would be kept separate. If an import fails, it would be a good idea to format the traceback into a string and store it in an accessible location for debugging unintentional ImportErrors. Each API function (e.g. rfft()) would check its list for an optimized implementation and use it or else fall back to the default. Each implementation module (e.g. scipy.fftpack._fftw3, or whatever you have named them; I haven't reviewed your code, yet) would have an explicit registration function that puts its implementations (all or a subset, e.g. _fftw3.register() or _fftw3.register('fft', 'rfft')) at the top of the lists. One needs to think about how to handle power-of-two only implementations. 
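A minimal sketch of that scheme (the _fftw3 module and the helper names here are purely illustrative, not existing scipy.fftpack API, and np.fft.fft merely stands in for the bundled FFTPACK routine):

import traceback
import numpy as np

_fft_backends = []        # optimized implementations, best one first
_import_errors = {}       # backend name -> formatted traceback, for debugging

def _register_fft(impl):
    _fft_backends.insert(0, impl)

def _default_fft(x):
    return np.fft.fft(x)  # stand-in for the plain FFTPACK version

# On package import, try the optional backends and record why any failed.
try:
    from scipy.fftpack import _fftw3   # hypothetical optimized backend
    _fftw3.register()                  # would call _register_fft(...)
except ImportError:
    _import_errors['_fftw3'] = traceback.format_exc()

def fft(x):
    for impl in _fft_backends:
        # a power-of-two-only backend could inspect len(x) here and decline
        return impl(x)
    return _default_fft(x)

A backend that only handles power-of-two lengths could decline inside that loop and let the next implementation, or the default, take over.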
This is probably the only real special case we have to handle, so we can probably do it explicitly instead of coming up with an overly generic solution. I wouldn't bother trying to persist this information. If anyone wants to persist their preferences, then can write a utility function that does exactly what they need. That's the five-minute overview of what I would try. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wright at esrf.fr Tue May 27 02:31:51 2008 From: wright at esrf.fr (Jonathan Wright) Date: Tue, 27 May 2008 08:31:51 +0200 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <483B8782.10303@ar.media.kyoto-u.ac.jp> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> <483B8782.10303@ar.media.kyoto-u.ac.jp> Message-ID: <483BAAD7.4020807@esrf.fr> David Cournapeau wrote: > Thanks, but I already have the code to dynamically load a module from a > string. My problem is how to control it, and how we want to do this kind > of things. Up to know, scipy have never depended on env variables or > configuration files, so I don't want to introduce this kind of things > without discussing it first. > Both matplotlib and ipython are using rc files already. For your fftpack work, it would be great to be able to compare different backends within the same session, both for benchmarking and accuracy checking. In that case the user would probably want to control what is loaded at runtime. Thanks! Jon From stefan at sun.ac.za Tue May 27 03:51:44 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 27 May 2008 09:51:44 +0200 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: Message-ID: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> Did this change recently? In [33]: np.__version__ Out[33]: '1.1.0.dev5211' In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype Out[34]: dtype('uint64') But yes, that looks like it should return a uint8. Regards St?fan 2008/5/27 Charles R Harris : > Note that min('b','Q') will fit in b. Do we want to produce a double in this > case? > > minimum > not symmetric > in1 ? , b , B , h , H , i , I , l , L , q , Q , f , d , > g , F , D , G , > > -------------------------------------------------------------------------------------- From markbak at gmail.com Tue May 27 03:56:54 2008 From: markbak at gmail.com (mark) Date: Tue, 27 May 2008 00:56:54 -0700 (PDT) Subject: [Numpy-discussion] Why does argwhere return column vector? Message-ID: Hello list. I don't understand why argwhere returns a column vector when I apply it to a row vector: >>> a = arange(5) >>> argwhere(a>1) array([[2], [3], [4]]) That seems odd and inconvenient. Any advantage that I am missing? Mark From stefan at sun.ac.za Tue May 27 04:05:36 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 27 May 2008 10:05:36 +0200 Subject: [Numpy-discussion] Binary ufuncs: add, subtract, multiply In-Reply-To: References: Message-ID: <9457e7c80805270105s87065f3hc235f4562d471f66@mail.gmail.com> 2008/5/27 Charles R Harris : > Is it appropriate to produce a double for mixed 'Q','b' types? Or should we > raise a warning instead. There is a loss of precision going to doubles > anyway, and that will also often be the case for the lower precision types > when multiplying. I don't think we should raise a warning by default. 
If the result is 'Q', then overflow occurs in the minority of cases. > I would also feel more comfortable if the output types > were symmetric in the underlying c-type, even though l,i are the same > precision on my machine. > That might make things easier for folks using > c-types or trying to embed numpy. I agree. Regards St?fan From stefan at sun.ac.za Tue May 27 04:15:10 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 27 May 2008 10:15:10 +0200 Subject: [Numpy-discussion] Why does argwhere return column vector? In-Reply-To: References: Message-ID: <9457e7c80805270115g78f7d783te43e3ff1d4270ed2@mail.gmail.com> Hi Mark 2008/5/27 mark : > I don't understand why argwhere returns a column vector when I apply > it to a row vector: > >>>> a = arange(5) >>>> argwhere(a>1) > array([[2], > [3], > [4]]) Each row of the result is a coordinate into your array. So if you had a = np.arange(12).reshape((4,3)) then you'd get In [54]: np.argwhere(a>5) Out[54]: array([[2, 0], [2, 1], [2, 2], [3, 0], [3, 1], [3, 2]]) If you want to grab the elements larger than 5, just do a[a>5] Regards St?fan From markbak at gmail.com Tue May 27 04:15:23 2008 From: markbak at gmail.com (mark) Date: Tue, 27 May 2008 01:15:23 -0700 (PDT) Subject: [Numpy-discussion] behavior of 'insert' for inserting multiple values Message-ID: <09421727-1a49-46fa-b89c-6fe65e53ab60@t54g2000hsg.googlegroups.com> Hello list. When I try to insert multiple values in one place, I want to do this: >>> a = arange(5.) >>> insert(a,3,[7,7]) array([ 0., 1., 2., 7., 3., 4.]) But insert only inserts one of the 7's, while I want both values to be inserted. Nor does numpy throw a warning (which I think would be appropriate). The way that works correctly is >>> insert(a,[3,3],[7,7]) array([ 0., 1., 2., 7., 7., 3., 4.]) Wouldn't it make sense if the first example works as well? Mark From markbak at gmail.com Tue May 27 04:44:34 2008 From: markbak at gmail.com (mark) Date: Tue, 27 May 2008 01:44:34 -0700 (PDT) Subject: [Numpy-discussion] Why does argwhere return column vector? In-Reply-To: <9457e7c80805270115g78f7d783te43e3ff1d4270ed2@mail.gmail.com> References: <9457e7c80805270115g78f7d783te43e3ff1d4270ed2@mail.gmail.com> Message-ID: OK, I get how it works for 2D arrays. What I want to do is insert a number, say 7, before every value in the array that is larger than, for example, 1. Then I need to first find all the indices of values larger than 1, and then I can do an insert: >>> a = arange(5) >>> i = argwhere( a>1 ) >>> insert(a,i[:,0],7) array([0, 1, 7, 2, 7, 3, 7, 4]) Is there a better way to do this? So for this instance, it is inconvenient that argwhere returns a column vector. But I understand the issue for arrays with higher dimensions. Thanks for the explanation, Mark > Each row of the result is a coordinate into your array. 
So if you had > > a = np.arange(12).reshape((4,3)) > > then you'd get > > In [54]: np.argwhere(a>5) > Out[54]: > array([[2, 0], > [2, 1], > [2, 2], > [3, 0], > [3, 1], > [3, 2]]) > > If you want to grab the elements larger than 5, just do > > a[a>5] From stefan at sun.ac.za Tue May 27 05:26:27 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 27 May 2008 11:26:27 +0200 Subject: [Numpy-discussion] behavior of 'insert' for inserting multiple values In-Reply-To: <09421727-1a49-46fa-b89c-6fe65e53ab60@t54g2000hsg.googlegroups.com> References: <09421727-1a49-46fa-b89c-6fe65e53ab60@t54g2000hsg.googlegroups.com> Message-ID: <9457e7c80805270226t39bdf764kf980892353cee302@mail.gmail.com> Hi Mark 2008/5/27 mark : >>>> a = arange(5.) >>>> insert(a,3,[7,7]) > array([ 0., 1., 2., 7., 3., 4.]) > > But insert only inserts one of the 7's, while I want both values to be > inserted. Nor does numpy throw a warning (which I think would be > appropriate). The way that works correctly is > >>>> insert(a,[3,3],[7,7]) > array([ 0., 1., 2., 7., 7., 3., 4.]) You need to specify two insertion positions, i.e. np.insert(a, [3, 3], [7, 7]) I think we should consider a special case for your example, though. Regards St?fan From stefan at sun.ac.za Tue May 27 05:43:47 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 27 May 2008 11:43:47 +0200 Subject: [Numpy-discussion] Why does argwhere return column vector? In-Reply-To: References: <9457e7c80805270115g78f7d783te43e3ff1d4270ed2@mail.gmail.com> Message-ID: <9457e7c80805270243k7d1c0e36u27926d701a9a45bc@mail.gmail.com> Hi Mark 2008/5/27 mark : > OK, I get how it works for 2D arrays. > > What I want to do is insert a number, say 7, before every value in the > array that is larger than, for example, 1. > > Then I need to first find all the indices of values larger than 1, and > then I can do an insert: > >>>> a = arange(5) >>>> i = argwhere( a>1 ) >>>> insert(a,i[:,0],7) > array([0, 1, 7, 2, 7, 3, 7, 4]) > > Is there a better way to do this? Inserting is slow, since a new array is allocated each time. In the case where you insert at multiple indexes, it may be handled better, but I haven't timed it. I would do it the following way: import numpy as np a = np.array([2,1,3,-1,2]) mask = a > 1 out = np.zeros(len(a) + sum(mask), dtype=int) out.fill(7) out[np.arange(len(a)) + mask.cumsum()] = a Regards St?fan From lbolla at gmail.com Tue May 27 08:36:20 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Tue, 27 May 2008 14:36:20 +0200 Subject: [Numpy-discussion] problem re-installing my own package Message-ID: <80c99e790805270536g358c7b20rdaa78b25b351be53@mail.gmail.com> I'm having problems re-installing a package of mine, that uses numpy and scipy. can it be due to some recent changes in numpy (scons?) 
here is the error message I have: ======================= $ python setup.py install running install running build running scons Traceback (most recent call last): File "setup.py", line 71, in setup_package() File "setup.py", line 64, in setup_package configuration = configuration) File "c:\Python25\Lib\site-packages\numpy\distutils\core.py", line 184, in setup return old_setup(**new_attr) File "c:\Python25\lib\distutils\core.py", line 151, in setup dist.run_commands() File "c:\Python25\lib\distutils\dist.py", line 974, in run_commands self.run_command(cmd) File "c:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "c:\Python25\Lib\site-packages\numpy\distutils\command\install.py", line 16, in run r = old_install.run(self) File "c:\Python25\lib\distutils\command\install.py", line 506, in run self.run_command('build') File "c:\Python25\lib\distutils\cmd.py", line 333, in run_command self.distribution.run_command(command) File "c:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "c:\Python25\Lib\site-packages\numpy\distutils\command\build.py", line 38, in run self.run_command('scons') File "c:\Python25\lib\distutils\cmd.py", line 333, in run_command self.distribution.run_command(command) File "c:\Python25\lib\distutils\dist.py", line 993, in run_command cmd_obj.ensure_finalized() File "c:\Python25\lib\distutils\cmd.py", line 117, in ensure_finalized self.finalize_options() File "c:\Python25\Lib\site-packages\numpy\distutils\command\scons.py", line 234, in finalize_options force=self.force) File "c:\Python25\Lib\site-packages\numpy\distutils\ccompiler.py", line 366, in new_compiler compiler = klass(None, dry_run, force) File "c:\Python25\Lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 46, in __init__ verbose,dry_run, force) File "c:\Python25\lib\distutils\cygwinccompiler.py", line 84, in __init__ get_versions() File "c:\Python25\lib\distutils\cygwinccompiler.py", line 424, in get_versions ld_version = StrictVersion(result.group(1)) File "c:\Python25\lib\distutils\version.py", line 40, in __init__ self.parse(vstring) File "c:\Python25\lib\distutils\version.py", line 107, in parse raise ValueError, "invalid version number '%s'" % vstring ValueError: invalid version number '2.18.50.20080523' ======================= thank you in advance. L. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue May 27 09:15:44 2008 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 May 2008 22:15:44 +0900 Subject: [Numpy-discussion] Controlling the way a module is imported In-Reply-To: <3d375d730805262314u2611a53ek14ac6f04eeb3458c@mail.gmail.com> References: <48393154.2000209@ar.media.kyoto-u.ac.jp> <483B8782.10303@ar.media.kyoto-u.ac.jp> <3d375d730805262314u2611a53ek14ac6f04eeb3458c@mail.gmail.com> Message-ID: <5b8d13220805270615j18357dcn555c43405bcc2b44@mail.gmail.com> On Tue, May 27, 2008 at 3:14 PM, Robert Kern wrote: > > I would definitely *not* use environment variables as the sole control > of FFT backend selection. If I have to restart my IPython session just > because I forgot to set the right environment variable, I will be very > unhappy. Oh yes, I certainly agree it should not be the sole way to control it. It could, maybe, used to force it, and even then, I am not sure it is that useful. > > For FFTs, I would probably keep the backend functions in a > module-level list. 
On import, the package would try to import the > backends it knows could be bundled in scipy.fftpack and insert them > into the list. Probably, these would just be the optimized versions; > the default FFTPACK versions would be kept separate. If an import > fails, it would be a good idea to format the traceback into a string > and store it in an accessible location for debugging unintentional > ImportErrors. Each API function (e.g. rfft()) would check its list for > an optimized implementation and use it or else fall back to the > default. Each implementation module (e.g. scipy.fftpack._fftw3, or > whatever you have named them; I haven't reviewed your code, yet) would > have an explicit registration function that puts its implementations > (all or a subset, e.g. _fftw3.register() or _fftw3.register('fft', > 'rfft')) at the top of the lists Registration aside, that's more or less what I have done so far. > handle power-of-two only implementations. This is probably the only > real special case we have to handle, so we can probably do it > explicitly instead of coming up with an overly generic solution. I thought about changing the C Api to return an error code for invalid input, and writing a python wrapper which calls the default backend when an error is returned. This way, I can treat any kind of error without too much burden. cheers, David From oliphant at enthought.com Tue May 27 10:06:52 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 09:06:52 -0500 Subject: [Numpy-discussion] Dot in C extension In-Reply-To: References: <50952.71.102.223.154.1211693350.squirrel@mail.math.ucsb.edu> Message-ID: <483C157C.2090602@enthought.com> Charles R Harris wrote: > > > On Sun, May 25, 2008 at 12:12 AM, Charles R Harris > > wrote: > > > > On Sat, May 24, 2008 at 11:29 PM, > wrote: > > Hi all, > > I'm trying to write a Gauss-Seidel function in C++. The > function works > however it is too slow because I'm not using any acceleration > for the > vector multiplication. I'm not really sure how to access the > dot function > in my extension, nor what all the arguments are for. > > Is this the right function to use (found in ndarrayobject.h): > > typedef void (PyArray_DotFunc)(void *, npy_intp, void *, > npy_intp, void *, > npy_intp, void *); > > I guess the voids are array objects, the two to be dotted and > the output. > What's the fourth? > > > It's ignored, so 0 (C++) should do. > > static void > @name at _dot(char *ip1, intp is1, char *ip2, intp is2, char *op, intp n, > void *ignore) > { > register @out@ tmp=(@out@)0; > register intp i; > for(i=0;i tmp += (@out@)(*((@type@ *)ip1)) * \ > (@out@)(*((@type@ *)ip2)); > } > *((@type@ *)op) = (@type@) tmp; > } > > Note that the function may call BLAS in practice, but you can > figure the use of the arguments from the above. Ignore the @type@ > sort of stuff, it's replaced by real types by the code generator. > > > I'm not sure how you get to these functions, which are type specific, > someone else will have to supply that answer. They are attached to the .f member of the data-type object. So, you get the data-type object, get the .f member and then the dotfunc. Something like PyArray_DataType(a)->f->dotfunc (but I would have to look up the name to be sure). 
-Travis From robert.kern at gmail.com Tue May 27 11:09:41 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 May 2008 10:09:41 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> Message-ID: <3d375d730805270809r4e33af76hb71a76be995ca162@mail.gmail.com> On Tue, May 27, 2008 at 2:51 AM, St?fan van der Walt wrote: > Did this change recently? > > In [33]: np.__version__ > Out[33]: '1.1.0.dev5211' > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > Out[34]: dtype('uint64') > > But yes, that looks like it should return a uint8. While it is possible for the result to fit into uint8, that would break the generic ufunc casting rules. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue May 27 12:06:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 10:06:09 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <3d375d730805270809r4e33af76hb71a76be995ca162@mail.gmail.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <3d375d730805270809r4e33af76hb71a76be995ca162@mail.gmail.com> Message-ID: On Tue, May 27, 2008 at 9:09 AM, Robert Kern wrote: > On Tue, May 27, 2008 at 2:51 AM, St?fan van der Walt > wrote: > > Did this change recently? > > > > In [33]: np.__version__ > > Out[33]: '1.1.0.dev5211' > > > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > > Out[34]: dtype('uint64') > > > > But yes, that looks like it should return a uint8. > > While it is possible for the result to fit into uint8, that would > break the generic ufunc casting rules. > What generic rules? If you look, you will already find exceptions. And should such rules apply to the bitwise operators? The shift operators? What are the rules for comparing strings with numbers? I put posted these results for comment and review because they will soon be made permanent. I also don't think ufuncs should return object arrays in any circumstance that doesn't have an object array as part of the input. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From haase at msg.ucsf.edu Tue May 27 12:14:23 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 27 May 2008 18:14:23 +0200 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <3d375d730805270809r4e33af76hb71a76be995ca162@mail.gmail.com> Message-ID: On Tue, May 27, 2008 at 6:06 PM, Charles R Harris wrote: > > > On Tue, May 27, 2008 at 9:09 AM, Robert Kern wrote: >> >> On Tue, May 27, 2008 at 2:51 AM, St?fan van der Walt >> wrote: >> > Did this change recently? >> > >> > In [33]: np.__version__ >> > Out[33]: '1.1.0.dev5211' >> > >> > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype >> > Out[34]: dtype('uint64') >> > >> > But yes, that looks like it should return a uint8. >> >> While it is possible for the result to fit into uint8, that would >> break the generic ufunc casting rules. > > What generic rules? If you look, you will already find exceptions. And > should such rules apply to the bitwise operators? The shift operators? What > are the rules for comparing strings with numbers? 
> I put posted these results for comment and review because they will soon be > made permanent. I also don't think ufuncs should return object arrays in any > circumstance that doesn't have an object array as part of the input. > > Chuck > What do you expect from: >>> np.minimum(np.uint8(164), np.uint64(160)).dtype ? uint64 I guess, right !? -Sebastian Haase From charlesr.harris at gmail.com Tue May 27 12:30:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 10:30:42 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <3d375d730805270809r4e33af76hb71a76be995ca162@mail.gmail.com> Message-ID: On Tue, May 27, 2008 at 10:14 AM, Sebastian Haase wrote: > On Tue, May 27, 2008 at 6:06 PM, Charles R Harris > wrote: > > > > > > On Tue, May 27, 2008 at 9:09 AM, Robert Kern > wrote: > >> > >> On Tue, May 27, 2008 at 2:51 AM, St?fan van der Walt > >> wrote: > >> > Did this change recently? > >> > > >> > In [33]: np.__version__ > >> > Out[33]: '1.1.0.dev5211' > >> > > >> > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > >> > Out[34]: dtype('uint64') > >> > > >> > But yes, that looks like it should return a uint8. > >> > >> While it is possible for the result to fit into uint8, that would > >> break the generic ufunc casting rules. > > > > What generic rules? If you look, you will already find exceptions. And > > should such rules apply to the bitwise operators? The shift operators? > What > > are the rules for comparing strings with numbers? > > I put posted these results for comment and review because they will soon > be > > made permanent. I also don't think ufuncs should return object arrays in > any > > circumstance that doesn't have an object array as part of the input. > > > > Chuck > > > > What do you expect from: > >>> np.minimum(np.uint8(164), np.uint64(160)).dtype > ? > uint64 I guess, right !? > Yep, that's what happens -- ('Q','B'). I would argue that ('Q','b') should return q instead of double, following the 'rule' of preserving the size of the largest type when possible. It is the mixture of signed with unsigned types that causes type promotion to occur, the reasoning being that the unsigned type can't hold negative numbers. There is a problem with changing this with the current code as the existing ufuncs don't accept mixed types as input, although it looks to me like the stored signatures allow this. But returning a double (53 bits precision) vs 'Q' (64 bits), doesn't solve the precision problem and may misleadingly imply that it does. And in this case there is a solution. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue May 27 15:31:32 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 14:31:32 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: Message-ID: <483C6194.3060809@enthought.com> Charles R Harris wrote: > I vaguely recall this generated an array from all the characters. > > In [1]: array('123', dtype='c') > Out[1]: > array('1', > dtype='|S1') This may be a bug. >>> import Numeric >>> Numeric.array('123','c') array([1, 2, 3],'c') My memory of the point of 'c' was to mimic Numeric's behavior for character arrays. -Travis From oliphant at enthought.com Tue May 27 15:53:59 2008 From: oliphant at enthought.com (Travis E. 
Oliphant) Date: Tue, 27 May 2008 14:53:59 -0500 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: References: Message-ID: <483C66D7.6010401@enthought.com> Charles R Harris wrote: > Hi All, > > Here is the current behavior of the ufuncs and some comments. They > don't yet cover mixed types for binary functions, > but when they do we will see things like: > > In [7]: power(True,10) > Out[7]: > array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, > 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) > > Which looks suspect ;) I don't understand this. Like Robert, I don't get this output, and I'm not sure what the point being made is. > > 1) Help strings on ufuncs don't work. This seems to be a problem with > the help function, as > printing the relevant __doc__ works fine. The docstrings are > currently defined in > code_generators/generate_umath.py and add_newdoc doesn't seem to > work for them. This has been known for a long time. It is the reason that I wrote numpy.info. I should push for the Python help to change, but I'm not sure what problems that might create. > > 2) Complex divmod(), // and % are deprecated, should we make them > raise errors? Sometimes you have float data that is complex because of an intermediate calculation. I don't think we should cause these operations not to work on Numeric data just because Python deprecated them. I'm actually not sure why Python deprecated these functions. > 3) The current behavior of remainder for complex is bizarre. Nor does > it raise a deprecation warning. Please show what you mean: >>> x = array([5.0, 3.0],'D') >>> x array([ 5.+0.j, 3.+0.j]) >>> x % 3 __main__:1: DeprecationWarning: complex divmod(), // and % are deprecated array([(2+0j), 0j], dtype=object) I don't get why it should be deprecated. > 4) IMHO, absolute('?') should return 'b' Reasons? > > 5) Negative applied to '?' is equivalent to not. This gives me mixed > feelings; the same functionality > is covered by invert and logical_not. Yes, it is true. Do you have another suggestion as to what negative should do? > > 6) The fmod ufunc applied to complex returns AttributeError. Shouldn't > it be a TypeError? Maybe, but the error comes from complex-> promoted to object -> search for fmod method on Python object of complex type -> raise Attribute Error. Some special-case error re-mapping would have to be done to change it. > > 7) Should degrees and radians work on complex? Hey, they work on > booleans and it's just scaling. Sure -- for the same reason that floor_divide (//) and remainder (%) should work on complex (I realize that right now, the default object implementation is called for such cases). I didn't see anything of alarm in the list of signatures that you provided. If you have something of concern, please pick it out. Thanks for the close-up examination of the behavior. -Travis > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From oliphant at enthought.com Tue May 27 16:02:17 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 15:02:17 -0500 Subject: [Numpy-discussion] Possible errors in type promotion in binary ufuncs In-Reply-To: References: Message-ID: <483C68C9.7030602@enthought.com> Charles R Harris wrote: > Attached as a zip file. 
The mailing list is still stuck in the > microzone, 40kb limit. The type promotion rules are the same for all the ufuncs. I'm not sure what the value is in going over each binary ufunc one-by-one. The behavior is based on how ufuncs are found and the casting rules. In other words, these are not independently coded, so why are you going over each one separately. Your symmetry concern is minor in my mind. You are worried about the difference between 'i' and 'l' for the output types. These are the same type on 32-bit platforms and which one is selected is a function of how the ufuncs are found. It is true that we have over-registered ufuncs on some platforms (i.e. we don't need an 'i' and an 'l' ufunc inner loop when they are the same type). -Travis From oliphant at enthought.com Tue May 27 16:10:37 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 15:10:37 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> Message-ID: <483C6ABD.3010102@enthought.com> St?fan van der Walt wrote: > Did this change recently? > > In [33]: np.__version__ > Out[33]: '1.1.0.dev5211' > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > Out[34]: dtype('uint64') > > But yes, that looks like it should return a uint8. > This discussion is really moot unless a proposal for how to handle different casting rules for different ufuncs is proposed. Right now, the type-promotion rules are generic and do not depend on the ufunc only on coercion rules for the mixed types. One problem with a casting-rules-per-ufunc approach is that it makes it harder to add new types and have them fit in to the casting structure (which is currently possible now). Some mechanism for allowing the types to plug-in to the per-ufunc rules would be needed. These are not impossible things, just a bit of work and not on my personal priority list. -Travis From charlesr.harris at gmail.com Tue May 27 16:12:03 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 14:12:03 -0600 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <483C6194.3060809@enthought.com> References: <483C6194.3060809@enthought.com> Message-ID: On Tue, May 27, 2008 at 1:31 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > I vaguely recall this generated an array from all the characters. > > > > In [1]: array('123', dtype='c') > > Out[1]: > > array('1', > > dtype='|S1') > This may be a bug. > > >>> import Numeric > >>> Numeric.array('123','c') > array([1, 2, 3],'c') > > My memory of the point of 'c' was to mimic Numeric's behavior for > character arrays. > Current behavior after fix is In [1]: array('123','c') Out[1]: array(['1', '2', '3'], dtype='|S1') Is that correct, then? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 16:14:59 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 14:14:59 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C6ABD.3010102@enthought.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> Message-ID: On Tue, May 27, 2008 at 2:10 PM, Travis E. Oliphant wrote: > St?fan van der Walt wrote: > > Did this change recently? 
> > > > In [33]: np.__version__ > > Out[33]: '1.1.0.dev5211' > > > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > > Out[34]: dtype('uint64') > > > > But yes, that looks like it should return a uint8. > > > This discussion is really moot unless a proposal for how to handle > different casting rules for different ufuncs is proposed. Right now, > the type-promotion rules are generic and do not depend on the ufunc only > on coercion rules for the mixed types. > > One problem with a casting-rules-per-ufunc approach is that it makes it > harder to add new types and have them fit in to the casting structure > (which is currently possible now). Some mechanism for allowing the > types to plug-in to the per-ufunc rules would be needed. > > These are not impossible things, just a bit of work and not on my > personal priority list. > Not everything follows those rules, however. So I have put these things up for review so that we can agree on what the are. Of particular concern in my mind are the bitwise and shift operators which currently work in a counter intuitive manner. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue May 27 16:15:06 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 15:15:06 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: <483C6194.3060809@enthought.com> Message-ID: <483C6BCA.80705@enthought.com> Charles R Harris wrote: > > > On Tue, May 27, 2008 at 1:31 PM, Travis E. Oliphant > > wrote: > > Charles R Harris wrote: > > I vaguely recall this generated an array from all the characters. > > > > In [1]: array('123', dtype='c') > > Out[1]: > > array('1', > > dtype='|S1') > This may be a bug. > > >>> import Numeric > >>> Numeric.array('123','c') > array([1, 2, 3],'c') > > My memory of the point of 'c' was to mimic Numeric's behavior for > character arrays. > > > Current behavior after fix is > > In [1]: array('123','c') > Out[1]: > array(['1', '2', '3'], > dtype='|S1') > > Is that correct, then? Yes. -Travis From charlesr.harris at gmail.com Tue May 27 16:26:53 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 14:26:53 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: <483C66D7.6010401@enthought.com> References: <483C66D7.6010401@enthought.com> Message-ID: On Tue, May 27, 2008 at 1:53 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > Hi All, > > > > Here is the current behavior of the ufuncs and some comments. They > > don't yet cover mixed types for binary functions, > > but when they do we will see things like: > > > > In [7]: power(True,10) > > Out[7]: > > array([ 0.5822807 , 0.66568381, 0.11748811, 0.97047323, 0.60095205, > > 0.81218886, 0.0167618 , 0.80544138, 0.59540082, 0.82414302]) > > > > Which looks suspect ;) > > I don't understand this. Like Robert, I don't get this output, and I'm > not sure what the point being made is. It came from MPL, which used to have its own versions of some things. > > > > > 1) Help strings on ufuncs don't work. This seems to be a problem with > > the help function, as > > printing the relevant __doc__ works fine. The docstrings are > > currently defined in > > code_generators/generate_umath.py and add_newdoc doesn't seem to > > work for them. > This has been known for a long time. It is the reason that I wrote > numpy.info. I should push for the Python help to change, but I'm not > sure what problems that might create. 
I'm thinking that maybe we could make the numpy module overload the builtin help function when it is imported. > > > > > 2) Complex divmod(), // and % are deprecated, should we make them > > raise errors? > Sometimes you have float data that is complex because of an intermediate > calculation. I don't think we should cause these operations not to > work on Numeric data just because Python deprecated them. I'm actually > not sure why Python deprecated these functions. > > > > 3) The current behavior of remainder for complex is bizarre. Nor does > > it raise a deprecation warning. > Please show what you mean: > > >>> x = array([5.0, 3.0],'D') > >>> x > array([ 5.+0.j, 3.+0.j]) > >>> x % 3 > __main__:1: DeprecationWarning: complex divmod(), // and % are deprecated > array([(2+0j), 0j], dtype=object) > > I don't get why it should be deprecated. Hey, I didn't put in the warning. Although it currently segfaults for some values and I'm not sure what the definition should be for the complex case. > > > 4) IMHO, absolute('?') should return 'b' > > Reasons? It's an arithmetic operator, not boolean. > > > > > 5) Negative applied to '?' is equivalent to not. This gives me mixed > > feelings; the same functionality > > is covered by invert and logical_not. > Yes, it is true. Do you have another suggestion as to what negative > should do? Been thinking about it. > > > > 6) The fmod ufunc applied to complex returns AttributeError. Shouldn't > > it be a TypeError? > Maybe, but the error comes from > > complex-> promoted to object -> search for fmod method on Python object > of complex type -> raise Attribute Error. > > Some special-case error re-mapping would have to be done to change it. > > > > 7) Should degrees and radians work on complex? Hey, they work on > > booleans and it's just scaling. > Sure -- for the same reason that floor_divide (//) and remainder (%) > should work on complex (I realize that right now, the default object > implementation is called for such cases). > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 16:34:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 14:34:23 -0600 Subject: [Numpy-discussion] Possible errors in type promotion in binary ufuncs In-Reply-To: <483C68C9.7030602@enthought.com> References: <483C68C9.7030602@enthought.com> Message-ID: On Tue, May 27, 2008 at 2:02 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > Attached as a zip file. The mailing list is still stuck in the > > microzone, 40kb limit. > > The type promotion rules are the same for all the ufuncs. I'm not sure > what the value is in going over each binary ufunc one-by-one. The > behavior is based on how ufuncs are found and the casting rules. > > In other words, these are not independently coded, so why are you going > over each one separately. > Because they aren't all the same. That's why I picked out samples of different sorts rather than post the whole thing. And yes, it is needed to review everything also because checking what you say the code does against what it actually does is the whole point of writing unit tests. > Your symmetry concern is minor in my mind. You are worried about the > difference between 'i' and 'l' for the output types. These are the > same type on 32-bit platforms and which one is selected is a function of > how the ufuncs are found. > Well, it does indicate an asymmetry in the code. 
Also, I have yet to check these on a 64 bit platform to see what happens there when i is still 32 bits and l is 64 bits. Speaking of which, might it be better to make long the default accumulator type rather than int? That way it would be 64 bits on 64 bit systems. > It is true that we have over-registered ufuncs on some platforms (i.e. > we don't need an 'i' and an 'l' ufunc inner loop when they are the same > type). > That doesn't bother me, it costs little. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue May 27 16:40:15 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 15:40:15 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> Message-ID: <483C71AF.1090009@enthought.com> Charles R Harris wrote: > > > On Tue, May 27, 2008 at 2:10 PM, Travis E. Oliphant > > wrote: > > St?fan van der Walt wrote: > > Did this change recently? > > > > In [33]: np.__version__ > > Out[33]: '1.1.0.dev5211' > > > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > > Out[34]: dtype('uint64') > > > > But yes, that looks like it should return a uint8. > > > This discussion is really moot unless a proposal for how to handle > different casting rules for different ufuncs is proposed. Right now, > the type-promotion rules are generic and do not depend on the > ufunc only > on coercion rules for the mixed types. > > One problem with a casting-rules-per-ufunc approach is that it > makes it > harder to add new types and have them fit in to the casting structure > (which is currently possible now). Some mechanism for allowing the > types to plug-in to the per-ufunc rules would be needed. > > These are not impossible things, just a bit of work and not on my > personal priority list. > > > Not everything follows those rules, however. So I have put these > things up for review so that we can agree on what the are. Of > particular concern in my mind are the bitwise and shift operators > which currently work in a counter intuitive manner. Right now, the only complete description of the rules is the code that implements them. So, from that perspective, yes everything follows those rules ;-) -Travis From robert.kern at gmail.com Tue May 27 16:52:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 May 2008 15:52:04 -0500 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: <483C66D7.6010401@enthought.com> References: <483C66D7.6010401@enthought.com> Message-ID: <3d375d730805271352k6180adcdpc06dabb525901a84@mail.gmail.com> On Tue, May 27, 2008 at 2:53 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: >> 2) Complex divmod(), // and % are deprecated, should we make them >> raise errors? > Sometimes you have float data that is complex because of an intermediate > calculation. I don't think we should cause these operations not to > work on Numeric data just because Python deprecated them. I'm actually > not sure why Python deprecated these functions. floor() isn't particularly well-defined in CC. I guess we could just round each .real and .imag component down to the next lowest integer and define Z1 % Z2 to be the remainder from that. The operations don't really make sense for most elements in CC, but it "smoothly" extrapolates from the RR results. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue May 27 17:07:54 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 May 2008 16:07:54 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <483C6BCA.80705@enthought.com> References: <483C6194.3060809@enthought.com> <483C6BCA.80705@enthought.com> Message-ID: <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> On Tue, May 27, 2008 at 3:15 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: >> >> On Tue, May 27, 2008 at 1:31 PM, Travis E. Oliphant >> > wrote: >> >> Charles R Harris wrote: >> > I vaguely recall this generated an array from all the characters. >> > >> > In [1]: array('123', dtype='c') >> > Out[1]: >> > array('1', >> > dtype='|S1') >> This may be a bug. >> >> >>> import Numeric >> >>> Numeric.array('123','c') >> array([1, 2, 3],'c') >> >> My memory of the point of 'c' was to mimic Numeric's behavior for >> character arrays. >> >> >> Current behavior after fix is >> >> In [1]: array('123','c') >> Out[1]: >> array(['1', '2', '3'], >> dtype='|S1') >> >> Is that correct, then? > Yes. Can we make it so that dtype('c') is preserved instead of displaying '|S1'? It does not behave the same as dtype('|S1') although it compares equal to it. In [90]: dtype('c') Out[90]: dtype('|S1') In [91]: array('123', dtype='c') Out[91]: array(['1', '2', '3'], dtype='|S1') In [92]: array('123', dtype=dtype('c')) Out[92]: array(['1', '2', '3'], dtype='|S1') In [93]: array('123', dtype=dtype('|S1')) Out[93]: array('1', dtype='|S1') In [94]: array('456', dtype=array('123', dtype=dtype('c')).dtype) Out[94]: array(['4', '5', '6'], dtype='|S1') In [95]: dtype('c') == dtype('|S1') Out[95]: True -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From aisaac at american.edu Tue May 27 17:11:40 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 27 May 2008 17:11:40 -0400 Subject: [Numpy-discussion] complex arithmetic (floor) In-Reply-To: <3d375d730805271352k6180adcdpc06dabb525901a84@mail.gmail.com> References: <483C66D7.6010401@enthought.com> <3d375d730805271352k6180adcdpc06dabb525901a84@mail.gmail.com> Message-ID: Reason for deprecation might be exposed by this? PEP 3141 discusses this obliquely in the discussion of the Number ABC: Until NumPy can justify its choice with reference to either Python behavior or a C standard, might it be best not to promise anything? (E.g., raise an exception.) Cheers, Alan Isaac From charlesr.harris at gmail.com Tue May 27 17:21:26 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 15:21:26 -0600 Subject: [Numpy-discussion] Current ufunc signatures for review In-Reply-To: <3d375d730805271352k6180adcdpc06dabb525901a84@mail.gmail.com> References: <483C66D7.6010401@enthought.com> <3d375d730805271352k6180adcdpc06dabb525901a84@mail.gmail.com> Message-ID: On Tue, May 27, 2008 at 2:52 PM, Robert Kern wrote: > On Tue, May 27, 2008 at 2:53 PM, Travis E. Oliphant > wrote: > > Charles R Harris wrote: > > >> 2) Complex divmod(), // and % are deprecated, should we make them > >> raise errors? > > Sometimes you have float data that is complex because of an intermediate > > calculation. 
I don't think we should cause these operations not to > > work on Numeric data just because Python deprecated them. I'm actually > > not sure why Python deprecated these functions. > > floor() isn't particularly well-defined in CC. I guess we could just > round each .real and .imag component down to the next lowest integer > and define Z1 % Z2 to be the remainder from that. The operations don't > really make sense for most elements in CC, but it "smoothly" > extrapolates from the RR results. > Yeah, something like that. For mod we have the lattice of Gaussian integers times a complex is still a lattice which could be used to define the equivalence classes, but I'm not sure quite what to do there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 17:28:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 15:28:09 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C71AF.1090009@enthought.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> Message-ID: On Tue, May 27, 2008 at 2:40 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > > > > > On Tue, May 27, 2008 at 2:10 PM, Travis E. Oliphant > > > wrote: > > > > St?fan van der Walt wrote: > > > Did this change recently? > > > > > > In [33]: np.__version__ > > > Out[33]: '1.1.0.dev5211' > > > > > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > > > Out[34]: dtype('uint64') > > > > > > But yes, that looks like it should return a uint8. > > > > > This discussion is really moot unless a proposal for how to handle > > different casting rules for different ufuncs is proposed. Right > now, > > the type-promotion rules are generic and do not depend on the > > ufunc only > > on coercion rules for the mixed types. > > > > One problem with a casting-rules-per-ufunc approach is that it > > makes it > > harder to add new types and have them fit in to the casting structure > > (which is currently possible now). Some mechanism for allowing the > > types to plug-in to the per-ufunc rules would be needed. > > > > These are not impossible things, just a bit of work and not on my > > personal priority list. > > > > > > Not everything follows those rules, however. So I have put these > > things up for review so that we can agree on what the are. Of > > particular concern in my mind are the bitwise and shift operators > > which currently work in a counter intuitive manner. > > Right now, the only complete description of the rules is the code that > implements them. So, from that perspective, yes everything follows > those rules ;-) > So the segfaults are defined behavior? ;) It's like pulling teeth without anesthesia to get these things defined and everyone is going to think I'm an a-hole. It's a dirty job, but someone has got to do it. What about abs(-128) returning a negative number for int8? Might it not be better to return the corresponding unsigned type? Can't check that at the moment, as I'm trying to get 64 bit Ubuntu installed on a software raid1 partition for further checks, and Ubuntu overwrote my MBR without asking. It's a traditional Ubuntu thing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Tue May 27 17:28:41 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 15:28:41 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C71AF.1090009@enthought.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> Message-ID: On Tue, May 27, 2008 at 2:40 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > > > > > On Tue, May 27, 2008 at 2:10 PM, Travis E. Oliphant > > > wrote: > > > > St?fan van der Walt wrote: > > > Did this change recently? > > > > > > In [33]: np.__version__ > > > Out[33]: '1.1.0.dev5211' > > > > > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype > > > Out[34]: dtype('uint64') > > > > > > But yes, that looks like it should return a uint8. > > > > > This discussion is really moot unless a proposal for how to handle > > different casting rules for different ufuncs is proposed. Right > now, > > the type-promotion rules are generic and do not depend on the > > ufunc only > > on coercion rules for the mixed types. > > > > One problem with a casting-rules-per-ufunc approach is that it > > makes it > > harder to add new types and have them fit in to the casting structure > > (which is currently possible now). Some mechanism for allowing the > > types to plug-in to the per-ufunc rules would be needed. > > > > These are not impossible things, just a bit of work and not on my > > personal priority list. > > > > > > Not everything follows those rules, however. So I have put these > > things up for review so that we can agree on what the are. Of > > particular concern in my mind are the bitwise and shift operators > > which currently work in a counter intuitive manner. > > Right now, the only complete description of the rules is the code that > implements them. So, from that perspective, yes everything follows > those rules ;-) > So the segfaults are defined behavior? ;) It's like pulling teeth without anesthesia to get these things defined and everyone is going to think I'm an a-hole. It's a dirty job, but someone has got to do it. What about abs(-128) returning a negative number for int8? Might it not be better to return the corresponding unsigned type? Can't check that at the moment, as I'm trying to get 64 bit Ubuntu installed on a software raid1 partition for further checks, and Ubuntu overwrote my MBR without asking. It's a traditional Ubuntu thing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 17:54:37 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 15:54:37 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> Message-ID: On Tue, May 27, 2008 at 3:28 PM, Charles R Harris wrote: > > > On Tue, May 27, 2008 at 2:40 PM, Travis E. Oliphant < > oliphant at enthought.com> wrote: > >> Charles R Harris wrote: >> > >> > >> > On Tue, May 27, 2008 at 2:10 PM, Travis E. Oliphant >> > > wrote: >> > >> > St?fan van der Walt wrote: >> > > Did this change recently? >> > > >> > > In [33]: np.__version__ >> > > Out[33]: '1.1.0.dev5211' >> > > >> > > In [34]: np.minimum(np.uint8(164), np.uint64(12807)).dtype >> > > Out[34]: dtype('uint64') >> > > >> > > But yes, that looks like it should return a uint8. 
>> > > >> > This discussion is really moot unless a proposal for how to handle >> > different casting rules for different ufuncs is proposed. Right >> now, >> > the type-promotion rules are generic and do not depend on the >> > ufunc only >> > on coercion rules for the mixed types. >> > >> > One problem with a casting-rules-per-ufunc approach is that it >> > makes it >> > harder to add new types and have them fit in to the casting >> structure >> > (which is currently possible now). Some mechanism for allowing the >> > types to plug-in to the per-ufunc rules would be needed. >> > >> > These are not impossible things, just a bit of work and not on my >> > personal priority list. >> > >> > >> > Not everything follows those rules, however. So I have put these >> > things up for review so that we can agree on what the are. Of >> > particular concern in my mind are the bitwise and shift operators >> > which currently work in a counter intuitive manner. >> >> Right now, the only complete description of the rules is the code that >> implements them. So, from that perspective, yes everything follows >> those rules ;-) >> > > So the segfaults are defined behavior? ;) It's like pulling teeth without > anesthesia to get these things defined and everyone is going to think I'm an > a-hole. It's a dirty job, but someone has got to do it. > > What about abs(-128) returning a negative number for int8? Might it not be > better to return the corresponding unsigned type? Can't check that at the > moment, as I'm trying to get 64 bit Ubuntu installed on a software raid1 > partition for further checks, and Ubuntu overwrote my MBR without asking. > It's a traditional Ubuntu thing. > Yep, abs fails: In [1]: abs(array([-128,-128], dtype=int8)) Out[1]: array([-128, -128], dtype=int8) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue May 27 17:57:19 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 27 May 2008 14:57:19 -0700 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> Message-ID: <483C83BF.3020204@noaa.gov> Charles R Harris wrote: > It's like pulling teeth > without anesthesia to get these things defined and everyone is going to > think I'm an a-hole. It's a dirty job, but someone has got to do it. FWIW, I'm glad you're doing it! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant at enthought.com Tue May 27 17:56:27 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 16:56:27 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> References: <483C6194.3060809@enthought.com> <483C6BCA.80705@enthought.com> <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> Message-ID: <483C838B.7000309@enthought.com> Robert Kern wrote: > > > Can we make it so that dtype('c') is preserved instead of displaying > '|S1'? It does not behave the same as dtype('|S1') although it > compares equal to it. > We could with some special-casing in the representation for string data-types. 
Right now, dtype('c') is equivalent to dtype('S1') except the type member of the underlying C-structure (char attribute in Python) is 'c' instead of 'S' -Travis From Chris.Barker at noaa.gov Tue May 27 18:07:28 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 27 May 2008 15:07:28 -0700 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> Message-ID: <483C8620.5080607@noaa.gov> Charles R Harris wrote: > Yep, abs fails: > > In [1]: abs(array([-128,-128], dtype=int8)) > Out[1]: array([-128, -128], dtype=int8) Well, yes, but this is a know vagary of the the hardware implementation for signed integers, as demonstrated by that JAVA Puzzles video that Jon Wright pointed us to a couple days ago. (Which to me could have been titled: "Why I don't want to use JAVA") Sure, it could be fixed in this case by promoting to a larger type, but it's going to fail at the largest integer anyway, and I don't think any expects abs() to return a new type. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From wnbell at gmail.com Tue May 27 18:10:19 2008 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 May 2008 17:10:19 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C8620.5080607@noaa.gov> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C8620.5080607@noaa.gov> Message-ID: On Tue, May 27, 2008 at 5:07 PM, Christopher Barker wrote: > > Sure, it could be fixed in this case by promoting to a larger type, but > it's going to fail at the largest integer anyway, and I don't think any > expects abs() to return a new type. > I think he was advocating using the corresponding unsigned type, not a larger type. e.g. abs(int8) -> uint8 abs(int64) -> uint64 -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From oliphant at enthought.com Tue May 27 18:11:24 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 27 May 2008 17:11:24 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> Message-ID: <483C870C.5010705@enthought.com> > > So the segfaults are defined behavior? ;) It's like pulling teeth > without anesthesia to get these things defined and everyone is going > to think I'm an a-hole. It's a dirty job, but someone has got to do it. I actually appreciate what you are doing. Obviously the segfaults are bugs. It's just that there are constraints we have to work within (unless you are proposing to change the general versus specific coercion behavior). These constraints might change the testing approach that is taken as well as the possible proposed solutions. It's hard to engage the conversation until the questions show an understanding of that. My point is that we should look at the code to determine what is the expected behavior because not only is that what is being done, but that is also the framework we currently have within which to make changes. It is not much different from what Numeric originally provided. 
Unit tests should help us make sure the code is being executed under all possible circumstances. I don't know how to do that without actually looking at the code. The only thing that changes the coercion behavior in a ufunc-specific way is what underlying loops are available and the inputs and outputs that are defined. > > What about abs(-128) returning a negative number for int8? That looks like an issue with the implementation (I didn't realize that -MIN_INT == MIN_INT in C) and that is how the underlying loop is implemented. > Might it not be better to return the corresponding unsigned type? This is not a "coercion" rule in my mind --- it is a output type issues. And yes, we can change this on a per-ufunc basis because the output type can be defined for each input type in the ufuncs. Yes, it does make sense to me for abs to use an unsigned type for integers. > Can't check that at the moment, as I'm trying to get 64 bit Ubuntu > installed on a software raid1 partition for further checks, and Ubuntu > overwrote my MBR without asking. It's a traditional Ubuntu thing. Bummer. I didn't realize Ubuntu was so forceful. Thanks again for finding all these corner cases that are of interest in improving NumPy. My point should be distilled down to separating out what are ufunc inner-loop definition issues (i.e. what are the output and input types for a given ufunc and how are they implemented) and what are real "coercion" rule issues because they are very different pieces of code and we are not very limited in the former case and fairly limited in the latter (without significantly more work). -Travis From charlesr.harris at gmail.com Tue May 27 18:12:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 16:12:23 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C8620.5080607@noaa.gov> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C8620.5080607@noaa.gov> Message-ID: On Tue, May 27, 2008 at 4:07 PM, Christopher Barker wrote: > Charles R Harris wrote: > > > Yep, abs fails: > > > > In [1]: abs(array([-128,-128], dtype=int8)) > > Out[1]: array([-128, -128], dtype=int8) > > Well, yes, but this is a know vagary of the the hardware implementation > for signed integers, as demonstrated by that JAVA Puzzles video that Jon > Wright pointed us to a couple days ago. (Which to me could have been > titled: "Why I don't want to use JAVA") > > Sure, it could be fixed in this case by promoting to a larger type, but > it's going to fail at the largest integer anyway, and I don't think any > expects abs() to return a new type. > No, it could be completely fixed by promoting the output to the corresponding unsigned type. It wouldn't even require much cleverness in the ufunc. In [2]: abs(array([-128,-128], dtype=int8)).view(uint8) Out[2]: array([128, 128], dtype=uint8) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 18:16:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 16:16:07 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C870C.5010705@enthought.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> Message-ID: On Tue, May 27, 2008 at 4:11 PM, Travis E. 
Oliphant wrote: > > > > > So the segfaults are defined behavior? ;) It's like pulling teeth > > without anesthesia to get these things defined and everyone is going > > to think I'm an a-hole. It's a dirty job, but someone has got to do it. > I actually appreciate what you are doing. Obviously the segfaults are > bugs. > > It's just that there are constraints we have to work within (unless you > are proposing to change the general versus specific coercion > behavior). These constraints might change the testing approach that is > taken as well as the possible proposed solutions. It's hard to engage > the conversation until the questions show an understanding of that. > > My point is that we should look at the code to determine what is the > expected behavior because not only is that what is being done, but that > is also the framework we currently have within which to make changes. > It is not much different from what Numeric originally provided. > That's why I wrote the script to determine what the code is currently doing. That way we can all look at the results instead of reading the source. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 18:22:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 16:22:42 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C8620.5080607@noaa.gov> Message-ID: On Tue, May 27, 2008 at 4:12 PM, Charles R Harris wrote: > > > On Tue, May 27, 2008 at 4:07 PM, Christopher Barker > wrote: > >> Charles R Harris wrote: >> >> > Yep, abs fails: >> > >> > In [1]: abs(array([-128,-128], dtype=int8)) >> > Out[1]: array([-128, -128], dtype=int8) >> >> Well, yes, but this is a know vagary of the the hardware implementation >> for signed integers, as demonstrated by that JAVA Puzzles video that Jon >> Wright pointed us to a couple days ago. (Which to me could have been >> titled: "Why I don't want to use JAVA") >> >> Sure, it could be fixed in this case by promoting to a larger type, but >> it's going to fail at the largest integer anyway, and I don't think any >> expects abs() to return a new type. >> > > No, it could be completely fixed by promoting the output to the > corresponding unsigned type. It wouldn't even require much cleverness in the > ufunc. > > In [2]: abs(array([-128,-128], dtype=int8)).view(uint8) > Out[2]: array([128, 128], dtype=uint8) > I think this could be done in code_generators/generate_umath 'absolute' : Ufunc(1, 1, None, 'takes |x| elementwise.', TD(nocmplx), TD(cmplx, out=('f', 'd', 'g')), TD(O, f='PyNumber_Absolute'), ), by defining unsigned output types for the ints. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue May 27 18:23:52 2008 From: oliphant at enthought.com (Travis E. 
Oliphant) Date: Tue, 27 May 2008 17:23:52 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C8620.5080607@noaa.gov> Message-ID: <483C89F8.8070805@enthought.com> Charles R Harris wrote: > > > On Tue, May 27, 2008 at 4:07 PM, Christopher Barker > > wrote: > > Charles R Harris wrote: > > > Yep, abs fails: > > > > In [1]: abs(array([-128,-128], dtype=int8)) > > Out[1]: array([-128, -128], dtype=int8) > > Well, yes, but this is a know vagary of the the hardware > implementation > for signed integers, as demonstrated by that JAVA Puzzles video > that Jon > Wright pointed us to a couple days ago. (Which to me could have been > titled: "Why I don't want to use JAVA") > > Sure, it could be fixed in this case by promoting to a larger > type, but > it's going to fail at the largest integer anyway, and I don't > think any > expects abs() to return a new type. > > > No, it could be completely fixed by promoting the output to the > corresponding unsigned type. It wouldn't even require much cleverness > in the ufunc. > > In [2]: abs(array([-128,-128], dtype=int8)).view(uint8) > Out[2]: array([128, 128], dtype=uint8) Cool! More intersting bit information about 2's complement. -Travis From Chris.Barker at noaa.gov Tue May 27 18:39:00 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 27 May 2008 15:39:00 -0700 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C870C.5010705@enthought.com> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> Message-ID: <483C8D84.4050109@noaa.gov> Travis E. Oliphant wrote: > Yes, it does make sense to me for abs to use an unsigned type for integers. I'm not so sure. I know I wouldn't expect to get a different type back with a call to abs(). Do we really want to change that expectation just for the case of MIN_INT? While everyone is going to want an unsigned value when calling abs(), who knows if they might want to use negative numbers later? Like: x = abs(x) x *= -1 Now what do we get/want? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From wnbell at gmail.com Tue May 27 19:27:39 2008 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 May 2008 18:27:39 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C8D84.4050109@noaa.gov> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: On Tue, May 27, 2008 at 5:39 PM, Christopher Barker wrote: > > I'm not so sure. I know I wouldn't expect to get a different type back > with a call to abs(). Do we really want to change that expectation just > for the case of MIN_INT? > > While everyone is going to want an unsigned value when calling abs(), > who knows if they might want to use negative numbers later? Like: > > x = abs(x) > x *= -1 > > Now what do we get/want? IMO abs() returning non-negative numbers is a more fundamental property. In-place operations on integer arrays are somewhat dangerous, and best left to more sophisticated users anyway. 
Interestingly, MATLAB (v7.5.0) takes a different approach: >> A = int8([ -128, 1]) A = -128 1 >> abs(A) ans = 127 1 >> -A ans = 127 -1 -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From charlesr.harris at gmail.com Tue May 27 19:32:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 17:32:49 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483C8D84.4050109@noaa.gov> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: On Tue, May 27, 2008 at 4:39 PM, Christopher Barker wrote: > Travis E. Oliphant wrote: > > Yes, it does make sense to me for abs to use an unsigned type for > integers. > > I'm not so sure. I know I wouldn't expect to get a different type back > with a call to abs(). Do we really want to change that expectation just > for the case of MIN_INT? > > While everyone is going to want an unsigned value when calling abs(), > who knows if they might want to use negative numbers later? Like: > > x = abs(x) > x *= -1 > > Now what do we get/want? > That one actually works if you view the result as signed ;) I guess it depends on what guarantees we want to make, which is what this is all about. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 27 19:34:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 17:34:46 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: On Tue, May 27, 2008 at 5:27 PM, Nathan Bell wrote: > On Tue, May 27, 2008 at 5:39 PM, Christopher Barker > wrote: > > > > I'm not so sure. I know I wouldn't expect to get a different type back > > with a call to abs(). Do we really want to change that expectation just > > for the case of MIN_INT? > > > > While everyone is going to want an unsigned value when calling abs(), > > who knows if they might want to use negative numbers later? Like: > > > > x = abs(x) > > x *= -1 > > > > Now what do we get/want? > > IMO abs() returning non-negative numbers is a more fundamental > property. In-place operations on integer arrays are somewhat > dangerous, and best left to more sophisticated users anyway. > > > Interestingly, MATLAB (v7.5.0) takes a different approach: > > >> A = int8([ -128, 1]) > A = > -128 1 > >> abs(A) > ans = > 127 1 > >> -A > ans = > 127 -1 > > Oooh, talk about compromises... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue May 27 20:08:22 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 27 May 2008 17:08:22 -0700 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: <483CA276.5050709@noaa.gov> Charles R Harris wrote: > I guess it > depends on what guarantees we want to make, which is what this is all about. Exactly. 
However, while I'd like to guarantee that abs(x) >= 0, the truth is that numpy is "close to the metal" in a lot of ways, and anyone should know that the arithmetic of integers near max and minimum values is fraught with danger. If we don't change the type, then any number other than MIN_INT works correctly. I think more code will break by silently going from a signed to a signed type than that one value being weird. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Tue May 27 20:27:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 18:27:18 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483CA276.5050709@noaa.gov> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> <483CA276.5050709@noaa.gov> Message-ID: On Tue, May 27, 2008 at 6:08 PM, Christopher Barker wrote: > Charles R Harris wrote: > > I guess it > > depends on what guarantees we want to make, which is what this is all > about. > > Exactly. However, while I'd like to guarantee that abs(x) >= 0, the > truth is that numpy is "close to the metal" in a lot of ways, and anyone > should know that the arithmetic of integers near max and minimum values > is fraught with danger. > > If we don't change the type, then any number other than MIN_INT works > correctly. I think more code will break by silently going from a signed > to a signed type than that one value being weird. > I disagree. By definition, abs() >= 0 always. The only time having an unsigned return will cause problems is in the augmented assignments, and they already cause problems by not raising errors on mixed type operations, so let the user beware. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Tue May 27 21:04:48 2008 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 May 2008 20:04:48 -0500 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483CA276.5050709@noaa.gov> References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> <483CA276.5050709@noaa.gov> Message-ID: On Tue, May 27, 2008 at 7:08 PM, Christopher Barker wrote: > > Exactly. However, while I'd like to guarantee that abs(x) >= 0, the > truth is that numpy is "close to the metal" in a lot of ways, and anyone > should know that the arithmetic of integers near max and minimum values > is fraught with danger. > It would be a mistake to assume that many/most NumPy users know the oddities of two's complement signed integer representations. 
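To make the oddity concrete, a minimal sketch (assuming the int8 behaviour shown earlier in the thread; the array reprs in the comments are abbreviated):

import numpy as np

x = np.array([-128, -1, 1], dtype=np.int8)

# The magnitude of -128 is not representable in int8, so abs() wraps
# around and hands back -128 unchanged; the other values behave normally.
abs(x)                  # gives [-128, 1, 1], still int8

# Reinterpreting the same bits as the matching unsigned type recovers the
# mathematically correct magnitudes -- the .view(uint8) workaround shown
# earlier in the thread.
abs(x).view(np.uint8)   # gives [128, 1, 1] as uint8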
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From kwgoodman at gmail.com Tue May 27 22:14:54 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 27 May 2008 19:14:54 -0700 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: On Tue, May 27, 2008 at 4:27 PM, Nathan Bell wrote: > On Tue, May 27, 2008 at 5:39 PM, Christopher Barker > wrote: >> >> I'm not so sure. I know I wouldn't expect to get a different type back >> with a call to abs(). Do we really want to change that expectation just >> for the case of MIN_INT? >> >> While everyone is going to want an unsigned value when calling abs(), >> who knows if they might want to use negative numbers later? Like: >> >> x = abs(x) >> x *= -1 >> >> Now what do we get/want? > > IMO abs() returning non-negative numbers is a more fundamental > property. In-place operations on integer arrays are somewhat > dangerous, and best left to more sophisticated users anyway. > > > Interestingly, MATLAB (v7.5.0) takes a different approach: > >>> A = int8([ -128, 1]) > A = > -128 1 >>> abs(A) > ans = > 127 1 >>> -A > ans = > 127 -1 octave-3.0.1:1> A = int8([-128,1]) A = -128 1 octave-3.0.1:2> abs(A) ans = 128 1 octave-3.0.1:3> -A ans = 127 -1 From charlesr.harris at gmail.com Tue May 27 23:23:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 May 2008 21:23:16 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: On Tue, May 27, 2008 at 8:14 PM, Keith Goodman wrote: > On Tue, May 27, 2008 at 4:27 PM, Nathan Bell wrote: > > On Tue, May 27, 2008 at 5:39 PM, Christopher Barker > > wrote: > >> > >> I'm not so sure. I know I wouldn't expect to get a different type back > >> with a call to abs(). Do we really want to change that expectation just > >> for the case of MIN_INT? > >> > >> While everyone is going to want an unsigned value when calling abs(), > >> who knows if they might want to use negative numbers later? Like: > >> > >> x = abs(x) > >> x *= -1 > >> > >> Now what do we get/want? > > > > IMO abs() returning non-negative numbers is a more fundamental > > property. In-place operations on integer arrays are somewhat > > dangerous, and best left to more sophisticated users anyway. > > > > > > Interestingly, MATLAB (v7.5.0) takes a different approach: > > > >>> A = int8([ -128, 1]) > > A = > > -128 1 > >>> abs(A) > > ans = > > 127 1 > >>> -A > > ans = > > 127 -1 > > octave-3.0.1:1> A = int8([-128,1]) > A = > -128 1 > octave-3.0.1:2> abs(A) > ans = > 128 1 > octave-3.0.1:3> -A > ans = > 127 -1 > ______ > We could simply define the range of int8 as [-127,127], but that is somewhat problematical also. It is probably important to distinguish between the hardware and mathematical correctness, not that we are ever going to represent all the integers in a finite number of bits. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Chris.Barker at noaa.gov Wed May 28 03:08:32 2008 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Wed, 28 May 2008 00:08:32 -0700 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: References: <9457e7c80805270051l62bbd5d6t68fca2a0efa0d603@mail.gmail.com> <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> Message-ID: <483D04F0.80804@noaa.gov> Keith Goodman wrote: >> Interestingly, MATLAB (v7.5.0) takes a different approach: >> ans = >> 127 1 >>>> -A >> ans = >> 127 -1 can anyone explain that? -- just curious. Charles R Harris wrote: > We could simply define the range of int8 as [-127,127], but that is > somewhat problematical also. That would be nice, but probably a pain to do and could be a serious performance hit. And would no longer match hardware/C standards. Charles R Harris wrote: > I disagree. By definition, abs() >= 0 always. and 127 + 1 = 128 This is hardware integer math. It has it's own definition, and apparently abs(-128) = -128. > unsigned return will cause problems is in the augmented assignments, well, could folks end up with inadvertent upcasting in mixed operations, also? a = np.array(..., dtype=np.int8) b = np.array(..., dtype=np.int8) a = abs(a) c = a + b # or b + a what type is c? by the way, are you proposing that abs() always returns a unsigned integer type? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at ar.media.kyoto-u.ac.jp Wed May 28 05:14:35 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 28 May 2008 18:14:35 +0900 Subject: [Numpy-discussion] What does "Ignoring attempt to set 'name' (from ... " mean ? Message-ID: <483D227B.9030902@ar.media.kyoto-u.ac.jp> Hi, I encounter this message when I am building a subtree of scipy (for example scipy/sparsetools). What does it mean exactly ? Is the setup.py doing something wrong ? cheers, David From charlesr.harris at gmail.com Wed May 28 07:37:45 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 05:37:45 -0600 Subject: [Numpy-discussion] Binary ufuncs: minimum In-Reply-To: <483D04F0.80804@noaa.gov> References: <483C6ABD.3010102@enthought.com> <483C71AF.1090009@enthought.com> <483C870C.5010705@enthought.com> <483C8D84.4050109@noaa.gov> <483D04F0.80804@noaa.gov> Message-ID: On Wed, May 28, 2008 at 1:08 AM, Chris.Barker wrote: > Keith Goodman wrote: > >> Interestingly, MATLAB (v7.5.0) takes a different approach: > >> ans = > >> 127 1 > >>>> -A > >> ans = > >> 127 -1 > > can anyone explain that? -- just curious. > > Charles R Harris wrote: > > We could simply define the range of int8 as [-127,127], but that is > > somewhat problematical also. > > That would be nice, but probably a pain to do and could be a serious > performance hit. And would no longer match hardware/C standards. > > Charles R Harris wrote: > > I disagree. By definition, abs() >= 0 always. > > and 127 + 1 = 128 > > This is hardware integer math. It has it's own definition, and > apparently abs(-128) = -128. > > > unsigned return will cause problems is in the augmented assignments, > > well, could folks end up with inadvertent upcasting in mixed operations, > also? > > a = np.array(..., dtype=np.int8) > b = np.array(..., dtype=np.int8) > > a = abs(a) > > c = a + b # or b + a > > what type is c? 
> > by the way, are you proposing that abs() always returns a unsigned > integer type? > Just tossing the idea out there. I am more bothered by this: In [1]: ones(1,dtype=int8) & ones(1,dtype=uint8) Out[1]: array([1], dtype=int16) In [2]: ones(1,dtype=int8) << ones(1,dtype=uint8) Out[2]: array([2], dtype=int16) In [3]: ones(1,dtype=int64) & ones(1,dtype=uint64) Out[3]: array([1], dtype=object) In [4]: ones(1,dtype=int64) << ones(1,dtype=uint64) Out[4]: array([2], dtype=object) Which really does blaspheme the hardware spirits. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdh2358 at gmail.com Wed May 28 10:19:11 2008 From: jdh2358 at gmail.com (John Hunter) Date: Wed, 28 May 2008 09:19:11 -0500 Subject: [Numpy-discussion] logical masking, wrong length mask Message-ID: <88e473830805280719g243fa194sd11d03ef2ece9ce3@mail.gmail.com> I just spent a while tracking down a bug in my code, and found out the problem was numpy was letting me get away with using a logical mask of smaller size than the array it was masking. In [19]: x = np.random.rand(10) In [20]: x Out[20]: array([ 0.72253623, 0.8412243 , 0.12835194, 0.01595052, 0.62208366, 0.57229259, 0.46099861, 0.44114786, 0.23687212, 0.89507604]) In [21]: y = np.random.rand(11) In [22]: mask = x>.5 In [23]: x[mask] Out[23]: array([ 0.72253623, 0.8412243 , 0.62208366, 0.57229259, 0.89507604]) In [24]: y[mask] Out[24]: array([ 0.13440315, 0.83610533, 0.75390136, 0.79046615, 0.34776165]) In [25]: mask Out[25]: array([ True, True, False, False, True, True, False, False, False, True], dtype=bool) I initially thought line 24 below should raise an error, or coerce True to 1 and False to 0 and give me either y[0] or y[1] accordingly, but neither appear to be happening. Instead, I appear to be getting y[:len(mask)][mask] . In [27]: y[:10][mask] Out[27]: array([ 0.13440315, 0.83610533, 0.75390136, 0.79046615, 0.34776165]) In [28]: y[mask] Out[28]: array([ 0.13440315, 0.83610533, 0.75390136, 0.79046615, 0.34776165]) In [29]: len(y) Out[29]: 11 In [30]: len(mask) Out[30]: 10 In [31]: y[:len(mask)][mask] Out[31]: array([ 0.13440315, 0.83610533, 0.75390136, 0.79046615, 0.34776165]) In [32]: np.__version__ Out[32]: '1.2.0.dev5243' Bug or feature? From kwgoodman at gmail.com Wed May 28 10:30:29 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 28 May 2008 07:30:29 -0700 Subject: [Numpy-discussion] segmentation fault Message-ID: Does anyone else get this seg fault? >> def fn(): x = np.random.rand(5,2) x.cumsum(None, out=x) return x ....: >> fn() *** glibc detected *** /usr/bin/python: double free or corruption (out): 0x08212dc8 *** I'm running 1.0.4 from Debian Lenny with python 2.5.2 compiled with gcc 4.2.3-3, if that matters. From stefan at sun.ac.za Wed May 28 10:39:23 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 16:39:23 +0200 Subject: [Numpy-discussion] logical masking, wrong length mask In-Reply-To: <88e473830805280719g243fa194sd11d03ef2ece9ce3@mail.gmail.com> References: <88e473830805280719g243fa194sd11d03ef2ece9ce3@mail.gmail.com> Message-ID: <9457e7c80805280739n3e82d983nc0a3ae7010e1d1e9@mail.gmail.com> 2008/5/28 John Hunter : > I initially thought line 24 below should raise an error, or coerce > True to 1 and False to 0 and give me either y[0] or y[1] accordingly, > but neither appear to be happening. Instead, I appear to be getting > y[:len(mask)][mask] This "feature" looks dangerous enough to warrant raising a warning. 
Is a good reason why it works in the first place? Regards St?fan From Joris.DeRidder at ster.kuleuven.be Wed May 28 10:49:15 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Wed, 28 May 2008 16:49:15 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: Message-ID: <43131B22-4BAC-4E05-BAB8-E5D4E5CD776F@ster.kuleuven.be> On 28 May 2008, at 16:30, Keith Goodman wrote: > Does anyone else get this seg fault? > >>> def fn(): > x = np.random.rand(5,2) > x.cumsum(None, out=x) > return x > ....: >>> fn() > *** glibc detected *** /usr/bin/python: double free or corruption > (out): 0x08212dc8 *** > > I'm running 1.0.4 from Debian Lenny with python 2.5.2 compiled with > gcc 4.2.3-3, if that matters. Nope, In [1]: import numpy as np In [2]: def fn(): ...: x = np.random.rand(5,2) ...: x.cumsum(None,out=x) ...: return x ...: In [3]: fn() Out[3]: array([[ 0.07840917, 0.73252624], [ 0.8109354 , 0.38653933], [ 1.62187081, 0.1785713 ], [ 2.00841014, 0.76834603], [ 3.63028095, 0.64377051]]) Max OSX, Python 2.5, numpy 1.0.4. Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From alan.mcintyre at gmail.com Wed May 28 10:51:20 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 10:51:20 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: Message-ID: <1d36917a0805280751u47962f8cy15ac55f81b445b20@mail.gmail.com> On Wed, May 28, 2008 at 10:30 AM, Keith Goodman wrote: > Does anyone else get this seg fault? > >>> def fn(): > x = np.random.rand(5,2) > x.cumsum(None, out=x) > return x > ....: >>> fn() > *** glibc detected *** /usr/bin/python: double free or corruption > (out): 0x08212dc8 *** > > I'm running 1.0.4 from Debian Lenny with python 2.5.2 compiled with > gcc 4.2.3-3, if that matters. Yep I get one here too, using numpy and Python 2.5 from svn trunk. It doesn't always happen on the first invocation of the function. When the segfault happens, it seems to be in arrayobject.c, line 2093, which is "PyDimMem_FREE(self->dimensions)" and self (in the case where I checked) is some bogus value like 0x1d. At the moment I don't have time to dig into it; I might look later if nobody else has time. From stefan at sun.ac.za Wed May 28 10:53:22 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 16:53:22 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: Message-ID: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> 2008/5/28 Keith Goodman : > Does anyone else get this seg fault? > >>> def fn(): > x = np.random.rand(5,2) > x.cumsum(None, out=x) > return x > ....: >>> fn() > *** glibc detected *** /usr/bin/python: double free or corruption > (out): 0x08212dc8 *** > > I'm running 1.0.4 from Debian Lenny with python 2.5.2 compiled with > gcc 4.2.3-3, if that matters. Yeap, big bada-boom. Well caught! 
==28547== Source and destination overlap in memcpy(0x5B83198, 0x5B83198, 8) ==28547== at 0x4C2508B: memcpy (mc_replace_strmem.c:402) ==28547== by 0x60C2574: PyUFunc_GenericReduction (ufuncobject.c:2758) ==28547== by 0x417E72: PyObject_Call (abstract.c:1861) ==28547== by 0x5E5AD22: PyArray_GenericAccumulateFunction (arrayobject.c:3531) ==28547== by 0x5E76031: PyArray_CumSum (multiarraymodule.c:991) ==28547== by 0x5E7611C: array_cumsum (arraymethods.c:1533) ==28547== by 0x417E72: PyObject_Call (abstract.c:1861) ==28547== by 0x4860E9: PyEval_EvalFrameEx (ceval.c:3784) ==28547== by 0x488B76: PyEval_EvalFrameEx (ceval.c:3659) ==28547== by 0x48A375: PyEval_EvalCodeEx (ceval.c:2836) ==28547== Invalid write of size 8 ==28547== at 0x60B6362: DOUBLE_add (umathmodule.c.src:1019) ==28547== by 0x60C25A3: PyUFunc_GenericReduction (ufuncobject.c:2762) ==28547== by 0x417E72: PyObject_Call (abstract.c:1861) ==28547== by 0x5E5AD22: PyArray_GenericAccumulateFunction (arrayobject.c:3531) ==28547== by 0x5E76031: PyArray_CumSum (multiarraymodule.c:991) ==28547== by 0x5E7611C: array_cumsum (arraymethods.c:1533) ==28547== by 0x417E72: PyObject_Call (abstract.c:1861) ==28547== by 0x4860E9: PyEval_EvalFrameEx (ceval.c:3784) ==28547== by 0x488B76: PyEval_EvalFrameEx (ceval.c:3659) ==28547== by 0x48A375: PyEval_EvalCodeEx (ceval.c:2836) Cheers St?fan From sransom at nrao.edu Wed May 28 10:59:10 2008 From: sransom at nrao.edu (Scott Ransom) Date: Wed, 28 May 2008 10:59:10 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> Message-ID: <200805281059.11789.sransom@nrao.edu> Hmmm. Interesting. I'm on a 64-bit Debian Unstable system with numpy 1.0.4 and python 2.5.2 and I don't get this: In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.0.4' In [3]: def fn(): ...: x = np.random.rand(5,2) ...: x.cumsum(None, out=x) ...: return x ...: In [4]: fn() Out[4]: array([[ 0.40329303, 0.45335328], [ 0.85664631, 0.84798294], [ 1.71329262, 0.05877989], [ 2.56127556, 0.99401291], [ 4.27456818, 0.79275409]]) Wonder if the 64-bit thing could be the difference? Scott On Wednesday 28 May 2008 10:53:22 am St?fan van der Walt wrote: > 2008/5/28 Keith Goodman : > > Does anyone else get this seg fault? > > > >>> def fn(): > > > > x = np.random.rand(5,2) > > x.cumsum(None, out=x) > > return x > > > > ....: > >>> fn() > > > > *** glibc detected *** /usr/bin/python: double free or corruption > > (out): 0x08212dc8 *** > > > > I'm running 1.0.4 from Debian Lenny with python 2.5.2 compiled with > > gcc 4.2.3-3, if that matters. > > Yeap, big bada-boom. Well caught! 
> > ==28547== Source and destination overlap in memcpy(0x5B83198, > 0x5B83198, 8) ==28547== at 0x4C2508B: memcpy > (mc_replace_strmem.c:402) > ==28547== by 0x60C2574: PyUFunc_GenericReduction > (ufuncobject.c:2758) ==28547== by 0x417E72: PyObject_Call > (abstract.c:1861) > ==28547== by 0x5E5AD22: PyArray_GenericAccumulateFunction > (arrayobject.c:3531) > ==28547== by 0x5E76031: PyArray_CumSum (multiarraymodule.c:991) > ==28547== by 0x5E7611C: array_cumsum (arraymethods.c:1533) > ==28547== by 0x417E72: PyObject_Call (abstract.c:1861) > ==28547== by 0x4860E9: PyEval_EvalFrameEx (ceval.c:3784) > ==28547== by 0x488B76: PyEval_EvalFrameEx (ceval.c:3659) > ==28547== by 0x48A375: PyEval_EvalCodeEx (ceval.c:2836) > > ==28547== Invalid write of size 8 > ==28547== at 0x60B6362: DOUBLE_add (umathmodule.c.src:1019) > ==28547== by 0x60C25A3: PyUFunc_GenericReduction > (ufuncobject.c:2762) ==28547== by 0x417E72: PyObject_Call > (abstract.c:1861) > ==28547== by 0x5E5AD22: PyArray_GenericAccumulateFunction > (arrayobject.c:3531) > ==28547== by 0x5E76031: PyArray_CumSum (multiarraymodule.c:991) > ==28547== by 0x5E7611C: array_cumsum (arraymethods.c:1533) > ==28547== by 0x417E72: PyObject_Call (abstract.c:1861) > ==28547== by 0x4860E9: PyEval_EvalFrameEx (ceval.c:3784) > ==28547== by 0x488B76: PyEval_EvalFrameEx (ceval.c:3659) > ==28547== by 0x48A375: PyEval_EvalCodeEx (ceval.c:2836) > > Cheers > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From sransom at nrao.edu Wed May 28 11:07:16 2008 From: sransom at nrao.edu (Scott Ransom) Date: Wed, 28 May 2008 11:07:16 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1d36917a0805280751u47962f8cy15ac55f81b445b20@mail.gmail.com> References: <1d36917a0805280751u47962f8cy15ac55f81b445b20@mail.gmail.com> Message-ID: <200805281107.16892.sransom@nrao.edu> On Wednesday 28 May 2008 10:51:20 am Alan McIntyre wrote: > On Wed, May 28, 2008 at 10:30 AM, Keith Goodman wrote: > > Does anyone else get this seg fault? > > > >>> def fn(): > > > > x = np.random.rand(5,2) > > x.cumsum(None, out=x) > > return x > > > > ....: > >>> fn() > > > > *** glibc detected *** /usr/bin/python: double free or corruption > > (out): 0x08212dc8 *** > > > > I'm running 1.0.4 from Debian Lenny with python 2.5.2 compiled with > > gcc 4.2.3-3, if that matters. > > Yep I get one here too, using numpy and Python 2.5 from svn trunk. > It doesn't always happen on the first invocation of the function. Ah. That appears to be correct. My last posting said that I didn't see this. But after running it a few times I get a segfault. Scott -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. 
email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From nwagner at iam.uni-stuttgart.de Wed May 28 11:17:45 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 28 May 2008 17:17:45 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <200805281107.16892.sransom@nrao.edu> References: <1d36917a0805280751u47962f8cy15ac55f81b445b20@mail.gmail.com> <200805281107.16892.sransom@nrao.edu> Message-ID: On Wed, 28 May 2008 11:07:16 -0400 Scott Ransom wrote: > > On Wednesday 28 May 2008 10:51:20 am Alan McIntyre >wrote: >> On Wed, May 28, 2008 at 10:30 AM, Keith Goodman >> > wrote: >> > Does anyone else get this seg fault? >> > >> >>> def fn(): >> > >> > x = np.random.rand(5,2) >> > x.cumsum(None, out=x) >> > return x >> > >> > ....: >> >>> fn() >> > >> > *** glibc detected *** /usr/bin/python: double free or >>corruption >> > (out): 0x08212dc8 *** >> > >> > I'm running 1.0.4 from Debian Lenny with python 2.5.2 >>compiled with >> > gcc 4.2.3-3, if that matters. >> >> Yep I get one here too, using numpy and Python 2.5 from >>svn trunk. >> It doesn't always happen on the first invocation of the >>function. > > Ah. That appears to be correct. My last posting said >that I didn't see > this. But after running it a few times I get a >segfault. > > Scott > -- > Scott M. Ransom Address: NRAO > Phone: (434) 296-0320 520 Edgemont Rd. > email: sransom at nrao.edu Charlottesville, VA >22903 USA > GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA >B6FF FFD3 2989 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion Here is a backtrace Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 182894182272 (LWP 22329)] collect (generation=2) at Modules/gcmodule.c:242 242 gc->gc.gc_refs = FROM_GC(gc)->ob_refcnt; (gdb) bt #0 collect (generation=2) at Modules/gcmodule.c:242 #1 0x00000000004ba049 in PyGC_Collect () at Modules/gcmodule.c:1265 #2 0x00000000004af43d in Py_Finalize () at Python/pythonrun.c:387 #3 0x0000000000411c97 in Py_Main (argc=-1073743320, argv=Variable "argv" is not available. ) at Modules/main.c:545 #4 0x0000003643a1c3fb in __libc_start_main () from /lib64/tls/libc.so.6 #5 0x000000000041163a in _start () From kwgoodman at gmail.com Wed May 28 11:21:33 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 28 May 2008 08:21:33 -0700 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: Message-ID: On Wed, May 28, 2008 at 7:30 AM, Keith Goodman wrote: > Does anyone else get this seg fault? > >>> def fn(): > x = np.random.rand(5,2) > x.cumsum(None, out=x) > return x > ....: >>> fn() > *** glibc detected *** /usr/bin/python: double free or corruption > (out): 0x08212dc8 *** I replaced cumsum with sum and with max. Both give ValueError: wrong shape for output So is that the problem? No check for shape is made? From pav at iki.fi Wed May 28 11:27:54 2008 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 28 May 2008 18:27:54 +0300 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <200805281059.11789.sransom@nrao.edu> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> Message-ID: <1211988474.29961.153.camel@localhost> ke, 2008-05-28 kello 10:59 -0400, Scott Ransom kirjoitti: > Hmmm. Interesting. 
I'm on a 64-bit Debian Unstable system with numpy > 1.0.4 and python 2.5.2 and I don't get this: > > In [1]: import numpy as np > > In [2]: np.__version__ > Out[2]: '1.0.4' > > In [3]: def fn(): > ...: x = np.random.rand(5,2) > ...: x.cumsum(None, out=x) > ...: return x > ...: > > In [4]: fn() > Out[4]: > array([[ 0.40329303, 0.45335328], > [ 0.85664631, 0.84798294], > [ 1.71329262, 0.05877989], > [ 2.56127556, 0.99401291], > [ 4.27456818, 0.79275409]]) > > Wonder if the 64-bit thing could be the difference? Try running fn() again to make the bug bomb out: $ python Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> imp KeyboardInterrupt >>> import numpy as np >>> def fn(): ... x = np.random.rand(5,2) ... x.cumsum(None, out=x) ... return x ... >>> np.__version__ '1.0.5.dev5024' >>> fn() array([[ 0.59654253, 0.12577169], [ 0.72231422, 0.30600244], [ 1.44462843, 0.35849553], [ 1.75063088, 0.56925858], [ 3.19525931, 0.77487798]]) >>> fn() *** glibc detected *** python: munmap_chunk(): invalid pointer: 0x08439d28 *** -- Pauli Virtanen From hoytak at gmail.com Wed May 28 11:55:57 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Wed, 28 May 2008 08:55:57 -0700 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1211988474.29961.153.camel@localhost> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> Message-ID: <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> In my experience tracking down these sorts of things, if the effect is delayed and detected by glibc, it almost always means that a few bytes beyond the end of the data part of an array have been overwritten. This causes glibc's memory management stuff to crash later on when the object is deallocated (or something like that). Of course, I should say I was doing the overwritting in my own c code... --Hoyt On Wed, May 28, 2008 at 8:27 AM, Pauli Virtanen wrote: > ke, 2008-05-28 kello 10:59 -0400, Scott Ransom kirjoitti: >> Hmmm. Interesting. I'm on a 64-bit Debian Unstable system with numpy >> 1.0.4 and python 2.5.2 and I don't get this: >> >> In [1]: import numpy as np >> >> In [2]: np.__version__ >> Out[2]: '1.0.4' >> >> In [3]: def fn(): >> ...: x = np.random.rand(5,2) >> ...: x.cumsum(None, out=x) >> ...: return x >> ...: >> >> In [4]: fn() >> Out[4]: >> array([[ 0.40329303, 0.45335328], >> [ 0.85664631, 0.84798294], >> [ 1.71329262, 0.05877989], >> [ 2.56127556, 0.99401291], >> [ 4.27456818, 0.79275409]]) >> >> Wonder if the 64-bit thing could be the difference? > > Try running fn() again to make the bug bomb out: > > $ python > Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) > [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> imp > KeyboardInterrupt >>>> import numpy as np >>>> def fn(): > ... x = np.random.rand(5,2) > ... x.cumsum(None, out=x) > ... return x > ... 
>>>> np.__version__ > '1.0.5.dev5024' >>>> fn() > array([[ 0.59654253, 0.12577169], > [ 0.72231422, 0.30600244], > [ 1.44462843, 0.35849553], > [ 1.75063088, 0.56925858], > [ 3.19525931, 0.77487798]]) >>>> fn() > *** glibc detected *** python: munmap_chunk(): invalid pointer: > 0x08439d28 *** > > -- > Pauli Virtanen > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- +++++++++++++++++++++++++++++++++++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ hoytak at gmail.com +++++++++++++++++++++++++++++++++++ From stefan at sun.ac.za Wed May 28 12:39:22 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 18:39:22 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> Message-ID: <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> 2008/5/28 Hoyt Koepke : > In my experience tracking down these sorts of things, if the effect is > delayed and detected by glibc, it almost always means that a few bytes > beyond the end of the data part of an array have been overwritten. > This causes glibc's memory management stuff to crash later on when the > object is deallocated (or something like that). Of course, I should > say I was doing the overwritting in my own c code... If you look at the valgrind trace I sent earlier, you'll see that that is the case. Regards St?fan From millman at berkeley.edu Wed May 28 12:46:09 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 28 May 2008 09:46:09 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.1.0 Message-ID: I'm pleased to announce the release of NumPy 1.1.0. NumPy is the fundamental package needed for scientific computing with Python. It contains: * a powerful N-dimensional array object * sophisticated (broadcasting) functions * basic linear algebra functions * basic Fourier transforms * sophisticated random number capabilities * tools for integrating Fortran code. Besides it's obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide-variety of databases. This is the first minor release since the 1.0 release in October 2006. There are a few major changes, which introduce some minor API breakage. In addition this release includes tremendous improvements in terms of bug-fixing, testing, and documentation. For information, please see the release notes: http://sourceforge.net/project/shownotes.php?release_id=602575&group_id=1369 Thank you to everybody who contributed to this release. 
Enjoy, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From charlesr.harris at gmail.com Wed May 28 12:58:03 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 10:58:03 -0600 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 10:39 AM, St?fan van der Walt wrote: > 2008/5/28 Hoyt Koepke : > > In my experience tracking down these sorts of things, if the effect is > > delayed and detected by glibc, it almost always means that a few bytes > > beyond the end of the data part of an array have been overwritten. > > This causes glibc's memory management stuff to crash later on when the > > object is deallocated (or something like that). Of course, I should > > say I was doing the overwritting in my own c code... > > If you look at the valgrind trace I sent earlier, you'll see that that > is the case. > It's shape related. In [7]: x = numpy.random.rand(5,2) In [8]: y = ones((5,2)) In [9]: x.cumsum(None,out=y) Out[9]: array([[ 0.76943981, 1. ], [ 1.12678411, 1. ], [ 1.69498328, 1. ], [ 2.50560628, 1. ], [ 3.23050034, 1. ]]) In [10]: x.cumsum(None,out=x.reshape(10)) Out[10]: array([ 0.76943981, 1.12678411, 1.69498328, 2.50560628, 3.23050034, 3.82341732, 4.78267467, 4.94663937, 5.39959179, 5.94577506]) In [11]: x.cumsum(None,out=x.reshape(10)) Out[11]: array([ 0.76943981, 1.89622392, 3.5912072 , 6.09681347, 9.32731382, 13.15073114, 17.9334058 , 22.88004517, 28.27963696, 34.22541202]) In [12]: x.cumsum(None,out=x.reshape(10)) Out[12]: array([ 0.76943981, 2.66566373, 6.25687093, 12.3536844 , 21.68099822, 34.83172935, 52.76513516, 75.64518033, 103.92481729, 138.1502293 ]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed May 28 13:22:22 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 19:22:22 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> Message-ID: <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> 2008/5/28 Charles R Harris : > It's shape related. > > In [7]: x = numpy.random.rand(5,2) > > In [8]: y = ones((5,2)) > > In [9]: x.cumsum(None,out=y) > Out[9]: > array([[ 0.76943981, 1. ], > [ 1.12678411, 1. ], > [ 1.69498328, 1. ], > [ 2.50560628, 1. ], > [ 3.23050034, 1. ]]) Yes, that first column doesn't stop there :) So, would it work to use a flattened view on the array in ufuncobject.c? Regards St?fan From stefan at sun.ac.za Wed May 28 13:26:19 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 19:26:19 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.1.0 In-Reply-To: References: Message-ID: <9457e7c80805281026r7646d8b1kb88f98851819b24@mail.gmail.com> Jarrod, 2008/5/28 Jarrod Millman : > I'm pleased to announce the release of NumPy 1.1.0. 
Thank you for coordinating the birth of this behemoth release! We appreciate all the time and effort you put into it. Regards St?fan From stefan at sun.ac.za Wed May 28 13:36:26 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 19:36:26 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> Message-ID: <9457e7c80805281036q34f12ebcj1f9c1123561f1366@mail.gmail.com> 2008/5/28 St?fan van der Walt : > 2008/5/28 Charles R Harris : >> It's shape related. >> >> In [7]: x = numpy.random.rand(5,2) >> >> In [8]: y = ones((5,2)) >> >> In [9]: x.cumsum(None,out=y) >> Out[9]: >> array([[ 0.76943981, 1. ], >> [ 1.12678411, 1. ], >> [ 1.69498328, 1. ], >> [ 2.50560628, 1. ], >> [ 3.23050034, 1. ]]) > > Yes, that first column doesn't stop there :) > > So, would it work to use a flattened view on the array in ufuncobject.c? Scratch that. Such a hack would just hide the underlying problem. I'll keep quiet now until I've parsed construct_reduce and PyArray_UFuncReduce properly. Regards St?fan From charlesr.harris at gmail.com Wed May 28 13:37:42 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 11:37:42 -0600 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 11:22 AM, St?fan van der Walt wrote: > 2008/5/28 Charles R Harris : > > It's shape related. > > > > In [7]: x = numpy.random.rand(5,2) > > > > In [8]: y = ones((5,2)) > > > > In [9]: x.cumsum(None,out=y) > > Out[9]: > > array([[ 0.76943981, 1. ], > > [ 1.12678411, 1. ], > > [ 1.69498328, 1. ], > > [ 2.50560628, 1. ], > > [ 3.23050034, 1. ]]) > > Yes, that first column doesn't stop there :) > > So, would it work to use a flattened view on the array in ufuncobject.c? I think the bug is not raising an error on shape mismatch, the assumption on the first index follows from that. For the out=x parameter, I propose the rules: 1) x must have the shape of the expected output (1D in this case) 2) x must have the same type as the expected output (currently cast) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 28 13:55:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 11:55:34 -0600 Subject: [Numpy-discussion] proposal on ufunc bit ops Message-ID: Hi All, I would like to propose that the bit_ops preserve the length of the relevant types. Currently we have: In [1]: ones(1,dtype=int8) & ones(1,dtype=uint8) Out[1]: array([1], dtype=int16) In [3]: ones(1,dtype=int64) & ones(1,dtype=uint64) Out[3]: array([1], dtype=object) Note the increased size in the first case and the return of a Python long integer object in the second. 
As all the high order bits will be zeroed in any case when the smaller elements are sign extended, I propose that in these mixed cases an unsigned type be returned of the same size as the largest input size. I think this conforms with the rule of least surprise. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 28 14:01:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 12:01:16 -0600 Subject: [Numpy-discussion] proposal on ufunc shift operators. Message-ID: Hi All, Currently we have: In [2]: ones(1,dtype=int8) << ones(1,dtype=uint8) Out[2]: array([2], dtype=int16) In [4]: ones(1,dtype=int64) << ones(1,dtype=uint64) Out[4]: array([2], dtype=object) Note the increased size in the first case and the return of a Python long integer object in the second. I propose that these operators should preserve the type of the first argument, although this is not easy to do with the current ufunc setup. It is impossible to use a type of sufficient size for all shift values and preserving the type of the first argument is what I think most folks would expect. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed May 28 15:02:51 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 28 May 2008 14:02:51 -0500 Subject: [Numpy-discussion] What does "Ignoring attempt to set 'name' (from ... " mean ? In-Reply-To: <483D227B.9030902@ar.media.kyoto-u.ac.jp> References: <483D227B.9030902@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730805281202v1dcbaef5ocb4f1c45f3ea5fde@mail.gmail.com> On Wed, May 28, 2008 at 4:14 AM, David Cournapeau wrote: > Hi, > > I encounter this message when I am building a subtree of scipy (for > example scipy/sparsetools). What does it mean exactly ? Is the setup.py > doing something wrong ? Please provide the full error message with some context. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Wed May 28 15:17:10 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 15:17:10 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> Message-ID: <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> On Wed, May 28, 2008 at 1:37 PM, Charles R Harris wrote: > I think the bug is not raising an error on shape mismatch, the assumption on > the first index follows from that. For the out=x parameter, I propose the > rules: > > 1) x must have the shape of the expected output (1D in this case) > 2) x must have the same type as the expected output (currently cast) That seems to be consistent with the documented behavior of other functions that have an 'out' parameter. I wonder if this is something that ought to be looked at for all functions with an "out" parameter? ndarray.compress also had problems with array type mismatch (#789); I can't imagine that it's safe to assume only these two functions were doing it incorrectly. 
(Unless of course somebody has recently looked at all of them) Alan From charlesr.harris at gmail.com Wed May 28 15:34:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 13:34:07 -0600 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 1:17 PM, Alan McIntyre wrote: > On Wed, May 28, 2008 at 1:37 PM, Charles R Harris > wrote: > > I think the bug is not raising an error on shape mismatch, the assumption > on > > the first index follows from that. For the out=x parameter, I propose the > > rules: > > > > 1) x must have the shape of the expected output (1D in this case) > > 2) x must have the same type as the expected output (currently cast) > > That seems to be consistent with the documented behavior of other > functions that have an 'out' parameter. > > I wonder if this is something that ought to be looked at for all > functions with an "out" parameter? ndarray.compress also had problems > with array type mismatch (#789); I can't imagine that it's safe to > assume only these two functions were doing it incorrectly. (Unless of > course somebody has recently looked at all of them) > I think that is an excellent idea! A good start would be to list all the functions with the out parameter and then write some tests. The current behavior is inconsistent and we not only need to specify the behavior, but fix all the places that don't follow the rules. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed May 28 15:34:48 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 28 May 2008 21:34:48 +0200 Subject: [Numpy-discussion] proposal on ufunc shift operators. In-Reply-To: References: Message-ID: 2008/5/28 Charles R Harris : > Hi All, > > Currently we have: > > In [2]: ones(1,dtype=int8) << ones(1,dtype=uint8) > Out[2]: array([2], dtype=int16) > > In [4]: ones(1,dtype=int64) << ones(1,dtype=uint64) > Out[4]: array([2], dtype=object) > > Note the increased size in the first case and the return of a Python long > integer object in the second. I propose that these operators should preserve > the type of the first argument, although this is not easy to do with the > current ufunc setup. It is impossible to use a type of sufficient size for > all shift values and preserving the type of the first argument is what I > think most folks would expect. That sounds eminently sensible to me. Of course this will complicate explaining type rules for ufuncs. An alternative is to give it some reasonably simple non-exceptional rule and let users specify the output dtype if they care. Still, I think in the interest of minimal surprise, getting rid of the conversion to an object array of longs, at least, would be a good idea. Do we have a cyclic-shift ufunc? That is, uint8(1)<<<8==1? 
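There is no rotate ufunc that I know of, but a cyclic shift can be composed from the existing shift and bitwise-or operators. A minimal sketch for uint8 (the helper name and the modulo-8 convention are illustrative assumptions, not an existing API):

import numpy as np

def rol8(x, n):
    # Rotate uint8 values left by n bits, with n taken modulo 8.
    # Illustrative only: x is forced to uint8, and the final astype()
    # discards any high bits picked up by intermediate type promotion,
    # which is exactly what a rotation needs.
    x = np.asarray(x, dtype=np.uint8)
    n = n % 8
    return ((x << n) | (x >> ((8 - n) % 8))).astype(np.uint8)

# rol8(np.uint8(1), 8) gives 1, matching the uint8(1) <<< 8 == 1 example
# above; rol8(np.uint8(1), 1) gives 2.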
Anne From alan.mcintyre at gmail.com Wed May 28 16:06:54 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 16:06:54 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> Message-ID: <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> On Wed, May 28, 2008 at 3:34 PM, Charles R Harris wrote: >> I wonder if this is something that ought to be looked at for all >> functions with an "out" parameter? ndarray.compress also had problems >> with array type mismatch (#789); I can't imagine that it's safe to >> assume only these two functions were doing it incorrectly. (Unless of >> course somebody has recently looked at all of them) > > I think that is an excellent idea! A good start would be to list all the > functions with the out parameter and then write some tests. The current > behavior is inconsistent and we not only need to specify the behavior, but > fix all the places that don't follow the rules. There's a ticket (#416) that calls for the docstrings of such functions to be updated as well; that's something that should be easy to do at the same time. Since I'm working on tests nowadays, I'll see if I can work on coming up with a list and at least writing some tests. From kwgoodman at gmail.com Wed May 28 16:16:52 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 28 May 2008 13:16:52 -0700 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> References: <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 1:06 PM, Alan McIntyre wrote: > On Wed, May 28, 2008 at 3:34 PM, Charles R Harris > wrote: >>> I wonder if this is something that ought to be looked at for all >>> functions with an "out" parameter? ndarray.compress also had problems >>> with array type mismatch (#789); I can't imagine that it's safe to >>> assume only these two functions were doing it incorrectly. (Unless of >>> course somebody has recently looked at all of them) >> >> I think that is an excellent idea! A good start would be to list all the >> functions with the out parameter and then write some tests. The current >> behavior is inconsistent and we not only need to specify the behavior, but >> fix all the places that don't follow the rules. > > There's a ticket (#416) that calls for the docstrings of such > functions to be updated as well; that's something that should be easy > to do at the same time. Since I'm working on tests nowadays, I'll see > if I can work on coming up with a list and at least writing some > tests. 
I guess cumprod is an obvious one to try: >> def fn(): x = np.random.rand(5,2) x.cumprod(None, out=x) return x ....: >> fn() Segmentation fault From stefan at sun.ac.za Wed May 28 16:19:16 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 22:19:16 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> References: <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> Message-ID: <9457e7c80805281319p5a438629v99a0dfcc28d15896@mail.gmail.com> 2008/5/28 Alan McIntyre : > On Wed, May 28, 2008 at 3:34 PM, Charles R Harris > wrote: >>> I wonder if this is something that ought to be looked at for all >>> functions with an "out" parameter? ndarray.compress also had problems >>> with array type mismatch (#789); I can't imagine that it's safe to >>> assume only these two functions were doing it incorrectly. (Unless of >>> course somebody has recently looked at all of them) >> >> I think that is an excellent idea! A good start would be to list all the >> functions with the out parameter and then write some tests. The current >> behavior is inconsistent and we not only need to specify the behavior, but >> fix all the places that don't follow the rules. > > There's a ticket (#416) that calls for the docstrings of such > functions to be updated as well; that's something that should be easy > to do at the same time. Since I'm working on tests nowadays, I'll see > if I can work on coming up with a list and at least writing some > tests. A reminder: if docstrings need to be updated, it is really easy to do: http://sd-2116.dedibox.fr/doc/Docstrings/ Pauli has been hard at work at writing a Django app to replace the current wiki. It has proper work-flow, status tags and lots more bells-and-whistles; we'll soon switch over. In the meantime, the wiki remains the place to go. Regards St?fan From kwgoodman at gmail.com Wed May 28 16:49:27 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 28 May 2008 13:49:27 -0700 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 1:16 PM, Keith Goodman wrote: > On Wed, May 28, 2008 at 1:06 PM, Alan McIntyre wrote: >> On Wed, May 28, 2008 at 3:34 PM, Charles R Harris >> wrote: >>>> I wonder if this is something that ought to be looked at for all >>>> functions with an "out" parameter? ndarray.compress also had problems >>>> with array type mismatch (#789); I can't imagine that it's safe to >>>> assume only these two functions were doing it incorrectly. (Unless of >>>> course somebody has recently looked at all of them) >>> >>> I think that is an excellent idea! A good start would be to list all the >>> functions with the out parameter and then write some tests. The current >>> behavior is inconsistent and we not only need to specify the behavior, but >>> fix all the places that don't follow the rules. 
>> >> There's a ticket (#416) that calls for the docstrings of such >> functions to be updated as well; that's something that should be easy >> to do at the same time. Since I'm working on tests nowadays, I'll see >> if I can work on coming up with a list and at least writing some >> tests. > > I guess cumprod is an obvious one to try: > >>> def fn(): > x = np.random.rand(5,2) > x.cumprod(None, out=x) > return x > > ....: >>> fn() > Segmentation fault Are there any guarantees that the overwritten memory belongs to numpy so that the segfault cleans up its own mess? From alan.mcintyre at gmail.com Wed May 28 16:55:59 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 16:55:59 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <9457e7c80805281319p5a438629v99a0dfcc28d15896@mail.gmail.com> References: <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> <9457e7c80805281319p5a438629v99a0dfcc28d15896@mail.gmail.com> Message-ID: <1d36917a0805281355s66718145yec135b6202b66625@mail.gmail.com> On Wed, May 28, 2008 at 4:19 PM, St?fan van der Walt wrote: > A reminder: if docstrings need to be updated, it is really easy to do: > > http://sd-2116.dedibox.fr/doc/Docstrings/ > > Pauli has been hard at work at writing a Django app to replace the > current wiki. It has proper work-flow, status tags and lots more > bells-and-whistles; we'll soon switch over. In the meantime, the wiki > remains the place to go. I must have missed the earlier mention of that wiki, so please pardon my ignorance: does editing the wiki propagate the updated docstring back into svn? From kwgoodman at gmail.com Wed May 28 17:07:46 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 28 May 2008 14:07:46 -0700 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1d36917a0805281355s66718145yec135b6202b66625@mail.gmail.com> References: <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> <9457e7c80805281319p5a438629v99a0dfcc28d15896@mail.gmail.com> <1d36917a0805281355s66718145yec135b6202b66625@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 1:55 PM, Alan McIntyre wrote: > On Wed, May 28, 2008 at 4:19 PM, St?fan van der Walt wrote: >> A reminder: if docstrings need to be updated, it is really easy to do: >> >> http://sd-2116.dedibox.fr/doc/Docstrings/ >> >> Pauli has been hard at work at writing a Django app to replace the >> current wiki. It has proper work-flow, status tags and lots more >> bells-and-whistles; we'll soon switch over. In the meantime, the wiki >> remains the place to go. > > I must have missed the earlier mention of that wiki, so please pardon > my ignorance: does editing the wiki propagate the updated docstring > back into svn? Yes, St?fan does that. Once a week, I think. From oliphant at enthought.com Wed May 28 17:08:14 2008 From: oliphant at enthought.com (Travis E. 
Oliphant) Date: Wed, 28 May 2008 16:08:14 -0500 Subject: [Numpy-discussion] segmentation fault In-Reply-To: References: <9457e7c80805280753q7428286ejc4c1de7278594911@mail.gmail.com> <200805281059.11789.sransom@nrao.edu> <1211988474.29961.153.camel@localhost> <4db580fd0805280855w53ee4628hae180c71a7a96133@mail.gmail.com> <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> Message-ID: <483DC9BE.4070607@enthought.com> Charles R Harris wrote: > > > On Wed, May 28, 2008 at 11:22 AM, St?fan van der Walt > > wrote: > > 2008/5/28 Charles R Harris >: > > It's shape related. > > > > In [7]: x = numpy.random.rand(5,2) > > > > In [8]: y = ones((5,2)) > > > > In [9]: x.cumsum(None,out=y) > > Out[9]: > > array([[ 0.76943981, 1. ], > > [ 1.12678411, 1. ], > > [ 1.69498328, 1. ], > > [ 2.50560628, 1. ], > > [ 3.23050034, 1. ]]) > > Yes, that first column doesn't stop there :) > > So, would it work to use a flattened view on the array in > ufuncobject.c? > > > I think the bug is not raising an error on shape mismatch, the > assumption on the first index follows from that. For the out=x > parameter, I propose the rules: > > 1) x must have the shape of the expected output (1D in this case) +1 > 2) x must have the same type as the expected output (currently cast) -lots From stefan at sun.ac.za Wed May 28 17:10:20 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 28 May 2008 23:10:20 +0200 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <1d36917a0805281355s66718145yec135b6202b66625@mail.gmail.com> References: <9457e7c80805280939g5c550e08le598cc7d356c0d58@mail.gmail.com> <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> <9457e7c80805281319p5a438629v99a0dfcc28d15896@mail.gmail.com> <1d36917a0805281355s66718145yec135b6202b66625@mail.gmail.com> Message-ID: <9457e7c80805281410h32e0c3fbva717fd9f2b9179e0@mail.gmail.com> Hi Alan 2008/5/28 Alan McIntyre : > On Wed, May 28, 2008 at 4:19 PM, St?fan van der Walt wrote: >> A reminder: if docstrings need to be updated, it is really easy to do: >> >> http://sd-2116.dedibox.fr/doc/Docstrings/ >> >> Pauli has been hard at work at writing a Django app to replace the >> current wiki. It has proper work-flow, status tags and lots more >> bells-and-whistles; we'll soon switch over. In the meantime, the wiki >> remains the place to go. > > I must have missed the earlier mention of that wiki, so please pardon > my ignorance: does editing the wiki propagate the updated docstring > back into svn? Yes, we have mechanisms in place to do that. I haven't merged for a while, because I am hoping that we can move the docstrings over to the new (web application) system soon. If that doesn't happen, I will probably do a merge by Friday. 
You can read more about the effort at http://www.scipy.org/Developer_Zone/DocMarathon2008 and on the front-page of the wiki http://sd-2116.dedibox.fr/doc Regards St?fan From alan.mcintyre at gmail.com Wed May 28 17:41:03 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 17:41:03 -0400 Subject: [Numpy-discussion] segmentation fault In-Reply-To: <9457e7c80805281410h32e0c3fbva717fd9f2b9179e0@mail.gmail.com> References: <9457e7c80805281022n31b5590bubd20fe419dda4448@mail.gmail.com> <1d36917a0805281217n5140972byd6be0d16d9650eb1@mail.gmail.com> <1d36917a0805281306s18a19c69n3884704555d22799@mail.gmail.com> <9457e7c80805281319p5a438629v99a0dfcc28d15896@mail.gmail.com> <1d36917a0805281355s66718145yec135b6202b66625@mail.gmail.com> <9457e7c80805281410h32e0c3fbva717fd9f2b9179e0@mail.gmail.com> Message-ID: <1d36917a0805281441y335421k44482d26ca902f06@mail.gmail.com> On Wed, May 28, 2008 at 5:10 PM, St?fan van der Walt wrote: > Yes, we have mechanisms in place to do that. I haven't merged for a > while, because I am hoping that we can move the docstrings over to the > new (web application) system soon. If that doesn't happen, I will > probably do a merge by Friday. Wow, that's cool, thanks! From alan.mcintyre at gmail.com Wed May 28 17:49:21 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 17:49:21 -0400 Subject: [Numpy-discussion] List of function-like things with an 'out' parameter Message-ID: <1d36917a0805281449l59432b10kffe977ef1a9ef234@mail.gmail.com> On Wed, May 28, 2008 at 3:34 PM, Charles R Harris >> I wonder if this is something that ought to be looked at for all >> functions with an "out" parameter? ndarray.compress also had problems >> with array type mismatch (#789); I can't imagine that it's safe to >> assume only these two functions were doing it incorrectly. (Unless of >> course somebody has recently looked at all of them) > > I think that is an excellent idea! A good start would be to list all the > functions with the out parameter and then write some tests. The current > behavior is inconsistent and we not only need to specify the behavior, but > fix all the places that don't follow the rules. Here's a list of things in numpy that have an 'out' argument (and their arguments); I think I eliminated all the duplicated items (that are imported from a subpackage into one of the main packages, for example). There's stuff that's missing, probably; I think the C-implemented functions don't have argument lists programmatically available so I may parse their docstrings or something, but this is a start. 
numpy.all ['a', 'axis', 'out']
numpy.alltrue ['a', 'axis', 'out']
numpy.amax ['a', 'axis', 'out']
numpy.amin ['a', 'axis', 'out']
numpy.any ['a', 'axis', 'out']
numpy.around ['a', 'decimals', 'out']
numpy.choose ['a', 'choices', 'out', 'mode']
numpy.clip ['a', 'a_min', 'a_max', 'out']
numpy.compress ['condition', 'a', 'axis', 'out']
numpy.core.cumprod ['a', 'axis', 'dtype', 'out']
numpy.core.cumproduct ['a', 'axis', 'dtype', 'out']
numpy.core.cumsum ['a', 'axis', 'dtype', 'out']
numpy.core.defmatrix.matrix.all ['self', 'axis', 'out']
numpy.core.defmatrix.matrix.any ['self', 'axis', 'out']
numpy.core.fromnumeric.mean ['a', 'axis', 'dtype', 'out']
numpy.core.fromnumeric.prod ['a', 'axis', 'dtype', 'out']
numpy.core.fromnumeric.product ['a', 'axis', 'dtype', 'out']
numpy.core.fromnumeric.ptp ['a', 'axis', 'out']
numpy.core.fromnumeric.round_ ['a', 'decimals', 'out']
numpy.core.fromnumeric.sometrue ['a', 'axis', 'out']
numpy.core.fromnumeric.std ['a', 'axis', 'dtype', 'out', 'ddof']
numpy.core.fromnumeric.sum ['a', 'axis', 'dtype', 'out']
numpy.core.fromnumeric.take ['a', 'indices', 'axis', 'out', 'mode']
numpy.core.fromnumeric.trace ['a', 'offset', 'axis1', 'axis2', 'dtype', 'out']
numpy.core.fromnumeric.var ['a', 'axis', 'dtype', 'out', 'ddof']
numpy.lib.function_base.median ['a', 'axis', 'out', 'overwrite_input']
numpy.ma.choose ['indices', 't', 'out', 'mode']
numpy.ma.core.MaskedArray.compress ['self', 'condition', 'axis', 'out']
numpy.ma.core.max ['obj', 'axis', 'out']
numpy.ma.core.min ['array', 'axis', 'out']
numpy.ma.core.round_ ['a', 'decimals', 'out']
numpy.ma.extras.median ['a', 'axis', 'out', 'overwrite_input']
From peridot.faceted at gmail.com Wed May 28 19:26:05 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 28 May 2008 19:26:05 -0400 Subject: [Numpy-discussion] List of function-like things with an 'out' parameter In-Reply-To: <1d36917a0805281449l59432b10kffe977ef1a9ef234@mail.gmail.com> References: <1d36917a0805281449l59432b10kffe977ef1a9ef234@mail.gmail.com> Message-ID: 2008/5/28 Alan McIntyre : > On Wed, May 28, 2008 at 3:34 PM, Charles R Harris >> I wonder if this > is something that ought to be looked at for all >>> functions with an "out" parameter? ndarray.compress also had problems >>> with array type mismatch (#789); I can't imagine that it's safe to >>> assume only these two functions were doing it incorrectly. (Unless of >>> course somebody has recently looked at all of them) >> >> I think that is an excellent idea! A good start would be to list all the >> functions with the out parameter and then write some tests. The current >> behavior is inconsistent and we not only need to specify the behavior, but >> fix all the places that don't follow the rules. > > Here's a list of things in numpy that have an 'out' argument (and > their arguments); I think I eliminated all the duplicated items (that > are imported from a subpackage into one of the main packages, for > example). There's stuff that's missing, probably; I think the > C-implemented functions don't have argument lists programmatically > available so I may parse their docstrings or something, but this is a > start. Nice! One noticeable absence is all the ufuncs. (Partly this is because it's not actually called "out", or in fact anything at all; it's just the last parameter if there are enough.) You might also check things like objects returned by vectorize() and frompyfunc().
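(A rough sketch, not from the thread, of the preallocation pattern the proposed rules would standardize for the functions listed above; it assumes the NumPy 1.1-era calling convention, where ufuncs take the output array positionally, as noted just above:)

import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

# With axis=None the operation runs over the flattened array, so the
# output is preallocated with the matching shape and dtype.
out = np.empty(6, dtype=float)
np.cumsum(a, axis=None, out=out)

# Ufuncs have no keyword for the output at all; the array goes in the
# last positional slot.
res = np.empty_like(a)
np.add(a, a, res)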
Does it make sense to put this list on the Wiki somewhere, so that people who come across new things that take output parameters (however named) can post them? Anne From peridot.faceted at gmail.com Wed May 28 19:28:28 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 28 May 2008 19:28:28 -0400 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> References: <483C6194.3060809@enthought.com> <483C6BCA.80705@enthought.com> <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> Message-ID: 2008/5/27 Robert Kern : > Can we make it so that dtype('c') is preserved instead of displaying > '|S1'? It does not behave the same as dtype('|S1') although it > compares equal to it. It seems alarming to me that they should compare equal but behave differently. Is it possible to change more than just the way it prints? Anne From alan.mcintyre at gmail.com Wed May 28 19:46:44 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 28 May 2008 19:46:44 -0400 Subject: [Numpy-discussion] List of function-like things with an 'out' parameter In-Reply-To: References: <1d36917a0805281449l59432b10kffe977ef1a9ef234@mail.gmail.com> Message-ID: <1d36917a0805281646y73be8ba3x95d3d9106c8d88fa@mail.gmail.com> On Wed, May 28, 2008 at 7:26 PM, Anne Archibald wrote: > One noticeable absence is all the ufuncs. (Partly this is because it's > not actually called "out", or on fact anything at all; it's just the > last parameter if there are enough.) You might also check things like > objects returned by vectorize() and frompyfunc(). Here's an updated version of the list that includes ufuncs. > Does it make sense to put this list on the Wiki somewhere, so that > people who come across new things that take output parameters (however > named) can post them? I generated the list with a script; I don't know if that should be kept somewhere too. It could be modified to look for other things as well; I'm not sure how often people need lists of every function object in numpy with a given argument, though. Alan -------------- next part -------------- A non-text attachment was scrubbed... Name: func-like-with-out.txt.gz Type: application/x-gzip Size: 4400 bytes Desc: not available URL: From oliphant at enthought.com Wed May 28 20:52:15 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 28 May 2008 19:52:15 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: <483C6194.3060809@enthought.com> <483C6BCA.80705@enthought.com> <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> Message-ID: <483DFE3F.3070906@enthought.com> Anne Archibald wrote: > 2008/5/27 Robert Kern : > > >> Can we make it so that dtype('c') is preserved instead of displaying >> '|S1'? It does not behave the same as dtype('|S1') although it >> compares equal to it. >> > > It seems alarming to me that they should compare equal but behave > differently. Is it possible to change more than just the way it > prints? > comparison on dtype objects is about memory layout equivalency. Characters and length-1 strings are equivalent from a memory-layout perspective. -Travis From cournape at gmail.com Wed May 28 21:03:36 2008 From: cournape at gmail.com (David Cournapeau) Date: Thu, 29 May 2008 10:03:36 +0900 Subject: [Numpy-discussion] What does "Ignoring attempt to set 'name' (from ... " mean ? 
In-Reply-To: <3d375d730805281202v1dcbaef5ocb4f1c45f3ea5fde@mail.gmail.com> References: <483D227B.9030902@ar.media.kyoto-u.ac.jp> <3d375d730805281202v1dcbaef5ocb4f1c45f3ea5fde@mail.gmail.com> Message-ID: <5b8d13220805281803n1212f8a4hd8d5e540c9ef9e0e@mail.gmail.com> On Thu, May 29, 2008 at 4:02 AM, Robert Kern wrote: > > Please provide the full error message with some context. > For example, in scipy/sparse: python setup.py config gives Appending sparse.linalg.isolve configuration to sparse.linalg Ignoring attempt to set 'name' (from 'sparse.linalg' to 'sparse.linalg.isolve') /Users/david/local/lib/python2.5/site-packages/numpy/distutils/system_info.py:414: UserWarning: UMFPACK sparse solver (http://www.cise.ufl.edu/research/sparse/umfpack/) not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [umfpack]) or by setting the UMFPACK environment variable. warnings.warn(self.notfounderror.__doc__) Appending sparse.linalg.dsolve.umfpack configuration to sparse.linalg.dsolve Ignoring attempt to set 'name' (from 'sparse.linalg.dsolve' to 'sparse.linalg.dsolve.umfpack') Appending sparse.linalg.dsolve configuration to sparse.linalg Ignoring attempt to set 'name' (from 'sparse.linalg' to 'sparse.linalg.dsolve') Appending sparse.linalg.eigen.arpack configuration to sparse.linalg.eigen Ignoring attempt to set 'name' (from 'sparse.linalg.eigen' to 'sparse.linalg.eigen.arpack') Appending sparse.linalg.eigen.lobpcg configuration to sparse.linalg.eigen Ignoring attempt to set 'name' (from 'sparse.linalg.eigen' to 'sparse.linalg.eigen.lobpcg') Appending sparse.linalg.eigen configuration to sparse.linalg Ignoring attempt to set 'name' (from 'sparse.linalg' to 'sparse.linalg.eigen') Appending sparse.linalg configuration to sparse Ignoring attempt to set 'name' (from 'sparse' to 'sparse.linalg') Appending sparse.sparsetools configuration to sparse Ignoring attempt to set 'name' (from 'sparse' to 'sparse.sparsetools') running config thanks, David From robert.kern at gmail.com Wed May 28 21:16:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 28 May 2008 20:16:04 -0500 Subject: [Numpy-discussion] What does "Ignoring attempt to set 'name' (from ... " mean ? In-Reply-To: <5b8d13220805281803n1212f8a4hd8d5e540c9ef9e0e@mail.gmail.com> References: <483D227B.9030902@ar.media.kyoto-u.ac.jp> <3d375d730805281202v1dcbaef5ocb4f1c45f3ea5fde@mail.gmail.com> <5b8d13220805281803n1212f8a4hd8d5e540c9ef9e0e@mail.gmail.com> Message-ID: <3d375d730805281816j56ca4500v85ece93c501db218@mail.gmail.com> On Wed, May 28, 2008 at 8:03 PM, David Cournapeau wrote: > On Thu, May 29, 2008 at 4:02 AM, Robert Kern wrote: >> >> Please provide the full error message with some context. > > For example, in scipy/sparse: python setup.py config gives > > Appending sparse.linalg.isolve configuration to sparse.linalg > Ignoring attempt to set 'name' (from 'sparse.linalg' to 'sparse.linalg.isolve') They're fine. Ignore them. They are silenced from the main setup.py with config.set_options(quiet=True) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed May 28 21:18:11 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 28 May 2008 20:18:11 -0500 Subject: [Numpy-discussion] Is this a bug? 
In-Reply-To: <483DFE3F.3070906@enthought.com> References: <483C6194.3060809@enthought.com> <483C6BCA.80705@enthought.com> <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> <483DFE3F.3070906@enthought.com> Message-ID: <3d375d730805281818p50eaf7few6c5bc2f5fdb4af9@mail.gmail.com> On Wed, May 28, 2008 at 7:52 PM, Travis E. Oliphant wrote: > Anne Archibald wrote: >> 2008/5/27 Robert Kern : >> >>> Can we make it so that dtype('c') is preserved instead of displaying >>> '|S1'? It does not behave the same as dtype('|S1') although it >>> compares equal to it. >>> >> >> It seems alarming to me that they should compare equal but behave >> differently. Is it possible to change more than just the way it >> prints? >> > comparison on dtype objects is about memory layout equivalency. > Characters and length-1 strings are equivalent from a memory-layout > perspective. That would be fine if dtypes only represented memory layout. However, in this case, they also represent a difference in interpretation of str objects in the array() constructor. That is a real difference that needs to be reflected in __eq__ and __repr__. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed May 28 21:45:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 May 2008 19:45:16 -0600 Subject: [Numpy-discussion] List of function-like things with an 'out' parameter In-Reply-To: <1d36917a0805281646y73be8ba3x95d3d9106c8d88fa@mail.gmail.com> References: <1d36917a0805281449l59432b10kffe977ef1a9ef234@mail.gmail.com> <1d36917a0805281646y73be8ba3x95d3d9106c8d88fa@mail.gmail.com> Message-ID: On Wed, May 28, 2008 at 5:46 PM, Alan McIntyre wrote: > On Wed, May 28, 2008 at 7:26 PM, Anne Archibald > wrote: > > One noticeable absence is all the ufuncs. (Partly this is because it's > > not actually called "out", or on fact anything at all; it's just the > > last parameter if there are enough.) You might also check things like > > objects returned by vectorize() and frompyfunc(). > > Here's an updated version of the list that includes ufuncs. > > > Does it make sense to put this list on the Wiki somewhere, so that > > people who come across new things that take output parameters (however > > named) can post them? > > I generated the list with a script; I don't know if that should be > kept somewhere too. It could be modified to look for other things as > well; I'm not sure how often people need lists of every function > object in numpy with a given argument, though. > This might add some also: PyUFunc_GenericReduction Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From garg1 at ualberta.ca Thu May 29 00:30:44 2008 From: garg1 at ualberta.ca (Rahul Garg) Date: Wed, 28 May 2008 22:30:44 -0600 Subject: [Numpy-discussion] Python-to-C compiler : very prelim alpha Message-ID: <20080528223044.s2m2ubsk00w8wsso@webmail.ualberta.ca> Hi. I had announced a project called Spyke earlier on this list. The name has been changed to unPython. unPython is a Python to C compiler. The homepage is at www.cs.ualberta.ca/~garg1/unpython/index.php A very prelim release is available for download. You can also download the manual on the download page. Note that this is a very preliminary release. If it compiles anything correctly, I will be very surprised :) The license currently is GPLv3. 
Please do not judge the compiler by what it is right now but instead please try it out and give me comments/flames/bug reports etc. I have also set up a mailing list at http://groups.google.com/group/unpython-discuss For those of you who do not like to play at the bleeding edge, a slightly better release is scheduled around mid June. thanks, rahul From david at ar.media.kyoto-u.ac.jp Thu May 29 07:58:51 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 29 May 2008 20:58:51 +0900 Subject: [Numpy-discussion] What does "Ignoring attempt to set 'name' (from ... " mean ? In-Reply-To: <3d375d730805281816j56ca4500v85ece93c501db218@mail.gmail.com> References: <483D227B.9030902@ar.media.kyoto-u.ac.jp> <3d375d730805281202v1dcbaef5ocb4f1c45f3ea5fde@mail.gmail.com> <5b8d13220805281803n1212f8a4hd8d5e540c9ef9e0e@mail.gmail.com> <3d375d730805281816j56ca4500v85ece93c501db218@mail.gmail.com> Message-ID: <483E9A7B.7060405@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > They're fine. Ignore them. They are silenced from the main setup.py with > > config.set_options(quiet=True) > > What are the cases where those message are meaningful ? I did not understand from the distutils code what kind of issues were related to this message, cheers, David From garg1 at ualberta.ca Thu May 29 10:01:55 2008 From: garg1 at ualberta.ca (Rahul Garg) Date: Thu, 29 May 2008 08:01:55 -0600 Subject: [Numpy-discussion] C API : slicing? Message-ID: <20080529080155.sd5w2no204gscocw@webmail.ualberta.ca> Hi. Does the C api have some convenience functions for creating slices? For example : if I have a PyArrayObject *A, which represents lets say a 2d ndarray A in Python, is there a C api function to easily do the equivalent of A[a:b:c,d:e:f] ? thanks, rahul From strawman at astraw.com Thu May 29 10:57:34 2008 From: strawman at astraw.com (Andrew Straw) Date: Thu, 29 May 2008 07:57:34 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.1.0 In-Reply-To: References: Message-ID: <483EC45E.3050700@astraw.com> Thanks, Jarrod. Should I replace the old numpy 1.0.4 information at http://www.scipy.org/Download with the 1.1.0? It's still listing 1.0.4, but I wonder if there's some compatibility with scipy 0.6 issue that should cause it to stay at 1.0.4. In either case, I think the page should be updated -- particularly as searching Google for "numpy download" results in that page as the first hit. -Andrew Jarrod Millman wrote: > I'm pleased to announce the release of NumPy 1.1.0. > > NumPy is the fundamental package needed for scientific computing with > Python. It contains: > > * a powerful N-dimensional array object > * sophisticated (broadcasting) functions > * basic linear algebra functions > * basic Fourier transforms > * sophisticated random number capabilities > * tools for integrating Fortran code. > > Besides it's obvious scientific uses, NumPy can also be used as an > efficient multi-dimensional container of generic data. Arbitrary > data-types can be defined. This allows NumPy to seamlessly and > speedily integrate with a wide-variety of databases. > > This is the first minor release since the 1.0 release in > October 2006. There are a few major changes, which introduce > some minor API breakage. In addition this release includes > tremendous improvements in terms of bug-fixing, testing, and > documentation. > > For information, please see the release notes: > http://sourceforge.net/project/shownotes.php?release_id=602575&group_id=1369 > > Thank you to everybody who contributed to this release. 
> > Enjoy, > > From kwgoodman at gmail.com Thu May 29 11:57:25 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 29 May 2008 08:57:25 -0700 Subject: [Numpy-discussion] out= corner cases Message-ID: This looks good: >> import numpy as np >> x = np.random.rand(2,3) >> x.mean(None, out=x) --------------------------------------------------------------------------- ValueError: wrong shape for output But this is strange: >> x.std(None, out=x) 0.28264369725 >> x array([[ 0.54718012, 0.94296181, 0.23668961], [ 0.35561918, 0.80860405, 0.96713833]]) >> >> x.var(None, out=x) 0.0798874595952 >> x array([[ 0.54718012, 0.94296181, 0.23668961], [ 0.35561918, 0.80860405, 0.96713833]]) >> >> x.var(0, out=x) array([ 0.0091739 , 0.004513 , 0.13338883]) >> x array([[ 0.54718012, 0.94296181, 0.23668961], [ 0.35561918, 0.80860405, 0.96713833]]) I'm using numpy 1.0.4 from Debian Lenny. From stefan at sun.ac.za Thu May 29 12:26:16 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 29 May 2008 18:26:16 +0200 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: References: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> Message-ID: <9457e7c80805290926y1335b76dq797b7aca5555f20f@mail.gmail.com> 2008/5/23 Keith Goodman : > On Fri, May 23, 2008 at 11:44 AM, Robert Kern wrote: >> On Fri, May 23, 2008 at 12:22 PM, Keith Goodman wrote: >> >>> But the first example >>> >>>>> x = mp.matrix([[mp.nan]]) >>>>> x >>> matrix([[ NaN]]) >>>>> x.all() >>> True >>>>> x.any() >>> True >>> >>> is still surprising. >> >> On non-boolean arrays, .all() and .any() check each element to see if >> it is not equal to 0. NaN != 0. Returning False would be just as >> wrong. If there were a Maybe in addition to True and False, then >> perhaps that would be worth changing, but I don't see a reason to >> change the rule as it is. > > That makes sense. Hopefully it will find its way into the doc string. Hopefully you'll add it there :) Cheers St?fan From kwgoodman at gmail.com Thu May 29 13:22:41 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 29 May 2008 10:22:41 -0700 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: <9457e7c80805290926y1335b76dq797b7aca5555f20f@mail.gmail.com> References: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> <9457e7c80805290926y1335b76dq797b7aca5555f20f@mail.gmail.com> Message-ID: On Thu, May 29, 2008 at 9:26 AM, St?fan van der Walt wrote: > 2008/5/23 Keith Goodman : >> On Fri, May 23, 2008 at 11:44 AM, Robert Kern wrote: >>> On Fri, May 23, 2008 at 12:22 PM, Keith Goodman wrote: >>> >>>> But the first example >>>> >>>>>> x = mp.matrix([[mp.nan]]) >>>>>> x >>>> matrix([[ NaN]]) >>>>>> x.all() >>>> True >>>>>> x.any() >>>> True >>>> >>>> is still surprising. >>> >>> On non-boolean arrays, .all() and .any() check each element to see if >>> it is not equal to 0. NaN != 0. Returning False would be just as >>> wrong. If there were a Maybe in addition to True and False, then >>> perhaps that would be worth changing, but I don't see a reason to >>> change the rule as it is. >> >> That makes sense. Hopefully it will find its way into the doc string. > > Hopefully you'll add it there :) Yeah, but then I'd have to change these it's to its: Docstrings/numpy/ma/extras/polyfit . . . 1 match ...me when y is a 2D array. When full=True, the rank of the scaled Vandermonde matrix, it's effective rank in light of the rcond value, its singular values, and the specified ... Docstrings/numpy/lib/polynomial/polyfit . . . 
1 match ...me when y is a 2D array. When full=True, the rank of the scaled Vandermonde matrix, it's effective rank in light of the rcond value, its singular values, and the specified ... Docstrings/numpy/lib/index-tricks/nd-grid . . . 1 match ...However, if the step length is a **complex number** (e.g. 5j), then the integer part of it's magnitude is interpreted as specifying the number of points to create between the start... Docstrings/numpy/lib/-datasource/Repository . . . 1 match ...one base URL. Initialize the Respository with the base URL, then refer to each file by it's filename only. *Methods*: - exists : test if the file exists locally or remotely ... Docstrings/numpy/fft/fftpack/ifft . . . 1 match ...input array is expected to be packed the same way as the output of fft, as discussed in it's documentation. This is the inverse of fft: ifft(fft(a)) == a within numerical accuracy... From kwgoodman at gmail.com Thu May 29 13:53:30 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 29 May 2008 10:53:30 -0700 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: <9457e7c80805290926y1335b76dq797b7aca5555f20f@mail.gmail.com> References: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> <9457e7c80805290926y1335b76dq797b7aca5555f20f@mail.gmail.com> Message-ID: On Thu, May 29, 2008 at 9:26 AM, St?fan van der Walt wrote: > 2008/5/23 Keith Goodman : >> On Fri, May 23, 2008 at 11:44 AM, Robert Kern wrote: >>> On Fri, May 23, 2008 at 12:22 PM, Keith Goodman wrote: >>> >>>> But the first example >>>> >>>>>> x = mp.matrix([[mp.nan]]) >>>>>> x >>>> matrix([[ NaN]]) >>>>>> x.all() >>>> True >>>>>> x.any() >>>> True >>>> >>>> is still surprising. >>> >>> On non-boolean arrays, .all() and .any() check each element to see if >>> it is not equal to 0. NaN != 0. Returning False would be just as >>> wrong. If there were a Maybe in addition to True and False, then >>> perhaps that would be worth changing, but I don't see a reason to >>> change the rule as it is. >> >> That makes sense. Hopefully it will find its way into the doc string. > > Hopefully you'll add it there :) > > Cheers > St?fan My username is kwgoodman. From darren.dale at cornell.edu Thu May 29 13:54:23 2008 From: darren.dale at cornell.edu (Darren Dale) Date: Thu, 29 May 2008 13:54:23 -0400 Subject: [Numpy-discussion] question about histogram2d Message-ID: <200805291354.23664.darren.dale@cornell.edu> I have a question about histogram2d. Say I do something like: import numpy from numpy import random import pylab x=random.rand(1000)-0.5 y=random.rand(1000)*10-5 xbins=numpy.linspace(-10,10,100) ybins=numpy.linspace(-10,10,100) h,x,y=numpy.histogram2d(x,y,bins=[xbins,ybins]) pylab.imshow(h,interpolation='nearest') pylab.show() The output is attached. I think I would have expected the transpose of what numpy histogram2d returned, so the tight x distribution appears along the x axis in the image. Maybe I am thinking about this incorrectly, or there is a convention I am unfamiliar with. If the behavior is correct, could the docstring include a comment explaining the orientation of the histogram array? Thanks, Darren -- Darren S. Dale, Ph.D. Staff Scientist Cornell High Energy Synchrotron Source Cornell University 275 Wilson Lab Rt. 366 & Pine Tree Road Ithaca, NY 14853 darren.dale at cornell.edu office: (607) 255-3819 fax: (607) 255-9001 http://www.chess.cornell.edu -------------- next part -------------- A non-text attachment was scrubbed... 
Name: hist2dimage.png Type: image/png Size: 4305 bytes Desc: not available URL: From pav at iki.fi Thu May 29 15:02:54 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 29 May 2008 22:02:54 +0300 Subject: [Numpy-discussion] Any and all NaNs In-Reply-To: References: <3d375d730805231144u69f46632sd123a2e270f1d857@mail.gmail.com> <9457e7c80805290926y1335b76dq797b7aca5555f20f@mail.gmail.com> Message-ID: <1212087774.8590.0.camel@localhost.localdomain> to, 2008-05-29 kello 10:53 -0700, Keith Goodman kirjoitti: > On Thu, May 29, 2008 at 9:26 AM, St?fan van der Walt wrote: > > 2008/5/23 Keith Goodman : > >> On Fri, May 23, 2008 at 11:44 AM, Robert Kern wrote: [clip] > >> That makes sense. Hopefully it will find its way into the doc string. > > > > Hopefully you'll add it there :) > > > > Cheers > > St?fan > > My username is kwgoodman. Thanks, edit permissions added. Pauli From charlesr.harris at gmail.com Thu May 29 15:33:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 29 May 2008 13:33:34 -0600 Subject: [Numpy-discussion] Question for Travis Message-ID: Travis, What are the fundamental types for ndarrays? We have the c types, 'bBhHiIlLqQfdg', together with the boolean and complex types. Then we have types defined by length, int8, uint8, etc. The long types change length going from 32 to 64 bit machines, so there can be a couple of c-types corresponding to the same precision. So I am wondering which types should be considered "fundamental" and what types are used to identify pickled arrays and arrays stored with the new .nz formats. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu May 29 16:05:25 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 29 May 2008 16:05:25 -0400 Subject: [Numpy-discussion] From float to records Message-ID: <200805291605.26845.pgmdevlist@gmail.com> All, I have a set of arrays that I want to transform to records. Viewing them as a new dtype is usually sufficient, but fails occasionally. Here's an example: #--------------------------------------- import numpy as np testdtype = [('a',float),('b',float),('c',float)] test = np.random.rand(15).reshape(5,3) # View the (5,3) array as 5 records of 3 fields newrecord = test.view(testdtype) # Create a new array with the wrong shape test = np.random.rand(15).reshape(3,5) #Try to view it try: newrecord = test.T.view(testdtype) except ValueError, msg: print "Error creating new record on transpose: %s" % msg # That failed, but won't with a copy try: newrecord = test.T.copy().view(testdtype) except ValueError, msg: print "Error creating new record on transpose+copy: %s" % msg #--------------------------------------- * Could somebody explain me what goes wrong in the second case (transpose+view) ? Is it because the transpose doesn't own the data ? * Is there a way to transform my (3,5) array into a (5,) recordarray without a copy ? Thanks a lot in advance. From charlesr.harris at gmail.com Thu May 29 16:25:24 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 29 May 2008 14:25:24 -0600 Subject: [Numpy-discussion] From float to records In-Reply-To: <200805291605.26845.pgmdevlist@gmail.com> References: <200805291605.26845.pgmdevlist@gmail.com> Message-ID: On Thu, May 29, 2008 at 2:05 PM, Pierre GM wrote: > All, > I have a set of arrays that I want to transform to records. Viewing them as > a > new dtype is usually sufficient, but fails occasionally. 
Here's an example: > > #--------------------------------------- > import numpy as np > testdtype = [('a',float),('b',float),('c',float)] > test = np.random.rand(15).reshape(5,3) > # View the (5,3) array as 5 records of 3 fields > newrecord = test.view(testdtype) > # Create a new array with the wrong shape > test = np.random.rand(15).reshape(3,5) > #Try to view it > try: > newrecord = test.T.view(testdtype) > except ValueError, msg: > print "Error creating new record on transpose: %s" % msg > # That failed, but won't with a copy > try: > newrecord = test.T.copy().view(testdtype) > except ValueError, msg: > print "Error creating new record on transpose+copy: %s" % msg > #--------------------------------------- > > * Could somebody explain me what goes wrong in the second case > (transpose+view) ? Is it because the transpose doesn't own the data ? > > * Is there a way to transform my (3,5) array into a (5,) recordarray > without a > copy ? > I don't think so. The transpose is just a view, it doesn't move the elements around, =so the three elements you want to be contiguous, aren't. It's possible to transpose in place, but it can be a tricky operation and I don't think it is available in numpy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu May 29 16:55:03 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 29 May 2008 16:55:03 -0400 Subject: [Numpy-discussion] From float to records In-Reply-To: References: <200805291605.26845.pgmdevlist@gmail.com> Message-ID: <200805291655.03703.pgmdevlist@gmail.com> On Thursday 29 May 2008 16:25:24 Charles R Harris wrote: > > * Could somebody explain me what goes wrong in the second case > > (transpose+view) ? Is it because the transpose doesn't own the data ? > > > > * Is there a way to transform my (3,5) array into a (5,) recordarray > > without a > > copy ? > > I don't think so. The transpose is just a view, it doesn't move the > elements around, =so the three elements you want to be contiguous, aren't. > It's possible to transpose in place, but it can be a tricky operation and I > don't think it is available in numpy. Ahah, so that was a problem of contiguity. OK, that makes sense. Thanks. From millman at berkeley.edu Thu May 29 17:37:07 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 29 May 2008 14:37:07 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.1.0 In-Reply-To: <483EC45E.3050700@astraw.com> References: <483EC45E.3050700@astraw.com> Message-ID: On Thu, May 29, 2008 at 7:57 AM, Andrew Straw wrote: > Should I replace the old numpy 1.0.4 information at > http://www.scipy.org/Download with the 1.1.0? It's still listing 1.0.4, > but I wonder if there's some compatibility with scipy 0.6 issue that > should cause it to stay at 1.0.4. In either case, I think the page > should be updated -- particularly as searching Google for "numpy > download" results in that page as the first hit. No, I will take care of it. I was away from home and decided to make a relatively quiet release, since I might not be able to respond in case their were problems. I only sent the email to the NumPy discussion list hoping that if there were problems the people on the list could sort things out. I just got home and plan to take some time today to make the announce the new release more widely. 
Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From charlesr.harris at gmail.com Thu May 29 17:59:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 29 May 2008 15:59:12 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.1.0 In-Reply-To: References: <483EC45E.3050700@astraw.com> Message-ID: On Thu, May 29, 2008 at 3:37 PM, Jarrod Millman wrote: > On Thu, May 29, 2008 at 7:57 AM, Andrew Straw wrote: > > Should I replace the old numpy 1.0.4 information at > > http://www.scipy.org/Download with the 1.1.0? It's still listing 1.0.4, > > but I wonder if there's some compatibility with scipy 0.6 issue that > > should cause it to stay at 1.0.4. In either case, I think the page > > should be updated -- particularly as searching Google for "numpy > > download" results in that page as the first hit. > > No, I will take care of it. I was away from home and decided to make > a relatively quiet release, since I might not be able to respond in > case their were problems. I only sent the email to the NumPy > discussion list hoping that if there were problems the people on the > list could sort things out. I just got home and plan to take some > time today to make the announce the new release more widely. > I updated the download page and inserted an announcement this morning. Where are the problems? Andrew, your buildbot clients are offline, it that intentional? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Thu May 29 18:09:55 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 29 May 2008 15:09:55 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.1.0 In-Reply-To: References: <483EC45E.3050700@astraw.com> Message-ID: On Thu, May 29, 2008 at 2:59 PM, Charles R Harris wrote: >> No, I will take care of it. I was away from home and decided to make >> a relatively quiet release, since I might not be able to respond in >> case their were problems. I only sent the email to the NumPy >> discussion list hoping that if there were problems the people on the >> list could sort things out. I just got home and plan to take some >> time today to make the announce the new release more widely. > > I updated the download page and inserted an announcement this morning. Where > are the problems? Thanks, you beat me to it. I am not aware of any problems, but I wanted to make sure I could respond quickly in case, for example, I had uploaded a wrong file or one of the files was corrupt. Since I was at a meeting in St. Louis, I wasn't sure that I would be able to. I am back home and should be online most of the time now. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From stefan at sun.ac.za Thu May 29 18:28:08 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 30 May 2008 00:28:08 +0200 Subject: [Numpy-discussion] New documentation web application Message-ID: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> Hi all, The NumPy documentation project has taken another leap forward! Pauli Virtanen has, in a week of superhuman coding, produced a web application that enhances the work-flow and editing experience of NumPy docstrings on the web. 
Unfortunately, this means that those of you who signed up before will have to create new accounts at http://sd-2116.dedibox.fr/pydocweb Please, don't let this put you off! You have made great contributions so far, and you *will* see these changes in the next release of NumPy (1.2). Being able to say that you wrote part of the NumPy documentation certainly is something to be proud of! Please mail me or Pauli your usernames, and we shall add you to the editor or reviewer groups ASAP. The new web application has a number of advantages: - We can set the status of docstrings, e.g., modfied, in need of review, reviewed, in need of proof, proofed etc. - We can easy keep the docstrings in sync with the current SVN. This means that the developers can add to the docstrings without adversely influencing our effort. - We can keep track of documentation statistics, e.g., how many docstrings have an Examples section? - Docstrings are parsed according to the documentation standard, and displays as they would be in the reference guide. Again, please register at http://sd-2116.dedibox.fr/pydocweb To all those who have contributed so far, a big thank you! It is because of you that the author field of the Reference Guide reads "The Numpy Community". Regards St?fan From millman at berkeley.edu Thu May 29 18:42:43 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 29 May 2008 15:42:43 -0700 Subject: [Numpy-discussion] New documentation web application In-Reply-To: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> Message-ID: On Thu, May 29, 2008 at 3:28 PM, St?fan van der Walt wrote: > The NumPy documentation project has taken another leap forward! Pauli > Virtanen has, in a week of superhuman coding, produced a web > application that enhances the work-flow and editing experience of > NumPy docstrings on the web. Excellent. Thanks to everyone who is working on this. The documentation work that you are doing is going to make a huge difference in increasing NumPy adoption and will benefit the community in a major way. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From peridot.faceted at gmail.com Thu May 29 18:46:27 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 29 May 2008 18:46:27 -0400 Subject: [Numpy-discussion] New documentation web application In-Reply-To: References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> Message-ID: 2008/5/29 Jarrod Millman : > On Thu, May 29, 2008 at 3:28 PM, St?fan van der Walt wrote: >> The NumPy documentation project has taken another leap forward! Pauli >> Virtanen has, in a week of superhuman coding, produced a web >> application that enhances the work-flow and editing experience of >> NumPy docstrings on the web. > > Excellent. Thanks to everyone who is working on this. The > documentation work that you are doing is going to make a huge > difference in increasing NumPy adoption and will benefit the community > in a major way. Absolutely. It is also, already, improving the reliability and consistency of numpy's code as people carefully go over all the functions that have been neglected for years. This is a great project. 
Anne From wnbell at gmail.com Thu May 29 18:48:10 2008 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 29 May 2008 17:48:10 -0500 Subject: [Numpy-discussion] New documentation web application In-Reply-To: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> Message-ID: On Thu, May 29, 2008 at 5:28 PM, St?fan van der Walt wrote: > Hi all, > > The NumPy documentation project has taken another leap forward! Pauli > Virtanen has, in a week of superhuman coding, produced a web > application that enhances the work-flow and editing experience of > NumPy docstrings on the web. > > Unfortunately, this means that those of you who signed up before will > have to create new accounts at > > http://sd-2116.dedibox.fr/pydocweb Neat! I really like the layout. The red format warnings are a nice touch: http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.umath.exp/ -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From rkompass at gmx.de Thu May 29 19:36:48 2008 From: rkompass at gmx.de (Raul Kompass) Date: Fri, 30 May 2008 01:36:48 +0200 Subject: [Numpy-discussion] Question about indexing In-Reply-To: <200805291354.23664.darren.dale@cornell.edu> References: <200805291354.23664.darren.dale@cornell.edu> Message-ID: <483F3E10.80206@gmx.de> I'm new to using numpy. Today I experimented a bit with indexing motivated by the finding that although a[a>0.5] and a[where(a>0.5)] give the same expected result (elements of a greater than 0.5) a[argwhere(a>0.5)] results in something else (rows of a in different order). I tried to figure out when indexing will yield rows and when it will give me an element and I could not find a simple rule. I systematically tried and got the follwing: ---------------------------------- >>> from scipy import * >>> a = random.rand(10).reshape(2,5) >>> a array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) >>> a[0,1] # shape([0,1]) = (2,) 0.767957427399 >>> a[[0],[1]] # shape([[0],[1]]) = (2, 1) array([ 0.76795743]) >>> a[[0,1]] # shape([[0,1]]) = (1, 2) array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) >>> a[[[0,1]]] # shape([[[0,1]]]) = (1, 1, 2) array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) >>> a[[[0],[1]]] # shape([[[0],[1]]]) = (1, 2, 1) array([ 0.76795743]) >>> a[[[0]],[[1]]] # shape([[[0]],[[1]]]) = (2, 1, 1) array([[ 0.76795743]]) >>> a[[[[0,1]]]] # shape([[[[0,1]]]]) = (1, 1, 1, 2) array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) >>> a[[[[0],[1]]]] # shape([[[[0],[1]]]]) = (1, 1, 2, 1) array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062]], [[ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) >>> a[[[[0]],[[1]]]] # shape([[[[0]],[[1]]]]) = (1, 2, 1, 1) array([[ 0.76795743]]) >>> a[[[[0]]],[[[1]]]] # shape([[[[0]]],[[[1]]]]) = (2, 1, 1, 1) array([[[ 0.76795743]]]) ------------------------------------------- Can anyone explain this? 
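(Before the replies: a rough sketch, with a small made-up array, of the where/argwhere difference that motivated the question. where() returns a tuple of per-axis index arrays and can be used directly as an index; argwhere() returns an (N, ndim) coordinate array meant for iteration, and used as an index it acts as a fancy row index, which is why whole rows come back:)

import numpy as np

a = np.array([[0.1, 0.7],
              [0.9, 0.2]])

np.where(a > 0.5)        # (array([0, 1]), array([1, 0])) - a tuple
a[np.where(a > 0.5)]     # array([ 0.7,  0.9]) - the selected elements

np.argwhere(a > 0.5)     # array([[0, 1], [1, 0]]) - a plain 2-d array
a[np.argwhere(a > 0.5)]  # each integer selects a whole row: shape (2, 2, 2)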
Thank you very much, Raul From robince at gmail.com Thu May 29 19:57:20 2008 From: robince at gmail.com (Robin) Date: Fri, 30 May 2008 00:57:20 +0100 Subject: [Numpy-discussion] Question about indexing In-Reply-To: <483F3E10.80206@gmx.de> References: <200805291354.23664.darren.dale@cornell.edu> <483F3E10.80206@gmx.de> Message-ID: On Fri, May 30, 2008 at 12:36 AM, Raul Kompass wrote: > I'm new to using numpy. Today I experimented a bit with indexing > motivated by the finding that although > a[a>0.5] and a[where(a>0.5)] give the same expected result (elements of > a greater than 0.5) > a[argwhere(a>0.5)] results in something else (rows of a in different order). > > I tried to figure out when indexing will yield rows and when it will > give me an element and I could not find a simple rule. > > I systematically tried and got the follwing: > ---------------------------------- > >>> from scipy import * > >>> a = random.rand(10).reshape(2,5) > >>> a > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > > > >>> a[[0],[1]] # shape([[0],[1]]) = (2, 1) > array([ 0.76795743]) > > >>> a[[0,1]] # shape([[0,1]]) = (1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > >>> a[[[0,1]]] # shape([[[0,1]]]) = (1, 1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > >>> a[[[0],[1]]] # shape([[[0],[1]]]) = (1, 2, 1) > array([ 0.76795743]) > > >>> a[[[0]],[[1]]] # shape([[[0]],[[1]]]) = (2, 1, 1) > array([[ 0.76795743]]) > > >>> a[[[[0,1]]]] # shape([[[[0,1]]]]) = (1, 1, 1, 2) > array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) > > >>> a[[[[0],[1]]]] # shape([[[[0],[1]]]]) = (1, 1, 2, 1) > array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062]], > > [[ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) > > >>> a[[[[0]],[[1]]]] # shape([[[[0]],[[1]]]]) = (1, 2, 1, 1) > array([[ 0.76795743]]) > > >>> a[[[[0]]],[[[1]]]] # shape([[[[0]]],[[[1]]]]) = (2, 1, 1, 1) > array([[[ 0.76795743]]]) > ------------------------------------------- > > Can anyone explain this? > > Thank you very much, Hi, I don't have time to give a comprehensive answer - but I think I can offer a simple rule. The thing you are indexing (a) is 2 dimensional, so if you provide 2 arguments to index with (ie a[something, something]) you will get single elements - if you only provide a single argument (ie a[something]) it will pull out rows corresponding to the indexing. If you want just a specific element you have to add a second argument. Also - the outer [ ]'s in your indexing operations are just the syntax for indexing. So your shape comments are wrong: > >>> a[0,1] # shape([0,1]) = (2,) > 0.767957427399 you are indexing here with two scalars, 0,1. > >>> a[[0,1]] # shape([[0,1]]) = (1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) You are indexing here with a 1d list [0,1]. Since you don't provide a column index you get rows 0 and 1. If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1]. 
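(A quick check of how the two index lists pair up, using a throwaway array; as the correction in the next message points out, the pairs are (0,0) and (1,1), not (0,0) and (0,1):)

import numpy as np

a = np.arange(10).reshape(2, 5)

a[[0, 1]]          # one index object: rows 0 and 1, shape (2, 5)
a[[0, 1], [0, 1]]  # two index lists pair element-wise: array([a[0, 0], a[1, 1]])
a[[0, 1], [0, 3]]  # array([a[0, 0], a[1, 3]]) == array([0, 8])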
Hope this helps, Robin From robince at gmail.com Thu May 29 19:58:48 2008 From: robince at gmail.com (Robin) Date: Fri, 30 May 2008 00:58:48 +0100 Subject: [Numpy-discussion] Question about indexing In-Reply-To: References: <200805291354.23664.darren.dale@cornell.edu> <483F3E10.80206@gmx.de> Message-ID: On Fri, May 30, 2008 at 12:57 AM, Robin wrote: > You are indexing here with a 1d list [0,1]. Since you don't provide a > column index you get rows 0 and 1. > If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1]. Whoops - you get [0,0] and [1,1]. Robin From kwgoodman at gmail.com Thu May 29 20:02:28 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 29 May 2008 17:02:28 -0700 Subject: [Numpy-discussion] Question about indexing In-Reply-To: <483F3E10.80206@gmx.de> References: <200805291354.23664.darren.dale@cornell.edu> <483F3E10.80206@gmx.de> Message-ID: On Thu, May 29, 2008 at 4:36 PM, Raul Kompass wrote: > I'm new to using numpy. Today I experimented a bit with indexing > motivated by the finding that although > a[a>0.5] and a[where(a>0.5)] give the same expected result (elements of > a greater than 0.5) > a[argwhere(a>0.5)] results in something else (rows of a in different order). > > I tried to figure out when indexing will yield rows and when it will > give me an element and I could not find a simple rule. > > I systematically tried and got the follwing: > ---------------------------------- > >>> from scipy import * > >>> a = random.rand(10).reshape(2,5) > >>> a > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > > >>> a[0,1] # shape([0,1]) = (2,) > 0.767957427399 > > >>> a[[0],[1]] # shape([[0],[1]]) = (2, 1) > array([ 0.76795743]) > > >>> a[[0,1]] # shape([[0,1]]) = (1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > >>> a[[[0,1]]] # shape([[[0,1]]]) = (1, 1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > >>> a[[[0],[1]]] # shape([[[0],[1]]]) = (1, 2, 1) > array([ 0.76795743]) > > >>> a[[[0]],[[1]]] # shape([[[0]],[[1]]]) = (2, 1, 1) > array([[ 0.76795743]]) > > >>> a[[[[0,1]]]] # shape([[[[0,1]]]]) = (1, 1, 1, 2) > array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) > > >>> a[[[[0],[1]]]] # shape([[[[0],[1]]]]) = (1, 1, 2, 1) > array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062]], > > [[ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) > > >>> a[[[[0]],[[1]]]] # shape([[[[0]],[[1]]]]) = (1, 2, 1, 1) > array([[ 0.76795743]]) > > >>> a[[[[0]]],[[[1]]]] # shape([[[[0]]],[[[1]]]]) = (2, 1, 1, 1) > array([[[ 0.76795743]]]) > ------------------------------------------- Looks confusing to me too. I guess it's best to take it one step at a time. >> import numpy as np >> a = np.arange(6).reshape(2,3) >> a[0,1] 1 That's not surprising. >> a[[0,1]] That one looks odd. But it is just shorthand for: >> a[[0,1],:] So rows 0 and 1 and all columns. array([[0, 1, 2], [3, 4, 5]]) This gives the same thing: >> a[0:2,:] array([[0, 1, 2], [3, 4, 5]]) Only it's not quite the same thing. 
a[[0,1],:] returns a copy and a[0:2,:] returns a view >> a[[0,1],:].flags.owndata True >> a[0:2,:].flags.owndata False From david.huard at gmail.com Thu May 29 20:21:58 2008 From: david.huard at gmail.com (David Huard) Date: Thu, 29 May 2008 20:21:58 -0400 Subject: [Numpy-discussion] question about histogram2d In-Reply-To: <200805291354.23664.darren.dale@cornell.edu> References: <200805291354.23664.darren.dale@cornell.edu> Message-ID: <91cf711d0805291721qb5835fy721fe812a02caa26@mail.gmail.com> Hi Darren, If I remember correctly, the thinking under the current behavior is that it preserves similarity of results with histogramdd, where the histogram is oriented in the numpy order (columns, rows). I thought that making histogram2d(x,y) return something different than histogramdd([x,y]) was probably worst than satisfying the cartesian convention. Regards, David 2008/5/29 Darren Dale : > I have a question about histogram2d. Say I do something like: > > import numpy > from numpy import random > import pylab > > x=random.rand(1000)-0.5 > y=random.rand(1000)*10-5 > > xbins=numpy.linspace(-10,10,100) > ybins=numpy.linspace(-10,10,100) > h,x,y=numpy.histogram2d(x,y,bins=[xbins,ybins]) > > pylab.imshow(h,interpolation='nearest') > pylab.show() > > The output is attached. I think I would have expected the transpose of what > numpy histogram2d returned, so the tight x distribution appears along the x > axis in the image. Maybe I am thinking about this incorrectly, or there is > a > convention I am unfamiliar with. If the behavior is correct, could the > docstring include a comment explaining the orientation of the histogram > array? > > Thanks, > Darren > > -- > Darren S. Dale, Ph.D. > Staff Scientist > Cornell High Energy Synchrotron Source > Cornell University > 275 Wilson Lab > Rt. 366 & Pine Tree Road > Ithaca, NY 14853 > > darren.dale at cornell.edu > office: (607) 255-3819 > fax: (607) 255-9001 > http://www.chess.cornell.edu > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Thu May 29 21:32:57 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 29 May 2008 21:32:57 -0400 Subject: [Numpy-discussion] Question about indexing In-Reply-To: References: <200805291354.23664.darren.dale@cornell.edu><483F3E10.80206@gmx.de> Message-ID: On Thu, 29 May 2008, Keith Goodman apparently wrote: > >>> a[[0,1]] > That one looks odd. But it is just shorthand for: > >>> a[[0,1],:] Do you mean that ``a[[0,1],:]`` is a more primitive expression than ``a[[0,1]]``? In what sense, and does it ever matter? Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` and ``a[[0,1],:]``? Thanks, Alan Isaac From darren.dale at cornell.edu Thu May 29 21:34:34 2008 From: darren.dale at cornell.edu (Darren Dale) Date: Thu, 29 May 2008 21:34:34 -0400 Subject: [Numpy-discussion] question about histogram2d In-Reply-To: <91cf711d0805291721qb5835fy721fe812a02caa26@mail.gmail.com> References: <200805291354.23664.darren.dale@cornell.edu> <91cf711d0805291721qb5835fy721fe812a02caa26@mail.gmail.com> Message-ID: <200805292134.34858.darren.dale@cornell.edu> Hi David, In that case, I suggest histogram2d could be improved with a brief comment in the docstring to indicate how the output is formatted. 
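(A sketch of the orientation fix such a docstring comment would describe, reusing the example from the original message; it assumes, consistent with the behavior described above, that the first axis of the returned histogram follows x, so imshow() needs a transpose to put x on the horizontal axis:)

import numpy as np
import pylab

x = np.random.rand(1000) - 0.5
y = np.random.rand(1000) * 10 - 5
bins = np.linspace(-10, 10, 100)
h, xedges, yedges = np.histogram2d(x, y, bins=[bins, bins])

# h[i, j] counts samples with x in xedges[i:i+2] and y in yedges[j:j+2],
# so the first axis of h follows x.  imshow() draws the first axis
# vertically; transpose (and fix the origin) to get x running horizontally.
pylab.imshow(h.T, origin='lower',
             extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],
             interpolation='nearest')
pylab.show()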
Cheers, Darren On Thursday 29 May 2008 8:21:58 pm David Huard wrote: > Hi Darren, > > If I remember correctly, the thinking under the current behavior is that it > preserves similarity of results with histogramdd, where the histogram is > oriented in the numpy order (columns, rows). I thought that making > histogram2d(x,y) return something different than histogramdd([x,y]) was > probably worst than satisfying the cartesian convention. > > Regards, > > David > > 2008/5/29 Darren Dale : > > I have a question about histogram2d. Say I do something like: > > > > import numpy > > from numpy import random > > import pylab > > > > x=random.rand(1000)-0.5 > > y=random.rand(1000)*10-5 > > > > xbins=numpy.linspace(-10,10,100) > > ybins=numpy.linspace(-10,10,100) > > h,x,y=numpy.histogram2d(x,y,bins=[xbins,ybins]) > > > > pylab.imshow(h,interpolation='nearest') > > pylab.show() > > > > The output is attached. I think I would have expected the transpose of > > what numpy histogram2d returned, so the tight x distribution appears > > along the x axis in the image. Maybe I am thinking about this > > incorrectly, or there is a > > convention I am unfamiliar with. If the behavior is correct, could the > > docstring include a comment explaining the orientation of the histogram > > array? > > > > Thanks, > > Darren > > > > -- > > Darren S. Dale, Ph.D. > > Staff Scientist > > Cornell High Energy Synchrotron Source > > Cornell University > > 275 Wilson Lab > > Rt. 366 & Pine Tree Road > > Ithaca, NY 14853 > > > > darren.dale at cornell.edu > > office: (607) 255-3819 > > fax: (607) 255-9001 > > http://www.chess.cornell.edu > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion From fperez.net at gmail.com Thu May 29 22:45:24 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 29 May 2008 19:45:24 -0700 Subject: [Numpy-discussion] Tutorials at Scipy 2008 Message-ID: [ This is meant as a heads-up here, please keep the discussion on the SciPy user list so we can focus the conversation in one list only. ] Hi all, Travis Oliphant and myself have signed up to coordinate the tutorials sessions at this year's SciPy conference. Our tentative plan is described here: http://scipy.org/SciPy2008/Tutorials but it basically consists of holding in parallel: 1. A 2-day hands-on tutorial for beginners. 2. A set of 2 or 4 hour sessions on special topics. We need input from people on: - Do you like this idea? - If yes for #1, any suggestions/wishes? Eric Jones, Travis O and myself have all taught similar things and could potentially do it again, but none of us is trying to impose it. If someone else wants to do it, by all means mention it. The job could be split across multiple people once an agenda is organized. - For #2, please go to the wiki and fill in ideas for topics and/or presenters. We'll need a list of viable topics with actual presenters before we start narrowing down the schedule into something more concrete. Feel free to either discuss things here or to just put topics on the wiki. I find wikis to be a poor place for conversation but excellent for summarizing items. I'll try to update the wiki with ideas that arise here, but feel free to directly edit the wiki if you just want to suggest a specific topic or brief piece of info, do NOT feel like you have to vet anything on list. Cheers, Travis and Fernando. 
From kwgoodman at gmail.com Thu May 29 23:08:51 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 29 May 2008 20:08:51 -0700 Subject: [Numpy-discussion] Question about indexing In-Reply-To: References: <200805291354.23664.darren.dale@cornell.edu> <483F3E10.80206@gmx.de> Message-ID: On Thu, May 29, 2008 at 6:32 PM, Alan G Isaac wrote: > On Thu, 29 May 2008, Keith Goodman apparently wrote: >> >>> a[[0,1]] >> That one looks odd. But it is just shorthand for: >> >>> a[[0,1],:] > > > Do you mean that ``a[[0,1],:]`` is a more primitive > expression than ``a[[0,1]]``? In what sense, and does it > ever matter? > > Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` > and ``a[[0,1],:]``? I can see how the difference between a[0,1] a[[0,1]] is not obvious at first, especially if you come from octave/matlab. The first example has an obvious i and j. But the second example doesn't. So I tried to point out that i=[0,1] and j=:. From aisaac at american.edu Thu May 29 23:50:38 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 29 May 2008 23:50:38 -0400 Subject: [Numpy-discussion] Question about indexing In-Reply-To: References: <200805291354.23664.darren.dale@cornell.edu><483F3E10.80206@gmx.de> Message-ID: >> On Thu, 29 May 2008, Keith Goodman apparently wrote: >>>>>> a[[0,1]] >>> That one looks odd. But it is just shorthand for: >>>>>> a[[0,1],:] > On Thu, May 29, 2008 at 6:32 PM, Alan G Isaac > wrote: >> Do you mean that ``a[[0,1],:]`` is a more primitive >> expression than ``a[[0,1]]``? In what sense, and does it >> ever matter? >> Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` >> and ``a[[0,1],:]``? On Thu, 29 May 2008, Keith Goodman apparently wrote: > I can see how the difference between > a[0,1] > a[[0,1]] > is not obvious at first, especially if you come from octave/matlab. > The first example has an obvious i and j. But the second > example doesn't. So I tried to point out that i=[0,1] and > j=:. My questions were real questions, not rhetorical. Anyway... I think the initial mind-bender is the difference between a[(0,1)] and a[[0,1]]. The latter might also be written a[[0,1],], which I think links to your point. Cheers, Alan From ctrachte at gmail.com Thu May 29 23:59:08 2008 From: ctrachte at gmail.com (Carl Trachte) Date: Thu, 29 May 2008 20:59:08 -0700 Subject: [Numpy-discussion] Tutorials at Scipy 2008 In-Reply-To: References: Message-ID: <426ada670805292059r368ed6b9k90f10b18db8fe434@mail.gmail.com> On 5/29/08, Fernando Perez wrote: > [ This is meant as a heads-up here, please keep the discussion on the > SciPy user list so we can focus the conversation in one list only. ] > > > > Hi all, > > Travis Oliphant and myself have signed up to coordinate the tutorials > sessions at this year's SciPy conference. Our tentative plan is > described here: > > http://scipy.org/SciPy2008/Tutorials > > but it basically consists of holding in parallel: > > 1. A 2-day hands-on tutorial for beginners. > 2. A set of 2 or 4 hour sessions on special topics. > > We need input from people on: > > - Do you like this idea? > > - If yes for #1, any suggestions/wishes? Eric Jones, Travis O and > myself have all taught similar things and could potentially do it > again, but none of us is trying to impose it. If someone else wants > to do it, by all means mention it. The job could be split across > multiple people once an agenda is organized. > > - For #2, please go to the wiki and fill in ideas for topics and/or > presenters. 
We'll need a list of viable topics with actual presenters > before we start narrowing down the schedule into something more > concrete. > > Feel free to either discuss things here or to just put topics on the > wiki. I find wikis to be a poor place for conversation but excellent > for summarizing items. I'll try to update the wiki with ideas that > arise here, but feel free to directly edit the wiki if you just want > to suggest a specific topic or brief piece of info, do NOT feel like > you have to vet anything on list. > > Cheers, > > Travis and Fernando. > _______________________________________________ > > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Assuming I'm approved by my employer to attend, I could benefit from either the intro or some of the advanced topics. If the mayavi one is for hard core mayavi users, I would probably spend the day doing the intro one. If there are examples that can be followed without being a mayavi wizard, I would probably attend that one. Either way I see it as a win. Thanks for your efforts in getting the conference together. Carl T. From kwgoodman at gmail.com Fri May 30 00:14:15 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 29 May 2008 21:14:15 -0700 Subject: [Numpy-discussion] Question about indexing In-Reply-To: References: <200805291354.23664.darren.dale@cornell.edu> <483F3E10.80206@gmx.de> Message-ID: On Thu, May 29, 2008 at 8:50 PM, Alan G Isaac wrote: > Is ``a[[0,1]]`` completely equivalent to ``a[[0,1],...]`` > and ``a[[0,1],:]``? They look, smell, and taste the same. But I can't read array's __getitem__ since it is in C instead of python. >> np.index_exp[[0,1]] ([0, 1],) >> np.index_exp[[0,1],:] ([0, 1], slice(None, None, None)) >> np.index_exp[[0,1],...] ([0, 1], Ellipsis) >> a[[0,1]].flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >> >> a[[0,1],:].flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >> >> a[[0,1],...].flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False Now, pop quiz, what does this mean: >> a[[[[[[[[[[[[[[[[0,1]]]]]]]]]]]]]]]] array([[[[[[[[[[[[[[[0, 1, 2], [3, 4, 5]]]]]]]]]]]]]]]) ? From fperez.net at gmail.com Fri May 30 02:51:35 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 29 May 2008 23:51:35 -0700 Subject: [Numpy-discussion] Tutorials at Scipy 2008 In-Reply-To: <426ada670805292059r368ed6b9k90f10b18db8fe434@mail.gmail.com> References: <426ada670805292059r368ed6b9k90f10b18db8fe434@mail.gmail.com> Message-ID: On Thu, May 29, 2008 at 8:59 PM, Carl Trachte wrote: > Assuming I'm approved by my employer to attend, I could benefit from > either the intro or some of the advanced topics. If the mayavi one is > for hard core mayavi users, I would probably spend the day doing the > intro one. If there are examples that can be followed without being a > mayavi wizard, I would probably attend that one. Either way I see it > as a win. I suspect the intro one will have some material on using mayavi, perhaps at a more basic level. But someone like myself can easily teach that, while Prabhu or Gael will hopefully be available to teach a deeper one on using TVTK, scripting mayavi, writing custom modules and filters, adding menus to mayavi using the Envisage API, etc. 
That's the kind of stuff I don't know how to do well enough to teach it. > Thanks for your efforts in getting the conference together. Glad to help. Cheers, f From oliphant at enthought.com Fri May 30 03:19:05 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 30 May 2008 02:19:05 -0500 Subject: [Numpy-discussion] Question for Travis In-Reply-To: References: Message-ID: <483FAA69.6030200@enthought.com> Charles R Harris wrote: > Travis, > > What are the fundamental types for ndarrays? We have the c types, > 'bBhHiIlLqQfdg', together with the boolean and complex types. Then we > have types defined by length, int8, uint8, etc. The long types change > length going from 32 to 64 bit machines, so there can be a couple of > c-types corresponding to the same precision. So I am wondering which > types should be considered "fundamental" and what types are used to > identify pickled arrays and arrays stored with the new .nz formats. I don't know what you mean. Arrays can have dynamic PyArray_Descr objects. The "fundamental" ones would be the builtin ones which are the 21 ctypes. The int8, uint8, etc are also "fundamental" in the sense that they correspond to built-in types. I don't know what you mean by the last question regarding identifying pickled arrays and arrays stored with the new .nz formats? -Travis From oliphant at enthought.com Fri May 30 03:52:29 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 30 May 2008 02:52:29 -0500 Subject: [Numpy-discussion] Question about indexing In-Reply-To: <483F3E10.80206@gmx.de> References: <200805291354.23664.darren.dale@cornell.edu> <483F3E10.80206@gmx.de> Message-ID: <483FB23D.4000301@enthought.com> Hi Raul, There are a some points that might help you with indexing: 1) a[obj] is (basically) equivalent to a.__getitem__(numpy.index_exp[obj]) 2) obj is always converted to a tuple if it isn't one already: * numpy.index_exp[0,1] == (0,1) * numpy.index_exp[(0,1)] == (0,1) * numpy.index_exp[[0,1]] == ([0,1],) 3) There are two basic kinds of indexing: a) Simple or slice-based indexing where the indexing tuple (obj) consists of just integers, slice objects, or Ellipses and the returned array is a "view" of the original array (no memory is copied). b) Fancy or advanced indexing which occurs when anything else (e.g. a list) is used in the indexing tuple and the returned array is a "copy" of the original array for largely technical reasons. 4) If the length of the indexing tuple is smaller than the number of dimensions in the array, the remaining un-indexed dimensions are returned. It is equivalent to appending slice(None) to the indexing tuple. 5) For fancy indexing using lists (and nested lists) in the indexing tuple, the shape of the array is the shape of the indexing (nested) list plus the shape of the un-indexed dimensions. Raul Kompass wrote: > I systematically tried and got the follwing: > ---------------------------------- > >>> from scipy import * > >>> a = random.rand(10).reshape(2,5) > >>> a > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > > >>> a[0,1] # shape([0,1]) = (2,) > 0.767957427399 > Equivalent to a[(0,1)] so the indexing tuple selects a single element of the 2d array. 
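Rules 3 and 4 together also answer Alan's earlier question: the three spellings below give the same result, but because fancy indexing is used they all return copies rather than views (a quick hypothetical session to illustrate):

>>> import numpy as np
>>> a = np.arange(10).reshape(2, 5)
>>> b1, b2, b3 = a[[0, 1]], a[[0, 1], :], a[[0, 1], ...]
>>> np.all(b1 == b2) and np.all(b2 == b3)
True
>>> a[[0, 1]].flags.owndata      # fancy indexing -> copy
True
>>> a[0:2].flags.owndata         # slice indexing -> view
False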
> >>> a[[0],[1]] # shape([[0],[1]]) = (2, 1) > array([ 0.76795743]) > Equivalent to a[([0], [1])] so the indexing tuple selects the same single element of the 2d array as before except now it is a 1-d array because fancy indexing is used [0] and [1] are lists. > >>> a[[0,1]] # shape([[0,1]]) = (1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > Equivalent to a[([0,1],)] so the indexing tuple is of length 1 and the shape of the resulting array is 2-d (the indexing list is 1-d and the un-indexed portion is 1-d). Rows 0 and 1 are selected from a. Equivalent to stacking a[0] and a[1] on top of each other. > >>> a[[[0,1]]] # shape([[[0,1]]]) = (1, 1, 2) > array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], > [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) > > The shape here I can't quite explain at the moment especially because a[ [[0,1]],] is shaped differently and probably shouldn't be. It looks like a[ ] has one smaller dimension than a[ , ] (notice the comma...) The rest of them follow from this pattern. -Travis From p.e.creasey.00 at googlemail.com Fri May 30 07:29:07 2008 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Fri, 30 May 2008 12:29:07 +0100 Subject: [Numpy-discussion] Installation info Message-ID: <6be8b94a0805300429y46ab7da4ga5adfc6605c376e8@mail.gmail.com> Is numpy v1.1 going to come out in egg format? I ask because I only see the superpack installers on the sourceforge page, and we have users who we are delivering to via egg - requires. thanks, Peter 2008/5/23 : > Send Numpy-discussion mailing list submissions to > numpy-discussion at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://projects.scipy.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at scipy.org > > You can reach the person managing the list at > numpy-discussion-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Numpy-discussion digest..." > > > Today's Topics: > > 1. Re: f2py errors: any help interpreting? (Mark Miller) > 2. Re: f2py errors: any help interpreting? (Robert Kern) > 3. Re: f2py errors: any help interpreting? (Mark Miller) > 4. Re: f2py errors: any help interpreting? (Robert Kern) > 5. Re: f2py errors: any help interpreting? (Mark Miller) > 6. Re: f2py errors: any help interpreting? (Mark Miller) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 23 May 2008 14:48:47 -0700 > From: "Mark Miller" > Subject: Re: [Numpy-discussion] f2py errors: any help interpreting? > To: "Discussion of Numerical Python" > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > Super...I'll give it a try. Or should I just wait for the numpy 1.1 > release? > > thanks, > > -Mark > > On Fri, May 23, 2008 at 2:45 PM, Robert Kern wrote: > >> On Fri, May 23, 2008 at 4:00 PM, Mark Miller wrote: >> >> > File "C:\Python25\lib\site-packages\numpy\f2py\rules.py", line 1222, in >> > buildmodule >> > for l in '\n\n'.join(funcwrappers2)+'\n'.split('\n'): >> > TypeError: cannot concatenate 'str' and 'list' objects >> > >> > >> > Any thoughts? Please let me know if more information is needed to >> > troubleshoot. >> >> This is a bug that was fixed in SVN r4335. 
>> >> http://projects.scipy.org/scipy/numpy/changeset/4335 >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080523/750a2e8e/attachment-0001.html > > ------------------------------ > > Message: 2 > Date: Fri, 23 May 2008 17:01:04 -0500 > From: "Robert Kern" > Subject: Re: [Numpy-discussion] f2py errors: any help interpreting? > To: "Discussion of Numerical Python" > Message-ID: > <3d375d730805231501h564d030eod7e7b886f20c517b at mail.gmail.com> > Content-Type: text/plain; charset=UTF-8 > > On Fri, May 23, 2008 at 4:48 PM, Mark Miller wrote: >> Super...I'll give it a try. Or should I just wait for the numpy 1.1 >> release? > > Probably. You can get a binary installer for the release candidate here: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > > > ------------------------------ > > Message: 3 > Date: Fri, 23 May 2008 15:48:02 -0700 > From: "Mark Miller" > Subject: Re: [Numpy-discussion] f2py errors: any help interpreting? > To: "Discussion of Numerical Python" > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > Thank you...getting much closer now. > > My current issue is this message: > > running build_ext > error: don't know how to compile C/C++ code on platform 'nt' with 'g95' > compiler. > > Any help? > > Again, sorry to pester. I'm just pretty unfamiliar with these things. Once > I get environmental variables set up, I rarely need to fiddle with them > again. So I don't have a specific feel for what might be happening here. > > thanks, > > -Mark > > > > > On Fri, May 23, 2008 at 3:01 PM, Robert Kern wrote: > >> On Fri, May 23, 2008 at 4:48 PM, Mark Miller >> wrote: >> > Super...I'll give it a try. Or should I just wait for the numpy 1.1 >> > release? >> >> Probably. You can get a binary installer for the release candidate here: >> >> >> http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080523/7db92558/attachment-0001.html > > ------------------------------ > > Message: 4 > Date: Fri, 23 May 2008 17:50:28 -0500 > From: "Robert Kern" > Subject: Re: [Numpy-discussion] f2py errors: any help interpreting? 
> To: "Discussion of Numerical Python" > Message-ID: > <3d375d730805231550l49d1682dj6e0378f10c33c5e5 at mail.gmail.com> > Content-Type: text/plain; charset=UTF-8 > > On Fri, May 23, 2008 at 5:48 PM, Mark Miller wrote: >> Thank you...getting much closer now. >> >> My current issue is this message: >> >> running build_ext >> error: don't know how to compile C/C++ code on platform 'nt' with 'g95' >> compiler. >> >> Any help? > > What command line are you using? Do you have a setup.cfg or > pydistutils.cfg file that you are using? Can you show us the full > output? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > > > ------------------------------ > > Message: 5 > Date: Fri, 23 May 2008 15:51:22 -0700 > From: "Mark Miller" > Subject: Re: [Numpy-discussion] f2py errors: any help interpreting? > To: "Discussion of Numerical Python" > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > Ignore last message: I seem to have figured out the next environmental > variable that needed to be set. Still some lingering issues, but I'll work > on them some more before pestering here again. > > thanks, > > -Mark > > On Fri, May 23, 2008 at 3:48 PM, Mark Miller > wrote: > >> Thank you...getting much closer now. >> >> My current issue is this message: >> >> running build_ext >> error: don't know how to compile C/C++ code on platform 'nt' with 'g95' >> compiler. >> >> Any help? >> >> Again, sorry to pester. I'm just pretty unfamiliar with these things. >> Once I get environmental variables set up, I rarely need to fiddle with them >> again. So I don't have a specific feel for what might be happening here. >> >> thanks, >> >> -Mark >> >> >> >> >> >> On Fri, May 23, 2008 at 3:01 PM, Robert Kern >> wrote: >> >>> On Fri, May 23, 2008 at 4:48 PM, Mark Miller >>> wrote: >>> > Super...I'll give it a try. Or should I just wait for the numpy 1.1 >>> > release? >>> >>> Probably. You can get a binary installer for the release candidate here: >>> >>> >>> http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.0rc1-win32-superpack-python2.5.exe >>> >>> -- >>> Robert Kern >>> >>> "I have come to believe that the whole world is an enigma, a harmless >>> enigma that is made terrible by our own mad attempt to interpret it as >>> though it had an underlying truth." >>> -- Umberto Eco >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080523/f9b4dccd/attachment-0001.html > > ------------------------------ > > Message: 6 > Date: Fri, 23 May 2008 15:59:43 -0700 > From: "Mark Miller" > Subject: Re: [Numpy-discussion] f2py errors: any help interpreting? > To: "Discussion of Numerical Python" > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > In this case, I am just using the Windows command prompt. I do not have a > setup.cfg or pydistutils.cfg file. I did create a file in > Python25\Lib\distutils called distutils.cfg containing 2 lines: > > [build] > compiler = mingw32 > > That took care of the previous message. 
I am currently getting a 'failed > with exit status 1' message, that for the life of me I can't remember what > causes it. > > I have attached the full (albeit tedius) output from an attempt, if someone > is willing to wade through it. > > -Mark > > > > On Fri, May 23, 2008 at 3:50 PM, Robert Kern wrote: > >> On Fri, May 23, 2008 at 5:48 PM, Mark Miller >> wrote: >> > Thank you...getting much closer now. >> > >> > My current issue is this message: >> > >> > running build_ext >> > error: don't know how to compile C/C++ code on platform 'nt' with 'g95' >> > compiler. >> > >> > Any help? >> >> What command line are you using? Do you have a setup.cfg or >> pydistutils.cfg file that you are using? Can you show us the full >> output? >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080523/01dfe7d6/attachment.html > -------------- next part -------------- > An embedded and charset-unspecified text was scrubbed... > Name: error message.txt > Url: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080523/01dfe7d6/attachment.txt > > ------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > End of Numpy-discussion Digest, Vol 20, Issue 122 > ************************************************* > From p.e.creasey.00 at googlemail.com Fri May 30 07:31:46 2008 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Fri, 30 May 2008 12:31:46 +0100 Subject: [Numpy-discussion] Installation info In-Reply-To: <6be8b94a0805300429y46ab7da4ga5adfc6605c376e8@mail.gmail.com> References: <6be8b94a0805300429y46ab7da4ga5adfc6605c376e8@mail.gmail.com> Message-ID: <6be8b94a0805300431p14eb1ae0pc5fd111db9b0503@mail.gmail.com> 2008/5/30 Peter Creasey : > Is numpy v1.1 going to come out in egg format? > Oops, I didn't mean to mail with an entire numpy digest in the body. sorry, Peter From hanni.ali at gmail.com Fri May 30 07:35:34 2008 From: hanni.ali at gmail.com (Hanni Ali) Date: Fri, 30 May 2008 12:35:34 +0100 Subject: [Numpy-discussion] Installation info In-Reply-To: <6be8b94a0805300431p14eb1ae0pc5fd111db9b0503@mail.gmail.com> References: <6be8b94a0805300429y46ab7da4ga5adfc6605c376e8@mail.gmail.com> <6be8b94a0805300431p14eb1ae0pc5fd111db9b0503@mail.gmail.com> Message-ID: <789d27b10805300435t361d12a8j59f69aadefd3f9e6@mail.gmail.com> I would also like to see a 64bit build for windows as well if possible. Hanni 2008/5/30 Peter Creasey : > 2008/5/30 Peter Creasey : > > Is numpy v1.1 going to come out in egg format? > > > > Oops, I didn't mean to mail with an entire numpy digest in the body. 
> > sorry, > Peter > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- E-mail: hanni.ali at gmail.com Mobile: +44 (0) 7985580147 My Blog: http://ainkaboot.co.uk/blogs/hanni/ Website: http://ainkaboot.co.uk http://drqueue.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 30 08:47:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 30 May 2008 06:47:01 -0600 Subject: [Numpy-discussion] Question for Travis In-Reply-To: <483FAA69.6030200@enthought.com> References: <483FAA69.6030200@enthought.com> Message-ID: On Fri, May 30, 2008 at 1:19 AM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > Travis, > > > > What are the fundamental types for ndarrays? We have the c types, > > 'bBhHiIlLqQfdg', together with the boolean and complex types. Then we > > have types defined by length, int8, uint8, etc. The long types change > > length going from 32 to 64 bit machines, so there can be a couple of > > c-types corresponding to the same precision. So I am wondering which > > types should be considered "fundamental" and what types are used to > > identify pickled arrays and arrays stored with the new .nz formats. > I don't know what you mean. > > Arrays can have dynamic PyArray_Descr objects. > > The "fundamental" ones would be the builtin ones which are the 21 > ctypes. The int8, uint8, etc are also "fundamental" in the sense that > they correspond to built-in types. > But not 1-1. The ufunc loops are all based on the c-types because, well, they are in c. Now, if I ask for an int32 array on my machine, it has a c-type of long, but if I store that array and recover it on a 64 bit machine, the only available c-type of the right size is int, so in this case the important part of the type is the int32 part and endianess. So I assume that the size information is fundamental in that case and the association with the c-type is made dynamically on a machine by machine basis. This breaks down for float{80,96,128}, which is essentially non-portable because the type is neither universal in format or size. So the second part of the question is was what information is stored with the data to make it recoverable on different architectures/compilers. One of the things that brought this to mind was that promotions to larger types in binary ufuncs will promote to long or int on my machine, depending on the order of the arguments, which is a bit confusing. Likewise on 64 bit machines the promotions go to both long and long long. So, in order to test all the loops, I need the c-types, but the promotion rules depend on the argument order, so they are not quite as simple and uniform as Robert implies. In addition to that, there are 7 different promotion patterns in both the unary and binary ufuncs, not counting the segfaults in remainder, and these patterns are different on 32 and 64 bit machines. So my problem from the testing point of view is to keep the tests as simple as possible while at the same time making them portable. At the least, it looks like I need to track the 32/64 bit information, but I'm also concerned about possible compiler differences. Hmm, it looks like I will need tabulate the c-types/precision pairings on each machine, then translate the expected results on a function by function basis. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From markbak at gmail.com Fri May 30 11:01:42 2008 From: markbak at gmail.com (mark) Date: Fri, 30 May 2008 08:01:42 -0700 (PDT) Subject: [Numpy-discussion] behavior of 'insert' for inserting multiple values In-Reply-To: <9457e7c80805270226t39bdf764kf980892353cee302@mail.gmail.com> References: <09421727-1a49-46fa-b89c-6fe65e53ab60@t54g2000hsg.googlegroups.com> <9457e7c80805270226t39bdf764kf980892353cee302@mail.gmail.com> Message-ID: <59fb4930-0c83-4195-8943-f786e48b1aac@8g2000hse.googlegroups.com> I think this is a special case that was overlooked. It works if there are multiple positions but only one value: >>> a = arange(5) >>> insert(a,[3,3],4) array([0, 1, 2, 4, 4, 3, 4]) But not when you give one position with mutiple values: >>> insert(a,3,[7,7]) array([0, 1, 2, 7, 3, 4]) It would be great if somebody could implement this. Probably not too hard I guess, Thanks. You guys are the best, Mark On May 27, 11:26 am, "St?fan van der Walt" wrote: > Hi Mark > > 2008/5/27 mark : > > >>>> a = arange(5.) > >>>> insert(a,3,[7,7]) > > array([ 0., 1., 2., 7., 3., 4.]) > > > But insert only inserts one of the 7's, while I want both values to be > > inserted. Nor does numpy throw a warning (which I think would be > > appropriate). The way that works correctly is > > >>>> insert(a,[3,3],[7,7]) > > array([ 0., 1., 2., 7., 7., 3., 4.]) > > You need to specify two insertion positions, i.e. > > np.insert(a, [3, 3], [7, 7]) > > I think we should consider a special case for your example, though. > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From klemm at phys.ethz.ch Fri May 30 11:17:18 2008 From: klemm at phys.ethz.ch (Hanno Klemm) Date: Fri, 30 May 2008 17:17:18 +0200 Subject: [Numpy-discussion] PyTables, array and recarray Message-ID: Hi, I try to save the contents of a numpy recarray with PyTables into a file. That works well, however, if I then try to retrieve the data, I get back an array with matching dtypes rather than a recarray. What is the best way to get a recarray back, or if that's not possible what's the most efficient way to convert an array to a recarray? This is what I am doing at the moment: Python 2.5 (r25:51908, Oct 4 2006, 17:28:51) [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-52)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as N >>> import tables as t >>> num = 2 >>> a = N.recarray(num, formats='i4,f8,f8',names='id,x,y') >>> a['id'] = [3,4] >>> a['x'] = [3.4,4.5] >>> a['y'] = [4.6,4.5] >>> a recarray([(3, 3.3999999999999999, 4.5999999999999996), (4, 4.5, 4.5)], dtype=[('id', '>> f = t.openFile('test.h5', 'w') >>> f.createTable('/', 'test', a) /test (Table(2L,)) '' description := { "id": Int32Col(shape=(), dflt=0, pos=0), "x": Float64Col(shape=(), dflt=0.0, pos=1), "y": Float64Col(shape=(), dflt=0.0, pos=2)} byteorder := 'little' chunkshape := (409,) >>> f.close() >>> f = t.openFile('test.h5', 'r') >>> b = f.root.test[:] >>> b array([(3, 3.3999999999999999, 4.5999999999999996), (4, 4.5, 4.5)], dtype=[('id', '>> type(b) >>> type(a) >>> Best regards, Hanno -- Hanno Klemm klemm at phys.ethz.ch From pgmdevlist at gmail.com Fri May 30 12:20:37 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 30 May 2008 12:20:37 -0400 Subject: [Numpy-discussion] PyTables, array and recarray In-Reply-To: References: Message-ID: <200805301220.38010.pgmdevlist@gmail.com> On Friday 30 May 2008 11:17:18 Hanno Klemm wrote: Hanno, Try a view: > >>> f.close() > >>> f = t.openFile('test.h5', 'r') > >>> b = f.root.test[:].view(N.recarray) Views are often the most efficient way to change the type of an array. From oliphant at enthought.com Fri May 30 12:38:21 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 30 May 2008 11:38:21 -0500 Subject: [Numpy-discussion] behavior of 'insert' for inserting multiple values In-Reply-To: <59fb4930-0c83-4195-8943-f786e48b1aac@8g2000hse.googlegroups.com> References: <09421727-1a49-46fa-b89c-6fe65e53ab60@t54g2000hsg.googlegroups.com> <9457e7c80805270226t39bdf764kf980892353cee302@mail.gmail.com> <59fb4930-0c83-4195-8943-f786e48b1aac@8g2000hse.googlegroups.com> Message-ID: <48402D7D.5000406@enthought.com> mark wrote: > I think this is a special case that was overlooked. > It works if there are multiple positions but only one value: > >>>> a = arange(5) >>>> insert(a,[3,3],4) >>>> > array([0, 1, 2, 4, 4, 3, 4]) > > But not when you give one position with mutiple values: > >>>> insert(a,3,[7,7]) >>>> > array([0, 1, 2, 7, 3, 4]) > > It would be great if somebody could implement this. > Probably not too hard I guess, > Thanks. You guys are the best, > If you file a ticket, this request is less likely to be lost or forgotten. Thanks for the testing and feedback. Best regards, -Travis > From falted at pytables.org Fri May 30 14:33:25 2008 From: falted at pytables.org (Francesc Alted) Date: Fri, 30 May 2008 20:33:25 +0200 Subject: [Numpy-discussion] PyTables, array and recarray In-Reply-To: References: Message-ID: <200805302033.26525.falted@pytables.org> A Friday 30 May 2008, Hanno Klemm escrigu?: > Hi, > > I try to save the contents of a numpy recarray with PyTables into a > file. That works well, however, if I then try to retrieve the data, I > get back an array with matching dtypes rather than a recarray. > > What is the best way to get a recarray back, or if that's not > possible what's the most efficient way to convert an array to a > recarray? > > This is what I am doing at the moment: > > Python 2.5 (r25:51908, Oct 4 2006, 17:28:51) > [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-52)] on linux2 > Type "help", "copyright", "credits" or "license" for more > information. 
> > >>> import numpy as N > >>> import tables as t > >>> num = 2 > >>> a = N.recarray(num, formats='i4,f8,f8',names='id,x,y') > >>> a['id'] = [3,4] > >>> a['x'] = [3.4,4.5] > >>> a['y'] = [4.6,4.5] > >>> a > > recarray([(3, 3.3999999999999999, 4.5999999999999996), (4, 4.5, > 4.5)], dtype=[('id', ' > >>> f = t.openFile('test.h5', 'w') > >>> f.createTable('/', 'test', a) > > /test (Table(2L,)) '' > description := { > "id": Int32Col(shape=(), dflt=0, pos=0), > "x": Float64Col(shape=(), dflt=0.0, pos=1), > "y": Float64Col(shape=(), dflt=0.0, pos=2)} > byteorder := 'little' > chunkshape := (409,) > > >>> f.close() > >>> f = t.openFile('test.h5', 'r') > >>> b = f.root.test[:] > >>> b > > array([(3, 3.3999999999999999, 4.5999999999999996), (4, 4.5, 4.5)], > dtype=[('id', ' > >>> type(b) > > > > >>> type(a) > > Yeah, this is on purpose because ndarray classes are actually C extensions and they are generally more efficient than Python classes. If what you want is a recarray class (Python class), you can always create a view as Pierre suggested. However, this is rarely needed because you can access most of the recarray functionality out of the ndarray class. Cheers, -- Francesc Alted From lxander.m at gmail.com Fri May 30 21:27:09 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Fri, 30 May 2008 21:27:09 -0400 Subject: [Numpy-discussion] Multi-Core Cache Usage Analyzer Message-ID: <525f23e80805301827idde0406k150d4df4c26f958b@mail.gmail.com> An HPC friend altered me to a recent announcement of a new memory-optimization product. The article in HPCwire is here: and the company website is here: . Note sure of something like this would be useful in the numpy conext, but I thought I would point it out in case it was and the right people in the numpy community weren't aware of it yet. From oliphant at enthought.com Fri May 30 22:24:08 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 30 May 2008 21:24:08 -0500 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: <3d375d730805281818p50eaf7few6c5bc2f5fdb4af9@mail.gmail.com> References: <483C6194.3060809@enthought.com> <483C6BCA.80705@enthought.com> <3d375d730805271407q88ca021g78bd904b5bae9b5a@mail.gmail.com> <483DFE3F.3070906@enthought.com> <3d375d730805281818p50eaf7few6c5bc2f5fdb4af9@mail.gmail.com> Message-ID: <4840B6C8.3020009@enthought.com> Robert Kern wrote: > On Wed, May 28, 2008 at 7:52 PM, Travis E. Oliphant > wrote: > >> Anne Archibald wrote: >> >>> 2008/5/27 Robert Kern : >>> >>> >>>> Can we make it so that dtype('c') is preserved instead of displaying >>>> '|S1'? It does not behave the same as dtype('|S1') although it >>>> compares equal to it. >>>> >>>> >>> It seems alarming to me that they should compare equal but behave >>> differently. Is it possible to change more than just the way it >>> prints? >>> >>> >> comparison on dtype objects is about memory layout equivalency. >> Characters and length-1 strings are equivalent from a memory-layout >> perspective. >> > > That would be fine if dtypes only represented memory layout. However, > in this case, they also represent a difference in interpretation of > str objects in the array() constructor. That is a real difference that > needs to be reflected in __eq__ and __repr__. > I think __repr__ can be changed without trouble. I'm concerned about changing __eq__, however. 
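For concreteness, the difference under discussion is roughly the following (hypothetical session, shapes only, exact reprs elided):

>>> import numpy as np
>>> np.dtype('c') == np.dtype('S1')     # same memory layout, so they compare equal
True
>>> np.array('abcd', dtype='c').shape   # 'c' splits the str into characters
(4,)
>>> np.array('abcd', dtype='S1').shape  # 'S1' keeps it as a single (truncated) item
()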
-Travis From cournape at gmail.com Fri May 30 23:01:10 2008 From: cournape at gmail.com (David Cournapeau) Date: Sat, 31 May 2008 12:01:10 +0900 Subject: [Numpy-discussion] Installation info In-Reply-To: <789d27b10805300435t361d12a8j59f69aadefd3f9e6@mail.gmail.com> References: <6be8b94a0805300429y46ab7da4ga5adfc6605c376e8@mail.gmail.com> <6be8b94a0805300431p14eb1ae0pc5fd111db9b0503@mail.gmail.com> <789d27b10805300435t361d12a8j59f69aadefd3f9e6@mail.gmail.com> Message-ID: <5b8d13220805302001p65e05d4ep4162aca1947c4f80@mail.gmail.com> On Fri, May 30, 2008 at 8:35 PM, Hanni Ali wrote: > I would also like to see a 64bit build for windows as well if possible. > > Unfortunately, this is difficult: windows 64 is not commonly available (I don't have any access to it personally, for example), and mingw is not available yet for windows 64 either. David From rbastian at free.fr Sat May 31 05:05:03 2008 From: rbastian at free.fr (R. Bastian) Date: Sat, 31 May 2008 11:05:03 +0200 Subject: [Numpy-discussion] New documentation web application In-Reply-To: References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> Message-ID: <20080531110503.6016dde3.rbastian@free.fr> On Thu, 29 May 2008 17:48:10 -0500 "Nathan Bell" wrote: > On Thu, May 29, 2008 at 5:28 PM, St?fan van der Walt wrote: > > Hi all, > > > > The NumPy documentation project has taken another leap forward! Pauli > > Virtanen has, in a week of superhuman coding, produced a web > > application that enhances the work-flow and editing experience of > > NumPy docstrings on the web. > > > > Unfortunately, this means that those of you who signed up before will > > have to create new accounts at > > > > http://sd-2116.dedibox.fr/pydocweb > > Neat! I really like the layout. The red format warnings are a nice touch: > http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.umath.exp/ > > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ The doc is excellent. Thanks. But I am missing a more developped form of the numpy entry page http://sd-2116.dedibox.fr/pydocweb/doc/numpy/ with a brief description for every class or function Example: trim_zeros : Trim the leading and trailing zeros from a 1D array. or is there a way to get this on a html-page ? -- R. Bastian www.musiques-rb.org From stefan at sun.ac.za Sat May 31 05:45:41 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 31 May 2008 11:45:41 +0200 Subject: [Numpy-discussion] New documentation web application In-Reply-To: <20080531110503.6016dde3.rbastian@free.fr> References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> <20080531110503.6016dde3.rbastian@free.fr> Message-ID: <9457e7c80805310245o4a403610ld61e33bf1b9563a8@mail.gmail.com> Hi Bastian 2008/5/31 R. Bastian : > On Thu, 29 May 2008 17:48:10 -0500 > "Nathan Bell" wrote: > >> On Thu, May 29, 2008 at 5:28 PM, St?fan van der Walt wrote: >> > Hi all, >> > >> > The NumPy documentation project has taken another leap forward! Pauli >> > Virtanen has, in a week of superhuman coding, produced a web >> > application that enhances the work-flow and editing experience of >> > NumPy docstrings on the web. >> > >> > Unfortunately, this means that those of you who signed up before will >> > have to create new accounts at >> > >> > http://sd-2116.dedibox.fr/pydocweb >> >> Neat! I really like the layout. 
The red format warnings are a nice touch: >> http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.umath.exp/ >> >> -- >> Nathan Bell wnbell at gmail.com >> http://graphics.cs.uiuc.edu/~wnbell/ > > The doc is excellent. Thanks. > > But I am missing a more developped form of the numpy entry page > http://sd-2116.dedibox.fr/pydocweb/doc/numpy/ > > with a brief description for every class or function > > Example: > > trim_zeros : Trim the leading and trailing zeros from a 1D array. > > or is there a way to get this on a html-page ? Sure, that is very easy. I can generate it as part of the reference guide. I'll send you a link this evening. Regards St?fan From jdh2358 at gmail.com Sat May 31 08:23:01 2008 From: jdh2358 at gmail.com (John Hunter) Date: Sat, 31 May 2008 07:23:01 -0500 Subject: [Numpy-discussion] New documentation web application In-Reply-To: <20080531110503.6016dde3.rbastian@free.fr> References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> <20080531110503.6016dde3.rbastian@free.fr> Message-ID: <88e473830805310523n13f8a838j5e159945e343a49e@mail.gmail.com> On Sat, May 31, 2008 at 4:05 AM, R. Bastian wrote: >> Neat! I really like the layout. The red format warnings are a nice touch: >> http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.umath.exp/ Hi, I was just reading through this example when I noticed this usage: from matplotlib import pyplot as plt Although this is of course perfectly valid python, I have been encouraging people when importing modules from packages to use the syntax: import somepackage.somemodule as somemod rather than from somepackage import somemodule as somemod The reason is that in the first usage it is unambiguous that somemodule is a module and not a function or constant. Eg, both of these are valid python: In [7]: from numpy import arange In [8]: from numpy import fft but only the module import is valid here: In [3]: import numpy.fft as fft In [4]: import numpy.arange as arange ImportError Traceback (most recent call last) ImportError: No module named arange I taught a class on scientific computing in python to undergraduates, and the students were frequently confused about what was a module and what was a function. If you are coming from matlab, you are likely to think fft is a function when you see this: from numpy import fft By being consistent in importing modules using the 'import numpy.fft as fft', it can make it more clear that we are importing a module. I already recommend this usage in the matplotlib coding guide, and numpy may want to adopt it as well. JDH From matthew.brett at gmail.com Sat May 31 08:30:42 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 31 May 2008 12:30:42 +0000 Subject: [Numpy-discussion] New documentation web application In-Reply-To: <88e473830805310523n13f8a838j5e159945e343a49e@mail.gmail.com> References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> <20080531110503.6016dde3.rbastian@free.fr> <88e473830805310523n13f8a838j5e159945e343a49e@mail.gmail.com> Message-ID: <1e2af89e0805310530g6df96eb9h21dac00c305ad0b7@mail.gmail.com> Hi, > By being consistent in importing modules using the 'import numpy.fft > as fft', it can make it more clear that we are importing a module. > > I already recommend this usage in the matplotlib coding guide, and > numpy may want to adopt it as well. That's an excellent suggestion, seconded. 
Matthew From bala.biophysics at gmail.com Sat May 31 10:52:53 2008 From: bala.biophysics at gmail.com (Bala subramanian) Date: Sat, 31 May 2008 20:22:53 +0530 Subject: [Numpy-discussion] numpy import problem Message-ID: <288df32a0805310752y76cee520oa1d5379e854d9c94@mail.gmail.com> Dear friends, I installed numpy in a 32-bit machines running with RHEL3. The installation was successful. I tested the installtion by importint numpy inside python interpreter. By once i shutdown the system and restart, and try the same, it says ImportError: No module named numpy. What could be the problem i dnt get. Kindly someone write me on the same. Thanks, Bala -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Sat May 31 11:15:29 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sat, 31 May 2008 11:15:29 -0400 Subject: [Numpy-discussion] numpy import problem In-Reply-To: <288df32a0805310752y76cee520oa1d5379e854d9c94@mail.gmail.com> References: <288df32a0805310752y76cee520oa1d5379e854d9c94@mail.gmail.com> Message-ID: <1d36917a0805310815o695605aepcbbbedd8ebcec383@mail.gmail.com> Bala, One thing I can think of is that you might have multiple versions of Python installed. For example, I have Python 2.4 and 2.5 on my machine, but numpy is only installed for 2.5. Since just running "python" brings up 2.4, sometimes I find myself in the wrong interpreter typing "import numpy", which fails in the way you described. Cheers, Alan On Sat, May 31, 2008 at 10:52 AM, Bala subramanian wrote: > Dear friends, > > I installed numpy in a 32-bit machines running with RHEL3. The installation > was successful. I tested the installtion by importint numpy inside python > interpreter. > By once i shutdown the system and restart, and try the same, it says > ImportError: No module named numpy. > > What could be the problem i dnt get. Kindly someone write me on the same. > > Thanks, > Bala > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From oliphant at enthought.com Sat May 31 11:43:49 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 31 May 2008 10:43:49 -0500 Subject: [Numpy-discussion] New documentation web application In-Reply-To: <1e2af89e0805310530g6df96eb9h21dac00c305ad0b7@mail.gmail.com> References: <9457e7c80805291528x2b6685b7q94bddafe4cdef2a0@mail.gmail.com> <20080531110503.6016dde3.rbastian@free.fr> <88e473830805310523n13f8a838j5e159945e343a49e@mail.gmail.com> <1e2af89e0805310530g6df96eb9h21dac00c305ad0b7@mail.gmail.com> Message-ID: <48417235.4080106@enthought.com> Matthew Brett wrote: > Hi, > > >> By being consistent in importing modules using the 'import numpy.fft >> as fft', it can make it more clear that we are importing a module. >> >> I already recommend this usage in the matplotlib coding guide, and >> numpy may want to adopt it as well. >> > > That's an excellent suggestion, seconded. 
> +1 -Travis From pav at iki.fi Sat May 31 15:49:37 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 31 May 2008 22:49:37 +0300 Subject: [Numpy-discussion] umath ufunc docstrings Message-ID: <1212263377.8410.18.camel@localhost.localdomain> Hi, I'd like to adjust the way numpy.core.umath ufunc docstrings are defined to make them more easy to handle in the ongoing documentation marathon: - Remove the signature magic in ufunc_get_doc - Define ufunc docstrings in a separate module instead of in generate_umath.py, in the same format as in add_newdocs.py Suggested patch is attached; it passes numpy tests. Any thoughts on whether it should go in? I was also thinking about the problem with pydoc.help and ufuncs: would making PyUFuncObject a subclass of PyFunctionObject be a reasonable fix? Pauli -------------- next part -------------- A non-text attachment was scrubbed... Name: umath-docstrings.patch Type: text/x-patch Size: 26303 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From oliphant at enthought.com Sat May 31 16:02:56 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 31 May 2008 15:02:56 -0500 Subject: [Numpy-discussion] umath ufunc docstrings In-Reply-To: <1212263377.8410.18.camel@localhost.localdomain> References: <1212263377.8410.18.camel@localhost.localdomain> Message-ID: <4841AEF0.90102@enthought.com> Pauli Virtanen wrote: > Hi, > > I'd like to adjust the way numpy.core.umath ufunc docstrings are defined > to make them more easy to handle in the ongoing documentation marathon: > Thanks for your efforts here. It would be good to get an idea of what problems you are encountering that led you to the proposed solutions. > - Remove the signature magic in ufunc_get_doc > I don't see why this is needed and seems like un-necessary reshuffling. The current "magic" handles constructing the signature for you just like other functions. I don't see any argument as to why we should get rid of this useful construct. > - Define ufunc docstrings in a separate module > instead of in generate_umath.py, in the same format as in > add_newdocs.py > This seems like it might help so that you can insert strings more easily from the wiki. > > I was also thinking about the problem with pydoc.help and ufuncs: would > making PyUFuncObject a subclass of PyFunctionObject be a reasonable fix? > > It's an interesting idea, but making this work would require having ufuncobjects start their C-structures with a binary-equivalent of the PyFunction Object which seems un-necessary, wasteful, and confusing later. I would rather try and fix python's help system so that it looks for docstrings when they actually exist. -Travis From pav at iki.fi Sat May 31 16:26:43 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 31 May 2008 23:26:43 +0300 Subject: [Numpy-discussion] umath ufunc docstrings In-Reply-To: <4841AEF0.90102@enthought.com> References: <1212263377.8410.18.camel@localhost.localdomain> <4841AEF0.90102@enthought.com> Message-ID: <1212265603.8410.36.camel@localhost.localdomain> la, 2008-05-31 kello 15:02 -0500, Travis E. Oliphant kirjoitti: > Pauli Virtanen wrote: > > Hi, > > > > I'd like to adjust the way numpy.core.umath ufunc docstrings are > > defined to make them more easy to handle in the ongoing > > documentation marathon: > > Thanks for your efforts here. 
It would be good to get an idea of > what problems you are encountering that led you to the proposed solutions. The problem here was simply that inserting multi-line docstring into the UFunc list in generate_umath.py would make the list unwieldy to edit. Moreover, while this would probably still be possible to automate reliably, it is easier if there is only one way that we define docstrings for C objects. I already wrote code for the add_newdoc syntax, so it seemed cleanest to re-use it. > > - Remove the signature magic in ufunc_get_doc > > I don't see why this is needed and seems like un-necessary > reshuffling. The current "magic" handles constructing the signature > for you just like other functions. I don't see any argument as to > why we should get rid of this useful construct. This was for making the docstrings of the same form as those in add_newdocs.py; there the signature must always be contained in the docstring. I thought that the automatic signature was such a small gain that uniformity overruled it here. If you think otherwise, I'll just special-case the handling of ufunc docstrings in the documentation tools, which is not a big deal. > > I was also thinking about the problem with pydoc.help and ufuncs: would > > making PyUFuncObject a subclass of PyFunctionObject be a reasonable fix? > > It's an interesting idea, but making this work would require having > ufuncobjects start their C-structures with a binary-equivalent of the > PyFunction Object which seems un-necessary, wasteful, and confusing > later. I would rather try and fix python's help system so that it > looks for docstrings when they actually exist. Has a bug report for pydoc already been filled so that it might have a change of hitting 3.0? Pauli From tsyu80 at gmail.com Sat May 31 17:56:20 2008 From: tsyu80 at gmail.com (Tony Yu) Date: Sat, 31 May 2008 17:56:20 -0400 Subject: [Numpy-discussion] Strange behavior in setting masked array values in Numpy 1.1.0 Message-ID: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> Great job getting numpy 1.1.0 out and thanks for including the old API of masked arrays. I've been playing around with some software using numpy 1.0.4 and took a crack at upgrading it to numpy 1.1.0, but I ran into some strange behavior when assigning to slices of a masked array. I made the simplest example I could think of to show this weird behavior. Basically, reordering the masked array and assigning back to itself *on the same line* seems to work for part of the array, but other parts are left unchanged. In the example below, half of the array is assigned "properly" and the other half isn't. This problem is eliminated if the assignment is done with a copy of the array. Alternatively, this problem is eliminated if I using numpy.oldnumeric.ma.masked_array instead of the new masked array implementation. Is this just a problem on my setup? Thanks in advance for your help. 
-Tony Yu Example: ======== In [1]: import numpy In [2]: masked = numpy.ma.masked_array([[1, 2, 3, 4, 5]], mask=False) In [3]: masked[:] = numpy.fliplr(masked.copy()) In [4]: print masked [[5 4 3 2 1]] In [5]: masked[:] = numpy.fliplr(masked) In [6]: print masked [[1 2 3 2 1]] Specs: ====== Numpy 1.1.0 Python 2.5.1 OS X Leopard 10.5.3 From matthieu.brucher at gmail.com Sat May 31 18:04:54 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 1 Jun 2008 00:04:54 +0200 Subject: [Numpy-discussion] Strange behavior in setting masked array values in Numpy 1.1.0 In-Reply-To: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> References: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> Message-ID: Hi, This is to be expected. You are trying to modify and read the same array at the same time, which should never be done. Matthieu 2008/5/31 Tony Yu : > Great job getting numpy 1.1.0 out and thanks for including the old API > of masked arrays. > > I've been playing around with some software using numpy 1.0.4 and took > a crack at upgrading it to numpy 1.1.0, but I ran into some strange > behavior when assigning to slices of a masked array. > > I made the simplest example I could think of to show this weird > behavior. Basically, reordering the masked array and assigning back to > itself *on the same line* seems to work for part of the array, but > other parts are left unchanged. In the example below, half of the > array is assigned "properly" and the other half isn't. This problem is > eliminated if the assignment is done with a copy of the array. > Alternatively, this problem is eliminated if I using > numpy.oldnumeric.ma.masked_array instead of the new masked array > implementation. > > Is this just a problem on my setup? > > Thanks in advance for your help. > -Tony Yu > > Example: > ======== > In [1]: import numpy > > In [2]: masked = numpy.ma.masked_array([[1, 2, 3, 4, 5]], mask=False) > > In [3]: masked[:] = numpy.fliplr(masked.copy()) > > In [4]: print masked > [[5 4 3 2 1]] > > In [5]: masked[:] = numpy.fliplr(masked) > > In [6]: print masked > [[1 2 3 2 1]] > > > Specs: > ====== > Numpy 1.1.0 > Python 2.5.1 > OS X Leopard 10.5.3 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat May 31 18:09:20 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 01 Jun 2008 01:09:20 +0300 Subject: [Numpy-discussion] Slice assignment and overlapping views (was: Strange behavior in setting masked array values in Numpy 1.1.0) In-Reply-To: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> References: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> Message-ID: <1212271760.8410.45.camel@localhost.localdomain> la, 2008-05-31 kello 17:56 -0400, Tony Yu kirjoitti: [clip] > I've been playing around with some software using numpy 1.0.4 and took > a crack at upgrading it to numpy 1.1.0, but I ran into some strange > behavior when assigning to slices of a masked array. 
[clip] > In [1]: import numpy > > In [2]: masked = numpy.ma.masked_array([[1, 2, 3, 4, 5]], mask=False) > > In [3]: masked[:] = numpy.fliplr(masked.copy()) > > In [4]: print masked > [[5 4 3 2 1]] > > In [5]: masked[:] = numpy.fliplr(masked) > > In [6]: print masked > [[1 2 3 2 1]] ?Note that >>> numpy.fliplr(masked).base.base.base.base is masked.base True The reason for the strange behavior of slice assignment is that when the left and right sides in a slice assignment are overlapping views of the same array, the result is currently effectively undefined. Same is true for ndarrays: >>> import numpy >>> a = numpy.array([1, 2, 3, 4, 5]) >>> a[::-1] array([5, 4, 3, 2, 1]) >>> a[:] = a[::-1] >>> a array([5, 4, 3, 4, 5]) This is a known issue. I'm not sure how easy would it be to arrange the assignment loops so that the overlapping data would be handled correctly. I think that numpy should at least raise a warning if not an error in __setitem__, when the two arrays have the same ancestor (needs walking up the .base links). What's the general opinion on this? Pauli -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From tsyu80 at gmail.com Sat May 31 18:35:32 2008 From: tsyu80 at gmail.com (Tony Yu) Date: Sat, 31 May 2008 18:35:32 -0400 Subject: [Numpy-discussion] Strange behavior in setting masked array values in Numpy 1.1.0 In-Reply-To: References: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> Message-ID: On May 31, 2008, at 6:04 PM, Matthieu Brucher wrote: > Hi, > > This is to be expected. You are trying to modify and read the same > array at the same time, which should never be done. Thanks, I'll have to keep this in mind next time. So, what's the best way to rearrange a subarray of an array. Copying seems inefficient. -Tony > > > Matthieu > > 2008/5/31 Tony Yu : > Great job getting numpy 1.1.0 out and thanks for including the old API > of masked arrays. > > I've been playing around with some software using numpy 1.0.4 and took > a crack at upgrading it to numpy 1.1.0, but I ran into some strange > behavior when assigning to slices of a masked array. > > I made the simplest example I could think of to show this weird > behavior. Basically, reordering the masked array and assigning back to > itself *on the same line* seems to work for part of the array, but > other parts are left unchanged. In the example below, half of the > array is assigned "properly" and the other half isn't. This problem is > eliminated if the assignment is done with a copy of the array. > Alternatively, this problem is eliminated if I using > numpy.oldnumeric.ma.masked_array instead of the new masked array > implementation. > > Is this just a problem on my setup? > > Thanks in advance for your help. 
> -Tony Yu > > Example: > ======== > In [1]: import numpy > > In [2]: masked = numpy.ma.masked_array([[1, 2, 3, 4, 5]], mask=False) > > In [3]: masked[:] = numpy.fliplr(masked.copy()) > > In [4]: print masked > [[5 4 3 2 1]] > > In [5]: masked[:] = numpy.fliplr(masked) > > In [6]: print masked > [[1 2 3 2 1]] > > > Specs: > ====== > Numpy 1.1.0 > Python 2.5.1 > OS X Leopard 10.5.3 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/? > blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sat May 31 18:37:29 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 31 May 2008 15:37:29 -0700 Subject: [Numpy-discussion] Slice assignment and overlapping views (was: Strange behavior in setting masked array values in Numpy 1.1.0) In-Reply-To: <1212271760.8410.45.camel@localhost.localdomain> References: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> <1212271760.8410.45.camel@localhost.localdomain> Message-ID: On Sat, May 31, 2008 at 3:09 PM, Pauli Virtanen wrote: > > The reason for the strange behavior of slice assignment is that when the > left and right sides in a slice assignment are overlapping views of the > same array, the result is currently effectively undefined. Same is true > for ndarrays: > >>>> import numpy >>>> a = numpy.array([1, 2, 3, 4, 5]) >>>> a[::-1] > array([5, 4, 3, 2, 1]) >>>> a[:] = a[::-1] >>>> a > array([5, 4, 3, 4, 5]) Here's a fun one: >> x = np.random.rand(2,5) >> x.round() array([[ 0., 1., 0., 0., 0.], [ 0., 0., 0., 0., 1.]]) >> x.round(out=x[::-1]) array([[ 0., 1., 0., 0., 0.], [ 0., 1., 0., 0., 0.]]) Looks like the top row of x is rounded first and the result is placed in the bottom row. Then the bottom row is evaluated (taking the round of the already rounded top row) and placed in the top row. So the top and bottom will always be the same. That must be useful somewhere :) From pgmdevlist at gmail.com Sat May 31 19:41:44 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Sat, 31 May 2008 19:41:44 -0400 Subject: [Numpy-discussion] Strange behavior in setting masked array values in Numpy 1.1.0 In-Reply-To: References: <6F7AB06B-DB3A-4DFE-80F3-0BF6C32F75E2@gmail.com> Message-ID: <200805311941.44540.pgmdevlist@gmail.com> On Saturday 31 May 2008 18:35:32 Tony Yu wrote: > On May 31, 2008, at 6:04 PM, Matthieu Brucher wrote: > > Hi, > > > > This is to be expected. You are trying to modify and read the same > > array at the same time, which should never be done. > > Thanks, I'll have to keep this in mind next time. And the reason why it works with oldnumeric is that the right term is copied while setting the slice. > So, what's the best way to rearrange a subarray of an array. Copying > seems inefficient. But likely the only solution for your data to be contiguous
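Concretely, the copy-based pattern from Tony's first message is the safe one (a minimal sketch reusing his example; the in-place one-liner is left commented out because, as Pauli explained, its result is effectively undefined when the two sides overlap):

import numpy as np

masked = np.ma.masked_array([[1, 2, 3, 4, 5]], mask=False)

# Undefined: the right-hand side is an overlapping view of the left-hand side.
# masked[:] = np.fliplr(masked)

# Safe: materialize the rearranged values before assigning them back.
masked[:] = np.fliplr(masked.copy())
print(masked)      # [[5 4 3 2 1]]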