From pete at shinners.org Mon Mar 1 01:02:51 2010 From: pete at shinners.org (Peter Shinners) Date: Sun, 28 Feb 2010 22:02:51 -0800 Subject: [Numpy-discussion] take not respecting masked arrays? In-Reply-To: References: <4B8B1F72.5050709@shinners.org> Message-ID: <4B8B588B.6000500@shinners.org> On 02/28/2010 08:01 PM, Pierre GM wrote: > On Feb 28, 2010, at 8:59 PM, Peter Shinners wrote: > >> I have a 2D masked array that has indices into a 1D array. I want to use >> some form of "take" to fetch the values into the 2D array. I've tried >> both numpy.take and numpy.ma.take, but they both return a new unmasked >> array. >> > > Mmh. Surprising. np.ma.take should return a masked array if it's given a masked array as input. Can you pastebin the array that gives you trouble ? I need to investigate that. > As a temporary workaround, use np.take on first the _data, then the _mask and construct a new masked array from the two results. > Here is the code as I would like it to work. http://python.pastebin.com/CsEnUrSa import numpy as np values = np.array((40, 18, 37, 9, 22)) index = np.arange(3)[None,:] + np.arange(5)[:,None] mask = index >= len(values) maskedindex = np.ma.array(index, mask=mask) lookup = np.ma.take(values, maskedindex) # This fails with an index error, but illegal indices are masked. # It succeeds when mode="clip", but it does not return a masked array. print lookup From pgmdevlist at gmail.com Mon Mar 1 01:40:08 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 1 Mar 2010 01:40:08 -0500 Subject: [Numpy-discussion] take not respecting masked arrays? In-Reply-To: References: <4B8B1F72.5050709@shinners.org> Message-ID: <4F070AFA-832A-4F2A-8789-AA30748E7E70@gmail.com> On Feb 28, 2010, at 11:12 PM, Charles R Harris wrote: > > > ______ > > Ah, Pierre, now that you are here... ;) Can you take a look at the invalid value warnings in the masked array tests and maybe fix them up by turning off the warnings where appropriate? I'd do it myself except that I hesitate to touch masked array stuff. Chuck, did you just hijack the thread ;) ? To replace thiings in context: a few weeks ago, we had a global flag in numpy.ma that prevented warnings to be printed. Now, the warnings are handled in a case-by-case basis in the numy.ma functions. The problem is that when a numpy function is called on a masked array instead of its numpy.ma equivalent, the warnings are not trapped (that's what happen in the tests). In order to trap them, I'll have to use a new approach (something like __array_prepare__), which is not necessarily difficult but not trivial either. I should have plenty of free time in the next weeks. I'm afraid that won't be on time for the 2.0 release, though, sorry. From cournape at gmail.com Mon Mar 1 01:44:31 2010 From: cournape at gmail.com (David Cournapeau) Date: Mon, 1 Mar 2010 15:44:31 +0900 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> Message-ID: <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> On Mon, Mar 1, 2010 at 1:35 PM, James Bergstra wrote: > Could someone point me to documentation (or even numpy src) that shows > how to allocate a numpy.int8 in C, or check to see if a PyObject is a > numpy.int8? In numpy, the type is described in the dtype type object, so you should create the appropriate PyArray_Descr when creating an array. 
The exact procedure depends on how you create the array, but a simple way to create arrays is PyArray_SimpleNew, where you don't need to create your own dtype, and just pass the correponding typenum (C enum), something like PyArray_SimpleNew(nd, dims, NPY_INT8). If you need to create from a function which only takes PyArray_Descr, you can easily create a simple descriptor object from the enum using PyArray_DescrFromType. You can see examples in numpy/core/src/multiarray/ctors.c cheers, David From pgmdevlist at gmail.com Mon Mar 1 01:58:02 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 1 Mar 2010 01:58:02 -0500 Subject: [Numpy-discussion] take not respecting masked arrays? In-Reply-To: <4B8B588B.6000500@shinners.org> References: <4B8B1F72.5050709@shinners.org> <4B8B588B.6000500@shinners.org> Message-ID: On Mar 1, 2010, at 1:02 AM, Peter Shinners wrote: >> Here is the code as I would like it to work. > http://python.pastebin.com/CsEnUrSa > > > import numpy as np > > values = np.array((40, 18, 37, 9, 22)) > index = np.arange(3)[None,:] + np.arange(5)[:,None] > mask = index >= len(values) > > maskedindex = np.ma.array(index, mask=mask) > > lookup = np.ma.take(values, maskedindex) > # This fails with an index error, but illegal indices are masked. OK, but this doesn't even work on a regular ndarray: np.take(values, index) raises an IndexError as well. Not much I can do there, then. > # It succeeds when mode="clip", but it does not return a masked array. > print lookup Oh, I get it... The problem is that we use `take` on a ndarray (values) with a masked array as indices (maskedindex). OK, I could modify some of the mechanics so that a masked array is output even if a ndarray was parsed. Now, about masked indices: OK, you're right, the result should be masked accordingly. Can you open a ticket, then ? From friedrichromstedt at gmail.com Mon Mar 1 06:12:43 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 1 Mar 2010 12:12:43 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: 2010/3/1 Charles R Harris : > On Sun, Feb 28, 2010 at 7:58 PM, Ian Mallett wrote: >> Excellent--and a 3D rotation matrix is 3x3--so the list can remain n*3. >> Now the question is how to apply a rotation matrix to the array of vec3? > > It looks like you want something like > > res = dot(vec, rot) + tran > > You can avoid an extra copy being made by separating the parts > > res = dot(vec, rot) > res += tran > > where I've used arrays, not matrices. Note that the rotation matrix > multiplies every vector in the array. When you want to rotate a ndarray "list" of vectors: >>> a.shape (N, 3) >>> a [[1., 2., 3. ] [4., 5., 6. ]] by some rotation matrix: >>> rotation_matrix.shape (3, 3) where each row of the rotation_matrix represents one vector of the rotation target basis, expressed in the basis of the original system, you can do this by writing: >>> numpy.dot(a, rotations_matrix) , as Chuck pointed out. This gives you the rotated vectors in an ndarray "list" again: >>> numpy.dot(a, rotation_matrix).shape (N, 3) This is just somewhat more in detail what Chuck already stated > Note that the rotation matrix > multiplies every vector in the array. 
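Putting Chuck's two lines and the above together, here is a small self-contained sketch; the particular matrix and vectors below are made-up illustration values, not anything from your code:

import numpy as np

verts = np.array([[1., 2., 3.],
                  [4., 5., 6.]])        # (N, 3) "list" of vertices

rot = np.array([[ 0., 1., 0.],          # a 90 degree turn about the z axis
                [-1., 0., 0.],
                [ 0., 0., 1.]])
tran = np.array([10., 0., 0.])          # translation applied after rotating

res = np.dot(verts, rot)                # rotates every vertex at once
res += tran                             # in-place add avoids an extra copy
print(res)                              # [[ 8.  1.  3.]
                                        #  [ 5.  4.  6.]]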
my 2 cents, Friedrich From ralf.gommers at googlemail.com Mon Mar 1 08:40:01 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 1 Mar 2010 21:40:01 +0800 Subject: [Numpy-discussion] Superpack - one Complex test failure Message-ID: Finally I got my Wine environment sorted out - I'm now able to build superpack installers for both Python 2.5 and 2.6. I tested the 2.6 installer on Windows XP, and got a single test failure. This exact same test also is the only test failure with the numpy 1.3 installer on sourceforge. So the installer seems fine to me, and either it's the machine on which I'm testing or this test just fails on windows. What do other people see when using an installer from Sourceforge? For completeness: the installed version is SSE3, processor is Intel Core Duo P8700. ====================================================================== FAIL: test_special_values (test_umath_complex.TestClog) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python26\lib\site-packages\numpy\core\tests\test_umath_complex.py", l ine 179, in test_special_values assert_almost_equal(np.log(x), y) File "C:\Python26\lib\site-packages\numpy\testing\utils.py", line 437, in asse rt_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [ NaN+2.35619449j] DESIRED: (inf+2.35619449019j) ---------------------------------------------------------------------- Ran 2342 tests in 8.000s FAILED (KNOWNFAIL=7, SKIP=1, failures=1) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Mar 1 09:06:19 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Mar 2010 09:06:19 -0500 Subject: [Numpy-discussion] Superpack - one Complex test failure In-Reply-To: References: Message-ID: <1cd32cbb1003010606w5623ebb2qa7eeb98f421075de@mail.gmail.com> On Mon, Mar 1, 2010 at 8:40 AM, Ralf Gommers wrote: > Finally I got my Wine environment sorted out - I'm now able to build > superpack installers for both Python 2.5 and 2.6. I tested the 2.6 installer > on Windows XP, and got a single test failure. This exact same test also is > the only test failure with the numpy 1.3 installer on sourceforge. So the > installer seems fine to me, and either it's the machine on which I'm testing > or this test just fails on windows. What do other people see when using an > installer from Sourceforge? > > For completeness: the installed version is SSE3, processor is Intel Core Duo > P8700. > > > ====================================================================== > FAIL: test_special_values (test_umath_complex.TestClog) > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File > "C:\Python26\lib\site-packages\numpy\core\tests\test_umath_complex.py", l > ine 179, in test_special_values > ??? assert_almost_equal(np.log(x), y) > ? File "C:\Python26\lib\site-packages\numpy\testing\utils.py", line 437, in > asse > rt_almost_equal > ??? "DESIRED: %s\n" % (str(actual), str(desired))) > AssertionError: Items are not equal: > ACTUAL: [ NaN+2.35619449j] > DESIRED: (inf+2.35619449019j) > > > ---------------------------------------------------------------------- > Ran 2342 tests in 8.000s > > FAILED (KNOWNFAIL=7, SKIP=1, failures=1) This test has been reported to fail for a while on Windows. It also fails with numpy 1.4.0 I don't have 1.3 or 1.4.1 available for testing right now. 
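For what it's worth, judging from the expected value (2.35619449 is 3*pi/4), the special case being exercised looks like clog(-inf + inf*j). A quick way to poke at it by hand -- note the input below is inferred from the expected output, not copied from the test source:

import numpy as np

# C99 says clog(-inf + inf*j) == inf + 3*pi/4*j; on the failing Windows
# builds the real part apparently comes back as nan instead of inf.
x = np.array([complex(-np.inf, np.inf)])
print(np.log(x))    # expected [ inf+2.35619449j]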
Josef > > Cheers, > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ralf.gommers at googlemail.com Mon Mar 1 09:11:31 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 1 Mar 2010 22:11:31 +0800 Subject: [Numpy-discussion] Superpack - one Complex test failure In-Reply-To: <1cd32cbb1003010606w5623ebb2qa7eeb98f421075de@mail.gmail.com> References: <1cd32cbb1003010606w5623ebb2qa7eeb98f421075de@mail.gmail.com> Message-ID: On Mon, Mar 1, 2010 at 10:06 PM, wrote: > This test has been reported to fail for a while on Windows. It also > fails with numpy 1.4.0 > > Thanks Josef. In that case I'm good to go. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Mar 1 09:52:27 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 1 Mar 2010 22:52:27 +0800 Subject: [Numpy-discussion] todos before 1.4.1 RC1 Message-ID: Hi all, Here are some requests / things I think need to be done before a 1.4.1 RC1 can be put out. 1. Bump up the version to 1.4.1 2. Update the release notes, including an explanation of why 1.4.0 was pulled. 3. Patrick and I need info on how to upload to Sourceforge. David or Jarrod, can you tell us how this works (offlist)? I guess I can do 1 and 2, but I don't have commit rights. I'm happy to get them, if you'd all prefer that I submit patches for a while first that's fine too. Then, I'm also going to try this once more: in my opinion we should not release numpy 1.4.1 before scipy 0.7.2. Consider this: numpy 1.3 + scipy 0.7.1 = OK numpy 1.4.1 + scipy 0.7.2 = OK numpy 1.3 + scipy 0.7.2 = OK numpy 1.4.1 + scipy 0.7.1 = NOT OK (Cython issue) So either release 1.4.1 and 0.7.2 at the same time, or scipy 0.7.2 first. Even having an incompatible combination out there for a couple of days/weeks is a bad idea imho. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.raspaud at smhi.se Mon Mar 1 10:04:19 2010 From: martin.raspaud at smhi.se (Martin Raspaud) Date: Mon, 01 Mar 2010 16:04:19 +0100 Subject: [Numpy-discussion] C-api and masked arrays Message-ID: <4B8BD773.3080701@smhi.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all, We are using at the moment a c extension which should manipulate masked arrays. What we do is to fill the masked array with a given value (say 65535 if we run uint16 arrays), do the manipulation, and convert back to masked arrays when we go back to python. This seems like a naive way to do, is there another cleverer way to do it ? Thanks, Martin -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJLi9dyAAoJEBdvyODiyJI4VmEIALRM/a1YXj0WBhHd3aGwEZgK TDakjA+vxlbTksWzZdLs28sYToc1fj0hAeh+VFLPwgOLbORgAeKg4F040DQIV8ea UXY+HZ3sI6W0c4V+6rhqTnKReGH8bxNl/2H3/s3gnAC/feULijCwJhfRo4yXxRzg tjKBejMFsjVWB/QvVHq1FHUOPaz8wWmF5ju9kc28RisiG5hsaAyFfLE4RtA0oaJn Z7YImX1D+NcdI7BrelUippL4gyLzHjAY34FfjCuJIuZ5AkwP/ezZKGBLHDpK3CTx JwJ83vsRjQCwjGUruFNPzhm/VPM39EH8Q7NiM1N6ikQaeG3LdywcNFpxLobeSt4= =8M1y -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... 
Name: martin_raspaud.vcf Type: text/x-vcard Size: 260 bytes Desc: not available URL: From pgmdevlist at gmail.com Mon Mar 1 10:21:13 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 1 Mar 2010 10:21:13 -0500 Subject: [Numpy-discussion] C-api and masked arrays In-Reply-To: <4B8BD773.3080701@smhi.se> References: <4B8BD773.3080701@smhi.se> Message-ID: <162DBAC6-A493-4CFE-BF8C-CA8B24E38CD6@gmail.com> On Mar 1, 2010, at 10:04 AM, Martin Raspaud wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi all, > > We are using at the moment a c extension which should manipulate masked arrays. > What we do is to fill the masked array with a given value (say 65535 if we run > uint16 arrays), do the manipulation, and convert back to masked arrays when we > go back to python. > > This seems like a naive way to do, is there another cleverer way to do it ? Keep in mind that masked arrays are pure Python (because I still don't speak C), so there's no real proper C way to deal with them. That depends on the kind of manipulation you have in mind. If the underlying value is not important, you don't have to fill the array, just use its .data (the underlying ndarray). If you expect problems down the road (such as NaNs popping up), then yes, filling the masked array beforehand is the way to go. Don't forget to stitch the mask at the end... From martin.raspaud at smhi.se Mon Mar 1 10:39:58 2010 From: martin.raspaud at smhi.se (Martin Raspaud) Date: Mon, 01 Mar 2010 16:39:58 +0100 Subject: [Numpy-discussion] C-api and masked arrays In-Reply-To: <162DBAC6-A493-4CFE-BF8C-CA8B24E38CD6@gmail.com> References: <4B8BD773.3080701@smhi.se> <162DBAC6-A493-4CFE-BF8C-CA8B24E38CD6@gmail.com> Message-ID: <4B8BDFCE.1090604@smhi.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Pierre GM skrev: > On Mar 1, 2010, at 10:04 AM, Martin Raspaud wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi all, >> >> We are using at the moment a c extension which should manipulate masked arrays. >> What we do is to fill the masked array with a given value (say 65535 if we run >> uint16 arrays), do the manipulation, and convert back to masked arrays when we >> go back to python. >> >> This seems like a naive way to do, is there another cleverer way to do it ? > > > Keep in mind that masked arrays are pure Python (because I still don't speak C), so there's no real proper C way to deal with them. That depends on the kind of manipulation you have in mind. If the underlying value is not important, you don't have to fill the array, just use its .data (the underlying ndarray). If you expect problems down the road (such as NaNs popping up), then yes, filling the masked array beforehand is the way to go. Don't forget to stitch the mask at the end... Hi, We're talking map projections, so that means that the values will move around, including masked ones... So filling the array with a given value is a way of projecting the array and the mask in one shot... 
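Roughly this pattern, in other words -- with a stand-in function below in place of the real C reprojection routine, so the names are only for illustration:

import numpy as np
import numpy.ma as ma

FILL = 65535   # sentinel that never occurs in the real uint16 data

data = ma.array(np.array([[10, 20], [30, 40]], dtype=np.uint16),
                mask=[[False, True], [False, False]])

def reproject(a):
    # stand-in for the C extension: shift the grid one column, so the
    # masked (filled) samples move along with the valid ones
    return np.roll(a, 1, axis=1)

filled = data.filled(FILL)                  # plain ndarray handed to C
projected = reproject(filled)               # masked samples travel as FILL
result = ma.masked_equal(projected, FILL)   # stitch the mask back on
print(result)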
Martin -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJLi9/OAAoJEBdvyODiyJI4pIwH/0pCKWOjhgwzC4AWXxXsvwIz M0kDHIpnZaHzKb1yIhnRJJQtyYsqWCXp9xb1KqGVD1GY6Xf2T8xI61pB17HSrSoY MvTUXjWHKsu2Uz8eNBfOvyTbfTGs+dtMEjwWGbj+OmZrLERPSGvpldPnsz8vxBsD QK0oG9aF2lrnHlv9YkHWHp8+XTTPGAVhzcNidic1hlGlhNoE/pcphpB3b8ssVL4H I+hF9JOzN5di9fjTKDo6rn988gxlxSZx/Im9QQ9A53sAZxxQNn1l5bTB5qACzHPj qaxY+cyEWJjSe/FI32BdWgPNwLM+7z/L6vPLzyA6rnuyIqgFnJqPGuYT/ZSWsts= =05+r -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... Name: martin_raspaud.vcf Type: text/x-vcard Size: 260 bytes Desc: not available URL: From pgmdevlist at gmail.com Mon Mar 1 10:44:45 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 1 Mar 2010 10:44:45 -0500 Subject: [Numpy-discussion] C-api and masked arrays In-Reply-To: <4B8BDFCE.1090604@smhi.se> References: <4B8BD773.3080701@smhi.se> <162DBAC6-A493-4CFE-BF8C-CA8B24E38CD6@gmail.com> <4B8BDFCE.1090604@smhi.se> Message-ID: <5A87D213-2F73-451E-A1F6-50B7F79B68B5@gmail.com> On Mar 1, 2010, at 10:39 AM, Martin Raspaud wrote: > Hi, > > We're talking map projections, so that means that the values will move around, > including masked ones... > > So filling the array with a given value is a way of projecting the array and the > mask in one shot... OK then. Just make sure the value you choose for filling your array is never present otherwise... From doutriaux1 at llnl.gov Mon Mar 1 14:48:25 2010 From: doutriaux1 at llnl.gov (=?utf-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Mon, 1 Mar 2010 11:48:25 -0800 Subject: [Numpy-discussion] Snow Leopard In-Reply-To: <7E340A39-8F92-44F4-BE54-A6A49150DE47@cs.toronto.edu> References: <54208F16-C60C-4711-BB97-B9D794079300@llnl.gov> <7E340A39-8F92-44F4-BE54-A6A49150DE47@cs.toronto.edu> Message-ID: <075A02EA-EBD8-4742-A792-196703FB89A0@llnl.gov> Thx David, Maybe i will have to try that as a temporary fix. But in the long run i do want to build my own Python. C. On Feb 26, 2010, at 4:59 PM, David Warde-Farley wrote: > On 26-Feb-10, at 7:43 PM, Charles ???? Doutriaux wrote: > >> Any idea on how to build a pure 32bit numpy on snow leopard? > > If I'm not mistaken you'll probably want to build against the > Python.org Python rather than the wacky version that comes installed > on the system. The Python.org installer is a 32-bit Python that > installs itself in /Library. > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://*mail.scipy.org/mailman/listinfo/numpy-discussion From loredo at astro.cornell.edu Mon Mar 1 15:04:10 2010 From: loredo at astro.cornell.edu (Tom Loredo) Date: Mon, 1 Mar 2010 15:04:10 -0500 Subject: [Numpy-discussion] Snow Leopard Py-2.7a3 _init_posix issue; IO test segfault Message-ID: <1267473850.4b8c1dbaa0e6d@astrosun2.astro.cornell.edu> Bruce Southey wrote: On Fri, Feb 26, 2010 at 6:59 PM, David Warde-Farley wrote: > On 26-Feb-10, at 7:43 PM, Charles سمير Doutriaux wrote: > >> Any idea on how to build a pure 32bit numpy on snow leopard? > > If I'm not mistaken you'll probably want to build against the > Python.org Python rather than the wacky version that comes installed > on the system. The Python.org installer is a 32-bit Python that > installs itself in /Library. 
> > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > If you remain with 2.7 then you should also view the thread started 3 days ago: 'distutils problem with NumPy-1.4 & Py-2.7a3 (Snow Leopard)' http://mail.scipy.org/pipermail/numpy-discussion/2010-February/048882.html In particular: Ticket 1355 - that should be resolved with r8260 (thanks Stefan): http://projects.scipy.org/numpy/ticket/1355 Ticket 1409 http://projects.scipy.org/numpy/ticket/1409 Ticket 1345: http://projects.scipy.org/numpy/ticket/1345 ~~~~~~~~~~~~~~~~~~~~ Ticket 1409 indicates the _init_posix issue was fixed 5 days ago, but as of today the unnecessary _init_posix reference is still in the version available on SVN (r8270): numpy-r8270:145$ time python setup.py build --fcompiler=gnu95 Running from numpy source directory.Traceback (most recent call last): File "setup.py", line 210, in setup_package() File "setup.py", line 187, in setup_package from numpy.distutils.core import setup File "/Volumes/Tracking/Temp-work/Sandbox-installs/numpy-r8270/numpy/distutils/__init__.py", line 7, in import ccompiler File "/Volumes/Tracking/Temp-work/Sandbox-installs/numpy-r8270/numpy/distutils/ccompiler.py", line 22, in _old_init_posix = distutils.sysconfig._init_posix AttributeError: 'module' object has no attribute '_init_posix' Per Robert's suggestion, I commented out the offending line (22) in ccompiler.py, and the build proceeded. I am using a 64-bit universal build. The resulting numpy gives a segfault on test: >>> numpy.test(verbose=10) .. test_ip_basic (test_multiarray.TestFromBuffer) ... ok test_multiarray.TestIO.test_ascii ... Segmentation fault At this point I don't know if this is just a 64-bit issue; I'm trying to look into it but 32-bit building for 10.6 was unintentionally crippled in the 2.6.x series and 2.7a3; it should be fixed in the next 2.7 release and in 2.6.5. I can also verify that this segfault bug remains: http://projects.scipy.org/numpy/ticket/1345 Dealing with it has been postponed since it affects 2.7 which isn't due out for a few months. However, 2.6.5rc1 is scheduled for today (I don't see it yet!), with final in 2 weeks; it may be worth seeing if these issues will appear in 2.6.5. BTW: The current NumPy and SciPy SVN (r8270, r6250) install successfully with 64-bit Py-2.6.4 on Snow Leopard; the NumPy tests are fine; the SciPy tests have many errors but at least they no longer segfault. I'll report further in another thread; here it's just to point out that the issues above do not affect 64-bit 2.6.4. -Tom ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From loredo at astro.cornell.edu Mon Mar 1 15:23:53 2010 From: loredo at astro.cornell.edu (Tom Loredo) Date: Mon, 1 Mar 2010 15:23:53 -0500 Subject: [Numpy-discussion] NumPy & SciPy with Snow Leopard 64-bit Py-2.6.4 Message-ID: <1267475033.4b8c2259aa3ef@astrosun2.astro.cornell.edu> Just wanted to report qualified success installing NumPy & SciPy under a 64-bit build of Python-2.6.4 (universal framework) on OS X 10.6.2 (current Snow Leopard). I am using the current SVN checkouts (numpy r8270, scipy r6250). NumPy has installed successfully for some time now and the current SVN maintains this: >>> numpy.test() .. 
Ran 2521 tests in 8.518s OK (KNOWNFAIL=4, SKIP=1) SciPy has been causing me problems for weeks with segfaults with IFFT tests, as reported on scipy-dev (no one ever responded to this so I made no progress in diagnosing it): http://mail.scipy.org/pipermail/scipy-dev/2010-February/013921.html However, r6250 now runs scipy.test() without segfault, though with 22 errors and 2 failures. The full tests give: >>> scipy.test('full') .. Ran 4982 tests in 652.478s FAILED (KNOWNFAIL=13, SKIP=27, errors=22, failures=6) I am not familiar with what many of the tests are covering, so I cannot assess the severity of all the errors and failures. Some of them seem to be bugs in the tests (e.g., use of an unexpected keyword "new" in several histogram tests); others are innocuous (e.g., missing PIL, which I haven't installed yet). I've posted the report here: http://www.pastie.org/848651 I'd appreciate comments on which issues are nontrivial and deserve attention to as a SciPy user. E.g., the first error is in an lapack test and involves a ValueError where infs or NaNs appear where they shouldn't. Is this a bug in the test, or does it indicate a 64-bit issue that is making inf/NaN appear where it shouldn't? E.g., there are arpack errors, but I don't know what the "Error info=-8" message signifies. Other arpack errors are due to large solution mismatches, which I presume are serious and deserve attention. Thanks, Tom Loredo ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From josef.pktd at gmail.com Mon Mar 1 15:49:52 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Mar 2010 15:49:52 -0500 Subject: [Numpy-discussion] NumPy & SciPy with Snow Leopard 64-bit Py-2.6.4 In-Reply-To: <1267475033.4b8c2259aa3ef@astrosun2.astro.cornell.edu> References: <1267475033.4b8c2259aa3ef@astrosun2.astro.cornell.edu> Message-ID: <1cd32cbb1003011249l214a2e3aj45e59ec240f6200b@mail.gmail.com> On Mon, Mar 1, 2010 at 3:23 PM, Tom Loredo wrote: > > Just wanted to report qualified success installing NumPy & SciPy under > a 64-bit build of Python-2.6.4 (universal framework) on OS X 10.6.2 > (current Snow Leopard). ?I am using the current SVN checkouts > (numpy r8270, scipy r6250). > > NumPy has installed successfully for some time now and the current > SVN maintains this: > >>>> numpy.test() > .. > Ran 2521 tests in 8.518s > OK (KNOWNFAIL=4, SKIP=1) > > SciPy has been causing me problems for weeks with segfaults with > IFFT tests, as reported on scipy-dev (no one ever responded to > this so I made no progress in diagnosing it): > > http://mail.scipy.org/pipermail/scipy-dev/2010-February/013921.html > > However, r6250 now runs scipy.test() without segfault, though with > 22 errors and 2 failures. ?The full tests give: > >>>> scipy.test('full') > .. > Ran 4982 tests in 652.478s > FAILED (KNOWNFAIL=13, SKIP=27, errors=22, failures=6) > > I am not familiar with what many of the tests are covering, so I > cannot assess the severity of all the errors and failures. ?Some > of them seem to be bugs in the tests (e.g., use of an unexpected > keyword "new" in several histogram tests); np.histogram(rvs,histsupp,new=True) this still needs to be changed in scipy.stats, because new keyword has been removed in numpy trunk 2 weeks ago. Josef others are innocuous > (e.g., missing PIL, which I haven't installed yet). ?I've posted > the report here: > > http://www.pastie.org/848651 > > I'd appreciate comments on which issues are nontrivial and > deserve attention to as a SciPy user. 
?E.g., the first error > is in an lapack test and involves a ValueError where infs or > NaNs appear where they shouldn't. ?Is this a bug in the test, > or does it indicate a 64-bit issue that is making inf/NaN > appear where it shouldn't? ?E.g., there are arpack errors, > but I don't know what the "Error info=-8" message signifies. > Other arpack errors are due to large solution mismatches, > which I presume are serious and deserve attention. > > Thanks, > Tom Loredo > > > ------------------------------------------------- > This mail sent through IMP: http://horde.org/imp/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From nwagner at iam.uni-stuttgart.de Mon Mar 1 16:08:30 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 01 Mar 2010 22:08:30 +0100 Subject: [Numpy-discussion] NumPy & SciPy with Snow Leopard 64-bit Py-2.6.4 In-Reply-To: <1cd32cbb1003011249l214a2e3aj45e59ec240f6200b@mail.gmail.com> References: <1267475033.4b8c2259aa3ef@astrosun2.astro.cornell.edu> <1cd32cbb1003011249l214a2e3aj45e59ec240f6200b@mail.gmail.com> Message-ID: On Mon, 1 Mar 2010 15:49:52 -0500 josef.pktd at gmail.com wrote: > On Mon, Mar 1, 2010 at 3:23 PM, Tom Loredo > wrote: >> >> Just wanted to report qualified success installing NumPy >>& SciPy under >> a 64-bit build of Python-2.6.4 (universal framework) on >>OS X 10.6.2 >> (current Snow Leopard). ?I am using the current SVN >>checkouts >> (numpy r8270, scipy r6250). >> >> NumPy has installed successfully for some time now and >>the current >> SVN maintains this: >> >>>>> numpy.test() >> .. >> Ran 2521 tests in 8.518s >> OK (KNOWNFAIL=4, SKIP=1) >> >> SciPy has been causing me problems for weeks with >>segfaults with >> IFFT tests, as reported on scipy-dev (no one ever >>responded to >> this so I made no progress in diagnosing it): >> >> http://mail.scipy.org/pipermail/scipy-dev/2010-February/013921.html >> >> However, r6250 now runs scipy.test() without segfault, >>though with >> 22 errors and 2 failures. ?The full tests give: >> >>>>> scipy.test('full') >> .. >> Ran 4982 tests in 652.478s >> FAILED (KNOWNFAIL=13, SKIP=27, errors=22, failures=6) >> >> I am not familiar with what many of the tests are >>covering, so I >> cannot assess the severity of all the errors and >>failures. ?Some >> of them seem to be bugs in the tests (e.g., use of an >>unexpected >> keyword "new" in several histogram tests); > > np.histogram(rvs,histsupp,new=True) > > this still needs to be changed in scipy.stats, because >new keyword has > been removed in numpy trunk 2 weeks ago. > > Josef > > others are innocuous >> (e.g., missing PIL, which I haven't installed yet). >>?I've posted >> the report here: >> >> http://www.pastie.org/848651 >> >> I'd appreciate comments on which issues are nontrivial >>and >> deserve attention to as a SciPy user. ?E.g., the first >>error >> is in an lapack test and involves a ValueError where >>infs or >> NaNs appear where they shouldn't. ?Is this a bug in the >>test, >> or does it indicate a 64-bit issue that is making >>inf/NaN >> appear where it shouldn't? ?E.g., there are arpack >>errors, >> but I don't know what the "Error info=-8" message >>signifies. >> Other arpack errors are due to large solution >>mismatches, >> which I presume are serious and deserve attention. 
>> >> Thanks, >> Tom Loredo >> >> Last but not least there is an annoying bug lurking around for 17 months http://projects.scipy.org/numpy/ticket/937 Nils From mattknox.ca at gmail.com Mon Mar 1 18:38:01 2010 From: mattknox.ca at gmail.com (Matt Knox) Date: Mon, 1 Mar 2010 23:38:01 +0000 (UTC) Subject: [Numpy-discussion] C-api and masked arrays References: <4B8BD773.3080701@smhi.se> Message-ID: Martin Raspaud smhi.se> writes: > We are using at the moment a c extension which should manipulate masked arrays. > What we do is to fill the masked array with a given value (say 65535 if we run > uint16 arrays), do the manipulation, and convert back to masked arrays when we > go back to python. > > This seems like a naive way to do, is there another cleverer way to do it ? You might want to take a look at the C code in the timeseries scikit. It works extensively with masked arrays. I don't claim that we use the optimal approach, but it might give you some ideas. All the code that deals with masked arrays is located here: http://svn.scipy.org/svn/scikits/trunk/timeseries/scikits/timeseries/src/c_tseries .c The functions that deal with frequency conversion and calculating moving sums, averages, etc all deal with masked arrays. - Matt From millman at berkeley.edu Mon Mar 1 19:47:18 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 1 Mar 2010 16:47:18 -0800 Subject: [Numpy-discussion] todos before 1.4.1 RC1 In-Reply-To: References: Message-ID: On Mon, Mar 1, 2010 at 6:52 AM, Ralf Gommers wrote: > Here are some requests / things I think need to be done before a 1.4.1 RC1 > can be put out. > > 1. Bump up the version to 1.4.1 > 2. Update the release notes, including an explanation of why 1.4.0 was > pulled. > 3. Patrick and I need info on how to upload to Sourceforge. David or Jarrod, > can you tell us how this works (offlist)? > > I guess I can do 1 and 2, but I don't have commit rights. I'm happy to get > them, if you'd all prefer that I submit patches for a while first that's > fine too. Sure, I will send you a follow-up email right away. > Then, I'm also going to try this once more: in my opinion we should not > release numpy 1.4.1 before scipy 0.7.2. Consider this: > numpy 1.3 + scipy 0.7.1 = OK > numpy 1.4.1 + scipy 0.7.2 = OK > numpy 1.3 + scipy 0.7.2 = OK > numpy 1.4.1 + scipy 0.7.1 = NOT OK (Cython issue) > > So either release 1.4.1 and 0.7.2 at the same time, or scipy 0.7.2 first. You should release scipy 0.7.2 first. Since this is your first time managing the release process, it would be easier to do them one at a time. Thanks, -- Jarrod Millman Helen Wills Neuroscience Institute 10 Giannini Hall, UC Berkeley http://cirl.berkeley.edu/ From bsouthey at gmail.com Mon Mar 1 20:01:04 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 1 Mar 2010 19:01:04 -0600 Subject: [Numpy-discussion] Snow Leopard Py-2.7a3 _init_posix issue; IO test segfault In-Reply-To: <1267473850.4b8c1dbaa0e6d@astrosun2.astro.cornell.edu> References: <1267473850.4b8c1dbaa0e6d@astrosun2.astro.cornell.edu> Message-ID: On Mon, Mar 1, 2010 at 2:04 PM, Tom Loredo wrote: > > Bruce Southey wrote: > > On Fri, Feb 26, 2010 at 6:59 PM, David Warde-Farley wrote: >> On 26-Feb-10, at 7:43 PM, Charles سمير Doutriaux wrote: >> >>> Any idea on how to build a pure 32bit numpy on snow leopard? >> >> If I'm not mistaken you'll probably want to build against the >> Python.org Python rather than the wacky version that comes installed >> on the system. 
The Python.org installer is a 32-bit Python that >> installs itself in /Library. >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > If you remain with 2.7 then you should also view the thread started 3 days ago: > 'distutils problem with NumPy-1.4 & Py-2.7a3 (Snow Leopard)' > http://mail.scipy.org/pipermail/numpy-discussion/2010-February/048882.html > > In particular: > Ticket 1355 - that should be resolved with r8260 (thanks Stefan): > http://projects.scipy.org/numpy/ticket/1355 > Ticket 1409 > http://projects.scipy.org/numpy/ticket/1409 > Ticket 1345: > http://projects.scipy.org/numpy/ticket/1345 > > ~~~~~~~~~~~~~~~~~~~~ > > Ticket 1409 indicates the _init_posix issue was fixed 5 days > ago, but as of today the unnecessary _init_posix reference > is still in the version available on SVN (r8270): Actually the only action was that I created the ticket Feb 24 (approx 5 days ago). It does not say it was been applied or not yet. > > numpy-r8270:145$ time python setup.py build --fcompiler=gnu95 > Running from numpy source directory.Traceback (most recent call last): > ?File "setup.py", line 210, in > ? ?setup_package() > ?File "setup.py", line 187, in setup_package > ? ?from numpy.distutils.core import setup > ?File "/Volumes/Tracking/Temp-work/Sandbox-installs/numpy-r8270/numpy/distutils/__init__.py", line 7, in > ? ?import ccompiler > ?File "/Volumes/Tracking/Temp-work/Sandbox-installs/numpy-r8270/numpy/distutils/ccompiler.py", line 22, in > ? ?_old_init_posix = distutils.sysconfig._init_posix > AttributeError: 'module' object has no attribute '_init_posix' > > Per Robert's suggestion, I commented out the offending line (22) > in ccompiler.py, and the build proceeded. ?I am using a 64-bit > universal build. ?The resulting numpy gives a segfault on test: > >>>> numpy.test(verbose=10) > .. > test_ip_basic (test_multiarray.TestFromBuffer) ... ok > test_multiarray.TestIO.test_ascii ... Segmentation fault > > At this point I don't know if this is just a 64-bit issue; I'm > trying to look into it but 32-bit building for 10.6 was > unintentionally crippled in the 2.6.x series and 2.7a3; it > should be fixed in the next 2.7 release and in 2.6.5. > > I can also verify that this segfault bug remains: > > http://projects.scipy.org/numpy/ticket/1345 This ticket is for that segfault and the patch just changes the PY_VERSION_HEX to allow Python 2.7. Bruce From sccolbert at gmail.com Mon Mar 1 20:15:50 2010 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 1 Mar 2010 20:15:50 -0500 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: <7f014ea61003011715o61a50a4fu23f5cb22669ed206@mail.gmail.com> This is how I always do it: In [1]: import numpy as np In [3]: tmat = np.array([[0., 1., 0., 5.],[0., 0., 1., 3.],[1., 0., 0., 2.]]) In [4]: tmat Out[4]: array([[ 0., 1., 0., 5.], [ 0., 0., 1., 3.], [ 1., 0., 0., 2.]]) In [5]: points = np.random.random((5, 3)) In [7]: hpoints = np.column_stack((points, np.ones(len(points)))) In [9]: hpoints Out[9]: array([[ 0.17437059, 0.38693627, 0.201047 , 1. ], [ 0.99712373, 0.16958721, 0.03050696, 1. ], [ 0.30653326, 0.62037744, 0.35785282, 1. ], [ 0.78936771, 0.93692711, 0.58138493, 1. ], [ 0.29914065, 0.08808239, 0.72032172, 1. 
]]) In [10]: np.dot(tmat, hpoints.T).T Out[10]: array([[ 5.38693627, 3.201047 , 2.17437059], [ 5.16958721, 3.03050696, 2.99712373], [ 5.62037744, 3.35785282, 2.30653326], [ 5.93692711, 3.58138493, 2.78936771], [ 5.08808239, 3.72032172, 2.29914065]]) On Mon, Mar 1, 2010 at 6:12 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > 2010/3/1 Charles R Harris : > > On Sun, Feb 28, 2010 at 7:58 PM, Ian Mallett > wrote: > >> Excellent--and a 3D rotation matrix is 3x3--so the list can remain n*3. > >> Now the question is how to apply a rotation matrix to the array of vec3? > > > > It looks like you want something like > > > > res = dot(vec, rot) + tran > > > > You can avoid an extra copy being made by separating the parts > > > > res = dot(vec, rot) > > res += tran > > > > where I've used arrays, not matrices. Note that the rotation matrix > > multiplies every vector in the array. > > When you want to rotate a ndarray "list" of vectors: > > >>> a.shape > (N, 3) > > >>> a > [[1., 2., 3. ] > [4., 5., 6. ]] > > by some rotation matrix: > > >>> rotation_matrix.shape > (3, 3) > > where each row of the rotation_matrix represents one vector of the > rotation target basis, expressed in the basis of the original system, > > you can do this by writing: > > >>> numpy.dot(a, rotations_matrix) , > > as Chuck pointed out. > > This gives you the rotated vectors in an ndarray "list" again: > > >>> numpy.dot(a, rotation_matrix).shape > (N, 3) > > This is just somewhat more in detail what Chuck already stated > > Note that the rotation matrix > > multiplies every vector in the array. > > my 2 cents, > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From geometrian at gmail.com Mon Mar 1 20:39:09 2010 From: geometrian at gmail.com (Ian Mallett) Date: Mon, 1 Mar 2010 17:39:09 -0800 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: Excellent--this setup works perfectly! In the areas I was concentrating on, the the speed increased an order of magnitude. However, the overall speed seems to have dropped. I believe this may be because the heavy indexing that follows on the result is slower in numpy. Is this a correct analysis? What would be amazing is if I could port the other half of the algorithm to numpy too . . . The algorithm is for light volume extrusion of geometry. Pseudocode of the rest of the algorithm: 1 v_array #array of beautifully transformed vertices from last step 2 n_array #array of beautifully transformed normals from last step 3 f_list #list of vec4s, where every vec4 is three vertex indices and a 4 #normal index. These indices together describe a triangle-- 5 #three vertex indices into "v_array", and one normal from "n_array". 6 edges = [] 7 for v1index,v2index,v3index,nindex in f_list: 8 v1,v2,v3 = [v_array[i] for i in [vi1index,v2index,v3index]] 9 if angle_between(n_array[nindex],v1-a_constant_point)<90deg: 10 for edge in [[v1index,v2index],[v2index,v3index],[v3index,v1index]]: 11 #add "edge" to "edges" 12 #remove pairs of identical edges (also note that things like 13 #[831,326] are the same as [326,831]) 14 edges2 = [] 15 for edge in edges: 16 edges2.append(v_array[edge[0]],v_array[edge[1]]) 17 return edges2 If that needs clarification, let me know. 
The goal with this is obviously to remove as many looping operations as possible. I think the slowdown may be in lines 8, 9, and 16, as these are the lines that index into v_array or n_array. In line 9, the normal "n_array[nindex]" is tested against any vector from a vertex (in this case "v1") to a constant point (here, the light's position) see if it is less than 90deg--that is, that the triangle's front is towards the light. I thought this might be a particularly good candidate for a boolean array? The edge pair removal is something that I have worked fairly extensively on. By testing and removing pairs as they are added (line 11), a bit more performance can be gained. I have put it here in 12-13 because I'm not sure this can be done in numpy. My current algorithm works using Python's standard sets, and using the "edge" as a key. If an edge is already in the set of edges (in the same or reversed form), then the edge is removed from the set. I may be able to do something with lines 14-17 myself--but I'm not sure. If my code above can't be simplified using numpy, is there a way to efficiently convert numpy arrays back to standard Python lists? As mentioned before, I'm certainly no pro with numpy--but I am learning! Thanks! Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From pete at shinners.org Mon Mar 1 23:36:08 2010 From: pete at shinners.org (Peter Shinners) Date: Mon, 01 Mar 2010 20:36:08 -0800 Subject: [Numpy-discussion] take not respecting masked arrays? In-Reply-To: References: <4B8B1F72.5050709@shinners.org> <4B8B588B.6000500@shinners.org> Message-ID: <4B8C95B8.5050808@shinners.org> On 02/28/2010 10:58 PM, Pierre GM wrote: > On Mar 1, 2010, at 1:02 AM, Peter Shinners wrote: > >>> Here is the code as I would like it to work. >>> >> http://python.pastebin.com/CsEnUrSa >> >> >> import numpy as np >> >> values = np.array((40, 18, 37, 9, 22)) >> index = np.arange(3)[None,:] + np.arange(5)[:,None] >> mask = index>= len(values) >> >> maskedindex = np.ma.array(index, mask=mask) >> >> lookup = np.ma.take(values, maskedindex) >> # This fails with an index error, but illegal indices are masked. >> > OK, but this doesn't even work on a regular ndarray: np.take(values, index) raises an IndexError as well. Not much I can do there, then. > > >> # It succeeds when mode="clip", but it does not return a masked array. >> print lookup >> > > Oh, I get it... The problem is that we use `take` on a ndarray (values) with a masked array as indices (maskedindex). OK, I could modify some of the mechanics so that a masked array is output even if a ndarray was parsed. > Now, about masked indices: OK, you're right, the result should be masked accordingly. Can you open a ticket, then ? > Excellent. Ticket is #1418 http://projects.scipy.org/numpy/ticket/1418 From friedrichromstedt at gmail.com Tue Mar 2 03:27:07 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 2 Mar 2010 09:27:07 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: I'm learning too by answering your questions. 2010/3/2 Ian Mallett : > 1? v_array #array of beautifully transformed vertices from last step > 2? n_array #array of beautifully transformed normals from last step > 3? f_list? #list of vec4s, where every vec4 is three vertex indices and a > 4? ??????? #normal index.? These indices together describe a triangle-- > 5? ??????? #three vertex indices into "v_array", and one normal from > "n_array". > 6? edges = [] > 7? 
for v1index,v2index,v3index,nindex in f_list: > 8 ? ?? v1,v2,v3 = [v_array[i] for i in [vi1index,v2index,v3index]] > 9 ? ?? if angle_between(n_array[nindex],v1-a_constant_point)<90deg: > 10 ???? ?? for edge in [[v1index,v2index],[v2index,v3index],[v3index,v1index]]: > 11? ?????????? #add "edge" to "edges" > 12 #remove pairs of identical edges (also note that things like > 13 #[831,326] are the same as [326,831]) > 14 edges2 = [] > 15 for edge in edges: > 16 ??? edges2.append(v_array[edge[0]],v_array[edge[1]]) > 17 return edges2 The loop I can replace by numpy operations: >>> v_array array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) >>> n_array array([[ 0.1, 0.2, 0.3], [ 0.4, 0.5, 0.6]]) >>> f_list array([[0, 1, 2, 0], [2, 1, 0, 1]]) Retrieving the v1 vectors: >>> v1s = v_array[f_list[:, 0]] >>> v1s array([[1, 2, 3], [7, 8, 9]]) Retrieving the normal vectors: >>> ns = n_array[f_list[:, 3]] >>> ns array([[ 0.1, 0.2, 0.3], [ 0.4, 0.5, 0.6]]) Now how to calculate the pairwise dot product (I suppress the difference of v1 to some_point for now): >>> inner = numpy.inner(ns, v1s) >>> inner array([[ 1.4, 5. ], [ 3.2, 12.2]]) This calculates *all* pairwise dot products, we have to select the diagonal of this square ndarray: >>> dotprods = inner[[numpy.arange(0, 2), numpy.arange(0, 2)]] >>> dotprods array([ 1.4, 12.2]) Now we can create a boolean array saying where the dotprod is > 0 (i.e, angle < 90?), and select those triangles: >>> select = dotprods > 0 >>> select array([ True, True], dtype=bool) >>> selected = f_list[select] >>> selected array([[0, 1, 2, 0], [2, 1, 0, 1]]) In this case it's the full list. Now build the triangles corners array: >>> corners = v_array[selected[:, :3]] >>> corners array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[7, 8, 9], [4, 5, 6], [1, 2, 3]]]) >>> This has indices [triangle, vertex number (0, 1, or 2), xyz]. And compute the edges (I think you can make use of them): >>> edges_dist = numpy.asarray([corners[:, 1] - corners[:, 0], corners[:, 2] - corners[:, 0], corners[:, 2] - corners[:, 1]]) >>> edges_dist array([[[ 3, 3, 3], [-3, -3, -3]], [[ 6, 6, 6], [-6, -6, -6]], [[ 3, 3, 3], [-3, -3, -3]]]) This has indices [corner number, triangle, xyz]. I think it's easier to compare then "reversed" edges, because then edge[i, j] == -edge[k, l]? But of course: >>> edges = numpy.asarray([[corners[:, 0], corners[:, 1]], [corners[:, 1], corners[:, 2]], [corners[:, 2], corners[:, 0]]]) >>> edges array([[[[1, 2, 3], [7, 8, 9]], [[4, 5, 6], [4, 5, 6]]], [[[4, 5, 6], [4, 5, 6]], [[7, 8, 9], [1, 2, 3]]], [[[7, 8, 9], [1, 2, 3]], [[1, 2, 3], [7, 8, 9]]]]) This has indices [edge number (0, 1, or 2), corner number in edge (0 or 1), triangle]. But this may not be what you want (not flattened in triangle number). Therefore: >>> edges0 = numpy.asarray([corners[:, 0], corners[:, 1]]) >>> edges1 = numpy.asarray([corners[:, 1], corners[:, 2]]) >>> edges2 = numpy.asarray([corners[:, 2], corners[:, 0]]) >>> edges0 array([[[1, 2, 3], [7, 8, 9]], [[4, 5, 6], [4, 5, 6]]]) >>> edges1 array([[[4, 5, 6], [4, 5, 6]], [[7, 8, 9], [1, 2, 3]]]) >>> edges2 array([[[7, 8, 9], [1, 2, 3]], [[1, 2, 3], [7, 8, 9]]]) >>> edges = numpy.concatenate((edges0, edges1, edges2), axis = 0) >>> edges array([[[1, 2, 3], [7, 8, 9]], [[4, 5, 6], [4, 5, 6]], [[4, 5, 6], [4, 5, 6]], [[7, 8, 9], [1, 2, 3]], [[7, 8, 9], [1, 2, 3]], [[1, 2, 3], [7, 8, 9]]]) This should be as intended. The indices are [flat edge number, edge side (left or right), xyz]. 
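A side note on the dot products further up: building the full N x N inner product matrix only to keep its diagonal gets wasteful when there are many triangles; the row-wise products can be had directly, for example:

>>> dotprods = (ns * v1s).sum(axis=1)
>>> dotprods
array([  1.4,  12.2])

Anyway, back to the edges.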
Now I guess you have to iterate over all pairs of them, don't know a numpy accelerated method. Maybe it's even faster to draw the edges twice than to invest O(N_edges ** 2) complexity for comparing? > I may be able to do something with lines 14-17 myself--but I'm not sure. Ok, than that's fine, and let us all know about your solution. It may seem a bit complicated, but I hope this impression is mainly because of the many outputs ... I double-checked everything, *hope* everything is correct. So far from me, Friedrich From amenity at enthought.com Tue Mar 2 11:18:02 2010 From: amenity at enthought.com (Amenity Applewhite) Date: Tue, 2 Mar 2010 10:18:02 -0600 Subject: [Numpy-discussion] EPD 6.1 & Upcoming Webinars References: <4F512ADB-26E3-42E8-9613-8FA2EF566E65@enthought.com> Message-ID: <847EEE39-ED9A-471B-A22C-781E87EC4F6E@enthought.com> March is shaping up to be as busy as ever: planning SciPy 2010 (http://conference.scipy.org/scipy2010), two great webinars...and a new release of EPD! * Enthought Python Distribution 6.1 * In EPD 6.1, NumPy and SciPy are dynamically linked against the MKL linear algebra routines. This allows EPD users to seamlessly benefit from the highly optimized BLAS and LAPACK routines in the MKL. We were certainly expecting to observe performance improvements, but we were surprised at just how dramatic the optimizations were for applications run on dual and multi-core Intel(R) processors. Refer to our benchmarking tests for more info: http://www.enthought.com/epd/mkl/ Then try it yourself! http://www.enthought.com/products/getepd.php * Enthought March Webinars * This Friday, Travis Oliphant will lead a webinar on optimization methods in EPD. Then, on the 19th, we'll host a webinar on Python libraries for integrating C and C++ code, namely Weave, Cython, and ctypes. Enthought Python Distribution Webinar How do I... optimize? Friday, March 5: 1pm CST/7pm UTC Wait-list (for non-subscribers): email amenity at enthought.com Scientific Computing with Python Webinar Weave, Cython, and ctypes Friday, March 19: 1pm CST/7pm UTC Register: https://www.gotomeeting.com/register/335697152 Enjoy! ~ The Enthought Team ~ Open Course Austin, TX: http://www.enthought.com/training/open-austin-sci.php Python for Scientists and Engineers * May 17- 19 Interfacing with C / C++ and Fortran * May 20 Introduction to UIs and Visualization * May 21 Financial Open Course, London, UK: http://www.enthought.com/training/open-london-fin.php Python for Quants * March 8-10 Software Craftsmanship * March 11 Introduction to UIs and Visualization * March 12 * Pricing, licensing, and training inquiries * Didrik and Matt are dedicated to answering your questions and getting you the support you need. 
US : Matt Harward mharward at enthought.com Europe: Didrik Pinte dpinte at enthought.com From bergstrj at iro.umontreal.ca Tue Mar 2 14:44:00 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Tue, 2 Mar 2010 14:44:00 -0500 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> Message-ID: <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> On Mon, Mar 1, 2010 at 1:44 AM, David Cournapeau wrote: > On Mon, Mar 1, 2010 at 1:35 PM, James Bergstra > wrote: >> Could someone point me to documentation (or even numpy src) that shows >> how to allocate a numpy.int8 in C, or check to see if a PyObject is a >> numpy.int8? > > In numpy, the type is described in the dtype type object, so you > should create the appropriate PyArray_Descr when creating an array. > The exact procedure depends on how you create the array, but a simple > way to create arrays is PyArray_SimpleNew, where you don't need to > create your own dtype, and just pass the correponding typenum (C > enum), something like PyArray_SimpleNew(nd, dims, NPY_INT8). > > If you need to create from a function which only takes PyArray_Descr, > you can easily create a simple descriptor object from the enum using > PyArray_DescrFromType. > > You can see examples in numpy/core/src/multiarray/ctors.c Maybe I'm missing something... but I don't think I want to create an array. In [3]: import numpy In [4]: type(numpy.int8()) Out[4]: In [5]: isinstance(numpy.int8(), numpy.ndarray) Out[5]: False I want to create one of those numpy.int8 guys. Can I do it with PyArray_XXX methods? -- http://www-etud.iro.umontreal.ca/~bergstrj From Chris.Barker at noaa.gov Tue Mar 2 15:09:40 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 02 Mar 2010 12:09:40 -0800 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> Message-ID: <4B8D7084.5050003@noaa.gov> James Bergstra wrote: > Maybe I'm missing something... but I don't think I want to create an array. > > In [3]: import numpy > > In [4]: type(numpy.int8()) > Out[4]: > > In [5]: isinstance(numpy.int8(), numpy.ndarray) > Out[5]: False right, it's a type object: In [13]: type(np.uint8) Out[13]: > I want to create one of those numpy.int8 guys. There's got to be a a way, though I can't tell you what it is. But I'm curious why you want that from C. It's a Python object, why not create it in Python? -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bergstrj at iro.umontreal.ca Tue Mar 2 19:12:50 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Tue, 2 Mar 2010 19:12:50 -0500 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4B8D7084.5050003@noaa.gov> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> Message-ID: <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> On Tue, Mar 2, 2010 at 3:09 PM, Christopher Barker wrote: > James Bergstra wrote: >> Maybe I'm missing something... but I don't think I want to create an array. >> >> In [3]: import numpy >> >> In [4]: type(numpy.int8()) >> Out[4]: >> >> In [5]: isinstance(numpy.int8(), numpy.ndarray) >> Out[5]: False > > right, it's a type object: > > In [13]: type(np.uint8) > Out[13]: > Agreed :) >> I want to create one of those numpy.int8 guys. > np.int8 is a type, and so is numpy.ndarray. And they are different. There's lots of docs about how to make arrays, but how do I make a scalar? > There's got to be a a way, though I can't tell you what it is. But I'm > curious why you want that from C. It's a Python object, why not create > it in Python? It's for Theano. (http://www.deeplearning.net/software/theano) Theano generates some c functions that are supposed to return such np.int8 instances, and they don't. Same goes for all the other scalar types too, but I didn't want to complicate my original question. I want to be able to generate all the{int,uint}{8,16,32,64} combinations. James -- http://www-etud.iro.umontreal.ca/~bergstrj From warren.weckesser at enthought.com Tue Mar 2 19:18:51 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 02 Mar 2010 18:18:51 -0600 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> Message-ID: <4B8DAAEB.1050305@enthought.com> James Bergstra wrote: > On Tue, Mar 2, 2010 at 3:09 PM, Christopher Barker > wrote: > >> James Bergstra wrote: >> >>> Maybe I'm missing something... but I don't think I want to create an array. >>> >>> In [3]: import numpy >>> >>> In [4]: type(numpy.int8()) >>> Out[4]: >>> >>> In [5]: isinstance(numpy.int8(), numpy.ndarray) >>> Out[5]: False >>> >> right, it's a type object: >> >> In [13]: type(np.uint8) >> Out[13]: >> >> > Agreed :) > >>> I want to create one of those numpy.int8 guys. >>> > > np.int8 is a type, and so is numpy.ndarray. And they are different. > There's lots of docs about how to make arrays, but how do I make a > scalar? 
> In [1]: import numpy as np In [2]: x = np.uint8(23) In [3]: x Out[3]: 23 In [4]: type(x) Out[4]: Warren From bergstrj at iro.umontreal.ca Tue Mar 2 19:23:57 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Tue, 2 Mar 2010 19:23:57 -0500 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4B8DAAEB.1050305@enthought.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> <4B8DAAEB.1050305@enthought.com> Message-ID: <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> On Tue, Mar 2, 2010 at 7:18 PM, Warren Weckesser wrote: > James Bergstra wrote: >> On Tue, Mar 2, 2010 at 3:09 PM, Christopher Barker >> wrote: >> >>> James Bergstra wrote: >>> >>>> Maybe I'm missing something... but I don't think I want to create an array. >>>> >>>> In [3]: import numpy >>>> >>>> In [4]: type(numpy.int8()) >>>> Out[4]: >>>> >>>> In [5]: isinstance(numpy.int8(), numpy.ndarray) >>>> Out[5]: False >>>> >>> right, it's a type object: >>> >>> In [13]: type(np.uint8) >>> Out[13]: >>> >>> >> Agreed :) >> >>>> I want to create one of those numpy.int8 guys. >>>> >> >> np.int8 is a type, and so is numpy.ndarray. ?And they are different. >> There's lots of docs about how to make arrays, but how do I make a >> scalar? >> > > In [1]: import numpy as np > > In [2]: x = np.uint8(23) > > In [3]: x > Out[3]: 23 > > In [4]: type(x) > Out[4]: Sorry... again... how do I make such a scalar... *in C* ? What would be the recommended C equivalent of this python code? Are there C type-checking functions for instances of these objects? Are there C functions for converting to and from C scalars? Basically, is there a C API for working with these numpy scalars? -- http://www-etud.iro.umontreal.ca/~bergstrj From dwf at cs.toronto.edu Tue Mar 2 19:32:55 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 2 Mar 2010 19:32:55 -0500 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> <4B8DAAEB.1050305@enthought.com> <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> Message-ID: <4BFD1EF4-2381-45F3-8CD1-56E625107371@cs.toronto.edu> On 2-Mar-10, at 7:23 PM, James Bergstra wrote: > Sorry... again... how do I make such a scalar... *in C* ? What would > be the recommended C equivalent of this python code? Are there C > type-checking functions for instances of these objects? Are there C > functions for converting to and from C scalars? > > Basically, is there a C API for working with these numpy scalars? 
This bit looks relevant: http://projects.scipy.org/numpy/browser/trunk/numpy/core/src/multiarray/scalarapi.c?rev=7560#L565 David From bergstrj at iro.umontreal.ca Tue Mar 2 19:46:43 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Tue, 2 Mar 2010 19:46:43 -0500 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4BFD1EF4-2381-45F3-8CD1-56E625107371@cs.toronto.edu> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> <4B8DAAEB.1050305@enthought.com> <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> <4BFD1EF4-2381-45F3-8CD1-56E625107371@cs.toronto.edu> Message-ID: <7f1eaee31003021646wd0c9636l58344f1caeae955d@mail.gmail.com> On Tue, Mar 2, 2010 at 7:32 PM, David Warde-Farley wrote: > > On 2-Mar-10, at 7:23 PM, James Bergstra wrote: > >> Sorry... again... how do I make such a scalar... *in C* ? ?What would >> be the recommended C equivalent of this python code? ?Are there C >> type-checking functions for instances of these objects? ?Are there C >> functions for converting to and from C scalars? >> >> Basically, is there a C API for working with these numpy scalars? > > > This bit looks relevant: > > http://projects.scipy.org/numpy/browser/trunk/numpy/core/src/multiarray/scalarapi.c?rev=7560#L565 > > David Thanks David, that does look relevant! Thanks for the tip. James -- http://www-etud.iro.umontreal.ca/~bergstrj From warren.weckesser at enthought.com Tue Mar 2 19:48:04 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 02 Mar 2010 18:48:04 -0600 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> <4B8DAAEB.1050305@enthought.com> <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> Message-ID: <4B8DB1C4.6090706@enthought.com> James Bergstra wrote: > On Tue, Mar 2, 2010 at 7:18 PM, Warren Weckesser > wrote: > >> James Bergstra wrote: >> >>> On Tue, Mar 2, 2010 at 3:09 PM, Christopher Barker >>> wrote: >>> >>> >>>> James Bergstra wrote: >>>> >>>> >>>>> Maybe I'm missing something... but I don't think I want to create an array. >>>>> >>>>> In [3]: import numpy >>>>> >>>>> In [4]: type(numpy.int8()) >>>>> Out[4]: >>>>> >>>>> In [5]: isinstance(numpy.int8(), numpy.ndarray) >>>>> Out[5]: False >>>>> >>>>> >>>> right, it's a type object: >>>> >>>> In [13]: type(np.uint8) >>>> Out[13]: >>>> >>>> >>>> >>> Agreed :) >>> >>> >>>>> I want to create one of those numpy.int8 guys. >>>>> >>>>> >>> np.int8 is a type, and so is numpy.ndarray. And they are different. >>> There's lots of docs about how to make arrays, but how do I make a >>> scalar? >>> >>> >> In [1]: import numpy as np >> >> In [2]: x = np.uint8(23) >> >> In [3]: x >> Out[3]: 23 >> >> In [4]: type(x) >> Out[4]: >> > > Sorry... again... how do I make such a scalar... *in C* ? Whoops. So you said, in the subject even. :$ Warren "Oh. That's different... never mind." - E Litella. > What would > be the recommended C equivalent of this python code? 
Are there C > type-checking functions for instances of these objects? Are there C > functions for converting to and from C scalars? > > Basically, is there a C API for working with these numpy scalars? > > From Chris.Barker at noaa.gov Tue Mar 2 20:06:07 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 02 Mar 2010 17:06:07 -0800 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4B8DAAEB.1050305@enthought.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> <4B8DAAEB.1050305@enthought.com> Message-ID: <4B8DB5FF.7030907@noaa.gov> Warren Weckesser wrote: >>>> I want to create one of those numpy.int8 guys. >>>> >> np.int8 is a type, and so is numpy.ndarray. And they are different. >> There's lots of docs about how to make arrays, but how do I make a >> scalar? ah - I see -- you want to make a numpy scalar, not the same as a type, actually. Numpy scalars exist because numpy supports data types that core python does not have, so they are a lot like the built-in float and integer scalars. I'm still curious as to why you might need to make one, usually, C code works with numpy arrays, and you can easily make an array that has one element: In [83]: np.array(4, dtype=np.uint8) Out[83]: array(4, dtype=uint8) (In python). Usually, you have no arrays in C, and but work with the scalars as regular old C types. If you do really need to create a numpy scalar is C, it looks like you can use: PyArrayScalar(void* data, PyArray_Descr* dtype, PyObject* itemsize) I found this in "The Guide to Numpy": http://www.tramy.us/guidetoscipy.html HTH, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at silveregg.co.jp Tue Mar 2 20:51:04 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 03 Mar 2010 10:51:04 +0900 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> Message-ID: <4B8DC088.2070402@silveregg.co.jp> James Bergstra wrote: > > Maybe I'm missing something... but I don't think I want to create an array. Ah, I misunderstood your question. 
There are at least two ways: PyObject *ret; ret = PyArrayScalar_New(UInt8); Or PyObject *ret; PyArray_Descr *typecode; typecode = PyArray_DescrFromType(PyArray_UINT8); ret = PyArray_Scalar(NULL, typecode, NULL); Py_DECREF(typecode); One way to set data in it is: data = PyMem_Malloc(1); *data = 4; ((PyArrayObject*)ret)->data = data; Or simpler, but may be less flexible depending on what you are doing: PyArrayScalar_VAL(ret, UInt8) = 4; This should be documented better somewhere, cheers, David From oliphant at enthought.com Tue Mar 2 20:51:45 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Tue, 2 Mar 2010 19:51:45 -0600 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <7f1eaee31003021646wd0c9636l58344f1caeae955d@mail.gmail.com> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8D7084.5050003@noaa.gov> <7f1eaee31003021612u6f5b0b5ma1398027f5e33918@mail.gmail.com> <4B8DAAEB.1050305@enthought.com> <7f1eaee31003021623j998aa5dyc2cd8ee920319280@mail.gmail.com> <4BFD1EF4-2381-45F3-8CD1-56E625107371@cs.toronto.edu> <7f1eaee31003021646wd0c9636l58344f1caeae955d@mail.gmail.com> Message-ID: <2517064A-95E0-4A2F-87D8-F9052D9554A9@enthought.com> PyArray_Scalar Is the one you want. Travis -- (mobile phone of) Travis Oliphant Enthought, Inc. 1-512-536-1057 http://www.enthought.com On Mar 2, 2010, at 6:46 PM, James Bergstra wrote: > On Tue, Mar 2, 2010 at 7:32 PM, David Warde-Farley > wrote: >> >> On 2-Mar-10, at 7:23 PM, James Bergstra wrote: >> >>> Sorry... again... how do I make such a scalar... *in C* ? What >>> would >>> be the recommended C equivalent of this python code? Are there C >>> type-checking functions for instances of these objects? Are there C >>> functions for converting to and from C scalars? >>> >>> Basically, is there a C API for working with these numpy scalars? >> >> >> This bit looks relevant: >> >> http://projects.scipy.org/numpy/browser/trunk/numpy/core/src/multiarray/scalarapi.c?rev=7560#L565 >> >> David > > Thanks David, that does look relevant! Thanks for the tip. > > James > -- > http://www-etud.iro.umontreal.ca/~bergstrj > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From david at silveregg.co.jp Tue Mar 2 20:58:31 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 03 Mar 2010 10:58:31 +0900 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4B8DC088.2070402@silveregg.co.jp> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <5b8d13221002282244v609cd49bp8fb9439284cc8213@mail.gmail.com> <7f1eaee31003021144m41069e8eg648f31d9703c01c0@mail.gmail.com> <4B8DC088.2070402@silveregg.co.jp> Message-ID: <4B8DC247.7050308@silveregg.co.jp> David Cournapeau wrote: > James Bergstra wrote: > >> >> Maybe I'm missing something... but I don't think I want to create an >> array. > > Ah, I misunderstood your question. There are at least two ways: > > PyObject *ret; > ret = PyArrayScalar_New(UInt8); > > Or > > PyObject *ret; > PyArray_Descr *typecode; > > > typecode = PyArray_DescrFromType(PyArray_UINT8); > ret = PyArray_Scalar(NULL, typecode, NULL); > Py_DECREF(typecode); ^^^^ Sorry, this is wrong, this does not work on my machine, but I am not sure to understand why. 
cheers, David From brennan.williams at visualreservoir.com Tue Mar 2 21:29:09 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Wed, 03 Mar 2010 15:29:09 +1300 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points Message-ID: <4B8DC975.2000502@visualreservoir.com> I'm reading a file which contains a grid definition. Each cell in the grid, apart from having an i,j,k index also has 8 x,y,z coordinates. I'm reading each set of coordinates into a numpy array. I then want to add/append those coordinates to what will be my large "points" array. Due to the orientation/order of the 8 corners of each hexahedral cell I may have to reorder them before adding them to my large points array (not sure about that yet). Should I create a numpy array with nothing in it and then .append to it? But this is probably expensive isn't it as it creates a new copy of the array each time? Or should I create a zero or empty array of sufficient size and then put each set of 8 coordinates into the correct position in that big array? I don't know exactly how big the array will be (some cells are inactive and therefore don't have a geometry defined) but I do know what its maximum size is (ni*nj*nk,3). Thanks Brennan From d.l.goldsmith at gmail.com Tue Mar 2 21:46:00 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 2 Mar 2010 18:46:00 -0800 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <4B8DC975.2000502@visualreservoir.com> References: <4B8DC975.2000502@visualreservoir.com> Message-ID: <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams < brennan.williams at visualreservoir.com> wrote: > I'm reading a file which contains a grid definition. Each cell in the > grid, apart from having an i,j,k index also has 8 x,y,z coordinates. > I'm reading each set of coordinates into a numpy array. I then want to > add/append those coordinates to what will be my large "points" array. > Due to the orientation/order of the 8 corners of each hexahedral cell I > may have to reorder them before adding them to my large points array > (not sure about that yet). > > Should I create a numpy array with nothing in it and then .append to it? > But this is probably expensive isn't it as it creates a new copy of the > array each time? > > Or should I create a zero or empty array of sufficient size and then put > each set of 8 coordinates into the correct position in that big array? > > I don't know exactly how big the array will be (some cells are inactive > and therefore don't have a geometry defined) but I do know what its > maximum size is (ni*nj*nk,3). > Someone will correct me if I'm wrong, but this problem - the "best" way to build a large array whose size is not known beforehand - came up in one of the tutorials at SciPyCon '09 and IIRC the answer was, perhaps surprisingly, build the thing as a Python list (which is optimized for this kind of indeterminate sequence building) and convert to a numpy array when you're done. Isn't that what was recommended, folks? DG > > Thanks > > Brennan > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brennan.williams at visualreservoir.com Tue Mar 2 21:59:02 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Wed, 03 Mar 2010 15:59:02 +1300 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> References: <4B8DC975.2000502@visualreservoir.com> <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> Message-ID: <4B8DD076.7040606@visualreservoir.com> David Goldsmith wrote: > > > On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams > > wrote: > > I'm reading a file which contains a grid definition. Each cell in the > grid, apart from having an i,j,k index also has 8 x,y,z coordinates. > I'm reading each set of coordinates into a numpy array. I then want to > add/append those coordinates to what will be my large "points" array. > Due to the orientation/order of the 8 corners of each hexahedral > cell I > may have to reorder them before adding them to my large points array > (not sure about that yet). > > Should I create a numpy array with nothing in it and then .append > to it? > But this is probably expensive isn't it as it creates a new copy > of the > array each time? > > Or should I create a zero or empty array of sufficient size and > then put > each set of 8 coordinates into the correct position in that big array? > > I don't know exactly how big the array will be (some cells are > inactive > and therefore don't have a geometry defined) but I do know what its > maximum size is (ni*nj*nk,3). > > > Someone will correct me if I'm wrong, but this problem - the "best" > way to build a large array whose size is not known beforehand - came > up in one of the tutorials at SciPyCon '09 and IIRC the answer was, > perhaps surprisingly, build the thing as a Python list (which is > optimized for this kind of indeterminate sequence building) and > convert to a numpy array when you're done. Isn't that what was > recommended, folks? > Build a list of floating point values, then convert to an array and shape accordingly? Or build a list of small arrays and then somehow convert that into a big numpy array? I've got 24 floating point values which I've got in an array of shape (8,3) but I could easily have those in a list rather than an array and then just keep appending each small list of values to a big list and then do the final conversion to the array - I'll try that and see how it goes. 
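For example, a minimal sketch of that list-then-convert approach (the corner
values below are just placeholders standing in for the real file data):

import numpy as np

points = []                          # plain Python list while reading the file
for cell in range(5):                # pretend we read 5 active cells from the file
    # 8 made-up corner coordinates per cell; in practice these come from the grid file
    corners = [[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
    points.extend(corners)           # appending/extending a list is cheap

points = np.array(points)            # single conversion to an ndarray at the end
# points.shape is now (40, 3), i.e. (number_of_cells * 8, 3)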
Brennan > DG > > > > Thanks > > Brennan > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.l.goldsmith at gmail.com Tue Mar 2 22:47:46 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 2 Mar 2010 19:47:46 -0800 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <4B8DD076.7040606@visualreservoir.com> References: <4B8DC975.2000502@visualreservoir.com> <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> <4B8DD076.7040606@visualreservoir.com> Message-ID: <45d1ab481003021947l37e848b4u3703dd5c6afc8f21@mail.gmail.com> On Tue, Mar 2, 2010 at 6:59 PM, Brennan Williams < brennan.williams at visualreservoir.com> wrote: > David Goldsmith wrote: > > > > On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams > > > > wrote: > > > > I'm reading a file which contains a grid definition. Each cell in the > > grid, apart from having an i,j,k index also has 8 x,y,z coordinates. > > I'm reading each set of coordinates into a numpy array. I then want > to > > add/append those coordinates to what will be my large "points" array. > > Due to the orientation/order of the 8 corners of each hexahedral > > cell I > > may have to reorder them before adding them to my large points array > > (not sure about that yet). > > > > Should I create a numpy array with nothing in it and then .append > > to it? > > But this is probably expensive isn't it as it creates a new copy > > of the > > array each time? > > > > Or should I create a zero or empty array of sufficient size and > > then put > > each set of 8 coordinates into the correct position in that big > array? > > > > I don't know exactly how big the array will be (some cells are > > inactive > > and therefore don't have a geometry defined) but I do know what its > > maximum size is (ni*nj*nk,3). > > > > > > Someone will correct me if I'm wrong, but this problem - the "best" > > way to build a large array whose size is not known beforehand - came > > up in one of the tutorials at SciPyCon '09 and IIRC the answer was, > > perhaps surprisingly, build the thing as a Python list (which is > > optimized for this kind of indeterminate sequence building) and > > convert to a numpy array when you're done. Isn't that what was > > recommended, folks? > > > Build a list of floating point values, then convert to an array and > shape accordingly? Or build a list of small arrays and then somehow > convert that into a big numpy array? > My guess is that either way will be better than iteratively "appending" to an existing array. I've got 24 floating point values which I've got in an array of shape > (8,3) but I could easily have those in a list rather than an array and > then just keep appending each small list of values to a big list and > then do the final conversion to the array - I'll try that and see how it > goes. > Great! Be sure to report back. 
:-) Dg > > Brennan > > DG > > > > > > > > Thanks > > > > Brennan > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Wed Mar 3 03:33:56 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 3 Mar 2010 09:33:56 +0100 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4B8DC247.7050308@silveregg.co.jp> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <4B8DC088.2070402@silveregg.co.jp> <4B8DC247.7050308@silveregg.co.jp> Message-ID: <201003030933.56962.faltet@pytables.org> A Wednesday 03 March 2010 02:58:31 David Cournapeau escrigu?: > > PyObject *ret; > > PyArray_Descr *typecode; > > > > > > typecode = PyArray_DescrFromType(PyArray_UINT8); > > ret = PyArray_Scalar(NULL, typecode, NULL); > > Py_DECREF(typecode); > > ^^^^ Sorry, this is wrong, this does not work on my machine, but I am > not sure to understand why. Well, at least the next works in Cython: cdef npy_int8 next cdef dtype int8 cdef object current [...] int8 = PyArray_DescrFromType(NPY_INT8) current = PyArray_Scalar(&next, int8, None) -- Francesc Alted From david at silveregg.co.jp Wed Mar 3 04:40:45 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 03 Mar 2010 18:40:45 +0900 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <201003030933.56962.faltet@pytables.org> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <4B8DC088.2070402@silveregg.co.jp> <4B8DC247.7050308@silveregg.co.jp> <201003030933.56962.faltet@pytables.org> Message-ID: <4B8E2E9D.9010807@silveregg.co.jp> Francesc Alted wrote: > A Wednesday 03 March 2010 02:58:31 David Cournapeau escrigu?: >>> PyObject *ret; >>> PyArray_Descr *typecode; >>> >>> >>> typecode = PyArray_DescrFromType(PyArray_UINT8); >>> ret = PyArray_Scalar(NULL, typecode, NULL); >>> Py_DECREF(typecode); >> ^^^^ Sorry, this is wrong, this does not work on my machine, but I am >> not sure to understand why. > > Well, at least the next works in Cython: > > cdef npy_int8 next > cdef dtype int8 > cdef object current > > [...] > int8 = PyArray_DescrFromType(NPY_INT8) > current = PyArray_Scalar(&next, int8, None) Yes, what does not work is the Py_DECREF on typecode. Maybe I misunderstand the comment on PyArray_Scalar (typecode is not used but cannot be NULL). cheers, David From jesper.webmail at gmail.com Wed Mar 3 09:31:29 2010 From: jesper.webmail at gmail.com (Jesper Larsen) Date: Wed, 3 Mar 2010 15:31:29 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy Message-ID: Hi people, I was wondering about the status of using the standard library multiprocessing module with numpy. I found a cookbook example last updated one year ago which states that: "This page was obsolete as multiprocessing's internals have changed. 
More information will come shortly; a link to this page will then be added back to the Cookbook." http://www.scipy.org/Cookbook/multiprocessing I also found the code that used to be on this page in the cookbook but it does not work any more. So my question is: Is it possible to use numpy arrays as shared arrays in an application using multiprocessing and how do you do it? Best regards, Jesper From patrickmarshwx at gmail.com Wed Mar 3 09:34:13 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Wed, 3 Mar 2010 08:34:13 -0600 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> Message-ID: Greetings I sent this last night but the buildout.txt file was too large so my email was held pending moderation. When I got up this morning I realized that there was a good chance that the moderation queue is overwhelmed and not looked at much so I decided to post links to the out files and resubmit. I apologize if any of you get this twice... Okay, I'm about out of ideas. Hopefully someone on here has an idea as to what might be going on. 1. I am still unable to build the windows superpack using pavement.py, even after making sure I have all the necessary dependencies. However, the good news is that pavement.py now recognizes where the Atlas binaries David provided are located. The output from "paver bdist_wininst" is located here http://patricktmarsh.com/numpy/20100302.paveout.txt. 2. Since I couldn't get pavement.py to work, I decided to try and build Numpy with sse3 support using "python25 setup.py build -c mingw32 bdist_wininst > buildout.txt". (Note, I've also done this for building with Python 2.6.) This works and I'm able to build the windows installer for both Python 2.5 and 2.6. (The output of the build from Python 2.5 is here http://patricktmarsh.com/numpy/20100302.buildout.txt. However, when I try to run the test suite (using "python25 -c 'import numpy; print numpy.__version__; numpy.test();' > testout.txt"), Python 2.5 runs, with failures & errors, whereas Python 2.6 freezes and eventually (Python itself) quits. Since I'm unable to generate the test suite log for Numpy on Python 2.6, I'm working on the (incorrect?) assumption that the freezing when using Python 2.6 corresponds with the failures/errors when using Python 2.5. The Python 2.5 numpy test suite log is located here: http://patricktmarsh.com/numpy/20100302.testout.txt. Most of the errors with my locally build numpy version comes from the test suite being unable to call "matrix". One, of many, examples is attached below. I don't know where else to look anymore. I get the same results when building from trunk and building from the 1.4.x tag (which I downloaded this morning). I've gotten these same results when building on a different laptop (which was a 32bit version of Windows 7 Professional). I should disclose that I'm currently running a 64bit version of Windows 7 Professional. I'm running EPDv5.1.1 32bit for Python 2.5.4 (shipped numpy test suite works fine). I'm running EPDv6.1.1 32bit for Python 2.6.4 (shipped numpy test suite works fine). I've achieved the same errors when using a generic 32bit Python install downloaded from the official Python website. I'm hoping that I've looked at this for so long that I'm missing something obvious. Any thoughts/suggestions at this point would be appreciated. 
====================================================================== ERROR: Test whether matrix.sum(axis=1) preserves orientation. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\lib\site-packages\numpy\core\tests\test_defmatrix.py", line 56, in test_sum M = matrix([[1,2,0,0], NameError: global name 'matrix' is not defined ====================================================================== Patrick -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed Mar 3 09:48:07 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 3 Mar 2010 23:48:07 +0900 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> Message-ID: <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> On Wed, Mar 3, 2010 at 11:34 PM, Patrick Marsh wrote: > Okay, I'm about out of ideas. ?Hopefully someone on here has an idea as to > what might be going on. > 1. ?I am still unable to build the windows superpack using pavement.py, even > after making sure I have all the necessary dependencies. ?However, the good > news is that pavement.py now recognizes where the Atlas binaries David > provided are located. ?The output from "paver bdist_wininst" is located > here?http://patricktmarsh.com/numpy/20100302.paveout.txt. That's a bug in the pavement script - on windows 7, some env variables are necessary to run python correctly, which were not necessary for windows < 7. I will fix this. > 2. ?Since I couldn't get pavement.py to work, I decided to try and build > Numpy with sse3 support using "python25 setup.py build -c mingw32 > bdist_wininst > buildout.txt". ?(Note, I've also done this for building with > Python 2.6.) You should make sure that you are testing the numpy you think you are testing, and always, always remove the previous installed version. The matrix error is most likely due to some stalled files from a previous install David From ralf.gommers at googlemail.com Wed Mar 3 09:48:42 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 3 Mar 2010 22:48:42 +0800 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> Message-ID: On Wed, Mar 3, 2010 at 10:34 PM, Patrick Marsh wrote: > 1. I am still unable to build the windows superpack using pavement.py, > even after making sure I have all the necessary dependencies. However, the > good news is that pavement.py now recognizes where the Atlas binaries David > provided are located. The output from "paver bdist_wininst" is located > here http://patricktmarsh.com/numpy/20100302.paveout.txt. > > 2. Since I couldn't get pavement.py to work, I decided to try and build > Numpy with sse3 support using "python25 setup.py build -c mingw32 > bdist_wininst > buildout.txt". (Note, I've also done this for building with > Python 2.6.) This works and I'm able to build the windows installer for > both Python 2.5 and 2.6. (The output of the build from Python 2.5 is here > http://patricktmarsh.com/numpy/20100302.buildout.txt. 
However, when I try > to run the test suite (using "python25 -c 'import numpy; print > numpy.__version__; numpy.test();' > testout.txt"), Python 2.5 runs, with > failures & errors, whereas Python 2.6 freezes and eventually (Python itself) > quits. Since I'm unable to generate the test suite log for Numpy on Python > 2.6, I'm working on the (incorrect?) assumption that the freezing when using > Python 2.6 corresponds with the failures/errors when using Python 2.5. The > Python 2.5 numpy test suite log is located here: > http://patricktmarsh.com/numpy/20100302.testout.txt. > > It fails on importing tempfile, that may be unrelated to numpy. What happens if you run python in a shell and do "import tempfile"? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed Mar 3 09:51:19 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 03 Mar 2010 08:51:19 -0600 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <45d1ab481003021947l37e848b4u3703dd5c6afc8f21@mail.gmail.com> References: <4B8DC975.2000502@visualreservoir.com> <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> <4B8DD076.7040606@visualreservoir.com> <45d1ab481003021947l37e848b4u3703dd5c6afc8f21@mail.gmail.com> Message-ID: <4B8E7767.8080903@gmail.com> On 03/02/2010 09:47 PM, David Goldsmith wrote: > On Tue, Mar 2, 2010 at 6:59 PM, Brennan Williams > > wrote: > > David Goldsmith wrote: > > > > On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams > > > > >> wrote: > > > > I'm reading a file which contains a grid definition. Each > cell in the > > grid, apart from having an i,j,k index also has 8 x,y,z > coordinates. > > I'm reading each set of coordinates into a numpy array. I > then want to > > add/append those coordinates to what will be my large > "points" array. > > Due to the orientation/order of the 8 corners of each hexahedral > > cell I > > may have to reorder them before adding them to my large > points array > > (not sure about that yet). > > > > Should I create a numpy array with nothing in it and then > .append > > to it? > > But this is probably expensive isn't it as it creates a new copy > > of the > > array each time? > > > > Or should I create a zero or empty array of sufficient size and > > then put > > each set of 8 coordinates into the correct position in that > big array? > > > > I don't know exactly how big the array will be (some cells are > > inactive > > and therefore don't have a geometry defined) but I do know > what its > > maximum size is (ni*nj*nk,3). > > > > > > Someone will correct me if I'm wrong, but this problem - the "best" > > way to build a large array whose size is not known beforehand - came > > up in one of the tutorials at SciPyCon '09 and IIRC the answer was, > > perhaps surprisingly, build the thing as a Python list (which is > > optimized for this kind of indeterminate sequence building) and > > convert to a numpy array when you're done. Isn't that what was > > recommended, folks? > > > Build a list of floating point values, then convert to an array and > shape accordingly? Or build a list of small arrays and then somehow > convert that into a big numpy array? > > > My guess is that either way will be better than iteratively > "appending" to an existing array. 
> > Hi, Christopher Barker provided some code last last year on appending ndarrays eg: http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046634.html A lot depends on your final usage of the array otherwise there are no suitable suggestions. That is do you need just to index the array using i, j, k indices (this gives you either an i by j by k array that contains the x, y, z coordinates) or do you also need to index the x, y, z coordinates as well (giving you an i by j by k by x by y by z array). If it is just plain storage then perhaps just a Python list, dict or sqlite object may be sufficient. There are also time and memory constraints as you can spend large effort just to get the input into a suitable format and memory usage. If you use a secondary storage like a Python list then you need memory to storage the list, the ndarray and all intermediate components and overheads. If you use scipy then you should look at using sparse arrays where space is only added as you need it. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Wed Mar 3 09:53:12 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Wed, 3 Mar 2010 08:53:12 -0600 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> Message-ID: On Wed, Mar 3, 2010 at 8:48 AM, Ralf Gommers wrote: > > > On Wed, Mar 3, 2010 at 10:34 PM, Patrick Marsh wrote: > >> 1. I am still unable to build the windows superpack using pavement.py, >> even after making sure I have all the necessary dependencies. However, the >> good news is that pavement.py now recognizes where the Atlas binaries David >> provided are located. The output from "paver bdist_wininst" is located >> here http://patricktmarsh.com/numpy/20100302.paveout.txt. >> >> 2. Since I couldn't get pavement.py to work, I decided to try and build >> Numpy with sse3 support using "python25 setup.py build -c mingw32 >> bdist_wininst > buildout.txt". (Note, I've also done this for building with >> Python 2.6.) This works and I'm able to build the windows installer for >> both Python 2.5 and 2.6. (The output of the build from Python 2.5 is here >> http://patricktmarsh.com/numpy/20100302.buildout.txt. However, when I >> try to run the test suite (using "python25 -c 'import numpy; print >> numpy.__version__; numpy.test();' > testout.txt"), Python 2.5 runs, with >> failures & errors, whereas Python 2.6 freezes and eventually (Python itself) >> quits. Since I'm unable to generate the test suite log for Numpy on Python >> 2.6, I'm working on the (incorrect?) assumption that the freezing when using >> Python 2.6 corresponds with the failures/errors when using Python 2.5. The >> Python 2.5 numpy test suite log is located here: >> http://patricktmarsh.com/numpy/20100302.testout.txt. >> >> It fails on importing tempfile, that may be unrelated to numpy. What > happens if you run python in a shell and do "import tempfile"? > > Tempfile imports with no errors with both Python 2.5.4 and Python 2.6.4. > Cheers, > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Patrick Marsh Ph.D. 
Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Wed Mar 3 10:22:47 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Wed, 3 Mar 2010 09:22:47 -0600 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> Message-ID: On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau wrote: > On Wed, Mar 3, 2010 at 11:34 PM, Patrick Marsh > wrote: > > > Okay, I'm about out of ideas. Hopefully someone on here has an idea as > to > > what might be going on. > > 1. I am still unable to build the windows superpack using pavement.py, > even > > after making sure I have all the necessary dependencies. However, the > good > > news is that pavement.py now recognizes where the Atlas binaries David > > provided are located. The output from "paver bdist_wininst" is located > > here http://patricktmarsh.com/numpy/20100302.paveout.txt. > > That's a bug in the pavement script - on windows 7, some env variables > are necessary to run python correctly, which were not necessary for > windows < 7. I will fix this. > Thanks! > > > 2. Since I couldn't get pavement.py to work, I decided to try and build > > Numpy with sse3 support using "python25 setup.py build -c mingw32 > > bdist_wininst > buildout.txt". (Note, I've also done this for building > with > > Python 2.6.) > > You should make sure that you are testing the numpy you think you are > testing, and always, always remove the previous installed version. The > matrix error is most likely due to some stalled files from a previous > install > > David > Okay, I had been removing the build and dist directories but didn't realize I needed to remove the numpy directory in the site-packages directory. Deleting this last directory fixed the "matrix" issues and I'm now left with the two failures. The latter failure doesn't seem to really be an issue to me and the first one is the same error that Ralf posted earlier - so for Python 2.5, I've got it working. However, Python 2.6.4 still freezes on the test suite. I'll have to look more into this today, but for reference, has anyone successfully built Numpy from the 1.4.x branch, on Windows 7, using Python 2.6.4? I'm going to attempt to get my hands on a Windows XP box today and try to build it there, but I don't know when/if I'll be able to get the XP box. 
Thanks for the help ====================================================================== FAIL: test_special_values (test_umath_complex.TestClog) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\core\tests\test_umath_complex.py", line 179, in test_special_values assert_almost_equal(np.log(x), y) File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 437, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [ NaN+2.35619449j] DESIRED: (1.#INF+2.35619449019j) ====================================================================== FAIL: test_doctests (test_polynomial.TestDocs) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python25\Lib\site-packages\numpy\lib\tests\test_polynomial.py", line 90, in test_doctests return rundocs() File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 953, in rundocs raise AssertionError("Some doctests failed:\n%s" % "\n".join(msg)) AssertionError: Some doctests failed: ********************************************************************** File "C:\Python25\lib\site-packages\numpy\lib\tests\test_polynomial.py", line 20, in test_polynomial Failed example: print poly1d([100e-90, 1.234567e-9j+3, -1234.999e8]) Expected: 2 1e-88 x + (3 + 1.235e-09j) x - 1.235e+11 Got: 2 1e-088 x + (3 + 1.235e-009j) x - 1.235e+011 ---------------------------------------------------------------------- Ran 2334 tests in 10.175s FAILED (KNOWNFAIL=7, SKIP=1, failures=2) Patrick -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Wed Mar 3 10:35:51 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 3 Mar 2010 16:35:51 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: References: Message-ID: <201003031635.51969.faltet@pytables.org> A Wednesday 03 March 2010 15:31:29 Jesper Larsen escrigu?: > Hi people, > > I was wondering about the status of using the standard library > multiprocessing module with numpy. I found a cookbook example last > updated one year ago which states that: > > "This page was obsolete as multiprocessing's internals have changed. > More information will come shortly; a link to this page will then be > added back to the Cookbook." > > http://www.scipy.org/Cookbook/multiprocessing > > I also found the code that used to be on this page in the cookbook but > it does not work any more. So my question is: > > Is it possible to use numpy arrays as shared arrays in an application > using multiprocessing and how do you do it? Yes, it is pretty easy if your problem can be vectorised. Just split your arrays in chunks and assign the computation of each chunk to a different process. I'm attaching a code that does this for computing a polynomial on a certain range. Here it is the output (for a dual-core processor): Serial computation... 10000000 0 Time elapsed in serial computation: 3.438 3333333 0 3333334 1 3333333 2 Time elapsed in parallel computation: 2.271 with 3 threads Speed-up: 1.51x -- Francesc Alted -------------- next part -------------- A non-text attachment was scrubbed... 
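The attachment itself was scrubbed from the archive, but a rough, self-contained
sketch of the chunk-per-process idea described above (illustrative polynomial and
chunk handling, not the original poly-mp.py) looks like this:

import numpy as np
from multiprocessing import Pool

def poly(x):
    # stand-in polynomial evaluated on one chunk of the array
    return 0.25*x**3 + 0.75*x**2 - 1.5*x - 2.0

if __name__ == '__main__':
    x = np.linspace(-1, 1, 10*1000*1000)   # the full range to evaluate
    nprocs = 3
    chunks = np.array_split(x, nprocs)     # split the array into one chunk per process
    pool = Pool(processes=nprocs)
    results = pool.map(poly, chunks)       # each chunk is evaluated in a separate process
    y = np.concatenate(results)            # stitch the partial results back together
    pool.close()
    pool.join()

Note that the chunks are pickled to and from the worker processes, so this trades
memory copies for parallelism rather than truly sharing the array between processes.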
Name: poly-mp.py Type: text/x-python Size: 989 bytes Desc: not available URL: From cgohlke at uci.edu Wed Mar 3 10:56:31 2010 From: cgohlke at uci.edu (Christoph Gohlke) Date: Wed, 03 Mar 2010 07:56:31 -0800 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> Message-ID: <4B8E86AF.3050204@uci.edu> Numpy 1.4.0 svn rev 8270 builds (with setup.py) and tests OK on Windows 7 using Python 2.6.4. The only test failure is test_special_values (test_umath_complex.TestClog). - Christoph On 3/3/2010 7:22 AM, Patrick Marsh wrote: > On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau > wrote: > > On Wed, Mar 3, 2010 at 11:34 PM, Patrick Marsh > > wrote: > > > Okay, I'm about out of ideas. Hopefully someone on here has an > idea as to > > what might be going on. > > 1. I am still unable to build the windows superpack using > pavement.py, even > > after making sure I have all the necessary dependencies. However, > the good > > news is that pavement.py now recognizes where the Atlas binaries David > > provided are located. The output from "paver bdist_wininst" is > located > > here http://patricktmarsh.com/numpy/20100302.paveout.txt. > > That's a bug in the pavement script - on windows 7, some env variables > are necessary to run python correctly, which were not necessary for > windows < 7. I will fix this. > > > > Thanks! > > > > 2. Since I couldn't get pavement.py to work, I decided to try and > build > > Numpy with sse3 support using "python25 setup.py build -c mingw32 > > bdist_wininst > buildout.txt". (Note, I've also done this for > building with > > Python 2.6.) > > You should make sure that you are testing the numpy you think you are > testing, and always, always remove the previous installed version. The > matrix error is most likely due to some stalled files from a previous > install > > David > > > Okay, I had been removing the build and dist directories but didn't > realize I needed to remove the numpy directory in the site-packages > directory. Deleting this last directory fixed the "matrix" issues and > I'm now left with the two failures. The latter failure doesn't seem to > really be an issue to me and the first one is the same error that Ralf > posted earlier - so for Python 2.5, I've got it working. However, > Python 2.6.4 still freezes on the test suite. I'll have to look more > into this today, but for reference, has anyone successfully built Numpy > from the 1.4.x branch, on Windows 7, using Python 2.6.4? I'm going to > attempt to get my hands on a Windows XP box today and try to build it > there, but I don't know when/if I'll be able to get the XP box. 
> > Thanks for the help > > > ====================================================================== > FAIL: test_special_values (test_umath_complex.TestClog) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "C:\Python25\Lib\site-packages\numpy\core\tests\test_umath_complex.py", > line 179, in test_special_values > assert_almost_equal(np.log(x), y) > File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line > 437, in assert_almost_equal > "DESIRED: %s\n" % (str(actual), str(desired))) > AssertionError: Items are not equal: > ACTUAL: [ NaN+2.35619449j] > DESIRED: (1.#INF+2.35619449019j) > > > ====================================================================== > FAIL: test_doctests (test_polynomial.TestDocs) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "C:\Python25\Lib\site-packages\numpy\lib\tests\test_polynomial.py", > line 90, in test_doctests > return rundocs() > File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line > 953, in rundocs > raise AssertionError("Some doctests failed:\n%s" % "\n".join(msg)) > AssertionError: Some doctests failed: > ********************************************************************** > File "C:\Python25\lib\site-packages\numpy\lib\tests\test_polynomial.py", > line 20, in test_polynomial > Failed example: > print poly1d([100e-90, 1.234567e-9j+3, -1234.999e8]) > Expected: > 2 > 1e-88 x + (3 + 1.235e-09j) x - 1.235e+11 > Got: > 2 > 1e-088 x + (3 + 1.235e-009j) x - 1.235e+011 > > > ---------------------------------------------------------------------- > Ran 2334 tests in 10.175s > > FAILED (KNOWNFAIL=7, SKIP=1, failures=2) > > > Patrick > -- From bergstrj at iro.umontreal.ca Wed Mar 3 12:06:01 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Wed, 3 Mar 2010 12:06:01 -0500 Subject: [Numpy-discussion] how to work with numpy.int8 in c In-Reply-To: <4B8E2E9D.9010807@silveregg.co.jp> References: <7f1eaee31002282035m1bd9dcc7p110accad7dbc1756@mail.gmail.com> <4B8DC088.2070402@silveregg.co.jp> <4B8DC247.7050308@silveregg.co.jp> <201003030933.56962.faltet@pytables.org> <4B8E2E9D.9010807@silveregg.co.jp> Message-ID: <7f1eaee31003030906p3eff3e0fg7eebb192e5d661fb@mail.gmail.com> Thanks all for your help, I think I'm on my way again. The catch in the first place was not being confident that a PyArray_Scalar was the thing I needed. I grep'd the code for uint8, int8 and so on and could not find their definitions. On first reading I overlooked the PyArray_Scalar link in this section: http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html#scalararraytypes But now that it has been pointed out, I think the reference docs are good enough for me to do what I wanted. So the docs are pretty good. I guess the only thing that I managed to miss was the initial connection that a numpy.int8(6) is a PyArray_Scalar instance. Seems obvious enough in hindsight... On Wed, Mar 3, 2010 at 4:40 AM, David Cournapeau wrote: > Francesc Alted wrote: >> A Wednesday 03 March 2010 02:58:31 David Cournapeau escrigu?: >>>> PyObject *ret; >>>> PyArray_Descr *typecode; >>>> >>>> >>>> typecode = PyArray_DescrFromType(PyArray_UINT8); >>>> ret = PyArray_Scalar(NULL, typecode, NULL); >>>> Py_DECREF(typecode); >>> ^^^^ Sorry, this is wrong, this does not work on my machine, but I am >>> not sure to understand why. 
>> >> Well, at least the next works in Cython: >> >> cdef npy_int8 next >> cdef dtype int8 >> cdef object current >> >> [...] >> int8 = PyArray_DescrFromType(NPY_INT8) >> current = PyArray_Scalar(&next, int8, None) > > Yes, what does not work is the Py_DECREF on typecode. Maybe I > misunderstand the comment on PyArray_Scalar (typecode is not used but > cannot be NULL). > > cheers, > > David > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- http://www-etud.iro.umontreal.ca/~bergstrj From brennan.williams at visualreservoir.com Wed Mar 3 15:05:43 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Thu, 04 Mar 2010 09:05:43 +1300 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <4B8E7767.8080903@gmail.com> References: <4B8DC975.2000502@visualreservoir.com> <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> <4B8DD076.7040606@visualreservoir.com> <45d1ab481003021947l37e848b4u3703dd5c6afc8f21@mail.gmail.com> <4B8E7767.8080903@gmail.com> Message-ID: <4B8EC117.2070202@visualreservoir.com> Bruce Southey wrote: > On 03/02/2010 09:47 PM, David Goldsmith wrote: >> On Tue, Mar 2, 2010 at 6:59 PM, Brennan Williams >> > > wrote: >> >> David Goldsmith wrote: >> > >> > On Tue, Mar 2, 2010 at 6:29 PM, Brennan Williams >> > > >> > > >> wrote: >> > >> > I'm reading a file which contains a grid definition. Each >> cell in the >> > grid, apart from having an i,j,k index also has 8 x,y,z >> coordinates. >> > I'm reading each set of coordinates into a numpy array. I >> then want to >> > add/append those coordinates to what will be my large >> "points" array. >> > Due to the orientation/order of the 8 corners of each >> hexahedral >> > cell I >> > may have to reorder them before adding them to my large >> points array >> > (not sure about that yet). >> > >> > Should I create a numpy array with nothing in it and then >> .append >> > to it? >> > But this is probably expensive isn't it as it creates a new >> copy >> > of the >> > array each time? >> > >> > Or should I create a zero or empty array of sufficient size and >> > then put >> > each set of 8 coordinates into the correct position in that >> big array? >> > >> > I don't know exactly how big the array will be (some cells are >> > inactive >> > and therefore don't have a geometry defined) but I do know >> what its >> > maximum size is (ni*nj*nk,3). >> > >> > >> > Someone will correct me if I'm wrong, but this problem - the "best" >> > way to build a large array whose size is not known beforehand - >> came >> > up in one of the tutorials at SciPyCon '09 and IIRC the answer was, >> > perhaps surprisingly, build the thing as a Python list (which is >> > optimized for this kind of indeterminate sequence building) and >> > convert to a numpy array when you're done. Isn't that what was >> > recommended, folks? >> > >> Build a list of floating point values, then convert to an array and >> shape accordingly? Or build a list of small arrays and then somehow >> convert that into a big numpy array? >> >> >> My guess is that either way will be better than iteratively >> "appending" to an existing array. >> >> > Hi, > Christopher Barker provided some code last last year on appending > ndarrays eg: > http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046634.html > > A lot depends on your final usage of the array otherwise there are no > suitable suggestions. 
That is do you need just to index the array > using i, j, k indices (this gives you either an i by j by k array that > contains the x, y, z coordinates) or do you also need to index the x, > y, z coordinates as well (giving you an i by j by k by x by y by z > array). If it is just plain storage then perhaps just a Python list, > dict or sqlite object may be sufficient. > Ultimately I'm trying to build a tvtk unstructured grid to view in a Traits/tvtk/Mayavi app. The grid is ni*nj*nk cells with 8 xyz's per cell (hexahedral cell with 6 faces). However some cells are inactive and therefore don't have geometry. Cells also have "connectivity" to other cells, usually to adjacent cells (e.g. cell i,j,k connected to cell i-1,j,k) but not always. I'll post more comments/questions as I go. Brennan > There are also time and memory constraints as you can spend large > effort just to get the input into a suitable format and memory usage. > If you use a secondary storage like a Python list then you need memory > to storage the list, the ndarray and all intermediate components and > overheads. > > If you use scipy then you should look at using sparse arrays where > space is only added as you need it. > > > Bruce > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From marcotuckner at public-files.de Wed Mar 3 15:09:21 2010 From: marcotuckner at public-files.de (Marco Tuckner) Date: Wed, 03 Mar 2010 21:09:21 +0100 Subject: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries) Message-ID: Hello, am using the scikit.timeseries to convert a hourly timeseries to a lower frequency unsing the appropriate function [1]. When I compare the result to the values calculated with a Pivot table in Excel there is a difference in the values which reaches quite high values in the total sum of all monthly values. I found out that the differnec arises from different decimal settings: In Python the numbers show: 12.88888888 whereas in Excel I see: 12.8888888888888 The difference due to the different decimals is small for single values and accumulates to a 2-digit number for the total of all values. * Why do these differences arise? * What can I do to achive comparable values? Thanks in advance for any hint, Marco [1] http://pytseries.sourceforge.net/generated/scikits.timeseries.convert.html P.S.: Sorry if this is a numpy question but as I was using the scikit I though this is the right forum. From robert.kern at gmail.com Wed Mar 3 15:33:41 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Mar 2010 14:33:41 -0600 Subject: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries) In-Reply-To: References: Message-ID: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> On Wed, Mar 3, 2010 at 14:09, Marco Tuckner wrote: > Hello, > am using the scikit.timeseries to convert a hourly timeseries to a lower > frequency unsing the appropriate function [1]. > > When I compare the result to the values calculated with a Pivot table in > Excel there is a difference in the values which reaches quite high > values in the total sum of all monthly values. 
> > I found out that the differnec arises from different decimal settings: > > In Python the numbers show: > 12.88888888 > > whereas in Excel I see: > 12.8888888888888 > > The difference due to the different decimals is small for single values > and accumulates to a 2-digit number for the total of all values. > > * Why do these differences arise? > * What can I do to achive comparable values? We default to printing only eight decimal digits for floating point values for convenience. There are more under the covers. Use numpy.set_printoptions(precision=16) to see all of them. If you are still seeing actual calculation differences, we will need to see a complete, self-contained example that demonstrates the difference. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Wed Mar 3 15:50:15 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 03 Mar 2010 22:50:15 +0200 Subject: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries) In-Reply-To: References: Message-ID: <1267649415.5550.4.camel@idol> ke, 2010-03-03 kello 21:09 +0100, Marco Tuckner kirjoitti: > am using the scikit.timeseries to convert a hourly timeseries to a lower > frequency unsing the appropriate function [1]. > > When I compare the result to the values calculated with a Pivot table in > Excel there is a difference in the values which reaches quite high > values in the total sum of all monthly values. > > I found out that the differnec arises from different decimal settings: > > In Python the numbers show: > 12.88888888 > > whereas in Excel I see: > 12.8888888888888 Typically, the internal precision used in Python and Numpy is significantly more than what is printed. Most likely, your problem has a different cause. Are you sure Excel is using a high enough accuracy? If you want more help, it would be useful to post a self-contained code that demonstrates the error. -- Pauli Virtanen From Chris.Barker at noaa.gov Wed Mar 3 16:06:48 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Mar 2010 13:06:48 -0800 Subject: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries) In-Reply-To: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> References: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> Message-ID: <4B8ECF68.2030704@noaa.gov> Robert Kern wrote: > On Wed, Mar 3, 2010 at 14:09, Marco Tuckner >> In Python the numbers show: >> 12.88888888 >> >> whereas in Excel I see: >> 12.8888888888888 > If you are still seeing actual calculation differences, we will need > to see a complete, self-contained example that demonstrates the > difference. To add a bit more detail -- unless you are explicitly specifying single precision floats (dtype=float32), then both numpy and excel are using doubles -- so that's not the source of the differences. Even if you are using single precision in numpy, It's pretty rare for that to make a significant difference. Something else is going on. I suspect a different algorithm, you can tell timeseries.convert how you want it to interpolate -- who knows what excel is doing. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Mar 3 16:22:42 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Mar 2010 13:22:42 -0800 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <4B8E7767.8080903@gmail.com> References: <4B8DC975.2000502@visualreservoir.com> <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> <4B8DD076.7040606@visualreservoir.com> <45d1ab481003021947l37e848b4u3703dd5c6afc8f21@mail.gmail.com> <4B8E7767.8080903@gmail.com> Message-ID: <4B8ED322.4090208@noaa.gov> Bruce Southey wrote: > Christopher Barker provided some code last last year on appending > ndarrays eg: > http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046634.html yup, I"d love someone else to pick that up and test/improve it. Anyway, that code only handles 1-d arrays, though that can be structured arrays. I"d like to extend it to handlw n-d arrays, though you could only grow them in the first dimension, which may work for your case. As for performance: My numpy code is a bit slower than using python lists, if you add elements one at a time, and the elements are a standard python data type. It should use less memory though, if that matters. If you add the data in big enough chunks, my method gets better performance. > Ultimately I'm trying to build a tvtk unstructured grid to view in a > Traits/tvtk/Mayavi app. I'd love to see that working, once you've got it! > The grid is ni*nj*nk cells with 8 xyz's per cell > (hexahedral cell with 6 faces). However some cells are inactive and > therefore don't have geometry. Cells also have "connectivity" to other > cells, usually to adjacent cells (e.g. cell i,j,k connected to cell > i-1,j,k) but not always. I'm confused now -- what does the array need to look like in the end? Maybe: ni*nj*nk X 8 X 3 ? How is inactive indicated? Is the connectivity somehow in the same array, or is that stored separately? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From samtygier at yahoo.co.uk Wed Mar 3 16:14:11 2010 From: samtygier at yahoo.co.uk (sam tygier) Date: Wed, 03 Mar 2010 21:14:11 +0000 Subject: [Numpy-discussion] dtype for a single char Message-ID: Hello today i was caught out by trying to use 'a' as a dtype for a single character. a simple example would be: >>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a"), ("number", "i")]) array([('', 1), ('', 2), ('', 3)], dtype=[('letter', '|S0'), ('number', '>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a1"), ("number", "i")]) array([('a', 1), ('b', 2), ('c', 3)], dtype=[('letter', '|S1'), ('number', ' References: Message-ID: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> On Wed, Mar 3, 2010 at 15:14, sam tygier wrote: > Hello > > today i was caught out by trying to use 'a' as a dtype for a single character. a simple example would be: > >>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a"), ("number", "i")]) > array([('', 1), ('', 2), ('', 3)], > ? ? 
?dtype=[('letter', '|S0'), ('number', ' > the fix seems to be using 'a1' instead > >>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a1"), ("number", "i")]) > array([('a', 1), ('b', 2), ('c', 3)], > ? ? ?dtype=[('letter', '|S1'), ('number', ' > this seems odd to me, as other types eg 'i' and 'f' can be used on their own. is there a reason for this? Other types have a sensible default determined by the platform. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From marcotuckner at public-files.de Wed Mar 3 17:23:59 2010 From: marcotuckner at public-files.de (Marco Tuckner) Date: Wed, 03 Mar 2010 23:23:59 +0100 Subject: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries) In-Reply-To: <4B8ECF68.2030704@noaa.gov> References: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> <4B8ECF68.2030704@noaa.gov> Message-ID: Thanks to all who answered. This is really helpful! >> If you are still seeing actual calculation differences, we will >> need to see a complete, self-contained example that demonstrates >> the difference. > > To add a bit more detail -- unless you are explicitly specifying > single precision floats (dtype=float32), then both numpy and excel > are using doubles -- so that's not the source of the differences. > Even if you are using single precision in numpy, It's pretty rare for > that to make a significant difference. Something else is going on. > > I suspect a different algorithm, you can tell timeseries.convert how > you want it to interpolate -- who knows what excel is doing. I checked the values row by row comparing Excel against the Python results. The the values of both programs match perfectly at the data points where no periodic sequence occurs: so those values where the aggregated value results in a straight value (e.g. 12.04) the results were the same. At values points where the result was a periodic sequence (e.g. 12.222222 ...) the described difference could be observed. I will try to create a self contained example tomorrow. Thanks a lot and kind regards, Marco From marcotuckner at public-files.de Wed Mar 3 17:26:39 2010 From: marcotuckner at public-files.de (Marco Tuckner) Date: Wed, 03 Mar 2010 23:26:39 +0100 Subject: [Numpy-discussion] Recommendation on reference software [Re: setting decimal accuracy ... ] In-Reply-To: <4B8ECF68.2030704@noaa.gov> References: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> <4B8ECF68.2030704@noaa.gov> Message-ID: > I suspect a different algorithm, you can tell timeseries.convert how you > want it to interpolate -- who knows what excel is doing. does this mean that you are questioning Excel or more neutrally Spreadsheet programs? What software would you recommend as reference to test Python packages against? R-Project? Best regards, Marco From robert.kern at gmail.com Wed Mar 3 17:29:49 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Mar 2010 16:29:49 -0600 Subject: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries) In-Reply-To: References: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> <4B8ECF68.2030704@noaa.gov> Message-ID: <3d375d731003031429u5a98dca7p9fd22533b4b5281c@mail.gmail.com> On Wed, Mar 3, 2010 at 16:23, Marco Tuckner wrote: > Thanks to all who answered. > This is really helpful! 
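As a small illustration of the printing-versus-stored-precision point Robert and Pauli make above (a sketch only; the 12.888... value here is invented for illustration and is not Marco's data):

import numpy

a = numpy.array([116.0]) / 9.0     # 12.888... periodic, stored as a full C double
print a                            # prints [ 12.88888889], only the display is rounded
numpy.set_printoptions(precision=16)
print a                            # now shows the digits that are actually stored
print repr(a[0])                   # the scalar repr shows them as well

If the totals still disagree after accounting for the display rounding, the difference has to come from the calculation itself, which is why a self-contained example is being asked for.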
> >>> If you are still seeing actual calculation differences, we will >>> need to see a complete, self-contained example that demonstrates >>> the difference. >> >> To add a bit more detail -- unless you are explicitly specifying >> single precision floats (dtype=float32), then both numpy and excel >> are using doubles -- so that's not the source of the differences. >> Even if you are using single precision in numpy, It's pretty rare for >> that to make a significant difference. Something else is going on. >> >> I suspect a different algorithm, you can tell timeseries.convert how >> you want it to interpolate -- who knows what excel is doing. > I checked the values row by row comparing Excel against the Python results. > > The the values of both programs match perfectly at the data points where > no periodic sequence occurs: > so those values where the aggregated value results in a straight value > (e.g. 12.04) the results were the same. > At values points where the result was a periodic sequence (e.g. > 12.222222 ...) the described difference could be observed. I think you are just seeing the effect of the different printing that I described. These are not differences in the actual values. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Mar 3 17:44:27 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Mar 2010 16:44:27 -0600 Subject: [Numpy-discussion] Recommendation on reference software [Re: setting decimal accuracy ... ] In-Reply-To: References: <3d375d731003031233v354b41bbx50f8b2d51c609006@mail.gmail.com> <4B8ECF68.2030704@noaa.gov> Message-ID: <3d375d731003031444n6b301578h1212f989c2f6c100@mail.gmail.com> On Wed, Mar 3, 2010 at 16:26, Marco Tuckner wrote: >> I suspect a different algorithm, you can tell timeseries.convert how you >> want it to interpolate -- who knows what excel is doing. > does this mean that you are questioning Excel or more neutrally > Spreadsheet programs? We are not questioning its accuracy, not yet at least. You just haven't told us exactly what calculation that you are trying to make it do. > What software would you recommend as reference to test Python packages > against? > > R-Project? Sometimes. It depends on exactly what you are trying to test. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at silveregg.co.jp Wed Mar 3 21:12:58 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 04 Mar 2010 11:12:58 +0900 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> Message-ID: <4B8F172A.90802@silveregg.co.jp> Patrick Marsh wrote: > On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau > wrote: > > That's a bug in the pavement script - on windows 7, some env variables > are necessary to run python correctly, which were not necessary for > windows < 7. I will fix this. This is fixed in both trunk and 1.4.x now. I have not tested it, though. > > Okay, I had been removing the build and dist directories but didn't > realize I needed to remove the numpy directory in the site-packages > directory. 
Deleting this last directory fixed the "matrix" issues and > I'm now left with the two failures. The latter failure doesn't seem to > really be an issue to me and the first one is the same error that Ralf > posted earlier - so for Python 2.5, I've got it working. However, > Python 2.6.4 still freezes on the test suite. I'll have to look more > into this today, but for reference, has anyone successfully built Numpy > from the 1.4.x branch, on Windows 7, using Python 2.6.4? This is almost always a problem with the C runtime. Those are a big PITA to debug/understand/fix. You built this with Mingw, right ? The first thing to check is whether you have several C runtimes loaded: you can check this with the problem depends.exe: http://www.dependencywalker.com I will try to look at this myself - I have only attempted Visual Studio builds on Windows 7 so far, cheers, David From dwf at cs.toronto.edu Thu Mar 4 00:13:47 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 4 Mar 2010 00:13:47 -0500 Subject: [Numpy-discussion] dtype for a single char In-Reply-To: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> References: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> Message-ID: <85D8C96E-0F7B-41B2-BCE0-F63CA8746FAA@cs.toronto.edu> On 3-Mar-10, at 4:56 PM, Robert Kern wrote: > Other types have a sensible default determined by the platform. Yes, and the 'S0' type isn't terribly sensible, if only because of this issue: http://projects.scipy.org/numpy/ticket/1239 David From nadavh at visionsense.com Thu Mar 4 03:54:24 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 04 Mar 2010 10:54:24 +0200 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: References: Message-ID: <1267692864.26747.3.camel@nadav.envision.co.il> There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in the cookbook page. I am into the same issue and going to test it today. Nadav On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > Hi people, > > I was wondering about the status of using the standard library > multiprocessing module with numpy. I found a cookbook example last > updated one year ago which states that: > > "This page was obsolete as multiprocessing's internals have changed. > More information will come shortly; a link to this page will then be > added back to the Cookbook." > > http://www.scipy.org/Cookbook/multiprocessing > > I also found the code that used to be on this page in the cookbook but > it does not work any more. So my question is: > > Is it possible to use numpy arrays as shared arrays in an application > using multiprocessing and how do you do it? > > Best regards, > Jesper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From samtygier at yahoo.co.uk Thu Mar 4 03:34:06 2010 From: samtygier at yahoo.co.uk (sam tygier) Date: Thu, 04 Mar 2010 08:34:06 +0000 Subject: [Numpy-discussion] dtype for a single char In-Reply-To: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> References: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> Message-ID: Robert Kern wrote: > On Wed, Mar 3, 2010 at 15:14, sam tygier wrote: >> Hello >> >> today i was caught out by trying to use 'a' as a dtype for a single character. 
a simple example would be: >> >>>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a"), ("number", "i")]) >> array([('', 1), ('', 2), ('', 3)], >> dtype=[('letter', '|S0'), ('number', '> >> the fix seems to be using 'a1' instead >> >>>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a1"), ("number", "i")]) >> array([('a', 1), ('b', 2), ('c', 3)], >> dtype=[('letter', '|S1'), ('number', '> >> this seems odd to me, as other types eg 'i' and 'f' can be used on their own. is there a reason for this? > > Other types have a sensible default determined by the platform. > I don't understand. char has sensible default size on my platform (and all the others i am familiar with), 1 byte. Sam From nadavh at visionsense.com Thu Mar 4 04:55:52 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 4 Mar 2010 11:55:52 +0200 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy References: <1267692864.26747.3.camel@nadav.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763605AD321@ex3.envision.co.il> Maybe the attached file can help. Adpted and tested on amd64 linux Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh Sent: Thu 04-Mar-10 10:54 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in the cookbook page. I am into the same issue and going to test it today. Nadav On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > Hi people, > > I was wondering about the status of using the standard library > multiprocessing module with numpy. I found a cookbook example last > updated one year ago which states that: > > "This page was obsolete as multiprocessing's internals have changed. > More information will come shortly; a link to this page will then be > added back to the Cookbook." > > http://www.scipy.org/Cookbook/multiprocessing > > I also found the code that used to be on this page in the cookbook but > it does not work any more. So my question is: > > Is it possible to use numpy arrays as shared arrays in an application > using multiprocessing and how do you do it? > > Best regards, > Jesper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 5109 bytes Desc: not available URL: From eadrogue at gmx.net Thu Mar 4 05:19:09 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Thu, 4 Mar 2010 11:19:09 +0100 Subject: [Numpy-discussion] combinatorics Message-ID: <20100304101909.GA7328@doriath.local> Hello everybody, Suppose I want to find all 2-digit numbers whose first digit is either 4 or 5, the second digit being 7, 8 or 9. Is there a Numpy/Scipy function to calculate that kind of combinations? I came up with this function, the problem is it uses recursion: def g(sets): if len(sets) < 2: return sets return [[i] + j for i in sets[0] for j in f(sets[1:])] In [157]: g([[4,5],[7,8,9]]) Out[157]: [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]] Is important that it works with more than two sets too. 
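One non-recursive way to build this kind of product with numpy index tricks, sketched here for reference (the function name cartesian is made up; itertools.product from the standard library, which comes up later in the thread, is the simpler route on Python 2.6 and newer):

import numpy

def cartesian(sets):
    # index grid over all the sets, one row per combination, then read the elements back out
    idx = numpy.indices([len(s) for s in sets]).reshape(len(sets), -1).T
    return [[sets[j][k] for j, k in enumerate(row)] for row in idx]

print cartesian([[4, 5], [7, 8, 9]])
# [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]]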
Any idea is appreciated. Ernest From eadrogue at gmx.net Thu Mar 4 05:35:37 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Thu, 4 Mar 2010 11:35:37 +0100 Subject: [Numpy-discussion] combinatorics In-Reply-To: <20100304101909.GA7328@doriath.local> References: <20100304101909.GA7328@doriath.local> Message-ID: <20100304103537.GA7448@doriath.local> 4/03/10 @ 11:19 (+0100), thus spake Ernest Adrogu?: > Hello everybody, > > Suppose I want to find all 2-digit numbers whose first digit > is either 4 or 5, the second digit being 7, 8 or 9. > Is there a Numpy/Scipy function to calculate that kind of > combinations? > > I came up with this function, the problem is it uses recursion: > > def g(sets): > if len(sets) < 2: > return sets > return [[i] + j for i in sets[0] for j in f(sets[1:])] Sorry, this is a mistake... it calls f() instead of g(). This is the one: def f(sets): if len(sets) < 2: return [[i] for i in sets[0]] return [[i] + j for i in sets[0] for j in f(sets[1:])] > In [157]: g([[4,5],[7,8,9]]) > Out[157]: [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]] > > Is important that it works with more than two sets too. > Any idea is appreciated. > > Ernest > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Sam.Tygier at hep.manchester.ac.uk Thu Mar 4 06:20:22 2010 From: Sam.Tygier at hep.manchester.ac.uk (Sam Tygier) Date: Thu, 04 Mar 2010 11:20:22 +0000 Subject: [Numpy-discussion] combinatorics Message-ID: <1267701622.5629.223.camel@hydrogen> itertools in the python standard library has what you need >>> import itertools >>> list( itertools.product([4,5], [7,8,9]) ) [(4, 7), (4, 8), (4, 9), (5, 7), (5, 8), (5, 9)] (all the itertools functions return generators, so the list() is to convert it to a list) Sam From johan.gronqvist at gmail.com Thu Mar 4 06:26:25 2010 From: johan.gronqvist at gmail.com (=?ISO-8859-1?Q?Johan_Gr=F6nqvist?=) Date: Thu, 04 Mar 2010 12:26:25 +0100 Subject: [Numpy-discussion] combinatorics In-Reply-To: <20100304101909.GA7328@doriath.local> References: <20100304101909.GA7328@doriath.local> Message-ID: Ernest Adrogu? skrev: > Suppose I want to find all 2-digit numbers whose first digit > is either 4 or 5, the second digit being 7, 8 or 9. > > I came up with this function, the problem is it uses recursion: > [...] > In [157]: g([[4,5],[7,8,9]]) > Out[157]: [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]] > > Is important that it works with more than two sets too. > Any idea is appreciated. > The one-line function defined below using only standard python seems to work for me (CPython 2.5.5). The idea you had was to first merge the two first lists, and then merge the resulting lists with the third, and so on. This is exactly the idea behind the reduce function, called fold in other languages, and you recursive call can be replaced by a call to reduce. 
/ johan --------------------------------------------------- In [5]: def a(xss): return reduce(lambda xss, ys: [ xs + [y] for xs in xss for y in ys ], xss, [[]]) ...: In [7]: a([[4, 5], [7, 8, 9]]) Out[7]: [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]] In [8]: a([[4, 5], [7, 8, 9], [10, 11, 12, 13]]) Out[8]: [[4, 7, 10], [4, 7, 11], [4, 7, 12], [4, 7, 13], [4, 8, 10], [4, 8, 11], [4, 8, 12], [4, 8, 13], [4, 9, 10], [4, 9, 11], [4, 9, 12], [4, 9, 13], [5, 7, 10], [5, 7, 11], [5, 7, 12], [5, 7, 13], [5, 8, 10], [5, 8, 11], [5, 8, 12], [5, 8, 13], [5, 9, 10], [5, 9, 11], [5, 9, 12], [5, 9, 13]] -------------------------------------------- From eadrogue at gmx.net Thu Mar 4 07:14:58 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Thu, 4 Mar 2010 13:14:58 +0100 Subject: [Numpy-discussion] combinatorics In-Reply-To: References: <20100304101909.GA7328@doriath.local> Message-ID: <20100304121458.GA7754@doriath.local> 4/03/10 @ 12:26 (+0100), thus spake Johan Gr?nqvist: > Ernest Adrogu? skrev: > > Suppose I want to find all 2-digit numbers whose first digit > > is either 4 or 5, the second digit being 7, 8 or 9. > > > > I came up with this function, the problem is it uses recursion: > > [...] > > In [157]: g([[4,5],[7,8,9]]) > > Out[157]: [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]] > > > > Is important that it works with more than two sets too. > > Any idea is appreciated. > > > > > The one-line function defined below using only standard python seems to > work for me (CPython 2.5.5). > > The idea you had was to first merge the two first lists, and then merge > the resulting lists with the third, and so on. This is exactly the idea > behind the reduce function, called fold in other languages, and you > recursive call can be replaced by a call to reduce. > > / johan > > > > --------------------------------------------------- > In [5]: def a(xss): > return reduce(lambda xss, ys: [ xs + [y] for xs in xss for y in ys > ], xss, [[]]) > ...: Thanks. It took me a while to understand how it works :) I have re-written your function using a for loop, which looks less intimidating in my opinion. def g(sets): out = [[]] for i in range(len(sets)): out = [j + [i] for i in sets[i] for j in out] return out In [196]: g([[4,5], [7,8,9]]) Out[196]: [[4, 7], [5, 7], [4, 8], [5, 8], [4, 9], [5, 9]] > In [7]: a([[4, 5], [7, 8, 9]]) > Out[7]: [[4, 7], [4, 8], [4, 9], [5, 7], [5, 8], [5, 9]] > > In [8]: a([[4, 5], [7, 8, 9], [10, 11, 12, 13]]) > Out[8]: > [[4, 7, 10], > [4, 7, 11], > [4, 7, 12], > [4, 7, 13], > [4, 8, 10], > [4, 8, 11], > [4, 8, 12], > [4, 8, 13], > [4, 9, 10], > [4, 9, 11], > [4, 9, 12], > [4, 9, 13], > [5, 7, 10], > [5, 7, 11], > [5, 7, 12], > [5, 7, 13], > [5, 8, 10], > [5, 8, 11], > [5, 8, 12], > [5, 8, 13], > [5, 9, 10], > [5, 9, 11], > [5, 9, 12], > [5, 9, 13]] > -------------------------------------------- > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nadavh at visionsense.com Thu Mar 4 08:06:34 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 4 Mar 2010 15:06:34 +0200 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy References: <1267692864.26747.3.camel@nadav.envision.co.il> <710F2847B0018641891D9A21602763605AD321@ex3.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763605AD322@ex3.envision.co.il> Extended module that I used for some useful work. Comments: 1. 
Sturla's module is better designed, but did not work with very large (although sub GB) arrays 2. Tested on 64 bit linux (amd64) + python-2.6.4 + numpy-1.4.0 Nadav. -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh Sent: Thu 04-Mar-10 11:55 To: Discussion of Numerical Python Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy Maybe the attached file can help. Adpted and tested on amd64 linux Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh Sent: Thu 04-Mar-10 10:54 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in the cookbook page. I am into the same issue and going to test it today. Nadav On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > Hi people, > > I was wondering about the status of using the standard library > multiprocessing module with numpy. I found a cookbook example last > updated one year ago which states that: > > "This page was obsolete as multiprocessing's internals have changed. > More information will come shortly; a link to this page will then be > added back to the Cookbook." > > http://www.scipy.org/Cookbook/multiprocessing > > I also found the code that used to be on this page in the cookbook but > it does not work any more. So my question is: > > Is it possible to use numpy arrays as shared arrays in an application > using multiprocessing and how do you do it? > > Best regards, > Jesper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 6120 bytes Desc: not available URL: From faltet at pytables.org Thu Mar 4 08:12:24 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 4 Mar 2010 14:12:24 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <710F2847B0018641891D9A21602763605AD322@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD321@ex3.envision.co.il> <710F2847B0018641891D9A21602763605AD322@ex3.envision.co.il> Message-ID: <201003041412.24253.faltet@pytables.org> What kind of calculations are you doing with this module? Can you please send some examples and the speed-ups you are getting? Thanks, Francesc A Thursday 04 March 2010 14:06:34 Nadav Horesh escrigu?: > Extended module that I used for some useful work. > Comments: > 1. Sturla's module is better designed, but did not work with very large > (although sub GB) arrays 2. Tested on 64 bit linux (amd64) + python-2.6.4 > + numpy-1.4.0 > > Nadav. > > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > Sent: Thu 04-Mar-10 11:55 > To: Discussion of Numerical Python > Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy > > Maybe the attached file can help. 
Adpted and tested on amd64 linux > > Nadav > > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > Sent: Thu 04-Mar-10 10:54 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf > and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in > the cookbook page. I am into the same issue and going to test it today. > > Nadav > > On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > > Hi people, > > > > I was wondering about the status of using the standard library > > multiprocessing module with numpy. I found a cookbook example last > > updated one year ago which states that: > > > > "This page was obsolete as multiprocessing's internals have changed. > > More information will come shortly; a link to this page will then be > > added back to the Cookbook." > > > > http://www.scipy.org/Cookbook/multiprocessing > > > > I also found the code that used to be on this page in the cookbook but > > it does not work any more. So my question is: > > > > Is it possible to use numpy arrays as shared arrays in an application > > using multiprocessing and how do you do it? > > > > Best regards, > > Jesper > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Francesc Alted From patrickmarshwx at gmail.com Thu Mar 4 08:40:22 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Thu, 4 Mar 2010 07:40:22 -0600 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: <4B8F172A.90802@silveregg.co.jp> References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> <4B8F172A.90802@silveregg.co.jp> Message-ID: On Wed, Mar 3, 2010 at 8:12 PM, David Cournapeau wrote: > Patrick Marsh wrote: > > On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau > > wrote: > > > > > That's a bug in the pavement script - on windows 7, some env > variables > > are necessary to run python correctly, which were not necessary for > > windows < 7. I will fix this. > > This is fixed in both trunk and 1.4.x now. I have not tested it, though. > I just ran it (slightly modified to use MSVC instead of MinGW) and it built using Python 2.6 no problem. It started to build for Python 2.5, but I had to nix that since I don't have MSVC7.1 installed on this machine yet. > > > > > Okay, I had been removing the build and dist directories but didn't > > realize I needed to remove the numpy directory in the site-packages > > directory. Deleting this last directory fixed the "matrix" issues and > > I'm now left with the two failures. The latter failure doesn't seem to > > really be an issue to me and the first one is the same error that Ralf > > posted earlier - so for Python 2.5, I've got it working. However, > > Python 2.6.4 still freezes on the test suite. I'll have to look more > > into this today, but for reference, has anyone successfully built Numpy > > from the 1.4.x branch, on Windows 7, using Python 2.6.4? > > This is almost always a problem with the C runtime. Those are a big PITA > to debug/understand/fix. 
You built this with Mingw, right ? The first > thing to check is whether you have several C runtimes loaded: you can > check this with the problem depends.exe: http://www.dependencywalker.com > > I will try to look at this myself - I have only attempted Visual Studio > builds on Windows 7 so far, > > I was able to build Numpy for Python 2.6 by using Visual Studio, and the only test failure I get is the complex number test that everyone else gets as well. However, I'll still try to figure out what was up with the MinGW install. Just as an aside, the Numpy installer for Python 2.6 has always run really slow when compiling and optimizing. Consequently, the first time I import and use Numpy with Python 2.6, it is extremely slow. This is independent of building using MinGW or MSVC, 1.4.x branch or trunk. I don't have this slow down with the Python 2.5 installer, nor when I import and use Numpy with Python 2.5. Thanks for the help! Patrick > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Mar 4 10:26:37 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 4 Mar 2010 09:26:37 -0600 Subject: [Numpy-discussion] dtype for a single char In-Reply-To: References: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> Message-ID: <3d375d731003040726t5d594644r950b347b08cadc2b@mail.gmail.com> On Thu, Mar 4, 2010 at 02:34, sam tygier wrote: > Robert Kern wrote: >> On Wed, Mar 3, 2010 at 15:14, sam tygier wrote: >>> Hello >>> >>> today i was caught out by trying to use 'a' as a dtype for a single character. a simple example would be: >>> >>>>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a"), ("number", "i")]) >>> array([('', 1), ('', 2), ('', 3)], >>> ? ? ?dtype=[('letter', '|S0'), ('number', '>> >>> the fix seems to be using 'a1' instead >>> >>>>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a1"), ("number", "i")]) >>> array([('a', 1), ('b', 2), ('c', 3)], >>> ? ? ?dtype=[('letter', '|S1'), ('number', '>> >>> this seems odd to me, as other types eg 'i' and 'f' can be used on their own. is there a reason for this? >> >> Other types have a sensible default determined by the platform. > > I don't understand. char has sensible default size on my platform (and all the others i am familiar with), 1 byte. 'S' is not char. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nadavh at visionsense.com Thu Mar 4 12:54:09 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 4 Mar 2010 19:54:09 +0200 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy References: <710F2847B0018641891D9A21602763605AD321@ex3.envision.co.il><710F2847B0018641891D9A21602763605AD322@ex3.envision.co.il> <201003041412.24253.faltet@pytables.org> Message-ID: <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> I can not give a reliable answer yet, since I have some more improvement to make. 
The application is an analysis of a stereoscopic-movie raw-data recording (both channels are recorded in the same file). I treat the data as a huge memory-mapped file. The idea was to process each channel (left and right) on a different core. Right now the application is I/O bound since I do classical numpy operations, so each channel (which is handled as one array) is scanned several times. The improvement now over a single process is 10%, but I hope to achieve 10% more after trivial optimizations.

I used this application as an excuse to dive into multi-processing. I hope that the code I posted here will help someone.

Nadav.

-----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Francesc Alted Sent: Thu 04-Mar-10 15:12 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy

What kind of calculations are you doing with this module? Can you please send some examples and the speed-ups you are getting?

Thanks, Francesc

A Thursday 04 March 2010 14:06:34 Nadav Horesh escrigué: > Extended module that I used for some useful work. > Comments: > 1. Sturla's module is better designed, but did not work with very large > (although sub GB) arrays 2. Tested on 64 bit linux (amd64) + python-2.6.4 > + numpy-1.4.0 > > Nadav. > > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > Sent: Thu 04-Mar-10 11:55 > To: Discussion of Numerical Python > Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy > > Maybe the attached file can help.
Name: winmail.dat Type: application/ms-tnef Size: 4479 bytes Desc: not available URL: From samtygier at yahoo.co.uk Thu Mar 4 14:07:42 2010 From: samtygier at yahoo.co.uk (sam tygier) Date: Thu, 04 Mar 2010 19:07:42 +0000 Subject: [Numpy-discussion] dtype for a single char In-Reply-To: <3d375d731003040726t5d594644r950b347b08cadc2b@mail.gmail.com> References: <3d375d731003031356p7d1c350bv439b1f145dac1ddd@mail.gmail.com> <3d375d731003040726t5d594644r950b347b08cadc2b@mail.gmail.com> Message-ID: Robert Kern wrote: > On Thu, Mar 4, 2010 at 02:34, sam tygier wrote: >> Robert Kern wrote: >>> On Wed, Mar 3, 2010 at 15:14, sam tygier wrote: >>>> Hello >>>> >>>> today i was caught out by trying to use 'a' as a dtype for a single character. a simple example would be: >>>> >>>>>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a"), ("number", "i")]) >>>> array([('', 1), ('', 2), ('', 3)], >>>> dtype=[('letter', '|S0'), ('number', '>>> >>>> the fix seems to be using 'a1' instead >>>> >>>>>>> array([('a',1),('b',2),('c',3)], dtype=[("letter", "a1"), ("number", "i")]) >>>> array([('a', 1), ('b', 2), ('c', 3)], >>>> dtype=[('letter', '|S1'), ('number', '>>> >>>> this seems odd to me, as other types eg 'i' and 'f' can be used on their own. is there a reason for this? >>> Other types have a sensible default determined by the platform. >> I don't understand. char has sensible default size on my platform (and all the others i am familiar with), 1 byte. > > 'S' is not char. ok. somehow i had the impression that 'a' was a char. maybe to much back and forward between the dtype, and the python struct docs. thank Sam From brennan.williams at visualreservoir.com Thu Mar 4 17:52:42 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Fri, 05 Mar 2010 11:52:42 +1300 Subject: [Numpy-discussion] how to efficiently build an array of x, y, z points In-Reply-To: <4B8ED322.4090208@noaa.gov> References: <4B8DC975.2000502@visualreservoir.com> <45d1ab481003021846h5d33a5c2m266361c2552f7e73@mail.gmail.com> <4B8DD076.7040606@visualreservoir.com> <45d1ab481003021947l37e848b4u3703dd5c6afc8f21@mail.gmail.com> <4B8E7767.8080903@gmail.com> <4B8ED322.4090208@noaa.gov> Message-ID: <4B9039BA.7060606@visualreservoir.com> Christopher Barker wrote: > Bruce Southey wrote: > >> Christopher Barker provided some code last last year on appending >> ndarrays eg: >> http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046634.html >> > > yup, I"d love someone else to pick that up and test/improve it. > > Anyway, that code only handles 1-d arrays, though that can be structured > arrays. I"d like to extend it to handlw n-d arrays, though you could > only grow them in the first dimension, which may work for your case. > > As for performance: > > My numpy code is a bit slower than using python lists, if you add > elements one at a time, and the elements are a standard python data > type. It should use less memory though, if that matters. > > If you add the data in big enough chunks, my method gets better performance. > > Currently I'm adding all my corner point xyz's into a list and then converting to an array of shape (npoints,3) And I'm creating a celllist with the point indices for each cell and then converting that into an array of shape (nactivecells,8) Then I'm creating an unstructured grid. > >> Ultimately I'm trying to build a tvtk unstructured grid to view in a >> Traits/tvtk/Mayavi app. >> > > I'd love to see that working, once you've got it! > So will I. 
> > The grid is ni*nj*nk cells with 8 xyz's per cell > >> (hexahedral cell with 6 faces). However some cells are inactive and >> therefore don't have geometry. Cells also have "connectivity" to other >> cells, usually to adjacent cells (e.g. cell i,j,k connected to cell >> i-1,j,k) but not always. >> > > I'm confused now -- what does the array need to look like in the end? Maybe: > > ni*nj*nk X 8 X 3 ? > > How is inactive indicated? > I made a typo in my first posting. Each cell has 8 corners, each corner an x,y,z so yes, if all the cells in the grid are active then ni*nj*nk*8*3 but usually not all cells are active and it is optional whether to have inactive cell geometry written out to the grid file so it is actually nactivecells*8*3 > Is the connectivity somehow in the same array, or is that stored separately? > Bit of both - there is separate connectivity info and also implicit connectivity info. Often a cell will be fully connected to its adjacent cell(s) as they share a common face. But also, often there is not connectivity (e.g. a fault) and the +I face of a cell does not match up against the -I face of the adjacent cell. At the moment, I'm not removing duplicate points (of which there are a lot, probably 25-50% depending on the degree of faulting). One other thing I need to do is to re-order my xyz coordinates - in the attached image taken from the VTK file format pdf you can see the 0,1,2,3 and 4,5,6,7 node ordering. In my grid it is 0,1,3,2 and 4,5,7,6 so you can see that I need to swap round some of the coordinates. I need to do this for each cell and there may be 10,000 of them but there may be 2,000,000 of them. So I think it is probably best not to do it on a cell by cell basis but wait until I've built my full pointlist, then convert it to an array, probably of shape (nactivecells,8,3) and then somehow rearrange/reorder the 8 "columns". Sound the right way to go? Brennan -------------- next part -------------- A non-text attachment was scrubbed... Name: hexahedron.png Type: image/png Size: 6219 bytes Desc: not available URL: From kackvogel at gmail.com Thu Mar 4 20:10:57 2010 From: kackvogel at gmail.com (kaby) Date: Thu, 4 Mar 2010 17:10:57 -0800 (PST) Subject: [Numpy-discussion] memory usage of numpy arrays Message-ID: <27788048.post@talk.nabble.com> Hi. I am using numpy arrays and when constructing an array I get a "cannot allocate memory for thread-local data: ABORT" The array i'm constructing is zeros((numVars, 2, numVars, 2), dtype=float) Where numVars is at about 2000. I was expecting the memory usage to be 2000*2000*2*2*8Bytes=128.000.000Bytes=128MBytes So why is that happening? What am I missing? Thanks for any help. -- View this message in context: http://old.nabble.com/memory-usage-of-numpy-arrays-tp27788048p27788048.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Thu Mar 4 20:15:38 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 4 Mar 2010 19:15:38 -0600 Subject: [Numpy-discussion] memory usage of numpy arrays In-Reply-To: <27788048.post@talk.nabble.com> References: <27788048.post@talk.nabble.com> Message-ID: <3d375d731003041715w300cdb3ar42a90392855887b3@mail.gmail.com> On Thu, Mar 4, 2010 at 19:10, kaby wrote: > > Hi. > I am using numpy arrays and when constructing an array I get a ?"cannot > allocate memory for thread-local data: ABORT" > The array i'm constructing is > zeros((numVars, 2, numVars, 2), dtype=float) Where numVars is at about 2000. 
> > I was expecting the memory usage to be > 2000*2000*2*2*8Bytes=128.000.000Bytes=128MBytes > > So why is that happening? What am I missing? It's not having a problem allocating the memory itself. There are some things under the covers that use thread-local storage, and numpy is apparently not able to do those things. Can you show us a complete, minimal, self-contained script that fails? Can you please run that script and copy-and-paste the full traceback? Can you tell us what platform you are on and how you built numpy? If you did not build numpy yourself, please tell us exactly where you got your binaries from (the URL to the package itself is necessary). Thanks. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From patrickmarshwx at gmail.com Thu Mar 4 23:22:29 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Thu, 4 Mar 2010 22:22:29 -0600 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: <4B8F172A.90802@silveregg.co.jp> References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> <4B8F172A.90802@silveregg.co.jp> Message-ID: On Wed, Mar 3, 2010 at 8:12 PM, David Cournapeau wrote: > Patrick Marsh wrote: > > On Wed, Mar 3, 2010 at 8:48 AM, David Cournapeau > > wrote: > > > > > That's a bug in the pavement script - on windows 7, some env > variables > > are necessary to run python correctly, which were not necessary for > > windows < 7. I will fix this. > > This is fixed in both trunk and 1.4.x now. I have not tested it, though. > I've hacked the pavement script to check for the version of python I'm using to build with and have it use MinGW for versions Python 2.5 and earlier and MSVC for Python 2.6 and later. I hope to install MSVC7.1 tomorrow...if I can find my disks. Then I should be able to build entirely with MSVC. I'm assuming we want to build the official binaries using the same tools for both Python 2.5 and Python 2.6? > > > > > Okay, I had been removing the build and dist directories but didn't > > realize I needed to remove the numpy directory in the site-packages > > directory. Deleting this last directory fixed the "matrix" issues and > > I'm now left with the two failures. The latter failure doesn't seem to > > really be an issue to me and the first one is the same error that Ralf > > posted earlier - so for Python 2.5, I've got it working. However, > > Python 2.6.4 still freezes on the test suite. I'll have to look more > > into this today, but for reference, has anyone successfully built Numpy > > from the 1.4.x branch, on Windows 7, using Python 2.6.4? > > This is almost always a problem with the C runtime. Those are a big PITA > to debug/understand/fix. You built this with Mingw, right ? The first > thing to check is whether you have several C runtimes loaded: you can > check this with the problem depends.exe: http://www.dependencywalker.com I've run the Numpy superpack installer for Python 2.6 built with MinGW through the dependency walker. Unfortunately, outside of checking for some extremely obviously things, I'm in way over my head in interpreting the output (although, I'd like to learn). I've put the output from the program here: http://www.patricktmarsh.com/numpy/20100303.py26.superpack.dependencies.txt. 
I can also put the binary up somewhere, too if someone wants to check that. I still have concerns as to why the creation of the .pyc and .pyo files when I use the Python 2.6 installer takes so long. I didn't check yet (I'll do that tomorrow), but I'm wondering if they are actually being created. I know the first time I use a new tool in Numpy on Python 2.6 that it takes considerably longer to execute than on Python 2.5. The second time I use a function, the speed is identical on between both versions. Thanks again for your help. I've learned a lot the last two weeks about the build process. I have a much deeper appreciation of what you've been doing for awhile now! Cheers, Patrick > > I will try to look at this myself - I have only attempted Visual Studio > builds on Windows 7 so far, > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Fri Mar 5 00:08:48 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 05 Mar 2010 14:08:48 +0900 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> <4B8F172A.90802@silveregg.co.jp> Message-ID: <4B9091E0.6030702@silveregg.co.jp> Patrick Marsh wrote: > > I've hacked the pavement script to check for the version of python I'm > using to build with and have it use MinGW for versions Python 2.5 and > earlier and MSVC for Python 2.6 and later. I hope to install MSVC7.1 > tomorrow...if I can find my disks. Then I should be able to build > entirely with MSVC. I'm assuming we want to build the official binaries > using the same tools for both Python 2.5 and Python 2.6? Yes, no binary should be built with MSVC. They all should be built with mingw, unless we have to (which has not happened on 32 bits so far). > > > > > > > > Okay, I had been removing the build and dist directories but didn't > > realize I needed to remove the numpy directory in the site-packages > > directory. Deleting this last directory fixed the "matrix" > issues and > > I'm now left with the two failures. The latter failure doesn't > seem to > > really be an issue to me and the first one is the same error that > Ralf > > posted earlier - so for Python 2.5, I've got it working. However, > > Python 2.6.4 still freezes on the test suite. I'll have to look more > > into this today, but for reference, has anyone successfully built > Numpy > > from the 1.4.x branch, on Windows 7, using Python 2.6.4? > > This is almost always a problem with the C runtime. Those are a big PITA > to debug/understand/fix. You built this with Mingw, right ? The first > thing to check is whether you have several C runtimes loaded: you can > check this with the problem depends.exe: http://www.dependencywalker.com > > > I've run the Numpy superpack installer for Python 2.6 built with MinGW > through the dependency walker. 
Unfortunately, outside of checking for > some extremely obviously things, I'm in way over my head > in interpreting the output (although, I'd like to learn). I've put the > output from the program > here: http://www.patricktmarsh.com/numpy/20100303.py26.superpack.dependencies.txt. It does not look like you have several version of the MS runtimes here, and your SxS configuration has only a few dlls (the SxS is what often causes trouble on XP/Vista/Windows 7 for python 2.6). The only thing I can think of would be some weird issue with WoW - you are running windows 64 bits but with 32 bits python, right ? > Thanks again for your help. I've learned a lot the last two weeks about > the build process. You're welcome. I hope to be able to look a bit into that issue this WE, cheers, David From geometrian at gmail.com Fri Mar 5 01:13:31 2010 From: geometrian at gmail.com (Ian Mallett) Date: Thu, 4 Mar 2010 22:13:31 -0800 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: Firstly, I want to thank you for all the time and attention you've obviously put into this code. On Tue, Mar 2, 2010 at 12:27 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > The loop I can replace by numpy operations: > > >>> v_array > array([[1, 2, 3], > [4, 5, 6], > [7, 8, 9]]) > >>> n_array > array([[ 0.1, 0.2, 0.3], > [ 0.4, 0.5, 0.6]]) > >>> f_list > array([[0, 1, 2, 0], > [2, 1, 0, 1]]) > > Retrieving the v1 vectors: > > >>> v1s = v_array[f_list[:, 0]] > >>> v1s > array([[1, 2, 3], > [7, 8, 9]]) > > Retrieving the normal vectors: > > >>> ns = n_array[f_list[:, 3]] > >>> ns > array([[ 0.1, 0.2, 0.3], > [ 0.4, 0.5, 0.6]]) > I don't think you understand quite what I'm looking for here. Every vec4 in f_list describes a triangle. The first, second, and third are indices of vertices in v_array. The fourth is an index of n_array. From your code, I've learned that I can do this, which is more what I want: v1s = v_array[f_list[:,0:3]] ns = n_array[f_list[:,3]] Obviously, this changes the arrays quite a bit. With changes, the code doesn't actually crash until the corners = line. Do the following lines that do the dot product and comparison still behave correctly? They should be finding, for each triangle, whether or not the associated normal is less than 90 degrees. The triangle's edges should then be added to an array. See end. > Now how to calculate the pairwise dot product (I suppress the > difference of v1 to some_point for now): > > >>> inner = numpy.inner(ns, v1s) > >>> inner > array([[ 1.4, 5. ], > [ 3.2, 12.2]]) > > This calculates *all* pairwise dot products, we have to select the > diagonal of this square ndarray: > > >>> dotprods = inner[[numpy.arange(0, 2), numpy.arange(0, 2)]] > >>> dotprods > array([ 1.4, 12.2]) > > Now we can create a boolean array saying where the dotprod is > 0 > (i.e, angle < 90?), and select those triangles: > > >>> select = dotprods > 0 > >>> select > array([ True, True], dtype=bool) > >>> selected = f_list[select] > >>> selected > array([[0, 1, 2, 0], > [2, 1, 0, 1]]) > This seems like a clever idea. > In this case it's the full list. Now build the triangles corners array: > > >>> corners = v_array[selected[:, :3]] > >>> corners > array([[[1, 2, 3], > [4, 5, 6], > [7, 8, 9]], > > [[7, 8, 9], > [4, 5, 6], > [1, 2, 3]]]) > >>> > > This has indices [triangle, vertex number (0, 1, or 2), xyz]. 
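A sketch of how the back-face test could look once v1s is built as v_array[f_list[:, 0:3]] (illustrative only, not Friedrich's or Ian's code; some_point is assumed to be the light position, and the sign of the comparison depends on which way the normals point):

import numpy

# toy data from the quoted session
v_array = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
n_array = numpy.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
f_list = numpy.array([[0, 1, 2, 0], [2, 1, 0, 1]])
some_point = numpy.zeros(3)            # assumed light position, made up here

first_verts = v_array[f_list[:, 0]]    # one vertex per triangle, shape (n, 3)
ns = n_array[f_list[:, 3]]             # matching normals, shape (n, 3)
dotprods = numpy.sum(ns * (first_verts - some_point), axis=1)  # per-triangle dot product
backfacing = dotprods < 0              # or > 0, depending on orientation convention
selected = f_list[backfacing]          # back-facing rows of f_list

Computing the dot products elementwise like this avoids building the full n-by-n matrix that numpy.inner returns, so it stays linear in the number of triangles.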
> And compute the edges (I think you can make use of them): > > >>> edges_dist = numpy.asarray([corners[:, 1] - corners[:, 0], corners[:, > 2] - corners[:, 0], corners[:, 2] - corners[:, 1]]) > >>> edges_dist > array([[[ 3, 3, 3], > [-3, -3, -3]], > > [[ 6, 6, 6], > [-6, -6, -6]], > > [[ 3, 3, 3], > [-3, -3, -3]]]) > > This has indices [corner number, triangle, xyz]. > I think it's easier to compare then "reversed" edges, because then > edge[i, j] == -edge[k, l]? > > But of course: > > >>> edges = numpy.asarray([[corners[:, 0], corners[:, 1]], [corners[:, 1], > corners[:, 2]], [corners[:, 2], corners[:, 0]]]) > >>> edges > array([[[[1, 2, 3], > [7, 8, 9]], > > [[4, 5, 6], > [4, 5, 6]]], > > > [[[4, 5, 6], > [4, 5, 6]], > > [[7, 8, 9], > [1, 2, 3]]], > > > [[[7, 8, 9], > [1, 2, 3]], > > [[1, 2, 3], > [7, 8, 9]]]]) > > This has indices [edge number (0, 1, or 2), corner number in edge (0 > or 1), triangle]. > But this may not be what you want (not flattened in triangle number). > Therefore: > > >>> edges0 = numpy.asarray([corners[:, 0], corners[:, 1]]) > >>> edges1 = numpy.asarray([corners[:, 1], corners[:, 2]]) > >>> edges2 = numpy.asarray([corners[:, 2], corners[:, 0]]) > >>> edges0 > array([[[1, 2, 3], > [7, 8, 9]], > > [[4, 5, 6], > [4, 5, 6]]]) > >>> edges1 > array([[[4, 5, 6], > [4, 5, 6]], > > [[7, 8, 9], > [1, 2, 3]]]) > >>> edges2 > array([[[7, 8, 9], > [1, 2, 3]], > > [[1, 2, 3], > [7, 8, 9]]]) > > >>> edges = numpy.concatenate((edges0, edges1, edges2), axis = 0) > >>> edges > array([[[1, 2, 3], > [7, 8, 9]], > > [[4, 5, 6], > [4, 5, 6]], > > [[4, 5, 6], > [4, 5, 6]], > > [[7, 8, 9], > [1, 2, 3]], > > [[7, 8, 9], > [1, 2, 3]], > > [[1, 2, 3], > [7, 8, 9]]]) > > This should be as intended. > The indices are [flat edge number, edge side (left or right), xyz]. > > Now I guess you have to iterate over all pairs of them, don't know a > numpy accelerated method. Maybe it's even faster to draw the edges > twice than to invest O(N_edges ** 2) complexity for comparing? > Unfortunately, no. The whole point of the algorithm is to extrude back-facing triangles (those with normals facing away from the light) backward, leaving polygons behind in a light volume. Although a shadow volume tutorial can explain this in a more detailed way, basically, for every triangle, if it is back-facing, add its edges to a list. Remove the duplicate edges. So, the edge between two back-facing triangles-that-meet-along-an-edge is not kept. However, if a back-facing triangle and a front-facing triangle meet, only the back-facing triangle's edge is present in the list, and so it is not removed. Thus, only the border edges between the front-facing triangles and the back facing triangles remain in the list. As front-facing triangles face the light and back-facing triangles don't, a silhouette edge is built up. When these edges are extruded, they're extremely useful. The following image http://www.gamedev.net/reference/articles/1873/image010.gif shows the process. The four triangles are all back facing. If duplicate edges are removed, only edges I0-I2, I2-I4, I4-I3, and I3-I0 remain--the silhouette edges. Still to do, remove the duplicate edges (actually where a good deal of the optimization lies too). So, for every back-facing triangle [v1,v2,v3,n], (where v*n* is a vertex * index*), the edges [v1,v2], [v2,v3], and [v3,v1] need to be added to a list. I.e., f_list needs to be converted into a list of edges in this way. 
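A rough sketch of that conversion, and of the duplicate-removal step described next; this is not Ian's code, selected is assumed to hold the back-facing rows of f_list (shape (n, 4)) as in the quoted session, and it assumes each interior edge is shared by exactly two back-facing triangles:

import numpy

# toy data from the quoted session; with these two triangles every edge is shared,
# so the resulting silhouette happens to be empty
v_array = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
selected = numpy.array([[0, 1, 2, 0], [2, 1, 0, 1]])

edges = numpy.concatenate((selected[:, [0, 1]],
                           selected[:, [1, 2]],
                           selected[:, [2, 0]]), axis=0)  # (3*n, 2) vertex-index pairs
edges = numpy.sort(edges, axis=1)              # so [v2, v1] and [v1, v2] compare equal
edges = edges[numpy.lexsort(edges.T[::-1])]    # sort rows, bringing duplicate pairs together
dup = numpy.all(edges[1:] == edges[:-1], axis=1)
keep = numpy.ones(len(edges), dtype=bool)
keep[1:] &= ~dup                               # drop the second member of each duplicated pair...
keep[:-1] &= ~dup                              # ...and the first, leaving only silhouette edges
silhouette = v_array[edges[keep]]              # (n_silhouette, 2, 3) actual coordinates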
Then, duplicate edge pairs need to be removed, noting that [v1,v2] and [v2,v1] are still a pair (in my Python code, I simply sorted the edges before removing duplicates: [123,4] -> [4,123] and [56,968] -> [56,968]). The final edge list then should be converted back into actual vertices by indexing it into v_array (I think I understand how to do this now!): [ [1,2], [4,6], ... ] -> [ [[0.3,1.6,4.5],[9.1,4.7,7.7]], [[0.4,5.5,8.3],[9.6,8.1,0.3]], ... ] > It may seem a bit complicated, but I hope this impression is mainly > because of the many outputs ... > > I double-checked everything, *hope* everything is correct. > So far from me, > Friedrich > Once again, thanks so much, Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Fri Mar 5 03:53:02 2010 From: faltet at pytables.org (Francesc Alted) Date: Fri, 5 Mar 2010 09:53:02 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> Message-ID: <201003050953.02357.faltet@pytables.org> Yeah, 10% of improvement by using multi-cores is an expected figure for memory bound problems. This is something people must know: if their computations are memory bound (and this is much more common that one may initially think), then they should not expect significant speed-ups on their parallel codes. Thanks for sharing your experience anyway, Francesc A Thursday 04 March 2010 18:54:09 Nadav Horesh escrigu?: > I can not give a reliable answer yet, since I have some more improvement to > make. The application is an analysis of a stereoscopic-movie raw-data > recording (both channels are recorded in the same file). I treat the data > as a huge memory mapped file. The idea was to process each channel (left > and right) on a different core. Right now the application is IO bounded > since I do classical numpy operation, so each channel (which is handled as > one array) is scanned several time. The improvement now over a single > process is 10%, but I hope to achieve 10% ore after trivial optimizations. > > I used this application as an excuse to dive into multi-processing. I hope > that the code I posted here would help someone. > > Nadav. > > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Francesc Alted > Sent: Thu 04-Mar-10 15:12 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > What kind of calculations are you doing with this module? Can you please > send some examples and the speed-ups you are getting? > > Thanks, > Francesc > > A Thursday 04 March 2010 14:06:34 Nadav Horesh escrigu?: > > Extended module that I used for some useful work. > > Comments: > > 1. Sturla's module is better designed, but did not work with very large > > (although sub GB) arrays 2. Tested on 64 bit linux (amd64) + > > python-2.6.4 + numpy-1.4.0 > > > > Nadav. > > > > > > -----Original Message----- > > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > > Sent: Thu 04-Mar-10 11:55 > > To: Discussion of Numerical Python > > Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > Maybe the attached file can help. 
Adpted and tested on amd64 linux > > > > Nadav > > > > > > -----Original Message----- > > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > > Sent: Thu 04-Mar-10 10:54 > > To: Discussion of Numerical Python > > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf > > and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in > > the cookbook page. I am into the same issue and going to test it today. > > > > Nadav > > > > On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > > > Hi people, > > > > > > I was wondering about the status of using the standard library > > > multiprocessing module with numpy. I found a cookbook example last > > > updated one year ago which states that: > > > > > > "This page was obsolete as multiprocessing's internals have changed. > > > More information will come shortly; a link to this page will then be > > > added back to the Cookbook." > > > > > > http://www.scipy.org/Cookbook/multiprocessing > > > > > > I also found the code that used to be on this page in the cookbook but > > > it does not work any more. So my question is: > > > > > > Is it possible to use numpy arrays as shared arrays in an application > > > using multiprocessing and how do you do it? > > > > > > Best regards, > > > Jesper > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Francesc Alted From d.l.goldsmith at gmail.com Fri Mar 5 04:38:58 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 5 Mar 2010 01:38:58 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? Message-ID: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> Hi! Sorry for the cross-post, but my own investigation has led me to suspect that mine is actually a numpy problem, not a matplotlib problem. I'm getting the following traceback from a call to matplotlib.imshow: Traceback (most recent call last): File "C:\Users\Fermat\Documents\Fractals\Python\Source\Zodiac\aquarius_test.py", line 108, in ax.imshow(part2plot, cmap_name, extent = extent) File "C:\Python254\lib\site-packages\matplotlib\axes.py", line 6261, in imshow im.autoscale_None() File "C:\Python254\lib\site-packages\matplotlib\cm.py", line 236, in autoscale_None self.norm.autoscale_None(self._A) File "C:\Python254\lib\site-packages\matplotlib\colors.py", line 792, in autoscale_None if self.vmin is None: self.vmin = ma.minimum(A) File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5555, in __call__ return self.reduce(a) File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5570, in reduce t = self.ufunc.reduce(target, **kargs) ValueError: zero-size array to ufunc.reduce without identity Script terminated. Based on examination of the code, the last self is an instance of ma._extrema_operation (or one of its subclasses) - is there a reason why this class is unable to deal with a "zero-size array to ufunc.reduce without identity," (i.e., was it thought that it would - or should - never get one) or was this merely an oversight? 
Either way, there's other instances on the lists of this error cropping up, so this circumstance should probably be handled more robustly. In the meantime, workaround? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Fri Mar 5 04:51:12 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 5 Mar 2010 10:51:12 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <201003050953.02357.faltet@pytables.org> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> <201003050953.02357.faltet@pytables.org> Message-ID: <20100305095112.GD21772@phare.normalesup.org> On Fri, Mar 05, 2010 at 09:53:02AM +0100, Francesc Alted wrote: > Yeah, 10% of improvement by using multi-cores is an expected figure for > memory bound problems. This is something people must know: if their > computations are memory bound (and this is much more common that one > may initially think), then they should not expect significant speed-ups > on their parallel codes. Hey Francesc, Any chance this can be different for NUMA (non uniform memory access) architectures? AMD multicores used to be NUMA, when I was still following these problems. FWIW, I observe very good speedups on my problems (pretty much linear in the number of CPUs), and I have data parallel problems on fairly large data (~100Mo a piece, doesn't fit in cache), with no synchronisation at all between the workers. CPUs are Intel Xeons. Gael From pgmdevlist at gmail.com Fri Mar 5 05:51:38 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 5 Mar 2010 05:51:38 -0500 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> Message-ID: <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote: > Hi! Sorry for the cross-post, but my own investigation has led me to suspect that mine is actually a numpy problem, not a matplotlib problem. I'm getting the following traceback from a call to matplotlib.imshow: > ... > Based on examination of the code, the last self is an instance of ma._extrema_operation (or one of its subclasses) - is there a reason why this class is unable to deal with a "zero-size array to ufunc.reduce without identity," (i.e., was it thought that it would - or should - never get one) or was this merely an oversight? Either way, there's other instances on the lists of this error cropping up, so this circumstance should probably be handled more robustly. In the meantime, workaround? 'm'fraid no. I gonna have to investigate that. Please open a ticket with a self-contained example that reproduces the issue. Thx in advance... P. From schut at sarvision.nl Fri Mar 5 06:04:32 2010 From: schut at sarvision.nl (Vincent Schut) Date: Fri, 05 Mar 2010 12:04:32 +0100 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> Message-ID: On 03/05/2010 11:51 AM, Pierre GM wrote: > On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote: >> Hi! Sorry for the cross-post, but my own investigation has led me to suspect that mine is actually a numpy problem, not a matplotlib problem. 
I'm getting the following traceback from a call to matplotlib.imshow: >> ... >> Based on examination of the code, the last self is an instance of ma._extrema_operation (or one of its subclasses) - is there a reason why this class is unable to deal with a "zero-size array to ufunc.reduce without identity," (i.e., was it thought that it would - or should - never get one) or was this merely an oversight? Either way, there's other instances on the lists of this error cropping up, so this circumstance should probably be handled more robustly. In the meantime, workaround? > > > 'm'fraid no. I gonna have to investigate that. Please open a ticket with a self-contained example that reproduces the issue. > Thx in advance... > P. This might be completely wrong, but I seem to remember a similar issue, which I then traced down to having a masked array with a mask that was set to True or False, instead of being a full fledged bool mask array. I was in a hurry then and completely forgot about it later, so filed no bug report whatsoever, for which I apologize. VS. From faltet at pytables.org Fri Mar 5 08:14:51 2010 From: faltet at pytables.org (Francesc Alted) Date: Fri, 5 Mar 2010 08:14:51 -0500 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <20100305095112.GD21772@phare.normalesup.org> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> <201003050953.02357.faltet@pytables.org> <20100305095112.GD21772@phare.normalesup.org> Message-ID: <20100305131450.GA20600@pytables.org> Gael, On Fri, Mar 05, 2010 at 10:51:12AM +0100, Gael Varoquaux wrote: > On Fri, Mar 05, 2010 at 09:53:02AM +0100, Francesc Alted wrote: > > Yeah, 10% of improvement by using multi-cores is an expected figure for > > memory bound problems. This is something people must know: if their > > computations are memory bound (and this is much more common that one > > may initially think), then they should not expect significant speed-ups > > on their parallel codes. > > Hey Francesc, > > Any chance this can be different for NUMA (non uniform memory access) > architectures? AMD multicores used to be NUMA, when I was still following > these problems. As far as I can tell, NUMA architectures work better accelerating independent processes that run independently one of each other. In this case, hardware is in charge of putting closely-related data in memory that is 'nearer' to each processor. This scenario *could* happen in truly parallel process too, but as I said, in general it works best for independent processes (read multiuser machines). > FWIW, I observe very good speedups on my problems (pretty much linear in > the number of CPUs), and I have data parallel problems on fairly large > data (~100Mo a piece, doesn't fit in cache), with no synchronisation at > all between the workers. CPUs are Intel Xeons. Maybe your processes are not as memory-bound as you think. Do you get much better speed-up by using NUMA than a simple multi-core machine with one single path to memory? I don't think so, but maybe I'm wrong here. 
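A crude way to check that (only a sketch, not a serious benchmark): time a purely element-wise kernel with one worker against two concurrent workers on halves of the data. If the kernel is really memory bound, two workers will not get close to halving the wall time.

import multiprocessing as mp
import numpy as np
import time

def work(n):
    # element-wise updates: almost pure memory traffic, very little arithmetic
    a = np.ones(n)
    for _ in range(20):
        a += 1.0

def timed(nproc, n):
    procs = [mp.Process(target=work, args=(n // nproc,)) for _ in range(nproc)]
    t0 = time.time()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.time() - t0

if __name__ == '__main__':
    n = 20 * 1000 * 1000   # ~160 MB of doubles, well beyond any cache
    print "1 process: %.2f s   2 processes: %.2f s" % (timed(1, n), timed(2, n))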
Francesc From gael.varoquaux at normalesup.org Fri Mar 5 08:46:00 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 5 Mar 2010 14:46:00 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <20100305131450.GA20600@pytables.org> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> <201003050953.02357.faltet@pytables.org> <20100305095112.GD21772@phare.normalesup.org> <20100305131450.GA20600@pytables.org> Message-ID: <20100305134600.GA18142@phare.normalesup.org> On Fri, Mar 05, 2010 at 08:14:51AM -0500, Francesc Alted wrote: > > FWIW, I observe very good speedups on my problems (pretty much linear in > > the number of CPUs), and I have data parallel problems on fairly large > > data (~100Mo a piece, doesn't fit in cache), with no synchronisation at > > all between the workers. CPUs are Intel Xeons. > Maybe your processes are not as memory-bound as you think. That's the only explaination that I can think of. I have two types of bottlenecks. One is blas level 3 operations (mainly SVDs) on large matrices, the second is resampling, where are repeat the same operation many times over almost the same chunk of data. In both cases the data is fairly large, so I expected the operations to be memory bound. However, thinking of it, I believe that when I had timed these operations carefully, it seems that processes were alternating a starving period, during which they were IO-bound, and a productive period, during which they were CPU-bound. After a few cycles, the different periods would fall in a mutually disynchronised alternation, with one process IO-bound, and the others CPU-bound, and it would become fairly efficient. Of course, this is possible because I have no cross-talk between the processes. > Do you get much better speed-up by using NUMA than a simple multi-core > machine with one single path to memory? I don't think so, but maybe > I'm wrong here. I don't know. All the boxes around here have Intel CPUs, and I believe that this is all SMPs. Ga?l From bruce.schultz at gmail.com Fri Mar 5 09:00:02 2010 From: bruce.schultz at gmail.com (Bruce Schultz) Date: Sat, 06 Mar 2010 00:00:02 +1000 Subject: [Numpy-discussion] printing structured arrays Message-ID: <4B910E62.7070209@gmail.com> Hi, I've just started playing with numpy and have noticed that when printing a structured array that the output is not nicely formatted. Is there a way to make the formatting look the same as it does for an unstructured array? Here an example of what I mean: data = [ (1, 2), (3, 4.1) ] dtype = [('x', float), ('y', float)] print '### ndarray' a = numpy.array(data) print a print '### structured array' a = numpy.array(data, dtype=dtype) print a Output is: ### ndarray [[ 1. 2. ] [ 3. 4.1]] ### structured array [(1.0, 2.0) (3.0, 4.0999999999999996)] Thanks Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Fri Mar 5 10:16:54 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 5 Mar 2010 09:16:54 -0600 Subject: [Numpy-discussion] Why does np.nan{min, max} clobber my array mask? 
In-Reply-To: References: <20100213190410.I26855@halibut.com> <5AFAA9DD-E54C-4EF4-B0FC-A4B62AA6401C@gmail.com> <20100215175144.J26855@halibut.com> <5A82E5A3-8D6C-4DC9-B890-80BF59CC61D7@gmail.com> Message-ID: On Mon, Feb 15, 2010 at 9:24 PM, Bruce Southey wrote: > On Mon, Feb 15, 2010 at 8:35 PM, Pierre GM wrote: >> On Feb 15, 2010, at 8:51 PM, David Carmean wrote: >>> On Sun, Feb 14, 2010 at 03:22:04PM -0500, Pierre GM wrote: >>> >>>> >>>> I'm sorry, I can't follow you. Can you post a simpler self-contained example I can play with ? >>>> Why using np.nanmin/max ? These functions are designed for ndarrays, to avoid using a masked array: can't you just use min/max on the masked array ? >>> >>> I was using np.nanmin/max because I did not yet understand how masked arrays worked; perhaps the >>> docs for those methods need a note indicating that "If you can take the (small?) memory hit, >>> use Masked Arrays instead". ? Now that I know different... I'm ?going to drop it unless you >>> reall want to dig into it. >> >> >> I'm curious. Can you post an excerpt of your array, so that I can check what goes wrong? >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > Hi, > David, please file a bug report. > > I think is occurs with np.nansum, np.nanmin and np.nanmax. Perhaps > some thing with the C99 changes as I think it exists with numpy 1.3. > > I think this code shows the problem with Linux and recent numpy svn: > > import numpy as np > uut = np.array([[2, 1, 3, np.nan], [5, 2, 3, np.nan]]) > msk = np.ma.masked_invalid(uut) > msk > np.nanmin(msk, axis=1) > msk > > $ python > Python 2.6 (r26:66714, Nov ?3 2009, 17:33:18) > [GCC 4.4.1 20090725 (Red Hat 4.4.1-2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy as np >>>> uut = np.array([[2, 1, 3, np.nan], [5, 2, 3, np.nan]]) >>>> msk = np.ma.masked_invalid(uut) >>>> msk > masked_array(data = > ?[[2.0 1.0 3.0 --] > ?[5.0 2.0 3.0 --]], > ? ? ? ? ? ? mask = > ?[[False False False ?True] > ?[False False False ?True]], > ? ? ? fill_value = 1e+20) > >>>> np.nanmin(msk, axis=1) > masked_array(data = [1.0 2.0], > ? ? ? ? ? ? mask = [False False], > ? ? ? fill_value = 1e+20) > >>>> msk > masked_array(data = > ?[[2.0 1.0 3.0 nan] > ?[5.0 2.0 3.0 nan]], > ? ? ? ? ? ? mask = > ?[[False False False False] > ?[False False False False]], > ? ? ? fill_value = 1e+20) > > > Bruce > Hi, I filed this ticket and hopefully the provided code is sufficient for a test: http://projects.scipy.org/numpy/ticket/1421 The bug is with the _nanop function because nansum, nanmin, nanmax, nanargmin and nanargmax have the same issue. 
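Until that is fixed, the masked array's own methods sidestep the problem (as Pierre suggested earlier in the thread), for example:

import numpy as np

# Workaround sketch: use the masked array's own min/max rather than np.nanmin/np.nanmax,
# so the mask is never overwritten.
uut = np.array([[2, 1, 3, np.nan], [5, 2, 3, np.nan]])
msk = np.ma.masked_invalid(uut)
print msk.min(axis=1)   # per-row minima; msk.mask is left intact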
Bruce Bruce From faltet at pytables.org Fri Mar 5 10:22:07 2010 From: faltet at pytables.org (Francesc Alted) Date: Fri, 5 Mar 2010 16:22:07 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <20100305134600.GA18142@phare.normalesup.org> References: <20100305131450.GA20600@pytables.org> <20100305134600.GA18142@phare.normalesup.org> Message-ID: <201003051622.07606.faltet@pytables.org> A Friday 05 March 2010 14:46:00 Gael Varoquaux escrigu?: > On Fri, Mar 05, 2010 at 08:14:51AM -0500, Francesc Alted wrote: > > > FWIW, I observe very good speedups on my problems (pretty much linear > > > in the number of CPUs), and I have data parallel problems on fairly > > > large data (~100Mo a piece, doesn't fit in cache), with no > > > synchronisation at all between the workers. CPUs are Intel Xeons. > > > > Maybe your processes are not as memory-bound as you think. > > That's the only explaination that I can think of. I have two types of > bottlenecks. One is blas level 3 operations (mainly SVDs) on large > matrices, the second is resampling, where are repeat the same operation > many times over almost the same chunk of data. In both cases the data is > fairly large, so I expected the operations to be memory bound. Not at all. BLAS 3 operations are mainly CPU-bounded, because algorithms (if they are correctly implemented, of course, but any decent BLAS 3 library will do) have many chances to reuse data from caches. BLAS 1 (and lately 2 too) are the ones that are memory-bound. And in your second case, you are repeating the same operation over the same chunk of data. If this chunk is small enough to fit in cache, then the bottleneck is CPU again (and probably access to L1/L2 cache), and not access to memory. But if, as you said, you are seeing periods that are memory- bounded (i.e. CPUs are starving), then it may well be that this chunksize does not fit well in cache, and then your problem is memory access for this case. Maybe you can get better performance by reducing your chunksize so that it fits in cache (L1 or L2). So, I do not think that NUMA architectures would perform your current computations any better than your current SMP platform (and you know that NUMA architectures are much more complex and expensive than SMP ones). But experimenting is *always* the best answer to these hairy questions ;-) -- Francesc Alted From dlenski at gmail.com Fri Mar 5 12:11:57 2010 From: dlenski at gmail.com (Dan Lenski) Date: Fri, 5 Mar 2010 17:11:57 +0000 (UTC) Subject: [Numpy-discussion] Loading bit strings Message-ID: Is there a good way in NumPy to convert from a bit string to a boolean array? For example, if I have a 2-byte string s='\xfd\x32', I want to get a 16-length boolean array out of it. Here's what I came up with: A = fromstring(s, dtype=uint8) out = empty(A.size * 8, dtype=bool) for bit in range(0,8): out[bit::8] = A&(1< References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> Message-ID: <45d1ab481003050922k528d2079p17d7bc86841d6d6a@mail.gmail.com> On Fri, Mar 5, 2010 at 2:51 AM, Pierre GM wrote: > On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote: > > Hi! Sorry for the cross-post, but my own investigation has led me to > suspect that mine is actually a numpy problem, not a matplotlib problem. > I'm getting the following traceback from a call to matplotlib.imshow: > > ... 
> > Based on examination of the code, the last self is an instance of > ma._extrema_operation (or one of its subclasses) - is there a reason why > this class is unable to deal with a "zero-size array to ufunc.reduce without > identity," (i.e., was it thought that it would - or should - never get one) > or was this merely an oversight? Either way, there's other instances on the > lists of this error cropping up, so this circumstance should probably be > handled more robustly. In the meantime, workaround? > > > 'm'fraid no. I gonna have to investigate that. Please open a ticket with a > self-contained example that reproduces the issue. > I'll do my best, but since it's a call from matplotlib and I don't really know what's causing the problem (other than a literal reading of the exception) I'm not sure I can. DG > Thx in advance... > P. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 5 12:29:03 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 5 Mar 2010 11:29:03 -0600 Subject: [Numpy-discussion] Loading bit strings In-Reply-To: References: Message-ID: <3d375d731003050929kd169249ne5db575ee0baa3dc@mail.gmail.com> On Fri, Mar 5, 2010 at 11:11, Dan Lenski wrote: > Is there a good way in NumPy to convert from a bit string to a boolean > array? > > For example, if I have a 2-byte string s='\xfd\x32', I want to get a > 16-length boolean array out of it. > > Here's what I came up with: > > A = fromstring(s, dtype=uint8) > out = empty(A.size * 8, dtype=bool) > for bit in range(0,8): > ?out[bit::8] = A&(1< > I just can't shake the feeling that there may be a better way to > do this, though... For short enough strings, it probably doesn't really matter. Any correct way will do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d.l.goldsmith at gmail.com Fri Mar 5 12:43:46 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 5 Mar 2010 09:43:46 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <45d1ab481003050922k528d2079p17d7bc86841d6d6a@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003050922k528d2079p17d7bc86841d6d6a@mail.gmail.com> Message-ID: <45d1ab481003050943h2bf1b76aidcc898ec207b56b@mail.gmail.com> On Fri, Mar 5, 2010 at 9:22 AM, David Goldsmith wrote: > On Fri, Mar 5, 2010 at 2:51 AM, Pierre GM wrote: > >> On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote: >> > Hi! Sorry for the cross-post, but my own investigation has led me to >> suspect that mine is actually a numpy problem, not a matplotlib problem. >> I'm getting the following traceback from a call to matplotlib.imshow: >> > ... >> > Based on examination of the code, the last self is an instance of >> ma._extrema_operation (or one of its subclasses) - is there a reason why >> this class is unable to deal with a "zero-size array to ufunc.reduce without >> identity," (i.e., was it thought that it would - or should - never get one) >> or was this merely an oversight? 
Either way, there's other instances on the >> lists of this error cropping up, so this circumstance should probably be >> handled more robustly. In the meantime, workaround? >> >> >> 'm'fraid no. I gonna have to investigate that. Please open a ticket with a >> self-contained example that reproduces the issue. >> > > I'll do my best, but since it's a call from matplotlib and I don't really > know what's causing the problem (other than a literal reading of the > exception) I'm not sure I can. > Well, that was easy: mn = N.ma.core._minimum_operation() mn.reduce(N.array(())) Traceback (most recent call last): File "", line 1, in File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5570, in reduce t = self.ufunc.reduce(target, **kargs) ValueError: zero-size array to ufunc.reduce without identity I'll file a ticket. DG > > DG > > >> Thx in advance... >> P. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Fri Mar 5 13:05:51 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 5 Mar 2010 10:05:51 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <45d1ab481003050943h2bf1b76aidcc898ec207b56b@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003050922k528d2079p17d7bc86841d6d6a@mail.gmail.com> <45d1ab481003050943h2bf1b76aidcc898ec207b56b@mail.gmail.com> Message-ID: <45d1ab481003051005w42ab26bdw3924b172a0a4de3f@mail.gmail.com> On Fri, Mar 5, 2010 at 9:43 AM, David Goldsmith wrote: > On Fri, Mar 5, 2010 at 9:22 AM, David Goldsmith wrote: > >> On Fri, Mar 5, 2010 at 2:51 AM, Pierre GM wrote: >> >>> On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote: >>> > Hi! Sorry for the cross-post, but my own investigation has led me to >>> suspect that mine is actually a numpy problem, not a matplotlib problem. >>> I'm getting the following traceback from a call to matplotlib.imshow: >>> > ... >>> > Based on examination of the code, the last self is an instance of >>> ma._extrema_operation (or one of its subclasses) - is there a reason why >>> this class is unable to deal with a "zero-size array to ufunc.reduce without >>> identity," (i.e., was it thought that it would - or should - never get one) >>> or was this merely an oversight? Either way, there's other instances on the >>> lists of this error cropping up, so this circumstance should probably be >>> handled more robustly. In the meantime, workaround? >>> >>> >>> 'm'fraid no. I gonna have to investigate that. Please open a ticket with >>> a self-contained example that reproduces the issue. >>> >> >> I'll do my best, but since it's a call from matplotlib and I don't really >> know what's causing the problem (other than a literal reading of the >> exception) I'm not sure I can. >> > > Well, that was easy: > > mn = N.ma.core._minimum_operation() > mn.reduce(N.array(())) > > Traceback (most recent call last): > File "", line 1, in > > File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5570, in > reduce > t = self.ufunc.reduce(target, **kargs) > ValueError: zero-size array to ufunc.reduce without identity > > I'll file a ticket. > OK, Ticket #1422 filed. DG -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zachary.pincus at yale.edu Fri Mar 5 13:07:03 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 5 Mar 2010 13:07:03 -0500 Subject: [Numpy-discussion] Loading bit strings In-Reply-To: References: Message-ID: <39220C84-5831-4106-9635-D9515458CDDA@yale.edu> > Is there a good way in NumPy to convert from a bit string to a boolean > array? > > For example, if I have a 2-byte string s='\xfd\x32', I want to get a > 16-length boolean array out of it. numpy.unpackbits(numpy.fromstring('\xfd\x32', dtype=numpy.uint8)) From ellisonbg.net at gmail.com Fri Mar 5 14:29:11 2010 From: ellisonbg.net at gmail.com (Brian Granger) Date: Fri, 5 Mar 2010 11:29:11 -0800 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <201003050953.02357.faltet@pytables.org> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> <201003050953.02357.faltet@pytables.org> Message-ID: <6ce0ac131003051129g79e1ed35macd121688a798acc@mail.gmail.com> Francesc, Yeah, 10% of improvement by using multi-cores is an expected figure for > memory > bound problems. This is something people must know: if their computations > are > memory bound (and this is much more common that one may initially think), > then > they should not expect significant speed-ups on their parallel codes. > > +1 Thanks for emphasizing this. This is definitely a big issue with multicore. Cheers, Brian > Thanks for sharing your experience anyway, > Francesc > > A Thursday 04 March 2010 18:54:09 Nadav Horesh escrigu?: > > I can not give a reliable answer yet, since I have some more improvement > to > > make. The application is an analysis of a stereoscopic-movie raw-data > > recording (both channels are recorded in the same file). I treat the > data > > as a huge memory mapped file. The idea was to process each channel (left > > and right) on a different core. Right now the application is IO bounded > > since I do classical numpy operation, so each channel (which is handled > as > > one array) is scanned several time. The improvement now over a single > > process is 10%, but I hope to achieve 10% ore after trivial > optimizations. > > > > I used this application as an excuse to dive into multi-processing. I > hope > > that the code I posted here would help someone. > > > > Nadav. > > > > > > -----Original Message----- > > From: numpy-discussion-bounces at scipy.org on behalf of Francesc Alted > > Sent: Thu 04-Mar-10 15:12 > > To: Discussion of Numerical Python > > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > What kind of calculations are you doing with this module? Can you please > > send some examples and the speed-ups you are getting? > > > > Thanks, > > Francesc > > > > A Thursday 04 March 2010 14:06:34 Nadav Horesh escrigu?: > > > Extended module that I used for some useful work. > > > Comments: > > > 1. Sturla's module is better designed, but did not work with very > large > > > (although sub GB) arrays 2. Tested on 64 bit linux (amd64) + > > > python-2.6.4 + numpy-1.4.0 > > > > > > Nadav. > > > > > > > > > -----Original Message----- > > > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > > > Sent: Thu 04-Mar-10 11:55 > > > To: Discussion of Numerical Python > > > Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > > > Maybe the attached file can help. 
Adpted and tested on amd64 linux > > > > > > Nadav > > > > > > > > > -----Original Message----- > > > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > > > Sent: Thu 04-Mar-10 10:54 > > > To: Discussion of Numerical Python > > > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > > > There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf > > > and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in > > > the cookbook page. I am into the same issue and going to test it today. > > > > > > Nadav > > > > > > On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > > > > Hi people, > > > > > > > > I was wondering about the status of using the standard library > > > > multiprocessing module with numpy. I found a cookbook example last > > > > updated one year ago which states that: > > > > > > > > "This page was obsolete as multiprocessing's internals have changed. > > > > More information will come shortly; a link to this page will then be > > > > added back to the Cookbook." > > > > > > > > http://www.scipy.org/Cookbook/multiprocessing > > > > > > > > I also found the code that used to be on this page in the cookbook > but > > > > it does not work any more. So my question is: > > > > > > > > Is it possible to use numpy arrays as shared arrays in an application > > > > using multiprocessing and how do you do it? > > > > > > > > Best regards, > > > > Jesper > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Fri Mar 5 15:24:26 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Fri, 5 Mar 2010 21:24:26 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: Do you have doublets in the v_array? In case not, then you owe me a donut. See attachment. Friedrich P.S.: You misunderstood too, the line you wanted to change was in context to detect back-facing triangles, and there one vertex is sufficient. -------------- next part -------------- A non-text attachment was scrubbed... Name: shading.py Type: application/octet-stream Size: 4411 bytes Desc: not available URL: From gokhansever at gmail.com Fri Mar 5 17:35:44 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Fri, 5 Mar 2010 16:35:44 -0600 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <4B910E62.7070209@gmail.com> References: <4B910E62.7070209@gmail.com> Message-ID: <49d6b3501003051435s71e800cbq5a2461348220d1d2@mail.gmail.com> On Fri, Mar 5, 2010 at 8:00 AM, Bruce Schultz wrote: > Hi, > > I've just started playing with numpy and have noticed that when printing a > structured array that the output is not nicely formatted. Is there a way to > make the formatting look the same as it does for an unstructured array? 
> > Here an example of what I mean: > > data = [ (1, 2), (3, 4.1) ] > dtype = [('x', float), ('y', float)] > print '### ndarray' > a = numpy.array(data) > print a > print '### structured array' > a = numpy.array(data, dtype=dtype) > print a > > Output is: > ### ndarray > [[ 1. 2. ] > [ 3. 4.1]] > ### structured array > [(1.0, 2.0) (3.0, 4.0999999999999996)] > > > Thanks > Bruce > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I still couldn't figure out how floating point numbers look nicely on screen in cases like yours (i.e., trying numpy.array2string()) but you can make sure by using numpy.savetxt("file", array, fmt="%.1f") you will always have specified precision in the written file. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From geometrian at gmail.com Fri Mar 5 17:58:52 2010 From: geometrian at gmail.com (Ian Mallett) Date: Fri, 5 Mar 2010 14:58:52 -0800 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: Cool--this works perfectly now :-) Unfortunately, it's actually slower :P Most of the slowest part is in the removing doubles section. Some of the costliest calls: #takes 0.04 seconds inner = np.inner(ns, v1s - some_point) #0.0840001106262 sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1) #0.0329999923706 sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0) #0.0269999504089 comparison_sum = (sum_1 == sum_2) #0.0909998416901 diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1) #0.0340001583099 diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0) #0.0269999504089 comparison_diff = (diff_1 == diff_2) #0.0230000019073 same_edges = comparison_sum * comparison_diff #0.128999948502 doublet_count = same_edges.sum(axis = 0) Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Mar 6 02:29:52 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 6 Mar 2010 16:29:52 +0900 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: References: <5b8d13221002280231q4c3eb8fm8b8f25b8bbc36962@mail.gmail.com> <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> <4B8F172A.90802@silveregg.co.jp> Message-ID: <5b8d13221003052329v2fd3ea2frbd8f8d64c9026c22@mail.gmail.com> On Fri, Mar 5, 2010 at 1:22 PM, Patrick Marsh wrote: > > I've run the Numpy superpack installer for Python 2.6 built with MinGW > through the dependency walker. ?Unfortunately, outside of checking for some > extremely obviously things, I'm in way over my head in?interpreting?the > output (although, I'd like to learn). ?I've put the output from the program > here:?http://www.patricktmarsh.com/numpy/20100303.py26.superpack.dependencies.txt. > ?I can also put the binary up somewhere, too if someone wants to check that. I have just attempted to build the super pack installer on windows 7 Ultimate (32 bits), and did not encounter any issue, the testsuite passing everything but a few things unrelated to our problem here. Could you put your binary somewhere so that I can look at it ? 
David From friedrichromstedt at gmail.com Sat Mar 6 04:20:52 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sat, 6 Mar 2010 10:20:52 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: 2010/3/5 Ian Mallett : > Cool--this works perfectly now :-) :-) > Unfortunately, it's actually slower :P? Most of the slowest part is in the > removing doubles section. Hmm. Let's see ... Can you tell me how I can test the time calls in a script take? I have no idea. > #takes 0.04 seconds > inner = np.inner(ns, v1s - some_point) I think I can do nothing about that at the moment. > #0.0840001106262 > sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1) > > #0.0329999923706 > sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0) > > #0.0269999504089 > comparison_sum = (sum_1 == sum_2) We can leave out the repeat() calls and leave only the reshape() calls there. Numpy will substitute dimi == 1 dimensions with stride == 0, i.e., it will effectively repeat those dimension, just as we did it explicitly. > #0.0909998416901 > diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1) > > #0.0340001583099 > diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0) > > #0.0269999504089 > comparison_diff = (diff_1 == diff_2) Same here. Delete the repeat() calls, but not the reshape() calls. > #0.0230000019073 > same_edges = comparison_sum * comparison_diff Hmm, maybe use numpy.logical_and(comparison_sum, comparison_diff)? I don't know, but I guess it is in some way optimised for such things. > #0.128999948502 > doublet_count = same_edges.sum(axis = 0) Maybe try axis = 1 instead. I wonder why this is so slow. Or maybe it's because he does the conversion to ints on-the-fly, so maybe try same_edges.astype(numpy.int8).sum(axis = 0). Hope this gives some improvement. I attach the modified version. Ah, one thing to mention, have you not accidentally timed also the printout functions? They should be pretty slow. Friedrich From friedrichromstedt at gmail.com Sat Mar 6 04:42:37 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sat, 6 Mar 2010 10:42:37 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: 2010/3/5 Ian Mallett : > #takes 0.04 seconds > inner = np.inner(ns, v1s - some_point) Ok, I don't know why I was able to overlook this: dotprod = (ns * (v1s - some_point)).sum(axis = 1) The things with the inner product have been deleted. Now I will really *attach* it ... Hope it's faster, Friedrich -------------- next part -------------- A non-text attachment was scrubbed... Name: shading.py Type: application/octet-stream Size: 4105 bytes Desc: not available URL: From patrickmarshwx at gmail.com Sat Mar 6 11:44:56 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Sat, 6 Mar 2010 10:44:56 -0600 Subject: [Numpy-discussion] Building Numpy Windows Superpack In-Reply-To: <5b8d13221003052329v2fd3ea2frbd8f8d64c9026c22@mail.gmail.com> References: <4B8B1C51.60604@silveregg.co.jp> <5b8d13221003030648s772d39cfk7ba49ebc461aa4b@mail.gmail.com> <4B8F172A.90802@silveregg.co.jp> <5b8d13221003052329v2fd3ea2frbd8f8d64c9026c22@mail.gmail.com> Message-ID: On Sat, Mar 6, 2010 at 1:29 AM, David Cournapeau wrote: > On Fri, Mar 5, 2010 at 1:22 PM, Patrick Marsh > wrote: > > > > > I've run the Numpy superpack installer for Python 2.6 built with MinGW > > through the dependency walker. 
Unfortunately, outside of checking for > some > > extremely obviously things, I'm in way over my head in interpreting the > > output (although, I'd like to learn). I've put the output from the > program > > here: > http://www.patricktmarsh.com/numpy/20100303.py26.superpack.dependencies.txt > . > > I can also put the binary up somewhere, too if someone wants to check > that. > > I have just attempted to build the super pack installer on windows 7 > Ultimate (32 bits), and did not encounter any issue, the testsuite > passing everything but a few things unrelated to our problem here. > > Could you put your binary somewhere so that I can look at it ? > > You can find the binary here: http://www.patricktmarsh.com/numpy/20100306.py26-1.4.0-win32.superpack.exe I built this one today. Patrick -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Sat Mar 6 13:04:10 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Sat, 6 Mar 2010 20:04:10 +0200 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy References: <201003041412.24253.faltet@pytables.org><710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il><201003050953.02357.faltet@pytables.org> <6ce0ac131003051129g79e1ed35macd121688a798acc@mail.gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD324@ex3.envision.co.il> I did some optimization, and the results are very instructive, although not surprising: javascript:SetCmd(cmdSend); As I wrote before, I processed stereoscopic movie recordings, by making each a memory mapped file and processing it in several steps. By this way I produced extra GB of transient data. Running as one process took 45 seconds, and in dual parallel process ~40 seconds. After rewriting the application to process the recording frame by frame. The code became shorter and the new scores are: One process --- 16 seconds, and dual process --- 9 seconds. What I learned: * Design for multi-procssing from the start, not as afterthought * Shared memory works, but on the expense of code elegance (much like common blocks in fortran) * Memory mapped files can be used much as shared memory. The strange thing is that I got an ignored AttributeError on every frame access to the memory mapped file from the child process. Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Brian Granger Sent: Fri 05-Mar-10 21:29 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy Francesc, Yeah, 10% of improvement by using multi-cores is an expected figure for > memory > bound problems. This is something people must know: if their computations > are > memory bound (and this is much more common that one may initially think), > then > they should not expect significant speed-ups on their parallel codes. > > +1 Thanks for emphasizing this. This is definitely a big issue with multicore. Cheers, Brian > Thanks for sharing your experience anyway, > Francesc > > A Thursday 04 March 2010 18:54:09 Nadav Horesh escrigu?: > > I can not give a reliable answer yet, since I have some more improvement > to > > make. The application is an analysis of a stereoscopic-movie raw-data > > recording (both channels are recorded in the same file). 
I treat the > data > > as a huge memory mapped file. The idea was to process each channel (left > > and right) on a different core. Right now the application is IO bounded > > since I do classical numpy operation, so each channel (which is handled > as > > one array) is scanned several time. The improvement now over a single > > process is 10%, but I hope to achieve 10% ore after trivial > optimizations. > > > > I used this application as an excuse to dive into multi-processing. I > hope > > that the code I posted here would help someone. > > > > Nadav. > > > > > > -----Original Message----- > > From: numpy-discussion-bounces at scipy.org on behalf of Francesc Alted > > Sent: Thu 04-Mar-10 15:12 > > To: Discussion of Numerical Python > > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > What kind of calculations are you doing with this module? Can you please > > send some examples and the speed-ups you are getting? > > > > Thanks, > > Francesc > > > > A Thursday 04 March 2010 14:06:34 Nadav Horesh escrigu?: > > > Extended module that I used for some useful work. > > > Comments: > > > 1. Sturla's module is better designed, but did not work with very > large > > > (although sub GB) arrays 2. Tested on 64 bit linux (amd64) + > > > python-2.6.4 + numpy-1.4.0 > > > > > > Nadav. > > > > > > > > > -----Original Message----- > > > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > > > Sent: Thu 04-Mar-10 11:55 > > > To: Discussion of Numerical Python > > > Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > > > Maybe the attached file can help. Adpted and tested on amd64 linux > > > > > > Nadav > > > > > > > > > -----Original Message----- > > > From: numpy-discussion-bounces at scipy.org on behalf of Nadav Horesh > > > Sent: Thu 04-Mar-10 10:54 > > > To: Discussion of Numerical Python > > > Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy > > > > > > There is a work by Sturla Molden: look for multiprocessing-tutorial.pdf > > > and sharedmem-feb13-2009.zip. The tutorial includes what is dropped in > > > the cookbook page. I am into the same issue and going to test it today. > > > > > > Nadav > > > > > > On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote: > > > > Hi people, > > > > > > > > I was wondering about the status of using the standard library > > > > multiprocessing module with numpy. I found a cookbook example last > > > > updated one year ago which states that: > > > > > > > > "This page was obsolete as multiprocessing's internals have changed. > > > > More information will come shortly; a link to this page will then be > > > > added back to the Cookbook." > > > > > > > > http://www.scipy.org/Cookbook/multiprocessing > > > > > > > > I also found the code that used to be on this page in the cookbook > but > > > > it does not work any more. So my question is: > > > > > > > > Is it possible to use numpy arrays as shared arrays in an application > > > > using multiprocessing and how do you do it? 
> > > > > > > > Best regards, > > > > Jesper > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 5471 bytes Desc: not available URL: From geometrian at gmail.com Sat Mar 6 13:45:36 2010 From: geometrian at gmail.com (Ian Mallett) Date: Sat, 6 Mar 2010 10:45:36 -0800 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: Hi, On Sat, Mar 6, 2010 at 1:20 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > Hmm. Let's see ... Can you tell me how I can test the time calls in a > script take? I have no idea. > I've been doing: t1 = time.time() #code t2 = time.time() print "[code]:",t2-t1 > > #0.0840001106262 > > sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1) > > > > #0.0329999923706 > > sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0) > > > > #0.0269999504089 > > comparison_sum = (sum_1 == sum_2) > > We can leave out the repeat() calls and leave only the reshape() calls > there. Numpy will substitute dimi == 1 dimensions with stride == 0, > i.e., it will effectively repeat those dimension, just as we did it > explicitly. > Wow! Drops to immeasurably quick :D > > #0.0909998416901 > > diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1) > > > > #0.0340001583099 > > diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0) > > > > #0.0269999504089 > > comparison_diff = (diff_1 == diff_2) > > Same here. Delete the repeat() calls, but not the reshape() calls. > Once again, drops to immeasurably quick :D > > #0.0230000019073 > > same_edges = comparison_sum * comparison_diff > > Hmm, maybe use numpy.logical_and(comparison_sum, comparison_diff)? I > don't know, but I guess it is in some way optimised for such things. > It's marginally faster. > > #0.128999948502 > > doublet_count = same_edges.sum(axis = 0) > > Maybe try axis = 1 instead. I wonder why this is so slow. Or maybe > it's because he does the conversion to ints on-the-fly, so maybe try > same_edges.astype(numpy.int8).sum(axis = 0). > Actually, it's marginally slower :S > Hope this gives some improvement. I attach the modified version. > > Ah, one thing to mention, have you not accidentally timed also the > printout functions? They should be pretty slow. > Nope--I've been timing as above. 
> Friedrich > On Sat, Mar 6, 2010 at 1:42 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > 2010/3/5 Ian Mallett : > > #takes 0.04 seconds > > inner = np.inner(ns, v1s - some_point) > > Ok, I don't know why I was able to overlook this: > > dotprod = (ns * (v1s - some_point)).sum(axis = 1) > Much faster :D So, the most costly lines: comparison_sum = (sum_1 == sum_2) #0.024 sec comparison_diff = (diff_1 == diff_2) #0.024 sec same_edges = np.logical_and(comparison_sum, comparison_diff) #0.029 sec doublet_count = same_edges.sum(axis = 0) #0.147 Thanks again, Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Sat Mar 6 15:03:55 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sat, 6 Mar 2010 21:03:55 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: 2010/3/6 Ian Mallett : > On Sat, Mar 6, 2010 at 1:20 AM, Friedrich Romstedt > wrote: >> Hmm. ?Let's see ... Can you tell me how I can test the time calls in a >> script take? ?I have no idea. > > I've been doing: > t1 = time.time() > #code > t2 = time.time() > print "[code]:",t2-t1 Ok. I just wonder how you can measure those times. On my five-years old machine, they are < 10ms resolution. > Wow!? Drops to immeasurably quick :D Yeaah! >> > #0.128999948502 >> > doublet_count = same_edges.sum(axis = 0) >> >> Maybe try axis = 1 instead. ?I wonder why this is so slow. ?Or maybe >> it's because he does the conversion to ints on-the-fly, so maybe try >> same_edges.astype(numpy.int8).sum(axis = 0). > > Actually, it's marginally slower :S Hmm. I tried the axis = 1 thing, and it also gave no improvement (maybe you can try too, I'm guessing I'm actually measuring the time Python spends in exeuting my loop to get significant times ...) >> 2010/3/5 Ian Mallett : >> > #takes 0.04 seconds >> > inner = np.inner(ns, v1s - some_point) >> >> Ok, I don't know why I was able to overlook this: >> >> dotprod = (ns * (v1s - some_point)).sum(axis = 1) > > Much faster :D :-) I'm glad to be able to help! > So, the most costly lines: > comparison_sum = (sum_1 == sum_2) #0.024 sec > comparison_diff = (diff_1 == diff_2) #0.024 sec > same_edges = np.logical_and(comparison_sum, comparison_diff) #0.029 sec > doublet_count = same_edges.sum(axis = 0) #0.147 At the moment, I can do nothing about that. Seems that we have reached the limit. Anyhow, is it now faster than your Python list implementation, and if yes, how much? How large was your gain by using numpy means at all? I'm just curious. Friedrich From geometrian at gmail.com Sat Mar 6 15:08:23 2010 From: geometrian at gmail.com (Ian Mallett) Date: Sat, 6 Mar 2010 12:08:23 -0800 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: On Sat, Mar 6, 2010 at 12:03 PM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > At the moment, I can do nothing about that. Seems that we have > reached the limit. Anyhow, is it now faster than your Python list > implementation, and if yes, how much? How large was your gain by > using numpy means at all? I'm just curious. > Unfortunately, the pure Python implementation is actually an order of magnitude faster. The fastest solution right now is to use numpy for the transformations, then convert it back into a list (.tolist()) and use Python for the rest. Here's the actual Python code. 
def glLibInternal_edges(object,lightpos): edge_set = set([]) edges = {} for sublist in xrange(object.number_of_lists): #There's only one sublist here face_data = object.light_volume_face_data[sublist] for indices in face_data: #v1,v2,v3,n normal = object.transformed_normals[sublist][indices[3]] v1,v2,v3 = [ object.transformed_vertices[sublist][indices[i]] for i in xrange(3) ] if abs_angle_between_rad(normal,vec_subt(v1,lightpos)) Thanks, Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Sat Mar 6 17:26:45 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sat, 6 Mar 2010 23:26:45 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: I'm a bit unhappy with your code, because it's so hard to read tbh. You don't like objects? 2010/3/6 Ian Mallett : > Unfortunately, the pure Python implementation is actually an order of > magnitude faster.? The fastest solution right now is to use numpy for the > transformations, then convert it back into a list (.tolist()) and use Python > for the rest. :-( But well, if it's faster, than do it that way, right? I can only guess, that for large datasets, the comparison to compare each and every vector "in parallel" makes it slow down. So an iterative approach might be fine. In fact, no one would implement such things in C++ using not lists with pushback(), I guess. A bit of commentary about your code: > Here's the actual Python code. > > def glLibInternal_edges(object,lightpos): > ??? edge_set = set([]) Where do you use edge_set? Btw, set() would do it. > ??? edges = {} > ??? for sublist in xrange(object.number_of_lists): #There's only one sublist here > ??????? face_data = object.light_volume_face_data[sublist] > ??????? for indices in face_data: #v1,v2,v3,n Here objects would fit in nicely. for indices in face_data: normal = object.transformed_normals[sublist][indices.nindex] (v1, v2, v3) = [object.transformed_vertices[sublist][vidx] for vidx in indices.vindices] > ??????????? normal = object.transformed_normals[sublist][indices[3]] > ??????????? v1,v2,v3 = [ object.transformed_vertices[sublist][indices[i]] for i in xrange(3) ] > ??????????? if abs_angle_between_rad(normal,vec_subt(v1,lightpos)) 0: (...) > ??????????????? for p1,p2 in [[indices[0],indices[1]], > ????????????????????????????? [indices[1],indices[2]], > ????????????????????????????? [indices[2],indices[0]]]: > ??????????????????? edge = [p1,p2] Why not writing: for edge in numpy.asarray([[indices[0], indices[1]], (...)]): (...) > ??????????????????? index = 0 Where do you use index? It's a lonely remnant? > ??????????????????? edge2 = list(edge) Why do you convert the unmodified edge list into a list again? Btw, I found your numbering quite a bit annoying. No one can tell from an appended 2 what purpose that xxx2 has. Furthermore, I think a slight speedup could be reached by: unique = (egde.sum(), abs((egde * [1, -1]).sum())) > ??????????????????? edge2.sort() > ??????????????????? edge2 = tuple(edge2) > ??????????????????? if edge2 in edges: edges[edge2][1] += 1 > ??????????????????? else:????????????? edges[edge2] = [edge,1] > > ??? edges2 = [] > ??? for edge_data in edges.values(): > ??????? if edge_data[1] == 1: > ??????????? p1 = object.transformed_vertices[sublist][edge_data[0][0]] > ??????????? p2 = object.transformed_vertices[sublist][edge_data[0][1]] > ??????????? edges2.append([p1,p2]) > ??? 
return edges2 My 2 cents: class Edge: def __init__(self, indices, count): self.indices = indices def __hash__(self): return hash(self.unique) edges = {} (...) edges.setdefault() edges[egde2] += 1 for (indices, count) in edges.items(): if count == 1: edges_clean.append(object.transformed_vertices[sublist][indices]) provided that transformed_vertices[sublist] is an ndarray. You can iterate over ndarrays using standard Python iteration, it will provide an iterator object. From friedrichromstedt at gmail.com Sat Mar 6 17:30:22 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sat, 6 Mar 2010 23:30:22 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: Sent prematurely, typed tab and then space :-(. Real message work in progress. From friedrichromstedt at gmail.com Sat Mar 6 18:18:53 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sun, 7 Mar 2010 00:18:53 +0100 Subject: [Numpy-discussion] Iterative Matrix Multiplication In-Reply-To: References: Message-ID: "I'm a bit unhappy with your code, because it's so hard to read tbh. You don't like objects?" I would phrase this differently now: I think you can improve your code by using objects when they are appropriate. :-) 2010/3/6 Ian Mallett : > Unfortunately, the pure Python implementation is actually an order of > magnitude faster. The fastest solution right now is to use numpy for the > transformations, then convert it back into a list (.tolist()) and use Python > for the rest. :-( But well, if it's faster, than do it that way, right? I can only guess, that for large datasets, the comparison to compare each and every vector "in parallel" makes it slow down. So an iterative approach might be fine. In fact, no one would implement such things in C++ using not lists with pushback(), I guess. A bit of commentary about your code: > Here's the actual Python code. > > def glLibInternal_edges(object,lightpos): > edge_set = set([]) Where do you use edge_set? Btw, set() would do it. > edges = {} > for sublist in xrange(object.number_of_lists): #There's only one sublist here > face_data = object.light_volume_face_data[sublist] > for indices in face_data: #v1,v2,v3,n Here objects would fit in nicely. for indices in face_data: normal = object.transformed_normals[sublist][indices.nindex] (v1, v2, v3) = [object.transformed_vertices[sublist][vidx] for vidx in indices.vindices] > normal = object.transformed_normals[sublist][indices[3]] > v1,v2,v3 = [ object.transformed_vertices[sublist][indices[i]] for i in xrange(3) ] > if abs_angle_between_rad(normal,vec_subt(v1,lightpos)) 0: (...) > for p1,p2 in [[indices[0],indices[1]], > [indices[1],indices[2]], > [indices[2],indices[0]]]: > edge = [p1,p2] Why not writing: for edge in numpy.asarray([[indices[0], indices[1]], (...)]): (...) > index = 0 Where do you use index? It's a lonely remnant? > edge2 = list(edge) Why do you convert the unmodified edge list into a list again? Btw, I found your numbering quite a bit annoying. No one can tell from an appended 2 what purpose that xxx2 has. 
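(For reference, a compact, runnable sketch of the dictionary-counting scheme being discussed here; the `faces` and `facing` arrays below are invented stand-ins for the mesh data, not Ian's actual structures.)

import numpy as np

# faces: (M, 3) vertex indices per triangle; facing: which triangles face the light.
faces = np.array([[0, 1, 2], [0, 2, 3]])
facing = np.array([True, True])

edge_count = {}
for face in faces[facing]:
    for a, b in ((face[0], face[1]), (face[1], face[2]), (face[2], face[0])):
        key = (int(min(a, b)), int(max(a, b)))      # order-independent edge key
        edge_count[key] = edge_count.get(key, 0) + 1

# Edges used by exactly one facing triangle form the silhouette.
silhouette = sorted(edge for edge, n in edge_count.items() if n == 1)
print(silhouette)    # -> [(0, 1), (0, 3), (1, 2), (2, 3)]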
Furthermore, I think a slight speedup could be reached by: unique = (egde.sum(), abs((egde * [1, -1]).sum())) > edge2.sort() > edge2 = tuple(edge2) > if edge2 in edges: edges[edge2][1] += 1 > else: edges[edge2] = [edge,1] > > edges2 = [] > for edge_data in edges.values(): > if edge_data[1] == 1: > p1 = object.transformed_vertices[sublist][edge_data[0][0]] > p2 = object.transformed_vertices[sublist][edge_data[0][1]] > edges2.append([p1,p2]) > return edges2 My 2 cents: class CountingEdge: def __init__(self, indices): self.indices = indices self.count = 0 edges = {} (...) edges.setdefault(unique, CountingEdge(edge)) edges[unique].count += 1 for counting_edge in edges.values(): if counting_edge.count == 1: edges_clean.append(object.transformed_vertices[sublist][counting_edge.indices]) provided that transformed_vertices[sublist] is an ndarray. You can iterate over ndarrays using standard Python iteration, it will provide an iterator object. I'm always good for fresh ideas, although they nearly never get accepted, but here is another cent: I want to save you calculating the mutual annihilation for several light sources again and again. BEFORE SHADING 1. For the mesh, calculate all edges. Put their unique tuples in a list. Store together with the unique tuple from what triangle they originate and what the original indices were. When you use the sorted indices as unique value, the original indices will be virtually identical with the unique tuple. class EdgeOfTriangle: def __init__(self, unique, vindices, triangleindex): self.unique = unique self.vindices = vindices self.triangleindex = triangleindex def __hash__(self): # For the "in" statement return hash(self.unique) def __eq__(self, other): return other == self.triangle edges = [] for triangle_descriptor in triangle_descriptors: (...) egde1 = EdgeOfTriangle(unique, vindices1, triangle_descriptor.nindex) edge2 = EdgeOfTriangle(unique, vindices2, traingle_descriptor.nindex) (and so on) for new_egde in (edge1, egde2, egde3): if new_edge not in edges: edges.append(new_edge) edges = numpy.asarray(egdes) 2. Make an ndarray matrix from numpy.zeros(len(edges), len(triangles)). Iterate through the list of EdgeOfTriangle objects created in step (1.) and put an offset +1 in the elements [edge_of_triangle.index, edge_of_triangle.triangle], ..., [egde_of_triangle.edge3, edge_of_triangle.triange]. Now, this matrix tells you how often *all* edges will be selected given the back-facing triangles. The matrix shall be called impact_matrix. Virtually do this by: edges_reshaped = edges.reshape((len(edges), 1)) triangles = numpy.arange(0, len(n_array)).reshape((1, len(n_array))) # Employs __eq__ above in EdgeOfTriangle: impact_matrix = (egdes_reshaped == triangles) DURING SHADING 3. Calculate the dot product with my old method. It gives you a list of floats. 4. Calculate selected = numpy.dot(impact_matrix, dotproducts > 0). It gives you precisely the number of times each edge is selected. selected = (numpy.dot(impact_matrix, dotproducts > 0) == 1) 5. Calculate edge_array[selected == 1], when edge_array is the array holding *all* edges occuring (built during step (1.). It will give you the vertex indices of the edges comprising the silhuette. 
def getvindices(edge_of_triangle): return edge_of_triangle.vindices ugetvindices = numpy.vectorize(getvindices) silhuette_edges_of_triangles = egdes[selected] silhuette_vindices = numpy.asarray(ugetvindices(silhuette_edges_of_triangles)) silhuette_vertices = v_array[silhuette_vindices] Checked everything only *once to twice* this time, fwiw, Friedrich From d.l.goldsmith at gmail.com Sun Mar 7 00:15:09 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 6 Mar 2010 21:15:09 -0800 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? Message-ID: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> >>> x = numpy.array(3) >>> x array(3) >>> x.shape () My question is: why? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From geometrian at gmail.com Sun Mar 7 00:37:04 2010 From: geometrian at gmail.com (Ian Mallett) Date: Sat, 6 Mar 2010 21:37:04 -0800 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? In-Reply-To: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> References: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> Message-ID: >>> x = numpy.array(3) >>> x array(3) >>> x.shape () >>> y = numpy.array([3]) >>> y array([3]) >>> y.shape (1,) Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Sun Mar 7 00:46:21 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 6 Mar 2010 21:46:21 -0800 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? In-Reply-To: References: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> Message-ID: <45d1ab481003062146r34daaafdg507b07b916e91447@mail.gmail.com> On Sat, Mar 6, 2010 at 9:37 PM, Ian Mallett wrote: > >>> x = numpy.array(3) > >>> x > array(3) > >>> x.shape > () > >>> y = numpy.array([3]) > >>> y > array([3]) > >>> y.shape > (1,) > > Ian > Thanks, Ian. I already figured out how to make it not so, but I still want to understand the design reasoning behind it being so in the first place (thus the use of the question "why (is it so)," not "how (to make it different)"). DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From geometrian at gmail.com Sun Mar 7 01:26:45 2010 From: geometrian at gmail.com (Ian Mallett) Date: Sat, 6 Mar 2010 22:26:45 -0800 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? In-Reply-To: <45d1ab481003062146r34daaafdg507b07b916e91447@mail.gmail.com> References: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> <45d1ab481003062146r34daaafdg507b07b916e91447@mail.gmail.com> Message-ID: On Sat, Mar 6, 2010 at 9:46 PM, David Goldsmith wrote: > Thanks, Ian. I already figured out how to make it not so, but I still want > to understand the design reasoning behind it being so in the first place > (thus the use of the question "why (is it so)," not "how (to make it > different)"). > Well, I can't help you with that. I would also ask why this design even exists? Equating an array with a single number doesn't make sense to me. Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Sun Mar 7 02:04:19 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 6 Mar 2010 23:04:19 -0800 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? 
In-Reply-To: References: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> <45d1ab481003062146r34daaafdg507b07b916e91447@mail.gmail.com> Message-ID: <45d1ab481003062304o18221d22r6093c5bd4ee64255@mail.gmail.com> On Sat, Mar 6, 2010 at 10:26 PM, Ian Mallett wrote: > On Sat, Mar 6, 2010 at 9:46 PM, David Goldsmith wrote: > >> Thanks, Ian. I already figured out how to make it not so, but I still >> want to understand the design reasoning behind it being so in the first >> place (thus the use of the question "why (is it so)," not "how (to make it >> different)"). >> > Well, I can't help you with that. I would also ask why this design even > exists? Equating an array with a single number doesn't make sense to me. > Ian > Here's an (unintended) use case: I wanted to convert anything in an array that's close to zero to be zero (and leave the rest as is), but I want it to be robust so that if it receives a scalar, it can work w/ that, too. Here's my (working) code (I'm sure that once Robert sees it, he'll be able to replace it w/ a one-liner): def convert_close(arg): arg = N.array(arg) if not arg.shape: arg = N.array((arg,)) if arg.size: t = N.array([0 if N.allclose(temp, 0) else temp for temp in arg]) if len(t.shape) - 1: return N.squeeze(t) else: return t else: return N.array() At first I wasn't "casting" arg to be an array upon entry, but I found that if arg is a scalar, my list comprehension failed, so I had choose _some_ sort of sequence to cast scalars to; since arg will typically be an array and that's what I wanted to return as well, it seemed most appropriate to "cast" incoming scalars to arrays. So I added the arg = N.array(arg) at the beginning (N.array(array) = array, and N.array(non-array seq) does the "right" thing as well), but the list comprehension still wouldn't work if arg was a scalar upon entry; after many print statements and much interactive experimenting, I finally figured out that this is because the shape of N.array(scalar) is () (and I thence immediately guessed, correctly of course, that N.array((scalar,)) has shape (1,)). So I added the if not arg.shape: to detect and correct for those zero size arrays, and now it works fine, but I'd still like to know _why_ N.array(scalar).shape == () but N.array((scalar,)).shape == (1,). No biggy, just curious. DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Sun Mar 7 07:30:30 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sun, 7 Mar 2010 13:30:30 +0100 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? 
In-Reply-To: <45d1ab481003062304o18221d22r6093c5bd4ee64255@mail.gmail.com> References: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> <45d1ab481003062146r34daaafdg507b07b916e91447@mail.gmail.com> <45d1ab481003062304o18221d22r6093c5bd4ee64255@mail.gmail.com> Message-ID: First, to David's routine: 2010/3/7 David Goldsmith : > def convert_close(arg): > arg = N.array(arg) > if not arg.shape: > arg = N.array((arg,)) > if arg.size: > t = N.array([0 if N.allclose(temp, 0) else temp for temp in arg]) > if len(t.shape) - 1: > return N.squeeze(t) > else: > return t > else: > return N.array() Ok, chaps, let's code: import numpy def convert_close(ndarray, atol = 1e-5, rtol = 1e-8): ndarray_abs = abs(ndarray) mask = (ndarray_abs > atol + rtol * ndarray_abs) return ndarray * mask > python -i close.py >>> a = numpy.asarray([1e-6]) >>> convert_close(a) array([ 0.]) >>> a = numpy.asarray([1e-6, 1]) >>> convert_close(a) array([ 0., 1.]) >>> a = numpy.asarray(1e-6) >>> convert_close(a) 0.0 >>> a = numpy.asarray([-1e-6, 1]) >>> convert_close(a) array([ 0., 1.]) It's not as good as Robert's (so far virtual) solution, but :-) > On Sat, Mar 6, 2010 at 10:26 PM, Ian Mallett wrote: >> On Sat, Mar 6, 2010 at 9:46 PM, David Goldsmith >> wrote: >>> Thanks, Ian.? I already figured out how to make it not so, but I still >>> want to understand the design reasoning behind it being so in the first >>> place (thus the use of the question "why (is it so)," not "how (to make it >>> different)"). 1. First from a mathematical point of view (don't be frightened): When an array has shape ndarray.shape, then the number of elements contained is: numpy.asarray(ndarray.shape).mul() When I type now: >>> numpy.asarray([]).prod() 1.0 This is the .shape of an scalar ndarray (without any strides), and therefore such a scalar ndarray holds exactly one item. Or, for hard-core friends (-: >>> numpy.asarray([[]]) array([], shape=(1, 0), dtype=float64) >>> numpy.asarray([[]]).prod() 1.0 So, ndarrays without elements yield .prod() == 1.0. This is sensible, because the product shall be algorithmically defined as: def prod(ndarray): product = 1.0 for item in ndarray.flatten(): product *= item return product Thus, the product of nothing is defined to be one to be consistent. One would end up with the same using a recursive definition of prod() instead of this iterative one. 2. From programmer's point of view. You can always write: ndarray[()]. This means, to give no index at all. Indeed, writing: ndarray[1, 2] is equivalent to writing: ndarray[(1, 2)] , as keys are always passed as a tuple or a scalar. Scalar in case of: ndarray[42] . Now, the call: ndarray[()] shall return 'something', which is the complete ndarray, because we didn't indice anything. For multidimensional arrays: a = numpy.ndarray([[1, 2], [3, 4]]) the call: a[0] shall return: array([1, 2]). This is clear. But now, what to return, if we consume all the indices available, e.g. when writing: a[0, 0] ? This means, we return the scalar array array(1) . That's another meaning of scalar arrays. When indicing an ndarray a with a tuple of length N_key, without slices, the return shape will be always: a.shape[N_key:] This means, using all indices available returns a shape: a.shape[a.ndim:] == [] , i.e., a scalar "without" shape. To conclude, everything is consistent when allowing scalar arrays, and everything breaks down if we don't. They are some kind of 0, like the 0 in the whole numbers, which the Roman's didn't know of. It makes things simpler (and more consistent). 
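A quick interactive sketch of both points (typed into a recent interpreter as an illustration; exact reprs may differ slightly between numpy versions):

>>> import numpy
>>> x = numpy.array(3)
>>> x.shape, x.size          # no axes, but still exactly one element
((), 1)
>>> numpy.prod(x.shape)      # the product over an empty shape is 1
1.0
>>> x[()]                    # indexing with the empty tuple returns the item
3
>>> a = numpy.array([[1, 2], [3, 4]])
>>> a[0, 0].shape            # all indices consumed -> scalar, shape ()
()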
Also it unifies scalars and arrays into only one kind of type, which is a great deal. Friedrich From friedrichromstedt at gmail.com Sun Mar 7 07:41:27 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sun, 7 Mar 2010 13:41:27 +0100 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> Message-ID: 2010/3/5 Pierre GM : > 'm'fraid no. I gonna have to investigate that. Please open a ticket with a self-contained example that reproduces the issue. > Thx in advance... > P. I would like to stress the fact that imo this is maybe not a ticket and not a bug. The issue arises when calling a.max() or similar on an empty array a, i.e., with: >>> 0 in a.shape True As opposed to the .prod() of an empty array, such a .max() or .min() cannot be defined, because the set is empty. So it's fully correct to let such calls fail. It's just that the failure happens a bit deep in numpy, and only the traceback gives some hint about what went wrong. I posted something similar also on the matplotlib-users list, sorry for cross-posting thus. fwiw, Friedrich From renesd at gmail.com Sun Mar 7 14:00:03 2010 From: renesd at gmail.com (René Dudfield) Date: Sun, 7 Mar 2010 19:00:03 +0000 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <6ce0ac131003051129g79e1ed35macd121688a798acc@mail.gmail.com> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> <201003050953.02357.faltet@pytables.org> <6ce0ac131003051129g79e1ed35macd121688a798acc@mail.gmail.com> Message-ID: <64ddb72c1003071100n5c786d6cv726731768ad32d73@mail.gmail.com> On Fri, Mar 5, 2010 at 7:29 PM, Brian Granger wrote: > Francesc, > >> Yeah, 10% of improvement by using multi-cores is an expected figure for >> memory >> bound problems. This is something people must know: if their computations >> are >> memory bound (and this is much more common than one may initially think), >> then >> they should not expect significant speed-ups on their parallel codes. >> > > +1 > > Thanks for emphasizing this. This is definitely a big issue with multicore. > > Cheers, > > Brian > Hi, here are a few notes... A) cache B) multiple cores/cpus multiply other optimisations. A) Understanding cache is also very useful. Cache at two levels: 1. disk cache. 2. cpu/core cache. 1. Mmap'd files are useful since you can reuse disk cache as program memory. So large files don't waste ram on the disk cache. For example, processing a 1 gig file can use 1GB of memory with mmap, but 2GB without. ps, learn about madvise for extra goodness :) mmap behaviour is very different on windows/linux/mac osx. The best mmap implementation is on linux. Note that on some OS's the disk cache has separate reserved areas of memory which processes can not use... so mmap is the easiest way to access it. mmaping on SSDs is also quite fast :) 2. cpu cache is what can give you a speedup when you use extra cpus/cores. There are a number of different cpu architectures these days... but generally you will get a speed-up if your cpus access different areas of memory. So don't have cpu1 process one part of the data and then cpu2 the same part - otherwise the cache can get invalidated. Especially if you have an 8MB cache per cpu :) This is why the Xeons, and other high-end cpus, will give you numpy speedups more easily.
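(A concrete illustration of point 1: numpy exposes this kind of mapping through numpy.memmap. The file name, dtype and slice below are made up for the sketch.)

import numpy as np

# Map a large binary file of float32 samples instead of reading it all in;
# the OS pages data in on demand, and those pages double as the disk cache.
data = np.memmap('samples.f32', dtype=np.float32, mode='r')

# Use it like a read-only array; only the pages actually touched get loaded.
print(data[:1000000].sum())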
Also consider processing in chunks less than the size of your cache (especially for multi pass arguments). There's a lot to caching, but I think the above gives enough useful hints :) B) Also, multiple processes can multiply the effects of your other optimisations. A 2x speed up via SSE or other SIMD can be multiplied over each cpu/core. So if you code gets 8x faster with multiple processes, then the 2x optimisation is likely a 16x speed up. The following is a common with optimisation pattern with python code. >From python to numpy you can get a 20x speedup. From numpy to C/C++ you can get up to 5 times speed up (or 50x over python). Then an asm optimisation is 2-4x faster again. So up to 200x faster compared to pure python... then multiply that by 8x, and you have up to 1600x faster code :) Also small optimisations add up... a small 0.2 times speedup can turn into a 1.6 times speed up easily when you have multiple cpus. So as you can see... multiple cores makes it EASIER to optimise programs, since your optimisations are often multiplied. cu, From gael.varoquaux at normalesup.org Sun Mar 7 14:03:21 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 7 Mar 2010 20:03:21 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <64ddb72c1003071100n5c786d6cv726731768ad32d73@mail.gmail.com> References: <201003041412.24253.faltet@pytables.org> <710F2847B0018641891D9A21602763605AD323@ex3.envision.co.il> <201003050953.02357.faltet@pytables.org> <6ce0ac131003051129g79e1ed35macd121688a798acc@mail.gmail.com> <64ddb72c1003071100n5c786d6cv726731768ad32d73@mail.gmail.com> Message-ID: <20100307190321.GI7459@phare.normalesup.org> On Sun, Mar 07, 2010 at 07:00:03PM +0000, Ren? Dudfield wrote: > 1. Mmap'd files are useful since you can reuse disk cache as program > memory. So large files don't waste ram on the disk cache. I second that. mmaping has worked very well for me for large datasets, especialy in the context of reducing memory pressure. Ga?l From d.l.goldsmith at gmail.com Mon Mar 8 02:22:14 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sun, 7 Mar 2010 23:22:14 -0800 Subject: [Numpy-discussion] Why is the shape of a singleton array the empty tuple? In-Reply-To: References: <45d1ab481003062115i4280f861r6ca370db09f6a91f@mail.gmail.com> <45d1ab481003062146r34daaafdg507b07b916e91447@mail.gmail.com> <45d1ab481003062304o18221d22r6093c5bd4ee64255@mail.gmail.com> Message-ID: <45d1ab481003072322w77bbfc10g3ea68eb831e276d5@mail.gmail.com> On Sun, Mar 7, 2010 at 4:30 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > First, to David's routine: > > 2010/3/7 David Goldsmith : > > def convert_close(arg): > > arg = N.array(arg) > > if not arg.shape: > > arg = N.array((arg,)) > > if arg.size: > > t = N.array([0 if N.allclose(temp, 0) else temp for temp in arg]) > > if len(t.shape) - 1: > > return N.squeeze(t) > > else: > > return t > > else: > > return N.array() > > Ok, chaps, let's code: > > import numpy > > def convert_close(ndarray, atol = 1e-5, rtol = 1e-8): > ndarray_abs = abs(ndarray) > mask = (ndarray_abs > atol + rtol * ndarray_abs) > return ndarray * mask > > > python -i close.py > >>> a = numpy.asarray([1e-6]) > >>> convert_close(a) > array([ 0.]) > >>> a = numpy.asarray([1e-6, 1]) > >>> convert_close(a) > array([ 0., 1.]) > >>> a = numpy.asarray(1e-6) > >>> convert_close(a) > 0.0 > >>> a = numpy.asarray([-1e-6, 1]) > >>> convert_close(a) > array([ 0., 1.]) > Great, thanks! 
DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Mon Mar 8 02:30:48 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sun, 7 Mar 2010 23:30:48 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> Message-ID: <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> On Sun, Mar 7, 2010 at 4:41 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > 2010/3/5 Pierre GM : > > 'm'fraid no. I gonna have to investigate that. Please open a ticket with > a self-contained example that reproduces the issue. > > Thx in advance... > > P. > > I would like to stress the fact that imo this is maybe not ticket and not a > bug. > > The issue arises when calling a.max() or similar of empty arrays a, i.e., > with: > > >>> 0 in a.shape > True > > Opposed to the .prod() of an empty array, such a .max() or .min() > cannot be defined, because the set is empty. So it's fully correct to > let such calls fail. Just the failure is a bit deep in numpy, and > only the traceback gives some hint what went wrong. > > I posted something similar also on the matplotlib-users list, sorry > for cross-posting thus. > Any suggestions, then, how to go about figuring out what's happening in my code that's causing this "feature" to manifest itself? DG > > fwiw, > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Mon Mar 8 03:44:52 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 8 Mar 2010 00:44:52 -0800 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) Message-ID: Hello, Given the interest in participating in the GSoC this summer, I am forwarding a very interesting email from Titus Brown. If you are interested in doing a GSoC or mentoring, please read his email carefully. Basically, the PSF will be focuing on Py3K-related projects. Given Pauli's work on Py3K support for NumPy, I think we might be in a good position to move forward on porting the rest of our stack to Py3K. So we should focus on projects to: 1. finish porting and testing NumPy with Py3K 2. port and test SciPy with Py3K 3. port and test matplotlib with Py3K 4. port and test ipython with Py3K 5. etc. Given the PSF's stated emphasis this year, it probably doesn't make sense to pursue any non-Py3K projects. Jarrod ---------- Forwarded message ---------- From: C. Titus Brown Date: Tue, Mar 2, 2010 at 6:12 AM Subject: [SoC2009-mentors] [ctb at msu.edu: GSoC 2010 - it's on!] To: soc2009-mentors at python.org ----- Forwarded message from "C. Titus Brown" ----- Date: Wed, 24 Feb 2010 12:54:52 -0800 From: "C. Titus Brown" To: psf-members at python.org Cc: gsoc2010-mentors at python.org Subject: GSoC 2010 - it's on! Hi all, it's that time of year again, and Google has decided to run the Google Summer of Code again! ?http://groups.google.com/group/google-summer-of-code-discuss/browse_thread/thread/d839c0b02ac15b3f ?http://socghop.appspot.com/ Arc Riley has stepped up to run it for the PSF again this year, and I'm backstopping him. 
?If you are interested in mentoring or kibbitzing on those who are, please sign up for the soc2010-mentors mailing list here, ?http://mail.python.org/mailman/listinfo/soc2010-mentors This year we're proposing to solicit and prioritize applications for Python 3.x -- 3K tools, porting old projects, etc. ?Python 2.x projects will be a distinct second. ?There will be no "core" category this year, although obviously if someone on one of the core teams wants to push a project it'll help! If you have an idea for a project, please send it to the -mentors list and add it to the wiki at ? http://wiki.python.org/moin/SummerOfCode/2010 We're also going to change a few things up to make it more useful to the PSF. Specifically, ?- the foundation is going to *require* 1 blog post/wk from each student. ?- we're going to hire an administrative assistant to monitor the students. ?- the student application process will be a bit more rigorous and job-app ? like; the Django SF has been doing this for at least one round and they ? claim that it results in much better and more serious students. ?- we'll be focusing on student quality more than on project egalitarianism. ? If project X can recruit three fantastic students to one fantastic and one ? mediocre student for project Y, then project X gets three and project Y ? gets one. The hope is that this will make the GSoC much more useful for Python than it has been in the past. Arc will be posting something to the www.python.org site and python-announce soon, too. Followups to soc2010-mentors. cheers, --titus -- C. Titus Brown, ctb at msu.edu ----- End forwarded message ----- From bsouthey at gmail.com Mon Mar 8 09:52:38 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 08 Mar 2010 08:52:38 -0600 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> Message-ID: <4B950F36.3020303@gmail.com> On 03/08/2010 01:30 AM, David Goldsmith wrote: > On Sun, Mar 7, 2010 at 4:41 AM, Friedrich Romstedt > > wrote: > > 2010/3/5 Pierre GM >: > > 'm'fraid no. I gonna have to investigate that. Please open a > ticket with a self-contained example that reproduces the issue. > > Thx in advance... > > P. > > I would like to stress the fact that imo this is maybe not ticket > and not a bug. > > The issue arises when calling a.max() or similar of empty arrays > a, i.e., with: > > >>> 0 in a.shape > True > > Opposed to the .prod() of an empty array, such a .max() or .min() > cannot be defined, because the set is empty. So it's fully correct to > let such calls fail. Just the failure is a bit deep in numpy, and > only the traceback gives some hint what went wrong. > > I posted something similar also on the matplotlib-users list, sorry > for cross-posting thus. > > > Any suggestions, then, how to go about figuring out what's happening > in my code that's causing this "feature" to manifest itself? > > DG > > Perhaps providing the code with specific versions of Python, numpy etc. would help. I would guess that aquarius_test.py has not correctly setup the necessary inputs (or has invalid inputs) required by matplotlib (which I have no knowledge about). Really you have to find if the _A in cmp.py used by 'self.norm.autoscale_None(self._A)' is valid. 
You may be missing a valid initialization step because the TypeError exception in autoscale_None ('You must first set_array for mappable') implies something need to be done first. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Mon Mar 8 13:17:59 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 8 Mar 2010 10:17:59 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <4B950F36.3020303@gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> <4B950F36.3020303@gmail.com> Message-ID: <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> On Mon, Mar 8, 2010 at 6:52 AM, Bruce Southey wrote: > On 03/08/2010 01:30 AM, David Goldsmith wrote: > > On Sun, Mar 7, 2010 at 4:41 AM, Friedrich Romstedt < > friedrichromstedt at gmail.com> wrote: > >> I would like to stress the fact that imo this is maybe not ticket and not >> a bug. >> >> The issue arises when calling a.max() or similar of empty arrays a, i.e., >> with: >> >> >>> 0 in a.shape >> True >> >> Opposed to the .prod() of an empty array, such a .max() or .min() >> cannot be defined, because the set is empty. So it's fully correct to >> let such calls fail. Just the failure is a bit deep in numpy, and >> only the traceback gives some hint what went wrong. >> >> I posted something similar also on the matplotlib-users list, sorry >> for cross-posting thus. >> > > Any suggestions, then, how to go about figuring out what's happening in my > code that's causing this "feature" to manifest itself? > > DG > > Perhaps providing the code with specific versions of Python, numpy etc. > would help. > > I would guess that aquarius_test.py has not correctly setup the necessary > inputs (or has invalid inputs) required by matplotlib (which I have no > knowledge about). Really you have to find if the _A in cmp.py used by > 'self.norm.autoscale_None(self._A)' is valid. You may be missing a valid > initialization step because the TypeError exception in autoscale_None ('You > must first set_array for mappable') implies something need to be done first. 
> > > Bruce > Python 2.5.4, Numpy 1.4.0, Matplotlib 0.99.0, Windows 32bit Vista Home Premium SP2 # Code copyright 2010 by David Goldsmith # Comments and unnecessaries edited for brevity import numpy as N import matplotlib as MPL from matplotlib import pylab from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas from matplotlib.figure import Figure import matplotlib.cm as cm J = complex(0,1); tol = 1e-6; maxiter = 20; roots = (-2 + J/2, -1 + J, -0.5 + J/2, 0.5 + J, 1 + J/2, 2 + J, 2.5 + J/2, -2 - J, -1 - J/2, -0.5 - J, 0.5 - J/2, 1 - J, 2 - J/2, 2.5 - J) def ffp(z): w, wp = (0J, 0J) for root in roots: z0 = z - root w += N.sin(1/z0) wp -= N.cos(1/z0)/(z0*z0) return (w, wp) def iter(z): w, wp = ffp(z) return z - w/wp def find_root(z0):#, k, j): count = 0 z1 = iter(z0) if N.isnan(z1): return N.complex64(N.inf) while (N.abs(z1 - z0) > tol) and \ (count < maxiter): count += 1 z0 = z1 z1 = iter(z0) if N.abs(z1 - z0) > tol: result = 0 else: result = z1 return N.complex64(result) w, h, DPI = (3.2, 2.0, 100) fig = Figure(figsize=(w, h), dpi=DPI, frameon=False) ax = fig.add_subplot(1,1,1) canvas = FigureCanvas(fig) nx, xmin, xmax = (int(w*DPI), -0.5, 0.5) ny, ymin, ymax = (int(h*DPI), 0.6, 1.2) X, xincr = N.linspace(xmin,xmax,nx,retstep=True) Y, yincr = N.linspace(ymin,ymax,ny,retstep=True) W = N.zeros((ny,nx), dtype=N.complex64) for j in N.arange(nx): if not (j%100): # report progress print j for k in N.arange(ny): x, y = (X[j], Y[k]) z0 = x + J*y W[k,j] = find_root(z0)#,k,j) print N.argwhere(N.logical_not(N.isfinite(W.real))) print N.argwhere(N.logical_not(N.isfinite(W.imag))) W = W.T argW = N.angle(W) print N.argwhere(N.logical_not(N.isfinite(argW))) cms = ("Blues",)# "Blues_r", "cool", "cool_r", def all_ticks_off(ax): ax.xaxis.set_major_locator(pylab.NullLocator()) ax.yaxis.set_major_locator(pylab.NullLocator()) for cmap_name in cms: all_ticks_off(ax) ax.hold(True) for i in range(4): for j in range(4): part2plot = argW[j*ny/4:(j+1)*ny/4, i*nx/4:(i+1)*nx/4] if N.any(N.logical_not(N.isfinite(part2plot))): print i, j, print N.argwhere(N.logical_not(N.isfinite(part2plot))) extent = (i*nx/4, (i+1)*nx/4, (j+1)*ny/4, j*ny/4) ax.imshow(part2plot, cmap_name, extent = extent) ax.set_xlim(0, nx) ax.set_ylim(0, ny) canvas.print_figure('../../Data-Figures/Zodiac/Aquarius/'+ cmap_name + 'Aquarius_test.png', dpi=DPI) # End Aquarius_test.png DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From timmichelsen at gmx-topmail.de Mon Mar 8 13:55:34 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 08 Mar 2010 19:55:34 +0100 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <4B910E62.7070209@gmail.com> References: <4B910E62.7070209@gmail.com> Message-ID: Hello, I am also looking into the convertsion from strcutured arrays to ndarray. > I've just started playing with numpy and have noticed that when printing > a structured array that the output is not nicely formatted. Is there a > way to make the formatting look the same as it does for an unstructured > array? > Output is: > ### ndarray > [[ 1. 2. ] > [ 3. 4.1]] > ### structured array > [(1.0, 2.0) (3.0, 4.0999999999999996)] How could we make this structured array look like the above shown ndarray with shape (2, 2)? 
Thanks for any additional hint, Timmie From josef.pktd at gmail.com Mon Mar 8 14:01:26 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 8 Mar 2010 14:01:26 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: References: <4B910E62.7070209@gmail.com> Message-ID: <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> On Mon, Mar 8, 2010 at 1:55 PM, Tim Michelsen wrote: > Hello, > I am also looking into the convertsion from strcutured arrays to ndarray. > >> I've just started playing with numpy and have noticed that when printing >> a structured array that the output is not nicely formatted. Is there a >> way to make the formatting look the same as it does for an unstructured >> array? > >> Output is: >> ### ndarray >> [[ 1. ? 2. ] >> ?[ 3. ? 4.1]] >> ### structured array >> [(1.0, 2.0) (3.0, 4.0999999999999996)] > How could we make this structured array look like the above shown > ndarray with shape (2, 2)? .view(float) should do it, to created a ndarray view of the structured array data Josef > > Thanks for any additional hint, > Timmie > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Mon Mar 8 14:04:56 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 8 Mar 2010 14:04:56 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> Message-ID: On Mon, Mar 8, 2010 at 2:01 PM, wrote: > On Mon, Mar 8, 2010 at 1:55 PM, Tim Michelsen > wrote: >> Hello, >> I am also looking into the convertsion from strcutured arrays to ndarray. >> >>> I've just started playing with numpy and have noticed that when printing >>> a structured array that the output is not nicely formatted. Is there a >>> way to make the formatting look the same as it does for an unstructured >>> array? >> >>> Output is: >>> ### ndarray >>> [[ 1. ? 2. ] >>> ?[ 3. ? 4.1]] >>> ### structured array >>> [(1.0, 2.0) (3.0, 4.0999999999999996)] >> How could we make this structured array look like the above shown >> ndarray with shape (2, 2)? > > .view(float) should do it, to created a ndarray view of the structured > array data > Plus a reshape. I usually know how many columns I have, so I put in axis 1 and leave axis 0 as -1. In [21]: a.view(float).reshape(-1,2) Out[21]: array([[ 1. , 2. ], [ 3. , 4.1]]) Skipper From josef.pktd at gmail.com Mon Mar 8 14:17:43 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 8 Mar 2010 14:17:43 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> Message-ID: <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> On Mon, Mar 8, 2010 at 2:04 PM, Skipper Seabold wrote: > On Mon, Mar 8, 2010 at 2:01 PM, ? wrote: >> On Mon, Mar 8, 2010 at 1:55 PM, Tim Michelsen >> wrote: >>> Hello, >>> I am also looking into the convertsion from strcutured arrays to ndarray. >>> >>>> I've just started playing with numpy and have noticed that when printing >>>> a structured array that the output is not nicely formatted. Is there a >>>> way to make the formatting look the same as it does for an unstructured >>>> array? >>> >>>> Output is: >>>> ### ndarray >>>> [[ 1. ? 2. ] >>>> ?[ 3. ? 
4.1]] >>>> ### structured array >>>> [(1.0, 2.0) (3.0, 4.0999999999999996)] >>> How could we make this structured array look like the above shown >>> ndarray with shape (2, 2)? >> >> .view(float) should do it, to created a ndarray view of the structured >> array data >> > > Plus a reshape. ?I usually know how many columns I have, so I put in > axis 1 and leave axis 0 as -1. > > In [21]: a.view(float).reshape(-1,2) > Out[21]: > array([[ 1. , ?2. ], > ? ? ? [ 3. , ?4.1]]) a.view(float).reshape(len(a),-1) #if you don't want to count columns I obviously haven't done this in a while. And of course, it only works if all elements of the structured array have the same type. Josef > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Mon Mar 8 14:24:39 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 8 Mar 2010 14:24:39 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> Message-ID: On Mon, Mar 8, 2010 at 2:17 PM, wrote: > On Mon, Mar 8, 2010 at 2:04 PM, Skipper Seabold wrote: >> On Mon, Mar 8, 2010 at 2:01 PM, ? wrote: >>> On Mon, Mar 8, 2010 at 1:55 PM, Tim Michelsen >>> wrote: >>>> Hello, >>>> I am also looking into the convertsion from strcutured arrays to ndarray. >>>> >>>>> I've just started playing with numpy and have noticed that when printing >>>>> a structured array that the output is not nicely formatted. Is there a >>>>> way to make the formatting look the same as it does for an unstructured >>>>> array? >>>> >>>>> Output is: >>>>> ### ndarray >>>>> [[ 1. ? 2. ] >>>>> ?[ 3. ? 4.1]] >>>>> ### structured array >>>>> [(1.0, 2.0) (3.0, 4.0999999999999996)] >>>> How could we make this structured array look like the above shown >>>> ndarray with shape (2, 2)? >>> >>> .view(float) should do it, to created a ndarray view of the structured >>> array data >>> >> >> Plus a reshape. ?I usually know how many columns I have, so I put in >> axis 1 and leave axis 0 as -1. >> >> In [21]: a.view(float).reshape(-1,2) >> Out[21]: >> array([[ 1. , ?2. ], >> ? ? ? [ 3. , ?4.1]]) > > > a.view(float).reshape(len(a),-1) ? ? #if you don't want to count columns > > I obviously haven't done this in a while. > And of course, it only works if all elements of the structured array > have the same type. > For the archives with heterogeneous dtype. import numpy as np b = np.array([(1.0, 'string1', 2.0), (3.0, 'string2', 4.1)], dtype=[('x', float),('str_var', 'a7'),('y',float)]) b[['x','y']].view(float).reshape(len(b),-1) # note the list within list syntax #array([[ 1. , 2. ], # [ 3. , 4.1]]) Skipper From d.l.goldsmith at gmail.com Mon Mar 8 14:25:03 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 8 Mar 2010 11:25:03 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? 
In-Reply-To: <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> <4B950F36.3020303@gmail.com> <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> Message-ID: <45d1ab481003081125xcee9c01wed3037fb2782b47@mail.gmail.com> On Mon, Mar 8, 2010 at 10:17 AM, David Goldsmith wrote: > On Mon, Mar 8, 2010 at 6:52 AM, Bruce Southey wrote: > >> On 03/08/2010 01:30 AM, David Goldsmith wrote: >> >> On Sun, Mar 7, 2010 at 4:41 AM, Friedrich Romstedt < >> friedrichromstedt at gmail.com> wrote: >> >>> I would like to stress the fact that imo this is maybe not ticket and not >>> a bug. >>> >>> The issue arises when calling a.max() or similar of empty arrays a, i.e., >>> with: >>> >>> >>> 0 in a.shape >>> True >>> >>> Opposed to the .prod() of an empty array, such a .max() or .min() >>> cannot be defined, because the set is empty. So it's fully correct to >>> let such calls fail. Just the failure is a bit deep in numpy, and >>> only the traceback gives some hint what went wrong. >>> >>> I posted something similar also on the matplotlib-users list, sorry >>> for cross-posting thus. >>> >> >> Any suggestions, then, how to go about figuring out what's happening in my >> code that's causing this "feature" to manifest itself? >> >> DG >> >> Perhaps providing the code with specific versions of Python, numpy etc. >> would help. >> >> I would guess that aquarius_test.py has not correctly setup the necessary >> inputs (or has invalid inputs) required by matplotlib (which I have no >> knowledge about). Really you have to find if the _A in cmp.py used by >> 'self.norm.autoscale_None(self._A)' is valid. You may be missing a valid >> initialization step because the TypeError exception in autoscale_None ('You >> must first set_array for mappable') implies something need to be done first. 
>> >> >> Bruce >> > > Python 2.5.4, Numpy 1.4.0, Matplotlib 0.99.0, Windows 32bit Vista Home > Premium SP2 > > # Code copyright 2010 by David Goldsmith > # Comments and unnecessaries edited for brevity > import numpy as N > import matplotlib as MPL > from matplotlib import pylab > from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas > from matplotlib.figure import Figure > import matplotlib.cm as cm > > J = complex(0,1); tol = 1e-6; maxiter = 20; > roots = (-2 + J/2, -1 + J, -0.5 + J/2, 0.5 + J, 1 + J/2, 2 + J, 2.5 + > J/2, > -2 - J, -1 - J/2, -0.5 - J, 0.5 - J/2, 1 - J, 2 - J/2, 2.5 - > J) > > def ffp(z): > w, wp = (0J, 0J) > for root in roots: > z0 = z - root > w += N.sin(1/z0) > wp -= N.cos(1/z0)/(z0*z0) > return (w, wp) > > def iter(z): > w, wp = ffp(z) > return z - w/wp > > def find_root(z0):#, k, j): > count = 0 > z1 = iter(z0) > if N.isnan(z1): > return N.complex64(N.inf) > while (N.abs(z1 - z0) > tol) and \ > (count < maxiter): > count += 1 > z0 = z1 > z1 = iter(z0) > if N.abs(z1 - z0) > tol: > result = 0 > else: > result = z1 > return N.complex64(result) > > w, h, DPI = (3.2, 2.0, 100) > fig = Figure(figsize=(w, h), > dpi=DPI, > frameon=False) > ax = fig.add_subplot(1,1,1) > canvas = FigureCanvas(fig) > nx, xmin, xmax = (int(w*DPI), -0.5, 0.5) > ny, ymin, ymax = (int(h*DPI), 0.6, 1.2) > > X, xincr = N.linspace(xmin,xmax,nx,retstep=True) > Y, yincr = N.linspace(ymin,ymax,ny,retstep=True) > W = N.zeros((ny,nx), dtype=N.complex64) > > for j in N.arange(nx): > if not (j%100): # report progress > print j > for k in N.arange(ny): > x, y = (X[j], Y[k]) > z0 = x + J*y > W[k,j] = find_root(z0)#,k,j) > > print N.argwhere(N.logical_not(N.isfinite(W.real))) > print N.argwhere(N.logical_not(N.isfinite(W.imag))) > W = W.T > argW = N.angle(W) > print N.argwhere(N.logical_not(N.isfinite(argW))) > cms = ("Blues",)# "Blues_r", "cool", "cool_r", > > def all_ticks_off(ax): > ax.xaxis.set_major_locator(pylab.NullLocator()) > ax.yaxis.set_major_locator(pylab.NullLocator()) > > for cmap_name in cms: > all_ticks_off(ax) > ax.hold(True) > for i in range(4): > for j in range(4): > part2plot = argW[j*ny/4:(j+1)*ny/4, i*nx/4:(i+1)*nx/4] > if N.any(N.logical_not(N.isfinite(part2plot))): > print i, j, > print N.argwhere(N.logical_not(N.isfinite(part2plot))) > extent = (i*nx/4, (i+1)*nx/4, (j+1)*ny/4, j*ny/4) > > ax.imshow(part2plot, cmap_name, extent = extent) > ax.set_xlim(0, nx) > ax.set_ylim(0, ny) > canvas.print_figure('../../Data-Figures/Zodiac/Aquarius/'+ cmap_name + > 'Aquarius_test.png', dpi=DPI) > # End Aquarius_test.png > > DG > Oh, and here's "fresh" output (i.e., I just reran it to confirm that I'm still having the problem). 0 100 200 300 [[133 319]] [] [] Traceback (most recent call last): File "C:\Users\Fermat\Documents\Fractals\Python\Source\Zodiac\aquarius_test.py", line 108, in ax.imshow(part2plot, cmap_name, extent = extent) File "C:\Python254\lib\site-packages\matplotlib\axes.py", line 6261, in imshow im.autoscale_None() File "C:\Python254\lib\site-packages\matplotlib\cm.py", line 236, in autoscale_None self.norm.autoscale_None(self._A) File "C:\Python254\lib\site-packages\matplotlib\colors.py", line 792, in autoscale_None if self.vmin is None: self.vmin = ma.minimum(A) File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5555, in __call__ return self.reduce(a) File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5570, in reduce t = self.ufunc.reduce(target, **kargs) ValueError: zero-size array to ufunc.reduce without identity Script terminated. 
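(For reference, that last ValueError is easy to reproduce on its own, with no matplotlib involved: ma.minimum ends up reducing a zero-size array with np.minimum, which has no reduction identity. Recent numpy releases word the message a little differently.)

>>> import numpy as np
>>> np.minimum.reduce(np.zeros(0))
Traceback (most recent call last):
  ...
ValueError: zero-size array to ufunc.reduce without identity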
Notes: 0) Despite N.argwhere(N.logical_not(N.isfinite(W.real))) being non-empty, N.argwhere(N.logical_not(N.isfinite(N.angle(W.T)))) apparently is empty, so I suppose whatever is behind this may also be the source of my problem, but: 1) Nothing is ever printed as a result of this attempt to catch a non-finite problem: if N.any(N.logical_not(N.isfinite(part2plot))): print i, j, print N.argwhere(N.logical_not(N.isfinite(part2plot))) which occurs for all part2plot's before they're passed to imshow, so I conclude that the problem is something other than non-finites? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Mon Mar 8 14:25:15 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 8 Mar 2010 20:25:15 +0100 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> <4B950F36.3020303@gmail.com> <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> Message-ID: It's pretty simple, but I was stunned myself how simple. Have a look at line 65 of your script you provided: W = W.T This means, x <-> y. But in the for loops, you still act as if W wasn't transposed. I added some prints, the positions should be clear for you: argW.shape = (320, 200) i, j = (0, 0) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 0, 80) part2plot.shape = (50, 80) i, j = (0, 1) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (50, 100, 0, 80) part2plot.shape = (50, 80) i, j = (0, 2) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (100, 150, 0, 80) part2plot.shape = (50, 80) i, j = (0, 3) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (150, 200, 0, 80) part2plot.shape = (50, 80) i, j = (1, 0) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 80, 160) part2plot.shape = (50, 80) i, j = (1, 1) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (50, 100, 80, 160) part2plot.shape = (50, 80) i, j = (1, 2) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (100, 150, 80, 160) part2plot.shape = (50, 80) i, j = (1, 3) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (150, 200, 80, 160) part2plot.shape = (50, 80) i, j = (2, 0) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 160, 240) part2plot.shape = (50, 40) i, j = (2, 1) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (50, 100, 160, 240) part2plot.shape = (50, 40) i, j = (2, 2) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (100, 150, 160, 240) part2plot.shape = (50, 40) i, j = (2, 3) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (150, 200, 160, 240) part2plot.shape = (50, 40) i, j = (3, 0) j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 240, 320) part2plot.shape = (50, 0) Traceback (most recent call last): File "D:\Home\Friedrich\Entwicklung\2010\David\aquarius.py", line 91, in ? 
ax.imshow(part2plot, extent = extent) File "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\matplotlib\ax es.py", line 5471, in imshow im.autoscale_None() File "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\matplotlib\cm .py", line 148, in autoscale_None self.norm.autoscale_None(self._A) File "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\matplotlib\co lors.py", line 682, in autoscale_None if self.vmin is None: self.vmin = ma.minimum(A) File "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\numpy\ma\core .py", line 3042, in __call__ return self.reduce(a) File "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\numpy\ma\core .py", line 3057, in reduce t = self.ufunc.reduce(target, **kargs) ValueError: zero-size array to ufunc.reduce without identity So you simply have to exchange the role of x and y in your slice indicing expression, and everything will work out fine, I suspect :-) Or simpy leave out the transposition? Note that in the other case, you also may have to consider to change to extent's axes to get it properly reflected. NB: With my version of matplotlib, it didn't accept the colormap, but when yours does, it doesn't matter. Friedrich From josef.pktd at gmail.com Mon Mar 8 14:30:17 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 8 Mar 2010 14:30:17 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> Message-ID: <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> On Mon, Mar 8, 2010 at 2:24 PM, Skipper Seabold wrote: > On Mon, Mar 8, 2010 at 2:17 PM, ? wrote: >> On Mon, Mar 8, 2010 at 2:04 PM, Skipper Seabold wrote: >>> On Mon, Mar 8, 2010 at 2:01 PM, ? wrote: >>>> On Mon, Mar 8, 2010 at 1:55 PM, Tim Michelsen >>>> wrote: >>>>> Hello, >>>>> I am also looking into the convertsion from strcutured arrays to ndarray. >>>>> >>>>>> I've just started playing with numpy and have noticed that when printing >>>>>> a structured array that the output is not nicely formatted. Is there a >>>>>> way to make the formatting look the same as it does for an unstructured >>>>>> array? >>>>> >>>>>> Output is: >>>>>> ### ndarray >>>>>> [[ 1. ? 2. ] >>>>>> ?[ 3. ? 4.1]] >>>>>> ### structured array >>>>>> [(1.0, 2.0) (3.0, 4.0999999999999996)] >>>>> How could we make this structured array look like the above shown >>>>> ndarray with shape (2, 2)? >>>> >>>> .view(float) should do it, to created a ndarray view of the structured >>>> array data >>>> >>> >>> Plus a reshape. ?I usually know how many columns I have, so I put in >>> axis 1 and leave axis 0 as -1. >>> >>> In [21]: a.view(float).reshape(-1,2) >>> Out[21]: >>> array([[ 1. , ?2. ], >>> ? ? ? [ 3. , ?4.1]]) >> >> >> a.view(float).reshape(len(a),-1) ? ? #if you don't want to count columns >> >> I obviously haven't done this in a while. >> And of course, it only works if all elements of the structured array >> have the same type. >> > > For the archives with heterogeneous dtype. > > import numpy as np > > b = np.array([(1.0, 'string1', 2.0), (3.0, 'string2', 4.1)], > dtype=[('x', float),('str_var', 'a7'),('y',float)]) > > b[['x','y']].view(float).reshape(len(b),-1) # note the list within list syntax > > #array([[ 1. , ?2. ], > # ? ? ? [ 3. , ?4.1]]) nice, I've never seen selection of multiple columns before. 
I didn't know it is possible to get a subset of columns this way >>> b[['x','y']] array([(1.0, 2.0), (3.0, 4.0999999999999996)], dtype=[('x', '>> b['x'] array([ 1., 3.]) >>> b[['x']] array([(1.0,), (3.0,)], dtype=[('x', ' > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Mon Mar 8 14:35:42 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 08 Mar 2010 13:35:42 -0600 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> <4B950F36.3020303@gmail.com> <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> Message-ID: <4B95518E.6080508@gmail.com> On 03/08/2010 12:17 PM, David Goldsmith wrote: > On Mon, Mar 8, 2010 at 6:52 AM, Bruce Southey > wrote: > > On 03/08/2010 01:30 AM, David Goldsmith wrote: >> On Sun, Mar 7, 2010 at 4:41 AM, Friedrich Romstedt >> > > wrote: >> >> I would like to stress the fact that imo this is maybe not >> ticket and not a bug. >> >> The issue arises when calling a.max() or similar of empty >> arrays a, i.e., with: >> >> >>> 0 in a.shape >> True >> >> Opposed to the .prod() of an empty array, such a .max() or .min() >> cannot be defined, because the set is empty. So it's fully >> correct to >> let such calls fail. Just the failure is a bit deep in >> numpy, and >> only the traceback gives some hint what went wrong. >> >> I posted something similar also on the matplotlib-users list, >> sorry >> for cross-posting thus. >> >> >> Any suggestions, then, how to go about figuring out what's >> happening in my code that's causing this "feature" to manifest >> itself? >> >> DG > Perhaps providing the code with specific versions of Python, numpy > etc. would help. > > I would guess that aquarius_test.py has not correctly setup the > necessary inputs (or has invalid inputs) required by matplotlib > (which I have no knowledge about). Really you have to find if the > _A in cmp.py used by 'self.norm.autoscale_None(self._A)' is valid. > You may be missing a valid initialization step because the > TypeError exception in autoscale_None ('You must first set_array > for mappable') implies something need to be done first. 
> > Bruce > > > Python 2.5.4, Numpy 1.4.0, Matplotlib 0.99.0, Windows 32bit Vista Home > Premium SP2 > > # Code copyright 2010 by David Goldsmith > # Comments and unnecessaries edited for brevity > import numpy as N > import matplotlib as MPL > from matplotlib import pylab > from matplotlib.backends.backend_agg import FigureCanvasAgg as > FigureCanvas > from matplotlib.figure import Figure > import matplotlib.cm as cm > > J = complex(0,1); tol = 1e-6; maxiter = 20; > roots = (-2 + J/2, -1 + J, -0.5 + J/2, 0.5 + J, 1 + J/2, 2 + J, 2.5 > + J/2, > -2 - J, -1 - J/2, -0.5 - J, 0.5 - J/2, 1 - J, 2 - J/2, > 2.5 - J) > > def ffp(z): > w, wp = (0J, 0J) > for root in roots: > z0 = z - root > w += N.sin(1/z0) > wp -= N.cos(1/z0)/(z0*z0) > return (w, wp) > > def iter(z): > w, wp = ffp(z) > return z - w/wp > > def find_root(z0):#, k, j): > count = 0 > z1 = iter(z0) > if N.isnan(z1): > return N.complex64(N.inf) > while (N.abs(z1 - z0) > tol) and \ > (count < maxiter): > count += 1 > z0 = z1 > z1 = iter(z0) > if N.abs(z1 - z0) > tol: > result = 0 > else: > result = z1 > return N.complex64(result) > > w, h, DPI = (3.2, 2.0, 100) > fig = Figure(figsize=(w, h), > dpi=DPI, > frameon=False) > ax = fig.add_subplot(1,1,1) > canvas = FigureCanvas(fig) > nx, xmin, xmax = (int(w*DPI), -0.5, 0.5) > ny, ymin, ymax = (int(h*DPI), 0.6, 1.2) > > X, xincr = N.linspace(xmin,xmax,nx,retstep=True) > Y, yincr = N.linspace(ymin,ymax,ny,retstep=True) > W = N.zeros((ny,nx), dtype=N.complex64) > > for j in N.arange(nx): > if not (j%100): # report progress > print j > for k in N.arange(ny): > x, y = (X[j], Y[k]) > z0 = x + J*y > W[k,j] = find_root(z0)#,k,j) > > print N.argwhere(N.logical_not(N.isfinite(W.real))) > print N.argwhere(N.logical_not(N.isfinite(W.imag))) > W = W.T > argW = N.angle(W) > print N.argwhere(N.logical_not(N.isfinite(argW))) > cms = ("Blues",)# "Blues_r", "cool", "cool_r", > > def all_ticks_off(ax): > ax.xaxis.set_major_locator(pylab.NullLocator()) > ax.yaxis.set_major_locator(pylab.NullLocator()) > > for cmap_name in cms: > all_ticks_off(ax) > ax.hold(True) > for i in range(4): > for j in range(4): > part2plot = argW[j*ny/4:(j+1)*ny/4, i*nx/4:(i+1)*nx/4] > if N.any(N.logical_not(N.isfinite(part2plot))): > print i, j, > print N.argwhere(N.logical_not(N.isfinite(part2plot))) > extent = (i*nx/4, (i+1)*nx/4, (j+1)*ny/4, j*ny/4) > ax.imshow(part2plot, cmap_name, extent = extent) > ax.set_xlim(0, nx) > ax.set_ylim(0, ny) > canvas.print_figure('../../Data-Figures/Zodiac/Aquarius/'+ cmap_name + > 'Aquarius_test.png', dpi=DPI) > # End Aquarius_test.png > > DG > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hmm, Appears that you have mixed your indices when creating part2plot. If you this line instead it works: part2plot = argW[j*nx/4:(j+1)*nx/4, i*ny/4:(i+1)*ny/4] I found that by looking the shape of the part2plot array that is component of the argW array. The shape of argW is (320, 200). So in your loops to find part2plot you eventually exceed 200 and eventually the index to the second axis is greater than 200 causing everything to crash: When i=0 or 1 then the shape of part2plot is (50, 80) when i=2 then the shape of part2plot is (50, 40) when i=3 then the shape of part2plot is (50, 0) # crash Bruce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From friedrichromstedt at gmail.com Mon Mar 8 14:40:57 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 8 Mar 2010 20:40:57 +0100 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: <4B95518E.6080508@gmail.com> References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> <4B950F36.3020303@gmail.com> <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> <4B95518E.6080508@gmail.com> Message-ID: 2010/3/8 Bruce Southey : > Hmm, > Appears that you have mixed your indices when creating part2plot. If you > this line instead it works: > part2plot = argW[j*nx/4:(j+1)*nx/4, i*ny/4:(i+1)*ny/4] > > > I found that by looking the shape of the part2plot array that is component > of the argW array. > > The shape of argW is (320, 200). So in your loops to find part2plot you > eventually exceed 200 and eventually the index to the second axis is greater > than 200 causing everything to crash: > When i=0 or 1 then the shape of part2plot is (50, 80) > when i=2 then the shape of part2plot is (50, 40) > when i=3 then the shape of part2plot is (50,? 0) # crash Nice that we have found this out both at the same instance of time (with 5min precision) :-) Friedrich From d.l.goldsmith at gmail.com Mon Mar 8 14:57:03 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 8 Mar 2010 11:57:03 -0800 Subject: [Numpy-discussion] Is this a bug in numpy.ma.reduce? In-Reply-To: References: <45d1ab481003050138j2c9dc7e2ncceccc5a0cbf5391@mail.gmail.com> <7623E691-B731-425C-B150-50558AADBDF5@gmail.com> <45d1ab481003072330i400cec09u9c8e38ef3e894e71@mail.gmail.com> <4B950F36.3020303@gmail.com> <45d1ab481003081017va651ccbye48f3d52d0d2d854@mail.gmail.com> Message-ID: <45d1ab481003081157g7fc9c5feya33b3ec5d79a2840@mail.gmail.com> How embarrassing! :O Well, as they say, 'nother set of eyes... Thanks! DG On Mon, Mar 8, 2010 at 11:25 AM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > It's pretty simple, but I was stunned myself how simple. Have a look > at line 65 of your script you provided: > > W = W.T > > This means, x <-> y. But in the for loops, you still act as if W > wasn't transposed. 
I added some prints, the positions should be clear > for you: > > argW.shape = (320, 200) > i, j = (0, 0) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 0, 80) > part2plot.shape = (50, 80) > i, j = (0, 1) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (50, 100, 0, 80) > part2plot.shape = (50, 80) > i, j = (0, 2) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (100, 150, 0, 80) > part2plot.shape = (50, 80) > i, j = (0, 3) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (150, 200, 0, 80) > part2plot.shape = (50, 80) > i, j = (1, 0) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 80, 160) > part2plot.shape = (50, 80) > i, j = (1, 1) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (50, 100, 80, 160) > part2plot.shape = (50, 80) > i, j = (1, 2) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (100, 150, 80, 160) > part2plot.shape = (50, 80) > i, j = (1, 3) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (150, 200, 80, 160) > part2plot.shape = (50, 80) > i, j = (2, 0) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 160, 240) > part2plot.shape = (50, 40) > i, j = (2, 1) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (50, 100, 160, 240) > part2plot.shape = (50, 40) > i, j = (2, 2) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (100, 150, 160, 240) > part2plot.shape = (50, 40) > i, j = (2, 3) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (150, 200, 160, 240) > part2plot.shape = (50, 40) > i, j = (3, 0) > j*ny/4, (j+1)*ny/4, i*nx/4, (i+1)*nx/4 = (0, 50, 240, 320) > part2plot.shape = (50, 0) > Traceback (most recent call last): > File "D:\Home\Friedrich\Entwicklung\2010\David\aquarius.py", line 91, in > ? > ax.imshow(part2plot, extent = extent) > File > "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\matplotlib\ax > es.py", line 5471, in imshow > im.autoscale_None() > File > "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\matplotlib\cm > .py", line 148, in autoscale_None > self.norm.autoscale_None(self._A) > File > "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\matplotlib\co > lors.py", line 682, in autoscale_None > if self.vmin is None: self.vmin = ma.minimum(A) > File > "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\numpy\ma\core > .py", line 3042, in __call__ > return self.reduce(a) > File > "D:\Programme\Programmierung\python-2.4.1\lib\site-packages\numpy\ma\core > .py", line 3057, in reduce > t = self.ufunc.reduce(target, **kargs) > ValueError: zero-size array to ufunc.reduce without identity > > So you simply have to exchange the role of x and y in your slice > indicing expression, and everything will work out fine, I suspect :-) > > Or simpy leave out the transposition? Note that in the other case, > you also may have to consider to change to extent's axes to get it > properly reflected. > > NB: With my version of matplotlib, it didn't accept the colormap, but > when yours does, it doesn't matter. > > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pgmdevlist at gmail.com Mon Mar 8 15:00:29 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 8 Mar 2010 15:00:29 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: References: <4B910E62.7070209@gmail.com> Message-ID: <7985D9C0-8F60-4A75-AF38-CCCA54B4DDB8@gmail.com> On Mar 8, 2010, at 1:55 PM, Tim Michelsen wrote: > Hello, > I am also looking into the convertsion from strcutured arrays to ndarray. > >> I've just started playing with numpy and have noticed that when printing >> a structured array that the output is not nicely formatted. Is there a >> way to make the formatting look the same as it does for an unstructured >> array? > >> Output is: >> ### ndarray >> [[ 1. 2. ] >> [ 3. 4.1]] >> ### structured array >> [(1.0, 2.0) (3.0, 4.0999999999999996)] > How could we make this structured array look like the above shown > ndarray with shape (2, 2)? if you're 100% sure all your fields have the same dtype (float), ``a.view((float,2))`` is the simplest. Note the tuple (dtype, len(a.dtype.names)). From millman at berkeley.edu Tue Mar 9 00:29:28 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 8 Mar 2010 21:29:28 -0800 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: References: Message-ID: I added Titus' email regarding the PSF's focus on Py3K-related projects to our SoC ideas wiki page: http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas Given Titus' email, this is the most likely list of projects we will get accepted this year: - finish porting NumPy to Py3K - port SciPy to Py3K - port matplotlib to Py3K - port ipython to Py3K Given that we know what projects we will likely have accepted, it is worth starting to flesh these proposals out in detail. Also, we should start discussing how we will choose which student's we want to work on these ports. In particular, we should list what skills and background will be necessary to successfully complete these ports. Thoughts? Ideas? Best, Jarrod From charlesr.harris at gmail.com Tue Mar 9 00:39:00 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 8 Mar 2010 22:39:00 -0700 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: References: Message-ID: On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman wrote: > I added Titus' email regarding the PSF's focus on Py3K-related > projects to our SoC ideas wiki page: > http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas > > Given Titus' email, this is the most likely list of projects we will > get accepted this year: > > - finish porting NumPy to Py3K > I think Numpy is pretty much done. It needs use testing to wring out any small oddities, but it doesn't look to me like a GSOC project at the moment. Maybe Pauli can weigh in here. > - port SciPy to Py3K > This project might be more appropriate, although I'm not clear on what needs to be done. > - port matplotlib to Py3K > - port ipython to Py3K > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsouthey at gmail.com Tue Mar 9 11:48:15 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 09 Mar 2010 10:48:15 -0600 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: References: Message-ID: <4B967BCF.3060202@gmail.com> On 03/08/2010 11:39 PM, Charles R Harris wrote: > > > On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman > wrote: > > I added Titus' email regarding the PSF's focus on Py3K-related > projects to our SoC ideas wiki page: > http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas > > Given Titus' email, this is the most likely list of projects we will > get accepted this year: > > - finish porting NumPy to Py3K > > > I think Numpy is pretty much done. It needs use testing to wring out > any small oddities, but it doesn't look to me like a GSOC project at > the moment. Maybe Pauli can weigh in here. > > - port SciPy to Py3K > > > This project might be more appropriate, although I'm not clear on what > needs to be done. > > - port matplotlib to Py3K > - port ipython to Py3K > > > Chuck > > +1 to all of these as these are probably next major roadblocks after numpy for Py3K usage. From the web page, having both f2py and/or Fwrap working with Py3K should be a specific goal (not that I use these). Also having Fwrap (as I understood the recent discussion) supporting the Fortran 77 used in Scipy. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdh2358 at gmail.com Tue Mar 9 12:14:58 2010 From: jdh2358 at gmail.com (John Hunter) Date: Tue, 9 Mar 2010 11:14:58 -0600 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: References: Message-ID: <88e473831003090914k65d9d60cldc0e0c5cb1731909@mail.gmail.com> On Mon, Mar 8, 2010 at 11:39 PM, Charles R Harris wrote: >> - port matplotlib to Py3K We'd be happy to mentor a project here. To my knowledge, nothing has been done, other than upgrade to CXX6 (our C++ extension lib). Most, but not all, of our extension code is exposed through CXX, which as of v6 is python3 compliant so that should help. But I suspect there is enough work to justify a GSOC project on the mpl side. JDH From timmichelsen at gmx-topmail.de Mon Mar 8 17:50:22 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 08 Mar 2010 23:50:22 +0100 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> Message-ID: Hello, thanks to all who responded and have their input here. I added a little code snippet to show the view and reshape: http://www.scipy.org/Cookbook/Recarray What do you think? Is this worth to go into the official docs? The page http://docs.scipy.org/doc/numpy/user/basics.rec.html is quite sparse... I still wonder why there is not a quick function for such a view / reshape conversion. 
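A minimal sketch of what such a quick helper could look like, building on the a.view((float, len(a.dtype.names))) trick Pierre mentions above. The name fields_view is made up here, and like Pierre's trick it only works when all the fields share a single dtype:

import numpy as np

def fields_view(arr, dtype=float):
    # View a structured array whose fields all have the same dtype
    # as a plain (len(arr), n_fields) ndarray.
    return arr.view((dtype, len(arr.dtype.names)))

b = np.array([(1.0, 2.0), (3.0, 4.1)], dtype=[('x', float), ('y', float)])
print(fields_view(b))   # [[ 1.   2. ]
                        #  [ 3.   4.1]]
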
Best regards, Timmie From pav+sp at iki.fi Tue Mar 9 09:02:37 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 9 Mar 2010 14:02:37 +0000 (UTC) Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) References: Message-ID: Mon, 08 Mar 2010 22:39:00 -0700, Charles R Harris wrote: > On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman > wrote: > > I added Titus' email regarding the PSF's focus on Py3K-related > > projects to our SoC ideas wiki page: > > http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas > > > > Given Titus' email, this is the most likely list of projects we will > > get accepted this year: > > > > - finish porting NumPy to Py3K > > I think Numpy is pretty much done. It needs use testing to wring out any > small oddities, but it doesn't look to me like a GSOC project at the > moment. Maybe Pauli can weigh in here. I think it's pretty much done. Even f2py should work. What's left is mostly testing it, and fixing any issues that crop up. Note that the SVN Numpy should preferably still see more testing on Python 2 against real-world packages that use it, before release. There's been a significant amount of code churn. >> - port SciPy to Py3K > > This project might be more appropriate, although I'm not clear on what > needs to be done. I think porting Scipy proceeds in these steps: 1. Set up a similar 2to3-using build system for Python 3 as currently in Numpy. 2. Change the code so that it works both on Python 2 and Python 3. This can be done one submodule at a time. For C code, the changes required are mainly PyString usage. Some macros need to be defined to allow the same code to build on Py2 and Py3. Scipy is probably easier to port than Numpy here, since IIRC it doesn't mess much with strings, and its use of the Python C-API is much more limited. Also, parts written with Cython need essentially no porting. For Python code, the main issue is again bytes vs. unicode fixes, mainly inserting numpy.compat.asbytes() at correct locations to make e.g. bytes literals. All I/O code should be changed to work solely with Bytes. Since 2to3 takes care of most the other things, the remaining work is in fixing things it mishandles. I don't have a good estimate for how much work is in making Scipy work. I'd guess the work needed is probably slightly less than for Numpy. -- Pauli Virtanen From D.P.Reichert at sms.ed.ac.uk Tue Mar 9 12:31:03 2010 From: D.P.Reichert at sms.ed.ac.uk (David Paul Reichert) Date: Tue, 09 Mar 2010 17:31:03 +0000 Subject: [Numpy-discussion] Memory leak with matrices? Message-ID: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> Hi, I've got two issues: First, the following seems to cause a memory leak, using numpy 1.3.0: a = matrix(ones(1)) while True: a += 0 This only seems to happen when a is a matrix rather than an array, and when the short hand '+=' is used. Second, I'm not sure whether that's a bug or whether I just don't understand what's going on, but when a is a column array, (e.g. a = ones((10, 1))), then a -= a[0,:] only subtracts from a[0, 0], whereas not using the short hand or using something else than a on the righthand side seems to subtract from all rows as expected. Thanks a lot, David -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
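A rough way to watch for the growth David describes from inside the interpreter, a sketch only: the resource module is Unix-only, the ru_maxrss units differ between platforms, and on a numpy version without the leak the numbers simply stay flat:

import resource
import numpy as np

a = np.matrix(np.ones(1))
for step in range(5):
    for _ in range(100000):
        a += 0
    # peak resident set size so far; it keeps climbing if the += loop leaks
    print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
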
From josef.pktd at gmail.com Tue Mar 9 12:50:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 12:50:41 -0500 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> Message-ID: <1cd32cbb1003090950n742ea1ebo2dc91e64d1f3a663@mail.gmail.com> On Mon, Mar 8, 2010 at 5:50 PM, Tim Michelsen wrote: > Hello, > thanks to all who responded and have their input here. > > I added a little code snippet to show the view and reshape: > > http://www.scipy.org/Cookbook/Recarray > > What do you think? > Is this worth to go into the official docs? > The page http://docs.scipy.org/doc/numpy/user/basics.rec.html is quite > sparse... > > I still wonder why there is not a quick function for such a view / > reshape conversion. Thanks, the docs for working with arrays with structured dtypes are sparse. Note that in your example .view(np.ndarray) doesn't do anything >>> struct_diffdtype[['str_var', 'x', 'y']].view(np.ndarray).reshape(len(struct_diffdtype),-1) array([[(1.0, 'string1', 2.0)], [(3.0, 'string2', 4.0999999999999996)]], dtype=[('x', '<f8'), ('str_var', '|S7'), ('y', '<f8')]) >>> struct_diffdtype[['x', 'y']].view(np.ndarray).reshape(len(struct_diffdtype),-1) array([[(1.0, 2.0)], [(3.0, 4.0999999999999996)]], dtype=[('x', '<f8'), ('y', '<f8')]) >>> struct_diffdtype[['x', 'y']].view(float).reshape(len(struct_diffdtype),-1) array([[ 1. , 2. ], [ 3. , 4.1]]) and float view on strings is not possible >>> struct_diffdtype[['str_var', 'x', 'y']].view(float).reshape(len(struct_diffdtype),-1) Traceback (most recent call last): ValueError: new type not compatible with array. Josef > > > Best regards, > Timmie > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Mar 9 12:55:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 12:55:16 -0500 Subject: [Numpy-discussion] Memory leak with matrices? In-Reply-To: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> References: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> Message-ID: <1cd32cbb1003090955w4f386fa1l15a809bbbb76b651@mail.gmail.com> On Tue, Mar 9, 2010 at 12:31 PM, David Paul Reichert wrote: > Hi, > > I've got two issues: > > First, the following seems to cause a memory leak, > using numpy 1.3.0: > > a = matrix(ones(1)) > > while True: >     a += 0 > > > This only seems to happen when a is a matrix rather > than an array, and when the short hand '+=' is used. > > Second, I'm not sure whether that's a bug or whether > I just don't understand what's going on, but when a is a column > array, (e.g. a = ones((10, 1))), then > > a -= a[0,:] > > only subtracts from a[0, 0], whereas not using the short hand > or using something else than a on the righthand side seems > to subtract from all rows as expected. 
this is because a[0,0] is set to zero after the first inplace subtraction, then zero is subtracted from all other rows >>> a = np.ones((10, 1)) >>> a array([[ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.]]) >>> a += a[0,:] >>> a array([[ 2.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.]]) >>> a -= a[0,:] >>> a array([[ 0.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.], [ 3.]]) Josef > > Thanks a lot, > > David > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.p.reichert at sms.ed.ac.uk Tue Mar 9 13:15:06 2010 From: d.p.reichert at sms.ed.ac.uk (David Reichert) Date: Tue, 9 Mar 2010 18:15:06 +0000 Subject: [Numpy-discussion] Memory leak with matrices? In-Reply-To: <1cd32cbb1003090955w4f386fa1l15a809bbbb76b651@mail.gmail.com> References: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> <1cd32cbb1003090955w4f386fa1l15a809bbbb76b651@mail.gmail.com> Message-ID: <7b3b01551003091015q7a780eefxe5d324f7a4184052@mail.gmail.com> Thanks for the reply. Yes never mind the second issue, I had myself confused there. Any comments on the memory leak? On Tue, Mar 9, 2010 at 5:55 PM, wrote: > On Tue, Mar 9, 2010 at 12:31 PM, David Paul Reichert > wrote: > > Hi, > > > > I've got two issues: > > > > First, the following seems to cause a memory leak, > > using numpy 1.3.0: > > > > a = matrix(ones(1)) > > > > while True: > > a += 0 > > > > > > This only seems to happen when a is a matrix rather > > than an array, and when the short hand '+=' is used. > > > > Second, I'm not sure whether that's a bug or whether > > I just don't understand what's going on, but when a is a column > > array, (e.g. a = ones((10, 1))), then > > > > a -= a[0,:] > > > > only subtracts from a[0, 0], whereas not using the short hand > > or using something else than a on the righthand side seems > > to subtract from all rows as expected. > > this is because a[0,0] is set to zero after the first inplace > subtraction, then zero is subtracted from all other rows > > >>> a = np.ones((10, 1)) > >>> a > array([[ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.]]) > >>> a += a[0,:] > >>> a > array([[ 2.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.]]) > >>> a -= a[0,:] > >>> a > array([[ 0.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.], > [ 3.]]) > > Josef > > > > > > Thanks a lot, > > > > David > > > > -- > > The University of Edinburgh is a charitable body, registered in > > Scotland, with registration number SC005336. > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
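For the second issue, the workaround that comes up later in the thread is to snapshot the row before subtracting, so the right-hand side is a copy rather than a view into a. A small sketch:

import numpy as np

a = np.arange(1.0, 11.0).reshape(10, 1)   # column vector with rows 1..10
a -= a[0, :].copy()                        # copy row 0 first, so every row has 1 subtracted
print(a[:3, 0])                            # [ 0.  1.  2.]
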
Name: not available URL: From josef.pktd at gmail.com Tue Mar 9 13:24:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 13:24:29 -0500 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: References: Message-ID: <1cd32cbb1003091024m32c41209u99babe058b8f0e69@mail.gmail.com> On Tue, Mar 9, 2010 at 9:02 AM, Pauli Virtanen wrote: > Mon, 08 Mar 2010 22:39:00 -0700, Charles R Harris wrote: >> On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman >> wrote: >> > I added Titus' email regarding the PSF's focus on Py3K-related >> > projects to our SoC ideas wiki page: >> > http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas >> > >> > Given Titus' email, this is the most likely list of projects we will >> > get accepted this year: >> > >> > - finish porting NumPy to Py3K >> >> I think Numpy is pretty much done. It needs use testing to wring out any >> small oddities, but it doesn't look to me like a GSOC project at the >> moment. Maybe Pauli can weigh in here. > > I think it's pretty much done. Even f2py should work. What's left is > mostly testing it, and fixing any issues that crop up. > > Note that the SVN Numpy should preferably still see more testing on > Python 2 against real-world packages that use it, before release. There's > been a significant amount of code churn. > >>> - port SciPy to Py3K >> >> This project might be more appropriate, although I'm not clear on what >> needs to be done. > > I think porting Scipy proceeds in these steps: > > 1. Set up a similar 2to3-using build system for Python 3 as currently in > ? Numpy. > > 2. Change the code so that it works both on Python 2 and Python 3. > ? This can be done one submodule at a time. > > ? For C code, the changes required are mainly PyString usage. Some macros > ? need to be defined to allow the same code to build on Py2 and Py3. > ? Scipy is probably easier to port than Numpy here, since IIRC it doesn't > ? mess much with strings, and its use of the Python C-API is much more > ? limited. Also, parts written with Cython need essentially no porting. > > ? For Python code, the main issue is again bytes vs. unicode fixes, > ? mainly inserting numpy.compat.asbytes() at correct locations to make > ? e.g. bytes literals. All I/O code should be changed to work solely with > ? Bytes. Since 2to3 takes care of most the other things, the remaining > ? work is in fixing things it mishandles. > > I don't have a good estimate for how much work is in making Scipy work. > I'd guess the work needed is probably slightly less than for Numpy. a few questions: Is scipy.special difficult or time consuming to port? In the build system, is it possible not to build some subpackages that might be slow in being ported, e.g. ndimage, weave? Is there a good utility script to check dependencies between subpackages? (import scipy.stats loads a very large number of modules) >>> import sys >>> len(sys.modules) 125 >>> import numpy >>> len(sys.modules) 259 >>> import scipy >>> len(sys.modules) 339 >>> import scipy.stats >>> len(sys.modules) 556 Josef > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Tue Mar 9 13:28:27 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Mar 2010 12:28:27 -0600 Subject: [Numpy-discussion] Memory leak with matrices? 
In-Reply-To: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> References: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> Message-ID: <3d375d731003091028k217f0739n32d6d5fb5e40179e@mail.gmail.com> On Tue, Mar 9, 2010 at 11:31, David Paul Reichert wrote: > Hi, > > I've got two issues: > > First, the following seems to cause a memory leak, > using numpy 1.3.0: > > a = matrix(ones(1)) > > while True: > ? ?a += 0 > > > This only seems to happen when a is a matrix rather > than an array, and when the short hand '+=' is used. Yes, I can verify that there is a leak in the current SVN, too. It appears that we are creating dictionaries with the value {'_getitem': False} and never decrefing them. The matrix object itself owns the reference. > Second, I'm not sure whether that's a bug or whether > I just don't understand what's going on, but when a is a column > array, (e.g. a = ones((10, 1))), then > > a -= a[0,:] > > only subtracts from a[0, 0], whereas not using the short hand > or using something else than a on the righthand side seems > to subtract from all rows as expected. a[0,:] creates a view onto the same memory of the original array. Since you modify the values in-place, a[0,0] gets set to a[0,0]-a[0,0]==0, then a[1,0] gets set to a[1,0] - a[0,0] == a[1,0] - 0 == a[1,0], etc. Try this instead: a -= a[0,:].copy() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Tue Mar 9 13:32:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 13:32:42 -0500 Subject: [Numpy-discussion] Memory leak with matrices? In-Reply-To: <3d375d731003091028k217f0739n32d6d5fb5e40179e@mail.gmail.com> References: <20100309173103.hupeo6je0o8ow8cc@www.sms.ed.ac.uk> <3d375d731003091028k217f0739n32d6d5fb5e40179e@mail.gmail.com> Message-ID: <1cd32cbb1003091032u1d79d358uf000ff135a17df37@mail.gmail.com> On Tue, Mar 9, 2010 at 1:28 PM, Robert Kern wrote: > On Tue, Mar 9, 2010 at 11:31, David Paul Reichert > wrote: >> Hi, >> >> I've got two issues: >> >> First, the following seems to cause a memory leak, >> using numpy 1.3.0: >> >> a = matrix(ones(1)) >> >> while True: >> ? ?a += 0 >> >> >> This only seems to happen when a is a matrix rather >> than an array, and when the short hand '+=' is used. > > Yes, I can verify that there is a leak in the current SVN, too. It > appears that we are creating dictionaries with the value {'_getitem': > False} and never decrefing them. The matrix object itself owns the > reference. With numpy 1.4.0 memory usage also keeps growing until I kill the process. Josef > >> Second, I'm not sure whether that's a bug or whether >> I just don't understand what's going on, but when a is a column >> array, (e.g. a = ones((10, 1))), then >> >> a -= a[0,:] >> >> only subtracts from a[0, 0], whereas not using the short hand >> or using something else than a on the righthand side seems >> to subtract from all rows as expected. > > a[0,:] creates a view onto the same memory of the original array. > Since you modify the values in-place, a[0,0] gets set to > a[0,0]-a[0,0]==0, then a[1,0] gets set to a[1,0] - a[0,0] == a[1,0] - > 0 == a[1,0], etc. Try this instead: > > a -= a[0,:].copy() > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Mar 9 13:51:34 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue, 09 Mar 2010 10:51:34 -0800 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> Message-ID: <4B9698B6.204@noaa.gov> Tim Michelsen wrote: > I still wonder why there is not a quick function for such a view / > reshape conversion. Because it is difficult (impossible?) to do in the general case. .view() really isn't that bad, in fact, it remarkably powerful and flexible! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cohen at lpta.in2p3.fr Tue Mar 9 14:30:21 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Tue, 09 Mar 2010 20:30:21 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test Message-ID: <4B96A1CD.7080702@lpta.in2p3.fr> hi there, I just installed the current head of numpy and built it. trying python and then import numpy, and then CTRL-D to exit, all goes well. But doing the same with a numpy.test() before CTRL-D ends up in : Ran 2892 tests in 35.814s OK (KNOWNFAIL=4, SKIP=6) >>> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Aborted (core dumped) Does that ring a bell to any of you? My system is Fedora 12, I used gfortran to build and In [10]: sys.path Out[10]: ['', '/home/cohen/.local/bin', '/home/cohen/sources/python/ipython', '/usr/lib/python26.zip', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/home/cohen/.local/lib/python2.6/site-packages', '/usr/lib/python2.6/site-packages', '/usr/lib/python2.6/site-packages/PIL', '/usr/lib/python2.6/site-packages/gst-0.10', '/usr/lib/python2.6/site-packages/gtk-2.0', '/usr/lib/python2.6/site-packages/webkit-1.0', '/usr/lib/python2.6/site-packages/wx-2.8-gtk2-unicode', '/home/cohen/sources/python/ipython/IPython/extensions', u'/home/cohen/.ipython'] numpy and ipython live in $HOME/.local thanks in advance, Johann From robert.kern at gmail.com Tue Mar 9 14:35:14 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Mar 2010 13:35:14 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96A1CD.7080702@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> Message-ID: <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> On Tue, Mar 9, 2010 at 13:30, Johann Cohen-Tanugi wrote: > hi there, > I just installed the current head of numpy and built it. trying > python and then import numpy, and then CTRL-D to exit, all goes well. > But doing the same with a numpy.test() before CTRL-D ends up in : > > Ran 2892 tests in 35.814s > > OK (KNOWNFAIL=4, SKIP=6) > > ?>>> > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs > != 0' failed. > Aborted (core dumped) > > Does that ring a bell to any of you? Nope! Can you show us a gdb backtrace of the crash? 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d.p.reichert at sms.ed.ac.uk Tue Mar 9 14:49:53 2010 From: d.p.reichert at sms.ed.ac.uk (David Reichert) Date: Tue, 9 Mar 2010 19:49:53 +0000 Subject: [Numpy-discussion] Memory leak in signal.convolve2d? Alternative? Message-ID: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> Hi, I just reported a memory leak with matrices, and I might have found another (unrelated) one in the convolve2d function: import scipy.signal from numpy import ones while True: scipy.signal.convolve2d(ones((1,1)), ones((1,1))) Is there an alternative implementation of a 2d convolution? On the long run I'd be interested in using GPU acceleration, but for now I'd just like to get my code to run without running out of memory... Cheers! David -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available URL: From cohen at lpta.in2p3.fr Tue Mar 9 14:50:15 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Tue, 09 Mar 2010 20:50:15 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> Message-ID: <4B96A677.2080206@lpta.in2p3.fr> I have tried to localize the core dump in vain.... any idea where I should look for it? I did not manage to catch it with pdb : [cohen at jarrett ~]$ .local/bin/ipython Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) Type "copyright", "credits" or "license" for more information. IPython 0.11.alpha1.bzr.r1223 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object'. ?object also works, ?? prints more. In [1]: import pdb In [2]: pdb.set_trace() --Call-- > /home/cohen/sources/python/ipython/IPython/core/prompts.py(525)__call__() -> def __call__(self,arg=None): (Pdb) import numpy (Pdb) numpy.test() Ran 2892 tests in 35.888s OK (KNOWNFAIL=4, SKIP=6) (Pdb) exit() ERROR: An unexpected error occurred while tokenizing input The following traceback may be corrupted or invalid The error message is: ('EOF in multi-line statement', (137, 0)) --------------------------------------------------------------------------- BdbQuit Traceback (most recent call last) /home/cohen/ in () /home/cohen/sources/python/ipython/IPython/core/prompts.pyc in __call__(self, arg) 523 self.prompt_out.set_colors() 524 --> 525 def __call__(self,arg=None): 526 """Printing with history cache management. 527 /usr/lib/python2.6/bdb.pyc in trace_dispatch(self, frame, event, arg) 46 return self.dispatch_line(frame) 47 if event == 'call': ---> 48 return self.dispatch_call(frame, arg) 49 if event == 'return': 50 return self.dispatch_return(frame, arg) /usr/lib/python2.6/bdb.pyc in dispatch_call(self, frame, arg) 76 return # None 77 self.user_call(frame, arg) ---> 78 if self.quitting: raise BdbQuit 79 return self.trace_dispatch 80 BdbQuit: In [3]: Do you really want to exit ([y]/n)? python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. 
Aborted (core dumped) Note that I have several warnings during test(): Warning: divide by zero encountered in power Warning: divide by zero encountered in power Warning: divide by zero encountered in power ...............................................................................Warning: invalid value encountered in sqrt ..Warning: invalid value encountered in sqrt etc.... I dont think it is related though.... Johann On 03/09/2010 08:35 PM, Robert Kern wrote: > On Tue, Mar 9, 2010 at 13:30, Johann Cohen-Tanugi wrote: > >> hi there, >> I just installed the current head of numpy and built it. trying >> python and then import numpy, and then CTRL-D to exit, all goes well. >> But doing the same with a numpy.test() before CTRL-D ends up in : >> >> Ran 2892 tests in 35.814s >> >> OK (KNOWNFAIL=4, SKIP=6) >> >> >>> >> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs >> != 0' failed. >> Aborted (core dumped) >> >> Does that ring a bell to any of you? >> > Nope! Can you show us a gdb backtrace of the crash? > > From robert.kern at gmail.com Tue Mar 9 14:56:11 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Mar 2010 13:56:11 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96A677.2080206@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> Message-ID: <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> On Tue, Mar 9, 2010 at 13:50, Johann Cohen-Tanugi wrote: > I have tried to localize the core dump in vain.... any idea where I should > look for it? > I did not manage to catch it with pdb : Not pdb, gdb. $ gdb python ... (gdb) run Starting program ... ... # Possibly another (gdb) prompt: (gdb) continue # <- Type this. Python 2.6.2 ... >>> import numpy # <- Type this and do whatever else is necessary to reproduce the crash. ... (gdb) bt # <- Type this. .... # <- Copy-paste these results here. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Mar 9 14:57:44 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Mar 2010 13:57:44 -0600 Subject: [Numpy-discussion] Memory leak in signal.convolve2d? Alternative? In-Reply-To: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> References: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> Message-ID: <3d375d731003091157x785fb0d1y6d8c29f8f471b8e5@mail.gmail.com> On Tue, Mar 9, 2010 at 13:49, David Reichert wrote: > Hi, > > I just reported a memory leak with matrices, and I might have found > another (unrelated) one in the convolve2d function: > > import scipy.signal > from numpy import ones > > while True: > ??? scipy.signal.convolve2d(ones((1,1)), ones((1,1))) This does not leak for me using SVN versions of numpy and scipy. I recommend upgrading. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From cohen at lpta.in2p3.fr Tue Mar 9 15:07:04 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Tue, 09 Mar 2010 21:07:04 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> Message-ID: <4B96AA68.9080106@lpta.in2p3.fr> thanks Robert, here it is : >>> exit() python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Program received signal SIGABRT, Aborted. 0x004a1416 in __kernel_vsyscall () Missing separate debuginfos, use: debuginfo-install atlas-3.8.3-12.fc12.i686 compat-libf2c-34-3.4.6-18.i686 keyutils-libs-1.2-6.fc12.i686 krb5-libs-1.7.1-2.fc12.i686 libcom_err-1.41.9-7.fc12.i686 libgcc-4.4.3-4.fc12.i686 libgfortran-4.4.3-4.fc12.i686 libselinux-2.0.90-5.fc12.i686 (gdb) bt #0 0x004a1416 in __kernel_vsyscall () #1 0x00609a91 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #2 0x0060b35a in abort () at abort.c:92 #3 0x00602be8 in __assert_fail (assertion=, file=, line=, function=) at assert.c:81 #4 0x0032e31e in visit_decref (op=, data=) at Modules/gcmodule.c:277 #5 0x002a18c2 in dict_traverse (op=, visit=, arg=) at Objects/dictobject.c:2003 #6 0x0032eaf3 in subtract_refs (generation=) at Modules/gcmodule.c:296 #7 collect (generation=) at Modules/gcmodule.c:817 #8 0x0032f640 in PyGC_Collect () at Modules/gcmodule.c:1292 #9 0x003200f0 in Py_Finalize () at Python/pythonrun.c:424 #10 0x00320218 in Py_Exit (sts=) at Python/pythonrun.c:1714 #11 0x00320367 in handle_system_exit () at Python/pythonrun.c:1116 #12 0x0032057d in PyErr_PrintEx (set_sys_last_vars=) at Python/pythonrun.c:1126 #13 0x0032079f in PyErr_Print () at Python/pythonrun.c:1035 #14 0x00320f9a in PyRun_InteractiveOneFlags (fp=, filename=, flags=) at Python/pythonrun.c:843 #15 0x00321093 in PyRun_InteractiveLoopFlags (fp=, filename=, flags=0xbfffefec) at Python/pythonrun.c:760 #16 0x003211df in PyRun_AnyFileExFlags (fp=, filename=0x365add "", closeit=, flags=) at Python/pythonrun.c:729 #17 0x0032dc85 in Py_Main (argc=, argv=) at Modules/main.c:599 #18 0x080485c8 in main (argc=, argv=) at Modules/python.c:23 I also saw things like: warning: .dynamic section for "/lib/libcom_err.so.2" is not at the expected address warning: difference appears to be caused by prelink, adjusting expectations warning: .dynamic section for "/lib/libkeyutils.so.1" is not at the expected address warning: difference appears to be caused by prelink, adjusting expectations Johann On 03/09/2010 08:56 PM, Robert Kern wrote: > On Tue, Mar 9, 2010 at 13:50, Johann Cohen-Tanugi wrote: > >> I have tried to localize the core dump in vain.... any idea where I should >> look for it? >> I did not manage to catch it with pdb : >> > Not pdb, gdb. > > $ gdb python > ... > (gdb) run > Starting program ... > ... # Possibly another (gdb) prompt: > (gdb) continue #<- Type this. > Python 2.6.2 ... > > >>>> import numpy #<- Type this and do whatever else is necessary to reproduce the crash. >>>> > ... > (gdb) bt #<- Type this. > .... #<- Copy-paste these results here. 
> > From cohen at lpta.in2p3.fr Tue Mar 9 15:14:10 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Tue, 09 Mar 2010 21:14:10 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96AA68.9080106@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> Message-ID: <4B96AC12.1000800@lpta.in2p3.fr> thinking about it, this morning there was a fedora update to python, so I am using 2.6.2-4.fc12. Looks like the problem is in python itself, hence this piece of info. HTH, Johann On 03/09/2010 09:07 PM, Johann Cohen-Tanugi wrote: > thanks Robert, here it is : > > >>> exit() > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs > != 0' failed. > > Program received signal SIGABRT, Aborted. > 0x004a1416 in __kernel_vsyscall () > Missing separate debuginfos, use: debuginfo-install > atlas-3.8.3-12.fc12.i686 compat-libf2c-34-3.4.6-18.i686 > keyutils-libs-1.2-6.fc12.i686 krb5-libs-1.7.1-2.fc12.i686 > libcom_err-1.41.9-7.fc12.i686 libgcc-4.4.3-4.fc12.i686 > libgfortran-4.4.3-4.fc12.i686 libselinux-2.0.90-5.fc12.i686 > (gdb) bt > #0 0x004a1416 in __kernel_vsyscall () > #1 0x00609a91 in raise (sig=6) at > ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > #2 0x0060b35a in abort () at abort.c:92 > #3 0x00602be8 in __assert_fail (assertion=, > file=, line=, function= optimized out>) > at assert.c:81 > #4 0x0032e31e in visit_decref (op=, data= optimized out>) at Modules/gcmodule.c:277 > #5 0x002a18c2 in dict_traverse (op=, visit= optimized out>, arg=) at Objects/dictobject.c:2003 > #6 0x0032eaf3 in subtract_refs (generation=) at > Modules/gcmodule.c:296 > #7 collect (generation=) at Modules/gcmodule.c:817 > #8 0x0032f640 in PyGC_Collect () at Modules/gcmodule.c:1292 > #9 0x003200f0 in Py_Finalize () at Python/pythonrun.c:424 > #10 0x00320218 in Py_Exit (sts=) at > Python/pythonrun.c:1714 > #11 0x00320367 in handle_system_exit () at Python/pythonrun.c:1116 > #12 0x0032057d in PyErr_PrintEx (set_sys_last_vars= out>) at Python/pythonrun.c:1126 > #13 0x0032079f in PyErr_Print () at Python/pythonrun.c:1035 > #14 0x00320f9a in PyRun_InteractiveOneFlags (fp=, > filename=, flags=) at > Python/pythonrun.c:843 > #15 0x00321093 in PyRun_InteractiveLoopFlags (fp=, > filename=, flags=0xbfffefec) at Python/pythonrun.c:760 > #16 0x003211df in PyRun_AnyFileExFlags (fp=, > filename=0x365add "", closeit=, flags= optimized out>) > at Python/pythonrun.c:729 > #17 0x0032dc85 in Py_Main (argc=, argv= optimized out>) at Modules/main.c:599 > #18 0x080485c8 in main (argc=, argv= optimized out>) at Modules/python.c:23 > > > I also saw things like: > warning: .dynamic section for "/lib/libcom_err.so.2" is not at the > expected address > warning: difference appears to be caused by prelink, adjusting expectations > warning: .dynamic section for "/lib/libkeyutils.so.1" is not at the > expected address > warning: difference appears to be caused by prelink, adjusting expectations > > Johann > > On 03/09/2010 08:56 PM, Robert Kern wrote: > >> On Tue, Mar 9, 2010 at 13:50, Johann Cohen-Tanugi wrote: >> >> >>> I have tried to localize the core dump in vain.... any idea where I should >>> look for it? >>> I did not manage to catch it with pdb : >>> >>> >> Not pdb, gdb. >> >> $ gdb python >> ... >> (gdb) run >> Starting program ... >> ... 
# Possibly another (gdb) prompt: >> (gdb) continue #<- Type this. >> Python 2.6.2 ... >> >> >> >>>>> import numpy #<- Type this and do whatever else is necessary to reproduce the crash. >>>>> >>>>> >> ... >> (gdb) bt #<- Type this. >> .... #<- Copy-paste these results here. >> >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From timmichelsen at gmx-topmail.de Tue Mar 9 15:57:35 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 09 Mar 2010 21:57:35 +0100 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <4B9698B6.204@noaa.gov> References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> <4B9698B6.204@noaa.gov> Message-ID: >> I still wonder why there is not a quick function for such a view / >> reshape conversion. > > Because it is difficult (impossible?) to do in the general case. .view() > really isn't that bad, in fact, it remarkably powerful and flexible! I would not drop .view() but rather add a convenience function for struct_1dtype_float_alt = struct_1dtype.view((np.float, len(struct_1dtype.dtype.names))) From d.p.reichert at sms.ed.ac.uk Tue Mar 9 16:24:49 2010 From: d.p.reichert at sms.ed.ac.uk (David Reichert) Date: Tue, 9 Mar 2010 21:24:49 +0000 Subject: [Numpy-discussion] Memory leak in signal.convolve2d? Alternative? In-Reply-To: <3d375d731003091157x785fb0d1y6d8c29f8f471b8e5@mail.gmail.com> References: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> <3d375d731003091157x785fb0d1y6d8c29f8f471b8e5@mail.gmail.com> Message-ID: <7b3b01551003091324of483f5cnecbec3dfade2a5c6@mail.gmail.com> Hm, upgrading scipy from 0.7.0 to 0.7.1 didn't do the trick for me (still running numpy 1.3.0). I'm not sure if I feel confident enough to use developer versions, but I'll look into it. Cheers David On Tue, Mar 9, 2010 at 7:57 PM, Robert Kern wrote: > On Tue, Mar 9, 2010 at 13:49, David Reichert > wrote: > > Hi, > > > > I just reported a memory leak with matrices, and I might have found > > another (unrelated) one in the convolve2d function: > > > > import scipy.signal > > from numpy import ones > > > > while True: > > scipy.signal.convolve2d(ones((1,1)), ones((1,1))) > > This does not leak for me using SVN versions of numpy and scipy. I > recommend upgrading. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available URL: From josef.pktd at gmail.com Tue Mar 9 16:32:56 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 16:32:56 -0500 Subject: [Numpy-discussion] Memory leak in signal.convolve2d? Alternative? 
In-Reply-To: <7b3b01551003091324of483f5cnecbec3dfade2a5c6@mail.gmail.com> References: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> <3d375d731003091157x785fb0d1y6d8c29f8f471b8e5@mail.gmail.com> <7b3b01551003091324of483f5cnecbec3dfade2a5c6@mail.gmail.com> Message-ID: <1cd32cbb1003091332t1456636bxdd7895b166bb198@mail.gmail.com> On Tue, Mar 9, 2010 at 4:24 PM, David Reichert wrote: > Hm, upgrading scipy from 0.7.0 to 0.7.1 didn't do the trick for me (still > running numpy 1.3.0). > I'm not sure if I feel confident enough to use developer versions, but I'll > look into it. If you don't need the extra options, you could also use in the meantime the nd version signal.convolve or the fft version signal.fftconvolve Josef > > Cheers > > David > > On Tue, Mar 9, 2010 at 7:57 PM, Robert Kern wrote: >> >> On Tue, Mar 9, 2010 at 13:49, David Reichert >> wrote: >> > Hi, >> > >> > I just reported a memory leak with matrices, and I might have found >> > another (unrelated) one in the convolve2d function: >> > >> > import scipy.signal >> > from numpy import ones >> > >> > while True: >> > ??? scipy.signal.convolve2d(ones((1,1)), ones((1,1))) >> >> This does not leak for me using SVN versions of numpy and scipy. I >> recommend upgrading. >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> ?-- Umberto Eco >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From timmichelsen at gmx-topmail.de Tue Mar 9 17:49:05 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 09 Mar 2010 23:49:05 +0100 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <1cd32cbb1003090950n742ea1ebo2dc91e64d1f3a663@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> <1cd32cbb1003090950n742ea1ebo2dc91e64d1f3a663@mail.gmail.com> Message-ID: josef.pktd at gmail.com schrieb: > On Mon, Mar 8, 2010 at 5:50 PM, Tim Michelsen > wrote: >> Hello, >> thanks to all who responded and have their input here. >> >> I added a little code snippet to show the view and reshape: >> >> http://www.scipy.org/Cookbook/Recarray >> >> What do you think? >> Is this worth to go into the official docs? >> The page http://docs.scipy.org/doc/numpy/user/basics.rec.html is quite >> sparse... >> >> I still wonder why there is not a quick function for such a view / >> reshape conversion. > > > Thanks, the docs for working with arrays with structured dtypes are sparse > > Note that in your example .view(np.ndarray) doesn't do anything Please note that I wanted to demonstrate many different ways of view & reshape. 
I updated the page: http://www.scipy.org/Cookbook/Recarray From timmichelsen at gmx-topmail.de Tue Mar 9 17:50:21 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 09 Mar 2010 23:50:21 +0100 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <1cd32cbb1003090950n742ea1ebo2dc91e64d1f3a663@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <1cd32cbb1003081101h891b4e4w331f035a5467e47a@mail.gmail.com> <1cd32cbb1003081117k313e8532oa8ca638f560002c2@mail.gmail.com> <1cd32cbb1003081130l4e3fd856jcb62e67cfcf8c1dc@mail.gmail.com> <1cd32cbb1003090950n742ea1ebo2dc91e64d1f3a663@mail.gmail.com> Message-ID: >> Is this worth to go into the official docs? >> The page http://docs.scipy.org/doc/numpy/user/basics.rec.html is quite >> sparse... >> >> I still wonder why there is not a quick function for such a view / >> reshape conversion. > > > Thanks, the docs for working with arrays with structured dtypes are sparse I cannot recover my longin for docs.scipy.org. Would you advice to add the elaborated example? From pav at iki.fi Tue Mar 9 18:07:34 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Mar 2010 01:07:34 +0200 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96AC12.1000800@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> Message-ID: <1268176055.10379.14.camel@idol> ti, 2010-03-09 kello 21:14 +0100, Johann Cohen-Tanugi kirjoitti: > thinking about it, this morning there was a fedora update to python, so > I am using 2.6.2-4.fc12. > Looks like the problem is in python itself, hence this piece of info. That the problem would be in Python is not so clear to me. Can you try running it with the previous Python shipped by Fedora? Do you see the problem then? What's the previous version, btw? Memory errors are somewhat difficult to debug. Can you try running only a certain subset of the tests, first nosetests numpy.core Be sure to set Pythonpath so that you get the correct Numpy version. If it segfaults, proceed to (under numpy/core/tests) nosetests test_multiarray.py nosetests test_multiarray.py:TestNewBufferProtocol Since the crash occurs in cyclic garbage collection, I'm doubting a bit the numpy/core/src/multiarray/numpymemoryview.c implementation, since that's the only part in Numpy that supports that. Alternatively, just replace numpymemoryview.c with the attached one which has cyclic GC stripped, and see if you still get the crash. Cheers, Pauli -------------- next part -------------- A non-text attachment was scrubbed... 
Name: numpymemoryview.c Type: text/x-csrc Size: 8573 bytes Desc: not available URL: From cohen at lpta.in2p3.fr Tue Mar 9 18:33:02 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 00:33:02 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <1268176055.10379.14.camel@idol> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> Message-ID: <4B96DAAE.6060408@lpta.in2p3.fr> On 03/10/2010 12:07 AM, Pauli Virtanen wrote: > ti, 2010-03-09 kello 21:14 +0100, Johann Cohen-Tanugi kirjoitti: > >> thinking about it, this morning there was a fedora update to python, so >> I am using 2.6.2-4.fc12. >> Looks like the problem is in python itself, hence this piece of info. >> > That the problem would be in Python is not so clear to me. Can you try > running it with the previous Python shipped by Fedora? Do you see the > problem then? What's the previous version, btw? > 2.6.2-1 IIRC. I would have to check, and I am not sure how to either query this information or step back one update up with yum :( > Memory errors are somewhat difficult to debug. Can you try running only > a certain subset of the tests, first > > nosetests numpy.core > crash > Be sure to set Pythonpath so that you get the correct Numpy version. If > it segfaults, proceed to (under numpy/core/tests) > > nosetests test_multiarray.py > nosetests test_multiarray.py:TestNewBufferProtocol > neither crash, so the problem is not there.... > Since the crash occurs in cyclic garbage collection, I'm doubting a bit > the numpy/core/src/multiarray/numpymemoryview.c implementation, since > that's the only part in Numpy that supports that. > > Alternatively, just replace numpymemoryview.c with the attached one > which has cyclic GC stripped, and see if you still get the crash. > > Cheers, > Pauli > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohen at lpta.in2p3.fr Tue Mar 9 18:43:39 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 00:43:39 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96DAAE.6060408@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> Message-ID: <4B96DD2B.4040103@lpta.in2p3.fr> On 03/10/2010 12:33 AM, Johann Cohen-Tanugi wrote: > > > On 03/10/2010 12:07 AM, Pauli Virtanen wrote: >> ti, 2010-03-09 kello 21:14 +0100, Johann Cohen-Tanugi kirjoitti: >> >>> thinking about it, this morning there was a fedora update to python, so >>> I am using 2.6.2-4.fc12. >>> Looks like the problem is in python itself, hence this piece of info. >>> >> That the problem would be in Python is not so clear to me. Can you try >> running it with the previous Python shipped by Fedora? Do you see the >> problem then? What's the previous version, btw? >> > 2.6.2-1 IIRC. 
I would have to check, and I am not sure how to either > query this information or step back one update up with yum :( >> Memory errors are somewhat difficult to debug. Can you try running only >> a certain subset of the tests, first >> >> nosetests numpy.core >> > crash >> Be sure to set Pythonpath so that you get the correct Numpy version. If >> it segfaults, proceed to (under numpy/core/tests) >> >> nosetests test_multiarray.py >> nosetests test_multiarray.py:TestNewBufferProtocol >> > neither crash, so the problem is not there.... I followed your lead and tried each script and ended up with : [cohen at jarrett tests]$ nosetests test_ufunc.py ............. ---------------------------------------------------------------------- Ran 13 tests in 1.146s OK python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Aborted (core dumped) so test_ufunc.py seems to be at stake >> Since the crash occurs in cyclic garbage collection, I'm doubting a bit >> the numpy/core/src/multiarray/numpymemoryview.c implementation, since >> that's the only part in Numpy that supports that. >> >> Alternatively, just replace numpymemoryview.c with the attached one >> which has cyclic GC stripped, and see if you still get the crash. >> >> Cheers, >> Pauli >> >> >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -- > This message has been scanned for viruses and > dangerous content by *MailScanner* , and is > believed to be clean. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohen at lpta.in2p3.fr Tue Mar 9 18:52:40 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 00:52:40 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96DD2B.4040103@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> Message-ID: <4B96DF48.60501@lpta.in2p3.fr> more fun : [cohen at jarrett tests]$ pwd /home/cohen/sources/python/numpy/numpy/core/tests [cohen at jarrett tests]$ python -c 'import test_ufunc' python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Aborted (core dumped) and in the debugger: (gdb) run Starting program: /usr/bin/python warning: .dynamic section for "/lib/libpthread.so.0" is not at the expected address warning: difference appears to be caused by prelink, adjusting expectations warning: .dynamic section for "/lib/libdl.so.2" is not at the expected address warning: difference appears to be caused by prelink, adjusting expectations warning: .dynamic section for "/lib/libc.so.6" is not at the expected address warning: difference appears to be caused by prelink, adjusting expectations [Thread debugging using libthread_db enabled] Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) [GCC 4.4.2 20091222 (Red Hat 4.4.2-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import test_ufunc >>> >>> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Program received signal SIGABRT, Aborted. 0x00aab416 in __kernel_vsyscall () Missing separate debuginfos, use: debuginfo-install atlas-3.8.3-12.fc12.i686 libgcc-4.4.3-4.fc12.i686 libgfortran-4.4.3-4.fc12.i686 (gdb) bt #0 0x00aab416 in __kernel_vsyscall () #1 0x00159a91 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #2 0x0015b35a in abort () at abort.c:92 #3 0x00152be8 in __assert_fail (assertion=, file=, line=, function=) at assert.c:81 #4 0x0050931e in visit_decref (op=, data=) at Modules/gcmodule.c:277 #5 0x0047c8c2 in dict_traverse (op=, visit=, arg=) at Objects/dictobject.c:2003 #6 0x00509af3 in subtract_refs (generation=) at Modules/gcmodule.c:296 #7 collect (generation=) at Modules/gcmodule.c:817 #8 0x0050a640 in PyGC_Collect () at Modules/gcmodule.c:1292 #9 0x004fb0f0 in Py_Finalize () at Python/pythonrun.c:424 #10 0x0050868f in Py_Main (argc=, argv=) at Modules/main.c:625 #11 0x080485c8 in main (argc=, argv=) at Modules/python.c:23 which looks identical to the bt I sent to Robert earlier on. HTH, Johann On 03/10/2010 12:43 AM, Johann Cohen-Tanugi wrote: > > > On 03/10/2010 12:33 AM, Johann Cohen-Tanugi wrote: >> >> >> On 03/10/2010 12:07 AM, Pauli Virtanen wrote: >>> ti, 2010-03-09 kello 21:14 +0100, Johann Cohen-Tanugi kirjoitti: >>> >>>> thinking about it, this morning there was a fedora update to python, so >>>> I am using 2.6.2-4.fc12. >>>> Looks like the problem is in python itself, hence this piece of info. >>>> >>> That the problem would be in Python is not so clear to me. Can you try >>> running it with the previous Python shipped by Fedora? Do you see the >>> problem then? What's the previous version, btw? >>> >> 2.6.2-1 IIRC. I would have to check, and I am not sure how to either >> query this information or step back one update up with yum :( >>> Memory errors are somewhat difficult to debug. Can you try running only >>> a certain subset of the tests, first >>> >>> nosetests numpy.core >>> >> crash >>> Be sure to set Pythonpath so that you get the correct Numpy version. If >>> it segfaults, proceed to (under numpy/core/tests) >>> >>> nosetests test_multiarray.py >>> nosetests test_multiarray.py:TestNewBufferProtocol >>> >> neither crash, so the problem is not there.... > I followed your lead and tried each script and ended up with : > [cohen at jarrett tests]$ nosetests test_ufunc.py > ............. > ---------------------------------------------------------------------- > Ran 13 tests in 1.146s > > OK > python: Modules/gcmodule.c:277: visit_decref: Assertion > `gc->gc.gc_refs != 0' failed. > Aborted (core dumped) > > so test_ufunc.py seems to be at stake > >>> Since the crash occurs in cyclic garbage collection, I'm doubting a bit >>> the numpy/core/src/multiarray/numpymemoryview.c implementation, since >>> that's the only part in Numpy that supports that. >>> >>> Alternatively, just replace numpymemoryview.c with the attached one >>> which has cyclic GC stripped, and see if you still get the crash. >>> >>> Cheers, >>> Pauli >>> >>> >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> -- >> This message has been scanned for viruses and >> dangerous content by *MailScanner* , >> and is >> believed to be clean. 
>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -- > This message has been scanned for viruses and > dangerous content by *MailScanner* , and is > believed to be clean. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce.schultz at gmail.com Tue Mar 9 19:09:34 2010 From: bruce.schultz at gmail.com (Bruce Schultz) Date: Wed, 10 Mar 2010 10:09:34 +1000 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <49d6b3501003051435s71e800cbq5a2461348220d1d2@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <49d6b3501003051435s71e800cbq5a2461348220d1d2@mail.gmail.com> Message-ID: <17a8763f1003091609scb416c3j13fbf986026ca340@mail.gmail.com> On Sat, Mar 6, 2010 at 8:35 AM, G?khan Sever wrote: > > > On Fri, Mar 5, 2010 at 8:00 AM, Bruce Schultz > wrote: >> >> Output is: >> ### ndarray >> [[ 1.?? 2. ] >> ?[ 3.?? 4.1]] >> ### structured array >> [(1.0, 2.0) (3.0, 4.0999999999999996)] >> >> >> Thanks >> Bruce >> > > I still couldn't figure out how floating point numbers look nicely on screen > in cases like yours (i.e., trying numpy.array2string()) but you can make > sure by using numpy.savetxt("file", array, fmt="%.1f") you will always have > specified precision in the written file. Using numpy.array2string() gives the same format as the output above. Using numpy.savetxt() creates the same nicely formatted file containing the lines below for both structured and unstructured arrays. 1.0 2.0 3.0 4.1 But I was mainly curious about this because I just want to quickly dump data out to the console for debugging, and the unstructured format is obviously much easier to read. It seems like from other discussion in the thread that the quick solution is to convert back to a unstructured array with something like view((float, 2)), but that seems a bit clumsy. Cheers Bruce From d.p.reichert at sms.ed.ac.uk Tue Mar 9 19:13:26 2010 From: d.p.reichert at sms.ed.ac.uk (David Reichert) Date: Wed, 10 Mar 2010 00:13:26 +0000 Subject: [Numpy-discussion] Memory leak in signal.convolve2d? Alternative? In-Reply-To: <1cd32cbb1003091332t1456636bxdd7895b166bb198@mail.gmail.com> References: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> <3d375d731003091157x785fb0d1y6d8c29f8f471b8e5@mail.gmail.com> <7b3b01551003091324of483f5cnecbec3dfade2a5c6@mail.gmail.com> <1cd32cbb1003091332t1456636bxdd7895b166bb198@mail.gmail.com> Message-ID: <7b3b01551003091613y1ce955d6m721f341bb23ba739@mail.gmail.com> Hi, Just another update: signal.convolve and signal.fftconvolve indeed do not seem to have the problem, however, they are slower by at least a factor of 2 for my situation. Moreover, I also tried out the numpy 1.4.x branch and the latest scipy svn, and a short test seemed to indicate that the memory leak still was present (albeit, interestingly, memory usage grew much slower), but then something else stopped working in my main program, so I converted back to scipy 7.1 and numpy 1.3.0 for now. I might have confused things somewhere along the line, though, I'm not an expert with these things. Maybe other people can confirm the problem one way or another. Thanks, David On Tue, Mar 9, 2010 at 9:32 PM, wrote: > On Tue, Mar 9, 2010 at 4:24 PM, David Reichert > wrote: > > Hm, upgrading scipy from 0.7.0 to 0.7.1 didn't do the trick for me (still > > running numpy 1.3.0). 
> > I'm not sure if I feel confident enough to use developer versions, but > I'll > > look into it. > > If you don't need the extra options, you could also use in the > meantime the nd version signal.convolve > or the fft version signal.fftconvolve > > Josef > > > > > Cheers > > > > David > > > > On Tue, Mar 9, 2010 at 7:57 PM, Robert Kern > wrote: > >> > >> On Tue, Mar 9, 2010 at 13:49, David Reichert > > >> wrote: > >> > Hi, > >> > > >> > I just reported a memory leak with matrices, and I might have found > >> > another (unrelated) one in the convolve2d function: > >> > > >> > import scipy.signal > >> > from numpy import ones > >> > > >> > while True: > >> > scipy.signal.convolve2d(ones((1,1)), ones((1,1))) > >> > >> This does not leak for me using SVN versions of numpy and scipy. I > >> recommend upgrading. > >> > >> -- > >> Robert Kern > >> > >> "I have come to believe that the whole world is an enigma, a harmless > >> enigma that is made terrible by our own mad attempt to interpret it as > >> though it had an underlying truth." > >> -- Umberto Eco > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > The University of Edinburgh is a charitable body, registered in > > Scotland, with registration number SC005336. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available URL: From david at silveregg.co.jp Tue Mar 9 19:32:13 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 10 Mar 2010 09:32:13 +0900 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: <1cd32cbb1003091024m32c41209u99babe058b8f0e69@mail.gmail.com> References: <1cd32cbb1003091024m32c41209u99babe058b8f0e69@mail.gmail.com> Message-ID: <4B96E88D.3040703@silveregg.co.jp> josef.pktd at gmail.com wrote: > On Tue, Mar 9, 2010 at 9:02 AM, Pauli Virtanen wrote: >> Mon, 08 Mar 2010 22:39:00 -0700, Charles R Harris wrote: >>> On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman >>> wrote: >>>> I added Titus' email regarding the PSF's focus on Py3K-related >>>> projects to our SoC ideas wiki page: >>>> http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas >>>> >>>> Given Titus' email, this is the most likely list of projects we will >>>> get accepted this year: >>>> >>>> - finish porting NumPy to Py3K >>> I think Numpy is pretty much done. It needs use testing to wring out any >>> small oddities, but it doesn't look to me like a GSOC project at the >>> moment. Maybe Pauli can weigh in here. >> I think it's pretty much done. Even f2py should work. What's left is >> mostly testing it, and fixing any issues that crop up. >> >> Note that the SVN Numpy should preferably still see more testing on >> Python 2 against real-world packages that use it, before release. There's >> been a significant amount of code churn. >> >>>> - port SciPy to Py3K >>> This project might be more appropriate, although I'm not clear on what >>> needs to be done. 
>> I think porting Scipy proceeds in these steps: >> >> 1. Set up a similar 2to3-using build system for Python 3 as currently in >> Numpy. >> >> 2. Change the code so that it works both on Python 2 and Python 3. >> This can be done one submodule at a time. >> >> For C code, the changes required are mainly PyString usage. Some macros >> need to be defined to allow the same code to build on Py2 and Py3. >> Scipy is probably easier to port than Numpy here, since IIRC it doesn't >> mess much with strings, and its use of the Python C-API is much more >> limited. Also, parts written with Cython need essentially no porting. >> >> For Python code, the main issue is again bytes vs. unicode fixes, >> mainly inserting numpy.compat.asbytes() at correct locations to make >> e.g. bytes literals. All I/O code should be changed to work solely with >> Bytes. Since 2to3 takes care of most the other things, the remaining >> work is in fixing things it mishandles. >> >> I don't have a good estimate for how much work is in making Scipy work. >> I'd guess the work needed is probably slightly less than for Numpy. > > a few questions: > > Is scipy.special difficult or time consuming to port? I don't think it would be - most of it is fortran, so assuming f2py works correctly for py3k, there should not be big issues. > In the build system, is it possible not to build some subpackages that > might be slow in being ported, e.g. ndimage, weave? The way I used to do when porting to a new compiler/new platform is simply to comment everything but one package at a time in scipy/setup.py. linalg/lib/clusters are the first ones to port. I don't think special depends on much more than linalg/lib, but I could be wrong. David From pav at iki.fi Tue Mar 9 19:55:50 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Mar 2010 02:55:50 +0200 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96DF48.60501@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> Message-ID: <1268182550.1748.2.camel@Nokia-N900-42-11> > more fun : > [cohen at jarrett tests]$ pwd > /home/cohen/sources/python/numpy/numpy/core/tests > [cohen at jarrett tests]$ python -c 'import test_ufunc' > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs > != 0' failed. > Aborted (core dumped) What happens if you only import the umath_tests module (or something, it's a .so under numpy.core)? Just importing test_ufunc.py probably doesn't run a lot of extension code. Since you don't get crashes only by importing numpy, I really don't see many alternatives any more... 
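For instance, something along these lines -- only a sketch, and it assumes the umath_tests extension built under numpy/core is importable from your numpy -- would tell whether that single .so is enough to corrupt a refcount:

import gc
from numpy.core import umath_tests   # the generalized-ufunc test extension

# force a cyclic-GC pass while the interpreter is still fully alive; if a
# refcount is already too low, this can trip the same visit_decref
# assertion that otherwise only shows up inside Py_Finalize at exit
gc.collect()
print "explicit gc.collect() survived; any crash is left for interpreter exit"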
Thanks, Pauli From josef.pktd at gmail.com Tue Mar 9 19:59:15 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 19:59:15 -0500 Subject: [Numpy-discussion] PSF GSoC 2010 (Py3K focus) In-Reply-To: <4B96E88D.3040703@silveregg.co.jp> References: <1cd32cbb1003091024m32c41209u99babe058b8f0e69@mail.gmail.com> <4B96E88D.3040703@silveregg.co.jp> Message-ID: <1cd32cbb1003091659t5dc6ce13ofc1553f4191128d9@mail.gmail.com> On Tue, Mar 9, 2010 at 7:32 PM, David Cournapeau wrote: > josef.pktd at gmail.com wrote: >> On Tue, Mar 9, 2010 at 9:02 AM, Pauli Virtanen wrote: >>> Mon, 08 Mar 2010 22:39:00 -0700, Charles R Harris wrote: >>>> On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman >>>> wrote: >>>>> I added Titus' email regarding the PSF's focus on Py3K-related >>>>> projects to our SoC ideas wiki page: >>>>> http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas >>>>> >>>>> Given Titus' email, this is the most likely list of projects we will >>>>> get accepted this year: >>>>> >>>>> - finish porting NumPy to Py3K >>>> I think Numpy is pretty much done. It needs use testing to wring out any >>>> small oddities, but it doesn't look to me like a GSOC project at the >>>> moment. Maybe Pauli can weigh in here. >>> I think it's pretty much done. Even f2py should work. What's left is >>> mostly testing it, and fixing any issues that crop up. >>> >>> Note that the SVN Numpy should preferably still see more testing on >>> Python 2 against real-world packages that use it, before release. There's >>> been a significant amount of code churn. >>> >>>>> - port SciPy to Py3K >>>> This project might be more appropriate, although I'm not clear on what >>>> needs to be done. >>> I think porting Scipy proceeds in these steps: >>> >>> 1. Set up a similar 2to3-using build system for Python 3 as currently in >>> ? Numpy. >>> >>> 2. Change the code so that it works both on Python 2 and Python 3. >>> ? This can be done one submodule at a time. >>> >>> ? For C code, the changes required are mainly PyString usage. Some macros >>> ? need to be defined to allow the same code to build on Py2 and Py3. >>> ? Scipy is probably easier to port than Numpy here, since IIRC it doesn't >>> ? mess much with strings, and its use of the Python C-API is much more >>> ? limited. Also, parts written with Cython need essentially no porting. >>> >>> ? For Python code, the main issue is again bytes vs. unicode fixes, >>> ? mainly inserting numpy.compat.asbytes() at correct locations to make >>> ? e.g. bytes literals. All I/O code should be changed to work solely with >>> ? Bytes. Since 2to3 takes care of most the other things, the remaining >>> ? work is in fixing things it mishandles. >>> >>> I don't have a good estimate for how much work is in making Scipy work. >>> I'd guess the work needed is probably slightly less than for Numpy. >> >> a few questions: >> >> Is scipy.special difficult or time consuming to port? > > I don't think it would be - most of it is fortran, so assuming f2py > works correctly for py3k, there should not be big issues. > >> In the build system, is it possible not to build some subpackages that >> might be slow in being ported, e.g. ndimage, weave? > > The way I used to do when porting to a new compiler/new platform is > simply to comment everything but one package at a time in > scipy/setup.py. linalg/lib/clusters are the first ones to port. I don't > think special depends on much more than linalg/lib, but I could be wrong. 
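So, roughly something like this in scipy/setup.py then? (Just a sketch from memory -- the real file's add_subpackage list is longer and the names may not match exactly.)

def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('scipy', parent_package, top_path)
    # build only the subpackages that have been ported so far
    config.add_subpackage('cluster')
    config.add_subpackage('lib')
    config.add_subpackage('linalg')
    #config.add_subpackage('ndimage')   # leave out until ported
    #config.add_subpackage('weave')     # leave out until ported
    return config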
Thanks, Josef > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Mar 9 20:05:52 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Mar 2010 20:05:52 -0500 Subject: [Numpy-discussion] Memory leak in signal.convolve2d? Alternative? In-Reply-To: <7b3b01551003091613y1ce955d6m721f341bb23ba739@mail.gmail.com> References: <7b3b01551003091149y4848ebb2tcca33570d5fa0b5e@mail.gmail.com> <3d375d731003091157x785fb0d1y6d8c29f8f471b8e5@mail.gmail.com> <7b3b01551003091324of483f5cnecbec3dfade2a5c6@mail.gmail.com> <1cd32cbb1003091332t1456636bxdd7895b166bb198@mail.gmail.com> <7b3b01551003091613y1ce955d6m721f341bb23ba739@mail.gmail.com> Message-ID: <1cd32cbb1003091705l9144d74ha04f185494faacb4@mail.gmail.com> On Tue, Mar 9, 2010 at 7:13 PM, David Reichert wrote: > Hi, > > Just another update: > > signal.convolve and signal.fftconvolve indeed do not seem to have the > problem, > however, they are slower by at least a factor of 2 for my situation. > > Moreover, I also tried out the numpy 1.4.x branch and the latest scipy svn, > and a short test seemed to indicate that the memory leak still was present > (albeit, interestingly, memory usage grew much slower), but then something > else stopped working in my main program, so I converted back to scipy 7.1 > and numpy 1.3.0 for now. > > I might have confused things somewhere along the line, though, I'm > not an expert with these things. Maybe other people can confirm the problem > one way or another. I ran your convolve2d example for a few minutes without any increase in memory with numpy 1.4.0 and scipy svn that is about 2 months old Josef > > Thanks, > > David > > On Tue, Mar 9, 2010 at 9:32 PM, wrote: >> >> On Tue, Mar 9, 2010 at 4:24 PM, David Reichert >> wrote: >> > Hm, upgrading scipy from 0.7.0 to 0.7.1 didn't do the trick for me >> > (still >> > running numpy 1.3.0). >> > I'm not sure if I feel confident enough to use developer versions, but >> > I'll >> > look into it. >> >> If you don't need the extra options, you could also use in the >> meantime the nd version signal.convolve >> or the fft version signal.fftconvolve >> >> Josef >> >> > >> > Cheers >> > >> > David >> > >> > On Tue, Mar 9, 2010 at 7:57 PM, Robert Kern >> > wrote: >> >> >> >> On Tue, Mar 9, 2010 at 13:49, David Reichert >> >> >> >> wrote: >> >> > Hi, >> >> > >> >> > I just reported a memory leak with matrices, and I might have found >> >> > another (unrelated) one in the convolve2d function: >> >> > >> >> > import scipy.signal >> >> > from numpy import ones >> >> > >> >> > while True: >> >> > ??? scipy.signal.convolve2d(ones((1,1)), ones((1,1))) >> >> >> >> This does not leak for me using SVN versions of numpy and scipy. I >> >> recommend upgrading. >> >> >> >> -- >> >> Robert Kern >> >> >> >> "I have come to believe that the whole world is an enigma, a harmless >> >> enigma that is made terrible by our own mad attempt to interpret it as >> >> though it had an underlying truth." >> >> ?-- Umberto Eco >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > The University of Edinburgh is a charitable body, registered in >> > Scotland, with registration number SC005336. 
>> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Wed Mar 10 00:27:31 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 9 Mar 2010 23:27:31 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B96DF48.60501@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> Message-ID: On Tue, Mar 9, 2010 at 5:52 PM, Johann Cohen-Tanugi wrote: > more fun : > [cohen at jarrett tests]$ pwd > /home/cohen/sources/python/numpy/numpy/core/tests > [cohen at jarrett tests]$ python -c 'import test_ufunc' > > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != > 0' failed. > Aborted (core dumped) > > and in the debugger: > (gdb) run > Starting program: /usr/bin/python > warning: .dynamic section for "/lib/libpthread.so.0" is not at the expected > address > > warning: difference appears to be caused by prelink, adjusting expectations > warning: .dynamic section for "/lib/libdl.so.2" is not at the expected > address > > warning: difference appears to be caused by prelink, adjusting expectations > warning: .dynamic section for "/lib/libc.so.6" is not at the expected > address > > warning: difference appears to be caused by prelink, adjusting expectations > [Thread debugging using libthread_db enabled] > > Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) > [GCC 4.4.2 20091222 (Red Hat 4.4.2-20)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import test_ufunc > > >>> > >>> > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != > 0' failed. > > Program received signal SIGABRT, Aborted. 
> 0x00aab416 in __kernel_vsyscall () > Missing separate debuginfos, use: debuginfo-install > atlas-3.8.3-12.fc12.i686 libgcc-4.4.3-4.fc12.i686 > libgfortran-4.4.3-4.fc12.i686 > (gdb) bt > #0 0x00aab416 in __kernel_vsyscall () > #1 0x00159a91 in raise (sig=6) at > ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > #2 0x0015b35a in abort () at abort.c:92 > #3 0x00152be8 in __assert_fail (assertion=, > file=, line=, function= optimized out>) at assert.c:81 > #4 0x0050931e in visit_decref (op=, data= optimized out>) at Modules/gcmodule.c:277 > #5 0x0047c8c2 in dict_traverse (op=, visit= optimized out>, arg=) at Objects/dictobject.c:2003 > #6 0x00509af3 in subtract_refs (generation=) at > Modules/gcmodule.c:296 > > #7 collect (generation=) at Modules/gcmodule.c:817 > #8 0x0050a640 in PyGC_Collect () at Modules/gcmodule.c:1292 > #9 0x004fb0f0 in Py_Finalize () at Python/pythonrun.c:424 > #10 0x0050868f in Py_Main (argc=, argv= optimized out>) at Modules/main.c:625 > #11 0x080485c8 in main (argc=, argv= out>) at Modules/python.c:23 > > which looks identical to the bt I sent to Robert earlier on. > > HTH, > Johann > > > On 03/10/2010 12:43 AM, Johann Cohen-Tanugi wrote: > > > > On 03/10/2010 12:33 AM, Johann Cohen-Tanugi wrote: > > > > On 03/10/2010 12:07 AM, Pauli Virtanen wrote: > > ti, 2010-03-09 kello 21:14 +0100, Johann Cohen-Tanugi kirjoitti: > > > thinking about it, this morning there was a fedora update to python, so > I am using 2.6.2-4.fc12. > Looks like the problem is in python itself, hence this piece of info. > > > That the problem would be in Python is not so clear to me. Can you try > running it with the previous Python shipped by Fedora? Do you see the > problem then? What's the previous version, btw? > > > 2.6.2-1 IIRC. I would have to check, and I am not sure how to either query > this information or step back one update up with yum :( > > Memory errors are somewhat difficult to debug. Can you try running only > a certain subset of the tests, first > > nosetests numpy.core > > > crash > > Be sure to set Pythonpath so that you get the correct Numpy version. If > it segfaults, proceed to (under numpy/core/tests) > > nosetests test_multiarray.py > nosetests test_multiarray.py:TestNewBufferProtocol > > > neither crash, so the problem is not there.... > > I followed your lead and tried each script and ended up with : > [cohen at jarrett tests]$ nosetests test_ufunc.py > ............. > ---------------------------------------------------------------------- > Ran 13 tests in 1.146s > > OK > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != > 0' failed. > Aborted (core dumped) > > so test_ufunc.py seems to be at stake > > Since the crash occurs in cyclic garbage collection, I'm doubting a bit > the numpy/core/src/multiarray/numpymemoryview.c implementation, since > that's the only part in Numpy that supports that. > > Alternatively, just replace numpymemoryview.c with the attached one > which has cyclic GC stripped, and see if you still get the crash. > > Cheers, > Pauli > > > > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -- > This message has been scanned for viruses and > dangerous content by *MailScanner* , and is > believed to be clean. 
> > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Python 2.6.2 is rather old by now, the bugfix releases are up to 2.6.4. I don't know that that is related, but I haven't seen any crashes on ubuntu with 2.6.4. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohen at lpta.in2p3.fr Wed Mar 10 04:24:30 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 10:24:30 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> Message-ID: <4B97654E.3010904@lpta.in2p3.fr> Ubuntu has a much shorter cycle of updates than Fedora, indeed. On 03/10/2010 06:27 AM, Charles R Harris wrote: > > > On Tue, Mar 9, 2010 at 5:52 PM, Johann Cohen-Tanugi > > wrote: > > more fun : > [cohen at jarrett tests]$ pwd > /home/cohen/sources/python/numpy/numpy/core/tests > [cohen at jarrett tests]$ python -c 'import test_ufunc' > > python: Modules/gcmodule.c:277: visit_decref: Assertion > `gc->gc.gc_refs != 0' failed. > Aborted (core dumped) > > and in the debugger: > (gdb) run > Starting program: /usr/bin/python > warning: .dynamic section for "/lib/libpthread.so.0" is not at the > expected address > > warning: difference appears to be caused by prelink, adjusting > expectations > warning: .dynamic section for "/lib/libdl.so.2" is not at the > expected address > > warning: difference appears to be caused by prelink, adjusting > expectations > warning: .dynamic section for "/lib/libc.so.6" is not at the > expected address > > warning: difference appears to be caused by prelink, adjusting > expectations > [Thread debugging using libthread_db enabled] > > Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) > [GCC 4.4.2 20091222 (Red Hat 4.4.2-20)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import test_ufunc > > >>> > >>> > python: Modules/gcmodule.c:277: visit_decref: Assertion > `gc->gc.gc_refs != 0' failed. > > Program received signal SIGABRT, Aborted. 
> 0x00aab416 in __kernel_vsyscall () > Missing separate debuginfos, use: debuginfo-install > atlas-3.8.3-12.fc12.i686 libgcc-4.4.3-4.fc12.i686 > libgfortran-4.4.3-4.fc12.i686 > (gdb) bt > #0 0x00aab416 in __kernel_vsyscall () > #1 0x00159a91 in raise (sig=6) at > ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > #2 0x0015b35a in abort () at abort.c:92 > #3 0x00152be8 in __assert_fail (assertion=, > file=, line=, > function=) at assert.c:81 > #4 0x0050931e in visit_decref (op=, > data=) at Modules/gcmodule.c:277 > #5 0x0047c8c2 in dict_traverse (op=, > visit=, arg=) at > Objects/dictobject.c:2003 > #6 0x00509af3 in subtract_refs (generation=) > at Modules/gcmodule.c:296 > > #7 collect (generation=) at > Modules/gcmodule.c:817 > #8 0x0050a640 in PyGC_Collect () at Modules/gcmodule.c:1292 > #9 0x004fb0f0 in Py_Finalize () at Python/pythonrun.c:424 > #10 0x0050868f in Py_Main (argc=, argv= optimized out>) at Modules/main.c:625 > #11 0x080485c8 in main (argc=, argv= optimized out>) at Modules/python.c:23 > > which looks identical to the bt I sent to Robert earlier on. > > HTH, > Johann > > > On 03/10/2010 12:43 AM, Johann Cohen-Tanugi wrote: >> >> >> On 03/10/2010 12:33 AM, Johann Cohen-Tanugi wrote: >>> >>> >>> On 03/10/2010 12:07 AM, Pauli Virtanen wrote: >>>> ti, 2010-03-09 kello 21:14 +0100, Johann Cohen-Tanugi kirjoitti: >>>> >>>>> thinking about it, this morning there was a fedora update to python, so >>>>> I am using 2.6.2-4.fc12. >>>>> Looks like the problem is in python itself, hence this piece of info. >>>>> >>>> That the problem would be in Python is not so clear to me. Can you try >>>> running it with the previous Python shipped by Fedora? Do you see the >>>> problem then? What's the previous version, btw? >>>> >>> 2.6.2-1 IIRC. I would have to check, and I am not sure how to >>> either query this information or step back one update up with yum :( >>>> Memory errors are somewhat difficult to debug. Can you try running only >>>> a certain subset of the tests, first >>>> >>>> nosetests numpy.core >>>> >>> crash >>>> Be sure to set Pythonpath so that you get the correct Numpy version. If >>>> it segfaults, proceed to (under numpy/core/tests) >>>> >>>> nosetests test_multiarray.py >>>> nosetests test_multiarray.py:TestNewBufferProtocol >>>> >>> neither crash, so the problem is not there.... >> I followed your lead and tried each script and ended up with : >> [cohen at jarrett tests]$ nosetests test_ufunc.py >> ............. >> ---------------------------------------------------------------------- >> Ran 13 tests in 1.146s >> >> OK >> python: Modules/gcmodule.c:277: visit_decref: Assertion >> `gc->gc.gc_refs != 0' failed. >> Aborted (core dumped) >> >> so test_ufunc.py seems to be at stake >> >>>> Since the crash occurs in cyclic garbage collection, I'm doubting a bit >>>> the numpy/core/src/multiarray/numpymemoryview.c implementation, since >>>> that's the only part in Numpy that supports that. >>>> >>>> Alternatively, just replace numpymemoryview.c with the attached one >>>> which has cyclic GC stripped, and see if you still get the crash. >>>> >>>> Cheers, >>>> Pauli >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> -- >>> This message has been scanned for viruses and >>> dangerous content by *MailScanner* >>> , and is >>> believed to be clean. 
>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> > > Python 2.6.2 is rather old by now, the bugfix releases are up to > 2.6.4. I don't know that that is related, but I haven't seen any > crashes on ubuntu with 2.6.4. > > Chuck > > -- > This message has been scanned for viruses and > dangerous content by *MailScanner* , and is > believed to be clean. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohen at lpta.in2p3.fr Wed Mar 10 04:28:07 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 10:28:07 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <1268182550.1748.2.camel@Nokia-N900-42-11> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> Message-ID: <4B976627.8060408@lpta.in2p3.fr> On 03/10/2010 01:55 AM, Pauli Virtanen wrote: >> more fun : >> [cohen at jarrett tests]$ pwd >> /home/cohen/sources/python/numpy/numpy/core/tests >> [cohen at jarrett tests]$ python -c 'import test_ufunc' >> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs >> != 0' failed. >> Aborted (core dumped) >> > What happens if you only import the umath_tests module (or something, it's a .so under numpy.core)? > [cohen at jarrett core]$ export PYTHONPATH=/home/cohen/.local/lib/python2.6/site-packages/numpy/core:$PYTHONPATH [cohen at jarrett core]$ python Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) [GCC 4.4.2 20091222 (Red Hat 4.4.2-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import umath_tests >>> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Aborted (core dumped) so this import also trigger the crash at exit... > Just importing test_ufunc.py probably doesn't run a lot of extension code. Since you don't get crashes only by importing numpy, I really don't see many alternatives any more... > > Thanks, > Pauli > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From esteve at lkb.ens.fr Wed Mar 10 08:33:55 2010 From: esteve at lkb.ens.fr (Jerome Esteve) Date: Wed, 10 Mar 2010 14:33:55 +0100 Subject: [Numpy-discussion] Backwards slicing including the first element Message-ID: <35EAF7EC-B62A-4FBB-8BE2-09F8FE89F41C@lkb.ens.fr> Dear all, Is there a way to give an integer value to j when using a[i:j:-1] so that the first element of the array can be included in the slice ? I would like to use some code like a[i:i-k:-1] to get a slice of length k. The numpy documentation seems to suggest that j=-1 should work: "Assume n is the number of elements in the dimension being sliced. Then, if i is not given it defaults to 0 for k > 0 and n for k < 0 . 
If j is not given it defaults to n for k > 0 and -1 for k < 0 . If k is not given it defaults to 1." But a[i:i-k:-1] is empty if i-k is -1. The workaround is a[i::-1][:k], is there something simpler? Many thanks in advance, Jerome. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorgesmbox-ml at yahoo.es Wed Mar 10 08:44:37 2010 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Wed, 10 Mar 2010 13:44:37 +0000 (UTC) Subject: [Numpy-discussion] How to apply a function to an ndarray over a given dimension Message-ID: Hi, First, excuse me if I am over-optimizing, but I am curious whether there exists a way to apply a function to an ndarray over a given dimension. In case I don't make myself clear: I have an array of shape (n, 2, 2) where each row represents a 2 by 2 covariance matrix, and I want to perform the eigenvalue decomposition of each row. Right now I do it with a list comprehension: import numpy as np import scipy as sp C = np.arange(10*2*2).reshape(10,2,2) ed = [sp.linalg.eig(r) for r in C[:]] Is there a better way, along the lines of vectorize, of doing this? Cheers, Jorge From bruce.schultz at gmail.com Wed Mar 10 09:06:28 2010 From: bruce.schultz at gmail.com (Bruce Schultz) Date: Thu, 11 Mar 2010 00:06:28 +1000 Subject: [Numpy-discussion] printing structured arrays In-Reply-To: <17a8763f1003091609scb416c3j13fbf986026ca340@mail.gmail.com> References: <4B910E62.7070209@gmail.com> <49d6b3501003051435s71e800cbq5a2461348220d1d2@mail.gmail.com> <17a8763f1003091609scb416c3j13fbf986026ca340@mail.gmail.com> Message-ID: <4B97A764.2050402@gmail.com> On 10/03/10 10:09, Bruce Schultz wrote: > On Sat, Mar 6, 2010 at 8:35 AM, Gökhan Sever wrote: > >> On Fri, Mar 5, 2010 at 8:00 AM, Bruce Schultz >> wrote: >> >>> Output is: >>> ### ndarray >>> [[ 1. 2. ] >>> [ 3. 4.1]] >>> ### structured array >>> [(1.0, 2.0) (3.0, 4.0999999999999996)] >>> >> I still couldn't figure out how floating point numbers look nicely on screen >> in cases like yours (i.e., trying numpy.array2string()) but you can make >> sure by using numpy.savetxt("file", array, fmt="%.1f") you will always have >> specified precision in the written file. >> > > Using numpy.array2string() gives the same format as the output above. > I started looking at how array2string() is implemented, and came up with this patch which formats my structured array nicely, the same as an unstructured array. It was mainly done as a proof of concept, so it only works for floats and I'm probably doing the wrong thing to detect a structured array by comparing the dtype to void. Maybe someone with more numpy experience can tell me if I'm on the right track... === modified file 'numpy/core/arrayprint.py' --- numpy/core/arrayprint.py 2010-02-21 16:16:34 +0000 +++ numpy/core/arrayprint.py 2010-03-10 13:48:22 +0000 @@ -219,6 +219,10 @@ elif issubclass(dtypeobj, _nt.unicode_) or \ issubclass(dtypeobj, _nt.string_): format_function = repr + elif issubclass(dtypeobj, _nt.void): + #XXX this is for structured arrays....
+ format_function = StructuredFormatter(a) + separator = '\n ' else: format_function = str @@ -231,6 +235,17 @@ return lst +class StructuredFormatter: + def __init__(self, a): + self.data = a + self.dtype = a.dtype #XXX use the dtype to build column formatters + + def __call__(self, x): + ff = FloatFormat(self.data.view(float), _float_output_precision, + _float_output_suppress_small) + return '[' + ' '.join([ff(n) for n in x]) + ']' + + def _convert_arrays(obj): import numeric as _nc newtup = [] Cheers Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Wed Mar 10 09:11:40 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 10 Mar 2010 14:11:40 +0000 (UTC) Subject: [Numpy-discussion] crash at prompt exit after running test References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> Message-ID: Wed, 10 Mar 2010 10:28:07 +0100, Johann Cohen-Tanugi wrote: > On 03/10/2010 01:55 AM, Pauli Virtanen wrote: >>> more fun : >>> [cohen at jarrett tests]$ pwd >>> /home/cohen/sources/python/numpy/numpy/core/tests [cohen at jarrett >>> tests]$ python -c 'import test_ufunc' python: Modules/gcmodule.c:277: >>> visit_decref: Assertion `gc->gc.gc_refs != 0' failed. >>> Aborted (core dumped) >>> >> What happens if you only import the umath_tests module (or something, >> it's a .so under numpy.core)? >> > [cohen at jarrett core]$ export > PYTHONPATH=/home/cohen/.local/lib/python2.6/site-packages/numpy/core: $PYTHONPATH > [cohen at jarrett core]$ python > Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) [GCC 4.4.2 20091222 > (Red Hat 4.4.2-20)] on linux2 Type "help", "copyright", "credits" or > "license" for more information. > >>> import umath_tests > >>> > python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs > != 0' failed. > Aborted (core dumped) > > so this import also trigger the crash at exit... Then it is clear that the umath_tests module does something that is not permitted. It's possible that there is a some sort of a refcount error somewhere in the generalized ufuncs mechanisms -- that part of Numpy is not heavily used. Bug spotting challenge: Start from umath_tests.c.src:initumath_tests, follow the execution, and spot the bug (if any). Pauli PS. it might be a good idea to file a bug ticket now From cohen at lpta.in2p3.fr Wed Mar 10 09:35:52 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 15:35:52 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> Message-ID: <4B97AE48.6020906@lpta.in2p3.fr> http://projects.scipy.org/numpy/ticket/1425 for the bug trac well, I do feel challenged now... 
;) J On 03/10/2010 03:11 PM, Pauli Virtanen wrote: > Wed, 10 Mar 2010 10:28:07 +0100, Johann Cohen-Tanugi wrote: > > >> On 03/10/2010 01:55 AM, Pauli Virtanen wrote: >> >>>> more fun : >>>> [cohen at jarrett tests]$ pwd >>>> /home/cohen/sources/python/numpy/numpy/core/tests [cohen at jarrett >>>> tests]$ python -c 'import test_ufunc' python: Modules/gcmodule.c:277: >>>> visit_decref: Assertion `gc->gc.gc_refs != 0' failed. >>>> Aborted (core dumped) >>>> >>>> >>> What happens if you only import the umath_tests module (or something, >>> it's a .so under numpy.core)? >>> >>> >> [cohen at jarrett core]$ export >> PYTHONPATH=/home/cohen/.local/lib/python2.6/site-packages/numpy/core: >> > $PYTHONPATH > >> [cohen at jarrett core]$ python >> Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) [GCC 4.4.2 20091222 >> (Red Hat 4.4.2-20)] on linux2 Type "help", "copyright", "credits" or >> "license" for more information. >> >>> import umath_tests >> >>> >> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs >> != 0' failed. >> Aborted (core dumped) >> >> so this import also trigger the crash at exit... >> > Then it is clear that the umath_tests module does something that is not > permitted. It's possible that there is a some sort of a refcount error > somewhere in the generalized ufuncs mechanisms -- that part of Numpy is > not heavily used. > > Bug spotting challenge: Start from umath_tests.c.src:initumath_tests, > follow the execution, and spot the bug (if any). > > Pauli > > PS. it might be a good idea to file a bug ticket now > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From cohen at lpta.in2p3.fr Wed Mar 10 09:40:04 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Wed, 10 Mar 2010 15:40:04 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> Message-ID: <4B97AF44.80207@lpta.in2p3.fr> Pauli, isn't it hopeless to follow the execution of the source code when the crash actually occurs when I exit, and not when I execute. I would have to understand enough of this umath_tests.c.src to spot a refcount error or things like that???? On 03/10/2010 03:11 PM, Pauli Virtanen wrote: > Wed, 10 Mar 2010 10:28:07 +0100, Johann Cohen-Tanugi wrote: > > >> On 03/10/2010 01:55 AM, Pauli Virtanen wrote: >> >>>> more fun : >>>> [cohen at jarrett tests]$ pwd >>>> /home/cohen/sources/python/numpy/numpy/core/tests [cohen at jarrett >>>> tests]$ python -c 'import test_ufunc' python: Modules/gcmodule.c:277: >>>> visit_decref: Assertion `gc->gc.gc_refs != 0' failed. >>>> Aborted (core dumped) >>>> >>>> >>> What happens if you only import the umath_tests module (or something, >>> it's a .so under numpy.core)? 
>>> >>> >> [cohen at jarrett core]$ export >> PYTHONPATH=/home/cohen/.local/lib/python2.6/site-packages/numpy/core: >> > $PYTHONPATH > >> [cohen at jarrett core]$ python >> Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) [GCC 4.4.2 20091222 >> (Red Hat 4.4.2-20)] on linux2 Type "help", "copyright", "credits" or >> "license" for more information. >> >>> import umath_tests >> >>> >> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs >> != 0' failed. >> Aborted (core dumped) >> >> so this import also trigger the crash at exit... >> > Then it is clear that the umath_tests module does something that is not > permitted. It's possible that there is a some sort of a refcount error > somewhere in the generalized ufuncs mechanisms -- that part of Numpy is > not heavily used. > > Bug spotting challenge: Start from umath_tests.c.src:initumath_tests, > follow the execution, and spot the bug (if any). > > Pauli > > PS. it might be a good idea to file a bug ticket now > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From warren.weckesser at enthought.com Wed Mar 10 09:49:06 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Mar 2010 08:49:06 -0600 Subject: [Numpy-discussion] Backwards slicing including the first element In-Reply-To: <35EAF7EC-B62A-4FBB-8BE2-09F8FE89F41C@lkb.ens.fr> References: <35EAF7EC-B62A-4FBB-8BE2-09F8FE89F41C@lkb.ens.fr> Message-ID: <4B97B162.8080902@enthought.com> Jerome Esteve wrote: > Dear all, > > Is there a way to give an integer value to j when using a[i:j:-1] so > that the first element of the array can be included in the slice ? > > I would like to use some code like a[i:i-k:-1] to get a slice of > length k. > > The numpy documentation seems to suggest that j=-1 should work: > > "Assume n is the number of elements in the dimension being sliced. > Then, if i is not given it defaults to 0 for k > 0 and n for k < 0 . > If j is not given it defaults to n for k > 0 and -1 for k < 0 . > If k is not given it defaults to 1." > > But a[i:i-k:-1] is empty if i-k is -1. The workaround is a[i::-1][:k], > is there something simpler ? You could use a[i:i-(len(a)+k):-1]. This works because a[-len(a)] is the same as a[0]. For example: ----- In [57]: a Out[57]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) In [58]: i = 3 In [59]: for k in range(5): ....: print k, a[i:i-(len(a)+k):-1] ....: ....: 0 [] 1 [13] 2 [13 12] 3 [13 12 11] 4 [13 12 11 10] ----- Warren > > Many thanks in advance, Jerome. 
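P.S. If carrying the len(a) offset around feels awkward, a tiny helper -- just a sketch -- gets the same effect by mapping the troublesome stop value to an open-ended slice instead:

def backslice(a, i, k):
    # reversed slice of (at most) k elements, starting at index i and
    # walking left; when i - k would be -1, drop the stop index entirely
    # so that element 0 is included instead of the slice coming back empty
    stop = i - k
    return a[i:stop:-1] if stop >= 0 else a[i::-1]

# e.g. backslice(a, 3, 4) -> array([13, 12, 11, 10]) for the a above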
> > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav+sp at iki.fi Wed Mar 10 09:59:31 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 10 Mar 2010 14:59:31 +0000 (UTC) Subject: [Numpy-discussion] crash at prompt exit after running test References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> Message-ID: Wed, 10 Mar 2010 15:40:04 +0100, Johann Cohen-Tanugi wrote: > Pauli, isn't it hopeless to follow the execution of the source code when > the crash actually occurs when I exit, and not when I execute. I would > have to understand enough of this umath_tests.c.src to spot a refcount > error or things like that???? Yeah, it's not easy, and requires knowing how to track this type of errors. I didn't actually mean that you should try do it, just posed it as a general challenge to all interested parties :) On a more serious note, maybe there's a compilation flag or something in Python that warns when refcounts go negative (or something). Cheers, Pauli From bsouthey at gmail.com Wed Mar 10 10:09:44 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 10 Mar 2010 09:09:44 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B97AF44.80207@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> Message-ID: <4B97B638.6080004@gmail.com> On 03/10/2010 08:40 AM, Johann Cohen-Tanugi wrote: > Pauli, isn't it hopeless to follow the execution of the source code when > the crash actually occurs when I exit, and not when I execute. I would > have to understand enough of this umath_tests.c.src to spot a refcount > error or things like that???? > > On 03/10/2010 03:11 PM, Pauli Virtanen wrote: > >> Wed, 10 Mar 2010 10:28:07 +0100, Johann Cohen-Tanugi wrote: >> >> >> >>> On 03/10/2010 01:55 AM, Pauli Virtanen wrote: >>> >>> >>>>> more fun : >>>>> [cohen at jarrett tests]$ pwd >>>>> /home/cohen/sources/python/numpy/numpy/core/tests [cohen at jarrett >>>>> tests]$ python -c 'import test_ufunc' python: Modules/gcmodule.c:277: >>>>> visit_decref: Assertion `gc->gc.gc_refs != 0' failed. >>>>> Aborted (core dumped) >>>>> >>>>> >>>>> >>>> What happens if you only import the umath_tests module (or something, >>>> it's a .so under numpy.core)? 
>>>> >>>> >>>> >>> [cohen at jarrett core]$ export >>> PYTHONPATH=/home/cohen/.local/lib/python2.6/site-packages/numpy/core: >>> >>> >> $PYTHONPATH >> >> >>> [cohen at jarrett core]$ python >>> Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45) [GCC 4.4.2 20091222 >>> (Red Hat 4.4.2-20)] on linux2 Type "help", "copyright", "credits" or >>> "license" for more information. >>> >>> import umath_tests >>> >>> >>> python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs >>> != 0' failed. >>> Aborted (core dumped) >>> >>> so this import also trigger the crash at exit... >>> >>> >> Then it is clear that the umath_tests module does something that is not >> permitted. It's possible that there is a some sort of a refcount error >> somewhere in the generalized ufuncs mechanisms -- that part of Numpy is >> not heavily used. >> >> Bug spotting challenge: Start from umath_tests.c.src:initumath_tests, >> follow the execution, and spot the bug (if any). >> >> Pauli >> >> PS. it might be a good idea to file a bug ticket now >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, I can say that I have the same problem with Fedora 11 and Python 2.6 (both Fedora version and Unladen Swallow versions) with numpy 2.0.0.dev8272. However, I first noticed it with scipy but I see it with numpy. It does not appear to be with Python 2.5 with an svn version and I did not see it with numpy 1.3 or the removed numpy 1.4. Bruce From warren.weckesser at enthought.com Wed Mar 10 10:19:25 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Mar 2010 09:19:25 -0600 Subject: [Numpy-discussion] Backwards slicing including the first element In-Reply-To: <4B97B162.8080902@enthought.com> References: <35EAF7EC-B62A-4FBB-8BE2-09F8FE89F41C@lkb.ens.fr> <4B97B162.8080902@enthought.com> Message-ID: <4B97B87D.9030609@enthought.com> Warren Weckesser wrote: > Jerome Esteve wrote: > >> Dear all, >> >> Is there a way to give an integer value to j when using a[i:j:-1] so >> that the first element of the array can be included in the slice ? >> >> I would like to use some code like a[i:i-k:-1] to get a slice of >> length k. >> >> The numpy documentation seems to suggest that j=-1 should work: >> >> "Assume n is the number of elements in the dimension being sliced. >> Then, if i is not given it defaults to 0 for k > 0 and n for k < 0 . >> If j is not given it defaults to n for k > 0 and -1 for k < 0 . >> If k is not given it defaults to 1." >> >> But a[i:i-k:-1] is empty if i-k is -1. The workaround is a[i::-1][:k], >> is there something simpler ? >> > > You could use a[i:i-(len(a)+k):-1]. This works because a[-len(a)] is > the same as a[0]. > > I'm going to be pedantic and amend that last sentence. While it is true the a[-len(a)] is the same as a[0], that is not why this works. A better explanation is that Python is lenient about handling the value given as the end position in a slice. It does not have to be a valid index. If the value is out of range, Python will include everything up to the end of the actual data, and it will not raise an error. 
So you can do slices like the following: ----- In [101]: w Out[101]: [10, 11, 12, 13, 14] In [102]: w[2:-5:-1] Out[102]: [12, 11] In [103]: w[2:-6:-1] Out[103]: [12, 11, 10] In [104]: w[2:-7:-1] Out[104]: [12, 11, 10] ----- Note that -6 and -7 are not valid indices; using w[-6] will raise an IndexError. Warren > For example: > > ----- > In [57]: a > Out[57]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) > > In [58]: i = 3 > > In [59]: for k in range(5): > ....: print k, a[i:i-(len(a)+k):-1] > ....: > ....: > 0 [] > 1 [13] > 2 [13 12] > 3 [13 12 11] > 4 [13 12 11 10] > > ----- > > > Warren > > >> Many thanks in advance, Jerome. >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Wed Mar 10 10:56:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Mar 2010 10:56:31 -0500 Subject: [Numpy-discussion] Backwards slicing including the first element In-Reply-To: <4B97B87D.9030609@enthought.com> References: <35EAF7EC-B62A-4FBB-8BE2-09F8FE89F41C@lkb.ens.fr> <4B97B162.8080902@enthought.com> <4B97B87D.9030609@enthought.com> Message-ID: <1cd32cbb1003100756r26c1a0f9i90dd10ec0b0b53ae@mail.gmail.com> On Wed, Mar 10, 2010 at 10:19 AM, Warren Weckesser wrote: > Warren Weckesser wrote: >> Jerome Esteve wrote: >> >>> Dear all, >>> >>> Is there a way to give an integer value to j when using a[i:j:-1] so >>> that the first element of the array can be included in the slice ? >>> >>> I would like to use some code like a[i:i-k:-1] to get a slice of >>> length k. >>> >>> The numpy documentation seems to suggest that j=-1 should work: >>> >>> "Assume n is the number of elements in the dimension being sliced. >>> Then, if i is not given it defaults to 0 for k > 0 and n for k < 0 . >>> If j is not given it defaults to n for k > 0 and -1 for k < 0 . >>> If k is not given it defaults to 1." >>> >>> But a[i:i-k:-1] is empty if i-k is -1. The workaround is a[i::-1][:k], >>> is there something simpler ? >>> >> >> You could use a[i:i-(len(a)+k):-1]. ?This works because a[-len(a)] is >> the same as a[0]. >> >> > > I'm going to be pedantic and amend that last sentence. ?While it is true > the a[-len(a)] is the same as a[0], that is not why this works. ?A > better explanation is that Python is lenient about handling the value > given as the end position in a slice. ?It does not have to be a valid > index. ?If the value is out of range, Python will include everything up > to the end of the actual data, and it will not raise an error. ?So you > can do slices like the following: > ----- > In [101]: w > Out[101]: [10, 11, 12, 13, 14] > > In [102]: w[2:-5:-1] > Out[102]: [12, 11] > > In [103]: w[2:-6:-1] > Out[103]: [12, 11, 10] > > In [104]: w[2:-7:-1] > Out[104]: [12, 11, 10] > ----- > Note that -6 and -7 are not valid indices; using w[-6] ?will raise an > IndexError. > > > Warren > > >> For example: >> >> ----- >> In [57]: a >> Out[57]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) >> >> In [58]: i = 3 >> >> In [59]: for k in range(5): >> ? ?....: ? ? print k, a[i:i-(len(a)+k):-1] >> ? ?....: >> ? 
?....: >> 0 [] >> 1 [13] >> 2 [13 12] >> 3 [13 12 11] >> 4 [13 12 11 10] >> >> ----- I thought, I had also used the -1 before, but it only works with range, e.g. >>> [(i,i-k,np.arange(5)[range(i,i-k,-1)]) for i in range(1,5)] [(1, -1, array([1, 0])), (2, 0, array([2, 1])), (3, 1, array([3, 2])), (4, 2, array([4, 3]))] the "or None" trick is more complicated when counting down >>> [(i,i-k,np.arange(5)[i:(i-k if (i-k!=-1) else None):-1]) for i in range(1,5)] [(1, -1, array([1, 0])), (2, 0, array([2, 1])), (3, 1, array([3, 2])), (4, 2, array([4, 3]))] I will mark Warrens solution as something to remember. Josef >> >> Warren >> >> >>> Many thanks in advance, Jerome. >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Wed Mar 10 11:39:42 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 10 Mar 2010 10:39:42 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: References: <4B96A1CD.7080702@lpta.in2p3.fr> <3d375d731003091135q303cd4bx1f8c2e01a4106dfb@mail.gmail.com> <4B96A677.2080206@lpta.in2p3.fr> <3d375d731003091156x2beab006k61656b5c4aedcbc@mail.gmail.com> <4B96AA68.9080106@lpta.in2p3.fr> <4B96AC12.1000800@lpta.in2p3.fr> <1268176055.10379.14.camel@idol> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> Message-ID: <4B97CB4E.7060304@gmail.com> On 03/10/2010 08:59 AM, Pauli Virtanen wrote: > Wed, 10 Mar 2010 15:40:04 +0100, Johann Cohen-Tanugi wrote: > >> Pauli, isn't it hopeless to follow the execution of the source code when >> the crash actually occurs when I exit, and not when I execute. I would >> have to understand enough of this umath_tests.c.src to spot a refcount >> error or things like that???? >> > Yeah, it's not easy, and requires knowing how to track this type of > errors. I didn't actually mean that you should try do it, just posed it > as a general challenge to all interested parties :) > > On a more serious note, maybe there's a compilation flag or something in > Python that warns when refcounts go negative (or something). > > Cheers, > Pauli > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, I think I managed to find this. I reverted back my svn versions ($svn update -r 8262) and cleaned both the build and installation directories. It occurred with changeset 8262 (earlier changesets appear okay but later ones do not) http://projects.scipy.org/numpy/changeset/8262 Specifically in the file: numpy/core/code_generators/generate_ufunc_api.py There is an extra call to that should have been deleted on line 54(?). 
Py_DECREF(numpy); Attached a patch to ticket 1425 http://projects.scipy.org/numpy/ticket/1425 Bruce From Chris.Barker at noaa.gov Wed Mar 10 11:48:34 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed, 10 Mar 2010 08:48:34 -0800 Subject: [Numpy-discussion] Backwards slicing including the first element In-Reply-To: <4B97B162.8080902@enthought.com> References: <35EAF7EC-B62A-4FBB-8BE2-09F8FE89F41C@lkb.ens.fr> <4B97B162.8080902@enthought.com> Message-ID: <4B97CD62.5010800@noaa.gov> > Jerome Esteve wrote: >> Is there a way to give an integer value to j when using a[i:j:-1] so >> that the first element of the array can be included in the slice ? Is this what you are looking for? In [11]: a = np.arange(10) In [12]: a[6::-1] Out[12]: array([6, 5, 4, 3, 2, 1, 0]) I know it's not an integer, but it does indicate "the last (or first) element of the array" -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed Mar 10 12:06:34 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 10 Mar 2010 11:06:34 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B97CB4E.7060304@gmail.com> References: <4B96A1CD.7080702@lpta.in2p3.fr> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> <4B97CB4E.7060304@gmail.com> Message-ID: On Wed, Mar 10, 2010 at 10:39 AM, Bruce Southey wrote: > On 03/10/2010 08:59 AM, Pauli Virtanen wrote: > > Wed, 10 Mar 2010 15:40:04 +0100, Johann Cohen-Tanugi wrote: > > > >> Pauli, isn't it hopeless to follow the execution of the source code when > >> the crash actually occurs when I exit, and not when I execute. I would > >> have to understand enough of this umath_tests.c.src to spot a refcount > >> error or things like that???? > >> > > Yeah, it's not easy, and requires knowing how to track this type of > > errors. I didn't actually mean that you should try do it, just posed it > > as a general challenge to all interested parties :) > > > > On a more serious note, maybe there's a compilation flag or something in > > Python that warns when refcounts go negative (or something). > > > > Cheers, > > Pauli > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Hi, > I think I managed to find this. I reverted back my svn versions ($svn > update -r 8262) and cleaned both the build and installation directories. > > It occurred with changeset 8262 (earlier changesets appear okay but > later ones do not) > http://projects.scipy.org/numpy/changeset/8262 > > Specifically in the file: > numpy/core/code_generators/generate_ufunc_api.py > > There is an extra call to that should have been deleted on line 54(?). > Py_DECREF(numpy); > > Attached a patch to ticket 1425 > http://projects.scipy.org/numpy/ticket/1425 > > Look like my bad. I'm out of town at the moment so someone else needs to apply the patch. That whole bit of code could probably use a daylight audit. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
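On Pauli's earlier question about a compilation flag: a CPython interpreter configured with --with-pydebug defines Py_REF_DEBUG, under which Py_DECREF raises a fatal error as soon as a reference count goes negative, and the interpreter also gains sys.gettotalrefcount() for spotting leaks. A rough sketch of how one might probe for a leak on such a build (the helper below is made up for illustration; the attribute simply does not exist on a normal release build):

import sys
import numpy as np

def refleak_delta(func, n=100):
    # run func n times and report how the interpreter-wide refcount moved
    before = sys.gettotalrefcount()
    for _ in range(n):
        func()
    return sys.gettotalrefcount() - before

if hasattr(sys, 'gettotalrefcount'):   # only present on --with-pydebug builds
    print(refleak_delta(lambda: np.add(1.0, 2.0)))

A steadily growing delta points at a missing Py_DECREF; an extra Py_DECREF, like the one patched above, is caught directly by the negative-refcount check in a debug build.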
URL: From ndbecker2 at gmail.com Wed Mar 10 19:19:07 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 10 Mar 2010 19:19:07 -0500 Subject: [Numpy-discussion] numarray iterator question Message-ID: This is a bit confusing to me: import numpy as np u = np.ones ((3,3)) for u_row in u: u_row = u_row * 2 << doesn't work print u [[ 1. 1. 1.] [ 1. 1. 1.] [ 1. 1. 1.]] for u_row in u: u_row *= 2 << does work [[ 2. 2. 2.] [ 2. 2. 2.] [ 2. 2. 2.]] Naively, I'm thinking a *= b === a = a * b. Is this behavior expected? I'm asking because I really want: u_row = my_func (u_row) From robert.kern at gmail.com Wed Mar 10 19:22:40 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Mar 2010 18:22:40 -0600 Subject: [Numpy-discussion] numarray iterator question In-Reply-To: References: Message-ID: <3d375d731003101622s5b16f6fagabaf8f142d36c46e@mail.gmail.com> On Wed, Mar 10, 2010 at 18:19, Neal Becker wrote: > This is a bit confusing to me: > > import numpy as np > > u = np.ones ((3,3)) > > for u_row in u: > ? ?u_row = u_row * 2 ?<< doesn't work > > print u > [[ 1. ?1. ?1.] > ?[ 1. ?1. ?1.] > ?[ 1. ?1. ?1.]] > > for u_row in u: > ? ?u_row *= 2 ?<< does work > [[ 2. ?2. ?2.] > ?[ 2. ?2. ?2.] > ?[ 2. ?2. ?2.]] > > Naively, I'm thinking a *= b === a = a * b. http://docs.python.org/reference/simple_stmts.html#augmented-assignment-statements """An augmented assignment expression like x += 1 can be rewritten as x = x + 1 to achieve a similar, but not exactly equal effect. In the augmented version, x is only evaluated once. Also, when possible, the actual operation is performed in-place, meaning that rather than creating a new object and assigning that to the target, the old object is modified instead.""" > Is this behavior expected? ?I'm asking because I really want: > > u_row = my_func (u_row) Iterate over indices instead: for i in range(len(u)): u[i] = my_func(u[i]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From warren.weckesser at enthought.com Wed Mar 10 19:23:06 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Mar 2010 18:23:06 -0600 Subject: [Numpy-discussion] numarray iterator question In-Reply-To: References: Message-ID: <4B9837EA.9080607@enthought.com> Neal Becker wrote: > This is a bit confusing to me: > > import numpy as np > > u = np.ones ((3,3)) > > for u_row in u: > u_row = u_row * 2 << doesn't work > > Try this instead: for u_row in u: u_row[:] = u_row * 2 Warren > print u > [[ 1. 1. 1.] > [ 1. 1. 1.] > [ 1. 1. 1.]] > > for u_row in u: > u_row *= 2 << does work > [[ 2. 2. 2.] > [ 2. 2. 2.] > [ 2. 2. 2.]] > > Naively, I'm thinking a *= b === a = a * b. > > Is this behavior expected? 
I'm asking because I really want: > > u_row = my_func (u_row) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From faltet at pytables.org Thu Mar 11 04:04:36 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 11 Mar 2010 10:04:36 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <20100307190321.GI7459@phare.normalesup.org> References: <64ddb72c1003071100n5c786d6cv726731768ad32d73@mail.gmail.com> <20100307190321.GI7459@phare.normalesup.org> Message-ID: <201003111004.36993.faltet@pytables.org> A Sunday 07 March 2010 20:03:21 Gael Varoquaux escrigu?: > On Sun, Mar 07, 2010 at 07:00:03PM +0000, Ren? Dudfield wrote: > > 1. Mmap'd files are useful since you can reuse disk cache as program > > memory. So large files don't waste ram on the disk cache. > > I second that. mmaping has worked very well for me for large datasets, > especialy in the context of reducing memory pressure. As far as I know, memmap files (or better, the underlying OS) *use* all available RAM for loading data until RAM is exhausted and then start to use SWAP, so the "memory pressure" is still there. But I may be wrong... -- Francesc Alted From gael.varoquaux at normalesup.org Thu Mar 11 04:36:42 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 11 Mar 2010 10:36:42 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <201003111004.36993.faltet@pytables.org> References: <64ddb72c1003071100n5c786d6cv726731768ad32d73@mail.gmail.com> <20100307190321.GI7459@phare.normalesup.org> <201003111004.36993.faltet@pytables.org> Message-ID: <20100311093642.GE3515@phare.normalesup.org> On Thu, Mar 11, 2010 at 10:04:36AM +0100, Francesc Alted wrote: > As far as I know, memmap files (or better, the underlying OS) *use* all > available RAM for loading data until RAM is exhausted and then start to use > SWAP, so the "memory pressure" is still there. But I may be wrong... I believe that your above assertion is 'half' right. First I think that it is not SWAP that the memapped file uses, but the original disk space, thus you avoid running out of SWAP. Second, if you open several times the same data without memmapping, I believe that it will be duplicated in memory. On the other hand, when you memapping, it is not duplicated, thus if you are running several processing jobs on the same data, you save memory. I am very much in this case. Ga?l From nwagner at iam.uni-stuttgart.de Thu Mar 11 06:55:29 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 11 Mar 2010 12:55:29 +0100 Subject: [Numpy-discussion] Calling routines from a Fortran library using python In-Reply-To: <5b8d13221002220518q7f9db23ey39c3722ad875b5a9@mail.gmail.com> References: <4B7D16B2.1000009@silveregg.co.jp> <5b8d13221002180529g2c725bfel8ca8366b0a11b91b@mail.gmail.com> <5b8d13221002220518q7f9db23ey39c3722ad875b5a9@mail.gmail.com> Message-ID: On Mon, 22 Feb 2010 22:18:23 +0900 David Cournapeau wrote: > On Mon, Feb 22, 2010 at 10:01 PM, Nils Wagner > wrote: > >> >> ar x test.a >> gfortran -shared *.o -o libtest.so -lg2c >> >> to build a shared library. The additional option -lg2c >>was >> necessary due to an undefined symbol: s_cmp > > You should avoid the -lg2c option at any cost if >compiling with > gfortran. I am afraid that you got a library compiled >with g77. If > that's the case, you should use g77 and not gfortran. 
>You cannot mix > libraries built with one with libraries with another. > >> >> Now I am able to load the shared library >> >> from ctypes import * >> my_lib = CDLL('test.so') >> >> What are the next steps to use the library functions >> within python ? > > You use it as you would use a C library: > > http://python.net/crew/theller/ctypes/tutorial.html > > But the fortran ABI, at least for code built with g77 >and gfortran, > pass everything by reference. To make sure to pass the >right > arguments, I strongly suggest to double check with the >.h you > received. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Hi all, I tried to run the following script. The result is a segmentation fault. Did I use byref correctly ? from ctypes import * my_dsio = CDLL('libdsio20_gnu4.so') # loading dynamic link libraries # # FORTRAN : CALL DSIO(JUNCAT,FDSCAT,IERR) # # int I,J,K,N,IDE,IA,IE,IERR,JUNIT,JUNCAT,NDATA,NREC,LREADY,ONE=1; # Word BUF[100],HEAD[30]; # char *PATH,*STRING; # char *PGNAME,*DATE,*TIME,*TEXT; # int LHEAD=30; # # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); # IERR = c_int() FDSCAT = c_char_p('dscat.ds') JUNCAT = c_int() LDSNCAT = c_int(len(FDSCAT.value)) print print 'LDSNCAT', LDSNCAT.value print 'FDSCAT' , FDSCAT.value , len(FDSCAT.value) my_dsio.dsio(byref(JUNCAT),byref(FDSCAT),byref(IERR),byref(LDSNCAT)) # segmentation fault print IERR.value Any idea ? Nils From dagss at student.matnat.uio.no Thu Mar 11 07:01:33 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 11 Mar 2010 13:01:33 +0100 Subject: [Numpy-discussion] Calling routines from a Fortran library using python In-Reply-To: References: <4B7D16B2.1000009@silveregg.co.jp> <5b8d13221002180529g2c725bfel8ca8366b0a11b91b@mail.gmail.com> <5b8d13221002220518q7f9db23ey39c3722ad875b5a9@mail.gmail.com> Message-ID: <4B98DB9D.1060209@student.matnat.uio.no> Nils Wagner wrote: > On Mon, 22 Feb 2010 22:18:23 +0900 > David Cournapeau wrote: > >> On Mon, Feb 22, 2010 at 10:01 PM, Nils Wagner >> wrote: >> >> >>> ar x test.a >>> gfortran -shared *.o -o libtest.so -lg2c >>> >>> to build a shared library. The additional option -lg2c >>> was >>> necessary due to an undefined symbol: s_cmp >>> >> You should avoid the -lg2c option at any cost if >> compiling with >> gfortran. I am afraid that you got a library compiled >> with g77. If >> that's the case, you should use g77 and not gfortran. >> You cannot mix >> libraries built with one with libraries with another. >> >> >>> Now I am able to load the shared library >>> >>> from ctypes import * >>> my_lib = CDLL('test.so') >>> >>> What are the next steps to use the library functions >>> within python ? >>> >> You use it as you would use a C library: >> >> http://python.net/crew/theller/ctypes/tutorial.html >> >> But the fortran ABI, at least for code built with g77 >> and gfortran, >> pass everything by reference. To make sure to pass the >> right >> arguments, I strongly suggest to double check with the >> .h you >> received. >> >> cheers, >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Hi all, > > I tried to run the following script. > The result is a segmentation fault. > Did I use byref correctly ? 
> > from ctypes import * > my_dsio = CDLL('libdsio20_gnu4.so') # loading dynamic > link libraries > # > # FORTRAN : CALL DSIO(JUNCAT,FDSCAT,IERR) > # > # int > I,J,K,N,IDE,IA,IE,IERR,JUNIT,JUNCAT,NDATA,NREC,LREADY,ONE=1; > # Word BUF[100],HEAD[30]; > # char *PATH,*STRING; > # char *PGNAME,*DATE,*TIME,*TEXT; > # int LHEAD=30; > # > # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); > # > > > IERR = c_int() > FDSCAT = c_char_p('dscat.ds') > JUNCAT = c_int() > LDSNCAT = c_int(len(FDSCAT.value)) > print > print 'LDSNCAT', LDSNCAT.value > print 'FDSCAT' , FDSCAT.value , len(FDSCAT.value) > > my_dsio.dsio(byref(JUNCAT),byref(FDSCAT),byref(IERR),byref(LDSNCAT)) > # segmentation fault > print IERR.value > > > Any idea ? > You shouldn't have byref on FDSCAT nor LDSNCAT, as explained by this line: # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); Dag Sverre From nadavh at visionsense.com Thu Mar 11 07:20:34 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 11 Mar 2010 14:20:34 +0200 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy References: <64ddb72c1003071100n5c786d6cv726731768ad32d73@mail.gmail.com><20100307190321.GI7459@phare.normalesup.org><201003111004.36993.faltet@pytables.org> <20100311093642.GE3515@phare.normalesup.org> Message-ID: <710F2847B0018641891D9A21602763605AD32E@ex3.envision.co.il> Here is a strange thing I am getting with multiprocessing and memory mapped array: The below script generates the error message 30 times (for every slice access): Exception AttributeError: AttributeError("'NoneType' object has no attribute 'tell'",) in ignored Although I get the correct answer eventually. ------------------------------------------------------ import numpy as N import multiprocessing as MP def average(cube): return [plane.mean() for plane in cube] N.arange(30*100*100, dtype=N.int32).tofile(open('30x100x100_int32.dat','w')) data = N.memmap('30x100x100_int32.dat', dtype=N.int32, shape=(30,100,100)) pool = MP.Pool(processes=1) job = pool.apply_async(average, [data,]) print job.get() ------------------------------------------------------ I use python 2.6.4 and numpy 1.4.0 on 64 bit linux (amd64) Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Gael Varoquaux Sent: Thu 11-Mar-10 11:36 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy On Thu, Mar 11, 2010 at 10:04:36AM +0100, Francesc Alted wrote: > As far as I know, memmap files (or better, the underlying OS) *use* all > available RAM for loading data until RAM is exhausted and then start to use > SWAP, so the "memory pressure" is still there. But I may be wrong... I believe that your above assertion is 'half' right. First I think that it is not SWAP that the memapped file uses, but the original disk space, thus you avoid running out of SWAP. Second, if you open several times the same data without memmapping, I believe that it will be duplicated in memory. On the other hand, when you memapping, it is not duplicated, thus if you are running several processing jobs on the same data, you save memory. I am very much in this case. Ga?l _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
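The AttributeError here seems to come from pickling the memmap in order to hand it to the worker process: the reconstructed object no longer has its underlying file, so its cleanup code falls over. A workaround that usually avoids it is to pass the file name, dtype and shape instead, and re-open the memmap inside the worker. A minimal sketch along those lines, reusing the sizes from the script above (the argument packing is just one way to do it):

import numpy as np
import multiprocessing as mp

def average(args):
    # re-create the memmap in the worker instead of shipping the array over
    fname, dtype, shape = args
    cube = np.memmap(fname, dtype=dtype, mode='r', shape=shape)
    return [plane.mean() for plane in cube]

if __name__ == '__main__':
    fname = '30x100x100_int32.dat'
    np.arange(30 * 100 * 100, dtype=np.int32).tofile(fname)
    pool = mp.Pool(processes=1)
    job = pool.apply_async(average, [(fname, np.int32, (30, 100, 100))])
    print(job.get())

Since every worker maps the same file read-only, this also keeps the data from being duplicated across processes, which ties in with the sharing question discussed elsewhere in this thread.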
Name: winmail.dat Type: application/ms-tnef Size: 4071 bytes Desc: not available URL: From nwagner at iam.uni-stuttgart.de Thu Mar 11 07:38:57 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 11 Mar 2010 13:38:57 +0100 Subject: [Numpy-discussion] Calling routines from a Fortran library using python In-Reply-To: <4B98DB9D.1060209@student.matnat.uio.no> References: <4B7D16B2.1000009@silveregg.co.jp> <5b8d13221002180529g2c725bfel8ca8366b0a11b91b@mail.gmail.com> <5b8d13221002220518q7f9db23ey39c3722ad875b5a9@mail.gmail.com> <4B98DB9D.1060209@student.matnat.uio.no> Message-ID: On Thu, 11 Mar 2010 13:01:33 +0100 Dag Sverre Seljebotn wrote: > Nils Wagner wrote: >> On Mon, 22 Feb 2010 22:18:23 +0900 >> David Cournapeau wrote: >> >>> On Mon, Feb 22, 2010 at 10:01 PM, Nils Wagner >>> wrote: >>> >>> >>>> ar x test.a >>>> gfortran -shared *.o -o libtest.so -lg2c >>>> >>>> to build a shared library. The additional option -lg2c >>>> was >>>> necessary due to an undefined symbol: s_cmp >>>> >>> You should avoid the -lg2c option at any cost if >>> compiling with >>> gfortran. I am afraid that you got a library compiled >>> with g77. If >>> that's the case, you should use g77 and not gfortran. >>> You cannot mix >>> libraries built with one with libraries with another. >>> >>> >>>> Now I am able to load the shared library >>>> >>>> from ctypes import * >>>> my_lib = CDLL('test.so') >>>> >>>> What are the next steps to use the library functions >>>> within python ? >>>> >>> You use it as you would use a C library: >>> >>> http://python.net/crew/theller/ctypes/tutorial.html >>> >>> But the fortran ABI, at least for code built with g77 >>> and gfortran, >>> pass everything by reference. To make sure to pass the >>> right >>> arguments, I strongly suggest to double check with the >>> .h you >>> received. >>> >>> cheers, >>> >>> David >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> Hi all, >> >> I tried to run the following script. >> The result is a segmentation fault. >> Did I use byref correctly ? >> >> from ctypes import * >> my_dsio = CDLL('libdsio20_gnu4.so') # loading >>dynamic >> link libraries >> # >> # FORTRAN : CALL DSIO(JUNCAT,FDSCAT,IERR) >> # >> # int >> I,J,K,N,IDE,IA,IE,IERR,JUNIT,JUNCAT,NDATA,NREC,LREADY,ONE=1; >> # Word BUF[100],HEAD[30]; >> # char *PATH,*STRING; >> # char *PGNAME,*DATE,*TIME,*TEXT; >> # int LHEAD=30; >> # >> # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); >> # >> >> >> IERR = c_int() >> FDSCAT = c_char_p('dscat.ds') >> JUNCAT = c_int() >> LDSNCAT = c_int(len(FDSCAT.value)) >> print >> print 'LDSNCAT', LDSNCAT.value >> print 'FDSCAT' , FDSCAT.value , len(FDSCAT.value) >> >> my_dsio.dsio(byref(JUNCAT),byref(FDSCAT),byref(IERR),byref(LDSNCAT)) >> # segmentation fault >> print IERR.value >> >> >> Any idea ? >> > You shouldn't have byref on FDSCAT nor LDSNCAT, as >explained by this line: > > # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); > > Dag Sverre Sorry, I am newbie to C. What is the correct way ? 
Nils From dagss at student.matnat.uio.no Thu Mar 11 07:42:43 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 11 Mar 2010 13:42:43 +0100 Subject: [Numpy-discussion] Calling routines from a Fortran library using python In-Reply-To: References: <4B7D16B2.1000009@silveregg.co.jp> <5b8d13221002180529g2c725bfel8ca8366b0a11b91b@mail.gmail.com> <5b8d13221002220518q7f9db23ey39c3722ad875b5a9@mail.gmail.com> <4B98DB9D.1060209@student.matnat.uio.no> Message-ID: <4B98E543.4020900@student.matnat.uio.no> Nils Wagner wrote: > On Thu, 11 Mar 2010 13:01:33 +0100 > Dag Sverre Seljebotn > wrote: > >> Nils Wagner wrote: >> >>> On Mon, 22 Feb 2010 22:18:23 +0900 >>> David Cournapeau wrote: >>> >>> >>>> On Mon, Feb 22, 2010 at 10:01 PM, Nils Wagner >>>> wrote: >>>> >>>> >>>> >>>>> ar x test.a >>>>> gfortran -shared *.o -o libtest.so -lg2c >>>>> >>>>> to build a shared library. The additional option -lg2c >>>>> was >>>>> necessary due to an undefined symbol: s_cmp >>>>> >>>>> >>>> You should avoid the -lg2c option at any cost if >>>> compiling with >>>> gfortran. I am afraid that you got a library compiled >>>> with g77. If >>>> that's the case, you should use g77 and not gfortran. >>>> You cannot mix >>>> libraries built with one with libraries with another. >>>> >>>> >>>> >>>>> Now I am able to load the shared library >>>>> >>>>> from ctypes import * >>>>> my_lib = CDLL('test.so') >>>>> >>>>> What are the next steps to use the library functions >>>>> within python ? >>>>> >>>>> >>>> You use it as you would use a C library: >>>> >>>> http://python.net/crew/theller/ctypes/tutorial.html >>>> >>>> But the fortran ABI, at least for code built with g77 >>>> and gfortran, >>>> pass everything by reference. To make sure to pass the >>>> right >>>> arguments, I strongly suggest to double check with the >>>> .h you >>>> received. >>>> >>>> cheers, >>>> >>>> David >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> Hi all, >>> >>> I tried to run the following script. >>> The result is a segmentation fault. >>> Did I use byref correctly ? >>> >>> from ctypes import * >>> my_dsio = CDLL('libdsio20_gnu4.so') # loading >>> dynamic >>> link libraries >>> # >>> # FORTRAN : CALL DSIO(JUNCAT,FDSCAT,IERR) >>> # >>> # int >>> I,J,K,N,IDE,IA,IE,IERR,JUNIT,JUNCAT,NDATA,NREC,LREADY,ONE=1; >>> # Word BUF[100],HEAD[30]; >>> # char *PATH,*STRING; >>> # char *PGNAME,*DATE,*TIME,*TEXT; >>> # int LHEAD=30; >>> # >>> # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); >>> # >>> >>> >>> IERR = c_int() >>> FDSCAT = c_char_p('dscat.ds') >>> JUNCAT = c_int() >>> LDSNCAT = c_int(len(FDSCAT.value)) >>> print >>> print 'LDSNCAT', LDSNCAT.value >>> print 'FDSCAT' , FDSCAT.value , len(FDSCAT.value) >>> >>> my_dsio.dsio(byref(JUNCAT),byref(FDSCAT),byref(IERR),byref(LDSNCAT)) >>> # segmentation fault >>> print IERR.value >>> >>> >>> Any idea ? >>> >>> >> You shouldn't have byref on FDSCAT nor LDSNCAT, as >> explained by this line: >> >> # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); >> >> Dag Sverre >> > > > Sorry, I am newbie to C. What is the correct way ? 
> > my_dsio.dsio(byref(JUNCAT),FDSCAT,byref(IERR),LDSNCAT) Dag From nwagner at iam.uni-stuttgart.de Thu Mar 11 08:06:19 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 11 Mar 2010 14:06:19 +0100 Subject: [Numpy-discussion] Calling routines from a Fortran library using python In-Reply-To: <4B98E543.4020900@student.matnat.uio.no> References: <4B7D16B2.1000009@silveregg.co.jp> <5b8d13221002180529g2c725bfel8ca8366b0a11b91b@mail.gmail.com> <5b8d13221002220518q7f9db23ey39c3722ad875b5a9@mail.gmail.com> <4B98DB9D.1060209@student.matnat.uio.no> <4B98E543.4020900@student.matnat.uio.no> Message-ID: On Thu, 11 Mar 2010 13:42:43 +0100 Dag Sverre Seljebotn wrote: > Nils Wagner wrote: >> On Thu, 11 Mar 2010 13:01:33 +0100 >> Dag Sverre Seljebotn >> wrote: >> >>> Nils Wagner wrote: >>> >>>> On Mon, 22 Feb 2010 22:18:23 +0900 >>>> David Cournapeau wrote: >>>> >>>> >>>>> On Mon, Feb 22, 2010 at 10:01 PM, Nils Wagner >>>>> wrote: >>>>> >>>>> >>>>> >>>>>> ar x test.a >>>>>> gfortran -shared *.o -o libtest.so -lg2c >>>>>> >>>>>> to build a shared library. The additional option -lg2c >>>>>> was >>>>>> necessary due to an undefined symbol: s_cmp >>>>>> >>>>>> >>>>> You should avoid the -lg2c option at any cost if >>>>> compiling with >>>>> gfortran. I am afraid that you got a library compiled >>>>> with g77. If >>>>> that's the case, you should use g77 and not gfortran. >>>>> You cannot mix >>>>> libraries built with one with libraries with another. >>>>> >>>>> >>>>> >>>>>> Now I am able to load the shared library >>>>>> >>>>>> from ctypes import * >>>>>> my_lib = CDLL('test.so') >>>>>> >>>>>> What are the next steps to use the library functions >>>>>> within python ? >>>>>> >>>>>> >>>>> You use it as you would use a C library: >>>>> >>>>> http://python.net/crew/theller/ctypes/tutorial.html >>>>> >>>>> But the fortran ABI, at least for code built with g77 >>>>> and gfortran, >>>>> pass everything by reference. To make sure to pass the >>>>> right >>>>> arguments, I strongly suggest to double check with the >>>>> .h you >>>>> received. >>>>> >>>>> cheers, >>>>> >>>>> David >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> Hi all, >>>> >>>> I tried to run the following script. >>>> The result is a segmentation fault. >>>> Did I use byref correctly ? >>>> >>>> from ctypes import * >>>> my_dsio = CDLL('libdsio20_gnu4.so') # loading >>>> dynamic >>>> link libraries >>>> # >>>> # FORTRAN : CALL DSIO(JUNCAT,FDSCAT,IERR) >>>> # >>>> # int >>>> I,J,K,N,IDE,IA,IE,IERR,JUNIT,JUNCAT,NDATA,NREC,LREADY,ONE=1; >>>> # Word BUF[100],HEAD[30]; >>>> # char *PATH,*STRING; >>>> # char *PGNAME,*DATE,*TIME,*TEXT; >>>> # int LHEAD=30; >>>> # >>>> # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); >>>> # >>>> >>>> >>>> IERR = c_int() >>>> FDSCAT = c_char_p('dscat.ds') >>>> JUNCAT = c_int() >>>> LDSNCAT = c_int(len(FDSCAT.value)) >>>> print >>>> print 'LDSNCAT', LDSNCAT.value >>>> print 'FDSCAT' , FDSCAT.value , len(FDSCAT.value) >>>> >>>> my_dsio.dsio(byref(JUNCAT),byref(FDSCAT),byref(IERR),byref(LDSNCAT)) >>>> # segmentation fault >>>> print IERR.value >>>> >>>> >>>> Any idea ? >>>> >>>> >>> You shouldn't have byref on FDSCAT nor LDSNCAT, as >>> explained by this line: >>> >>> # C : DSIO(&JUNCAT,FDSCAT,&IERR,strlen(FDSCAT)); >>> >>> Dag Sverre >>> >> >> >> Sorry, I am newbie to C. What is the correct way ? 
>> >> > > my_dsio.dsio(byref(JUNCAT),FDSCAT,byref(IERR),LDSNCAT) > > Dag Great. It works like a charme. How can I "translate" the following C-code into Python ? I don't know how to handle HEAD and memcpy ? Any pointer would be appreciated. Thanks in advance. typedef union { int i; float f; char c[4]; } Word; int I,J,K,N,IDE,IA,IE,IERR,JUNIT,JUNCAT,NDATA,NREC,LREADY,ONE=1; Word BUF[100],HEAD[30]; for (I=5;I References: <201003111004.36993.faltet@pytables.org> <20100311093642.GE3515@phare.normalesup.org> Message-ID: <201003111426.49685.faltet@pytables.org> A Thursday 11 March 2010 10:36:42 Gael Varoquaux escrigu?: > On Thu, Mar 11, 2010 at 10:04:36AM +0100, Francesc Alted wrote: > > As far as I know, memmap files (or better, the underlying OS) *use* all > > available RAM for loading data until RAM is exhausted and then start to > > use SWAP, so the "memory pressure" is still there. But I may be wrong... > > I believe that your above assertion is 'half' right. First I think that > it is not SWAP that the memapped file uses, but the original disk space, > thus you avoid running out of SWAP. Second, if you open several times the > same data without memmapping, I believe that it will be duplicated in > memory. On the other hand, when you memapping, it is not duplicated, thus > if you are running several processing jobs on the same data, you save > memory. I am very much in this case. Mmh, this is not my experience. During the past month, I was proposing in a course the students to compare the memory consumption of numpy.memmap and tables.Expr (a module for performing out-of-memory computations in PyTables). The idea was precisely to show that, contrarily to tables.Expr, numpy.memmap computations do take a lot of memory when they are being accessed. I'm attaching a slightly modified version of that exercise. On it, one have to compute a polynomial in a certain range. Here it is the output of the script for the numpy.memmap case for a machine with 8 GB RAM and 6 GB of swap: Total size for datasets: 7629.4 MB Populating x using numpy.memmap with 500000000 points... Total file sizes: 4000000000 -- (3814.7 MB) *** Time elapsed populating: 70.982 Computing: '((.25*x + .75)*x - 1.5)*x - 2' using numpy.memmap Total file sizes: 8000000000 -- (7629.4 MB) **** Time elapsed computing: 81.727 10.08user 13.37system 2:33.26elapsed 15%CPU (0avgtext+0avgdata 0maxresident)k 7808inputs+15625008outputs (39major+5750196minor)pagefaults 0swaps While the computation was going on, I've spied the process with the top utility, and that told me that the total virtual size consumed by the Python process was 7.9 GB, with a total of *resident* memory of 6.7 GB (!). And this should not only be a top malfunction because I've checked that, by the end of the computation, my machine started to swap some processes out (i.e. the working set above was too large to allow the OS keep everything in memory). Now, just for the sake of comparison, I've tried running the same script but using tables.Expr. Here it is the output: Total size for datasets: 7629.4 MB Populating x using tables.Expr with 500000000 points... 
Total file sizes: 4000631280 -- (3815.3 MB) *** Time elapsed populating: 78.817 Computing: '((.25*x + .75)*x - 1.5)*x - 2' using tables.Expr Total file sizes: 8001261168 -- (7630.6 MB) **** Time elapsed computing: 155.836 13.11user 18.59system 3:58.61elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k 7842784inputs+15632208outputs (28major+940347minor)pagefaults 0swaps and top was telling me that memory consumption was 148 MB for total virtual size and just 44 MB (as expected, because computation was really made using an out-of-core algorithm). Interestingly, when using compression (Blosc level 4, in this case), the time to do the computation with tables.Expr has reduced a lot: Total size for datasets: 7629.4 MB Populating x using tables.Expr with 500000000 points... Total file sizes: 1080130765 -- (1030.1 MB) *** Time elapsed populating: 30.005 Computing: '((.25*x + .75)*x - 1.5)*x - 2' using tables.Expr Total file sizes: 2415761895 -- (2303.9 MB) **** Time elapsed computing: 40.048 37.11user 6.98system 1:12.88elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k 45312inputs+4720568outputs (4major+989323minor)pagefaults 0swaps while memory consumption is barely the same than above: 148 MB / 45 MB. So, in my experience, numpy.memmap is really using that large chunk of memory (unless my testbed is badly programmed, in which case I'd be grateful if you can point out what's wrong). -- Francesc Alted -------------- next part -------------- A non-text attachment was scrubbed... Name: poly.py Type: text/x-python Size: 4642 bytes Desc: not available URL: From gael.varoquaux at normalesup.org Thu Mar 11 08:35:49 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 11 Mar 2010 14:35:49 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <201003111426.49685.faltet@pytables.org> References: <201003111004.36993.faltet@pytables.org> <20100311093642.GE3515@phare.normalesup.org> <201003111426.49685.faltet@pytables.org> Message-ID: <20100311133549.GH3515@phare.normalesup.org> On Thu, Mar 11, 2010 at 02:26:49PM +0100, Francesc Alted wrote: > > I believe that your above assertion is 'half' right. First I think that > > it is not SWAP that the memapped file uses, but the original disk space, > > thus you avoid running out of SWAP. Second, if you open several times the > > same data without memmapping, I believe that it will be duplicated in > > memory. On the other hand, when you memapping, it is not duplicated, thus > > if you are running several processing jobs on the same data, you save > > memory. I am very much in this case. > Mmh, this is not my experience. During the past month, I was proposing in a > course the students to compare the memory consumption of numpy.memmap and > tables.Expr (a module for performing out-of-memory computations in PyTables). > [snip] > So, in my experience, numpy.memmap is really using that large chunk of memory > (unless my testbed is badly programmed, in which case I'd be grateful if you > can point out what's wrong). OK, so what you are saying is that my assertion #1 was wrong. Fair enough, as I was writing it I was thinking that I had no hard fact to back it. How about assertion #2? I can think only of this 'story' to explain why I can run parallel computation when I use memmap that blow up if I don't use memmap. Also, could it be that the memmap mode changes things? I use only the 'r' mode, which is read-only. This is all very interesting, and you have much more insights on these problems than me. 
Would you be interested in coming to Euroscipy in Paris to give a 1 or 2 hours long tutorial on memory and IO problems and how you address them with Pytables? It would be absolutely thrilling. I must warn that I am afraid that we won't be able to pay for your trip, though, as I want to keep the price of the conference low. Best, Ga?l From faltet at pytables.org Thu Mar 11 08:55:40 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 11 Mar 2010 14:55:40 +0100 Subject: [Numpy-discussion] multiprocessing shared arrays and numpy In-Reply-To: <20100311133549.GH3515@phare.normalesup.org> References: <201003111426.49685.faltet@pytables.org> <20100311133549.GH3515@phare.normalesup.org> Message-ID: <201003111455.40599.faltet@pytables.org> A Thursday 11 March 2010 14:35:49 Gael Varoquaux escrigu?: > > So, in my experience, numpy.memmap is really using that large chunk of > > memory (unless my testbed is badly programmed, in which case I'd be > > grateful if you can point out what's wrong). > > OK, so what you are saying is that my assertion #1 was wrong. Fair > enough, as I was writing it I was thinking that I had no hard fact to > back it. How about assertion #2? I can think only of this 'story' to > explain why I can run parallel computation when I use memmap that blow up > if I don't use memmap. Well, I must tell that I've not experience about running memmapped arrays in parallel computations, but it sounds like they can actually behave as shared- memory arrays, so yes, you may definitely be right for #2, i.e. memmapped data is not duplicated when accessed in parallel by different processes (in read- only mode, of course), which is certainly a very interesting technique to share data in parallel processes. Thanks for pointing out this! > Also, could it be that the memmap mode changes things? I use only the 'r' > mode, which is read-only. I don't think so. When doing the computation, I open the x values in read- only mode, and memory consumption is still there. > This is all very interesting, and you have much more insights on these > problems than me. Would you be interested in coming to Euroscipy in Paris > to give a 1 or 2 hours long tutorial on memory and IO problems and how > you address them with Pytables? It would be absolutely thrilling. I must > warn that I am afraid that we won't be able to pay for your trip, though, > as I want to keep the price of the conference low. Yes, no problem. I was already thinking about presenting something at EuroSciPy. A tutorial about PyTables/memory IO would be really great for me. We can nail the details off-list. -- Francesc Alted From dsdale24 at gmail.com Thu Mar 11 11:11:08 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 11 Mar 2010 11:11:08 -0500 Subject: [Numpy-discussion] subclassing ndarray in python3 Message-ID: Now that the trunk has some support for python3, I am working on making Quantities work with python3 as well. I'm running into some problems related to subclassing ndarray that can be illustrated with a simple script, reproduced below. 
It looks like there is a problem with the reflected operations, I see problems with __rmul__ and __radd__, but not with __mul__ and __add__: import numpy as np class A(np.ndarray): def __new__(cls, *args, **kwargs): return np.ndarray.__new__(cls, *args, **kwargs) class B(A): def __mul__(self, other): return self.view(A).__mul__(other) def __rmul__(self, other): return self.view(A).__rmul__(other) def __add__(self, other): return self.view(A).__add__(other) def __radd__(self, other): return self.view(A).__radd__(other) a = A((10,)) b = B((10,)) print('A __mul__:') print(a.__mul__(2)) # ok print(a.view(np.ndarray).__mul__(2)) # ok print(a*2) # ok print('A __rmul__:') print(a.__rmul__(2)) # yields NotImplemented print(a.view(np.ndarray).__rmul__(2)) # yields NotImplemented print(2*a) # ok !!?? print('B __mul__:') print(b.__mul__(2)) # ok print(b.view(A).__mul__(2)) # ok print(b.view(np.ndarray).__mul__(2)) # ok print(b*2) # ok print('B __add__:') print(b.__add__(2)) # ok print(b.view(A).__add__(2)) # ok print(b.view(np.ndarray).__add__(2)) # ok print(b+2) # ok print('B __rmul__:') print(b.__rmul__(2)) # yields NotImplemented print(b.view(A).__rmul__(2)) # yields NotImplemented print(b.view(np.ndarray).__rmul__(2)) # yields NotImplemented print(2*b) # yields: TypeError: unsupported operand type(s) for *: 'int' and 'B' print('B __radd__:') print(b.__radd__(2)) # yields NotImplemented print(b.view(A).__radd__(2)) # yields NotImplemented print(b.view(np.ndarray).__radd__(2)) # yields NotImplemented print(2+b) # yields: TypeError: unsupported operand type(s) for +: 'int' and 'B' From cohen at lpta.in2p3.fr Thu Mar 11 15:01:25 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Thu, 11 Mar 2010 21:01:25 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: References: <4B96A1CD.7080702@lpta.in2p3.fr> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> <4B97CB4E.7060304@gmail.com> Message-ID: <4B994C15.9040709@lpta.in2p3.fr> hi there, I am adding this to this thread and not to the trac, because I am not sure whether it adds noise or a piece of info. I just downloaded the scipy trunk and built it, and ran nosetests on it, which bombed instantly.... So I tried to get into subdirs to check test scripts separately..... and here is one : [cohen at jarrett tests]$ ~/.local/bin/ipython test_integrate.py --------------------------------------------------------------------------- AssertionError Traceback (most recent call last) /home/cohen/sources/python/scipy/scipy/integrate/tests/test_integrate.py in () 208 209 if __name__ == "__main__": --> 210 run_module_suite() 211 212 /home/cohen/.local/lib/python2.6/site-packages/numpy/testing/nosetester.pyc in run_module_suite(file_to_run) 75 f = sys._getframe(1) 76 file_to_run = f.f_locals.get('__file__', None) ---> 77 assert file_to_run is not None 78 79 import_nose().run(argv=['',file_to_run]) AssertionError: python: Modules/gcmodule.c:277: visit_decref: Assertion `gc->gc.gc_refs != 0' failed. Aborted (core dumped) [cohen at jarrett tests]$ pwd /home/cohen/sources/python/scipy/scipy/integrate/tests the bomb is the same, but the context seems different... 
I leave that to the experts :) Johann On 03/10/2010 06:06 PM, Charles R Harris wrote: > > > On Wed, Mar 10, 2010 at 10:39 AM, Bruce Southey > wrote: > > On 03/10/2010 08:59 AM, Pauli Virtanen wrote: > > Wed, 10 Mar 2010 15:40:04 +0100, Johann Cohen-Tanugi wrote: > > > >> Pauli, isn't it hopeless to follow the execution of the source > code when > >> the crash actually occurs when I exit, and not when I execute. > I would > >> have to understand enough of this umath_tests.c.src to spot a > refcount > >> error or things like that???? > >> > > Yeah, it's not easy, and requires knowing how to track this type of > > errors. I didn't actually mean that you should try do it, just > posed it > > as a general challenge to all interested parties :) > > > > On a more serious note, maybe there's a compilation flag or > something in > > Python that warns when refcounts go negative (or something). > > > > Cheers, > > Pauli > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Hi, > I think I managed to find this. I reverted back my svn versions ($svn > update -r 8262) and cleaned both the build and installation > directories. > > It occurred with changeset 8262 (earlier changesets appear okay but > later ones do not) > http://projects.scipy.org/numpy/changeset/8262 > > Specifically in the file: > numpy/core/code_generators/generate_ufunc_api.py > > There is an extra call to that should have been deleted on line 54(?). > Py_DECREF(numpy); > > Attached a patch to ticket 1425 > http://projects.scipy.org/numpy/ticket/1425 > > > Look like my bad. I'm out of town at the moment so someone else needs > to apply the patch. That whole bit of code could probably use a > daylight audit. > > Chuck > > > -- > This message has been scanned for viruses and > dangerous content by *MailScanner* , and is > believed to be clean. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Mar 11 15:38:21 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 11 Mar 2010 22:38:21 +0200 Subject: [Numpy-discussion] subclassing ndarray in python3 In-Reply-To: References: Message-ID: <1268339901.4962.68.camel@idol> Hi Darren, to, 2010-03-11 kello 11:11 -0500, Darren Dale kirjoitti: > Now that the trunk has some support for python3, I am working on > making Quantities work with python3 as well. I'm running into some > problems related to subclassing ndarray that can be illustrated with a > simple script, reproduced below. It looks like there is a problem with > the reflected operations, I see problems with __rmul__ and __radd__, > but not with __mul__ and __add__: Thanks for testing. I wish the test suite was more complete (hint! hint! :) Yes, Python 3 introduced some semantic changes in how subclasses of builtin classes (= written in C) inherit the __r*__ operations. Below I'll try to explain what is going on. We probably need to change some things to make things work better on Py3, within the bounds we are able to. Suggestions are welcome. The most obvious one could be to explicitly implement __rmul__ etc. on Python 3. 
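A rough illustration of that last suggestion at the subclass level, using the class names from Darren's script (this is only a sketch of the workaround idea, not the eventual fix inside numpy): defining the reflected methods explicitly and routing them through the ufuncs bypasses the Python 3 wrapper that bails out with NotImplemented.

import numpy as np

class A(np.ndarray):
    def __new__(cls, *args, **kwargs):
        return np.ndarray.__new__(cls, *args, **kwargs)

class B(A):
    def __mul__(self, other):
        return np.multiply(self, other)
    __rmul__ = __mul__      # elementwise multiply is symmetric, so reuse it
    def __add__(self, other):
        return np.add(self, other)
    __radd__ = __add__

b = B((10,))
print(2 * b)    # resolves via B.__rmul__ instead of raising TypeError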
[clip] > class A(np.ndarray): > def __new__(cls, *args, **kwargs): > return np.ndarray.__new__(cls, *args, **kwargs) > > class B(A): > def __mul__(self, other): > return self.view(A).__mul__(other) > def __rmul__(self, other): > return self.view(A).__rmul__(other) > def __add__(self, other): > return self.view(A).__add__(other) > def __radd__(self, other): > return self.view(A).__radd__(other) [clip] > print('A __rmul__:') > print(a.__rmul__(2)) > # yields NotImplemented > print(a.view(np.ndarray).__rmul__(2)) > # yields NotImplemented Correct. ndarray does not implement __rmul__, but relies on an automatic wrapper generated by Python. The automatic wrapper (wrap_binaryfunc_r) does the following: 1. Is `type(other)` a subclass of `type(self)`? If yes, call __mul__ with swapped arguments. 2. If not, bail out with NotImplemented. So it bails out. Previously, the ndarray type had a flag that made Python to skip the subclass check. That does not exist any more on Python 3, and is the root of this issue. > print(2*a) > # ok !!?? Here, Python checks 1. Does nb_multiply from the left op succeed? Nope, since floats don't know how to multiply ndarrays. 2. Does nb_multiply from the right op succeed? Here the execution passes *directly* to array_multiply, completely skipping the __rmul__ wrapper. Note also that in the C-level number protocol there is only a single multiplication function for both left and right multiplication. [clip] > print('B __rmul__:') > print(b.__rmul__(2)) > # yields NotImplemented > print(b.view(A).__rmul__(2)) > # yields NotImplemented > print(b.view(np.ndarray).__rmul__(2)) > # yields NotImplemented > print(2*b) > # yields: TypeError: unsupported operand type(s) for *: 'int' and 'B' But here, the subclass calls the wrapper ndarray.__rmul__, which wants to be careful with types, and hence fails. Yes, probably explicitly defining __rmul__ for ndarray could be the right solution. Please file a bug report on this. Cheers, Pauli From bsouthey at gmail.com Thu Mar 11 15:47:45 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 11 Mar 2010 14:47:45 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B994C15.9040709@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> <4B97CB4E.7060304@gmail.com> <4B994C15.9040709@lpta.in2p3.fr> Message-ID: <4B9956F1.8050903@gmail.com> On 03/11/2010 02:01 PM, Johann Cohen-Tanugi wrote: > hi there, I am adding this to this thread and not to the trac, because > I am not sure whether it adds noise or a piece of info. I just > downloaded the scipy trunk and built it, and ran nosetests on it, > which bombed instantly.... > So I tried to get into subdirs to check test scripts separately..... 
> and here is one : > [cohen at jarrett tests]$ ~/.local/bin/ipython test_integrate.py > --------------------------------------------------------------------------- > AssertionError Traceback (most recent call > last) > > /home/cohen/sources/python/scipy/scipy/integrate/tests/test_integrate.py > in () > 208 > 209 if __name__ == "__main__": > --> 210 run_module_suite() > 211 > 212 > > /home/cohen/.local/lib/python2.6/site-packages/numpy/testing/nosetester.pyc > in run_module_suite(file_to_run) > 75 f = sys._getframe(1) > 76 file_to_run = f.f_locals.get('__file__', None) > ---> 77 assert file_to_run is not None > 78 > 79 import_nose().run(argv=['',file_to_run]) > > AssertionError: > python: Modules/gcmodule.c:277: visit_decref: Assertion > `gc->gc.gc_refs != 0' failed. > Aborted (core dumped) > [cohen at jarrett tests]$ pwd > /home/cohen/sources/python/scipy/scipy/integrate/tests > > the bomb is the same, but the context seems different... I leave that > to the experts :) > Johann Yes, I think it is the same issue as I do not have the problem after fixing the following file and rebuilding numpy and scipy: numpy/core/code_generators/generate_ufunc_api.py Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohen at lpta.in2p3.fr Thu Mar 11 16:57:09 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Thu, 11 Mar 2010 22:57:09 +0100 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B9956F1.8050903@gmail.com> References: <4B96A1CD.7080702@lpta.in2p3.fr> <4B96DAAE.6060408@lpta.in2p3.fr> <4B96DD2B.4040103@lpta.in2p3.fr> <4B96DF48.60501@lpta.in2p3.fr> <1268182550.1748.2.camel@Nokia-N900-42-11> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> <4B97CB4E.7060304@gmail.com> <4B994C15.9040709@lpta.in2p3.fr> <4B9956F1.8050903@gmail.com> Message-ID: <4B996735.4090301@lpta.in2p3.fr> is your fix committed? On 03/11/2010 09:47 PM, Bruce Southey wrote: > On 03/11/2010 02:01 PM, Johann Cohen-Tanugi wrote: >> hi there, I am adding this to this thread and not to the trac, >> because I am not sure whether it adds noise or a piece of info. I >> just downloaded the scipy trunk and built it, and ran nosetests on >> it, which bombed instantly.... >> So I tried to get into subdirs to check test scripts separately..... >> and here is one : >> [cohen at jarrett tests]$ ~/.local/bin/ipython test_integrate.py >> --------------------------------------------------------------------------- >> AssertionError Traceback (most recent call >> last) >> >> /home/cohen/sources/python/scipy/scipy/integrate/tests/test_integrate.py >> in () >> 208 >> 209 if __name__ == "__main__": >> --> 210 run_module_suite() >> 211 >> 212 >> >> /home/cohen/.local/lib/python2.6/site-packages/numpy/testing/nosetester.pyc >> in run_module_suite(file_to_run) >> 75 f = sys._getframe(1) >> 76 file_to_run = f.f_locals.get('__file__', None) >> ---> 77 assert file_to_run is not None >> 78 >> 79 import_nose().run(argv=['',file_to_run]) >> >> AssertionError: >> python: Modules/gcmodule.c:277: visit_decref: Assertion >> `gc->gc.gc_refs != 0' failed. >> Aborted (core dumped) >> [cohen at jarrett tests]$ pwd >> /home/cohen/sources/python/scipy/scipy/integrate/tests >> >> the bomb is the same, but the context seems different... 
I leave that >> to the experts :) >> Johann > > Yes, > I think it is the same issue as I do not have the problem after fixing > the following file and rebuilding numpy and scipy: > numpy/core/code_generators/generate_ufunc_api.py > > Bruce > > > -- > This message has been scanned for viruses and > dangerous content by *MailScanner* , and is > believed to be clean. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsdale24 at gmail.com Thu Mar 11 17:04:49 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 11 Mar 2010 17:04:49 -0500 Subject: [Numpy-discussion] subclassing ndarray in python3 In-Reply-To: <1268339901.4962.68.camel@idol> References: <1268339901.4962.68.camel@idol> Message-ID: Hi Pauli, On Thu, Mar 11, 2010 at 3:38 PM, Pauli Virtanen wrote: > Thanks for testing. I wish the test suite was more complete (hint! > hint! :) I'll be happy to contribute, but lately I get a few 15-30 minute blocks a week for this kind of work (hence the short attempt to work on Quantities this morning), and its not likely to let up for about 3 weeks. > Yes, probably explicitly defining __rmul__ for ndarray could be the > right solution. Please file a bug report on this. Done: http://projects.scipy.org/numpy/ticket/1426 Cheers, and *thank you* for all you have already done to support python-3, Darren From tpk at kraussfamily.org Thu Mar 11 19:30:53 2010 From: tpk at kraussfamily.org (Tom K.) Date: Thu, 11 Mar 2010 16:30:53 -0800 (PST) Subject: [Numpy-discussion] arange including stop value? In-Reply-To: <27866607.post@talk.nabble.com> References: <27866607.post@talk.nabble.com> Message-ID: <27872069.post@talk.nabble.com> davefallest wrote: > > ... > In [3]: np.arange(1.01, 1.1, 0.01) > Out[3]: array([ 1.01, 1.02, 1.03, 1.04, 1.05, 1.06, 1.07, 1.08, > 1.09, 1.1 ]) > > Why does the ... np.arange command end up including my stop value? > >From the help for arange: For floating point arguments, the length of the result is ``ceil((stop - start)/step)``. Because of floating point overflow, this rule may result in the last element of `out` being greater than `stop`. -- View this message in context: http://old.nabble.com/arange-including-stop-value--tp27866607p27872069.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From peridot.faceted at gmail.com Thu Mar 11 19:46:09 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 11 Mar 2010 19:46:09 -0500 Subject: [Numpy-discussion] arange including stop value? In-Reply-To: <27872069.post@talk.nabble.com> References: <27866607.post@talk.nabble.com> <27872069.post@talk.nabble.com> Message-ID: On 11 March 2010 19:30, Tom K. wrote: > > > > davefallest wrote: >> >> ... >> In [3]: np.arange(1.01, 1.1, 0.01) >> Out[3]: array([ 1.01, ?1.02, ?1.03, ?1.04, ?1.05, ?1.06, ?1.07, ?1.08, >> 1.09, ?1.1 ]) >> >> Why does the ... np.arange command end up including my stop value? Don't use arange for floating-point values. Use linspace instead. Anne > >From the help for arange: > > ? ? ? ?For floating point arguments, the length of the result is > ? ? ? ?``ceil((stop - start)/step)``. ?Because of floating point overflow, > ? ? ? ?this rule may result in the last element of `out` being greater > ? ? ? ?than `stop`. 
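A small sketch of the linspace alternative, assuming the intent of the original example was the nine steps 1.01 through 1.09: linspace takes a count rather than a step, so inclusion of the endpoint is stated explicitly instead of depending on floating point rounding.

import numpy as np

np.linspace(1.01, 1.09, 9)                  # nine values, endpoint included by construction
np.linspace(1.01, 1.1, 9, endpoint=False)   # same nine values, stop value excluded on purpose

Either form gives the nine evenly spaced values from 1.01 to 1.09 without the surprise extra element.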
> > -- > View this message in context: http://old.nabble.com/arange-including-stop-value--tp27866607p27872069.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Thu Mar 11 22:18:26 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Mar 2010 21:18:26 -0600 Subject: [Numpy-discussion] crash at prompt exit after running test In-Reply-To: <4B996735.4090301@lpta.in2p3.fr> References: <4B96A1CD.7080702@lpta.in2p3.fr> <4B976627.8060408@lpta.in2p3.fr> <4B97AF44.80207@lpta.in2p3.fr> <4B97CB4E.7060304@gmail.com> <4B994C15.9040709@lpta.in2p3.fr> <4B9956F1.8050903@gmail.com> <4B996735.4090301@lpta.in2p3.fr> Message-ID: On Thu, Mar 11, 2010 at 3:57 PM, Johann Cohen-Tanugi wrote: > is your fix committed? > > No. Pauli thinks the problem may lie elsewhere. I haven't had time to look things over, but it is possible that the changes in the generated api exposed a bug elsewhere. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gberbeglia at gmail.com Fri Mar 12 13:52:32 2010 From: gberbeglia at gmail.com (gerardo.berbeglia) Date: Fri, 12 Mar 2010 10:52:32 -0800 (PST) Subject: [Numpy-discussion] matrix operation Message-ID: <27879913.post@talk.nabble.com> Hello, I want to "divide" an n x n (2-dimension) numpy array matrix A by a n (1-dimension) array d as follows: Take n = 2. Let A= 2 3 1 10 and let d = [ 3 2 ] Then i would like to have "A/d" = 2/3 3/3 1/2 10/2 This is to avoid loops to improve the speed. Thank you in advance! -- View this message in context: http://old.nabble.com/matrix-operation-tp27879913p27879913.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From gberbeglia at gmail.com Fri Mar 12 13:54:43 2010 From: gberbeglia at gmail.com (gerardo.berbeglia) Date: Fri, 12 Mar 2010 10:54:43 -0800 (PST) Subject: [Numpy-discussion] matrix division without a loop Message-ID: <27881350.post@talk.nabble.com> Hello, I want to "divide" an n x n (2-dimension) numpy array matrix A by a n (1-dimension) array d as follows: Take n = 2. Let A= 2 3 1 10 and let d = [ 3 2 ] Then i would like to have "A/d" = 2/3 3/3 1/2 10/2 This is to avoid loops to improve the speed. Thank you in advance! -- View this message in context: http://old.nabble.com/matrix-division-without-a-loop-tp27881350p27881350.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Fri Mar 12 13:56:39 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 12 Mar 2010 12:56:39 -0600 Subject: [Numpy-discussion] matrix operation In-Reply-To: <27879913.post@talk.nabble.com> References: <27879913.post@talk.nabble.com> Message-ID: <3d375d731003121056q6b31895md34565e49b9f8e9@mail.gmail.com> On Fri, Mar 12, 2010 at 12:52, gerardo.berbeglia wrote: > > Hello, > > I want to "divide" an n x n (2-dimension) numpy array matrix A by a n > (1-dimension) array d as follows: > > Take n = 2. > Let A= ? 2 3 > ? ? ? ? ? 1 10 > and let d = [ 3 2 ] > Then i would like to have "A/d" = 2/3 ?3/3 > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 1/2 ?10/2 In [2]: import numpy In [3]: a = numpy.array([[2.0, 3.0], [1.0, 10.0]]) In [4]: a Out[4]: array([[ 2., 3.], [ 1., 10.]]) In [5]: d = numpy.array([[3.0], [2.0]]) In [6]: d Out[6]: array([[ 3.], [ 2.]]) In [7]: a / d Out[7]: array([[ 0.66666667, 1. ], [ 0.5 , 5. 
]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From friedrichromstedt at gmail.com Fri Mar 12 15:46:41 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Fri, 12 Mar 2010 21:46:41 +0100 Subject: [Numpy-discussion] matrix division without a loop In-Reply-To: <27881350.post@talk.nabble.com> References: <27881350.post@talk.nabble.com> Message-ID: >>> import numpy >>> A = numpy.asarray([[2, 3], [1, 10]]) >>> print A [[ 2 3] [ 1 10]] >>> d = numpy.asarray([3, 2]) >>> print d [3 2] >>> print (A.T * (1.0 / d)).T [[ 0.66666667 1. ] [ 0.5 5. ]] - or - >>> d = numpy.asarray([3.0, 2.0]) >>> print d [ 3. 2.] >>> print (A.T / d).T [[ 0.66666667 1. ] [ 0.5 5. ]] Friedrich From peridot.faceted at gmail.com Fri Mar 12 16:54:29 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 12 Mar 2010 16:54:29 -0500 Subject: [Numpy-discussion] matrix division without a loop In-Reply-To: <27881350.post@talk.nabble.com> References: <27881350.post@talk.nabble.com> Message-ID: On 12 March 2010 13:54, gerardo.berbeglia wrote: > > Hello, > > I want to "divide" an n x n (2-dimension) numpy array matrix A by a n > (1-dimension) array d as follows: Look up "broadcasting" in the numpy docs. The short version is this: operations like division act elementwise on arrays of the same shape. If two arrays' shapes differ, but only in that one has a 1 where the other has an n, the one whose dimension is 1 will be effectively "repeated" n times along that axis so they match. Finally, if two arrays have shapes of different lengths, the shorter is extended by prepending as many dimensions of size one as necessary, then the previous rules are applied. Put this together with the ability to add "extra" dimensions of length 1 to an array by indexing with np.newaxis, and your problem becomes simple: In [13]: A = np.array([[1,2],[3,4]]); B = np.array([1.,2.]) In [14]: A.shape Out[14]: (2, 2) In [15]: B.shape Out[15]: (2,) In [16]: A/B Out[16]: array([[ 1., 1.], [ 3., 2.]]) In [17]: A/B[:,np.newaxis] Out[17]: array([[ 1. , 2. ], [ 1.5, 2. ]]) If you're using matrices, don't. Stick to arrays. Anne > Take n = 2. > Let A= 2 3 > 1 10 > and let d = [ 3 2 ] > Then i would like to have "A/d" = 2/3 3/3 > 1/2 10/2 > > This is to avoid loops to improve the speed. > > Thank you in advance! > -- > View this message in context: http://old.nabble.com/matrix-division-without-a-loop-tp27881350p27881350.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Nicolas.Rougier at loria.fr Sat Mar 13 04:20:54 2010 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Sat, 13 Mar 2010 10:20:54 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation Message-ID: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> Hello, I'm trying to translate a small matlab program for the simulation in a 2D flow in a channel past a cylinder and since I do not have matlab access, I would like to know if someone can help me, especially on array indexing. The matlab source code is available at: http://www.lbmethod.org/openlb/lb.examples.html and below is what I've done so far in my translation effort. 
In the matlab code, there is a "ux" array of shape (1,lx,ly) and I do not understand syntax: "ux(:,1,col)" with "col = [2:(ly-1)]". If someone knows, that would help me a lot... Nicolas ''' Channel flow past a cylindrical obstacle, using a LB method Copyright (C) 2006 Jonas Latt Address: Rue General Dufour 24, 1211 Geneva 4, Switzerland E-mail: Jonas.Latt at cui.unige.ch ''' import numpy # GENERAL FLOW CONSTANTS lx = 250 ly = 51 obst_x = lx/5.+1 # position of the cylinder; (exact obst_y = ly/2.+1 # y-symmetry is avoided) obst_r = ly/10.+1 # radius of the cylinder uMax = 0.02 # maximum velocity of Poiseuille inflow Re = 100 # Reynolds number nu = uMax * 2.*obst_r / Re # kinematic viscosity omega = 1. / (3*nu+1./2.) # relaxation parameter maxT = 4000 # total number of iterations tPlot = 5 # cycles # D2Q9 LATTICE CONSTANTS # matlab: t = [4/9, 1/9,1/9,1/9,1/9, 1/36,1/36,1/36,1/36]; # matlab: cx = [ 0, 1, 0, -1, 0, 1, -1, -1, 1]; # matlab: cy = [ 0, 0, 1, 0, -1, 1, 1, -1, -1]; # matlab: opp = [ 1, 4, 5, 2, 3, 8, 9, 6, 7]; # matlab: col = [2:(ly-1)]; t = numpy.array([4/9., 1/9.,1/9.,1/9.,1/9., 1/36.,1/36.,1/36.,1/36.]) cx = numpy.array([ 0, 1, 0, -1, 0, 1, -1, -1, 1]) cy = numpy.array([ 0, 0, 1, 0, -1, 1, 1, -1, -1]) opp = numpy.array([ 1, 4, 5, 2, 3, 8, 9, 6, 7]) col = numpy.arange(2,(ly-1)) # matlab: [y,x] = meshgrid(1:ly,1:lx); # matlab: obst = (x-obst_x).^2 + (y-obst_y).^2 <= obst_r.^2; # matlab: obst(:,[1,ly]) = 1; # matlab: bbRegion = find(obst); y,x = numpy.meshgrid(numpy.arange(ly),numpy.arange(lx)) obst = (x-obst_x)**2 + (y-obst_y)**2 <= obst_r**2 obst[:,0] = obst[:,ly-1] = 1 bbRegion = numpy.nonzero(obst) # INITIAL CONDITION: (rho=0, u=0) ==> fIn(i) = t(i) # matlab: fIn = reshape( t' * ones(1,lx*ly), 9, lx, ly); fIn = numpy.ones((lx,ly,9)) fIn [:,:] = t # MAIN LOOP (TIME CYCLES) # matlab: for cycle = 1:maxT for cycle in range(maxT): # MACROSCOPIC VARIABLES # matlab: rho = sum(fIn); # matlab: ux = reshape ( ... # matlab: (cx * reshape(fIn,9,lx*ly)), 1,lx,ly) ./rho; # matlab: uy = reshape ( ... # matlab: (cy * reshape(fIn,9,lx*ly)), 1,lx,ly) ./rho; rho = fIn.sum(-1) ux = (cx*fIn).sum(-1)/rho uy = (cx*fIn).sum(-1)/rho # MACROSCOPIC (DIRICHLET) BOUNDARY CONDITIONS # Inlet: Poiseuille profile # matlab: L = ly-2; y = col-1.5; # matlab: ux(:,1,col) = 4 * uMax / (L*L) * (y.*L-y.*y); # matlab: uy(:,1,col) = 0; # matlab: rho(:,1,col) = 1 ./ (1-ux(:,1,col)) .* ( ... # matlab: sum(fIn([1,3,5],1,col)) + ... # matlab: 2*sum(fIn([4,7,8],1,col)) ); L = ly-2.0 y = col-1.5 # Is that right ? ux[0,1:-1] = 4 * uMax / (L*L) * (y.*L-y.*y) # Is that right ? 
uy[0,1:-1] = 0 rho[0,1:-1] = 1./(1-ux[1,1:-1])*(sum(fIn[ ([1,3,5],1,col)) + 2*sum(fIn([4,7,8],1,col))) # Outlet: Zero gradient on rho/ux # matlab: rho(:,lx,col) = rho(:,lx-1,col); # matlab: uy(:,lx,col) = 0; # matlab: ux(:,lx,col) = ux(:,lx-1,col); # % Outlet: Zero gradient on rho/ux # rho(:,lx,col) = rho(:,lx-1,col); # uy(:,lx,col) = 0; # ux(:,lx,col) = ux(:,lx-1,col); # % COLLISION STEP # for i=1:9 # cu = 3*(cx(i)*ux+cy(i)*uy); # fEq(i,:,:) = rho .* t(i) .* ( 1 + cu + 1/2*(cu.*cu) - 3/2*(ux.^2+uy.^2) ); # fOut(i,:,:) = fIn(i,:,:) - omega .* (fIn(i,:,:)-fEq(i,:,:)); # end # % MICROSCOPIC BOUNDARY CONDITIONS # for i=1:9 # % Left boundary # fOut(i,1,col) = fEq(i,1,col) + 18*t(i)*cx(i)*cy(i)* ( fIn(8,1,col) - fIn(7,1,col)-fEq(8,1,col)+fEq(7,1,col) ); # % Right boundary # fOut(i,lx,col) = fEq(i,lx,col) + 18*t(i)*cx(i)*cy(i)* ( fIn(6,lx,col) - fIn(9,lx,col)-fEq(6,lx,col)+fEq(9,lx,col) ); # % Bounce back region # fOut(i,bbRegion) = fIn(opp(i),bbRegion); # % STREAMING STEP # for i=1:9 # fIn(i,:,:) = circshift(fOut(i,:,:), [0,cx(i),cy(i)]); From silva at lma.cnrs-mrs.fr Sat Mar 13 05:45:31 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Sat, 13 Mar 2010 11:45:31 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> Message-ID: <1268477132.2760.3.camel@Portable-s2m.cnrs-mrs.fr> Le samedi 13 mars 2010 ? 10:20 +0100, Nicolas Rougier a ?crit : > Hello, > I'm trying to translate a small matlab program for the simulation in a > 2D flow in a channel past a cylinder and since I do not have matlab > access, I would like to know if someone can help me, especially on > array indexing. The matlab source code is available at: > http://www.lbmethod.org/openlb/lb.examples.html and below is what I've > done so far in my translation effort. > > In the matlab code, there is a "ux" array of shape (1,lx,ly) and I do > not understand syntax: "ux(:,1,col)" with "col = [2:(ly-1)]". If > someone knows, that would help me a lot... As ux 's shape is (1,lx,ly), ux(:,1,col) is equal to ux(1,1,col) which is a vector with the elements [ux(1,1,2), ... ux(1,1,ly-1)]. Using ":" juste after the reshape seems a lit bit silly... -- Fabrice Silva LMA UPR CNRS 7051 From Nicolas.Rougier at loria.fr Sat Mar 13 06:42:27 2010 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Sat, 13 Mar 2010 12:42:27 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: <1268477132.2760.3.camel@Portable-s2m.cnrs-mrs.fr> References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <1268477132.2760.3.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: Thanks. I agree that the use of ':' is a bit weird. Nicolas On Mar 13, 2010, at 11:45 , Fabrice Silva wrote: > Le samedi 13 mars 2010 ? 10:20 +0100, Nicolas Rougier a ?crit : >> Hello, >> I'm trying to translate a small matlab program for the simulation in a >> 2D flow in a channel past a cylinder and since I do not have matlab >> access, I would like to know if someone can help me, especially on >> array indexing. The matlab source code is available at: >> http://www.lbmethod.org/openlb/lb.examples.html and below is what I've >> done so far in my translation effort. >> >> In the matlab code, there is a "ux" array of shape (1,lx,ly) and I do >> not understand syntax: "ux(:,1,col)" with "col = [2:(ly-1)]". If >> someone knows, that would help me a lot... 
> > As ux 's shape is (1,lx,ly), ux(:,1,col) is equal to ux(1,1,col) which
> is a vector with the elements [ux(1,1,2), ... ux(1,1,ly-1)].
> Using ":" just after the reshape seems a little bit silly...
> --
> Fabrice Silva
> LMA UPR CNRS 7051
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From rmay31 at gmail.com Sat Mar 13 09:51:29 2010
From: rmay31 at gmail.com (Ryan May)
Date: Sat, 13 Mar 2010 08:51:29 -0600
Subject: [Numpy-discussion] Some help on matlab to numpy translation
In-Reply-To: <1268477132.2760.3.camel@Portable-s2m.cnrs-mrs.fr>
References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr>
	<1268477132.2760.3.camel@Portable-s2m.cnrs-mrs.fr>
Message-ID:

On Sat, Mar 13, 2010 at 4:45 AM, Fabrice Silva wrote:
> On Saturday 13 March 2010 at 10:20 +0100, Nicolas Rougier wrote:
>> Hello,
>> I'm trying to translate a small matlab program for the simulation in a
>> 2D flow in a channel past a cylinder and since I do not have matlab
>> access, I would like to know if someone can help me, especially on
>> array indexing. The matlab source code is available at:
>> http://www.lbmethod.org/openlb/lb.examples.html and below is what I've
>> done so far in my translation effort.
>>
>> In the matlab code, there is a "ux" array of shape (1,lx,ly) and I do
>> not understand syntax: "ux(:,1,col)" with "col = [2:(ly-1)]". If
>> someone knows, that would help me a lot...
>
>
> As ux 's shape is (1,lx,ly), ux(:,1,col) is equal to ux(1,1,col) which
> is a vector with the elements [ux(1,1,2), ... ux(1,1,ly-1)].
> Using ":" just after the reshape seems a little bit silly...

Except that python uses 0-based indexing and does not include the last
number in a slice, while Matlab uses 1-based indexing and includes the
last number, so really:

ux(:,1,col)

becomes:

ux(0, 0, col) # or ux(:, 0, col)

And if col is

col = [2:(ly-1)]

This needs to be:

col = np.arange(1, ly - 1)

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
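To make that index translation concrete, here is a minimal sketch; the sizes lx and ly below are made up purely for illustration, only the indexing rule matters:

import numpy as np

lx, ly = 4, 6                      # made-up sizes, just for illustration
ux = np.zeros((1, lx, ly))

# Matlab: col = [2:(ly-1)] selects columns 2 .. ly-1 (1-based, inclusive),
# which in 0-based, stop-exclusive NumPy indexing is
col = np.arange(1, ly - 1)         # array([1, 2, 3, 4])

# Matlab: ux(:,1,col)  corresponds to  NumPy: ux[:, 0, col]
# The selected slice keeps the leading singleton axis and the len(col) columns.
print(ux[:, 0, col].shape)         # (1, 4)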
Friedrich -------------- next part -------------- A non-text attachment was scrubbed... Name: lbmethod.py Type: application/octet-stream Size: 6429 bytes Desc: not available URL: From silva at lma.cnrs-mrs.fr Sat Mar 13 12:20:55 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Sat, 13 Mar 2010 18:20:55 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <1268477132.2760.3.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: <1268500855.6137.3.camel@Portable-s2m.cnrs-mrs.fr> > > As ux 's shape is (1,lx,ly), ux(:,1,col) is equal to ux(1,1,col) which > > is a vector with the elements [ux(1,1,2), ... ux(1,1,ly-1)]. > > Using ":" juste after the reshape seems a lit bit silly... > > Except that python uses 0-based indexing and does not include the last > number in a slice, while Matlab uses 1-based indexing and includes the > last number, so really: > ux(:,1,col) > becomes: > ux(0, 0, col) # or ux(:, 0, col) > > And if col is > col = [2:(ly-1)] > This needs to be: > col = np.arange([1, ly - 1) You are right about the 0 or 1 based indexing argument, but I was speaking matlab language as visible in the symbols used for indexing ( () and not [] )... :) -- Fabrice Silva LMA UPR CNRS 7051 From z99719 at gmail.com Sat Mar 13 14:06:24 2010 From: z99719 at gmail.com (z99719 z99719) Date: Sat, 13 Mar 2010 12:06:24 -0700 Subject: [Numpy-discussion] problem with applying patch Message-ID: <6309d0f61003131106m1d112d92w73b113408bc9d84f@mail.gmail.com> I have installed latest numpy from svn on ubuntu karmic(9.10) and compiled the code in my sand box. I downloaded two patches (test_umath_complex.py.patch, build_clib.py.patch) from the review patch directory on http://projects.scipy.org/numpy/report/12 page for testing. The following patch command gives me patch -p0 -i test_umath_complex.py.patch patch unexpectedly ends in middle of line patch: **** Only garbage was found in the patch input. The patch files are placed in the top level numpy sandbox directory and are in xml format rather than a diff file. I used "save link as" on the link to download the patch on the "Attachments" section of the link page. Checked the test_umath_complex.py in the numpy/core/test directory and found significant differences with the listing on the patch link (both old, and the new versions). Don't think this is limited to just these patches any patch that I tried gives the same error message. Thank you regards alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Sat Mar 13 14:19:32 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Sat, 13 Mar 2010 19:19:32 +0000 (UTC) Subject: [Numpy-discussion] problem with applying patch References: <6309d0f61003131106m1d112d92w73b113408bc9d84f@mail.gmail.com> Message-ID: Sat, 13 Mar 2010 12:06:24 -0700, z99719 z99719 wrote: [clip] > The patch files are placed in the top level numpy sandbox directory and > are in xml format rather than a diff file. I used "save link as" on the > link to download the patch on the "Attachments" section of the link > page. That gives you the HTML page. Click on the link, and choose "Download in other formats: Original Format" from the bottom of the page. From aisaac at american.edu Sat Mar 13 15:32:32 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 13 Mar 2010 15:32:32 -0500 Subject: [Numpy-discussion] integer division rule? 
Message-ID: <4B9BF660.4050600@american.edu> Francesc Altet once provided an example that for integer division, numpy uses the C99 rule: round towards 0. This is different than Python's rule for integer division: round towards negative infinity. But I cannot reproduce his example. (NumPy 1.3.) Did this behavior change at some point? Thanks, Alan Isaac From josef.pktd at gmail.com Sat Mar 13 15:51:55 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Mar 2010 15:51:55 -0500 Subject: [Numpy-discussion] long integers in genfromtxt Message-ID: <1cd32cbb1003131251g14c4901fp7a92e0b4a87a530c@mail.gmail.com> I was trying to find out what the "helpful" message "TypeError: expected a readable buffer object" means and it seems genfromtxt has problems identifying long integers (at least on Windows 32) >>> np.array(4160680000,int) Traceback (most recent call last): File "", line 1, in np.array(4160680000,int) OverflowError: long int too large to convert to int >>> s = '''Date,Open,High,Low,Close,Volume,Adj Close ... 2010-02-12,1075.95,1077.81,1062.97,1075.51,4160680000,1075.51 ... 2010-02-11,1067.10,1080.04,1060.59,1078.47,4400870000,1078.47''' >>> sh = StringIO(s) >>> data = np.genfromtxt(sh, delimiter=",", dtype=None, names=True) Traceback (most recent call last): File "C:\Programs\Python25\Lib\site-packages\numpy\lib\io.py", line 1367, in genfromtxt output = np.array(data, dtype=ddtype) TypeError: expected a readable buffer object >>> sh = StringIO(s) >>> data = ts.tsfromtxt(sh, delimiter=",", dtype=None, datecols=(0,), names=True) Traceback (most recent call last): File "\Programs\Python25\Lib\site-packages\scikits\timeseries\extras.py", line 504, in tsfromtxt File "\Programs\Python25\Lib\site-packages\scikits\timeseries\_preview.py", line 1312, in genfromtxt TypeError: expected a readable buffer object >>> dt= [('','S10'),('',float),('',float),('',float),('',float),('',int),('',float)] >>> sh = StringIO(s) >>> data = np.genfromtxt(sh, delimiter=",", dtype=dt, names=True) Traceback (most recent call last): File "C:\Programs\Python25\Lib\site-packages\numpy\lib\io.py", line 1388, in genfromtxt rows = np.array(data, dtype=[('', _) for _ in dtype_flat]) TypeError: expected a readable buffer object >>> sh = StringIO(s) >>> data = ts.tsfromtxt(sh, delimiter=",", dtype=dt, datecols=(0,), names=True) Traceback (most recent call last): File "\Programs\Python25\Lib\site-packages\scikits\timeseries\extras.py", line 451, in tsfromtxt File "\Programs\Python25\Lib\site-packages\scikits\timeseries\_preview.py", line 798, in easy_dtype File "\Programs\Python25\Lib\site-packages\scikits\timeseries\_preview.py", line 364, in __call__ File "\Programs\Python25\Lib\site-packages\scikits\timeseries\_preview.py", line 329, in validate TypeError: object of type 'bool' has no len() using float works, but I needed the debugger to figure out what the problem with my data is (or whether I just make mistakes) >>> dt= [('','S10'),('',float),('',float),('',float),('',float),('',float),('',float)] >>> sh = StringIO(s) >>> data = np.genfromtxt(sh, delimiter=",", dtype=dt, names=True) >>> data array([ ('2010-02-12', 1075.95, 1077.8099999999999, 1062.97, 1075.51, 4160680000.0, 1075.51), ('2010-02-11', 1067.0999999999999, 1080.04, 1060.5899999999999, 1078.47, 4400870000.0, 1078.47)], dtype=[('Date', '|S10'), ('Open', '>> >>> ts.version.version '0.91.3' Josef From pav at iki.fi Sat Mar 13 16:04:52 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 13 Mar 2010 23:04:52 +0200 Subject: [Numpy-discussion] integer 
division rule? In-Reply-To: <4B9BF660.4050600@american.edu> References: <4B9BF660.4050600@american.edu> Message-ID: <1268514292.8600.3.camel@talisman> la, 2010-03-13 kello 15:32 -0500, Alan G Isaac kirjoitti: > Francesc Altet once provided an example that for integer > division, numpy uses the C99 rule: round towards 0. > This is different than Python's rule for integer division: > round towards negative infinity. > > But I cannot reproduce his example. (NumPy 1.3.) > Did this behavior change at some point? It was changed in r5888. What the rationale was is not clear from the commit message. Pauli From charlesr.harris at gmail.com Sat Mar 13 20:57:47 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 13 Mar 2010 19:57:47 -0600 Subject: [Numpy-discussion] integer division rule? In-Reply-To: <1268514292.8600.3.camel@talisman> References: <4B9BF660.4050600@american.edu> <1268514292.8600.3.camel@talisman> Message-ID: On Sat, Mar 13, 2010 at 3:04 PM, Pauli Virtanen wrote: > la, 2010-03-13 kello 15:32 -0500, Alan G Isaac kirjoitti: > > Francesc Altet once provided an example that for integer > > division, numpy uses the C99 rule: round towards 0. > > This is different than Python's rule for integer division: > > round towards negative infinity. > > > > But I cannot reproduce his example. (NumPy 1.3.) > > Did this behavior change at some point? > > It was changed in r5888. What the rationale was is not clear from the > commit message. > > The change was before that, the logic of the loop after r5888 is the same as before. I suspect the change was made, whenever that was, in order to conform to python. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at loria.fr Sun Mar 14 08:14:26 2010 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Sun, 14 Mar 2010 13:14:26 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> Message-ID: <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> Thanks a lot for the translation. I think the "opp" relates to the opp array declared at the top and the circshift can be made using numpy roll. I modified your translation to include them. The results should be something like http://www.lbmethod.org/openlb/gif/karman.avi (I think) but unfortunately, it does not seem to do anything at the moment, I need to investigate further where is the problem. Nicolas New version: ''' Channel flow past a cylindrical obstacle, using a LB method Copyright (C) 2006 Jonas Latt Address: Rue General Dufour 24, 1211 Geneva 4, Switzerland E-mail: Jonas.Latt at cui.unige.ch ''' import numpy import matplotlib matplotlib.use('MacOSX') import matplotlib.pyplot as plt # General flow constants lx = 250 ly = 51 obst_x = lx/5.+1 # position of the cylinder; (exact obst_y = ly/2.+1 # y-symmetry is avoided) obst_r = ly/10.+1 # radius of the cylinder uMax = 0.02 # maximum velocity of Poiseuille inflow Re = 100 # Reynolds number nu = uMax * 2.*obst_r / Re # kinematic viscosity omega = 1. / (3*nu+1./2.) 
# relaxation parameter maxT = 4000 # total number of iterations tPlot = 5 # cycles # D2Q9 Lattice constants t = numpy.array([4/9., 1/9.,1/9.,1/9.,1/9., 1/36.,1/36.,1/36.,1/36.]) cx = numpy.array([ 0, 1, 0, -1, 0, 1, -1, -1, 1]) cy = numpy.array([ 0, 0, 1, 0, -1, 1, 1, -1, -1]) opp = numpy.array([ 0, 3, 4, 1, 2, 7, 8, 5, 6]) col = numpy.arange(1, ly - 1) y,x = numpy.meshgrid(numpy.arange(ly),numpy.arange(lx)) obst = (x-obst_x)**2 + (y-obst_y)**2 <= obst_r**2 obst[:,0] = obst[:,ly-1] = 1 bbRegion = numpy.nonzero(obst) # Initial condition: (rho=0, u=0) ==> fIn(i) = t(i) fIn = t[:, numpy.newaxis, numpy.newaxis].\ repeat(lx, axis = 1).\ repeat(ly, axis = 2) # Main loop (time cycles) for cycle in range(maxT): # Macroscopic variables rho = fIn.sum(axis = 0) ux = (cx[:,numpy.newaxis,numpy.newaxis] * fIn).sum(axis = 0) / rho uy = (cy[:,numpy.newaxis,numpy.newaxis] * fIn).sum(axis = 0) / rho # Macroscopic (Dirichlet) boundary condition L = ly-2.0 y = col-1.5 ux[:, 1, col] = 4 * uMax / (L ** 2) * (y * L - y ** 2) uy[:, 1, col] = 0 rho[0, col] = 1 / (1 - ux[0, col]) * \ (fIn[[1, 3, 5]][:, 0][:, col].sum(axis = 0) + \ 2 * fIn[[4, 7, 8]][:, 1][:, col].sum(axis = 0)) rho[lx - 1, col] = rho[lx - 2, col] uy[lx - 1, col] = 0 ux[lx - 1, col] = ux[:, lx - 2, col] fEq = numpy.zeros((9, lx, ly)) fOut = numpy.zeros((9, lx, ly)) for i in xrange(0, 9): cu = 3 * (cx[i] * ux + cy[i] * uy) fEq[i] = rho * t[i] * (1 + cu + 0.5 * cu ** 2 - \ 1.5 * (ux ** 2 + uy ** 2)) fOut[i] = fIn[i] - omega * (fIn[i] - fIn[i]) # Microscopic boundary conditions for i in xrange(0, 9): # Left boundary: fOut[i, 1, col] = fEq[i,0,col] + 18 * t[i] * cx[i] * cy[i] * \ (fIn[7,0,col] - fIn[6,0,col] - fEq[7,0,col] + fEq[6,0,col]) fOut[i,lx-1,col] = fEq[i,lx-1,col] + 18 * t[i] * cx[i] * cy[i] * \ (fIn[5,lx-1,col] - fIn[8,lx-1,col] - fEq[5,lx-1,col] + fEq[8,lx-1,col]) fOut[i,bbRegion] = fIn[opp[i],bbRegion] # Streaming step for i in xrange(0,9): fIn = numpy.roll(fIn, cx[i], 1) fIn = numpy.roll(fIn, cy[i], 2) if not cycle%tPlot: u = numpy.sqrt(ux**2+uy**2) #u[bbRegion] = numpy.nan print u.min(), u.max() #plt.imshow(u) #plt.show() On Mar 13, 2010, at 16:59 , Friedrich Romstedt wrote: > 2010/3/13 Nicolas Rougier : >> I'm trying to translate a small matlab program for the simulation in a 2D flow in a channel past a cylinder and since I do not have matlab access, I would like to know if someone can help me, especially on array indexing. The matlab source code is available at: http://www.lbmethod.org/openlb/lb.examples.html and below is what I've done so far in my translation effort. > >> In the matlab code, there is a "ux" array of shape (1,lx,ly) and I do not understand syntax: "ux(:,1,col)" with "col = [2:(ly-1)]". If someone knows, that would help me a lot... > > It means that you select all in the 0-axis all indices, in the 1-axis > the index 0 (matlab: index 1), and in the 2-axis the indices given by > the list {col}. {col} is in our case an ndarray of .ndim = 1. > > I attach a modified version of your script which is running, computing > *something*. If you could provide information about matlab functions > opp() and circshift() that would be helpful. I marked sections I > changed with "CHANGED", todos with TODO and lonely notes with NOTE and > so on. 
> > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From bsouthey at gmail.com Sun Mar 14 10:55:07 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 14 Mar 2010 09:55:07 -0500 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> Message-ID: On Sun, Mar 14, 2010 at 7:14 AM, Nicolas Rougier wrote: > > Thanks a lot for the translation. > > I think the "opp" relates to the opp array declared at the top and the circshift can be made using numpy roll. I modified your translation to include them. > > The results should be something like http://www.lbmethod.org/openlb/gif/karman.avi (I think) but unfortunately, it does not seem to do anything at the moment, I need to investigate further where is the problem. > > Nicolas > > > > New version: > > > ''' > Channel flow past a cylindrical obstacle, using a LB method > Copyright (C) 2006 Jonas Latt > Address: Rue General Dufour 24, ?1211 Geneva 4, Switzerland > E-mail: Jonas.Latt at cui.unige.ch > ''' > import numpy > import matplotlib > matplotlib.use('MacOSX') > import matplotlib.pyplot as plt > > # General flow constants > lx = 250 > ly = 51 > obst_x = lx/5.+1 ? ? ? ? ? ? ? ? ?# position of the cylinder; (exact > obst_y = ly/2.+1 ? ? ? ? ? ? ? ? ?# y-symmetry is avoided) > obst_r = ly/10.+1 ? ? ? ? ? ? ? ? # radius of the cylinder > uMax ? = 0.02 ? ? ? ? ? ? ? ? ? ? # maximum velocity of Poiseuille inflow > Re ? ? = 100 ? ? ? ? ? ? ? ? ? ? # Reynolds number > nu ? ? = uMax * 2.*obst_r / Re ? ?# kinematic viscosity > omega ?= 1. / (3*nu+1./2.) ? ? ? # relaxation parameter > maxT ? = 4000 ? ? ? ? ? ? ? ? ? ?# total number of iterations > tPlot ?= 5 ? ? ? ? ? ? ? ? ? ? ? # cycles > > # D2Q9 Lattice constants > t ?= numpy.array([4/9., 1/9.,1/9.,1/9.,1/9., 1/36.,1/36.,1/36.,1/36.]) > cx = numpy.array([ ?0, ? 1, ?0, -1, ?0, ? ?1, ?-1, ?-1, ? 1]) > cy = numpy.array([ ?0, ? 0, ?1, ?0, -1, ? ?1, ? 1, ?-1, ?-1]) > opp = numpy.array([ 0, ? 3, ?4, ?1, ?2, ? ?7, ? 8, ? 5, ? 6]) > col = numpy.arange(1, ly - 1) > > y,x = numpy.meshgrid(numpy.arange(ly),numpy.arange(lx)) > obst = (x-obst_x)**2 + (y-obst_y)**2 <= obst_r**2 > obst[:,0] = obst[:,ly-1] = 1 > bbRegion = numpy.nonzero(obst) > > # Initial condition: (rho=0, u=0) ==> fIn(i) = t(i) > fIn = t[:, numpy.newaxis, numpy.newaxis].\ > ? ? ? ? ? ? ? ?repeat(lx, axis = 1).\ > ? ? ? ? ? ? ? ?repeat(ly, axis = 2) > > # Main loop (time cycles) > for cycle in range(maxT): > > ? # Macroscopic variables > ? rho = fIn.sum(axis = 0) > ? ux = (cx[:,numpy.newaxis,numpy.newaxis] * fIn).sum(axis = 0) / rho > ? uy = (cy[:,numpy.newaxis,numpy.newaxis] * fIn).sum(axis = 0) / rho You probably should split these into the creation part (cy[:,numpy.newaxis,numpy.newaxis]) and multiplication etc part ( * fIn).sum(axis = 0) / rho). It will save the repeated memory allocation. > > ? # Macroscopic (Dirichlet) boundary condition > ? L = ly-2.0 > ? y = col-1.5 These two variables can be moved out in the loop > ? ux[:, 1, col] = 4 * uMax / (L ** 2) * (y * L - y ** 2) > ? uy[:, 1, col] = 0 > ? rho[0, col] = 1 / (1 - ux[0, col]) * \ > ? ? ? ? ? ? ? ? ? (fIn[[1, 3, 5]][:, 0][:, col].sum(axis = 0) + \ > ? ? ? ? ? ? ? ? ? ?2 * fIn[[4, 7, 8]][:, 1][:, col].sum(axis = 0)) > ? rho[lx - 1, col] = rho[lx - 2, col] > ? 
uy[lx - 1, col] = 0 > ? ux[lx - 1, col] = ux[:, lx - 2, col] > > ? fEq = numpy.zeros((9, lx, ly)) > ? fOut = numpy.zeros((9, lx, ly)) > ? for i in xrange(0, 9): > ? ? ? ? ? cu = 3 * (cx[i] * ux + cy[i] * uy) > ? ? ? ? ? fEq[i] = rho * t[i] * (1 + cu + 0.5 * cu ** 2 - \ > ? ? ? ? ? ? ? ? ? ? ? ? ? 1.5 * (ux ** 2 + uy ** 2)) > ? ? ? ? ? fOut[i] = fIn[i] - omega * (fIn[i] - fIn[i]) This line is probably wrong, (fln[i]-fEq[i])? In anycase, I do not see the need for the loop (which why it caught my attention). If you are just indexing the same 'row' of an array using a loop then you probably don't need the loop. So you probably should be able to drop the indexing (assuming arrays) something like this, which is untested: cu-3*(cx*ux +cy*uy) fEq= rho * t * (1 + cu + 0.5 * cu ** 2 -1.5 * (ux ** 2 + uy ** 2)) fOut = fIn - omega * (fIn - fEq) If these are correct then you don't need the creation of fEQ and fOut as numpy.zeros. > > ? # Microscopic boundary conditions > ? for i in xrange(0, 9): > ? ? ?# Left boundary: > ? ? ?fOut[i, 1, col] = fEq[i,0,col] + 18 * t[i] * cx[i] * cy[i] * \ > ? ? ? ? ?(fIn[7,0,col] - fIn[6,0,col] - fEq[7,0,col] + fEq[6,0,col]) > ? ? ?fOut[i,lx-1,col] = fEq[i,lx-1,col] + 18 * t[i] * cx[i] * cy[i] * \ > ? ? ? ? ?(fIn[5,lx-1,col] - fIn[8,lx-1,col] - fEq[5,lx-1,col] + fEq[8,lx-1,col]) > ? ? ?fOut[i,bbRegion] = fIn[opp[i],bbRegion] You should be able to drop the indexing and loop here as well. > > ? # Streaming step > ? for i in xrange(0,9): > ? ? ?fIn = numpy.roll(fIn, cx[i], 1) > ? ? ?fIn = numpy.roll(fIn, cy[i], 2) Likewise, it would not surprise me if this loop is not needed. Bruce > > ? if not cycle%tPlot: > ? ? ?u = numpy.sqrt(ux**2+uy**2) > ? ? ?#u[bbRegion] = numpy.nan > ? ? ?print u.min(), u.max() > ? ? ?#plt.imshow(u) > ? ? ?#plt.show() > > > > > On Mar 13, 2010, at 16:59 , Friedrich Romstedt wrote: > >> 2010/3/13 Nicolas Rougier : >>> I'm trying to translate a small matlab program for the simulation in a 2D flow in a channel past a cylinder and since I do not have matlab access, I would like to know if someone can help me, especially on array indexing. The matlab source code is available at: http://www.lbmethod.org/openlb/lb.examples.html and below is what I've done so far in my translation effort. >> >>> In the matlab code, there is a "ux" array of shape (1,lx,ly) and I do not understand syntax: "ux(:,1,col)" with "col = [2:(ly-1)]". If someone knows, that would help me a lot... >> >> It means that you select all in the 0-axis all indices, in the 1-axis >> the index 0 (matlab: index 1), and in the 2-axis the indices given by >> the list {col}. ?{col} is in our case an ndarray of .ndim = 1. >> >> I attach a modified version of your script which is running, computing >> *something*. ?If you could provide information about matlab functions >> opp() and circshift() that would be helpful. ?I marked sections I >> changed with "CHANGED", todos with TODO and lonely notes with NOTE and >> so on. 
>> >> Friedrich >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From michel.dupront at hotmail.fr Mon Mar 15 04:45:40 2010 From: michel.dupront at hotmail.fr (Michel Dupront) Date: Mon, 15 Mar 2010 09:45:40 +0100 Subject: [Numpy-discussion] Numpi array and typemap Message-ID: Hello, I am trying to do what was suppose to be a very easy exercice. It consists in converting a C++ class Container into a numpy array as show below using typename My problem is that if I make the method getContainer run from python I will get a numpy array as expected by with wrong values. I guess I have a memory problem but I do not understand exactly wwhat is going on. I was hoping that somebody could help me with that and maybe tell me how to use the PyArray_SimpleNewFromData macro. Thanks %module Foo %{ #define SWIG_FILE_WITH_INIT #include "container.h" template Container getContainer() { ... C Container construction that will have a pointer to (1,2,3,4,5)... return C; } %} %typedef unsigned int size_t; %include "numpy.i" %init %{ import_array(); %} template class Container { public: Container(size_t length, T* pfirst_value); size_t lentgh() { ... return the lentgh ...} T* ptr() { ... return pointer to the first value ... } }; %template(cN) Container; %typemap(python,out) Container { npy_intp dims[] = { 5 }; PyObject* obj = PyArray_SimpleNewFromData (1,dims,PyArray_UINT,$1.ptr()); $result = obj; } template navlib::array getContainer(); %template(getContainerN ) getContainer; _________________________________________________________________ D?couvrez comment SURFER DISCRETEMENT sur un site de rencontres ! http://clk.atdmt.com/FRM/go/206608211/direct/01/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From gberbeglia at gmail.com Mon Mar 15 10:09:02 2010 From: gberbeglia at gmail.com (Gerardo Berbeglia) Date: Mon, 15 Mar 2010 10:09:02 -0400 Subject: [Numpy-discussion] another matrix operation Message-ID: I would like to do with numpy the following operation. Let A be an n x n matrix and let s be an integer between 1 and n. I would like to have an n x n matrix B = f(A,s) such that - If we only look at the first s columns of B, we will not see any difference with respect to the first s columns of the n x n identity matrix. - For the last n-s columns of B, we will not see any difference with respect to the last n-s columns of A. Example. n = 4, s = 2 A= [[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5]] B= f(A,s) = [[1,0,2,2],[0,1,3,3],[0,0,4,4],[0,0,5,5]] Is there a way to do this efficiently (without loops)? Thanks in advance. From josef.pktd at gmail.com Mon Mar 15 10:14:47 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Mar 2010 09:14:47 -0500 Subject: [Numpy-discussion] another matrix operation In-Reply-To: References: Message-ID: <1cd32cbb1003150714p7b877bfal5b4e2042479ec22@mail.gmail.com> On Mon, Mar 15, 2010 at 9:09 AM, Gerardo Berbeglia wrote: > I would like to do with numpy the following operation. > > Let A be an n x n matrix and let s be an integer between 1 and n. 
> > I would like to have an n x n matrix B = f(A,s) such that > > - If we only look at the first s columns of B, we will not see any > difference with respect to the first s columns of the n x n identity > matrix. > > - For the last n-s columns of B, we will not see any difference with > respect to the last n-s columns of A. > > Example. n = 4, s = 2 > > A= [[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5]] > > B= f(A,s) = [[1,0,2,2],[0,1,3,3],[0,0,4,4],[0,0,5,5]] > > Is there a way to do this efficiently (without loops)? something like this ? B = A.copy() B[:s,:] = np.eye[:s,:] Josef > > Thanks in advance. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Mar 15 10:23:24 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Mar 2010 09:23:24 -0500 Subject: [Numpy-discussion] another matrix operation In-Reply-To: <1cd32cbb1003150714p7b877bfal5b4e2042479ec22@mail.gmail.com> References: <1cd32cbb1003150714p7b877bfal5b4e2042479ec22@mail.gmail.com> Message-ID: <1cd32cbb1003150723x44ad4d7bgfb8cdeff45a393e6@mail.gmail.com> On Mon, Mar 15, 2010 at 9:14 AM, wrote: > On Mon, Mar 15, 2010 at 9:09 AM, Gerardo Berbeglia wrote: >> I would like to do with numpy the following operation. >> >> Let A be an n x n matrix and let s be an integer between 1 and n. >> >> I would like to have an n x n matrix B = f(A,s) such that >> >> - If we only look at the first s columns of B, we will not see any >> difference with respect to the first s columns of the n x n identity >> matrix. >> >> - For the last n-s columns of B, we will not see any difference with >> respect to the last n-s columns of A. >> >> Example. n = 4, s = 2 >> >> A= [[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5]] >> >> B= f(A,s) = [[1,0,2,2],[0,1,3,3],[0,0,4,4],[0,0,5,5]] >> >> Is there a way to do this efficiently (without loops)? > > something like this ? > > B = A.copy() > B[:s,:] = np.eye[:s,:] that changes rows, this saves building one temporary eye >>> n,s = 4,2 >>> A = np.array([[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5]]) >>> B = np.eye(n) >>> B[:,-(n-s):] = A[:,-(n-s):] >>> B array([[ 1., 0., 2., 2.], [ 0., 1., 3., 3.], [ 0., 0., 4., 4.], [ 0., 0., 5., 5.]]) Josef > > Josef > >> >> Thanks in advance. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From gberbeglia at gmail.com Mon Mar 15 10:58:14 2010 From: gberbeglia at gmail.com (Gerardo Berbeglia) Date: Mon, 15 Mar 2010 10:58:14 -0400 Subject: [Numpy-discussion] more complex matrix operation Message-ID: I have another matrix operations which seems a little more complicated. Let A be an n x n matrix and let S be a subset of {0,...,n-1}. Assume S is represented by a binary vector s, with a 1 at the index i if i is in S. (e.g. if S={0,3} then s = [1,0,0,1]) I would like to have an efficient way to compute the function B = f (A,S) characterized as follows: - For each column i such that i is in S, then the column i of B is equal to the column i of A. - For each column i such that i is NOT in S, then the column i of B is equal to the ith column of the n x n identity matrix. Example. n=4. A = [[2,2,2,2],[3,3,3,3][4,4,4,4][5,5,5,5]] S = {0,2} => s=[1,0,1,0] f(A,S) = [[2,2,2,2],[0,1,0,0],[4,4,4,4],[0,0,0,1]] Which is the best way to compute f? Thanks again. 
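For the column-wise definition given above, one loop-free idiom is np.where with the selector broadcast across columns; the sketch below assumes S is stored as a boolean vector s (the question leaves the exact representation open):

import numpy as np

A = np.array([[2, 2, 2, 2],
              [3, 3, 3, 3],
              [4, 4, 4, 4],
              [5, 5, 5, 5]])
n = A.shape[0]
s = np.array([1, 0, 1, 0], dtype=bool)   # True where column i is in S

# s broadcasts along rows, so column j of B comes from A when s[j] is True
# and from the identity matrix otherwise.
B = np.where(s, A, np.eye(n, dtype=A.dtype))

With this A and s, B comes out as [[2,0,2,0],[3,1,3,0],[4,0,4,0],[5,0,5,1]], i.e. columns 0 and 2 are taken from A and columns 1 and 3 from the identity.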
From jsseabold at gmail.com Mon Mar 15 11:04:57 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 15 Mar 2010 11:04:57 -0400 Subject: [Numpy-discussion] more complex matrix operation In-Reply-To: References: Message-ID: On Mon, Mar 15, 2010 at 10:58 AM, Gerardo Berbeglia wrote: > I have another matrix operations which seems a little more complicated. > > Let A be an n x n matrix and let S be a subset of {0,...,n-1}. Assume > S is represented by a binary vector s, with a 1 at the index i if i is > in S. (e.g. if S={0,3} then s = [1,0,0,1]) > > I would like to have an efficient way to compute the function B = f > (A,S) characterized as follows: > > - For each column i such that i is in S, then the column i of B is > equal to the column i of A. > > - For each column i such that i is NOT in S, then the column i of B is > equal to the ith column of the n x n identity matrix. > > Example. n=4. > A = [[2,2,2,2],[3,3,3,3][4,4,4,4][5,5,5,5]] > S = {0,2} => s=[1,0,1,0] > > f(A,S) = [[2,2,2,2],[0,1,0,0],[4,4,4,4],[0,0,0,1]] > > Which is the best way to compute f? > You might find these helpful. http://www.scipy.org/Tentative_NumPy_Tutorial http://www.scipy.org/Numpy_Example_List Skipper From josef.pktd at gmail.com Mon Mar 15 11:06:37 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Mar 2010 10:06:37 -0500 Subject: [Numpy-discussion] more complex matrix operation In-Reply-To: References: Message-ID: <1cd32cbb1003150806v1c644a43m78d1b48f874df8bc@mail.gmail.com> On Mon, Mar 15, 2010 at 9:58 AM, Gerardo Berbeglia wrote: > I have another matrix operations which seems a little more complicated. > > Let A be an n x n matrix and let S be a subset of {0,...,n-1}. Assume > S is represented by a binary vector s, with a 1 at the index i if i is > in S. (e.g. if S={0,3} then s = [1,0,0,1]) > > I would like to have an efficient way to compute the function B = f > (A,S) characterized as follows: > > - For each column i such that i is in S, then the column i of B is > equal to the column i of A. > > - For each column i such that i is NOT in S, then the column i of B is > equal to the ith column of the n x n identity matrix. > > Example. n=4. > A = [[2,2,2,2],[3,3,3,3][4,4,4,4][5,5,5,5]] > S = {0,2} => s=[1,0,1,0] > > f(A,S) = [[2,2,2,2],[0,1,0,0],[4,4,4,4],[0,0,0,1]] here you have the rows changed ? > > Which is the best way to compute f? similar pattern as before. I think using boolean array as selector might be the easiest >>> s=np.array([1,0,1,0],bool) >>> s array([ True, False, True, False], dtype=bool) >>> B = np.eye(n,dtype=int) >>> A array([[2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4], [5, 5, 5, 5]]) >>> B[:,s] = A[:,s] >>> B array([[2, 0, 2, 0], [3, 1, 3, 0], [4, 0, 4, 0], [5, 0, 5, 1]]) Josef > > Thanks again. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gberbeglia at gmail.com Mon Mar 15 11:09:35 2010 From: gberbeglia at gmail.com (gerardo.berbeglia) Date: Mon, 15 Mar 2010 08:09:35 -0700 (PDT) Subject: [Numpy-discussion] matrix operation. Message-ID: <27905661.post@talk.nabble.com> I have another matrix operations which seems a little more complicated. Let A be an n x n matrix and let S be a subset of {0,...,n-1}. Assume S is represented by a binary vector s, with a 1 at the index i if i is in S. (e.g. if S={0,3} then s = [1,0,0,1]) I re-post the question because the example was wrong. 
I would like to have an efficient way to compute the function B = f (A,S) characterized as follows: - For each column i such that i is in S, then the column i of B is equal to the column i of A. - For each column i such that i is NOT in S, then the column i of B is equal to the ith column of the n x n identity matrix. Example. n=4. A = [[2,3,4,5], [3,4,5,6] [4,5,6,7] [5,6,7,8]] S = {0,2} => s=[1,0,1,0] f(A,S) = [[2,0,4,0], [3,1,5,0], [4,0,6,0], [5,0,7,1]] Which is the best way to compute f? Thanks again. -- View this message in context: http://old.nabble.com/matrix-operation.-tp27905661p27905661.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From pgmdevlist at gmail.com Mon Mar 15 12:06:31 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 15 Mar 2010 12:06:31 -0400 Subject: [Numpy-discussion] long integers in genfromtxt In-Reply-To: <1cd32cbb1003131251g14c4901fp7a92e0b4a87a530c@mail.gmail.com> References: <1cd32cbb1003131251g14c4901fp7a92e0b4a87a530c@mail.gmail.com> Message-ID: On Mar 13, 2010, at 3:51 PM, josef.pktd at gmail.com wrote: > I was trying to find out what the "helpful" message > "TypeError: expected a readable buffer object" means > > and it seems genfromtxt has problems identifying long integers (at > least on Windows 32) > >>>> np.array(4160680000,int) > Traceback (most recent call last): > File "", line 1, in > np.array(4160680000,int) > OverflowError: long int too large to convert to int > That's likely a bug. Please open a ticket and and allocate it to me. From bsouthey at gmail.com Mon Mar 15 12:25:25 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 15 Mar 2010 11:25:25 -0500 Subject: [Numpy-discussion] long integers in genfromtxt In-Reply-To: References: <1cd32cbb1003131251g14c4901fp7a92e0b4a87a530c@mail.gmail.com> Message-ID: <4B9E5F75.6060101@gmail.com> On 03/15/2010 11:06 AM, Pierre GM wrote: > On Mar 13, 2010, at 3:51 PM, josef.pktd at gmail.com wrote: > >> I was trying to find out what the "helpful" message >> "TypeError: expected a readable buffer object" means >> >> and it seems genfromtxt has problems identifying long integers (at >> least on Windows 32) >> >> >>>>> np.array(4160680000,int) >>>>> >> Traceback (most recent call last): >> File "", line 1, in >> np.array(4160680000,int) >> OverflowError: long int too large to convert to int >> >> > > That's likely a bug. > Please open a ticket and and allocate it to me. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, I thought this is a MS windows limitation of only supporting 4 byte integers. 
Bruce From josef.pktd at gmail.com Mon Mar 15 12:34:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Mar 2010 11:34:31 -0500 Subject: [Numpy-discussion] long integers in genfromtxt In-Reply-To: <4B9E5F75.6060101@gmail.com> References: <1cd32cbb1003131251g14c4901fp7a92e0b4a87a530c@mail.gmail.com> <4B9E5F75.6060101@gmail.com> Message-ID: <1cd32cbb1003150934xe2a4822g51cddb8ec1b2cf21@mail.gmail.com> On Mon, Mar 15, 2010 at 11:25 AM, Bruce Southey wrote: > On 03/15/2010 11:06 AM, Pierre GM wrote: >> On Mar 13, 2010, at 3:51 PM, josef.pktd at gmail.com wrote: >> >>> I was trying to find out what the "helpful" message >>> "TypeError: expected a readable buffer object" ?means >>> >>> and it seems genfromtxt has problems identifying long integers (at >>> least on Windows 32) >>> >>> >>>>>> np.array(4160680000,int) >>>>>> >>> Traceback (most recent call last): >>> ? File "", line 1, in >>> ? ? np.array(4160680000,int) >>> OverflowError: long int too large to convert to int >>> >>> >> >> That's likely a bug. >> Please open a ticket and and allocate it to me. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > Hi, > I thought this is a MS windows limitation of only supporting 4 byte > integers. at least a helpful error message would be useful int64 works (once I know what the problem is) >>> np.array(4160680000,np.int64) array(4160680000L, dtype=int64) >>> dt= [('','S10'),('',float),('',float),('',float),('',float),('',np.int64),('',float)] >>> sh = StringIO(s) >>> data = np.genfromtxt(sh, delimiter=",", dtype=dt, names=True) >>> data array([ ('2010-02-12', 1075.95, 1077.8099999999999, 1062.97, 1075.51, 4160680000L, 1075.51), ('2010-02-11', 1067.0999999999999, 1080.04, 1060.5899999999999, 1078.47, 4400870000L, 1078.47)], dtype=[('Date', '|S10'), ('Open', '>> Thanks Pierre, http://projects.scipy.org/numpy/ticket/1428 Josef > > Bruce > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wfspotz at sandia.gov Mon Mar 15 13:42:50 2010 From: wfspotz at sandia.gov (Bill Spotz) Date: Mon, 15 Mar 2010 13:42:50 -0400 Subject: [Numpy-discussion] Numpi array and typemap In-Reply-To: References: Message-ID: <0A6DBCC2-CB3A-4410-891D-35E832FA88D9@sandia.gov> Michel, A couple observations: 1) I don't see anything in your interface file for which numpy.i would be helpful. Is that for something else not shown? 2) When you call PyArray_SimpleNewFromData(), the data buffer pointer argument should be cast to (void*). I am surprised the compiler didn't complain, and if it didn't, the results might be unpredictable. 3) Your getContainer() prototype has inconsistent return type in the two places you declare it. One way to avoid this with functions you define in your interface file AND want to be wrapped is with %inline: %inline template Container getContainer() { ... } 4) Are you absolutely sure that size_t is unsigned int on your system? If none of these help, you will need to debug. First by looking at the generated code in, I presume, Foo_wrap.cxx, then maybe with print statements added directly to the wrapper, and then with a real debugger. When I compile with debugging symbol tables, I can get "gdb --args python foo_script.py" to work. 
On Mar 15, 2010, at 4:45 AM, Michel Dupront wrote: > Hello, > > I am trying to do what was suppose to be a very easy > exercice. It consists in converting a C++ class Container into a > numpy array as show below using typename > My problem is that if I make the method getContainer run from > python I will get a numpy array as expected by with wrong values. > I guess I have a memory problem but I do not understand exactly > wwhat is going on. > I was hoping that somebody could help me with that and maybe > tell me how to use the PyArray_SimpleNewFromData macro. > > Thanks > > > %module Foo > %{ > #define SWIG_FILE_WITH_INIT > #include "container.h" > template Container getContainer() > { > ... C Container construction that will have a pointer to > (1,2,3,4,5)... > return C; > } > %} > > %typedef unsigned int size_t; > %include "numpy.i" > %init %{ > import_array(); > %} > > template class Container > { > public: > Container(size_t length, T* pfirst_value); > size_t lentgh() { ... return the lentgh ...} > T* ptr() { ... return pointer to the first value ... } > }; > > %template(cN) Container; > > %typemap(python,out) Container > { > npy_intp dims[] = { 5 }; > PyObject* obj = PyArray_SimpleNewFromData (1,dims,PyArray_UINT, > $1.ptr()); > $result = obj; > } > > template navlib::array getContainer(); > %template(getContainerN ) getContainer; ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From gberbeglia at gmail.com Mon Mar 15 14:20:25 2010 From: gberbeglia at gmail.com (gerardo.berbeglia) Date: Mon, 15 Mar 2010 11:20:25 -0700 (PDT) Subject: [Numpy-discussion] basic numpy array operation Message-ID: <27908290.post@talk.nabble.com> Suppose i have an array A of length n I want a variable b to be an array consisting of the first k elements of A and variable c to be an array with the last n-k elements of A. How do you do this? Example A = np.array([1,2,3,4,5,6]), k = 2 b = [1,2] c=[3,4,5,6] Thanks. -- View this message in context: http://old.nabble.com/basic-numpy-array-operation-tp27908290p27908290.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Mon Mar 15 14:22:38 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 15 Mar 2010 13:22:38 -0500 Subject: [Numpy-discussion] basic numpy array operation In-Reply-To: <27908290.post@talk.nabble.com> References: <27908290.post@talk.nabble.com> Message-ID: <3d375d731003151122u5fd026bfm88b37b520a39ab3a@mail.gmail.com> On Mon, Mar 15, 2010 at 13:20, gerardo.berbeglia wrote: > > Suppose i have an array A of length n > I want a variable b to be an array consisting of the first k elements of A > and variable c to be an array with the last n-k elements of A. > > How do you do this? > > Example A = np.array([1,2,3,4,5,6]), k = 2 > b = [1,2] > c=[3,4,5,6] b = A[:2] c = A[2:] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From z99719 at gmail.com Mon Mar 15 15:12:44 2010 From: z99719 at gmail.com (alex) Date: Mon, 15 Mar 2010 19:12:44 +0000 (UTC) Subject: [Numpy-discussion] problem with applying patch References: <6309d0f61003131106m1d112d92w73b113408bc9d84f@mail.gmail.com> Message-ID: Pauli Virtanen iki.fi> writes: Thank you. 
the problem is solved From friedrichromstedt at gmail.com Mon Mar 15 17:32:31 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 15 Mar 2010 22:32:31 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> Message-ID: Ok, so I send yet another version. Maybe Bruce is right, but I didn't care, because we have fret enough. Now it not only computes something, but also displays something :-( Nicolas, maybe you can now waste some of your time with it? I was curious, both to understand and to get it working, but I failed. I doubt especially the section "Microscopic boundary conditions", because commenting it out makes things, well, say worser. Leaving the other sections out is also not recommendable, but at least not that destructive. I do not understand why in the microscopic boundary section only directions 6 and 7 come into play and not 3. Also I do not understand why they occur in *all* output direction expressions. Furthermore, the fluid, albeit behaving also at the inlet quite strange, bounces back the outlet ... I disabled the obstacle so far, and plotted the 4 direction (downwards), and the resulting ux and uy flows. I give up so far. Friedrich -------------- next part -------------- A non-text attachment was scrubbed... Name: lbmethod.10-03-14.Friedrich.py Type: application/octet-stream Size: 4861 bytes Desc: not available URL: From Nicolas.Rougier at loria.fr Mon Mar 15 17:58:46 2010 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Mon, 15 Mar 2010 22:58:46 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> Message-ID: <33E1A630-D98D-4F47-A63F-B64ECAE53A6F@loria.fr> Thanks and in fact, I already wasted quite some time on and your last version will help me a lot. Unfortunately, I'm not a specialist at lattice Boltzmann methods at all so I'm not able to answer your questions (my initial idea was to convert the matlab script to be have a running example to get some starting point). Also, I found today some computers in the lab to test the matlab version and it seems to run as advertised on the site. I now need to run both versions side by side and to check where are the differences. I will post sources as soon as I get it to run properly. Thanks again. Nicolas On Mar 15, 2010, at 22:32 , Friedrich Romstedt wrote: > Ok, so I send yet another version. Maybe Bruce is right, but I didn't > care, because we have fret enough. Now it not only computes > something, but also displays something :-( > > Nicolas, maybe you can now waste some of your time with it? I was > curious, both to understand and to get it working, but I failed. I > doubt especially the section "Microscopic boundary conditions", > because commenting it out makes things, well, say worser. Leaving the > other sections out is also not recommendable, but at least not that > destructive. I do not understand why in the microscopic boundary > section only directions 6 and 7 come into play and not 3. Also I do > not understand why they occur in *all* output direction expressions. > > Furthermore, the fluid, albeit behaving also at the inlet quite > strange, bounces back the outlet ... > > I disabled the obstacle so far, and plotted the 4 direction > (downwards), and the resulting ux and uy flows. > > I give up so far. 
> > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Mon Mar 15 18:53:12 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Mar 2010 18:53:12 -0400 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: <33E1A630-D98D-4F47-A63F-B64ECAE53A6F@loria.fr> References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> <33E1A630-D98D-4F47-A63F-B64ECAE53A6F@loria.fr> Message-ID: <1cd32cbb1003151553y257c1dddse6dfe50c9b92043a@mail.gmail.com> On Mon, Mar 15, 2010 at 5:58 PM, Nicolas Rougier wrote: > > > Thanks and in fact, I already wasted quite some time on and your last version will help me a lot. Unfortunately, I'm not a specialist at lattice Boltzmann methods at all so I'm not able to answer your questions (my initial idea was to convert the matlab script to be have a running example to get some starting point). Also, I found today some computers in the lab to test the matlab version and it seems to run as advertised on the site. I ?now need to run both versions side by side and to check where are the differences. I will post sources as soon as I get it to run properly. > > Thanks again. > > > Nicolas > > > > > > On Mar 15, 2010, at 22:32 , Friedrich Romstedt wrote: > >> Ok, so I send yet another version. ?Maybe Bruce is right, but I didn't >> care, because we have fret enough. ?Now it not only computes >> something, but also displays something :-( >> >> Nicolas, maybe you can now waste some of your time with it? ?I was >> curious, both to understand and to get it working, but I failed. ?I >> doubt especially the section "Microscopic boundary conditions", >> because commenting it out makes things, well, say worser. ?Leaving the >> other sections out is also not recommendable, but at least not that >> destructive. ?I do not understand why in the microscopic boundary >> section only directions 6 and 7 come into play and not 3. ?Also I do >> not understand why they occur in *all* output direction expressions. >> >> Furthermore, the fluid, albeit behaving also at the inlet quite >> strange, bounces back the outlet ... >> >> I disabled the obstacle so far, and plotted the 4 direction >> (downwards), and the resulting ux and uy flows. >> >> I give up so far. >> >> Friedrich >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I'm surprised you can translate matlab without have dot products showing up all over. I didn't really look at the code, but aren't these dot products ? # % COLLISION STEP # for i=1:9 # cu = 3*(cx(i)*ux+cy(i)*uy); I usually put print shape/size inside the code for debugging until I'm sure about correct shapes Josef From general.mooney at googlemail.com Tue Mar 16 07:05:25 2010 From: general.mooney at googlemail.com (=?UTF-8?Q?Ciar=C3=A1n_Mooney?=) Date: Tue, 16 Mar 2010 11:05:25 +0000 Subject: [Numpy-discussion] [ANN] EuroPython 2010 registration and talk submissions now open! 
Message-ID: <3e4e51a81003160405m7e0c64fay7eea9b43b7de3ee6@mail.gmail.com> EuroPython 2010 - 17th to 24th July 2010 ---------------------------------------- EuroPython is a conference for the Python programming language community, including the Django, Zope and Plone communities. It is aimed at everyone in the Python community, of all skill levels, both users and programmers. Last year's conference was the largest open source conference in the UK and one of the largest community organised software conferences in Europe. This year EuroPython will be held from the 17th to 24th July in Birmingham, UK. It will include over 100 talks, tutorials, sprints and social events. Registration ------------ Registration is open now at: http://www.europython.eu/registration/ For the best registration rates, book as soon as you can! Extra Early Bird closes soon, after which normal Early Bird rate will apply until 10th May Talks, Activities and Events ---------------------------- Do you have something you wish to present at EuroPython? You want to give a talk, run a tutorial or sprint? Go to http://www.europython.eu/talks/cfp/ for information and advice! Go to http://wiki.europython.eu/Sprints to plan a sprint! Help Us Out ----------- EuroPython is run by volunteers, like you! We could use a hand, and any contribution is welcome. Go to http://wiki.europython.eu/Helping to join us! Go to http://www.europython.eu/contact/ to contact us directly! Sponsors -------- Sponsoring EuroPython is a unique opportunity to affiliate with this prestigious conference and to reach a large number of Python users from computing professionals to academics, from entrepreneurs to motivated and well-educated job seekers. http://www.europython.eu/sponsors/ Spread the Word --------------- We are a community-run not-for-profit conference. Please help to spread the word by distributing this announcement to colleagues, project mailing lists, friends, your blog, Web site, and through your social networking connections. Take a look at our publicity resources: http://wiki.europython.eu/Publicity General Information ------------------- For more information about the conference, please visit the official site: http://www.europython.eu/ Looking forward to see you! The EuroPython Team From jorgesmbox-ml at yahoo.es Tue Mar 16 08:00:59 2010 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Tue, 16 Mar 2010 12:00:59 +0000 (UTC) Subject: [Numpy-discussion] Array mapping question Message-ID: Hi, I have a 1D array containing indexes to specific measurements. As this array is a slice of a bigger one, the indexes don't necessarily start at 0 nor they are sequential. For example, I can have an array A where In [34]: A.shape Out[34]: (4764,) In [35]: ctab = np.unique(A) In [36]: ctab Out[36]: array([48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62], dtype=int64) I would like to map these indexes to a sequence starting from zero. The usual look up table approach doesn't work here. I can solve this using a dictionary, but then I am forced to using a loop or a list comprehension: In [38]: cdic = dict(zip(ctab, range(ctab.size))) In [39]: cdic Out[39]: {48: 0, 49: 1, 50: 2, 51: 3, 52: 4, 53: 5, 54: 6, 55: 7, 56: 8, 57: 9, 58: 10, 59: 11, 60: 12, 61: 13, 62: 14} A_remapped = np.asarray([cdic[x] for x in A]) Am I overlooking a better way of doing this? 
Thanks, Jorge From Sam.Tygier at hep.manchester.ac.uk Tue Mar 16 08:47:23 2010 From: Sam.Tygier at hep.manchester.ac.uk (Sam Tygier) Date: Tue, 16 Mar 2010 12:47:23 +0000 Subject: [Numpy-discussion] Documentation for dtypes with named fields Message-ID: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> Hello I can't find much documentation for using arrays where the dtype has named fields (is there a term for this sort of array). there is some in http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html but that's most creation. for example if i have a = zeros(5, dtype=[('a','f'), ('b','f'), ('c','i')]) and fill it with values. if i want to fill a row, i can do a[0] = (0.1,0.2,3) but if i use a list it wont work a[0] = [0.1,0.2,3] is there an explanation of why? now if i wanted to do some sort of conversion, eg multiply columns 'a' and 'b' by 10, is there a way to do a *= [10,10,1] like i would do with a if were a zeros((5,3)) array is a['a'] *= 10 a['b'] *= 10 the best method? also whats the best way to take a slice of the columns? if i want just the 'a' and 'b' columns. if it were a 2d array i could do a[0:2] but a['a':'b'] does not work. is there a way to do this? thanks Sam From josef.pktd at gmail.com Tue Mar 16 08:54:21 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 08:54:21 -0400 Subject: [Numpy-discussion] Array mapping question In-Reply-To: References: Message-ID: <1cd32cbb1003160554v14061c71k58ad1b9892efdfd@mail.gmail.com> On Tue, Mar 16, 2010 at 8:00 AM, Jorge Scandaliaris wrote: > Hi, > I have a 1D array containing indexes to specific measurements. As this array is > a slice of a bigger one, the indexes don't necessarily start at 0 nor they are > sequential. For example, I can have an array A where > > In [34]: A.shape > Out[34]: (4764,) > In [35]: ctab = np.unique(A) > In [36]: ctab > Out[36]: array([48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62], > dtype=int64) > > I would like to map these indexes to a sequence starting from zero. The usual > look up table approach doesn't work here. I can solve this using a dictionary, > but then I am forced to using a loop or a list comprehension: > > In [38]: cdic = dict(zip(ctab, range(ctab.size))) > In [39]: cdic > Out[39]: > {48: 0, > ?49: 1, > ?50: 2, > ?51: 3, > ?52: 4, > ?53: 5, > ?54: 6, > ?55: 7, > ?56: 8, > ?57: 9, > ?58: 10, > ?59: 11, > ?60: 12, > ?61: 13, > ?62: 14} > > A_remapped = np.asarray([cdic[x] for x in A]) If I understand correctly, then you want return_inverse (the original array recoded to using integers 0...len(ctab)-1 help(np.unique) ... return_inverse : bool, optional If True, also return the indices of the unique array that can be used to reconstruct `ar`. Josef > > Am I overlooking a better way of doing this? 
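A minimal sketch of what return_inverse gives you, with a small made-up array standing in for A (in numpy 1.3 the same option lives on np.unique1d rather than np.unique):

>>> import numpy as np
>>> A = np.array([50, 48, 62, 50, 48])
>>> ctab, A_remapped = np.unique(A, return_inverse=True)
>>> ctab
array([48, 50, 62])
>>> A_remapped
array([1, 0, 2, 1, 0])

A_remapped is exactly the recoding to 0 ... len(ctab)-1, with no dictionary and no Python loop.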
> > Thanks, > > Jorge > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Tue Mar 16 09:10:14 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 16 Mar 2010 09:10:14 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> Message-ID: On Tue, Mar 16, 2010 at 8:47 AM, Sam Tygier wrote: > Hello > > I can't find much documentation for using arrays where the dtype has named fields (is there a term for this sort of array). there is some in Structured array (or record array if you make a record array -- the difference is record array allows access to the fields through attribute). http://docs.scipy.org/doc/numpy/user/basics.rec.html There might be another page, but I can't find it right now. > http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html > http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html > but that's most creation. > > for example if i have > a = zeros(5, dtype=[('a','f'), ('b','f'), ('c','i')]) > and fill it with values. > > if i want to fill a row, i can do > a[0] = (0.1,0.2,3) > but if i use a list it wont work > a[0] = [0.1,0.2,3] > is there an explanation of why? > Structured arrays expect tuples. > now if i wanted to do some sort of conversion, eg multiply columns 'a' and 'b' by 10, is there a way to do > a *= [10,10,1] > like i would do with a if were a zeros((5,3)) array > is > a['a'] *= 10 > a['b'] *= 10 > the best method? > I think so. > also whats the best way to take a slice of the columns? if i want just the 'a' and 'b' columns. if it were a 2d array i could do > a[0:2] > but > a['a':'b'] > does not work. is there a way to do this? > a[['a','b']] Notice the list within a list notation. Skipper From josef.pktd at gmail.com Tue Mar 16 09:14:38 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 09:14:38 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> Message-ID: <1cd32cbb1003160614o66f89e54gbebca177782bc7f3@mail.gmail.com> On Tue, Mar 16, 2010 at 8:47 AM, Sam Tygier wrote: > Hello > > I can't find much documentation for using arrays where the dtype has named fields (is there a term for this sort of array). there is some in > http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html > http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html > but that's most creation. structured arrays, or structured dtypes (similar to recarrays but simpler) http://docs.scipy.org/doc/numpy/user/basics.rec.html http://www.scipy.org/Cookbook/Recarray the best information, I think, is on the mailing list, search for structured array e.g. slicing is not possible, slicing requires a view as regular ndarray there are some slightly hidden (because still work in progress) helper functions >>> import numpy.lib.recfunctions >>> dir(numpy.lib.recfunctions) Josef > > for example if i have > a = zeros(5, dtype=[('a','f'), ('b','f'), ('c','i')]) > and fill it with values. 
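For the filling part, each named field behaves like an ordinary 1d array of its own dtype, so one option is to fill field by field (a small sketch with made-up numbers):

>>> import numpy as np
>>> a = np.zeros(5, dtype=[('a','f'), ('b','f'), ('c','i')])
>>> a['a'] = np.linspace(0.0, 1.0, 5)
>>> a['b'] = 2.5
>>> a['c'] = np.arange(5)
>>> a[0]
(0.0, 2.5, 0)

Row-wise assignment is your next question just below.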
> > if i want to fill a row, i can do > a[0] = (0.1,0.2,3) > but if i use a list it wont work > a[0] = [0.1,0.2,3] > is there an explanation of why? > > now if i wanted to do some sort of conversion, eg multiply columns 'a' and 'b' by 10, is there a way to do > a *= [10,10,1] > like i would do with a if were a zeros((5,3)) array > is > a['a'] *= 10 > a['b'] *= 10 > the best method? > > also whats the best way to take a slice of the columns? if i want just the 'a' and 'b' columns. if it were a 2d array i could do > a[0:2] > but > a['a':'b'] > does not work. is there a way to do this? > > thanks > > Sam > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jorgesmbox-ml at yahoo.es Tue Mar 16 09:33:40 2010 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Tue, 16 Mar 2010 13:33:40 +0000 (UTC) Subject: [Numpy-discussion] Array mapping question References: <1cd32cbb1003160554v14061c71k58ad1b9892efdfd@mail.gmail.com> Message-ID: gmail.com> writes: > > If I understand correctly, then you want return_inverse (the original > array recoded to using integers 0...len(ctab)-1 > > Josef Right, thanks! I didn't see this cause I use numpy 1.3, where this is not available. Jorge From josef.pktd at gmail.com Tue Mar 16 09:42:01 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 09:42:01 -0400 Subject: [Numpy-discussion] Array mapping question In-Reply-To: References: <1cd32cbb1003160554v14061c71k58ad1b9892efdfd@mail.gmail.com> Message-ID: <1cd32cbb1003160642g3f76ec85l634c055a4d4e6533@mail.gmail.com> On Tue, Mar 16, 2010 at 9:33 AM, Jorge Scandaliaris wrote: > ? gmail.com> writes: > >> >> If I understand correctly, then you want return_inverse ?(the original >> array recoded to using integers 0...len(ctab)-1 >> > >> Josef > > Right, thanks! I didn't see this cause I use numpy 1.3, where this is > not available. If I remember correctly it was unique1d in numpy 1.3 that had the return_inverse options. Check the functions in arraysetops for numpy 1.3. unique1d is in my help file for numpy 1.2 (which was the fastest for me to look up) Josef > > Jorge > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gberbeglia at gmail.com Tue Mar 16 09:43:41 2010 From: gberbeglia at gmail.com (gerardo.berbeglia) Date: Tue, 16 Mar 2010 06:43:41 -0700 (PDT) Subject: [Numpy-discussion] How to fix the diagonal values of a matrix Message-ID: <27917991.post@talk.nabble.com> How can i take out the diagonal values of a matrix and fix them to zero? Example: input: [[2,3,4],[3,4,5],[4,5,6]] output: [[0,3,4],[3,0,5],[4,5,0]] Thanks. -- View this message in context: http://old.nabble.com/How-to-fix-the-diagonal-values-of-a-matrix-tp27917991p27917991.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From josef.pktd at gmail.com Tue Mar 16 09:56:45 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 09:56:45 -0400 Subject: [Numpy-discussion] How to fix the diagonal values of a matrix In-Reply-To: <27917991.post@talk.nabble.com> References: <27917991.post@talk.nabble.com> Message-ID: <1cd32cbb1003160656j43f27dc2o4049fa85992930e0@mail.gmail.com> On Tue, Mar 16, 2010 at 9:43 AM, gerardo.berbeglia wrote: > > How can i take out the diagonal values of a matrix and fix them to zero? 
> > Example: > > input: [[2,3,4],[3,4,5],[4,5,6]] > > output: [[0,3,4],[3,0,5],[4,5,0]] assuming a is square a[range(len(a)),range(len(a))] = 0 see also np.diag Josef > > Thanks. > -- > View this message in context: http://old.nabble.com/How-to-fix-the-diagonal-values-of-a-matrix-tp27917991p27917991.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Tue Mar 16 10:43:24 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 16 Mar 2010 06:43:24 -0800 Subject: [Numpy-discussion] How to fix the diagonal values of a matrix In-Reply-To: <1cd32cbb1003160656j43f27dc2o4049fa85992930e0@mail.gmail.com> References: <27917991.post@talk.nabble.com> <1cd32cbb1003160656j43f27dc2o4049fa85992930e0@mail.gmail.com> Message-ID: On Tue, Mar 16, 2010 at 5:56 AM, wrote: > On Tue, Mar 16, 2010 at 9:43 AM, gerardo.berbeglia wrote: >> >> How can i take out the diagonal values of a matrix and fix them to zero? >> >> Example: >> >> input: [[2,3,4],[3,4,5],[4,5,6]] >> >> output: [[0,3,4],[3,0,5],[4,5,0]] > > assuming a is square > > a[range(len(a)),range(len(a))] = 0 > > see also np.diag > > Josef Or, if you need speed, here's the fast way: a.flat[::4] = 0 or more generally a.flat[::a.shape[0]+1] = 0 From Sam.Tygier at hep.manchester.ac.uk Tue Mar 16 11:20:02 2010 From: Sam.Tygier at hep.manchester.ac.uk (Sam Tygier) Date: Tue, 16 Mar 2010 15:20:02 +0000 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> Message-ID: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> Thanks for those responses. could the dtype pages in the numpy reference link to the basics.rec page in the user guide? there seem to be some gotchas in list within a list notation. if i have a = array([0,0.1,0.2,0.3,0.4]) b = array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) then >>> a[[0,1,4]] array([ 0. , 0.1, 0.4]) >>> a[[4,1,0]] array([ 0.4, 0.1, 0. ]) but >>> b[['a','b','f']] (0.0, 0.10000000149011612, 0.40000000596046448) >>> b[['f','b','a']] (0.0, 0.10000000149011612, 0.40000000596046448) so i always get the vales back in the original order. is the by design, or a bug? thanks Sam From jsseabold at gmail.com Tue Mar 16 11:34:12 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 16 Mar 2010 11:34:12 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> Message-ID: On Tue, Mar 16, 2010 at 11:20 AM, Sam Tygier wrote: > Thanks for those responses. > > could the dtype pages in the numpy reference link to the basics.rec page in the user guide? > > there seem to be some gotchas in list within a list notation. > > if i have > a = array([0,0.1,0.2,0.3,0.4]) > b = ?array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) > > then >>>> a[[0,1,4]] > array([ 0. , ?0.1, ?0.4]) >>>> a[[4,1,0]] > array([ 0.4, ?0.1, ?0. 
]) > > but >>>> b[['a','b','f']] > (0.0, 0.10000000149011612, 0.40000000596046448) >>>> b[['f','b','a']] > (0.0, 0.10000000149011612, 0.40000000596046448) > > so i always get the vales back in the original order. is the by design, or a bug? > I've been bitten by this before too and asked the same question with no response. I think it's just a limitation of the design of structured arrays. Skipper From jorgesmbox-ml at yahoo.es Tue Mar 16 11:35:52 2010 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Tue, 16 Mar 2010 15:35:52 +0000 (UTC) Subject: [Numpy-discussion] Array mapping question References: <1cd32cbb1003160554v14061c71k58ad1b9892efdfd@mail.gmail.com> <1cd32cbb1003160642g3f76ec85l634c055a4d4e6533@mail.gmail.com> Message-ID: gmail.com> writes: > If I remember correctly it was unique1d in numpy 1.3 that had the > return_inverse options. > > Check the functions in arraysetops for numpy 1.3. > unique1d is in my help file for numpy 1.2 (which was the fastest for > me to look up) > You're right, again. unique1d is in numpy 1.3 Thanks, Jorge From josef.pktd at gmail.com Tue Mar 16 11:46:24 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 11:46:24 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> Message-ID: <1cd32cbb1003160846i2e137357ub1e7fce59c80c471@mail.gmail.com> On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: > On Tue, Mar 16, 2010 at 11:20 AM, Sam Tygier > wrote: >> Thanks for those responses. >> >> could the dtype pages in the numpy reference link to the basics.rec page in the user guide? >> >> there seem to be some gotchas in list within a list notation. >> >> if i have >> a = array([0,0.1,0.2,0.3,0.4]) >> b = ?array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) >> >> then >>>>> a[[0,1,4]] >> array([ 0. , ?0.1, ?0.4]) >>>>> a[[4,1,0]] >> array([ 0.4, ?0.1, ?0. ]) >> >> but >>>>> b[['a','b','f']] >> (0.0, 0.10000000149011612, 0.40000000596046448) >>>>> b[['f','b','a']] >> (0.0, 0.10000000149011612, 0.40000000596046448) >> >> so i always get the vales back in the original order. is the by design, or a bug? >> > > I've been bitten by this before too and asked the same question with > no response. ?I think it's just a limitation of the design of > structured arrays. It might be by historical design, structured arrays are not really designed for slicing but I think more like sets of variables. But it means it cannot be used directly for the old pattern [arr(name) for name in listofnames] Skipper, Is this subset selection documented anywhere? I only know about it because you showed the example. 
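For controlling the order, one workaround is to go field by field, i.e. the old pattern above (a small sketch, reusing Sam's array b):

names = ['f', 'b', 'a']
sub = b[names]                       # comes back in dtype order, not in the order of names
cols = [b[name] for name in names]   # one field at a time respects the requested order

so the list-of-fields indexing is handy for taking a subset, but not (currently) for reordering.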
Josef > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Tue Mar 16 11:46:18 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 16 Mar 2010 11:46:18 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> Message-ID: On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: > On Tue, Mar 16, 2010 at 11:20 AM, Sam Tygier > wrote: >> Thanks for those responses. >> >> could the dtype pages in the numpy reference link to the basics.rec page in the user guide? >> >> there seem to be some gotchas in list within a list notation. >> >> if i have >> a = array([0,0.1,0.2,0.3,0.4]) >> b = ?array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) >> >> then >>>>> a[[0,1,4]] >> array([ 0. , ?0.1, ?0.4]) >>>>> a[[4,1,0]] >> array([ 0.4, ?0.1, ?0. ]) >> >> but >>>>> b[['a','b','f']] >> (0.0, 0.10000000149011612, 0.40000000596046448) >>>>> b[['f','b','a']] >> (0.0, 0.10000000149011612, 0.40000000596046448) >> >> so i always get the vales back in the original order. is the by design, or a bug? >> > > I've been bitten by this before too and asked the same question with > no response. ?I think it's just a limitation of the design of > structured arrays. > I added an example of this to: http://docs.scipy.org/numpy/docs/numpy.doc.structured_arrays/ Skipper From jsseabold at gmail.com Tue Mar 16 11:52:45 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 16 Mar 2010 11:52:45 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <1cd32cbb1003160846i2e137357ub1e7fce59c80c471@mail.gmail.com> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> <1cd32cbb1003160846i2e137357ub1e7fce59c80c471@mail.gmail.com> Message-ID: On Tue, Mar 16, 2010 at 11:46 AM, wrote: > On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: >> On Tue, Mar 16, 2010 at 11:20 AM, Sam Tygier >> wrote: >>> Thanks for those responses. >>> >>> could the dtype pages in the numpy reference link to the basics.rec page in the user guide? >>> >>> there seem to be some gotchas in list within a list notation. >>> >>> if i have >>> a = array([0,0.1,0.2,0.3,0.4]) >>> b = ?array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) >>> >>> then >>>>>> a[[0,1,4]] >>> array([ 0. , ?0.1, ?0.4]) >>>>>> a[[4,1,0]] >>> array([ 0.4, ?0.1, ?0. ]) >>> >>> but >>>>>> b[['a','b','f']] >>> (0.0, 0.10000000149011612, 0.40000000596046448) >>>>>> b[['f','b','a']] >>> (0.0, 0.10000000149011612, 0.40000000596046448) >>> >>> so i always get the vales back in the original order. is the by design, or a bug? >>> >> >> I've been bitten by this before too and asked the same question with >> no response. ?I think it's just a limitation of the design of >> structured arrays. > > It might be by historical design, structured arrays are not really > designed for slicing but I think more like sets of variables. > > But it means it cannot be used directly for the old pattern > > [arr(name) for name in listofnames] > > Skipper, Is this subset selection documented anywhere? 
I only know > about it because you showed the example. > Just added it and a link to the cookbook for recarrays. I don't think it will show up until the doc wiki changes are applied(?). http://docs.scipy.org/numpy/docs/numpy.doc.structured_arrays/ Skipper From Sam.Tygier at hep.manchester.ac.uk Tue Mar 16 12:04:14 2010 From: Sam.Tygier at hep.manchester.ac.uk (Sam Tygier) Date: Tue, 16 Mar 2010 16:04:14 +0000 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk>, <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> Message-ID: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: >> so i always get the vales back in the original order. is the by design, or a bug? >> > > I've been bitten by this before too and asked the same question with > no response. I think it's just a limitation of the design of > structured arrays. i had a hunt for the code. and it seems easy to fix. its in numpy/core/_internal.py:301 --- numpy/core/_internal.py 2010-03-16 16:01:28.000000000 +0000 +++ numpy/core/_internal.py.old 2010-03-16 16:00:52.000000000 +0000 @@ -298,7 +298,7 @@ def _index_fields(ary, fields): from multiarray import empty, dtype dt = ary.dtype - new_dtype = [(name, dt[name]) for name in fields if name in dt.names] + new_dtype = [(name, dt[name]) for name in dt.names if name in fields] if ary.flags.f_contiguous: order = 'F' else: From josef.pktd at gmail.com Tue Mar 16 12:08:37 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 12:08:37 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> <1cd32cbb1003160846i2e137357ub1e7fce59c80c471@mail.gmail.com> Message-ID: <1cd32cbb1003160908i1b77ce39r9626c844f8948e96@mail.gmail.com> On Tue, Mar 16, 2010 at 11:52 AM, Skipper Seabold wrote: > On Tue, Mar 16, 2010 at 11:46 AM, ? wrote: >> On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: >>> On Tue, Mar 16, 2010 at 11:20 AM, Sam Tygier >>> wrote: >>>> Thanks for those responses. >>>> >>>> could the dtype pages in the numpy reference link to the basics.rec page in the user guide? >>>> >>>> there seem to be some gotchas in list within a list notation. >>>> >>>> if i have >>>> a = array([0,0.1,0.2,0.3,0.4]) >>>> b = ?array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) >>>> >>>> then >>>>>>> a[[0,1,4]] >>>> array([ 0. , ?0.1, ?0.4]) >>>>>>> a[[4,1,0]] >>>> array([ 0.4, ?0.1, ?0. ]) >>>> >>>> but >>>>>>> b[['a','b','f']] >>>> (0.0, 0.10000000149011612, 0.40000000596046448) >>>>>>> b[['f','b','a']] >>>> (0.0, 0.10000000149011612, 0.40000000596046448) >>>> >>>> so i always get the vales back in the original order. is the by design, or a bug? >>>> >>> >>> I've been bitten by this before too and asked the same question with >>> no response. ?I think it's just a limitation of the design of >>> structured arrays. >> >> It might be by historical design, structured arrays are not really >> designed for slicing but I think more like sets of variables. 
>> >> But it means it cannot be used directly for the old pattern >> >> [arr(name) for name in listofnames] >> >> Skipper, Is this subset selection documented anywhere? I only know >> about it because you showed the example. >> > > Just added it and a link to the cookbook for recarrays. ?I don't think > it will show up until the doc wiki changes are applied(?). > > http://docs.scipy.org/numpy/docs/numpy.doc.structured_arrays/ looks good, together with the cookbook on .view() it almost covers the FAQs for structured arrays I changed "OK to apply:" to Yes so it will get into the docs soon Josef > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Mar 16 12:11:44 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 12:11:44 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <1cd32cbb1003160908i1b77ce39r9626c844f8948e96@mail.gmail.com> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> <1cd32cbb1003160846i2e137357ub1e7fce59c80c471@mail.gmail.com> <1cd32cbb1003160908i1b77ce39r9626c844f8948e96@mail.gmail.com> Message-ID: <1cd32cbb1003160911m2881b617sa821ca558dd0a5b1@mail.gmail.com> On Tue, Mar 16, 2010 at 12:08 PM, wrote: > On Tue, Mar 16, 2010 at 11:52 AM, Skipper Seabold wrote: >> On Tue, Mar 16, 2010 at 11:46 AM, ? wrote: >>> On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: >>>> On Tue, Mar 16, 2010 at 11:20 AM, Sam Tygier >>>> wrote: >>>>> Thanks for those responses. >>>>> >>>>> could the dtype pages in the numpy reference link to the basics.rec page in the user guide? >>>>> >>>>> there seem to be some gotchas in list within a list notation. >>>>> >>>>> if i have >>>>> a = array([0,0.1,0.2,0.3,0.4]) >>>>> b = ?array((0,0.1,0.2,0.3,0.4), dtype=[('a','f'), ('b','f'), ('c','f'), ('d','f'),('f','f')]) >>>>> >>>>> then >>>>>>>> a[[0,1,4]] >>>>> array([ 0. , ?0.1, ?0.4]) >>>>>>>> a[[4,1,0]] >>>>> array([ 0.4, ?0.1, ?0. ]) >>>>> >>>>> but >>>>>>>> b[['a','b','f']] >>>>> (0.0, 0.10000000149011612, 0.40000000596046448) >>>>>>>> b[['f','b','a']] >>>>> (0.0, 0.10000000149011612, 0.40000000596046448) >>>>> >>>>> so i always get the vales back in the original order. is the by design, or a bug? >>>>> >>>> >>>> I've been bitten by this before too and asked the same question with >>>> no response. ?I think it's just a limitation of the design of >>>> structured arrays. >>> >>> It might be by historical design, structured arrays are not really >>> designed for slicing but I think more like sets of variables. >>> >>> But it means it cannot be used directly for the old pattern >>> >>> [arr(name) for name in listofnames] >>> >>> Skipper, Is this subset selection documented anywhere? I only know >>> about it because you showed the example. >>> >> >> Just added it and a link to the cookbook for recarrays. ?I don't think >> it will show up until the doc wiki changes are applied(?). 
>> >> http://docs.scipy.org/numpy/docs/numpy.doc.structured_arrays/ > > looks good, together with the cookbook on .view() it almost covers the > FAQs for structured arrays > > I changed "OK to apply:" to ?Yes ?so it will get into the docs soon I also changed the aka to and, to avoid the confusion between recarrays and structured arrays (another FAQ) Structured Arrays (aka Record Arrays) to Structured Arrays (and Record Arrays) Josef > > Josef >> >> Skipper >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From josef.pktd at gmail.com Tue Mar 16 12:23:48 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 12:23:48 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> Message-ID: <1cd32cbb1003160923v53e5e8cale58d6ce128db06a1@mail.gmail.com> On Tue, Mar 16, 2010 at 12:04 PM, Sam Tygier wrote: > On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: >>> so i always get the vales back in the original order. is the by design, or a bug? >>> >> >> I've been bitten by this before too and asked the same question with >> no response. ?I think it's just a limitation of the design of >> structured arrays. > > i had a hunt for the code. and it seems easy to fix. > its in numpy/core/_internal.py:301 > > --- numpy/core/_internal.py ? ? 2010-03-16 16:01:28.000000000 +0000 > +++ numpy/core/_internal.py.old 2010-03-16 16:00:52.000000000 +0000 > @@ -298,7 +298,7 @@ > ?def _index_fields(ary, fields): > ? ? from multiarray import empty, dtype > ? ? dt = ary.dtype > - ? ?new_dtype = [(name, dt[name]) for name in fields if name in dt.names] > + ? ?new_dtype = [(name, dt[name]) for name in dt.names if name in fields] > ? ? if ary.flags.f_contiguous: > ? ? ? ? order = 'F' > ? ? else: You can file a ticket, but if this is a function that is already in real use, then it would be an unpleasant break in the API Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Tue Mar 16 12:29:29 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 16 Mar 2010 12:29:29 -0400 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <1cd32cbb1003160923v53e5e8cale58d6ce128db06a1@mail.gmail.com> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> <1cd32cbb1003160923v53e5e8cale58d6ce128db06a1@mail.gmail.com> Message-ID: On Tue, Mar 16, 2010 at 12:23 PM, wrote: > On Tue, Mar 16, 2010 at 12:04 PM, Sam Tygier > wrote: >> On Tue, Mar 16, 2010 at 11:34 AM, Skipper Seabold wrote: >>>> so i always get the vales back in the original order. is the by design, or a bug? >>>> >>> >>> I've been bitten by this before too and asked the same question with >>> no response. ?I think it's just a limitation of the design of >>> structured arrays. >> >> i had a hunt for the code. 
and it seems easy to fix. >> its in numpy/core/_internal.py:301 >> >> --- numpy/core/_internal.py ? ? 2010-03-16 16:01:28.000000000 +0000 >> +++ numpy/core/_internal.py.old 2010-03-16 16:00:52.000000000 +0000 >> @@ -298,7 +298,7 @@ >> ?def _index_fields(ary, fields): >> ? ? from multiarray import empty, dtype >> ? ? dt = ary.dtype >> - ? ?new_dtype = [(name, dt[name]) for name in fields if name in dt.names] >> + ? ?new_dtype = [(name, dt[name]) for name in dt.names if name in fields] >> ? ? if ary.flags.f_contiguous: >> ? ? ? ? order = 'F' >> ? ? else: Nice! That works for me. > You can file a ticket, but if this is a function that is already in > real use, then it would be an unpleasant break in the API > Yeah, I would have to change some code around, but I think this would be a worthwhile enhancement. Also worth noting that this wasn't documented anywhere. I only knew about it because Travis pointed it out on the list once. Skipper From Sam.Tygier at hep.manchester.ac.uk Tue Mar 16 12:45:10 2010 From: Sam.Tygier at hep.manchester.ac.uk (Sam Tygier) Date: Tue, 16 Mar 2010 16:45:10 +0000 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk>, <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk>, <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> Message-ID: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D7@exchange.hep.manchester.ac.uk> On Tue, Mar 16, 2010 at 12:23 PM, gmail.com> wrote: > You can file a ticket, but if this is a function that is already in > real use, then it would be an unpleasant break in the API done http://projects.scipy.org/numpy/ticket/1431 thanks Sam From d.l.goldsmith at gmail.com Tue Mar 16 13:05:34 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 16 Mar 2010 10:05:34 -0700 Subject: [Numpy-discussion] Documentation for dtypes with named fields In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D7@exchange.hep.manchester.ac.uk> References: <75828F93B3A771439C6F9F4B67CFE58B046A7B64D1@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D3@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D6@exchange.hep.manchester.ac.uk> <75828F93B3A771439C6F9F4B67CFE58B046A7B64D7@exchange.hep.manchester.ac.uk> Message-ID: <45d1ab481003161005t1bc81418x991e5132a4e01ad3@mail.gmail.com> On Tue, Mar 16, 2010 at 9:45 AM, Sam Tygier wrote: > On Tue, Mar 16, 2010 at 12:23 PM, ? gmail.com> wrote: >> You can file a ticket, but if this is a function that is already in >> real use, then it would be an unpleasant break in the API > > done > http://projects.scipy.org/numpy/ticket/1431 > > thanks > > Sam > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Thanks, guys! Rec/struc arrays are in good hands, DG From kwgoodman at gmail.com Tue Mar 16 13:57:32 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 16 Mar 2010 10:57:32 -0700 Subject: [Numpy-discussion] How to fix the diagonal values of a matrix In-Reply-To: References: <27917991.post@talk.nabble.com> <1cd32cbb1003160656j43f27dc2o4049fa85992930e0@mail.gmail.com> Message-ID: On Tue, Mar 16, 2010 at 7:43 AM, Keith Goodman wrote: > On Tue, Mar 16, 2010 at 5:56 AM, ? 
wrote: >> On Tue, Mar 16, 2010 at 9:43 AM, gerardo.berbeglia wrote: >>> >>> How can i take out the diagonal values of a matrix and fix them to zero? >>> >>> Example: >>> >>> input: [[2,3,4],[3,4,5],[4,5,6]] >>> >>> output: [[0,3,4],[3,0,5],[4,5,0]] >> >> assuming a is square >> >> a[range(len(a)),range(len(a))] = 0 >> >> see also np.diag >> >> Josef > > Or, if you need speed, here's the fast way: > > a.flat[::4] = 0 > > or more generally > > a.flat[::a.shape[0]+1] = 0 Oh, I see that fill_diagonal is in numpy 1.4. So: >> a = np.array([[2,3,4],[3,4,5],[4,5,6]]) >> np.fill_diagonal(a, 0) >> a array([[0, 3, 4], [3, 0, 5], [4, 5, 0]]) From josef.pktd at gmail.com Tue Mar 16 15:39:52 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Mar 2010 15:39:52 -0400 Subject: [Numpy-discussion] How to fix the diagonal values of a matrix In-Reply-To: References: <27917991.post@talk.nabble.com> <1cd32cbb1003160656j43f27dc2o4049fa85992930e0@mail.gmail.com> Message-ID: <1cd32cbb1003161239p52dc2cf0y57b1c8d9e17b0a37@mail.gmail.com> On Tue, Mar 16, 2010 at 1:57 PM, Keith Goodman wrote: > On Tue, Mar 16, 2010 at 7:43 AM, Keith Goodman wrote: >> On Tue, Mar 16, 2010 at 5:56 AM, ? wrote: >>> On Tue, Mar 16, 2010 at 9:43 AM, gerardo.berbeglia wrote: >>>> >>>> How can i take out the diagonal values of a matrix and fix them to zero? >>>> >>>> Example: >>>> >>>> input: [[2,3,4],[3,4,5],[4,5,6]] >>>> >>>> output: [[0,3,4],[3,0,5],[4,5,0]] >>> >>> assuming a is square >>> >>> a[range(len(a)),range(len(a))] = 0 >>> >>> see also np.diag >>> >>> Josef >> >> Or, if you need speed, here's the fast way: >> >> a.flat[::4] = 0 >> >> or more generally >> >> a.flat[::a.shape[0]+1] = 0 > > Oh, I see that fill_diagonal is in numpy 1.4. So: > >>> a = np.array([[2,3,4],[3,4,5],[4,5,6]]) >>> np.fill_diagonal(a, 0) >>> a > > array([[0, 3, 4], > ? ? ? [3, 0, 5], > ? ? ? [4, 5, 0]]) it looks like there are a lot of functions in numpy np.source(np.fill_diagonal) shows it's the same as your previous version this (seems to) work for non-square nd arrays with ndim>1: >>> a = np.ones((2,3,4)) >>> a[[range(min(a.shape))]*a.ndim] = 0 >>> a array([[[ 0., 1., 1., 1.], [ 1., 1., 1., 1.], [ 1., 1., 1., 1.]], [[ 1., 1., 1., 1.], [ 1., 0., 1., 1.], [ 1., 1., 1., 1.]]]) >>> a = np.ones((4,5)) >>> a[[range(min(a.shape))]*a.ndim] = 0 >>> a array([[ 0., 1., 1., 1., 1.], [ 1., 0., 1., 1., 1.], [ 1., 1., 0., 1., 1.], [ 1., 1., 1., 0., 1.]]) Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From friedrichromstedt at gmail.com Tue Mar 16 17:01:44 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 16 Mar 2010 22:01:44 +0100 Subject: [Numpy-discussion] Some help on matlab to numpy translation In-Reply-To: <33E1A630-D98D-4F47-A63F-B64ECAE53A6F@loria.fr> References: <21FAD255-B245-4A07-94DC-F08BD4E8350D@loria.fr> <5F63F799-8E62-4335-B341-65F4CBF8FBD0@loria.fr> <33E1A630-D98D-4F47-A63F-B64ECAE53A6F@loria.fr> Message-ID: Ok, maybe can you print shape of the {rho} array as calculated my matlab? I know that sum() in matlab sums over rows (i.e., the first dimension), but I'm curious if it returns for an, say, (10x20) array an (20,) array or an (1, 20) array. And to Josef: cx is an 1d vector, so no. And Hmmm ... most of the operations are .* in matlab, so element-wise multiplication. There is only one matrix product involved, as far as I see, and this has been replaced by a tensordot(...) ... 
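About the sum() question above: a quick check on the numpy side (with a made-up (10, 20) array) shows that the reduction drops the summed axis, so if the matlab result is really (1, 20) we have to put that axis back by hand when porting:

>>> import numpy
>>> rho = numpy.ones((10, 20))
>>> rho.sum(axis=0).shape
(20,)
>>> rho.sum(axis=0)[numpy.newaxis, :].shape
(1, 20)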
So far, Friedrich 2010/3/15 Nicolas Rougier : > Thanks and in fact, I already wasted quite some time on and your last version will help me a lot. Unfortunately, I'm not a specialist at lattice Boltzmann methods at all so I'm not able to answer your questions (my initial idea was to convert the matlab script to be have a running example to get some starting point). Also, I found today some computers in the lab to test the matlab version and it seems to run as advertised on the site. I ?now need to run both versions side by side and to check where are the differences. I will post sources as soon as I get it to run properly. From petertbrady at gmail.com Tue Mar 16 17:32:46 2010 From: petertbrady at gmail.com (Peter Brady) Date: Tue, 16 Mar 2010 14:32:46 -0700 Subject: [Numpy-discussion] f2py compiler version errors Message-ID: <3b6093ef1003161432t6f414bf4g379727483bb7cf9@mail.gmail.com> Hello all, The version of f2py that's installed on our system doesn't appear to handle version numbers correctly. I've attached the relevant output of f2py below: customize IntelFCompiler > Couldn't match compiler version for 'Intel(R) Fortran Intel(R) 64 Compiler > Professional for applications running on Intel(R) 64, Version 11.0 Build > 20090318 \nCopyright (C) 1985-2009 Intel Corporation. All rights > reserved.\nFOR NON-COMMERCIAL USE ONLY\n\n Intel Fortran 11.0-1578' > IntelFCompiler instance properties: > archiver = ['ar', '-cr'] > compile_switch = '-c' > compiler_f77 = ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > 72', '-w90', '-w95', '-KPIC', '-cm', '-O3', '-unroll', > '- > tpp7', '-xW', '-arch SSE2'] > compiler_f90 = ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > FR', '-KPIC', '-cm', '-O3', '-unroll', '-tpp7', '-xW', > '- > arch SSE2'] > compiler_fix = ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > FI', '-KPIC', '-cm', '-O3', '-unroll', '-tpp7', '-xW', > '- > arch SSE2'] > libraries = [] > library_dirs = [] > linker_so = ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > shared', '-tpp7', '-xW', '-arch SSE2'] > object_switch = '-o ' > ranlib = ['ranlib'] > version = None > version_cmd = ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '-FI > -V -c /tmp/tmpx6aZa8__dummy.f -o > /tmp/tmpx6aZa8__dummy.o'] > The output of f2py is: Version: 2_3473 > numpy Version: 1.0.1 > Requires: Python 2.3 or higher. > License: NumPy license (see LICENSE.txt in the NumPy source code) > Copyright 1999 - 2005 Pearu Peterson all rights reserved. > http://cens.ioc.ee/projects/f2py2e/ > We're running 64bit linux with python 2.4. How do I make this work? thanks, Peter. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dlc at halibut.com Tue Mar 16 19:07:53 2010 From: dlc at halibut.com (David Carmean) Date: Tue, 16 Mar 2010 16:07:53 -0700 Subject: [Numpy-discussion] Subclassing ma.MaskedArray Message-ID: <20100316160753.D21089@halibut.com> I understand that ma.MaskedArray is a subclass of ndarray; in addition to the requirements for subclassing the latter, what does ma.MaskedArray add to the list? I.e. what do I have to watch out for? Basically I need a version of Luke Campagnola's MetaArray ( http://www.scipy.org/Cookbook/MetaArray ) that works with masked arrays. My first attempts have failed. Thanks. 
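For reference, the kind of thing I am attempting follows the usual ndarray subclassing recipe carried over to MaskedArray (a minimal sketch only; the `info` attribute is just a placeholder for MetaArray-style metadata, not anything provided by numpy.ma):

import numpy as np

class MetaMaskedArray(np.ma.MaskedArray):

    def __new__(cls, data, info=None, **kwargs):
        # build the masked array first, then attach the extra attribute
        obj = np.ma.MaskedArray.__new__(cls, data, **kwargs)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # let MaskedArray set up _mask, fill_value, etc., then copy our extras
        np.ma.MaskedArray.__array_finalize__(self, obj)
        self.info = getattr(obj, 'info', None)

x = MetaMaskedArray([1.0, 2.0, 3.0], mask=[0, 1, 0], info={'name': 'test'})

MaskedArray defines its own __array_finalize__, so chaining up to it before touching the extra attributes seems to be the important part.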
From mattknox.ca at gmail.com Tue Mar 16 20:58:22 2010 From: mattknox.ca at gmail.com (Matt Knox) Date: Wed, 17 Mar 2010 00:58:22 +0000 (UTC) Subject: [Numpy-discussion] Subclassing ma.MaskedArray References: <20100316160753.D21089@halibut.com> Message-ID: David Carmean halibut.com> writes: > > > I understand that ma.MaskedArray is a subclass of ndarray; in addition to > the requirements for subclassing the latter, what does ma.MaskedArray add to > the list? I.e. what do I have to watch out for? You may want to take a look at the TimeSeries class in the scikits.timeseries module for a rather extensive example of subclassing MaskedArray - Matt From pgmdevlist at gmail.com Wed Mar 17 02:07:16 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 17 Mar 2010 02:07:16 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() Message-ID: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> All, As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. I thought I could use the new __array_prepare__ method to intercept the call of a standard ufunc. After actual testing, that can't work. __array_prepare only help to prepare the *output* of the operation, not to change the input on the fly, just for this operation. Actually, you can modify the input in place, but it's usually not what you want. Then, I tried to use __array_prepare__ to store the current error status in the input, force it to ignore divide/invalid errors and send the input to the ufunc. Doesn't work either: np.seterr in __array_prepare__ does change the error status, but as far as I understand, the ufunc is called is still called with the original error status. That means that if something goes wrong, your error status can stay stuck. Not a good idea either. I'm running out of ideas at this point. For the test suite, I'd suggest to disable the warnings in test_fix_invalid and test_basic_arithmetic. An additional issue is that if one of the error status is set to 'raise', the numpy ufunc will raise the exception (as expected), while its numpy.ma version will not. I'll put also a warning in the docs to that effect. Please send me your comments before I commit any changes. Cheers, P. From miroslav.sedivy at weather-consult.com Wed Mar 17 07:12:27 2010 From: miroslav.sedivy at weather-consult.com (Miroslav Sedivy) Date: Wed, 17 Mar 2010 12:12:27 +0100 Subject: [Numpy-discussion] Problem migrating PDL's index() into NumPy Message-ID: <4BA0B91B.9000808@weather-consult.com> Hello, being quite new to NumPy and having used previously PDL in Perl, I am currently migrating one of my PDL projects into NumPy. Most of the functions can be migrated without problems and there are functions in NumPy that allow me to do things in much clearer way than in PDL. However, I have a problem with the following operation: There are two 2D arrays with dimensions: A[10000,1000] and B[10000,100]. The first dimension of both arrays corresponds to a list of 10000 objects. The array A contains for each of 10000 objects 1000 integer values between 0 and 99, so that for each of 10000 objects a corresponding value can be found in the array B. 
I need a new array C[10000,1000] with values from B the following way: for x in range(10000): for y in range(1000): C[x,y] = B[x,A[x,y]] In Perl's PDL, this can be done with $C = $B->index($A) If in NumPy I do C = B[A], then I do not get a [10000,1000] 2D array, but rather a [10000,1000,1000] 3D array, in which I can find the correct values on the following positions: for x in range(10000): for y in range(1000): C[x,y,y] which may seem nice, but it needs 1000 times more memory and very probably 1000 times more time to calculate... Impossible with such large arrays... :-( Could anyone help me, please? Regards, Miroslav Sedivy From josef.pktd at gmail.com Wed Mar 17 07:30:27 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 07:30:27 -0400 Subject: [Numpy-discussion] Problem migrating PDL's index() into NumPy In-Reply-To: <4BA0B91B.9000808@weather-consult.com> References: <4BA0B91B.9000808@weather-consult.com> Message-ID: <1cd32cbb1003170430h7fd987ebn51a8c076f40166d6@mail.gmail.com> On Wed, Mar 17, 2010 at 7:12 AM, Miroslav Sedivy wrote: > Hello, > > being quite new to NumPy and having used previously PDL in Perl, I am > currently migrating one of my PDL projects into NumPy. > > Most of the functions can be migrated without problems and there are > functions in NumPy that allow me to do things in much clearer way than > in PDL. However, I have a problem with the following operation: > > There are two 2D arrays with dimensions: A[10000,1000] and B[10000,100]. > The first dimension of both arrays corresponds to a list of 10000 objects. > > The array A contains for each of 10000 objects 1000 integer values > between 0 and 99, so that for each of 10000 objects a corresponding > value can be found in the array B. > > I need a new array C[10000,1000] with values from B the following way: > > for x in range(10000): > ? ?for y in range(1000): > ? ? ? C[x,y] = B[x,A[x,y]] > > In Perl's PDL, this can be done with $C = $B->index($A) > > If in NumPy I do C = B[A], then I do not get a [10000,1000] 2D array, > but rather a [10000,1000,1000] 3D array, in which I can find the correct > values on the following positions: > > for x in range(10000): > ? ?for y in range(1000): > ? ? ? C[x,y,y] > > which may seem nice, but it needs 1000 times more memory and very > probably 1000 times more time to calculate... Impossible with such large > arrays... :-( > > Could anyone help me, please? try C = B[:,A] or C = B[np.arange(1000)[:,None], A] I think, one of the two (or both) should work (but no time for trying it myself) Josef > > Regards, > Miroslav Sedivy > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dsdale24 at gmail.com Wed Mar 17 08:19:45 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 17 Mar 2010 08:19:45 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM wrote: > All, > As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. 
The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. > I thought I could use the new __array_prepare__ method to intercept the call of a standard ufunc. After actual testing, that can't work. __array_prepare only help to prepare the *output* of the operation, not to change the input on the fly, just for this operation. Actually, you can modify the input in place, but it's usually not what you want. That is correct, __array_prepare__ is called just after the output array is created, but before the ufunc actually gets down to business. I have the same limitation in quantities you are now seeing with masked array, in my case I want the opportunity to rescale different but compatible quantities for the operation (without changing the original arrays in place, of course). > Then, I tried to use ?__array_prepare__ to store the current error status in the input, force it to ignore divide/invalid errors and send the input to the ufunc. Doesn't work either: np.seterr in __array_prepare__ does change the error status, but as far as I understand, the ufunc is called is still called with the original error status. That means that if something goes wrong, your error status can stay stuck. Not a good idea either. > I'm running out of ideas at this point. For the test suite, I'd suggest to disable the warnings in test_fix_invalid and test_basic_arithmetic. > An additional issue is that if one of the error status is set to 'raise', the numpy ufunc will raise the exception (as expected), while its numpy.ma version will not. I'll put also a warning in the docs to that effect. > Please send me your comments before I commit any changes. I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be: 1) Call myufunc(x, y[, z]) 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x,y by default) 3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...)) 4) myufunc finally gets around to performing the calculation 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Darren From miroslav.sedivy at weather-consult.com Wed Mar 17 09:36:59 2010 From: miroslav.sedivy at weather-consult.com (Miroslav Sedivy) Date: Wed, 17 Mar 2010 14:36:59 +0100 Subject: [Numpy-discussion] Problem migrating PDL's index() into NumPy In-Reply-To: <1cd32cbb1003170430h7fd987ebn51a8c076f40166d6@mail.gmail.com> References: <4BA0B91B.9000808@weather-consult.com> <1cd32cbb1003170430h7fd987ebn51a8c076f40166d6@mail.gmail.com> Message-ID: <4BA0DAFB.5000405@weather-consult.com> josef.pktd at gmail.com wrote: > On Wed, Mar 17, 2010 at 7:12 AM, Miroslav Sedivy wrote: >> There are two 2D arrays with dimensions: A[10000,1000] and B[10000,100]. >> The first dimension of both arrays corresponds to a list of 10000 objects. >> >> The array A contains for each of 10000 objects 1000 integer values >> between 0 and 99, so that for each of 10000 objects a corresponding >> value can be found in the array B. 
>> >> I need a new array C[10000,1000] with values from B the following way: >> >> for x in range(10000): >> for y in range(1000): >> C[x,y] = B[x,A[x,y]] >> >> In Perl's PDL, this can be done with $C = $B->index($A) >> >> If in NumPy I do C = B[A], then I do not get a [10000,1000] 2D array, >> but rather a [10000,1000,1000] 3D array, in which I can find the correct >> values on the following positions: >> >> for x in range(10000): >> for y in range(1000): >> C[x,y,y] >> >> which may seem nice, but it needs 1000 times more memory and very >> probably 1000 times more time to calculate... Impossible with such large >> arrays... :-( >> >> Could anyone help me, please? > > try > C = B[:,A] > or > C = B[np.arange(1000)[:,None], A] > > I think, one of the two (or both) should work (but no time for trying it myself) > Josef Thank you, Josef, for responding. None of them works correctly. The first one works only as B.T[:,A] and gives me the same _3D_ array as B[A].T The second one tells me: ValueError: shape mismatch: objects cannot be broadcast to a single shape Now I am using an iteration over all 10000 elements: C = np.empty_like(A) for i in range(10000): C[:,i] = B[:,i][A[:,i]] which works perfectly. Just it is a real pain seeing such a for-loop in the NumPy-World :-( Thanks, Miroslav From josef.pktd at gmail.com Wed Mar 17 10:01:48 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 10:01:48 -0400 Subject: [Numpy-discussion] Problem migrating PDL's index() into NumPy In-Reply-To: <4BA0DAFB.5000405@weather-consult.com> References: <4BA0B91B.9000808@weather-consult.com> <1cd32cbb1003170430h7fd987ebn51a8c076f40166d6@mail.gmail.com> <4BA0DAFB.5000405@weather-consult.com> Message-ID: <1cd32cbb1003170701t5f4b251y7d8f23bf5d0009e8@mail.gmail.com> On Wed, Mar 17, 2010 at 9:36 AM, Miroslav Sedivy wrote: > josef.pktd at gmail.com wrote: >> On Wed, Mar 17, 2010 at 7:12 AM, Miroslav Sedivy wrote: >>> There are two 2D arrays with dimensions: A[10000,1000] and B[10000,100]. >>> The first dimension of both arrays corresponds to a list of 10000 objects. >>> >>> The array A contains for each of 10000 objects 1000 integer values >>> between 0 and 99, so that for each of 10000 objects a corresponding >>> value can be found in the array B. >>> >>> I need a new array C[10000,1000] with values from B the following way: >>> >>> for x in range(10000): >>> ? ?for y in range(1000): >>> ? ? ? C[x,y] = B[x,A[x,y]] >>> >>> In Perl's PDL, this can be done with $C = $B->index($A) >>> >>> If in NumPy I do C = B[A], then I do not get a [10000,1000] 2D array, >>> but rather a [10000,1000,1000] 3D array, in which I can find the correct >>> values on the following positions: >>> >>> for x in range(10000): >>> ? ?for y in range(1000): >>> ? ? ? C[x,y,y] >>> >>> which may seem nice, but it needs 1000 times more memory and very >>> probably 1000 times more time to calculate... Impossible with such large >>> arrays... :-( >>> >>> Could anyone help me, please? >> >> try >> C = B[:,A] >> or >> C = B[np.arange(1000)[:,None], A] >> >> I think, one of the two (or both) should work (but no time for trying it myself) >> Josef > > > Thank you, Josef, for responding. > > None of them works correctly. 
The first one works only as B.T[:,A] and > gives me the same _3D_ array as B[A].T > > The second one tells me: ValueError: shape mismatch: objects cannot be > broadcast to a single shape because you have 10000 rows not 1000 as in the example I typed Index arrays are broadcasted so they have to have matching shapes >>> n0 = 5 # number of rows >>> B = np.ones((n0,3))*np.arange(3) >>> A = np.random.randint(3,size=(n0,3)) >>> C = B[np.arange(n0)[:,None],A] >>> assert (A == C).all() >>> A array([[2, 0, 1], [2, 0, 1], [2, 1, 2], [0, 0, 2], [2, 0, 0]]) >>> C array([[ 2., 0., 1.], [ 2., 0., 1.], [ 2., 1., 2.], [ 0., 0., 2.], [ 2., 0., 0.]]) Josef > > Now I am using an iteration over all 10000 elements: > > C = np.empty_like(A) > for i in range(10000): > ? ?C[:,i] = B[:,i][A[:,i]] > > which works perfectly. Just it is a real pain seeing such a for-loop in > the NumPy-World :-( > > Thanks, > Miroslav > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Wed Mar 17 10:08:35 2010 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 17 Mar 2010 10:08:35 -0400 Subject: [Numpy-discussion] bug in ndarray.resize? Message-ID: <4BA0E263.7090100@american.edu> Is the zero-fill intentional? If so, it is documented? (NumPy 1.3) Alan Isaac >>> a = np.arange(5) >>> b = a.copy() >>> c = np.resize(a, (5,2)) >>> b.resize((5,2)) >>> c # as expected array([[0, 1], [2, 3], [4, 0], [1, 2], [3, 4]]) >>> b # surprise! array([[0, 1], [2, 3], [4, 0], [0, 0], [0, 0]]) From rmay31 at gmail.com Wed Mar 17 10:11:27 2010 From: rmay31 at gmail.com (Ryan May) Date: Wed, 17 Mar 2010 09:11:27 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 7:19 AM, Darren Dale wrote: > Is this general enough for your use case? I haven't tried to think > about how to change some global state at one point and change it back > at another, that seems like a bad idea and difficult to support. Sounds like the textbook use case for the python 2.5/2.6 context manager. Pity we can't use it yet... (and I'm not sure it'd be easy to wrap around the calls here.) Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From josef.pktd at gmail.com Wed Mar 17 10:16:20 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 10:16:20 -0400 Subject: [Numpy-discussion] bug in ndarray.resize? In-Reply-To: <4BA0E263.7090100@american.edu> References: <4BA0E263.7090100@american.edu> Message-ID: <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> On Wed, Mar 17, 2010 at 10:08 AM, Alan G Isaac wrote: > Is the zero-fill intentional? > If so, it is documented? > (NumPy 1.3) > > Alan Isaac > >>>> a = np.arange(5) >>>> b = a.copy() >>>> c = np.resize(a, (5,2)) >>>> b.resize((5,2)) >>>> c ?# as expected > array([[0, 1], > ? ? ? ?[2, 3], > ? ? ? ?[4, 0], > ? ? ? ?[1, 2], > ? ? ? ?[3, 4]]) >>>> b ?# surprise! > array([[0, 1], > ? ? ? ?[2, 3], > ? ? ? ?[4, 0], > ? ? ? ?[0, 0], > ? ? ? ?[0, 0]]) It is documented as in your example numpy.resize(a, new_shape) Return a new array with the specified shape. If the new array is larger than the original array, then the new array is filled with repeated copied of a. Note that this behavior is different from a.resize(new_shape) which fills with zeros instead of repeated copies of a. 
Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dsdale24 at gmail.com Wed Mar 17 10:20:11 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 17 Mar 2010 10:20:11 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 10:11 AM, Ryan May wrote: > On Wed, Mar 17, 2010 at 7:19 AM, Darren Dale wrote: >> Is this general enough for your use case? I haven't tried to think >> about how to change some global state at one point and change it back >> at another, that seems like a bad idea and difficult to support. > > Sounds like the textbook use case for the python 2.5/2.6 context > manager. ? Pity we can't use it yet... (and I'm not sure it'd be easy > to wrap around the calls here.) I don't think context managers would work. They would be implemented in one of the subclasses special methods and would thus go out of scope before the ufunc got around to performing the calculation that required the change in state. Darren From gberbeglia at gmail.com Wed Mar 17 10:40:14 2010 From: gberbeglia at gmail.com (gerardo.berbeglia) Date: Wed, 17 Mar 2010 07:40:14 -0700 (PDT) Subject: [Numpy-discussion] size of a specific dimension of a numpy array Message-ID: <27933090.post@talk.nabble.com> I would like to know a simple way to know the size of a given dimension of a numpy array. Example A = numpy.zeros((10,20,30),float) The size of the second dimension of the array A is 20. Thanks. -- View this message in context: http://old.nabble.com/size-of-a-specific-dimension-of-a-numpy-array-tp27933090p27933090.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From matthieu.brucher at gmail.com Wed Mar 17 10:41:32 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 17 Mar 2010 15:41:32 +0100 Subject: [Numpy-discussion] size of a specific dimension of a numpy array In-Reply-To: <27933090.post@talk.nabble.com> References: <27933090.post@talk.nabble.com> Message-ID: Hi, A.shape[1] 2010/3/17 gerardo.berbeglia : > > I would like to know a simple way to know the size of a given dimension of a > numpy array. > > Example > A = numpy.zeros((10,20,30),float) > The size of the second dimension of the array A is 20. > > Thanks. > > > > > -- > View this message in context: http://old.nabble.com/size-of-a-specific-dimension-of-a-numpy-array-tp27933090p27933090.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From aisaac at american.edu Wed Mar 17 10:42:27 2010 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 17 Mar 2010 10:42:27 -0400 Subject: [Numpy-discussion] bug in ndarray.resize? In-Reply-To: <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> References: <4BA0E263.7090100@american.edu> <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> Message-ID: <4BA0EA53.10607@american.edu> On 3/17/2010 10:16 AM, josef.pktd at gmail.com wrote: > numpy.resize(a, new_shape) > Return a new array with the specified shape. 
> > If the new array is larger than the original array, then the new array > is filled with repeated copied of a. Note that this behavior is > different from a.resize(new_shape) which fills with zeros instead of > repeated copies of a. Yes indeed. Sorry, I must have scrolled the help without realizing it, and this part was at the top. So my follow up: why is this desirable/necessary? (I find it surprising.) Alan From charlesr.harris at gmail.com Wed Mar 17 10:45:54 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Mar 2010 08:45:54 -0600 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 6:19 AM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM wrote: > > All, > > As you're probably aware, the current test suite for numpy.ma raises > some nagging warnings such as "invalid value in ...". These warnings are > only issued when a standard numpy ufunc (eg., np.sqrt) is called on a > MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The > reason is that the masked versions of the ufuncs temporarily set the numpy > error status to 'ignore' before the operation takes place, and reset the > status to its original value. > > > I thought I could use the new __array_prepare__ method to intercept the > call of a standard ufunc. After actual testing, that can't work. > __array_prepare only help to prepare the *output* of the operation, not to > change the input on the fly, just for this operation. Actually, you can > modify the input in place, but it's usually not what you want. > > That is correct, __array_prepare__ is called just after the output > array is created, but before the ufunc actually gets down to business. > I have the same limitation in quantities you are now seeing with > masked array, in my case I want the opportunity to rescale different > but compatible quantities for the operation (without changing the > original arrays in place, of course). > > > Then, I tried to use __array_prepare__ to store the current error status > in the input, force it to ignore divide/invalid errors and send the input to > the ufunc. Doesn't work either: np.seterr in __array_prepare__ does change > the error status, but as far as I understand, the ufunc is called is still > called with the original error status. That means that if something goes > wrong, your error status can stay stuck. Not a good idea either. > > I'm running out of ideas at this point. For the test suite, I'd suggest > to disable the warnings in test_fix_invalid and test_basic_arithmetic. > > An additional issue is that if one of the error status is set to 'raise', > the numpy ufunc will raise the exception (as expected), while its numpy.maversion will not. I'll put also a warning in the docs to that effect. > > Please send me your comments before I commit any changes. > > I started thinking about a third method called __input_prepare__ that > would be called on the way into the ufunc, which would allow you to > intercept the input and pass a somehow modified copy back to the > ufunc. 
The total flow would be: > > 1) Call myufunc(x, y[, z]) > 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', > y' (or simply passes through x,y by default) > 3) myufunc creates the output array z (if not specified) and calls > ?.__array_prepare__(z, (myufunc, x, y, ...)) > 4) myufunc finally gets around to performing the calculation > 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns > the result to the caller > > Is this general enough for your use case? I haven't tried to think > about how to change some global state at one point and change it back > at another, that seems like a bad idea and difficult to support. > > I'm not a masked array user and not familiar with the specific problems here, but as an outsider it's beginning to look like one little fix after another. Is there some larger framework that would help here? Changes to the ufuncs themselves? There was some code for masked ufuncs on the c level posted a while back that I thought was interesting, would it help to have masked masked versions of the ufuncs? So on and so forth. It just looks like a larger design issue needs to be addressed here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed Mar 17 11:09:41 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 17 Mar 2010 10:09:41 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: <4BA0F0B5.4050608@gmail.com> On 03/17/2010 01:07 AM, Pierre GM wrote: > All, > As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. > Perhaps naive question, what is really being tested here? That is, it appears that you are testing both the generation of the invalid values and function. So if the generation fails, then the function will also fail. However, the test for the generation of invalid values should be elsewhere so you have to assume that the generation of values will work correctly. I think that you should be only testing that the specific function passes the test. Why not just use 'invalid' values like np.inf directly? For example, in numpy/ma/tests/test_core.py We have this test: def test_fix_invalid(self): "Checks fix_invalid." data = masked_array(np.sqrt([-1., 0., 1.]), mask=[0, 0, 1]) data_fixed = fix_invalid(data) If that is to test that fix_invalid Why not create the data array as: data = masked_array([np.inf, 0., 1.]), mask=[0, 0, 1]) However, I am not sure the output should be for the test_ndarray_mask test because ma automatically masks the value resulting from sqrt(-1): >>> a = masked_array([-1, 0, 1, 2, 3], mask=[0, 0, 0, 0, 1]) >>> np.sqrt(a) Warning: invalid value encountered in sqrt masked_array(data = [-- 0.0 1.0 1.41421356237 --], mask = [ True False False False True], fill_value = 999999) Note the warning is important because it does indicate that the result might not be as expected. 
But if the -1 is replaced by np.inf, then it is not automatically masked: >>> b = masked_array([np.inf, 0, 1, 2, 3], mask=[0, 0, 0, 0, 1]) >>> np.sqrt(b) masked_array(data = [inf 0.0 1.0 1.41421356237 --], mask = [False False False False True], fill_value = 1e+20) Bruce > I thought I could use the new __array_prepare__ method to intercept the call of a standard ufunc. After actual testing, that can't work. __array_prepare only help to prepare the *output* of the operation, not to change the input on the fly, just for this operation. Actually, you can modify the input in place, but it's usually not what you want. > Then, I tried to use __array_prepare__ to store the current error status in the input, force it to ignore divide/invalid errors and send the input to the ufunc. Doesn't work either: np.seterr in __array_prepare__ does change the error status, but as far as I understand, the ufunc is called is still called with the original error status. That means that if something goes wrong, your error status can stay stuck. Not a good idea either. > I'm running out of ideas at this point. For the test suite, I'd suggest to disable the warnings in test_fix_invalid and test_basic_arithmetic. > An additional issue is that if one of the error status is set to 'raise', the numpy ufunc will raise the exception (as expected), while its numpy.ma version will not. I'll put also a warning in the docs to that effect. > Please send me your comments before I commit any changes. > Cheers, > P. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rmay31 at gmail.com Wed Mar 17 11:33:29 2010 From: rmay31 at gmail.com (Ryan May) Date: Wed, 17 Mar 2010 10:33:29 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 9:20 AM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 10:11 AM, Ryan May wrote: >> On Wed, Mar 17, 2010 at 7:19 AM, Darren Dale wrote: >>> Is this general enough for your use case? I haven't tried to think >>> about how to change some global state at one point and change it back >>> at another, that seems like a bad idea and difficult to support. >> >> Sounds like the textbook use case for the python 2.5/2.6 context >> manager. ? Pity we can't use it yet... (and I'm not sure it'd be easy >> to wrap around the calls here.) > > I don't think context managers would work. They would be implemented > in one of the subclasses special methods and would thus go out of > scope before the ufunc got around to performing the calculation that > required the change in state. Right, that's the part I was referring to in the last part of my post. But the concept of modifying global state and ensuring that no matter what happens, that state reset to its initial condition, is the textbook use case for context managers. Problem is, I think that limitation replies to any method that tries to be exception-safe. It seems like you basically need to wrap the initial function call. 
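For concreteness, a minimal sketch of that wrap-the-call pattern (illustrative only, not
numpy.ma's actual code; the name quiet_sqrt is made up for the example). np.errstate is the
stock context-manager form of the same save/restore, but it only helps when you control the
code surrounding the top-level ufunc call:

import numpy as np

def quiet_sqrt(x):
    # save the current error state, silence divide/invalid for this one
    # call, and restore the old state even if the ufunc raises
    saved = np.seterr(divide='ignore', invalid='ignore')
    try:
        return np.sqrt(x)
    finally:
        np.seterr(**saved)

# the same pattern as a context manager (needs Python >= 2.5):
with np.errstate(invalid='ignore'):
    print np.sqrt(np.array([-1.0, 4.0]))   # no warning; gives [nan  2.]

A context manager entered inside __array_prepare__ would exit before the ufunc actually runs,
which is exactly the scoping problem described above.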
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From dsdale24 at gmail.com Wed Mar 17 11:35:43 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 17 Mar 2010 11:35:43 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 10:45 AM, Charles R Harris wrote: > > > On Wed, Mar 17, 2010 at 6:19 AM, Darren Dale wrote: >> >> On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM wrote: >> > All, >> > As you're probably aware, the current test suite for numpy.ma raises >> > some nagging warnings such as "invalid value in ...". These warnings are >> > only issued when a standard numpy ufunc (eg., np.sqrt) is called on a >> > MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The >> > reason is that the masked versions of the ufuncs temporarily set the numpy >> > error status to 'ignore' before the operation takes place, and reset the >> > status to its original value. >> >> > I thought I could use the new __array_prepare__ method to intercept the >> > call of a standard ufunc. After actual testing, that can't work. >> > __array_prepare only help to prepare the *output* of the operation, not to >> > change the input on the fly, just for this operation. Actually, you can >> > modify the input in place, but it's usually not what you want. >> >> That is correct, __array_prepare__ is called just after the output >> array is created, but before the ufunc actually gets down to business. >> I have the same limitation in quantities you are now seeing with >> masked array, in my case I want the opportunity to rescale different >> but compatible quantities for the operation (without changing the >> original arrays in place, of course). >> >> > Then, I tried to use ?__array_prepare__ to store the current error >> > status in the input, force it to ignore divide/invalid errors and send the >> > input to the ufunc. Doesn't work either: np.seterr in __array_prepare__ does >> > change the error status, but as far as I understand, the ufunc is called is >> > still called with the original error status. That means that if something >> > goes wrong, your error status can stay stuck. Not a good idea either. >> > I'm running out of ideas at this point. For the test suite, I'd suggest >> > to disable the warnings in test_fix_invalid and test_basic_arithmetic. >> > An additional issue is that if one of the error status is set to >> > 'raise', the numpy ufunc will raise the exception (as expected), while its >> > numpy.ma version will not. I'll put also a warning in the docs to that >> > effect. >> > Please send me your comments before I commit any changes. >> >> I started thinking about a third method called __input_prepare__ that >> would be called on the way into the ufunc, which would allow you to >> intercept the input and pass a somehow modified copy back to the >> ufunc. The total flow would be: >> >> 1) Call myufunc(x, y[, z]) >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', >> y' (or simply passes through x,y by default) >> 3) myufunc creates the output array z (if not specified) and calls >> ?.__array_prepare__(z, (myufunc, x, y, ...)) >> 4) myufunc finally gets around to performing the calculation >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns >> the result to the caller >> >> Is this general enough for your use case? 
I haven't tried to think >> about how to change some global state at one point and change it back >> at another, that seems like a bad idea and difficult to support. >> > > I'm not a masked array user and not familiar with the specific problems > here, but as an outsider it's beginning to look like one little fix after > another. Yeah, I was concerned that criticism would come up. > Is there some larger framework that would help here? I think there is: http://www.python.org/dev/peps/pep-3124/ > Changes to the ufuncs themselves? Perhaps, if ufuncs were instances of a class that implemented __call__, it would be easier to include context management. Maybe this approach could be coupled with input_prepare, array_prepare and array_wrap to provide everything we need. > There was some code for masked ufuncs on the c level > posted a while back that I thought was interesting, would it help to have > masked masked versions of the ufuncs? I think we need a solution that avoids implementing an entirely new set of ufuncs for specific subclasses. > So on and so forth. It just looks like a larger design issue needs to be addressed here. I'm interested to hear other people's perspectives or suggestions. Darren From miroslav.sedivy at weather-consult.com Wed Mar 17 11:39:21 2010 From: miroslav.sedivy at weather-consult.com (Miroslav Sedivy) Date: Wed, 17 Mar 2010 16:39:21 +0100 Subject: [Numpy-discussion] Problem migrating PDL's index() into NumPy In-Reply-To: <1cd32cbb1003170701t5f4b251y7d8f23bf5d0009e8@mail.gmail.com> References: <4BA0B91B.9000808@weather-consult.com> <1cd32cbb1003170430h7fd987ebn51a8c076f40166d6@mail.gmail.com> <4BA0DAFB.5000405@weather-consult.com> <1cd32cbb1003170701t5f4b251y7d8f23bf5d0009e8@mail.gmail.com> Message-ID: <4BA0F7A9.4080703@weather-consult.com> >>>> n0 = 5 # number of rows >>>> B = np.ones((n0,3))*np.arange(3) >>>> A = np.random.randint(3,size=(n0,3)) >>>> C = B[np.arange(n0)[:,None],A] >>>> assert (A == C).all() >>>> A > array([[2, 0, 1], > [2, 0, 1], > [2, 1, 2], > [0, 0, 2], > [2, 0, 0]]) >>>> C > array([[ 2., 0., 1.], > [ 2., 0., 1.], > [ 2., 1., 2.], > [ 0., 0., 2.], > [ 2., 0., 0.]]) > > Josef Thank you, Josef, now it works! I had a problem with the shape of my arrays. When I transposed them correctly, your solution has worked! Miroslav From gberbeglia at gmail.com Wed Mar 17 11:51:54 2010 From: gberbeglia at gmail.com (gerardob) Date: Wed, 17 Mar 2010 08:51:54 -0700 (PDT) Subject: [Numpy-discussion] Setting small numbers to zero. Message-ID: <27933569.post@talk.nabble.com> How can i modified all the values of a numpy array whose value is smaller than a given epsilon to zero? Example epsilon=0.01 a = [[0.003,2][23,0.0001]] output: [[0,2][23,0]] -- View this message in context: http://old.nabble.com/Setting-small-numbers-to-zero.-tp27933569p27933569.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From alan.mcintyre at gmail.com Wed Mar 17 11:55:34 2010 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 17 Mar 2010 08:55:34 -0700 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <27933569.post@talk.nabble.com> References: <27933569.post@talk.nabble.com> Message-ID: <1d36917a1003170855i7dd43a74i35545e4293f1e6a5@mail.gmail.com> On Wed, Mar 17, 2010 at 8:51 AM, gerardob wrote: > How can i modified all the values of a numpy array whose value is smaller > than a given epsilon to zero? 
> > Example > epsilon=0.01 > a = [[0.003,2][23,0.0001]] > > output: > [[0,2][23,0]] Give this a try: >>> import numpy as np >>> epsilon=0.01 >>> a=np.array([[0.003,2],[23,0.0001]]) >>> a array([[ 3.00000000e-03, 2.00000000e+00], [ 2.30000000e+01, 1.00000000e-04]]) >>> a[np.abs(a) < epsilon] = 0 >>> a array([[ 0., 2.], [ 23., 0.]]) From kwgoodman at gmail.com Wed Mar 17 11:56:05 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 17 Mar 2010 08:56:05 -0700 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <27933569.post@talk.nabble.com> References: <27933569.post@talk.nabble.com> Message-ID: On Wed, Mar 17, 2010 at 8:51 AM, gerardob wrote: > > How can i modified all the values of a numpy array whose value is smaller > than a given epsilon to zero? > > Example > epsilon=0.01 > a = [[0.003,2][23,0.0001]] > > output: > [[0,2][23,0]] Here's one way: >> a = np.array([[0.003,2],[23,0.0001]]) >> a[np.abs(a) < 0.01] = 0 >> a array([[ 0., 2.], [ 23., 0.]]) From josef.pktd at gmail.com Wed Mar 17 11:57:05 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 11:57:05 -0400 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: References: <27933569.post@talk.nabble.com> Message-ID: <1cd32cbb1003170857m22bb1385o61f337e32666fa74@mail.gmail.com> On Wed, Mar 17, 2010 at 11:56 AM, Keith Goodman wrote: > On Wed, Mar 17, 2010 at 8:51 AM, gerardob wrote: >> >> How can i modified all the values of a numpy array whose value is smaller >> than a given epsilon to zero? >> >> Example >> epsilon=0.01 >> a = [[0.003,2][23,0.0001]] >> >> output: >> [[0,2][23,0]] > > Here's one way: > >>> a = np.array([[0.003,2],[23,0.0001]]) >>> a[np.abs(a) < 0.01] = 0 >>> a > > array([[ ?0., ? 2.], > ? ? ? [ 23., ? 0.]]) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > But as Skipper said before You might find these helpful. http://www.scipy.org/Tentative_NumPy_Tutorial http://www.scipy.org/Numpy_Example_List Josef From friedrichromstedt at gmail.com Wed Mar 17 13:50:30 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Wed, 17 Mar 2010 18:50:30 +0100 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <27933569.post@talk.nabble.com> References: <27933569.post@talk.nabble.com> Message-ID: Code: import numpy import time a = numpy.random.random((2000, 2000)) start = time.time() a[abs(a) < 10] = 0 stop = time.time() print stop - start a = numpy.random.random((2000, 2000)) start = time.time() a = a * (abs(a) >= 10) stop = time.time() print stop - start a = numpy.random.random((2000, 2000)) start = time.time() a *= (abs(a) >= 10) stop = time.time() print stop - start Output (reproducible): 0.680999994278 0.220999956131 0.19000005722 Friedrich From efiring at hawaii.edu Wed Mar 17 14:28:21 2010 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 17 Mar 2010 08:28:21 -1000 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: <4BA11F45.9010201@hawaii.edu> Charles R Harris wrote: > > > On Wed, Mar 17, 2010 at 6:19 AM, Darren Dale > wrote: > > On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM > wrote: > > All, > > As you're probably aware, the current test suite for numpy.ma > raises some nagging warnings such as "invalid > value in ...". 
These warnings are only issued when a standard numpy > ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its > numpy.ma (eg., np.ma.sqrt) equivalent. The reason > is that the masked versions of the ufuncs temporarily set the numpy > error status to 'ignore' before the operation takes place, and reset > the status to its original value. > > > I thought I could use the new __array_prepare__ method to > intercept the call of a standard ufunc. After actual testing, that > can't work. __array_prepare only help to prepare the *output* of the > operation, not to change the input on the fly, just for this > operation. Actually, you can modify the input in place, but it's > usually not what you want. > > That is correct, __array_prepare__ is called just after the output > array is created, but before the ufunc actually gets down to business. > I have the same limitation in quantities you are now seeing with > masked array, in my case I want the opportunity to rescale different > but compatible quantities for the operation (without changing the > original arrays in place, of course). > > > Then, I tried to use __array_prepare__ to store the current > error status in the input, force it to ignore divide/invalid errors > and send the input to the ufunc. Doesn't work either: np.seterr in > __array_prepare__ does change the error status, but as far as I > understand, the ufunc is called is still called with the original > error status. That means that if something goes wrong, your error > status can stay stuck. Not a good idea either. > > I'm running out of ideas at this point. For the test suite, I'd > suggest to disable the warnings in test_fix_invalid and > test_basic_arithmetic. > > An additional issue is that if one of the error status is set to > 'raise', the numpy ufunc will raise the exception (as expected), > while its numpy.ma version will not. I'll put also > a warning in the docs to that effect. > > Please send me your comments before I commit any changes. > > I started thinking about a third method called __input_prepare__ that > would be called on the way into the ufunc, which would allow you to > intercept the input and pass a somehow modified copy back to the > ufunc. The total flow would be: > > 1) Call myufunc(x, y[, z]) > 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', > y' (or simply passes through x,y by default) > 3) myufunc creates the output array z (if not specified) and calls > ?.__array_prepare__(z, (myufunc, x, y, ...)) > 4) myufunc finally gets around to performing the calculation > 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns > the result to the caller > > Is this general enough for your use case? I haven't tried to think > about how to change some global state at one point and change it back > at another, that seems like a bad idea and difficult to support. > > > I'm not a masked array user and not familiar with the specific problems > here, but as an outsider it's beginning to look like one little fix > after another. Is there some larger framework that would help here? > Changes to the ufuncs themselves? There was some code for masked ufuncs > on the c level posted a while back that I thought was interesting, would > it help to have masked masked versions of the ufuncs? So on and so > forth. It just looks like a larger design issue needs to be addressed here. > Chuck, I'm glad you found it interesting, and I'm sorry I haven't had time to follow up on the work with masked ufuncs in C. 
My motivation for going to the C level was speed and control; many ma operations are very slow compared to their numpy counterparts, and moving the mask handling to C can erase nearly all of this penalty. Regarding nan-handling, using masked ufuncs in C means that calculations are simply not done with masked values, so it doesn't matter whether a masked value is invalid or not; consequently, so long as an invalid value is masked, the seterr state doesn't matter. And, the seterr state then applies normally to the unmasked values. I'm not sure whether this solves the problem at hand, but it does seem to me to be sensible behavior and a step in the right direction. The devil is in the details--coming up with some basic masked ufunc functionality in C was fairly easy, but figuring out how to handle all ufuncs, and especially their methods (reduce, etc.) would be quite a bit of work. It might be a good project for a student. Realistically, I don't think I will ever have the time to do it myself. In case anyone is interested, my initial feeble attempt nearly a year ago is still on github: http://github.com/efiring/numpy-work Eric > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From gberbeglia at gmail.com Wed Mar 17 14:47:41 2010 From: gberbeglia at gmail.com (gerardob) Date: Wed, 17 Mar 2010 11:47:41 -0700 (PDT) Subject: [Numpy-discussion] matrix operation Message-ID: <27936387.post@talk.nabble.com> Let A and B be two n x n matrices. I would like to have another n x n matrix C such that C_ij = min {A_ij, B_ij} Example: A = numpy.array([[2,3],[10,12]]) B = numpy.array([[1,4],[9,13]]) Output C = [[1,3],[9,12]] The function min(axis) of numpy seems to be only unary. Thanks. -- View this message in context: http://old.nabble.com/matrix-operation-tp27936387p27936387.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From geometrian at gmail.com Wed Mar 17 14:52:11 2010 From: geometrian at gmail.com (Ian Mallett) Date: Wed, 17 Mar 2010 11:52:11 -0700 Subject: [Numpy-discussion] matrix operation In-Reply-To: <27936387.post@talk.nabble.com> References: <27936387.post@talk.nabble.com> Message-ID: >>> import numpy >>> A = numpy.array([[2,3],[10,12]]) >>> B = numpy.array([[1,4],[9,13]]) >>> C = numpy.array([A,B]) >>> numpy.min(C,0) array([[ 1, 3], [ 9, 12]]) Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From eadrogue at gmx.net Wed Mar 17 14:53:48 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 17 Mar 2010 19:53:48 +0100 Subject: [Numpy-discussion] matrix operation In-Reply-To: <27936387.post@talk.nabble.com> References: <27936387.post@talk.nabble.com> Message-ID: <20100317185348.GA12625@doriath.local> 17/03/10 @ 11:47 (-0700), thus spake gerardob: > > Let A and B be two n x n matrices. > > I would like to have another n x n matrix C such that > C_ij = min {A_ij, B_ij} > > Example: > A = numpy.array([[2,3],[10,12]]) > B = numpy.array([[1,4],[9,13]]) > > Output > > C = [[1,3],[9,12]] > > The function min(axis) of numpy seems to be only unary. Try numpy.minimum: In [7]: np.minimum(A, B) Out[7]: array([[ 1, 3], [ 9, 12]]) > Thanks. > > -- > View this message in context: http://old.nabble.com/matrix-operation-tp27936387p27936387.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From kwgoodman at gmail.com Wed Mar 17 14:53:31 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 17 Mar 2010 11:53:31 -0700 Subject: [Numpy-discussion] matrix operation In-Reply-To: <27936387.post@talk.nabble.com> References: <27936387.post@talk.nabble.com> Message-ID: On Wed, Mar 17, 2010 at 11:47 AM, gerardob wrote: > > Let A and B be two n x n matrices. > > I would like to have another n x n ?matrix C such that > C_ij = min {A_ij, B_ij} > > Example: > A = numpy.array([[2,3],[10,12]]) > B = numpy.array([[1,4],[9,13]]) > > Output > > C = [[1,3],[9,12]] > > The function min(axis) of numpy seems to be only unary. > > Thanks. >> numpy.minimum(A, B) array([[ 1, 3], [ 9, 12]]) In ipython: >> numpy.min [tab] numpy.min numpy.minimum numpy.mintypecode From efiring at hawaii.edu Wed Mar 17 14:53:42 2010 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 17 Mar 2010 08:53:42 -1000 Subject: [Numpy-discussion] matrix operation In-Reply-To: <27936387.post@talk.nabble.com> References: <27936387.post@talk.nabble.com> Message-ID: <4BA12536.5060004@hawaii.edu> gerardob wrote: > Let A and B be two n x n matrices. > > I would like to have another n x n matrix C such that > C_ij = min {A_ij, B_ij} > > Example: > A = numpy.array([[2,3],[10,12]]) > B = numpy.array([[1,4],[9,13]]) > > Output > > C = [[1,3],[9,12]] > > The function min(axis) of numpy seems to be only unary. > > Thanks. > In [4]:C = numpy.minimum(A,B) In [5]:C Out[5]: array([[ 1, 3], [ 9, 12]]) From Chris.Barker at noaa.gov Wed Mar 17 15:04:00 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 17 Mar 2010 12:04:00 -0700 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: References: <27933569.post@talk.nabble.com> Message-ID: <4BA127A0.5050203@noaa.gov> Friedrich Romstedt wrote: > Code: > > import numpy > import time > > a = numpy.random.random((2000, 2000)) > > start = time.time() > a[abs(a) < 10] = 0 > stop = time.time() I highly recommend ipython and its "timeit" function --much better for this. And numpy.clip() may be helpful here, though last I checked, it's written in Python, and thus not all that fast. -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Mar 17 15:06:47 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 17 Mar 2010 12:06:47 -0700 Subject: [Numpy-discussion] matrix operation In-Reply-To: <27936387.post@talk.nabble.com> References: <27936387.post@talk.nabble.com> Message-ID: <4BA12847.6070006@noaa.gov> gerardob wrote: > Let A and B be two n x n matrices. > > I would like to have another n x n matrix C such that > C_ij = min {A_ij, B_ij} In [30]: A = numpy.array([[2,3],[10,12]]) In [31]: B = numpy.array([[1,4],[9,13]]) In [32]: numpy.minimum(A,B) Out[32]: array([[ 1, 3], [ 9, 12]]) -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Mar 17 15:01:04 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Mar 2010 14:01:04 -0500 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <4BA127A0.5050203@noaa.gov> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> Message-ID: <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> On Wed, Mar 17, 2010 at 14:04, Christopher Barker wrote: > Friedrich Romstedt wrote: >> Code: >> >> import numpy >> import time >> >> a = numpy.random.random((2000, 2000)) >> >> start = time.time() >> a[abs(a) < 10] = 0 >> stop = time.time() > > I highly recommend ipython and its "timeit" function --much better for this. > > And numpy.clip() may be helpful here, No, it's not. > though last I checked, it's > written in Python, No, it isn't. > and thus not all that fast. No, it's reasonably performant. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Wed Mar 17 15:03:28 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 17 Mar 2010 12:03:28 -0700 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <4BA127A0.5050203@noaa.gov> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> Message-ID: On Wed, Mar 17, 2010 at 12:04 PM, Christopher Barker wrote: > Friedrich Romstedt wrote: >> Code: >> >> import numpy >> import time >> >> a = numpy.random.random((2000, 2000)) >> >> start = time.time() >> a[abs(a) < 10] = 0 >> stop = time.time() > > I highly recommend ipython and its "timeit" function --much better for this. One of the methods, the fast one, uses an in-place operation, so timeit won't work. But for cases where timeit does work, yes, it is great. From Chris.Barker at noaa.gov Wed Mar 17 15:12:18 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 17 Mar 2010 12:12:18 -0700 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA11F45.9010201@hawaii.edu> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> Message-ID: <4BA12992.5040607@noaa.gov> Eric Firing wrote: > My motivation for going > to the C level was speed and control; many ma operations are very slow > compared to their numpy counterparts, and moving the mask handling to C > can erase nearly all of this penalty. really? very cool. I was thinking about this the other day, and thinking that in some grand future vision, all numpy arrays should be masked arrays (or could be). The idea is that missing/invalid data is a really common case, and it is simply wonderful to have the software handle that gracefully. One of the things I liked about MATLAB was that NaNs were well handled almost all the time. Given all the limitations of NaN, having a masked array is a better way to go, but I'd love it if they were "just there", and therefore EVERY numpy function and package built on numpy would handle them gracefully. I had thought that there would be a significant performance penalty, and thus there would be a boatload of "if_mask" code all over the place, but maybe not. Anyway, just a fantasy, but C-level ufuncs that support masks would be great. 
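For illustration, a rough Python-level sketch of the semantics such a mask-aware ufunc would
have: compute only where the mask is False, so masked slots are never evaluated and can never
trip the seterr machinery. The function name and the tiny arrays here are made up for the
example; this is not proposed API.

import numpy as np

def masked_sqrt_sketch(data, mask):
    # pass only the unmasked elements to the real ufunc; masked slots
    # are left at a dummy value and hidden by the mask
    out = np.zeros_like(data)
    valid = ~mask
    out[valid] = np.sqrt(data[valid])
    return np.ma.array(out, mask=mask)

a = np.array([-1.0, 4.0, 9.0])
m = np.array([True, False, False])   # the invalid -1.0 is masked
print masked_sqrt_sketch(a, m)       # no warning: the -1.0 is never touched

A real version would do this inside the ufunc inner loops rather than building boolean
temporaries like this, which is where the speed and control Eric mentions come from.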
-Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Wed Mar 17 15:07:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 15:07:31 -0400 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> Message-ID: <1cd32cbb1003171207i4f4a4e32vec42b7d374516d79@mail.gmail.com> On Wed, Mar 17, 2010 at 3:01 PM, Robert Kern wrote: > On Wed, Mar 17, 2010 at 14:04, Christopher Barker wrote: >> Friedrich Romstedt wrote: >>> Code: >>> >>> import numpy >>> import time >>> >>> a = numpy.random.random((2000, 2000)) >>> >>> start = time.time() >>> a[abs(a) < 10] = 0 >>> stop = time.time() >> >> I highly recommend ipython and its "timeit" function --much better for this. >> >> And numpy.clip() may be helpful here, > > No, it's not. why not? np.clip([-1,-5,1,1e90,np.inf, np.nan], 0, np.inf) array([ 0.00000000e+00, 0.00000000e+00, 1.00000000e+00, 1.00000000e+90, Inf, NaN]) Josef > >> though last I checked, it's >> written in Python, > > No, it isn't. > >> and thus not all that fast. > > No, it's reasonably performant. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Wed Mar 17 15:08:54 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Mar 2010 14:08:54 -0500 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> Message-ID: <3d375d731003171208g2816475cka90391f4a1ad3e93@mail.gmail.com> On Wed, Mar 17, 2010 at 14:03, Keith Goodman wrote: > On Wed, Mar 17, 2010 at 12:04 PM, Christopher Barker > wrote: >> Friedrich Romstedt wrote: >>> Code: >>> >>> import numpy >>> import time >>> >>> a = numpy.random.random((2000, 2000)) >>> >>> start = time.time() >>> a[abs(a) < 10] = 0 >>> stop = time.time() >> >> I highly recommend ipython and its "timeit" function --much better for this. > > One of the methods, the fast one, uses an in-place operation, so > timeit won't work. But for cases where timeit does work, yes, it is > great. You need to provide correct setup code. from timeit import Timer t = Timer("a[abs(a) < 0.1] = 0", "import numpy;a=numpy.random.random((2000, 2000))") t.repeat() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed Mar 17 15:17:25 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 17 Mar 2010 12:17:25 -0700 Subject: [Numpy-discussion] matrix operation In-Reply-To: <4BA12847.6070006@noaa.gov> References: <27936387.post@talk.nabble.com> <4BA12847.6070006@noaa.gov> Message-ID: <4BA12AC5.40909@noaa.gov> Christopher Barker wrote: > In [32]: numpy.minimum(A,B) wow! 
fifth to answer that one -- darn I'm slow! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Mar 17 15:11:22 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Mar 2010 14:11:22 -0500 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <1cd32cbb1003171207i4f4a4e32vec42b7d374516d79@mail.gmail.com> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> <1cd32cbb1003171207i4f4a4e32vec42b7d374516d79@mail.gmail.com> Message-ID: <3d375d731003171211h56c973ecp557f4d476155d1f2@mail.gmail.com> On Wed, Mar 17, 2010 at 14:07, wrote: > On Wed, Mar 17, 2010 at 3:01 PM, Robert Kern wrote: >> On Wed, Mar 17, 2010 at 14:04, Christopher Barker wrote: >>> Friedrich Romstedt wrote: >>>> Code: >>>> >>>> import numpy >>>> import time >>>> >>>> a = numpy.random.random((2000, 2000)) >>>> >>>> start = time.time() >>>> a[abs(a) < 10] = 0 >>>> stop = time.time() >>> >>> I highly recommend ipython and its "timeit" function --much better for this. >>> >>> And numpy.clip() may be helpful here, >> >> No, it's not. > > why not? > > np.clip([-1,-5,1,1e90,np.inf, np.nan], 0, np.inf) > array([ ?0.00000000e+00, ? 0.00000000e+00, ? 1.00000000e+00, > ? ? ? ? 1.00000000e+90, ? ? ? ? ? ? ?Inf, ? ? ? ? ? ? ?NaN]) Why do you think that's equivalent? That just clips negative numbers to 0. The question is how to set numbers of small absolute magnitude to 0. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed Mar 17 15:20:41 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 17 Mar 2010 12:20:41 -0700 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> Message-ID: <4BA12B89.7000702@noaa.gov> Robert Kern wrote: > On Wed, Mar 17, 2010 at 14:04, Christopher Barker wrote: >> though last I checked, it's >> written in Python, > > No, it isn't. > >> and thus not all that fast. > > No, it's reasonably performant. nice to know -- a good while back, I wrote a small collection of add-ons to Numeric, in painfully hand-written C, including a "fast_clip". I think every one of them has been rendered obsolete in recent versions of numpy. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Wed Mar 17 15:18:01 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 15:18:01 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA12992.5040607@noaa.gov> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> Message-ID: <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> On Wed, Mar 17, 2010 at 3:12 PM, Christopher Barker wrote: > Eric Firing wrote: >> My motivation for going >> to the C level was speed and control; many ma operations are very slow >> compared to their numpy counterparts, and moving the mask handling to C >> can erase nearly all of this penalty. > > really? very cool. I was thinking about this the other day, and thinking > that in some grand future vision, all numpy arrays should be masked > arrays (or could be). The idea is that missing/invalid data is a really > common case, and it is simply wonderful to have the software handle that > gracefully. > > One of the things I liked about MATLAB was that NaNs were well handled > almost all the time. Given all the limitations of NaN, having a masked > array is a better way to go, but I'd love it if they were "just there", > and therefore EVERY numpy function and package built on numpy would > handle them gracefully. I had thought that there would be a significant > performance penalty, and thus there would be a boatload of "if_mask" > code all over the place, but maybe not. many function are defined differently for missing values, in stats, regression or time series analysis with the assumption of equally spaced time periods always needs to use special methods to handle missing values. Plus, you have to operate on two arrays and keep both in memory. So the penalty is pretty high even in C. (on the statsmodels mailing list, Wes did a comparison for different implementations of moving average, although the difference wouldn't be as huge as it currently is.) Josef > > Anyway, just a fantasy, but C-level ufuncs that support masks would be > great. > > -Chris > > > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Wed Mar 17 15:18:58 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 17 Mar 2010 12:18:58 -0700 Subject: [Numpy-discussion] Setting small numbers to zero. 
In-Reply-To: <3d375d731003171208g2816475cka90391f4a1ad3e93@mail.gmail.com> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> <3d375d731003171208g2816475cka90391f4a1ad3e93@mail.gmail.com> Message-ID: On Wed, Mar 17, 2010 at 12:08 PM, Robert Kern wrote: > On Wed, Mar 17, 2010 at 14:03, Keith Goodman wrote: >> On Wed, Mar 17, 2010 at 12:04 PM, Christopher Barker >> wrote: >>> Friedrich Romstedt wrote: >>>> Code: >>>> >>>> import numpy >>>> import time >>>> >>>> a = numpy.random.random((2000, 2000)) >>>> >>>> start = time.time() >>>> a[abs(a) < 10] = 0 >>>> stop = time.time() >>> >>> I highly recommend ipython and its "timeit" function --much better for this. >> >> One of the methods, the fast one, uses an in-place operation, so >> timeit won't work. But for cases where timeit does work, yes, it is >> great. > > You need to provide correct setup code. > > from timeit import Timer > > t = Timer("a[abs(a) < 0.1] = 0", "import > numpy;a=numpy.random.random((2000, 2000))") > t.repeat() BTW, is there some way to get the above to give the same output (and same number of runs etc) as timeit a[abs(a) < 0.1] = 0 From josef.pktd at gmail.com Wed Mar 17 15:26:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Mar 2010 15:26:41 -0400 Subject: [Numpy-discussion] Setting small numbers to zero. In-Reply-To: <3d375d731003171211h56c973ecp557f4d476155d1f2@mail.gmail.com> References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> <3d375d731003171201q62af9c1amc7387c485a48d5c3@mail.gmail.com> <1cd32cbb1003171207i4f4a4e32vec42b7d374516d79@mail.gmail.com> <3d375d731003171211h56c973ecp557f4d476155d1f2@mail.gmail.com> Message-ID: <1cd32cbb1003171226h56a6d4d1s891c92b095152914@mail.gmail.com> On Wed, Mar 17, 2010 at 3:11 PM, Robert Kern wrote: > On Wed, Mar 17, 2010 at 14:07, ? wrote: >> On Wed, Mar 17, 2010 at 3:01 PM, Robert Kern wrote: >>> On Wed, Mar 17, 2010 at 14:04, Christopher Barker wrote: >>>> Friedrich Romstedt wrote: >>>>> Code: >>>>> >>>>> import numpy >>>>> import time >>>>> >>>>> a = numpy.random.random((2000, 2000)) >>>>> >>>>> start = time.time() >>>>> a[abs(a) < 10] = 0 >>>>> stop = time.time() >>>> >>>> I highly recommend ipython and its "timeit" function --much better for this. >>>> >>>> And numpy.clip() may be helpful here, >>> >>> No, it's not. >> >> why not? >> >> np.clip([-1,-5,1,1e90,np.inf, np.nan], 0, np.inf) >> array([ ?0.00000000e+00, ? 0.00000000e+00, ? 1.00000000e+00, >> ? ? ? ? 1.00000000e+90, ? ? ? ? ? ? ?Inf, ? ? ? ? ? ? ?NaN]) > > Why do you think that's equivalent? That just clips negative numbers > to 0. The question is how to set numbers of small absolute magnitude > to 0. my mistake, that's the reason scipy.stats.threshold still exists >>> stats.threshold([-1,0.05,1e-4,1, np.inf, np.nan],0.1,np.inf) array([ 0., 0., 0., 1., Inf, NaN]) Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Wed Mar 17 15:28:04 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Mar 2010 14:28:04 -0500 Subject: [Numpy-discussion] Setting small numbers to zero. 
In-Reply-To: References: <27933569.post@talk.nabble.com> <4BA127A0.5050203@noaa.gov> <3d375d731003171208g2816475cka90391f4a1ad3e93@mail.gmail.com> Message-ID: <3d375d731003171228p601040b2r21cf771b3a1151b2@mail.gmail.com> On Wed, Mar 17, 2010 at 14:18, Keith Goodman wrote: > On Wed, Mar 17, 2010 at 12:08 PM, Robert Kern wrote: >> On Wed, Mar 17, 2010 at 14:03, Keith Goodman wrote: >>> On Wed, Mar 17, 2010 at 12:04 PM, Christopher Barker >>> wrote: >>>> Friedrich Romstedt wrote: >>>>> Code: >>>>> >>>>> import numpy >>>>> import time >>>>> >>>>> a = numpy.random.random((2000, 2000)) >>>>> >>>>> start = time.time() >>>>> a[abs(a) < 10] = 0 >>>>> stop = time.time() >>>> >>>> I highly recommend ipython and its "timeit" function --much better for this. >>> >>> One of the methods, the fast one, uses an in-place operation, so >>> timeit won't work. But for cases where timeit does work, yes, it is >>> great. >> >> You need to provide correct setup code. >> >> from timeit import Timer >> >> t = Timer("a[abs(a) < 0.1] = 0", "import >> numpy;a=numpy.random.random((2000, 2000))") >> t.repeat() > > BTW, is there some way to get the above to give the same output (and > same number of runs etc) as > > timeit a[abs(a) < 0.1] = 0 Much of that is implemented in the implementation of %timeit: http://bazaar.launchpad.net/~ipython-dev/ipython/trunk/annotate/head:/IPython/core/magic.py#L1755 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Wed Mar 17 16:48:10 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 17 Mar 2010 16:48:10 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> Message-ID: <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: > > I started thinking about a third method called __input_prepare__ that > would be called on the way into the ufunc, which would allow you to > intercept the input and pass a somehow modified copy back to the > ufunc. The total flow would be: > > 1) Call myufunc(x, y[, z]) > 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', > y' (or simply passes through x,y by default) > 3) myufunc creates the output array z (if not specified) and calls > ?.__array_prepare__(z, (myufunc, x, y, ...)) > 4) myufunc finally gets around to performing the calculation > 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns > the result to the caller > > Is this general enough for your use case? I haven't tried to think > about how to change some global state at one point and change it back > at another, that seems like a bad idea and difficult to support. Sounds like a good plan. If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. From pearu.peterson at gmail.com Wed Mar 17 17:11:35 2010 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Wed, 17 Mar 2010 23:11:35 +0200 Subject: [Numpy-discussion] f2py compiler version errors In-Reply-To: <3b6093ef1003161432t6f414bf4g379727483bb7cf9@mail.gmail.com> References: <3b6093ef1003161432t6f414bf4g379727483bb7cf9@mail.gmail.com> Message-ID: <4BA14587.6070801@cens.ioc.ee> Hi, You are running rather old numpy version (1.0.1). Try upgrading numpy as at least recent numpy from svn detects this compiler fine. 
Regards, Pearu Peter Brady wrote: > Hello all, > > The version of f2py that's installed on our system doesn't appear to > handle version numbers correctly. I've attached the relevant output of > f2py below: > > customize IntelFCompiler > Couldn't match compiler version for 'Intel(R) Fortran Intel(R) 64 > Compiler Professional for applications running on Intel(R) 64, > Version 11.0 Build 20090318 \nCopyright (C) 1985-2009 Intel > Corporation. All rights reserved.\nFOR NON-COMMERCIAL USE ONLY\n\n > Intel Fortran 11.0-1578' > IntelFCompiler instance properties: > archiver = ['ar', '-cr'] > compile_switch = '-c' > compiler_f77 = > ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > 72', '-w90', '-w95', '-KPIC', '-cm', '-O3', > '-unroll', '- > tpp7', '-xW', '-arch SSE2'] > compiler_f90 = > ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > FR', '-KPIC', '-cm', '-O3', '-unroll', '-tpp7', > '-xW', '- > arch SSE2'] > compiler_fix = > ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > FI', '-KPIC', '-cm', '-O3', '-unroll', '-tpp7', > '-xW', '- > arch SSE2'] > libraries = [] > library_dirs = [] > linker_so = > ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '- > shared', '-tpp7', '-xW', '-arch SSE2'] > object_switch = '-o ' > ranlib = ['ranlib'] > version = None > version_cmd = > ['/opt/intel/Compiler/11.0/083/bin/intel64/ifort', '-FI > -V -c /tmp/tmpx6aZa8__dummy.f -o > /tmp/tmpx6aZa8__dummy.o'] > > > The output of f2py is: > > Version: 2_3473 > numpy Version: 1.0.1 > Requires: Python 2.3 or higher. > License: NumPy license (see LICENSE.txt in the NumPy source code) > Copyright 1999 - 2005 Pearu Peterson all rights reserved. > http://cens.ioc.ee/projects/f2py2e/ > > > We're running 64bit linux with python 2.4. How do I make this work? > > thanks, > Peter. > > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From dsdale24 at gmail.com Wed Mar 17 17:13:44 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 17 Mar 2010 17:13:44 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM wrote: > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: >> >> I started thinking about a third method called __input_prepare__ that >> would be called on the way into the ufunc, which would allow you to >> intercept the input and pass a somehow modified copy back to the >> ufunc. The total flow would be: >> >> 1) Call myufunc(x, y[, z]) >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', >> y' (or simply passes through x,y by default) >> 3) myufunc creates the output array z (if not specified) and calls >> ?.__array_prepare__(z, (myufunc, x, y, ...)) >> 4) myufunc finally gets around to performing the calculation >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns >> the result to the caller >> >> Is this general enough for your use case? I haven't tried to think >> about how to change some global state at one point and change it back >> at another, that seems like a bad idea and difficult to support. > > > Sounds like a good plan. 
If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. I think it is better to keep them separate, so we don't have one method that is trying to do too much. It would be easier to explain in the documentation. I may not have much time to look into this until after Monday. Is there a deadline we need to consider? Darren From charlesr.harris at gmail.com Wed Mar 17 17:43:03 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Mar 2010 15:43:03 -0600 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM wrote: > > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: > >> > >> I started thinking about a third method called __input_prepare__ that > >> would be called on the way into the ufunc, which would allow you to > >> intercept the input and pass a somehow modified copy back to the > >> ufunc. The total flow would be: > >> > >> 1) Call myufunc(x, y[, z]) > >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', > >> y' (or simply passes through x,y by default) > >> 3) myufunc creates the output array z (if not specified) and calls > >> ?.__array_prepare__(z, (myufunc, x, y, ...)) > >> 4) myufunc finally gets around to performing the calculation > >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns > >> the result to the caller > >> > >> Is this general enough for your use case? I haven't tried to think > >> about how to change some global state at one point and change it back > >> at another, that seems like a bad idea and difficult to support. > > > > > > Sounds like a good plan. If we could find a way to merge the first two > (__input_prepare__ and __array_prepare__), that'd be ideal. > > I think it is better to keep them separate, so we don't have one > method that is trying to do too much. It would be easier to explain in > the documentation. > > I may not have much time to look into this until after Monday. Is > there a deadline we need to consider? > > I don't think this should go into 2.0, I think it needs more thought. And 2.0 already has significant code churn. Is there any reason beyond a big hassle not to set/restore the error state around all the ufunc calls in ma? Beyond that, the PEP that you pointed to looks interesting. Maybe some sort of decorator around ufunc calls could also be made to work. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Mar 17 17:56:20 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Mar 2010 15:56:20 -0600 Subject: [Numpy-discussion] bug in ndarray.resize? In-Reply-To: <4BA0EA53.10607@american.edu> References: <4BA0E263.7090100@american.edu> <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> <4BA0EA53.10607@american.edu> Message-ID: On Wed, Mar 17, 2010 at 8:42 AM, Alan G Isaac wrote: > On 3/17/2010 10:16 AM, josef.pktd at gmail.com wrote: > > numpy.resize(a, new_shape) > > Return a new array with the specified shape. > > > > If the new array is larger than the original array, then the new array > > is filled with repeated copied of a. Note that this behavior is > > different from a.resize(new_shape) which fills with zeros instead of > > repeated copies of a. > > > Yes indeed. 
Sorry, I must have scrolled the help without realizing it, > and this part was at the top. > > So my follow up: why is this desirable/necessary? (I find it surprising.) > IIRC, it behaved that way in Numeric. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsdale24 at gmail.com Wed Mar 17 19:26:58 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 17 Mar 2010 19:26:58 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris wrote: > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale wrote: >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM wrote: >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: >> >> >> >> I started thinking about a third method called __input_prepare__ that >> >> would be called on the way into the ufunc, which would allow you to >> >> intercept the input and pass a somehow modified copy back to the >> >> ufunc. The total flow would be: >> >> >> >> 1) Call myufunc(x, y[, z]) >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', >> >> y' (or simply passes through x,y by default) >> >> 3) myufunc creates the output array z (if not specified) and calls >> >> ?.__array_prepare__(z, (myufunc, x, y, ...)) >> >> 4) myufunc finally gets around to performing the calculation >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns >> >> the result to the caller >> >> >> >> Is this general enough for your use case? I haven't tried to think >> >> about how to change some global state at one point and change it back >> >> at another, that seems like a bad idea and difficult to support. >> > >> > >> > Sounds like a good plan. If we could find a way to merge the first two >> > (__input_prepare__ and __array_prepare__), that'd be ideal. >> >> I think it is better to keep them separate, so we don't have one >> method that is trying to do too much. It would be easier to explain in >> the documentation. >> >> I may not have much time to look into this until after Monday. Is >> there a deadline we need to consider? >> > > I don't think this should go into 2.0, I think it needs more thought. Now that you mention it, I agree that it would be too rushed to try to get it in for 2.0. Concerning a later release, is there anything in particular that you think needs to be clarified or reconsidered? > And > 2.0 already has significant code churn. Is there any reason beyond a big > hassle not to set/restore the error state around all the ufunc calls in ma? > Beyond that, the PEP that you pointed to looks interesting. Maybe some sort > of decorator around ufunc calls could also be made to work. I think the PEP is interesting, but it is languishing. There were some questions and criticisms on the mailing list that I do not think were satisfactorily addressed, and as far as I know the author of the PEP has not pursued the matter further. There was some interest on the python-dev mailing list in the numpy community's use case, but I think we need to consider what can be done now to meet the needs of ndarray subclasses. I don't see PEP 3124 happening in the near future. What I am proposing is a simple extension to our existing framework to let subclasses hook into ufuncs and customize their behavior based on the context of the operation (using the __array_priority__ of the inputs and/or outputs, and the identity of the ufunc). 
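To make this concrete, the dispatch inside a ufunc would look roughly like the
sketch below. This is only a sketch: the helper is a made-up stand-in for the
real machinery, it assumes both inputs are already arrays, and of course
__input_prepare__ does not exist anywhere yet (only __array_prepare__ and
__array_wrap__ do).

import numpy as np

def toy_binary_ufunc(func, x, y, out=None):
    # pick the subtype in charge, using the existing __array_priority__ logic
    obj = max((x, y), key=lambda a: getattr(a, '__array_priority__', 0.0))
    # step 2 (proposed): let that subtype massage the inputs
    if hasattr(obj, '__input_prepare__'):
        x, y = obj.__input_prepare__(func, x, y)
    # step 3 (existing hook): hand the freshly allocated output to __array_prepare__
    if out is None:
        out = np.empty(np.broadcast(x, y).shape)
    out = obj.__array_prepare__(out, (func, (x, y), 0))
    # step 4: the actual element-wise loop
    out[...] = func(np.asarray(x), np.asarray(y))
    # step 5 (existing hook): __array_wrap__ repackages the result for the subtype
    return obj.__array_wrap__(out, (func, (x, y), 0))

With func=np.add and two plain ndarrays this is just addition; a subclass opts
in simply by defining the three methods.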
The steps I listed allow customization at the critical steps: prepare the input, prepare the output, populate the output (currently no proposal for customization here), and finalize the output. The only additional step proposed is to prepare the input. In the long run, we could consider if ufuncs should be instances of a class, perhaps implemented in Cython. This way the ufunc will be able to pass itself to the special array methods as part of the context tuple, as is currently done. Maybe an alternative approach would be for ufuncs to provide methods where subclasses could register routines for the various steps I specified based on the types of the inputs, similar to the PEP. This way, the ufunc would determine the context based on the input (rather than the current way of the ufunc determining part of the context based on the input by inspecting __array_priority__ and then the input with highest priority determining the context based on the identity of the ufunc and the rest of the input.) This new (half baked) approach could be backward-compatible with the old one: if the combination of inputs isn't found in the registry, it would fall back on the existing input-/array_prepare array_wrap mechanisms (which in principle could then be deprecated, and at that point __array_priority__ might no longer be necessary). I don't see anything to indicate that we would regret implementing a special __input_prepare__ method down the road. Darren From charlesr.harris at gmail.com Wed Mar 17 20:22:07 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Mar 2010 18:22:07 -0600 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris > wrote: > > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale wrote: > >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM > wrote: > >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: > >> >> > >> >> I started thinking about a third method called __input_prepare__ that > >> >> would be called on the way into the ufunc, which would allow you to > >> >> intercept the input and pass a somehow modified copy back to the > >> >> ufunc. The total flow would be: > >> >> > >> >> 1) Call myufunc(x, y[, z]) > >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns > x', > >> >> y' (or simply passes through x,y by default) > >> >> 3) myufunc creates the output array z (if not specified) and calls > >> >> ?.__array_prepare__(z, (myufunc, x, y, ...)) > >> >> 4) myufunc finally gets around to performing the calculation > >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and > returns > >> >> the result to the caller > >> >> > >> >> Is this general enough for your use case? I haven't tried to think > >> >> about how to change some global state at one point and change it back > >> >> at another, that seems like a bad idea and difficult to support. > >> > > >> > > >> > Sounds like a good plan. If we could find a way to merge the first two > >> > (__input_prepare__ and __array_prepare__), that'd be ideal. > >> > >> I think it is better to keep them separate, so we don't have one > >> method that is trying to do too much. It would be easier to explain in > >> the documentation. > >> > >> I may not have much time to look into this until after Monday. Is > >> there a deadline we need to consider? 
> >> > > > > I don't think this should go into 2.0, I think it needs more thought. > > Now that you mention it, I agree that it would be too rushed to try to > get it in for 2.0. Concerning a later release, is there anything in > particular that you think needs to be clarified or reconsidered? > > > And > > 2.0 already has significant code churn. Is there any reason beyond a big > > hassle not to set/restore the error state around all the ufunc calls in > ma? > > Beyond that, the PEP that you pointed to looks interesting. Maybe some > sort > > of decorator around ufunc calls could also be made to work. > > I think the PEP is interesting, but it is languishing. There were some > questions and criticisms on the mailing list that I do not think were > satisfactorily addressed, and as far as I know the author of the PEP > has not pursued the matter further. There was some interest on the > python-dev mailing list in the numpy community's use case, but I think > we need to consider what can be done now to meet the needs of ndarray > subclasses. I don't see PEP 3124 happening in the near future. > > What I am proposing is a simple extension to our existing framework to > let subclasses hook into ufuncs and customize their behavior based on > the context of the operation (using the __array_priority__ of the > inputs and/or outputs, and the identity of the ufunc). The steps I > listed allow customization at the critical steps: prepare the input, > prepare the output, populate the output (currently no proposal for > customization here), and finalize the output. The only additional step > proposed is to prepare the input. > > What bothers me here is the opposing desire to separate ufuncs from their ndarray dependency, having them operate on buffer objects instead. As I see it ufuncs would be split into layers, with a lower layer operating on buffer objects, and an upper layer tying them together with ndarrays where the "business" logic -- kinds, casting, etc -- resides. It is in that upper layer that what you are proposing would reside. Mind, I'm not sure that having matrices and masked arrays subclassing ndarray was the way to go, but given that they do one possible solution is to dump the whole mess onto the subtype with the highest priority. That subtype would then be responsible for casts and all the other stuff needed for the call and wrapping the result. There could be library routines to help with that. It seems to me that that would be the most general way to go. In that sense ndarrays themselves would just be another subtype with especially low priority. In the long run, we could consider if ufuncs should be instances of a > class, perhaps implemented in Cython. This way the ufunc will be able > to pass itself to the special array methods as part of the context > tuple, as is currently done. Maybe an alternative approach would be > for ufuncs to provide methods where subclasses could register routines > for the various steps I specified based on the types of the inputs, > similar to the PEP. This way, the ufunc would determine the context > based on the input (rather than the current way of the ufunc > determining part of the context based on the input by inspecting > __array_priority__ and then the input with highest priority > determining the context based on the identity of the ufunc and the > rest of the input.) 
This new (half baked) approach could be > backward-compatible with the old one: if the combination of inputs > isn't found in the registry, it would fall back on the existing > input-/array_prepare array_wrap mechanisms (which in principle could > then be deprecated, and at that point __array_priority__ might no > longer be necessary). I don't see anything to indicate that we would > regret implementing a special __input_prepare__ method down the road. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsdale24 at gmail.com Wed Mar 17 21:39:12 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 17 Mar 2010 21:39:12 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris wrote: > > > On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale wrote: >> >> On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris >> wrote: >> > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale wrote: >> >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM >> >> wrote: >> >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: >> >> >> >> >> >> I started thinking about a third method called __input_prepare__ >> >> >> that >> >> >> would be called on the way into the ufunc, which would allow you to >> >> >> intercept the input and pass a somehow modified copy back to the >> >> >> ufunc. The total flow would be: >> >> >> >> >> >> 1) Call myufunc(x, y[, z]) >> >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns >> >> >> x', >> >> >> y' (or simply passes through x,y by default) >> >> >> 3) myufunc creates the output array z (if not specified) and calls >> >> >> ?.__array_prepare__(z, (myufunc, x, y, ...)) >> >> >> 4) myufunc finally gets around to performing the calculation >> >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and >> >> >> returns >> >> >> the result to the caller >> >> >> >> >> >> Is this general enough for your use case? I haven't tried to think >> >> >> about how to change some global state at one point and change it >> >> >> back >> >> >> at another, that seems like a bad idea and difficult to support. >> >> > >> >> > >> >> > Sounds like a good plan. If we could find a way to merge the first >> >> > two >> >> > (__input_prepare__ and __array_prepare__), that'd be ideal. >> >> >> >> I think it is better to keep them separate, so we don't have one >> >> method that is trying to do too much. It would be easier to explain in >> >> the documentation. >> >> >> >> I may not have much time to look into this until after Monday. Is >> >> there a deadline we need to consider? >> >> >> > >> > I don't think this should go into 2.0, I think it needs more thought. >> >> Now that you mention it, I agree that it would be too rushed to try to >> get it in for 2.0. Concerning a later release, is there anything in >> particular that you think needs to be clarified or reconsidered? >> >> > And >> > 2.0 already has significant code churn. Is there any reason beyond a big >> > hassle not to set/restore the error state around all the ufunc calls in >> > ma? >> > Beyond that, the PEP that you pointed to looks interesting. Maybe some >> > sort >> > of decorator around ufunc calls could also be made to work. >> >> I think the PEP is interesting, but it is languishing. 
There were some >> questions and criticisms on the mailing list that I do not think were >> satisfactorily addressed, and as far as I know the author of the PEP >> has not pursued the matter further. There was some interest on the >> python-dev mailing list in the numpy community's use case, but I think >> we need to consider what can be done now to meet the needs of ndarray >> subclasses. I don't see PEP 3124 happening in the near future. >> >> What I am proposing is a simple extension to our existing framework to >> let subclasses hook into ufuncs and customize their behavior based on >> the context of the operation (using the __array_priority__ of the >> inputs and/or outputs, and the identity of the ufunc). The steps I >> listed allow customization at the critical steps: prepare the input, >> prepare the output, populate the output (currently no proposal for >> customization here), and finalize the output. The only additional step >> proposed is to prepare the input. >> > > What bothers me here is the opposing desire to separate ufuncs from their > ndarray dependency, having them operate on buffer objects instead. As I see > it ufuncs would be split into layers, with a lower layer operating on buffer > objects, and an upper layer tying them together with ndarrays where the > "business" logic -- kinds, casting, etc -- resides. It is in that upper > layer that what you are proposing would reside. Mind, I'm not sure that > having matrices and masked arrays subclassing ndarray was the way to go, but > given that they do one possible solution is to dump the whole mess onto the > subtype with the highest priority. That subtype would then be responsible > for casts and all the other stuff needed for the call and wrapping the > result. There could be library routines to help with that. It seems to me > that that would be the most general way to go. In that sense ndarrays > themselves would just be another subtype with especially low priority. I'm sorry, I didn't understand your point. What you described sounds identical to how things are currently done. What distinction are you making, aside from operating on the buffer object? How would adding a method to modify the input to a ufunc complicate the situation? Darren From charlesr.harris at gmail.com Wed Mar 17 22:16:29 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Mar 2010 20:16:29 -0600 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris > wrote: > > > > > > On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale wrote: > >> > >> On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris > >> wrote: > >> > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale > wrote: > >> >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM > >> >> wrote: > >> >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: > >> >> >> > >> >> >> I started thinking about a third method called __input_prepare__ > >> >> >> that > >> >> >> would be called on the way into the ufunc, which would allow you > to > >> >> >> intercept the input and pass a somehow modified copy back to the > >> >> >> ufunc. 
The total flow would be: > >> >> >> > >> >> >> 1) Call myufunc(x, y[, z]) > >> >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns > >> >> >> x', > >> >> >> y' (or simply passes through x,y by default) > >> >> >> 3) myufunc creates the output array z (if not specified) and calls > >> >> >> ?.__array_prepare__(z, (myufunc, x, y, ...)) > >> >> >> 4) myufunc finally gets around to performing the calculation > >> >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and > >> >> >> returns > >> >> >> the result to the caller > >> >> >> > >> >> >> Is this general enough for your use case? I haven't tried to think > >> >> >> about how to change some global state at one point and change it > >> >> >> back > >> >> >> at another, that seems like a bad idea and difficult to support. > >> >> > > >> >> > > >> >> > Sounds like a good plan. If we could find a way to merge the first > >> >> > two > >> >> > (__input_prepare__ and __array_prepare__), that'd be ideal. > >> >> > >> >> I think it is better to keep them separate, so we don't have one > >> >> method that is trying to do too much. It would be easier to explain > in > >> >> the documentation. > >> >> > >> >> I may not have much time to look into this until after Monday. Is > >> >> there a deadline we need to consider? > >> >> > >> > > >> > I don't think this should go into 2.0, I think it needs more thought. > >> > >> Now that you mention it, I agree that it would be too rushed to try to > >> get it in for 2.0. Concerning a later release, is there anything in > >> particular that you think needs to be clarified or reconsidered? > >> > >> > And > >> > 2.0 already has significant code churn. Is there any reason beyond a > big > >> > hassle not to set/restore the error state around all the ufunc calls > in > >> > ma? > >> > Beyond that, the PEP that you pointed to looks interesting. Maybe some > >> > sort > >> > of decorator around ufunc calls could also be made to work. > >> > >> I think the PEP is interesting, but it is languishing. There were some > >> questions and criticisms on the mailing list that I do not think were > >> satisfactorily addressed, and as far as I know the author of the PEP > >> has not pursued the matter further. There was some interest on the > >> python-dev mailing list in the numpy community's use case, but I think > >> we need to consider what can be done now to meet the needs of ndarray > >> subclasses. I don't see PEP 3124 happening in the near future. > >> > >> What I am proposing is a simple extension to our existing framework to > >> let subclasses hook into ufuncs and customize their behavior based on > >> the context of the operation (using the __array_priority__ of the > >> inputs and/or outputs, and the identity of the ufunc). The steps I > >> listed allow customization at the critical steps: prepare the input, > >> prepare the output, populate the output (currently no proposal for > >> customization here), and finalize the output. The only additional step > >> proposed is to prepare the input. > >> > > > > What bothers me here is the opposing desire to separate ufuncs from their > > ndarray dependency, having them operate on buffer objects instead. As I > see > > it ufuncs would be split into layers, with a lower layer operating on > buffer > > objects, and an upper layer tying them together with ndarrays where the > > "business" logic -- kinds, casting, etc -- resides. It is in that upper > > layer that what you are proposing would reside. 
Mind, I'm not sure that > > having matrices and masked arrays subclassing ndarray was the way to go, > but > > given that they do one possible solution is to dump the whole mess onto > the > > subtype with the highest priority. That subtype would then be responsible > > for casts and all the other stuff needed for the call and wrapping the > > result. There could be library routines to help with that. It seems to me > > that that would be the most general way to go. In that sense ndarrays > > themselves would just be another subtype with especially low priority. > > I'm sorry, I didn't understand your point. What you described sounds > identical to how things are currently done. What distinction are you > making, aside from operating on the buffer object? How would adding a > method to modify the input to a ufunc complicate the situation? > > Just *one* function to rule them all and on the subtype dump it. No __array_wrap__, __input_prepare__, or __array_prepare__, just something like __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing having the ufunc upper layer do nothing but decide which argument type will do all the rest of the work, casting, calling the low level ufunc base, providing buffers, wrapping, etc. Instead of pasting bits and pieces into the existing framework I would like to lay out a line of attack that ends up separating ufuncs into smaller pieces that provide low level routines that work on strided memory while leaving policy implementation to the subtype. There would need to be some default type (ndarray) when the functions are called on nested lists and scalars and I'm not sure of the best way to handle that. I'm just sort of thinking out loud, don't take it too seriously. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From frank at horow.net Thu Mar 18 02:01:15 2010 From: frank at horow.net (Frank Horowitz) Date: Thu, 18 Mar 2010 14:01:15 +0800 Subject: [Numpy-discussion] evaluating a function in a matrix element??? Message-ID: <4A2645B8-A650-42E5-B021-CDE54CA39791@horow.net> Dear All, I'm working on a piece of optimisation code where it turns out to be mathematically convenient to have a matrix where a few pre-chosen elements must be computed at evaluation time for the dot product (i.e. matrix multiplication) of a matrix with a vector. As I see the problem, there are two basic approaches to accomplishing this. First (and perhaps conceptually simplest, not to mention apparently faster) might be to stash appropriate functions at their corresponding locations in the matrix (with the rest of the matrix being constants, as usual). I mucked around with this for a little while in iPython, and it appears that having dtype == object_ works for stashing the references to the functions, but fails to allow me to actually evaluate the function(s) when the matrix is used in the dot product. Does anyone have any experience with making such a beast work within numpy? If so, how? The second basic approach is to build a ufunc that implements the switching logic, and returns the constants and evaluated functions in the appropriate locations. This seems to be the more do-able approach, but it requires the ufunc to be aware of both the position of each element (via index, or somesuch) as well as the arguments to the functions themselves being evaluated at the matrix elements. 
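For what it's worth, the dumbest fallback I can see for the first approach is
to expand the object matrix into a plain float array immediately before each
dot product, something like the sketch below (the helper name and the argument
handed to the stashed functions are purely illustrative):

import numpy as np

def evaluate_matrix(m, *args):
    # evaluate any callable elements, pass the constants straight through
    out = np.empty(m.shape, dtype=float)
    for idx, elem in np.ndenumerate(m):
        out[idx] = elem(*args) if callable(elem) else elem
    return out

m = np.array([[1.0, lambda t: np.sin(t)],
              [0.0, 2.0]], dtype=object)
v = np.array([3.0, 4.0])
print np.dot(evaluate_matrix(m, 0.5), v)

That is Python-level looping over the whole matrix, but since only a few
pre-chosen elements are actually callables, keeping a short list of
(index, function) pairs and patching just those entries before each product
would amount to the same thing with less overhead.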
It appears that frompyfunc() nearly does what I want, but I am currently failing to see how to actually *use* it for anything more elaborate than the octal example code (i.e. one value in, and one value out). Does anyone have any other more elaborate examples they can point me towards? Thanks in advance for any help you might be able to provide! Frank Horowitz frank at horow.net From faltet at pytables.org Thu Mar 18 04:55:49 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 18 Mar 2010 09:55:49 +0100 Subject: [Numpy-discussion] bug in ndarray.resize? In-Reply-To: References: <4BA0E263.7090100@american.edu> <4BA0EA53.10607@american.edu> Message-ID: <201003180955.49789.faltet@pytables.org> A Wednesday 17 March 2010 22:56:20 Charles R Harris escrigu?: > On Wed, Mar 17, 2010 at 8:42 AM, Alan G Isaac wrote: > > On 3/17/2010 10:16 AM, josef.pktd at gmail.com wrote: > > > numpy.resize(a, new_shape) > > > Return a new array with the specified shape. > > > > > > If the new array is larger than the original array, then the new array > > > is filled with repeated copied of a. Note that this behavior is > > > different from a.resize(new_shape) which fills with zeros instead of > > > repeated copies of a. > > > > Yes indeed. Sorry, I must have scrolled the help without realizing it, > > and this part was at the top. > > > > So my follow up: why is this desirable/necessary? (I find it > > surprising.) > > IIRC, it behaved that way in Numeric. This does not mean that this behaviour is desirable. I find it inconsistent and misleading so +1 for fixing it. -- Francesc Alted From seb.haase at gmail.com Thu Mar 18 04:56:21 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Thu, 18 Mar 2010 09:56:21 +0100 Subject: [Numpy-discussion] bug in ndarray.resize? In-Reply-To: References: <4BA0E263.7090100@american.edu> <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> <4BA0EA53.10607@american.edu> Message-ID: On Wed, Mar 17, 2010 at 10:56 PM, Charles R Harris wrote: > > > On Wed, Mar 17, 2010 at 8:42 AM, Alan G Isaac wrote: >> >> On 3/17/2010 10:16 AM, josef.pktd at gmail.com wrote: >> > numpy.resize(a, new_shape) >> > Return a new array with the specified shape. >> > >> > If the new array is larger than the original array, then the new array >> > is filled with repeated copied of a. Note that this behavior is >> > different from a.resize(new_shape) which fills with zeros instead of >> > repeated copies of a. >> >> >> Yes indeed. ?Sorry, I must have scrolled the help without realizing it, >> and this part was at the top. >> >> So my follow up: why is this desirable/necessary? ?(I find it surprising.) > > IIRC, it behaved that way in Numeric. > How would people feel about unifying the function vs. the method behavior ? One could add an addition option like `repeat` or `fillZero`. One could (at first !?) keep opposite defaults to not change the current behavior. But this way it would be most visible and clear what is going on. Regards, Sebastian Haase From gael.varoquaux at normalesup.org Thu Mar 18 04:58:17 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 18 Mar 2010 09:58:17 +0100 Subject: [Numpy-discussion] bug in ndarray.resize? In-Reply-To: References: <4BA0E263.7090100@american.edu> <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> <4BA0EA53.10607@american.edu> Message-ID: <20100318085817.GB30043@phare.normalesup.org> On Thu, Mar 18, 2010 at 09:56:21AM +0100, Sebastian Haase wrote: > How would people feel about unifying the function vs. 
the method behavior ? > One could add an addition option like > `repeat` or `fillZero`. You mean fill_zero... Sorry, my linter went off. :) Ga?l From dsdale24 at gmail.com Thu Mar 18 08:14:08 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 18 Mar 2010 08:14:08 -0400 Subject: [Numpy-discussion] ufunc improvements [Was: Warnings in numpy.ma.test()] Message-ID: On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris wrote: > > > On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale wrote: >> >> On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris >> wrote: >> > >> > >> > On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale wrote: >> >> >> >> On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris >> >> wrote: >> >> > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale >> >> > wrote: >> >> >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM >> >> >> wrote: >> >> >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: >> >> >> >> >> >> >> >> I started thinking about a third method called __input_prepare__ >> >> >> >> that >> >> >> >> would be called on the way into the ufunc, which would allow you >> >> >> >> to >> >> >> >> intercept the input and pass a somehow modified copy back to the >> >> >> >> ufunc. The total flow would be: >> >> >> >> >> >> >> >> 1) Call myufunc(x, y[, z]) >> >> >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which >> >> >> >> returns >> >> >> >> x', >> >> >> >> y' (or simply passes through x,y by default) >> >> >> >> 3) myufunc creates the output array z (if not specified) and >> >> >> >> calls >> >> >> >> ?.__array_prepare__(z, (myufunc, x, y, ...)) >> >> >> >> 4) myufunc finally gets around to performing the calculation >> >> >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and >> >> >> >> returns >> >> >> >> the result to the caller >> >> >> >> >> >> >> >> Is this general enough for your use case? I haven't tried to >> >> >> >> think >> >> >> >> about how to change some global state at one point and change it >> >> >> >> back >> >> >> >> at another, that seems like a bad idea and difficult to support. >> >> >> > >> >> >> > >> >> >> > Sounds like a good plan. If we could find a way to merge the first >> >> >> > two >> >> >> > (__input_prepare__ and __array_prepare__), that'd be ideal. >> >> >> >> >> >> I think it is better to keep them separate, so we don't have one >> >> >> method that is trying to do too much. It would be easier to explain >> >> >> in >> >> >> the documentation. >> >> >> >> >> >> I may not have much time to look into this until after Monday. Is >> >> >> there a deadline we need to consider? >> >> >> >> >> > >> >> > I don't think this should go into 2.0, I think it needs more thought. >> >> >> >> Now that you mention it, I agree that it would be too rushed to try to >> >> get it in for 2.0. Concerning a later release, is there anything in >> >> particular that you think needs to be clarified or reconsidered? >> >> >> >> > And >> >> > 2.0 already has significant code churn. Is there any reason beyond a >> >> > big >> >> > hassle not to set/restore the error state around all the ufunc calls >> >> > in >> >> > ma? >> >> > Beyond that, the PEP that you pointed to looks interesting. Maybe >> >> > some >> >> > sort >> >> > of decorator around ufunc calls could also be made to work. >> >> >> >> I think the PEP is interesting, but it is languishing. There were some >> >> questions and criticisms on the mailing list that I do not think were >> >> satisfactorily addressed, and as far as I know the author of the PEP >> >> has not pursued the matter further. 
There was some interest on the >> >> python-dev mailing list in the numpy community's use case, but I think >> >> we need to consider what can be done now to meet the needs of ndarray >> >> subclasses. I don't see PEP 3124 happening in the near future. >> >> >> >> What I am proposing is a simple extension to our existing framework to >> >> let subclasses hook into ufuncs and customize their behavior based on >> >> the context of the operation (using the __array_priority__ of the >> >> inputs and/or outputs, and the identity of the ufunc). The steps I >> >> listed allow customization at the critical steps: prepare the input, >> >> prepare the output, populate the output (currently no proposal for >> >> customization here), and finalize the output. The only additional step >> >> proposed is to prepare the input. >> >> >> > >> > What bothers me here is the opposing desire to separate ufuncs from >> > their >> > ndarray dependency, having them operate on buffer objects instead. As I >> > see >> > it ufuncs would be split into layers, with a lower layer operating on >> > buffer >> > objects, and an upper layer tying them together with ndarrays where the >> > "business" logic -- kinds, casting, etc -- resides. It is in that upper >> > layer that what you are proposing would reside. Mind, I'm not sure that >> > having matrices and masked arrays subclassing ndarray was the way to go, >> > but >> > given that they do one possible solution is to dump the whole mess onto >> > the >> > subtype with the highest priority. That subtype would then be >> > responsible >> > for casts and all the other stuff needed for the call and wrapping the >> > result. There could be library routines to help with that. It seems to >> > me >> > that that would be the most general way to go. In that sense ndarrays >> > themselves would just be another subtype with especially low priority. >> >> I'm sorry, I didn't understand your point. What you described sounds >> identical to how things are currently done. What distinction are you >> making, aside from operating on the buffer object? How would adding a >> method to modify the input to a ufunc complicate the situation? >> > > Just *one* function to rule them all and on the subtype dump it. No > __array_wrap__, __input_prepare__, or __array_prepare__, just something like > __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing > having the ufunc upper layer do nothing but decide which argument type will > do all the rest of the work, casting, calling the low level ufunc base, > providing buffers, wrapping, etc. Instead of pasting bits and pieces into > the existing framework I would like to lay out a line of attack that ends up > separating ufuncs into smaller pieces that provide low level routines that > work on strided memory while leaving policy implementation to the subtype. > There would need to be some default type (ndarray) when the functions are > called on nested lists and scalars and I'm not sure of the best way to > handle that. > > I'm just sort of thinking out loud, don't take it too seriously. Thanks for the clarification. I think I see how this could work: if ufuncs were callable instances of classes, __call__ would find the input with highest priority and pass itself and the input to that object's __handle_ufunc__. 
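In skeleton form the dispatch is tiny; this is a pure sketch of course, since
neither __handle_ufunc__ nor an execute method exists anywhere today:

class ufunc(object):
    def __call__(self, *inputs):
        # hand everything to the input with the highest priority
        obj = max(inputs, key=lambda a: getattr(a, '__array_priority__', 0.0))
        return obj.__handle_ufunc__(self, inputs)

    def execute(self, inputs, out):
        # the low-level loop over strided memory; knows nothing about subclasses
        pass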
Now it is up to __handle_ufunc__ to determine whether and how to modify the input, call some method on the ufunc (like execute) to perform the buffer operation, then __handle_ufunc__ performs the cast, deals with metadata and returns the result. I skipped a step: initializing the output buffer. Would that be rolled into the ufunc execution, or should it be possible for __handle_ufunc__ to access the initialized buffer before execution occurs(__array_prepare__)? I think it is important to be able to perform the cast and calculate metadata before ufunc execution. If an error occurs, an exception can be raised before the ufunc operates on the arrays, which can modifies the data in place. Darren From martin.raspaud at smhi.se Thu Mar 18 09:33:30 2010 From: martin.raspaud at smhi.se (Martin Raspaud) Date: Thu, 18 Mar 2010 14:33:30 +0100 Subject: [Numpy-discussion] Int bitsize in python and c Message-ID: <4BA22BAA.5050402@smhi.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello, I work on a 64bit machine with 64bits enable fedora on it. I just discovered that numpy.int on the python part are 64bits ints, while npy_int in the C api are 32bits ints. I can live with it, but it seems to be different on 32bit machines, hence I wonder what is the right way to do when retrieving an array from python to C. Here is what I use now: data_pyarray = (PyArrayObject *)PyArray_ContiguousFromObject(data_list, PyArray_INT, 1, 2); but that implies that I send np.int32 arrays to the C part. Should I use longs instead ? Regards, Martin -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJLoiuqAAoJEBdvyODiyJI4TikIAIUpnsIxxeYMlz8qEeZL/UUB 3UTGOCcrcIICPVRW/CLbOss5W4xe8BTxPslRXZfckSuMMgHHiD3rGC302gZgfvsb mS6fcDzTOboJ1da1xoczpJYVCwvC9aWAPEjEDa6jyI331pDAXABurmjzIQqjowDw 1cWX5swt9MeSn0yOa/a2EYQP8Xj+n0RQlSIutEDR5jktlK3yyHX8LAtZd0tAPgrd hr9RGwO09Hwcn7ke4B9SwHF7Zg/mBrHgdTdaufW+kjPleZ479lyMO8r/LsWbehVo usQ5wefnmnzhDhOoxff8aKUo8D+Ne8gqxI4BR5EOAdHfQ2uUPpBA91pJ0cNbzZI= =E0XH -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... Name: martin_raspaud.vcf Type: text/x-vcard Size: 260 bytes Desc: not available URL: From faltet at pytables.org Thu Mar 18 09:57:25 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 18 Mar 2010 14:57:25 +0100 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal Message-ID: <201003181457.25785.faltet@pytables.org> Hi, Konrad Hinsen has just told me that my article "Why Modern CPUs Are Starving and What Can Be Done About It", which has just released on the March/April issue of "Computing in Science and Engineering", also made into this month's free-access selection on IEEE's ComputingNow portal: http://www.computer.org/portal/web/computingnow http://www.computer.org/portal/web/computingnow/0310/whatsnew/cise On it, I discuss one of my favourite subjects, memory access, and why it is important for nowadays computing. There are also recommendations for people wanting to squeeze the maximum of performance out of their computers. And, although I tried to be as language-agnostic as I could, there can be seen some Python references here and there :-). 
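Just to give a quick taste of the kind of memory-bound expression I have in
mind, consider this toy comparison (the second line needs the numexpr
package):

import numpy as np
import numexpr as ne

a = np.random.rand(10*1000*1000)
b = np.random.rand(10*1000*1000)

c = 3*a + 2*b                  # plain numpy: several large temporaries travel through memory
c = ne.evaluate("3*a + 2*b")   # numexpr: evaluated block-wise, much kinder to the cache

Why the second version can be expected to win once the arrays stop fitting in
cache is precisely the kind of thing the article tries to explain.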
Well, sorry about this semi-OT but I could not resist :-) -- Francesc Alted From dagss at student.matnat.uio.no Thu Mar 18 10:25:39 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 18 Mar 2010 15:25:39 +0100 Subject: [Numpy-discussion] Int bitsize in python and c In-Reply-To: <4BA22BAA.5050402@smhi.se> References: <4BA22BAA.5050402@smhi.se> Message-ID: <4BA237E3.2080508@student.matnat.uio.no> Martin Raspaud wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello, > > I work on a 64bit machine with 64bits enable fedora on it. > > I just discovered that numpy.int on the python part are 64bits ints, while > npy_int in the C api are 32bits ints. > np.intc Dag Sverre > I can live with it, but it seems to be different on 32bit machines, hence I > wonder what is the right way to do when retrieving an array from python to C. > > Here is what I use now: > data_pyarray = (PyArrayObject *)PyArray_ContiguousFromObject(data_list, > PyArray_INT, 1, 2); > > but that implies that I send np.int32 arrays to the C part. > > Should I use longs instead ? > > Regards, > Martin > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.9 (GNU/Linux) > Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJLoiuqAAoJEBdvyODiyJI4TikIAIUpnsIxxeYMlz8qEeZL/UUB > 3UTGOCcrcIICPVRW/CLbOss5W4xe8BTxPslRXZfckSuMMgHHiD3rGC302gZgfvsb > mS6fcDzTOboJ1da1xoczpJYVCwvC9aWAPEjEDa6jyI331pDAXABurmjzIQqjowDw > 1cWX5swt9MeSn0yOa/a2EYQP8Xj+n0RQlSIutEDR5jktlK3yyHX8LAtZd0tAPgrd > hr9RGwO09Hwcn7ke4B9SwHF7Zg/mBrHgdTdaufW+kjPleZ479lyMO8r/LsWbehVo > usQ5wefnmnzhDhOoxff8aKUo8D+Ne8gqxI4BR5EOAdHfQ2uUPpBA91pJ0cNbzZI= > =E0XH > -----END PGP SIGNATURE----- > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pgmdevlist at gmail.com Wed Mar 17 17:20:27 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 17 Mar 2010 17:20:27 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA0F0B5.4050608@gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA0F0B5.4050608@gmail.com> Message-ID: <489CB638-7D4A-4DD8-BE22-CE4FC77B1F41@gmail.com> On Mar 17, 2010, at 11:09 AM, Bruce Southey wrote: > On 03/17/2010 01:07 AM, Pierre GM wrote: >> All, >> As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. >> > Perhaps naive question, what is really being tested here? > > That is, it appears that you are testing both the generation of the > invalid values and function. So if the generation fails, then the > function will also fail. However, the test for the generation of invalid > values should be elsewhere so you have to assume that the generation of > values will work correctly. That's not really the point here. The issue is that when numpy ufuncs are called on a MaskedArray, a warning or an exception is raised when an invalid is met. With the numpy.ma version of those functions, the error is trapped and processed. 
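Schematically, the trapping done by the numpy.ma versions is nothing more than
this around each call (a simplified sketch, not the actual numpy.ma code):

import numpy as np

data = np.array([-1., 0., 1.])
saved = np.seterr(divide='ignore', invalid='ignore')
try:
    result = np.sqrt(data)    # no warning; the invalid entry just comes out as nan
finally:
    np.seterr(**saved)        # restore whatever the user had set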
Of course, using the numpy.ma version of the ufuncs is the right way to go > I think that you should be only testing that the specific function > passes the test. Why not just use 'invalid' values like np.inf directly? > > For example, in numpy/ma/tests/test_core.py > We have this test: > def test_fix_invalid(self): > "Checks fix_invalid." > data = masked_array(np.sqrt([-1., 0., 1.]), mask=[0, 0, 1]) > data_fixed = fix_invalid(data) > > If that is to test that fix_invalid Why not create the data array as: > data = masked_array([np.inf, 0., 1.]), mask=[0, 0, 1]) Sure, that's nicer. But once again, that's not really the core of the issue. From pgmdevlist at gmail.com Wed Mar 17 17:25:51 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 17 Mar 2010 17:25:51 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> Message-ID: <0B1546E3-08CD-4B82-967C-FD8AA0F4B375@gmail.com> On Mar 17, 2010, at 3:18 PM, josef.pktd at gmail.com wrote: > On Wed, Mar 17, 2010 at 3:12 PM, Christopher Barker > wrote: >> >> One of the things I liked about MATLAB was that NaNs were well handled >> almost all the time. Given all the limitations of NaN, having a masked >> array is a better way to go, but I'd love it if they were "just there", >> and therefore EVERY numpy function and package built on numpy would >> handle them gracefully. I had thought that there would be a significant >> performance penalty, and thus there would be a boatload of "if_mask" >> code all over the place, but maybe not. > > many function are defined differently for missing values, in stats, > regression or time series analysis with the assumption of equally > spaced time periods always needs to use special methods to handle > missing values. I think Christopher was referring to ufuncs, not necessarily to more complicated functions (like in stats or such). From peridot.faceted at gmail.com Thu Mar 18 11:26:09 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 18 Mar 2010 11:26:09 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <201003181457.25785.faltet@pytables.org> References: <201003181457.25785.faltet@pytables.org> Message-ID: On 18 March 2010 09:57, Francesc Alted wrote: > Hi, > > Konrad Hinsen has just told me that my article "Why Modern CPUs Are Starving > and What Can Be Done About It", which has just released on the March/April > issue of "Computing in Science and Engineering", also made into this month's > free-access selection on IEEE's ComputingNow portal: Speak for your own CPUs :). But seriously, congratulations on the wide publication of the article; it's an important issue we often don't think enough about. I'm just a little snarky because this exact issue came up for us recently - a visiting astro speaker put it as "flops are free" - and so I did some tests and found that even without optimizing for memory access, our tasks are already CPU-bound: http://lighthouseinthesky.blogspot.com/2010/03/flops.html In terms of specifics, I was a little surprised you didn't mention FFTW among your software tools that optimize memory access. FFTW's planning scheme seems ideal for ensuring memory locality, as much as possible, during large FFTs. 
(And in fact I also found that for really large FFTs, reducing padding - memory size - at the cost of a non-power-of-two size was also worth it.) > http://www.computer.org/portal/web/computingnow > http://www.computer.org/portal/web/computingnow/0310/whatsnew/cise > > On it, I discuss one of my favourite subjects, memory access, and why it is > important for nowadays computing. ?There are also recommendations for people > wanting to squeeze the maximum of performance out of their computers. ?And, > although I tried to be as language-agnostic as I could, there can be seen some > Python references here and there :-). Heh. Indeed numexpr is a good tool for this sort of thing; it's an unfortunate fact that simple use of numpy tends to do operations in the pessimal order... Ane > Well, sorry about this semi-OT but I could not resist :-) > > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From martin.raspaud at smhi.se Thu Mar 18 11:32:22 2010 From: martin.raspaud at smhi.se (Martin Raspaud) Date: Thu, 18 Mar 2010 16:32:22 +0100 Subject: [Numpy-discussion] Int bitsize in python and c In-Reply-To: <4BA237E3.2080508@student.matnat.uio.no> References: <4BA22BAA.5050402@smhi.se> <4BA237E3.2080508@student.matnat.uio.no> Message-ID: <4BA24786.8000506@smhi.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Dag Sverre Seljebotn skrev: > Martin Raspaud wrote: > Hello, > > I work on a 64bit machine with 64bits enable fedora on it. > > I just discovered that numpy.int on the python part are 64bits ints, while > npy_int in the C api are 32bits ints. > > >> np.intc Thanks ! But I'm also wondering about the C side. How can I make sure my conversion to PyArray goes well ? Should I use PyArray_LONG ? or is this also platform dependent ? Regards, Martin > I can live with it, but it seems to be different on 32bit machines, hence I > wonder what is the right way to do when retrieving an array from python to C. > > Here is what I use now: > data_pyarray = (PyArrayObject *)PyArray_ContiguousFromObject(data_list, > PyArray_INT, 1, 2); > > but that implies that I send np.int32 arrays to the C part. > > Should I use longs instead ? > > Regards, > Martin _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJLokeGAAoJEBdvyODiyJI4tv0H/R+sUTIfvH2JejX85XHUGtOw mUwDkAD2ePj9qNc62RJJQg75B58useoe8zHSNj2NFG7U5fhFYviIQImugJ4dAESo z4WgrIRQE29apyCScRid8lXBHDL8kvJtpI+G/uFQ62jxSzheIDpEEzKUlDMYZeIA MLFhMWqzHVmuNEUVyUdPt+ryL6T23t/uzQEtpt+QKt5U2Vj/dYerRRjilMA5bvLn r+xGIFQqKOLW8MzwJ+zo0nkea7JvfDBbSZdU2oS5yuQ8rdXR/v9OeSfzTI5fuI7C erUTkEozZ+5jShZbmJbihJcjjw61KleF7QfiySOahqW/E/gKYWRSX5doB4rF35A= =MDuv -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... 
Name: martin_raspaud.vcf Type: text/x-vcard Size: 260 bytes Desc: not available URL: From robert.kern at gmail.com Thu Mar 18 11:39:16 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Mar 2010 10:39:16 -0500 Subject: [Numpy-discussion] Int bitsize in python and c In-Reply-To: <4BA22BAA.5050402@smhi.se> References: <4BA22BAA.5050402@smhi.se> Message-ID: <3d375d731003180839rcd6061by8ee51dfa97d0a816@mail.gmail.com> On Thu, Mar 18, 2010 at 08:33, Martin Raspaud wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello, > > I work on a 64bit machine with 64bits enable fedora on it. > > I just discovered that numpy.int on the python part are 64bits ints, while > npy_int in the C api are 32bits ints. Note that np.int is just Python's builtin int type (there only for historical reasons). It corresponds to a C long. npy_int corresponds to a C int. > I can live with it, but it seems to be different on 32bit machines, hence I > wonder what is the right way to do when retrieving an array from python to C. > > Here is what I use now: > data_pyarray = (PyArrayObject *)PyArray_ContiguousFromObject(data_list, > PyArray_INT, 1, 2); > > but that implies that I send np.int32 arrays to the C part. > > Should I use longs instead ? Not necessarily; C longs are the cause of your problem. On some platforms they are 64-bit, some they are 32-bit. Technically speaking, C ints can vary from platform to platform, but they are typically 32-bits on all modern platforms running numpy. Of course, numpy defaults to using a C long for its integer arrays, just as Python does for its int type, so perhaps using a C long would work best for you. It's platform dependent, but it matches the platform dependent changes in numpy. It depends on what your needs are. If you need a consistent size (perhaps you are writing bytes out to a file), then always use the int32 or int64 specific types. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Thu Mar 18 11:46:25 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 18 Mar 2010 11:46:25 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Mar 17, 2010, at 5:43 PM, Charles R Harris wrote: > > > > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM wrote: > > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: > >> > >> I started thinking about a third method called __input_prepare__ that > >> would be called on the way into the ufunc, which would allow you to > >> intercept the input and pass a somehow modified copy back to the > >> ufunc. The total flow would be: > >> > >> 1) Call myufunc(x, y[, z]) > >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', > >> y' (or simply passes through x,y by default) > >> 3) myufunc creates the output array z (if not specified) and calls > >> ?.__array_prepare__(z, (myufunc, x, y, ...)) > >> 4) myufunc finally gets around to performing the calculation > >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns > >> the result to the caller > >> > >> Is this general enough for your use case? 
I haven't tried to think > >> about how to change some global state at one point and change it back > >> at another, that seems like a bad idea and difficult to support. > > > > > > Sounds like a good plan. If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. > > I think it is better to keep them separate, so we don't have one > method that is trying to do too much. It would be easier to explain in > the documentation. > > I may not have much time to look into this until after Monday. Is > there a deadline we need to consider? > > > I don't think this should go into 2.0, I think it needs more thought. And 2.0 already has significant code churn. Is there any reason beyond a big hassle not to set/restore the error state around all the ufunc calls in ma? Should be done with r8295. Please let me know whether I missed one. Otherwise, I agree with Chuck. Let's take some time to figure something. It'd be significant a change that it shouldn't go in 2.0 From bsouthey at gmail.com Thu Mar 18 12:03:38 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 18 Mar 2010 11:03:38 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <489CB638-7D4A-4DD8-BE22-CE4FC77B1F41@gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA0F0B5.4050608@gmail.com> <489CB638-7D4A-4DD8-BE22-CE4FC77B1F41@gmail.com> Message-ID: <4BA24EDA.6090605@gmail.com> On 03/17/2010 04:20 PM, Pierre GM wrote: > On Mar 17, 2010, at 11:09 AM, Bruce Southey wrote: > >> On 03/17/2010 01:07 AM, Pierre GM wrote: >> >>> All, >>> As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. >>> >>> >> Perhaps naive question, what is really being tested here? >> >> That is, it appears that you are testing both the generation of the >> invalid values and function. So if the generation fails, then the >> function will also fail. However, the test for the generation of invalid >> values should be elsewhere so you have to assume that the generation of >> values will work correctly. >> > That's not really the point here. The issue is that when numpy ufuncs are called on a MaskedArray, a warning or an exception is raised when an invalid is met. With the numpy.ma version of those functions, the error is trapped and processed. Of course, using the numpy.ma version of the ufuncs is the right way to go > > > >> I think that you should be only testing that the specific function >> passes the test. Why not just use 'invalid' values like np.inf directly? >> >> For example, in numpy/ma/tests/test_core.py >> We have this test: >> def test_fix_invalid(self): >> "Checks fix_invalid." >> data = masked_array(np.sqrt([-1., 0., 1.]), mask=[0, 0, 1]) >> data_fixed = fix_invalid(data) >> >> If that is to test that fix_invalid Why not create the data array as: >> data = masked_array([np.inf, 0., 1.]), mask=[0, 0, 1]) >> > Sure, that's nicer. But once again, that's not really the core of the issue. 
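(The core of the issue -- the error-state trapping in the numpy.ma versions of the ufuncs -- amounts to something like the following sketch. This is only an illustration of the idea: the helper name is made up, and the actual machinery in numpy.ma.core is more involved.

import numpy as np

def masked_sqrt(x):
    x = np.ma.asarray(x)
    saved = np.seterr(divide='ignore', invalid='ignore')  # silence the warnings
    try:
        raw = np.sqrt(x.data)           # the raw ufunc on the underlying data
    finally:
        np.seterr(**saved)              # restore whatever the caller had set
    bad = ~np.isfinite(raw)             # out-of-domain results get masked
    return np.ma.masked_array(raw, mask=np.ma.getmaskarray(x) | bad)

So calling the np.ma version neither leaks warnings nor changes the global error state the caller sees, which is exactly what the plain numpy ufunc cannot do on its own.)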
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I needed to point out that your statement was not completely correct: 'These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray...'. There are valid warnings for some of the tests because these are do not operate on masked arrays. As in the above example, the masked array only occurs *after* the square root has been taken and hence the warning. So some of the warnings in the tests should be eliminated just by changing the test input. Furthermore, this warning is technically valid for non-masked values: >>> np.sqrt(np.ma.array([-1], mask=[0])) Warning: invalid value encountered in sqrt masked_array(data = [--], mask = [ True], fill_value = 1e+20) In contrast not warning is emitted with ma function: >>> np.ma.sqrt(np.ma.array([-1], mask=[0])) masked_array(data = [--], mask = [ True], fill_value = 1e+20) But I fully agree that there is a problem when the value is masked because it should have ignored the operation: >>> np.sqrt(np.ma.array([-1], mask=[1])) Warning: invalid value encountered in sqrt masked_array(data = [--], mask = [ True], fill_value = 1e+20) Here I understand your view is that if the operation is on a masked array then all 'invalid' operations like square root then these should be silently converted to masked values. Bruce From pgmdevlist at gmail.com Thu Mar 18 12:07:14 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 18 Mar 2010 11:07:14 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: <6A5C99FB-5030-47FE-A982-FE21A1764156@gmail.com> On Mar 17, 2010, at 9:16 PM, Charles R Harris wrote: > > > On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris > wrote: > > > > What bothers me here is the opposing desire to separate ufuncs from their > > ndarray dependency, having them operate on buffer objects instead. As I see > > it ufuncs would be split into layers, with a lower layer operating on buffer > > objects, and an upper layer tying them together with ndarrays where the > > "business" logic -- kinds, casting, etc -- resides. It is in that upper > > layer that what you are proposing would reside. Mind, I'm not sure that > > having matrices and masked arrays subclassing ndarray was the way to go, but > > given that they do one possible solution is to dump the whole mess onto the > > subtype with the highest priority. That subtype would then be responsible > > for casts and all the other stuff needed for the call and wrapping the > > result. There could be library routines to help with that. It seems to me > > that that would be the most general way to go. In that sense ndarrays > > themselves would just be another subtype with especially low priority. > > I'm sorry, I didn't understand your point. What you described sounds > identical to how things are currently done. What distinction are you > making, aside from operating on the buffer object? How would adding a > method to modify the input to a ufunc complicate the situation? > > > Just *one* function to rule them all and on the subtype dump it. No __array_wrap__, __input_prepare__, or __array_prepare__, just something like __handle_ufunc__. So it is similar but perhaps more radical. 
I'm proposing having the ufunc upper layer do nothing but decide which argument type will do all the rest of the work, casting, calling the low level ufunc base, providing buffers, wrapping, etc. Instead of pasting bits and pieces into the existing framework I would like to lay out a line of attack that ends up separating ufuncs into smaller pieces that provide low level routines that work on strided memory while leaving policy implementation to the subtype. There would need to be some default type (ndarray) when the functions are called on nested lists and scalars and I'm not sure of the best way to handle that. > > I'm just sort of thinking out loud, don't take it too seriously. Still, I like it. It sounds far cleaner than the current Harlequin's costume approach. In the thinking out loud department: * the upper layer should allow the user to modify the input on the fly (a current limitation of __array_prepare__), so that we can change values that we know will give invalid results before the lower layer processes them. * It'd be nice to have the domains defined in the functions. Right now, the domains are defined in numpy.ma.core for unary and binary functions. Maybe we could extend the 'context' of each ufunc ? From pgmdevlist at gmail.com Thu Mar 18 12:09:34 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 18 Mar 2010 11:09:34 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA24EDA.6090605@gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA0F0B5.4050608@gmail.com> <489CB638-7D4A-4DD8-BE22-CE4FC77B1F41@gmail.com> <4BA24EDA.6090605@gmail.com> Message-ID: <7EBB89C0-EBC6-4EEE-8FEA-FE291E3C419B@gmail.com> On Mar 18, 2010, at 11:03 AM, Bruce Southey wrote: > On 03/17/2010 04:20 PM, Pierre GM wrote: >> On Mar 17, 2010, at 11:09 AM, Bruce Southey wrote: >> >>> On 03/17/2010 01:07 AM, Pierre GM wrote: >>> >>>> All, >>>> As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. >>>> >>>> >>> Perhaps naive question, what is really being tested here? >>> >>> That is, it appears that you are testing both the generation of the >>> invalid values and function. So if the generation fails, then the >>> function will also fail. However, the test for the generation of invalid >>> values should be elsewhere so you have to assume that the generation of >>> values will work correctly. >>> >> That's not really the point here. The issue is that when numpy ufuncs are called on a MaskedArray, a warning or an exception is raised when an invalid is met. With the numpy.ma version of those functions, the error is trapped and processed. Of course, using the numpy.ma version of the ufuncs is the right way to go >> >> >> >>> I think that you should be only testing that the specific function >>> passes the test. Why not just use 'invalid' values like np.inf directly? >>> >>> For example, in numpy/ma/tests/test_core.py >>> We have this test: >>> def test_fix_invalid(self): >>> "Checks fix_invalid." 
>>> data = masked_array(np.sqrt([-1., 0., 1.]), mask=[0, 0, 1]) >>> data_fixed = fix_invalid(data) >>> >>> If that is to test that fix_invalid Why not create the data array as: >>> data = masked_array([np.inf, 0., 1.]), mask=[0, 0, 1]) >>> >> Sure, that's nicer. But once again, that's not really the core of the issue. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > I needed to point out that your statement was not completely correct: > 'These warnings are only issued when a standard numpy ufunc (eg., > np.sqrt) is called on a MaskedArray...'. > > There are valid warnings for some of the tests because these are do not > operate on masked arrays. As in the above example, the masked array only > occurs *after* the square root has been taken and hence the warning. So > some of the warnings in the tests should be eliminated just by changing > the test input. Dang, I knew I was forgetting something... OK, I'll work on that. But anyway, I agree with you, some of the tests are not particularly well thought out... From gokhansever at gmail.com Thu Mar 18 13:20:52 2010 From: gokhansever at gmail.com (Gökhan Sever) Date: Thu, 18 Mar 2010 12:20:52 -0500 Subject: [Numpy-discussion] ask.scipy.org is online Message-ID: <49d6b3501003181020x23f3236bp23f4664ed8d2aba2@mail.gmail.com> Hello, Please tolerate my impatience in being the first to announce the new discussion platform :) and my cross-posting over the lists. The new site is live at ask.scipy.org . David Warde-Farley is moving the questions from the old site at advice.mechanicalkern.com (announced at SciPy09 by Robert Kern). We welcome you to join the discussions there. I have kicked off the new questions chain with http://ask.scipy.org/en/topic/11-what-is-your-favorite-numpy-feature (also cross-posted at http://stackoverflow.com/questions/2471872/what-is-your-favorite-numpy-feature to pull more attention to ask.scipy). Please share your questions and comments, and have fun with your discussions. -- Gökhan From faltet at pytables.org Thu Mar 18 13:53:00 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 18 Mar 2010 18:53:00 +0100 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> Message-ID: <201003181853.00829.faltet@pytables.org> A Thursday 18 March 2010 16:26:09 Anne Archibald escrigué: > Speak for your own CPUs :). > > But seriously, congratulations on the wide publication of the article; > it's an important issue we often don't think enough about. 
I'm just a > little snarky because this exact issue came up for us recently - a > visiting astro speaker put it as "flops are free" - and so I did some > tests and found that even without optimizing for memory access, our > tasks are already CPU-bound: > http://lighthouseinthesky.blogspot.com/2010/03/flops.html Well, I thought that my introduction was enough to convince anybody about the problem, but forgot that you, the scientists, always try to demonstrate things experimentally :-/ Seriously, your example is a clear example of what I'm recommending in the article, i.e. always try to use libraries that are already leverage the blocking technique (that is, taking advantage of both temporal and spatial locality). Don't know about FFTW (never used it, sorry), but after having a look at its home page, I'm pretty convinced that its authors are very conscious about these techniques. Being said this, it seems that, in addition, you are applying the blocking technique yourself also: get the data in bunches (256 floating point elements, which fits perfectly well on modern L1 caches), apply your computation (in this case, FFTW) and put the result back in memory. A perfect example of what I wanted to show to the readers so, congratulations! you made it without the need to read my article (so perhaps the article was not so necessary after all :-) > In terms of specifics, I was a little surprised you didn't mention > FFTW among your software tools that optimize memory access. FFTW's > planning scheme seems ideal for ensuring memory locality, as much as > possible, during large FFTs. (And in fact I also found that for really > large FFTs, reducing padding - memory size - at the cost of a > non-power-of-two size was also worth it.) I must say that I'm quite na?ve in many existing great tools for scientific computing. What I know, is that when I need to do something I always look for good existing tools first. So this is basically why I spoke about numexpr and BLAS/LAPACK: I know them well. > Heh. Indeed numexpr is a good tool for this sort of thing; it's an > unfortunate fact that simple use of numpy tends to do operations in > the pessimal order... Well, to honor the truth, NumPy does not have control in the order of the operations in expressions and how temporaries are managed: it is Python who decides that. NumPy only can do what Python wants it to do, and do it as good as possible. And NumPy plays its role reasonably well here, but of course, this is not enough for providing performance. In fact, this problem probably affects to all interpreted languages out there, unless they implement a JIT compiler optimised for evaluating expressions --and this is basically what numexpr is. Anyway, thanks for constructive criticism, I really appreciate it! -- Francesc Alted From rmay31 at gmail.com Thu Mar 18 14:15:54 2010 From: rmay31 at gmail.com (Ryan May) Date: Thu, 18 Mar 2010 13:15:54 -0500 Subject: [Numpy-discussion] numpy.gradient() does not return ndarray subclasses Message-ID: Hi, Can I get someone to look at: http://projects.scipy.org/numpy/ticket/1435 Basically, numpy.gradient() uses numpy.zeros() to create an output array. This breaks the use of any ndarray subclasses, like masked arrays, since the function will only return ndarrays. I've attached a patch that fixes that problem and has a simple test checking output types. With this patch, I can use gradient on masked arrays and get appropriately masked output. 
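To make the symptom concrete, a quick session along these lines shows what the ticket is about (just a sketch -- the exact output depends on your numpy version):

import numpy as np

x = np.ma.array([1., 2., 4., 7.], mask=[0, 1, 0, 0])
g = np.gradient(x)
print type(x)   # numpy.ma.core.MaskedArray
print type(g)   # plain numpy.ndarray -- the subclass (and the mask) is lost

The patch just allocates the output with the same class as the input instead of a bare np.zeros(), so a masked input comes back masked.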
If we could, it'd be nice to get this in for 2.0 so that I (and my coworkers who found the bug) don't have to use a custom patched gradient until the next release after that. Thanks, Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- A non-text attachment was scrubbed... Name: fix_gradient_with_subclasses.diff Type: application/octet-stream Size: 1200 bytes Desc: not available URL: From Chris.Barker at noaa.gov Thu Mar 18 15:12:10 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 18 Mar 2010 12:12:10 -0700 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> Message-ID: <4BA27B0A.7010305@noaa.gov> josef.pktd at gmail.com wrote: > On Wed, Mar 17, 2010 at 3:12 PM, Christopher Barker >> Given all the limitations of NaN, having a masked >> array is a better way to go, but I'd love it if they were "just there", >> and therefore EVERY numpy function and package built on numpy would >> handle them gracefully. > many function are defined differently for missing values, in stats, > regression or time series analysis with the assumption of equally > spaced time periods always needs to use special methods to handle > missing values. sure -- that's kind of my point -- if EVERY numpy array were (potentially) masked, then folks would write code to deal with them appropriately. > Plus, you have to operate on two arrays and keep both in memory. So > the penalty is pretty high even in C. Only if there is actually a mask, which might make this pretty ugly -- lots of "if mask" code branching. If a given routine either didn't make sense with missing values, or was simply too costly with them, it could certainly raise an exception if it got an array with a non-null mask. >> Anyway, just a fantasy, but C-level ufuncs that support masks would be >> great. -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gael.varoquaux at normalesup.org Thu Mar 18 15:19:11 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 18 Mar 2010 20:19:11 +0100 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA27B0A.7010305@noaa.gov> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> Message-ID: <20100318191910.GA16916@phare.normalesup.org> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: > sure -- that's kind of my point -- if EVERY numpy array were > (potentially) masked, then folks would write code to deal with them > appropriately. That's pretty much saying: "I have a complicated problem and I want every one else to have to deal with the full complexity of it, even if they have a simple problem". In my experience, such choice doesn't fair well, unless it is inside a controled codebase, and someone working on that codebase is ready to spend heaps of time working on that specific issue. 
Ga?l From josef.pktd at gmail.com Thu Mar 18 15:32:58 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Mar 2010 15:32:58 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <20100318191910.GA16916@phare.normalesup.org> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> Message-ID: <1cd32cbb1003181232kec0bba1q7dfd6fde842235c9@mail.gmail.com> On Thu, Mar 18, 2010 at 3:19 PM, Gael Varoquaux wrote: > On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >> sure -- that's kind of my point -- if EVERY numpy array were >> (potentially) masked, then folks would write code to deal with them >> appropriately. > > That's pretty much saying: "I have a complicated problem and I want every > one else to have to deal with the full complexity of it, even if they > have a simple problem". In my experience, such choice doesn't fair well, > unless it is inside a controled codebase, and someone working on that > codebase is ready to spend heaps of time working on that specific issue. If the mask doesn't get quietly added during an operation, we would need to keep out the masked arrays at the front door. I worry about speed penalties for pure number crunching, although, if nomask=True gives a fast path, then it might not be too much of a problem. ufuncs are simple enough, but for reduce operations and other more complicated things (linalg, fft) the user would need to control how missing values are supposed to be handled, which still requires special treatment and "if mask" all over the place. Josef > > Ga?l > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Thu Mar 18 15:46:21 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 18 Mar 2010 12:46:21 -0700 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <20100318191910.GA16916@phare.normalesup.org> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> Message-ID: <4BA2830D.5070209@noaa.gov> Gael Varoquaux wrote: > On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >> sure -- that's kind of my point -- if EVERY numpy array were >> (potentially) masked, then folks would write code to deal with them >> appropriately. > > That's pretty much saying: "I have a complicated problem and I want every > one else to have to deal with the full complexity of it, even if they > have a simple problem". Well -- I did say it was a fantasy... But I disagree -- having invalid data is a very common case. What we have now is a situation where we have two parallel systems, masked arrays and regular arrays. Each time someone does something new with masked arrays, they often find another missing feature, and have to solve that. Also, the fact that masked arrays are tacked on means that performance suffers. Maybe it would simply be too ugly, but If I were to start from the ground up with a scientific computing package, I would want to put in support for missing values from that start. 
There are some cases where is it simply too complicated or to expensive to handle missing values -- fine, then an exception is raised. You may be right about how complicated it would be, and what would happen is that everyone would simply put a: if a.masked: raise ("I can't deal with masked dat") stanza at the top of every new method they wrote, but I suspect that if the core infrastructure was in place, it would be used. I'm facing this at the moment: not a big deal, but I'm using histogram2d on some data. I just realized that it may have some NaNs in it, and I have no idea how those are being handled. I also want to move to masked arrays and have no idea if histogram2d can deal with those. At the least, I need to do some testing, and I suspect I'll need to do some hacking on histogram2d (or just write my own). I'm sure I'm not the only one in the world that needs to histogram some data that may have invalid values -- so wouldn't it be nice if that were already handled in a defined way? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rmay31 at gmail.com Thu Mar 18 15:53:44 2010 From: rmay31 at gmail.com (Ryan May) Date: Thu, 18 Mar 2010 14:53:44 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA2830D.5070209@noaa.gov> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> Message-ID: On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker wrote: > Gael Varoquaux wrote: >> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >>> sure -- that's kind of my point -- if EVERY numpy array were >>> (potentially) masked, then folks would write code to deal with them >>> appropriately. >> >> That's pretty much saying: "I have a complicated problem and I want every >> one else to have to deal with the full complexity of it, even if they >> have a simple problem". > > Well -- I did say it was a fantasy... > > But I disagree -- having invalid data is a very common case. What we > have now is a situation where we have two parallel systems, masked > arrays and regular arrays. Each time someone does something new with > masked arrays, they often find another missing feature, and have to > solve that. Also, the fact that masked arrays are tacked on means that > performance suffers. Case in point, I just found a bug in np.gradient where it forces the output to be an ndarray. (http://projects.scipy.org/numpy/ticket/1435). Easy fix that doesn't actually require any special casing for masked arrays, just making sure to use the proper function to create a new array of the same subclass as the input. However, now for any place that I can't patch I have to use a custom function until a fixed numpy is released. Maybe universal support for masked arrays (and masking invalid points) is a pipe dream, but every function in numpy should IMO deal properly with subclasses of ndarray. 
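For what it's worth, the fix is usually a one-liner: prefer asanyarray/empty_like over asarray/zeros when building the output. A rough sketch of the pattern (a toy centered-difference with a made-up name, not the actual gradient patch):

import numpy as np

def centered_diff(f):
    f = np.asanyarray(f)       # keep the subclass (MaskedArray, matrix, ...)
    out = np.empty_like(f)     # output gets the same class as the input
    out[1:-1] = (f[2:] - f[:-2]) / 2.0
    out[0] = f[1] - f[0]
    out[-1] = f[-1] - f[-2]
    return out

x = np.ma.array([1., 2., 4., 7.], mask=[0, 0, 1, 0])
print type(centered_diff(x))   # a MaskedArray comes back as a MaskedArray

No special casing for masked arrays is needed; the masked entries just propagate through the arithmetic.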
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From josef.pktd at gmail.com Thu Mar 18 16:15:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Mar 2010 16:15:42 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA2830D.5070209@noaa.gov> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> Message-ID: <1cd32cbb1003181315v6865d721m8c154ca7a7bf2a9d@mail.gmail.com> On Thu, Mar 18, 2010 at 3:46 PM, Christopher Barker wrote: > Gael Varoquaux wrote: >> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >>> sure -- that's kind of my point -- if EVERY numpy array were >>> (potentially) masked, then folks would write code to deal with them >>> appropriately. >> >> That's pretty much saying: "I have a complicated problem and I want every >> one else to have to deal with the full complexity of it, even if they >> have a simple problem". > > Well -- I did say it was a fantasy... > > But I disagree -- having invalid data is a very common case. What we > have now is a situation where we have two parallel systems, masked > arrays and regular arrays. Each time someone does something new with > masked arrays, they often find another missing feature, and have to > solve that. Also, the fact that masked arrays are tacked on means that > performance suffers. > > Maybe it would simply be too ugly, but If I were to start from the > ground up with a scientific computing package, I would want to put in > support for missing values from that start. > > There are some cases where is it simply too complicated or to expensive > to handle missing values -- fine, then an exception is raised. > > You may be right about how complicated it would be, and what would > happen is that everyone would simply put a: > > if a.masked: > ? ?raise ("I can't deal with masked dat") > > stanza at the top of every new method they wrote, but I suspect that if > the core infrastructure was in place, it would be used. > > I'm facing this at the moment: not a big deal, but I'm using histogram2d > on some data. I just realized that it may have some NaNs in it, and I > have no idea how those are being handled. I also want to move to masked > arrays and have no idea if histogram2d can deal with those. At the > least, I need to do some testing, and I suspect I'll need to do some > hacking on histogram2d (or just write my own). > > I'm sure I'm not the only one in the world that needs to histogram some > data that may have invalid values -- so wouldn't it be nice if that were > already handled in a defined way? histogram2d handles neither masked arrays nor arrays with nans correctly, but assuming you want to drop all columns that have at least one missing value, then it is just one small step. Unless you want to replace the missing value with the mean, or a conditional prediction, or by interpolation. This could be included in the histogram function. >>> x = np.ma.array([[1,2, 3],[2,1,1]], mask=[[0, 1,0], [0,0,0]]) >>> np.histogram2d(x[0],x[1],bins=3) (array([[ 0., 0., 1.], [ 1., 0., 0.], [ 1., 0., 0.]]), array([ 1. , 1.66666667, 2.33333333, 3. ]), array([ 1. , 1.33333333, 1.66666667, 2. 
])) >>> x2=x[:,~x.mask.any(0)] >>> np.histogram2d(x2[0],x2[1],bins=3) (array([[ 0., 0., 1.], [ 0., 0., 0.], [ 1., 0., 0.]]), array([ 1. , 1.66666667, 2.33333333, 3. ]), array([ 1. , 1.33333333, 1.66666667, 2. ])) >>> x = np.array([[1.,np.nan, 3],[2,1,1]]) >>> x array([[ 1., NaN, 3.], [ 2., 1., 1.]]) >>> np.histogram2d(x[0],x[1],bins=3) (array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]]), array([ NaN, NaN, NaN, NaN]), array([ 1. , 1.33333333, 1.66666667, 2. ])) >>> x2=x[:,np.isfinite(x).all(0)] >>> np.histogram2d(x2[0],x2[1],bins=3) (array([[ 0., 0., 1.], [ 0., 0., 0.], [ 1., 0., 0.]]), array([ 1. , 1.66666667, 2.33333333, 3. ]), array([ 1. , 1.33333333, 1.66666667, 2. ])) >>> Josef > -Chris > > > > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From efiring at hawaii.edu Thu Mar 18 17:12:55 2010 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 18 Mar 2010 11:12:55 -1000 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> Message-ID: <4BA29757.9010502@hawaii.edu> Ryan May wrote: > On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker > wrote: >> Gael Varoquaux wrote: >>> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >>>> sure -- that's kind of my point -- if EVERY numpy array were >>>> (potentially) masked, then folks would write code to deal with them >>>> appropriately. >>> That's pretty much saying: "I have a complicated problem and I want every >>> one else to have to deal with the full complexity of it, even if they >>> have a simple problem". >> Well -- I did say it was a fantasy... >> >> But I disagree -- having invalid data is a very common case. What we >> have now is a situation where we have two parallel systems, masked >> arrays and regular arrays. Each time someone does something new with >> masked arrays, they often find another missing feature, and have to >> solve that. Also, the fact that masked arrays are tacked on means that >> performance suffers. > > Case in point, I just found a bug in np.gradient where it forces the > output to be an ndarray. > (http://projects.scipy.org/numpy/ticket/1435). Easy fix that doesn't > actually require any special casing for masked arrays, just making > sure to use the proper function to create a new array of the same > subclass as the input. However, now for any place that I can't patch > I have to use a custom function until a fixed numpy is released. > > Maybe universal support for masked arrays (and masking invalid points) > is a pipe dream, but every function in numpy should IMO deal properly > with subclasses of ndarray. 1) This can't be done in general because subclasses can change things to the point where there is little one can count on. The matrix subclass, for example, redefines multiplication and iteration, making it difficult to write functions that will work for ndarrays or matrices. 
2) There is a lot that can be done to improve the handling of masked arrays, and I still believe that much of it should be done at the C level, where it can be done with speed and simplicity. Unfortunately, figuring out how to do it well, and implementing it well, will require a lot of intensive work. I suspect it won't get done unless we can figure out how to get a qualified person dedicated to it. Eric > > Ryan > From davide.cittaro at ifom-ieo-campus.it Thu Mar 18 17:19:13 2010 From: davide.cittaro at ifom-ieo-campus.it (Davide Cittaro) Date: Thu, 18 Mar 2010 22:19:13 +0100 Subject: [Numpy-discussion] peak finding approach Message-ID: <4EA51F48-273B-40A7-905C-47E0564C10C2@ifom-ieo-campus.it> Hi all, Is there a fast numpy way to find the peak boundaries in a (looong, millions of points) smoothed signal? I've found some approaches, like this: z = data[1:-1] l = data[:-2] r = data[2:] f = np.greater(z, l) f *= np.greater(z, r) boundaries = np.nonzero(f) but it is too sensitive... it detects any small variations in slope on the shoulders of a bigger peak... Any hint? thanks d From friedrichromstedt at gmail.com Thu Mar 18 17:31:24 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 18 Mar 2010 22:31:24 +0100 Subject: [Numpy-discussion] evaluating a function in a matrix element??? In-Reply-To: <4A2645B8-A650-42E5-B021-CDE54CA39791@horow.net> References: <4A2645B8-A650-42E5-B021-CDE54CA39791@horow.net> Message-ID: 2010/3/18 Frank Horowitz : > I'm working on a piece of optimisation code where it turns out to be mathematically convenient to have a matrix where a few pre-chosen elements must be computed at evaluation time for the dot product (i.e. matrix multiplication) of a matrix with a vector. The following *might* be helpful: >>> class X: ... def __mul__(self, other): ... return numpy.random.random() * other ... def __rmul__(self, other): ... return other * numpy.random.random() Instances of this class calculate the product in-time: >>> x = X() >>> x * 1 0.222103712078775 >>> 1 * x 0.044647569053175573 How to use it: >>> a = numpy.asarray([[X(), X()], [0, 1]]) >>> a array([[<__main__.X instance at 0x00AABAA8>, <__main__.X instance at 0x00E76530>], [0, 1]], dtype=object) The first row is purely random, the second purely deterministic: >>> numpy.dot(a, [1, 2]) array([1.60154958363, 2], dtype=object) >>> numpy.dot(a, [1, 2]) array([2.06294335235, 2], dtype=object) You can convert back to dtype = numpy.float by result.astype(numpy.float): >>> numpy.dot(a, [1, 2]).astype(numpy.float) array([ 1.33217562, 2. ]) Friedrich From dsdale24 at gmail.com Thu Mar 18 17:42:53 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 18 Mar 2010 17:42:53 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA29757.9010502@hawaii.edu> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> <4BA29757.9010502@hawaii.edu> Message-ID: On Thu, Mar 18, 2010 at 5:12 PM, Eric Firing wrote: > Ryan May wrote: >> On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker >> wrote: >>> Gael Varoquaux wrote: >>>> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >>>>> sure -- that's kind of my point -- if EVERY numpy array were >>>>> (potentially) masked, then folks would write code to deal with them >>>>> appropriately. 
>>>> That's pretty much saying: "I have a complicated problem and I want every >>>> one else to have to deal with the full complexity of it, even if they >>>> have a simple problem". >>> Well -- I did say it was a fantasy... >>> >>> But I disagree -- having invalid data is a very common case. What we >>> have now is a situation where we have two parallel systems, masked >>> arrays and regular arrays. Each time someone does something new with >>> masked arrays, they often find another missing feature, and have to >>> solve that. Also, the fact that masked arrays are tacked on means that >>> performance suffers. >> >> Case in point, I just found a bug in np.gradient where it forces the >> output to be an ndarray. >> (http://projects.scipy.org/numpy/ticket/1435). ?Easy fix that doesn't >> actually require any special casing for masked arrays, just making >> sure to use the proper function to create a new array of the same >> subclass as the input. ?However, now for any place that I can't patch >> I have to use a custom function until a fixed numpy is released. >> >> Maybe universal support for masked arrays (and masking invalid points) >> is a pipe dream, but every function in numpy should IMO deal properly >> with subclasses of ndarray. > > 1) This can't be done in general because subclasses can change things to > the point where there is little one can count on. ?The matrix subclass, > for example, redefines multiplication and iteration, making it difficult > to write functions that will work for ndarrays or matrices. I'm more optimistic that it can be done in general, if we provide a mechanism where the subclass with highest priority can customize the execution of the function (ufunc or not). In principle, the subclass could even override the buffer operation, like in the case of matrices. It still can put a lot of responsibility on the authors of the subclass, but what is gained is a framework where np.add (for example) could yield the appropriate result for any subclass, as opposed to the current situation of needing to know which add function can be used for a particular type of input. All speculative, of course. I'll start throwing some examples together when I get a chance. Darren From josef.pktd at gmail.com Thu Mar 18 18:06:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Mar 2010 18:06:31 -0400 Subject: [Numpy-discussion] peak finding approach In-Reply-To: <4EA51F48-273B-40A7-905C-47E0564C10C2@ifom-ieo-campus.it> References: <4EA51F48-273B-40A7-905C-47E0564C10C2@ifom-ieo-campus.it> Message-ID: <1cd32cbb1003181506t4dc45a06j710fc35382bdb836@mail.gmail.com> On Thu, Mar 18, 2010 at 5:19 PM, Davide Cittaro wrote: > Hi all, > Is there a fast numpy way to find the peak boundaries in a (looong, millions of points) smoothed signal? I've found some approaches, like this: > > z = data[1:-1] > l = data[:-2] > r = data[2:] > f = np.greater(z, l) > f *= np.greater(z, r) > boundaries = np.nonzero(f) > > but it is too sensitive... it detects any small variations in slope on the shoulders of a bigger peak... > Any hint? 
to find peaks something like the following works np.nonzero(data == maximumfilter data, larger window size) where maximum filter is in scipy.signal or ndimage I'm not sure about boundaries, maybe the same with minimum filter Josef > > thanks > > d > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Thu Mar 18 19:26:04 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 18 Mar 2010 16:26:04 -0700 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <1cd32cbb1003181315v6865d721m8c154ca7a7bf2a9d@mail.gmail.com> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> <1cd32cbb1003181315v6865d721m8c154ca7a7bf2a9d@mail.gmail.com> Message-ID: <4BA2B68C.2040000@noaa.gov> josef.pktd at gmail.com wrote: >> I'm facing this at the moment: not a big deal, but I'm using histogram2d >> on some data. I just realized that it may have some NaNs in it, and I >> have no idea how those are being handled. > histogram2d handles neither masked arrays nor arrays with nans > correctly, I really wasn't asking for help (yet) .. but thanks! >>>> x2=x[:,np.isfinite(x).all(0)] >>>> np.histogram2d(x2[0],x2[1],bins=3) > (array([[ 0., 0., 1.], > [ 0., 0., 0.], > [ 1., 0., 0.]]), array([ 1. , 1.66666667, > 2.33333333, 3. ]), array([ 1. , 1.33333333, > 1.66666667, 2. ])) I'll probably do something like that for now. I guess the question is -- should this be built in to histogram2d (and other similar functions)? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Thu Mar 18 20:32:38 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Mar 2010 20:32:38 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA2B68C.2040000@noaa.gov> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> <1cd32cbb1003181315v6865d721m8c154ca7a7bf2a9d@mail.gmail.com> <4BA2B68C.2040000@noaa.gov> Message-ID: <1cd32cbb1003181732l7f86dbd0m3f5b4f3e1a70dbe9@mail.gmail.com> On Thu, Mar 18, 2010 at 7:26 PM, Christopher Barker wrote: > josef.pktd at gmail.com wrote: >>> I'm facing this at the moment: not a big deal, but I'm using histogram2d >>> on some data. I just realized that it may have some NaNs in it, and I >>> have no idea how those are being handled. > >> histogram2d handles neither masked arrays nor arrays with nans >> correctly, > > I really wasn't asking for help (yet) .. but thanks! > >>>>> x2=x[:,np.isfinite(x).all(0)] >>>>> np.histogram2d(x2[0],x2[1],bins=3) >> (array([[ 0., ?0., ?1.], >> ? ? ? ?[ 0., ?0., ?0.], >> ? ? ? ?[ 1., ?0., ?0.]]), array([ 1. ? ? ? ?, ?1.66666667, >> 2.33333333, ?3. ? ? ? ?]), array([ 1. ? ? ? ?, ?1.33333333, >> 1.66666667, ?2. ? ? ? ?])) > > I'll probably do something like that for now. 
I guess the question is -- > should this be built in to histogram2d (and other similar functions)? I think yes, for all functions that are closer to actual data and where there is an obvious way to handle the missing values. But, it's work and adds a lot of code to a nice simple function. And if it's just one extra line for the user, than it is not too high on my priority. For example, I rewrote stats.zscore a while ago to handle also matrices and masked arrays, and Bruce rewrote geometric mean and others, but these are easy cases, for many of the other functions it's more work. Also. if a function gets too much overhead, I end up rewriting and inlining the core of the function over and over again when I need it inside a loop, for example for optimization, or I keep a copy of the function around that doesn't use the overhead. I actually do little profiling, so I don't really know what the cost would be in a loop. Josef > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From frank at horow.net Fri Mar 19 03:55:06 2010 From: frank at horow.net (Frank Horowitz) Date: Fri, 19 Mar 2010 15:55:06 +0800 Subject: [Numpy-discussion] evaluating a function in a matrix element??? Message-ID: > >>> > class X: > > ... def __mul__(self, other): > ... return numpy.random.random() * other > ... def __rmul__(self, other): > ... return other * numpy.random.random() Thanks for that Friedrich! I had forgotten about __mul__, __rmul__, __float__ et al. I think that using a class method instead of a function and overriding those __*__ methods is the cleaner way to do the task I need. Cheers, Frank Horowitz frank at horow.net From sebastian.walter at gmail.com Fri Mar 19 04:35:39 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Fri, 19 Mar 2010 09:35:39 +0100 Subject: [Numpy-discussion] evaluating a function in a matrix element??? In-Reply-To: <4A2645B8-A650-42E5-B021-CDE54CA39791@horow.net> References: <4A2645B8-A650-42E5-B021-CDE54CA39791@horow.net> Message-ID: On Thu, Mar 18, 2010 at 7:01 AM, Frank Horowitz wrote: > Dear All, > > I'm working on a piece of optimisation code where it turns out to be mathematically convenient to have a matrix where a few pre-chosen elements must be computed at evaluation time for the dot product (i.e. matrix multiplication) of a matrix with a vector. > > As I see the problem, there are two basic approaches to accomplishing this. Why don't you just override dot to compute those array elements and then internally call numpy.dot? def dot(x,y): update some elements of x return numpy.dot(x,y) > > First (and perhaps conceptually simplest, not to mention apparently faster) might be to stash appropriate functions at their corresponding locations in the matrix (with the rest of the matrix being constants, as usual). I mucked around with this for a little while in iPython, and it appears that having dtype == object_ works for stashing the references to the functions, but fails to allow me to actually evaluate the function(s) when the matrix is used in the dot product. > > Does anyone have any experience with making such a beast work within numpy? If so, how? 
Could you elaborate on why you think that an array of objects is faster than an array of floats? > > The second basic approach is to build a ufunc that implements the switching logic, and returns the constants and evaluated functions in the appropriate locations. ?This seems to be the more do-able approach, but it requires the ufunc to be aware of both the position of each element (via index, or somesuch) as well as the arguments to the functions themselves being evaluated at the matrix elements. It appears that frompyfunc() nearly does what I want, but I am currently failing to see how to actually *use* it for anything more elaborate than the octal example code (i.e. one value in, and one value out). Does anyone have any other more elaborate examples they can point me towards? > I didn't know that dot is a ufunc. According to http://docs.scipy.org/doc/numpy/reference/ufuncs.html a ufunc is a function that operates element wise. Sebastian > Thanks in advance for any help you might be able to provide! > > ? ? ? ?Frank Horowitz > ? ? ? ?frank at horow.net > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dsdale24 at gmail.com Fri Mar 19 09:37:12 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Fri, 19 Mar 2010 09:37:12 -0400 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <5B46F515-C265-4F9A-B9F2-2752D6836452@gmail.com> Message-ID: On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris wrote: > On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale wrote: >> On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris >> > What bothers me here is the opposing desire to separate ufuncs from >> > their >> > ndarray dependency, having them operate on buffer objects instead. As I >> > see >> > it ufuncs would be split into layers, with a lower layer operating on >> > buffer >> > objects, and an upper layer tying them together with ndarrays where the >> > "business" logic -- kinds, casting, etc -- resides. It is in that upper >> > layer that what you are proposing would reside. Mind, I'm not sure that >> > having matrices and masked arrays subclassing ndarray was the way to go, >> > but >> > given that they do one possible solution is to dump the whole mess onto >> > the >> > subtype with the highest priority. That subtype would then be >> > responsible >> > for casts and all the other stuff needed for the call and wrapping the >> > result. There could be library routines to help with that. It seems to >> > me >> > that that would be the most general way to go. In that sense ndarrays >> > themselves would just be another subtype with especially low priority. >> >> I'm sorry, I didn't understand your point. What you described sounds >> identical to how things are currently done. What distinction are you >> making, aside from operating on the buffer object? How would adding a >> method to modify the input to a ufunc complicate the situation? >> > > Just *one* function to rule them all and on the subtype dump it. No > __array_wrap__, __input_prepare__, or __array_prepare__, just something like > __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing > having the ufunc upper layer do nothing but decide which argument type will > do all the rest of the work, casting, calling the low level ufunc base, > providing buffers, wrapping, etc. 
Instead of pasting bits and pieces into > the existing framework I would like to lay out a line of attack that ends up > separating ufuncs into smaller pieces that provide low level routines that > work on strided memory while leaving policy implementation to the subtype. > There would need to be some default type (ndarray) when the functions are > called on nested lists and scalars and I'm not sure of the best way to > handle that. > > I'm just sort of thinking out loud, don't take it too seriously. This is a seemingly simplified approach. I was taken with it last night but then I remembered that it will make subclassing difficult. A simple example can illustrate the problem. We have MaskedArray, which needs to customize some functions that operate on arrays or buffers, so we pass the function and the arguments to __handle_ufunc__ and it takes care of the whole shebang. But now I develop a MaskedQuantity that takes masked arrays and gives them the ability to handle units, and so it needs to customize those functions further. Maybe MaskedQuantity can modify the input passed to its __handle_ufunc__ and then pass everything on to super().__handle_ufunc__, such that MaskedQuantity does not have to reimplement MaskedArray's customizations to that particular function, but that is not enough flexibility for the general case. If a my subclass needs to call the low-level ufunc base, it can't rely on the superclass.__handle_ufunc__ because it *also* calls the ufunc base, so my subclass has to reimplement all of the superclass function customizations. The current scheme (__input_prepare__, ...) is better able to handle subclassing, although I agree that it could be improved. If the subclasses were responsible for calling the ufunc base, alternative bases could be provided (like the c routines for masked arrays). That still seems to require the high-level function to provide three or four entry points: 1) modify the input, 2) initialize the output (chance to deal with metadata), 3) call the function base, 4) finalize the output (deal with metadata that requires the ufunc results). Perhaps 2 and 4 would not both be needed, I'm not sure. Darren From gberbeglia at gmail.com Fri Mar 19 10:53:56 2010 From: gberbeglia at gmail.com (gerardob) Date: Fri, 19 Mar 2010 07:53:56 -0700 (PDT) Subject: [Numpy-discussion] lists of zeros and ones Message-ID: <27950978.post@talk.nabble.com> Hello, i would like to produce lists of lists 1's and 0's. For example, to produce the list composed of: L = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]] I just need to do the following: n=4 numpy.eye(n,dtype=int).tolist() I would like to know a simple way to generate a list containing all the lists having two 1's at each element. Example, n = 4 L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] Any ideas? Thanks. -- View this message in context: http://old.nabble.com/lists-of-zeros-and-ones-tp27950978p27950978.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From jkington at wisc.edu Fri Mar 19 11:17:41 2010 From: jkington at wisc.edu (Joe Kington) Date: Fri, 19 Mar 2010 10:17:41 -0500 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: <27950978.post@talk.nabble.com> References: <27950978.post@talk.nabble.com> Message-ID: See itertools.permutations (python standard library) e.g. 
In [3]: list(itertools.permutations([1,1,0,0])) Out[3]: [(1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 0, 0, 1), (1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 0, 0, 1), (0, 1, 1, 0), (0, 1, 0, 1), (0, 1, 1, 0), (0, 1, 0, 1), (0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 1, 0), (0, 1, 0, 1), (0, 1, 1, 0), (0, 1, 0, 1), (0, 0, 1, 1), (0, 0, 1, 1)] Hope that helps, -Joe On Fri, Mar 19, 2010 at 9:53 AM, gerardob wrote: > > Hello, i would like to produce lists of lists 1's and 0's. > > For example, to produce the list composed of: > > L = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]] > > I just need to do the following: > > n=4 > numpy.eye(n,dtype=int).tolist() > > I would like to know a simple way to generate a list containing all the > lists having two 1's at each element. > > Example, n = 4 > L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] > > Any ideas? > Thanks. > > > > > > > > -- > View this message in context: > http://old.nabble.com/lists-of-zeros-and-ones-tp27950978p27950978.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Mar 19 11:17:58 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 19 Mar 2010 08:17:58 -0700 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: <27950978.post@talk.nabble.com> References: <27950978.post@talk.nabble.com> Message-ID: On Fri, Mar 19, 2010 at 7:53 AM, gerardob wrote: > > Hello, i would like to produce lists of lists 1's and 0's. > > For example, to produce the list composed of: > > L = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]] > > I just need to do the following: > > n=4 > numpy.eye(n,dtype=int).tolist() > > I would like to know a simple way to generate a list containing all the > lists having two 1's at each element. > > Example, n = 4 > L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] > > Any ideas? > Thanks. Here's the brute force way: >> for i in range(4): ....: for j in range(i+1, 4): ....: x = np.zeros(4) ....: x[i] = 1 ....: x[j] = 1 ....: print x ....: ....: [ 1. 1. 0. 0.] [ 1. 0. 1. 0.] [ 1. 0. 0. 1.] [ 0. 1. 1. 0.] [ 0. 1. 0. 1.] [ 0. 0. 1. 1.] From kwgoodman at gmail.com Fri Mar 19 11:21:17 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 19 Mar 2010 08:21:17 -0700 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: References: <27950978.post@talk.nabble.com> Message-ID: On Fri, Mar 19, 2010 at 8:17 AM, Joe Kington wrote: > See itertools.permutations (python standard library) > e.g. > In [3]: list(itertools.permutations([1,1,0,0])) > Out[3]: > [(1, 1, 0, 0), > ?(1, 1, 0, 0), > ?(1, 0, 1, 0), > ?(1, 0, 0, 1), > ?(1, 0, 1, 0), > ?(1, 0, 0, 1), > ?(1, 1, 0, 0), > ?(1, 1, 0, 0), > ?(1, 0, 1, 0), > ?(1, 0, 0, 1), > ?(1, 0, 1, 0), > ?(1, 0, 0, 1), > ?(0, 1, 1, 0), > ?(0, 1, 0, 1), > ?(0, 1, 1, 0), > ?(0, 1, 0, 1), > ?(0, 0, 1, 1), > ?(0, 0, 1, 1), > ?(0, 1, 1, 0), > ?(0, 1, 0, 1), > ?(0, 1, 1, 0), > ?(0, 1, 0, 1), > > > > ?(0, 0, 1, 1), > ?(0, 0, 1, 1)] > Hope that helps, > -Joe That treats each 1 as distinct. 
set() solves that: >> list(set(itertools.permutations([1,1,0,0]))) [(1, 0, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 0, 1), (0, 1, 1, 0), (0, 1, 0, 1)] From jkington at wisc.edu Fri Mar 19 11:21:19 2010 From: jkington at wisc.edu (Joe Kington) Date: Fri, 19 Mar 2010 10:21:19 -0500 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: <2797_1269011879_ZZg0U6cLjKDc1.00_f6951adf1003190817w3feaeea2va6c30e834318a1f3@mail.gmail.com> References: <27950978.post@talk.nabble.com> <2797_1269011879_ZZg0U6cLjKDc1.00_f6951adf1003190817w3feaeea2va6c30e834318a1f3@mail.gmail.com> Message-ID: I just realized that permutations isn't quite what you want, as swapping the first "1" for the second "1" gives the same thing. You can use set to get the unique permutations. e.g. In [4]: set(itertools.permutations([1,1,0,0])) Out[4]: set([(0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0)]) On Fri, Mar 19, 2010 at 10:17 AM, Joe Kington wrote: > See itertools.permutations (python standard library) > > e.g. > In [3]: list(itertools.permutations([1,1,0,0])) > Out[3]: > [(1, 1, 0, 0), > (1, 1, 0, 0), > (1, 0, 1, 0), > (1, 0, 0, 1), > (1, 0, 1, 0), > (1, 0, 0, 1), > (1, 1, 0, 0), > (1, 1, 0, 0), > (1, 0, 1, 0), > (1, 0, 0, 1), > (1, 0, 1, 0), > (1, 0, 0, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > (0, 0, 1, 1), > (0, 0, 1, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > > > > (0, 0, 1, 1), > (0, 0, 1, 1)] > > Hope that helps, > -Joe > > On Fri, Mar 19, 2010 at 9:53 AM, gerardob wrote: > >> >> Hello, i would like to produce lists of lists 1's and 0's. >> >> For example, to produce the list composed of: >> >> L = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]] >> >> I just need to do the following: >> >> n=4 >> numpy.eye(n,dtype=int).tolist() >> >> I would like to know a simple way to generate a list containing all the >> lists having two 1's at each element. >> >> Example, n = 4 >> L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] >> >> Any ideas? >> Thanks. >> >> >> >> >> >> >> >> -- >> View this message in context: >> http://old.nabble.com/lists-of-zeros-and-ones-tp27950978p27950978.html >> Sent from the Numpy-discussion mailing list archive at Nabble.com. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Fri Mar 19 11:21:56 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Fri, 19 Mar 2010 10:21:56 -0500 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: References: <27950978.post@talk.nabble.com> Message-ID: <49d6b3501003190821xe1e1224qf432281f7b33f0e2@mail.gmail.com> On Fri, Mar 19, 2010 at 10:17 AM, Joe Kington wrote: > See itertools.permutations (python standard library) > > e.g. 
> In [3]: list(itertools.permutations([1,1,0,0])) > Out[3]: > [(1, 1, 0, 0), > (1, 1, 0, 0), > (1, 0, 1, 0), > (1, 0, 0, 1), > (1, 0, 1, 0), > (1, 0, 0, 1), > (1, 1, 0, 0), > (1, 1, 0, 0), > (1, 0, 1, 0), > (1, 0, 0, 1), > (1, 0, 1, 0), > (1, 0, 0, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > (0, 0, 1, 1), > (0, 0, 1, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > (0, 1, 1, 0), > (0, 1, 0, 1), > > > > (0, 0, 1, 1), > (0, 0, 1, 1)] > > Hope that helps, > -Joe > > If you use "set" you automatically eliminate replicates: a = set(permutations([0,0,1,1])) a set([(0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0)]) Converting back to a list: b = list(a) -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 19 11:24:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Mar 2010 11:24:31 -0400 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: References: <27950978.post@talk.nabble.com> Message-ID: <1cd32cbb1003190824j68293ccxf0cdb123924b2a43@mail.gmail.com> On Fri, Mar 19, 2010 at 11:17 AM, Keith Goodman wrote: > On Fri, Mar 19, 2010 at 7:53 AM, gerardob wrote: >> >> Hello, i would like to produce lists of lists 1's and 0's. >> >> For example, to produce the list composed of: >> >> L = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]] >> >> I just need to do the following: >> >> n=4 >> numpy.eye(n,dtype=int).tolist() >> >> I would like to know a simple way to generate a list containing all the >> lists having two 1's at each element. >> >> Example, n = 4 >> L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] >> >> Any ideas? >> Thanks. > > Here's the brute force way: > >>> for i in range(4): > ? ....: ? ? for j in range(i+1, 4): > ? ....: ? ? ? ? x = np.zeros(4) > ? ....: ? ? ? ? x[i] = 1 > ? ....: ? ? ? ? x[j] = 1 > ? ....: ? ? ? ? print x > ? ....: > ? ....: > [ 1. ?1. ?0. ?0.] > [ 1. ?0. ?1. ?0.] > [ 1. ?0. ?0. ?1.] > [ 0. ?1. ?1. ?0.] > [ 0. ?1. ?0. ?1.] > [ 0. ?0. ?1. ?1.] here are two numpy version, but they don't take any shortcuts >>> a= np.array(list(np.ndindex(*2*np.ones(4)))) >>> a array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1], [0, 1, 0, 0], [0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]]) >>> a[a.sum(1)==2] array([[0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 1, 0, 0]]) >>> [list(r) for r in np.ndindex(*2*np.ones(4)) if sum(r)==2] [[0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 1, 0, 0]] Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Mar 19 11:40:28 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Mar 2010 11:40:28 -0400 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: <1cd32cbb1003190824j68293ccxf0cdb123924b2a43@mail.gmail.com> References: <27950978.post@talk.nabble.com> <1cd32cbb1003190824j68293ccxf0cdb123924b2a43@mail.gmail.com> Message-ID: <1cd32cbb1003190840w279f1578v39acf8a3f7465cbf@mail.gmail.com> On Fri, Mar 19, 2010 at 11:24 AM, wrote: > On Fri, Mar 19, 2010 at 11:17 AM, Keith Goodman wrote: >> On Fri, Mar 19, 2010 at 7:53 AM, gerardob wrote: >>> >>> Hello, i would like to produce lists of lists 1's and 0's. 
>>> >>> For example, to produce the list composed of: >>> >>> L = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]] >>> >>> I just need to do the following: >>> >>> n=4 >>> numpy.eye(n,dtype=int).tolist() >>> >>> I would like to know a simple way to generate a list containing all the >>> lists having two 1's at each element. >>> >>> Example, n = 4 >>> L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] >>> >>> Any ideas? >>> Thanks. >> >> Here's the brute force way: >> >>>> for i in range(4): >> ? ....: ? ? for j in range(i+1, 4): >> ? ....: ? ? ? ? x = np.zeros(4) >> ? ....: ? ? ? ? x[i] = 1 >> ? ....: ? ? ? ? x[j] = 1 >> ? ....: ? ? ? ? print x >> ? ....: >> ? ....: >> [ 1. ?1. ?0. ?0.] >> [ 1. ?0. ?1. ?0.] >> [ 1. ?0. ?0. ?1.] >> [ 0. ?1. ?1. ?0.] >> [ 0. ?1. ?0. ?1.] >> [ 0. ?0. ?1. ?1.] > > here are two numpy version, but they don't take any shortcuts > > >>>> a= np.array(list(np.ndindex(*2*np.ones(4)))) >>>> a > array([[0, 0, 0, 0], > ? ? ? [0, 0, 0, 1], > ? ? ? [0, 0, 1, 0], > ? ? ? [0, 0, 1, 1], > ? ? ? [0, 1, 0, 0], > ? ? ? [0, 1, 0, 1], > ? ? ? [0, 1, 1, 0], > ? ? ? [0, 1, 1, 1], > ? ? ? [1, 0, 0, 0], > ? ? ? [1, 0, 0, 1], > ? ? ? [1, 0, 1, 0], > ? ? ? [1, 0, 1, 1], > ? ? ? [1, 1, 0, 0], > ? ? ? [1, 1, 0, 1], > ? ? ? [1, 1, 1, 0], > ? ? ? [1, 1, 1, 1]]) >>>> a[a.sum(1)==2] > array([[0, 0, 1, 1], > ? ? ? [0, 1, 0, 1], > ? ? ? [0, 1, 1, 0], > ? ? ? [1, 0, 0, 1], > ? ? ? [1, 0, 1, 0], > ? ? ? [1, 1, 0, 0]]) > > >>>> [list(r) for r in np.ndindex(*2*np.ones(4)) if sum(r)==2] > [[0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 1, 0], > [1, 1, 0, 0]] just for fun >>> ndim=5;n=3;nel=2;len([list(r) for r in np.ndindex(*n*np.ones(ndim)) if (np.array(r)!=0).sum()==nel]) 40 Josef > > Josef > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From aisaac at american.edu Fri Mar 19 12:05:47 2010 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 19 Mar 2010 12:05:47 -0400 Subject: [Numpy-discussion] lists of zeros and ones In-Reply-To: <27950978.post@talk.nabble.com> References: <27950978.post@talk.nabble.com> Message-ID: <4BA3A0DB.1080300@american.edu> On 3/19/2010 10:53 AM, gerardob wrote: > I would like to know a simple way to generate a list containing all the > lists having two 1's at each element. > > Example, n = 4 > L2 = [[1,1,0,0],[1,0,1,0],[1,0,0,1],[0,1,1,0],[0,1,0,1],[0,0,1,1]] > I like list(set(itertools.permutations([1,1]+[0]*(n-2)))) (Modify if you don't like the tuples.) But you can use a NumPy array if you wish: cols = list(itertools.combinations(range(n),2)) r = len(cols) rows = np.arange(r)[:,None] a = np.zeros((r,n),dtype=np.int_) a[rows,cols] = 1 a.tolist() hth, Alan Isaac From aisaac at american.edu Fri Mar 19 12:38:58 2010 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 19 Mar 2010 12:38:58 -0400 Subject: [Numpy-discussion] np.resize differs from ndarray.resize In-Reply-To: References: <4BA0E263.7090100@american.edu> <1cd32cbb1003170716n623c6852i9654d43e4eb219a3@mail.gmail.com> <4BA0EA53.10607@american.edu> Message-ID: <4BA3A8A2.1040204@american.edu> On 3/18/2010 4:56 AM, Sebastian Haase wrote: > How would people feel about unifying the function vs. the method behavior ? > One could add an addition option like > `repeat` or `fillZero`. > One could (at first !?) keep opposite defaults to not change the > current behavior. > But this way it would be most visible and clear what is going on. 
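For concreteness, here is a minimal sketch of the divergence being discussed (the sample array is invented for illustration; the behaviour shown is that of the current releases, where the function repeats data while the method zero-fills):

import numpy as np

a = np.array([1, 2, 3])

# The np.resize *function* returns a new array and repeats the data
# to fill out the requested shape.
print(np.resize(a, (5,)))        # [1 2 3 1 2]

# The ndarray.resize *method* works in place and pads with zeros instead.
b = np.array([1, 2, 3])
b.resize((5,), refcheck=False)   # refcheck=False sidesteps the reference-count guard
print(b)                         # [1 2 3 0 0]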
The current situation is confusing. I therefore hope that one of two things will happen.

- unification (probably with a `fill` option, with a default of None that produces the function's behavior but that can be given any numerical value (not just 0)), or
- eliminate the method

Thanks, Alan

From peridot.faceted at gmail.com Fri Mar 19 13:13:33 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 19 Mar 2010 13:13:33 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <201003181853.00829.faltet@pytables.org> References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> Message-ID:

On 18 March 2010 13:53, Francesc Alted wrote: > A Thursday 18 March 2010 16:26:09 Anne Archibald escrigué: >> Speak for your own CPUs :). >> >> But seriously, congratulations on the wide publication of the article; >> it's an important issue we often don't think enough about. I'm just a >> little snarky because this exact issue came up for us recently - a >> visiting astro speaker put it as "flops are free" - and so I did some >> tests and found that even without optimizing for memory access, our >> tasks are already CPU-bound: >> http://lighthouseinthesky.blogspot.com/2010/03/flops.html > > Well, I thought that my introduction was enough to convince anybody about the > problem, but forgot that you, the scientists, always try to demonstrate things > experimentally :-/

Snrk. Well, technically, that is our job description...

> Seriously, your example is a clear example of what I'm recommending in the > article, i.e. always try to use libraries that already leverage the > blocking technique (that is, taking advantage of both temporal and spatial > locality). Don't know about FFTW (never used it, sorry), but after having a > look at its home page, I'm pretty convinced that its authors are very > conscious about these techniques. > That being said, it seems that, in addition, you are applying the blocking > technique yourself also: get the data in bunches (256 floating point elements, > which fits perfectly well on modern L1 caches), apply your computation (in > this case, FFTW) and put the result back in memory. A perfect example of what > I wanted to show to the readers so, congratulations! you made it without the > need to read my article (so perhaps the article was not so necessary after all > :-)

What I didn't go into in detail in the article was that there's a trade-off of processing versus memory access available: we could reduce the memory load by a factor of eight by doing interpolation on the fly instead of all at once in a giant FFT. But that would cost cache space and flops, and we're not memory-dominated.

One thing I didn't try, and should: running four of these jobs at once on a four-core machine. If I correctly understand the architecture, that won't affect the cache issues, but it will effectively quadruple the memory bandwidth needed, without increasing the memory bandwidth available. (Which, honestly, makes me wonder what the point is of building multicore machines.)

Maybe I should look into that interpolation stuff.

>> Heh. Indeed numexpr is a good tool for this sort of thing; it's an >> unfortunate fact that simple use of numpy tends to do operations in >> the pessimal order... > > Well, to honor the truth, NumPy does not have control over the order of the > operations in expressions and how temporaries are managed: it is Python who > decides that.
?NumPy only can do what Python wants it to do, and do it as good > as possible. ?And NumPy plays its role reasonably well here, but of course, > this is not enough for providing performance. ?In fact, this problem probably > affects to all interpreted languages out there, unless they implement a JIT > compiler optimised for evaluating expressions --and this is basically what > numexpr is. I'm not knocking numpy; it does (almost) the best it can. (I'm not sure of the optimality of the order in which ufuncs are executed; I think some optimizations there are possible.) But a language designed from scratch for vector calculations could certainly compile expressions into a form that would save a lot of memory accesses, particularly if an optimizer combined many lines of code. I've actually thought about whether such a thing could be done in python; I think the way to do it would be to build expression objects from variable objects, then have a single "apply" function that fed values in to all the variables. The same framework would support automatic differentiation and other cool things, but I'm not sure it would be useful enough to be worth the implementation complexity. Anne From pgmdevlist at gmail.com Fri Mar 19 13:58:58 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 19 Mar 2010 12:58:58 -0500 Subject: [Numpy-discussion] Warnings in numpy.ma.test() In-Reply-To: <4BA29757.9010502@hawaii.edu> References: <99C00E43-2842-4588-91D7-DE955D492206@gmail.com> <4BA11F45.9010201@hawaii.edu> <4BA12992.5040607@noaa.gov> <1cd32cbb1003171218u327e031fxadb420acbf8458b9@mail.gmail.com> <4BA27B0A.7010305@noaa.gov> <20100318191910.GA16916@phare.normalesup.org> <4BA2830D.5070209@noaa.gov> <4BA29757.9010502@hawaii.edu> Message-ID: On Mar 18, 2010, at 4:12 PM, Eric Firing wrote: > Ryan May wrote: >> On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker >> wrote: >>> Gael Varoquaux wrote: >>>> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: >>>>> sure -- that's kind of my point -- if EVERY numpy array were >>>>> (potentially) masked, then folks would write code to deal with them >>>>> appropriately. >>>> That's pretty much saying: "I have a complicated problem and I want every >>>> one else to have to deal with the full complexity of it, even if they >>>> have a simple problem". >>> Well -- I did say it was a fantasy... >>> >>> But I disagree -- having invalid data is a very common case. What we >>> have now is a situation where we have two parallel systems, masked >>> arrays and regular arrays. Each time someone does something new with >>> masked arrays, they often find another missing feature, and have to >>> solve that. Also, the fact that masked arrays are tacked on means that >>> performance suffers. Please keep in mind that MaskedArrays were always provided for convenience, that's all. If you need performance, you must implement a solution adapted to your problem (dropping missing values, filling them with some kind of interpolation...) and just use standard ndarrays. Anyway, the plan was since the beginning to have MaskedArrays implemented in C at one point or another. A few years back I checked how to subclass ndarrays in Cython, but ran into a lot of problems. Travis O advised me to focus on MaskedArrays instead, for good reasons. Now we have something that's pretty close to a ndarray (by opposition to the implementation in numeric), that works most of the time but could be optimized. 
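As a rough illustration of that convenience-versus-speed point (the data below are invented; only standard numpy.ma calls are used, so this is a sketch rather than a recipe):

import numpy as np
import numpy.ma as ma

data = np.array([1.0, 2.0, -999.0, 4.0])

# Convenient: the mask follows the data through arithmetic and reductions.
mx = ma.masked_values(data, -999.0)
print(mx.mean())          # 2.333..., the masked value is simply ignored

# Faster for heavy number crunching: deal with the missing values once,
# then work on plain ndarrays.
valid = mx.compressed()   # 1-D ndarray holding only the unmasked values
filled = mx.filled(0.0)   # plain ndarray with masked entries replaced by 0.0
print(valid.mean())       # same result, computed on an ordinary array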
>> Case in point, I just found a bug in np.gradient where it forces the >> output to be an ndarray. >> (http://projects.scipy.org/numpy/ticket/1435). Easy fix that doesn't >> actually require any special casing for masked arrays, just making >> sure to use the proper function to create a new array of the same >> subclass as the input. However, now for any place that I can't patch >> I have to use a custom function until a fixed numpy is released. >> >> Maybe universal support for masked arrays (and masking invalid points) >> is a pipe dream, but every function in numpy should IMO deal properly >> with subclasses of ndarray. > > 1) This can't be done in general because subclasses can change things to > the point where there is little one can count on. The matrix subclass, > for example, redefines multiplication and iteration, making it difficult > to write functions that will work for ndarrays or matrices. And one can always add a function to numpy.ma.extras... > > 2) There is a lot that can be done to improve the handling of masked > arrays, and I still believe that much of it should be done at the C > level, where it can be done with speed and simplicity. Unfortunately, > figuring out how to do it well, and implementing it well, will require a > lot of intensive work. I suspect it won't get done unless we can figure > out how to get a qualified person dedicated to it. I still can't speak C, but now that I'm unemployed, I should have plenty of free time to learn... Hire me ;) From Chris.Barker at noaa.gov Fri Mar 19 15:06:28 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 19 Mar 2010 12:06:28 -0700 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> Message-ID: <4BA3CB34.4000405@noaa.gov> Anne Archibald wrote: > (Which, honestly, makes me wonder what the point is of > building multicore machines.) Advertising... Oh, and having multiple cores slows down multi-threading in Python, so that feature is worth the expense! > But a language designed > from scratch for vector calculations Ever hear of ZPL? http://www.cs.washington.edu/research/zpl/home/index.html I think it's pretty much a dead project, but I always liked the idea. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dwf at cs.toronto.edu Fri Mar 19 18:18:17 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 19 Mar 2010 18:18:17 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> Message-ID: <75633AD8-176D-452C-8A54-C43FD2BC9539@cs.toronto.edu> On 19-Mar-10, at 1:13 PM, Anne Archibald wrote: > I'm not knocking numpy; it does (almost) the best it can. (I'm not > sure of the optimality of the order in which ufuncs are executed; I > think some optimizations there are possible.) But a language designed > from scratch for vector calculations could certainly compile > expressions into a form that would save a lot of memory accesses, > particularly if an optimizer combined many lines of code. 
I've > actually thought about whether such a thing could be done in python; I > think the way to do it would be to build expression objects from > variable objects, then have a single "apply" function that fed values > in to all the variables. Hey Anne, Some folks across town from you at U de M have built just such at thing. :) http://deeplearning.net/software/theano/ It does all that, plus automatic differentiation, detection and correction of numerical instabilities, etc. Probably the most amazing thing about it is that with recent versions, you basically flip a switch and it will instead use an available CUDA- capable Nvidia GPU instead of the CPU. I'll admit, when James Bergstra initially told me about this plan to make it possible to transparently switch to running stuff on the GPU, I thought it was so ambitious that it would never happen. Then it did... David From faltet at pytables.org Sat Mar 20 06:32:24 2010 From: faltet at pytables.org (Francesc Alted) Date: Sat, 20 Mar 2010 11:32:24 +0100 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> Message-ID: <201003201132.24691.faltet@pytables.org> A Friday 19 March 2010 18:13:33 Anne Archibald escrigu?: [clip] > What I didn't go into in detail in the article was that there's a > trade-off of processing versus memory access available: we could > reduce the memory load by a factor of eight by doing interpolation on > the fly instead of all at once in a giant FFT. But that would cost > cache space and flops, and we're not memory-dominated. > > One thing I didn't try, and should: running four of these jobs at once > on a four-core machine. If I correctly understand the architecture, > that won't affect the cache issues, but it will effectively quadruple > the memory bandwidth needed, without increasing the memory bandwidth > available. (Which, honestly, makes me wonder what the point is of > building multicore machines.) > > Maybe I should look into that interpolation stuff. Please do. Although you may be increasing the data rate by 4x, your program is already very efficient in how it handles data, so chances are that you still get a good speed-up. I'd glad to hear you back on your experience. -- Francesc Alted From peridot.faceted at gmail.com Sat Mar 20 10:20:38 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 20 Mar 2010 10:20:38 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <201003201132.24691.faltet@pytables.org> References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <201003201132.24691.faltet@pytables.org> Message-ID: On 20 March 2010 06:32, Francesc Alted wrote: > A Friday 19 March 2010 18:13:33 Anne Archibald escrigu?: > [clip] >> What I didn't go into in detail in the article was that there's a >> trade-off of processing versus memory access available: we could >> reduce the memory load by a factor of eight by doing interpolation on >> the fly instead of all at once in a giant FFT. But that would cost >> cache space and flops, and we're not memory-dominated. >> >> One thing I didn't try, and should: running four of these jobs at once >> on a four-core machine. If I correctly understand the architecture, >> that won't affect the cache issues, but it will effectively quadruple >> the memory bandwidth needed, without increasing the memory bandwidth >> available. 
(Which, honestly, makes me wonder what the point is of >> building multicore machines.) >> >> Maybe I should look into that interpolation stuff. > > Please do. ?Although you may be increasing the data rate by 4x, your program > is already very efficient in how it handles data, so chances are that you > still get a good speed-up. ?I'd glad to hear you back on your experience. The thing is, it reduces the data rate from memory, but at the cost of additional FFTs (to implement convolutions). If my program is already spending all its time doing FFTs, and the loads from memory are happening while the CPU does FFTs, then there's no win in runtime from reducing the memory load, and there's a runtime cost of doing those convolutions - not just more flops but also more cache pressure (to store the interpolated array and the convolution kernels). One could go a further step and do interpolation directly, without convolution, but that adds really a lot of flops, which translates directly to runtime. On the other hand, if it doesn't completely blow out the cache, we do have non-interpolated FFTs already on disk (with red noise adjustments already applied), so we might save on the relatively minor cost of the giant FFT. I'll have to do some time trials. Anne > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Sat Mar 20 13:26:21 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 20 Mar 2010 19:26:21 +0200 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> Message-ID: <1269105981.13114.2.camel@Nokia-N900-42-11> Anne Archibald wrote: > I'm not knocking numpy; it does (almost) the best it can. (I'm not > sure of the optimality of the order in which ufuncs are executed; I > think some optimizations there are possible.) Ufuncs and reductions are not performed in a cache-optimal fashion, IIRC dimensions are always traversed in order from left to right. Large speedups are possible in some cases, but in a quick try I didn't manage to come up with an algorithm that would always improve the speed (there was a thread about this last year or so, and there's a ticket). Things varied between computers, so this probably depends a lot on the actual cache arrangement. But perhaps numexpr has such heuristics, and we could steal them? -- Pauli Virtanen From charlesr.harris at gmail.com Sat Mar 20 14:00:54 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Mar 2010 12:00:54 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. Message-ID: Example, Compute the qr factorization of a matrix. > > Factor the matrix `a` as `qr`, where `q` is orthonormal > (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) and > `r` is upper-triangular. > Arrggghhhh... Totally. Unreadable. Why not say the columns of q are orthonormal vectors and r is upper triangular? The math tutorial should go in the notes, if anywhere. Might mention that this is a 'thin' factorization (Golub). Let me propose a rule: no math markup in the summary, ever. Parameters > ---------- > a : array_like, shape (M, N) > Matrix to be factored. > mode : {'full', 'r', 'economic'} > Specifies the information to be returned. 'full' is the default. 
> mode='r' returns a "true" `r`, while 'economic' returns a > "polluted" > `r` (albeit slightly faster; see Returns below). > > Oh, come now, "true", "polluted"? Sounds a bit political... Actually, 'economic' contains info on the Householder reflections. In any case, why mention it at all, just refer to the return documentation. And wouldn't values be a better word than information? > Returns > ------- > * If mode = 'full': > > * q : ndarray of float or complex, shape (M, K) > * r : ndarray of float or complex, shape (K, N) > > Size K = min(M, N) > > * If mode = 'r': > > * r : ndarray of float or complex, shape (K, N) > > * If mode = 'economic': > > * a2 : ndarray of float or complex, shape (M, N) > > The diagonal and the upper triangle of a2 contains r, > while the rest of the matrix is undefined. > WTF? I'm seeing stars. I may be old and crotchety, and I don't mean to be mean to the folks who have done the hard work to bring the docstring to its current state, but I think things have gotten out of hand. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 20 14:15:49 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 20 Mar 2010 14:15:49 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: Message-ID: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> On Sat, Mar 20, 2010 at 2:00 PM, Charles R Harris wrote: > Example, > >> ??? Compute the qr factorization of a matrix. >> >> ??? Factor the matrix `a` as `qr`, where `q` is orthonormal >> ??? (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) >> and >> ??? `r` is upper-triangular. > > Arrggghhhh... Totally. Unreadable. Why not say the columns of q are > orthonormal vectors and r is upper triangular? The math tutorial should go > in the notes, if anywhere. Might mention that this is a 'thin' factorization > (Golub). Let me propose a rule: no math markup in the summary, ever. > >> ??? Parameters >> ??? ---------- >> ??? a : array_like, shape (M, N) >> ??????? Matrix to be factored. >> ??? mode : {'full', 'r', 'economic'} >> ??????? Specifies the information to be returned. 'full' is the default. >> ??????? mode='r' returns a "true" `r`, while 'economic' returns a >> "polluted" >> ??????? `r` (albeit slightly faster; see Returns below). >> > > Oh, come now, "true", "polluted"? Sounds a bit political... Actually, > 'economic' contains info on the Householder reflections. In any case, why > mention it at all, just refer to the return documentation. And wouldn't > values be a better word than information? > >> >> ??? Returns >> ??? ------- >> ??? * If mode = 'full': >> >> ??????? * q : ndarray of float or complex, shape (M, K) >> ??????? * r : ndarray of float or complex, shape (K, N) >> >> ????? Size K = min(M, N) >> >> ??? * If mode = 'r': >> >> ????? * r : ndarray of float or complex, shape (K, N) >> >> ??? * If mode = 'economic': >> >> ????? * a2 : ndarray of float or complex, shape (M, N) >> >> ????? The diagonal and the upper triangle of a2 contains r, >> ????? while the rest of the matrix is undefined. > > WTF? I'm seeing stars. As far as I know, stars are the only way to render a list in restructured txt, otherwise it looses the list formatting. I tried several versions, but never found a different way. (I don't remember whether the empty lines are strictly necessary in sphinx docs, but they are in rst.) 
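A quick check of that (a sketch only: it assumes docutils is installed, and the fragment is invented for illustration) suggests that plain dashes are accepted as bullets just as stars are, so it is really the surrounding blank lines that reST insists on:

from docutils.core import publish_string

fragment = """\
Returns
-------

- q : ndarray of float or complex, shape (M, K)
- r : ndarray of float or complex, shape (K, N)
"""

# Rendering the fragment to HTML yields two <li> items, i.e. the dashes
# are treated as list bullets in the same way stars would be.
html = publish_string(source=fragment, writer_name="html").decode("utf-8")
print(html.count("<li>"))   # 2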
But, the markup creates nice html and htmlhelp docs, and it can be published as pdf.

Josef

> > I may be old and crotchety, and I don't mean to be mean to the folks who > have done the hard work to bring the docstring to its current state, but I > think things have gotten out of hand. > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >

From ralf.gommers at googlemail.com Sat Mar 20 14:16:09 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 02:16:09 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: Message-ID:

On Sun, Mar 21, 2010 at 2:00 AM, Charles R Harris wrote: > Example, > > Compute the qr factorization of a matrix. >> >> Factor the matrix `a` as `qr`, where `q` is orthonormal >> (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) >> and >> `r` is upper-triangular. >> > > Arrggghhhh... Totally. Unreadable. Why not say the columns of q are > orthonormal vectors and r is upper triangular? The math tutorial should go > in the notes, if anywhere. Might mention that this is a 'thin' factorization > (Golub). Let me propose a rule: no math markup in the summary, ever. >

That's already a rule, although it could be expressed more clearly. My understanding is: LaTeX only in the Notes section, and even there very sparingly.

Ralf

From rmay31 at gmail.com Sat Mar 20 14:24:56 2010 From: rmay31 at gmail.com (Ryan May) Date: Sat, 20 Mar 2010 13:24:56 -0500 Subject: [Numpy-discussion] Array bug with character (regression from 1.4.0) Message-ID:

The following code, which works with numpy 1.4.0, results in an error:

In [1]: import numpy as np
In [2]: v = 'm'
In [3]: dt = np.dtype('>c')
In [4]: a = np.asarray(v, dt)

On 1.4.0:

In [5]: a
Out[5]: array('m', dtype='|S1')
In [6]: np.__version__
Out[6]: '1.4.0'

On SVN trunk:

/home/rmay/.local/lib/python2.6/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order) 282 283 """ --> 284 return array(a, dtype, copy=False, order=order) 285 286 def asanyarray(a, dtype=None, order=None): ValueError: assignment to 0-d array

In [5]: np.__version__
Out[5]: '2.0.0.dev8297'

Thoughts? (Filed at: http://projects.scipy.org/numpy/ticket/1436)

Ryan

-- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma

From charlesr.harris at gmail.com Sat Mar 20 14:38:08 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Mar 2010 12:38:08 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> Message-ID:

On Sat, Mar 20, 2010 at 12:15 PM, wrote: > On Sat, Mar 20, 2010 at 2:00 PM, Charles R Harris > wrote: > > Example, > > > >> Compute the qr factorization of a matrix. > >> > >> Factor the matrix `a` as `qr`, where `q` is orthonormal > >> (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) > >> and > >> `r` is upper-triangular. > > > > Arrggghhhh... Totally. Unreadable. Why not say the columns of q are > > orthonormal vectors and r is upper triangular? The math tutorial should go > > in the notes, if anywhere. Might mention that this is a 'thin' factorization > > (Golub). Let me propose a rule: no math markup in the summary, ever.
> > > >> Parameters > >> ---------- > >> a : array_like, shape (M, N) > >> Matrix to be factored. > >> mode : {'full', 'r', 'economic'} > >> Specifies the information to be returned. 'full' is the default. > >> mode='r' returns a "true" `r`, while 'economic' returns a > >> "polluted" > >> `r` (albeit slightly faster; see Returns below). > >> > > > > Oh, come now, "true", "polluted"? Sounds a bit political... Actually, > > 'economic' contains info on the Householder reflections. In any case, why > > mention it at all, just refer to the return documentation. And wouldn't > > values be a better word than information? > > > >> > >> Returns > >> ------- > >> * If mode = 'full': > >> > >> * q : ndarray of float or complex, shape (M, K) > >> * r : ndarray of float or complex, shape (K, N) > >> > >> Size K = min(M, N) > >> > >> * If mode = 'r': > >> > >> * r : ndarray of float or complex, shape (K, N) > >> > >> * If mode = 'economic': > >> > >> * a2 : ndarray of float or complex, shape (M, N) > >> > >> The diagonal and the upper triangle of a2 contains r, > >> while the rest of the matrix is undefined. > > > > WTF? I'm seeing stars. > > As far as I know, stars are the only way to render a list in > restructured txt, otherwise it looses the list formatting. I tried > several versions, but never found a different way. (I don't remember > whether the empty lines are strictly necessary in sphinx docs, but > they are in rst.) > > But they can't be read on a terminal. The Numpy docstring format was designed to save vertical space, all those stars and blank lines undoes that effort. But, the markup creates nice hml and htmlhelp docs, and it can be > published as pdf. > > If we need that, let's fix the numpy format so we can get rid of the stars. Personally, I think the html docs should be secondary, i.e., as long as we are stuck with terminals, screen readability comes first. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 20 14:52:35 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 20 Mar 2010 14:52:35 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> Message-ID: <1cd32cbb1003201152x2d31ce06y34571ce4aded6f4a@mail.gmail.com> On Sat, Mar 20, 2010 at 2:38 PM, Charles R Harris wrote: > > > On Sat, Mar 20, 2010 at 12:15 PM, wrote: >> >> On Sat, Mar 20, 2010 at 2:00 PM, Charles R Harris >> wrote: >> > Example, >> > >> >> ??? Compute the qr factorization of a matrix. >> >> >> >> ??? Factor the matrix `a` as `qr`, where `q` is orthonormal >> >> ??? (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) >> >> and >> >> ??? `r` is upper-triangular. >> > >> > Arrggghhhh... Totally. Unreadable. Why not say the columns of q are >> > orthonormal vectors and r is upper triangular? The math tutorial should >> > go >> > in the notes, if anywhere. Might mention that this is a 'thin' >> > factorization >> > (Golub). Let me propose a rule: no math markup in the summary, ever. >> > >> >> ??? Parameters >> >> ??? ---------- >> >> ??? a : array_like, shape (M, N) >> >> ??????? Matrix to be factored. >> >> ??? mode : {'full', 'r', 'economic'} >> >> ??????? Specifies the information to be returned. 'full' is the >> >> default. >> >> ??????? mode='r' returns a "true" `r`, while 'economic' returns a >> >> "polluted" >> >> ??????? `r` (albeit slightly faster; see Returns below). 
>> >> >> > >> > Oh, come now, "true", "polluted"? Sounds a bit political... Actually, >> > 'economic' contains info on the Householder reflections. In any case, >> > why >> > mention it at all, just refer to the return documentation. And wouldn't >> > values be a better word than information? >> > >> >> >> >> ??? Returns >> >> ??? ------- >> >> ??? * If mode = 'full': >> >> >> >> ??????? * q : ndarray of float or complex, shape (M, K) >> >> ??????? * r : ndarray of float or complex, shape (K, N) >> >> >> >> ????? Size K = min(M, N) >> >> >> >> ??? * If mode = 'r': >> >> >> >> ????? * r : ndarray of float or complex, shape (K, N) >> >> >> >> ??? * If mode = 'economic': >> >> >> >> ????? * a2 : ndarray of float or complex, shape (M, N) >> >> >> >> ????? The diagonal and the upper triangle of a2 contains r, >> >> ????? while the rest of the matrix is undefined. >> > >> > WTF? I'm seeing stars. >> >> As far as I know, stars are the only way to render a list in >> restructured txt, otherwise it looses the list formatting. I tried >> several versions, but never found a different way. (I don't remember >> whether the empty lines are strictly necessary in sphinx docs, but >> they are in rst.) >> > > But they can't be read on a terminal. The Numpy docstring format was > designed to save vertical space, all those stars and blank lines undoes that > effort. > >> But, the markup creates nice hml and htmlhelp docs, and it can be >> published as pdf. >> > > If we need that, let's fix the numpy format so we can get rid of the stars. > Personally, I think the html docs should be secondary, i.e., as long as we > are stuck with terminals, screen readability comes first. What's a terminal ? For most packages, I'm reading sphinx generated docs. (I know I'm an outlier compared to the "usual" users on the mailing list.) Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From dagss at student.matnat.uio.no Sat Mar 20 14:56:03 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 20 Mar 2010 19:56:03 +0100 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <1269105981.13114.2.camel@Nokia-N900-42-11> References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <1269105981.13114.2.camel@Nokia-N900-42-11> Message-ID: <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> Pauli Virtanen wrote: > Anne Archibald wrote: >> I'm not knocking numpy; it does (almost) the best it can. (I'm not >> sure of the optimality of the order in which ufuncs are executed; I >> think some optimizations there are possible.) > > Ufuncs and reductions are not performed in a cache-optimal fashion, IIRC > dimensions are always traversed in order from left to right. Large > speedups are possible in some cases, but in a quick try I didn't manage to > come up with an algorithm that would always improve the speed (there was a > thread about this last year or so, and there's a ticket). Things varied > between computers, so this probably depends a lot on the actual cache > arrangement. > > But perhaps numexpr has such heuristics, and we could steal them? At least in MultiIter (and I always assumed ufuncs too, but perhaps not) there's functionality to remove the largest dimension so that it can be put innermost in a loop. 
In many situations, removing the dimension with the smallest stride from the iterator would probably work much better. It's all about balancing iterator overhead and memory overhead. Something simple like "select the dimension with length > 200 which has smallest stride, or the dimension with largest length if none are above 200" would perhaps work well? Dag Sverre From peridot.faceted at gmail.com Sat Mar 20 15:22:57 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 20 Mar 2010 15:22:57 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <1269105981.13114.2.camel@Nokia-N900-42-11> <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> Message-ID: On 20 March 2010 14:56, Dag Sverre Seljebotn wrote: > Pauli Virtanen wrote: >> Anne Archibald wrote: >>> I'm not knocking numpy; it does (almost) the best it can. (I'm not >>> sure of the optimality of the order in which ufuncs are executed; I >>> think some optimizations there are possible.) >> >> Ufuncs and reductions are not performed in a cache-optimal fashion, IIRC >> dimensions are always traversed in order from left to right. Large >> speedups are possible in some cases, but in a quick try I didn't manage to >> come up with an algorithm that would always improve the speed (there was a >> thread about this last year or so, and there's a ticket). Things varied >> between computers, so this probably depends a lot on the actual cache >> arrangement. >> >> But perhaps numexpr has such heuristics, and we could steal them? > > At least in MultiIter (and I always assumed ufuncs too, but perhaps not) > there's functionality to remove the largest dimension so that it can be > put innermost in a loop. In many situations, removing the dimension with > the smallest stride from the iterator would probably work much better. > > It's all about balancing iterator overhead and memory overhead. Something > simple like "select the dimension with length > 200 which has smallest > stride, or the dimension with largest length if none are above 200" would > perhaps work well? There's more to it than that: there's no point (I think) going with the smallest stride if that stride is more than a cache line (usually 64 bytes). There are further optimizations - often stride[i]*shape[i]==shape[j] for some (i,j), and in that case these two dimensions can be treated as one with stride stride[i] and length shape[i]*shape[j]. Applying this would "unroll" all C- or Fortran-contiguous arrays to a single non-nested loop. Of course, reordering ufuncs is potentially hazardous in terms of breaking user code: the effect of A[1:]*=A[:-1] depends on whether you run through A in index order or reverse index order (which might be the order it's stored in memory, potentially faster from a cache point of view). Anne > > Dag Sverre > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Sat Mar 20 15:23:02 2010 From: cournape at gmail.com (David Cournapeau) Date: Sun, 21 Mar 2010 04:23:02 +0900 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. 
In-Reply-To: <1cd32cbb1003201152x2d31ce06y34571ce4aded6f4a@mail.gmail.com> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <1cd32cbb1003201152x2d31ce06y34571ce4aded6f4a@mail.gmail.com> Message-ID: <5b8d13221003201223k5e4982eel7130d1213ab0a18f@mail.gmail.com> On Sun, Mar 21, 2010 at 3:52 AM, wrote: > > What's a terminal ? ?For most packages, I'm reading sphinx generated docs. Broadly speaking, the equivalent of cmd.exe (i.e. "dos" windows) on unix. It is important to keep a good balance between readability in un-rendered (terminals) and rendered modes (pdf, html). That's the whole point of using something like rest in the first place. cheers, David From aisaac at american.edu Sat Mar 20 15:32:44 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 20 Mar 2010 15:32:44 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> Message-ID: <4BA522DC.2050100@american.edu> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: > As far as I know, stars are the only way to render a list in > restructured txt, otherwise it looses the list formatting. Try a definition list? Example below. Alan Returns ------- q, r if mode = 'full': - q : ndarray of float or complex, shape (M, K) - r : ndarray of float or complex, shape (K, N) K = min(M, N) r if mode = 'r': - r : ndarray of float or complex, shape (K, N) a2 if mode = 'economic': - a2 : ndarray of float or complex, shape (M, N) The diagonal and the upper triangle of a2 contains r, while the rest of the matrix is undefined. From charlesr.harris at gmail.com Sat Mar 20 15:39:00 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Mar 2010 13:39:00 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <5b8d13221003201223k5e4982eel7130d1213ab0a18f@mail.gmail.com> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <1cd32cbb1003201152x2d31ce06y34571ce4aded6f4a@mail.gmail.com> <5b8d13221003201223k5e4982eel7130d1213ab0a18f@mail.gmail.com> Message-ID: On Sat, Mar 20, 2010 at 1:23 PM, David Cournapeau wrote: > On Sun, Mar 21, 2010 at 3:52 AM, wrote: > > > > > What's a terminal ? For most packages, I'm reading sphinx generated > docs. > > Broadly speaking, the equivalent of cmd.exe (i.e. "dos" windows) on > unix. It is important to keep a good balance between readability in > un-rendered (terminals) and rendered modes (pdf, html). That's the > whole point of using something like rest in the first place. > > And here I thought Josef was joking ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Sat Mar 20 15:41:27 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 20 Mar 2010 12:41:27 -0700 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: Message-ID: <45d1ab481003201241i3dab91e1v1c90b1d6f30301ed@mail.gmail.com> On Sat, Mar 20, 2010 at 11:00 AM, Charles R Harris wrote: > Example, > >> ??? Compute the qr factorization of a matrix. >> >> ??? Factor the matrix `a` as `qr`, where `q` is orthonormal >> ??? (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) >> and >> ??? `r` is upper-triangular. > > Arrggghhhh... Totally. Unreadable. Why not say the columns of q are > orthonormal vectors and r is upper triangular? 
The math tutorial should go > in the notes, if anywhere. Might mention that this is a 'thin' factorization > (Golub). Let me propose a rule: no math markup in the summary, ever. > >> ??? Parameters >> ??? ---------- >> ??? a : array_like, shape (M, N) >> ??????? Matrix to be factored. >> ??? mode : {'full', 'r', 'economic'} >> ??????? Specifies the information to be returned. 'full' is the default. >> ??????? mode='r' returns a "true" `r`, while 'economic' returns a >> "polluted" >> ??????? `r` (albeit slightly faster; see Returns below). >> > > Oh, come now, "true", "polluted"? Sounds a bit political... Actually, > 'economic' contains info on the Householder reflections. In any case, why > mention it at all, just refer to the return documentation. And wouldn't > values be a better word than information? Mea culpa (at least up to this point) - I'll change it back. I'm going to stay out of the whole terminal v. higher-tech display debate: I'm having trouble seeing a win-win. DG From d.l.goldsmith at gmail.com Sat Mar 20 15:53:52 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 20 Mar 2010 12:53:52 -0700 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <45d1ab481003201241i3dab91e1v1c90b1d6f30301ed@mail.gmail.com> References: <45d1ab481003201241i3dab91e1v1c90b1d6f30301ed@mail.gmail.com> Message-ID: <45d1ab481003201253n33468332o68a4555c16469a36@mail.gmail.com> On Sat, Mar 20, 2010 at 12:41 PM, David Goldsmith wrote: > On Sat, Mar 20, 2010 at 11:00 AM, Charles R Harris > wrote: >> Example, >> >>> ??? Compute the qr factorization of a matrix. >>> >>> ??? Factor the matrix `a` as `qr`, where `q` is orthonormal >>> ??? (:math:`dot( q_{:,i}, q_{:,j}) = \delta_{ij}`, the Kronecker delta) >>> and >>> ??? `r` is upper-triangular. >> >> Arrggghhhh... Totally. Unreadable. Why not say the columns of q are >> orthonormal vectors and r is upper triangular? The math tutorial should go >> in the notes, if anywhere. Might mention that this is a 'thin' factorization >> (Golub). Let me propose a rule: no math markup in the summary, ever. >> >>> ??? Parameters >>> ??? ---------- >>> ??? a : array_like, shape (M, N) >>> ??????? Matrix to be factored. >>> ??? mode : {'full', 'r', 'economic'} >>> ??????? Specifies the information to be returned. 'full' is the default. >>> ??????? mode='r' returns a "true" `r`, while 'economic' returns a >>> "polluted" >>> ??????? `r` (albeit slightly faster; see Returns below). >>> >> >> Oh, come now, "true", "polluted"? Sounds a bit political... Actually, >> 'economic' contains info on the Householder reflections. In any case, why >> mention it at all, just refer to the return documentation. And wouldn't >> values be a better word than information? > > Mea culpa (at least up to this point) - I'll change it back. > > I'm going to stay out of the whole terminal v. higher-tech display > debate: I'm having trouble seeing a win-win. > > DG OK, reverted my changes, and "demoted" to "Being written" pending some form of resolution vis-a-vis the Returns list debate. 
DG From seb.haase at gmail.com Sat Mar 20 16:18:02 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Sat, 20 Mar 2010 21:18:02 +0100 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <1269105981.13114.2.camel@Nokia-N900-42-11> <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> Message-ID: On Sat, Mar 20, 2010 at 8:22 PM, Anne Archibald wrote: > On 20 March 2010 14:56, Dag Sverre Seljebotn > wrote: >> Pauli Virtanen wrote: >>> Anne Archibald wrote: >>>> I'm not knocking numpy; it does (almost) the best it can. (I'm not >>>> sure of the optimality of the order in which ufuncs are executed; I >>>> think some optimizations there are possible.) >>> >>> Ufuncs and reductions are not performed in a cache-optimal fashion, IIRC >>> dimensions are always traversed in order from left to right. Large >>> speedups are possible in some cases, but in a quick try I didn't manage to >>> come up with an algorithm that would always improve the speed (there was a >>> thread about this last year or so, and there's a ticket). Things varied >>> between computers, so this probably depends a lot on the actual cache >>> arrangement. >>> >>> But perhaps numexpr has such heuristics, and we could steal them? >> >> At least in MultiIter (and I always assumed ufuncs too, but perhaps not) >> there's functionality to remove the largest dimension so that it can be >> put innermost in a loop. In many situations, removing the dimension with >> the smallest stride from the iterator would probably work much better. >> >> It's all about balancing iterator overhead and memory overhead. Something >> simple like "select the dimension with length > 200 which has smallest >> stride, or the dimension with largest length if none are above 200" would >> perhaps work well? > > There's more to it than that: there's no point (I think) going with > the smallest stride if that stride is more than a cache line (usually > 64 bytes). There are further optimizations - often > stride[i]*shape[i]==shape[j] for some (i,j), and in that case these > two dimensions can be treated as one with stride stride[i] and length > shape[i]*shape[j]. Applying this would "unroll" all C- or > Fortran-contiguous arrays to a single non-nested loop. > > Of course, reordering ufuncs is potentially hazardous in terms of > breaking user code: the effect of > A[1:]*=A[:-1] > depends on whether you run through A in index order or reverse index > order (which might be the order it's stored in memory, potentially > faster from a cache point of view). > I think Travis O suggested at some point to make this kind of "overlapping inplace" operations invalid. (maybe it was someone else ...) The idea was to protect new ("uninitiated") users from unexpected results. Anyway, maybe one could use (something like) np.seterror to determine how to handle these cases... -Sebastian From charlesr.harris at gmail.com Sat Mar 20 16:18:53 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Mar 2010 14:18:53 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. 
In-Reply-To: <4BA522DC.2050100@american.edu> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac wrote: > On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: > > As far as I know, stars are the only way to render a list in > > restructured txt, otherwise it looses the list formatting. > > Try a definition list? > Example below. > Alan > > > Returns > ------- > > q, r if mode = 'full': > - q : ndarray of float or complex, shape (M, K) > - r : ndarray of float or complex, shape (K, N) > > K = min(M, N) > > r if mode = 'r': > - r : ndarray of float or complex, shape (K, N) > > a2 if mode = 'economic': > - a2 : ndarray of float or complex, shape (M, N) > > The diagonal and the upper triangle of a2 contains r, > while the rest of the matrix is undefined. > > Maybe handle it in a manner similar to the other sections. q,r <> mode = 'r'' q: [M,N] ndarray The columns of 'q' are orthonomal. r: [K,N] ndarray Upper triangular array. ... The "<>" standing in for "if". The indentation could be moved out. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Sat Mar 20 17:36:24 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 20 Mar 2010 17:36:24 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <1269105981.13114.2.camel@Nokia-N900-42-11> <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> Message-ID: On 20 March 2010 16:18, Sebastian Haase wrote: > On Sat, Mar 20, 2010 at 8:22 PM, Anne Archibald > wrote: >> On 20 March 2010 14:56, Dag Sverre Seljebotn >> wrote: >>> Pauli Virtanen wrote: >>>> Anne Archibald wrote: >>>>> I'm not knocking numpy; it does (almost) the best it can. (I'm not >>>>> sure of the optimality of the order in which ufuncs are executed; I >>>>> think some optimizations there are possible.) >>>> >>>> Ufuncs and reductions are not performed in a cache-optimal fashion, IIRC >>>> dimensions are always traversed in order from left to right. Large >>>> speedups are possible in some cases, but in a quick try I didn't manage to >>>> come up with an algorithm that would always improve the speed (there was a >>>> thread about this last year or so, and there's a ticket). Things varied >>>> between computers, so this probably depends a lot on the actual cache >>>> arrangement. >>>> >>>> But perhaps numexpr has such heuristics, and we could steal them? >>> >>> At least in MultiIter (and I always assumed ufuncs too, but perhaps not) >>> there's functionality to remove the largest dimension so that it can be >>> put innermost in a loop. In many situations, removing the dimension with >>> the smallest stride from the iterator would probably work much better. >>> >>> It's all about balancing iterator overhead and memory overhead. Something >>> simple like "select the dimension with length > 200 which has smallest >>> stride, or the dimension with largest length if none are above 200" would >>> perhaps work well? >> >> There's more to it than that: there's no point (I think) going with >> the smallest stride if that stride is more than a cache line (usually >> 64 bytes). 
There are further optimizations - often >> stride[i]*shape[i]==shape[j] for some (i,j), and in that case these >> two dimensions can be treated as one with stride stride[i] and length >> shape[i]*shape[j]. Applying this would "unroll" all C- or >> Fortran-contiguous arrays to a single non-nested loop. >> >> Of course, reordering ufuncs is potentially hazardous in terms of >> breaking user code: the effect of >> A[1:]*=A[:-1] >> depends on whether you run through A in index order or reverse index >> order (which might be the order it's stored in memory, potentially >> faster from a cache point of view). >> > I think Travis O suggested at some point to make this kind of > "overlapping inplace" operations invalid. > (maybe it was someone else ...) > The idea was to protect new ("uninitiated") users from unexpected results. > Anyway, maybe one could use (something like) np.seterror to determine > how to handle these cases... I was in on that discussion. My recollection of the conclusion was that on the one hand they're useful, carefully applied, while on the other hand they're very difficult to reliably detect (since you don't want to forbid operations on non-overlapping slices of the same array). Anne > -Sebastian > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at googlemail.com Sat Mar 20 21:45:38 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 09:45:38 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: On Sun, Mar 21, 2010 at 4:18 AM, Charles R Harris wrote: > > > On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac wrote: > >> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: >> > As far as I know, stars are the only way to render a list in >> > restructured txt, otherwise it looses the list formatting. >> >> Try a definition list? >> Example below. >> Alan >> >> >> Returns >> ------- >> >> q, r if mode = 'full': >> - q : ndarray of float or complex, shape (M, K) >> - r : ndarray of float or complex, shape (K, N) >> >> K = min(M, N) >> >> r if mode = 'r': >> - r : ndarray of float or complex, shape (K, N) >> >> a2 if mode = 'economic': >> - a2 : ndarray of float or complex, shape (M, N) >> >> The diagonal and the upper triangle of a2 contains r, >> while the rest of the matrix is undefined. >> >> > Maybe handle it in a manner similar to the other sections. > > q,r <> mode = 'r'' > q: [M,N] ndarray > The columns of 'q' are orthonomal. > r: [K,N] ndarray > Upper triangular array. > ... > > The "<>" standing in for "if". The indentation could be moved out. > > Looks good, but what determines that this is a list, the <>? What if you want a list that does not use if's? If this can be made to work, great, but it will probably be much more robust if there's some kind of markup. Stars or dashes would not look that bad imho if there would be no need for blank lines. Also, if someone feels like tackling this, please make multi-line list items work at the same time. See http://code.google.com/p/pydocweb/issues/detail?id=46 Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
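To make the two points quoted in the thread above concrete -- collapsing contiguous dimensions and the traversal-order dependence of overlapping in-place updates -- here is a small plain-NumPy sketch. It is only an illustration, not the iterator code under discussion, and np.may_share_memory is merely a bounds-overlap check:

    import numpy as np

    # Dimension merging for contiguous data: for a C-contiguous array two
    # adjacent loops can be collapsed into one whenever
    #     strides[i] == strides[i+1] * shape[i+1]
    c = np.empty((3, 4, 5))                       # float64: strides == (160, 40, 8)
    c.strides[1] * c.shape[1] == c.strides[0]     # True -> dims 0 and 1 merge
    c.strides[2] * c.shape[2] == c.strides[1]     # True -> dims 1 and 2 merge

    # Why reordering overlapping in-place updates is hazardous: the result
    # of A[1:] *= A[:-1] depends on the traversal order.
    a = np.array([1., 2., 3., 4.])
    for i in range(1, len(a)):                    # forward, index-order traversal
        a[i] *= a[i - 1]
    # a is now [1., 2., 6., 24.] -- each step sees the already-updated neighbour

    b = np.array([1., 2., 3., 4.])
    b[1:] *= b[:-1].copy()                        # copy the right-hand side first
    # b is now [1., 2., 6., 12.] -- the order-independent result

    # Detecting the hazard is hard because the cheap test is only a bounds check:
    x = np.arange(10.)
    np.may_share_memory(x[1:], x[:-1])            # True, and they really do overlap
    np.may_share_memory(x[::2], x[1::2])          # also True, though the elements are disjoint
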
URL: From charlesr.harris at gmail.com Sat Mar 20 22:18:26 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Mar 2010 20:18:26 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: On Sat, Mar 20, 2010 at 7:45 PM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 4:18 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac wrote: >> >>> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: >>> > As far as I know, stars are the only way to render a list in >>> > restructured txt, otherwise it looses the list formatting. >>> >>> Try a definition list? >>> Example below. >>> Alan >>> >>> >>> Returns >>> ------- >>> >>> q, r if mode = 'full': >>> - q : ndarray of float or complex, shape (M, K) >>> - r : ndarray of float or complex, shape (K, N) >>> >>> K = min(M, N) >>> >>> r if mode = 'r': >>> - r : ndarray of float or complex, shape (K, N) >>> >>> a2 if mode = 'economic': >>> - a2 : ndarray of float or complex, shape (M, N) >>> >>> The diagonal and the upper triangle of a2 contains r, >>> while the rest of the matrix is undefined. >>> >>> >> Maybe handle it in a manner similar to the other sections. >> >> q,r <> mode = 'r'' >> q: [M,N] ndarray >> The columns of 'q' are orthonomal. >> r: [K,N] ndarray >> Upper triangular array. >> ... >> >> The "<>" standing in for "if". The indentation could be moved out. >> >> Looks good, but what determines that this is a list, the <>? What if you > want a list that does not use if's? If this can be made to work, great, but > it will probably be much more robust if there's some kind of markup. Stars > or dashes would not look that bad imho if there would be no need for blank > lines. > > That was just a suggestion, I think it can probably be improved upon. Thoughts? > Also, if someone feels like tackling this, please make multi-line list > items work at the same time. See > http://code.google.com/p/pydocweb/issues/detail?id=46 > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Mar 20 22:54:02 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 10:54:02 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: On Sun, Mar 21, 2010 at 10:18 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, Mar 20, 2010 at 7:45 PM, Ralf Gommers > wrote: > >> >> >> On Sun, Mar 21, 2010 at 4:18 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac wrote: >>> >>>> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: >>>> > As far as I know, stars are the only way to render a list in >>>> > restructured txt, otherwise it looses the list formatting. >>>> >>>> Try a definition list? >>>> Example below. 
>>>> Alan >>>> >>>> >>>> Returns >>>> ------- >>>> >>>> q, r if mode = 'full': >>>> - q : ndarray of float or complex, shape (M, K) >>>> - r : ndarray of float or complex, shape (K, N) >>>> >>>> K = min(M, N) >>>> >>>> r if mode = 'r': >>>> - r : ndarray of float or complex, shape (K, N) >>>> >>>> a2 if mode = 'economic': >>>> - a2 : ndarray of float or complex, shape (M, N) >>>> >>>> The diagonal and the upper triangle of a2 contains r, >>>> while the rest of the matrix is undefined. >>>> >>>> >>> Maybe handle it in a manner similar to the other sections. >>> >>> q,r <> mode = 'r'' >>> q: [M,N] ndarray >>> The columns of 'q' are orthonomal. >>> r: [K,N] ndarray >>> Upper triangular array. >>> ... >>> >>> The "<>" standing in for "if". The indentation could be moved out. >>> >>> Looks good, but what determines that this is a list, the <>? What if you >> want a list that does not use if's? If this can be made to work, great, but >> it will probably be much more robust if there's some kind of markup. Stars >> or dashes would not look that bad imho if there would be no need for blank >> lines. >> >> > That was just a suggestion, I think it can probably be improved upon. > Thoughts? > In general a list should just be defined with *. Like: * item 1 * sub-item 1 Hey, a multi-line sub-item works too! * sub-item 2 * item 2 In the specific case of a variable number of return values, I do not like the if..else construction. How about this: q : ndarray The q-value. If mode='r' this contains .... If mode='economic' .... r : ndarray, optional The r-value. Is only returned if mode='r'. 'optional' could be changed to 'conditional' or something like that. Ralf > >> Also, if someone feels like tackling this, please make multi-line list >> items work at the same time. See >> http://code.google.com/p/pydocweb/issues/detail?id=46 >> >> > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 20 23:59:34 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 20 Mar 2010 23:59:34 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: <1cd32cbb1003202059y117430bfu9faa6cd205642142@mail.gmail.com> On Sat, Mar 20, 2010 at 10:54 PM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 10:18 AM, Charles R Harris > wrote: >> >> >> On Sat, Mar 20, 2010 at 7:45 PM, Ralf Gommers >> wrote: >>> >>> >>> On Sun, Mar 21, 2010 at 4:18 AM, Charles R Harris >>> wrote: >>>> >>>> >>>> On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac >>>> wrote: >>>>> >>>>> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: >>>>> > As far as I know, stars are the only way to render a list in >>>>> > restructured txt, otherwise it looses the list formatting. >>>>> >>>>> Try a definition list? >>>>> Example below. >>>>> Alan >>>>> >>>>> >>>>> Returns >>>>> ------- >>>>> >>>>> q, r if mode = 'full': >>>>> ? ?- q : ndarray of float or complex, shape (M, K) >>>>> ? ?- r : ndarray of float or complex, shape (K, N) >>>>> >>>>> ? ?K = min(M, N) >>>>> >>>>> r if mode = 'r': >>>>> ? ?- r : ndarray of float or complex, shape (K, N) >>>>> >>>>> a2 if mode = 'economic': >>>>> ? ?- a2 : ndarray of float or complex, shape (M, N) >>>>> >>>>> ? 
?The diagonal and the upper triangle of a2 contains r, >>>>> ? ?while the rest of the matrix is undefined. >>>>> >>>> >>>> Maybe handle it in a manner similar to the other sections. >>>> >>>> q,r <> mode = 'r'' >>>> ??? q: [M,N] ndarray >>>> ??????? The columns of 'q' are orthonomal. >>>> ??? r:? [K,N] ndarray >>>> ??????? Upper triangular array. >>>> ... >>>> >>>> The "<>" standing in for "if". The indentation could be moved out. >>>> >>> Looks good, but what determines that this is a list, the <>? What if you >>> want a list that does not use if's? If this can be made to work, great, but >>> it will probably be much more robust if there's some kind of markup. Stars >>> or dashes would not look that bad imho if there would be no need for blank >>> lines. >>> >> >> That was just a suggestion, I think it can probably be improved upon. >> Thoughts? > > In general a list should just be defined with *. Like: > * item 1 > ??? * sub-item 1 > ????? Hey, a multi-line sub-item works too! > ??? * sub-item 2 > * item 2 > > In the specific case of a variable number of return values, I do not like > the if..else construction. How about this: > > q : ndarray > ??? The q-value. If mode='r' this contains .... > ??? If mode='economic' .... > r : ndarray, optional > ??? The r-value. Is only returned if mode='r'. > > 'optional' could be changed to 'conditional' or something like that. I agree this is better and I also thought about doing it this way. Another thought: if "terminal" refers to ipython then it should be possible that ipython removes some of the markup (stars?) when doing its magic, or not? Although that wouldn't help with a plain terminal or command shell. Josef > > Ralf > >> >>> >>> Also, if someone feels like tackling this, please make multi-line list >>> items work at the same time. See >>> http://code.google.com/p/pydocweb/issues/detail?id=46 >>> >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Sun Mar 21 00:24:25 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Mar 2010 22:24:25 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: On Sat, Mar 20, 2010 at 8:54 PM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 10:18 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sat, Mar 20, 2010 at 7:45 PM, Ralf Gommers < >> ralf.gommers at googlemail.com> wrote: >> >>> >>> >>> On Sun, Mar 21, 2010 at 4:18 AM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac wrote: >>>> >>>>> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: >>>>> > As far as I know, stars are the only way to render a list in >>>>> > restructured txt, otherwise it looses the list formatting. >>>>> >>>>> Try a definition list? >>>>> Example below. 
>>>>> Alan >>>>> >>>>> >>>>> Returns >>>>> ------- >>>>> >>>>> q, r if mode = 'full': >>>>> - q : ndarray of float or complex, shape (M, K) >>>>> - r : ndarray of float or complex, shape (K, N) >>>>> >>>>> K = min(M, N) >>>>> >>>>> r if mode = 'r': >>>>> - r : ndarray of float or complex, shape (K, N) >>>>> >>>>> a2 if mode = 'economic': >>>>> - a2 : ndarray of float or complex, shape (M, N) >>>>> >>>>> The diagonal and the upper triangle of a2 contains r, >>>>> while the rest of the matrix is undefined. >>>>> >>>>> >>>> Maybe handle it in a manner similar to the other sections. >>>> >>>> q,r <> mode = 'r'' >>>> q: [M,N] ndarray >>>> The columns of 'q' are orthonomal. >>>> r: [K,N] ndarray >>>> Upper triangular array. >>>> ... >>>> >>>> The "<>" standing in for "if". The indentation could be moved out. >>>> >>>> Looks good, but what determines that this is a list, the <>? What if you >>> want a list that does not use if's? If this can be made to work, great, but >>> it will probably be much more robust if there's some kind of markup. Stars >>> or dashes would not look that bad imho if there would be no need for blank >>> lines. >>> >>> >> That was just a suggestion, I think it can probably be improved upon. >> Thoughts? >> > > In general a list should just be defined with *. Like: > * item 1 > * sub-item 1 > Hey, a multi-line sub-item works too! > * sub-item 2 > * item 2 > > I really, really want to get rid of the asterisks, they are ugly and distracting (IMHO). Unlike dashes, colons, and underlines they aren't part of the usual textual repetoire. > In the specific case of a variable number of return values, I do not like > the if..else construction. How about this: > > q : ndarray > The q-value. If mode='r' this contains .... > If mode='economic' .... > r : ndarray, optional > The r-value. Is only returned if mode='r'. > > In the case at hand, q is optional and r has two forms. > 'optional' could be changed to 'conditional' or something like that. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Mar 21 00:39:03 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 12:39:03 +0800 Subject: [Numpy-discussion] draft release guide Message-ID: Hi all, At http://github.com/rgommers/NumPy-release-guide you can find a summary of how to set up your system to build numpy binaries on OS X. I still have to add info on scipy (that's turning out to be fairly painful) but for numpy it is pretty complete. Any feedback is appreciated! Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Mar 21 00:44:57 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 12:44:57 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: On Sun, Mar 21, 2010 at 12:24 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > Maybe handle it in a manner similar to the other sections. >>>>> >>>>> q,r <> mode = 'r'' >>>>> q: [M,N] ndarray >>>>> The columns of 'q' are orthonomal. >>>>> r: [K,N] ndarray >>>>> Upper triangular array. >>>>> ... >>>>> >>>>> The "<>" standing in for "if". The indentation could be moved out. >>>>> >>>>> Looks good, but what determines that this is a list, the <>? What if >>>> you want a list that does not use if's? 
If this can be made to work, great, >>>> but it will probably be much more robust if there's some kind of markup. >>>> Stars or dashes would not look that bad imho if there would be no need for >>>> blank lines. >>>> >>>> >>> That was just a suggestion, I think it can probably be improved upon. >>> Thoughts? >>> >> >> In general a list should just be defined with *. Like: >> * item 1 >> * sub-item 1 >> Hey, a multi-line sub-item works too! >> * sub-item 2 >> * item 2 >> >> > I really, really want to get rid of the asterisks, they are ugly and > distracting (IMHO). Unlike dashes, colons, and underlines they aren't part > of the usual textual repetoire. > OK, all dashes then. Those are also valid reST list specifiers. > > >> In the specific case of a variable number of return values, I do not like >> the if..else construction. How about this: >> >> q : ndarray >> The q-value. If mode='r' this contains .... >> If mode='economic' .... >> r : ndarray, optional >> The r-value. Is only returned if mode='r'. >> >> > In the case at hand, q is optional and r has two forms. > Sure, it was just an example. As long as you agree that it's better than "if mode='x' then ...", "if mode='y' then ...". Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 21 00:47:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 21 Mar 2010 00:47:53 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> On Sun, Mar 21, 2010 at 12:24 AM, Charles R Harris wrote: > > > On Sat, Mar 20, 2010 at 8:54 PM, Ralf Gommers > wrote: >> >> >> On Sun, Mar 21, 2010 at 10:18 AM, Charles R Harris >> wrote: >>> >>> >>> On Sat, Mar 20, 2010 at 7:45 PM, Ralf Gommers >>> wrote: >>>> >>>> >>>> On Sun, Mar 21, 2010 at 4:18 AM, Charles R Harris >>>> wrote: >>>>> >>>>> >>>>> On Sat, Mar 20, 2010 at 1:32 PM, Alan G Isaac >>>>> wrote: >>>>>> >>>>>> On 3/20/2010 2:15 PM, josef.pktd at gmail.com wrote: >>>>>> > As far as I know, stars are the only way to render a list in >>>>>> > restructured txt, otherwise it looses the list formatting. >>>>>> >>>>>> Try a definition list? >>>>>> Example below. >>>>>> Alan >>>>>> >>>>>> >>>>>> Returns >>>>>> ------- >>>>>> >>>>>> q, r if mode = 'full': >>>>>> ? ?- q : ndarray of float or complex, shape (M, K) >>>>>> ? ?- r : ndarray of float or complex, shape (K, N) >>>>>> >>>>>> ? ?K = min(M, N) >>>>>> >>>>>> r if mode = 'r': >>>>>> ? ?- r : ndarray of float or complex, shape (K, N) >>>>>> >>>>>> a2 if mode = 'economic': >>>>>> ? ?- a2 : ndarray of float or complex, shape (M, N) >>>>>> >>>>>> ? ?The diagonal and the upper triangle of a2 contains r, >>>>>> ? ?while the rest of the matrix is undefined. >>>>>> >>>>> >>>>> Maybe handle it in a manner similar to the other sections. >>>>> >>>>> q,r <> mode = 'r'' >>>>> ??? q: [M,N] ndarray >>>>> ??????? The columns of 'q' are orthonomal. >>>>> ??? r:? [K,N] ndarray >>>>> ??????? Upper triangular array. >>>>> ... >>>>> >>>>> The "<>" standing in for "if". The indentation could be moved out. >>>>> >>>> Looks good, but what determines that this is a list, the <>? What if you >>>> want a list that does not use if's? If this can be made to work, great, but >>>> it will probably be much more robust if there's some kind of markup. 
Stars >>>> or dashes would not look that bad imho if there would be no need for blank >>>> lines. >>>> >>> >>> That was just a suggestion, I think it can probably be improved upon. >>> Thoughts? >> >> In general a list should just be defined with *. Like: >> * item 1 >> ??? * sub-item 1 >> ????? Hey, a multi-line sub-item works too! >> ??? * sub-item 2 >> * item 2 >> > > I really, really want to get rid of the asterisks, they are ugly and > distracting (IMHO). Unlike dashes, colons, and underlines they aren't part > of the usual textual repetoire. but bullets (bullet lists) are, at least in latex and powerpoint, and asterisks are the ASCI version of bullet points, and all my regular notes (targeted for wiki markup or rst) use them heavily to structure lists. dashes would be also ok, but I don't think rst would recognize them. My main problem with rst is that it doesn't allow hard line breaks, which forces some kind of markup to render lists. But as Ralf's rewrite shows, there is often a way to get around lists. I only know of a few places in the scipy docs, where list items in the Parameters or Returns are really necessary. Josef > >> >> In the specific case of a variable number of return values, I do not like >> the if..else construction. How about this: >> >> q : ndarray >> ??? The q-value. If mode='r' this contains .... >> ??? If mode='economic' .... >> r : ndarray, optional >> ??? The r-value. Is only returned if mode='r'. >> > > In the case at hand, q is optional and r has two forms. > >> >> 'optional' could be changed to 'conditional' or something like that. >> > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ralf.gommers at googlemail.com Sun Mar 21 00:54:06 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 12:54:06 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> Message-ID: On Sun, Mar 21, 2010 at 12:47 PM, wrote: > > dashes would be also ok, but I don't think rst would recognize them. > Valid list markers are *, + and - according to http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#bullet-lists My main problem with rst is that it doesn't allow hard line breaks, > > Agreed, too many blank lines are needed. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Sun Mar 21 09:43:34 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 21 Mar 2010 09:43:34 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> Message-ID: <4BA62286.2060502@american.edu> On 3/21/2010 12:24 AM, Charles R Harris wrote: > I really, really want to get rid of the asterisks, they are ugly and > distracting (IMHO). I agree, which is why my deflist example did not use asterisks. I consider it readable and only very slightly verbose. (The blank lines are needed by reST, but the whole example is still only 14 lines long.) I also like `if` much better than `<>`, which I also find visually distracting. 
fwiw, Alan PS A more compact example: q, r if mode = 'full': - q : ndarray of float or complex, shape (M, K) - r : ndarray of float or complex, shape (K, N) r if mode = 'r': - r : ndarray of float or complex, shape (K, N) a2 if mode = 'economic': - a2 : ndarray of float or complex, shape (M, N) K = min(M, N). The diagonal and the upper triangle of `a2` contains `r`, while the rest of `a2` is undefined. From aisaac at american.edu Sun Mar 21 09:47:50 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 21 Mar 2010 09:47:50 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> Message-ID: <4BA62386.5060200@american.edu> On 3/21/2010 12:47 AM, josef.pktd at gmail.com wrote: > dashes would be also ok, but I don't think rst would recognize them. It does. But again, a definition list (using indentation) is also a list structure. It needs no markup besides the indentation. > My main problem with rst is that it doesn't allow hard line breaks, > which forces some kind of markup to render lists. Not true: you can use preformatted text if you want. But of course the result in not a list environment (e.g., in HTML or LatTeX output). Alan Isaac From aisaac at american.edu Sun Mar 21 09:51:00 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 21 Mar 2010 09:51:00 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003201115u1f442ff7j518338dd197695e2@mail.gmail.com> <4BA522DC.2050100@american.edu> <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> Message-ID: <4BA62444.4080708@american.edu> On 3/21/2010 12:54 AM, Ralf Gommers wrote: > too many blank lines are needed Please define "need" after seeing the compact example I posted. Personally, I think reST makes the right trade-offs, minimizing markup within the constraint of being unambiguous. Alan Isaac From josef.pktd at gmail.com Sun Mar 21 09:57:51 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 21 Mar 2010 09:57:51 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <4BA62444.4080708@american.edu> References: <4BA522DC.2050100@american.edu> <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> Message-ID: <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> On Sun, Mar 21, 2010 at 9:51 AM, Alan G Isaac wrote: > On 3/21/2010 12:54 AM, Ralf Gommers wrote: >> too many blank lines are needed > > Please define "need" after seeing the compact example I posted. > > Personally, I think reST makes the right trade-offs, > minimizing markup within the constraint of being unambiguous. I tried http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.convolve/diff/4791/5687/ last night, but no version looks really nice. I didn't manage the definition list. The mode parameter description is an example for the most common case when we need to do lists in the Parameters descriptions. But I don't think we have consistent use of markup for this case until now One alternative is here: http://docs.scipy.org/scipy/docs/scipy.interpolate.rbf.Rbf/ A good example that can be used as pattern and is acceptable would be useful. 
Josef > > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at googlemail.com Sun Mar 21 09:58:42 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 21:58:42 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <4BA62444.4080708@american.edu> References: <4BA522DC.2050100@american.edu> <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> Message-ID: On Sun, Mar 21, 2010 at 9:51 PM, Alan G Isaac wrote: > On 3/21/2010 12:54 AM, Ralf Gommers wrote: > > too many blank lines are needed > > Please define "need" after seeing the compact example I posted. > > You need 4 blank lines in your example. Now I tried adding a description for the first argument (q) like this: q, r if mode = 'full' : - q : ndarray of float or complex, shape (M, K) Description of `q`. - r : ndarray of float or complex, shape (K, N) That doesn't work, you need yet more blank lines (try this in the wiki editor). I just changed the docstring to the following, looks much better in both plain text and html imho: q : ndarray of float or complex, optional The orthonormal matrix, of shape (M, K). Only returned if ``mode='full'``. r : ndarray of float or complex, optional The upper-triangular matrix, of shape (K, N) with K = min(M, N). Only returned when ``mode='full'`` or ``mode='r'``. a2 : ndarray of float or complex, optional Array of shape (M, N), only returned when ``mode='economic``'. The diagonal and the upper triangle of `a2` contains `r`, while the rest of the matrix is undefined. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Mar 21 10:01:29 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 22:01:29 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> Message-ID: On Sun, Mar 21, 2010 at 9:58 PM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 9:51 PM, Alan G Isaac wrote: > >> On 3/21/2010 12:54 AM, Ralf Gommers wrote: >> > too many blank lines are needed >> >> Please define "need" after seeing the compact example I posted. >> >> You need 4 blank lines in your example. Now I tried adding a description > for the first argument (q) like this: > > q, r if mode = 'full' : > - q : ndarray of float or complex, shape (M, K) > Description of `q`. > > - r : ndarray of float or complex, shape (K, N) > > That doesn't work, you need yet more blank lines (try this in the wiki > editor). > > > I just changed the docstring to the following, looks much better in both > plain text and html imho: > > > q : ndarray of float or complex, optional > The orthonormal matrix, of shape (M, K). Only returned if > ``mode='full'``. > r : ndarray of float or complex, optional > The upper-triangular matrix, of shape (K, N) with K = min(M, N). > Only returned when ``mode='full'`` or ``mode='r'``. > a2 : ndarray of float or complex, optional > Array of shape (M, N), only returned when ``mode='economic``'. > The diagonal and the upper triangle of `a2` contains `r`, while > the rest of the matrix is undefined. > This line in the code is fairly amusing by the way: # economic mode. Isn't actually economic. 
Economic mode is very similar to 'r' mode anyway, what's the point? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Mar 21 10:16:45 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 22:16:45 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> Message-ID: On Sun, Mar 21, 2010 at 9:57 PM, wrote: > On Sun, Mar 21, 2010 at 9:51 AM, Alan G Isaac wrote: > > On 3/21/2010 12:54 AM, Ralf Gommers wrote: > >> too many blank lines are needed > > > > Please define "need" after seeing the compact example I posted. > > > > Personally, I think reST makes the right trade-offs, > > minimizing markup within the constraint of being unambiguous. > > I tried > > http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.convolve/diff/4791/5687/ > > last night, but no version looks really nice. I didn't manage the > definition list. > > The mode parameter description is an example for the most common case > when we need to do lists in the Parameters descriptions. > > But I don't think we have consistent use of markup for this case until now > > One alternative is here: > http://docs.scipy.org/scipy/docs/scipy.interpolate.rbf.Rbf/ > > A good example that can be used as pattern and is acceptable would be > useful. > > Both look sort of okay, but are abusing the syntax. What do you think about the following: 1. Do not use lists with multiple indentation levels, it just doesn't look good and should not be necessary. 2. Use dashes for simple lists. 3. List with multi-line items are broken only inside the Parameters/Returns sections. This is a bug and simply needs to be fixed. (this would fix both of your examples) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Mar 21 10:23:09 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 21 Mar 2010 08:23:09 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> Message-ID: On Sun, Mar 21, 2010 at 8:01 AM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 9:58 PM, Ralf Gommers > wrote: > >> >> >> On Sun, Mar 21, 2010 at 9:51 PM, Alan G Isaac wrote: >> >>> On 3/21/2010 12:54 AM, Ralf Gommers wrote: >>> > too many blank lines are needed >>> >>> Please define "need" after seeing the compact example I posted. >>> >>> You need 4 blank lines in your example. Now I tried adding a description >> for the first argument (q) like this: >> >> q, r if mode = 'full' : >> - q : ndarray of float or complex, shape (M, K) >> Description of `q`. >> >> - r : ndarray of float or complex, shape (K, N) >> >> That doesn't work, you need yet more blank lines (try this in the wiki >> editor). >> >> >> I just changed the docstring to the following, looks much better in both >> plain text and html imho: >> >> >> q : ndarray of float or complex, optional >> The orthonormal matrix, of shape (M, K). Only returned if >> ``mode='full'``. >> r : ndarray of float or complex, optional >> The upper-triangular matrix, of shape (K, N) with K = min(M, N). >> Only returned when ``mode='full'`` or ``mode='r'``. 
>> a2 : ndarray of float or complex, optional >> Array of shape (M, N), only returned when ``mode='economic``'. >> The diagonal and the upper triangle of `a2` contains `r`, while >> the rest of the matrix is undefined. >> > > This line in the code is fairly amusing by the way: > # economic mode. Isn't actually economic. > > Economic mode is very similar to 'r' mode anyway, what's the point? > > Economic mode is what the low level algorithm likely returns, it contains the info needed to contruct q if needed, or to efficiently apply q to different vectors without constructing q; constructing q adds to the computational and memory costs, as does pulling r out of the economic return. The situation is analogous to the LU decomposition where the natural form is to store both L and U in the original matrix. Other algorithms can then use that compact form to solve equations with different right hand sides. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 21 10:41:19 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 21 Mar 2010 10:41:19 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> Message-ID: <1cd32cbb1003210741h34759c65p15d2e7570bdd6ca9@mail.gmail.com> On Sun, Mar 21, 2010 at 10:16 AM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 9:57 PM, wrote: >> >> On Sun, Mar 21, 2010 at 9:51 AM, Alan G Isaac wrote: >> > On 3/21/2010 12:54 AM, Ralf Gommers wrote: >> >> too many blank lines are needed >> > >> > Please define "need" after seeing the compact example I posted. >> > >> > Personally, I think reST makes the right trade-offs, >> > minimizing markup within the constraint of being unambiguous. >> >> I tried >> >> http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.convolve/diff/4791/5687/ >> >> last night, but no version looks really nice. I didn't manage the >> definition list. >> >> The mode parameter description is an example for ?the most common case >> when we need to do lists in the Parameters descriptions. >> >> But I don't think we have consistent use of markup for this case until now >> >> One alternative is here: >> http://docs.scipy.org/scipy/docs/scipy.interpolate.rbf.Rbf/ >> >> A good example that can be used as pattern and is acceptable would be >> useful. >> > Both look sort of okay, but are abusing the syntax. > > What do you think about the following: > 1. Do not use lists with multiple indentation levels, it just doesn't look > good and should not be necessary. > 2. Use dashes for simple lists. both fine with me, we can convert asterisks to dashes > 3. List with multi-line items are broken only inside the Parameters/Returns > sections. This is a bug and simply needs to be fixed. (this would fix both > of your examples) Does this mean if this bug gets fixed, then we wouldn't need the extra empty lines between list items? 
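For readers following along, a short sketch of what the three modes being documented return, using the mode names and shapes from this thread (a 5x3 input, so M = 5, N = 3 and K = min(M, N) = 3). This is illustrative only and assumes the qr signature discussed here:

    import numpy as np

    a = np.random.randn(5, 3)

    q, r = np.linalg.qr(a, mode='full')       # q: (5, 3), r: (3, 3)
    r2 = np.linalg.qr(a, mode='r')            # r only: (3, 3)
    a2 = np.linalg.qr(a, mode='economic')     # single (5, 3) array; the upper
                                              # triangle holds r, the rest is undefined

    np.allclose(np.dot(q, r), a)              # True: q times r reconstructs a
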
Currently, the rendering in the doc editor view for item lists has also wrong indentation http://docs.scipy.org/numpy/docs/numpy.ndarray.transpose/ but the html looks ok http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html (correctly rendered definition lists might be nicer than bullet lists in html) Josef > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ralf.gommers at googlemail.com Sun Mar 21 10:53:14 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Mar 2010 22:53:14 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <1cd32cbb1003210741h34759c65p15d2e7570bdd6ca9@mail.gmail.com> References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> <1cd32cbb1003210741h34759c65p15d2e7570bdd6ca9@mail.gmail.com> Message-ID: On Sun, Mar 21, 2010 at 10:41 PM, wrote: > On Sun, Mar 21, 2010 at 10:16 AM, Ralf Gommers > wrote: > > > Both look sort of okay, but are abusing the syntax. > > > > What do you think about the following: > > 1. Do not use lists with multiple indentation levels, it just doesn't > look > > good and should not be necessary. > > 2. Use dashes for simple lists. > > both fine with me, we can convert asterisks to dashes > > > 3. List with multi-line items are broken only inside the > Parameters/Returns > > sections. This is a bug and simply needs to be fixed. (this would fix > both > > of your examples) > > Does this mean if this bug gets fixed, then we wouldn't need the extra > empty lines between list items? > Yes. The following works in the Notes section, but not in Parameters: - item one - item two this is a multi-line item - item 3 > > Currently, the rendering in the doc editor view for item lists has > also wrong indentation > http://docs.scipy.org/numpy/docs/numpy.ndarray.transpose/ > but the html looks ok > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html > Html looks fine indeed. It should still look like that once that bug is fixed. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Sun Mar 21 12:27:47 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 21 Mar 2010 12:27:47 -0400 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <4BA522DC.2050100@american.edu> <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> Message-ID: <4BA64903.8050005@american.edu> >> On 3/21/2010 12:54 AM, Ralf Gommers wrote: >>> too many blank lines are needed > On Sun, Mar 21, 2010 at 9:51 PM, Alan G Isaac > wrote: >> Please define "need" after seeing the compact example I posted. On 3/21/2010 9:58 AM, Ralf Gommers wrote: > You need 4 blank lines in your example. Now I tried adding a description Here is the compact example I posted. q, r if mode = 'full': - q : ndarray of float or complex, shape (M, K) - r : ndarray of float or complex, shape (K, N) r if mode = 'r': - r : ndarray of float or complex, shape (K, N) a2 if mode = 'economic': - a2 : ndarray of float or complex, shape (M, N) K = min(M, N). The diagonal and the upper triangle of `a2` contains `r`, while the rest of `a2` is undefined. 
Alan From sebastian.walter at gmail.com Sun Mar 21 13:13:00 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Sun, 21 Mar 2010 18:13:00 +0100 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <75633AD8-176D-452C-8A54-C43FD2BC9539@cs.toronto.edu> References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <75633AD8-176D-452C-8A54-C43FD2BC9539@cs.toronto.edu> Message-ID: On Fri, Mar 19, 2010 at 11:18 PM, David Warde-Farley wrote: > On 19-Mar-10, at 1:13 PM, Anne Archibald wrote: > >> I'm not knocking numpy; it does (almost) the best it can. (I'm not >> sure of the optimality of the order in which ufuncs are executed; I >> think some optimizations there are possible.) But a language designed >> from scratch for vector calculations could certainly compile >> expressions into a form that would save a lot of memory accesses, >> particularly if an optimizer combined many lines of code. I've >> actually thought about whether such a thing could be done in python; I >> think the way to do it would be to build expression objects from >> variable objects, then have a single "apply" function that fed values >> in to all the variables. > > Hey Anne, > > Some folks across town from you at U de M have built just such at > thing. :) > > http://deeplearning.net/software/theano/ > > It does all that, plus automatic differentiation, detection and > correction of numerical instabilities, etc. > > Probably the most amazing thing about it is that with recent versions, > you basically flip a switch and it will instead use an available CUDA- > capable Nvidia GPU instead of the CPU. I'll admit, when James Bergstra > initially told me about this plan to make it possible to transparently > switch to running stuff on the GPU, I thought it was so ambitious that > it would never happen. Then it did... The progress Theano is making is promising. I had several times a look at theano and I like the idea of code generation, especially the numpy support. I hope it may be useful for one of my projects in the future. What I couldn't figure out from the documentation is the actual performance and ease of use. Am I right with the assumption that you are not a Theano dev? Have you used Theano in a project? What are you experiences? Do you happen to know how big the computational graphs can be? Is there the possibility to have loops and if then else statements? Sorry for being a little offtopic here. Sebastian > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Sun Mar 21 18:08:35 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 22 Mar 2010 00:08:35 +0200 Subject: [Numpy-discussion] Array bug with character (regression from 1.4.0) In-Reply-To: References: Message-ID: <1269209315.14208.2.camel@Nokia-N900-42-11> Ryan May wrote: > The following code, which works with numpy 1.4.0, results in an error: Python 2.6, I presume? > In [1]: import numpy as np > In [2]: v = 'm' > In [3]: dt = np.dtype('>c') > In [4]: a = np.asarray(v, dt) > > On SVN trunk: > ValueError: assignment to 0-d array > > In [5]: np.__version__ > Out[5]: '2.0.0.dev8297' > > Thoughts? Nope, but it's likely my bad. Smells a bit like the dtype '>c' has size 0 that doesn't get automatically adjusted in the array constructor, so you end up assigning size-1 string to size-0 array element... 
Which is strange sice I don't think this particular code path has been changed at all. One would have to follow the C execution path with gdb to find out what goes wrong. -- Pauli Virtanen From charlesr.harris at gmail.com Sun Mar 21 18:13:27 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 21 Mar 2010 16:13:27 -0600 Subject: [Numpy-discussion] Array bug with character (regression from 1.4.0) In-Reply-To: <1269209315.14208.2.camel@Nokia-N900-42-11> References: <1269209315.14208.2.camel@Nokia-N900-42-11> Message-ID: On Sun, Mar 21, 2010 at 4:08 PM, Pauli Virtanen wrote: > Ryan May wrote: > > The following code, which works with numpy 1.4.0, results in an error: > > Python 2.6, I presume? > > > In [1]: import numpy as np > > In [2]: v = 'm' > > In [3]: dt = np.dtype('>c') > > In [4]: a = np.asarray(v, dt) > > > > On SVN trunk: > > ValueError: assignment to 0-d array > > > > In [5]: np.__version__ > > Out[5]: '2.0.0.dev8297' > > > > Thoughts? > > Nope, but it's likely my bad. Smells a bit like the dtype '>c' has size 0 > that doesn't get automatically adjusted in the array constructor, so you end > up assigning size-1 string to size-0 array element... Which is strange sice > I don't think this particular code path has been changed at all. > > One would have to follow the C execution path with gdb to find out what > goes wrong. > > I was wondering if this was related to Michael's fixes for character arrays? A little bisection might help localize the problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sun Mar 21 18:29:11 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 22 Mar 2010 00:29:11 +0200 Subject: [Numpy-discussion] Array bug with character (regression from 1.4.0) In-Reply-To: References: <1269209315.14208.2.camel@Nokia-N900-42-11> Message-ID: <1269210551.4260.8.camel@talisman> su, 2010-03-21 kello 16:13 -0600, Charles R Harris kirjoitti: > I was wondering if this was related to Michael's fixes for > character arrays? A little bisection might help localize the problem. It's a bug I introduced in r8144... I forgot one *can* assign strings to 0-d arrays, and strings are indeed one sequence type. I'm going to back that changeset out, since it was only a cosmetic fix. That particular part of code needs some cleanup (it's a bit too hairy if things like this can slip), but I don't have the time at the moment to come up with a more complete fix. Cheers, Pauli From charlesr.harris at gmail.com Sun Mar 21 19:03:44 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 21 Mar 2010 17:03:44 -0600 Subject: [Numpy-discussion] Array bug with character (regression from 1.4.0) In-Reply-To: <1269210551.4260.8.camel@talisman> References: <1269209315.14208.2.camel@Nokia-N900-42-11> <1269210551.4260.8.camel@talisman> Message-ID: On Sun, Mar 21, 2010 at 4:29 PM, Pauli Virtanen wrote: > su, 2010-03-21 kello 16:13 -0600, Charles R Harris kirjoitti: > > I was wondering if this was related to Michael's fixes for > > character arrays? A little bisection might help localize the problem. > > It's a bug I introduced in r8144... I forgot one *can* assign strings to > 0-d arrays, and strings are indeed one sequence type. > > I'm going to back that changeset out, since it was only a cosmetic fix. > That particular part of code needs some cleanup (it's a bit too hairy if > things like this can slip), but I don't have the time at the moment to > come up with a more complete fix. 
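A tiny sketch of the behaviour Pauli describes above: a Python string is itself a sequence, yet assigning one into a 0-d string array has to keep working. The expected results assume a build without the regression (e.g. 1.4.0):

    import numpy as np

    a = np.zeros((), dtype='S4')       # 0-d string array
    a[()] = 'm'                        # assigning a str (a sequence) must keep working

    np.asarray('m', np.dtype('>c'))    # the case from the report; expected: array('m', dtype='|S1')
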
> > Lots of the code needs some cleanup ;) The first step was to reformat it -- still ongoing, I've got changes on hold for after the release -- and break it up into smaller files with some similarity of function. At some point the code needs to be documented and some macros relaced with inline functions. All macros that implement jumps should be removed and replaced by something more transparent. I'd also like to see some of the "policy" type stuff move up to cython, which should be easier to document and understand, not to mention making for a cleaner interface to Python. So on and so on... We could use a few more developers, especially with David getting a real job. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Sun Mar 21 19:06:06 2010 From: rmay31 at gmail.com (Ryan May) Date: Sun, 21 Mar 2010 18:06:06 -0500 Subject: [Numpy-discussion] Array bug with character (regression from 1.4.0) In-Reply-To: <1269210551.4260.8.camel@talisman> References: <1269209315.14208.2.camel@Nokia-N900-42-11> <1269210551.4260.8.camel@talisman> Message-ID: On Sun, Mar 21, 2010 at 5:29 PM, Pauli Virtanen wrote: > su, 2010-03-21 kello 16:13 -0600, Charles R Harris kirjoitti: >> I was wondering if this was related to Michael's fixes for >> character arrays? A little bisection might help localize the problem. > > It's a bug I introduced in r8144... I forgot one *can* assign strings to > 0-d arrays, and strings are indeed one sequence type. > > I'm going to back that changeset out, since it was only a cosmetic fix. > That particular part of code needs some cleanup (it's a bit too hairy if > things like this can slip), but I don't have the time at the moment to > come up with a more complete fix. That fixed it for me, thanks for getting done quickly. What's amusing is that I found it because pupynere was failing to write files where a variable had an attribute that consisted of a single letter. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From charlesr.harris at gmail.com Sun Mar 21 19:07:49 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 21 Mar 2010 17:07:49 -0600 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> <1cd32cbb1003210741h34759c65p15d2e7570bdd6ca9@mail.gmail.com> Message-ID: On Sun, Mar 21, 2010 at 8:53 AM, Ralf Gommers wrote: > > > On Sun, Mar 21, 2010 at 10:41 PM, wrote: > >> On Sun, Mar 21, 2010 at 10:16 AM, Ralf Gommers >> wrote: >> >> > Both look sort of okay, but are abusing the syntax. >> > >> > What do you think about the following: >> > 1. Do not use lists with multiple indentation levels, it just doesn't >> look >> > good and should not be necessary. >> > 2. Use dashes for simple lists. >> >> both fine with me, we can convert asterisks to dashes >> >> > 3. List with multi-line items are broken only inside the >> Parameters/Returns >> > sections. This is a bug and simply needs to be fixed. (this would fix >> both >> > of your examples) >> >> Does this mean if this bug gets fixed, then we wouldn't need the extra >> empty lines between list items? >> > > Yes. 
The following works in the Notes section, but not in Parameters: > - item one > - item two > this is a multi-line item > - item 3 > > >> >> Currently, the rendering in the doc editor view for item lists has >> also wrong indentation >> http://docs.scipy.org/numpy/docs/numpy.ndarray.transpose/ >> but the html looks ok >> >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html >> > > Html looks fine indeed. It should still look like that once that bug is > fixed. > > Could you open a ticket for this? We need to track it somewhere. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Mar 21 22:17:23 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 22 Mar 2010 10:17:23 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: <4BA64903.8050005@american.edu> References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <4BA64903.8050005@american.edu> Message-ID: On Mon, Mar 22, 2010 at 12:27 AM, Alan G Isaac wrote: > >> On 3/21/2010 12:54 AM, Ralf Gommers wrote: > >>> too many blank lines are needed > > > > On Sun, Mar 21, 2010 at 9:51 PM, Alan G Isaac > > wrote: > >> Please define "need" after seeing the compact example I posted. > > > On 3/21/2010 9:58 AM, Ralf Gommers wrote: > > You need 4 blank lines in your example. Now I tried adding a description > > > Here is the compact example I posted. > > q, r if mode = 'full': > - q : ndarray of float or complex, shape (M, K) > - r : ndarray of float or complex, shape (K, N) > r if mode = 'r': > - r : ndarray of float or complex, shape (K, N) > a2 if mode = 'economic': > - a2 : ndarray of float or complex, shape (M, N) > > K = min(M, N). > The diagonal and the upper triangle of `a2` contains `r`, > while the rest of `a2` is undefined. > > Looked at the wrong thing, apologies. I'll play with your example tonight. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Mon Mar 22 00:49:14 2010 From: rmay31 at gmail.com (Ryan May) Date: Sun, 21 Mar 2010 23:49:14 -0500 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass Message-ID: Hi, I found that trapz() doesn't work with subclasses: http://projects.scipy.org/numpy/ticket/1438 A simple patch (attached) to change asarray() to asanyarray() fixes the problem fine. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- A non-text attachment was scrubbed... Name: fix_trapz_subclass.diff Type: application/octet-stream Size: 484 bytes Desc: not available URL: From josef.pktd at gmail.com Mon Mar 22 00:57:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Mar 2010 00:57:16 -0400 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: Message-ID: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: > Hi, > > I found that trapz() doesn't work with subclasses: > > http://projects.scipy.org/numpy/ticket/1438 > > A simple patch (attached) to change asarray() to asanyarray() fixes > the problem fine. Are you sure this function works with matrices and other subclasses? Looking only very briefly at it: the multiplication might be a problem. 
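A minimal sketch of the asarray/asanyarray difference the trapz patch relies on. `Quantity` here is only a hypothetical stand-in for a unit-carrying ndarray subclass; np.matrix is a different story, since it redefines `*` as matrix multiplication, so passing subclasses through does not by itself make trapz matrix-safe:

    import numpy as np

    class Quantity(np.ndarray):
        """Stand-in for a unit-carrying ndarray subclass (hypothetical)."""
        pass

    x = np.linspace(0., 1., 5)
    y = np.linspace(0., 1., 5).view(Quantity)

    type(np.asarray(y))       # ndarray      -- subclass information stripped
    type(np.asanyarray(y))    # Quantity     -- subclass passes through

    # trapz-style arithmetic works on any subclass that keeps ndarray's
    # elementwise operators, so asanyarray is enough for such subclasses:
    d = np.diff(x)
    ret = (d * (y[1:] + y[:-1]) / 2.0).sum()
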
Josef > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From eemselle at eso.org Mon Mar 22 05:02:00 2010 From: eemselle at eso.org (Eric Emsellem) Date: Mon, 22 Mar 2010 10:02:00 +0100 Subject: [Numpy-discussion] Test if one element of string array is in a defined list Message-ID: <4BA73208.8090706@eso.org> Hi I would like to test whether strings in a numpy S array are in a given list but I don't manage to do so. Any hint is welcome. ======================================================= # So here is an example of what I would like to do # I have a String numpy array: import numpy as num Sarray = num.asarray(["test1","test2","tutu","toto"]) Farray = num.arange(len(Sarray)) mylist = ["tutu","hello","why"] # and I would like to do: result = num.where(Sarray in mylist, Farray, 0) ======================================================= ===> but of course I get a ValueError since the first test syntax is wrong. I would like to be able to do: Sarray in mylist and get the output: array([False, False, True, False],dtype=bool) since only the 3rd string "tutu" of Sarray is in mylist. Any input is welcome. cheers Eric From kshipras at packtpub.com Mon Mar 22 07:49:43 2010 From: kshipras at packtpub.com (Kshipra Singh) Date: Mon, 22 Mar 2010 17:19:43 +0530 Subject: [Numpy-discussion] Author NumPy books- Packt Publishing Message-ID: Hi All, I am writing to you for Packt Publishing, the publishers of computer related books. We are planning to extend our catalogue of books based on Scientific Computing Tools and are currently inviting authors interested in writing for Packt. This doesn't need any previous writing experience. Just an expert knowledge of your subject and a passion to share it with others is all that we require. So, if you love NumPy and are interested in authoring a book, please write to us with your book ideas at author at packtpub.com. Even if you don't have a book idea and are simply interested in authoring a book, we are still keen to hear from you. More details about the opportunity are available at: http://authors.packtpub.com/content/scientific-compting-tools-write-Packt Thanks Kshipra Singh Author Relationship Manager Packt Publishing www.PacktPub.com Skype: kshiprasingh15 Twitter: http://twitter.com/kshipras Interested in becoming an author? Visit http://authors.packtpub.com for all the information you need about writing for Packt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Mar 22 08:20:36 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 22 Mar 2010 08:20:36 -0400 Subject: [Numpy-discussion] PEP 384 Message-ID: I'm thinking PEP 384, if approved, would require a lot of redesign to numpy. From neilcrighton at gmail.com Mon Mar 22 08:15:43 2010 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 22 Mar 2010 12:15:43 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?Test_if_one_element_of_string_array_?= =?utf-8?q?is_in_a=09defined_list?= References: <4BA73208.8090706@eso.org> Message-ID: Eric Emsellem eso.org> writes: > Hi > > I would like to test whether strings in a numpy S array are in a given list but > I don't manage to do so. Any hint is welcome. 
> > ======================================================= > # So here is an example of what I would like to do > # I have a String numpy array: > > import numpy as num > Sarray = num.asarray(["test1","test2","tutu","toto"]) > Farray = num.arange(len(Sarray)) > mylist = ["tutu","hello","why"] > in1d() does what you want. >>> import numpy as np >>> Sarray = np.array(["test1","test2","tutu","toto"]) >>> mylist = ["tutu","hello","why"] >>> np.in1d(Sarray, mylist) array([False, False, True, False], dtype=bool) Be careful of whitespace when doing string comparisons; "tutu " != "tutu" (I've been burnt by this in the past). in1d() is only in more recent versions of numpy (1.4+). If you can't upgrade, you can cut and paste the in1d() and unique() routines from here: http://projects.scipy.org/numpy/browser/branches/datetime/numpy/lib/arraysetops. py to use in your own modules. Cheers, Neil From wkerzendorf at googlemail.com Mon Mar 22 09:15:00 2010 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Tue, 23 Mar 2010 00:15:00 +1100 Subject: [Numpy-discussion] using sunperf for compiling scipy and numpy Message-ID: <4BA76D54.5070008@gmail.com> Dear all, I would like to use the sunperf libraries when compiling scipy and numpy. I tried using setupscons.py which seems to check from SUNPERF libraries, but it didnt recognize where mine are: here is a listing of /pkg/linux/SS12/sunstudio12.1 (thats where the sunperf library lives): wkerzend at mosura:/home/wkerzend>ls /pkg/linux/SS12/sunstudio12.1/lib/ CCios/ libdbx_agent.so@ libsunperf.so.3@ amd64/ libfcollector.so@ libtha.so@ collector.jar@ libfsu.so@ libtha.so.1@ dbxrc@ libfsu.so.1@ locale/ debugging.so@ libfui.so@ make.rules@ er.rc@ libfui.so.1@ rw7/ libblacs_openmpi.so@ librtc.so@ sse2/ libblacs_openmpi.so.1@ libscalapack.so@ stlport4/ libcollectorAPI.so@ libscalapack.so.1@ svr4.make.rules@ libcollectorAPI.so.1@ libsunperf.so@ tools_svc_mgr@ ------------------------------- I tried to specify this directory in sites.cfg, but I still get the following errors: Checking if g77 needs dummy main - MAIN__. Checking g77 name mangling - '_', '', lower-case. Checking g77 C compatibility runtime ...-L/usr/lib/gcc/x86_64-redhat-linux/3.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../.. -L/lib/../lib64 -L/usr/lib/../lib64 -lfrtbegin -lg2c -lm Checking MKL ... Failed (could not check header(s) : check config.log in build/scons/scipy/integrate for more details) Checking ATLAS ... Failed (could not check header(s) : check config.log in build/scons/scipy/integrate for more details) Checking SUNPERF ... Failed (could not check symbol cblas_sgemm : check config.log in build/scons/scipy/integrate for more details)) Checking Generic BLAS ... yes Checking for BLAS (Generic BLAS) ... 
Failed: BLAS (Generic BLAS) test could not be linked and run Exception: Could not find F77 BLAS, needed for integrate package: File "/priv/manana1/wkerzend/install_dir/scipy-0.7.1/scipy/integrate/SConstruct", line 2: GetInitEnvironment(ARGUMENTS).DistutilsSConscript('SConscript') File "/home/wkerzend/python_coala/numscons-0.10.1-py2.6.egg/numscons/core/numpyenv.py", line 108: build_dir = '$build_dir', src_dir = '$src_dir') File "/priv/manana1/wkerzend/python_coala/numscons-0.10.1-py2.6.egg/numscons/scons-local/scons-local-1.2.0/SCons/Script/SConscript.py", line 549: return apply(_SConscript, [self.fs,] + files, subst_kw) File "/priv/manana1/wkerzend/python_coala/numscons-0.10.1-py2.6.egg/numscons/scons-local/scons-local-1.2.0/SCons/Script/SConscript.py", line 259: exec _file_ in call_stack[-1].globals File "/priv/manana1/wkerzend/install_dir/scipy-0.7.1/build/scons/scipy/integrate/SConscript", line 15: raise Exception("Could not find F77 BLAS, needed for integrate package") error: Error while executing scons command. See above for more information. If you think it is a problem in numscons, you can also try executing the scons command with --log-level option for more detailed output of what numscons is doing, for example --log-level=0; the lowest the level is, the more detailed the output it.----- -------------------------- any help is appreciated Wolfgang From rmay31 at gmail.com Mon Mar 22 10:14:41 2010 From: rmay31 at gmail.com (Ryan May) Date: Mon, 22 Mar 2010 09:14:41 -0500 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> Message-ID: On Sun, Mar 21, 2010 at 11:57 PM, wrote: > On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: >> Hi, >> >> I found that trapz() doesn't work with subclasses: >> >> http://projects.scipy.org/numpy/ticket/1438 >> >> A simple patch (attached) to change asarray() to asanyarray() fixes >> the problem fine. > > Are you sure this function works with matrices and other subclasses? > > Looking only very briefly at it: the multiplication might be a problem. Correct, it probably *is* a problem in some cases with matrices. In this case, I was using quantities (Darren Dale's unit-aware array package), and the result was that units were stripped off. The patch can't make trapz() work with all subclasses. However, right now, you have *no* hope of getting a subclass out of trapz(). With this change, subclasses that don't redefine operators can work fine. If you're passing a Matrix to trapz() and expecting it to work, IMHO you're doing it wrong. You can still pass one in by using asarray() yourself. Without this patch, I'm left with copying and maintaining a copy of the code elsewhere, just so I can loosen the function's input processing. That seems wrong, since there's really no need in my case to drop down to an ndarray. The input I'm giving it supports all the operations it needs, so it should just work with my original input. Or am I just off base here? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From ralf.gommers at googlemail.com Mon Mar 22 11:02:44 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 22 Mar 2010 23:02:44 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. 
In-Reply-To: <4BA64903.8050005@american.edu> References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <4BA64903.8050005@american.edu> Message-ID: On Mon, Mar 22, 2010 at 12:27 AM, Alan G Isaac wrote: > >> On 3/21/2010 12:54 AM, Ralf Gommers wrote: > >>> too many blank lines are needed > > > > On Sun, Mar 21, 2010 at 9:51 PM, Alan G Isaac > > wrote: > >> Please define "need" after seeing the compact example I posted. > > > On 3/21/2010 9:58 AM, Ralf Gommers wrote: > > You need 4 blank lines in your example. Now I tried adding a description > > > Here is the compact example I posted. > > q, r if mode = 'full': > - q : ndarray of float or complex, shape (M, K) > - r : ndarray of float or complex, shape (K, N) > r if mode = 'r': > - r : ndarray of float or complex, shape (K, N) > a2 if mode = 'economic': > - a2 : ndarray of float or complex, shape (M, N) > > K = min(M, N). > The diagonal and the upper triangle of `a2` contains `r`, > while the rest of `a2` is undefined. > > Your example works, the only blank lines it needs is before and after the whole block, plus above "K = min(M, N)." With one level of indentation this is an alternative to dashed lists. After adding definitions for the returned arguments I still think it doesn't look good, but that's maybe a matter of taste. Try this in the wiki (in the Notes, doesn't work in Parameters/Returns): q, r if mode = 'full': - q : ndarray of float or complex, shape (M, K) Definition of q. - r : ndarray of float or complex, shape (K, N) Definition of r. r if mode = 'r': - r : ndarray of float or complex, shape (K, N) Definition of r. a2 if mode = 'economic': - a2 : ndarray of float or complex, shape (M, N) Definition of a. K = min(M, N). The diagonal and the upper triangle of `a2` contains `r`, while the rest of `a2` is undefined. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Mar 22 11:12:05 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 22 Mar 2010 23:12:05 +0800 Subject: [Numpy-discussion] Help!!! Docstrings overrun by markup crap. In-Reply-To: References: <1cd32cbb1003202147h6e2b780ao44b6c7a8497c2b97@mail.gmail.com> <4BA62444.4080708@american.edu> <1cd32cbb1003210657g25129f0fgfb83d299605b81b6@mail.gmail.com> <1cd32cbb1003210741h34759c65p15d2e7570bdd6ca9@mail.gmail.com> Message-ID: On Mon, Mar 22, 2010 at 7:07 AM, Charles R Harris wrote: > > > On Sun, Mar 21, 2010 at 8:53 AM, Ralf Gommers > wrote: > >> >> >> On Sun, Mar 21, 2010 at 10:41 PM, wrote: >> >>> > 3. List with multi-line items are broken only inside the >>> Parameters/Returns >>> > sections. This is a bug and simply needs to be fixed. (this would fix >>> both >>> > of your examples) >>> >>> Does this mean if this bug gets fixed, then we wouldn't need the extra >>> empty lines between list items? >>> >> >> Yes. The following works in the Notes section, but not in Parameters: >> - item one >> - item two >> this is a multi-line item >> - item 3 >> >> >>> >>> Currently, the rendering in the doc editor view for item lists has >>> also wrong indentation >>> http://docs.scipy.org/numpy/docs/numpy.ndarray.transpose/ >>> but the html looks ok >>> >>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html >>> >> >> Html looks fine indeed. It should still look like that once that bug is >> fixed. >> >> > Could you open a ticket for this? We need to track it somewhere. 
> It's already a pydocweb ticket: http://code.google.com/p/pydocweb/issues/detail?id=46 The html in the compiled docs looks decent so I don't think it's a numpy issue. It's just hard to get markup right if the wiki doesn't work (usually the big red letters indicate actual human error). Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From bergstrj at iro.umontreal.ca Mon Mar 22 15:07:00 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Mon, 22 Mar 2010 15:07:00 -0400 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <75633AD8-176D-452C-8A54-C43FD2BC9539@cs.toronto.edu> Message-ID: <7f1eaee31003221207l7c2e84adk63e4904e5705072d@mail.gmail.com> On Sun, Mar 21, 2010 at 1:13 PM, Sebastian Walter wrote: > On Fri, Mar 19, 2010 at 11:18 PM, David Warde-Farley wrote: >> On 19-Mar-10, at 1:13 PM, Anne Archibald wrote: >> >>> I'm not knocking numpy; it does (almost) the best it can. (I'm not >>> sure of the optimality of the order in which ufuncs are executed; I >>> think some optimizations there are possible.) But a language designed >>> from scratch for vector calculations could certainly compile >>> expressions into a form that would save a lot of memory accesses, >>> particularly if an optimizer combined many lines of code. I've >>> actually thought about whether such a thing could be done in python; I >>> think the way to do it would be to build expression objects from >>> variable objects, then have a single "apply" function that fed values >>> in to all the variables. >> >> Hey Anne, >> >> Some folks across town from you at U de M have built just such at >> thing. :) >> >> http://deeplearning.net/software/theano/ >> >> It does all that, plus automatic differentiation, detection and >> correction of numerical instabilities, etc. >> >> Probably the most amazing thing about it is that with recent versions, >> you basically flip a switch and it will instead use an available CUDA- >> capable Nvidia GPU instead of the CPU. I'll admit, when James Bergstra >> initially told me about this plan to make it possible to transparently >> switch to running stuff on the GPU, I thought it was so ambitious that >> it would never happen. Then it did... > > The progress Theano is making is promising. I had several times a look > at theano and I like the idea of code generation, > especially the numpy support. I hope it may be useful for one of my > projects in the future. > > What I couldn't figure out from the documentation is the actual > performance and ease of use. > Am I right with the assumption that you are not a Theano dev? Have you > used Theano in a project? What are you experiences? > Do you happen to know how big the computational graphs can be? > Is there the possibility to have loops and if then else statements? > > Sorry for being a little offtopic here. I encourage you to sign up for theano-users at googlegroup.com if you want to keep an eye on things. If you fwd this to theano-users I'd be happy answer it there. 
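To make the "build an expression graph once, then apply it" idea above concrete, here is a
minimal sketch along the lines of the Theano API of this period -- the exact names are written
from memory and should be treated as an assumption, not a quote from its docs:

import theano
import theano.tensor as T

a = T.dvector('a')           # symbolic variables: no data attached yet
b = T.dvector('b')
expr = T.sum(a * b + 3.0)    # builds an expression graph instead of computing anything

f = theano.function([a, b], expr)   # compile the whole graph once...
f([1., 2.], [3., 4.])               # ...then feed values in -> 17.0

Because the whole expression is known before any evaluation happens, the compiler can fuse
loops and avoid the temporaries that a line-by-line numpy evaluation would create.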
James -- http://www-etud.iro.umontreal.ca/~bergstrj From pav at iki.fi Mon Mar 22 15:42:55 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 22 Mar 2010 21:42:55 +0200 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <1269105981.13114.2.camel@Nokia-N900-42-11> <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> Message-ID: <1269286975.3289.5.camel@talisman> la, 2010-03-20 kello 17:36 -0400, Anne Archibald kirjoitti: > I was in on that discussion. My recollection of the conclusion was > that on the one hand they're useful, carefully applied, while on the > other hand they're very difficult to reliably detect (since you don't > want to forbid operations on non-overlapping slices of the same > array). I think one alternative brought up was copy if unsure whether the slices overlap which would make A[whatever] = A[whatever2] be always identical in functionality to A[whatever] = A[whatever2].copy() which is how things should work. This would permit optimizing simple cases (at least 1D), and avoids running into NP-completeness (for numpy, the exponential growth is however limited by NPY_MAXDIMS which is 64, IIRC). This would be a change in semantics, but in a very obscure corner that hopefully nobody relies on. -- Pauli Virtanen From peridot.faceted at gmail.com Mon Mar 22 15:58:42 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 22 Mar 2010 14:58:42 -0500 Subject: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal In-Reply-To: <1269286975.3289.5.camel@talisman> References: <201003181457.25785.faltet@pytables.org> <201003181853.00829.faltet@pytables.org> <1269105981.13114.2.camel@Nokia-N900-42-11> <4c1b9792141f429fc4a0443ff6b491e6.squirrel@webmail.uio.no> <1269286975.3289.5.camel@talisman> Message-ID: On 22 March 2010 14:42, Pauli Virtanen wrote: > la, 2010-03-20 kello 17:36 -0400, Anne Archibald kirjoitti: >> I was in on that discussion. My recollection of the conclusion was >> that on the one hand they're useful, carefully applied, while on the >> other hand they're very difficult to reliably detect (since you don't >> want to forbid operations on non-overlapping slices of the same >> array). > > I think one alternative brought up was > > ? ? ? ?copy if unsure whether the slices overlap > > which would make > > ? ? ? ?A[whatever] = A[whatever2] > > be always identical in functionality to > > ? ? ? ?A[whatever] = A[whatever2].copy() > > which is how things should work. This would permit optimizing simple > cases (at least 1D), and avoids running into NP-completeness (for numpy, > the exponential growth is however limited by NPY_MAXDIMS which is 64, > IIRC). It can produce surprise copies, but I could certainly live with this. Or maybe a slight modification: "always produces values equivalent to using a copy", to allow handling the common A[:-1]=A[1:] without a copy. Of course, we'd have to wait for someone to implement it... > This would be a change in semantics, but in a very obscure corner that > hopefully nobody relies on. It would certainly be nice to replace unspecified behaviour by specified behaviour if it can be done with minimal cost. And I think it could be, with some careful programming. 
Anne From aisaac at american.edu Mon Mar 22 18:11:52 2010 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 22 Mar 2010 18:11:52 -0400 Subject: [Numpy-discussion] integer division rule changed In-Reply-To: References: <4B9BF660.4050600@american.edu> <1268514292.8600.3.camel@talisman> Message-ID: <4BA7EB28.8050603@american.edu> On 3/13/2010 8:57 PM, Charles R Harris wrote: > I suspect the change was made, whenever that was, in order to conform to > python. So is there an actual polcy? When C99 behavior and Python behavior differ, will NumPy follow Python as a *rule*? Thanks, Alan Isaac From charlesr.harris at gmail.com Mon Mar 22 19:54:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 22 Mar 2010 17:54:16 -0600 Subject: [Numpy-discussion] integer division rule changed In-Reply-To: <4BA7EB28.8050603@american.edu> References: <4B9BF660.4050600@american.edu> <1268514292.8600.3.camel@talisman> <4BA7EB28.8050603@american.edu> Message-ID: On Mon, Mar 22, 2010 at 4:11 PM, Alan G Isaac wrote: > On 3/13/2010 8:57 PM, Charles R Harris wrote: > > I suspect the change was made, whenever that was, in order to conform to > > python. > > So is there an actual polcy? When C99 behavior > and Python behavior differ, will NumPy follow > Python as a *rule*? > > I don't know if it is official policy, but in python3.x integer division becomes true division and the old behaviour has to be gotten by using //. This will no doubt cause some problems... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cristianofini at googlemail.com Tue Mar 23 09:42:13 2010 From: cristianofini at googlemail.com (Cristiano Fini) Date: Tue, 23 Mar 2010 13:42:13 +0000 Subject: [Numpy-discussion] Column-Specific Conditions and Column-Specific Substitution Values Message-ID: <882b37311003230642q556cc2d1tef99c7e26aeeafcc@mail.gmail.com> Hi Everyone, a beginner's question on how to perform some data substitution efficiently. I have a panel dataset, or in other words x individuals observed over a certain time span. For each column or individual, I need to substitute a certain value anytime a certain condition is satisfied. Both the condition and the value to be substituted into the panel dataset are individual specific. I can tackle the fact that the condition is individual specific but I cannot find a way to tackle the fact that the value to be substituted is individual specific without using a for ? lop. Frankly, considering the size of the dataset the use of a for loop is perfectly acceptable in terms of the time needed to complete task but still it would be nice to learn a way to do this (a task I implement often) in a more efficient way. 
Thanks in advance
Cristiano


import numpy as np
from copy import deepcopy

Data = np.array([[0, 4, 0],
                 [2, 5, 7],
                 [2, 5, 6]])
EditedData = deepcopy(Data)
Condition = np.array([0, 5, 6])          # individual-specific condition
SubstituteData = np.array([1, 10, 100])
# The logic here:
# if the value of any observation for the 1st individual is 0, substitute 1,
#                              the 2nd individual is 5, substitute 10
#                              the 3rd individual is 6, substitute 100

# This wouldn't be a problem if SubstituteData was not individual-specific data,
# e.g. EditedData[Data == Condition] = 555
# As SubstituteData is individual specific, I need to use a for loop
for i in range(np.shape(EditedData)[1]):
    TempData = EditedData[:, i]   # I introduce TempData to increase readability
    TempData[TempData == Condition[i]] = SubstituteData[i]
    EditedData[:, i] = TempData

print EditedData
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dagss at student.matnat.uio.no  Tue Mar 23 09:55:23 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 23 Mar 2010 14:55:23 +0100
Subject: [Numpy-discussion] Column-Specific Conditions and Column-Specific Substitution Values
In-Reply-To: <882b37311003230642q556cc2d1tef99c7e26aeeafcc@mail.gmail.com>
References: <882b37311003230642q556cc2d1tef99c7e26aeeafcc@mail.gmail.com>
Message-ID: <4BA8C84B.6060103@student.matnat.uio.no>

Cristiano Fini wrote:
>
> Hi Everyone,
> a beginner's question on how to perform some data substitution
> efficiently. I have a panel dataset, or in other words x individuals
> observed over a certain time span. For each column or individual, I
> need to substitute a certain value anytime a certain condition is
> satisfied. Both the condition and the value to be substituted into the
> panel dataset are individual specific. I can tackle the fact that the
> condition is individual specific but I cannot find a way to tackle the
> fact that the value to be substituted is individual specific without
> using a for loop. Frankly, considering the size of the dataset the
> use of a for loop is perfectly acceptable in terms of the time needed
> to complete the task, but still it would be nice to learn a way to do
> this (a task I implement often) in a more efficient way.
> Thanks in advance
> Cristiano
>
>
> import numpy as np
> from copy import deepcopy
> Data = np.array([[0, 4, 0],
>                  [2, 5, 7],
>                  [2, 5, 6]])
> EditedData = deepcopy(Data)

Data.copy() will do for non-object arrays like this.

> Condition = np.array([0, 5, 6])          # individual-specific condition
> SubstituteData = np.array([1, 10, 100])
> # The logic here:
> # if the value of any observation for the 1st individual is 0, substitute 1,
> #                              the 2nd individual is 5, substitute 10
> #                              the 3rd individual is 6, substitute 100
>
> # This wouldn't be a problem if SubstituteData was not individual-specific data,
> # e.g. EditedData[Data == Condition] = 555
> # As SubstituteData is individual specific, I need to use a for loop
> for i in range(np.shape(EditedData)[1]):
>     TempData = EditedData[:, i]   # I introduce TempData to increase readability
>     TempData[TempData == Condition[i]] = SubstituteData[i]
>     EditedData[:, i] = TempData

How about

should_replace = (Data == Condition[np.newaxis, :])

Then for instance

EditedData = Data * (~should_replace) + SubstituteData[np.newaxis, :] * should_replace

although a copy-and-modification in EditedData might be possible as well...
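For reference, the same per-column substitution can also be written with broadcasting and
numpy.where (a sketch using only the arrays defined above; the comparison has to be
Data == Condition so that values are replaced exactly where the condition holds):

import numpy as np

# Condition and SubstituteData broadcast along rows, i.e. column i of Data is
# compared against Condition[i] and, where equal, replaced by SubstituteData[i].
EditedData = np.where(Data == Condition, SubstituteData, Data)
# -> [[  1   4   0]
#     [  2  10   7]
#     [  2  10 100]]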
Dag Sverre From warren.weckesser at enthought.com Tue Mar 23 10:11:58 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 23 Mar 2010 09:11:58 -0500 Subject: [Numpy-discussion] Column-Specific Conditions and Column-Specific Substitution Values In-Reply-To: <882b37311003230642q556cc2d1tef99c7e26aeeafcc@mail.gmail.com> References: <882b37311003230642q556cc2d1tef99c7e26aeeafcc@mail.gmail.com> Message-ID: <4BA8CC2E.7010402@enthought.com> Cristiano Fini wrote: > > Hi Everyone, > a beginner's question on how to perform some data substitution > efficiently. I have a panel dataset, or in other words x individuals > observed over a certain time span. For each column or individual, I > need to substitute a certain value anytime a certain condition is > satisfied. Both the condition and the value to be substituted into the > panel dataset are individual specific. I can tackle the fact that the > condition is individual specific but I cannot find a way to tackle the > fact that the value to be substituted is individual specific without > using a for ? lop. Frankly, considering the size of the dataset the > use of a for loop is perfectly acceptable in terms of the time needed > to complete task but still it would be nice to learn a way to do this > (a task I implement often) in a more efficient way. > Thanks in advance > Cristiano > > > import numpy as np > from copy import deepcopy > Data = np.array([[0,4,0], > [2,5,7], > [2,5,6]]) > EditedData = deepcopy(Data) > Condition = np.array([0, 5, 6]) # individual-specific condition > SubstituteData = np.array([1, 10,100]) > # The logic here > # if the value of any obssrvation for the 1st individual is 0, > substitute 1, > # the 2nd individual is 5, > substitute 10 > # the 3rd individual is 6, > substitute 100 > > # This wouldn't a problem if SubstituteData was not individual > specific Data > # eg EditedData[Data==Condition] = 555 > # As SubstituteData is individual specifc, I need to use a for loop > for i in range(np.shape(EditedData)[1]): > TempData = EditedData[:, i] # I introduce TempData to increase > readability > TempData[TempData == Condition[i]] = SubstituteData[i] > EditedData[:, i] = TempData > > print EditedData > Instead of the loop, you could use: EditedData = np.choose(Data == Condition, (Data, SubstituteData)) Warren > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Mar 23 13:40:41 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 23 Mar 2010 10:40:41 -0700 Subject: [Numpy-discussion] draft release guide In-Reply-To: References: Message-ID: <4BA8FD19.7090103@noaa.gov> Ralf Gommers wrote: > At http://github.com/rgommers/NumPy-release-guide you can find a summary > of how to set up your system to build numpy binaries on OS X. I still > have to add info on scipy (that's turning out to be fairly painful) but > for numpy it is pretty complete. > > Any feedback is appreciated! We really don't want to supply binaries for python 2.5 anymore? (maybe this will get me to finally switch! -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From reckoner at gmail.com Tue Mar 23 17:04:17 2010 From: reckoner at gmail.com (Reckoner) Date: Tue, 23 Mar 2010 14:04:17 -0700 Subject: [Numpy-discussion] dtype='|S8' -- what does vertical bar mean? Message-ID: Hi, I've been looking through the documentation and occasionally there is a dtype='|S8' reference or something with a "|" in it. I don't know what the "|" this notation means. I can't find it in the documentation. This should be easy. Little help? thanks in advance. From patrickmarshwx at gmail.com Tue Mar 23 17:08:04 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Tue, 23 Mar 2010 16:08:04 -0500 Subject: [Numpy-discussion] draft release guide In-Reply-To: <4BA8FD19.7090103@noaa.gov> References: <4BA8FD19.7090103@noaa.gov> Message-ID: Maybe I missed the discussion, but is there a reason why we don't want to support Python 2.5 via providing binaries? I'm working on a detailed write up of how to create windows binaries on Windows 7. I hope to finish in the next day or so, however, my brother-in-law made an unexpected visit, so I don't have as much time as I thought I did. With that said, I hope to be able to share it by this weekend. Patrick On Tue, Mar 23, 2010 at 12:40 PM, Christopher Barker wrote: > Ralf Gommers wrote: > > At http://github.com/rgommers/NumPy-release-guide you can find a summary > > of how to set up your system to build numpy binaries on OS X. I still > > have to add info on scipy (that's turning out to be fairly painful) but > > for numpy it is pretty complete. > > > > Any feedback is appreciated! > > We really don't want to supply binaries for python 2.5 anymore? > > (maybe this will get me to finally switch! > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Mar 23 17:12:10 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 23 Mar 2010 16:12:10 -0500 Subject: [Numpy-discussion] dtype='|S8' -- what does vertical bar mean? In-Reply-To: References: Message-ID: <3d375d731003231412u67e164f0lc5c8386094794531@mail.gmail.com> On Tue, Mar 23, 2010 at 16:04, Reckoner wrote: > Hi, > > I've been looking through the documentation and occasionally there is > a dtype='|S8' reference or something with a "|" in it. I don't know > what the "|" this notation means. I can't find it in the > documentation. > > This should be easy. Little help? That character is used to specify the byte order. "|" means "native byte order". For 'S' dtypes, it doesn't make a difference; it has no byte order. 
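A quick way to see which byte-order character a given dtype carries (illustrative; the '>i4'
line assumes a little-endian machine, where big-endian is the non-native order):

import numpy as np

np.dtype('S8').str            # '|S8' -- strings: byte order not applicable
np.dtype('S8').byteorder      # '|'
np.dtype(np.int32).byteorder  # '='  -- native order for the built-in types
np.dtype('>i4').byteorder     # '>'  -- explicitly big-endian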
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dwf at cs.toronto.edu Tue Mar 23 17:28:51 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 23 Mar 2010 17:28:51 -0400 Subject: [Numpy-discussion] dtype='|S8' -- what does vertical bar mean? In-Reply-To: References: Message-ID: <9F34CCE3-2324-450D-9C0C-9044DD98ECEA@cs.toronto.edu> On 23-Mar-10, at 5:04 PM, Reckoner wrote: > I don't know > what the "|" this notation means. I can't find it in the > documentation. > > This should be easy. Little help? A > or < in this position means big or little endianness. Strings don't have endianness, hence "|". David From ralf.gommers at googlemail.com Tue Mar 23 20:49:50 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 24 Mar 2010 08:49:50 +0800 Subject: [Numpy-discussion] draft release guide In-Reply-To: <4BA8FD19.7090103@noaa.gov> References: <4BA8FD19.7090103@noaa.gov> Message-ID: On Wed, Mar 24, 2010 at 1:40 AM, Christopher Barker wrote: > Ralf Gommers wrote: > > At http://github.com/rgommers/NumPy-release-guide you can find a summary > > of how to set up your system to build numpy binaries on OS X. I still > > have to add info on scipy (that's turning out to be fairly painful) but > > for numpy it is pretty complete. > > > > Any feedback is appreciated! > > We really don't want to supply binaries for python 2.5 anymore? > > (maybe this will get me to finally switch! > That's fast, I was hoping to get away with that one! I discussed a bit with David, and he suggested to not make 2.5 binaries for scipy 0.7.2 and then see if there was still any demand. It's a lot of work with perhaps not so much return. Of course a maintenance release of scipy with no new features may be a different story from a new release of numpy/scipy. So by no means is this decided or even publicly discussed. I do want to point out that we have not provided 2.4 binaries for a while without much demand. So now the question: who still wants and uses 2.5 binaries? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue Mar 23 23:57:30 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue, 23 Mar 2010 20:57:30 -0700 Subject: [Numpy-discussion] draft release guide In-Reply-To: References: <4BA8FD19.7090103@noaa.gov> Message-ID: <4BA98DAA.2040309@noaa.gov> Ralf Gommers wrote: > So now the question: who still wants and uses 2.5 binaries? I do -- though probably not for long to be honest. -Chris From faltet at pytables.org Wed Mar 24 05:50:42 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 24 Mar 2010 10:50:42 +0100 Subject: [Numpy-discussion] draft release guide In-Reply-To: References: <4BA8FD19.7090103@noaa.gov> Message-ID: <201003241050.42394.faltet@pytables.org> A Wednesday 24 March 2010 01:49:50 Ralf Gommers escrigu?: > On Wed, Mar 24, 2010 at 1:40 AM, Christopher Barker > > wrote: > > Ralf Gommers wrote: > > > At http://github.com/rgommers/NumPy-release-guide you can find a > > > summary of how to set up your system to build numpy binaries on OS X. I > > > still have to add info on scipy (that's turning out to be fairly > > > painful) but for numpy it is pretty complete. > > > > > > Any feedback is appreciated! > > > > We really don't want to supply binaries for python 2.5 anymore? > > > > (maybe this will get me to finally switch! 
> > That's fast, I was hoping to get away with that one! > > I discussed a bit with David, and he suggested to not make 2.5 binaries for > scipy 0.7.2 and then see if there was still any demand. It's a lot of work > with perhaps not so much return. Of course a maintenance release of scipy > with no new features may be a different story from a new release of > numpy/scipy. So by no means is this decided or even publicly discussed. I > do want to point out that we have not provided 2.4 binaries for a while > without much demand. > > So now the question: who still wants and uses 2.5 binaries? I do. I think at least 2 binary versions is a recommendable practice for allowing people to change versions more comfortably. Also, I have read the draft and I cannot see references to 64-bit binary packages. With the advent of Windows 7 and Mac OSX Snow Leopard, 64-bit are way more spread than before, so they would be a great thing to deliver, IMO. Thanks, -- Francesc Alted From ralf.gommers at googlemail.com Wed Mar 24 06:45:43 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 24 Mar 2010 18:45:43 +0800 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003241050.42394.faltet@pytables.org> References: <4BA8FD19.7090103@noaa.gov> <201003241050.42394.faltet@pytables.org> Message-ID: On Wed, Mar 24, 2010 at 5:50 PM, Francesc Alted wrote: > A Wednesday 24 March 2010 01:49:50 Ralf Gommers escrigu?: > > On Wed, Mar 24, 2010 at 1:40 AM, Christopher Barker > > > > wrote: > > > Ralf Gommers wrote: > > > > At http://github.com/rgommers/NumPy-release-guide you can find a > > > > summary of how to set up your system to build numpy binaries on OS X. > I > > > > still have to add info on scipy (that's turning out to be fairly > > > > painful) but for numpy it is pretty complete. > > > > > > > > Any feedback is appreciated! > > > > > > We really don't want to supply binaries for python 2.5 anymore? > > > > > > (maybe this will get me to finally switch! > > > > That's fast, I was hoping to get away with that one! > > > > I discussed a bit with David, and he suggested to not make 2.5 binaries > for > > scipy 0.7.2 and then see if there was still any demand. It's a lot of > work > > with perhaps not so much return. Of course a maintenance release of scipy > > with no new features may be a different story from a new release of > > numpy/scipy. So by no means is this decided or even publicly discussed. I > > do want to point out that we have not provided 2.4 binaries for a while > > without much demand. > > > > So now the question: who still wants and uses 2.5 binaries? > > I do. I think at least 2 binary versions is a recommendable practice for > allowing people to change versions more comfortably. > Okay thanks, that's two people already so I'll try. > > Also, I have read the draft and I cannot see references to 64-bit binary > packages. With the advent of Windows 7 and Mac OSX Snow Leopard, 64-bit > are > way more spread than before, so they would be a great thing to deliver, > IMO. > The reason is that for OS X the binaries from python.org are 32-bit only so far. Apple Python is 64-bit, but seems to have its own issues (way behind with bug fix releases, maybe there's more). David told me to always only target the python.org version, which makes sense to me. For Windows 7, I agree it's worth looking into 64-bits binaries, but can't tell you right now how easy that would be. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Wed Mar 24 07:00:36 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 24 Mar 2010 20:00:36 +0900 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003241050.42394.faltet@pytables.org> References: <4BA8FD19.7090103@noaa.gov> <201003241050.42394.faltet@pytables.org> Message-ID: <5b8d13221003240400s19d762f7od4b0ecee7829e4b0@mail.gmail.com> On Wed, Mar 24, 2010 at 6:50 PM, Francesc Alted wrote: > Also, I have read the draft and I cannot see references to 64-bit binary > packages. ?With the advent of Windows 7 and Mac OSX Snow Leopard, 64-bit are > way more spread than before, so they would be a great thing to deliver, IMO. For Mac OS X, we should wait for official 64 bits binaries from python.org - EPD offers 64 bits binaries if necessary. On windows, the situation is more complicated, because there is no way to build numpy and scipy correctly with free compilers. For various reasons, I think we should avoid distributing binaries based on non-free compilers (not available to everyone, problem of compatibility and redistribution of non-free code, especially for fortran) - and for people who don't mind/care, there are unofficial win64 binaries out there (we may put a link somewhere). Unfortunately, I don't think I will have much time to spend on making scipy and gfortran work together on win64 in the near future, cheers, David From bsouthey at gmail.com Wed Mar 24 10:17:26 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 24 Mar 2010 09:17:26 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 Message-ID: <4BAA1EF6.8000002@gmail.com> Hi, Wow, this is really impressive! I installed the svn numpy version '2.0.0.dev8300' with the latest Python 3.1.2 and it works! All the tests pass except: test_utils.test_lookfor I am guessing that it is this line as the other io imports do not have the period. from .io import StringIO ====================================================================== ERROR: test_utils.test_lookfor ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.1/site-packages/nose/case.py", line 177, in runTest self.test(*self.arg) File "/usr/local/lib/python3.1/site-packages/numpy/lib/tests/test_utils.py", line 10, in test_lookfor import_modules=False) File "/usr/local/lib/python3.1/site-packages/numpy/lib/utils.py", line 751, in lookfor cache = _lookfor_generate_cache(module, import_modules, regenerate) File "/usr/local/lib/python3.1/site-packages/numpy/lib/utils.py", line 852, in _lookfor_generate_cache from .io import StringIO ImportError: cannot import name StringIO ---------------------------------------------------------------------- Ran 2898 tests in 24.646s FAILED (KNOWNFAIL=5, errors=1) Bruce From faltet at pytables.org Wed Mar 24 10:25:16 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 24 Mar 2010 15:25:16 +0100 Subject: [Numpy-discussion] draft release guide In-Reply-To: <5b8d13221003240400s19d762f7od4b0ecee7829e4b0@mail.gmail.com> References: <201003241050.42394.faltet@pytables.org> <5b8d13221003240400s19d762f7od4b0ecee7829e4b0@mail.gmail.com> Message-ID: <201003241525.16596.faltet@pytables.org> A Wednesday 24 March 2010 12:00:36 David Cournapeau escrigu?: > On Wed, Mar 24, 2010 at 6:50 PM, Francesc Alted wrote: > > Also, I have read the draft and I cannot see references to 64-bit binary > > packages. 
With the advent of Windows 7 and Mac OSX Snow Leopard, 64-bit > > are way more spread than before, so they would be a great thing to > > deliver, IMO. > > For Mac OS X, we should wait for official 64 bits binaries from > python.org - EPD offers 64 bits binaries if necessary. On windows, the > situation is more complicated, because there is no way to build numpy > and scipy correctly with free compilers. For various reasons, I think > we should avoid distributing binaries based on non-free compilers (not > available to everyone, problem of compatibility and redistribution of > non-free code, especially for fortran) - and for people who don't > mind/care, there are unofficial win64 binaries out there (we may put a > link somewhere). Unfortunately, I don't think I will have much time to > spend on making scipy and gfortran work together on win64 in the near > future, Ok. I've been having a try at mingw-w64 project: http://mingw-w64.sourceforge.net/ with no success so far with build numpy: $ python setup.py build --compiler=mingw32 [...] Found executable C:\mingw-w64_x86_64-mingw\mingw64\bin\gcc.exe g++ -mno-cygwin _configtest.o -lmsvcr90 -o _configtest.exe Found executable C:\mingw-w64_x86_64-mingw\mingw64\bin\g++.exe c:/mingw-w64_x86_64-mingw/mingw64/bin/../lib/gcc/x86_64-w64- mingw32/4.4.4/../../ ../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lmsvcr90 collect2: ld returned 1 exit status failure. [...] However, I can compile it with plain mingw tool chain (obviously, only can get w32 binaries). Any hint on how to proceed with mingw-w64? Thanks, -- Francesc Alted From cournape at gmail.com Wed Mar 24 10:38:58 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 24 Mar 2010 23:38:58 +0900 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003241525.16596.faltet@pytables.org> References: <201003241050.42394.faltet@pytables.org> <5b8d13221003240400s19d762f7od4b0ecee7829e4b0@mail.gmail.com> <201003241525.16596.faltet@pytables.org> Message-ID: <5b8d13221003240738k22a63ac6la51c3eedda659e98@mail.gmail.com> On Wed, Mar 24, 2010 at 11:25 PM, Francesc Alted wrote: > > Ok. ?I've been having a try at mingw-w64 project: > > http://mingw-w64.sourceforge.net/ > > with no success so far with build numpy: > > $ python setup.py build --compiler=mingw32 Oh, it is not that easy :) First, for some reason, the mingw-w64 project does not provide 64 hosted compilers, and since pushing for mingw cross compilation support in distutils would redefine the meaning of insanity, I build my gcc. Since building gcc on windows is not a fun ride either, you have to build it on unix (I have the scripts on my github account: github.com/cournape/). Then, you can hope to build numpy with mingw 64 on windows 64. But then, it randomly crashes, and last time I looked at it (~ 8 months ago), there was no gdb support and the mingw debug symbols cannot be understood by MS tools, so I was stuck at trying to understand what went wrong without a meaningful backtrace. I also looked into making gfortran works with MS compiler, but this means porting libgfortran to something MS compilers can understand. This is not as crazy as it sounds, because we don't use many fortran runtime functions in scipy. Of course, atlas on windows 64 is nowhere near buildable. 
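For the specific "cannot find -lmsvcr90" failure quoted above, the workaround usually suggested
for mingw-w64 toolchains is to generate an import library for the MS runtime by hand. The tool
names and flags below are the standard ones, but treat them as an assumption about this
particular toolchain build rather than something verified here:

# run where the 64-bit msvcr90.dll is available (e.g. from the MSVC redistributable)
gendef msvcr90.dll                      # writes msvcr90.def
dlltool -d msvcr90.def -l libmsvcr90.a  # build an import library ld can link against
# then put libmsvcr90.a somewhere on the linker search path, e.g. mingw64/lib

That only addresses the link step, of course, not the deeper runtime and gfortran issues above.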
cheers, David From nadavh at visionsense.com Wed Mar 24 10:35:05 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 24 Mar 2010 16:35:05 +0200 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 References: <4BAA1EF6.8000002@gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> Any idea why from .io import StringIO and not from io import StringIO ??? (Why is the extra "." before "io") Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Bruce Southey Sent: Wed 24-Mar-10 16:17 To: Discussion of Numerical Python Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 Hi, Wow, this is really impressive! I installed the svn numpy version '2.0.0.dev8300' with the latest Python 3.1.2 and it works! All the tests pass except: test_utils.test_lookfor I am guessing that it is this line as the other io imports do not have the period. from .io import StringIO ====================================================================== ERROR: test_utils.test_lookfor ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.1/site-packages/nose/case.py", line 177, in runTest self.test(*self.arg) File "/usr/local/lib/python3.1/site-packages/numpy/lib/tests/test_utils.py", line 10, in test_lookfor import_modules=False) File "/usr/local/lib/python3.1/site-packages/numpy/lib/utils.py", line 751, in lookfor cache = _lookfor_generate_cache(module, import_modules, regenerate) File "/usr/local/lib/python3.1/site-packages/numpy/lib/utils.py", line 852, in _lookfor_generate_cache from .io import StringIO ImportError: cannot import name StringIO ---------------------------------------------------------------------- Ran 2898 tests in 24.646s FAILED (KNOWNFAIL=5, errors=1) Bruce _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3374 bytes Desc: not available URL: From cournape at gmail.com Wed Mar 24 10:43:08 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 24 Mar 2010 23:43:08 +0900 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> Message-ID: <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> On Wed, Mar 24, 2010 at 11:35 PM, Nadav Horesh wrote: > Any idea why > > ?from .io import StringIO > > and not > > ?from io import StringIO > > ??? > > (Why is the extra "." before "io") Maybe a bug in py2to3, because StringIO is in io in python 3, and we have a io module in numpy (.io is the new syntax for relative import). 
David From robert.kern at gmail.com Wed Mar 24 10:46:40 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Mar 2010 09:46:40 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> Message-ID: <3d375d731003240746o3be0485dh16ae5483a503d4b4@mail.gmail.com> On Wed, Mar 24, 2010 at 09:43, David Cournapeau wrote: > On Wed, Mar 24, 2010 at 11:35 PM, Nadav Horesh wrote: >> Any idea why >> >> ?from .io import StringIO >> >> and not >> >> ?from io import StringIO >> >> ??? >> >> (Why is the extra "." before "io") > > Maybe a bug in py2to3, because StringIO is in io in python 3, and we > have a io module in numpy (.io is the new syntax for relative import). Ignore my previous email. This is the correct answer. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Mar 24 10:43:59 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Mar 2010 09:43:59 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> Message-ID: <3d375d731003240743k2734b07dw70736a87c9708925@mail.gmail.com> On Wed, Mar 24, 2010 at 09:35, Nadav Horesh wrote: > Any idea why > > ?from .io import StringIO > > and not > > ?from io import StringIO > > ??? > > (Why is the extra "." before "io") It is a relative import, i.e. numpy.lib.io . -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Mar 24 11:07:27 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Mar 2010 10:07:27 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> Message-ID: <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> On Wed, Mar 24, 2010 at 09:43, David Cournapeau wrote: > On Wed, Mar 24, 2010 at 11:35 PM, Nadav Horesh wrote: >> Any idea why >> >> ?from .io import StringIO >> >> and not >> >> ?from io import StringIO >> >> ??? >> >> (Why is the extra "." before "io") > > Maybe a bug in py2to3, because StringIO is in io in python 3, and we > have a io module in numpy (.io is the new syntax for relative import). Bug reported: http://bugs.python.org/issue8221 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Wed Mar 24 11:20:48 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Mar 2010 09:20:48 -0600 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> Message-ID: On Wed, Mar 24, 2010 at 9:07 AM, Robert Kern wrote: > On Wed, Mar 24, 2010 at 09:43, David Cournapeau > wrote: > > On Wed, Mar 24, 2010 at 11:35 PM, Nadav Horesh > wrote: > >> Any idea why > >> > >> from .io import StringIO > >> > >> and not > >> > >> from io import StringIO > >> > >> ??? > >> > >> (Why is the extra "." before "io") > > > > Maybe a bug in py2to3, because StringIO is in io in python 3, and we > > have a io module in numpy (.io is the new syntax for relative import). > > Bug reported: > > http://bugs.python.org/issue8221 > > What would be the best fix? Should we rename io to something like npyio? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Mar 24 11:28:13 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Mar 2010 10:28:13 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> Message-ID: <3d375d731003240828k27347daawa313b134ad2e7232@mail.gmail.com> On Wed, Mar 24, 2010 at 10:20, Charles R Harris wrote: > > On Wed, Mar 24, 2010 at 9:07 AM, Robert Kern wrote: >> >> On Wed, Mar 24, 2010 at 09:43, David Cournapeau >> wrote: >> > On Wed, Mar 24, 2010 at 11:35 PM, Nadav Horesh >> > wrote: >> >> Any idea why >> >> >> >> ?from .io import StringIO >> >> >> >> and not >> >> >> >> ?from io import StringIO >> >> >> >> ??? >> >> >> >> (Why is the extra "." before "io") >> > >> > Maybe a bug in py2to3, because StringIO is in io in python 3, and we >> > have a io module in numpy (.io is the new syntax for relative import). >> >> Bug reported: >> >> http://bugs.python.org/issue8221 >> > > What would be the best fix? Should we rename io to something like npyio? utils.py is the only file in there that imports StringIO. It should probably do a local import "from io import BytesIO" because io.py already contains some Python3-awareness: if sys.version_info[0] >= 3: import io BytesIO = io.BytesIO else: from cStringIO import StringIO as BytesIO -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Wed Mar 24 11:29:26 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Mar 2010 17:29:26 +0200 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> Message-ID: <1269444566.2833.3.camel@talisman> ke, 2010-03-24 kello 09:20 -0600, Charles R Harris kirjoitti: > What would be the best fix? 
Should we rename io to something like > npyio? That, or: Disable import conversions in tools/py3tool.py for that particular file, and fix any import errors manually so that the same code works both for Python 2 and Python 3. Actually, I suspect there are no import errors, since the top-level imports in that file are already absolute. Anyway, it's a bug in 2to3. I suppose it first converts from StringIO import StringIO to from io import StringIO and then runs the import fixer, which does from .io import StringIO and that's a bug. Pauli From pav at iki.fi Wed Mar 24 11:30:55 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Mar 2010 17:30:55 +0200 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <3d375d731003240828k27347daawa313b134ad2e7232@mail.gmail.com> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> <3d375d731003240828k27347daawa313b134ad2e7232@mail.gmail.com> Message-ID: <1269444655.2833.4.camel@talisman> ke, 2010-03-24 kello 10:28 -0500, Robert Kern kirjoitti: > utils.py is the only file in there that imports StringIO. It should > probably do a local import "from io import BytesIO" because io.py > already contains some Python3-awareness: > > if sys.version_info[0] >= 3: > import io > BytesIO = io.BytesIO > else: > from cStringIO import StringIO as BytesIO The lookfor stuff in utils.py deals with docstrings, so it probably has to use StringIO instead of BytesIO for unicode-cleanliness. Pauli From charlesr.harris at gmail.com Wed Mar 24 11:59:46 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Mar 2010 09:59:46 -0600 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <1269444566.2833.3.camel@talisman> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> <1269444566.2833.3.camel@talisman> Message-ID: On Wed, Mar 24, 2010 at 9:29 AM, Pauli Virtanen wrote: > ke, 2010-03-24 kello 09:20 -0600, Charles R Harris kirjoitti: > > What would be the best fix? Should we rename io to something like > > npyio? > > That, or: > > Disable import conversions in tools/py3tool.py for that particular file, > and fix any import errors manually so that the same code works both for > Python 2 and Python 3. Actually, I suspect there are no import errors, > since the top-level imports in that file are already absolute. > > I have some preference for the name change just to avoid shadowing a standard module, it makes numpy just a bit safer. > Anyway, it's a bug in 2to3. I suppose it first converts > > from StringIO import StringIO > > to > > from io import StringIO > > and then runs the import fixer, which does > > from .io import StringIO > > and that's a bug. > > Sounds right. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsouthey at gmail.com Wed Mar 24 11:59:20 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 24 Mar 2010 10:59:20 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <1269444655.2833.4.camel@talisman> References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> <3d375d731003240828k27347daawa313b134ad2e7232@mail.gmail.com> <1269444655.2833.4.camel@talisman> Message-ID: <4BAA36D8.6000505@gmail.com> On 03/24/2010 10:30 AM, Pauli Virtanen wrote: > ke, 2010-03-24 kello 10:28 -0500, Robert Kern kirjoitti: > >> utils.py is the only file in there that imports StringIO. It should >> probably do a local import "from io import BytesIO" because io.py >> already contains some Python3-awareness: >> >> if sys.version_info[0]>= 3: >> import io >> BytesIO = io.BytesIO >> else: >> from cStringIO import StringIO as BytesIO >> > The lookfor stuff in utils.py deals with docstrings, so it probably has > to use StringIO instead of BytesIO for unicode-cleanliness. > > Pauli > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > This appears to be a regression because I do not get the error with Python 3.1.1. I rebuilt Python3.1.1 and Python3.1.2 to compare the builds (I don't know how to keep the two Python 3.1 minor releases separate). From the diff it is clear that Python3.1.2 (build directory) is adding the .io but the Python3.1.1 (build311 directory). $ diff build/py3k/numpy/lib/utils.py build311/py3k/numpy/lib/utils.py 8d7 < import collections 852c851 < from .io import StringIO --- > from io import StringIO 945c944 < elif isinstance(item, collections.Callable): --- > elif hasattr(item, '__call__'): Bruce From charlesr.harris at gmail.com Wed Mar 24 12:26:44 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Mar 2010 10:26:44 -0600 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: References: <4BAA1EF6.8000002@gmail.com> <710F2847B0018641891D9A21602763605AD368@ex3.envision.co.il> <5b8d13221003240743h3ed89f58kaa00b87255bde7fc@mail.gmail.com> <3d375d731003240807h7cb6367cj4a57f6d85af741e8@mail.gmail.com> <1269444566.2833.3.camel@talisman> Message-ID: On Wed, Mar 24, 2010 at 9:59 AM, Charles R Harris wrote: > > > On Wed, Mar 24, 2010 at 9:29 AM, Pauli Virtanen wrote: > >> ke, 2010-03-24 kello 09:20 -0600, Charles R Harris kirjoitti: >> > What would be the best fix? Should we rename io to something like >> > npyio? >> >> That, or: >> >> Disable import conversions in tools/py3tool.py for that particular file, >> and fix any import errors manually so that the same code works both for >> Python 2 and Python 3. Actually, I suspect there are no import errors, >> since the top-level imports in that file are already absolute. >> >> > I have some preference for the name change just to avoid shadowing a > standard module, it makes numpy just a bit safer. > In particular, here are the current occurences of io numpy/lib/__init__.py:from io import * numpy/lib/tests/test__iotools.py: from io import BytesIO numpy/lib/tests/test_format.py: ... 
from io import BytesIO as StringIO numpy/lib/tests/test_format.py: from io import BytesIO as StringIO numpy/lib/tests/test_io.py: from io import BytesIO And it looks to me like that could easily get confusing since io is also a Python module since 2.6. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Mar 24 13:41:33 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed, 24 Mar 2010 10:41:33 -0700 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003241050.42394.faltet@pytables.org> References: <4BA8FD19.7090103@noaa.gov> <201003241050.42394.faltet@pytables.org> Message-ID: <4BAA4ECD.5030109@noaa.gov> Francesc Alted wrote: > Also, I have read the draft and I cannot see references to 64-bit binary > packages. With the advent of Windows 7 and Mac OSX Snow Leopard, 64-bit are > way more spread than before, so they would be a great thing to deliver, IMO. True, however the situation is a bit ugly with OS-X -- python.org does not distribute 64 bit binaries at all, and Apple's Python is a bit weird in that regard. Also some key libraries (anything build on Carbon , such as wxPython) don't work on 64 bit. That being said, a 32+64 bit build for the Apple supplied Python for OS-X 10.6 might be nice, though a bit of a niche market. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed Mar 24 14:25:09 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Mar 2010 12:25:09 -0600 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: <4BAA1EF6.8000002@gmail.com> References: <4BAA1EF6.8000002@gmail.com> Message-ID: On Wed, Mar 24, 2010 at 8:17 AM, Bruce Southey wrote: > Hi, > Wow, this is really impressive! > I installed the svn numpy version '2.0.0.dev8300' with the latest Python > 3.1.2 and it works! > > All the tests pass except: > test_utils.test_lookfor > > I am guessing that it is this line as the other io imports do not have > the period. > from .io import StringIO > > I went ahead and renamed lib/io.py to lib/npyio.py in r8302. Can you give it a try? you might need to rm the build/py3k directory first to make it work. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed Mar 24 14:35:51 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 24 Mar 2010 13:35:51 -0500 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: References: <4BAA1EF6.8000002@gmail.com> Message-ID: <4BAA5B87.2050906@gmail.com> On 03/24/2010 01:25 PM, Charles R Harris wrote: > > > On Wed, Mar 24, 2010 at 8:17 AM, Bruce Southey > wrote: > > Hi, > Wow, this is really impressive! > I installed the svn numpy version '2.0.0.dev8300' with the latest > Python > 3.1.2 and it works! > > All the tests pass except: > test_utils.test_lookfor > > I am guessing that it is this line as the other io imports do not have > the period. > from .io import StringIO > > > I went ahead and renamed lib/io.py to lib/npyio.py in r8302. Can you > give it a try? you might need to rm the build/py3k directory first to > make it work. 
> > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I get a different error because it also has 'import collections'. $ python3.1 Python 3.1.2 (r312:79147, Mar 24 2010, 10:44:23) [GCC 4.4.3 20100127 (Red Hat 4.4.3-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.__version__ '2.0.0.dev8302' >>> numpy.test() Running unit tests for numpy NumPy version 2.0.0.dev8302 NumPy is installed in /usr/local/lib/python3.1/site-packages/numpy Python version 3.1.2 (r312:79147, Mar 24 2010, 10:44:23) [GCC 4.4.3 20100127 (Red Hat 4.4.3-4)] nose version 0.11.0 ..... ====================================================================== ERROR: test_utils.test_lookfor ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.1/site-packages/nose/case.py", line 177, in runTest self.test(*self.arg) File "/usr/local/lib/python3.1/site-packages/numpy/lib/tests/test_utils.py", line 10, in test_lookfor import_modules=False) File "/usr/local/lib/python3.1/site-packages/numpy/lib/utils.py", line 751, in lookfor cache = _lookfor_generate_cache(module, import_modules, regenerate) File "/usr/local/lib/python3.1/site-packages/numpy/lib/utils.py", line 945, in _lookfor_generate_cache elif isinstance(item, collections.Callable): File "/usr/local/lib/python3.1/abc.py", line 121, in __instancecheck__ subclass = instance.__class__ AttributeError: 'PyCapsule' object has no attribute '__class__' ---------------------------------------------------------------------- Ran 2898 tests in 21.738s FAILED (KNOWNFAIL=5, errors=1) $ diff build/py3k/build/lib.linux-x86_64-3.1/numpy/lib/utils.py build311/py3k/build/lib.linux-x86_64-3.1/numpy/lib/utils.py 8d7 < import collections 945c944 < elif isinstance(item, collections.Callable): --- > elif hasattr(item, '__call__'): -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Wed Mar 24 14:41:27 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 24 Mar 2010 18:41:27 +0000 (UTC) Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 References: <4BAA1EF6.8000002@gmail.com> <4BAA5B87.2050906@gmail.com> Message-ID: Wed, 24 Mar 2010 13:35:51 -0500, Bruce Southey wrote: [clip] > elif isinstance(item, collections.Callable): > File "/usr/local/lib/python3.1/abc.py", line 121, in > __instancecheck__ > subclass = instance.__class__ > AttributeError: 'PyCapsule' object has no attribute '__class__' Seems like another Python bug. All objects probably should have the __class__ attribute... -- Pauli Virtanen From zachary.pincus at yale.edu Wed Mar 24 15:13:04 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 24 Mar 2010 15:13:04 -0400 Subject: [Numpy-discussion] numpy.array(arr.flat) mutates arr if arr.flags.fortran: bug? Message-ID: Hello, I assume it is a bug that calling numpy.array() on a flatiter of a fortran-strided array that owns its own data causes that array to be rearranged somehow? Not sure what happens with a fancier-strided array that also owns its own data (because I'm not sure how to create one of those in python). This is from the latest svn version (2.0.0.dev8302) but was also present in a previous version too. 
Zach In [9]: a = numpy.array([[1,2],[3,4]]).copy('F') In [10]: a Out[10]: array([[1, 2], [3, 4]]) In [11]: list(a.flat) Out[11]: [1, 2, 3, 4] In [12]: a # no problem Out[12]: array([[1, 2], [3, 4]]) In [13]: numpy.array(a.flat) Out[13]: array([1, 2, 3, 4]) In [14]: a # this ain't right! Out[14]: array([[1, 3], [2, 4]]) In [15]: a = numpy.array([[1,2],[3,4]]).copy('C') In [16]: numpy.array(a.flat) Out[16]: array([1, 2, 3, 4]) In [17]: a Out[17]: array([[1, 2], [3, 4]]) From faltet at pytables.org Wed Mar 24 15:31:04 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 24 Mar 2010 20:31:04 +0100 Subject: [Numpy-discussion] draft release guide In-Reply-To: <5b8d13221003240738k22a63ac6la51c3eedda659e98@mail.gmail.com> References: <201003241525.16596.faltet@pytables.org> <5b8d13221003240738k22a63ac6la51c3eedda659e98@mail.gmail.com> Message-ID: <201003242031.04336.faltet@pytables.org> A Wednesday 24 March 2010 15:38:58 David Cournapeau escrigu?: > Oh, it is not that easy :) > > First, for some reason, the mingw-w64 project does not provide 64 > hosted compilers, and since pushing for mingw cross compilation > support in distutils would redefine the meaning of insanity, I build > my gcc. Since building gcc on windows is not a fun ride either, you > have to build it on unix (I have the scripts on my github account: > github.com/cournape/). Mmh, not sure about what you mean by hosted compiler, but there is certainly a native compiler package for Win64: mingw-w64-bin_x86_64-mingw_20100322_sezero.zip It comes with gcc, g++ and gfortran 4.4.4. gdb support is also there. So it seems like a pretty complete toolset for windows amd64. With it, and with some fixes in numpy sources (very few), I achieved to pass the build phase. I can provide the patch in case someone is interested. The generated extensions are: 24/03/2010 19:34 1.492.313 multiarray.pyd 24/03/2010 19:34 124.866 multiarray_tests.pyd 24/03/2010 19:34 453.377 scalarmath.pyd 24/03/2010 19:34 1.079.827 umath.pyd 24/03/2010 19:34 121.651 umath_tests.pyd 24/03/2010 19:34 304.014 _sort.pyd which looks good to my eyes. Now, when I try to generate the installable package I'm in trouble again: $ python setup.py bdist [...] File "C: \Users\francesc\Desktop\NumPy\1.4.x\numpy\distutils\command\config.py" , line 56, in _check_compiler self.compiler.initialize() File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 359, in initialize vc_env = query_vcvarsall(VERSION, plat_spec) File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 275, in query_vcvar sall raise ValueError(str(list(result.keys()))) ValueError: [u'path'] [...] So, it looks like either numpy or python cannot determine that the used compiler is mingw instead of msvc9. However, when I try to specify mingw explicitly, I get the next error: $ python setup.py bdist --compiler=mingw32 Running from numpy source directory. Forcing DISTUTILS_USE_SDK=1 usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...] or: setup.py --help [cmd1 cmd2 ...] or: setup.py --help-commands or: setup.py cmd --help error: option --compiler not recognized Someone could tell me why distutils can be told to use mingw32 compiler for the build stage but not for bdist? What is more, why the need for a compiler for bdist if numpy is already built? I feel that I'm almost there, but some piece still resists... 
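A minimal sketch of one common workaround, assuming a stock CPython distutils layout (the path is written with a placeholder): make mingw32 the default compiler in distutils.cfg so that every distutils command, including the config step that bdist triggers, picks it up without a --compiler flag:

# %PYTHONROOT%\Lib\distutils\distutils.cfg  (create the file if it does not exist)
[build]
compiler = mingw32

A per-project setup.cfg with the same [build] section works too, if you would rather not touch the Python installation.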
-- Francesc Alted From josef.pktd at gmail.com Wed Mar 24 15:44:34 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Mar 2010 15:44:34 -0400 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003242031.04336.faltet@pytables.org> References: <201003241525.16596.faltet@pytables.org> <5b8d13221003240738k22a63ac6la51c3eedda659e98@mail.gmail.com> <201003242031.04336.faltet@pytables.org> Message-ID: <1cd32cbb1003241244p6e9e57d9j1d2b9cca1b763b0@mail.gmail.com> On Wed, Mar 24, 2010 at 3:31 PM, Francesc Alted wrote: > A Wednesday 24 March 2010 15:38:58 David Cournapeau escrigu?: >> Oh, it is not that easy :) >> >> First, for some reason, the mingw-w64 project does not provide 64 >> hosted compilers, and since pushing for mingw cross compilation >> support in distutils would redefine the meaning of insanity, I build >> my gcc. Since building gcc on windows is not a fun ride either, you >> have to build it on unix (I have the scripts on my github account: >> github.com/cournape/). > > Mmh, not sure about what you mean by hosted compiler, but there is certainly a > native compiler package for Win64: > > mingw-w64-bin_x86_64-mingw_20100322_sezero.zip > > It comes with gcc, g++ and gfortran 4.4.4. ?gdb support is also there. ?So it > seems like a pretty complete toolset for windows amd64. > > With it, and with some fixes in numpy sources (very few), I achieved to pass > the build phase. ?I can provide the patch in case someone is interested. ?The > generated extensions are: > > 24/03/2010 ?19:34 ? ? ? ? 1.492.313 multiarray.pyd > 24/03/2010 ?19:34 ? ? ? ? ? 124.866 multiarray_tests.pyd > 24/03/2010 ?19:34 ? ? ? ? ? 453.377 scalarmath.pyd > 24/03/2010 ?19:34 ? ? ? ? 1.079.827 umath.pyd > 24/03/2010 ?19:34 ? ? ? ? ? 121.651 umath_tests.pyd > 24/03/2010 ?19:34 ? ? ? ? ? 304.014 _sort.pyd > > which looks good to my eyes. > > Now, when I try to generate the installable package I'm in trouble again: > > $ python setup.py bdist > [...] > ?File "C: > \Users\francesc\Desktop\NumPy\1.4.x\numpy\distutils\command\config.py" > , line 56, in _check_compiler > ? ?self.compiler.initialize() > ?File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 359, in > initialize > ? ?vc_env = query_vcvarsall(VERSION, plat_spec) > ?File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 275, in > query_vcvar > sall > ? ?raise ValueError(str(list(result.keys()))) > ValueError: [u'path'] > [...] > > So, it looks like either numpy or python cannot determine that the used > compiler is mingw instead of msvc9. ?However, when I try to specify mingw > explicitly, I get the next error: > > $ python setup.py bdist --compiler=mingw32 > Running from numpy source directory. > Forcing DISTUTILS_USE_SDK=1 > usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...] > ? or: setup.py --help [cmd1 cmd2 ...] > ? or: setup.py --help-commands > ? or: setup.py cmd --help > > error: option --compiler not recognized For some similar problems, which I don't remember exactly, I needed to create the dist in the same command as the build, e.g. python setup.py build --compiler=mingw32 bdist I don't know if this works in your case, I have the compiler specified in distutils.cfg Josef > > Someone could tell me why distutils can be told to use mingw32 compiler for > the build stage but not for bdist? ?What is more, why the need for a compiler > for bdist if numpy is already built? I feel that I'm almost there, but some > piece still resists... 
> > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From reckoner at gmail.com Wed Mar 24 18:38:27 2010 From: reckoner at gmail.com (reckoner) Date: Wed, 24 Mar 2010 15:38:27 -0700 Subject: [Numpy-discussion] What does float64 mean on a 32-bit machine? Message-ID: <4BAA9463.1050006@gmail.com> How can I have a float64 dtype on a 32-bit machine? For example: In [90]: x = array([1/3],dtype=float32) In [91]: x Out[91]: array([ 0.33333334], dtype=float32) In [92]: x = array([1/3],dtype=float64) In [93]: x Out[93]: array([ 0.33333333]) Obviously, the float32 and float64 representations of 1/3 are different, but what is the meaning of float64 on a 32-bit machine? Shouldn't a 32-bit machine only be able represent float32? Thanks! From robert.kern at gmail.com Wed Mar 24 18:47:32 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Mar 2010 17:47:32 -0500 Subject: [Numpy-discussion] What does float64 mean on a 32-bit machine? In-Reply-To: <4BAA9463.1050006@gmail.com> References: <4BAA9463.1050006@gmail.com> Message-ID: <3d375d731003241547x1ec827fct3d9d9b37c7839747@mail.gmail.com> On Wed, Mar 24, 2010 at 17:38, reckoner wrote: > How can I have a float64 dtype on a 32-bit machine? For example: float64 is a 64-bit float on all machines. A "32-bit machine" refers only to the size of its memory address space and the size of the integer type used for pointers. It has no effect on floating point types; 32- and 64-bit versions are standard on all supported platforms though the higher precisions vary significantly from machine to machine regardless of whether it is 32- or 64-bit. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed Mar 24 18:49:52 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 24 Mar 2010 15:49:52 -0700 Subject: [Numpy-discussion] What does float64 mean on a 32-bit machine? In-Reply-To: <4BAA9463.1050006@gmail.com> References: <4BAA9463.1050006@gmail.com> Message-ID: <4BAA9710.7050705@noaa.gov> reckoner wrote: > How can I have a float64 dtype on a 32-bit machine? For example: float64 is known as "double" in C, just for this reason. Modern FPUs use 64 bit (actually more bits), so you can get very good performance with float64 on 32 bit machines. And it is the standard Python float as well. -Chris > > In [90]: x = array([1/3],dtype=float32) > > In [91]: x > Out[91]: array([ 0.33333334], dtype=float32) > > In [92]: x = array([1/3],dtype=float64) > > In [93]: x > Out[93]: array([ 0.33333333]) > > Obviously, the float32 and float64 representations of 1/3 are different, > but what is the meaning of float64 on a 32-bit machine? Shouldn't a > 32-bit machine only be able represent float32? > > Thanks! > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From amenity at enthought.com Wed Mar 24 19:44:34 2010 From: amenity at enthought.com (Amenity Applewhite) Date: Wed, 24 Mar 2010 18:44:34 -0500 Subject: [Numpy-discussion] Hiring: Software Developer, Scientific Applications Message-ID: <6C2FD6F0-A710-4AF2-BF27-19BFDAE9C046@enthought.com> Enthought is hiring a Software Developer. See the description below, or on our website: http://www.enthought.com/company/sd-scientific-app.php Best, Amenity -- Amenity Applewhite Enthought, Inc. Scientific Computing Solutions www.enthought.com Software Developer The Software Developer at Enthought, Inc. participates in the development of scientific and technical applications involving GUIs, 2- D and 3-D graphics, workflow and pipeline architecture, and numerical algorithms. The position also involves some interaction with clients. Some travel may be required. We are interested both in experienced applicants as well as in recent graduates. Applicants should have a BS, MS, or PhD degree with a strong background in science and mathematics, as well as real experience developing quality software, either commercial or open source. More experienced applicants should also have demonstrated project management skills and the ability to lead a team of strong developers with highly technical backgrounds. Applicants will be measured against the following qualifications: ? (Required) Bachelor's Degree in Computer Science or other scientific or engineering field with preferably an M.S. or Ph.D. degree. ? (Required) Minimum 2 years of technical lead or development experience with 4 or more years preferred. ? Ability to understand a problem domain and then conceive of and implement an intuitive user interface geared toward the scientist or engineer user. ? Discipline, pride, and professionalism to write readable, documented, and unit-tested code that serves as an example to others who later study your work. ? Strong work ethic and commitment to satisfying the customer. ? Experience with Python, and a strong understanding of how to apply its capabilities to develop GUI frameworks, work flow frameworks, and elegant scientific applications. ? Strong understanding of statistics, optimization, image processing, signal processing, or other technical area. ? Experience with the following: ? GUI frameworks such as NetBeans or Eclipse ? wxPython, Qt ? Low-level 2-D graphics APIs such as Quartz or GDI+ ? 3-D graphics, preferably using VTK ? Developing or working with plotting APIs ? Experience using (and interest in contributing to) SciPy ? numeric algorithms Enthought offers competitive salaries and the opportunity to work on varied and interesting technical projects. We are located in Austin, TX, consistently rated as one of the best places to live in the US. Benefits include health, dental, vision and a 401k plan. If you are interested in applying, submit a resume to jobs at enthought.com. Code samples and links to previous work are encouraged but not required. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From patrickmarshwx at gmail.com Wed Mar 24 20:47:27 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Wed, 24 Mar 2010 19:47:27 -0500 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003242031.04336.faltet@pytables.org> References: <201003241525.16596.faltet@pytables.org> <5b8d13221003240738k22a63ac6la51c3eedda659e98@mail.gmail.com> <201003242031.04336.faltet@pytables.org> Message-ID: What happens if you try to build a windows installer "python setup.py bdist_wininst" Also, have you attempted to specify the compiler in "$PYTHON_ROOT\Lib\distutils\distutils.cfg" ? I've got a script I use to manually change the "build" section of this file between [build] msvc and [build] mingw32 to switch between default compilers when I've had issues with setup.py. On Wed, Mar 24, 2010 at 2:31 PM, Francesc Alted wrote: > A Wednesday 24 March 2010 15:38:58 David Cournapeau escrigu?: > > Oh, it is not that easy :) > > > > First, for some reason, the mingw-w64 project does not provide 64 > > hosted compilers, and since pushing for mingw cross compilation > > support in distutils would redefine the meaning of insanity, I build > > my gcc. Since building gcc on windows is not a fun ride either, you > > have to build it on unix (I have the scripts on my github account: > > github.com/cournape/). > > Mmh, not sure about what you mean by hosted compiler, but there is > certainly a > native compiler package for Win64: > > mingw-w64-bin_x86_64-mingw_20100322_sezero.zip > > It comes with gcc, g++ and gfortran 4.4.4. gdb support is also there. So > it > seems like a pretty complete toolset for windows amd64. > > With it, and with some fixes in numpy sources (very few), I achieved to > pass > the build phase. I can provide the patch in case someone is interested. > The > generated extensions are: > > 24/03/2010 19:34 1.492.313 multiarray.pyd > 24/03/2010 19:34 124.866 multiarray_tests.pyd > 24/03/2010 19:34 453.377 scalarmath.pyd > 24/03/2010 19:34 1.079.827 umath.pyd > 24/03/2010 19:34 121.651 umath_tests.pyd > 24/03/2010 19:34 304.014 _sort.pyd > > which looks good to my eyes. > > Now, when I try to generate the installable package I'm in trouble again: > > $ python setup.py bdist > [...] > File "C: > \Users\francesc\Desktop\NumPy\1.4.x\numpy\distutils\command\config.py" > , line 56, in _check_compiler > self.compiler.initialize() > File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 359, in > initialize > vc_env = query_vcvarsall(VERSION, plat_spec) > File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 275, in > query_vcvar > sall > raise ValueError(str(list(result.keys()))) > ValueError: [u'path'] > [...] > > So, it looks like either numpy or python cannot determine that the used > compiler is mingw instead of msvc9. However, when I try to specify mingw > explicitly, I get the next error: > > $ python setup.py bdist --compiler=mingw32 > Running from numpy source directory. > Forcing DISTUTILS_USE_SDK=1 > usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...] > or: setup.py --help [cmd1 cmd2 ...] > or: setup.py --help-commands > or: setup.py cmd --help > > error: option --compiler not recognized > > Someone could tell me why distutils can be told to use mingw32 compiler for > the build stage but not for bdist? What is more, why the need for a > compiler > for bdist if numpy is already built? I feel that I'm almost there, but some > piece still resists... 
> > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Wed Mar 24 21:00:36 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 25 Mar 2010 10:00:36 +0900 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003242031.04336.faltet@pytables.org> References: <201003241525.16596.faltet@pytables.org> <5b8d13221003240738k22a63ac6la51c3eedda659e98@mail.gmail.com> <201003242031.04336.faltet@pytables.org> Message-ID: <4BAAB5B4.4050102@silveregg.co.jp> Francesc Alted wrote: > A Wednesday 24 March 2010 15:38:58 David Cournapeau escrigu?: >> Oh, it is not that easy :) >> >> First, for some reason, the mingw-w64 project does not provide 64 >> hosted compilers, and since pushing for mingw cross compilation >> support in distutils would redefine the meaning of insanity, I build >> my gcc. Since building gcc on windows is not a fun ride either, you >> have to build it on unix (I have the scripts on my github account: >> github.com/cournape/). > > Mmh, not sure about what you mean by hosted compiler Hosted compiler refers to the platform the compiler itself runs on (so here I mean a native 64 bits compiler, instead of a 32 bits compiler which targets 64 bits). It is nice that mingw-w64 gives a 64 bits hosted, that's recent. > It comes with gcc, g++ and gfortran 4.4.4. gdb support is also there. So it > seems like a pretty complete toolset for windows amd64. The problem with gdb (but I would need to see if that's still true) is that it did not give useful traceback for code compiled with MS compiler , the stack was always corrupted somewhere for the crashes I encountered. > > With it, and with some fixes in numpy sources (very few), I achieved to pass > the build phase. I can provide the patch in case someone is interested. Building should work out of the box, actually - what are the issues ? > > which looks good to my eyes. > > Now, when I try to generate the installable package I'm in trouble again: > > $ python setup.py bdist > [...] > File "C: > \Users\francesc\Desktop\NumPy\1.4.x\numpy\distutils\command\config.py" > , line 56, in _check_compiler > self.compiler.initialize() > File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 359, in > initialize > vc_env = query_vcvarsall(VERSION, plat_spec) > File "C:\Python26_64\lib\distutils\msvc9compiler.py", line 275, in > query_vcvar > sall > raise ValueError(str(list(result.keys()))) > ValueError: [u'path'] > [...] > > So, it looks like either numpy or python cannot determine that the used > compiler is mingw instead of msvc9. Yes, you have to set it explicitly, this is no different than on 32 bits: python setup.py build -c mingw32 bdist_wininst. But the main problem is not to build numpy (the 1.3.0 64 bits msi was built with mingw-w64 already) - it is to find out what causes the various crashes people encountered. Some were due to bugs in mingw which have been fixed since then. 
cheers, David From d.l.goldsmith at gmail.com Thu Mar 25 04:21:22 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 25 Mar 2010 01:21:22 -0700 Subject: [Numpy-discussion] Hiring: Software Developer, Scientific Applications In-Reply-To: <6C2FD6F0-A710-4AF2-BF27-19BFDAE9C046@enthought.com> References: <6C2FD6F0-A710-4AF2-BF27-19BFDAE9C046@enthought.com> Message-ID: <45d1ab481003250121v2c9208e2t40c5985b71042223@mail.gmail.com> Is telecommuting an option? DG On Wed, Mar 24, 2010 at 4:44 PM, Amenity Applewhite wrote: > > Enthought is hiring a Software Developer. > See the description below, or on our > website:?http://www.enthought.com/company/sd-scientific-app.php > Best, > Amenity > -- > Amenity Applewhite > Enthought, Inc. > Scientific Computing Solutions > www.enthought.com > > > > Software Developer > > The Software Developer at Enthought, Inc. participates in the development of > scientific and technical applications involving GUIs, 2-D and 3-D graphics, > workflow and pipeline architecture, and numerical algorithms.?? The position > also involves some interaction with clients. Some travel may be required. > > We are interested both in experienced applicants as well as in recent > graduates. Applicants should have a BS, MS, or PhD degree with a strong > background in science and mathematics, as well as real experience developing > quality software, either commercial or open source. More experienced > applicants should also have demonstrated project management skills and the > ability to lead a team of strong developers with highly technical > backgrounds. > > Applicants will be measured against the following qualifications: > > ? (Required)?Bachelor's Degree in Computer Science or other scientific or > engineering field with preferably an M.S. or Ph.D. degree. > ? (Required)?Minimum 2 years of technical lead or development experience > with 4 or more years preferred. > ? Ability to understand a problem domain and then conceive of and implement > an intuitive user interface geared toward the scientist or engineer user. > ? Discipline, pride, and professionalism to write readable, documented, and > unit-tested code that serves as an example to others who later study your > work. > ? Strong work ethic and commitment to satisfying the customer. > ? Experience with Python, and a strong understanding of how to apply its > capabilities to develop GUI frameworks, work flow frameworks, and elegant > scientific applications. > ? Strong understanding of statistics, optimization, image processing, signal > processing, or other technical area. > ? Experience with the following: > ? GUI frameworks such as NetBeans or Eclipse > ? wxPython, Qt > ? Low-level 2-D graphics APIs such as Quartz or GDI+ > ? 3-D graphics, preferably using VTK > ? Developing or working with plotting APIs > ? Experience using (and interest in contributing to) SciPy > ? numeric algorithms > Enthought offers competitive salaries and the opportunity to work on varied > and interesting technical projects. We are located in Austin, TX, > consistently rated as one of the best places to live in the US.? Benefits > include health, dental, vision and a 401k plan. > > If you are interested in applying, submit a resume to jobs at enthought.com. > Code samples and links to previous work are encouraged but not required. 
> > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From d.l.goldsmith at gmail.com Thu Mar 25 04:22:31 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 25 Mar 2010 01:22:31 -0700 Subject: [Numpy-discussion] Hiring: Software Developer, Scientific Applications In-Reply-To: <45d1ab481003250121v2c9208e2t40c5985b71042223@mail.gmail.com> References: <6C2FD6F0-A710-4AF2-BF27-19BFDAE9C046@enthought.com> <45d1ab481003250121v2c9208e2t40c5985b71042223@mail.gmail.com> Message-ID: <45d1ab481003250122n227a9ecq9138da2e3ed47b50@mail.gmail.com> On Thu, Mar 25, 2010 at 1:21 AM, David Goldsmith wrote: > Is telecommuting an option? > > DG Sorry, I didn't mean to send that to the list. :-( DG > On Wed, Mar 24, 2010 at 4:44 PM, Amenity Applewhite > wrote: >> >> Enthought is hiring a Software Developer. >> See the description below, or on our >> website:?http://www.enthought.com/company/sd-scientific-app.php >> Best, >> Amenity >> -- >> Amenity Applewhite >> Enthought, Inc. >> Scientific Computing Solutions >> www.enthought.com >> >> >> >> Software Developer >> >> The Software Developer at Enthought, Inc. participates in the development of >> scientific and technical applications involving GUIs, 2-D and 3-D graphics, >> workflow and pipeline architecture, and numerical algorithms.?? The position >> also involves some interaction with clients. Some travel may be required. >> >> We are interested both in experienced applicants as well as in recent >> graduates. Applicants should have a BS, MS, or PhD degree with a strong >> background in science and mathematics, as well as real experience developing >> quality software, either commercial or open source. More experienced >> applicants should also have demonstrated project management skills and the >> ability to lead a team of strong developers with highly technical >> backgrounds. >> >> Applicants will be measured against the following qualifications: >> >> ? (Required)?Bachelor's Degree in Computer Science or other scientific or >> engineering field with preferably an M.S. or Ph.D. degree. >> ? (Required)?Minimum 2 years of technical lead or development experience >> with 4 or more years preferred. >> ? Ability to understand a problem domain and then conceive of and implement >> an intuitive user interface geared toward the scientist or engineer user. >> ? Discipline, pride, and professionalism to write readable, documented, and >> unit-tested code that serves as an example to others who later study your >> work. >> ? Strong work ethic and commitment to satisfying the customer. >> ? Experience with Python, and a strong understanding of how to apply its >> capabilities to develop GUI frameworks, work flow frameworks, and elegant >> scientific applications. >> ? Strong understanding of statistics, optimization, image processing, signal >> processing, or other technical area. >> ? Experience with the following: >> ? GUI frameworks such as NetBeans or Eclipse >> ? wxPython, Qt >> ? Low-level 2-D graphics APIs such as Quartz or GDI+ >> ? 3-D graphics, preferably using VTK >> ? Developing or working with plotting APIs >> ? Experience using (and interest in contributing to) SciPy >> ? numeric algorithms >> Enthought offers competitive salaries and the opportunity to work on varied >> and interesting technical projects. We are located in Austin, TX, >> consistently rated as one of the best places to live in the US.? 
Benefits >> include health, dental, vision and a 401k plan. >> >> If you are interested in applying, submit a resume to jobs at enthought.com. >> Code samples and links to previous work are encouraged but not required. >> >> >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > From faltet at pytables.org Thu Mar 25 05:34:06 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 25 Mar 2010 10:34:06 +0100 Subject: [Numpy-discussion] draft release guide In-Reply-To: <4BAAB5B4.4050102@silveregg.co.jp> References: <201003242031.04336.faltet@pytables.org> <4BAAB5B4.4050102@silveregg.co.jp> Message-ID: <201003251034.06735.faltet@pytables.org> A Thursday 25 March 2010 02:00:36 David Cournapeau escrigu?: > Hosted compiler refers to the platform the compiler itself runs on (so > here I mean a native 64 bits compiler, instead of a 32 bits compiler > which targets 64 bits). It is nice that mingw-w64 gives a 64 bits > hosted, that's recent. Yeah, but unfortunately it is not enough for numpy to work. After creating the binary package (thanks for the "build -c mingw32 bdist_wininst" trick), and installing nose, I get a big crash when running the test units. When trying to use gdb to see the backtrace, I got: C:\Users\francesc\Desktop\NumPy>gdb python GNU gdb (GDB) 7.1.50.20100318-cvs Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-w64-mingw32". For bug reporting instructions, please see: ... Reading symbols from \Python26_64/python.exe...(no debugging symbols found)...done. (gdb) r Starting program: \Python26_64/python.exe [New Thread 2880.0x410] Python 2.6.5 (r265:79096, Mar 19 2010, 18:02:59) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy C:\Python26_64\lib\site-packages\numpy\core\__init__.py:5: Warning: Numpy built with MINGW-W64 on Windows 64 bits is experimental, and only available for testing. You are advised not to use it for production. CRASHES ARE TO BE EXPECTED - PLEASE REPORT THEM TO NUMPY DEVELOPERS import multiarray >>> numpy.test() Running unit tests for numpy NumPy version 1.4.0 NumPy is installed in C:\Python26_64\lib\site-packages\numpy Python version 2.6.5 (r265:79096, Mar 19 2010, 18:02:59) [MSC v.1500 64 bit (AMD 64)] nose version 0.11.3 Forcing DISTUTILS_USE_SDK=1 .....warning: HEAP[python.exe]: warning: Invalid address specified to RtlFreeHeap( 0000000002240000, 0000000000C 25110 ) Program received signal SIGTRAP, Trace/breakpoint trap. 0x0000000076fa6061 in ntdll!DbgBreakPoint () from C:\Windows\system32\ntdll.dll (gdb) bt #0 0x0000000076fa6061 in ntdll!DbgBreakPoint () from C:\Windows\system32\ntdll.dll #1 0x0000000076ffe17a in ntdll!EtwEventProviderEnabled () from C:\Windows\system32\ntdll.dll #2 0x00000000022af0d8 in ?? () #3 0x000000005104095c in ?? () #4 0x0000000000219a08 in ?? () #5 0x000000000e040001 in ?? () #6 0x0000000002240000 in ?? 
() #7 0x0000000076fe27a1 in ntdll!MD4Final () from C:\Windows\system32\ntdll.dll #8 0x0000000076fb9630 in ntdll!LdrGetProcedureAddress () from C:\Windows\system32\ntdll.dll #9 0x0000000076fb9500 in ntdll!LdrGetProcedureAddress () from C:\Windows\system32\ntdll.dll #10 0x0000000002240000 in ?? () #11 0x0000000000c25110 in ?? () #12 0x0000000002240000 in ?? () #13 0x000000000623f197 in ?? () #14 0x0000000002240000 in ?? () #15 0x00000000770151a9 in ntdll!RtlTraceDatabaseCreate () from C:\Windows\system32\ntdll.dll #16 0x0000000000000000 in ?? () (gdb) So, it is still exactly as you said: it seems like gdb cannot introspect MS debug symbols. Unfortunately I don't think solving this would be easy at all :-/ Perhaps reporting this to mingw-w64 would help... > > With it, and with some fixes in numpy sources (very few), I achieved to > > pass the build phase. I can provide the patch in case someone is > > interested. > > Building should work out of the box, actually - what are the issues ? I'm attaching the patch. Caveat emptor: the patch is *very* crude, and it is just meant to allow a clean compile with mingw-w64. A more elaborated patch should be implemented in case we want it to go into numpy. > But the main problem is not to build numpy (the 1.3.0 64 bits msi was > built with mingw-w64 already) - it is to find out what causes the > various crashes people encountered. Some were due to bugs in mingw which > have been fixed since then. Exactly. OK, I think I'm done with this try for the moment. It is a pity because mingw-w64 does an excellent job with my preliminary tests with Blosc compressor (it achieves 4x better performance than mingw-w32 and 2x better than MSVC 2008 32-bit). But before going MSVC 64-bit route, I'd like to test Intel compiler 64-bit. Anyone knows if it can cope with numpy? A quick look at setup.py seems to say that only 32-bit Intel compiler is supported. Thanks, -- Francesc Alted -------------- next part -------------- A non-text attachment was scrubbed... Name: mingw64.patch Type: text/x-patch Size: 2011 bytes Desc: not available URL: From david at silveregg.co.jp Thu Mar 25 05:53:34 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 25 Mar 2010 18:53:34 +0900 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003251034.06735.faltet@pytables.org> References: <201003242031.04336.faltet@pytables.org> <4BAAB5B4.4050102@silveregg.co.jp> <201003251034.06735.faltet@pytables.org> Message-ID: <4BAB329E.6000700@silveregg.co.jp> Francesc Alted wrote: > > C:\Users\francesc\Desktop\NumPy>gdb python > GNU gdb (GDB) 7.1.50.20100318-cvs > Copyright (C) 2010 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. Type "show copying" > and "show warranty" for details. > This GDB was configured as "x86_64-w64-mingw32". > For bug reporting instructions, please see: > ... > Reading symbols from \Python26_64/python.exe...(no debugging symbols > found)...done. > (gdb) r > Starting program: \Python26_64/python.exe > [New Thread 2880.0x410] > Python 2.6.5 (r265:79096, Mar 19 2010, 18:02:59) [MSC v.1500 64 bit (AMD64)] > on > win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy > C:\Python26_64\lib\site-packages\numpy\core\__init__.py:5: Warning: Numpy > built > with MINGW-W64 on Windows 64 bits is experimental, and only available for > testing. 
You are advised not to use it for production. > > CRASHES ARE TO BE EXPECTED - PLEASE REPORT THEM TO NUMPY DEVELOPERS > import multiarray >>>> numpy.test() > Running unit tests for numpy > NumPy version 1.4.0 > NumPy is installed in C:\Python26_64\lib\site-packages\numpy > Python version 2.6.5 (r265:79096, Mar 19 2010, 18:02:59) [MSC v.1500 64 bit > (AMD > 64)] > nose version 0.11.3 > Forcing DISTUTILS_USE_SDK=1 > .....warning: HEAP[python.exe]: > warning: Invalid address specified to RtlFreeHeap( 0000000002240000, > 0000000000C > 25110 ) > > > Program received signal SIGTRAP, Trace/breakpoint trap. > 0x0000000076fa6061 in ntdll!DbgBreakPoint () > from C:\Windows\system32\ntdll.dll > (gdb) bt > #0 0x0000000076fa6061 in ntdll!DbgBreakPoint () > from C:\Windows\system32\ntdll.dll > #1 0x0000000076ffe17a in ntdll!EtwEventProviderEnabled () > from C:\Windows\system32\ntdll.dll > #2 0x00000000022af0d8 in ?? () > #3 0x000000005104095c in ?? () > #4 0x0000000000219a08 in ?? () > #5 0x000000000e040001 in ?? () > #6 0x0000000002240000 in ?? () > #7 0x0000000076fe27a1 in ntdll!MD4Final () from C:\Windows\system32\ntdll.dll > #8 0x0000000076fb9630 in ntdll!LdrGetProcedureAddress () > from C:\Windows\system32\ntdll.dll > #9 0x0000000076fb9500 in ntdll!LdrGetProcedureAddress () > from C:\Windows\system32\ntdll.dll > #10 0x0000000002240000 in ?? () > #11 0x0000000000c25110 in ?? () > #12 0x0000000002240000 in ?? () > #13 0x000000000623f197 in ?? () > #14 0x0000000002240000 in ?? () > #15 0x00000000770151a9 in ntdll!RtlTraceDatabaseCreate () > from C:\Windows\system32\ntdll.dll > #16 0x0000000000000000 in ?? () > (gdb) Believe it or not, but this is already much better than what I had last time I looked at it (the stack was corrupted after two items, and gdb often crashed). I had to build custom mingw runtimes to get there last year :) > > So, it is still exactly as you said: it seems like gdb cannot introspect MS > debug symbols. And for completeness, the MS tools of course cannot read the mingw debugging symbols, but the contrary would have been surprising. It does not help that debugging things inside the MS C runtime is a nightmare (because you have to rebuild everything, including python). I wonder if MS makes the sources of their runtime available under some license to look more into it. I also thought about using valgrind, but then there are some issues running wine in 64 bits mode under valgrind (maybe solved since then: https://bugs.kde.org/show_bug.cgi?id=197259). > > Exactly. OK, I think I'm done with this try for the moment. It is a pity > because mingw-w64 does an excellent job with my preliminary tests with Blosc > compressor (it achieves 4x better performance than mingw-w32 and 2x better > than MSVC 2008 32-bit). But before going MSVC 64-bit route, I'd like to test > Intel compiler 64-bit. Anyone knows if it can cope with numpy? What works well is MS compiler + Intel Fortran compiler (at least with numscons). Surprisingly, when I tried using the Intel C compiler + Intel Fortran compiler, I also had a lot of issues, not unlike the ones with mingw. What I am afraid is that the C runtimes issues are unsolvable on win64 because of python, and that we have to use the MS compiler (for C and C++). 
cheers, David From faltet at pytables.org Thu Mar 25 08:33:02 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 25 Mar 2010 13:33:02 +0100 Subject: [Numpy-discussion] draft release guide In-Reply-To: <4BAB329E.6000700@silveregg.co.jp> References: <201003251034.06735.faltet@pytables.org> <4BAB329E.6000700@silveregg.co.jp> Message-ID: <201003251333.02065.faltet@pytables.org> A Thursday 25 March 2010 10:53:34 David Cournapeau escrigu?: > Believe it or not, but this is already much better than what I had last > time I looked at it (the stack was corrupted after two items, and gdb > often crashed). I had to build custom mingw runtimes to get there last > year :) Well, I've reported the problem just in case: https://sourceforge.net/projects/mingw-w64/forums/forum/723797/topic/3638920 > What works well is MS compiler + Intel Fortran compiler (at least with > numscons). Surprisingly, when I tried using the Intel C compiler + Intel > Fortran compiler, I also had a lot of issues, not unlike the ones with > mingw. What I am afraid is that the C runtimes issues are unsolvable on > win64 because of python, and that we have to use the MS compiler (for C > and C++). Ok. So it seems the MS compiler venue for 64-bit is unavoidable (at this moment, at least). One question though: is a fortran compiler really necessary for compiling just numpy? If so, why? -- Francesc Alted From cournape at gmail.com Thu Mar 25 10:57:54 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 25 Mar 2010 23:57:54 +0900 Subject: [Numpy-discussion] draft release guide In-Reply-To: <201003251333.02065.faltet@pytables.org> References: <201003251034.06735.faltet@pytables.org> <4BAB329E.6000700@silveregg.co.jp> <201003251333.02065.faltet@pytables.org> Message-ID: <5b8d13221003250757j3836e040p6e042eebb2396e9@mail.gmail.com> On Thu, Mar 25, 2010 at 9:33 PM, Francesc Alted wrote: > > Ok. ?So it seems the MS compiler venue for 64-bit is unavoidable (at this > moment, at least). ?One question though: is a fortran compiler really > necessary for compiling just numpy? No, but I personally have little interest in numpy without scipy. You can build numpy with MS compilers on 64 bits since 1.3.0, this is entirely supported. David From markbak at gmail.com Thu Mar 25 11:10:42 2010 From: markbak at gmail.com (Mark Bakker) Date: Thu, 25 Mar 2010 16:10:42 +0100 Subject: [Numpy-discussion] partial autocorrelation function Message-ID: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> Anybody know of a partial autocorrelation function in numpy? Maybe in scipy? Thanks, Mark ps. I know, not too difficult, but if it is around I'd be happy to use it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Mar 25 11:14:33 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Mar 2010 11:14:33 -0400 Subject: [Numpy-discussion] partial autocorrelation function In-Reply-To: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> References: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> Message-ID: <1cd32cbb1003250814x1f5f1d1ap5d8c01ea0c4b8904@mail.gmail.com> On Thu, Mar 25, 2010 at 11:10 AM, Mark Bakker wrote: > Anybody know of a partial autocorrelation function in numpy? Maybe in scipy? > > Thanks, > > Mark > > ps. I know, not too difficult, but if it is around I'd be happy to use it. 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From jsseabold at gmail.com Thu Mar 25 11:20:38 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 25 Mar 2010 11:20:38 -0400 Subject: [Numpy-discussion] partial autocorrelation function In-Reply-To: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> References: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> Message-ID: On Thu, Mar 25, 2010 at 11:10 AM, Mark Bakker wrote: > Anybody know of a partial autocorrelation function in numpy? Maybe in scipy? > > Thanks, > > Mark > > ps. I know, not too difficult, but if it is around I'd be happy to use it. > I'm sure there's another version somewhere, but I couldn't find one and wrote pacorr http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/sandbox/tsa/stattools.py#L92 It relies on statsmodels, but could easily be made to work without it. You can roll your own OLS, add_constant just adds a column of ones, and you can see lagmat here: http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/sandbox/tools/tools_tsa.py Skipper From josef.pktd at gmail.com Thu Mar 25 11:21:38 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Mar 2010 11:21:38 -0400 Subject: [Numpy-discussion] partial autocorrelation function In-Reply-To: <1cd32cbb1003250814x1f5f1d1ap5d8c01ea0c4b8904@mail.gmail.com> References: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> <1cd32cbb1003250814x1f5f1d1ap5d8c01ea0c4b8904@mail.gmail.com> Message-ID: <1cd32cbb1003250821g13402f68p98a408f7572afd7a@mail.gmail.com> On Thu, Mar 25, 2010 at 11:14 AM, wrote: > On Thu, Mar 25, 2010 at 11:10 AM, Mark Bakker wrote: >> Anybody know of a partial autocorrelation function in numpy? Maybe in scipy? >> >> Thanks, >> >> Mark >> >> ps. I know, not too difficult, but if it is around I'd be happy to use it. We have partial autocorrelation in scikits.statsmodels, based on ols or on yule walker. I think it's possible to get them more efficiently with Levinson-Durbin recursion e.g. in scikits.talkbox, but I haven't figured out yet how to recover it. (I hit the wrong button in the previous message) Josef >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > From josef.pktd at gmail.com Thu Mar 25 11:25:43 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Mar 2010 11:25:43 -0400 Subject: [Numpy-discussion] partial autocorrelation function In-Reply-To: References: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> Message-ID: <1cd32cbb1003250825t680b5340wff5957f8f07b1391@mail.gmail.com> On Thu, Mar 25, 2010 at 11:20 AM, Skipper Seabold wrote: > On Thu, Mar 25, 2010 at 11:10 AM, Mark Bakker wrote: >> Anybody know of a partial autocorrelation function in numpy? Maybe in scipy? >> >> Thanks, >> >> Mark >> >> ps. I know, not too difficult, but if it is around I'd be happy to use it. >> > > I'm sure there's another version somewhere, but I couldn't find one > and wrote pacorr > > http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/sandbox/tsa/stattools.py#L92 > > It relies on statsmodels, but could easily be made to work without it. 
> ?You can roll your own OLS, ?add_constant just adds a column of ones, > and you can see lagmat here: > > http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/sandbox/tools/tools_tsa.py My versions are in the last release scikits.statsmodels.sandbox.tsa.pacf_ols scikits.statsmodels.sandbox.tsa.pacf_yw difference in the third (I think) decimal compared to matlab, where I didn't find out the reason Josef > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Thu Mar 25 11:40:28 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 25 Mar 2010 11:40:28 -0400 Subject: [Numpy-discussion] partial autocorrelation function In-Reply-To: <1cd32cbb1003250825t680b5340wff5957f8f07b1391@mail.gmail.com> References: <6946b9501003250810ub3d1266lf2aa8a5e14010ddd@mail.gmail.com> <1cd32cbb1003250825t680b5340wff5957f8f07b1391@mail.gmail.com> Message-ID: On Thu, Mar 25, 2010 at 11:25 AM, wrote: > On Thu, Mar 25, 2010 at 11:20 AM, Skipper Seabold wrote: >> On Thu, Mar 25, 2010 at 11:10 AM, Mark Bakker wrote: >>> Anybody know of a partial autocorrelation function in numpy? Maybe in scipy? >>> >>> Thanks, >>> >>> Mark >>> >>> ps. I know, not too difficult, but if it is around I'd be happy to use it. >>> >> >> I'm sure there's another version somewhere, but I couldn't find one >> and wrote pacorr >> >> http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/sandbox/tsa/stattools.py#L92 >> >> It relies on statsmodels, but could easily be made to work without it. >> ?You can roll your own OLS, ?add_constant just adds a column of ones, >> and you can see lagmat here: >> >> http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/sandbox/tools/tools_tsa.py > > My versions are in the last release > > scikits.statsmodels.sandbox.tsa.pacf_ols > scikits.statsmodels.sandbox.tsa.pacf_yw > > difference in the third (I think) decimal compared to matlab, where I > didn't find out the reason > > Josef > Ah, there they are. To be clear, use these then, as I'll be removing the other. Skipper From dsdale24 at gmail.com Thu Mar 25 16:01:44 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 25 Mar 2010 16:01:44 -0400 Subject: [Numpy-discussion] should ndarray implement __round__ for py3k? Message-ID: A simple test in python 3: >>> import numpy as np >>> round(np.arange(10)) Traceback (most recent call last): File "", line 1, in TypeError: type numpy.ndarray doesn't define __round__ method Here is some additional context: http://bugs.python.org/issue7261 Darren From robert.kern at gmail.com Thu Mar 25 16:09:42 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 25 Mar 2010 15:09:42 -0500 Subject: [Numpy-discussion] should ndarray implement __round__ for py3k? In-Reply-To: References: Message-ID: <3d375d731003251309s27c1575cj4f3692d0c3b15bf9@mail.gmail.com> On Thu, Mar 25, 2010 at 15:01, Darren Dale wrote: > A simple test in python 3: > >>>> import numpy as np >>>> round(np.arange(10)) > Traceback (most recent call last): > ?File "", line 1, in > TypeError: type numpy.ndarray doesn't define __round__ method > > Here is some additional context: http://bugs.python.org/issue7261 I'd put that in the "would be nice, but not a priority" category. 
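In the meantime, a minimal sketch of the spelled-out alternatives that already behave the same on Python 2 and 3 (nothing beyond the existing numpy API; note these return arrays of floats, not Python ints):

>>> import numpy as np
>>> a = np.array([0.4, 1.6, 2.5])
>>> np.round(a)                 # elementwise; 2.5 rounds to even
array([ 0.,  2.,  2.])
>>> np.around(a, decimals=1)
array([ 0.4,  1.6,  2.5])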
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Thu Mar 25 16:58:09 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 25 Mar 2010 22:58:09 +0200 Subject: [Numpy-discussion] should ndarray implement __round__ for py3k? In-Reply-To: References: Message-ID: <1269550690.6921.2.camel@Nokia-N900-42-11> Darren Dale wrote: > A simple test in python 3: > > > > > import numpy as np > > > > round(np.arange(10)) > Traceback (most recent call last): >? ? File "", line 1, in > TypeError: type numpy.ndarray doesn't define __round__ method I implemented this for array scalars already, but forgot about arrays. Anyway, it could be nice to have. -- Pauli Virtanen From charlesr.harris at gmail.com Thu Mar 25 17:01:56 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 25 Mar 2010 15:01:56 -0600 Subject: [Numpy-discussion] should ndarray implement __round__ for py3k? In-Reply-To: <1269550690.6921.2.camel@Nokia-N900-42-11> References: <1269550690.6921.2.camel@Nokia-N900-42-11> Message-ID: On Thu, Mar 25, 2010 at 2:58 PM, Pauli Virtanen wrote: > Darren Dale wrote: > > A simple test in python 3: > > > > > > > import numpy as np > > > > > round(np.arange(10)) > > Traceback (most recent call last): > > File "", line 1, in > > TypeError: type numpy.ndarray doesn't define __round__ method > > I implemented this for array scalars already, but forgot about arrays. > Anyway, it could be nice to have. > > I like the fact that python >= 3.1 returns integers when rounding floats. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Thu Mar 25 18:25:38 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 25 Mar 2010 18:25:38 -0400 Subject: [Numpy-discussion] f2py: "could not crack entity declaration" Message-ID: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> I decided to give wrapping this code a try: http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 I'm afraid my Fortran skills are fairly limited, but I do know that gfortran compiles it fine. f2py run on this file produces lots of errors of the form, Reading fortran codes... Reading file 'GLMnet.f90' (format:fix) Line #263 in GLMnet.f90:" real x(no,ni),y(no),w(no),vp(ni),ca(nx,nlam) 353" updatevars: could not crack entity declaration "ca(nx,nlam)353". Ignoring. Line #264 in GLMnet.f90:" real ulam(nlam),a0(nlam),rsq(nlam),alm(nlam) 354" updatevars: could not crack entity declaration "alm(nlam)354". Ignoring. Line #265 in GLMnet.f90:" integer jd(*),ia(nx),nin(nlam) 355" updatevars: could not crack entity declaration "nin(nlam)355". Ignoring. Line #289 in GLMnet.f90:" real x(no,ni),y(no),w(no),vp(ni),ulam(nlam) 378" updatevars: could not crack entity declaration "ulam(nlam)378". Ignoring. Line #290 in GLMnet.f90:" real ca(nx,nlam),a0(nlam),rsq(nlam),alm(nlam) 379" updatevars: could not crack entity declaration "alm(nlam)379". Ignoring. Line #291 in GLMnet.f90:" integer jd(*),ia(nx),nin(nlam) 380" updatevars: could not crack entity declaration "nin(nlam)380". Ignoring. Line #306 in GLMnet.f90:" call chkvars(no,ni,x,ju) 392" analyzeline: No name/args pattern found for li Is it the numbers that it is objecting to (I'm assuming these are some sort of punchcard thing)? Do I need to modify the code in some way to make it f2py-friendly? 
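A quick sketch of one way to test that hunch, assuming the trailing digits really are old fixed-form card sequence numbers in columns 73-80 (output file and module names below are placeholders):

# keep only columns 1-72, and use a .f suffix so the file is unambiguously fixed-form
cut -c1-72 GLMnet.f90 > glmnet_fixed.f
# regenerate just the signature file and see whether the "could not crack" errors go away
f2py -m glmnet -h glmnet_sig.pyf glmnet_fixed.f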
Thanks, David From pearu.peterson at gmail.com Thu Mar 25 18:34:56 2010 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Fri, 26 Mar 2010 00:34:56 +0200 Subject: [Numpy-discussion] f2py: "could not crack entity declaration" In-Reply-To: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> References: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> Message-ID: <4BABE510.4040500@cens.ioc.ee> Try renaming GLMnet.f90 to GLMnet.f. HTH, Pearu David Warde-Farley wrote: > I decided to give wrapping this code a try: > > http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 > > I'm afraid my Fortran skills are fairly limited, but I do know that > gfortran compiles it fine. f2py run on this file produces lots of > errors of the form, > > Reading fortran codes... > Reading file 'GLMnet.f90' (format:fix) > Line #263 in GLMnet.f90:" real > x(no,ni),y(no),w(no),vp(ni),ca(nx,nlam) 353" > updatevars: could not crack entity declaration "ca(nx,nlam)353". > Ignoring. > Line #264 in GLMnet.f90:" real > ulam(nlam),a0(nlam),rsq(nlam),alm(nlam) 354" > updatevars: could not crack entity declaration "alm(nlam)354". > Ignoring. > Line #265 in GLMnet.f90:" integer > jd(*),ia(nx),nin(nlam) 355" > updatevars: could not crack entity declaration "nin(nlam)355". > Ignoring. > Line #289 in GLMnet.f90:" real > x(no,ni),y(no),w(no),vp(ni),ulam(nlam) 378" > updatevars: could not crack entity declaration "ulam(nlam)378". > Ignoring. > Line #290 in GLMnet.f90:" real > ca(nx,nlam),a0(nlam),rsq(nlam),alm(nlam) 379" > updatevars: could not crack entity declaration "alm(nlam)379". > Ignoring. > Line #291 in GLMnet.f90:" integer > jd(*),ia(nx),nin(nlam) 380" > updatevars: could not crack entity declaration "nin(nlam)380". > Ignoring. > Line #306 in GLMnet.f90:" call > chkvars(no,ni,x,ju) 392" > analyzeline: No name/args pattern found for li > > Is it the numbers that it is objecting to (I'm assuming these are some > sort of punchcard thing)? Do I need to modify the code in some way to > make it f2py-friendly? > > Thanks, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From oliphant at enthought.com Thu Mar 25 23:05:40 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 25 Mar 2010 22:05:40 -0500 Subject: [Numpy-discussion] numpy.array(arr.flat) mutates arr if arr.flags.fortran: bug? In-Reply-To: References: Message-ID: <2E439135-2D4F-46C6-8FFF-45D30F6D49F0@enthought.com> On Mar 24, 2010, at 2:13 PM, Zachary Pincus wrote: > Hello, > > I assume it is a bug that calling numpy.array() on a flatiter of a > fortran-strided array that owns its own data causes that array to be > rearranged somehow? > > Not sure what happens with a fancier-strided array that also owns its > own data (because I'm not sure how to create one of those in python). > > This is from the latest svn version (2.0.0.dev8302) but was also > present in a previous version too. Hmm.. Yeah, this doesn't seem right. I'm not really sure what array(a.flat) should return since a.flat is an iterator... but what it's doing is not right. 
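For the moment, two spellings that produce the flattened copy without disturbing the original array (plain NumPy, not part of any fix):

import numpy as np

a = np.array([[1, 2], [3, 4]]).copy('F')

flat1 = a.flatten()              # C-ordered copy; `a` is left untouched
flat2 = np.array(list(a.flat))   # materialize the iterator first, same values
# `a` still holds [[1, 2], [3, 4]] in Fortran order after either call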
-Travis From bioinformed at gmail.com Fri Mar 26 08:08:53 2010 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 26 Mar 2010 08:08:53 -0400 Subject: [Numpy-discussion] f2py: "could not crack entity declaration" In-Reply-To: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> References: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> Message-ID: <2e1434c11003260508p505405c0h564faa1ff7cd1713@mail.gmail.com> On Thu, Mar 25, 2010 at 6:25 PM, David Warde-Farley wrote: > I decided to give wrapping this code a try: > > http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 > > > I have a working f2py wrapper located at: http://code.google.com/p/glu-genetics/source/browse/trunk/glu/lib/glm/glmnet.pyf I've only used some of the basic functionality, so there may be a few rough edges. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Mar 26 11:26:05 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 26 Mar 2010 09:26:05 -0600 Subject: [Numpy-discussion] numpy.array(arr.flat) mutates arr if arr.flags.fortran: bug? In-Reply-To: References: Message-ID: On Wed, Mar 24, 2010 at 1:13 PM, Zachary Pincus wrote: > Hello, > > I assume it is a bug that calling numpy.array() on a flatiter of a > fortran-strided array that owns its own data causes that array to be > rearranged somehow? > > Not sure what happens with a fancier-strided array that also owns its > own data (because I'm not sure how to create one of those in python). > > This is from the latest svn version (2.0.0.dev8302) but was also > present in a previous version too. > > Zach > > > In [9]: a = numpy.array([[1,2],[3,4]]).copy('F') > > In [10]: a > Out[10]: > array([[1, 2], > [3, 4]]) > > In [11]: list(a.flat) > Out[11]: [1, 2, 3, 4] > > In [12]: a # no problem > Out[12]: > array([[1, 2], > [3, 4]]) > > In [13]: numpy.array(a.flat) > Out[13]: array([1, 2, 3, 4]) > > In [14]: a # this ain't right! > Out[14]: > array([[1, 3], > [2, 4]]) > > In [15]: a = numpy.array([[1,2],[3,4]]).copy('C') > > In [16]: numpy.array(a.flat) > Out[16]: array([1, 2, 3, 4]) > > In [17]: a > Out[17]: > array([[1, 2], > [3, 4]]) > > You should open a ticket for this. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Fri Mar 26 16:25:27 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 26 Mar 2010 16:25:27 -0400 Subject: [Numpy-discussion] f2py: "could not crack entity declaration" In-Reply-To: <2e1434c11003260508p505405c0h564faa1ff7cd1713@mail.gmail.com> References: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> <2e1434c11003260508p505405c0h564faa1ff7cd1713@mail.gmail.com> Message-ID: On 26-Mar-10, at 8:08 AM, Kevin Jacobs wrote: > On Thu, Mar 25, 2010 at 6:25 PM, David Warde-Farley >wrote: > >> I decided to give wrapping this code a try: >> >> http://morrislab.med.utoronto.ca/~dwf/GLMnet.f90 >> >> >> > I have a working f2py wrapper located at: > > http://code.google.com/p/glu-genetics/source/browse/trunk/glu/lib/glm/glmnet.pyf Thanks Kevin. Part of it was as an exercise to arse myself to learn how to use f2py, but it's good to have some reference :) That said, I gave that wrapper a whirl and it crashed on me... I noticed you added an 'njd' argument to the wrapper for elnet, did you modify the elnet Fortran function at all? Is it fine to have arguments in the wrapped version that don't directly correspond to the raw fortran version? 
Thanks, David From pgmdevlist at gmail.com Fri Mar 26 16:29:51 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 26 Mar 2010 16:29:51 -0400 Subject: [Numpy-discussion] Renamed numpy.lib.io ? Message-ID: All, I'm surprised by the renaming of numpy.lib.io to numpy.lib.npyio. Was it really necessary ? Is it to take effect with the incoming numpy 2.0 release ? BTW, when is it scheduled ? From robert.kern at gmail.com Fri Mar 26 16:36:40 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 26 Mar 2010 15:36:40 -0500 Subject: [Numpy-discussion] Renamed numpy.lib.io ? In-Reply-To: References: Message-ID: <3d375d731003261336x60e36e38s8c94e94352507c8@mail.gmail.com> On Fri, Mar 26, 2010 at 15:29, Pierre GM wrote: > All, > I'm surprised by the renaming of numpy.lib.io to numpy.lib.npyio. Was it really necessary ? http://mail.scipy.org/pipermail/numpy-discussion/2010-March/049543.html http://mail.scipy.org/pipermail/numpy-discussion/2010-March/049551.html .... etc. > Is it to take effect with the incoming numpy 2.0 release ? Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Fri Mar 26 16:49:21 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 26 Mar 2010 16:49:21 -0400 Subject: [Numpy-discussion] Renamed numpy.lib.io ? In-Reply-To: <3d375d731003261336x60e36e38s8c94e94352507c8@mail.gmail.com> References: <3d375d731003261336x60e36e38s8c94e94352507c8@mail.gmail.com> Message-ID: On Mar 26, 2010, at 4:36 PM, Robert Kern wrote: > On Fri, Mar 26, 2010 at 15:29, Pierre GM wrote: >> All, >> I'm surprised by the renaming of numpy.lib.io to numpy.lib.npyio. Was it really necessary ? > > http://mail.scipy.org/pipermail/numpy-discussion/2010-March/049543.html > http://mail.scipy.org/pipermail/numpy-discussion/2010-March/049551.html > .... etc. Oops, I miss this thread... >> Is it to take effect with the incoming numpy 2.0 release ? > > Yes. OK then. I still think the renaming was a bit overkill (we could have explicitly imported numpy.lib.io as such in the tests), but hey... So, when is the released scheduled for? From dwf at cs.toronto.edu Fri Mar 26 19:07:19 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 26 Mar 2010 19:07:19 -0400 Subject: [Numpy-discussion] f2py: "could not crack entity declaration" In-Reply-To: References: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> <2e1434c11003260508p505405c0h564faa1ff7cd1713@mail.gmail.com> Message-ID: <34F6D2C2-AF42-4D56-A351-25C137A0EFE6@cs.toronto.edu> On 26-Mar-10, at 4:25 PM, David Warde-Farley wrote: > That said, I gave that wrapper a whirl and it crashed on me... > > I noticed you added an 'njd' argument to the wrapper for elnet, did > you modify the elnet Fortran function at all? Is it fine to have > arguments in the wrapped version that don't directly correspond to the > raw fortran version? D'oh, nevermind. I looked in your repository and it's a different version of glmnet.f than the one I have. David From sierra_mtnview at sbcglobal.net Fri Mar 26 20:44:05 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 26 Mar 2010 17:44:05 -0700 Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? Message-ID: <4BAD54D5.3080500@sbcglobal.net> An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: Attached Message Part URL: From PHobson at Geosyntec.com Fri Mar 26 22:22:49 2010 From: PHobson at Geosyntec.com (PHobson at Geosyntec.com) Date: Fri, 26 Mar 2010 22:22:49 -0400 Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? In-Reply-To: <4BAD54D5.3080500@sbcglobal.net> References: <4BAD54D5.3080500@sbcglobal.net> Message-ID: Wayne, The current release of Scipy doesn't work perfectly well with Numpy 1.4. On my systems (Mac OS 10.6, WinXP, and Ubuntu), I'm running Numpy 1.4 with the current Scipy on Python 2.6.4. I get the same error you describe below on the first attempt. For some reason unknown to me, it works on the second try. Switching to Numpy 1.3 is the best solution to the error. -paul From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Wayne Watson Sent: Friday, March 26, 2010 5:44 PM To: numpy-discussion at scipy.org Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? I wrote a program in Python 2.5 under Win7 and it runs fine using Numpy 1.2 , but not on a colleague's machine who has a slightly newer 2.5. We both use IDLE to execute the program. During import he gets this: >>> Traceback (most recent call last): File "C:\Documents and Settings\HP_Administrator.DavesDesktop\My Documents\Astro\Meteors\NC-FireballReport.py", line 38, in from scipy import stats as stats # scoreatpercentile File "C:\Python25\lib\site-packages\scipy\stats\__init__.py", line 7, in from stats import * File "C:\Python25\lib\site-packages\scipy\stats\stats.py", line 191, in import scipy.special as special File "C:\Python25\lib\site-packages\scipy\special\__init__.py", line 22, in from numpy.testing import NumpyTest ImportError: cannot import name NumpyTest >>> Comments? -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet Poisoned Shipments. Serious illegal waste dumping may be occuring in the Meditrainean. Radioactive material, mercury, biohazards. -- Sci Am Mag, Feb., 2010, p14f. Web Page: > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Fri Mar 26 23:08:51 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 26 Mar 2010 20:08:51 -0700 Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? In-Reply-To: References: <4BAD54D5.3080500@sbcglobal.net> Message-ID: <4BAD76C3.1020708@sbcglobal.net> An HTML attachment was scrubbed... URL: From ellisonbg.net at gmail.com Sat Mar 27 00:16:02 2010 From: ellisonbg.net at gmail.com (Brian Granger) Date: Fri, 26 Mar 2010 21:16:02 -0700 Subject: [Numpy-discussion] SciPy 2010 Tutorials: brainstorming and call for proposals Message-ID: <6ce0ac131003262116q63fc5766sef16951ca6f6c397@mail.gmail.com> Greetings everyone, This year, there will be two days of tutorials (June 28th and 29th) before the main SciPy 2010 conference. Each of the two tutorial tracks (intro, advanced) will have a 3-4 hour morning and afternoon session both days, for a total of 4 intro sessions and 4 advanced sessions. The main tutorial web page for SciPy 2010 is here: http://conference.scipy.org/scipy2010/tutorials.html We are currently in the process of planning the tutorial sessions. 
You can help us in two ways: Brainstorm/vote on potential tutorial topics ============================================ To help us plan the tutorials, we have setup a web site that allow everyone in the community to brainstorm and vote on tutorial ideas/topics. The website for brainstorming/voting is here: http://conference.scipy.org/scipy2010/tutorialsUV.html The tutorial committee will use this information to help select the tutorials. Please jump in and let us know what tutorial topics you would like to see. Tutorial proposal submissions ============================= We are now accepting tutorial proposals from individuals or teams that would like to present a tutorial. Tutorials should be focused on covering a well defined topic in a hands on manner. We want to see tutorial attendees coding! We are pleased to offer tutorial presenters stipends this year for the first time: * 1 Session: $1,000 (half day) * 2 Sessions: $1,500 (full day) Optionally, part of this stipend can be applied to the presenter's registration costs. To submit a tutorial proposal please submit the following materials to 2010tutorials at scipy.org by April 15: * A short bio of the presenter or team members. * Which track the tutorial would be in (intro or advanced). * A short description and/or outline of the tutorial content. * A list of Python packages that attendees will need to have installed to follow along. From charlesr.harris at gmail.com Sat Mar 27 02:25:48 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 27 Mar 2010 00:25:48 -0600 Subject: [Numpy-discussion] StringIO test failure with Python3.1.2 In-Reply-To: References: <4BAA1EF6.8000002@gmail.com> <4BAA5B87.2050906@gmail.com> Message-ID: On Wed, Mar 24, 2010 at 12:41 PM, Pauli Virtanen > wrote: > Wed, 24 Mar 2010 13:35:51 -0500, Bruce Southey wrote: > [clip] > > elif isinstance(item, collections.Callable): > > File "/usr/local/lib/python3.1/abc.py", line 121, in > > __instancecheck__ > > subclass = instance.__class__ > > AttributeError: 'PyCapsule' object has no attribute '__class__' > > Seems like another Python bug. All objects probably should have the > __class__ attribute... > > Might be related to this also: http://bugs.python.org/issue7624. I don't see any easy way to get a patch from the python repository for the revision that was to supposed to fix that. Anyone know how to do that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Sat Mar 27 09:11:46 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Sat, 27 Mar 2010 09:11:46 -0400 Subject: [Numpy-discussion] numpy.array(arr.flat) mutates arr if arr.flags.fortran: bug? In-Reply-To: References: Message-ID: > You should open a ticket for this. http://projects.scipy.org/numpy/ticket/1439 On Mar 26, 2010, at 11:26 AM, Charles R Harris wrote: > > > On Wed, Mar 24, 2010 at 1:13 PM, Zachary Pincus > wrote: > Hello, > > I assume it is a bug that calling numpy.array() on a flatiter of a > fortran-strided array that owns its own data causes that array to be > rearranged somehow? > > Not sure what happens with a fancier-strided array that also owns its > own data (because I'm not sure how to create one of those in python). > > This is from the latest svn version (2.0.0.dev8302) but was also > present in a previous version too. 
> > Zach > > > In [9]: a = numpy.array([[1,2],[3,4]]).copy('F') > > In [10]: a > Out[10]: > array([[1, 2], > [3, 4]]) > > In [11]: list(a.flat) > Out[11]: [1, 2, 3, 4] > > In [12]: a # no problem > Out[12]: > array([[1, 2], > [3, 4]]) > > In [13]: numpy.array(a.flat) > Out[13]: array([1, 2, 3, 4]) > > In [14]: a # this ain't right! > Out[14]: > array([[1, 3], > [2, 4]]) > > In [15]: a = numpy.array([[1,2],[3,4]]).copy('C') > > In [16]: numpy.array(a.flat) > Out[16]: array([1, 2, 3, 4]) > > In [17]: a > Out[17]: > array([[1, 2], > [3, 4]]) > > > You should open a ticket for this. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From bioinformed at gmail.com Sat Mar 27 11:09:14 2010 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Sat, 27 Mar 2010 11:09:14 -0400 Subject: [Numpy-discussion] f2py: "could not crack entity declaration" In-Reply-To: <34F6D2C2-AF42-4D56-A351-25C137A0EFE6@cs.toronto.edu> References: <26577F31-AED8-4B3D-B3D5-93B7057C70AF@cs.toronto.edu> <2e1434c11003260508p505405c0h564faa1ff7cd1713@mail.gmail.com> <34F6D2C2-AF42-4D56-A351-25C137A0EFE6@cs.toronto.edu> Message-ID: <2e1434c11003270809j45c4834bj2b03a9423ee8979b@mail.gmail.com> On Fri, Mar 26, 2010 at 7:07 PM, David Warde-Farley wrote: > On 26-Mar-10, at 4:25 PM, David Warde-Farley wrote: > > > That said, I gave that wrapper a whirl and it crashed on me... > > > > I noticed you added an 'njd' argument to the wrapper for elnet, did > > you modify the elnet Fortran function at all? Is it fine to have > > arguments in the wrapped version that don't directly correspond to the > > raw fortran version? > > D'oh, nevermind. I looked in your repository and it's a different > version of glmnet.f than the one I have. > > My apologies-- I'd forgotten that I had modified the Fortran code to simplify the wrappers. Hopefully it was still of use to you. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Sat Mar 27 13:00:50 2010 From: rmay31 at gmail.com (Ryan May) Date: Sat, 27 Mar 2010 11:00:50 -0600 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> Message-ID: On Mon, Mar 22, 2010 at 8:14 AM, Ryan May wrote: > On Sun, Mar 21, 2010 at 11:57 PM, ? wrote: >> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: >>> Hi, >>> >>> I found that trapz() doesn't work with subclasses: >>> >>> http://projects.scipy.org/numpy/ticket/1438 >>> >>> A simple patch (attached) to change asarray() to asanyarray() fixes >>> the problem fine. >> >> Are you sure this function works with matrices and other subclasses? >> >> Looking only very briefly at it: the multiplication might be a problem. > > Correct, it probably *is* a problem in some cases with matrices. ?In > this case, I was using quantities (Darren Dale's unit-aware array > package), and the result was that units were stripped off. > > The patch can't make trapz() work with all subclasses. However, right > now, you have *no* hope of getting a subclass out of trapz(). ?With > this change, subclasses that don't redefine operators can work fine. > If you're passing a Matrix to trapz() and expecting it to work, IMHO > you're doing it wrong. ?You can still pass one in by using asarray() > yourself. 
?Without this patch, I'm left with copying and maintaining a > copy of the code elsewhere, just so I can loosen the function's input > processing. That seems wrong, since there's really no need in my case > to drop down to an ndarray. The input I'm giving it supports all the > operations it needs, so it should just work with my original input. Anyone else care to weigh in here? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From josef.pktd at gmail.com Sat Mar 27 13:12:50 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Mar 2010 13:12:50 -0400 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> Message-ID: <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> On Sat, Mar 27, 2010 at 1:00 PM, Ryan May wrote: > On Mon, Mar 22, 2010 at 8:14 AM, Ryan May wrote: >> On Sun, Mar 21, 2010 at 11:57 PM, ? wrote: >>> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: >>>> Hi, >>>> >>>> I found that trapz() doesn't work with subclasses: >>>> >>>> http://projects.scipy.org/numpy/ticket/1438 >>>> >>>> A simple patch (attached) to change asarray() to asanyarray() fixes >>>> the problem fine. >>> >>> Are you sure this function works with matrices and other subclasses? >>> >>> Looking only very briefly at it: the multiplication might be a problem. >> >> Correct, it probably *is* a problem in some cases with matrices. ?In >> this case, I was using quantities (Darren Dale's unit-aware array >> package), and the result was that units were stripped off. >> >> The patch can't make trapz() work with all subclasses. However, right >> now, you have *no* hope of getting a subclass out of trapz(). ?With >> this change, subclasses that don't redefine operators can work fine. >> If you're passing a Matrix to trapz() and expecting it to work, IMHO >> you're doing it wrong. ?You can still pass one in by using asarray() >> yourself. ?Without this patch, I'm left with copying and maintaining a >> copy of the code elsewhere, just so I can loosen the function's input >> processing. That seems wrong, since there's really no need in my case >> to drop down to an ndarray. The input I'm giving it supports all the >> operations it needs, so it should just work with my original input. With asarray it gives correct results for matrices and all array_like and subclasses, it just doesn't preserve the type. Your patch would break matrices and possibly other types, masked_arrays?, ... One solution would be using arraywrap as in numpy.linalg. for related discussion: http://mail.scipy.org/pipermail/scipy-dev/2009-June/012061.html Josef > > Anyone else care to weigh in here? 
> > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rmay31 at gmail.com Sat Mar 27 14:31:24 2010 From: rmay31 at gmail.com (Ryan May) Date: Sat, 27 Mar 2010 12:31:24 -0600 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> Message-ID: On Sat, Mar 27, 2010 at 11:12 AM, wrote: > On Sat, Mar 27, 2010 at 1:00 PM, Ryan May wrote: >> On Mon, Mar 22, 2010 at 8:14 AM, Ryan May wrote: >>> On Sun, Mar 21, 2010 at 11:57 PM, ? wrote: >>>> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: >>>>> Hi, >>>>> >>>>> I found that trapz() doesn't work with subclasses: >>>>> >>>>> http://projects.scipy.org/numpy/ticket/1438 >>>>> >>>>> A simple patch (attached) to change asarray() to asanyarray() fixes >>>>> the problem fine. >>>> >>>> Are you sure this function works with matrices and other subclasses? >>>> >>>> Looking only very briefly at it: the multiplication might be a problem. >>> >>> Correct, it probably *is* a problem in some cases with matrices. ?In >>> this case, I was using quantities (Darren Dale's unit-aware array >>> package), and the result was that units were stripped off. >>> >>> The patch can't make trapz() work with all subclasses. However, right >>> now, you have *no* hope of getting a subclass out of trapz(). ?With >>> this change, subclasses that don't redefine operators can work fine. >>> If you're passing a Matrix to trapz() and expecting it to work, IMHO >>> you're doing it wrong. ?You can still pass one in by using asarray() >>> yourself. ?Without this patch, I'm left with copying and maintaining a >>> copy of the code elsewhere, just so I can loosen the function's input >>> processing. That seems wrong, since there's really no need in my case >>> to drop down to an ndarray. The input I'm giving it supports all the >>> operations it needs, so it should just work with my original input. > > With asarray it gives correct results for matrices and all array_like > and subclasses, it just doesn't preserve the type. > Your patch would break matrices and possibly other types, masked_arrays?, ... It would break matrices, yes. I would argue that masked arrays are already broken with trapz: In [1]: x = np.arange(10) In [2]: y = x * x In [3]: np.trapz(y, x) Out[3]: 244.5 In [4]: ym = np.ma.array(y, mask=(x>4)&(x<7)) In [5]: np.trapz(ym, x) Out[5]: 244.5 In [6]: y[5:7] = 0 In [7]: ym = np.ma.array(y, mask=(x>4)&(x<7)) In [8]: np.trapz(ym, x) Out[8]: 183.5 Because of the call to asarray(), the mask is completely discarded and you end up with identical results to an unmasked array, which is not what I'd expect. Worse, the actual numeric value of the positions that were masked affect the final answer. My patch allows this to work as expected too. > One solution would be using arraywrap as in numpy.linalg. By arraywrap, I'm assuming you mean: def _makearray(a): new = asarray(a) wrap = getattr(a, "__array_prepare__", new.__array_wrap__) return new, wrap I'm not sure if that's identical to just letting the subclass handle what's needed. To my eyes, that doesn't look as though it'd be equivalent, both for handling masked arrays and Quantities. 
For quantities at least, the result of trapz will have different units than either of the inputs. > for related discussion: > http://mail.scipy.org/pipermail/scipy-dev/2009-June/012061.html Actually, that discussion kind of makes my point. Matrices are a pain to make work in a general sense because they *break* ndarray conventions--to me it doesn't make sense to help along classes that break convention at the expense of making well-behaved classes a pain to use. You should need an *explicit* cast of a matrix to an ndarray instead of the function quietly doing it for you. ("Explicit is better than implicit") It just seems absurd that if I make my own ndarray subclass that *just* adds some behavior to the array, but doesn't break *any* operations, I need to do one of the following: 1) Have my own copy of trapz that works with my class 2) Wrap every call to numpy's own trapz() to put the metadata back. Does it not seem backwards that the class that breaks conventions "just works" while those that don't break conventions, will work perfectly with the function as written, need help to be treated properly? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From msarahan at gmail.com Sat Mar 27 19:38:27 2010 From: msarahan at gmail.com (Mike Sarahan) Date: Sat, 27 Mar 2010 16:38:27 -0700 Subject: [Numpy-discussion] Dealing with roundoff error Message-ID: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> Hi all, I have run into some roundoff problems trying to line up some experimental spectra. The x coordinates are given in intervals of 0.1 units. I read the data in from a text file using np.loadtxt(). I think Robert's post here explains why the problem exists: http://mail.scipy.org/pipermail/numpy-discussion/2007-June/028133.html However, even linspace shows roundoff error: a=np.linspace(0.0,10.0,endpoint=False) b=np.linspace(0.1,10.1,endpoint=False) np.sum(a[1:]==b[:-1]) # Gives me 72, no 100 What is the best way to deal with it? Multiply the intervals by 10, then convert them to ints? Thanks, Mike From andrea.gavana at gmail.com Sat Mar 27 20:24:01 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 28 Mar 2010 00:24:01 +0000 Subject: [Numpy-discussion] Interpolation question Message-ID: Hi All, I have an interpolation problem and I am having some difficulties in tackling it. I hope I can explain myself clearly enough. Basically, I have a whole bunch of 3D fluid flow simulations (close to 1000), and they are a result of different combinations of parameters. I was planning to use the Radial Basis Functions in scipy, but for the moment let's assume, to simplify things, that I am dealing only with one parameter (x). In 1000 simulations, this parameter x has 1000 values, obviously. The problem is, the outcome of every single simulation is a vector of oil production over time (let's say 40 values per simulation, one per year), and I would like to be able to interpolate my x parameter (1000 values) against all the simulations (1000x40) and get an approximating function that, given another x parameter (of size 1x1) will give me back an interpolated production profile (of size 1x40). Something along these lines: import numpy as np from scipy.interpolate import Rbf # x.shape = (1000, 1) # y.shape = (1000, 40) rbf = Rbf(x, y) # New result with xi.shape = (1, 1) --> fi.shape = (1, 40) fi = rbf(xi) Does anyone have a suggestion on how I could implement this? Sorry if it sounds confused... 
Please feel free to correct any wrong assumptions I have made, or to propose other approaches if you think RBFs are not suitable for this kind of problems. Thank you in advance for your suggestions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From josef.pktd at gmail.com Sat Mar 27 21:14:08 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Mar 2010 21:14:08 -0400 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: Message-ID: <1cd32cbb1003271814l425ba096u4201a978f4fa175@mail.gmail.com> On Sat, Mar 27, 2010 at 8:24 PM, Andrea Gavana wrote: > Hi All, > > ? ?I have an interpolation problem and I am having some difficulties > in tackling it. I hope I can explain myself clearly enough. > > Basically, I have a whole bunch of 3D fluid flow simulations (close to > 1000), and they are a result of different combinations of parameters. > I was planning to use the Radial Basis Functions in scipy, but for the > moment let's assume, to simplify things, that I am dealing only with > one parameter (x). In 1000 simulations, this parameter x has 1000 > values, obviously. The problem is, the outcome of every single > simulation is a vector of oil production over time (let's say 40 > values per simulation, one per year), and I would like to be able to > interpolate my x parameter (1000 values) against all the simulations > (1000x40) and get an approximating function that, given another x > parameter (of size 1x1) will give me back an interpolated production > profile (of size 1x40). > > Something along these lines: > > import numpy as np > from scipy.interpolate import Rbf > > # x.shape = (1000, 1) > # y.shape = (1000, 40) > > rbf = Rbf(x, y) > > # New result with xi.shape = (1, 1) --> fi.shape = (1, 40) > fi = rbf(xi) > > > Does anyone have a suggestion on how I could implement this? Sorry if > it sounds confused... Please feel free to correct any wrong > assumptions I have made, or to propose other approaches if you think > RBFs are not suitable for this kind of problems. if I understand correctly then you have a function (oil production y) over time, observed at 40 time periods (t), and this function is parameterized by a single parameter x. y = f(t,x) t = np.arange(40) len(x) = 1000 blowing up t and x you would have a grid of 1000*40 with 40000 observations in t,x and y rbf = Rbf(t, x, y) might theoretically work, however I think the arrays are too large, the full distance matrix would be 40000*40000 I don't know if bivariate splines would be able to handle this size. How much noise do you have in the simulations? If there is not much noise, and you are mainly interested in interpolating for x, then I would interpolate pointwise for each of the 40 t y = f(x) for each t, which has 1000 observations. This might still be too much for RBF but some of the other univariate interpolators should be able to handle it. I would try this first, because it is the easiest to do. This could be extended to interpolating in a second stage for t if necessary. If there is a lot of noise, some interpolation local in t (and maybe x) might be able to handle it, and combine information across t. An alternative would be to model y as a function of t semi-parametrically, e.g. 
with polynomials, and interpolate the coefficients of the polynomial across x, or try some of the interpolators that are designed for images, which might have a better chance for the array size, but there I don't know anything. Just some thoughts, Josef > Thank you in advance for your suggestions. > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Sat Mar 27 22:23:03 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Mar 2010 22:23:03 -0400 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> Message-ID: <1cd32cbb1003271923s2fc5c92bq7bd50102486b02e7@mail.gmail.com> On Sat, Mar 27, 2010 at 2:31 PM, Ryan May wrote: > On Sat, Mar 27, 2010 at 11:12 AM, ? wrote: >> On Sat, Mar 27, 2010 at 1:00 PM, Ryan May wrote: >>> On Mon, Mar 22, 2010 at 8:14 AM, Ryan May wrote: >>>> On Sun, Mar 21, 2010 at 11:57 PM, ? wrote: >>>>> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: >>>>>> Hi, >>>>>> >>>>>> I found that trapz() doesn't work with subclasses: >>>>>> >>>>>> http://projects.scipy.org/numpy/ticket/1438 >>>>>> >>>>>> A simple patch (attached) to change asarray() to asanyarray() fixes >>>>>> the problem fine. >>>>> >>>>> Are you sure this function works with matrices and other subclasses? >>>>> >>>>> Looking only very briefly at it: the multiplication might be a problem. >>>> >>>> Correct, it probably *is* a problem in some cases with matrices. ?In >>>> this case, I was using quantities (Darren Dale's unit-aware array >>>> package), and the result was that units were stripped off. >>>> >>>> The patch can't make trapz() work with all subclasses. However, right >>>> now, you have *no* hope of getting a subclass out of trapz(). ?With >>>> this change, subclasses that don't redefine operators can work fine. >>>> If you're passing a Matrix to trapz() and expecting it to work, IMHO >>>> you're doing it wrong. ?You can still pass one in by using asarray() >>>> yourself. ?Without this patch, I'm left with copying and maintaining a >>>> copy of the code elsewhere, just so I can loosen the function's input >>>> processing. That seems wrong, since there's really no need in my case >>>> to drop down to an ndarray. The input I'm giving it supports all the >>>> operations it needs, so it should just work with my original input. >> >> With asarray it gives correct results for matrices and all array_like >> and subclasses, it just doesn't preserve the type. >> Your patch would break matrices and possibly other types, masked_arrays?, ... > > It would break matrices, yes. 
?I would argue that masked arrays are > already broken with trapz: > > In [1]: x = np.arange(10) > > In [2]: y = x * x > > In [3]: np.trapz(y, x) > Out[3]: 244.5 > > In [4]: ym = np.ma.array(y, mask=(x>4)&(x<7)) > > In [5]: np.trapz(ym, x) > Out[5]: 244.5 > > In [6]: y[5:7] = 0 > > In [7]: ym = np.ma.array(y, mask=(x>4)&(x<7)) > > In [8]: np.trapz(ym, x) > Out[8]: 183.5 > > Because of the call to asarray(), the mask is completely discarded and > you end up with identical results to an unmasked array, > which is not what I'd expect. ?Worse, the actual numeric value of the > positions that were masked affect the final answer. My patch allows > this to work as expected too. > >> One solution would be using arraywrap as in numpy.linalg. > > By arraywrap, I'm assuming you mean: > > def _makearray(a): > ? ?new = asarray(a) > ? ?wrap = getattr(a, "__array_prepare__", new.__array_wrap__) > ? ?return new, wrap > > I'm not sure if that's identical to just letting the subclass handle > what's needed. ?To my eyes, that doesn't look as though it'd be > equivalent, both for handling masked arrays and Quantities. For > quantities at least, the result of trapz will have different units > than either of the inputs. > >> for related discussion: >> http://mail.scipy.org/pipermail/scipy-dev/2009-June/012061.html > > Actually, that discussion kind of makes my point. ?Matrices are a pain > to make work in a general sense because they *break* ndarray > conventions--to me it doesn't make sense to help along classes that > break convention at the expense of making well-behaved classes a pain > to use. ?You should need an *explicit* cast of a matrix to an ndarray > instead of the function quietly doing it for you. ("Explicit is better > than implicit") It just seems absurd that if I make my own ndarray > subclass that *just* adds some behavior to the array, but doesn't > break *any* operations, I need to do one of the following: > > 1) Have my own copy of trapz that works with my class > 2) Wrap every call to numpy's own trapz() to put the metadata back. > > Does it not seem backwards that the class that breaks conventions > "just works" while those that don't break conventions, will work > perfectly with the function as written, need help to be treated > properly? (trying again, firefox or gmail deleted my earlier response instead of sending it) Matrices have been part of numpy for a long time and your patch would break backwards compatibility in a pretty serious way. subclasses of ndarray, like masked_arrays and quantities, and classes that delegate to array calculations, like pandas, can redefine anything. So there is not much that can be relied on if any subclass is allowed to be used inside a function e.g. quantities redefines sin, cos,... http://packages.python.org/quantities/user/issues.html#umath-functions What happens if you call fft with a quantity array? For ndarrays I know that (x*y).sum(0) is equivalent to np.dot(x.T,y) for appropriate y and x (e.g. for a faster calculation of means), so, it's just an implementation detail what I use. If a subclass redefines ufuncs but not linalg, then these two can be different. Just a thought, what happens if an existing function for ndarrays is cythonized? I guess the behavior for subclasses might change if they were not converted to ndarrays. So it appears to me that the only feasible way *in general* is to have separate functions or wrappers like np.ma. Except for simple functions and ufuncs, it would be a lot of work and fragile to allow asanyarray. 
And, as we discussed in a recent thread on masked arrays (and histogram), it would push the work on the function writer instead of the ones that are creating new subclasses. Of course, the behavior in numpy and scipy can be improved, and trapz may be simple enough to change, but I don't think a patch that breaks backwards compatibility pretty seriously and is not accompanied by sufficient tests should go into numpy or scipy. (On the other hand, I'm very slowly getting used to the pattern that for a simple function, 10% is calculation and 90% is interface code.) Josef > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rmay31 at gmail.com Sat Mar 27 23:37:23 2010 From: rmay31 at gmail.com (Ryan May) Date: Sat, 27 Mar 2010 21:37:23 -0600 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: <1cd32cbb1003271923s2fc5c92bq7bd50102486b02e7@mail.gmail.com> References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> <1cd32cbb1003271923s2fc5c92bq7bd50102486b02e7@mail.gmail.com> Message-ID: On Sat, Mar 27, 2010 at 8:23 PM, wrote: > Matrices have been part of numpy for a long time and your patch would > break backwards compatibility in a pretty serious way. Yeah, and I should admit that I realize that makes this particular patch a no-go. However, that to me doesn't put the issue to bed for any future code that gets written (see below). > subclasses of ndarray, like masked_arrays and quantities, and classes > that delegate to array calculations, like pandas, can redefine > anything. So there is not much that can be relied on if any subclass > is allowed to be used inside a function > > e.g. quantities redefines sin, cos,... > http://packages.python.org/quantities/user/issues.html#umath-functions > What happens if you call fft with a quantity array? Probably ends up casting to an ndarray. But that's a complex operation that I can live with not working. It's coded in C and can't be implemented quickly using array methods. And in this > Except for simple functions and ufuncs, it would be a lot of work and > fragile to allow asanyarray. And, as we discussed in a recent thread > on masked arrays (and histogram), it would push the work on the > function writer instead of the ones that are creating new subclasses. I disagree in this case. I think the function writer should only be burdened to try to use array methods rather than numpy functions, if possible, and avoiding casts other than asanyarray() at all costs. I think we shouldn't be scared of getting an error when a subclass is passed to a function, because that's an indication to the programmer that it doesn't work with what you're passing in and you need to *explicitly* cast it to an ndarray. Having the function do the cast for you is: 1) magical and implicit 2) Forces an unnecessary cast on those who would otherwise work fine. I get errors when I try to pass structured arrays to math functions, but I don't see numpy casting that away. > Of course, the behavior in numpy and scipy can be improved, and trapz > may be simple enough to change, but I don't think a patch that breaks > backwards compatibility pretty seriously and is not accompanied by > sufficient tests should go into numpy or scipy. 
If sufficient tests is the only thing holding this back, let me know. I'll get to coding. But I can't argue with the backwards incompatibility. At this point, I think I'm more trying to see if there's any agreement that: casting *everyone* because some class breaks behavior is a bad idea. The programmer can always make it work by explicitly asking for the cast, but there's no way for the programmer to ask the function *not* to cast the data. Hell, I'd be happy if trapz took a flag just telling it subok=True. > (On the other hand, I'm very slowly getting used to the pattern that > for a simple function, 10% is calculation and 90% is interface code.) Yeah, it's kind of annoying, since the 10% is the cool part you want, and that 90% is thorny to design and boring to code. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From josef.pktd at gmail.com Sun Mar 28 01:11:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 28 Mar 2010 01:11:16 -0400 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> <1cd32cbb1003271923s2fc5c92bq7bd50102486b02e7@mail.gmail.com> Message-ID: <1cd32cbb1003272211l30893431i1698273a5562fd75@mail.gmail.com> On Sat, Mar 27, 2010 at 11:37 PM, Ryan May wrote: > On Sat, Mar 27, 2010 at 8:23 PM, ? wrote: >> Matrices have been part of numpy for a long time and your patch would >> break backwards compatibility in a pretty serious way. > > Yeah, and I should admit that I realize that makes this particular > patch a no-go. However, that to me doesn't put the issue to bed for > any future code that gets written (see below). > >> subclasses of ndarray, like masked_arrays and quantities, and classes >> that delegate to array calculations, like pandas, can redefine >> anything. So there is not much that can be relied on if any subclass >> is allowed to be used inside a function >> >> e.g. quantities redefines sin, cos,... >> http://packages.python.org/quantities/user/issues.html#umath-functions >> What happens if you call fft with a quantity array? > > Probably ends up casting to an ndarray. But that's a complex operation > that I can live with not working. It's coded in C and can't be > implemented quickly using array methods. And in this > >> Except for simple functions and ufuncs, it would be a lot of work and >> fragile to allow asanyarray. And, as we discussed in a recent thread >> on masked arrays (and histogram), it would push the work on the >> function writer instead of the ones that are creating new subclasses. > > I disagree in this case. ?I think the function writer should only be > burdened to try to use array methods rather than numpy functions, if > possible, and avoiding casts other than asanyarray() at all costs. ?I > think we shouldn't be scared of getting an error when a subclass is > passed to a function, because that's an indication to the programmer > that it doesn't work with what you're passing in and you need to > *explicitly* cast it to an ndarray. Having the function do the cast > for you is: 1) magical and implicit 2) Forces an unnecessary cast on > those who would otherwise work fine. I get errors when I try to pass > structured arrays to math functions, but I don't see numpy casting > that away. the problem is quality control and testing. 
If the cast is automatically done, then if I feed anything array_like into the function, I only have to pay attention that casting to ndarray works as intended (e.g. it doesn't work with masked arrays with inappropriate masked values). If the casting is correct, I know that I get correct numbers back out, even if I have to reattach the meta information or convert the type again. With asanyarray, anything could happen, including getting numbers back that are wrong, maybe obviously so, maybe not. (structured arrays at least have the advantage that an exception is thrown pretty fast.) And from what I have seen so far, testing is not a high priority for many users. Overall, I think that there are very few functions that are simple enough that asanyarray would work, without wrapping and casting (at least internally). In scipy.stats we converted a few, but for most functions I don't want to spend the time thinking how it can be written so we can have the internal code use anyarray (main problem are matrices, masked_array and arrays with nans need special code anyway, structured dtypes are mostly useless for calculations without creating a view) e.g. what is the quantities outcome of a call to np.corrcoef, np.cov ? And as long as not every function returns a consistent result, changes in implementation details can affect the outcome, and then it will be a lot of fun hunting for bugs. e.g. stats.linregress, dot or explicit sum of squares or np.cov, or ... Are you sure you always get the same result with quantities? > >> Of course, the behavior in numpy and scipy can be improved, and trapz >> may be simple enough to change, but I don't think a patch that breaks >> backwards compatibility pretty seriously and is not accompanied by >> sufficient tests should go into numpy or scipy. > > If sufficient tests is the only thing holding this back, let me know. > I'll get to coding. > > But I can't argue with the backwards incompatibility. At this point, I > think I'm more trying to see if there's any agreement that: casting > *everyone* because some class breaks behavior is a bad idea. ?The > programmer can always make it work by explicitly asking for the cast, > but there's no way for the programmer to ask the function *not* to > cast the data. Hell, I'd be happy if trapz took a flag just telling it > subok=True. I thought a while ago that subclasses or even classes that implement an array_like interface should have an attribute to signal this, like iamalgebrasafe, or subok or dontcast. The freedom or choice not to get cast to ndarray is desirable, but the increase in bugs and bug reports won't be much fun. And the user with the next subclass will argue that numpy/scipy should do the casting because it's too much wrapping code that has to be build around every function. (Just as a related aside, in statsmodels I'm also still trying hard to keep the main models to use ndarrays only, either it becomes to specialized if it is based on a specific class, or it requires a lot of wrapping code. I don't think your proposal, to just let any array class in, will get very far before raising an exception or producing incorrect numbers (except maybe for array subclasses that don't change any numerical behavior.) That's my opinion, maybe others see it in a different way. But in any case, it should be possible to change individual functions even if the overall policy doesn't change. 
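For something as small as trapz, the wrapper route mentioned above can be sketched outside numpy: compute on plain ndarrays, then give the input's class a chance to re-wrap the result. This is only an illustration of the pattern: it preserves the type but not metadata such as a mask or units, which is exactly the limitation being debated here:

import numpy as np

def trapz_wrapped(y, x=None, dx=1.0, axis=-1):
    # hedged sketch of an np.ma-style wrapper: do the arithmetic on plain
    # ndarrays, then hand the result back through __array_wrap__ if present
    wrap = getattr(y, '__array_wrap__', None)
    y_arr = np.asarray(y)
    x_arr = None if x is None else np.asarray(x)
    res = np.trapz(y_arr, x_arr, dx=dx, axis=axis)
    if wrap is not None and type(y) is not np.ndarray:
        res = wrap(np.asarray(res))
    return res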
Josef > >> (On the other hand, I'm very slowly getting used to the pattern that >> for a simple function, 10% is calculation and 90% is interface code.) > > Yeah, it's kind of annoying, since the 10% is the cool part you want, > and that 90% is thorny to design and boring to code. > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Sun Mar 28 02:25:22 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 28 Mar 2010 09:25:22 +0300 Subject: [Numpy-discussion] Dealing with roundoff error In-Reply-To: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> References: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> Message-ID: <1269757522.9704.4.camel@Nokia-N900-42-11> Mike Sarahan wrote: > However, even linspace shows roundoff error: > > a=np.linspace(0.0,10.0,endpoint=False) > b=np.linspace(0.1,10.1,endpoint=False) > np.sum(a[1:]==b[:-1])? # Gives me 72, no 100 Are you sure equally spaced floating point numbers having this property even exist? 0.1 does not have a terminating representation in base-2: 0.1_10 = 0.0001100110011001100110011.._2 -- Pauli Virtanen From dsdale24 at gmail.com Sun Mar 28 03:29:30 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Sun, 28 Mar 2010 03:29:30 -0400 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: <1cd32cbb1003271923s2fc5c92bq7bd50102486b02e7@mail.gmail.com> References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> <1cd32cbb1003271923s2fc5c92bq7bd50102486b02e7@mail.gmail.com> Message-ID: On Sat, Mar 27, 2010 at 10:23 PM, wrote: > subclasses of ndarray, like masked_arrays and quantities, and classes > that delegate to array calculations, like pandas, can redefine > anything. So there is not much that can be relied on if any subclass > is allowed to be used inside a function > > e.g. quantities redefines sin, cos,... > http://packages.python.org/quantities/user/issues.html#umath-functions Those functions were only intended to be used in the short term, until the ufuncs that ship with numpy included a mechanism that allowed quantity arrays to propagate the units. It would be nice to have a mechanism (like we have discussed briefly just recently on this list) where there is a single entry point to a given function like add, but subclasses can tweak the execution. We discussed the possibility of simplifying the wrapping scheme with a method like __handle_gfunc__. (I don't think this necessarily has to be limited to ufuncs.) I think a second method like __prepare_input__ is also necessary. Imagine something like: class GenericFunction: @property def executable(self): return self._executable def __init__(self, executable): self._executable = executable def __call__(self, *args, **kwargs): # find the input with highest priority, and then: args, kwargs = input.__prepare_input__(self, *args, **kwargs) return input.__handle_gfunc__(self, *args, **kwargs) # this is the core function to be passed to the generic class: def _add(a, b, out=None): # the generic, ndarray implementation. ... 
# here is the publicly exposed interface: add = GenericFunction(_add) # now my subclasses class MyArray(ndarray): # My class tweaks the execution of the function in __handle_gfunc__ def __prepare_input__(self, gfunc, *args, **kwargs): return mod_input[gfunc](*args, **kwargs) def __handle_gfunc__(self, gfunc, *args, **kwargs): res = gfunc.executable(*args, **kwargs) # you could have called a different core func there return mod_output[gfunc](res, *args, **kwargs) class MyNextArray(MyArray): def __prepare_input__(self, gfunc, *args, **kwargs): # let the superclass do its thing: args, kwargs = MyArray.__prepare_input__(self, gfunc, *args, **kwargs) # now I can tweak it further: return mod_input_further[gfunc](*args, **kwargs) def __handle_gfunc__(self, gfunc, *args, **kwargs): # let's defer to the superclass to handle calling the core function: res = MyArray.__handle_gfunc__(self, gfunc, *args, **kwargs) # and now we have one more crack at the result before passing it back: return mod_output_further[gfunc](res, *args, **kwargs) If a gfunc is not recognized, the subclass might raise a NotImplementedError or it might just pass the original args, kwargs on through. I didn't write that part out because the example was already running long. But the point is that a single entry point could be used for any subclass, without having to worry about how to support every subclass. Darren From peridot.faceted at gmail.com Sun Mar 28 04:14:29 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 28 Mar 2010 04:14:29 -0400 Subject: [Numpy-discussion] Dealing with roundoff error In-Reply-To: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> References: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> Message-ID: On 27 March 2010 19:38, Mike Sarahan wrote: > Hi all, > > I have run into some roundoff problems trying to line up some > experimental spectra. ?The x coordinates are given in intervals of 0.1 > units. ?I read the data in from a text file using np.loadtxt(). > > I think Robert's post here explains why the problem exists: > http://mail.scipy.org/pipermail/numpy-discussion/2007-June/028133.html > > However, even linspace shows roundoff error: > > a=np.linspace(0.0,10.0,endpoint=False) > b=np.linspace(0.1,10.1,endpoint=False) > np.sum(a[1:]==b[:-1]) ?# Gives me 72, no 100 > > What is the best way to deal with it? ?Multiply the intervals by 10, > then convert them to ints? It is almost never a good idea to compare floats for equality. (Exceptions include mostly situations where the float is not being operated on at all.) If your problem is that your spectra are really sampled at the same points but the floats coming out are slightly different, it's probably enough to test for abs(x-y) Thanks, > Mike > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From peridot.faceted at gmail.com Sun Mar 28 04:26:28 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 28 Mar 2010 04:26:28 -0400 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: Message-ID: On 27 March 2010 20:24, Andrea Gavana wrote: > Hi All, > > ? ?I have an interpolation problem and I am having some difficulties > in tackling it. I hope I can explain myself clearly enough. > > Basically, I have a whole bunch of 3D fluid flow simulations (close to > 1000), and they are a result of different combinations of parameters. 
> I was planning to use the Radial Basis Functions in scipy, but for the > moment let's assume, to simplify things, that I am dealing only with > one parameter (x). In 1000 simulations, this parameter x has 1000 > values, obviously. The problem is, the outcome of every single > simulation is a vector of oil production over time (let's say 40 > values per simulation, one per year), and I would like to be able to > interpolate my x parameter (1000 values) against all the simulations > (1000x40) and get an approximating function that, given another x > parameter (of size 1x1) will give me back an interpolated production > profile (of size 1x40). If I understand your problem correctly, you have a function taking one value as input (or one 3D vector) and returning a vector of length 40. You want to know whether there are tools in scipy to support this. I'll say first that it's not strictly necessary for there to be: you could always just build 40 different interpolators, one for each component of the output. After all, there's no interaction in the calculations between the output coordinates. This is of course awkward, in that you'd like to just call F(x) and get back a vector of length 40, but that can be remedied by writing a short wrapper function that simply calls all 40 interpolators. A problem that may be more serious is that the python loop over the 40 interpolators can be slow, while a C implementation would give vector-valued results rapidly. To make this work, you're going to have to find a compiled-code interpolator that returns vector values. This is not in principle complicated, since they just need to run the same interpolator code on 40 sets of coefficients. But I don't think many of scipy's interpolators do this. The only ones I'm sure are able to do this are the polynomial interpolators I wrote, which work only on univariate inputs (and provide no kind of smoothing). If you're going to use these I recommend using scipy's spline functions to construct smoothing splines, which you'd then convert to a piecewise cubic. But I'd say, give the 40 interpolators a try. If they're slow, try interpolating many points at once: rather than giving just one x value, call the interpolators with a thousand (or however many you need) at a time. This will speed up scipy's interpolators, and it will make the overhead of a forty-element loop negligible. Anne > Something along these lines: > > import numpy as np > from scipy.interpolate import Rbf > > # x.shape = (1000, 1) > # y.shape = (1000, 40) > > rbf = Rbf(x, y) > > # New result with xi.shape = (1, 1) --> fi.shape = (1, 40) > fi = rbf(xi) > > > Does anyone have a suggestion on how I could implement this? Sorry if > it sounds confused... Please feel free to correct any wrong > assumptions I have made, or to propose other approaches if you think > RBFs are not suitable for this kind of problems. > > Thank you in advance for your suggestions. > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. 
> http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andrea.gavana at gmail.com Sun Mar 28 06:33:01 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 28 Mar 2010 10:33:01 +0000 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: Message-ID: On 28 March 2010 08:26, Anne Archibald wrote: > On 27 March 2010 20:24, Andrea Gavana wrote: >> Hi All, >> >> ? ?I have an interpolation problem and I am having some difficulties >> in tackling it. I hope I can explain myself clearly enough. >> >> Basically, I have a whole bunch of 3D fluid flow simulations (close to >> 1000), and they are a result of different combinations of parameters. >> I was planning to use the Radial Basis Functions in scipy, but for the >> moment let's assume, to simplify things, that I am dealing only with >> one parameter (x). In 1000 simulations, this parameter x has 1000 >> values, obviously. The problem is, the outcome of every single >> simulation is a vector of oil production over time (let's say 40 >> values per simulation, one per year), and I would like to be able to >> interpolate my x parameter (1000 values) against all the simulations >> (1000x40) and get an approximating function that, given another x >> parameter (of size 1x1) will give me back an interpolated production >> profile (of size 1x40). > > If I understand your problem correctly, you have a function taking one > value as input (or one 3D vector) and returning a vector of length 40. > You want to know whether there are tools in scipy to support this. > > I'll say first that it's not strictly necessary for there to be: you > could always just build 40 different interpolators, one for each > component of the output. After all, there's no interaction in the > calculations between the output coordinates. This is of course > awkward, in that you'd like to just call F(x) and get back a vector of > length 40, but that can be remedied by writing a short wrapper > function that simply calls all 40 interpolators. Thank you Anne and Josef, my explanation was very bad but your suggestions opened up my mind :-D . I believe I am going to give the 40 interpolators a try, although you mentioned that RBFs are going to have some problems if the vectors' sizes are too big... I planned for a multidimensional interpolation of about 10 parameters (each of these has 1000 elements), but at this point I am afraid it will not work. If any of you is aware of another methodology/library I could use (Fortran is also fine, as long as I can wrap it with f2py) for this problem please feel free to put me on the right track. Thank you again for your suggestions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From dsdale24 at gmail.com Sun Mar 28 09:23:06 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Sun, 28 Mar 2010 09:23:06 -0400 Subject: [Numpy-discussion] ufunc improvements [Was: Warnings in numpy.ma.test()] In-Reply-To: References: Message-ID: I'd like to use this thread to discuss possible improvements to generalize numpys functions. 
Sorry for double posting, but we will have a hard time keeping track of discussion about how to improve functions to deal with subclasses if they are spread across threads talking about warnings in masked arrays or masked arrays not dealing well with trapz. There is an additional bit at the end that was not discussed elsewhere. On Thu, Mar 18, 2010 at 8:14 AM, Darren Dale wrote: > On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris > wrote: >> Just *one* function to rule them all and on the subtype dump it. No >> __array_wrap__, __input_prepare__, or __array_prepare__, just something like >> __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing >> having the ufunc upper layer do nothing but decide which argument type will >> do all the rest of the work, casting, calling the low level ufunc base, >> providing buffers, wrapping, etc. Instead of pasting bits and pieces into >> the existing framework I would like to lay out a line of attack that ends up >> separating ufuncs into smaller pieces that provide low level routines that >> work on strided memory while leaving policy implementation to the subtype. >> There would need to be some default type (ndarray) when the functions are >> called on nested lists and scalars and I'm not sure of the best way to >> handle that. >> >> I'm just sort of thinking out loud, don't take it too seriously. > > Thanks for the clarification. I think I see how this could work: if > ufuncs were callable instances of classes, __call__ would find the > input with highest priority and pass itself and the input to that > object's __handle_ufunc__. Now it is up to __handle_ufunc__ to > determine whether and how to modify the input, call some method on the > ufunc (like execute) > to perform the buffer operation, then __handle_ufunc__ performs the > cast, deals with metadata and returns the result. > > I skipped a step: initializing the output buffer. Would that be rolled > into the ufunc execution, or should it be possible for > __handle_ufunc__ to access the initialized buffer before execution > occurs(__array_prepare__)? I think it is important to be able to > perform the cast and calculate metadata before ufunc execution. If an > error occurs, an exception can be raised before the ufunc operates on > the arrays, which can modifies the data in place. We discussed the possibility of simplifying the wrapping scheme with a method like __handle_gfunc__. (I don't think this necessarily has to be limited to ufuncs.) I think a second method like __prepare_input__ is also necessary. Imagine something like: class GenericFunction: @property def executable(self): return self._executable def __init__(self, executable): self._executable = executable def __call__(self, *args, **kwargs): # find the input with highest priority, and then: args, kwargs = input.__prepare_input__(self, *args, **kwargs) return input.__handle_gfunc__(self, *args, **kwargs) # this is the core function to be passed to the generic class: def _add(a, b, out=None): # the generic, ndarray implementation. ... 
# here is the publicly exposed interface: add = GenericFunction(_add) # now my subclasses class MyArray(ndarray): # My class tweaks the execution of the function in __handle_gfunc__ def __prepare_input__(self, gfunc, *args, **kwargs): return mod_input[gfunc](*args, **kwargs) def __handle_gfunc__(self, gfunc, *args, **kwargs): res = gfunc.executable(*args, **kwargs) # you could have called a different core func there return mod_output[gfunc](res, *args, **kwargs) class MyNextArray(MyArray): def __prepare_input__(self, gfunc, *args, **kwargs): # let the superclass do its thing: args, kwargs = MyArray.__prepare_input__(self, gfunc, *args, **kwargs) # now I can tweak it further: return mod_input_further[gfunc](*args, **kwargs) def __handle_gfunc__(self, gfunc, *args, **kwargs): # let's defer to the superclass to handle calling the core function: res = MyArray.__handle_gfunc__(self, gfunc, *args, **kwargs) # and now we have one more crack at the result before passing it back: return mod_output_further[gfunc](res, *args, **kwargs) If a gfunc is not recognized, the subclass might raise a NotImplementedError or it might just pass the original args, kwargs on through. I didn't write that part out because the example was already running long. But the point is that a single entry point could be used for any subclass, without having to worry about how to support every subclass. It may still be necessary to be mindful to use asanyarray in the core functions, but if a subclass alters the behavior of some operation such that an operation needs to happen on an ndarray view of the data, __prepare_input__ provides an opportinuty to prepare such views. For example, in our current situation, matrices would not be compatible with trapz if trapz did not cast the input to ndarrays, but as a result trapz is not compatible with masked arrays or quantities. With the proposed scheme, matrices would in some cases pass ndarray views to the core function, but in other cases pass the arguments through unmodified, since the function might build on other functions that are already generalized to support those types of data. Darren From PHobson at Geosyntec.com Sun Mar 28 12:13:45 2010 From: PHobson at Geosyntec.com (PHobson at Geosyntec.com) Date: Sun, 28 Mar 2010 12:13:45 -0400 Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? In-Reply-To: <4BAD76C3.1020708@sbcglobal.net> References: <4BAD54D5.3080500@sbcglobal.net> <4BAD76C3.1020708@sbcglobal.net> Message-ID: If your on windows, you can probably get rid of it through the Add/Remove Programs portion of the Conrol Panel. -- Paul Hobson Senior Staff Engineer Geosyntec Consultants Portland, OR On Mar 26, 2010, at 8:09 PM, "Wayne Watson" > wrote: Thanks. How do I switch? Do I just pull down 1.3 or better 1.2 (I use it.), and install it? How do I (actually my colleague) somehow remove 1.4? Is it as easy as going to IDLE's path browser and removing, under site-packages, numpy? (I'm not sure that's even possible. I don't see a right-click menu.) On 3/26/2010 7:22 PM, PHobson at Geosyntec.com wrote: Wayne, The current release of Scipy doesn?t work perfectly well with Numpy 1.4. On my systems (Mac OS 10.6, WinXP, and Ubuntu), I?m running Numpy 1.4 with the current Scipy on Python 2.6.4. I get the same error you describe below on the first attempt. For some reason unknown to me, it works on the second try. Switching to Numpy 1.3 is the best solution to the error. 
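A quick way to confirm which numpy and scipy the interpreter is actually
picking up (before and after any switch) is to print the standard version
attributes; nothing in this snippet is specific to the setup discussed here:

import numpy
import scipy

print numpy.__version__
print scipy.__version__
print numpy.__file__    # shows which site-packages copy gets imported

The traceback in the original message below shows scipy doing "from
numpy.testing import NumpyTest", which numpy 1.4 no longer provides, so a
scipy build that predates the installed numpy will keep hitting that error.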
-paul From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Wayne Watson Sent: Friday, March 26, 2010 5:44 PM To: numpy-discussion at scipy.org Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? I wrote a program in Python 2.5 under Win7 and it runs fine using Numpy 1.2 , but not on a colleague's machine who has a slightly newer 2.5. We both use IDLE to execute the program. During import he gets this: >>> Traceback (most recent call last): File "C:\Documents and Settings\HP_Administrator.DavesDesktop\My Documents\Astro\Meteors\NC-FireballReport.py", line 38, in from scipy import stats as stats # scoreatpercentile File "C:\Python25\lib\site-packages\scipy\stats\__init__.py", line 7, in from stats import * File "C:\Python25\lib\site-packages\scipy\stats\stats.py", line 191, in import scipy.special as special File "C:\Python25\lib\site-packages\scipy\special\__init__.py", line 22, in from numpy.testing import NumpyTest ImportError: cannot import name NumpyTest >>> Comments? -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet Poisoned Shipments. Serious illegal waste dumping may be occuring in the Meditrainean. Radioactive material, mercury, biohazards. -- Sci Am Mag, Feb., 2010, p14f. Web Page: <www.speckledwithstars.net/> _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet Poisoned Shipments. Serious illegal waste dumping may be occuring in the Meditrainean. Radioactive material, mercury, biohazards. -- Sci Am Mag, Feb., 2010, p14f. Web Page: <www.speckledwithstars.net/> _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Sun Mar 28 14:22:24 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 28 Mar 2010 13:22:24 -0500 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: Message-ID: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: > On 27 March 2010 20:24, Andrea Gavana wrote: >> Hi All, >> >> ? ?I have an interpolation problem and I am having some difficulties >> in tackling it. I hope I can explain myself clearly enough. >> >> Basically, I have a whole bunch of 3D fluid flow simulations (close to >> 1000), and they are a result of different combinations of parameters. >> I was planning to use the Radial Basis Functions in scipy, but for the >> moment let's assume, to simplify things, that I am dealing only with >> one parameter (x). In 1000 simulations, this parameter x has 1000 >> values, obviously. The problem is, the outcome of every single >> simulation is a vector of oil production over time (let's say 40 >> values per simulation, one per year), and I would like to be able to >> interpolate my x parameter (1000 values) against all the simulations >> (1000x40) and get an approximating function that, given another x >> parameter (of size 1x1) will give me back an interpolated production >> profile (of size 1x40). 
> > If I understand your problem correctly, you have a function taking one > value as input (or one 3D vector) and returning a vector of length 40. > You want to know whether there are tools in scipy to support this. > > I'll say first that it's not strictly necessary for there to be: you > could always just build 40 different interpolators, one for each > component of the output. After all, there's no interaction in the > calculations between the output coordinates. This is of course > awkward, in that you'd like to just call F(x) and get back a vector of > length 40, but that can be remedied by writing a short wrapper > function that simply calls all 40 interpolators. > > A problem that may be more serious is that the python loop over the 40 > interpolators can be slow, while a C implementation would give > vector-valued results rapidly. 40 is not a bad number to loop over. The thing I would be worried about is the repeated calculation of the (1000, 1000) radial function evaluation. I think that a small modification of Rbf could be made to handle the vector-valued case. I leave that as an exercise to Andrea, of course. :-) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ranavishal at gmail.com Sun Mar 28 15:19:42 2010 From: ranavishal at gmail.com (Vishal Rana) Date: Sun, 28 Mar 2010 12:19:42 -0700 Subject: [Numpy-discussion] Applying formula to all in an array which has value from previous Message-ID: Hi, For a numpy array: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) I do some calculation with 0, 1... and get a value = 2.5, now use this value to do the repeat the same calculation with next element for example... 2.5, 2 and get a value = 3.1 3.1, 3 and get a value = 4.2 4.2, 4 and get a value = 5.1 .... .... and get a value = 8.5 8.5, 9 and get a value = 9.8 So I should be getting a new array like array([0, 2.5, 3.1, 4.2, 5.1, ..... 8.5,9.8]) Is it where numpy or scipy can help? Thanks Vishal -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Sun Mar 28 15:44:31 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sun, 28 Mar 2010 21:44:31 +0200 Subject: [Numpy-discussion] Dealing with roundoff error In-Reply-To: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> References: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> Message-ID: 2010/3/28 Mike Sarahan : > I have run into some roundoff problems trying to line up some > experimental spectra. ?The x coordinates are given in intervals of 0.1 > units. ?I read the data in from a text file using np.loadtxt(). I don't know your problem well enough, so the suggestion to use numpy.interp() is maybe not more than a useless shot in the dark? http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html#numpy.interp Friedrich From andrea.gavana at gmail.com Sun Mar 28 16:47:31 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 28 Mar 2010 21:47:31 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> Message-ID: HI All, On 28 March 2010 19:22, Robert Kern wrote: > On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >> On 27 March 2010 20:24, Andrea Gavana wrote: >>> Hi All, >>> >>> ? 
?I have an interpolation problem and I am having some difficulties >>> in tackling it. I hope I can explain myself clearly enough. >>> >>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>> 1000), and they are a result of different combinations of parameters. >>> I was planning to use the Radial Basis Functions in scipy, but for the >>> moment let's assume, to simplify things, that I am dealing only with >>> one parameter (x). In 1000 simulations, this parameter x has 1000 >>> values, obviously. The problem is, the outcome of every single >>> simulation is a vector of oil production over time (let's say 40 >>> values per simulation, one per year), and I would like to be able to >>> interpolate my x parameter (1000 values) against all the simulations >>> (1000x40) and get an approximating function that, given another x >>> parameter (of size 1x1) will give me back an interpolated production >>> profile (of size 1x40). >> >> If I understand your problem correctly, you have a function taking one >> value as input (or one 3D vector) and returning a vector of length 40. >> You want to know whether there are tools in scipy to support this. >> >> I'll say first that it's not strictly necessary for there to be: you >> could always just build 40 different interpolators, one for each >> component of the output. After all, there's no interaction in the >> calculations between the output coordinates. This is of course >> awkward, in that you'd like to just call F(x) and get back a vector of >> length 40, but that can be remedied by writing a short wrapper >> function that simply calls all 40 interpolators. >> >> A problem that may be more serious is that the python loop over the 40 >> interpolators can be slow, while a C implementation would give >> vector-valued results rapidly. > > 40 is not a bad number to loop over. The thing I would be worried > about is the repeated calculation of the (1000, 1000) radial function > evaluation. I think that a small modification of Rbf could be made to > handle the vector-valued case. I leave that as an exercise to Andrea, > of course. :-) It seems like this whole interpolation stuff is not working as I thought. In particular, considering scalar-valued interpolation (i.e., looking at the final oil recovery only and not the time-based oil production profile), interpolation with RBFs is giving counter-intuitive and meaningless answers. The issues I am seeing are basically these: # Interpolate the cumulative oil production rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery) # Try to use existing combination of parameters to get back # the original result (more or less) possible_recovery = rbf(x1[10], x2[10], x3[10], x4[10], x5[10], x6[10]) 1) If I attempt to use the resulting interpolation function ("rbf"), inputting a single value for each x1, x2, ..., x6 that is *already present* in the original x1, x2, ..., x6 vectors, I get meaningless results (i.e., negative oil productions, 1000% error, and so on) in all cases with some (rare) exceptions when using the "thin-plate" interpolation option; 2) Using a combination of parameters x1, x2, ..., x6 that is *not* in the original set, I get non-physical (i.e., unrealistic) results: it appears that RBFs think that if I increase the number of production wells I get less oil recovered (!), sometimes by a factor of 10. 
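As an aside, point 1) can be turned into a quick self-contained check: fit
the Rbf and evaluate it straight back at its own training points. The sketch
below uses random numbers in place of the real x1, ..., x6 and recovery
vectors, so only the mechanics carry over:

import numpy as np
from scipy.interpolate import Rbf

np.random.seed(0)
n = 200
# random stand-ins for the six real parameter vectors and the recovery
x1, x2, x3, x4, x5, x6 = np.random.uniform(0.0, 1.0, (6, n))
final_oil_recovery = 2.8e8 + 1.0e8 * np.random.uniform(0.0, 1.0, n)

rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery)

# with the default smooth=0 the interpolant should reproduce the training
# values almost exactly; a large relative error here points at a badly
# conditioned system (often a scaling problem) rather than at the data
back = rbf(x1, x2, x3, x4, x5, x6)
print np.max(np.abs((back - final_oil_recovery) / final_oil_recovery))

Running the same check with the real vectors (and their real magnitudes)
shows whether the fit itself is already off before any new point is queried.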
I am pretty sure I am missing something in this whole interpolation thing (I am no expert at all), but honestly it is the first time I get such counter-intuitive results in my entire Python numerical life ;-) Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From pgmdevlist at gmail.com Sun Mar 28 17:14:50 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 28 Mar 2010 17:14:50 -0400 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> Message-ID: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> On Mar 28, 2010, at 4:47 PM, Andrea Gavana wrote: > HI All, > > On 28 March 2010 19:22, Robert Kern wrote: >> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >>> On 27 March 2010 20:24, Andrea Gavana wrote: >>>> Hi All, >>>> >>>> I have an interpolation problem and I am having some difficulties >>>> in tackling it. I hope I can explain myself clearly enough. >>>> >>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>> 1000), and they are a result of different combinations of parameters. > It seems like this whole interpolation stuff is not working as I > thought. In particular, considering scalar-valued interpolation (i.e., > looking at the final oil recovery only and not the time-based oil > production profile), interpolation with RBFs is giving > counter-intuitive and meaningless answers. The issues I am seeing are > basically these: Which is hardly surprising: you're working with a physical process, you must have some constraints on your parameters (whether dependence between parameters, bounds on the estimates...) that are not taken into account by the interpolation scheme you're using. So, it's back to the drawing board. What are you actually trying to achieve ? Find the best estimates of your 10 parameters to match an observed production timeline ? Find a space for your 10 parameters that gives some realistic production ? Assuming that your 10 parameters are actually independent, did you run 1000**10 simulations to test all the possible combinations? Probably not, so you could try using a coarser interval between min and max values for each parameters (say, 10 ?) and check the combos... Or you could try to decrease the number of parameters by finding the ones that have more influence on the final outcome and dropping the others. A different problem all together... My point is: don't be discouraged by the weird results you're getting: it's probably because you're not using the right approach yet. From josef.pktd at gmail.com Sun Mar 28 17:18:56 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 28 Mar 2010 17:18:56 -0400 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> Message-ID: <1cd32cbb1003281418t32ed5aedq3728eaadb042bbea@mail.gmail.com> On Sun, Mar 28, 2010 at 4:47 PM, Andrea Gavana wrote: > HI All, > > On 28 March 2010 19:22, Robert Kern wrote: >> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >>> On 27 March 2010 20:24, Andrea Gavana wrote: >>>> Hi All, >>>> >>>> ? ?I have an interpolation problem and I am having some difficulties >>>> in tackling it. I hope I can explain myself clearly enough. 
>>>> >>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>> 1000), and they are a result of different combinations of parameters. >>>> I was planning to use the Radial Basis Functions in scipy, but for the >>>> moment let's assume, to simplify things, that I am dealing only with >>>> one parameter (x). In 1000 simulations, this parameter x has 1000 >>>> values, obviously. The problem is, the outcome of every single >>>> simulation is a vector of oil production over time (let's say 40 >>>> values per simulation, one per year), and I would like to be able to >>>> interpolate my x parameter (1000 values) against all the simulations >>>> (1000x40) and get an approximating function that, given another x >>>> parameter (of size 1x1) will give me back an interpolated production >>>> profile (of size 1x40). >>> >>> If I understand your problem correctly, you have a function taking one >>> value as input (or one 3D vector) and returning a vector of length 40. >>> You want to know whether there are tools in scipy to support this. >>> >>> I'll say first that it's not strictly necessary for there to be: you >>> could always just build 40 different interpolators, one for each >>> component of the output. After all, there's no interaction in the >>> calculations between the output coordinates. This is of course >>> awkward, in that you'd like to just call F(x) and get back a vector of >>> length 40, but that can be remedied by writing a short wrapper >>> function that simply calls all 40 interpolators. >>> >>> A problem that may be more serious is that the python loop over the 40 >>> interpolators can be slow, while a C implementation would give >>> vector-valued results rapidly. >> >> 40 is not a bad number to loop over. The thing I would be worried >> about is the repeated calculation of the (1000, 1000) radial function >> evaluation. I think that a small modification of Rbf could be made to >> handle the vector-valued case. I leave that as an exercise to Andrea, >> of course. :-) > > It seems like this whole interpolation stuff is not working as I > thought. In particular, considering scalar-valued interpolation (i.e., > looking at the final oil recovery only and not the time-based oil > production profile), interpolation with RBFs is giving > counter-intuitive and meaningless answers. The issues I am seeing are > basically these: > > # Interpolate the cumulative oil production > rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery) > > # Try to use existing combination of parameters to get back > # the original result (more or less) > possible_recovery = rbf(x1[10], x2[10], x3[10], x4[10], x5[10], x6[10]) > > 1) If I attempt to use the resulting interpolation function ("rbf"), > inputting a single value for each x1, x2, ..., x6 that is *already > present* in the original x1, x2, ..., x6 vectors, I get meaningless > results (i.e., negative oil productions, 1000% error, and so on) in > all cases with some (rare) exceptions when using the "thin-plate" > interpolation option; > 2) Using a combination of parameters x1, x2, ..., x6 that is *not* in > the original set, I get non-physical (i.e., unrealistic) results: it > appears that RBFs think that if I increase the number of production > wells I get less oil recovered (!), sometimes by a factor of 10. 
> > I am pretty sure I am missing something in this whole interpolation > thing (I am no expert at all), but honestly it is the first time I get > such counter-intuitive results in my entire Python numerical life ;-) The interpolation with a large number of points can be pretty erratic. Did you use all 1000 points in the RBF? Do see what's going on you could try 2 things: 1) Use only a few points (10 to 100) 2) Use gaussian with a large negative smooth I had problems in the past with rbf in examples, and in one case I switched to using neighborhood points only (e.g. with scipy.spatial, KDTree) Josef > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andrea.gavana at gmail.com Sun Mar 28 17:40:15 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 28 Mar 2010 22:40:15 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: Hi All, On 28 March 2010 22:14, Pierre GM wrote: > On Mar 28, 2010, at 4:47 PM, Andrea Gavana wrote: >> HI All, >> >> On 28 March 2010 19:22, Robert Kern wrote: >>> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >>>> On 27 March 2010 20:24, Andrea Gavana wrote: >>>>> Hi All, >>>>> >>>>> ? ?I have an interpolation problem and I am having some difficulties >>>>> in tackling it. I hope I can explain myself clearly enough. >>>>> >>>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>>> 1000), and they are a result of different combinations of parameters. > >> It seems like this whole interpolation stuff is not working as I >> thought. In particular, considering scalar-valued interpolation (i.e., >> looking at the final oil recovery only and not the time-based oil >> production profile), interpolation with RBFs is giving >> counter-intuitive and meaningless answers. The issues I am seeing are >> basically these: > > Which is hardly surprising: you're working with a physical process, you must have some constraints on your parameters (whether dependence between parameters, bounds on the estimates...) that are not taken into account by the interpolation scheme you're using. So, it's back to the drawing board. The curious thing is, when using the rbf interpolated function to find a new approximation, I am not giving RBFs input values that are outside the bounds of the existing parameters. Either they are exactly the same as the input ones (for a single simulation), or they are slightly different but always inside the bounds. I always thought that, at least for the same input > What are you actually trying to achieve ? Find the best estimates of your 10 parameters to match an observed production timeline ? Find a space for your 10 parameters that gives some realistic production ? > Assuming that your 10 parameters are actually independent, did you run 1000**10 simulations to test all the possible combinations? ?Probably not, so you could try using a coarser interval between min and max values for each parameters (say, 10 ?) 
and check the combos... Or you could try to decrease the number of parameters by finding the ones that have more influence on the final outcome and dropping the others. A different problem all together... > My point is: don't be discouraged by the weird results you're getting: it's probably because you're not using the right approach yet. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From brennan.williams at visualreservoir.com Sun Mar 28 17:50:46 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Mon, 29 Mar 2010 10:50:46 +1300 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> Message-ID: <4BAFCF36.4010405@visualreservoir.com> Andrea Gavana wrote: > HI All, > > On 28 March 2010 19:22, Robert Kern wrote: > >> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >> >>> On 27 March 2010 20:24, Andrea Gavana wrote: >>> >>>> Hi All, >>>> >>>> I have an interpolation problem and I am having some difficulties >>>> in tackling it. I hope I can explain myself clearly enough. >>>> >>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>> 1000), and they are a result of different combinations of parameters. >>>> I was planning to use the Radial Basis Functions in scipy, but for the >>>> moment let's assume, to simplify things, that I am dealing only with >>>> one parameter (x). In 1000 simulations, this parameter x has 1000 >>>> values, obviously. The problem is, the outcome of every single >>>> simulation is a vector of oil production over time (let's say 40 >>>> values per simulation, one per year), and I would like to be able to >>>> interpolate my x parameter (1000 values) against all the simulations >>>> (1000x40) and get an approximating function that, given another x >>>> parameter (of size 1x1) will give me back an interpolated production >>>> profile (of size 1x40). >>>> >>> If I understand your problem correctly, you have a function taking one >>> value as input (or one 3D vector) and returning a vector of length 40. >>> You want to know whether there are tools in scipy to support this. >>> >>> I'll say first that it's not strictly necessary for there to be: you >>> could always just build 40 different interpolators, one for each >>> component of the output. After all, there's no interaction in the >>> calculations between the output coordinates. This is of course >>> awkward, in that you'd like to just call F(x) and get back a vector of >>> length 40, but that can be remedied by writing a short wrapper >>> function that simply calls all 40 interpolators. >>> >>> A problem that may be more serious is that the python loop over the 40 >>> interpolators can be slow, while a C implementation would give >>> vector-valued results rapidly. >>> >> 40 is not a bad number to loop over. The thing I would be worried >> about is the repeated calculation of the (1000, 1000) radial function >> evaluation. I think that a small modification of Rbf could be made to >> handle the vector-valued case. I leave that as an exercise to Andrea, >> of course. 
:-) >> > > It seems like this whole interpolation stuff is not working as I > thought. In particular, considering scalar-valued interpolation (i.e., > looking at the final oil recovery only and not the time-based oil > production profile), interpolation with RBFs is giving > counter-intuitive and meaningless answers. The issues I am seeing are > basically these: > > # Interpolate the cumulative oil production > rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery) > > # Try to use existing combination of parameters to get back > # the original result (more or less) > possible_recovery = rbf(x1[10], x2[10], x3[10], x4[10], x5[10], x6[10]) > > 1) If I attempt to use the resulting interpolation function ("rbf"), > inputting a single value for each x1, x2, ..., x6 that is *already > present* in the original x1, x2, ..., x6 vectors, I get meaningless > results (i.e., negative oil productions, 1000% error, and so on) in > all cases with some (rare) exceptions when using the "thin-plate" > interpolation option; > 2) Using a combination of parameters x1, x2, ..., x6 that is *not* in > the original set, I get non-physical (i.e., unrealistic) results: it > appears that RBFs think that if I increase the number of production > wells I get less oil recovered (!), sometimes by a factor of 10. > > I am pretty sure I am missing something in this whole interpolation > thing (I am no expert at all), but honestly it is the first time I get > such counter-intuitive results in my entire Python numerical life ;-) > > Hi Andrea, I'm looking at doing this as well, i.e. using RBF to create what is effectively a "proxy simulator". I'm going to have a read through the scipy rbf documentation to get up to speed. I take it you are using an experimental design algorithm such as Plackett-Burman or similar? Brennan > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From andrea.gavana at gmail.com Sun Mar 28 17:53:44 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 28 Mar 2010 22:53:44 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: Hi All, On 28 March 2010 22:14, Pierre GM wrote: > On Mar 28, 2010, at 4:47 PM, Andrea Gavana wrote: >> HI All, >> >> On 28 March 2010 19:22, Robert Kern wrote: >>> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >>>> On 27 March 2010 20:24, Andrea Gavana wrote: >>>>> Hi All, >>>>> >>>>> I have an interpolation problem and I am having some difficulties >>>>> in tackling it. I hope I can explain myself clearly enough. >>>>> >>>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>>> 1000), and they are a result of different combinations of parameters. > >> It seems like this whole interpolation stuff is not working as I >> thought. 
In particular, considering scalar-valued interpolation (i.e., >> looking at the final oil recovery only and not the time-based oil >> production profile), interpolation with RBFs is giving >> counter-intuitive and meaningless answers. The issues I am seeing are >> basically these: > > Which is hardly surprising: you're working with a physical process, you must have some constraints on your parameters (whether dependence between parameters, bounds on the estimates...) that are not taken into account by the interpolation scheme you're using. So, it's back to the drawing board. Sorry, this laptop has gone mad and sent the message before I was finished... The curious thing is, when using the rbf interpolated function to find a new approximation, I am not giving RBFs input values that are outside the bounds of the existing parameters. Either they are exactly the same as the input ones (for a single simulation), or they are slightly different but always inside the bounds. I always thought that, at least for the same input values, it should give (approximatively) the same answer as the real one. > What are you actually trying to achieve ? Find the best estimates of your 10 parameters to match an observed production timeline ? Find a space for your 10 parameters that gives some realistic production ? I have this bunch of simulations for possible future oil productions (forecasts), run over the past 4 years, which have different combination of number of oil producing wells, gas injection wells, oil plateau values, gas injection plateau values and gas production plateau values (6 parameters in total, not 10). Now, the combinations themselves are pretty nicely spread, meaning that I seem to have a solid base to build an interpolator that will give me a reasonable answer when I ask it for parameters inside the bounds (i.e., no extrapolation involved). > Assuming that your 10 parameters are actually independent, did you run 1000**10 simulations to test all the possible combinations? I hope you're kidding :-D . Each of these simulations take from 7 to 12 hours, not counting the huge amount of time needed to build their data files by hand... Let's see a couple of practical examples (I can share the data if someone is interested). Example 1 # o2 and o3 are the number of production wells, split into 2 # different categories # inj is the number of injection wells # fomts is the final oil recovery rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) op = [50380] gp = [103014000] gi = [53151000] o2w = [45] o3w = [20] inw = [15] fi = rbf(op, gp, gi, o2w, o3w, inw) # I => KNOW <= the answer to be close to +3.5e8 print fi [ -1.00663296e+08] (yeah right...) Example 2 Changing o2w from 45 to 25 (again, the answer should be close to 3e8, less wells => less production) fi = rbf(op, gp, gi, o2w, o3w, inw) print fi [ 1.30023424e+08] And keep in mind, that nowhere I have such low values of oil recovery in my data... the lowest one are close to 2.8e8... Andrea. 
From andrea.gavana at gmail.com Sun Mar 28 18:02:19 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sun, 28 Mar 2010 23:02:19 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <4BAFCF36.4010405@visualreservoir.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <4BAFCF36.4010405@visualreservoir.com> Message-ID: HI Brennan, On 28 March 2010 22:50, Brennan Williams wrote: > Andrea Gavana wrote: >> HI All, >> >> On 28 March 2010 19:22, Robert Kern wrote: >> >>> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >>> >>>> On 27 March 2010 20:24, Andrea Gavana wrote: >>>> >>>>> Hi All, >>>>> >>>>> ? ?I have an interpolation problem and I am having some difficulties >>>>> in tackling it. I hope I can explain myself clearly enough. >>>>> >>>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>>> 1000), and they are a result of different combinations of parameters. >>>>> I was planning to use the Radial Basis Functions in scipy, but for the >>>>> moment let's assume, to simplify things, that I am dealing only with >>>>> one parameter (x). In 1000 simulations, this parameter x has 1000 >>>>> values, obviously. The problem is, the outcome of every single >>>>> simulation is a vector of oil production over time (let's say 40 >>>>> values per simulation, one per year), and I would like to be able to >>>>> interpolate my x parameter (1000 values) against all the simulations >>>>> (1000x40) and get an approximating function that, given another x >>>>> parameter (of size 1x1) will give me back an interpolated production >>>>> profile (of size 1x40). >>>>> >>>> If I understand your problem correctly, you have a function taking one >>>> value as input (or one 3D vector) and returning a vector of length 40. >>>> You want to know whether there are tools in scipy to support this. >>>> >>>> I'll say first that it's not strictly necessary for there to be: you >>>> could always just build 40 different interpolators, one for each >>>> component of the output. After all, there's no interaction in the >>>> calculations between the output coordinates. This is of course >>>> awkward, in that you'd like to just call F(x) and get back a vector of >>>> length 40, but that can be remedied by writing a short wrapper >>>> function that simply calls all 40 interpolators. >>>> >>>> A problem that may be more serious is that the python loop over the 40 >>>> interpolators can be slow, while a C implementation would give >>>> vector-valued results rapidly. >>>> >>> 40 is not a bad number to loop over. The thing I would be worried >>> about is the repeated calculation of the (1000, 1000) radial function >>> evaluation. I think that a small modification of Rbf could be made to >>> handle the vector-valued case. I leave that as an exercise to Andrea, >>> of course. :-) >>> >> >> It seems like this whole interpolation stuff is not working as I >> thought. In particular, considering scalar-valued interpolation (i.e., >> looking at the final oil recovery only and not the time-based oil >> production profile), interpolation with RBFs is giving >> counter-intuitive and meaningless answers. 
The issues I am seeing are >> basically these: >> >> # Interpolate the cumulative oil production >> rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery) >> >> # Try to use existing combination of parameters to get back >> # the original result (more or less) >> possible_recovery = rbf(x1[10], x2[10], x3[10], x4[10], x5[10], x6[10]) >> >> 1) If I attempt to use the resulting interpolation function ("rbf"), >> inputting a single value for each x1, x2, ..., x6 that is *already >> present* in the original x1, x2, ..., x6 vectors, I get meaningless >> results (i.e., negative oil productions, 1000% error, and so on) in >> all cases with some (rare) exceptions when using the "thin-plate" >> interpolation option; >> 2) Using a combination of parameters x1, x2, ..., x6 that is *not* in >> the original set, I get non-physical (i.e., unrealistic) results: it >> appears that RBFs think that if I increase the number of production >> wells I get less oil recovered (!), sometimes by a factor of 10. >> >> I am pretty sure I am missing something in this whole interpolation >> thing (I am no expert at all), but honestly it is the first time I get >> such counter-intuitive results in my entire Python numerical life ;-) >> >> > Hi Andrea, I'm looking at doing this as well, i.e. using RBF to create > what is effectively a "proxy simulator". > I'm going to have a read through the scipy rbf documentation to get up > to speed. I take it you are using > an experimental design algorithm such as Plackett-Burman or similar? No, it's simply a collection of all the simulations we have run in the past 3-4 years. We can't afford to use experimental design or latin hypercube algorithms in our simulations, as the field we are studying is so immense that simulation times are prohibitive for any kind of proxy model. What I was trying to do, is to collect all the information and data we gathered with so much effort in the past 4 years and then build some kind of approximating function that would give us a possible production profile without actually running the simulation itself. I should mention that, given any combination of these 6 parameters, any one of us reservoir engineers could draw a production profile for this field by hand (being blinded too, if you wish) without any effort, because the simulated response of this field is so linear any time you change any of the 6 parameters i mentioned in my other posts. This is why I am so surprised about the behaviour of RBFs. I am going to give Matlab (sigh...) a go tomorrow morning and see what griddatan thinks about this issue. I really don't want to go back to Matlab, it has been the worst of the nightmares for a GUI-builder person like me 6 years ago. I switched to Python/Numpy/Scipy/wxPython and I have never looked back. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. 
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From brennan.williams at visualreservoir.com Sun Mar 28 18:36:21 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Mon, 29 Mar 2010 11:36:21 +1300 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: <4BAFD9E5.4070804@visualreservoir.com> Andrea Gavana wrote: > Hi All, > > On 28 March 2010 22:14, Pierre GM wrote: > >> On Mar 28, 2010, at 4:47 PM, Andrea Gavana wrote: >> >>> HI All, >>> >>> On 28 March 2010 19:22, Robert Kern wrote: >>> >>>> On Sun, Mar 28, 2010 at 03:26, Anne Archibald wrote: >>>> >>>>> On 27 March 2010 20:24, Andrea Gavana wrote: >>>>> >>>>>> Hi All, >>>>>> >>>>>> I have an interpolation problem and I am having some difficulties >>>>>> in tackling it. I hope I can explain myself clearly enough. >>>>>> >>>>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to >>>>>> 1000), and they are a result of different combinations of parameters. >>>>>> >>> It seems like this whole interpolation stuff is not working as I >>> thought. In particular, considering scalar-valued interpolation (i.e., >>> looking at the final oil recovery only and not the time-based oil >>> production profile), interpolation with RBFs is giving >>> counter-intuitive and meaningless answers. The issues I am seeing are >>> basically these: >>> >> Which is hardly surprising: you're working with a physical process, you must have some constraints on your parameters (whether dependence between parameters, bounds on the estimates...) that are not taken into account by the interpolation scheme you're using. So, it's back to the drawing board. >> > > Sorry, this laptop has gone mad and sent the message before I was finished... > > The curious thing is, when using the rbf interpolated function to find > a new approximation, I am not giving RBFs input values that are > outside the bounds of the existing parameters. Either they are exactly > the same as the input ones (for a single simulation), or they are > slightly different but always inside the bounds. I always thought > that, at least for the same input values, it should give > (approximatively) the same answer as the real one. > > >> What are you actually trying to achieve ? Find the best estimates of your 10 parameters to match an observed production timeline ? Find a space for your 10 parameters that gives some realistic production ? >> > > I have this bunch of simulations for possible future oil productions > (forecasts), run over the past 4 years, which have different > combination of number of oil producing wells, gas injection wells, oil > plateau values, gas injection plateau values and gas production > plateau values (6 parameters in total, not 10). Now, the combinations > themselves are pretty nicely spread, meaning that I seem to have a > solid base to build an interpolator that will give me a reasonable > answer when I ask it for parameters inside the bounds (i.e., no > extrapolation involved). > > >> Assuming that your 10 parameters are actually independent, did you run 1000**10 simulations to test all the possible combinations? >> > > I hope you're kidding :-D . Each of these simulations take from 7 to > 12 hours, not counting the huge amount of time needed to build their > data files by hand... > > Let's see a couple of practical examples (I can share the data if > someone is interested). 
> > Definitely interested in helping solve this one so feel free to email the data (obviously not 1,000 Eclipse smspec and unsmry files although I can handle that without any problems). > Example 1 > > # o2 and o3 are the number of production wells, split into 2 > # different categories > # inj is the number of injection wells > # fomts is the final oil recovery > > rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) > > op = [50380] > gp = [103014000] > gi = [53151000] > o2w = [45] > o3w = [20] > inw = [15] > > I take it all your inputs are float arrays? i.e. you just consider number of production wells as a floating point value. I suppose strictly speaking o2 and o3 are not independent of eachother but I would guess that for the purposes of using Rbf it wouldn't be an issue. I'm just having a look at the cookbook example (http://www.scipy.org/Cookbook/RadialBasisFunctions) and adding a couple more variables, bumping up the number of points to 1000 and setting the z values to be up around 1.0e+09 - so far it still seems to work although if you increase the sample points to 2000+ it falls over with a memory error. I've got a smaller bunch of Eclipse simulations I could try Rbf out on - 19 runs with 9 variables (it's a Tornado algorithm with -1,0,+1 values for each variable). FOPT's will be in a similar range ot yours. Brennan > fi = rbf(op, gp, gi, o2w, o3w, inw) > > # I => KNOW <= the answer to be close to +3.5e8 > > print fi > > [ -1.00663296e+08] > > (yeah right...) > > > Example 2 > > Changing o2w from 45 to 25 (again, the answer should be close to 3e8, > less wells => less production) > > fi = rbf(op, gp, gi, o2w, o3w, inw) > > print fi > > [ 1.30023424e+08] > > And keep in mind, that nowhere I have such low values of oil recovery > in my data... the lowest one are close to 2.8e8... > > Andrea. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From friedrichromstedt at gmail.com Sun Mar 28 18:51:38 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 29 Mar 2010 00:51:38 +0200 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: 2010/3/28 Andrea Gavana : > Example 1 > > # o2 and o3 are the number of production wells, split into 2 > # different categories > # inj is the number of injection wells > # fomts is the final oil recovery > > rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) > > op = [50380] > gp = [103014000] > gi = [53151000] > o2w = [45] > o3w = [20] > inw = [15] > > fi = rbf(op, gp, gi, o2w, o3w, inw) > > # I => KNOW <= the answer to be close to +3.5e8 > > print fi > > [ -1.00663296e+08] > > (yeah right...) > > > Example 2 > > Changing o2w from 45 to 25 (again, the answer should be close to 3e8, > less wells => less production) > > fi = rbf(op, gp, gi, o2w, o3w, inw) > > print fi > > [ ?1.30023424e+08] > > And keep in mind, that nowhere I have such low values of oil recovery > in my data... the lowest one are close to 2.8e8... I want to put my2 cents in, fwiw ... What I see from http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.interpolate.Rbf.html#scipy.interpolate.Rbf are three things: 1. Rbf uses some weighting based on the radial functions. 2. Rbf results go through the nodal points without *smooth* set to some value != 0 3. 
Rbf is isotropic (3.) is most important. I see from your e-mail that the values you pass in to Rbf are of very different order of magnitude. But the default norm used in Rbf is for sure isotropic, i.e., it will result in strange and useless "mean distances" in R^N where there are N parameters. You have to either pass in a *norm* which weights the coords according to their extent, or to scale the data such that the aspect ratios of the hypecube's edges are sensible. You can imagine it as the follwing ascii plot: * * * * * * * the x dimension is say the gas plateau dimension (~1e10), the y dimension is the year dimension (~1e1). In an isotropic plot, using the data lims and aspect = 1, they may be well homogenous, but on this scaling, as used by Rbf, they are lumped. I don't know if it's clear what I mean from my description? Are the corresponding parameter values spread completely randomly, or is there always varied only one axis? If random, you could try a fractal analysis to find out the optimal scaling of the axes for N-d coverage of N-d space. In the case above, is is something between points and lines, I guess. When scaling it properly, it becomes a plane-like coveage of the xy plane. When scaling further, it will become points again (lumped together). So you can find an optimum. There are quantitative methods to obtain the fractal dimension, and they are quite simple. Friedrich From andrea.gavana at gmail.com Sun Mar 28 19:24:41 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 00:24:41 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <4BAFD9E5.4070804@visualreservoir.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <4BAFD9E5.4070804@visualreservoir.com> Message-ID: Hi Brennan & All, On 28 March 2010 23:36, Brennan Williams wrote: > Andrea Gavana wrote: >> Let's see a couple of practical examples (I can share the data if >> someone is interested). >> >> > Definitely interested in helping solve this one so feel free to email > the data (obviously not 1,000 Eclipse smspec and unsmry files although I > can handle that without any problems). It's 1 TB of data and my script took almost 4 hours in parallel on 4 processor over a network to reduce that huge amount of data to a tiny 406x7 matrix... but anyway, I'll see if I have the permission of posting the data and I'll let you know. I could really use some suggestions... or maybe a good kick in my a** telling me I am doing a bunch of stupid things. >> Example 1 >> >> # o2 and o3 are the number of production wells, split into 2 >> # different categories >> # inj is the number of injection wells >> # fomts is the final oil recovery >> >> rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) >> >> op = [50380] >> gp = [103014000] >> gi = [53151000] >> o2w = [45] >> o3w = [20] >> inw = [15] >> >> > I take it all your inputs are float arrays? i.e. you just consider > number of production wells as a floating point value. They are floats, but it doesn't really matter as RBFs should treat them all as floats. My results are the same whether I use "45" or "45.0". > I suppose strictly speaking o2 and o3 are not independent of eachother > but I would guess that for the purposes of using Rbf it wouldn't be an > issue. They are independent: they are 2 completely different kind of wells, o2 will give you much more gas while o3 much more oil. 
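(A tiny numerical aside on the isotropy point above, with made-up numbers: on the raw scales, a 25-well difference is invisible next to even a small relative difference in a plateau value of order 1e8.)

import numpy as np

# Two hypothetical simulations: nearly the same gas plateau, but very
# different well counts (45 vs 20).
a = np.array([103014000.0, 45.0])
b = np.array([103500000.0, 20.0])

# Raw Euclidean distance: dominated entirely by the ~1e8-sized axis.
print(np.sqrt(((a - b) ** 2).sum()))

# After dividing each axis by a typical spread (assumed values here,
# e.g. per-axis standard deviations), both axes contribute.
scale = np.array([5.0e6, 10.0])
print(np.sqrt((((a - b) / scale) ** 2).sum()))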
It turns out that all interpolation methods *except* "linear" fail miserably with huge errors even when I use exactly the same input points as the original data points. If I use the "linear" interpolator, I get errors in the range of -1.16305401154e-005 1.29984131475e-006 In percentage, which is very good :-D. Unfortunately, when I input even a single parameter value that is not in the original set, I still get these non-physical results of oil production increasing when I decrease the number of wells... Doh! Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From andrea.gavana at gmail.com Sun Mar 28 19:30:47 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 00:30:47 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: Hi Friedrich & All, On 28 March 2010 23:51, Friedrich Romstedt wrote: > 2010/3/28 Andrea Gavana : >> Example 1 >> >> # o2 and o3 are the number of production wells, split into 2 >> # different categories >> # inj is the number of injection wells >> # fomts is the final oil recovery >> >> rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) >> >> op = [50380] >> gp = [103014000] >> gi = [53151000] >> o2w = [45] >> o3w = [20] >> inw = [15] >> >> fi = rbf(op, gp, gi, o2w, o3w, inw) >> >> # I => KNOW <= the answer to be close to +3.5e8 >> >> print fi >> >> [ -1.00663296e+08] >> >> (yeah right...) >> >> >> Example 2 >> >> Changing o2w from 45 to 25 (again, the answer should be close to 3e8, >> less wells => less production) >> >> fi = rbf(op, gp, gi, o2w, o3w, inw) >> >> print fi >> >> [ ?1.30023424e+08] >> >> And keep in mind, that nowhere I have such low values of oil recovery >> in my data... the lowest one are close to 2.8e8... > > I want to put my2 cents in, fwiw ... > > What I see from > http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.interpolate.Rbf.html#scipy.interpolate.Rbf > are three things: > > 1. Rbf uses some weighting based on the radial functions. > 2. Rbf results go through the nodal points without *smooth* set to > some value != 0 > 3. Rbf is isotropic > > (3.) is most important. ?I see from your e-mail that the values you > pass in to Rbf are of very different order of magnitude. ?But the > default norm used in Rbf is for sure isotropic, i.e., it will result > in strange and useless "mean distances" in R^N where there are N > parameters. ?You have to either pass in a *norm* which weights the > coords according to their extent, or to scale the data such that the > aspect ratios of the hypecube's edges are sensible. > > You can imagine it as the follwing ascii plot: > > ? * ? ? ? ? ? ? * ? ? ? ? ? ? ? ? ? ? ? ? ? ? * > * ? ? ?* ? ? ? ? ? ? ?* ? ? ? ? ? ? ? ? ? ? ? ? * > > the x dimension is say the gas plateau dimension (~1e10), the y > dimension is the year dimension (~1e1). ?In an isotropic plot, using > the data lims and aspect = 1, they may be well homogenous, but on this > scaling, as used by Rbf, they are lumped. ?I don't know if it's clear > what I mean from my description? > > Are the corresponding parameter values spread completely randomly, or > is there always varied only one axis? 
> > If random, you could try a fractal analysis to find out the optimal > scaling of the axes for N-d coverage of N-d space. ?In the case above, > is is something between points and lines, I guess. ?When scaling it > properly, it becomes a plane-like coveage of the xy plane. ?When > scaling further, it will become points again (lumped together). ?So > you can find an optimum. ?There are quantitative methods to obtain the > fractal dimension, and they are quite simple. I believe I need a technical dictionary to properly understand all that... :-D . Sorry, I am no expert at all, really, just an amateur with some imagination, but your suggestion about the different magnitude of the matrix is a very interesting one. Although I have absolutely no idea on how to re-scale them properly to avoid RBFs going crazy. As for your question, the parameter are not spread completely randomly, as this is a collection of simulations done over the years, trying manually different scenarios, without having in mind a proper experimental design or any other technique. Nor the parameter values vary only on one axis in each simulation (few of them are like that). Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From robert.kern at gmail.com Sun Mar 28 19:34:45 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 28 Mar 2010 18:34:45 -0500 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> On Sun, Mar 28, 2010 at 18:30, Andrea Gavana wrote: > Hi Friedrich & All, > > On 28 March 2010 23:51, Friedrich Romstedt wrote: >> 2010/3/28 Andrea Gavana : >>> Example 1 >>> >>> # o2 and o3 are the number of production wells, split into 2 >>> # different categories >>> # inj is the number of injection wells >>> # fomts is the final oil recovery >>> >>> rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) >>> >>> op = [50380] >>> gp = [103014000] >>> gi = [53151000] >>> o2w = [45] >>> o3w = [20] >>> inw = [15] >>> >>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>> >>> # I => KNOW <= the answer to be close to +3.5e8 >>> >>> print fi >>> >>> [ -1.00663296e+08] >>> >>> (yeah right...) >>> >>> >>> Example 2 >>> >>> Changing o2w from 45 to 25 (again, the answer should be close to 3e8, >>> less wells => less production) >>> >>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>> >>> print fi >>> >>> [ ?1.30023424e+08] >>> >>> And keep in mind, that nowhere I have such low values of oil recovery >>> in my data... the lowest one are close to 2.8e8... >> >> I want to put my2 cents in, fwiw ... >> >> What I see from >> http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.interpolate.Rbf.html#scipy.interpolate.Rbf >> are three things: >> >> 1. Rbf uses some weighting based on the radial functions. >> 2. Rbf results go through the nodal points without *smooth* set to >> some value != 0 >> 3. Rbf is isotropic >> >> (3.) is most important. ?I see from your e-mail that the values you >> pass in to Rbf are of very different order of magnitude. ?But the >> default norm used in Rbf is for sure isotropic, i.e., it will result >> in strange and useless "mean distances" in R^N where there are N >> parameters. 
?You have to either pass in a *norm* which weights the >> coords according to their extent, or to scale the data such that the >> aspect ratios of the hypecube's edges are sensible. > I believe I need a technical dictionary to properly understand all > that... :-D . Sorry, I am no expert at all, really, just an amateur > with some imagination, but your suggestion about the different > magnitude of the matrix is a very interesting one. Although I have > absolutely no idea on how to re-scale them properly to avoid RBFs > going crazy. Scaling each axis by its standard deviation is a typical first start. Shifting and scaling the values such that they each go from 0 to 1 is another useful thing to try. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From brennan.williams at visualreservoir.com Sun Mar 28 19:46:47 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Mon, 29 Mar 2010 12:46:47 +1300 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> Message-ID: <4BAFEA67.7050808@visualreservoir.com> Andrea Gavana wrote: > Hi Friedrich & All, > > On 28 March 2010 23:51, Friedrich Romstedt wrote: > >> 2010/3/28 Andrea Gavana : >> >>> Example 1 >>> >>> # o2 and o3 are the number of production wells, split into 2 >>> # different categories >>> # inj is the number of injection wells >>> # fomts is the final oil recovery >>> >>> rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) >>> >>> op = [50380] >>> gp = [103014000] >>> gi = [53151000] >>> o2w = [45] >>> o3w = [20] >>> inw = [15] >>> >>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>> >>> # I => KNOW <= the answer to be close to +3.5e8 >>> >>> print fi >>> >>> [ -1.00663296e+08] >>> >>> (yeah right...) >>> >>> >>> Example 2 >>> >>> Changing o2w from 45 to 25 (again, the answer should be close to 3e8, >>> less wells => less production) >>> >>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>> >>> print fi >>> >>> [ 1.30023424e+08] >>> >>> And keep in mind, that nowhere I have such low values of oil recovery >>> in my data... the lowest one are close to 2.8e8... >>> >> I want to put my2 cents in, fwiw ... >> >> What I see from >> http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.interpolate.Rbf.html#scipy.interpolate.Rbf >> are three things: >> >> 1. Rbf uses some weighting based on the radial functions. >> 2. Rbf results go through the nodal points without *smooth* set to >> some value != 0 >> 3. Rbf is isotropic >> >> (3.) is most important. I see from your e-mail that the values you >> pass in to Rbf are of very different order of magnitude. But the >> default norm used in Rbf is for sure isotropic, i.e., it will result >> in strange and useless "mean distances" in R^N where there are N >> parameters. You have to either pass in a *norm* which weights the >> coords according to their extent, or to scale the data such that the >> aspect ratios of the hypecube's edges are sensible. >> >> You can imagine it as the follwing ascii plot: >> >> * * * >> * * * * >> >> the x dimension is say the gas plateau dimension (~1e10), the y >> dimension is the year dimension (~1e1). In an isotropic plot, using >> the data lims and aspect = 1, they may be well homogenous, but on this >> scaling, as used by Rbf, they are lumped. 
I don't know if it's clear >> what I mean from my description? >> >> Are the corresponding parameter values spread completely randomly, or >> is there always varied only one axis? >> >> If random, you could try a fractal analysis to find out the optimal >> scaling of the axes for N-d coverage of N-d space. In the case above, >> is is something between points and lines, I guess. When scaling it >> properly, it becomes a plane-like coveage of the xy plane. When >> scaling further, it will become points again (lumped together). So >> you can find an optimum. There are quantitative methods to obtain the >> fractal dimension, and they are quite simple. >> > > I believe I need a technical dictionary to properly understand all > that... :-D . Sorry, I am no expert at all, really, just an amateur > with some imagination, but your suggestion about the different > magnitude of the matrix is a very interesting one. Although I have > absolutely no idea on how to re-scale them properly to avoid RBFs > going crazy. > > As for your question, the parameter are not spread completely > randomly, as this is a collection of simulations done over the years, > trying manually different scenarios, without having in mind a proper > experimental design or any other technique. Nor the parameter values > vary only on one axis in each simulation (few of them are like that). > I assume that there is a default "norm" that calculates the distance between points irrespective of the order of the input coordinates? So if that isn't working, leading to the spurious results, the next step is to normalise all the inputs so they are in the same range, e.g max-min=1.0 On a related note, what approach would be best if one of the input parameters wasn't continuous? e.g. I have three quite different geological distributions called say A,B and C. SO some of my simulations use distribution A, some use B and some use C. I could assign them the numbers 1,2,3 but a value of 1.5 is meaningless. Andrea, if you have 1TB of data for 1,000 simulation runs, then, if I assume you only mean the smspec/unsmry files, that means each of your summary files is 1GB in size? Are those o2w,o3w and inw figures the number of new wells only or existing+new? It's fun dealing with this amount of data isn't it? Brennan > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. 
> http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From andrea.gavana at gmail.com Sun Mar 28 19:59:02 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 00:59:02 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: On 29 March 2010 00:34, Robert Kern wrote: > On Sun, Mar 28, 2010 at 18:30, Andrea Gavana wrote: >> Hi Friedrich & All, >> >> On 28 March 2010 23:51, Friedrich Romstedt wrote: >>> 2010/3/28 Andrea Gavana : >>>> Example 1 >>>> >>>> # o2 and o3 are the number of production wells, split into 2 >>>> # different categories >>>> # inj is the number of injection wells >>>> # fomts is the final oil recovery >>>> >>>> rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) >>>> >>>> op = [50380] >>>> gp = [103014000] >>>> gi = [53151000] >>>> o2w = [45] >>>> o3w = [20] >>>> inw = [15] >>>> >>>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>>> >>>> # I => KNOW <= the answer to be close to +3.5e8 >>>> >>>> print fi >>>> >>>> [ -1.00663296e+08] >>>> >>>> (yeah right...) >>>> >>>> >>>> Example 2 >>>> >>>> Changing o2w from 45 to 25 (again, the answer should be close to 3e8, >>>> less wells => less production) >>>> >>>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>>> >>>> print fi >>>> >>>> [ ?1.30023424e+08] >>>> >>>> And keep in mind, that nowhere I have such low values of oil recovery >>>> in my data... the lowest one are close to 2.8e8... >>> >>> I want to put my2 cents in, fwiw ... >>> >>> What I see from >>> http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.interpolate.Rbf.html#scipy.interpolate.Rbf >>> are three things: >>> >>> 1. Rbf uses some weighting based on the radial functions. >>> 2. Rbf results go through the nodal points without *smooth* set to >>> some value != 0 >>> 3. Rbf is isotropic >>> >>> (3.) is most important. ?I see from your e-mail that the values you >>> pass in to Rbf are of very different order of magnitude. ?But the >>> default norm used in Rbf is for sure isotropic, i.e., it will result >>> in strange and useless "mean distances" in R^N where there are N >>> parameters. ?You have to either pass in a *norm* which weights the >>> coords according to their extent, or to scale the data such that the >>> aspect ratios of the hypecube's edges are sensible. > >> I believe I need a technical dictionary to properly understand all >> that... :-D . Sorry, I am no expert at all, really, just an amateur >> with some imagination, but your suggestion about the different >> magnitude of the matrix is a very interesting one. Although I have >> absolutely no idea on how to re-scale them properly to avoid RBFs >> going crazy. > > Scaling each axis by its standard deviation is a typical first start. > Shifting and scaling the values such that they each go from 0 to 1 is > another useful thing to try. Ah, magnifico! Thank you Robert and Friedrich, it seems to be working now... I get reasonable values for various combinations of parameters by scaling the input data using the standard deviation of each of them. 
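A minimal sketch of that standard-deviation scaling (the arrays below are random stand-ins, since the real 406-run data set is not posted in the thread; shifting and scaling each axis to the 0-1 range would follow the same pattern):

import numpy as np
from scipy.interpolate import Rbf

# Random stand-ins for the simulation inputs and the final oil recovery;
# columns: oil plateau, gas plateau, gas inj plateau, o2, o3, inj wells.
rng = np.random.RandomState(0)
n = 406
params = [rng.uniform(3e4, 6e4, n),      # oilPlateau
          rng.uniform(8e7, 1.2e8, n),    # gasPlateau
          rng.uniform(4e7, 6e7, n),      # gasInjPlateau
          rng.uniform(20, 60, n),        # o2
          rng.uniform(10, 30, n),        # o3
          rng.uniform(5, 25, n)]         # inj
fomts = rng.uniform(2.8e8, 3.6e8, n)

# Divide every input axis by its standard deviation so that the
# isotropic distance inside Rbf weighs all six parameters comparably.
stds = [p.std() for p in params]
rbf = Rbf(*([p / s for p, s in zip(params, stds)] + [fomts]))

# A query point has to be scaled with the *same* factors.
query = [50380.0, 103014000.0, 53151000.0, 45.0, 20.0, 15.0]
fi = rbf(*[q / s for q, s in zip(query, stds)])
print(fi)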
It seems also that the other interpolation schemes are much less erratic now, and in fact (using input values equal to the original data) I get these range of errors for the various schemes: inverse multiquadric -15.6098482614 15.7194674906 linear -1.76157336073e-010 1.24949181055e-010 cubic -0.000709860285963 0.018385394661 gaussian -293.930336611 282.058111404 quintic -0.176381494531 5.37780806549 multiquadric -30.9515933446 58.3786105046 thin-plate -7.06755391536e-006 8.71407169821e-005 In percentage. Some of them are still off the mark, but you should have seen them before ;-) . I'll do some more analysis tomorrow, and if it works I am going to try the bigger profile-over-time interpolation. Thank you so much guys for your suggestions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From kgdunn at gmail.com Sun Mar 28 20:12:57 2010 From: kgdunn at gmail.com (Kevin Dunn) Date: Sun, 28 Mar 2010 20:12:57 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 42, Issue 85 In-Reply-To: References: Message-ID: > Date: Sun, 28 Mar 2010 00:24:01 +0000 > From: Andrea Gavana > Subject: [Numpy-discussion] Interpolation question > To: Discussion of Numerical Python > Message-ID: > ? ? ? ? > Content-Type: text/plain; charset=ISO-8859-1 > > Hi All, > > ? ?I have an interpolation problem and I am having some difficulties > in tackling it. I hope I can explain myself clearly enough. > > Basically, I have a whole bunch of 3D fluid flow simulations (close to > 1000), and they are a result of different combinations of parameters. > I was planning to use the Radial Basis Functions in scipy, but for the > moment let's assume, to simplify things, that I am dealing only with > one parameter (x). In 1000 simulations, this parameter x has 1000 > values, obviously. The problem is, the outcome of every single > simulation is a vector of oil production over time (let's say 40 > values per simulation, one per year), and I would like to be able to > interpolate my x parameter (1000 values) against all the simulations > (1000x40) and get an approximating function that, given another x > parameter (of size 1x1) will give me back an interpolated production > profile (of size 1x40). Andrea, may I suggest a different approach to RBF's. Realize that your vector of 40 values for each row in y are not independent of each other (they will be correlated). First perform a principal component analysis on this 1000 x 40 matrix and reduce it down to a 1000 x A matrix, called your scores matrix, where A is the number of independent components. A is selected so that it adequately summarizes Y without over-fitting and you will find A << 40, maybe 2 or 3. There are tools, such as cross-validation, that do this well enough. Then you can relate your single column of X to these independent column in A using a tool such as least squares: one least squares model per column in the scores matrix. This works because each column in the score vector is independent (contains totally orthogonal information) to the others. But I would be surprised if this works well enough, unless A = 1. But it sounds like your don't just have a single column in you X-variables (you hinted that the single column was just for simplification). 
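A rough sketch of the PCA-plus-least-squares idea, using an SVD for the PCA step; X, Y and the number of components A are made-up stand-ins here, and in practice A would be chosen by cross-validation:

import numpy as np

rng = np.random.RandomState(1)
X = rng.rand(1000, 6)       # input parameters, one row per simulation
Y = rng.rand(1000, 40)      # production profiles, 40 values per run

# PCA of Y via an SVD of the centred matrix; keep A components.
Ymean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - Ymean, full_matrices=False)
A = 3
scores = U[:, :A] * s[:A]                      # 1000 x A scores matrix

# One least-squares fit per score column (with an intercept term).
Xa = np.hstack([X, np.ones((X.shape[0], 1))])
coef = np.linalg.lstsq(Xa, scores)[0]          # (7, A) coefficients

# Predict the full 40-point profile for a new parameter combination.
xnew = np.hstack([rng.rand(6), 1.0])
profile = Ymean + xnew.dot(coef).dot(Vt[:A])
print(profile.shape)                           # (40,)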
In that case, I would build a projection to latent structures model (PLS) model that builds a single latent-variable model that simultaneously models the X-matrix, the Y-matrix as well as providing the maximal covariance between these two matrices. > Something along these lines: > > import numpy as np > from scipy.interpolate import Rbf > > # x.shape = (1000, 1) > # y.shape = (1000, 40) > > rbf = Rbf(x, y) > > # New result with xi.shape = (1, 1) --> fi.shape = (1, 40) > fi = rbf(xi) > > > Does anyone have a suggestion on how I could implement this? Sorry if > it sounds confused... Please feel free to correct any wrong > assumptions I have made, or to propose other approaches if you think > RBFs are not suitable for this kind of problems. > > Thank you in advance for your suggestions. > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From kgdunn at gmail.com Sun Mar 28 20:18:25 2010 From: kgdunn at gmail.com (Kevin Dunn) Date: Sun, 28 Mar 2010 20:18:25 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 42, Issue 85 In-Reply-To: References: Message-ID: On Sun, Mar 28, 2010 at 20:12, Kevin Dunn wrote: >> Date: Sun, 28 Mar 2010 00:24:01 +0000 >> From: Andrea Gavana >> Subject: [Numpy-discussion] Interpolation question >> To: Discussion of Numerical Python >> Message-ID: >> ? ? ? ? >> Content-Type: text/plain; charset=ISO-8859-1 >> >> Hi All, >> >> ? ?I have an interpolation problem and I am having some difficulties >> in tackling it. I hope I can explain myself clearly enough. >> >> Basically, I have a whole bunch of 3D fluid flow simulations (close to >> 1000), and they are a result of different combinations of parameters. >> I was planning to use the Radial Basis Functions in scipy, but for the >> moment let's assume, to simplify things, that I am dealing only with >> one parameter (x). In 1000 simulations, this parameter x has 1000 >> values, obviously. The problem is, the outcome of every single >> simulation is a vector of oil production over time (let's say 40 >> values per simulation, one per year), and I would like to be able to >> interpolate my x parameter (1000 values) against all the simulations >> (1000x40) and get an approximating function that, given another x >> parameter (of size 1x1) will give me back an interpolated production >> profile (of size 1x40). > > Andrea, may I suggest a different approach to RBF's. > > Realize that your vector of 40 values for each row in y are not > independent of each other (they will be correlated). ?First perform a > principal component analysis on this 1000 x 40 matrix and reduce it > down to a 1000 x A matrix, called your scores matrix, where A is the > number of independent components. A is selected so that it adequately > summarizes Y without over-fitting and you will find A << 40, maybe 2 > or 3. There are tools, such as cross-validation, that do this well > enough. > > Then you can relate your single column of X to these independent > column in A using a tool such as least squares: one least squares > model per column in the scores matrix. ?This works because each column > in the score vector is independent (contains totally orthogonal > information) to the others. ?But I would be surprised if this works > well enough, unless A = 1. 
> > But it sounds like your don't just have a single column in you > X-variables (you hinted that the single column was just for > simplification). ?In that case, I would build a projection to latent > structures model (PLS) model that builds a single latent-variable > model that simultaneously models the X-matrix, the Y-matrix as well as > providing the maximal covariance between these two matrices. Ooops, that got sent before I was about to end by saying, that if you need some references and an outline of code, then I can readily provide these. This is a standard problem with data from spectroscopic instruments and with batch processes. They produce hundreds, sometimes 1000's of samples per row. PCA and PLS are very effective at summarizing these down to a much smaller number of independent columns, very often just a handful, and relating them (i.e. building a predictive model) to other data matrices. Kevin Dunn >> Something along these lines: >> >> import numpy as np >> from scipy.interpolate import Rbf >> >> # x.shape = (1000, 1) >> # y.shape = (1000, 40) >> >> rbf = Rbf(x, y) >> >> # New result with xi.shape = (1, 1) --> fi.shape = (1, 40) >> fi = rbf(xi) >> >> >> Does anyone have a suggestion on how I could implement this? Sorry if >> it sounds confused... Please feel free to correct any wrong >> assumptions I have made, or to propose other approaches if you think >> RBFs are not suitable for this kind of problems. >> >> Thank you in advance for your suggestions. >> >> Andrea. >> >> "Imagination Is The Only Weapon In The War Against Reality." >> http://xoomer.alice.it/infinity77/ >> >> ==> Never *EVER* use RemovalGroup for your house removal. You'll >> regret it forever. >> http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > From brennan.williams at visualreservoir.com Sun Mar 28 20:36:19 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Mon, 29 Mar 2010 13:36:19 +1300 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: <4BAFF603.900@visualreservoir.com> Andrea Gavana wrote: > On 29 March 2010 00:34, Robert Kern wrote: > >> On Sun, Mar 28, 2010 at 18:30, Andrea Gavana wrote: >> >>> Hi Friedrich & All, >>> >>> On 28 March 2010 23:51, Friedrich Romstedt wrote: >>> >>>> 2010/3/28 Andrea Gavana : >>>> >>>>> Example 1 >>>>> >>>>> # o2 and o3 are the number of production wells, split into 2 >>>>> # different categories >>>>> # inj is the number of injection wells >>>>> # fomts is the final oil recovery >>>>> >>>>> rbf = Rbf(oilPlateau, gasPlateau, gasInjPlateau, o2, o3, inj, fomts) >>>>> >>>>> op = [50380] >>>>> gp = [103014000] >>>>> gi = [53151000] >>>>> o2w = [45] >>>>> o3w = [20] >>>>> inw = [15] >>>>> >>>>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>>>> >>>>> # I => KNOW <= the answer to be close to +3.5e8 >>>>> >>>>> print fi >>>>> >>>>> [ -1.00663296e+08] >>>>> >>>>> (yeah right...) >>>>> >>>>> >>>>> Example 2 >>>>> >>>>> Changing o2w from 45 to 25 (again, the answer should be close to 3e8, >>>>> less wells => less production) >>>>> >>>>> fi = rbf(op, gp, gi, o2w, o3w, inw) >>>>> >>>>> print fi >>>>> >>>>> [ 1.30023424e+08] >>>>> >>>>> And keep in mind, that nowhere I have such low values of oil recovery >>>>> in my data... the lowest one are close to 2.8e8... >>>>> >>>> I want to put my2 cents in, fwiw ... 
>>>> >>>> What I see from >>>> http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.interpolate.Rbf.html#scipy.interpolate.Rbf >>>> are three things: >>>> >>>> 1. Rbf uses some weighting based on the radial functions. >>>> 2. Rbf results go through the nodal points without *smooth* set to >>>> some value != 0 >>>> 3. Rbf is isotropic >>>> >>>> (3.) is most important. I see from your e-mail that the values you >>>> pass in to Rbf are of very different order of magnitude. But the >>>> default norm used in Rbf is for sure isotropic, i.e., it will result >>>> in strange and useless "mean distances" in R^N where there are N >>>> parameters. You have to either pass in a *norm* which weights the >>>> coords according to their extent, or to scale the data such that the >>>> aspect ratios of the hypecube's edges are sensible. >>>> >>> I believe I need a technical dictionary to properly understand all >>> that... :-D . Sorry, I am no expert at all, really, just an amateur >>> with some imagination, but your suggestion about the different >>> magnitude of the matrix is a very interesting one. Although I have >>> absolutely no idea on how to re-scale them properly to avoid RBFs >>> going crazy. >>> >> Scaling each axis by its standard deviation is a typical first start. >> Shifting and scaling the values such that they each go from 0 to 1 is >> another useful thing to try. >> > > Ah, magnifico! Thank you Robert and Friedrich, it seems to be working > now... I get reasonable values for various combinations of parameters > by scaling the input data using the standard deviation of each of > them. It seems also that the other interpolation schemes are much less > erratic now, and in fact (using input values equal to the original > data) I get these range of errors for the various schemes: > > inverse multiquadric -15.6098482614 15.7194674906 > linear -1.76157336073e-010 1.24949181055e-010 > cubic -0.000709860285963 0.018385394661 > gaussian -293.930336611 282.058111404 > quintic -0.176381494531 5.37780806549 > multiquadric -30.9515933446 58.3786105046 > thin-plate -7.06755391536e-006 8.71407169821e-005 > > In percentage. Some of them are still off the mark, but you should > have seen them before ;-) . > > That's great news. Going to give it a try myself. Interesting that the linear scheme gives the lowest error range. > I'll do some more analysis tomorrow, and if it works I am going to try > the bigger profile-over-time interpolation. Thank you so much guys for > your suggestions. > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From kgdunn at gmail.com Sun Mar 28 20:38:52 2010 From: kgdunn at gmail.com (Kevin Dunn) Date: Sun, 28 Mar 2010 20:38:52 -0400 Subject: [Numpy-discussion] Interpolation question Message-ID: > Message: 5 > Date: Sun, 28 Mar 2010 00:24:01 +0000 > From: Andrea Gavana > Subject: [Numpy-discussion] Interpolation question > To: Discussion of Numerical Python > Message-ID: > ? ? ? ? > Content-Type: text/plain; charset=ISO-8859-1 > > Hi All, > > ? ?I have an interpolation problem and I am having some difficulties > in tackling it. I hope I can explain myself clearly enough. 
> > Basically, I have a whole bunch of 3D fluid flow simulations (close to > 1000), and they are a result of different combinations of parameters. > I was planning to use the Radial Basis Functions in scipy, but for the > moment let's assume, to simplify things, that I am dealing only with > one parameter (x). In 1000 simulations, this parameter x has 1000 > values, obviously. The problem is, the outcome of every single > simulation is a vector of oil production over time (let's say 40 > values per simulation, one per year), and I would like to be able to > interpolate my x parameter (1000 values) against all the simulations > (1000x40) and get an approximating function that, given another x > parameter (of size 1x1) will give me back an interpolated production > profile (of size 1x40). [I posted the following earlier but forgot to change the subject - it appears as a new thread called "NumPy-Discussion Digest, Vol 42, Issue 85" - please ignore that thread] Andrea, may I suggest a different approach to RBF's. Realize that your vector of 40 values for each row in y are not independent of each other (they will be correlated). First build a principal component analysis (PCA) model on this 1000 x 40 matrix and reduce it down to a 1000 x A matrix, called your scores matrix, where A is the number of independent components. A is selected so that it adequately summarizes Y without over-fitting and you will find A << 40, maybe A = 2 or 3. There are tools, such as cross-validation, that will help select a reasonable value of A. Then you can relate your single column of X to these independent columns in A using a tool such as least squares: one least squares model per column in the scores matrix. This works because each column in the score vector is independent (contains totally orthogonal information) to the others. But I would be surprised if this works well enough, unless A = 1. But it sounds like your don't just have a single column in your X-variables (you hinted that the single column was just for simplification). In that case, I would build a projection to latent structures model (PLS) model that builds a single latent-variable model that simultaneously models the X-matrix, the Y-matrix as well as providing the maximal covariance between these two matrices. If you need some references and an outline of code, then I can readily provide these. This is a standard problem with data from spectroscopic instruments and with batch processes. They produce hundreds, sometimes 1000's of samples per row. PCA and PLS are very effective at summarizing these down to a much smaller number of independent columns, very often just a handful, and relating them (i.e. building a predictive model) to other data matrices. I also just saw the suggestions of others to center the data by subtracting the mean from each column in Y and scaling (by dividing through by the standard deviation). This is a standard data preprocessing step, called autoscaling and makes sense for any data analysis, as you already discovered. Hope that helps, Kevin > Something along these lines: > > import numpy as np > from scipy.interpolate import Rbf > > # x.shape = (1000, 1) > # y.shape = (1000, 40) > > rbf = Rbf(x, y) > > # New result with xi.shape = (1, 1) --> fi.shape = (1, 40) > fi = rbf(xi) > > > Does anyone have a suggestion on how I could implement this? Sorry if > it sounds confused... 
Please feel free to correct any wrong > assumptions I have made, or to propose other approaches if you think > RBFs are not suitable for this kind of problems. > > Thank you in advance for your suggestions. > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From cournape at gmail.com Mon Mar 29 06:13:11 2010 From: cournape at gmail.com (David Cournapeau) Date: Mon, 29 Mar 2010 19:13:11 +0900 Subject: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy Message-ID: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> Hi, I have worked on porting scipy to py3k, and it is mostly working. One thing which would be useful is to install something similar to npy_3kcompat.h in numpy, so that every scipy extension could share the compat header. Is the current python 3 compatibility header usable "in the wild", or will it still significantly change (this requiring a different, more stable one) ? cheers, David From sierra_mtnview at sbcglobal.net Mon Mar 29 06:58:13 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 29 Mar 2010 03:58:13 -0700 Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? In-Reply-To: References: <4BAD54D5.3080500@sbcglobal.net> <4BAD76C3.1020708@sbcglobal.net> Message-ID: <4BB087C5.8000800@sbcglobal.net> Yes, that is very likely the solution. It's clear that the module is in the list. I say likely, since I've never done it before and there always seems to be something that gets overlooked in what seems to be something so simple. :-) However, my colleague is on XP. Ah, same idea there. I find it odd that no one seems to address the removal of modules from site-packages from Windows. Thanks. On 3/28/2010 9:13 AM, PHobson at Geosyntec.com wrote: > If your on windows, you can probably get rid of it through the Add/Remove Programs portion of the Conrol Panel. > > -- > Paul Hobson > Senior Staff Engineer > Geosyntec Consultants > Portland, OR > > On Mar 26, 2010, at 8:09 PM, "Wayne Watson"> wrote: > > Thanks. How do I switch? Do I just pull down 1.3 or better 1.2 (I use it.), and install it? How do I (actually my colleague) somehow remove 1.4? Is it as easy as going to IDLE's path browser and removing, under site-packages, numpy? (I'm not sure that's even possible. I don't see a right-click menu.) > > On 3/26/2010 7:22 PM, PHobson at Geosyntec.com wrote: > Wayne, > > The current release of Scipy doesn?t work perfectly well with Numpy 1.4. > > On my systems (Mac OS 10.6, WinXP, and Ubuntu), I?m running Numpy 1.4 with the current Scipy on Python 2.6.4. I get the same error you describe below on the first attempt. For some reason unknown to me, it works on the second try. > > Switching to Numpy 1.3 is the best solution to the error. > > -paul > > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Wayne Watson > Sent: Friday, March 26, 2010 5:44 PM > To: numpy-discussion at scipy.org > Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4? > > I wrote a program in Python 2.5 under Win7 and it runs fine using Numpy 1.2 , but not on a colleague's machine who has a slightly newer 2.5. We both use IDLE to execute the program. 
During import he gets this: > > >>>> > Traceback (most recent call last): > File "C:\Documents and Settings\HP_Administrator.DavesDesktop\My Documents\Astro\Meteors\NC-FireballReport.py", line 38, in > from scipy import stats as stats # scoreatpercentile > File "C:\Python25\lib\site-packages\scipy\stats\__init__.py", line 7, in > from stats import * > File "C:\Python25\lib\site-packages\scipy\stats\stats.py", line 191, in > import scipy.special as special > File "C:\Python25\lib\site-packages\scipy\special\__init__.py", line 22, in > from numpy.testing import NumpyTest > ImportError: cannot import name NumpyTest > >>>> > Comments? > > > -- > > Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > > > (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > > Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet > > Poisoned Shipments. Serious illegal waste dumping may be > > occuring in the Meditrainean. Radioactive material, > > mercury, biohazards. -- Sci Am Mag, Feb., 2010, p14f. > > > > Web Page:<www.speckledwithstars.net/> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet > Poisoned Shipments. Serious illegal waste dumping may be > occuring in the Meditrainean. Radioactive material, > mercury, biohazards. -- Sci Am Mag, Feb., 2010, p14f. > > Web Page:<www.speckledwithstars.net/> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet Poisoned Shipments. Serious illegal waste dumping may be occuring in the Meditrainean. Radioactive material, mercury, biohazards. -- Sci Am Mag, Feb., 2010, p14f. Web Page: From nadavh at visionsense.com Mon Mar 29 07:07:48 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 29 Mar 2010 14:07:48 +0300 Subject: [Numpy-discussion] Applying formula to all in an array which hasvalue from previous References: Message-ID: <710F2847B0018641891D9A21602763605AD37B@ex3.envision.co.il> The general guideline: Suppose the function definition is: def func(x,y): # x and y are scalars bla bla bla ... return z # a scalar So, import numpy as np vecfun = np.vectorize(func) vecfun.ufunc.accumulate(array((0,1,2,3,4,5,6,7,8,9)) Nadav. -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Vishal Rana Sent: Sun 28-Mar-10 21:19 To: Discussion of Numerical Python Subject: [Numpy-discussion] Applying formula to all in an array which hasvalue from previous Hi, For a numpy array: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) I do some calculation with 0, 1... and get a value = 2.5, now use this value to do the repeat the same calculation with next element for example... 2.5, 2 and get a value = 3.1 3.1, 3 and get a value = 4.2 4.2, 4 and get a value = 5.1 .... .... and get a value = 8.5 8.5, 9 and get a value = 9.8 So I should be getting a new array like array([0, 2.5, 3.1, 4.2, 5.1, ..... 
8.5,9.8]) Is it where numpy or scipy can help? Thanks Vishal -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3308 bytes Desc: not available URL: From bsouthey at gmail.com Mon Mar 29 10:00:58 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Mar 2010 09:00:58 -0500 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> Message-ID: <4BB0B29A.4040905@gmail.com> On 03/27/2010 01:31 PM, Ryan May wrote: > On Sat, Mar 27, 2010 at 11:12 AM, wrote: > >> On Sat, Mar 27, 2010 at 1:00 PM, Ryan May wrote: >> >>> On Mon, Mar 22, 2010 at 8:14 AM, Ryan May wrote: >>> >>>> On Sun, Mar 21, 2010 at 11:57 PM, wrote: >>>> >>>>> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I found that trapz() doesn't work with subclasses: >>>>>> >>>>>> http://projects.scipy.org/numpy/ticket/1438 >>>>>> >>>>>> A simple patch (attached) to change asarray() to asanyarray() fixes >>>>>> the problem fine. >>>>>> >>>>> Are you sure this function works with matrices and other subclasses? >>>>> >>>>> Looking only very briefly at it: the multiplication might be a problem. >>>>> >>>> Correct, it probably *is* a problem in some cases with matrices. In >>>> this case, I was using quantities (Darren Dale's unit-aware array >>>> package), and the result was that units were stripped off. >>>> >>>> The patch can't make trapz() work with all subclasses. However, right >>>> now, you have *no* hope of getting a subclass out of trapz(). With >>>> this change, subclasses that don't redefine operators can work fine. >>>> If you're passing a Matrix to trapz() and expecting it to work, IMHO >>>> you're doing it wrong. You can still pass one in by using asarray() >>>> yourself. Without this patch, I'm left with copying and maintaining a >>>> copy of the code elsewhere, just so I can loosen the function's input >>>> processing. That seems wrong, since there's really no need in my case >>>> to drop down to an ndarray. The input I'm giving it supports all the >>>> operations it needs, so it should just work with my original input. >>>> >> With asarray it gives correct results for matrices and all array_like >> and subclasses, it just doesn't preserve the type. >> Your patch would break matrices and possibly other types, masked_arrays?, ... >> > It would break matrices, yes. I would argue that masked arrays are > already broken with trapz: > > In [1]: x = np.arange(10) > > In [2]: y = x * x > > In [3]: np.trapz(y, x) > Out[3]: 244.5 > > In [4]: ym = np.ma.array(y, mask=(x>4)&(x<7)) > > In [5]: np.trapz(ym, x) > Out[5]: 244.5 > > In [6]: y[5:7] = 0 > > In [7]: ym = np.ma.array(y, mask=(x>4)&(x<7)) > > In [8]: np.trapz(ym, x) > Out[8]: 183.5 > > Because of the call to asarray(), the mask is completely discarded and > you end up with identical results to an unmasked array, > which is not what I'd expect. Worse, the actual numeric value of the > positions that were masked affect the final answer. My patch allows > this to work as expected too. > Actually you should assume that unless it is explicitly addressed (either by code or via a test), any subclass of ndarray (matrix, masked, structured, record and even sparse) may not provide a 'valid' answer. There are probably many numpy functions that only really work with the standard ndarray. 
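(A compact, self-contained version of the masked-array behaviour under discussion, using the same values as Ryan's session above, plus a hand-rolled trapezoid with masked arithmetic as a sketch of what mask-respecting integration could mean; it is not any committed fix.)

import numpy as np

x = np.arange(10.0)
y = x * x
ym = np.ma.array(y, mask=(x > 4) & (x < 7))

# trapz() as discussed here goes through asarray(), so the mask is
# dropped and the masked values enter the integral anyway.
print(np.trapz(ym, x))

# A hand-rolled trapezoidal sum with masked arithmetic: any interval
# touching a masked point is itself masked and drops out of the total.
d = np.diff(x)
manual = (d * (ym[1:] + ym[:-1]) / 2.0).sum()
print(manual)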
Most of the time people do not meet these with the subclasses or have workarounds so there has been little requirement to address this especially due to the added overhead needed for checking. Also, any patch that does not explicitly define the assumed behavior with points that are masked has to be rejected. It is not even clear what the expected behavior is for masked arrays should be: Is it even valid for trapz to be integrating across the full range if there are missing points? That implies some assumption about the missing points. If is valid, then should you just ignore the masked values or try to predict the missing values first? Perhaps you may want to have the option to do both. >> One solution would be using arraywrap as in numpy.linalg. >> > By arraywrap, I'm assuming you mean: > > def _makearray(a): > new = asarray(a) > wrap = getattr(a, "__array_prepare__", new.__array_wrap__) > return new, wrap > > I'm not sure if that's identical to just letting the subclass handle > what's needed. To my eyes, that doesn't look as though it'd be > equivalent, both for handling masked arrays and Quantities. For > quantities at least, the result of trapz will have different units > than either of the inputs. > > >> for related discussion: >> http://mail.scipy.org/pipermail/scipy-dev/2009-June/012061.html >> > Actually, that discussion kind of makes my point. Matrices are a pain > to make work in a general sense because they *break* ndarray > conventions--to me it doesn't make sense to help along classes that > break convention at the expense of making well-behaved classes a pain > to use. You should need an *explicit* cast of a matrix to an ndarray > instead of the function quietly doing it for you. ("Explicit is better > than implicit") It just seems absurd that if I make my own ndarray > subclass that *just* adds some behavior to the array, but doesn't > break *any* operations, I need to do one of the following: > > 1) Have my own copy of trapz that works with my class > 2) Wrap every call to numpy's own trapz() to put the metadata back. > > Does it not seem backwards that the class that breaks conventions > "just works" while those that don't break conventions, will work > perfectly with the function as written, need help to be treated > properly? > > Ryan > > You need your own version of trapz or whatever function because it has the behavior that you expect. But a patch should not break numpy so you need to at least to have a section that looks for masked array subtypes and performs the desired behavior(s). Bruce From charlesr.harris at gmail.com Mon Mar 29 10:54:21 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Mar 2010 08:54:21 -0600 Subject: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy In-Reply-To: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> References: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> Message-ID: On Mon, Mar 29, 2010 at 4:13 AM, David Cournapeau wrote: > Hi, > > I have worked on porting scipy to py3k, and it is mostly working. One > thing which would be useful is to install something similar to > npy_3kcompat.h in numpy, so that every scipy extension could share the > compat header. Is the current python 3 compatibility header usable "in > the wild", or will it still significantly change (this requiring a > different, more stable one) ? > > Pauli will have to weigh in here, but I think it is pretty stable and has been for a while. 
The only thing I'm thinking of changing are the PyCapsule compatibility functions; instead of having them downgrade PyCapsule to PyCObject like behaviour, go the other way. Going the other way requires changes to the surrounding code to handle errors, which is why Numpy doesn't use those functions at the moment. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Mar 29 11:11:47 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 29 Mar 2010 18:11:47 +0300 Subject: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy In-Reply-To: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> References: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> Message-ID: <1269875507.2743.10.camel@talisman> ma, 2010-03-29 kello 19:13 +0900, David Cournapeau kirjoitti: > I have worked on porting scipy to py3k, and it is mostly working. One > thing which would be useful is to install something similar to > npy_3kcompat.h in numpy, so that every scipy extension could share the > compat header. Is the current python 3 compatibility header usable "in > the wild", or will it still significantly change (this requiring a > different, more stable one) ? I believe it's reasonably stable, as it contains mostly simple stuff. Something perhaps can be added later, but I don't think anything will need to be removed. At least, I don't see what I would like to change there. The only thing I wouldn't perhaps like to have in the long run are the PyString and possibly PyInt redefinition macros. But if the header is going to be used elsewhere, we can as well freeze it now and promise not to remove or change anything existing. Pauli From rmay31 at gmail.com Mon Mar 29 11:17:25 2010 From: rmay31 at gmail.com (Ryan May) Date: Mon, 29 Mar 2010 09:17:25 -0600 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: <4BB0B29A.4040905@gmail.com> References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> <4BB0B29A.4040905@gmail.com> Message-ID: On Mon, Mar 29, 2010 at 8:00 AM, Bruce Southey wrote: > On 03/27/2010 01:31 PM, Ryan May wrote: >> Because of the call to asarray(), the mask is completely discarded and >> you end up with identical results to an unmasked array, >> which is not what I'd expect. ?Worse, the actual numeric value of the >> positions that were masked affect the final answer. My patch allows >> this to work as expected too. >> > Actually you should assume that unless it is explicitly addressed > (either by code or via a test), any subclass of ndarray (matrix, masked, > structured, record and even sparse) may not provide a 'valid' answer. > There are probably many numpy functions that only really work with the > standard ndarray. Most of the time people do not meet these with the > subclasses or have workarounds so there has been little requirement to > address this especially due to the added overhead needed for checking. It's not that I'm surprised that masked arrays don't work. It's more that the calls to np.asarray within trapz() have been held up as being necessary for things like matrices and (at the time) masked arrays to work properly; as if calling asarray() is supposed to make all subclasses work, though at a base level by dropping to an ndarray. To me, the current behavior with masked arrays is worse than if passing in a matrix raised an exception. 
One is a silently wrong answer, the other is a big error that the programmer can see, test, and fix. > Also, any patch that does not explicitly define the assumed behavior > with points that are masked ?has to be rejected. It is not even clear > what the expected behavior is for masked arrays should be: > Is it even valid for trapz to be integrating across the full range if > there are missing points? That implies some assumption about the missing > points. > If is valid, then should you just ignore the masked values or try to > predict the missing values first? Perhaps you may want to have the > option to do both. You're right, it doesn't actually work with MaskedArrays as it stand right now, because it calls add.reduce() directly instead of using the array.sum() method. Once fixed, by allowing MaskedArray to handle the operation, you end up not integrating over the masked region. Any operation involving masked points results in contributions by masked points are ignored. I guess it's as if you assumed the function was 0 over the masked region. If you wanted to ignore the masked points, but integrate over the region (making a really big trapezoid over that region), you could just pass in the .compressed() versions of the arrays. >> than implicit") It just seems absurd that if I make my own ndarray >> subclass that *just* adds some behavior to the array, but doesn't >> break *any* operations, I need to do one of the following: >> >> 1) Have my own copy of trapz that works with my class >> 2) Wrap every call to numpy's own trapz() to put the metadata back. >> >> Does it not seem backwards that the class that breaks conventions >> "just works" while those that don't break conventions, will work >> perfectly with the function as written, need help to be treated >> properly? >> > You need your own version of trapz or whatever function because it has > the behavior that you expect. But a patch should not break numpy so you > need to at least to have a section that looks for masked array subtypes > and performs the desired behavior(s). I'm not trying to be difficult but it seems like there are conflicting ideas here: we shouldn't break numpy, which in this case means making matrices no longer work with trapz(). On the other hand, subclasses can do a lot of things, so there's no real expectation that they should ever work with numpy functions in general. Am I missing something here? I'm just trying to understand what I perceive to be some inconsistencies in numpy's behavior and, more importantly, convention with regard subclasses. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From rmay31 at gmail.com Mon Mar 29 11:47:52 2010 From: rmay31 at gmail.com (Ryan May) Date: Mon, 29 Mar 2010 09:47:52 -0600 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> <4BB0B29A.4040905@gmail.com> Message-ID: Hi, I decided that having actual code that does what I want and keeps backwards compatibility (and adds tests) might be better than arguing semantics. I've updated my patch to: * Uses the array.sum() method instead of add.reduce to make subclasses fully work (this was still breaking masked arrays. * Catches an exception on doing the actual multiply and sum of the arrays and tries again after casting to ndarrays. This allows any subclasses that relied on being cast to still work. 
* Adds tests that ensure matrices work (test passes before and after changes to trapz()) and adds a test for masked arrays that checks that masked points are treated as expected. In this case, expected is defined to be the same as if you implemented the trapezoidal method by hand using MaskedArray's basic arithmetic operations. Attached here and at: http://projects.scipy.org/numpy/ticket/1438 I think this addresses the concerns that were raised about the changes for subclasses in this case. Let me know if I've missed something (or if there's no way in hell any such patch will ever be committed). Thanks, Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From bsouthey at gmail.com Mon Mar 29 12:39:29 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Mar 2010 11:39:29 -0500 Subject: [Numpy-discussion] numpy.trapz() doesn't respect subclass In-Reply-To: References: <1cd32cbb1003212157s767728f9r3e66ea5a8b458532@mail.gmail.com> <1cd32cbb1003271012l534a02c7we440ae1451daf0d9@mail.gmail.com> <4BB0B29A.4040905@gmail.com> Message-ID: <4BB0D7C1.1000306@gmail.com> On 03/29/2010 10:17 AM, Ryan May wrote: > On Mon, Mar 29, 2010 at 8:00 AM, Bruce Southey wrote: > >> On 03/27/2010 01:31 PM, Ryan May wrote: >> >>> Because of the call to asarray(), the mask is completely discarded and >>> you end up with identical results to an unmasked array, >>> which is not what I'd expect. Worse, the actual numeric value of the >>> positions that were masked affect the final answer. My patch allows >>> this to work as expected too. >>> >>> >> Actually you should assume that unless it is explicitly addressed >> (either by code or via a test), any subclass of ndarray (matrix, masked, >> structured, record and even sparse) may not provide a 'valid' answer. >> There are probably many numpy functions that only really work with the >> standard ndarray. Most of the time people do not meet these with the >> subclasses or have workarounds so there has been little requirement to >> address this especially due to the added overhead needed for checking. >> > It's not that I'm surprised that masked arrays don't work. It's more > that the calls to np.asarray within trapz() have been held up as being > necessary for things like matrices and (at the time) masked arrays to > work properly; as if calling asarray() is supposed to make all > subclasses work, though at a base level by dropping to an ndarray. To > me, the current behavior with masked arrays is worse than if passing > in a matrix raised an exception. One is a silently wrong answer, the > other is a big error that the programmer can see, test, and fix. > > >> Also, any patch that does not explicitly define the assumed behavior >> with points that are masked has to be rejected. It is not even clear >> what the expected behavior is for masked arrays should be: >> Is it even valid for trapz to be integrating across the full range if >> there are missing points? That implies some assumption about the missing >> points. >> If is valid, then should you just ignore the masked values or try to >> predict the missing values first? Perhaps you may want to have the >> option to do both. >> > You're right, it doesn't actually work with MaskedArrays as it stand > right now, because it calls add.reduce() directly instead of using the > array.sum() method. Once fixed, by allowing MaskedArray to handle the > operation, you end up not integrating over the masked region. 
Any > operation involving masked points results in contributions by masked > points are ignored. I guess it's as if you assumed the function was 0 > over the masked region. If you wanted to ignore the masked points, > but integrate over the region (making a really big trapezoid over that > region), you could just pass in the .compressed() versions of the > arrays. > > >>> than implicit") It just seems absurd that if I make my own ndarray >>> subclass that *just* adds some behavior to the array, but doesn't >>> break *any* operations, I need to do one of the following: >>> >>> 1) Have my own copy of trapz that works with my class >>> 2) Wrap every call to numpy's own trapz() to put the metadata back. >>> >>> Does it not seem backwards that the class that breaks conventions >>> "just works" while those that don't break conventions, will work >>> perfectly with the function as written, need help to be treated >>> properly? >>> >>> >> You need your own version of trapz or whatever function because it has >> the behavior that you expect. But a patch should not break numpy so you >> need to at least to have a section that looks for masked array subtypes >> and performs the desired behavior(s). >> > I'm not trying to be difficult but it seems like there are conflicting > ideas here: we shouldn't break numpy, which in this case means making > matrices no longer work with trapz(). On the other hand, subclasses > can do a lot of things, so there's no real expectation that they > should ever work with numpy functions in general. Am I missing > something here? I'm just trying to understand what I perceive to be > some inconsistencies in numpy's behavior and, more importantly, > convention with regard subclasses. > > Ryan > > You should not confuse class functions with normal Python functions. Functions that are inherited from the ndarray superclass should be the same in the subclass unless these class functions have been modified. However many functions like trapz are not part of the ndarray superclass that have been written to handle the standard array (i.e. the unmodified ndarray superclass) but these may or may not work for all ndarray subclasses. Other functions have been written to handle the specific ndarray subclass such as masked array or Matrix and may (but not guaranteed to) work for the standard array. Thus, I think your 'inconsistencies' relate to the simple fact that not all numpy functions are aware of ndarray subclasses. What is missing are the bug reports and solutions for these functions that occur when the expected behavior differs between the ndarray superclass and ndarray subclasses. In the case of trapz, the bug report needs to be at least an indication of what is the expected behavior when there are masked values present. Bruce From pascal22p at parois.net Mon Mar 29 17:00:53 2010 From: pascal22p at parois.net (Pascal) Date: Mon, 29 Mar 2010 23:00:53 +0200 Subject: [Numpy-discussion] Fourier transform Message-ID: <20100329230053.2d32665f@parois.net> Hi, Does anyone have an idea how fft functions are implemented? Is it pure python? based on BLAS/LAPACK? or is it using fftw? I successfully used numpy.fft in 3D. I would like to know if I can calculate a specific a plane using the numpy.fft. I have in 3D: r(x, y, z)=\sum_h^N-1 \sum_k^M-1 \sum_l^O-1 f_{hkl} \exp(-2\pi \i (hx/N+ky/M+lz/O)) So for the plane, z is no longer independant. 
I need to solve the system: ax+by+cz+d=0 r(x, y, z)=\sum_h^N-1 \sum_k^M-1 \sum_l^O-1 f_{hkl} \exp(-2\pi \i (hx/N+ky/M+lz/O)) Do you think it's possible to use numpy.fft for this? Regards, Pascal From robert.kern at gmail.com Mon Mar 29 17:05:46 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Mar 2010 16:05:46 -0500 Subject: [Numpy-discussion] Fourier transform In-Reply-To: <20100329230053.2d32665f@parois.net> References: <20100329230053.2d32665f@parois.net> Message-ID: <3d375d731003291405r6f127320s14c5b4bb76f7897f@mail.gmail.com> On Mon, Mar 29, 2010 at 16:00, Pascal wrote: > Hi, > > Does anyone have an idea how fft functions are implemented? Is it pure > python? based on BLAS/LAPACK? or is it using fftw? Using FFTPACK converted from FORTRAN to C. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From andrea.gavana at gmail.com Mon Mar 29 17:20:00 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 22:20:00 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: Hi All, On 29 March 2010 00:59, Andrea Gavana wrote: > On 29 March 2010 00:34, Robert Kern wrote: >> Scaling each axis by its standard deviation is a typical first start. >> Shifting and scaling the values such that they each go from 0 to 1 is >> another useful thing to try. > > Ah, magnifico! Thank you Robert and Friedrich, it seems to be working > now... I get reasonable values for various combinations of parameters > by scaling the input data using the standard deviation of each of > them. It seems also that the other interpolation schemes are much less > erratic now, and in fact (using input values equal to the original > data) I get these range of errors for the various schemes: > > inverse multiquadric -15.6098482614 15.7194674906 > linear ? ? ? ? ? ? ? ? ? ? ? ?-1.76157336073e-010 1.24949181055e-010 > cubic ? ? ? ? ? ? ? ? ? ? ? ?-0.000709860285963 0.018385394661 > gaussian ? ? ? ? ? ? ? ? ?-293.930336611 282.058111404 > quintic ? ? ? ? ? ? ? ? ? ? ?-0.176381494531 5.37780806549 > multiquadric ? ? ? ? ? ? -30.9515933446 58.3786105046 > thin-plate ? ? ? ? ? ? ? ? ?-7.06755391536e-006 8.71407169821e-005 > > In percentage. Some of them are still off the mark, but you should > have seen them before ;-) . > > I'll do some more analysis tomorrow, and if it works I am going to try > the bigger profile-over-time interpolation. Thank you so much guys for > your suggestions. If anyone is interested in a follow up, I have tried a time-based interpolation of my oil profile (and gas and gas injection profiles) using those 40 interpolators (and even more, up to 400, one every month of fluid flow simulation time step). I wasn't expecting too much out of it, but when the interpolated profiles came out (for different combinations of input parameters) I felt like being on the wrong side of the Lala River in the valley of Areyoukidding. The results are striking. I get an impressive agreement between this interpolated proxy model and the real simulations, whether I use existing combinations of parameters or new ones (i.e., I create the interpolation and then run the real fluid flow simulation, comparing the outcomes). 
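Just to give an idea of the kind of construction I mean, here is a rough, untested sketch with toy stand-in data and made-up array names (the real script obviously loads the simulation results from file, and the inputs are scaled by their standard deviations as discussed above):

import numpy as np
from scipy.interpolate import Rbf

# Toy stand-in data: 100 simulations, 6 input parameters, 40 yearly report steps
rng = np.random.RandomState(0)
params = rng.rand(100, 6)        # one row of input parameters per simulation
profiles = rng.rand(100, 40)     # one production profile per simulation

# Scale every input parameter by its standard deviation
scaled = params / params.std(axis=0)

# One RBF interpolator per report step
interpolators = [Rbf(*(list(scaled.T) + [profiles[:, step]]))
                 for step in range(profiles.shape[1])]

# Interpolated profile for a new (scaled) combination of parameters
new_point = rng.rand(6) / params.std(axis=0)
profile = np.array([rbf(*new_point) for rbf in interpolators])
print profile.shape   # -> (40,)
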
As an aside, I got my colleagues reservoir engineers playfully complaining that it's time for them to pack their stuff and go home as this interpolator is doing all the job for us; obviously, this misses the point that it took 4 years to build such a comprehensive bunch of simulations which now allows us to somewhat "predict" a possible production profile in advance. I wrapped everything up in a wxPython GUI with some Matplotlib graphs, and everyone seems happy. The only small complain I have is that I wasn't able to come up with a vector implementation of RBFs, so it can be pretty slow to build and interpolate 400 RBFs for each property (3 of them). Thanks to everyone for your valuable suggestions! Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From andrea.gavana at gmail.com Mon Mar 29 17:23:31 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 22:23:31 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: Message-ID: Hi Kevin, On 29 March 2010 01:38, Kevin Dunn wrote: >> Message: 5 >> Date: Sun, 28 Mar 2010 00:24:01 +0000 >> From: Andrea Gavana >> Subject: [Numpy-discussion] Interpolation question >> To: Discussion of Numerical Python >> Message-ID: >> ? ? ? ? >> Content-Type: text/plain; charset=ISO-8859-1 >> >> Hi All, >> >> ? ?I have an interpolation problem and I am having some difficulties >> in tackling it. I hope I can explain myself clearly enough. >> >> Basically, I have a whole bunch of 3D fluid flow simulations (close to >> 1000), and they are a result of different combinations of parameters. >> I was planning to use the Radial Basis Functions in scipy, but for the >> moment let's assume, to simplify things, that I am dealing only with >> one parameter (x). In 1000 simulations, this parameter x has 1000 >> values, obviously. The problem is, the outcome of every single >> simulation is a vector of oil production over time (let's say 40 >> values per simulation, one per year), and I would like to be able to >> interpolate my x parameter (1000 values) against all the simulations >> (1000x40) and get an approximating function that, given another x >> parameter (of size 1x1) will give me back an interpolated production >> profile (of size 1x40). > > [I posted the following earlier but forgot to change the subject - it > appears as a new thread called "NumPy-Discussion Digest, Vol 42, Issue > 85" - please ignore that thread] > > Andrea, may I suggest a different approach to RBF's. > > Realize that your vector of 40 values for each row in y are not > independent of each other (they will be correlated). ?First build a > principal component analysis (PCA) model on this 1000 x 40 matrix and > reduce it down to a 1000 x A matrix, called your scores matrix, where > A is the number of independent components. A is selected so that it > adequately summarizes Y without over-fitting and you will find A << > 40, maybe A = 2 or 3. There are tools, such as cross-validation, that > will help select a reasonable value of A. > > Then you can relate your single column of X to these independent > columns in A using a tool such as least squares: one least squares > model per column in the scores matrix. ?This works because each column > in the score vector is independent (contains totally orthogonal > information) to the others. 
?But I would be surprised if this works > well enough, unless A = 1. > > But it sounds like your don't just have a single column in your > X-variables (you hinted that the single column was just for > simplification). ?In that case, I would build a projection to latent > structures model (PLS) model that builds a single latent-variable > model that simultaneously models the X-matrix, the Y-matrix as well as > providing the maximal covariance between these two matrices. > > If you need some references and an outline of code, then I can readily > provide these. > > This is a standard problem with data from spectroscopic instruments > and with batch processes. ?They produce hundreds, sometimes 1000's of > samples per row. PCA and PLS are very effective at summarizing these > down to a much smaller number of independent columns, very often just > a handful, and relating them (i.e. building a predictive model) to > other data matrices. > > I also just saw the suggestions of others to center the data by > subtracting the mean from each column in Y and scaling (by dividing > through by the standard deviation). ?This is a standard data > preprocessing step, called autoscaling and makes sense for any data > analysis, as you already discovered. I have got some success by using time-based RBFs interpolations, but I am always open to other possible implementations (as the one I am using can easily fail for strange combinations of input parameters). Unfortunately, my understanding of your explanation is very very limited: I am not an expert at all, so it's a bit hard for me to translate the mathematical technical stuff in something I can understand. If you have an example code (even a very trivial one) for me to study so that I can understand what the code is actually doing, I would be more than grateful for your help :-) Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From andrea.gavana at gmail.com Mon Mar 29 17:31:19 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 22:31:19 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <4BAFEA67.7050808@visualreservoir.com> References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <4BAFEA67.7050808@visualreservoir.com> Message-ID: Hi Brennan & All, On 29 March 2010 00:46, Brennan Williams wrote: > Andrea Gavana wrote: >> As for your question, the parameter are not spread completely >> randomly, as this is a collection of simulations done over the years, >> trying manually different scenarios, without having in mind a proper >> experimental design or any other technique. Nor the parameter ?values >> vary only on one axis in each simulation (few of them are like that). >> > > I assume that there is a default "norm" that calculates the distance > between points irrespective of the order of the input coordinates? > > So if that isn't working, leading to the spurious results, the next step > is to normalise all the inputs so they are in the same range, e.g > max-min=1.0 Scaling the input data using their standard deviation worked very well for my case. > On a related note, what approach would be best if one of the input > parameters wasn't continuous? e.g. I have three quite different > geological distributions called say A,B and C. 
> SO some of my simulations use distribution A, some use B and some use C. > I could assign them the numbers 1,2,3 but a value of 1.5 is meaningless. Not sure about this: I do have integer numbers too (the number of wells can not be a fractional one, obviously), but I don't care about it as it is an input parameter (i.e., the user choose how many o2/o3/injector wells he/she wants, and I get an interpolated production profiles). Are you saying that the geological realization is one of your output variables? > Andrea, if you have 1TB of data for 1,000 simulation runs, then, if I > assume you only mean the smspec/unsmry files, that means each of your > summary files is 1GB in size? It depends on the simulation, and also for how many years the forecast is run. Standard runs go up to 2038, but we have a bunch of them running up to 2120 (!) . As we do have really many wells in this field, the ECLIPSE summary file dimensions skyrocket pretty quickly. > Are those o2w,o3w and inw figures the number of new wells only or > existing+new? It's fun dealing with this amount of data isn't it? They're only new wells, with a range of 0 <= o2w <= 150 and 0 <= o3 <= 84 and 0 <= inw <= 37, and believe it or not, our set of simulations contains a lot of the possible combinations for these 2 variables (and the other 4 variables too)... Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From Chris.Barker at noaa.gov Mon Mar 29 17:35:50 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 29 Mar 2010 14:35:50 -0700 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: <4BB11D36.1040907@noaa.gov> Andrea Gavana wrote: >>> Scaling each axis by its standard deviation is a typical first start. >>> Shifting and scaling the values such that they each go from 0 to 1 is >>> another useful thing to try. >> Ah, magnifico! Thank you Robert and Friedrich, it seems to be working >> now... One other thought -- core to much engineering is dimensional analysis -- you know how we like those non-dimensional number! I think this situation is less critical, as you are interpolating, not optimizing or something, but many interpolation methods are built on the idea of some data points being closer than others to your point of interest. Who is to say if a point that is 2 hours away is closer or father than one 2 meters away? This is essentially what you are doing. Scaling everything to the same range is a start, but then you've still given them an implicit weighting. An alternative to to figure out a way to non-dimensionalize your parameters -- that *may* give you a more physically based scaling. And you might invent the "Gavana Number" in the process ;-) -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From andrea.gavana at gmail.com Mon Mar 29 17:45:30 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 22:45:30 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <4BB11D36.1040907@noaa.gov> References: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> <4BB11D36.1040907@noaa.gov> Message-ID: Hi Chris and All, On 29 March 2010 22:35, Christopher Barker wrote: > Andrea Gavana wrote: >>>> Scaling each axis by its standard deviation is a typical first start. >>>> Shifting and scaling the values such that they each go from 0 to 1 is >>>> another useful thing to try. >>> Ah, magnifico! Thank you Robert and Friedrich, it seems to be working >>> now... > > One other thought -- core to much engineering is dimensional analysis -- > you know how we like those non-dimensional number! > > I think this situation is less critical, as you are interpolating, not > optimizing or something, but many interpolation methods are built on the > idea of some data points being closer than others to your point of interest. > > Who is to say if a point that is 2 hours away is closer or father than > one 2 meters away? This is essentially what you are doing. > > Scaling everything to the same range is a start, but then you've still > given them an implicit weighting. > > An alternative to to figure out a way to non-dimensionalize your > parameters -- that *may* give you a more physically based scaling. > > And you might invent the "Gavana Number" in the process ;-) Might be :-D . At the moment I am pretty content with what I have got, it seems to be working fairly well although I didn't examine all the possible cases and it is very likely that my little tool will break disastrously for some combinations of parameters. However, I am not sure I am allowed to post an image comparing the "real" simulation with the prediction of the interpolated proxy model, but if you could see it, you would surely agree that it is a very reasonable approach. It seems to good to be true :-D . Again, this is mainly due to the fact that we have a very extensive set of simulations which cover a wide range of combinations of parameters, so the interpolation itself is only doing its job correctly. I don't think the technique can be applied blindly to whatever oil/gas/condensate /whatever reservoir, as non-linearities in fluid flow simulations appear where you least expect them, but it seems to be working ok (up to now) for our field (which is, by the way, one of the most complex and less understood condensate reservoir out there). Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. 
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <==

From dwf at cs.toronto.edu Mon Mar 29 17:59:29 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 29 Mar 2010 17:59:29 -0400 Subject: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals Message-ID: Hi, In my setup.py, I have from numpy.distutils.misc_util import Configuration fflags= '-fdefault-real-8 -ffixed-form' config = Configuration( 'foo', parent_package=None, top_path=None, f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, fflags) ) However I am still getting stuff returned in 'real' variables as dtype=float32. Am I doing something wrong? Thanks, David

From charlesr.harris at gmail.com Mon Mar 29 18:12:56 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Mar 2010 16:12:56 -0600 Subject: [Numpy-discussion] Fourier transform In-Reply-To: <20100329230053.2d32665f@parois.net> References: <20100329230053.2d32665f@parois.net> Message-ID: On Mon, Mar 29, 2010 at 3:00 PM, Pascal wrote: > Hi, > > Does anyone have an idea how fft functions are implemented? Is it pure > python? based on BLAS/LAPACK? or is it using fftw? > > I successfully used numpy.fft in 3D. I would like to know if I can > calculate a specific a plane using the numpy.fft. > > I have in 3D: > r(x, y, z)=\sum_h^N-1 \sum_k^M-1 \sum_l^O-1 f_{hkl} > \exp(-2\pi \i (hx/N+ky/M+lz/O)) > > So for the plane, z is no longer independant. > I need to solve the system: > ax+by+cz+d=0 > r(x, y, z)=\sum_h^N-1 \sum_k^M-1 \sum_l^O-1 f_{hkl} > \exp(-2\pi \i (hx/N+ky/M+lz/O)) > > Do you think it's possible to use numpy.fft for this? > > I'm not clear on what you want to do here, but note that the term in the exponent is of the form <k, x>, i.e., the inner product of the vectors k and x. So if you rotate x by O so that the plane is defined by z = 0, then <k, x> = <Ok, Ox>. That is, you can apply the transpose of the rotation to the result of the fft. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From brennan.williams at visualreservoir.com Mon Mar 29 18:13:42 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 30 Mar 2010 11:13:42 +1300 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> <4BB11D36.1040907@noaa.gov> Message-ID: <4BB12616.5010906@visualreservoir.com> Andrea Gavana wrote: > Hi Chris and All, > > On 29 March 2010 22:35, Christopher Barker wrote: > >> Andrea Gavana wrote: >> >>>>> Scaling each axis by its standard deviation is a typical first start. >>>>> Shifting and scaling the values such that they each go from 0 to 1 is >>>>> another useful thing to try. >>>>> >>>> Ah, magnifico! Thank you Robert and Friedrich, it seems to be working >>>> now... >>>> >> One other thought -- core to much engineering is dimensional analysis -- >> you know how we like those non-dimensional number! >> >> I think this situation is less critical, as you are interpolating, not >> optimizing or something, but many interpolation methods are built on the >> idea of some data points being closer than others to your point of interest. >> >> Who is to say if a point that is 2 hours away is closer or father than >> one 2 meters away? This is essentially what you are doing. >> >> Scaling everything to the same range is a start, but then you've still >> given them an implicit weighting.
>> >> An alternative to to figure out a way to non-dimensionalize your >> parameters -- that *may* give you a more physically based scaling. >> >> And you might invent the "Gavana Number" in the process ;-) >> > > Might be :-D . At the moment I am pretty content with what I have got, > it seems to be working fairly well although I didn't examine all the > possible cases and it is very likely that my little tool will break > disastrously for some combinations of parameters. However, I am not > sure I am allowed to post an image comparing the "real" simulation > with the prediction of the interpolated proxy model, but if you could > see it, you would surely agree that it is a very reasonable approach. > It seems to good to be true :-D . > > Again, this is mainly due to the fact that we have a very extensive > set of simulations which cover a wide range of combinations of > parameters, so the interpolation itself is only doing its job > correctly. I don't think the technique can be applied blindly to > whatever oil/gas/condensate /whatever reservoir, as non-linearities in > fluid flow simulations appear where you least expect them, but it > seems to be working ok (up to now) for our field (which is, by the > way, one of the most complex and less understood condensate reservoir > out there). > > And of course that proxy simulator only deals with the input variables that you decided on 1,000+ simulations ago. All you need is for someone to suggest something else like "how about gas injection?" and you're back to having to do more real simulation runs (which is where a good experimental design comes in). It would be interesting to know how well your proxy simulator compares to the real simulator for a combination of input variable values that is a good distance outside your original parameter space. Brennan > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From andrea.gavana at gmail.com Mon Mar 29 18:32:16 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 23:32:16 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: <4BB12616.5010906@visualreservoir.com> References: <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> <4BB11D36.1040907@noaa.gov> <4BB12616.5010906@visualreservoir.com> Message-ID: On 29 March 2010 23:13, Brennan Williams wrote: > Andrea Gavana wrote: >> Hi Chris and All, >> >> On 29 March 2010 22:35, Christopher Barker wrote: >> >>> Andrea Gavana wrote: >>> >>>>>> Scaling each axis by its standard deviation is a typical first start. >>>>>> Shifting and scaling the values such that they each go from 0 to 1 is >>>>>> another useful thing to try. >>>>>> >>>>> Ah, magnifico! Thank you Robert and Friedrich, it seems to be working >>>>> now... >>>>> >>> One other thought -- core to much engineering is dimensional analysis -- >>> you know how we like those non-dimensional number! >>> >>> I think this situation is less critical, as you are interpolating, not >>> optimizing or something, but many interpolation methods are built on the >>> idea of some data points being closer than others to your point of interest. 
>>> >>> Who is to say if a point that is 2 hours away is closer or father than >>> one 2 meters away? This is essentially what you are doing. >>> >>> Scaling everything to the same range is a start, but then you've still >>> given them an implicit weighting. >>> >>> An alternative to to figure out a way to non-dimensionalize your >>> parameters -- that *may* give you a more physically based scaling. >>> >>> And you might invent the "Gavana Number" in the process ;-) >>> >> >> Might be :-D . At the moment I am pretty content with what I have got, >> it seems to be working fairly well although I didn't examine all the >> possible cases and it is very likely that my little tool will break >> disastrously for some combinations of parameters. However, I am not >> sure I am allowed to post an image comparing the "real" simulation >> with the prediction of the interpolated proxy model, but if you could >> see it, you would surely agree that it is a very reasonable approach. >> It seems to good to be true :-D . >> >> Again, this is mainly due to the fact that we have a very extensive >> set of simulations which cover a wide range of combinations of >> parameters, so the interpolation itself is only doing its job >> correctly. I don't think the technique can be applied blindly to >> whatever oil/gas/condensate /whatever reservoir, as non-linearities in >> fluid flow simulations appear where you least expect them, but it >> seems to be working ok (up to now) for our field (which is, by the >> way, one of the most complex and less understood condensate reservoir >> out there). >> >> > And of course that proxy simulator only deals with the input variables > that you decided on 1,000+ simulations ago. (1000 < x < 0 = now) . Correct: the next phase of the development for the field has some strict rules we are not allowed to break. The parameters we chose for optimizing this development phase (4 years ago up to now) are the same as the ones I am using for the interpolation. There is no more room for other options. > All you need is for someone to suggest something else like "how about > gas injection?" This is already accounted for. > and you're back to having to do > more real simulation runs (which is where a good experimental design > comes in). Possibly, but as I said we can't even think of doing something like ED now. It's too computationally expensive for such a field. This is why I had the idea of using the existing set of simulations, which are a good ED by themselves (even if we didn't plan for it in advance). > It would be interesting to know how well your proxy simulator compares > to the real simulator for a combination of input variable values that > is a good distance outside your original parameter space. This is extremely unlikely to happen ever. As I said, we explored a wide range of number/type of possible producers and injectors, plus a fairly extended range of production/injection profiles. As we are dealing with reality here, there are physical limits on what facilities you can install on your fields to try and ramp up production as much as you can (i.e., upper bounds for number of producers/injectors/plateaus), and political limits on how low you can go and still be economical (i.e., lower bounds for number of producers/injectors/plateaus). It seems to me we're covering everything except the most extreme cases. Anyway, I am not here to convince anyone on the validity of the approach: I am a practical person, this thing works reasonably well and I am more than happy. 
Other than that, we are already going waaaay off topic for the Numpy list. Sorry about that. If you (or anyone else) wishes to continue the discussion on Reservoir Simulation, please feel free to contact me directly. For all other interpolation suggestions, I am always open to new ideas. Thank you again to the list for your help. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From friedrichromstedt at gmail.com Mon Mar 29 18:44:01 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 30 Mar 2010 00:44:01 +0200 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281122y3c5e677code4ae11ae26c6b5e@mail.gmail.com> <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: 2010/3/29 Andrea Gavana : > If anyone is interested in a follow up, I have tried a time-based > interpolation of my oil profile (and gas and gas injection profiles) > using those 40 interpolators (and even more, up to 400, one every > month of fluid flow simulation time step). > > I wasn't expecting too much out of it, but when the interpolated > profiles came out (for different combinations of input parameters) I > felt like being on the wrong side of the Lala River in the valley of > Areyoukidding. The results are striking. I get an impressive agreement > between this interpolated proxy model and the real simulations, > whether I use existing combinations of parameters or new ones (i.e., I > create the interpolation and then run the real fluid flow simulation, > comparing the outcomes). I'm reasoning about the implications of this observation to our understanding of your interpolation. As Christopher pointed out, it's very important to know how many gas injections wells are to be weighted the same as one year. When you have nice results using 40 Rbfs for each time instant, this procedure means that the values for one time instant will not be influenced by adjacent-year data. I.e., you would probably get the same result using a norm extraordinary blowing up the time coordinate. To make it clear in code, when the time is your first coordinate, and you have three other coordinates, the *norm* would be: def norm(x1, x2): return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum()) In this case, the epsilon should be fixed, to avoid the influence of the changing distances on the epsilon determination inside of Rbf, which would spoil the whole thing. I have an idea how to tune your model: Take, say, the half or three thirds of your simulation data as interpolation database, and try to reproduce the remaining part. I have some ideas how to tune using this in practice. > As an aside, I got my colleagues reservoir engineers playfully > complaining that it's time for them to pack their stuff and go home as > this interpolator is doing all the job for us; obviously, this misses > the point that it took 4 years to build such a comprehensive bunch of > simulations which now allows us to somewhat "predict" a possible > production profile in advance. :-) :-) > I wrapped everything up in a wxPython GUI with some Matplotlib graphs, > and everyone seems happy. Not only your collegues! 
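By the way, to make the weighted-norm idea above a bit more concrete, here is a minimal, untested sketch on toy stand-in data (it assumes that scipy.interpolate.Rbf accepts a user-supplied norm callable together with a fixed epsilon, and that the callable receives coordinate arrays whose first axis runs over the dimensions):

import numpy as np
from scipy.interpolate import Rbf

# Toy stand-in data: 50 samples of (time, p1, p2, p3) -> value
rng = np.random.RandomState(0)
t, p1, p2, p3 = rng.rand(4, 50)
values = np.sin(t) + p1 + 0.5 * p2 - p3

weights = np.array([1e3, 1.0, 1.0, 1.0])   # blow up the time coordinate

def weighted_norm(x1, x2):
    # Apply the per-coordinate weights along axis 0, then take the
    # usual Euclidean norm over the coordinates.
    w = weights.reshape((-1,) + (1,) * (x1.ndim - 1))
    return np.sqrt((((x1 - x2) * w) ** 2).sum(axis=0))

rbf = Rbf(t, p1, p2, p3, values, norm=weighted_norm, epsilon=1.0)
print rbf(0.5, 0.5, 0.5, 0.5)
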
> The only small complain I have is that I > wasn't able to come up with a vector implementation of RBFs, so it can > be pretty slow to build and interpolate 400 RBFs for each property (3 > of them). Haven't you spoken about 40 Rbfs for the time alone?? Something completely different: Are you going to do more simulations? Friedrich From ranavishal at gmail.com Mon Mar 29 18:54:00 2010 From: ranavishal at gmail.com (Vishal Rana) Date: Mon, 29 Mar 2010 15:54:00 -0700 Subject: [Numpy-discussion] Applying formula to all in an array which hasvalue from previous In-Reply-To: <710F2847B0018641891D9A21602763605AD37B@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD37B@ex3.envision.co.il> Message-ID: Thanks Nadav! On Mon, Mar 29, 2010 at 4:07 AM, Nadav Horesh wrote: > The general guideline: > > Suppose the function definition is: > > def func(x,y): > # x and y are scalars > bla bla bla ... > return z # a scalar > > So, > > import numpy as np > > vecfun = np.vectorize(func) > > vecfun.ufunc.accumulate(array((0,1,2,3,4,5,6,7,8,9)) > > > Nadav. > > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Vishal Rana > Sent: Sun 28-Mar-10 21:19 > To: Discussion of Numerical Python > Subject: [Numpy-discussion] Applying formula to all in an array which > hasvalue from previous > > Hi, > > For a numpy array: > > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > I do some calculation with 0, 1... and get a value = 2.5, now use this > value > to do the repeat the same calculation with next element for example... > 2.5, 2 and get a value = 3.1 > 3.1, 3 and get a value = 4.2 > 4.2, 4 and get a value = 5.1 > .... > .... and get a value = 8.5 > 8.5, 9 and get a value = 9.8 > > So I should be getting a new array like array([0, 2.5, 3.1, 4.2, 5.1, ..... > 8.5,9.8]) > > Is it where numpy or scipy can help? > > Thanks > Vishal > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Mon Mar 29 18:57:29 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 29 Mar 2010 23:57:29 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: HI Friedrich & All, On 29 March 2010 23:44, Friedrich Romstedt wrote: > 2010/3/29 Andrea Gavana : >> If anyone is interested in a follow up, I have tried a time-based >> interpolation of my oil profile (and gas and gas injection profiles) >> using those 40 interpolators (and even more, up to 400, one every >> month of fluid flow simulation time step). >> >> I wasn't expecting too much out of it, but when the interpolated >> profiles came out (for different combinations of input parameters) I >> felt like being on the wrong side of the Lala River in the valley of >> Areyoukidding. The results are striking. I get an impressive agreement >> between this interpolated proxy model and the real simulations, >> whether I use existing combinations of parameters or new ones (i.e., I >> create the interpolation and then run the real fluid flow simulation, >> comparing the outcomes). > > I'm reasoning about the implications of this observation to our > understanding of your interpolation. 
?As Christopher pointed out, it's > very important to know how many gas injections wells are to be > weighted the same as one year. > > When you have nice results using 40 Rbfs for each time instant, this > procedure means that the values for one time instant will not be > influenced by adjacent-year data. ?I.e., you would probably get the > same result using a norm extraordinary blowing up the time coordinate. > ?To make it clear in code, when the time is your first coordinate, and > you have three other coordinates, the *norm* would be: > > def norm(x1, x2): > ? ?return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum()) > > In this case, the epsilon should be fixed, to avoid the influence of > the changing distances on the epsilon determination inside of Rbf, > which would spoil the whole thing. > > I have an idea how to tune your model: ?Take, say, the half or three > thirds of your simulation data as interpolation database, and try to > reproduce the remaining part. ?I have some ideas how to tune using > this in practice. This is a very good idea indeed: I am actually running out of test cases (it takes a while to run a simulation, and I need to do it every time I try a new combination of parameters to check if the interpolation is good enough or rubbish). I'll give it a go tomorrow at work and I'll report back (even if I get very bad results :-D ). >> As an aside, I got my colleagues reservoir engineers playfully >> complaining that it's time for them to pack their stuff and go home as >> this interpolator is doing all the job for us; obviously, this misses >> the point that it took 4 years to build such a comprehensive bunch of >> simulations which now allows us to somewhat "predict" a possible >> production profile in advance. > > :-) :-) > >> I wrapped everything up in a wxPython GUI with some Matplotlib graphs, >> and everyone seems happy. > Not only your collegues! >> The only small complain I have is that I >> wasn't able to come up with a vector implementation of RBFs, so it can >> be pretty slow to build and interpolate 400 RBFs for each property (3 >> of them). > > Haven't you spoken about 40 Rbfs for the time alone?? Yes, sorry about the confusion: depending on which "time-step" I choose to compare the interpolation with the real simulation, I can have 40 RBFs (1 every year of simulation) or more than 400 (one every month of simulation, not all the monthly data are available for all the simulations I have). > Something completely different: Are you going to do more simulations? 110% surely undeniably yes. The little interpolation tool I have is just a proof-of-concept and a little helper for us to have an initial grasp of how the production profiles might look like before actually running the real simulation. Something like a toy to play with (if you can call "play" actually working on a reservoir simulation...). There is no possible substitute for the reservoir simulator itself. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. 
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From robert.kern at gmail.com Mon Mar 29 19:10:06 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Mar 2010 18:10:06 -0500 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: <3d375d731003291610t46076451k54b5a011a3383843@mail.gmail.com> On Mon, Mar 29, 2010 at 17:57, Andrea Gavana wrote: > HI Friedrich & All, > > On 29 March 2010 23:44, Friedrich Romstedt wrote: >> Something completely different: Are you going to do more simulations? > > 110% surely undeniably yes. The little interpolation tool I have is > just a proof-of-concept and a little helper for us to have an initial > grasp of how the production profiles might look like before actually > running the real simulation. Something like a toy to play with (if you > can call "play" actually working on a reservoir simulation...). There > is no possible substitute for the reservoir simulator itself. One thing you might want to do is to investigate using Gaussian processes instead of RBFs. They are closely related (and I even think the 'gaussian' RBF corresponds to what you get from a particularly-constructed GP), but you can include uncertainty estimates of your data and get an estimate of the uncertainty of the interpolant. GPs are also very closely related to kriging, which you may also be familiar with. If you are going to be running more simulations, GPs can tell you what new inputs you should simulate to reduce your uncertainty the most. PyMC has some GP code: http://pymc.googlecode.com/files/GPUserGuide.pdf -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at silveregg.co.jp Mon Mar 29 21:04:32 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Tue, 30 Mar 2010 10:04:32 +0900 Subject: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy In-Reply-To: <1269875507.2743.10.camel@talisman> References: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> <1269875507.2743.10.camel@talisman> Message-ID: <4BB14E20.3090404@silveregg.co.jp> Pauli Virtanen wrote: > ma, 2010-03-29 kello 19:13 +0900, David Cournapeau kirjoitti: >> I have worked on porting scipy to py3k, and it is mostly working. One >> thing which would be useful is to install something similar to >> npy_3kcompat.h in numpy, so that every scipy extension could share the >> compat header. Is the current python 3 compatibility header usable "in >> the wild", or will it still significantly change (this requiring a >> different, more stable one) ? > > I believe it's reasonably stable, as it contains mostly simple stuff. > Something perhaps can be added later, but I don't think anything will > need to be removed. Ok. If the C capsule stuff is still in flux, it may just be removed from the public header - I don't think we will need it anywhere in scipy. > At least, I don't see what I would like to change there. The only thing > I wouldn't perhaps like to have in the long run are the PyString and > possibly PyInt redefinition macros. I would also prefer a new name, instead of macro redefinition, but I can do it. 
The header would not be part of the public API proper anyway (I will put it somewhere else than numpy/include), it should never be pulled implicitly in when including one of the .h in numpy/include. David From david at silveregg.co.jp Mon Mar 29 22:17:55 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Tue, 30 Mar 2010 11:17:55 +0900 Subject: [Numpy-discussion] Making a 2to3 distutils command ? Message-ID: <4BB15F53.9030908@silveregg.co.jp> Hi, Currently, when building numpy with python 3, the 2to3 conversion happens before calling any distutils command. Was there a reason for doing it as it is done now ? I would like to make a proper numpy.distutils command for it, so that it can be more finely controlled (in particular, using the -j option). It would also avoid duplication in scipy. cheers, David From msarahan at gmail.com Mon Mar 29 23:19:12 2010 From: msarahan at gmail.com (Mike Sarahan) Date: Mon, 29 Mar 2010 20:19:12 -0700 Subject: [Numpy-discussion] Dealing with roundoff error In-Reply-To: References: <8275939c1003271638g2aa1897dp65be12a9b3952191@mail.gmail.com> Message-ID: <8275939c1003292019l4aa1200aud46c642163a15cbb@mail.gmail.com> Thank you all for your suggestions. I ended up multiplying by 10 and rounding, while casting the array to an int. Certainly not the most universal solution, but it worked for my data. code, for anyone searching for examples: np.array(np.round((hlspec[:,0]-offset)*10),dtype=np.int) -Mike On Sun, Mar 28, 2010 at 12:44 PM, Friedrich Romstedt wrote: > 2010/3/28 Mike Sarahan : >> I have run into some roundoff problems trying to line up some >> experimental spectra. ?The x coordinates are given in intervals of 0.1 >> units. ?I read the data in from a text file using np.loadtxt(). > > I don't know your problem well enough, so the suggestion to use > numpy.interp() is maybe not more than a useless shot in the dark? > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html#numpy.interp > > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Tue Mar 30 02:51:05 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Mar 2010 09:51:05 +0300 Subject: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy In-Reply-To: <4BB14E20.3090404@silveregg.co.jp> References: <5b8d13221003290313l6ab01d9ub9423e0a029c2e13@mail.gmail.com> <1269875507.2743.10.camel@talisman> <4BB14E20.3090404@silveregg.co.jp> Message-ID: <253f0f1a1003292351x2cf32aaegdb2869d060a2bed5@mail.gmail.com> 2010/3/30 David Cournapeau > Pauli Virtanen wrote: [clip] > > At least, I don't see what I would like to change there. The only thing > > I wouldn't perhaps like to have in the long run are the PyString and > > possibly PyInt redefinition macros. > > I would also prefer a new name, instead of macro redefinition, but I can > do it. For strings, the new names are "PyBytes", "PyUnicode" and "PyUString" (unicode on Py3, bytes on Py2). One of these should be used instead of PyString in all places, but redefining the macro reduced the work in making a quick'n'dirty port of numpy. For integers, perhaps a separate "PyInt on Py2, PyLong on Py3" type should be defined. Not sure if this would reduce work in practice. 
> The header would not be part of the public API proper anyway (I > will put it somewhere else than numpy/include), it should never be > pulled implicitly in when including one of the .h in numpy/include. Agreed. Pauli From pav at iki.fi Tue Mar 30 03:09:17 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Mar 2010 10:09:17 +0300 Subject: [Numpy-discussion] Making a 2to3 distutils command ? In-Reply-To: <4BB15F53.9030908@silveregg.co.jp> References: <4BB15F53.9030908@silveregg.co.jp> Message-ID: <253f0f1a1003300009h7b24c1efp82917a989e7cc434@mail.gmail.com> 2010/3/30 David Cournapeau : > Currently, when building numpy with python 3, the 2to3 conversion > happens before calling any distutils command. Was there a reason for > doing it as it is done now ? This allowed 2to3 to also port the various setup*.py files and numpy.distutils, and implementing it this way required the minimum amount of work and understanding of distutils -- you need to force it to proceed with the build using the set of output files from 2to3. > I would like to make a proper numpy.distutils command for it, so that it > can be more finely controlled (in particular, using the -j option). It > would also avoid duplication in scipy. Are you sure you want to mix distutils in this? Wouldn't it only obscure how things work? If the aim is in making the 2to3 processing reusable, I'd rather simply move tools/py3tool.py under numpy.distutils (+ perhaps do some cleanups), and otherwise keep it completely separate from distutils. It could be nice to have the 2to3 conversion parallelizable, but there are probably simple ways to do it without mixing distutils in. But if you think this is really worth doing, go ahead. Pauli From dagss at student.matnat.uio.no Tue Mar 30 05:01:43 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 30 Mar 2010 11:01:43 +0200 Subject: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals In-Reply-To: References: Message-ID: <4BB1BDF7.7020405@student.matnat.uio.no> David Warde-Farley wrote: > Hi, > > In my setup.py, I have > from numpy.distutils.misc_util import Configuration > > fflags= '-fdefault-real-8 -ffixed-form' > config = Configuration( > 'foo', > parent_package=None, > top_path=None, > f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, > fflags) > ) > > However I am still getting stuff returned in 'real' variables as > dtype=float32. Am I doing something wrong? > Unless f2py is (too) smart, it probably just pass along --f77flags and --f90flags to the Fortran compiler, but don't use them when creating the wrapping C Python extension. So, Fortran uses real(8) and the type in C is "float" -- and what NumPy ends up seeing is float32. Unless you're dealing with arrays, you are likely blowing your stack here... I wouldn't think there's a way around this except fixing the original source. f2py is very much based on assumptions about type sizes which you then violate when passing -fdefault-real-8. 
Dag Sverre From dagss at student.matnat.uio.no Tue Mar 30 05:02:21 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 30 Mar 2010 11:02:21 +0200 Subject: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals In-Reply-To: <4BB1BDF7.7020405@student.matnat.uio.no> References: <4BB1BDF7.7020405@student.matnat.uio.no> Message-ID: <4BB1BE1D.8090600@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > David Warde-Farley wrote: >> Hi, >> >> In my setup.py, I have >> from numpy.distutils.misc_util import Configuration >> >> fflags= '-fdefault-real-8 -ffixed-form' >> config = Configuration( >> 'foo', >> parent_package=None, >> top_path=None, >> f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, >> fflags) >> ) >> >> However I am still getting stuff returned in 'real' variables as >> dtype=float32. Am I doing something wrong? >> > Unless f2py is (too) smart, it probably just pass along --f77flags and > --f90flags to the Fortran compiler, but don't use them when creating > the wrapping C Python extension. So, Fortran uses real(8) and the type > in C is "float" -- and what NumPy ends up seeing is float32. > > Unless you're dealing with arrays, you are likely blowing your stack > here... > > I wouldn't think there's a way around this except fixing the original > source. f2py is very much based on assumptions about type sizes which > you then violate when passing -fdefault-real-8. Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Dag Sverre From rmay31 at gmail.com Tue Mar 30 09:18:26 2010 From: rmay31 at gmail.com (Ryan May) Date: Tue, 30 Mar 2010 07:18:26 -0600 Subject: [Numpy-discussion] Making a 2to3 distutils command ? In-Reply-To: <253f0f1a1003300009h7b24c1efp82917a989e7cc434@mail.gmail.com> References: <4BB15F53.9030908@silveregg.co.jp> <253f0f1a1003300009h7b24c1efp82917a989e7cc434@mail.gmail.com> Message-ID: On Tue, Mar 30, 2010 at 1:09 AM, Pauli Virtanen wrote: > 2010/3/30 David Cournapeau : >> Currently, when building numpy with python 3, the 2to3 conversion >> happens before calling any distutils command. Was there a reason for >> doing it as it is done now ? > > This allowed 2to3 to also port the various setup*.py files and > numpy.distutils, and implementing it this way required the minimum > amount of work and understanding of distutils -- you need to force it > to proceed with the build using the set of output files from 2to3. > >> I would like to make a proper numpy.distutils command for it, so that it >> can be more finely controlled (in particular, using the -j option). It >> would also avoid duplication in scipy. > > Are you sure you want to mix distutils in this? Wouldn't it only > obscure how things work? > > If the aim is in making the 2to3 processing reusable, I'd rather > simply move tools/py3tool.py under numpy.distutils (+ perhaps do some > cleanups), and otherwise keep it completely separate from distutils. > It could be nice to have the 2to3 conversion parallelizable, but there > are probably simple ways to do it without mixing distutils in. Out of curiosity, is there something wrong with the support for 2to3 that already exists within distutils? 
(Other than it just being distutils) http://bruynooghe.blogspot.com/2010/03/using-lib2to3-in-setuppy.html Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From pav at iki.fi Tue Mar 30 09:47:13 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Mar 2010 16:47:13 +0300 Subject: [Numpy-discussion] Making a 2to3 distutils command ? In-Reply-To: References: <4BB15F53.9030908@silveregg.co.jp> <253f0f1a1003300009h7b24c1efp82917a989e7cc434@mail.gmail.com> Message-ID: <1269956833.2905.3.camel@talisman> ti, 2010-03-30 kello 07:18 -0600, Ryan May kirjoitti: > Out of curiosity, is there something wrong with the support for 2to3 > that already exists within distutils? (Other than it just being > distutils) > > http://bruynooghe.blogspot.com/2010/03/using-lib2to3-in-setuppy.html That AFAIK converts only those Python files that will be installed, and I don't know how to tell it to disable some conversion on a per-file basis. (Numpy also contains some files necessary for building that will not be installed...) But OK, to be honest, I didn't look closely if one could make it work using the bundled 2to3 command. Things might be cleaner if it can be made to work. Pauli From amenity at enthought.com Tue Mar 30 10:07:49 2010 From: amenity at enthought.com (Amenity Applewhite) Date: Tue, 30 Mar 2010 09:07:49 -0500 Subject: [Numpy-discussion] SciPy 2010: Vote on Tutorials & Sprints References: Message-ID: <81C97727-3E4E-43BA-B4D0-6D7C95C6BCB4@enthought.com> Email not displaying correctly? View it in your browser. Hello Amenity, Spring is upon us and arrangements for SciPy 2010 are in full swing. We're already nearing on some important deadlines for conference participants: April 11th is the deadline for submitting an abstract for a paper, and April 15th is the deadline for submitting a tutorial proposal. Help choose tutorials for SciPy 2010... We set up a UserVoice page to brainstorm tutorial topics last week and we already have some great ideas. The top ones at the moment are: Effective multi-core programming with Cython and Python Building your own app with Mayavi High performance computing with Python Propose your own or vote on the existing suggestions here. ...Or instruct a tutorial and cover your conference costs. Did you know that we're awarding generous stipends to tutorial instructors this year? So if you believe you could lead a tutorial, by all means submit your proposal ? soon! They're due April 15th. Call for Papers Continues Submitting a paper for to present at SciPy 2010 is easy, so remember to prepare one and have your friends and colleagues follow suit. Send us your abstract before April 11th and let us know whether you'd like to speak at the main conference or one of the specialized tracks. Details here. Have you registered? Booking your tickets early should save you money ? not to mention the early registration prices you will qualify for if you register before May 10th. Best, The SciPy 2010 Team @SciPy2010 on Twitter You are receiving this email because you have registered for the SciPy 2010 conference in Austin, TX. Unsubscribe amenity at enthought.com from this list | Forward to a friend | Update your profile Our mailing address is: Enthought, Inc. 515 Congress Ave. Austin, TX 78701 Add us to your address book Copyright (C) 2010 Enthought, Inc. All rights reserved. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpk at kraussfamily.org Tue Mar 30 10:13:12 2010 From: tpk at kraussfamily.org (Tom K.) 
Date: Tue, 30 Mar 2010 07:13:12 -0700 (PDT) Subject: [Numpy-discussion] indexing question Message-ID: <28083162.post@talk.nabble.com> This one bit me again, and I am trying to understand it better so I can anticipate when it will happen. What I want to do is get rid of singleton dimensions, and index into the last dimension with an array. In [1]: import numpy as np In [2]: x=np.zeros((10,1,1,1,14,1024)) In [3]: x[:,0,0,0,:,[1,2,3]].shape Out[3]: (3, 10, 14) Whoa! Trimming my array to a desired number ends up moving the last dimension to the first! In [4]: np.__version__ Out[4]: '1.3.0' ... In [7]: x[:,:,:,:,:,[1,2,3]].shape Out[7]: (10, 1, 1, 1, 14, 3) This looks right... In [8]: x[...,[1,2,3]].shape Out[8]: (10, 1, 1, 1, 14, 3) and this... In [9]: x[...,[1,2,3]][:,0,0,0].shape Out[9]: (10, 14, 3) ... In [11]: x[:,0,0,0][...,[1,2,3]].shape Out[11]: (10, 14, 3) Either of the last 2 attempts above results in what I want, so I can do that... I just need some help deciphering when and why the first thing happens. -- View this message in context: http://old.nabble.com/indexing-question-tp28083162p28083162.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From aisaac at american.edu Tue Mar 30 10:24:56 2010 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 30 Mar 2010 10:24:56 -0400 Subject: [Numpy-discussion] indexing question In-Reply-To: <28083162.post@talk.nabble.com> References: <28083162.post@talk.nabble.com> Message-ID: <4BB209B8.8030901@american.edu> On 3/30/2010 10:13 AM, Tom K. wrote: > What I want to do is get rid of singleton dimensions, and index into the > last dimension with an array. >>> x=np.zeros((10,1,1,1,14,1024)) >>> np.squeeze(x).shape (10, 14, 1024) hth, Alan Isaac From charlesr.harris at gmail.com Tue Mar 30 10:46:22 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Mar 2010 08:46:22 -0600 Subject: [Numpy-discussion] indexing question In-Reply-To: <28083162.post@talk.nabble.com> References: <28083162.post@talk.nabble.com> Message-ID: On Tue, Mar 30, 2010 at 8:13 AM, Tom K. wrote: > > This one bit me again, and I am trying to understand it better so I can > anticipate when it will happen. > > What I want to do is get rid of singleton dimensions, and index into the > last dimension with an array. > > In [1]: import numpy as np > > In [2]: x=np.zeros((10,1,1,1,14,1024)) > > In [3]: x[:,0,0,0,:,[1,2,3]].shape > Out[3]: (3, 10, 14) > > Hmm... That doesn't look right. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Mar 30 10:52:24 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Mar 2010 09:52:24 -0500 Subject: [Numpy-discussion] indexing question In-Reply-To: References: <28083162.post@talk.nabble.com> Message-ID: <3d375d731003300752u56e7c8e3la9a7c9c9d807f700@mail.gmail.com> On Tue, Mar 30, 2010 at 09:46, Charles R Harris wrote: > > > On Tue, Mar 30, 2010 at 8:13 AM, Tom K. wrote: >> >> This one bit me again, and I am trying to understand it better so I can >> anticipate when it will happen. >> >> What I want to do is get rid of singleton dimensions, and index into the >> last dimension with an array. >> >> In [1]: import numpy as np >> >> In [2]: x=np.zeros((10,1,1,1,14,1024)) >> >> In [3]: x[:,0,0,0,:,[1,2,3]].shape >> Out[3]: (3, 10, 14) >> > > Hmm... That doesn't look right. It's a known feature. Slicing and list indexing are separate subsystems. The list indexing takes priority so the list-indexed axes end up first in the result. 
The sliced axes follow them. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Tue Mar 30 10:53:45 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Mar 2010 10:53:45 -0400 Subject: [Numpy-discussion] indexing question In-Reply-To: <28083162.post@talk.nabble.com> References: <28083162.post@talk.nabble.com> Message-ID: <1cd32cbb1003300753q7f7bbddbvd88dfcb689204f1e@mail.gmail.com> On Tue, Mar 30, 2010 at 10:13 AM, Tom K. wrote: > > This one bit me again, and I am trying to understand it better so I can > anticipate when it will happen. > > What I want to do is get rid of singleton dimensions, and index into the > last dimension with an array. > > In [1]: import numpy as np > > In [2]: x=np.zeros((10,1,1,1,14,1024)) > > In [3]: x[:,0,0,0,:,[1,2,3]].shape > Out[3]: (3, 10, 14) > > Whoa! ?Trimming my array to a desired number ends up moving the last > dimension to the first! > > In [4]: np.__version__ > Out[4]: '1.3.0' > > ... > In [7]: x[:,:,:,:,:,[1,2,3]].shape > Out[7]: (10, 1, 1, 1, 14, 3) > > This looks right... > > In [8]: x[...,[1,2,3]].shape > Out[8]: (10, 1, 1, 1, 14, 3) > > and this... > > In [9]: x[...,[1,2,3]][:,0,0,0].shape > Out[9]: (10, 14, 3) > > ... > In [11]: x[:,0,0,0][...,[1,2,3]].shape > Out[11]: (10, 14, 3) > > Either of the last 2 attempts above results in what I want, so I can do > that... I just need some help deciphering when and why the first thing > happens. An explanation about the surprising behavior when slicing and fancy indexing is mixed with more than 2 dimensions is in this thread http://www.mail-archive.com/numpy-discussion at scipy.org/msg16299.html More examples show up every once in a while on the mailing list. Josef > > -- > View this message in context: http://old.nabble.com/indexing-question-tp28083162p28083162.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From srmulcahy at gmail.com Tue Mar 30 12:56:43 2010 From: srmulcahy at gmail.com (Sean Mulcahy) Date: Tue, 30 Mar 2010 09:56:43 -0700 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero Message-ID: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> Hello all, I'm relatively new to numpy. I'm working with text images as 512x512 arrays. I would like to set elements of the array whose value fall within a specified range to zero (eg 23 < x < 45). Any advice is much appreciated. Sean From aisaac at american.edu Tue Mar 30 13:12:02 2010 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 30 Mar 2010 13:12:02 -0400 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero In-Reply-To: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> Message-ID: <4BB230E2.40507@american.edu> On 3/30/2010 12:56 PM, Sean Mulcahy wrote: > 512x512 arrays. I would like to set elements of the array whose value fall within a specified range to zero (eg 23< x< 45). x[(23 References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> <4BB230E2.40507@american.edu> Message-ID: On Tue, Mar 30, 2010 at 11:12 AM, Alan G Isaac wrote: > On 3/30/2010 12:56 PM, Sean Mulcahy wrote: >> 512x512 arrays. 
?I would like to set elements of the array whose value fall within a specified range to zero (eg 23< ?x< ?45). > > x[(23 References: <4BB1BDF7.7020405@student.matnat.uio.no> <4BB1BE1D.8090600@student.matnat.uio.no> Message-ID: <9D56419C-AC52-491D-8807-B04F9C515EC6@cs.toronto.edu> Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: > Well, you can pass -fdefault-real-8 and then write .pyf headers where > real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='-- f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault-real-8\'' doesn't seem to do the trick. I don't really understand why. David From dwf at cs.toronto.edu Tue Mar 30 14:25:22 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 30 Mar 2010 14:25:22 -0400 Subject: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals In-Reply-To: <9D56419C-AC52-491D-8807-B04F9C515EC6@cs.toronto.edu> References: <4BB1BDF7.7020405@student.matnat.uio.no> <4BB1BE1D.8090600@student.matnat.uio.no> <9D56419C-AC52-491D-8807-B04F9C515EC6@cs.toronto.edu> Message-ID: On 30-Mar-10, at 2:14 PM, David Warde-Farley wrote: > Hey Dag, > > On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: > >> Well, you can pass -fdefault-real-8 and then write .pyf headers where >> real(8) is always given explicitly. > > > Actually I've gotten it to work this way, with real(8) in the > wrappers. > > BUT... for some reason it requires me to set the environment > variable G77='gfortran -fdefault-real-8'. Simply putting it in > f2py_options='--f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault- > real-8\'' doesn't seem to do the trick. I don't really understand why. Sorry, that should say F77='gfortran -fdefault-real-8'. Setting G77 obviously doesn't do anything. ;) Without that environment variable, I think it is blowing the stack; the program just hangs. David From ranavishal at gmail.com Tue Mar 30 14:59:40 2010 From: ranavishal at gmail.com (Vishal Rana) Date: Tue, 30 Mar 2010 11:59:40 -0700 Subject: [Numpy-discussion] Replacing NANs Message-ID: Hi, In an array I want to replace all NANs with some number say 100, I found a method* **nan_to_num *but it only replaces with zero. Any solution for this? * *Thanks Vishal -------------- next part -------------- An HTML attachment was scrubbed... URL: From amcmorl at gmail.com Tue Mar 30 15:02:59 2010 From: amcmorl at gmail.com (Angus McMorland) Date: Tue, 30 Mar 2010 15:02:59 -0400 Subject: [Numpy-discussion] Replacing NANs In-Reply-To: References: Message-ID: On 30 March 2010 14:59, Vishal Rana wrote: > Hi, > In an array I want to replace all NANs with some number say 100, I found a > method?nan_to_num but it only replaces with zero. > Any solution for this? ar[np.isnan(ar)] = my_num where ar is your array and my_num is the number you want to replace nans with. Angus. 
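A minimal, runnable sketch of the masking idiom Angus describes (the array contents and the fill value 100 below are made up for illustration):

import numpy as np

a = np.array([1.0, np.nan, 3.0, np.nan])
a[np.isnan(a)] = 100.0   # the boolean mask selects the NaN positions; assignment is in place
print(a)                 # -> [   1.  100.    3.  100.]

The same pattern works with numpy.isinf or numpy.isfinite if infinities also need to be replaced.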
> Thanks > Vishal > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From zachary.pincus at yale.edu Tue Mar 30 15:05:41 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 30 Mar 2010 15:05:41 -0400 Subject: [Numpy-discussion] Replacing NANs In-Reply-To: References: Message-ID: <7C1F9E20-CF5F-41C6-9008-AB620C9AE087@yale.edu> > In an array I want to replace all NANs with some number say 100, I > found a method nan_to_num but it only replaces with zero. > Any solution for this? Indexing with a mask is one approach here: a[numpy.isnan(a)] = 100 also cf. numpy.isfinite as well in case you want the same with infs. Zach On Mar 30, 2010, at 2:59 PM, Vishal Rana wrote: > Hi, > > In an array I want to replace all NANs with some number say 100, I > found a method nan_to_num but it only replaces with zero. > Any solution for this? > > Thanks > Vishal > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ranavishal at gmail.com Tue Mar 30 15:13:38 2010 From: ranavishal at gmail.com (Vishal Rana) Date: Tue, 30 Mar 2010 12:13:38 -0700 Subject: [Numpy-discussion] Replacing NANs In-Reply-To: References: Message-ID: That was quick! Thanks Angus and Zachary On Tue, Mar 30, 2010 at 11:59 AM, Vishal Rana wrote: > Hi, > > In an array I want to replace all NANs with some number say 100, I found a > method* **nan_to_num *but it only replaces with zero. > Any solution for this? > * > *Thanks > Vishal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kgdunn at gmail.com Tue Mar 30 16:14:05 2010 From: kgdunn at gmail.com (Kevin Dunn) Date: Tue, 30 Mar 2010 20:14:05 +0000 (UTC) Subject: [Numpy-discussion] Interpolation question References: Message-ID: Andrea Gavana gmail.com> writes: > > Hi Kevin, > > On 29 March 2010 01:38, Kevin Dunn wrote: > >> Message: 5 > >> Date: Sun, 28 Mar 2010 00:24:01 +0000 > >> From: Andrea Gavana gmail.com> > >> Subject: [Numpy-discussion] Interpolation question > >> To: Discussion of Numerical Python scipy.org> > >> Message-ID: > >> ? ? ? ? mail.gmail.com> > >> Content-Type: text/plain; charset=ISO-8859-1 > >> > >> Hi All, > >> > >> ? ?I have an interpolation problem and I am having some difficulties > >> in tackling it. I hope I can explain myself clearly enough. > >> > >> Basically, I have a whole bunch of 3D fluid flow simulations (close to > >> 1000), and they are a result of different combinations of parameters. > >> I was planning to use the Radial Basis Functions in scipy, but for the > >> moment let's assume, to simplify things, that I am dealing only with > >> one parameter (x). In 1000 simulations, this parameter x has 1000 > >> values, obviously. The problem is, the outcome of every single > >> simulation is a vector of oil production over time (let's say 40 > >> values per simulation, one per year), and I would like to be able to > >> interpolate my x parameter (1000 values) against all the simulations > >> (1000x40) and get an approximating function that, given another x > >> parameter (of size 1x1) will give me back an interpolated production > >> profile (of size 1x40). 
> > > > [I posted the following earlier but forgot to change the subject - it > > appears as a new thread called "NumPy-Discussion Digest, Vol 42, Issue > > 85" - please ignore that thread] > > > > Andrea, may I suggest a different approach to RBF's. > > > > Realize that your vector of 40 values for each row in y are not > > independent of each other (they will be correlated). ?First build a > > principal component analysis (PCA) model on this 1000 x 40 matrix and > > reduce it down to a 1000 x A matrix, called your scores matrix, where > > A is the number of independent components. A is selected so that it > > adequately summarizes Y without over-fitting and you will find A << > > 40, maybe A = 2 or 3. There are tools, such as cross-validation, that > > will help select a reasonable value of A. > > > > Then you can relate your single column of X to these independent > > columns in A using a tool such as least squares: one least squares > > model per column in the scores matrix. ?This works because each column > > in the score vector is independent (contains totally orthogonal > > information) to the others. ?But I would be surprised if this works > > well enough, unless A = 1. > > > > But it sounds like your don't just have a single column in your > > X-variables (you hinted that the single column was just for > > simplification). ?In that case, I would build a projection to latent > > structures model (PLS) model that builds a single latent-variable > > model that simultaneously models the X-matrix, the Y-matrix as well as > > providing the maximal covariance between these two matrices. > > > > If you need some references and an outline of code, then I can readily > > provide these. > > > > This is a standard problem with data from spectroscopic instruments > > and with batch processes. ?They produce hundreds, sometimes 1000's of > > samples per row. PCA and PLS are very effective at summarizing these > > down to a much smaller number of independent columns, very often just > > a handful, and relating them (i.e. building a predictive model) to > > other data matrices. > > > > I also just saw the suggestions of others to center the data by > > subtracting the mean from each column in Y and scaling (by dividing > > through by the standard deviation). ?This is a standard data > > preprocessing step, called autoscaling and makes sense for any data > > analysis, as you already discovered. > > I have got some success by using time-based RBFs interpolations, but I > am always open to other possible implementations (as the one I am > using can easily fail for strange combinations of input parameters). > Unfortunately, my understanding of your explanation is very very > limited: I am not an expert at all, so it's a bit hard for me to > translate the mathematical technical stuff in something I can > understand. If you have an example code (even a very trivial one) for > me to study so that I can understand what the code is actually doing, > I would be more than grateful for your help PCA can be calculated in several different ways. The code below works well enough in many cases, but it can't handle missing data. If you need that capability, let me know. You can read more about PCA here (shameless plug for the course I teach on this material): http://stats4.eng.mcmaster.ca/wiki/Latent_variable_methods_and_applications And here is some Python code for a fake data set and an actual data set. Hope that gets you started, Kevin. 
import numpy as np use_fake_data = False if use_fake_data: N = 1000 K = 40 A = 3 # extract 3 principal components # Replace this garbage data with your actual, raw data X_raw = np.random.normal(loc=20.0, scale=3.242, size=(N, K)) else: import urllib2 import StringIO # Get a N=460 rows and K=650 column data set of NIR spectra url = 'http://stats4.eng.mcmaster.ca/datasets/tablet-spectra.csv' data_string = urllib2.urlopen(url).read() X_raw = np.genfromtxt(StringIO.StringIO(data_string), delimiter=",") N, K = X_raw.shape A = 3 # extract 3 principal components # Center and scale X (you should check for columns with # zero variance first) X = X_raw - X_raw.mean(axis=0) X = X / X.std(axis=0) # Verify the centering and scaling print(X.mean(axis=0)) # Row of zeros print(X.std(axis=0)) # Row of ones # Use SVD to calculate the components (can't handle missing data!) u, d, v = np.linalg.svd(X) # These are the loadings (direction vectors for the A components) P = v.T[:, range(0, A)] # These are the scores, A scores for each observation T = np.dot(X, P) # How well did these A scores summarize the original data? X_hat = np.dot(T, P.T) residuals = X - X_hat residual_ssq = np.sum(residuals * residuals) initial_ssq = np.sum(X * X) variance_captured = 1.0 - residual_ssq/initial_ssq print variance_captured # With the spectral data set, 3 components capture 94% of the # variance in the original data. Now you work with only these 3 # columns, rather than the K=650 columns from your original data. > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > ==> Never *EVER* use RemovalGroup for your house removal. You'll > regret it forever. > http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== > From dwf at cs.toronto.edu Tue Mar 30 16:35:32 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 30 Mar 2010 16:35:32 -0400 Subject: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals In-Reply-To: <4BB1BE1D.8090600@student.matnat.uio.no> References: <4BB1BDF7.7020405@student.matnat.uio.no> <4BB1BE1D.8090600@student.matnat.uio.no> Message-ID: On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: > Well, you can pass -fdefault-real-8 and then write .pyf headers where > real(8) is always given explicitly. Okay, the answer (without setting the F77 environment variable) is basically to expect real-8's in the .pyf file and compile the whole package with python setup.py config_fc --fcompiler=gnu95 --f77flags='-fdefault- real-8' --f90flags='-fdefault-real-8' build Still not sure if there's a sane way to make this the default in my setup.py (I found some old mailing list posts where Pearu suggests modifying sys.argv but I think this is widely considered a bad idea). The right way might involve instantiating a GnuF95Compiler object and messing with its fields or something like that. Anyway, this works for now. Thanks, David From friedrichromstedt at gmail.com Tue Mar 30 16:48:25 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 30 Mar 2010 22:48:25 +0200 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <22F257FA-7ECA-4AB9-A24B-D361AFAAAFA2@gmail.com> <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: 2010/3/30 Andrea Gavana : > On 29 March 2010 23:44, Friedrich Romstedt wrote: >> When you have nice results using 40 Rbfs for each time instant, this >> procedure means that the values for one time instant will not be >> influenced by adjacent-year data. 
?I.e., you would probably get the >> same result using a norm extraordinary blowing up the time coordinate. >> ?To make it clear in code, when the time is your first coordinate, and >> you have three other coordinates, the *norm* would be: >> >> def norm(x1, x2): >> ? ?return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum()) >> >> In this case, the epsilon should be fixed, to avoid the influence of >> the changing distances on the epsilon determination inside of Rbf, >> which would spoil the whole thing. Of course, it are here two and not three "other variables." >> I have an idea how to tune your model: ?Take, say, the half or three >> thirds of your simulation data as interpolation database, and try to >> reproduce the remaining part. ?I have some ideas how to tune using >> this in practice. Here, of course it are three quarters and not three thirds :-) > This is a very good idea indeed: I am actually running out of test > cases (it takes a while to run a simulation, and I need to do it every > time I try a new combination of parameters to check if the > interpolation is good enough or rubbish). I'll give it a go tomorrow > at work and I'll report back (even if I get very bad results :-D ). I refined the idea a bit. Select one simulation, and use the complete rest as the interpolation base. Then repreat this for each simualation. Calculate some joint value for all the results, the simplest would maybe be, to calculate: def joint_ln_density(simulation_results, interpolation_results): return -((interpolation_results - simulation_results) ** 2) / (simulation_results ** 2) In fact, this calculates the logarithm of the Gaussians centered at *simulation_results* and taken at the "obervations" *interpolation_results*. It is the logarithms of the product of this Gaussians. The standard deviation of the Gaussians is assumed to be the value of the *simulation_results*, which means, that I assume that low-valued outcomes are much more precise in absolute numbers than high-outcome values, but /relative/ to their nominal value they are all the same precise. (NB: A scaling of the stddevs wouldn't make a significant difference /for you/. Same the neglected coefficients of the Gaussians.) I don't know, which method you like the most. Robert's and Kevin's proposals are hard to compete with ... You could optimise (maximise) the joint_ln_density outcome as a function of *epsilon* and the different scalings. afaic, scipy comes with some optimisation algorithms included. I checked it: http://docs.scipy.org/doc/scipy-0.7.x/reference/optimize.html#general-purpose . Friedrich From andrea.gavana at gmail.com Tue Mar 30 17:09:55 2010 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 30 Mar 2010 22:09:55 +0100 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: Hi Friedrich & All, On 30 March 2010 21:48, Friedrich Romstedt wrote: > 2010/3/30 Andrea Gavana : >> On 29 March 2010 23:44, Friedrich Romstedt wrote: >>> When you have nice results using 40 Rbfs for each time instant, this >>> procedure means that the values for one time instant will not be >>> influenced by adjacent-year data. ?I.e., you would probably get the >>> same result using a norm extraordinary blowing up the time coordinate. >>> ?To make it clear in code, when the time is your first coordinate, and >>> you have three other coordinates, the *norm* would be: >>> >>> def norm(x1, x2): >>> ? 
?return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum()) >>> >>> In this case, the epsilon should be fixed, to avoid the influence of >>> the changing distances on the epsilon determination inside of Rbf, >>> which would spoil the whole thing. > > Of course, it are here two and not three "other variables." > >>> I have an idea how to tune your model: ?Take, say, the half or three >>> thirds of your simulation data as interpolation database, and try to >>> reproduce the remaining part. ?I have some ideas how to tune using >>> this in practice. > > Here, of course it are three quarters and not three thirds :-) > >> This is a very good idea indeed: I am actually running out of test >> cases (it takes a while to run a simulation, and I need to do it every >> time I try a new combination of parameters to check if the >> interpolation is good enough or rubbish). I'll give it a go tomorrow >> at work and I'll report back (even if I get very bad results :-D ). > > I refined the idea a bit. ?Select one simulation, and use the complete > rest as the interpolation base. ?Then repreat this for each > simualation. ?Calculate some joint value for all the results, the > simplest would maybe be, to calculate: > > def joint_ln_density(simulation_results, interpolation_results): > ? ? ? ?return -((interpolation_results - simulation_results) ** 2) / > (simulation_results ** 2) > > In fact, this calculates the logarithm of the Gaussians centered at > *simulation_results* and taken at the "obervations" > *interpolation_results*. ?It is the logarithms of the product of this > Gaussians. ?The standard deviation of the Gaussians is assumed to be > the value of the *simulation_results*, which means, that I assume that > low-valued outcomes are much more precise in absolute numbers than > high-outcome values, but /relative/ to their nominal value they are > all the same precise. ?(NB: A scaling of the stddevs wouldn't make a > significant difference /for you/. ?Same the neglected coefficients of > the Gaussians.) > > I don't know, which method you like the most. ?Robert's and Kevin's > proposals are hard to compete with ... > > You could optimise (maximise) the joint_ln_density outcome as a > function of *epsilon* and the different scalings. ?afaic, scipy comes > with some optimisation algorithms included. ?I checked it: > http://docs.scipy.org/doc/scipy-0.7.x/reference/optimize.html#general-purpose I had planned to show some of the results based on the suggestion you gave me yesterday: I took two thirds ( :-D ) of the simulations database to use them as interpolation base and tried to reproduce the rest using the interpolation. Unfortunately it seems like my computer at work has blown up (maybe a BSOD, I was doing waaaay too many heavy things at once) and I can't access it from home at the moment. I can't show the real field profiles, but at least I can show you how good or bad the interpolation performs (in terms of relative errors), and I was planning to post a matplotlib composite graph to do just that. 
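A rough sketch of the leave-one-out check proposed above, assuming the simulation inputs are collected in an (n_sim, n_par) array params and the production profiles in an (n_sim, n_years) array profiles (both names are hypothetical, not from the thread), with one Rbf per time step as discussed:

import numpy as np
from scipy.interpolate import Rbf

def leave_one_out_errors(params, profiles, **rbf_kw):
    # Drop one simulation at a time, fit the remaining ones, and see how
    # well the dropped profile is reproduced (relative error per time step).
    n_sim, n_years = profiles.shape
    rel_err = np.empty_like(profiles)
    for i in range(n_sim):
        keep = np.arange(n_sim) != i
        coords = [params[keep, j] for j in range(params.shape[1])]
        for t in range(n_years):
            # rbf_kw could carry a fixed epsilon, the kernel name, etc.
            rbf = Rbf(*(coords + [profiles[keep, t]]), **rbf_kw)
            pred = rbf(*params[i])
            rel_err[i, t] = (pred - profiles[i, t]) / profiles[i, t]
    return rel_err

Maximising something like the joint_ln_density above over epsilon and the coordinate scalings could then be layered on top with scipy.optimize.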
I am still hopeful my PC will resurrect at some point :-D However, from the first 100 or so interpolated simulations, I could gather these findings: 1) Interpolations on *cumulative* productions on oil and gas are extremely good, with a maximum range of relative error of -3% / +2%: most of them (95% more or less) show errors < 1%, but for few of them I get the aforementioned range of errors in the interpolations; 2) Interpolations on oil and gas *rates* (time dependent), I usually get a -5% / +3% error per timestep, which is very good for my purposes. I still need to check these values but the first set of results were very promising; 3) Interpolations on gas injection (cumulative and rate) are a bit more shaky (15% error more or less), but this is essentially due to a particularly complex behaviour of the reservoir simulator when it needs to decide (based on user input) if the gas is going to be re-injected, sold, used as excess gas and a few more; I am not that worried about this issue for the moment. I'll be off for a week for Easter (I'll leave for Greece in a few hours), but I am really looking forward to trying the suggestion Kevin posted and to investigating Robert's idea: I know about kriging, but I initially thought it wouldn't do a good job in this case. I'll reconsider for sure. And I'll post a screenshot of the results as soon as my PC gets out of the Emergency Room. Thank you again! Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <== From friedrichromstedt at gmail.com Tue Mar 30 17:16:54 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 30 Mar 2010 23:16:54 +0200 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero In-Reply-To: References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> <4BB230E2.40507@american.edu> Message-ID: 2010/3/30 Ryan May : > On Tue, Mar 30, 2010 at 11:12 AM, Alan G Isaac wrote: >> On 3/30/2010 12:56 PM, Sean Mulcahy wrote: >>> 512x512 arrays. I would like to set elements of the array whose value fall within a specified range to zero (eg 23 < x < 45). >> >> x[(23 < x)*(x < 45)] = 0 > > Or a version that seems a bit more obvious (doing a multiply between > boolean arrays to get an AND operator seems a tad odd): > > x[(23 < x) & (x < 45)] = 0 We recently found out that it executes faster using: x *= ((x <= 23) | (x >= 45)) . Friedrich From pascal22p at parois.net Tue Mar 30 17:18:27 2010 From: pascal22p at parois.net (Pascal) Date: Tue, 30 Mar 2010 23:18:27 +0200 Subject: [Numpy-discussion] Fourier transform In-Reply-To: References: <20100329230053.2d32665f@parois.net> Message-ID: <20100330231827.3f4ad564@parois.net> Le Mon, 29 Mar 2010 16:12:56 -0600, Charles R Harris a écrit : > On Mon, Mar 29, 2010 at 3:00 PM, Pascal wrote: > > > Hi, > > > > Does anyone have an idea how fft functions are implemented? Is it > > pure python? based on BLAS/LAPACK? or is it using fftw? > > > > I successfully used numpy.fft in 3D. I would like to know if I can > > calculate a specific plane using the numpy.fft. > > > > I have in 3D: > > r(x, y, z)=\sum_h^N-1 \sum_k^M-1 \sum_l^O-1 f_{hkl} > > \exp(-2\pi \i (hx/N+ky/M+lz/O)) > > > > So for the plane, z is no longer independent. > > I need to solve the system: > > ax+by+cz+d=0 > > r(x, y, z)=\sum_h^N-1 \sum_k^M-1 \sum_l^O-1 f_{hkl} > > \exp(-2\pi \i (hx/N+ky/M+lz/O)) > > > > Do you think it's possible to use numpy.fft for this?
> > > > > I'm not clear on what you want to do here, but note that the term in > the exponent is of the form <k, x>, i.e., the inner product of > the vectors k and x. So if you rotate x by O so that the plane is > defined by z = 0, then <k, Ox> = <O^T k, x>. That is, you can apply the > transpose of the rotation to the result of the fft. In other words, z is no longer independent but depends on x and y. Apparently, nobody is calculating the exact plane but they are making a slice in the 3D grid and doing some interpolation. However, your answer really helped me with something completely different :) Thanks, Pascal From rmay31 at gmail.com Tue Mar 30 17:35:16 2010 From: rmay31 at gmail.com (Ryan May) Date: Tue, 30 Mar 2010 15:35:16 -0600 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero In-Reply-To: References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> <4BB230E2.40507@american.edu> Message-ID: On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt wrote: > 2010/3/30 Ryan May : >> On Tue, Mar 30, 2010 at 11:12 AM, Alan G Isaac wrote: >>> On 3/30/2010 12:56 PM, Sean Mulcahy wrote: >>>> 512x512 arrays. I would like to set elements of the array whose value fall within a specified range to zero (eg 23 < x < 45). >>> >>> x[(23 < x)*(x < 45)] = 0 >> >> Or a version that seems a bit more obvious (doing a multiply between >> boolean arrays to get an AND operator seems a tad odd): >> >> x[(23 < x) & (x < 45)] = 0 > > We recently found out that it executes faster using: > > x *= ((x <= 23) | (x >= 45)) . Interesting. In an ideal world, I'd love to see why exactly that is, because I don't think multiplication should be faster than a boolean op. If you need speed, then by all means go for it. But if you don't need speed I'd use the & since that will be more obvious to the person who ends up reading your code later and has to spend time decoding what that line does. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From friedrichromstedt at gmail.com Tue Mar 30 17:44:10 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 30 Mar 2010 23:44:10 +0200 Subject: [Numpy-discussion] Interpolation question In-Reply-To: References: <3d375d731003281634wa6d2cc2g4ee1106f0dbac708@mail.gmail.com> Message-ID: 2010/3/30 Andrea Gavana : > However, from the first 100 or so interpolated simulations, I could > gather these findings: > > 1) Interpolations on *cumulative* productions on oil and gas are > extremely good, with a maximum range of relative error of -3% / +2%: > most of them (95% more or less) show errors < 1%, but for few of them > I get the aforementioned range of errors in the interpolations; > 2) Interpolations on oil and gas *rates* (time dependent), I usually > get a -5% / +3% error per timestep, which is very good for my > purposes. I still need to check these values but the first set of > results were very promising; > 3) Interpolations on gas injection (cumulative and rate) are a bit > more shaky (15% error more or less), but this is essentially due to a > particularly complex behaviour of the reservoir simulator when it needs > to decide (based on user input) if the gas is going to be re-injected, > sold, used as excess gas and a few more; I am not that worried about > this issue for the moment. Have a nice time in Greece, and what you write makes me laugh. :-) When you are back, you should maybe elaborate a bit on what gas injections, wells, re-injected gas and so on are, I don't know about it.
Friedrich From robert.kern at gmail.com Tue Mar 30 17:40:37 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Mar 2010 16:40:37 -0500 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero In-Reply-To: References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> <4BB230E2.40507@american.edu> Message-ID: <3d375d731003301440l55a5d722yb7f72fe6237faf84@mail.gmail.com> On Tue, Mar 30, 2010 at 16:35, Ryan May wrote: > On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt > wrote: >> x *= ((x <= 23) | (x >= 45)) ?. > > Interesting. In an ideal world, I'd love to see why exactly that is, > because I don't think multiplication should be faster than a boolean > op. Branch prediction failures are really costly in modern CPUs. http://en.wikipedia.org/wiki/Branch_prediction -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From friedrichromstedt at gmail.com Tue Mar 30 17:51:30 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 30 Mar 2010 23:51:30 +0200 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero In-Reply-To: References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> <4BB230E2.40507@american.edu> Message-ID: 2010/3/30 Ryan May : > On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt > wrote: >> We recently found out that it executes faster using: >> >> x *= ((x <= 23) | (x >= 45)) ?. > > Interesting. In an ideal world, I'd love to see why exactly that is, > because I don't think multiplication should be faster than a boolean > op. ?If you need speed, then by all means go for it. ?But if you don't > need speed I'd use the & since that will be more obvious to the person > who ends up reading your code later and has to spend time decoding > what that line does. Hmm, I'm not familiar with numpy's internals, but I guess, it is because numpy doesn't have to evaluate the indexing boolean array? When the indexing array is applied to x via x[...] = 0? Robert's reply just came in when writing this. Friedrich From rmay31 at gmail.com Tue Mar 30 17:57:39 2010 From: rmay31 at gmail.com (Ryan May) Date: Tue, 30 Mar 2010 15:57:39 -0600 Subject: [Numpy-discussion] Set values of a matrix within a specified range to zero In-Reply-To: <3d375d731003301440l55a5d722yb7f72fe6237faf84@mail.gmail.com> References: <6EB8CEC9-7A9F-4287-BD80-4F7067841B6B@gmail.com> <4BB230E2.40507@american.edu> <3d375d731003301440l55a5d722yb7f72fe6237faf84@mail.gmail.com> Message-ID: On Tue, Mar 30, 2010 at 3:40 PM, Robert Kern wrote: > On Tue, Mar 30, 2010 at 16:35, Ryan May wrote: >> On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt >> wrote: > >>> x *= ((x <= 23) | (x >= 45)) ?. >> >> Interesting. In an ideal world, I'd love to see why exactly that is, >> because I don't think multiplication should be faster than a boolean >> op. > > Branch prediction failures are really costly in modern CPUs. > > http://en.wikipedia.org/wiki/Branch_prediction That makes sense. I still maintain that for 95% of code, easy to understand code is more important than performance differences due to branch misprediction. (And more importantly, we don't want to be teaching new users to code like that from the beginning.) 
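To put the two idioms from this thread side by side, a small self-contained sketch (the array contents are made up; 23 and 45 are just Sean's example bounds); whether the multiply variant really is faster can be checked with timeit on the machine in question:

import numpy as np

x = np.random.uniform(0.0, 100.0, size=(512, 512))

# 1) boolean-mask assignment: zero everything strictly between 23 and 45
y = x.copy()
y[(23 < y) & (y < 45)] = 0

# 2) multiply by the complementary mask, as suggested above
z = x.copy()
z *= (z <= 23) | (z >= 45)

# both give the same result
assert np.array_equal(y, z)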
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From tjhnson at gmail.com Wed Mar 31 13:30:46 2010 From: tjhnson at gmail.com (T J) Date: Wed, 31 Mar 2010 10:30:46 -0700 Subject: [Numpy-discussion] Bug in logaddexp2.reduce Message-ID: Hi, I'm getting some strange behavior with logaddexp2.reduce: from itertools import permutations import numpy as np x = np.array([-53.584962500721154, -1.5849625007211563, -0.5849625007211563]) for p in permutations([0,1,2]): print p, np.logaddexp2.reduce(x[list(p)]) Essentially, the result depends on the order of the array...and we get nans in the "bad" orders. Likely, this also affects logaddexp. From tjhnson at gmail.com Wed Mar 31 13:38:46 2010 From: tjhnson at gmail.com (T J) Date: Wed, 31 Mar 2010 10:38:46 -0700 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: On Wed, Mar 31, 2010 at 10:30 AM, T J wrote: > Hi, > > I'm getting some strange behavior with logaddexp2.reduce: > > from itertools import permutations > import numpy as np > x = np.array([-53.584962500721154, -1.5849625007211563, -0.5849625007211563]) > for p in permutations([0,1,2]): > ? ?print p, np.logaddexp2.reduce(x[list(p)]) > > Essentially, the result depends on the order of the array...and we get > nans in the "bad" orders. ?Likely, this also affects logaddexp. > Sorry, forgot version information: $ python -c "import numpy;print numpy.__version__" 1.5.0.dev8106 From charlesr.harris at gmail.com Wed Mar 31 16:21:41 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 31 Mar 2010 14:21:41 -0600 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: On Wed, Mar 31, 2010 at 11:38 AM, T J wrote: > On Wed, Mar 31, 2010 at 10:30 AM, T J wrote: > > Hi, > > > > I'm getting some strange behavior with logaddexp2.reduce: > > > > from itertools import permutations > > import numpy as np > > x = np.array([-53.584962500721154, -1.5849625007211563, > -0.5849625007211563]) > > for p in permutations([0,1,2]): > > print p, np.logaddexp2.reduce(x[list(p)]) > > > > Essentially, the result depends on the order of the array...and we get > > nans in the "bad" orders. Likely, this also affects logaddexp. > > > > Sorry, forgot version information: > > $ python -c "import numpy;print numpy.__version__" > 1.5.0.dev8106 > __ > Looks like roundoff error. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlewi at intellisis.com Wed Mar 31 17:05:32 2010 From: jlewi at intellisis.com (Jeremy Lewi) Date: Wed, 31 Mar 2010 14:05:32 -0700 Subject: [Numpy-discussion] Dealloat Numy arrays in C Message-ID: <005201cad115$e774cb70$b65e6250$@com> Hello, I'm passing a numpy array into a C-extension. I would like my C-extension to take ownership of the data and handle deallocating the memory when it is no longer needed. (The data is large so I want to avoid unnecessarily copying the data). So my question is what is the best way to ensure I'm using the correct memory deallocator for the buffer? i.e the deallocator for what ever allocator numpy used to allocate the array? Thanks Jeremy Jeremy Lewi Engineering Scientist The Intellisis Corporation jlewi at intellisis.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Wed Mar 31 17:19:12 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Mar 2010 15:19:12 -0600 Subject: [Numpy-discussion] Dealloat Numy arrays in C In-Reply-To: <005201cad115$e774cb70$b65e6250$@com> References: <005201cad115$e774cb70$b65e6250$@com> Message-ID: On Wed, Mar 31, 2010 at 15:05, Jeremy Lewi wrote: > So my question is what is the best way to ensure I?m using the correct > memory deallocator for the buffer? i.e the deallocator for what ever > allocator numpy used to allocate the array? PyArray_free() (note the capitalization). This is almost always PyMem_Free() from Python. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tjhnson at gmail.com Wed Mar 31 18:15:34 2010 From: tjhnson at gmail.com (T J) Date: Wed, 31 Mar 2010 15:15:34 -0700 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris wrote: > > Looks like roundoff error. > So this is "expected" behavior? In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) Out[1]: -1.5849625007211561 In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) Out[2]: nan In [3]: np.log2(np.exp2(-0.5849625007211563) + np.exp2(-53.584962500721154)) Out[3]: -0.58496250072115608 Shouldn't the result at least behave nicely and just return the larger value? From charlesr.harris at gmail.com Wed Mar 31 18:36:12 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 31 Mar 2010 16:36:12 -0600 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: On Wed, Mar 31, 2010 at 4:15 PM, T J wrote: > On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris > wrote: > > > > Looks like roundoff error. > > > > So this is "expected" behavior? > > In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) > Out[1]: -1.5849625007211561 > > In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) > Out[2]: nan > > In [3]: np.log2(np.exp2(-0.5849625007211563) + > np.exp2(-53.584962500721154)) > Out[3]: -0.58496250072115608 > > I don't see that In [1]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) Out[1]: -0.58496250072115619 In [2]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) Out[2]: -1.5849625007211561 What system are you running on. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Wed Mar 31 18:38:54 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 31 Mar 2010 18:38:54 -0400 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: <30C886CF-3F82-4034-905E-F45DFF1D7ECE@cs.toronto.edu> On 31-Mar-10, at 6:15 PM, T J wrote: > > In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) > Out[1]: -1.5849625007211561 > > In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) > Out[2]: nan > > In [3]: np.log2(np.exp2(-0.5849625007211563) + > np.exp2(-53.584962500721154)) > Out[3]: -0.58496250072115608 > > Shouldn't the result at least behave nicely and just return the > larger value? Unfortunately there's no good way of getting around order-of- operations-related rounding error using the reduce() machinery, that I know of. 
I'd like to see a 'logsumexp' and 'logsumexp2' in NumPy instead, using the generalized ufunc architecture, to do it over an arbitrary dimension of an array, rather than needing to invoke 'reduce' on logaddexp. I tried my hand at writing one but I couldn't manage to get it working for some reason, and I haven't had time to revisit it: http://mail.scipy.org/pipermail/numpy-discussion/2010-January/048067.html David From tjhnson at gmail.com Wed Mar 31 18:42:22 2010 From: tjhnson at gmail.com (T J) Date: Wed, 31 Mar 2010 15:42:22 -0700 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: On Wed, Mar 31, 2010 at 3:36 PM, Charles R Harris wrote: >> So this is "expected" behavior? >> >> In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) >> Out[1]: -1.5849625007211561 >> >> In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) >> Out[2]: nan >> > I don't see that > > In [1]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) > Out[1]: -0.58496250072115619 > > In [2]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) > Out[2]: -1.5849625007211561 > > What system are you running on. > $ python --version Python 2.6.4 $ uname -a Linux localhost 2.6.31-20-generic-pae #58-Ubuntu SMP Fri Mar 12 06:25:51 UTC 2010 i686 GNU/Linux From tjhnson at gmail.com Wed Mar 31 18:51:15 2010 From: tjhnson at gmail.com (T J) Date: Wed, 31 Mar 2010 15:51:15 -0700 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: <30C886CF-3F82-4034-905E-F45DFF1D7ECE@cs.toronto.edu> References: <30C886CF-3F82-4034-905E-F45DFF1D7ECE@cs.toronto.edu> Message-ID: On Wed, Mar 31, 2010 at 3:38 PM, David Warde-Farley wrote: > Unfortunately there's no good way of getting around order-of- > operations-related rounding error using the reduce() machinery, that I > know of. > That seems reasonable, but receiving a nan, in this case, does not. Are my expectations are unreasonable? Would sorting the values before reducing be an ugly(-enough) solution? It seems to fix it in this particular case. Or is the method you posted in the link below a better solution? > I'd like to see a 'logsumexp' and 'logsumexp2' in NumPy instead, using > the generalized ufunc architecture, to do it over an arbitrary > dimension of an array, rather than needing to invoke 'reduce' on > logaddexp. I tried my hand at writing one but I couldn't manage to get > it working for some reason, and I haven't had time to revisit it: http://mail.scipy.org/pipermail/numpy-discussion/2010-January/048067.html > I saw this original post and was hoping someone would post a response. Maybe now... From warren.weckesser at enthought.com Wed Mar 31 19:37:23 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 31 Mar 2010 18:37:23 -0500 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: <4BB3DCB3.7040402@enthought.com> T J wrote: > On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris > wrote: > >> Looks like roundoff error. >> >> > > So this is "expected" behavior? > > In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) > Out[1]: -1.5849625007211561 > > In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) > Out[2]: nan > Is any able to reproduce this? I don't get 'nan' in either 1.4.0 or 2.0.0.dev8313 (32 bit Mac OSX). In an earlier email T J reported using 1.5.0.dev8106. 
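A minimal array-level sketch of the 'logsumexp2' idea mentioned above, using the usual trick of factoring out the maximum so no intermediate overflows; this is only a plain 1-D version, not the generalized-ufunc implementation being discussed, and it assumes finite inputs:

import numpy as np

def logsumexp2(x):
    # log2(sum(2**x)), computed stably by shifting by the largest element
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log2(np.exp2(x - m).sum())

x = np.array([-53.584962500721154, -1.5849625007211563, -0.5849625007211563])
print(logsumexp2(x))   # ~0.0, and the same for any ordering of x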
Warren > In [3]: np.log2(np.exp2(-0.5849625007211563) + np.exp2(-53.584962500721154)) > Out[3]: -0.58496250072115608 > > Shouldn't the result at least behave nicely and just return the larger value? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Wed Mar 31 20:08:32 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Mar 2010 20:08:32 -0400 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: <4BB3DCB3.7040402@enthought.com> References: <4BB3DCB3.7040402@enthought.com> Message-ID: On Wed, Mar 31, 2010 at 7:37 PM, Warren Weckesser wrote: > T J wrote: >> On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris >> wrote: >> >>> Looks like roundoff error. >>> >>> >> >> So this is "expected" behavior? >> >> In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) >> Out[1]: -1.5849625007211561 >> >> In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) >> Out[2]: nan >> > > Is any able to reproduce this? ?I don't get 'nan' in either 1.4.0 or > 2.0.0.dev8313 (32 bit Mac OSX). ?In an earlier email T J reported using > 1.5.0.dev8106. >>> np.logaddexp2(-0.5849625007211563, -53.584962500721154) nan >>> np.logaddexp2(-1.5849625007211563, -53.584962500721154) -1.5849625007211561 >>> np.version.version '1.4.0' WindowsXP 32 Josef > > Warren > >> In [3]: np.log2(np.exp2(-0.5849625007211563) + np.exp2(-53.584962500721154)) >> Out[3]: -0.58496250072115608 >> >> Shouldn't the result at least behave nicely and just return the larger value? >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rmay31 at gmail.com Wed Mar 31 21:35:51 2010 From: rmay31 at gmail.com (Ryan May) Date: Wed, 31 Mar 2010 19:35:51 -0600 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: <4BB3DCB3.7040402@enthought.com> References: <4BB3DCB3.7040402@enthought.com> Message-ID: On Wed, Mar 31, 2010 at 5:37 PM, Warren Weckesser wrote: > T J wrote: >> On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris >> wrote: >> >>> Looks like roundoff error. >>> >>> >> >> So this is "expected" behavior? >> >> In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) >> Out[1]: -1.5849625007211561 >> >> In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) >> Out[2]: nan >> > > Is any able to reproduce this? ?I don't get 'nan' in either 1.4.0 or > 2.0.0.dev8313 (32 bit Mac OSX). ?In an earlier email T J reported using > 1.5.0.dev8106. No luck here on Gentoo Linux: Python 2.6.4 (r264:75706, Mar 11 2010, 09:29:48) [GCC 4.3.4] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as np >>> np.logaddexp2(-0.5849625007211563, -53.584962500721154) -0.58496250072115619 >>> np.logaddexp2(-1.5849625007211563, -53.584962500721154) -1.5849625007211561 >>> np.version.version '2.0.0.dev8313' Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From david at silveregg.co.jp Wed Mar 31 22:02:46 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 01 Apr 2010 11:02:46 +0900 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: <4BB3DCB3.7040402@enthought.com> Message-ID: <4BB3FEC6.5080304@silveregg.co.jp> Ryan May wrote: > On Wed, Mar 31, 2010 at 5:37 PM, Warren Weckesser > wrote: >> T J wrote: >>> On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris >>> wrote: >>> >>>> Looks like roundoff error. >>>> >>>> >>> So this is "expected" behavior? >>> >>> In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) >>> Out[1]: -1.5849625007211561 >>> >>> In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) >>> Out[2]: nan >>> >> Is any able to reproduce this? I don't get 'nan' in either 1.4.0 or >> 2.0.0.dev8313 (32 bit Mac OSX). In an earlier email T J reported using >> 1.5.0.dev8106. Having the config.h as well as the compilation options would be most useful, to determine which functions are coming from the system, and which one are the numpy ones, cheers, David From charlesr.harris at gmail.com Wed Mar 31 22:06:00 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 31 Mar 2010 20:06:00 -0600 Subject: [Numpy-discussion] Bug in logaddexp2.reduce In-Reply-To: References: Message-ID: On Wed, Mar 31, 2010 at 4:42 PM, T J wrote: > On Wed, Mar 31, 2010 at 3:36 PM, Charles R Harris > wrote: > >> So this is "expected" behavior? > >> > >> In [1]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) > >> Out[1]: -1.5849625007211561 > >> > >> In [2]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) > >> Out[2]: nan > >> > > I don't see that > > > > In [1]: np.logaddexp2(-0.5849625007211563, -53.584962500721154) > > Out[1]: -0.58496250072115619 > > > > In [2]: np.logaddexp2(-1.5849625007211563, -53.584962500721154) > > Out[2]: -1.5849625007211561 > > > > What system are you running on. > > > > $ python --version > Python 2.6.4 > > $ uname -a > Linux localhost 2.6.31-20-generic-pae #58-Ubuntu SMP Fri Mar 12 > 06:25:51 UTC 2010 i686 GNU/Linux > That is a 32 bit kernel, right? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shailendra.vikas at gmail.com Wed Mar 31 23:24:54 2010 From: shailendra.vikas at gmail.com (Shailendra) Date: Wed, 31 Mar 2010 23:24:54 -0400 Subject: [Numpy-discussion] "Match" two arrays Message-ID: Hi All, I want to make a function which should be like this cordinates1=(x1,y1) # x1 and y1 are x-cord and y-cord of a large number of points cordinates2=(x2,y2) # similar to condinates1 indices1,indices2= match_cordinates(cordinates1,cordinates2) (x1[indices1],y1[indices1]) "matches" (x2[indices2],y2[indices2]) where definition of "match" is such that : If A is closest point to B and distance between A and B is less that delta than it is a "match". If A is closest point to B and distance between A and B is more that delta than there is no match. Every point has either 1 "match"(closest point) or none Also, the size of the cordinates1 and cordinates2 are quite large and "outer" should not be used. I can think of only C style code to achieve this. Can any one suggest pythonic way of doing this? 
Thanks, Shailendra
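One way to avoid both the outer product and hand-written C for this kind of nearest-neighbour matching is scipy.spatial.cKDTree; a possible sketch (not from the thread, and note it matches each point of the second set to its nearest point in the first set without enforcing a one-to-one pairing):

import numpy as np
from scipy.spatial import cKDTree

def match_cordinates(cordinates1, cordinates2, delta):
    # cordinates1 = (x1, y1), cordinates2 = (x2, y2), as in the post
    pts1 = np.column_stack(cordinates1)
    pts2 = np.column_stack(cordinates2)
    tree = cKDTree(pts1)
    dist, nearest1 = tree.query(pts2)   # nearest point of set 1 for every point of set 2
    matched = dist < delta              # keep only matches closer than delta
    indices2 = np.nonzero(matched)[0]
    indices1 = nearest1[matched]
    return indices1, indices2

With that, (x1[indices1], y1[indices1]) and (x2[indices2], y2[indices2]) are the matched pairs.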