Mailman 3 EPD Py2.5 v4.1.30101 Released - NumPy-Discussion

EPD Py2.5 v4.1.30101 Released

Chris Casey

22 Dec 2008

7:11 p.m.

Greetings,

Enthought, Inc. is very pleased to announce the newest release of the Enthought Python Distribution (EPD) Py2.5 v4.1.30101:

http://www.enthought.com/epd

The size of the installer has been reduced by about half. Also, this is the first release to include a 3.1.0 version of the Enthought Tool Suite (http://code.enthought.com/), featuring Mayavi 3.1.0. This is also the first release to use Enthought's enhanced version of setuptools, Enstaller (http://code.enthought.com/projects/enstaller/). Windows installation enhancements, matplotlib and wx issues, and menu consistency across platforms are among the notable fixes.

The full release notes for this release can be found here:

https://svn.enthought.com/epd/wiki/Py25/4.1.30101/RelNotes

Many thanks to the EPD team for putting this release together, and to the community of folks who have provided all of the valuable tools bundled here.

Best Regards,
Chris

---------
About EPD
---------

The Enthought Python Distribution (EPD) is a "kitchen-sink-included" distribution of the Python™ Programming Language, including over 80 additional tools and libraries. The EPD bundle includes NumPy, SciPy, IPython, 2D and 3D visualization, database adapters, and a lot of other tools right out of the box.

http://www.enthought.com/products/epd.php

It is currently available as an easy, single-click installer for Windows XP (x86), Mac OS X (a universal binary for Intel 10.4 and above) and RedHat EL3 (x86 and amd64).

EPD is free for 30-day trial use and for use in degree-granting academic institutions. An annual Subscription and installation support are available for commercial use (http://www.enthought.com/products/epddownload.php), including an Enterprise Subscription with support for particular deployment environments (http://www.enthought.com/products/enterprise.php).

_______________________________________________
Enthought-dev mailing list
Enthought-dev@mail.enthought.com
https://mail.enthought.com/mailman/listinfo/enthought-dev

Gael Varoquaux

23 Dec

12:39 a.m.

New subject: Thoughts on persistence/object tracking in scientific code

Hi,

This mailing list is full of people spending their time writing non-trivial numerical code. This is why I would like to share my questions about a code smell that I notice a lot in my numerical code: it revolves around persisting to disk often, and the mess that results. It is a bit hard to describe and it has been on my mind for a couple of months. I have finally written a blog post in an attempt to share my thoughts:

http://gael-varoquaux.info/blog/?p=83

Pointing to a blog post on a mailing list seems to me almost rude, and I hope you'll forgive me, but I'd love any feedback. It seems to me I am missing a pattern, or simply some insight on a recurrent problem.

Cheers,
Gaël

Olivier Grisel

1:10 a.m.

New subject: Thoughts on persistence/object tracking in scientific code

Interesting topic indeed. I think I have been hit with similar problems on toy experimental scripts. So far the solution was always ad hoc filesystem caches of numpy arrays with manual filename management.

Maybe the first step for designing a generic solution would be to list some representative yet simple enough use cases with real sample Python code, so as to focus on concrete matters and avoid over-engineering a general solution for philosophical problems.

--
Olivier

Gael Varoquaux

24 Dec

1:21 p.m.

New subject: Thoughts on persistence/object tracking in scientific code

On Tue, Dec 23, 2008 at 02:10:50AM +0100, Olivier Grisel wrote:

Interesting topic indeed. I think I have been hit with similar problems on toy experimental scripts. So far the solution was always adhoc FS caches of numpy arrays with manual filename management. Maybe the first step for designing a generic solution would be to list some representative yet simple enough use cases with real sample python code so as to focus on concrete matters and avoid over engineering a general solution for philosophical problems.

Yes, that's clearly a first step: list the use cases, and the way we would like them solved: think about the API.

My internet connection is quite random currently, and I'll probably lose it for a week any time soon. Do you want to start such a page on the wiki? Mark it as a scratch page, and we'll delete it later.

I should point out that joblib (on PyPI and Launchpad) was a first attempt to solve this problem, so you could have a look at it. I have already identified things that are wrong with joblib (more on the API side than actual bugs), so I know it is not a final solution. Figuring out what was wrong only came from using it heavily in my work. I think the only way forward is to start something, use it, figure out what's wrong, and start again...

Looking forward to your input,

Gaël
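As a concrete starting point for the use-case discussion, here is a minimal sketch of caching function results on disk so that file names no longer have to be managed by hand. It is only an illustration and is not joblib's API: the names `disk_cached`, `slow_square` and `calls` are hypothetical, and it assumes that all arguments and results can be pickled.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_cached(cache_dir):
    """Decorator: store each result in a file named after a hash of the call."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        def wrapper(*args, **kwargs):
            # The file name is derived from the function name and its
            # arguments, which replaces manual filename management.
            key = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            path = cache_dir / (hashlib.sha1(key).hexdigest() + ".pkl")
            if path.exists():
                with path.open("rb") as fh:
                    return pickle.load(fh)
            result = func(*args, **kwargs)
            with path.open("wb") as fh:
                pickle.dump(result, fh)
            return result
        return wrapper
    return decorator


calls = []  # records the calls that were really executed


@disk_cached(tempfile.mkdtemp())
def slow_square(x):
    calls.append(x)
    return x * x


first = slow_square(12)   # computed, then written to disk
second = slow_square(12)  # read back from disk, function body not run
```

A sketch like this does not notice when the function body changes, so stale results stay in the cache; that kind of limitation is exactly what a list of real use cases should bring out.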

Bradford Cross

27 Dec

3:59 p.m.

New subject: Thoughts on persistence/object tracking in scientific code

I prototyped an approach last year that worked out well. I don't really know what to call it - maybe something like "property-based persistence." It is kind of strange and I am not sure how broadly applicable it is - I have only used it for financial time series data. I'll try to explain how the idea works.

I start with a Python object that has a number of properties and an associated large data set (in my case, financial instruments and their associated time series in the form of numpy arrays). I then created infrastructure that allowed me to define a simple "mapper" function that used a subset of the object's properties to define a "path" (expressible in the same form either as a file system path or as a path in HDF to a table). Then I persisted the bulky data set (again, time series in my case) at that location.

This little piece of infrastructure is very lightweight and cuts the client-side persistence code down to only the small "mapper" functions. The mapper functions don't actually build up paths - they just specify the properties and ordering that you want to use to build up the paths. It also makes querying very simple and fast because you don't really query at all - instead, the properties associated with the query directly express the path at which the data is located.

The drawback of this simplistic approach is that you need to add a second level of path addressing if you deal with datasets so large that you cannot really persist them under a single path. If you have single multi-GB or TB arrays, you probably want to chunk things up a bit more in the style of GFS and its open source counterparts.

I still have the Python code for this property-based time series database. It is a very small and simple piece of code, but I am happy to give it a quick polish and open source it if anyone is interested in taking a look.

I am also about to try this model using F# and db4o for a .Net project.
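The description above can be sketched roughly as follows. This is not the code mentioned in the message, only a guess at its shape: the `Instrument` class, its property names, and the helpers `instrument_mapper`, `path_for`, `save` and `load` are all hypothetical, and a plain pickle file under a temporary directory stands in for the numpy arrays and HDF tables of the original.

```python
import pickle
import tempfile
from pathlib import Path


class Instrument:
    """Hypothetical object with a few properties and a bulky data set."""

    def __init__(self, asset_class, exchange, symbol):
        self.asset_class = asset_class
        self.exchange = exchange
        self.symbol = symbol


def instrument_mapper():
    # The mapper only names the properties, in order; it builds no path.
    return ("asset_class", "exchange", "symbol")


def path_for(root, mapper, obj):
    # The infrastructure turns the named properties into a storage path.
    parts = [str(getattr(obj, name)) for name in mapper()]
    return Path(root).joinpath(*parts[:-1], parts[-1] + ".pkl")


def save(root, mapper, obj, data):
    path = path_for(root, mapper, obj)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(data, fh)
    return path


def load(root, mapper, obj):
    # No query step: the properties of the object give the location directly.
    with path_for(root, mapper, obj).open("rb") as fh:
        return pickle.load(fh)


root = tempfile.mkdtemp()
es = Instrument("future", "CME", "ES")
saved_path = save(root, instrument_mapper, es, [100.0, 101.5, 99.75])

# A second object with the same properties resolves to the same path.
query = Instrument("future", "CME", "ES")
loaded = load(root, instrument_mapper, query)
```

Very large arrays would need the second level of addressing mentioned above, for example by adding a chunk index as one more path component.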

Gael Varoquaux

5:33 p.m.

New subject: Thoughts on persistence/object tracking in scientific code

On Sat, Dec 27, 2008 at 04:59:25PM +0100, Bradford Cross wrote:

I prototyped an approach last year that worked out well. I don't really know what to call it - maybe something like "property based persistence." It is kind of strange and I am not sure how broadly applicable it is - I have only used it for financial time series data.

Yes, that's exactly what I had in mind for my second try. I thought I would call this special object some kind of execution context.

I still have the python code for this properties based time series database. It is a very small and simple peice of code, but I am happy to give it a quick polish and open source it if anyone is interested in taking a look.

I am very interested in both your code and anything you can tell us about what worked well and what you would do differently.

I am also about to try this model using F# and db4o for a .Net project.

Functional languages are clearly a very interesting avenue to go down for these problems. I am right now in Python, and staying there for a while, but I believe I can learn a lot from functional languages.

Thanks for your feedback,

Gaël
