Writing/reading a numeric array to a file
Hello everyone,
I have been using Numeric and/or numarray for a while now, and I have a question about reading and writing data to a file.
Until now I have used pycdf, an interface to the netCDF library, to save and load my numeric arrays. However, since my application already has too many dependencies, I would like to cut down on these two and replace the saving and loading with a method intrinsic to Numeric or numarray. Pickling is not an option for me, because I need a representation that can be read by other programs (and yet is faster than ASCII). NetCDF was a good choice for that, but I would really like not to depend on it.
So my questions are:
(1) Are the numarray functions 'fromfile' and 'tofile' 100% portable among platforms, i.e. do they automatically recognize endian-ness and such?
(2) Does it make sense to still use numarray? I know Travis would say "use scipy_core". However, for me this would provide much unneeded functionality, and I have not yet found an easy way to install scipy_core (it seems to require ATLAS and such, which are not so easy to install if you don't have them prepackaged). And after all, my goal is to cut down dependencies...
(3) Let me restate question (2): Will numarray still be maintained? Or is it also deprecated? What would you advise someone who just needs the array interface?
And of course,
(4) What solutions do you use to save/load data to files?
I would deeply appreciate any help on this subject,
Niklas Volbers
N. Volbers wrote:
Hello everyone,
I have been using Numeric and/or numarray for a while now, and I have a question about reading and writing data to a file.
Until now I have used pycdf, an interface to the netCDF library, to save and load my numeric arrays. However, since my application already has too many dependencies, I would like to cut down on these two and replace the saving and loading with a method intrinsic to Numeric or numarray. Pickling is not an option for me, because I need a representation that can be read by other programs (and yet is faster than ASCII). NetCDF was a good choice for that, but I would really like not to depend on it.
So my questions are:
(1) Are the numarray functions 'fromfile' and 'tofile' 100% portable among platforms, i.e. do they automatically recognize endian-ness and such?
No. But depending on how general a solution you need here, this can be fairly easy (i.e. for numerical arrays only).
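For purely numerical data, one way to get the portability asked about in (1) is to fix the byte order on disk rather than rely on the writing machine's native order. The following is only a minimal sketch using the standard-library struct module instead of numarray; the function names, the "ARR1" tag, and the header layout are hypothetical and not part of any existing format.

```python
import struct

MAGIC = b"ARR1"  # hypothetical 4-byte tag, invented for this sketch


def write_floats(path, values):
    """Write float64 values as little-endian, whatever the host byte order."""
    with open(path, "wb") as f:
        f.write(MAGIC)
        # Element count as an unsigned 64-bit little-endian integer.
        f.write(struct.pack("<Q", len(values)))
        # The "<" prefix forces little-endian, standard-size doubles.
        f.write(struct.pack("<%dd" % len(values), *values))


def read_floats(path):
    """Read back a file produced by write_floats on any platform."""
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("not an ARR1 file")
        (count,) = struct.unpack("<Q", f.read(8))
        return list(struct.unpack("<%dd" % count, f.read(8 * count)))
```

Because the byte order is part of the format, the reader never has to guess which kind of machine wrote the file, and a program in another language can parse the same layout with a few lines of code.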
(2) Does it make sense to still use numarray?
Absolutely... numarray works with records and memory mapping now. But you also need to keep a shrewd eye on scipy newcore and recognize that it will most probably replace numarray over the course of the next year or two as it becomes sufficiently complete and stable to do so. There are smart things to do now if you want to use numarray:
a. Use the Numeric-compatible C-API as much as possible.
b. Keep an eye on the introduction of newcore compatible typenames (int32 vs Int32), keywords (dtype vs. type), and attributes and use those as you write new code in numarray.
c. Use the array protocol.
I know Travis would say "use scipy_core". However, for me this would provide much unneeded functionality and I have not yet found an easy way to install scipy_core (it seems to require ATLAS and such, which are not so easy to install if you don't have it prepackaged). And after all, my goal is to cut down dependencies...
(3) Let me restate question (2): Will numarray still be maintained?
numarray will be maintained at STScI until (a) newcore is ready to replace it or (b) our budget gets cut to the point that we cannot and no one else is interested. Neither of those is guaranteed to happen, but (a) looks likely to us. STScI has the same problems with installation and dependencies so they'll have to be solved before we use newcore either.
Or is it also deprecated? What would you advise someone who just needs the array interface?
Pay careful attention to the __array_struct__ attribute described here: http://numeric.scipy.org/array_interface.html It's the easiest and best performing method to interface from C. To interface from Python you have to use more of the protocol.
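As a rough illustration of the Python side of the protocol, the sketch below exposes a one-dimensional float64 buffer through an __array_interface__ dictionary. This is the dictionary form used by present-day NumPy, which is an assumption here and may differ from the protocol version described at the URL above; the FloatBuffer class is hypothetical and uses only the standard library.

```python
import array
import sys


class FloatBuffer:
    """Hypothetical wrapper exposing a 1-D float64 buffer to array consumers."""

    def __init__(self, values):
        self._data = array.array("d", values)

    @property
    def __array_interface__(self):
        # buffer_info() returns (memory address, number of elements).
        address, length = self._data.buffer_info()
        # The type string carries the byte order explicitly.
        byteorder = "<" if sys.byteorder == "little" else ">"
        return {
            "version": 3,
            "shape": (length,),
            "typestr": byteorder + "f8",
            "data": (address, False),  # (pointer, read-only flag)
        }
```

A consumer that understands the protocol can wrap the memory without copying it, provided it keeps the FloatBuffer object alive for as long as it uses the pointer.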
And of course,
(4) What solutions do you use to save/load data to files?
numarray was written to support astronomical data processing. The dominant data format in astronomy is called FITS. STScI has another package called PyFITS which is built on numarray and exposes the FITS format to Python. Regards, Todd
N. Volbers wrote:
(2) Does it make sense to still use numarray? I know Travis would say "use scipy_core". However, for me this would provide much unneeded functionality and I have not yet found an easy way to install scipy_core (it seems to require ATLAS and such, which are not so easy to install if you don't have it prepackaged). And after all, my goal is to cut down dependencies...
What unneeded functionality is there? SciPy Core is just a Numeric replacement. It does not NEED ATLAS; it just uses it if you have it --- exactly the same as Numeric and numarray. It is a misconception to say scipy core needs anything else but Python installed. So, let's not let that rumor kill convergence to a single package.
(4) What solutions do you use to save/load data to files?
SciPy core arrays have tofile methods and a fromfile function. They are raw reading and writing --- nothing fancy. You need to use Pickling if you want to recognize endian-ness among platforms. What is your opposition to Pickling? I think you should take a look at PyTables for more elegant solutions. -Travis
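The difference between a raw dump and a pickle can be sketched with the standard-library array module standing in for a numeric array (an assumption made so the example runs without Numeric, numarray, or scipy core). The load_raw helper is hypothetical.

```python
import array
import pickle
import sys

data = array.array("d", [1.0, 2.5, -3.0])

# Raw dump: native byte order, no header, no type information.
raw = data.tobytes()


def load_raw(raw_bytes, written_byteorder):
    """Rebuild the array from raw bytes; the caller must know who wrote them."""
    out = array.array("d")
    out.frombytes(raw_bytes)
    if written_byteorder != sys.byteorder:
        out.byteswap()  # convert the foreign byte order to the host's
    return out


# Pickle: the type code travels with the data, so Python restores both.
restored = pickle.loads(pickle.dumps(data))
```

The raw bytes are easy for other programs to read but carry no metadata, so element type and byte order have to be agreed on out of band; the pickle is self-describing but is only convenient to read from Python.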
On Wednesday 09 November 2005 20:21, Travis Oliphant wrote:
SciPy core arrays have tofile methods and a fromfile function. They are raw reading and writing --- nothing fancy. You need to use Pickling if you want to recognize endian-ness among platforms. What is your opposition to Pickling?
I think you should take a look at PyTables for more elegant solutions.
The read_array and write_array of the old scipy.io are very handy. Any chance that they will be incorporated somehow in the new scipy? PyTables is powerful but perhaps a bit overkill if you just want to read or write a few columns in ascii format. Joris
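For the "few columns in ascii format" case, a dependency-free version is short enough to write by hand. The sketch below is not the scipy.io read_array/write_array API; write_columns and read_columns are hypothetical helpers built on the standard library only.

```python
def write_columns(path, columns, sep=" "):
    """Write equal-length sequences of numbers as text columns, one row per line."""
    with open(path, "w") as f:
        for row in zip(*columns):
            # repr() of a float round-trips exactly through float() in Python 3.
            f.write(sep.join(repr(float(x)) for x in row) + "\n")


def read_columns(path, sep=None):
    """Read whitespace-separated (or sep-separated) columns back as lists of floats."""
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            rows.append([float(tok) for tok in line.split(sep)])
    # Transpose the rows back into columns.
    return [list(col) for col in zip(*rows)]
```

A call such as write_columns("data.txt", [x, y]) followed by read_columns("data.txt") returns the two columns as lists of floats; the resulting file is also readable by plotting tools and spreadsheets, at the cost of size and speed compared with a binary format.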
On Thu, 10 Nov 2005, Joris De Ridder wrote:
On Wednesday 09 November 2005 20:21, Travis Oliphant wrote:
SciPy core arrays have tofile methods and a fromfile function. They are raw reading and writing --- nothing fancy. You need to use Pickling if you want to recognize endian-ness among platforms. What is your opposition to Pickling?
I think you should take a look at PyTables for more elegant solutions.
The read_array and write_array of the old scipy.io are very handy. Any chance that they will be incorporated somehow in the new scipy? PyTables is powerful but perhaps a bit overkill if you just want to read or write a few columns in ascii format.
They are already there in newscipy (as most of the other routines from "old" scipy). Best, Arnd
participants (5)
- Arnd Baecker
- Joris De Ridder
- N. Volbers
- Todd Miller
- Travis Oliphant