Hi,

The following trivial code snippet does not work as expected:

-------------------------------
from scipy import *
import copy

shape = (256,256)
data = zeros(256*256)
data.shape = shape
print 'old shape', data.shape
print data

data=flipud(data)
data.shape=(256*256,)
print 'new shape', data.shape
-------------------------------

It exits with an incomprehensible error:

    data.shape=(256*256,)
AttributeError: incompatible shape for a non-contiguous array

If 'flipud' is omitted, it works as expected. I tried via a deepcopy, but the problem persists. Why should flipud invalidate 'reshapeability'? What am I doing wrong?

Thanks a lot for any hints,
- Dominik

--
Dominik Szczerba, Ph.D.
Computer Vision Lab CH-8092 Zurich
http://www.vision.ee.ethz.ch/~domi
Dominik Szczerba wrote:
> If 'flipud' is omitted, it works as expected. I tried via a deepcopy, the problem persists. Why should flipud invalidate 'reshapeability'?
Assigning to .shape only adjusts the strides. It does not change any of the memory. It will only let you do that when the memory layout is consistent with the desired shape. flipud() just gets a view on the original memory by using different strides; the result is non-contiguous. The memory layout is no longer consistent with the flattened view that you are requesting. Here is an example:

In [25]: data = arange(4)

This is the layout in memory for 'data' and (later) 'd2':

In [26]: data
Out[26]: array([0, 1, 2, 3])

In [29]: data.shape = (2, 2)

In [30]: data
Out[30]:
array([[0, 1],
       [2, 3]])

In [31]: d2 = flipud(data)

In [32]: d2
Out[32]:
array([[2, 3],
       [0, 1]])

Calling .ravel() will copy the array if it is non-contiguous and will show you the memory layout that 'd2' is mimicking with its strides.

In [33]: d2.ravel()
Out[33]: array([2, 3, 0, 1])

Assigning to .shape will only let you do that if the memory layout is consistent with the view that the array is trying to do.

In [52]: import copy

In [53]: d3 = copy.deepcopy(d2)

In [54]: d3
Out[54]:
array([[2, 3],
       [0, 1]])

In [55]: d3.shape = (4,)

In [56]: d3
Out[56]: array([2, 3, 0, 1])

copy.deepcopy() should have worked. I don't know why it didn't for you. However:
> What am I doing wrong?
You will want to use numpy.reshape() if you want the most foolproof and idiomatic way to get a reshaped array. It will copy the array if necessary.

In [57]: reshape(d2, (4,))
Out[57]: array([2, 3, 0, 1])

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
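The explanation in this reply can be checked with a short script. What follows is only a minimal sketch, assuming a current NumPy imported as `np` (the thread itself used `from scipy import *`); the variable names are illustrative and not taken from the original posts.

```python
import numpy as np

data = np.arange(4).reshape(2, 2)   # C-contiguous: rows stored one after another
flipped = np.flipud(data)           # a view on the same memory, rows walked backwards

# The view shares memory with the original and steps backwards between rows.
is_view = np.shares_memory(data, flipped)
negative_row_stride = flipped.strides[0] < 0
contiguous = flipped.flags["C_CONTIGUOUS"]

# Flattening this view cannot be expressed with strides alone, so the
# in-place shape assignment is refused.
try:
    flipped.shape = (4,)
    in_place_ok = True
except AttributeError:
    in_place_ok = False

# np.reshape() copies when it has to and returns the flattened result.
flat = np.reshape(flipped, (4,))
```

The flags and strides make the "non-contiguous" wording of the error message concrete: the flipped array is the same buffer read in a different order, and only a copy can put the elements into the order a flat array needs.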
Thank you for a very helpful explanation. Please see below:

Robert Kern wrote:
> Calling .ravel() will copy the array if it is non-contiguous and will show you the memory layout that 'd2' is mimicking with its strides.

Quite a bit of a gotcha for a post-MATLAB user. The deepcopy thing was already not pleasant to swallow.
> copy.deepcopy() should have worked. I don't know why it didn't for you. However:

I was doing it another way, namely flipping a deepcopy. You say to deepcopy the result, and that works:

data = flipud(data)
data3 = copy.deepcopy(data)
data3.shape = (256*256,)
print 'new shape', data3.shape
> You will want to use numpy.reshape() if you want the most foolproof and idiomatic way to get a reshaped array. It will copy the array if necessary.
>
> In [57]: reshape(d2, (4,))
> Out[57]: array([2, 3, 0, 1])
This actually did not work:

shape = (256,256)
data = zeros(256*256)
data.shape = shape
print 'old shape', data.shape
print data
data2 = flipud(data)
data2.ravel()
reshape(data2,(256*256,))
print 'new shape', data2.shape

The shape is preserved! Even though I am fine with the previously given solution, I am still curious what is wrong here.

BTW, why (size,) and not (size,1)?

Thanks a lot for your help,
- Dominik
Dominik Szczerba wrote:
> Quite a bit of a gotcha for a post-MATLAB user. The deepcopy thing was already not pleasant to swallow.
I don't recommend using deepcopy. Use reshape().
> The shape is preserved! Even though I am fine with the previously given solution, I am still curious what is wrong here.

reshape() returns a (possibly new) object with the requested shape. It does not affect the shape of the original array since it is not always possible to do that safely. Assigning to .shape is the appropriate way to change the shape of an existing array in-place if it is safe to do so. Two different ways to do two different things.
> BTW, why (size,) and not (size,1)?

Because they are different things. Not everything is a 2D array.

--
Robert Kern
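Both points in this reply can be illustrated in a few lines. This is a sketch under the assumption of a current NumPy imported as `np`; the names `data2` and `data3` mirror the thread, the rest are made up for the example.

```python
import numpy as np

data2 = np.flipud(np.zeros((256, 256)))

# reshape() leaves its argument alone; the result must be bound to a name.
np.reshape(data2, (256 * 256,))          # return value discarded, data2 unchanged
data3 = np.reshape(data2, (256 * 256,))  # keep the reshaped array

# (n,) is a 1-D array; (n, 1) is a 2-D array with a single column.
one_d = np.arange(3)
column = one_d.reshape(3, 1)
outer_sum = one_d + column               # broadcasting combines them into a 3x3 array
```

The last line shows why the distinction matters in practice: a 1-D array and a column behave differently under broadcasting, so the two shapes are not interchangeable.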
Robert Kern wrote:
> reshape() returns a (possibly new) object with the requested shape. It does not affect the shape of the original array since it is not always possible to do that safely. Assigning to .shape is the appropriate way to change the shape of an existing array in-place if it is safe to do so. Two different ways to do two different things.

OK, so

data3 = reshape(data2,(256*256,))

fixes it at the least expense. Thanks a lot for clearing up the confusion.

- Dominik
Hi,

Is it possible to directly read/write bz2 compressed binary files with scipy?

Thanks,
Dominik
Dominik Szczerba wrote:
> Is it possible to directly read/write bz2 compressed binary files with scipy?

Check out the Python bz2 module.

--
Bill
wjdandreta@att.net
Gentoo Linux 2.6.20-gentoo-r8
Reclaim Your Inbox with http://www.mozilla.org/products/thunderbird/

All things cometh to he who waiteth as long as he who waiteth worketh like hell while he waiteth.
Yes, I know it, but it does not return a scipy array, does it? Can I achieve it without copying memory? (I have huge arrays to process)

- Dominik

Bill Dandreta wrote:
> Check out the Python bz2 module.
On Wed, 20 Jun 2007 at 17:48 +0200, Dominik Szczerba wrote:
> Yes, I know it, but it does not return a scipy array, does it? Can I achieve it without copying memory? (I have huge arrays to process)

Do you need bzip2 for some particular reason? In general, zlib or lzo are enough to achieve decent compression ratios on numerical data, while allowing much better compression, and especially decompression, speed.

In any case, PyTables does have support for the (zlib, lzo, bzip2) threesome right out of the box. In addition, it is meant to deal with huge arrays (it saves data in small chunks that are compressed and decompressed individually, so you don't have to worry about wasting too much memory on (de-)compression buffers).

Regards,

--
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works,
www.carabos.com   |  I haven't tested it. -- Donald Knuth
PyTables is great (and big), but I just need to read in a sequence of values.

Thanks a lot anyway,
Dominik
On Wed, 20 Jun 2007 at 21:01 +0200, Dominik Szczerba wrote:
> PyTables is great (and big) while I just need to read in a sequence of values.

OK, that's fine. In any case, I'm interested in knowing why you are using bzip2 instead of zlib. Have you detected some data pattern where you get significantly more compression than by using zlib, for example?

I'm asking this because, in my experience with numerical data, I was unable to detect important compression level differences between bzip2 and zlib. See:

http://www.pytables.org/docs/manual/ch05.html#compressionIssues

for some experiments in that regard.

I'd appreciate any input on this subject (bzip2 vs zlib).

--
Francesc Altet
Hi,

I preferred bz2 over zlib because of its higher compression, despite its slower performance. This common belief usually matched my experience. However, the simple test below, made with fresh data from this morning, clearly undermines this thinking:

$ du -hsc test9*.dat
428M    total

$ time gzip test9*.dat
real    0m31.663s
user    0m28.946s
sys     0m1.612s

$ du -hsc test9*.dat.gz
215M    total

$ time gunzip test9*.dat.gz
real    0m7.447s
user    0m6.036s
sys     0m1.264s

$ time bzip2 test9*.dat
real    2m1.696s
user    1m54.527s
sys     0m4.008s

$ du -hsc test9*.dat.bz2
219M    total

$ time bunzip2 test9*.dat.bz2
real    0m43.252s
user    0m39.926s
sys     0m2.792s

I am surprised, as I well remember cases where I could gain 20%. But indeed, given the much slower performance, you have me convinced to use zlib over bz2.

Thanks for forcing me to do this test,
- Dominik
On Thu, 21 Jun 2007 at 12:57 +0200, Dominik Szczerba wrote:
> I am surprised, as I well remember cases where I could gain 20%.

Yeah, there should be cases where bzip2 is clearly better than zlib, and one of these could be images. My teammate Ivan has come up with this example:

-rw------- 1 ivan ivan 733373 2007-06-21 13:02 lena1.tif.gz
-rw------- 1 ivan ivan 584478 2007-06-21 13:02 lena2.tif.bz2

(you should already know where the source is: www.lenna.org)

But when it comes to general binary data for scientific uses, the compression advantages of bzip2 over zlib are less clear.

> But indeed, given the much slower performance, you have me convinced to use zlib over bz2.
>
> thanks for forcing me to do this test,

You are welcome ;)

--
Francesc Altet
There is also another thing: bzip2 uses --best by default, while gzip uses -6. The whole thing is of course strongly data-dependent.

- Dominik
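Because the default levels differ, a fair comparison has to name the level for each compressor. The sketch below does that with Python's standard `zlib` and `bz2` modules; the synthetic signal is an assumption made for the example and says nothing about the `test9*.dat` files measured in this thread.

```python
import bz2
import zlib

import numpy as np

# Synthetic stand-in for measurement data: a random-walk float64 signal.
rng = np.random.default_rng(0)
samples = np.cumsum(rng.normal(size=100_000)).astype(np.float64)
raw = samples.tobytes()

# Name each setting explicitly so the comparison is like for like.
candidates = {
    "zlib level 6": lambda b: zlib.compress(b, 6),
    "zlib level 9": lambda b: zlib.compress(b, 9),
    "bz2 level 9": lambda b: bz2.compress(b, 9),
}

sizes = {name: len(fn(raw)) for name, fn in candidates.items()}
for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / len(raw):.1%} of original)")

# Both codecs are lossless, so a round trip must reproduce the input exactly.
restored = np.frombuffer(bz2.decompress(bz2.compress(raw, 9)), dtype=np.float64)
```

As noted in the message, the outcome is strongly data-dependent, so the printed ratios are only meaningful for the data actually fed in.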
Hi,

2007/6/21, Francesc Altet <faltet@carabos.com>:
<snip>
> Ok, that's fine. In any case, I'm interested in knowing the reasons on why you are using bzip2 instead zlib. Have you detected some data pattern where you get significantly more compression than by using zlib for example?.
>
> I'm asking this because, in my experience with numerical data, I was unable to detect important compression level differences between bzip2 and zlib. See:
>
> http://www.pytables.org/docs/manual/ch05.html#compressionIssues
>
> for some experiments in that regard.
>
> I'd appreciate any input on this subject (bzip2 vs zlib).

Probably not very meaningful, but with ASCII data (floats stored as ASCII) bzip2 seems to have a certain advantage (both in speed and compression ratio):

$ du -h lena.txt
3,1M    lena.txt

$ time gzip -9 lena.txt
real    0m4.937s   <=
user    0m4.758s
sys     0m0.018s

$ du -h lena.txt.gz
316K    lena.txt.gz

$ time gunzip lena.txt.gz
real    0m0.092s
user    0m0.038s
sys     0m0.020s

$ time bzip2 lena.txt
real    0m2.524s   <=
user    0m2.396s
sys     0m0.027s

$ du -h lena.txt.bz2
188K    lena.txt.bz2

$ time bunzip2 lena.txt.bz2
real    0m0.868s
user    0m0.775s
sys     0m0.040s

Even if it's usually a bad idea to put numerical data in ASCII format, it may sometimes be handy.

Regards,

  ~ Antonio
I remember that even for my binary data bzip2 was about 20% better, but significantly slower. Best would of course be to have both (and more) compressors and choose whichever suits the case best. But in the real world zlib is probably the more general choice if only one compressor is intended.

PS. Yes, it's a very bad idea to keep real numbers as ASCII.

- Dominik

_______________________________________________
SciPy-user mailing list
SciPy-user@scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
On 20/06/07, Dominik Szczerba <domi@vision.ee.ethz.ch> wrote:
> Yes, I know it, but it does not return a scipy array, does it? Can I achieve it without copying memory? (I have huge arrays to process)

If the bz2 module provides a file-like object, scipy.read_array can read from that.

Anne
That works very well for ASCII files, but I could not figure out how to do it for binary data...

Thanks for any hints,
- Dominik
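For the ASCII case that did work, the same idea can be sketched with present-day tools. This example assumes a current NumPy and uses `np.savetxt`/`np.loadtxt` as stand-ins for `scipy.read_array`, which is not what the posters ran; the file name is hypothetical.

```python
import bz2
import os
import tempfile

import numpy as np

values = np.arange(12, dtype=np.float64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "values.txt.bz2")   # hypothetical file name

    # Write whitespace-separated text through a bz2 stream.
    with bz2.open(path, "wt") as fh:
        np.savetxt(fh, values)

    # The parser only needs a file-like object that yields text lines.
    with bz2.open(path, "rt") as fh:
        loaded = np.loadtxt(fh)
```

The text parser never sees the compression; it only reads lines from the object handed to it, which is why this route fits ASCII data and does not carry over to raw binary values.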
I got it (partially) working, but am not sure about optimality. In particular, will fromstring copy memory into the array or decompress in place? I think the former (how else would it know the size, and tell() will be slow), but please correct me if I am wrong.

import gzip
fh = gzip.GzipFile("test.dat.gz", 'rb');
#ps = zeros(256*256) - will it help?
ps = fromstring(fh.read(), 'd')
ps.shape = (256,256)
fh.close()

fp = open('test.dat', 'wb')
io.numpyio.fwrite(fp, ps.size, ps)
fp.close()

- Dominik
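A present-day version of this read might look like the sketch below. It assumes a current NumPy, where `np.frombuffer` is the usual replacement for `fromstring` on binary data, and it first writes its own test file under a hypothetical name so that it can run on its own.

```python
import gzip
import os
import tempfile

import numpy as np

original = np.linspace(0.0, 1.0, 256 * 256).reshape(256, 256)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.dat.gz")   # hypothetical file name

    # Write the raw float64 bytes through gzip.
    with gzip.open(path, "wb") as fh:
        fh.write(original.tobytes())

    # fh.read() builds one bytes object holding the whole decompressed payload.
    with gzip.open(path, "rb") as fh:
        payload = fh.read()

# frombuffer() wraps the bytes without copying them, so the result is read-only.
ps = np.frombuffer(payload, dtype=np.float64).reshape(256, 256)
writable = ps.flags["WRITEABLE"]

# Take an explicit copy only if the array has to be modified.
ps_copy = ps.copy()
```

With this approach the decompressed data exists once, as the `bytes` object, and the array is a read-only window onto it; a second copy is made only when `copy()` is called.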
On 20-Jun-07, at 4:18 PM, Dominik Szczerba wrote:
> I got it (partially) working, but am not sure about optimality. In particular, will fromstring copy memory into the array or decompress in place? I think the former (how else would it know the size, and tell() will be slow), but please correct me if I am wrong.

I would almost certainly bet it would do a copy. Did you try using Anne's suggestion of scipy.read_array with your 'fh' object?

Also, somebody correct me if I'm wrong, but I don't think modifying the 'shape' property directly is the recommended way to do it; I think you should be using ps.resize().

David
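On the question of `.shape` versus `resize()`, the three operations do different jobs, and a small comparison may help. This is a sketch assuming a current NumPy; as far as I can tell, `resize()` is intended for changing the number of elements, which is not what the code above needs.

```python
import numpy as np

ps = np.arange(6, dtype=np.float64)

# reshape() returns a new array object; for contiguous data it is a view.
grid = ps.reshape(2, 3)
shares = np.shares_memory(ps, grid)

# Assigning to .shape relabels the same object in place (same element count).
relabelled = np.arange(6, dtype=np.float64)
relabelled.shape = (2, 3)

# resize() changes the number of elements in place; new slots are zero-filled.
grown = np.arange(6, dtype=np.float64)
grown.resize((2, 4), refcheck=False)
```

For a buffer that already holds exactly 256*256 values, either `reshape()` or the `.shape` assignment gives the 2-D layout without touching the data.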
David Warde-Farley wrote:
> I would almost certainly bet it would do a copy. Did you try using

Is there a way to avoid it if I know the size of the unpacked sequence a priori?

> Anne's suggestion of scipy.read_array with your 'fh' object?

Yes, I did, and I reported it back to the list (it works only for ASCII data).

> Also, somebody correct me if I'm wrong, but I don't think modifying the 'shape' property directly is the recommended way to do it, I think you should be using ps.resize().

Thanks for the warning, but actually I was able to do things with the array formed this way (matplotlib plots, the usual stuff like sqrt and powers, etc.).

Thanks a lot,
Dominik
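When the size of the unpacked sequence is known in advance, one option is to allocate the array first and let the decompressor write into it. The helper below, `read_into_array`, is a hypothetical name for a sketch of that idea; whether the gzip module still builds a temporary buffer internally depends on the Python version, and I have not measured it.

```python
import gzip
import os
import tempfile

import numpy as np


def read_into_array(fh, shape, dtype=np.float64):
    """Fill a preallocated array from a binary file-like object."""
    out = np.empty(shape, dtype=dtype)
    target = memoryview(out).cast("B")   # flat, writable byte view of the array
    filled = 0
    while filled < len(target):
        n = fh.readinto(target[filled:])
        if not n:                        # stream ended before the array was full
            raise EOFError("file is shorter than the requested array")
        filled += n
    return out


original = np.arange(256 * 256, dtype=np.float64).reshape(256, 256)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.dat.gz")   # hypothetical file name
    with gzip.open(path, "wb") as fh:
        fh.write(original.tobytes())
    with gzip.open(path, "rb") as fh:
        ps = read_into_array(fh, (256, 256))
```

Unlike the `frombuffer` route, the result here is an ordinary writable array that owns its memory, and there is no Python-level `bytes` object holding the full payload.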
participants (7)
- Anne Archibald
- Antonino Ingargiola
- Bill Dandreta
- David Warde-Farley
- Dominik Szczerba
- Francesc Altet
- Robert Kern