scipy.io.numpyio fwrite - appending or updating an array
I have an existing binary file containing numpy array data. It has been created using open, fwrite & close and I can read the data using fread.

I want to be able to either append a new array to the end of the file or update an existing array within the file.

I've tried opening the file with a mode of either 'ab+' or 'wb+' and then writing the data using something like:

    fd = open(vfname, 'ab+')
    if fd:
        filepos = (self.id-1)*self.yarray.size*4
        fd.seek(filepos)
        fwrite(fd, self.yarray.size, self.yarray, 'f')
        fd.close()

When I use a mode of 'ab+' it looks like the data has been written to the file ok (no errors reported) but when I read it back I get my original data.

When I use 'wb+' then my updated data gets written and read back ok. But when I reload the file, everything apart from my updated data (i.e. everything before it in the file) is now zero.

The '+' in the mode seems to make no difference.

What am I doing wrong?

Thanks
Bren.
I've tried replacing numpyio with both fopen and now also npfile, but I'm getting the same problem: if I write a numpy array to the file, everything else before that position in the file is now zero. It is as if it were a new file, not an existing one.

_______________________________________________
SciPy-user mailing list
SciPy-user@scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
Hi Brennan

2008/8/1 Brennan Williams <brennan.williams@visualreservoir.com>:
> I've tried replacing numpyio with both fopen and now also npfile but I'm getting the same problem, i.e. if I write a numpy array to the file, everything else before that position in the file is now zero. It is as if it is a new file, not an existing one.

Have you looked at NumPy's memmap class (numpy.memmap)?

Regards
Stéfan
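As an illustration of the memmap suggestion, here is a minimal sketch of updating one array inside an existing file with numpy.memmap. The file layout (three float32 records of five values each), the file name and the variable names are assumptions made for the example, not details from the thread.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: 3 records of 5 float32 values stored back to back.
n_records, record_len = 3, 5
path = os.path.join(tempfile.mkdtemp(), "arrays.dat")
np.arange(n_records * record_len, dtype=np.float32).tofile(path)

# mode="r+" maps the existing file for reading and writing without truncating it.
mm = np.memmap(path, dtype=np.float32, mode="r+", shape=(n_records, record_len))
mm[1] = -1.0   # overwrite the second record in place
mm.flush()     # push the change to disk
del mm         # release the mapping

# Read the file back independently to check that only record 1 changed.
result = np.fromfile(path, dtype=np.float32).reshape(n_records, record_len)
print(result)
```

The mapping in this sketch has a fixed shape, so it covers the "update" case; appending a new record is probably simpler with an ordinary file opened in append mode.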
I'll look into memmap as Stefan suggested.

If Robert Kern's out there, do you have any comments about what I might be doing wrong?

I don't know memmap at all yet. Basically, each file will have multiple numpy arrays written, read, appended and updated as required.

Brennan
On Fri, Aug 1, 2008 at 04:14, Brennan Williams <brennan.williams@visualreservoir.com> wrote:
> I'll look into memmap as Stefan suggested. If Robert Kern's out there, do you have any comments about what I might be doing wrong.

I think you want 'rb+'. 'ab+' puts you at the end of the file for writing (because you asked to append). I believe the '+' in 'wb+' is simply ignored, so you are getting the truncating behavior of 'wb'. I've only tested 'rb+' with file.write(), but ultimately, file.write() and scipy.io.fwrite() both use C's fwrite(3) down at the bottom.

In [40]: f = open('foo.dat', 'wb')

In [41]: f.write('Foo!' * 4)

In [42]: f.close()

In [43]: open('foo.dat', 'rb').read()
Out[43]: 'Foo!Foo!Foo!Foo!'

In [44]: f = open('foo.dat', 'rb+')

In [45]: f.tell()
Out[45]: 0L

In [46]: f.seek(4)

In [47]: f.tell()
Out[47]: 4L

In [48]: f.write('Bar!')

In [49]: f.tell()
Out[49]: 8L

In [50]: f.close()

In [51]: open('foo.dat', 'rb').read()
Out[51]: 'Foo!Bar!Foo!Foo!'

But yeah, look at memmap arrays. Much nicer.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
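The same 'rb+' approach can be sketched for numpy arrays. This is only an illustration under stated assumptions: it uses plain file.write() with ndarray.tobytes() in place of scipy.io.numpyio.fwrite, and the record length, file name and record_id are made up for the example. The offset arithmetic mirrors the (self.id-1)*self.yarray.size*4 expression from the question.

```python
import os
import tempfile

import numpy as np

record_len = 4                              # hypothetical number of float32 values per array
itemsize = np.dtype(np.float32).itemsize    # 4 bytes, matching the "*4" in the question
path = os.path.join(tempfile.mkdtemp(), "vectors.dat")

# Create a file holding three records: all 0s, all 1s, all 2s.
with open(path, "wb") as f:
    for value in range(3):
        f.write(np.full(record_len, value, dtype=np.float32).tobytes())

# Update record number 2 (1-based, like self.id) in place: 'rb+' does not truncate.
record_id = 2
with open(path, "rb+") as f:
    f.seek((record_id - 1) * record_len * itemsize)
    f.write(np.full(record_len, 9, dtype=np.float32).tobytes())

# Append a fourth record: 'ab' always writes at the end of the file.
with open(path, "ab") as f:
    f.write(np.full(record_len, 3, dtype=np.float32).tobytes())

data = np.fromfile(path, dtype=np.float32).reshape(-1, record_len)
print(data)
```

Using 'rb+' for in-place updates and 'ab' for appends keeps the two operations separate, which avoids relying on seek() in append mode, where writes go to the end of the file regardless of the current position.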
Robert

Thanks for the help.

'rb+' seems to work. 'ab+' or 'ab' is working ok. 'wb' or 'wb+' doesn't work, i.e. everything preceding the position I'm writing to in the file becomes zero.

I re-read (slowly this time) the online documentation. In fopen the permissions are the same as for the built-in open, so I looked at that and yes, 'w+' will truncate. However, to me, truncation means truncation "after", but it evidently means truncation "before" as well, which just wasn't what I was expecting (probably due to a Fortran background many years ago).

It also seems strange to me to open a file with 'r+' when I want to write to it. But there you go, it seems to be working.

I will also look at memmap. As my data files get bigger with larger datasets, it may well be very useful.

Brennan
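On the zeros seen with 'wb' and 'wb+': a likely explanation is that the file is truncated to zero length as soon as it is opened, and a later seek() past the new end of file followed by a write leaves a gap that reads back as zero bytes. The sketch below illustrates this with the foo.dat example from the thread; the temporary path and variable names are assumptions for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.dat")
with open(path, "wb") as f:
    f.write(b"Foo!" * 4)

# 'wb+' allows reading as well as writing, but it still truncates on open.
with open(path, "wb+") as f:
    size_after_open = os.path.getsize(path)   # the old 16 bytes are already gone
    f.seek(8)                                 # position past the new end of file
    f.write(b"Bar!")

with open(path, "rb") as f:
    contents = f.read()
print(size_after_open, contents)
```

So the truncation is not really "before" the write position: the whole file is emptied on open, and the bytes ahead of the write position are filled in as zeros afterwards.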
participants (3)
- Brennan Williams
- Robert Kern
- Stéfan van der Walt