[Numpy-discussion] "fromstring()" in place?
Chris Barker
cbarker at jps.net
Thu Nov 2 16:46:05 EST 2000
I have a large array of type "1" (single bytes). I need to convert it to
Int32, in the manner that fromstring() would. Right now, I am doing:
Array = fromstring(Array.tostring(),Int32)
This works fine, but I need to do this on potentially HUGE arrays. If I
understand this right, tostring() creates one copy and fromstring()
creates another, which is then bound to Array. At that point the original
is de-referenced and should be deleted, and the temporary string gets
deleted at some point in the process. I don't know when objects created
in the middle of a statement get deleted, so I could have three copies of
the data around at the same time, and at least two.
Since the underlying C array is exactly the same, I'd like to be able to
do this without making any copies at all. It seems like it should be a
simple matter of changing the typecode and shape. Is this possible?
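For comparison, here is a minimal sketch of the copy-free reinterpretation being asked about. It assumes present-day NumPy rather than the Numeric package this post was written against: `ndarray.view`, `np.frombuffer`, `tobytes` and `np.shares_memory` are modern NumPy calls, and the array `raw` is only a small stand-in for the real data.

```python
import numpy as np

# Stand-in for the large byte array; 16 bytes hold four 32-bit integers.
raw = np.arange(16, dtype=np.int8)

# Round trip through a bytes object: tobytes() copies the data.
copied = np.frombuffer(raw.tobytes(), dtype=np.int32)

# Reinterpret the same memory as 32-bit integers without copying.
viewed = raw.view(np.int32)

print(np.shares_memory(raw, copied))  # False
print(np.shares_memory(raw, viewed))  # True
print(viewed.shape)                   # (4,)
```

Because `viewed` shares its buffer with `raw`, a write through either name is visible through the other, which is the behaviour wanted here but worth keeping in mind.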
While I'm asking questions: can I byteswap in place as well?
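As a hedged illustration of an in-place byteswap, the sketch below again assumes present-day NumPy, where `ndarray.byteswap` accepts an `inplace` flag; whether the Numeric release of the time offered an equivalent is not something this example claims.

```python
import numpy as np

values = np.array([1, 256], dtype=np.int32)
address_before = values.ctypes.data  # address of the underlying buffer

# inplace=True reverses the bytes of each element in the existing buffer.
values.byteswap(inplace=True)

print(values.ctypes.data == address_before)  # True: no new buffer
print(values.tolist())                       # [16777216, 65536]
```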
The greater problem:
To give a little background, and to see if anyone has a better idea of
how to do what I am doing, I thought I'd run through the task that I
really need to do.
I am reading a binary file full of a lot of data. I have some control
over the form of the file, but it needs to be compact, so I can't just
make everything the same large type. The file is essentially a whole
bunch of records, each of which is a collection of a couple of different
types, and which I would eventually like to get into a couple of NumPy
arrays. My first cut at the problem was to read each record one at a
time in a loop, and use the struct module to convert everything. This
worked fine, but was pretty darn slow, so I am now doing it all with
NumPy like this (one example, I have more complex ones):
num_bytes = 9 # number of bytes in a record: two longs and a char
# read all the data into a single byte array
data = fromstring(file.read(num_bytes*num_timesteps*num_LEs),'1')
# rearrange into 3-d array
data.shape = (num_timesteps,num_LEs,num_bytes)
# extract LE data:
LEs = data[:,:,:8]
# extract flag data
flags = data[:,:,8]
# convert LE data to longs
LEs = fromstring(LEs.tostring(),Int32)
if endian == 'L': # byteswap if required
    LEs = LEs.byteswapped()
# convert to 3-d array
LEs.shape = (num_timesteps,num_LEs,2)
Anyone have any better ideas on how to do this?
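One possible alternative, sketched under the assumption of present-day NumPy (not Numeric), is to describe the 9-byte record as a structured dtype and let the library split the fields. The field names `le` and `flag` are hypothetical, the `>i4` code assumes the file stores big-endian 32-bit integers, and the sample bytes are built with `struct` in place of the real `file.read(...)` call.

```python
import struct
import numpy as np

num_timesteps, num_LEs = 2, 3  # tiny illustrative sizes

# Hypothetical field names; ">i4" assumes big-endian 32-bit integers.
# The dtype is packed (no padding), so each record is 9 bytes.
record = np.dtype([("le", ">i4", (2,)), ("flag", "u1")])

# Sample bytes standing in for file.read(...): two longs and a char per record.
payload = b"".join(
    struct.pack(">iiB", n, -n, n % 2) for n in range(num_timesteps * num_LEs)
)

data = np.frombuffer(payload, dtype=record).reshape(num_timesteps, num_LEs)

LEs = data["le"]      # shape (num_timesteps, num_LEs, 2), a view on the buffer
flags = data["flag"]  # shape (num_timesteps, num_LEs), also a view

print(record.itemsize)                       # 9
print(LEs.shape)                             # (2, 3, 2)
print(LEs[1, 2].tolist(), int(flags[1, 2]))  # [5, -5] 1
```

The byte order lives in the dtype, so no separate byteswap step is needed to read correct values; converting to native order with `LEs.astype(np.int32)` would make a copy.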
Thanks,
-Chris
--
Christopher Barker, Ph.D.
cbarker at jps.net
http://www.jps.net/cbarker
Water Resources Engineering
Coastal and Fluvial Hydrodynamics