[Numpy-discussion] record data previous to Numpy use
robbmcleod at gmail.com
Wed Jul 5 20:41:00 EDT 2017
While I'm going to bet that the fastest way to build an ndarray from ascii
is with an `io.BytesIO` stream, NumPy does have a function to load from text,
`numpy.loadtxt`, which works well enough for most purposes.
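For example, a minimal sketch (using an in-memory `StringIO` in place of a real file on disk):

```python
import io
import numpy as np

# Stand-in for an ascii file; loadtxt accepts any file-like object
text = io.StringIO("1.0 2.0 3.0\n4.0 5.0 6.0\n")
a = np.loadtxt(text)
print(a.shape)  # (2, 3)
```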
It's hard to tell from the original post whether the ascii is being continuously
generated or not. If it's being produced in an ongoing fashion, then a
stream object is definitely the way to go, as the array chunks can be
produced by `numpy.frombuffer()`.
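A rough sketch of the stream approach, assuming the producer writes raw little-endian float64 bytes (note `frombuffer` parses binary data, so ascii text would first need to be encoded or parsed):

```python
import io
import numpy as np

stream = io.BytesIO()
stream.write(np.arange(6, dtype=np.float64).tobytes())  # simulate a producer
stream.seek(0)

chunk = stream.read(3 * 8)               # read 3 float64 values (8 bytes each)
arr = np.frombuffer(chunk, dtype=np.float64)
print(arr)  # [0. 1. 2.]
```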
On Wed, Jul 5, 2017 at 3:21 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Wed, Jul 5, 2017 at 5:41 AM, <paul.carrico at free.fr> wrote:
> > Dear all
> > I’m sorry if my question is too basic (not fully in relation to NumPy,
> though it is about building matrices to work with NumPy afterward), but I’m
> spending a lot of time and effort to find a way to read data from an ascii
> file and reassign it into a matrix/array … unsuccessfully!
> > The only way I found is to use the ‘append()’ instruction, which involves
> dynamic memory allocation. :-(
> Are you talking about appending to Python list objects? Or the np.append()
> function on numpy arrays?
> In my experience, it is usually fine to build a list with the `.append()`
> method while reading the file of unknown size and then converting it to an
> array afterwards, even for dozens of millions of lines. The list object is
> quite smart about reallocating memory so it is not that expensive. You
> should generally avoid the np.append() function, though; it is not smart.
> > From my current experience under Scilab (a Matlab-like scientific
> solver), it is well known:
> > Step 1 : matrix initialization like ‘np.zeros((n, n))’
> > Step 2 : record the data
> > and write it in the matrix (step 3)
> > I’m obviously influenced by my current experience, but I’m interested in
> moving to Python and its packages
> > For huge ascii files (involving dozens of millions of lines), my strategy
> is to work by ‘blocks’ as:
> > Find the line index of the beginning and the end of one block (this
> implies that the file is read once)
> > Read the block
> > (process repeated on the different other blocks)
> Are the blocks intrinsic parts of the file? Or are you just trying to
> break up the file into fixed-size chunks?
> > I tried different codes such as below, but each time Python tells
> me I cannot mix iteration and the read method
> > #############################################
> > position = []; j = 0
> > with open(PATH + file_name, "r") as rough_data:
> >     for line in rough_data:
> >         if my_criteria in line:
> >             position.append(j)  ## huge blocks but limited in
> >         j = j + 1
> > i = 0
> > blockdata = np.zeros(size_block, dtype=np.float)
> > with open(PATH + file_name, "r") as f:
> >     for line in itertools.islice(f, 1, size_block):
> >         blockdata[i] = float(f.readline())
> For what it's worth, this is the line that is causing the error that you
> describe. When you iterate over the file with the `for line in
> itertools.islice(f, ...):` loop, you already have the line text. You don't
> (and can't) call `f.readline()` to get it again. It would mess up the
> iteration if you did and cause you to skip lines.
> By the way, it is useful to help us help you if you copy-paste the exact
> code that you are running as well as the full traceback instead of
> paraphrasing the error message.
> Robert Kern
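For what it's worth, a corrected sketch of the quoted loop, combining the suggestions above: collect indices with a plain Python list, and use the loop variable itself instead of calling `f.readline()` again (the file contents and block size here are placeholders):

```python
import io
import itertools
import numpy as np

size_block = 4  # placeholder block size from the quoted code

# Simulate the ascii file with an in-memory buffer: one header line, then values
f = io.StringIO("header\n1.0\n2.0\n3.0\n4.0\n")

blockdata = np.zeros(size_block, dtype=np.float64)
for i, line in enumerate(itertools.islice(f, 1, 1 + size_block)):
    blockdata[i] = float(line)  # use `line` directly; no second readline()
print(blockdata)  # [1. 2. 3. 4.]
```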
Robert McLeod, Ph.D.
robert.mcleod at unibas.ch
robert.mcleod at bsse.ethz.ch
robbmcleod at gmail.com