[Numpy-discussion] record data previous to Numpy use

paul.carrico at free.fr paul.carrico at free.fr
Wed Jul 5 08:41:49 EDT 2017


Dear all 

I'm sorry if my question is too basic (it is not strictly about Numpy,
although the goal is to build matrices and work with Numpy afterward), but
I'm spending a lot of time and effort trying to find a way to read data from
an ASCII file and store it in a matrix/array, so far without success!

The only way I found is to use the _'append()'_ method, which involves
dynamic memory allocation. :-(

From my current experience with Scilab (a Matlab-like scientific
solver), the well-known approach is:

 	* Step 1 : matrix initialization like _'np.zeros((n,n))'_
 	* Step 2 : read the data
 	* Step 3 : write it into the matrix
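
In Python, these three steps might look like the following minimal sketch. It assumes one float per line and a number of lines known in advance; the sample data and the name `fill_preallocated` are made up for illustration.

```python
import io
import numpy as np

def fill_preallocated(lines, n):
    """Store up to n float values, one per line, in a preallocated array."""
    data = np.zeros(n, dtype=float)      # step 1: allocate once
    for i, line in enumerate(lines):     # step 2: read line by line
        if i >= n:
            break
        data[i] = float(line)            # step 3: write into the array
    return data

# Hypothetical sample standing in for a real ASCII file
sample = io.StringIO("1.0\n2.5\n-3.0\n")
values = fill_preallocated(sample, 3)
print(values)
```

An open file object can be passed in place of the `io.StringIO` sample, since both yield one line per iteration.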

I'm obviously influenced by my current experience, but I'm interested in
moving to Python and its packages.

For huge ASCII files (tens of millions of lines), my strategy
is to work by 'blocks':

 	* Find the line index of the beginning and the end of each block (this
implies that the file is read once)
 	* Read the block
 	* (repeat the process for the other blocks)
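
A minimal sketch of this block strategy, assuming each block starts with a marker line and then holds one float per line. The marker text `"BLOCK"`, the sample data and the two helper names are hypothetical.

```python
import io
import itertools
import numpy as np

def find_block_starts(lines, marker):
    """First pass: index of every marker line, plus the total line count."""
    starts = []
    n_lines = 0
    for idx, line in enumerate(lines):
        if marker in line:
            starts.append(idx)
        n_lines = idx + 1
    return starts, n_lines

def read_block(lines, start, stop):
    """Second pass: convert lines start..stop-1 into a float array."""
    block = itertools.islice(lines, start, stop)
    return np.fromiter((float(line) for line in block),
                       dtype=float, count=stop - start)

# Hypothetical sample standing in for a real ASCII file
text = "BLOCK\n1.0\n2.0\nBLOCK\n3.0\n4.0\n5.0\n"

starts, n_lines = find_block_starts(io.StringIO(text), "BLOCK")
bounds = starts + [n_lines]

# The data is re-opened for each block; the values start one line after the marker
blocks = [read_block(io.StringIO(text), s + 1, e)
          for s, e in zip(bounds[:-1], bounds[1:])]
print(blocks)
```

Note that `itertools.islice` still has to read through the skipped lines, so each block read scans the file from its beginning.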

I tried different codes such as the one below, but each time Python tells
me that I cannot mix iteration and read methods:

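#############################################

import itertools
import numpy as np

## PATH, file_name, _my_criteria_ and size_block are defined elsewhere

position = []; j = 0

## first pass: store the line index of each block marker
with open(PATH + file_name, "r") as rough_data:
    for line in rough_data:
        if _my_criteria_ in line:
            position.append(j)  ## huge blocks but limited in number
        j = j + 1

i = 0
blockdata = np.zeros(size_block, dtype=float)

## second pass: read one block into the preallocated array
with open(PATH + file_name, "r") as f:
    for line in itertools.islice(f, 1, size_block):
        blockdata[i] = float(f.readline())  ## mixes iteration over f with f.readline()
        i = i + 1

#########################################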

Should I work on lists using f.readlines() (but this implies loading the
whole file into memory)?
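
One possible alternative to f.readlines() is to let `np.loadtxt` read only a range of lines through its `skiprows` and `max_rows` arguments (`max_rows` needs NumPy 1.16 or later). The sketch below assumes one float per line; the sample data is made up.

```python
import io
import numpy as np

# Hypothetical sample: one header line followed by one float per line
sample = io.StringIO("header\n1.0\n2.0\n3.0\n4.0\n")

# Skip the header and the first value, then read only the next two values
block = np.loadtxt(sample, skiprows=2, max_rows=2)
print(block)
```

A file name can be passed instead of the `io.StringIO` object.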

Additional question: can I fill the array in a vectorized way, with 'i
= np.arange(0,65406)', if I stay with the previous example?

Thanks for your time and understanding.

(I'm obviously interested in doc references covering those
specific tasks)

Paul 

PS: for Chuck: I'll have a look at the pandas package, but later, in a code
optimization step :-) (nearly 2000 doc pages)