[Tutor] text processing lines variable content

ingo janssen ingoogni at gmail.com
Wed Feb 6 11:33:50 EST 2019


For parsing the output of the Voro++ program and writing the data to a 
POV-Ray include file I created a bunch of functions.

def pop_left_slice(inputlist, length):
   outputlist = inputlist[0:length]
   del inputlist[:length]
   return outputlist

This is used by every function to chop off the required part of the input 
line.
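
To make the in-place behaviour explicit, here is a small usage sketch. The 
sample line is made up for illustration and is not real Voro++ output.

```python
def pop_left_slice(inputlist, length):
    outputlist = inputlist[0:length]
    del inputlist[:length]
    return outputlist

# Hypothetical sample line, not real Voro++ output.
fields = "1 0.5 0.25 0.75 2.0".split(" ")

label = pop_left_slice(fields, 1)  # takes the first item
point = pop_left_slice(fields, 3)  # takes the next three items
# 'fields' has been shortened in place and now holds only the remainder.
```

Each call returns the requested slice and removes it from the front of the 
list, so the next function always starts at the right position.
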
Two examples of the functions that process a chopped-off slice of the line 
and append the data to the appropriate list.

def f_vector(outlist):
   x,y,z = pop_left_slice(line,3)
   outlist.append(f"<{x},{y},{z}>,")

def f_vector_array(outlist, length):
   rv = pop_left_slice(line, length)
   rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
   rv = ",".join(rv)
   outlist.append(f"  //label: {lbl}\n  array[{length}]"+"{\n  "+rv+"\n  }\n")

Every line can contain up to 21 data chunks. Within one file each line 
contains the same number of chunks, but it varies between files. The 
types of chunks and their positions vary as well. I know beforehand how a 
line in a file is constructed. I'd like to adapt the order in which the 
functions are applied, but how?

for i, line in enumerate(open("vorodat.vol",'r')):
   points = i+1
   line = line.strip()
   line = line.split(" ")
   lbl = f_label(label)
   f_vector(point)
   f_value(radius)
   v=f_number(num_vertex)
   f_vector_array(rel_vertex,v)
   f_vector_array(glob_vertex,v)
   f_value_array(vertex_orders,v)
   f_value(max_radius)
   e=f_number(num_edge)
   f_value(edge_dist)
   ...etc

I thought about putting the functions in a dict and then creating a list 
with the proper order, but I can't get it to work.
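
A minimal sketch of that dict-plus-list idea, under some assumptions: the 
handlers take the token list explicitly instead of reading a global line, 
and a small state dict (hypothetical, as are the names HANDLERS, LAYOUT and 
parse_line) carries the last count so that an array handler knows its 
length. The sample line is made up.

```python
from collections import defaultdict

def pop_left_slice(tokens, length):
    out = tokens[:length]
    del tokens[:length]
    return out

def f_value(tokens, state, name):
    state["out"][name].append(pop_left_slice(tokens, 1)[0])

def f_number(tokens, state, name):
    # Remember the count so a later array handler knows how much to take.
    state["count"] = int(pop_left_slice(tokens, 1)[0])
    state["out"][name].append(str(state["count"]))

def f_vector(tokens, state, name):
    x, y, z = pop_left_slice(tokens, 3)
    state["out"][name].append(f"<{x},{y},{z}>,")

def f_vector_array(tokens, state, name):
    chunk = pop_left_slice(tokens, state["count"])
    # Each item looks like '(1.234,2.345,3.456)'; strip the parentheses.
    state["out"][name].append(",".join(f"<{v[1:-1]}>" for v in chunk))

# The dict maps a chunk type to the function that handles it.
HANDLERS = {
    "value": f_value,
    "number": f_number,
    "vector": f_vector,
    "vector_array": f_vector_array,
}

# The list describes one file's line layout: (chunk type, output list name).
LAYOUT = [
    ("value", "label"),
    ("vector", "point"),
    ("value", "radius"),
    ("number", "num_vertex"),
    ("vector_array", "rel_vertex"),
]

def parse_line(line, layout, state):
    tokens = line.strip().split(" ")
    for kind, name in layout:
        HANDLERS[kind](tokens, state, name)

state = {"count": 0, "out": defaultdict(list)}
# Hypothetical sample line, not real Voro++ output.
parse_line("7 0.1 0.2 0.3 1.5 2 (0,0,1) (1,0,0)", LAYOUT, state)
```

For a file with a different layout only the LAYOUT list changes; the 
functions and the loop stay the same.
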

A second question: all this works for small files with hundreds of 
lines, but some have 100000. Then I can end up with at most 22 lists of 
100000 items each. Not fun. I tried writing the data to a file "out of 
sequence", not fun either. What would be the way to do this?
I thought about writing each data chunk to its own temporary file 
instead of putting it in a list first. This would require at most 22 temp 
files and then a merge of the files into one.
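
The temp-file idea could be sketched roughly as below. This is only an 
illustration under assumptions: write_sections, the row dicts and the 
"// name" header written before each section are made up here, and the 
real POV-Ray output would need its own framing.

```python
import io
import shutil
import tempfile

def write_sections(rows, names, out):
    """Stream each chunk type to its own temp file, then merge in order.

    rows:  iterable of dicts mapping section name -> formatted string
    names: section names in the order they should appear in the output
    out:   writable text file object for the merged result
    """
    temps = {name: tempfile.TemporaryFile(mode="w+") for name in names}
    try:
        # Pass 1: append every chunk to the temp file of its section.
        for row in rows:
            for name in names:
                temps[name].write(row[name] + "\n")
        # Pass 2: copy the temp files into the output, one after another.
        for name in names:
            out.write(f"// {name}\n")  # illustrative header, an assumption
            temps[name].seek(0)
            shutil.copyfileobj(temps[name], out)
    finally:
        for tmp in temps.values():
            tmp.close()

# Hypothetical data standing in for parsed lines.
rows = [
    {"point": "<0,0,0>,", "radius": "1.0,"},
    {"point": "<1,1,1>,", "radius": "2.0,"},
]
result = io.StringIO()
write_sections(rows, ["point", "radius"], result)
```

Only one input line is held in memory at a time, and the temporary files 
are removed automatically when they are closed.
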

TIA,

ingo

