[Tutor] text processing lines variable content
Mark Lawrence
breamoreboy at gmail.com
Wed Feb 6 15:45:27 EST 2019
On 06/02/2019 18:51, ingo janssen wrote:
>
> On 06/02/2019 19:07, Mark Lawrence wrote:
>
>> That's going to be a lot of work slicing and dicing the input lists.
>> Perhaps a chunked recipe like this
>> https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked
>> would be better.
>
> The length of the text chunks varies from a single character to a list
> of ~30 3D vectors.
So what, you still don't need to chop the front from the list, just
process the data.
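
A minimal sketch of what "just process the data" could look like, using only the standard library. The field layout (a count, then that many vectors, then one value) and the helper name take are assumptions made up for illustration, not your real format. An iterator remembers its position, so fields of any length can be consumed without slicing the front off the list:

```python
from itertools import islice


def take(it, n):
    """Return the next n items from the iterator as a list (hypothetical helper)."""
    return list(islice(it, n))


# Made-up split line: a count, then that many vectors, then a single value.
fields = ["3", "(1,2,3)", "(4,5,6)", "(7,8,9)", "0.5"]
it = iter(fields)

count = int(next(it))      # single-item field
vectors = take(it, count)  # variable-length field, length taken from the data
radius = float(next(it))   # the next single-item field

# The original list is untouched; only the iterator has advanced.
```

The same iterator can be handed from one function to the next, each taking as many items as it needs.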
>
>>> I'd like to adapt the order in which the functions are applied, but how?
>>
>> I suspect that you're trying to overcomplicate things. What's wrong
>> with a simple if/elif chain, a switch based on a dict or similar?
>>
>
> You mean create a list with the order=[a,b,e,d...]
> if a in order:
>     f_vector_array(a, 3)
> elif b in order:
>     f_value(max_radius)
>
> that would run the proper function, but not in the right order?
Again I've no idea what you're saying here.
>
>>>
>>> for i, line in enumerate(open("vorodat.vol",'r')):
>>>     points = i+1
>>
>> enumerate takes a start argument so you shouldn't need the above line.
>
> points is needed later on in the program and I don't know beforehand how
> many lines I have.
Now you tell us :-(
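
Even so, the loop variable is still bound after the loop ends, so with a start argument it already holds the line count. A small sketch, with io.StringIO and three made-up lines standing in for open("vorodat.vol"):

```python
import io

# Stand-in for open("vorodat.vol"); the content is invented sample data.
sample = io.StringIO("line one\nline two\nline three\n")

points = 0  # keeps this value if the file turns out to be empty
for points, line in enumerate(sample, start=1):
    pass  # process the line here

# After the loop, points is the number of lines read.
```

Initialising points before the loop matters only for the empty-file case, where the loop body never runs.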
>
>>> I thought about putting the functions in a dict and then create a
>>> list with the proper order, but can't get it to work.
>>
>> Please show us your code and exactly why it didn't work.
>>
>
> def f_vector_array(outlist, length):
>     rv = pop_left_slice(line, length)
>     rv = [f'<{i[1:-1]}>' for i in rv]  # i format is: '(1.234,2.345,3.456)'
>     rv = ",".join(rv)
>     outlist.append(f" //label: {lbl}\n array[{length}]"+"{\n "+rv+"\n }\n")
>
> functions={
>     'a':f_number(num_vertex),
>     'b':f_vector_array(rel_vertex,v)
> }
> where rel_vertex is the list to move the processed data to and v is
> the amount of text to chop off the front of the line. v is not known when
> defining the dictionary. v comes from another function,
> v=f_number(num_vertex), which should also live in the dict.
You don't need to specify the parameters in the dict, just give the
function name.
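
Roughly like this. It is a sketch only: the two functions are simplified stand-ins for yours with a signature I made up (an iterator over the split line plus a shared state dict), and process is a hypothetical driver. Writing f_number(num_vertex) inside the dict calls the function once while the dict is being built and stores the result; writing just f_number stores the function so it can be called later, when v is known:

```python
def f_number(it, state):
    """Consume one field, remember it as the count, and return it."""
    state["count"] = int(next(it))
    return state["count"]


def f_vector_array(it, state):
    """Consume state['count'] fields, turning '(x,y,z)' into '<x,y,z>'."""
    return [f"<{next(it)[1:-1]}>" for _ in range(state["count"])]


# Function objects only: no parentheses, so nothing runs at this point.
functions = {"a": f_number, "b": f_vector_array}


def process(fields, order):
    """Apply the functions named in order, left to right, to one split line."""
    it = iter(fields)
    state = {}
    return {key: functions[key](it, state) for key in order}


result = process(["2", "(1.0,2.0,3.0)", "(4.0,5.0,6.0)"], order=["a", "b"])
```

Giving every function the same signature is a design choice for this sketch; it lets the order list alone decide which functions run and when.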
>
> then loop order=[a,b,e,d...] for each line
>
What has a loop order got to do with using a dict?
>>
>> I'm not absolutely sure what you're saying here, but would something
>> like the SortedList from
>> http://www.grantjenks.com/docs/sortedcontainers/ help?
>
> Maybe this explains it better, assume the split input lines:
> line1=[a,b,c,d,e,f,...]
> line2=[a,b,c,d,e,f,...]
> line3=[a,b,c,d,e,f,...]
> ...
> line100000=...
>
> all data on position a should go to list a
>
> a=[a1,a2,a3,...a_n]
> b=[b1,b2,b3,...b_n]
> c=[c1,c2,c3,...c_n]
> etc.
>
> this is what for example the function f_vector_array(a, 3) does.
Why bother, just have a list of lists and index on the position, or are
we talking at cross purposes?
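
Something like the following, assuming every split line has the same number of positions (the three short lines are made-up data):

```python
# Made-up split input lines.
lines = [
    ["a1", "b1", "c1"],
    ["a2", "b2", "c2"],
    ["a3", "b3", "c3"],
]

# One inner list per position, instead of separately named lists a, b, c.
columns = [[] for _ in range(3)]
for fields in lines:
    for pos, value in enumerate(fields):
        columns[pos].append(value)

# columns[0] collects everything from position a, columns[1] from b, etc.
```

With equal-length lines, list(zip(*lines)) gives the same grouping as tuples, but it needs all lines in memory at once.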
>
> All these lists have to be written to a single file, each list contains
> 100000 items. Instead of keeping it all in memory I could write a1 to a
> temp file A instead of putting it in a list first and b1 to a temp file
> B etc. In the next loop, a2 goes to file A, b2 to file B etc. When all lines
> are processed combine the files A,B,C ... to a single file. Or is there
> a more practical way? Speed is not important.
What is your definition of "combine the files A,B,C ... to a single file"?
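
If it simply means "all of A, followed by all of B, and so on", which is purely my assumption, then a sketch with the standard tempfile and shutil modules might look like this. The rows, the two-column layout and the name combined.txt are invented for the example:

```python
import os
import shutil
import tempfile

rows = [["a1", "b1"], ["a2", "b2"], ["a3", "b3"]]  # made-up split lines

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "combined.txt")  # hypothetical output name
    # One temporary file per position (A, B, ...).
    temps = [tempfile.TemporaryFile("w+", dir=tmp) for _ in range(2)]
    try:
        # Write each value to the temp file for its position.
        for fields in rows:
            for f, value in zip(temps, fields):
                f.write(value + "\n")
        # Concatenate the temp files, in order, into the output file.
        with open(out_path, "w") as out:
            for f in temps:
                f.seek(0)
                shutil.copyfileobj(f, out)
    finally:
        for f in temps:
            f.close()
    # Read the result back only to show what was written.
    with open(out_path) as result:
        combined = result.read()
```

If the output needs headers or wrappers around each block, they can be written to out between the copyfileobj calls.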
>
> ingo
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence