[Numpy-discussion] data transit

Renato Serodio renato.serodio at gmail.com
Fri Dec 7 10:44:20 EST 2007


Hello all,

I'm developing a custom computational application which I chose to
base on NumPy. Already quite in love with Python, and with proprietary
things making me increasingly sick (through forced exposure to stupid
errors I can't correct), NumPy was the way to go.

Now, it is at times like this that I regret not going for graduate
studies in computation - I'm a bit locked in the old paradigms that my
[Fortran] generation learned.

Since my application is only vaguely required to be 'generic', I had
to dive into the wonderful world of computer science - a previous post
in this group led to some very interesting solutions for the
application, which, while doing nothing, is capable of doing
everything :)

A bit of context: the application is supposed to process telemetry,
outputting charts, alarms, etc. Raw data is obtained through
plugin-like objects, which provide a uniform interface to distinct
sources. The processing routines are objects as well, but operate on
data as if they were functions (sort of sin(x)). This way, I don't
need to define anything other than the interfaces - the core remains
flexible.
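
To make the setup above concrete, here is a minimal sketch. All names
(ConstantSource, Scale, Threshold, run) are made up for illustration and
are not taken from the actual application; the only assumption is that
every source offers a read() method returning an array, and that every
processing object can be called like a function.

```python
import numpy as np


class ConstantSource:
    """Hypothetical source plugin: every source exposes read() -> ndarray."""

    def __init__(self, values):
        self._values = np.asarray(values, dtype=float)

    def read(self):
        # Return a copy so processors cannot alter the source's own data.
        return self._values.copy()


class Scale:
    """Hypothetical processor: a callable object, used like sin(x)."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, data):
        return data * self.factor


class Threshold:
    """Hypothetical processor that flags samples above a limit (an 'alarm')."""

    def __init__(self, limit):
        self.limit = limit

    def __call__(self, data):
        return data > self.limit


def run(source, processors):
    """Core loop: it only knows the two interfaces, not the concrete classes."""
    data = source.read()
    for process in processors:
        data = process(data)
    return data


alarms = run(ConstantSource([1.0, 2.0, 3.0]), [Scale(2.0), Threshold(3.0)])
```

Because the core only relies on read() and on calling the processors, new
sources and routines can be added without touching run().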

I came to a problem, though, while trying to define some structure for
data transit. At first I imagined I could keep both raw data and
results inside the same object; unfortunately, if I want to use these
results in a second stage, my flexibility is rather impaired.

Then I thought about getting raw data into an object, passing that to
the processing core, and finally storing its output in another object.
While this has the advantage of clearing raw data out of memory as
soon as I finish chewing it, I seem to lose the relation between raw
and result data sets - which I have to maintain somewhere else.
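
One possible way to keep that relation without keeping the raw data is to
let each result carry a small record of where it came from. The sketch
below is only an illustration under that assumption; DataSet, origin,
steps and apply_step are hypothetical names, not part of any library.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class DataSet:
    """Hypothetical container: values plus a record of their history."""

    values: np.ndarray
    origin: str          # label of the source the raw data came from
    steps: tuple = ()    # names of the processing steps applied so far


def apply_step(step, dataset):
    """Run one processing step and extend the history of the result."""
    return DataSet(
        values=step(dataset.values),
        origin=dataset.origin,
        steps=dataset.steps + (step.__name__,),
    )


raw = DataSet(np.array([0.0, 1.0, 4.0]), origin="sensor-A")
result = apply_step(np.sqrt, raw)
del raw  # the raw values can be freed; the result still knows its origin
```

A second stage can take result as its input in exactly the same way, so
the history grows by one entry per step.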

Yet another issue crops up, in relation to very large data sets. If
there's not enough memory to cope with the data set, one either relies
on swapping or changes the algorithm - and in this case having
'intelligent' data objects allows good, textbook encapsulation.
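
As a rough sketch of such a data object: the class below hands out its
data in fixed-size chunks, so an algorithm written against chunks() does
not care whether the array lives in memory or in a file mapped with
numpy.memmap. ChunkedData and chunked_mean are hypothetical names, and
the small temporary file merely stands in for a large telemetry file.

```python
import os
import tempfile

import numpy as np


class ChunkedData:
    """Hypothetical data object that yields its data in fixed-size chunks."""

    def __init__(self, array, chunk_size):
        self._array = array
        self._chunk_size = chunk_size

    def chunks(self):
        for start in range(0, len(self._array), self._chunk_size):
            yield self._array[start:start + self._chunk_size]


def chunked_mean(data):
    """Mean computed one chunk at a time, never holding all data at once."""
    total, count = 0.0, 0
    for chunk in data.chunks():
        total += float(chunk.sum())
        count += len(chunk)
    return total / count


# Stand-in for a large file: ten float64 samples written to disk.
path = os.path.join(tempfile.mkdtemp(), "telemetry.dat")
np.arange(10, dtype=np.float64).tofile(path)

# memmap reads pages from disk on demand instead of loading the whole file.
on_disk = np.memmap(path, dtype=np.float64, mode="r")
mean_value = chunked_mean(ChunkedData(on_disk, chunk_size=4))
```

The same chunked_mean works unchanged on an ordinary in-memory array,
which is the encapsulation meant above.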

My question is thus: does anyone have experience with, or could point to
literature/code where, related problems are addressed? I understand
that my application may be suffering from excessive 'generality', but
certainly this problem has surfaced elsewhere.

Looking forward to your answers,

Renato


