Modular toolkit for Data Processing (MDP) released
We are pleased to announce the first public release of the MDP library for Python (http://mdp-toolkit.sourceforge.net). This package was developed in the context of computational neuroscience research, but it should fit the needs of a larger audience of scientists and developers.

Modular toolkit for Data Processing (MDP) is a Python library for implementing data processing elements (nodes) and combining them into data processing sequences (flows).

A node corresponds to a learning algorithm or to a generic data processing unit. Each node can have a training phase, during which the internal structures are learned from training data (e.g. the weights of a neural network are adapted or the covariance matrix is estimated), and an execution phase, in which new data can be processed forwards (by passing the data through the node) or backwards (by applying the inverse of the transformation computed by the node, if defined). MDP is designed to make the implementation of new algorithms easy and intuitive, for example by automatically setting the input and output dimensions and by casting the data to match the typecode (e.g. float or double) of the internal structures. The nodes are designed to be applied to arbitrarily long sets of data: the internal structures can be updated incrementally by sending chunks of the input data (this is equivalent to online learning if the chunks consist of single observations, or to batch learning if the whole data set is sent in a single chunk). Already implemented nodes include Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Slow Feature Analysis (SFA).

A flow consists of an acyclic graph of nodes (currently only node sequences are implemented). The data is sent to an input node and is successively processed by the following nodes on the graph. The general flow implementation automates the training, execution and inverse execution (if defined) of the whole graph.
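
The node and flow concepts described above can be illustrated with a small, self-contained sketch in plain Python. This is not MDP's own code or API: the class names (MeanCenterNode, ScaleNode, ToyFlow) and the method names (train, stop_training, execute, inverse) are hypothetical and chosen only for illustration. The sketch shows chunk-wise training, forward and inverse execution, and a sequence of nodes trained one after another.

```python
# Illustrative sketch only: these classes are hypothetical and are NOT MDP's API.

class MeanCenterNode:
    """Toy node: learns the per-column mean chunk by chunk, then subtracts it."""

    def __init__(self):
        self.count = 0
        self.sums = None   # allocated on first chunk, so the input dimension is inferred
        self.mean = None

    def train(self, chunk):
        for row in chunk:
            if self.sums is None:
                self.sums = [0.0] * len(row)
            for i, value in enumerate(row):
                self.sums[i] += value
            self.count += 1

    def stop_training(self):
        self.mean = [s / self.count for s in self.sums]

    def execute(self, data):
        return [[v - m for v, m in zip(row, self.mean)] for row in data]

    def inverse(self, data):
        return [[v + m for v, m in zip(row, self.mean)] for row in data]


class ScaleNode:
    """Toy node: learns the per-column maximum absolute value and divides by it."""

    def __init__(self):
        self.scale = None

    def train(self, chunk):
        for row in chunk:
            if self.scale is None:
                self.scale = [0.0] * len(row)
            self.scale = [max(s, abs(v)) for s, v in zip(self.scale, row)]

    def stop_training(self):
        # Avoid division by zero for columns that were constant zero.
        self.scale = [s if s != 0.0 else 1.0 for s in self.scale]

    def execute(self, data):
        return [[v / s for v, s in zip(row, self.scale)] for row in data]

    def inverse(self, data):
        return [[v * s for v, s in zip(row, self.scale)] for row in data]


class ToyFlow:
    """Toy flow: a plain sequence of nodes, trained one after another."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def train(self, chunks):
        # `chunks` must be re-iterable (e.g. a list), since every node sees all chunks.
        for k, node in enumerate(self.nodes):
            for chunk in chunks:
                # Pass the chunk through the already trained nodes first.
                for previous in self.nodes[:k]:
                    chunk = previous.execute(chunk)
                node.train(chunk)
            node.stop_training()

    def execute(self, data):
        for node in self.nodes:
            data = node.execute(data)
        return data

    def inverse(self, data):
        for node in reversed(self.nodes):
            data = node.inverse(data)
        return data


data = [[1.0, 2.0], [3.0, 6.0]]

# One observation per chunk corresponds to online learning.
flow = ToyFlow([MeanCenterNode(), ScaleNode()])
flow.train([[row] for row in data])

out = flow.execute(data)
restored = flow.inverse(out)
print(out)
print(restored)
```

Sending the whole data set as a single chunk, flow.train([data]), yields the same trained nodes in this sketch, which mirrors the batch versus online equivalence mentioned above.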
A subclass of the basic flow class allows user-supplied checkpoint functions to be executed at the end of each phase, for example to save the internal structures of a node for later analysis.

Best regards,
Pietro Berkes and Tiziano Zito

----------------------------------------
{p.berkes, t.zito}@biologie.hu-berlin.de
Institute for Theoretical Biology
Humboldt University
Invalidenstrasse 43
D-10115 Berlin, Germany
----------------------------------------