[EuroPython] Trying to submit a talk, error

Steven Johnston sjj698 at gmail.com
Mon May 2 00:01:56 CEST 2005


Sorry to mail you, I guess you must get too many already. I am trying
to submit a talk for EuroPython (last minute :-) and I can't get the
website to work; I keep getting the same error (error at the end of
this email). Would it be possible to put forward my presentation
proposal to you in an email? Here it is.

Many thanks

Storage Resource Broker (SRB), Large scientific data and Python

Abstract (max 200)

The BioSimGrid (www.biosimgrid.org) project is used within the
biochemical community to manage large amounts of simulation data. Each
biochemical simulation produces trajectories which store the location
and velocity of each atom in a protein over a given time period. Each 
trajectory can be 5-20GB in size and the BioSimGrid project can store
over 2000 trajectories.
We address these issues by delivering a distributed database system,
to enable more effective storage, access and exchange of biomolecular
simulation data, utilising Python as the integration language and
flatfiles for storage.
The flatfile distribution is managed using a Storage Resource Broker
(SRB, http://www.sdsc.edu/srb/) and controlled via a limited python
To exploit the efficiency of the flatfiles for the trajectory data and
the power of relational databases for the metadata, we present a
hybrid approach to storing the data, using (i) SRB to manage flatfiles
and using (ii) Oracle10g to store the metadata across 6 distributes

Long description 

Computer simulations play a vital role in biochemical research. By
simulating the interactions of all atoms within a molecule or protein,
the biochemical properties of the structure can be revealed. One
important application of such Molecular Dynamics and Monte-Carlo
simulations is predictive modelling in drug discovery, where the
motion of proteins is important. These simulations are
computationally demanding, and they produce huge amounts of data which
are analysed by a variety of methods in order to obtain biochemical
properties.  Generally, these data are stored at the laboratory where
they were computed, in a proprietary format which is unique to the
simulation code that was used. This constrains the sharing of
data and results within the biochemistry community: because the data
formats vary, the data generally cannot be compared easily with
post-processing tools.
This talk covers how the BioSimGrid project utilises Python to produce
a framework which stores the data and allows users to submit
scripts to process it. The data are structured into
two key areas, the trajectory data and the metadata. As the metadata
is small and relational, it is stored in a database, currently Oracle
10g. This database is replicated across our 6 sites and accessed using
the Python interfaces. Each trajectory is 5-20GB and not well
suited to a relational database, so to meet the storage and processing
requirements we have produced a framework which uses Python to store
the data in, and retrieve it from, a Storage Resource Broker (SRB).
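
The split between metadata and trajectory data described above can be
sketched roughly as follows. This is only an illustration and not the
BioSimGrid code: an in-memory SQLite database stands in for Oracle 10g,
a local directory stands in for SRB, and the class, table and column
names (HybridStore, trajectory, protein, n_atoms) are all invented for
the example.

```python
import sqlite3
import tempfile
from pathlib import Path


class HybridStore:
    """Toy hybrid store: metadata rows in SQL, trajectory bytes in flat files."""

    def __init__(self, flatfile_root):
        # A local directory stands in for the SRB-managed flat files.
        self.root = Path(flatfile_root)
        self.root.mkdir(parents=True, exist_ok=True)
        # In-memory SQLite stands in for the replicated metadata database.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE trajectory ("
            " id INTEGER PRIMARY KEY,"
            " protein TEXT NOT NULL,"
            " n_atoms INTEGER NOT NULL,"
            " path TEXT NOT NULL,"
            " n_bytes INTEGER NOT NULL)"
        )

    def deposit(self, protein, n_atoms, payload):
        """Write the bulk data to a flat file and record where it went."""
        cur = self.db.execute(
            "INSERT INTO trajectory (protein, n_atoms, path, n_bytes)"
            " VALUES (?, ?, '', ?)",
            (protein, n_atoms, len(payload)),
        )
        traj_id = cur.lastrowid
        path = self.root / f"traj_{traj_id}.dat"
        path.write_bytes(payload)
        self.db.execute(
            "UPDATE trajectory SET path = ? WHERE id = ?", (str(path), traj_id)
        )
        self.db.commit()
        return traj_id

    def find(self, protein):
        """Relational query: touches only the small metadata table."""
        rows = self.db.execute(
            "SELECT id FROM trajectory WHERE protein = ? ORDER BY id", (protein,)
        )
        return [row[0] for row in rows]

    def read_slice(self, traj_id, offset, length):
        """Read part of a trajectory without loading the whole file."""
        row = self.db.execute(
            "SELECT path FROM trajectory WHERE id = ?", (traj_id,)
        ).fetchone()
        if row is None:
            raise KeyError(traj_id)
        with open(row[0], "rb") as fh:
            fh.seek(offset)
            return fh.read(length)


# Usage: deposit a tiny fake trajectory, find it by metadata, read a slice.
store = HybridStore(tempfile.mkdtemp())
traj_id = store.deposit("lysozyme", 3, b"0123456789")
print(store.find("lysozyme"), store.read_slice(traj_id, 2, 4))
```

The point of the split is that searches only ever touch the small
relational table, while the multi-gigabyte files are opened only when
a slice of one is actually needed.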

SRB enables us to deposit data at any site/location and have it
available to all sites, thus eliminating the need to transport the data
or record its location. There are currently many interfaces
to SRB for other languages, as well as a Windows-based application and
a very basic Python interface. This talk will also look into the
existing API as well as ongoing work. It will cover the advantages and
disadvantages of SRB, and provide a practical example showing how it can
be used to store extremely large volumes of data.
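
The idea of depositing data at one site and reading it from any other
without knowing where it lives can be illustrated with a toy model.
This is not the SRB API or its Python interface: ToyBroker, put and get
are hypothetical names, the site names are made up, and the "sites" are
plain dictionaries in one process.

```python
class ToyBroker:
    """Toy location-transparent store: callers use logical names only."""

    def __init__(self, site_names):
        # Each site holds its own data; nothing is copied between sites.
        self.sites = {name: {} for name in site_names}
        # The catalogue is the only place that knows which site holds what.
        self.catalog = {}

    def put(self, site, logical_name, data):
        """Deposit data at one site and register it in the catalogue."""
        if site not in self.sites:
            raise KeyError(f"unknown site: {site}")
        if logical_name in self.catalog:
            raise ValueError(f"already stored: {logical_name}")
        self.sites[site][logical_name] = data
        self.catalog[logical_name] = site

    def get(self, logical_name):
        """Fetch by logical name; the caller never names a site."""
        site = self.catalog[logical_name]
        return self.sites[site][logical_name]


# Usage: data deposited at one site is readable by logical name alone.
broker = ToyBroker(["site_a", "site_b"])
broker.put("site_a", "/demo/traj_0001", b"frame data")
print(broker.get("/demo/traj_0001"))
```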

Steven Johnston (MEng) 
Computational Engineering and Design Research Group
School of Engineering Science 
University of Southampton
SO17 1BJ
Telephone: + 44 (0) 23 8059 8348
Mob phone: + 44 (0) 77 6439 1901

Email: sjj698 at zepler.org

MSN Login (sjj698 at hotmail.com)

