ANN: Bubbles 0.1 – Virtual Data Object Framework

Sun Jun 23 19:58:54 CEST 2013

Hi,

I'm happy to announce a new Python data framework: Bubbles.

Motto: Focus on the process, not the data technology.

Blog post: http://blog.databrewery.org/posts/bubbles-0-1-released.html

Here is a short presentation of the core concepts:

	http://www.slideshare.net/Stiivi/data-brewery-2-data-objects

The concepts are:

* data objects – abstraction of tabular data; one object might have multiple representations at once (SQL, iterator, ...)
* data stores – abstraction of dataset collections
* operations (performed on top of representations) and an execution context (with an operation catalog)
* processing pipelines
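
To make the first three concepts a bit more concrete, here is a toy sketch in plain Python. It is not the Bubbles API: every name in it (SQLTableObject, ListObject, CATALOG, operation, run) is made up for illustration. It only shows the general idea of one object offering several representations and a catalog picking the implementation that matches.

```python
import sqlite3

class SQLTableObject:
    """Hypothetical data object backed by a SQL table (not a Bubbles class)."""
    def __init__(self, connection, table):
        self.connection = connection
        self.table = table  # table name is trusted in this sketch

    def representations(self):
        # Preferred representation first: stay in SQL when possible.
        return ["sql", "rows"]

    def sql_statement(self):
        return f"SELECT * FROM {self.table}"

    def rows(self):
        return iter(self.connection.execute(self.sql_statement()))

class ListObject:
    """Hypothetical data object backed by an in-memory list of tuples."""
    def __init__(self, data):
        self.data = data

    def representations(self):
        return ["rows"]

    def rows(self):
        return iter(self.data)

# Toy operation catalog: (operation name, representation) -> implementation
CATALOG = {}

def operation(name, representation):
    """Register a function as the implementation for one representation."""
    def register(func):
        CATALOG[(name, representation)] = func
        return func
    return register

@operation("count", "sql")
def count_sql(obj):
    # Composed statement: the database does the counting.
    statement = f"SELECT COUNT(*) FROM ({obj.sql_statement()})"
    return obj.connection.execute(statement).fetchone()[0]

@operation("count", "rows")
def count_rows(obj):
    # Fallback: consume the row iterator in Python.
    return sum(1 for _ in obj.rows())

def run(name, obj):
    """Dispatch to the first implementation matching the object's representations."""
    for representation in obj.representations():
        implementation = CATALOG.get((name, representation))
        if implementation is not None:
            return implementation(obj)
    raise LookupError(f"no implementation of {name!r} for this object")

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE items (name TEXT)")
    connection.executemany("INSERT INTO items VALUES (?)", [("a",), ("b",)])
    print(run("count", SQLTableObject(connection, "items")))
    print(run("count", ListObject([("x",)])))
```

The caller only writes run("count", obj); whether the count happens in the database or in a Python loop depends on what the object can offer. Adding a custom operation in this sketch is just one more decorated function.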

Priorities of the framework are:

* understandability of the process
* auditability of the data being processed (frequent use of metadata)
* usability
* versatility

Working with data:

* keep data in its original form. For example: represent data by a SQL statement, and neither touch the data nor move it around unless necessary.
* use native operations if possible: compose SQL statements, chain Python iterators, compose APIs
* performance is provided by the technology: the SQL optimizer should know best
* have options – custom operations are easy to create
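
The points above can be illustrated with a small sketch using only the standard library. Again, this is not Bubbles code: the helper names (select_sql, filter_sql, limit_sql, filter_rows, head_rows) and the "sales" table are assumptions made up for the example. The SQL variants only build a statement and touch no data until it is executed; the iterator variants are lazy generators chained together.

```python
import sqlite3
from itertools import islice

# --- SQL representation: compose statements, do not move data ---

def select_sql(table):
    """Return (statement, parameters) selecting a whole table."""
    return (f"SELECT * FROM {table}", ())  # table name is trusted here

def filter_sql(query, column, value):
    """Wrap a query in another SELECT that keeps only matching rows."""
    sql, params = query
    return (f"SELECT * FROM ({sql}) WHERE {column} = ?", params + (value,))

def limit_sql(query, n):
    """Wrap a query in another SELECT that keeps at most n rows."""
    sql, params = query
    return (f"SELECT * FROM ({sql}) LIMIT ?", params + (n,))

# --- Iterator representation: chain lazy Python iterators ---

def filter_rows(rows, index, value):
    """Yield only the rows whose field at position index equals value."""
    return (row for row in rows if row[index] == value)

def head_rows(rows, n):
    """Yield at most n rows."""
    return islice(rows, n)

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    data = [("east", 10), ("west", 20), ("east", 30)]
    connection.executemany("INSERT INTO sales VALUES (?, ?)", data)

    # Nothing is executed until the composed statement is handed over;
    # the database's optimizer decides how to evaluate it.
    query = limit_sql(filter_sql(select_sql("sales"), "region", "east"), 1)
    print(connection.execute(*query).fetchall())

    # The same process on an iterator: rows are pulled one at a time.
    print(list(head_rows(filter_rows(iter(data), 0, "east"), 1)))
```

Both chains describe the same process (filter, then take the first row); only the representation differs, and the heavy lifting stays with whichever technology holds the data.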

Bubbles is performance-agnostic at the low level of physical data implementation. Performance should be ensured by the data technology and by proper use of operations.

Summary of current operations:

	http://www.scribd.com/doc/147247069/Bubbles-Brewery2-Operations

More will come; at least basic Mongo ops are planned for 0.2.

Github: https://github.com/Stiivi/bubbles

If you have any comments, suggestions or questions, let me know.

Cheers,

Stefan