[Numpy-discussion] ANN: BLZ 0.6.1 has been released

Francesc Alted francesc at continuum.io
Sun Jan 26 02:52:19 EST 2014


Announcing BLZ 0.6 series
=========================

What it is
----------

BLZ is a chunked container for numerical data.  Chunking allows for
efficient enlarging/shrinking of data container.  In addition, it can
also be compressed for reducing memory/disk needs.  The compression
process is carried out internally by Blosc, a high-performance
compressor that is optimized for binary data.

The main objects in BLZ are `barray` and `btable`.  `barray` is meant
for storing multidimensional homogeneous datasets efficiently.
`barray` objects provide the foundations for building `btable`
objects, where each column is made of a single `barray`.  Facilities
are provided for iterating, filtering and querying `btables` in an
efficient way.  You can find more info about `barray` and `btable` in
the tutorial:

http://blz.pydata.org/blz-manual/tutorial.html

BLZ can use numexpr internally so as to accelerate many vector and
query operations (although it can use pure NumPy for doing so too)
either from memory or from disk.  In the future, it is planned to use
Numba as the computational kernel and to provide better Blaze
(http://blaze.pydata.org) integration.


What's new
----------

BLZ has been branched off from the Blaze project
(http://blaze.pydata.org).  BLZ was meant as a persistent format and
library for I/O in Blaze.  BLZ in Blaze is based on previous carray
0.5 and this is why this new version is labeled 0.6.

BLZ supports completely transparent storage on-disk in addition to
memory.  That means that *everything* that can be done with the
in-memory container can be done using the disk as well.

The advantages of a disk-based container is that the addressable space
is much larger than just your available memory.  Also, as BLZ is based
on a chunked and compressed data layout based on the super-fast Blosc
compression library, the data access speed is very good.

The format chosen for the persistence layer is based on the
'bloscpack' library and described in the "Persistent format for BLZ"
chapter of the user manual ('docs/source/persistence-format.rst').
More about Bloscpack here: https://github.com/esc/bloscpack

You may want to know more about BLZ in this blog entry:
http://continuum.io/blog/blz-format

In this version, support for Blosc 1.3 has been added, that meaning
that a new `cname` parameter has been added to the `bparams` class, so
that you can select you preferred compressor from 'blosclz', 'lz4',
'lz4hc', 'snappy' and 'zlib'.

Also, many bugs have been fixed, providing a much smoother experience.

CAVEAT: The BLZ/bloscpack format is still evolving, so don't trust on
forward compatibility of the format, at least until 1.0, where the
internal format will be declared frozen.


Resources
---------

Visit the main BLZ site repository at:
http://github.com/ContinuumIO/blz

Read the online docs at:
http://blz.pydata.org/blz-manual/index.html

Home of Blosc compressor:
http://www.blosc.org

User's mail list:
blaze-dev at continuum.io

----

    Enjoy!

Francesc Alted
Continuum Analytics, Inc.




More information about the NumPy-Discussion mailing list