ANN: python-blosc 1.0.2
====================================================
Announcing python-blosc 1.0.2
A Python wrapper for the Blosc compression library
====================================================

What is it?
===========

Blosc (http://blosc.pytables.org) is a high performance compressor
optimized for binary data.  It has been designed to transmit data to
the processor cache faster than the traditional, non-compressed,
direct memory fetch approach via a memcpy() call.

Blosc works well for compressing numerical arrays that contain data
with relatively low entropy, like sparse data, time series, grids with
regularly spaced values, etc.

python-blosc is a Python package that wraps it.

What is new?
============

Updated to Blosc 1.1.2.  This fixes some bugs when dealing with very
small buffers (typically smaller than the specified typesize).
Closes #1.

Basic Usage
===========

[Using IPython shell and a 2-core machine below]

# Create a binary string made of int (32-bit) elements
>>> import array
>>> a = array.array('i', range(10*1000*1000))
>>> bytes_array = a.tostring()

# Compress it
>>> import blosc
>>> bpacked = blosc.compress(bytes_array, typesize=a.itemsize)
>>> len(bytes_array) / len(bpacked)
110
# 110x compression ratio.  Not bad!
# Compression speed?
>>> timeit blosc.compress(bytes_array, typesize=a.itemsize)
100 loops, best of 3: 12.8 ms per loop
>>> len(bytes_array) / 0.0128 / (1024*1024*1024)
2.9103830456733704
# wow, compressing at ~ 3 GB/s, that's fast!

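The 110x ratio above depends heavily on the input being very regular.
As a rough illustration of the "low entropy" point, the sketch below
compares a regular integer sequence against random bytes.  Note the
assumptions: zlib from the standard library is used only as a stand-in
so the snippet runs without python-blosc installed, the helper name
compression_ratio is made up for this example, and the ratios it
prints will not match Blosc's.

```python
import array
import os
import zlib


def compression_ratio(raw: bytes, packed: bytes) -> float:
    """Return the uncompressed size divided by the compressed size."""
    return len(raw) / len(packed)


# Regular data: consecutive 32-bit integers, as in the session above
# (a much smaller array here, to keep the example quick).
n = 100 * 1000
regular = array.array('i', range(n)).tobytes()

# High-entropy data of the same length: bytes from the OS random source.
random_bytes = os.urandom(len(regular))

# zlib is a stand-in compressor here, NOT Blosc.
ratio_regular = compression_ratio(regular, zlib.compress(regular))
ratio_random = compression_ratio(random_bytes, zlib.compress(random_bytes))

print("regular data ratio: %.2f" % ratio_regular)
print("random data ratio:  %.2f" % ratio_random)
```

With python-blosc installed, swapping zlib.compress(...) for
blosc.compress(..., typesize=4) should show the same qualitative gap.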
# Decompress it
>>> bytes_array2 = blosc.decompress(bpacked)
# Check whether our data have had a good trip
>>> bytes_array == bytes_array2
True
# yup, it seems so
# Decompression speed?
>>> timeit blosc.decompress(bpacked)
10 loops, best of 3: 21.3 ms per loop
>>> len(bytes_array) / 0.0213 / (1024*1024*1024)
1.7489625814375185
# decompressing at ~ 1.7 GB/s is pretty good too!

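The speed figures above are plain arithmetic on the buffer size and
the timeit results.  A minimal sketch of that calculation follows.  The
helper name throughput_gib_per_s is hypothetical, and the buffer size
assumes 4-byte ints (a.itemsize == 4).  Since the divisor is 1024**3,
the result is strictly in GiB/s.

```python
GIB = 1024 ** 3  # bytes per GiB, the divisor used in the session above


def throughput_gib_per_s(n_bytes: int, seconds: float) -> float:
    """Return how many GiB are processed per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_bytes / seconds / GIB


# Ten million 32-bit integers, assuming itemsize == 4.
n_bytes = 10 * 1000 * 1000 * 4

# Timings taken from the timeit output quoted above.
compress_rate = throughput_gib_per_s(n_bytes, 0.0128)
decompress_rate = throughput_gib_per_s(n_bytes, 0.0213)

print("compression:   %.2f GiB/s" % compress_rate)
print("decompression: %.2f GiB/s" % decompress_rate)
```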
More examples showing other features (and using NumPy arrays) are
available on the python-blosc wiki page:

http://github.com/FrancescAlted/python-blosc/wiki

Documentation
=============

Please refer to the docstrings.  Start with the main package:

>>> import blosc
>>> help(blosc)

and ask for more docstrings in the referenced functions.

Download sources
================

Go to:

http://github.com/FrancescAlted/python-blosc

and download the most recent release from there.

Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for
details.

Mailing list
============

There is an official mailing list for Blosc at:

blosc@googlegroups.com
http://groups.google.es/group/blosc

----

**Enjoy data!**

-- 
Francesc Alted