ANN: PyTables 1.3 released

Francesc Altet faltet at
Sat Apr 1 21:38:48 CEST 2006

 Announcing PyTables 1.3

This is a new major release of PyTables.  The most remarkable feature
added in this version is complete support (well, almost, because
unicode arrays are not there yet) for NumPy objects. Improved support
for native HDF5 is there as well. As an aside, I'm happy to inform you
that the PyTables web site has been converted into a wiki so that users
can contribute to the project with recipes or any other document.  Try
it out!

Go to the (new) PyTables web site for downloading the beast:

or keep reading for more info about the new features and bugs fixed.

Changes more in depth


- Support for NumPy objects in all the objects of PyTables, namely:
  Array, CArray, EArray, VLArray and Table. All the numerical and
  character (except unicode arrays) flavors are supported as well as
  plain and nested heterogeneous NumPy arrays. PyTables leverages the
  adoption of the array interface for very efficient conversion
  between numarray objects (numarray continues to be the native flavor
  for PyTables) and NumPy/Numeric objects.

- The FLAVOR schema in PyTables has been refined and simplified. Now,
  the only 'flavors' allowed for data objects are: "numarray", "numpy",
  "numeric" and "python". The changes have been made so that they are
  fully backward compatible with existing PyTables files. However, when
  users try to use old flavors (like "Numeric" or "Tuple") in existing
  code, a ``DeprecationWarning`` will be issued in order to encourage
  them to migrate to the new flavors as soon as possible.

- Nested fields can be specified in the "field" parameter of
  by using a '/' as a separator between fields (e.g. 'Info/value').

- The Table.Cols accessor has received a new ``__setitem__()`` method
  that allows doing things like:

            table.cols[4] = record
            table.cols.x[4:1000:2] = array   # homogeneous column
            table.cols.Info[4:1000:2] = recarray   # nested column

- A clean-up function (using ``atexit``) has been registered so that
  any files still open are closed when a user hits ^C, for example.
  This helps to avoid ending up with corrupted files.

- Native HDF5 compound datasets that are contiguous are supported
  now. Before, only chunked datasets were supported.

- Updated (and much improved) sections about compression issues in the
  User's Guide. They include new benchmarks made with PyTables 1.3 and
  an exhaustive comparison between Zlib, LZO and bzip2.

- The HTML version of the manual is now generated with the docbook2html
  package for an improved look (IMO).

Bug fixes:

- Solved a problem when trying to save CharArrays with itemsize = 0 as
  attributes of nodes. Now, these objects are pickled in order to
  prevent HDF5 from crashing.

- Fixed some alignment issues with nested record arrays on certain
  architectures (e.g. PowerPC).

- Fixed automatic conversions when a VLArray is read on a platform
  whose byte ordering differs from that of the file.
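
To make the byte-ordering issue in the last item concrete, the sketch
below decodes the same raw bytes with the correct and with the wrong
byte order. It uses only the standard ``struct`` module and illustrates
the general idea, not how PyTables or HDF5 perform the conversion. The
``read_int32_values()`` helper is a hypothetical name for this example.

```python
import struct


def read_int32_values(raw, file_byteorder):
    """Decode 32-bit signed integers stored with the given byte order.

    ``file_byteorder`` is "little" or "big" and describes how the data
    was written, independently of the machine doing the reading.
    """
    prefix = {"little": "<", "big": ">"}[file_byteorder]
    count = len(raw) // 4
    return list(struct.unpack(f"{prefix}{count}i", raw))


# Bytes as a big-endian machine (e.g. PowerPC) would have written them.
raw = struct.pack(">3i", 1, 2, 3)

# Honouring the byte order recorded for the file gives the right values.
values = read_int32_values(raw, "big")

# Assuming the wrong order silently produces different numbers.
wrong = read_int32_values(raw, "little")
```

Here ``values`` holds the original integers, while ``wrong`` holds the
byte-swapped ones, which is the kind of silent corruption an automatic
conversion is meant to prevent.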

Deprecated features:

- Due to recurrent problems with the UCL compression library, it has
  been declared deprecated from this version on. You can still compile
  PyTables with UCL support (using the --force-ucl flag), but you are
  urged not to use it anymore and to convert any existing datafiles
  that use UCL to another supported library (zlib, lzo or bzip2) with
  the ``ptrepack`` utility.
Backward-incompatible changes:

- Please see the ``RELEASE-NOTES.txt`` file.

Important note for Windows users

If you want to use PyTables with Python 2.4 on Windows platforms,
you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET
2003.  It can be found at:

Users of Python 2.3 on Windows will have to download the version of HDF5
compiled with MSVC 6.0 available at:

What it is

**PyTables** is a package for managing hierarchical datasets, designed
to cope efficiently with extremely large amounts of data (with support
for full 64-bit file addressing).  It features an object-oriented
interface that, combined with C extensions for the performance-critical
parts of the code, makes it a very easy-to-use tool for high performance
data storage and retrieval.

PyTables runs on top of the HDF5 library and the numarray package (NumPy
and Numeric are also supported) to achieve maximum throughput and
convenient use.

Besides, PyTables I/O for table objects is buffered, implemented in C
and carefully tuned so that you can reach much better performance with
PyTables than with your own home-grown wrappings to the HDF5 library.
PyTables sports indexing capabilities as well, allowing selections in
tables exceeding one billion rows in just seconds.


This version has been extensively checked on quite a few platforms, like
Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64
(Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC (and PowerPC64)
and MacOSX on PowerPC.  For other platforms, chances are that the code
can be easily compiled and run without further issues.  Please contact
us if you experience problems.


Go to the PyTables web site for more details:

About the HDF5 library:

About numarray:

To know more about the company behind the PyTables development, see:


Thanks to the various users who provided feature improvements, patches,
bug reports, support and suggestions.  See the ``THANKS`` file in the
distribution package for an (incomplete) list of contributors.  Many
thanks also to SourceForge who have helped to make and distribute this
package!  And last but not least, a big thank you to THG for sponsoring
many of the new features recently introduced in PyTables.

Share your experience

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.


  **Enjoy data!**

  -- The PyTables Team
