<br><div class="gmail_quote">On Thu, May 3, 2012 at 11:03 PM, Paul Rubin <span dir="ltr">&lt;<a href="mailto:no.email@nospam.invalid" target="_blank">no.email@nospam.invalid</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im">

> Sort of as you suggest, you could build a Huffman encoding for a<br>

> representative run of data, save that tree off somewhere, and then use<br>

> it for all your future encoding/decoding.<br>

<br>

</div>Zlib is better than Huffman in my experience, and Python's zlib module<br>

already has the right entry points.<br>

<br></blockquote><div>Isn't zlib kind of dated? Granted, it's newer than Huffman, but bzip2 and xz have come along since then, among numerous others.<br><br>Here's something for xz:<br><a href="http://stromberg.dnsalias.org/svn/xz_mod/trunk/">http://stromberg.dnsalias.org/svn/xz_mod/trunk/</a><br>

An xz module (lzma) is in the CPython 3.3 alphas; the above module wraps it if available, and otherwise falls back to ctypes or a pipe to an xz binary.<br><br>And I believe bzip2 support (the bz2 module) is in the standard library for most versions of CPython.<br>
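<br>If you want to compare them on your own data, something like the following might do as a rough sketch. It assumes Python 3.3 or later (so that lzma is importable), and the sample records are made up purely for illustration; which codec wins depends on what your real data looks like.<br>
<br>

```python
import bz2
import lzma
import zlib

# Hypothetical sample data: many short, similar records.
sample = b"".join(b"record %d: status=ok\n" % i for i in range(2000))

# Each standard-library module exposes one-shot compress/decompress functions.
codecs = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

sizes = {}
for name, (compress, decompress) in codecs.items():
    packed = compress(sample)
    # Round-trip check: decompressing must give back the original bytes.
    assert decompress(packed) == sample
    sizes[name] = len(packed)
    print(name, len(sample), "bytes to", len(packed), "bytes")
```

<br>Swap in a representative chunk of your real data for sample before drawing any conclusions.<br>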

<br></div></div>