I am glad to announce that the second alpha release of HarvestMan 1.4.5 is available for download. The first alpha of version 1.4.5 was released on May 27, 2005.

HarvestMan [http://harvestman.freezope.org] is a multithreaded web crawler written entirely in Python. It has many features that allow you to customize your web crawling and offline downloading in detail. HarvestMan offers more than 60 configuration options, which are set in a custom XML configuration file.

Here are some changes for this release.

o Use of the collections.deque data structure in critical places. This improves the performance of the program when run with Python 2.4.
o Improved HTTP redirect handler to take care of redirects that require cookies.
o A number of bug fixes to reduce invalid URL (HTTP 404) errors.
o Code cleanup to rewrite exception handlers.
o Obsolete modules removed.

Complete changelog at http://harvestman.freezope.org/files/Changelog.txt .

WWW: http://harvestman.freezope.org .

-Anand