ANN: HarvestMan 1.4.5 a2

Anand abpillai at gmail.com
Thu Jul 21 20:30:35 CEST 2005


I am glad to announce that the second alpha release of HarvestMan 1.4.5
is available for download. The first alpha of version 1.4.5 was
released on May 27, 2005.

     HarvestMan [http://harvestman.freezope.org] is a multithreaded web
crawler written entirely in Python. It has many features that allow you
to customize web crawling and offline downloading to a high degree.
HarvestMan offers more than 60 configuration options, which are set in
a custom XML configuration file.

    Here are some of the changes in this release.

o Use of the collections.deque data structure in critical places. This
  improves performance when the program is run with Python 2.4.
o Improved HTTP redirect handler to handle redirects that require
  cookies.
o A number of bug fixes to reduce invalid URL (HTTP 404) errors.
o Code cleanup to rewrite exception handlers.
o Obsolete modules removed.
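
To illustrate the collections.deque item above: a crawler's queue of
pending URLs is a typical place where a deque helps, because removing
from the front of a deque takes constant time, while list.pop(0) shifts
every remaining element. The following is only a minimal sketch in
current Python, not HarvestMan's actual code; the LINKS table and the
crawl_order function are hypothetical names made up for this example.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for fetched pages.
# A real crawler would download each page and parse its links instead.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": ["/c", "/"],
    "/c": [],
}


def crawl_order(start, links):
    """Return URLs in breadth-first order, visiting each URL once."""
    queue = deque([start])  # FIFO frontier; popleft() is O(1)
    seen = {start}          # URLs already queued, to avoid revisits
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    print(crawl_order("/", LINKS))
```

With a plain list the same loop would call queue.pop(0), which costs
time proportional to the queue length on every step; the deque keeps
each step cheap even when many URLs are pending.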

Complete changelog at http://harvestman.freezope.org/files/Changelog.txt .

WWW: http://harvestman.freezope.org .

-Anand


