On 4/2/06, Fredrik Lundh
Brett Cannon wrote:
oh, I forgot that the Procrastination & Stop energy Foundation was involved in this.
Fredrik, if you would like to help move this all forward, great; I would appreciate the help. You can write a page scraper to get the data out of SF
challenge accepted ;-)
http://effbot.python-hosting.com/browser/stuff/sandbox/sourceforge/
contains three basic tools; getindex to grab index information from a python tracker, getpages to get "raw" xhtml versions of the item pages, and getfiles to get attached files.
I'm currently downloading a tracker snapshot that could be useful for testing; it'll take a few more hours before all data are downloaded (provided that SF doesn't ban me, and I don't stumble upon more cases where a certain rhettinger has pasted binary gunk into an iso-8859-1 form ;-).
$ python status.py
tracker-105470  6681 items  1201 pages  (17%)  104 files
tracker-305470  3610 items     0 pages   (0%)    0 files
tracker-355470   430 items   430 pages (100%)   80 files
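For readers following along, here is a minimal sketch of how one line of a report like the one above could be produced. The real status.py is not shown in this message, so the function name `status_line` and its arguments are hypothetical; the only thing taken from the output is that the percentage appears to be rounded down (1201 of 6681 is shown as 17%).

```python
def status_line(name, items, pages, files):
    """Format one tracker summary line (hypothetical helper, not the real status.py)."""
    # Floor the percentage with integer division; guard against an empty index.
    percent = pages * 100 // items if items else 0
    return "%s %d items %d pages (%d%%) %d files" % (
        name, items, pages, percent, files)


if __name__ == "__main__":
    # Counts copied from the report quoted above.
    for row in [("tracker-105470", 6681, 1201, 104),
                ("tracker-305470", 3610, 0, 0),
                ("tracker-355470", 430, 430, 80)]:
        print(status_line(*row))
```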
the final step is to finish the "off-line scraper" library (a straightforward ET hack), and make a snapshot archive available to interested parties. (drop me a line if you want a copy)
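To illustrate what an ElementTree-based off-line scraper might look like, here is a rough sketch that parses an already-downloaded XHTML page and pulls out its title and link targets. It is not the library mentioned above: the function name `scrape_page`, the returned dictionary, and the sample markup are all assumptions made for illustration, and real SF item pages would need page-specific handling.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace once parsed by ElementTree.
XHTML = "{http://www.w3.org/1999/xhtml}"


def scrape_page(text):
    """Extract the title and link targets from a well-formed XHTML string.

    Hypothetical helper; the field names are illustrative only.
    """
    root = ET.fromstring(text)
    title = root.findtext(".//%stitle" % XHTML, default="")
    # Collect href attributes from anchors, skipping anchors without one.
    links = [a.get("href") for a in root.iter(XHTML + "a") if a.get("href")]
    return {"title": title.strip(), "links": links}


# Made-up sample page, standing in for a saved "raw" xhtml item page.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title> Item 1 </title></head>"
    '<body><a href="download.php?id=1">patch</a><a>no href</a></body>'
    "</html>"
)

if __name__ == "__main__":
    print(scrape_page(SAMPLE))
```

Because this works on saved pages only, it can be rerun as often as needed without touching the SF servers again.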
If you would rather contribute by collecting a list of possible trackers along with who will maintain it, then please do. I am not going to dive into that quite yet, but if you want to parallelize the work needed then I would appreciate the help.
that is what I expected the PSF infrastructure committee to do (I hope you're not the only one in that committee?); it's a bit disappointing to hear that we're still stuck on the SF export issue.
The reason I didn't want to deal with the trackers quite yet was that I could see people getting the trackers up and squared away, and then just get frustrated when we were unable to get the SF data to them quickly. I didn't want other people stuck spinning their wheels waiting on us. -Brett
(wasn't there someone with backchannel access to the SF data?)
The tracker will need to be able to import the SF data somehow (probably will require a custom tool so the volunteers need to be aware of this), be able to export data (so we can back it up on a regular basis so we don't have to go through this again), and an email interface for at least replying to tracker items. A community-wide announcement will probably be needed to get a good group of volunteers together for any one non-commercial tracker.
But I am not procrastinating. I don't think I have ever come off as a procrastinator on this list and I don't think I deserve the label.
I wasn't talking about individuals, I was referring to the trend where PSF moves something off a public forum, and the work just ends up going nowhere.
</F>