[Python-checkins] r57571 - tracker/importer/README.rst

erik.forsberg python-checkins at python.org
Mon Aug 27 21:52:29 CEST 2007


Author: erik.forsberg
Date: Mon Aug 27 21:52:29 2007
New Revision: 57571

Added:
   tracker/importer/README.rst
Log:
Some basic information about the three converters available in this directory

Added: tracker/importer/README.rst
==============================================================================
--- (empty file)
+++ tracker/importer/README.rst	Mon Aug 27 21:52:29 2007
@@ -0,0 +1,149 @@
+Sourceforge to Roundup Converters
+=================================
+
+This README describes the sourceforge_ to roundup_ converters available
+via svn from http://svn.python.org/projects/tracker/importer/.
+
+The converters were written to provide the Python project with its
+own bug tracker, based on roundup_, replacing the sourceforge bug
+tracker that was used earlier.
+
+The code in the converters is specific to roundup and to the roundup
+instance developed for the Python project, but it can probably serve
+as a base and inspiration for other projects that need to process
+sourceforge data.
+
+Three Converters
+----------------
+
+Three different converters are available. The first screenscrapes
+sourceforge to get the data. This version will only be briefly
+documented. 
+
+The second parses the format output by sourceforge's "old" backup
+system, known as "The XML Data Export". This export is available to
+all project administrators, and is found at the following URL:
+
+https://sourceforge.net/export/xml_export.php?group_id=<project group id>
+
+The third converter parses the format output by sourceforge's "new"
+backup system, known as "an enhanced version of our XML Data Export
+facility". This exporter is only available to users who are both
+project administrators *and* subscribers, as the export is in "early
+release feature" mode. The enhanced export is available at
+
+https://sourceforge.net/export/xml_export2.php?group_id=<project group id>
+
+Due to the size of the python project, we had to use the "enhanced"
+version (xml_export2.php), as the version available to all projects
+produced invalid XML with missing data. 
+
+If the export from the standard version does not validate as XML due
+to a missing </artifacts> and a missing </project_export>, you have
+hit the same bug and need the enhanced version, which is at the time
+of writing available only to subscribers. 
+
+Why Three Different Converters?
+-------------------------------
+
+To make a long story short:
+
+ * When we began with the project of replacing sourceforge's tracker
+   with one based on roundup, the XML Data Export was completely
+   broken. So, Fredrik Lundh provided a screenscraping framework for
+   sourceforge, and a converter based on this was written. 
+
+   A run of this converter took about 15 hours to complete, and was
+   very error prone due to the instability of the sourceforge web servers.
+
+ * The XML Data Export was then fixed, and a new converter was
+   written that processed the XML instead of screenscraping. It was
+   much faster (about 2 hours) and also more reliable.
+
+ * The XML Data Export then began to malfunction, producing invalid
+   XML. This problem was reported to sourceforge.
+
+ * After a while, sourceforge told us about the enhanced data export,
+   which unfortunately was not only available only to subscribers,
+   but also produced a completely different XML format.
+
+   So a third converter was written.
+
+The Screenscraping Converter
+----------------------------
+
+The following files in this directory are needed by the screenscraping
+converter:
+
+ * effbot2roundup.py
+
+ * handlers.py
+
+The screenscraping converter also requires effbot's sourceforge
+screenscraper from http://effbot.org/zone/sandbox-sourceforge.htm
+ 
+The Standard "XML Data Export" Converter
+----------------------------------------
+
+The converter for the standard export available to all project
+administrators consists of the following files:
+
+ * BeautifulSoup.py
+ 
+ * sfxml2roundup.py
+
+ * sfxmlhandlers.py
+
+Basic usage is to run 'sfxml2roundup.py --xmlfile <download from
+sourceforge> --trackerhome <path to your initialized roundup tracker
+instance>'. The handlers list in sfxml2roundup.py needs to be adjusted
+to suit your roundup instance's schema, and the handlers in
+sfxmlhandlers.py probably also need adjustment.
+
+The Enhanced "XML Data Export" Converter
+----------------------------------------
+
+The converter for the enhanced XML Data Export, available to project
+administrators who are subscribers, consists of the following files:
+
+ * config.py
+ 
+ * xmlexport2handlers.py
+
+ * xmlexport2toroundup.py
+
+Basic usage is to run 'xmlexport2toroundup.py --xmlfile <download from
+sourceforge> --trackerhome <path to your initialized roundup tracker
+instance>'. The handlers list in xmlexport2toroundup.py and the
+handlers in xmlexport2handlers.py need to be adjusted to your roundup
+instance's schema.
+
+config.py contains some mappings between sourceforge's values for
+properties of different kinds and the corresponding properties in your
+roundup instance.
+
+Other Important Utilities
+-------------------------
+
+The fixsfmojibake.py script takes care of the fact that the export
+from sourceforge has mixed-up character encodings. The export claims
+to be iso-8859-1 but also contains UTF-8. 
+
+This has only been tested with the export from the enhanced version,
+but if you experience character set trouble with the export from the
+standard version, this script might help.
+
+Usage: fixsfmojibake.py < in.xml > out.xml
+
+
+Getting More Help
+-----------------
+
+Subscribe to and mail the tracker-discuss mailing list to reach the
+people who wrote the importer. See
+http://mail.python.org/mailman/listinfo/tracker-discuss for
+details.
+
+
+.. _sourceforge: http://sf.net
+.. _roundup: http://roundup.sf.net
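
The README's "Other Important Utilities" section describes an export that claims to be iso-8859-1 but also contains UTF-8. The following is a minimal sketch of one way such mixed input could be repaired. It is not the code of fixsfmojibake.py, whose implementation is not shown in this commit; the function name `decode_mixed` and the regular-expression heuristic are assumptions made for illustration. The idea is to decode every well-formed multi-byte UTF-8 sequence as UTF-8 and to treat all remaining bytes as iso-8859-1.

```python
import re

# Matches one multi-byte sequence that has the shape of UTF-8 (2 to 4 bytes).
_UTF8_SEQ = re.compile(
    rb"[\xc2-\xdf][\x80-\xbf]"
    rb"|[\xe0-\xef][\x80-\xbf]{2}"
    rb"|[\xf0-\xf4][\x80-\xbf]{3}"
)


def decode_mixed(raw: bytes) -> str:
    """Decode bytes that mix iso-8859-1 and UTF-8 (hypothetical helper)."""
    parts = []
    pos = 0
    for match in _UTF8_SEQ.finditer(raw):
        # Bytes between UTF-8-shaped sequences are taken as iso-8859-1.
        parts.append(raw[pos:match.start()].decode("iso-8859-1"))
        try:
            parts.append(match.group().decode("utf-8"))
        except UnicodeDecodeError:
            # Right shape but not valid UTF-8 (overlong form, surrogate,
            # out of range): fall back to iso-8859-1 for these bytes.
            parts.append(match.group().decode("iso-8859-1"))
        pos = match.end()
    parts.append(raw[pos:].decode("iso-8859-1"))
    return "".join(parts)


if __name__ == "__main__":
    import sys

    # Same calling convention as the README's usage line: stdin to stdout.
    # The output is written as UTF-8; any encoding declaration in the XML
    # header would still need to be updated to match.
    sys.stdout.buffer.write(decode_mixed(sys.stdin.buffer.read()).encode("utf-8"))
```

This heuristic can misread genuine iso-8859-1 text whose bytes happen to form a valid UTF-8 sequence, so the result should be spot-checked against the original tracker data.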

