Hi Exarkun, 

Thanks for getting back to me on this.

No problem if we exclude the spam filter tables. I don't need them.

Concerning the svn tables, if they don't carry additional information linking revisions to other (Trac-specific) metadata, I don't need them either.

I understand your concerns regarding the permission table. I thought it might help determine the role of a specific user in the organization. If that is enough of an argument, perfect; otherwise we can leave it out, as it is not my primary source of data.
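
To make the exclusions concrete, here is a rough sketch of how a dump could skip the tables mentioned in this thread. It is for illustration only and is not the code from the gist. I am assuming a SQLite backend purely to keep the example self-contained (the real database may well use something else), and the names dump_tables, EXCLUDED_TABLES and demo are hypothetical.

```python
import sqlite3

# Tables discussed in this thread as candidates for exclusion.
EXCLUDED_TABLES = {
    "spamfilter_bayes",
    "spamfilter_log",
    "revision",
    "node_change",
    "permission",
}


def dump_tables(conn, excluded=EXCLUDED_TABLES):
    """Return {table_name: [rows]} for every table not listed in `excluded`."""
    cur = conn.cursor()
    # sqlite_master lists the tables of a SQLite database (assumed backend).
    cur.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    names = [row[0] for row in cur.fetchall()]
    dump = {}
    for name in names:
        if name in excluded:
            continue
        # Table names come from the database itself; quote them defensively.
        quoted = '"' + name.replace('"', '""') + '"'
        dump[name] = cur.execute("SELECT * FROM " + quoted).fetchall()
    return dump


def demo():
    """Build a tiny in-memory database with made-up rows and dump it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticket (id INTEGER, summary TEXT)")
    conn.execute("CREATE TABLE permission (username TEXT, action TEXT)")
    conn.execute("CREATE TABLE spamfilter_log (id INTEGER, entry TEXT)")
    conn.execute("INSERT INTO ticket VALUES (1, 'example ticket')")
    conn.execute("INSERT INTO permission VALUES ('someone', 'TRAC_ADMIN')")
    return dump_tables(conn)
```

Calling demo() returns only the ticket table, since permission and spamfilter_log are in the exclusion set. Dropping "permission" from that set would be the only change needed if we decide to include it after all.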

Cheers,
Jonathan




On Fri, Apr 5, 2013 at 8:24 AM, <exarkun@twistedmatrix.com> wrote:
On 28 Mar, 05:16 pm, jonathan@stoppani.name wrote:
>Hello everybody (with access to the Trac DB),
>
>I am currently doing my master's thesis on the "analysis and management
>of change propagation in complex systems". I'm concentrating my
>efforts on software-based complex systems.
>
>As part of my analysis, I gather data from different domains, such as
>dependencies between modules of the source code, interactions between
>people (like, for example, this mailing list) and change requests (in
>this specific case, issues and tickets on an issue tracking system).
>
>A couple of weeks ago I asked on IRC if it would be possible to get
>the Trac data from twistedmatrix.com and I was told to write a script
>to dump the database by excluding sensitive information.
>
>The script is up for review and auditing over here:
>
>https://gist.github.com/GaretJax/5264941
>
>It can be run by saving it to a .py file or directly with the following
>command:
>
>curl -s \
>  https://gist.github.com/GaretJax/5264941/raw/c478c2c4ec39cdb4bc3ceeb05d57a31063a0a486/dump-trac.py \
>  | python - <projenv> <outfile>
>
>(by replacing the two arguments: trac base directory and the output
>file).
>
>There are no privacy concerns, as all data being analyzed is publicly
>available, either in the repository, in the mailing list archives or
>on twistedmatrix.com
>
>After being reviewed, can someone with access to the server please run
>it for me?

Hi Jonathan,

A couple questions about the script.  There are a few more tables in the
database that I'm not sure will be interesting to you.

Do you mind if we also exclude:

    spamfilter_bayes
    spamfilter_log

Some of the tables are also basically an inefficient mirror of the
subversion repository - revision, node_change.  Do you want this data as
well?

Lastly, I have some reluctance to distribute the contents of the
permission table.  I could probably be easily convinced to do so, but if
you don't think you'll actually use that data, I'd just as soon not.

Thanks!
Jean-Paul
>Thanks,
>Jonathan
>
>P.S.: If someone would like more details about the research, a draft
>of the project statement can be found here:
>https://www.dropbox.com/s/qu3jpxcd4wpat2i/statement-0-r0-2013-03-19.pdf
>
>_______________________________________________
>Twisted-Python mailing list
>Twisted-Python@twistedmatrix.com
>http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
