Exporting member info (including options)
Is there a good way to export all of the member info for a list?
I want all the mailing list options as well as the email address, so that I can import it into MySQL and use the MySQL adaptor.
I know "list_members" gives just the email addresses.
Is the best method for me to use dumpdb to access the "pickle" files and then write a script to parse that?
Aaron
-- Energy Justice Communities Map Developer - http://www.energyjustice.net/map
Aaron Kreider wrote:
Is there a good way to export all of the member info for a list?
I want all the mailing list options as well as the email address, so that I can import it into MySQL and use the MySQL adaptor.
There is a script at <http://www.msapiro.net/scripts/mailman-subscribers.py> (mirrored at <http://fog.ccsf.cc.ca.us/~msapiro/scripts/mailman-subscribers.py>) that will do this by screen scraping the web admin Membership List pages, but if I were doing it, I would just create a withlist script to do it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
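A withlist script of the kind suggested above might look like the following minimal sketch. It assumes the Mailman 2.1 MemberAdaptor accessors getMembers(), getDigestMemberKeys(), getMemberName(), getMemberPassword(), getMemberLanguage(), getMemberOption() and getDeliveryStatus(). The module name export_members.py, the output key names and the FakeList stand-in are hypothetical, and the option bit values are hard-coded so the sketch runs without Mailman installed.

```python
import json

# Option bits as defined in Mailman 2.1's Defaults.py, hard-coded here so the
# sketch runs outside Mailman (a real script could read them from mm_cfg).
OPTION_BITS = (1, 2, 4, 8, 16, 32, 64, 128, 256)


def export_members(mlist):
    """Collect one dict per member using only MemberAdaptor accessors."""
    digest_members = set(mlist.getDigestMemberKeys())
    rows = []
    for addr in mlist.getMembers():
        # Rebuild the user_options integer from the individual option bits.
        options = sum(bit for bit in OPTION_BITS
                      if mlist.getMemberOption(addr, bit))
        rows.append({
            "address": addr,
            "name": mlist.getMemberName(addr),
            "password": mlist.getMemberPassword(addr),
            "lang": mlist.getMemberLanguage(addr),
            "digest": addr in digest_members,
            "user_options": options,
            "delivery_status": mlist.getDeliveryStatus(addr),
        })
    return rows


def dump_members(mlist):
    """Entry point for withlist: print the member rows as JSON."""
    print(json.dumps(export_members(mlist), indent=2))


class FakeList(object):
    """Hypothetical stand-in for a Mailman list, only for trying the sketch."""

    def __init__(self, members):
        self._members = members  # address -> dict of stored values

    def getMembers(self):
        return sorted(self._members)

    def getDigestMemberKeys(self):
        return [a for a, m in self._members.items() if m["digest"]]

    def getMemberName(self, addr):
        return self._members[addr]["name"]

    def getMemberPassword(self, addr):
        return self._members[addr]["password"]

    def getMemberLanguage(self, addr):
        return self._members[addr]["lang"]

    def getMemberOption(self, addr, bit):
        return bool(self._members[addr]["user_options"] & bit)

    def getDeliveryStatus(self, addr):
        return self._members[addr]["delivery_status"]


example = FakeList({
    "a@example.org": {"name": "A", "password": "secret", "lang": "en",
                      "digest": False, "user_options": 18,
                      "delivery_status": 0},
})
rows = export_members(example)
```

With a real installation this would presumably be run along the lines of bin/withlist -r export_members.dump_members listname, with export_members.py placed somewhere on withlist's module search path. Bounce data is left out of the sketch.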
Aaron Kreider wrote:
Is there a good way to export all of the member info for a list?
I want all the mailing list options as well as the email address, so that I can import it into MySQL and use the MySQL adaptor.
There is a script at <http://www.msapiro.net/scripts/mailman-subscribers.py> (mirrored at <http://fog.ccsf.cc.ca.us/~msapiro/scripts/mailman-subscribers.py>) that will do this by screen scraping the web admin Membership List pages, but if I were doing it, I would just create a withlist script to do it.
I thought when you kicked in the mysql adaptor, it populated the mysql table with all that data... you're saying it doesn't, and you have to start from scratch?
Bob
Bob Puff wrote:
I thought when you kicked in the mysql adaptor, it populated the mysql table with all that data... you're saying it doesn't, and you have to start from scratch?
If you start a new list with any of the MysqlMemberships.py adaptors that I have seen, it creates the necessary database tables.
If you switch an existing list from OldStyleMemberships.py to MysqlMemberships.py, you have replaced all the list methods that access the list's config.pck for member data with ones that access the MySQL database. Thus, MysqlMemberships.py has no way to get the old membership data and you have to somehow get it before the switch to populate the new database.
Ideally, there would be a utility to extract the membership data, populate the MySQL database and switch the member adaptor, but as far as I know, there is no such utility.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
I'm figuring out where to get the information from the config.pck (pickle file) for each list to populate the MySQL table.

It seems like the MySQL format is extracting some of its fields from the user_options fields, is this true? If this is true, it is confusing because several of the user_options are listed in their own field AND in the user_options field.

***MySQL field -- Pickle Source***
address -- members object
hide -- useroptions (16)
nomail -- useroptions (128) - am I right about this?
ack -- useroptions (4)
not_metoo -- useroptions (2)
digest -- digest_members object
plain -- useroptions (8)
password -- password object
lang -- language object
name -- usernames object
one_last_digest -- one_last_digest object. I guess the email would be listed here if I had any people using this setting.
user_options -- user_options
delivery_status -- This seems to be a combination of user_options settings "1" and "2", am I right?
topics_userinterest -- there is a topics_userinterest object which I'm not using.
delivery_status_timestamp - I don't see this listed in config.pck. Where is this? I guess I'm not too worried about it as it seems like a temporary thing to do with bounces.

I'm skipping the following fields which store information about bounces as I'm fine with losing my bounce data.
bi_cookie
bi_score
bi_noticesleft
bi_lastnotice
bi_date

------------------------------------

BTW, I exported the pickle into JSON so I could use it in PHP. Used a python script from: http://stackoverflow.com/questions/3040872/pythons-cpickle-deserialization-f...
# pickle2json.py
import sys, optparse, cPickle, os
try:
    import json
except:
    import simplejson as json

# Setup the arguments this script can accept from the command line
parser = optparse.OptionParser()
parser.add_option('-p','--pickled_data_path',dest="pickled_data_path",type="string",help="Path to the file containing pickled data.")
parser.add_option('-j','--json_data_path',dest="json_data_path",type="string",help="Path to where the json data should be saved.")
opts,args=parser.parse_args()

# Load in the pickled data from either a file or the standard input stream
if opts.pickled_data_path:
    unpickled_data = cPickle.loads(open(opts.pickled_data_path).read())
else:
    unpickled_data = cPickle.loads(sys.stdin.read())

# Output the json version of the data either to another file or to the standard output
if opts.json_data_path:
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
else:
    print unpickled_data
Aaron Kreider wrote:
I'm figuring out where to get the information from the config.pck (pickle file) for each list to populate the MySQL table.
It seems like the MySQL format is extracting some of its fields from the user_options fields, is this true?
user_options is used by OldStyleMemberships.py to store the user's options defined by the bit definitions (from Defaults.py)

# Bitfield for user options. See DEFAULT_NEW_MEMBER_OPTIONS above to set
# defaults for all new lists.
Digests = 0 # handled by other mechanism, doesn't need a flag.
DisableDelivery = 1 # Obsolete; use set/getDeliveryStatus()
DontReceiveOwnPosts = 2 # Non-digesters only
AcknowledgePosts = 4
DisableMime = 8 # Digesters only
ConcealSubscription = 16
SuppressPasswordReminder = 32
ReceiveNonmatchingTopics = 64
Moderate = 128
DontReceiveDuplicates = 256

but these options are accessed and set via the list (set|get)MemberOption() methods defined by the MemberAdaptor and can be stored internally by the MemberAdaptor in any way it wants. There is no requirement that the MemberAdaptor store an integer field with value = the sum of the option bits; however, if it doesn't, it complicates the setting of member options from new_member_options in the addNewMember() method.
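As an illustration of how the bitfield above can be read, here is a minimal sketch that turns a stored user_options integer into the names of the bits that are set. The bit values are the ones listed from Defaults.py; the function name decode_user_options is hypothetical and is not part of Mailman.

```python
# Bit values copied from the Defaults.py excerpt above. Digests (0) is
# omitted because it is handled by another mechanism and has no flag.
OPTION_BITS = [
    ("DisableDelivery", 1),          # obsolete, see set/getDeliveryStatus()
    ("DontReceiveOwnPosts", 2),
    ("AcknowledgePosts", 4),
    ("DisableMime", 8),
    ("ConcealSubscription", 16),
    ("SuppressPasswordReminder", 32),
    ("ReceiveNonmatchingTopics", 64),
    ("Moderate", 128),
    ("DontReceiveDuplicates", 256),
]


def decode_user_options(value):
    """Return the names of the option bits set in a user_options integer."""
    return [name for name, bit in OPTION_BITS if value & bit]


# A member who hides their address and does not receive their own posts.
example = decode_user_options(2 | 16)
```

Here example holds the two names DontReceiveOwnPosts and ConcealSubscription, and a stored value of 128 decodes to Moderate only.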
If this is true, it is confusing because several of the user_options are listed in their own field AND in the user_options field.
Examination of <http://trac.rezo.net/trac/rezo/browser/Mailman/MySQLMemberAdaptor/MysqlMembe...> (is that what you're using?) shows that it stores and uses user_options in the same way as OldStyleMemberships.py. hide, ack, not_metoo and plain are defined fields in the MySQL database table, but are otherwise unreferenced by MysqlMemberships.py.
***MySQL field -- Pickle Source***
address -- members object
hide -- useroptions (16)
nomail -- useroptions (128) - am I right about this?
No. nomail is not a user_options bit although it used to be the DisableDelivery bit. The 128 bit is Moderate.
ack -- useroptions (4)
not_metoo -- useroptions (2)
digest -- digest_members object
plain -- useroptions (8)
password -- password object
lang -- language object
name -- usernames object
one_last_digest -- one_last_digest object. I guess the email would be listed here if I had any people using this setting.
This setting is used when a member switches from digest to non-digest if there are digest messages pending at the time of the switch. It is effectively unused by MysqlMemberships.py. Whatever you put in the MySQL one_last_digest field is irrelevant because there is no MemberAdaptor method to query it or return a list of members for which one_last_digest is true. The code in ToDigest.py is

    drecips = mlist.getDigestMemberKeys() + mlist.one_last_digest.keys()

so unless the MemberAdaptor maintains the list attribute one_last_digest dictionary, it doesn't work anyway. And, MysqlMemberships.py doesn't even maintain the MySQL database field.
user_options -- user_options
delivery_status -- This seems to be a combination of user_options settings "1" and "2", am I right?
delivery_status is no longer a user_options bit. It is a separate entity maintained with (set|get)DeliveryStatus() as a tuple with possible values

# Delivery statuses
ENABLED = 0 # enabled
UNKNOWN = 1 # legacy disabled
BYUSER = 2 # disabled by user choice
BYADMIN = 3 # disabled by admin choice
BYBOUNCE = 4 # disabled by bounces

along with a timestamp.
topics_userinterest -- there is a topics_userinterest object which I'm not using.
delivery_status_timestamp - I don't see this listed in config.pck. Where is this? I guess I'm not too worried about it as it seems like a temporary thing to do with bounces.
It is used for the getDeliveryStatusChangeTime() method. With OldStyleMemberships.py, delivery_status is a dictionary keyed by member with values of tuples consisting of the status (0 - 4 as above) and the time it was set.

I think you would be much better off looking at the various get and set methods in OldStyleMemberships.py and comparing them to the ones defined in MysqlMemberships.py to see what MySQL database items correspond to the list member attributes stored by OldStyleMemberships.py.

-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
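To make the (status, time) layout concrete, here is a minimal sketch that maps a delivery_status dictionary of the shape described above onto the two MySQL fields delivery_status and delivery_status_timestamp. The function name, the sample data and the choice to report 0 for both values when a member has no entry are assumptions for illustration; the timestamp is left as the stored number because the column type in the database is not stated in this thread.

```python
# Status codes as listed in the thread.
DELIVERY_STATUS_NAMES = {
    0: "ENABLED",
    1: "UNKNOWN",
    2: "BYUSER",
    3: "BYADMIN",
    4: "BYBOUNCE",
}


def delivery_columns(delivery_status, member):
    """Return the two delivery fields for one member.

    delivery_status is a dict keyed by member address whose values are
    (status, time_set) tuples. A member without an entry is treated as
    enabled with a timestamp of 0 (an assumption of this sketch).
    """
    status, changed = delivery_status.get(member.lower(), (0, 0))
    return {
        "delivery_status": status,
        "delivery_status_timestamp": changed,
    }


# Hypothetical sample data: one member disabled by bounces.
sample = {"bouncer@example.org": (4, 1168214400.0)}
disabled = delivery_columns(sample, "bouncer@example.org")
enabled = delivery_columns(sample, "other@example.org")
```

DELIVERY_STATUS_NAMES[disabled["delivery_status"]] gives BYBOUNCE for the first member, while the second member comes back as status 0 (ENABLED).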
Examination of <http://trac.rezo.net/trac/rezo/browser/Mailman/MySQLMemberAdaptor/MysqlMembe...> (is that what you're using?) shows that it stores and uses user_options in the same way as OldStyleMemberships.py. hide, ack, not_metoo and plain are defined fields in the MySQL database table, but are otherwise unreferenced by MysqlMemberships.py.
I'm using that version.
This is very strange - why are there fields in the mysql table that aren't in use? hide, nomail, ack, not_metoo, and plain
I'm guessing they were either deprecated or the programmer planned to use them in the future.
Aaron
Aaron Kreider wrote:
This is very strange - why are there fields in the mysql table that aren't in use? hide, nomail, ack, not_metoo, and plain
I'm guessing they were either deprecated or the programmer planned to use them in the future.
They were not deprecated in the sense of having at one time been defined in the member data in the OldStyleMemberships.py list object. They have never been attributes of the list object other than as bits in user_options.
These fields also exist (unreferenced) in the table in Kev Green's MysqlMemberships.py <https://bugs.launchpad.net/mailman/+bug/558093>, upon which Fil's is based, so my guess is Kev intended to use them at some point but never got to it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
When I tried to depickle a list, I got an "Import Error No Module named Mailman.Bouncer".
I suspect this is for a list that has bounces. I was able to depickle a much smaller list.
Could it be a users permission issue? I tried running the script as root, but it still gave me the error. I'm using this python script, running it in the mailman/Mailman directory:
import sys, optparse, cPickle, os
try:
    import json
except:
    import simplejson as json

# Setup the arguments this script can accept from the command line
parser = optparse.OptionParser()
parser.add_option('-p','--pickled_data_path',dest="pickled_data_path",type="string",help="Path to the file containing pickled data.")
parser.add_option('-j','--json_data_path',dest="json_data_path",type="string",help="Path to where the json data should be saved.")
opts,args=parser.parse_args()

# Load in the pickled data from either a file or the standard input stream
if opts.pickled_data_path:
    unpickled_data = cPickle.loads(open(opts.pickled_data_path).read())
else:
    unpickled_data = cPickle.loads(sys.stdin.read())

# Output the json version of the data either to another file or to the standard output
if opts.json_data_path:
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
else:
    print unpickled_data
Full Error:
  File "pickle2jsonroot.py", line 15, in ?
    unpickled_data = cPickle.loads(open(opts.pickled_data_path).read())
ImportError: No module named Mailman.Bouncer
Aaron Kreider wrote:
When I tried to depickle a list, I got an "Import Error No Module named Mailman.Bouncer".
I suspect this is for a list that has bounces. I was able to depickle a much smaller list.
Yes, the issue only occurs for lists with members that have bounce_info.
There are at least three ways around this.
Create your script as a withlist script and run it via withlist (see bin/withlist --help), or
run your script from Mailman's bin/ directory and 'import paths' at the beginning of the script, or
copy Mailman's bin/paths.py to the directory from which you want to run the script, and 'import paths' at the beginning of the script.
Could it be a users permission issue? I tried running the script as root, but it still gave me the error. I'm using this python script, running it in the mailman/Mailman directory:
It would probably also work as is if you ran it in the mailman/ directory, but I think the other methods are preferable.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
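The ImportError arises because a pickle records only the module and class name of each instance it contains, so the module has to be importable again at load time. The following minimal sketch reproduces the effect without Mailman; the module fake_bouncer and the class FakeBounceInfo are hypothetical stand-ins for Mailman.Bouncer and its bounce info class.

```python
import pickle
import sys
import types

# Create a throwaway module "fake_bouncer" (hypothetical; it plays the role
# of Mailman.Bouncer) holding a class whose instances we will pickle.
fake_module = types.ModuleType("fake_bouncer")
FakeBounceInfo = type("FakeBounceInfo", (object,),
                      {"__module__": "fake_bouncer"})
fake_module.FakeBounceInfo = FakeBounceInfo
sys.modules["fake_bouncer"] = fake_module

info = FakeBounceInfo()
info.score = 1.0
blob = pickle.dumps({"member@example.org": info})


def try_load(data):
    """Unpickle data, reporting an ImportError instead of raising it."""
    try:
        return pickle.loads(data)
    except ImportError as err:
        return "ImportError: %s" % err


# While the module is importable, unpickling succeeds.
loaded = try_load(blob)

# Once the module can no longer be found, the same bytes fail to load.
del sys.modules["fake_bouncer"]
failed = try_load(blob)
```

This mirrors why a list without bounce_info unpickles from any directory while a list with bounce_info needs Mailman's package on the import path.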
On 8/17/2010 7:12 PM, Mark Sapiro wrote:
There are at least three ways around this. run your script from Mailman's bin/ directory and 'import paths' at the beginning of the script, or
This worked. Now I get a JSON error:
  File "pickle2json.py", line 21, in ?
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
  File "build/bdist.linux-x86_64/egg/simplejson/__init__.py", line 261, in dumps
  File "build/bdist.linux-x86_64/egg/simplejson/encoder.py", line 214, in encode
  File "build/bdist.linux-x86_64/egg/simplejson/encoder.py", line 282, in iterencode
  File "build/bdist.linux-x86_64/egg/simplejson/encoder.py", line 190, in default
TypeError: <bounce info for member email@domain.org
    current score: 1.0
    last bounce date: (2007, 1, 8)
    email notices left: 3
    last notice date: (1970, 1, 1)
    confirmation cookie: None
is not JSON serializable
I also tried adding skipkeys=True to the json.dumps() call, but that didn't work:
open(opts.json_data_path, 'w').write(json.dumps(unpickled_data, skipkeys=True))
Some Googling on this topic suggests that simplejson cannot handle decimal places. Is that true? It might also be having a problem with the date objects.
I might just manually delete all the bounce data for all the lists (around 20) in the mailman admin interface, so I can export them.
Aaron
Aaron Kreider wrote:
This worked. Now I get a JSON error:
  File "pickle2json.py", line 21, in ?
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
  File "build/bdist.linux-x86_64/egg/simplejson/__init__.py", line 261, in dumps
  File "build/bdist.linux-x86_64/egg/simplejson/encoder.py", line 214, in encode
  File "build/bdist.linux-x86_64/egg/simplejson/encoder.py", line 282, in iterencode
  File "build/bdist.linux-x86_64/egg/simplejson/encoder.py", line 190, in default
TypeError: <bounce info for member email@domain.org
    current score: 1.0
    last bounce date: (2007, 1, 8)
    email notices left: 3
    last notice date: (1970, 1, 1)
    confirmation cookie: None
is not JSON serializable
The problem is that a member's bounce_info is an instance of the Mailman.Bouncer._BounceInfo class (this is why your script needs to be able to find the Mailman.Bouncer module to unpickle a list with bounce_info).
JSON can't represent this class instance directly, just as MysqlMemberships.py can't store it directly in the MySQL database. This is why MysqlMemberships.py takes the class instance's attributes score, date, noticesleft, lastnotice and cookie and stores them as separate fields bi_score, bi_date, bi_noticesleft, bi_lastnotice and bi_cookie in the MySQL database table.
Your script needs to do a similar thing since it's the attribute values you need to put in the database anyway.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
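One way to do the "similar thing" is to hand json.dumps() a default= hook that flattens each bounce info instance into the five bi_* values. This is a minimal sketch, not the method used by MysqlMemberships.py: FakeBounceInfo is a hypothetical stand-in that only carries the five attributes named above, and the helper names are made up for illustration. Note that skipkeys=True only skips dictionary keys of unsupported types, not unsupported values, which is consistent with it not helping here.

```python
import json

BOUNCE_ATTRS = ("score", "date", "noticesleft", "lastnotice", "cookie")


class FakeBounceInfo(object):
    """Hypothetical stand-in with the five attributes named in the thread."""

    def __init__(self, score, date, noticesleft, lastnotice, cookie):
        self.score = score
        self.date = date
        self.noticesleft = noticesleft
        self.lastnotice = lastnotice
        self.cookie = cookie


def flatten_bounce_info(info):
    """Map the instance attributes onto the bi_* field names."""
    return dict(("bi_" + name, getattr(info, name)) for name in BOUNCE_ATTRS)


def encode_unknown(obj):
    """json.dumps() default hook: flatten anything that looks like bounce info."""
    if all(hasattr(obj, name) for name in BOUNCE_ATTRS):
        return flatten_bounce_info(obj)
    raise TypeError("%r is not JSON serializable" % (obj,))


# Sample values taken from the traceback quoted above.
data = {
    "bounce_info": {
        "email@domain.org": FakeBounceInfo(1.0, (2007, 1, 8), 3,
                                           (1970, 1, 1), None),
    },
}
text = json.dumps(data, default=encode_unknown)
decoded = json.loads(text)
```

The date tuples come out as JSON arrays such as [2007, 1, 8]; converting them to whatever date format the database columns expect is left to the import step.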
On Aug 12, 2010, at 02:00 PM, Aaron Kreider wrote:
Is there a good way to export all of the member info for a list?
Mailman 3 has two related, but incomplete approaches that might provide some useful examples. There is an "export" command that dumps the various configurations to an XML format file. See src/mailman/bin/export.py
There's also an "import" command which will be used to import a Mailman 2 config.pck file into the schema used by Mailman 3. This is kind of working and there are some doctests for this, but it still needs work to be a complete solution. See src/mailman/command/cli_import.py for details.
-Barry
participants (4): Aaron Kreider, Barry Warsaw, Bob Puff, Mark Sapiro