[Tutor] Continue Matching after First Match
Martin Walsh
mwalsh at groktech.org
Sun May 20 21:04:31 CEST 2007
Hi Tom,
Tom Tucker wrote:
> Why the cStringIO stuff? The input data shown below is collected from
> os.popen. I was trying to find an easy way of matching my regex.
Ah, ldap...
> Matching with a string seemed easier than looping through the output
> collected. Hmm. Come to think of it, I guess I could match on the
> first "^dn", capture that output, and then keep looping until "^cn:" is
> seen. Then repeat.
Honestly, I'm not very good with regular expressions -- and try to avoid
them when possible. But in cases where they seem to be the best option,
I have formed a heavy dependence on regex debuggers like kodos.
http://kodos.sourceforge.net/
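That said, if you want to stay with a regex over the whole string, the usual way to keep matching after the first hit is re.finditer (or re.findall) rather than re.search. Here is a minimal sketch. The sample data is made up, since I don't have your real output, and the pattern assumes every record has a "cn:" line somewhere after its "dn:" line:

```python
import re

# Hypothetical sample data; the real output was snipped from this thread.
raw = """\
dn: uid=jdoe,ou=People,dc=example,dc=com
uid: jdoe
cn: John Doe

dn: uid=asmith,ou=People,dc=example,dc=com
uid: asmith
cn: Alice Smith
"""

# One record runs from a line starting with "dn:" to the first following
# line starting with "cn:". re.MULTILINE lets ^ and $ match at line
# boundaries; re.DOTALL lets .*? cross newlines between the two lines.
record = re.compile(
    r"^dn: (?P<dn>.*?)$.*?^cn: (?P<cn>.*?)$",
    re.MULTILINE | re.DOTALL,
)

# finditer resumes the search where the previous match ended, so every
# record is found, not just the first one.
matches = [(m.group("dn"), m.group("cn")) for m in record.finditer(raw)]

for dn, cn in matches:
    print(dn)
    print(cn)
```

If a record has no "cn:" line, the lazy .*? will run on into the next record, so treat this as a starting point rather than a robust parser.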
> Anyways, any suggestions to fix the below code?
<snip>
Have you had a look at the python-ldap package?
http://python-ldap.sourceforge.net/
You could probably access LDAP directly with Python, if that's an
option. Or, you could write your own LDIF parser on top of the ldif
module that ships with python-ldap (but make sure your data contains a
blank line between records, or the parser will choke with a
'ValueError: Two lines starting with dn: in one record.'):
import ldif
from cStringIO import StringIO

class MyLDIF(ldif.LDIFParser):
    def __init__(self, inputfile):
        ldif.LDIFParser.__init__(self, inputfile)
        self.users = []

    def handle(self, dn, entry):
        self.users.append((entry['uid'], entry['cn']))

raw = """\
<snip your ldif example with newlines added between dns>
"""

if __name__ == '__main__':
    io = StringIO(raw)
    lp = MyLDIF(io)
    lp.parse()
    for user in lp.users:
        uid = user[0][0]
        cn = user[1][0]
        print uid
        print cn
... or ...
You could also use ldif.LDIFRecordList directly, without creating a
custom parser class. It collects every record as a (dn, entry) tuple in
its all_records list. The module author warns that 'It can be a memory
hog!', and I can imagine this is true if you are working with a
particularly large ldap directory.
io = StringIO(raw)
directory = ldif.LDIFRecordList(io)
directory.parse()
for dn, entry in directory.all_records:
    print entry['uid'][0]
    print entry['cn'][0]
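One more note on the ValueError mentioned earlier: if the output you collect has the entries packed together with no blank line between them, something like the following could add the separators before you hand the text to the parser. This is only a sketch. The helper name separate_records and the sample text are mine, and it assumes plain "\n" line endings and that "dn:" only ever appears at the start of a line as the beginning of a new entry:

```python
import re

def separate_records(text):
    """Insert a blank line before each "dn:" line that lacks one."""
    # Match a newline that is not already preceded by a newline and that
    # is immediately followed by a line starting with "dn:".
    return re.sub(r"(?<!\n)\n(?=dn:)", "\n\n", text)

# Hypothetical input with two entries run together.
packed = "dn: a\nuid: x\ncn: X\ndn: b\nuid: y\ncn: Y\n"
fixed = separate_records(packed)
print(fixed)
```

Running it on text that already has the blank lines leaves the text unchanged, so it should be safe to apply unconditionally.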