extract to dictionaries

Mike Kazantsev mk.fraggod at gmail.com
Fri May 29 19:03:07 EDT 2009


On Thu, 28 May 2009 16:03:45 -0700 (PDT)
Marius Retegan <marius.s.retegan at gmail.com> wrote:

> Hello
> I have simple text file that I have to parse. It looks something like
> this:
> 
> parameters1
>      key1 value1
>      key2 value2
> end
> 
> parameters2
>      key1 value1
>      key2 value2
> end
> 
> So I want to create two dictionaries parameters1={key1:value1,
> key2:value2} and the same for parameters2.


You can use iterators to parse the file lazily, so it stays efficient no
matter how large the file is. The following code relies on line breaks and
the 'end' statement rather than on indentation.


  import itertools as it
  from string import whitespace as spaces

  with open('test.src') as src:
    # strip every line and drop the empty ones
    lines = it.ifilter(bool, it.imap(lambda x: x.strip(spaces), src))
    # groupby splits the stream into runs separated by 'end' lines; the
    # first line of each run is the section name, the rest are key/value pairs
    sections = ( (lines.next(), dict(it.imap(str.split, lines))) for sep,lines in
      it.groupby(lines, key=lambda x: x == 'end') if not sep )
    data = dict(sections)

  print data
  # { 'parameters2': {'key2': 'value2', 'key1': 'value1'},
  #  'parameters1': {'key2': 'value2', 'key1': 'value1'} }
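
The code above is written for Python 2 (it.ifilter, it.imap, lines.next()
and the print statement). As a rough sketch only, the same groupby approach
might look like this on Python 3. The function name parse_sections and the
inline sample string are illustrative assumptions, not part of the original
post; like the original, it assumes every key/value line splits into
exactly two whitespace-separated tokens.

```python
from itertools import groupby


def parse_sections(lines):
    """Build {section_name: {key: value}} from 'end'-terminated blocks."""
    # Strip whitespace and drop blank lines, lazily.
    stripped = (line.strip() for line in lines)
    nonblank = (line for line in stripped if line)
    data = {}
    # groupby alternates between runs of section lines (is_end False)
    # and runs of 'end' markers (is_end True); only the former are kept.
    for is_end, group in groupby(nonblank, key=lambda line: line == 'end'):
        if is_end:
            continue
        name = next(group)  # first line of the run is the section name
        # Each remaining line must split into exactly two tokens.
        data[name] = dict(line.split() for line in group)
    return data


# Hypothetical sample mirroring the file layout from the question.
sample = """\
parameters1
     key1 value1
     key2 value2
end

parameters2
     key1 value1
     key2 value2
end
"""

data = parse_sections(sample.splitlines())
print(data)
```

For a real file, passing the open file object should work the same way,
e.g. "with open('test.src') as src: data = parse_sections(src)", since the
function only needs an iterable of lines.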



To avoid the intermediate names, and make it a bit less readable, you can
write it as a one-liner:

  with open('test.src') as src:
    data = dict( (lines.next(), dict(it.imap(str.split, lines))) for sep,lines in
      it.groupby(it.ifilter(bool, it.imap(lambda x: x.strip(spaces), src)),
      key=lambda x: x == 'end') if not sep )
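
To see why both forms work, it may help to look at what groupby actually
yields for such input. The following is a small Python 3 illustration with
a made-up list of already stripped lines (the list itself is an assumption
for demonstration purposes): runs of ordinary lines get the key False, and
each 'end' line forms its own run with the key True, which the "if not sep"
filter then discards.

```python
from itertools import groupby

# Already stripped, non-empty lines, as the filtering step would produce.
lines = ['parameters1', 'key1 value1', 'key2 value2', 'end',
         'parameters2', 'key1 value1', 'end']

# Materialize each group so it can be printed; the parsing code instead
# consumes each group lazily, taking the first item as the section name.
groups = [(is_end, list(group))
          for is_end, group in groupby(lines, key=lambda line: line == 'end')]

for is_end, members in groups:
    print(is_end, members)
# False ['parameters1', 'key1 value1', 'key2 value2']
# True ['end']
# False ['parameters2', 'key1 value1']
# True ['end']
```

One consequence of this grouping is that a missing 'end' line would merge
two sections into a single run, so the approach depends on every section
being terminated properly.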


-- 
Mike Kazantsev // fraggod.net