OrderedDict

silver0346 at gmail.com
Thu Jun 9 06:02:21 EDT 2016


On Friday, May 20, 2016 at 7:15:38 AM UTC+2, silve... at gmail.com wrote:
> On Wednesday, May 18, 2016 at 2:25:16 PM UTC+2, Peter Otten wrote:
> > Chris Angelico wrote:
> > 
> > > On Wed, May 18, 2016 at 7:28 PM, Peter Otten <__peter__ at web.de> wrote:
> > >> I don't see an official way to pass a custom dict type to the library,
> > >> but if you are not afraid to change its source code the following patch
> > >> will allow you to access the value of dictionaries with a single entry as
> > >> d[0]:
> > >>
> > >> $ diff -u py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> > >> --- py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py  2016-05-18 11:18:44.000000000 +0200
> > >> +++ py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py  2016-05-18 11:11:13.417665697 +0200
> > >> @@ -35,6 +35,13 @@
> > >>  __version__ = '0.10.1'
> > >>  __license__ = 'MIT'
> > >>
> > >> +_OrderedDict = OrderedDict
> > >> +class OrderedDict(_OrderedDict):
> > >> +    def __getitem__(self, key):
> > >> +        if key == 0:
> > >> +            [result] = self.values()
> > >> +            return result
> > >> +        return _OrderedDict.__getitem__(self, key)
> > >>
> > >>  class ParsingInterrupted(Exception):
> > >>      pass
> > > 
> > > Easier than patching might be monkeypatching.
> > > 
> > > class OrderedDict(OrderedDict):
> > >     ... getitem code as above ...
> > > xmltodict.OrderedDict = OrderedDict
> > > 
> > > Try it, see if it works.
> > 
> > It turns out I was wrong on (at least) two counts:
> > 
> > - xmltodict does offer a way to specify the dict type
> > - the proposed dict implementation will not solve the OP's problem
> > 
> > Here is an improved fix which should work:
> > 
> > 
> > $ cat sample.xml 
> > <?xml version="1.0" encoding="utf-8" ?>
> > <profiles>
> >   <profile id='visio02' revision='2015051501' >
> >   <package package-id='0964-gpg4win' />
> >   </profile>
> > </profiles>
> > $ cat sample2.xml 
> > <?xml version="1.0" encoding="utf-8" ?>
> > <profiles>
> >   <profile id='visio02' revision='2015051501' >
> >   <package package-id='0964-gpg4win' />
> >   <package package-id='0965-gpg4win' />
> >   </profile>
> > </profiles>
> > $ cat demo.py
> > import collections
> > import sys
> > import xmltodict
> > 
> > 
> > class MyOrderedDict(collections.OrderedDict):
> >     def __getitem__(self, key):
> >         if key == 0 and len(self) == 1:
> >             return self
> >         return super(MyOrderedDict, self).__getitem__(key)
> > 
> > 
> > def main():
> >     filename = sys.argv[1]
> >     with open(filename) as f:
> >         doc = xmltodict.parse(f.read(), dict_constructor=MyOrderedDict)
> > 
> >     print "doc:\n{}\n".format(doc)
> >     print "package-id: {}".format(
> >         doc['profiles']['profile']['package'][0]['@package-id'])
> > 
> > 
> > if __name__ == "__main__":
> >     main()
> > $ python demo.py sample.xml 
> > doc:
> > MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile', 
> > MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'), 
> > (u'package', MyOrderedDict([(u'@package-id', u'0964-gpg4win')]))]))]))])
> > 
> > package-id: 0964-gpg4win
> > $ python demo.py sample2.xml 
> > doc:
> > MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile', 
> > MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'), 
> > (u'package', [MyOrderedDict([(u'@package-id', u'0964-gpg4win')]), 
> > MyOrderedDict([(u'@package-id', u'0965-gpg4win')])])]))]))])
> > 
> > package-id: 0964-gpg4win
> 
> I have tested the first solution. It works nicely. Before that, I used xml.etree to parse 2000 XML files.
> 
> Execution time decreased from more than 5 min to 20 sec. Great. On the weekend I will test the solution with the custom class.
> 
> Many thanks.

Hi all,

The tests of the solution with the custom class were successful. Nice inspiration. I use this solution in my Django script.

Many thanks.
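
For anyone reading this in the archive, here is a minimal sketch of why the first proposed dict class does not help with the package lookup while the second one does. It is written for Python 3 and uses only the standard library. It is not the code from my script: the class names FirstValueDict and SelfIndexDict and the helper package_id are made up for illustration, and the sample data is built by hand as a stand-in for what the parser returns with one or with several <package> elements.

```python
from collections import OrderedDict


class FirstValueDict(OrderedDict):
    """First proposal: d[0] returns the only value of a one-entry dict."""

    def __getitem__(self, key):
        if key == 0:
            # Unpacking raises ValueError unless there is exactly one entry.
            [result] = self.values()
            return result
        return super().__getitem__(key)


class SelfIndexDict(OrderedDict):
    """Improved proposal: d[0] returns the dict itself if it has one entry."""

    def __getitem__(self, key):
        if key == 0 and len(self) == 1:
            return self
        return super().__getitem__(key)


def package_id(package):
    # Hypothetical helper: the same access pattern must work for a single
    # package (a dict) and for several packages (a list of dicts).
    return package[0]['@package-id']


# Hand-built stand-ins for the parsed 'package' value (an assumption).
single = SelfIndexDict([('@package-id', '0964-gpg4win')])
several = [
    SelfIndexDict([('@package-id', '0964-gpg4win')]),
    SelfIndexDict([('@package-id', '0965-gpg4win')]),
]

print(package_id(single))   # 0964-gpg4win
print(package_id(several))  # 0964-gpg4win

# With the first proposal, [0] already yields the attribute value (a str),
# so the following ['@package-id'] lookup fails.
first = FirstValueDict([('@package-id', '0964-gpg4win')])
try:
    package_id(first)
except TypeError as exc:
    print("first proposal fails:", exc)
```

The second class makes a one-entry dict behave like a one-element list for index 0, which is what lets the same lookup work on both sample files. Note that a one-entry dict whose only key is itself 0 would also return the dict instead of the value, which does not matter for attribute and tag names but is worth knowing.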

