OrderedDict
silver0346 at gmail.com
Thu Jun 9 06:02:21 EDT 2016
On Friday, May 20, 2016 at 7:15:38 AM UTC+2, silve... at gmail.com wrote:
> On Wednesday, May 18, 2016 at 2:25:16 PM UTC+2, Peter Otten wrote:
> > Chris Angelico wrote:
> >
> > > On Wed, May 18, 2016 at 7:28 PM, Peter Otten <__peter__ at web.de> wrote:
> > >> I don't see an official way to pass a custom dict type to the library,
> > >> but if you are not afraid to change its source code the following patch
> > >> will allow you to access the value of dictionaries with a single entry as
> > >> d[0]:
> > >>
> > >> $ diff -u py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> > >> --- py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py 2016-05-18 11:18:44.000000000 +0200
> > >> +++ py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py 2016-05-18 11:11:13.417665697 +0200
> > >> @@ -35,6 +35,13 @@
> > >>  __version__ = '0.10.1'
> > >>  __license__ = 'MIT'
> > >>
> > >> +_OrderedDict = OrderedDict
> > >> +class OrderedDict(_OrderedDict):
> > >> +    def __getitem__(self, key):
> > >> +        if key == 0:
> > >> +            [result] = self.values()
> > >> +            return result
> > >> +        return _OrderedDict.__getitem__(self, key)
> > >>
> > >>  class ParsingInterrupted(Exception):
> > >>      pass
> > >
> > > Easier than patching might be monkeypatching.
> > >
> > > class OrderedDict(OrderedDict):
> > >     ... getitem code as above ...
> > > xmltodict.OrderedDict = OrderedDict
> > >
> > > Try it, see if it works.
> >
> > It turns out I was wrong on (at least) two counts:
> >
> > - xmltodict does offer a way to specify the dict type
> > - the proposed dict implementation will not solve the OP's problem
> >
> > Here is an improved fix which should work:
> >
> >
> > $ cat sample.xml
> > <?xml version="1.0" encoding="utf-8" ?>
> > <profiles>
> >   <profile id='visio02' revision='2015051501' >
> >     <package package-id='0964-gpg4win' />
> >   </profile>
> > </profiles>
> > $ cat sample2.xml
> > <?xml version="1.0" encoding="utf-8" ?>
> > <profiles>
> >   <profile id='visio02' revision='2015051501' >
> >     <package package-id='0964-gpg4win' />
> >     <package package-id='0965-gpg4win' />
> >   </profile>
> > </profiles>
> > $ cat demo.py
> > import collections
> > import sys
> > import xmltodict
> >
> >
> > class MyOrderedDict(collections.OrderedDict):
> >     def __getitem__(self, key):
> >         if key == 0 and len(self) == 1:
> >             return self
> >         return super(MyOrderedDict, self).__getitem__(key)
> >
> >
> > def main():
> >     filename = sys.argv[1]
> >     with open(filename) as f:
> >         doc = xmltodict.parse(f.read(), dict_constructor=MyOrderedDict)
> >
> >     print "doc:\n{}\n".format(doc)
> >     print "package-id: {}".format(
> >         doc['profiles']['profile']['package'][0]['@package-id'])
> >
> >
> > if __name__ == "__main__":
> >     main()
> > $ python demo.py sample.xml
> > doc:
> > MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile',
> > MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
> > (u'package', MyOrderedDict([(u'@package-id', u'0964-gpg4win')]))]))]))])
> >
> > package-id: 0964-gpg4win
> > $ python demo.py sample2.xml
> > doc:
> > MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile',
> > MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
> > (u'package', [MyOrderedDict([(u'@package-id', u'0964-gpg4win')]),
> > MyOrderedDict([(u'@package-id', u'0965-gpg4win')])])]))]))])
> >
> > package-id: 0964-gpg4win
>
> I have tested the first solution. It works nicely. Before, I used xml.etree to parse 2000 XML files.
>
> Execution time decreased from more than 5 min to 20 sec. Great. On the weekend I will test the solution with the custom class.
>
> Many thanks.
Hi all,
The tests of the solution with the custom class were successful. Nice inspiration. I am using this solution in my Django script.
Many thanks.
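
For anyone reading this later, here is a minimal Python 3 sketch of the custom-class idea quoted above. It uses only the standard library, so the parser is left out and the parsed results are built by hand. The class name SingleOrListDict and the hand-built data are assumptions for illustration, not code from the thread or from xmltodict.

```python
from collections import OrderedDict


class SingleOrListDict(OrderedDict):
    """Hypothetical stand-in for the MyOrderedDict class quoted above."""

    def __getitem__(self, key):
        # A one-entry dict answers d[0] with itself, so callers can index
        # a lone element the same way they index a list of elements.
        if key == 0 and len(self) == 1:
            return self
        return super().__getitem__(key)


# Shape of the result for one <package> element: a single mapping.
single = SingleOrListDict([("@package-id", "0964-gpg4win")])

# Shape of the result for two <package> elements: a list of mappings.
several = [
    SingleOrListDict([("@package-id", "0964-gpg4win")]),
    SingleOrListDict([("@package-id", "0965-gpg4win")]),
]

# The same access expression works for both shapes.
first_of_single = single[0]["@package-id"]
first_of_several = several[0]["@package-id"]
print(first_of_single, first_of_several)
```

One caveat that follows from the len(self) == 1 check: a lone element that carries two or more attributes produces a dict with more than one key, so [0] on it would still raise KeyError.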