[XML-SIG] Help parsing XML
Dan Gunter
dkgunter at lbl.gov
Wed Mar 30 01:00:36 CEST 2005
I can suggest where to start. You could use XSLT to transform it first,
i.e. into <HL> ..stuff.. </HL> sections. Recipe 6.8 of the XSLT Cookbook
(O'Reilly), "Deepening an XML hierarchy", should help. Or you could
stream the tree through a parser (e.g. PullDOM or ElementTree) and write
the program logic to transform it in Python. This assumes that what's
between HL tags fits into memory, but the XSLT approach probably has the
same limitation. Hope that helps.
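
For the Python route, here is a minimal sketch using ElementTree. It
assumes the HL segments and the records that follow them are direct
children of the 2000B loop, as in the listing quoted below. The sample
document and the helper name group_by_hl are made up for illustration;
the real claim file will have more content inside each element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down sample modeled on the quoted listing.
SAMPLE = """\
<loop id='2000B'>
  <seg id='HL'/>
  <seg id='SBR'/>
  <loop id='2010BA'/>
  <loop id='2010BB'/>
  <loop id='2300'/>
  <seg id='HL'/>
  <seg id='SBR'/>
  <loop id='2010BA'/>
  <loop id='2010BB'/>
  <loop id='2300'/>
</loop>
"""


def group_by_hl(parent):
    """Split the direct children of `parent` into lists of elements.

    A new list is started at every <seg id='HL'>. Children that appear
    before the first HL segment are skipped.
    """
    groups = []
    for child in parent:
        if child.tag == 'seg' and child.get('id') == 'HL':
            groups.append([])
        if groups:
            groups[-1].append(child)
    return groups


loop_2000b = ET.fromstring(SAMPLE)
groups = group_by_hl(loop_2000b)
for number, group in enumerate(groups, 1):
    print(number, [element.get('id') for element in group])
```

In a full document you would first locate the loop, for example by
iterating over root.iter('loop') and checking get('id') == '2000B',
then pass each match to the helper. Each returned list is one packet of
records, however many segments it happens to contain.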
Greg Lindstrom wrote:
> Hello-
> I have a general (I guess) XML parsing question that I hope has an
> answer. I am busy parsing health care claim records using XPath and
> do not see a way to parse the following (stripped down) file (I've
> added lines to group my problem...)
>  1. + <seg id='ST'>
>  2. + <loop id='HEADER'>
>  3. - <loop id='DETAIL'>
>  4.   - <loop id='2000A'>
>  5.     + <seg id='HL'>
>  6.     + <loop id='2000AA'>
>  7.     + <loop id='2000B'>
>  8.       + <seg id='HL'>       -----+
>  9.       + <seg id='SBR'>           |
> 10.       + <loop id='2010BA'>       | Group 1
> 11.       + <loop id='2010BB'>       |
> 12.       + <loop id='2300'>    -----+
> 13.       + <seg id='HL'>       -----+
> 14.       + <seg id='SBR'>           |
> 15.       + <loop id='2010BA'>       |
> 16.       + <loop id='2010BB'>       | Group 2
> 17.       + <loop id='2300'>    -----+
> 18.     </loop>
> 19.   </loop>
> 20. </loop>
> What I need to do is process the records from lines 8-12 as a group,
> then the records from lines 13-17 as another group. Each of the "HL"
> segments indicates the beginning of a new set of records to process.
> I would think that the XML should (would/could) be defined so that
> each of the HL statements would start a new loop structure, but that's
> not how it's defined and I can't change it. There is no way of
> knowing how many lines will be in each set of records, or how many HL
> segments will be beneath the 2000B loop, so is there a way I can
> logically group the record segments together to form a packet of
> records to process?
> Thanks for any attention/help you can pass my way.
> --greg