Trying to parse a HUGE(1gb) xml file
spaceman-spiff
ashish.makani at gmail.com
Mon Dec 20 15:29:01 EST 2010
Hi Usernet
First up, thanks for your prompt reply.
I will make sure I read RFC 1855 before posting again, but right now I am chasing a hard deadline :)
I am sorry I left out what exactly I am trying to do.
0. Goal: I am looking for a specific element. There are tens to hundreds of occurrences of that element in the 1 GB XML file.
The XML is just a dump of config parameters from a packet switch (although IMHO the contents of the XML don't matter).
I need to detect each occurrence and, for each one, copy all the content between the element's start and end tags into a smaller XML file of its own.
1. Can you point me to some examples of using SAX, especially ones dealing with really large XML files?
2. This brings me to another question, which I forgot to ask in my original post (OP).
Is simply opening the file and using a regex to look for the element I need a *good* approach?
While researching my problem, an article I found seemed to advise against this, especially since it is known a priori that the file is XML, and since regex code gets complicated very quickly and is not very readable.
But is that just a "style"/"elegance" issue? For my particular problem (detecting a certain element, then writing a smaller XML file for each pair of start and end tags of that element), is the open-file-and-regex approach something you would recommend?
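To make question 2 concrete, here is a rough sketch of the kind of open-file-and-regex code I have in mind. It is only an illustration, not something I have run against the real dump: the element name "port", the function name extract_elements and the output file naming are all made up for this example, and it assumes the element is never nested inside itself and that its tags never appear inside comments, CDATA sections or attribute values.

```python
import mmap
import re


def extract_elements(src_path, tag, out_prefix):
    """Write each <tag>...</tag> span found in src_path to its own file.

    Hypothetical helper for illustration only. Returns the list of
    output paths it wrote.
    """
    name = re.escape(tag.encode("ascii"))
    # Start tag (with optional attributes), then a non-greedy match up
    # to the nearest end tag. This breaks if the element nests inside
    # itself or if an attribute value contains '>'.
    pattern = re.compile(
        rb"<" + name + rb"(?:\s[^>]*)?>.*?</" + name + rb"\s*>",
        re.DOTALL,
    )
    written = []
    with open(src_path, "rb") as src:
        # Map the file rather than reading the whole 1 GB into a string.
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as data:
            for i, match in enumerate(pattern.finditer(data)):
                out_path = "%s_%d.xml" % (out_prefix, i)
                with open(out_path, "wb") as out:
                    out.write(match.group(0))
                written.append(out_path)
    return written
```

Calling extract_elements("dump.xml", "port", "port") would, under those assumptions, write port_0.xml, port_1.xml and so on, one per occurrence. The assumptions listed above are exactly the cases where I suspect a real parser would be safer, which is why I am asking.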
Thanks again for your super-prompt response :)
cheers
ashish