[Expat-discuss] junk after document element at line 2053

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Tue May 18 04:55:47 EDT 2004


On Mon, 17 May 2004, Greg Martin wrote:

>A well-formed XML document has only one top level tag (as you've
>discovered). I think that you can only have a prolog at the beginning of
>a document (which would probably justify the name prolog) which would
>mean that if you wrapped three files in a top-level tag and any had
>prologs it probably wouldn't be well-formed either. If there was the
>possibility of any of the files having a prolog you might be better off
>to instantiate a new parser for each file.

Yup, I found this out too... I guess by prolog you mean something like:

<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">

Sadly this occurs for every XML document in the file, and makes the parser
unhappy even when I wrap the whole file in a top level tag.
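
For reference, the "new parser for each document" idea could be sketched
as below. This is only an illustration in Python, whose xml.parsers.expat
module is a binding to Expat; it is not what I actually ran. It assumes
every document in the file starts with an XML declaration, and the helper
names split_documents and root_names are made up for the example.

```python
import re
import xml.parsers.expat


def split_documents(text):
    """Cut concatenated XML at each '<?xml ' declaration.

    Assumption: every document starts with an XML declaration.
    Anything before the first declaration is ignored.
    """
    starts = [m.start() for m in re.finditer(r"<\?xml\s", text)]
    if not starts:
        return [text]
    ends = starts[1:] + [len(text)]
    return [text[a:b] for a, b in zip(starts, ends)]


def root_names(text):
    """Parse each document with a fresh Expat parser; collect root tag names."""
    roots = []
    for doc in split_documents(text):
        depth = 0

        def on_start(tag, attrs):
            nonlocal depth
            if depth == 0:
                roots.append(tag)
            depth += 1

        def on_end(tag):
            nonlocal depth
            depth -= 1

        # An Expat parser handles exactly one document, so create a new one each time.
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = on_start
        parser.EndElementHandler = on_end
        parser.Parse(doc, True)
    return roots


prolog = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">\n'
)
# Two tiny stand-in documents; real BLAST output would be much larger.
sample = prolog + "<BlastOutput>one</BlastOutput>\n" + prolog + "<BlastOutput>two</BlastOutput>\n"
print(root_names(sample))
```

Because each piece keeps its own prolog and gets its own parser, nothing
has to be stripped out of the input.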

In the end I stripped out all the lines like the above from the file
(from 1000s of individual XML documents), then I did something like

(echo "<Start>"; cat multi_xml_document_files; echo "</Start>") | my_xml_parser.plx

where multi_xml_document_files stands for the files with the prologs
already removed.
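
The strip-and-wrap step can be sketched in a few lines. The example below
is Python using xml.parsers.expat (a binding to Expat), not my Perl
script. It assumes each XML declaration and DOCTYPE sits on a line of its
own, as in the example above; the helper names strip_prologs_and_wrap and
count_elements are invented for the illustration.

```python
import xml.parsers.expat


def strip_prologs_and_wrap(text, root="Start"):
    """Drop XML declaration and DOCTYPE lines, then wrap the rest in one root tag.

    Assumption: each declaration occupies a whole line by itself.
    """
    kept = [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith(("<?xml", "<!DOCTYPE"))
    ]
    return "<{0}>\n{1}\n</{0}>".format(root, "\n".join(kept))


def count_elements(xml_text, name):
    """Parse xml_text with Expat and count the start tags called `name`."""
    count = 0

    def on_start(tag, attrs):
        nonlocal count
        if tag == name:
            count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return count


prolog = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">\n'
)
# Two tiny stand-in documents; real BLAST output would be much larger.
sample = prolog + "<BlastOutput>one</BlastOutput>\n" + prolog + "<BlastOutput>two</BlastOutput>\n"

wrapped = strip_prologs_and_wrap(sample)
print(count_elements(wrapped, "BlastOutput"))  # prints 2
```

Note that the original documents become children of the wrapper element,
so any handler that cares about nesting depth sees them one level deeper.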

Are there any XML::Parser options that would make its checking less
strict? Is this a bad road to go down?

Thanks very much,
Dan.

>
>-----Original Message-----
>From: expat-discuss-bounces at libexpat.org
>[mailto:expat-discuss-bounces at libexpat.org]On Behalf Of Dan Bolser
>Sent: Sunday, May 16, 2004 4:51 PM
>To: expat-discuss at libexpat.org
>Subject: [Expat-discuss] junk after document element at line 2053
>
>
>
>
>junk after document element at line 2053, column 0, byte 107114 at
>/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/XML/Parser.pm
>line 185
>
>
>I get the above where the first xml document ends and the next begins.
>
>I am trying to parse the file with perl XML::Parser
>
>I want the parser to simply keep going past the first document and onto
>the second...
>
>Could I just wrap the whole file in XML document tags?
>
>Sorry for my ignorance, but how can I do this?
>
>Suppose file1, file2 and file3 all contain multiple concatenated XML
>documents, how do I create a fourth file (file4) to 'pull in' file[1-3] ?
>
>This sounds familiar, but I have ~ zero XML experience.
>
>Thanks for any suggestions,
>
>Dan.
>
>
>_______________________________________________
>Expat-discuss mailing list
>Expat-discuss at libexpat.org
>http://mail.libexpat.org/mailman/listinfo/expat-discuss



