[XML-SIG] Mixed encodings and XML
M.-A. Lemburg
mal@lemburg.com
Thu, 14 Dec 2000 12:10:08 +0100
uche.ogbuji@fourthought.com wrote:
>
> > This is not really related to text encodings, but somewhat similar:
> >
> > Is there a standard way of including binary data in XML files ?
>
> No.
Rich Salz pointed out in private mail that I could use base64
as the encoding (can '<' and '>' appear in base64 ?). Alas, I would
lose the search capability, since the encoded text can no longer
be searched directly...
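On the '<' and '>' question: the standard base64 alphabet consists of
A-Z, a-z, 0-9, '+' and '/', plus '=' for padding, so neither character
(nor '&') can show up in the encoded text. A minimal Python sketch that
checks this; the helper name b64_text is made up for illustration:

```python
import base64
import string

# Standard base64 alphabet plus the '=' padding character.
B64_CHARS = set(string.ascii_letters + string.digits + "+/=")

def b64_text(data: bytes) -> str:
    """Encode bytes as base64 and return the result as ASCII text."""
    return base64.b64encode(data).decode("ascii")

# Encode a buffer containing every byte value and inspect the output.
encoded = b64_text(bytes(range(256)))
assert set(encoded) <= B64_CHARS          # only alphabet characters appear
assert not set(encoded) & set("<>&")      # no XML markup characters
```

So base64 text can be placed in element content without any escaping.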
> > I would like to put a complete web-site into a (large) XML file.
> > The XML file should ideally contain not only the structure
> > information, attributes, etc. but also the HTML files, the images
> > and maybe even sound files or flash apps.
>
> Ah. This is similar to what the ebXML folks and the SOAP folks were at odds
> over. Note, this is a well-known deficiency in XML. The most common
> suggestion is: put it all into one file, separate them with form-feeds, and
> have the application process each bit separately. Clearly this doesn't suit
> your needs, but there's not much more to go on right now.
Now that's about as un-XML-like as it could get: form-feeds
to separate file parts... ;-)
> > Is something like this possible or will I have to use some
> > other storage method for the binary parts and reference these
> > from within the XML file (I would prefer not to, so that I can
> > include e.g. the HTML file content in XML searches) ?
>
> Could you expand on this last bit about the searches? It hints at what might
> be a work-around if that's your main concern.
I would like to be able to use XML searching machinery to scan
over web site structures. This includes limiting searches to
certain attributes, e.g. keywords or meta-descriptions of the content,
but should also cover full-text search of the content itself.
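To make the two kinds of search concrete, here is a rough Python sketch
using xml.etree.ElementTree. The <site>/<page> layout, the attribute
names and both helper functions are assumptions made up for this
example, not an existing format or API:

```python
import xml.etree.ElementTree as ET

# Hypothetical site description; element and attribute names are invented.
SITE = """\
<site>
  <page name="index.html" keywords="python xml" description="Start page">
    Welcome to the XML-SIG archive.
  </page>
  <page name="news.html" keywords="news" description="Latest python news">
    Nothing new today.
  </page>
</site>
"""

def search_attribute(root, attr, term):
    """Return names of pages whose attribute `attr` contains `term`."""
    return [p.get("name") for p in root.iter("page")
            if term in p.get(attr, "")]

def search_text(root, term):
    """Return names of pages whose text content contains `term`."""
    return [p.get("name") for p in root.iter("page")
            if term in "".join(p.itertext())]

root = ET.fromstring(SITE)
print(search_attribute(root, "keywords", "python"))
print(search_text(root, "XML-SIG"))
```

The attribute search only looks at the chosen attribute, so "python" in
a description does not match a keywords query; the text search ignores
attributes entirely.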
Even better would be a possible recursive application of this
scheme to embedded XML files, e.g. take a product catalog which
is stored as XML and made available on the site using special
site tools which only show the relevant parts of that file.
I think I would have to provide a special tag
<content encoding="base64|hex|plain|..." mimetype="...">
...
</content>
to enable this.
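A rough sketch of how such a tag could be written, read back and
searched, again in Python with xml.etree.ElementTree. The function
names, the <site>/<page> wrapper and the rule "only search text/*
content" are assumptions for illustration; only the <content> tag with
its encoding and mimetype attributes comes from the proposal above:

```python
import base64
import binascii
import xml.etree.ElementTree as ET

def make_content(data, mimetype, encoding="base64"):
    """Wrap raw bytes in a <content> element using the given encoding."""
    elem = ET.Element("content", encoding=encoding, mimetype=mimetype)
    if encoding == "base64":
        elem.text = base64.b64encode(data).decode("ascii")
    elif encoding == "hex":
        elem.text = binascii.hexlify(data).decode("ascii")
    elif encoding == "plain":
        elem.text = data.decode("utf-8")  # the serializer escapes '<' and '&'
    else:
        raise ValueError("unsupported encoding: %r" % encoding)
    return elem

def read_content(elem):
    """Return the raw bytes stored in a <content> element."""
    text = elem.text or ""
    encoding = elem.get("encoding", "plain")
    if encoding == "base64":
        return base64.b64decode(text)
    if encoding == "hex":
        return binascii.unhexlify("".join(text.split()))
    if encoding == "plain":
        return text.encode("utf-8")
    raise ValueError("unsupported encoding: %r" % encoding)

def search(root, term):
    """Full-text search over the decoded text/* <content> elements."""
    hits = []
    for elem in root.iter("content"):
        if elem.get("mimetype", "").startswith("text/"):
            body = read_content(elem).decode("utf-8", errors="replace")
            if term in body:
                hits.append(elem)
    return hits

# Build a tiny site, serialize it, parse it again and search it.
site = ET.Element("site")
page = ET.SubElement(site, "page", name="index.html")
page.append(make_content(b"<html>Hello XML-SIG</html>", "text/html"))
logo = ET.SubElement(site, "page", name="logo.png")
logo.append(make_content(b"\x89PNG\r\n", "image/png", encoding="hex"))

reparsed = ET.fromstring(ET.tostring(site))
matches = search(reparsed, "XML-SIG")
```

Decoding before searching is what brings the search capability back
for base64 content, at the cost of decoding each text part on every
scan.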
Thanks,
--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/