[XML-SIG] Proposed XBEL DTD
Fred L. Drake
"Fred L. Drake, Jr." <fdrake@acm.org>
Fri, 2 Oct 1998 12:51:32 -0400 (EDT)
--UK/cpViHvX
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit
Marc van Grootel writes:
> Just before the weekend I've got a few possible issues for the XBEL
> DTD. I say possible because it's not my intention to upset the current
> design. After reading some stuff about metadata I can see that it's a
Ha, you're too late! He, he, he.... oh, well. Andrew, please
ignore the public text I sent for XBEL this morning. ;-)
I'll attach an updated DTD below for continued discussion.
> A while ago I suggested adding the ISO Latin 1 entities (like HTML
> does) was that ruled out? It would keep XBEL more readable.
Do we want just latin-1, or all of the standard ISO entities that
have been defined for XML? This would be closer to what HTML uses.
I propose to include all those listed at
<http://www.schema.net/entities/> if we're going to allow more than a
minimal set. (Note that the five currently in the previous version of
the DTD are defined in the "Numeric and Special Graphic" entity set,
not latin-1.)
> Folders
>
> The name for the folder element is directly derived from common
> bookmark files. In some ways the 'folder' is like the 'sect' in the
> DocBook DTD. One interpretation of a folder is 'a grouped set of
> nodes' the fact that this is rendered as a real folder in a bookmark
> file is a presentational aspect. I see a real advantage when I can use
> a folder element as just a group of nodes inside the current folder
> without always being rendered as a separate folder. Such a
...
I'm not sure I see the value in what's been defined largely as an
interchange format. Most applications would not understand or care
about the distinction (assuming I understand it). Perhaps "groups"
can be defined using application-specific metadata?
<bookmark href="http://xxx.lanl.gov/hypertex/"
added="1998-01-27T13:09:45-05:00"
visited="1998-03-24T10:12:10-05:00">
<title>HyperTeX FAQ</title>
<info>
<metadata scheme="application::GroupHandler">
<meta name="group">hypertext resources</meta>
</metadata>
</info>
</bookmark>
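A minimal sketch of how an application could read such a group back,
assuming Python's standard xml.etree.ElementTree module. The helper
name group_of is hypothetical, and the scheme string is simply the one
used in the example above:

```python
import xml.etree.ElementTree as ET

XBEL = """\
<bookmark href="http://xxx.lanl.gov/hypertex/"
          added="1998-01-27T13:09:45-05:00"
          visited="1998-03-24T10:12:10-05:00">
  <title>HyperTeX FAQ</title>
  <info>
    <metadata scheme="application::GroupHandler">
      <meta name="group">hypertext resources</meta>
    </metadata>
  </info>
</bookmark>
"""

# Scheme name copied from the example; not a registered identifier.
GROUP_SCHEME = "application::GroupHandler"


def group_of(bookmark):
    """Return the group named in the bookmark's metadata, or None."""
    for metadata in bookmark.findall("info/metadata"):
        if metadata.get("scheme") != GROUP_SCHEME:
            continue  # some other application's metadata; skip it
        for meta in metadata.findall("meta"):
            if meta.get("name") == "group":
                return meta.text
    return None


bookmark = ET.fromstring(XBEL)
print(group_of(bookmark))
```

Applications that do not know the scheme never look inside the
<metadata> element, so the grouping stays invisible to them.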
> shortcuts). However there's an asymmetry: an 'url' alias does not
> point to an element with info, a 'folder' alias does. So resolving the
> 'url' alias (including the info) is now different from resolving a
> 'folder' alias. This is a result from our decision to accept 'bare'
> url's. This moved the id attribute from bookmark to the url element.
This will probably be a non-issue for many bookmark-specific
applications, but might create problems for general XML tools. In my
current implementation, the internal node created for a <bookmark> is
the same as that created for a <url>, and all the right stuff happens
by magic. Since the location of attributes is currently (and
appropriately) constrained, this doesn't present any real issues.
Also, note that an <alias> can refer to a <bookmark> that doesn't have
an <info> child at all, so the argument doesn't appear compelling.
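As a rough sketch of uniform alias resolution, assuming "id" lives on
the <bookmark> and <folder> elements and using Python's standard
xml.etree.ElementTree. The function name resolve_aliases and the
sample document are illustrative only:

```python
import xml.etree.ElementTree as ET

XBEL = """\
<xbel version="1.0">
  <folder id="f1">
    <title>Physics</title>
    <bookmark id="b1" href="http://xxx.lanl.gov/hypertex/">
      <title>HyperTeX FAQ</title>
    </bookmark>
  </folder>
  <alias ref="b1"/>
  <alias ref="f1"/>
</xbel>
"""


def resolve_aliases(root):
    """Return (alias, target) pairs; target is None if the ref dangles."""
    by_id = {node.get("id"): node for node in root.iter() if node.get("id")}
    return [(alias, by_id.get(alias.get("ref"))) for alias in root.iter("alias")]


root = ET.fromstring(XBEL)
for alias, target in resolve_aliases(root):
    info = target.find("info")  # None when the target has no <info> child
    print(alias.get("ref"), target.tag, "has info" if info is not None else "no info")
```

The same lookup serves both kinds of target, and the caller has to cope
with a missing <info> in either case.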
> Now both 'bookmark' and 'url' get %common.attrs;. When they are really
> being used this will automatically raise the question: Where to put
> the value for a common attribute? On a 'bookmark' or on the contained
> 'url'? Previously we removed the id attribute from bookmark to avoid
> this.
>
> It seems that the 'bare' url is causing subtle problems. Maybe this
> was a bad decision. Should we undo that and merge url with bookmark?
> It doesn't cause a big upset since most of bookmarks content is
My current implementation attempts to "minimize" the generated output
by using a bare <url> if it doesn't cause a loss of information. What
I find by looking at the output for my general bookmarks is twofold:
(1) most entries turn into <url> elements, which are substantially
more compact for viewing by a human, and (2) I've lost a lot of
descriptions by testing older versions of Grail bookmark code on
"live" data. ;-(
I think (hope?) my point is that brevity of markup is pretty
valuable for this application. Perhaps <url> should be retained for
this reason, and perhaps not. There is good thinking in putting "id"
on the <bookmark> element, however, and using just one element would
simplify processing somewhat.
But let's take a look at the size difference anyway, since I've
brought it up. This uses the previous version of XBEL:
<url href="http://xxx.lanl.gov/hypertex/"
added="1998-01-27T13:09:45-05:00"
visited="1998-03-24T10:12:10-05:00"
>HyperTeX FAQ</url>
This is the same bookmark, without <url>:
<bookmark href="http://xxx.lanl.gov/hypertex/"
added="1998-01-27T13:09:45-05:00"
visited="1998-03-24T10:12:10-05:00">
<title>HyperTeX FAQ</title>
</bookmark>
Well, that doesn't look so bad. I'll go ahead and adjust the DTD.
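For anyone with data in the older form, a conversion is mechanical. The
following is a sketch only, assuming Python's standard
xml.etree.ElementTree; url_to_bookmark is a hypothetical helper, not
part of any existing XBEL code:

```python
import xml.etree.ElementTree as ET

OLD = ('<url href="http://xxx.lanl.gov/hypertex/" '
       'added="1998-01-27T13:09:45-05:00" '
       'visited="1998-03-24T10:12:10-05:00">HyperTeX FAQ</url>')


def url_to_bookmark(url):
    """Build the <bookmark> equivalent of a bare <url> element."""
    bookmark = ET.Element("bookmark", dict(url.attrib))
    title = ET.SubElement(bookmark, "title")
    title.text = url.text  # the old element's content becomes the title
    return bookmark


old = ET.fromstring(OLD)
new = url_to_bookmark(old)

# Compare serialized sizes with identical attribute layout and no
# extra whitespace, so only the markup itself differs.
old_size = len(ET.tostring(old))
new_size = len(ET.tostring(new))
print(old_size, new_size, new_size - old_size)
```

The overhead is a fixed number of bytes per entry (the longer element
name plus the <title> tags), independent of the URL or title length.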
> Metadata
>
> I can follow the reasons for removing ID from metadata. But the
> ability to reference a block of metadata is now lost. I wonder of this
I see two real options: put "id" attributes only on things that it
makes immediate sense to refer to via <alias>, or put it in
common.attrs. The only place from which to link to anything other than
folders and bookmarks (assuming that linking to a folder makes sense)
is outside the document. I would expect this to happen rarely, if ever.
From this perspective, I'll vote for simplicity.
> It is a little more complex but more powerful to be able to
> reference an info-nugget (it could be done by copying via an entity
> reference though). Not all info follows the folder hierarchy. On the
Perhaps the need for linking will be made clear if you can describe
an application of it?
> Scheme
>
> What is the content of the scheme attribute? Should it be an URL (or
> URN) or can it be any CDATA string? Since an xbel probably uses only
CDATA seems appropriate; there's no catalog of metadata schemes.
Since <metadata> is overloaded with application-specific schemes, we
cannot presume to predict the range of possible values. I just took a
look at the <META> element from HTML 4.0, and it looks like there's a
slightly different approach described there: <HEAD> has a "profile"
attribute which names what I've been calling the scheme, and the
<META> attribute "scheme" is used to describe the notation of the
metadata value. Perhaps the <metadata> "scheme" should be renamed
"profile", and an HTML 4.0-style "scheme" added to <meta>, primarily to
allow applications which collect information from HTML pages to store
it without losing fidelity.
> ID/IDREF pair (or CDATA link attributes) and adding a <scheme
> name="a-long-formal-id" id="s1"/>) somewhere near the top of the
> document. This clearly documents which info schemes are being
> used. Others may exist inside the document but the ones mentioned can
> be used by reference (this could cut down on the file size in the case
> of formal scheme names, which tend to be quite long).
I'd avoid this since I don't know of any formal registries for
metadata schemes/profiles/whatever. This can also be accomplished
using general entities:
<!DOCTYPE xbel ... [
<!ENTITY my-scheme "...long identifier...">
]>
<xbel>
<bookmark href="http://xxx.lanl.gov/hypertex/"
added="1998-01-27T13:09:45-05:00"
visited="1998-03-24T10:12:10-05:00">
<title>HyperTeX FAQ</title>
<info>
<metadata scheme="&my-scheme;">
...
</metadata>
</info>
</bookmark>
</xbel>
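A quick check that the entity approach behaves as expected, sketched
with Python's standard xml.etree.ElementTree (whose underlying expat
parser expands entities declared in the internal subset). The
replacement text "application::GroupHandler" is only a stand-in for a
long identifier:

```python
import xml.etree.ElementTree as ET

DOC = """\
<!DOCTYPE xbel [
  <!ENTITY my-scheme "application::GroupHandler">
]>
<xbel>
  <bookmark href="http://xxx.lanl.gov/hypertex/">
    <title>HyperTeX FAQ</title>
    <info>
      <metadata scheme="&my-scheme;">
        <meta name="group">hypertext resources</meta>
      </metadata>
    </info>
  </bookmark>
</xbel>
"""

root = ET.fromstring(DOC)

# By the time the application sees the attribute, the parser has
# already replaced the entity reference with its declared text.
scheme = root.find("bookmark/info/metadata").get("scheme")
print(scheme)
```

The application only ever sees the expanded value, so no extra element
or ID/IDREF machinery is needed in the DTD.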
> Other linking issues (maybe consider these for next version)
>
> Are there other linking issues? What about a way to make xref's? Link
> to external xbel documents and/or external metadata
> information-nuggets?
There comes a point at which we punt and require people to use
XPointer. Or wait for specific issues to crop up and make a new
version. ;-)
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr. Reston, VA 20191
--UK/cpViHvX
Content-Type: text/xml
Content-Description: YAX (Yet Another XBEL)
Content-Disposition: inline;
filename="xbel.dtd"
Content-Transfer-Encoding: 7bit
<!-- This is the XML Bookmark Exchange Language, version 1.0. It should
be used with the formal public identifier:
-//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML
One valid system identifier at which this DTD will remain
available is:
http://www.python.org/topics/xml/dtds/xbel-1.0.dtd
More information on the DTD, including reference
documentation, is available at:
http://www.python.org/topics/xml/xbel/
Attributes which take date/time values should encode the value
according to the W3C NOTE on date/time formats:
http://www.w3.org/TR/NOTE-datetime
-->
<!ENTITY % ISOlat1
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML"
"http://www.schema.net/public-text/ISOlat1.pen">
%ISOlat1;
<!ENTITY % ISOlat2
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN//XML"
"http://www.schema.net/public-text/ISOlat2.pen">
%ISOlat2;
<!ENTITY % ISOnum
PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML"
"http://www.schema.net/public-text/ISOnum.pen">
%ISOnum;
<!ENTITY % ISOpub
PUBLIC "ISO 8879:1986//ENTITIES Publishing//EN//XML"
"http://www.schema.net/public-text/ISOpub.pen">
%ISOpub;
<!ENTITY % ISOtech
PUBLIC "ISO 8879:1986//ENTITIES General Technical//EN//XML"
"http://www.schema.net/public-text/ISOtech.pen">
%ISOtech;
<!ENTITY % ISOdia
PUBLIC "ISO 8879:1986//ENTITIES Diacritical Marks//EN//XML"
"http://www.schema.net/public-text/ISOdia.pen">
%ISOdia;
<!ENTITY % ISOgrk1
PUBLIC "ISO 9573-15:1993//ENTITIES Greek Letters//EN//XML"
"http://www.schema.net/public-text/ISOgrk1.pen">
%ISOgrk1;
<!ENTITY % ISOgrk2
PUBLIC "ISO 9573-15:1993//ENTITIES Monotoniko Greek//EN//XML"
"http://www.schema.net/public-text/ISOgrk2.pen">
%ISOgrk2;
<!ENTITY % ISOgrk3
PUBLIC "ISO 8879:1986//ENTITIES Greek Symbols//EN//XML"
"http://www.schema.net/public-text/ISOgrk3.pen">
%ISOgrk3;
<!ENTITY % ISOgrk4
PUBLIC "ISO 8879:1986//ENTITIES Alternative Greek Symbols//EN//XML"
"http://www.schema.net/public-text/ISOgrk4.pen">
%ISOgrk4;
<!ENTITY % common.attrs "">
<!ENTITY % node.attrs "id ID #IMPLIED
added CDATA #IMPLIED">
<!ENTITY % url.attrs "href CDATA #REQUIRED
visited CDATA #IMPLIED
modified CDATA #IMPLIED">
<!ENTITY % nodes "bookmark|folder|alias|separator">
<!ELEMENT xbel (title?, info?, desc?, (%nodes;)*)>
<!ATTLIST xbel
version CDATA #FIXED "1.0"
>
<!ELEMENT title (#PCDATA)>
<!ATTLIST title
%common.attrs;
>
<!--=================== Info ======================================-->
<!ELEMENT info (metadata*)>
<!ATTLIST info
%common.attrs;
>
<!ELEMENT metadata (meta*)>
<!ATTLIST metadata
%common.attrs;
scheme CDATA #IMPLIED
>
<!ELEMENT meta (#PCDATA)>
<!ATTLIST meta
%common.attrs;
name CDATA #REQUIRED
>
<!--=================== Folder ====================================-->
<!ELEMENT folder (title?, info?, desc?,(%nodes;)*)>
<!ATTLIST folder
%common.attrs;
%node.attrs;
folded (yes|no) 'yes'
>
<!--=================== Bookmark ==================================-->
<!ELEMENT bookmark (title, info?, desc?)>
<!ATTLIST bookmark
%common.attrs;
%node.attrs;
%url.attrs;
>
<!ELEMENT desc (#PCDATA)>
<!ATTLIST desc
%common.attrs;
>
<!--=================== Separator =================================-->
<!ELEMENT separator EMPTY>
<!ATTLIST separator
%common.attrs;
>
<!--=================== Alias =====================================-->
<!ELEMENT alias EMPTY>
<!ATTLIST alias
%common.attrs;
ref IDREF #REQUIRED
>
--UK/cpViHvX--
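The DTD comment asks that date/time attributes follow the W3C NOTE on
date/time formats. A minimal sketch of reading and writing the complete
form used in the examples (date, time to the second, and zone offset),
assuming Python 3.7 or later. The helper names are hypothetical, and
the reduced forms the NOTE also permits (year only, year and month) are
not handled here:

```python
from datetime import datetime, timedelta


def parse_xbel_time(value):
    """Parse a complete date/time with offset, e.g. 1998-01-27T13:09:45-05:00."""
    return datetime.fromisoformat(value)


def format_xbel_time(moment):
    """Format an offset-aware datetime back into the same notation."""
    if moment.utcoffset() is None:
        raise ValueError("an explicit zone offset is required")
    return moment.isoformat(timespec="seconds")


added = parse_xbel_time("1998-01-27T13:09:45-05:00")
visited = parse_xbel_time("1998-03-24T10:12:10-05:00")

print(format_xbel_time(added))
print(visited > added)  # offset-aware values compare correctly
```

Keeping the offset in the stored value means "added" and "visited" can
be compared across bookmark files written in different time zones.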