[XML-SIG] Content Syndication

Mark Nottingham mnot@pobox.com
Fri, 2 Jul 1999 19:08:40 +1000

> HTML is a display-oriented format.  It usually is not even well-formed in
> xml sense.  Further, it has great potential to break the layout of your
> for example, if the publisher embeds a </TABLE> tag.  It is possible to
> for this, and avoid it, but its not exactly 'simple'.  The real problem is
> it gives the publisher control over the display of the content. In a
> system, I think what you really want is to be able to publish *data*, and
> the receiver format it however they choose, so long as they can understand

I'm with you, and in complete agreement. It doesn't make sense at all to have
HTML in the channels. However, some people already do it; for instance,
passing <i>, <P> and <BR> to format their text.

Some people will want to put formatting into the channels, but IMHO that
level of detail belongs in the original, cited content, not a short 'teaser'
to the link. This sort of stuff should be spelled out clearly to potential
content providers, and enforced by aggregation/presentation engines.
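
As a rough illustration of what that enforcement might look like on the
aggregator side, here is a minimal Python sketch. It assumes the engine simply
discards every tag in a teaser and keeps only the text; strip_markup is a
hypothetical helper name, not part of any existing syndication format or tool,
and a real engine might prefer to reject the item outright instead.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect character data and drop every tag (hypothetical helper class)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased. Treat line-breaking tags as
        # whitespace so the words on either side do not run together.
        if tag in ("p", "br"):
            self._chunks.append(" ")

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def strip_markup(description):
    """Return the teaser text with all markup removed.

    Assumption: dropping tags (including stray closers such as </TABLE>)
    is acceptable policy for a short teaser.
    """
    parser = _TextOnly()
    parser.feed(description)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    print(strip_markup("An <i>important</i> story.<BR>Read more</TABLE>"))
```

The point of doing it this way is that a stray </TABLE> from the publisher
never reaches the page, so it cannot break the receiver's layout.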

> On second thought, one thing that might be a cool compromise is if we had
> optional tag indicating which style-sheet(s) the publisher thinks should
> used.

Don't know how that would be incorporated, but it's interesting to think
about.

> If you have a real need to transfer HTML documents, then what you need is
> something like ICE that takes care of the packaging and tagging of
> Otherwise, what we provide as a subset will never be fully html
> compliant -- eg
> some tags won't work, and will also be problematic from a validation point
> view, since HTML is generally not "valid" in the xml sense.

Assuming you're not just responding to my question rhetorically, when do you
think such a capability would be necessary/desirable?

> I'd like to hear people's thoughts on this topic.  I'm going to be gone
> Tues though, so if I'm quiet, that's why.  My own feeling, having worked a
> little with RDF and metadata previously, is that the goal should be to
> data about resources.  We should not try to dictate how that data will be
> presented, if at all, which is what happens when embedded HTML is allowed.
> Reader's comments and related links could certainly be construed as
> about a given resource, so it would be nice to be able to transmit these
> well.

Hmm. A commenting/annotation capability is intriguing. My initial thought is
that perhaps there is a need for a separate annotation format that can
optionally be used in conjunction with this. My concept is that this is
primarily resource discovery; annotation is another domain which may
interoperate with it, but the overlap is minimal. Have you seen
http://www.thirdvoice.com/ ?