[XML-SIG] XBEL metadata question

Chad Loder cloder@acm.org
Thu, 23 Mar 2000 13:15:02 -0500 (EST)


On Thu, 23 Mar 2000, Juergen Hermann wrote:

> On Thu, 23 Mar 2000 03:13:11 -0500 (EST), Chad Loder wrote:
>
> >Hi. I wrote a tool to annotate XBEL files with metadata from the
> >referenced page.
>
> When you already do that, could you extend it to grab the <TITLE> from the
> page and insert it into the <bookmark> if that bookmark has no or an empty
> title element?

Yes, it already does that. It basically works like this:

For each bookmark:
    If title or desc is empty/missing, visit the page.

    If 404 not found, set bookmark folded="yes", add note about 404.

    If 301 moved, add note about old URL and change href (and check
    new href).

    Else if successful, get title and description from page. If
    bookmark title was missing, set title. If bookmark description
    was missing, set description. Also set lastVisited date.
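
The steps above could be sketched in Python roughly as follows. This is
only an illustration, not the tool's actual code: the fetch() callable,
the <note> child element, and the max_redirects limit are hypothetical
names introduced here, and the attribute names (folded, lastVisited) are
taken from the wording above rather than checked against the XBEL DTD.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def text_of(bookmark, tag):
    """Return the stripped text of a child element, or '' if missing/empty."""
    child = bookmark.find(tag)
    return (child.text or "").strip() if child is not None else ""


def set_text(bookmark, tag, value):
    """Create the child element if needed and set its text."""
    child = bookmark.find(tag)
    if child is None:
        child = ET.SubElement(bookmark, tag)
    child.text = value


def add_note(bookmark, message):
    # Hypothetical: how notes are stored is not specified, so a <note>
    # child element is assumed here.
    ET.SubElement(bookmark, "note").text = message


def annotate(bookmark, fetch, max_redirects=5):
    """Apply the per-bookmark steps to one <bookmark> element.

    fetch(url) is a hypothetical callable returning a dict with the key
    'status' and, where relevant, 'location', 'title', 'description'.
    """
    if text_of(bookmark, "title") and text_of(bookmark, "desc"):
        return  # nothing is missing, so the page is not visited

    for _ in range(max_redirects + 1):
        page = fetch(bookmark.get("href"))
        if page["status"] == 404:
            bookmark.set("folded", "yes")
            add_note(bookmark, "404 Not Found")
            return
        if page["status"] == 301:
            add_note(bookmark, "moved from " + bookmark.get("href"))
            bookmark.set("href", page["location"])
            continue  # check the new href
        if page["status"] == 200:
            # Only fill in fields that were missing or empty.
            if not text_of(bookmark, "title") and page.get("title"):
                set_text(bookmark, "title", page["title"])
            if not text_of(bookmark, "desc") and page.get("description"):
                set_text(bookmark, "desc", page["description"])
            now = datetime.now(timezone.utc)
            bookmark.set("lastVisited", now.strftime("%Y-%m-%dT%H:%M:%SZ"))
        return
```

Passing fetch in as a parameter keeps the decision logic separate from
the HTTP and HTML parsing code, so it can be tried out with canned
responses.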

I could also use the HTTP Last-Modified header to set the lastModified
date. It's just a matter of writing some code to convert HTTP-style (GMT)
datetimes into ISO 8601 datetimes.
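
One possible way to do that conversion in present-day Python is the
standard library's email.utils.parsedate_to_datetime. The function name
http_date_to_iso8601 is made up for this sketch, and treating a value
without a zone as GMT is an assumption (HTTP dates are defined in GMT).

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def http_date_to_iso8601(value):
    """Convert an HTTP date header value to an ISO 8601 UTC timestamp."""
    dt = parsedate_to_datetime(value)
    if dt.tzinfo is None:
        # Assumption: a date with no usable zone is taken to be GMT.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(http_date_to_iso8601("Thu, 23 Mar 2000 18:15:02 GMT"))
# 2000-03-23T18:15:02Z
```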

I also wrote a tool which publishes an XBEL file into a web site looking
much like Yahoo or Open Directory. You can see a sample at:

    http://www.ccs.neu.edu/home/cloder/ie_favorites

What would be nice is a folder metadata item called dmozCategory which maps
each folder onto its closest Open Directory category (www.dmoz.org). This
would allow separate collections to be merged together automatically.
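
To illustrate the merging idea, here is a rough Python sketch. It assumes,
purely for illustration, that the category is stored as a dmozCategory
attribute on a <metadata> element inside the folder's <info>; that layout
and the function names are hypothetical, since no such metadata is defined
yet.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def category_of(folder):
    """Return the folder's dmozCategory value, or None if it has none."""
    # Assumed layout: <folder><info><metadata dmozCategory="..."/></info>...
    for meta in folder.findall("info/metadata"):
        if meta.get("dmozCategory"):
            return meta.get("dmozCategory")
    return None


def merge_by_category(*roots):
    """Collect bookmark hrefs from several collections, keyed by category.

    Only bookmarks directly inside a tagged folder are collected, and
    duplicate hrefs within a category are dropped.
    """
    merged = defaultdict(list)
    for root in roots:
        for folder in root.iter("folder"):
            category = category_of(folder)
            if category is None:
                continue  # untagged folders cannot be merged automatically
            for bookmark in folder.findall("bookmark"):
                href = bookmark.get("href")
                if href not in merged[category]:
                    merged[category].append(href)
    return dict(merged)
```

Two folders with different titles in different collections would end up
under the same key as long as both carry the same category value.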

    c

>
> >What should the owner be of the metadata tags I create? Should it
> >be my program?
> >
> >For example, an HTML page may have:
> >
> ><META NAME="description" CONTENT="This is the description">
>
> I'd do it this way:
>
> <!DOCTYPE .... [
>     <!ATTLIST metadata
>         http-name       CDATA      #IMPLIED
>         http-content    CDATA      #IMPLIED
>     >
> ]>
>
> ...
> <meta owner="http" http-name="description" ...>
>
> Ciao, Jürgen
>
> --
> Jürgen Hermann (jhe@webde-ag.de)
> WEB.DE AG, Amalienbadstr.41, D-76227 Karlsruhe
> Tel.: 0721/94329-0, Fax: 0721/94329-22

----------------------------------------------------
| Chad Loder - Somerville, MA, USA                 |
| EMail:     cloder@acm.org                        |
| Home Page: http://www.ccs.neu.edu/home/cloder    |
----------------------------------------------------