# [DOC-SIG] Python Library Reference in new HTML form

Laurence Tratt tratt@dcs.kcl.ac.uk
Wed, 18 Mar 1998 09:38:08 +0000

>> I've (hopefully) just about completed a project for my platform to convert
>> the Python Library Reference and Tutorial to a format specific to the
>> platform.
> I don't see where you mention what platform / format you were
> originally targeting.  Could you tell us about it?

Um, it's Acorn RISC OS, and if you've actually heard of it, I'd be
very impressed :) The format I converted it to is vaguely similar to
HTML, but has several advantages. It's searchable, and the searching
is massively fast: it's virtually instantaneous on 2MB of data, and
it has to load it all from my not too fast hard disk. The program
that displays it also keeps windows nice and small, so you can have
them at the side of the screen whilst you're programming. The page
splitting thing came from the fact that this was the way I could
press "F1" on a function in my text editor, and get help on it
straight away.
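
As a rough illustration of why one small page per function makes the "F1" lookup easy, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `module/function` page layout, the helper names, and the idea that the editor passes in the current line and cursor column. It is not the actual RISC OS tooling.

```python
import re

def build_page_index(entries):
    """Map each documented name to the page that describes it.

    `entries` is an iterable of (module, function) pairs. The
    "module/function" page naming is hypothetical.
    """
    index = {}
    for module, function in entries:
        page = f"{module}/{function}"
        index[f"{module}.{function}"] = page
        # Also allow lookup by bare function name; the first entry wins.
        index.setdefault(function, page)
    return index

def word_at(line, column):
    """Return the dotted identifier that spans `column` in `line`, or None."""
    for match in re.finditer(r"[A-Za-z_][A-Za-z0-9_.]*", line):
        if match.start() <= column <= match.end():
            return match.group()
    return None

def help_page(index, line, column):
    """Find the help page for the identifier under the cursor, if any."""
    word = word_at(line, column)
    if word is None:
        return None
    # Try the full dotted name first, then fall back to its last component.
    return index.get(word) or index.get(word.rsplit(".", 1)[-1])
```

With one page per name, the hotkey handler only needs a dictionary lookup, which is part of why splitting the manual finely pays off.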

> I will definitely be taking a look at your version, but I'm a little
> swamped right now.  I'll stash a copy aside and take sneak peeks.  ;-)

I did guess that even if this is of no obvious help, it might give
the next HTML generation of documentation some good ideas.

> Were you aware of the module index in the last release?  A single
> page with a list of modules alphabetically sorted.

Yes, but I find it's annoying to have to wade through things to get
to what I want when really I want a front page where I can click on
what I want straight away. Perhaps the order of the current manual's
front page is more useful for total novices, but after that I think
that most people probably find it a pain unless they know their way
around it very well. It seems distinctly un-user-friendly, and slow.
This is just my opinion, I could be wrong here :)

> I have mixed feelings about extensive links, but a lot of that has
> to do with the awful presentation of web browsers.

Well, I've kept to a fairly small subset of HTML. There are tables in
there (obviously), but that's as difficult as it gets. There are no
font size commands or anything else that makes things look good on
one browser and awful on another. I only tested it out on Acorn
Browse and ANT Fresco (no, you probably haven't heard of either of
them), but I'll try Netscape later today. Hopefully, things will look
OK.

> CSS can help, but there doesn't appear to be a common understanding
> among the browsers as to what "getting it right" means.  Oh, the
> font problems on Netscape/UNIX drive me bonkers!

I think it is easy to get embroiled in platform specific stuff like
this. I'm lucky; RISC OS has had a very good anti-aliasing font
manager since 1988, so there are no problems. I kept things simple
because I know that as soon as people start mucking around with font
sizes and font faces on most OSs, things start breaking down (Windows
is horrendous with text that isn't relatively large, for example,
because it doesn't anti-alias text).

> Future releases will improve the semantic content of the markup,
> making it a little easier to create links automatically.

Hmmm, you should see the regular expression code I've got to do
this. It's unmaintainable :)
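
To give a flavour of what regular-expression-driven automatic linking can look like, here is a minimal sketch in Python. It is not the code from the converter discussed here; the function name and the `module-<name>.html` link target are hypothetical, and it assumes the input is plain text with no existing HTML tags to trip over.

```python
import re

def link_modules(text, modules):
    """Wrap each known module name in `text` with a hyperlink.

    The "module-<name>.html" target is a made-up naming scheme.
    """
    if not modules:
        return text
    # Longest names first, so "os.path" is preferred over "os".
    names = sorted(modules, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # Refuse matches that are part of a longer word or dotted name,
    # but still allow a sentence-ending full stop after the name.
    pattern = re.compile(r"(?<![\w.])(" + alternatives + r")(?!\w|\.\w)")

    def replace(match):
        name = match.group(1)
        return f'<a href="module-{name}.html">{name}</a>'

    return pattern.sub(replace, text)
```

Even this small version needs lookarounds to avoid linking the "os" inside "posix", which hints at how quickly such code becomes hard to maintain without semantic markup in the source.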

> The version I'll be releasing shortly makes URLs "hot", but leaves
> email addresses alone (though they are marked and could easily be
> turned into hyperlinks).  I do this to avoid swamping people with
> mail.

Fair point. What do other people think of this?
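
For anyone weighing up the trade-off, the behaviour described in the quoted text (URLs become links, email addresses are only marked) can be sketched as below. This is an illustrative guess, not the code from either set of tools: the patterns are deliberately simple, and the `email` class name and the `link_emails` switch are hypothetical.

```python
import html
import re

# One combined pattern so that URLs and addresses are found in a single pass.
TOKEN_RE = re.compile(
    r"(?P<url>(?:https?|ftp)://[^\s<>\"]+)"
    r"|(?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
)

def mark_up(text, link_emails=False):
    """Escape `text` for HTML, making URLs clickable and marking emails."""
    parts = []
    last = 0
    for match in TOKEN_RE.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        token = match.group()
        if match.lastgroup == "url":
            # Keep trailing sentence punctuation out of the link itself.
            url = token.rstrip(".,;:!?)")
            rest = token[len(url):]
            safe = html.escape(url)
            parts.append(f'<a href="{safe}">{safe}</a>{html.escape(rest)}')
        else:
            safe = html.escape(token)
            if link_emails:
                parts.append(f'<a href="mailto:{safe}">{safe}</a>')
            else:
                # Marked, but not "hot": easy to turn into a link later.
                parts.append(f'<span class="email">{safe}</span>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Because addresses are already marked, switching to `mailto:` links later is a one-flag change rather than a second scan of the documents.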

> Good idea.  And I've been wrestling with LaTeX2HTML indexing a lot
> lately.  ;-(

Never run it. I had to look at it before I could guess what {\rm}
meant :)

> I guess my name is "somebody", then.  ;-) I've been working on them
> a bit, but I've been spending a fair amount of time lately on Q/A for
> the LaTeX documents.  My expectation is that I'll be able to generate
> more usable SGML from them.  The intention of the SGML conversion
> project is to move toward SGML for the official documentation sources.
> There's still a lot to do, though, mostly due to my schedule
> constraints.

Could you explain this a little more? I'm guessing that at the moment
you're updating the LaTeX docs and once you've got those stable
you'll convert them all to SGML, and use those as the base
documentation? Sounds very sensible to me. LaTeX is not particularly
easy to parse, I've found.

>> 1) There's a .tar.gz file to download; it contains lots of files
>> (nearly 2300 if I'm being honest) with 150ish directories at the
>> top level
> That is a lot of files....

It's an unusual system :)

>> build from "this morning"), very slow but entirely in Python. I will
>> release the source in the near future (hopefully within a month),
>> but at
> I look forward to this.  I hope you had the good sense to ignore
> partparse.py from the current distribution!  ;-)  (The LaTeX scanning
> isn't really that bad, but the code is unmaintainable!)

I'm afraid the code is about 99.8% based on me looking at LaTeX,
guessing what the output would look like if I'd invented the
language, then going about doing it. I had to look in the LaTeX2HTML
source to guess what {\rm} could be, and every so often I referred to
the original HTML documentation to check to see if something that I
thought was wrong was intentional or not... This is probably not good
programming practice :) Everything is a little fragile, because the
parser was only ever intended to work with the PLR.
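
To show the kind of guess-the-output translation described above, here is a minimal Python sketch that handles only old-style font switches such as {\rm ...} and {\em ...}. It is not the converter's actual code. The tag mapping is an assumption (treating {\rm} as plain roman text with no tag), and a regular expression like this is just as fragile as the paragraph above suggests.

```python
import re

# Assumed mapping from font switch to HTML tag; None means "emit plain text".
FONT_TAGS = {"rm": None, "em": "em", "bf": "b", "tt": "tt", "it": "i"}

# Matches an innermost group such as {\em text}: no braces inside the body.
FONT_SWITCH_RE = re.compile(
    r"\{\\(rm|em|bf|tt|it)(?![A-Za-z])\s*([^{}]*)\}"
)

def convert_font_switches(source):
    """Rewrite {\\xx text} groups as HTML, working from the inside out."""
    def replace(match):
        tag = FONT_TAGS[match.group(1)]
        body = match.group(2)
        return body if tag is None else f"<{tag}>{body}</{tag}>"

    # Repeat until nothing changes, so nested groups are resolved too.
    previous = None
    while previous != source:
        previous = source
        source = FONT_SWITCH_RE.sub(replace, source)
    return source
```

Anything outside this tiny subset (macros with arguments, environments, verbatim text) passes through untouched, which is exactly why a parser built this way only works on the documents it was written against.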

> I plan to make available a documentation release in HTML, LaTeX,
> PDF, and PostScript next week.

Good. How much has been updated since the last release?

Laurie

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org