[DOC-SIG] Python Library Reference in new HTML form

Laurence Tratt tratt@dcs.kcl.ac.uk
Wed, 18 Mar 1998 09:38:08 +0000


>> I've (hopefully) just about completed a project for my platform to convert
>> the Python Library Reference and Tutorial to a format specific to the
>> platform.
>I don't see where you mention what platform / format you were
> originally targeting.  Could you tell us about it?

Um, it's Acorn RISC OS, and if you've actually heard of it, I'd be
very impressed :) The format I converted it to is vaguely similar to
HTML, but has several advantages: it's searchable (and the searching
is massively fast; it's virtually instantaneous on 2MB of data, and
it has to load it all from my not too fast hard disk), and the
program that displays it keeps windows nice and small so you can have
them at the side of the screen whilst you're programming. The page
splitting came about because it was the way I could press "F1" on a
function in my text editor and get help on it straight away.

> I will definitely be taking a look at your version, but I'm a little
> swamped right now.  I'll stash a copy aside and take sneak peeks.  ;-)

I did guess that even if this is of no obvious help, it might give 
the next HTML generation of documentation some good ideas.

> Were you aware of the module index in the last release?  A single
> page with a list of modules alphabetically sorted.

Yes, but I find it's annoying to have to wade through things to get 
to what I want when really I want a front page where I can click on 
what I want straight away. Perhaps the order of the current manual's
front page is more useful for total novices, but after that I think
that most people probably find it a pain unless they know their way
around it very well. It seems distinctly un-user-friendly, and slow.
This is just my opinion; I could be wrong here :)

> I have mixed feelings about extensive links, but a lot of that has
> to do with the awful presentation of web browsers.

Well, I've kept to a fairly small subset of HTML. There are tables in
there (obviously), but that's as difficult as it gets. There's no 
font size commands or anything else that makes things look good on 
one browser and awful on another. I only tested it out on Acorn 
Browse and ANT Fresco (no, you probably haven't heard of either of 
them), but I'll try Netscape later today. Hopefully, things will look 
OK.

> CSS can help, but there doesn't appear to be a common understanding
> among the browsers as to what "getting it right" means.  Oh, the
> font problems on Netscape/UNIX drive me bonkers!

I think it is easy to get embroiled in platform specific stuff like
this. I'm lucky; RISC OS has had a very good anti-aliased font
manager since 1988, so there are no problems. I kept things simple
because I know that as soon as people start mucking around with font
sizes and font faces on most OSs, things start breaking down (Windows
is horrendous with text that isn't relatively large, for example,
because it doesn't anti-alias text).

> Future releases will improve the semantic content of the markup,
> making it a little easier to create links automatically.

Hmmm, you should see the regular expression code I've got to do 
links. It's not very long, perhaps 3 or 4Kb, but already 
unmaintainable :)
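
For readers who have not seen regex-driven linking, here is a minimal
sketch of the general idea. It is not the converter discussed in this
message. The module list and the "module-NAME.html" page naming are
made up for illustration.

```python
import re
from html import escape

# Hypothetical module list; a real converter would build this from the docs.
KNOWN_MODULES = ["os", "string", "sys", "regex"]

# Try longer names first so a long name wins over a shorter prefix.
_MODULE_RE = re.compile(
    r"\b("
    + "|".join(re.escape(m) for m in sorted(KNOWN_MODULES, key=len, reverse=True))
    + r")\b"
)


def link_modules(text):
    """Escape text for HTML, then wrap known module names in links.

    The "module-NAME.html" target is an assumed naming scheme.
    """
    escaped = escape(text)
    return _MODULE_RE.sub(r'<a href="module-\1.html">\1</a>', escaped)
```

The word-boundary anchors keep "os" from matching inside words such as
"possible". Anything smarter (context, function names, nested markup)
is where code like this tends to become hard to maintain.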
 
> The version I'll be releasing shortly makes URLs "hot", but leaves
> email addresses alone (though they are marked and could easily be
> turned into hyperlinks).  I do this to avoid swamping people with
> mail.

Fair point. What do other people think of this?
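
The policy quoted above (URLs become links, email addresses are only
marked) could be sketched roughly as follows. This is an illustration
under assumptions, not the code from either project: the patterns are
deliberately simple, and the "email" class name is invented.

```python
import re
from html import escape

# One pattern with two named alternatives: URLs first, then email addresses.
# Both sub-patterns are simplified assumptions; a URL followed directly by
# punctuation (e.g. a full stop) would swallow that punctuation.
_TOKEN_RE = re.compile(
    r"(?P<url>(?:https?|ftp)://[^\s<>\"]+)"
    r"|(?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
)


def mark_up(text):
    """Return HTML with URLs made 'hot' and email addresses only marked."""
    parts = []
    last = 0
    for m in _TOKEN_RE.finditer(text):
        # Escape the plain text between the previous match and this one.
        parts.append(escape(text[last:m.start()]))
        token = escape(m.group(0))
        if m.lastgroup == "url":
            parts.append('<a href="%s">%s</a>' % (token, token))
        else:
            # Marked, but not a mailto: link.
            parts.append('<span class="email">%s</span>' % token)
        last = m.end()
    parts.append(escape(text[last:]))
    return "".join(parts)
```

Because the addresses stay marked, a later pass could turn them into
hyperlinks without re-scanning the text.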

> Good idea.  And I've been wrestling with LaTeX2HTML indexing a lot
> lately.  ;-(

Never run it. I had to look at it before I could guess what {\rm} 
meant :)
 
> I guess my name is "somebody", then.  ;-) I've been working on them
> a bit, but I've been spending a fair amount of time lately on Q/A for
> the LaTeX documents.  My expectation is that I'll be able to generate
> more usable SGML from them.  The intention of the SGML conversion
> project is to move toward SGML for the official documentation sources.
> There's still a lot to do, though, mostly due to my schedule
> constraints.

Could you explain this a little more? I'm guessing that at the moment 
you're updating the LaTeX docs and once you've got those stable 
you'll convert them all to SGML, and use those as the base 
documentation? Sounds very sensible to me. LaTeX is not particularly 
easy to parse, I've found.
 
>> 1) There's a .tar.gz file to download; it contains lots of files
>> (nearly 2300, if I'm being honest) with 150ish directories at the
>> top level
> That is a lot of files....

It's an unusual system :)
 
>> build from "this morning"), very slow but entirely in Python. I will
>> release the source in the near future (hopefully within a month), 
>> but at 
> I look forward to this.  I hope you had the good sense to ignore
> partparse.py from the current distribution!  ;-)  (The LaTeX scanning
> isn't really that bad, but the code is unmaintainable!)

I'm afraid the code is about 99.8% based on me looking at LaTeX, 
guessing what the output would look like if I'd invented the 
language, then going about doing it. I had to look in the LaTeX2HTML 
source to guess what {\rm} could be, and every so often I referred to 
the original HTML documentation to check to see if something that I 
thought was wrong was intentional or not... This is probably not good 
programming practice :) Everything is a little fragile, because the 
parser was only ever intended to work with the PLR.

> I plan to make available a documentation release in HTML, LaTeX,
> PDF, and PostScript next week.

Good. How much has been updated since the last release?


Laurie

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________