[OT] Google URLs (was Re: popen eating quotes?)

Steven Taschuk staschuk at telusplanet.net
Sun Aug 3 19:45:00 EDT 2003


Quoth Jeff Epler:
> http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=0000161b%40bossar.com.pl&rnum=1&prev=/groups%3Fq%3Dshell%2Bquoting%2Bnt%2Bgroup:comp.lang.python%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D0000161b%2540bossar.com.pl%26rnum%3D1%26filter%3D0
> [apologies for the long URL, I don't know how to get a good memorable
> URL for a google groups search]

Not memorable, but at least shorter, is
    <http://groups.google.com/groups?threadm=0000161b%40bossar.com.pl>
In general all you need is the threadm or selm parameter, which
gives the message-id of the post in question; if memory serves,
threadm and selm differ in whether the resulting page shows the
rest of the thread.
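
For anyone who wants to do the trimming mechanically, here is a minimal
sketch in Python using the standard urllib.parse module.  The function
name shorten_groups_url is made up for illustration, and the code assumes
only what is described above: that keeping threadm (or, failing that,
selm) is enough to identify the post.

```python
from urllib.parse import urlsplit, parse_qs, urlencode


def shorten_groups_url(url):
    """Return the URL with only its threadm (or selm) parameter kept.

    Hypothetical helper: it relies on the empirical observation that
    one of these two parameters is all the page needs.
    """
    parts = urlsplit(url)
    # parse_qs percent-decodes values, so %40 becomes '@' here ...
    query = parse_qs(parts.query)
    for key in ("threadm", "selm"):
        if key in query:
            # ... and urlencode re-encodes it when the URL is rebuilt.
            short_query = urlencode({key: query[key][0]})
            return f"{parts.scheme}://{parts.netloc}{parts.path}?{short_query}"
    raise ValueError("no threadm or selm parameter found")


long_url = ("http://groups.google.com/groups?hl=en&lr=&ie=UTF-8"
            "&threadm=0000161b%40bossar.com.pl&rnum=1")
print(shorten_groups_url(long_url))
# prints http://groups.google.com/groups?threadm=0000161b%40bossar.com.pl
```

threadm is tried before selm only because the URL in question carries a
threadm; swap the order if you would rather link to a single message.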

(Ignore the selm in your URL above; it's inside the prev
parameter, which stores information about where you were before
arriving at the page the URL is actually for.)
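
To see that the selm really is nested inside prev, you can decode the
URL in two passes.  This is only a sketch: previous_page_params is a
hypothetical name, and the interpretation of prev as "the page you came
from" is the empirical guess described above, not documented behaviour.

```python
from urllib.parse import urlsplit, parse_qs

# The long URL quoted at the top of this message.
LONG_URL = (
    "http://groups.google.com/groups?hl=en&lr=&ie=UTF-8"
    "&threadm=0000161b%40bossar.com.pl&rnum=1"
    "&prev=/groups%3Fq%3Dshell%2Bquoting%2Bnt%2Bgroup:comp.lang.python"
    "%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D0000161b%2540bossar.com.pl"
    "%26rnum%3D1%26filter%3D0"
)


def previous_page_params(url):
    """Return the query parameters of the URL stored in 'prev'.

    Hypothetical helper.  Returns an empty dict if there is no prev.
    """
    outer = parse_qs(urlsplit(url).query)
    # The value of prev is itself a (relative) URL, percent-encoded once.
    prev = outer.get("prev", [""])[0]
    return parse_qs(urlsplit(prev).query)


outer = parse_qs(urlsplit(LONG_URL).query)
inner = previous_page_params(LONG_URL)
print("selm" in outer)   # the outer URL has threadm, not selm
print(inner["selm"])     # the selm belongs to the search-results page
```

The message-id is encoded twice in the original (%2540), which is why
it takes both passes to get back to the '@'.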

It seems that Google also assigns an id to each thread; with
suitable poking around (see, e.g., the source for the page at the
above URL) you can locate a 'th' parameter which can be used in
place of the threadm/selm parameter if you wish to refer to the
thread as a whole.  In your case that's
    <http://groups.google.com/groups?th=77576209b6262476>
(As this illustrates, the resulting URL is often shorter than one
using the message-id.)  The first ten messages in the thread
appear on the page obtained thus, with anchors, so you can refer
to them individually by appending '#link1', '#link2', etc.  This
makes nice short URLs too (though I'm not certain that each anchor
keeps pointing at the same message as the thread grows).
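
A small sketch of building such URLs, again in Python.  The function
name thread_url is hypothetical, and the limit of ten anchors is simply
taken from the observation above that only the first ten messages
appear on the page.

```python
def thread_url(thread_id, message_number=None):
    """Build a short URL for a thread, optionally for one message in it.

    Hypothetical helper based on the empirically discovered 'th'
    parameter and '#linkN' anchors; none of this is documented.
    """
    url = f"http://groups.google.com/groups?th={thread_id}"
    if message_number is not None:
        # Only the first ten messages were seen to get anchors.
        if not 1 <= message_number <= 10:
            raise ValueError("anchors were only observed for messages 1-10")
        url += f"#link{message_number}"
    return url


print(thread_url("77576209b6262476"))
# prints http://groups.google.com/groups?th=77576209b6262476
print(thread_url("77576209b6262476", 2))
# prints http://groups.google.com/groups?th=77576209b6262476#link2
```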

All discovered empirically.  Use at your own risk.

-- 
Steven Taschuk                  staschuk at telusplanet.net
"Telekinesis would be worth patenting."  -- James Gleick




