Check URL --> Simply? (fwd)
bokr at accessone.com
Sat Aug 18 05:52:31 CEST 2001
On Thu, 16 Aug 2001 11:02:10 -0700, Jeff Shannon <jeff at ccvcorp.com> wrote:
>David Eppstein wrote:
>> In article <tnmkhd51goof0b at xo.supernews.co.uk>,
>> "Julius Welby" <jwelby at waitrose.com> wrote:
>> > > Ever more opportunity at shameless self-promotion. This zillion special
>> > > cases of 404-ish pages is something I use as an example in my
>> > > forthcoming book _Text Processing in Python_ (a few more months until
>> > > done). Here's the code I present as an attempt at recognizing what only
>> > > humans can:
>> Humans can't usually see the http response line with the actual 404 number
>> in it in place of the 200 indicating an ok page, but machines can -- why
>> don't you use that?
>Among other things, someone already pointed out that certain "helpful"
>webservers will not only generate pretty error pages, but will return those
>pages with a 200 number instead of a 404 number. So, while checking the code is
>a good idea, it is not sufficient.
Probably one reason the "helpful" webservers put a 200 status on their error pages
is to get past certain "helpful" browsers that see a 404 and ignore the immediately
following (and possibly actually helpful) HTML in favor of their own glitzy boilerplate.
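
To make the "check the code, but don't stop there" point from this thread concrete, here is a minimal sketch in present-day Python. It is not the code from the book mentioned above. The names looks_like_error_page, check_url and ERROR_PHRASES are hypothetical, and the phrase list is only a guess at what pretty error pages tend to say, so expect both false positives and misses.

```python
import urllib.error
import urllib.request

# Hypothetical phrases that tend to show up on "pretty" error pages.
# This list is an assumption for illustration, not an exhaustive set.
ERROR_PHRASES = ("page not found", "not found", "error 404", "no longer available")


def looks_like_error_page(status, body):
    """Return True if the response is probably an error page."""
    if status >= 400:
        # The server admitted the error in the status line.
        return True
    # "Helpful" server case: status says OK, but the text reads like an error.
    text = body.lower()
    return any(phrase in text for phrase in ERROR_PHRASES)


def check_url(url, timeout=10):
    """Fetch url and return (status, probably_error)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            # Only the start of the page is inspected; latin-1 never fails to decode.
            body = resp.read(4096).decode("latin-1")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error object still carries code and body.
        status = err.code
        body = err.read(4096).decode("latin-1")
    return status, looks_like_error_page(status, body)
```

Calling check_url on a page would give back the raw status together with the guess, so a caller can still tell a real 404 from a 200 that merely looks like one. The phrase matching is the weak part and would need tuning against real pages.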