URL parsing for the hard cases
nagle at animats.com
Sun Jul 22 19:56:37 CEST 2007
Is there something available that will parse the "netloc" field as
returned by urlparse, including all the hard cases? The "netloc" field
can potentially contain a port number and a numeric IP address. The
IP address may take many forms, including an IPv6 address.
I'm parsing URLs used by hostile sites, and the weird cases come up
all too frequently.