BeautifulSoup bug when ">>>" found in attribute value
Anne van Kesteren
annevankesteren at gmail.com
Thu Dec 28 04:02:41 EST 2006
Duncan Booth wrote:
> The /> was in the original input that you gave it:
>
> <param name="movie" value="/images/offersBanners/sw04.swf?binfot=We
> offer fantastic rates for selected weeks or days!!&blinkt=Click here
> >>>&linkurl=/Europe/Spain/Madrid/Apartments/Offer/2408" />
>
> You don't actually *have* to escape > when it appears in html.
You don't have to escape it in XML either, except in content when it's
preceded by ]], because ]]> is reserved for ending a CDATA section.
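
A quick way to check the XML rule is to ask a parser. This is only a sketch: it assumes Python 3 and the expat-based xml.etree.ElementTree from the standard library, and the helper name parses_as_xml and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(document):
    """Return True if the parser accepts the string as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, shortened from the markup discussed in this thread.
raw_gt_in_attribute = '<param name="movie" value="Click here >>>" />'
raw_gt_in_text = "<p>1 > 0</p>"
cdata_end_in_text = "<p>a ]]> b</p>"
escaped_cdata_end = "<p>a ]]&gt; b</p>"

for sample in (raw_gt_in_attribute, raw_gt_in_text,
               cdata_end_in_text, escaped_cdata_end):
    print(parses_as_xml(sample), sample)
```

Only the sample with a literal ]]> in element content should be rejected; a bare > elsewhere is accepted.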
> As I said before, it looks like BeautifulSoup decided that the tag ended
> at the first > although it took text beyond that up to the closing " as
> the value of the attribute. The remaining text was then simply treated
> as text content of the unclosed param tag. Finally it inserted a
> </param> to close the unclosed param tag.
The param element doesn't have a closing tag.
http://www.w3.org/TR/html401/struct/objects.html#h-13.3.2
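
For comparison, here is a sketch of how a parser that treats a quoted attribute value as a single unit handles that markup. It uses html.parser from the Python 3 standard library, not BeautifulSoup, so it says nothing about which BeautifulSoup versions have the problem. The class name TagRecorder is made up, and the attribute value is joined back onto one line.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Record start tags with their attributes, plus any non-blank text."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        if data.strip():
            self.text_chunks.append(data)


# The param element from the original report, with the wrapped lines joined.
markup = (
    '<param name="movie" value="/images/offersBanners/sw04.swf?binfot=We '
    'offer fantastic rates for selected weeks or days!!&blinkt=Click here '
    '>>>&linkurl=/Europe/Spain/Madrid/Apartments/Offer/2408" />'
)

recorder = TagRecorder()
recorder.feed(markup)
recorder.close()

tag, attributes = recorder.start_tags[0]
print(tag)
print(attributes["value"])
print(recorder.text_chunks)
```

If the tag had been cut at the first >, the rest of the value would show up in text_chunks; here the whole value stays inside the attribute.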
> Mind you, the sentence before that says 'should' for quoting < characters
> which is just plain silly.
For quoted attribute values it isn't silly at all. It's actually part
of how HTML works.
--
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>