parsing "&A" in a string..
Tim Roberts
timr at probo.com
Mon Sep 1 00:26:48 EDT 2008
"bruce" <bedouglas at earthlink.net> wrote:
>
>it's the beautifulsoup() that's taking the "&E" and giving the "&E;"...
Right, as it should. "A&E" is not valid HTML, and beautifulsoup expects
valid HTML.
This can be difficult to fix in the general case, because your page might
already contain "&amp;". If it is possible that some of them might be
wrong while some are right, you can do something like:
s = s.replace( '&amp;', '&' ).replace( '&', '&amp;' )
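To spell the idea out: the first replace collapses any already-escaped ampersands back to a bare "&", and the second escapes every ampersand exactly once, so nothing ends up double-escaped. Below is a minimal sketch of that, plus a hedged regex variant for pages that also contain other entities such as "&lt;", which the two-step replace would turn into "&amp;lt;". The function names are made up for illustration and are not part of BeautifulSoup or any other library.

```python
import re

# An "&" that does NOT start a named (&amp;), decimal (&#38;) or
# hex (&#x26;) character reference.
_BARE_AMP = re.compile(
    r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)'
)

def normalize_ampersands(s):
    """Two-step replace: un-escape first, then escape every '&' once.

    Hypothetical helper name. Note that it also rewrites other
    entities, e.g. '&lt;' becomes '&amp;lt;'.
    """
    return s.replace('&amp;', '&').replace('&', '&amp;')

def escape_bare_ampersands(s):
    """Escape only ampersands that are not already part of an entity.

    Hypothetical helper name; an alternative sketch, not the method
    from the message above.
    """
    return _BARE_AMP.sub('&amp;', s)

if __name__ == '__main__':
    page = 'A&E and B&amp;B, 1 &lt; 2'
    print(normalize_ampersands(page))
    print(escape_bare_ampersands(page))
```

Either function would be applied to the raw page text before handing it to the parser. The regex version only checks that something looks like an entity; it does not verify that a name such as "&foo;" is actually defined in HTML.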
--
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.