parsing "&A" in a string..
Tino Wildenhain
tino at wildenhain.de
Mon Sep 1 00:45:56 EDT 2008
Tim Roberts wrote:
> "bruce" <bedouglas at earthlink.net> wrote:
>> it's the beautifulsoup() that's taking the "&E" and giving the "&E;"...
>
> Right, as it should. "A&E" is not valid HTML, and beautifulsoup expects
> valid HTML.
>
> This can be difficult to fix in the general case, because your page might
> already contain "&amp;". If it is possible that some of them might be
> wrong while some are right, you can do something like:
>
> s = s.replace( '&amp;', '&' ).replace( '&', '&amp;' )
Yeah, but what about &auml; and friends then? As you said, it's not really
easy to fix.
Regards
Tino
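
For entities other than &amp;, one possible approach is to escape only
those ampersands that do not already start something shaped like a character
reference. The following is a minimal sketch, not something proposed in the
thread: the names BARE_AMP and escape_bare_ampersands are hypothetical, and
the pattern assumes a reference always looks like "&name;", "&#123;" or
"&#x7B;".

```python
import re

# Match an "&" that is NOT followed by something shaped like a character
# reference: a name ("amp;", "auml;"), a decimal ("#228;") or a hex ("#xE4;").
# Hypothetical helper names; this is an illustration, not a library API.
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)

def escape_bare_ampersands(s):
    """Replace bare '&' with '&amp;', leaving entity-shaped text untouched."""
    return BARE_AMP.sub("&amp;", s)

if __name__ == "__main__":
    for sample in ["A&E", "A&amp;E", "M&auml;dchen & Junge", "x &#228; y"]:
        print(repr(sample), "->", repr(escape_bare_ampersands(sample)))
```

The pattern checks only the shape of a reference, not whether the name is a
real entity, so text such as "&E;" is left alone. If that matters, the name
could additionally be checked against a table of known entity names.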