How to convert "&nbsp;" in a string to blank space?
wittempj@hotmail.com
martin.witte at gmail.com
Mon Oct 30 13:12:39 EST 2006
On Oct 30, 6:44 pm, "一首诗" <newpt... at gmail.com> wrote:
> Oh, I didn't make myself clear.
>
> What I mean is how to convert a piece of html to plain text but keep as
> much formatting as possible.
>
> Such as convert "&nbsp;" to blank space and convert <br> to "\r\n"
>
Then you can explore the HTMLParser module,
http://docs.python.org/lib/module-HTMLParser.html, for example:
#!/usr/bin/env python
from HTMLParser import HTMLParser

parsedtext = ''

class Parser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # Turn <br> into a CRLF line break.
        global parsedtext
        if tag == 'br':
            parsedtext += '\r\n'

    def handle_data(self, data):
        global parsedtext
        parsedtext += data

    def handle_entityref(self, name):
        # Turn &nbsp; into an ordinary space.
        global parsedtext
        if name == 'nbsp':
            parsedtext += ' '

x = Parser()
x.feed('An&nbsp;text<br>')
print parsedtext
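
For readers on Python 3, the same idea can be sketched roughly as follows. This is only an illustration, not part of the original reply: the class name TextExtractor and the helper html_to_text are made up here, and the module is html.parser rather than HTMLParser. It assumes convert_charrefs=False is passed so that handle_entityref() is still called for &nbsp;.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect plain text, turning <br> into CRLF and &nbsp; into a space."""

    def __init__(self):
        # convert_charrefs=False keeps handle_entityref() in play;
        # with the default (True) entities are decoded before handle_data().
        super().__init__(convert_charrefs=False)
        self.parts = []  # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == 'br':
            self.parts.append('\r\n')

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        # Only &nbsp; is handled in this sketch; other entities are dropped.
        if name == 'nbsp':
            self.parts.append(' ')

    def text(self):
        return ''.join(self.parts)


def html_to_text(markup):
    # Hypothetical convenience wrapper around the parser above.
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


print(repr(html_to_text('An&nbsp;text<br>')))
```

Keeping the collected text on the parser instance avoids the module-level global, so several documents can be converted independently.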
> Gary Herron wrote:
> > 一首诗 wrote:
> > > Is there any simple way to solve this problem?
>
> > Yes, strings have a replace method:
>
> > >>> s = "abc&nbsp;def"
> > >>> s.replace('&nbsp;',' ')
> > 'abc def'
>
> > Also various modules that are meant to deal with web and xml and such
> > have functions to do such operations.
>
> > Gary Herron
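
A small sketch of the two approaches Gary Herron mentions, written for Python 3. The sample strings are made up for illustration, and html.unescape() is just one example of a standard-library function for this; the quoted message does not name a specific one.

```python
import html

s = "abc&nbsp;def"

# Plain string replacement: only the literal text "&nbsp;" is touched.
replaced = s.replace('&nbsp;', ' ')

# html.unescape() decodes every entity. &nbsp; becomes U+00A0
# (a non-breaking space), so swap that for an ordinary space afterwards.
unescaped = html.unescape("abc&nbsp;def &amp; more").replace('\xa0', ' ')

print(replaced)
print(unescaped)
```

The replace method is enough when &nbsp; is the only entity of interest; a decoding function is the safer choice when the text may contain others such as &amp; or &lt;.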