[issue1663] Modification HTMLParser.py

diego report at bugs.python.org
Wed Dec 19 21:07:19 CET 2007


diego added the comment:

---------- Forwarded message ----------
From: diego <report at bugs.python.org>
Date: 19-Dec-2007 17:05
Subject: [issue1663] Modification HTMLParser.py
To: diego.arias at gmail.com

New submission from diego:

Hello, my name is Diego. I needed to parse HTML to retrieve only the text,
but could not work out how to do it with the HTMLParser class, so I changed
the class to do it. The code to use is:
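import HTMLParser  # Python 2 module name

class ParsearHTML(HTMLParser.HTMLParser):

    def __init__(self, datos):
        HTMLParser.HTMLParser.__init__(self)
        self.feed(datos)
        self.close()

    def handle_data(self, data):
        # Presumably relies on the attached, modified HTMLParser.py; the
        # stock class discards the return value of handle_data().
        return data

# onTmp is presumably the HTML source string, defined elsewhere.
parser = ParsearHTML(onTmp)
data = parser.feed(onTmp)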
The changes to the class are attached. Thank you very much. Diego.
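
For comparison, text-only extraction may also be possible without
modifying the library, by collecting the data in the subclass instead of
returning it. The following is only a minimal sketch under assumptions:
the names TextExtractor and html_to_text are hypothetical, and it uses
the Python 3 module name html.parser (in Python 2 the module is
HTMLParser).

```python
from html.parser import HTMLParser  # Python 3 name; Python 2 uses HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # The stock parser ignores this method's return value,
        # so the text is stored on the instance instead.
        self.chunks.append(data)

    def text(self):
        return "".join(self.chunks)


def html_to_text(html):
    """Feed the whole document once and return the collected text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(html_to_text("<p>Hello <b>world</b>!</p>"))  # prints: Hello world!
```

Keeping the text on the instance means feed() can be called more than
once before close(), and the caller reads the result when parsing is
done.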

----------
components: None
files: HTMLParser.py
messages: 58821
nosy: diegorubenarias
severity: normal
status: open
title: Modification HTMLParser.py
type: resource usage
versions: Python 2.4
Added file: http://bugs.python.org/file9000/HTMLParser.py

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1663>
__________________________________

Added file: http://bugs.python.org/file9002/unnamed
Added file: http://bugs.python.org/file9003/HTMLParser.py

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: unnamed
Url: http://mail.python.org/pipermail/python-bugs-list/attachments/20071219/8dd9ae77/attachment.txt 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: HTMLParser.py
Type: text/x-python
Size: 13168 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-bugs-list/attachments/20071219/8dd9ae77/attachment.py 

