[ python-Bugs-1122916 ] incorrect handle of declaration in markupbase
SourceForge.net
noreply at sourceforge.net
Tue Feb 15 18:09:05 CET 2005
Bugs item #1122916, was opened at 2005-02-14 23:04
Message generated for change (Comment added) made by tungwaiyip
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1122916&group_id=5470
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Wai Yip Tung (tungwaiyip)
Assigned to: Nobody/Anonymous (nobody)
Summary: incorrect handle of declaration in markupbase
Initial Comment:
When parsing the document below using sgmllib:
<html>
<!-BAD COMMENT->hello
</html>
The malformed declaration is returned together with hello as a
single piece of character data:
"<!-BAD COMMENT->hello"
markupbase should have treated it as an error (to be
consistent with its strict treatment in _scan_name).
I believe line 73 of markupbase.py should be
if rawdata[j:j+2] in ("-", ""):
instead of
if rawdata[j:j+1] in ("-", ""):
Also note that the condition in line 79 can never be true,
since a one-character slice cannot equal '--':
if rawdata[j:j+1] == '--'
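The slice behaviour described above can be checked in isolation. The following is a minimal sketch, not code from markupbase.py: it assumes j is the index just past "<!", and every name other than rawdata and j is illustrative.

```python
# Sample input from the report, starting at the malformed declaration.
rawdata = "<!-BAD COMMENT->hello"
j = 2  # assumed: index just past "<!", as in the report

one_char = rawdata[j:j+1]  # slice used by the check quoted for line 73
two_char = rawdata[j:j+2]  # slice proposed in the report

# Check as quoted for line 73: fires on a single "-" after "<!".
current_check = one_char in ("-", "")

# Check with the proposed two-character slice: does not fire here.
proposed_check = two_char in ("-", "")

# Condition as quoted for line 79: a slice of width one holds at most
# one character, so it cannot equal the two-character string "--".
line_79_can_match = any(s[0:1] == "--" for s in ("--", "-", "", "-B"))

print(repr(one_char), current_check)
print(repr(two_char), proposed_check)
print(line_79_can_match)
```

With the one-character slice the check fires on the sample input; with the two-character slice it does not, so the malformed declaration would fall through to the later, stricter parsing.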
----------------------------------------------------------------------
>Comment By: Wai Yip Tung (tungwaiyip)
Date: 2005-02-15 09:09
Message:
Logged In: YES
user_id=561546
To clarify the symptom: everything after the <!- is actually
returned as a single piece of character data:
"<!-BAD COMMENT->hello\r\n</html>"
This means that once a <!- appears, all following tags such as
</html> are parsed as character data rather than as tags. That's
why I think it is a significant bug to report.
----------------------------------------------------------------------