[Python-bugs-list] [ python-Bugs-500073 ] HTMLParser fail to handle '&foobar'
noreply@sourceforge.net
Tue, 08 Jan 2002 13:03:12 -0800
Bugs item #500073, was opened at 2002-01-06 00:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=500073&group_id=5470
Category: Extension Modules
Group: Python 2.1.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Bernard YUE (berniey)
Assigned to: Skip Montanaro (montanaro)
Summary: HTMLParser fail to handle '&foobar'
Initial Comment:
HTMLParser does not distinguish between &foobar; and
&foobar. The latter is still treated as a
charref/entityref. Below is my proposed fix:
File: sgmllib.py
# SGMLParser.goahead()
# line 162-176
# from
elif rawdata[i] == '&':
    match = charref.match(rawdata, i)
    if match:
        name = match.group(1)
        self.handle_charref(name)
        i = match.end(0)
        if rawdata[i-1] != ';': i = i-1
        continue
    match = entityref.match(rawdata, i)
    if match:
        name = match.group(1)
        self.handle_entityref(name)
        i = match.end(0)
        if rawdata[i-1] != ';': i = i-1
        continue
# to
elif rawdata[i] == '&':
    match = charref.match(rawdata, i)
    if match:
        if rawdata[match.end(0)-1] != ';':
            # not really a charref: pass the '&' through as data
            self.handle_data(rawdata[i])
            i = i+1
        else:
            name = match.group(1)
            self.handle_charref(name)
            i = match.end(0)
        continue
    match = entityref.match(rawdata, i)
    if match:
        if rawdata[match.end(0)-1] != ';':
            # not really an entityref: pass the '&' through as data
            self.handle_data(rawdata[i])
            i = i+1
        else:
            name = match.group(1)
            self.handle_entityref(name)
            i = match.end(0)
        continue
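
The behaviour proposed above can be tried outside sgmllib with a small
standalone sketch. Everything in it is an assumption made for illustration:
the tokenize() function and the CHARREF/ENTITYREF patterns are hypothetical
and are not the patterns or the code in sgmllib.py. It only shows the rule
"a reference counts only when it ends in ';', otherwise the '&' is plain
data".

```python
import re

# Hypothetical patterns for this sketch only; they are NOT sgmllib's own.
# Group 1 is the reference name, group 2 is the optional terminating ';'.
CHARREF = re.compile(r'&#([0-9]+)(;?)')
ENTITYREF = re.compile(r'&([a-zA-Z][-.a-zA-Z0-9]*)(;?)')


def tokenize(rawdata):
    """Split rawdata into ('data' | 'charref' | 'entityref', value) tuples.

    A reference is reported only when it is terminated by ';'. Otherwise
    the '&' is emitted as data and scanning resumes one character later,
    mirroring the handle_data(rawdata[i]); i = i+1 step in the patch.
    """
    tokens = []
    i = 0
    n = len(rawdata)
    while i < n:
        if rawdata[i] == '&':
            handled = False
            for kind, pattern in (('charref', CHARREF),
                                  ('entityref', ENTITYREF)):
                match = pattern.match(rawdata, i)
                if match and match.group(2) == ';':
                    tokens.append((kind, match.group(1)))
                    i = match.end(0)
                    handled = True
                    break
            if not handled:
                # Unterminated or malformed reference: treat '&' as data.
                tokens.append(('data', '&'))
                i += 1
        else:
            # Plain text up to the next '&' (or the end of the input).
            j = rawdata.find('&', i)
            if j == -1:
                j = n
            tokens.append(('data', rawdata[i:j]))
            i = j
    return tokens


if __name__ == '__main__':
    for token in tokenize('a &foobar; b &foobar c'):
        print(token)
```

With this sketch the terminated "&foobar;" comes back as an entityref
token, while the unterminated "&foobar" comes back as the data "&"
followed by the data "foobar c".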
----------------------------------------------------------------------
>Comment By: Skip Montanaro (montanaro)
Date: 2002-01-08 13:03
Message:
Logged In: YES
user_id=44345
Bernie,
I see nothing wrong in principle with recognizing "&nbsp" when the
user should have typed "&nbsp;", but I wonder about the validity of
"&nbsp". You mentioned it's still a charref or entityref. Is that
documented somewhere or is it simply a practical approach to a
common problem?
Thanks,
Skip
----------------------------------------------------------------------