[New-bugs-announce] [issue38487] expat infinite loop

Marcos Dione report at bugs.python.org
Tue Oct 15 13:12:40 EDT 2019


New submission from Marcos Dione <mdione at grulic.org.ar>:

I'm trying to add external entities support to xmltodict[1]. For that I extended the handler to have an ExternalEntityRefHandler handler. After reading a couple of files, the script locks up in a tight loop.
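
For readers unfamiliar with the API, here is a minimal sketch of how an ExternalEntityRefHandler is typically wired up with pyexpat. It is not the xmltodict code from the report. The names FILES, MAIN, attach and parse are made up for illustration, and an in-memory dict stands in for the real files on disk. Each parser gets its own handler so that nested entities create their sub-parser from the parser that is currently reading them.

```python
import io
import xml.parsers.expat as expat

# Hypothetical in-memory "files" standing in for the external entity documents.
FILES = {
    "child.xml": b"<child>hello</child>",
}

# Hypothetical main document declaring one external general entity.
MAIN = b"""<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY child SYSTEM "child.xml">
]>
<root>&child;</root>
"""


def parse(data):
    """Parse data and return the list of start/end element events seen."""
    events = []

    def attach(parser):
        # Install the same element handlers on every parser, plus an
        # external entity handler bound to this particular parser.
        parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
        parser.EndElementHandler = lambda name: events.append(("end", name))
        parser.ExternalEntityRefHandler = make_ref_handler(parser)

    def make_ref_handler(parent):
        def handler(context, base, system_id, public_id):
            # Create the sub-parser from the parser that hit the reference.
            child = parent.ExternalEntityParserCreate(context)
            attach(child)
            child.ParseFile(io.BytesIO(FILES[system_id]))
            return 1  # non-zero tells expat the entity was handled
        return handler

    root = expat.ParserCreate()
    attach(root)
    root.Parse(data, True)
    return events


if __name__ == "__main__":
    for event in parse(MAIN):
        print(event)
```

In this sketch every sub-parser is created, used and dropped inside the handler call, so its lifetime is nested inside its parent's. Whether the script in the report keeps parsers alive in the same way is not something this example can show.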

I ran the script with gdb (!!) and found that expat thinks that two of the parsers are parents of each other. I set up a breakpoint in XML_ExternalEntityParserCreate() (yes, this is expat, I know) right after the new parser uses the old parser as parent (xmlparse.c:1279 on my system).

Here are the backtraces and values I found:

--- >8 ---
landuse-lowzoom None styles-otm/landuse-lowzoom.xml None

#0  XML_ExternalEntityParserCreate (oldParser=0xadc4d0, context=context at entry=0x7ffff6c871e0 "landuse-lowzoom", encodingName=encodingName at entry=0x0) at ../../src/lib/xmlparse.c:1281
#1  0x000000000044ec90 in pyexpat_xmlparser_ExternalEntityParserCreate_impl (encoding=0x0, context=0x7ffff6c871e0 "landuse-lowzoom", self=0x7ffff6d556e0) at ../Modules/pyexpat.c:943
#2  pyexpat_xmlparser_ExternalEntityParserCreate (self=0x7ffff6d556e0, args=<optimized out>, nargs=<optimized out>) at ../Modules/clinic/pyexpat.c.h:137
[...]
#15 0x000000000044d80d in my_ExternalEntityRefHandler (parser=<optimized out>, context=0xae1d2c "landuse-lowzoom", base=<optimized out>, systemId=<optimized out>, publicId=<optimized out>)
    at ../Modules/pyexpat.c:659
#16 0x00007ffff7d990c8 in doContent (parser=parser at entry=0xadc4d0, startTagLevel=startTagLevel at entry=0, enc=<optimized out>,
    s=s at entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=nextPtr at entry=0xadc500, haveMore=1 '\001') at ../../src/lib/xmlparse.c:2685
#17 0x00007ffff7d9957c in contentProcessor (parser=parser at entry=0xadc4d0,
    start=start at entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., endPtr=endPtr at entry=0xadc500) at ../../src/lib/xmlparse.c:2444
#18 0x00007ffff7d96a73 in doProlog (parser=parser at entry=0xadc4d0, enc=0x7ffff7db89e0 <utf8_encoding>,
    s=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "...,
    s at entry=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., tok=29, next=<optimized out>, nextPtr=0xadc500, haveMore=1 '\001', allowClosingDoctype=1 '\001') at ../../src/lib/xmlparse.c:4371
#19 0x00007ffff7d97f3a in prologProcessor (parser=0xadc4d0,
    s=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=0xadc500) at ../../src/lib/xmlparse.c:4094
#20 0x00007ffff7d9bb1c in XML_ParseBuffer (isFinal=0, len=<optimized out>, parser=0xadc4d0) at ../../src/lib/xmlparse.c:1893
#21 XML_ParseBuffer (parser=0xadc4d0, len=len at entry=2048, isFinal=isFinal at entry=0) at ../../src/lib/xmlparse.c:1863
#22 0x000000000060886d in pyexpat_xmlparser_ParseFile (self=0x7ffff6d556e0, file=<optimized out>) at ../Modules/pyexpat.c:841

(gdb) print oldParser
$33 = (XML_Parser) 0xadc4d0
(gdb) print parser
$32 = (XML_Parser) 0xadecb0

7ffff6d556e0, 7ffff6d55750
<_io.BufferedReader name='styles-otm/landuse-lowzoom.xml'>
<_io.BufferedReader name='styles-otm/landuse-lowzoom.xml'>
landuse None styles-otm/landuse.xml None

#0  XML_ExternalEntityParserCreate (oldParser=0xadecb0, context=context at entry=0x7ffff6c88660 "landuse", encodingName=encodingName at entry=0x0) at ../../src/lib/xmlparse.c:1281
#1  0x000000000044ec90 in pyexpat_xmlparser_ExternalEntityParserCreate_impl (encoding=0x0, context=0x7ffff6c88660 "landuse", self=0x7ffff6d55750) at ../Modules/pyexpat.c:943
#2  pyexpat_xmlparser_ExternalEntityParserCreate (self=0x7ffff6d55750, args=<optimized out>, nargs=<optimized out>) at ../Modules/clinic/pyexpat.c.h:137
[...]
#15 0x000000000044d80d in my_ExternalEntityRefHandler (parser=<optimized out>, context=0xae1d2c "landuse", base=<optimized out>, systemId=<optimized out>, publicId=<optimized out>) at ../Modules/pyexpat.c:659
#16 0x00007ffff7d990c8 in doContent (parser=parser at entry=0xadc4d0, startTagLevel=startTagLevel at entry=0, enc=<optimized out>,
    s=s at entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=nextPtr at entry=0xadc500, haveMore=1 '\001') at ../../src/lib/xmlparse.c:2685
#17 0x00007ffff7d9957c in contentProcessor (parser=parser at entry=0xadc4d0,
    start=start at entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., endPtr=endPtr at entry=0xadc500) at ../../src/lib/xmlparse.c:2444
#18 0x00007ffff7d96a73 in doProlog (parser=parser at entry=0xadc4d0, enc=0x7ffff7db89e0 <utf8_encoding>,
    s=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "...,
    s at entry=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., tok=29, next=<optimized out>, nextPtr=0xadc500, haveMore=1 '\001', allowClosingDoctype=1 '\001') at ../../src/lib/xmlparse.c:4371
#19 0x00007ffff7d97f3a in prologProcessor (parser=0xadc4d0,
    s=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=0xadc500) at ../../src/lib/xmlparse.c:4094
#20 0x00007ffff7d9bb1c in XML_ParseBuffer (isFinal=0, len=<optimized out>, parser=0xadc4d0) at ../../src/lib/xmlparse.c:1893
#21 XML_ParseBuffer (parser=0xadc4d0, len=len at entry=2048, isFinal=isFinal at entry=0) at ../../src/lib/xmlparse.c:1863
#22 0x000000000060886d in pyexpat_xmlparser_ParseFile (self=0x7ffff6d556e0, file=<optimized out>) at ../Modules/pyexpat.c:841

(gdb) print oldParser
$35 = (XML_Parser) 0xadecb0
(gdb) print parser
$34 = (XML_Parser) 0xae5e00

7ffff6d55750, 7ffff6d557c0
<_io.BufferedReader name='styles-otm/landuse.xml'>
<_io.BufferedReader name='styles-otm/landuse.xml'>
landuse-over-hillshade None styles-otm/landuse-over-hillshade.xml None

#0  XML_ExternalEntityParserCreate (oldParser=0xae5e00, context=context at entry=0x7ffff6c81a60 "landuse-over-hillshade", encodingName=encodingName at entry=0x0) at ../../src/lib/xmlparse.c:1281
#1  0x000000000044ec90 in pyexpat_xmlparser_ExternalEntityParserCreate_impl (encoding=0x0, context=0x7ffff6c81a60 "landuse-over-hillshade", self=0x7ffff6d557c0) at ../Modules/pyexpat.c:943
#2  pyexpat_xmlparser_ExternalEntityParserCreate (self=0x7ffff6d557c0, args=<optimized out>, nargs=<optimized out>) at ../Modules/clinic/pyexpat.c.h:137
[...]
#15 0x000000000044d80d in my_ExternalEntityRefHandler (parser=<optimized out>, context=0xae1d2c "landuse-over-hillshade", base=<optimized out>, systemId=<optimized out>, publicId=<optimized out>)
    at ../Modules/pyexpat.c:659
#16 0x00007ffff7d990c8 in doContent (parser=parser at entry=0xadc4d0, startTagLevel=startTagLevel at entry=0, enc=<optimized out>,
    s=s at entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=nextPtr at entry=0xadc500, haveMore=1 '\001') at ../../src/lib/xmlparse.c:2685
#17 0x00007ffff7d9957c in contentProcessor (parser=parser at entry=0xadc4d0,
    start=start at entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., endPtr=endPtr at entry=0xadc500) at ../../src/lib/xmlparse.c:2444
#18 0x00007ffff7d96a73 in doProlog (parser=parser at entry=0xadc4d0, enc=0x7ffff7db89e0 <utf8_encoding>,
    s=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "...,
    s at entry=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=end at entry=0xae0ce6 '\001' <repeats 200 times>..., tok=29, next=<optimized out>, nextPtr=0xadc500, haveMore=1 '\001', allowClosingDoctype=1 '\001') at ../../src/lib/xmlparse.c:4371
#19 0x00007ffff7d97f3a in prologProcessor (parser=0xadc4d0,
    s=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=0xadc500) at ../../src/lib/xmlparse.c:4094
#20 0x00007ffff7d9bb1c in XML_ParseBuffer (isFinal=0, len=<optimized out>, parser=0xadc4d0) at ../../src/lib/xmlparse.c:1893
#21 XML_ParseBuffer (parser=0xadc4d0, len=len at entry=2048, isFinal=isFinal at entry=0) at ../../src/lib/xmlparse.c:1863
#22 0x000000000060886d in pyexpat_xmlparser_ParseFile (self=0x7ffff6d556e0, file=<optimized out>) at ../Modules/pyexpat.c:841

(gdb) print oldParser
$36 = (XML_Parser) 0xae5e00
(gdb) print parser
$37 = (XML_Parser) 0xadecb0
--- 8< ---

As I hope you can see, the last two values (parent 0xae5e00, new 0xadecb0) are the exact opposite of the previous ones (parent 0xadecb0, new 0xae5e00). Later, when get_hash_secret_salt() is called, it enters an infinite loop climbing up the parent ladder.
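
To make the cycle concrete, here is a toy model in Python. It is not expat's code, only an illustration: it records the (oldParser, parser) pairs printed by gdb above as parent links keyed by address, then walks up the chain the way a "find the root parser" loop would. The helper names build_parents and find_root are made up for this sketch, and treating 0xadc4d0 as the root parser is an assumption based on it being the outermost parser in the backtraces. A visited set turns what would be an endless walk into an error.

```python
def build_parents(pairs, root):
    """Map each parser address to its parent address; the root has no parent."""
    parents = {root: None}
    for old, new in pairs:
        # A later pair for the same address overwrites the earlier link,
        # which is what happens when an address shows up a second time.
        parents[new] = old
    return parents


def find_root(parents, start):
    """Climb parent links from start; raise if the chain loops back."""
    seen = set()
    node = start
    while parents.get(node) is not None:
        if node in seen:
            raise RuntimeError(f"parent cycle detected at {node:#x}")
        seen.add(node)
        node = parents[node]
    return node


# (oldParser, parser) pairs as printed by gdb in the report, in order.
OBSERVED = [
    (0xADC4D0, 0xADECB0),
    (0xADECB0, 0xAE5E00),
    (0xAE5E00, 0xADECB0),
]

if __name__ == "__main__":
    # After the first two calls the chain still ends at the root parser.
    print(hex(find_root(build_parents(OBSERVED[:2], 0xADC4D0), 0xAE5E00)))
    # After the third call the two sub-parsers point at each other.
    try:
        find_root(build_parents(OBSERVED, 0xADC4D0), 0xADECB0)
    except RuntimeError as exc:
        print(exc)
```

Without the visited set, the while loop in find_root would never exit on the third state, which matches the tight loop seen in the script.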

Now, this looks like an expat issue and not a pyexpat one, but given that pyexpat provides its own allocator, and that the parser addresses are returned by it, I will start by opening this issue here. If it can be proven that it's an expat issue, I'll take it to their issue tracker.

-----
[1] https://github.com/martinblech/xmltodict/issues/226

----------
components: XML
files: expat.tar.gz
messages: 354747
nosy: StyXman
priority: normal
severity: normal
status: open
title: expat infinite loop
type: behavior
versions: Python 3.7
Added file: https://bugs.python.org/file48663/expat.tar.gz

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38487>
_______________________________________

