[New-bugs-announce] [issue32072] Issues with binary plists

Serhiy Storchaka report at bugs.python.org
Sat Nov 18 14:06:16 EST 2017


New submission from Serhiy Storchaka <storchaka+cpython at gmail.com>:

plistlib creates new objects when it reads references instead of
reusing the already-read object.

As a result, it doesn't preserve identity:

>>> import plistlib
>>> a = [['spam']]*2
>>> a[0] is a[1]
True
>>> b = plistlib.loads(plistlib.dumps(a, fmt=plistlib.FMT_BINARY))
>>> b == a
True
>>> b[0] is b[1]
False

And plistlib.loads() is vulnerable to plists containing cyclic
references (as was exposed in issue31897). For example,
plistlib.loads(b'bplist00\xa1\x00\x08\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0a')
could return a list containing itself, but instead it fails with
RecursionError.
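
To illustrate what "using the already read object" would mean, here is a minimal sketch on a toy object table. It is not the real binary plist format and not plistlib's code: the table layout and the resolve() helper are assumptions made up for this example. Each entry is built once and remembered by its index, and a list is registered before its items are resolved, so a self-reference finds the list under construction.

```python
# Toy model (hypothetical, not the real on-disk layout): each table entry is
# either ("list", [indices of other entries]) or ("value", obj).

def resolve(table, index, memo=None):
    """Build the object at `index`, reusing objects that were already built."""
    if memo is None:
        memo = {}
    if index in memo:
        return memo[index]
    kind, payload = table[index]
    if kind == "value":
        memo[index] = payload
        return payload
    result = []
    memo[index] = result  # register before descending, so cycles terminate
    for child in payload:
        result.append(resolve(table, child, memo))
    return result

# Two references to the same inner list: identity is preserved.
shared = [("list", [1, 1]), ("list", [2]), ("value", "spam")]
b = resolve(shared, 0)
print(b[0] is b[1])  # True

# A list whose only item refers back to the list itself.
cyclic = [("list", [0])]
c = resolve(cyclic, 0)
print(c[0] is c)  # True
```

This sketch still recurses once per nesting level, so a very deeply nested (but acyclic) table could hit the recursion limit; it only removes the unbounded recursion caused by cycles.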

plistlib.dumps() preserves references in the output, but it saves
redundant copies. For example, plistlib.dumps([[]]*5,
fmt=plistlib.FMT_BINARY) saves a list containing 5 identical empty
lists, but it writes the empty list 5 times, and only the last copy is
used. The other 4 copies are not referenced; they only waste file
space and reference numbers. Saving
[[[[['spam']*100]*100]*100]*100]*100 will result in a multigigabyte
file, while less than a kilobyte would be enough to save it. Loading a
properly saved [[[[['spam']*100]*100]*100]*100]*100 with the current
plistlib.loads() will consume many gigabytes of memory.
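
The gap between the two sizes can be estimated without calling plistlib at all. The sketch below counts objects only, not bytes; the helper names expanded_size() and unique_objects() are made up for this example, and it assumes acyclic data built from lists and scalars.

```python
def expanded_size(obj, _memo=None):
    """Count objects as if every reference were written out as its own copy.

    Results are cached per object identity, so the count is computed
    without actually walking the fully expanded tree. Acyclic data only.
    """
    if _memo is None:
        _memo = {}
    key = id(obj)
    if key in _memo:
        return _memo[key]
    if isinstance(obj, list):
        size = 1 + sum(expanded_size(item, _memo) for item in obj)
    else:
        size = 1
    _memo[key] = size
    return size


def unique_objects(obj, _seen=None):
    """Count distinct objects (by identity) reachable from obj."""
    if _seen is None:
        _seen = set()
    if id(obj) in _seen:
        return 0
    _seen.add(id(obj))
    count = 1
    if isinstance(obj, list):
        for item in obj:
            count += unique_objects(item, _seen)
    return count


data = [[[[['spam'] * 100] * 100] * 100] * 100] * 100
print(expanded_size(data))   # 10101010101
print(unique_objects(data))  # 6
```

About ten billion expanded objects against 6 distinct ones (5 lists and one string, joined by 500 references) is the difference between a multigigabyte result and one that fits in well under a kilobyte.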

1. The issues with plistlib.dumps() are:
1a) Inefficient saving of data with references. This is a minor resource usage issue.
1b) Inability to save data with cyclic references. This is a
missing feature.
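
One way to address both 1a and 1b is to assign each distinct object a single slot in the object table before writing. The following is only a sketch on a toy table, not plistlib's writer: the flatten() helper and the ("list", ...) / ("value", ...) entries are assumptions for illustration, and objects are deduplicated by identity only.

```python
def flatten(root):
    """Return a toy object table in which each distinct object appears once."""
    table = []
    index_of = {}  # id(obj) -> index of its entry in `table`

    def visit(obj):
        key = id(obj)
        if key in index_of:
            return index_of[key]  # already assigned: emit a reference only
        index = len(table)
        index_of[key] = index  # assign the slot before descending (cycles)
        if isinstance(obj, list):
            refs = []
            table.append(("list", refs))
            for item in obj:
                refs.append(visit(item))
        else:
            table.append(("value", obj))
        return index

    visit(root)
    return table


print(flatten([[]] * 5))  # [('list', [1, 1, 1, 1, 1]), ('list', [])]

cyc = []
cyc.append(cyc)
print(flatten(cyc))  # [('list', [0])]
```

With this approach the empty list from the [[]]*5 example is stored once and referenced 5 times, and a self-referencing list gets a table entry that points at its own index.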

2. The issues with plistlib.loads() are:
2a) Inefficient loading of data with references. This can be not just a
resource usage issue, but a security issue. Loading malicious input
data smaller than 100 bytes ([[[...]*2]*2]*2) can consume many
gigabytes of memory.
2b) Inability to load data with cyclic references. This is a
missing feature, but it can also be a lesser security issue. A small
malicious input can cause RecursionError. If the recursion limit is set
high and you are unlucky, it can cause a stack overflow.

Security issues affect you only when you load plists from untrusted sources.
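
Until the loader is fixed, code that must read untrusted plists can at least turn the RecursionError from 2b into an ordinary, expected error. This is a partial mitigation sketch only: the load_untrusted_plist() wrapper and the MAX_PLIST_BYTES cap are hypothetical names invented here, and the wrapper does nothing against the memory blow-up in 2a, where a tiny input expands into a huge object tree.

```python
import plistlib

# Hypothetical cap on input size; choose a value that suits the application.
MAX_PLIST_BYTES = 1 << 20


def load_untrusted_plist(data, max_bytes=MAX_PLIST_BYTES):
    """Load a plist from bytes, rejecting oversized or too deeply nested input."""
    if len(data) > max_bytes:
        raise ValueError("plist input too large")
    try:
        return plistlib.loads(data)
    except RecursionError as exc:
        # Reported as a plain error instead of propagating RecursionError.
        raise ValueError("plist is nested too deeply or is cyclic") from exc


payload = plistlib.dumps({"a": [1, 2, 3]}, fmt=plistlib.FMT_BINARY)
print(load_untrusted_plist(payload))  # {'a': [1, 2, 3]}
```

Catching RecursionError only helps while the recursion limit is at a sane value; with a very high limit the interpreter may overflow the C stack before the exception is raised.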

Adding proper support for references could be considered a new
feature, but taking the security issues into account, it should be
backported as far back as 3.4, where support for binary plists was added.

----------
assignee: serhiy.storchaka
components: Library (Lib)
messages: 306493
nosy: ned.deily, ronaldoussoren, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Issues with binary plists
type: security
versions: Python 3.4, Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32072>
_______________________________________

