[Expat-bugs] [ expat-Bugs-1990430 ] Parser crash with specially formatted UTF-8 sequences

SourceForge.net noreply at sourceforge.net
Wed Jun 11 16:46:50 CEST 2008


Bugs item #1990430, was opened at 2008-06-11 00:45
Message generated for change (Comment added) made by kwaclaw
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1990430&group_id=10127

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: www.libexpat.org
Group: None
Status: Open
>Resolution: Fixed
Priority: 5
Private: Yes
Submitted By: Peter Valchev (petervalchev)
>Assigned to: Karl Waclawek (kwaclaw)
Summary: Parser crash with specially formatted UTF-8 sequences

Initial Comment:
I have discovered a way to crash libexpat's XML parser with certain specially formatted UTF-8 sequences. All applications that link with expat and use it to parse user-provided XML files are affected. As far as I can see, the issue is not exploitable beyond denial of service.

This is the patch that I have come up with, also attached to this email:

+++ lib/xmltok_impl.c 2007-12-21 11:11:42.054417000 -0800
@@ -1745,6 +1745,9 @@
 switch (BYTE_TYPE(enc, ptr)) {
 #define LEAD_CASE(n) \
 case BT_LEAD ## n: \
+ if (end - ptr < n) { \
+   return; \
+ } \
 ptr += n; \
 break;
 LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4)

The parser's updatePosition function, which keeps track of the current position pointer, increments ptr by {2, 3, 4} to skip past multibyte character combinations. When fewer than that many bytes remain before 'end', ptr jumps past the terminating condition of the "while (ptr != end)" loop, so the loop keeps reading past 'end' into out-of-bounds memory until it crashes.
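
The overshoot described above can be reproduced outside expat with a minimal sketch. The names lead_len and count_chars_checked are hypothetical and are not part of expat; the lead-byte test is a simplified stand-in for BYTE_TYPE, and the function only counts characters rather than tracking line and column. It assumes the same loop shape, "while (ptr != end)", and shows the same kind of length check as the patch.

```c
#include <stddef.h>

/* Hypothetical helper (not expat's BYTE_TYPE): sequence length implied
 * by a UTF-8 lead byte. */
static size_t lead_len(unsigned char b) {
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1; /* ASCII, continuation, or invalid byte: step one byte */
}

/* Count characters in [ptr, end) without ever stepping past end. */
size_t count_chars_checked(const char *ptr, const char *end) {
    size_t count = 0;
    while (ptr != end) {
        size_t n = lead_len((unsigned char)*ptr);
        /* Without this check, ptr += n could jump over end, and the
         * "ptr != end" test would never become false. */
        if ((size_t)(end - ptr) < n)
            break;
        ptr += n;
        count++;
    }
    return count;
}
```

With the check removed, a buffer ending in a lone lead byte such as 0xF0 would move ptr three bytes beyond end, which is the failure mode reported here.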

In general, this parser does not appear to be very robust and could be the source of further security issues.

A fault file is attached. To reproduce, compile examples/outline.c and run it against that file. This patch may not be 100% complete...

Contact:
Peter Valchev <pvalchev at google.com>

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2008-06-11 10:46

Message:
Logged In: YES 
user_id=290026
Originator: NO

Can reproduce. The problem is that this code can be called *after* an
error has been found (to report line and column number). Therefore it
should not rely on correct byte counts for multibyte characters.

Patch applied in xmltok_impl.c rev. 1.14.

Would you please also report all the other issues you have found?

----------------------------------------------------------------------
