[New-bugs-announce] [issue46704] Parser API not checking for null-terminator

anthony shaw report at bugs.python.org
Wed Feb 9 20:23:04 EST 2022


New submission from anthony shaw <anthonyshaw at apache.org>:

In tokenizer.c, the translate_newlines() function does a `strlen()` on the input string, if the string is not null-terminated, e.g. 
'\xbe' this leads to a heap-buffer-overflow. The overflow is not exploitable, but if there are further changes to the parser, it might be worth using a strlen() alternative, like strnlen().

static char *
translate_newlines(const char *s, int exec_input, struct tok_state *tok) {
    int skip_next_lf = 0;
    size_t needed_length = strlen(s) + 2, final_length;


This leads to a heap-buffer-overflow detected by ASAN in a simple reproducible example, calling PyRun_StringFlags() from the LLVM fuzzer:


fuzz_target(47084,0x11356f600) malloc: nano zone abandoned due to inability to preallocate reserved vm space.
Dictionary: 35 entries
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3034498392
INFO: Loaded 1 modules   (43 inline 8-bit counters): 43 [0x10a2b71e8, 0x10a2b7213), 
INFO: Loaded 1 PC tables (43 PCs): 43 [0x10a2b7218,0x10a2b74c8), 
INFO:        1 files found in ../Tests/fuzzing/corpus
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
=================================================================
==47084==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000003131 at pc 0x00010bd1d555 bp 0x7ff7b5da0590 sp 0x7ff7b5d9fd50
READ of size 2 at 0x602000003131 thread T0
    #0 0x10bd1d554 in wrap_strlen+0x184 (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x15554)
    #1 0x10b12132b in translate_newlines+0x1b (Python:x86_64+0x5d32b)
    #2 0x10b12071c in _PyParser_ASTFromString+0x1ac (Python:x86_64+0x5c71c)
    #3 0x10b2f86de in PyRun_StringFlags+0x5e (Python:x86_64+0x2346de)
    #4 0x10a25ec6b in CompileCode(char const*) fuzz_target.cpp:54
    #5 0x10a25f247 in LLVMFuzzerTestOneInput fuzz_target.cpp:68
    #6 0x10a27aff3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) FuzzerLoop.cpp:611
    #7 0x10a27c3c4 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) FuzzerLoop.cpp:804
    #8 0x10a27c859 in fuzzer::Fuzzer::Loop(std::__1::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) FuzzerLoop.cpp:857
    #9 0x10a26aa5f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) FuzzerDriver.cpp:906
    #10 0x10a298e42 in main FuzzerMain.cpp:20
    #11 0x1134f44fd in start+0x1cd (dyld:x86_64+0x54fd)

0x602000003131 is located 0 bytes to the right of 1-byte region [0x602000003130,0x602000003131)
allocated by thread T0 here:
    #0 0x10bd58a0d in wrap__Znam+0x7d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x50a0d)
    #1 0x10a27af02 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) FuzzerLoop.cpp:596
    #2 0x10a27c3c4 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) FuzzerLoop.cpp:804
    #3 0x10a27c859 in fuzzer::Fuzzer::Loop(std::__1::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) FuzzerLoop.cpp:857
    #4 0x10a26aa5f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) FuzzerDriver.cpp:906
    #5 0x10a298e42 in main FuzzerMain.cpp:20
    #6 0x1134f44fd in start+0x1cd (dyld:x86_64+0x54fd)

SUMMARY: AddressSanitizer: heap-buffer-overflow (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x15554) in wrap_strlen+0x184
Shadow bytes around the buggy address:
  0x1c04000005d0: fa fa 02 fa fa fa 02 fa fa fa 02 fa fa fa 02 fa
  0x1c04000005e0: fa fa 02 fa fa fa 02 fa fa fa 02 fa fa fa 02 fa
  0x1c04000005f0: fa fa 03 fa fa fa 03 fa fa fa 03 fa fa fa 03 fa
  0x1c0400000600: fa fa 01 fa fa fa 01 fa fa fa 01 fa fa fa 01 fa
  0x1c0400000610: fa fa 00 00 fa fa 00 fa fa fa 00 fa fa fa 00 00
=>0x1c0400000620: fa fa 00 fa fa fa[01]fa fa fa fd fa fa fa fd fd
  0x1c0400000630: fa fa fd fa fa fa fd fa fa fa 00 fa fa fa 04 fa
  0x1c0400000640: fa fa 00 00 fa fa 01 fa fa fa 01 fa fa fa 01 fa
  0x1c0400000650: fa fa fd fa fa fa fd fa fa fa fd fd fa fa 01 fa
  0x1c0400000660: fa fa 00 00 fa fa 01 fa fa fa fd fa fa fa fd fa
  0x1c0400000670: fa fa 01 fa fa fa 06 fa fa fa 00 00 fa fa 06 fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==47084==ABORTING
MS: 0 ; base unit: 0000000000000000000000000000000000000000


artifact_prefix='./'; Test unit written to ./crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
Base64: 
zsh: abort      ./fuzz_target -dict=../Tests/fuzzing/python.dict -only_ascii=1

----------
components: Parser
messages: 412965
nosy: anthonypjshaw, lys.nikolaou, pablogsal
priority: normal
severity: normal
status: open
title: Parser API not checking for null-terminator
versions: Python 3.10, Python 3.11

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue46704>
_______________________________________


More information about the New-bugs-announce mailing list