[New-bugs-announce] [issue34437] print statement using \x results in improper and extra bytes

Nathan Benson report at bugs.python.org
Sun Aug 19 19:24:40 EDT 2018


New submission from Nathan Benson <jfa at packetdamage.com>:

While writing some shellcode I uncovered an unusual bug where Python 3 seems to print out incorrect (and extra) bytes when print is given a string containing \x escapes. Needless to say, I was pulling my hair out trying to figure out why my shellcode wasn’t working. Python 2 behaves as expected.

I haven't tested the latest version of Python 3, but all the versions prior to that seem to have the bug.  I’ve been able to reproduce the bug in Ubuntu Linux and on my Mac.

For example, when printing "\xfd\x84\x04\x08" I expect to get back "fd 84 04 08", but Python 3 seems to add bytes beginning with c2 and c3 and to toss in other unexpected bytes.

For the purpose of these demonstrations:

  Akame:~ jfa$ python2 --version
  Python 2.7.15

  Akame:~ jfa$ python3 --version
  Python 3.7.0


Here is Python 2 operating as expected:

Akame:~ jfa$ python2 -c 'print("\xfd\x84\x04\x08")' | hexdump -C
00000000  fd 84 04 08 0a                                    |.....|
00000005


Here is Python 3 with the exact same print statement:

Akame:~ jfa$ python3 -c 'print("\xfd\x84\x04\x08")' | hexdump -C
00000000  c3 bd c2 84 04 08 0a                              |.......|
00000007

There are 6 bytes not 4 and where did the c3, bd, and c2 come from?
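
One possible reading of the hexdump above, offered as a sketch rather than a diagnosis: if Python 3 treats "\xfd\x84\x04\x08" as a str of four code points and stdout is encoded as UTF-8 (an assumption about this environment, not something the transcript states), then encoding that str by hand yields the same six bytes. The variable names below are illustrative only.

```python
# Assumption: stdout uses UTF-8, the usual default on Linux and macOS terminals.
s = "\xfd\x84\x04\x08"        # in Python 3 this is a str of four code points

encoded = s.encode("utf-8")   # bytes a UTF-8 stdout would receive (newline aside)
latin1 = s.encode("latin-1")  # one byte per code point below U+0100

print(len(s), len(encoded), len(latin1))
print(encoded.hex(), latin1.hex())
```

Under that assumption, U+00FD becomes c3 bd and U+0084 becomes c2 84, while 04 and 08 stay single bytes.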

Playing around with it a little more, it seems the problem arises when printing bytes whose first hex digit is 8, 9, or a-f:

Here is a-f:

Akame:~ jfa$ for b in {a..f}; do echo "\x${b}0"; python3 -c "print(\"\x${b}0\")" | hexdump -C; done
\xa0
00000000  c2 a0 0a                                          |...|
00000003
\xb0
00000000  c2 b0 0a                                          |...|
00000003
\xc0
00000000  c3 80 0a                                          |...|
00000003
\xd0
00000000  c3 90 0a                                          |...|
00000003
\xe0
00000000  c3 a0 0a                                          |...|
00000003
\xf0
00000000  c3 b0 0a                                          |...|
00000003


Here is 0-9 (notice everything is fine until 8):

Akame:~ jfa$ for b in {0..9}; do echo "\x${b}0"; python3 -c "print(\"\x${b}0\")" | hexdump -C; done
\x00
00000000  00 0a                                             |..|
00000002
\x10
00000000  10 0a                                             |..|
00000002
\x20
00000000  20 0a                                             | .|
00000002
\x30
00000000  30 0a                                             |0.|
00000002
\x40
00000000  40 0a                                             |@.|
00000002
\x50
00000000  50 0a                                             |P.|
00000002
\x60
00000000  60 0a                                             |`.|
00000002
\x70
00000000  70 0a                                             |p.|
00000002
\x80
00000000  c2 80 0a                                          |...|
00000003
\x90
00000000  c2 90 0a                                          |...|
00000003
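
The two Python 3 loops above can be reproduced without the shell. The sketch below assumes the extra bytes come from UTF-8 encoding of code points U+0080 and higher; `utf8_hex` is a hypothetical helper written for this illustration, not part of any library.

```python
def utf8_hex(high_nibble: str) -> str:
    """Return the UTF-8 encoding, as hex, of the code point 0x<high_nibble>0."""
    ch = chr(int(high_nibble + "0", 16))
    return ch.encode("utf-8").hex()

# Same sixteen values as the shell loops: \x00, \x10, ..., \xf0
for nibble in "0123456789abcdef":
    print(f"\\x{nibble}0 -> {utf8_hex(nibble)}")
```

Values up to \x70 encode as a single byte; from \x80 upward each one becomes two bytes with a c2 or c3 lead byte, which matches the hexdump output above.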



Here are the same tests with Python 2:

Akame:~ jfa$ for b in {a..f}; do echo "\x${b}0"; python2 -c "print(\"\x${b}0\")" | hexdump -C; done
\xa0
00000000  a0 0a                                             |..|
00000002
\xb0
00000000  b0 0a                                             |..|
00000002
\xc0
00000000  c0 0a                                             |..|
00000002
\xd0
00000000  d0 0a                                             |..|
00000002
\xe0
00000000  e0 0a                                             |..|
00000002
\xf0
00000000  f0 0a                                             |..|
00000002


Akame:~ jfa$ for b in {0..9}; do echo "\x${b}0"; python2 -c "print(\"\x${b}0\")" | hexdump -C; done
\x00
00000000  00 0a                                             |..|
00000002
\x10
00000000  10 0a                                             |..|
00000002
\x20
00000000  20 0a                                             | .|
00000002
\x30
00000000  30 0a                                             |0.|
00000002
\x40
00000000  40 0a                                             |@.|
00000002
\x50
00000000  50 0a                                             |P.|
00000002
\x60
00000000  60 0a                                             |`.|
00000002
\x70
00000000  70 0a                                             |p.|
00000002
\x80
00000000  80 0a                                             |..|
00000002
\x90
00000000  90 0a                                             |..|
00000002


As you can see, Python 2 works as expected, while Python 3, when printing \x[89a-f], seems to cause the byte to be replaced with a c2 or c3 followed by another byte of data.
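
If the goal is simply to get the exact bytes onto stdout from Python 3, one approach that sidesteps text encoding is to write a bytes object to the binary layer underneath `sys.stdout`. This is a minimal sketch, not a statement about how the report should be resolved; `write_raw` is a hypothetical helper name, and the in-memory stream is used only to show the result.

```python
import io
import sys

def write_raw(payload: bytes, stream=None) -> int:
    """Write payload unchanged to a binary stream (default: sys.stdout.buffer)."""
    if stream is None:
        stream = sys.stdout.buffer   # binary layer beneath the text-mode stdout
    written = stream.write(payload)
    stream.flush()
    return written

# Demonstration against an in-memory binary stream instead of the real stdout
demo = io.BytesIO()
write_raw(b"\xfd\x84\x04\x08\n", demo)
print(demo.getvalue().hex())
```

Calling `write_raw(b"\xfd\x84\x04\x08\n")` with no stream argument and piping the result to `hexdump -C` is the direct counterpart of the python2 command shown earlier.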

----------
messages: 323773
nosy: Nathan Benson
priority: normal
severity: normal
status: open
title: print statement using \x results in improper and extra bytes
type: behavior
versions: Python 3.4, Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34437>
_______________________________________

