[Tutor] convert '\\t' to '\t'

Magnus Lycka magnus@thinkware.se
Fri Jan 17 05:33:02 2003


At 00:50 2003-01-17 +0000, Hy Python wrote:
>The problem with myRaw=myRaw.replace('\\n','\n') is that it only deals 
>with the '\\n' case. I am interested in something universal which can 
>convert things like '\\s', '\\t', '\\r', etc.

But it's not quite that simple. See
http://www.python.org/doc/current/ref/strings.html

For instance, \x42 should be replaced by B, \103 by C
and \u, \U and \N are used with Unicode characters.

I suggest you keep it simple, and just code translations
for the sequences you know. Surely, it's at least as easy
to code one transformation as it is to document it so that
the user will understand it.
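
A minimal sketch of what "code translations for the sequences
you know" could look like, written for current Python 3 rather
than the Python 2 used in the sessions in this post. The names
KNOWN_ESCAPES and translate_escapes are made up for illustration,
and the set of supported sequences is only an assumption about
what the program needs.

```python
import re

# Hypothetical table: only the escapes we decide to support.
KNOWN_ESCAPES = {
    "\\n": "\n",
    "\\t": "\t",
    "\\r": "\r",
    "\\\\": "\\",
}

# A backslash followed by any single character (except a newline).
_ESCAPE_RE = re.compile(r"\\.")


def translate_escapes(text):
    """Replace the known two-character escapes; leave all others alone."""
    def replace(match):
        seq = match.group(0)
        return KNOWN_ESCAPES.get(seq, seq)
    return _ESCAPE_RE.sub(replace, text)
```

Doing it in one pass with a regular expression, instead of a
chain of replace() calls, means that an escaped backslash
followed by a t stays a backslash and a t instead of turning
into a tab.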

Naturally, you could use the dreaded exec. That would
surely mean that all escape sequences are handled the
way python expects...

 >>> x = raw_input()
 >>> x
'tab\\tnew line\\n'
 >>> exec "s = '%s'" % x
 >>> s
'tab\tnew line\n'
 >>> print s
tab     new line

But beware of security aspects...

 >>> x = raw_input()
 >>> print x
he he!!';print 5*3, 'hm
 >>> exec "s = '%s'" % x
15 hm
 >>> s
'he he!!'

As you see, in this case I could easily get the program
to execute arbitrary code by including a ' in the string.

Instead of printing '15 hm' I could have made python reformat
the disk... I saw a bug like this in a Linux firewall
distribution where a system administrator who was perverse
enough to choose a password like for instance "hello;rm -rf /"
would end up with the password "hello" and an empty file system.
That was NOT a Python product, though... ;) But it could
easily have been. The C program did the equivalent of
os.system("htpasswd -b /xxx/yyyy admin %s" % newPasswdString)
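
For comparison, here is a sketch (Python 3, my own assumption of
how one might do it, not what that firewall did) of two ways to
keep the password from being interpreted by the shell. The
function names are hypothetical. The path /xxx/yyyy is the same
placeholder as above. Nothing is executed here; the functions
only build what would be passed on.

```python
import shlex


def htpasswd_argv(new_password):
    """Argument list meant for subprocess.run(argv), i.e. with no shell.

    The password is a single list element, so ';' and spaces in it
    are never seen by a shell.
    """
    return ["htpasswd", "-b", "/xxx/yyyy", "admin", new_password]


def htpasswd_command_line(new_password):
    """Command string for the case where a shell cannot be avoided.

    shlex.quote() wraps the password in single quotes so that the
    shell treats it as one word.
    """
    return "htpasswd -b /xxx/yyyy admin %s" % shlex.quote(new_password)
```

With the password "hello;rm -rf /", the first function keeps the
whole password as the last argument, and the second puts it
inside single quotes on the command line.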

We can try to avoid the problem in this particular case with:

 >>> x = x.replace("'","''")
 >>> exec "s = '%s'" % x
 >>> print s
he he!!;print 5*3, hm

Now all of x will end up in s. But are you really sure
that there are no other security issues with this?

Let's see...

 >>> x = raw_input()
 >>> x = x.replace("'","''")
 >>> exec "s = '%s'" % x
Gotcha!
 >>> print s
'

Yes, I could get around that as well. It's left as an
exercise to the reader to figure out what I typed in
to raw_input()...

 >>> print x
???????????????????????????????????????????????? ;)

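Actually, if we do

x = x.replace("'", "\\'")

instead (the backslash has to be doubled, since "\'" is just
another way of writing "'"), it's probably safer, but maybe
someone can craft a way to trick that as well? An input that
already has a backslash in front of the quote looks like a
good place to start.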

Either way, with exec it's probably impossible to prevent some
input from raising a SyntaxError. But on the other hand,
since the user is to provide some kind of processing
information, there is always a risk that the instructions
will be incorrect. The SyntaxError can be caught with a try
block.
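
A sketch of such a try block, for current Python 3. Note that it
swaps exec for ast.literal_eval, a standard library function in
later Python versions that only accepts literals, so input like
the "he he!!" example raises an error instead of running code.
The function name unescape and the choice to return None on bad
input are assumptions made for illustration.

```python
import ast


def unescape(text):
    """Interpret backslash escapes in text the way a string literal would.

    Returns the resulting string, or None if the input cannot be read
    as the body of a single-quoted string literal.
    """
    try:
        # literal_eval parses the source but evaluates literals only.
        value = ast.literal_eval("'%s'" % text)
    except (SyntaxError, ValueError):
        return None
    # Input such as "', '" would parse as a tuple; reject anything
    # that is not a plain string.
    if not isinstance(value, str):
        return None
    return value
```

The caller still has to decide what to tell the user when None
comes back, but at least nothing the user typed gets executed.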


-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus@thinkware.se