[Tutor] unstring
Steven D'Aprano
steve at pearwood.info
Wed Jun 19 04:41:07 CEST 2013
On Tue, Jun 18, 2013 at 06:41:01PM -0700, Jim Mooney wrote:
> Is there a way to unstring something? That is str(object) will give me
> a string, but what if I want the original object back, for some
> purpose, without a lot of foofaraw?
The short answer is, "no".
The slightly longer answer is, "sometimes".
The accurate answer is, "it depends on what the object is, whether you
insist on a human-readable string, and whether or not you like living
dangerously".
If you know what sort of object the string is supposed to represent,
then often (but not always) you can convert it like this:
x = 23456
s = str(x)
y = int(s)
after which, x should equal y. This will work for ints, and will
probably work for floats[1]. On the other hand, this does not work for
(say) list, tuple or other similar types of object:
s = str([None, 42, 'abc'])
list(s)
returns something very different from what you started with. (Try it and
see.)
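For anyone who would rather read the result than run it, here is a minimal sketch of both cases. The variable names (int_round_trips, original, chars) are made up for illustration only.

```python
# int survives the trip through str() and back
x = 23456
s = str(x)
y = int(s)
int_round_trips = (x == y)

# list does not: list() iterates over the string character by character,
# so you get one item per character of "[None, 42, 'abc']"
original = [None, 42, 'abc']
chars = list(str(original))

print(int_round_trips)
print(chars[:6])
print(chars == original)
```

The second case gives a list of single-character strings starting with '[', 'N', 'o', 'n', 'e', not the three items you started with.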
Many objects -- but not all -- have the property that if you call eval()
on their repr(), you get the same value back again:
s = repr([None, 42, 'abc'])
eval(s)
ought to return the same list as you started with. But:
- this is not guaranteed for all objects;
- it's also unsafe.
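The first point is easy to see with a plain object() instance, whose repr() is not valid Python source at all. This is only a sketch; the name round_tripped is made up for the example, and eval() is acceptable here only because the string comes from our own repr() call.

```python
obj = object()
s = repr(obj)          # looks something like '<object object at 0x7f...>'

try:
    eval(s)            # the repr is not a valid expression
    round_tripped = True
except SyntaxError:
    round_tripped = False

print(s)
print(round_tripped)
```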
If the string you call eval on comes from an untrusted source, they can
do *anything they like* on your computer. Imagine if you are taking a
string from somewhere, which you assumed was generated using repr(), but
somebody can fool you into accepting this string instead:
"[None, 42, 'abc'] and __import__('os').system('echo Got You Now, sucker!')"
Try eval()'ing the above string. Now imagine something more malicious.
So, my advice is, *** don't use eval on untrusted strings ***
Another option is to use ast.literal_eval, which is much, much more
limited and consequently is safer.
py> import ast
py> ast.literal_eval("[None, 42, 'abc']")
[None, 42, 'abc']
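As a rough illustration of what "more limited" buys you, the sketch below feeds literal_eval both a plain list literal and the hostile string shown earlier. The names safe, hostile, value and rejected are made up for the example. literal_eval only accepts Python literals, so the "and __import__(...)" part is refused rather than executed.

```python
import ast

safe = "[None, 42, 'abc']"
hostile = ("[None, 42, 'abc'] and "
           "__import__('os').system('echo Got You Now, sucker!')")

# A plain literal is parsed and returned as a real list.
value = ast.literal_eval(safe)

# Anything that is not a literal raises ValueError instead of running.
try:
    ast.literal_eval(hostile)
    rejected = False
except ValueError:
    rejected = True

print(value)
print(rejected)
```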
To summarise, some but not all objects can be round-tripped to
and from human-readable strings, like those produced by str() and
repr(). Some of them can even be round-tripped safely, without eval().
As an alternative, if you give up the requirement that the string be
human-readable, you can *serialise* the object. Not all objects can be
serialised, but most can. You can use:
- marshal
- pickle
- json
- yaml # not in the standard library
- and others
but they all have pros and cons. For instance, pickle can handle nearly
anything, but it has the same vulnerability as eval(): it can execute
arbitrary code. json and yaml are pretty close to human-readable, even
human-editable, but they can't handle arbitrary objects.
py> import pickle
py> x = [None, 42, 'abc']
py> pickle.dumps(x)
b'\x80\x03]q\x00(NK*X\x03\x00\x00\x00abcq\x01e.'
Not exactly human-readable, but guaranteed[2] to round-trip:
py> pickle.loads(pickle.dumps(x)) == x
True
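For comparison, a small sketch of the same idea with json, showing both the readable output and a couple of its limits. The variable names are invented for the example; the behaviour shown (tuples coming back as lists, sets being refused) is how the standard json module works.

```python
import json

x = [None, 42, 'abc']
text = json.dumps(x)               # readable text: '[null, 42, "abc"]'
same = json.loads(text) == x       # this list round-trips

# JSON has no tuple type, so a tuple comes back as a list.
t = (1, 2, 3)
back = json.loads(json.dumps(t))

# Types JSON does not know about, such as set, are refused outright.
try:
    json.dumps({1, 2, 3})
    set_ok = True
except TypeError:
    set_ok = False

print(text, same, back, set_ok)
```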
If these are of interest, I suggest starting by reading the docs, then
coming back with any questions.
[1] I think it will always work, but floats are just tricky enough that
I am not willing to promise it.
[2] Guarantee void on planet Earth.
--
Steven