a clean way to define dictionary
Alexander Schmolck
a.schmolck at gmx.net
Thu Jun 19 14:00:09 EDT 2003
mis6 at pitt.edu (Michele Simionato) writes:
> Alexander Schmolck <a.schmolck at gmx.net> wrote in message news:<yfsof0v880w.fsf at black132.ex.ac.uk>...
> > Maybe I'm a bit blinkered, but right now I can't see how
> >
> > dict(foo=1, bar='sean')
> >
> > is so much better/more convinient than
> >
> > {'foo':1, bar:'sean'}
> >
> > that it justifies forcing people to learn a new redundant and less general
> > dictionary creation syntax that at least hinders customizing dictionary
> > instantiation like in
> >
> > counts = dict(default=0)
> > for item in seq:
> > counts[item] += 1
> >
> > To get the first behavior you have to write a 1-line function, for the second
> > you have to subclass (and presumably incur a noticeable performance penalty).
> > Also the first precludes adding further constructor options, while the second
> > doesn't.
> >
> >
> > 'as
>
> I do think you have a point here. Nevertheless, I wonder about the
> "noticeable performance penality" due to subclassing. I tried this
> dictionary:
>
> class mydict(dict):
> def __new__(cls,d,default=0):
> #absolutely trivial, doing nothing with the default argument
> return super(mydict,cls).__new__(cls,d)
>
> d=dict([(i,str(i)) for i in range(1000)])
> m=mydict([(i,str(i)) for i in range(1000)])
>
> And I have measured the access times for 'd' and 'm' with timeit. There is no
> noticeable difference for getting a value (<1%); setting a value is slower
> by a 7-8%. Not a big deal. I get more or less the same numbers for bigger
> or smaller dictionaries. Nevertheless, I haven't tried many dictionaries,
> maybe there are cases where the performance is worse.
Well, your timings are not all that meaningful because your code does nothing
and you only instance creation and not item access (which obviously also needs
to be overridden with python code). Anywhere, here is a real DefaultDict class
and some ad hoc timings, which show *10* fold slowdown for creation and 4 fold
for item access (the copy.copy call seems harmless performance-wise).
class DefaultDict(dict):
r"""Dictionary with a default value for unknown keys."""
def __init__(self, default, noCopies=False):
self.default = default
if noCopies:
self.noCopies = True
else:
self.noCopies = False
def __getitem__(self, key):
r"""If `self.noCopies` is `False` (default), a **copy** of
`self.default` is returned by default.
"""
if key in self: return self.get(key)
if self.noCopies: return self.setdefault(key, self.default)
else: return self.setdefault(key, copy.copy(self.default))
timings for python2.2
In [58]: timeCall(nTimes, 10000, dict)
Out[58]: 0.018522977828979492
In [59]: timeCall(nTimes, 10000, DefaultDict, 1)
Out[59]: 0.11231005191802979
In [47]: timeCall(nTimes, 10000, {'foo':0}.__getitem__, 'foo')
Out[47]: 0.012811064720153809
In [48]: timeCall(nTimes, 10000, DefaultDict(1).__getitem__, 'foo')
Out[48]: 0.045480012893676758
In [65]: timeCall(nTimes, 10000, DefaultDict(1,0).__getitem__, 'foo')
Out[65]: 0.04657900333404541
>
> If somebody wants to experiment and report here few numbers, it would be
> interesting.
>
> Michele
What are your results for instance creation with this class? Maybe 2.3 is
noticeably faster?
'as
More information about the Python-list
mailing list