[IronPython] Weird performance issue

Dino Viehland dinov at microsoft.com
Fri Jan 9 01:03:36 CET 2009


I'm going to back port this perf fixes + the property perf fixes to 2.0.1.  I was planning on doing this tomorrow.

Moving to 2.1 shouldn't be painful.  The biggest gotcha is a large number of renames that have happened in the DLR inner layer (Microsoft.Scripting.Core).  The good news is those shouldn't effect Python code - only other DLR languages which are interoperating w/ IronPython.  The outer layer, with the hosting APIs, should all still be the same.

It is a 2.5/2.6 hybrid but the potentially breaking changes (for example the parser changes) have remained under the -X:Python26 option.  The bad news is that option will eventually go away.  Right now we're working towards getting 2.1 to run against the 2.6 standard library while at the same time keeping it running against 2.5.  There are some breaking changes in 2.6 - for example slicing has been "fixed" in some subtle places and so at some point those will get put in and it'll be hard to make them dependent upon a switch.  So once we get switched over and running all the 2.6 tests then that option will go away forcing the 2.6 behavior upon you.  Where we believe the features are non-disruptive (say the new string formatting APIs) the feature isn't controlled by the -X:Python26 option.

Other than that I'd just say the biggest factor is churn and release dates.  2.1 is undergoing active development but we also continue to gate our check-ins on passing our test suite.  Our tentative planning is that we won't finish 2.1 until the 2nd half of this year.  So ultimately you lose some degree of stability for a reasonable chunk of time.

> -----Original Message-----
> From: users-bounces at lists.ironpython.com [mailto:users-
> bounces at lists.ironpython.com] On Behalf Of Michael Foord
> Sent: Thursday, January 08, 2009 3:38 PM
> To: Discussion of IronPython
> Subject: Re: [IronPython] Weird performance issue
>
> Hi Dino,
>
> This could be at the root of *several* of the really odd performance
> issues we've been having with our IronPython 2 port. That is quite an
> improvement and mirrors the kind of behaviour we see.
>
> Can we get a patch against the 2.0 source code for your dictionary fix?
>
> On the other hand, what do we lose by moving to the 2.1 branch?
> Obviously this is less tested (in the wild) code, but we have pretty
> comprehensive tests. 2.1 is also a bit of a Python 2.5 / 2.6 hybrid at
> the moment I guess - are there any potential gotchas?
>
> Michael
>
> Dino Viehland wrote:
> >
> > Ok, I've looked at this more closely now and I believe I have a fix.
> > I've also discovered that changes in the DLR make this much better as
> > well in the 2.1 branch. But the numbers I get are:
> >
> > 2.0: 59.9065353243
> >
> > 2.1: 4.28863087116
> >
> > 2.1 + dict fixes: 1.22241745833
> >
> > These numbers are quite dependent upon the number of types that have
> > been seen for hashing and equality. In 2.0 increasing the number of
> > types will cause the long running loop to become progressively worse.
> > In 2.1 we're smart enough to re-organize the L2 cache based upon the
> > most recently used items which makes this much faster. W/ my changes
> > the dictionaries are smarter about managing call sites based upon
> > their type of keys.
> >
> > My guess is we can add my dictionary fix to 2.0 but back porting all
> > of the call site changes might be too much.
> >
> > Here's my repro:
> >
> > import System
> >
> > class x(object):
> >
> > pass
> >
> > data = [1, 1.0, 1L, 1.0, 'abc', 1L, frozenset(), xrange(2),
> > frozenset(), BaseException(), ord, type, True, x(),
> > System.Collections.ArrayList(), System.Collections.Hashtable(),
> > System, KeyError('x'), AttributeError('foo'), Warning(), 1L]
> >
> > d = {}
> >
> > for i in data:
> >
> > d[i] = 0
> >
> > i in d
> >
> > del d[i]
> >
> > import time
> >
> > start = time.clock()
> >
> > data = data[-5:]
> >
> > x = {}
> >
> > for i in xrange(100000):
> >
> > for val in data:
> >
> > x[val] = 0
> >
> > val in x
> >
> > del x[val]
> >
> > end = time.clock()
> >
> > print end - start
> >
> > *From:* users-bounces at lists.ironpython.com
> > [mailto:users-bounces at lists.ironpython.com] *On Behalf Of *Dino
> Viehland
> > *Sent:* Tuesday, January 06, 2009 11:25 AM
> > *To:* glenn.k.jones+ipy at gmail.com; Discussion of IronPython
> > *Subject:* Re: [IronPython] Weird performance issue
> >
> > I think we are thrashing the hash code call site in
> > CommonDictionaryStorage.HashSite._HashSite. Alternately it could be
> > the equals call site in PythonContext._equalRetBoolSite. Similar to
> > the property issue you can look at the call site object and see if
> > it's going to the fixed read-only EmptyRuleSet.
> >
> > We can probably add 2 sites which are per-dictionary but we might
> need
> > to do something more clever as that could regress general dictionary
> perf.
> >
> > The DLR team is going to be looking into general issues with
> > understanding what's going on with call sites. Yesterday I pointed
> out
> > performance investigations as being something important for them to
> > look at.
> >
> > *From:* users-bounces at lists.ironpython.com
> > [mailto:users-bounces at lists.ironpython.com] *On Behalf Of *Glenn
> Jones
> > *Sent:* Tuesday, January 06, 2009 6:07 AM
> > *To:* Discussion of IronPython
> > *Subject:* [IronPython] Weird performance issue
> >
> > Hi all,
> >
> > More from the Resolver One upgrade:
> > We are seeing several performance failures in our tests, which we
> > expect because the performance profile of IPy2 is different to IPy1.
> > There is one that is perplexing and we hope that someone can give us
> > some insight into possible causes for us to investigate.
> >
> > One of our performance tests only fails when it is preceded by other
> > tests. There is (as far as we can tell) no shared state between the
> > tests. After much poking around, we discovered that replacing a set
> > with a list caused the performance to improve from ~700ms down to
> > ~100ms. We have been unable get a minimal repro, so far. Here's a
> > little more detail on the how the code works:
> >
> > class Thing(object):
> > def run(arg1, arg2):
> > * do some stuff
> > s = set()
> > * do stuff that populates s
> > return list(s)
> >
> > We create a Thing and call run so that the set contains an entry, if
> > we then create a new Thing and call the run method it takes 10 times
> > longer than expected.
> > If we change s = set to s = [] (and convert s.add to s.append, where
> > appropriate), and do the same as above, the second call to run
> behaves
> > as we expect. There is (at least in our code) nothing shared between
> > the first call to run and the second call.
> >
> > Does anyone have any ideas about where we can look to work out what's
> > happening? Any suggestions for ways to diagnose this?
> >
> > Thanks
> > Glenn & Kamil
> >
> > ---------------------------------------------------------------------
> ---
> >
> > _______________________________________________
> > Users mailing list
> > Users at lists.ironpython.com
> > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> >
>
>
> --
> http://www.ironpythoninaction.com/
> http://www.voidspace.org.uk/blog
>
>
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com




More information about the Ironpython-users mailing list