[IronPython] Slow Performance of CPython libs?

Dan Shechter dans at houmus.org
Sat Feb 23 17:25:25 CET 2008


Well it does look interesting, but I think there is something flawed 
with my own suggestion as a general rule.

As someone mentioned here, the CompileToAssembly() does create a new 
assembly, which gets loaded it into the AppDomain (eventually) and it 
doesn't get reclaimed for eternity (forgive my melodrama).
So I would definitely not "suggest" it as a general solution for all 
Regex woes in the world.

I would try, just for the fun of it to patch IP to use 
CompileToAssembly() just to see how much time in reality does it  knock 
off the profiling / general runtime.

Also note that your profiler might be causing you faux-visages of 
performance. Have you tried to just time the two different versions 
without profiling? perhaps performance is already "Good Enough"?

Birsch wrote:
> Update:
> 
> I patched re.cs with a compile-check flag (same as Dino), and while it 
> knocked off 25%, it still remained very slow in comparison to CPython 
> (x5-x10).
> So I ran the profiler again on the patched code:
> 
> Namespace 	Method name 	Time (sec.) 	Time with children (sec.) 	Hit 
> count 	Source file
> IronPython.Runtime.Operations 	Ops.CallWithContext(ICallerContext 
> context, object func, object arg0, object arg1) 	5.7779 	35.5915 	75882 	
> System.Text.RegularExpressions 	CompiledRegexRunner.Go() 	5.3627 
> 19.9116 	13684621 	
> IronPython.Runtime.Calls 	Method.Call(ICallerContext context, object 
> arg0) 	5.0284 	36.0412 	50886 	
> IronPython.Runtime.Calls 	PythonFunction.CallInstance(ICallerContext 
> context, object arg0, object arg1) 	5.0055 	36.0168 	50886 	
> IronPython.Runtime.Calls 	Function2.Call(ICallerContext context, object 
> arg0, object arg1) 	4.7494 	35.7496 	47485 	
> IronPython.Runtime.Operations 	Ops.CallWithContext(ICallerContext 
> context, object func, object arg0) 	4.3452 	35.3645 	146642 	
> System.Text.RegularExpressions 	RegexCharClass.CharInClassRecursive(char 
> ch, string set, int start) 	4.0303 	6.0294 	31498121 	
> System.Text.RegularExpressions 	RegexCharClass.CharInClass(char ch, 
> string set) 	3.7132 	9.7426 	31498121 	
> System.Text.RegularExpressions 	RegexRunner.CharInClass(char ch, string 
> charClass) 	3.1865 	12.9291 	31498114 	
> IronPython.Runtime.Calls 	Method.Call(ICallerContext context, object 
> arg0, object arg1) 	3.1467 	8.2760 	15008 	
> IronPython.Runtime.Calls 	PythonFunction.CallInstance(ICallerContext 
> context, object arg0, object arg1, object arg2) 	3.1440 	8.2725 	15008 	
> System.Text.RegularExpressions 	RegexCharClass.CharInClassInternal(char 
> ch, string set, int start, int mySetLength, int myCategoryLength) 
> 1.9806 	1.9991 	31498121 	
> System.Text.RegularExpressions 	RegexRunner.EnsureStorage() 	1.6674 
> 1.6674 	37765633 	
> System.Text.RegularExpressions 	RegexRunner.Scan(Regex regex, string 
> text, int textbeg, int textend, int textstart, int prevlen, bool 
> quick) 	1.5465 	22.9673 	39137 	
> IronPython.Runtime.Operations 	Ops.GetAttr(ICallerContext context, 
> object o, SymbolId name) 	1.5139 	5.5342 	546878 	
> System.Text.RegularExpressions 	CompiledRegexRunner.FindFirstChar() 
> 1.3493 	1.4119 	13689019 	
> IronPython.Runtime.Types 	OldClass.TryLookupSlot(SymbolId name, out 
> object ret) 	1.2987 	2.4052 	350822 	
> System.Globalization 	CompareInfo.IsPrefix(string source, string prefix, 
> CompareOptions options) 	0.3656 	0.3669 	3421 	
> IronPython.Runtime 	SymbolId.op_Equality(SymbolId a, SymbolId b) 
> 0.3635 	0.3635 	7974446 	
> IronPython.Runtime 	CustomSymbolDict.TryGetValue(SymbolId key, out 
> object value) 	0.2286 	1.3466 	1165376 	
> System 	String.InternalSubString(int startIndex, int length, bool 
> fAlwaysCopy) 	0.2253 	0.3669 	52536 	
> IronPython.Runtime.Calls 
> FunctionEnvironmentNDictionary.TryGetExtraValue(SymbolId key, out object 
> value) 	0.2228 	0.4089 	152594 	
> IronPython.Runtime.Types 	OldInstance.TryGetAttr(ICallerContext context, 
> SymbolId name, out object value) 	0.2156 	3.0540 	396617 	
> IronPython.Runtime.Calls 	Function3.Call(ICallerContext context, object 
> arg0, object arg1, object arg2) 	0.2109 	5.3373 	11221 	
> System 	String.wstrcpy(char* dmem, char* smem, int charCount) 	0.1831 
> 0.1831 	621212 	
> System.Collections.Generic 	Dictionary<TKey, TValue>.FindEntry(TKey 
> key) 	0.1460 	0.3733 	558255 	
> IronPython.Runtime.Calls 
> FunctionEnvironment32Dictionary.TryGetExtraValue(SymbolId key, out 
> object value) 	0.1457 	0.2680 	103477 	
> IronPython.Runtime.Types 	DynamicType.TryGetSlot(ICallerContext context, 
> SymbolId name, out object ret) 	0.1367 	0.3618 	406706 	
> IronPython.Runtime.Operations 	Ops.CallWithContext(ICallerContext 
> context, object func, object[] args) 	0.1326 	1.7429 	10267 	
> 
> 	sgmllib.finish_starttag$f318(FunctionEnvironment32Dictionary $env, 
> object self, object tag, object attrs) 	0.1213 	5.0059 	3787 	
> 
> 
> If I'm reading this correctly, it now runs the compiled regex code, and 
> the performance diff boils down to regex performance itself.
> 
> Dan Shechter mentioned earlier an option of generating 
> regex-pattern-specific runtime code. With almost an order of magnitude 
> to knock off here - do you think that will do the trick?
> 
> -Birsch
> 
> On Fri, Feb 22, 2008 at 10:24 PM, Birsch <birsch at gmail.com 
> <mailto:birsch at gmail.com>> wrote:
> 
>     Great, you've got my vote :)
> 
>     Thanks a lot for your help.
> 
> 
>     On Fri, Feb 22, 2008 at 9:03 PM, Dino Viehland
>     <dinov at exchange.microsoft.com <mailto:dinov at exchange.microsoft.com>>
>     wrote:
> 
>         Oh, and I've gone ahead and created a bug for this one for 2.0:
>         http://www.codeplex.com/IronPython/WorkItem/View.aspx?WorkItemId=15289
> 
>          
> 
>         *From:* users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>
>         [mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>] *On Behalf Of *Birsch
>         *Sent:* Friday, February 22, 2008 9:40 AM
> 
>         *To:* Discussion of IronPython
>         *Subject:* Re: [IronPython] Slow Performance of CPython libs?
> 
>          
> 
>         Dino - do you want me to register a workitem? Also, if I wanted
>         to patch v1.1.1, where should I start?
> 
>         On Thu, Feb 21, 2008 at 8:48 PM, Dino Viehland
>         <dinov at exchange.microsoft.com
>         <mailto:dinov at exchange.microsoft.com>> wrote:
> 
>         After Birsh's last comment that they're calling compile I was
>         thinking we might be doing something really stupid.  Turns out
>         that we are.  For our supposedly compiled regex's we're not
>         passing the Compiled option.  So they're all getting
>         interpretted and of course that's going to suck big time.
> 
>         So the fix for this is probably to just pass
>         RegexOptions.Compiled to the Regex object being contructed into
>         RE_Pattern.  Internally the Regex class will compile to dynamic
>         methods (much better than compiling to an assembly because it'll
>         be collectible).
> 
>         It's actually a little more complex than that because we create
>         RE_Pattern objects for the non-compiled so we need to flow the
>         option in, but it's still a simple fix.
> 
> 
>         -----Original Message-----
>         From: users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>
>         [mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>] On Behalf Of Dan
>         Shechter
>         Sent: Thursday, February 21, 2008 10:19 AM
>         To: Discussion of IronPython
>         Subject: Re: [IronPython] Slow Performance of CPython libs?
> 
>         Just out of curiosity...
>         I'm guessing that there are two levels of caching that could be
>         used.
>         One would be to hold a global dictionary of some sort of pattern ->
>         Regex that will save the lengthy parsing of the re.
> 
>         The other would be to actually call Regex.CompileToAssembly to get a
>         more efficient reperesentation (in runtime that is) of the re.
> 
>         Would you (the IP team) favor the former or the latter?
> 
>                Shechter.
> 
> 
>         Birsch wrote:
>          > I checked both sgmllib.py and BeautifulSoup.py - and it seems
>         both are
>          > reusing the same regexps (searched for re.compile).
>          >
>          > I think your suggestion is very relevant in this case. It
>         makes sense to
>          > replicate the "compile once use many" behavior that is
>         commonly used
>          > with regexp.
>          >
>          > -Birsch
>          >
>          > On Thu, Feb 21, 2008 at 7:30 PM, Dino Viehland
>          > <dinov at exchange.microsoft.com
>         <mailto:dinov at exchange.microsoft.com>
>         <mailto:dinov at exchange.microsoft.com
>         <mailto:dinov at exchange.microsoft.com>>> wrote:
>          >
>          >     Do you know if the same reg ex is being used repeatedly?
>          If so
>          >     maybe we can cache & compile the regex instead of
>         interpretting it
>          >     all the time.
>          >
>          >
>          >
>          >     *From:* users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>
>          >     <mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>>
>          >     [mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>
>          >     <mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>>] *On Behalf Of *Birsch
>          >     *Sent:* Thursday, February 21, 2008 8:30 AM
>          >
>          >     *To:* Discussion of IronPython
>          >     *Subject:* Re: [IronPython] Slow Performance of CPython libs?
>          >
>          >
>          >
>          >     I took on Cooper's advice and profiled with ANTS. Here
>         are the top
>          >     methods:
>          >
>          >     *Namespace*
>          >
>          >
>          >
>          >     *Method name*
>          >
>          >
>          >
>          >     *Time (sec.)*
>          >
>          >
>          >
>          >     *Time with children (sec.)*
>          >
>          >
>          >
>          >     *Hit count*
>          >
>          >
>          >
>          >     *Source file*
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Go()
>          >
>          >
>          >
>          >     37.0189
>          >
>          >
>          >
>          >     94.4676
>          >
>          >
>          >
>          >     13689612
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Operator()
>          >
>          >
>          >
>          >     6.2411
>          >
>          >
>          >
>          >     6.2411
>          >
>          >
>          >
>          >     131146274
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Advance(int i)
>          >
>          >
>          >
>          >     5.9264
>          >
>          >
>          >
>          >     8.7202
>          >
>          >
>          >
>          >     66000263
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.SetOperator(int op)
>          >
>          >
>          >
>          >     5.5750
>          >
>          >
>          >
>          >     5.5750
>          >
>          >
>          >
>          >     131146274
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Backtrack()
>          >
>          >
>          >
>          >     5.5692
>          >
>          >
>          >
>          >     9.4895
>          >
>          >
>          >
>          >     37781343
>          >
>          >
>          >
>          >     IronPython.Runtime.Operations
>          >
>          >
>          >
>          >     Ops.CallWithContext(ICallerContext context, object func,
>         object
>          >     arg0, object arg1)
>          >
>          >
>          >
>          >     5.5572
>          >
>          >
>          >
>          >     114.5245
>          >
>          >
>          >
>          >     79754
>          >
>          >
>          >
>          >     IronPython.Runtime.Calls
>          >
>          >
>          >
>          >     Method.Call(ICallerContext context, object arg0)
>          >
>          >
>          >
>          >     4.9052
>          >
>          >
>          >
>          >     114.8251
>          >
>          >
>          >
>          >     50886
>          >
>          >
>          >
>          >     IronPython.Runtime.Calls
>          >
>          >
>          >
>          >     PythonFunction.CallInstance(ICallerContext context,
>         object arg0,
>          >     object arg1)
>          >
>          >
>          >
>          >     4.8876
>          >
>          >
>          >
>          >     114.8059
>          >
>          >
>          >
>          >     50886
>          >
>          >
>          >
>          >     IronPython.Runtime.Calls
>          >
>          >
>          >
>          >     Function2.Call(ICallerContext context, object arg0,
>         object arg1)
>          >
>          >
>          >
>          >     4.6400
>          >
>          >
>          >
>          >     114.5471
>          >
>          >
>          >
>          >     47486
>          >
>          >
>          >
>          >     IronPython.Runtime.Operations
>          >
>          >
>          >
>          >     Ops.CallWithContext(ICallerContext context, object func,
>         object arg0)
>          >
>          >
>          >
>          >     4.2344
>          >
>          >
>          >
>          >     114.1604
>          >
>          >
>          >
>          >     146658
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexBoyerMoore.Scan(string text, int index, int
>         beglimit, int endlimit)
>          >
>          >
>          >
>          >     3.6465
>          >
>          >
>          >
>          >     3.6465
>          >
>          >
>          >
>          >     13678131
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexCharClass.CharInClassRecursive(char ch, string set,
>         int start)
>          >
>          >
>          >
>          >     3.6288
>          >
>          >
>          >
>          >     5.7113
>          >
>          >
>          >
>          >     31508162
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Goto(int newpos)
>          >
>          >
>          >
>          >     3.2058
>          >
>          >
>          >
>          >     5.1470
>          >
>          >
>          >
>          >     27364668
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Operand(int i)
>          >
>          >
>          >
>          >     3.1923
>          >
>          >
>          >
>          >     3.1923
>          >
>          >
>          >
>          >     73230687
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexRunner.EnsureStorage()
>          >
>          >
>          >
>          >     3.0803
>          >
>          >
>          >
>          >     3.0803
>          >
>          >
>          >
>          >     51474823
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexCharClass.CharInClass(char ch, string set)
>          >
>          >
>          >
>          >     3.0713
>          >
>          >
>          >
>          >     8.7827
>          >
>          >
>          >
>          >     31508162
>          >
>          >
>          >
>          >     IronPython.Runtime.Calls
>          >
>          >
>          >
>          >     Method.Call(ICallerContext context, object arg0, object arg1)
>          >
>          >
>          >
>          >     2.9821
>          >
>          >
>          >
>          >     7.8675
>          >
>          >
>          >
>          >     15012
>          >
>          >
>          >
>          >     IronPython.Runtime.Calls
>          >
>          >
>          >
>          >     PythonFunction.CallInstance(ICallerContext context,
>         object arg0,
>          >     object arg1, object arg2)
>          >
>          >
>          >
>          >     2.9794
>          >
>          >
>          >
>          >     7.8639
>          >
>          >
>          >
>          >     15012
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Forwardcharnext()
>          >
>          >
>          >
>          >     2.8852
>          >
>          >
>          >
>          >     2.8852
>          >
>          >
>          >
>          >     62865185
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Forwardchars()
>          >
>          >
>          >
>          >     2.8279
>          >
>          >
>          >
>          >     2.8279
>          >
>          >
>          >
>          >     59436277
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexCharClass.CharInClassInternal(char ch, string set,
>         int start,
>          >     int mySetLength, int myCategoryLength)
>          >
>          >
>          >
>          >     2.0632
>          >
>          >
>          >
>          >     2.0826
>          >
>          >
>          >
>          >     31508162
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexRunner.Scan(Regex regex, string text, int textbeg,
>         int textend,
>          >     int textstart, int prevlen, bool quick)
>          >
>          >
>          >
>          >     1.8376
>          >
>          >
>          >
>          >     101.7226
>          >
>          >
>          >
>          >     43009
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.FindFirstChar()
>          >
>          >
>          >
>          >     1.6405
>          >
>          >
>          >
>          >     5.3456
>          >
>          >
>          >
>          >     13701755
>          >
>          >
>          >
>          >     IronPython.Runtime.Types
>          >
>          >
>          >
>          >     OldClass.TryLookupSlot(SymbolId name, out object ret)
>          >
>          >
>          >
>          >     1.5573
>          >
>          >
>          >
>          >     2.8124
>          >
>          >
>          >
>          >     389516
>          >
>          >
>          >
>          >     IronPython.Runtime.Operations
>          >
>          >
>          >
>          >     Ops.GetAttr(ICallerContext context, object o, SymbolId name)
>          >
>          >
>          >
>          >     1.5365
>          >
>          >
>          >
>          >     5.3456
>          >
>          >
>          >
>          >     558524
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Textpos()
>          >
>          >
>          >
>          >     1.4020
>          >
>          >
>          >
>          >     1.4020
>          >
>          >
>          >
>          >     32648926
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Advance()
>          >
>          >
>          >
>          >     1.1916
>          >
>          >
>          >
>          >     2.9526
>          >
>          >
>          >
>          >     13703950
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Textto(int newpos)
>          >
>          >
>          >
>          >     1.1218
>          >
>          >
>          >
>          >     1.1218
>          >
>          >
>          >
>          >     24120890
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPeek()
>          >
>          >
>          >
>          >     1.0579
>          >
>          >
>          >
>          >     1.0579
>          >
>          >
>          >
>          >     24120894
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     Regex.Run(bool quick, int prevlen, string input, int
>         beginning, int
>          >     length, int startat)
>          >
>          >
>          >
>          >     0.7280
>          >
>          >
>          >
>          >     102.4644
>          >
>          >
>          >
>          >     43009
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPush(int I1)
>          >
>          >
>          >
>          >     0.6834
>          >
>          >
>          >
>          >     0.6834
>          >
>          >
>          >
>          >     13745149
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.StackPush(int I1)
>          >
>          >
>          >
>          >     0.6542
>          >
>          >
>          >
>          >     0.6542
>          >
>          >
>          >
>          >     13703955
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPop()
>          >
>          >
>          >
>          >     0.6068
>          >
>          >
>          >
>          >     0.6068
>          >
>          >
>          >
>          >     13663035
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPush()
>          >
>          >
>          >
>          >     0.6049
>          >
>          >
>          >
>          >     0.6049
>          >
>          >
>          >
>          >     13708230
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.StackPop()
>          >
>          >
>          >
>          >     0.5836
>          >
>          >
>          >
>          >     0.5836
>          >
>          >
>          >
>          >     13703956
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.Bump()
>          >
>          >
>          >
>          >     0.4987
>          >
>          >
>          >
>          >     0.4987
>          >
>          >
>          >
>          >     10472790
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPush(int I1, int I2)
>          >
>          >
>          >
>          >     0.4864
>          >
>          >
>          >
>          >     0.4864
>          >
>          >
>          >
>          >     10472790
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPeek(int i)
>          >
>          >
>          >
>          >     0.4663
>          >
>          >
>          >
>          >     0.4663
>          >
>          >
>          >
>          >     10457859
>          >
>          >
>          >
>          >     System.Text.RegularExpressions
>          >
>          >
>          >
>          >     RegexInterpreter.TrackPop(int framesize)
>          >
>          >
>          >
>          >     0.4396
>          >
>          >
>          >
>          >     0.4396
>          >
>          >
>          >
>          >     10457859
>          >
>          >
>          >
>          >
>          >     Moving up the stack of regex.Go(), most calls originate from
>          >     sgmllib's parse_starttag.
>          >
>          >     HTH,
>          >     -Birsch
>          >
>          >     On Thu, Feb 21, 2008 at 2:35 PM, Birsch <birsch at gmail.com
>         <mailto:birsch at gmail.com>
>          >     <mailto:birsch at gmail.com <mailto:birsch at gmail.com>>> wrote:
>          >
>          >     Thanks Michael and Dino.
>          >
>          >     I'll prof and send update. Got a good profiler
>         recommendation for .Net?
>          >     Meanwhile I noticed the sample site below causes
>         BeautifulSoup to
>          >     generate quite a few [python] exceptions during __init__.
>         Does
>          >     IronPython handle exceptions significantly slower than
>         CPtyhon?
>          >
>          >     Repro code is simple (just build a BeautifulSoup obj with
>         mininova's
>          >     home page).
>          >     Here are the .py and .cs I used to time the diffs:
>          >
>          >     *bstest.py:*
>          >     #Bypass CPython default socket implementation with IPCE/FePy
>          >     import imp, os, sys
>          >     sys.modules['socket'] = module = imp.new_module('socket')
>          >     execfile('socket.py', module.__dict__)
>          >
>          >     from BeautifulSoup import BeautifulSoup
>          >     from urllib import urlopen
>          >     import datetime
>          >
>          >     def getContent(url):
>          >         #Download html data
>          >         startTime = datetime.datetime.now()
>          >         print "Getting url", url
>          >         html = urlopen(url).read()
>          >         print "Time taken:", datetime.datetime.now() - startTime
>          >
>          >         #Make soup
>          >         startTime = datetime.datetime.now()
>          >         print "Making soup..."
>          >         soup = BeautifulSoup(markup=html)
>          >         print "Time taken:", datetime.datetime.now() - startTime
>          >
>          >     if __name__ == "__main__":
>          >         print getContent("www.mininova.org
>         <http://www.mininova.org> <http://www.mininova.org>")
>          >
>          >
>          >     *C#:*
>          >     using System;
>          >     using System.Collections.Generic;
>          >     using System.Text;
>          >     using IronPython.Hosting;
>          >
>          >     namespace IronPythonBeautifulSoupTest
>          >     {
>          >         public class Program
>          >         {
>          >             public static void Main(string[] args)
>          >             {
>          >                 //Init
>          >                 System.Console.WriteLine("Starting...");
>          >                 DateTime start = DateTime.Now;
>          >                 PythonEngine engine = new PythonEngine();
>          >
>          >                 //Add paths:
>          >                 //BeautifulSoup.py, socket.py, bstest.py
>         located on exe dir
>          >                 engine.AddToPath(@".");
>          >                 //CPython Lib (replace with your own)
>          >                 engine.AddToPath(@"D:\Dev\Python\Lib");
>          >
>          >                 //Import and load
>          >                 TimeSpan span = DateTime.Now - start;
>          >                 System.Console.WriteLine("[1] Import: " +
>          >     span.TotalSeconds);
>          >                 DateTime d = DateTime.Now;
>          >                 engine.ExecuteFile(@"bstest.py");
>          >                 span = DateTime.Now - d;
>          >                 System.Console.WriteLine("[2] Load: " +
>         span.TotalSeconds);
>          >
>          >                 //Execute
>          >                 d = DateTime.Now;
>          >                
>         engine.Execute("getContent(\"http://www.mininova.org\
>         <http://www.mininova.org%5C>")");
>          >                 span = DateTime.Now - d;
>          >                 System.Console.WriteLine("[3] Execute: " +
>          >     span.TotalSeconds);
>          >                 span = DateTime.Now - start;
>          >                 System.Console.WriteLine("Total: " +
>         span.TotalSeconds);
>          >
>          >
>          >             }
>          >         }
>          >     }
>          >
>          >
>          >     On Wed, Feb 20, 2008 at 6:57 PM, Dino Viehland
>          >     <dinov at exchange.microsoft.com
>         <mailto:dinov at exchange.microsoft.com>
>         <mailto:dinov at exchange.microsoft.com
>         <mailto:dinov at exchange.microsoft.com>>>
>          >     wrote:
>          >
>          >     We've actually had this issue reported once before a long
>         time ago -
>          >     it's a very low CodePlex ID -
>          >    
>         http://www.codeplex.com/IronPython/WorkItem/View.aspx?WorkItemId=651
>          >
>          >     We haven't had a chance to investigate the end-to-end
>         scenario.  If
>          >     someone could come up with a smaller simpler repro that'd
>         be great.
>          >      Otherwise we haven't forgotten about it we've just had more
>          >     immediately pressing issues to work on :(.
>          >
>          >
>          >     -----Original Message-----
>          >     From: users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>
>          >     <mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>>
>          >     [mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>
>          >     <mailto:users-bounces at lists.ironpython.com
>         <mailto:users-bounces at lists.ironpython.com>>] On Behalf Of
>         Michael Foord
>          >     Sent: Wednesday, February 20, 2008 5:20 AM
>          >     To: Discussion of IronPython
>          >     Subject: Re: [IronPython] Slow Performance of CPython libs?
>          >
>          >     Birsch wrote:
>          >      > Hi - We've been using IronPython successfully to allow
>         extensibility
>          >      > of our application.
>          >      >
>          >      > Overall we are happy with the performance, with the
>         exception of
>          >      > BeautifulSoup which seems to run very slowly: x5 or
>         more time to
>          >      > execute compared to CPython.
>          >      >
>          >      > Most of the time seems to be spent during __init__()
>         of BS, where the
>          >      > markup is parsed.
>          >      >
>          >      > We suspect this has to do with the fact that our
>         CPython env is
>          >      > executing .pyc files and can precompile its libs,
>         while the
>          >     IronPython
>          >      > environment compiles each iteration. We couldn't find
>         a way to
>          >      > pre-compile the libs and then introduce them into the
>         code, but
>          >     in any
>          >      > case this will result in a large management overhead
>         since the amount
>          >      > of CPython libs we expose to our users contains 100's
>         of modules.
>          >      >
>          >      > Any ideas on how to optimize?
>          >
>          >     I think it is worth doing real profiling to find out
>         where the time is
>          >     being spent during parsing.
>          >
>          >     If it is spending most of the time in '__init__' then the
>         time is
>          >     probably not spent in importing - so compilation isn't
>         relevant and it
>          >     is a runtime performance issue. (Importing is much slower
>         with
>          >     IronPython and at Resolver Systems we do use precompiled
>         binaries - but
>          >     strangely enough it doesn't provide much of a performance
>         gain.)
>          >
>          >     Michael
>          >     http://www.manning.com/foord
>          >
>          >      >
>          >      > Thanks,
>          >      > -Birsch
>          >      >
>          >      > Note: we're using FePy/IPCE libs with regular IP
>         v1.1.1 runtime DLLs
>          >      > (this was done to overcome library incompatibilities
>         and network
>          >      > errors). However, the relevant slow .py code (mainly
>         SGMLParser and
>          >      > BeautifulSoup) is the same.
>          >      >
>          >    
>         ------------------------------------------------------------------------
>          >      >
>          >      > _______________________________________________
>          >      > Users mailing list
>          >      > Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>
>         <mailto:Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>>
>          >      >
>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>          >      >
>          >
>          >     _______________________________________________
>          >     Users mailing list
>          >     Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>
>         <mailto:Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>>
>          >     http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>          >     _______________________________________________
>          >     Users mailing list
>          >     Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>
>         <mailto:Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>>
>          >     http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>          >
>          >
>          >
>          >
>          >
>          >
>          >     _______________________________________________
>          >     Users mailing list
>          >     Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>
>         <mailto:Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>>
>          >     http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>          >
>          >
>          >
>          >
>         ------------------------------------------------------------------------
>          >
>          > _______________________________________________
>          > Users mailing list
>          > Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>          > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>         _______________________________________________
>         Users mailing list
>         Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>         _______________________________________________
>         Users mailing list
>         Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> 
>          
> 
> 
>         _______________________________________________
>         Users mailing list
>         Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com



More information about the Ironpython-users mailing list