[IronPython] Slow Performance of CPython libs?

Birsch birsch at gmail.com
Thu Feb 21 13:35:59 CET 2008


Thanks Michael and Dino.

I'll profile and send an update. Can anyone recommend a good profiler for .NET?
Meanwhile, I noticed that the sample site below causes BeautifulSoup to raise
quite a few Python exceptions during __init__. Does IronPython handle
exceptions significantly slower than CPython?
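
One rough way to check the exception question in isolation is a micro-benchmark that runs the same loop with and without a raised exception, under both interpreters. This is only a sketch: the function names (time_loop, no_exception, with_exception) and the iteration count are made up for illustration, it is not taken from BeautifulSoup, and it says nothing about how many exceptions the parser actually raises.

```python
import time


def time_loop(body, iterations):
    """Call body() `iterations` times and return the elapsed wall-clock seconds."""
    start = time.time()
    for _ in range(iterations):
        body()
    return time.time() - start


def no_exception():
    # Same try/except shape as below, but the conversion succeeds.
    try:
        int("42")
    except ValueError:
        pass


def with_exception():
    # int() raises ValueError here, so every call pays the raise/catch cost.
    try:
        int("not a number")
    except ValueError:
        pass


if __name__ == "__main__":
    n = 100000  # arbitrary count, adjust so each loop takes a measurable time
    base = time_loop(no_exception, n)
    raised = time_loop(with_exception, n)
    print("no exception:   %.3f s" % base)
    print("with exception: %.3f s" % raised)
```

Running the same file with python.exe and ipy.exe and comparing the ratio of the two timings on each should show whether exception handling alone could account for a large part of the gap.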

The repro is simple: just build a BeautifulSoup object from mininova's home
page.
Here are the .py and .cs files I used to time the difference:

*bstest.py:*
#Bypass CPython default socket implementation with IPCE/FePy
import imp, os, sys
sys.modules['socket'] = module = imp.new_module('socket')
execfile('socket.py', module.__dict__)

from BeautifulSoup import BeautifulSoup
from urllib import urlopen
import datetime

def getContent(url):
    #Download html data
    startTime = datetime.datetime.now()
    print "Getting url", url
    html = urlopen(url).read()
    print "Time taken:", datetime.datetime.now() - startTime

    #Make soup
    startTime = datetime.datetime.now()
    print "Making soup..."
    soup = BeautifulSoup(markup=html)
    print "Time taken:", datetime.datetime.now() - startTime

if __name__ == "__main__":
    #urlopen needs the scheme; getContent prints its own timings and returns None
    getContent("http://www.mininova.org")


*C#:*
using System;
using System.Collections.Generic;
using System.Text;
using IronPython.Hosting;

namespace IronPythonBeautifulSoupTest
{
    public class Program
    {
        public static void Main(string[] args)
        {
            //Init
            System.Console.WriteLine("Starting...");
            DateTime start = DateTime.Now;
            PythonEngine engine = new PythonEngine();

            //Add paths:
            //BeautifulSoup.py, socket.py, bstest.py located on exe dir
            engine.AddToPath(@".");
            //CPython Lib (replace with your own)
            engine.AddToPath(@"D:\Dev\Python\Lib");

            //Import and load
            TimeSpan span = DateTime.Now - start;
            System.Console.WriteLine("[1] Import: " + span.TotalSeconds);
            DateTime d = DateTime.Now;
            engine.ExecuteFile(@"bstest.py");
            span = DateTime.Now - d;
            System.Console.WriteLine("[2] Load: " + span.TotalSeconds);

            //Execute
            d = DateTime.Now;
            engine.Execute("getContent(\"http://www.mininova.org\")");
            span = DateTime.Now - d;
            System.Console.WriteLine("[3] Execute: " + span.TotalSeconds);
            span = DateTime.Now - start;
            System.Console.WriteLine("Total: " + span.TotalSeconds);
        }
    }
}



On Wed, Feb 20, 2008 at 6:57 PM, Dino Viehland <dinov at exchange.microsoft.com>
wrote:

> We've actually had this issue reported once before a long time ago - it's
> a very low CodePlex ID -
> http://www.codeplex.com/IronPython/WorkItem/View.aspx?WorkItemId=651
>
> We haven't had a chance to investigate the end-to-end scenario. If
> someone could come up with a smaller, simpler repro, that'd be great.
> Otherwise, we haven't forgotten about it; we've just had more immediately
> pressing issues to work on :(.
>
> -----Original Message-----
> From: users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] On Behalf Of Michael Foord
> Sent: Wednesday, February 20, 2008 5:20 AM
> To: Discussion of IronPython
> Subject: Re: [IronPython] Slow Performance of CPython libs?
>
> Birsch wrote:
> > Hi - We've been using IronPython successfully to allow extensibility
> > of our application.
> >
> > Overall we are happy with the performance, with the exception of
> > BeautifulSoup, which seems to run very slowly: it takes 5x or more time
> > to execute compared to CPython.
> >
> > Most of the time seems to be spent during __init__() of BS, where the
> > markup is parsed.
> >
> > We suspect this has to do with the fact that our CPython env is
> > executing .pyc files and can precompile its libs, while the IronPython
> > environment compiles them on every run. We couldn't find a way to
> > pre-compile the libs and then introduce them into the code, but in any
> > case this would result in a large management overhead, since the set
> > of CPython libs we expose to our users contains hundreds of modules.
> >
> > Any ideas on how to optimize?
>
> I think it is worth doing real profiling to find out where the time is
> being spent during parsing.
>
> If it is spending most of the time in '__init__' then the time is
> probably not spent in importing - so compilation isn't relevant and it
> is a runtime performance issue. (Importing is much slower with
> IronPython and at Resolver Systems we do use precompiled binaries - but
> strangely enough it doesn't provide much of a performance gain.)
>
> Michael
> http://www.manning.com/foord
>
> >
> > Thanks,
> > -Birsch
> >
> > Note: we're using FePy/IPCE libs with regular IP v1.1.1 runtime DLLs
> > (this was done to overcome library incompatibilities and network
> > errors). However, the relevant slow .py code (mainly SGMLParser and
> > BeautifulSoup) is the same.
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Users mailing list
> > Users at lists.ironpython.com
> > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> >
>