C++ performance myths debunked

Gerson Kurz gerson.kurz at t-online.de
Fri Aug 2 20:28:35 CEST 2002


Download the libraries from over at www.boost.org. Compile a sample
program (with VC6), get 31 warnings like this (I'm not making this up,
and yes, it is *one* single warning):

C:\Program Files\Microsoft Visual Studio\VC98\INCLUDE\utility(31) :
warning C4786:
'boost::detail::find_param_continue::select<boost::detail::cons_type<boost::detail::cons_type<boost::detail::value_type_tag,std::basic_string<char,std::char_traits<ch
ar>,std::allocator<char> >
>,boost::detail::cons_type<boost::detail::cons_type<boost::detail::reference_tag,std::basic_string<char,std::char_traits<char>,std::allocator<char>
> const
&>,boost::detail::cons_type<boost::detail::cons_type<boost::detail
::pointer_tag,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const *>,boost::detail::cons_type<boost::detail::cons_type<boost::detail::iterator_category_tag,std::forward_iterator_tag>,boost::detail::cons_type<boost::detail::con
s_type<boost::detail::difference_type_tag,int>,boost::detail::end_of_list>
> > > >,boost::detail::iterator_category_tag>' : identifier was
truncated to '255' characters in the debug information

So much for C++ syntax.

This is the sample code (includes added here so it compiles standalone):

   #include <string>
   #include <boost/tokenizer.hpp>
   using namespace std;
   using namespace boost;

   string s = "This is,  a test";
   tokenizer<> tok(s);
   for(tokenizer<>::iterator beg=tok.begin(); beg!=tok.end();++beg)
   {
      const char* shit = beg->c_str();
   }

100000 loops of that take 2609 ms. Move the "string s" declaration out
of the loop, because constructing it does some allocating, and you go
down to 2300 ms.

Next, try my good old classlib I've been using for five years now:

   PTokens tokens("This is,  a test"," ");
   for( DWORD i = 0; i < tokens.Count(); i++ )
   {
      const char* shit = tokens[i];
   }

100000 loops of that take 1100 ms. Less code, and more readable
(granted, to me, who has been using that stuff for some time).

Now try the same thing with strtok. It sure doesn't look so pretty:

    // PString makes a copy of the input string,
    // because strtok modifies its argument in place
    PString s("This is,  a test");
    char* token = strtok( s, " " );
    while( token != NULL )
    {
       const char* shit = token;
       token = strtok( NULL, " " );
    }

100000 loops of that take .... TADA: 200 ms. 

Now, let's try that in Python.

import win32api

def test():
    s0 = win32api.GetTickCount()   # milliseconds since system start
    i = 0
    while i < 100000:
        for token in "This is,  a test".split():
            shit = token
        i += 1
    elapsed = win32api.GetTickCount() - s0
    print "Took %d ms" % elapsed

Took 484 ms. 
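Incidentally, the measurement doesn't need win32api at all. Here is a
portable sketch using only the standard time module (the function name
and the explicit return are my additions, not part of the original
test):

```python
import time

def test_portable():
    # time.time() works everywhere, unlike win32api.GetTickCount()
    s0 = time.time()
    for _ in range(100000):
        tokens = "This is,  a test".split()
    elapsed = (time.time() - s0) * 1000.0  # convert seconds to ms
    print("Took %.2f ms" % elapsed)
    return tokens

test_portable()
```

The absolute numbers will of course differ from the 2002 figures above;
only the relative comparison matters.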

Damn, I feel stupid now for my own C++ classes!

Hey, this is not a scientific study or anything - it's just: you
download something that is said to be "state of the C++ art" (see the
associated Kuro5hin article), and it is *so* much worse than Python
where you really would not have expected it. Coming from C++ I always
had my suspicions about the performance of all those neat string ops
Python has - but it now seems I have to reconsider some of my
preconceptions. 
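One caveat about the comparison: none of the loops above actually strip
the comma - split() and strtok with " " both hand back "is," as a
token. If commas should count as delimiters too, Python's re module
does it in one line (the pattern is just my illustration, not something
from the original test):

```python
import re

# split on any run of commas and/or whitespace characters
tokens = re.split(r"[,\s]+", "This is,  a test")
print(tokens)   # ['This', 'is', 'a', 'test']
```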

Thanks, Python!




More information about the Python-list mailing list