python philosophical question - strong vs duck typing

Neil Cerutti neilc at norwich.edu
Tue Jan 3 14:40:25 EST 2012


On 2012-01-03, David Harks <dave at dwink.net> wrote:
> On 1/3/2012 12:13 PM, Sean Wolfe wrote:
>> if we are coding in python but looking for
>> more performance,
>
> Are you in fact in this situation? Despite years of folks
> mentioning how Python is 'slower than C++', I've seen a project
> where a developer churned out a feature using Python's
> generators that performed much faster than the C++
> implementation it replaced. It wasn't because the C++ was
> slower by nature; it's because it was harder to express the
> optimal algorithm in C++ so the C++ developer chose a
> sub-optimal approach in the interest of meeting a deadline.

I was again looking through The Practice of Programming
(Kernighan & Pike) today, and the chapter on the Markov chain
program makes similar tradeoffs. For educational purposes they
implement the program in several languages, and the C
implemenation is fastest by far. The C++ implementation is
surprisingly not close in performance to the C version, for two
main reasons.

1. They used a std::deque to store the prefixes, when a std::list
   is the correct container. They point this out in a note on the
   poor performance of the C++ program, but do not note the irony
   that they had automatically used a singly-linked list in C.

2. When building the data structure in C they store only
   pointers to the original text, which they munge in memory to
   create zero-terminated strings. This is idiomatic C.
   But there's no reason C++ can't do a similar optimization.
   A std::list<char*> is certainly possible, but loses you most
   of the benefits of the standard containers. Best is probably
   to create a registry of words using a std::map<int,
   std::string>, with your lists storing references to words in
   the registry. You still have to copy of the text, but only
   once. The C++ implementation starts to smell sort of like
   Python. ;)

-- 
Neil Cerutti



More information about the Python-list mailing list