Dumb python questions

Alex Martelli aleax at aleax.it
Thu Aug 16 12:44:35 EDT 2001


"Paul Rubin" <phr-n2001 at nightsong.com> wrote in message
news:7xy9okeeha.fsf at ruckus.brouhaha.com...
    ...
> over a lot of use and works well.  PHP basically copied it so PHP's
> pack/unpack work the same way.  Python's struct package is very limited
> by comparison--it does only a few formats, and the record length has
> to be hard coded into the format string.  So to turn a list of ints

If I felt severely limited by Python's struct package, I'd see it
as an occasion to hack on it to extend it to my liking -- then one
can submit the enhanced struct for inclusion into Python and/or
just publish it on Parnassus or somewhere.  (That's assuming one's
a C hacker, but, as gawk's original author, I would definitely
believe you must be!-).

The fact that struct is a separate module means all you have to
grok for the hacking is 1200 lines of code in structmodule.c,
and the reasonably-clean Python C API -- a far sight from
having to hack the internals of a large program... that's
always gonna be hairier.
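
For the narrower case of a list of ints, no C hacking is needed:
the repeat count can be computed at run time rather than hard
coded.  A minimal sketch, assuming a modern Python 3 and big-endian
32-bit ints; pack_ints and unpack_ints are hypothetical helper
names, not part of the struct module:

```python
import struct

def pack_ints(values):
    """Pack a sequence of ints as big-endian 32-bit signed integers."""
    # Build the repeat count into the format at run time, e.g. ">3i".
    fmt = ">%di" % len(values)
    return struct.pack(fmt, *values)

def unpack_ints(data):
    """Inverse of pack_ints: recover the list of ints from the bytes."""
    count = len(data) // struct.calcsize(">i")
    return list(struct.unpack(">%di" % count, data))
```

This only covers homogeneous records; anything richer would still
call for extending struct itself.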

> Well, there's the XDR package, which apparently does the right thing.
> But then I looked at the code for it, and it works something like this:
>
>    def pack(stuff):
>      buf = ""
>      for i in stuff:
>         buf = buf + encode(i)  # string concatenation
>
> Very natural coding, but yikes!  The running time is quadratic in the
> number of items being coded, because the string keeps growing as you
> append stuff and keep reallocating it.  You won't notice it for small

I agree, it's a serious coding defect in xdrlib.py.  Every
self.__buf = self.__buf + something statement will indeed kill
performance if a Packer is used to pack something big, and each
should be replaced by appending the something to a list of
strings -- the one-string buffer then being built just in time,
by joining the list, in the get_buffer method.
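
A minimal sketch of that list-of-pieces approach follows.  The
ListPacker class is a hypothetical stand-in, not xdrlib's actual
Packer, and it assumes a modern Python 3 where the pieces are
bytes objects:

```python
import struct

class ListPacker:
    """Hypothetical stand-in for a Packer; not the real xdrlib class."""

    def __init__(self):
        # Pieces are collected in a list: each append is amortized O(1),
        # so packing n items costs O(n) overall instead of O(n**2).
        self._chunks = []

    def pack_uint(self, x):
        # Encode one unsigned 32-bit big-endian integer and keep the piece.
        self._chunks.append(struct.pack(">L", x))

    def get_buffer(self):
        # Build the single buffer only when it is actually asked for.
        joined = b"".join(self._chunks)
        # Keep the joined result so that repeated calls stay cheap.
        self._chunks = [joined]
        return joined
```

The join copies each piece exactly once, which is what removes the
quadratic behaviour of repeated concatenation.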

The fact that you're already finding such performance bugs, that
must have been there for ages (xdrlib is pretty old, I think),
confirms that you're *definitely* the kind of person we want
in the Python community!-)  I've submitted this as a bug to
sourceforge, as I see you hadn't done that.

> objects but then suddenly you try generating some megabyte images (or
> whatever) with it and wonder why your program bogged down.  So I'd
> have to say this isn't a mature implementation.  A real production
> system shouldn't have gotchas like that if they're avoidable (which

If you know of any large software system without at least one
silly performance-bug lurking somewhere in rarely-used parts
of the system, I'd really love to be introduced to it!-).  If,
as and when we notice such bugs, we submit them, they get fixed,
and the quality keeps going upwards -- that's supposed to be
a key benefit of open-source, after all:-).

> this is).  I saw a PEP saying they're adding buffer objects (something
> like Java StringBuf's) and maybe some of these problems will go away.

Buffer objects may be even faster than a list of strings that
are joined by ''.join, but the key issue is noticing that some
piece of software is doing the xx=xx+something routine -- fixing
that is trivial once noticed, even if a builtin buffer object or
array.array could provide another, say, 5% or 10% performance
bonus wrt a list of strings joined by ''.join.
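
To compare the two styles side by side, here is a sketch assuming
a modern Python 3, where bytearray plays the role of a mutable
buffer object (it is not the buffer object of the PEP mentioned
above).  Both function names are hypothetical, and which one is
faster is something to measure, for instance with timeit, rather
than assume:

```python
import struct

def pack_with_join(values):
    """Collect the encoded pieces in a list and join them once at the end."""
    pieces = [struct.pack(">L", v) for v in values]
    return b"".join(pieces)

def pack_with_bytearray(values):
    """Extend one mutable buffer in place instead of rebinding a string."""
    buf = bytearray()
    for v in values:
        buf += struct.pack(">L", v)  # in-place extend, amortized O(1)
    return bytes(buf)
```

Either way the work stays linear in the number of items, which is
the part that matters.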


> I think as a work of pure language design, Python is excellent despite
> a few minor warts.

I fully agree.

> I'm less impressed by the current runtime library
> which I think is less well developed than those of Perl, Java, Common
> Lisp, etc.  Python is supposed to be a "batteries included" language

If you count CPAN as part of Perl, yes, its library is huge.  So are,
without any argument, CL's and Java's, although I could raise a
zillion objections about the design of the latter.  I'm getting
familiar with Java's libraries again through Jython (a complete
implementation of Python that runs on a JVM and seamlessly
gives me complete access to any Java class I could ever possibly
have around) after a few years' lapse in my Java usage, and I
see they've grown very fat but, in many spots, not necessarily
much better (IMHO).

> but I think it would benefit if the implementers flipped through the
> manuals for Perl, CL, etc. and made Python counterparts to those
> languages' library functions when there was a reason to do so.  That
> would save a lot of experimenting and evolution.  Why reinvent the
> wheel?

Copying good parts from another language is indeed an excellent
idea -- Python's RE architecture is a reasonably faithful copy
of Perl's, for example, and I've heard that Perl 5's addition of
OO features was partly inspired by the way OO is done in Python.


> Python's online documentation is also nowhere near as thorough as
> Perl's.  I see on amazon.com that there's a 900 page O'Reilly Python
> book that I can buy, but if I have to buy a separate book to get
> important info, then the distribution isn't really self contained.  So
> this needs improvement too.

I suspect that "Programming Python" isn't really giving you
important info that's omitted from the online docs, as much
as using a completely different style, lots of big examples
and case studies, etc, to present exactly the same language
and standard libraries.  Yes, there are holes (particularly
in the extending and embedding sections), but the language
and library reference manuals seem pretty complete to me.


> I guess it partly depends on what you're doing.  Python is clearly
> further developed in some application domains than others.  Complex
> numerics in Python are still something of a dark spot, as we saw.  As

Sorry, I don't get that -- I must have missed something; I
saw you ask about .real/.imag and branch cuts and gave you
doc pointers.  Where's the "dark spot" and where did we
see it?

> I wouldn't say I'm hunting for problems, but I try to do necessary
> things in obvious ways and I hit snags and have been taking note of
> them.  I guess that after a while one gets used to the snags.

Hunting for problems is a GREAT thing, as long as one points
them out, ideally (when clearly bugs, such as those in xdrlib)
by posting a bug on sourceforge, or even fixes them (e.g. by
extending struct or at least proposing a PEP about extending
it) -- but that isn't crucial: a defect that is clearly pointed
out and acknowledged as such does quickly get fixed, anyway,
particularly in the extensions and libraries (problems deep
in the core of the language would be harder to fix, of course).


Alex


>
> Anyway, the positive is definitely there, or else I wouldn't continue
> to mess with Python.  There's just not a whole lot to post about it.




