[Guppy-pe-list] An iteration idiom (Was: Re: loading files containing multiple dumps)
Chris Withers
chris at simplistix.co.uk
Fri Sep 11 11:32:42 EDT 2009
Sverker Nilsson wrote:
> If you just use heap(), and only want total memory not relative to a
> reference point, you can just use hpy() directly. So rather than:
>
> CASE 1:
>
> h=hpy()
> h.heap().dump(...)
> #other code, the data internal to h is still around
> h.heap().dump(...)
>
> you'd do:
>
> CASE 2:
>
> hpy().heap().dump(...)
> #other code. No data from Heapy is hanging around
> hpy().heap().dump(...)
>
> The difference is that in case 1, the second call to heap() could reuse
> the internal data in h,
But that internal data would have to hang around, right? (which might,
in itself, cause memory problems?)
> whereas in case 2, it would have to be recreated
> which would take longer time. (The data would be such things as the
> dictionary owner map.)
How long is longer? Do you have any metrics that would help make good
decisions about when to keep a hpy() instance around and when it's best
to save memory?
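The tradeoff being discussed can be sketched with a toy model (this is my own illustration, not Heapy's real internals: `Analyzer` and its `rebuilds` counter are hypothetical stand-ins for an `hpy()` context and its owner map):

```python
class Analyzer:
    """Toy stand-in for an hpy() context (hypothetical, not Heapy's code).

    A kept instance builds its expensive internal table once and reuses
    it, at the cost of that table staying in memory between calls."""

    def __init__(self):
        self._table = None
        self.rebuilds = 0  # counts how often the table is recomputed

    def _build_table(self):
        self.rebuilds += 1
        # Stands in for expensive internal data such as the owner map.
        return {i: i * i for i in range(100_000)}

    def heap(self):
        if self._table is None:  # cached between calls on the same instance
            self._table = self._build_table()
        return len(self._table)

# Case 1: keep one instance around -- table built once, memory retained.
a = Analyzer()
a.heap(); a.heap()
print("case 1 rebuilds:", a.rebuilds)   # 1

# Case 2: fresh instance per call -- nothing lingers, but it rebuilds each time.
total = 0
for _ in range(2):
    b = Analyzer()
    b.heap()
    total += b.rebuilds
print("case 2 rebuilds:", total)        # 2
```

Timing the two patterns on a real workload (e.g. with `timeit`) would give the metrics asked about above.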
>>> Do you mean we should actually _remove_ features to create a new
>>> standalone system?
>> Absolutely, why provide more than is used or needed?
>
> How should we understand this? Should we have to support 2 or more
> systems depending on what functionality you happen to need? Or do
> you mean most functionality is actually _never_ used by
> _anybody_ (and will not be in the future)? That would be quite gross,
> wouldn't it?
I'm saying have one project and dump all the excess stuff that no-one
but you uses ;-)
Or, maybe easier, have a separate core package that just has the
essentials in a simple, clean fashion, and then another package that
builds on it to add all the other stuff...
> It also gives as an alternative, "If this is not possible, a string of
> the form <...some useful description...> should be returned"
>
> The __repr__ I use doesn't have the enclosing <>; granted, maybe I missed
> this, or it wasn't in the docs in 2005, or I didn't think it was important
> (still don't), but was that really what the complaint was about?
No, it was about the fact that when I do repr(something_from_heapy) I
get a shedload of text.
> I thought it was more useful to actually get information of what was
> contained in the object directly at the prompt, than try to show how to
> recreate it which wasn't possible anyway.
Agreed, but I think the stuff you currently have in __repr__ would be
better placed in its own method:
>>> heap()
<IdentitySet object at 0x0000 containing 10 items>
>>> _.show()
... all the current __repr__ output
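The split being suggested might look like this (a hypothetical sketch, not Heapy's actual classes: `IdentitySetLike` is an invented name, and the terse repr follows the `<...>` form the Python docs recommend):

```python
class IdentitySetLike:
    """Sketch of the proposal: terse __repr__ for the prompt,
    with the detailed breakdown moved behind an explicit show()."""

    def __init__(self, items):
        self.items = list(items)

    def __repr__(self):
        # One short line in the <...some useful description...> style.
        return "<IdentitySet object at 0x%x containing %d items>" % (
            id(self), len(self.items))

    def show(self):
        # The verbose per-item output that currently lives in __repr__.
        for i, item in enumerate(self.items):
            print("%3d: %r" % (i, item))

s = IdentitySetLike(range(10))
print(repr(s))  # short summary only
s.show()        # full detail, but only on request
```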
>> That should have another name... I don't know what a partition or
>> equivalence order are in the contexts you're using them, but I do know
>> that hijacking __getitem__ for this is wrong.
>
> Opinions may differ; I'd say one can in principle never 'know' whether such
> a thing is 'right' or 'wrong', but that gets us into philosophical
> territory. Anyway...
I would bet that if you asked 100 experienced python programmers, most
of them would tell you that what you're doing with __getitem__ is wrong,
some might even say evil ;-)
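The conventional alternative would be to keep `__getitem__` for plain item access and give the partition-by-equivalence-relation operation an explicit name (again a hypothetical sketch, not Heapy's implementation; `Partition` and `by` are invented here for illustration):

```python
class Partition:
    """Sketch: expose 'group by equivalence relation' as a named method
    rather than overloading __getitem__, which most readers expect to
    mean positional or key lookup."""

    def __init__(self, objects):
        self.objects = list(objects)

    # Conventional __getitem__: ordinary indexing, no surprises.
    def __getitem__(self, index):
        return self.objects[index]

    # The partitioning becomes a self-describing call instead.
    def by(self, key):
        groups = {}
        for obj in self.objects:
            groups.setdefault(key(obj), []).append(obj)
        return groups

p = Partition([1, "a", 2.0, 3, "b"])
print(p[0])        # plain indexing -> 1
print(p.by(type))  # the grouping is spelled out, not hidden behind []
```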
> For a tutorial provided by someone who did not seem to share your
> conviction about indexing, but seemed to regard the way Heapy does it as
> natural (though he has other valid complaints, and it is somewhat
> outdated, e.g. wrt 64-bit), see:
>
> http://www.pkgcore.org/trac/pkgcore/doc/dev-notes/heapy.rst
This link has become broken recently, but I don't remember reading the
author's comments as liking the indexing stuff...
Chris
--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk