Using the heapq module, I found that it's not easy to use. Or, at
least, that it could be more straightforward to use.
My main issue is that, for correct usage:
- the user has to provide an external list, which must not be used for
anything else so the invariant isn't broken
- to alter the heap queue, a bunch of functions must be used, always
passing that external list
I think that turning the "external list" into an internal attribute,
and the "bunch of functions" into methods, would make the module
easier to use, safer, and more object oriented.
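To make it concrete, this is a minimal sketch of what correct usage
looks like today, with the caller owning the list and passing it to
every call:

import heapq

heap = []                     # the external list the caller must manage
heapq.heappush(heap, 5)
heapq.heappush(heap, 1)
heapq.heappush(heap, 3)
print heapq.heappop(heap)     # 1 -- the same list goes to every function call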
So, I basically coded it. What do you think about including this class
in the heapq module?
"""
import heapq

class Heap(object):
    '''Maintains a heapified list, always conserving the invariant.

    Heaps are arrays for which heap[k] <= heap[2*k+1] and
    heap[k] <= heap[2*k+2] for all k, counting elements from zero.
    '''

    def __init__(self, iterable=None):
        '''Initializes the heap from any iterable.

        >>> Heap([1, 2])
        Heap([1, 2])
        >>> Heap([])
        Heap([])
        >>> Heap()
        Heap([])
        >>> Heap((1, 2, 3))
        Heap([1, 2, 3])
        >>> Heap(x**2 for x in range(3))
        Heap([0, 1, 4])
        '''
        if iterable is None:
            self._queue = []
        else:
            self._queue = list(iterable)
            heapq.heapify(self._queue)

    def __repr__(self):
        return "Heap(%s)" % self._queue

    def push(self, item):
        '''Push the item to the heap queue.

        >>> h = Heap()
        >>> h.push(5)
        >>> h.push(4)
        >>> h.push(3)
        >>> h.push(2)
        >>> h.push(1)
        >>> h.push(0)
        >>> h
        Heap([0, 2, 1, 5, 3, 4])
        '''
        heapq.heappush(self._queue, item)

    def pop(self):
        '''Pop one item from the heap queue.

        >>> h = Heap([0, 2, 1, 5, 3, 4])
        >>> h.pop()
        0
        >>> h.pop()
        1
        >>> h.pop()
        2
        >>> h.pop()
        3
        >>> h.pop()
        4
        >>> h.pop()
        5
        >>> h
        Heap([])
        >>> h.pop()
        Traceback (most recent call last):
            ...
        IndexError: index out of range
        '''
        return heapq.heappop(self._queue)

    def pushpop(self, item):
        '''Push the item and pop one from the heap queue.

        Note that this method is more efficient than calling both
        push() and pop() separately.

        >>> h = Heap()
        >>> h.pushpop(7)
        7
        >>> h.push(5)
        >>> h.push(4)
        >>> h.push(3)
        >>> h.pushpop(7)
        3
        >>> h.pushpop(7)
        4
        >>> h
        Heap([5, 7, 7])
        '''
        return heapq.heappushpop(self._queue, item)

    def replace(self, item):
        '''Pop one item and push the received one into the heap queue.

        Note that this method is more efficient than calling both
        pop() and push() separately.

        >>> h = Heap()
        >>> h.replace(3)
        Traceback (most recent call last):
            ...
        IndexError: index out of range
        >>> h.push(7)
        >>> h.replace(2)
        7
        >>> h
        Heap([2])
        >>> h.push(3)
        >>> h
        Heap([2, 3])
        >>> h.replace(1)
        2
        >>> h.replace(9)
        1
        >>> h
        Heap([3, 9])
        '''
        return heapq.heapreplace(self._queue, item)
"""
Regards,
--
. Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
Hi,
I've created a Python Bytecode Verifier in Python. I'm not a Python
guru so I borrowed coding patterns from C/C++. I also created this
with C portability in mind. The only reason I used Python was to
experiment with Python, and because it was easier to morph the code
during development.
If this program were ported to C it would only need 8 bytes per opcode
(some additional storage to track blocks) and a single pass. I haven't
found backward jumps to previously unused code in any compiled Python
code but it can easily be supported. In that case some more partial
passes are required.
I was also able to successfully verify all Python files in the
Python-2.5.2 source package.
The verification algorithm should be quite complete, but I may have
missed some limitations of the interpreter that could be checked by
the verifier as well.
The ability to create this verifier proved that although Python
bytecode is designed for a dynamically typed interpreter, it is still
easily verifiable. I am willing to port this code to C, but only if
there is a chance of it being included in Python.
I believe that Python in general would benefit from having the ability
to safely load .pyc files and create code objects on the fly.
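As things stand, a .pyc is loaded without any checks; whatever code
object was marshalled is simply trusted. A minimal sketch of today's
situation (the file name is just an example):

import marshal

f = open("module.pyc", "rb")   # "module.pyc" is a hypothetical file
f.read(8)                      # skip the 4-byte magic number and 4-byte timestamp
code = marshal.load(f)         # the code object is loaded with no validation
f.close()

exec code                      # runs whatever bytecode was in the file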
Both Java and .NET have the ability to safely load compiled byte code.
The .NET Framework, just like Python, also has the ability to create
and execute new code at run-time.
You may feel that enabling closed-source applications and/or creating
a multi-language runtime would hurt Python, but both of these have
contributed to the success of Java (both the language and the
runtime).
Kornél
Hello,
A very small change: what about making the path argument optional for
os.listdir, and using the current working directory if it is not
provided?
listdir(...)
    listdir(path=None) -> list_of_strings

    Return a list containing the names of the entries in the directory.

        path: path of directory to list. If path is None,
              os.getcwd() is used.

    The list is in arbitrary order. It does not include the special
    entries '.' and '..' even if they are present in the directory.
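A pure-Python sketch of the proposed behaviour, just to make it
concrete (the real change would of course live in the os module
itself):

import os

def listdir(path=None):
    # Proposed default: fall back to the current working directory.
    if path is None:
        path = os.getcwd()
    return os.listdir(path)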
Cheers
Tarek
--
Tarek Ziadé | http://ziade.org
Hello there.
Please see my feature request: http://bugs.python.org/issue6326
The idea is to speed up the swapping of list elements so that
a.swap(b)
is equivalent to
a[:], b[:] = b[:], a[:]
but without all the overhead of creating slices, assigning them and so forth.
In particular, it can help reduce the cost of creating instances of list subclasses, from raw lists:
class mylist(list):
    def first(self):
        return self[0]
m = mylist(source_list)
This certainly creates m, but does so by copying source_list. If all we want to do is turn source_list into a mylist instance, it is much quicker to:
m = mylist()
m.swap(source_list)
See the above issue for initial comments, especially concerns on how this can bypass all kinds of insertion semantics of list subclasses.
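To illustrate that concern: slice assignment already bypasses such
hooks today, and the proposed swap() would behave the same way
(LoggingList is just a made-up example):

class LoggingList(list):
    # Hypothetical subclass that tries to track every insertion.
    def append(self, item):
        print "append:", item
        list.append(self, item)

source = [1, 2, 3]
logged = LoggingList()
logged[:] = source   # like the proposed swap(), this never calls append()
print logged         # prints [1, 2, 3], but nothing was logged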
Cheers,
Kristján
Hello! This is my first time posting to a python list, so be kind to
the newbie :-)
From issue 6372:
"I propose that all formal parameters that really act as
options/switches be made keyword-only. Examples of switches are all
flags, timeouts, 'verbose' bools, maximums and minimums, etc.
This stresses the difference between needed input for a function and an
argument that changes/extends the behavior. Besides, the code would be
more readable, because instead of having some cryptic function call like
register('Pablo Torres', 2, 4, 5, False) you would have the prettier
register('Pablo Torres', hour=2, min=4, sec=5, had_reservation=False).
The implementation would just require putting a star '*' before all
options, according to PEP 3102.
If needed, I can rewrite this as a PEP."
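For reference, this is roughly what the proposed style looks like with
the PEP 3102 (Python 3) syntax; register() and its parameters are taken
from the example above:

def register(name, *, hour=0, min=0, sec=0, had_reservation=False):
    ...

register('Pablo Torres', hour=2, min=4, sec=5, had_reservation=False)  # OK
register('Pablo Torres', 2, 4, 5, False)  # TypeError: too many positional arguments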
Since this would break pretty much all the code out there, I suspect
that we should come up with a warning mechanism first. Maybe leave the
warning there for a couple of releases before making it definitive.
What do you guys think?
[Please keep the discussion in the list. Also, please avoid top posting (corrected below)]
> On 6/20/09, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar>
> wrote:
> > On Thu, 18 Jun 2009 11:04:34 -0300, zhong nanhai
> > <higerinbeijing(a)gmail.com> wrote:
> >
> >> So is it a good idea to enhance filecmp to support
> >> universal-newline mode? If so, we could compare files from
> >> different operating systems, and if they have the same content,
> >> filecmp.cmp would return True.
> >
> > With aid from itertools.izip_longest, it's a one-line recipe:
> >
> > py> print repr(open("one.txt","rb").read())
> > 'hello\nworld!\nlast line\n'
> > py> print repr(open("two.txt","rb").read())
> > 'hello\r\nworld!\r\nlast line\r\n'
> > py> import filecmp
> > py> filecmp.cmp("one.txt", "two.txt", False)
> > False
> > py> from itertools import izip_longest
> > py> f1 = open("one.txt", "rU")
> > py> f2 = open("two.txt", "rU")
> > py>
> > py> print all(line1==line2 for line1,line2 in izip_longest(f1,f2))
> > True
> >
> > Currently filecmp considers both files as binary, not text; if they
> > differ in size they're considered different and the contents are not
> > even read.
> >
> > If you want a generic text-mode file comparison, there are other
> > factors to consider in addition to line endings: character encoding,
> > BOM, character case, whitespace... All of those may be considered
> > "irrelevant differences" by some people. A generic text file
> > comparison should take all of them into account.
--- On Fri, 19-Jun-09, zhong nanhai <higerinbeijing(a)gmail.com> wrote:
> Thanks for your suggestion.
> You are right, there are a lot of things to consider if we want to
> make filecmp support text comparison. But I think we could just do a
> small feature enhancement, e.g. only the universal-newline mode. I am
> not clear on how filecmp implements the file comparison, so could you
> tell me more about that?
> And if, in the source of filecmp, it compares files just by reading
> them line by line, then we could do some further comparisons when
> encountering the newline flag (meaning the end of a line).
You can see it yourself, in lib/filecmp.py in your Python installation.
It does a binary comparison only -- and it does not read anything if file sizes differ. A text comparison should use a different algorithm; the code above already ignores end-of-line differences and breaks as soon as two lines differ. One could enhance it to add support for other options as mentioned earlier.
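For example, the recipe above could be wrapped into a small helper
along these lines (cmp_text is just an illustrative name, not an
existing filecmp function):

from itertools import izip_longest

def cmp_text(name1, name2):
    # Compare two text files ignoring end-of-line differences, reading
    # both in universal-newline mode and stopping at the first pair of
    # lines that differ.
    f1 = open(name1, "rU")
    f2 = open(name2, "rU")
    try:
        return all(line1 == line2
                   for line1, line2 in izip_longest(f1, f2))
    finally:
        f1.close()
        f2.close()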
--
Gabriel Genellina
Hello.
Reading the discussion on python-ideas about "Accessing the result of
comprehension's expression from the conditional", I've come to the
idea of where clauses, similar to Haskell's.
This solves the problem of recalculating a value multiple times. For
example, in the following expression:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
the value f(x) is calculated three times -- a problem if the function f
takes a long time to compute its value or has side effects. If we had a
where clause, we could rewrite the expression above as:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
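For comparison, the closest workaround available today is an extra
one-element for clause, which evaluates f(x) only once but buries the
binding in the middle of the expression:

[(y, y) for x in some_iterable for y in (f(x),) if y < 2]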
I think it is really useful. We can also expand this idea to lambdas,
or maybe even to introducing arbitrary scoping blocks of code.
Other thoughts:
- Can we use where clauses in lambda definitions to allow some kind
of multi-line lambdas?
Background:
Most functions that take an iterable as parameter iterate just once.
Users of the function can pass any iterable, including iterators.
The function will call iter(input) and go.
If the user wants to iterate thru a virtual collection defined by an
instance of an iterator class (built-in or user-defined) or generator
function, the user must call the gf/ic with the appropriate arguments
and pass the result.
Problem:
Some functions iterate thru the input more than once. If the input is
not an iterator, there is no problem: call iter(input) again. If the
input is an iterator, that just returns the now-empty iterator. Bad.
General solution:
The most general solution is for the caller to pass the result of
list(it) instead of it. If the caller starts with only an iterator, that
is the only solution I know of.
Special solution:
If the caller starts with a non-iterable iterator source -- iterator
class or generator function (which can be regarded as defining a virtual
subclass of iterator class 'generator') -- it might be better to package
that source and required args as an iterable.
class gfic:  # untested as yet
    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds
        self.__name__ = func.__name__

    def __iter__(self):
        return self.func(*self.args, **self.kwds)
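For example, wrapping a (made-up) generator function lets the consumer
iterate more than once:

def countdown(n):   # hypothetical generator function
    while n:
        yield n
        n -= 1

src = gfic(countdown, 3)
print list(src)   # [3, 2, 1]
print list(src)   # [3, 2, 1] again -- each iter() call makes a fresh generator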
This is similar, I believe, to functools.partial except that the
wrapping is total rather than partial and the wrapped call is in
__iter__ instead of __call__.
Would this be a sensible addition to functools, say, or is the use case
too rare?
Terry Jan Reedy
In list/generator comprehensions, we currently have no way to access the
result of the expression, and have to write something like this:
[f(x) for x in l if f(x) > 0]
If f() is heavy or non-pure (i.e. has side effects), calling f() twice
might be undesirable.
Even if there is no function call, as in
[x + 1 for x in l if x + 1 > 0]
it looks ugly since we're repeating ourselves.
We can work around it like this:
[y for y in (f(x) for x in l) if y > 0]
but then we have an unnecessary nested loop comprehension.
I'm suggesting something like:
[f(x) for x in l if @ > 0]
where @ is the result of the listcomp's expression (i.e. f(x)).
Personally, I don't like the use of symbols like @, as Python is not
Perl. I'm still thinking of a better syntax/approach and am open to
suggestions.
What do you guys think?
Carl Johnson wrote:
> Actually, the votes have been counted at various times in the past.
> Until 2000, the secretary of state for Florida was contracted to count
> the votes, but following a Supreme Court ruling, that was
> discontinued, and new electronic system from Diebold was put in place.
> Then in 2004, there was a lot of grumbling that the Diebold machines
> were using Perl internally, so they outsourced the whole vote counting
> project to a team in Iran. Unfortunately, the team there hasn't been
> answering their emails lately… (And good luck to them, insha'allah!)
Latest news is that the Iranian people have demanded a
recount, but that turned out to be impossible because
the authorities kept the votes in a non-reiterable
container.
--
Greg