Using the heapq module, I found that it's not easy to use. Or, at
least, that it could be more straightforward to use.
My main issue is that, for correct usage:
- the user has to provide an external list, which must not be used for
anything else so the invariant isn't broken
- to alter the heap queue, a bunch of functions must be used, always
passing that external list
I think that turning the "external list" into an internal attribute,
and the "bunch of functions" into methods, would make the module
easier to use, safer, and more object oriented.
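To make it concrete, this is a minimal sketch of what correct usage
looks like today, with the caller owning the list and passing it to
every call:

import heapq

heap = []                     # the external list the caller must manage
heapq.heappush(heap, 5)
heapq.heappush(heap, 1)
heapq.heappush(heap, 3)
print heapq.heappop(heap)     # 1 -- the same list goes to every function call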
So, I basically coded it. What do you think about including this class
in the heapq module?
"""
import heapq

class Heap(object):
    '''Maintains a heapified list, always conserving the invariant.

    Heaps are arrays for which heap[k] <= heap[2*k+1] and
    heap[k] <= heap[2*k+2] for all k, counting elements from zero.
    '''

    def __init__(self, iterable=None):
        '''Initializes the heap from any iterable.

        >>> Heap([1, 2])
        Heap([1, 2])
        >>> Heap([])
        Heap([])
        >>> Heap()
        Heap([])
        >>> Heap((1, 2, 3))
        Heap([1, 2, 3])
        >>> Heap(x**2 for x in range(3))
        Heap([0, 1, 4])
        '''
        if iterable is None:
            self._queue = []
        else:
            self._queue = list(iterable)
            heapq.heapify(self._queue)

    def __repr__(self):
        return "Heap(%s)" % self._queue

    def push(self, item):
        '''Push the item to the heap queue.

        >>> h = Heap()
        >>> h.push(5)
        >>> h.push(4)
        >>> h.push(3)
        >>> h.push(2)
        >>> h.push(1)
        >>> h.push(0)
        >>> h
        Heap([0, 2, 1, 5, 3, 4])
        '''
        heapq.heappush(self._queue, item)

    def pop(self):
        '''Pop one item from the heap queue.

        >>> h = Heap([0, 2, 1, 5, 3, 4])
        >>> h.pop()
        0
        >>> h.pop()
        1
        >>> h.pop()
        2
        >>> h.pop()
        3
        >>> h.pop()
        4
        >>> h.pop()
        5
        >>> h
        Heap([])
        >>> h.pop()
        Traceback (most recent call last):
            ...
        IndexError: index out of range
        '''
        return heapq.heappop(self._queue)

    def pushpop(self, item):
        '''Push the item and pop one from the heap queue.

        Note that this method is more efficient than calling both
        push() and pop() separately.

        >>> h = Heap()
        >>> h.pushpop(7)
        7
        >>> h.push(5)
        >>> h.push(4)
        >>> h.push(3)
        >>> h.pushpop(7)
        3
        >>> h.pushpop(7)
        4
        >>> h
        Heap([5, 7, 7])
        '''
        return heapq.heappushpop(self._queue, item)

    def replace(self, item):
        '''Pop one item and push the received one into the heap queue.

        Note that this method is more efficient than calling both
        pop() and push() separately.

        >>> h = Heap()
        >>> h.replace(3)
        Traceback (most recent call last):
            ...
        IndexError: index out of range
        >>> h.push(7)
        >>> h.replace(2)
        7
        >>> h
        Heap([2])
        >>> h.push(3)
        >>> h
        Heap([2, 3])
        >>> h.replace(1)
        2
        >>> h.replace(9)
        1
        >>> h
        Heap([3, 9])
        '''
        return heapq.heapreplace(self._queue, item)
"""
Regards,
--
. Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
Hi,
I've created a Python Bytecode Verifier in Python. I'm not a Python
guru so I borrowed coding patterns from C/C++. I also created this
with C portability in mind. The only reason I used Python was to
experiment with Python, and because it was easier to morph the code
during development.
If this program were ported to C it would only need 8 bytes per opcode
(some additional storage to track blocks) and a single pass. I haven't
found backward jumps to previously unused code in any compiled Python
code but it can easily be supported. In that case some more partial
passes are required.
I was also able to successfully verify all Python files in the
Python-2.5.2 source package.
The verification algorithm should be quite complete, but I may have
missed some limitations of the interpreter that could be checked by
the verifier as well.
The ability to create this verifier proved that although Python
bytecode is designed for a dynamically typed interpreter, it is still
easily verifiable. I am willing to port this code to C, but only if
there is a chance of it being included in Python.
I believe that Python in general would benefit from having the ability
to safely load .pyc files and create code objects on the fly.
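As things stand, a .pyc is loaded without any checks; whatever code
object was marshalled is simply trusted. A minimal sketch of today's
situation (the file name is just an example):

import marshal

f = open("module.pyc", "rb")   # "module.pyc" is a hypothetical file
f.read(8)                      # skip the 4-byte magic number and 4-byte timestamp
code = marshal.load(f)         # the code object is loaded with no validation
f.close()

exec code                      # runs whatever bytecode was in the file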
Both Java and .NET have the ability to safely load compiled byte code.
The .NET Framework, just like Python, also has the ability to create
and execute new code at run-time.
You may feel that enabling closed-source applications and/or creating
a multi-language runtime would hurt Python, but both of these have
contributed to the success of Java (both the language and the
runtime).
Kornél
Hello,
A very small change: what about making the path argument optional for
os.listdir, and using the current working directory if it is not
provided?
listdir(...)
    listdir(path=None) -> list_of_strings

    Return a list containing the names of the entries in the directory.

        path: path of directory to list. If path is None,
              os.getcwd() is used.

    The list is in arbitrary order. It does not include the special
    entries '.' and '..' even if they are present in the directory.
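A pure-Python sketch of the proposed behaviour, just to make it
concrete (the real change would of course live in the os module
itself):

import os

def listdir(path=None):
    # Proposed default: fall back to the current working directory.
    if path is None:
        path = os.getcwd()
    return os.listdir(path)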
Cheers
Tarek
--
Tarek Ziadé | http://ziade.org
Hello there.
Please see my feature request: http://bugs.python.org/issue6326
The idea is to speed up the swapping of list elements so that
a.swap(b)
is equivalent to
a[:], b[:] = b[:], a[:]
but without all the overhead of creating slices, assigning them and so forth.
In particular, it can help reduce the cost of creating instances of list subclasses, from raw lists:
class mylist(list):
    def first(self):
        return self[0]
m = mylist(source_list)
This certainly creates m, but does so by copying source_list. If all we want to do is turn source_list into a mylist instance, it is much quicker to:
m = mylist()
m.swap(source_list)
See the above issue for initial comments, especially concerns on how this can bypass all kinds of insertion semantics of list subclasses.
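To illustrate that concern: slice assignment already bypasses such
hooks today, and the proposed swap() would behave the same way
(LoggingList is just a made-up example):

class LoggingList(list):
    # Hypothetical subclass that tries to track every insertion.
    def append(self, item):
        print "append:", item
        list.append(self, item)

source = [1, 2, 3]
logged = LoggingList()
logged[:] = source   # like the proposed swap(), this never calls append()
print logged         # prints [1, 2, 3], but nothing was logged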
Cheers,
Kristján
Hello! This is my first time posting to a python list, so be kind to
the newbie :-)
From issue 6372:
"I propose that all formal parameters that really act as
options/switches be made keyword-only. Examples of switches are all
flags, timeouts, 'verbose' bools, maximums and minimums, etc.
This stresses the difference between needed input for a function and an
argument that changes/extends the behavior. Besides, the code would be
more readable, because instead of having some cryptic function call like
register('Pablo Torres', 2, 4, 5, False) you would have the prettier
register('Pablo Torres', hour=2, min=4, sec=5, had_reservation=False).
The implementation would just require putting a star '*' before all
options, according to PEP 3102.
If needed, I can rewrite this as a PEP."
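For reference, this is roughly what the proposed style looks like with
the PEP 3102 (Python 3) syntax; register() and its parameters are taken
from the example above:

def register(name, *, hour=0, min=0, sec=0, had_reservation=False):
    ...

register('Pablo Torres', hour=2, min=4, sec=5, had_reservation=False)  # OK
register('Pablo Torres', 2, 4, 5, False)  # TypeError: too many positional arguments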
Since this would break pretty much all the code out there, I suspect
that we should come up with a warning mechanism first. Maybe leave the
warning there for a couple of releases before making it definitive.
What do you guys think?
[Please keep the discussion in the list. Also, please avoid top posting (corrected below)]
> On 6/20/09, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar>
> wrote:
> > On Thu, 18 Jun 2009 11:04:34 -0300, zhong nanhai
> > <higerinbeijing(a)gmail.com> wrote:
> >
> >> So is it a good idea to enhance filecmp to support
> >> universal-newline mode? If so, we could compare files from
> >> different operating systems, and if they have the same content,
> >> filecmp.cmp would return True.
> >
> > With aid from itertools.izip_longest, it's a one-line recipe:
> >
> > py> print repr(open("one.txt","rb").read())
> > 'hello\nworld!\nlast line\n'
> > py> print repr(open("two.txt","rb").read())
> > 'hello\r\nworld!\r\nlast line\r\n'
> > py> import filecmp
> > py> filecmp.cmp("one.txt", "two.txt", False)
> > False
> > py> from itertools import izip_longest
> > py> f1 = open("one.txt", "rU")
> > py> f2 = open("two.txt", "rU")
> > py>
> > py> print all(line1==line2 for line1,line2 in izip_longest(f1,f2))
> > True
> >
> > Currently filecmp considers both files as binary, not text; if they
> > differ in size they're considered different and the contents are not
> > even read.
> >
> > If you want a generic text-mode file comparison, there are other
> > factors to consider in addition to line endings: character encoding,
> > BOM, character case, whitespace... All of those may be considered
> > "irrelevant differences" by some people. A generic text file
> > comparison should take all of them into account.
--- On Fri, 19-Jun-09, zhong nanhai <higerinbeijing(a)gmail.com> wrote:
> Thanks for your suggestion.
> You are right, there are a lot of things to consider if we want to
> make filecmp support text comparison. But I think we could just do a
> small feature enhancement, e.g. only the universal-newline mode. I am
> not clear on how filecmp implements the file comparison, so could you
> tell me more about that?
> And if, in the source of filecmp, it compares files just by reading
> them line by line, then we could do some further comparisons when
> encountering the newline flag (meaning the end of a line).
You can see it yourself, in lib/filecmp.py in your Python installation.
It does a binary comparison only -- and it does not read anything if file sizes differ. A text comparison should use a different algorithm; the code above already ignores end-of-line differences and breaks as soon as two lines differ. One could enhance it to add support for other options as mentioned earlier.
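For example, the recipe above could be wrapped into a small helper
along these lines (cmp_text is just an illustrative name, not an
existing filecmp function):

from itertools import izip_longest

def cmp_text(name1, name2):
    # Compare two text files ignoring end-of-line differences, reading
    # both in universal-newline mode and stopping at the first pair of
    # lines that differ.
    f1 = open(name1, "rU")
    f2 = open(name2, "rU")
    try:
        return all(line1 == line2
                   for line1, line2 in izip_longest(f1, f2))
    finally:
        f1.close()
        f2.close()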
--
Gabriel Genellina
Hello.
Reading the discussion on python-ideas about "Accessing the result of
comprehension's expression from the conditional", I've come to the
idea of where clauses, similar to Haskell's.
This solves the problem of recalculating a value multiple times. For
example, in the following expression:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
the value f(x) is calculated three times -- a problem if the function f
takes a long time to compute its value or has side effects. If we had a
where clause, we could rewrite the expression above as:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
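For comparison, the closest workaround available today is an extra
one-element for clause, which evaluates f(x) only once but buries the
binding in the middle of the expression:

[(y, y) for x in some_iterable for y in (f(x),) if y < 2]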
I think it is really useful. We can also expand this idea to lambdas,
or maybe even to introducing arbitrary scoping blocks of code.
Other thoughts:
- Can we use where clauses in lambda definitions to allow some kind
of multi-line lambdas?
Background:
Most functions that take an iterable as parameter iterate just once.
Users of the function can pass any iterable, including iterators.
The function will call iter(input) and go.
If the user wants to iterate thru a virtual collection defined by an
instance of an iterator class (built-in or user-defined) or generator
function, the user must call the gf/ic with the appropriate arguments
and pass the result.
Problem:
Some functions iterate thru the input more than once. If the input is
not an iterator, there is no problem: call iter(input) again. If the
input is an iterator, that just returns the now-empty iterator. Bad.
General solution:
The most general solution is for the caller to pass the result of
list(it) instead of it. If the caller starts with only an iterator, that
is the only solution I know of.
Special solution:
If the caller starts with a non-iterable iterator source -- iterator
class or generator function (which can be regarded as defining a virtual
subclass of iterator class 'generator') -- it might be better to package
that source and required args as an iterable.
class gfic:  # untested as yet
    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds
        self.__name__ = func.__name__

    def __iter__(self):
        return self.func(*self.args, **self.kwds)
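For example, wrapping a (made-up) generator function lets the consumer
iterate more than once:

def countdown(n):   # hypothetical generator function
    while n:
        yield n
        n -= 1

src = gfic(countdown, 3)
print list(src)   # [3, 2, 1]
print list(src)   # [3, 2, 1] again -- each iter() call makes a fresh generator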
This is similar, I believe, to functools.partial except that the
wrapping is total rather than partial and the wrapped call is in
__iter__ instead of __call__.
Would this be a sensible addition to functools, say, or is the use case
too rare?
Terry Jan Reedy
In list/generator comprehensions, we currently have no way to access the
result of the expression, and have to write something like this:
[f(x) for x in l if f(x) > 0]
If f() is heavy or non-pure (i.e. has side effects), calling f() twice
might be undesirable.
Even if there is no function call, as in
[x + 1 for x in l if x + 1 > 0]
it looks ugly since we're repeating ourselves.
We can work around it like this:
[y for y in (f(x) for x in l) if y > 0]
but then we have an unnecessary nested loop comprehension.
I'm suggesting something like:
[f(x) for x in l if @ > 0]
where @ is the result of the listcomp's expression (i.e. f(x)).
Personally, I don't like the use of symbols like @, as Python is not
Perl. I'm still thinking of a better syntax/approach and am open to
suggestions.
What do you guys think?
Carl Johnson wrote:
> Actually, the votes have been counted at various times in the past.
> Until 2000, the secretary of state for Florida was contracted to count
> the votes, but following a Supreme Court ruling, that was
> discontinued, and new electronic system from Diebold was put in place.
> Then in 2004, there was a lot of grumbling that the Diebold machines
> were using Perl internally, so they outsourced the whole vote counting
> project to a team in Iran. Unfortunately, the team there hasn't been
> answering their emails lately… (And good luck to them, insha'allah!)
Latest news is that the Iranian people have demanded a
recount, but that turned out to be impossible because
the authorities kept the votes in a non-reiterable
container.
--
Greg