Algorithms Library - Asking for Pointers
jehugaleahsa at gmail.com
Fri Sep 2 18:49:03 EDT 2011
On Sep 2, 4:09 pm, Ian Kelly <ian.g.ke... at gmail.com> wrote:
> On Fri, Sep 2, 2011 at 10:59 AM, Travis Parks <jehugalea... at gmail.com> wrote:
> > Hello:
> > I am working on an algorithms library. It provides LINQ like
> > functionality to Python iterators. Eventually, I plan on having
> > feaures that work against sequences and mappings.
> > I have the code up athttp://code.google.com/p/py-compass.
> > This is my first project in Python, so I'd like some feedback. I want
> > to know if I am following conventions (overall style and quality of
> > code).
> Sure, here are my comments.
> In the "forever" and "__forever" functions, your use of the term
> "generator" is confusing. "__forever" is a generator function,
> because it has a yield statement. Its argument, called "generator",
> appears to be a callable, not a generator or even necessarily a
> generator function. Also, note that __forever(lambda: value) is
> functionally equivalent to the more efficient itertools.repeat(value).
> The staticmethod __next(iterator) accesses the class it is defined in,
> which suggests that it might be better written as a classmethod
> __next(cls, iterator).
> Each of the LINQ-style methods is divided into two parts: the public
> method that contains the docstring and some argument checks, and a
> private staticmethod that contains the implementation. I'm not
> certain what the purpose of that is. If it's to facilitate overriding
> the implementation in subclasses, then you need to change the names of
> the private methods to start with only one _ character so that they
> won't be mangled by the compiler.
> The comments before each method that only contain the name of the
> immediately following method are redundant.
> aggregate: the default aggregator is unintuitive to me. I would make
> it a required field and add a separate method called sum that calls
> aggregate with the operator.add aggregator.
> Also, the implementation doesn't look correct. Instead of passing in
> each item to the aggregator, you're passing in the number of items
> seen so far? The LINQ Aggregate method is basically reduce, so rather
> than reinvent the wheel I would suggest this:
> # MISSING is a unique object solely defined to represent missing arguments.
> # Unlike None we can safely assume it will never be passed as actual data.
> MISSING = object()
> def aggregate(self, aggregator, seed=MISSING):
> if seed is self.MISSING:
> return reduce(aggregator, self._iterable)
> return reduce(aggregator, self._iterable, seed)
> Note for compatibility that in Python 3 the reduce function has been
> demoted from a builtin to a member of the functools module.
> any: the name of this method could cause some confusion with the "any"
> builtin that does something a bit different.
> compare: the loop would more DRY as a for loop:
> def __compare(first, second, comparison):
> for firstval, secondval in itertools.izip_longest(first, second,
> if firstval is self.MISSING:
> return -1
> elif secondval is self.MISSING:
> return 1
> result = comparison(firstval, secondval)
> if result != 0:
> return result
> return 0
> concatenate: again, no need to reinvent the wheel. This should be
> more efficient:
> def concatenate(self, other):
> return extend(itertools.chain(self.__iterable, other))
> equals: could be just "return self.compare(other, comparison) == 0"
> __last: the loop could be a for loop:
> # assume we're looking at the last item and try moving to the next
> item = result.Value
> for item in iterator: pass
> return item
> lastOrDefault: there's a lot of repeated logic here. This could just be:
> def lastOrDefault(self, default=None):
> return self.last()
> except ValueError:
> return default
> map / forEach: .NET has to separate these into separate methods due to
> static typing. It seems a bit silly to have both of them in Python.
> Also, map would be more efficient as "return itertools.imap(mapper,
> max / min: it would be more efficient to use the builtin:
> def max(self, key):
> return max(self.__iterable, key=key)
> If somebody really needs to pass a comparison function instead of a
> key function, they can use functools.cmp_to_key.
> randomSamples: a more canonical way to pass the RNG would be to pass
> an instance of random.Random rather than an arbitrary function. Then
> to get a random integer you can just call generator.randrange(total).
> Note that for the default you can just use the random module itself,
> which provides default implementations of all the Random methods.
> Also, for loops:
> def __randomSamples(iterable, sampleCount, generator):
> samples = 
> iterator = iter(iterable)
> # fill up the samples with the items from the beginning of the iterable
> for i, value in zip(xrange(sampleCount), iterator):
> # replace items if the generated number is less than the total
> total = len(samples)
> for value in iterator:
> total += 1
> index = generator.randrange(total)
> if index < len(samples):
> samples[index] = result
> return samples
> __reverse: you could just "return reversed(list(iterable))"
> def __rotateLeft(iterable, shift):
> iterator = iter(iterable)
> front = list(itertools.islice(iterator, shift))
> return itertools.chain(iterator, front)
> skipWhile: suggest using itertools.dropwhile
> take: suggest using itertools.islice
> takeWhile: suggest using itertools.takewhile
> __toList: "return list(iterable)"
> __toSet: "return set(iterable)"
> __toTuple: "return tuple(iterable)". Note that as currently written
> this actually returns a generator, not a tuple.
> __where: suggest using itertools.ifilter
> __zip: suggest using a for loop over itertools.izip(first, second)
> Lookup: is inconsistent. The overridden __iter__ method returns an
> iterator over the values of the groups, but all the inherited methods
> are going to iterate over the keys. Probably you want to pass
> groups.values() to the superclass __init__ method.
Awesome tips. I appreciate the time you spent commenting on just about
every function. I really like your suggestions about using itertools
more, and for loops. I was feeling like some things were becoming way
I also noted the bugs you discovered. I will incorporate your
More information about the Python-list