[Edu-sig] Pythonic Math must include...

Mon Jan 19 06:21:40 CET 2009

On Sun, Jan 18, 2009 at 3:31 PM, Gregor Lingl <gregor.lingl at aon.at> wrote:

<< SNIP >>

>     Of course there are many more possibilities to realize this, for
> instance to use slice-assignment instead of sets. This would
> require some somewhat opaque index calculations, but be - as far
> as I rember - even faster. Particularly I suppose that there
> exist faster algorithms for generating primes, but I admittedly
> don't know them.
>
> ======================================================
>
> Summing up:
>
> Kirby                          1.71 s                  42.72 s
> Michel Paul                    1.58 s                  32.25 s
> Michel Paul modified::         0.14 s                   1.05 s
> Scott Daniels:                 0.07 s                   0.50 s
> G.L. minimal sieve:            0.014 s                  0.109 s
> G.L. sieve:                    0.012 s                  0.086 s
> G.L. sieve-based generator:    0.023 s                  0.113 s
> (performance depends on CHUNKSIZE)
>
> This exposes in my opinion an unsurmountable dilemma, namely
> that usually you cannot meet even those few criteria mentioned
> in the beginning in a single solution.
> So under the aspects you exposed at the beginning of this thread,
> "Pythonic Math", which title in some sense includes and/or addresses
> this dilemma, how would you prefer to weight those criteria?
>
> Regards
>
> Gregor

Hey Gregor, I'm really pleased you would do some careful analysis
regarding efficiency, one of the several criteria you mentioned.

I do think it's important to look at mathematics as an energetic, ergo
economic, activity, in the sense of consuming clock cycles while
burning through joules, calories (Roz Savage is one of my "action
figures" in my curriculum writing, burning joules as she rows the
Atlantic).  First Person Physics is likewise focused in athletics,
stress medicine.

An inefficient algorithm may well cost you, eat into your free time,
when you could be cutting yourself slack by putting a little more
thought into your technique.  Efficiency matters.  Sometimes we say
"the math is bad" because it's just too dang slow, even if correct
(years too late).  You want a reasoning process able to keep up with
the demands of the day.

So, again, agreeing with you, efficiency is a critical parameter
(interesting conversations with my colleague Trevor Blake on precisely
this point at FG the other day (FG of Fine Grind Productions,
undertaken with Jody Francis (cns.cfo)).

On the other hand, if we're really trying to be blindingly fast, why
are we using Python at all?  Answer:  we're not really trying to be as
fast as possible.

We're more like this:
http://www.istockphoto.com/file_thumbview_approve/422526/2/istockphoto_422526_slug_race.jpg

Speed of execution must not be what matters most.  We're avoiding
cluttering our code with type declarations (yay).  We're indulging in
clean syntax, no curly braces.  We're learning something
cross-platform, respected by industry.

Which is why we're *not* using VBA-per-Access or VB# (smile).

Trial by division is just one of those generic topics the pre-computer
mathematicians would dwell upon, in contrast to the sieve, not that
they're unrelated.

Given we're quite young, just learning about composite versus prime,
starting to get our feet wet in algebra, which is more abstract this
time around, thanks to operator overloading, thanks to "modulo
numbers", we have more of an interest in the timeline, the chronology.
 How have people tackled this problem, of getting prime numbers of
arbitrary size.  Why do we call them "probable primes" in that Jython
method?

By the way, here's a class, adapted from my trial-by-division
generator, that saves state in object variables, and so doesn't need
to be written with yield.

class Primes():

    def __init__(self):
        self.sofar = [-1, 2,3] # a running start, -1 proposed by J.H. Conway
        self.counter = 0
        self.maxprime = 3

    def __next__(self):
        if self.counter < 3:
            self.counter = self.counter + 1
            return self.sofar[self.counter - 1]
        candidate = self.maxprime + 2 # we'll increment from here on
        while True:
            for factor in self.sofar[1:]: # skip -1 (or don't use it
in the first place)
                if factor ** 2 > candidate:  # did we pass?
                    self.sofar.append(candidate) # keep the gold
                    self.maxprime = candidate
                    self.counter += 1
                    return candidate # woo hoo!
                if not candidate % factor: # oops, no remainder
                    break  # this is a composite
            candidate += 2 # next odd number please

In action, this looks a lot like a generator, except we're also using
numeric indexing to retrieve any prime so far, a new wrinkle:

>>> from trialbydivision import Primes
>>> g = Primes()
>>> next(g)
-1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
5
>>> next(g)
7
>>> next(g)
11
>>> next(g)
13
>>> next(g)
17
>>> next(g)
19
>>> next(g)
23
>>> g[3]
5
>>> g[10]
Traceback (most recent call last):
  File "<pyshell#91>", line 1, in <module>
    g[10]
  File "/home/kirby/trialbydivision.py", line 26, in __getitem__
    return self.sofar[index]
IndexError: list index out of range
>>> g[8]
19
>>> g[6]
13

That we're able to work through either method in but seconds or less,
up to what would have seemed pretty high numbers to be dealing with
concretely, just a few decades back (ignoring scientific notation,
talking about mantissas), would have seemed quasi-incredible to these
Egyptian scribes or Imperial desk jobbers or whatever they were.

How deeply do we want to go into the Python?  Must we do both the
generator *and* the class versions?

It's not a matter of "must" (we hope) so much as each student
following an IEP (an "individual learning plan"), freely exploring in
ways that make sense.

In your peer group, is it more important to develop this talent or
that, this skill or that?

There's a reason it's called "a gymnasium" in German I think:  the
different machines are marked, as to which muscle groups they
exercise.

If you're racing through in a crypto thread, anxious to get to the RSA
part, then maybe we don't waste your time on the intricacies of
metaclasses or deques.

On the other hand, if you're anxious to master Python the language,
because of a newspaper job where Django gets used, then don't waste
your time on RSA just now, focus on regular expressions.

Work back from your goals, anticipate job requirements, cram where you
think it'll pay off.  Encourage students to make these calculations
themselves.  It's not all up to the guidance counselors, or shouldn't
be.

Obviously, if it's a "group march" (the typical classroom, taking kids
through as a group), with teacher as tour guide, then it's good to
give some up front info about what's to be covered, with referrals to
other options if this doesn't sound like what you're into.

Corporate training same deal:  (e.g) in this class, we will be looking
at CREATE TABLE syntax in SQL for 30 mins, then moving to the "Django
way" of specifying tables.  Lunch.  Then a review, a few more running
examples.

We're used to this from OSCON, Pycon, whatever conferences:  a catalog
of offerings with blurbs as to what to expect.

Imagine these topic sections in a Gnu Math PDF:

SQL and supermarket management
SQL and library science
SQL and hotel reservations systems
...
(many more, all with SQL -- students filter through in different queues)

My Saturday Academy course in April won't have much SQL even though
this is a good age to learn some, is mostly about two ways of
rendering computer graphics:  either by real time animation (Sims,
games, VRML, VPython, Pygame) or by high lag time ray tracing
(POV-Ray).

We talk about Model, View, Controller (MVC) where our Model is
persistent data in an algebra, a View is a rendering (visualization),
and the Controller is Python.

In sum, I think the stress to put on the "time efficiency" is related
to the goals of the particular class (this doesn't sound controversial
to me ears, more like a truism).

In many math classes, where the goal is conceptual clarity and fluency
within specific namespaces (language games), we will consider the fast
response times of an interpreted bytecode language such as Python to
be miraculous in comparison to our old ways with paper and pencil.

We're like that old Xerox commercial with that monk copyist (that's
what they'd do all day:  transcribe), satisfied to finally have a fast
photocopier.

http://www.youtube.com/watch?v=_IgH2M02xek

In our intro to maths context, we will prefer short and succinct over
cluttered, even at a speed cost.

In another [13-16 college] course, we'll maybe learn MMIX and use that
instead, covering a lot of the same algorithms.  This literature is
already well developed, so no need to reinvent that wheel.

http://www-cs-faculty.stanford.edu/~uno/taocp.html

Note:  my trial-by-divison code is quirky cluttered with comments,
plus some would complain the -1 is cluttering things.  Others are glad
to see J.H. Conway's suggestion finally percolating outward in some
subculture's archive (Python Nation's in this case).

Kirby