Mapping into two lists

Alex Martelli aleax at aleax.it
Mon Sep 3 06:42:35 EDT 2001


"Marcus Stojek" <stojek at part-gmbh.de> wrote in message
news:MPG.15fd71d44b57b4a989689 at news.easynews.net...
> Hi,
>
> I have a sorted list of int.  To generate an input file for
> a special software I have to split this list into two. Instead of
> explaining I'll give an example:
>
> List: [ 1, 3,4,5,6,7, 9, 11,12,13,14,15, 17]
>
> # start and end value of continuous list segment:
> Output list1: [3,7, 11,15]
> # single values:
> Output list2: [1,9,17]

So you want to identify the beginning and endpoints of
maximal subsequences (of length 2 or more?) such that
the subsequences are "compact" (hold every intermediate
integer value), and the "outliers" (items of the list
not in such subsequences), right?  And we can assume
there are no duplicates in the sorted input list, and
that the sorted input list itself is non-empty?  It's
hard to gather PRECISE specs from such a succint way
of presenting things, but, let's assume I guessed:-).

Assuming nested scopes:

def splitter(sortedlist):
    # to start with, no subseq's, no outliers
    subseqs = []
    outlier = []
    # prime the first subsequence
    currlas = current = sortedlist[0]
    def putsubseq(current, currlas):
        if current == currlas:
            outlier.append(current)
        else:
            subseqs.extend([current,currlast])
    for item in sortedlist[1:]:
        if item == currlas+1:
            # current subsequence is continuing
            currlas = item
        else:
            # current subsequence is finished
            putsubseq(current, currlas)
            # start new subsequence
            currlas = current = item
    # trailing subsequence is finished
    putsubseq(current, currlas)
    return subseqs, outlier

Without nested scopes, you just have to burden the
definition of putsubseq with the needed nested-state:
    def putsubseq(current,currlas,outlier=outlier,subseqs=subseqs):

You can also choose to play all sort of tricks to
avoid needing putsubseq as a nested function, e.g
the classic Sentinel idiom:

def splitter(sortedlist):
    currlas = current = sortedlist[0]
    subseqs = []
    outlier = []
    for item in sortedlist[1:]+[None]:
        if item == currlas+1:
            # current subsequence is continuing
            currlas = item
        else:
            # current subsequence is finished
            if current == currlas:
                outlier.append(current)
            else:
                subseqs.extend([current,currlast])
            # start new subsequence
            currlas = current = item
    return subseqs, outlier

Here, the None we append on the fly in the for
statement plays the role of a sentinel: it will
never meet the "item == currlas+1" test and thus
for the trailing subsequence it will ensure the
else-branch is followed, so at the end of the
for loop we're entirely done (you could assert
currlas == current == None before the return).


> I do have a piece of code with a lot of ifs and whiles and
> len(list)-1 and stuff. I bet this can be done much easier with some
> sophisticated mapping.

Sorry, no sophisticated mapping comes to mind, I can
just hope the above code (just two if's, no while,
and the obvious for-loop, no len(list) etc etc) is
an improvement.  Note: I haven't *tested* it!!!  So,
please do test it yourself if interested.


Alex






More information about the Python-list mailing list