Real-world use cases for map's None fill-in feature?

Raymond Hettinger python at rcn.com
Mon Jan 9 13:23:36 CET 2006


rurpy at yahoo.com wrote:
> The other use case I had was a simple file diff.
> All I cared about was if the files were the same or
> not, and if not, what were the first differing lines.
> This was to compare output from a process that
> was supposed to match some saved reference
> data.  Because of error propagation, lines beyond
> the first difference were meaningless.
 . . .
> This same use case occurred again very recently
> when writing unit tests to compare output of a parser
> with known correct output during refactoring.

Analysis
--------

Both of these cases compare two data streams and report the first
mismatch, if any.  Data beyond the first mismatch is discarded.

The example code seeks to avoid managing two separate iterators and the
attendant code for trapping StopIteration and handling end-cases.  The
simplification is accomplished by generating a single fill element so
that the end-of-file condition becomes its own element capable of being
compared or reported back as a difference.  The EOF element serves as a
sentinel and allows a single line of comparison to handle all cases.
This is a normal and common use for sentinels.

The OP's code appends the sentinel using a proposed variant of zip()
which pads unequal iterables with a specified fill element:

    for x, y in izip_longest(file1, file2, fill='<EOF>'):
        if x != y:
            return 'Mismatch', x, y
    return 'Match'
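
For readers on a current Python, the same loop can be sketched with
itertools.zip_longest, which pads the shorter iterable and takes the
keyword fillvalue rather than the fill spelling shown above.  The function
name first_mismatch_padded and the eof parameter are illustrative
assumptions, not part of the original proposal:

```python
from itertools import zip_longest


def first_mismatch_padded(lines1, lines2, eof='<EOF>'):
    """Return ('Mismatch', x, y) for the first differing pair, else 'Match'."""
    # zip_longest pads whichever iterable runs out first with eof, so a
    # difference in length shows up as an ordinary unequal pair.
    for x, y in zip_longest(lines1, lines2, fillvalue=eof):
        if x != y:
            return 'Mismatch', x, y
    return 'Match'


# Illustrative data: the second stream has one extra line.
result = first_mismatch_padded(['a\n'], ['a\n', 'b\n'])
```

Here the shorter stream reports the fill element at the point where the
longer one still has data.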

Alternatively, the example can be written using existing itertools:

    for x, y in izip(chain(file1, ['<EOF>']), chain(file2, ['<EOF>'])):
        if x != y:
            return 'Mismatch', x, y
    return 'Match'

This is a typical use of chain() and not at all tricky.  The chain()
function was specifically designed for tacking one or more elements
onto the end of another iterable. It is ideal for appending sentinels.
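
As a self-contained sketch of the chain() approach, the fragment can be
wrapped in a function.  This version assumes Python 3, where the built-in
zip() is already lazy and stands in for izip(); the name first_mismatch and
the eof parameter are hypothetical, chosen only for illustration:

```python
from itertools import chain


def first_mismatch(lines1, lines2, eof='<EOF>'):
    """Return ('Mismatch', x, y) for the first differing pair, else 'Match'."""
    # Tack the sentinel onto the end of each stream so that running out of
    # data becomes an element that can be compared like any other.
    padded1 = chain(lines1, [eof])
    padded2 = chain(lines2, [eof])
    # zip() stops at the shorter stream, but by then the shorter stream's
    # sentinel has already been paired with a real line from the longer one.
    for x, y in zip(padded1, padded2):
        if x != y:
            return 'Mismatch', x, y
    return 'Match'


# Illustrative data standing in for two open files.
same = first_mismatch(['a\n', 'b\n'], ['a\n', 'b\n'])
shorter = first_mismatch(['a\n'], ['a\n', 'b\n'])
```

The sketch relies on the sentinel never occurring as a real element.  Lines
read from a file normally end in a newline, so the bare string '<EOF>' is
usually safe; a unique object() could be passed as eof when that cannot be
guaranteed.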


Raymond



