[Python-Dev] address manipulation in the standard lib

Tue Jan 6 03:01:55 CET 2009

A merger sounds like a good way forward.

It shouldn't be as painful as it might sound initially and there should be
lots of room for some early big wins.

Contentious Issues
------------------

*** Separate IP and CIDR classes

The IP and CIDR object split in netaddr is going to require some further
discussion, mostly about which operations to keep on each class and which to
drop. More on this later on when I've had some time to think about it a bit
more.

*** Using the Strategy pattern

I'd like to see us use the GoF strategy pattern in a combined solution with
a single IP class for both v4 and v6, with separate strategy classes (like
netaddr), rather than two separate IPv4 and IPv6 classes returned by a
factory function (like ipaddr). Again this might require a bit of further
discussion.
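
To make the strategy idea concrete, here is a minimal sketch. The class and
method names (IPv4Strategy, IPv6Strategy, str_to_int, int_to_str) are made up
for illustration and are not taken from netaddr or ipaddr, and the IPv6 parser
is deliberately simplified.

```python
class IPv4Strategy:
    """Hypothetical strategy object holding everything specific to IPv4."""
    version = 4
    width = 32

    def str_to_int(self, text):
        octets = [int(part) for part in text.split('.')]
        if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
            raise ValueError('invalid IPv4 address: %r' % text)
        value = 0
        for octet in octets:
            value = (value << 8) | octet
        return value

    def int_to_str(self, value):
        return '.'.join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


class IPv6Strategy:
    """Hypothetical strategy object holding everything specific to IPv6."""
    version = 6
    width = 128

    def str_to_int(self, text):
        # Simplified parser: hex groups with at most one '::', no embedded
        # IPv4 notation and no per-group range checking.
        head, compressed, tail = text.partition('::')
        left = head.split(':') if head else []
        right = tail.split(':') if tail else []
        if compressed:
            groups = left + ['0'] * (8 - len(left) - len(right)) + right
        else:
            groups = left
        if len(groups) != 8:
            raise ValueError('invalid IPv6 address: %r' % text)
        value = 0
        for group in groups:
            value = (value << 16) | int(group, 16)
        return value

    def int_to_str(self, value):
        # Uncompressed output keeps the sketch short.
        return ':'.join('%x' % ((value >> shift) & 0xFFFF)
                        for shift in range(112, -1, -16))


class IP:
    """A single address class; the strategy does the family-specific work."""
    _strategies = (IPv4Strategy(), IPv6Strategy())

    def __init__(self, text):
        self.strategy = self._strategies[1 if ':' in text else 0]
        self.value = self.strategy.str_to_int(text)

    @property
    def version(self):
        return self.strategy.version

    def __int__(self):
        return self.value

    def __str__(self):
        return self.strategy.int_to_str(self.value)
```

The point of the design is that IP itself never branches on the address
family after construction; adding a new family would mean adding a strategy.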

Killer Features
---------------

Here's a list of (hopefully uncontroversial) features for a combined module

*** Maintain ipaddr speeds

Impressive stuff - I like it!

***  PEP-8 support

*** Drop MAC and EUI support

I'm happy to let the MAC (EUI-48) and EUI-64 support find a good home in a
separate module. Guido's sense of this being something separate is spot on
despite the apparent benefits of any code sharing. Where necessary, the
separate module can import whatever it needs from our combined module.

***  Pythonic behaviour of IP objects

IP address objects behave like standard Python types (ints, lists, tuples,
etc) dependent on context.

This is mainly achieved via copious amounts of operator overloading.

For example, instead of :-

>>> IP('192.168.0.0/24').exclude_addrs('192.168.0.15/32')
['192.168.0.0/29', '192.168.0.8/30', '192.168.0.12/31', '192.168.0.14/32']

you could just implement __sub__ (IP object subtraction) :-

>>> IP('192.168.0.0/24', format=str) - IP('192.168.0.15/32')
['192.168.0.0/29', '192.168.0.8/30', '192.168.0.12/31', '192.168.0.14/32']

This achieves the same result in a more Python-friendly manner.

Here's a list of operators I've so far found decent meanings for in netaddr
:-

__int__, __long__, __str__, __repr__, __hash__
__eq__, __ne__, __lt__, __le__, __gt__, __ge__
__iter__, __getitem__, __setitem__, __len__, __contains__
__add__, __sub__, __isub__, __iadd__
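
As a rough illustration of a few of these operators, here is a minimal
IPv4-only sketch. The Network class and its attributes are hypothetical and
not netaddr's implementation; subtraction is done by repeatedly halving the
larger block and keeping the half that does not contain the excluded one.

```python
def _dotted(value):
    """Render a 32-bit integer as a dotted quad."""
    return '.'.join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


class Network:
    """Hypothetical IPv4-only CIDR block used to illustrate operator overloading."""

    def __init__(self, cidr):
        address, _, prefix = cidr.partition('/')
        self.prefixlen = int(prefix) if prefix else 32
        value = 0
        for octet in address.split('.'):
            value = (value << 8) | int(octet)
        size = 1 << (32 - self.prefixlen)
        # Clear the host bits so `first` is the block's base address.
        self.first = value & ~(size - 1)
        self.last = self.first + size - 1

    def __str__(self):
        return '%s/%d' % (_dotted(self.first), self.prefixlen)

    def __eq__(self, other):
        return (self.first, self.prefixlen) == (other.first, other.prefixlen)

    def __hash__(self):
        return hash((self.first, self.prefixlen))

    def __len__(self):
        return self.last - self.first + 1

    def __contains__(self, other):
        return self.first <= other.first and other.last <= self.last

    def __iter__(self):
        for value in range(self.first, self.last + 1):
            yield Network(_dotted(value))

    def __sub__(self, other):
        """Return the blocks left after removing `other`, lowest address first."""
        if self in other:
            return []
        if other not in self:
            return [self]
        remaining, current = [], self
        while current != other:
            prefixlen = current.prefixlen + 1
            half = 1 << (32 - prefixlen)
            lower = Network('%s/%d' % (_dotted(current.first), prefixlen))
            upper = Network('%s/%d' % (_dotted(current.first + half), prefixlen))
            if other in lower:
                remaining.append(upper)
                current = lower
            else:
                remaining.append(lower)
                current = upper
        return sorted(remaining, key=lambda net: net.first)


print([str(net) for net in Network('192.168.0.0/28') - Network('192.168.0.15/32')])
# ['192.168.0.0/29', '192.168.0.8/30', '192.168.0.12/31', '192.168.0.14/32']
```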

***  Constants for address type identification

Identifying specific address types with a constant is essential. netaddr has
the module level constants AT_INET and AT_INET6 for IPv4 and IPv6
respectively. I'll be the first to agree that AT_* is a bit quirky. As we
are looking to something for the stdlib we should use something more, well,
standard such as AF_INET and AF_INET6 from the socket module.

Is AF_INET6 fairly widely available on most operating systems these days?
Not sure how socket constants have fared in Google's App Engine socket
module implementation for example. If not, we can always define some
specifically for the module itself.
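
One possible shape for that fallback, as a sketch only: reuse the socket
constants where they exist and define module-private stand-ins where they do
not. The stand-in values (4 and 6) and the family_of() helper are my own
invention for illustration.

```python
try:
    from socket import AF_INET, AF_INET6
except ImportError:
    # Assumed fallback for platforms whose socket module lacks these names;
    # the values only need to be distinct from each other.
    AF_INET, AF_INET6 = 4, 6


def family_of(text):
    """Hypothetical helper: guess the address family from the textual form."""
    return AF_INET6 if ':' in text else AF_INET
```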

***  Use the Python descriptor protocol to police IP object attribute
assignments

This makes IP object properties read/writable rather than just read-only.

I discovered descriptors on the Python mailing list a while back in the early
days of netaddr's development. They are excellent and open up a whole new
world of possibilities for keeping control of your object's internal state
once you allow users write access to your class properties.
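
A minimal sketch of the idea, assuming Python 3.6+ for __set_name__. The
PrefixLength descriptor and the IP class below are hypothetical and only show
how an assignment can be validated before it touches the object's state.

```python
class PrefixLength:
    """Hypothetical data descriptor that validates prefix-length assignments."""

    def __set_name__(self, owner, name):
        # Store the real value under a private attribute on the instance.
        self.storage = '_' + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError('prefix length must be an integer')
        if not 0 <= value <= instance.width:
            raise ValueError('prefix length must be between 0 and %d'
                             % instance.width)
        setattr(instance, self.storage, value)


class IP:
    prefixlen = PrefixLength()

    def __init__(self, value, prefixlen, width=32):
        self.width = width          # must be set before prefixlen is validated
        self.value = value
        self.prefixlen = prefixlen  # goes through PrefixLength.__set__
```

With this in place, ip.prefixlen = 16 succeeds while ip.prefixlen = 33 raises
ValueError on a 32-bit address and leaves the previous value untouched.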

***  Formatter attributes on IP objects to control return value
representations

Sometimes you just want the string or hex representation of an address
instead of grokking IP objects the whole time. This is a useful trick when
combined with the descriptor protocol above.
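
Roughly what I have in mind, as a sketch: the object stores integers
internally and passes every returned value through a callable held in an
attribute. The attribute name fmt and the Network class here are assumptions
for illustration.

```python
def dotted(value):
    """Render a 32-bit integer as a dotted quad."""
    return '.'.join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


class Network:
    """Hypothetical IPv4 block whose return values go through `fmt`."""

    def __init__(self, first, prefixlen, fmt=dotted):
        self.fmt = fmt
        self._first = first
        self._last = first + (1 << (32 - prefixlen)) - 1

    @property
    def first(self):
        return self.fmt(self._first)

    @property
    def last(self):
        return self.fmt(self._last)


net = Network(0xC0A80000, 24)
print(net.last)   # 192.168.0.255
net.fmt = hex
print(net.last)   # 0xc0a800ff
```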

***  Use iterators

I notice ipaddr doesn't currently use the 'yield' statement anywhere which
is a real shame. netaddr uses iterators everywhere and also defines an
nrange() function built as an xrange() work-a-like but for network addresses
instead of integer values (very similar).
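
For anyone who hasn't seen it, the general idea looks something like the
generator below. This is a stand-in written for this post, not netaddr's
actual nrange(); the name address_range and the exclusive stop (copied from
xrange) are assumptions.

```python
def _to_int(text):
    value = 0
    for octet in text.split('.'):
        value = (value << 8) | int(octet)
    return value


def _to_text(value):
    return '.'.join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def address_range(start, stop, step=1):
    """Lazily yield dotted-quad strings from start up to, but excluding, stop."""
    for value in range(_to_int(start), _to_int(stop), step):
        yield _to_text(value)


print(list(address_range('192.168.0.254', '192.168.1.2')))
# ['192.168.0.254', '192.168.0.255', '192.168.1.0', '192.168.1.1']
```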

***  Add support for IPv4 address abbreviations

Based on 'old school' IP classful networking rules. Still useful and worth
having.
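
A sketch of the expansion, assuming the prefix length is taken purely from
the class of the first octet (A, B or C) and that missing octets are padded
with zeros. The function name is hypothetical, and class D and E addresses
are simply rejected here to keep it short.

```python
def expand_abbreviation(abbrev):
    """Expand e.g. '10' or '192.168.1' to a CIDR string using classful rules."""
    octets = [int(part) for part in abbrev.split('.')]
    if not 1 <= len(octets) <= 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError('invalid IPv4 abbreviation: %r' % abbrev)
    first = octets[0]
    if first < 128:        # class A
        prefixlen = 8
    elif first < 192:      # class B
        prefixlen = 16
    elif first < 224:      # class C
        prefixlen = 24
    else:
        raise ValueError('no classful prefix for class D/E: %r' % abbrev)
    octets += [0] * (4 - len(octets))
    return '%d.%d.%d.%d/%d' % (octets[0], octets[1], octets[2], octets[3],
                               prefixlen)


print(expand_abbreviation('10'))         # 10.0.0.0/8
print(expand_abbreviation('172.16'))     # 172.16.0.0/16
print(expand_abbreviation('192.168.1'))  # 192.168.1.0/24
```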

***  Use slices on IP objects!

There's nothing quite like list slices on a network object ;-) I've got some
horrendous issues trying to get this going with Python n-bit integers for
IPv6 so I'd love to see this working correctly.
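
For what it's worth, one way this could be sketched in Python 3 is to
delegate indexing to a range object, which accepts arbitrarily large integers
and slices lazily. The Network class is hypothetical, and __len__ is left out
on purpose because len() overflows for blocks larger than sys.maxsize.

```python
class Network:
    """Hypothetical block of addresses supporting indexing and slicing."""

    def __init__(self, first, prefixlen, width):
        self.first = first
        self.size = 1 << (width - prefixlen)

    def __getitem__(self, index):
        values = range(self.first, self.first + self.size)
        # An integer index returns an int; a slice returns a lazy range.
        return values[index]


base = 0x20010DB8 << 96            # 2001:db8::
net = Network(base, 64, 128)       # an IPv6 /64
print(net[-1] - base == 2**64 - 1)  # True
print([value - base for value in net[1:4]])  # [1, 2, 3]
```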

***  Careful coding to avoid endianness bugs

I spent a decent chunk of development time early on doing endian tests on
all basic integer conversion operations. Any combined solution must be rock
solid and robust in this area. It's all too easy to make a naive assumption
and get this wrong. OK, so it's a pet hate of mine! I'm looking forward to Python
stdlib buildbot support in this area ;-)
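
The kind of thing I mean, as a small sketch: always state the byte order
explicitly when converting between integers and packed addresses instead of
relying on the host's native order. The helper names are made up.

```python
import struct


def ipv4_to_bytes(value):
    """Pack a 32-bit address in network (big-endian) byte order."""
    return struct.pack('!I', value)


def ipv4_from_bytes(data):
    """Unpack four bytes in network byte order into an integer."""
    return struct.unpack('!I', data)[0]


packed = ipv4_to_bytes(0xC0A80001)       # 192.168.0.1
print(packed == b'\xc0\xa8\x00\x01')     # True on any host
print(ipv4_from_bytes(packed) == 0xC0A80001)  # True
```

Using '!' (or '>') in the format string gives the same bytes on big-endian
and little-endian machines, which a bare 'I' does not.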

***  Display of IP objects as human-readable binary strings

Sometimes it's just nice to see the bit patterns!
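
Something along these lines, as a sketch (the function name and its
parameters are hypothetical): zero-pad the integer to the address width and
group the bits the same way the textual form groups its digits.

```python
def bits(value, width=32, group=8, sep='.'):
    """Return the address as a zero-padded binary string split into groups."""
    raw = format(value, '0%db' % width)
    return sep.join(raw[i:i + group] for i in range(0, width, group))


print(bits(0xC0A80001))
# 11000000.10101000.00000000.00000001
```

For IPv6 the same function can be called with width=128, group=16, sep=':'.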

***  Python 'set' type operations for collections of IP objects

Intersection, union etc between network objects and groups of network
objects. More nice to have than essential but would be interesting to see
working. I've spent time thinking about it but haven't really come up with a
good implementation (yet). Hopefully with a lot of talented people
involved we can get something going here.
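
As a starting point for discussion only, one possible approach is to treat
each network as an inclusive (first, last) pair of integers and do the set
arithmetic on sorted ranges. This is a sketch under that assumption, not
something netaddr or ipaddr does, and it leaves out converting the result
back to CIDR blocks.

```python
def union(ranges):
    """Merge overlapping or adjacent inclusive (first, last) integer ranges."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def intersection(left, right):
    """Return the ranges covered by both collections."""
    left, right = union(left), union(right)
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        first = max(left[i][0], right[j][0])
        last = min(left[i][1], right[j][1])
        if first <= last:
            result.append((first, last))
        # Advance whichever range ends first.
        if left[i][1] < right[j][1]:
            i += 1
        else:
            j += 1
    return result


print(union([(0, 7), (8, 15), (32, 47)]))             # [(0, 15), (32, 47)]
print(intersection([(0, 15), (32, 47)], [(8, 39)]))   # [(8, 15), (32, 39)]
```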

*** Add support for epydoc in docstrings

Is this post long enough to be a candidate for a PEP?!