2009/8/25 Peter Moody <peter@hda3.com>
On Mon, Aug 24, 2009 at 3:24 PM, DrKJam<drkjam@gmail.com> wrote:

[SNIP]
 
As it was left in early June, a PEP and design modifications were
requested before ipaddr would be considered for inclusion, but if this
is going to start *another* drawn-out ipaddr/netaddr thread, perhaps
the mailman admin(s) could set up a new SIG list for this.  I
personally hope that's not required; yours has been the only
dissenting email and I believe I respond to all of your major points
here.
 
The PEP process is the perfect forum for spending some time scrutinizing and discussing this topic in more detail. I will be raising further points in future when I've had time to fully evaluate both the PEP and the reference implementation of ipaddr.

At this stage, it is premature to assume the reference implementation provided along with the PEP is necessarily complete, only requiring a few bug fixes to get through the approval process.


> 1) Firstly, an offering of code.
>
> I'd like to bring to your attention an example implementation of an IP
> address library and interface for general discussion to compare and contrast
> with ipaddr 2.0.x :-
>
>     http://netaddr.googlecode.com/svn/branches/exp_0.7.x_ip_only
>
> It is based on netaddr 0.7.2 which I threw together earlier today.
>
> In essence, I've stripped out all of what could be considered non-essential
> code for a purely IP related library. This branch should be suitable for
> *theoretical* consideration of inclusion into some future version of the
> Python standard library (with a little work).
>
> It is a pure subset of netaddr release 0.7.2, *minus* the following :-
>
> - all IEEE layer-2 code
> - some fairly non-essential IANA IP data files and lookup code
> - IP globbing code (fairly niche)
>
> Aside: Just a small mention here that I listened carefully to Clay McClure's
> and others' criticisms of the previous incarnation of ipaddr. The 0.7.x
> series of netaddr breaks backward compatibility with previous netaddr
> releases and is an "answer" of sorts to that discussion and issue raised
> within the Python community. I hope you like what I've done with it.
>
> For the purposes of this discussion consider this branch the "Firefox to
> netaddr's Mozilla" or maybe just plain old "netaddr-ip-lite" ;-)
>
> 2) I refute the bold claim in the PEP that :-
>
>     "Finding a good library for performing those tasks can be somewhat more
> difficult."
>
> On the contrary, I wager that netaddr is now a perfectly decent alternative
> implementation to ipaddr, containing quite a few more features with little
> of the slowness for most common operations,

I think you mean refuse,

No, I meant refute.
 
b/c this certainly wasn't the case when I
started writing ipaddr. IPy existed, but it was far too heavyweight
and restrictive for what I needed (no disrespect to the author(s)
intended). I believe I've an email or two from you wherein you
indicate the same.

The comment made on IPy, to which I believe you are referring, was in response to you incorrectly comparing netaddr and IPy's implementation (assuming conditional logic was used within each method to support IP versioning). As already stated netaddr gets around this with a strategy design pattern approach (apologies to readers for using the "Gang of Four" acronym with regard to this).

IPy is heavyweight? How so? It is a mere 1200 lines including comments and deals with IPv4 and IPv6 addressing, much like ipaddr (albeit with fewer features). There are certainly issues you could raise against it (otherwise we wouldn't be here), but being heavyweight is not one of them.

I would actively encourage authors of said library (Victor Stinner is listed as the current maintainer) to get involved in the discussion of this PEP. It is their legacy that this work is picking up from.

Incidentally, I've noticed a few bug fix releases come through for IPy on PyPI in the last month so that project certainly seems alive and well.

I think the PEP currently doesn't provide appropriate weight to the efforts of others in this area.

FYI, here is a wiki entry I've been maintaining for a while now to this end :-

http://code.google.com/p/netaddr/wiki/YetAnotherPythonIPModule
 

> 2/3x faster in a lot of cases,
> not that we're counting. What a difference a year makes!
> I also rate IPy quite highly even if it is getting a little "long in the tooth".
> For a lot of users, IPy could also be considered a nice, stable API!

yes, netaddr has sped up quite a bit. It's still slower in many cases
as well. But again, who's timing?

I mention speed and timings as the PEP cites this as one of the benefits of considering the ipaddr reference implementation.
 

> By the same token I'm happy to note some convergence between the ipaddr and
> netaddr's various interfaces, particularly in light of discussions and
> arguments put forward by Clay McClure and others. A satisfactory compromise
> between the two however still seems a way off.
>
>
> 3) I also disagree with the PEP's claim that :-
>
>     "attempts to combine [IPv4 and IPv6] into one object would be like
> trying to force a round peg into a square hole (or vice versa)".
>
> netaddr (and for that matter IPy) cope with this perceived problem
> admirably.
>
> netaddr employs a simple variant of the GoF Strategy design pattern (with
> added Python sensibility). In the rare cases where ambiguity exists between
> IPv4 and IPv6 addresses a version parameter may be passed to the constructor
> of the IPAddress class to differentiate between them. Providing an IP
> address version to the constructor also provides a small performance
> improvement.

I'm not sure what point you're trying to make here. I didn't say it
was impossible, I implied that there are easier ways. Having used
code which crams both types into one object, I found it to be kludgy
and complicated so I designed something different.

Let me clarify. I am +1 on the specific item in the PEP regarding the need for separate and distinct IPAddress and IPNetwork class interfaces that are not conflated into a single interface. Clay McClure made this point very eloquently. I've done a good bit of experimentation on this since it was mentioned so I am fully aware of the pros and cons of each approach. A brief look at netaddr.ip.lite confirms that on this we both agree.

Where I disagree is on the need to have yet another split in the interface to support different IP versions (and a set of Factory functions to pull it all together again). Hey, another design pattern, also known as the "Factory Method" a.k.a. Virtual Constructor (or in this case a Python function).
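
For readers following along, the two designs being contrasted can be sketched roughly as follows. This is an illustration only, not code from netaddr or ipaddr: the names V4Strategy, V6Strategy, Address and make_address are hypothetical, and parsing is delegated to the modern standard library ipaddress module purely to keep the sketch short.

```python
import ipaddress


class V4Strategy:
    """Hypothetical strategy object holding everything specific to IPv4."""
    version = 4

    @staticmethod
    def parse(text):
        return int(ipaddress.IPv4Address(text))

    @staticmethod
    def render(value):
        return str(ipaddress.IPv4Address(value))


class V6Strategy:
    """Hypothetical strategy object holding everything specific to IPv6."""
    version = 6

    @staticmethod
    def parse(text):
        return int(ipaddress.IPv6Address(text))

    @staticmethod
    def render(value):
        return str(ipaddress.IPv6Address(value))


_STRATEGIES = {4: V4Strategy, 6: V6Strategy}


class Address:
    """Design 1: one public class; version differences live in a strategy."""

    def __init__(self, text, version=None):
        if version is None:
            # Guess from the text; passing a version explicitly skips the guess.
            version = 6 if ':' in text else 4
        self._strategy = _STRATEGIES[version]
        self.value = self._strategy.parse(text)

    @property
    def version(self):
        return self._strategy.version

    def __str__(self):
        return self._strategy.render(self.value)


def make_address(text):
    """Design 2: separate per-version classes, joined by a factory function."""
    try:
        return ipaddress.IPv4Address(text)
    except ipaddress.AddressValueError:
        return ipaddress.IPv6Address(text)
```

With the first design the caller only ever sees Address; with the second, the caller gets back one of two distinct types and the factory function is what hides the choice.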
 

and as a hardly impartial observer, I'll add that the explicit address
version you can pass to the IPAddress class, but not the IPNetwork
class, is odd. It actually seems to slow down object creation (~5%)
except in the case of an int arg (your default is about twice as
slow).

Ah, the issue of speed and timings again. Let's concentrate on getting the interface right before we spend too much effort on optimization. I'm quite happy to do a full speed comparison of major features in both libraries but I don't think that would be a worthwhile use of time just now.
 
Currently I'm ambivalent on whether an IP(vX)Network class constructor should accept a numerical (i.e. integer) value at all *unless* you explicitly state somehow that you want the network aspect to be inferred in some specific way. It isn't a case of just choosing /32 or /128 and having this as the only option. IP (v4) classful rules are still pervasive in the real world. A general case IP library available to the whole Python community should certainly take this into account.
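
As a rough illustration of what "classful rules" means for an integer argument, a default prefix could be inferred from the leading octet along these lines. The function name classful_prefix is hypothetical, and returning 32 for class D/E addresses is a choice made for this sketch, not a statement of what either library does.

```python
def classful_prefix(value):
    """Return a classful default prefix length for a 32-bit IPv4 integer."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError('not a 32-bit IPv4 address: %r' % (value,))
    first_octet = value >> 24
    if first_octet < 128:
        return 8    # class A: 0.0.0.0 - 127.255.255.255
    if first_octet < 192:
        return 16   # class B: 128.0.0.0 - 191.255.255.255
    if first_octet < 224:
        return 24   # class C: 192.0.0.0 - 223.255.255.255
    return 32       # class D/E: no classful network, treated as a host here
```

For example, the integer form of 192.168.0.1 (0xC0A80001) falls in class C and so gets /24, while /32 would be the answer only if the caller asked for "no inference at all".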


> IPv4 and IPv6 addresses can be used interchangeably throughout netaddr
> without causing issue during operations such as sorting, merging (known in
> the PEP as "address collapsing") or address exclusion.
>
> Don't try and do this with the current reference implementation of ipaddr :-
>
>>>> collapse_address_list([IPv4Address('1.1.1.1'), IPv6Address('::1.1.1.1')])
> [IPv4Network('1.1.1.1/32')]
>
> OUCH! Even if this isn't allowed (according to the documentation), it should
> raise an Exception rather than silently passing through.
>
> I actually raised this back in May on the ipaddr bug tracker but it hasn't
> received any attention so far :-
>
>     http://code.google.com/p/ipaddr-py/issues/detail?id=18
>
> Compare this with netaddr's behaviour :-
>
>>>> cidr_merge([IPAddress('1.1.1.1'), IPAddress('::1.1.1.1')])
> [IPNetwork('1.1.1.1/32'), IPNetwork('::1.1.1.1/128')]
>
> That's more like it.

OUCH! indeed. I'm not even sure that this is a nice corner case
feature, summarizing a single list of mixed ip type objects. with an
extra line or two, this can be done in ipaddr, though 'tis true that
we should now raise an exception and don't (it appears to be something
that was introduced recently).  If this is a feature for which
developers are clamoring, I'm all over it. Yours is the first email
I've heard mention it.

I may be the only one raising issues but that shouldn't mean they are any less relevant. There is a whole different feel and thrust behind both interfaces each with their own merits.
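
The "extra line or two" needed to summarize a mixed list can be sketched as below. This uses the modern standard library ipaddress module as a stand-in rather than ipaddr 2.0.x or netaddr, and the helper name collapse_mixed is hypothetical. It only handles address objects, not networks. As far as I know, ipaddress.collapse_addresses itself raises TypeError when given mixed versions, which is the behaviour asked for above.

```python
import ipaddress
from itertools import groupby


def collapse_mixed(addresses):
    """Collapse IPv4 and IPv6 addresses separately, IPv4 results first."""
    # Sort by version first so groupby sees each version as one run.
    ordered = sorted(addresses, key=lambda a: (a.version, int(a)))
    result = []
    for _, group in groupby(ordered, key=lambda a: a.version):
        # Each group holds a single IP version, so collapsing is safe.
        result.extend(ipaddress.collapse_addresses(group))
    return result
```

Grouping by version before collapsing keeps the two address families from ever being compared with each other, so no address can be silently dropped.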
 

> 4) It may just be me but the design of this latest incarnation of ipaddr
> seems somewhat complicated for so few lines of code. Compared with ipaddr,
> netaddr doesn't use or require multiple inheritance nor a seemingly
> convoluted inheritance hierarchy. There isn't a need for an IP() type
> 'multiplexer' function either (although I might be missing an important use
> case here). But, then again, this may just be my personal preference talking
> here. I prefer composition over inheritance in most cases.

this basically smacks of more petty attackery from the start. so I'll
reply with, "it's just you".

if you want to debate the merits of GOF strategy vs. multiple
inheritance, fine. the class inheritance in ipaddr is very clean, and
leaves very little code duplication. The classes are very clearly
named and laid out, and in general are much easier to follow than the
strategy method you've chosen for netaddr.

I realise you've done a lot of work on ipaddr and my observations are not intended as "petty attackery", as you put it. They were merely meant to question whether the shift in approach from earlier incarnations of ipaddr to this one is the correct path to be taking. I don't think that solely relying on "IS A" via multiple inheritance necessarily brings clarity to this code which, as stated in the PEP, is intended to be simple for others to understand and possibly use as a basis for their own extensions. More on this in future posts.

If you missed it I have diagrammed the class hierarchy and internal layout of each library here for consideration :-

http://code.google.com/p/netaddr/wiki/PEP3144
 
[SNIP]


> 5) The ipaddr library is also missing options for expanding various
> (exceedingly common) IP abbreviations.
>
>>>> from netaddr import IPNetwork
>
>>>> IPNetwork('10/8', True)
> IPNetwork('10.0.0.0/8')
>
> netaddr also handles classful IP address logic, still pervasive throughout
> modern IP stacks :-
>
>>>> IPNetwork('192.168.0.1', True)
> IPNetwork('192.168.0.1/24')
>
> Note that these options are disabled by default, to keep the speed of the
> IPNetwork constructor up for more normal cases.

these seem like corner case features for the sake of having features,
you don't even seem to put much stock in them. FWIW, I've never seen a
request for something similar. I may say '10 slash 8', but I mean,
'10.0.0.0/8'. I'm missing the utility here, but I'm open to reasoned
arguments.

I don't see why genuine features should be automatically dismissed as "corner cases".

If you need proof, here is an excerpt from RFC 1918 :-

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3. Private Address Space


   The Internet Assigned Numbers Authority (IANA) has reserved the
   following three blocks of the IP address space for private
   internets:

     10.0.0.0        -   10.255.255.255  (10/8 prefix)
     172.16.0.0      -   172.31.255.255  (172.16/12 prefix)
     192.168.0.0     -   192.168.255.255 (192.168/16 prefix)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    

I've also had specific requests from users about this feature, one just in the last week (which only required me to point them towards the available switch argument in the IPNetwork constructor to enable the required behaviour).

In netaddr 0.7.x I have chosen *not* to make this expansion the default case because it provides a not insignificant construction penalty for those that are not interested in it (as you have already noted and of which I am aware).

I believe strongly that this *is* an important option for a general use IP address library.
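
To show how small the expansion step is, here is a minimal sketch of turning the RFC 1918 style abbreviations into full networks. The function name expand_abbreviated is hypothetical, the sketch is not netaddr's implementation, and it leans on the modern standard library ipaddress module for the final validation.

```python
import ipaddress


def expand_abbreviated(text):
    """Expand an abbreviated IPv4 prefix such as '10/8' or '192.168/16'."""
    address, _, prefix = text.partition('/')
    octets = address.split('.')
    if not 1 <= len(octets) <= 4:
        raise ValueError('expected 1 to 4 octets in %r' % (text,))
    # Pad the missing trailing octets with zeros: '10' -> '10.0.0.0'.
    octets += ['0'] * (4 - len(octets))
    full = '.'.join(octets)
    if prefix:
        full += '/' + prefix
    # ip_network() validates the octets and rejects stray host bits.
    return ipaddress.ip_network(full)
```

The padding is pure string work, which is also why it costs something on every construction and is reasonable to keep behind an opt-in switch.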
 
[SNIP]


> There is a lot more to consider here than I can cram into this initial
> message, so I'll hand over to you all for some (hopefully) serious debate.

I'm always open to serious debate, and patches/bug reports (apologies
for missing your earlier issue. I'm not sure if you were aware, but
ipaddr was undergoing a major re-write at the time and I never got
around to following up).
 
I note your response to this on the ipaddr bug tracker today, thanks.

[SNIP]

> PS - Why does the References section in the PEP contain links to patches
> already applied to the ipaddr 2.0.x reference implementation?

There's A link to A patch (singular, both times), which has already
been applied. This link exists b/c, at the time I last updated the
PEP, the patch hadn't been applied as it was still being reviewed.

Thanks for the clarification.

[SNIP]

in general, that leads
to fewer bugs like the following:

>>> help(netaddr.IPNetwork.__init__)
Help on method __init__ in module netaddr.ip:

__init__(self, addr, implicit_prefix=False) unbound netaddr.ip.IPNetwork method
   Constructor.

   @param addr: an IPv4 or IPv6 address with optional CIDR prefix,
       netmask or hostmask. May be an IP address in representation
       (string) format, an integer or another IP object (copy
       construction).

   @param implicit_prefix: if True, the constructor uses classful IPv4
       rules to select a default prefix when one is not provided.
       If False it uses the length of the IP address version.
       (default: False).

>>> netaddr.IPNetwork(1)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "./netaddr/ip/__init__.py", line 632, in __init__
   prefix, suffix = addr.split('/')
AttributeError: 'int' object has no attribute 'split'

vs.

>>> import ipaddr
>>> ipaddr.IPNetwork(1)
IPv4Network('0.0.0.1/32')

Thanks for raising this on the netaddr bug tracker. I'll take a look at it.
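
The traceback above comes from calling a string method on an integer argument. One way to guard against that class of bug is to check the argument type before splitting, sketched here with a hypothetical helper split_network_arg. It is an illustration of the idea only, not the fix applied in netaddr.

```python
def split_network_arg(addr):
    """Split a constructor argument into (address, prefix-or-None)."""
    if isinstance(addr, int):
        # Integers carry no prefix; the caller decides on a default.
        return addr, None
    if isinstance(addr, str):
        address, _, prefix = addr.partition('/')
        return address, (prefix or None)
    raise TypeError('unsupported address type: %s' % type(addr).__name__)
```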

Did you have any other comments on the PEP?

Yes I do but they will be coming through in stages unfortunately as I get time to look at this further.
 
David Moss