[Python-Dev] PEP 3144: IP Address Manipulation Library for the Python Standard Library
DrKJam
drkjam at gmail.com
Thu Aug 27 15:07:59 CEST 2009
2009/8/25 Peter Moody <peter at hda3.com>
> On Mon, Aug 24, 2009 at 3:24 PM, DrKJam<drkjam at gmail.com> wrote:
>
[SNIP]
> As it was left in early June, a PEP and design modifications were
> requested before ipaddr would be considered for inclusion, but if this
> is going to start *another* drawn out ipaddr/netaddr thread, perhaps
> the mailman admin(s) could set up a new SIG list for this. I
> personally hope that's not required; yours has been the only
> dissenting email and I believe I respond to all of your major points
> here.
The PEP process is the perfect forum for spending some time scrutinizing and
discussing this topic in more detail. I will be raising further points in
future when I've had time to fully evaluate both the PEP and the reference
implementation of ipaddr.
At this stage, it is premature to assume the reference implementation
provided along with the PEP is necessarily complete, only requiring a few
bug fixes to get through the approval process.
> > 1) Firstly, an offering of code.
> >
> > I'd like to bring to your attention an example implementation of an IP
> > address library and interface for general discussion to compare and
> > contrast with ipaddr 2.0.x :-
> >
> > http://netaddr.googlecode.com/svn/branches/exp_0.7.x_ip_only
> >
> > It is based on netaddr 0.7.2 which I threw together earlier today.
> >
> > In essence, I've stripped out all of what could be considered
> > non-essential code for a purely IP related library. This branch should
> > be suitable for *theoretical* consideration of inclusion into some
> > future version of the Python standard library (with a little work).
> >
> > It is a pure subset of netaddr release 0.7.2, *minus* the following :-
> >
> > - all IEEE layer-2 code
> > - some fairly non-essential IANA IP data files and lookup code
> > - IP globbing code (fairly niche)
> >
> > Aside: Just a small mention here that I listened carefully to Clay
> > McClure's and others' criticisms of the previous incarnation of ipaddr.
> > The 0.7.x series of netaddr breaks backward compatibility with previous
> > netaddr releases and is an "answer" of sorts to that discussion and
> > issue raised within the Python community. I hope you like what I've
> > done with it.
> >
> > For the purposes of this discussion consider this branch the "Firefox to
> > netaddr's Mozilla" or maybe just plain old "netaddr-ip-lite" ;-)
> >
> > 2) I refute the bold claim in the PEP that :-
> >
> > "Finding a good library for performing those tasks can be somewhat
> > more difficult."
> >
> > On the contrary, I wager that netaddr is now a perfectly decent
> > alternative implementation to ipaddr, containing quite a few more
> > features with little of the slowness for most common operations,
>
> I think you mean refuse,
No, I meant refute.
> b/c this certainly wasn't the case when I
> started writing ipaddr. IPy existed, but it was far too heavyweight
> and restrictive for what I needed (no disrespect to the author(s)
> intended). I believe I've an email or two from you wherein you
> indicate the same.
>
The comment made on IPy, to which I believe you are referring, was in
response to you incorrectly comparing netaddr and IPy's implementation
(assuming conditional logic was used within each method to support IP
versioning). As already stated netaddr gets around this with a strategy
design pattern approach (apologies to readers for using the "Gang of Four"
acronym with regard to this).
IPy is heavyweight? How so? It is a mere 1200 lines including comments and
deals with IPv4 and IPv6 addressing, much like ipaddr (albeit with fewer
features). There are certainly issues you could raise against it (otherwise
we wouldn't be here), but being heavyweight is not one of them.
I would actively encourage authors of said library (Victor Stinner is listed
as the current maintainer) to get involved in the discussion of this PEP. It
is their legacy that this work is picking up from.
Incidentally, I've noticed a few bug fix releases come through for IPy on
PyPI in the last month so that project certainly seems alive and well.
I think the PEP currently doesn't provide appropriate weight to the efforts
of others in this area.
FYI, here is a wiki entry I've been maintaining for a while now to this end
:-
http://code.google.com/p/netaddr/wiki/YetAnotherPythonIPModule
>
> > 2/3x faster in a lot of cases,
> > not that we're counting. What a difference a year makes!
> > I also rate IPy quite highly even if it is getting a little "long in
> > the tooth".
> > For a lot of users, IPy could also be considered a nice, stable API!
>
> yes, netaddr has sped up quite a bit. It's still slower in many cases
> as well. But again, who's timing?
>
I mention speed and timings as the PEP cites this as one of the benefits of
considering the ipaddr reference implementation.
>
> > By the same token I'm happy to note some convergence between the ipaddr
> > and netaddr's various interfaces, particularly in light of discussions
> > and arguments put forward by Clay McClure and others. A satisfactory
> > compromise between the two however still seems a way off.
> >
> >
> > 3) I also disagree with the PEP's claim that :-
> >
> > "attempts to combine [IPv4 and IPv6] into one object would be like
> > trying to force a round peg into a square hole (or vice versa)".
> >
> > netaddr (and for that matter IPy) cope with this perceived problem
> > admirably.
> >
> > netaddr employs a simple variant of the GoF Strategy design pattern (with
> > added Python sensibility). In the rare cases where ambiguity exists
> > between IPv4 and IPv6 addresses a version parameter may be passed to the
> > constructor of the IPAddress class to differentiate between them.
> > Providing an IP address version to the constructor also provides a small
> > performance improvement.
>
> I'm not sure what point you're trying to make here. I didn't say it
> was impossible, I implied that there are easier ways. Having used
> code which crams both types into one object, I found it to be kludgy
> and complicated so I designed something different.
>
Let me clarify. I am +1 on the specific item in the PEP regarding the need
for separate and distinct IPAddress and IPNetwork class interfaces that are
not conflated into a single interface. Clay McClure made this point very
eloquently. I've done a good bit of experimentation on this since it was
mentioned so I am fully aware of the pros and cons of each approach. A brief
look at netaddr.ip.lite confirms that on this we both agree.
Where I disagree is on the need to have yet another split in the interface
to support different IP versions (and a set of Factory functions to pull it
all together again). Hey, another design pattern, also known as the "Factory
Method" a.k.a. Virtual Constructor (or in this case a Python function).
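
To make the version-parameter idea concrete, here is a minimal sketch of a
single entry point with an optional version argument. It is not netaddr's or
ipaddr's code: it is written against the ipaddress module that later shipped
in the Python standard library, and the names make_address and _CLASSES are
hypothetical, chosen only for illustration.

```python
import ipaddress

# Hypothetical lookup table: IP version number -> concrete address class.
_CLASSES = {4: ipaddress.IPv4Address, 6: ipaddress.IPv6Address}


def make_address(value, version=None):
    """Build an address object, optionally forcing the IP version.

    Without a version the standard library guesses (IPv4 first, then IPv6).
    With a version the guess is skipped, which resolves ambiguous input
    such as a small integer.
    """
    if version is None:
        return ipaddress.ip_address(value)
    try:
        cls = _CLASSES[version]
    except KeyError:
        raise ValueError(f"unsupported IP version: {version!r}") from None
    return cls(value)


print(make_address(1))             # integer input is ambiguous, guessed as IPv4
print(make_address(1, version=6))  # same integer, explicitly IPv6
```

The integer case is where the ambiguity shows up: the same value is a valid
address in either family, so only the caller can say which one is meant.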
>
> and as a hardly partial observer, I'll add the explicit address
> version you can pass to the IPAddress class, but not the IPNetwork
> class, is, odd. it actually seems to slow down object creation (~5%)
> except in the case of an int arg (your default is about twice as
> slow).
>
Ah, the issue of speed and timings again. Let's concentrate on getting the
interface right before we spend too much effort on optimization. I'm quite
happy to do a full speed comparison of major features in both libraries but
I don't think that would be a worthwhile use of time just now.
Currently I'm ambivalent on whether an IP(vX)Network class constructor
should accept a numerical (i.e. integer) value at all *unless* you explicitly
state somehow that you want the network aspect to be inferred in some
specific way. It isn't a case of just choosing /32 or /128 and having this
as the only option. IP (v4) classful rules are still pervasive in the real
world. A general case IP library available to the whole Python community
should certainly take this into account.
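
As an illustration of what "inferring the network aspect" from a bare integer
could look like, here is a rough sketch using classful IPv4 rules. It is an
assumption-laden example, not code from either library: it uses the ipaddress
module that later shipped in the standard library, and the helper names
classful_prefix and network_from_int are made up for this message.

```python
import ipaddress


def classful_prefix(addr):
    """Return the classful default prefix length for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:   # class A: leading bit 0
        return 8
    if first_octet < 192:   # class B: leading bits 10
        return 16
    if first_octet < 224:   # class C: leading bits 110
        return 24
    return 32               # class D/E: no network/host split, treat as a host


def network_from_int(value, prefix=None):
    """Build an IPv4 network from an integer.

    If no prefix is given, fall back to the classful default instead of
    silently assuming /32.
    """
    addr = ipaddress.IPv4Address(value)
    if prefix is None:
        prefix = classful_prefix(addr)
    # strict=False masks off the host bits rather than raising an error.
    return ipaddress.IPv4Network(f"{addr}/{prefix}", strict=False)


value = int(ipaddress.IPv4Address("192.168.0.1"))
print(network_from_int(value))             # classful default prefix
print(network_from_int(value, prefix=32))  # explicit host prefix
```

Note that this sketch discards the host bits, whereas netaddr keeps them on
the network object; which behaviour is preferable is part of the interface
question being discussed here.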
> > IPv4 and IPv6 addresses can be used interchangeably throughout netaddr
> > without causing issue during operations such as sorting, merging (known
> > in the PEP as "address collapsing") or address exclusion.
> >
> > Don't try and do this with the current reference implementation of
> > ipaddr :-
> >
> > >>> collapse_address_list([IPv4Address('1.1.1.1'),
> > ...                        IPv6Address('::1.1.1.1')])
> > [IPv4Network('1.1.1.1/32')]
> >
> > OUCH! Even if this isn't allowed (according to the documentation), it
> > should raise an Exception rather than silently passing through.
> >
> > I actually raised this back in May on the ipaddr bug tracker but it
> > hasn't received any attention so far :-
> >
> > http://code.google.com/p/ipaddr-py/issues/detail?id=18
> >
> > Compare this with netaddr's behaviour :-
> >
> > >>> cidr_merge([IPAddress('1.1.1.1'), IPAddress('::1.1.1.1')])
> > [IPNetwork('1.1.1.1/32'), IPNetwork('::1.1.1.1/128')]
> >
> > That's more like it.
>
> OUCH! indeed. I'm not even sure that this is a nice corner case
> feature, summarizing a single list of mixed ip type objects. with an
> extra line or two, this can be done in ipaddr, though 'tis true that
> we should now raise an exception and don't (it appears to be something
> that was introduced recently). If this is a feature for which
> developers are clamoring, I'm all over it. Yours is the first email
> I've heard mention it.
>
I may be the only one raising issues but that shouldn't mean they are any
less relevant. There is a whole different feel and thrust behind both
interfaces each with their own merits.
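
For readers following the mixed-version merging example quoted above, the
"extra line or two" could look roughly like the sketch below: partition the
input by IP version, then collapse each group separately. This is only an
illustration under stated assumptions. It uses the ipaddress module that
later shipped in the standard library rather than ipaddr 2.0.x or netaddr,
and merge_mixed is a hypothetical helper name.

```python
import ipaddress


def merge_mixed(items):
    """Collapse a mixed list of IPv4/IPv6 addresses or networks.

    Items are grouped by IP version first, so no IPv4 entry is ever
    compared or merged with an IPv6 one. IPv4 results come first.
    """
    by_version = {4: [], 6: []}
    for item in items:
        net = ipaddress.ip_network(item)  # bare addresses become /32 or /128
        by_version[net.version].append(net)

    merged = []
    for version in (4, 6):
        merged.extend(ipaddress.collapse_addresses(by_version[version]))
    return merged


print(merge_mixed(["1.1.1.1", "::1.1.1.1"]))
print(merge_mixed(["10.0.0.0/25", "10.0.0.128/25", "::1"]))
```

The design choice is simply to keep both entries in the mixed case instead of
dropping one silently, matching the cidr_merge result shown earlier.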
>
> > 4) It may just be me but the design of this latest incarnation of ipaddr
> > seems somewhat complicated for so few lines of code. Compared with
> > ipaddr, netaddr doesn't use or require multiple inheritance nor a
> > seemingly convoluted inheritance hierarchy. There isn't a need for an
> > IP() type 'multiplexer' function either (although I might be missing an
> > important use case here). But, then again, this may just be my personal
> > preference talking here. I prefer composition over inheritance in most
> > cases.
>
> this basically smacks of more petty attackery from the start. so I'll
> reply with, "it's just you".
>
> if you want to debate the merits of GOF strategy vs. multiple
> inheritance, fine. the class inheritance in ipaddr is very clean, and
> leaves very little code duplication. The classes are very clearly
> named and laid out, and in general are much easier to follow than the
> strategy method you've chosen for netaddr.
>
I realise you've done a lot of work on ipaddr and my observations are not
intended as "petty attackery", as you put it. They were merely meant to
question whether the shift in approach from earlier incarnations of ipaddr
to this is the correct path to be taking. I don't think that solely relying
on "IS A" via multiple inheritance necessarily brings clarity to this code
which, as stated in the PEP, is intended to be simple for others to
understand and possibly use as a basis for their own extensions. More on
this in future posts.
If you missed it I have diagrammed the class hierarchy and internal layout
of each library here for consideration :-
http://code.google.com/p/netaddr/wiki/PEP3144
[SNIP]
> > 5) The ipaddr library is also missing options for expanding various
> > (exceedingly common) IP abbreviations.
> >
> > >>> from netaddr import IPNetwork
> >
> > >>> IPNetwork('10/8', True)
> > IPNetwork('10.0.0.0/8')
> >
> > netaddr also handles classful IP address logic, still pervasive
> > throughout modern IP stacks :-
> >
> > >>> IPNetwork('192.168.0.1', True)
> > IPNetwork('192.168.0.1/24')
> >
> > Note that these options are disabled by default, to keep the speed of
> > the IPNetwork constructor up for more normal cases.
>
> these seem like corner case features for the sake of having features,
> you don't even seem to put much stock in them. FWIW, I've never seen a
> request for something similar. I may say '10 slash 8', but I mean,
> '10.0.0.0/8'. I'm missing the utility here, but I'm open to reasoned
> arguments.
>
I don't see why genuine features should be automatically dismissed as
"corner cases".
If you need proof, here is an excerpt from RFC 1918 :-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3. Private Address Space

   The Internet Assigned Numbers Authority (IANA) has reserved the
   following three blocks of the IP address space for private
   internets:

     10.0.0.0        -   10.255.255.255  (10/8 prefix)
     172.16.0.0      -   172.31.255.255  (172.16/12 prefix)
     192.168.0.0     -   192.168.255.255 (192.168/16 prefix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I've also had specific requests from users about this feature, one just in
the last week (which only required me to point them towards the available
switch argument in the IPNetwork constructor to enable the required behaviour).
In netaddr 0.7.x I have chosen *not* to make this expansion the default case
because it provides a not insignificant construction penalty for those that
are not interested in it (as you have already noted and of which I am
aware).
I believe strongly that this *is* an important option for a general use IP
address library.
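
To show how little is involved in expanding the abbreviated prefixes used in
the RFC excerpt, here is a minimal sketch. It is not netaddr's
implementation: it relies on the ipaddress module that later shipped in the
standard library, and expand_abbreviated is a hypothetical helper name used
only for this example.

```python
import ipaddress


def expand_abbreviated(cidr):
    """Expand an abbreviated IPv4 prefix such as '10/8' or '192.168/16'.

    Missing trailing octets are filled with zeros before the string is
    handed to the normal (strict) network parser.
    """
    addr, sep, prefix = cidr.partition("/")
    if not sep:
        raise ValueError(f"expected 'address/prefix', got {cidr!r}")
    octets = addr.split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError(f"too many octets in {cidr!r}")
    octets += ["0"] * (4 - len(octets))
    return ipaddress.IPv4Network(".".join(octets) + "/" + prefix)


for text in ("10/8", "172.16/12", "192.168/16"):
    print(text, "->", expand_abbreviated(text))
```

Keeping this in a separate, opt-in code path matches the point made above:
callers who never use abbreviations pay nothing for it.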
[SNIP]
> > There is a lot more to consider here than I can cram into this initial
> > message, so I'll hand over to you all for some (hopefully) serious
> > debate.
>
> I'm always open to serious debate, and patches/bug reports (apologies
> for missing your earlier issue. I'm not sure if you were aware, but
> ipaddr was undergoing a major re-write at the time and I never got
> around to following up).
I note your response to this on the ipaddr bug tracker today, thanks.
[SNIP]
> > PS - Why does the References section in the PEP contain links to
> > patches already applied to the ipaddr 2.0.x reference implementation?
>
> There's A link to A patch (singular, both times), which has already
> been applied. This link exists b/c, at the time I last updated the
> PEP, the patch hadn't been applied as it was still being reviewed.
Thanks for the clarification.
[SNIP]
> in general, that leads
> to fewer bugs like the following:
>
> >>> help(netaddr.IPNetwork.__init__)
> Help on method __init__ in module netaddr.ip:
>
> __init__(self, addr, implicit_prefix=False) unbound netaddr.ip.IPNetwork method
> Constructor.
>
> @param addr: an IPv4 or IPv6 address with optional CIDR prefix,
> netmask or hostmask. May be an IP address in representation
> (string) format, an integer or another IP object (copy
> construction).
>
> @param implicit_prefix: if True, the constructor uses classful IPv4
> rules to select a default prefix when one is not provided.
> If False it uses the length of the IP address version.
> (default: False).
>
> >>> netaddr.IPNetwork(1)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "./netaddr/ip/__init__.py", line 632, in __init__
> prefix, suffix = addr.split('/')
> AttributeError: 'int' object has no attribute 'split'
>
> vs.
>
> >>> import ipaddr
> >>> ipaddr.IPNetwork(1)
> IPv4Network('0.0.0.1/32')
>
Thanks for raising this on the netaddr bug tracker. I'll take a look at it.
> Did you have any other comments on the PEP?
>
Yes I do but they will be coming through in stages unfortunately as I get
time to look at this further.
David Moss