[Python-ideas] IntFlags

Neil Girdhar mistersheik at gmail.com
Fri Mar 6 18:03:58 CET 2015


It seems to me that we probably would agree on an interface even if we have
a philosophical disagreement.  I think it's possible to have a clean,
Pythonic interface that produces whatever integers you want.

In short, my preferred interface is:

__or__ (and __ior__)
__setattr__ and __getattr__
__int__

and that's it.  Is there really a use case for __and__, or __invert__?

Given that you want to follow C so closely, I'm surprised that you don't
prefer IntFields to IntFlags.  I also gave a couple motivating examples for
fields (here's a third:
http://www.tagwith.com/question_332767_rgb-color-converting-into-565-format
).
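
As a rough illustration of what a multi-bit field layout looks like with plain
integers, here is a sketch of the 5-6-5 packing that question is about.  The
function names are hypothetical, and this is ordinary shifting and masking
rather than the IntFields interface itself.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit-per-channel RGB into a 16-bit 5-6-5 value."""
    # Keep the top 5, 6 and 5 bits of each channel and place them side by side.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(value):
    """Split a 16-bit 5-6-5 value back into its three fields."""
    red = (value >> 11) & 0x1F    # 5 bits
    green = (value >> 5) & 0x3F   # 6 bits
    blue = value & 0x1F           # 5 bits
    return red, green, blue


pure_red = pack_rgb565(255, 0, 0)
fields = unpack_rgb565(pure_red)
```

Each field here is wider than one bit, which is the case that single-bit flags
do not cover.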


Best,

Neil

On Fri, Mar 6, 2015 at 11:41 AM, Andrew Barnert <abarnert at yahoo.com> wrote:

> On Mar 6, 2015, at 1:42, Neil Girdhar <mistersheik at gmail.com> wrote:
>
> On Fri, Mar 6, 2015 at 4:28 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>
>> On Mar 5, 2015, at 20:26, Neil Girdhar <mistersheik at gmail.com> wrote:
>>
>> Even if you constrain yourself to the BitFlags rather than the more
>> general BitFields, I strongly disagree with the interface that people are
>> proposing involving & and ~ operators.  In general, good interface design
>> reflects the way we think about objects — not their underlying
>> representation.
>>
>>
>> But sometimes the object really is "an integer used as a set of bits in
>> some C structure/protocol field/well-known API".
>>
>
> You can always get that integer by casting to integer.
>
>
>>
>> For example, if we were designing os.open or mmap or whatever as a
>> Pythonic interface, it wouldn't have a "flags" value that or's together
>> multiple integers. We'd probably have separate keyword-only arguments for
>> the less common flags, etc. But they weren't designed from scratch; they
>> were designed to closely mirror the POSIX APIs. And that doesn't just mean
>> simpler implementation, it means people who are familiar with those APIs
>> know how to use them. It means the vast volumes of tutorials and sample
>> code for opening file handles or mapping memory written for C applies to
>> Python. And so on. So, the interface makes sense.
>>
>
> I disagree that there is any need to follow the style of the "vast volumes
> of tutorials and sample code in C" when designing Python libraries.  The
> goal is for the Python code to be as natural as possible.  Member access,
> and building constants using | are natural.  Using &~ to clear a bit is not
> natural; it is a coincidence of implementation that distracts from what is
> happening.
>
>
> For the vast majority of libraries, I agree. An XML parser or audio
> decoder has no need to follow cryptic C API standards.
>
> But libraries that are designed for close-to-the-metal access can be an
> exception--again, consider os.open, which automatically gives you access to
> every *nix platform's platform-specific features.
>

It's just as "close to the metal" to write things with member access.  It's
not as if it's going to be much slower!  It's just a question of how you
express setting and clearing bits.


>
> And wrapping C libraries that don't have much of a Python userbase can be
> another example. If enough people start using it, someone will write and
> document a higher-level Pythonic interface, but until that happens, having
> an interface which closely matches what people can find documentation,
> StackOverflow help, sample code, etc. for is a huge help.
>

Most people use StackOverflow and SO always adapts.


> Consider PyGame. Much of it is still sparsely documented, but because it
> wraps the SDL APIs, you can almost always figure out what you need to do,
> which is part of the reason it's so popular while higher-level wrappers are
> not. (The other part of the reason is that it wraps almost all of the
> functionality of SDL, and nothing else can claim that, and again that's
> probably because it's a thin wrapper.) Or consider PyWin32: it has almost
> no documentation, and it's not at all Pythonic, but because you can look up
> a function on MSDN and directly use the C documentation, it's useful for
> all those areas of the Win32 API (and third-party COM libraries, etc.) that
> don't have higher-level wrappers.
>
> And there are plenty of protocols, file formats, etc. for which the
> documentation is written for C (or is just a C implementation, as with the
> predecessor to RTSP that I forget the name of) as well.
>
> In an ideal world, everything you wanted would have a high-level, Pythonic
> API--in fact, everything would be designed for Python in the first place.
> In the real world, you're better off with a C API than with no API at all.
>

I don't think the above API is so "high level".  I think "x.b = False" is
just better design than "x &= ~Class.b".
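
For reference, this is what the operator spelling does to the underlying
integer.  The mask names and values below are assumptions made up for the
example.

```python
# Hypothetical masks, one bit each.
READABLE, WRITABLE, EXECUTABLE = 1, 2, 4

flags = READABLE | WRITABLE   # build a value with |

# Clear one bit: AND with the complement of its mask.
cleared = flags & ~WRITABLE

# Leaving out the ~ does something else: it keeps only that bit.
kept_only = flags & WRITABLE

# Test a bit: a non-zero result means the bit is set.
is_readable = bool(cleared & READABLE)
```

With member access, the clearing line would read as an assignment of False to
the named bit, and the test as a plain attribute read.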


>
> So, one very good use for something like IntFlags is to allow people to
>> keep using that C sample code (with trivial, easy-to-understand changes),
>> but get better debugging, etc. when they do so--e.g., when you introspect
>> an mmap object, it would be great if it could tell you that it was opened
>> with PROT_READ | PROT_EXEC, instead of telling you "3", which you have to
>> manually convert to bits and reverse-lookup in the docs or the module dict.
>>
>
> Yes, totally agree.
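
The manual reverse lookup described above can be sketched with plain integers.
The helper name describe is hypothetical, and the mask values are assumed (they
match common Linux values but are platform-defined); real code would take them
from the mmap module.

```python
# Assumed values for illustration only.
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4

PROT_NAMES = {
    PROT_READ: "PROT_READ",
    PROT_WRITE: "PROT_WRITE",
    PROT_EXEC: "PROT_EXEC",
}


def describe(value, names=PROT_NAMES):
    """Render an integer as the names of the bits that are set in it."""
    parts = [name for mask, name in sorted(names.items()) if value & mask]
    known = 0
    for mask in names:
        known |= mask
    leftover = value & ~known
    if leftover:
        # Bits with no known name are shown in hex rather than dropped.
        parts.append(hex(leftover))
    return " | ".join(parts) if parts else "0"


text = describe(PROT_READ | PROT_EXEC)
```

A flags type with named members could produce this kind of text in its repr, so
nobody has to decode the number by hand.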
>
>>
>> Not allowing people to use C-style operations if they use named bits
>> means that someone who wants the advantages of named bits has to rewrite
>> their familiar C-style code. Sure, maybe the result will be more readable
>> (although that's arguable; the suggested alternatives are pretty
>> verbose--especially since people keep suggesting mutating-only APIs...),
>> but it means many people will stick with plain ints rather than rewrite,
>> and those who do rewrite will end up with code that doesn't look like the
>> familiar code that everyone knows how to read from C.
>>
>
> I totally agree with you that the API should not consist only of mutating
> functions.  I agree that | should be used for combining bit fields or
> flags.  However, the people who are "familiar with C" (including me) are
> frankly dying :)
>
>
> People have been saying that for a couple decades now, but there's still
> tons of functionality--not just system-level stuff, but APIs for high-level
> things like audio fingerprinting or animating sprites or streaming video
> or extending a Python interpreter--that only exists in C (or sometimes C++
> or ObjC), or with very thin wrappers for higher-level languages. And that's
> still going to be true for a long time to come.
>
> More importantly, if C really were dead and irrelevant, there would be no
> need for this proposal; again, the only reason you ever care about packing
> flags into an int in the first place is for compatibility with C or C-style
> code. When you don't need that, just use a namedtuple or a set or keyword
> arguments or whatever in the first place.
>

I'm not saying it's dead.  I'm saying that pandering to an audience who
knows C is a waste.  I have a feeling that the real inertia has nothing to
do with other people who might know C, and more to do with people like me
and you who want to keep writing things the same way we've been writing
things. Sometimes, we've been doing things the long way, and the next
generation can write things the short way.  It's not much more "high
level".  It's just simpler.


> Pandering to the past really gets you nowhere.  Try to be a bit idealistic
> so that new Python code is natural, succinct, and human-readable — rather
> than the C values of reflecting the underlying representation in spite of
> the human being.
>
>
>>   The fact is that a BitSet's main operations are setting and clearing
>> individual bits.  It is as if the BitFlags are a namedtuple with Boolean
>> elements whose underlying storage happens to be an integer.
>>
>>
>> In the case where you don't really care that the underlying storage is an
>> integer, why use an integer in the first place? Why not use a namedtuple,
>> or a set, or whatever else is appropriate? In the very rare case where you
>> need to store a million of these things (and can't store them even more
>> compactly with array or NumPy or similar), you can go get a third-party
>> lib; the vast majority of the time, there's no advantage to using an
>> integer.
>>
>
> The main reason is so that you can cast it to "int" and produce something
> that some API requires.
>
>>
>> Except, of course, when the underlying representation is the whole point,
>> because you're dealing with an API that's written in terms of integers.
>>
>> right.
>
>>   Therefore, the interface that makes the most sense is member access:
>>
>> my_bit_flags.some_bit = True
>> my_bit_flags.some_bit = False
>>
>> I don't see the justification for writing these as
>>
>> my_bit_flags |= TheBitFlagsClass.some_bit
>> my_bit_flags &= ~TheBitFlagsClass.some_bit
>>
>> The second line is particularly terrible because it exposes you to making
>> mistakes like:
>>
>> my_bit_flags &= TheBitFlagsClass.some_bit
>> my_bit_flags |= ~TheBitFlagsClass.some_bit
>>
>> — both of which are meaningless.
>>
>>
>> No they're not. Put some real names instead of toy names there:
>>
>>     readable = m.prot
>>     readable &= ProtFlags.Readable
>>
>> Now it's true iff m.prot includes the Readable flag.
>>
>> Of course usually you'd write this in a single line without mutation:
>>
>>     readable = m.prot & ProtFlags.Readable
>>
>
> We both know that the most readable version is just member access, like
> you would on any object:
>
> readable = m.prot.readable
>
> This usage of & to filter is unnecessarily complicated.  The fact that the
> machine does so is no reason for the programmer to write it so.
>
>
> Right, so someone should write a higher-level library that wraps up mmap
> so you don't have to use it. But no one has done so yet, and if you want to
> use it without waiting another couple decades until someone gets around to
> it, you're using the C-style API.
>
> But that just goes to show that the primary interface of bit flags is an
>> immutable one; trying to force people to use mutating methods like set_bit
>> and clear_bit is just getting in people's way. (And try to come up with a
>> good name for the non-mutating operation that's obvious and reads like
>> English and isn't approaching the ridiculous Apple level of verbosity you
>> get in Cocoa methods like "bitSetWithBitClear:".)
>>
>
> I agree with you here.  I think you should also have | so that you can
> build constants the way you're used to, although I'm not sure about & since
> I don't see when you would use it in preference to member access.
>
>
> OK, if you have | and &, you automatically have |= and &=. There's no way
> to implement the former without automatically getting the latter. So if
> that's your suggestion, it's not possible in the first place, so you have
> to choose whether we get both or neither.
>
>
>> It also makes it hard to convert code between the alternate implementation
>> of using a namedtuple.  It should be easy to do that in my opinion.
>>
>>
>
>>
>> Best,
>>
>> Neil
>>
>> On Thu, Mar 5, 2015 at 12:57 PM, Serhiy Storchaka <storchaka at gmail.com>
>> wrote:
>>
>>> On 05.03.15 19:29, Neil Girdhar wrote:
>>>
>>>> Have you looked at my IntFields generalization of IntFlags?  It seems
>>>> that many of your examples (permissions, e.g.) are better expressed with
>>>> fields than with flags.
>>>>
>>>
>>> It looks too complicated for such simple case. And it has an interface
>>> incompatible with plain int.
>>>
>>>
>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>>
>>
>>
>>
>