On Mon, Oct 8, 2012 at 12:03 PM, Rob Cliffe <rob.cliffe(a)btinternet.com> wrote:
>
> On 08/10/2012 19:39, Guido van Rossum wrote:
>>
>>>> Does this mean that the following behaviour of lists is a bug?
>>>> >>> x=float('NAN')
>>>> >>> [x]==[x], [x]<=[x], [x]>=[x]
>>>>
>>>> (True, True, True)
>>>
>>> No. That's a special case in the comparisons for sequences.
>>
>> [Now that I'm back at a real keyboard I can elaborate...]
>>
>> This applies to all container comparisons: without the rule that if
>> two contained items reference the same object they are to be
>> considered equal without calling their __eq__, containers couldn't
>> take the shortcut that a container is always equal to itself (i.e. c1
>> is c2 => c1 == c2). Without this shortcut, container comparisons would
>> be much more expensive: any time a large container was compared to
>> itself, it would be forced to recursively compare all the contained
>> items. You might say that it has to do this anyway when comparing to a
>> container that is not itself, but if the answer is "unequal" the
>> comparison can stop as soon as two unequal items are found, whereas if
>> the answer is "equal" you end up comparing all items. For two
>> different containers there is no possible shortcut, but comparing a
>> container to itself is quite common and really does deserve the
>> shortcut. We discussed this in the past and always came to the same
>> conclusion: despite the rules for NaN, the shortcut for containers is
>> required. A similar shortcut exists for 'x in [x]' BTW.
>>
> Thank you for elaborating, I was going to ask what the justification for the
> special case was.
> You have explained why
>
> >>> x=float('NAN'); A=[x]; A==A
> True
>
> but not as far as I can see why
>
> >>> x=float('NAN'); A=[x]; B=[x]; A==B, [x]==[x]
> (True, True)
>
> where neither of the results is comparing a container to itself.
It's so that when the container is iterating over pairs of elements it
can check for item identity (a simple pointer comparison) first, which
makes a pretty big difference in speed.
--
--Guido van Rossum (python.org/~guido)
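The identity-first rule described above can be observed directly. What follows is only a minimal sketch, not CPython's actual implementation: `items_equal` and `list_eq` are hypothetical helper names that model the check in pure Python, and the example assumes (as is the case in CPython) that two separate `float("nan")` calls return distinct objects.

```python
def items_equal(a, b):
    # Identity first (a cheap pointer comparison), then fall back to __eq__.
    return a is b or a == b


def list_eq(xs, ys):
    # Hypothetical pure-Python model of how two lists are compared for equality.
    if len(xs) != len(ys):
        return False
    return all(items_equal(a, b) for a, b in zip(xs, ys))


x = float("nan")
y = float("nan")  # a second, distinct NaN object (assumption: CPython)

nan_equals_itself = (x == x)          # NaN's own __eq__ always says "unequal"
same_object_lists = ([x] == [x])      # two lists holding the same object
distinct_nan_lists = ([x] == [y])     # two lists holding different NaN objects
membership = (x in [x], y in [x])     # 'in' applies the same identity shortcut

print(nan_equals_itself, same_object_lists, distinct_nan_lists, membership)
print(list_eq([x], [x]), list_eq([x], [y]))
```

The model agrees with the built-in behaviour: the lists compare equal only when the NaN inside them is the very same object.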
On Mon, Oct 8, 2012 at 10:36 AM, Guido van Rossum <guido(a)python.org> wrote:
>
>>> It's not about equality. If you ask whether two NaNs are *unequal* the
>>> answer is *also* False.
>>
>> Does this mean that the following behaviour of lists is a bug?
>> >>> x=float('NAN')
>> >>> [x]==[x], [x]<=[x], [x]>=[x]
>> (True, True, True)
>
> No. That's a special case in the comparisons for sequences.
[Now that I'm back at a real keyboard I can elaborate...]
This applies to all container comparisons: without the rule that if
two contained items reference the same object they are to be
considered equal without calling their __eq__, containers couldn't
take the shortcut that a container is always equal to itself (i.e. c1
is c2 => c1 == c2). Without this shortcut, container comparisons would
be much more expensive: any time a large container was compared to
itself, it would be forced to recursively compare all the contained
items. You might say that it has to do this anyway when comparing to a
container that is not itself, but if the answer is "unequal" the
comparison can stop as soon as two unequal items are found, whereas if
the answer is "equal" you end up comparing all items. For two
different containers there is no possible shortcut, but comparing a
container to itself is quite common and really does deserve the
shortcut. We discussed this in the past and always came to the same
conclusion: despite the rules for NaN, the shortcut for containers is
required. A similar shortcut exists for 'x in [x]' BTW.
--
--Guido van Rossum (python.org/~guido)
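To make the cost argument concrete, here is a small sketch that counts how often an element's `__eq__` actually runs. `Counted` is a hypothetical class invented for this illustration, and the counts reflect CPython's behaviour rather than a documented guarantee.

```python
class Counted:
    """Hypothetical element type that counts how often __eq__ is invoked."""

    calls = 0

    def __eq__(self, other):
        Counted.calls += 1
        return self is other


big = [Counted() for _ in range(1000)]

# A container compared to itself: every pair is the same object.
self_result = (big == big)
calls_after_self = Counted.calls

# A different list holding the same element objects.
shallow = list(big)
shallow_result = (big == shallow)
calls_after_shallow = Counted.calls

# A list of fresh objects: the comparison stops at the first unequal pair.
fresh = [Counted() for _ in range(1000)]
fresh_result = (big == fresh)
calls_after_fresh = Counted.calls

print(self_result, calls_after_self)
print(shallow_result, calls_after_shallow)
print(fresh_result, calls_after_fresh)
```

In the "equal" cases all 1000 pairs are visited but none of them reaches `__eq__`, while the "unequal" case exits early, which matches the reasoning in the message above.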
On Sun, Oct 7, 2012 at 7:16 PM, Duncan McGreggor
<duncan.mcgreggor(a)gmail.com> wrote:
>
>
> On Sun, Oct 7, 2012 at 5:52 PM, Guido van Rossum <guido(a)python.org> wrote:
>>
>> On Sat, Oct 6, 2012 at 9:09 PM, Duncan M. McGreggor
>> <duncan.mcgreggor(a)gmail.com> wrote:
>> > We're here ;-)
>> >
>> > I'm forwarding this to the rest of the Twisted cabal...
>>
>> Quick question. I'd like to see how Twisted typically implements a
>> protocol parser. Where would be a good place to start reading example
>> code?
>
>
> I'm not exactly sure what you're looking for (e.g., I'm not sure what your
> exact definition of a protocol parser is), but this might be getting close
> to what you want:
>
> * https://github.com/twisted/twisted/blob/master/twisted/mail/pop3.py
> * https://github.com/twisted/twisted/blob/master/twisted/protocols/basic.py
>
> The POP3 protocol implementation in Twisted is a pretty good example of how
> one should create a protocol. It's a subclass of the
> twisted.protocols.basic.LineOnlyReceiver, and I'm guessing when you said
> "parsing" you're wanting to look at what's in the dataReceived method of
> that class.
>
> Hopefully that's what you were after...
Yes, those are perfect. The term I used came from one of Josiah's
previous messages in this thread, but I think he really meant protocol
handler.
My current goal is to see if it would be possible to come up with an
abstraction that makes it possible to write protocol handlers that are
independent from the rest of the infrastructure (e.g. transport,
reactor). I honestly have no idea if this is a sane idea but I'm going
to look into it anyway; if it works it would be cool to be able to
reuse the same POP3 logic in different environments (e.g. synchronous
thread-based, Twisted) without having to pull in all of Twisted. I.e.
Twisted could contribute the code to the stdlib and the stdlib could
make it work with SocketServer but Twisted could still use it
(assuming Twisted ever gets ported to Py3k :-).
--
--Guido van Rossum (python.org/~guido)
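One possible shape for such a transport-independent handler is sketched below. This is purely an illustration of the idea, not Twisted's API or any stdlib design: `LineProtocol`, `data_received`, `line_received` and `TinyPop3Like` are hypothetical names, and the handler simply takes bytes in and hands reply bytes back, leaving all I/O to the caller.

```python
class LineProtocol:
    """Hypothetical handler: consumes raw bytes, returns reply bytes.

    It never touches a socket, reactor, or thread, so any environment
    that can move bytes around could drive it.
    """

    delimiter = b"\r\n"

    def __init__(self):
        self._buffer = b""

    def data_received(self, data):
        # Accumulate input and dispatch every complete line found so far.
        self._buffer += data
        replies = []
        while self.delimiter in self._buffer:
            line, self._buffer = self._buffer.split(self.delimiter, 1)
            replies.append(self.line_received(line))
        return b"".join(replies)

    def line_received(self, line):
        raise NotImplementedError


class TinyPop3Like(LineProtocol):
    """Toy command set, loosely POP3-flavoured; not a real POP3 server."""

    def line_received(self, line):
        command = line.strip().upper()
        if command == b"NOOP":
            return b"+OK" + self.delimiter
        if command == b"QUIT":
            return b"+OK bye" + self.delimiter
        return b"-ERR unknown command" + self.delimiter


handler = TinyPop3Like()
first = handler.data_received(b"NO")            # partial line, no reply yet
second = handler.data_received(b"OP\r\nFOO\r\nQUIT\r\n")
print(first, second)
```

A synchronous server would call `data_received` with whatever `recv()` returned and `sendall()` the result; an event-driven framework would do the same from its read callback.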
On Sat, Oct 6, 2012 at 9:09 PM, Duncan M. McGreggor
<duncan.mcgreggor(a)gmail.com> wrote:
> We're here ;-)
>
> I'm forwarding this to the rest of the Twisted cabal...
Quick question. I'd like to see how Twisted typically implements a
protocol parser. Where would be a good place to start reading example
code?
--
--Guido van Rossum (python.org/~guido)
Having just discovered that PEP 3131 [1] enables me to use Greek letters to
represent variables in equations, it was pointed out to me that it also
allows visually confusable characters in identifiers [2].
When I previously read the PEP I thought that the normalisation process
resolved these issues but now I see that the PEP leaves it as an open
problem.
I also previously thought that the PEP would be irrelevant if I was using
ASCII-only code but now I can see that if a GREEK CAPITAL LETTER ALPHA can
sneak into my code (just like those pesky tab characters) I could still
have a visually undetectable bug.
An example to show how an issue could arise:
"""
#!/usr/bin/env python3
code = '''
{0} = 123
{1} = 456
print('"{0}" == "{1}":', "{0}" == "{1}")
print('{0} == {1}:', {0} == {1})
'''
def test_identifier(identifier1, identifier2):
    exec(code.format(identifier1, identifier2))
test_identifier('\u212b', '\u00c5') # Different Angstrom code points
test_identifier('A', '\u0391') # LATIN/GREEK CAPITAL A/ALPHA
"""
When I run this I get:
$ ./test.py
"Å" == "Å": False
Å == Å: True
"A" == "Α": False
A == Α: False
Is the proposal mentioned in the PEP (to use something based on Unicode
Technical Standard #39 [3]) something that might be implemented at any
point?
Oscar
References:
[1] http://www.python.org/dev/peps/pep-3131/#open-issues
[2] http://article.gmane.org/gmane.comp.python.tutor/78116
[3] http://unicode.org/reports/tr39/#Confusable_Detection
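The difference between the two test cases above comes down to NFKC normalisation, which PEP 3131 applies to identifiers. The sketch below shows it with `unicodedata`, and adds a small helper for spotting non-ASCII names in otherwise ASCII code. `non_ascii_names` is a hypothetical helper written for this illustration (it needs Python 3.7+ for `str.isascii`); it is not the confusable detection described in UTS #39.

```python
import ast
import unicodedata

angstrom_sign = "\u212b"   # ANGSTROM SIGN
a_with_ring = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
latin_a = "A"
greek_alpha = "\u0391"     # GREEK CAPITAL LETTER ALPHA

# The two Angstrom code points differ as strings but share one NFKC form,
# so they name the same identifier.
angstrom_nfkc = unicodedata.normalize("NFKC", angstrom_sign)

# Greek alpha has no decomposition: normalisation leaves it untouched,
# so it stays a different identifier from Latin A.
alpha_nfkc = unicodedata.normalize("NFKC", greek_alpha)


def non_ascii_names(source):
    """Hypothetical helper: list the non-ASCII variable names in source."""
    tree = ast.parse(source)
    return sorted(
        {
            node.id
            for node in ast.walk(tree)
            if isinstance(node, ast.Name) and not node.id.isascii()
        }
    )


suspicious = non_ascii_names("A = 123\n\u0391 = 456\nprint(A)\n")
print(angstrom_nfkc == a_with_ring, alpha_nfkc == latin_a, suspicious)
```

A check like this only flags that a non-ASCII name exists; deciding whether it is actually confusable with another name would need the data tables from UTS #39.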
The builtin round function is completely useless. I've never seen
anyone use it constructively. Usually people using it are new
programmers who are not comfortable with or aware of string
formatting. Sometimes people use it to poorly replicate functionality
that's implemented correctly in the decimal module.
Mike
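For readers weighing that claim, here is a short comparison of the three approaches mentioned: `round`, string formatting, and the decimal module. It is a sketch of the known behaviour, using the 2.675 example from the Python documentation; the variable names are just for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

value = 2.675  # stored in binary as slightly less than 2.675

# round() works on the binary float, so the "half" case is not really a half.
rounded = round(value, 2)

# String formatting rounds the same underlying binary value for display.
formatted = format(value, ".2f")

# decimal works on the exact decimal digits and lets you pick the rounding mode.
exact = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# With no ndigits, round() uses round-half-to-even on exact halves.
halves = [round(0.5), round(1.5), round(2.5)]

print(rounded, formatted, exact, halves)
```

Whether this makes `round` "useless" is a matter of opinion; the example only shows where its results differ from what decimal arithmetic gives.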
-cc: python-dev
+cc: python-ideas
On Sat, Sep 29, 2012 at 11:39 AM, Chris Angelico <rosuav(a)gmail.com> wrote:
> On Sun, Sep 30, 2012 at 4:26 AM, Brett Cannon <brett(a)python.org> wrote:
> > Does this mean we want to re-open the discussion about decimal constants?
> > Last time this came up I think we decided that we wanted to wait for
> > cdecimal (which is obviously here) and work out how to handle contexts,
> > the syntax, etc.
>
> Just to throw a crazy idea out: How bad a change would it be to make
> decimal actually the default?
>
> (Caveat: I've not worked with decimal/cdecimal to any real extent and
> don't know its limitations etc.)
>
Painful for existing code, unittests and extension modules. Definitely
python-ideas territory (thread moved there with an appropriate subject).
I'm not surprised at all that a decimal type can be "fast" in an
interpreted language due to the already dominant interpreter overhead.
I wish all spreadsheets had used decimals from day one rather than binary
floating point (blame Lotus?). Think of the trouble that would have saved
the world.
-gps
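The spreadsheet complaint, and the "how to handle contexts" question quoted earlier, can both be illustrated with the existing decimal module. This is a minimal sketch using today's API, not a proposal for what decimal literals or a decimal default would look like.

```python
from decimal import Decimal, localcontext

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Decimal arithmetic on the same literals behaves the way people expect.
decimal_sum_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Results depend on the active context (precision, rounding, ...), which is
# one reason decimal constants are harder to design than float constants.
with localcontext() as ctx:
    ctx.prec = 5
    third_low_precision = Decimal(1) / Decimal(3)

print(float_sum_matches, decimal_sum_matches, third_low_precision)
```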