In Python 2.5, `0or` was accepted by the Python parser. It became an
error in 2.6 because "0o" became recognized as an incomplete octal
number. `1or` is still accepted.
On the other hand, `1if 2else 3` is accepted despite the fact that "2e"
can be recognized as an incomplete floating point number. In this case
the tokenizer pushes "e" back and returns "2".
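One way to observe this pushback is with the pure-Python tokenize
module (my illustration; the C tokenizer used by the parser backs up
the same way here, and newer Python versions warn about such literals):

import io
import tokenize

# "2else" starts like the float "2e...", but no exponent digits follow,
# so the tokenizer backs up and emits NUMBER '2', then NAME 'else'.
src = "1if 2else 3\n"
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))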
Shouldn't it do the same with "0o"? It is possible to make `0or`
parseable again. The pure-Python implementation of the tokenizer is
already able to tokenize this example:
$ echo '0or[]' | ./python -m tokenize
1,0-1,1: NUMBER '0'
1,1-1,3: NAME 'or'
1,3-1,4: OP '['
1,4-1,5: OP ']'
1,5-1,6: NEWLINE '\n'
2,0-2,0: ENDMARKER ''
On the other hand, all these examples look weird. There is an asymmetry:
`1or 2` is valid syntax, but `1 or2` is not. It is hard to visually
recognize the boundary between a number and the following identifier or
keyword, especially since numbers can contain letters ("b", "e", "j",
"o", "x") and underscores, and identifiers can contain digits. Both
sides of the boundary can contain letters, digits, and underscores.
I propose to change the Python syntax by adding a requirement that there
be whitespace or a delimiter between a numeric literal and the following
keyword or identifier.
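To illustrate the proposed rule (a sketch of the intended behavior, not
of current semantics):

>>> 1if 2else 3      # accepted today; would become a SyntaxError
1
>>> 1 if 2 else 3    # would remain the required spelling
1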
The webmaster has already heard from 4 people who cannot install it.
I sent them to the bug tracker or to python-list, but they seem not to
have gone to either place. Is there some guide I should be sending them
to, on how to debug installation problems?
If one goes to https://www.python.org/downloads from a Windows browser,
the default download URL is for the 32-bit installer instead of the
64-bit one. I wonder why this is still the case.
Shouldn't we encourage new Windows users (who may not even know the
distinction between the two architectures) to use the 64-bit version of
Python, since most likely they can?
If this is not the correct forum for this, please let me know where I can
direct my question/feature request, thanks.
Hi, I'm opening this thread to discuss Nick Coghlan's proposal
to add __int__ and __trunc__ to a type when __index__ is defined.
Currently __int__ does not default to __index__ during class
initialization; both must be defined to get coherent behavior:
(cpython-venv) ➜ cpython git:(add-key-argument-to-bisect) ✗ python3
Python 3.8.0a1+ (heads/add-key-argument-to-bisect:b7aaa1adad, Feb 18
[Clang 10.0.0 (clang-1000.10.44.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> class MyInt:
...     def __index__(self):
...         return 4
...
>>> int(MyInt())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string, a bytes-like object or a number, not 'MyInt'
>>> math.trunc(MyInt())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type MyInt doesn't define __trunc__ method
>>> MyInt.__int__ = MyInt.__index__
>>> int(MyInt())
4
The difference in behavior is especially weird in builtins like int()
and math.trunc().
The documentation mentions the need to always define both __index__ and
__int__:
Note: In order to have a coherent integer type class, when __index__()
is defined __int__() should also be defined, and both should return the
same value.
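Following that guidance today means wiring the methods up by hand,
along these lines (an illustrative sketch):

class MyInt:
    def __index__(self):
        return 4
    __int__ = __index__      # needed today so int(MyInt()) works
    __trunc__ = __index__    # needed today so math.trunc(MyInt()) works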
Nick Coghlan proposes to make __int__ default to __index__ when only
__index__ is defined, and asked to open a discussion on python-dev
before making any change, "as the closest equivalent we have to this
right now is the "negative" case where overriding __eq__ without
overriding __hash__ implicitly marks the class as unhashable (look for
"type->tp_hash =" in CPython's typeobject.c)".
I think the proposed change makes more sense than the current behavior,
and I volunteer to implement it if it is accepted.
What do you think about this?
I'm working on a compact and ordered set implementation.
It has an internal data structure similar to the new dict in Python 3.6.
It is still a work in progress. Comments, tests, and documents should be
updated. But it passes the existing tests, excluding test_sys and
test_gdb (both tests check implementation details).
Before completing this work, I want to evaluate it.
Below are my current thoughts about the compact ordered set.
## Preserving insertion order
Order is not fundamental for sets; there is no order for sets in the
mathematical sense. But it is sometimes convenient in the real world.
For example, it makes doctests easier. When writing sets to logs, we can
use the "grep" command if the print order is stable. pyc files become
stable without the PYTHONHASHSEED=0 hack.
Additionally, consistency with dict is desirable. It removes one pitfall
for new Python users: the "remove duplicated items from a list" idiom
could become `list(set(duplicated))` instead of
`list(dict.fromkeys(duplicated))`, as illustrated below.
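For instance (the second output is one possible run; an ordered set
would make it ['b', 'a', 'c'] as well):

>>> items = ["b", "a", "b", "c", "a"]
>>> list(dict.fromkeys(items))   # insertion order preserved today
['b', 'a', 'c']
>>> list(set(items))             # arbitrary today, varies with hash seed
['c', 'b', 'a']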
## Memory efficiency
Hash tables have a dilemma: to reduce the collision rate, a hash table
should be sparse, but sparseness wastes memory.
Since the current set is optimized for both hit and miss cases, it is
more sparse than dict. (It is a bit of a surprise that a set typically
uses more memory than a dict of the same size!)
The new implementation partially solves this dilemma. It has a sparse
"index table" whose entries are small (1 byte when the table size is
<= 256, 2 bytes when the table size is <= 65536), and a dense entry
table (each entry holds a key and a hash, which is 16 bytes on a 64-bit
system).
I use 1/2 as the capacity (fill) rate for now. So the new implementation
is memory efficient when len(s) <= 32768, roughly equal to the current
implementation when 32768 < len(s) <= 2**31, and worse than the current
implementation when len(s) > 2**31.
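Back-of-the-envelope arithmetic for the layout above (my illustration,
assuming a 64-bit build, a 256-slot table, 1-byte index entries, and
the 1/2 fill rate mentioned above):

# Current set: a 16-byte entry (hash + key pointer) for every slot,
# empty or not.  New set: a tiny index per slot plus a dense 16-byte
# entry per item actually stored.
table_size = 256
n_items = table_size // 2                  # 1/2 fill rate
current = table_size * 16                  # 4096 bytes
compact = table_size * 1 + n_items * 16    # 256 + 2048 = 2304 bytes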
Here is a quick test of memory usage.
## Benchmark
$ ./python -m perf compare_to master.json oset2.json -G --min-speed=2
- unpickle_list: 8.48 us +- 0.09 us -> 12.8 us +- 0.5 us: 1.52x slower (+52%)
- unpickle: 29.6 us +- 2.5 us -> 44.1 us +- 2.5 us: 1.49x slower (+49%)
- regex_dna: 448 ms +- 3 ms -> 462 ms +- 2 ms: 1.03x slower (+3%)
- meteor_contest: 189 ms +- 1 ms -> 165 ms +- 1 ms: 1.15x faster (-13%)
- telco: 15.8 ms +- 0.2 ms -> 15.3 ms +- 0.2 ms: 1.03x faster (-3%)
- django_template: 266 ms +- 6 ms -> 259 ms +- 3 ms: 1.03x faster (-3%)
- unpickle_pure_python: 818 us +- 6 us -> 801 us +- 9 us: 1.02x faster (-2%)
Benchmark hidden because not significant (49)
unpickle and unpickle_list show a massive slowdown. I suspect this
slowdown is not caused by the set change. Linux perf shows many page
faults happening in pymalloc_malloc. I think the memory usage changes
accidentally hit a weak point of pymalloc. I will try to investigate it.
On the other hand, meteor_contest shows a 13% speedup. It uses sets.
The others don't show significant performance changes.
I need to write more benchmarks for various set workloads; a sketch of
one appears after the link below.
I expect the new set to be faster at simple creation, iteration, and
destruction. In particular, sequential iteration and deletion should
reduce cache misses.
(e.g. https://bugs.python.org/issue32846 )
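For example, creation and iteration microbenchmarks might look like
this (hypothetical workloads, using the same perf tool as the
comparison above):

$ ./python -m perf timeit -s "data = list(range(1000))" "set(data)"
$ ./python -m perf timeit -s "s = set(range(1000))" "for x in s: pass"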
On the other hand, the new implementation will be slower in complex
(heavy random add & del) cases.
Any comments are welcome, and any benchmarks for set workloads are very
welcome.
INADA Naoki <songofacandy(a)gmail.com>
PEP 394 says:
> This recommendation will be periodically reviewed over the next few
> years, and updated when the core development team judges it
> appropriate. As a point of reference, regular maintenance releases
> for the Python 2.7 series will continue until at least 2020.
I think it's time for another review.
I'm especially worried about the implication of these:
- If the `python` command is installed, it should invoke the same
version of Python as the `python2` command
- scripts that are deliberately written to be source compatible
with both Python 2.x and 3.x [...] may continue to use `python` on
their shebang line.
So, to support scripts that adhere to the recommendation, Python 2
needs to be installed :(
Please see this PR for details and a suggested change: