Hello!
This list (which I co-admin, with Georg) is getting less and less
traffic as months pass by. See:
https://mail.python.org/pipermail/python-porting/
The interwebs have been collecting tons of resources about porting py2
to 3 over these years. Any not-yet-answered question can surely be
asked on a list with more participants.
Can we kill this list?
Thanks! Regards,
--
. Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org.ar/
Twitter: @facundobatista
webmaster has already heard from 4 people who cannot install it.
I sent them to the bug tracker or to python-list but they seem
not to have gone either place. Is there some guide I should be
sending them to, 'how to debug installation problems'?
Laura
If one goes to https://www.python.org/downloads
from a Windows browser, the default
download URL is for the 32-bit installer instead of the 64-bit one.
I wonder why this is still the case.
Shouldn't we encourage new Windows users (who may not even know the
distinction between the two architectures) to use the 64-bit version of
Python, since most likely they can?
If this is not the correct forum for this, please let me know where I can
direct my question/feature request, thanks.
Cosimo
--
Cosimo Lupo
Hi, I open this thread to discuss the proposal by Nick Coghlan in
https://bugs.python.org/issue33039
to add __int__ and __trunc__ to a type when __index__ is defined.
Currently __int__ does not default to __index__ during class initialisation,
so both must be defined to get coherent behavior:
(cpython-venv) ➜ cpython git:(add-key-argument-to-bisect) ✗ python3
Python 3.8.0a1+ (heads/add-key-argument-to-bisect:b7aaa1adad, Feb 18
2019, 16:10:22)
[Clang 10.0.0 (clang-1000.10.44.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> class MyInt:
...     def __index__(self):
...         return 4
...
>>> int(MyInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string, a bytes-like object or a
number, not 'MyInt'
>>> math.trunc(MyInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type MyInt doesn't define __trunc__ method
>>> hex(MyInt())
'0x4'
>>> len("a"*MyInt())
4
>>> MyInt.__int__ = MyInt.__index__
>>> int(MyInt())
4
The difference in behavior is especially weird in builtins like int() and
hex().
The documentation mentions at
https://docs.python.org/3/reference/datamodel.html#object.__index__
the need to always define both __index__ and __int__:
Note: In order to have a coherent integer type class, when __index__()
is defined __int__() should also be defined, and both should return the
same value.
Nick Coghlan proposes to make __int__ default to __index__ when only the
second is defined, and asked to open a discussion on python-dev before
making any change, "as the closest equivalent we have to this right now is
the "negative" derivation, where overriding __eq__ without overriding
__hash__ implicitly marks the derived class as unhashable (look for
"type->tp_hash = PyObject_HashNotImplemented;").".
I think the proposed change makes more sense than the current behavior, and
I volunteer to implement it if it is accepted.
What do you think about this?
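To make the idea concrete, here is a minimal user-level sketch of the same derivation. The decorator name int_from_index is hypothetical and is not part of the proposal or of CPython; it only mirrors the `MyInt.__int__ = MyInt.__index__` line from the session above, applied to both __int__ and __trunc__:

```python
import math


def int_from_index(cls):
    """Hypothetical helper: fill in __int__ and __trunc__ from __index__."""
    index = getattr(cls, "__index__", None)
    if index is None:
        # Nothing to derive from; leave the class untouched.
        return cls
    for name in ("__int__", "__trunc__"):
        # Only add the method when the class does not already provide it.
        if not hasattr(cls, name):
            setattr(cls, name, index)
    return cls


@int_from_index
class MyInt:
    def __index__(self):
        return 4


print(int(MyInt()), math.trunc(MyInt()), hex(MyInt()))
```

The proposal itself would do this inside type creation, so no decorator would be needed.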
Hello all,
A couple months back, I reported bpo-35155[1] and I submitted a PR for
consideration[2]. After a couple of reviews, it seems like progress has
stalled. Would it be possible for someone to review this?
Thanks,
Denton
[1]: https://bugs.python.org/issue35155
[2]: https://github.com/python/cpython/pull/10313
Hello, folks.
I'm working on a compact and ordered set implementation.
It has an internal data structure similar to the new dict from Python 3.6.
It is still a work in progress. Comments, tests, and documents
should be updated. But it passes existing tests, excluding
test_sys and test_gdb (both tests check implementation details):
https://github.com/methane/cpython/pull/16
Before completing this work, I want to evaluate it.
The following are my current thoughts about the compact ordered set.
## Preserving insertion order
Order is not fundamental for a set. There is no order in a set in the
math world.
But it is sometimes convenient in the real world. For example, it makes
doctests easy. When writing a set to logs, we can use the "grep" command
if the print order is stable. pyc is stable without the PYTHONHASHSEED=0 hack.
Additionally, consistency with dict is desirable. It removes one pitfall for
new Python users. The "remove duplicated items from a list" idiom becomes
`list(set(duplicated))` instead of `list(dict.fromkeys(duplicated))`.
## Memory efficiency
A hash table has a dilemma. To reduce the collision rate, a hash table
should be sparse. But that wastes memory.
Since the current set is optimized for both hit and miss cases,
it is more sparse than dict. (It is a bit surprising that a set typically
uses more memory than a dict of the same size!)
The new implementation partially solves this dilemma. It has a sparse
"index table" whose items are small (1 byte when table size <= 256,
2 bytes when table size <= 65536), and a dense entry table (each item
has a key and a hash, which is 16 bytes on a 64-bit system).
I use 1/2 for the capacity rate for now. So the new implementation is
memory efficient when len(s) <= 32768. But memory efficiency is
roughly equal to the current implementation when 32768 < len(s) <= 2**31,
and worse than the current implementation when len(s) > 2**31.
Here is a quick test of memory usage.
https://gist.github.com/methane/98b7f43fc00a84964f66241695112e91
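As a rough back-of-the-envelope model of the layout described above (not taken from the actual patch): the 1-byte and 2-byte index widths and the 16-byte entries are from the text; the 4-byte and 8-byte widths, the minimum table size of 8, power-of-two growth, and both function names are assumptions made for illustration.

```python
def index_width(table_size):
    """Bytes per index-table slot for a given table size."""
    if table_size <= 256:
        return 1
    if table_size <= 65536:
        return 2
    if table_size <= 2**32:
        return 4  # assumption: not stated in the text
    return 8      # assumption: not stated in the text


def compact_set_bytes(n, entry_size=16):
    """Estimate index table + entry table bytes for a set of n items."""
    table_size = 8  # assumed minimum table size
    # Capacity rate 1/2: the entry table holds at most table_size // 2 items.
    while table_size // 2 < n:
        table_size *= 2
    index_bytes = table_size * index_width(table_size)
    entry_bytes = (table_size // 2) * entry_size
    return index_bytes + entry_bytes


for n in (100, 32768, 32769):
    print(n, compact_set_bytes(n))
```

Under these assumptions, crossing len(s) == 32768 is where the index width goes from 2 to 4 bytes, which matches the threshold given above.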
## Performance
pyperformance result:
$ ./python -m perf compare_to master.json oset2.json -G --min-speed=2
Slower (3):
- unpickle_list: 8.48 us +- 0.09 us -> 12.8 us +- 0.5 us: 1.52x slower (+52%)
- unpickle: 29.6 us +- 2.5 us -> 44.1 us +- 2.5 us: 1.49x slower (+49%)
- regex_dna: 448 ms +- 3 ms -> 462 ms +- 2 ms: 1.03x slower (+3%)
Faster (4):
- meteor_contest: 189 ms +- 1 ms -> 165 ms +- 1 ms: 1.15x faster (-13%)
- telco: 15.8 ms +- 0.2 ms -> 15.3 ms +- 0.2 ms: 1.03x faster (-3%)
- django_template: 266 ms +- 6 ms -> 259 ms +- 3 ms: 1.03x faster (-3%)
- unpickle_pure_python: 818 us +- 6 us -> 801 us +- 9 us: 1.02x faster (-2%)
Benchmark hidden because not significant (49)
unpickle and unpickle_list show a massive slowdown. I suspect this slowdown
is not caused by the set change. Linux perf shows that many page faults happen
in pymalloc_malloc. I think the memory usage change accidentally hits a weak
point of pymalloc. I will try to investigate it.
On the other hand, meteor_contest shows a 13% speedup. It uses set.
The others don't show significant performance changes.
I need to write more benchmarks for various set workloads.
I expect the new set to be faster on simple creation, iteration and destruction.
In particular, sequential iteration and deletion will reduce cache misses.
(e.g. https://bugs.python.org/issue32846 )
On the other hand, the new implementation will be slow in complex
(heavy random add & del) cases.
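A minimal sketch of what such a micro-benchmark could look like, using only the stdlib timeit module. The function name, workload names, and sizes are made up for illustration; this is not the pyperformance setup used above, and the random add/delete workload is only a simple stand-in for a "complex" case:

```python
import random
import timeit


def bench_set_workloads(n=10_000, repeat=3):
    """Time a few simple set workloads; returns best seconds per 10 runs."""
    items = list(range(n))
    shuffled = items[:]
    random.Random(0).shuffle(shuffled)  # fixed seed for repeatable order
    existing = set(items)

    def create():
        set(items)

    def iterate():
        for _ in existing:
            pass

    def random_add_del():
        # Add every item in shuffled order, then delete in the same order.
        t = set()
        for x in shuffled:
            t.add(x)
        for x in shuffled:
            t.discard(x)

    workloads = {
        "create": create,
        "iterate": iterate,
        "random_add_del": random_add_del,
    }
    return {
        name: min(timeit.repeat(fn, number=10, repeat=repeat))
        for name, fn in workloads.items()
    }


if __name__ == "__main__":
    for name, seconds in bench_set_workloads().items():
        print(f"{name}: {seconds:.6f} s")
```

Running it once on each build gives numbers to compare by hand; for anything publishable, a dedicated benchmarking tool with proper statistics is a better fit.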
-----
Any comments are welcome. And any benchmarks for set workloads
are very welcome.
Regards,
--
INADA Naoki <songofacandy(a)gmail.com>
PEP 394 says:
> This recommendation will be periodically reviewed over the next few
> years, and updated when the core development team judges it
> appropriate. As a point of reference, regular maintenance releases
> for the Python 2.7 series will continue until at least 2020.
I think it's time for another review.
I'm especially worried about the implication of these:
- If the `python` command is installed, it should invoke the same
version of Python as the `python2` command
- scripts that are deliberately written to be source compatible
with both Python 2.x and 3.x [...] may continue to use `python` on
their shebang line.
So, to support scripts that adhere to the recommendation, Python 2
needs to be installed :(
Please see this PR for details and a suggested change:
https://github.com/python/peps/pull/893
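To see what a given system actually provides, a small sketch like the following can help. The function name is hypothetical; shutil.which is the stdlib lookup for a command on PATH:

```python
import shutil


def resolve_python_commands(names=("python", "python2", "python3")):
    """Map each command name to its path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in resolve_python_commands().items():
        print(f"{name}: {path or 'not installed'}")
```

On a system without Python 2, a script with a `python` shebang only works if something else provides that command.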
My focus is AIX - and I believe I found a bug in AIX include files in
64-bit mode. I'll take that up with IBM and AIX support. However, this
issue might also be valid in Python3.
The following is from Centos, not AIX
Python 2.7.5 (default, Jul 13 2018, 13:06:57)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxsize
9223372036854775807
>>> import posix
>>> posix.stat("/tmp/xxx")
posix.stat_result(st_mode=33188, st_ino=33925869, st_dev=64768L,
st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1550742595,
st_mtime=1550742595, st_ctime=1550742595)
>>> st=posix.stat("/tmp/xxx")
>>> dev=st.st_dev
>>> min=posix.minor(dev)
>>> maj=posix.major(dev)
>>> min,max
(0, <built-in function max>)
>>> min
0
>>> max
<built-in function max>
>>> maj
253
>>> posix.minor(dev)
0
>>> posix.major(655536)
2560
>>> posix.major(65536)
256
>>> posix.major(256)
1
>>> import os
>>> os.major(256)
1
>>>
In AIX - 64-bit mode
Python 3.8.0a1+ (heads/master:e7a4bb554e, Feb 20 2019, 18:40:08) [C] on aix7
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys,os,posix
>>> sys.maxsize
9223372036854775807
>>> posix.major(256)
0
>>> posix.major(65536)
1
>>> posix.stat("/tmp/xxx")
os.stat_result(st_mode=33188, st_ino=12, st_dev=-9223371993905102841,
st_nlink=1, st_uid=202, st_gid=1954, st_size=0, st_atime=1550690105,
st_mtime=1550690105, st_ctime=1550690105)
AIX 32-bit:
root@x066:[/data/prj/python/git/python3-3.8.0.66]./python
Python 3.8.0a1+ (heads/master:e7a4bb554e, Feb 19 2019, 11:22:56) [C] on aix6
Type "help", "copyright", "credits" or "license" for more information.
>>> import os,sys,posix
>>> sys.maxsize
2147483647
>>> posix.major(65536)
1
>>> posix.stat("/tmp/xxx")
os.stat_result(st_mode=33188, st_ino=149, st_dev=655367, st_nlink=1,
st_uid=0, st_gid=0, st_size=0, st_atime=1550743517, st_mtime=1550743517,
st_ctime=1550743517)
To make it easier to view:
buildbot@x064:[/home/buildbot]cat osstat.c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <stdio.h>
main()
{
    dev_t dev;
    char *path = "/tmp/xxx";
    struct stat st;
    int minor,major;
    lstat(path,&st);
    printf("size: %d\n", sizeof(st.st_dev));
    dev = st.st_dev;
    minor = minor(dev);
    major = major(dev);
    printf("%016lx %ld %ld\n",dev,dev, (unsigned) dev);
    printf("%d,%d\n",major,minor);
}
buildbot@x064:[/home/buildbot]OBJECT_MODE=32 cc osstat.c -o osstat-32 &&
./osstat-32
size: 4
00000000000a0007 655367 655367
10,7
And here is the AIX behavior (and bug - major() macro!)
buildbot@x064:[/home/buildbot]OBJECT_MODE=64 cc osstat.c -o osstat-64 &&
./osstat-64
size: 8
8000000a00000007 -9223371993905102841 7
0,7
The same on AIX 6 (above is AIX7) - and also with gcc:
root@x068:[/data/prj]gcc -maix64 osstat.c -o osstat-64 && ./osstat-64
size: 8
8000000a00000007 -9223371993905102841 42949672967
0,7
root@x068:[/data/prj]gcc -maix32 osstat.c -o osstat-32 && ./osstat-32
size: 4
00000000000a0007 655367 0
10,7
root@x068:[/data/prj]
So, the AIX 'bug' with the major() macro has been around for ages, and so
has the setting of the MSB of st_dev in 64-bit mode.
+++++
Now my question:
Will this continue to be enough space - i.e., is the Dev size going to
be enough?
#ifdef MS_WINDOWS
    PyStructSequence_SET_ITEM(v, 2, PyLong_FromUnsignedLong(st->st_dev));
#else
    PyStructSequence_SET_ITEM(v, 2, _PyLong_FromDev(st->st_dev));
#endif

#define _PyLong_FromDev PyLong_FromLongLong
It seems so. However, is there something such as PyUnsignedLong, and is
that large enough for a "long long"? And if it exists, would that make
the value positive (for the first test)?
posix.major and os.major will need to mask away the MSB and
posix.makedev and os.makedev will need to add it back.
OR - do I need to make the PyStat values "the same" in both 32-bit and
64-bit?
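The masking idea can be sketched in pure Python. The bit layout used here (MSB as a flag, major in the remaining upper 32 bits, minor in the lower 32 bits) is only inferred from the 64-bit dump 8000000a00000007 and the 10,7 result of the 32-bit run above; it is an assumption, not documented AIX behavior. The names split_dev and join_dev are hypothetical:

```python
MASK64 = (1 << 64) - 1
MSB = 1 << 63


def split_dev(st_dev):
    """Return (major, minor) from a 64-bit st_dev, ignoring the MSB."""
    raw = st_dev & MASK64  # reinterpret a negative value as unsigned 64-bit
    raw &= MSB - 1         # mask away the MSB
    return raw >> 32, raw & 0xFFFFFFFF


def join_dev(major, minor):
    """Rebuild the signed 64-bit st_dev, adding the MSB back."""
    raw = MSB | (major << 32) | minor
    return raw - (1 << 64)  # back to the signed value stat reports


st_dev = -9223371993905102841  # value from the 64-bit AIX session above
print(split_dev(st_dev))
print(join_dev(*split_dev(st_dev)) == st_dev)
```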
I am puzzled about which approach is correct. What do you think?
Michael