The following test fails because `seq1 == seq2` returns a (boolean) NumPy array
whenever either seq is a NumPy array.
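The test itself did not survive quoting here, so this is a minimal
reconstruction of the kind of case that fails, assuming a simple
one-dimensional array:

import unittest
import numpy as np

class TestSeq(unittest.TestCase):
    def test_sequence_equal(self):
        # unittest internally evaluates `if seq1 == seq2:`; with a NumPy
        # array the comparison returns a boolean array, whose truth value
        # is ambiguous, so the assertion errors out with a ValueError.
        self.assertSequenceEqual(np.array([1, 2, 3]), [1, 2, 3])

if __name__ == "__main__":
    unittest.main()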
I expected `unittest` to rely only on features of a `collections.abc.Sequence`,
which, based on https://docs.python.org/3/glossary.html#term-sequence,
I believe are satisfied by a NumPy array. Specifically, I see no requirement
that a sequence implement __eq__ at all, much less in any particular way.
In short: a test named `assertSequenceEqual` should, I would think,
work for any sequence and therefore (based on the available documentation)
should not depend on the class-specific implementation of __eq__.
Is that wrong?
Thank you, Alan Isaac
I would easily bet 10 bucks that vars() is the least known, and least
used, of the Python builtin functions.
My mental model of it was: it was introduced (perhaps in Python 3) to
"harmonize" all the existing .__dict__ stuff and provide a more
abstract interface for it, with .__dict__ patiently awaiting its
deprecation in CPython 10 or something.
Was I wrong to think that? I found that vars() was already in the docs of
CPython 1.4, the earliest version whose documentation is still hosted online.
So it seems it was around "always", but then the question is why both
vars() and .__dict__ have been around for so long, and why there was no
(successful) attempt to stop breaking object encapsulation, e.g. on
transitioning to Python 3000. Does anybody have any references to the story
of them? I can't find anything relevant so far. I in particular made sure
to google for "vars site:python-history.blogspot.com", with 0 hits.
To clarify what I mean by ".__dict__ breaking object encapsulation":
usually, object attributes are used to access "some small piece of data"
"stored inside" an object. .__dict__ is not like that. Depending on how
you look at it, it's either:
1. *Metadata* about an object (the object's representation as a dict),
in which case it makes sense to use a dedicated access means for that
metadata, which is exactly what vars() provides.
2. A leak of an implementation detail, because a particular Python
implementation uses a dict as the internal storage for modules, classes,
and instances, and then .__dict__ just gives blatant access to this
internal storage, bypassing normal object semantics.
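For the record, the two are literally the same object today, and both
bypass normal attribute semantics; a quick sketch:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(vars(p))                 # {'x': 1, 'y': 2}
print(vars(p) is p.__dict__)   # True -- vars() returns the same dict
vars(p)["z"] = 3               # mutates the storage directly,
print(p.z)                     # 3 -- bypassing __setattr__ entirely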
I'm trying to find a reviewer for this trivial PR: https://github.com/python/cpython/pull/20530
(The PR fixes CheckTraceCallbackContent (in the sqlite3 test suite) for SQLite pre 3.7.15.)
I've given up hunting for alternative reviewers (a process I find very uncomfortable, since I feel I'm just bugging busy people with things they're not interested in), so, as a last resort, I'm trying the mailing list:
Who can help me review code that touches the sqlite3 module code base?
I've received a lot of constructive reviews from Victor Stinner, Dong-hee Na, and Pablo Galindo on my past sqlite3 PRs; thank you so much! I still feel uncomfortable requesting their review, as none of them are sqlite3 module maintainers.
I was directed to post this request to the general Python development community so hopefully this is on topic.
One of the weaknesses of the PyUnicode implementation is that the type is concrete, and there is no option for an abstract proxy string to a foreign source. This is an issue for an API like JPype, in which java.lang.String objects are passed back from Java. Ideally these would be a type derived from the Unicode type str, but that requires transferring the memory immediately from Java to Python, even when that handle is large and will never be accessed from within Python. For certain operations like XML parsing this can be prohibitive, so instead of returning a str we return a JString. (There is a separate issue that Java method names and Python method names conflict, so direct inheritance creates some problems.)
The JString type can of course be transferred to Python space at any time, as both Python Unicode and Java String objects are immutable. However, the CPython APIs that take strings only accept Unicode type objects, which have a concrete implementation. It is possible to extend strings, but those extensions do not allow for proxying as far as I can tell. Thus there is currently no option to proxy to a string representation in another language. The concept of using the duck-typed ``__str__`` method is insufficient, as it indicates that an object can become a string, rather than that "this object is effectively a string" for the purposes of the CPython API.
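To illustrate that distinction (JStringLike here is a hypothetical
stand-in for the JPype proxy, not its real implementation):

import re

class JStringLike:
    def __str__(self):
        # the real proxy would transfer the characters from Java here
        return "hello"

s = JStringLike()
print(str(s))         # "hello" -- the object can *become* a string
re.match("h.*", s)    # TypeError: expected string or bytes-like object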
One way to address this is to use the currently outdated concept of READY to extend Unicode objects to other languages. A class like JString would be an unready Unicode object which, when READY is called, transfers the memory from Java, sets up the flags, and sets up a pointer to the code point representation. Unfortunately the READY concept is scheduled for removal, and thus the chance to address the need for proxying a Unicode object to another language's representation may be limited. There may be other methods to accomplish this without the concept of READY. So long as access to the code points goes through the Unicode API, and the Unicode object can be extended such that the actual code points may be located outside of the Unicode object, a proxy can still be achieved if there are hooks to decide when a transfer should be performed. Generally the transfer only needs to happen once, but the key issue is that neither the number of code points nor the kind of code points will be known until the memory is transferred.
Java has much the same problem. Although it defines an interface, java.lang.CharSequence, the actual java.lang.String class is concrete, and almost all API methods take a String rather than the base interface, even when the base interface would have been adequate. Thus, just as Python has difficulty treating a foreign string class as it would a native one, Java cannot treat a Python string as a native one either. So Python strings get represented as a CharSequence type, which greatly limits their use.
* A string proxy would need the address of the memory in the "wstr" slot, though the code points may be char, wchar, or int depending on the representation in the proxy.
* API calls that interpret the data would need to check whether the data has been transferred first; if not, they would call the proxy-dependent transfer method, which is responsible for creating a block of code points and setting up the flags (kind, ascii, ready, and compact).
* The memory block allocated would need to call the proxy-dependent destructor to clean up when the string is done.
* It is not clear whether this would have an impact on performance. Python already has the concept of a string which needs actions before it can be accessed, but this is scheduled for removal. (A rough Python-level sketch of the lazy-transfer idea follows below.)
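The C-level details aside, here is that rough Python-level sketch of the
lazy-transfer idea (all names hypothetical):

class LazyString:
    def __init__(self, fetch):
        self._fetch = fetch    # callable performing the expensive transfer
        self._value = None     # "unready" until first accessed

    def _materialize(self):
        if self._value is None:
            # one-time transfer; only now are the number and
            # kind of code points known
            self._value = self._fetch()
        return self._value

    def __str__(self):
        return self._materialize()

    def __len__(self):
        return len(self._materialize())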
Are there any plans currently to address the concept of a proxy string in PyUnicode API?
I know this is a high-volume list so sorry for being yet another voice
screaming into your mailbox, but I do not know how else to handle this.
A few months ago, I opened a pull request fixing packed bitfields in
ctypes structs/unions, which right now are being created incorrectly. The
offsets and sizes are all wrong in some cases, and fields which should be
packed end up with padding between them.
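For illustration, a hypothetical layout of the kind involved (not the
exact case from the PR):

import ctypes

class Packed(ctypes.Structure):
    _pack_ = 1                     # request no padding between fields
    _fields_ = [
        ("a", ctypes.c_uint8, 4),  # 4-bit field
        ("b", ctypes.c_uint8, 4),  # 4-bit field, should share a's byte
        ("c", ctypes.c_uint16),    # should then start at offset 1
    ]

print(ctypes.sizeof(Packed))       # 3 expected with packing
print(Packed.c.offset)             # 1 expected

The PR fixes cases where the computed offsets and sizes disagree with
what a C compiler produces for the same packed declaration.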
Since ctypes has no maintainer, it has been very hard to track down
someone who is up to reviewing this. So, if this issue sparks your
interest, or you just want to help me out, please have a look :)
Should this be considered a bug in the Enum implementation?
>>> import enum
>>> class Foo(enum.Enum):
... A = True
... B = 1
... C = 0
... D = False
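Because True == 1 and False == 0 (and they hash equally), the later
members silently become aliases of the earlier ones:

>>> Foo.B
<Foo.A: True>
>>> Foo.D
<Foo.C: 0>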
Seems to me like it should store and compare both type and value.
PEP 634/5/6 presents a possible implementation of pattern matching for Python.
Much of the discussion around PEP 634, and PEP 622 before it, seems to
imply that PEP 634 is synonymous with pattern matching; that if you
reject PEP 634 then you are rejecting pattern matching.
That simply isn't true.
Can we discuss whether we want pattern matching in Python and
the broader semantics first, before dealing with low level details?
Do we want pattern matching in Python at all?
Pattern matching works really well in statically-typed, functional languages.
The lack of mutability, the constrained scope, and the ability of the
compiler to distinguish let variables from constants mean that pattern
matching code has fewer errors and can be compiled efficiently.
Pattern matching works less well in dynamically-typed, functional
languages and statically-typed, procedural languages.
Nevertheless, it works well enough for it to be a popular feature in
both Erlang and Rust.
In dynamically-typed, procedural languages, however, it is not clear (at
least not to me) that it works well enough to be worthwhile.
That is not to say that pattern matching could never be of value in Python,
but PEP 635 fails to demonstrate that it can (although it does a better
job than PEP 622).
Should match be an expression, or a statement?
Do we want a fancy switch statement, or a powerful expression?
Expressions have the advantage of not leaking (like comprehensions in
Python 3), but statements are easier to work with.
Can pattern matching make it clear what is assigned?
Embedding the variables to be assigned into a pattern makes the pattern
concise, but requires discarding normal Python syntax and inventing a
new sub-language. Could we make patterns fit Python better?
Is it possible to make assignment to variables clear, and unambiguous,
and allow the use of symbolic constants at the same time?
I think it is, but PEP 634 fails to do this.
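As a concrete example of the ambiguity, under PEP 634's proposed rules a
bare name in a pattern is a capture, not a constant lookup:

RED = 1

def describe(color):
    match color:
        case RED:    # binds the name RED to `color`; it does NOT compare to 1
            return "red"
    return "other"

print(describe(2))   # prints "red" -- every value matches the capture

To match against a constant, PEP 634 requires a dotted name such as
Color.RED, which is the kind of special-casing at issue here.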
How should pattern matching be integrated with the object model?
What special method(s) should be added? How and when should they be called?
PEP 634 largely disregards the object model, meaning it has many special
cases, and is inefficient.
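For reference, the main hook PEP 634 does specify is the __match_args__
class attribute, which maps positional sub-patterns in a class pattern
to attribute names; everything else goes through isinstance() and
ordinary attribute access:

class Point:
    __match_args__ = ("x", "y")   # positional patterns read .x then .y
    def __init__(self, x, y):
        self.x = x
        self.y = y

match Point(1, 2):
    case Point(x, 0):             # isinstance check, then .y == 0
        print("on the x axis")
    case Point(x, y):
        print(f"at ({x}, {y})")   # prints "at (1, 2)"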
The semantics must be well defined.
Language extension PEPs should define the semantics of those
extensions. For example, PEP 343 and PEP 380 both did.
PEP 634 just waves its hands and talks about undefined behavior, which
is not acceptable for a core language feature.
I would ask anyone who wants pattern matching added to Python not to
support PEP 634.
PEP 634 just isn't a good fit for Python, and we deserve something better.
Many open-source developers commit to several projects, developing packages for Python. A fraction of these developers might need funds to continue developing their projects and to keep contributing. Sadly, there is no direct, short way for donors to fund Python package developers: there is no such function in the Python package installer (pip) or on the website.
A command-line subcommand (pip3 fund) for listing the funding pages of the developers of installed packages (global or per-environment) would help donors reach such developers easily. The funding page could be a GitHub Sponsors page or a PayPal redirect (under the developer's control).
A similar implementation is the "npm fund" command, released 4-5 months ago, which retrieves information on how to fund the dependencies of a given project and has gained momentum.
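Packages can already declare funding links in their metadata via
Project-URL entries, so here is a rough sketch of how such a subcommand
could gather them (a sketch only, not pip's actual machinery):

from importlib import metadata

for dist in metadata.distributions():
    for url in dist.metadata.get_all("Project-URL") or []:
        label, _, link = url.partition(",")
        if label.strip().lower() in {"funding", "donate", "sponsor"}:
            print(f"{dist.metadata['Name']}: {link.strip()}")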
On Tue, Dec 22, 2020 at 11:44 PM Alan G. Isaac <alan.isaac(a)gmail.com> wrote:
> My question is not about work arounds.
> It is about whether the current definition of a sequence
> (in collections.abc) should govern `assertSequenceEqual`.
Why do you think a NumPy array is a sequence?
>>> len(a) == len(b)
Ok, I suppose if you then loop through indices, you indeed get different
things. But N-dimensional arrays aren't "indexed by an integer" except in
the sense of an edge case. I've always read the Glossary as more narrowly
"element access using ONLY integer indices."
Moreover, if you propose to ignore `==` comparison and always loop, will
you do so recursively? What about:
assertSequenceEqual(a, [[1,2], [3,4]])
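For instance, a 2-D array supports integer indexing only along the first
axis, and its natural indexing goes beyond what the Sequence protocol
describes:

import numpy as np

a = np.array([[1, 2], [3, 4]])
print(len(a))      # 2 -- only counts the first axis
print(a[0])        # array([1, 2]) -- a row, not a scalar element
print(a[0, 1])     # 2 -- tuple indexing, outside the Sequence protocol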