TL;DR: A new built-in attribute whose purpose is to provide a simple way for developers to detect the Python implementation (CPython, Jython, IronPython, PyPy, etc.), among other information.
Ok, so the reason I'm suggesting this is for another suggestion I'll submit at a later date (once I feel this one has run its course, or the contributors decide about it).
The goal of this attribute (as mentioned above) is to give developers quick and simple information about the Python runtime, such as whether it's running on CPython or PyPy, and other details.
Key information this attribute provides: the implementation's name (e.g. CPython), the implementation's version (which may be independent of Python's), and the Python version (e.g. 3.10).
Optional information can include the platform's name/architecture, and whether it's a JIT environment, an interpreter, or both.
This attribute is also flexible, so implementors can provide extra fields showing, for example, whether it mimics another implementation, or information unique to (or mimicked by) it.
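For a concrete picture, part of this information is already reachable today via sys.implementation and sys.version_info; the proposed attribute would bundle it (plus the optional extras) in one place. A quick sketch of what is available now:

import sys

print(sys.implementation.name)      # e.g. "cpython" or "pypy"
print(sys.implementation.version)   # the implementation's own version info
print(sys.version_info[:2])         # language version, e.g. (3, 10)
print(sys.platform)                 # platform name, e.g. "linux"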
Another TL;DR: I'm not great at getting ideas across, but I hope you get the idea behind this.
collections.Counter has a most_common([n]) method which returns the n most
common elements of the counter, but in case of a tie the result is
unspecified --- whereas in practice insertion order breaks the
tie. For example:
>>> Counter(["a","a","b","a","b","c","c","d"]).most_common(2)
[('a', 3), ('b', 2)]
>>> Counter(["a","a","c","a","b","b","c","d"]).most_common(2)
[('a', 3), ('c', 2)]
In some cases (which I believe are not rare) you would like to break
the tie yourself or get the top elements by *rank*. Using our example:
Rank  Elements
0     {"a"}
1     {"b", "c"}
2     {"d"}
I propose a new method top_n(n) that returns the top elements in the
first n ranks. For example:
>>> Counter(["a","a","b","a","b","c","c","d"]).top_n(0)
[('a', 3)]
>>> Counter(["a","a","b","a","b","c","c","d"]).top_n(1)
[('a', 3), ('b', 2), ('c', 2)]
>>> Counter(["a","a","b","a","b","c","c","d"]).top_n(2)
[('a', 3), ('b', 2), ('c', 2), ('d', 1)]
>>> Counter(["a","a","b","a","b","c","c","d"]).top_n(99)
[('a', 3), ('b', 2), ('c', 2), ('d', 1)]
>>> Counter(["a","a","b","a","b","c","c","d"]).top_n(-1)
[]
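For reference, a rough sketch of how this could be built on top of most_common() (the behaviour for negative n here is just one possible choice):

from itertools import groupby

def top_n(counter, n):
    # Sketch only: return the elements in ranks 0..n, where each rank is a
    # distinct count. most_common() is already sorted by count, descending.
    if n < 0:
        return []
    result = []
    ranks = groupby(counter.most_common(), key=lambda pair: pair[1])
    for rank, (count, group) in enumerate(ranks):
        if rank > n:
            break
        result.extend(group)
    return result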
Some points to discuss:
* What should the return type be? A list of tuples like most_common(),
or List[Tuple[int, List[T]]] that conveys the rank information too?
Each tuple is a rank, whose first element is the frequency and whose
second element is the list of elements, e.g. [(3, ['a']), (2, ['b',
'c']), (1, ['d'])].
* Should rank start at 0 or 1?
* Shall negative numbers raise an exception or return an empty list
like most_common()?
I would love to hear your opinion on this, and if there is interest, I
am happy to try to implement it too.
Regards,
Bora M. Alper
https://boramalper.org/
How about enabling the subscription operator (`[]`) for generator expressions? Also for `zip()`, `key()`, etc. They could be evaluated lazily in the background, only up to the requested index, to avoid evaluating the whole expression into something like a list or tuple and then indexing it.
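For comparison, something close to this can already be spelled with itertools.islice, which consumes the generator only up to the requested position (example mine):

from itertools import islice

squares = (x * x for x in range(10**9))

# Roughly what squares[5] could mean under the proposal:
item = next(islice(squares, 5, None))
print(item)   # 25 -- only the first six values were ever produced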
Hello,
This is my first mail here, so greetings for everyone :)
I would like to introduce an idea of the new operators for dict objects.
Any thoughts are welcome.
Since Python3.9 there is a great feature for merging dicts using | (OR)
operator:
{1: "a"} | {2: "b"} == {1: "a", 2: "b"}
Thus, this basic operation can be done smoothly (also inline).
PEP: https://www.python.org/dev/peps/pep-0584/
Another common operation on dicts is subsetting - building a new dict with
only a part of the data. Usually that part is described as a list
of keys to keep or a list of keys to skip.
Therefore it would be very handy to have a built-in option to filter a
dict in a similar fashion, for example using the & (AND) operator against a
list/tuple/set/frozenset of keys that should be kept in the result:
{1: "a", 2: "b", 3: "c"} & [1, 3, 4] == {1: "a", 3: "c"}
{1: "a", 2: "b", 3: "c"} & {1, 3, 4} == {1: "a", 3: "c"}
Using the & operator:
dict_object & list_of_keys
would be equal to the following expression:
{key: value for key, value in dict_object.items() if key in list_of_keys}
Additionally, a similar option for omitting specified keys could be provided
with the - (minus) operator against a list/tuple/set/frozenset of keys that
should be omitted:
{1: "a", 2: "b", 3: "c"} - [3, 4] == {1: "a", 2: "b"}
{1: "a", 2: "b", 3: "c"} - {3, 4} == {1: "a", 2: "b"}
Using the - operator:
dict_object - list_of_keys
would be equal to the following expression:
{key: value for key, value in dict_object.items() if key not in list_of_keys}
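To make the intended semantics concrete, here is a rough emulation with a dict subclass (illustration only, not part of the proposal):

class FilterableDict(dict):
    # Illustrative only: emulates the proposed & and - semantics.
    def __and__(self, keys):
        keys = set(keys)
        return FilterableDict({k: v for k, v in self.items() if k in keys})

    def __sub__(self, keys):
        keys = set(keys)
        return FilterableDict({k: v for k, v in self.items() if k not in keys})

d = FilterableDict({1: "a", 2: "b", 3: "c"})
assert d & [1, 3, 4] == {1: "a", 3: "c"}
assert d - {3, 4} == {1: "a", 2: "b"}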
Best,
Tomasz
I'm in favor of keeping the PEP as it currently is. Mappings are naturally
structural subtypes of one another, therefore mapping patterns should be
consistent with class patterns.
car = Car(...)

match car:
    case Vehicle():
        pass
    case Car():  # will never match
        pass
This example is analogous to the first one in the discussion. If Car is a
subclass of Vehicle, then Vehicle() is a more general pattern than Car()
and will always match despite the instance not being exactly of type
Vehicle. With mapping patterns it's exactly the same thing. You need to
match the more specific patterns first. Matching x and y is more general
than matching x, y and z.
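A quick sketch (mine, not from the PEP) of that ordering point with a mapping pattern:

point = {"x": 1, "y": 2, "z": 3}

match point:
    case {"x": x, "y": y, "z": z}:   # more specific: must come first
        print("3D:", x, y, z)
    case {"x": x, "y": y}:           # more general: also matches 3D points
        print("2D:", x, y)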
One more thing. If we compare the proposed behavior to other languages, the
most relevant example would be object destructuring in JavaScript:
const { 'x': x, 'y': y } = { 'x': 1, 'y': 2, 'z': 3 };
const { x, y } = { x: 1, y: 2, z: 3 }; // more common short form
Object destructuring only matches the specified fields. You can also match
the remaining fields but it's always explicit:
const { x, y, ...rest } = { x: 1, y: 2, z: 3 };
console.log(rest); // { z: 3 }
The pattern is widely adopted and the behavior generally lines up with
people's expectations.
I am working on a toolbox for computer-archaeology where old data media are "excavated" and presented on a web-page. (https://github.com/Datamuseum-DK/AutoArchaeologist for anybody who cares).
Since these data media can easily sum to tens of gigabytes, mmap and virtual memory are my weapons of choice, and that has brought me into an obscure corner of Python where few people seem to venture: I want to access the buffer-protocol from "userland".
The fundamental problem is that if I have an image of a disk and it has 2 partitions, I end up with the "mmap.mmap" object that mapped the raw disk image, and two "bytes" or "bytearray" objects, each containing one partition, for a total memory footprint of twice the size of the disk.
As the tool dives into the filesystems in the partitions and creates objects for the individual files in the filesystem, that grows to three times the size of the disk etc.
To avoid this, I am writing a "bytes-like" scatter-gather class (not yet committed), and that is fine as far as it goes.
If I want to write one of my scatter-gather objects to disk, I have to:
fd.write(bytes(myobj))
As a preliminary point, I think that is just wrong: A class with a __bytes__ method should satisfy any needs the buffer-protocol might have, so this should work:
fd.write(myobj)
But taking this a little bit further, I think __bytes__ should be allowed to be an iterator, provided the object also offers __len__, so that this would work:
class bar:
    def __len__(self):
        return 3
    def __bytes__(self):
        yield b'0'
        yield b'1'
        yield b'2'

open("/tmp/_", "wb").write(bar())
This example is of course trivial, but have the yield statements hand out hundreds of megabytes, and the savings in time and malloc-space become very tangible.
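For concreteness, a bare-bones sketch (my illustration, not the actual class) of the kind of scatter-gather object meant here, and the piecewise-write workaround it currently needs:

class ScatterGather:
    # Illustration only: holds bytes-like pieces (e.g. memoryview slices of
    # an mmap) without copying them into one big buffer.
    def __init__(self, *chunks):
        self.chunks = chunks

    def __len__(self):
        return sum(len(c) for c in self.chunks)

    def __bytes__(self):
        return b"".join(self.chunks)   # the copy the proposal wants to avoid

    def write_to(self, fd):
        for chunk in self.chunks:      # current workaround: write piecewise
            fd.write(chunk)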
Poul-Henning
Hello,
On Mon, 16 Nov 2020 08:39:30 +1100
Steven D'Aprano <steve(a)pearwood.info> wrote:
[]
> > The baseline of my version is much simpler:
> >
> > # This makes "const" a kind of hard keyword for this module
> > from __future__ import const
> >
> > FOO: const = 1 # obviously, this is constant
> Oh, well,
To start with, in the original thread I wanted to concentrate on issues
of PEP634/PEP635, and whether those issues are prominent enough to
hope for them to be addressed. So, I changed the subject (and we'd soon
be redirected to python-ideas anyway; indeed, I cc: there).
> if all it takes is to add a new keyword, then constants are
> easy!
A new annotation. And the new subject is "constants in Python: Starting
simple and gradually adding more", which hopefully should set the
theme. Please remember how we arrived here: it's from the fact that
PEP634 doesn't allow for the following trivial code:
MY_CONST = 1

match foo:
    case MY_CONST:
        ...
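For readers joining from python-ideas, a quick illustration (mine) of why that snippet doesn't do what it looks like under PEP634: a bare name in a case clause is a capture pattern, while a dotted name is a value pattern:

import enum

class Color(enum.Enum):
    RED = 1

MY_CONST = 1

match 2:
    case MY_CONST:       # capture pattern: always matches and rebinds MY_CONST
        print(MY_CONST)  # prints 2

match Color.RED:
    case Color.RED:      # value pattern: a dotted name is compared by ==
        print("matched by value")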
We're considering (another alternative) how to address that. Please
don't make a leap towards supporting (at once) const-ness equivalent to
statically typed languages.
> No need to worry about how constantness affects name resolution,
> or how the syntax interacts with type-hints:
"const" is *the* type hint.
> spam: const: Widget = build_widget(*args)
That syntax is clearly invalid. And composability of type annotations
(aka type hints) is a known, looming issue. We now have
https://www.python.org/dev/peps/pep-0593/ , but in all fairness, it
seems like a stopgap measure rather than an elegant solution. (In full
fairness, the entire "typing" module's annotations aren't very elegant,
but as we know, it's not a part of the language core, but part of
CPython's stdlib. That means it's only *one* of the possible annotation
schemes for *Python*.)
So, under PEP593, the above would be written
spam: Annotated[Widget, const] = build_widget(*args)
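As a side note, under that spelling the marker stays visible at runtime, so a tool could pick it up; a small self-contained illustration where "const" is just a hypothetical sentinel object:

from typing import Annotated

const = object()          # hypothetical marker object, for illustration only

class Widget:
    pass

def build_widget(*args):
    return Widget()

spam: Annotated[Widget, const] = build_widget()

# The marker is preserved in the annotation's metadata, so a checker or a
# compiler pass could notice it:
assert const in __annotations__["spam"].__metadata__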
If you want a glimpse of what alternatives might look like, then: given that
"|" is going to be used for union types, why not try "&" for
composition?
spam: Widget & const = build_widget(*args)
But again, for "initial support of constants in Python, prompted by the
introduction of pattern matching facilities", we don't need to worry
about all that right away.
> # Maybe this is better?
> # let spam: Widget = build_widget(*args)
As you remember, at the beginning of my proposal, I wrote "The baseline
of my version ...". Baseline == level 0. For "level 2", I considered
const spam: Widget = ...
I don't think that "let" should be used. We're putting emphasis on
*constantness* (not lexical scoping of variables [immutable by
functional tradition, though that's "minor"]). JavaScript (now) has
both "const" and "let", and I think that's pretty reasonable approach.
So, let's save "let" for possible later uses.
So, what's "level 1"?
As I mentioned, under "level 0", "from __future__ import const" has a
magic meaning (akin to other "from __future__"'s). Under "level 1",
"const" is just a normal annotation symbol imported from a module. So,
the following would be possible:
=========
from __future__ import const

FOO: const = 1

def fun():
    const = 1  # Just a variable in function scope

    def sub():
        # Gimme powerz back
        from __future__ import const

        BAR: const = 2

# Back to annotation in global scope
BAZ: const = 3
=========
"level 0" should be implementable in CPython in ~1hr.
"level 1" is realistically what we should shoot for.
"level 2" (the dedicated keyword), I'm still not sure about. "const" is
very baseline annotation, and deserves a dedicated keyword. But the
underlying problem is composability of (arbitrary) annotations. I'd
rather keep searching for graal in that regard, before giving up and
introduce a dedicated thing just for "const".
> or how "constant" interacts with mutability:
>
> spam: const = []
> spam.append(99) # Allowed?
> spam.extend(items)
> spam.sort()
"const" is an annotation just like any other. And it affects *runtime*
in the same as any other annotation in CPython affects it: in no way.
"const" is however introduced as a hint for *compile-time*, so compiler
could make some simple inferences and maybe even optimizations based
on it.
> or for that matter, whether or not constants are actually constant.
>
> spam: const = 1
> spam = 2
The compiler can, and thus would, catch that as an error (or a warning for
a beta version?).
> If constants aren't actually constant, but just a tag on symbols,
> then you would be right, it probably would be trivially easy to add
> "constants" to Python.
Right.
> But I think most people agree making them behave as constants is a
> pretty important feature.
As mentioned, "const" is just an annotation like any other, except
compiler has some insight into it. Dealing with runtime is distant goal
for CPython. (Which is for now offloaded to static type checkers and
libraries/alternative Python implementations.)
> *The* most critical feature of all, really.
>
> Given a declared constant:
>
> # Module a.py
> x: const = None
>
> how do you prevent not just code in module `a` from rebinding the
> value:
Outside of the current scope of the discussion.
However, if you're interested in that topic, then I've implemented it in
my Python dialect, Pycopy (https://github.com/pfalcon/pycopy). It's on
my TODO to post an RFC to python-ideas for beating.
[]
--
Best regards,
Paul mailto:pmiscml@gmail.com
Sorry, forgot to use "reply to all"
---------- Forwarded message ---------
From: André Roberge <andre.roberge(a)gmail.com>
Date: Sat, Nov 14, 2020 at 11:06 AM
Subject: Re: [Python-ideas] Re: Global flag for whether a module is __main__
To: Steven D'Aprano <steve(a)pearwood.info>
On Sat, Nov 14, 2020 at 10:45 AM Steven D'Aprano <steve(a)pearwood.info>
wrote:
> On Sat, Nov 14, 2020 at 08:10:44AM -0400, André Roberge wrote:
>
> > > What if you import the `__main__` module? What does `__imported__` say
> > > now, and how do you check for "running as a script" if `__main__` has
> > > imported itself -- or some other module has imported it?
> > >
> >
> > Running a module (no matter what its name is) from a command line would
> set
> > __imported__ to False for that module.
> > Using import some_module (or __import__("some_module")) would set
> > some_module.__imported__ to True.
>
> Do you understand that a module can be both run and imported at the
> same time?
>
>
> # example.py
> import __main__
> print(__main__.__file__)
>
As others have mentioned, many beginners are thoroughly confused by the
meaning of the idiom

if __name__ == "__main__":
    ...
The idea behind the name __imported__ (and, I gather, somewhat similar to
the original suggestion of __main__ that started this thread) is to reduce
such confusion.
For what I had in mind, the semantics would be the same as though the
following were inserted at the top of the module:

__imported__ = True
if __name__ == "__main__":
    __imported__ = False
André
> If you save that snippet as "example.py", and then run it:
>
> python3 example.py
>
> you have an example of a module that is being run and imported
> simultaneously.
>
>
> --
> Steve
Currently, the simplest and most idiomatic way to check whether a module was
run as a script rather than imported is:
if __name__ == "__main__":
People generally learn this by rote memorization, because users often want the
ability to add testing code or command line interfaces to their modules before
they understand enough about Python's data model to have any idea why this
works. Understanding what's actually happening requires you to know that:
1. the script you ask Python to run is technically a module,
2. every module has a unique name assigned to it,
3. a module's `__name__` global stores this unique import name,
4. and "__main__" is a magic name for the initial script's module.
A new (writable) global attribute called `__main__` would simplify this case,
allowing users to simply test
if __main__:
It would behave as though
__main__ = (__name__ == "__main__")
is executed in each module's namespace before the module's own code runs.
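A hypothetical usage sketch under the proposal (the `__main__` boolean below does not exist today):

# some_module.py -- hypothetical usage under the proposal

def main():
    print("running as a script")

if __main__:        # proposed per-module boolean, set by the interpreter
    main()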
Because this would be writable, I don't see any backwards compatibility issues.
It wouldn't negatively impact any modules that might already be defining
`__main__` (for example, by doing `import __main__`). They'd simply redefine it
and go on using the `__main__` module as they always have. And a package with
a `__main__.py` does not have a `__main__` attribute.
It would be easier to teach, easier to learn, and easier to memorize, and
a nice simplification for users at the cost of only very slightly more
complexity in the data model.