New submission from Steve Dower <steve.dower(a)python.org>:
Not all console configurations can correctly render smart quotes in help() text. See the "Æ" in "superclass's" below.
When building for pydoc-topics, it would be ideal to disable smart quotes. (I'm assuming from issue31793 that this can be done in configuration, though I'm not entirely sure how - it's not clear to me from those PRs)
---
>>> help("BASICMETHODS")
Basic customization
*******************
object.__new__(cls[, ...])
...
Typical implementations create a new instance of the class by
invoking the superclassÆs "__new__()" method using
"super().__new__(cls[, ...])" with appropriate arguments and then
modifying the newly-created instance as necessary before returning
it.
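A minimal sketch of how the "Æ" above can arise. It assumes the help text was written out as Windows-1252 bytes and the console decoded them as code page 437; the report does not name either code page, so both are assumptions.

```python
# Assumption: output bytes are cp1252 and the console renders them as cp437.
smart_quote = "\u2019"  # RIGHT SINGLE QUOTATION MARK, as in "superclass's"

raw = smart_quote.encode("cp1252")  # the smart quote becomes one byte
rendered = raw.decode("cp437")      # what a cp437 console shows for that byte

print(raw, ascii(rendered))

# An ASCII apostrophe has the same byte value in both code pages,
# so replacing the smart quote sidesteps the mismatch.
plain = "superclass\u2019s".replace("\u2019", "'")
print(plain)
```

Under these assumptions the byte for the smart quote is the same byte cp437 uses for "Æ", which matches the output in the report.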
----------
assignee: docs@python
components: Documentation
messages: 341107
nosy: docs@python, steve.dower
priority: normal
severity: normal
stage: needs patch
status: open
title: Remove smart quotes in pydoc text
type: enhancement
versions: Python 3.7, Python 3.8
_______________________________________
Python tracker <report(a)bugs.python.org>
<https://bugs.python.org/issue36754>
_______________________________________
New submission from kernc:
The doctest execution context documentation [0] says the tests get shallow *copies* of the module's globals, so one test can't interfere with the results of another. This makes it impossible to write literate modules such as:
"""
This module is about reusable doctests context.

Examples
--------
Let's prepare something the later examples can work with:

    >>> import foo
    >>> result = foo.Something()
    2
"""

class Bar:
    """
    Class about something.

        >>> bar = Bar(foo)
        >>> bar.uses(foo)
        True
    """
    def baz(self):
        """
        Returns 3.

            >>> result + bar.baz()
            5
        """
        return 3
I.e. one has to instantiate everything in every single test. The documentation says one can pass their own globals as `globs=your_dict`, but it doesn't mention that the dict is *cleared* after the test run.
Please acknowledge that the use case of doctests in a module sharing their environment and results sometimes legitimately exists and, to make it future-compatible, please amend the final paragraph of the relevant part of the documentation [0] like so:
You can force use of your own dict as the execution context by
passing `globs=your_dict` to `testmod()` or `testfile()` instead,
e.g., to have all doctests in a module use the _same_ execution
context (sharing variables), define a context like so:
    class Context(dict):
        def copy(self):
            return self
        def clear(self):
            pass
and use it, optionally prepopulated with `M`'s globals:
    doctest.testmod(module,
                    globs=Context(module.__dict__.copy()))
Thank you!
[0]: https://docs.python.org/3/library/doctest.html#what-s-the-execution-context
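A self-contained sketch of the proposed `Context` trick. To avoid needing a real module on disk, it builds two tiny doctests from strings with `doctest.DocTestParser` and runs them with `doctest.DocTestRunner`. The test names `first` and `second` and their contents are made up for illustration.

```python
import doctest


class Context(dict):
    """A dict that doctest cannot copy or clear, so state is shared."""

    def copy(self):
        return self

    def clear(self):
        pass


shared = Context()
parser = doctest.DocTestParser()
runner = doctest.DocTestRunner(verbose=False)

# Hypothetical examples: the second test relies on a name set by the first.
first = parser.get_doctest(">>> result = 2\n", shared, "first", None, 0)
second = parser.get_doctest(">>> result + 3\n5\n", shared, "second", None, 0)

for test in (first, second):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

With a plain `dict` in place of `Context`, each test would receive its own copy and the second example would fail with a `NameError`.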
----------
assignee: docs@python
components: Documentation
messages: 259731
nosy: docs@python, kernc
priority: normal
severity: normal
status: open
title: Shared execution context between doctests in a module
type: enhancement
versions: Python 3.6
_______________________________________
Python tracker <report(a)bugs.python.org>
<http://bugs.python.org/issue26303>
_______________________________________
Serhiy Storchaka <storchaka+cpython(a)gmail.com> added the comment:
Then let's continue the discussion on the older issue, which has the longer discussion.
----------
resolution: -> duplicate
stage: patch review -> resolved
status: open -> closed
superseder: -> type() constructor should bind __int__ to __index__ when __index__ is defined and __int__ is not
_______________________________________
Python tracker <report(a)bugs.python.org>
<https://bugs.python.org/issue33039>
_______________________________________
New submission from sam_b <sam(a)sambrown.eu>:
The docs https://docs.python.org/3/tutorial/modules.html#the-module-search-path describe:
> When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:
> - The directory containing the input script (or the current directory when no file is specified).
> - PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
> - The installation-dependent default.
However, it seems like "the directory containing the input script" is checked *before* the standard library:
➜ tmp more logging.py
def foo():
    print('bar')
➜ tmp python
Python 2.7.15rc1 (default, Apr 15 2018, 21:51:34)
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import logging
>>> logging.foo()
bar
>>> logging.WARNING
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'WARNING'
>>>
Am I misunderstanding the docs?
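One possible reading, sketched here rather than stated as the tracker's answer: "built-in module" in the quoted paragraph means a module compiled into the interpreter, which is a narrower set than the standard library. The check below uses only `sys.builtin_module_names` and `sys.path`.

```python
import sys

# Modules compiled into the interpreter are found before sys.path is consulted.
sys_is_builtin = "sys" in sys.builtin_module_names

# logging is a pure-Python package located through sys.path, so a
# logging.py earlier on sys.path can shadow it.
logging_is_builtin = "logging" in sys.builtin_module_names

print(sys_is_builtin, logging_is_builtin)

# First entry: the script's directory, or '' for an interactive session.
print(repr(sys.path[0]))
```

On this reading, a local `logging.py` shadows the standard library module because `logging` is not built-in in this sense, while a local `sys.py` would not shadow `sys`.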
----------
assignee: docs@python
components: Documentation
messages: 315653
nosy: docs@python, sam_b
priority: normal
severity: normal
status: open
title: Inaccurate docs on `import` behaviour
type: behavior
versions: Python 3.6
_______________________________________
Python tracker <report(a)bugs.python.org>
<https://bugs.python.org/issue33340>
_______________________________________
Rémi Lapeyre <remi.lapeyre(a)henki.fr> added the comment:
Hi Cheryl,
thanks for the ping.
I wasn't sure my patch was correct, but after reading typeobject.c:add_operators(), it is actually more straightforward than I thought.
Serhiy Storchaka: This is indeed a duplicate of issue20092. I believe the solution proposed by Nick Coghlan is better than the one of Amitava Bhattacharyya, "adding a call to `nb_index` (if that slot exists) in `_PyLong_FromNbInt`" though.
One thing to note regarding the proposed patch: the following stops working and raises a RecursionError, since __index__ == __int__:
class MyInt(int):
    def __index__(self):
        return int(self) + 1
I changed test_int_subclass_with_index() as `int(self) + 1` is the same thing as `self + 1` for int subclasses. I don't think this sort of code should appear in the wild but if you think it is important not to break compatibility here, I think I could check for number subclasses before overriding __index__.
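A sketch of the recursion described above. The patch itself is not reproduced here; instead, the aliasing it would introduce is emulated by hand with `__int__ = __index__`, which is an assumption made only for illustration.

```python
class MyInt(int):
    def __index__(self):
        # int(self) looks up __int__, which is the same function (see below).
        return int(self) + 1

    # Assumption: emulate the patched behaviour by aliasing manually.
    __int__ = __index__


try:
    int(MyInt(1))
except RecursionError:
    recursed = True
else:
    recursed = False

print(recursed)
```

Writing `self + 1` instead of `int(self) + 1` avoids the loop, because the addition does not go through `__int__`.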
----------
_______________________________________
Python tracker <report(a)bugs.python.org>
<https://bugs.python.org/issue33039>
_______________________________________
New submission from Filip Bengtsson <filipbengtsson(a)live.se>:
There are 256 characters in the range 0–255.
----------
assignee: docs@python
components: Documentation
messages: 330975
nosy: autom, docs@python
priority: normal
pull_requests: 10114
severity: normal
status: open
title: Typo in documentation
_______________________________________
Python tracker <report(a)bugs.python.org>
<https://bugs.python.org/issue35393>
_______________________________________
New submission from Graham Wideman:
The Unicode HOWTO article is an attempt to help users wrap their minds around Unicode. There are some opportunities for improvement. Issues presented in order of the narrative:
http://docs.python.org/3.3/howto/unicode.html
History of Character Codes
---------------------------
References to the 1980's are a bit off.
"In the mid-1980s an Apple II BASIC program..."
Assuming the comment is about the state of play in the mid-80's, then: The Apple II appeared in 1977. By 1985 we already had Macs, and PCs running DOS, which were capable of various character sets (not to mention lowercase letters!)
"In the 1980s, almost all personal computers were 8-bit"
Both the PC (1983) and Mac (1984) had 16-bit processors.
Definitions:
------------
"Characters are abstractions": Not helpful unless one already knows what "abstraction" means in this specific context.
"the symbol for ohms (Ω) is usually drawn much like the capital letter omega (Ω) in the Greek alphabet [...] but these are two different characters that have different meanings."
Omega is a poor example for this concept. Omega is used as the identifier for a unit in the same way as "m" is used for meter, or "A" is used for ampere. Each is a specific use of a character, which, like any specific use, has a particular meaning. However, having a particular meaning doesn't necessarily require a separate character, and in the case of omega, the Unicode standard now says that the separate "ohm" character is deprecated.
"The ohm sign is canonically equivalent to the capital omega, and normalization would remove any distinction."
http://www.unicode.org/versions/Unicode4.0.0/ch07.pdf#search=%22character%2…
A better example might be the roman numerals, code points U+2160 and subsequent.
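The normalization point can be checked with the standard `unicodedata` module. This is a small sketch: it shows that NFC maps the ohm sign to capital omega, while the roman numeral at U+2160 survives canonical normalization as a separate character.

```python
import unicodedata

ohm = "\u2126"      # OHM SIGN
omega = "\u03a9"    # GREEK CAPITAL LETTER OMEGA
roman_one = "\u2160"

# Different code points before normalization...
distinct_before = ohm != omega

# ...but canonical normalization removes the distinction.
ohm_nfc = unicodedata.normalize("NFC", ohm)

# The roman numeral is only compatibility-equivalent to "I",
# so NFC leaves it alone and NFKC folds it.
roman_nfc = unicodedata.normalize("NFC", roman_one)
roman_nfkc = unicodedata.normalize("NFKC", roman_one)

print(unicodedata.name(ohm), "->", unicodedata.name(ohm_nfc))
print(unicodedata.name(roman_one), ascii(roman_nfc), ascii(roman_nfkc))
```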
Definitions
------------
"A code point is an integer value, usually denoted in base 16."
When trying to convey clearly the distinction between character, code point, and byte representation, the topic of "how it's denoted" is a potential distraction for the reader, so I suggest this point be a bit more explicitly parenthetical, and less confusable with "16 bit". Like:
"A code point value is an integer in the range 0 to 0x10FFFF (about 1.1 million values, with some 110 thousand assigned so far). In a narrative such as the current article, a code point value is usually written in hexadecimal. The Unicode standard displays code points with the notation U+265E to mean the character with value 0x265e (9822 decimal; "Black Chess Knight" character)."
(Also revise subsequent para to use same example character. I suggest not using "Ethiopic Syllable WI", because it's unfamiliar to most readers, and it muddies the topic by suggesting that Unicode in general captures _syllables_ rather than _characters_.)
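The suggested wording separates the code point (an integer) from the way it is written. A brief sketch using the same example character:

```python
import sys
import unicodedata

knight = "\u265e"

code_point = ord(knight)             # the integer value
as_hex = hex(code_point)             # hexadecimal is only a notation
as_u_plus = f"U+{code_point:04X}"    # the notation used by the standard
name = unicodedata.name(knight)

print(code_point, as_hex, as_u_plus, name)

# The largest code point Python can represent.
print(hex(sys.maxunicode))
```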
Encodings:
-----------
"This sequence needs to be represented as a set of bytes"
--> "This code point sequence needs to be represented as a sequence of bytes"
"4. Many Internet standards are defined in terms of textual data"
This is a vague claim. Probably what was intended was: "Many Internet standards define protocols in which the data must contain no zero bytes, or zero bytes have special meaning." Is this actually true? Are there "many" such standards?
"Generally people don’t use this encoding,"
Probably "people" per se don't use any encoding; computers do. --> "Because of these problems, other more efficient and convenient encodings have been devised and are commonly used."
For continuity, directly after that para should come the later paras starting with "UTF-8 is one of the most common".
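To make the encoding discussion concrete, here is a sketch comparing a fixed-width 32-bit encoding with UTF-8 for a short code point sequence. The sample string is chosen only for illustration.

```python
text = "A\u265e"  # two code points: U+0041 and U+265E

# Four bytes per code point, little-endian, no byte-order mark.
utf32 = text.encode("utf-32-le")

# Variable width: one byte for "A", three for the chess knight.
utf8 = text.encode("utf-8")

print(utf32, len(utf32))
print(utf8, len(utf8))

# The 32-bit form is padded with zero bytes; the UTF-8 form has none here.
utf32_has_zero = 0 in utf32
utf8_has_zero = 0 in utf8
print(utf32_has_zero, utf8_has_zero)
```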
"2. A Unicode string is turned into a string of bytes..."
--> "2. A Unicode string is turned into a sequence of bytes..." (I.e.: don't overload "string" in an article about strings and encodings.)
Create a new subhead "Converting from Unicode to non-Unicode encodings", and move under it the paras:
"Encodings don't have to..."
"Latin-1, also known as..."
"Encodings don't have to..."
But also revise:
"Encodings don’t have to handle every possible Unicode character, and most encodings don’t."
--> "Non-Unicode code systems usually don't handle all of the characters to be found in Unicode."
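A short sketch of what "don't handle all of the characters" looks like in practice, using Latin-1 as the non-Unicode encoding:

```python
# A character inside Latin-1's repertoire encodes to a single byte.
cafe = "caf\u00e9".encode("latin-1")
print(cafe)

# A character outside it cannot be encoded at all by default.
try:
    "\u265e".encode("latin-1")
except UnicodeEncodeError:
    knight_encodable = False
else:
    knight_encodable = True
print(knight_encodable)

# An error handler trades the exception for a lossy substitute.
replaced = "\u265e".encode("latin-1", errors="replace")
print(replaced)
```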
----------
assignee: docs@python
components: Documentation
messages: 213367
nosy: docs@python, gwideman
priority: normal
severity: normal
status: open
title: Unicode HOWTO
type: enhancement
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
_______________________________________
Python tracker <report(a)bugs.python.org>
<http://bugs.python.org/issue20906>
_______________________________________