At the moment, the array module of the standard library lets you
create arrays of different numeric types and initialize them from
an iterable (e.g., another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow
you to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array generation in such a way that you
could pass an iterator (as now) as second argument, but if you pass a
single integer value, it should be treated as the number of items to
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1<<22)):
    """returns a new array with given typecode
    (eg, "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0)
    """
    # one reusable block of bsize items
    a = array.array(typecode, [value]*bsize)
    x = array.array(typecode)
    r = n
    while r >= bsize:
        x.extend(a)
        r -= bsize
    # remaining items that do not fill a whole block
    x.extend([value]*r)
    return x
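For comparison, here is a minimal sketch of two ways to get a presized array with the array module as it already exists, without chunked extending. The helper names `repeated_array` and `zeroed_array` are made up for illustration. The first relies on arrays supporting sequence repetition with `*`; the second passes a zero-filled bytes object as the initializer, which temporarily needs a second buffer of the same size.

```python
import array

def repeated_array(typecode, n, value=0):
    """Return an array of n copies of value using sequence repetition."""
    # Hypothetical helper name; array objects support the * operator.
    return array.array(typecode, [value]) * n

def zeroed_array(typecode, n):
    """Return an array of n zero items built from a zero-filled bytes buffer."""
    # Hypothetical helper name; the bytes object is a temporary second copy.
    itemsize = array.array(typecode).itemsize
    return array.array(typecode, bytes(n * itemsize))

suffix_slots = repeated_array("q", 1000)   # "q" is a signed 8-byte integer
zeros = zeroed_array("q", 1000)
```

Neither variant avoids initializing the memory, so this does not replace the proposed constructor argument; it only shortens the workaround.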
For technical reasons, many functions of the Python standard libraries
implemented in C have positional-only parameters. Example:
Python 3.7.0a0 (default, Feb 25 2017, 04:30:32)
replace(self, old, new, count=-1, /) # <== notice "/" at the end
>>> "a".replace("x", "y") # ok
>>> "a".replace(old="x", new="y") # ERR!
TypeError: replace() takes at least 2 arguments (0 given)
When converting the methods of the builtin str type to the internal
"Argument Clinic" tool (tool to generate the function signature,
function docstring and the code to parse arguments in C), I asked if
we should add support for keyword arguments in str.replace(). The
answer was quick: no! It's a deliberate design choice.
Quote of Yury Selivanov's message:
I think Guido explicitly stated that he doesn't like the idea to
always allow keyword arguments for all methods. I.e. `str.find('aaa')`
just reads better than `str.find(needle='aaa')`. Essentially, the idea
is that for most of the builtins that accept one or two arguments,
positional-only parameters are better.
I just noticed a module on PyPI to implement this behaviour on Python functions:
My question is: would it make sense to implement this feature in
Python directly? If yes, what should be the syntax? Use "/" marker?
Use the @positional() decorator?
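For reference, the "/" marker was later accepted for Python-level function definitions (PEP 570, Python 3.8). The sketch below assumes Python 3.8 or newer; the module-level `replace` function is a made-up stand-in for str.replace, not part of any library.

```python
def replace(text, old, new, count=-1, /):
    """Pure-Python stand-in for str.replace; all parameters are positional-only."""
    return text.replace(old, new, count)

# Positional call works as usual.
positional_result = replace("a-b", "-", "+")

# Passing the same arguments by keyword is rejected with TypeError.
try:
    replace("a-b", old="-", new="+")
except TypeError as exc:
    keyword_error = exc
else:
    keyword_error = None
```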
Do you see concrete cases where it's a deliberate choice to deny
passing arguments as keywords?
Don't you like writing int(x="123") instead of int("123")? :-) (I know
that Serhiy Storchaka hates the name of the "x" parameter of the int
By the way, I read that the "/" marker is unknown to almost all Python
developers, and [...] syntax should be preferred, but
inspect.signature() doesn't support this syntax. Maybe we should fix
signature() and use [...] format instead?
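A small sketch of how the "/" form can be inspected programmatically today. It only uses inspect.signature() and the Parameter.kind attribute; the exact signature string differs between Python versions, so no output is shown here.

```python
import inspect

# Signature of the C-implemented method, as reported by inspect.
sig = inspect.signature(str.replace)

# Map each parameter name to the name of its kind, e.g. "POSITIONAL_ONLY".
kinds = {name: param.kind.name for name, param in sig.parameters.items()}

# The textual form uses the "/" marker rather than the [...] notation.
signature_text = str(sig)
```

Here `kinds["old"]` reports the parameter as positional-only, which is information the [...] notation cannot carry on its own.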
Replace "replace(self, old, new, count=-1, /)" with "replace(self,
old, new[, count=-1])" (or maybe even not document the default
Python 3.5 help (docstring) uses "S.replace(old, new[, count])".
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
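To make the failure mode concrete, here is a minimal reproduction. The two-argument `foo` is a placeholder, since the real function is not shown; the second half shows the same mistake inside a list, where it fails silently.

```python
def foo(first, second):
    """Placeholder for the two-argument function from the report."""
    return first + second

# 'a' 'b' is one literal, 'ab', so foo receives a single argument.
try:
    foo('a' 'b')
except TypeError as exc:
    message = str(exc)

# In a list the missing comma raises nothing: the list is just one item short.
names = ['alpha', 'beta' 'gamma']
```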
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
Would it be reasonable to start deprecating this and eventually remove
it from the language?
--Guido van Rossum (python.org/~guido)
I find importing defaultdict from collections to be clunky and it seems
like having a default should just be an optional keyword to dict. Thus,
d = dict(default=int)
would be the same as
from collections import defaultdict
d = defaultdict(int)
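One thing worth checking before changing dict: the spelling dict(default=int) already has a meaning today. A short sketch of the current behaviour of both forms:

```python
from collections import defaultdict

# Today, keyword arguments to dict() become ordinary string keys.
plain = dict(default=int)

# The existing spelling of a dict with a default factory.
counts = defaultdict(int)
counts["missing"] += 1   # the key is created on first access, starting at int() == 0
```

So `plain` is a one-item dict with the key "default", which any new keyword would have to stay compatible with.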
I'm writing briefly about a reflection I recently had on how __str__
currently behaves for the time class.
Before expressing my thought and proposal, I want to make sure we all agree
on a simple and clear fact:
the __str__ magic method is used to give a literal and human-readable
representation to the object (unlike __repr__).
Generally this is true across the Python ecosystem. It's not true for the
time class, for example.
>>> import time
>>> a = time.localtime()
>>> a.__str__()
'time.struct_time(tm_year=2017, tm_mon=3, tm_mday=8, tm_hour=16, tm_min=6, tm_sec=16, tm_wday=2, tm_yday=67, tm_isdst=0)'
Well, don't get me wrong: the main aim of the __str__ method has been
accomplished but, imho, not in the most pythonic way.
I just wanted to ask you: what do you think about re-writing the __str__ of
the time class so it would return something like
ISO 8601 [https://en.wikipedia.org/wiki/ISO_8601] format? Wouldn't it be
more meaningful? Especially in the JS-everywhere era
it could be more productive.
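For what it's worth, an ISO 8601 string can already be produced from a struct_time with existing functions. A minimal sketch, using the timestamp from the example above as a fixed value; the format string is one possible choice and leaves out the timezone offset.

```python
import time
from datetime import datetime

# The struct_time from the example, rebuilt as a fixed value.
stamp = time.struct_time((2017, 3, 8, 16, 6, 16, 2, 67, 0))

# Format it explicitly with strftime ...
iso_from_time = time.strftime("%Y-%m-%dT%H:%M:%S", stamp)

# ... or go through datetime, whose isoformat() yields the same layout.
iso_from_datetime = datetime(*stamp[:6]).isoformat()
```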
__str__ for dates should return a human-readable date format (eg:
I'm waiting for your opinions.
Thank you for your time and ideas!
After some discussions in French about this idea, I subscribed here to
suggest adding a new string literal, for regexps, inspired by other
types like: u"", r"", b"", br"", f""…
The regexp string literal could be represented by: re""
It would ease the use of regexps in Python, allowing to have some regexp
We may end up with an integration like :
>>> import re
>>> if re".k" in 'ok':
... print "ok"
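For comparison, this is roughly how the same test is spelled with the re module as it exists today (Python 3 syntax). Note that `in` on a str would not dispatch to the pattern object, so the proposed spelling needs more than a new literal prefix.

```python
import re

# Compile once, then search; a match object is truthy, None is falsy.
pattern = re.compile(r".k")
if pattern.search("ok"):
    print("ok")

matched = pattern.search("ok") is not None
unmatched = pattern.search("no") is not None
```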
Regexps are part of the language in Perl, and the rather complicated
integration of regexp in other languages, especially in Python, is
something that comes up easily in language comparison discussions.
and new string literal types in Python (like f"") looked like a good
compromise to have a tight integration of regexps without asking to make
them part of the language (as I imagine it has already been discussed
years ago, and obviously denied…).
As per the XKCD illustration, using a regexp may be a problem of its own,
but really, the "each language has a new and complicated approach" issue is
another difficulty, on a par with writing the regexps themselves, I think.
And then, once you get the trick for Python, it still feels to me like too
many letters to type given the numerous problems one can solve using regexps.
I know regexps are slower than a string-based workflow (like .startswith)
but regexps can handle the biggest and the smallest tasks, so they are quick
to come up with once you have started to think with them. As the Python
philosophy is to spare brain cycles at the expense of CPU cycles, making
regexps easy to use is a brain-cycle-saving trick.
What do you think?
This keeps on coming up in one form or another - either someone
multiplies a list of lists and ends up surprised that they're all the
same, or is frustrated with the verbosity of the alternatives.
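The surprise described above, and the usual verbose alternative, look like this:

```python
# Multiplying the outer list repeats references to the same inner list.
rows = [[0] * 4] * 2
rows[0][0] = 1
aliased = rows[0] is rows[1]

# The common workaround builds each inner list separately.
independent = [[0] * 4 for _ in range(2)]
independent[0][0] = 1
```

After the assignment, both entries of `rows` show the change, while only the first row of `independent` does.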
Can we use the matmul operator for this?
import copy

class List(list):
    def __matmul__(self, other):
        return [copy.copy(x) for x in self for _ in range(other)]

>>> x = List([[0]*4]) @ 2
>>> x
[[0, 0, 0, 0], [0, 0, 0, 0]]
>>> x[0][0] = 1
>>> x
[[1, 0, 0, 0], [0, 0, 0, 0]]
If this were supported by the built-in list type, it would be either of these:
>>> x = [[0] * 4] @ 2
>>> x = [[0] @ 4] @ 4
(identical functionality, as copying an integer has no effect).
The semantics could be either as shown above (copy.copy()), or
something very simple and narrow like "lists get shallow-copied, other
objects get referenced".
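A rough sketch of the second, narrower semantics, assuming "shallow-copied" means list.copy() and that subclasses of list count as lists. The `List` subclass only stands in for the built-in type, and the element order follows the example above.

```python
class List(list):
    """Stand-in for the built-in list with a narrow @ repetition."""

    def __matmul__(self, count):
        # Lists are shallow-copied; any other object is referenced as-is.
        return List(
            item.copy() if isinstance(item, list) else item
            for item in self
            for _ in range(count)
        )

grid = List([[0] * 4]) @ 2
grid[0][0] = 1             # only the first row changes

marker = object()
pair = List([marker]) @ 2  # non-list items are shared, not copied
```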
> To me, 'pop' implies mutation. Tuples do not have a pop method, and it
> is not obvious to me that either tuples or frozensets should. What are
> the use cases that are not better served by converting to list or set?
> Terry Jan Reedy
1) converting to set or list is O(n) in time
2) if I have to keep the old copy,
standard set solution will be O(n) both in time and space!
1) priority queue:
insert and pop occur
2) share immutable data between different subroutines:
each one can modify its local copy safely and concurrently.
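To illustrate the cost being described, here is a sketch of a non-mutating pop written with today's frozenset. The function name `popped` is made up, and the set difference copies the remaining items, so each call is O(n) in time and space.

```python
def popped(items):
    """Return (item, remaining) without mutating the original frozenset."""
    # Hypothetical helper; raises KeyError on an empty frozenset, like set.pop().
    for item in items:
        # items - {item} builds a new frozenset of the remaining elements.
        return item, items - {item}
    raise KeyError("pop from an empty frozenset")

original = frozenset({1, 2, 3})
item, remaining = popped(original)
```

The original stays intact for other users, but only because a full copy was made.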
Yet again today I ended up writing:
d = [[0] * 5 for _ in range(10)]
And wondered, why don't we have a way to repeat other than looping over
range() and using a dummy variable? This seems like a rather common thing
to do, and while the benefit doesn't seem much, something like this would
be much prettier and more pythonic than using underscore variable:
d = [[0] * 5 repeat_for 10]
And could obviously be used outside of comprehensions too:
print('Attempting to reconnect...')
print('Unable to reconnect :(')
I chose to use repeat_for instead of repeat because it's way less likely to
be used as a variable name, but obviously other names could be used like
loop_for or repeat_times etc.
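For comparison, the closest existing spelling without range() uses itertools.repeat. It still needs a loop variable, so it does not remove the underscore. The `[0]` fill value and the retry message are assumptions for illustration.

```python
from itertools import repeat

# repeat(None, 10) yields None ten times and then stops.
d = [[0] * 5 for _ in repeat(None, 10)]

# The same idea outside a comprehension: run the body a fixed number of times.
attempts = []
for _ in repeat(None, 3):
    attempts.append('Attempting to reconnect...')
```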
I feel like that borders on a bit too wordy...
Personally, I'd like to see something like Felix's regular definitions:
On Mar 29, 2017 3:30 PM, "Abe Dillon" <abedillon(a)gmail.com> wrote:
My 2 cents is that regular expressions are pretty un-pythonic because of
their horrible readability. I would much rather see Python adopt something
like Verbal Expressions
(https://github.com/VerbalExpressions/PythonVerbalExpressions) into the
standard library than add special syntax support for normal REs.
On Tue, Mar 28, 2017 at 3:31 AM, Paul Moore <p.f.moore(a)gmail.com> wrote:
> On 28 March 2017 at 08:54, Simon D. <simon(a)acoeuro.com> wrote:
> > I believe that the u"" notation in Python 2.7 is defined by
> > importing the unicode_literals module.
> That's not true. The u"..." syntax is part of the language. from
> __future__ import unicode_literals is something completely different.
> > Each regexp lib could provide its instantiation of regexp literal
> > notation.
> The Python language has no way of doing that - user (or library)
> defined literals are not possible.
> > And if only the default one does, it would still be a win for the
> > beginners, and the majority of people using the stdlib.
> How? You've yet to prove that having a regex literal form is an
> improvement over re.compile(r'put your regex here'). You've asserted
> it, but that's a matter of opinion. We'd need evidence of real-life
> code that was clearly improved by the existence of your proposed
> Python-ideas mailing list
> Code of Conduct: http://python.org/psf/codeofconduct/