There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...
Essentially the permutations are, I think:
{'unadorned' | abc.abstract} x {'normal' | static | class} x {method |
property | non-callable attribute}.
concreteness | implicit first arg | type                   | name                                               | comments
{unadorned}  | {unadorned}        | method                 | def foo():                                         | exists now
{unadorned}  | {unadorned}        | property               | @property                                          | exists now
{unadorned}  | {unadorned}        | non-callable attribute | x = 2                                              | exists now
{unadorned}  | static             | method                 | @staticmethod                                      | exists now
{unadorned}  | static             | property               | @staticproperty                                    | proposing
{unadorned}  | static             | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
{unadorned}  | class              | method                 | @classmethod                                       | exists now
{unadorned}  | class              | property               | @classproperty or @classmethod;@property           | proposing
{unadorned}  | class              | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | {unadorned}        | method                 | @abc.abstractmethod                                | exists now
abc.abstract | {unadorned}        | property               | @abc.abstractproperty                              | exists now
abc.abstract | {unadorned}        | non-callable attribute | @abc.abstractattribute or @abc.abstract;@attribute | proposing
abc.abstract | static             | method                 | @abc.abstractstaticmethod                          | exists now
abc.abstract | static             | property               | @abc.abstractstaticproperty                        | proposing
abc.abstract | static             | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | class              | method                 | @abc.abstractclassmethod                           | exists now
abc.abstract | class              | property               | @abc.abstractclassproperty                         | proposing
abc.abstract | class              | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
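To make the intent concrete, here is a minimal sketch of how a @classproperty
descriptor could be written in pure Python today (illustrative only; the name
and read-only behaviour are my assumptions, not an existing stdlib API):

class classproperty:
    """Sketch of the proposed @classproperty (read-only, no setter support)."""
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, objtype=None):
        # Pass the class, not the instance, as the implicit first argument.
        return self.fget(objtype)


class Foo(object):
    @classproperty
    def bar(cls):
        return cls.__name__.upper()


print(Foo.bar)    # 'FOO' -- no throw-away instance needed
print(Foo().bar)  # also 'FOO'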
--rich
At the moment, the array module of the standard library allows to
create arrays of different numeric types and to initialize them from
an iterable (eg, another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow one
to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array construction in such a way that you
could pass an iterable (as now) as the second argument, but if you pass a
single integer value, it should be treated as the number of items to
allocate.
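For illustration, here is what the proposed call might look like next to
today's API (the integer form is hypothetical and does not exist in the
array module):

import array

# Today: the second argument must be an iterable.
a = array.array('l', [0] * 1000)        # 1000 zero-initialized long ints

# Proposed (hypothetical): an integer second argument gives the item count.
# b = array.array('l', 6000000000)      # ~48 GB of zero-initialized long ints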
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """Return a new array with the given typecode
    (eg, "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0).
    """
    a = array.array(typecode, [value] * bsize)   # one pre-filled chunk
    x = array.array(typecode)
    r = n
    while r >= bsize:                            # extend in whole chunks
        x.extend(a)
        r -= bsize
    x.extend([value] * r)                        # then the remainder
    return x
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
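To spell out the failure mode from the example above:

def foo(a, b):
    return a + b

foo('a', 'b')   # OK: two arguments
foo('a' 'b')    # TypeError: foo() missing 1 required positional argument: 'b'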
Would it be reasonable to start deprecating this and eventually remove
it from the language?
--
--Guido van Rossum (python.org/~guido)
I am wondering if it would be possible to include psutil
(https://pypi.python.org/pypi/psutil ) in the standard library, and if
not, what would be needed.
I am not a developer of it, but I am using psutil at work with good
success. It provides a good deal of services for querying and managing
processes in a cross-platform way.
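For context, a small example of the kind of cross-platform queries psutil
offers today as a third-party package (calls taken from its documented API;
nothing here is a proposal for a particular stdlib interface):

import psutil  # third-party today: pip install psutil

print(psutil.cpu_percent(interval=1))    # system-wide CPU utilization, percent
print(psutil.virtual_memory().percent)   # memory usage, percent

for proc in psutil.process_iter():
    # Query individual processes in a cross-platform way.
    print(proc.pid, proc.name())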
Any thoughts?
I'd like to propose adding the ability to extract the last n stack trace
entries from a traceback using the functions from the traceback module. Simply
put, passing limit=-n would make the functions return (or print, or yield)
the last abs(n) entries. A no-brainer implementation for extract_tb:
extracted = list(_extract_tb_iter(
    tb, limit=None if (limit is not None and limit < 0) else limit))
if limit is not None and limit < 0:
    return extracted[limit:]
return extracted
The motivation: http://stackoverflow.com/q/25472412/2301450
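A small sketch of the proposed behaviour next to today's (the negative limit
is hypothetical and not supported by the current traceback module):

import traceback

def a(): b()
def b(): c()
def c(): raise ValueError('boom')

try:
    a()
except ValueError:
    # Today: limit=2 keeps the first two (outermost) entries.
    print(traceback.format_exc(limit=2))
    # Proposed (hypothetical): limit=-2 would keep the last two
    # (innermost) entries, which are usually the interesting ones.
    # print(traceback.format_exc(limit=-2))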
I think it would be helpful for folks using the asyncio module to be able
to make non-blocking calls to objects in the multiprocessing module more
easily. While some use-cases for using multiprocessing can be replaced with
ProcessPoolExecutor/run_in_executor, there are others that cannot; more
advanced usages of multiprocessing.Pool aren't supported by
ProcessPoolExecutor (initializer/initargs, contexts, etc.), and other
multiprocessing classes like Lock and Queue have blocking methods that
could be made into coroutines.
Consider this (extremely contrived, but use your imagination) example of an
asyncio-friendly Queue:
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def do_proc_work(q, val, val2):
    time.sleep(3)  # Imagine this is some expensive CPU work.
    ok = val + val2
    print("Passing {} to parent".format(ok))
    q.put(ok)  # The Queue can be used with the normal blocking API, too.
    item = q.get()
    print("got {} back from parent".format(item))

def do_some_async_io_task():
    # Imagine there's some kind of asynchronous I/O
    # going on here that utilizes asyncio.
    asyncio.sleep(5)

@asyncio.coroutine
def do_work(q):
    loop.run_in_executor(ProcessPoolExecutor(),
                         do_proc_work, q, 1, 2)
    do_some_async_io_task()
    item = yield from q.coro_get()  # Non-blocking get that won't affect our io task
    print("Got {} from worker".format(item))
    item = item + 25
    yield from q.coro_put(item)

if __name__ == "__main__":
    q = AsyncProcessQueue()  # Our new asyncio-friendly version of multiprocessing.Queue
    loop = asyncio.get_event_loop()
    loop.run_until_complete(do_work(q))
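For concreteness, here is one rough sketch of what the hypothetical
AsyncProcessQueue could look like, delegating the blocking
multiprocessing.Queue calls to a thread executor so they don't stall the
event loop (names like coro_get/coro_put are just the ones used above; this
glosses over details such as handing the wrapper to child processes):

import asyncio
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

class AsyncProcessQueue:
    """Sketch: multiprocessing.Queue with coroutine-friendly get/put."""

    def __init__(self, maxsize=0, loop=None):
        self._queue = multiprocessing.Queue(maxsize)
        self._executor = ThreadPoolExecutor(max_workers=2)
        self._loop = loop or asyncio.get_event_loop()

    # The plain blocking API keeps working for worker processes.
    def put(self, item):
        self._queue.put(item)

    def get(self):
        return self._queue.get()

    @asyncio.coroutine
    def coro_put(self, item):
        # Run the blocking put in a thread so the event loop stays responsive.
        yield from self._loop.run_in_executor(self._executor, self._queue.put, item)

    @asyncio.coroutine
    def coro_get(self):
        return (yield from self._loop.run_in_executor(self._executor, self._queue.get))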
I have seen some rumblings about a desire to do this kind of integration on
the bug tracker (http://bugs.python.org/issue10037#msg162497 and
http://bugs.python.org/issue9248#msg221963) though that discussion is
specifically tied to merging the enhancements from the Billiard library
into multiprocessing.Pool. Are there still plans to do that? If so, should
asyncio integration with multiprocessing be rolled into those plans, or
does it make sense to pursue it separately?
Even more generally, do people think this kind of integration is a good
idea to begin with? I know using asyncio is primarily about *avoiding* the
headaches of concurrent threads/processes, but there are always going to be
cases where CPU-intensive work is going to be required in a primarily
I/O-bound application. The easier it is for developers to handle those
use-cases, the better, IMO.
Note that the same sort of integration could be done with the threading
module, though I think there's a fairly limited use-case for that; most
times you'd want to use threads over processes, you could probably just use
non-blocking I/O instead.
Thanks,
Dan
This works:
re.search('(abc)', 'abc').group(1)
but this doesn't:
re.search('(abc)', 'abc').group(1L)
The latter raises "IndexError: no such group". Shouldn't that technically
work?
--
Ryan
If anybody ever asks me why I prefer C++ to C, my answer will be simple:
"It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was
nul-terminated."
Personal reality distortion fields are immune to contradictory evidence. -
srean
Check out my website: http://kirbyfan64.github.io/
Hey folks,
What do you think about making it easier to use packages by
automatically importing submodules on attribute access.
Consider this example:
>>> import matplotlib
>>> figure = matplotlib.figure.Figure()
AttributeError: 'module' object has no attribute 'figure'
For the newcomer (like me some months ago) it's not obvious that the
solution is to import matplotlib.figure.
Worse even: it may sometimes/later on work, if the submodule has been
imported from another place.
Here's how I'd like it to behave instead (in pseudo code, since `package` is
not a python class right now):
class package:
    def __getattr__(self, name):
        try:
            return self.__dict__[name]
        except KeyError:
            # either try to import `name` or raise a nicer error message
            ...
The automatic import feature could also play nicely when porting a
package with submodules to or from a simple module with namespaces (as
suggested in [1]), making this transition seamless to any user.
I'm not sure about potential problems from auto-importing. I currently
see the following issues:
- harmless looking attribute access can lead to significant code
execution including side effects. On the other hand, that could always
be the case.
- you can't use attribute access anymore to test whether a submodule is
imported (must use sys.modules instead, I guess)
In principle one can already make this feature happen today, by
replacing the object in sys.modules - which is kind of ugly and probably
has more flaws; a rough sketch is below. This would also be made easier
if there were a module.__getattr__ ([2]) or a "metaclass"-like feature
for modules (which would then just be a class, I guess).
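For reference, a sketch of that sys.modules replacement hack (illustrative
only; it has the flaws mentioned, e.g. other references to the original
module object are not updated):

# In the package's __init__.py:
import importlib
import sys
import types

class _AutoImportModule(types.ModuleType):
    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails:
        # fall back to importing the submodule.
        submodule = importlib.import_module(self.__name__ + '.' + name)
        setattr(self, name, submodule)
        return submodule

_new = _AutoImportModule(__name__)
_new.__dict__.update(sys.modules[__name__].__dict__)
sys.modules[__name__] = _new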
Sorry, if this has come up before and I missed it. Anyhow, just
interested if anyone else considers this a nice feature.
Best regards,
Thomas
[1]
https://mail.python.org/pipermail/python-ideas/2014-September/029341.html
[2] https://mail.python.org/pipermail/python-ideas/2012-April/014957.html
Python uses "seconds since the epoch" term to describe time.time()
return value. POSIX term is "seconds since the Epoch" (notice the
capitalization) where Epoch is 1970-01-01 00:00:00+00:00. C99 term is
"calendar time" -- the encoding of the calendar time returned by the
time() function is unspecified.
Python documentation defines `epoch` as:
    The :dfn:`epoch` is the point where the time starts. On January 1st
    of that year, at 0 hours, the "time since the epoch" is zero. For
    Unix, the epoch is 1970. To find out what the epoch is, look at
    ``gmtime(0)``.
The time module documentation specifies calendar.timegm() as the inverse of
time.gmtime(), while timegm() uses the fixed 1970 Epoch instead of the
gmtime(0) epoch.
datetime.astimezone() (local_timezone()) passes a Unix timestamp [1970] to
time.localtime(), which may expect a timestamp with a different epoch
[gmtime(0)].
email.util.mktime_tz() uses both mktime() [gmtime(0)] and timegm() [1970].
mailbox.py uses both time.time() [gmtime(0)] and timegm() [1970].
http.cookiejar uses both EPOCH_YEAR=1970 and datetime.utcfromtimestamp()
[gmtime(0) epoch] for "seconds since epoch".
It seems the 1970 Epoch is used for file times on Windows (os.stat()), but
os.path.getatime() refers to "seconds since the epoch" [gmtime(0) epoch].
The date{,time}.{,utc}fromtimestamp() and datetime.timestamp() docs equate
the "POSIX timestamp" [1970 Epoch] and time.time()'s return value
[gmtime(0) epoch].
datetime.timestamp() is inconsistent if gmtime(0) is not 1970. It uses
mktime() for naive datetime objects [gmtime(0) epoch]. But it
uses POSIX Epoch for aware datetime objects.
Correct me if I'm wrong here.
---
Possible fixes:
Say in the `epoch` definition that stdlib doesn't support
gmtime(0).tm_year != 1970.
OR don't use mktime() if the 1970 Epoch is used, e.g., create an aware
datetime object in the local timezone instead and use it to compute the
result against the 1970 Epoch.
OR add an *analog* of TZ=UTC time.mktime() and use it in the stdlib where
necessary. Looking at previous attempts (e.g., [1], [2]) to implement
timegm(), the problem seems over-constrained. A different name could be
used to avoid wrong expectations, e.g., datetime could use
`(aware_datetime_object - gmtime0_epoch) // sec`.
[1] http://bugs.python.org/issue6280,
[2] http://bugs.python.org/issue1667546
OR set EPOCH_YEAR=gmtime(0).tm_year instead of 1970 in
calendar.timegm(). It may break backward compatibility if there is a
system with non-1970 epoch. Deal on a case-by-case basis with other
places where 1970 Epoch is used. And drop "POSIX timestamp" [1970
Epoch] and use "seconds since the epoch" [gmtime(0) epoch] in the
datetime documentation. Change internal EPOCH year accordingly.
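For illustration, the `(aware_datetime_object - gmtime0_epoch) // sec`
computation above amounts to something like the following (written here
against the 1970 Epoch; the gmtime(0)-based variant would just substitute
its own epoch):

from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)   # POSIX Epoch

def seconds_since_epoch(aware_dt):
    """Seconds since EPOCH for an aware datetime, without calling mktime()."""
    return (aware_dt - EPOCH) // timedelta(seconds=1)

print(seconds_since_epoch(datetime(2014, 9, 1, tzinfo=timezone.utc)))  # 1409529600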
What is Python-ideas' opinion about it?
--
Akira