Frequently, while globbing, one needs to work with multiple extensions. I'd
like to propose that fnmatch.filter handle a tuple of patterns (while
preserving the single-str-argument functionality, à la str.endswith), as a
first step toward glob.i?glob accepting multiple patterns as well.
Here is the implementation I came up with:
https://github.com/python/cpython/compare/master...andresdelfino:fnmatch-mu…
If this is deemed reasonable, I’ll write tests and documentation updates.
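For illustration, the intended behaviour is roughly this (a hand-written
sketch, not the actual patch; the helper name is mine):

    import fnmatch
    import os

    def filter_multiple(names, patterns):
        # sketch: like fnmatch.filter, but `patterns` may also be a tuple of patterns
        if isinstance(patterns, str):
            patterns = (patterns,)
        return [name for name in names
                if any(fnmatch.fnmatch(name, pat) for pat in patterns)]

    print(filter_multiple(os.listdir('.'), ('*.py', '*.txt')))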
Any opinion?
The proposed implementation of dataclasses prevents defining fields with
defaults before fields without defaults. This can create limitations on
logical grouping of fields and on inheritance.
Take, for example, the case:
@dataclass
class Foo:
    some_default: dict = field(default_factory=dict)

@dataclass
class Bar(Foo):
    other_field: int
this results in the error:
      5 @dataclass
----> 6 class Bar(Foo):
      7     other_field: int
      8

~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py in dataclass(_cls, init, repr, eq, order, hash, frozen)
    751
    752     # We're called as @dataclass, with a class.
--> 753     return wrap(_cls)
    754
    755

~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py in wrap(cls)
    743
    744     def wrap(cls):
--> 745         return _process_class(cls, repr, eq, order, hash, init, frozen)
    746
    747     # See if we're being called as @dataclass or @dataclass().

~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py in _process_class(cls, repr, eq, order, hash, init, frozen)
    675                          # in __init__.  Use "self" if possible.
    676                          '__dataclass_self__' if 'self' in fields
--> 677                          else 'self',
    678                          ))
    679     if repr:

~/.pyenv/versions/3.6.2/envs/clover_pipeline/lib/python3.6/site-packages/dataclasses.py in _init_fn(fields, frozen, has_post_init, self_name)
    422             seen_default = True
    423         elif seen_default:
--> 424             raise TypeError(f'non-default argument {f.name!r} '
    425                             'follows default argument')
    426

TypeError: non-default argument 'other_field' follows default argument
I understand that this is a limitation of positional arguments because the
effective __init__ signature is:
def __init__(self, some_default: dict = <something>, other_field: int):
However, keyword only arguments allow an entirely reasonable solution to
this problem:
def __init__(self, *, some_default: dict = <something>, other_field: int):
And they have the added benefit of making the fields in the __init__ call
entirely explicit.
So, I propose the addition of a keyword_only flag to the @dataclass
decorator that renders the __init__ method using keyword only arguments:
@dataclass(keyword_only=True)
class Bar(Foo):
    other_field: int
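For illustration, the __init__ that keyword_only=True would generate for Bar
is roughly equivalent to this hand-written version (a sketch only, not the
proposed implementation):

    from dataclasses import dataclass, field

    @dataclass
    class Foo:
        some_default: dict = field(default_factory=dict)

    class Bar(Foo):
        # hand-written stand-in for what @dataclass(keyword_only=True) could generate;
        # keyword-only parameters may freely mix defaults and non-defaults
        def __init__(self, *, some_default: dict = None, other_field: int):
            super().__init__({} if some_default is None else some_default)
            self.other_field = other_field

    b = Bar(other_field=1)
    print(b.some_default, b.other_field)  # {} 1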
--George Leslie-Waksman
PEP for support for indexing with keyword arguments is now submitted as PR.
https://github.com/python/peps/pull/1612
Thanks to everybody involved in the development of the PEP and the
interesting discussion. All your contributions have been received and
often added to the document.
If the PEP is approved, I would like to attempt an implementation, but
I am not particularly skilled in the python internals. If a core
developer is willing to teach me the ropes (especially in the new
parser, I barely understood the syntax of the old one, but have no
idea about the new one) and review my code I can give it a try. I
would not mind refreshing my C a bit.
--
Kind regards,
Stefano Borini
Hi All,
This is the first time I'm posting to this mailing group, so forgive me if I'm making any mistakes.
So one of the most common ways to load JSON is via a file. This is used extensively in data science and the like. We often write something like :-
with open("filename.json", "r") as f:
    my_dict = json.load(f)
or
my_dict = json.load(open("filename.json", "r"))
Since this is sooooo common, why doesn't python have something like :-
json.loadf("filename.json")
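Something as thin as this is what I have in mind (loadf is just the proposed name, nothing official):

    import json

    def loadf(filename, *, encoding=None, **kwargs):
        # proposed convenience wrapper: open the file and delegate to json.load
        with open(filename, "r", encoding=encoding) as f:
            return json.load(f, **kwargs)

    my_dict = loadf("filename.json")  # assuming filename.json exists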
Is there an obvious issue with defining this in CPython? I don't mind whipping up a PR if it gains traction.
I'm not sure if this has been asked / suggested before.
I'm wondering if there is any interest in conditional or loop-based `with`
statements. I think it could be done without a syntax change.
### Napkin proposal
Context managers could define `__if__` or `__for__`, and if those dunder
methods were defined, then the `with` statement would either behave like a
conditional or a loop.
If `__if__` was defined then
```
with Ctx():
    print('hi')
```
would only print `hi` if `__if__` returned True. This doesn't require a
syntax change.
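To make the napkin proposal slightly more concrete, here is a rough sketch
(``__if__`` is of course hypothetical, and the desugaring is written out by
hand as it could work today):

```
class Ctx:
    """Toy context manager with a hypothetical __if__ hook."""

    def __init__(self, condition):
        self.condition = condition

    def __if__(self):
        # the with-body would only run when this returns True
        return self.condition

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

# roughly what `with Ctx(...): print('hi')` would mean under the proposal:
ctx = Ctx(condition=True)
if ctx.__if__():
    with ctx:
        print('hi')
```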
The `__for__` variant would likely need a minor syntax change.
```
with item in Ctx():
    print(item)
```
The `__for__` method is a generator that produces the items of the loop. In
the example above, `item` would be printed once for each value generated by
`__for__`.
### Use Cases
This would simplify usage of my Timerit and ubelt libraries.
The timerit module defines ``timerit.Timerit``, which is an object that is
iterable. It has an ``__iter__`` method that generates
``timerit.TimerTimer``
objects, which are context managers.
>>> import math
>>> from timerit import Timerit
>>> for timer in Timerit(num=200, verbose=2):
>>>     with timer:
>>>         math.factorial(10000)
The timer context manager measures how much time its body takes by "tic"-ing
on ``__enter__`` and "toc"-ing on ``__exit__``. The underlying object has
access to the context manager, so it is able to read its measurement. These
measurements are stored, and then we compute some statistics on them, notably
the minimum, mean, and standard deviation of the grouped (batched) running
times.
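(For reference, a stripped-down sketch of such a tic/toc context manager; the
real Timerit timer does more bookkeeping than this.)

```
import time

class Timer:
    """Minimal sketch: tic on __enter__, toc on __exit__."""

    def __enter__(self):
        self.tic = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.toc = time.perf_counter()
        self.elapsed = self.toc - self.tic
        return False
```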
Unfortunately the syntax is one line and one indent bulkier than I would
prefer.
However, a more concise version of the syntax is available.
>>> import math
>>> from timerit import Timerit
>>> for _ in Timerit(num=200, verbose=2):
>>>     math.factorial(10000)
In this case the measurement is made in the ``__iter__`` method of the
``Timerit`` object itself, which I believe carries slightly more overhead than
the with-statement version. (I should test to determine if this is the case.)
In the case where it does make a difference, a cool syntax might look like:
>>> import math
>>> from timerit import Timerit
>>> with timer in Timerit(num=200, verbose=2):
>>>     math.factorial(10000)
The other case is my ``ubelt.Cacher`` library. Currently it requires 4
lines of boilerplate syntax.
>>> import ubelt as ub
>>> # Defines a cache name and dependencies, note the use of `ub.hash_data`.
>>> cacher = ub.Cacher('name', cfgstr=ub.hash_data('dependencies'))  # boilerplate:1
>>> # Calling tryload will return your data on a hit and None on a miss
>>> data = cacher.tryload()  # boilerplate:2
>>> # Check if you need to recompute your data
>>> if data is None:  # boilerplate:3
>>>     # Your code to recompute data goes here (this is not boilerplate).
>>>     data = 'mydata'
>>>     # Cache the computation result (pickle is used by default)
>>>     cacher.save(data)  # boilerplate:4
But a conditional ``with`` syntax would reduce boilerplate to 3 lines.
>>> import ubelt as ub
>>> with ub.Cacher('name', cfgstr=ub.hash_data('dependencies')) as cacher:
>>>     data = 'mydata'
>>>     cacher.save(data)
>>> data = cacher.data
I'm sure there are a lot of viable syntax variations, but does the idea of
a conditional or loop aware "with" statement seem like a reasonable
language proposal?
--
-Dr. Jon Crall (him)
Forgive me if this idea has been discussed before, I searched the mailing lists, the CPython repo, and the issue tracker and was unable to find anything.
I have found myself a few times in a position where I have a repeated argument that uses the `append` action, along with some convenience arguments that append a specific const to that same dest (eg: `--filter-x` being made equivalent to `--filter x` via `append_const`). This is particularly useful in cli apps that expose some kind of powerful-but-verbose filtering capability, while also providing shorter aliases for common invocations. I'm sure there are other use cases, but this is the one I'm most familiar with.
The natural extension to this filtering idea are convenience args that set two const values (eg: `--filter x --filter y` being equivalent to `--filter-x-y`), but there is no `extend_const` action to enable this.
While this is possible (and rather straightforward) to add via a custom action, I feel like this should be a built-in action instead. `append` has `append_const`; it seems intuitive and reasonable to expect `extend` to have `extend_const` too (my anecdotal experience the first time I came across this need was that I simply tried using `extend_const` without checking the docs, assuming it already existed). A sketch of the custom-action workaround is below.
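For reference, the custom action I use today looks roughly like this (the class name is mine; a built-in `extend_const` would make it unnecessary):

    import argparse

    class ExtendConstAction(argparse.Action):
        """Extend the destination list with every item in `const`."""

        def __call__(self, parser, namespace, values, option_string=None):
            items = list(getattr(namespace, self.dest, None) or [])
            items.extend(self.const)
            setattr(namespace, self.dest, items)

    parser = argparse.ArgumentParser()
    parser.add_argument('--filter', action='append', default=[])
    parser.add_argument('--filter-x-y', dest='filter', nargs=0,
                        action=ExtendConstAction, const=['x', 'y'])
    print(parser.parse_args(['--filter-x-y']))  # Namespace(filter=['x', 'y'])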
Please see this gist for a working example that may help explain the idea and intended use case more clearly: https://gist.github.com/roganartu/7c2ec129d868ecda95acfbd655ef0ab2
Hi Ilya,
I'm not sure that this mailing list (Python-Dev) is the right place for
this discussion, I think that Python-Ideas (CCed) is the correct place.
For the benefit of Python-Ideas, I have left your entire post below, to
establish context.
[Ilya]
> I needed reversed(enumerate(x: list)) in my code, and have discovered
> that it wound't work. This is disappointing because operation is well
> defined.
It isn't really well-defined, since enumerate can operate on infinite
iterators, and you cannot reverse an infinite stream. Consider:
def values():
    while True:
        yield random.random()

a, b = reversed(enumerate(values()))
What should the first pair of (a, b) be?
However, having said that, I think that your idea is not unreasonable.
`enumerate(it)` in the most general case isn't reversable, but if `it`
is reversable and sized, there's no reason why `enumerate(it)` shouldn't
be too.
My personal opinion is that this is a fairly obvious and straightforward
enhancement, one which (hopefully!) shouldn't require much, if any,
debate. I don't think we need a new class for this, I think enhancing
enumerate to be reversable if its underlying iterator is reversable
makes good sense.
But if you can show some concrete use-cases, especially one or two from
the standard library, that would help your case. Or some other languages
which offer this functionality as standard.
On the other hand, I think that there is a fairly lightweight work
around. Define a helper function:
def countdown(n):
    while True:
        yield n
        n -= 1
then call it like this:
# reversed(enumerate(seq))
zip(countdown(len(seq)-1), reversed(seq))
So it isn't terribly hard to work around this. But I agree that it would
be nice if enumerate encapsulated this for the caller.
One potentially serious question: what should `enumerate.__reversed__`
do when given a starting value?
reversed(enumerate('abc', 1))
Should that yield...?
# treat the start value as a start value
(1, 'c'), (0, 'b'), (-1, 'a')
# treat the start value as an end value
(3, 'c'), (2, 'b'), (1, 'a')
Something else?
My preference would be to treat the starting value as an ending value.
Steven
On Wed, Apr 01, 2020 at 08:45:34PM +0200, Ilya Kamenshchikov wrote:
> Hi,
>
> I needed reversed(enumerate(x: list)) in my code, and have discovered that
> it wound't work. This is disappointing because operation is well defined.
> It is also well defined for str type, range, and - in principle, but not
> yet in practice - on dictionary iterators - keys(), values(), items() as
> dictionaries are ordered now.
> It would also be well defined on any user type implementing __iter__,
> __len__, __reversed__ - think numpy arrays, some pandas dataframes, tensors.
>
> That's plenty of usecases, therefore I guess it would be quite useful to
> avoid hacky / inefficient solutions like described here:
> https://code.activestate.com/lists/python-list/706205/.
>
> If deemed useful, I would be interested in implementing this, maybe
> together with __reversed__ on dict keys, values, items.
>
> Best Regards,
> --
> Ilya Kamen
>
> -----------
> p.s.
>
> *Sketch* of what I am proposing:
>
> class reversible_enumerate:
>
>     def __init__(self, iterable):
>         self.iterable = iterable
>         self.ctr = 0
>
>     def __iter__(self):
>         for e in self.iterable:
>             yield self.ctr, e
>             self.ctr += 1
>
>     def __reversed__(self):
>         try:
>             ri = reversed(self.iterable)
>         except Exception as e:
>             raise Exception(
>                 "enumerate can only be reversed if iterable to "
>                 "enumerate can be reversed and has defined length."
>             ) from e
>
>         try:
>             l = len(self.iterable)
>         except Exception as e:
>             raise Exception(
>                 "enumerate can only be reversed if iterable to "
>                 "enumerate can be reversed and has defined length."
>             ) from e
>
>         indexes = range(l-1, -1, -1)
>         for i, e in zip(indexes, ri):
>             yield i, e
>
> for i, c in reversed(reversible_enumerate("Hello World")):
>     print(i, c)
>
> for i, c in reversed(reversible_enumerate([11, 22, 33])):
>     print(i, c)
I know enhancements to pathlib get brought up occasionally, but it doesn't
look like anyone has been willing to take the initiative and see things
through to completion. I am willing to keep the ball rolling here and even
implement these myself. I have some suggestions and I would like to
discuss them. I don't think any of them are significant enough to require
a PEP. These can be split into independent threads if anyone prefers.
1. copy
The big one people keep bringing up that I strongly agree on is a "copy"
method. This is really the only common file manipulation task that
currently isn't possible. You can make files, read them, move them, delete
them, create directories, even do less common operations like change owners
or create symlinks or hard links.
A common objection is that pathlib doesn't work on multiple paths. But
that isn't the case. There are a ton of methods that do that, including:
* symlink_to
* link_to
* rename
* replace
* glob
* rglob
* iterdir
* is_relative_to
* relative_to
* samefile
I think this is really the only common file operation that someone would
need to switch to a different module to do, and it seems pretty strange to
me to be able to make symbolic or hard links to a file but not straight up
copy one.
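As a sketch, the method could be little more than a thin wrapper over shutil
(the name, signature, and choice of copy vs. copy2 are all up for discussion):

    import pathlib
    import shutil

    def copy(self: pathlib.Path, target, *, follow_symlinks=True):
        # delegate to shutil.copy2 so metadata is preserved, mirroring rename/link_to
        return pathlib.Path(shutil.copy2(self, target, follow_symlinks=follow_symlinks))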
2. recursive remove
This could be a "recursive" option to "rmdir" or a "rmtree" method (I
prefer the option). The main reason for this is symmetry. It is possible
to create a tree of folders (using "mkdir(parents=True)"), but once you do
that you cannot remove it again in a straightforward way.
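Again as a sketch, assuming the "recursive option on rmdir" spelling:

    import pathlib
    import shutil

    def rmdir(path: pathlib.Path, *, recursive=False):
        # with recursive=True fall back to shutil.rmtree; otherwise keep today's behaviour
        if recursive:
            shutil.rmtree(path)
        else:
            path.rmdir()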
3. newline for write_text
This is the only relevant option that "Path.open" has but "Path.write_text"
doesn't, and is a serious omission when dealing with multiple operating
systems.
4. uid and gid
You can get the owner and group name of a file (with the "owner" and
"group" methods), but there is no easy way to get the corresponding
number.
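Today this means going through stat(); the proposal is essentially to expose
these fields directly (the property names below are only placeholders):

    from pathlib import Path

    p = Path('some_file')
    uid = p.stat().st_uid  # what a hypothetical p.uid could return
    gid = p.stat().st_gid  # what a hypothetical p.gid could return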
5. Stem with no suffixes
The stem property only takes off the last suffix, but even in the example
given ('my/library.tar.gz') it isn't really useful because the suffix has
two parts ('.tar' and '.gz'). I suggest another property, probably
called "rootstem"
or "basestem", that takes off all the suffixes, using the same logic as the
"suffixes" property. This is another symmetry issue: it is possible to
extract all the suffixes, but not remove them.
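A sketch of the proposed property, reusing the logic of "suffixes" (again, the
name is only a suggestion):

    from pathlib import PurePath

    def rootstem(path: PurePath) -> str:
        # strip every suffix: 'my/library.tar.gz' -> 'library'
        suffixes = ''.join(path.suffixes)
        name = path.name
        return name[:len(name) - len(suffixes)] if suffixes else name

    print(rootstem(PurePath('my/library.tar.gz')))  # library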
6. with_suffixes
Equivalent to with_suffix, but replacing all suffixes. Again, this is a
symmetry issue. It is hard to manipulate all the suffixes right now, as
the example shows. You can add them or extract them, but not change them
without doing several steps.
7. exist_ok for is_* methods
Currently all the is_* methods (such as is_file) return False if the file
doesn't exist or if it is a broken symlink. This can be dangerous, since
it is not trivially easy to tell if you are dealing with the wrong type of
file vs. a missing file. And it isn't obvious behavior just from the
method name. I suggest adding an "exist_ok" argument to all of these, with
the default being "True" for backwards-compatibility. This argument name
is already in use elsewhere in pathlib. If this is False and the file is
not present, a "FileNotFoundError" is raised.
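Sketched as a standalone helper (the real change would of course go on the
methods themselves):

    from pathlib import Path

    def is_file(path: Path, *, exist_ok: bool = True) -> bool:
        # proposed semantics: with exist_ok=False a missing file raises instead of returning False
        if not exist_ok and not path.exists():
            raise FileNotFoundError(path)
        return path.is_file()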
Consider the following example:
import unittest

def foo():
    for x in [1, 2, 'oops', 4]:
        print(x + 100)

class TestFoo(unittest.TestCase):
    def test_foo(self):
        self.assertIs(foo(), None)

if __name__ == '__main__':
    unittest.main()
If we were calling `foo` directly we could enter post-mortem debugging via `python -m pdb test.py`.
However since `foo` is wrapped in a test case, `unittest` eats the exception and thus prevents post-mortem debugging. `--failfast` doesn't help, the exception is still swallowed.
Since I am not aware of a solution that enables post-mortem debugging in such a case (without modifying the test scripts, please correct me if one exists), I propose adding a command-line option to `unittest` for [running test cases in debug mode](https://docs.python.org/3/library/unittest.html#unittest.TestCase.deb…) so that post-mortem debugging can be used.
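Such an option could be a thin layer over the existing debug() machinery. A rough sketch of what it could do internally (this is a hypothetical driver, not an existing flag; `TestSuite.debug()` re-raises the exception instead of recording it, so pdb can take over):

    import pdb
    import sys
    import unittest

    # hypothetical driver: run the suite in debug mode so exceptions propagate
    suite = unittest.defaultTestLoader.discover('.')
    try:
        suite.debug()
    except Exception:
        pdb.post_mortem(sys.exc_info()[2])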
P.S.: There is also [this SO question](https://stackoverflow.com/q/4398967/3767239) on a similar topic.
Hi all,
I do not know, maybe it was already discussed... but a toolchain like LLVM is very mature, and it could provide a simple route to JIT compilation to machine code, which would improve the performance of Python a lot!