There is no lossless way to encode the information
to unicode. The argument that you know the encoding
the data is coming from is a fallacy. The argument that
data is always correct is a fallacy as well. So:
1. external data encoding is unknown or varies
2. external data has binary chunks that are invalid for
conversion to unicode
In the real world you have to deal with broken and invalid
output, and crashing with UnicodeDecodeError is not an option.
The unicode() constructor offers two options to
deal with invalid output:
1. ignore - meaning skip and corrupt the data
2. replace - just corrupt the data
The solution is to have a filter preprocess the binary
string to escape all non-unicode symbols so that the
following lossless transformation becomes possible:
binary -> escaped utf-8 string -> unicode -> binary
How to accomplish that with Python 2.x?
This is critical for porting SCons to Python 3.x, and I
expect for other such tools too.
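For comparison, Python 3 ships an error handler, "surrogateescape" (PEP 383), that gives this kind of round trip by mapping each undecodable byte to a lone surrogate code point. The sketch below assumes Python 3 and UTF-8 as the nominal encoding, and the sample bytes are invented. It does not answer the 2.x question, where that handler is not built in.

```python
# Sketch only: requires Python 3, where "surrogateescape" is a built-in
# error handler. The sample data is invented for illustration.
raw = b"valid text \xff\xfe then more"   # \xff and \xfe are not valid UTF-8

# Undecodable bytes become lone surrogates (U+DC80..U+DCFF) instead of raising.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns those surrogates back into the bytes.
restored = text.encode("utf-8", errors="surrogateescape")

print(restored == raw)
```

The intermediate string is not valid for a strict encode, so it has to be written back out with the same handler.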
Disclaimer: I’m not the author of jsonschema (https://github.com/Julian/jsonschema), but as a user I think that users of the standard library (and potentially areas of the standard library itself) could benefit from its addition to the standard library.
I’ve been using jsonschema for the better part of a couple of years now and have found it not only invaluable, but flexible across the variety of applications it has. Personally, I generally use it for HTTP response validation when dealing with RESTful APIs and for system configuration input validation. For those not familiar with the package:
RFC draft: https://tools.ietf.org/html/draft-zyp-json-schema-04
Proposed addition implementation: https://github.com/Julian/jsonschema
Coles notes stats:
Has been publicly available for over a year: v0.1 released Jan 1, 2012, currently at 2.4.0 (released Sept 22, 2014)
Heavily used by the community: Currently sees ~585k downloads per month according to PyPI
I’ve reached out to the author to express my interest in authoring a PEP to have the module included, and to gauge his interest in assisting with maintenance as needed during the integration period (or following). I’d also be personally interested in supporting it as part of the stdlib.
My question is: Is there any reason up front anyone can see that this addition wouldn’t fly, or are others interested in the addition as well?
What do you think about using the CMake build system?
I see advantages such as:
- Supported in the CLion IDE (amazing C/C++ IDE, breakpoints, etc.);
- Simple and easy to use (Zen of Python :)
I was actually following a discussion on python-committers about Windows 7
buildbots failing. I found that someone already had the same idea, but I don't
know if it was shared here: http://www.vtk.org/Wiki/BuildingPythonWithCMake
Please share your thoughts.
I'm tired of getting bug reports like this one:
where the issue is just that the user didn't see deprecation warnings,
so I just filed a bug report requesting that the interactive Python
REPL start printing DeprecationWarnings when users use deprecated
In the bug report it was pointed out that this was discussed on
python-ideas a few months ago, and the discussion petered out without
As far as I can tell, though, there were only two real objections
raised in that previous thread, and IMO neither is really convincing.
So let me pre-empt those now:
Objection 1: This will cause the display of lots of unrelated warnings.
Response: You misunderstand the proposal. I'm not suggesting that we
display *all* DeprecationWarnings whenever the interactive interpreter
is running; I'm only suggesting that we display the deprecation
warnings that are warning about *code that was actually typed at the
# not this
So for example, if we have
warnings.warn("stop it!", DeprecationWarning, stacklevel=2)
>>> import module1, module2
# This doesn't print a warning, because 'foo' is not deprecated
# it merely uses deprecated functionality, which is not my problem,
# because I am merely a user of module1, not the author.
# This *does* print a warning, because now I am using the
# deprecated functionality directly.
__main__:1: DeprecationWarning: stop it!
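One way to express the proposed rule with the existing warnings machinery is a filter keyed on the module a warning is attributed to. The sketch below is an assumption about how that could look, not the interpreter's actual implementation. The helper name shown_for is hypothetical, and warn_explicit is used only to simulate which module the warning is attributed to.

```python
import warnings

def shown_for(module_name):
    """Return True if a DeprecationWarning attributed to module_name is displayed."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Hide DeprecationWarning in general ...
        warnings.simplefilter("ignore", DeprecationWarning)
        # ... but show it when attributed to code typed at the prompt, which
        # runs in __main__. Filters added later are checked first.
        warnings.filterwarnings("default", category=DeprecationWarning,
                                module=r"__main__$")
        # warn_explicit states the attributed module directly; in a real call
        # this is what stacklevel=2 ends up deciding.
        warnings.warn_explicit("stop it!", DeprecationWarning,
                               filename="<stdin>", lineno=1,
                               module=module_name, registry={})
    return len(caught) == 1

print(shown_for("__main__"))   # deprecated call typed at the prompt
print(shown_for("module2"))    # deprecated call made from inside a library
```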
Objection 2: There are lots of places that code is run interactively
besides the standard REPL -- there's IDLE, IPython, etc.
Response: Well, this isn't really an objection :-). Basically I'm
looking for consensus from the CPython team that this is what should
happen in the interactive interpreters that they distribute. Other
interfaces can then follow that lead or not. (For some value of
"follow". By the time you read this IPython may have already made the
change: https://github.com/ipython/ipython/pull/8480 ;-).)
So, totally awesome idea, let's do it, yes/yes?
Nathaniel J. Smith -- http://vorpus.org
Context: A bunch of my students will be working with me (if all goes
according to plan!!) to hack on/in CPython sources.
One of the things we would like to try is a framework for CS101 [Intro to
So for example beginners get knocked out by None 'disappearing' from the
>>> import sys
>>> sys.displayhook = print
Now of course one can say: "If you want that behavior, set it as you choose"
However at the stage that beginners are knocked down by such, setting up a
pythonstartup file is a little premature.
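To make the effect concrete: the default hook, available as sys.__displayhook__, writes nothing for None, while print always writes something. The sketch below captures what each hook emits; the helper name shown_by is made up for illustration.

```python
import contextlib
import io
import sys

def shown_by(hook, value):
    """Return the text that a display hook writes to stdout for value."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        hook(value)
    return buffer.getvalue()

# The default hook stays silent for None, which is what surprises beginners.
default_output = shown_by(sys.__displayhook__, None)
# With print as the hook, None is echoed like any other value.
print_output = shown_by(print, None)

print(repr(default_output), repr(print_output))
```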
So the idea (inspired by Scheme's racket) is to have a sequence of
They are like concentric rings, the innermost one being the noob ring, the
outermost one being standard python.
Now note that while the larger changes would in general be restrictions, i.e.
subsetting standard python, they may not be easily settable in
e.g. sorted function and sort method confusion
extend/append/etc mutable methods vs immutable '+'
Now different teachers may like to navigate the world of python differently.
So for example I prefer to start with the immutable (functional) subset and
go on to the stateful/imperative. The point here is not which is
preferable, but that a given teacher should have the freedom to
chart out a course through python in which (s)he can cross out certain
features at certain points for students. So a teacher preferring to
emphasise OO/imperative over functional may prefer the opposite choice.
[Aside: ACM curriculum 2013 juxtaposes OO and FP as absolute basic in core
So the idea is to make a framework for teachers to easily configure and
select teachpacks to their taste.
How does that sound?
In the PyData community, we really like method chaining for data analysis
(iris.query('SepalLength > 5')
.assign(SepalRatio = lambda x: x.SepalWidth / x.SepalLength,
PetalRatio = lambda x: x.PetalWidth / x.PetalLength)
.plot(kind='scatter', x='SepalRatio', y='PetalRatio'))
Unfortunately, method chaining isn't very extensible -- short of monkey
patching, every method we want to use has to exist on the original object. If
a user wants to supply their own plotting function, they can't use method
You may recall that we brought this up a few months ago on python-ideas as
an example of why we would like macros.
To get around this issue, we are contemplating adding a pipe method to
pandas DataFrames. It looks like this:
def pipe(self, func, *args, **kwargs):
pipe_func = getattr(func, '__pipe_func__', func)
return pipe_func(self, *args, **kwargs)
We would encourage third party libraries with objects on which method
chaining is useful to define a pipe method in the same way.
The main idea here is to create an easy way for users to do method chaining
with their own functions and with functions from third party libraries.
The business with __pipe_func__ is more magical, and frankly we aren't sure
it's worth the complexity. The idea is to create a "pipe protocol" that
allows functions to decide how they are called when piped. This is useful
in some cases, because it doesn't always make sense for functions that act
on piped data to accept that data as their first argument.
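A minimal, pandas-free sketch of how the protocol would behave. The Table class and the scale and describe functions are hypothetical stand-ins; only the body of pipe is taken from the proposal above.

```python
class Table:
    """Hypothetical stand-in for a DataFrame-like object."""

    def __init__(self, rows):
        self.rows = list(rows)

    def pipe(self, func, *args, **kwargs):
        # Same body as the proposed method: prefer __pipe_func__ if present.
        pipe_func = getattr(func, '__pipe_func__', func)
        return pipe_func(self, *args, **kwargs)


def scale(table, factor):
    """Ordinary function: takes the piped object as its first argument."""
    return Table(value * factor for value in table.rows)


def describe(label, table):
    """Function whose natural signature puts the data second."""
    return "%s: %r" % (label, table.rows)

# Tell pipe() how to call describe when the data arrives first.
describe.__pipe_func__ = lambda table, label: describe(label, table)

result = Table([1, 2, 3]).pipe(scale, 10).pipe(describe, "scaled")
print(result)
```

Here scale needs no special support, while describe opts in through __pipe_func__ because its data argument is not the first one.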
For more motivation and examples, please read the opening post in this
GitHub issue: https://github.com/pydata/pandas/issues/10129
Obviously, this sort of protocol would not be an official part of the
Python language. But because we are considering creating a de-facto
standard, we would love to get feedback from other Python communities that
use method chaining:
1. Have you encountered or addressed the problem of extensible method
2. Would this pipe protocol be useful to you?
3. Is it worth allowing piped functions to override how they are called by
defining something like __pipe_func__?
Note that I'm not particularly interested in feedback about how we
shouldn't be defining double underscore methods. There are other ways we
could spell __pipe_func__, but double underscores seem to be pretty
standard for ad-hoc protocols.
Thanks for your attention.
Hey, author here, thanks a lot Demian for even suggesting such a thing :).
I'm really glad that people have found jsonschema useful.
I actually tend these days to think similarly to what Nick mentioned, that
the standard library really has decreased in importance as pip has shaped
up and now been bundled -- so overall my personal opinion is that I
wouldn't be pushing to get jsonschema in -- but! If you felt
strongly, just some brief answers -- I think jsonschema would be able to
cope with more restricted release cycles.
And there are a few areas that I don't like about jsonschema (some APIs)
which eventually I'd like to fix (RefResolver in particular), but for the
most part I think it has stabilized more or less.
I can provide some more details if there's any interest.
Thanks again for even proposing such a thing :)
On Thu, May 21, 2015 at 2:15 AM, <python-ideas-request(a)python.org> wrote:
> Date: Thu, 21 May 2015 19:15:20 +1000
> From: Nick Coghlan <ncoghlan(a)gmail.com>
> To: Paul Moore <p.f.moore(a)gmail.com>
> Cc: Demian Brecht <demianbrecht(a)gmail.com>, Python-Ideas
> Subject: Re: [Python-ideas] Adding jsonschema to the standard library
> On 21 May 2015 at 17:57, Paul Moore <p.f.moore(a)gmail.com> wrote:
> > On 21 May 2015 at 06:29, Demian Brecht <demianbrecht(a)gmail.com> wrote:
> >> Has been publicly available for over a year: v0.1 released Jan 1, 2012,
> >> currently at 2.4.0 (released Sept 22, 2014)
> >> Heavily used by the community: Currently sees ~585k downloads per month
> >> according to PyPI
> > One key question that should be addressed as part of any proposal for
> > inclusion into the stdlib. Would switching to having feature releases
> > only when a new major Python version is released (with bugfixes at
> > minor releases) be acceptable to the project? From the figures you
> > quote, it sounds like there has been some rapid development, although
> > things seem to have slowed down now, so maybe things are stable
> > enough.
> The other question to be answered these days is the value bundling
> offers over "pip install jsonschema" (or a platform specific
> equivalent). While it's still possible to meet that condition, it's
> harder now that we offer pip as a standard feature, especially since
> getting added to the standard library almost universally makes life
> more difficult for module maintainers if they're not already core
> I'm not necessarily opposed to including JSON schema validation in
> general or jsonschema in particular (I've used it myself in the past
> and think it's a decent option if you want a bit more rigor in your
> data validation), but I'm also not sure how large an overlap there
> will be between "could benefit from using jsonschema", "has a
> spectacularly onerous package review process", and "can't already get
> jsonschema from an approved source".
> Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
I just heard about PEP 479, and I want to prepare my open-source projects
I have no problem changing the code so it won't depend on StopIteration to
stop generators, but I'd also like to test it in my test suite. In Python
3.5 I could use `from __future__ import generator_stop` so the test would
be real (i.e. would fail wherever I rely on StopIteration to stop a
generator). But I can't really put this snippet in my code because then it
would fail on all Python versions below 3.5.
This makes me think of two ideas:
1. Maybe we should allow `from __future__ import whatever` in code, even if
`whatever` wasn't invented yet, and simply make it a no-op? This wouldn't
help now but it could prevent these problems in the future.
2. Maybe introduce a way to do `from __future__ import generator_stop`
without including it in code? Maybe a flag to the `python` command? (If
something like this exists please let me know.)
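For reference, the kind of code PEP 479 affects looks like the sketch below. It assumes an interpreter where the new behaviour is active (via the future import on 3.5, or by default on Python 3.7 and later); the function names are made up for illustration.

```python
def first_of_each_leaky(iterables):
    """Relies on a stray StopIteration from next() to end the generator."""
    for iterable in iterables:
        yield next(iter(iterable))  # raises StopIteration on an empty iterable


def first_of_each(iterables):
    """Ends the generator explicitly, so it behaves the same under PEP 479."""
    for iterable in iterables:
        try:
            yield next(iter(iterable))
        except StopIteration:
            return


data = [[1, 2], [], [3]]
print(list(first_of_each(data)))

try:
    leaky_result = list(first_of_each_leaky(data))
except RuntimeError as exc:
    # Under PEP 479 the leaked StopIteration is converted to RuntimeError.
    leaky_result = "RuntimeError: %s" % exc
print(leaky_result)
```

A test suite run on such an interpreter would fail on the leaky variant and pass on the explicit one.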
I'm envisioning "unless" as a synonym for "if not (...):". Currently I use
if .... :
N.B.: This isn't extremely important as there are already two ways to
accomplish the same purpose, but it would be useful, seems easy to
implement, and is already used by many other languages. The advantage
is that when the condition is long it simplifies understanding.