On Mon, 26 Aug 2019 at 09:47, Richard Musil <risa2000x@gmail.com> wrote:
I gave it some thought over the weekend and came to the conclusion that I am not going to go further with it (where "it" means my or anyone else's idea). The reason is that I totally lost any motivation. I however feel some more elaborate answer might be due and I will try to give one.
The other day (actually before I posted my last reply), I went to the core-mentorship list to get some ideas about how to continue. There was this thread about how people got their first contribution and while most were positive there was one post which kind of stood out because it described an unsuccessful attempt which finally led to parting ways. I realized there is no shame in that.
I came here with some rough idea about the JSON module features, but had no clue what the "real" use cases are, what people's expectations are, etc. This thread actually helped me to get more of the understanding and the insight. I thought I had a nice feature in mind, and was wondering what it would take to get it into Python. On the other hand, I did not have any other particular ambitions, like becoming a Python contributor.
Thanks for the feedback, it's both interesting and valuable. Maybe you would be willing to add a pointer to this comment to that discussion on core-mentorship? I'm sure it would be useful information for the people looking to try to remove barriers to entry. And the point that you made, that you weren't coming here with an ambition to become a contributor in any more general sense, is also very relevant, as it's quite possible that people coming here with nothing more than an idea that they'd like to propose may well be put off by the feeling that they have to implement their idea or no-one is willing to listen. (There *is* in reality a problem in that many ideas are fine but pointless unless someone implements them, but that doesn't mean we should block people from just discussing things in the abstract).
During the discussion I realized that there were 3 aspects (of the potential acceptable solution), proposed by 3 different persons, about which they were quite imperative:
1) It must use Decimal (Paul)
2) It must check validity of serialized output (Christopher)
3) It must avoid unconditional import of Decimal (Andrew)
A summary like this is immensely helpful in clarifying both where the discussion has got to, and what the sticking points are. I don't think we do enough on this list to encourage or offer such summaries, or to help new contributors to think in terms of checkpointing the discussion like this. We get so stuck in the technical discussion that we forget that people may *also* need help in the softer skills, like managing the discussion. So we keep throwing ideas into the mix until the contributor is overwhelmed and doesn't know how to proceed.
Originally, I thought that I could fulfill 2) and 3) without jeopardizing 1) (I have already expressed my opinion on 1), so I implemented the Python part and ran some performance tests, only to find out that my solution cannot compete in performance with the Decimal solution because of the additional validity check, so I could no longer promote it. I am not particularly convinced that the validity check is really needed, but I understand why others are requesting it.
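For readers who want a concrete picture of the validity check mentioned above, here is a minimal sketch. It is not the code from the patch. It assumes the check means matching the serialized text against the JSON number grammar of RFC 8259, and the names `JSON_NUMBER`, `is_json_number` and `dump_number` are hypothetical.

```python
import re
from decimal import Decimal

# JSON number grammar (RFC 8259): optional minus, integer part without
# leading zeros, optional fraction, optional exponent.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text):
    """Return True if text is a syntactically valid JSON number."""
    return JSON_NUMBER.fullmatch(text) is not None


def dump_number(value):
    """Hypothetical helper: serialize value with str() and validate the result."""
    text = str(value)
    if not is_json_number(text):
        raise ValueError(f"{text!r} is not a valid JSON number")
    return text


# Trailing zeros survive because the text never passes through float.
print(dump_number(Decimal("0.10")))
```

The extra regular-expression match on every number is the kind of per-value cost that a plain isinstance check against Decimal would not pay.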
So the only way to continue seemed to be implementing 1+2+3, and I realized I really did not want to do it. One reason was that I did not particularly "like" it. This is not meant to say that I thought it was wrong to do it this way; I just did not really feel invested in those ideas anymore. The other was that I was no longer able to argue about it, because I had basically no idea whether the users really need a full validity check, or whether the cost of a one-time import of Decimal really outweighs the performance hit of the heuristic for a lazy import, and I had to rely on what someone claimed on some mailing list (no disrespect meant).
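One possible shape for the lazy-import heuristic mentioned above is sketched here. This is an assumption about what such a heuristic could look like, not necessarily what was proposed in the thread: it looks the module up in `sys.modules` instead of importing it, and the function name `is_decimal` is hypothetical.

```python
import sys


def is_decimal(obj):
    """Check for a Decimal instance without importing decimal ourselves."""
    decimal_module = sys.modules.get("decimal")
    if decimal_module is None:
        # Nobody has imported decimal yet, so obj cannot be a Decimal.
        return False
    return isinstance(obj, decimal_module.Decimal)


print(is_decimal(1.5))
```

The trade-off is a dictionary lookup on every call against the one-time cost of importing decimal up front.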
I agree, there's a real risk here that proposals get overwhelmed by additional requirements suggested by other people. And when those other people are long-time contributors, or even more so core devs, it's extremely hard for a newcomer to say that they don't think that such a requirement is necessary. But as you have demonstrated here, those requirements are sometimes mutually incompatible, and at some point, someone has to make a judgement call on the trade-offs. Again, subjecting a newcomer to the need to do that right up front isn't exactly fair or helpful.
I also realized that implementing this would not give me any advantage over using simplejson, either in performance or in features, so it also lost the practical aspect of needing it.
Fundamentally, this is where a disconnect between people's expectations occurs. Ideas discussed on this list are intended to be for implementation in future versions of Python - so it's pretty much always the case that anything agreed here will be of no immediate use to the individual proposing it. Historically, Python releases have been every 18 months, so even if we assume that the proposer is an extremely early adopter of new releases, and has no need to support older versions of Python, we're talking about 12-18 months before they can use a new feature. Compare that to *right now* for a PyPI package or a workaround in code. But people *think* of ideas because they hit a problem of their own. And they come here out of a sense that sharing a possible solution would help the community. It's not very encouraging if they get treated as if they are simply saying "solve my problem for me". We probably need to get better at helping such people to polish their ideas, *without* focusing too heavily on their original problem (or worse still, on criticising their original solution of that problem). The original problem is the initial use case for a new feature, of course, but focusing on "how does your original problem generalise, so we can see what common features a solution should have" rather than on "why do you think your problem is important enough to need solving in the core language/stdlib" (I exaggerate somewhat, but sadly I suspect not a lot :-() would be a much more welcoming approach.
So I guess I am going to leave my patch on github for a while, in case anyone decides to go ahead with 1+2+3. It is not exactly rocket science, but it could save some typing, or help if you want to run some quick benchmark. If you supply it with dump_as_number=Decimal, it would behave exactly as the version with hardcoded Decimal (sans lazy import). One thing to note: if you choose to use Decimal for validating JSON numbers, you will need to handle the case where allow_nan is False, and check that Decimal does not serialize them (it does in simplejson as there is no check). It should not have a big impact though, as allow_nan is True by default.
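The allow_nan caveat above can be illustrated with a small sketch. It assumes a hypothetical helper, `decimal_to_json_number`, standing in for whatever the encoder would do internally. Only `Decimal.is_finite()` and `str()` are real APIs here.

```python
from decimal import Decimal


def decimal_to_json_number(value, allow_nan=True):
    """Hypothetical helper: render a Decimal as JSON number text."""
    if not value.is_finite() and not allow_nan:
        # Mirror what json.dumps does for float NaN/Infinity when
        # allow_nan=False: refuse rather than emit non-standard JSON.
        raise ValueError(f"{value!r} is not JSON compliant")
    # For a quiet NaN and the infinities, str() yields 'NaN', 'Infinity'
    # and '-Infinity'; finite values keep their full precision.
    return str(value)


print(decimal_to_json_number(Decimal("1.10")))
print(decimal_to_json_number(Decimal("NaN")))
```

Without the is_finite() test, a Decimal NaN would pass straight through to the output even when the caller asked for strict JSON.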
Thanks. Even if your PR doesn't ultimately get accepted, the discussion was useful, and highlighted the fact that we can't currently write full-precision Decimal values using native JSON (we can round-trip them using custom encoders and object_hook, but that's a non-standard layer on top of base JSON). Sometimes these things take a few rounds of discussion before getting accepted (again, the long-term view is important here). Thanks for both the proposal and subsequent thread, and for the helpful and thought-provoking summary post. Please don't be *too* put off from coming back with any future ideas you may have!

Paul

PS Your discussion of the 3 constraints people were asking for and how you viewed them and tried to address them, made me think of some other possible approaches that might be productive. But as you've said you don't want to take the proposal further, and I think that's an entirely reasonable position for you to take, I won't push you by re-opening the debate right now. But I'll keep the thoughts in mind for if someone else wants to take the proposal further.