Evolving the Standard Library
Hi everybody,
I'm known for my dislike of the standard library. I have already written some posts about this topic on my personal blog. However, as many people have pointed out, a blog is not the place for this kind of criticism, and just ranting about a topic does not help at all. Yesterday I subscribed to stdlib-sig and immediately tons of mails ended up in my inbox. A quick look at the mail archives confirms what I was afraid of: this list is really high traffic. I tried to read up on some of the discussions I missed, but it's nearly impossible to do that.
I would love to sum up my thoughts about the standard library here, along with my ideas to improve it. This list of ideas and improvements does not include any unrealistic plans such as rewriting the standard library, an approach I used to be a big fan of.
I can see a couple of problems with the standard library currently, and some reasons why that is the case. If we look back at the history of Python, it's obvious that a large number of modules in the standard library grew out of the needs of a single developer or company a while ago. Many of these libraries finally disappeared or were renamed in the big standard library reorganization in Python 3, and I'm very happy that this happened. At the same time, however, a large number of modules continue to show their age.
Python is currently heading in a new direction that many people would not have thought about a few years ago: web applications. Different rules apply to web applications than to desktop applications. Command line scripts or GUI applications are mostly fine with shared state at module level; web applications are not. It is true that Python currently has some issues with high concurrency, and people try to work around that by forking and spawning new processes, which certainly hides the problem of shared state but does not solve it.
In fact, very recently Facebook open sourced the Tornado framework, which does very well at high concurrency by using async IO. This recent interest in Tornado will probably also motivate the Twisted developers to improve their project's documentation and performance, because competition is often what causes projects to improve.
Now if we look at the standard library, we can see many modules that just do not work in such environments because they have some sort of shared state. The most obvious ones are certainly the `locale` module and all the other modules that change behavior based on the locale settings. Did you know that every major Python framework reimplements time formatting even for something as simple as HTTP headers, because Python does not provide a way to format the time to English strings reliably? But there are certainly more modules that have this sort of problem.
We also have many modules in the standard library that in my opinion just do not belong there. From my point of view, stuff like XML does not belong in the standard library. But it appears that not many people agree with me on this one. And even if everybody did, backwards compatibility would still be a good reason to keep these modules around.
Besides modules that do not work in every environment or modules that were probably a mistake to include, we also have modules in the standard library with a hideous implementation or no reusability, forcing people to reinvent what's already there. For a long time, `urllib` was a module I would have listed there, but as of Python 2.6 the module has improved considerably by exposing more of the underlying socket, which finally allows us to set the timeout in a reliable way. But there are still a ton of modules in the library that cause trouble for people. `dis` is one of them. The implementation of `dis` prints to stdout no matter what you do.
Of course you can replace `sys.stdout` with something else for a brief moment, but again: this is not something we should aim for or advertise, because it breaks for many people.
`Cookie` is a module people monkey patched for a while (badly) to support the HttpOnly flag. Not only does the code expose a weird API, it is also nearly impossible to extend, and it even ships cookie subclasses that use unsigned pickles and trust the client. `cgi`, again, has shared state in the global namespace that alters the behavior of the library. Of course it was never intended to be used by anything but `cgi`, but that leaves people reimplementing it or abusing it.
So when the discussion started about replacing `optparse` with `argparse` because the former is unmaintained, I became alarmed. My wish has always been for the standard library to be a reliable fallback to be used if everything else fails: something I can rely on which will not change, except for maybe some additions or modules moved to different locations. As Python developers we have become used to import locations moving a lot, be it `cPickle` or any of the element tree implementations, you name it.
I wonder if the solution to this problem wouldn't be a largely improved packaging system and some sort of standardized reviewing process for the standard library. Currently there is not even an accepted style for modules ending up in the Python distribution.
That, and a group of people dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand; I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
Ubuntu recently started the "100 paper cuts" project. There, people work on tiny little patches to improve the system rather than replacing components.
Even though a large part of the standard library appears to be broken by design, it could still be redesigned on a small scale without breaking backwards compatibility.
Of course libraries like `locale` and `logging` are hard to change, but it would still be possible. For `locale` it would probably be a useful idea to go in the direction of `datetime`, where the timezone information is left to a third-party library. `locale` could provide some hooks for libraries like `babel` to fill the gap. On the other hand, `Cookie` would be very easy to fix by moving the parsing code into a separate function and refactoring the cookie objects.
We could probably also start a poll with well-selected questions about what users think of parts of the library. For that poll it would make a lot of sense not just to ask the questions and evaluate the results, but also to track the area the user is coming from (small company, open / closed source, web development etc.), because we are all biased and seeing results grouped by some of these factors could be enlightening. That said, it could tell us that I'm completely wrong with my ideas about the state of the standard library.
But how realistic is it to refactor the standard library? I don't know. For a long time people were pretty sure Python would not get any faster, and yet Unladen Swallow is making some really amazing progress.
If we want to push Python forward into new areas, and the web is one of them, it is necessary to jump into the cold water and start things.
And maybe we should have some elected task forces for things like the standard library. Judging from the mailing list, it appears that far too many people are discussing *every detail* of it. It is a good idea to ask as many people as possible, but I am not sure if the mailing list is the way to do that. It is currently very hard to see the direction in which development is heading.
Please think of this email just as a suggestion.
I don't have too much trust in myself to follow the discussions on this list calmly enough to become a real part of a solution, but I would love to help shift development in a better direction, no matter which one it will be.
Regards, Armin
On Wed, Sep 16, 2009 at 8:19 AM, Armin Ronacher <armin.ronacher@active-4.com> wrote:
Hi everybody,
I'm known for my dislike of the standard library. I have already written some posts about this topic on my personal blog. However, as many people have pointed out, a blog is not the place for this kind of criticism, and just ranting about a topic does not help at all. Yesterday I subscribed to stdlib-sig and immediately tons of mails ended up in my inbox. A quick look at the mail archives confirms what I was afraid of: this list is really high traffic. I tried to read up on some of the discussions I missed, but it's nearly impossible to do that.
I think that's out of the norm... the list has been quiet until recently. :-) [snip]
And maybe we should have some elected task forces for things like the standard library. Judging from the mailing list, it appears that far too many people are discussing *every detail* of it. It is a good idea to ask as many people as possible, but I am not sure if the mailing list is the way to do that. It is currently very hard to see the direction in which development is heading.
Please think of this email just as a suggestion. I don't have too much trust in myself to follow the discussions on this list calmly enough to become a real part of a solution, but I would love to help shift development in a better direction, no matter which one it will be.
Just a thought: PyCon 2010 is around the corner. Open space? Something more formalized? Is that too late (because we will have lost momentum)? -John
John Szakmeister wrote:
Just a thought: PyCon 2010 is around the corner. Open space? Something more formalized? Is that too late (because we will have lost momentum)?
It looks like it will be something covered at the language summit, but an open space is a good idea. Backwards compatibility is a *big* problem for any major refactoring though. Michael
-John _______________________________________________ stdlib-sig mailing list stdlib-sig@python.org http://mail.python.org/mailman/listinfo/stdlib-sig
On Wed, Sep 16, 2009 at 9:18 AM, Michael Foord <michael@voidspace.org.uk> wrote:
It looks like it will be something covered at the language summit, but an open space is a good idea. Backwards compatibility is a *big* problem for any major refactoring though.
Michael
Yup, language summit. I'm hoping to cover some amount of this in a pending talk proposal I have in the pycon system too.
On Wed, Sep 16, 2009 at 09:28:35AM -0400, Jesse Noller wrote:
Yup, language summit. I'm hoping to cover some amount of this in a pending talk proposal I have in the pycon system too.
One interesting thought for backwards compatibility -- why not take all of the PyPI packages and try importing and/or testing them across versions, and then trying to build automatic classifiers to highlight the "interesting" breakages? A first pass filter would be "breakages we know about" vs "breakages we don't." Then you could build these breakages into a compatibility diagnostics package. Sounds like a fun PyCon sprint to me... --titus -- C. Titus Brown, ctb@msu.edu
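As a rough illustration of the "breakages we know about" vs "breakages we don't" filter described above, here is a minimal sketch. It only tries importing modules in the current process; the names `classify_imports` and `KNOWN_BREAKAGES`, and the module names in the example, are hypothetical and not part of any existing tool. A real harness would also need the isolation and environment resets discussed elsewhere in this thread.

```python
import importlib

# Hypothetical list of modules already known to fail on this Python version.
KNOWN_BREAKAGES = frozenset({"hypothetical_removed_module"})


def classify_imports(module_names, known_breakages=KNOWN_BREAKAGES):
    """Try importing each module and sort the outcome into three buckets."""
    report = {"ok": [], "known": [], "unknown": []}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # any failure during import counts as a breakage
            bucket = "known" if name in known_breakages else "unknown"
            report[bucket].append((name, type(exc).__name__))
        else:
            report["ok"].append(name)
    return report


report = classify_imports(
    ["json", "hypothetical_removed_module", "another_missing_module_xyz"]
)
for bucket, entries in report.items():
    print(bucket, entries)
```

Only the "unknown" bucket would need human attention, which is the point of the first-pass filter.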
On Wed, Sep 16, 2009 at 4:43 PM, C. Titus Brown <ctb@msu.edu> wrote:
One interesting thought for backwards compatibility -- why not take all of the PyPI packages and try importing and/or testing them across versions, and then trying to build automatic classifiers to highlight the "interesting" breakages? A first pass filter would be "breakages we know about" vs "breakages we don't."
Then you could build these breakages into a compatibility diagnostics package.
Sounds like a fun PyCon sprint to me...
I don't know if you remember my message on the snakebite mailing list some time ago on a related topic.
That's the same process I would like to use to test distutils over PyPI: grab packages there, run some commands using their setup.py, and say "this package is Distutils certified!"
But this requires some work to make sure there are no security problems I/O-wise, unless you work with a list of trusted packages (which is not what we would want if we want to do QA tests).
And the environment has to be reset after each run to make sure there are no problems created by the package.
Quite a bit of work, but I am in for some brainstorming at PyCon on this topic if you are interested :)
On Wed, Sep 16, 2009 at 05:43:18PM +0200, Tarek Ziadé wrote:
Quite a bit of work, but I am in for some brainstorming at PyCon on this topic if you are interested :)
yep, absolutely! I think I've got the execution and reporting end handled; now we just need to get some virtual environments running so we can do untrusted packages. Hmm, might be worth an AWS account just to do the basic stuff. --titus -- C. Titus Brown, ctb@msu.edu
On Wed, Sep 16, 2009 at 5:54 PM, C. Titus Brown <ctb@msu.edu> wrote:
Quite a bit of work, but I am in for some brainstorming at PyCon on this topic if you are interested :)
yep, absolutely! I think I've got the execution and reporting end handled; now we just need to get some virtual environments running so we can do untrusted packages.
Ah, nice! For the virtual environment, I am not sure about the proper way to handle network access if some packages work with sockets. They are supposed to fake I/O in tests and not access the network in the code we might call, but that's never guaranteed.
Hmm, might be worth an AWS account just to do the basic stuff.
Absolutely. Maybe we could continue this topic off-list at snakebite's? Regards, Tarek
Michael Foord wrote:
Backwards compatibility is a *big* problem for any major refactoring though.
Sigh.
I sometimes get the feeling that people on this list don't know Python's history, how it was developed over the past decade and what our goals were.
Maintaining as much backwards compatibility as reasonably possible has always been a key goal and we've done a pretty good job at it (if I may say so).
As Py3k approached, it was deemed ok to break with the past and that was accepted by the core developers and the users. However, that time has passed now and we're running in non-breaking mode again.
As we're starting to establish the Py3k branch as the new stable Python branch, we're not suddenly going to change the goals we've established over the years in the Python 2.x branch.
Backwards compatibility is one of the key arguments for using Python as a development platform. As such it's not a problem, it's a feature of Python.
And while it may not mean much to developers who prefer to run bleeding edge code, it does mean a lot to the established Python user base.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 16 2009)
M.-A. Lemburg wrote:
Michael Foord wrote:
Backwards compatibility is a *big* problem for any major refactoring though.
Sigh.
*sigh* Don't you just love emails that start with a sigh. Anyway, yes. That is why I said it was a problem. Good grief. Michael
Hello, I'll just comment on some specific points:
A quick look at the mail archives confirms what I was afraid of: this list is really high traffic.
Actually, it was very low traffic until those recent threads were spawned. I'm probably guilty of some of the traffic :-)
It is true that Python currently has some issues with high concurrency, and people try to work around that by forking and spawning new processes, which certainly hides the problem of shared state but does not solve it. In fact, very recently Facebook open sourced the Tornado framework, which does very well at high concurrency by using async IO. This recent interest in Tornado will probably also motivate the Twisted developers to improve their project's documentation and performance, because competition is often what causes projects to improve.
First, I'm not sure what it has to do with the stdlib. Second, if you look at the HTTP implementation in Tornado, it does not handle 1/10th of the spec. Basically, it parses headers and handles a couple of them (Content-Length, perhaps another one). It's not difficult to write a fast HTTP server if you only need to support one smallish part of the spec, and then to show impressive "Hello, World" benchmarks. (besides, Tornado seems platform-specific since it explicitly uses epoll) The way Tornado was promoted looks like a marketing stunt. Glyph Lefkowitz had a very reasonable answer to it on his blog. (and, in any case, if you need speedy HTTP, just use mod_wsgi. There's no need to try and look fancy by using a pure-Python async server, which will always be much less tested, supported and documented than Apache is; not to mention the wealth of plugins which are available to customize Apache behaviour)
The most obvious ones are certainly the `locale` module and all the other modules that change behavior based on the locale settings. Did you know that every major Python framework reimplements time formatting even for something as simple as HTTP headers, because Python does not provide a way to format the time to English strings reliably?
Yes, it is very annoying. Please note, however, that the locale module addresses a specific need, which is to interface with the system-level locale mechanism. The global state comes from this and is not caused by the design of the module itself. While it does limit its uses a lot (and makes it fragile because of system variations), it is still useful precisely when what you want is to rely on the system's locale mechanism. I don't know if including something like Babel in the stdlib would be a good thing. It depends on the size of it, and the required maintenance (I suppose there is a continuous flow of patches, as long as new languages/cultures get supported?). Making locale being able to delegate to Babel sounds awkward. Just tell people to use Babel if they need to (whether it is in the stdlib, or not).
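For readers on a current Python 3, one way to sidestep the locale issue for HTTP headers is `email.utils.formatdate`, which builds the string from fixed English day and month names rather than consulting the process locale. The sketch below is only an illustration under that assumption; the wrapper name `http_date` is hypothetical and this is not what any particular framework does.

```python
from email.utils import formatdate


def http_date(timestamp):
    """Format a POSIX timestamp as an HTTP date string in GMT.

    formatdate() uses hard-coded English names, so the result does not
    depend on locale.setlocale() calls made elsewhere in the process.
    """
    return formatdate(timestamp, usegmt=True)


print(http_date(0))  # Thu, 01 Jan 1970 00:00:00 GMT
```

The same call works whatever locale the surrounding application has set, which is the property the frameworks mentioned above reimplement by hand.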
We also have many modules in the standard library that in my opinion just do not belong there. From my point of view, stuff like XML does not belong in the standard library. But it appears that not many people agree with me on this one.
I would disagree indeed :) Things like XML and JSON in the standard library are very useful, because they provide a proven and reliable way to parse standardized formats without having to install any third-party library. Being able to do this kind of thing without installing additional stuff is especially useful when writing small scripts. (moreover, those libraries often have C accelerators, which might be non-trivial to package properly or install manually on Windows platforms)
`dis` is one of them. The implementation of dis prints to stdout no matter what you do. Of course you can replace sys.stdout with something else for a brief moment, but again: this is not something we should aim for or advertise because it breaks for many people.
Sure, but `dis` is used mainly by the core developers themselves, for testing and development purposes, and for these uses it is fine. Besides, it is certainly possible to propose an extension of the API so as to direct the output to another file-like object.
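The API extension suggested here did eventually land: in Python 3.4 and later, `dis.dis` accepts a `file` argument. A minimal sketch, assuming such a Python version (the helper `disassemble_to_string` and the sample function `add` are made up for illustration):

```python
import dis
import io


def disassemble_to_string(func):
    """Return the disassembly of func as a string instead of printing it."""
    buffer = io.StringIO()
    # The file keyword sends the listing to any file-like object,
    # so sys.stdout never has to be swapped out.
    dis.dis(func, file=buffer)
    return buffer.getvalue()


def add(a, b):
    return a + b


listing = disassemble_to_string(add)
print(listing)
```

Because nothing global is touched, this is safe to call from threaded or web code, unlike temporarily replacing `sys.stdout`.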
Ubuntu recently started the "100 paper cuts" project. There, people work on tiny little patches to improve the system rather than replacing components. Even though a large part of the standard library appears to be broken by design, it could still be redesigned on a small scale without breaking backwards compatibility.
This "call to arms" can be a good idea. But we have to be able to channel it and appropriately review / validate the submitted changes.
But how realistic is it to refactor the standard library? I don't know.
It depends what you mean by "refactor". It doesn't sound very precise :) I think it's better to discuss proposed changes case by case rather than trying to reach a consensus on such vague terms.
It is a good idea to ask as many people as possible, but I am not sure if the mailinglist is the way to do that.
If you have precise feature requests or bugs to report, the bug tracker might indeed be a better place. Especially if you have patches ready :-) Regards Antoine.
Hi, Antoine Pitrou wrote:
First, I'm not sure what it has to do with the stdlib.
Preamble.
I don't know if including something like Babel in the stdlib would be a good thing. It depends on the size of it, and the required maintenance (I suppose there is a continuous flow of patches, as long as new languages/cultures get supported?).
I'm not saying Babel should go into the stdlib; that's just not a good idea. Maybe unicodedata could expose more information, but that's where it ends.
Making locale being able to delegate to Babel sounds awkward. Just tell people to use Babel if they need to (whether it is in the stdlib, or not).
Then at least some time and friends should have a flag to *ignore* the locale if that's somehow possible.
I would disagree indeed :)
Yes, I already gave up.
Sure, but `dis` is used mainly by the core developers themselves, for testing and development purposes, and for these uses it is fine. Besides, it is certainly possible to propose an extension of the API so as to direct the output to another file-like object.
But that's something that can be changed as a paper-cut project; it's not hard. It's just that nobody really has the urge and time to do it.
This "call to arms" can be a good idea. But we have to be able to channel it and appropriately review / validate the submitted changes.
Of course. But code review should happen in general, not just for external contributions.
It depends what you mean by "refactor". It doesn't sound very precise :) I think it's better to discuss proposed changes case by case rather than trying to reach a consensus on such vague terms.
That would have to be decided on a papercut-by-papercut basis. And someone would have to select these modules first, which is why I mentioned the poll.
Regards, Armin
Armin Ronacher wrote:
Hi everybody,
I'm known for my dislike of the standard library. I have already written some posts about this topic on my personal blog. However, as many people have pointed out, a blog is not the place for this kind of criticism, and just ranting about a topic does not help at all. Yesterday I subscribed to stdlib-sig and immediately tons of mails ended up in my inbox. A quick look at the mail archives confirms what I was afraid of: this list is really high traffic. I tried to read up on some of the discussions I missed, but it's nearly impossible to do that.
And so you repeat a lot of points that have been discussed already in the last few days (the only few days in recent years that this list has been high traffic).
[snip...]
We also have many modules in the standard library that in my opinion just do not belong there. From my point of view, stuff like XML does not belong in the standard library. But it appears that not many people agree with me on this one.
You're right, a lot of people disagree with you. :-)
But even if everybody would, backwards compatibility would still be a good reason to keep these modules around.
Besides modules that do not work in every environment or modules that were probably a mistake to include, we also have modules in the standard library with a hideous implementation or no reusability, forcing people to reinvent what's already there. For a long time, `urllib` was a module I would have listed there, but as of Python 2.6 the module has improved considerably by exposing more of the underlying socket, which finally allows us to set the timeout in a reliable way. But there are still a ton of modules in the library that cause trouble for people. `dis` is one of them. The implementation of `dis` prints to stdout no matter what you do. Of course you can replace `sys.stdout` with something else for a brief moment, but again: this is not something we should aim for or advertise, because it breaks for many people.
That particular problem sounds easy to fix.
`Cookie` is a module people monkey patched for a while (badly) to support the HttpOnly flag. Not only does the code expose a weird API, it is also nearly impossible to extend, and it even ships cookie subclasses that use unsigned pickles and trust the client. `cgi`, again, has shared state in the global namespace that alters the behavior of the library. Of course it was never intended to be used by anything but `cgi`, but that leaves people reimplementing it or abusing it.
cgi and Cookie would both be *excellent* targets for refactoring / improving. This is *hugely* preferable to complaining about them. ;-) I *hope* that Python-dev would be willing to accept some measure of backwards incompatibility in the name of improving what few people will disagree are potentially useful but horribly outdated APIs.
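To make the kind of refactoring under discussion concrete, here is a minimal sketch of cookie parsing pulled out into a standalone function with no module-level state. The name `parse_cookie_header` is hypothetical, and the parsing is deliberately simplified: it does not handle escape sequences or the full cookie grammar, and it is not the code in the `Cookie` module.

```python
def parse_cookie_header(header):
    """Split a Cookie request header into a dict of name -> value."""
    cookies = {}
    for chunk in header.split(";"):
        name, sep, value = chunk.strip().partition("=")
        if not sep or not name:
            continue  # skip empty or malformed pairs
        value = value.strip()
        # Drop one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        cookies[name.strip()] = value
    return cookies


print(parse_cookie_header('session=abc123; theme="dark"; broken'))
# {'session': 'abc123', 'theme': 'dark'}
```

A pure function like this can be reused by frameworks directly, while the existing cookie objects could call it internally to stay backwards compatible.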
So when the discussion started about replacing `optparse` with `argparse` because the former is unmaintained, I became alarmed. My wish has always been for the standard library to be a reliable fallback to be used if everything else fails: something I can rely on which will not change, except for maybe some additions or modules moved to different locations. As Python developers we have become used to import locations moving a lot, be it `cPickle` or any of the element tree implementations, you name it.
For many things that is an admirable goal. But as you point out, and we have already discussed at great length, there is a problem with what to do with modules that can't be evolved to meet new requirements because of original API design.
I wonder if the solution to this problem wouldn't be a largely improved packaging system and some sort of standardized reviewing process for the standard library. Currently there is not even an accepted style for modules ending up in the Python distribution.
Yes there is - the standard procedure for getting new modules into the standard library is via the PEP process.
That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
It would be a great project for GHOP *if* we have some experienced developers, like yourself, dedicated to working out what the things that need fixing are. I think that one of the best ways to achieve the changes you discuss below may well be 'one-step-at-a-time' rather than a huge project. Michael
Ubuntu recently started the "100 paper cuts" project. There people work on tiny little patches to improve the system, rather to replace components. Even though a large place of the standard library appears to be broken by design they could still be redesigned on the small scale, without breaking backwards compatibility.
Of course libraries like `locale` and `logging` are hard to change, but it would still be possible. For `locale` it would probably a useful idea to go into the direction of datetime, where the timezone information is left to a 3rd party library. `locale` could provide some hooks for libraries like `babel` to fill the gap. On the other hand `Cookie` would be very easy to fix by moving the parsing code into a separate function and refactoring the cookie objects.
We could probably also start a poll out there with well-selected questions about what users think of parts of the library. For that poll it would make a lot of sense not just to ask the questions and evaluate the results, but also to track the area the user is coming from (small company, open / closed source, web development etc.), because we are all biased and seeing results grouped by some of these factors could be enlightening. That said, it could tell us that I'm completely wrong in my ideas about the state of the standard library.
But how realistic is it to refactor the standard library? I don't know. For a long time people were pretty sure Python would not get any faster, and yet Unladen Swallow is making some really amazing progress.
If we want to push Python forward into new areas, and the web is one of them, it is necessary to jump into the cold water and get things started.
And maybe we should have some elected task forces for things like the standard library. Judging from the mailing list it appears that far too many people are discussing *every detail* of it. It is a good idea to ask as many people as possible, but I am not sure if the mailing list is the way to do that. It is currently very hard to see the direction in which development is heading.
Please think of this email just as a suggestion. I don't trust myself to follow the discussions on this list calmly enough to become a real part of a solution, but I would love to help shift the development in a better direction, no matter which one it will be.
Regards, Armin
Michael said: Armin said:
That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
It would be a great project for GHOP *if* we have some experienced developers, like yourself, dedicated to working out what the things that need fixing are.
Testing and doc updates worked reasonably well within GHOP last time, and surprisingly little in the way of "experienced developers" was needed. Faced with the responsibility of coming up with dozens of tasks on short notice, I picked a dozen stdlib modules and said
  test this
  integrate doug hellmann's documentation
  run through the existing examples and write more
...and voila, it happened and attracted positive notice from the BDFL, which is saying something.
By far the most important part of that process was not my role in putting the tasks up, but Georg's role in reviewing the patches and committing them in a timely manner. I can't speak for how much time he spent doing that, however, and I certainly don't expect that level of effort from HIM this time; perhaps with Mercurial we can get non-committers to act as first-pass filters to reduce the strain on Georg or whoever steps into his shoes. [0]
Perhaps this time we can focus on py3k stuff with GHOP; that'd be great, and a real community boost, IMO.
cheers, --titus
[0] I'm nominating Brett; he seems to have plenty of time.
C. Titus Brown schrieb:
Michael said: Armin said:
That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
It would be a great project for GHOP *if* we have some experienced developers, like yourself, dedicated to working out what the things that need fixing are.
Testing and doc updates worked reasonably well within GHOP last time, and surprisingly little in the way of "experienced developers" were needed. Faced with the responsibility of coming up with dozens of tasks on short notice, I picked a dozen stdlib modules and said
  test this
  integrate doug hellmann's documentation
  run through the existing examples and write more
...and voila, it happened and attracted positive notice from the BDFL, which is saying something.
It also means that it's going to be a bit harder to find new tasks this year -- not much has changed in terms of the kind of work available. Most libraries might seem "small and easy to understand" to Armin these days, but certainly not to high school students from all over the world. Combine that with the fact that the code quality in the stdlib is supposed to *improve* due to these efforts, *and* that we need to keep a keen eye on backwards compatibility, and we have the same non-trivial problem of finding tasks. In fact, I'd rather mentor two tasks than find one (good) additional task :)
By far the most important part of that process was not my role in putting the tasks up, but Georg's role in reviewing the patches and committing them in a timely manner. I can't speak for how much time he spent doing that,
Enough for one lifetime! No, seriously, I'm willing to help as a mentor in this year's GHOP again, but I don't want to end up as the only core developer involved.
however, and I certainly don't expect that level of effort from HIM this time; perhaps with Mercurial we can get non-committers to act as first-pass filters to reduce the strain on Georg or whoever steps into his shoes. [0]
I sincerely hope that it could work that way. Not having to reject substandard work is already a relief. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Georg Brandl schrieb:
By far the most important part of that process was not my role in putting the tasks up, but Georg's role in reviewing the patches and committing them in a timely manner. I can't speak for how much time he spent doing that,
Enough for one lifetime! No, seriously, I'm willing to help as a mentor in this year's GHOP again, but I don't want to end up as the only core developer involved.
Not that this has happened in the past, of course ;) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
On Wed, Sep 16, 2009 at 07:49, C. Titus Brown <ctb@msu.edu> wrote:
Michael said: Armin said:
That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
It would be a great project for GHOP *if* we have some experienced developers, like yourself, dedicated to working out what the things that need fixing are.
Testing and doc updates worked reasonably well within GHOP last time, and surprisingly little in the way of "experienced developers" were needed. Faced with the responsibility of coming up with dozens of tasks on short notice, I picked a dozen stdlib modules and said
  test this
  integrate doug hellmann's documentation
  run through the existing examples and write more
...and voila, it happened and attracted positive notice from the BDFL, which is saying something.
We can also run figleaf or coverage.py across the standard library and see which modules have horrible test coverage.
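As a rough illustration of what such a coverage run collects, here is a minimal sketch. It swaps in the standard library's own `trace` module rather than figleaf or coverage.py (whose command lines and APIs are not shown in this thread), and it uses a single `textwrap.fill` call as a stand-in for a module's real test suite; `executed_lines` is a hypothetical helper name.

```python
import textwrap  # stand-in for "a stdlib module under test"
import trace


def executed_lines(func, *args, **kwargs):
    """Run func under the tracer; return {filename: set of executed line numbers}."""
    tracer = trace.Trace(count=True, trace=False)
    tracer.runfunc(func, *args, **kwargs)
    per_file = {}
    # CoverageResults.counts maps (filename, lineno) -> number of executions.
    for (filename, lineno), hits in tracer.results().counts.items():
        if hits:
            per_file.setdefault(filename, set()).add(lineno)
    return per_file


# Which lines did one call into textwrap actually exercise?
lines = executed_lines(textwrap.fill, "some text to wrap", width=10)
for filename, linenos in sorted(lines.items()):
    print(filename, len(linenos), "lines executed")
```

Comparing the executed line numbers against a module's total executable lines gives the kind of per-module figure that would flag poorly tested modules; a dedicated coverage tool does that bookkeeping for you.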
By far the most important part of that process was not my role in putting the tasks up, but Georg's role in reviewing the patches and committing them in a timely manner. I can't speak for how much time he spent doing that, however, and I certainly don't expect that level of effort from HIM this time; perhaps with Mercurial we can get non-committers to act as first-pass filters to reduce the strain on Georg or whoever steps into his shoes. [0]
Hey, I helped a lot last time (and plan to do so again). As for the Mercurial thing, it might not happen by December, so don't rely on that (although we always have the mirrors so we can instruct the students on how to use those).
Perhaps this time we can focus on py3k stuff with GHOP; that'd be great, and a real community boost, IMO.
One possibility is to simply only care about the test and doc changes for py3k. -Brett
Brett Cannon wrote:
On Wed, Sep 16, 2009 at 07:49, C. Titus Brown <ctb@msu.edu> wrote:
Michael said: Armin said:
That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
It would be a great project for GHOP *if* we have some experienced developers, like yourself, dedicated to working out what the things that need fixing are.
Testing and doc updates worked reasonably well within GHOP last time, and surprisingly little in the way of "experienced developers" were needed. Faced with the responsibility of coming up with dozens of tasks on short notice, I picked a dozen stdlib modules and said
  test this
  integrate doug hellmann's documentation
  run through the existing examples and write more
...and voila, it happened and attracted positive notice from the BDFL, which is saying something.
We can also run figleaf or coverage.py across the standard library and see which modules have horrible test coverage.
By far the most important part of that process was not my role in putting the tasks up, but Georg's role in reviewing the patches and committing them in a timely manner. I can't speak for how much time he spent doing that, however, and I certainly don't expect that level of effort from HIM this time; perhaps with Mercurial we can get non-committers to act as first-pass filters to reduce the strain on Georg or whoever steps into his shoes. [0]
Hey, I helped a lot last time (and plan to do so again). As for the Mercurial thing, it might not happen by December, so don't rely on that (although we always have the mirrors so we can instruct the students on how to use those).
I'll try and help. Michael
Perhaps this time we can focus on py3k stuff with GHOP; that'd be great, and a real community boost, IMO.
One possibility is to simply only care about the test and doc changes for py3k.
-Brett
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog
On Wed, Sep 16, 2009 at 01:08:28PM -0700, Brett Cannon wrote:
On Wed, Sep 16, 2009 at 07:49, C. Titus Brown <ctb@msu.edu> wrote:
Michael said: Armin said:
That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
It would be a great project for GHOP *if* we have some experienced developers, like yourself, dedicated to working out what the things that need fixing are.
Testing and doc updates worked reasonably well within GHOP last time, and surprisingly little in the way of "experienced developers" were needed. Faced with the responsibility of coming up with dozens of tasks on short notice, I picked a dozen stdlib modules and said
  test this
  integrate doug hellmann's documentation
  run through the existing examples and write more
...and voila, it happened and attracted positive notice from the BDFL, which is saying something.
We can also run figleaf or coverage.py across the standard library and see which modules have horrible test coverage.
Yes, and I expect to integrate the GSoC work on C code coverage, too, along with multi-platform build results. (Titus's Motto: "coercing other people into building my own testing infrastructure from scratch since 2006".)
By far the most important part of that process was not my role in putting the tasks up, but Georg's role in reviewing the patches and committing them in a timely manner. I can't speak for how much time he spent doing that, however, and I certainly don't expect that level of effort from HIM this time; perhaps with Mercurial we can get non-committers to act as first-pass filters to reduce the strain on Georg or whoever steps into his shoes. [0]
Hey, I helped a lot last time (and plan to do so again). As for the Mercurial thing, it might not happen by December, so don't rely on that (although we always have the mirrors so we can instruct the students on how to use those).
Yes, I remember -- no denigration intended, just that Georg devoted an absolutely insane amount of time to it ;)
Perhaps this time we can focus on py3k stuff with GHOP; that'd be great, and a real community boost, IMO.
One possibility is to simply only care about the test and doc changes for py3k.
OK, sounds good. --t -- C. Titus Brown, ctb@msu.edu
[snip]
with shared state on module level, web applications are not. It is true that Python currently has some issues with high concurrency and people try to fix that by forking and spawning new processes which certainly hides away the problem of shared state, but that does not solve it.
FWIW: Multiprocessing doesn't care about shared state; nor is it an attempt to "get around" the shared state within the standard library. Conflating concurrency issues with shared state within standard library modules is not quite right. I do agree, however, that there are some modules whose shared state is undesirable. You do have to understand, though, that while a large portion of the world is moving onto the web, there are still many of us who simply don't do "the online thing". We should strive to improve the web story, but we cannot do so in a way which cripples, or makes more difficult, the lives of people who are *not* web-heads.
Now if we look at the standard library, we can see many modules that just do not work in such environments because they have some sort of shared state. The most obvious ones are certainly the `locale` module and all the other modules that change behavior based on the locale settings. Did you know that every major Python framework reimplements time formatting even for something as simple as HTTP headers, because Python does not provide a way to format the time to english strings reliably? But there are certainly more modules that have this sort of problem.
Part of my motivation in starting the other thread are issues such as this.
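To make the HTTP header example concrete: the usual workaround is to avoid `time.strftime` (whose `%a` and `%b` follow the process-wide locale) and build the string from fixed English tables. A minimal sketch of that approach follows; `http_date` is a hypothetical helper name, not taken from any particular framework.

```python
import time

# Fixed English names, so the result never depends on locale.setlocale().
_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def http_date(timestamp):
    """Format a POSIX timestamp as an HTTP date such as 'Thu, 01 Jan 1970 00:00:00 GMT'."""
    t = time.gmtime(timestamp)
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        _DAYS[t.tm_wday], t.tm_mday, _MONTHS[t.tm_mon - 1],
        t.tm_year, t.tm_hour, t.tm_min, t.tm_sec,
    )


print(http_date(0))
```

For what it's worth, `email.utils.formatdate(timestamp, usegmt=True)` in the standard library produces a date in this style as well, though it lives in a place few people would look for an HTTP helper.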
Also we have many modules in the standard library that in my opinion just do not belong there. From my point of view, stuff like XML does not belong into the standard library. But it appears that not many people agree with me on this one. But even if everybody would, backwards compatibility would still be a good reason to keep these modules around.
Each of us comes from a different problem domain - You might be focused on the web, but I'm focused on daemons, tools, and networking and glue. This difference between us exemplifies the problem of a common, objective "smell test" for what really belongs in the standard library. Take your example - XML parsing. I would prefer One Way To Do It in the standard library. I feel XML parsing (and JSON, and YAML) are critical things to have in the standard library for a variety of reasons.
Besides modules that do not work in every environment or modules that were probably a mistake to include, we also have modules in the standard library with a hideous implementation or no reusability, forcing people to reinvent what's already there. [snip]
And SimpleHTTPServer, and logging, and... Armin, some of us agree with you, and again, this was part of my driving force in starting the other thread proposing the logical break out and subsequent cleanup. Fred, Brett and I have gone off to write PEPs outlining these tasks. If you would like to contribute to those PEPs, email me off list and I will give you access. But you have to be nice ;)
I wonder if the solution to this problem wouldn't be a largely improved packaging system and some sort of standardized reviewing process for the standard library. Currently there is not even an accepted style for modules ending up in the Python distribution. That, and a group of people, dedicated to standard library refactoring. The majority of libraries in the standard library are small and easy to understand, I'm sure they are perfectly suited for students on projects like GSOC or GHOP to work on. They could even be used as some sort of "playground" for new Python developers.
This was another point in the other thread; we need maintainers for all of the modules. While there is no "guideline" per se for the code which goes in, the process by which something gets in is outlined, and the code is typically reviewed prior to inclusion by Python-Dev. As for the packaging system: Tarek and Company are working on this, and it is outside of the boundaries of the discussions on this list so far. If you really want to help with packaging, you need to go over to distutils-sig (and report back to us the traffic levels there ;)) or contact Tarek directly.
Ubuntu recently started the "100 paper cuts" project. There people work on tiny little patches to improve the system, rather than replacing components. Even though a large part of the standard library appears to be broken by design, it could still be redesigned on a small scale, without breaking backwards compatibility.
We have over 170 patches in the tracker needing reviews. We have more issues with patches that need docs and tests. More patches, while welcome, still need someone to review them, apply them, and ensure that they don't side-effect everything else, conceptually break everything, and so on.
Of course libraries like `locale` and `logging` are hard to change, but it would still be possible. For `locale` it would probably be a useful idea to go in the direction of datetime, where the timezone information is left to a 3rd party library. `locale` could provide some hooks for libraries like `babel` to fill the gap. On the other hand `Cookie` would be very easy to fix by moving the parsing code into a separate function and refactoring the cookie objects.
And a 3rd party library adds a dependency to all the build bots, consumers, apps, etc. out there. That dependency may not work on Windows, OS/X, or IRIX. This is partially the reason something like a libxml dependency is right out (sadly). Again, agreed - but these modules need maintainers, people who care enough about them to do the things you talk about. That's why I started this tempest in a mailpot in the first place. It's not like I enjoy replying to emails - I don't even get paid for it.
We could probably also start a poll out there with well-selected questions about what users think of parts of the library. For that poll it would make a lot of sense not just to ask the questions and evaluate the results, but also to track the area the user is coming from (small company, open / closed source, web development etc.), because we are all biased and seeing results grouped by some of these factors could be enlightening. That said, it could tell us that I'm completely wrong in my ideas about the state of the standard library.
There are two things conflated here. One is "what do the users want" and the other is "what can we maintain". They are not the same thing. Brett already tried an informal poll: http://sayspy.blogspot.com/2009/07/results-of-informal-poll-about-standard.h... While not entirely representative of the hundreds of companies, and thousands of people out there using Python, it's a good place to start. In fact, it's one of the data points I'm using in my "cleanup PEP". Would you like to help?
But how realistic is it to refactor the standard library? I don't know. For a long time people were pretty sure Python would not get any faster, and yet Unladen Swallow is making some really amazing progress.
Refactoring of the standard library, and its continued evolution, are requirements for Python's survival. This is why I started the other thread, and others contributed to it.
And maybe we should have some elected task forces for things like the standard library. Judging from the mailing list it appears that far too many people are discussing *every detail* of it. It is a good idea to ask as many people as possible, but I am not sure if the mailing list is the way to do that. It is currently very hard to see the direction in which development is heading.
Those of us who care about this are off writing PEPs. If you want to help, you can. The discussion of every detail is a necessary "evil" - and it comes with the territory. There is a time for discussion though, and a time for work. David, Georg, Brett, Frank, and I are all taking action items to go off and do, because you're right: actions speak louder.
Please think of this email just as a suggestion. I don't trust myself to follow the discussions on this list calmly enough to become a real part of a solution, but I would love to help shift the development in a better direction, no matter which one it will be.
If you cannot follow this mailing list calmly to find the good information, filter the fluff, and ultimately cherry pick and extract the work necessary to move forward, you're going to dread the PEP process. Changes affect everyone; we cannot go and make them in a smoky, dimly lit room. It runs counter to who and what we are. It's fine to be a dictator when it's your own project (Jinja vs. Jinja2 comes to mind) but discussion is needed, and healthy. You just need to filter the good from the bad. Armin, I agree with your sentiment; the feeling contained within it is the motivation for me starting the original discussion in the *first place*. Yes, it caused a fair amount of discussion, some good, some bad, some circular. But we also got some people working on solid deliverables, which was the point. If you would like to help write some PEPs, I'm open to collaborating. Jesse
Hi everyone,
I'm all for improving the standard library, and as the author of the logging package, have a keen interest in making sure that it is relevant and usable by most if not all of the Python community, and that it evolves with changing needs. However, this objective is made much harder by what I see as some shortcomings in the way we all communicate about these issues. For example, from a post earlier in this thread, here are two snippets...
[Jesse] And SimpleHTTPServer, and logging, and... Armin, some of us agree with you, and again, this was part of my driving force in starting the other thread proposing the logical break out and subsequent cleanup.
[Armin] Of course libraries like `locale` and `logging` are hard to change, but it would still be possible. For `locale` it would probably a useful
Now, it's not hard to find out that I'm the author of the logging package - apart from being co-author of the PEP which introduced it, I'm fairly active on python-list when logging-related issues crop up, as well as promptly addressing issues in the bug tracker. I'm also not that hard to find via Google when you search for "python logging".
From the above snippets, one could infer that both Armin and Jesse have some "issues" with Python's logging package. In Brett's informal poll, logging was one of the packages which people raised as "needs to change" - it came third in the "hall of shame". When Brett posted about it, at
http://sayspy.blogspot.com/2009/07/results-of-informal-poll-about-standard.h...
I followed up there at some length. It seems a lot of this stuff gets discussed on Twitter, which makes it very easy for meanings to be misinterpreted because you can't always be clear about what you mean in 140 chars. (I don't use Twitter myself, as for me the noise to signal ratio is far too high.) I was at one point given to understand that in some tweet or other, Andrii Mishkovskyi apparently offered to rewrite the logging package. Andrii has assured me that he hadn't actually meant to cause offence, but surely you can see it comes across as a tad impolite, given that the package has an active maintainer.
Jesse's and Armin's comments above epitomise the problem. As far as I know (with Google's help), neither has ever bothered to post on python-list, python-dev, their own blogs or anywhere else what these "issues" with logging are. Nor has either ever contacted me directly. Yet they talk blithely about changing logging, as if it has no maintainer. What exactly is the difficulty in articulating your issues? Armin has done a fair job on describing bad points about other parts of the library, and fair points they are, too. Armin did once mention logging in a post about singletons, because logging does contain singletons. However, as far as I know it does not cause problems in practice - I use it with Django on numerous websites and the Tornado webserver of FriendFeed/Facebook uses it too, apparently without the sky falling on its head.
Andrii Mishkovskyi set up a page on the Python wiki,
http://wiki.python.org/moin/LoggingPackage
where he posted his criticisms of the logging package and invited comments. Great! Something specific to work on. I responded to all his points, and waited for others to weigh in. Since 8 August, when I made my last changes to it, that page has not been changed - by Andrii, Jesse or anyone else.
I'm not expecting logging to be anyone's hot button except mine, but I am committed to maintaining it. If you're not interested in improving it, don't mention it in the offhand way I quoted above - it's not the type of criticism that I can work from. And if you are interested in improving it, take the time to articulate the issues. For example, I recently came across the Opster library (which wraps getopt) and really liked some aspects of it, though at the moment argparse is my package of choice for command line parsing. I contacted both Steven Bethard, argparse author and Alexander Solovyov, author of Opster, about trying to get some synergy going between the two approaches. I did this using the argparse Google code project issue tracker (for contact with Steven) and (for Alexander) by commenting on his blog entry about Opster. Both contacts have been fruitful, at least from my point of view as a user/potential user of their work.
This might sound a bit like a rant, but it's not meant to be - I just speak as I find. I'm not overly sensitive to criticism about the logging package; in fact I welcome *constructive* criticism which can help to improve it. All of you, please feel free to head over to Andrii's page to post your criticisms/comments there.
If my expectations are that:
- I'm not the dead parrot - think of me more as the elephant in the room. If you have issues with my work, talk to me.
- Use a platform where meanings have the potential to be clear - i.e. let's not make being on Twitter a pre-requisite for discourse.
- Avoid the general snide-sounding "Logging sucks." "Yes, doesn't it just?" kind of comments. It's great to vent, but is that the best you want to aim for?
- Remember the exigencies of backward compatibility. Root and branch changes to the public API are clearly out, at least for now - not just for logging but for the whole stdlib.
Am I expecting too much?
Regards, Vinay Sajip
On Thu, 17 Sep 2009 at 16:56, Vinay Sajip wrote:
Now, it's not hard to find out that I'm the author of the logging package - apart from being co-author of the PEP which introduced it, I'm fairly active on python-list when logging-related issues crop up, as well as promptly addressing issues in the bug tracker. I'm also not that hard to find via Google when you search for "python logging".
I spend more time watching the bug tracker than I should, and I can confirm that Vinay is _very_ responsive to tracker issues concerning logging. --David
On Thu, Sep 17, 2009 at 09:56, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote: [snip]
If my expectations are that:
- I'm not the dead parrot - think of me more as the elephant in the room. If you have issues with my work, talk to me.
- Use a platform where meanings have the potential to be clear - i.e. let's not make being on Twitter a pre-requisite for discourse.
- Avoid the general snide-sounding "Logging sucks." "Yes, doesn't it just?" kind of comments. It's great to vent, but is that the best you want to aim for?
- Remember the exigencies of backward compatibility. Root and branch changes to the public API are clearly out, at least for now - not just for logging but for the whole stdlib.
Am I expecting too much?
Nope. And hopefully David's maintainer list will help make sure people talk to you directly. -Brett
This is a great response from a standard library module maintainer.
Michael
Vinay Sajip wrote:
Hi everyone,
I'm all for improving the standard library, and as the author of the logging package, have a keen interest in making sure that it is relevant and usable by most if not all of the Python community, and that it evolves with changing needs. However, this objective is made much harder by what I see as some shortcomings in the way we all communicate about these issues. For example, from a post earlier in this thread, here are two snippets...
[Jesse]
And SimpleHTTPServer, and logging, and... Armin, some of us agree with you, and again, this was part of my driving force in starting the other thread proposing the logical break out and subsequent cleanup.
[Armin] Of course libraries like `locale` and `logging` are hard to change, but it would still be possible. For `locale` it would probably a useful
Now, it's not hard to find out that I'm the author of the logging package - apart from being co-author of the PEP which introduced it, I'm fairly active on python-list when logging-related issues crop up, as well as promptly addressing issues in the bug tracker. I'm also not that hard to find via Google when you search for "python logging".
From the above snippets, one could infer that both Armin and Jesse have some "issues" with Python's logging package. In Brett's informal poll, logging was one of the packages which people raised as "needs to change" - it came third in the "hall of shame". When Brett posted about it, at
http://sayspy.blogspot.com/2009/07/results-of-informal-poll-about-standard.h...
I followed up there at some length. It seems a lot of this stuff gets discussed on Twitter, which makes it very easy for meanings to be misinterpreted because you can't always be clear about what you mean in 140 chars. (I don't use Twitter myself, as for me the noise to signal ratio is far too high.) I was at one point given to understand that in some tweet or other, Andrii Mishkovskyi apparently offered to rewrite the logging package. Andrii has assured me that he hadn't actually meant to cause offence, but surely you can see it comes across as a tad impolite, given that the package has an active maintainer.
Jesse's and Armin's comments above epitomise the problem. As far as I know (with Google's help), neither has ever bothered to post on python-list, python-dev, their own blogs or anywhere else what these "issues" with logging are. Nor has either ever contacted me directly. Yet they talk blithely about changing logging, as if it has no maintainer. What exactly is the difficulty in articulating your issues? Armin has done a fair job on describing bad points about other parts of the library, and fair points they are, too. Armin did once mention logging in a post about singletons, because logging does contain singletons. However, as far as I know it does not cause problems in practice - I use it with Django on numerous websites and the Tornado webserver of FriendFeed/Facebook uses it too, apparently without the sky falling on its head.
Andrii Mishkovskyi set up a page on the Python wiki,
http://wiki.python.org/moin/LoggingPackage
where he posted his criticisms of the logging package and invited comments. Great! Something specific to work on. I responded to all his points, and waited for others to weigh in. Since 8 August, when I made my last changes to it, that page has not been changed - by Andrii, Jesse or anyone else.
I'm not expecting logging to be anyone's hot button except mine, but I am committed to maintaining it. If you're not interested in improving it, don't mention it in the offhand way I quoted above - it's not the type of criticism that I can work from. And if you are interested in improving it, take the time to articulate the issues. For example, I recently came across the Opster library (which wraps getopt) and really liked some aspects of it, though at the moment argparse is my package of choice for command line parsing. I contacted both Steven Bethard, argparse author and Alexander Solovyov, author of Opster, about trying to get some synergy going between the two approaches. I did this using the argparse Google code project issue tracker (for contact with Steven) and (for Alexander) by commenting on his blog entry about Opster. Both contacts have been fruitful, at least from my point of view as a user/potential user of their work.
This might sound a bit like a rant, but it's not meant to be - I just speak as I find. I'm not overly sensitive to criticism about the logging package; in fact I welcome *constructive* criticism which can help to improve it. All of you, please feel free to head over to Andrii's page to post your criticisms/comments there.
If my expectations are that:
- I'm not the dead parrot - think of me more as the elephant in the room. If you have issues with my work, talk to me.
- Use a platform where meanings have the potential to be clear - i.e. let's not make being on Twitter a pre-requisite for discourse.
- Avoid the general snide-sounding "Logging sucks." "Yes, doesn't it just?" kind of comments. It's great to vent, but is that the best you want to aim for?
- Remember the exigencies of backward compatibility. Root and branch changes to the public API are clearly out, at least for now - not just for logging but for the whole stdlib.
Am I expecting too much?
Regards,
Vinay Sajip
_______________________________________________ stdlib-sig mailing list stdlib-sig@python.org http://mail.python.org/mailman/listinfo/stdlib-sig
Hi,
Vinay Sajip wrote:
Jesse's and Armin's comments above epitomise the problem. As far as I know (with Google's help), neither has ever bothered to post on python-list, python-dev, their own blogs or anywhere else what these "issues" with logging are.
I agree that nobody did. And there is a reason for it, and that reason is probably something that could be discussed in a separate thread too, because it affects a lot of stuff that is currently in the standard library.
I looked at the logging library and my point of view is that logging is "broken by design". E.g., you can't fix it without breaking backwards compatibility.
Armin has done a fair job on describing bad points about other parts of the library, and fair points they are, too. Armin did once mention logging in a post about singletons, because logging does contain singletons. However, as far as I know it does not cause problems in practice - I use it with Django on numerous websites and the Tornado webserver of FriendFeed/Facebook uses it too, apparently without the sky falling on its head.
I'm sorry for the way I was expressing my disagreement with design decisions made in the standard library and I would love to change that. One of the biggest griefs I have with how the standard library works is that it does not work for me, and every time I discuss that topic with anyone else I immediately get the feedback that "this does not happen in the real world" or "works for me" etc.
I consider any kind of shared state a design mistake, no matter if it may work for some people or not; the logging package is no exception. However, I have to admit that global loggers appear to be one of the easiest solutions for the problem.
If I had to create my own logging library, I would take a totally different path from design to implementation. I would create a system where the "sender" is an arbitrary Python object and the system that handles it would have to check on its own whether it may process that message. Only *if* there was any handler that wants to do anything with that message would it pull the details from the stack and format the message. That would also get rid of the formatters and all the other stuff we currently have, and avoid a global registry of loggers.
How would I configure a logger in a library then? A library would *never* by default print anything that is logged. Default configurations for the logging system are currently the biggest reason for me to hate that library. Many libraries do not even give you the ability to turn off logging in general, and documentation for the logging system was probably one of the reasons. I don't know when the "NullHandler" example appeared in the docs, but I'm sure it was not there when I started using the logging system.
I cannot use logging for anything serious because it is slow, it has shared state that often causes frustration, you cannot delete loggers, and much more. Of course I do use logging for libraries because it's the standard and the best we have got.
The only reason (in my opinion) that logging is still around is that it's in the stdlib, not because it's any good. Especially when it comes to highly optimized code in web applications you will quickly discover that half a dozen log calls take up more CPU cycles than the actual application code.
This might sound a bit like a rant, but it's not meant to be - I just speak as I find. I'm not overly sensitive to criticism about the logging package; in fact I welcome *constructive* criticism which can help to improve it. All of you, please feel free to head over to Andrii's page to post your criticisms/comments there.
I don't know how I could contribute constructive criticism. I'm sorry for that.
Am I expecting too much?
I'm afraid you are. It's hard to accept that some people think of your system as "it just sucks", but you can't change that. I know that feeling from some of the stuff I wrote. It's just a little bit worse for you because logging is the standard (and only) one everybody uses. For the stuff I wrote*, people have choice. If they don't want to use it, they don't have to.
Regards, Armin
* external libraries, not distributed as part of Python except for the rather simple "ast" module.
On Sep 17, 2009, at 3:32 PM, Armin Ronacher wrote:
Hi,
Vinay Sajip wrote:
Jesse's and Armin's comments above epitomise the problem. As far as I know (with Google's help), neither has ever bothered to post on python-list, python-dev, their own blogs or anywhere else what these "issues" with logging are.
I agree that nobody did. And there is a reason for it, and that reason is probably something that could be discussed in a separate thread too, because it affects a lot of stuff that is currently in the standard library.
I looked at the logging library and my point of view is that logging is "broken by design". Eg: you can't fix it without breaking backwards compatibility.
I think we may just have to live with that one. While I'm no fan of the logging module, it is widely used. It was based on a Java version and Guido blessed it early-on.
Raymond
Hi,
Raymond Hettinger wrote:
I think we may just have to live with that one. While I'm no fan of the logging module, it is widely used. It was based on a Java version and Guido blessed it early-on.
I'm not insane enough to seriously consider replacing it. I just wanted to point out why I never wrote a ticket / mail to a mailing list in the first place.
Regards, Armin
On Sep 17, 2009, at 3:50 PM, Armin Ronacher wrote:
Hi,
Raymond Hettinger wrote:
I think we may just have to live with that one. While I'm no fan of the logging module, it is widely used. It was based on a Java version and Guido blessed it early-on.
I'm not insane enough to seriously consider replacing it. I just wanted to point out why I never wrote a ticket / mail to a mailing list in the first place.
That's good. For a moment, I thought you had lost your marbles ;-)
Raymond
I'm not sure exactly how your logging API would look, but is there a PyPI logging module that you would recommend or something similar? It sounds as though the logging module could have a class for handling logging differently as you would propose. Though I'm not sure what that is.
On Fri, Sep 18, 2009 at 2:00 AM, Raymond Hettinger <python@rcn.com> wrote:
On Sep 17, 2009, at 3:50 PM, Armin Ronacher wrote:
Hi,
Raymond Hettinger wrote:
I think we may just have to live with that one. While I'm no fan of the logging module, it is widely used. It was based on a Java version and Guido blessed it early-on.
I'm not insane enough to seriously consider replacing it. I just wanted to point out why I never wrote a ticket / mail to a mailing list in the first place.
That's good. For a moment, I thought you had lost your marbles ;-)
Raymond
Raymond Hettinger <python@...> writes:
While I'm no fan of the logging module, it is widely used. It was based on a Java version and Guido blessed it early-on.
Raymond, I took *some ideas* from log4j. In that sense it was "based on", but it is not a port. While Guido did bless it, I do believe it went through a reasonable review process on python-dev (where you could certainly have given some feedback - I don't remember that you did) and I changed the package in response to various concerns from various people. It certainly didn't feel to me like a rubber-stamping exercise - do you feel that it was? Logging's current design is based on the premise that logging is concerned with "What happened?", "Where did it happen?", "How important is it?" and "Who wants to know?", and this is completely general and not tied to any language or environment. Of course there are many designs which could be developed from such a premise, and the present design is just one such. The abstractions in log4j (such as a hierarchical namespace for loggers, and handlers as orthogonal to loggers, whereas many systems conflate them) made sense to me (and to a lot of others), and it's no surprise that I named my abstractions similarly where that made sense. Beyond that, there's little correspondence between log4j's code and the code in Python's logging. I'm sorry you're not a fan. Is it an aesthetic thing, or have you had specific problems?
Armin Ronacher <armin.ronacher@...> writes:
I agree that nobody did. And there is a reason for it, and that reason is probably something that could be discussed in a separate thread too, because it affects a lot of stuff that is currently in the standard library.
I looked at the logging library and my point of view is that logging is "broken by design". E.g., you can't fix it without breaking backwards compatibility.
It's easy for you to just say "it's broken by design", and that's only a more polite way of saying "it sucks". It doesn't strike me as a basis for constructive dialogue, unless you provide some more specifics.
I'm sorry for the way I was expressing my disagreement with design decisions made in the standard library and I would love to change that. One of the biggest griefs I have with how the standard library works is that it does not work for me, and every time I discuss that topic with anyone else I immediately get the feedback that "this does not happen in the real world" or "works for me" etc.
Well, I'm happy to discuss any problems you are having with logging in a practical sense. Obviously I can't do much with "I just don't like how it's designed", but if you want to spell out a specific problem - something you want to do with it that you just can't - I'll gladly listen and see if I can help with it.
I consider any kind of shared state a design mistake, no matter if it may work for some people or not, the logging package is no exception. However I have to admit that global loggers appears to be one of the easiest solutions for the problem.
I think you are being dogmatic, rather than pragmatic. The Zen of Python says, "practicality beats purity." Your bugbear here seems to be how shared state causes problems in web applications. Despite having shared state, AFAIK the logging module is quite usable in a web context - as well as the usage with Django and Tornado that I mentioned earlier, Google App Engine uses it too (meaning, all the web applications developed with GAE can use it). So if it doesn't work for you in a practical way, give me some details. But the logging design isn't meant to be a candidate in a beauty contest, and I don't claim it's perfect. You're a very smart guy, Armin, but you perhaps need to consider that it is possible to not like a design because it doesn't suit your personal taste - but that doesn't necessarily make it a bad design.
If I had to create my own logging library, I would take a totally different path from design to implementation. I would create a system where the "sender" is an arbitrary Python object and the system that handles it would have to check on its own whether it may process that message. Only *if* there was any handler that wants to do anything with that message would it pull the details from the stack and format the
Currently, Python logging doesn't do formatting of stack traces etc. until it's sure that the message is severe enough to require handling, based on the current logger configuration. When handling a message, each handler checks its configuration before formatting and dealing with the formatted message. The system tries not to do unnecessary work, and if you have found some cases where it does unnecessary work, please tell me. So at present, Python logging conforms to your statement "Only if there was any handler that wants to do anything with that message, it would pull the details form the stack and format ..." That's been there from day one, so if you got the idea it worked differently, I'm not sure why.
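The deferral Vinay describes can be observed from the outside. The following is a minimal sketch, not code from the thread: the `Expensive` class and the logger name `demo.lazy` are made up for illustration, and the example is written for current Python 3. It counts how often an object is converted to a string when the logging call is below the configured level.

```python
import logging


class Expensive:
    """Hypothetical object whose string conversion we would like to avoid."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive details"


logger = logging.getLogger("demo.lazy")
logger.setLevel(logging.WARNING)          # DEBUG and INFO are filtered out
logger.propagate = False                  # keep the demo away from the root logger
logger.addHandler(logging.NullHandler())

obj = Expensive()

# The arguments are passed separately from the format string, so they are
# only merged into the message if the record is actually going to be emitted.
logger.debug("state: %s", obj)
calls_after_debug = Expensive.calls

# An explicit guard covers work that would otherwise happen before the
# logging call itself gets a chance to check the level.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", str(obj))
calls_after_guard = Expensive.calls
```

In both cases the counter stays at zero, because the level check happens before any formatting. This only illustrates the level check on the logger; it says nothing about the cost of the check itself, which is the part Armin's performance complaint would need numbers for.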
message. That also would get rid of the formatters and all the other stuff we currently have and avoids a global registry of loggers.
In Python logging, you never have to instantiate a Formatter in your code unless you want some specific formatting functionality or format. In your scheme, if a user wanted certain messages to be formatted in certain ways (and other messages in other ways - e.g. for a log file as opposed to console display), you would do this without any formatter classes - how, exactly? The global registry of loggers is there to avoid the need to pass loggers around the system - you just access them by name. It's a bit like thread locals - do you have a problem with thread locals, too? Sure, it's state shared across threads, unlike thread locals. But if it's causing you a specific problem because you've found some non-thread-safe behaviour, I'd really like to know.
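A small sketch of what "access them by name" means in practice, assuming nothing beyond `logging.getLogger`; the names `myapp` and `myapp.db` are invented for the example.

```python
import logging

# Two unrelated modules asking for the same name get the same object back,
# so no logger has to be passed around explicitly.
a = logging.getLogger("myapp.db")
b = logging.getLogger("myapp.db")
same_object = a is b

# Dotted names form a hierarchy: "myapp" is the parent of "myapp.db".
parent = logging.getLogger("myapp")
is_child = a.parent is parent
```

This is exactly the shared state under discussion: convenient because any module can reach the logger, and shared because every module reaches the same one.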
How would I configure a logger in a library then? A library would *never* by default print anything that is logged. Default configurations for the logging system are currently the biggest reason for me to hate that library. Many libraries do not even give you the ability to turn off logging in general, and documentation for the logging system was probably one of the reasons. I don't know when the "NullHandler" example appeared in the docs, but I'm sure it was not there when I started using the logging system.
Agreed, the logging from a library leading to warning messages was an annoyance in the library as first released. But *you* never raised an issue about it. When other people did (I think it was Thomas Heller), I immediately updated the documentation to explain how to use a NullHandler to avoid annoying messages, and NullHandler appeared in a subsequent release soon afterwards. If default configuration for the logging system is the biggest reason for you to hate that package, don't get mad - get even. Tell me how I messed up and how to make it better.
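For readers who have not seen the NullHandler pattern mentioned above, here is a minimal sketch. It assumes a library whose top-level package is called `mylib` (a hypothetical name) and a Python version that ships `logging.NullHandler`.

```python
import logging

# In the library's top-level module ("mylib" is a hypothetical name):
lib_log = logging.getLogger("mylib")
lib_log.addHandler(logging.NullHandler())


def do_work():
    # If the application has not configured logging, this record is
    # swallowed by the NullHandler instead of triggering a complaint
    # or unexpected output.
    lib_log.warning("something the application may or may not care about")
    return True
```

The library attaches no other handlers and sets no levels; an application that wants to see the library's messages configures handlers itself, typically on the root logger.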
I cannot use logging for anything serious because it is slow, it has shared state that often causes frustration, you cannot delete loggers and much more. Of course I do use logging for libraries because it's the standard and the best we have got.
Slow? Sure, everything has a cost. It's all about tradeoffs. What specific performance problems have you come up against? What mitigating strategies did you use? What were the observed performance metrics, as against what you expected? Did you do any profiling to be sure that logging was definitely the culprit? Show me the numbers. If your frustration with shared state is from an aesthetic point of view, I can't really help. After all, sys.modules (say) is shared state too. If the frustration comes from some specific thing you're trying to do, please do tell what that is. You cannot delete loggers because in a multithreaded application, other threads may still be using them. You can, however, disable loggers - which, from a functional point of view, seems as good. Or are you finding that loggers are taking up too much memory?
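One way to read "disable loggers" is the `disabled` attribute that Logger objects carry; whether that is the mechanism Vinay had in mind is an assumption here. A minimal sketch, with a made-up `ListHandler` that collects messages in a list:

```python
import logging

records = []


class ListHandler(logging.Handler):
    """Hypothetical handler that just remembers the messages it sees."""

    def emit(self, record):
        records.append(record.getMessage())


noisy = logging.getLogger("demo.noisy")
noisy.setLevel(logging.DEBUG)
noisy.propagate = False
noisy.addHandler(ListHandler())

noisy.info("first")
noisy.disabled = True       # the logger still exists, but drops everything
noisy.info("second")
noisy.disabled = False
noisy.info("third")
```

After this runs, `records` holds "first" and "third" only. The logger object itself is never removed from the registry, which matches the point that other threads may still hold a reference to it.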
The only reason (in my opinion) that logging is still around is that it's in the stdlib, not because it's any good. Especially when it comes
That's a cheap remark, I would have expected better from you. So everybody who uses logging (and is smart, like you) hates it but uses it because they have no choice, right? It *is* widely used, AFAIK. If everyone shared your opinion, I don't think that would be the case.
to highly optimized code in web applications you will quickly discover that half a dozen log calls take up more CPU cycles than the actual application code.
Please show me the numbers.
I don't know how I could contribute constructive criticism. I'm sorry for that.
You can't contribute constructive criticism because you don't know how? Well, you've already made some criticisms in your post, so how about putting some in more detail? I really like Jinja2 (a lot), but if someone said to you "Jinja2 sucks, it's really slow, it does all that conversion to bytecode stuff, that's really gross. I can't really say any more than that, excuse me while I vomit" then how would you feel? That's pretty much how you're coming across right now.
Am I expecting too much?
I'm afraid you are. It's hard to accept that some people think of your system as "it just sucks", but you can't change that. I know that
Actually I have no problem with that - people thinking that it sucks. You can't please everyone, and that goes for when you have a roomful of smart people, too. The problem I have is when people just vent without trying to make it better, talk about replacing it or removing it without even having the courtesy to talk to me first. (I'm not talking here about what you post on your blog - those are your opinions and of course you're entitled to say whatever you like. I'm talking about this discussion and perhaps on python-dev and other SIG mailing lists.) Perhaps you think I'm being oldskool by mentioning "courtesy", but I prefer to think of it as an attribute of being a grown-up. Open source is very political, but that doesn't mean it doesn't have to be polite.
feeling from some of the stuff I wrote. It's just a little bit worse for you because logging is the standard (and only) one everybody uses. For the stuff I wrote*, people have choice. If they don't want to use it, they don't have to.
Actually they don't have to use Python's built-in logging either. Who's twisting your arm? Talking of twisting, I recently had a discussion with Glyph Lefkowitz of Twisted Matrix about twisted.log after he posted some of his thoughts on logging in general and Python's implementation in particular. I believe I rebutted, or at least addressed, all of his points. They will continue to use twisted.log for aesthetic and pragmatic reasons, which is fine by me. They've built a bridge to Python logging, presumably because some customers wanted that. By the way, if you think that having choice has no downsides, perhaps you should step back and take a look at the Python web application framework space - Django, Pylons, TurboGears, web.py, Werkzeug, ... wow, so much choice! Sorry if I missed one. Each one doing it a little differently, even for something as basic as HTTP request/response objects. (I know the issues aren't trivial, but are they rocket surgery? ;-) Look at Graham Dumpleton's frustration over WSGI and Python 3.0 - I feel more than a little sympathy for his situation. He's actually trying to create something useful, just as I am. There's no shortage of approaches, and opinions, and in general, I believe competition is good. But some of us are just trying to get some work done here.
On Friday 18 September 2009 at 08:18 +0000, Vinay Sajip wrote:
Despite having shared state, AFAIK the logging module is quite usable in a web context - as well as the usage with Django and Tornado that I mentioned earlier, Google App Engine uses it too (meaning, all the web applications developed with GAE can use it).
Pylons uses it too. While logging may not look extremely pretty, I have yet to see another logging library that has a pretty *and* powerful API. Applications / libraries which reinvent their own logging API usually end up with something which is both less powerful and *not* prettier (see twisted.log for an example).
Regards
Antoine.
Antoine Pitrou <solipsis@...> writes:
Pylons uses it too.
While logging may not look extremely pretty, I have yet to see another logging library that has a pretty *and* powerful API. Applications / libraries which reinvent their own logging API usually end up with something which is both less powerful and *not* prettier (see twisted.log for an example).
Thanks. If you had the time to write your ideal, "pretty" API - hypothetically, say, to wrap logging so you wouldn't use the underlying power - what would this API look like? I'm open to ideas from all of you.
On Fri, Sep 18, 2009 at 10:43 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Antoine Pitrou <solipsis@...> writes:
Pylons uses it too.
While logging may not look extremely pretty, I have yet to see another logging library that has a pretty *and* powerful API. Applications / libraries which reinvent their own logging API usually end up with something which is both less powerful and *not* prettier (see twisted.log for an example).
Thanks. If you had the time to write your ideal, "pretty" API - hypothetically, say, to wrap logging so you wouldn't use the underlying power - what would this API look like? I'm open to ideas from all of you.
I think logging is fine, but it misses a few pythonic functions on the top of it to work with. Right now, if you want to set up a logging output on a file or on stdout with some options, you have to write 5 or 6 lines of code. These 5/6 lines could probably be put in a function in the logging module, and be used with a few arguments. This function would return a logger ready to be used. Once a logger is created, logging is dead simple to use. If you think it's a good idea I can try to work on a proposal for this function signature.
-- Tarek Ziadé | http://ziade.org | オープンソースはすごい!
Tarek Ziadé <ziade.tarek@...> writes:
I think logging is fine, but it misses a few pythonic functions on the top of it to work with.
Right now, if you want to set up a logging output on a file or on stdout with some options, you have to write 5 or 6 lines of code.
There's basicConfig, which can be used like

    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s %(levelname)s %(message)s',
                        filename='/tmp/myapp.log', filemode='w')

for a file and

    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s %(levelname)s %(message)s',
                        stream=sys.stdout)

for stdout. This configures the root logger, but you could obviously add a loggerName argument that configured and returned a configured logger with that name. That wouldn't even break backward compatibility.
It's not in general a desirable pattern to have handlers associated with every logger - which is why I haven't provided that additional argument. It's more common to attach handlers at the root and at certain specific points in the hierarchy - for example, attach an SMTP logger to the root module for a subsystem so that emails about errors can be sent to the team looking after that subsystem.
Of course it's perfectly valid to have handlers attached to multiple loggers - but if you do that for lots of loggers, you get multiple messages and increased processing time. If that's what is wanted, fine - it's just not the norm. But having a convenience function which makes it too convenient to configure multiple handlers could lead to lots of "I'm getting messages multiple times, please help!" traffic on c.l.py.
The basic premise is - loggers map to areas in the application ("Where did it happen?") and handlers to the audience ("Who wants to know?"). Apart from in scripts intended to be run from the command line, in general you don't find a one-to-one mapping between loggers and handlers.
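The "messages multiple times" effect is easy to reproduce. The sketch below is an illustration only: the logger names `shop` and `shop.cart` and the `ListHandler` class are invented, and the subsystem logger has propagation switched off purely to keep the demo isolated from the root logger.

```python
import logging

seen = []


class ListHandler(logging.Handler):
    """Hypothetical handler that records which handler saw which message."""

    def __init__(self, label):
        super().__init__()
        self.label = label

    def emit(self, record):
        seen.append((self.label, record.getMessage()))


# A handler attached at the subsystem level ...
subsystem = logging.getLogger("shop")
subsystem.setLevel(logging.INFO)
subsystem.propagate = False
subsystem.addHandler(ListHandler("subsystem"))

# ... and another one attached directly to a child logger.
child = logging.getLogger("shop.cart")
child.addHandler(ListHandler("child"))

# One call, two handlers: the record is handled by the child's handler and
# then propagates up to the handler on "shop".
child.info("item added")
```

A single `info()` call ends up in `seen` twice, once per handler. With one handler per audience attached higher up in the hierarchy, the same call would be handled once.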
On Fri, Sep 18, 2009 at 11:18 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
but you could obviously add a loggerName argument that configured and returned a configured logger with that name. That wouldn't even break backward compatibility.
That would be great. With an option to make the logger instance 'standalone' (e.g. blocking the propagation of the messages to the other handler to avoid the problem you've described below)
It's not in general a desirable pattern to have handlers associated with every logger - which is why I haven't provided that additional argument. It's more common to attach handlers at the root and at certain specific points in the hierarchy - for example, attach an SMTP logger to the root module for a subsystem so that emails about errors can be sent to the team looking after that subsystem.
Of course it's perfectly valid to have handlers attached to multiple loggers - but if you do that for lots of loggers, you get multiple messages and increased processing time. If that's what is wanted, fine - it's just not the norm. But having a convenience function which makes it too convenient to configure multiple handlers could lead to lots of "I'm getting messages multiple times, please help!" traffic on c.l.py.
The basic premise is - loggers map to areas in the application ("Where did it happen?") and handlers to the audience ("Who wants to know?"). Apart from in scripts intended to be run from the command line, in general you don't find a one-to-one mapping between loggers and handlers.
-- Tarek Ziadé | http://ziade.org | オープンソースはすごい!
Tarek Ziadé <ziade.tarek@...> writes:
That would be great. With an option to make the logger instance 'standalone' (e.g. blocking the propagation of the messages to the other handler to avoid the problem you've described below)
That could work. The additional keyword argument would be propagate, defaulting to None, and (if not specified) internally set as True if no logger name is provided, or False if it is. Let's just wait to see if any other ideas pop up. Regards, Vinay Sajip
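To make the proposal concrete, here is a rough sketch of such a convenience function written as a standalone helper. It is not part of the logging module: the name `basic_logger` and its parameters are assumptions, and only the `propagate` default follows the rule described above (True when no logger name is given, False when one is).

```python
import io
import logging
import sys


def basic_logger(name=None, level=logging.INFO, stream=None,
                 fmt="%(asctime)s %(levelname)s %(message)s",
                 propagate=None):
    """Hypothetical helper: configure and return a logger in one call."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    if propagate is None:
        # Root logger: propagate stays True. Named logger: standalone.
        propagate = name is None
    logger.propagate = propagate
    return logger


# Usage: a standalone logger writing to an in-memory stream.
buf = io.StringIO()
log = basic_logger("demo.standalone", stream=buf,
                   fmt="%(levelname)s %(message)s")
log.info("ready")
```

Note that calling this helper twice with the same name adds a second handler, which is the duplicate-message trap discussed earlier; a real API would have to decide how to handle that.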
On Fri, Sep 18, 2009 at 4:54 AM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
On Fri, Sep 18, 2009 at 10:43 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Antoine Pitrou <solipsis@...> writes:
Pylons uses it too.
While logging may not look extremely pretty, I have yet to see another logging library that has a pretty *and* powerful API. Applications / libraries which reinvent their own logging API usually end up with something which is both less powerful and *not* prettier (see twisted.log for an example).
Thanks. If you had the time to write your ideal, "pretty" API - hypothetically, say, to wrap logging so you wouldn't use the underlying power - what would this API look like? I'm open to ideas from all of you.
I think logging is fine, but it misses a few pythonic functions on the top of it to work with.
Right now, if you want to set up a logging output on a file or on stdout with some options, you have to write 5 or 6 lines of code.
That would be exactly my complaint. It feels like I'm writing Java instead of Python. :-) FWIW, logging has been quite flexible for my needs. Just getting it all configured is a bit of work.
These 5/6 lines could probably be put in a function in the logging module, and be used with a few arguments. This function would return a logger ready to be used.
That would definitely be an improvement! Now back to lurk mode... -John
On Sep 18, 2009, at 4:43 AM, Vinay Sajip wrote:
Thanks. If you had the time to write your ideal, "pretty" API - hypothetically, say, to wrap logging so you wouldn't use the underlying power - what would this API look like? I'm open to ideas from all of you.
So, I'm a big fan of the logging package and thank Vinay for his work on it over the years. It was quite a joy to chuck all the hacky Mailman 2 logging crap in favor of the standard logging package for MM3. The one thing I (very) occasionally want is to ask a logger for a file-like object suitable for print. There are some situations where I have a 3rd party API that requires a file-like object to output to, but I really want that output to go to a log. I'm pretty sure I've wrangled it out of a file-based logger, but it would be nice to have this as an official API. Maybe it's there and I've just missed it though.
-Barry
Barry Warsaw <barry@...> writes:
So, I'm a big fan of the logging package and thank Vinay for his work on it over the years. It was quite a joy to chuck all the hacky Mailman 2 logging crap in favor of the standard logging package for MM3.
Yay, a thumbs up! Thanks, Barry, you're very welcome!
The one thing I (very) occasionally want is to ask a logger for a file-like object suitable for print. There are some situations where I have a 3rd party API that requires a file-like object to output to, but I really want that output to go to a log. I'm pretty sure I've wrangled it out of a file-based logger, but it would be nice to have this as an official API. Maybe it's there and I've just missed it though.
A StreamHandler (which includes file-based handlers) has a stream attribute which is a file-like object. Just pass that to your 3rd-party API. N.B. If it's a file-based handler and you specify a delay argument to the handler constructor, the file isn't actually opened until you first log to it, so stream will be None. So you might need to be careful about that (by default, a file-based handler doesn't delay opening the file). Regards, Vinay Sajip
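A short sketch of the `stream` attribute and the `delay` caveat described above, using a temporary directory so that nothing outside it is touched. The logger name and file name are made up for the example.

```python
import logging
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.log")

# With delay=True the file is not opened when the handler is created.
handler = logging.FileHandler(path, delay=True)
stream_before = handler.stream

logger = logging.getLogger("demo.stream")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
logger.info("opens the file")

# After the first record has been emitted, stream is a real file object
# that can be handed to code expecting something file-like.
stream_after = handler.stream
stream_after.write("written directly by a third-party API\n")

logger.removeHandler(handler)
handler.close()

with open(path) as f:
    contents = f.read()
```

Anything written directly to the stream bypasses levels, filters and formatters, so it lands in the file as raw text next to the formatted log records.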
On Sep 18, 2009, at 9:54 AM, Vinay Sajip wrote:
A StreamHandler (which includes file-based handlers) has a stream attribute which is a file-like object. Just pass that to your 3rd-party API.
Thanks Vinay. If that's part of the public (i.e. supported) API, could you add that to the documentation?
N.B. If it's a file-based handler and you specify a delay argument to the handler constructor, the file isn't actually opened until you first log to it, so stream will be None. So you might need to be careful about that (by default, a file-based handler doesn't delay opening the file).
I see. Is .stream used internally? Could it be a property that ensures its underlying file is open when it's accessed? Also, if the handler is not StreamHandler, is it insane to want to print to it? IOW, would it make sense to implement .write() on the base Handler?
-Barry
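One possible shape for what Barry is asking about is a small adapter that looks like a file but forwards each completed line to a logger, so that it works with any handler. This is a sketch of the idea, not an existing logging API; `LoggerWriter` and `ListHandler` are hypothetical names.

```python
import logging


class LoggerWriter:
    """Hypothetical file-like adapter that sends written lines to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level
        self._buffer = ""

    def write(self, text):
        # Collect partial writes and log each completed line exactly once.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self.logger.log(self.level, "%s", line)
        return len(text)

    def flush(self):
        # Emit whatever is left without a trailing newline.
        if self._buffer:
            self.logger.log(self.level, "%s", self._buffer)
            self._buffer = ""


captured = []


class ListHandler(logging.Handler):
    def emit(self, record):
        captured.append(record.getMessage())


log = logging.getLogger("demo.writer")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(ListHandler())

out = LoggerWriter(log)
print("hello", "world", file=out)   # print() performs several small writes
out.write("partial")
out.flush()
```

Because the adapter goes through the logger rather than a handler's stream, levels, filters and formatters all apply, and it does not matter whether any handler is file-based.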
On Fri, Sep 18, 2009 at 4:43 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Antoine Pitrou <solipsis@...> writes:
Pylons uses it too.
While logging may not look extremely pretty, I have yet to see another logging library that has a pretty *and* powerful API. Applications / libraries which reinvent their own logging API usually end up with something which is both less powerful and *not* prettier (see twisted.log for an example).
Thanks. If you had the time to write your ideal, "pretty" API - hypothetically, say, to wrap logging so you wouldn't use the underlying power - what would this API look like? I'm open to ideas from all of you.
Hi Vinay; I've kept my mouth shut (well, on this subject) simply due to the fact I tend to feel API design is a bit of a "smell" thing.

First off; thank you for the package. As much as I might not like the API - I use the logging package an *insane* amount. I also know you're responsive, and you care strongly about it. I myself might come off as slightly defensive for my own module (multiprocessing) if someone were to just say "lol it sux hahaha" (in fact, I have).

You are right; it is flexible, and it is meant for a wide range of use cases, and when you really "get it" it can be wildly powerful (I think David Beazley wrote a twitter handler!).

However, my particular gut feeling when dealing with it stems from something I can't quite communicate properly. I *want* something "simpler" - for example, something which logs messages at a certain level to stderr (but not stdout) and stdout to stdout (but not stderr) - but also has a file logger.

I don't want to have to write more than a few lines of code to do this - in *my* mind this is something so fundamental to unixy-scripts/daemons that it should be as simple as:

    import logging
    log = logging.get_log('mylog')
    log.warning('hay guys')

What I end up doing in most of my projects (sorry, not public ones) is wrapping this in a "jesse.log" module that offers that API. The user does not see the complexity of the underlying logging module's APIs. In fact, I have a nasty tendency to create one "log" object which also has a fair amount of the logging module's API pushed into it, e.g.:

    from jesse import log
    log = log()
    log.critical('yay!')
    log.set_level(log.WARNING) # I loathe BouncyNames
    log.add_handler(log.file_handler(level=log.CRITICAL))

The other aspect of this is my experience trying to explain logging to people who have never dealt with that module before. Recently, I was trying to explain it to someone who has limited python knowledge, and really just wanted something like what I describe above. They read the docs, re-read the docs, re-re-re-read the docs, and still came to me and said "how on earth do I do this?!". The API doesn't (and this again, is a smell thing) feel python-y - it feels very Java like (having experienced log4j, I can say it really does feel like it). I think this trips newbies up quite a bit.

Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose. Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.

Like I said at the start - this is all a "smell" thing, and it obviously varies from person to person. This is fundamentally why I was interested in encouraging Mishok and others to put concrete ideas together (I would be interested in seeing an alternative implementation as a thought exercise). I know others besides me have written little wrappers around logging, for example:

    http://pypi.python.org/pypi/easylog/
    http://pypi.python.org/pypi/autolog/
    http://pypi.python.org/pypi/sensible/

Perhaps that's a good place to start - higher level functions/methods/etc to "scale down" logging's perceived complexity? I know I'm trying to do bits of that for multiprocessing.

Then of course, there's time for something completely different ;)

    http://code.zacharyvoase.com/lumberjack/src/

jesse
On Sep 18, 2009, at 9:57 AM, Jesse Noller wrote:
Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose. Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.
We use the logging module extensively at work and have found the shared state (even in a multi-threaded/multi-process situation) *much* nicer than passing around log handles. In fact, we recently overhauled several processes that, for legacy reasons, had been using log handles and now the code is simpler and the logs are easier to follow. As far as code verbosity goes, we've found the "secret" to be using a configuration file. That makes it easy to set up default logging and if our integrators want to tie in to their "enterprise" logging facility they can simply reconfigure it. We don't have to build custom hooks for them because they're already included. We *love* writing less code! Vinay, contact me off list if you want to talk about documentation. I'd be happy to help in any way I can. Doug
Doug Hellmann <doug.hellmann@...> writes:
We use the logging module extensively at work and have found the shared state (even in a multi-threaded/multi-process situation) *much* nicer than passing around log handles... ...now the code is simpler and the logs are easier to follow.
Thanks for sharing your experience.
As far as code verbosity goes, we've found the "secret" to be using a configuration file... We don't have to build custom hooks for them because they're already included. We *love* writing less code!
Vinay, contact me off list if you want to talk about documentation.
Thanks, Doug, I'll definitely do that. Regards, Vinay Sajip
On Fri, Sep 18, 2009 at 1:17 PM, Barry Warsaw <barry@python.org> wrote:
On Sep 18, 2009, at 10:07 AM, Doug Hellmann wrote:
We use the logging module extensively at work and have found the shared state (even in a multi-threaded/multi-process situation) *much* nicer than passing around log handles.
Hear, hear. -Barry
Also +1; while the concept of shared state is bad in some contexts, for apps where you want one central logger the global registry *is* nice.
Jesse Noller schrieb:
Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose. Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.
Uh, the logging docs already contain most of Doug's MOTW for logging in the "Tutorial/Simple examples" section. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
On Fri, Sep 18, 2009 at 10:15 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Jesse Noller schrieb:
Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose. Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.
Uh, the logging docs already contain most of Doug's MOTW for logging in the "Tutorial/Simple examples" section.
Georg
I know, but I think that further work doesn't hurt - it only helps. In fact I've still got a checkout where I'm trying to mind-meld multiprocessing's docs with Doug's writing. We should just give doug a commit bit. jesse
Jesse Noller schrieb:
On Fri, Sep 18, 2009 at 10:15 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Jesse Noller schrieb:
Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose. Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.
Uh, the logging docs already contain most of Doug's MOTW for logging in the "Tutorial/Simple examples" section.
Georg
I know, but I think that further work doesn't hurt - it only helps. In fact I've still got a checkout where I'm trying to mind-meld multiprocessing's docs with Doug's writing.
We should just give doug a commit bit.
Fine by me, but does he want it? (And if he had it, he'd probably not know where to start...) Georg
On Fri, Sep 18, 2009 at 10:37 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Jesse Noller schrieb:
On Fri, Sep 18, 2009 at 10:15 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Jesse Noller schrieb:
Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose. Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.
Uh, the logging docs already contain most of Doug's MOTW for logging in the "Tutorial/Simple examples" section.
Georg
I know, but I think that further work doesn't hurt - it only helps. In fact I've still got a checkout where I'm trying to mind-meld multiprocessing's docs with Doug's writing.
We should just give doug a commit bit.
Fine by me, but does he want it? (And if he had it, he'd probably not know where to start...)
Georg
Yes - I think he is open to it, and could be an excellent contributor to the docs at the very least. jesse
2009/9/18 Jesse Noller <jnoller@gmail.com>:
I've kept my mouth shut (well, on this subject) simply due to the fact I tend to feel API design is a bit of a "smell" thing. First off; thank you for the package. [...] Perhaps that's a good place to start - higher level functions/methods/etc to "scale down" logging's perceived complexity? I know I'm trying to do bits of that for multiprocessing.
Basically, exactly what Jesse said applies with me - except that I don't use the logging module intensively, and when I do use it the lack of a simple approach makes it a struggle (nagging "why don't I just use print statements" questions in the back of my mind :-))

If I try to recall my last experience, things that struck me as "too hard" were:

- logging.getLogger().warning(...) is irritatingly verbose ("feels like Java"). Hmm, looks like I missed the module-level warning() function and its cousins - don't know how I managed that! That's basically my mistake (I can't even really claim that the documentation is hard to find, looks like I just missed it). It might be nice if the module-level functions had a logger='whatever' argument to ease the change from "simple" requirements (root logger only) to more complex ones (multiple loggers), but that's hardly critical.

- The file configuration seems very complex, for simple uses. I looked at using it, but ultimately rolled my own, because I only wanted a few simple options (which, no surprise, grew as time went on :-)). Currently my application config consists of

      [Log]
      level: ERROR
      file: {APPDIR}\{APP}.log
      maxsize: 1M

  log level, file to log to (with a couple of simple templating parameters and special cases of "stdout" and "stderr") and a max size (if set, use a rotating file handler). Of course, writing a simpler version of logging.config.fileConfig can be done as a 3rd party addition, so it's not fundamental to the logging module.

Oh, and finally I *hate* Java-style camelCase but that's purely a preference thing, and it's not going to change for compatibility reasons, so let's ignore that.

Anyway, ultimately I have no significant issues with the logging module. Maybe it doesn't make the simple cases as simple as I'd like, but it gives me all the power I'm ever likely to need, and then some. So thanks for all your work on it!

Paul
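A rough sketch of what such a "simpler fileConfig" could look like, built only on the stdlib. This is not Paul's actual code: the function names `parse_size` and `handler_from_config`, the size suffixes, the `WARNING` default and the single backup file are all assumptions made for illustration.

```python
import configparser
import logging
import logging.handlers
import sys

SIZE_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Turn '1M' or '512K' into a byte count (the suffix set is an assumption)."""
    text = text.strip().upper()
    if text and text[-1] in SIZE_SUFFIXES:
        return int(text[:-1]) * SIZE_SUFFIXES[text[-1]]
    return int(text)


def handler_from_config(config_text, **template_values):
    """Build one handler from a [Log] section with level/file/maxsize keys."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = parser["Log"]

    # Fill in {APPDIR}-style placeholders; "stdout"/"stderr" are special cases.
    target = section.get("file", "stderr").format(**template_values)
    if target in ("stdout", "stderr"):
        handler = logging.StreamHandler(getattr(sys, target))
    elif "maxsize" in section:
        handler = logging.handlers.RotatingFileHandler(
            target,
            maxBytes=parse_size(section["maxsize"]),
            backupCount=1,  # rollover only happens when at least one backup is kept
            delay=True,     # do not create the file until something is logged
        )
    else:
        handler = logging.FileHandler(target, delay=True)

    handler.setLevel(section.get("level", "WARNING").upper())
    return handler
```

A caller would attach the result with `logging.getLogger().addHandler(...)`; keeping the parsing separate from the attachment makes the function easy to try out without touching global logging state.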
Paul Moore <p.f.moore@...> writes:
- The file configuration seems very complex, for simple uses. I looked at using it, but ultimately rolled my own, because I only wanted a few simple options (which, no surprise, grew as time went on ). [snip] Oh, and finally I *hate* Java-style camelCase but that's purely a preference thing, and it's not going to change for compatibility reasons, so let's ignore that.
Okay, I've commented elsewhere that I'm happy to provide unix_style_underscored_names.
Anyway, ultimately have no significant issues with the logging module. Maybe it doesn't make the simple cases as simple as I'd like, but it gives me all the power I'm ever likely to need, and then some. So thanks for all your work on it!
For non-rotating files, it's really as simple as

    import logging
    logging.basicConfig(filename='/tmp/logging_example.out', level=logging.DEBUG)
    logging.debug('This message should go to the log file')

Of course for rotating it's more involved, but that's where there's really not much of a common denominator. IMO the best approach would be to add recipes to the Python Cookbook on ActiveState.
Vinay Sajip <vinay_sajip@yahoo.co.uk> writes:
Paul Moore <p.f.moore@...> writes:
Oh, and finally I *hate* Java-style camelCase […]
Okay, I've commented elsewhere that I'm happy to provide unix_style_underscored_names.
I think you mean “python_pep8_style_underscored_names” :-) How would these names be provided? As simple ‘fooName = foo_name’? Or would the PEP-8 one be preferred, deprecating the non-conformant name?

-- “I filled my humidifier with wax. Now my room is all shiny.” (Steven Wright)
Ben Finney
Ben Finney <ben+python@...> writes:
I think you mean “python_pep8_style_underscored_names”
How would these names be provided? As simple ‘fooName = foo_name’? Or would the PEP-8 one be preferred, deprecating the non-conformant name?
I don't see any need for deprecation, as it's a personal preference rather than anything else. And yes, just a 'foo_name = fooName' is what I was thinking of. Now, this has come up in the past and Guido has said he doesn't think it's really worth doing. Personally, it only makes sense to me if this is going to be done for the whole of the stdlib which isn't PEP-8 conformant. And you have to question whether the time to be spent on this couldn't be spent on some *real* improvements ;-) Regards, Vinay Sajip
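The "foo_name = fooName" idea amounts to binding a second name to the same function. A sketch using a subclass so the stdlib class itself is left alone; `Pep8Logger` and the three aliases chosen are illustrative, not an existing API.

```python
import logging


class Pep8Logger(logging.Logger):
    """Hypothetical subclass: underscore names bound to the same functions."""

    set_level = logging.Logger.setLevel
    add_handler = logging.Logger.addHandler
    is_enabled_for = logging.Logger.isEnabledFor


logger = Pep8Logger("alias_demo")  # illustrative name
logger.set_level(logging.WARNING)
```

Both spellings stay available and behave identically, which is why no deprecation is needed for this variant.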
On Sun, Oct 11, 2009 at 15:04, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Ben Finney <ben+python@...> writes:
I think you mean “python_pep8_style_underscored_names”
How would these names be provided? As simple ‘fooName = foo_name’? Or would the PEP-8 one be preferred, deprecating the non-conformant name?
I don't see any need for deprecation, as it's a personal preference rather than anything else. And yes, just a 'foo_name = fooName' is what I was thinking of.
I am not going to start up a discussion on moving logging or unittest over to PEP 8 standards, but I want to make clear it is not a personal preference thing but coding standards thing. Both the logging and unittest get a pass on not meeting our standard for historical reasons and that's it; nothing to do with personal preference.
Now, this has come up in the past and Guido has said he doesn't think it's really worth doing. Personally, it only makes sense to me if this is going to be done for the whole of the stdlib which isn't PEP-8 conformant. And you have to question whether the time to be spent on this couldn't be spent on some *real* improvements ;-)
I agree. This would be more of a Py4k thing than bothering with it in Python 3 now. -Brett
Brett Cannon <brett@...> writes:
I am not going to start up a discussion on moving logging or unittest over to PEP 8 standards, but I want to make clear it is not a personal preference thing but coding standards thing. Both the logging and unittest get a pass on not meeting our standard for historical reasons and that's it; nothing to do with personal preference.
Sorry, Brett, you're right. I wasn't trying to skirt around the importance of having a common coding style. When I said "personal preference", I was referring to the fact that some people get somewhat exercised about the issue, to the extent that the word "hate" comes up in the discussion. Typically it's not because they're PEP 8 zealots; more commonly, they're used to it from their Unix/C experience and find the Pascal, Java, and C# conventions obnoxious. I'm not personally tied to any convention, I normally work with the convention for the project. Logging's divergence from PEP 8 is, as you say, historical, and when the Py4K time comes around, I'll be rolling up my sleeves and getting on with the conversion :-) Regards, Vinay Sajip
I'm not personally tied to any convention, I normally work with the convention for the project. Logging's divergence from PEP 8 is, as you say, historical, and when the Py4K time comes around, I'll be rolling up my sleeves and getting on with the conversion :-)
Regards,
Vinay Sajip
In all seriousness, why not do something like what we started with the threading module? We added pep8 aliases and started marking the old API deprecated. They'll follow the common deprecation guidelines we already have. That, plus the dict config option you mentioned plus a doc re-org would probably go a long way to improving the PR of the module. jesse
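For comparison, the alias-plus-deprecation route could be sketched as follows. This is a toy class, not the threading or logging source; `deprecated_alias` and `Channel` are hypothetical names, and the warning text is made up.

```python
import functools
import warnings


def deprecated_alias(new_method, old_name):
    """Return a wrapper that warns, then delegates to the PEP 8 named method."""

    @functools.wraps(new_method)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {new_method.__name__}()",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_method(*args, **kwargs)

    return wrapper


class Channel:
    """Toy stand-in for a class that is migrating its method names."""

    def __init__(self):
        self.level = 0

    def set_level(self, level):
        self.level = level

    # The old camelCase spelling keeps working but emits a DeprecationWarning.
    setLevel = deprecated_alias(set_level, "setLevel")
```

Existing callers keep running unchanged, and the warning gives them a release cycle or more to switch names.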
Jesse Noller <jnoller@...> writes:
In all seriousness, why not do something like what we started with the threading module? We added pep8 aliases and started marking the old API deprecated. They'll follow the common deprecation guidelines we already have.
That, plus the dict config option you mentioned plus a doc re-org would probably go a long way to improving the PR of the module.
It's just a question of priorities and timing. Clearly, the dict config option, other functional enhancements and documentation reorganisation should take priority. I'm waiting for a little while to see what feedback comes in from python-dev about the dict config functionality. I've been in touch with Doug Hellmann on the documentation reorganisation and am waiting for some feedback from him, and I've started to put together some diagrams which will hopefully help clarify some aspects of the documentation. Regards, Vinay Sajip
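For readers arriving later: a dictionary-based configuration did eventually land in the stdlib as `logging.config.dictConfig`. The sketch below uses that later API and may differ in detail from the proposal under discussion at the time of this thread; the logger name and format string are illustrative.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "brief": {"format": "%(levelname)s:%(name)s:%(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "brief",
            "level": "INFO",
            "stream": "ext://sys.stderr",
        },
    },
    "loggers": {
        # Illustrative logger name; real applications would use their own.
        "dictconfig_demo": {
            "level": "DEBUG",
            "handlers": ["console"],
            "propagate": False,
        },
    },
}

logging.config.dictConfig(LOGGING)
configured = logging.getLogger("dictconfig_demo")
```

Since the configuration is plain data, it can be loaded from JSON, YAML or any other format an application already uses.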
On Oct 12, 2009, at 1:20 PM, Vinay Sajip wrote:
Jesse Noller <jnoller@...> writes:
In all seriousness, why not do something like what we started with the threading module? We added pep8 aliases and started marking the old API deprecated. They'll follow the common deprecation guidelines we already have.
That, plus the dict config option you mentioned plus a doc re-org would probably go a long way to improving the PR of the module.
It's just a question of priorities and timing. Clearly, the dict config option, other functional enhancements and documentation reorganisation should take priority.
I'm waiting for a little while to see what feedback comes in from python-dev about the dict config functionality. I've been in touch with Doug Hellmann on the documentation reorganisation and am waiting for some feedback from him, and I've started to put together some diagrams which will hopefully help clarify some aspects of the documentation.
Jesse keeps bugging me to review PyCon proposals ;-), so I haven't had a chance to dig into the doc work, yet. It's still on my list, though. Doug
Jesse Noller <jnoller@...> writes:
I've kept my mouth shut (well, on this subject) simply due to the fact I tend to feel API design is a bit of a "smell" thing. First off; thank you for the package. As much as I might not like the API - I use the logging package an *insane* amount. I also know you're responsive, and you care strongly about it. I myself might come off as slightly defensive for my own module (multiprocessing) if someone were to just say "lol it sux hahaha" (in fact, I have).
You are right; it is flexible, and it is meant for a wide-range of use cases, and when you really "get it" it can be wildly powerful (I think David Beazley wrote a twitter handler!).
It was only a matter of time. Next, a Google Wave handler. ;-)
However, my particular gut feeling when dealing with it stems from something I can't quite communicate properly. I *want* something "simpler" - for example, something which logs messages at a certain level to stderr (but not stdout) and stdout to stdout (but not stderr) - but also has a file logger.
I don't want to have to write more than a few lines of code to do this - in *my* mind this is something so fundamental to unixy-scripts/daemons that it should be as simple as:
    import logging
    log = logging.get_log('mylog')
    log.warning('hay guys')
What I end up doing in most of my projects (sorry, not public ones) is wrapping this in a "jesse.log" module that offers that API. The user does not see the complexity of the underlying logging module's APIs. In fact, I have a nasty tendency to create one "log" object which also has a fair amount of the logging module's API pushed into it, e.g.:
    from jesse import log
    log = log()
    log.critical('yay!')
    log.set_level(log.WARNING) # I loathe BouncyNames
    log.add_handler(log.file_handler(level=log.CRITICAL))
One reason why people think logging is complicated is that they typically have to contend with two things - loggers *and* handlers - whereas they have perhaps in the past just used a logger, which either logs to file, or to console, or whatever. For simple applications, this might work. However, for more sophisticated requirements, it really helps to keep the two concepts separate: "Where did it happen?" (the logger) and "Who wants to know?" (the handler). In my experience, these are in general not mapped one-to-one. In a typical enterprise use case, a critical error might page an on-call member of the support team, write a detailed message to log, print a simplified message to console, and email the incident to a developer team mailbox. The logger is the domain of the developer. The handler is the domain of the deployer. In many scenarios, of course, they are the same constituency - but in many other scenarios, it's not so.

On the BouncyName thing - yes I know it's a religious thing for some people. I've worked in Python, Java and C# amongst others and I've seen it all. My cardinal rule is to fit in with whatever's already there, and although PEP 8 mandates names with no bounce, the logging package had already been released and was in use before being proposed for inclusion in Python, and was using camelCaseNames. Furthermore, the stdlib didn't at the time religiously stick to PEP 8 in this regard. As I'm agnostic on this point, I've no problem in providing all the method names using unix_style_underscore_notation, though of course the other names will still be there for backwards compatibility :-)

Where you say "smell", I say "taste". It has a more positive connotation, but means more or less the same thing in this context. For instance, I would not naturally use "log" to name a logger, as in my mind, a logger is something that facilitates getting some information into a log, and the log is the final substrate for that information - the file, the console screen, the email or whatever. So IMO there are elements of personal taste which we just have to pass over when considering API design in general and naming in particular, and when working with others who may not think exactly as we do.

I mentioned that the handler is the domain of the deployer, by which I mean that using hard-coded set_level and add_handler calls in code is not always the right thing to do. (To be sure, it is the right thing to do on many occasions). Of course your simple basic example is doable as

    import logging, sys
    logging.basicConfig(level=logging.DEBUG, stream=sys.stderr)
    logger = logging.getLogger('myapp')
    logger.warning('hay guys')

which is only one line longer than your snippet - but in general, it's better if the verbosity of logging in an application can be turned up and down without having to change the application. It's not uncommon for people to use configuration files to configure levels and handlers, and in the case of long-running daemon processes this can even be done without stopping the process.

The Python logging package's configuration format uses ConfigParser, which doesn't float everybody's boat. If it is a cause of extreme hatred or even mild distaste, you can always come up with your own configuration file format which is more to your liking, and configure programmatically from that. Why ConfigParser? one may ask. I don't believe that introducing my own, different, ad-hoc configuration file format would have been the right thing to do. TOOWTDI, and all that. Also, at the time, it was important not to add too many APIs to the system, using the guiding principle of YAGNI (You Aren't Gonna Need It), particularly if they were things that would perhaps be more open to subjective judgements. Feel free to search the python-dev mail archives for YAGNI in the late 2001-early 2002 timeframe.

If I come up with a simple easy-to-use wrapper that you like, then it's quite possible that some other smart-and-opinionated developer will complain that it doesn't do it for him/her. So - if it's just five lines of code in a project, even if it's for every project, then it's hardly worth secreting much bile over, is it? (I'm not saying *you* are.)

There are also some archaic constructs which result from trying to provide Python 1.5.2 compatibility. At the time, there was a good reason for this - most Linux distros were shipping with 1.5.2, although 2.2 had been around for some time. I felt at the time that it was important to support developers and sysadmins who would have had to use the system's Python rather than upgrading to 2.3. Of course the landscape has now changed considerably, and I am not of the opinion that 1.5.2 support is still important. The archaic relics will disappear over time, no doubt.
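The logger/handler split described above can be sketched against the use case quoted earlier (low-severity messages to stdout, warnings and above to stderr, optionally everything to a file). `MaxLevelFilter` and `configure` are hypothetical helper names, and the cut-off at INFO is an assumption.

```python
import io
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Let through only records at or below max_level."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level


def configure(logger, out=None, err=None, log_file=None):
    """Attach handlers: <= INFO to `out`, >= WARNING to `err`, all to a file."""
    out_handler = logging.StreamHandler(sys.stdout if out is None else out)
    out_handler.addFilter(MaxLevelFilter(logging.INFO))

    err_handler = logging.StreamHandler(sys.stderr if err is None else err)
    err_handler.setLevel(logging.WARNING)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    if log_file is not None:
        logger.addHandler(logging.FileHandler(log_file))
    logger.setLevel(logging.DEBUG)
    return logger


# The calling code only ever talks to the logger ("where did it happen?");
# the handlers ("who wants to know?") are decided in configure().
out, err = io.StringIO(), io.StringIO()
app_logger = configure(logging.getLogger("split_demo"), out=out, err=err)
app_logger.propagate = False
app_logger.info("starting up")
app_logger.warning("hay guys")
```

Changing where messages end up means editing `configure` (or replacing it with a configuration file), while every `app_logger.info(...)` call stays as it is.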
The other aspect of this is my experience trying to explain logging to people who have never dealt with that module before. Recently, I was trying to explain it to someone who has limited python knowledge, and really just wanted something like what I describe above. They read the docs, re-read the docs, re-re-re-read the docs, and still came to me and said "how on earth do I do this?!". The API doesn't (and this again, is a smell thing) feel python-y - it feels very Java like (having experience log4j, I can say it really does feel like it). I think this trips newbies up quite a bit.
I don't know why, particularly, apart from the conflating-handlers-and-loggers thing. But, as I explained above, that separation of concerns is there for a reason. I'm well aware of the Java predilection for over-complicating things, and I have not drunk that particular Kool-Aid.

I don't blog, but following a recent discussion with Glyph Lefkowitz of Twisted Matrix, I created a blog where I will put information related to Python logging which perhaps doesn't belong in the documentation. The first post is, reasonably enough, entitled "Python Logging 101" and is available at

    http://plumberjack.blogspot.com/2009/09/python-logging-101.html

Perhaps you could take a look at it. It really is a bit of a 101 in terms of going from first principles; it's distilled from an article draft I had prepared for DDJ back in 2002, but which never got accepted :-( [In those days, DDJ was a prestigious print magazine, not vendor-aligned, rather than a website-with-mailshots which is hard to distinguish from a Microsoft tentacle.] It's only a five-minute-or-so read, and perhaps might explain why logging's broad design is as it is.
Part of the newb-to-not-newb transition would be helped by possibly simplifying the docs (something I am *still* working on for multiprocessing) - the examples can be like drinking from a fire hose.
Well the basic example is

    import logging
    logging.debug('A debug message')
    logging.info('Some information')
    logging.warning('A shot across the bows')

which is hardly challenging, and it builds up from there.
Doug Hellmann - the author of the Python Module of the Week is an *excellent* doc writer (see his logging write up here: http://www.doughellmann.com/PyMOTW/logging/index.html#module-logging) and might be willing to help, give pointers, what have you.
I'll definitely take it up with Doug. AFAIK, the current introductory documentation on logging owes, I believe, something to Doug's PyMOTW entry for logging, but I'm not sure as I didn't write that material. (I can't remember who did, but it was in response to "logging is complicated" feedback.)
Like I said at the start - this is all a "smell" thing, and it obviously varies from person to person. This is fundamentally why I was interested in encouraging Mishok and others to put concrete ideas together (I would be interested in seeing an alternative implementation as a thought exercise). I know others besides me have written little wrappers around logging, for example:
http://pypi.python.org/pypi/easylog/ http://pypi.python.org/pypi/autolog/ http://pypi.python.org/pypi/sensible/
Perhaps that's a good place to start - higher level functions/methods/etc to "scale down" logging's perceived complexity? I know I'm trying to do bits of that for multiprocessing.
I'm not sure how much traction they've got. Anybody know? The very fact that there are three seems to indicate that this is an area where subjective judgements play a part. So, people can pick whichever one they want, and off they go. If I incorporated one of their versions into the stdlib, presumably the others would still be around, along with your variation on the same theme.
Then of course, there's time for something completely different ;)
Is Zachary actually serious about this as an alternative to Python logging? (I'm not dissing it - just asking if he is seriously committed to getting it to have the same level of functionality.) From the screencast, I got the impression (perhaps mistakenly) that it was motivated at least in part by "Oh look! Coroutines! Shiny new Python toys. Let's see what we can do with them!" After an initial flurry of work on it, things have gone quiet over the last four weeks. Perhaps he's not pushed his changes to BitBucket because he's still working on them. Coroutines are undeniably nice for some things - but I can't say I see a no-brainer fit with logging. I know coroutines are new in Python but I've seen them come and go in popularity a couple of times in different environments. Regards, Vinay Sajip
Vinay, you get a big thumbs-up from me too, for the logging package and your maintenance and support. Vinay Sajip schrieb:
The Python logging package's configuration format uses ConfigParser, which doesn't float everybody's boat. If it is a cause of extreme hatred or even mild distaste, you can always come up with your own configuration file format which is more to your liking, and configure programmatically from that.
Although I like and use logging, I am more in the hatred camp when it comes to the logging configuration. What I am really missing is a way to do basic logging configuration (calling logging.basicConfig(...)) with some command line args for the Python interpreter. I have suggested that before IIRC, but it doesn't seem to fly. -- Thanks, Thomas
Thomas Heller <theller@...> writes:
Vinay, you get a big thumbs-up from me too, for the logging package and your maintenance and support.
Thanks.
Although I like and use logging, I am more in the hatred camp when it comes to the logging configuration.
I don't blame you. It's partly, though not all, a ConfigParser thing. The difficulty for me with configuration is, there's very little chance of a consensus for how it should work. Everybody will have their opinion, they'll all have good points, but nobody will agree :-( Backwards compatibility might be manageable, if we stick to a ConfigParser format and introduce a "version" key somewhere.
What I am really missing is a way to do basic logging configuration (calling logging.basicConfig(...)) with some command line args for the Python interpreter. I have suggested that before IIRC, but it doesn't seem to fly.
Can you point me to a ticket or post about it? Thanks and regards, Vinay Sajip
Vinay Sajip wrote:
Thomas Heller <theller@...> writes:
Although I like and use logging, I am more in the hatred camp when it comes to the logging configuration.
I don't blame you. It's partly, though not all, a ConfigParser thing. The difficulty for me with configuration is, there's very little chance of a consensus for how it should work. Everybody will have their opinion, they'll all have good points, but nobody will agree :-( Backwards compatibility might be manageable, if we stick to a ConfigParser format and introduce a "version" key somewhere.
You should be fine with me: I don't care because I don't use the config files.
What I am really missing is a way to do basic logging configuration (calling logging.basicConfig(...)) with some command line args for the Python interpreter. I have suggested that before IIRC, but it doesn't seem to fly.
Can you point me to a ticket or post about it?
I'm not sure there is a ticket or feature request for it - probably not. What I would like to see is something like this:
    python -L level=DEBUG script.py scriptargs ...
which then calls
    import logging; logging.basicConfig(level=logging.DEBUG)
Unfortunately it's not possible to access command line arguments in sitecustomize.py, otherwise I would have hacked it there...
-- Thanks, Thomas
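Until something like this exists, the behaviour can be approximated with a small wrapper script. The following is only a sketch: the `-L` switch is a proposal, not an existing interpreter option, and the names `parse_logging_option`, `main` and the wrapper file name `logrun.py` are made up for illustration.

```python
import logging
import runpy
import sys


def parse_logging_option(option):
    """Turn a string such as 'level=DEBUG' into basicConfig() keyword arguments."""
    kwargs = {}
    for item in option.split(","):
        key, _, value = item.partition("=")
        key = key.strip()
        value = value.strip()
        if key == "level":
            # Map the level name to its numeric constant, e.g. 'DEBUG' -> logging.DEBUG.
            value = getattr(logging, value.upper())
        kwargs[key] = value
    return kwargs


def main(argv):
    # Hypothetical usage: python logrun.py level=DEBUG script.py scriptargs ...
    option, script = argv[1], argv[2]
    logging.basicConfig(**parse_logging_option(option))
    sys.argv = argv[2:]  # the target script sees its own name and arguments
    runpy.run_path(script, run_name="__main__")
```

A real wrapper would end with a call to `main(sys.argv)`; it is left out here so the functions can be imported and tried on their own.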
Vinay Sajip wrote:
Thomas Heller <theller@...> writes:
What I am really missing is a way to do basic logging configuration (calling logging.basicConfig(...)) with some command line args for the Python interpreter. I have suggested that before IIRC, but it doesn't seem to fly.
Can you point me to a ticket or post about it?
I've opened a feature request at the Python tracker: http://bugs.python.org/issue6958 -- Thanks, Thomas
Hi, Vinay Sajip wrote:
It's easy for you to just say "it's broken by design", and that's only a more polite way of saying "it sucks". It doesn't strike me as a basis for constructive dialogue, unless you provide some more specifics.
Of course it is easier.
I think you are being dogmatic, rather than pragmatic. The Zen of Python says, "practicality beats purity." Your bugbear here seems to be how shared state causes problems in web applications. Despite having shared state, AFAIK the logging module is quite usable in a web context - as well as the usage with Django and Tornado that I mentioned earlier, Google App Engine uses it too (meaning, all the web applications developed with GAE can use it).
Django has a global settings module and yet there are tons of developers hating that one. Even Simon Willison agrees that the settings module was the biggest mistake made in Django.
You're a very smart guy, Armin, but you perhaps need to consider that it is possible to not like a design because it doesn't suit your personal taste - but that doesn't necessarily make it a bad design.
That does not have anything to do with personal taste or not. Some things cannot work in some situations, and logging is currently one of them. Logging was designed to be based on persistent loggers. You get one, it's there, you can log into it. SQLAlchemy for example does an incredible dance to get separate loggers for temporary database connections.
Currently, Python logging doesn't do formatting of stack traces etc. until it's sure that the message is severe enough to require handling, based on the current logger configuration.
That is actually something where I had a small problem with the logging implementation that could be fixed: the exc_text is currently only set in the format() function of the formatter, instead of being an attribute on the log record. And yes, it could be an attribute of the log record if it was a property. (I guess the reason it isn't currently is that the logging module is backwards compatible down to 1.5.2 or something.)
So at present, Python logging conforms to your statement "Only if there was any handler that wants to do anything with that message, it would pull the details from the stack and format ..."
My point was that there are no loggers, no registry of loggers, just senders and senders are arbitrary Python objects.
In Python logging, you never have to instantiate a Formatter in your code unless you want some specific formatting functionality or format. In your scheme, if a user wanted certain messages to be formatted in certain ways (and other messages in other ways - e.g. for a log file as opposed to console display), you would do this without any formatter classes - how, exactly?
In my personal experience the formatting is based on the handler. For example if I want to log into a text file, I have one formatting, if I log to stderr I have a different one. If I log into an email I have a completely different one. For example I tried a long time ago to log my exceptions from a logging handler into a new ticket in Trac. I ended up replacing a lot of code from both log record and formatter inside the log handler because of the way the message was assembled. I had to escape text and could not rely on string formatting to work.
The global registry of loggers is there to avoid the need to pass loggers around the system - you just access them by name. It's a bit like thread locals - do you have a problem with thread locals, too?
I have a problem with unnecessarily used thread locals. Not with the concept in general.
Sure, it's state shared across threads, unlike thread locals. But if it's causing you a specific problem because you've found some non-thread-safe behaviour, I'd really like to know.
I don't complain that it's not threadsafe, I'm pretty sure if logging was thread unsafe someone would have noticed by now.
Agreed, the logging from a library leading to warning messages was an annoyance in the library as first released. But *you* never raised an issue about it.
That is true and I have to change that.
If default configuration for the logging system is the biggest reason for you to hate that package, don't get mad - get even. Tell me how I messed up and how to make it better.
If I knew how to improve it so that I'm happy with it, I would have told you.
Slow? Sure, everything has a cost. It's all about tradeoffs. What specific performance problems have you come up against? What mitigating strategies did you use? What were the observed performance metrics, as against what you expected? Did you do any profiling to be sure that logging was definitely the culprit? Show me the numbers.
If your frustration with shared state is from an aesthetic point of view, I can't really help. After all, sys.modules (say) is shared state too.
http://lucumr.pocoo.org/2009/7/24/singletons-and-their-problems-in-python *cough*
You cannot delete loggers because in a multithreaded application, other threads may still be using them. You can, however, disable loggers - which, from a functional point of view, seems as good.
Which is why I would not design a logging library based on loggers.
Or are you finding that loggers are taking up too much memory?
Even if a logger would be just 8 bytes in size, it would still leak if you cannot control the number of loggers created. (see SQLAlchemy for a nice example).
That's a cheap remark, I would have expected better from you. So everybody who uses logging (and is smart, like you) hates it but uses it because they have no choice, right? It *is* widely used, AFAIK. If everyone shared your opinion, I don't think that would be the case.
A while ago when I was blogging about logging I wrote this:
[...] It's called "logging" and does exactly that --- it logs errors. I don't know why so many people miss it or just don't use it, but it's really one of the good things in the python standard library.
to which Robert Brewer replied:
That's also why it's so bad: it's so extensible and configurable that it's far too slow for high-performance websites.
And I'm pretty sure Robert knows what he's talking about.
You can't contribute constructive criticism because you don't know how? Well, you've already made some criticisms in your post, so how about putting some in more detail? I really like Jinja2 (a lot), but if someone said to you "Jinja2 sucks, it's really slow, it does all that conversion to bytecode stuff, that's really gross. I can't really say any more than that, excuse me while I vomit" then how would you feel? That's pretty much how you're coming across right now.
I gave up on defending Jinja2 a while ago. Because people from the Django world constantly call me names for enabling logic in templates ;)
I'm sorry for the way I expressed my disagreement with the library's design here. To make you feel better: of all the modules in the standard library, logging is still one of the best designed and implemented, despite my disagreement with it.
Perhaps you think I'm being oldskool by mentioning "courtesy", but I prefer to think of it as an attribute of being a grown-up. Open source is very political, but that doesn't mean it doesn't have to be polite.
I agree.
Look at Graham Dumpleton's frustration over WSGI and Python 3.0 - I feel more than a little sympathy for his situation. He's actually trying to create something useful, just as I am. There's no shortage of approaches, and opinions, and in general, I believe competition is good. But some of us are just trying to get some work done here.
And WSGI for Python 3 is something that has to be discussed. Graham's master plan is one proposal, but in my opinion not the best one.
Regards, Armin
Armin Ronacher <armin.ronacher@...> writes:
Django has a global settings module and yet there are tons of developers hating that one. Even Simon Willison agrees that the settings module was the biggest mistake made in Django.
No system is perfect, and even systems that are good to start with improve over time. I don't say that every use of shared state is justified. There's no connection I can see between haters of Django's global settings module and problems with the logging package. Care to stop waving your arms about over general principles like "shared state", and get down to specifics about logging?
That does not have anything to do with personal taste or not. Some things cannot work in some situations, and logging is currently one of them. Logging was designed to be based on persistent loggers. You get
You're still talking in very general terms - "some things cannot work in some situations" - that could really be said about most APIs. What things? What situations? I'm not saying we have to do this on this thread, else it might become off-topic - just that it should be done somewhere.
one, it's there, you can log into it. SQLAlchemy for example does an incredible dance to get separate loggers for temporary database connections.
Neither Mike Bayer nor anyone else on the SQLAlchemy team have raised this as an issue, AFAIK. Mike, if you or other SQLAlchemy committers are reading this, please get in touch. I'm not sure why they'd need to get separate loggers for temporary database connections - logging has mechanisms to log to the same logger while separating out different contextual information for transient entities (for example, database or socket connections).
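One mechanism of this kind in the standard library is `logging.LoggerAdapter`. The following is a minimal sketch, not a description of what SQLAlchemy does: the logger name `myapp.db`, the `conn_id` field and the helper `log_for_connection` are invented for illustration. A single long-lived logger is shared, and each transient connection gets a cheap adapter that attaches its own context to every record.

```python
import io
import logging

stream = io.StringIO()  # stands in for a real log destination
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s [conn %(conn_id)s] %(message)s"))

logger = logging.getLogger("myapp.db")  # the only logger that gets registered
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def log_for_connection(conn_id):
    # The adapter is an ordinary object: it is not stored in the logger
    # registry and is garbage collected together with the connection.
    return logging.LoggerAdapter(logger, {"conn_id": conn_id})


first = log_for_connection(1)
second = log_for_connection(2)
first.info("connected")
second.info("connected")
output = stream.getvalue()
```

Both messages go through the same logger, so the number of registered loggers does not grow with the number of connections.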
That is actually something where I had a small problem with the logging implementation that could be fixed: the exc_text is currently only set in the format() function of the formatter, instead of being an attribute on the log record. And yes, it could be an attribute of the log record if it was a property. (I guess the reason it isn't currently is that the logging module is backwards compatible down to 1.5.2 or something)
It *is* an attribute of the LogRecord, and always has been - not sure where you get your "facts" ;-). The reason it is not set in the constructor is to avoid doing unnecessary work - something else you complained about. The reason the exc_text is set in the formatter is this: the logger receives the event/message, conditionally passes it on to handlers (i.e. only when necessary), the handlers format and send out the message (again, only when necessary). The exc_text is stored as an attribute of the LogRecord, but only computed when the exception is formatted for output by the overridable formatException method of the Formatter. If I hadn't done it that way, you might well have said, "Bah! There's no way to control how an exception gets formatted. Logging's not reusable." Once computed, it's stored there so that later handling operations don't need to compute it again (again, to save unnecessary work).
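The lazy computation and caching described here can be observed directly. This is a small sketch under the assumption of a current Python 3 `logging` module; the class name `ShortExceptionFormatter` and the record contents are made up for the demonstration.

```python
import logging
import sys


class ShortExceptionFormatter(logging.Formatter):
    """Override the hook that decides how an exception is rendered."""

    def formatException(self, exc_info):
        exc_type, exc_value, _ = exc_info
        return "%s: %s" % (exc_type.__name__, exc_value)


try:
    raise ValueError("boom")
except ValueError:
    # Build a record by hand; normally Logger.exception() does this.
    record = logging.LogRecord(
        "demo", logging.ERROR, "demo.py", 0, "operation failed", None, sys.exc_info()
    )

before = record.exc_text  # nothing has been formatted yet
formatter = ShortExceptionFormatter("%(levelname)s %(message)s")
text = formatter.format(record)
after = record.exc_text  # cached on the record by format()
```

Because the text is cached on the record, a second handler whose formatter sees the same record reuses it instead of calling `formatException` again.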
My point was that there are no loggers, no registry of loggers, just senders and senders are arbitrary Python objects.
The point of having logger names is to control, by easy configuration, logging verbosity at different levels in your application. Different levels imply a hierarchy. A dotted-namespace is a reasonable way of codifying a hierarchy - Java, Python and many other systems use it. If you don't want to pass loggers around between function calls (and in case you say a logger could be an instance attribute, remember logging works for non-OOP function-based scripts, too), then you need some way to get hold of a certain logger from anywhere in your application. A registry provides that. It means that at runtime, without bringing down a running application, it is possible to change the verbosity of any logger. If you want arbitrary objects to send notifications to arbitrary objects, you can use something like pydispatcher. It works, and it fulfills a need, but it's not, to my mind, a logging system. You've merely stated the germ of an idea ("no loggers, no registry of loggers, just senders, and senders are arbitrary Python objects"), but I see no evidence that you've thought it through in terms of addressing a whole host of practical issues, such as configuration.
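A short sketch of the registry and hierarchy argument, with invented logger names (`shop`, `shop.db`): any function can fetch a logger by name without having it passed in, and changing the level of a parent at runtime changes what its children emit.

```python
import logging


def get_db_logger():
    # No logger object is passed around; it is looked up by name.
    return logging.getLogger("shop.db")


app_logger = logging.getLogger("shop")
app_logger.setLevel(logging.WARNING)

db_logger = get_db_logger()
same_object = db_logger is logging.getLogger("shop.db")

# The child has no level of its own, so it inherits WARNING from "shop".
before = db_logger.isEnabledFor(logging.DEBUG)

# Raise verbosity for the whole "shop" subtree while the program runs.
app_logger.setLevel(logging.DEBUG)
after = db_logger.isEnabledFor(logging.DEBUG)
```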
In my personal experience the formatting is based on the handler. For example if I want to log into a text file, I have one formatting, if I log to stderr I have a different one. If I log into an email I have a completely different one. For example I tried a long time ago to log my
Yes, that's how it is because the audiences are different. The end user may not need a timestamp and probably will not want a stack trace.
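In the logging package this per-audience formatting is expressed by giving each handler its own formatter. A minimal sketch follows; the two `StringIO` objects stand in for the console and a log file, and the logger name `audience_demo` is arbitrary.

```python
import io
import logging

console_stream = io.StringIO()  # stands in for stderr
file_stream = io.StringIO()     # stands in for a log file

# End users only see the message itself.
console = logging.StreamHandler(console_stream)
console.setFormatter(logging.Formatter("%(message)s"))

# The log file also records severity and origin; a real setup would
# probably add %(asctime)s here as well.
logfile = logging.StreamHandler(file_stream)
logfile.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("audience_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(console)
logger.addHandler(logfile)

logger.info("disk almost full")
```

The same record reaches both handlers, and each one renders it for its own audience.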
exceptions from a logging handler into a new ticket in Trac. I ended up replacing a lot of code from both log record and formatter inside the log handler because of the way the message was assembled. I had to escape text and could not rely on string formatting to work.
I don't know, without knowing your problem in more detail, if you could have avoided copying and changing code from LogRecord and Formatter. Obviously I've tried to provide enough hooks so that people can subclass and override methods for specific requirements, such as adding to a Trac ticket. If you describe the problem in more detail, I may be able to indicate a better solution. If it turns out that I need to provide more hooks where people can override functionality, I'm open to doing that.
I have a problem with unnecessarily used thread locals. Not with the concept in general.
That's okay then. I don't feel that the use of shared state in logging is unnecessary, because of the benefits it confers. It was done in a thoughtful way, not just because I don't know any better. But then, I would say that, right?
I don't complain that it's not threadsafe, I'm pretty sure if logging was thread unsafe someone would have noticed by now.
So your complaint is really just a philosophical diatribe against shared state?
If I knew how to improve it so that I'm happy with it, I would have told you.
Then you're really saying something roughly like "It's Not Invented Here, so I don't like it. The only way it could be better is if I had thought of it. Period."
can't really help. After all, sys.modules (say) is shared state too. http://lucumr.pocoo.org/2009/7/24/singletons-and-their-problems-in-python *cough*
*cough* *cough*. I've already read that post, as I referred to it earlier in this thread. Since sys.modules is shared state at a much more fundamental level than logging's logger registry, why not focus your energies on getting that changed first? If you're successful at pulling it off, it'll no doubt lead to a whole slew of changes in the Python ecosystem, of which logging is just a tiny part.
... because in a multithreaded application, other threads ... be using them. You can, however, disable loggers - which, from a functional point of view, seems as good. Which is why I would not design a logging library based on loggers.
You didn't really answer the point that you don't need to delete loggers, since disabling them is just as good.
Or are you finding that loggers are taking up too much memory? Even if a logger would be just 8 bytes in size, it would still leak if you cannot control the number of loggers created. (see SQLAlchemy for a nice example).
You (as the application developer or library developer) *can* control it, because you decide exactly which loggers get created. I'll ask Mike Bayer about the specifics of the SQLAlchemy issue.
A while ago when I was blogging about logging I wrote this:
[...] It's called "logging" and does exactly that --- it logs errors. I don't know why so many people miss it or just don't use it, but it's really one of the good things in the python standard library.
to which Robert Brewer replied:
That's also why it's so bad: it's so extensible and configurable that it's far too slow for high-performance websites.
And I'm pretty sure Robert knows what he's talking about.
No doubt he does. Logging's extensible and configurable because it has to work for *lots* of different scenarios - not *just* high-performance web sites. Obviously, that generality involves tradeoffs which may make the logging package less suitable for high-performance scenarios. Did Robert confide in what they *do* use for logging functionality? Robert - if you're reading, do you have numbers? Obviously *any* logging will involve *some* penalty. I'll be very happy if someone says, "we profiled the application with logging enabled, and we found that there was a penalty of X%. That's hard for us, but we can live with an overhead of Y%. Here are the profiler reports, can you do anything to improve it?" I might *not* be able to make much, if any, improvement, but I'd certainly take a shot at it.
I gave up on defending Jinja2 a while ago. Because people from the Django world constantly call me names for enabling logic in templates ;)
There you go. Is Jinja2 "broken by design", then? Also, I prefer Python to Ruby. Should I say "Ruby? Broken, by design"?
I'm sorry for the way I expressed my disagreement with the library's design here. To make you feel better: of all the modules in the standard library, logging is still one of the best designed and implemented, despite my disagreement with it.
What, sarcasm now? From "broken by design" to "paragon of virtue, exemplar to the world"? Please. I've been around the block a few times, no longer the new kid, and my skin is reasonably thick. I'm not crying. But keep it constructive (i.e. with improvement as a goal), that's all I'm asking.
and WSGI for Python 3 is something that has to be discussed. Graham's master plan is one proposal, but in my opinion not the best one.
My point was not about whether Graham's plan was or was not the best. It's that he wants to get *something* reasonable out there, knows the issues and is frustrated with the constant back-and-forth between conflicting opinions. We could be waiting forever for a resolution, what good is that to anybody? Remember Voltaire: "Le mieux est l'ennemi du bien." (The best is the enemy of the good.) Am I showing my age, or what? ;-) Regards, Vinay Sajip
Hi, Vinay Sajip wrote:
Care to stop waving your arms about over general principles like "shared state", and get down to specifics about logging?
I was pointing out that a similar problem exists elsewhere. And you will never bring me to a point where I say something along the lines of "shared state is okay because we cannot avoid it".
It *is* an attribute of the LogRecord, and always has been - not sure where you get your "facts" ;-).
s/attribute/property/
I don't know, without knowing your problem in more detail, if you could have avoided copying and changing code from LogRecord and Formatter. Obviously I've tried to provide enough hooks so that people can subclass and override methods for specific requirements, such as adding to a Trac ticket. If you describe the problem in more detail, I may be able to indicate a better solution. If it turns out that I need to provide more hooks where people can override functionality, I'm open to doing that.
It's in the standard library, modifications are not that useful. If I could pull an updated version from an external URL that adds these hooks, fine. But until then I can't use any of the modifications done in there because it has to be compatible with Python 2.4.
So your complaint is really just a philosophical diatribe against shared state?
It's not philosophical, it's practical.
Then you're really saying something roughly like "It's Not Invented Here, so I don't like it. The only way it could be better is if I had thought of it. Period."
I really don't know how to reply to that...
*cough* *cough*. I've already read that post, as I referred to it earlier in this thread. Since sys.modules is shared state at a much more fundamental level than logging's logger registry, why not focus your energies on getting that changed first? If you're successful at pulling it off, it'll no doubt lead to a whole slew of changes in the Python ecosystem, of which logging is just a tiny part.
I think you're missing something: I do not intend to change it. I complain about both logging and sys.modules because in my opinion those are broken by design. However there is also no way we could possibly change those, so I don't even try to think about solutions. Please acknowledge that.
There you go. Is Jinja2 "broken by design", then?
For many people it certainly is. Also there are design decisions I made in Jinja2 early on that are certainly broken, such as the scoping, and I try to fix those by adding deprecation warnings for edge cases. However, you cannot do that in the standard library, different rules apply there.
Also, I prefer Python to Ruby. Should I say "Ruby? Broken, by design"?
I will just ignore that.
What, sarcasm now? From "broken by design" to "paragon of virtue, exemplar to the world"? Please. I've been around the block a few times, no longer the new kid, and my skin is reasonably thick. I'm not crying. But keep it constructive (i.e. with improvement as a goal), that's all I'm asking.
That was not sarcastic.
My point was not about whether Graham's plan was or was not the best. It's that he wants to get *something* reasonable out there, knows the issues and is frustrated with the constant back-and-forth between conflicting opinions. We could be waiting forever for a resolution, what good is that to anybody?
What good is a broken specification? It's not about changing the specification, it's about fixing it. And there are a lot of things that have to be thought through. I love Graham's efforts in changing it, but just deciding on one of the proposals without thinking about the consequences it has is not very wise.
Regards, Armin
2009/9/18 Armin Ronacher <armin.ronacher@active-4.com>:
I think you're missing something: I do not intend to change it. I complain about both logging and sys.modules because in my opinion those are broken by design. However there is also no way we could possibly change those, so I don't even try to think about solutions. Please acknowledge that.
Please, if you're not looking for change, do not complain about the existing situation. At least, do not complain on this mailing list - this whole discussion is tedious (with the exception that I have been impressed with Vinay's patience and willingness to listen to your complaints and do something practical where possible) as well as being off-topic for this list. Can this thread end now, please? Paul.
Hi, Armin Ronacher wrote:
I think you're missing something: I do not intend to change it. I complain about both logging and sys.modules because in my opinion those are broken by design. However there is also no way we could possibly change those, so I don't even try to think about solutions. Please acknowledge that.
Just to explain myself a little bit: I got the impression you're complaining that I'm not filing bugs in the bug tracker about logging. Now you know the reason: what I want from logging is something that will never happen.
As far as I'm concerned if this discussion should continue, then probably around a table with some beer ;) Regards, Armin
On Wed, Sep 16, 2009 at 2:19 PM, Armin Ronacher <armin.ronacher@active-4.com> wrote:
I wonder if the solution to this problem wouldn't be a largely improved packaging system and some sort of standardized reviewing process for the standard library.
FYI The work we've started in PEP 376 and PEP 386 (and some elements not formalized in PEPs) is trying to improve the situation. Distutils provides one part of what would be a full packaging system, and this work tries to fill the gap:
- uninstall feature
- version comparison
- extension of metadata (to include dependencies)
The plan is to finish this work and provide a basis to build a package manager (which could be used by the stdlib as well if the packages and modules in there were at the same level as third party packages).
Regards
Tarek
-- Tarek Ziadé | http://ziade.org | オープンソースはすごい!
participants (19)
- Antoine Pitrou
- Armin Ronacher
- Barry Warsaw
- Ben Finney
- Brett Cannon
- C. Titus Brown
- Doug Hellmann
- Georg Brandl
- Jesse Noller
- John Szakmeister
- M.-A. Lemburg
- Michael Foord
- Paul Moore
- R. David Murray
- Raymond Hettinger
- Tarek Ziadé
- Thomas Heller
- Vinay Sajip
- Yuvgoog Greenle