New imaplib implementation for Python 3.2+ standard library
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
I wrote a PEP-style readme file that describes all the details of why the library was written and how it works, which is available from my mercurial repository:
http://hg.mxcrypt.com/python/imaplib2/raw-file/tip/README
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
http://hg.mxcrypt.com/python/imaplib2/
Is there any interest in adding my code to a future version of the Python 3.x standard library?
- Max
Maxim Khitrov <max@mxcrypt.com> wrote:
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
I took a look at this. Nice work! Impressive list of extensions implemented.
The first thing to ask is, does it pass the (minimal) test suite in Lib/test/test_imaplib.py? Or is it too different?
The code seems to be not quite in compliance with PEP 8. If you're aiming at the stdlib you should probably fix that.
I tend to think one of the larger missing pieces of imaplib is the lack of a higher-level interface. I've used imaplib for a couple of projects, and I think that if I hadn't already had the experience of implementing my own IMAP server in Python, I wouldn't have done that; I'd have used a package like getmail instead. There's a lot of experience with IMAP necessary to use imaplib.
Looking at your example.py, that still seems to be the case. One thing you might consider is implementing a subclass of "mailbox.Mailbox" which provides that missing higher-level interface to IMAP4.
Bill
On Mon, Jul 25, 2011 at 12:46 PM, Bill Janssen <janssen@parc.com> wrote:
Maxim Khitrov <max@mxcrypt.com> wrote:
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
I took a look at this. Nice work! Impressive list of extensions implemented.
Thank you :)
The first thing to ask is, does it pass the (minimal) test suite in Lib/test/test_imaplib.py? Or is it too different?
Too different. Writing my own test suite is the next step. The library already went through extensive testing on Gmail and MS Exchange servers while I was developing my main project, but of course a good set of unit tests is needed for long-term maintenance.
The code seems to be not quite in compliance with PEP 8. If you're aiming at the stdlib you should probably fix that.
Will do.
I tend to think one of the larger missing pieces of imaplib is the lack of a higher-level interface. I've used imaplib for a couple of projects, and I think that if I hadn't already had the experience of implementing my own IMAP server in Python, I wouldn't have done that; I'd have used a package like getmail instead. There's a lot of experience with IMAP necessary to use imaplib.
True, but that's the case with all low-level interfaces. Having used imaplib for a while before deciding to write a replacement, I do think that building a higher-level interface using my library is significantly easier. I'm not certain if that interface should be part of the library or a separate module, similar to the relationship between zlib and gzip. I think the latter makes more sense, because any serious IMAP4 client will still need to descend to the protocol level.
Looking at your example.py, that still seems to be the case. One thing you might consider is implementing a subclass of "mailbox.Mailbox" which provides that missing higher-level interface to IMAP4.
I've never used the mailbox module before. It looks like it is aimed primarily at working with local mail stores, no? I'm not sure if that's the right interface to use over the network, but I understand your point.
I'll try to write another class on top of my library with a much simpler API for navigating through the mailbox hierarchy and working with messages. Will upload it to the repository when it's ready.
- Max
Maxim Khitrov <max@mxcrypt.com> wrote:
I tend to think one of the larger missing pieces of imaplib is the lack of a higher-level interface. I've used imaplib for a couple of projects, and I think that if I hadn't already had the experience of implementing my own IMAP server in Python, I wouldn't have done that; I'd have used a package like getmail instead. There's a lot of experience with IMAP necessary to use imaplib.
True, but that's the case with all low-level interfaces. Having used imaplib for a while before deciding to write a replacement, I do think that building a higher-level interface using my library is significantly easier.
I'm not certain if that interface should be part of the library or a separate module, similar to the relationship between zlib and gzip. I think the latter makes more sense, because any serious IMAP4 client will still need to descend to the protocol level.
There are lots of folks who just need to connect to an IMAP account and do something with their mail. Far fewer need to write a serious IMAP4 client. That's the big problem with the current imaplib; it just doesn't work for most of the use cases, because it's too low-level. So it doesn't get much use -- more's the pity.
Looking at your example.py, that still seems to be the case. One thing you might consider is implementing a subclass of "mailbox.Mailbox" which provides that missing higher-level interface to IMAP4.
I've never used the mailbox module before. It looks like it is aimed primarily at working with local mail stores, no? I'm not sure if that's the right interface to use over the network, but I understand your point.
I'll try to write another class on top of my library with a much simpler API for navigating through the mailbox hierarchy and working with messages. Will upload it to the repository when it's ready.
Cool! Looking forward to it. I'd still suggest making it a subclass of mailbox.Mailbox, btw. PEP 20: "There should be one -- and preferably only one -- obvious way to do it." Bill
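A minimal sketch of what a mailbox.Mailbox subclass over a remote store might look like. It is not taken from either library discussed in this thread: the class name ReadOnlyRemoteMailbox and the two callables list_keys and fetch_bytes are hypothetical, and a real IMAP backend would replace the dictionary used in the usage note. Only the read side is covered; add, remove, and __setitem__ keep the base class behaviour of raising NotImplementedError.

```python
import mailbox
from typing import Callable, Hashable, Iterable, Iterator


class ReadOnlyRemoteMailbox(mailbox.Mailbox):
    """Read-only Mailbox view over two caller-supplied callables (hypothetical)."""

    def __init__(self,
                 list_keys: Callable[[], Iterable[Hashable]],
                 fetch_bytes: Callable[[Hashable], bytes]) -> None:
        # Deliberately skip mailbox.Mailbox.__init__, which expects a local path.
        self._factory = None  # read by the inherited __getitem__
        self._list_keys = list_keys
        self._fetch_bytes = fetch_bytes

    def iterkeys(self) -> Iterator[Hashable]:
        return iter(self._list_keys())

    def __contains__(self, key: Hashable) -> bool:
        return key in set(self._list_keys())

    def __len__(self) -> int:
        return len(list(self._list_keys()))

    def get_bytes(self, key: Hashable) -> bytes:
        # fetch_bytes is expected to raise KeyError for unknown keys.
        return self._fetch_bytes(key)

    def get_message(self, key: Hashable) -> mailbox.Message:
        return mailbox.Message(self.get_bytes(key))

    # Nothing is cached or locked locally, so these are no-ops.
    def flush(self) -> None:
        pass

    def lock(self) -> None:
        pass

    def unlock(self) -> None:
        pass

    def close(self) -> None:
        pass
```

With a plain dictionary standing in for the server, `ReadOnlyRemoteMailbox(store.keys, store.__getitem__)` supports `len(box)`, `box.keys()`, `box[key]`, and iteration over messages through the inherited Mailbox methods.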
On 25 July 2011 02:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I wrote a PEP-style readme file that describes all the details of why the library was written and how it works, which is available from my mercurial repository:
http://hg.mxcrypt.com/python/imaplib2/raw-file/tip/README
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
http://hg.mxcrypt.com/python/imaplib2/
Is there any interest in adding my code to a future version of the Python 3.x standard library?
- Max
-- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html
On Mon, Jul 25, 2011 at 3:21 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 02:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I have it beat at the "Python 3 support is in the works" feature ;) Mine doesn't handle 2.x though.
In any case, I would not have been able to use IMAPClient for my project, because the requirements were for no dependencies outside of Python 3.2.
Do you know if the developers of IMAPClient considered getting it into the standard library? My goal wasn't just to have another IMAP implementation, but something better available as part of Python.
- Max
On Mon, Jul 25, 2011 at 14:38, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 3:21 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 02:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I have it beat at the "Python 3 support is in the works" feature ;) Mine doesn't handle 2.x though.
That's not something you'd have to worry about anyway - new features, modules, and packages would only be accepted in the next 3.x release (3.3).
On 7/25/2011 3:42 PM, Brian Curtin wrote:
That's not something you'd have to worry about anyway - new features, modules, and packages would only be accepted in the next 3.x release (3.3).
And should be written in Py 3 style. In particular, long sequences should generally be returned as iterables rather than as lists. If I understood the proposal, fetch does that, whereas old imaplib waited to create a finished list.
-- Terry Jan Reedy
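The difference between returning a finished list and returning an iterable can be shown with a small sketch. It does not use imaplib or the proposed library: fake_server, fetch_as_list, and fetch_as_iter are hypothetical names, and the generator merely stands in for responses arriving from a socket.

```python
from typing import Iterable, Iterator, List


def fake_server(count: int, log: List[int]) -> Iterator[bytes]:
    """Hypothetical stand-in for a connection; records each response it produces."""
    for number in range(1, count + 1):
        log.append(number)
        yield b"* %d FETCH (FLAGS ())" % number


def fetch_as_list(responses: Iterable[bytes]) -> List[bytes]:
    """Eager style: every response is consumed before the caller sees any."""
    return list(responses)


def fetch_as_iter(responses: Iterable[bytes]) -> Iterator[bytes]:
    """Lazy style: each response is handed over as soon as it is available."""
    for response in responses:
        yield response


# Eager: all 1000 responses are produced before the call returns.
eager_log: List[int] = []
eager = fetch_as_list(fake_server(1000, eager_log))

# Lazy: asking for the first response produces only that one.
lazy_log: List[int] = []
lazy = fetch_as_iter(fake_server(1000, lazy_log))
first = next(lazy)
```

After this runs, eager_log holds all 1000 entries while lazy_log holds only the first, which is the memory and latency difference being discussed.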
On Mon, Jul 25, 2011 at 4:01 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 7/25/2011 3:42 PM, Brian Curtin wrote:
That's not something you'd have to worry about anyway - new features, modules, and packages would only be accepted in the next 3.x release (3.3).
And should be written in Py 3 style. In particular, long sequences should generally be returned as iterables rather than as lists. If I understood the proposal, fetch does that, whereas old imaplib waited to create a finished list.
Correct. - Max
On 25 July 2011 20:38, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 3:21 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 02:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I have it beat at the "Python 3 support is in the works" feature ;) Mine doesn't handle 2.x though.
In any case, I would not have been able to use IMAPClient for my project, because the requirements were for no dependencies outside of Python 3.2.
Do you know if the developers of IMAPClient considered getting it into the standard library? My goal wasn't just to have another IMAP implementation, but something better available as part of Python.
I don't think Menno Smits would object to adding Python 3 support or adding IMAPClient to the standard library. His goal was to create something useful to overcome what he saw (and evidently you agree) as irreparable brokenness in parts of imaplib.
My point is that if there is an existing widely-used and battle-tested alternative, we would be wise to look at that first.
Michael
- Max
On Mon, Jul 25, 2011 at 4:27 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 20:38, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 3:21 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 02:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I have it beat at the "Python 3 support is in the works" feature ;) Mine doesn't handle 2.x though.
In any case, I would not have been able to use IMAPClient for my project, because the requirements were for no dependencies outside of Python 3.2.
Do you know if the developers of IMAPClient considered getting it into the standard library? My goal wasn't just to have another IMAP implementation, but something better available as part of Python.
I don't think Menno Smits would object to adding Python 3 support or adding IMAPClient to the standard library. His goal was to create something useful to overcome what he saw (and evidently you agree) as irreparable brokenness in parts of imaplib.
My point is that if there is an existing widely-used and battle-tested alternative, we would be wise to look at that first.
Agreed. However, I just took a look through IMAPClient source code and have to correct you on your original assertion. IMAPClient is not a replacement for imaplib, but a wrapper around it.
The author went down the same road that I originally started out on. He added a parser, a UTF-7 codec, and changed the overall interface to be more user-friendly. All good improvements, but at the core, IMAPClient relies entirely on imaplib, and thus inherits most of its design flaws.
With the exception of the first and fifth bullet points in my README file (server response parser & UTF-7 mailbox name codec, respectively), all others apply equally to IMAPClient. For that reason, I would recommend against incorporating it into the standard library.
My opinion on the matter is certainly biased, so I would welcome a review of both libraries by a neutral party familiar with the IMAP protocol, who could then make a recommendation.
- Max
On 25 July 2011 23:39, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 4:27 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 20:38, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 3:21 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 02:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I have it beat at the "Python 3 support is in the works" feature ;) Mine doesn't handle 2.x though.
In any case, I would not have been able to use IMAPClient for my project, because the requirements were for no dependencies outside of Python 3.2.
Do you know if the developers of IMAPClient considered getting it into the standard library? My goal wasn't just to have another IMAP implementation, but something better available as part of Python.
I don't think Menno Smits would object to adding Python 3 support or adding IMAPClient to the standard library. His goal was to create something useful to overcome what he saw (and evidently you agree) as irreparable brokenness in parts of imaplib.
My point is that if there is an existing widely-used and battle-tested alternative, we would be wise to look at that first.
Agreed. However, I just took a look through IMAPClient source code and have to correct you on your original assertion. IMAPClient is not a replacement for imaplib, but a wrapper around it.
The author went down the same road that I originally started out on. He added a parser, a UTF-7 codec, and changed the overall interface to be more user-friendly. All good improvements, but at the core, IMAPClient relies entirely on imaplib, and thus inherits most of its design flaws.
Well I'm sure Menno would be interested if you could actually demonstrate those limitations rather than merely assert them. :-)
(He implemented it to overcome problems with imaplib, and that is why people use it.)
Not that you're necessarily wrong, but I'm skeptical.
Michael
With the exception of the first and fifth bullet points in my README file (server response parser & UTF-7 mailbox name codec, respectively), all others apply equally to IMAPClient. For that reason, I would recommend against incorporating it into the standard library.
My opinion on the matter is certainly biased, so I would welcome a review of both libraries by a neutral party familiar with the IMAP protocol, who could then make a recommendation.
- Max
On Tue, Jul 26, 2011 at 10:42 AM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 23:39, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 4:27 PM, Michael Foord <fuzzyman@gmail.com> wrote:
On 25 July 2011 20:38, Maxim Khitrov <max@mxcrypt.com> wrote:
On Mon, Jul 25, 2011 at 3:21 PM, Michael Foord <fuzzyman@gmail.com> wrote:
There is an existing, well-tested, and widely used replacement for imaplib that I would suggest should be the first for consideration in replacing imaplib:
http://imapclient.freshfoo.com/
(Sorry.)
All the best,
Michael Foord
I have it beat at the "Python 3 support is in the works" feature ;) Mine doesn't handle 2.x though.
In any case, I would not have been able to use IMAPClient for my project, because the requirements were for no dependencies outside of Python 3.2.
Do you know if the developers of IMAPClient considered getting it into the standard library? My goal wasn't just to have another IMAP implementation, but something better available as part of Python.
I don't think Menno Smits would object to adding Python 3 support or adding IMAPClient to the standard library. His goal was to create something useful to overcome what he saw (and evidently you agree) as irreparable brokenness in parts of imaplib.
My point is that if there is an existing widely-used and battle-tested alternative, we would be wise to look at that first.
Agreed. However, I just took a look through IMAPClient source code and have to correct you on your original assertion. IMAPClient is not a replacement for imaplib, but a wrapper around it.
The author went down the same road that I originally started out on. He added a parser, a UTF-7 codec, and changed the overall interface to be more user-friendly. All good improvements, but at the core, IMAPClient relies entirely on imaplib, and thus inherits most of its design flaws.
Well I'm sure Menno would be interested if you could actually demonstrate those limitations rather than merely assert that. :-)
(He implemented it to overcome problems with imaplib, and that is why people use it.)
Not that you're necessarily wrong, but I'm skeptical.
Most of the limitations are outlined in my README file. I'm not sure what kind of a demonstration would convince you. The problem is that if your applications do not require things like data compression, asynchronous/parallel command execution, or multiple literals in commands, you will not agree that there is anything wrong with the existing tools. It's very much a matter of personal experience.
One of the things I'm trying to address with my library is strict adherence to the current version of the IMAP4 protocol. The other is performance; hence the implementation of extensions such as SASL-IR, IDLE, non-synchronizing literals, multiappend, and compression.
On the performance side, if you have an application that's trying to do some sort of processing of a 6 GB mailbox with 700,000 messages in it, executing a separate FETCH command for each message will take you a week to finish. If you try to be clever and FETCH 1000 messages at a time, for example, you'll quickly run into a few problems:
1. The data for all 1000 messages is buffered in memory before you ever get to see anything.
2. If the server fails to retrieve one of those messages, which actually happens rather frequently, at least on Gmail servers, you lose the data for the other 999. You'll get a 'NO' response from imaplib or an exception from IMAPClient, and an explanation of what went wrong instead of any data.
3. If you requested data for messages 1:1000 and the server decides to send you a FLAGS update for message 9999, imaplib will return that response as part of the data for the original request.
All are interface design problems, which are inherited by IMAPClient.
- Max
P.S. It is not my intention to discourage the use of IMAPClient in any way. Its existence is a good thing for 99% of the users, because it does address a number of key imaplib issues with just the response parser and a UTF-7 codec. My point is that there are real-world use cases out there that cannot be handled by imaplib or IMAPClient, and for those, I'm offering my library as a more general solution that should satisfy the remaining 1% :)
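The three batch-FETCH problems listed in this message can be sketched without a server. The code below is an illustration under stated assumptions, not the API of imaplib, IMAPClient, or the proposed library: the event tuples, FetchFailed, batches, and stream_fetch are all hypothetical, and routing by sequence-number range is a simplification (a real server may also send updates for messages inside the requested range).

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

# Hypothetical event shape: (kind, sequence number, payload).
Event = Tuple[str, int, bytes]


class FetchFailed(Exception):
    """Raised when the simulated command ends with a NO completion."""


def batches(first: int, last: int, size: int) -> Iterator[Tuple[int, int]]:
    """Split first..last into inclusive (start, stop) ranges of at most `size`."""
    for start in range(first, last + 1, size):
        yield start, min(start + size - 1, last)


def stream_fetch(start: int, stop: int, events: Iterable[Event],
                 on_unsolicited: Callable[[int, bytes], None]
                 ) -> Iterator[Tuple[int, bytes]]:
    """Yield requested messages one at a time; route other updates elsewhere."""
    for kind, seq, payload in events:
        if kind == "NO":
            # Items already yielded stay with the caller; only now do we fail.
            raise FetchFailed(payload.decode())
        if kind == "OK":
            return
        if start <= seq <= stop:
            yield seq, payload
        else:
            on_unsolicited(seq, payload)


def demo() -> Tuple[List[Tuple[int, bytes]], List[Tuple[int, bytes]], Optional[str]]:
    """Replay a canned exchange in which the command fails part-way through."""
    events: List[Event] = [
        ("FETCH", 1, b"one"),
        ("FETCH", 9999, b"(FLAGS (\\Seen))"),  # update for a message not requested
        ("FETCH", 2, b"two"),
        ("NO", 0, b"message 3 unavailable"),
    ]
    received: List[Tuple[int, bytes]] = []
    updates: List[Tuple[int, bytes]] = []
    error: Optional[str] = None
    try:
        for item in stream_fetch(1, 1000, events,
                                 lambda seq, data: updates.append((seq, data))):
            received.append(item)
    except FetchFailed as exc:
        error = str(exc)
    return received, updates, error
```

In demo(), the two messages delivered before the failure are kept, the update for message 9999 lands in a separate list, and the failure text is still reported.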
On 7/26/2011 12:54 PM, Maxim Khitrov wrote:
On Tue, Jul 26, 2011 at 10:42 AM, Michael Foord<fuzzyman@gmail.com> wrote:
Well I'm sure Menno would be interested if you could actually demonstrate those limitations rather than merely assert that. :-)
(He implemented it to overcome problems with imaplib, and that is why people use it.)
Not that you're necessarily wrong, but I'm skeptical.
Most of the limitations are outlined in my README file. I'm not sure what kind of a demonstration would convince you. The problem is that if your applications do not require things like data compression, asynchronous/parallel command execution, or multiple literals in commands, you will not agree that there is anything wrong with the existing tools. It's very much a matter of personal experience.
One of the things I'm trying to address with my library is strict adherence to the current version of the IMAP4 protocol. The other is performance; hence the implementation of extensions such as SASL-IR, IDLE, non-synchronizing literals, multiappend, and compression.
On the performance side, if you have an application that's trying to do some sort of processing of a 6 GB mailbox with 700,000 messages in it, executing a separate FETCH command for each message will take you a week to finish. If you try to be clever and FETCH 1000 messages at a time, for example, you'll quickly run into a few problems:
1. The data for all 1000 messages is buffered in memory before you ever get to see anything.
2. If the server fails to retrieve one of those messages, which actually happens rather frequently, at least on Gmail servers, you lose the data for the other 999. You'll get a 'NO' response from imaplib or an exception from IMAPClient, and an explanation of what went wrong instead of any data.
3. If you requested data for messages 1:1000 and the server decides to send you a FLAGS update for message 9999, imaplib will return that response as part of the data for the original request.
All are interface design problems, which are inherited by IMAPClient.
- Max
P.S. It is not my intention to discourage the use of IMAPClient in any way. Its existence is a good thing for 99% of the users, because it does address a number of key imaplib issues with just the response parser and a UTF-7 codec. My point is that there are real-world use cases out there that cannot be handled by imaplib or IMAPClient, and for those, I'm offering my library as a more general solution that should satisfy the remaining 1% :)
I think a proper iterator rather than batch mode fetch interface will benefit more than 1% of people. I also think RFC-based modules should be updated to complete-as-possible-and-sensible implementations of current RFCs that have replaced (in usage) previous RFCs. -- Terry Jan Reedy
(I'm replying via Google Groups because I just joined and don't have this thread in my email inbox. It's being a bit flaky so apologies if this comes through twice).
On Jul 26, 5:54 pm, Maxim Khitrov <m...@mxcrypt.com> wrote:
One of the things I'm trying to address with my library is strict adherence to the current version of the IMAP4 protocol. The other is performance; hence the implementation of extensions such as SASL-IR, IDLE, non-synchronizing literals, multiappend, and compression.
On the performance side, if you have an application that's trying to do some sort of processing of a 6 GB mailbox with 700,000 messages in it, executing a separate FETCH command for each message will take you a week to finish. If you try to be clever and FETCH 1000 messages at a time, for example, you'll quickly run into a few problems:
... All are interface design problems, which are inherited by IMAPClient.
- Max
P.S. It is not my intention to discourage the use of IMAPClient in any way. Its existence is a good thing for 99% of the users, because it does address a number of key imaplib issues with just the response parser and a UTF-7 codec. My point is that there are real-world use cases out there that cannot be handled by imaplib or IMAPClient, and for those, I'm offering my library as a more general solution that should satisfy the remaining 1% :)
As the maintainer of IMAPClient, I thought I'd weigh in.
I've had a quick look at imaplib2[1] and I must say it's a solid piece of work. The number of IMAP extensions that it covers is impressive, the iterative streaming of fetch responses is great and the way that concurrent and async commands are handled is quite elegant.
Max is correct about the limitations of imaplib (and therefore IMAPClient). As with many Python libraries (e.g. poplib, email, smtplib), data is loaded into memory without the option of streaming (say via generators) and imaplib is not designed with asynchronous handling in mind. This simplicity has benefits in terms of the implementation and usage of the API but it can also lead to problems. I have considered restructuring IMAPClient so that it no longer depends on imaplib and so free it from those inherent limitations. This may still happen.
imaplib2 makes good use of the features available in modern versions of Python. On the other hand, one of my goals with IMAPClient is to support a wide range of Python versions - many newer Python niceties are not used. Python 3 support for IMAPClient is definitely coming but a new addition to our family at home has slowed that effort down somewhat.
I have never considered pushing for IMAPClient to be included as part of the stdlib given that it is fairly easy to install 3rd party packages these days, but I wouldn't be against it. Obviously Python 3 support would have to come first.
I think imaplib2 is a very capable IMAP client library and the Python community could only benefit from having something like it in the standard library (on the proviso, as Brett mentions, that the Python community supports the library by using it widely).
Here's a few comments about imaplib2 from my own biased perspective:
It requires too much effort on behalf of the caller. Your example.py highlights how datetimes are returned as strings that need to be converted to real datetimes and FETCH response keys need to be uppercased to ensure consistency. The need to jump through the same non-trivial hoops each time I used imaplib was one of the frustrations that led to the creation of IMAPClient. Please consider having imaplib2 do a little more work so that every user doesn't have to.
Similarly, UID support could be better. IMAPClient has a boolean attribute which lets you select whether you want UIDs to be transparently used for future commands. Having to specify whether you want UID support enabled on each call is a little clumsy. It's unlikely that a user of imaplib2 would want to toggle between using UIDs and not on every call.
This has already been mentioned but imaplib2 won't get accepted into the stdlib if you don't conform to PEP 8. Those tabs have to go.
How much testing has imaplib2 seen against real IMAP implementations? Throughout IMAPClient's history its users have found many unexpected behaviours in various popular IMAP implementations. Those discoveries have led to updates to IMAPClient's code and tests (this is the "battle-tested" aspect that Michael refers to). On top of its unit tests, IMAPClient has a fairly extensive live test script that can be run (destructively) against a real IMAP account. I have test accounts with a number of different IMAP implementations which I regularly test IMAPClient against. A set of "live" tests is invaluable for testing new features and avoiding regressions between versions. It would be interesting to see what problems you find if you set up something similar for imaplib2.
Anyway, I wish you all the best with your project.
Regards, Menno
[1] - Are you aware there's already another project with the same name? http://www.janeelix.com/piers/python/imaplib2.html
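The "hoops" described in this message, converting INTERNALDATE strings and uppercasing FETCH keys, might look roughly like the sketch below. The helper names parse_internaldate and normalize_fetch are hypothetical and do not come from imaplib2 or IMAPClient; the input is assumed to be an already-parsed dict of string keys, and month-name parsing with %b assumes an English (C) locale.

```python
from datetime import datetime
from typing import Any, Dict


def parse_internaldate(value: str) -> datetime:
    """Convert an RFC 3501 date-time such as '17-Jul-1996 02:44:25 -0700'.

    The day may be space-padded (' 7-Jul-1996 ...'), hence the strip().
    """
    return datetime.strptime(value.strip(), "%d-%b-%Y %H:%M:%S %z")


def normalize_fetch(items: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with uppercased keys and INTERNALDATE as a datetime."""
    result = {key.upper(): value for key, value in items.items()}
    if isinstance(result.get("INTERNALDATE"), str):
        result["INTERNALDATE"] = parse_internaldate(result["INTERNALDATE"])
    return result
```

For example, `normalize_fetch({"flags": ("\\Seen",), "InternalDate": "17-Jul-1996 02:44:25 -0700"})` returns a dict keyed by "FLAGS" and "INTERNALDATE", with the latter as a timezone-aware datetime. Whether such a step belongs inside the low-level library or in a layer above it is exactly the design question under discussion.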
On Wed, Jul 27, 2011 at 6:34 PM, Menno Smits <menno@freshfoo.com> wrote:
(I'm replying via Google Groups because I just joined and don't have this thread in my email inbox. It's being a bit flaky so apologies if this comes through twice).
On Jul 26, 5:54 pm, Maxim Khitrov <m...@mxcrypt.com> wrote:
One of the things I'm trying to address with my library is strict adherence to the current version of the IMAP4 protocol. The other is performance; hence the implementation of extensions such as SASL-IR, IDLE, non-synchronizing literals, multiappend, and compression.
On the performance side, if you have an application that's trying to do some sort of processing of a 6 GB mailbox with 700,000 messages in it, executing a separate FETCH command for each message will take you a week to finish. If you try to be clever and FETCH 1000 messages at a time, for example, you'll quickly run into a few problems:
... All are interface design problems, which are inherited by IMAPClient.
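To make the batching idea above concrete (fetching 1000 messages at a time rather than one per command), here is a minimal sketch of how the sequence-set ranges might be generated. `sequence_batches` is a hypothetical helper written for illustration; it is not part of imaplib, imaplib2, or IMAPClient, and it does nothing about the problems elided above.

```python
def sequence_batches(total, size=1000):
    """Yield IMAP sequence-set strings ('1:1000', '1001:2000', ...) covering 1..total.

    Hypothetical helper: 'total' is the number of messages in the mailbox
    (for example, the EXISTS count) and 'size' is the batch size.
    """
    for start in range(1, total + 1, size):
        end = min(start + size - 1, total)
        yield "%d:%d" % (start, end)
```

Each yielded string could be passed as the message set of one FETCH command, so a 700,000-message mailbox would need 700 commands instead of 700,000.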
- Max
P.S. It is not my intention to discourage the use of IMAPClient in any way. Its existence is a good thing for 99% of the users, because it does address a number of key imaplib issues with just the response parser and a UTF-7 codec. My point is that there are real-world use cases out there that cannot be handled by imaplib or IMAPClient, and for those, I'm offering my library as a more general solution that should satisfy the remaining 1% :)
As the maintainer of IMAPClient, I thought I'd weigh in.
...
I think imaplib2 is a very capable IMAP client library and the Python community could only benefit from having something like it in the standard library (on the proviso, as Brett mentions, that the Python community supports the library by using it widely).
Thanks for the kind words :)
Here's a few comments about imaplib2 from my own biased perspective:
It requires too much effort on behalf of the caller. Your example.py highlights how datetimes are returned as strings that need to be converted to real datetimes and FETCH response keys need to be uppercased to ensure consistency. The need to jump through the same non-trivial hoops each time I used imaplib was one of the frustrations that led to the creation of IMAPClient. Please consider having imaplib2 do a little more work so that every user doesn't have to.
Part of this will be addressed by the higher-level interface that I'm currently working on. As for imaplib2, there are two reasons why I decided not to do any sort of automatic normalization of the responses (with the exception of CAPABILITY):
1. Performance. Not all responses (and parts of a response) are useful to the caller. There is no point in having the library perform response-specific normalization just to have the whole thing discarded as soon as it is returned. Originally, I even played with the idea of a lazy parser (i.e. parse the response only if some attribute or data item is accessed), but decided to go for a simpler implementation in the end.
2. Consistency, expectations, and bugs. The normalization processes may not do the Right Thing for every single response. Ultimately, only the caller knows for sure what content to expect from the server, especially if you are trying to implement some server-specific commands or a new extension. The library only knows the general syntax rules. If you start to assume that all returned responses are normalized, you could run into some unwelcome surprises when that normalization fails or even corrupts some data for a response type that wasn't recognized.
So basically, I think that in a low-level library such as this, it should be the caller's decision whether an INTERNALDATE value is converted to Unix time (or some other format), or if the FETCH response keys are changed to upper case. I'm happy to provide additional utility functions for such conversions, but trying to handle these things automatically could be a source of many additional bugs. Think about the separation between zlib and gzip, or binascii and base64 modules. My library is the low-level interface and I'm working on something that will be easier to use at the cost of some control.
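For illustration, opt-in utility functions of the kind described above might look like the following sketch. The names `parse_internaldate` and `upper_keys` are hypothetical and are not taken from imaplib2 or IMAPClient. The sketch assumes the caller already has the INTERNALDATE value as a plain string and the FETCH data items as a dict, and that month names are parsed under an English or C locale.

```python
from datetime import datetime


def parse_internaldate(value):
    """Convert an IMAP date-time string to a timezone-aware datetime.

    Assumes the RFC 3501 form 'dd-Mon-yyyy hh:mm:ss +zzzz'; the day may be
    space-padded, so surrounding whitespace is stripped first.
    """
    return datetime.strptime(value.strip(), "%d-%b-%Y %H:%M:%S %z")


def upper_keys(items):
    """Return a copy of a dict of FETCH data items with upper-cased keys."""
    return {key.upper(): val for key, val in items.items()}
```

Because the caller invokes these explicitly, nothing is converted (or silently corrupted) for responses the caller never looks at.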
Similarly, UID support could be better. IMAPClient has a boolean attribute which lets you select whether you want UIDs to be transparently used for future commands. Having to specify whether you want UID support enabled on each call is a little clumsy. It's unlikely that a user of imaplib2 would want to toggle between using UIDs and not on every call.
I have to disagree with you here. The application that I wrote this library for does depend on the ability to run UID and regular FETCH commands in the same connection. I was actually very surprised to see that IMAPClient requires you pick one or the other at creation time. In some applications you may need to discover and use the relationships between SNs and UIDs, or use a command like UID EXPUNGE (from UIDPLUS extension) and a regular EXPUNGE in the same session. I think that you do have to let the user make this decision on a per-command basis.
This has already been mentioned but imaplib2 won't get accepted into the stdlib if you don't conform to PEP 8. Those tabs have to go.
I know. I'll reformat everything once all the major coding is done.
How much testing has imaplib2 seen against real IMAP implementations? Throughout IMAPClient's history its users have found many unexpected behaviours in various popular IMAP implementations. Those discoveries have led to updates to IMAPClient's code and tests (this is the "battle-tested" aspect that Michael refers to). On top of its unit tests, IMAPClient has a fairly extensive live test script that can be run (destructively) against a real IMAP account. I have test accounts with a number of different IMAP implementations which I regularly test IMAPClient against. A set of "live" tests is invaluable for testing new features and avoiding regressions between versions. It would be interesting to see what problems you find if you set up something similar for imaplib2.
I've tested against Gmail servers, Microsoft Exchange 2007, and ran simulated tests based on example sessions in various RFCs and other sources. I also wrote a "shell" script that connects to an IMAP server and goes into interactive mode, allowing me to run IMAP or Python commands exactly as you would in an interactive Python session. I'll try to upload it to the repository in the next few days. My library does need more testing. Although I tried to follow the robustness principle (be conservative in what you send; be liberal in what you accept) when writing the command generator and response parser, there probably are some bugs remaining, but hopefully not many. Which IMAP servers do you test against and how did you go about getting the test accounts?
[1] - Are you aware there's already another project with the same name? http://www.janeelix.com/piers/python/imaplib2.html
Hmm... I probably should have tried searching before using that name. I'm happy to go with something else, since my library is not in widespread use right now. Would suggesting imaplib3 for stdlib be a bit confusing? :/ That looks like another improvement of imaplib, which uses threads to achieve some asynchronous execution. Can't say that I like the approach, but I do admire the effort. They even got compression working, but that was at the cost of having to implement readline in Python rather than relying on the BufferedReader. That was one of the bigger challenges for me as well, but I opted to write my own SocketIO class for this. - Max
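As a rough illustration of why compression complicates line reading: with COMPRESS=DEFLATE (RFC 4978) the bytes on the wire are a raw DEFLATE stream, so line boundaries only become visible after decompression and have to be tracked in a separate buffer. The sketch below is not how either library implements it; `DeflateLineBuffer` is a hypothetical name and the class deliberately ignores sockets, literals, and error handling.

```python
import zlib


class DeflateLineBuffer:
    """Hypothetical helper: inflate a raw DEFLATE stream and split it into CRLF lines."""

    def __init__(self):
        # Negative wbits selects raw DEFLATE (no zlib header or checksum).
        self._inflate = zlib.decompressobj(-zlib.MAX_WBITS)
        self._buf = bytearray()

    def feed(self, compressed):
        """Add compressed bytes as they arrive from the socket."""
        self._buf += self._inflate.decompress(compressed)

    def readline(self):
        """Return the next complete line (including CRLF), or None if incomplete."""
        end = self._buf.find(b"\r\n")
        if end < 0:
            return None
        line = bytes(self._buf[:end + 2])
        del self._buf[:end + 2]
        return line
```

A chunk of compressed input may contain part of a line, exactly one line, or several lines, which is why the decompressed data is buffered and `readline` returns None until a full line is available.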
On 28/07/11 02:35, Maxim Khitrov wrote:
It requires too much effort on behalf of the caller. Your example.py highlights how datetimes are returned as strings that need to be converted to real datetimes and FETCH response keys need to be uppercased to ensure consistency. The need to jump through the same non-trivial hoops each time I used imaplib was one of the frustrations that led to the creation of IMAPClient. Please consider having imaplib2 do a little more work so that every user doesn't have to.
Part of this will be addressed by the higher-level interface that I'm currently working on. As for imaplib2, there are two reasons why I decided not to do any sort of automatic normalization of the responses (with the exception of CAPABILITY): ... So basically, I think that in a low-level library such as this, it should be the caller's decision whether an INTERNALDATE value is converted to Unix time (or some other format), or if the FETCH response keys are changed to upper case. I'm happy to provide additional utility functions for such conversions, but trying to handle these things automatically could be a source of many additional bugs. Think about the separation between zlib and gzip, or binascii and base64 modules. My library is the low-level interface and I'm working on something that will be easier to use at the cost of some control.
Fair enough. If you're planning a higher-level interface and helper functions that means less repeated work for each user of imaplib2 then that's great.
Similarly, UID support could be better. IMAPClient has a boolean attribute which lets you select whether you want UIDs to be transparently used for future commands. Having to specify whether you want UID support enabled on each call is a little clumsy. It's unlikely that a user of imaplib2 would want to toggle between using UIDs and not on every call.
I have to disagree with you here. The application that I wrote this library for does depend on the ability to run UID and regular FETCH commands in the same connection. I was actually very surprised to see that IMAPClient requires you pick one or the other at creation time.
That's not quite right. UID selection can be set at creation time but also be changed at any point by using the use_uid attribute.
In some applications you may need to discover and use the relationships between SNs and UIDs, or use a command like UID EXPUNGE (from UIDPLUS extension) and a regular EXPUNGE in the same session. I think that you do have to let the user make this decision on a per-command basis.
I think that having to pass the flag on each call is a little awkward but that's a minor issue really. Maybe you could allow the user to specify a default value to use if it's not specified for a given command?
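The suggestion above (a default that applies when the flag is not given for a specific command) could be sketched roughly as follows. `CommandBuilder` and its `command` method are hypothetical and only build command strings; the attribute name `use_uid` is borrowed from the IMAPClient attribute mentioned earlier, and nothing here reflects the actual imaplib2 API.

```python
class CommandBuilder:
    """Hypothetical sketch: per-connection UID default with a per-call override."""

    def __init__(self, use_uid=False):
        # Default used whenever a call does not say which mode it wants.
        self.use_uid = use_uid

    def command(self, name, args, uid=None):
        """Build a command string; uid=None means 'fall back to the default'."""
        if uid is None:
            uid = self.use_uid
        prefix = "UID " if uid else ""
        return "%s%s %s" % (prefix, name, args)
```

With this shape, an application that mixes UID and sequence-number commands in one session can still override the default on any individual call, while simpler callers never pass the flag at all.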
My library does need more testing. Although I tried to follow the robustness principle (be conservative in what you send; be liberal in what you accept) when writing the command generator and response parser, there probably are some bugs remaining, but hopefully not many.
Which IMAP servers do you test against and how did you go about getting the test accounts?
I regularly test against: Gmail, Fastmail.fm (a Cyrus variant), vanilla Cyrus, Dovecot, Courier and MS Exchange. Gmail and Fastmail have free accounts and I run the other test servers myself except for the Exchange server which is at my employer.
[1] - Are you aware there's already another project with the same name? http://www.janeelix.com/piers/python/imaplib2.html
Hmm... I probably should have tried searching before using that name. I'm happy to go with something else, since my library is not in widespread use right now. Would suggesting imaplib3 for stdlib be a bit confusing? :/
Possibly! I'm not sure of a better name though. IMAPClient was the best I could come up with and that conflicts with a Perl package of the same name and functionality :) All the best, Menno
On Sun, Jul 24, 2011 at 18:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
I wrote a PEP-style readme file that describes all the details of why the library was written and how it works, which is available from my mercurial repository:
http://hg.mxcrypt.com/python/imaplib2/raw-file/tip/README
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
http://hg.mxcrypt.com/python/imaplib2/
Is there any interest in adding my code to a future version of Python 3.x standard library?
Since no one has pointed you to it, there is a doc explaining what it takes to get a new module added to the stdlib: http://docs.python.org/devguide/stdlibchanges.html#adding-a-new-module .
On Tue, Jul 26, 2011 at 10:09 PM, Brett Cannon <brett@python.org> wrote:
On Sun, Jul 24, 2011 at 18:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
I wrote a PEP-style readme file that describes all the details of why the library was written and how it works, which is available from my mercurial repository:
http://hg.mxcrypt.com/python/imaplib2/raw-file/tip/README
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
http://hg.mxcrypt.com/python/imaplib2/
Is there any interest in adding my code to a future version of Python 3.x standard library?
Since no one has pointed you to it, there is a doc explaining what it takes to get a new module added to the stdlib: http://docs.python.org/devguide/stdlibchanges.html#adding-a-new-module .
Thanks, I actually found and read that page before making my initial announcement. For now, I'm making the module available under the simplified BSD license to anyone who wishes to use it. If it has gained some traction over the next year, I'll be happy to submit it to the PSF under Apache License 2.0. Is there anything else I should do in the meantime? How do I go about moving the development into Python's infrastructure, or does that happen later? Over the next week or so, I'll finish writing the higher-level interface, which actually did end up inheriting from Mailbox, but with some limitations that I'll explain later. After that, I'll put together a test suite and update the documentation. The last remaining bit will be to implement some additional extensions and authentication methods. Don't have anything else planned beyond that. - Max
On Wed, Jul 27, 2011 at 04:48, Maxim Khitrov <max@mxcrypt.com> wrote:
On Tue, Jul 26, 2011 at 10:09 PM, Brett Cannon <brett@python.org> wrote:
On Sun, Jul 24, 2011 at 18:06, Maxim Khitrov <max@mxcrypt.com> wrote:
Hello python-ideas,
My most recent project led me down a path that eventually ended up at a new implementation of imaplib based on [RFC-3501]. Although I started the project by gradually adding functionality to the existing IMAP4 library, some of the features that I required simply could not be merged in (without breaking everything). As a result, I wrote my own version of the library, which incorporates all existing functionality of imaplib and includes many of my own improvements.
I wrote a PEP-style readme file that describes all the details of why the library was written and how it works, which is available from my mercurial repository:
http://hg.mxcrypt.com/python/imaplib2/raw-file/tip/README
The same repository also contains the library code and an example script that you can run if you have access to an IMAP4 server:
http://hg.mxcrypt.com/python/imaplib2/
Is there any interest in adding my code to a future version of Python 3.x standard library?
Since no one has pointed you to it, there is a doc explaining what it takes to get a new module added to the stdlib: http://docs.python.org/devguide/stdlibchanges.html#adding-a-new-module .
Thanks, I actually found and read that page before making my initial announcement. For now, I'm making the module available under the simplified BSD license to anyone who wishes to use it. If it has gained some traction over the next year, I'll be happy to submit it to the PSF under Apache License 2.0. Is there anything else I should do in the meantime? How do I go about moving the development into Python's infrastructure, or does that happen later?
Later; the module first has to get enough traction for python-dev to even consider adding it. Basically the community needs to have decided as a whole that your module is the best solution for the job and that its API is stable and done evolving. Hope this doesn't sound too negative, but we just have to be very cautious about what goes into Python's stdlib. -Brett
Over the next week or so, I'll finish writing the higher-level interface, which actually did end up inheriting from Mailbox, but with some limitations that I'll explain later. After that, I'll put together a test suite and update the documentation. The last remaining bit will be to implement some additional extensions and authentication methods. Don't have anything else planned beyond that.
- Max
participants (7)
- Bill Janssen
- Brett Cannon
- Brian Curtin
- Maxim Khitrov
- Menno Smits
- Michael Foord
- Terry Reedy