Re: [Mailman-Developers] GSOC 2013 project discussion
In evaluating a proposal, we need to look at a number of factors:
First, will it work? -- Does the proposed design accomplish the stated objective? Next: Is it useful? And: Can the candidate be expected to accomplish the task within the allotted time frame? Finally: Is it the best use for our limited resources (funding, mentor time, etc.)?
If your presentation makes it easier to answer each of those questions in a positive manner, it will increase the likelihood that it will get funded.
On Apr 17, 2013, at 6:16 AM, Avik Pal <avikpal.me@gmail.com> wrote:
To identify an important message, a classifier will be implemented. Thanks for pointing out the issue with message delivery. If the message is delivered twice, the existing delivery implementation is sufficient. If we want to deliver it only once, then for each person we need to maintain a database of the mails/threads that are important to him (or vice versa) and check against that database while sending. This is going to raise some normalization issues, which will have to be handled by careful design.
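The deliver-once bookkeeping described above could be sketched roughly as follows. This is only an illustration, not Mailman code: the `InstantDeliveryLog` class, the `digest_for` function, and the in-memory dictionary standing in for the database are all hypothetical names and choices.

```python
class InstantDeliveryLog:
    """Remembers which messages each subscriber already received instantly.

    Hypothetical helper: an in-memory dict stands in for the database.
    """

    def __init__(self):
        self._sent = {}  # subscriber address -> set of Message-IDs

    def record(self, subscriber, message_id):
        # Call this whenever an important message is delivered instantly.
        self._sent.setdefault(subscriber, set()).add(message_id)

    def already_sent(self, subscriber, message_id):
        return message_id in self._sent.get(subscriber, set())


def digest_for(subscriber, message_ids, log):
    """Return the day's Message-IDs minus those already delivered instantly."""
    return [m for m in message_ids if not log.already_sent(subscriber, m)]


log = InstantDeliveryLog()
log.record("a@example.com", "<2@example.com>")
todays_messages = ["<1@example.com>", "<2@example.com>", "<3@example.com>"]
print(digest_for("a@example.com", todays_messages, log))
```

Keying the log by subscriber and Message-ID keeps the check at digest time to a single lookup per message; a real design would also need to expire old entries.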
Avik Pal Bengal Engineering & Science University,Shibpur github:https://github.com/avikpal IRC:- irc://freenode/avikp,isnick twitter:-https://twitter.com/avikpalme
On 17 April 2013 01:02, Richard Wackerbarth <rkw@dataplex.net> wrote:
An interesting suggestion -- A couple of things to consider:
How do you identify "important" messages?
Will you deliver these messages twice -- first as important and then, later, as part of the digest?
On Apr 16, 2013, at 2:13 PM, Avik Pal <avikpal.me@gmail.com> wrote:
Also, I would like to propose an idea of my own. Many of us set the preference in Mailman to get all the emails of a day batched together, but sometimes this means we miss important mails (we do get them at the end of the day, but we miss the moment). These may be mails important to the community or to my own interests, or a discussion of something I have also discussed in my previous mails. Delivering these mails instantly to the subscriber, so that he can join in at that very moment, may turn out to be a very useful feature. The subscriber would thus get to choose between two options:
1. Receive batched mails only.
2. Receive batched mails, with important mails delivered instantly.
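As a rough sketch of how the two options might be represented, assuming a hypothetical `DeliveryMode` enum and an `is_important` flag supplied by some classifier (neither is an existing Mailman name):

```python
from enum import Enum


class DeliveryMode(Enum):
    # Hypothetical names for the two proposed subscriber options.
    BATCHED_ONLY = 1
    BATCHED_PLUS_INSTANT = 2


def deliver_now(mode, is_important):
    """Decide whether a message should also go out immediately.

    Every message still ends up in the daily batch; this only answers
    whether an extra instant copy is sent.
    """
    return mode is DeliveryMode.BATCHED_PLUS_INSTANT and is_important


print(deliver_now(DeliveryMode.BATCHED_PLUS_INSTANT, True))
print(deliver_now(DeliveryMode.BATCHED_ONLY, True))
```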
Thanks a lot for the information. The thing is that I don't think the spam classifier by itself is going to be big enough, so I came up with this idea. I also need to know what the community wants regarding e-mail delivery. As for the classifier, I don't think it is going to be a problem at all from my end, given my previous experience in machine learning and NLP; we just need a database for the subscribers where the classifier data for them will be stored. But the most important thing is what you have pointed out: "Is it the best use for our limited resources (funding, mentor time, etc.)?" I am looking forward to hearing from Barry and Terri in this regard.
Meanwhile, it would be much appreciated if someone could direct me to a labeled dataset available online.
Also, somebody was talking about legal aspects in some countries, and about the fact that the classification should be done in the MTA only. Here I have a suggestion: whenever an email is classified as spam after submission, we store it in a separate archive, and at the end of the day we send the subscribers a mail saying "this is the digest for all the mails that Mailman thinks to be Spam". A subscriber may go there, view them, and mark them as not spam, which will help the learning algorithm adjust the decision boundary; precision and recall can also be computed from this feedback. Mails that are no longer classified as spam after the boundary is adjusted, or that have been marked as not spam by a majority (in simple words), will be incorporated back into the main archive and sent as part of the main digest. Emails that stay marked as spam will be dropped after a month.
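The quarantine lifecycle above might look something like the sketch below. It makes two assumptions that the proposal leaves open: "majority" is taken to mean more than half of the list's subscribers, and "a month" is taken to be 30 days. All names (`QuarantinedMessage`, `vote_not_spam`, `review`) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # assumption: "a month" means 30 days


@dataclass
class QuarantinedMessage:
    message_id: str
    quarantined_at: datetime
    not_spam_votes: set = field(default_factory=set)  # voter addresses


def vote_not_spam(msg, subscriber):
    # A set ensures each subscriber's vote counts only once.
    msg.not_spam_votes.add(subscriber)


def review(msg, subscriber_count, now):
    """Return 'release', 'drop', or 'hold' for a quarantined message.

    Assumption: a strict majority of all subscribers releases the message.
    """
    if len(msg.not_spam_votes) * 2 > subscriber_count:
        return "release"  # goes back to the main archive and digest
    if now - msg.quarantined_at >= RETENTION:
        return "drop"
    return "hold"


start = datetime(2013, 4, 17)
msg = QuarantinedMessage("<42@example.com>", start)
vote_not_spam(msg, "a@example.com")
print(review(msg, subscriber_count=5, now=start + timedelta(days=5)))
```

The same votes could also be fed back to the classifier as labeled examples, which is where the precision and recall figures would come from.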
Avik Pal Bengal Engineering & Science University,Shibpur github:https://github.com/avikpal IRC:- irc://freenode/avikp,isnick twitter:-https://twitter.com/avikpalme
participants (2)
- Avik Pal
- Richard Wackerbarth