[BangPypers] Idea

Anand Balachandran Pillai abpillai at gmail.com
Fri Mar 7 17:28:11 CET 2008


I actually went ahead and did this today. I registered a new blog
at http://pythonjobs.blogspot.com . It took me roughly 3 hours to
write a custom crawler using HarvestMan that crawls the monthly archives
of bangpypers and posts jobs automatically to Blogger. It uses
the Google Blogger API from the gdata-python-client library.

http://code.google.com/p/gdata-python-client/

If anyone wants to see the code of the custom crawler,
it is available in the HarvestMan-2.0 trunk.

http://svn.eiao.net/robacc/experimental/HarvestMan-2.0/harvestman/apps/postingcrawler.py

I wrote a custom Blogger module using sample code from the Google Blogger
API. Since it contains Google's code, I have not checked it into the
Subversion trunk. If anyone wants the code, let me know.

To make sure your jobs appear on the blog, just put [JOB] in the title
of your job posts. That is all the crawler looks for.
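
For anyone curious, the title check can be sketched roughly as below.
This is only an illustration and not the code in postingcrawler.py: the
names JOB_TAG, is_job_post and filter_job_posts are made up here, and
matching the tag case-insensitively is my assumption.

```python
import re

# Hypothetical pattern: the literal tag "[JOB]" anywhere in the subject.
# Case-insensitive matching is an assumption, not confirmed crawler behaviour.
JOB_TAG = re.compile(r"\[JOB\]", re.IGNORECASE)


def is_job_post(subject):
    """Return True if the subject line carries the [JOB] tag."""
    return bool(JOB_TAG.search(subject))


def filter_job_posts(subjects):
    """Keep only the subjects that look like job postings."""
    return [s for s in subjects if is_job_post(s)]


if __name__ == "__main__":
    sample = [
        "[BangPypers] [JOB] Python developer needed",
        "[BangPypers] Idea",
    ]
    for subject in filter_job_posts(sample):
        print(subject)
```

A plain substring or regex test like this is cheap and predictable, but it
does mean a job post without the tag will simply be skipped.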

Regards,
--Anand


On Fri, Mar 7, 2008 at 6:32 PM, Anand Balachandran Pillai
<abpillai at gmail.com> wrote:
> On Fri, Mar 7, 2008 at 6:30 PM, Anand Balachandran Pillai
>  <abpillai at gmail.com> wrote:
>  >
>  > On Fri, Mar 7, 2008 at 6:05 PM, Harish Krishnan <bugsy.seigel at gmail.com> wrote:
>  >  >
>  >  >
>  >  > On 07-Mar-08, at 4:57 PM, Anand Balachandran Pillai wrote:
>  >  >
>  >  >
>  >  >  1. Automate blog posting backend when a mail which seems to mention a new
>  >  >  job posting is posted. This can be done by requiring specific keyword(s)
>  >  >  in the subject for job postings such as [JOB]. I am not sure, but Mailman
>  >  >  might allow such customizations in the backend.
>  >  >
>  >  > Sounds like a nice idea. It would also be good to have a policy of not
>  >  > posting jobs directly on the mailing list, or else it will lead to duplication.
>  >  >
>  >  >
>  >  >
>  >  >  2. An incremental crawler (always!) which monitors the group for postings
>  >  >  and automatically fetches JOB postings (similar approach, use keywords or
>  >  >  naive Bayesian classification!) and posts them to a specific blog.
>  >  >
>  >  >
>  >  >
>  >  > This is even better. What does it take for this to work?
>  >  >
>  >
>  >  Nothing much. Just give me half a day to create a custom crawler for this
>  >  on top of HarvestMan :)
>  Ok, this is not posturing :) If someone can register an appropriate blog and
>  send me the URL and the auth credentials, I will create the "job
>  posting crawler".
>  The only catch is that someone has to take responsibility for running it
>  on a frequent basis.
>
>  gnuyoga, can you do this? It would be a nice exercise to write a custom
>  crawler for this...
>
> >
>  >  > Harish
>  >
>  >
>  > >
>  >  >
>  >  >
>  >  >
>  >
>  >
>  >
>  >  --
>  >  -Anand
>  >
>
>  Thanks
>
>  --
>  -Anand
>



-- 
-Anand

