[Chicago] Chicago - Web Crawlers

Philip Doctor Philip.S.Doctor at gmail.com
Fri Oct 12 23:28:50 CEST 2012


Hi Paul,
I'd strongly suggest looking into Beautiful Soup for scraping in
Python if you haven't already.  I've worked with it plenty of times, and it
beats the stuffing out of trying to parse HTML with regexes.

http://www.crummy.com/software/BeautifulSoup/
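
To give a rough idea of why it beats regexes, here is a minimal sketch. It
assumes the bs4 package (Beautiful Soup 4) is installed; the HTML snippet, the
"job" class name, and the extract_jobs helper are all made up for
illustration and are not from any real site.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment, invented for this example.
# Note the inconsistent quoting and attribute order on the second link;
# details like this are what tend to break hand-written regexes.
HTML = """
<html><body>
  <h1>Job listings</h1>
  <ul>
    <li><a class="job" href="/jobs/1">Python Developer</a></li>
    <li><a href='/jobs/2' class="job">Search Engineer</a></li>
    <li><a href="/about">About us</a></li>
  </ul>
</body></html>
"""


def extract_jobs(html):
    """Return (title, href) pairs for every <a> tag with class "job"."""
    # "html.parser" is the parser bundled with Python, so no extra
    # dependency (such as lxml) is needed.
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", class_="job")
    ]


jobs = extract_jobs(HTML)
for title, href in jobs:
    print(title, href)
```

The parser takes care of quoting and attribute order, so the selection logic
stays a single find_all call instead of a pattern that has to anticipate
every variation in the markup.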

Good luck.

-Phil



On Fri, Oct 12, 2012 at 4:25 PM, Paul Wallenberg <p.wallenberg at gmail.com> wrote:

> Hi ChiPy,
>
> I work for LaSalle Network and hosted what used to be the "best meeting
> ever" of ChiPy (until the following month). We were recently engaged on an
> initiative that involves building web crawlers and/or working with web
> scraping techniques to extract data from selected web sites.
>
> If you have had similar exposure, are well versed in Linux, and have
> worked with search engine technologies like Lucene or Solr, please let me
> know and advise if it would make sense for us to set up a time to chat.
>
> Thanks in advance for your time and interest.
>
> All my best,
>
> Paul
>
> Paul Wallenberg
> Project Manager - Technology Services
> LaSalle Network
> pwallenberg at lasallenetwork.com
> p. 312-413-1700
> d. 312-924-3683
> c. 847-738-3685
>
>
> _______________________________________________
> Chicago mailing list
> Chicago at python.org
> http://mail.python.org/mailman/listinfo/chicago
>
>