[Tutor] Using Python to access .txt files stored behind a firewall as .exe files

Ian Monat ian.monat at gmail.com
Mon May 1 13:20:42 EDT 2017


I've got a Python project that I'd love some help on from a Python
developer who is well versed in web scraping or the requests module.

I work for a supplier, and we use a distributor to sell our products to
retailers. The distributor has a reporting website that requires a login.
From that home / login page, you land on a page with one link for each
state in which we do business (12 states / links in total).

From the 'state' page, you click a state link, and are taken to a page with
many data files for that state. The data files are neatly arranged .txt
files displayed as links, with logical naming conventions. The problem is,
when you click a link for a particular file, an .exe downloads to your
local machine.
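
To make the navigation step concrete, here is a minimal sketch of
collecting the file links from one state page. It assumes the page HTML has
already been fetched (for example through a logged-in requests.Session) and
that the files appear as ordinary <a href="..."> links. The names
LinkCollector and collect_links, and the example URL, are hypothetical and
not taken from the distributor's site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative hrefs
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    """Return all link targets found in `html`, resolved against `base_url`."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # Hypothetical page fragment, for illustration only.
    page = '<a href="ny/sales_2017_04.txt">sales_2017_04.txt</a>'
    print(collect_links(page, "https://example.com/reports/"))
```

If the links are built by JavaScript instead of being present in the HTML,
this approach would not see them, and a browser-driving tool would be
needed.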

Then you have to run the .exe, which produces a zipped file, and inside the
zipped file is the .txt, which is what I really want. There's no way the
distributor will change anything about how they store files on their
website for me.  I've written a script using the requests module but I
think a web scraper like Scrapy, Beautiful Soup or Selenium may be
required.
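
One possibility worth checking, sketched below, is that the .exe is a
self-extracting ZIP archive. That is an assumption, not something I have
confirmed about these files. If it holds, Python's zipfile module can read
the downloaded bytes directly, because it locates the archive's directory
from the end of the file and tolerates an executable stub in front, so the
.exe never has to be run. The function name extract_txt_members is
hypothetical.

```python
import io
import zipfile


def extract_txt_members(payload):
    """Return {member name: text} for every .txt file inside `payload`.

    `payload` is the raw bytes of the download. This only works if the
    .exe is really a self-extracting ZIP (an assumption); otherwise a
    ValueError is raised.
    """
    try:
        archive = zipfile.ZipFile(io.BytesIO(payload))
    except zipfile.BadZipFile as exc:
        raise ValueError("download is not a ZIP-based archive") from exc

    texts = {}
    with archive:
        for name in archive.namelist():
            if name.lower().endswith(".txt"):
                # The encoding is a guess; adjust it if the files differ.
                texts[name] = archive.read(name).decode("utf-8", errors="replace")
    return texts
```

With requests, the payload would be the .content attribute of the response
for a file link. If the ValueError is raised, the .exe is some other kind
of installer and this shortcut does not apply.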

What would you do? Thanks for your time. -Ian

