Question About Running Python code

ryguy7272 ryanshuell at gmail.com
Thu Oct 16 00:50:23 CEST 2014


I'm trying to run this script (using IDLE 3.4):

#!/usr/bin/env python

import urllib2
import pytz
import pandas as pd

from bs4 import BeautifulSoup
from datetime import datetime
from pandas.io.data import DataReader


SITE = "http://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
START = datetime(1900, 1, 1, 0, 0, 0, 0, pytz.utc)
END = datetime.today().utcnow()


def scrape_list(site):
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.urlopen(req)
    soup = BeautifulSoup(page)

    table = soup.find('table', {'class': 'wikitable sortable'})
    sector_tickers = dict()
    for row in table.findAll('tr'):
        col = row.findAll('td')
        if len(col) > 0:
            sector = str(col[3].string.strip()).lower().replace(' ', '_')
            ticker = str(col[0].string.strip())
            if sector not in sector_tickers:
                sector_tickers[sector] = list()
            sector_tickers[sector].append(ticker)
    return sector_tickers


def download_ohlc(sector_tickers, start, end):
    sector_ohlc = {}
    for sector, tickers in sector_tickers.iteritems():
        print 'Downloading data from Yahoo for %s sector' % sector
        data = DataReader(tickers, 'yahoo', start, end)
        for item in ['Open', 'High', 'Low']:
            data[item] = data[item] * data['Adj Close'] / data['Close']
        data.rename(items={'Open': 'open', 'High': 'high', 'Low': 'low',
                           'Adj Close': 'close', 'Volume': 'volume'},
                    inplace=True)
        data.drop(['Close'], inplace=True)
        sector_ohlc[sector] = data
    print 'Finished downloading data'
    return sector_ohlc


def store_HDF5(sector_ohlc, path):
    with pd.get_store(path) as store:
        for sector, ohlc in sector_ohlc.iteritems():
            store[sector] = ohlc


def get_snp500():
    sector_tickers = scrape_list(SITE)
    sector_ohlc = download_ohlc(sector_tickers, START, END)
    store_HDF5(sector_ohlc, 'snp500.h5')


if __name__ == '__main__':
    get_snp500()


I got it from this link.
http://www.thealgoengineer.com/2014/download_sp500_data/

I'm just about going insane here.  I've been doing all kinds of computer programming for 11 years, and I know 10 languages.  I'm trying to learn Python now, but this makes no sense to me.  

I would be most appreciative if someone could respond to a few questions.

The error that I get is this.
'invalid syntax'

The second single quote in this line is highlighted pink.
print 'Downloading data from Yahoo for %s sector' % sector

#1)  It seems very bizarre to mix single quotes and double quotes in a single language.  Does Python actually mix single quotes and double quotes?

#2)  In the Python 3.4 Shell window, I turn the debugger on by clicking 'Debugger'.  Then I run the file I just created; it's called 'stock.py'.  I get the error immediately, and I can't debug anything so I can't tell what's going on.  This is very frustrating.  All the controls in the debugger are greyed out.  What's up with the debugger?

#3)  My final question is this: how do I get this code running?  It seems like there is a problem with a single quote, which is just silly.  I can't get the debugger working, so I can't tell what's going on.  The only thing I know, or think I know, is that the proper libraries seem to be installed, so that's one thing that's working.


I'd really appreciate it if someone could please answer my questions and help me get this straightened out, so I can have some fun with Python.  So far, I've spent 2 months reading 4 books, and trying all kinds of sample code...and almost every single thing fails and I have no idea why.


