[MATRIX-SIG] Much ado about nothingness.
Hoon Yoon - IPT Quant
hyoon@nyptsrv1.etsd.ml.com
Tue, 8 Jul 1997 15:04:21 -0400
Aaron and MSIG:
I am not that concerned about the speed of the following code.
It runs in about 2 sec on my Ultra Sparc. I have a lot of loops in
there to check the data. This is fine for 20 or 30 stocks, but
any more would prove problematic.
What concerns me much more is that Null, missing, or None values
are foreign concepts to NumPy. An array type should not be complex.
And there should be a much easier way to pull items that match a
criterion and items that do NOT match it. A reverse of take would be
very nice indeed. I guess I am griping because I have to get a whole
new thought process going here, and it takes me 2 hours to do what I
could do in 20 minutes in Gauss. Still, I think there should be more
elegant ways to handle Null data.
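As a rough illustration of the "reverse of take" idea, here is a minimal sketch using plain Python lists rather than any Numeric call. The name take_complement and the sample prices are hypothetical and exist only for this example.

```python
def take_complement(values, drop_indices):
    """Return the items of `values` whose positions are NOT in `drop_indices`."""
    drop = set(drop_indices)
    return [v for i, v in enumerate(values) if i not in drop]

# Hypothetical sample data: five closing prices, two of them on holidays.
prices = [10.0, 10.5, 11.0, 10.8, 11.2]
holiday_positions = [1, 3]
trading_prices = take_complement(prices, holiday_positions)
```

Because the positions are collected into a set first, the order in which they are given does not matter, unlike repeated del calls on a list.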
About 80% of modeling is dealing with data. The hardest part is
the issue of missing data. I should not have to continuously use
loops to deal with this in a complex matrix. There should be a set
missing code that does not cause the matrix to default to complex.
Pretty much all other gripes about handling could be classed away
over time in an elegant manner; however, I cannot change the fact
that too many operators like equal, where, etc. neglect to deal with
Nulls. I should be able to write nonzero(equal(a, None)) to get the
indexes of the Nones. And most operations should be programmed with
missing data points in mind.
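One way to picture a "set missing code" is a NaN sentinel stored in an ordinary array of doubles. The sketch below uses only the Python standard library (the array and math modules), not Numeric; MISSING and missing_indexes are names assumed for this example.

```python
import math
from array import array

MISSING = float('nan')   # assumed sentinel value for a missing price

def missing_indexes(values):
    """Return the positions in `values` that hold the missing-value sentinel."""
    return [i for i, v in enumerate(values) if math.isnan(v)]

# The array keeps its plain double type even though two entries are missing.
closes = array('d', [101.5, MISSING, 102.0, MISSING, 103.25])
idx = missing_indexes(closes)
```

Note that NaN never compares equal to itself, so an equality test against the sentinel would find nothing; the check has to go through math.isnan.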
To credit Python, I must say that even with loops, the ideas are much
more readable than anything else I have ever written. Given how wonderful
Python has been so far, I am amazed that None/missing has not been addressed
in NumPy.
Thanks Aaron, I hope the following code gives you a general sense of
80% of my work. There are a few more places where I could have dropped
loops: the > .80 test could have used greater, the holidays could have
been chopped away in one shot, etc.
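For what "chopping the holidays away in one shot" might look like, here is a minimal sketch in plain Python. It assumes a mapping from each date to the list of stocks with no price that day, as the 'i_holi' entry appears to be; trading_dates, its parameters, and the sample data are hypothetical.

```python
def trading_dates(dates, holidays_by_date, n_stocks, threshold=0.8):
    """Drop every date on which at least `threshold` of the stocks are missing."""
    cutoff = n_stocks * threshold
    closed = {day for day, missing in holidays_by_date.items()
              if len(missing) >= cutoff}
    return [day for day in sorted(dates) if day not in closed]

# Hypothetical sample: 5 stocks, 4 of them missing on 19970102.
dates = ['19970104', '19970101', '19970102', '19970103']
holi = {'19970102': ['A', 'B', 'C', 'D'], '19970103': ['A']}
trd_d = trading_dates(dates, holi, n_stocks=5)
```

Filtering by value in a single pass avoids tracking list positions that shift after each deletion.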
Hoon,
p.s.: Despite my nagging, I won't trade Python for anything at this point.
I just want to see it improved to the point that I won't have to learn any
other language. I would like to see better business graphics, like GNUplot
tied to this, and better statistics (I am hoping to tie in something else if
necessary), and then this will be perfect for my purposes.
*****************************************************************
import shelve
from dateproc import *
from Numeric import *
d = shelve.open('p_v_rum')
dates = d['i_dates']             # get the dates that key this time series
dates.sort()
trd_d = dates[:] # This will have all holidays removed later
holi = d['i_holi'] # get all available holidays
holi_i = []
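for hday in holi.keys():         # getting rid of holidays
    # skip unless at least 80% of the stocks are missing on this day
    if len(holi[hday]) < (len(d.keys())*0.8): continue
    try: hsx = dates.index(hday) # if > 80%: see if it's available.
    except ValueError: continue
    else:
        holi_i.append(hsx)       # holiday indexes to be used on retrv.
        # remove by value: positions in trd_d shift after each deletion,
        # so an index taken from dates would hit the wrong entry
        trd_d.remove(hday)       # get rid of holidays, leaving only trading days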
tgt_date = chkdate('19970422')   # This is my target date
try: dsx = trd_d.index(tgt_date) # See if the date is one of the trading dates
except ValueError: dsx = add.reduce(less(trd_d, tgt_date)) # if not, go to next date
dte_fwd = 6                      # I need 5 dates of forward prices
dte_bwd = 5                      # I need backward as well
trd_date = trd_d[dsx-dte_bwd:dsx+dte_fwd] # get trading range
st_d = dates.index(trd_date[0])  # get date range, since that's how it's stored
ed_d = dates.index(trd_date[-1])
h_sx = nonzero(less_equal(holi_i,ed_d)*greater_equal(holi_i,st_d))
# find holiday indexes inside the range; I will use them to delete later
stocks = ['ITX','IBM','AAPL'] # my ticker list
result = []
for stk in stocks:
    iadj = dates.index(d[stk]['began'])   # time series began, so need to adj.
    close = d[stk]['close']               # get closing prices
    retrv = close[st_d-iadj:ed_d-iadj+1]  # get only the section interested in
    for hidx in h_sx:                     # get rid of all holidays in retrv
        del retrv[hidx]
    if retrv[0] == None:                  # if first one is None
        bk=0
        while st_d-iadj > iadj:           # go back in time to get last known P
            bk=bk+1
            if close[st_d-iadj-bk] != None:
                retrv[0] = close[st_d-iadj-bk]
                break
    while None in retrv:                  # if there are None values
        try: Nsx = retrv.index(None)
        except ValueError: break          # no more Nones, done!
        else: retrv[Nsx] = retrv[Nsx-1]   # back fill using old prices
    if len(retrv) == len(trd_date):
        result.append(retrv)
    else:
        print 'Error Wrong size!'
        break
result = array(result) # turn it into a Numeric array!
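
The None handling in the loop above (seed the first entry from an earlier price, then copy the previous price into every None) can be sketched as a single pass. This is only an illustration in present-day plain Python, not a Numeric feature; fill_forward and its seed argument are hypothetical names.

```python
def fill_forward(prices, seed=None):
    """Replace each None with the most recent known price.

    `seed` is the last known price from before the window; it is used
    only if the window itself starts with None.
    """
    filled = []
    last = seed
    for p in prices:
        if p is not None:
            last = p
        filled.append(last)
    return filled

# Hypothetical window that starts with a missing price.
retrv = [None, 20.5, None, None, 21.0]
filled = fill_forward(retrv, seed=20.0)
```

If no seed is available and the window starts with None, the leading entries stay None, so the caller can still detect them.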
_______________
MATRIX-SIG - SIG on Matrix Math for Python
send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________