[MATRIX-SIG] Much ado about nothingness.

Hoon Yoon - IPT Quant hyoon@nyptsrv1.etsd.ml.com
Tue, 8 Jul 1997 15:04:21 -0400


Aaron and MSIG:


  I am not that concerned about the speed of the following code.
It runs in about 2 sec on my Ultra Sparc. I have a lot of loops in
there to check the data. This is fine for 20 or 30 stocks, but
any more would prove problematic.
   What concerns me much more is that Null, missing, or None values
are foreign concepts to NumPy. An array type should not have to be
complex. And there should be a much easier way to pull matching and
NOT-matching criteria. A reverse of take would be very nice indeed.
I guess I am griping because I have to get a whole new thought process
going here, and what I could do in 20 minutes in Gauss takes 2 hours.
Still, I think there should be a more elegant way to handle Null data.
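
To make the "reverse of take" wish concrete, here is a minimal sketch in present-day plain Python rather than Numeric. The helper names `take` and `take_complement` and the sample prices are hypothetical and exist only for this illustration.

```python
def take(values, indexes):
    """Return the items of values at the given positions."""
    return [values[i] for i in indexes]


def take_complement(values, indexes):
    """Return the items of values whose positions are NOT in indexes."""
    skip = set(indexes)
    return [v for i, v in enumerate(values) if i not in skip]


prices = [10.0, 10.5, 11.0, 10.8, 11.2]   # made-up sample data
holiday_positions = [1, 3]

print(take(prices, holiday_positions))             # [10.5, 10.8]
print(take_complement(prices, holiday_positions))  # [10.0, 11.0, 11.2]
```

Building a new list instead of deleting in place avoids the index shifting that repeated `del` calls cause.
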
   About 80% of modeling is dealing with data. The hardest part is
missing data. I should not have to continuously use loops to deal
with it in a complex matrix. There should be a set missing-value
code that does not cause the matrix to default to complex. Pretty
much all my other gripes about handling could be classed away over time
in an elegant manner; however, I cannot change the fact that too many
operators like equal, where, etc. neglect to deal with Nulls.
I should be able to write nonzero(equal(a, None)) to get the indexes
of the Nones. And most operations should be programmed with missing
data points in mind.
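
The lookup wished for above can be approximated on an ordinary list. This is a minimal sketch in present-day plain Python, not a Numeric feature; the function name `missing_indexes` and the sample closing prices are assumptions made for the example.

```python
def missing_indexes(values, missing=None):
    """Return the positions in values that hold the missing marker."""
    return [i for i, v in enumerate(values) if v is missing]


close = [101.5, None, 102.0, None, None, 103.25]   # made-up sample data

print(missing_indexes(close))   # [1, 3, 4]
```

The identity test (`is`) is used so that a price of 0 or 0.0 is never mistaken for a missing value.
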
  To Python's credit, I must say that even with loops, the ideas are much
more readable than anything else I have ever written. Given how wonderful
Python has been so far, I am amazed that None/missing has not been addressed
in NumPy.
  Thanks Aaron, I hope the following code generally gives you a sense of
80% of my work. There are a few more places where I could have dropped
loops: the > .80 test could have used greater, the holidays could have
been chopped away in one shot, etc.
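
As one possible reading of chopping the holidays away in one shot, the sketch below filters the date list without any in-place deletion. It is plain present-day Python under stated assumptions: dates are taken to be comparable values such as YYYYMMDD integers, and the function name `split_trading_days` and the sample holiday are invented for the illustration.

```python
def split_trading_days(dates, holidays):
    """Return (trading_days, holiday_positions) without deleting in place."""
    holiday_set = set(holidays)
    trading_days = [day for day in dates if day not in holiday_set]
    holiday_positions = [i for i, day in enumerate(dates) if day in holiday_set]
    return trading_days, holiday_positions


dates = [19970417, 19970418, 19970421, 19970422, 19970423]
holidays = [19970418]   # made-up holiday, for illustration only

trading_days, holiday_positions = split_trading_days(dates, holidays)
print(trading_days)        # [19970417, 19970421, 19970422, 19970423]
print(holiday_positions)   # [1]
```
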


Hoon,

p.s.: Despite my nagging, I won't trade Python for anything at this point.
I just wanna see it improved to the point that I won't have to learn any
other language. I would like to see better business graphics, like GNUplot
tied to this, and better statistics (I am hoping to tie in something else if
necessary), and then this will be perfect for my purposes.
*****************************************************************
import shelve
from   dateproc import *
from   Numeric import *

d        = shelve.open('p_v_rum')
dates    = d['i_dates']    # get dates that keys this time series
dates.sort()

trd_d = dates[:]           # This will have all holidays removed later
holi = d['i_holi']         # get all available holidays

holi_i = []
for hday in holi.keys():   # getting rid of holidays
	if len(holi[hday]) < (len(d.keys())*0.8): continue # At least 80% of stocks missing
	try:   hsx = dates.index(hday)   # if > 80%: see if it's available.
	except ValueError: continue
	else:
		holi_i.append(hsx)           # holiday indexes to be used on retrv.
		trd_d.remove(hday)           # remove by value: trd_d shrinks, so hsx would drift

tgt_date = chkdate('19970422')       # This is my target date
try:   dsx = trd_d.index(tgt_date)   # See if the date is one of the indexed trading dates
except ValueError: dsx = add.reduce(less(trd_d, tgt_date)) # if not, go to the next date

dte_fwd  = 6    # I need 5 days of forward prices (slice end is exclusive)
dte_bwd  = 5    # I need 5 days backward as well

trd_date = trd_d[dsx-dte_bwd:dsx+dte_fwd] # get Trading Range

st_d = dates.index(trd_date[0])      # get date range, since that's how it's stored
ed_d = dates.index(trd_date[-1])

h_sx = nonzero(less_equal(holi_i,ed_d)*greater_equal(holi_i,st_d))
# positions in holi_i of the holidays inside the range; used to del later


stocks = ['ITX','IBM','AAPL']   # my ticker list
result = []
for stk in stocks:
	iadj = dates.index(d[stk]['began'])    # Time Series began, so need to adj.
	close = d[stk]['close']                # get closing prices
	retrv = close[st_d-iadj:ed_d-iadj+1]   # get only section interested in

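	hoff = []                              # holiday offsets within retrv
	for hidx in h_sx: hoff.append(holi_i[hidx]-st_d)
	hoff.sort(); hoff.reverse()            # delete from the end so offsets stay valid
	for hidx in hoff:                      # get rid of all holidays in retrv
		del retrv[hidx]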

	if retrv[0] is None:                   # if first one is None
		bk=0
		while st_d-iadj-bk > 0:            # go back in time to get last known P
			bk=bk+1
			if close[st_d-iadj-bk] is not None:
				retrv[0] = close[st_d-iadj-bk]
				break

	while None in retrv:                   # if there are None values
		try:   Nsx = retrv.index(None)  
		except ValueError:	break          # no more None's done!
		else:  retrv[Nsx] = retrv[Nsx-1]   # back fill using old prices

	if len(retrv) == len(trd_date):
		result.append(retrv)
	else:
		print 'Error Wrong size!'
		break

result = array(result)  # turn it into a Numeric array!
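
The None handling in the script carries the last known price forward. A single-pass version of that idea might look like the sketch below, written in present-day plain Python. The function name `fill_forward`, its `seed` parameter (standing in for the last known price before the window), and the sample data are assumptions for illustration, not part of the script above.

```python
def fill_forward(values, seed=None):
    """Replace each None with the most recent non-None value.

    seed is used for leading Nones; if seed is None they stay None.
    """
    filled = []
    last = seed
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


window = [None, 10.0, None, None, 11.5]   # made-up sample data

print(fill_forward(window, seed=9.75))    # [9.75, 10.0, 10.0, 10.0, 11.5]
```

Because each element is visited once, this avoids rescanning the list with `index(None)` for every missing value.
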

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________