Re: [Numpy-discussion] add xirr to numpy financial functions?
On Mon, 25 May 2009 13:51:38 -0400, josef.pktd@gmail.com wrote:
On Mon, May 25, 2009 at 11:50 AM, Joe Harrington <jh@physics.ucf.edu> wrote:
On Sun, 24 May 2009 18:14:42 -0400 josef.pktd@gmail.com wrote:
On Sun, May 24, 2009 at 4:33 PM, Joe Harrington <jh@physics.ucf.edu> wrote:
I hate to ask for another function in numpy, but there's an obvious one missing in the financial group: xirr. It could be done as a new function or as an extension to the existing np.irr.
The internal rate of return (np.irr) is defined as the growth rate that would give you a zero balance at the end of a period of investment given a series of cash flows into or out of the investment at regular intervals (the first and last cash flows are usually an initial deposit and a withdrawal of the current balance).
This is useful in academics, but if you're tracking a real investment, you don't just withdraw or add money on a perfectly annual basis, nor do you want a calc with thousands of days of zero entries just so you can handle the uneven intervals by evening them out. Both excel and openoffice define a "xirr" function that pairs each cash flow with a date. Would there be an objection to either a xirr or adding an optional second arg (or a keyword arg) to np.irr in numpy? Who writes the code is a different question, but that part isn't hard.
3 comments:
* open office also has the other functions in x??? versions, so it might be good to add this consistently to all functions
* date type: scikits.timeseries and the GSoC project for implementing a date type would be useful for getting a clear date type, or would you want to base it only on the python standard library?
* real-life accuracy: given that there are large differences in the definition of a year for financial calculations, any simple implementation would be only approximately accurate. For example, in the open office help, ODDLYIELD lists the following options:
Basis is chosen from a list of options and indicates how the year is to be calculated.

Basis: Calculation
0 or missing: US method (NASD), 12 months of 30 days each
1: Exact number of days in months, exact number of days in year
2: Exact number of days in month, year has 360 days
3: Exact number of days in month, year has 365 days
4: European method, 12 months of 30 days each
So, my question: what's the purpose of the financial functions in numpy? Currently they provide convenient functions for (approximate) interest calculations. If they get expanded to a "serious" implementation of, for example, the main financial functions listed in the open office help (just for reference), then maybe numpy is not the right location for it.
I started to do something similar in matlab, and once I tried to use real dates instead of just counting months, the accounting rules quickly got very messy.
Using dates as you propose would be very convenient, but users shouldn't be surprised if their actual payments at the end of the year don't fully match up with what numpy told them.
my 3cents
Josef
First point: agreed. I wish this community had a design review process for numpy and scipy, so that these things could get properly hashed out, and not just one person (even Travis) suggesting something and everyone else saying yeah-sure-whatever.
Does anyone on the list have the financial background to suggest what functions "should" be included in a basic set of financial routines? xirr is the only one I've ever used in a spreadsheet, myself.
Other points: Yuk. You're right.
When these first came up for discussion, I had a Han Solo moment ("I've got a baaad feeling about this...") but I couldn't put my finger on why. They seemed like simple and limited functions with high utility. Certainly anything as open-ended as financial-industry rules should go elsewhere (scikits, scipy, monpy, whatever).
But, that doesn't prevent a user-supplied, floating-point time array from going into a function in numpy. The rate of return would be in units of that array. Functions that convert date/time in some format (or many) and following some rule (or one of many) to such a floating array can still go elsewhere, maintained by people who know the definitions, if they have interest (pun intended). That would make the functions in numpy much more useful without bloating them or making them a maintenance nightmare.
If you think of time as a regularly spaced grid (e.g. days) with sparse points on it, or as a continuous variable, then extending the current functions should be relatively easy. I guess the only questions are the compounding (annual, quarterly, or at each payment), and whether the annual rate is calculated as a real compounded annualized rate or as an accounting annual rate, e.g. quarterly_rate*4.
This leaves "What is the present value if you get 100 Dollars on the 10th day of each month (or on the next working day if the 10th is a holiday or a weekend) for the next 5 years and the monthly interest rate is 5/12%?" for another day.
Initially I understood you wanted the date as a string or date type as in, e.g., open office. What would be the units of the user-supplied, floating-point time array? It is still necessary to know the time units to provide an annualized rate, unless the rate is in continuous time, exp(r*t). I don't know whether this would apply to all functions in numpy.finance; it's been a while since I looked at the code. Maybe there are some standard simplifications in open office or excel.
I briefly skimmed the list of functions in the open office help, and it would be useful to have them available, e.g. as a package in scipy. But my google searches in the past for finance applications with a compatible license didn't turn up much useful code that could form the basis of a finance package.
Adding more convenience and functionality to numpy.finance is useful, but if they get extended with slow feature creep, then another location (scipy) might be more appropriate and would be more expandable, even if it happens only slowly.
That's just my opinion (obviously), I'm a relative newbie to numpy/scipy and still working my way through all the different subpackages.
np.irr is defined on (anonymous) constant time intervals and gives you the growth per time interval. The code is very short, basically a call to np.roots(values):

def irr(values):
    """
    Return the Internal Rate of Return (IRR).

    This is the rate of return that gives a net present value of 0.0.

    Parameters
    ----------
    values : array_like, shape(N,)
        Input cash flows per time period.  At least the first value
        would be negative to represent the investment in the project.

    Returns
    -------
    out : float
        Internal Rate of Return for periodic input values.

    Examples
    --------
    >>> np.irr([-100, 39, 59, 55, 20])
    0.2809484211599611

    """
    res = np.roots(values[::-1])
    # Find the root(s) between 0 and 1
    mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1)
    res = res[mask].real
    if res.size == 0:
        return np.nan
    rate = 1.0/res - 1
    if rate.size == 1:
        rate = rate.item()
    return rate

So, I think this is a continuous definition of growth, not some periodic compounding.

I'd propose the time array would be in anonymous units, and the result would be in terms of those units. For example, if an interval of 1.0 in the time array were one fortnight, it would give interest in units of continuous growth per fortnight, etc. Anything with many more options than that does not belong in numpy (but it would be interesting to have elsewhere).

--jh--
On Mon, May 25, 2009 at 3:40 PM, Joe Harrington <jh@physics.ucf.edu> wrote:
Here is my stab at xirr. It depends on the python datetime module and the Newton-Raphson algorithm in scipy.optimize, but it could be taken as a starting point if someone wants to get rid of the dependencies (I haven't worked too much with dates or NR before). The reference for the open office version is here <http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_XIRR_function>, and it performs in exactly the same way (assumes 365 days a year). It also doesn't take a 'begin' or 'end' argument for when the payments are made, but this is already in numpy.financial and could be added easily.

import numpy as np
from datetime import date
from scipy.optimize import newton


def _discf(rate, pmts, dates):
    # discounted sum of the cash flows at the given rate
    dcf = []
    for i, cf in enumerate(pmts):
        d = dates[i] - dates[0]
        dcf.append(cf * (1 + rate) ** (-d.days / 365.))
    return np.add.reduce(dcf)


def xirr(pmts, dates, guess=.10):
    '''
    IRR function that accepts irregularly spaced cash flows.

    Parameters
    ----------
    pmts : array_like
        Contains the cash flows, including the initial investment.
    dates : array_like
        Contains the dates of the payments in the form (year, month, day).

    Returns
    -------
    out : float
        Internal Rate of Return

    Notes
    -----
    In general the xirr is the solution to

    .. math:: \sum_{t=0}^{M} \frac{v_t}{(1 + xirr)^{(date_t - date_0)/365}} = 0

    Examples
    --------
    >>> dates = [[2008, 2, 5], [2008, 7, 5], [2009, 1, 5]]
    >>> pmts = [-2750, 1000, 2000]
    >>> print xirr(pmts, dates)
    '''
    # convert the (year, month, day) tuples to datetime.date objects
    for i, dt in enumerate(dates):
        dates[i] = date(*dt)
    f = lambda x: _discf(x, pmts, dates)
    return newton(f, guess)


if __name__ == "__main__":
    dates = [[2008, 2, 5], [2008, 7, 5], [2009, 1, 5]]
    pmts = [-2750, 1000, 2000]
    print xirr(pmts, dates)

Cheers,

Skipper
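[For anyone who wants to drop the scipy.optimize dependency mentioned above, a rough numpy-only sketch follows; it keeps the same 365-day convention but writes the Newton step out by hand using the analytic derivative of the discounted sum. The helper name _xirr_nodep, the tolerance, and the iteration cap are made up for illustration and are not part of numpy or of the code above.]

import numpy as np
from datetime import date

def _xirr_nodep(pmts, dates, guess=0.10, tol=1e-10, maxiter=100):
    # times in years since the first payment, using the same /365 convention
    t = np.array([(date(*d) - date(*dates[0])).days / 365.0 for d in dates])
    cf = np.asarray(pmts, dtype=float)
    r = guess
    for _ in range(maxiter):
        f = np.sum(cf * (1.0 + r) ** (-t))              # net present value
        df = np.sum(-t * cf * (1.0 + r) ** (-t - 1.0))   # d(NPV)/d(rate)
        step = f / df
        r -= step
        if abs(step) < tol:
            return r
    return np.nan

# should agree closely with the scipy-based xirr above
print(_xirr_nodep([-2750, 1000, 2000], [[2008, 2, 5], [2008, 7, 5], [2009, 1, 5]]))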
On Mon, May 25, 2009 at 4:27 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
While I was still trying to think about the general problem, Skipper already implemented a solution.

The advantage of Skipper's implementation using actual dates instead of just an array of numbers is that it is possible to directly calculate the annual irr, since the time units are well specified. The only problem is the need for an equation solver in numpy. Just using a date tuple would remove the problem of string parsing, and it might be possible to extend it later to a date array.

So, I think it would be possible to include Skipper's solution, with some cleanup and testing, if an equation solver can be found or if np.roots can handle high order (sparse) polynomials.

Below is my original message, which is based on the assumption of a date array that is just an array of numbers without any time units associated with it.

Josef

"""
From my reading of the current irr, you have compounding of the interest rate at the given time interval. So if your data is daily data, you would get a daily interest rate with daily compounded rates, which might not be the most interesting number for the user. For the irr function it would still be very easy for the user to convert the daily or monthly rate to an annualized rate, (1+r_d)**365 - 1 or (1+r_m)**12 - 1 (?).
For the implementation, would np.roots still work if you have 1000 days, for example, or 360 months, or a few hundred fortnights? What would be the alternative in numpy for finding the root? Equation solvers are in scipy.

For arbitrary time units with possibly large numbers, working with exp should be easier. In this case the exponents would be floats and not integers, so it is not a polynomial. I think in the continuous-time version we need to solve for r in

sum(values*exp(-r*dates)) = 0

Can this be done in numpy? If dates are floats where the unit is one year, then this would give the continuously compounded annual rate, I think.

Another property of the current function, which I just realized, is that it doesn't allow for negative interest rates. This might not be a problem for the intended use, but if you look at real, i.e. inflation-adjusted, interest rates, then it happens often enough.

Other options that might work, if np.roots can handle it, would be to use integer time internally but fractional time from the user, where the integer unit would be the reference period and the fractions would be, for example, 2/12 for the second month. I never tried this, but using fractional units has a long enough tradition in finance. Or the user could optionally specify the time units ("y", "m" or "d") or the number of periods per year (365, 12, 52, 26).
"""
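[To make the continuous-time formulation above concrete, here is a minimal numpy-only sketch; the function name, starting value, and tolerance are placeholders for illustration. It solves sum(values*exp(-r*times)) = 0 with a hand-written Newton step, so if the times are floats measured in years the result is the continuously compounded annual rate, and nothing in it prevents the solution from being negative.]

import numpy as np

def irr_continuous(values, times, guess=0.05, tol=1e-12, maxiter=100):
    # solve sum(values * exp(-r * times)) = 0 for the continuously
    # compounded rate r; times are floats in arbitrary units (e.g. years)
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    r = guess
    for _ in range(maxiter):
        w = np.exp(-r * times)
        f = np.sum(values * w)             # discounted sum
        df = np.sum(-times * values * w)   # derivative with respect to r
        step = f / df
        r -= step
        if abs(step) < tol:
            return r
    return np.nan

# pay 100 now, receive 110 after one year: r = log(1.1), about 0.0953
print(irr_continuous([-100.0, 110.0], [0.0, 1.0]))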
Sorry to jump in a conversation I haven't followed too deep in details, but I'm sure you're all aware of the scikits.timeseries package by now. This should at least help you manage the dates operations in a straightforward manner. I think that could be a nice extension to the package: after all, half of the core developers is a financial analyst...
On Mon, May 25, 2009 at 6:36 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
The problem is, if the functions are enhanced in the current numpy, then scikits.timeseries is not (yet) available. I agree that for any more extended finance package, the handling of "time" series (in calendar time) should make use of scikits.timeseries (and possibly the new datetime array type).

Pierre, you're not already hiding by chance any finance code in your timeseries scikit? :)

Josef

BTW: here are the formulas for the NotImplementedError
http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_Derivat...
On May 25, 2009, at 7:02 PM, josef.pktd@gmail.com wrote:
The problem is, if the functions are enhanced in the current numpy, then scikits.timeseries is not (yet) available.
Mmh, I'm not following you here...
Pierre, you're not already hiding by chance any finance code in your timeseries scikit? :)
Ah, you should ask Matt, he's the financial analyst, I'm the hydrologist... Would moving_funcs.mov_average_expw do something you'd find useful ? Anyhow, if the pb you have are just to specify dates, I really think you should give the scikits a try. And send feedback, of course...
On Mon, May 25, 2009 at 7:37 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
On May 25, 2009, at 7:02 PM, josef.pktd@gmail.com wrote:
The problem is, if the functions are enhanced in the current numpy, then scikits.timeseries is not (yet) available.
Mmh, I'm not following you here...
The original question was how we can enhance numpy.financial, e.g. np.irr. So we are restricted to using only what is available in numpy and in standard python.
Pierre, you're not already hiding by chance any finance code in your timeseries scikit? :)
Ah, you should ask Matt, he's the financial analyst, I'm the hydrologist... Would moving_funcs.mov_average_expw do something you'd find useful ?
I looked at your moving functions, autocorrelation function and so on a while ago. That's where I learned how to use np.correlate or the scipy versions of it, and the filter functions. I've written the standard array versions for the moving functions and acf, ccf, in one of my experiments. If Skipper has enough time in his google summer of code, we would like to include some basic timeseries econometrics (ARMA, VAR, ...?) however most likely only for regularly spaced data.
Anyhow, if the pb you have are just to specify dates, I really think you should give the scikits a try. And send feedback, of course...
Skipper intends to write some examples to show how to work with the extensions to scipy.stats, which, I think, will include examples using time series, besides recarrays, and other array types. Is there a time line for including the timeseries scikits in numpy/scipy? With code that is intended for incorporation in numpy/scipy, we are restricted in our external dependencies. Josef
On May 25, 2009, at 8:06 PM, josef.pktd@gmail.com wrote:
The original question was how we can enhance numpy.financial, e.g. np.irr. So we are restricted to using only what is available in numpy and in standard python.
Ah OK. But it seems that you're now running into a pb w/ dates handling, which might be a bit too specialized for numpy. Anyway, the call isn't mine.
I looked at your moving functions, autocorrelation function and so on a while ago. That's where I learned how to use np.correlate or the scipy versions of it, and the filter functions. I've written the standard array versions for the moving functions and acf, ccf, in one of my experiments.
The moving functions were written in C and they work even w/ timeseries (they work quite OK w/ pure MaskedArrays). We put them in scikits.timeseries because it was easier to have them there than in scipy, for example.
If Skipper has enough time in his google summer of code, we would like to include some basic timeseries econometrics (ARMA, VAR, ...?) however most likely only for regularly spaced data.
Well, we can easily restrict the functions to the case where there's no missing data nor missing dates. Checking the mask is easy, and we have a method to check the dates (is_valid).
Anyhow, if the pb you have are just to specify dates, I really think you should give the scikits a try. And send feedback, of course...
Skipper intends to write some examples to show how to work with the extensions to scipy.stats, which, I think, will include examples using time series, besides recarrays, and other array types.
Dealing with TimeSeries is pretty much the same thing as dealing with MaskedArray, with the extra convenience of converting from one frequency to another and so forth.... Quite often, an analysis can be performed by dropping the .dates part, working on the .series part (the underlying MaskedArray), and repatching the dates at the end...
Is there a time line for including the timeseries scikits in numpy/scipy? With code that is intended for incorporation in numpy/scipy, we are restricted in our external dependencies.
I can't tell, because the decision is not mine. From what I understood, there could be an inclusion in scipy if there's a need for it. For that, we need more users and more feedback.... If you catch my drift...
Josef
On Mon, May 25, 2009 at 8:30 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
Thanks for the info, we will keep this in mind. Personally, I still think of data just as an array or matrix of numbers; when they still have dates and units attached to them, they are usually a pain. And I'm only slowly getting used to the possibility that it doesn't necessarily need to be so painful. (I didn't know you had moved the moving functions to C, I thought I saw them in python.) Josef
The advantage of Skipper's implementation using actual dates instead of just an array of numbers is that it is possible to directly calculate the annual irr, since the time units are well specified. The only problem is the need for an equation solver in numpy. Just using a date tuple would remove the problem of string parsing, and it might be possible to extend it later to a date array.
So, I think it would be possible to include Skipper's solution, with some cleanup and testing, if an equation solver can be found or if np.roots can handle high order (sparse) polynomials.
I looked a bit more: the current implementation of ``rate`` uses its own iterative (Newton) solver, and in a similar way this could be done for a more general xirr. So with a bit of work this doesn't seem to be a problem, and the only question that remains is the specification of the dates. Josef
On Mon, May 25, 2009 at 7:27 PM, <josef.pktd@gmail.com> wrote:
Here is a solver using the polynomial class, or is there something like this already in numpy?

Josef

'''
Newton solver for the value of a polynomial equal to zero
works also for negative rate of return
'''
import numpy as np

nper = 30   # Number of periods
freq = 5    # frequency of payment
val = np.zeros(nper)
val[1:nper+1:freq] = 1   # periodic payment
val[0] = -4              # initial investment

p = np.poly1d(val[::-1])
#print p.roots   # very slow for array with 1000 periods
pd1 = np.polyder(p)
#print p(0.95)   # net present value
#print pd1(0.95) # derivative of polynomial

rv = np.linspace(0.9, 1.05, 16)
for v, i in zip(rv, p(rv)): print v, i
for v, i in zip(rv, pd1(rv)): print v, i

# Newton iteration
r = 0.95   # starting value, find polynomial root in neighborhood
for i in range(10):
    r = r - p(r)/pd1(r)
    print r, p(r)

print 'interest rate irr is', 1/r - 1
On Mon, May 25, 2009 at 6:55 PM, <josef.pktd@gmail.com> wrote:
Here is a solver using the polynomial class, or is there something like this already in numpy?
No. But I think numpy might be a good place for one of the simple 1D solvers. The Brent one would be a good choice as it includes bisection as a fallback strategy. Simple bisection might also be worth adding. The current location of these solvers in scipy.optimize is somewhat obscure and they are the sort of function that gets used often. They don't really fit if we stick to an "arrays only" straitjacket in numpy, but polynomials and financial functions seem to me even further from the core. Chuck
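[To illustrate how small such a solver can be, here is a hedged sketch of plain bisection (not Brent) applied to the same root that np.irr looks for; the helper name, bracket endpoints, and tolerance are arbitrary choices for the example, not a proposal for an actual API.]

import numpy as np

def bisect(f, a, b, tol=1e-12, maxiter=200):
    # minimal bisection: assumes f(a) and f(b) have opposite signs
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("root is not bracketed by [a, b]")
    for _ in range(maxiter):
        m = 0.5 * (a + b)
        fm = f(m)
        if fm == 0 or (b - a) < tol:
            return m
        if fa * fm < 0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

# find the discount factor x in (0, 1] for the np.irr docstring example,
# then convert it to a rate; this agrees with np.irr([-100, 39, 59, 55, 20])
vals = np.array([-100.0, 39.0, 59.0, 55.0, 20.0])
x = bisect(lambda z: np.polyval(vals[::-1], z), 1e-6, 1.0)
print(1.0 / x - 1)   # ~0.2809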
forgive me for jumping in on this thread and playing devil's advocate here, but I am a natural pessimist so please bear with me :) ...

I think as this discussion has already demonstrated, it is *extremely* difficult to build a solid general purpose API for financial functions (even seemingly simple ones like an IRR calculation) because of the endless amount of possible permutations and interpretations. I think it would be a big mistake to add more financial functions to numpy directly without having them mature independently in a separate (scikits) package first. It is virtually guaranteed that you won't get the API right on the first try and adding the functions to numpy locks you into an API commitment because numpy is supposed to be a stable package with certain guarantees for backwards compatibility.

And as for a more fully featured finance/quant module in Python... someone has already mentioned the C++ library, QuantLib - which I use extensively at work - and I think any serious effort to improve Python's capabilities in this area would be best spent on building a good Python/numpy interface to QuantLib rather than reimplementing its very substantial functionality (which is probably an impossible task realistically).

- Matt
On Mon, May 25, 2009 at 9:18 PM, Matt Knox <mattknox.ca@gmail.com> wrote:
forgive me for jumping in on this thread and playing devil's advocate here, but I am a natural pessimist so please bear with me :) ...
It's good to hear from a real finance person.
I think as this discussion has already demonstrated, it is *extremely* difficult to build a solid general purpose API for financial functions (even seemingly simple ones like an IRR calculation) because of the endless amount of possible permutations and interpretations. I think it would be a big mistake to add more financial functions to numpy directly without having them mature independently in a separate (scikits) package first. It is virtually guaranteed that you won't get the API right on the first try and adding the functions to numpy locks you into an API commitment because numpy is supposed to be a stable package with certain guarantees for backwards compatibility.
And as for a more fully featured finance/quant module in Python... someone has already mentioned the C++ library, QuantLib - which I use extensively at work - and I think any serious effort to improve Python's capabilities in this area would be best spent on building a good Python/numpy interface to QuantLib rather than reimplementing its very substantial functionality (which is probably an impossible task realistically).
Quantlib might be good for heavy-duty work, but when I looked at their code, I wouldn't know where to start if I wanted to rewrite any algorithm. My benchmark is more scripting with matlab, where maybe some pieces are readily available, but where the code also needs to be strongly adjusted, or we want to implement a new method or prototype one. I hadn't tried very hard, but I didn't manage to get Boost and quantlib correctly compiled with the python bindings with MingW.

So, while python won't get any "industrial strength" finance package, a more modest "designer package" would be feasible, if there were any interest in it (which I haven't seen).

It is similar with statistics: there is no way to achieve the same coverage of statistics as R, for example, but still I find that many of the basic statistics functions are implemented in many different python packages, without everyone running immediately to R, not to mention the multitude (and multiplicity) of available machine learning packages in python. The other group of python packages covers very specialized requirements of statistical analysis, for example the neuroimaging groups.

The even more modest question is whether we would want to match open office in its finance part.

These are pretty different use cases from those use cases where you have quantlib all set up and running.

(I also saw a book announcement for Finance with Python, I don't remember the exact title.)

Josef
<josef.pktd <at> gmail.com> writes:
So, while python won't get any "industrial strength" finance package, a more modest "designer package" would be feasible, if there were any interest in it (which I haven't seen).
...
The even more modest question is whether we would want to match open office in its finance part.
These are pretty different use cases from those use cases where you have quantlib all set up and running.
As you have hinted, the scope of what will/should be covered with numpy financial functions needs to be defined better before putting more such functions into numpy. If that scope turns out to be something comparable to what excel or openoffice offers, that's fine, but I think a maturation period outside the numpy core (in the form of a scikit or otherwise) would still be a good idea to avoid getting stuck with a poorly thought out API.

As for my personal feelings on how much financial functionality numpy/scipy should offer... I would agree that QuantLib-like functionality is far beyond what numpy can/should try to achieve. More basic functionality like OpenOffice or Excel probably seems about right. Although maybe it is more appropriate for scipy than numpy.

- Matt
On May 25, 2009, at 9:15 PM, Matt Knox wrote:
As you have hinted, the scope of what will/should be covered with numpy financial functions needs to be defined better before putting more such functions into numpy. If that scope turns out to be something comparable to what excel or openoffice offers, that's fine, but I think a maturation period outside the numpy core (in the form of a scikit or otherwise) would still be a good idea to avoid getting stuck with a poorly thought out API.
+1 for a maturation period outside the numpy core.
As for my personal feelings on how much financial functionality numpy/scipy should offer... I would agree that QuantLib-like functionality is far beyond what numpy can/should try to achieve. More basic functionality like OpenOffice or Excel probably seems about right. Although maybe it is more appropriate for scipy than numpy.
+1 for something outside numpy. Even OpenOffice or Excel financial capability might, perhaps, go into scipy, but why not have it optional? -r
I haven't read all the messages in detail, and I'm a consumer not a producer, but I'll comment anyway.

I'd love to see additional "financial" functionality, but I'd like to see it in a scikit, not in numpy. I think to be useful these functions are too complicated to go into numpy. A couple of my many reasons:

1. Doing a precise, bang-up job with dates is paramount to any interesting implementation of many financial functions. I've found timeseries to be a great package - there are some things I'd like to see, but overall it is at the foundation of all of my financial analysis. Any moderately interesting extension of the current capabilities would rapidly end up trying to duplicate much of the timeseries functionality, IMO. Rather than partially reinvent the wheel in numpy, as a consumer I'd like to see financial stuff built on a common basis, and timeseries would be a great start.

2. I've read enough of this discussion to hear a requirement for both good date handling and capable solvers - just for xirr. To do a really interesting job on an interesting amount of capability requires even more dependencies, I think.

Although it might be tempting to include a few more "lightweight" financial functions in numpy, I doubt they will be that useful. Most of the lightweight ones are easy enough to whip up when you need them. Also, an approximation that's good today isn't the right one tomorrow - only the really robust stuff seems to survive the test of time, in my limited experience. A start on a really solid scikits financial package would be awesome, though.

A few months ago, when the open-source software for pricing CDSs was released (http://www.cdsmodel.com/information/cds-model), I took a look and noticed that it had a ton of code for dealing with dates. (I also didn't see any tests in the code. I wonder what that means. Scary for anybody who might want to modify it.) I thought if I had an extra 100 hours in every day it would be fun to rewrite that code in numpy/scipy and release it.

-r
On Mon, May 25, 2009 at 11:29 PM, Robert Ferrell <ferrell@diablotech.com> wrote:
I was looking at mortgage-backed securities before the current crisis hit, and I realized that when I use real dates and real payment schedules and take actual accounting rules into account, my work and code size would increase a lot. Since it was a semi-theoretical application, sticking to months and ignoring actual calendar dates was a useful simplification.

As Matt argued, it is not possible (or maybe just unrealistic) to write a full finance package in python from scratch. As far as I understand, for example, the timeseries scikit cannot handle business holidays. So some simplification will be necessary. But I agree that even for an "approximate" finance package, handling dates and timeseries without a corresponding array type will soon get very tedious or duplicative.

One additional advantage of a scikit, besides more freedom for dependencies, would be that models can be added incrementally as contributors find time and interest and gain more experience with the API and the appropriate abstraction, and that it can collect hacked-up scripts before they get a common structure and implementation. If the only crucial dependency is the timeseries package, it could possibly go into scipy together with the timeseries scikit. Also, targeting scipy makes a lot of code available, e.g. for the problem with the solver and for including statistics.

"A sparrow in the hand is better than a pigeon on the roof." (German proverb) On the other hand, I have seen many plans on the mailing list for great new packages or extensions to existing packages without many results. So maybe an incremental inclusion of the functions and API of open office, excel or similar, now, instead of hoping for a "real" finance package, is the more realistic approach, especially because I haven't found any source from which we could "steal" wholesale (for example, http://www.cdsmodel.com/information/cds-model doesn't look compatible with BSD).

Josef
Would you like to put xirr in econpy until it finds a home in SciPy? (Might as well make it available.) Cheers, Alan Isaac
I rewrote irr to use the iterative solver instead of polynomial roots so that it can also handle large arrays. For 3000 values, I had to kill the current np.irr since I didn't want to wait longer than 10 minutes. When writing the test, I found that npv is missing a "when" keyword for the case when the first payment is immediate, i.e. in the present, and that broadcasting has problems:
>>> np.npv(0.05, np.array([[1,1],[1,1]]))
array([ 1.9047619 , 1.81405896])
>>> np.npv(0.05, np.array([[1,1],[1,1],[1,1]]))
Traceback (most recent call last):
  File "<pyshell#82>", line 1, in <module>
    np.npv(0.05, np.array([[1,1],[1,1],[1,1]]))
  File "C:\Programs\Python25\Lib\site-packages\numpy\lib\financial.py", line 449, in npv
    return (values / (1+rate)**np.arange(1,len(values)+1)).sum(axis=0)
ValueError: shape mismatch: objects cannot be broadcast to a single shape

--------------------------

Here is the changed version, which only looks for one root. I added an optional starting value as a keyword argument (as in open office) but didn't make any other changes:

def irr(values, start=None):
    """
    Return the Internal Rate of Return (IRR).

    This is the rate of return that gives a net present value of 0.0.

    Parameters
    ----------
    values : array_like, shape(N,)
        Input cash flows per time period.  At least the first value
        would be negative to represent the investment in the project.

    Returns
    -------
    out : float
        Internal Rate of Return for periodic input values.

    Examples
    --------
    >>> np.irr([-100, 39, 59, 55, 20])
    0.2809484211599611

    """
    p = np.poly1d(values[::-1])
    pd1 = np.polyder(p)
    if start is None:
        r = 0.99   # starting value, find polynomial root in neighborhood
    else:
        r = start
    # iterative solver for discount factor
    for i in range(10):
        r = r - p(r)/pd1(r)
##    res = np.roots(values[::-1])
##    # Find the root(s) between 0 and 1
##    mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1)
##    res = res[mask].real
##    if res.size == 0:
##        return np.nan
    rate = 1.0/r - 1
    if rate.size == 1:
        rate = rate.item()
    return rate


from numpy.testing import assert_almost_equal

def test_irr():
    v = [-150000, 15000, 25000, 35000, 45000, 60000]
    assert_almost_equal(irr(v), 0.0524, 2)

    nper = 300   # Number of periods
    freq = 5     # frequency of payment
    v = np.zeros(nper)
    v[1:nper+1:freq] = 1   # periodic payment
    v[0] = -4.3995180296393199
    assert_almost_equal(irr(v), 0.05, 10)

    nper = 3000  # Number of periods
    freq = 5     # frequency of payment
    v = np.zeros(nper)
    v[1:nper+1:freq] = 1   # periodic payment
    v[0] = -4.3995199643479603
    assert_almost_equal(irr(v), 0.05, 10)

If this looks ok, I can write a proper patch.

Josef
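[On the npv broadcasting problem shown above, one possible fix, sketched here with a made-up name (npv_2d) rather than as a patch to np.npv itself, is to give the period exponents a trailing axis so that any (nper, ncols) stack of cash-flow columns broadcasts cleanly:]

import numpy as np

def npv_2d(rate, values):
    # like np.npv, but treats axis 0 as time so that a (nper, ncols) array
    # of cash-flow columns is discounted column by column
    values = np.atleast_2d(values)
    periods = np.arange(1, values.shape[0] + 1).reshape(-1, 1)
    return (values / (1.0 + rate) ** periods).sum(axis=0)

print(npv_2d(0.05, np.array([[1, 1], [1, 1], [1, 1]])))   # works for 3x2 as well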
participants (8)
- Alan G Isaac
- Charles R Harris
- Joe Harrington
- josef.pktd@gmail.com
- Matt Knox
- Pierre GM
- Robert Ferrell
- Skipper Seabold