
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time. Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs. This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats). Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions. Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio “”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes =========================================================== This package provides: 1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations). 2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations. 3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes. 4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration. 5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory. 6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses. For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process. Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes. General infrastructure ====================== .. autosummary:: :toctree: generated/ process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations. Stochasticity sources ===================== .. autosummary:: :toctree: generated/ wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments. Stochastic process realizations =============================== .. autosummary:: :toctree: generated/ const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution). Shortcuts:: lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below). .. autosummary:: :toctree: generated/ wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf cir_mean cir_var cir_std cir_pdf heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf mjd_log_pdf mjd_log_chf kou_mean kou_log_pdf kou_log_chf bsd1d2 bscall bscall_delta bsput bsput_delta Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """

Sounds like a pretty general computational tool to me. I have some problems in rigid body dynamics that I'd like to use this for. Matt On Fri, Sep 8, 2017 at 12:55 AM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions.
Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
-- Matt Haberland Assistant Adjunct Professor in the Program in Computing Department of Mathematics 7620E Math Sciences Building, UCLA

Hi Maurizio, On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs.
I've never used these, but a quick search turns up two packages: https://pypi.python.org/pypi/sdeint https://github.com/alu042/SDE-higham Not very mature probably, but would be good to compare your code with those.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions.
First thought: this looks like a useful addition to the scientific Python ecosystem, but is probably too specialized for SciPy (at least at this stage). Based on your docstring below, here are some questions/concerns: 1. We definitely don't want a new ndarray subclass (there's a pretty strong consensus at this point that subclassing ndarray is usually not a good solution), so your `process` doesn't sound attractive. 2. The `integrator` class sounds a lot like odeint, so maybe this would fit better with scipy.integrate? 3. Monte-Carlo simulation is a big and fairly technical topic. There are whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well with SciPy. 4. Things specific to finance like Black-Scholes and put/call options are not a good fit. 5. Maintainers for this added code. The ODE integrators in scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are more specialized so are likely to not be very well-maintained by other devs than you. Without seeing your code it's hard to say for sure, but it looks to me like it would be better to release your code as a standalone package. Cheers, Ralf
Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev

Hi Ralf, thanks for going through this, here are some follow up comments on your points: On the two packages you found: - Sdeint: provides several SDE integration algorithms but no specific processes; does not cover processes with jumps; offers no explicit provisions for handling solutions in a number of paths, which is what you need since the integration output is random variables, not numbers. - SDE-higham is not a package or tool, but rather a collection of recipes for a few textbook examples (no jumps either), in the form that an interactive session might take (set globals, do calculations, plot and print results). About your questions/concerns: 1. ndarray subclassing: the integrator does not depend on the process class, and process realizations may be easily tweaked to return an ndarray instead of a process instance; however, a main point of the package is not just to produce SDE solutions, but to allow the user to handle them (almost) effortlessly once obtained: so one might possibly hide the process class inside the testing suite, which depends on it, but hardly get rid of it altogether. Indeed, I subclassed ndarray as safely as I could figure out, but I will surely go through the scipy-dev archives and find out what advised you against doing so. 2. odeint and scipy.integrate: my view is that odeint churns out numbers dependant on numbers, while SDE integration churns out random variables dependant on random variables (the stochasticity sources), so there is a shift of paradigm and the two schemes do not fit smoothly one within the other. The step by step integration of SDEs makes sense, and gets useful, if you can easily control the inputs (eg. correlations among stochasticity sources fed into different SDEs), generate the output in a decent number of paths, and easily extract statistics of expressions dependent on it (hence the flow ‘get a number of paths packaged as a process instance’ -> ‘do numpy algebra with it’ -> ‘estimate probability distributions etc. using the process class methods’ -> ‘scale it up with the montecarlo class in case the needed paths do not fit into memory’). 3. Montecarlo simulations is a big topic indeed, that goes far beyond this package; what is called the montecarlo class is in fact a tool to cumulate statistical information extracted from a number of realizations of a random variable, and is focused on making easy to interpret the output of SDE integration. 4. Black-Scholes assists one more tests in a rather extensive testing suite, and might be painlessly dropped. This said, I see your general feeling is that this whole subject is too specialized to fit into scipy at this stage. I suggested otherwise, in my proposal, because stochastic calculus is regarded as one of the pillars of modern probability theory, and covering it into scipy might be relevant to lots of mathematicians and physicists, both practitioners and students, that might otherwise hesitate to adopt, and build dependencies upon, a standalone package. In case you reconsider let me know, I’d be happy to help (now or in the future…). If you have further questions, let me know as well. Cheers Maurizio On 9 September 2017 at 03:31, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Hi Maurizio,
On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs.
I've never used these, but a quick search turns up two packages: https://pypi.python.org/pypi/sdeint https://github.com/alu042/SDE-higham Not very mature probably, but would be good to compare your code with those.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions.
First thought: this looks like a useful addition to the scientific Python ecosystem, but is probably too specialized for SciPy (at least at this stage).
Based on your docstring below, here are some questions/concerns: 1. We definitely don't want a new ndarray subclass (there's a pretty strong consensus at this point that subclassing ndarray is usually not a good solution), so your `process` doesn't sound attractive. 2. The `integrator` class sounds a lot like odeint, so maybe this would fit better with scipy.integrate? 3. Monte-Carlo simulation is a big and fairly technical topic. There are whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well with SciPy. 4. Things specific to finance like Black-Scholes and put/call options are not a good fit. 5. Maintainers for this added code. The ODE integrators in scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are more specialized so are likely to not be very well-maintained by other devs than you.
Without seeing your code it's hard to say for sure, but it looks to me like it would be better to release your code as a standalone package.
Cheers, Ralf
Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev

On Mon, Sep 11, 2017 at 12:35 PM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi Ralf,
thanks for going through this, here are some follow up comments on your points:
On the two packages you found:
- Sdeint: provides several SDE integration algorithms but no specific processes; does not cover processes with jumps; offers no explicit provisions for handling solutions in a number of paths, which is what you need since the integration output is random variables, not numbers. - SDE-higham is not a package or tool, but rather a collection of recipes for a few textbook examples (no jumps either), in the form that an interactive session might take (set globals, do calculations, plot and print results).
About your questions/concerns:
1. ndarray subclassing: the integrator does not depend on the process class, and process realizations may be easily tweaked to return an ndarray instead of a process instance; however, a main point of the package is not just to produce SDE solutions, but to allow the user to handle them (almost) effortlessly once obtained: so one might possibly hide the process class inside the testing suite, which depends on it, but hardly get rid of it altogether. Indeed, I subclassed ndarray as safely as I could figure out, but I will surely go through the scipy-dev archives and find out what advised you against doing so. 2. odeint and scipy.integrate: my view is that odeint churns out numbers dependant on numbers, while SDE integration churns out random variables dependant on random variables (the stochasticity sources), so there is a shift of paradigm and the two schemes do not fit smoothly one within the other. The step by step integration of SDEs makes sense, and gets useful, if you can easily control the inputs (eg. correlations among stochasticity sources fed into different SDEs), generate the output in a decent number of paths, and easily extract statistics of expressions dependent on it (hence the flow ‘get a number of paths packaged as a process instance’ -> ‘do numpy algebra with it’ -> ‘estimate probability distributions etc. using the process class methods’ -> ‘scale it up with the montecarlo class in case the needed paths do not fit into memory’). 3. Montecarlo simulations is a big topic indeed, that goes far beyond this package; what is called the montecarlo class is in fact a tool to cumulate statistical information extracted from a number of realizations of a random variable, and is focused on making easy to interpret the output of SDE integration. 4. Black-Scholes assists one more tests in a rather extensive testing suite, and might be painlessly dropped.
This said, I see your general feeling is that this whole subject is too specialized to fit into scipy at this stage. I suggested otherwise, in my proposal, because stochastic calculus is regarded as one of the pillars of modern probability theory, and covering it into scipy might be relevant to lots of mathematicians and physicists, both practitioners and students, that might otherwise hesitate to adopt, and build dependencies upon, a standalone package. In case you reconsider let me know, I’d be happy to help (now or in the future…).
If you have further questions, let me know as well.
Cheers
Maurizio
To partially follow up, similar to Ralph's view. I think it would be good to have this as a separate package, or at least the code on github. With distributions like conda it is now much easier to install additional packages, once they are established enough and are included in distributions or easily pip installable. The main reason that I think scipy is currently not the appropriate package for it is that the design and review would require a lot of work. I guess that there are not many maintainers and reviewers that are familiar with this area. I also know only of applications in finance and it is not clear whether or which parts would be of more general interest. In my opinion, some core tools for SDE and jump diffusions would fit in scipy. But the application support will not or might not fit well in a general purpose scientific numerical library. This would be similar to other areas where scipy provides the core computational tools, but applications are supported by other packages. For example, the process class might be similar in magnitude to scipy stats distributions or signal statespace classes, but I have no idea what design for it would fit in scipy. Some of the time aggregation properties sound similar to the corresponding pandas functions, and it is not clear whether users would want to use a separate class or just use pandas instead. Similarly, the kfunc class sounds like something that doesn't have a similar pattern in scipy. Design changes can be made more easily in a standalone package, while all refactorings of classes in scipy are difficult issues because of the backwards compatibility policy which requires that the design should be largely settled before a merge. Josef https://groups.google.com/d/msg/pystatsmodels/xsttiEiyqAg/maR6n4EeAgAJ (stackoverflow question has been deleted "Is there a library out there to calibrate an Ornstein-Uhlenbeck model?") https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/t... and diffusion2.py
On 9 September 2017 at 03:31, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Hi Maurizio,
On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs.
I've never used these, but a quick search turns up two packages: https://pypi.python.org/pypi/sdeint https://github.com/alu042/SDE-higham Not very mature probably, but would be good to compare your code with those.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions.
First thought: this looks like a useful addition to the scientific Python ecosystem, but is probably too specialized for SciPy (at least at this stage).
Based on your docstring below, here are some questions/concerns: 1. We definitely don't want a new ndarray subclass (there's a pretty strong consensus at this point that subclassing ndarray is usually not a good solution), so your `process` doesn't sound attractive. 2. The `integrator` class sounds a lot like odeint, so maybe this would fit better with scipy.integrate? 3. Monte-Carlo simulation is a big and fairly technical topic. There are whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well with SciPy. 4. Things specific to finance like Black-Scholes and put/call options are not a good fit. 5. Maintainers for this added code. The ODE integrators in scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are more specialized so are likely to not be very well-maintained by other devs than you.
Without seeing your code it's hard to say for sure, but it looks to me like it would be better to release your code as a standalone package.
Cheers, Ralf
Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev

Hi everybody, following up on this conversation of some time ago, a pre-release of the package, renamed SdePy, is now available on pip ( https://pypi.org/project/sdepy ) and as a project on github ( https://github.com/sdepy/sdepy ), version 1.0.0rc3. A short quickstart guide (https://sdepy.readthedocs.io/en/latest/intro.html#id2 ) is the easiest path to see what it does and what it does not. Any feedback, comments and suggestions are very welcome! Since last September, the package has evolved quite a bit and had its fair share of refactoring, going through several 0.x unpublished versions before stabilizing in its current form (marked as beta). Here are a few takeaways I got from this thread, which I acted upon: - Integrating user defined SDEs turned out to be a main point of interest: this ‘odeint’ like functionality has been considerably streamlined and simplified - formerly, preset processes were mainly in focus, and getting a new process up and running took a deep dive in a not-so-welcoming API, now upgraded as well (the former ‘integrator’ has been superseded by a ‘paths_generator’ parent class, and cooperating ‘integrator’ and ‘SDE’ classes). - The ‘kfunc’ beast was admittedly a hard sell: all undue dependencies of the core modules on its machinery have been eliminated, it plays now the role of an optional decorator for the shortcuts of frequently used stuff (it’s fully functional though and, in my experience, quite handy). - The question “Do you REALLY need a ‘process’ class” clearly stood out in your comments: after some experimentation, the main reason for answering in the yes turned out, in my view, to be its seamless interoperability with the rest of the package, hardly achievable without a dedicated design. Now process instances can indeed be fed to integrators, both as time-dependent and/or path-dependent SDE parameters, and as stochasticity sources: this opened countless hacking opportunities when doing computations, and spoiling them seemed a pity. In case any of you is willing to take a look, feel free to get in touch for any problems/bugs/questions you might have Cheers Maurizio On Mon, 11 Sep 2017 at 19:39, <josef.pktd@gmail.com> wrote:
On Mon, Sep 11, 2017 at 12:35 PM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi Ralf,
thanks for going through this, here are some follow up comments on your points:
On the two packages you found:
- Sdeint: provides several SDE integration algorithms but no specific processes; does not cover processes with jumps; offers no explicit provisions for handling solutions in a number of paths, which is what you need since the integration output is random variables, not numbers. - SDE-higham is not a package or tool, but rather a collection of recipes for a few textbook examples (no jumps either), in the form that an interactive session might take (set globals, do calculations, plot and print results).
About your questions/concerns:
1. ndarray subclassing: the integrator does not depend on the process class, and process realizations may be easily tweaked to return an ndarray instead of a process instance; however, a main point of the package is not just to produce SDE solutions, but to allow the user to handle them (almost) effortlessly once obtained: so one might possibly hide the process class inside the testing suite, which depends on it, but hardly get rid of it altogether. Indeed, I subclassed ndarray as safely as I could figure out, but I will surely go through the scipy-dev archives and find out what advised you against doing so. 2. odeint and scipy.integrate: my view is that odeint churns out numbers dependant on numbers, while SDE integration churns out random variables dependant on random variables (the stochasticity sources), so there is a shift of paradigm and the two schemes do not fit smoothly one within the other. The step by step integration of SDEs makes sense, and gets useful, if you can easily control the inputs (eg. correlations among stochasticity sources fed into different SDEs), generate the output in a decent number of paths, and easily extract statistics of expressions dependent on it (hence the flow ‘get a number of paths packaged as a process instance’ -> ‘do numpy algebra with it’ -> ‘estimate probability distributions etc. using the process class methods’ -> ‘scale it up with the montecarlo class in case the needed paths do not fit into memory’). 3. Montecarlo simulations is a big topic indeed, that goes far beyond this package; what is called the montecarlo class is in fact a tool to cumulate statistical information extracted from a number of realizations of a random variable, and is focused on making easy to interpret the output of SDE integration. 4. Black-Scholes assists one more tests in a rather extensive testing suite, and might be painlessly dropped.
This said, I see your general feeling is that this whole subject is too specialized to fit into scipy at this stage. I suggested otherwise, in my proposal, because stochastic calculus is regarded as one of the pillars of modern probability theory, and covering it into scipy might be relevant to lots of mathematicians and physicists, both practitioners and students, that might otherwise hesitate to adopt, and build dependencies upon, a standalone package. In case you reconsider let me know, I’d be happy to help (now or in the future…).
If you have further questions, let me know as well.
Cheers
Maurizio
To partially follow up, similar to Ralph's view.
I think it would be good to have this as a separate package, or at least the code on github. With distributions like conda it is now much easier to install additional packages, once they are established enough and are included in distributions or easily pip installable.
The main reason that I think scipy is currently not the appropriate package for it is that the design and review would require a lot of work. I guess that there are not many maintainers and reviewers that are familiar with this area. I also know only of applications in finance and it is not clear whether or which parts would be of more general interest.
In my opinion, some core tools for SDE and jump diffusions would fit in scipy. But the application support will not or might not fit well in a general purpose scientific numerical library. This would be similar to other areas where scipy provides the core computational tools, but applications are supported by other packages.
For example, the process class might be similar in magnitude to scipy stats distributions or signal statespace classes, but I have no idea what design for it would fit in scipy. Some of the time aggregation properties sound similar to the corresponding pandas functions, and it is not clear whether users would want to use a separate class or just use pandas instead.
Similarly, the kfunc class sounds like something that doesn't have a similar pattern in scipy.
Design changes can be made more easily in a standalone package, while all refactorings of classes in scipy are difficult issues because of the backwards compatibility policy which requires that the design should be largely settled before a merge.
Josef
https://groups.google.com/d/msg/pystatsmodels/xsttiEiyqAg/maR6n4EeAgAJ (stackoverflow question has been deleted "Is there a library out there to calibrate an Ornstein-Uhlenbeck model?")
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/t... and diffusion2.py
On 9 September 2017 at 03:31, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Hi Maurizio,
On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina < maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs.
I've never used these, but a quick search turns up two packages: https://pypi.python.org/pypi/sdeint https://github.com/alu042/SDE-higham Not very mature probably, but would be good to compare your code with those.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions.
First thought: this looks like a useful addition to the scientific Python ecosystem, but is probably too specialized for SciPy (at least at this stage).
Based on your docstring below, here are some questions/concerns: 1. We definitely don't want a new ndarray subclass (there's a pretty strong consensus at this point that subclassing ndarray is usually not a good solution), so your `process` doesn't sound attractive. 2. The `integrator` class sounds a lot like odeint, so maybe this would fit better with scipy.integrate? 3. Monte-Carlo simulation is a big and fairly technical topic. There are whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well with SciPy. 4. Things specific to finance like Black-Scholes and put/call options are not a good fit. 5. Maintainers for this added code. The ODE integrators in scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are more specialized so are likely to not be very well-maintained by other devs than you.
Without seeing your code it's hard to say for sure, but it looks to me like it would be better to release your code as a standalone package.
Cheers, Ralf
Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev

Sounds good, One thing that immediately jumps into my eyes are the lower case class names. I think using pep-8 capital/camel case would make it look more modern. Only old fashioned packages like numpy and scipy still have the legacy spelling (when developers thought python has to look like matlab :) Josef On Fri, Jun 15, 2018 at 1:51 PM, Maurizio Cipollina <maurizio.cipollina@gmail.com> wrote:
Hi everybody,
following up on this conversation of some time ago, a pre-release of the package, renamed SdePy, is now available on pip (https://pypi.org/project/sdepy ) and as a project on github (https://github.com/sdepy/sdepy ), version 1.0.0rc3. A short quickstart guide (https://sdepy.readthedocs.io/en/latest/intro.html#id2 ) is the easiest path to see what it does and what it does not. Any feedback, comments and suggestions are very welcome!
Since last September, the package has evolved quite a bit and had its fair share of refactoring, going through several 0.x unpublished versions before stabilizing in its current form (marked as beta). Here are a few takeaways I got from this thread, which I acted upon:
- Integrating user defined SDEs turned out to be a main point of interest: this ‘odeint’ like functionality has been considerably streamlined and simplified - formerly, preset processes were mainly in focus, and getting a new process up and running took a deep dive in a not-so-welcoming API, now upgraded as well (the former ‘integrator’ has been superseded by a ‘paths_generator’ parent class, and cooperating ‘integrator’ and ‘SDE’ classes).
- The ‘kfunc’ beast was admittedly a hard sell: all undue dependencies of the core modules on its machinery have been eliminated, it plays now the role of an optional decorator for the shortcuts of frequently used stuff (it’s fully functional though and, in my experience, quite handy).
- The question “Do you REALLY need a ‘process’ class” clearly stood out in your comments: after some experimentation, the main reason for answering in the yes turned out, in my view, to be its seamless interoperability with the rest of the package, hardly achievable without a dedicated design. Now process instances can indeed be fed to integrators, both as time-dependent and/or path-dependent SDE parameters, and as stochasticity sources: this opened countless hacking opportunities when doing computations, and spoiling them seemed a pity.
In case any of you is willing to take a look, feel free to get in touch for any problems/bugs/questions you might have
Cheers Maurizio
On Mon, 11 Sep 2017 at 19:39, <josef.pktd@gmail.com> wrote:
On Mon, Sep 11, 2017 at 12:35 PM, Maurizio Cipollina <maurizio.cipollina@gmail.com> wrote:
Hi Ralf,
thanks for going through this, here are some follow up comments on your points:
On the two packages you found:
Sdeint: provides several SDE integration algorithms but no specific processes; does not cover processes with jumps; offers no explicit provisions for handling solutions in a number of paths, which is what you need since the integration output is random variables, not numbers. SDE-higham is not a package or tool, but rather a collection of recipes for a few textbook examples (no jumps either), in the form that an interactive session might take (set globals, do calculations, plot and print results).
About your questions/concerns:
ndarray subclassing: the integrator does not depend on the process class, and process realizations may be easily tweaked to return an ndarray instead of a process instance; however, a main point of the package is not just to produce SDE solutions, but to allow the user to handle them (almost) effortlessly once obtained: so one might possibly hide the process class inside the testing suite, which depends on it, but hardly get rid of it altogether. Indeed, I subclassed ndarray as safely as I could figure out, but I will surely go through the scipy-dev archives and find out what advised you against doing so. odeint and scipy.integrate: my view is that odeint churns out numbers dependant on numbers, while SDE integration churns out random variables dependant on random variables (the stochasticity sources), so there is a shift of paradigm and the two schemes do not fit smoothly one within the other. The step by step integration of SDEs makes sense, and gets useful, if you can easily control the inputs (eg. correlations among stochasticity sources fed into different SDEs), generate the output in a decent number of paths, and easily extract statistics of expressions dependent on it (hence the flow ‘get a number of paths packaged as a process instance’ -> ‘do numpy algebra with it’ -> ‘estimate probability distributions etc. using the process class methods’ -> ‘scale it up with the montecarlo class in case the needed paths do not fit into memory’). Montecarlo simulations is a big topic indeed, that goes far beyond this package; what is called the montecarlo class is in fact a tool to cumulate statistical information extracted from a number of realizations of a random variable, and is focused on making easy to interpret the output of SDE integration. Black-Scholes assists one more tests in a rather extensive testing suite, and might be painlessly dropped.
This said, I see your general feeling is that this whole subject is too specialized to fit into scipy at this stage. I suggested otherwise, in my proposal, because stochastic calculus is regarded as one of the pillars of modern probability theory, and covering it into scipy might be relevant to lots of mathematicians and physicists, both practitioners and students, that might otherwise hesitate to adopt, and build dependencies upon, a standalone package. In case you reconsider let me know, I’d be happy to help (now or in the future…).
If you have further questions, let me know as well.
Cheers
Maurizio
To partially follow up, similar to Ralph's view.
I think it would be good to have this as a separate package, or at least the code on github. With distributions like conda it is now much easier to install additional packages, once they are established enough and are included in distributions or easily pip installable.
The main reason that I think scipy is currently not the appropriate package for it is that the design and review would require a lot of work. I guess that there are not many maintainers and reviewers that are familiar with this area. I also know only of applications in finance and it is not clear whether or which parts would be of more general interest.
In my opinion, some core tools for SDE and jump diffusions would fit in scipy. But the application support will not or might not fit well in a general purpose scientific numerical library. This would be similar to other areas where scipy provides the core computational tools, but applications are supported by other packages.
For example, the process class might be similar in magnitude to scipy stats distributions or signal statespace classes, but I have no idea what design for it would fit in scipy. Some of the time aggregation properties sound similar to the corresponding pandas functions, and it is not clear whether users would want to use a separate class or just use pandas instead.
Similarly, the kfunc class sounds like something that doesn't have a similar pattern in scipy.
Design changes can be made more easily in a standalone package, while all refactorings of classes in scipy are difficult issues because of the backwards compatibility policy which requires that the design should be largely settled before a merge.
Josef
https://groups.google.com/d/msg/pystatsmodels/xsttiEiyqAg/maR6n4EeAgAJ (stackoverflow question has been deleted "Is there a library out there to calibrate an Ornstein-Uhlenbeck model?")
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/t... and diffusion2.py
On 9 September 2017 at 03:31, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Hi Maurizio,
On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina <maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python stack at large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured framework for generating and handling stochastic processes and numerically solving SDEs.
I've never used these, but a quick search turns up two packages: https://pypi.python.org/pypi/sdeint https://github.com/alu042/SDE-higham Not very mature probably, but would be good to compare your code with those.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of professional need, and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized processes stray into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is the latter, not the former, at least in my intentions.
First thought: this looks like a useful addition to the scientific Python ecosystem, but is probably too specialized for SciPy (at least at this stage).
Based on your docstring below, here are some questions/concerns: 1. We definitely don't want a new ndarray subclass (there's a pretty strong consensus at this point that subclassing ndarray is usually not a good solution), so your `process` doesn't sound attractive. 2. The `integrator` class sounds a lot like odeint, so maybe this would fit better with scipy.integrate? 3. Monte-Carlo simulation is a big and fairly technical topic. There are whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well with SciPy. 4. Things specific to finance like Black-Scholes and put/call options are not a good fit. 5. Maintainers for this added code. The ODE integrators in scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are more specialized so are likely to not be very well-maintained by other devs than you.
Without seeing your code it's hard to say for sure, but it looks to me like it would be better to release your code as a standalone package.
Cheers, Ralf
Thanks in advance for taking your time to go through this, and for your comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev

Thanks Josef, point very well taken, the look is definitely old fashioned - back from the good old times when a string wouldn’t be a String, and a list wouldn’t be a List... Maurizio On Fri, 15 Jun 2018 at 20:39, <josef.pktd@gmail.com> wrote:
Sounds good,
One thing that immediately jumps into my eyes are the lower case class names. I think using pep-8 capital/camel case would make it look more modern. Only old fashioned packages like numpy and scipy still have the legacy spelling (when developers thought python has to look like matlab :)
Josef
Hi everybody,
following up on this conversation of some time ago, a pre-release of the package, renamed SdePy, is now available on pip (https://pypi.org/project/sdepy ) and as a project on github (https://github.com/sdepy/sdepy ), version 1.0.0rc3. A short quickstart guide (https://sdepy.readthedocs.io/en/latest/intro.html#id2 ) is the easiest path to see what it does and what it does not. Any feedback, comments and suggestions are very welcome!
Since last September, the package has evolved quite a bit and had its fair share of refactoring, going through several 0.x unpublished versions before stabilizing in its current form (marked as beta). Here are a few takeaways I got from this thread, which I acted upon:
- Integrating user defined SDEs turned out to be a main point of interest: this ‘odeint’ like functionality has been considerably streamlined and simplified - formerly, preset processes were mainly in focus, and getting a new process up and running took a deep dive in a not-so-welcoming API, now upgraded as well (the former ‘integrator’ has been superseded by a ‘paths_generator’ parent class, and cooperating ‘integrator’ and ‘SDE’ classes).
- The ‘kfunc’ beast was admittedly a hard sell: all undue dependencies of the core modules on its machinery have been eliminated, it plays now the role of an optional decorator for the shortcuts of frequently used stuff (it’s fully functional though and, in my experience, quite handy).
- The question “Do you REALLY need a ‘process’ class” clearly stood out in your comments: after some experimentation, the main reason for answering in the yes turned out, in my view, to be its seamless interoperability with
rest of the package, hardly achievable without a dedicated design. Now process instances can indeed be fed to integrators, both as time-dependent and/or path-dependent SDE parameters, and as stochasticity sources: this opened countless hacking opportunities when doing computations, and spoiling them seemed a pity.
In case any of you is willing to take a look, feel free to get in touch for any problems/bugs/questions you might have
Cheers Maurizio
On Mon, 11 Sep 2017 at 19:39, <josef.pktd@gmail.com> wrote:
On Mon, Sep 11, 2017 at 12:35 PM, Maurizio Cipollina <maurizio.cipollina@gmail.com> wrote:
Hi Ralf,
thanks for going through this, here are some follow up comments on your points:
On the two packages you found:
Sdeint: provides several SDE integration algorithms but no specific processes; does not cover processes with jumps; offers no explicit provisions for handling solutions in a number of paths, which is what
you
need since the integration output is random variables, not numbers. SDE-higham is not a package or tool, but rather a collection of recipes for a few textbook examples (no jumps either), in the form that an interactive session might take (set globals, do calculations, plot and
results).
About your questions/concerns:
ndarray subclassing: the integrator does not depend on the process class, and process realizations may be easily tweaked to return an ndarray instead of a process instance; however, a main point of the package is not just to produce SDE solutions, but to allow the user to handle them (almost) effortlessly once obtained: so one might possibly hide the process class inside the testing suite, which depends on it, but hardly get rid of it altogether. Indeed, I subclassed ndarray as safely as I could figure out, but I will surely go through the scipy-dev archives and find out what advised you against doing so. odeint and scipy.integrate: my view is that odeint churns out numbers dependant on numbers, while SDE integration churns out random variables dependant on random variables (the stochasticity sources), so there is a shift of paradigm and the two schemes do not fit smoothly one within
other. The step by step integration of SDEs makes sense, and gets useful, if you can easily control the inputs (eg. correlations among stochasticity sources fed into different SDEs), generate the output in a decent number of paths, and easily extract statistics of expressions dependent on it (hence the flow ‘get a number of paths packaged as a process instance’ -> ‘do numpy algebra with it’ -> ‘estimate probability distributions etc. using the process class methods’ -> ‘scale it up with the montecarlo class in case the needed paths do not fit into memory’). Montecarlo simulations is a big topic indeed, that goes far beyond this package; what is called the montecarlo class is in fact a tool to cumulate statistical information extracted from a number of realizations of a random variable, and is focused on making easy to interpret the output of SDE integration. Black-Scholes assists one more tests in a rather extensive testing suite, and might be painlessly dropped.
This said, I see your general feeling is that this whole subject is too specialized to fit into scipy at this stage. I suggested otherwise, in my proposal, because stochastic calculus is regarded as one of the
modern probability theory, and covering it into scipy might be relevant to lots of mathematicians and physicists, both practitioners and students, that might otherwise hesitate to adopt, and build dependencies upon, a standalone package. In case you reconsider let me know, I’d be happy to help (now or in the future…).
If you have further questions, let me know as well.
Cheers
Maurizio
To partially follow up, similar to Ralph's view.
I think it would be good to have this as a separate package, or at least the code on github. With distributions like conda it is now much easier to install additional packages, once they are established enough and are included in distributions or easily pip installable.
The main reason that I think scipy is currently not the appropriate package for it is that the design and review would require a lot of work. I guess that there are not many maintainers and reviewers that are familiar with this area. I also know only of applications in finance and it is not clear whether or which parts would be of more general interest.
In my opinion, some core tools for SDE and jump diffusions would fit in scipy. But the application support will not or might not fit well in a general purpose scientific numerical library. This would be similar to other areas where scipy provides the core computational tools, but applications are supported by other packages.
For example, the process class might be similar in magnitude to scipy stats distributions or signal statespace classes, but I have no idea what design for it would fit in scipy. Some of the time aggregation properties sound similar to the corresponding pandas functions, and it is not clear whether users would want to use a separate class or just use pandas instead.
Similarly, the kfunc class sounds like something that doesn't have a similar pattern in scipy.
Design changes can be made more easily in a standalone package, while all refactorings of classes in scipy are difficult issues because of the backwards compatibility policy which requires that the design should be largely settled before a merge.
Josef
https://groups.google.com/d/msg/pystatsmodels/xsttiEiyqAg/maR6n4EeAgAJ (stackoverflow question has been deleted "Is there a library out there to calibrate an Ornstein-Uhlenbeck model?")
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/t...
and diffusion2.py
On 9 September 2017 at 03:31, Ralf Gommers <ralf.gommers@gmail.com> wrote:
Hi Maurizio,
On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina <maurizio.cipollina@gmail.com> wrote:
Hi everybody, and first of all thanks for the great job you have been doing, on which I relied *A LOT* over time.
Stochastic calculus and Stochastic Differential Equations are a significant branch of mathematics that to my understanding is underrepresented in scipy, as well as in the open source python
stack at
large. I may well be missing something, but as I can see one can find out there many recipies or collections of recipies, and a number of packages/projects skewed towards finance, but no structured
generating and handling stochastic processes and numerically solving SDEs.
I've never used these, but a quick search turns up two packages: https://pypi.python.org/pypi/sdeint https://github.com/alu042/SDE-higham Not very mature probably, but would be good to compare your code with those.
This is a pity, since scipy/numpy provide efficen implementations of all the basic ingredients needed to do the trick. Out of
and prompted by this gap, I have developed a general purpose package of which I own the copyright, and which I would be happy to release under the BSD license as part of scipy, in case there is a consensus among the scipy community that it makes sense to do so (the package has no dependencies beyond standard library, numpy and scipy, and might probably fit best as a subpackage of scipy.stats).
Before setting up a PR, I would be happy to hear your thoughts: I have pasted below the package docstring, that should give an overview of its scope and limitations. Please note that some of the realized
into mathematical finance territory, somehow inevitably since finance is one major field of application of stochastic calculus, but the focus is
On Fri, Jun 15, 2018 at 1:51 PM, Maurizio Cipollina <maurizio.cipollina@gmail.com> wrote: the print the pillars of framework for professional need, processes stray the
latter, not the former, at least in my intentions.
First thought: this looks like a useful addition to the scientific Python ecosystem, but is probably too specialized for SciPy (at least at this stage).
Based on your docstring below, here are some questions/concerns: 1. We definitely don't want a new ndarray subclass (there's a pretty strong consensus at this point that subclassing ndarray is usually not a good solution), so your `process` doesn't sound attractive. 2. The `integrator` class sounds a lot like odeint, so maybe this would fit better with scipy.integrate? 3. Monte-Carlo simulation is a big and fairly technical topic. There are whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well with SciPy. 4. Things specific to finance like Black-Scholes and put/call options are not a good fit. 5. Maintainers for this added code. The ODE integrators in scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are more specialized so are likely to not be very well-maintained by other devs than you.
Without seeing your code it's hard to say for sure, but it looks to me like it would be better to release your code as a standalone package.
Cheers, Ralf
Thanks in advance for taking your time to go through this, and for
your
comments and suggestions. Maurizio
“”” =========================================================== Stochastic - Monte Carlo simulation of stochastic processes ===========================================================
This package provides:
1. A `process` class, a subclass of `numpy.ndarray` representing a sequence of values in time, realized in one or several paths. Algebraic manipulations and ufunc computations are supported for instances that share the same timeline and comply with numpy broadcasting rules. Interpolation along the timeline is supported via callability of `process` instances. Process-specific functionalities, such as averaging and indexing along time or across paths, are delegated to process-specific methods, attributes and properties (no overriding of `numpy.ndarray` operations).
2. The `source` class and its subclasses, stochasticity sources providing numerical realizations of the differentials commonly found in stochastic differential equations (SDEs), with or without memory of formerly invoked realizations.
3. The `integrator` class, a general framework for numerical SDE integration, intended for subclassing, to assist the definition of step by step integration algorithms and the computation of numerical realizations of stochastic processes.
4. Several objects providing realizations of specific stochastic processes along a given timeline and across a requested number of paths. Realizations are obtained via explicit formulae, in case the relevant SDE has a closed form solution, or otherwise as `integrator` subclasses performing Euler-Maruyama numerical integration.
5. A `montecarlo` class, as an aid to cumulate the results of several Monte Carlo simulations of a given stochastic variable, and to extract summary estimates for its probability distribution function and statistics, thus supporting simulations across a number of paths that exceeds the maximum allowed by available memory.
6. Several analytical results relating to the supported stochastic processes, as a general reference and for testing purpouses.
For all sources and realizations, process values can take any shape, scalar or multidimensional. Correlated multivariate stochasticity sources are supported. Time-varying process parameters (correlations, intensity of Poisson processes, volatilities etc.) are allowed whenever applicable. `process` instances act as valid stochasticity source realizations (as does any callable object complying with a `source` protocol), and may be passed as a source specification when computing the realization of a given process.
Computations are fully vectorized across paths, providing an efficient infrastructure for simulating a large number of process realizations. Less so, for large number of time steps: integrating 1000 time steps across 100000 paths takes seconds, one million time steps across 100 paths takes minutes.
General infrastructure ====================== .. autosummary:: :toctree: generated/
process -- Array representation of a process (a subclass of -- numpy.ndarray). kfunc -- Base class for functions with managed keyword arguments. source -- Base class for stochasticity sources. true_source -- Base class for stochasticity sources with memory. integrator -- General framework for numerical SDE integration. montecarlo -- Summmary statistics of Monte Carlo simulations.
Stochasticity sources ===================== .. autosummary:: :toctree: generated/
wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments. symmetric_wiener -- as above, with symmetric paths (averages exactly 0). true_wiener -- dW, a source of standard Wiener process (Brownian -- motion) increments, with memory. poisson -- dN, a source of Poisson process increments. compound_poisson -- dJ, a source of compound Poisson process increments. -- (jumps distributed in time as a Poisson process, -- and in size as a given `scipy.stats` distribution). compound_poisson_rv -- Preset jump size distributions for compound Poisson -- process increments.
Stochastic process realizations =============================== .. autosummary:: :toctree: generated/
const_wiener_process -- Wiener process (Brownian motion), with -- time-independent parameters. const_lognorm_process -- Lognormal process, with time-independent -- parameters. wiener_process -- Wiener process (Brownian motion). lognorm_process -- Lognormal process. ornstein_uhlenbeck_process -- Ornstein-Uhlenbeck process (mean-reverting -- Brownian motion). hull_white_process -- F-factors Hull-White process (sum of F -- correlated mean-reverting Brownian -- motions). hull_white_1factor_process -- 1-factor Hull-White process (F=1 Hull-White -- process with F-index collapsed to a -- scalar). cox_ingersoll_ross_process -- Cox-Ingersoll-Ross mean-reverting process. full_heston_process -- Heston stochastic volatility process -- (returns both process and volatility). heston_process -- Heston stochastic volatility process -- (stores and returns process only). jump_diffusion_process -- Jump-diffusion process (lognormal process -- with compound Poisson jumps). merton_jump_diffusion_process -- Merton jump-diffusion process -- (jump-diffusion process with normal -- jump size distribution). kou_jump_diffusion_process -- Kou jump-diffusion process (jump-diffusion -- process with double exponential -- jump size distribution).
Shortcuts::
lognorm -- lognorm_process oruh -- ornstein_uhlenbeck_process hwp -- hull_white_process hw1f -- hull_white_1factor_process cir -- cox_ingersoll_ross_process heston -- heston_process mjd -- merton_jump_diffusion_process kou -- kou_jump_diffusion_process
Analytical results ================== Exact statistics for the realized stochastic processes are listed below, limited to the case of constant process parameters and with some further limitations to the parameters' domains. Function arguments are consistent with those of the corresponding processes. Suffixes `pdf`, `cdf` and `chf` stand respectively for probability distribution function, cumulative probability distribution function, and characteristic function. Black-Scholes formulae for the valuation of call and put options have been included (prefixed with `bs` below).
.. autosummary:: :toctree: generated/
wiener_mean wiener_var wiener_std wiener_pdf wiener_cdf wiener_chf
lognorm_mean lognorm_var lognorm_std lognorm_pdf lognorm_cdf lognorm_log_chf
oruh_mean oruh_var oruh_std oruh_pdf oruh_cdf
hw2f_mean hw2f_var hw2f_std hw2f_pdf hw2f_cdf
cir_mean cir_var cir_std cir_pdf
heston_log_mean heston_log_var heston_log_std heston_log_pdf heston_log_chf
mjd_log_pdf mjd_log_chf
kou_mean kou_log_pdf kou_log_chf
bsd1d2 bscall bscall_delta bsput bsput_delta
Notes ===== To the benefit of interactive and notebook sessions, and as an aid to fluently freeze or vary the plurality of parameters that define each stochastic process, all sources, process realizations and analytical results are objects with managed keywords (subclasses of `kfunc`): if the corresponding classes are instantiated with both variables (non keyword arguments) and parameters (keyword arguments), they behave as functions returning the computations' result; if called with parameters only, return an instance that stores the set parameters, and exposes the same calling pattern (call with variables and optionally with parameters to get results, call with parameters only to get a new instance storing modified parameters). Parameters that are not specified fall back to the class defaults, if calling the class, or to the instance's stored values, if calling an instance. """
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
participants (4)
-
josef.pktd@gmail.com
-
Matt Haberland
-
Maurizio Cipollina
-
Ralf Gommers