Stochastic Processes: APPENDIX A

General Stochastic Processes

Stochastic processes concern sequences of events governed by probabilistic laws (Karlin & Taylor, "A First Course in Stochastic Processes", Academic Press, 1975).
In finance and economics problems, sequences of events unfold over time, so we can think of random events evolving along time. Let us see a slightly more formal definition.

"A stochastic process X = { X(t), t belong T } is a collection of random variables. That is, for each t in the index set T, X(t) is a random variable. We often interpret t as time and call X(t) the state of the process at time t" (Ross, 1996, p.41).
The index set T can be a countable set, in which case we have a discrete-time stochastic process, or a non-countable continuous set, in which case we have a continuous-time stochastic process. Any realization of X is named a sample path, which can be discrete or continuous.
Although in most applications the index set is simply a set of time instants tk, this is not true for the case of technical uncertainty. Imagine a sequential investment in information to determine the volume of an oil reserve. In the resulting sequence of information-revelation random variables (a sequence of conditional expectations given the new information - a Doob-type martingale), the index set is a sequence of investments in information (a set of events, each event being one investment in information). That is, these are event-driven processes (they evolve only if a new investment in information is performed), not time-driven processes like most stochastic processes, which evolve with the pure passage of time. The former is a special case analyzed in a separate section.

Following Dixit & Pindyck's textbook (p.60): "A stochastic process is a variable that evolves over time in a way that is at least in part random". So, in this section a stochastic process involves both time and randomness.

For example, the temperature in Rio de Janeiro is partly deterministic (lower temperatures are expected at night and in the winter) and partly random. Up to October, the hottest day of 1997 in Rio de Janeiro had occurred in the winter!
Curiously, the same happened in 1999: up to September, the hottest day of 1999 in Rio de Janeiro was September 8 (winter in Brazil).
So, there are factors that make the temperature unpredictable.

In most cases, a stochastic variable has both an expected value term (drift term) and a random term (volatility term).
We can see the stochastic process forecast for a random variable X as a forecasted value E[X] plus a forecasting error, where the error follows some probability distribution. So:
X(t) = E[X(t)] + error(t)

The figure below presents the idea with a popular example, and introduces the concept of an increment (in this case the Wiener increment).

Below are presented some general concepts of stochastic processes.

Lévy processes are stochastic processes with stationary, independent increments that are continuous in probability.
The stationary increments property means that the probability distribution of the changes in the stochastic variable X depends only on the length of the time interval.
Independent increments means that increments over non-overlapping time intervals are independent.
The two most basic types of Lévy processes are the Wiener process and the Poisson process.

Markov processes have the following property: given that the current state is known, the probability of any future event of the process is not altered by additional knowledge concerning its past behavior.
In more formal words, the probability distribution for x(t + 1) depends only on x(t), and not additionally on what occurred before time t (it does not depend on x(s), for s < t).

An Itô process is a generalized Wiener process. A Wiener process is also a special case of a strong diffusion process, which is a particular class of continuous-time Markov process (see Merton, Continuous-Time Finance, 1990, pp. 121-122 and note 3).

A continuous-time Wiener process (also called Brownian motion) is a stochastic process with three properties:

  1. It is a Markov process: the probability distribution for all future values depends only on the current value.

  2. It has independent increments: the probability distribution for the change of the process over any time interval is independent of that over any other non-overlapping interval.

  3. Changes of the process over any finite time interval are Normally distributed, with a variance that grows linearly with the length of the time interval.

The Itô process (or Generalized Wiener Process) for the value of a project V is:

dV = a(V, t) dt + b(V, t) dz

The term "generalized" is used because the drift a(V, t) and the diffusion b(V, t) coefficients are functions of the current state and time.

The integral version of this equation is the following stochastic integral:

V(T) = V(0) + ∫₀ᵀ a(V, t) dt + ∫₀ᵀ b(V, t) dz

As an illustration of a stochastic integral, consider the following mean-reverting process (arithmetic Ornstein-Uhlenbeck process):

dx = h (m - x) dt + σ dz

Where m is the equilibrium level to which the process reverts, and h is the reversion speed. The value of the stochastic process at a future date T, that is x(T), given the starting value x(0), is:

x(T) = m + [x(0) − m] e^(−hT) + σ ∫₀ᵀ e^(−h(T − t)) dz(t)

See a discussion of this case in Bjerksund & Ekern (1993).
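The solution above can be checked by simulation. Below is a minimal sketch (Python, standard library only; the function name and parameter values are ours for illustration) that samples OU paths using the exact one-step transition:

```python
import math
import random

def simulate_ou(x0, m, h, sigma, T, n_steps, rng):
    """Exact-discretization sample path of dx = h(m - x) dt + sigma dz."""
    dt = T / n_steps
    decay = math.exp(-h * dt)
    # standard deviation of the exact transition over one step of length dt
    sd = sigma * math.sqrt((1.0 - decay**2) / (2.0 * h))
    x, path = x0, [x0]
    for _ in range(n_steps):
        x = m + (x - m) * decay + sd * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

rng = random.Random(42)
path = simulate_ou(x0=25.0, m=15.0, h=2.0, sigma=3.0, T=3.0, n_steps=300, rng=rng)
# Starting above the equilibrium level m = 15, the path is pulled back toward it:
# E[x(T)] = m + (x0 - m) e^(-hT) = 15 + 10 e^(-6), about 15.02.
```

Averaging many such terminal values converges to m + (x0 − m)e^(−hT), which is one way to sanity-check an implementation against the closed-form expectation.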

A point process is a stochastic process whose realizations are counting measures, instead of continuous sample paths.
Any counting process generated by an i.i.d. (independent and identically distributed) sum process (Tn) is called a renewal counting process.

The simplest and most fundamental point process is the Poisson process, also referred to as a jump process in the financial literature.

The Poisson process is a counting process in which the interarrival times of successive jumps are independent and identically distributed (i.i.d.) exponential random variables.
The Poisson process is an example of a renewal counting process.

The homogeneous Poisson process has the following three properties:

  1. It starts at zero. This means that at t = 0 there has been no jump; from the counting process point of view, N(0) = 0 (the number of jumps of the process N is zero at t = 0).

  2. It has independent stationary increments. A Poisson process is also a special Lévy process.

  3. For t > 0, the probability of n jump occurrences until time t is:
    P[N(t) = n] = (1/n!) (λt)^n e^(−λt) ;  n = 0, 1, 2, ...

    Which is a Poisson distribution with parameter λt.
    A curiosity: the Poisson distribution tends to the Normal distribution as the parameter λt tends to infinity. So, the Poisson distribution is asymptotically Normal.
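A small numerical sketch of this distribution (Python, standard library only; the helper names are ours): it evaluates P[N(t) = n] and compares the pmf near the mean with the Normal density N(λt, λt) for a large parameter.

```python
import math

def poisson_pmf(n, lam, t):
    """P[N(t) = n] = (1/n!) (lam*t)^n e^(-lam*t) for a homogeneous Poisson process."""
    mu = lam * t
    return (mu**n) * math.exp(-mu) / math.factorial(n)

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# the probabilities over n = 0, 1, 2, ... sum to 1
total = sum(poisson_pmf(n, lam=0.5, t=2.0) for n in range(50))

# asymptotic normality: for large lam*t, the pmf at the mean is close to N(mu, mu)
mu = 100.0
approx_error = abs(poisson_pmf(100, lam=100.0, t=1.0) - normal_pdf(100.0, mu, mu))
```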

A nonhomogeneous Poisson process is more general than the homogeneous one: the stationary increment assumption is not required (the independent increment assumption remains), and the constant arrival rate λ of the Poisson process is replaced by a time-varying intensity function λ(t).
For an introduction about nonhomogeneous Poisson processes, see E.P.C. Kao, Chapter 2 ("An Introduction to Stochastic Processes", Duxbury Press, 1997).

Compound Poisson Process:
Let Φi be a sequence of i.i.d. (independent and identically distributed) random variables. Their common probability distribution can be interpreted as the jump-size distribution. Let N(t) be a Poisson process, independent of the Φi.
The following process is called a compound Poisson process:

X(t) = Φ1 + Φ2 + ... + ΦN(t)

The sum of two independent compound Poisson processes is itself a compound Poisson process.

A jump is degenerate when the variable can only jump to a fixed value, and remains at this value. Example: the case of a "sudden death" process, in which the variable drops to zero forever (see Dixit & Pindyck, chapters 3 and 5, for some applications).

It is interesting that the combination of Poisson processes with Brownian motions is related to the Lévy process. Karlin & Taylor ("A Second Course in Stochastic Processes", Academic Press, 1981), p.432, state: "The general Lévy process can be represented as a sum of a Brownian motion, a uniform translation, and a limit (actually, an integral) of a one-parameter family of compound Poisson processes, where all the contributing basic processes are mutually independent".
In the sample paths of Lévy processes, the large increments or "jumps" are called "Lévy flights".

A general Lévy process (that is, one that is not Gaussian like the Wiener process) has been the object of recent research in financial options and real options. Examples:

See also the theory of optimal stopping for Lévy processes, closed-form solutions for perpetual options, and more in:

To download the papers above and learn more about Lévy processes, go to Prof. Mordecki's website, with papers and presentations.

Poisson-Gaussian Processes

Also named jump-diffusion processes, Poisson-Gaussian processes combine an Itô process with a Poisson process (a mix of continuous and discrete processes), and can be described by the following equation:

dV = a(V, t) dt + b(V, t) dz + V dq

Where the terms are as before, but the additional term dq is a Poisson term defined by:

dq = 0 . . . . . . . with probability 1 − λ dt

dq = Φ − 1 . . . . . . . with probability λ dt

The processes dz and dq are independent.

In the above equation, λ is the arrival rate of the jumps and Φ is drawn from the jump-size probability distribution (the jump size and direction can be random).

In case of a jump, the stochastic variable V will change from V to VΦ.

The other, equivalent (perhaps clearer) format to write this jump-diffusion equation is:

dV = a(V, t) dt + b(V, t) dz + V (Φ − 1) dq

Where the Poisson term dq is here defined by:

dq = 0 . . . . . . . with probability 1 − λ dt

dq = 1 . . . . . . . with probability λ dt

The integral version of this equation is the following stochastic integral:

V(T) = V(0) + ∫₀ᵀ a(V, s) ds + ∫₀ᵀ b(V, s) dz + Σ_{i=1..N(T)} c(V, s_i)

Where N(t) is the number of jumps until the instant t, drawn from a Poisson distribution.
In most cases:
a(V, s) = drift (= a V for the geometric Brownian case; but it could be a mean-reverting drift, which has another format)
b(V, s) = σ V
c(V, s) = V (Φ − 1)

For the case of geometric Brownian motion with Poisson jumps, the logarithmic version (v = ln V) of the stochastic equation is:

d(ln V) = (a − ½ σ²) dt + σ dz + ln(Φ) dq

The returns generated by a jump-diffusion process are not normally distributed, presenting fatter tails.
The following picture resulted from a jump-diffusion simulation for the logarithm of a price (with jumps up and jumps down):

When compared with a normal distribution (from a pure Brownian process), the above distribution presents a higher peak and fatter tails.
Even though the tails are thin (low probability density), their extension could mean large losses (or large gains) for financial or real derivatives. So, jumps can be very important.
In the picture there are some samples near 1 (prices near e¹ ≈ 2.7, very low) and samples near 4 (prices near 54), although most values are between 2.5 and 3 (prices in the range 12-20).
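The fat tails can be reproduced with a small simulation. The sketch below (Python, standard library only; all parameter values are ours, and the jump size ln(Φ) is assumed Normal with zero mean) simulates the log-price equation above and measures the excess kurtosis of the terminal log-prices; for a pure Brownian case this statistic is near zero, while jumps push it above zero:

```python
import math
import random

def terminal_log_price(v0, alpha, sigma, lam, jump_sd, T, n_steps, rng):
    """Euler path of d(lnV) = (alpha - 0.5 sigma^2) dt + sigma dz + ln(Phi) dq,
    with ln(Phi) ~ Normal(0, jump_sd^2). Returns terminal ln(V)."""
    dt = T / n_steps
    drift = (alpha - 0.5 * sigma**2) * dt
    v = math.log(v0)
    for _ in range(n_steps):
        v += drift + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if rng.random() < lam * dt:        # jump arrives with probability ~ lam*dt
            v += rng.gauss(0.0, jump_sd)   # ln(Phi) drawn at the arrival
    return v

def excess_kurtosis(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean)**2 for x in xs) / n
    m4 = sum((x - mean)**4 for x in xs) / n
    return m4 / m2**2 - 3.0    # near zero for a large Normal sample

rng = random.Random(7)
logs = [terminal_log_price(20.0, 0.05, 0.25, 0.5, 0.8, 1.0, 250, rng)
        for _ in range(4000)]
k = excess_kurtosis(logs)      # positive: fatter tails than the Normal distribution
```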

For petroleum applications, the most relevant jump-diffusion process is a combination of mean reversion (continuous process) with jumps (discrete Poisson process).
For real options applications, in general we are interested in low-frequency large jumps (rare large jumps), which can be jumps up or jumps down. The chart below presents sample paths from a mean-reversion model with low-frequency jumps. In this case, the initial price is 15 $/bbl and the long-run mean (to which the price reverts) is also 15 $/bbl (imagine a heavy oil, with a lower long-run price). The simulation is for three years ahead.

The chart above uses λ = 0.2 per year, meaning that, on average, about 20 out of 100 sample paths exhibit a jump each year. In the chart, some jumps are clearly visible; in other paths the jumps are smaller and hidden by the continuous sample paths near the long-run mean price (15 in this example). Sometimes, in the same sample path, another jump occurs a few weeks after the first one.

Less realistic for oil prices is the combination of large jumps with a high frequency of jump arrivals. It is less realistic because the idea is to associate jumps with "abnormal" rare news (very important news), whereas the continuous process (mean reversion) is associated with the "normal" news (this is Merton's idea, from a famous 1976 paper).
The chart below presents the simulated sample paths from a mean-reversion process with high-frequency jumps (λ = 1 per year).

Hence, we are looking for large jumps consistent with rare abnormal news. With this focus, the frequency of jumps is a number like one jump every five years (λ = 0.2 per year), as in the first sample-paths chart (low-frequency jumps).
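Sample paths like those described above can be generated with the following sketch (Python, standard library only; the discretization and parameter values are ours, chosen to mimic the example: initial and long-run prices of 15 $/bbl, λ = 0.2 per year, three years of weekly steps, zero-mean jumps in the log-price):

```python
import math
import random

def reversion_with_jumps_path(p0, pbar, h, sigma, lam, jump_sd, T, n_steps, rng):
    """Mean-reversion in ln(P) toward ln(pbar) plus rare Poisson jumps:
    d(lnP) = h (ln(pbar) - lnP) dt + sigma dz + jump dq, jump ~ N(0, jump_sd^2)."""
    dt = T / n_steps
    x = math.log(p0)
    xbar = math.log(pbar)
    path = [p0]
    for _ in range(n_steps):
        x += h * (xbar - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if rng.random() < lam * dt:     # lam = 0.2/year: about one jump each 5 years
            x += rng.gauss(0.0, jump_sd)
        path.append(math.exp(x))
    return path

rng = random.Random(3)
paths = [reversion_with_jumps_path(15.0, 15.0, 1.5, 0.3, 0.2, 0.5, 3.0, 156, rng)
         for _ in range(100)]
# most paths hug the long-run level 15; a few show visible jumps up or down
```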

The problems with jump-diffusion processes are parameter estimation and portfolio hedging. See a discussion in the section on Mean-Reversion with Jumps Models. Even with some practical problems, the jump-diffusion process is perhaps the best bet for most commodity-price uncertainties.

Bernoulli Jump Process: Poisson Process as Limit of Bernoulli Process

This section mainly follows the paper of Ball & Torous ("A Simplified Jump Process for Common Stock Returns", Journal of Financial and Quantitative Analysis, vol.18 (1), 1983). However, the reader can also find many books on probability and stochastic processes that discuss the construction of the Poisson process starting from a sequence of Bernoulli random variables.
The main advantage of the Ball & Torous insight is related to the empirical job of estimating the parameters of jump-diffusion processes, as we will see in this section.

The Bernoulli distribution, Be(p), is the simplest distribution in probability theory. It is a univariate discrete distribution with only two scenarios: one scenario with value zero (failure) and the other with value 1 (success). The unique parameter p (the probability of the scenario with value 1) is also the mean of this distribution. The figure below illustrates the Bernoulli distribution for p = 0.3 (or 30%).

Bernoulli distribution Be(0.3)

A typical Bernoulli distribution in petroleum is the chance factor (or probability of success) of finding petroleum when valuing a wildcat drilling investment. The Bernoulli distribution is the building block used to construct many other distributions in probability, such as the binomial, geometric, hypergeometric, negative binomial, and Poisson. The connection with the Poisson process is the object of this section.

Here we will use the Bernoulli distribution to set the occurrence (scenario X = 1) or non-occurrence (scenario X = 0) of a jump in a stochastic variable.
Let us first define the concept of a function f being o(h) ("order h"). The function f is said to be o(h) if:

lim (h → 0) f(h)/h = 0

Now, let us develop the link between the Bernoulli and Poisson distributions. Let N(t) be the number of events (jumps) arriving in a time interval of length t. Denote by n a positive integer, and consider the subdivision of the time interval (0, t) into n equal subintervals of length h, so that h = t/n. Let Xj be the number of jumps (events) that occur in subinterval j. By the stationary independent increments assumption:

N(t) = X1 + X2 + ... + Xn

This is a counting process summing n i.i.d. random variables with the properties:

    P[Xj = 0] = 1 − λh + o(h)
    P[Xj = 1] = λh + o(h)
    P[Xj ≥ 2] = o(h)

If n is a large number, h is small and the probability of more than one jump in a subinterval is negligible, so that each Xj is approximately a Bernoulli random variable with parameter λh = λt/n.
This is an alternative way to define a Poisson process, equivalent to the previous definition, as we will see below.

Being a sum of independent and identically distributed Bernoulli random variables, N has a binomial distribution. The binomial distribution is given by:

P[N(t) = k] = [n! / (k! (n − k)!)] (λt/n)^k (1 − λt/n)^(n−k)

Where k = 0, 1, 2, ..., n.
What happens if the time subintervals are very small, that is, if n tends to infinity? The binomial distribution tends to the Poisson distribution with parameter λt:

lim (n → ∞) P[N(t) = k] = (1/k!) (λt)^k e^(−λt)

For the proof of the above limit, see for example Ross ("A First Course in Probability", 5th edition, 1998, p.154).
As Ball & Torous point out, this is the standard construction of the Poisson process.

Let us now assume that the time interval t itself is very small. Consequently, N is well approximated by a Bernoulli random variable X with parameter λt.
That is, X ~ Be(λt), given by:

    P[X = 1] = λt ;  P[X = 0] = 1 − λt

The main practical feature of the Bernoulli jump process is that over a short period of time t (e.g., 1 day), either no abnormal (jump) information arrives or only one piece of relevant abnormal (jump) information arrives.
The advantage of this approach is related to the empirical analysis, as developed by Ball & Torous. Under regular conditions, parameter estimates are asymptotically unbiased, consistent, and efficient, and are obtained economically using maximum likelihood estimation.
In addition, a statistical test of the null hypothesis λ = 0 can be implemented.

Ball & Torous implemented both the method of cumulants and maximum likelihood estimation, with better results for the latter. In addition, their hypothesis tests showed the presence of jumps in most stock-return time series.

The parameters of the Poisson-Gaussian process mixing a jump process with a geometric Brownian motion (GBM) can be estimated with a Bernoulli mixture of Normal densities, as performed by Ball & Torous. Over t = 1 day, the stock-return density f(x) is in this case:

f(x) = (1 − λ) φ(x; a, σ²) + λ φ(x; a, σ² + ζ²)

Where φ(x; mean, variance) denotes the Normal density, and Ball & Torous assume a mean jump size equal to zero, a jump-size variance of ζ², and a daily volatility of the GBM equal to σ. The above density can be interpreted as a convex combination of two Normal densities, one of them with the variance augmented by the jump-size variance.
Assuming n daily stock returns x = (xi, i = 1, 2, ..., n), and denoting by γ the vector of parameters to estimate (λ, σ, ζ, a), the logarithm of the likelihood function L(x; γ) is:

ln L(x; γ) = Σ_{i=1..n} ln[ (1 − λ) φ(xi; a, σ²) + λ φ(xi; a, σ² + ζ²) ]

So, the empirical job of maximum likelihood estimation consists in maximizing the above function over the four parameters. This is performed by differentiating the above expression with respect to each of the four parameters and setting these derivatives to zero. The sufficient conditions for the existence of a maximum require that the matrix −H(x; γ) be positive definite, where H(x; γ) is formed by the second derivatives of ln L(x; γ) with respect to the parameters to be estimated.
With this procedure, Ball & Torous obtained good empirical results for stock returns.
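The core of that procedure is the log-likelihood of the Bernoulli mixture of Normals. The sketch below (Python, standard library only; the data are synthetic and all names and parameter values are ours — it evaluates ln L rather than running a full optimizer) illustrates the basis of the λ = 0 hypothesis test: the jump model attains a higher log-likelihood on data that actually contain jumps:

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(xs, lam, alpha, sigma, zeta):
    """ln L for f(x) = (1-lam) N(alpha, sigma^2) + lam N(alpha, sigma^2 + zeta^2)."""
    s2, z2 = sigma**2, zeta**2
    ll = 0.0
    for x in xs:
        mix = (1.0 - lam) * normal_pdf(x, alpha, s2) + lam * normal_pdf(x, alpha, s2 + z2)
        ll += math.log(max(mix, 1e-300))   # guard against floating-point underflow
    return ll

# synthetic daily returns: alpha = 0, sigma = 1%, jump prob lam = 5%, jump sd zeta = 4%
rng = random.Random(11)
def draw_return():
    if rng.random() < 0.05:
        return rng.gauss(0.0, math.sqrt(0.01**2 + 0.04**2))
    return rng.gauss(0.0, 0.01)

xs = [draw_return() for _ in range(5000)]
ll_true = log_likelihood(xs, 0.05, 0.0, 0.01, 0.04)
ll_nojump = log_likelihood(xs, 0.0, 0.0, 0.01, 0.04)
# ll_true > ll_nojump on this sample: evidence against the null hypothesis lam = 0
```

In a real application the four parameters would be found by maximizing ln L with a numerical optimizer, exactly the empirical job described above.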

Stochastic Topics: Itô's Lemma, Martingale, Arbitrage

Itô's Lemma is the main tool of stochastic calculus. Itô's Lemma is to stochastic calculus what the Taylor expansion is to ordinary calculus.
The following expansion illustrates the comparison and the basic idea of Itô's Lemma. Suppose an option value F(V, t) and we need the value of dF. Expanding dF as in a Taylor series, but keeping the second-order term in dV:

dF = (∂F/∂t) dt + (∂F/∂V) dV + ½ (∂²F/∂V²) (dV)²

In ordinary calculus the (dV)² term would be negligible, but here it is not, because the square of the Wiener increment is of order dt (see the proof below that (dz)² = dt).

Itô's Lemma is largely used to construct differential equations for functions (like options) of stochastic variables. Imagine, in the above example, that V is a project value (after investment) and F is a derivative of V: F is a real opportunity to invest in V (that is, F is a real option).
If you can describe the stochastic process for V, you can do the same for F by using Itô's Lemma for dF.
When building a riskless portfolio (nowadays a largely used approach, based on arbitrage), it is necessary to know the option's incremental value dF. Itô's Lemma is the way.

Another application: several books present the proof, using Itô's Lemma, that if
dV/V = a dt + σ dz
then the function v = ln(V) follows:
dv = d(lnV) = (a − ½ σ²) dt + σ dz

So, we get a process for v, but with the parameters of the other process, V.
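This drift reduction of ½σ² can be checked numerically: simulate dV/V = a dt + σ dz with a fine Euler scheme and measure the average of ln V(T). A sketch (Python, standard library only; parameter values are ours):

```python
import math
import random

def terminal_log_return(a, s, T, n_steps, rng):
    """Euler simulation of dV/V = a dt + s dz starting at V(0) = 1; returns ln V(T)."""
    dt = T / n_steps
    v = 1.0
    for _ in range(n_steps):
        v *= 1.0 + a * dt + s * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return math.log(v)

rng = random.Random(1)
a, s, T = 0.10, 0.30, 1.0
n_paths = 3000
mean_log = sum(terminal_log_return(a, s, T, 400, rng) for _ in range(n_paths)) / n_paths
ito_drift = a - 0.5 * s**2   # = 0.055 per year, lower than a = 0.10
# mean_log is close to ito_drift * T, not to a * T, as Ito's Lemma predicts
```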

The key feature of Itô's Lemma is that the square of the Wiener increment is not negligible. More than that, it is deterministic! Let us prove that (dz)² = dt:

(dz)² = (ε (dt)^0.5)² ;  where ε ~ N(0, 1), the standard Normal distribution.

(dz)² = ε² dt

But Var(ε) = 1 by definition of ε. So,

Var(ε) = 1 = E[ε²] − (E[ε])²

The last term is zero by definition of ε. So, E[ε²] = 1. Hence,

E[(dz)²] = E[ε²] dt = dt. So, the expected value of the square of the Wiener increment is dt.

Finally, let us prove that Var[(dz)²] = 0, so that the probability of a deviation of (dz)² from its expected value dt is negligible (or the expected deviation is negligible compared with the magnitude of the expected value). The simpler proof below follows the Brandão notes on the Dixit & Pindyck book, but a more general proof using Chebyshev's inequality is found in Ingersoll, 1987, p.348:

Var[(dz)²] = (dt)² Var[ε²] = 0

Because (dt)² = 0 (negligible for small dt) and Var[ε²] is a finite number (Var[ε²] = 2 for the standard Normal). So, (dz)² is deterministic and equal to its expected value dt. The proof is complete.
In general, for n > 2, we have (dz)^n = a_n (dt)^(n/2) + O{(dt)^n}, where the asymptotic order symbol O{·} collects the terms of negligible order. See Ingersoll, 1987, p.348.
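A quick numerical illustration of this result (Python, standard library only; the sample sizes are ours): the sample mean of (dz)² is close to dt, while its standard deviation is of order dt, and hence negligible relative to the √dt scale of dz itself:

```python
import math
import random
import statistics

rng = random.Random(5)

def dz_squared_stats(dt, n):
    """Mean and standard deviation of n samples of (dz)^2 = (eps*sqrt(dt))^2."""
    samples = [(rng.gauss(0.0, 1.0) * math.sqrt(dt))**2 for _ in range(n)]
    return statistics.mean(samples), statistics.stdev(samples)

dt = 1e-4
mean_dz2, sd_dz2 = dz_squared_stats(dt, 20000)
# mean_dz2 is close to dt; sd_dz2 is about sqrt(2)*dt, i.e. of order dt,
# so the deviation of (dz)^2 from dt is negligible compared with dz ~ sqrt(dt)
```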

Itô's Lemma for Poisson-Jump Processes

If the underlying asset V follows a pure Poisson jump process and we need to evaluate a derivative F(V, t), how do we apply Itô's Lemma in order to calculate E[dF]?
If a jump in the stochastic variable V occurs in the interval between t and t + dt, the variation of this derivative is:

dF = F(V(t+dt), t) − F(V(t), t)

Recall that in case of a jump, the stochastic variable V changes from V to VΦ, where Φ is drawn from the jump-size distribution (the jump size is also random). Hence the previous equation can be written:

dF = F(VΦ, t) − F(V, t)

The probability of a jump arrival in this time interval dt is given by λ dt. Hence the expected variation of the derivative F of the underlying stochastic variable V is given by:

E[dF] = λ dt E[F(VΦ, t) − F(V, t)]

For the combined discrete and continuous process, that is, the mixed jump-diffusion process, Itô's Lemma is a straightforward combination of the dF from Itô's diffusion plus the dF from the Poisson jump.
For a jump-reversion process of the price P, with dP = a(P, t) dt + b(P, t) dz + (Φ − 1) P dq and a mean-reverting drift a(P, t), Itô's Lemma gives:

dF = [ ∂F/∂t + a(P, t) ∂F/∂P + ½ b(P, t)² ∂²F/∂P² ] dt + b(P, t) (∂F/∂P) dz + [ F(PΦ, t) − F(P, t) ] dq

See also Dixit & Pindyck, eq.42, p.86 for additional considerations.


Martingale

A stochastic process X = {x0, x1, x2, x3, ...} is a martingale if, given all the information until time t (here summarized by the current value xt), the expected value of x at the future instant t + s is xt. In mathematical notation:

Et[x(t + s)] = x(t)

Where the subscript in the expectation operator Et denotes that the expectation is conditional on the information available at time t.

A martingale can be seen as a driftless stochastic process, so that E[dX] = 0.
For instance, if oil prices were a martingale, the best forecast for a future oil price would be the current oil price. Even though oil prices are not a martingale, the martingale representation is useful because we can perform changes of measure (in the probability or in the drift) in order to transform the process into a martingale. By working under a martingale measure we obtain some important theoretical and practical benefits that are useful for valuation purposes.

Being a martingale, we can apply risk-neutral valuation to this (transformed) discounted process, using the risk-free interest rate as the discount rate. This kind of transformation has wide application in derivatives pricing, for both differential-equation methods and lattice approaches like the binomial.
Using the equivalent martingale measure (also called risk-neutral probability) instead of the "real" probabilities, we can discount the binomial tree using the risk-free interest rate.
The binomial method, developed by Cox, Ross & Rubinstein (1979), is one of the most popular models in finance, thanks to its simplicity. The martingale (artificial) probability is the kernel of the binomial and other lattice approaches.
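A minimal sketch of that idea (Python, standard library only; the parameter values are ours): a Cox-Ross-Rubinstein binomial pricer for a European call that discounts the expected payoff under the risk-neutral (martingale) probability at the risk-free rate:

```python
import math

def crr_call(S0, K, r, sigma, T, n):
    """CRR binomial price of a European call: risk-neutral expectation discounted at r."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)   # risk-neutral (martingale) probability
    price = 0.0
    for j in range(n + 1):                 # j up-moves out of n steps
        prob = math.comb(n, j) * p**j * (1.0 - p)**(n - j)
        payoff = max(S0 * u**j * d**(n - j) - K, 0.0)
        price += prob * payoff
    return math.exp(-r * T) * price

price = crr_call(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, n=500)
# converges to the Black-Scholes value, about 10.45 for these inputs
```

Note that p is not an estimate of the "real" probability of an up-move; it is the artificial probability that makes the discounted price a martingale.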

Two fundamental theorems of asset pricing were developed in Harrison & Kreps (1979) and in Harrison & Pliska (1981). These theorems are:

  1. The existence of a risk-neutral measure, named the equivalent martingale measure, is equivalent to the absence of arbitrage; and
  2. The uniqueness of the martingale measure is equivalent to market completeness.

For the arbitrage concept, see below.
In the case of incomplete markets, there is more than one equivalent martingale measure, so for incomplete markets it is necessary to choose a martingale measure among several possibilities. Some researchers select the equivalent martingale measure in the incomplete-markets framework through a more detailed description of the equilibrium and/or by building a utility function for the representative investor. See Duffie (1996) for a sophisticated, in-depth treatment.

Battig & Jarrow ("The Second Fundamental Theorem of Asset Pricing: A New Approach", Review of Financial Studies, vol.12, n.5, Winter 1999) developed a new approach to market completeness (the second theorem) that doesn't rely on the arbitrage and martingale concepts.
They argue that "For economies involving an infinite number of assets with discontinuous sample paths, the first fundamental theorem has not yet been extended, and the second fundamental theorem fails". In addition, they claim that in practice "...arbitrage opportunities are often sought in complete markets. This consideration is impossible under the existing definitions". They contest the notion that complete markets must be arbitrage free.
For Battig & Jarrow, "the market is complete if and only if a (suitably defined) valuation operator is unique".

Anyway, the concepts of arbitrage and martingale are two of the most important concepts of asset pricing and financial modelling.


"Arbitrage is any trading strategy requiring no cash input that has some probability of making profits, without any risk of a loss" (Jarrow & Turnbull, "Derivatives Securities", South-Western College Publishing, 1996, p.33).
Other more formal definition: "an arbitrage opportunity is a consumption plan that is nonnegative always and strictly positive in at least one event, and has a nonpositive initial cost" (Huang & Litzemberger, "Foundations for Financial Economics", Elsevier Science Pub. Co., 1988, p.226).
The definitions are equivalent.

Imagine you have no money. Someone lends you money and you buy a portfolio of securities, so that in the worst scenario you end up as you started (zero cash). In the other scenarios, this strategy generates some profit. This trading strategy is an arbitrage opportunity.
But everybody wishes to earn money without risk and without putting up their own money (a "money machine"). So, if this opportunity exists for some time, the actions of investors (the arbitrageurs) entering this underpriced portfolio will press the price levels, so that prices adjust until arbitrage is no longer possible.

Prices that do not allow arbitrage opportunities are in equilibrium. So, arbitrage is related to the concept of market equilibrium.

Numerical example of an arbitrage opportunity:
Imagine that shares of Petrobras (Petro ON) are selling on Bovespa (a Brazilian stock exchange) for R$ 230, but in New York they are selling for US$ 120. If the current exchange rate is R$ 2 = US$ 1, an investor, ignoring transaction costs, makes a riskless profit with the simultaneous orders: buy 1000 shares of Petro ON in Brazil and sell 1000 shares of Petro ON in New York.
Riskless profit = 1000 x (120 − 230/2) = US$ 5,000
Transaction costs generally eliminate the profit for small traders. Large banks, with small transaction costs, are the natural candidates to earn some money searching for arbitrage opportunities. In this example, the investor holds a riskless portfolio (buying and selling the same stock in different markets).
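The arithmetic above, extended with a hypothetical proportional transaction cost per leg (the cost_rate parameter is our illustrative assumption), shows how costs can erase the gross profit:

```python
def cross_listing_arbitrage(shares, price_brl, price_usd, fx_brl_per_usd, cost_rate=0.0):
    """Riskless profit in US$ from buying in Brazil and selling in New York,
    net of proportional transaction costs charged on both legs."""
    buy_usd = shares * price_brl / fx_brl_per_usd    # cost of the Bovespa leg in US$
    sell_usd = shares * price_usd                    # proceeds of the New York leg
    costs = cost_rate * (buy_usd + sell_usd)
    return sell_usd - buy_usd - costs

gross = cross_listing_arbitrage(1000, 230.0, 120.0, 2.0)               # US$ 5000
net = cross_listing_arbitrage(1000, 230.0, 120.0, 2.0, cost_rate=0.03)
# a 3% cost on each leg turns the US$ 5000 gross profit into a loss
```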

But more important is the concept of price equilibrium: we can estimate the prices of derivatives of traded assets, reaching a fair price (of a real option, for example) by arbitrage considerations. In this case, the asset (the real option) is priced by arbitrage.
Typically, in the real options context the riskless portfolio consists of one unit of the project value (or of the commodity produced by the project) and n units of F, the value of the option to invest in the project, n being known as the delta hedge. The value of n is chosen in order to make the portfolio riskless.

In the context of real options, see a nice example of arbitrage in real estate in the book (p.421) of Grinblatt & Titman: "Financial Markets and Corporate Strategy", McGraw-Hill Co. Inc., 1998.

For more on arbitrage, see Varian, H.R. (1987): "The Arbitrage Principle in Financial Economics", Journal of Economic Perspectives, vol.1, n.2, Fall 1987, pp.55-72.

Go to the items 2 and 3, Mean-Reversion Models

Go to the item 4, Mean-Reversion with Jumps Models

Back to the Stochastic Processes Menu

Back to Contents