Continuous Distributions



Probability Density

For a continuous random variable, the probability of any single value is zero: $P\{X = x\} = 0$ for all $x$. Hence a probability mass function (pmf) carries no information here.

Instead, we can use the cumulative distribution function (cdf) $F(x)$. In the continuous case, $F(x) = P \{X \leq x\} = P \{X < x\}$


Probability Density Function

Probability density function (pdf, density) is the derivative of the cdf, $f(x) = F'(x)$. The distribution is called continuous if it has a density.

Then $F(x)$ is an antiderivative of the density, so the integral of the density from $a$ to $b$ equals the difference of antiderivatives, i.e.,

\[\int_a^b f(x) dx = F(b)-F(a) = P\{a < X < b\}\]

[Figure: the probability $P\{a < X < b\}$ shown as the area under the density curve between $a$ and $b$]


Example 1

The lifetime, in years, of some electronic component is a continuous random variable with the density

\[f(x) = \begin{cases} \frac{k}{x^3} & \text{ for } x \geq 1 \\ 0 &\text{ for } x < 1 \end{cases}\]

Find $k$, and compute the probability that the lifetime exceeds 5 years.
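A sketch of the solution: the density must integrate to one, which determines $k$, and the tail probability follows by integration.

\[\int_1^\infty \frac{k}{x^3}\, dx = \frac{k}{2} = 1 \;\Rightarrow\; k = 2, \qquad P\{X > 5\} = \int_5^\infty 2x^{-3}\, dx = \frac{1}{25} = 0.04\]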


PMF vs PDF

[Figure: a probability mass function (discrete case) compared with a probability density function (continuous case)]


Joint and Marginal Densities

For a vector of random variables, the joint cumulative distribution function is defined as

\[F_{(X,Y)}(x,y) = P\{X \leq x \cap Y \leq y\}\]

The joint density is the mixed derivative of the joint cdf,

\[f_{(X,Y)}(x,y) = \frac{\partial^2}{\partial x \partial y} F_{(X,Y)}(x,y)\]
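The marginal densities are recovered by integrating the joint density over the other variable:

\[f_X(x) = \int_{-\infty}^{\infty} f_{(X,Y)}(x,y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{(X,Y)}(x,y)\, dx\]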

Joint and Marginal Densities (cont.)

[Figure: joint and marginal distributions in the continuous and discrete cases]


Expectation and Variance

[Figure: expectation and variance in the continuous case]
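For reference, in the continuous case the sums in the discrete formulas are replaced by integrals against the density:

\[E(X) = \int x f(x)\, dx, \qquad Var(X) = E(X - \mu)^2 = \int (x-\mu)^2 f(x)\, dx = E(X^2) - E^2(X)\]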


Example 2

The random variable $X$ of Example 1 has density $f(x) = 2x^{-3} \text{ for } x \geq 1$

What is its expectation?
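A sketch of the computation:

\[E(X) = \int_1^\infty x \cdot 2x^{-3}\, dx = \int_1^\infty 2x^{-2}\, dx = 2 \text{ years}\]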


Uniform Distribution

The Uniform distribution has a constant density, $f(x) = \frac{1}{b-a}, a < x < b$

For any $h>0$ and $t \in [a, b-h]$, the probability $P\{ t < X < t+h \} = \int_t^{t+h} \frac{1}{b-a}dx = \frac{h}{b-a}$ depends only on the length $h$ of the interval, not on its location $t$.


Standard Uniform Distribution

The Uniform distribution with $a = 0$ and $b = 1$ is called Standard Uniform distribution.

If $X$ is a Uniform(a, b) random variable, then

\[Y = \frac{X-a}{b-a}\]

is Standard Uniform. Likewise, if $Y$ is Standard Uniform, then

\[X = a + (b - a)Y\]

is Uniform(a, b).
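As a brief illustration (not part of the original notes), the sketch below uses this transformation to turn a Standard Uniform draw into a Uniform(a, b) draw; the endpoints a and b are arbitrary example values.

    import random

    a, b = 2.0, 5.0            # illustrative endpoints (assumed values)
    y = random.random()        # Y ~ Standard Uniform(0, 1)
    x = a + (b - a) * y        # X = a + (b - a)Y ~ Uniform(a, b)
    print(x)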


Expectation and Variance

For a Standard Uniform variable $Y$,

\[E(Y) = \int yf(y) dy = \int_0^1 ydy = \frac{1}{2}\]
\[Var(Y) = E(Y^2) - E^2(Y) = \int_0^1 y^2f(y) dy - \left(\frac{1}{2}\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}\]

Expectation and Variance (cont.)

Let $X = a+(b-a)Y$, which has a Uniform(a, b) distribution. Then

\[E(X) = E\{a+(b-a)Y\} = a + \frac{b-a}{2} = \frac{a+b}{2}\]
\[Var(X) = Var\{a+(b-a)Y\} = (b-a)^2 Var(Y) = \frac{(b-a)^2}{12}\]

Exponential Distribution

The Exponential distribution is often used to model waiting times.

\[f(x) = \lambda e^{-\lambda x} \text{ for } x >0\]
\[F(x) = \int_0^x \lambda e^{-\lambda t} dt = 1 - e^{-\lambda x} (x>0)\]
\[E(X) = \int tf(t)dt = \int_0^\infty t\lambda e^{-\lambda t} dt = \frac{1}{\lambda}\]

Proof

\[Var(X) = \int t^2f(t)dt - E^2(X)= \int_0^\infty t^2\lambda e^{-\lambda t} dt - \left(\frac{1}{\lambda} \right)^2\]
\[= \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}\]


The quantity $\lambda$ is the parameter of the Exponential distribution; its meaning is clear from $E(X) = 1/\lambda$: it is the average number of events per unit of time (the rate).


Example 3

Jobs are sent to a printer at an average rate of 3 jobs per hour.
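Under this model, the waiting time $T$ until the next job is Exponential with $\lambda = 3$ per hour; as one illustrative computation (the half-hour value chosen here is arbitrary),

\[P\{T > 0.5 \text{ hours}\} = e^{-3(0.5)} = e^{-1.5} \approx 0.22\]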


Memoryless Property

Memoryless property: Exponential variables lose memory

Given that the waiting time has already exceeded $t$ (the event $T > t$), the remaining waiting time still has an Exponential distribution with the same parameter: $P\{T > t+x \mid T>t\} = P\{T>x\} \text{ for } t,x >0$
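This follows in one line from the Exponential cdf:

\[P\{T > t + x \mid T > t\} = \frac{P\{T > t+x\}}{P\{T > t\}} = \frac{e^{-\lambda(t+x)}}{e^{-\lambda t}} = e^{-\lambda x} = P\{T > x\}\]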


Gamma Distribution

When a certain procedure consists of $\alpha$ independent steps, and each step takes Exponential($\lambda$) amount of time, then the total time has Gamma distribution with parameters $\alpha$ and $\lambda$.

\[f(x) = \frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda x} \; \text{ for } x >0\]

where

\[\Gamma(t) = \int_0^\infty x^{t-1} e^{-x} dx \; \text{ for any } t >0\]
\[\text{Gamma}(1, \lambda) = \text{Exponential}(\lambda)\]
\[\text{Gamma}(\alpha, 1/2) = \text{Chi-square}(2\alpha)\]

Expectation and Variance

\[F(t) = \int_0^t f(x)dx = \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^t x^{\alpha-1} e^{-\lambda x} dx\]
\[E(X) = \int x f(x)dx = \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty x^\alpha e^{-\lambda x} dx = \frac{\lambda^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{\lambda^{\alpha+1}}\]
\[= \frac{\alpha \Gamma(\alpha)}{\Gamma(\alpha) \lambda} = \frac{\alpha}{\lambda} \quad \text{ since } \Gamma(\alpha+1) = \alpha \Gamma(\alpha)\]
\[Var(X) = E(X^2) - E^2(X) = \frac{(1+\alpha)\alpha}{\lambda^2} - \frac{\alpha^2}{\lambda^2} = \frac{\alpha}{\lambda^2}\]

Gamma-Poisson Formula

Let $T$ be a Gamma variable with an integer parameter $\alpha$ and some positive $\lambda$.

Then, the event $\{T > t\}$ means that the $\alpha$-th rare event occurs after the moment $t$.

\[P\{T > t\} = P \{X < \alpha \}\]

where $X$ has Poisson distribution with parameter $\lambda t$.
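Since $X$ takes integer values, $P\{X < \alpha\} = P\{X \leq \alpha - 1\}$. Below is a small numerical check of the formula; it is an illustrative sketch, not part of the original notes, and it assumes SciPy is available and uses arbitrary example values for $\alpha$, $\lambda$, and $t$.

    from scipy.stats import gamma, poisson

    alpha, lam, t = 5, 2.0, 3.0                 # example shape, rate, and time
    left = gamma.sf(t, a=alpha, scale=1/lam)    # P{T > t} for T ~ Gamma(alpha, lam)
    right = poisson.cdf(alpha - 1, mu=lam*t)    # P{X <= alpha-1} for X ~ Poisson(lam*t)
    print(left, right)                          # both print the same probability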


Example 4

Lifetimes of computer memory chips have Gamma distribution with expectation $\mu = 12$ years and standard deviation $\sigma = 4$ years.


Normal Distribution

Normal distribution is often found to be a good model for physical variables. It has density

\[f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp \left \{ \frac{-(x-\mu)^2}{2\sigma^2} \right \}, \; -\infty < x < +\infty\]
\[E(X) = \mu\]
\[Var(X) = \sigma^2\]

Standard Normal Distribution

Normal distribution with "standard parameters" $\mu = 0$ and $\sigma = 1$ is called Standard Normal distribution.

\[Z = \frac{X-\mu}{\sigma} \;\text{ is a Standard Normal random variable}\]
\[\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \text{(PDF)}\]
\[\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-z^2/2} dz, \text{(CDF)}\]
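Probabilities for a general Normal variable $X$ are then computed through $\Phi$ by standardizing:

\[P\{X \leq x\} = P\left\{\frac{X - \mu}{\sigma} \leq \frac{x-\mu}{\sigma}\right\} = \Phi\left(\frac{x-\mu}{\sigma}\right)\]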

Example 5

For a Standard Normal random variable $Z$, find
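As one illustrative instance (the particular probability chosen here is arbitrary),

\[P\{Z > 1\} = 1 - \Phi(1) \approx 1 - 0.8413 = 0.1587\]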


Central Limit Theorem

Let $X_1, X_2, \ldots$ be independent random variables with the same expectation $\mu = E(X_i)$ and standard deviation $\sigma = Std(X_i)$, and let

\[S_n = \sum_{i=1}^n X_i = X_1 + \cdots +X_n\]

As $n \rightarrow \infty$, the standardized sum

\[Z_n = \frac{S_n - E(S_n)}{Std(S_n)} = \frac{S_n - n\mu}{\sigma \sqrt n}\]

converges in distribution to a Standard Normal random variable, i.e., for all $z$, $F_{Z_n}(z) = P \left \{ \frac{S_n - n\mu}{\sigma \sqrt n} \leq z \right \} \rightarrow \Phi(z)$


Example 6

A disk has free space of 330 megabytes. Is it likely to be sufficient for 300 independent images, if each image has expected size of 1 megabyte with a standard deviation of 0.5 megabytes?
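A sketch of the solution via the Central Limit Theorem: the total size $S$ of the 300 images has $E(S) = (300)(1) = 300$ MB and $Std(S) = 0.5\sqrt{300} \approx 8.66$ MB, so

\[P\{S \leq 330\} \approx \Phi\left(\frac{330 - 300}{8.66}\right) = \Phi(3.46) \approx 0.9997\]

and the free space is very likely to be sufficient.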


The Normal Approximation for the Binomial

A sum of many independent 0/1 components, each equal to 1 with probability $p$ (the discrete Binomial model), is approximately Normal with

\[\mu = np, \sigma = \sqrt{ np(1-p) }\]

if we expect at least 10 successes and 10 failures:

\[np \geq 10, n(1-p) \geq 10 \text{ or } np(1-p) \geq 3\]

Example 7

Suppose the probability of finding a prize in a cereal box is 20%. If we open 50 boxes, the number of prizes found $X$ has a Binomial distribution with mean $np = 10$ and standard deviation $\sqrt{np(1-p)} = \sqrt{8} \approx 2.83$. With the continuity correction, the probability of finding exactly 10 prizes is

\[P(9.5 \leq X \leq 10.5) \approx P\left(\frac{9.5-10}{2.83} \leq Z \leq \frac{10.5-10}{2.83}\right) = P(-0.18 \leq Z \leq 0.18) \approx 0.1405\]
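As a quick check (an illustrative sketch, not part of the original notes; it assumes SciPy is available), the exact Binomial probability of exactly 10 prizes can be compared with the Normal approximation above:

    from scipy.stats import binom, norm

    n, p = 50, 0.2
    mu = n * p                                   # 10
    sigma = (n * p * (1 - p)) ** 0.5             # ~2.83
    exact = binom.pmf(10, n, p)                  # exact Binomial probability, ~0.14
    approx = norm.cdf((10.5 - mu) / sigma) - norm.cdf((9.5 - mu) / sigma)
    print(exact, approx)                         # the two values are close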