Lecture 1. Transformation of Random Variables
Suppose we are given a random variable X with density fX(x). We apply a function g
to produce a random variable Y = g(X). We can think of X as the input to a black
box, and Y the output. We wish to find the density or distribution function of Y . We
illustrate the technique for the example in Figure 1.1.
[Figure 1.1: the density f_X(x), equal to 1/2 on [−1, 0] and to (1/2)e^{−x} for x ≥ 0, together with the transformation Y = X²; the interval [−√y, √y] is marked on the x-axis.]
The distribution function method finds FY directly, and then fY by differentiation.
We have F_Y(y) = 0 for y < 0. If y ≥ 0, then P{Y ≤ y} = P{−√y ≤ X ≤ √y}.
Case 1. 0 ≤ y ≤ 1 (Figure 1.2). Then
$$F_Y(y) = \frac{1}{2}\sqrt{y} + \int_0^{\sqrt{y}} \frac{1}{2}e^{-x}\,dx = \frac{1}{2}\sqrt{y} + \frac{1}{2}\left(1 - e^{-\sqrt{y}}\right).$$
[Figure 1.2: the density f_X with the interval [−√y, √y] marked for 0 ≤ y ≤ 1, so that −√y ≥ −1.]
Case 2. y > 1 (Figure 1.3). Then
$$F_Y(y) = \frac{1}{2} + \int_0^{\sqrt{y}} \frac{1}{2}e^{-x}\,dx = \frac{1}{2} + \frac{1}{2}\left(1 - e^{-\sqrt{y}}\right).$$
The density of Y is 0 for y < 0 and
[Figure 1.3: the density f_X with the interval [−√y, √y] marked for y > 1, so that −√y < −1.]
$$f_Y(y) = \frac{1}{4\sqrt{y}}\left(1 + e^{-\sqrt{y}}\right),\quad 0 < y < 1;\qquad f_Y(y) = \frac{1}{4\sqrt{y}}\,e^{-\sqrt{y}},\quad y > 1.$$
See Figure 1.4 for a sketch of fY and FY . (You can take fY (y) to be anything you like at
y = 1 because {Y = 1} has probability zero.)
[Figure 1.4: sketches of f_Y(y) and F_Y(y), with F_Y(y) = (1/2)√y + (1/2)(1 − e^{−√y}) for 0 ≤ y ≤ 1 and F_Y(y) = 1/2 + (1/2)(1 − e^{−√y}) for y ≥ 1.]
The density function method finds fY directly, and then FY by integration; see
Figure 1.5. We have f_Y(y)|dy| = f_X(√y) dx + f_X(−√y) dx; we write |dy| because probabilities are never negative. Thus
$$f_Y(y) = \frac{f_X(\sqrt{y})}{|dy/dx|_{x=\sqrt{y}}} + \frac{f_X(-\sqrt{y})}{|dy/dx|_{x=-\sqrt{y}}}$$
with y = x², dy/dx = 2x, so
$$f_Y(y) = \frac{f_X(\sqrt{y})}{2\sqrt{y}} + \frac{f_X(-\sqrt{y})}{2\sqrt{y}}.$$
(Note that |−2√y| = 2√y.) We have f_Y(y) = 0 for y < 0, and:
Case 1. 0 < y < 1 (see Figure 1.2).
$$f_Y(y) = \frac{(1/2)e^{-\sqrt{y}}}{2\sqrt{y}} + \frac{1/2}{2\sqrt{y}} = \frac{1}{4\sqrt{y}}\left(1 + e^{-\sqrt{y}}\right).$$
Case 2. y > 1 (see Figure 1.3).
$$f_Y(y) = \frac{(1/2)e^{-\sqrt{y}}}{2\sqrt{y}} + 0 = \frac{1}{4\sqrt{y}}\,e^{-\sqrt{y}}$$
as before.
[Figure 1.5: the curve Y = X²; a small interval near y on the Y-axis corresponds to two small intervals on the X-axis, near √y and −√y.]
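As a numerical illustration of these formulas, the following minimal Python sketch samples X from the density of Figure 1.1 (uniform weight 1/2 on [−1, 0] plus the tail (1/2)e^{−x} for x ≥ 0, the density used in Cases 1 and 2), forms Y = X², and compares an empirical histogram with the piecewise expression for f_Y derived above. The sample size and check points are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Draw X: with probability 1/2 uniform on [-1, 0], with probability 1/2
# exponential(1) on [0, infinity), matching the density (1/2)e^(-x) there.
mix = rng.random(n) < 0.5
x = np.where(mix, rng.uniform(-1.0, 0.0, n), rng.exponential(1.0, n))
y = x ** 2

def f_Y(v):
    # Piecewise density derived in the text.
    s = np.sqrt(v)
    return np.where(v < 1, (1 + np.exp(-s)) / (4 * s), np.exp(-s) / (4 * s))

h = 0.05
for p in (0.25, 0.81, 2.0, 4.0):
    emp = np.mean((y > p - h) & (y < p + h)) / (2 * h)
    print(f"y = {p}: histogram {emp:.3f}, formula {float(f_Y(p)):.3f}")
```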
The distribution function method generalizes to situations where we have a single output but more than one input. For example, let X and Y be independent, each uniformly
distributed on [0, 1]. The distribution function of Z = X + Y is
$$F_Z(z) = P\{X + Y \le z\} = \iint_{x+y\le z} f_{XY}(x,y)\,dx\,dy$$
with fXY (x, y) = fX(x)fY (y) by independence. Now FZ(z) = 0 for z < 0 and FZ(z) = 1
for z > 2 (because 0 ≤ Z ≤ 2).
Case 1. If 0 ≤ z ≤ 1, then F_Z(z) is the shaded area in Figure 1.6, which is z²/2.
Case 2. If 1 ≤ z ≤ 2, then F_Z(z) is the shaded area in Figure 1.7, which is 1 − [(2 − z)²/2].
Thus (see Figure 1.8)
$$f_Z(z) = \begin{cases} z, & 0 \le z \le 1 \\ 2 - z, & 1 \le z \le 2 \\ 0 & \text{elsewhere.} \end{cases}$$
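A quick simulation illustrates the triangular shape; the sample size and check points below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 500_000)
y = rng.uniform(0, 1, 500_000)
z = x + y

h = 0.02
for p in (0.25, 0.75, 1.0, 1.5, 1.9):
    emp = np.mean((z > p - h) & (z < p + h)) / (2 * h)
    exact = p if p <= 1 else 2 - p          # triangular density above
    print(f"z = {p}: histogram {emp:.3f}, exact {exact:.3f}")
```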
Problems
1. Let X, Y, Z be independent, identically distributed (from now on, abbreviated iid)
random variables, each with density f(x) = 6x⁵ for 0 ≤ x ≤ 1, and 0 elsewhere. Find
the distribution and density functions of the maximum of X, Y and Z.
2. Let X and Y be independent, each with density e^{−x}, x ≥ 0. Find the distribution (from
now on, an abbreviation for “Find the distribution or density function”) of Z = Y/X.
3. A discrete random variable X takes values x1, . . . , xn, each with probability 1/n. Let
Y = g(X) where g is an arbitrary real-valued function. Express the probability function
of Y (pY (y) = P{Y = y}) in terms of g and the xi.
[Figures 1.6 and 1.7: the unit square with the region {x + y ≤ z} shaded; for 0 ≤ z ≤ 1 the shaded region is a triangle with legs z, and for 1 ≤ z ≤ 2 the unshaded corner is a triangle with legs 2 − z.]
[Figure 1.8: the triangular density f_Z(z), rising from 0 at z = 0 to 1 at z = 1 and falling back to 0 at z = 2.]
4. A random variable X has density f(x) = ax² on the interval [0, b]. Find the density of Y = X³.
5. The Cauchy density is given by f(y) = 1/[π(1 + y²)] for all real y. Show that one way
to produce this density is to take the tangent of a random variable X that is uniformly
distributed between −π/2 and π/2.
Lecture 2. Jacobians
We need this idea to generalize the density function method to problems where there are
k inputs and k outputs, with k ≥ 2. However, if there are k inputs and j < k outputs,
often extra outputs can be introduced, as we will see later in the lecture.
2.1 The Setup
Let X = X(U, V ), Y = Y (U, V ). Assume a one-to-one transformation, so that we can
solve for U and V . Thus U = U(X, Y ), V = V (X, Y ). Look at Figure 2.1. If u changes
by du then x changes by (∂x/∂u) du and y changes by (∂y/∂u) du. Similarly, if v changes
by dv then x changes by (∂x/∂v) dv and y changes by (∂y/∂v) dv. The small rectangle in
the u − v plane corresponds to a small parallelogram in the x − y plane (Figure 2.2), with
A = (∂x/∂u, ∂y/∂u, 0) du and B = (∂x/∂v, ∂y/∂v, 0) dv. The area of the parallelogram
is |A × B| and
$$A \times B = \begin{vmatrix} \mathbf{I} & \mathbf{J} & \mathbf{K} \\ \partial x/\partial u & \partial y/\partial u & 0 \\ \partial x/\partial v & \partial y/\partial v & 0 \end{vmatrix}\,du\,dv = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{vmatrix}\,du\,dv\;\mathbf{K}.$$
(A determinant is unchanged if we transpose the matrix, i.e., interchange rows and
columns.)
[Figure 2.1: a small rectangle R with sides du and dv in the u-v plane; a change du moves x by (∂x/∂u) du and y by (∂y/∂u) du. Figure 2.2: the corresponding parallelogram S in the x-y plane, spanned by the vectors A and B.]
2.2 Definition and Discussion
The Jacobian of the transformation is
$$J = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{vmatrix},\quad\text{written as}\quad \frac{\partial(x,y)}{\partial(u,v)}.$$
Thus |A × B| = |J| du dv. Now P{(X, Y) ∈ S} = P{(U, V) ∈ R}; in other words, f_{XY}(x, y) times the area of S is f_{UV}(u, v) times the area of R. Thus
$$f_{XY}(x,y)\,|J|\,du\,dv = f_{UV}(u,v)\,du\,dv$$
and
$$f_{UV}(u,v) = f_{XY}(x,y)\,\left|\frac{\partial(x,y)}{\partial(u,v)}\right|.$$
The absolute value of the Jacobian ∂(x, y)/∂(u, v) gives a magnification factor for area
in going from u − v coordinates to x − y coordinates. The magnification factor going the
other way is |∂(u, v)/∂(x, y)|. But the magnification factor from u − v to u − v is 1, so
$$f_{UV}(u,v) = \frac{f_{XY}(x,y)}{|\partial(u,v)/\partial(x,y)|}.$$
In this formula, we must substitute x = x(u, v), y = y(u, v) to express the final result in
terms of u and v.
In three dimensions, a small rectangular box with volume du dv dw corresponds to a
parallelepiped in xyz space, determined by vectors
$$A = \left(\frac{\partial x}{\partial u},\frac{\partial y}{\partial u},\frac{\partial z}{\partial u}\right)du,\qquad B = \left(\frac{\partial x}{\partial v},\frac{\partial y}{\partial v},\frac{\partial z}{\partial v}\right)dv,\qquad C = \left(\frac{\partial x}{\partial w},\frac{\partial y}{\partial w},\frac{\partial z}{\partial w}\right)dw.$$
The volume of the parallelepiped is the absolute value of the dot product of A with B×C,
and the dot product can be written as a determinant with rows (or columns) A, B, C. This
determinant is the Jacobian of x, y, z with respect to u, v, w [written ∂(x, y, z)/∂(u, v, w)],
times du dv dw. The volume magnification from uvw to xyz space is |∂(x, y, z)/∂(u, v, w)|
and we have
$$f_{UVW}(u,v,w) = \frac{f_{XYZ}(x,y,z)}{|\partial(u,v,w)/\partial(x,y,z)|}$$
with x = x(u, v, w), y = y(u, v, w), z = z(u, v, w).
The Jacobian technique extends to higher dimensions. The transformation formula is a natural generalization of the two- and three-dimensional cases:
$$f_{Y_1\cdots Y_n}(y_1,\dots,y_n) = \frac{f_{X_1\cdots X_n}(x_1,\dots,x_n)}{|\partial(y_1,\dots,y_n)/\partial(x_1,\dots,x_n)|}$$
where
$$\frac{\partial(y_1,\dots,y_n)}{\partial(x_1,\dots,x_n)} = \begin{vmatrix} \partial y_1/\partial x_1 & \cdots & \partial y_1/\partial x_n \\ \vdots & & \vdots \\ \partial y_n/\partial x_1 & \cdots & \partial y_n/\partial x_n \end{vmatrix}.$$
To help you remember the formula, think f(y) dy = f(x)dx.
2.3 A Typical Application
Let X and Y be independent, positive random variables with densities fX and fY , and let
Z = XY . We find the density of Z by introducing a new random variable W, as follows:
Z = XY, W = Y
(W = X would be equally good). The transformation is one-to-one because we can solve
for X, Y in terms of Z, W by X = Z/W, Y = W. In a problem of this type, we must always
pay attention to the range of the variables: x > 0, y > 0 is equivalent to z > 0, w > 0.
Now
$$f_{ZW}(z,w) = \frac{f_{XY}(x,y)}{|\partial(z,w)/\partial(x,y)|}\bigg|_{x=z/w,\;y=w}$$
with
$$\frac{\partial(z,w)}{\partial(x,y)} = \begin{vmatrix} \partial z/\partial x & \partial z/\partial y \\ \partial w/\partial x & \partial w/\partial y \end{vmatrix} = \begin{vmatrix} y & x \\ 0 & 1 \end{vmatrix} = y.$$
Thus
$$f_{ZW}(z,w) = \frac{f_X(x)f_Y(y)}{w} = \frac{f_X(z/w)f_Y(w)}{w}$$
and we are left with the problem of finding the marginal density from a joint density:
$$f_Z(z) = \int_{-\infty}^{\infty} f_{ZW}(z,w)\,dw = \int_0^{\infty} \frac{1}{w}\,f_X(z/w)f_Y(w)\,dw.$$
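To see the marginal-integral step in action, here is a minimal sketch that assumes, purely for illustration, that X and Y both have the exponential density e^{−x}, x > 0 (the text leaves f_X and f_Y general). It evaluates the integral numerically and compares it with a histogram of simulated products Z = XY.

```python
import numpy as np
from scipy.integrate import quad

# Illustrative choice of densities: X, Y ~ exponential(1).
f_X = lambda x: np.exp(-x)
f_Y = lambda y: np.exp(-y)

def f_Z(z):
    # Marginal formula from the text: integrate (1/w) f_X(z/w) f_Y(w) over w > 0.
    val, _ = quad(lambda w: f_X(z / w) * f_Y(w) / w, 0, np.inf)
    return val

rng = np.random.default_rng(2)
z_samples = rng.exponential(1.0, 300_000) * rng.exponential(1.0, 300_000)

h = 0.02
for p in (0.2, 0.5, 1.0, 2.0):
    emp = np.mean((z_samples > p - h) & (z_samples < p + h)) / (2 * h)
    print(f"z = {p}: histogram {emp:.3f}, integral {f_Z(p):.3f}")
```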
Problems
1. The joint density of two random variables X1 and X2 is f(x1, x2) = 2e^{−x1}e^{−x2}, where 0 < x1 < x2 < ∞; f(x1, x2) = 0 elsewhere. Consider the transformation
Y1 = 2X1, Y2 = X2 − X1. Find the joint density of Y1 and Y2, and conclude that Y1
and Y2 are independent.
2. Repeat Problem 1 with the following new data. The joint density is given by f(x1, x2) =
8x1x2, 0 < x1 < x2 < 1; f(x1, x2) = 0 elsewhere; Y1 = X1/X2, Y2 = X2.
3. Repeat Problem 1 with the following new data. We now have three iid random variables
Xi, i = 1, 2, 3, each with density e^{−x}, x > 0. The transformation equations are given
by Y1 = X1/(X1 + X2), Y2 = (X1 + X2)/(X1 + X2 + X3), Y3 = X1 + X2 + X3. As
before, find the joint density of the Yi and show that Y1, Y2 and Y3 are independent.
Comments on the Problem Set
In Problem 3, notice that Y1Y2Y3 = X1, Y2Y3 = X1+X2, so X2 = Y2Y3−Y1Y2Y3, X3 =
(X1 + X2 + X3) − (X1 + X2) = Y3 − Y2Y3.
If fXY (x, y) = g(x)h(y) for all x, y, then X and Y are independent, because
$$f(y\mid x) = \frac{f_{XY}(x,y)}{f_X(x)} = \frac{g(x)h(y)}{g(x)\displaystyle\int_{-\infty}^{\infty} h(y)\,dy}$$
which does not depend on x. The set of points where g(x) = 0 (equivalently fX(x) = 0)
can be ignored because it has probability zero. It is important to realize that in this
argument, “for all x, y” means that x and y must be allowed to vary independently of each
other, so the set of possible x and y must be of the rectangular form a < x < b, c < y < d.
(The constants a, b, c, d can be infinite.) For example, if f_{XY}(x, y) = 2e^{−x}e^{−y}, 0 < y < x,
and 0 elsewhere, then X and Y are not independent. Knowing x forces 0 < y < x, so the
conditional distribution of Y given X = x certainly depends on x. Note that fXY (x, y)
is not a function of x alone times a function of y alone. We have
$$f_{XY}(x, y) = 2e^{-x}e^{-y}\,I[0 < y < x]$$
where the indicator I is 1 for 0 < y < x and 0 elsewhere.
In Jacobian problems, pay close attention to the range of the variables. For example, in
Problem 1 we have y1 = 2x1, y2 = x2 − x1, so x1 = y1/2, x2 = (y1/2) + y2. From these
equations it follows that 0 < x1 < x2 < ∞ is equivalent to y1 > 0, y2 > 0.
Lecture 3. Moment-Generating Functions
3.1 Definition
The moment-generating function of a random variable X is defined by
$$M(t) = M_X(t) = E[e^{tX}]$$
where t is a real number. To see the reason for the terminology, note that M(t) is the expectation of 1 + tX + t²X²/2! + t³X³/3! + ···. If µ_n = E(X^n), the n-th moment of X, and we can take the expectation term by term, then
$$M(t) = 1 + \mu_1 t + \frac{\mu_2 t^2}{2!} + \cdots + \frac{\mu_n t^n}{n!} + \cdots.$$
Since the coefficient of t^n in the Taylor expansion is M^{(n)}(0)/n!, where M^{(n)} is the n-th derivative of M, we have µ_n = M^{(n)}(0).
3.2 The Key Theorem
If Y = Σ_{i=1}^{n} X_i where X_1, . . . , X_n are independent, then M_Y(t) = ∏_{i=1}^{n} M_{X_i}(t).
Proof. First note that if X and Y are independent, then
$$E[g(X)h(Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x)h(y)f_{XY}(x,y)\,dx\,dy.$$
Since f_{XY}(x, y) = f_X(x)f_Y(y), the double integral becomes
$$\int_{-\infty}^{\infty} g(x)f_X(x)\,dx \int_{-\infty}^{\infty} h(y)f_Y(y)\,dy = E[g(X)]\,E[h(Y)]$$
and similarly for more than two random variables. Now if Y = X1 + · · · + Xn with the
Xi’s independent, we have
$$M_Y(t) = E[e^{tY}] = E[e^{tX_1}\cdots e^{tX_n}] = E[e^{tX_1}]\cdots E[e^{tX_n}] = M_{X_1}(t)\cdots M_{X_n}(t).$$ ♣
3.3 The Main Application
Given independent random variables X1, . . . , Xn with densities f1, . . . , fn respectively,
find the density of Y = Σ_{i=1}^{n} X_i.
Step 1. Compute Mi(t), the moment-generating function of Xi, for each i.
Step 2. Compute M_Y(t) = ∏_{i=1}^{n} M_i(t).
Step 3. From MY (t) find fY (y).
This technique is known as a transform method. Notice that the moment-generating function and the density of a random variable are related by M(t) = ∫_{−∞}^{∞} e^{tx} f(x) dx.
With t replaced by −s we have a Laplace transform, and with t replaced by it we have a
Fourier transform. The strategy works because at step 3, the moment-generating function
determines the density uniquely. (This is a theorem from Laplace or Fourier transform
theory.)
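The identity behind Step 2, M_Y(t) = ∏ M_i(t), can also be checked by simulation. The sketch below assumes, as an arbitrary example, that X1 and X2 are exponential(1), so the exact answer is the gamma moment-generating function (1 − t)^{−2} that appears in Section 3.7.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000
x1 = rng.exponential(1.0, n)      # illustrative choice: X1, X2 ~ exponential(1)
x2 = rng.exponential(1.0, n)
y = x1 + x2

for t in (0.1, 0.3, 0.5):         # t < 1 so that E[e^(tX)] exists for exponential(1)
    m_y = np.mean(np.exp(t * y))
    m_prod = np.mean(np.exp(t * x1)) * np.mean(np.exp(t * x2))
    exact = 1.0 / (1.0 - t) ** 2  # (1 - t)^(-2), the product of the two exponential MGFs
    print(f"t = {t}: E[e^(tY)] {m_y:.4f}, product {m_prod:.4f}, exact {exact:.4f}")
```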
3.4 Examples
1. Bernoulli Trials. Let X be the number of successes in n trials with probability of
success p on a given trial. Then X = X1 + · · · + Xn, where Xi = 1 if there is a success on
trial i and Xi = 0 if there is a failure on trial i. Thus
$$M_i(t) = E[e^{tX_i}] = P\{X_i = 1\}e^{t\cdot 1} + P\{X_i = 0\}e^{t\cdot 0} = pe^t + q$$
with p + q = 1. The moment-generating function of X is
$$M_X(t) = (pe^t + q)^n = \sum_{k=0}^{n}\binom{n}{k}p^k q^{n-k}e^{tk}.$$
This could have been derived directly:
$$M_X(t) = E[e^{tX}] = \sum_{k=0}^{n} P\{X=k\}e^{tk} = \sum_{k=0}^{n}\binom{n}{k}p^k q^{n-k}e^{tk} = (pe^t + q)^n$$
by the binomial theorem.
2. Poisson. We have P{X = k} = e^{−λ}λ^k/k!, k = 0, 1, 2, . . . . Thus
$$M(t) = \sum_{k=0}^{\infty}\frac{e^{-\lambda}\lambda^k}{k!}\,e^{tk} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{(\lambda e^t)^k}{k!} = \exp(-\lambda)\exp(\lambda e^t) = \exp[\lambda(e^t - 1)].$$
We can compute the mean and variance from the moment-generating function:
$$E(X) = M'(0) = \left[\exp(\lambda(e^t - 1))\,\lambda e^t\right]_{t=0} = \lambda.$$
Let h(λ, t) = exp[λ(e^t − 1)]. Then
$$E(X^2) = M''(0) = \left[h(\lambda,t)\lambda e^t + \lambda e^t\, h(\lambda,t)\lambda e^t\right]_{t=0} = \lambda + \lambda^2,$$
hence
$$\operatorname{Var} X = E(X^2) - [E(X)]^2 = \lambda + \lambda^2 - \lambda^2 = \lambda.$$
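The same moments can be recovered symbolically; a minimal sympy sketch (the variable names are arbitrary) differentiates M(t) = exp[λ(e^t − 1)] and confirms E(X) = λ and Var X = λ.

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = sp.exp(lam * (sp.exp(t) - 1))          # Poisson moment-generating function

EX = sp.diff(M, t).subs(t, 0)              # M'(0)
EX2 = sp.diff(M, t, 2).subs(t, 0)          # M''(0)
print(sp.simplify(EX))                     # lam
print(sp.simplify(EX2 - EX**2))            # lam  (the variance)
```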
3. Normal(0,1). The moment-generating function is
$$M(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx.$$
Now −(x²/2) + tx = −(1/2)(x² − 2tx + t² − t²) = −(1/2)(x − t)² + (1/2)t², so
$$M(t) = e^{t^2/2}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\exp[-(x-t)^2/2]\,dx.$$
The integral is the area under a normal density (mean t, variance 1), which is 1. Consequently,
$$M(t) = e^{t^2/2}.$$
4. Normal(µ, σ²). If X is normal(µ, σ²), then Y = (X − µ)/σ is normal(0,1). This is a good application of the density function method from Lecture 1:
$$f_Y(y) = \frac{f_X(x)}{|dy/dx|}\bigg|_{x=\mu+\sigma y} = \sigma\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-y^2/2}.$$
We have X = µ + σY , so
$$M_X(t) = E[e^{tX}] = e^{t\mu}E[e^{t\sigma Y}] = e^{t\mu}M_Y(t\sigma).$$
Thus
$$M_X(t) = e^{t\mu}e^{t^2\sigma^2/2}.$$
Remember this technique, which is especially useful when Y = aX + b and the moment-
generating function of X is known.
3.5 Theorem
If X is normal(µ, σ²) and Y = aX + b, then Y is normal(aµ + b, a²σ²).
Proof. We compute
$$M_Y(t) = E[e^{tY}] = E[e^{t(aX+b)}] = e^{bt}M_X(at) = e^{bt}e^{at\mu}e^{a^2t^2\sigma^2/2}.$$
Thus
$$M_Y(t) = \exp[t(a\mu + b)]\exp(t^2a^2\sigma^2/2).$$ ♣
Here is another basic result.
3.6 Theorem
Let X1, . . . , Xn be independent, with Xi normal(µi, σi²). Then Y = Σ_{i=1}^{n} Xi is normal with mean µ = Σ_{i=1}^{n} µi and variance σ² = Σ_{i=1}^{n} σi².
Proof. The moment-generating function of Y is
$$M_Y(t) = \prod_{i=1}^{n}\exp(t\mu_i + t^2\sigma_i^2/2) = \exp(t\mu + t^2\sigma^2/2).$$ ♣
A similar argument works for the Poisson distribution; see Problem 4.
3.7 The Gamma Distribution
First, we define the gamma function Γ(α) = ∫_0^∞ y^{α−1} e^{−y} dy, α > 0. We need three properties:
(a) Γ(α + 1) = αΓ(α), the recursion formula;
(b) Γ(n + 1) = n!, n = 0, 1, 2, . . . ;
(c) Γ(1/2) = √π.
To prove (a), integrate by parts: Γ(α) = ∫_0^∞ e^{−y} d(y^α/α). Part (b) is a special case of (a). For (c) we make the change of variable y = z²/2 and compute
$$\Gamma(1/2) = \int_0^{\infty} y^{-1/2}e^{-y}\,dy = \int_0^{\infty}\sqrt{2}\,z^{-1}e^{-z^2/2}\,z\,dz.$$
The second integral is 2√π times half the area under the normal(0,1) density, that is, 2√π(1/2) = √π.
The gamma density is
$$f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\,x^{\alpha-1}e^{-x/\beta}$$
where α and β are positive constants. The moment-generating function is
$$M(t) = \int_0^{\infty}[\Gamma(\alpha)\beta^{\alpha}]^{-1}x^{\alpha-1}e^{tx}e^{-x/\beta}\,dx.$$
Change variables via y = (−t + (1/β))x to get
$$\int_0^{\infty}[\Gamma(\alpha)\beta^{\alpha}]^{-1}\left(\frac{y}{-t + (1/\beta)}\right)^{\alpha-1} e^{-y}\,\frac{dy}{-t + (1/\beta)}$$
which reduces to
$$\frac{1}{\beta^{\alpha}}\left(\frac{\beta}{1-\beta t}\right)^{\alpha} = (1-\beta t)^{-\alpha}.$$
In this argument, t must be less than 1/β so that the integrals will be finite.
Since M(0) = ∫_{−∞}^{∞} f(x) dx = ∫_0^{∞} f(x) dx in this case, with f ≥ 0, M(0) = 1 implies that we have a legal probability density. As before, moments can be calculated efficiently from the moment-generating function:
$$E(X) = M'(0) = -\alpha(1-\beta t)^{-\alpha-1}(-\beta)\big|_{t=0} = \alpha\beta;$$
$$E(X^2) = M''(0) = -\alpha(-\alpha-1)(1-\beta t)^{-\alpha-2}(-\beta)^2\big|_{t=0} = \alpha(\alpha+1)\beta^2.$$
Thus
$$\operatorname{Var} X = E(X^2) - [E(X)]^2 = \alpha\beta^2.$$
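As with the Poisson case, these moments can be checked symbolically; a minimal sympy sketch differentiating (1 − βt)^{−α} recovers αβ and αβ².

```python
import sympy as sp

t, a, b = sp.symbols('t alpha beta', positive=True)
M = (1 - b * t) ** (-a)                            # gamma moment-generating function, t < 1/beta

EX = sp.simplify(sp.diff(M, t).subs(t, 0))         # alpha*beta
EX2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))     # alpha*(alpha + 1)*beta**2
print(EX, EX2, sp.simplify(EX2 - EX**2))           # variance alpha*beta**2
```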
3.8 Special Cases
The exponential density is a gamma density with α = 1: f(x) = (1/β)e^{−x/β}, x ≥ 0, with E(X) = β, E(X²) = 2β², Var X = β².
A random variable X has the chi square density with r degrees of freedom (X = χ²(r) for short, where r is a positive integer) if its density is gamma with α = r/2 and β = 2. Thus
$$f(x) = \frac{1}{\Gamma(r/2)2^{r/2}}\,x^{(r/2)-1}e^{-x/2},\qquad x \ge 0$$
and
$$M(t) = \frac{1}{(1-2t)^{r/2}},\qquad t < 1/2.$$
Therefore E[χ²(r)] = αβ = r, Var[χ²(r)] = αβ² = 2r.
3.9 Lemma
If X is normal(0,1) then X² is χ²(1).
Proof. We compute the moment-generating function of X² directly:
$$M_{X^2}(t) = E[e^{tX^2}] = \int_{-\infty}^{\infty} e^{tx^2}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx.$$
Let y = √(1 − 2t)·x; the integral becomes
$$\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,\frac{dy}{\sqrt{1-2t}} = (1-2t)^{-1/2}$$
which is χ²(1). ♣
3.10 Theorem
If X1, . . . , Xn are independent, each normal(0,1), then Y = Σ_{i=1}^{n} Xi² is χ²(n).
Proof. By (3.9), each Xi² is χ²(1) with moment-generating function (1 − 2t)^{−1/2}. Thus M_Y(t) = (1 − 2t)^{−n/2} for t < 1/2, which is χ²(n). ♣
3.11 Another Method
Another way to find the density of Z = X + Y where X and Y are independent random
variables is by the convolution formula
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)f_Y(z-x)\,dx = \int_{-\infty}^{\infty} f_Y(y)f_X(z-y)\,dy.$$
To see this intuitively, reason as follows. The probability that Z lies near z (between z
and z + dz) is fZ(z) dz. Let us compute this in terms of X and Y . The probability that
X lies near x is fX(x) dx. Given that X lies near x, Z will lie near z if and only if Y lies
near z − x, in other words, z − x ≤ Y ≤ z − x + dz. By independence of X and Y , this
probability is fY (z−x) dz. Thus fZ(z) is a sum of terms of the form fX(x) dx fY (z−x) dz.
Cancel the dz’s and replace the sum by an integral to get the result. A formal proof can
be given using Jacobians.
3.12 The Poisson Process
This process occurs in many physical situations, and provides an application of the gamma
distribution. For example, particles can arrive at a counting device, customers at a serving
counter, airplanes at an airport, or phone calls at a telephone exchange. Divide the time
interval [0, t] into a large number n of small subintervals of length dt, so that n dt = t. If
Ii, i = 1, . . . , n, is one of the small subintervals, we make the following assumptions:
(1) The probability of exactly one arrival in Ii is λ dt, where λ is a constant.
(2) The probability of no arrivals in Ii is 1 − λ dt.
(3) The probability of more than one arrival in Ii is zero.
(4) If Ai is the event of an arrival in Ii, then the Ai, i = 1, . . . , n are independent.
As a consequence of these assumptions, we have n = t/dt Bernoulli trials with probability of success p = λ dt on a given trial. As dt → 0 we have n → ∞ and p → 0, with
np = λt. We conclude that the number N[0, t] of arrivals in [0, t] is Poisson (λt):
$$P\{N[0,t] = k\} = e^{-\lambda t}(\lambda t)^k/k!,\qquad k = 0, 1, 2, \dots.$$
Since E(N[0, t]) = λt, we may interpret λ as the average number of arrivals per unit time.
Now let W1 be the waiting time for the first arrival. Then
P{W1 > t} = P{no arrival in [0, t]} = P{N[0, t] = 0} = e^{−λt}, t ≥ 0.
Thus F_{W1}(t) = 1 − e^{−λt} and f_{W1}(t) = λe^{−λt}, t ≥ 0. From the formulas for the mean and variance of an exponential random variable we have E(W1) = 1/λ and Var W1 = 1/λ².
Let Wk be the (total) waiting time for the k-th arrival. Then Wk is the waiting time
for the first arrival plus the time after the first up to the second arrival plus · · · plus the
time after arrival k − 1 up to the k-th arrival. Thus Wk is the sum of k independent
exponential random variables, and
$$M_{W_k}(t) = \left(\frac{1}{1-(t/\lambda)}\right)^{k}$$
so W_k is gamma with α = k, β = 1/λ. Therefore
$$f_{W_k}(t) = \frac{1}{(k-1)!}\,\lambda^k t^{k-1}e^{-\lambda t},\qquad t \ge 0.$$
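A short simulation ties this back to the gamma formulas: summing k independent exponential(1/λ) interarrival times should give mean αβ = k/λ and variance αβ² = k/λ². The rate, k, and trial count below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
lam, k, trials = 2.0, 3, 200_000           # illustrative rate and arrival index

# W_k is the sum of k independent exponential interarrival times with mean 1/lam.
gaps = rng.exponential(1.0 / lam, size=(trials, k))
W_k = gaps.sum(axis=1)

print("mean:", W_k.mean(), "  gamma value alpha*beta   =", k / lam)
print("var: ", W_k.var(),  "  gamma value alpha*beta^2 =", k / lam ** 2)
```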
Problems
1. Let X1 and X2 be independent, and assume that X1 is χ²(r1) and Y = X1 + X2 is χ²(r), where r > r1. Show that X2 is χ²(r2), where r2 = r − r1.
2. Let X1 and X2 be independent, with Xi gamma with parameters αi and βi, i = 1, 2.
If c1 and c2 are positive constants, find convenient sufficient conditions under which
c1X1 + c2X2 will also have the gamma distribution.
3. If X1, . . . , Xn are independent random variables with moment-generating functions
M1, . . . , Mn, and c1, . . . , cn are constants, express the moment-generating function M
of c1X1 + · · · + cnXn in terms of the Mi.
4. If X1, . . . , Xn are independent, with Xi Poisson(λi), i = 1, . . . , n, show that the sum Y = Σ_{i=1}^{n} Xi has the Poisson distribution with parameter λ = Σ_{i=1}^{n} λi.
5. An unbiased coin is tossed independently n1 times and then again tossed independently
n2 times. Let X1 be the number of heads in the first experiment, and X2 the number
of tails in the second experiment. Without using moment-generating functions, in fact
without any calculation at all, find the distribution of X1 + X2.
Lecture 4. Sampling From a Normal Population
4.1 Definitions and Comments
Let X1, . . . , Xn be iid. The sample mean of the Xi is
$$\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
and the sample variance is
$$S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \overline{X})^2.$$
If the Xi have mean µ and variance σ², then
$$E(\overline{X}) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\,n\mu = \mu$$
and
$$\operatorname{Var}\overline{X} = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var} X_i = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \to 0 \quad\text{as } n \to \infty.$$
Thus X̄ is a good estimate of µ. (For large n, the variance of X̄ is small, so X̄ is concentrated near its mean.) The sample variance is an average squared deviation from the sample mean, but it is a biased estimate of the true variance σ²:
$$E[(X_i - \overline{X})^2] = E[(X_i - \mu) - (\overline{X} - \mu)]^2 = \operatorname{Var} X_i + \operatorname{Var}\overline{X} - 2E[(X_i - \mu)(\overline{X} - \mu)].$$
Notice the centralizing technique. We subtract and add back the mean of Xi, which will
make the cross terms easier to handle when squaring. The above expression simplifies to
$$\sigma^2 + \frac{\sigma^2}{n} - 2E\Big[(X_i-\mu)\,\frac{1}{n}\sum_{j=1}^{n}(X_j-\mu)\Big] = \sigma^2 + \frac{\sigma^2}{n} - \frac{2}{n}E[(X_i-\mu)^2].$$
Thus
$$E[(X_i - \overline{X})^2] = \sigma^2\Big(1 + \frac{1}{n} - \frac{2}{n}\Big) = \frac{n-1}{n}\,\sigma^2.$$
Consequently, E(S²) = (n − 1)σ²/n, not σ². Some books define the sample variance as
$$\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \overline{X})^2 = \frac{n}{n-1}\,S^2$$
where S² is our sample variance. This adjusted estimate of the true variance is unbiased (its expectation is σ²), but biased does not mean bad. If we measure performance by asking for a small mean square error, the biased estimate is better in the normal case, as we will see at the end of the lecture.
4.2 The Normal Case
We now assume that the Xi are normally distributed, and find the distribution of S². Let y1 = x̄ = (x1 + ··· + xn)/n, y2 = x2 − x̄, . . . , yn = xn − x̄. Then y1 + y2 = x2, y1 + y3 = x3, . . . , y1 + yn = xn. Add these equations to get (n − 1)y1 + y2 + ··· + yn = x2 + ··· + xn, or
$$ny_1 + (y_2 + \cdots + y_n) = (x_2 + \cdots + x_n) + y_1 \qquad(1)$$
But ny1 = nx̄ = x1 + ··· + xn, so by cancelling x2, . . . , xn in (1), x1 + (y2 + ··· + yn) = y1. Thus we can solve for the x's in terms of the y's:
x1 = y1 − y2 − ··· − yn
x2 = y1 + y2
x3 = y1 + y3        (2)
⋮
xn = y1 + yn
The Jacobian of the transformation is
$$d_n = \frac{\partial(x_1,\dots,x_n)}{\partial(y_1,\dots,y_n)} = \begin{vmatrix} 1 & -1 & -1 & \cdots & -1 \\ 1 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 1 & 0 & 0 & \cdots & 1 \end{vmatrix}$$
To see the pattern, look at the 4 by 4 case and expand via the last row:
$$\begin{vmatrix} 1 & -1 & -1 & -1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{vmatrix} = (-1)\begin{vmatrix} -1 & -1 & -1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} + \begin{vmatrix} 1 & -1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{vmatrix}$$
so d4 = 1 + d3. In general, dn = 1 + dn−1, and since d2 = 2 by inspection, we have dn = n for all n ≥ 2. Now
$$\sum_{i=1}^{n}(x_i-\mu)^2 = \sum_{i=1}^{n}(x_i-\bar{x}+\bar{x}-\mu)^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2 \qquad(3)$$
because Σ(xi − x̄) = 0. By (2), x1 − x̄ = x1 − y1 = −y2 − ··· − yn and xi − x̄ = xi − y1 = yi for i = 2, . . . , n. (Remember that y1 = x̄.) Thus
$$\sum_{i=1}^{n}(x_i-\bar{x})^2 = (-y_2-\cdots-y_n)^2 + \sum_{i=2}^{n} y_i^2 \qquad(4)$$
Now
$$f_{Y_1\cdots Y_n}(y_1,\dots,y_n) = n\,f_{X_1\cdots X_n}(x_1,\dots,x_n).$$
By (3) and (4), the right side becomes, in terms of the yi’s,
$$n\left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{n}\exp\left\{-\frac{1}{2\sigma^2}\left[\Big(\sum_{i=2}^{n}y_i\Big)^{2} + \sum_{i=2}^{n}y_i^2 + n(y_1-\mu)^2\right]\right\}.$$
The joint density of Y1, . . . , Yn is a function of y1 times a function of (y2, . . . , yn), so Y1 and (Y2, . . . , Yn) are independent. Since X̄ = Y1 and [by (4)] S² is a function of (Y2, . . . , Yn),
X̄ and S² are independent.
Dividing Equation (3) by σ² we have
$$\sum_{i=1}^{n}\left(\frac{X_i-\mu}{\sigma}\right)^{2} = \frac{nS^2}{\sigma^2} + \left(\frac{\overline{X}-\mu}{\sigma/\sqrt{n}}\right)^{2}.$$
But (Xi − µ)/σ is normal(0,1) and (X̄ − µ)/(σ/√n) is normal(0,1), so χ²(n) = (nS²/σ²) + χ²(1) with the two random variables on the right independent. If M(t) is the moment-generating function of nS²/σ², then (1 − 2t)^{−n/2} = M(t)(1 − 2t)^{−1/2}. Therefore M(t) = (1 − 2t)^{−(n−1)/2}, i.e.,
$$\frac{nS^2}{\sigma^2}\ \text{is } \chi^2(n-1).$$
The random variable
$$T = \frac{\overline{X}-\mu}{S/\sqrt{n-1}}$$
is useful in situations where µ is to be estimated but the true variance σ² is unknown. It turns out that T has a “T distribution”, which we study in the next lecture.
4.3 Performance of Various Estimates
Let S² be the sample variance of iid normal(µ, σ²) random variables X1, . . . , Xn. We will look at estimates of σ² of the form cS², where c is a constant. Once again employing the centralizing technique, we write
$$E[(cS^2 - \sigma^2)^2] = E[(cS^2 - cE(S^2) + cE(S^2) - \sigma^2)^2]$$
which simplifies to
$$c^2\operatorname{Var} S^2 + (cE(S^2) - \sigma^2)^2.$$
Since nS²/σ² is χ²(n − 1), which has variance 2(n − 1), we have n²(Var S²)/σ⁴ = 2(n − 1). Also nE(S²)/σ² is the mean of χ²(n − 1), which is n − 1. (Or we can recall from (4.1) that E(S²) = (n − 1)σ²/n.) Thus the mean square error is
$$c^2\,\frac{2\sigma^4(n-1)}{n^2} + \left(c\,\frac{(n-1)}{n}\,\sigma^2 - \sigma^2\right)^{2}.$$
We can drop the σ⁴ and use n² as a common denominator, which can also be dropped. We are then trying to minimize
$$c^2\,2(n-1) + c^2(n-1)^2 - 2c(n-1)n + n^2.$$
Differentiate with respect to c and set the result equal to zero:
$$4c(n-1) + 2c(n-1)^2 - 2(n-1)n = 0.$$
Dividing by 2(n − 1), we have 2c + c(n − 1) − n = 0, so c = n/(n + 1). Thus the best estimate of the form cS² is
$$\frac{1}{n+1}\sum_{i=1}^{n}(X_i - \overline{X})^2.$$
If we use S² then c = 1. If we use the unbiased version then c = n/(n − 1). Since [n/(n + 1)] < 1 < [n/(n − 1)] and a quadratic function decreases as we move toward its minimum, we see that the biased estimate S² is better than the unbiased estimate nS²/(n − 1), but neither is optimal under the minimum mean square error criterion. Explicitly, when c = n/(n − 1) we get a mean square error of 2σ⁴/(n − 1), and when c = 1 we get
$$\frac{\sigma^4}{n^2}\left[2(n-1) + (n-1-n)^2\right] = \frac{(2n-1)\sigma^4}{n^2}$$
which is always smaller, because [(2n − 1)/n²] < 2/(n − 1) iff 2n² > 2n² − 3n + 1 iff 3n > 1, which is true for every positive integer n.
For large n all these estimates are good and the difference between their performance is small.
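The comparison can also be made by plugging the three choices of c into the mean square error formula directly; the short sketch below (σ² = 1 by default, with a few arbitrary sample sizes) reproduces the ordering: c = n/(n + 1) gives the smallest error, then c = 1, then c = n/(n − 1).

```python
def mse(c, n, sigma2=1.0):
    # c^2 Var(S^2) + (c E(S^2) - sigma^2)^2, using Var(S^2) = 2(n-1)sigma^4/n^2
    # and E(S^2) = (n-1)sigma^2/n from the text.
    var_S2 = 2 * (n - 1) * sigma2 ** 2 / n ** 2
    bias = c * (n - 1) * sigma2 / n - sigma2
    return c ** 2 * var_S2 + bias ** 2

for n in (5, 10, 30):
    c_best, c_biased, c_unbiased = n / (n + 1), 1.0, n / (n - 1)
    print(n, mse(c_best, n), mse(c_biased, n), mse(c_unbiased, n))
    # the three values are increasing, matching the ordering derived above
```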
Problems
1. Let X1, . . . , Xn be iid, each normal(µ, σ²), and let X̄ be the sample mean. If c is a constant, we wish to make n large enough so that P{µ − c < X̄ < µ + c} ≥ .954. Find the minimum value of n in terms of σ² and c. (It is independent of µ.)
2. Let X1, . . . , Xn₁, Y1, . . . , Yn₂ be independent random variables, with the Xi normal(µ1, σ1²) and the Yi normal(µ2, σ2²). If X̄ is the sample mean of the Xi and Ȳ is the sample mean of the Yi, explain how to compute the probability that X̄ > Ȳ.
3. Let X1, . . . , Xn be iid, each normal(µ, σ²), and let S² be the sample variance. Explain how to compute P{a < S² < b}.
4. Let S² be the sample variance of iid normal(µ, σ²) random variables Xi, i = 1, . . . , n. Calculate the moment-generating function of S² and from this, deduce that S² has a gamma distribution.
Lecture 5. The T and F Distributions
5.1 Definition and Discussion
The T distribution is defined as follows. Let X1 and X2 be independent, with X1 normal(0,1) and X2 chi-square with r degrees of freedom. The random variable Y1 = √r X1/√X2 has the T distribution with r degrees of freedom.
To find the density of Y1, let Y2 = X2. Then X1 = Y1√Y2/√r and X2 = Y2. The transformation is one-to-one with −∞ < X1 < ∞, X2 > 0 ⟺ −∞ < Y1 < ∞, Y2 > 0. The Jacobian is given by
$$\frac{\partial(x_1,x_2)}{\partial(y_1,y_2)} = \begin{vmatrix} \sqrt{y_2/r} & y_1/(2\sqrt{ry_2}) \\ 0 & 1 \end{vmatrix} = \sqrt{y_2/r}.$$
Thus f_{Y1Y2}(y1, y2) = f_{X1X2}(x1, x2)√(y2/r), which upon substitution for x1 and x2 becomes
$$\frac{1}{\sqrt{2\pi}}\exp[-y_1^2 y_2/2r]\;\frac{1}{\Gamma(r/2)2^{r/2}}\,y_2^{(r/2)-1}e^{-y_2/2}\,\sqrt{y_2/r}.$$
The density of Y1 is
$$\frac{1}{\sqrt{2\pi}\,\Gamma(r/2)2^{r/2}}\int_0^{\infty} y_2^{[(r+1)/2]-1}\exp[-(1+(y_1^2/r))y_2/2]\,dy_2\big/\sqrt{r}.$$
With z = (1 + (y1²/r))y2/2 and the observation that all factors of 2 cancel, this becomes (with y1 replaced by t)
$$\frac{\Gamma((r+1)/2)}{\sqrt{r\pi}\,\Gamma(r/2)}\;\frac{1}{(1+(t^2/r))^{(r+1)/2}},\qquad -\infty < t < \infty,$$
the T density with r degrees of freedom.
In sampling from a normal population, (X̄ − µ)/(σ/√n) is normal(0,1), and nS²/σ² is χ²(n − 1). Thus
$$\sqrt{n-1}\;\frac{\overline{X}-\mu}{\sigma/\sqrt{n}}\quad\text{divided by}\quad \sqrt{n}\,S/\sigma\quad\text{is } T(n-1).$$
Since σ and √n disappear after cancellation, we have
$$\frac{\overline{X}-\mu}{S/\sqrt{n-1}}\quad\text{is } T(n-1).$$
Advocates of defining the sample variance with n − 1 in the denominator point out that one can simply replace σ by S in (X̄ − µ)/(σ/√n) to get the T statistic.
Intuitively, we expect that for large n, (X̄ − µ)/(S/√(n − 1)) has approximately the same distribution as (X̄ − µ)/(σ/√n), i.e., normal(0,1). This is in fact true, as suggested by the following computation:
$$\left(1+\frac{t^2}{r}\right)^{(r+1)/2} = \left[\left(1+\frac{t^2}{r}\right)^{r}\right]^{1/2}\left(1+\frac{t^2}{r}\right)^{1/2} \to \sqrt{e^{t^2}}\times 1 = e^{t^2/2}$$
as r → ∞.
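The convergence can also be seen numerically; the minimal sketch below compares the T(r) density with the normal(0,1) density at a few arbitrarily chosen points using scipy's standard implementations of both.

```python
from scipy.stats import norm, t

for r in (2, 10, 50, 200):
    # Largest gap between the T(r) density and the standard normal density
    # at a few sample points; it shrinks as r grows.
    diffs = [abs(t.pdf(x, df=r) - norm.pdf(x)) for x in (0.0, 1.0, 2.0)]
    print(f"r = {r}: max |T(r) density - normal density| = {max(diffs):.4f}")
```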
5.2 A Preliminary Calculation
Before turning to the F distribution, we calculate the density of U = X1/X2 where X1 and X2 are independent, positive random variables. Let Y = X2, so that X1 = UY, X2 = Y (X1, X2, U, Y are all greater than zero). The Jacobian is
$$\frac{\partial(x_1,x_2)}{\partial(u,y)} = \begin{vmatrix} y & u \\ 0 & 1 \end{vmatrix} = y.$$
Thus f_{UY}(u, y) = f_{X1X2}(x1, x2)y = y f_{X1}(uy) f_{X2}(y), and the density of U is
$$h(u) = \int_0^{\infty} y\,f_{X_1}(uy)\,f_{X_2}(y)\,dy.$$
Now we take X1 to be χ²(m), and X2 to be χ²(n). The density of X1/X2 is
$$h(u) = \frac{1}{2^{(m+n)/2}\Gamma(m/2)\Gamma(n/2)}\,u^{(m/2)-1}\int_0^{\infty} y^{[(m+n)/2]-1}e^{-y(1+u)/2}\,dy.$$
The substitution z = y(1 + u)/2 gives
$$h(u) = \frac{1}{2^{(m+n)/2}\Gamma(m/2)\Gamma(n/2)}\,u^{(m/2)-1}\int_0^{\infty}\frac{z^{[(m+n)/2]-1}}{[(1+u)/2]^{[(m+n)/2]-1}}\,e^{-z}\,\frac{2}{1+u}\,dz.$$
We abbreviate Γ(a)Γ(b)/Γ(a + b) by β(a, b). (We will have much more to say about this
when we discuss the beta distribution later in the lecture.) The above formula simplifies
to
$$h(u) = \frac{1}{\beta(m/2,n/2)}\;\frac{u^{(m/2)-1}}{(1+u)^{(m+n)/2}},\qquad u \ge 0.$$
5.3 Definition and Discussion
The F density is defined as follows. Let X1 and X2 be independent, with X1 = χ²(m) and X2 = χ²(n). With U as in (5.2), let
$$W = \frac{X_1/m}{X_2/n} = \frac{n}{m}\,U$$
so that
$$f_W(w) = f_U(u)\,\left|\frac{du}{dw}\right| = \frac{m}{n}\,f_U\!\left(\frac{m}{n}\,w\right).$$
Thus W has density
$$\frac{(m/n)^{m/2}}{\beta(m/2,n/2)}\;\frac{w^{(m/2)-1}}{[1+(m/n)w]^{(m+n)/2}},\qquad w \ge 0,$$
the F density with m and n degrees of freedom.
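As a sanity check, one can simulate W = (X1/m)/(X2/n) from independent chi-square variables and compare a histogram with scipy's F density; the degrees of freedom and check points below are arbitrary choices.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(6)
m, n, trials = 4, 7, 300_000                 # illustrative degrees of freedom

x1 = rng.chisquare(m, trials)
x2 = rng.chisquare(n, trials)
w = (x1 / m) / (x2 / n)                      # the F statistic defined above

h = 0.02
for p in (0.5, 1.0, 2.0):
    emp = np.mean((w > p - h) & (w < p + h)) / (2 * h)
    print(f"w = {p}: histogram {emp:.3f}, F({m},{n}) density {f.pdf(p, m, n):.3f}")
```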
5.4 Definitions and Calculations
The beta function is given by
$$\beta(a,b) = \int_0^{1} x^{a-1}(1-x)^{b-1}\,dx,\qquad a, b > 0.$$
We will show that
$$\beta(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$$
which is consistent with our use of β(a, b) as an abbreviation in (5.2). We make the change of variable t = x² to get
$$\Gamma(a) = \int_0^{\infty} t^{a-1}e^{-t}\,dt = 2\int_0^{\infty} x^{2a-1}e^{-x^2}\,dx.$$
We now use the familiar trick of writing Γ(a)Γ(b) as a double integral and switching to
polar coordinates. Thus
$$\Gamma(a)\Gamma(b) = 4\int_0^{\infty}\int_0^{\infty} x^{2a-1}y^{2b-1}e^{-(x^2+y^2)}\,dx\,dy = 4\int_0^{\pi/2}d\theta\int_0^{\infty}(\cos\theta)^{2a-1}(\sin\theta)^{2b-1}e^{-r^2}r^{2a+2b-1}\,dr.$$
The change of variable u = r² yields
$$\int_0^{\infty} r^{2a+2b-1}e^{-r^2}\,dr = \frac{1}{2}\int_0^{\infty} u^{a+b-1}e^{-u}\,du = \Gamma(a+b)/2.$$
Thus
$$\frac{\Gamma(a)\Gamma(b)}{2\Gamma(a+b)} = \int_0^{\pi/2}(\cos\theta)^{2a-1}(\sin\theta)^{2b-1}\,d\theta.$$
Let z = cos²θ, 1 − z = sin²θ, dz = −2 cos θ sin θ dθ = −2z^{1/2}(1 − z)^{1/2} dθ. The above integral becomes
$$-\frac{1}{2}\int_1^{0} z^{a-1}(1-z)^{b-1}\,dz = \frac{1}{2}\int_0^{1} z^{a-1}(1-z)^{b-1}\,dz = \frac{1}{2}\,\beta(a,b)$$
as claimed. The beta density is
$$f(x) = \frac{1}{\beta(a,b)}\,x^{a-1}(1-x)^{b-1},\qquad 0 \le x \le 1\quad (a, b > 0).$$
Problems
1. Let X have the beta distribution with parameters a and b. Find the mean and variance
of X.
2. Let T have the T distribution with 15 degrees of freedom. Find the value of c which
makes P{−c ≤ T ≤ c} = .95.
3. Let W have the F distribution with m and n degrees of freedom (abbreviated W =
F(m, n)). Find the distribution of 1/W.
4. A typical table of the F distribution gives values of P{W ≤ c} for c = .9, .95, .975 and
.99. Explain how to find P{W ≤ c} for c = .1, .05, .025 and .01. (Use the result of
Problem 3.)
5. Let X have the T distribution with n degrees of freedom (abbreviated X = T(n)). Show that T²(n) = F(1, n); in other words, T² has an F distribution with 1 and n degrees of freedom.
6. If X has the exponential density e^{−x}, x ≥ 0, show that 2X is χ²(2). Deduce that the quotient of two exponential random variables is F(2, 2).

Mais conteúdo relacionado

Mais procurados

Discrete and continuous probability distributions ppt @ bec doms
Discrete and continuous probability distributions ppt @ bec domsDiscrete and continuous probability distributions ppt @ bec doms
Discrete and continuous probability distributions ppt @ bec domsBabasab Patil
 
Gamma, Expoential, Poisson And Chi Squared Distributions
Gamma, Expoential, Poisson And Chi Squared DistributionsGamma, Expoential, Poisson And Chi Squared Distributions
Gamma, Expoential, Poisson And Chi Squared DistributionsDataminingTools Inc
 
Differential Equations
Differential EquationsDifferential Equations
Differential EquationsKrupaSuthar3
 
Numerical differentiation
Numerical differentiationNumerical differentiation
Numerical differentiationandrushow
 
Partial Differential Equation - Notes
Partial Differential Equation - NotesPartial Differential Equation - Notes
Partial Differential Equation - NotesDr. Nirav Vyas
 
Presentation on Solution to non linear equations
Presentation on Solution to non linear equationsPresentation on Solution to non linear equations
Presentation on Solution to non linear equationsRifat Rahamatullah
 
Newton's Forward/Backward Difference Interpolation
Newton's Forward/Backward  Difference InterpolationNewton's Forward/Backward  Difference Interpolation
Newton's Forward/Backward Difference InterpolationVARUN KUMAR
 
5.1 anti derivatives
5.1 anti derivatives5.1 anti derivatives
5.1 anti derivativesmath265
 
Partial Differentiation & Application
Partial Differentiation & Application Partial Differentiation & Application
Partial Differentiation & Application Yana Qlah
 
Probability Density Function (PDF)
Probability Density Function (PDF)Probability Density Function (PDF)
Probability Density Function (PDF)AakankshaR
 
Binomial and Poisson Distribution
Binomial and Poisson  DistributionBinomial and Poisson  Distribution
Binomial and Poisson DistributionSundar B N
 
Chapter 2 discrete_random_variable_2009
Chapter 2 discrete_random_variable_2009Chapter 2 discrete_random_variable_2009
Chapter 2 discrete_random_variable_2009ayimsevenfold
 
Runge Kutta Method
Runge Kutta Method Runge Kutta Method
Runge Kutta Method Bhavik Vashi
 
Random variables
Random variablesRandom variables
Random variablesclari1998
 
Estimation theory 1
Estimation theory 1Estimation theory 1
Estimation theory 1Gopi Saiteja
 
Partial differentiation
Partial differentiationPartial differentiation
Partial differentiationTanuj Parikh
 

Mais procurados (20)

Discrete and continuous probability distributions ppt @ bec doms
Discrete and continuous probability distributions ppt @ bec domsDiscrete and continuous probability distributions ppt @ bec doms
Discrete and continuous probability distributions ppt @ bec doms
 
Gamma, Expoential, Poisson And Chi Squared Distributions
Gamma, Expoential, Poisson And Chi Squared DistributionsGamma, Expoential, Poisson And Chi Squared Distributions
Gamma, Expoential, Poisson And Chi Squared Distributions
 
Differential Equations
Differential EquationsDifferential Equations
Differential Equations
 
Numerical differentiation
Numerical differentiationNumerical differentiation
Numerical differentiation
 
Partial Differential Equation - Notes
Partial Differential Equation - NotesPartial Differential Equation - Notes
Partial Differential Equation - Notes
 
Presentation on Solution to non linear equations
Presentation on Solution to non linear equationsPresentation on Solution to non linear equations
Presentation on Solution to non linear equations
 
Numerical analysis ppt
Numerical analysis pptNumerical analysis ppt
Numerical analysis ppt
 
Newton's Forward/Backward Difference Interpolation
Newton's Forward/Backward  Difference InterpolationNewton's Forward/Backward  Difference Interpolation
Newton's Forward/Backward Difference Interpolation
 
Normal Distribution
Normal DistributionNormal Distribution
Normal Distribution
 
5.1 anti derivatives
5.1 anti derivatives5.1 anti derivatives
5.1 anti derivatives
 
Partial Differentiation & Application
Partial Differentiation & Application Partial Differentiation & Application
Partial Differentiation & Application
 
Gamma function
Gamma functionGamma function
Gamma function
 
Probability Density Function (PDF)
Probability Density Function (PDF)Probability Density Function (PDF)
Probability Density Function (PDF)
 
Binomial and Poisson Distribution
Binomial and Poisson  DistributionBinomial and Poisson  Distribution
Binomial and Poisson Distribution
 
Chapter 2 discrete_random_variable_2009
Chapter 2 discrete_random_variable_2009Chapter 2 discrete_random_variable_2009
Chapter 2 discrete_random_variable_2009
 
Runge Kutta Method
Runge Kutta Method Runge Kutta Method
Runge Kutta Method
 
Random variables
Random variablesRandom variables
Random variables
 
Estimation theory 1
Estimation theory 1Estimation theory 1
Estimation theory 1
 
Partial differentiation
Partial differentiationPartial differentiation
Partial differentiation
 
Separation of variables
Separation of variablesSeparation of variables
Separation of variables
 

Destaque

Computing Transformations Spring2005
Computing Transformations Spring2005Computing Transformations Spring2005
Computing Transformations Spring2005guest5989655
 
linear transformation
linear transformationlinear transformation
linear transformationmansi acharya
 
Graphs of trigonometric functions
Graphs of trigonometric functionsGraphs of trigonometric functions
Graphs of trigonometric functionsTarun Gehlot
 
Recurrence equations
Recurrence equationsRecurrence equations
Recurrence equationsTarun Gehlot
 
Modelling with first order differential equations
Modelling with first order differential equationsModelling with first order differential equations
Modelling with first order differential equationsTarun Gehlot
 
Intervals of validity
Intervals of validityIntervals of validity
Intervals of validityTarun Gehlot
 
Linear approximations
Linear approximationsLinear approximations
Linear approximationsTarun Gehlot
 
How to draw a good graph
How to draw a good graphHow to draw a good graph
How to draw a good graphTarun Gehlot
 
Real meaning of functions
Real meaning of functionsReal meaning of functions
Real meaning of functionsTarun Gehlot
 
An applied approach to calculas
An applied approach to calculasAn applied approach to calculas
An applied approach to calculasTarun Gehlot
 
Solution of nonlinear_equations
Solution of nonlinear_equationsSolution of nonlinear_equations
Solution of nonlinear_equationsTarun Gehlot
 
Describing and exploring data
Describing and exploring dataDescribing and exploring data
Describing and exploring dataTarun Gehlot
 
Probability and statistics as helpers in real life
Probability and statistics as helpers in real lifeProbability and statistics as helpers in real life
Probability and statistics as helpers in real lifeTarun Gehlot
 
C4 discontinuities
C4 discontinuitiesC4 discontinuities
C4 discontinuitiesTarun Gehlot
 
2.3 stem and leaf displays
2.3 stem and leaf displays2.3 stem and leaf displays
2.3 stem and leaf displaysleblance
 
The shortest distance between skew lines
The shortest distance between skew linesThe shortest distance between skew lines
The shortest distance between skew linesTarun Gehlot
 

Destaque (20)

Computing Transformations Spring2005
Computing Transformations Spring2005Computing Transformations Spring2005
Computing Transformations Spring2005
 
linear transformation
linear transformationlinear transformation
linear transformation
 
Graphs of trigonometric functions
Graphs of trigonometric functionsGraphs of trigonometric functions
Graphs of trigonometric functions
 
Critical points
Critical pointsCritical points
Critical points
 
Logicgates
LogicgatesLogicgates
Logicgates
 
Recurrence equations
Recurrence equationsRecurrence equations
Recurrence equations
 
Modelling with first order differential equations
Modelling with first order differential equationsModelling with first order differential equations
Modelling with first order differential equations
 
Intervals of validity
Intervals of validityIntervals of validity
Intervals of validity
 
Linear approximations
Linear approximationsLinear approximations
Linear approximations
 
How to draw a good graph
How to draw a good graphHow to draw a good graph
How to draw a good graph
 
Real meaning of functions
Real meaning of functionsReal meaning of functions
Real meaning of functions
 
An applied approach to calculas
An applied approach to calculasAn applied approach to calculas
An applied approach to calculas
 
Solution of nonlinear_equations
Solution of nonlinear_equationsSolution of nonlinear_equations
Solution of nonlinear_equations
 
Describing and exploring data
Describing and exploring dataDescribing and exploring data
Describing and exploring data
 
Probability and statistics as helpers in real life
Probability and statistics as helpers in real lifeProbability and statistics as helpers in real life
Probability and statistics as helpers in real life
 
Matrix algebra
Matrix algebraMatrix algebra
Matrix algebra
 
C4 discontinuities
C4 discontinuitiesC4 discontinuities
C4 discontinuities
 
2.3 stem and leaf displays
2.3 stem and leaf displays2.3 stem and leaf displays
2.3 stem and leaf displays
 
The shortest distance between skew lines
The shortest distance between skew linesThe shortest distance between skew lines
The shortest distance between skew lines
 
Thermo dynamics
Thermo dynamicsThermo dynamics
Thermo dynamics
 

Semelhante a Transformation of random variables

Conditional Expectations Liner algebra
Conditional Expectations Liner algebra Conditional Expectations Liner algebra
Conditional Expectations Liner algebra AZIZ ULLAH SURANI
 
Lesson20 Tangent Planes Slides+Notes
Lesson20   Tangent Planes Slides+NotesLesson20   Tangent Planes Slides+Notes
Lesson20 Tangent Planes Slides+NotesMatthew Leingang
 
CHAIN RULE AND IMPLICIT FUNCTION
CHAIN RULE AND IMPLICIT FUNCTIONCHAIN RULE AND IMPLICIT FUNCTION
CHAIN RULE AND IMPLICIT FUNCTIONNikhil Pandit
 
Implicit differentiation
Implicit differentiationImplicit differentiation
Implicit differentiationSporsho
 
PLease show work thank you Suppose X and Y art-jointly distributed a.pdf
PLease show work thank you Suppose X and Y art-jointly distributed a.pdfPLease show work thank you Suppose X and Y art-jointly distributed a.pdf
PLease show work thank you Suppose X and Y art-jointly distributed a.pdfajayenterprises11
 
Week 3 [compatibility mode]
Week 3 [compatibility mode]Week 3 [compatibility mode]
Week 3 [compatibility mode]Hazrul156
 
Mathematics 3.pdf civil engineering concrete
Mathematics 3.pdf civil engineering concreteMathematics 3.pdf civil engineering concrete
Mathematics 3.pdf civil engineering concretemocr84810
 
PlymouthUniversity_MathsandStats_partial_differentiation.pdf
PlymouthUniversity_MathsandStats_partial_differentiation.pdfPlymouthUniversity_MathsandStats_partial_differentiation.pdf
PlymouthUniversity_MathsandStats_partial_differentiation.pdfssuser3d4bb8
 
Partial diferential good
Partial diferential goodPartial diferential good
Partial diferential goodgenntmbr
 
Quantitative Techniques random variables
Quantitative Techniques random variablesQuantitative Techniques random variables
Quantitative Techniques random variablesRohan Bhatkar
 
AEM Integrating factor to orthogonal trajactories
AEM Integrating factor to orthogonal trajactoriesAEM Integrating factor to orthogonal trajactories
AEM Integrating factor to orthogonal trajactoriesSukhvinder Singh
 
Continuity of functions by graph (exercises with detailed solutions)
Continuity of functions by graph   (exercises with detailed solutions)Continuity of functions by graph   (exercises with detailed solutions)
Continuity of functions by graph (exercises with detailed solutions)Tarun Gehlot
 
Bahan ajar kalkulus integral
Bahan ajar kalkulus integralBahan ajar kalkulus integral
Bahan ajar kalkulus integralgrand_livina_good
 
MA500-2 Topological Structures 2016Aisling McCluskey, Dar.docx
MA500-2 Topological Structures 2016Aisling McCluskey, Dar.docxMA500-2 Topological Structures 2016Aisling McCluskey, Dar.docx
MA500-2 Topological Structures 2016Aisling McCluskey, Dar.docxsmile790243
 

Semelhante a Transformation of random variables (20)

Conditional Expectations Liner algebra
Conditional Expectations Liner algebra Conditional Expectations Liner algebra
Conditional Expectations Liner algebra
 
Lesson20 Tangent Planes Slides+Notes
Lesson20   Tangent Planes Slides+NotesLesson20   Tangent Planes Slides+Notes
Lesson20 Tangent Planes Slides+Notes
 
Section2 stochastic
Section2 stochasticSection2 stochastic
Section2 stochastic
 
CHAIN RULE AND IMPLICIT FUNCTION
CHAIN RULE AND IMPLICIT FUNCTIONCHAIN RULE AND IMPLICIT FUNCTION
CHAIN RULE AND IMPLICIT FUNCTION
 
Universal algebra (1)
Universal algebra (1)Universal algebra (1)
Universal algebra (1)
 
Implicit differentiation
Implicit differentiationImplicit differentiation
Implicit differentiation
 
Cs jog
Cs jogCs jog
Cs jog
 
PLease show work thank you Suppose X and Y art-jointly distributed a.pdf
PLease show work thank you Suppose X and Y art-jointly distributed a.pdfPLease show work thank you Suppose X and Y art-jointly distributed a.pdf
PLease show work thank you Suppose X and Y art-jointly distributed a.pdf
 
Hw2 s
Hw2 sHw2 s
Hw2 s
 
Week 3 [compatibility mode]
Week 3 [compatibility mode]Week 3 [compatibility mode]
Week 3 [compatibility mode]
 
Mathematics 3.pdf civil engineering concrete
Mathematics 3.pdf civil engineering concreteMathematics 3.pdf civil engineering concrete
Mathematics 3.pdf civil engineering concrete
 
Sect4 2
Sect4 2Sect4 2
Sect4 2
 
PlymouthUniversity_MathsandStats_partial_differentiation.pdf
PlymouthUniversity_MathsandStats_partial_differentiation.pdfPlymouthUniversity_MathsandStats_partial_differentiation.pdf
PlymouthUniversity_MathsandStats_partial_differentiation.pdf
 
Partial diferential good
Partial diferential goodPartial diferential good
Partial diferential good
 
Quantitative Techniques random variables
Quantitative Techniques random variablesQuantitative Techniques random variables
Quantitative Techniques random variables
 
Sect1 2
Sect1 2Sect1 2
Sect1 2
 
AEM Integrating factor to orthogonal trajactories
AEM Integrating factor to orthogonal trajactoriesAEM Integrating factor to orthogonal trajactories
AEM Integrating factor to orthogonal trajactories
 
Continuity of functions by graph (exercises with detailed solutions)
Continuity of functions by graph   (exercises with detailed solutions)Continuity of functions by graph   (exercises with detailed solutions)
Continuity of functions by graph (exercises with detailed solutions)
 
Bahan ajar kalkulus integral
Bahan ajar kalkulus integralBahan ajar kalkulus integral
Bahan ajar kalkulus integral
 
MA500-2 Topological Structures 2016Aisling McCluskey, Dar.docx
MA500-2 Topological Structures 2016Aisling McCluskey, Dar.docxMA500-2 Topological Structures 2016Aisling McCluskey, Dar.docx
MA500-2 Topological Structures 2016Aisling McCluskey, Dar.docx
 

Mais de Tarun Gehlot

Materials 11-01228
Materials 11-01228Materials 11-01228
Materials 11-01228Tarun Gehlot
 
Continuity and end_behavior
Continuity and  end_behaviorContinuity and  end_behavior
Continuity and end_behaviorTarun Gehlot
 
Factoring by the trial and-error method
Factoring by the trial and-error methodFactoring by the trial and-error method
Factoring by the trial and-error methodTarun Gehlot
 
Introduction to finite element analysis
Introduction to finite element analysisIntroduction to finite element analysis
Introduction to finite element analysisTarun Gehlot
 
Finite elements : basis functions
Finite elements : basis functionsFinite elements : basis functions
Finite elements : basis functionsTarun Gehlot
 
Finite elements for 2‐d problems
Finite elements  for 2‐d problemsFinite elements  for 2‐d problems
Finite elements for 2‐d problemsTarun Gehlot
 
Error analysis statistics
Error analysis   statisticsError analysis   statistics
Error analysis statisticsTarun Gehlot
 
Introduction to matlab
Introduction to matlabIntroduction to matlab
Introduction to matlabTarun Gehlot
 
Linear approximations and_differentials
Linear approximations and_differentialsLinear approximations and_differentials
Linear approximations and_differentialsTarun Gehlot
 
Local linear approximation
Local linear approximationLocal linear approximation
Local linear approximationTarun Gehlot
 
Interpolation functions
Interpolation functionsInterpolation functions
Interpolation functionsTarun Gehlot
 
Propeties of-triangles
Propeties of-trianglesPropeties of-triangles
Propeties of-trianglesTarun Gehlot
 
Gaussian quadratures
Gaussian quadraturesGaussian quadratures
Gaussian quadraturesTarun Gehlot
 
Basics of set theory
Basics of set theoryBasics of set theory
Basics of set theoryTarun Gehlot
 
Numerical integration
Numerical integrationNumerical integration
Numerical integrationTarun Gehlot
 
Applications of set theory
Applications of  set theoryApplications of  set theory
Applications of set theoryTarun Gehlot
 
Miscellneous functions
Miscellneous  functionsMiscellneous  functions
Miscellneous functionsTarun Gehlot
 
Dependent v. independent variables
Dependent v. independent variablesDependent v. independent variables
Dependent v. independent variablesTarun Gehlot
 

Mais de Tarun Gehlot (20)

Materials 11-01228
Materials 11-01228Materials 11-01228
Materials 11-01228
 
Binary relations
Binary relationsBinary relations
Binary relations
 
Continuity and end_behavior
Continuity and  end_behaviorContinuity and  end_behavior
Continuity and end_behavior
 
Factoring by the trial and-error method
Factoring by the trial and-error methodFactoring by the trial and-error method
Factoring by the trial and-error method
 
Introduction to finite element analysis
Introduction to finite element analysisIntroduction to finite element analysis
Introduction to finite element analysis
 
Finite elements : basis functions
Finite elements : basis functionsFinite elements : basis functions
Finite elements : basis functions
 
Finite elements for 2‐d problems
Finite elements  for 2‐d problemsFinite elements  for 2‐d problems
Finite elements for 2‐d problems
 
Error analysis statistics
Error analysis   statisticsError analysis   statistics
Error analysis statistics
 
Matlab commands
Matlab commandsMatlab commands
Matlab commands
 
Introduction to matlab
Introduction to matlabIntroduction to matlab
Introduction to matlab
 
Linear approximations and_differentials
Linear approximations and_differentialsLinear approximations and_differentials
Linear approximations and_differentials
 
Local linear approximation
Local linear approximationLocal linear approximation
Local linear approximation
 
Interpolation functions
Interpolation functionsInterpolation functions
Interpolation functions
 
Propeties of-triangles
Propeties of-trianglesPropeties of-triangles
Propeties of-triangles
 
Gaussian quadratures
Gaussian quadraturesGaussian quadratures
Gaussian quadratures
 
Basics of set theory
Basics of set theoryBasics of set theory
Basics of set theory
 
Numerical integration
Numerical integrationNumerical integration
Numerical integration
 
Applications of set theory
Applications of  set theoryApplications of  set theory
Applications of set theory
 
Miscellneous functions
Miscellneous  functionsMiscellneous  functions
Miscellneous functions
 
Dependent v. independent variables
Dependent v. independent variablesDependent v. independent variables
Dependent v. independent variables
 

Último

Quality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICEQuality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICESayali Powar
 
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...Nguyen Thanh Tu Collection
 
Patterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxPatterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxMYDA ANGELICA SUAN
 
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfP4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfYu Kanazawa / Osaka University
 
In - Vivo and In - Vitro Correlation.pptx
In - Vivo and In - Vitro Correlation.pptxIn - Vivo and In - Vitro Correlation.pptx
In - Vivo and In - Vitro Correlation.pptxAditiChauhan701637
 
Presentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphPresentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphNetziValdelomar1
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational PhilosophyShuvankar Madhu
 
Ultra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxUltra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxDr. Asif Anas
 
How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17Celine George
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxheathfieldcps1
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxiammrhaywood
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.EnglishCEIPdeSigeiro
 
The Singapore Teaching Practice document
The Singapore Teaching Practice documentThe Singapore Teaching Practice document
The Singapore Teaching Practice documentXsasf Sfdfasd
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfMohonDas
 
General views of Histopathology and step
Transformation of random variables

Lecture 2. Jacobians

We need this idea to generalize the density function method to problems where there are k inputs and k outputs, with k ≥ 2. However, if there are k inputs and j < k outputs, often extra outputs can be introduced, as we will see later in the lecture.

2.1 The Setup

Let X = X(U, V), Y = Y(U, V). Assume a one-to-one transformation, so that we can solve for U and V; thus U = U(X, Y), V = V(X, Y). Look at Figure 2.1. If u changes by du then x changes by (∂x/∂u) du and y changes by (∂y/∂u) du. Similarly, if v changes by dv then x changes by (∂x/∂v) dv and y changes by (∂y/∂v) dv. The small rectangle in the u-v plane corresponds to a small parallelogram in the x-y plane (Figure 2.2), with A = (∂x/∂u, ∂y/∂u, 0) du and B = (∂x/∂v, ∂y/∂v, 0) dv. The area of the parallelogram is |A × B| and

$$
A \times B =
\begin{vmatrix}
\mathbf{I} & \mathbf{J} & \mathbf{K}\\
\partial x/\partial u & \partial y/\partial u & 0\\
\partial x/\partial v & \partial y/\partial v & 0
\end{vmatrix} du\,dv
=
\begin{vmatrix}
\partial x/\partial u & \partial x/\partial v\\
\partial y/\partial u & \partial y/\partial v
\end{vmatrix} du\,dv\,\mathbf{K}.
$$

(A determinant is unchanged if we transpose the matrix, i.e., interchange rows and columns.)

[Figure 2.1: region R in the u-v plane with sides du, dv and the increments (∂x/∂u)du, (∂y/∂u)du]
[Figure 2.2: region S in the x-y plane, a parallelogram with sides A and B]

2.2 Definition and Discussion

The Jacobian of the transformation is

$$
J = \begin{vmatrix}
\partial x/\partial u & \partial x/\partial v\\
\partial y/\partial u & \partial y/\partial v
\end{vmatrix},
\quad\text{written as}\quad \frac{\partial(x, y)}{\partial(u, v)}.
$$
Thus |A × B| = |J| du dv. Now P{(X, Y) ∈ S} = P{(U, V) ∈ R}; in other words, f_XY(x, y) times the area of S is f_UV(u, v) times the area of R. Thus f_XY(x, y) |J| du dv = f_UV(u, v) du dv and

$$
f_{UV}(u, v) = f_{XY}(x, y)\left|\frac{\partial(x, y)}{\partial(u, v)}\right|.
$$

The absolute value of the Jacobian ∂(x, y)/∂(u, v) gives a magnification factor for area in going from u-v coordinates to x-y coordinates. The magnification factor going the other way is |∂(u, v)/∂(x, y)|. But the magnification factor from u-v to u-v is 1, so

$$
f_{UV}(u, v) = \frac{f_{XY}(x, y)}{\left|\partial(u, v)/\partial(x, y)\right|}.
$$

In this formula we must substitute x = x(u, v), y = y(u, v) to express the final result in terms of u and v.

In three dimensions, a small rectangular box with volume du dv dw corresponds to a parallelepiped in xyz space, determined by the vectors

$$
A = \Big(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u}\Big)du,\qquad
B = \Big(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v}\Big)dv,\qquad
C = \Big(\frac{\partial x}{\partial w}, \frac{\partial y}{\partial w}, \frac{\partial z}{\partial w}\Big)dw.
$$

The volume of the parallelepiped is the absolute value of the dot product of A with B × C, and the dot product can be written as a determinant with rows (or columns) A, B, C. This determinant is the Jacobian of x, y, z with respect to u, v, w [written ∂(x, y, z)/∂(u, v, w)], times du dv dw. The volume magnification from uvw to xyz space is |∂(x, y, z)/∂(u, v, w)| and we have

$$
f_{UVW}(u, v, w) = \frac{f_{XYZ}(x, y, z)}{\left|\partial(u, v, w)/\partial(x, y, z)\right|}
$$

with x = x(u, v, w), y = y(u, v, w), z = z(u, v, w).

The Jacobian technique extends to higher dimensions. The transformation formula is a natural generalization of the two- and three-dimensional cases:

$$
f_{Y_1\cdots Y_n}(y_1, \ldots, y_n) = \frac{f_{X_1\cdots X_n}(x_1, \ldots, x_n)}{\left|\partial(y_1, \ldots, y_n)/\partial(x_1, \ldots, x_n)\right|}
\quad\text{where}\quad
\frac{\partial(y_1, \ldots, y_n)}{\partial(x_1, \ldots, x_n)} =
\begin{vmatrix}
\partial y_1/\partial x_1 & \cdots & \partial y_1/\partial x_n\\
\vdots & & \vdots\\
\partial y_n/\partial x_1 & \cdots & \partial y_n/\partial x_n
\end{vmatrix}.
$$

To help you remember the formula, think f(y) dy = f(x) dx.
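The area-magnification interpretation is easy to sanity-check with a computer algebra system. The following is a minimal illustrative sketch (not part of the original notes), assuming the sympy package is available; it computes ∂(x, y)/∂(u, v) for the polar-coordinate map x = u cos v, y = u sin v and recovers the familiar magnification factor u.

    import sympy as sp

    u, v = sp.symbols('u v', positive=True)
    # Transformation from (u, v) to (x, y): polar coordinates, used here only as a test case.
    x = u * sp.cos(v)
    y = u * sp.sin(v)

    # Matrix of partial derivatives [[dx/du, dx/dv], [dy/du, dy/dv]].
    J = sp.Matrix([x, y]).jacobian([u, v])
    print(J)                      # Matrix([[cos(v), -u*sin(v)], [sin(v), u*cos(v)]])
    print(sp.simplify(J.det()))   # u : the area magnification |∂(x,y)/∂(u,v)|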
2.3 A Typical Application

Let X and Y be independent, positive random variables with densities f_X and f_Y, and let Z = XY. We find the density of Z by introducing a new random variable W, as follows:

Z = XY, W = Y (W = X would be equally good).

The transformation is one-to-one because we can solve for X, Y in terms of Z, W by X = Z/W, Y = W. In a problem of this type we must always pay attention to the range of the variables: x > 0, y > 0 is equivalent to z > 0, w > 0. Now

$$
f_{ZW}(z, w) = \frac{f_{XY}(x, y)}{\left|\partial(z, w)/\partial(x, y)\right|}\bigg|_{x = z/w,\; y = w}
\quad\text{with}\quad
\frac{\partial(z, w)}{\partial(x, y)} =
\begin{vmatrix}
\partial z/\partial x & \partial z/\partial y\\
\partial w/\partial x & \partial w/\partial y
\end{vmatrix}
=
\begin{vmatrix}
y & x\\
0 & 1
\end{vmatrix}
= y.
$$

Thus

$$
f_{ZW}(z, w) = \frac{f_X(x) f_Y(y)}{w} = \frac{f_X(z/w) f_Y(w)}{w}
$$

and we are left with the problem of finding the marginal density from a joint density:

$$
f_Z(z) = \int_{-\infty}^{\infty} f_{ZW}(z, w)\,dw = \int_0^{\infty} \frac{1}{w}\, f_X(z/w)\, f_Y(w)\,dw.
$$

(A simulation check of this formula is sketched at the end of the lecture.)

Problems

1. The joint density of two random variables X_1 and X_2 is f(x_1, x_2) = 2e^{−x_1}e^{−x_2}, where 0 < x_1 < x_2 < ∞; f(x_1, x_2) = 0 elsewhere. Consider the transformation Y_1 = 2X_1, Y_2 = X_2 − X_1. Find the joint density of Y_1 and Y_2, and conclude that Y_1 and Y_2 are independent.

2. Repeat Problem 1 with the following new data. The joint density is given by f(x_1, x_2) = 8x_1x_2, 0 < x_1 < x_2 < 1; f(x_1, x_2) = 0 elsewhere; Y_1 = X_1/X_2, Y_2 = X_2.

3. Repeat Problem 1 with the following new data. We now have three iid random variables X_i, i = 1, 2, 3, each with density e^{−x}, x > 0. The transformation equations are given by Y_1 = X_1/(X_1 + X_2), Y_2 = (X_1 + X_2)/(X_1 + X_2 + X_3), Y_3 = X_1 + X_2 + X_3. As before, find the joint density of the Y_i and show that Y_1, Y_2 and Y_3 are independent.

Comments on the Problem Set

In Problem 3, notice that Y_1Y_2Y_3 = X_1 and Y_2Y_3 = X_1 + X_2, so X_2 = Y_2Y_3 − Y_1Y_2Y_3 and X_3 = (X_1 + X_2 + X_3) − (X_1 + X_2) = Y_3 − Y_2Y_3.

If f_XY(x, y) = g(x)h(y) for all x, y, then X and Y are independent, because

$$
f(y\mid x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{g(x)h(y)}{g(x)\int_{-\infty}^{\infty} h(y)\,dy},
$$
which does not depend on x. The set of points where g(x) = 0 (equivalently f_X(x) = 0) can be ignored because it has probability zero. It is important to realize that in this argument, "for all x, y" means that x and y must be allowed to vary independently of each other, so the set of possible x and y must be of the rectangular form a < x < b, c < y < d. (The constants a, b, c, d can be infinite.) For example, if f_XY(x, y) = 2e^{−x}e^{−y} for 0 < y < x, and 0 elsewhere, then X and Y are not independent. Knowing x forces 0 < y < x, so the conditional distribution of Y given X = x certainly depends on x. Note that f_XY(x, y) is not a function of x alone times a function of y alone. We have f_XY(x, y) = 2e^{−x}e^{−y} I[0 < y < x], where the indicator I is 1 for 0 < y < x and 0 elsewhere.

In Jacobian problems, pay close attention to the range of the variables. For example, in Problem 1 we have y_1 = 2x_1, y_2 = x_2 − x_1, so x_1 = y_1/2, x_2 = (y_1/2) + y_2. From these equations it follows that 0 < x_1 < x_2 < ∞ is equivalent to y_1 > 0, y_2 > 0.
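As promised in Section 2.3, here is a small simulation check of the marginal-density formula for Z = XY. It is an illustrative sketch only, assuming numpy and scipy are installed; the choice of exponential(1) densities for X and Y, the sample size, and the evaluation points are arbitrary.

    import numpy as np
    from scipy.integrate import quad

    rng = np.random.default_rng(0)

    # X, Y independent exponential(1); Z = XY.
    n = 200_000
    z_samples = rng.exponential(1.0, n) * rng.exponential(1.0, n)

    def f_exp(x):
        # density e^{-x} for x > 0, 0 elsewhere
        return np.exp(-x) * (x > 0)

    def f_Z(z):
        # Marginal density from Section 2.3: integrate out the auxiliary variable W.
        integrand = lambda w: f_exp(z / w) * f_exp(w) / w
        val, _ = quad(integrand, 0.0, np.inf)
        return val

    for z in (0.2, 0.5, 1.0, 2.0):
        # crude histogram-style estimate of the density near z
        emp = np.mean(np.abs(z_samples - z) < 0.05) / 0.10
        print(f"z = {z}: formula {f_Z(z):.4f}, simulation {emp:.4f}")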
Lecture 3. Moment-Generating Functions

3.1 Definition

The moment-generating function of a random variable X is defined by

$$
M(t) = M_X(t) = E[e^{tX}],
$$

where t is a real number. To see the reason for the terminology, note that M(t) is the expectation of 1 + tX + t²X²/2! + t³X³/3! + ···. If μ_n = E(X^n), the n-th moment of X, and we can take the expectation term by term, then

$$
M(t) = 1 + \mu_1 t + \frac{\mu_2 t^2}{2!} + \cdots + \frac{\mu_n t^n}{n!} + \cdots.
$$

Since the coefficient of t^n in the Taylor expansion is M^{(n)}(0)/n!, where M^{(n)} is the n-th derivative of M, we have μ_n = M^{(n)}(0).

3.2 The Key Theorem

If Y = Σ_{i=1}^n X_i where X_1, ..., X_n are independent, then M_Y(t) = Π_{i=1}^n M_{X_i}(t).

Proof. First note that if X and Y are independent, then

$$
E[g(X)h(Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x)h(y) f_{XY}(x, y)\,dx\,dy.
$$

Since f_XY(x, y) = f_X(x)f_Y(y), the double integral becomes

$$
\int_{-\infty}^{\infty} g(x)f_X(x)\,dx \int_{-\infty}^{\infty} h(y)f_Y(y)\,dy = E[g(X)]\,E[h(Y)],
$$

and similarly for more than two random variables. Now if Y = X_1 + ··· + X_n with the X_i independent, we have

$$
M_Y(t) = E[e^{tY}] = E[e^{tX_1}\cdots e^{tX_n}] = E[e^{tX_1}]\cdots E[e^{tX_n}] = M_{X_1}(t)\cdots M_{X_n}(t). \;♣
$$

3.3 The Main Application

Given independent random variables X_1, ..., X_n with densities f_1, ..., f_n respectively, find the density of Y = Σ_{i=1}^n X_i.

Step 1. Compute M_i(t), the moment-generating function of X_i, for each i.
Step 2. Compute M_Y(t) = Π_{i=1}^n M_i(t).
Step 3. From M_Y(t) find f_Y(y).

This technique is known as a transform method. Notice that the moment-generating function and the density of a random variable are related by

$$
M(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx.
$$

With t replaced by −s we have a Laplace transform, and with t replaced by it we have a Fourier transform. The strategy works because at Step 3, the moment-generating function determines the density uniquely. (This is a theorem from Laplace or Fourier transform theory.)
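The relation μ_n = M^{(n)}(0) from Section 3.1 can be checked symbolically. The sketch below is an illustration added here (not from the original notes), assuming sympy is available; it computes the moment-generating function of an exponential density (1/β)e^{−x/β}, x ≥ 0, and reads off the first two moments by differentiating at t = 0.

    import sympy as sp

    x = sp.symbols('x', positive=True)
    beta = sp.symbols('beta', positive=True)
    t = sp.symbols('t', negative=True)   # t < 0 guarantees convergence; the formula extends to all t < 1/beta

    f = sp.exp(-x / beta) / beta                           # exponential density on x >= 0
    M = sp.simplify(sp.integrate(sp.exp(t * x) * f, (x, 0, sp.oo)))
    print(M)                                               # equals 1/(1 - beta*t)

    mu1 = sp.simplify(sp.diff(M, t, 1).subs(t, 0))         # first moment:  beta
    mu2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))         # second moment: 2*beta**2
    print(mu1, mu2)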
3.4 Examples

1. Bernoulli Trials. Let X be the number of successes in n trials with probability of success p on a given trial. Then X = X_1 + ··· + X_n, where X_i = 1 if there is a success on trial i and X_i = 0 if there is a failure on trial i. Thus

$$
M_i(t) = E[e^{tX_i}] = P\{X_i = 1\}e^{t\cdot 1} + P\{X_i = 0\}e^{t\cdot 0} = pe^t + q
$$

with p + q = 1. The moment-generating function of X is

$$
M_X(t) = (pe^t + q)^n = \sum_{k=0}^{n}\binom{n}{k}p^k q^{n-k}e^{tk}.
$$

This could have been derived directly:

$$
M_X(t) = E[e^{tX}] = \sum_{k=0}^{n}P\{X = k\}e^{tk} = \sum_{k=0}^{n}\binom{n}{k}p^k q^{n-k}e^{tk} = (pe^t + q)^n
$$

by the binomial theorem.

2. Poisson. We have P{X = k} = e^{−λ}λ^k/k!, k = 0, 1, 2, .... Thus

$$
M(t) = \sum_{k=0}^{\infty}e^{-\lambda}\frac{\lambda^k}{k!}e^{tk} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{(\lambda e^t)^k}{k!} = \exp(-\lambda)\exp(\lambda e^t) = \exp[\lambda(e^t - 1)].
$$

We can compute the mean and variance from the moment-generating function:

$$
E(X) = M'(0) = \big[\exp(\lambda(e^t - 1))\,\lambda e^t\big]_{t=0} = \lambda.
$$

Let h(λ, t) = exp[λ(e^t − 1)]. Then

$$
E(X^2) = M''(0) = \big[h(\lambda, t)\lambda e^t + \lambda e^t\,h(\lambda, t)\,\lambda e^t\big]_{t=0} = \lambda + \lambda^2,
$$

hence Var X = E(X²) − [E(X)]² = λ + λ² − λ² = λ.

3. Normal(0, 1). The moment-generating function is

$$
M(t) = E[e^{tX}] = \int_{-\infty}^{\infty}e^{tx}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx.
$$

Now −(x²/2) + tx = −(1/2)(x² − 2tx + t² − t²) = −(1/2)(x − t)² + (1/2)t², so

$$
M(t) = e^{t^2/2}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\exp[-(x - t)^2/2]\,dx.
$$

The integral is the area under a normal density (mean t, variance 1), which is 1. Consequently, M(t) = e^{t²/2}.
4. Normal(μ, σ²). If X is normal(μ, σ²), then Y = (X − μ)/σ is normal(0, 1). This is a good application of the density function method from Lecture 1:

$$
f_Y(y) = \frac{f_X(x)}{|dy/dx|}\bigg|_{x = \mu + \sigma y} = \sigma\,\frac{1}{\sqrt{2\pi}\,\sigma}e^{-y^2/2}.
$$

We have X = μ + σY, so

$$
M_X(t) = E[e^{tX}] = e^{t\mu}E[e^{t\sigma Y}] = e^{t\mu}M_Y(t\sigma).
$$

Thus M_X(t) = e^{tμ}e^{t²σ²/2}. Remember this technique, which is especially useful when Y = aX + b and the moment-generating function of X is known.

3.5 Theorem

If X is normal(μ, σ²) and Y = aX + b, then Y is normal(aμ + b, a²σ²).

Proof. We compute

$$
M_Y(t) = E[e^{tY}] = E[e^{t(aX+b)}] = e^{bt}M_X(at) = e^{bt}e^{at\mu}e^{a^2t^2\sigma^2/2}.
$$

Thus M_Y(t) = exp[t(aμ + b)] exp(t²a²σ²/2). ♣

Here is another basic result.

3.6 Theorem

Let X_1, ..., X_n be independent, with X_i normal(μ_i, σ_i²). Then Y = Σ_{i=1}^n X_i is normal with mean μ = Σ_{i=1}^n μ_i and variance σ² = Σ_{i=1}^n σ_i².

Proof. The moment-generating function of Y is

$$
M_Y(t) = \prod_{i=1}^{n}\exp(t\mu_i + t^2\sigma_i^2/2) = \exp(t\mu + t^2\sigma^2/2). \;♣
$$

A similar argument works for the Poisson distribution; see Problem 4.

3.7 The Gamma Distribution

First, we define the gamma function

$$
\Gamma(\alpha) = \int_0^{\infty}y^{\alpha-1}e^{-y}\,dy, \qquad \alpha > 0.
$$

We need three properties:
(a) Γ(α + 1) = αΓ(α), the recursion formula;
(b) Γ(n + 1) = n!, n = 0, 1, 2, ...;
(c) Γ(1/2) = √π.

To prove (a), integrate by parts: Γ(α) = ∫_0^∞ e^{−y} d(y^α/α). Part (b) is a special case of (a). For (c) we make the change of variable y = z²/2 and compute

$$
\Gamma(1/2) = \int_0^{\infty}y^{-1/2}e^{-y}\,dy = \int_0^{\infty}\sqrt{2}\,z^{-1}e^{-z^2/2}\,z\,dz.
$$

The second integral is 2√π times half the area under the normal(0, 1) density, that is, 2√π (1/2) = √π.

The gamma density is

$$
f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}x^{\alpha-1}e^{-x/\beta}, \qquad x > 0,
$$

where α and β are positive constants. The moment-generating function is

$$
M(t) = \int_0^{\infty}[\Gamma(\alpha)\beta^{\alpha}]^{-1}x^{\alpha-1}e^{tx}e^{-x/\beta}\,dx.
$$

Change variables via y = (−t + (1/β))x to get

$$
\int_0^{\infty}[\Gamma(\alpha)\beta^{\alpha}]^{-1}\Big(\frac{y}{-t + (1/\beta)}\Big)^{\alpha-1}e^{-y}\,\frac{dy}{-t + (1/\beta)},
$$

which reduces to

$$
\frac{1}{\beta^{\alpha}}\Big(\frac{\beta}{1 - \beta t}\Big)^{\alpha} = (1 - \beta t)^{-\alpha}.
$$

In this argument, t must be less than 1/β so that the integrals will be finite. Since M(0) = ∫_{−∞}^{∞} f(x) dx = ∫_0^∞ f(x) dx in this case, with f ≥ 0, M(0) = 1 implies that we have a legal probability density.

As before, moments can be calculated efficiently from the moment-generating function:

$$
E(X) = M'(0) = -\alpha(1 - \beta t)^{-\alpha-1}(-\beta)\big|_{t=0} = \alpha\beta;
$$
$$
E(X^2) = M''(0) = -\alpha(-\alpha - 1)(1 - \beta t)^{-\alpha-2}(-\beta)^2\big|_{t=0} = \alpha(\alpha + 1)\beta^2.
$$

Thus Var X = E(X²) − [E(X)]² = αβ².

3.8 Special Cases

The exponential density is a gamma density with α = 1: f(x) = (1/β)e^{−x/β}, x ≥ 0, with E(X) = β, E(X²) = 2β², Var X = β².
A random variable X has the chi-square density with r degrees of freedom (X = χ²(r) for short, where r is a positive integer) if its density is gamma with α = r/2 and β = 2. Thus

$$
f(x) = \frac{1}{\Gamma(r/2)2^{r/2}}x^{(r/2)-1}e^{-x/2}, \qquad x \ge 0,
$$

and

$$
M(t) = \frac{1}{(1 - 2t)^{r/2}}, \qquad t < 1/2.
$$

Therefore E[χ²(r)] = αβ = r and Var[χ²(r)] = αβ² = 2r.

3.9 Lemma

If X is normal(0, 1) then X² is χ²(1).

Proof. We compute the moment-generating function of X² directly:

$$
M_{X^2}(t) = E[e^{tX^2}] = \int_{-\infty}^{\infty}e^{tx^2}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx.
$$

Let y = √(1 − 2t) x; the integral becomes

$$
\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,\frac{dy}{\sqrt{1 - 2t}} = (1 - 2t)^{-1/2},
$$

which is the moment-generating function of χ²(1). ♣

3.10 Theorem

If X_1, ..., X_n are independent, each normal(0, 1), then Y = Σ_{i=1}^n X_i² is χ²(n).

Proof. By (3.9), each X_i² is χ²(1) with moment-generating function (1 − 2t)^{−1/2}. Thus M_Y(t) = (1 − 2t)^{−n/2} for t < 1/2, which is the moment-generating function of χ²(n). ♣

3.11 Another Method

Another way to find the density of Z = X + Y, where X and Y are independent random variables, is by the convolution formula

$$
f_Z(z) = \int_{-\infty}^{\infty}f_X(x)f_Y(z - x)\,dx = \int_{-\infty}^{\infty}f_Y(y)f_X(z - y)\,dy.
$$

To see this intuitively, reason as follows. The probability that Z lies near z (between z and z + dz) is f_Z(z) dz. Let us compute this in terms of X and Y. The probability that X lies near x is f_X(x) dx. Given that X lies near x, Z will lie near z if and only if Y lies near z − x, in other words, z − x ≤ Y ≤ z − x + dz. By independence of X and Y, this probability is f_Y(z − x) dz. Thus f_Z(z) is a sum of terms of the form f_X(x) dx f_Y(z − x) dz. Cancel the dz's and replace the sum by an integral to get the result. A formal proof can be given using Jacobians.
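Theorem 3.10 and the chi-square mean and variance formulas (αβ = r, αβ² = 2r) are easy to see in a simulation. A minimal illustrative sketch, assuming numpy and scipy are available; the degrees of freedom, sample size, and evaluation points are arbitrary choices.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n_df, n_samples = 5, 300_000

    # Y = sum of squares of n_df independent standard normals.
    y = (rng.normal(size=(n_samples, n_df)) ** 2).sum(axis=1)

    print("mean    :", y.mean(), "vs", n_df)        # E[chi2(r)] = r
    print("variance:", y.var(), "vs", 2 * n_df)     # Var[chi2(r)] = 2r

    # Compare a crude empirical density with the chi2(n_df) density at a few points.
    for point in (2.0, 5.0, 10.0):
        emp = np.mean(np.abs(y - point) < 0.1) / 0.2
        print(f"f({point}) empirical {emp:.4f} vs chi2 pdf {stats.chi2.pdf(point, df=n_df):.4f}")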
3.12 The Poisson Process

This process occurs in many physical situations, and provides an application of the gamma distribution. For example, particles can arrive at a counting device, customers at a serving counter, airplanes at an airport, or phone calls at a telephone exchange. Divide the time interval [0, t] into a large number n of small subintervals of length dt, so that n dt = t. If I_i, i = 1, ..., n, is one of the small subintervals, we make the following assumptions:

(1) The probability of exactly one arrival in I_i is λ dt, where λ is a constant.
(2) The probability of no arrivals in I_i is 1 − λ dt.
(3) The probability of more than one arrival in I_i is zero.
(4) If A_i is the event of an arrival in I_i, then the A_i, i = 1, ..., n, are independent.

As a consequence of these assumptions, we have n = t/dt Bernoulli trials with probability of success p = λ dt on a given trial. As dt → 0 we have n → ∞ and p → 0, with np = λt. We conclude that the number N[0, t] of arrivals in [0, t] is Poisson(λt):

$$
P\{N[0, t] = k\} = e^{-\lambda t}(\lambda t)^k/k!, \qquad k = 0, 1, 2, \ldots.
$$

Since E(N[0, t]) = λt, we may interpret λ as the average number of arrivals per unit time.

Now let W_1 be the waiting time for the first arrival. Then

$$
P\{W_1 > t\} = P\{\text{no arrival in } [0, t]\} = P\{N[0, t] = 0\} = e^{-\lambda t}, \qquad t \ge 0.
$$

Thus F_{W_1}(t) = 1 − e^{−λt} and f_{W_1}(t) = λe^{−λt}, t ≥ 0. From the formulas for the mean and variance of an exponential random variable we have E(W_1) = 1/λ and Var W_1 = 1/λ².

Let W_k be the (total) waiting time for the k-th arrival. Then W_k is the waiting time for the first arrival, plus the time after the first up to the second arrival, plus ···, plus the time after arrival k − 1 up to the k-th arrival. Thus W_k is the sum of k independent exponential random variables, and

$$
M_{W_k}(t) = \Big(\frac{1}{1 - (t/\lambda)}\Big)^{k},
$$

so W_k is gamma with α = k, β = 1/λ. Therefore

$$
f_{W_k}(t) = \frac{1}{(k-1)!}\lambda^k t^{k-1}e^{-\lambda t}, \qquad t \ge 0.
$$

(A short simulation of this process is sketched after the problems below.)

Problems

1. Let X_1 and X_2 be independent, and assume that X_1 is χ²(r_1) and Y = X_1 + X_2 is χ²(r), where r > r_1. Show that X_2 is χ²(r_2), where r_2 = r − r_1.

2. Let X_1 and X_2 be independent, with X_i gamma with parameters α_i and β_i, i = 1, 2. If c_1 and c_2 are positive constants, find convenient sufficient conditions under which c_1X_1 + c_2X_2 will also have the gamma distribution.

3. If X_1, ..., X_n are independent random variables with moment-generating functions M_1, ..., M_n, and c_1, ..., c_n are constants, express the moment-generating function M of c_1X_1 + ··· + c_nX_n in terms of the M_i.
4. If X_1, ..., X_n are independent, with X_i Poisson(λ_i), i = 1, ..., n, show that the sum Y = Σ_{i=1}^n X_i has the Poisson distribution with parameter λ = Σ_{i=1}^n λ_i.

5. An unbiased coin is tossed independently n_1 times and then again tossed independently n_2 times. Let X_1 be the number of heads in the first experiment, and X_2 the number of tails in the second experiment. Without using moment-generating functions, in fact without any calculation at all, find the distribution of X_1 + X_2.
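As promised after Section 3.12, here is a small simulation sketch of the Poisson process (added for illustration, assuming numpy and scipy; λ, t, k, and the number of runs are arbitrary choices). Arrivals are built from independent exponential gaps; the count in [0, t] should be Poisson(λt) and the k-th arrival time should be gamma with α = k, β = 1/λ.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    lam, t, k = 2.0, 3.0, 4
    n_runs = 200_000

    # Inter-arrival times are exponential with mean 1/lam; cumulative sums are the arrival times W_1, W_2, ...
    gaps = rng.exponential(1.0 / lam, size=(n_runs, 40))
    arrivals = gaps.cumsum(axis=1)

    counts = (arrivals <= t).sum(axis=1)          # N[0, t]  (40 gaps is far more than enough for t = 3)
    wk = arrivals[:, k - 1]                       # k-th waiting time W_k

    print("E N[0,t]  :", counts.mean(), "vs", lam * t)
    print("Var N[0,t]:", counts.var(), "vs", lam * t)
    print("P{N = 5}  :", np.mean(counts == 5), "vs", stats.poisson.pmf(5, lam * t))
    print("E W_k     :", wk.mean(), "vs", k / lam)
    print("Var W_k   :", wk.var(), "vs", k / lam**2)
    print("P{W_k<=2} :", np.mean(wk <= 2.0), "vs", stats.gamma.cdf(2.0, a=k, scale=1.0 / lam))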
Lecture 4. Sampling From a Normal Population

4.1 Definitions and Comments

Let X_1, ..., X_n be iid. The sample mean of the X_i is

$$
\overline{X} = \frac{1}{n}\sum_{i=1}^{n}X_i
$$

and the sample variance is

$$
S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \overline{X})^2.
$$

If the X_i have mean μ and variance σ², then

$$
E(\overline{X}) = \frac{1}{n}\sum_{i=1}^{n}E(X_i) = \frac{1}{n}n\mu = \mu
$$

and

$$
\operatorname{Var}\overline{X} = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}X_i = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \to 0 \quad\text{as } n \to \infty.
$$

Thus X̄ is a good estimate of μ. (For large n, the variance of X̄ is small, so X̄ is concentrated near its mean.)

The sample variance is an average squared deviation from the sample mean, but it is a biased estimate of the true variance σ²:

$$
E[(X_i - \overline{X})^2] = E[(X_i - \mu) - (\overline{X} - \mu)]^2 = \operatorname{Var}X_i + \operatorname{Var}\overline{X} - 2E[(X_i - \mu)(\overline{X} - \mu)].
$$

Notice the centralizing technique. We subtract and add back the mean of X_i, which will make the cross terms easier to handle when squaring. The above expression simplifies to

$$
\sigma^2 + \frac{\sigma^2}{n} - 2E\Big[(X_i - \mu)\frac{1}{n}\sum_{j=1}^{n}(X_j - \mu)\Big] = \sigma^2 + \frac{\sigma^2}{n} - \frac{2}{n}E[(X_i - \mu)^2].
$$

Thus

$$
E[(X_i - \overline{X})^2] = \sigma^2\Big(1 + \frac{1}{n} - \frac{2}{n}\Big) = \frac{n-1}{n}\sigma^2.
$$

Consequently, E(S²) = (n − 1)σ²/n, not σ². Some books define the sample variance as

$$
\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \overline{X})^2 = \frac{n}{n-1}S^2,
$$

where S² is our sample variance. This adjusted estimate of the true variance is unbiased (its expectation is σ²), but biased does not mean bad. If we measure performance by asking for a small mean square error, the biased estimate is better in the normal case, as we will see at the end of the lecture.
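The bias E(S²) = (n − 1)σ²/n is visible in a few lines of simulation. A minimal illustrative sketch, assuming numpy is available; the particular n, μ, σ, and number of repetitions are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(5)
    n, mu, sigma = 5, 1.0, 2.0
    reps = 200_000

    x = rng.normal(mu, sigma, size=(reps, n))
    s2 = x.var(axis=1, ddof=0)            # S^2 with divisor n (the definition used in these notes)
    s2_adj = x.var(axis=1, ddof=1)        # divisor n-1 (the unbiased version)

    print("E[S^2]      :", s2.mean(), "vs (n-1)/n * sigma^2 =", (n - 1) / n * sigma**2)
    print("E[adjusted] :", s2_adj.mean(), "vs sigma^2          =", sigma**2)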
4.2 The Normal Case

We now assume that the X_i are normally distributed, and find the distribution of S². Let

y_1 = x̄ = (x_1 + ··· + x_n)/n, y_2 = x_2 − x̄, ..., y_n = x_n − x̄.

Then y_1 + y_2 = x_2, y_1 + y_3 = x_3, ..., y_1 + y_n = x_n. Add these equations to get (n − 1)y_1 + y_2 + ··· + y_n = x_2 + ··· + x_n, or

$$
ny_1 + (y_2 + \cdots + y_n) = (x_2 + \cdots + x_n) + y_1. \tag{1}
$$

But ny_1 = nx̄ = x_1 + ··· + x_n, so by cancelling x_2, ..., x_n in (1), x_1 + (y_2 + ··· + y_n) = y_1. Thus we can solve for the x's in terms of the y's:

$$
x_1 = y_1 - y_2 - \cdots - y_n,\quad x_2 = y_1 + y_2,\quad x_3 = y_1 + y_3,\quad \ldots,\quad x_n = y_1 + y_n. \tag{2}
$$

The Jacobian of the transformation is

$$
d_n = \frac{\partial(x_1, \ldots, x_n)}{\partial(y_1, \ldots, y_n)} =
\begin{vmatrix}
1 & -1 & -1 & \cdots & -1\\
1 & 1 & 0 & \cdots & 0\\
1 & 0 & 1 & \cdots & 0\\
\vdots & & & & \vdots\\
1 & 0 & 0 & \cdots & 1
\end{vmatrix}.
$$

To see the pattern, look at the 4 by 4 case and expand via the last row:

$$
\begin{vmatrix}
1 & -1 & -1 & -1\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1
\end{vmatrix}
= (-1)
\begin{vmatrix}
-1 & -1 & -1\\
1 & 0 & 0\\
0 & 1 & 0
\end{vmatrix}
+
\begin{vmatrix}
1 & -1 & -1\\
1 & 1 & 0\\
1 & 0 & 1
\end{vmatrix},
$$

so d_4 = 1 + d_3. In general, d_n = 1 + d_{n−1}, and since d_2 = 2 by inspection, we have d_n = n for all n ≥ 2.

Now

$$
\sum_{i=1}^{n}(x_i - \mu)^2 = \sum_{i=1}^{n}(x_i - \overline{x} + \overline{x} - \mu)^2 = \sum_{i=1}^{n}(x_i - \overline{x})^2 + n(\overline{x} - \mu)^2 \tag{3}
$$

because Σ(x_i − x̄) = 0. By (2), x_1 − x̄ = x_1 − y_1 = −y_2 − ··· − y_n and x_i − x̄ = x_i − y_1 = y_i for i = 2, ..., n. (Remember that y_1 = x̄.) Thus

$$
\sum_{i=1}^{n}(x_i - \overline{x})^2 = (-y_2 - \cdots - y_n)^2 + \sum_{i=2}^{n}y_i^2. \tag{4}
$$

Now f_{Y_1···Y_n}(y_1, ..., y_n) = n f_{X_1···X_n}(x_1, ..., x_n).
By (3) and (4), the right side becomes, in terms of the y_i's,

$$
n\Big(\frac{1}{\sqrt{2\pi}\,\sigma}\Big)^{n}\exp\Big\{-\frac{1}{2\sigma^2}\Big[\Big(\sum_{i=2}^{n}y_i\Big)^2 + \sum_{i=2}^{n}y_i^2 + n(y_1 - \mu)^2\Big]\Big\}.
$$

The joint density of Y_1, ..., Y_n is a function of y_1 times a function of (y_2, ..., y_n), so Y_1 and (Y_2, ..., Y_n) are independent. Since X̄ = Y_1 and [by (4)] S² is a function of (Y_2, ..., Y_n), X̄ and S² are independent.

Dividing Equation (3) by σ² we have

$$
\sum_{i=1}^{n}\Big(\frac{X_i - \mu}{\sigma}\Big)^2 = \frac{nS^2}{\sigma^2} + \Big(\frac{\overline{X} - \mu}{\sigma/\sqrt{n}}\Big)^2.
$$

But (X_i − μ)/σ is normal(0, 1) and (X̄ − μ)/(σ/√n) is normal(0, 1), so

$$
\chi^2(n) = \frac{nS^2}{\sigma^2} + \chi^2(1)
$$

with the two random variables on the right independent. If M(t) is the moment-generating function of nS²/σ², then (1 − 2t)^{−n/2} = M(t)(1 − 2t)^{−1/2}. Therefore M(t) = (1 − 2t)^{−(n−1)/2}, i.e.,

$$
\frac{nS^2}{\sigma^2}\ \text{is}\ \chi^2(n - 1).
$$

The random variable

$$
T = \frac{\overline{X} - \mu}{S/\sqrt{n - 1}}
$$

is useful in situations where μ is to be estimated but the true variance σ² is unknown. It turns out that T has a "T distribution", which we study in the next lecture.

4.3 Performance of Various Estimates

Let S² be the sample variance of iid normal(μ, σ²) random variables X_1, ..., X_n. We will look at estimates of σ² of the form cS², where c is a constant. Once again employing the centralizing technique, we write

$$
E[(cS^2 - \sigma^2)^2] = E[(cS^2 - cE(S^2) + cE(S^2) - \sigma^2)^2],
$$

which simplifies to

$$
c^2\operatorname{Var}S^2 + (cE(S^2) - \sigma^2)^2.
$$
Since nS²/σ² is χ²(n − 1), which has variance 2(n − 1), we have n²(Var S²)/σ⁴ = 2(n − 1). Also nE(S²)/σ² is the mean of χ²(n − 1), which is n − 1. (Or we can recall from (4.1) that E(S²) = (n − 1)σ²/n.) Thus the mean square error is

$$
c^2\,\frac{2\sigma^4(n-1)}{n^2} + \Big(c\,\frac{n-1}{n}\sigma^2 - \sigma^2\Big)^2.
$$

We can drop the σ⁴ and use n² as a common denominator, which can also be dropped. We are then trying to minimize

$$
c^2\,2(n-1) + c^2(n-1)^2 - 2c(n-1)n + n^2.
$$

Differentiate with respect to c and set the result equal to zero:

$$
4c(n-1) + 2c(n-1)^2 - 2(n-1)n = 0.
$$

Dividing by 2(n − 1), we have 2c + c(n − 1) − n = 0, so c = n/(n + 1). Thus the best estimate of the form cS² is

$$
\frac{1}{n+1}\sum_{i=1}^{n}(X_i - \overline{X})^2.
$$

If we use S² then c = 1. If we use the unbiased version then c = n/(n − 1). Since [n/(n + 1)] < 1 < [n/(n − 1)] and a quadratic function decreases as we move toward its minimum, we see that the biased estimate S² is better than the unbiased estimate nS²/(n − 1), but neither is optimal under the minimum mean square error criterion. Explicitly, when c = n/(n − 1) we get a mean square error of 2σ⁴/(n − 1), and when c = 1 we get

$$
\frac{\sigma^4}{n^2}\big[2(n-1) + (n - 1 - n)^2\big] = \frac{(2n-1)\sigma^4}{n^2},
$$

which is always smaller, because [(2n − 1)/n²] < 2/(n − 1) iff 2n² > 2n² − 3n + 1 iff 3n > 1, which is true for every positive integer n. For large n all these estimates are good and the difference between their performance is small. (A simulation sketch that checks the χ²(n − 1) result and compares these estimates appears after the problems below.)

Problems

1. Let X_1, ..., X_n be iid, each normal(μ, σ²), and let X̄ be the sample mean. If c is a constant, we wish to make n large enough so that P{μ − c < X̄ < μ + c} ≥ .954. Find the minimum value of n in terms of σ² and c. (It is independent of μ.)

2. Let X_1, ..., X_{n_1}, Y_1, ..., Y_{n_2} be independent random variables, with the X_i normal(μ_1, σ_1²) and the Y_i normal(μ_2, σ_2²). If X̄ is the sample mean of the X_i and Ȳ is the sample mean of the Y_i, explain how to compute the probability that X̄ > Ȳ.

3. Let X_1, ..., X_n be iid, each normal(μ, σ²), and let S² be the sample variance. Explain how to compute P{a < S² < b}.

4. Let S² be the sample variance of iid normal(μ, σ²) random variables X_i, i = 1, ..., n. Calculate the moment-generating function of S² and from this, deduce that S² has a gamma distribution.
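As noted at the end of Section 4.3, the two main facts of this lecture, that nS²/σ² is χ²(n − 1) with X̄ independent of S², and that c = n/(n + 1) minimizes the mean square error of cS², can be checked by simulation. A minimal illustrative sketch, assuming numpy and scipy are available; the particular n, μ, σ are arbitrary choices.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n, mu, sigma = 6, 1.0, 2.0
    reps = 300_000

    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=0)              # S^2 with divisor n, as defined in these notes
    q = n * s2 / sigma**2                   # should be chi2(n-1)

    print("E[q]          :", q.mean(), "vs", n - 1)
    print("Var[q]        :", q.var(), "vs", 2 * (n - 1))
    print("P{q <= 3}     :", np.mean(q <= 3.0), "vs", stats.chi2.cdf(3.0, df=n - 1))
    print("corr(xbar, S2):", np.corrcoef(xbar, s2)[0, 1])   # near 0, consistent with independence

    # Mean square error of c*S^2 for the three choices of c discussed in Section 4.3.
    for label, c in [("c = n/(n+1)", n / (n + 1)), ("c = 1", 1.0), ("c = n/(n-1)", n / (n - 1))]:
        print(f"{label:12s}: MSE = {np.mean((c * s2 - sigma**2) ** 2):.4f}")
    # Theory: c=1 gives (2n-1)sigma^4/n^2 and c=n/(n-1) gives 2 sigma^4/(n-1); c=n/(n+1) is smallest.
    print("theory c=1      :", (2 * n - 1) * sigma**4 / n**2)
    print("theory c=n/(n-1):", 2 * sigma**4 / (n - 1))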
Lecture 5. The T and F Distributions

5.1 Definition and Discussion

The T distribution is defined as follows. Let X_1 and X_2 be independent, with X_1 normal(0, 1) and X_2 chi-square with r degrees of freedom. The random variable Y_1 = √r X_1/√X_2 has the T distribution with r degrees of freedom. To find the density of Y_1, let Y_2 = X_2. Then X_1 = Y_1√Y_2/√r and X_2 = Y_2. The transformation is one-to-one with −∞ < X_1 < ∞, X_2 > 0 ⟺ −∞ < Y_1 < ∞, Y_2 > 0. The Jacobian is given by

$$
\frac{\partial(x_1, x_2)}{\partial(y_1, y_2)} =
\begin{vmatrix}
\sqrt{y_2/r} & y_1/(2\sqrt{ry_2})\\
0 & 1
\end{vmatrix}
= \sqrt{y_2/r}.
$$

Thus f_{Y_1Y_2}(y_1, y_2) = f_{X_1X_2}(x_1, x_2)√(y_2/r), which upon substitution for x_1 and x_2 becomes

$$
\frac{1}{\sqrt{2\pi}}\exp[-y_1^2y_2/2r]\,\frac{1}{\Gamma(r/2)2^{r/2}}\,y_2^{(r/2)-1}e^{-y_2/2}\,\sqrt{y_2/r}.
$$

The density of Y_1 is

$$
\frac{1}{\sqrt{2\pi}\,\Gamma(r/2)2^{r/2}}\int_0^{\infty}y_2^{[(r+1)/2]-1}\exp[-(1 + (y_1^2/r))y_2/2]\,dy_2/\sqrt{r}.
$$

With z = (1 + (y_1²/r))y_2/2 and the observation that all factors of 2 cancel, this becomes (with y_1 replaced by t)

$$
\frac{\Gamma((r+1)/2)}{\sqrt{r\pi}\,\Gamma(r/2)}\,\frac{1}{(1 + (t^2/r))^{(r+1)/2}}, \qquad -\infty < t < \infty,
$$

the T density with r degrees of freedom.

In sampling from a normal population, (X̄ − μ)/(σ/√n) is normal(0, 1), and nS²/σ² is χ²(n − 1). Thus

$$
\sqrt{n-1}\,\frac{\overline{X} - \mu}{\sigma/\sqrt{n}}\quad\text{divided by}\quad\sqrt{\frac{nS^2}{\sigma^2}}
$$

is T(n − 1). Since σ and √n disappear after cancellation, we have

$$
\frac{\overline{X} - \mu}{S/\sqrt{n-1}}\ \text{is}\ T(n-1).
$$

Advocates of defining the sample variance with n − 1 in the denominator point out that one can simply replace σ by S in (X̄ − μ)/(σ/√n) to get the T statistic. Intuitively, we expect that for large n, (X̄ − μ)/(S/√(n − 1)) has approximately the same distribution as (X̄ − μ)/(σ/√n), i.e., normal(0, 1). This is in fact true, as suggested by the following computation:

$$
\Big(1 + \frac{t^2}{r}\Big)^{(r+1)/2} = \Big(1 + \frac{t^2}{r}\Big)^{r/2}\Big(1 + \frac{t^2}{r}\Big)^{1/2} \to \sqrt{e^{t^2}}\times 1 = e^{t^2/2} \quad\text{as } r \to \infty.
$$
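The last two claims are easy to check numerically. The following is an illustrative sketch only, assuming numpy and scipy are available (sample sizes and evaluation points are arbitrary): the statistic (X̄ − μ)/(S/√(n − 1)) computed from normal samples should follow T(n − 1), and for large r the T(r) density is close to the standard normal density.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    n, mu, sigma = 8, 3.0, 2.0
    reps = 200_000

    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s = np.sqrt(x.var(axis=1, ddof=0))            # S with divisor n, as in these notes
    t_stat = (xbar - mu) / (s / np.sqrt(n - 1))

    for c in (1.0, 2.0):
        print(f"P{{T<={c}}}: simulated {np.mean(t_stat <= c):.4f} vs T({n-1}) cdf {stats.t.cdf(c, df=n-1):.4f}")

    # T(r) approaches normal(0,1) as r grows.
    for r in (5, 30, 200):
        print(f"r={r:3d}: t pdf(1.5) = {stats.t.pdf(1.5, df=r):.4f}, normal pdf(1.5) = {stats.norm.pdf(1.5):.4f}")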
5.2 A Preliminary Calculation

Before turning to the F distribution, we calculate the density of U = X_1/X_2, where X_1 and X_2 are independent, positive random variables. Let Y = X_2, so that X_1 = UY, X_2 = Y (X_1, X_2, U, Y are all greater than zero). The Jacobian is

$$
\frac{\partial(x_1, x_2)}{\partial(u, y)} =
\begin{vmatrix}
y & u\\
0 & 1
\end{vmatrix}
= y.
$$

Thus f_{UY}(u, y) = f_{X_1X_2}(x_1, x_2)y = f_{X_1}(uy)f_{X_2}(y), and the density of U is

$$
h(u) = \int_0^{\infty}y\,f_{X_1}(uy)f_{X_2}(y)\,dy.
$$

Now we take X_1 to be χ²(m) and X_2 to be χ²(n). The density of X_1/X_2 is

$$
h(u) = \frac{1}{2^{(m+n)/2}\Gamma(m/2)\Gamma(n/2)}\,u^{(m/2)-1}\int_0^{\infty}y^{[(m+n)/2]-1}e^{-y(1+u)/2}\,dy.
$$

The substitution z = y(1 + u)/2 gives

$$
h(u) = \frac{1}{2^{(m+n)/2}\Gamma(m/2)\Gamma(n/2)}\,u^{(m/2)-1}\int_0^{\infty}\frac{z^{[(m+n)/2]-1}}{[(1 + u)/2]^{[(m+n)/2]-1}}\,e^{-z}\,\frac{2}{1+u}\,dz.
$$

We abbreviate Γ(a)Γ(b)/Γ(a + b) by β(a, b). (We will have much more to say about this when we discuss the beta distribution later in the lecture.) The above formula simplifies to

$$
h(u) = \frac{1}{\beta(m/2, n/2)}\,\frac{u^{(m/2)-1}}{(1 + u)^{(m+n)/2}}, \qquad u \ge 0.
$$

5.3 Definition and Discussion

The F density is defined as follows. Let X_1 and X_2 be independent, with X_1 = χ²(m) and X_2 = χ²(n). With U as in (5.2), let

$$
W = \frac{X_1/m}{X_2/n} = \frac{n}{m}U,
$$

so that

$$
f_W(w) = f_U(u)\left|\frac{du}{dw}\right| = \frac{m}{n}f_U\Big(\frac{m}{n}w\Big).
$$

Thus W has density

$$
\frac{(m/n)^{m/2}}{\beta(m/2, n/2)}\,\frac{w^{(m/2)-1}}{[1 + (m/n)w]^{(m+n)/2}}, \qquad w \ge 0,
$$

the F density with m and n degrees of freedom.
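A quick check of this construction (illustrative sketch, assuming numpy and scipy are available; the degrees of freedom and evaluation points are arbitrary): form W = (X_1/m)/(X_2/n) from independent chi-square variables and compare with scipy's F distribution.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)
    m, n_df, reps = 4, 10, 300_000

    x1 = rng.chisquare(m, reps)
    x2 = rng.chisquare(n_df, reps)
    w = (x1 / m) / (x2 / n_df)

    for c in (0.5, 1.0, 2.0):
        print(f"P{{W<={c}}}: simulated {np.mean(w <= c):.4f} vs F({m},{n_df}) cdf {stats.f.cdf(c, dfn=m, dfd=n_df):.4f}")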
5.4 Definitions and Calculations

The beta function is given by

$$
\beta(a, b) = \int_0^1 x^{a-1}(1 - x)^{b-1}\,dx, \qquad a, b > 0.
$$

We will show that

$$
\beta(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)},
$$

which is consistent with our use of β(a, b) as an abbreviation in (5.2). We make the change of variable t = x² to get

$$
\Gamma(a) = \int_0^{\infty}t^{a-1}e^{-t}\,dt = 2\int_0^{\infty}x^{2a-1}e^{-x^2}\,dx.
$$

We now use the familiar trick of writing Γ(a)Γ(b) as a double integral and switching to polar coordinates. Thus

$$
\Gamma(a)\Gamma(b) = 4\int_0^{\infty}\int_0^{\infty}x^{2a-1}y^{2b-1}e^{-(x^2+y^2)}\,dx\,dy
= 4\int_0^{\pi/2}d\theta\int_0^{\infty}(\cos\theta)^{2a-1}(\sin\theta)^{2b-1}e^{-r^2}r^{2a+2b-1}\,dr.
$$

The change of variable u = r² yields

$$
\int_0^{\infty}r^{2a+2b-1}e^{-r^2}\,dr = \frac{1}{2}\int_0^{\infty}u^{a+b-1}e^{-u}\,du = \frac{\Gamma(a+b)}{2}.
$$

Thus

$$
\frac{\Gamma(a)\Gamma(b)}{2\Gamma(a+b)} = \int_0^{\pi/2}(\cos\theta)^{2a-1}(\sin\theta)^{2b-1}\,d\theta.
$$

Let z = cos²θ, 1 − z = sin²θ, dz = −2 cos θ sin θ dθ = −2z^{1/2}(1 − z)^{1/2} dθ. The above integral becomes

$$
-\frac{1}{2}\int_1^0 z^{a-1}(1 - z)^{b-1}\,dz = \frac{1}{2}\int_0^1 z^{a-1}(1 - z)^{b-1}\,dz = \frac{1}{2}\beta(a, b),
$$

as claimed.

The beta density is

$$
f(x) = \frac{1}{\beta(a, b)}x^{a-1}(1 - x)^{b-1}, \qquad 0 \le x \le 1 \quad (a, b > 0).
$$
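A short numerical confirmation of the identity β(a, b) = Γ(a)Γ(b)/Γ(a + b) (illustrative sketch, assuming scipy is installed; the values of a and b are arbitrary):

    from scipy.special import beta, gamma
    from scipy.integrate import quad

    a, b = 2.5, 4.0
    integral, _ = quad(lambda x: x**(a - 1) * (1 - x)**(b - 1), 0.0, 1.0)
    print(integral, beta(a, b), gamma(a) * gamma(b) / gamma(a + b))   # all three agree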
Problems

1. Let X have the beta distribution with parameters a and b. Find the mean and variance of X.

2. Let T have the T distribution with 15 degrees of freedom. Find the value of c which makes P{−c ≤ T ≤ c} = .95.

3. Let W have the F distribution with m and n degrees of freedom (abbreviated W = F(m, n)). Find the distribution of 1/W.

4. A typical table of the F distribution gives the values of c for which P{W ≤ c} = .9, .95, .975 and .99. Explain how to find c such that P{W ≤ c} = .1, .05, .025 and .01. (Use the result of Problem 3.)

5. Let X have the T distribution with n degrees of freedom (abbreviated X = T(n)). Show that T²(n) = F(1, n), in other words, T² has an F distribution with 1 and n degrees of freedom.

6. If X has the exponential density e^{−x}, x ≥ 0, show that 2X is χ²(2). Deduce that the quotient of two independent exponential random variables is F(2, 2).