Idea: A sufficient statistic compresses data without losing information about the parameter
Definition
A statistic $U=U(X_1, ... ,X_n)$ is a sufficient statistic for $\theta$ if the conditional distribution of $X_1,...,X_n$ given $U$ does not depend on $\theta$.
Any one-to-one function of a sufficient statistic is a sufficient statistic.
Any statistic from which a sufficient statistic can be calculated is also a sufficient statistic.
$\exists$ many possible SSs ⇒ seek the MSS (minimal sufficient statistic), the one that compresses the data the most.
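Example (Bernoulli, a standard check of the definition): for $X_1,...,X_n \overset{iid}{\sim} Bernoulli(p)$ and $U=\sum_{i=1}^n X_i$,
$$ P(X_1=x_1,...,X_n=x_n \mid U=u) = \frac{p^u (1-p)^{n-u}}{\binom{n}{u} p^u (1-p)^{n-u}} = \frac{1}{\binom{n}{u}}, $$
which is free of $p$, so $U$ is sufficient.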
Definition (Likelihood function)
$$ L(\theta|x_1,...,x_n) = f(x_1|\theta) \times ... \times f(x_n|\theta) $$
Likelihood = the joint probability / density function of $X_1,...,X_n$, but with a different viewpoint!
Likelihood → a function of the parameter with the data fixed / probability or density → a function of the R.V.s with the parameter fixed
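A minimal numerical sketch of this viewpoint (not from the notes; the Bernoulli model, data, and names are illustrative assumptions):

```python
import numpy as np

# Log-likelihood of iid Bernoulli(p) data: the data x is held fixed
# while the parameter p varies over a grid.
def bernoulli_loglik(p, x):
    s = np.sum(x)
    return s * np.log(p) + (len(x) - s) * np.log(1 - p)

x = np.array([1, 0, 1, 1, 0])            # observed data (fixed)
grid = np.linspace(0.01, 0.99, 99)       # candidate values of p
loglik = np.array([bernoulli_loglik(p, x) for p in grid])
print(grid[np.argmax(loglik)])           # maximizer ≈ sample mean 0.6
```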
Factorization Theorem
$U=U(X_1, ..., X_n)$ is a sufficient statistic for $\theta$ iff the likelihood can be written in the form
$$ L(\theta)=g(u(x_1,...,x_n), \theta) h(x_1,...,x_n) $$
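Example (Poisson, a standard application): for $X_1,...,X_n \overset{iid}{\sim} Poisson(\theta)$,
$$ L(\theta)=\prod_{i=1}^n \frac{e^{-\theta}\theta^{x_i}}{x_i!} = \underbrace{e^{-n\theta}\theta^{\sum x_i}}_{g(u,\ \theta)} \times \underbrace{\frac{1}{\prod_{i=1}^n x_i!}}_{h(x_1,...,x_n)}, $$
so $U=\sum_{i=1}^n X_i$ is sufficient for $\theta$.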
Definition
$U=U(X_1,...,X_n)$ is a minimal sufficient statistic (MSS) if
(a) $U$ is a sufficient statistic and
(b) $U$ compresses the data at least as much as any other SS: If $V$ is any other SS, $U$ is a function of $V$.
Theorem (Lehmann-Scheffé)
$$ \frac{L(\theta|x_1,...,x_n)}{L(\theta|y_1,...,y_n)} \text{ is free of } \theta \iff g(x_1,...,x_n)=g(y_1,...,y_n) $$
If a function $g$ can be found such that the above ratio is free of the unknown parameter $\theta$ iff $g(x_1,...,x_n)=g(y_1,...,y_n)$, then $g(X_1,...,X_n)$ is a minimal sufficient statistic for $\theta$.
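Example ($N(\mu,1)$, a standard application): the ratio
$$ \frac{L(\mu|x_1,...,x_n)}{L(\mu|y_1,...,y_n)} = \exp\left( \mu \left( \sum x_i - \sum y_i \right) - \frac{1}{2} \left( \sum x_i^2 - \sum y_i^2 \right) \right) $$
is free of $\mu$ iff $\sum x_i = \sum y_i$, so $\sum X_i$ (equivalently $\bar X$, a one-to-one function of it) is an MSS for $\mu$.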
Definition (Consistency)
$\hat\theta_n$ based on $X_1,...,X_n$ is consistent for $\theta$ if $\hat\theta_n \overset{p}{\to} \theta$ as $n\rightarrow\infty$ for all values of $\theta$.
Tool 1: WLLN
$$ \bar X_n = \frac{1}{n} \sum_{i=1}^n X_i \overset{p}{\to} \mu=E(X_i) $$
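A minimal simulation sketch (not from the notes; the Exponential(1) model and names are illustrative assumptions):

```python
import numpy as np

# WLLN: running means of iid Exponential(1) draws settle near mu = E(X_i) = 1.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in (10, 1_000, 100_000):
    print(n, running_mean[n - 1])   # drifts toward 1 as n grows
```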
Tool 2: Theorems on Limiting distributions
Suppose $W_n \overset{p}{\to} a$ and $V_n \overset{p}{\to} b$. Then $W_n \pm V_n \overset{p}{\to} a \pm b$, $W_n V_n \overset{p}{\to} ab$, and $W_n / V_n \overset{p}{\to} a/b$ (provided $b \neq 0$).
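Example (a standard application of Tools 1-2): by the WLLN, $\frac{1}{n}\sum_{i=1}^n X_i^2 \overset{p}{\to} E(X_i^2)$ and $\bar X_n \overset{p}{\to} \mu$, so by the product rule $\bar X_n^2 \overset{p}{\to} \mu^2$ and
$$ \hat\sigma^2_n = \frac{1}{n}\sum_{i=1}^n X_i^2 - \bar X_n^2 \overset{p}{\to} E(X_i^2) - \mu^2 = \sigma^2, $$
i.e. the (uncorrected) sample variance is consistent for $\sigma^2$.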
Tool 3:
If $\hat\theta_n$ is an unbiased estimator (UE) of $\theta$ and $V(\hat\theta_n) \rightarrow 0$, then $\hat\theta_n$ is consistent for $\theta$.
Proof by Chebyshev's inequality: for any $\varepsilon > 0$, $P(|\hat\theta_n - \theta| \geq \varepsilon) \leq V(\hat\theta_n)/\varepsilon^2 \rightarrow 0$.
Theorem (CLT)
$$ Z_n=\frac{\bar X_n-\mu}{\sigma/\sqrt n} \overset{D}{\to} N(0,1) $$
meaning that the cdf of $Z_n$ converges pointwise to the cdf of $N(0,1)$
($\mu=E(X_i)$, $\sigma=\sqrt{Var(X_i)}$)
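A minimal simulation sketch (not from the notes; the Exponential(1) model, $n$, and names are illustrative assumptions):

```python
import numpy as np

# CLT: standardized means of iid Exponential(1) samples (mu = 1, sigma = 1)
# should match N(0,1) tail probabilities for moderately large n.
rng = np.random.default_rng(1)
n, reps = 50, 20_000
samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))
print(np.mean(z > 1.96))   # roughly 0.025, the N(0,1) upper-tail probability
```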
Continuous Mapping Theorem
If $Y_n \overset{D}{\to} Y$, then $h(Y_n) \overset{D}{\to}h(Y)$ for any continuous function $h$.
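Example (a standard consequence, combining with the CLT): $h(z)=z^2$ is continuous, so $Z_n^2 \overset{D}{\to} \chi^2_1$, the distribution of the square of a $N(0,1)$ random variable.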