Idea: A sufficient statistic compresses the data without losing information about the parameter.
Definition (Sufficient statistic)
A statistic $U=U(X_1, ... ,X_n)$ is a sufficient statistic for $\theta$ if the conditional distribution of $X_1,...,X_n$ given $U$ does not depend on $\theta$.
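E.g. (a standard illustration): let $X_1,...,X_n \overset{iid}{\sim} \text{Bernoulli}(p)$ and $U=\sum_{i=1}^n X_i$. Then
$$ P(X_1=x_1,...,X_n=x_n \mid U=u) = \frac{p^u (1-p)^{n-u}}{\binom{n}{u} p^u (1-p)^{n-u}} = \binom{n}{u}^{-1} $$
which does not involve $p$, so $U$ is sufficient for $p$.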
Any 1-1 function of a sufficient statistic is a sufficient statistic.
Any statistic from which a sufficient statistic can be calculated is also a sufficient statistic.
$\exists$ many possible SS's for the same model ⇒ MSS (minimal sufficient statistic)
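E.g., in the Bernoulli illustration above, the chain
$$ (X_1,...,X_n) \;\to\; (X_{(1)},...,X_{(n)}) \;\to\; \textstyle\sum_{i=1}^n X_i \;\leftrightarrow\; \bar{X} $$
lists progressively coarser sufficient statistics: each is a function of the previous one, $\bar{X}$ is a 1-1 function of $\sum X_i$, and $\sum X_i$ (equivalently $\bar{X}$) turns out to be the MSS.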
Definition (Likelihood function)
$$ L(\theta|x_1,...,x_n) = f(x_1|\theta) \times ... \times f(x_n|\theta) $$
Likelihood = the joint probability / density function of $X_1,...,X_n$, but with a different viewpoint!
Likelihood → focus on the parameter $\theta$ (data held fixed) / prob or density → focus on the R.V.s ($\theta$ held fixed)
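E.g., for $X_1,...,X_n \overset{iid}{\sim} N(\mu, 1)$ (a standard illustration):
$$ L(\mu|x_1,...,x_n) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}} e^{-(x_i-\mu)^2/2} = (2\pi)^{-n/2} \exp\left( -\frac{1}{2} \sum_{i=1}^n (x_i-\mu)^2 \right) $$
Read this as a function of $\mu$, with the observed $x_1,...,x_n$ held fixed.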
Factorization Theorem
$U=U(X_1, ..., X_n)$ is a sufficient statistic for $\theta$ iff the likelihood can be written in the form
$$ L(\theta)=g(u(x_1,...,x_n), \theta) h(x_1,...,x_n) $$
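E.g., for $X_1,...,X_n \overset{iid}{\sim} \text{Poisson}(\theta)$ (a standard illustration):
$$ L(\theta) = \prod_{i=1}^n \frac{\theta^{x_i} e^{-\theta}}{x_i!} = \underbrace{\theta^{\sum x_i} e^{-n\theta}}_{g(u,\theta),\ u=\sum x_i} \cdot \underbrace{\left( \prod_{i=1}^n x_i! \right)^{-1}}_{h(x_1,...,x_n)} $$
so $U=\sum_{i=1}^n X_i$ is sufficient for $\theta$.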
Definition (Minimal sufficient statistic)
$U=U(X_1,...,X_n)$ is a minimal sufficient statistic (MSS) if
(a) $U$ is a sufficient statistic and
(b) $U$ compresses the data at least as much as any other SS: If $V$ is any other SS, $U$ is a function of $V$.
Theorem (Lehmann-Scheffé)
$$ \frac{L(\theta|x_1,...,x_n)}{L(\theta|y_1,...,y_n)} \text{ is free of } \theta \iff g(x_1,...,x_n)=g(y_1,...,y_n) $$
That is, the likelihood ratio of two samples is free of the unknown parameter $\theta$ iff $g(x_1,...,x_n)=g(y_1,...,y_n)$. If such a function $g$ can be found, then $g(X_1,...,X_n)$ is a minimal sufficient statistic for $\theta$.
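E.g., for $X_1,...,X_n \overset{iid}{\sim} \text{Exponential}(\theta)$ with rate $\theta$ (a standard illustration):
$$ \frac{L(\theta|x_1,...,x_n)}{L(\theta|y_1,...,y_n)} = \frac{\theta^n e^{-\theta \sum x_i}}{\theta^n e^{-\theta \sum y_i}} = e^{-\theta \left( \sum x_i - \sum y_i \right)} $$
This ratio is free of $\theta$ iff $\sum x_i = \sum y_i$, so $g(X_1,...,X_n)=\sum_{i=1}^n X_i$ is a minimal sufficient statistic for $\theta$.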