Idea: A sufficient statistic compresses the data without losing information about the parameter.
Definition (Sufficient statistic)
A statistic $U=U(X_1, ... ,X_n)$ is a sufficient statistic for $\theta$ if the conditional distribution of $X_1,...,X_n$ given $U$ does not depend on $\theta$.
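E.g. (a standard illustration): let $X_1,...,X_n \overset{iid}{\sim} \text{Bernoulli}(p)$ and $U=\sum_{i=1}^n X_i$. Then
$$ P(X_1=x_1,...,X_n=x_n \mid U=u) = \frac{p^u (1-p)^{n-u}}{\binom{n}{u} p^u (1-p)^{n-u}} = \binom{n}{u}^{-1} $$
which does not involve $p$, so $U$ is sufficient for $p$.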
Any 1-1 function of a sufficient statistic is a sufficient statistic.
Any statistic from which a sufficient statistic can be calculated is also a sufficient statistic.
$\exists$ many possible SS's for the same model ⇒ MSS (minimal sufficient statistic)
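E.g., in the Bernoulli illustration above, the chain
$$ (X_1,...,X_n) \;\to\; (X_{(1)},...,X_{(n)}) \;\to\; \textstyle\sum_{i=1}^n X_i \;\leftrightarrow\; \bar{X} $$
lists progressively coarser sufficient statistics: each is a function of the previous one, $\bar{X}$ is a 1-1 function of $\sum X_i$, and $\sum X_i$ (equivalently $\bar{X}$) turns out to be the MSS.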
Definition (Likelihood function)
$$ L(\theta|x_1,...,x_n) = f(x_1|\theta) \times ... \times f(x_n|\theta) $$
Likelihood = the joint probability / density function of $X_1,...,X_n$, but with a different viewpoint!
Likelihood → focus on the parameter $\theta$ (data held fixed) / prob or density → focus on the R.V.s ($\theta$ held fixed)
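E.g., for $X_1,...,X_n \overset{iid}{\sim} N(\mu, 1)$ (a standard illustration):
$$ L(\mu|x_1,...,x_n) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}} e^{-(x_i-\mu)^2/2} = (2\pi)^{-n/2} \exp\left( -\frac{1}{2} \sum_{i=1}^n (x_i-\mu)^2 \right) $$
Read this as a function of $\mu$, with the observed $x_1,...,x_n$ held fixed.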
Factorization Theorem
$U=U(X_1, ..., X_n)$ is a sufficient statistic for $\theta$ iff the likelihood can be written in the form
$$ L(\theta)=g(u(x_1,...,x_n), \theta) h(x_1,...,x_n) $$
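E.g., for $X_1,...,X_n \overset{iid}{\sim} \text{Poisson}(\theta)$ (a standard illustration):
$$ L(\theta) = \prod_{i=1}^n \frac{\theta^{x_i} e^{-\theta}}{x_i!} = \underbrace{\theta^{\sum x_i} e^{-n\theta}}_{g(u,\theta),\ u=\sum x_i} \cdot \underbrace{\left( \prod_{i=1}^n x_i! \right)^{-1}}_{h(x_1,...,x_n)} $$
so $U=\sum_{i=1}^n X_i$ is sufficient for $\theta$.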
Definition (Minimal sufficient statistic)
$U=U(X_1,...,X_n)$ is a minimal sufficient statistic (MSS) if
(a) $U$ is a sufficient statistic and
(b) $U$ compresses the data at least as much as any other SS: If $V$ is any other SS, $U$ is a function of $V$.
Theorem (Lehmann-Scheffé)
$$ \frac{L(\theta|x_1,...,x_n)}{L(\theta|y_1,...,y_n)} \text{ is free of } \theta \iff g(x_1,...,x_n)=g(y_1,...,y_n) $$
That is, the likelihood ratio of two samples is free of the unknown parameter $\theta$ iff $g(x_1,...,x_n)=g(y_1,...,y_n)$. If such a function $g$ can be found, then $g(X_1,...,X_n)$ is a minimal sufficient statistic for $\theta$.
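E.g., for $X_1,...,X_n \overset{iid}{\sim} \text{Exponential}(\theta)$ with rate $\theta$ (a standard illustration):
$$ \frac{L(\theta|x_1,...,x_n)}{L(\theta|y_1,...,y_n)} = \frac{\theta^n e^{-\theta \sum x_i}}{\theta^n e^{-\theta \sum y_i}} = e^{-\theta \left( \sum x_i - \sum y_i \right)} $$
This ratio is free of $\theta$ iff $\sum x_i = \sum y_i$, so $g(X_1,...,X_n)=\sum_{i=1}^n X_i$ is a minimal sufficient statistic for $\theta$.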