Bayesian Decision Theory
- Bayes’ Rule
Prior P(C): probability of class C before seeing any observation x
Likelihood P(x|C): conditional probability of observing x given that x belongs to class C
Evidence P(x): marginal probability of x regardless of the class
Posterior P(C|x): probability of C after having seen the observation x
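The four quantities above combine as P(C|x) = P(x|C)P(C)/P(x). A minimal sketch for two classes (the priors and likelihoods below are illustrative numbers, not from the notes):

```python
# Bayes' rule for two classes; priors/likelihoods are made-up example values.
priors = {"C1": 0.6, "C2": 0.4}          # P(C)
likelihoods = {"C1": 0.2, "C2": 0.5}     # P(x|C) for one observed x

# evidence P(x) = sum_C P(x|C) P(C)
evidence = sum(likelihoods[c] * priors[c] for c in priors)

# posterior P(C|x) = P(x|C) P(C) / P(x)
posteriors = {c: likelihoods[c] * priors[c] / evidence for c in priors}

print(round(evidence, 3))                          # -> 0.32
print({c: round(p, 3) for c, p in posteriors.items()})
```

Note the posteriors always sum to 1, since the evidence is exactly the normalizer.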
- Losses and Risks
- Discriminant Functions
- Association Rules for X -> Y (Support: P(X,Y), Confidence: P(Y|X) = P(X,Y)/P(X), Lift: P(X,Y)/(P(X)P(Y)))
- Apriori algorithm (generate candidate (k+1)-itemsets by merging frequent k-itemsets)
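The three rule measures and the Apriori candidate-merge step can be sketched on a toy transaction set (the items and minimum support are illustrative assumptions):

```python
from itertools import combinations

# Toy transactions (illustrative data, not from the notes).
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk", "butter"},
    {"milk", "bread"},
]
n = len(transactions)

def support(itemset):
    """P(itemset): fraction of transactions containing all its items."""
    return sum(itemset <= t for t in transactions) / n

# Rule milk -> bread
s_xy = support({"milk", "bread"})                       # Support  P(X,Y)
conf = s_xy / support({"milk"})                         # Confidence P(Y|X)
lift = s_xy / (support({"milk"}) * support({"bread"}))  # Lift

# Apriori candidate step: merge frequent itemsets to form larger candidates
min_support = 0.4
frequent_1 = [fs for fs in ({"milk"}, {"bread"}, {"butter"})
              if support(fs) >= min_support]
candidates_2 = [a | b for a, b in combinations(frequent_1, 2)]
```

Apriori prunes because any superset of an infrequent itemset is also infrequent, so only frequent k-itemsets need merging.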
Parametric Methods
- Maximum Likelihood Estimation (MLE): estimate parameters by maximizing the (log) likelihood of the sample
- Bias and Variance
Bias: b(d) = E[d] - theta
Variance: E[(d - E[d])^2]
MSE: E[(d - theta)^2] = Variance + Bias^2
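The MSE decomposition can be checked numerically for a toy estimator d with a known discrete distribution (the values and probabilities below are illustrative):

```python
# Verify MSE = Variance + Bias^2 for a made-up discrete estimator d.
theta = 2.0                       # true parameter
values = [1.0, 2.0, 4.0]          # possible values of the estimator d
probs  = [0.3, 0.5, 0.2]          # their probabilities

E_d  = sum(v * p for v, p in zip(values, probs))
bias = E_d - theta                                            # E[d] - theta
var  = sum((v - E_d) ** 2 * p for v, p in zip(values, probs))
mse  = sum((v - theta) ** 2 * p for v, p in zip(values, probs))

assert abs(mse - (var + bias ** 2)) < 1e-9
```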
- Bayes’ Estimator
Maximum a Posteriori (MAP) / Maximum Likelihood / Bayes
if the prior is flat -> MAP = ML; if the posterior is symmetric and unimodal -> MAP = Bayes
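Both identities show up in the Gaussian conjugate case: with a Gaussian prior on the mean the posterior is Gaussian, hence symmetric, so MAP (posterior mode) equals the Bayes estimate (posterior mean); making the prior nearly flat drives MAP toward ML. A sketch with illustrative numbers:

```python
# Gaussian likelihood with a Gaussian prior on the mean (conjugate case).
# All numbers are illustrative. Posterior mean = posterior mode here, so
# Bayes = MAP; a nearly flat prior makes MAP ~ ML (the sample mean).
xs = [2.0, 3.0, 4.0]
n = len(xs)
sigma2 = 1.0                 # known data variance
mu0, sigma0_2 = 0.0, 1e6     # prior mean and (huge => nearly flat) variance

ml = sum(xs) / n             # ML estimate: sample mean

# Posterior mean/mode for the Gaussian-Gaussian model:
post_mean = (n / sigma2 * ml + mu0 / sigma0_2) / (n / sigma2 + 1 / sigma0_2)
```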
- Regression
r = f(x) + epsilon, g(x|theta) is estimator of f(x)
estimate g(x|theta) by MLE (maximize the log likelihood); under Gaussian noise this is equivalent to minimizing MSE (mean squared error)
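For a linear model g(x|w0,w1) = w0 + w1*x, minimizing MSE has a closed form. A sketch on tiny made-up data (generated noise-free from r = 1 + 2x, so the fit is exact):

```python
# Least-squares fit of g(x) = w0 + w1*x; data are illustrative and noiseless.
xs = [0.0, 1.0, 2.0, 3.0]
rs = [1.0, 3.0, 5.0, 7.0]        # exactly r = 1 + 2x
n = len(xs)

mx = sum(xs) / n
mr = sum(rs) / n
# Closed-form minimizer of the MSE:
w1 = (sum((x - mx) * (r - mr) for x, r in zip(xs, rs))
      / sum((x - mx) ** 2 for x in xs))
w0 = mr - w1 * mx

mse = sum((r - (w0 + w1 * x)) ** 2 for x, r in zip(xs, rs)) / n
```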
- Other Error Measures
Square Error, Relative Square Error, Absolute Error, ε-sensitive Error
- Bias/Variance Dilemma
- Regularization: penalize complex models
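One way to penalize complexity is ridge regression, which adds a squared-weight penalty to the MSE objective; larger penalties shrink the slope toward 0. A sketch with an unpenalized intercept on illustrative data:

```python
# Ridge regularization sketch: minimize sum (r - w0 - w1*x)^2 + lam * w1^2.
# Data and lambda values are illustrative; intercept is left unpenalized.
xs = [0.0, 1.0, 2.0, 3.0]
rs = [1.0, 3.0, 5.0, 7.0]
n = len(xs)
mx = sum(xs) / n
mr = sum(rs) / n
sxx = sum((x - mx) ** 2 for x in xs)
sxr = sum((x - mx) * (r - mr) for x, r in zip(xs, rs))

def ridge_slope(lam):
    # Closed form after centering: the penalty inflates the denominator,
    # shrinking the slope toward 0 as lam grows.
    return sxr / (sxx + lam)

assert ridge_slope(0.0) == sxr / sxx           # no penalty -> plain least squares
assert ridge_slope(10.0) < ridge_slope(0.0)    # larger penalty -> smaller slope
```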
Multivariate Methods
- Estimation of Missing Values: Imputation (mean/regression)
- Multivariate Normal Distribution: Mahalanobis Distance
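The Mahalanobis distance D^2 = (x - mu)^T S^{-1} (x - mu) rescales each direction by its variance. A 2-D sketch with an illustrative diagonal covariance (the 2x2 inverse is done by hand):

```python
# Mahalanobis distance in 2-D; mean and covariance are illustrative.
mu = (0.0, 0.0)
S = [[2.0, 0.0],
     [0.0, 0.5]]                  # diagonal covariance for simplicity
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[ S[1][1] / det, -S[0][1] / det],
         [-S[1][0] / det,  S[0][0] / det]]

def mahalanobis_sq(x):
    """D^2 = (x - mu)^T S^{-1} (x - mu)."""
    d = (x[0] - mu[0], x[1] - mu[1])
    return (d[0] * (S_inv[0][0] * d[0] + S_inv[0][1] * d[1])
            + d[1] * (S_inv[1][0] * d[0] + S_inv[1][1] * d[1]))

# Equal Euclidean length, different Mahalanobis distance:
print(mahalanobis_sq((1.0, 0.0)))   # -> 0.5 (large variance along x1)
print(mahalanobis_sq((0.0, 1.0)))   # -> 2.0 (small variance along x2)
```

Points along a high-variance direction count as "closer" to the mean, which is exactly why this distance drives the parametric discriminants below.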
- Parametric Classification
- Different covariance per class – discriminant at P(C1|x) = 0.5 (for 2 classes) is quadratic (nonlinear)
- Common covariance – discriminant lies at equal Mahalanobis distance from the class means (linear)
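In the common-covariance case with equal priors, the boundary P(C1|x) = 0.5 reduces to equal (Mahalanobis) distance from the two means. A 1-D sketch with equal class variances, where that boundary is the midpoint of the means (all numbers illustrative):

```python
import math

# Two 1-D Gaussian classes with a COMMON variance and equal priors.
# Classifying by higher class log-likelihood gives a LINEAR boundary
# at the midpoint (mu1 + mu2) / 2 = 2.0.
mu1, mu2, sigma = 0.0, 4.0, 1.0

def log_gaussian(x, mu):
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma * math.sqrt(2 * math.pi)))

def classify(x):
    return "C1" if log_gaussian(x, mu1) > log_gaussian(x, mu2) else "C2"

assert classify(1.9) == "C1"    # left of the midpoint
assert classify(2.1) == "C2"    # right of the midpoint
```

With unequal variances the quadratic terms no longer cancel, which is the nonlinear-discriminant case above.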