Ensemble Learning

Boosting

  1. learn a rule over a subset of the data
  2. combine the rules

Bagging

combine by taking the mean of the individual learners' predictions
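
A minimal sketch of the two steps above, assuming scikit-learn's `DecisionTreeRegressor` as the base learner (any regressor would do): each model is fit on a bootstrap sample (the subset), and predictions are combined by the mean.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagging_fit(X, y, n_models=10, seed=0):
    """Step 1: learn each base model over a bootstrap sample of the data."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample n points with replacement
        models.append(DecisionTreeRegressor(max_depth=3).fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Step 2: combine -- average the individual predictions."""
    return np.mean([m.predict(X) for m in models], axis=0)
```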

Boosting

choose "hardest" example

weighted mean

error: $\Pr_D[h(x) \neq C(x)]$

$D$: distribution, $h$: hypothesis, $C$: true concept
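
Concretely, the error is the probability mass, under $D$, of the examples the hypothesis gets wrong. A small sketch, assuming $D$ is represented as a weight vector over the training points:

```python
import numpy as np

def weighted_error(h_pred, c_true, D):
    """Pr_D[h(x) != C(x)]: total D-weight of the examples h misclassifies."""
    return float(np.sum(D * (h_pred != c_true)))
```

With a uniform $D$ this reduces to the ordinary error rate.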

Weak learning

weak learner: a learner that always does better than chance, i.e. its error is bounded away from 1/2 on every distribution

$\forall D:\ \Pr_D[h(x) \neq C(x)] \le \frac{1}{2} - \epsilon$
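
A decision stump (a one-split decision tree) is the classic weak learner. The helper below is a hypothetical sketch, not from the notes: it scans every feature/threshold/sign combination and returns the stump with the smallest weighted error under $D$.

```python
import numpy as np

def best_stump(X, y, D):
    """Return (feature, threshold, sign, error) minimizing the weighted error under D."""
    n, d = X.shape
    best, best_err = None, np.inf
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = float(np.sum(D * (pred != y)))
                if err < best_err:
                    best, best_err = (j, thr, sign), err
    return best + (best_err,)
```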

Boosting in code

  • Given training set $\{(x_i, y_i)\}$, $y_i \in \{-1, +1\}$
  • For t = 1 to T
    • Construct $D_t$
    • Find weak classifier $h_t(x)$ with small error $\epsilon_t = \Pr_{D_t}[h_t(x_i) \neq y_i]$
  • Output $H_{final}$

$D_1(i) = \frac{1}{n}$

$D_{t+1}(i) = \frac{D_t(i) \cdot e^{-\alpha_t y_i h_t(x_i)}}{z_t}$

where $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$ and $z_t$ is a normalization constant

$H_{final}(x) = \mathrm{sgn}\left(\displaystyle\sum_t \alpha_t h_t(x)\right)$
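
Putting the pieces together: a minimal AdaBoost sketch in the spirit of the pseudocode above, reusing the hypothetical `best_stump` weak learner from the earlier snippet (the `1e-10` floor is an assumption to avoid division by zero when a stump is perfect).

```python
import numpy as np

def adaboost_fit(X, y, T=50):
    """AdaBoost for y in {-1, +1}; returns a list of (alpha_t, stump) pairs."""
    n = len(X)
    D = np.full(n, 1.0 / n)                       # D_1(i) = 1/n
    ensemble = []
    for _ in range(T):
        j, thr, sign, eps = best_stump(X, y, D)   # weak classifier with small eps_t
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-10))  # alpha_t = 1/2 ln((1-eps)/eps)
        h = np.where(X[:, j] <= thr, sign, -sign) # h_t evaluated on the training set
        D = D * np.exp(-alpha * y * h)            # upweight mistakes, downweight hits
        D = D / D.sum()                           # divide by z_t so D_{t+1} is a distribution
        ensemble.append((alpha, (j, thr, sign)))
    return ensemble

def adaboost_predict(ensemble, X):
    """H_final(x) = sgn(sum_t alpha_t h_t(x))."""
    score = np.zeros(len(X))
    for alpha, (j, thr, sign) in ensemble:
        score += alpha * np.where(X[:, j] <= thr, sign, -sign)
    return np.sign(score)
```

Note how the exponential update implements "choose the hardest examples": a misclassified point has $y_i h_t(x_i) = -1$, so its weight is multiplied by $e^{+\alpha_t} > 1$.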

In practice, boosting rarely overfits.

One exception: pink noise (i.e. uniform noise, as opposed to Gaussian white noise) will lead boosting to overfit.
