Ensemble Learning
Boosting
- learn over a subset of the data --> a rule
- combine the rules into a single stronger rule
Bagging
- learn over random subsets, combine by the mean (average the rules' outputs)
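The bagging bullet above can be sketched in a few lines. Names here are hypothetical; `mean_learner` stands in for any weak regressor trained on a bootstrap sample:

```python
import random
import statistics

def bagging(train, learn, n_models=25, seed=0):
    # Bagging sketch: fit one model per bootstrap sample, combine by the mean.
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # bootstrap: draw |train| examples with replacement
        sample = [rng.choice(train) for _ in train]
        models.append(learn(sample))
    return lambda x: statistics.mean(m(x) for m in models)

def mean_learner(sample):
    # hypothetical weak learner: ignores x, predicts the mean label of its sample
    m = statistics.mean(y for _, y in sample)
    return lambda x: m

data = [(x, 2 * x) for x in range(10)]
predict = bagging(data, mean_learner)
print(predict(0))  # close to the overall mean of the y's
```

Each bootstrap sample gives a slightly different model; averaging their outputs reduces the variance of any single one.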
Boosting
- choose the "hardest" examples (the ones the current rules get wrong)
- combine by a weighted mean
error: Pr_D[h(x) ≠ c(x)]
where D: distribution, h: hypothesis, c: true concept
Weak learning
weak learner: a learner that always does better than chance — for every distribution, its error is bounded strictly below 1/2:
∀D: Pr_D[h(x) ≠ c(x)] ≤ 1/2 − ε, for some ε > 0
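The ∀D quantifier matters: the bound must hold for *every* distribution over the examples, not just the uniform one. A toy check (data and stump are hypothetical) shows a hypothesis that looks fine under the uniform distribution but fails the weak-learner bar once the weight shifts onto its mistake:

```python
def weighted_error(D, h, xs, ys):
    # Pr_D[h(x) != y]: total weight of the misclassified examples
    return sum(d for d, x, y in zip(D, xs, ys) if h(x) != y)

# toy data: a threshold stump that misclassifies only x = 1
xs = [0, 1, 2, 3]
ys = [-1, -1, 1, 1]
h = lambda x: 1 if x >= 1 else -1

print(weighted_error([0.25] * 4, h, xs, ys))            # 0.25 under the uniform distribution
print(weighted_error([0.1, 0.7, 0.1, 0.1], h, xs, ys))  # 0.7: worse than chance under this D
```

Boosting exploits exactly this: it constructs the distributions D_t that concentrate weight on past mistakes, and the weak-learner guarantee says some h_t still beats chance there.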
Boosting in code (AdaBoost)
- Given a training set {(x_i, y_i)}, y_i ∈ {−1, +1}
- For t = 1 to T:
    - construct distribution D_t
    - find a weak classifier h_t(x) with small error ε_t = Pr_{D_t}[h_t(x_i) ≠ y_i]
- Output H_final
D_1(i) = 1/n
D_{t+1}(i) = D_t(i) · e^(−α_t · y_i · h_t(x_i)) / z_t
where α_t = (1/2) ln((1 − ε_t)/ε_t), and z_t is a normalization constant
H_final(x) = sgn(Σ_t α_t h_t(x))
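The update rules above can be sketched as a minimal AdaBoost over 1-D threshold stumps. The stump pool, the tie-breaking via `min`, and the shortcut taken when a stump is perfect (ε_t = 0 would make α_t infinite) are assumptions of this sketch, not part of the notes:

```python
import math

def make_stumps(thresholds):
    # hypothetical weak-learner pool: 1-D threshold stumps of both polarities
    stumps = []
    for t in thresholds:
        stumps.append(lambda x, t=t: 1 if x >= t else -1)
        stumps.append(lambda x, t=t: -1 if x >= t else 1)
    return stumps

def adaboost(xs, ys, stumps, T=10):
    n = len(xs)
    D = [1.0 / n] * n                            # D_1(i) = 1/n
    ensemble = []                                # (alpha_t, h_t) pairs
    for _ in range(T):
        def err(h):                              # eps_t = Pr_{D_t}[h(x_i) != y_i]
            return sum(d for d, x, y in zip(D, xs, ys) if h(x) != y)
        h = min(stumps, key=err)
        eps = err(h)
        if eps == 0:                             # perfect stump: alpha_t would be infinite,
            ensemble.append((1.0, h))            # so just take it and stop (sketch shortcut)
            break
        alpha = 0.5 * math.log((1 - eps) / eps)  # alpha_t = 1/2 ln((1 - eps_t)/eps_t)
        ensemble.append((alpha, h))
        # D_{t+1}(i) = D_t(i) * exp(-alpha_t * y_i * h_t(x_i)) / z_t
        D = [d * math.exp(-alpha * y * h(x)) for d, x, y in zip(D, xs, ys)]
        z = sum(D)                               # z_t: normalization constant
        D = [d / z for d in D]
    # H_final(x) = sgn(sum_t alpha_t h_t(x))
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# a pattern no single stump can fit: + + - - + +
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]
H = adaboost(xs, ys, make_stumps(xs), T=3)
print([H(x) for x in xs])  # → [1, 1, -1, -1, 1, 1]
```

After each round the misclassified points gain weight (e^{α_t} factor) and the correctly classified ones lose it, so the next stump is forced to attend to the "hardest" examples; three rounds suffice to fit this non-separable pattern.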
Boosting rarely overfits in practice.
Exception — pink noise (i.e., uniform noise) will lead boosting to overfit.