11.8. ENSEMBLE METHODS
[Figure: two scatter plots over the unit square (axes from 0 to 1). Panel (a) "bias" shows CLASS A and CLASS B separated by a TRUE BOUNDARY, with the BEST LINEAR DECISION boundary of an SVM unable to match it. Panel (b) "variance" shows the TRUE BOUNDARY together with the DECISION boundaries of TREE A and TREE B, which disagree near INSTANCE X.]
Figure 11.5: Impact of bias and variance on classification accuracy
Model-centered ensembles: A different algorithm A_j is used in each ensemble iteration. In these cases, the data set f_j(D) for each ensemble component is the same as the original data set D. The rationale for these methods is that different models may work better in different regions of the data; therefore, the combination of the models may be more effective for any given test instance, as long as the specific errors of a classification algorithm are not reflected by the majority of the ensemble components on that test instance.
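The combination step in a model-centered ensemble can be sketched in plain Python. The three rule-based component models below are hypothetical stand-ins (not from the text); in practice each would be a different learned model, such as an SVM, a decision tree, and a logistic regression, all fit to the same data set D. The point is only to show how a majority vote over heterogeneous components classifies a test instance.

```python
# Minimal sketch of a model-centered ensemble: three different (hypothetical)
# component models, each a decision rule over a 2-D instance (x1, x2),
# combined by majority vote on each test instance.
from collections import Counter

def model_a(x):
    # Linear rule, as a linear SVM might learn
    return "A" if x[0] + x[1] > 1.0 else "B"

def model_b(x):
    # Axis-parallel rule, as a depth-1 decision tree might learn
    return "A" if x[0] > 0.5 else "B"

def model_c(x):
    # Another axis-parallel rule
    return "A" if x[1] > 0.5 else "B"

def majority_vote(models, x):
    # The ensemble prediction is the most common component prediction
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

models = [model_a, model_b, model_c]
# model_a and model_b predict "A", model_c predicts "B" -> majority is "A"
print(majority_vote(models, (0.9, 0.4)))  # → A
```

Here model_c errs on the instance (0.9, 0.4), but because its error is not shared by the majority of components, the ensemble prediction is unaffected.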
The following discussion introduces the rationale for ensemble analysis before presenting specific instantiations.
11.8.1 Why Does Ensemble Analysis Work?
The rationale for ensemble analysis can be best understood by examining the different components of the error of a classifier, as discussed in statistical learning theory. There are three primary components to the error of a classifier:
Bias: Every classifier makes its own modeling assumptions about the nature of the decision boundary between classes. For example, a linear SVM classifier assumes that the two classes may be separated by a linear decision boundary. This is, of course, not true in practice. For example, in Fig. 11.5a, the decision boundary between the different classes is clearly not linear. The correct decision boundary is shown by the solid line. Therefore, no (linear) SVM classifier can classify all the possible test instances correctly even if the best possible SVM model is constructed with a very large training data set. Although the SVM classifier in Fig. 11.5a seems to be the best possible approximation, it obviously cannot match the correct decision boundary and therefore has an inherent error. In other words, any given linear SVM model will have an inherent bias. When a classifier has high bias, it will make consistently incorrect predictions over particular choices of test instances near the incorrectly modeled decision boundary, even when different samples of the training data are used for the learning process.
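This kind of irreducible bias can be demonstrated numerically on a small example that is not from the text: the XOR pattern, which (like the data in Fig. 11.5a) has no linear decision boundary. The brute-force grid search below is a hypothetical illustration, not a practical training procedure; it shows that even the best linear classifier in the search space misclassifies at least one of the four instances, no matter how it is fit.

```python
# Minimal sketch illustrating bias: no linear boundary w1*x1 + w2*x2 + b = 0
# can classify the XOR pattern perfectly, so the best achievable training
# accuracy for the linear model family is capped below 1.0.
import itertools

# XOR data set: label is 1 when exactly one coordinate is 1
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(w1, w2, b):
    correct = 0
    for (x1, x2), label in data:
        pred = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
        correct += (pred == label)
    return correct / len(data)

# Exhaustively try a coarse grid of linear boundaries
grid = [i / 4 for i in range(-8, 9)]  # weights and bias in -2.0 .. 2.0
best = max(accuracy(w1, w2, b)
           for w1, w2, b in itertools.product(grid, repeat=3))
print(best)  # → 0.75: the inherent error (bias) of the linear model family
```

No choice of weights reaches accuracy 1.0 here; this residual error persists regardless of how much training data is available, which is exactly what distinguishes bias from variance.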