[Figure 11.6: Ensemble decision boundaries are more refined than those of component classifiers. Two panels on unit axes, (a) bias and (b) variance, each plot two classes (CLASS A, CLASS B) with labeled decision boundaries: SVM A, SVM B, SVM C, a test point INSTANCE X, and the ENSEMBLE boundary versus the TRUE BOUNDARY.]
will be correct $(0.8^3 + \binom{3}{2} \cdot 0.8^2 \cdot 0.2) \times 100 \approx 90\%$ of the time. In other words, the decision boundary of the majority-vote ensemble will be much closer to the true decision boundary than that of any of its component classifiers. In fact, a realistic example of what an ensemble boundary might look like after combining a set of relatively coarse decision trees is illustrated in Fig. 11.6b. Note that the ensemble boundary is much closer to the true boundary because it is not subject to the unpredictable variations in decision-tree behavior on a training data set of limited size. Such an ensemble is, therefore, better able to make use of the knowledge in the training data.
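This calculation is easy to verify and to generalize to more than three classifiers. The following minimal sketch, using only Python's standard library (the function name is illustrative, not from the text), sums the binomial probabilities of all correct majorities:

```python
from math import comb

def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a majority of n independent classifiers,
    each individually correct with probability p, is correct."""
    # Sum binomial probabilities over all strict majorities (k > n/2).
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Three classifiers, each 80% accurate:
# 0.8^3 + C(3,2) * 0.8^2 * 0.2 = 0.512 + 0.384 = 0.896
print(majority_vote_accuracy(0.8, 3))  # ~0.896, i.e., about 90%
```

Note that this argument assumes the component classifiers err independently; the benefit of voting shrinks as their errors become correlated.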
In general, different classification models have different sources of bias and variance. Models that are too simple (such as a linear SVM or a shallow decision tree) make too many assumptions about the shape of the decision boundary and will therefore have high bias. Models that are too complex (such as a deep decision tree) will overfit the data and will therefore have high variance. Sometimes, different parameter settings of the same classifier will favor different parts of the bias-variance trade-off curve. For example, a small value of k in a nearest-neighbor classifier will result in lower bias but higher variance. Because different kinds of ensemble learners have different impacts on bias and variance, it is important to choose the component classifiers so as to optimize the impact on the bias-variance trade-off. An overview of the impact of different models on bias and variance is provided in Table 11.1.
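The effect of k on the trade-off can be observed directly by comparing training and test accuracy. The sketch below (assuming scikit-learn is available; the data set and parameter values are illustrative) shows the typical pattern: very small k fits the training data almost perfectly but generalizes worse (low bias, high variance), while very large k smooths the boundary at the cost of flexibility (higher bias, lower variance):

```python
# Illustrative sketch of the bias-variance trade-off in k-nearest neighbors.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class problem with 10% label noise.
X, y = make_classification(n_samples=500, n_features=10, flip_y=0.1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 25, 101):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"k={k:>3}  train={knn.score(X_train, y_train):.2f}  "
          f"test={knn.score(X_test, y_test):.2f}")
```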
11.8.2 Formal Statement of Bias-Variance Trade-off
In the following, a formal statement of the bias-variance trade-off will be provided. Consider