$$\text{Certain}(X) = \sum_{i=1}^{k} \| p_i - 0.5 \|. \tag{11.25}$$
The value lies in the range (0, 1), and lower values are indicative of greater uncertainty. In the multiclass scenario, a formal entropy measure may be used to quantify uncertainty. If the Bayes posterior probabilities of the $k$ classes are $p_1 \ldots p_k$, respectively, based on the current set of labeled instances, then the entropy measure Entropy(X) is defined as follows:
$$\text{Entropy}(X) = -\sum_{i=1}^{k} p_i \log(p_i). \tag{11.26}$$
In this case, larger values of the entropy indicate greater uncertainty and are more desirable for label acquisition.
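As a concrete illustration, the following sketch scores unlabeled instances with both measures and queries the one with the highest entropy. The `posterior_matrix` values are hypothetical, standing in for the output of any probabilistic classifier (e.g., `predict_proba` in scikit-learn):

```python
import numpy as np

def certainty(posteriors):
    # Certain(X) of Eq. 11.25: sum of ||p_i - 0.5|| over all classes.
    # Lower values indicate greater uncertainty.
    p = np.asarray(posteriors, dtype=float)
    return np.abs(p - 0.5).sum()

def entropy(posteriors):
    # Entropy(X) of Eq. 11.26: -sum of p_i log(p_i) over all classes.
    # Larger values indicate greater uncertainty.
    p = np.asarray(posteriors, dtype=float)
    p = p[p > 0]  # a zero-probability class contributes nothing to the sum
    return -(p * np.log(p)).sum()

# Hypothetical posteriors: one row of class probabilities per unlabeled
# instance, e.g. obtained from model.predict_proba(X_unlabeled).
posterior_matrix = np.array([[0.90, 0.10],
                             [0.55, 0.45],
                             [0.70, 0.30]])
query_index = max(range(len(posterior_matrix)),
                  key=lambda i: entropy(posterior_matrix[i]))
print(query_index)  # 1: posteriors closest to uniform, hence most uncertain
```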
11.7.1.2 Query-by-Committee
In this case, the heterogeneity is measured in terms of the disagreement among different classifiers rather than the posterior probabilities of a single classifier over different labels. This criterion tries to achieve the same intuitive goal, but in a different way. Intuitively, when the posterior probabilities of a Bayes classifier are similar across different classes, a significant disagreement may exist between different classification models about the predicted label. Therefore, this approach uses a committee of different classifiers that are trained on the current set of labeled instances. These classifiers are then used to predict the class label of each unlabeled instance. The instance for which the classifiers disagree the most is selected as the relevant one in this scenario.
At an intuitive level, the query-by-committee method achieves similar heterogeneity goals as the uncertainty sampling method. Different classifiers are more likely to disagree on the class label for instances near the true decision boundary. The mathematical formula for quantifying the disagreement is also the same as in uncertainty sampling. In particular, the posterior probability pi of each class i in Eq. 11.26 is replaced with the fraction of committee votes received by class i. It is particularly beneficial to use diverse classifiers that use fundamentally different modeling methodologies.
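A minimal sketch of this scheme follows, using a committee of three structurally different scikit-learn classifiers; the particular models are illustrative choices, and class labels are assumed to be coded as integers 0, ..., k − 1:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

def vote_entropy(votes, n_classes):
    # Disagreement measure: Eq. 11.26 with p_i replaced by the
    # fraction of committee votes received by class i.
    fractions = np.bincount(np.asarray(votes, dtype=int),
                            minlength=n_classes) / len(votes)
    nz = fractions[fractions > 0]
    return -(nz * np.log(nz)).sum()

def query_by_committee(committee, X_lab, y_lab, X_unlab):
    # Train every committee member on the current labeled set.
    for clf in committee:
        clf.fit(X_lab, y_lab)
    # One column of predicted labels per committee member.
    votes = np.column_stack([clf.predict(X_unlab) for clf in committee])
    disagreement = [vote_entropy(row, len(set(y_lab))) for row in votes]
    return int(np.argmax(disagreement))  # most contested instance

# Illustrative committee of structurally different models on toy data.
committee = [LogisticRegression(max_iter=1000),
             GaussianNB(),
             RandomForestClassifier(n_estimators=50, random_state=0)]
X_lab = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]])
y_lab = np.array([0, 0, 1, 1])
X_unlab = np.array([[0.5, 0.5], [0.0, 0.1], [1.1, 0.9]])
print(query_by_committee(committee, X_lab, y_lab, X_unlab))
```

The diversity of the committee matters more than its size: members that share the same inductive bias tend to agree even near the decision boundary, which blunts the disagreement signal.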
11.7.1.3 Expected Model Change
In this approach, the instance whose addition to the training data would cause the greatest expected change in the current classification model is selected. In many optimization-based classification models, such as discriminative probabilistic models, the gradient of the model objective function with respect to the model parameters can be quantified. Adding a queried instance to the training data will change this gradient as well. Therefore, the instance that causes the greatest change in the gradient, when added to the set of labeled instances, is selected. The intuition is that such an instance is likely to be very different from the model constructed using already labeled instances. Let δgi(X) be the change in the gradient with respect to the model parameters, conditional on the fact that the correct training label of the candidate instance X is the ith class. In other words, if the current labeled training set is L and ∇G(L) is the gradient of the objective function with respect to the model parameters, we have:
$$\delta g_i(X) = \nabla G(L \cup \{(X, i)\}) - \nabla G(L).$$
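For concreteness, the following sketch scores candidates for binary logistic regression, where the log-likelihood gradient contributed by a single point (x, y) is simply (y − σ(w·x))x, so δgi(X) is exactly this term for candidate label i. Since the true label is unknown, each δgi(X) is weighted by the current posterior probability of label i to form the expectation. The weight vector w and candidate pool X_unlab are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def expected_gradient_change(w, x):
    # delta-g_i(X) for logistic regression is the candidate point's own
    # gradient contribution, (y - sigmoid(w.x)) * x, under label y = i.
    # The expectation weights the norm of each delta-g_i(X) by the
    # current posterior probability p_i of that label.
    p1 = sigmoid(np.dot(w, x))          # current posterior of class 1
    dg0 = (0.0 - p1) * x                # gradient change if true label is 0
    dg1 = (1.0 - p1) * x                # gradient change if true label is 1
    return (1.0 - p1) * np.linalg.norm(dg0) + p1 * np.linalg.norm(dg1)

# Hypothetical current weights and unlabeled pool; query the instance
# with the largest expected gradient change.
w = np.array([0.8, -0.3])
X_unlab = np.array([[1.0, 2.0], [0.1, 0.1], [3.0, -1.0]])
scores = [expected_gradient_change(w, x) for x in X_unlab]
print(int(np.argmax(scores)))  # 0 for these values: uncertain and far from 0
```

Note that for this model the score reduces to 2·p1·(1 − p1)·||x||, so the criterion favors instances that are both uncertain and large in magnitude, i.e., those with the most leverage on the parameters.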