Data Mining: The Textbook




P(C = c | X) ∝ P(C = c) ∏_{j=1}^{d} P(x_j = a_j | C = c)        (11.21)



  2. (M-step) Estimate the conditional distribution of features for the different clusters (classes), using the current estimated posterior probabilities (unlabeled data) and the known memberships (labeled data) of data points to clusters (classes).
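To make the E-step concrete, the sketch below computes the posterior of Eq. 11.21 for a single record with categorical features; the function name, the NumPy array layout of the estimates, and the final normalization are illustrative assumptions rather than notation from the text.

```python
import numpy as np

def e_step_posterior(x, priors, cond_probs):
    """Posterior P(C = c | x) of Eq. 11.21 for one record x
    containing d categorical feature values.

    priors     -- shape (k,): current estimates of P(C = c)
    cond_probs -- list of d arrays; cond_probs[j][c, v] = P(x_j = v | C = c)
    """
    scores = priors.copy()
    for j, v in enumerate(x):
        # Multiply in the naive Bayes factor P(x_j = a_j | C = c) per class.
        scores *= cond_probs[j][:, v]
    # Eq. 11.21 fixes the posterior only up to proportionality, so normalize.
    return scores / scores.sum()
```

In practice, the product over features is usually accumulated in log space to avoid numerical underflow when d is large.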

One challenge with the use of this approach is that the clustering structure may sometimes not correspond to the class distribution very well. In such cases, the use of unlabeled data can harm the classification accuracy, as the clusters found by the EM algorithm drift away from the true class structure. After all, unlabeled data are plentiful compared to labeled data, and therefore the estimation of P(x_j = a_j | C = c) in Eq. 11.20 will be dominated by the unlabeled data. To ameliorate this effect, the labeled and unlabeled data are weighted differently during the estimation of P(x_j = a_j | C = c). The unlabeled data are weighted down by a predefined discount factor μ < 1 to ensure better correspondence between the clustering structure and the class distribution. In other words, the value of w(X, c) is multiplied by μ for only the unlabeled examples before estimating P(x_j = a_j | C = c) in Eq. 11.20. The EM approach to semisupervised classification is particularly remarkable because it demonstrates the link between semisupervised clustering and semisupervised classification, even though these two kinds of semisupervision are motivated by different application scenarios.
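A minimal sketch of this discounted M-step estimate is shown below, assuming categorical features and weighted counts. The parameter mu plays the role of the discount factor μ from the text, while the Laplace smoothing term alpha is an extra assumption added here for numerical robustness.

```python
import numpy as np

def m_step_conditional(X, w, is_labeled, n_values, mu=0.2, alpha=1.0):
    """Estimate P(x_j = v | C = c) from weighted cluster memberships.

    X          -- (n, d) integer matrix of categorical feature values
    w          -- (n, k) memberships w(X, c): one-hot rows for labeled
                  points, current posterior estimates for unlabeled points
    is_labeled -- boolean array of length n
    n_values   -- number of distinct values of feature j, for each j
    mu         -- discount factor < 1 applied to unlabeled examples
    alpha      -- Laplace smoothing (an added assumption, not in the text)
    """
    # Weight down the contribution of unlabeled examples by mu.
    w_eff = np.where(is_labeled[:, None], w, mu * w)
    k = w.shape[1]
    cond_probs = []
    for j, nv in enumerate(n_values):
        counts = np.full((k, nv), alpha)
        for v in range(nv):
            # Accumulate the (discounted) weight of records with x_j = v.
            counts[:, v] += w_eff[X[:, j] == v].sum(axis=0)
        # Normalize rows so each class has a distribution over values of x_j.
        cond_probs.append(counts / counts.sum(axis=1, keepdims=True))
    return cond_probs
```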


11.6.2.2 Transductive Support Vector Machines


The general assumption in most semisupervised methods is that the label values of unlabeled examples do not vary abruptly in densely populated regions of the data. In transductive support vector machines, this assumption is implicitly encoded by assigning labels to the unlabeled examples so as to maximize the margin of the support vector machine. To understand this point, consider the example of Fig. 11.2b. In this case, the margin of the SVM will be optimized only when the labels of the unlabeled examples in the cluster containing the single labeled example of class A are also set to the same value A. The same is true for the unlabeled examples in the cluster containing the single labeled example of class B. Therefore, the SVM formulation now needs to be modified to incorporate additional margin constraints and binary decision variables for each unlabeled example. Recall from the discussion in Sect. 10.6 of Chap. 10 that the original SVM formulation minimizes the objective function
||W||²/2 + C ∑_{i=1}^{n} ξ_i, subject to the following constraints:

y_i(W · X_i + b) ≥ 1 − ξ_i  for all i,
ξ_i ≥ 0  for all i.
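Because of the binary decision variables for the unlabeled examples, the resulting optimization problem is an integer program, and exact solutions are expensive. A common family of heuristics alternates between assigning labels to the unlabeled examples and retraining the margin. The sketch below is one such simplified self-labeling loop built on scikit-learn's LinearSVC; the weight schedule, the function name, and the parameters are illustrative assumptions, not the exact formulation above.

```python
import numpy as np
from sklearn.svm import LinearSVC

def transductive_svm(X_lab, y_lab, X_unl, n_rounds=10, c_unl_final=1.0):
    """Self-labeling heuristic in the spirit of a transductive SVM:
    alternately assign labels to unlabeled points and retrain, so the
    final margin is large over labeled and self-labeled examples alike."""
    svm = LinearSVC(C=1.0)
    svm.fit(X_lab, y_lab)
    for t in range(1, n_rounds + 1):
        # Each unlabeled point takes the label of the margin side it falls on.
        y_unl = svm.predict(X_unl)
        # Ramp up the influence of the (unreliable) self-labels over rounds.
        c_unl = c_unl_final * t / n_rounds
        X_all = np.vstack([X_lab, X_unl])
        y_all = np.concatenate([y_lab, y_unl])
        sample_weight = np.concatenate(
            [np.ones(len(y_lab)), np.full(len(y_unl), c_unl)])
        svm = LinearSVC(C=1.0)
        svm.fit(X_all, y_all, sample_weight=sample_weight)
    return svm
```

Gradually increasing the weight of the unlabeled examples reflects the intuition that their self-assigned labels are unreliable in early rounds and should only dominate once the decision boundary has stabilized.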