Data Mining: The Textbook

patterns are discussed in [149]. A recent overview discussion of pattern-based classification algorithms may be found in [115]. The naive Bayes classifier has been discussed in detail in [187, 333, 344]. The work in [344] is particularly notable, in that it provides an understanding and justification of the naive Bayes assumption. A brief discussion of logistic regression models may be found in Chap. 3 of [33]. A more detailed discussion may be found in [275].


Numerous books are available on the topic of SVMs [155, 449, 478, 494]. An excellent tutorial on SVMs may be found in [124]. A detailed discussion of the Lagrangian relaxation technique for solving the resulting quadratic optimization problem may be found in [485]. It has been pointed out [133] that the advantages of the primal approach in SVMs seem to have been largely overlooked in the literature. It is sometimes mistakenly understood that the kernel trick can only be applied to the dual; the trick can be applied to the primal formulation as well [133]. A discussion of kernel methods for SVMs may be found in [451]. Other applications of kernels, such as nonlinear k-means and nonlinear PCA, are discussed in [173, 450]. The perceptron algorithm was due to Rosenblatt [439]. Neural networks are discussed in detail in several books [96, 260]. The back-propagation algorithm is described in detail in these books. The earliest work on instance-based classification was discussed in [167]. The method was subsequently extended to symbolic attributes [166]. Two surveys on instance-based classification may be found in [14, 183]. Local methods for nearest-neighbor classification are discussed in [216, 255]. Generalized instance-based learning methods have been studied in the context of decision trees [217], rule-based methods [347], Bayes methods [214], SVMs [105, 544], and neural networks [97, 209, 281]. Methods for classifier evaluation are discussed in [256].


10.12 Exercises



  1. Compute the Gini index for the entire data set of Table 10.1, with respect to the two classes. Compute the Gini index for the portion of the data set with age at least 50.
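
A minimal sketch of the Gini computation for this exercise, assuming the class labels of the relevant subset have already been extracted into a list (the counts in the example are illustrative, not those of Table 10.1):

    from collections import Counter

    def gini_index(labels):
        # Gini index: 1 minus the sum of squared class fractions.
        n = len(labels)
        if n == 0:
            return 0.0
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    print(gini_index(["D"] * 6 + ["N"] * 4))  # 1 - (0.36 + 0.16) = 0.48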




  2. Repeat the computation of the previous exercise with the use of the entropy criterion.
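
An analogous sketch for the entropy criterion, under the same illustrative setup:

    import math
    from collections import Counter

    def entropy(labels):
        # Entropy: -sum over classes of p_j * log2(p_j).
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    print(entropy(["D"] * 6 + ["N"] * 4))  # about 0.971 bits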




  3. Show how to construct a (possibly overfitting) rule-based classifier that always exhibits 100% accuracy on the training data. Assume that the feature variables of no two training instances are identical.




  4. Design a univariate decision tree with a soft maximum-margin split criterion borrowed from SVMs. Suppose that this decision tree is generalized to the multivariate case. How does the resulting decision boundary compare with SVMs? Which classifier can handle a larger variety of data sets more accurately?




  5. Discuss the advantages of a rule-based classifier over a decision tree.




  6. Show that an SVM is a special case of a rule-based classifier. Design a rule-based classifier that uses SVMs to create an ordered list of rules.




  7. Implement an associative classifier in which only maximal patterns are used for classification, and the majority consequent label of the rules fired is reported as the label of the test instance.
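
One possible skeleton for the prediction step, assuming the maximal patterns and their majority consequent labels have already been mined offline (all names here are illustrative):

    from collections import Counter

    def classify(test_items, rules):
        # rules: (antecedent frozenset, label) pairs, one per maximal pattern.
        # A rule fires when its antecedent is contained in the test instance;
        # the majority label among the fired rules is reported.
        fired = [label for antecedent, label in rules
                 if antecedent <= set(test_items)]
        if not fired:
            return None  # a full implementation would fall back to a default class
        return Counter(fired).most_common(1)[0][0]

    rules = [(frozenset({"a", "b"}), "C1"),
             (frozenset({"b", "c"}), "C2"),
             (frozenset({"a", "c"}), "C1")]
    print(classify({"a", "b", "c"}, rules))  # C1 (label of two of the three fired rules)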




  8. Suppose that you had d-dimensional numeric training data, in which it was known that the probability density of a d-dimensional data instance X in each class i is proportional to e^{-||X - μ_i||_1}, where ||·||_1 is the Manhattan distance, and μ_i is known for each class. How would you implement the Bayes classifier in this case? How would your answer change if μ_i is unknown?
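
A sketch of the known-mean case: the posterior of class i is proportional to its prior times e^{-||X - μ_i||_1}, and the density's normalizing constant is the same for every class, so classes can be compared directly in log space (the names and data below are hypothetical). If μ_i were unknown, it could be estimated from the training instances of each class; the componentwise median is the maximum-likelihood estimate for this density.

    import math

    def bayes_predict(x, means, priors):
        # Compare log P(class) - ||x - mu_i||_1 across classes; the shared
        # normalizing constant of the density cancels out.
        def manhattan(a, b):
            return sum(abs(p - q) for p, q in zip(a, b))
        return max(means, key=lambda c: math.log(priors[c]) - manhattan(x, means[c]))

    means = {"C1": [0.0, 0.0], "C2": [3.0, 3.0]}
    priors = {"C1": 0.5, "C2": 0.5}
    print(bayes_predict([0.5, 1.0], means, priors))  # C1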





  9. Explain how the mutual exclusiveness and exhaustiveness of a rule set relate to the need to order the rule set, or to set a class as the default class.




  10. Consider the rules Age > 40 ⇒ Donor and Age ≤ 50 ⇒ ¬Donor. Are these two rules mutually exclusive? Are these two rules exhaustive?




  11. For the example of Table 10.1, determine the prior probability of each class. Determine the conditional probability of each class for cases where the Age is at least 50.




  12. Implement the naive Bayes classifier.
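
A minimal categorical naive Bayes sketch with Laplace smoothing, as a starting point (illustrative, not a production implementation):

    import math
    from collections import Counter, defaultdict

    class NaiveBayes:
        def fit(self, X, y):
            # X: list of categorical feature tuples; y: list of class labels.
            self.n = len(y)
            self.class_counts = Counter(y)
            # counts[label][j][value] = number of class-label training
            # instances whose j-th feature takes this value
            self.counts = defaultdict(lambda: defaultdict(Counter))
            for row, label in zip(X, y):
                for j, value in enumerate(row):
                    self.counts[label][j][value] += 1
            return self

        def predict(self, row):
            def score(label):
                s = math.log(self.class_counts[label] / self.n)  # log prior
                for j, value in enumerate(row):
                    # Laplace smoothing; the 2 assumes binary features (illustrative)
                    s += math.log((self.counts[label][j][value] + 1)
                                  / (self.class_counts[label] + 2))
                return s
            return max(self.class_counts, key=score)

    nb = NaiveBayes().fit([("y", "n"), ("y", "y"), ("n", "n")], ["C1", "C1", "C2"])
    print(nb.predict(("y", "n")))  # C1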




  13. For the example of Table 10.1, provide a single linear separating hyperplane. Is this separating hyperplane unique?




  14. Consider a data set containing four points located at the corners of a square. The two points on one diagonal belong to one class, and the two points on the other diagonal belong to the other class. Is this data set linearly separable? Provide a proof.




  15. Provide a systematic way to determine whether two classes in a labeled data set are linearly separable.
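
One systematic test phrases this as a linear feasibility problem: the two classes are linearly separable if and only if there exist W and b with y_i (W · X_i + b) ≥ 1 for all instances, since any strictly separating hyperplane can be rescaled to achieve margin 1. A sketch using scipy's LP solver (the use of scipy here is an assumption; any LP solver would do):

    import numpy as np
    from scipy.optimize import linprog

    def linearly_separable(X, y):
        # Feasibility of y_i (W . x_i + b) >= 1, written as A_ub z <= b_ub
        # with z = (W, b) and a zero objective vector.
        n, d = X.shape
        A = -y[:, None] * np.hstack([X, np.ones((n, 1))])
        res = linprog(c=np.zeros(d + 1), A_ub=A, b_ub=-np.ones(n),
                      bounds=[(None, None)] * (d + 1))
        return res.status == 0  # status 2 indicates infeasibility

    # The four-corners configuration of the previous exercise is not separable:
    X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    print(linearly_separable(X, y))  # False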




  16. For the soft SVM formulation with hinge loss, show that:

      (a) The weight vector is given by the same relationship W = Σ_{i=1}^{n} λ_i y_i X_i as in the hard-margin SVM formulation.
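
As a companion to this exercise (not part of the proof itself), a tiny full-batch subgradient-descent sketch for the primal objective 0.5||W||^2 + C Σ_i max(0, 1 - y_i (W · X_i + b)), with illustrative step size and epoch count:

    import numpy as np

    def train_soft_svm(X, y, C=1.0, lr=0.01, epochs=200):
        # Subgradient of the hinge term is -y_i x_i for violating instances
        # (margin < 1) and zero otherwise; the regularizer contributes W.
        n, d = X.shape
        W, b = np.zeros(d), 0.0
        for _ in range(epochs):
            viol = y * (X @ W + b) < 1
            W -= lr * (W - C * (y[viol, None] * X[viol]).sum(axis=0))
            b -= lr * (-C * y[viol].sum())
        return W, b

    # Toy linearly separable data
    X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    W, b = train_soft_svm(X, y)
    print(np.sign(X @ W + b))  # [ 1.  1. -1. -1.]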




