Data Mining: The Textbook
  1. For the soft SVM formulation, show the following:

     (a) The weight vector satisfies W = Σ(i=1..n) λi yi Xi, as for hard SVMs.

     (b) The condition Σ(i=1..n) λi yi = 0 holds, as in hard SVMs.

     (c) The Lagrangian multipliers satisfy λi ≤ C.

     (d) The Lagrangian dual is identical to that of hard SVMs.
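Because the soft-SVM dual differs from the hard-margin dual only in the box constraint 0 ≤ λi ≤ C, a projected gradient-ascent solver needs just one extra clipping step. The following is a minimal sketch, not code from the book: the helper name `soft_svm_dual_ascent` is mine, it assumes a linear kernel, and it assumes the bias has been absorbed into W via a constant feature so that the equality constraint Σ λi yi = 0 can be dropped.

```python
import numpy as np

def soft_svm_dual_ascent(X, y, C=1.0, lr=0.01, n_iter=1000):
    """Projected gradient ascent on the soft-SVM Lagrangian dual.

    Assumes the bias is absorbed into W (e.g., via a constant feature),
    so only the box constraint 0 <= lambda_i <= C remains.
    """
    n = X.shape[0]
    K = X @ X.T                    # linear-kernel Gram matrix
    lam = np.zeros(n)
    for _ in range(n_iter):
        # Gradient of L_D = sum_i lam_i - 0.5 * sum_ij lam_i lam_j y_i y_j K_ij
        grad = 1.0 - y * (K @ (lam * y))
        lam = np.clip(lam + lr * grad, 0.0, C)   # project onto the box [0, C]
    # Recover the weight vector: W = sum_i lambda_i y_i X_i
    w = (lam * y) @ X
    return lam, w
```

Only points with λi > 0 (the support vectors) contribute to W, and λi = C marks points that are inside the margin or misclassified.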




  2. Show that it is possible to omit the bias parameter b from the decision boundary of SVMs by suitably preprocessing the data set. In other words, the decision boundary is now W · X = 0. What is the impact of eliminating the bias parameter on the gradient ascent approach for Lagrangian dual optimization in SVMs?
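The standard preprocessing trick is to append a constant feature to every point. A minimal sketch (the helper name `absorb_bias` is mine, not the book's):

```python
import numpy as np

def absorb_bias(X):
    """Append a constant feature of 1s, so that a homogeneous separator
    W' . X' = 0 on the augmented data encodes W . X + b = 0 on the
    original data (b becomes the last coordinate of W')."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_aug = absorb_bias(X)   # shape (2, 3); last column is all 1s
```

With b gone, the dual's equality constraint Σ λi yi = 0 (which arises from setting the derivative with respect to b to zero) disappears, so each λi can be updated and clipped into [0, C] independently, with no projection onto the hyperplane Σ λi yi = 0.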




  3. Show that an n × d data set can be mean-centered by premultiplying it with the n × n matrix (I − U/n), where U is the n × n matrix of all ones. Show that an n × n kernel matrix K can be adjusted for mean centering of the data in the transformed space by replacing it with (I − U/n)K(I − U/n).
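Both claims are easy to check numerically. The sketch below verifies them for a small random data set, using the linear kernel K = XXᵀ as the transformed-space example (the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 3
X = rng.normal(size=(n, d))

# Premultiplying by (I - U/n) subtracts each column's mean.
U = np.ones((n, n))            # the all-ones matrix U
M = np.eye(n) - U / n
X_centered = M @ X
assert np.allclose(X_centered, X - X.mean(axis=0))

# The same matrix centers a kernel matrix in the transformed space:
# for the linear kernel K = X X^T, the adjusted matrix M K M equals
# the Gram matrix of the centered data, since M is symmetric.
K = X @ X.T
assert np.allclose(M @ K @ M, X_centered @ X_centered.T)
```

The second identity follows from the first because M K M = (M X)(M X)ᵀ when M is symmetric.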




  4. Consider two classifiers A and B. On one data set, a 10-fold cross validation shows that classifier A is better than B by 3 %, with a standard deviation of 7 % over 100 different folds. On the other data set, classifier B is better than classifier A by 1 %, with a standard deviation of 0.1 % over 100 different folds. Which classifier would you prefer on the basis of this evidence, and why?
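One common way to reason about such comparisons is to look at standard errors rather than raw differences. The sketch below (the helper name `z_score` is mine, and it treats the folds as independent samples, which cross-validation folds strictly are not) computes how many standard errors each observed difference is from zero:

```python
import math

def z_score(mean_diff, std, n_folds):
    """z-score of a mean accuracy difference: the difference divided
    by the standard error of the mean over n_folds folds."""
    return mean_diff / (std / math.sqrt(n_folds))

# Data set 1: A beats B by 3% with a standard deviation of 7% over 100 folds.
z1 = z_score(3.0, 7.0, 100)    # ≈ 4.29
# Data set 2: B beats A by 1% with a standard deviation of 0.1% over 100 folds.
z2 = z_score(1.0, 0.1, 100)    # ≈ 100
```

Both differences are many standard errors from zero, but the second is far more consistent across folds, which is the kind of evidence the exercise asks you to weigh.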




  5. Provide a nonlinear transformation which would make the data set of Exercise 14 linearly separable.




  6. Let Sw and Sb be defined according to Sect. 10.2.1.3 for the binary class problem. Let the fractional presence of the two classes be p0 and p1, respectively. Show that Sw + p0p1Sb is equal to the covariance matrix of the data set.
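The identity can be sanity-checked numerically. Section 10.2.1.3's definitions are not reproduced here, so the sketch below assumes the usual ones: Sw is the class-weighted sum of the per-class (biased, divide-by-n) covariance matrices, and Sb is the outer product of the difference of class means.

```python
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.normal(loc=0.0, size=(40, 3))   # class 0, fractional presence p0
X1 = rng.normal(loc=2.0, size=(60, 3))   # class 1, fractional presence p1
X = np.vstack([X0, X1])
p0, p1 = len(X0) / len(X), len(X1) / len(X)

def cov(A):
    """Biased (divide-by-n) covariance matrix of the rows of A."""
    C = A - A.mean(axis=0)
    return C.T @ C / len(A)

Sw = p0 * cov(X0) + p1 * cov(X1)             # within-class scatter
mu_diff = X1.mean(axis=0) - X0.mean(axis=0)
Sb = np.outer(mu_diff, mu_diff)              # between-class scatter
assert np.allclose(Sw + p0 * p1 * Sb, cov(X))
```

The proof follows the same decomposition: the overall covariance splits into the weighted per-class covariances plus the weighted squared deviations of the class means from the global mean, and the latter term collapses to p0p1Sb.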

Chapter 11


Data Classification: Advanced Concepts

“Labels are for filing. Labels are for clothing. Labels are not for people.”—Martina Navratilova


11.1 Introduction


In this chapter, a number of advanced scenarios related to the classification problem will be addressed. These include more difficult special cases of the classification problem and various ways of enhancing classification algorithms with the use of additional inputs or a combination of classifiers. The enhancements discussed in this chapter belong to one of the following two categories:





