Data Mining: The Textbook
  1. For the soft SVM formulation, show the following:

     (a) The weight vector satisfies W = Σ(i=1..n) λi yi Xi, as for hard SVMs.

     (b) The condition Σ(i=1..n) λi yi = 0 holds, as in hard SVMs.

     (c) The Lagrangian multipliers satisfy λi ≤ C.

     (d) The Lagrangian dual is identical to that of hard SVMs.
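Because the soft-SVM dual differs from the hard-margin dual only in the box constraint 0 ≤ λi ≤ C, a projected gradient-ascent solver needs just one extra clipping step. The following is a minimal sketch, not code from the book: the helper name `soft_svm_dual_ascent` is mine, it assumes a linear kernel, and it assumes the bias has been absorbed into W via a constant feature so that the equality constraint Σ λi yi = 0 can be dropped.

```python
import numpy as np

def soft_svm_dual_ascent(X, y, C=1.0, lr=0.01, n_iter=1000):
    """Projected gradient ascent on the soft-SVM Lagrangian dual.

    Assumes the bias is absorbed into W (e.g., via a constant feature),
    so only the box constraint 0 <= lambda_i <= C remains.
    """
    n = X.shape[0]
    K = X @ X.T                    # linear-kernel Gram matrix
    lam = np.zeros(n)
    for _ in range(n_iter):
        # Gradient of L_D = sum_i lam_i - 0.5 * sum_ij lam_i lam_j y_i y_j K_ij
        grad = 1.0 - y * (K @ (lam * y))
        lam = np.clip(lam + lr * grad, 0.0, C)   # project onto the box [0, C]
    # Recover the weight vector: W = sum_i lambda_i y_i X_i
    w = (lam * y) @ X
    return lam, w
```

Only points with λi > 0 (the support vectors) contribute to W, and λi = C marks points that are inside the margin or misclassified.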




  2. Show that it is possible to omit the bias parameter b from the decision boundary of SVMs by suitably preprocessing the data set. In other words, the decision boundary is now W · X = 0. What is the impact of eliminating the bias parameter on the gradient ascent approach for Lagrangian dual optimization in SVMs?
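The standard preprocessing trick is to append a constant feature to every point. A minimal sketch (the helper name `absorb_bias` is mine, not the book's):

```python
import numpy as np

def absorb_bias(X):
    """Append a constant feature of 1s, so that a homogeneous separator
    W' . X' = 0 on the augmented data encodes W . X + b = 0 on the
    original data (b becomes the last coordinate of W')."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_aug = absorb_bias(X)   # shape (2, 3); last column is all 1s
```

With b gone, the dual's equality constraint Σ λi yi = 0 (which arises from setting the derivative with respect to b to zero) disappears, so each λi can be updated and clipped into [0, C] independently, with no projection onto the hyperplane Σ λi yi = 0.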




  3. Show that an n × d data set can be mean-centered by premultiplying it with the n × n matrix (I − U/n), where U is the n × n matrix of all ones. Show that an n × n kernel matrix K can be adjusted for mean centering of the data in the transformed space by replacing it with (I − U/n)K(I − U/n).
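Both claims are easy to check numerically. The sketch below verifies them for a small random data set, using the linear kernel K = XXᵀ as the transformed-space example (the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 3
X = rng.normal(size=(n, d))

# Premultiplying by (I - U/n) subtracts each column's mean.
U = np.ones((n, n))            # the all-ones matrix U
M = np.eye(n) - U / n
X_centered = M @ X
assert np.allclose(X_centered, X - X.mean(axis=0))

# The same matrix centers a kernel matrix in the transformed space:
# for the linear kernel K = X X^T, the adjusted matrix M K M equals
# the Gram matrix of the centered data, since M is symmetric.
K = X @ X.T
assert np.allclose(M @ K @ M, X_centered @ X_centered.T)
```

The second identity follows from the first because M K M = (M X)(M X)ᵀ when M is symmetric.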




  4. Consider two classifiers A and B. On one data set, a 10-fold cross validation shows that classifier A is better than B by 3 %, with a standard deviation of 7 % over 100 different folds. On the other data set, classifier B is better than classifier A by 1 %, with a standard deviation of 0.1 % over 100 different folds. Which classifier would you prefer on the basis of this evidence, and why?
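One common way to reason about such comparisons is to look at standard errors rather than raw differences. The sketch below (the helper name `z_score` is mine, and it treats the folds as independent samples, which cross-validation folds strictly are not) computes how many standard errors each observed difference is from zero:

```python
import math

def z_score(mean_diff, std, n_folds):
    """z-score of a mean accuracy difference: the difference divided
    by the standard error of the mean over n_folds folds."""
    return mean_diff / (std / math.sqrt(n_folds))

# Data set 1: A beats B by 3% with a standard deviation of 7% over 100 folds.
z1 = z_score(3.0, 7.0, 100)    # ≈ 4.29
# Data set 2: B beats A by 1% with a standard deviation of 0.1% over 100 folds.
z2 = z_score(1.0, 0.1, 100)    # ≈ 100
```

Both differences are many standard errors from zero, but the second is far more consistent across folds, which is the kind of evidence the exercise asks you to weigh.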




  5. Provide a nonlinear transformation which would make the data set of Exercise 14 linearly separable.




  6. Let Sw and Sb be defined according to Sect. 10.2.1.3 for the binary class problem. Let the fractional presence of the two classes be p0 and p1, respectively. Show that Sw + p0p1Sb is equal to the covariance matrix of the data set.
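The identity can be sanity-checked numerically. Section 10.2.1.3's definitions are not reproduced here, so the sketch below assumes the usual ones: Sw is the class-weighted sum of the per-class (biased, divide-by-n) covariance matrices, and Sb is the outer product of the difference of class means.

```python
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.normal(loc=0.0, size=(40, 3))   # class 0, fractional presence p0
X1 = rng.normal(loc=2.0, size=(60, 3))   # class 1, fractional presence p1
X = np.vstack([X0, X1])
p0, p1 = len(X0) / len(X), len(X1) / len(X)

def cov(A):
    """Biased (divide-by-n) covariance matrix of the rows of A."""
    C = A - A.mean(axis=0)
    return C.T @ C / len(A)

Sw = p0 * cov(X0) + p1 * cov(X1)             # within-class scatter
mu_diff = X1.mean(axis=0) - X0.mean(axis=0)
Sb = np.outer(mu_diff, mu_diff)              # between-class scatter
assert np.allclose(Sw + p0 * p1 * Sb, cov(X))
```

The proof follows the same decomposition: the overall covariance splits into the weighted per-class covariances plus the weighted squared deviations of the class means from the global mean, and the latter term collapses to p0p1Sb.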

Chapter 11


Data Classification: Advanced Concepts

“Labels are for filing. Labels are for clothing. Labels are not for people.”—Martina Navratilova


11.1 Introduction


In this chapter, a number of advanced scenarios related to the classification problem will be addressed. These include more difficult special cases of the classification problem and various ways of enhancing classification algorithms with the use of additional inputs or a combination of classifiers. The enhancements discussed in this chapter belong to one of the following two categories:





