This distance function is the same as the Euclidean metric when A is the identity matrix. Different choices of A can lead to better sensitivity of the distance function to the local and global data distributions. These different choices will be discussed in the following subsections.
10.8.1.1 Unsupervised Mahalanobis Metric
The Mahalanobis metric is introduced in Chap. 3. In this case, the value of A is chosen to be the inverse of the d × d covariance matrix Σ of the data set. The (i, j)th entry of the matrix Σ is the covariance between the dimensions i and j. Therefore, the Mahalanobis distance is defined as follows:
Dist(\overline{X}, \overline{Y}) = \sqrt{ (\overline{X} - \overline{Y}) \, \Sigma^{-1} \, (\overline{X} - \overline{Y})^T }    (10.72)
The Mahalanobis metric adjusts well to the different scaling of the dimensions and the redundancies across different features. Even when the data is uncorrelated, the Mahalanobis metric is useful because it auto-scales for the naturally different ranges of attributes describing different physical quantities, such as age and salary. Such a scaling ensures that no single attribute dominates the distance function. In cases where the attributes are correlated, the Mahalanobis metric accounts well for the varying redundancies in different features. However, its major weakness is that it does not account for the varying shapes of the class distributions in the underlying data.
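To make this concrete, the following Python sketch evaluates Eq. 10.72 on a small synthetic data set: it estimates the covariance matrix Σ from the data, inverts it, and compares the resulting Mahalanobis distance with the plain Euclidean distance between the same pair of points. The data, variable names, and use of NumPy are illustrative assumptions, not part of the text.

import numpy as np

def mahalanobis_distance(x, y, data):
    # Covariance matrix Sigma estimated from the rows of `data` (Eq. 10.72).
    sigma = np.cov(data, rowvar=False)
    sigma_inv = np.linalg.inv(sigma)   # use np.linalg.pinv if Sigma is singular
    diff = np.asarray(x) - np.asarray(y)
    return float(np.sqrt(diff @ sigma_inv @ diff))

# Illustrative data: two correlated attributes on very different scales
# (e.g., age in years and salary in dollars).
rng = np.random.default_rng(0)
age = rng.normal(40, 10, size=200)
salary = 1000 * age + rng.normal(0, 5000, size=200)
data = np.column_stack([age, salary])

x, y = data[0], data[1]
print("Euclidean:  ", np.linalg.norm(x - y))              # dominated by the salary axis
print("Mahalanobis:", mahalanobis_distance(x, y, data))   # auto-scaled and de-correlated

Because Σ−1 divides out the per-attribute variances and the cross-attribute correlation, neither the salary scale nor the age–salary redundancy dominates the second distance.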
10.8.1.2 Nearest Neighbors with Linear Discriminant Analysis
To obtain the best results with a nearest-neighbor classifier, the distance function needs to account for the varying distribution of the different classes. For example, in the case of Fig. 10.11, there are two classes A and B, which are represented by “.” and “*,” respectively. The test instance denoted by X lies on the side of the boundary related to class A. However, the Euclidean metric does not adjust well to the arrangement of the class distribution, and a circle drawn around the test instance seems to include more points from class B than class A.
One way of resolving the challenges associated with this scenario is to weight the most discriminating directions more heavily in the distance function with an appropriate choice of the matrix A.
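The excerpt breaks off before specifying how the matrix A is constructed. As a minimal sketch of the idea, the following Python code uses one common LDA-flavored choice, A = S_w^{-1}, the inverse of the within-class scatter matrix, so that directions along which the classes are tightly packed are weighted more heavily in a nearest-neighbor classifier. This particular choice of A, together with all names and the synthetic data, is an assumption for illustration rather than necessarily the construction the text goes on to describe.

import numpy as np

def within_class_scatter(X, labels):
    # S_w: sum over classes of the scatter of each class around its own mean.
    d = X.shape[1]
    S_w = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        diff = Xc - Xc.mean(axis=0)
        S_w += diff.T @ diff
    return S_w

def weighted_distance(x, y, A):
    # Distance of the form sqrt((x - y) A (x - y)^T) for a given matrix A.
    diff = x - y
    return float(np.sqrt(diff @ A @ diff))

def nearest_neighbor_predict(test_point, X, labels, A):
    # 1-nearest-neighbor prediction under the weighted distance.
    dists = [weighted_distance(test_point, x, A) for x in X]
    return labels[int(np.argmin(dists))]

# Two synthetic classes that are elongated along the first axis; A = S_w^{-1}
# down-weights that direction, so the discriminating second axis matters more.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0.0, 0.0], [3.0, 0.3], (50, 2)),
               rng.normal([0.0, 1.0], [3.0, 0.3], (50, 2))])
labels = np.array(["A"] * 50 + ["B"] * 50)
A = np.linalg.inv(within_class_scatter(X, labels))
print(nearest_neighbor_predict(np.array([1.0, 0.2]), X, labels, A))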
This approach is also referred to as leave-one-out cross-validation, and is described in detail in Sect. 10.9 on classifier evaluation.