Data Mining: The Textbook




[Figure: the base data is a 4 × 4 grid of global sea-surface temperatures (rows 75 76 75 72 / 77 73 73 74 / 72 71 78 80 / 74 75 79 76) with global average 75. The first cut, along the X-axis, yields an average temperature difference of 7/4 between the left and right blocks (coefficient 7/8). The second-level cuts, along the Y-axis, yield average temperature differences of 9/4 and 19/4 between the top and bottom halves of the two blocks (coefficients 9/8 and 19/8). Binary matrices represent the 2-dimensional basis matrices.]

Figure 2.7: Illustration of the top levels of the wavelet decomposition for spatial data in a grid containing sea-surface temperatures


1. The same subset of coefficients is retained for all N series. In this case, the coefficients that have the largest average normalized coefficient across the N different series are selected.





  2. The full dimensionality of the wavelet coefficient representation is retained. However, for each time series, the largest normalized coefficients (in magnitude) are selected individually. The remaining values are set to 0. This results in a sparse database of high dimensionality, in which many values are 0. A method such as SVD can be applied as a second step to further reduce the dimensionality. The second step of this approach has the disadvantage of losing the interpretability of the features of the wavelet transform. Recall that the Haar wavelet is one of the few dimensionality reduction transforms whose coefficients have some interpretability in terms of specific trends across particular time-series segments.

The wavelet decomposition thus provides a natural approach to dimensionality reduction (and data-type transformation) by retaining only a small number of coefficients.
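As a concrete illustration of the second strategy above, the following sketch (a minimal example assuming numpy; the helper names are this example's own, and "normalized" coefficients are simplified to raw magnitudes) computes the Haar coefficients of each series, zeroes all but the largest ones, and optionally applies SVD as a second step:

import numpy as np

def haar_coefficients(x):
    # Haar decomposition of a series whose length is a power of 2.
    # The first coefficient is the global average; each remaining one is
    # half the difference between the averages of the two halves of the
    # segment being split.
    coeffs = [np.mean(x)]
    segments = [np.asarray(x, dtype=float)]
    while segments[0].size > 1:
        next_level = []
        for seg in segments:
            half = seg.size // 2
            left, right = seg[:half], seg[half:]
            coeffs.append((left.mean() - right.mean()) / 2.0)
            next_level += [left, right]
        segments = next_level
    return np.array(coeffs)

def sparsify(coeff_rows, k):
    # For each series, keep the k largest coefficients in magnitude and
    # set the rest to 0, yielding a sparse high-dimensional database.
    out = np.zeros_like(coeff_rows)
    for i, row in enumerate(coeff_rows):
        keep = np.argsort(np.abs(row))[-k:]
        out[i, keep] = row[keep]
    return out

# N = 3 toy series of length 8; keep 3 coefficients per series.
database = np.random.rand(3, 8)
coeffs = np.array([haar_coefficients(s) for s in database])
sparse = sparsify(coeffs, k=3)
# Optional second step: SVD on the sparse coefficient matrix.
U, s, Vt = np.linalg.svd(sparse, full_matrices=False)
reduced = U[:, :2] * s[:2]   # 2-dimensional embedding of the 3 series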


Wavelet Decomposition with Multiple Contextual Attributes


Time-series data contain a single contextual attribute, corresponding to the time value. This simplifies the wavelet decomposition. However, in some cases, such as spatial data, there may be two contextual attributes, corresponding to the X-coordinate and the Y-coordinate. For example, sea-surface temperatures are measured at spatial locations that are described with the use of two coordinates. How can wavelet decomposition be performed in such cases? In this section, a brief overview of the extension of wavelets to multiple contextual attributes is provided.


Assume that the spatial data is represented in the form of a 2-dimensional grid of size q × q. Recall that in the 1-dimensional case, differencing operations were applied over contiguous segments of the time series by successive hierarchical division of the series. The corresponding basis vectors have +1 and −1 at the relevant positions. The 2-dimensional case is completely analogous: contiguous areas of the spatial grid are divided successively, with the divisions performed alternately along the different axes. The corresponding basis vectors are 2-dimensional matrices of size q × q that regulate how the differencing operations are performed.


An example of the strategy for 2-dimensional decomposition is illustrated in Fig. 2.7. Only the top two levels of the decomposition are shown. Here, a 4 × 4 grid of spatial temperatures is used as an example. The first division, along the X-axis, divides the spatial area into two blocks of size 4 × 2 each. The corresponding 2-dimensional binary basis matrix is illustrated in the same figure. The next phase divides each of these 4 × 2 blocks into blocks of size 2 × 2 during the hierarchical decomposition process. As in the case of 1-dimensional time series, the wavelet coefficient is half the difference in the average temperatures between the two halves of the relevant block being decomposed. The alternating process of division along the X-axis and the Y-axis can be carried down to the individual data entries. This creates a hierarchical wavelet error tree with many properties similar to the one created in the 1-dimensional case. The overall principles of this decomposition are almost identical to the 1-dimensional case, the major difference being that the cuts alternate among the dimensions at different levels. The approach can be extended to the case of k > 2 contextual attributes by using a round-robin rotation of the axes selected for the differencing operation at different levels of the tree.
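The following sketch (a minimal example assuming numpy; not from the text) reproduces the top-level coefficients of Fig. 2.7 exactly; the signs of the computed coefficients depend on which half is subtracted from which:

import numpy as np
from fractions import Fraction

# The 4 x 4 grid of sea-surface temperatures from Fig. 2.7.
grid = np.array([[75, 76, 75, 72],
                 [77, 73, 73, 74],
                 [72, 71, 78, 80],
                 [74, 75, 79, 76]])

def avg(block):
    # Exact average of a block, as a fraction.
    return Fraction(int(block.sum()), block.size)

def coefficient(half_a, half_b):
    # Half the difference between the average values of the two halves
    # of the block being split.
    return (avg(half_a) - avg(half_b)) / 2

print(avg(grid))                               # global average: 75
left, right = grid[:, :2], grid[:, 2:]         # cut along the X-axis
print(coefficient(left, right))                # -7/8 (magnitude 7/8)
# Cut each 4 x 2 block along the Y-axis into 2 x 2 blocks:
print(coefficient(left[:2, :], left[2:, :]))   # 9/8
print(coefficient(right[:2, :], right[2:, :])) # -19/8 (magnitude 19/8)

# Equivalently, a coefficient is an inner product with a basis matrix
# (+1 on one block, -1 on the other), scaled by the grid size:
B = np.hstack([np.ones((4, 2)), -np.ones((4, 2))])
print(Fraction(int((grid * B).sum()), 16))     # -7/8 again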


2.4.4.2 Multidimensional Scaling

Graphs are a powerful mechanism for representing relationships between objects. In some data mining scenarios, the data type of an object may be very complex and heterogeneous, such as a time series annotated with text and other numeric attributes. However, a crisp notion of distance between several pairs of data objects may be available based on application-specific goals. How can one visualize the inherent similarity between these objects? How can one visualize the “nearness” of two individuals connected in a social network? A natural way of doing so is the concept of multidimensional scaling (MDS). Although MDS was originally proposed in the context of spatial visualization of graph-structured distances, it has much broader applicability for embedding data objects of arbitrary types in multidimensional space. Such an approach also enables the use of multidimensional data mining algorithms on the embedded data.


For a graph with n nodes, let δij = δji denote the specified distance between nodes i and j. It is assumed that all n(n − 1)/2 pairwise distances between nodes are specified. It is desired to map the n nodes to n different k-dimensional vectors, denoted by X1 . . . Xn, so that the distances in multidimensional space closely correspond to the n(n − 1)/2 distance values in the distance graph. In MDS, the k coordinates of each of these n points are treated as variables that need to be optimized, so that they can fit the current set of pairwise distances. Metric MDS, also referred to as classical MDS, attempts to solve the following optimization (minimization) problem:





O = Σ_{i,j: i<j} (‖Xi − Xj‖ − δij)²    (2.16)

Here, ‖ · ‖ represents the Euclidean norm. In other words, each node is represented by a multidimensional data point, such that the Euclidean distances between these points reflect the graph distances as closely as possible. In other forms of nonmetric MDS, this objective function might be different. The optimization problem therefore has n · k variables, and it scales with the size of the data n and the desired dimensionality k of the embedding.
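As a concrete illustration of Eq. 2.16, the sketch below (a minimal example assuming numpy and scipy; the toy distance matrix is invented for illustration) treats the n · k coordinates as optimization variables and hands the objective to a generic numerical optimizer. Practical MDS implementations use more specialized solvers, such as the eigendecomposition approach summarized in Table 2.3:

import numpy as np
from scipy.optimize import minimize

def stress(flat_coords, delta, n, k):
    # Objective of Eq. 2.16: the sum over pairs i < j of
    # (||X_i - X_j|| - delta_ij)^2, with the n * k coordinates as variables.
    X = flat_coords.reshape(n, k)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (np.linalg.norm(X[i] - X[j]) - delta[i, j]) ** 2
    return total

# Toy symmetric distance matrix over n = 4 nodes, embedded with k = 2.
delta = np.array([[0., 1., 2., 2.],
                  [1., 0., 1., 2.],
                  [2., 1., 0., 1.],
                  [2., 2., 1., 0.]])
n, k = 4, 2
x0 = np.random.rand(n * k)                  # random initial coordinates
result = minimize(stress, x0, args=(delta, n, k))
embedding = result.x.reshape(n, k)          # one k-dimensional point per node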




Table 2.3: Scaled eigenvectors of various similarity matrices yield embeddings with different properties



Method                                  | Relevant similarity matrix
PCA                                     | Dot product matrix DDᵀ after mean-centering D
SVD                                     | Dot product matrix DDᵀ
Spectral embedding (symmetric version)  | Sparsified/normalized similarity matrix Λ^(−1/2) W Λ^(−1/2) (cf. Sect. 19.3.4 of Chap. 19)
MDS/ISOMAP                              | Similarity matrix derived from the distance matrix Δ with the cosine law: S = −(1/2)(I − U/n) Δ (I − U/n), where U is the n × n matrix of 1s
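The MDS/ISOMAP row corresponds to classical MDS computed by eigendecomposition rather than by iterative optimization. Below is a minimal sketch (assuming numpy, and assuming that Δ holds squared distances when the cosine law is applied, the usual convention for double centering):

import numpy as np

def classical_mds(delta, k):
    # Double centering per Table 2.3: S = -(1/2)(I - U/n) D2 (I - U/n),
    # where U is the n x n matrix of 1s and D2 holds squared distances.
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix I - U/n
    S = -0.5 * J @ (delta ** 2) @ J
    # The embedding consists of the scaled top-k eigenvectors of S.
    evals, evecs = np.linalg.eigh(S)         # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:k]        # indices of the k largest
    scale = np.sqrt(np.maximum(evals[idx], 0.0))   # clip tiny negatives
    return evecs[:, idx] * scale

delta = np.array([[0., 1., 2., 2.],
                  [1., 0., 1., 2.],
                  [2., 1., 0., 1.],
                  [2., 2., 1., 0.]])
embedding = classical_mds(delta, k=2)        # one 2-dimensional point per node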




