Data Mining: The Textbook



Yüklə 17,13 Mb.
səhifə331/423
tarix07.01.2024
ölçüsü17,13 Mb.
#211690
1   ...   327   328   329   330   331   332   333   334   ...   423
1-Data Mining tarjima

16.3. TRAJECTORY MINING

549



EP : 5 ⇒ {3,9,11}

Note that this is an unordered set, since it represents the individuals present in a particular (space, time) pair. A similar set can be constructed over all the (space, time) pairs that are populated with at least two individuals. This can be viewed as a vertical representation of the sequence database.


Any frequent pattern mining algorithm, discussed in Chap. 4, can be applied to the resulting database of sets. The frequent patterns correspond to the frequent sets of colocated individuals. These individuals are often likely to be socially related individuals.


16.3.4 Trajectory Clustering


In the following, a detailed discussion of the different kinds of trajectory clustering algo-rithms will be provided. Trajectory clustering algorithms are naturally related to trajectory pattern mining because of the close relationship between the two problems. Trajectory clustering methods are of two types.





  1. The first type of methods use conventional clustering algorithms, with the use of a distance function between trajectories. Once a distance function has been designed, many different kinds of algorithms, such as k-medoids or graph-based methods, may be used.




  1. The second type of methods use data transformation and discretization to convert trajectories into sequences of discrete symbols. Different types of transformations, such as segment extraction or grid-based discretization, may be applied to the trajectories. After the transformation, pattern mining algorithms are applied to the extracted sequence of symbols.

In addition, a number of other ad hoc methods have also been designed for trajectory clustering. This section will focus only on the systematic techniques. The bibliographic notes contain pointers to the ad hoc methods.


16.3.4.1 Computing Similarity Between Trajectories


A key aspect of trajectory clustering is the ability to compute similarity between different trajectories. At first sight, similarity function computation seems to be a daunting task because of the spatial and temporal aspects of trajectory analysis. However, in practice, similarity computation between trajectories is not very different from that of time series data. As discussed at the beginning of Sect. 16.3, trajectory data is equivalent to multivariate time series data. Several dynamic programming algorithms are discussed in Chap. 3 for similarity computation in univariate time series data. These algorithms can be generalized to the multivariate case. In the following, the extension of the dynamic time warping algorithm to the multivariate case will be discussed. A similar approach can be used for other dynamic programming methods. The reader is advised to revisit Sect. 3.4.1.3 of Chap. 3 on dynamic time warping before reading further.


First, the discussion on univariate time series distances will be revisited briefly. Let DT W (i, j) be the optimal distance between the first i and first j elements of two univariate time series X = (x1 . . . xm) and Y = (y1 . . . yn) respectively. Then, the value of DT W (i, j)





550


CHAPTER 16.


MINING SPATIAL DATA



is defined recursively as follows:





DT W (i, j − 1)
repeat xi






Yüklə 17,13 Mb.

Dostları ilə paylaş:
1   ...   327   328   329   330   331   332   333   334   ...   423




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin