Data Mining: The Textbook




The tile-based simplification for pattern mining from trajectories was proposed in [375]. Pattern mining in trajectory data is closely related to clustering. The problem of mining periodic behaviors from trajectories is addressed in [352]. Moving object clusters have been studied as Swarms [351], Flocks [86], and Convoys [290]. Among these, Swarms provide the most relaxed definition, in which noisy gaps are allowed. During these noisy gap periods, objects from the same cluster may not move together. An algorithm for maintaining real-time communities from trajectories in social sensing applications was proposed in [429]. A method for partitioning longer trajectories into smaller segments for shape-based clustering was proposed in [338]. Anomaly monitoring from trajectories of moving object streams was studied in [117]. The Top-Eye method, an algorithm for monitoring the top-k anomalies in moving object trajectories, was proposed in [226]. The TRAOD algorithm, which discovers shape-based trajectory outliers, was proposed in [337]. A method that uses region-based and trajectory-based clustering for classification was proposed in [339].


16.6 Exercises





  1. Discuss how to generalize the spatial wavelets to the case where there are n contextual attributes.

  2. Implement the algorithm to construct a multidimensional representation from spatial data, with the use of wavelets.

  3. Describe a method for converting shapes to a multidimensional representation.

  4. Implement the algorithm for converting shapes to time series data.

  5. Suppose that you had N different snapshots of sea surface temperature over successive instants in time over a spatial grid. You want to identify contiguous regions over which significant change has occurred between successive time instants. Describe an approach to identify such regions and time instants with the use of spatial wavelets.

  6. Suppose the snapshots of Exercise 5 were not from successive instants in time. How would you identify spatial snapshots that were very different from the other snapshots with the use of spatial wavelets? How would you identify specific regions that are very different from the remaining data?

  7. Suppose that you used the tile-based approach for finding frequent trajectory patterns. Discuss how the different constraint-based variants of sequential pattern mining map onto different constraint-based variants of sequential trajectory patterns.

  8. Propose a snapshot-based clustering approach for converting trajectories to symbolic sequences. Discuss the advantages and disadvantages with respect to the tile-based approach.

  9. Implement the different variations for converting trajectories to symbolic sequences with the use of the tile-based technique for frequent trajectory pattern mining.

  10. Discuss how to use wavelets to perform different data mining tasks on trajectories.



Chapter 17


Mining Graph Data

“Structure is more important than content in the transmission of information.” – Abbie Hoffman


17.1 Introduction


Graphs are ubiquitous in a wide variety of application domains, such as bioinformatics, chemical, semi-structured, and biological data. Many important properties of graphs in these domains can be related to their structure. Graph mining algorithms can therefore be leveraged to analyze various domain-specific properties of graphs. Most graphs encountered in real applications are of one of two types:





  1. In applications such as chemical and biological data, a database of many small graphs is available. Each node is associated with a label that may or may not be unique to the node, depending on the application-specific scenario.

  2. In applications such as the Web and social networks, a single large graph is available. For example, the Web can be viewed as a very large graph, in which nodes correspond to Web pages (labeled by their URLs) and edges correspond to hyperlinks between nodes.

The nature of the applications for these two types of data is quite different. Web and social network applications will be addressed in Chaps. 18 and 19, respectively. This chapter will therefore focus on the first scenario, in which many small graphs are available. A graph database may be formally defined as follows.


Definition 17.1.1 (Graph Database) A graph database $D$ is a collection of $n$ different undirected graphs, $G_1 = (N_1, A_1), \ldots, G_n = (N_n, A_n)$, such that the set of nodes in the $i$th graph is denoted by $N_i$, and the set of edges in the $i$th graph is denoted by $A_i$. Each node $p \in N_i$ is associated with a label denoted by $l(p)$.
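As an illustration of Definition 17.1.1, the following is a minimal Python sketch (Python 3.9 or later) of one possible in-memory representation of a graph database. The names `LabeledGraph`, `make_graph`, and `database`, the use of integer node identifiers, and the two example molecules are assumptions made for this sketch and do not come from the text. Each graph stores its node labels $l(p)$ in a dictionary and its undirected edges as unordered pairs.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledGraph:
    """One undirected graph G_i = (N_i, A_i) with a label l(p) on every node p."""

    # Hypothetical layout: node id p -> label l(p); the keys form the node set N_i.
    labels: dict[int, str] = field(default_factory=dict)
    # Edge set A_i, stored as unordered pairs so that (p, q) and (q, p) coincide.
    edges: set[frozenset[int]] = field(default_factory=set)

    def add_node(self, node: int, label: str) -> None:
        self.labels[node] = label

    def add_edge(self, p: int, q: int) -> None:
        if p not in self.labels or q not in self.labels:
            raise KeyError("both endpoints must already be nodes of the graph")
        if p == q:
            raise ValueError("self-loops are left out of this sketch")
        self.edges.add(frozenset((p, q)))

    def has_edge(self, p: int, q: int) -> bool:
        return frozenset((p, q)) in self.edges


def make_graph(labels: list[str], edges: list[tuple[int, int]]) -> LabeledGraph:
    """Build a graph whose nodes 0..len(labels)-1 carry the given labels."""
    graph = LabeledGraph()
    for node, label in enumerate(labels):
        graph.add_node(node, label)
    for p, q in edges:
        graph.add_edge(p, q)
    return graph


# Illustrative data only: two small molecules with element symbols as labels.
water = make_graph(["O", "H", "H"], [(0, 1), (0, 2)])
methane = make_graph(["C", "H", "H", "H", "H"], [(0, 1), (0, 2), (0, 3), (0, 4)])

# The graph database D is simply the collection of the n graphs.
database = [water, methane]
```

In this sketch the label "H" occurs on two different nodes of `water`, so labels need not be unique within a graph. Because edges are stored as unordered pairs, `water.has_edge(1, 0)` and `water.has_edge(0, 1)` give the same answer.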


The labels associated with the nodes may be repeated within a single graph. For example, when each graph $G_i$ corresponds to a chemical compound, the label of the node is the






C. C. Aggarwal, Data Mining: The Textbook, DOI 10.1007/978-3-319-14142-8_17

© Springer International Publishing Switzerland 2015


