Data Mining: The Textbook




CHAPTER 5. ASSOCIATION PATTERN MINING: ADVANCED CONCEPTS

of frequent pattern mining methods for graph applications, such as software bug analysis, and chemical and biological data, is provided in Aggarwal and Wang [26].


5.7 Exercises





  1. Consider the transaction database in the table below:

       tid   items
       ---   ----------
       1     a, b, c, d
       2     b, c, e, f
       3     a, d, e, f
       4     a, e, f
       5     b, d, f


Determine all maximal patterns in this transaction database at support levels of 2, 3, and 4.





  2. Write a program to determine the set of maximal patterns from a set of frequent patterns.
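
One possible starting point for this exercise is sketched below. It assumes the frequent patterns are supplied as collections of item labels; the function name `maximal_patterns` and the toy input are illustrative and are not taken from the text. The sketch relies only on the definition: a frequent pattern is maximal if no proper superset of it is frequent.

```python
from typing import FrozenSet, Iterable, List, Set


def maximal_patterns(frequent: Iterable[Iterable[str]]) -> List[FrozenSet[str]]:
    """Return the frequent itemsets that have no frequent proper superset."""
    patterns: Set[FrozenSet[str]] = {frozenset(p) for p in frequent}
    maximal: List[FrozenSet[str]] = []
    # Visit larger itemsets first. Any frequent proper superset of a candidate
    # is contained in (or equal to) some maximal itemset that is larger than
    # the candidate, so that maximal itemset has already been kept.
    for candidate in sorted(patterns, key=len, reverse=True):
        if not any(candidate < kept for kept in maximal):
            maximal.append(candidate)
    return maximal


# Hypothetical input: {a}, {b}, {c}, {a, b}, {a, c} are assumed to be frequent.
toy_frequent = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "c"}]
toy_maximal = maximal_patterns(toy_frequent)
```

On the toy input, only {a, b} and {a, c} survive, because every singleton is contained in one of them. Comparing each candidate against the kept maximal itemsets only, rather than against all frequent patterns, keeps the number of subset tests small when few patterns are maximal.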




  3. For the transaction database of Exercise 1, determine all the closed patterns at support levels of 2, 3, and 4.




  4. Write a computer program to determine the set of closed frequent patterns from a set of frequent patterns.
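
A minimal sketch for this exercise follows. It assumes the input is a mapping from each frequent itemset to its support count; the function name `closed_patterns` and the toy supports are hypothetical. A pattern is closed if no proper superset has the same support. Any superset with equal support is itself frequent, so it is enough to search among the frequent patterns.

```python
from typing import Dict, FrozenSet, List


def closed_patterns(support: Dict[FrozenSet[str], int]) -> List[FrozenSet[str]]:
    """Return the itemsets that have no proper superset with equal support."""
    closed: List[FrozenSet[str]] = []
    for itemset, count in support.items():
        # Look for a frequent proper superset with exactly the same support.
        has_equal_superset = any(
            itemset < other and count == other_count
            for other, other_count in support.items()
        )
        if not has_equal_superset:
            closed.append(itemset)
    return closed


# Hypothetical supports: {b} occurs only together with {a}, so {b} is not closed.
toy_support = {
    frozenset({"a"}): 3,
    frozenset({"b"}): 2,
    frozenset({"a", "b"}): 2,
}
toy_closed = closed_patterns(toy_support)
```

The pairwise comparison is quadratic in the number of frequent patterns, which is acceptable for exercise-sized inputs. Grouping patterns by support before comparing would reduce the work on larger inputs.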




  5. Consider the transaction database in the table below:

       tid   items
       ---   -------------
       1     a, c, d, e
       2     a, d, e, f
       3     b, c, d, e, f
       4     b, d, e, f
       5     b, e, f
       6     c, d, e
       7     c, e, f
       8     d, e, f


Determine all frequent maximal and closed patterns at support levels of 3, 4, and 5.





  6. Write a computer program to implement the greedy algorithm for finding a representative itemset from a group of itemsets.




  7. Write a computer program to implement an inverted index on a set of market baskets. Implement a query to retrieve all itemsets containing a particular set of items.
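
The following is a sketch of one way to approach this exercise, not a reference solution. It assumes baskets are Python sets keyed by transaction identifier; the names `build_inverted_index` and `containing_all` are illustrative. The index maps each item to the identifiers of the baskets that contain it, and a query intersects the lists of the requested items. The sample data are the five transactions from the table in Exercise 1.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_inverted_index(baskets: Dict[int, Set[str]]) -> Dict[str, Set[int]]:
    """Map each item to the set of transaction ids whose basket contains it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for tid, basket in baskets.items():
        for item in basket:
            index[item].add(tid)
    return index


def containing_all(index: Dict[str, Set[int]], items: Iterable[str]) -> Set[int]:
    """Return the ids of the baskets that contain every item in `items`."""
    postings = [index.get(item, set()) for item in items]
    if not postings:
        raise ValueError("query must contain at least one item")
    # Intersect starting from the shortest list to keep intermediate sets small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


# Transactions copied from the table in Exercise 1.
baskets = {
    1: {"a", "b", "c", "d"},
    2: {"b", "c", "e", "f"},
    3: {"a", "d", "e", "f"},
    4: {"a", "e", "f"},
    5: {"b", "d", "f"},
}
index = build_inverted_index(baskets)
hits = containing_all(index, {"e", "f"})  # transactions 2, 3, and 4
```

An item that never occurs has an empty list, so any query that includes it returns the empty set without touching the other lists' contents beyond the intersection.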




  8. Write a computer program to implement a signature table on a set of market baskets. Implement a query to retrieve the closest market basket to a target basket on the basis of the cosine similarity.
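
The sketch below is not a signature table. It is a brute-force baseline that scans every basket, which may be useful for checking the answers returned by a signature-table implementation. It assumes baskets are treated as binary vectors, so the cosine similarity of baskets A and B is |A ∩ B| / sqrt(|A| · |B|). The function names and the target basket are hypothetical; the data are the transactions from Exercise 1.

```python
import math
from typing import Dict, Set


def cosine_similarity(a: Set[str], b: Set[str]) -> float:
    """Cosine similarity of two baskets viewed as binary item vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def closest_basket(baskets: Dict[int, Set[str]], target: Set[str]) -> int:
    """Return the id of the basket most similar to `target` (exhaustive scan)."""
    if not baskets:
        raise ValueError("no baskets to search")
    return max(baskets, key=lambda tid: cosine_similarity(baskets[tid], target))


# Transactions copied from the table in Exercise 1.
baskets = {
    1: {"a", "b", "c", "d"},
    2: {"b", "c", "e", "f"},
    3: {"a", "d", "e", "f"},
    4: {"a", "e", "f"},
    5: {"b", "d", "f"},
}
best = closest_basket(baskets, {"a", "e"})
```

For the target {a, e}, transaction 4 shares both items and has only three items in total, giving a similarity of 2 / sqrt(6), which is higher than that of any other basket.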

Chapter 6


Cluster Analysis

“In order to be an immaculate member of a flock of sheep, one must, above all, be a sheep oneself.” —Albert Einstein


6.1 Introduction


Many applications require the partitioning of data points into intuitively similar groups. The partitioning of a large number of data points into a smaller number of groups helps greatly in summarizing the data and understanding it for a variety of data mining applications. An informal and intuitive definition of clustering is as follows:




Given a set of data points, partition them into groups containing very similar data points.

This is a very rough and intuitive definition because it does not state much about the different ways in which the problem can be formulated, such as the number of groups, or the objective criteria for similarity. Nevertheless, this simple description serves as the basis for a number of models that are specifically tailored for different applications. Some examples of such applications are as follows:





  • Data summarization: At the broadest level, the clustering problem may be considered as a form of data summarization. As data mining is all about extracting summary information (or concise insights) from data, the clustering process is often the first step in many data mining algorithms. In fact, many applications use the summarization property of cluster analysis in one form or another.




  • Customer segmentation: It is often desired to analyze the common behaviors of groups of similar customers. This is achieved by customer segmentation. An example of an application of customer segmentation is collaborative filtering, in which the stated or derived preferences of a similar group of customers are used to make product recommendations within the group.


C. C. Aggarwal, Data Mining: The Textbook, DOI 10.1007/978-3-319-14142-8_6


© Springer International Publishing Switzerland 2015


