Data Mining: The Textbook



Date: 07.01.2024

transaction, given that the transaction contains X. Therefore, the confidence conf(X ⇒ Y) is defined as follows:

$$\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{sup}(X \cup Y)}{\mathrm{sup}(X)}. \tag{4.2}$$

The itemsets X and Y are said to be the antecedent and the consequent of the rule, respectively. In the case of Table 4.1, the support of {Eggs, Milk} is 0.6, whereas the support of {Eggs, Milk, Yogurt} is 0.4. Therefore, the confidence of the rule {Eggs, Milk} ⇒ {Yogurt} is 0.4/0.6 = 2/3.
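The arithmetic above can be reproduced with a minimal Python sketch. The transaction list below is a hypothetical toy database, not the book's Table 4.1; it is chosen only so that the two supports come out to 0.6 and 0.4. The function names `support` and `confidence` are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical toy database (NOT Table 4.1 from the book), chosen so that
# sup({Eggs, Milk}) = 3/5 and sup({Eggs, Milk, Yogurt}) = 2/5.
transactions = [
    {"Eggs", "Milk", "Yogurt"},
    {"Eggs", "Milk", "Yogurt"},
    {"Eggs", "Milk"},
    {"Bread", "Butter"},
    {"Milk", "Yogurt"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(antecedent, consequent, transactions):
    """conf(X => Y) = sup(X u Y) / sup(X), following Equation 4.2.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    x = frozenset(antecedent)
    y = frozenset(consequent)
    return support(x | y, transactions) / support(x, transactions)


print(support({"Eggs", "Milk"}, transactions))                 # 3/5
print(support({"Eggs", "Milk", "Yogurt"}, transactions))       # 2/5
print(confidence({"Eggs", "Milk"}, {"Yogurt"}, transactions))  # 2/3
```

Exact fractions are used here only to keep the 2/3 result free of floating-point rounding; plain floats would work equally well for thresholding.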


As in the case of support, a minimum confidence threshold minconf can be used to generate the most relevant association rules. Association rules are defined using both support and confidence criteria.

Definition 4.3.2 (Association Rules) Let X and Y be two sets of items. Then, the rule X ⇒ Y is said to be an association rule at a minimum support of minsup and minimum confidence of minconf, if it satisfies both the following criteria:

  1. The support of the itemset X ∪ Y is at least minsup.

  2. The confidence of the rule X ⇒ Y is at least minconf.



The first criterion ensures that a sufficient number of transactions are relevant to the rule; therefore, it has the required critical mass for it to be considered relevant to the application at hand. The second criterion ensures that the rule has sufficient strength in terms of conditional probabilities. Thus, the two measures quantify different aspects of the association rule.
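The two criteria of Definition 4.3.2 can be checked directly against a transaction list. The following is a minimal sketch, not an efficient implementation; the function name `is_association_rule` and the toy transactions are hypothetical and are not taken from the book.

```python
def is_association_rule(x, y, transactions, minsup, minconf):
    """Check both criteria of Definition 4.3.2 for the rule X => Y."""
    x, y = frozenset(x), frozenset(y)
    n = len(transactions)
    # Number of transactions containing X u Y, and containing X alone.
    count_xy = sum(1 for t in transactions if (x | y) <= t)
    count_x = sum(1 for t in transactions if x <= t)
    if n == 0 or count_x == 0:
        return False  # confidence is undefined; treat the rule as rejected
    sup_xy = count_xy / n        # criterion 1: support of X u Y
    conf = count_xy / count_x    # criterion 2: sup(X u Y) / sup(X)
    return sup_xy >= minsup and conf >= minconf


# Hypothetical toy database (an assumption for illustration only).
toy = [
    {"Eggs", "Milk", "Yogurt"},
    {"Eggs", "Milk", "Yogurt"},
    {"Eggs", "Milk"},
    {"Bread", "Butter"},
    {"Milk", "Yogurt"},
]

# Support is 0.4 and confidence is 2/3 for {Eggs, Milk} => {Yogurt}.
print(is_association_rule({"Eggs", "Milk"}, {"Yogurt"}, toy, 0.3, 0.6))  # True
print(is_association_rule({"Eggs", "Milk"}, {"Yogurt"}, toy, 0.5, 0.6))  # False
```

The second call fails only the support criterion, which illustrates that a rule can be strong in the conditional-probability sense and still be rejected for lacking critical mass.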

The overall framework for association rule generation uses two phases. These phases correspond to the two criteria in Definition 4.3.2, representing the support and confidence constraints.





  1. In the first phase, all the frequent itemsets are generated at the minimum support of minsup.

  2. In the second phase, the association rules are generated from the frequent itemsets at the minimum confidence level of minconf.

The first phase is more computationally intensive and is, therefore, the more interesting part of the process. The second phase is relatively straightforward. Therefore, the discussion of the first phase will be deferred to the remaining portion of this chapter, and a quick discussion of the (more straightforward) second phase is provided here.


Assume that a set of frequent itemsets F is provided. For each itemset I ∈ F, a simple way of generating the rules would be to partition the set I into all possible combinations of sets X and Y = I − X, such that I = X ∪ Y. The confidence of each rule X ⇒ Y can then be determined, and it can be retained if it satisfies the minimum confidence requirement. Association rules also satisfy a confidence monotonicity property.
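The simple partition-based procedure described above can be sketched as follows. This is an illustrative brute-force version under stated assumptions: the name `generate_rules` is hypothetical, and the `support` table is assumed to already hold the support of every frequent itemset, including all nonempty subsets of each itemset in F (which holds because every subset of a frequent itemset is itself frequent). The support values below are made-up toy numbers, not data from the book.

```python
from fractions import Fraction
from itertools import combinations


def generate_rules(frequent_itemsets, support, minconf):
    """Enumerate every split I = X u Y with X, Y nonempty and disjoint,
    and keep the rules X => Y whose confidence is at least `minconf`.

    `support` maps frozenset -> support and is assumed to contain an entry
    for every nonempty subset of every itemset in `frequent_itemsets`.
    """
    rules = []
    for itemset in frequent_itemsets:
        itemset = frozenset(itemset)
        items = sorted(itemset)
        # Antecedent sizes 1 .. |I| - 1, so that Y = I - X is never empty.
        for size in range(1, len(items)):
            for antecedent in combinations(items, size):
                x = frozenset(antecedent)
                y = itemset - x
                conf = support[itemset] / support[x]
                if conf >= minconf:
                    rules.append((x, y, conf))
    return rules


# Hypothetical support table (toy values chosen for illustration).
support = {
    frozenset({"Eggs"}): Fraction(3, 5),
    frozenset({"Milk"}): Fraction(4, 5),
    frozenset({"Yogurt"}): Fraction(3, 5),
    frozenset({"Eggs", "Milk"}): Fraction(3, 5),
    frozenset({"Eggs", "Yogurt"}): Fraction(2, 5),
    frozenset({"Milk", "Yogurt"}): Fraction(3, 5),
    frozenset({"Eggs", "Milk", "Yogurt"}): Fraction(2, 5),
}

rules = generate_rules(
    [frozenset({"Eggs", "Milk", "Yogurt"})], support, Fraction(2, 3)
)
for x, y, conf in rules:
    print(sorted(x), "=>", sorted(y), "confidence", conf)
```

An itemset with k items yields 2^k − 2 candidate rules, so this enumeration is practical only because frequent itemsets tend to be short.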


Property 4.3.1 (Confidence Monotonicity) Let X1, X2, and I be itemsets such that X1 ⊂ X2 ⊂ I. Then the confidence of X2 ⇒ I − X2 is at least that of X1 ⇒ I − X1.
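
A short sketch of why the property holds, offered as an illustration and not as the book's own proof: every transaction that contains X2 also contains its subset X1, so sup(X2) cannot exceed sup(X1). Both rules have the same numerator sup(I), because X ∪ (I − X) = I whenever X ⊂ I.

```latex
X_1 \subset X_2 \;\Longrightarrow\; \mathrm{sup}(X_2) \le \mathrm{sup}(X_1)

\mathrm{conf}(X_2 \Rightarrow I - X_2)
  = \frac{\mathrm{sup}(I)}{\mathrm{sup}(X_2)}
  \;\ge\; \frac{\mathrm{sup}(I)}{\mathrm{sup}(X_1)}
  = \mathrm{conf}(X_1 \Rightarrow I - X_1)
```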





