Data Mining: The Textbook



Yüklə 17,13 Mb.
səhifə74/423
tarix07.01.2024
ölçüsü17,13 Mb.
#211690
1   ...   70   71   72   73   74   75   76   77   ...   423
1-Data Mining tarjima

Partition’s recursive tid list intersection approach. Eclat is a lattice-partitioned memory-optimization of the algo-






  • Strictly speaking, Monet is the name of the vertical database, on top of which this (unnamed) algorithm was built.

112 CHAPTER 4. ASSOCIATION PATTERN MINING

Algorithm VerticalApriori(Transactions: T , Minimum Support: minsup) begin




k = 1;
F1 = { All Frequent 1-itemsets };
Construct vertical tid lists of each frequent item; while Fk is not empty do begin
Generate Ck+1 by joining itemset-pairs in Fk;
Prune itemsets from Ck+1 that violate downward closure; Generate tid list of each candidate itemset in Ck+1 by intersecting
tid lists of the itemset-pair in Fk that was used to create it; Determine supports of itemsets in Ck+1 using lengths of their tid lists; Fk+1= Frequent itemsets of Ck+1 together with their tid lists; k = k + 1;

end;
return(k F );


i=1 i
Figure 4.7: The vertical Apriori algorithm of Savasere et al. [446]
rithm in Fig. 4.7. In Eclat [537], an independent Apriori-like breadth-first strategy is used on each of the sublattices of itemsets with a common prefix. These groups of itemsets are referred to as equivalence classes. Such an approach can reduce the memory requirements by partitioning the candidate space into groups that are processed independently in con-junction with the relevant vertical lists of their prefixes. This kind of candidate partitioning is similar to parallel versions of Apriori, such as the Candidate Distribution algorithm [54]. Instead of using the candidate partitioning to distribute various sublattices to different processors, the Eclat approach sequentially processes the sublattices one after another to reduce peak memory requirements. Therefore, Eclat can avoid the postprocessing overheads associated with Savasere et al.’s data partitioning approach, if the database is too large to be processed in core by
Yüklə 17,13 Mb.

Dostları ilə paylaş:
1   ...   70   71   72   73   74   75   76   77   ...   423




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin