Data Mining: The Textbook



Yüklə 17,13 Mb.
səhifə6/423
tarix07.01.2024
ölçüsü17,13 Mb.
#211690
1   2   3   4   5   6   7   8   9   ...   423
1-Data Mining tarjima

CONTENTS







xvii




13.4.1

Use in Dimensionality Reduction and Comparison with Latent










Semantic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . .

443




13.4.2

Use in Clustering and Comparison with Probabilistic










Clustering

...............................

445




13.4.3

Limitations of PLSA . . . . . . . . . . . . . . . . . . . . . . . . .

446

13.5

Specialized Classification Methods for Text . . . . . . . . . . . . . . . . . .

446




13.5.1

Instance-Based Classifiers . . . . . . . . . . . . . . . . . . . . . .

447







13.5.1.1 Leveraging Latent Semantic Analysis . . . . . . . . . .

447







13.5.1.2

Centroid-Based Classification . . . . . . . . . . . . . .

447







13.5.1.3

Rocchio Classification . . . . . . . . . . . . . . . . . . .

448




13.5.2

Bayes Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . .

448







13.5.2.1

Multinomial Bayes Model . . . . . . . . . . . . . . . . .

449




13.5.3

SVM Classifiers for High-Dimensional and Sparse Data . . . . . .

451

13.6

Novelty and First Story Detection . . . . . . . . . . . . . . . . . . . . . . .

453




13.6.1

Micro-clustering Method . . . . . . . . . . . . . . . . . . . . . . .

453

13.7

Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

454

13.8

Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

454

13.9

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

455

14 Mining Time Series Data

457

14.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

457

14.2

Time Series Preparation and Similarity . . . . . . . . . . . . . . . . . . . .

459




14.2.1

Handling Missing Values . . . . . . . . . . . . . . . . . . . . . . .

459




14.2.2

Noise Removal . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

460




14.2.3

Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

461




14.2.4

Data Transformation and Reduction . . . . . . . . . . . . . . . .

462







14.2.4.1

Discrete Wavelet Transform . . . . . . . . . . . . . . .

462







14.2.4.2

Discrete Fourier Transform . . . . . . . . . . . . . . . .

462







14.2.4.3 Symbolic Aggregate Approximation (SAX) . . . . . . .

464




14.2.5

Time Series Similarity Measures . . . . . . . . . . . . . . . . . . .

464

14.3

Time Series Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . .

464




14.3.1

Autoregressive Models . . . . . . . . . . . . . . . . . . . . . . . .

467




14.3.2

Autoregressive Moving Average Models . . . . . . . . . . . . . . .

468




14.3.3

Multivariate Forecasting with Hidden Variables . . . . . . . . . .

470

14.4

Time Series Motifs

...............................

472




14.4.1

Distance-Based Motifs . . . . . . . . . . . . . . . . . . . . . . . .

473




14.4.2

Transformation to Sequential Pattern Mining . . . . . . . . . . .

475




14.4.3

Periodic Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . .

476

14.5

Time Series Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

476




14.5.1

Online Clustering of Coevolving Series . . . . . . . . . . . . . . .

477




14.5.2

Shape-Based Clustering . . . . . . . . . . . . . . . . . . . . . . . .

479







14.5.2.1 k-Means . . . . . . . . . . . . . . . . . . . . . . . . . .

480







14.5.2.2 k-Medoids . . . . . . . . . . . . . . . . . . . . . . . . .

480







14.5.2.3

Hierarchical Methods . . . . . . . . . . . . . . . . . . .

481







14.5.2.4

Graph-Based Methods . . . . . . . . . . . . . . . . . .

481

14.6

Time Series Outlier Detection . . . . . . . . . . . . . . . . . . . . . . . . .

481




14.6.1

Point Outliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

482




14.6.2

Shape Outliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

483

14.7

Time Series Classification . . . . . . . . . . . . . . . . . . . . . . . . . . .

485




xviii










CONTENTS







14.7.1

Supervised Event Detection . . . . . . . . . . . . . . . . .

.... 485







14.7.2

Whole Series Classification . . . . . . . . . . . . . . . . . .

.... 488










14.7.2.1

Wavelet-Based Rules . . . . . . . . . . . . . . .

.... 488










14.7.2.2

Nearest Neighbor Classifier . . . . . . . . . . . .

.... 489










14.7.2.3

Graph-Based Methods . . . . . . . . . . . . . .

.... 489




14.8

Summary . . . .

............................

.... 489




14.9

Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . .

.... 490




14.10

Exercises . . . . .

............................

.... 490




15 Mining Discrete Sequences

493




15.1

Introduction . . .

............................

.... 493




15.2

Sequential Pattern Mining . . . . . . . . . . . . . . . . . . . . . . .

.... 494







15.2.1

Frequent Patterns to Frequent Sequences . . . . . . . . . .

.... 497







15.2.2

Constrained Sequential Pattern Mining . . . . . . . . . . .

.... 500




15.3

Sequence Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . .

.... 501







15.3.1

Distance-Based Methods . . . . . . . . . . . . . . . . . . .

.... 502







15.3.2

Graph-Based Methods . . . . . . . . . . . . . . . . . . . .

.... 502







15.3.3

Subsequence-Based Clustering . . . . . . . . . . . . . . . .

.... 503







15.3.4

Probabilistic Clustering . . . . . . . . . . . . . . . . . . . .

.... 504










15.3.4.1

Markovian Similarity-Based Algorithm: CLUSEQ . . . 504










15.3.4.2 Mixture of Hidden Markov Models . . . . . . .

.... 506




15.4

Outlier Detection in Sequences . . . . . . . . . . . . . . . . . . . .

.... 507







15.4.1

Position Outliers . . . . . . . . . . . . . . . . . . . . . . .

.... 508










15.4.1.1 Efficiency Issues: Probabilistic Suffix Trees . . .

.... 510







15.4.2

Combination Outliers . . . . . . . . . . . . . . . . . . . . .

.... 512










15.4.2.1

Distance-Based Models . . . . . . . . . . . . . .

.... 513










15.4.2.2

Frequency-Based Models . . . . . . . . . . . . .

.... 514




15.5

Hidden Markov Models . . . . . . . . . . . . . . . . . . . . . . . . .

.... 514







15.5.1

Formal Definition and Techniques for HMMs.

.... 517







15.5.2

Evaluation:...... .Computing...... the.. Fit.. .Probability...... for.. Observed......



















Sequence

............................

.... 518







15.5.3

Explanation: Determining the Most Likely State Sequence













for Observed Sequence . . . . . . . . . . . . . . . . . . . .

.... 519







15.5.4

Training: Baum–Welch Algorithm . . . . . . . . . . . . . .

.... 520







15.5.5

Applications . . . . . . . . . . . . . . . . . . . . . . . . . .

.... 521




15.6

Sequence Classification . . . . . . . . . . . . . . . . . . . . . . . . .

.... 521







15.6.1

Nearest Neighbor Classifier . . . . . . . . . . . . . . . . . .

.... 522







15.6.2

Graph-Based Methods . . . . . . . . . . . . . . . . . . . .

.... 522







15.6.3

Rule-Based Methods . . . . . . . . . . . . . . . . . . . . .

.... 523







15.6.4

Kernel Support Vector Machines . . . . . . . . . . . . . . .

.... 524










15.6.4.1

Bag-of-Words Kernel . . . . . . . . . . . . . . .

.... 524










15.6.4.2

Spectrum Kernel . . . . . . . . . . . . . . . . .

.... 524










15.6.4.3

Weighted Degree Kernel . . . . . . . . . . . . .

.... 525







15.6.5

Probabilistic Methods: Hidden Markov Models . . . . . . .

.... 525




15.7

Summary . . . .

............................

.... 526




15.8

Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . .

.... 527




15.9

Exercises . . . . .

............................

.... 528








Yüklə 17,13 Mb.

Dostları ilə paylaş:
1   2   3   4   5   6   7   8   9   ...   423




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin