χ² Measure, 123
ℓ-diversity, 682
k-anonymity, 670, 671
t-closeness, 684
AdaBoost, 381
Agglomerative Clustering, 167
Aggregate Change Points, 419
Almost Closed Sets, 139
AMS Sketch, 406
Approximate Frequent Patterns, 139
Apriori Algorithm, 100
AR Model, 467
ARIMA Model, 469
ARMA Model, 469
Association Pattern Mining, 15, 93
Association Rule Hiding, 688
Association Rules, 98
Associative Classifiers, 305
Authorities, 602
Autoregressive Integrated Moving Average Model, 469
Autoregressive Model, 467
Autoregressive Moving Average Model, 469
AVC-set, 351
Bag-of-Words Kernel, 524
Bagging, 379
Balaban Index, 573
Barabasi-Albert Model, 622
Baum-Welch Algorithm, 520
Bayes Classifier, 306
Bayes Optimal Privacy, 684
Bayes Reconstruction Method, 665
Bayes Text Classifier, 448
Behavioral Attributes, 10, 458, 532
Bernoulli Bayes Model, 309
Between-Class Scatter Matrix, 291
Betweenness Centrality, 626
Bias Term in SVMs, 314
Biased Sampling, 38
Big Data, 389
Binarization, 31
Binning of Time Series, 460
Biological Sequences, 493
BIRCH, 214
Bisecting K-Means, 173
Bloom Filter, 399
BOAT, 351
Boosting, 381
Bootstrap, 337
Bootstrapped Aggregating, 379
Bucket of Models, 383
Buckshot, 435
C4.5rules, 300
Candidate Distribution Algorithm, 112
Cascade, 655
Categorical Data Clustering, 206
CBA, 148, 305
Centrality, 623
Centroid Distance Signature, 533
Centroid-based Text Classification, 447
Chebychev Inequality, 394
Chernoff Bound (Lower-Tail), 395
Chernoff Bound (Upper-Tail), 396
Circuit Rank, 573
CLARA, 213
CLARANS, 213
Classification, 285
C. C. Aggarwal, Data Mining: The Textbook, DOI 10.1007/978-3-319-14142-8
© Springer International Publishing Switzerland 2015
Classification Based on Associations, 305
Classification of Time Series, 488
Classifier Evaluation, 334
Classifying Graphs, 582
Cleaning Data, 34
CLIQUE, 219
Closed Itemsets, 137
Closed Patterns, 137
Closeness Centrality, 624
CLUSEQ, 504
Cluster Digest for Text, 434
Cluster Validation, 195
Clustering, 153
Clustering Coefficient, 621
Clustering Data Streams, 411
Clustering Graphs, 579
Clustering Tendency, 154
Clustering Text, 434
Clustering Time Series, 476
Clusters and Outliers, 246
CluStream, 413
Co-clustering, 438
Co-clustering for Recommendations, 610
Co-location Patterns, 548
Co-Training, 363
Coefficient of Determination, 361, 468
Collaborative Filtering, 149, 234, 605
Collective Classification, 367, 641
Combination Outliers in Sequences, 508
Community Detection, 627
Compression-based Dissimilarity Measure, 513
Concept Drift, 22, 390
Condensation-based Anonymization, 680
Confidence, 97
Confidence Monotonicity, 98
Constrained Clustering, 225
Constrained Pattern Mining, 146
Constrained Sequential Patterns, 500
Content-based Recommendations, 605
Contextual Attributes, 10, 458, 532
CONTOUR, 504
Coordinate Descent, 355
Core of Joined Subgraphs, 578
Count-Min Sketch, 403
Cross-Validation, 336
CSketch, 417
CURE, 216
CVFDT, 423
INDEX
Cyclomatic Number, 573
Data Classification, 18, 285
Data Cleaning, 34
Data Clustering, 16, 153
Data Reduction, 37
Data Streams, 389
Data Type Portability, 30
Data Types, 6
Data-centered Ensembles, 278
DBSCAN, 181
Decision List, 300
Decision Trees, 293
Degree Centrality, 624
Degree Prestige, 624
DENCLUE, 184
Dendrogram, 168
Densification, 622
Density Attractors, 185
DepthProject Algorithm, 106
Differencing Time Series, 466
Diffusion Models, 655
Dijkstra Algorithm, 86
Dimensionality Curse in Privacy, 687
Dimensionality Reduction, 41
Discrete Cosine Transform, 464
Discrete Fourier Transform, 462
Discrete Sequence Similarity Measures, 82
Discretization, 30
Discriminative Classifier, 306
Distance-based Clustering, 159
Distance-based Entropy, 156
Distance-based Motifs, 473
Distance-based Outlier Detection, 248
Distance-based Sequence Clustering, 502
Distance-based Sequence Outliers, 513
Distributed Privacy, 689
Document Preparation, 431
Document-Term Matrix, 8
Domain Generalization Hierarchy, 670
Downward Closure Property, 96
DWT, 50
Dynamic Programming in HMM, 520
Dynamic Time Warping Distance, 79
Dynamics of Network Formation, 622
Early Termination Trick, 250
Earth Mover Distance, 685
Eckart-Young Theorem, 46
Eclat, 110
Edit Distance, 82, 513
Edit Distance in Graphs, 567
Eigenvector Centrality, 627
EM Algorithm for Continuous Data, 173, 244
EM Algorithm for Data Clustering, 175
Embedded Models, 292
Energy of a Data Set, 46
Ensemble Classification, 373
Ensemble Clustering, 231
Ensemble-based Streaming Classification, 424
Entropy, 156, 289
Entropy ℓ-diversity, 683
Enumeration Tree, 103
Equivalence Class in Privacy, 671
Error Tree of Wavelet Representation, 52
Estrada Index, 572
Euclidean Metric, 64
Event Detection, 485
Evolutionary Outlier Algorithms, 271
Example Re-weighting, 348
Expected Error Reduction, 372
Expected Model Change, 371
Expected Variance Reduction, 373
Explaining Sequence Anomalies, 519
Exponential Smoothing, 461
Extreme Value Analysis, 239
Feature Bagging, 274
Feature Selection, 40
Feature Selection for Classification, 287
Feature Selection for Clustering, 154
Filter Models, 155, 288
Finite State Automaton, 509
First Story Detection, 418, 453
Fisher Score, 290
Fisher’s Linear Discriminant, 290
Flajolet-Martin Algorithm, 408
FOIL’s Information Gain, 304
Forward Algorithm, 519
Forward-backward Algorithm, 520
Fowlkes-Mallows Measure, 201
Fractionation, 435
Frequency-based Sequence Outliers, 514
Frequent Itemset, 93
Frequent Pattern Mining, 15, 93
Frequent Pattern Mining in Streams, 409
Frequent Substructure Mining, 575
Frequent Trajectory Paths, 546
Frequent Traversal Patterns, 615
Full-Domain Generalization, 673
Generalization in Privacy, 670
Generalization Property, 675
Generalized Linear Models, 357
Generative Classifier, 306
Geodesic Distances, 71
Gini Index, 288
Girvan-Newman Algorithm, 631
GLM, 357
Global Recoding, 672
Global Statistical Similarity, 74
Goodall Measure, 75
Graph Classification, 582
Graph Clustering, 579
Graph Database, 557
Graph Distances and Matching, 565
Graph Edit Distance, 567
Graph Isomorphism, 559
Graph Kernels, 573
Graph Matching, 559
Graph Similarity Measures, 85
Graph-based Algorithms, 187
Graph-based Collaborative Filtering, 608
Graph-based Methods, 522
Graph-based Semisupervised Learning, 367
Graph-based Sequence Clustering, 502
Graph-based Spatial Neighborhood, 541
Graph-based Spatial Outliers, 542
Graph-based Time-Series Clustering, 481
Gregariousness in Social Networks, 624
Grid-based Outliers, 255
Grid-based Projected Outliers, 270
GSP Algorithm, 495
Haar Wavelets, 50
Heavy Hitters, 405
Hidden Markov Model Clustering, 506
Hidden Markov Models, 514
Hierarchical Clustering Algorithms, 166
High Dimensional Privacy, 687
Hinge Loss, 319
Histogram-based Outliers, 255
HITS, 602
HMETIS, 232
HMM, 514
HMM Applications, 521
Hoeffding Inequality, 397
Hoeffding Trees, 421
Holdout, 336
Homophily, 58, 621
Hopkin’s Statistic, 157
Hosoya Index, 572
HOTSAX, 483
Hubs, 602
Hybrid Feature Selection, 159
Imputation, 49
Incognito, 675
Incognito Super-roots, 678
Inconsistent Data, 36
Independent Cascade Model, 656
Independent Ensembles, 276
Inductive Classifiers, 362
Influence Analysis, 655
Information Gain, 289
Information Theoretic Measures, 513
Instance-based Learning, 331
Instance-based Text Classification, 447
Interest Ratio, 124
Internal Validation Criteria, 196
Intrinsic Dimensionality, 41
Inverse Document Frequency, 74
Inverse Occurrence Frequency, 74
Inverted Index, 143
ISOMAP, 57, 71
Item-based Recommendations, 608
Itemset, 94
Iterative Classification Algorithm, 641
Jaccard Coefficient, 76, 432
Jaccard for Multiway Similarity, 125
K-Means, 162, 480
K-Medians, 164
K-Medoids, 164, 480, 579
K-Modes, 208
Katz Centrality, 653
Kernel Density Estimation, 256
Kernel Fisher’s Discriminant, 360
Kernel K-Means, 163, 325
Kernel Logistic Regression, 360
Kernel PCA, 44, 325
Kernel Ridge Regression, 359
Kernel SVM, 323, 524, 585
Kernel Trick, 323, 359
Kernels in Graphs, 573
Kernighan-Lin Algorithm, 629
Keyword-based Sequence Similarity, 502
Kruskal Stress, 56
Label Propagation Algorithm, 643
Lagrangian Optimization in NMF, 193
Large Itemset, 93
Lasso, 355
Latent Components of NMF, 192
Latent Components of SVD, 47
Latent Factor Models, 611
Latent Semantic Indexing, 447
Law Enforcement, 18
Lazy Learners, 331
Learn-One-Rule, 302
Leave-One-Out Bootstrap, 337
Leave-One-Out Cross-Validation, 336
Left Eigenvector, 600
Level-wise Algorithms, 100
Levenshtein Distance, 82
Lexicographic Tree, 103
Likelihood Ratio Statistic, 304
Linear Discriminant Analysis, 291
Linear Threshold Model, 656
Link Prediction, 650
Link Prediction for Recommendations, 608
Loadshedding, 390
Local Outlier Factor, 252
Local Recoding, 672
LOF, 252
Logistic Regression, 310, 358
Longest Common Subsequence, 84
Lookahead-based Pruning, 110
Lossy Counting Algorithm, 410
LSA, 47, 447
MA Model, 468
Macro-clustering, 413
Mahalanobis k-means, 163
Mahalanobis Distance, 70, 242
Manhattan Metric, 64
Margin, 314
Margin Constraints, 315
Markov Inequality, 394
Massive-Domain Stream Clustering, 417
Massive-Domain Streaming Classification, 425
Match-based Distance Measures in Graphs, 565
Maximal Frequent Itemsets, 96, 136
Maximum Common Subgraph, 561
Maximum Common Subgraph Problem, 564
Mean-Shift Clustering, 186
Mercer Kernel Map, 324
Mercer’s Theorem, 323
METIS, 634
Metric, 565
Micro-clustering, 413
Min-Max Scaling, 37
Minkowski Distance, 65
Missing Data, 35
Missing Time-Series Values, 459
Mixture Modeling, 173, 244
Model Selection, 383
Model-centered Ensembles, 277
Mondrian Algorithm, 678
Moore-Penrose Pseudoinverse, 49
Morgan Index, 572
Motif Discovery, 472
Moving Average Model, 468
Moving Average Smoothing, 460
Multiclass Learning, 346
Multidimensional Change Points, 419
Multidimensional Scaling, 55
Multidimensional Spatial Neighborhood, 541
Multidimensional Spatial Outliers, 542
Multilayer Neural Network, 328
Multinomial Bayes Model, 309, 448, 449
Multivariate Extreme Values, 242
Multivariate Time Series, 10, 458, 459
Multivariate Time-Series Forecasting, 470
Multiview Clustering, 231
Naive Bayes Classifier, 306
NCSA Common Log Format, 613
Near Duplicate Detection, 594
Nearest Neighbor Classifier, 522
Neighborhood-based Collaborative Filtering, 607
Network Data, 12
Neural Networks, 326
NMF, 191
Node-Induced Subgraph, 560
Noise Removal from Time Series, 460
Non-stationary Time Series, 465
Nonlinear Regression, 359
Nonlinear Support Vector Machines, 321
Nonnegative Matrix Factorization, 191
Normalization, 37
Normalization of Time Series, 461
Normalized Wavelet Basis, 52
Novelties in Text, 453
Oblivious Transfer Protocol, 690
One-Against-One Multiclass Learning, 347
One-Against-Rest Multiclass Learning, 347
Online Novelty Detection, 419
Online Time-Series Clustering, 477
ORCLUS, 222
Ordered Probit Regression, 359
Outlier Analysis, 17
Outlier Detection, 17
Outlier Ensembles, 274
Outlier Validity, 258
Output Privacy, 688
Overfitting, 287
PAA, 460
PageRank, 86, 592, 598
Partial Periodic Patterns, 476
Partition Algorithm, 110, 128
Partition-1, 111
PCA, 42
Perceptron, 326
Periodic Patterns, 476
Perturbation for Privacy, 664
Pessimistic Error Rate, 304
Piecewise Aggregate Approximation, 460
PLSA, 440
Point Outliers in Time Series, 482
Poisson Regression, 359
Polynomial Regression, 359
Pool-based Active Learning, 369
Position Outliers in Sequences, 507
Power-Iteration Method, 600
Power-Law Degree Distribution, 623
Predictive Attribute Dependence, 155
Preferential Attachment, 622
Preferential Crawlers, 591
Prestige, 623
Principal Component Analysis, 42
Principal Components Regression, 356
Privacy-Preserving Data Mining, 663
Privacy-Preserving Data Publishing, 667
Probabilistic Classifiers, 306
Probabilistic Clustering, 173
Probabilistic Latent Semantic Analysis, 440
Probabilistic Outlier Detection, 244
Probabilistic Suffix Trees, 510
Probabilistic Text Clustering, 436
Probit Regression, 359
PROCLUS, 220
Product Graph, 574
Profile Association Rules, 148
Projected Outliers, 270
Projection-based Reuse, 107
Projection-based Reuse of Support Counting, 107
Proximal Gradient Methods, 355
Proximity Models for Mixed Data, 75
Proximity Prestige, 624
PST, 510
Pyramidal Time Frame, 415
Query Auditing, 688
Query-by-Committee, 371
Querying Patterns, 141
QuickSI Algorithm, 564
RainForest, 351
Randic Index, 573
Random Forests, 380
Random Subspace Ensemble, 274
Random Subspace Sampling, 273
Random Walks, 86, 598
Random-Walk Kernels, 573
Randomization for Privacy, 664
Rank Prestige, 627
Ranking Algorithms, 597
Rare Class Learning, 347
Ratings Matrix, 604
Recommendations, 149
Recommender Systems, 604
Recursive (c, ℓ)-diversity, 683
Regression Modeling, 353
Regularization, 312, 355, 613
Regularization in Collective Classification, 647
Rendezvous Label Propagation, 646
Representative-based Clustering, 159
Representativeness-based Active Learning, 373
Reservoir Sampling, 39, 391
Response Variable, 353
Ridge Regression, 355
Right Eigenvector, 600
RIPPER, 300
Rocchio Classification, 448
ROCK, 209
Samarati’s Algorithm, 673
Sampling, 38
SAX, 32, 464
Scalable Classification, 350
Scalable Clustering, 212
Scalable Decision Trees, 351
Scale-Free Networks, 622
Scaling, 37
Scatter Gather Text Clustering, 434
Secure Multi-party Computation, 690
Secure Set Union Protocol, 690
Selective Sampling, 369
Self Training, 363
Semisupervised Bayes Classification, 364
Semisupervised Clustering, 224
Semisupervised Learning, 361
Sensor-Selection, 479
Sequence Classification, 521
Sequence Data, 10
Sequence Outlier Detection, 507
Sequential Covering Algorithms, 301
Sequential Ensembles, 275
Sequential Pattern Mining, 494
Shape Analysis, 533
Shape Clustering, 539
Shape Outliers, 543
Shape-based Time-Series Clustering, 479
Shared Nearest Neighbors, 73
Shingling, 594
Short Memory Property, 509
Shortest Path Kernels, 575
Shrinking Diameters, 623
Signature Table, 144
Similarity Computation with Mixed Data, 75
Simple Matching Coefficient, 513
Simple Redundancy, 143
SimRank, 86, 601
Singular Value Decomposition, 44
Small World Networks, 622
SMOTE, 350
Social Influence Analysis, 655
Soft SVM, 319
Spatial Co-location Patterns, 538
Spatial Data, 11
Spatial Data Mining, 531
Spatial Outliers, 540
Spatial Tile Transformation, 547
Spatial Wavelets, 537
Spatiotemporal Data, 12
Spectral Clustering, 637
Spectral Decomposition, 47
Spectral Methods in Collective Classification, 646
Spectrum Kernel, 524
Spider Traps, 593
Spiders, 591
SPIRIT, 472
Stacking, 384
Standardization, 37, 354, 462
Stationary Time Series, 465
Stop-word Removal, 431
STORM, 426
Stratified Cross-Validation, 336
Stratified Sampling, 39
STREAM Algorithm, 411
Streaming Classification, 421
Streaming Data, 389
Streaming Frequent Pattern Mining, 409
Streaming Novelty Detection, 419
Streaming Outlier Detection, 417
Streaming Privacy, 681
Streaming Synopsis, 391
Strict Redundancy, 143
String Data, 10
Subgraph Isomorphism, 560
Subgraph Matching, 560
Subsequence, 495
Subsequence-based Clustering, 503
Superset-based Pruning, 110
Supervised Feature Selection, 41
Supervised Micro-clusters for Classification, 424
Support, 95
Support Vector Machines, 313
Support Vectors, 314
Suppression in Privacy, 670
SVD, 44
SVM for Text, 451
SVMLight, 352
SVMPerf, 451
Symbolic Aggregate Approximation, 32, 464
Symmetric Confidence Measure, 124
Synopsis for Streams, 391
Synthetic Data for Anonymization, 680
Synthetic Over-sampling, 350
System Diagnosis, 493
Tag Trees, 433
TARZAN, 514
Temporal Similarity Measures, 77
Term Strength, 155
Text Classification, 446
Text Clustering, 434
Text SVM, 451
Tikhonov Regularization, 355
Time Series Similarity Measures, 77
Time Warping, 78
Time-Series Classification, 485
Time-Series Correlation Clustering, 477
Time-Series Data, 9
Time-Series Data Mining, 457
Time-Series Forecasting, 464
Time-Series Preparation, 459
Topic Modeling, 440
Topic-Sensitive PageRank, 601
Topological Descriptors, 571
Trajectory Classification, 553
Trajectory Clustering, 549
Trajectory Mining, 544
Trajectory Outlier Detection, 551
Trajectory Pattern Mining, 546
Transductive Classifiers, 362, 583
Transductive Support Vector Machines, 366
TreeProjection Algorithm, 106
Triadic Closure, 621
Ullman’s Isomorphism Algorithm, 562
Uncertainty Sampling, 370
Universal Crawlers, 591
Unsupervised Feature Selection, 40
User-based Recommendations, 607
Utility in Privacy, 664, 674, 687, 691
Utility Matrix, 604
Value Generalization Hierarchy, 670
Velocity Density Estimation, 419
Vertical Counting Methods, 110
VF2 Algorithm, 564
Viterbi Algorithm, 519
Ward’s Method, 171
Wavelet-based Rules, 523
Wavelets, 50
Web Crawling, 591
Web Document Processing, 433
Web Resource Discovery, 591
Web Server Logs, 613
Web Usage Mining, 613
Weighted Degree Kernel, 525
Wiener Index, 572
Within-Class Scatter Matrix, 291
Wrapper Models, 158, 292
XProj, 581
XRules, 584
Z-Index, 572