Data Mining: The Textbook




  • Sensor data: Sensor data is often collected by a wide variety of hardware and other monitoring devices. Typically, this data contains continuous readings about the underlying data objects. For example, environmental data is commonly collected with different kinds of sensors that measure temperature, pressure, humidity, and so on. Sensor data is the most common form of time series data.




  • Medical devices: Many medical devices, such as electrocardiogram (ECG) and electroencephalogram (EEG), produce continuous streams of time series data. These represent measurements of the functioning of the human body, such as the heartbeat, pulse rate, blood pressure, etc. Real-time data is also collected from patients in intensive care units (ICU) to monitor their condition.




  • Financial market data: Financial data, such as stock prices, is often temporal. Other forms of temporal data include commodity prices, industrial trends, and economic indicators.

C. C. Aggarwal, Data Mining: The Textbook, Chapter 14: Mining Time Series Data, DOI 10.1007/978-3-319-14142-8_14. © Springer International Publishing Switzerland 2015

In general, temporal data may be either discrete or continuous. For example, Web log data contains a series of discrete events corresponding to user clicks, whereas environmental data may contain a series of continuous values such as temperature. Continuous temporal data sets are referred to as time series, whereas discrete temporal data sets are referred to as sequences. This chapter focuses on continuous time series data. The next chapter studies data mining methods for discrete sequence data. While time series and discrete sequence data are conceptually similar, there are significant differences in the algorithmic methodologies used in each domain. However, in many cases, time series data is converted to discrete sequence data through discretization to facilitate the application of rich classes of sequence mining techniques. This chapter also discusses such cases.
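
As a rough illustration of converting a time series into a discrete sequence, the following sketch maps each continuous value to a symbol using equal-width bins. Equal-width binning is only one possible discretization, chosen here for simplicity. The function name `discretize_equal_width`, the four-letter alphabet, and the temperature readings are assumptions made for this example, not taken from the text.

```python
from typing import List


def discretize_equal_width(series: List[float], alphabet: str = "abcd") -> str:
    """Map each value to a symbol by splitting [min, max] into equal-width bins."""
    if not series:
        return ""
    lo, hi = min(series), max(series)
    k = len(alphabet)
    if hi == lo:
        # A constant series falls entirely into the first bin.
        return alphabet[0] * len(series)
    width = (hi - lo) / k
    symbols = []
    for value in series:
        # The maximum value would land in bin k, so clamp it into the last bin.
        index = min(int((value - lo) / width), k - 1)
        symbols.append(alphabet[index])
    return "".join(symbols)


# Hypothetical temperature readings at consecutive ticks.
temperatures = [20.0, 20.5, 21.0, 24.0, 27.5, 28.0, 23.0, 20.0]
sequence = discretize_equal_width(temperatures)
print(sequence)  # aaacddba
```

The resulting string is a discrete sequence, so in principle it can be passed to methods designed for sequence data. The number of bins controls how much detail of the original series is retained.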


Unlike multidimensional data, in which all attributes are treated equally, time series data are viewed as contextual data representations. In contextual data representations, the attributes are of two types:





  • Contextual attribute(s): These represent the attributes that provide the context in which the measurements are made. In other words, the contextual attributes provide the reference points at which the behavioral values are measured. For the case of time series data, the single contextual attribute corresponds to the time dimension. Some data types, such as spatial data, may contain multiple contextual attributes corresponding to spatial coordinates. The time stamps could correspond to actual time values at which the data points are measured, or they could correspond to consecutive indices (or ticks) at which these values are measured.




  • Behavioral attribute(s): These represent the behavioral values at the reference points. For example, in an environmental sensor, this could correspond to the temperature attribute. In general, each contextual attribute value (e.g., time stamp) has a corresponding behavioral attribute value (e.g., temperature). The behavioral attributes are usually the interesting ones from an application-specific perspective, but they cannot be properly interpreted without the knowledge of the contextual attributes. When more than one behavioral attribute is associated with each series, the corresponding series is referred to as a multivariate time series.
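
The split between the two attribute types can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the class name `TimeSeries`, its fields, and the sample readings are hypothetical and do not come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TimeSeries:
    # Contextual attribute: time stamps or consecutive ticks.
    timestamps: List[float]
    # Behavioral attributes: one list of values per attribute name.
    behavioral: Dict[str, List[float]]

    def __post_init__(self) -> None:
        # Every time stamp needs a corresponding value for each behavioral attribute.
        for name, values in self.behavioral.items():
            if len(values) != len(self.timestamps):
                raise ValueError(f"attribute {name!r} does not match the time stamps")

    @property
    def is_multivariate(self) -> bool:
        # More than one behavioral attribute makes the series multivariate.
        return len(self.behavioral) > 1

    def at(self, index: int) -> Dict[str, float]:
        """Return the behavioral values recorded at the given position."""
        return {name: values[index] for name, values in self.behavioral.items()}


# Hypothetical environmental sensor with two behavioral attributes.
readings = TimeSeries(
    timestamps=[0, 1, 2],
    behavioral={
        "temperature": [20.0, 20.5, 21.0],
        "humidity": [0.40, 0.42, 0.41],
    },
)
print(readings.is_multivariate)  # True
print(readings.at(1))            # {'temperature': 20.5, 'humidity': 0.42}
```

Keeping the time stamps separate from the measured values makes the dependence explicit: the behavioral values are looked up through positions defined by the contextual attribute.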

The analysis of contextual data types is more difficult because behavioral attribute values cannot be interpreted effectively without using the contextual attribute. For example, a sudden change of the behavioral attribute between successive time stamps (contextual attribute) is often indicative of outlier behavior. Thus, unlike multidimensional data, problem definitions depend on the interrelationships between contextual and behavioral attributes. Consequently, problems such as clustering, classification, and outlier detection need to be significantly modified to account for the impact of the contextual attribute. Several data types discussed in subsequent chapters fall within this class. Other examples include sequence data and spatial data.
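
The sudden-change example can be sketched in a few lines. This is a deliberately simple illustration and not a method from the text: the function name `sudden_changes`, the fixed `threshold`, and the sample readings are all assumptions.

```python
from typing import List


def sudden_changes(timestamps: List[int], values: List[float], threshold: float) -> List[int]:
    """Return the time stamps at which the value differs from the
    previous reading by more than `threshold`."""
    flagged = []
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) > threshold:
            flagged.append(timestamps[i])
    return flagged


# Hypothetical temperature readings at consecutive ticks.
ticks = [0, 1, 2, 3, 4, 5]
temps = [20.0, 20.2, 20.1, 26.0, 20.3, 20.4]
jumps = sudden_changes(ticks, temps, threshold=3.0)
print(jumps)  # [3, 4]
```

Both the jump up to 26.0 and the drop back are flagged. The check relies on the ordering given by the contextual attribute: the same values in a different temporal order would produce different flags.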


The greater complexity of time series data enables a larger number of problem definitions.


Most of the models can be categorized into one of two types:





