Table 1: Positive and Negative Expanders [8].
Shaping Factor | Positive Expanders | Negative Expanders
Familiarity | unfamiliar, layout, unfamiliarity, rely |
Physical Environment | cloud, snow, ice, wind |
Physical Factors | fatigue, tire, night, rest, hotel, awake, sleep, sick | declare, emergency, advisory, separation
Preoccupation | distract, preoccupied, awareness, situational, task, interrupt, focus, eye, configure, sleep | declare, ice, snow, crash, fire, rescue, anti, smoke
Pressure | bad, decision, extend, fuel, calculate, reserve, diversion, alternate |
with that set, it is labeled with the shaper associated with
the expanders. For example, if the bootstrapping algorithm
is labeling reports with the Preoccupation label using the
set of positive expanders in Table 1, and an unlabeled report
contains the words “awareness”, “task”, “eye”, and “smoke”,
it would label this report with the Preoccupation shaper
regardless of any negative expander.
Figure 8: The bootstrapping algorithm [8].
The bootstrapping algorithm (Figure 8) takes four inputs: a set of positively labeled training examples of a shaper, a set of negatively labeled training examples of that shaper, a set of unlabeled narratives, and the number of bootstrapping iterations (k). Positive examples of a shaper are narratives that contain words indicating that shaper; negative examples are narratives that contain words indicating that the shaper is not appropriate for them. To find more expanders, two empty sets are initialized, one for positive expanders and one for negative expanders. The algorithm then calls the ExpandTrainingSet function k times. In each iteration, if the set of positively labeled training examples is larger than the set of negatively labeled training examples, new positive expanders are found, and vice versa.
The second function of the algorithm, ExpandTrainingSet, finds four new expanders. Its inputs are the narrative sets of the positive and negative shaper examples, each in its respective variable (A or B, depending on the sizes of the narrative sets), the set of narratives not yet assigned a shaper, and a unigram feature set: the set of positive or negative expanders (again depending on the sizes of the positive and negative narrative sets). To find an expander, the function computes, for every word t, the log of the number of narratives in one set (P or N) that contain t divided by the number of narratives in the other set that contain t, and then takes the word with the maximum value. When the positively labeled training examples are being expanded, the positive narrative set is used in the numerator and the negative narrative set in the denominator, and vice versa when the negatively labeled training examples are being expanded. If the selected word is not already an expander, it is added to the set of expanders. After the four new expanders are found, all of the unlabeled narratives that contain more than 3 words from the relevant list of expanders are added to the relevant set of labeled training examples.
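The two functions described above can be sketched in Python as follows. This is a minimal illustration of the description in the text, not the implementation from [8]. Several details are assumptions made for the sketch: each narrative is represented as a set of words, candidate words are drawn only from the set being expanded, words that are already expanders are skipped when taking the maximum, ties are broken alphabetically, and add-one smoothing (`smoothing`) is used to avoid division by zero. The parameter names `n_new` and `threshold` are hypothetical.

```python
import math
from typing import List, Set, Tuple

Narrative = Set[str]  # assumption: a narrative is reduced to the set of words it contains


def doc_freq(word: str, narratives: List[Narrative]) -> int:
    """Number of narratives that contain `word`."""
    return sum(1 for n in narratives if word in n)


def expand_training_set(
    a: List[Narrative],
    b: List[Narrative],
    unlabeled: List[Narrative],
    expanders: Set[str],
    n_new: int = 4,          # four new expanders per call, as in the text
    threshold: int = 3,      # narratives need more than `threshold` expander words
    smoothing: float = 1.0,  # assumption: add-one smoothing avoids division by zero
) -> Tuple[List[Narrative], List[Narrative], Set[str]]:
    """Grow labeled set `a` using words frequent in `a` relative to `b`."""
    vocab = set().union(*a)
    for _ in range(n_new):
        candidates = sorted(vocab - expanders)  # sorted for deterministic tie-breaking
        if not candidates:
            break
        # Score: log of (narratives in a containing t) / (narratives in b containing t).
        best = max(
            candidates,
            key=lambda t: math.log(
                (doc_freq(t, a) + smoothing) / (doc_freq(t, b) + smoothing)
            ),
        )
        expanders = expanders | {best}
    # Move unlabeled narratives with enough expander words into the labeled set.
    matched = [n for n in unlabeled if len(n & expanders) > threshold]
    rest = [n for n in unlabeled if len(n & expanders) <= threshold]
    return a + matched, rest, expanders


def train(pos: List[Narrative], neg: List[Narrative],
          unlabeled: List[Narrative], k: int):
    """Run k bootstrapping iterations; returns (pos, neg, pos_expanders, neg_expanders)."""
    pos_exp: Set[str] = set()
    neg_exp: Set[str] = set()
    for _ in range(k):
        # Selection rule as stated in the text: expand the positive side when it is larger.
        if len(pos) > len(neg):
            pos, unlabeled, pos_exp = expand_training_set(pos, neg, unlabeled, pos_exp)
        else:
            neg, unlabeled, neg_exp = expand_training_set(neg, pos, unlabeled, neg_exp)
    return pos, neg, pos_exp, neg_exp
```

Calling `train(pos, neg, unlabeled, k)` with small lists of word sets returns the enlarged training sets together with the expanders found for each side.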
To use the semi-supervised bootstrapping algorithm, some
data must already be labeled. To initially label a test set of
incident reports, two graduate students not affiliated with
the research labeled 1,333 randomly selected reports from
the ASRS database.
The reports were then classified with LIBSVM, an existing software package, to provide a baseline against which the bootstrapping algorithm could be compared.
4. RESULTS
In this section we discuss the outcomes of the three meth-
ods compared to their respective baselines.
4.1 Multiple Kernel Learning
The data of concern in the set of 3500 flights were the points below 10,000 feet mean sea level (MSL). The flight data was passed through an algorithm that removed flights in which the sensors or sensor values were unreasonable, likely due to noise or other malfunctions. This left 2500 of the original 3500 flights for analysis. To find a training set for the algorithm from these 2500 flights, “an aggressive data quality filter was applied to the remaining flights”, which returned “approximately 500 flights” [2]. Of the 2500 flights, 227 were found to be anomalous by the MKAD method. Of these 227 anomalous flights, 19 were discrete, 94 were continuous, and 114 were heterogeneous (discrete and continuous). Table 2 shows the results of the multiple kernel research: the overlap between the anomalous flights found by this multiple kernel method and those found by the baselines. Based on this overlap, the authors of Multiple kernel learning for heterogeneous anomaly detection [2] suggest that the MKAD approach was able to detect anomalies indicated by both discrete and continuous data more effectively than the baseline methods.
Pagels: Aviation Data Mining
Published by University of Minnesota Morris Digital Well, 2015
Overlap of anomalous flights (with MKAD)
Algorithms | Discrete | Continuous | Heterogeneous
O | 21% | 59% | 34%
S | 53% | 0% | 54%
O & S | 58% | 59% | 67%
MKAD (number of flights) | 19 | 94 | 114
Table 2: Overlap between MKAD approach and baselines. The baselines are represented by O for Orca and S for SequenceMiner. The values of O & S are the union of their anomalous sets [2].
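An overlap figure of the kind reported in Table 2 could be computed as sketched below. This is only an illustration: the text does not state which set is used as the denominator, so the sketch assumes the overlap is measured against the flights flagged by MKAD. The function name and the flight identifiers are hypothetical.

```python
from typing import Set


def overlap_percent(mkad_flags: Set[int], baseline_flags: Set[int]) -> float:
    """Percentage of MKAD-flagged flights that the baseline also flagged.

    Assumption: the denominator is the MKAD set; the source does not say.
    """
    if not mkad_flags:
        return 0.0
    return 100.0 * len(mkad_flags & baseline_flags) / len(mkad_flags)


# Hypothetical flight identifiers, not data from the study.
mkad = {1, 2, 3, 4}
orca = {1, 2, 9}
sequence_miner = {2, 3}

orca_overlap = overlap_percent(mkad, orca)
sequence_miner_overlap = overlap_percent(mkad, sequence_miner)
# "O & S" in Table 2 is the union of the two baselines' anomalous sets.
combined_overlap = overlap_percent(mkad, orca | sequence_miner)
```

Because the combined row uses the union of the baseline sets, its overlap can never be lower than that of either baseline alone, which matches the pattern in Table 2.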
4.2 HMMs and HSMMs
Overall, of the scenarios listed in Section 3.2, the HSMM performed better than the HMM on scenarios 1 and 2 and performed similarly to the HMM on scenarios 3, 4, and 5. While the authors of the paper discussed possible methods to further improve anomaly detection using an HSMM, the result confirms the relevance and importance of an algorithm that takes the duration of states into account.
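One way to see why state duration matters: in a standard HMM, the time spent in a state implicitly follows a geometric distribution determined by the self-transition probability, whereas an HSMM can attach an explicit duration distribution to each state. The sketch below illustrates this difference only. The function names, the self-transition probability, and the duration table are made-up values for illustration and are not taken from the study.

```python
from typing import Dict


def hmm_duration_prob(self_transition: float, d: int) -> float:
    """P(stay in a state for exactly d steps) in an HMM: a^(d-1) * (1 - a)."""
    if d < 1:
        raise ValueError("duration must be at least 1")
    return self_transition ** (d - 1) * (1.0 - self_transition)


def hsmm_duration_prob(duration_table: Dict[int, float], d: int) -> float:
    """P(duration = d) in an HSMM state with an explicit duration table."""
    return duration_table.get(d, 0.0)


# Hypothetical state that should last 5 or 6 steps.
explicit_durations = {5: 0.5, 6: 0.5}

# The HMM always gives the shortest duration the highest probability,
# while the HSMM table can rule short durations out entirely.
hmm_short = hmm_duration_prob(0.8, 1)
hsmm_short = hsmm_duration_prob(explicit_durations, 1)
```

With these made-up values, a one-step visit to the state is the single most likely duration under the HMM but impossible under the HSMM, which is the kind of distinction an anomaly detector can exploit.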
4.3 Semi-Supervised Bootstrapping Algorithm
A sample of the expanders (words indicative of certain labels) found when the bootstrapping algorithm was run on the set of incident reports is shown in Table 1. The expanders give an idea of the effectiveness of the bootstrapping algorithm. According to a table in Semi-Supervised Cause Identification from Aviation Safety Reports [8], only 1.8% of the reports were annotated with the ‘Pressure’ shaper. Even though such a small percentage of the data set has pressure as a cause, it is easy to see how the positive expanders shown in Table 1 can indicate pressure as a cause leading to an incident.
The bootstrapping algorithm’s effectiveness was measured by F-measure, which combines precision and recall into a single score. Precision is the fraction of the reports assigned a shaper that were assigned it correctly; recall is the fraction of the reports that truly have a shaper that were labeled with it. The bootstrapping algorithm yielded “a relative error reduction of 6.3% in F-measure over a purely supervised baseline when applied to the minority classes” [8].
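The measures above can be computed as in the following sketch. It assumes the balanced F-measure (the harmonic mean of precision and recall), which is the most common variant, and it assumes that relative error reduction is computed on the error 1 - F. Neither choice is confirmed by the text. All function names and counts are illustrative and are not results from [8].

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reports assigned a shaper that truly have it."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of reports that truly have a shaper that were labeled with it."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p: float, r: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed variant)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def relative_error_reduction(f_baseline: float, f_new: float) -> float:
    """Share of the baseline's error (1 - F) removed by the new method (assumed definition)."""
    baseline_error = 1.0 - f_baseline
    new_error = 1.0 - f_new
    return (baseline_error - new_error) / baseline_error


# Hypothetical counts for one shaper: 8 true positives, 2 false positives, 8 false negatives.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=8)
f = f_measure(p, r)
```

Under these assumptions, improving a hypothetical baseline F-measure of 0.5 to 0.6 corresponds to `relative_error_reduction(0.5, 0.6)`, a 20% relative error reduction.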
5. CONCLUSION
Techniques in data mining show signs of improving the ability to detect anomalies in aviation data. With Multiple Kernel Learning, we are now able to detect heterogeneous anomalies in data, where before we were only able to find either discrete or continuous anomalies. We have learned that a Hidden Semi-Markov Model approach to detecting anomalies is favorable over a Hidden Markov Model approach, because Hidden Semi-Markov Models can model the probability of sequences in which the duration of states is significant. Lastly, we have looked at a new text classification approach to identifying causes in aviation incident reports, with an emphasis on minority causes. In this approach, the bootstrapping algorithm was used to find causes based on key words contained in the aviation reports. Some continuing problems which have yet to be addressed in the field of data mining of aviation data include:
• Overly generalized data in incident reports, which makes cause identification a difficult task
• Providing a simple way for these methods to be deployed
• Linking reports with other data (e.g., linking an incident report to aircraft maintenance records) [7].
6. ACKNOWLEDGEMENTS
Many thanks to Peter Dolan, Elena Machkasova, and An-
drew Latterner for their invaluable feedback.
7. REFERENCES
[1] 14 Code of Federal Regulations 121.344. 2011.
[2] S. Das, B. L. Matthews, A. N. Srivastava, and N. C.
Oza. Multiple kernel learning for heterogeneous
anomaly detection: algorithm and aviation safety case
study. In Proceedings of the 16th ACM SIGKDD
international conference on Knowledge discovery and
data mining, pages 47–56. ACM, 2010.
[3] J. W. Hunt and T. G. Szymanski. A fast algorithm for
computing longest common subsequences.
Communications of the ACM, 20(5):350–353, 1977.
[4] E. Kim. Everything you wanted to know about the
kernel trick (but were too afraid to ask).
http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html, 2013.
[5] J. Lin, E. Keogh, L. Wei, and S. Lonardi. Experiencing
SAX: a novel symbolic representation of time series. Data
Mining and Knowledge Discovery, 15(2):107–144, 2007.
[6] I. Melnyk, P. Yadav, M. Steinbach, J. Srivastava,
V. Kumar, and A. Banerjee. Detection of precursors to
aviation safety incidents due to human factors. In Data
Mining Workshops (ICDMW), 2013 IEEE 13th
International Conference on, pages 407–412. IEEE,
2013.
[7] Z. Nazeri, E. Bloedorn, and P. Ostwald. Experiences in
mining aviation safety data. In ACM SIGMOD Record,
volume 30, pages 562–566. ACM, 2001.
[8] I. Persing and V. Ng. Semi-supervised cause
identification from aviation safety reports. In
Proceedings of the Joint Conference of the 47th Annual
Meeting of the ACL and the 4th International Joint
Conference on Natural Language Processing of the
AFNLP: Volume 2, pages 843–851.
Association for Computational Linguistics, 2009.
Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal, Vol. 2 [2015], Iss. 1, Art. 3
http://digitalcommons.morris.umn.edu/horizons/vol2/iss1/3