Identifying Expressions of Emotion in Text
Saima Aman¹ and Stan Szpakowicz¹,²
¹ School of Information Technology and Engineering,
University of Ottawa, Ottawa, Canada
² Institute of Computer Science,
Polish Academy of Sciences, Warszawa, Poland
{saman071, szpak}@site.uottawa.ca
Abstract. Finding emotions in text is an area of research with wide-ranging applications. We describe an emotion annotation task of identifying emotion category, emotion intensity and the words/phrases that indicate emotion in text. We introduce the annotation scheme and present results of an annotation agreement study on a corpus of blog posts. The average inter-annotator agreement on labeling a sentence as emotion or non-emotion was 0.76. The agreement on emotion categories was in the range 0.6 to 0.79; for emotion indicators, it was 0.66. Preliminary results of emotion classification experiments show an accuracy of 73.89%, significantly above the baseline.
Analysis of sentiment in text can help determine the opinions and affective intent of writers, as well as their attitudes, evaluations and inclinations with respect to various topics. Previous work in sentiment analysis has been done on a variety of text genres, including product and movie reviews [9, 18], news stories, editorials and opinion articles [20], and more recently, blogs [7].
Work on sentiment analysis has typically focused on recognizing valence – positive or negative orientation. Among the less explored sentiment areas is the recognition of types of emotions and their strength or intensity.
In this work, we address the task of identifying expressions of emotion in text. Emotion research has recently attracted increased attention from the NLP community: it is one of the tasks at Semeval-2007¹, and a workshop on emotional corpora was held at LREC-2006².
We discuss the methodology and results of an emotion annotation task. Our goal is to investigate the expression of emotion in language through a corpus annotation study and to prepare (and place in the public domain) an annotated corpus for use in automatic emotion analysis experiments. We also explore computational techniques for emotion classification. In our experiments, we use a knowledge-based approach for automatically classifying emotional and non-emotional sentences. The results of the initial experiments show improved performance over the baseline accuracy.
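To make the notion of inter-annotator agreement concrete, the following minimal sketch computes Cohen's kappa for two annotators who label each sentence as emotion or non-emotion. It is an illustration under stated assumptions, not the procedure used in the study: this section does not name the agreement measure, so the choice of kappa is an assumption, and the function name `cohen_kappa`, the labels `"em"` and `"ne"`, and the ten toy annotations are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two annotators'
    # marginal proportions, summed over categories.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (invented for illustration): "em" = emotion, "ne" = non-emotion.
annotator_a = ["em", "em", "ne", "ne", "em", "ne", "ne", "em", "ne", "ne"]
annotator_b = ["em", "ne", "ne", "ne", "em", "ne", "em", "em", "ne", "ne"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.2f}")
```

In this toy example the annotators agree on 8 of 10 sentences, yet the chance-corrected value is lower than the raw 0.8, which is why a chance-corrected measure is usually preferred to simple percentage agreement.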
The data in our experiments come from blogs. We wanted emotion-rich data, so that there would be ample examples of emotion use for analysis.
Such data is
1 http://nlp.cs.swarthmore.edu/semeval/tasks/task14/summary.shtml
2 http://www.lrec-conf.org/lrec2006/IMG/pdf/programWSemotion-LREC2006-last1.pdf
V. Matoušek and P. Mautner (Eds.): TSD 2007, LNAI 4629, pp. 196–205, 2007.
© Springer-Verlag Berlin Heidelberg 2007