but awkwardly (–)” rather than “He ran quickly (+) and awkwardly (–)”. However, it seems
less likely that HM would work well with nouns and verbs. There is nothing wrong with
saying “the rise (+) and fall (–) of the Roman Empire” or “love (+) and death (–)”.⁷
Indeed, “but” would not work in these phrases.
Kamps and Marx [2002] use the WordNet lexical database [Miller 1990] to determine
the semantic orientation of a word. For a given word, they look at its semantic distance
from “good” compared to its semantic distance from “bad”. The idea is similar to SO-A,
except that the measure of association is replaced with a measure of semantic distance,
based on WordNet [Budanitsky and Hirst 2001]. This is an interesting approach, but it
has not yet been evaluated empirically.
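As a rough illustration of the distance-based idea, the sketch below compares a word's shortest-path distance to “good” with its distance to “bad” in a small hand-made synonym graph. The graph, the function names, and the choice to score a word by the simple difference of the two distances are assumptions made for illustration; they are not WordNet data and not necessarily the exact measure used by Kamps and Marx [2002].

```python
from collections import deque

# Toy, hand-made synonym graph (hypothetical; NOT real WordNet data).
TOY_SYNONYMS = {
    "good": ["fine", "decent"],
    "fine": ["good", "okay"],
    "decent": ["good", "okay"],
    "okay": ["fine", "decent", "mediocre"],
    "mediocre": ["okay", "poor"],
    "poor": ["mediocre", "bad"],
    "bad": ["poor"],
}


def path_length(graph, start, goal):
    """Breadth-first search: edges on a shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def distance_orientation(graph, word):
    """Positive if `word` is closer to "good" than to "bad" (assumed scoring)."""
    d_good = path_length(graph, word, "good")
    d_bad = path_length(graph, word, "bad")
    if d_good is None or d_bad is None:
        return None  # the word is not connected to both reference words
    return d_bad - d_good


for word in ("fine", "okay", "poor"):
    print(word, distance_orientation(TOY_SYNONYMS, word))
```

In this toy graph, words near “good” receive positive scores and words near “bad” receive negative ones; a word that is not connected to both reference words gets no score at all, which hints at one practical limitation of purely graph-based distances.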
4.2. Classifying Reviews
Turney [2002] used a three-step approach to classify reviews. The first step was to apply
a part-of-speech tagger to the review and then extract two-word phrases, such as
“romantic ambience” or “horrific events”, where one of the words in the phrase was an
adjective or an adverb. The second step was to use SO-PMI to calculate the semantic
orientation of each extracted phrase. The third step was to classify the review as positive
or negative, based on the average semantic orientation of the extracted phrases. If the
average was positive, then the review was classified as positive; otherwise, it was
classified as negative. The experimental results suggest that SO-PMI may be useful for
classifying reviews, but they do not reveal how well SO-PMI can classify individual
words or phrases. It is therefore worthwhile to evaluate the performance of SO-PMI on
individual words experimentally, as we do in Section 5.
⁷ The Rise and Fall of the Roman Empire is the title of a book by Edward Gibbon. Love and
Death is the title of a movie directed by Woody Allen.
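The second and third steps of the review classification procedure described above can be sketched as follows. The part-of-speech tagging and phrase extraction of the first step are assumed to have been done already, `phrase_orientation` is a hypothetical stand-in for SO-PMI, and the scores in the usage example are invented numbers rather than values from any experiment.

```python
def classify_review(phrases, phrase_orientation):
    """Average the orientation of the extracted phrases and threshold at zero.

    phrases: two-word phrases already extracted from the review (step one).
    phrase_orientation: hypothetical callable returning a real-valued score
        for a phrase (standing in for SO-PMI in step two).
    """
    if not phrases:
        raise ValueError("no phrases were extracted from the review")
    scores = [phrase_orientation(phrase) for phrase in phrases]
    average = sum(scores) / len(scores)
    # Step three: positive average -> positive review; otherwise negative.
    label = "positive" if average > 0 else "negative"
    return label, average


# Invented scores for illustration only.
INVENTED_SCORES = {
    "romantic ambience": 1.5,
    "horrific events": -2.0,
    "great food": 2.0,
}

label, average = classify_review(list(INVENTED_SCORES), INVENTED_SCORES.get)
print(label, average)
```

Note that an average of exactly zero falls into the negative class under this rule, which follows the "otherwise, negative" wording of the description.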
The review classification application of SO-A illustrates the value of an automated approach to
determining semantic orientation. Although it might be feasible to manually create a
lexicon of individual words labeled with semantic orientation, if an application requires
the semantic orientation of two-word or three-word phrases, the number of terms
involved grows beyond what can be handled by manual labeling. Turney [2002] observed
that an adjective such as “unpredictable” may have a negative semantic orientation in an
automobile review, in a phrase such as “unpredictable steering”, but it could have a
positive (or neutral) orientation in a movie review, in a phrase such as “unpredictable
plot”. SO-PMI can handle multiword phrases by simply searching for them using a
quoted phrase query.
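To make the quoted-phrase idea concrete, the sketch below builds NEAR queries in which multiword phrases are wrapped in double quotes and then computes a log-ratio score from hit counts. It is a simplified stand-in for the SO-PMI definition in Section 3: it assumes a single positive and a single negative paradigm word, a small smoothing constant, and a hypothetical `hits` function. All of the hit counts shown are invented.

```python
import math


def as_query_term(term):
    """Wrap multiword phrases in double quotes so they are matched as a unit."""
    return f'"{term}"' if " " in term else term


def near_query(term, paradigm_word):
    """Build a query asking for `term` near `paradigm_word`."""
    return f"{as_query_term(term)} NEAR {paradigm_word}"


def so_pmi(term, hits, pos_word="excellent", neg_word="poor", smoothing=0.01):
    """Log-ratio orientation score from hit counts (simplified, assumed form).

    hits: hypothetical callable mapping a query string to a hit count.
    smoothing: small constant added to every count to avoid division by zero.
    """
    near_pos = hits(near_query(term, pos_word)) + smoothing
    near_neg = hits(near_query(term, neg_word)) + smoothing
    total_pos = hits(pos_word) + smoothing
    total_neg = hits(neg_word) + smoothing
    return math.log2((near_pos * total_neg) / (near_neg * total_pos))


# Invented hit counts for illustration only.
INVENTED_HITS = {
    "excellent": 1000,
    "poor": 1000,
    '"unpredictable steering" NEAR excellent': 1,
    '"unpredictable steering" NEAR poor': 8,
    '"unpredictable plot" NEAR excellent': 8,
    '"unpredictable plot" NEAR poor': 1,
}


def invented_hits(query):
    return INVENTED_HITS.get(query, 0)


for phrase in ("unpredictable steering", "unpredictable plot"):
    print(phrase, round(so_pmi(phrase, invented_hits), 2))
```

With these made-up counts the same adjective receives a negative score in one phrase and a positive score in the other, which is the behaviour the phrase-level approach is meant to capture.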
Pang et al. [2002] applied classical text classification techniques to the task of
classifying movie reviews as positive or negative. They evaluated three different
supervised learning algorithms and eight different sets of features, yielding twenty-four
different combinations. The best result was achieved using a Support Vector Machine
(SVM) with features based on the presence or absence (rather than the frequency) of
single words (rather than two-word phrases).
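The difference between presence-based and frequency-based single-word features can be illustrated with a short sketch. The whitespace tokenization and the function names are assumptions chosen for brevity, not the preprocessing used by Pang et al. [2002]; in a full system the resulting vectors would be passed to a supervised learner such as an SVM.

```python
from collections import Counter


def build_vocabulary(documents):
    """Sorted list of the distinct lowercase words in the training documents."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def frequency_features(document, vocabulary):
    """One count per vocabulary word: how often it occurs in the document."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]


def presence_features(document, vocabulary):
    """One bit per vocabulary word: 1 if it occurs at all, otherwise 0."""
    return [1 if count > 0 else 0
            for count in frequency_features(document, vocabulary)]


documents = ["great great plot", "dull plot"]
vocabulary = build_vocabulary(documents)
print(vocabulary)
print(frequency_features(documents[0], vocabulary))
print(presence_features(documents[0], vocabulary))
```

The presence encoding discards how many times a word is repeated, so a review that says “great” twice is represented the same way as one that says it once.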
We expect that Pang et al.’s algorithm will tend to be more accurate than Turney’s,
since the former is supervised and the latter is unsupervised. On the other hand, we
hypothesize that the supervised approach will require retraining for each new domain.
For example, if a supervised algorithm is trained with movie reviews, it is likely to
perform poorly when it is tested with automobile reviews. Perhaps it is possible to design
a hybrid algorithm that achieves high accuracy without requiring retraining.
Classifying reviews is related to measuring semantic orientation, since it is one of the
possible applications of semantic orientation, but there are many other possible
applications (see Section 2). Although it is interesting to evaluate a method for inferring
semantic orientation, such as SO-PMI, in the context of an application, such as review
classification, the diversity of potential applications makes it worthwhile to study semantic
orientation in isolation, outside of any particular application. That is the approach
adopted in this paper.
4.3. Subjectivity Analysis
Other related work is concerned with determining subjectivity [Hatzivassiloglou and
Wiebe 2000; Wiebe 2000; Wiebe et al. 2001]. The task is to distinguish sentences (or
paragraphs or documents or other suitable chunks of text) that present opinions and
evaluations from sentences that objectively present factual information [Wiebe 2000].
Wiebe et al. [2001] list a variety of potential applications for automated subjectivity
tagging, such as recognizing “flames” [Spertus 1997], classifying email, recognizing
speaker role in radio broadcasts, and mining reviews. In several of these applications, the
first step is to recognize that the text is subjective and then the natural second step is to
determine the semantic orientation of the subjective text. For example, a flame detector
cannot merely detect that a newsgroup message is subjective; it must also detect that
the message has a negative semantic orientation, since otherwise a message of praise could be
classified as a flame.
On the other hand, applications that involve semantic orientation are also likely to
benefit from a prior step of subjectivity analysis. For example, a movie review typically
contains a mixture of objective descriptions of scenes in the movie and subjective
statements of the viewer’s reaction to the movie. In a positive movie review, it is
common for the objective description to include words with a negative semantic
orientation, although the subjective reaction may be quite positive [Turney 2002]. If the
task is to classify the review as positive or negative, a two-step approach seems wise. The
first step would be to filter out the objective sentences [Wiebe 2000; Wiebe et al. 2001]
and the second step would be to determine the semantic orientation of the words and
phrases in the remaining subjective sentences [Turney 2002].
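The two-step approach suggested above can be sketched as a small pipeline. Both components here are toy stand-ins: `toy_is_subjective` is a crude hypothetical rule, not a subjectivity tagger from the cited work, and `TOY_LEXICON` contains invented orientation scores. The sketch is only meant to show how filtering objective sentences first can change the final label.

```python
def classify_with_subjectivity_filter(sentences, is_subjective,
                                      sentence_orientation):
    """Step 1: keep subjective sentences. Step 2: average their orientation."""
    subjective = [s for s in sentences if is_subjective(s)]
    if not subjective:
        return None  # nothing subjective to base a decision on
    average = sum(sentence_orientation(s) for s in subjective) / len(subjective)
    return "positive" if average > 0 else "negative"


# Invented scores and a crude rule, for illustration only.
TOY_LEXICON = {"murders": -2.0, "loved": 1.5}


def toy_orientation(sentence):
    words = sentence.lower().strip(".!?").split()
    return sum(TOY_LEXICON.get(word, 0.0) for word in words)


def toy_is_subjective(sentence):
    # Hypothetical heuristic: first-person sentences count as subjective.
    return sentence.lower().startswith(("i ", "we "))


review = ["The villain murders the detective.", "I loved every minute."]

with_filter = classify_with_subjectivity_filter(
    review, toy_is_subjective, toy_orientation)
without_filter = classify_with_subjectivity_filter(
    review, lambda sentence: True, toy_orientation)
print(with_filter, without_filter)
```

In this toy review the objective plot description contains a negatively oriented word, so skipping the filter flips the label even though the viewer's own reaction is positive.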
5. EXPERIMENTS
In Section 5.1, we discuss the lexicons and corpora that are used in the following
experiments. Section 5.2 examines the baseline performance of SO-PMI, when it is
configured as described in Section 3.1. Sections 5.3, 5.4, and 5.5 explore variations on
the baseline SO-PMI system. The baseline performance of SO-LSA is evaluated in
Section 5.6 and variations on the baseline SO-LSA system are considered in Section 5.7.
The final experiments in Section 5.8 analyze the effect of the choice of the paradigm
words, for both SO-PMI and SO-LSA.
5.1. Lexicons and Corpora
The following experiments use two different lexicons and three different corpora. The
corpora are used for unsupervised learning and the lexicons are used to evaluate the
results of the learning. The