corresponding rows of $U_k$ [Deerwester et al. 1990; Bartell et al. 1992; Schütze 1993; Landauer and Dumais 1997].

[5] The tf-idf score gives more weight to terms that are statistically "surprising". This heuristic works well for information retrieval, but its impact on determining semantic orientation is unknown.
The semantic orientation of a word, word, is calculated by SO-LSA from equation (4), as follows:

$$\text{SO-LSA}(word) = \sum_{pword \in Pwords} \text{LSA}(word, pword) - \sum_{nword \in Nwords} \text{LSA}(word, nword). \quad (15)$$

For the paradigm words, we have the following (from equations (5), (6), and (15)):

$$\text{SO-LSA}(word) = [\text{LSA}(word, \text{good}) + \cdots + \text{LSA}(word, \text{superior})] - [\text{LSA}(word, \text{bad}) + \cdots + \text{LSA}(word, \text{inferior})]. \quad (16)$$
As with SO-PMI, a word, word, is classified as having a positive semantic orientation when SO-LSA(word) is positive and a negative orientation when SO-LSA(word) is negative. The magnitude of SO-LSA(word) represents the strength of the semantic orientation.
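To make equations (15) and (16) and the sign-based classification rule concrete, the following is a minimal sketch in Python. It assumes that LSA(word1, word2) is the cosine between the two words' row vectors in the rank-k SVD approximation of a term-document matrix (the rows of $U_k$ scaled by the singular values); the toy matrix, vocabulary, and two-word paradigm lists are illustrative stand-ins rather than the actual data used in our experiments.

```python
# A minimal sketch of SO-LSA (equation (15)), assuming LSA(w1, w2) is the
# cosine between word row vectors in a rank-k SVD approximation of a
# term-document matrix. The toy data below is hypothetical.

import numpy as np

def lsa_vectors(X, k):
    """Rank-k word vectors from the SVD of term-document matrix X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]  # one row per vocabulary word

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v) / denom if denom else 0.0

def so_lsa(word, vocab, W, pwords, nwords):
    """Equation (15): similarities to Pwords minus similarities to Nwords."""
    idx = {w: i for i, w in enumerate(vocab)}
    sim = lambda a, b: cosine(W[idx[a]], W[idx[b]])
    return (sum(sim(word, p) for p in pwords)
            - sum(sim(word, n) for n in nwords))

# Hypothetical 6-term x 4-document count matrix.
vocab = ["good", "superior", "bad", "inferior", "honest", "fraud"]
X = np.array([[3., 0., 2., 0.],
              [1., 0., 2., 0.],
              [0., 3., 0., 2.],
              [0., 1., 0., 2.],
              [2., 0., 1., 0.],
              [0., 2., 0., 1.]])

W = lsa_vectors(X, k=2)
for w in ["honest", "fraud"]:
    score = so_lsa(w, vocab, W, ["good", "superior"], ["bad", "inferior"])
    # Positive orientation when SO-LSA > 0, negative when SO-LSA < 0;
    # the magnitude indicates the strength of the orientation.
    print(w, round(score, 3), "+" if score > 0 else "-")
```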
4. RELATED WORK
Related work falls into three groups: work on classifying words by positive or
negative
semantic orientation (Section 4.1), classifying reviews (e.g., movie reviews) as positive
or negative (Section 4.2), and recognizing subjectivity in text (Section 4.3).
4.1. Classifying Words
Hatzivassiloglou and McKeown [1997] treat the problem of determining semantic
orientation as a problem of classifying words, as we also do in this paper. They note that
there are linguistic constraints on the semantic orientations of adjectives in conjunctions.
As an example, they present the following three sentences:
1. The tax proposal was simple and well received by the public.
2. The tax proposal was simplistic, but well received by the public.
3. (*) The tax proposal was simplistic and well received by the public.
The third sentence is incorrect, because we use "and" with adjectives that have the same semantic orientation ("simple" and "well received" are both positive), whereas we use "but" with adjectives that have different semantic orientations ("simplistic" is negative).
Hatzivassiloglou and McKeown [1997] use a four-step supervised learning algorithm
to infer the semantic orientation of adjectives from constraints on conjunctions:
1. All conjunctions of adjectives are extracted from the given corpus.
2. A supervised learning algorithm combines multiple sources of evidence to label pairs
of adjectives as having the same semantic orientation
or different semantic
orientations. The result is a graph where the nodes are adjectives and links indicate
sameness or difference of semantic orientation.
3. A clustering algorithm processes the graph structure to produce two subsets of
adjectives, such that links across the two subsets are mainly different-orientation
links, and links inside a subset are mainly same-orientation links.
4. Since it is known that positive adjectives tend to be used more frequently than
negative adjectives, the cluster with the higher average frequency is classified as
having positive semantic orientation.
For brevity, we will call this the HM algorithm.
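To make the four-step pipeline concrete, the following is a highly simplified sketch in Python. It assumes steps 1 and 2 have already reduced the conjunction evidence to links labeled "same" or "different", replaces the supervised classifier and clustering algorithm with a greedy two-coloring of the link graph (which assumes the constraints contain no conflicting cycles), and uses hypothetical frequencies; it is an illustration of the idea, not Hatzivassiloglou and McKeown's implementation.

```python
# A simplified sketch of HM steps 3-4, assuming step 2 has already produced
# links labeled "same" or "different" between adjective pairs. A greedy
# two-coloring stands in for the paper's clustering algorithm, and assumes
# the link constraints contain no conflicting cycles.

from collections import defaultdict, deque

def cluster_and_label(links, freq):
    """links: dict mapping (adj1, adj2) -> 'same' or 'different'.
    freq: corpus frequency of each adjective (step 4 relies on positive
    adjectives tending to be more frequent than negative ones)."""
    graph = defaultdict(list)
    for (a, b), rel in links.items():
        graph[a].append((b, rel))
        graph[b].append((a, rel))

    color = {}  # 0 or 1: the two orientation clusters
    for start in graph:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:  # BFS: propagate same/different constraints
            u = queue.popleft()
            for v, rel in graph[u]:
                expected = color[u] if rel == "same" else 1 - color[u]
                if v not in color:
                    color[v] = expected
                    queue.append(v)

    # Step 4: the cluster with the higher average frequency is positive.
    clusters = {0: [], 1: []}
    for adj, c in color.items():
        clusters[c].append(adj)
    avg = {c: sum(freq.get(a, 0) for a in adjs) / max(len(adjs), 1)
           for c, adjs in clusters.items()}
    positive = max(avg, key=avg.get)
    return {adj: ("+" if c == positive else "-") for adj, c in color.items()}

# Hypothetical conjunction evidence and frequencies, for illustration only.
links = {("simple", "well-received"): "same",
         ("simplistic", "well-received"): "different",
         ("simple", "clear"): "same"}
freq = {"simple": 900, "well-received": 300, "clear": 700, "simplistic": 40}
print(cluster_and_label(links, freq))
```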
Like SO-PMI and SO-LSA, HM can produce a real-valued number that indicates both
the direction (positive or negative) and the strength of the semantic orientation. The
clustering algorithm (Step 3 above) can produce a “goodness-of-fit” measure that
indicates how well an adjective fits in its assigned cluster.
Hatzivassiloglou and McKeown [1997] used a corpus of 21 million words and
evaluated HM with 1,336 manually labeled adjectives (657 positive and 679 negative).
Their results are given in Table 2. HM classifies adjectives with accuracies ranging from
78% to 92%, depending on Alpha, as described next.
Table 2. The accuracy of HM with a 21 million-word corpus. [6]

Alpha    Accuracy    Size of test set    Percent of "full" test set
2        78.08%      730                 100.0%
3        82.56%      516                 70.7%
4        87.26%      369                 50.5%
5        92.37%      236                 32.3%

[6] This table is derived from Table 3 in Hatzivassiloglou and McKeown [1997].
Alpha is a parameter used to partition the 1,336 labeled adjectives into training and testing sets.
As Alpha increases, the training set grows and the testing set becomes
smaller. The precise definition of Alpha is complicated, but the basic idea is to put the
hard cases (the adjectives for which there are few conjunctions in the given corpus) in the
training set and the easy cases (the adjectives for which there are many conjunctions) in
the testing set. As Alpha increases, the testing set becomes increasingly easy (that is, the
adjectives that remain in the testing set are increasingly
well covered by the given
corpus). In essence, the idea is to improve accuracy by abstaining from classifying the difficult (rare, sparsely represented) adjectives. As expected, the accuracy rises as Alpha rises. This suggests that the accuracy would improve with larger corpora, since a larger corpus would provide more conjunction evidence for each adjective.
This algorithm achieves good accuracy, but it has some limitations. In contrast with SO-A, HM is restricted to adjectives, and it requires labeled adjectives as training data (in step 2).
Although each step in HM, taken by itself, is relatively simple, the combination of the
four steps makes theoretical analysis challenging. In particular, the interaction between
the supervised labeling (step 2) and the clustering (step 3) is difficult to analyze. For
example, the degree of regularization (i.e., smoothing, pruning) in the labeling step may
have an impact on the quality of the clusters. By contrast, SO-PMI is captured in a single
formula (equation (10)), which takes the form of the familiar log-odds ratio [Agresti
1996].
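Equation (10) is not reproduced in this section, but its log-odds form can be sketched as follows. The hits() stub, the query strings, and the smoothing constant are illustrative assumptions standing in for real co-occurrence counts from a corpus or search engine.

```python
# A sketch of SO-PMI's log-odds form, assuming a hits(query) function that
# returns co-occurrence counts from some corpus or search engine. The stub
# below fakes counts for illustration; a small smoothing constant guards
# against zero counts.

import math

FAKE_COUNTS = {  # hypothetical hit counts, for illustration only
    "honest NEAR good-query": 120, "honest NEAR bad-query": 30,
    "good-query": 5000, "bad-query": 4000,
}

def hits(query):
    return FAKE_COUNTS.get(query, 0)

def so_pmi(word, pquery="good-query", nquery="bad-query", eps=0.01):
    """Log-odds ratio: association with Pwords vs. association with Nwords."""
    num = (hits(f"{word} NEAR {pquery}") + eps) * (hits(nquery) + eps)
    den = (hits(f"{word} NEAR {nquery}") + eps) * (hits(pquery) + eps)
    return math.log2(num / den)

print(so_pmi("honest"))  # > 0, so "honest" is classified as positive
```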
HM has only been evaluated with adjectives, but it seems
likely that it would work
with adverbs. For example, we would tend to say “He ran quickly (+)