[Figure 14 plot. x-axis: Number of Dimensions; y-axis: Accuracy; curves: 100% Threshold, 75% Threshold, 50% Threshold, 25% Threshold.]
Figure 14. The effect of varying the number of dimensions for SO-LSA.
5.8. Varying the Paradigm Words
The standard methodology for supervised learning is to randomly split the labeled data
(the lexicon, in this context) into a training set and a testing set. The sizes of the training
and testing sets are usually approximately the same, within an order of magnitude. We
think of SO-A as an unsupervised learning method, because the “training” set is only
fourteen words (two orders of magnitude smaller than the testing set) and because the
paradigm words were carefully chosen instead of randomly selected (defining rather than
training).
The fourteen paradigm words were chosen as prototypes or ideal examples of positive
and negative semantic orientation (see Section 3). All fourteen paradigm words appear in
the General Inquirer lexicon. The positive paradigm words are all tagged “Positiv” and
the negative paradigm words are all tagged “Negativ” (although they were chosen before
consulting the General Inquirer lexicon). As we mentioned, the paradigm words were
removed from the testing words for our experiments.
The following experiment examines the behaviour of SO-A when the paradigm words
are randomly selected. Since rare words would tend to require a larger corpus for SO-A
to work well, we controlled for frequency effects. For each original paradigm word, we
found the word in the General Inquirer lexicon with the same tag (“Positiv” or “Negativ”)
and the most similar frequency. The frequency was measured by the number of hits in
AltaVista. Table 8 shows the resulting new paradigm words.
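
The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `match_by_frequency`, the dictionary layout, and the rule that an already-selected word cannot be reused are assumptions. The toy lexicon contains only a few words and hit counts copied from Table 8, not the full General Inquirer lexicon.

```python
def match_by_frequency(paradigm, lexicon, hits):
    """For each paradigm word, pick the lexicon word with the same tag
    whose hit count is closest to the paradigm word's hit count.

    paradigm: dict mapping paradigm word -> tag ("Positiv" or "Negativ")
    lexicon:  dict mapping lexicon word  -> tag
    hits:     dict mapping word -> number of search-engine hits
    """
    matched = {}
    # Assumption: a paradigm word may not be matched to itself, to another
    # paradigm word, or to a word that has already been selected.
    used = set(paradigm)
    for word, tag in paradigm.items():
        candidates = [w for w, t in lexicon.items() if t == tag and w not in used]
        best = min(candidates, key=lambda w: abs(hits[w] - hits[word]))
        matched[word] = best
        used.add(best)
    return matched


# Toy data: a handful of rows from Table 8 (hit counts as reported there).
hits = {
    "good": 55_289_359, "right": 55_321_211,
    "nice": 12_259_779, "worth": 12_242_455,
    "bad": 18_577_687, "lost": 17_962_401,
    "nasty": 2_273_977, "burden": 2_267_307,
}
paradigm = {"good": "Positiv", "nice": "Positiv", "bad": "Negativ", "nasty": "Negativ"}
lexicon = {
    "good": "Positiv", "right": "Positiv", "nice": "Positiv", "worth": "Positiv",
    "bad": "Negativ", "lost": "Negativ", "nasty": "Negativ", "burden": "Negativ",
}

matched = match_by_frequency(paradigm, lexicon, hits)
for original, new in matched.items():
    print(f"{original} -> {new}")
```

Matching on the tag first and on frequency second keeps the semantic orientation label fixed while controlling for how common each word is.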
Table 8. Original paradigm words and corresponding frequency-matched new
paradigm words.

Original        Frequency of    Matched       Frequency      Semantic
paradigm word   original word   new word      of new word    orientation
-------------   -------------   -----------   -----------    -----------
good            55,289,359      right         55,321,211     positive
nice            12,259,779      worth         12,242,455     positive
excellent       11,119,032      commission    11,124,607     positive
positive         9,963,557      classic        9,969,619     positive
fortunate        1,049,242      devote         1,052,922     positive
correct         11,316,975      super         11,321,807     positive
superior         5,335,487      confidence     5,344,805     positive
bad             18,577,687      lost          17,962,401     negative
nasty            2,273,977      burden         2,267,307     negative
poor             9,622,080      pick           9,660,275     negative
negative         5,896,695      raise          5,885,800     negative
unfortunate        987,942      guilt            989,363     negative
wrong           12,048,581      capital       11,721,649     negative
inferior         1,013,356      blur           1,011,693     negative

The inclusion of some of the words in Table 8, such as “pick”, “raise”, and “capital”,
may seem surprising. These words are only negative in certain contexts, such as “pick on
your brother”, “raise a protest”, and “capital offense”. We hypothesized that the poor
performance of the new paradigm words was (at least partly) due to their sensitivity to
context, in contrast to the original paradigm words. To test this hypothesis, we asked 25
people to rate the 28 words in Table 8, using the following scale:
1 = negative semantic orientation (in almost all contexts)
2 = negative semantic orientation (in typical contexts)
3 = neutral or context-dependent semantic orientation
4 = positive semantic orientation (in typical contexts)
5 = positive semantic orientation (in almost all contexts)
Each person was given a different random permutation of the 28 words, to control for
ordering effects. The average pairwise correlation between subjects’ ratings was 0.86.
The original paradigm words had average ratings of 4.5 for the seven positive words and
1.4 for the seven negative words. The new paradigm words had average ratings of 3.9 for
positive and 2.4 for negative. These judgments lend support to the hypothesis that context
sensitivity is higher for the new paradigm words; context independence is higher for the
original paradigm words. On an individual basis, subjects judged the original word more
context independent than the corresponding new paradigm word in 61% of cases
(statistically significant, p < .01).
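
As an illustration of the agreement statistic mentioned above, the following sketch computes an average pairwise correlation between subjects. It assumes Pearson correlation (the text does not say which coefficient was used) and assumes each subject's ratings have been put back into a common word order, since every subject saw a different permutation. The function names and the three rating vectors are hypothetical and are not the study's data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_pairwise_correlation(ratings):
    """Mean of the Pearson correlation over every unordered pair of subjects.

    ratings: list of per-subject rating vectors, all aligned to the same
             word order (not the order in which each subject saw the words).
    """
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings on the 1-5 scale for four words by three subjects.
example_ratings = [
    [5, 4, 2, 1],
    [5, 3, 2, 1],
    [4, 4, 3, 1],
]
agreement = average_pairwise_correlation(example_ratings)
print(round(agreement, 2))
```

With 25 subjects, the average would run over 300 subject pairs.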
To evaluate the fourteen new paradigm words, we removed them from the set of
3,596 testing words and substituted the original paradigm words in their place. Figure 15
compares the accuracy of the original paradigm words with the new words, using SO-PMI
with AV-ENG and GI, and Figure 16 uses AV-CA. It is clear that the original
words perform much better than the new words.
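
The substitution of testing words described above amounts to a set operation, sketched here under stated assumptions. The function name `swap_paradigm_words` is hypothetical, the word lists are a small subset of Table 8, and “happy” and “sad” are placeholder testing words added only for illustration.

```python
def swap_paradigm_words(testing_words, original_paradigm, new_paradigm):
    """Build the testing set used to evaluate the new paradigm words:
    drop the new paradigm words and put the original ones in their place."""
    testing = set(testing_words)
    missing = set(new_paradigm) - testing
    if missing:
        # Every new paradigm word is expected to come from the testing set.
        raise ValueError(f"not in testing set: {sorted(missing)}")
    return (testing - set(new_paradigm)) | set(original_paradigm)


# Toy example with a few words from Table 8 plus two placeholder words.
original_paradigm = {"good", "nice", "bad", "nasty"}
new_paradigm = {"right", "worth", "lost", "burden"}
testing_words = {"right", "worth", "lost", "burden", "happy", "sad"}

new_testing = swap_paradigm_words(testing_words, original_paradigm, new_paradigm)
print(sorted(new_testing))
```

Because the number of words removed equals the number added, the testing set keeps its size, so accuracies for the two sets of paradigm words are computed over equally many words.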
Figure 17 and Figure 18 compare SO-PMI and SO-LSA on the TASA-ALL corpus
with the original and new paradigm words. Again, the original words perform much
better than the new words.