3. SEMANTIC ORIENTATION FROM ASSOCIATION
The general strategy in this paper is to infer semantic orientation from semantic
association. The semantic orientation of a given word is calculated from the strength of
its association with a set of positive words, minus the strength of its association with a set
of negative words:
(1) Pwords = a set of words with positive semantic orientation
(2) Nwords = a set of words with negative semantic orientation
(3) A(word1, word2) = a measure of the association between word1 and word2
(4) SO-A(word) = Σ_{pword ∈ Pwords} A(word, pword) − Σ_{nword ∈ Nwords} A(word, nword)
We assume that A(word1, word2) maps to a real number. When A(word1, word2) is
positive, the words tend to be associated with each other. Larger values correspond to
stronger associations. When A(word1, word2) is negative, the presence of one word
makes it likely that the other is absent.
A word, word, is classified as having a positive semantic orientation when
SO-A(word) is positive and a negative orientation when SO-A(word) is negative. The
magnitude (absolute value) of SO-A(word) can be considered the strength of the semantic
orientation.
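To make the definition concrete, the following is a minimal Python sketch of the SO-A computation and the sign-based classification rule just described. It is illustrative rather than taken from the paper: since this section leaves A(word1, word2) abstract, the association measure is passed in as a callable, and the function and parameter names are our own.

```python
from typing import Callable, Iterable

def so_a(word: str,
         association: Callable[[str, str], float],
         pwords: Iterable[str],
         nwords: Iterable[str]) -> float:
    """Equation (4): total association with the positive paradigm words
    minus total association with the negative paradigm words."""
    return (sum(association(word, pword) for pword in pwords)
            - sum(association(word, nword) for nword in nwords))

def orientation(so_score: float) -> str:
    """The sign of SO-A(word) gives the orientation; its absolute value
    can be read as the strength of that orientation."""
    return "positive" if so_score > 0 else "negative"
```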
In the following experiments, seven positive words and seven negative words are
used as paradigms of positive and negative semantic orientation:
(5) Pwords = {good, nice, excellent, positive, fortunate, correct, superior}
(6) Nwords = {bad, nasty, poor, negative, unfortunate, wrong, inferior}
These fourteen words were chosen for their lack of sensitivity to context. For example, a
word such as “excellent” is positive in almost all contexts. The sets also consist of
opposing pairs (good/bad, nice/nasty, excellent/poor, etc.). We experiment with randomly
selected words in Section 5.8.
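Assuming some real-valued association measure has been chosen (named assoc below purely as a placeholder; it is not defined in this section), the paradigm sets in (5) and (6) plug into the sketch above as follows.

```python
# Paradigm word sets from (5) and (6).
PWORDS = ["good", "nice", "excellent", "positive",
          "fortunate", "correct", "superior"]
NWORDS = ["bad", "nasty", "poor", "negative",
          "unfortunate", "wrong", "inferior"]

# `assoc` stands in for whatever association measure is selected;
# "unpredictable" is an arbitrary example word.
# score = so_a("unpredictable", assoc, PWORDS, NWORDS)
# label = orientation(score)   # "positive" or "negative"
```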
It could be argued that this is a supervised learning algorithm with fourteen labeled
training examples and millions or billions of unlabeled training examples, but it seems
more appropriate to say that the paradigm words are defining semantic orientation rather
than training the algorithm. Therefore we prefer to describe our approach as
unsupervised learning. However, this point does not affect our conclusions.
This general strategy is called SO-A (Semantic Orientation from Association).
Selecting particular measures of word association results in particular instances of the
strategy.