Microsoft Word turney-littman-acm doc

strong tends to be correlated with positive and weak with negative, there are many
examples in General Inquirer of words that are negative and strong (e.g., abominable,
aggressive, antagonism, attack, austere, avenge) or positive and weak (e.g., delicate,
gentle, modest, polite, subtle). The strong/weak pair may be useful in applications such as
analysis of political text, propaganda, advertising, news, and opinions. Many of the
applications discussed in Section 2 could also make use of the ability to automatically
distinguish strong and weak words.
As we discussed in Section 5.8, the semantic orientation of many words depends on
the context. For example, in the General Inquirer lexicon, mind#9 (“lose one’s mind”) is
Negativ and mind#10 (“right mind”) is Positiv. In our experiments, we avoided this issue
by deleting words like “mind”, with both Positiv and Negativ tags, from the set of testing
words. However, in a real-world application, the issue cannot be avoided so easily.
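The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the experiments: the entry format (a sense identifier such as "mind#9" paired with a set of tags) and the function name `unambiguous_words` are assumptions made for the example.

```python
from collections import defaultdict

POLARITY_TAGS = {"Positiv", "Negativ"}

def unambiguous_words(entries):
    """Map each base word to its polarity tag, dropping ambiguous words.

    `entries` is an iterable of (sense_id, tags) pairs, where sense_id
    looks like "mind#9" or "gentle" (hypothetical format). A word is
    kept only if its senses carry exactly one of Positiv or Negativ.
    """
    tags_by_word = defaultdict(set)
    for sense_id, tags in entries:
        # Strip the sense number: "mind#9" -> "mind".
        word = sense_id.split("#", 1)[0].lower()
        tags_by_word[word] |= set(tags) & POLARITY_TAGS
    return {
        word: next(iter(tags))
        for word, tags in tags_by_word.items()
        if len(tags) == 1
    }

# Small illustrative sample; the polarities follow the examples in the text.
sample = [
    ("mind#9", {"Negativ"}),
    ("mind#10", {"Positiv"}),
    ("gentle", {"Positiv"}),
    ("attack", {"Negativ"}),
]
testing_words = unambiguous_words(sample)  # "mind" is dropped
```

A filter of this kind only sidesteps the issue for evaluation; it does nothing for text in which such words actually occur.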
This may appear to be a problem of word sense disambiguation. Perhaps, in one
sense, the word “mind” is positive and, in another sense, it is negative. Although it is
related to word sense disambiguation, we believe that it is a separate problem. For
example, consider “unpredictable steering” versus “unpredictable plot” (from Section
4.2). The word “unpredictable” has the same meaning in both phrases, yet it has a
negative orientation in the first case but a positive orientation in the second case. We
believe that the problem is context sensitivity. This is supported by the experiments in
Section 5.8. Evaluating the semantic orientation of two-word phrases, instead of single
words, is an attempt to deal with this problem [Turney 2002], but more sophisticated
solutions might yield significant improvements in performance, especially with
applications that involve larger chunks of text (e.g., paragraphs and documents instead of
words and phrases).
8. CONCLUSION
This paper has presented a general strategy for measuring semantic orientation from
semantic association, SO-A. Two instances of this strategy have been empirically
evaluated, SO-PMI and SO-LSA. SO-PMI requires a large corpus, but it is simple, easy
to implement, unsupervised, and it is not restricted to adjectives.
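As a minimal sketch of the general strategy, assume SO-A is read as the summed association of a word with a set of positive paradigm words minus its summed association with a set of negative paradigm words, with PMI as the association measure. Everything concrete below is an assumption made for illustration: the tiny in-memory corpus stands in for a large corpus or search engine, the two-word paradigm sets are placeholders, and the smoothing constant and function names are hypothetical.

```python
import math
from typing import Callable, Iterable, Sequence, Set

def pmi(word1: str, word2: str, docs: Sequence[Set[str]],
        smoothing: float = 0.01) -> float:
    """Pointwise mutual information from document co-occurrence counts.

    The smoothing constant avoids log(0) and division by zero; its
    value here is an assumption for this sketch.
    """
    n = len(docs)
    count1 = sum(1 for d in docs if word1 in d)
    count2 = sum(1 for d in docs if word2 in d)
    both = sum(1 for d in docs if word1 in d and word2 in d)
    return math.log2(
        (both + smoothing) * n / ((count1 + smoothing) * (count2 + smoothing))
    )

def so_a(word: str, pwords: Iterable[str], nwords: Iterable[str],
         assoc: Callable[[str, str], float]) -> float:
    """Semantic orientation from association: positive minus negative."""
    positive = sum(assoc(word, p) for p in pwords)
    negative = sum(assoc(word, q) for q in nwords)
    return positive - negative

# Placeholder paradigm words and a toy corpus (one set of words per document).
PWORDS = ("good", "excellent")
NWORDS = ("bad", "poor")
DOCS = [
    {"superb", "good"}, {"superb", "excellent"}, {"good", "excellent"},
    {"awful", "bad"}, {"awful", "poor"}, {"bad", "poor"},
]

def so_pmi(word: str) -> float:
    """SO-A instantiated with PMI over the toy corpus."""
    return so_a(word, PWORDS, NWORDS, lambda a, b: pmi(a, b, DOCS))

if __name__ == "__main__":
    for w in ("superb", "awful"):
        print(w, "positive" if so_pmi(w) > 0 else "negative")
```

Because `so_a` takes the association measure as a parameter, a different measure (for example, a cosine over LSA vectors) could be substituted without changing the rest of the sketch.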

Semantic orientation has a wide variety of applications in information systems,
including classifying reviews, distinguishing synonyms and antonyms, extending the
capabilities of search engines, summarizing reviews, tracking opinions in online
discussions, creating more responsive chatbots, and analyzing survey responses. There
are likely to be many other applications that we have not anticipated.
ACKNOWLEDGEMENTS
Thanks to the anonymous reviewers of ACM TOIS for their very helpful comments. We
are grateful to Vasileios Hatzivassiloglou and Kathy McKeown for generously providing
a copy of their lexicon. Thanks to Touchstone Applied Science Associates for the TASA
corpus. We thank AltaVista for allowing us to send so many queries to their search
engine. Thanks to Philip Stone and his colleagues for making the General Inquirer
lexicon available to researchers. We would also like to acknowledge the support of
NASA and Knowledge Engineering Technologies.
9. REFERENCES
AGRESTI, A. 1996. An introduction to categorical data analysis. Wiley, New York.
BARTELL, B.T., COTTRELL, G.W., AND BELEW, R.K. 1992. Latent semantic indexing is an optimal special case
of multidimensional scaling. Proceedings of the Fifteenth Annual International ACM SIGIR Conference on
