4.3 Identifying Countries
instances to identify a set of effective patterns. The learned patterns are used in the
second part of this experiment to extract instances using memory-based learning.
4.3.1 Learning Effective Hyponym Patterns
We are interested in whether the effective text patterns are indeed intuitive formulations
of the given relation. As a test case, we compute the most effective patterns
for the hyponym relation using a test set with the names of all countries. Taking the
terms country and countries as hypernyms, we are interested in which text fragments
connect the names of countries with these words. Much pattern-based information
extraction research (e.g. [Caraballo, 1999; Cimiano & Staab, 2004; Etzioni et al.,
2005; Snow, Jurafsky, & Ng, 2006; Tjong Kim Sang & Hofmann, 2007]) is based
on hyponym patterns manually identified by Hearst [1992]. We are interested in
the overlap of the automatically found hyponym patterns with the commonly used
ones.
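To make the notion of a connecting text fragment concrete, the following is a minimal sketch and not the procedure used in this experiment. It assumes a small list of text snippets is already available and simply counts the short fragments found between a hypernym (country, countries) and a country name, in either order. The function name connecting_patterns, the max_words limit and the use of raw frequency as a stand-in for effectiveness are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypernym terms from the text; the longer form comes first in the alternation.
HYPERNYMS = ["countries", "country"]


def connecting_patterns(snippets, instances, max_words=3):
    """Count the fragments that link a hypernym and an instance.

    Hypothetical helper: at most `max_words` words may separate the two
    terms. Raw frequency is used here only as a simple proxy.
    """
    hyper = "|".join(HYPERNYMS)
    # A gap is up to `max_words` words plus the surrounding non-word characters.
    gap = rf"((?:\W+\w+){{0,{max_words}}}?\W+)"
    counts = Counter()
    for text in snippets:
        for name in instances:
            inst = re.escape(name)
            # Hypernym first, e.g. "countries such as Spain".
            fwd = rf"\b(?:{hyper})\b{gap}{inst}\b"
            for m in re.finditer(fwd, text, re.IGNORECASE):
                words = m.group(1).lower().split()
                counts[" ".join(["[hypernym]", *words, "[instance]"])] += 1
            # Instance first, e.g. "Spain and other countries".
            bwd = rf"\b{inst}{gap}(?:{hyper})\b"
            for m in re.finditer(bwd, text, re.IGNORECASE):
                words = m.group(1).lower().split()
                counts[" ".join(["[instance]", *words, "[hypernym]"])] += 1
    return counts


if __name__ == "__main__":
    # Toy snippets, invented purely for illustration.
    snippets = [
        "Countries such as Spain are popular.",
        "Spain and other countries signed.",
        "countries such as France export wine.",
    ]
    for pattern, freq in connecting_patterns(snippets, ["Spain", "France"]).most_common():
        print(freq, pattern)
```

On these toy snippets the fragments "such as" and "and other" emerge, both of which are among the patterns listed by Hearst [1992]. Comparing such automatically found fragments with the manually identified ones is the kind of overlap this section is interested in.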
This experiment was set up as follows. We again use the collected list of countries
(see Section 4.2). Let
I