6.2 Processing Extracted Subjective Information
be problematic [Cafarella, Downey, Soderland, & Etzioni, 2005]. Moreover, the
number of hits can fluctuate over time [Véronis, 2006], which hampers the reuse
of old hit counts.
Using PCM we thus need to perform $m \cdot n$ queries to collect the
co-occurrences between tags and instances and $\frac{1}{2}(n^2 - n)$ queries to
gather all pairs of co-occurrences between the instances in $I_a$. Hence, the
Google Complexity of PCM is $O(mn + n^2)$. When we assume that the size of $I_g$
does not exceed $n$, the Google Complexity of PCM is $O(n^2)$.
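The query count can be checked with a small sketch. It assumes one search engine query per pair and uses hypothetical names (`pcm_queries`, `pcm_query_count`, and the example tags and instances) that do not come from the text; it only enumerates the pairs and issues no queries.

```python
from itertools import combinations


def pcm_queries(tags, instances):
    """Enumerate the pairs for which PCM would issue one query each.

    Assumption: every (tag, instance) pair and every unordered pair of
    distinct instances costs exactly one query.
    """
    # m * n pairs between tags and instances
    tag_instance = [(t, a) for t in tags for a in instances]
    # (n^2 - n) / 2 unordered pairs of distinct instances
    instance_instance = list(combinations(instances, 2))
    return tag_instance, instance_instance


def pcm_query_count(m, n):
    """Total number of queries: m * n + (n^2 - n) / 2."""
    # n^2 - n = n * (n - 1) is always even, so integer division is exact.
    return m * n + (n * n - n) // 2


# Illustrative data only.
tags = ["rock", "jazz"]
instances = ["a", "b", "c", "d"]
tag_pairs, instance_pairs = pcm_queries(tags, instances)
total = len(tag_pairs) + len(instance_pairs)
```

With $m = 2$ tags and $n = 4$ instances, this gives $8 + 6 = 14$ queries. The quadratic term dominates once $m$ does not exceed $n$.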
Document-based Method (DM). In the Document-based Method (DM) approach we
collect the first $k$ URLs of the documents returned by the search engine
for a given query, constructed using a known instance. These $k$ URLs are the most
relevant for the query submitted based on the ranking used by the search engine
[Brin & Page, 1998]. The corresponding documents are subsequently scanned for
occurrences of instances of the related class [De Boer et al., 2007].
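The scanning step can be illustrated with a minimal sketch. It assumes that the texts of the top-$k$ documents have already been retrieved, that matching is case-insensitive on whole words, and that each document counts at most once per instance. None of these choices is stated in the text, and the function name `count_cooccurrences` is hypothetical.

```python
import re
from collections import Counter


def count_cooccurrences(documents, related_instances):
    """Count in how many documents each related instance occurs.

    documents: texts of the top-k pages retrieved for one known instance
               (retrieval itself is outside this sketch).
    related_instances: names of the instances of the related class.
    """
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for name in related_instances:
            # Whole-word, case-insensitive match (a simplifying assumption).
            pattern = r"\b" + re.escape(name.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[name] += 1
    return counts


# Illustrative documents only.
docs = [
    "Miles Davis was a jazz legend.",
    "Jazz and blues influenced rock.",
    "No genre is mentioned here.",
]
counts = count_cooccurrences(docs, ["jazz", "rock", "blues"])
```

Counting documents rather than individual occurrences keeps a single long page from dominating the result. Other weighting schemes are equally compatible with the description above.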
In the first phase of the algorithm, we query all instances in both
I