6.2.1 Identifying Relatedness between Instances
Having gathered a list of co-occurrences of instances in $I_a$ using either PM, PCM or DM, we are interested in the extent to which these instances are expressed to be related.
We assume that two instances are related when they are relatively often mentioned
in the same context. For each instance $i$ we could consider the instance $i' \in I_a$ with the highest $\mathrm{co}(i, i')$ to be the most related to $i$. However, we observe that, in that
case, frequently occurring instances have a relatively large probability to be related
to any other instance. This observation leads to an approach inspired by the theory
of pointwise mutual information [Manning & Schütze, 1999; Downey et al., 2005].
We use $T(i, i')$ to express the relatedness of instances $i'$ to $i$ as follows,

$$T(i, i') = \frac{\mathrm{co}(i, i')}{\sum_{i'',\, i'' \neq i'} \mathrm{co}(i'', i')}. \qquad (6.1)$$
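To illustrate how (6.1) dampens frequently occurring instances, the following is a minimal sketch, not the implementation used in this work. It assumes the co-occurrence counts are stored in a dictionary keyed by ordered pairs of instances; the function name `relatedness_T`, the instance names and all counts are hypothetical. Returning 0 when the denominator is zero is likewise an assumption.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical storage: co[(x, y)] holds the co-occurrence count co(x, y).
CoCounts = Dict[Tuple[str, str], int]


def relatedness_T(co: CoCounts, instances: Iterable[str], i: str, i_prime: str) -> float:
    """Compute T(i, i') as in (6.1)."""
    # Denominator: sum of co(i'', i') over all instances i'' other than i'.
    denominator = sum(
        co.get((other, i_prime), 0) for other in instances if other != i_prime
    )
    if denominator == 0:
        # Assumption: an instance that never co-occurs is treated as unrelated.
        return 0.0
    return co.get((i, i_prime), 0) / denominator


# Made-up counts: "pop" co-occurs with everything, "niche" mostly with "a".
instances = ["a", "b", "c", "pop", "niche"]
co = {
    ("a", "pop"): 10, ("b", "pop"): 40, ("c", "pop"): 50,
    ("a", "niche"): 6, ("b", "niche"): 1, ("c", "niche"): 1,
}

T_pop = relatedness_T(co, instances, "a", "pop")      # 10 / 100
T_niche = relatedness_T(co, instances, "a", "niche")  # 6 / 8
```

In this toy data the raw count favours "pop" (10 against 6), whereas $T$ favours "niche", because most co-occurrences of "pop" involve instances other than "a".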
The function $T$ can be normalized to $t$, i.e. with values $0 \le t(i, i') \le 1$,

$$t(i, i') = \frac{T(i, i')}{\sum_{i'' \in I_a} T(i, i'')}. \qquad (6.2)$$
We address the Instance Relatedness Problem using $t(i, i')$ by identifying an ordered list of all instances related to $i$.
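The ranking step can be sketched as follows, under stated assumptions: co-occurrence counts are kept in a dictionary keyed by ordered instance pairs, the instance $i$ itself is left out of its own list, and only candidates with a positive score are reported. The names `ranked_related` and `_T` and the data are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Tuple

CoCounts = Dict[Tuple[str, str], int]


def _T(co: CoCounts, instances: List[str], i: str, i_prime: str) -> float:
    """Unnormalized relatedness: co(i, i') over the total count of i'."""
    total = sum(co.get((x, i_prime), 0) for x in instances if x != i_prime)
    return co.get((i, i_prime), 0) / total if total else 0.0


def ranked_related(co: CoCounts, instances: List[str], i: str) -> List[Tuple[str, float]]:
    """Return (instance, t-score) pairs for i, most related first."""
    # Assumption: i is not a candidate for its own list of related instances.
    raw = {cand: _T(co, instances, i, cand) for cand in instances if cand != i}
    norm = sum(raw.values())
    if norm == 0:
        return []  # i co-occurs with nothing, so there is nothing to rank
    # Normalize so that the scores sum to 1, then keep positive scores only.
    scored = [(cand, value / norm) for cand, value in raw.items() if value > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up counts for illustration.
instances = ["a", "b", "c", "pop", "niche"]
co = {
    ("a", "pop"): 10, ("b", "pop"): 40, ("c", "pop"): 50,
    ("a", "niche"): 6, ("b", "niche"): 1, ("c", "niche"): 1,
}

ranking = ranked_related(co, instances, "a")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the scores are divided by their sum, they can be read as a distribution over the candidates for a given $i$, which makes lists for different instances comparable in scale.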
6.2.2 Categorizing Instances
The Instance Categorization Problem handles the identification of a most applicable $j \in I$