6.3 Evaluating Extracted Subjective Information
[Figure 6.1: plot of the average number of shared tags (y-axis) against k
(x-axis) for k-NN; series: normalized data, last.fm data, track-filtered data.]
Figure 6.1. Average number of shared tags for the 224 artists.
We computed the average number of overlapping tags for the 224 artists and
their k nearest neighbors and display the results in Figure 6.1. Because often
fewer than 100 tags are assigned to each artist, especially after track
filtering, we also computed the similarity score for each of the 224 artists
and their k nearest neighbors by taking the average number of shared tags
relative to the total number of tags for the nearest neighbors. For example,
if an artist shares 34 out of 40 tags with an artist in the list of 224, the
relative tag similarity score for this artist is 34/40. The average similarity
scores are given in Figure 6.2. The scores are computed using unfiltered,
normalized and track-filtered Last.fm data.
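The two measures can be sketched in Python as follows. This is only an
illustration under stated assumptions, not the implementation used in the
experiments: the function names, the dictionary-of-sets representation and the
toy data are hypothetical, and the relative score is assumed to divide the
number of shared tags by the number of tags of the neighbor, in line with the
34/40 example.

```python
from typing import Dict, List, Set, Tuple


def shared_tags(tags_a: Set[str], tags_b: Set[str]) -> int:
    """Number of tags two artists have in common."""
    return len(tags_a & tags_b)


def knn_tag_overlap(
    artist: str,
    neighbors: List[str],
    tags: Dict[str, Set[str]],
    k: int,
) -> Tuple[float, float]:
    """Return (average shared tags, average relative score) over the first k neighbors.

    Assumption: `neighbors` is already ranked by similarity, and the relative
    score divides the shared-tag count by the neighbor's own number of tags.
    """
    top_k = neighbors[:k]
    if not top_k:
        return 0.0, 0.0
    counts = [shared_tags(tags[artist], tags[n]) for n in top_k]
    # A neighbor without any tags contributes a relative score of 0 (assumption).
    ratios = [
        (c / len(tags[n])) if tags[n] else 0.0
        for c, n in zip(counts, top_k)
    ]
    return sum(counts) / len(top_k), sum(ratios) / len(top_k)


# Hypothetical toy data: artist "a" with two ranked neighbors "b" and "c".
toy_tags = {
    "a": {"rock", "indie", "90s", "pop"},
    "b": {"rock", "indie", "pop", "live"},
    "c": {"rock", "jazz"},
}
avg_shared, avg_relative = knn_tag_overlap("a", ["b", "c"], toy_tags, k=2)
```

In the toy data, "a" shares 3 of the 4 tags of "b" and 1 of the 2 tags of "c",
so the average number of shared tags is 2 and the average relative score is
(3/4 + 1/2) / 2.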
The average number and score of overlapping tags decrease only slightly for
the unfiltered and normalized data with increasing k. For the track-filtered
data, we even note a small increase in the relative number of tags shared
(starting from k = 25). This can be explained by the small number of tags that
remain after track filtering, as can be seen in Figure 6.1.
Using the unfiltered Last.fm tags of all retrieved artists, we estimate the
expected number of tags shared by two randomly chosen artists as 29.8 and the
relative number of shared tags as 0.58. When we filter the tags by
normalization and compare the normalized forms of the tags, we obtain an
average of 29.8 shared tags, with a relative number of 0.62. For the track
filtering, these numbers are 3.87 and 0.64 respectively. Hence, the number of
tags shared by similar artists is indeed much larger than that shared by
randomly chosen artists.
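One way to obtain such a random-pair baseline is sketched below in Python. The
text does not state how the expectation was estimated, so this is an
assumption-laden illustration: it averages exhaustively over all ordered pairs
of distinct artists, divides the shared-tag count by the tag count of the
second artist of the pair, and uses a hypothetical function name and toy data.

```python
from itertools import permutations
from typing import Dict, Set, Tuple


def random_pair_baseline(tags: Dict[str, Set[str]]) -> Tuple[float, float]:
    """Average shared-tag count and relative score over all ordered artist pairs.

    Assumption: every ordered pair (a, b) of distinct artists is equally
    likely, and the relative score is |tags(a) & tags(b)| / |tags(b)|.
    """
    total_shared = 0
    total_ratio = 0.0
    n_pairs = 0
    for a, b in permutations(tags, 2):
        shared = len(tags[a] & tags[b])
        total_shared += shared
        if tags[b]:  # an artist without tags contributes a ratio of 0
            total_ratio += shared / len(tags[b])
        n_pairs += 1
    if n_pairs == 0:
        return 0.0, 0.0
    return total_shared / n_pairs, total_ratio / n_pairs


# Hypothetical toy data with three artists.
baseline_tags = {
    "a": {"rock", "pop"},
    "b": {"rock", "jazz"},
    "c": {"pop"},
}
expected_shared, expected_relative = random_pair_baseline(baseline_tags)
```

For large artist collections, sampling random pairs instead of enumerating all
of them would give an estimate of the same quantities at lower cost.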
[Figure 6.2: plot of the relative tag similarity (y-axis) against k (x-axis)
for k-NN; series: track-filtered data, normalized data, last.fm data.]
Figure 6.2. Relative tag similarity score for the 224 artists and their k
Nearest Neighbors.
6.3.4 Evaluating with Data from a Folksonomy
In earlier work (e.g. [Schedl et al., 2006; Pohle, Knees, Schedl, & Widmer, 2007;
Geleijnse & Korst, 2006b]), computed artist similarities were evaluated under
the assumption that two artists are similar when they share a genre. To the
best of our knowledge, only the tagging of artists with a single tag, usually a
genre name, has been addressed in the literature. In domains other than music,
the automatic creation of a list of tags from unstructured texts on multiple
web pages has likewise not been addressed.
As the Last.fm data proves to be reliable, we propose to use it as a ground
truth for evaluating algorithms that identify tags for artists and compute
artist similarity. The use of such a rich, user-based ground truth gives better
insight into the performance of the algorithm and makes it possible to study
the automatic labeling of artists with multiple tags. Moreover, by evaluating a
method using artists and the Last.fm ground truth, we gain insight into the
output of the method. High-quality output for the musical artist data may give
confidence in the method on domains that cannot be evaluated as easily.
A Dynamic Ground Truth Extraction Algorithm
As the perception of users changes over time, we propose a dynamic ground truth
to evaluate a populated ontology with tags and instances. In the evaluation
section of this chapter, we will use this evaluation method to evaluate
populated ontologies on the artists using Last.fm data. Moreover, an ontology
on books is similarly
evaluated using the social website LibraryThing.com.
For the evaluation of similarity of instances in
I