192
Chapter 10
I
k-nearest neighbors
Here’s the distance between A and B, for example.
The distance between A and B is 1. You
can find the rest of the
distances, too.
The distance formula confirms what you saw visually: fruits A and B
are similar.
Suppose you’re comparing Netflix users, instead. You need some
way to graph the users. So, you need to convert
each user to a set of
coordinates, just as you did for fruit.
193
Building a recommendations system
Once you can graph users, you can measure the distance between them.
Here’s how you can convert users into a set of numbers.
When users
sign up for Netflix, have them rate some categories of movies based on
how much they like those categories.
For each user, you now have a set
of ratings!
Priyanka and Justin like Romance and hate Horror. Morpheus likes
Action but hates Romance (he hates when
a good action movie gets
ruined by a cheesy romantic scene). Remember how in oranges versus
grapefruit, each fruit was represented by a set of two numbers? Here,
each user is represented by a set of five numbers.
A mathematician would say, instead of calculating
the distance in two
dimensions, you’re now calculating the distance in
five
dimensions. But
the distance formula remains the same.