I was able to run recommendations.py.
Euclidean Distance
I added the sim_distance function from the book but got a different result from the one printed there. It turns out the code is right and the printed result is wrong: if you do the math yourself according to the formula, you get 0.294298..., which is exactly what the code produced when I typed it into my version of recommendations.py. I spent a really long time trying to figure out why my code wasn't working, but it turns out the book was wrong :(
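For anyone who wants to check the number, here is a minimal sketch of the calculation. It is not the book's listing verbatim: the two ratings dictionaries are my assumption of what the book's sample data looks like for these two critics, and the function takes the square root of the summed squared differences before inverting, which is the version that gives 0.294298.

```python
from math import sqrt

# Assumed sample ratings for two critics (not copied from the post itself)
prefs = {
    'Lisa Rose': {
        'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
        'Just My Luck': 3.0, 'Superman Returns': 3.5,
        'You, Me and Dupree': 2.5, 'The Night Listener': 3.0,
    },
    'Gene Seymour': {
        'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
        'Just My Luck': 1.5, 'Superman Returns': 5.0,
        'The Night Listener': 3.0, 'You, Me and Dupree': 3.5,
    },
}

def sim_distance(prefs, person1, person2):
    # Items both people have rated
    shared = [item for item in prefs[person1] if item in prefs[person2]]
    # No ratings in common means no similarity
    if not shared:
        return 0
    # Sum of the squared rating differences over the shared items
    sum_of_squares = sum(
        (prefs[person1][item] - prefs[person2][item]) ** 2 for item in shared
    )
    # Add 1 before inverting so identical ratings do not divide by 0
    return 1.0 / (1 + sqrt(sum_of_squares))

score = sim_distance(prefs, 'Lisa Rose', 'Gene Seymour')
print(score)
```

With these ratings the squared differences add up to 5.75, so the score is 1/(1 + sqrt(5.75)).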
Pearson Correlation Score
Worked great, minus the hour lost to a stupid indentation error. It turns out most of my code had ended up inside the for loop. Lovely. I got the book answer for this one.
Recommendations
This part worked fine for me.
Manhattan Distance
I couldn't find any good websites on Manhattan distance. I swear they don't expect you to actually need to use it. At first I couldn't figure out what the x's and y's were or how you were supposed to add them up. After emailing Dr. Zacharski, though, I was able to implement it. This is what I came up with.
# Manhattan Distance Stuff
def manhattan(prefs,person1,person2):
    # Get the list of mutually rated items
    si={}
    for item in prefs[person1]:
        if item in prefs[person2]: si[item]=1
    # Find the number of shared items
    n=len(si)
    # If they have no ratings in common, return 0
    if n==0: return 0
    # Absolute difference in rating for each shared item
    mdists = [abs(prefs[person1][item] - prefs[person2][item])
              for item in si]
    # Doing the thing at the bottom of page 10 to make it between 0 and 1:
    # we add 1 before inverting so we don't divide by 0
    return 1.0/(1+sum(mdists))
I got a result of 0.1818181... for Lisa Rose and Gene Seymour, which I think is at least close to the right answer.
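As a sanity check on that number, here is the same arithmetic written out step by step. This is only a sketch: the two ratings dictionaries are my assumption of the book's sample data for these two critics, and the names lisa, gene, diffs, and total are just for illustration.

```python
# Assumed sample ratings for the two critics (not taken from the post itself)
lisa = {
    'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
    'Just My Luck': 3.0, 'Superman Returns': 3.5,
    'You, Me and Dupree': 2.5, 'The Night Listener': 3.0,
}
gene = {
    'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
    'Just My Luck': 1.5, 'Superman Returns': 5.0,
    'The Night Listener': 3.0, 'You, Me and Dupree': 3.5,
}

# The "x's and y's" are just the two people's ratings of the same item:
# take the absolute difference per shared item, then add them all up
diffs = {item: abs(lisa[item] - gene[item]) for item in lisa if item in gene}
total = sum(diffs.values())

# Same 1/(1 + distance) trick as above to map the distance into (0, 1]
similarity = 1.0 / (1 + total)

for item, diff in sorted(diffs.items()):
    print(item, diff)
print(total, similarity)
```

With these ratings the differences are 0.5, 0, 1.5, 1.5, 1.0, and 0, which add up to 4.5, so the similarity is 1/5.5 = 0.1818..., the same value the function returned.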