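Adding to an int value using pre-existing dictionaries
I'm trying to set up a function that calculates a similarity score for two films. The existing dictionaries use the film as the key, and the director, genre, or a lead actor as the value. There are three actor dictionaries (the three lead actors of each film are listed). The code mostly works, but sometimes I get a larger result than I should.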
# create a two-variable function to determine the FavActor similarity score:
def FavActorFunction(film1, film2):
    # set the result of the FavActor formula between two films to a default of 0.
    FavActorScore = 0
    # add 3 to the similarity score if the films have the same director.
    if direct[film1] == direct[film2]:
        FavActorScore += 3
    # add 2 to the similarity score if the films are in the same genre.
    if genre[film1] == genre[film2]:
        FavActorScore += 2
    # add 5 to the similarity score for each actor they have in common.
    if actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    if actor2[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    if actor3[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    # return the resulting score.
    return FavActorScore
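My assumption is that it counts some things twice when tallying the actors the films have in common. Is there a way to modify this part of the code to get a more accurate result?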
if actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
    FavActorScore += 5
if actor2[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
    FavActorScore += 5
if actor3[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
    FavActorScore += 5
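A likely cause is operator precedence rather than double counting: Python reads `x == a or b or c` as `(x == a) or b or c`, so any non-empty string in `b` or `c` makes the condition true, and each of the three checks adds 5 whether or not the actors match. Below is a minimal sketch of one possible fix that tests membership instead. The sample dictionaries and the film, director, and actor names are made up for illustration and only mimic the shape described in the question.

```python
# Hypothetical sample data in the same shape as the question's dictionaries.
direct = {"Film A": "Director X", "Film B": "Director Y"}
genre = {"Film A": "Drama", "Film B": "Drama"}
actor1 = {"Film A": "Actor 1", "Film B": "Actor 4"}
actor2 = {"Film A": "Actor 2", "Film B": "Actor 1"}
actor3 = {"Film A": "Actor 3", "Film B": "Actor 5"}


def FavActorFunction(film1, film2):
    FavActorScore = 0
    # add 3 if the films have the same director.
    if direct[film1] == direct[film2]:
        FavActorScore += 3
    # add 2 if the films are in the same genre.
    if genre[film1] == genre[film2]:
        FavActorScore += 2
    # gather film2's three actors once, then test each of film1's
    # actors for membership; add 5 per actor that appears in both.
    film2_actors = (actor1[film2], actor2[film2], actor3[film2])
    for actor in (actor1[film1], actor2[film1], actor3[film1]):
        if actor in film2_actors:
            FavActorScore += 5
    return FavActorScore


print(FavActorFunction("Film A", "Film B"))  # prints 7
```

With this sample data the two films share a genre (+2) and one actor (+5), so the score is 7. The original conditions would have added 5 three times.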
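I really, *really* have to ask: where do these silly data structures come from? The films should be dictionaries, and the attributes should be keys (and the actors should be in a sequence or a set).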
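The layout suggested in the comment could look roughly like the sketch below: one dictionary per film keyed by attribute, with the actors stored in a set so that shared actors fall out of a set intersection. The `films` dictionary, the `similarity` function name, and all the sample values are assumptions made up for illustration; they do not come from the original data.

```python
# Hypothetical data: each film is a dictionary keyed by attribute,
# and the actors are stored in a set.
films = {
    "Film A": {
        "director": "Director X",
        "genre": "Drama",
        "actors": {"Actor 1", "Actor 2", "Actor 3"},
    },
    "Film B": {
        "director": "Director Y",
        "genre": "Drama",
        "actors": {"Actor 4", "Actor 1", "Actor 5"},
    },
}


def similarity(film1, film2):
    a, b = films[film1], films[film2]
    score = 0
    # 3 points for a shared director, 2 for a shared genre.
    if a["director"] == b["director"]:
        score += 3
    if a["genre"] == b["genre"]:
        score += 2
    # 5 points per actor present in both films (set intersection).
    score += 5 * len(a["actors"] & b["actors"])
    return score


print(similarity("Film A", "Film B"))  # prints 7
```

Because sets hold each actor at most once, an actor cannot be counted twice, and the function no longer depends on which of three separate dictionaries an actor happens to be listed in.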