2012-03-17 84 views
2

How do I build a dictionary/list of queries with scores, then randomly pick some and add up their scores?

Suppose I have some .csv data like this:

query, score1, score2, score3 

kobe bryant,0,3,1, 
ccny,1,1,2, 
lego,3,1,0, 
disney,4,0,0, 
power rangers,2,0,2, 
britney spears,2,0,2, 
backstreet boys,2,1,1, 
soccer,3,0,1, 
justin beaver,2,0,2, 
new york knicks,2,1,1 

After adding them up, I would like to get scores something like this:

score1 = 10; score2 = 4; score3 = 18;

How do I go about splitting this up and adding the scores?

This is what I have so far:

import random 

def getScores(): 
    # open files to read 
    web = open("page.txt", "r"); 
    img = open("image.txt", "r"); 

    # scores for each search engine results 
    gScore = 0; 
    bScore = 0; 
    yScore = 0; 

    webDict = []; 
    imgDict = []; 

    # split by ',' 
    tmp = img.read().split(","); 
    for i in range(0, len(tmp)-4, 4):
        gScore = gScore + int(tmp[i+1]);
        bScore = bScore + int(tmp[i+2]);
        yScore = yScore + int(tmp[i+3]);

    print "gScore is: ", gScore, "\n"; 
    print "bScore is: ", bScore, "\n"; 
    print "yScore is: ", yScore, "\n"; 

    tmp = web.read().split(","); 
    for i in range(0, len(tmp)-4, 4):
        gScore = gScore + int(tmp[i+1]);
        bScore = bScore + int(tmp[i+2]);
        yScore = yScore + int(tmp[i+3]);

    print "gScore is: ", gScore, "\n";
    print "bScore is: ", bScore, "\n"; 
    print "yScore is: ", yScore, "\n"; 

if __name__ == "__main__": 
    getScores(); 

This adds up all the scores, but I can't work out how to build a dictionary from the data.

What I mean is something like this:

bigList = [ 'query':{score1:int, score2:int, score3:int}, 'query2':{score1:int, score2:int, score3:int}... and so on]; 
+0

@Marcin Edited with code and more details. – iCodeLikeImDrunk 2012-03-17 22:29:54

+2

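OK, now can you narrow it down to the task you don't know how to do (doing something with a dictionary?) and leave out the rest? Nobody wants to read through your homework, so asking a better question is a good way to get useful answers. – alexis 2012-03-17 22:42:23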

Answers

3

Once you have split the data on commas, it can easily be processed in a single statement:

gScore, bScore, yScore = [sum(map(int, scores))
                          for scores in (data[n::4] for n in range(1, 4))]

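The data[n::4] part takes every 4th item from the data, starting at the appropriate offset for each type of score. Each group is then converted to integers and summed.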
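To make the slicing concrete, here is a small Python 3 sketch of the same idea. It assumes the rows from the question have already been flattened into one comma-separated string with no header row and no trailing commas; the names `stuff`, `queries`, `g_score`, `b_score` and `y_score` are only illustrative.

```python
# Rows from the question, flattened into one string (assumed: no header, no trailing commas).
stuff = ('kobe bryant,0,3,1,ccny,1,1,2,lego,3,1,0,disney,4,0,0,'
         'power rangers,2,0,2,britney spears,2,0,2,backstreet boys,2,1,1,'
         'soccer,3,0,1,justin beaver,2,0,2,new york knicks,2,1,1')
data = stuff.split(',')

# Offset 0 holds the query text; offsets 1, 2 and 3 hold the three score columns.
queries = data[0::4]

# One sum per score column, unpacked into three names.
g_score, b_score, y_score = (sum(map(int, data[n::4])) for n in range(1, 4))

print(queries[:3])
print(g_score, b_score, y_score)
```

If the file is read with `read()` instead, the newlines and trailing commas would need to be stripped first, otherwise the fields no longer line up in groups of four.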

1

I would first split the string on commas:

stuff = 'kobe bryant,0,3,1,ccny,1,1,2,lego,3,1,0,disney,4,0,0,power rangers,2,0,2,britney spears,2,0,2,backstreet boys,2,1,1,soccer,3,0,1,justin beaver,2,0,2,new york knicks,2,1,1' 
parts = stuff.split(',') 

len(parts) should be a multiple of 4; otherwise you can raise an exception:

if len(parts)%4: 
    raise ValueError('bad csv') 

Then do something like:

d = {'score1': 0, 'score2': 0, 'score3': 0}
# Integer division, so range() gets an int on Python 3 as well as Python 2
for i in range(len(parts) // 4):
    d['score1'] += int(parts[4*i+1])
    d['score2'] += int(parts[4*i+2])
    d['score3'] += int(parts[4*i+3])

print(d)

I get

{'score1': 21, 'score2': 7, 'score3': 12} 
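The loop above gives the totals, but the question also asked for a per-query dictionary and for picking some queries at random. Here is one possible Python 3 sketch of that part, not a definitive implementation. It assumes the header row has already been removed and that rows may end with a trailing comma, as in the question's data; `load_scores`, `total_scores`, `raw` and `picked` are hypothetical names introduced for illustration.

```python
import csv
import io
import random

# Sample rows copied from the question (header row assumed to be removed already).
raw = """kobe bryant,0,3,1,
ccny,1,1,2,
lego,3,1,0,
disney,4,0,0,
power rangers,2,0,2,
britney spears,2,0,2,
backstreet boys,2,1,1,
soccer,3,0,1,
justin beaver,2,0,2,
new york knicks,2,1,1
"""

def load_scores(lines):
    """Map each query to a dict holding its three integer scores."""
    table = {}
    for row in csv.reader(lines):
        # Trailing commas produce empty fields; drop them and surrounding spaces.
        fields = [f.strip() for f in row if f.strip()]
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        query, s1, s2, s3 = fields
        table[query] = {'score1': int(s1), 'score2': int(s2), 'score3': int(s3)}
    return table

def total_scores(table, queries):
    """Add up each score column over the given queries."""
    totals = {'score1': 0, 'score2': 0, 'score3': 0}
    for q in queries:
        for key in totals:
            totals[key] += table[q][key]
    return totals

table = load_scores(io.StringIO(raw))      # an open file object works the same way
picked = random.sample(sorted(table), 3)   # three distinct queries chosen at random
print(picked, total_scores(table, picked))
print(total_scores(table, table))          # totals over every query
```

Keying the outer dictionary by query makes a single lookup such as `table['lego']` cheap, and `random.sample` guarantees the picked queries are distinct.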
+2

You don't need the loop: 'sum(parts[1::4])' and so on, or maybe even 'dict(("score%d" % n, sum(parts[i+1::4])) for i in (1, 2, 3))' (untested). – WolframH 2012-03-17 22:43:46

+1

Thanks for the reminder about this fancy extended slice notation. It comes in very handy sometimes. – 2012-03-17 22:51:58

+1

@WolframH My answer shows how to do it using only implicit loops :) – agf 2012-03-17 22:52:20
