2010-03-07 51 views
0

How do I load data from a file and store it using numpy?

I have a file like the following:

2 qid:1 1:0.32 2:0.50 3:0.78 4:0.02 10:0.90 
5 qid:2 2:0.22 5:0.34 6:0.87 10:0.56 12:0.32 19:0.24 20:0.55 
... 

I want to load it into a structure like the following:

output = []
rel = 2
qid = 1
features = {}  # the feature list "1:0.32 2:0.50 3:0.78 4:0.02 10:0.90"
output.append([rel, qid, features])
...

Could you write Python code for me to load the data? Thanks.

+4

It would be helpful if you described the desired output data structure. – mtrw 2010-03-07 09:42:40

Answers

1

For reading, use something like this (the data is in the file `fname`):

with open(fname) as f:
    for line in f:
        # split on any whitespace; this also drops the trailing space and newline
        elements = line.split()
        if not elements:
            continue  # skip blank lines
        rel = int(elements[0])
        qid = int(elements[1].split(':')[1])
        featurelist = elements[2:]
        # get the various features again with splitting at ':'
        # you get the idea ...
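
The remaining step, splitting each feature at ':', could be sketched as follows. This is only one possible completion of the loop above: the helper names `parse_line` and `load_file` are made up for illustration, and storing the features as a dict mapping integer index to float value is an assumption based on the `features = {}` in the question.

```python
def parse_line(line):
    """Parse one 'rel qid:N idx:val ...' line into [rel, qid, features]."""
    elements = line.split()
    rel = int(elements[0])
    qid = int(elements[1].split(':')[1])
    features = {}
    for item in elements[2:]:
        index, value = item.split(':')
        features[int(index)] = float(value)  # assumed: int index -> float value
    return [rel, qid, features]


def load_file(fname):
    """Read every non-blank line of fname into a list of [rel, qid, features]."""
    output = []
    with open(fname) as f:
        for line in f:
            if line.strip():
                output.append(parse_line(line))
    return output


sample = "2 qid:1 1:0.32 2:0.50 3:0.78 4:0.02 10:0.90 "
print(parse_line(sample))
# [2, 1, {1: 0.32, 2: 0.5, 3: 0.78, 4: 0.02, 10: 0.9}]
```

Calling `load_file(fname)` would then return one such entry per line of the file.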
0

The following should work nicely and give you your data in a convenient format:

import numpy as np

regexp = r"(\d+)\s+qid:(\d+)\s+(.+)"
data = np.fromregex(file_name, regexp,
                    dtype=[('rel', int), ('qid', int), ('features', object)])

From here, you can select rel, qid, or features by field name:

>>> data['rel'] 
array([2, 5]) 
>>> data['qid'] 
array([1, 2]) 
>>> data['features'] 
array(['1:0.32 2:0.50 3:0.78 4:0.02 10:0.90', 
     '2:0.22 5:0.34 6:0.87 10:0.56 12:0.32 19:0.24 20:0.55'], dtype=object) 
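
If a numeric array is wanted instead of the raw feature strings, one option is to expand them into a dense matrix. The sketch below is not part of the answer above: `features_to_matrix` is a hypothetical helper, and it assumes that the entries are `str` objects, that feature indices are 1-based, and that any index missing from a line means a value of 0.

```python
import numpy as np


def features_to_matrix(feature_strings, n_features=None):
    """Turn 'idx:val idx:val ...' strings into a dense 2-D float array."""
    parsed = []
    for s in feature_strings:
        pairs = [item.split(':') for item in s.split()]
        parsed.append([(int(i), float(v)) for i, v in pairs])
    if n_features is None:
        # assumed: the largest index seen defines the number of columns
        n_features = max(i for row in parsed for i, _ in row)
    matrix = np.zeros((len(parsed), n_features))
    for r, row in enumerate(parsed):
        for i, v in row:
            matrix[r, i - 1] = v  # assumed 1-based indices in the file
    return matrix


features = ['1:0.32 2:0.50 3:0.78 4:0.02 10:0.90',
            '2:0.22 5:0.34 6:0.87 10:0.56 12:0.32 19:0.24 20:0.55']
X = features_to_matrix(features)
print(X.shape)  # (2, 20)
```

With the structured array from above, this would be called as `features_to_matrix(data['features'])`. For data with many features and few non-zero values per line, a sparse representation would likely be a better fit than a dense array.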