2017-02-23 126 views
0

I am trying to write code that takes data in the following format and turns it into a dictionary.

Example data:

[['12319825', '39274', {'pH': 8.1}], ['12319825', '39610', {'pH': 7.27}], 
['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}], 
['12319825', '39770', {'pH': 7.47, 'Escherichia coli': 27.0}], 
['12319825', '39967', {'pH': 8.36}], ['12319825', '39972', {'pH': 8.42}], 
['12319825', '39987', {'pH': 8.12, 'Escherichia coli': 8.0}], 
['12319825', '40014', {'pH': 8.12}], ['12319825', '40329',{'pH': 8.45}], 
['12319825', '40658', {'pH': 8.35, 'Escherichia coli': 6.3}], 
['12319825', '40686', {'pH': 8.17}], 
['12319825', '40714', {'pH': 8.13}], ['12319825', '40732', {'pH': 8.4}], 
['12319825', '40809', {'pH': 8.42}], 
['12319825', '40827', {'pH': 8.46}], 
['12319825', '41043', {'pH': 8.42, 'Escherichia coli': 170.0}], 
['12319825', '41071', {'pH': 8.24, 'Escherichia coli': 92.0}], 
['12319825', '41080', {'pH': 8.4}], 
['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}], ['12319825', '41134', {'pH': 8.67}]] 

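The code should return a dictionary whose keys are the pollutants (in this case, either pH or Escherichia coli) and whose values are what I call a DateList. A DateList is a list of tuples, one per data point, in the format (date, T/F). The boolean is True if the value is outside a given range, or above a given value (depending on the type of criterion):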

rangeCriteria={'pH':(5.0,9.0)}
convCriteria={'Escherichia coli':320}
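To make the two kinds of criteria concrete, here is a minimal sketch of how a single (date, flag) tuple could be computed for one row. The helper name `is_out_of_criteria` is hypothetical and is not part of the code in this question; the sketch assumes `rangeCriteria` holds (low, high) pairs and `convCriteria` holds a maximum allowed value.

```python
rangeCriteria = {'pH': (5.0, 9.0)}
convCriteria = {'Escherichia coli': 320}

def is_out_of_criteria(pollutant, value):
    """Hypothetical helper: True when value violates the pollutant's criterion."""
    if pollutant in rangeCriteria:
        low, high = rangeCriteria[pollutant]
        return not (low < value < high)
    if pollutant in convCriteria:
        return value > convCriteria[pollutant]
    # Assumption: a pollutant without any criterion is treated as an error.
    raise KeyError(pollutant)

# One row taken from the example data above.
row = ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}]
flags = {p: (row[1], is_out_of_criteria(p, v)) for p, v in row[2].items()}
print(flags)
```

For this row, pH falls inside the range and Escherichia coli exceeds its maximum, so only the latter is flagged.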

Right now, when I run the code below, each dictionary entry ends up with the values for both pollutants:

def testLocationForConv(DataFromLocation):
    # Checks whether a pollutant is outside of acceptable values.
    # A dictionary is created where each pollutant has a corresponding list of tuples
    # with the date and a boolean saying whether it is in or out of
    # the criteria (True if out, False if in).
    # It handles the case where the criterion is a minimum or a range rather than a
    # maximum.

    dateList=[]
    impairedList=[]
    overDict=dict()
    for date in DataFromLocation:
        for pollutant in date[2]:
            if pollutant in conventionalCriteriaList:
                dateList.append((date[1],date[2][pollutant]>convCriteria[pollutant]))
                overDict[pollutant]=dateList
            if pollutant in rangeCriteria:
                overDict[pollutant]=dateList
                dateList.append((date[1], (not (float(date[2][pollutant])>rangeCriteria[pollutant][0] and float(date[2][pollutant])<rangeCriteria[pollutant][1]))))
            #if pollutant in minCriteriaList:
            #    overDict[pollutant]=dateList
            #    dateList.append((date[1],date[2][pollutant]<minCriteria[pollutant]))
            else:
                pass
    print(overDict)

As it stands, the data points for both pollutants are added to every entry in the dictionary, giving the following result.

{'pH': [('39274', False), ('39610', False), ('39638', False), 
('39638', False), ('39770', False), ('39770', False), ('39967', False), 
('39972', False), ('39987', False), ('39987', False), ('40014', False), 
('40329', False), ('40658', False), ('40658', False), ('40686', False), 
('40714', False), ('40732', False), ('40809', False), ('40827', False), 
('41043', False), ('41043', False), ('41071', False), ('41071', False), 
('41080', False), ('41101', False), ('41101', True), ('41134', False)], 
'Escherichia coli': [('39274', False), ('39610', False), ('39638', False), 
('39638', False), ('39770', False), ('39770', False), ('39967', False), 
('39972', False), ('39987', False), ('39987', False), ('40014', False), 
('40329', False), ('40658', False), ('40658', False), ('40686', False), 
('40714', False), ('40732', False), ('40809', False), ('40827', False), 
('41043', False), ('41043', False), ('41071', False), ('41071', False), 
('41080', False), ('41101', False), ('41101', True), ('41134', False)]} 

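Now that I have typed this question out, I realize the problem is that I iterate over the dates and then over the pollutants, whereas I want to compile a separate list of dates for each pollutant. How would I build such lists and add them to the dictionary?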

+1

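After re-reading your post twice, I have mostly figured out what you are asking, but it would be much simpler, and easier on my head, if you just posted an example of the output you want. You also haven't posted complete code: for example, what is 'conventionalCriteriaList'?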

+0

So, the first item in each list is always thrown away?

+0

Also, executing 'overDict[pollutant] = dateList' every time is pointless... it is the exact same list each time. That is why the values in your dictionary are exactly the same...
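The point in the comment above can be reproduced in isolation. The sketch below is only an illustration with shortened, made-up variable names (`date_list`, `over_dict`); it shows that assigning one list to two keys stores two references to the same object, not two copies.

```python
date_list = []
over_dict = {}

# Both keys now refer to the very same list object.
over_dict['pH'] = date_list
over_dict['Escherichia coli'] = date_list

# Appending through any one reference is visible through all of them.
date_list.append(('39274', False))

shared = over_dict['pH'] is over_dict['Escherichia coli']
print(shared)
print(over_dict['Escherichia coli'])
```

Giving each pollutant its own list (for example `over_dict[pollutant] = []` the first time a pollutant is seen) avoids the sharing.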

Answer

0

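I would take a step back and think about your approach. You are making this harder on yourself than it needs to be. First, the data: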

In [3]: data = [['12319825', '39274', {'pH': 8.1}], ['12319825', '39610', {'pH': 
    ...: 7.27}], 
    ...: ['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}], 
    ...: ['12319825', '39770', {'pH': 7.47, 'Escherichia coli': 27.0}], 
    ...: ['12319825', '39967', {'pH': 8.36}], ['12319825', '39972', {'pH': 8.42}] 
    ...: , 
    ...: ['12319825', '39987', {'pH': 8.12, 'Escherichia coli': 8.0}], 
    ...: ['12319825', '40014', {'pH': 8.12}], ['12319825', '40329',{'pH': 8.45}], 
    ...: 
    ...: ['12319825', '40658', {'pH': 8.35, 'Escherichia coli': 6.3}], 
    ...: ['12319825', '40686', {'pH': 8.17}], 
    ...: ['12319825', '40714', {'pH': 8.13}], ['12319825', '40732', {'pH': 8.4}], 
    ...: 
    ...: ['12319825', '40809', {'pH': 8.42}], 
    ...: ['12319825', '40827', {'pH': 8.46}], 
    ...: ['12319825', '41043', {'pH': 8.42, 'Escherichia coli': 170.0}], 
    ...: ['12319825', '41071', {'pH': 8.24, 'Escherichia coli': 92.0}], 
    ...: ['12319825', '41080', {'pH': 8.4}], 
    ...: ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}],
    ...: ['12319825', '41134', {'pH': 8.67}]]

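When your boolean conditions are even slightly complicated, you should give them their own functions, if only for readability. Here I would go further and put those functions in a dictionary keyed by the corresponding pollutant, which will make your life much simpler!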

In [4]: def ecoli_threshold(value): return value > 320 

In [5]: def ph_range(value): return not (5 < value < 9) 

In [6]: test = {'Escherichia coli': ecoli_threshold, 'pH':ph_range} 
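As a quick sanity check of this lookup table, the sketch below repeats the three definitions so that it runs on its own and then calls the checks by pollutant name. The `checks` and `results` names are only for illustration; the values come from the example data.

```python
def ecoli_threshold(value):
    return value > 320

def ph_range(value):
    return not (5 < value < 9)

test = {'Escherichia coli': ecoli_threshold, 'pH': ph_range}

# (pollutant, value) pairs taken from the example data.
checks = [
    ('pH', 8.36),
    ('Escherichia coli', 560.0),
    ('Escherichia coli', 25.0),
]

# Look up the right function by pollutant name and call it.
results = [(name, value, test[name](value)) for name, value in checks]
for result in results:
    print(result)
```

Because the function is looked up by key, supporting another pollutant only needs one more function and one more dictionary entry.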

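The key issue tripping you up is that you are using a single list when you really need two. Initialize your dictionary with two empty lists, since you know you will be appending to them.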

In [7]: over_dict = {'Escherichia coli':[], 'pH':[]} 

Then iterate over the data:

In [8]: for entry in data: 
    ...:  for pollutant, value in entry[2].items(): 
    ...:   over_dict[pollutant].append((entry[1], test[pollutant](value))) 
    ...: 

Finally, the output:

In [9]: over_dict 
Out[9]: 
{'Escherichia coli': [('39638', False), 
    ('39770', False), 
    ('39987', False), 
    ('40658', False), 
    ('41043', False), 
    ('41071', False), 
    ('41101', True)], 
'pH': [('39274', False), 
    ('39610', False), 
    ('39638', False), 
    ('39770', False), 
    ('39967', False), 
    ('39972', False), 
    ('39987', False), 
    ('40014', False), 
    ('40329', False), 
    ('40658', False), 
    ('40686', False), 
    ('40714', False), 
    ('40732', False), 
    ('40809', False), 
    ('40827', False), 
    ('41043', False), 
    ('41071', False), 
    ('41080', False), 
    ('41101', False), 
    ('41134', False)]} 
+0

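Thank you very much for the feedback! The complication is that this code will have to handle many more pollutants, and not every location has every pollutant, so adding the lists by hand would be difficult. But I think I can work out an approach using these comments. Thanks!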

+0

@AmeliaMcClure Then your best bet is to use a 'defaultdict', and it should be relatively straightforward to extend the rest of the approach above.
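A minimal sketch of what that suggestion might look like, assuming the same `test` lookup table as in the answer. `collections.defaultdict(list)` creates an empty list the first time a pollutant is seen, so nothing has to be initialized by hand. Skipping pollutants that have no check function is an assumption made here, not something stated in the thread, and the 'Turbidity' reading is a hypothetical value added only to exercise that branch.

```python
from collections import defaultdict

def ecoli_threshold(value):
    return value > 320

def ph_range(value):
    return not (5 < value < 9)

test = {'Escherichia coli': ecoli_threshold, 'pH': ph_range}

data = [
    ['12319825', '39274', {'pH': 8.1}],
    # 'Turbidity' is a hypothetical reading, not in the original data.
    ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0, 'Turbidity': 3.0}],
]

over_dict = defaultdict(list)  # missing keys start out as empty lists
for _site, date, readings in data:
    for pollutant, value in readings.items():
        check = test.get(pollutant)
        if check is None:
            continue  # assumption: ignore pollutants without a criterion
        over_dict[pollutant].append((date, check(value)))

print(dict(over_dict))
```

Only pollutants that actually occur at a location and have a criterion end up as keys, so locations with different sets of pollutants need no special handling.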