Remove data from a JSON file based on its value

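I wrote a script to parse some BLAST files from different samples. Because I want to know which genes all the samples have in common, I created a list and a dictionary to count them. I also generated a JSON file from the dictionary. Now I want to remove the genes whose count is smaller than 100, since 100 is the number of samples, either from the dictionary or from the JSON file, but I don't know how to do it. This is part of the code: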

import json

# matches (list) and matches_counts (dict) are defined earlier in the script

# Build a dictionary with the genes and their repetitions
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else:
        matches_counts[extracted_gene] = 1
print(matches_counts)  # check point
#if matches_counts[extracted_gene] == 100:
#    print(extracted_gene)

# Write the dictionary to a txt file, formatted as JSON
with open('my_gene_extraction_trial.txt', 'w') as out_file:
    json.dump(matches_counts, out_file, sort_keys=True, indent=2, separators=(',', ':'))

print('Parsing has finished')

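I have tried different ways to do this: a) dropping the else statement, but then I get an empty dictionary; b) printing only the genes whose value is 100, but nothing is printed; c) reading the json documentation, but I can only see how to delete elements by object, not by value. Could anyone help me with this? It is driving me mad!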

Source

2017-08-11 Ana

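Not sure I understand your question, but... if 'x' is a dictionary of genes and 'y' is a dictionary of match counts: 'for gene in list(x): if y[gene] < 100: del x[gene]'. This removes the 'gene' entry from x. You can create a copy of x first so that the entries are not deleted from the original dictionary, in case you need them later. You will be left with x as a dictionary of genes with 100 or more matches. – illiteratecoder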

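No, I have a list, 'matches', that stores the genes, and a dictionary, 'matches_counts', that stores the genes and their counts. I want to remove the "extra genes" from the dictionary. – Ana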

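Make a copy of the dictionary 'matches_counts', let's call it 'copy'; 'for gene in matches_counts: if matches_counts[gene] < 100: del copy[gene]'. Now 'copy' is a dictionary of gene: matches, where matches >= 100. You can iterate over the gene names with 'copy.keys()'. – illiteratecoder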

This is what it should look like:

# matches (list) and matches_counts (dict) already defined
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else:
        matches_counts[extracted_gene] = 1

print(matches_counts)  # check point

# Create a copy of the dict of matches to remove items from
counts_100 = matches_counts.copy()

# Iterate over the original and delete from the copy, so the dict being
# iterated never changes size
for extracted_gene in matches_counts:
    if matches_counts[extracted_gene] < 100:
        del counts_100[extracted_gene]

print(counts_100)  # only genes with a count of 100 or more remain

Let me know if you still get errors.
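As an alternative to copying and deleting, the same filter can be written as a dictionary comprehension that builds a new dictionary containing only the entries to keep. The following is a minimal sketch, not the code from the question: the sample data and the name 'n_samples' are assumptions made for illustration, and the threshold of 100 is taken from the number of samples mentioned in the question.

```python
import json

# Hypothetical sample data standing in for the real matches_counts
matches_counts = {"geneA": 100, "geneB": 37, "geneC": 100, "geneD": 99}

# Assumed threshold: the number of samples mentioned in the question
n_samples = 100

# Build a new dict holding only the genes whose count is at least n_samples;
# the original matches_counts is left untouched
counts_100 = {gene: count for gene, count in matches_counts.items()
              if count >= n_samples}

# Serialize the filtered dictionary with the same options the question uses
filtered_json = json.dumps(counts_100, sort_keys=True, indent=2,
                           separators=(',', ':'))
print(filtered_json)
```

Because the comprehension never modifies the dictionary it iterates over, there is no risk of a "dictionary changed size during iteration" error, and no separate copy is needed.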

Source

2017-08-11 11:36:50 illiteratecoder
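If the JSON file has already been written, the json module has no way to delete entries from it in place; one common approach is to load the file, filter the resulting dictionary, and write it back. The sketch below assumes the file holds a single JSON object mapping gene names to counts, as produced by the json.dump call in the question. The helper name 'filter_json_counts' and the demo data are hypothetical.

```python
import json
import os
import tempfile


def filter_json_counts(path, minimum):
    """Rewrite the JSON file at path, keeping only entries with value >= minimum."""
    with open(path) as handle:
        counts = json.load(handle)
    kept = {gene: count for gene, count in counts.items() if count >= minimum}
    with open(path, 'w') as handle:
        json.dump(kept, handle, sort_keys=True, indent=2, separators=(',', ':'))
    return kept


# Demo on a temporary file with made-up counts
demo_path = os.path.join(tempfile.mkdtemp(), 'my_gene_extraction_trial.txt')
with open(demo_path, 'w') as handle:
    json.dump({"geneA": 100, "geneB": 12}, handle)

remaining = filter_json_counts(demo_path, 100)
print(remaining)
```

Note that this overwrites the file; writing to a different path instead would preserve the unfiltered counts.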
