2016-09-21 63 views
1

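Parsing CSV with Python

I have the following CSV file with three fields: Vulnerability Title, Vulnerability Severity Level, and Asset IP Address. They give the name of the vulnerability, its severity level, and the IP address of the affected asset. I am trying to print a report that lists each vulnerability in the first column, its severity in the second column, and in the last column the list of IP addresses that have that vulnerability.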

Vulnerability Title,Vulnerability Severity Level,Asset IP Address
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.103.64.10
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.103.64.10
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.103.65.10
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.103.65.164
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.103.64.10
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.10.30.81
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.10.30.81
TLS/SSL Server Supports RC4 Cipher Algorithms (CVE-2013-2566),4,10.10.50.82
TLS/SSL Server Supports Weak Cipher Algorithms,6,10.103.65.164
Weak Cryptographic Key,3,10.103.64.10
Unencrypted Telnet Service Available,4,10.10.30.81
Unencrypted Telnet Service Available,4,10.10.50.82
TLS/SSL Server Supports Anonymous Cipher Suites with no Key Authentication,6,10.103.65.164
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.103.64.10
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.103.65.10
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.103.65.100
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.103.65.164
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.103.65.164
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.103.64.10
TLS/SSL Server Supports The Use of Static Key Ciphers,3,10.10.30.81

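I want to reuse the Vulnerability Title as the key and create a new CSV file whose second column is the Vulnerability Severity Level and whose last column contains all the IP addresses that have the vulnerability.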

import csv
from pprint import pprint
from collections import defaultdict
import glob
x = glob.glob("/root/*.csv")

d = defaultdict()
n = defaultdict()
for items in x:
    with open(items, 'rb') as f:
        reader = csv.DictReader(f, delimiter=',')
        for row in reader:
            a = row["Vulnerability Title"]
            b = row["Vulnerability Severity Level"], row["Asset IP Address"]
            c = row["Asset IP Address"]
            # d = row["Vulnerability Proof"]
            d.setdefault(a, []).append(b)
    f.close()
pprint(d)
with open('results/ipaddress.csv', 'wb') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in d.items():
        for x, y in value:
            n.setdefault(y, []).append(x)
            # print x
            writer.writerow([key, n])

with open('results/ipaddress2.csv', 'wb') as csv2_file:
    writer = csv.writer(csv2_file)
    for key, value in d.items():
        n.setdefault(value, []).append(key)
        writer.writerow([key, n])

Since I could not explain this very well, let me try to simplify.

Let's say I have the following CSV:

Car model owner 
Honda Blue James 
Toyota Blue Tom 
Chevy Green James 
Chevy Green Tom 

I am trying to create a CSV that looks like this:

Car model owner 
Honda Blue James 
Toyota Blue Tom 
Chevy Green James,Tom 

The solutions are all correct. Here is my final script as well:

import csv 
import pandas as pd 

df = pd.read_csv('test.csv', names=['Vulnerability Title', 'Vulnerability Severity Level','Asset IP Address']) 
#print df 
grouped = df.groupby(['Vulnerability Title','Vulnerability Severity Level']) 

groups = grouped.groups 
#print groups 
new_data = [k + (v['Asset IP Address'].tolist(),) for k, v in grouped] 
new_df = pd.DataFrame(new_data, columns=['Vulnerability Title' ,'Vulnerability Severity Level', 'Asset IP Address']) 

print(new_df)
new_df.to_csv('final.csv') 

Thanks

+0

Could you give an example of the structure of the final CSV you are trying to create? That would be very helpful. –

+0

Thanks, mate. I have edited the question with more details. Let me know if I should add more information. –

+0

You're welcome, the last edit is especially good. Thanks. –

Answers

1

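When manipulating structured data, especially large data sets, I would suggest you use pandas.

For your problem, I will give you an example solution based on the pandas groupby function. Suppose you have the data: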

data = [['vt1', 3, '10.0.0.1'], ['vt1', 3, '10.0.0.2'], 
     ['vt2', 4, '10.0.10.10']] 

Manipulating data with pandas is very fancy:

import pandas as pd 

df = pd.DataFrame(data=data, columns=['title', 'level', 'ip']) 
grouped = df.groupby(['title', 'level']) 

Then

groups = grouped.groups 

will be a dictionary that is almost what you need.

print(groups) 
{('vt1', 3): [0, 1], ('vt2', 4): [2]} 

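[0, 1] are the row labels of the rows in that group. In fact, you can iterate over the groups to apply any operation you want. For example, if you want to save them to a CSV file: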

new_data = [k + (v['ip'].tolist(),) for k, v in grouped] 
new_df = pd.DataFrame(new_data, columns=['title', 'level', 'ips']) 

Let's see what new_df is now:

title level     ips 
0 vt1  3 [10.0.0.1, 10.0.0.2] 
1 vt2  4   [10.0.10.10] 

That is what you need. Finally, save it to a file:

new_df.to_csv(filename) 
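One thing to be aware of when saving: a column of Python lists is written to CSV as the text of each list (brackets and quotes included), and the sample data in the question contains repeated rows. The following is only a sketch of one possible variation, not part of the answer above: it joins each group's IPs into a single comma-separated string and drops repeats. The sample rows, the helper name `join_unique`, and the `ips` column name are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical sample data mirroring the answer's example, plus one duplicate row.
data = [['vt1', 3, '10.0.0.1'], ['vt1', 3, '10.0.0.2'],
        ['vt1', 3, '10.0.0.1'], ['vt2', 4, '10.0.10.10']]
df = pd.DataFrame(data, columns=['title', 'level', 'ip'])


def join_unique(ips):
    """Join IPs with commas, dropping repeats but keeping first-seen order."""
    return ','.join(dict.fromkeys(ips))


# One row per (title, level) pair, with the IPs collapsed into one string.
report = (df.groupby(['title', 'level'])['ip']
            .agg(join_unique)
            .reset_index(name='ips'))

# Write to an in-memory buffer here; pass a file path instead to save to disk.
buffer = io.StringIO()
report.to_csv(buffer, index=False)  # index=False omits the 0, 1, ... row labels
csv_text = buffer.getvalue()
print(csv_text)
```

Because the joined string contains commas, the CSV writer quotes that field, so the file still has three columns when it is read back.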

I strongly suggest you learn pandas for data processing. You may find it easier and cleaner.

1

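This answer uses your car example. Essentially, I build a dictionary keyed by car make whose values are two-element tuples. The first element of the tuple is the color and the second is the list of owners: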

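import csv

# Python 2 code: on Python 3, open the files in text mode with newline='' instead of 'rb'/'wb'.
car_dict = {}
with open('<file_to_read>', 'rb') as fi:
    reader = csv.reader(fi)
    for f in reader:
        if f[0] in car_dict:
            # Make already seen: add this owner to its list.
            car_dict[f[0]][1].append(f[2])
        else:
            # First row for this make: store (color, [owner]).
            car_dict[f[0]] = (f[1], [f[2]])

with open('<file_to_write>', 'wb') as ou:
    for k in car_dict:
        # One tab-separated line per make; owners are joined with commas.
        out_string = '{}\t{}\t{}\n'.format(k, car_dict[k][0], ','.join(car_dict[k][1]))
        ou.write(out_string)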
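For readers on Python 3, the same idea might look roughly like the sketch below. It is an illustration under assumptions rather than the answer's own code: the input is an in-memory, comma-separated copy of the car example (with real files you would use `open(path, newline='')`), the header row is passed through unchanged, and the output is written with `csv.writer` so that the joined owners field is quoted.

```python
import csv
import io

# Hypothetical input mirroring the question's car example (comma-separated).
source = io.StringIO(
    "Car,model,owner\n"
    "Honda,Blue,James\n"
    "Toyota,Blue,Tom\n"
    "Chevy,Green,James\n"
    "Chevy,Green,Tom\n"
)

reader = csv.reader(source)
header = next(reader)  # keep the header row out of the grouped data

car_dict = {}  # make -> (color, [owners]); dicts preserve insertion order
for make, color, owner in reader:
    if make in car_dict:
        car_dict[make][1].append(owner)
    else:
        car_dict[make] = (color, [owner])

target = io.StringIO()
writer = csv.writer(target, lineterminator="\n")
writer.writerow(header)
for make, (color, owners) in car_dict.items():
    writer.writerow([make, color, ",".join(owners)])

result = target.getvalue()
print(result)
```

As in the answer, the dictionary is keyed on the make alone, so a make that appears with two different colors would keep only the first color seen.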
+0

import csv import pandas as pd df = pd.read_csv('test.csv', names=['Vulnerability Title', 'Vulnerability Severity Level','Asset IP Address']) #print df grouped = df.groupby(['Vulnerability Title','Vulnerability Severity Level']) groups = grouped.groups #print groups new_data = [k + (v['Asset IP Address'].tolist(),) for k, v in grouped] new_df = pd.DataFrame(new_data, columns=['Vulnerability Title' ,'Vulnerability Severity Level', 'Asset IP Address']) print new_df new_df.to_csv('final.csv') –