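Open a CSV file, sort by a specific column, and overwrite the existing CSV

I've been stuck on this for a while. I'm trying to open a CSV, sort it by severity (Critical, High, Medium, Low), and then overwrite the existing file. I'd also like to skip the first row when writing, or add a header row.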

Original CSV

IP Address Severity Score 
10.0.0.1 High  2302 
172.65.0.1 Low   310 
192.168.0.1 Critical 5402 
127.0.0.1 Medium  1672

Modified/sorted CSV

IP Address Severity Score 
192.168.0.1 Critical 5402 
10.0.0.1 High  2302 
127.0.0.1 Medium  1672 
172.65.0.1 Low   310

Code

import csv 
crit_sev = "Critical" 
high_sev = "High" 
med_sev = "Medium" 
low_sev = "Low" 
reader = csv.reader(open('sample.csv', 'r')) 
row=0 
my_list = [] 
for row in reader: 
    if row[1] == crit_sev: 
     my_list.append(row) 
    elif row[1] == high_sev: 
     my_list.append(row) 
    elif row[1] == med_sev: 
     my_list.append(row) 
    elif row[1] == low_sev: 
     my_list.append(row) 

writer = csv.writer(open("sample.csv", 'w')) 
header = ['IP Address', 'Severity', 'Score'] 
writer.writerow([header]) 
for word in my_list: 
    writer.writerow([word])

Any help would be greatly appreciated.

Source

2017-01-23 cyber_raven

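"or add a header row" - what exactly do you mean by that? – DyZ

Why not open the CSV in Excel or something similar and sort it there? – TigerhawkT3

CSV == comma-separated values. I don't see any commas in your file, so that may be the first problem. Is it perhaps tab-delimited, or fixed-width? Fixed-width seems unlikely, because you would run out of room as soon as an IP address such as 192.168.0.254 showed up. The general idea is to read each record, classify it by severity, and store it in a data structure. Then, once you are done, write the data structure back out in severity order. –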
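The general idea in the comment above could be sketched roughly as follows. This is only an illustration, not the commenter's code: the function name `sort_by_severity`, the `delimiter` parameter, and the decision to silently drop rows with an unrecognised severity are all assumptions. It works on in-memory text, so nothing on disk is touched.

```python
import csv
import io

# Severity labels in the order they should appear in the output
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]

def sort_by_severity(text, delimiter=","):
    """Return delimited text with the data rows grouped in severity order.

    Hypothetical helper: rows whose second column is not a known
    severity are dropped (an assumption, not a requirement of the question).
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # keep the header row aside

    # One bucket per severity level; rows keep their original relative order
    buckets = {level: [] for level in SEVERITY_ORDER}
    for row in reader:
        if len(row) > 1 and row[1] in buckets:
            buckets[row[1]].append(row)

    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    for level in SEVERITY_ORDER:
        writer.writerows(buckets[level])
    return out.getvalue()

# Tab-delimited sample, assuming the file in the question uses tabs
sample = (
    "IP Address\tSeverity\tScore\n"
    "10.0.0.1\tHigh\t2302\n"
    "172.65.0.1\tLow\t310\n"
    "192.168.0.1\tCritical\t5402\n"
    "127.0.0.1\tMedium\t1672\n"
)
print(sort_by_severity(sample, delimiter="\t"))
```

If the real file turns out to be comma-separated, calling the helper with the default `delimiter` should be enough.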

You can do this with Python's csv library, as follows:

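import socket
import csv

# Map each severity label to its sort rank (lower sorts first)
severity = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

# Python 3: open in text mode with newline='' (on Python 2, use 'rb' and 'wb' instead)
with open('sample.csv', newline='') as f_input:
    csv_input = csv.reader(f_input)
    header = next(csv_input)  # keep the header row out of the sort
    # Sort by severity rank first, then by IP address
    data = sorted(csv_input, key=lambda x: (severity[x[1]], socket.inet_aton(x[0])))

with open('sample.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(header)
    csv_output.writerows(data)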

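This keeps the existing header and sorts the rows by the entries in the Severity column. It then also (optionally) sorts by IP address, which may or may not be useful to you, using socket.inet_aton() to convert each IP address into a packed form that sorts in numeric order.

For example: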

IP Address,Severity,Score 
10.168.0.1,Critical,5402 
192.168.0.1,Critical,5402 
10.0.0.1,High,2302 
127.0.0.1,Medium,1672 
172.65.0.1,Low,310
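To see how the two-part sort key behaves without touching a file, here is a small in-memory sketch. The `rows` list and the `sort_key` function name are made up for illustration; the key itself is the same tuple of severity rank and `socket.inet_aton()` result described above.

```python
import socket

severity = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

# Illustrative rows in arbitrary order: [IP Address, Severity, Score]
rows = [
    ["10.0.0.1", "High", "2302"],
    ["172.65.0.1", "Low", "310"],
    ["192.168.0.1", "Critical", "5402"],
    ["127.0.0.1", "Medium", "1672"],
    ["10.168.0.1", "Critical", "5402"],
]

def sort_key(row):
    # Severity rank decides the order first; the packed IP breaks ties
    return (severity[row[1]], socket.inet_aton(row[0]))

ordered = sorted(rows, key=sort_key)
for row in ordered:
    print(",".join(row))
```

The packed form matters because plain string comparison would place "10.0.0.1" before "9.0.0.1", whereas the packed addresses compare octet by octet.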

Source

2017-01-23 08:35:34

Thank you very much! –

Here is a pandas solution:

import pandas as pd 
# Read the CSV file 
data = pd.read_csv('sample.csv') 

# Configure the levels of severity 
levels = pd.Series({"Critical" : 0, "High" : 1, "Medium" : 2, "Low" : 3}) 
levels.name='Severity' 

# Add numeric severity data to the table 
augmented = data.join(levels,on='Severity',rsuffix='_') 

# Sort and select the original columns 
sorted_df = augmented.sort_values('Severity_')[['IP Address', 'Severity','Score']] 

# Overwrite the original file 
sorted_df.to_csv('sample.csv',index=False)

Source

2017-01-23 00:26:26 DyZ

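Do you need to define the individual levels because they are words rather than numbers? – TigerhawkT3

@TigerhawkT3 In theory, yes. But in this case the order of severity matches alphabetical order ('C'<'H'<'M'<'L'). – DyZ

'M'<'L'? Really? – TigerhawkT3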
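The point raised in the last comment is easy to check. The snippet below is a quick sketch (the variable names are arbitrary) showing that a plain alphabetical sort puts Low ahead of Medium, which is why an explicit rank mapping is used in the answers.

```python
levels = ["Critical", "High", "Medium", "Low"]  # intended severity order

# Plain alphabetical sort: 'L' comes before 'M'
alphabetical = sorted(levels)
print(alphabetical)  # ['Critical', 'High', 'Low', 'Medium']

# An explicit rank restores the intended order
rank = {name: position for position, name in enumerate(levels)}
by_severity = sorted(alphabetical, key=lambda name: rank[name])
print(by_severity)  # ['Critical', 'High', 'Medium', 'Low']
```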
