2013-04-24 125 views
1

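Sorting a CSV file with Python/Linux commands

I need to sort a CSV file by its Temp5 column; the file format is shown below. In my specific case, the Temp5 column contains failed values. In other words, some cells hold no value and show only failed.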

So I need to sort on the values in Temp5 and ignore the failed values while sorting.

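I can either write a new CSV file or modify the existing one. I have looked into the csv module in Python and the sort command in Linux, but I could not find a solution. In the new or existing CSV file, the rows with sorted Temp5 values should come first, followed by the failed rows (that is, no row is lost, and the failed rows may appear in any order).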

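Effort: I tried writing Python code that stores the column I want to sort by as the key and the complete row as the value, then sorts the keys and looks up the data by key. The problem I am facing is that the result does not include the failed values. Please find below the function I wrote in Python.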

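    import csv

    csv_s_mt0 = csv.reader(open("data.csv", newline=""))
    s_mt0_map = {}
    s_mt1_map = {}
    line_escape = 0
    for line in csv_s_mt0:
        if line_escape > 3:
            print(line)
            print(line[4])
            # Rows are keyed by their Temp5 value, so rows that share a value
            # (every 'failed' row, for example) overwrite one another.
            s_mt0_map[line[4]] = line
        else:
            # skip the three header lines and the column-name row
            line_escape = line_escape + 1
    # the keys are strings, so this sort is lexicographic rather than numeric
    s_mt0_map_key = sorted(s_mt0_map)
    for key in s_mt0_map_key:
        print(s_mt0_map[key])

    print(len(s_mt0_map_key))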


$Header Information 
$Tool info=3 
.TITLE '*****************************************************' 
Temp1,Temp2,Temp3,Temp4,Temp5,Temp6,Temp6,Temp7,Temp8,Temp9 
0., failed, failed,-2.700e-10, 9.803e-11,-2.725e-11, 2.725e-11,-1.645e-06, -40.0000,1 
1.000e-12, failed, failed,-2.689e-10, 9.805e-11,-2.731e-11, 2.731e-11, 6.571e-08, -40.0000,1 
2.000e-12, failed, failed,-2.679e-10, 9.806e-11,-2.731e-11, 2.731e-11, 6.835e-08, -40.0000,1 
3.000e-12, failed, failed,-2.669e-10, 9.805e-11,-2.729e-11, 2.729e-11, 1.376e-07, -40.0000,1 
4.000e-12, failed, failed,-2.660e-10, 9.803e-11,-2.731e-11, 2.731e-11, 3.583e-08, -40.0000,1 
5.000e-12, failed, failed,-2.649e-10, 9.807e-11,-2.725e-11, 2.725e-11,-1.646e-06, -40.0000,1 
6.000e-12, failed, failed,-2.640e-10, 9.803e-11,-2.731e-11, 2.731e-11, 3.579e-08, -40.0000,1 
7.000e-12, failed, failed,-2.630e-10, 9.801e-11,-2.728e-11, 2.728e-11, 1.828e-07, -40.0000,1 
8.000e-12, failed, failed,-2.620e-10, 9.805e-11,-2.729e-11, 2.729e-11, 1.353e-07, -40.0000,1 
4.940e-10, failed, failed, 2.241e-10, failed, failed, failed, 0.8100, -40.0000,1 
4.950e-10, failed, failed, 2.251e-10, failed, failed, failed, 0.8100, -40.0000,1 
4.960e-10, failed, failed, 2.261e-10, failed, failed, failed, 0.8100, -40.0000,1 
4.970e-10, failed, failed, 2.271e-10, failed, failed, failed, 0.8100, -40.0000,1 
4.980e-10, failed, failed, 2.280e-10, failed, failed, failed, 0.8100, -40.0000,1 
4.990e-10, failed, failed, 2.291e-10, failed, failed, failed, 0.8100, -40.0000,1 
5.000e-10, failed, failed, 2.301e-10, failed, failed, failed, 0.8100, -40.0000,1 

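What have you tried? You will get a better response by posting what you have tried and asking for help with a specific problem, rather than posting a spec and saying "do my work for me". – 2013-04-24 06:24:22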

In which column do you want to ignore 'failed'? Just Temp5, I assume – jamylak 2013-04-24 06:27:18

Yes, but only while sorting. After the data is sorted, the failed values should be written out as well – user765443 2013-04-24 06:49:47

Answers

1

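The key function is used for the sort. It returns float('inf') for every 'failed', so those rows are all placed at the bottom of the list. Similarly, you could use float('-inf') if you wanted them all at the top.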

>>> import csv 
>>> import sys # to print to sys.stdout for this example 
>>> from itertools import islice 
>>> def key(n): 
     n = n.strip() # cells in the sample file carry leading spaces, e.g. ' failed'
     return float(n) if n != 'failed' else float('inf') 

>>> with open('data.csv') as f: 
     info = list(islice(f, 0, 3)) # first 3 lines 
     # the header repeats 'Temp6', so DictReader keeps only the last Temp6 value per row
     r = csv.DictReader(f) 
     w = csv.DictWriter(sys.stdout, r.fieldnames) 
     rows = sorted(r, key=lambda row: key(row['Temp5']))  
     sys.stdout.writelines(info) 
     w.writeheader() 
     w.writerows(rows) 


$HeaderInformation 
$Toolinfo=3 
.TITLE'*****************************************************' 
Temp1,Temp2,Temp3,Temp4,Temp5,Temp6,Temp6,Temp7,Temp8,Temp9  
7.000e-12,failed,failed,-2.630e-10,9.801e-11,2.728e-11,2.728e-11,1.828e-07,-40.0000,1  
0.,failed,failed,-2.700e-10,9.803e-11,2.725e-11,2.725e-11,-1.645e-06,-40.0000,1  
4.000e-12,failed,failed,-2.660e-10,9.803e-11,2.731e-11,2.731e-11,3.583e-08,-40.0000,1  
6.000e-12,failed,failed,-2.640e-10,9.803e-11,2.731e-11,2.731e-11,3.579e-08,-40.0000,1  
1.000e-12,failed,failed,-2.689e-10,9.805e-11,2.731e-11,2.731e-11,6.571e-08,-40.0000,1  
3.000e-12,failed,failed,-2.669e-10,9.805e-11,2.729e-11,2.729e-11,1.376e-07,-40.0000,1  
8.000e-12,failed,failed,-2.620e-10,9.805e-11,2.729e-11,2.729e-11,1.353e-07,-40.0000,1  
2.000e-12,failed,failed,-2.679e-10,9.806e-11,2.731e-11,2.731e-11,6.835e-08,-40.0000,1  
5.000e-12,failed,failed,-2.649e-10,9.807e-11,2.725e-11,2.725e-11,-1.646e-06,-40.0000,1  
4.940e-10,failed,failed,2.241e-10,failed,failed,failed,0.8100,-40.0000,1  
4.950e-10,failed,failed,2.251e-10,failed,failed,failed,0.8100,-40.0000,1  
4.960e-10,failed,failed,2.261e-10,failed,failed,failed,0.8100,-40.0000,1  
4.970e-10,failed,failed,2.271e-10,failed,failed,failed,0.8100,-40.0000,1  
4.980e-10,failed,failed,2.280e-10,failed,failed,failed,0.8100,-40.0000,1  
4.990e-10,failed,failed,2.291e-10,failed,failed,failed,0.8100,-40.0000,1  
5.000e-10,failed,failed,2.301e-10,failed,failed,failed,0.8100,-40.0000,1 
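
Because the sample header lists Temp6 twice, a dict-based reader cannot keep both of those columns apart. The following is a minimal sketch of the same idea using csv.reader and csv.writer with a column index instead. The names parse_cell and sort_csv, the default col=4 (Temp5) and the three-line preamble are assumptions made for illustration, not part of the original answer.

```python
import csv
import io


def parse_cell(cell):
    """Sort key: numeric cells first (by value), 'failed' cells last."""
    text = cell.strip()  # sample cells may carry leading spaces
    if text == "failed":
        return (1, 0.0)
    return (0, float(text))


def sort_csv(src, dst, col=4, preamble=3):
    """Copy src to dst with the data rows sorted by column index `col`."""
    info = [src.readline() for _ in range(preamble)]  # free-form lines before the header
    reader = csv.reader(src)
    header = next(reader)  # column-name row
    rows = sorted(reader, key=lambda row: parse_cell(row[col]))
    dst.writelines(info)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Shortened sample (first five columns only) taken from the question's data.
sample = (
    "$Header Information\n"
    "$Tool info=3\n"
    ".TITLE '*****'\n"
    "Temp1,Temp2,Temp3,Temp4,Temp5\n"
    "0., failed, failed,-2.700e-10, 9.803e-11\n"
    "4.940e-10, failed, failed, 2.241e-10, failed\n"
    "7.000e-12, failed, failed,-2.630e-10, 9.801e-11\n"
    "1.000e-12, failed, failed,-2.689e-10, 9.805e-11\n"
)
src = io.StringIO(sample)
dst = io.StringIO()
sort_csv(src, dst)
print(dst.getvalue())
```

With real files, src and dst would be file objects opened with newline="". The tuple key keeps every row, and since sorted() is stable, the failed rows stay in their original relative order at the end.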

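Thanks for the reply. After sorting the data, I need the failed values as well. The reason is that in such a row Temp5 contains failed, but the other columns of the same row contain values – user765443 2013-04-24 06:38:36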

I ran this program and ended up with the following error. Traceback (most recent call last): File "temp.py", line 10, in <module> rows = sorted(filtered_r, key=lambda row: float(row['Temp5'])) File "temp.py", line 10, in <lambda> rows = sorted(filtered_r, key=lambda row: float(row['Temp5'])) ValueError: could not convert string to float: failed – user765443 2013-04-24 07:40:42

@AbhishekGoswami please explain what you want to do with 'failed' – jamylak 2013-04-24 08:00:32

0

Try this:

>>> import csv 
>>> c = list(csv.reader(s)) # here s is the file with the headers skipped 
>>> final = [] 
>>> for row in c: 
...     row = [x.strip() for x in row] # strip the padding around each cell 
...     Temp5 = row[4] 
...     if Temp5 != 'failed': 
...         final.append(row) 
... 
>>> myprint.tabular(final) # just a pretty-printing function... 
+ --------- + ------ + ------ + ---------- + --------- + ---------- + --------- + ---------- + -------- + ----- + 
| Temp1 | Temp2 | Temp3 | Temp4 | Temp5 | Temp6 | Temp6 | Temp7 | Temp8 | Temp9 | 
+ --------- + ------ + ------ + ---------- + --------- + ---------- + --------- + ---------- + -------- + ----- + 
|  0. | failed | failed | -2.700e-10 | 9.803e-11 | -2.725e-11 | 2.725e-11 | -1.645e-06 | -40.0000 | 1 | 
| 1.000e-12 | failed | failed | -2.689e-10 | 9.805e-11 | -2.731e-11 | 2.731e-11 | 6.571e-08 | -40.0000 | 1 | 
| 2.000e-12 | failed | failed | -2.679e-10 | 9.806e-11 | -2.731e-11 | 2.731e-11 | 6.835e-08 | -40.0000 | 1 | 
| 3.000e-12 | failed | failed | -2.669e-10 | 9.805e-11 | -2.729e-11 | 2.729e-11 | 1.376e-07 | -40.0000 | 1 | 
| 4.000e-12 | failed | failed | -2.660e-10 | 9.803e-11 | -2.731e-11 | 2.731e-11 | 3.583e-08 | -40.0000 | 1 | 
| 5.000e-12 | failed | failed | -2.649e-10 | 9.807e-11 | -2.725e-11 | 2.725e-11 | -1.646e-06 | -40.0000 | 1 | 
| 6.000e-12 | failed | failed | -2.640e-10 | 9.803e-11 | -2.731e-11 | 2.731e-11 | 3.579e-08 | -40.0000 | 1 | 
| 7.000e-12 | failed | failed | -2.630e-10 | 9.801e-11 | -2.728e-11 | 2.728e-11 | 1.828e-07 | -40.0000 | 1 | 
| 8.000e-12 | failed | failed | -2.620e-10 | 9.805e-11 | -2.729e-11 | 2.729e-11 | 1.353e-07 | -40.0000 | 1 | 
+ --------- + ------ + ------ + ---------- + --------- + ---------- + --------- + ---------- + -------- + ----- +
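
The filter above drops the failed rows, while the question asks for them to be kept after the sorted ones. One possible way to extend it, sketched here under the assumption that the rows are already parsed into lists and the column-name row has been removed, is to partition the rows, sort only the numeric part and append the rest. The function name sort_keep_failed and its parameters are hypothetical.

```python
def sort_keep_failed(rows, col=4, marker="failed"):
    """Sort rows numerically by column `col`; rows holding `marker` go last."""
    numeric = [r for r in rows if r[col].strip() != marker]
    failed = [r for r in rows if r[col].strip() == marker]  # original order kept
    numeric.sort(key=lambda r: float(r[col]))
    return numeric + failed


# Three shortened rows (first five columns) from the question's data.
rows = [
    ["0.", "failed", "failed", "-2.700e-10", "9.803e-11"],
    ["4.940e-10", "failed", "failed", "2.241e-10", "failed"],
    ["7.000e-12", "failed", "failed", "-2.630e-10", "9.801e-11"],
]
result = sort_keep_failed(rows)
print([r[0] for r in result])  # ['7.000e-12', '0.', '4.940e-10']
```

No row is lost, and the failed rows could just as well be placed first by returning failed + numeric.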