2014-11-24 31 views

Is there Python code to concatenate values into one cell in Excel?

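The image shows the contents of a CSV file opened in Excel. The column on the right (ID) contains duplicate IDs, each paired with a different symbol (on the left). Is it possible to write code that finds the duplicate IDs and then joins their different symbols into a single cell to the left of the ID?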

Before:
aaa | 1
bbb | 1
ccc | 2

After:
aaa,bbb | 1
ccc | 2

So far, I have written this:
import win32com.client, csv, os, string

# Office 2010 - Microsoft Office 14.0 Object Library
from win32com.client import gencache
gencache.EnsureModule('{2DF8D04C-5BFA-101B-BDE5-00AA0044DE52}', 0, 2, 5)

# Office 2010 - Excel COM
gencache.EnsureModule('{00020813-0000-0000-C000-000000000046}', 0, 1, 7)

Application = win32com.client.Dispatch("Excel.Application")
Application.Visible = True
Workbook = Application.Workbooks.Add()
Sheet = Application.ActiveSheet

columnA = []
columnB = []
columnC = []
with open("gene_test.csv", newline="") as f:
    data = csv.reader(f)
    # Copy each CSV row into the sheet and remember the column values.
    for count, i in enumerate(data, start=1):
        print(i)
        Sheet.Range("A" + str(count)).Value = i[0]
        Sheet.Range("B" + str(count)).Value = i[1]
        Sheet.Range("C" + str(count)).Value = i[2]
        columnA.append(i[0])
        columnB.append(i[1])
        columnC.append(i[2])

# Print every value in column A that occurs more than once.
for x in columnA:
    if columnA.count(x) > 1:
        print(x)

Answer

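This takes the input and 'uniquifies' it on the second column: rows that share an ID are merged into one row whose first field is the comma-joined list of symbols.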

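#!/usr/bin/env python

import csv

# Map each ID (second column) to the list of symbols (first column) seen with it.
symbols_by_id = {}
with open('gene_test.csv', newline='') as f:
    reader = csv.reader(f)
    for line in reader:
        symbols_by_id.setdefault(line[1], []).append(line[0])

# Write one row per ID: the comma-joined symbols, then the ID.
with open('out_gene_test.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='|')
    for key in symbols_by_id:
        writer.writerow([','.join(symbols_by_id[key]), key])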
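
If the real file contains blank lines, stray whitespace, or the same symbol listed twice for one ID, the grouping step may need to be a little more defensive. The following is only a sketch of one way to do that; the function name `group_symbols` and its `key_col` / `value_col` parameters are made up for illustration and are not part of the answer above.

```python
import csv
from collections import defaultdict


def group_symbols(rows, key_col=1, value_col=0):
    """Group the value column by the key column, keeping first-seen order."""
    grouped = defaultdict(list)
    for row in rows:
        if len(row) <= max(key_col, value_col):
            continue  # skip blank or short rows
        key = row[key_col].strip()
        value = row[value_col].strip()
        if value not in grouped[key]:  # do not repeat a symbol for one ID
            grouped[key].append(value)
    return dict(grouped)


# csv.reader accepts any iterable of lines, so a list works for a quick check.
sample = ["aaa,1", "bbb,1", "", "ccc,2", "aaa,1"]
result = group_symbols(csv.reader(sample))
print(result)
```

For a file with the ID in a different column (the question's code reads three columns), the two column indexes would have to be adjusted accordingly.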

Input file:

$ cat gene_test.csv 
aaa,1 
bbb,1 
ccc,2 

Output file:

$ cat out_gene_test.csv 
aaa,bbb|1 
ccc|2
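
Since the question is about getting the joined symbols into one Excel cell, it may be more convenient to keep the default comma delimiter instead of `|`. The `csv` writer then quotes any field that itself contains a comma, and spreadsheet programs normally treat a quoted field as a single cell. A rough sketch, working on in-memory text so it can be tried without any files (the helper name `merge_to_excel_csv` is hypothetical):

```python
import csv
import io


def merge_to_excel_csv(text):
    """Merge rows sharing an ID and return comma-delimited CSV text."""
    symbols_by_id = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        symbols_by_id.setdefault(row[1].strip(), []).append(row[0].strip())

    out = io.StringIO()
    writer = csv.writer(out)  # default ',' delimiter with minimal quoting
    for key, symbols in symbols_by_id.items():
        writer.writerow([",".join(symbols), key])
    return out.getvalue()


merged = merge_to_excel_csv("aaa,1\nbbb,1\nccc,2\n")
print(merged)
```

Here the first output row is written as `"aaa,bbb",1`, with the quotes keeping the two symbols together in one field.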