2014-10-07 101 views

Python: compare 2 CSV files and copy a cell if another cell matches

File A:

id,desc,name 
12345,blah blah blah,jsmith 
6789,yada yada yada,ckast 
54321,yum yum yum,jpetersen 

File B:

key,id 
AB-873,6789 
CF-395,54321 
HG-713,12345 

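What I want to do is go through each row of File A, check whether its id matches a value in the id column of File B, and if it does, copy the 'name' cell over to File B. In the end, File B would look like this: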

AB-873,6789,ckast 
CF-395,54321,jpetersen 
HG-713,12345,jsmith 

I know the Python 'csv' module can read individual rows, but I don't know where to go from there. Thanks!


How big are the files? Will they both fit in memory? – dawg 2014-10-07 18:49:33

Answers

0

If you want something simple, this code will work for you:

# Read both files, skipping the header line of each.
with open('FileA', 'r') as file_a:
    a_lines = file_a.readlines()[1:]
with open('FileB', 'r') as file_b:
    b_lines = file_b.readlines()[1:]

# Read content of FileA into a table (list of lists).
a_table = []
for line in a_lines:
    a_table.append([w.strip() for w in line.split(',')])

# Read content of FileB into a dictionary.
# The 'id' field is the dictionary key, for simple look-up.
b_dict = {}
for line in b_lines:
    words = line.split(',')
    b_dict[words[1].strip()] = words[0].strip()

# Do the actual work and save the result (closed automatically by 'with').
with open('result', 'w') as file_result:
    for row in a_table:
        if row[0] in b_dict:
            file_result.write(b_dict[row[0]] + ',' + row[0] + ',')
            file_result.write(row[2] + '\n')

I tested this with your sample data.
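
Note that the loop above walks File A, so the result comes out in File A's row order rather than File B's. A minimal sketch of the reverse look-up, which keeps File B's order, might look like the following. Here `add_names` is a hypothetical helper name, and the rows are assumed to be already split into lists of stripped strings with the header rows removed.

```python
def add_names(a_rows, b_rows):
    """Append the matching File A name to each File B row.

    a_rows: [id, desc, name] lists; b_rows: [key, id] lists.
    File B rows with no matching id are skipped (an assumption;
    the question does not say what should happen to them).
    """
    # Map id -> name from File A for constant-time look-up.
    name_by_id = {row[0]: row[2] for row in a_rows}
    return [row + [name_by_id[row[1]]]
            for row in b_rows if row[1] in name_by_id]


a_rows = [['12345', 'blah blah blah', 'jsmith'],
          ['6789', 'yada yada yada', 'ckast'],
          ['54321', 'yum yum yum', 'jpetersen']]
b_rows = [['AB-873', '6789'], ['CF-395', '54321'], ['HG-713', '12345']]

for row in add_names(a_rows, b_rows):
    print(','.join(row))
```

With the sample data from the question, the rows come out in the order AB-873, CF-395, HG-713, matching the desired File B.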

0

With csv, you can do this:

import csv

# fn1 and fn2 are the paths to File A and File B; fn3 is the output path.
with open(fn1) as fa, open(fn2) as fb:
    r1, r2 = map(csv.reader, (fa, fb))
    # The first row of each file is its header.
    a_header, b_header = map(next, (r1, r2))
    # Store each file column-wise: {column name: list of values}.
    data_a, data_b = map(lambda header: {k: list() for k in header},
                         (a_header, b_header))
    for line in r1:
        for k, v in zip(a_header, line):
            data_a[k].append(v)
    for line in r2:
        for k, v in zip(b_header, line):
            data_b[k].append(v)

# Add a 'name' column to File B, looked up in File A by id.
b_header += ['name']
data_b['name'] = []
for e in data_b['id']:
    try:
        v = data_a['name'][data_a['id'].index(e)]
    except ValueError:
        v = None  # no matching id in File A
    data_b['name'].append(v)

# Write File B's columns back out row by row, header first.
with open(fn3, 'w') as fout:
    writer = csv.writer(fout)
    writer.writerow(b_header)
    for idx in range(len(data_b['id'])):
        writer.writerow([data_b[key][idx] for key in b_header])
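
On the file-size question raised in the comments: only the id-to-name mapping from File A needs to sit in memory, and File B can be streamed row by row. The following is a sketch of that idea with `csv.DictReader` and `csv.DictWriter`, assuming Python 3 and the column names from the question. `merge_csv` and its path parameters are hypothetical names, not part of either answer.

```python
import csv


def merge_csv(path_a, path_b, path_out):
    """Copy File B to path_out with a 'name' column looked up in File A."""
    # Only this id -> name mapping is held in memory.
    with open(path_a, newline='') as fa:
        name_by_id = {row['id']: row['name'] for row in csv.DictReader(fa)}

    with open(path_b, newline='') as fb, \
            open(path_out, 'w', newline='') as fout:
        reader = csv.DictReader(fb)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames + ['name'])
        writer.writeheader()
        for row in reader:  # File B is streamed one row at a time
            # Unmatched ids get an empty name (an assumption).
            row['name'] = name_by_id.get(row['id'], '')
            writer.writerow(row)
```

A call such as `merge_csv('FileA', 'FileB', 'result')` would write File B's rows in their original order, each with the name appended. Unlike `list.index`, the dictionary look-up does not rescan File A for every row of File B.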