2011-09-28 88 views
10

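I would like csv.DictReader to deduce the field names from the file. The docs say "If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as the fieldnames," but in my case the first row contains a title and the second row contains the names. How do I skip the pre-header line with csv.DictReader?

I can't apply next(reader) as suggested in "Python 3.2 skip a line in csv.DictReader", because the field names are assigned when the reader is initialized (or I'm doing it wrong).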

CanVec v1.1.0,,,,,,,,,^M 
Entity,Attributes combination,"Specification Code 
Point","Specification Code 
Line","Specification Code 
Area",Generic Code,Theme,"GML - Entity name 
Shape - File name 
Point","GML - Entity name 
Shape - File name 
Line","GML - Entity name 
Shape - File name 
Area"^M 
Amusement park,Amusement park,,,2260012,2260009,LX,,,LX_2260009_2^M 
Auto wrecker,Auto wrecker,,,2360012,2360009,IC,,,IC_2360009_2^M 

My code:

import csv

f = open(entities_table, 'rb')
try:
    dialect = csv.Sniffer().sniff(f.read(1024))
    f.seek(0)

    reader = csv.DictReader(f, dialect=dialect)
    print 'I think the field names are:\n%s\n' % (reader.fieldnames)

    # Print only the first 20 rows
    i = 0
    for row in reader:
        if i < 20:
            print row
            i = i + 1

finally:
    f.close()

Current result:

I think the field names are: 
['CanVec v1.1.0', '', '', '', '', '', '', '', '', ''] 

Desired result:

I think the field names are: 
['Entity','Attributes combination','"Specification Code Point"',...snip] 

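The csvfile (exported from Excel 2010, original source)

I realize it would be convenient to simply delete the first line and carry on, but I'm trying to stay as close as possible to only reading the data and to minimize manual intervention.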

Answers

1

I used islice from itertools. My header was on the last line of a large preamble. I skip past the preamble and use the header line as the field names:

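import csv
from itertools import islice

# `file` holds the path to the CSV file
with open(file, "r") as f:
    # Skip the preamble, counting lines until the header line is found
    n = 0
    for line in f:
        n += 1
        if 'same_field_name' in line:  # line with field names was found
            h = line.rstrip('\r\n').split(',')
            break

# Reopen the file and start reading right after the header line
f = islice(open(file, "r"), n, None)

reader = csv.DictReader(f, fieldnames=h)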
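A variation on the same idea, as a rough Python 3 sketch: splitting the header line on commas breaks when a header cell is quoted and spans several lines, as in the question's sample. Scanning with csv.reader instead lets the csv module parse the header row. The helper name `dict_reader_after_preamble` and the shortened sample data are made up for illustration.

```python
import csv
import io

# Shortened, made-up sample: a title row, then a header row with a
# quoted cell that contains an embedded newline, then one data row.
sample = (
    'CanVec v1.1.0,,\r\n'
    'Entity,"Specification Code\nPoint",Theme\r\n'
    'Amusement park,2260012,LX\r\n'
)


def dict_reader_after_preamble(f, known_field):
    """Return a DictReader that starts after the row containing known_field.

    Hypothetical helper: it assumes one field name is known in advance.
    """
    for row in csv.reader(f):
        if known_field in row:
            # csv.reader pulls lines from f only as needed, so f is now
            # positioned at the first data row.
            return csv.DictReader(f, fieldnames=row)
    raise ValueError("no header row containing %r" % known_field)


reader = dict_reader_after_preamble(io.StringIO(sample, newline=''), 'Entity')
rows = list(reader)
```

With a real file, the same helper would be passed a file object opened with `newline=''`.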
+0

This is a more flexible solution, as long as you know the name of one field for certain (a reasonable expectation). Thanks.

12

After f.seek(0), insert:

next(f)

so that the file pointer advances to the second line before the DictReader is initialized.
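As a minimal self-contained sketch of this approach (Python 3, with an in-memory file and shortened, made-up sample rows standing in for the real CSV), skipping one line before building the reader could look like this. Note that next(f) skips a single physical line, so this assumes the title row has no embedded newlines.

```python
import csv
import io

# Made-up stand-in for the real file: one title row, then the header row.
sample = (
    'CanVec v1.1.0,,\r\n'
    'Entity,Attributes combination,Theme\r\n'
    'Amusement park,Amusement park,LX\r\n'
    'Auto wrecker,Auto wrecker,IC\r\n'
)

f = io.StringIO(sample, newline='')
next(f)  # discard the title line so the header row comes first

reader = csv.DictReader(f)
fieldnames = reader.fieldnames  # taken from the first row the reader sees
rows = list(reader)
```

DictReader takes its field names from whatever row it reads first, so advancing the underlying file object beforehand is enough; the reader itself needs no special options.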

+0

D'oh! Of course. Thank you very much for your patience with a beginner.