Writing multiple header lines with pandas.DataFrame.to_csv

I am putting my data into NASA's ICARTT format for archiving. This is a comma-delimited file with multiple header lines, and the header lines themselves contain commas. For example:

46, 1001 
lastname, firstname 
location 
instrument 
field mission 
1, 1 
2011, 06, 21, 2012, 02, 29 
0 
Start_UTC, seconds, number_of_seconds_from_0000_UTC 
14 
1, 1 
-999, -999 
measurement name, units 
measurement name, units 
column1 label, column2 label, column3 label, column4 label, etc.

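I have to make a separate file for each day of data collection, so I will end up creating around thirty files. When I create a CSV file via pandas.DataFrame.to_csv, I cannot (as far as I know) write the header lines to the file before the data, so I have had to trick it into doing so via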

# assuming <df> is a pandas dataframe 
df.to_csv('dst.ict',na_rep='-999',header=True,index=True,index_label=header_lines)

where "header_lines" is the header string.

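This gives me exactly what I want, except that "header_lines" ends up enclosed in double quotes. Is there a way to write text to the head of a CSV file using to_csv, or to remove the double quotes? I have already tried setting quotechar='' and doublequote=False in to_csv(), but the double quotes still appear. What I am doing now (it works, but I would like to move to something better) is simply opening a file via open('dst.ict', 'w') and printing to it line by line, which is slow.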
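For reference, the quoting can be reproduced on a tiny frame. This is a minimal sketch with made-up data: the CSV writer quotes any field that contains the delimiter or a line break, and the text passed as index_label is treated as one such field.

```python
import pandas as pd

# Hypothetical one-row frame, used only to reproduce the behaviour.
df = pd.DataFrame({'a': [1]})

# Two header lines packed into a single string, as in the workaround above.
header_lines = 'lastname, firstname\nlocation'

# With no path given, to_csv returns the CSV text as a string.
text = df.to_csv(index_label=header_lines)

# The whole header string comes back wrapped in double quotes because
# it contains a comma and a newline.
print(text)
```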

Source

2014-11-21 tnknepp

Actually, you can write header lines before the data. pandas.DataFrame.to_csv takes path_or_buf as its first argument, not just a path:

pandas.DataFrame.to_csv(path_or_buf, *args, **kwargs)

path_or_buf: string or file handle, default None

File path or object; if None is provided, the result is returned as a string.
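A minimal sketch of the two behaviours described above, using a small made-up frame: passing None returns the CSV as a string, while passing an already open file-like object (here an in-memory io.StringIO, chosen only for illustration) writes the data after whatever the object already contains.

```python
import io

import pandas as pd

# Hypothetical two-row frame, used only for illustration.
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# path_or_buf=None: the CSV text is returned instead of written.
csv_text = df.to_csv(None, index=False)

# path_or_buf=file-like object: pandas writes at the current position,
# so anything written beforehand stays in front of the data.
buf = io.StringIO()
buf.write('free-form header line\n')
df.to_csv(buf, index=False)
buffer_text = buf.getvalue()
```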

Here is an example:

#!/usr/bin/env python3

import sys

import numpy as np
import pandas as pd

# Make an example data frame.
df = pd.DataFrame(np.random.randint(100, size=(5, 5)),
                  columns=['a', 'b', 'c', 'd', 'e'])

# Join the header lines without a trailing newline: pandas starts its
# own header row with an empty index label, so ",a,b,c,d,e" is appended
# directly to the last line below.
header = '\n'.join([
    '1001',
    'Daedalus, Stephen',
    'Dublin, Ireland',
    'Keys',
    'MINOS',
    '1,1',
    '1904,06,16,1922,02,02',
    'time_since_8am',  # Ends up being the header name for the index.
])

# newline='' leaves line endings to pandas, as its documentation
# recommends when passing an open file object.
with open(sys.argv[1], 'w', encoding='utf-8', newline='') as ict:
    # Write the header lines, including the index variable for
    # the last one if you're letting Pandas produce that for you.
    # (see above).
    ict.write(header)

    # Just write the data frame to the file object instead of
    # to a filename. Pandas will do the right thing and realize
    # it's already been opened.
    df.to_csv(ict)

The result is exactly what you wanted: the header lines are written first, then the .to_csv() call writes the data:

$ python example.py test && cat test 
1001 
Daedalus, Stephen 
Dublin, Ireland 
Keys
MINOS
1,1
1904,06,16,1922,02,02
time_since_8am,a,b,c,d,e 
0,67,85,66,18,32 
1,47,4,41,82,84 
2,24,50,39,53,13 
3,49,24,17,12,61 
4,91,5,69,2,18
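Since the question involves roughly thirty daily files, the same pattern can be wrapped in a small helper. This is only a sketch under stated assumptions: write_with_header, its arguments, and the sample data are hypothetical names made up for illustration, and the header lines are a shortened stand-in for a real ICARTT header. It passes index_label explicitly, so every header line can end with its own newline instead of relying on the empty index label.

```python
import os
import tempfile

import pandas as pd


def write_with_header(df, path, header_lines, index_label, na_rep='-999'):
    """Write free-form header lines, then the frame, to one file.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    with open(path, 'w', newline='') as fh:
        # Header lines are written verbatim, so embedded commas are
        # never quoted.
        for line in header_lines:
            fh.write(line + '\n')
        # pandas continues writing where the header left off.
        df.to_csv(fh, na_rep=na_rep, index_label=index_label)


# Made-up data standing in for one day of measurements.
df = pd.DataFrame({'column1': [1.0, None], 'column2': [3.0, 4.0]})
header_lines = ['lastname, firstname', 'location', 'instrument']

out_path = os.path.join(tempfile.mkdtemp(), 'dst.ict')
write_with_header(df, out_path, header_lines, index_label='Start_UTC')

# Read the file back to inspect what was written.
with open(out_path) as fh:
    written = fh.read().splitlines()
```

Calling the helper once per day inside a loop would produce one file per day, with the missing value written as -999 as in the question.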

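Sorry if this comes too late to be useful. I work on archiving these files (and use Python), so feel free to drop me a line if you have further questions.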

Source

2014-12-24 09:23:56 ndt

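Not too late, since I can update my code now! Why I did not realize I could pass a buffer (I must have read that line 100 times), I do not know. Thanks for pointing it out! – tnknepp 2015-01-02 14:19:46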
