2016-08-02 41 views

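I have been using this code, and it sort of works. With the smaller 'rep_list', the first rep in the list, CP, is added correctly, but when the loop moves on to AM, AM's data overwrites CP's. So when I run this code, only the last person in the loop is actually saved. If I run the code with just "CP" and then again with just "AM", the data is appended as intended. Is something wrong with the for loop, or is this a problem with the workbook itself? How can I make this loop work properly when writing a pandas df to xlsx?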

import pandas as pd 
import datetime 
from openpyxl import load_workbook 

now = datetime.datetime.now() 
currentDate = now.strftime("%Y-%m-%d") 
call_report = pd.read_excel("Ending 2016-07-30.xlsx", "raw_data") 

#rep_list = ["CP", "AM", "JB", "TT", "KE"] 
rep_list = ["CP", "AM"] 

def call_log_reader(rep_name): 
    rep_log = currentDate + "-" + rep_name + ".csv" 
    df = pd.read_csv(rep_log) 
    df = df.drop(['From Name', 'From Number', 'To Name/Reference', 'To Number', 'Billing Code', 'Original Dialed Number', 
    'First Hunt Group', 'Last Hunt Group'], axis=1) 
    df['rep'] = rep_name 

    book = load_workbook('Ending 2016-07-30.xlsx') 
    writer = pd.ExcelWriter('Ending 2016-07-30.xlsx', engine='openpyxl') 
    writer.book = book 
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets) 
    df.to_excel(writer, "raw_data", index=False) 
    writer.save() 
    ## I tried adding this : writer.close() hoping it would close the book and then force it to reopen for the next rep in the loop but it doesn't seem to work. 

for rep in rep_list: 
    call_log_reader(rep) 

Thanks a lot!

Edit:

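Gaurav Dhama gave a very good answer. He pointed out that the pandas excelwriter (refer to this link) has a limitation, and proposed a solution in which each rep ends up with their own worksheet. That works, but after thinking it over I decided against the extra sheets and came up with the solution below, knowing the limitation exists. Basically, I append to a CSV instead of the actual XLSX file, then at the end open that CSV and write the one big list into the XLSX file. Either approach works; it just depends on what you want the final product to look like.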

import pandas as pd 
import datetime 
from openpyxl import load_workbook 

now = datetime.datetime.now() 
currentDate = now.strftime("%Y-%m-%d") 
call_report = "Ending 2016-07-30.xlsx" 
#rep_list = ["CP", "AM", "JB", "TT", "KE"] 
rep_list = ["CP", "AM"] 
csv_to_xl_files = [] 
merged_csv = currentDate + "-master.csv" 

def call_log_reader(rep_name): 
    rep_log = currentDate + "-" + rep_name + ".csv" 
    df = pd.read_csv(rep_log) 
    df = df.drop(['TimestampDetail', 'Billing Code', 'From Name', 'From Number', 'To Name/Reference', 'To Number', 
       'Original Dialed Number', 'First Hunt Group', 'Last Hunt Group'], axis=1) 
    df['rep'] = rep_name 
    #print (df.head(3)) 
    df.to_csv(merged_csv, mode='a', index=False, header=False) 
    csv_to_xl_files.append(rep_log) 

book = load_workbook(call_report) 
writer = pd.ExcelWriter(call_report, engine='openpyxl') 
writer.book = book 
writer.sheets = dict((ws.title, ws) for ws in book.worksheets) 

for rep in rep_list: 
    call_log_reader(rep) 

master_df = pd.read_csv(merged_csv) 
master_df.to_excel(writer, "raw_data", index=False) 
writer.save() 

#this csv_to_xl_files list isn't finished yet, basically I'm going to use it to delete the files from the directory as I don't need them once the script is run. 
print (csv_to_xl_files) 
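
For comparison, the same "collect everything first, write once" idea can be done in memory, without the intermediate master CSV. This is only a sketch, not the code from the post: `combine_rep_frames` is a hypothetical helper, and the two small frames stand in for the per-rep CSV files that `pd.read_csv` would load.

```python
import pandas as pd

def combine_rep_frames(frames_by_rep, drop_columns=()):
    """Stack per-rep frames into one frame, tagging each row with its rep."""
    tagged = []
    for rep_name, df in frames_by_rep.items():
        # errors="ignore" skips columns that a given export does not contain
        cleaned = df.drop(columns=list(drop_columns), errors="ignore")
        tagged.append(cleaned.assign(rep=rep_name))
    # ignore_index=True renumbers the rows of the combined frame from 0
    return pd.concat(tagged, ignore_index=True)

# Hypothetical stand-ins for pd.read_csv(currentDate + "-" + rep + ".csv")
frames = {
    "CP": pd.DataFrame({"Duration": [30, 45], "Billing Code": ["a", "b"]}),
    "AM": pd.DataFrame({"Duration": [12], "Billing Code": ["c"]}),
}
master_df = combine_rep_frames(frames, drop_columns=["Billing Code"])
print(master_df)
```

The combined `master_df` can then be written to the "raw_data" sheet in a single `to_excel` call, so nothing gets overwritten inside the loop and the column headers are kept.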

Answers


Try the following:

import pandas as pd 
import datetime 
from openpyxl import load_workbook 

now = datetime.datetime.now() 
currentDate = now.strftime("%Y-%m-%d") 
call_report = pd.read_excel("Ending 2016-07-30.xlsx", "raw_data") 

#rep_list = ["CP", "AM", "JB", "TT", "KE"] 
rep_list = ["CP", "AM"] 

def call_log_reader(rep_name): 
    rep_log = currentDate + "-" + rep_name + ".csv" 
    df = pd.read_csv(rep_log) 
    df = df.drop(['From Name', 'From Number', 'To Name/Reference', 'To Number', 'Billing Code', 'Original Dialed Number', 
    'First Hunt Group', 'Last Hunt Group'], axis=1) 
    df['rep'] = rep_name 
    df.to_excel(writer, sheet_name="raw_data" + rep_name, index=False)
    return df 

book = load_workbook('Ending 2016-07-30.xlsx') 
writer = pd.ExcelWriter('Ending 2016-07-30.xlsx', engine='openpyxl') 
writer.book = book 
writer.sheets = dict((ws.title, ws) for ws in book.worksheets) 

for rep in rep_list: 
    call_log_reader(rep) 

writer.save() 

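I just tried it a few different ways. If I put the code block above before the for loop, it runs without any errors, but the file only has a blank sheet for 'raw_data'. If I put the code block after the for loop that calls the function, it throws NameError: global name 'writer' is not defined and does not finish. – Mxracer888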


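I edited my answer and tested similar code. This should work for you. If it still does not work, please post a sample of "Ending 2016-07-30.xlsx", the file from which the variable call_report is read. –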


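Ahhh, there is an issue in the pandas excelwriter: you cannot write different dataframes to the same sheet this way, because each one overwrites the previous one. See this [link](https://github.com/pydata/pandas/issues/3441). I have edited the code again to generate multiple worksheets. –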
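
For readers on a newer pandas, one way around this limitation is to open the workbook in append mode and start writing below the rows already on the sheet. This is a rough sketch rather than anything from the thread: it assumes pandas 1.4 or later (for `if_sheet_exists="overlay"`) with openpyxl installed, and that the target sheet already exists and has a header row. `append_to_sheet` and the demo workbook are hypothetical.

```python
import os
import tempfile

import pandas as pd

def append_to_sheet(path, df, sheet_name="raw_data"):
    """Write df below the rows already present on sheet_name in the workbook at path."""
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        # openpyxl's max_row is the 1-based index of the last used row,
        # which equals the 0-based index of the next free row for startrow.
        startrow = writer.book[sheet_name].max_row
        df.to_excel(writer, sheet_name=sheet_name, startrow=startrow,
                    index=False, header=False)

# Demo on a throwaway workbook: one existing row for CP, then append AM.
workbook = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
pd.DataFrame({"Duration": [30], "rep": ["CP"]}).to_excel(
    workbook, sheet_name="raw_data", index=False)
append_to_sheet(workbook, pd.DataFrame({"Duration": [12], "rep": ["AM"]}))

result = pd.read_excel(workbook, sheet_name="raw_data")
print(result)
```

Calling `append_to_sheet` once per rep inside the loop keeps everything on the single "raw_data" sheet, at the cost of reopening the workbook on every call.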