2016-03-08 63 views

pandas DataFrame won't reindex and transpose, returns NaN

I read the first 9 rows of a .csv into a DataFrame, which works fine:

invoice_desc = pd.read_csv('path', sep=',', nrows = 9, header=None) 

When printed, the DataFrame looks like this:

    0        1 
0   Bill to      /client/ 
1  Billing ID   xxxx-xxxx-xxxx-xxxx 
2 Invoice number      3359680287 
3  Issue date     31-Jan-2016 
4   Due Date     01-Mar-2016 
5   Currency       CURR 
6 Invoice subtotal     9,999,999.90 
7   VAT (0%)       0.00 
8  Amount due     9,999,999.90 

I now need to pick out certain rows, reindex them and transpose the result, so that I can insert it into a MySQL DB via to_sql():

i = ['invoiceNum', 'issueDate', 'dueDate', 'invoiceSubtotal'] 
invoice_desc2 = pd.DataFrame(invoice_desc.loc[[2, 3, 4, 8],], index = i) 
invoice_desc2.transpose() 

print invoice_desc2 

However, this code does reindex but does not retain the values, and produces this output when printed:

    0 1 
invoiceNum  NaN NaN 
issueDate  NaN NaN 
dueDate   NaN NaN 
invoiceSubtotal NaN NaN 

I have been reading about pandas indexing and slicing here, but I can't get it to work. What am I doing wrong? Thanks!


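When you 'reindex', it tries to find those labels in your existing index, and they don't exist. If you want to overwrite the index values, you should create the df like this: 'invoice_desc2 = pd.DataFrame(invoice_desc.loc[[2, 3, 4, 8],])', then overwrite the index: 'invoice_desc2.index = i', and then transpose – EdChum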
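The comment's point can be checked on a small frame. The following is a minimal sketch, assuming a stand-in invoice_desc built inline from the values in the question instead of being read from the CSV (the names subset, reindexed and relabelled are made up for illustration). It uses Python 3 print(), unlike the Python 2 snippets in the post. Passing index= together with an existing DataFrame aligns on labels, so labels that are not present give NaN, whereas assigning to .index only renames the rows.

```python
import pandas as pd

# Stand-in for the frame read from the CSV (values copied from the question).
invoice_desc = pd.DataFrame([
    ["Bill to", "/client/"],
    ["Billing ID", "xxxx-xxxx-xxxx-xxxx"],
    ["Invoice number", "3359680287"],
    ["Issue date", "31-Jan-2016"],
    ["Due Date", "01-Mar-2016"],
    ["Currency", "CURR"],
    ["Invoice subtotal", "9,999,999.90"],
    ["VAT (0%)", "0.00"],
    ["Amount due", "9,999,999.90"],
])

i = ["invoiceNum", "issueDate", "dueDate", "invoiceSubtotal"]
subset = invoice_desc.loc[[2, 3, 4, 8], :]

# Passing index= with an existing DataFrame aligns on row labels.
# None of the new labels exist among 2, 3, 4, 8, so every cell becomes NaN.
reindexed = pd.DataFrame(subset, index=i)

# Assigning to .index only relabels the rows; the values are untouched.
relabelled = subset.copy()
relabelled.index = i

print(reindexed)
print(relabelled)
```

The explicit .copy() is only there to keep the stand-in frame untouched; it is not required for the relabelling itself.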

Answers


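I think you can first select a subset of invoice_desc with loc, transpose it with T, and then change the columns to i. There is no need to create a new DataFrame via pd.DataFrame.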

print invoice_desc 
        0     1 
0   Bill to    \tclient 
1  Billing ID xxxx-xxxx-xxxx-xxxx 
2 Invoice number   3359680287 
3  Issue date   31-Jan-2016 
4   Due Date   01-Mar-2016 
5   Currency     CURR 
6 Invoice subtotal   9,999,999.90 
7   VAT (0%)     0.00 
8  Amount due   9,999,999.90 

invoice_desc2 = invoice_desc.loc[[2, 3, 4, 8],:] 
invoice_desc2 = invoice_desc2.T 
print invoice_desc2 
       2   3   4    8 
0 Invoice number Issue date  Due Date Amount due 
1  3359680287 31-Jan-2016 01-Mar-2016 9,999,999.90 

i = ['invoiceNum', 'issueDate', 'dueDate', 'invoiceSubtotal'] 
invoice_desc2.columns = i 
print invoice_desc2 
     invoiceNum issueDate  dueDate invoiceSubtotal 
0 Invoice number Issue date  Due Date  Amount due 
1  3359680287 31-Jan-2016 01-Mar-2016 9,999,999.90 

Or first set the index to i, and then transpose:

print invoice_desc 
        0     1 
0   Bill to    \tclient 
1  Billing ID xxxx-xxxx-xxxx-xxxx 
2 Invoice number   3359680287 
3  Issue date   31-Jan-2016 
4   Due Date   01-Mar-2016 
5   Currency     CURR 
6 Invoice subtotal   9,999,999.90 
7   VAT (0%)     0.00 
8  Amount due   9,999,999.90 

invoice_desc2 = invoice_desc.loc[[2, 3, 4, 8],:] 
i = ['invoiceNum', 'issueDate', 'dueDate', 'invoiceSubtotal'] 
invoice_desc2.index = i 
print invoice_desc2 
           0    1 
invoiceNum  Invoice number 3359680287 
issueDate   Issue date 31-Jan-2016 
dueDate    Due Date 01-Mar-2016 
invoiceSubtotal  Amount due 9,999,999.90 

print invoice_desc2.T 
     invoiceNum issueDate  dueDate invoiceSubtotal 
0 Invoice number Issue date  Due Date  Amount due 
1  3359680287 31-Jan-2016 01-Mar-2016 9,999,999.90
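Both variants keep the original label row (row 0) next to the value row (row 1). If only the values are meant to go to to_sql(), one possible follow-up is to select column 1 alone before transposing. This is a sketch under that assumption, not part of the answer above: the stand-in invoice_desc is built inline, the labels mapping and the name row are made up for illustration, and it is written for Python 3.

```python
import pandas as pd

# Stand-in for the frame read from the CSV (values copied from the question).
invoice_desc = pd.DataFrame([
    ["Bill to", "/client/"],
    ["Billing ID", "xxxx-xxxx-xxxx-xxxx"],
    ["Invoice number", "3359680287"],
    ["Issue date", "31-Jan-2016"],
    ["Due Date", "01-Mar-2016"],
    ["Currency", "CURR"],
    ["Invoice subtotal", "9,999,999.90"],
    ["VAT (0%)", "0.00"],
    ["Amount due", "9,999,999.90"],
])

# Hypothetical mapping from the original row labels to the new column names.
labels = {2: "invoiceNum", 3: "issueDate", 4: "dueDate", 8: "invoiceSubtotal"}

# Take only the value column (1) for the wanted rows, rename the row labels,
# turn the Series into a one-column frame and transpose it into one row.
row = invoice_desc.loc[list(labels), 1].rename(index=labels).to_frame().T

# The transposed row is labelled 1 (the old column name); reset it to 0.
row = row.reset_index(drop=True)

print(row)
```

The values are still strings at this point, so dates and amounts would need converting separately if the target table expects typed columns.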