Why does df.apply(tuple) work but not df.apply(list)?

I have a dataframe `df` with three integer columns, A, B and C.

Using df.apply, I can get a single column whose values are tuples built from the columns of the original df:

out = df.apply(tuple, 1) 
print(out) 

0    (6, 2, -5)
1     (2, 5, 2)
2    (10, 3, 1)
3    (-5, 2, 8)
4     (3, 6, 2)
dtype: object

However, if I want a list of values instead of a tuple, I can't do the same thing, because it doesn't give me what I'd expect:

out = df.apply(list, 1) 
print(out) 

    A  B  C
0   6  2 -5
1   2  5  2
2  10  3  1
3  -5  2  8
4   3  6  2

Instead, I have to do this:

out = pd.Series(df.values.tolist()) 
print(out) 

0    [6, 2, -5]
1     [2, 5, 2]
2    [10, 3, 1]
3    [-5, 2, 8]
4     [3, 6, 2]
dtype: object

Why can't I use df.apply(list, 1) to get what I want?

Addendum

Timings for some possible workarounds:

df_test = pd.concat([df] * 10000, 0) 

%timeit pd.Series(df.values.tolist()) # original workaround 
10000 loops, best of 3: 161 µs per loop 

%timeit df.apply(tuple, 1).apply(list, 1) # proposed by Alexander 
1000 loops, best of 3: 615 µs per loop

Source

2017-08-28 cᴏʟᴅsᴘᴇᴇᴅ

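Odd behavior. 'df.apply(tuple, 1).apply(list)' as a workaround? – Alexander

@Alexander Possible, but slow. :( Added some timings. –

By the time you have a dataframe of list objects, you've pretty much given up any hope of speed and efficiency anyway... Note that '.apply' is just a wrapper around a Python for-loop, so you could simply write the for loop yourself using 'iterrows', which will probably be *faster* than any '.apply' approach. –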

The culprit is here. With func=tuple it works, but with func=list an exception is raised from within the compiled module lib.reduce:

ValueError: ('function does not reduce', 0)

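As you can see, they catch the exception but don't bother to handle it.

Even setting aside the overly broad except clause, this is a bug in pandas. You could try raising it on their tracker, but similar issues have already been closed with some flavor of "won't fix" or "duplicate".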
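The effect of a swallowed exception can be illustrated with a small pure-Python sketch. This is not the pandas source: `reduce_rows` and `apply_rows` are hypothetical names, and the rule that a list result "does not reduce" is an assumption chosen to mirror the symptom described above.

```python
def reduce_rows(func, rows):
    """Hypothetical fast path: refuses results that are lists."""
    results = []
    for i, row in enumerate(rows):
        value = func(row)
        if isinstance(value, list):
            # Assumed rule, mirroring the error message quoted above
            raise ValueError("function does not reduce", i)
        results.append(value)
    return results


def apply_rows(func, rows):
    """Hypothetical apply: try the fast path, silently fall back on failure."""
    try:
        return ("reduced", reduce_rows(func, rows))
    except Exception:
        pass  # the error is swallowed, so the caller never learns why
    # Fallback path: results are treated as rows of a table instead
    return ("table", [list(func(row)) for row in rows])


rows = [(6, 2, -5), (2, 5, 2)]
print(apply_rows(tuple, rows)[0])
print(apply_rows(list, rows)[0])
```

In this sketch the caller gets a differently shaped result for `list` than for `tuple`, with no error or warning to explain the difference.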

16321: weird behavior using apply() creating list based on current columns

15628: Dataframe.apply does not always return a Series when reduce=True

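That latter issue was closed, then reopened, and a few months ago it was converted into a docs enhancement request. It now seems to be used as a dumping ground for any related issues.

Presumably it's not a high priority because, as piRSquared commented (and one of the pandas maintainers commented the same), you're better off with a list comprehension: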

pd.Series([list(x) for x in df.itertuples(index=False)])

Typically apply would be used with a numpy ufunc or similar.

Source
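For reference, here is a self-contained sketch of the list-comprehension approach. The dataframe construction is an assumption, reconstructed from the output printed in the question; the rest uses only the calls already shown in this thread.

```python
import pandas as pd

# Assumed reconstruction of the question's dataframe, based on its printed output
df = pd.DataFrame({
    "A": [6, 2, 10, -5, 3],
    "B": [2, 5, 3, 2, 6],
    "C": [-5, 2, 1, 8, 2],
})

# One list per row, built with a plain list comprehension
out = pd.Series([list(x) for x in df.itertuples(index=False)])

# The question's original workaround, for comparison
alt = pd.Series(df.values.tolist())

print(out)
```

Both `out` and `alt` are object-dtype Series holding one list per row, so either can stand in for the result the question expected from df.apply(list, 1).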

2017-08-29 15:07:01 wim

Thanks a lot. I'll look into those links. –
