Frequency of occurrence of a list's values in each row of a pandas DataFrame

Suppose I have a list of 6 integers named 'base' and a DataFrame with 100,000 rows and 6 columns of integers.

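I need to create an additional column that shows, for each row of the DataFrame, how often the values from the list 'base' occur in that row.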

The order of the integers is ignored here, both in the list 'base' and in the DataFrame rows.

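The frequency of occurrence can take a value from 0 to 6.
0 means that none of the 6 integers in the list 'base' matches any of the 6 columns in a given row of the DataFrame.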

Can anyone shed some light on this?

Source

2015-11-05 SmoothJourney

Show us your DataFrame and the result you want: http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples

Transpose your DataFrame. Use 'isin()' inside 'apply()'. – Kartik
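A minimal sketch of what the 'isin()' suggestion could look like. The sample frame below is hypothetical (it is not the asker's data), and the sketch uses axis=1 rather than transposing the frame, which is an assumption about what the comment intended. It also shows that DataFrame.isin() can build the whole boolean mask in one call, which is likely to matter with 100,000 rows.

```python
import pandas as pd

# Hypothetical sample data (not from the question): three rows, six integer columns.
df = pd.DataFrame({
    'a': [1, 2, 10],
    'b': [8, 5, 11],
    'c': [3, 7, 8],
    'd': [3, 7, 8],
    'e': [3, 1, 8],
    'f': [7, 7, 8],
})
base = [1, 2, 3, 4, 5, 6]

# Row-wise: isin() flags each cell of the row that is in base, sum() counts the flags.
row_wise = df.apply(lambda row: row.isin(base).sum(), axis=1)

# Whole frame: build the boolean mask once, then sum across the columns of each row.
vectorized = df.isin(base).sum(axis=1)

print(vectorized.tolist())
```

Both variants count every matching cell, so a value from 'base' that appears twice in a row is counted twice.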

You can try this:

import pandas as pd

# create a frame with six columns of ints
df = pd.DataFrame({'a': [1, 2, 3, 4, 10],
                   'b': [8, 5, 3, 2, 11],
                   'c': [3, 7, 1, 8, 8],
                   'd': [3, 7, 1, 8, 8],
                   'e': [3, 1, 1, 8, 8],
                   'f': [7, 7, 1, 8, 8]})

# list of ints
base = [1, 2, 3, 4, 5, 6]

# count how many values in a row are members of base
def base_count(y):
    return sum(x in base for x in y)

# apply the function row-wise using the axis=1 parameter
df.apply(base_count, axis=1)

Output:

0 4 
1 3 
2 6 
3 2 
4 0 
dtype: int64

Then assign it to a new column:

df['g'] = df.apply(base_count, axis=1)
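Note that base_count counts every matching cell, so a row that repeats a value from 'base' counts it more than once. If what is wanted is instead the number of distinct values from 'base' that appear in a row, a set intersection is one possible way to get it. This is only a sketch under that assumption; the names distinct_base_count and 'g_distinct' are made up for illustration.

```python
import pandas as pd

# Same sample frame as in the answer above.
df = pd.DataFrame({'a': [1, 2, 3, 4, 10],
                   'b': [8, 5, 3, 2, 11],
                   'c': [3, 7, 1, 8, 8],
                   'd': [3, 7, 1, 8, 8],
                   'e': [3, 1, 1, 8, 8],
                   'f': [7, 7, 1, 8, 8]})
base = [1, 2, 3, 4, 5, 6]
base_set = set(base)

# Hypothetical helper: count how many different values from base occur in the row,
# each counted at most once regardless of how often it repeats.
def distinct_base_count(row):
    return len(base_set.intersection(row))

df['g_distinct'] = df.apply(distinct_base_count, axis=1)
print(df['g_distinct'].tolist())
```

For the third row (3, 3, 1, 1, 1, 1), base_count gives 6 while this variant gives 2, because only the values 1 and 3 are involved.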

Source

2015-11-05 05:43:51 JAB

This works like a charm. Exactly what I needed. – SmoothJourney
