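Pandas: apply multiple custom functions

I am new to Python and pandas. I need to do some simple parsing of a pandas DataFrame to get a new DataFrame, and this involves several functions. Here is a toy example: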
import pandas as pd

df = pd.DataFrame({
    'A': pd.Series(["T100", "T100", "M100", "M100"]),
    'B': pd.Series(["520", "620", "720", "820"]),
    'C': pd.Series(["10/50", "20/50", "30/50", "50/50"]),
})
>>> df
      A    B      C
0  T100  520  10/50
1  T100  620  20/50
2  M100  720  30/50
3  M100  820  50/50
This is what I tried. Naturally it does not work (it raises AttributeError: 'DataFrame' object has no attribute 'agg'), but it shows the idea of what I want to do:
import re

def get_pat_ID(row):
    # Extract the digits that follow the leading T or M in column A.
    sample = row['A']
    patID = re.match(r"[TM](\d+)", sample).group(1)
    return patID

def get_funcB(row):
    # Join B and C for T100 samples; every other sample gets "NA".
    sample, b, c = row['A'], row['B'], row['C']
    if sample == "T100":
        output = b + "_" + c
    else:
        output = "NA"
    return output

def cust(dataset, funcname):
    f = dataset.apply(funcname, axis=1)  # I want the function to be performed on each row of my dataframe
    return f

funcdict = {"pat_ID": get_pat_ID, "funcB": get_funcB}  # contains all the functions that I want to pass to my dataframe
funcs = {'PatID': cust(df, funcdict["pat_ID"]), 'AnotherFunc': cust(df, funcdict["funcB"])}  # creates one column for output of each function
newdf = pd.DataFrame()
newdf = df.agg(funcs)  # this is the line that raises the AttributeError
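I know my approach is not the most efficient, because every time I compute one of the functions, apply iterates over the same rows again. Can anyone help me?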
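To make the goal concrete, here is a minimal sketch of one possible way to walk the rows only once: a single row-wise function evaluates every custom function and returns the results as a pd.Series, so that apply builds one output column per function. The helper name apply_all and the reshaped funcdict (output column name mapped to function) are illustrative assumptions, not part of the attempt above, and this is not necessarily the fastest approach.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "A": ["T100", "T100", "M100", "M100"],
    "B": ["520", "620", "720", "820"],
    "C": ["10/50", "20/50", "30/50", "50/50"],
})


def get_pat_ID(row):
    # Digits that follow the leading T or M in column A.
    return re.match(r"[TM](\d+)", row["A"]).group(1)


def get_funcB(row):
    # Join B and C for T100 samples; every other sample gets "NA".
    return row["B"] + "_" + row["C"] if row["A"] == "T100" else "NA"


# Assumed layout: output column name -> row-wise function.
funcdict = {"PatID": get_pat_ID, "AnotherFunc": get_funcB}


def apply_all(row):
    # Hypothetical helper: run every function on the same row,
    # so the DataFrame is iterated over once instead of once per function.
    return pd.Series({name: func(row) for name, func in funcdict.items()})


# Returning a Series from a row-wise apply yields a DataFrame
# whose columns are the Series labels.
newdf = df.apply(apply_all, axis=1)
print(newdf)
```

Because each function still receives a full row, get_pat_ID and get_funcB stay the same as in the attempt; only the way they are combined changes.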
Sorry for my late response! Thank you for your answers! – phusion