2017-10-16

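Create multiple categorical columns from multiple values returned by a function

I managed to work out an answer to a problem I ran into earlier (found here: How can i create a ruleset to assign values to specific columns, based on searching substrings, in Pandas?).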

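However, I would like to know whether there is a more efficient way to do this. I want to create multiple categorical columns based on strings that I search for in a description column.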

My current strategy is as follows:

android_phones = ['samsung', 'xperia', 'google']
iphone = ['iphone', 'apple']


def OS_rules(raw_Df):
    # raw_Df is a single row here, because apply is called with axis=1
    name = raw_Df['Names'].lower()

    if any(word in name for word in android_phones):
        val = 'android'
    elif any(word in name for word in iphone):
        val = 'iPhone'
    else:
        val = 'Handset'

    return val


df.loc[:, 'OS_Type'] = df.apply(OS_rules, axis=1)
This strategy works. However, I need to create several functions with "almost" the same rules but different return values.

Is there a way to return multiple values from a single function and apply them to multiple new columns?

For example:

if any(word in raw_Df['Names'].lower() for word in android_phones): 
    val1='android' 
    val2='pixel' 
    val3='vodafone' 

and so on, and then create new columns from those values?

Answer

Use:

# create a dictionary of all lists, keyed by the category to return
d = {'android': android_phones, 'iPhone': iphone}


def OS_rules(raw_Df):
    # loop over the dictionary and return the key of the first matching list
    for k, v in d.items():
        if any(word in raw_Df['Names'].lower() for word in v):
            return k


# rows with no match return None, so fill them with the default value
df['OS_Type'] = df.apply(OS_rules, axis=1).fillna('Handset')
print(df)
                    Names  qty  OS_Type
0     IPHONE_3UK_CONTRACT  968   iPhone
1       IPHONE_O2_SIMONLY  155   iPhone
2        ANDROID_3UK_PAYG   77  Handset
3  ANDROID_VODAF_CONTRACT  973  Handset
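
For the multiple-column part of the question, one possible approach is to return a pd.Series from the function, since apply with axis=1 expands a returned Series into one column per index label. The following is only a sketch: the rules dictionary, the Model and Carrier column names, the 'unknown' default, and the extra label values are placeholders made up for illustration (loosely following the val1/val2/val3 example in the question), and the sample rows are invented.

```python
import pandas as pd

# Hypothetical rule table: the key is the OS label, 'words' are the search
# strings, and 'extra' holds the additional values to return for a match.
# The extra labels are placeholders, not real business rules.
rules = {
    'android': {'words': ['samsung', 'xperia', 'google'],
                'extra': ['pixel', 'vodafone']},
    'iPhone': {'words': ['iphone', 'apple'],
               'extra': ['iphone', 'o2']},
}
new_cols = ['OS_Type', 'Model', 'Carrier']   # assumed column names
default = ['Handset', 'unknown', 'unknown']  # assumed fallback values


def multi_rules(row):
    name = row['Names'].lower()
    for os_type, rule in rules.items():
        if any(word in name for word in rule['words']):
            # one Series per row; its index becomes the new column names
            return pd.Series([os_type] + rule['extra'], index=new_cols)
    return pd.Series(default, index=new_cols)


# made-up sample data in the same shape as the question's frame
df = pd.DataFrame({'Names': ['IPHONE_3UK_CONTRACT',
                             'SAMSUNG_VODAF_CONTRACT',
                             'ANDROID_3UK_PAYG'],
                   'qty': [968, 973, 77]})

# apply returns a DataFrame with the three new columns; join adds them by index
df = df.join(df.apply(multi_rules, axis=1))
print(df)
```

Keeping all return values in one dictionary means the "almost identical" rules are written once, and adding a new column is just a matter of extending the lists. Note that the last sample row falls through to the default, because 'android' itself is not one of the search words.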