2016-07-25

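Pandas: filter by date and condition

I'm using pandas to count the members who purchased a particular type of contract between two dates. Say I have a DataFrame that looks like this: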

Member Nbr  Contract-Type       Date-Joined
20          1 Year Membership   2011-08-01
3128        3 Month Membership  2011-07-22
3535        4 Month Membership  2015-02-18
3760        4 Month Membership  2010-02-28
3762        3 Month Membership  2010-01-31
3882        1 Month Membership  2010-04-24
3892        3 Month Membership  2010-03-24
4116        3 Month Membership  2014-12-02
4700        1 Month Membership  2014-11-11
4802        4 Month Membership  2014-07-26
5004        1 Year Membership   2012-03-12
5020        1 Year Membership   2010-07-28
5022        3 Month Membership  2010-06-25
5130        1 Year Membership   2011-01-04
...

I can get the count when I'm interested in only one contract type, using

print(len(df[(df['Date-Joined'] > '2010-01-01')
             & (df['Date-Joined'] < '2012-02-01')
             & (df['Contract-Type'] == '1 Year Membership')]))

When I try something similar for two contract types, 1 Year Membership and 4 Month Membership, with the following code

print(len(df[(df['Date-Joined'] > '2013-01-01')
             & (df['Date-Joined'] < '2013-02-01')
             & (df['Contract-Type'] == '1 Year Membership')
             or (df['Contract-Type'] == '4 Month Membership')]))

I get the following error

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 

Replacing the or condition with an & condition returns 0.

Use '|' instead of 'or'.

Also, '&' takes precedence over '|', so your logic may need a set of parentheses.

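Answer

Use | instead of or. Python's or tries to reduce each Series to a single True or False, which is what raises the ValueError, whereas | combines the two Series element by element. Also, & takes precedence over |, so your logic needs a set of parentheses.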

import io
import pandas as pd

# Sample data from the question, loaded as CSV text.
data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')

df = pd.read_csv(data)

# One contract type: every condition is joined with &.
print(df[
    (df['Date-Joined'] > '2010-01-01') &
    (df['Date-Joined'] < '2012-02-01') &
    (df['Contract-Type'] == '1 Year Membership')
])

#     Member Nbr      Contract-Type  Date-Joined
# 0           20  1 Year Membership   2011-08-01
# 11        5020  1 Year Membership   2010-07-28
# 13        5130  1 Year Membership   2011-01-04

# Two contract types WITHOUT parentheses: & binds tighter than |,
# so the date range is not applied to the '4 Month Membership' rows.
print(df[
    (df['Date-Joined'] > '2010-01-01') &
    (df['Date-Joined'] < '2012-02-01') &
    (df['Contract-Type'] == '1 Year Membership') |
    (df['Contract-Type'] == '4 Month Membership')
])

#     Member Nbr       Contract-Type  Date-Joined
# 0           20   1 Year Membership   2011-08-01
# 2         3535  4 Month Membership   2015-02-18  <====== BEWARE!
# 3         3760  4 Month Membership   2010-02-28
# 9         4802  4 Month Membership   2014-07-26  <====== BEWARE!
# 11        5020   1 Year Membership   2010-07-28
# 13        5130   1 Year Membership   2011-01-04
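
The two rows marked BEWARE get through because the unparenthesized mask is evaluated as (date conditions & 1 Year) | (4 Month). Here is a minimal sketch of that grouping on toy boolean Series. The names in_range, is_year and is_four_month are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Toy boolean masks; the names are illustrative only.
in_range = pd.Series([True, False, True, False])
is_year = pd.Series([True, False, False, False])
is_four_month = pd.Series([False, True, True, False])

# Without parentheses, & is applied first:
# (in_range & is_year) | is_four_month
unparenthesized = in_range & is_year | is_four_month

# With parentheses, the range check applies to both contract types.
parenthesized = in_range & (is_year | is_four_month)

print(unparenthesized.tolist())  # [True, True, True, False]
print(parenthesized.tolist())    # [True, False, True, False]
```

The second element is the interesting one: it is out of range, yet the unparenthesized mask keeps it because the | branch never looks at in_range.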

# Two contract types WITH parentheses around the | group:
# the date range now applies to both contract types.
print(df[
    (df['Date-Joined'] > '2010-01-01') &
    (df['Date-Joined'] < '2012-02-01') &
    (
        (df['Contract-Type'] == '1 Year Membership') |
        (df['Contract-Type'] == '4 Month Membership')
    )
])

#     Member Nbr       Contract-Type  Date-Joined
# 0           20   1 Year Membership   2011-08-01
# 3         3760  4 Month Membership   2010-02-28
# 11        5020   1 Year Membership   2010-07-28
# 13        5130   1 Year Membership   2011-01-04
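
Since the question asks for a count rather than the rows themselves, one possible variation is sketched below. It assumes you are happy to parse Date-Joined as real dates (the string comparisons above also work, because ISO-formatted dates sort correctly as text) and it uses Series.isin in place of the chained | comparisons, which sidesteps the precedence issue. The names wanted, mask and count are illustrative, and the data is a small subset of the sample above.

```python
import io
import pandas as pd

# A small subset of the sample data, for illustration only.
data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
5020,1 Year Membership,2010-07-28
5130,1 Year Membership,2011-01-04
''')

# parse_dates makes Date-Joined a datetime64 column instead of strings.
df = pd.read_csv(data, parse_dates=['Date-Joined'])

# Contract types of interest (hypothetical variable name).
wanted = ['1 Year Membership', '4 Month Membership']

mask = (
    (df['Date-Joined'] > pd.Timestamp('2010-01-01'))
    & (df['Date-Joined'] < pd.Timestamp('2012-02-01'))
    & df['Contract-Type'].isin(wanted)
)

# Summing a boolean mask counts the True values.
count = int(mask.sum())
print(count)  # 4
```

Adding a third contract type then only means appending to the wanted list, with no extra parentheses to keep track of.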