Python: counting the exact distribution of a column's values in a DataFrame


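I am writing a Python program that builds a pandas DataFrame, "pre_data_matrix", which has a column named "PostTextPolarity" whose values lie between -1 and 1. I want to count how many "PostTextPolarity" values are > 0, < 0 and = 0. For example, there are more than 30,000 rows in total; the number of rows with "PostTextPolarity" > 0 might be 10,000 and the number with "PostTextPolarity" < 0 might be 20,000. I want the exact counts. My program is: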

import pandas as pd

# cur is an existing database cursor; how it is created is not shown here
select_sql = "select userID,userName,userURL,postTime,postText,postTextLength,likesCount,sharesCount,commentsCount,postTextPolarity,postTextSubjectivity from fb_pre_davi_group_members_posts"
cur.execute(select_sql)

# fetch all rows and turn them into a list of tuples
pre_data = cur.fetchall()
pre_data_list = list(pre_data)
...
# one DataFrame column per selected SQL column, in the same order
pre_data_matrix = pd.DataFrame(pre_data_list, columns=['userId', 'UserName', 'UserURL', 'PostTime', 'PostText', 'PostTextLength', 'LikesCount', 'SharesCount', 'CommentsCount', 'PostTextPolarity', 'PostTextSubjectivity'])
print(pre_data_matrix)

It prints:

    LikesCount  SharesCount  CommentsCount     PostTextPolarity  \
 0           0            0              0                  0.0
 1           0            0              0   0.3571428571428571
 2           3            0              0                  1.0
 3          11            0              0                  0.0
 4          11            0              0  0.46909090909090906
 5           0            0              0                  0.9
 6          11            0              1                0.625
 7          11            0              1                  0.0
 8          11            0              0              0.56875
 9          11            0              0                  0.0
10           0            0              1  0.08333333333333333
11          20            0              2                  0.0
12           4            0              1                  0.0
13           7            0              1                  0.0
14          11            0              1                 0.25
...

Can you tell me how to get the exact number of rows with PostTextPolarity > 0, = 0 and < 0? I may need to use another library such as numpy.


2017-07-16 bin

Take some time to watch this talk and practice the concepts/examples. Your solution should become apparent - http://pandas.pydata.org/talks.html#pycon-us-2015 – wwii

Please read [ask] and [mcve] – wwii

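Using np.where together with pandas (df here stands for the question's pre_data_matrix; pd.np was removed in pandas 2.0, so numpy is imported directly):

import numpy as np

# label every row by the sign of its polarity
g = np.where(df.PostTextPolarity == 0, 'Equals 0', np.where(df.PostTextPolarity < 0, '< 0', '> 0'))

# count the rows in each label group
df.groupby(g)['PostTextPolarity'].count().rename_axis('Category').reset_index()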

Output:

   Category  PostTextPolarity
0       > 0                 8
1  Equals 0                 7
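Note that the grouped result only lists categories that actually occur, so "< 0" is absent from the output for these sample rows. If a count is wanted for all three cases, including empty ones, one possible alternative is to sum boolean masks. The sketch below is only an illustration: it assumes the column holds numeric values, and the small df it builds is a stand-in made from the 15 polarity values printed in the question, not the real table.

```python
import pandas as pd

# Stand-in data: the PostTextPolarity values shown in the question's printout.
df = pd.DataFrame({
    "PostTextPolarity": [
        0.0, 0.3571428571428571, 1.0, 0.0, 0.46909090909090906,
        0.9, 0.625, 0.0, 0.56875, 0.0,
        0.08333333333333333, 0.0, 0.0, 0.0, 0.25,
    ]
})

polarity = df["PostTextPolarity"]

# Comparing a Series with a scalar yields a boolean Series;
# summing it counts the True entries.
counts = {
    "> 0": int((polarity > 0).sum()),
    "= 0": int((polarity == 0).sum()),
    "< 0": int((polarity < 0).sum()),
}
print(counts)  # {'> 0': 8, '= 0': 7, '< 0': 0}
```

If the values come back from the database as strings rather than numbers, they would probably need converting first, for example with pd.to_numeric(df["PostTextPolarity"]).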


2017-07-16 18:59:02
