Pandas printing output to the console

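I am changing the values of cells in a pandas DataFrame with topic_.set_value(each_topic, word, prob). Basically, I initialized a numpy array of a certain shape and converted it to a pandas DataFrame. I then replace those zeros by iterating over all the rows and columns with the code above. The problem is that there are about 50,000 cells, and every time I set a value pandas prints the array to the console. I want to suppress this behavior. Any ideas?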

EDIT

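I have two dataframes: topic_, the target dataframe, and tw, the source dataframe. topic_ is a topic-by-word matrix in which each cell stores the probability of a word occurring in a particular topic. I initialized the topic_ dataframe to zeros using numpy.zeros. The tw dataframe: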

print(tw) 
    topic_id          word_prob_pair 
0   0 [(customer, 0.061703717964), (team, 0.01724444... 
1   1 [(team, 0.0260560163563), (customer, 0.0247838... 
2   2 [(customer, 0.0171786268847), (footfall, 0.012... 
3   3 [(team, 0.0290787264225), (product, 0.01570401... 
4   4 [(team, 0.0197917953222), (data, 0.01343226630... 
5   5 [(customer, 0.0263740639141), (team, 0.0251677... 
6   6 [(customer, 0.0289764173735), (team, 0.0249938... 
7   7 [(client, 0.0265082412402), (want, 0.016477447... 
8   8 [(customer, 0.0524006965405), (team, 0.0322975... 
9   9 [(generic, 0.0373422774996), (product, 0.01834... 
10  10 [(customer, 0.0305256248248), (team, 0.0241559... 
11  11 [(customer, 0.0198707090364), (ad, 0.018516805... 
12  12 [(team, 0.0159852971954), (customer, 0.0124540... 
13  13 [(team, 0.033444510469), (store, 0.01961003290... 
14  14 [(team, 0.0344793243818), (customer, 0.0210975... 
15  15 [(team, 0.026416114692), (customer, 0.02041691... 
16  16 [(campaign, 0.0486186973667), (team, 0.0236024... 
17  17 [(customer, 0.0208270072145), (branch, 0.01757... 
18  18 [(team, 0.0280889397541), (customer, 0.0127932... 
19  19 [(team, 0.0297011415217), (customer, 0.0216007...

My topic_ dataframe is of size num_topics (which is 20) by number_of_unique_words (in the tw dataframe).

Following is the code I use to replace each value in the topic_ dataframe:

for each_topic in range(num_topics):
    a = tw['word_prob_pair'].iloc[each_topic]
    for word, prob in a:
        topic_.set_value(each_topic, word, prob)

Source

2017-02-20 Clock Slave

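I don't understand: if you are not using 'print', only 'topic_.set_value(each_topic, word, prob)', why does it print? By the way, why use this method? It is very slow, and if you check the [docs](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe) there are many better ways. What is the source of your data? 'Lists', 'numpy array'? Can you explain more? – jezrael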

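@jezrael I followed the answer given at http://stackoverflow.com/questions/13842088/set-value-for-particular-cell-in-pandas-dataframe. According to that answer, set_value runs fastest. I am not using 'print'. The data I use to replace the zeros in the 'topic_' dataframe comes from another dataframe. A row of the source dataframe looks like: '[(target_df_col_1, value_1), (target_df_col_2, value_2), ..., (target_df_col_n, value_n)]'. I iterate over each row of the source dataframe and put each (column, value) pair into the target dataframe –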

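Hmm, it seems there must be a better way. Can you add a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) with a sample of the input data and the desired output? – jezrael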
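
One possible shape for the "better way" hinted at in the comments, sketched under the assumption that every entry of word_prob_pair is a list of (word, prob) tuples as in the printed tw: turn each row into a dict and let pandas align the words into columns. The two-row tw below is made-up sample data, not the asker's.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `tw` frame (values are illustrative).
tw = pd.DataFrame({
    "topic_id": [0, 1],
    "word_prob_pair": [
        [("customer", 0.06), ("team", 0.02)],
        [("team", 0.03), ("product", 0.01)],
    ],
})

# One dict per topic: pandas aligns the keys into columns and leaves NaN
# where a word does not occur in a topic; fillna(0.0) turns those into zeros.
topic_ = pd.DataFrame(
    [dict(pairs) for pairs in tw["word_prob_pair"]],
    index=tw["topic_id"],
).fillna(0.0)
```

This builds the whole matrix in one call, so no cell-by-cell assignment (and no per-cell echo) is involved.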

Just redirect the output to a variable:

>>> df.set_value(index=1,col=0,value=1) 
      0   1 
0 0.621660 -0.400869 
1 1.000000 1.585177 
2 0.962754 1.725027 
3 0.773112 -1.251182 
4 -1.688159 2.372140 
5 -0.203582 0.884673 
6 -0.618678 -0.850109 
>>> a=df.set_value(index=1,col=0,value=1) 
>>>
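
Why this works: set_value returns the DataFrame, and the interactive prompt displays the value of any bare expression statement, including ones inside a for loop typed at the prompt. An assignment is not an expression statement, so nothing is displayed. The sketch below reproduces that mechanism with max instead of pandas so it runs anywhere; echoed is a hypothetical helper introduced only for this illustration.

```python
import contextlib
import io


def echoed(source):
    """Run `source` the way the >>> prompt would and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # "single" mode is the interactive compile mode: the value of a bare
        # expression statement is handed to sys.displayhook, which prints it.
        exec(compile(source, "<demo>", "single"), {})
    return buffer.getvalue()


bare_call = echoed("max(1, 2)")              # expression statement: echoed
assigned = echoed("a = max(1, 2)")           # assignment: nothing echoed
in_loop = echoed("for i in range(2): i\n")   # expressions in a loop: echoed each pass
```

Run as a script rather than typed at the prompt, the same bare calls would print nothing at all.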

To initialize the DataFrame, it is better to use:

pd.DataFrame(np.zeros_like(pd_n), index=pd_n.index, columns=pd_n.columns)
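
Applied to the question, the same idea might look like the sketch below: collect the unique words from tw and build a zero-filled frame with one row per topic and one column per word. It assumes pd_n in the line above is an existing DataFrame whose shape and labels are being reused; tw here is made-up sample data and vocabulary is a name introduced for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `tw` frame (values are illustrative).
tw = pd.DataFrame({
    "topic_id": [0, 1],
    "word_prob_pair": [
        [("customer", 0.06), ("team", 0.02)],
        [("team", 0.03), ("product", 0.01)],
    ],
})

# Unique words across all topics, sorted so the column order is stable.
vocabulary = sorted({word for pairs in tw["word_prob_pair"] for word, _ in pairs})

# num_topics rows by number_of_unique_words columns, all zeros.
num_topics = len(tw)
topic_ = pd.DataFrame(
    np.zeros((num_topics, len(vocabulary))),
    index=tw["topic_id"],
    columns=vocabulary,
)
```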

Source

2017-02-20 07:19:32 aslavkin

If you don't want to create a variable ('a' in the suggestion above), use Python's throwaway variable '_'. Your statement then becomes:

_ = df.set_value(index=1,col=0,value=1)

Source

2017-02-20 09:22:38 dmdip
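
Note for readers on current pandas: DataFrame.set_value was deprecated in pandas 0.21 and removed in 1.0, so the set_value snippets above need an older release. DataFrame.at is the documented accessor for setting a single cell, and because df.at[row, col] = value is an assignment statement, the interactive prompt has nothing to echo. A minimal sketch of the question's loop rewritten this way, with made-up sample data standing in for tw and a hand-written column list:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `tw` frame (values are illustrative).
tw = pd.DataFrame({
    "topic_id": [0, 1],
    "word_prob_pair": [
        [("customer", 0.06), ("team", 0.02)],
        [("team", 0.03), ("product", 0.01)],
    ],
})

num_topics = len(tw)
# Zero-filled target with a default 0..n-1 row index and one column per word.
topic_ = pd.DataFrame(
    np.zeros((num_topics, 3)),
    columns=["customer", "product", "team"],
)

for each_topic in range(num_topics):
    pairs = tw["word_prob_pair"].iloc[each_topic]
    for word, prob in pairs:
        # Label-based scalar assignment; returns nothing, so nothing is echoed.
        topic_.at[each_topic, word] = prob
```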
