Pandas iterrows: row number and percentage

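I am looping over a DataFrame with 1000 rows. Ideally I would like to know the progress of my loop, i.e. how many rows it has completed, what percentage of the total rows it has completed, and so on.

Is there a way to print the row number or, even better, the percentage of rows iterated so far?

My code is currently below. At the moment the print statement shows some kind of tuple/list, but what I need is the row number. This is probably simple.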

for row in testDF.iterrows():
    print("Currently on row: "+str(row))

Ideal printed output:

Currently on row 1; Currently iterrated 1% of rows 
Currently on row 2; Currently iterrated 2% of rows 
Currently on row 3; Currently iterrated 3% of rows 
Currently on row 4; Currently iterrated 4% of rows 
Currently on row 5; Currently iterrated 5% of rows

Source

2017-07-02 christaylor

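Why are you using a loop to begin with? Most likely there is a better approach. If you must, you can easily compute the progress with 'enumerate', which returns the position of the current row (along with the row itself); that position can be divided by the total number of rows. 'for index, row in enumerate(testDF.iterrows()): ... progress = index/len(testDF)' – DeepSpace

I am using an iterrows loop because I am creating a new column from geocoded data. Most services that let you geocode have rate limits, so I have also added a 0.1 second delay inside the loop. – christaylor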

One possible solution with format, if the index is unique and monotonic (0, 1, 2, ...):

for i, row in testDF.iterrows(): 
     print("Currently on row: {}; Currently iterrated {}% of rows".format(i, (i + 1)/len(testDF.index) * 100))

Sample:

import numpy as np
import pandas as pd

np.random.seed(1332)
testDF = pd.DataFrame(np.random.randint(10, size=(10, 3))) 
print (testDF) 
    0 1 2 
0 8 1 9 
1 4 3 5 
2 0 1 3 
3 1 8 6 
4 7 4 7 
5 7 5 3 
6 7 9 9 
7 0 1 2 
8 1 3 4 
9 0 0 3 

for i, row in testDF.iterrows(): 
     print("Currently on row: {}; Currently iterrated {}% of rows".format(i, (i + 1)/len(testDF.index) * 100)) 
Currently on row: 0; Currently iterrated 10.0% of rows 
Currently on row: 1; Currently iterrated 20.0% of rows 
Currently on row: 2; Currently iterrated 30.0% of rows 
Currently on row: 3; Currently iterrated 40.0% of rows 
Currently on row: 4; Currently iterrated 50.0% of rows 
Currently on row: 5; Currently iterrated 60.0% of rows 
Currently on row: 6; Currently iterrated 70.0% of rows 
Currently on row: 7; Currently iterrated 80.0% of rows 
Currently on row: 8; Currently iterrated 90.0% of rows 
Currently on row: 9; Currently iterrated 100.0% of rows

EDIT:

If the index has custom values, a solution is to zip numpy.arange over the length of the index (which is the same as the length of the df) with iterrows:

np.random.seed(1332) 
testDF = pd.DataFrame(np.random.randint(10, size=(10, 3)), index=[2,4,5,6,7,8,2,1,3,5]) 
print (testDF) 
    0 1 2 
2 8 1 9 
4 4 3 5 
5 0 1 3 
6 1 8 6 
7 7 4 7 
8 7 5 3 
2 7 9 9 
1 0 1 2 
3 1 3 4 
5 0 0 3 

for i, (idx, row) in zip(np.arange(len(testDF.index)), testDF.iterrows()): 
    print("Currently on row: {}; Currently iterrated {}% of rows".format(idx, (i + 1)/len(testDF.index) * 100)) 

Currently on row: 2; Currently iterrated 10.0% of rows 
Currently on row: 4; Currently iterrated 20.0% of rows 
Currently on row: 5; Currently iterrated 30.0% of rows 
Currently on row: 6; Currently iterrated 40.0% of rows 
Currently on row: 7; Currently iterrated 50.0% of rows 
Currently on row: 8; Currently iterrated 60.0% of rows 
Currently on row: 2; Currently iterrated 70.0% of rows 
Currently on row: 1; Currently iterrated 80.0% of rows 
Currently on row: 3; Currently iterrated 90.0% of rows 
Currently on row: 5; Currently iterrated 100.0% of rows

Source

2017-07-02 13:45:19 jezrael

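Is the way you print better, or is something like this preferable? 'print("Currently on row", i, "iterated", 100 * i/testDF.shape[0], "%")' Why? Thanks for your answer –

@RayhaneMama - I think there are many possible ways, and yours works too. I prefer 'len(df.index)' because it is the fastest way. – jezrael

Note that here 'i' is the index label of each row. This works when the index contains the integers from 0 to len(df)-1, but not if 'testDF' uses custom index values. –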
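To make the caveat in the comment above concrete, here is a minimal sketch. The small frame and its index labels are made up for illustration; it only shows that the first value yielded by iterrows is the index label, not the row position:

```python
import pandas as pd

# Hypothetical frame with a non-sequential, non-unique index.
df = pd.DataFrame({"a": [10, 20, 30]}, index=[7, 3, 7])

# First element of each (index, row) pair is the label stored in the index.
labels = [label for label, _ in df.iterrows()]
# Row positions always run from 0 to len(df) - 1.
positions = list(range(len(df)))

print(labels)
print(positions)
```

With an index like this, computing a percentage from the label would give meaningless values, which is why the position has to be counted separately.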

First of all, iterrows yields (index, row) tuples. So the correct code is

for index, row in testDF.iterrows():

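In general the index is not the row number; it is some identifier. This is part of the power of pandas, but it causes some confusion, because it does not behave like an ordinary Python list, where the index is the position of the element. That is why we need to count the rows independently. We could introduce line_number = 0 and increase it with line_number += 1 on each iteration. But Python gives us a ready-made tool: enumerate, which returns (line_number, value) tuples instead of just value. So we arrive at the code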

for (line_number, (index, row)) in enumerate(testDF.iterrows()): 
    print("Currently on row: {}; Currently iterrated {}% of rows".format(
      line_number, 100*(line_number + 1)/len(testDF)))
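If the row numbers should start at 1, as in the output requested in the question, enumerate accepts a start argument. The following is only a sketch: the helper name progress_messages and the four-row frame are made up for illustration, and the message text simply copies the wording from the question.

```python
import pandas as pd

def progress_messages(df):
    """Return one progress message per row, numbered from 1."""
    total = len(df)
    messages = []
    # start=1 makes enumerate count 1, 2, 3, ... instead of 0, 1, 2, ...
    for line_number, (index, row) in enumerate(df.iterrows(), start=1):
        percent = 100 * line_number / total
        messages.append(
            "Currently on row {}; Currently iterrated {:.0f}% of rows".format(
                line_number, percent))
    return messages

# Hypothetical 4-row frame with custom (non-integer) index labels.
demo = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])
for message in progress_messages(demo):
    print(message)
```

Because the counter comes from enumerate rather than from the index, this works for any index, including string labels or duplicates.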

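P.S. Python 2 returns an integer when you divide integers, which is why 999/1000 = 0, which is not what you expect. So you can either convert to float or put the 100* first to get an integer percentage.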
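The difference can be reproduced in Python 3, where the // operator performs the same floor division that Python 2 applied to int / int. A small sketch, with the numbers taken from the 999/1000 example:

```python
done, total = 999, 1000

# Floor division (what Python 2 did for int / int): the fraction is lost.
print(done // total)        # 0

# Multiplying by 100 first keeps everything in integers and still
# yields a usable whole-number percentage.
print(100 * done // total)  # 99

# True division (the Python 3 meaning of /) returns a float.
print(100 * done / total)   # 99.9
```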

Source

2017-07-02 14:04:41

For large DataFrames it may be better to limit the printing, which is a time-consuming task. Here is one way:

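import numpy as np
import pandas as pd

dftest = pd.DataFrame(np.random.rand(10**5, 5))

percent = 0
n = len(dftest) // 100  # rows per 1% step (assumes at least 100 rows)

for i, row in dftest.iterrows():
    # i is the index label; this relies on the default 0..len-1 index
    if (i + 1) // n > percent:
        percent += 1
        print(percent, "% realized")
    dftest.iloc[i] = 2 * row  # a job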
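The same throttling idea can be combined with enumerate so that it no longer depends on the default integer index or on the frame having at least 100 rows. This is a sketch under assumptions, not the answer's own method: the generator name iterate_with_progress, its steps parameter, and the demo frame are all hypothetical.

```python
import numpy as np
import pandas as pd

def iterate_with_progress(df, steps=100):
    """Yield (index, row) pairs, printing at most `steps` progress lines."""
    total = len(df)
    last_reported = 0
    for position, (index, row) in enumerate(df.iterrows(), start=1):
        # Number of completed steps (out of `steps`) after this row.
        reported = position * steps // total
        if reported > last_reported:
            last_reported = reported
            print("{:.0f}% done".format(100 * reported / steps))
        yield index, row

# Hypothetical frame with a shuffled, non-sequential index.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((250, 2)), index=rng.permutation(250) + 1000)

row_count = sum(1 for _ in iterate_with_progress(demo, steps=10))
print(row_count)
```

With steps=10 the loop prints one line per 10% of the rows regardless of how large the frame is, so the printing cost stays constant.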

Source

2017-07-02 14:36:09
