2017-04-05 42 views

Order data frame columns, then paste row names

I have a data frame that looks like this:

> testdata 
      topic1 topic2 topic3 topic4 topic5 
church  0.011 0.003 0.001 0.001 0.012 
of   0.094 0.085 0.098 0.063 0.051 
the  0.143 0.115 0.159 0.083 0.097 
appearance 0.000 0.000 0.002 0.005 0.040 
restrain 0.000 0.000 0.000 0.000 0.000 

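What I need to do is build a new data frame, also 5 rows by 5 columns, in which each column holds the row names of this data frame, ordered by that column's values. In other words, I need to sort each column in descending order and then list the row names in that order, mainly to get the words ranked per topic. For this example, the data frame I need would be: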

> testdata_word_ranks 
       topic1  topic2  topic3  topic4  topic5 
church    the   the   the   the   the 
of     of   of   of   of   of 
the    church  church appearance appearance appearance 
appearance appearance appearance  church  church  church 
restrain  restrain  restrain  restrain  restrain  restrain 

Here is my failed attempt at assigning the columns of testdata_word_ranks shown above to a new data frame:

for(i in 1:nrow(testdata)){ 
    minidf = data.frame(rownames(testdata), testdata[,i]) 
    assign(paste0('testdata_word_ranks$topic', i), 
     as.vector(minidf[order(minidf[,2], decreasing = TRUE),]$rownames.testdata)) 
} 

Just for your information, this data comes from a topic model of a particular corpus.

Answer


You can index the row names by each column's ordering:

matrix(row.names(testdata)[apply(-testdata, 2, order)], nrow(testdata)) 
#  [,1]   [,2]   [,3]   [,4]   [,5]   
# [1,] "the"  "the"  "the"  "the"  "the"  
# [2,] "of"   "of"   "of"   "of"   "of"   
# [3,] "church"  "church"  "appearance" "appearance" "appearance" 
# [4,] "appearance" "appearance" "church"  "church"  "church"  
# [5,] "restrain" "restrain" "restrain" "restrain" "restrain" 
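
For anyone who wants to run this end to end, here is a minimal sketch under a few assumptions: the data frame is rebuilt by hand from the values printed in the question, and the result is wrapped in a data frame with the topic names carried over as column names (the answer's one-liner returns an unnamed character matrix). The name testdata_word_ranks is taken from the question; the dimnames argument and the as.data.frame() wrapper are additions, not part of the original answer.

```r
# Rebuild the example data from the values printed in the question.
testdata <- data.frame(
  topic1 = c(0.011, 0.094, 0.143, 0.000, 0.000),
  topic2 = c(0.003, 0.085, 0.115, 0.000, 0.000),
  topic3 = c(0.001, 0.098, 0.159, 0.002, 0.000),
  topic4 = c(0.001, 0.063, 0.083, 0.005, 0.000),
  topic5 = c(0.012, 0.051, 0.097, 0.040, 0.000),
  row.names = c("church", "of", "the", "appearance", "restrain")
)

# apply(-testdata, 2, order) returns, per column, the row indices sorted
# from the largest value to the smallest (negating flips the sort order).
idx <- apply(-testdata, 2, order)

# Use those indices to look up row names, then reshape column by column.
ranks <- matrix(
  rownames(testdata)[idx],
  nrow = nrow(testdata),
  dimnames = list(NULL, colnames(testdata))  # addition: keep topic names
)

# Addition: convert to a data frame with character (not factor) columns.
testdata_word_ranks <- as.data.frame(ranks, stringsAsFactors = FALSE)
print(testdata_word_ranks)
```

Rows here are ranks 1 to 5 rather than the original words, since a word's row label no longer means anything once each column is sorted independently. Ties (such as the 0.000 values) stay in their original row order.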

An alternative attempt: 'replace(testdata, , rownames(testdata)[sapply(-testdata, order)])' – thelatemail
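
In the same spirit as the comment above, the following sketch fills the existing data frame in place so that its shape and labels are kept, which matches the layout shown in the question. It is a variant, not the commenter's exact code: it uses the empty-bracket assignment with lapply() instead of replace(), and the name ranked is a hypothetical one introduced for illustration. The data is again rebuilt by hand from the question.

```r
# Rebuild the example data from the values printed in the question.
testdata <- data.frame(
  topic1 = c(0.011, 0.094, 0.143, 0.000, 0.000),
  topic2 = c(0.003, 0.085, 0.115, 0.000, 0.000),
  topic3 = c(0.001, 0.098, 0.159, 0.002, 0.000),
  topic4 = c(0.001, 0.063, 0.083, 0.005, 0.000),
  topic5 = c(0.012, 0.051, 0.097, 0.040, 0.000),
  row.names = c("church", "of", "the", "appearance", "restrain")
)

# Copy the frame, then overwrite every column while keeping the
# row and column labels ("ranked" is a hypothetical name).
ranked <- testdata
ranked[] <- lapply(
  testdata,
  function(col) rownames(testdata)[order(-col)]  # words, largest value first
)
print(ranked)
```

Note that the row labels of ranked are still the original words, exactly as in the desired output in the question, but they no longer describe the contents of each row.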