2015-11-02
sessionInfo() 
R version 3.2.2 (2015-08-14) 
Platform: x86_64-w64-mingw32/x64 (64-bit) 
Running under: Windows 7 x64 (build 7601) Service Pack 1 

locale: 
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C     
[5] LC_TIME=German_Germany.1252  

attached base packages: 
[1] stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] dplyr_0.4.3  plyr_1.8.3  tidyr_0.3.1  gridExtra_2.0.0 scales_0.3.0 
[6] ggplot2_1.0.1 RPostgreSQL_0.4 DBI_0.3.1  

loaded via a namespace (and not attached): 
[1] Rcpp_0.12.1  lubridate_1.3.3 assertthat_0.1 digest_0.6.8  MASS_7.3-44  
[6] R6_2.1.1   grid_3.2.2  gtable_0.1.2  magrittr_1.5  stringi_0.5-5 
[11] reshape2_1.4.1 proto_0.3-10  tools_3.2.2  stringr_1.0.0 munsell_0.4.2 
[16] parallel_3.2.2 colorspace_1.2-6 memoise_0.2.1 

I have n strings in a column, as shown below, and I want to sort them in R by their last word.

dput(dsp) 
c("handlingstation/cropping/ forward/Linie 1", "handlingstation/cropping/ forward/Linie 2", 
"conveyorstation/Linie 1", "conveyorstation/Linie 2", "soft/handling/cleaning/backward/Linie 3", 
"jumper/doublejumper/Linie 1", "jumper/doublejumper/Linie 2" 
) 



dsp 
[1] "handlingstation/cropping/ forward/Linie 1" 
[2] "handlingstation/cropping/ forward/Linie 2" 
[3] "conveyorstation/Linie 1"      
[4] "conveyorstation/Linie 2"      
[5] "soft/handling/cleaning/backward/Linie 3" 
[6] "jumper/doublejumper/Linie 1"     
[7] "jumper/doublejumper/Linie 2" 

Desired output:

dsp_sorted 
[1] "handlingstation/cropping/ forward/Linie 1" 
[2] "conveyorstation/Linie 1"      
[3] "jumper/doublejumper/Linie 1"     
[4] "handlingstation/cropping/ forward/Linie 2" 
[5] "conveyorstation/Linie 2"      
[6] "jumper/doublejumper/Linie 2"     
[7] "soft/handling/cleaning/backward/Linie 3" 

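I want all strings in a particular column to be ordered by their last word. Here the order should follow Linie 1, Linie 2, and so on.

Can anyone tell me how to do this?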

Answer

You could try something like the following:

dsp[order(sub(".*/", "", dsp))] 
# [1] "handlingstation/cropping/ forward/Linie 1" "conveyorstation/Linie 1"      
# [3] "jumper/doublejumper/Linie 1"     "handlingstation/cropping/ forward/Linie 2" 
# [5] "conveyorstation/Linie 2"      "jumper/doublejumper/Linie 2"     
# [7] "soft/handling/cleaning/backward/Linie 3" 

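This uses a regular expression to remove everything up to and including the last occurrence of / and then orders your vector by what remains, which in your case is the last word.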
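To make the steps explicit, here is a small sketch that splits the one-liner into its parts. The variable names `keys`, `idx` and `dsp_sorted` are mine, chosen only for illustration; the pattern `".*/"` has no space after the slash.

```r
# Same vector as in the question
dsp <- c("handlingstation/cropping/ forward/Linie 1",
         "handlingstation/cropping/ forward/Linie 2",
         "conveyorstation/Linie 1", "conveyorstation/Linie 2",
         "soft/handling/cleaning/backward/Linie 3",
         "jumper/doublejumper/Linie 1", "jumper/doublejumper/Linie 2")

# Step 1: ".*" is greedy, so ".*/" consumes everything up to the LAST "/"
keys <- sub(".*/", "", dsp)
keys
# [1] "Linie 1" "Linie 2" "Linie 1" "Linie 2" "Linie 3" "Linie 1" "Linie 2"

# Step 2: order() returns the permutation that sorts the keys;
# ties keep their original relative order
idx <- order(keys)
idx
# [1] 1 3 6 2 4 7 5

# Step 3: apply that permutation to the original strings
dsp_sorted <- dsp[idx]
```

Because ties are left in their original order, the three "Linie 1" entries come out in the same sequence as in the input, which matches the desired output.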


It may be safer, though, to use a mixed-order operation (because each value contains both numbers and characters):

library(gtools) 
dsp[mixedorder(sub(".*/", "", dsp))] 
# [1] "handlingstation/cropping/ forward/Linie 1" "conveyorstation/Linie 1"      
# [3] "jumper/doublejumper/Linie 1"     "handlingstation/cropping/ forward/Linie 2" 
# [5] "conveyorstation/Linie 2"      "jumper/doublejumper/Linie 2"     
# [7] "soft/handling/cleaning/backward/Linie 3" 
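Why mixed ordering can matter: with the question's data the plain and mixed orderings agree, but they would differ as soon as a number has more than one digit. The following base-R sketch uses a made-up vector `x` (not from the question) and approximates mixed ordering by sorting on the text part and the numeric part separately. It is only an illustration of the idea, not how `gtools::mixedorder` is implemented.

```r
# Hypothetical data with a two-digit line number
x <- c("a/Linie 10", "b/Linie 2", "c/Linie 1")

keys <- sub(".*/", "", x)

# Plain character ordering compares character by character,
# so "Linie 10" sorts before "Linie 2"
plain <- x[order(keys)]
plain
# [1] "c/Linie 1"  "a/Linie 10" "b/Linie 2"

# Split each key into its text part and its trailing number
txt <- sub("[[:space:]]*[0-9]+$", "", keys)
num <- as.numeric(sub(".*[^0-9]([0-9]+)$", "\\1", keys))

# Order by text first, then numerically by the trailing number
natural <- x[order(txt, num)]
natural
# [1] "c/Linie 1"  "b/Linie 2"  "a/Linie 10"
```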

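Another option (depending on your real data) is to extract the number from the end of each string and sort by it:

# "\\D" before the group keeps multi-digit numbers such as "Linie 12" intact
dsp[order(as.numeric(sub(".*\\D(\\d+)$", "\\1", dsp)))] 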
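Whether the numeric approach fits does depend on the real data. If some strings might not end in a number, a slightly more defensive variant could look like the sketch below. The vector `x` is invented for illustration, and the choice to put entries without a number last is my assumption about what would be wanted.

```r
# Hypothetical data: one entry has no trailing number
x <- c("a/Linie 12", "b/Linie 3", "c/no number")

# regexpr() returns -1 where the pattern does not match
m <- regexpr("[0-9]+$", x)

# Start with NA everywhere, then fill in the numbers that were found
num <- rep(NA_real_, length(x))
num[m > 0] <- as.numeric(regmatches(x, m))

# order() puts NA last by default
sorted <- x[order(num)]
sorted
# [1] "b/Linie 3"   "a/Linie 12"  "c/no number"
```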

Apparently the stringi package also offers mixed ordering, by specifying opts_collator = list(numeric = TRUE), and it can extract the last word of a string, so you could also do:

library(stringi) 
dsp[stri_order(stri_extract_last_words(dsp), opts_collator = list(numeric = TRUE))] 
# [1] "handlingstation/cropping/ forward/Linie 1" "conveyorstation/Linie 1"      
# [3] "jumper/doublejumper/Linie 1"     "handlingstation/cropping/ forward/Linie 2" 
# [5] "conveyorstation/Linie 2"      "jumper/doublejumper/Linie 2"     
# [7] "soft/handling/cleaning/backward/Linie 3" 

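Thank you very much, it works well. In my actual data frame ("function"), dsp is a column. Could you tell me how to sort the data frame "function" by applying the mixed ordering (gtools) above to the dsp column? – Chanti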

I think it's just `function[mixedorder(sub(".*/", "", function$dsp)), ]`. Bad name for a data set, by the way. –

Thanks, David. I completely agree with you about the name of the data set. I have changed it. – Chanti
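As a rough illustration of the data frame case discussed in the comments, here is a self-contained sketch. The data frame `df` and its `value` column are made up, and base `order()` is used so the example runs without extra packages; with gtools loaded, `mixedorder()` could be swapped in at the same place.

```r
# Hypothetical data frame standing in for the asker's real one
dsp <- c("handlingstation/cropping/ forward/Linie 1",
         "handlingstation/cropping/ forward/Linie 2",
         "conveyorstation/Linie 1", "conveyorstation/Linie 2",
         "soft/handling/cleaning/backward/Linie 3",
         "jumper/doublejumper/Linie 1", "jumper/doublejumper/Linie 2")
df <- data.frame(dsp = dsp, value = seq_along(dsp), stringsAsFactors = FALSE)

# Compute the row permutation from the dsp column, then index the rows.
# The trailing comma selects whole rows and keeps all columns.
df_sorted <- df[order(sub(".*/", "", df$dsp)), ]
df_sorted$value
# [1] 1 3 6 2 4 7 5
```

Avoiding `function` as an object name sidesteps the problem raised in the comments, since `function` is a reserved word in R.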
