2017-02-27 66 views
1

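Using regular expressions with R

I have a character vector in R. Some of the strings have a '(number)' pattern appended to them. I am trying to remove this '(number)' suffix with a regular expression, but I can't figure it out. I can select all the elements that contain a space followed by other characters, but there must be a way to match just these number strings.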

dat <- c("Alabama-Birmingham", "Arizona State", "Canisius", "UCF", "George Washington", 
      "Green Bay", "Iona", "Louisville (7)", "UMass", "Memphis", "Michigan State", 
      "Milwaukee", "Nebraska", "Niagara", "Northern Kentucky", "Notre Dame (21)", 
      "Quinnipiac", "Siena", "Tulsa", "Washington State", "Wright State", 
      "Xavier") 

# Attempt: find the elements with a space followed by anything, then strip that part
rows <- grep(" (.*)", dat)
fixed <- gsub(" (.*)", "", dat[rows])
dat <- fixed

Answers

2

First, you need to escape the parentheses, and it is better to be more specific about what is inside them:

gsub("\\s+\\(\\d+\\)", "", dat) 
 [1] "Alabama-Birmingham" "Arizona State"      "Canisius"
 [4] "UCF"                "George Washington"  "Green Bay"
 [7] "Iona"               "Louisville"         "UMass"
[10] "Memphis"            "Michigan State"     "Milwaukee"
[13] "Nebraska"           "Niagara"            "Northern Kentucky"
[16] "Notre Dame"         "Quinnipiac"         "Siena"
[19] "Tulsa"              "Washington State"   "Wright State"
[22] "Xavier"
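
To see why the escaping matters, here is a small sketch on a three-element subset of the data. Unescaped parentheses form a capture group, so " (.*)" simply means "a space followed by anything". The vector name `x` and the trailing `$` anchor are additions for illustration and are not part of the answer above; the anchor assumes the number always sits at the end of the string.

```r
x <- c("Arizona State", "Louisville (7)", "Notre Dame (21)")

# Unescaped parentheses are a capture group, not literal characters:
# the pattern matches from the first space to the end of the string,
# so ordinary two-word names are truncated as well.
unescaped <- gsub(" (.*)", "", x)
print(unescaped)

# Escaped parentheses match a literal "(" and ")", and \\d+ requires
# one or more digits between them. The $ anchor is an assumption:
# it restricts the match to the end of the string.
escaped <- gsub("\\s+\\(\\d+\\)$", "", x)
print(escaped)
```

The first call returns "Arizona", "Louisville" and "Notre", which is why the original attempt damaged names that had no number at all; the second leaves "Arizona State" untouched and strips only the bracketed numbers.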

That did it perfectly, thank you for your help. – Developing

0

We can use sub to remove everything from the opening parenthesis onward:

sub("\\s*\\(.*", "", dat) 
# [1] "Alabama-Birmingham" "Arizona State"      "Canisius"
# [4] "UCF"                "George Washington"  "Green Bay"
# [7] "Iona"               "Louisville"         "UMass"
#[10] "Memphis"            "Michigan State"     "Milwaukee"
#[13] "Nebraska"           "Niagara"            "Northern Kentucky"
#[16] "Notre Dame"         "Quinnipiac"         "Siena"
#[19] "Tulsa"              "Washington State"   "Wright State"
#[22] "Xavier"
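
Both patterns give the same result on this data, but they can differ on other inputs. The sketch below uses made-up strings (the vector `y` is hypothetical and does not come from the question) to compare this "everything from the first parenthesis" pattern with a digits-only pattern anchored at the end of the string.

```r
# Hypothetical inputs, not taken from the question's data
y <- c("Miami (FL) (5)", "Saint Mary's (CA)", "Duke (1)")

# Removes everything from the first "(" onward, whatever it contains
from_paren <- sub("\\s*\\(.*", "", y)
print(from_paren)

# Removes only a parenthesised number at the very end of the string
digits_only <- sub("\\s+\\(\\d+\\)$", "", y)
print(digits_only)
```

Here the first pattern also drops the "(FL)" and "(CA)" qualifiers, while the second keeps them and removes only the trailing numbers. Since each string contains at most one match for either pattern, sub (first match only) and gsub (all matches) behave the same; which pattern fits depends on whether parentheses can legitimately appear in the names.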