我會建議從正則避而遠之,當你處理的話。有些包可以根據您的特定任務量身定製,您可以使用它們。例如,請嘗試以下內容:
library(corpus)
text <- readLines("http://norvig.com/big.txt") # sherlock holmes
terms <- c("watson", "sherlock holmes", "elementary")
text_locate(text, terms)
## text before instance after
## 1 1 …Book of The Adventures of Sherlock Holmes
## 2 27 Title: The Adventures of Sherlock Holmes
## 3 40 … EBOOK, THE ADVENTURES OF SHERLOCK HOLMES ***
## 4 50 SHERLOCK HOLMES
## 5 77 To Sherlock Holmes she is always the woman. I…
## 6 85 …," he remarked. "I think, Watson , that you have put on seve…
## 7 89 …t a trifle more, I fancy, Watson . And in practice again, I …
## 8 145 …ere's money in this case, Watson , if there is nothing else.…
## 9 163 …friend and colleague, Dr. Watson , who is occasionally good …
## 10 315 … for you. And good-night, Watson ," he added, as the wheels …
## 11 352 …s quite too good to lose, Watson . I was just balancing whet…
## 12 422 …as I had pictured it from Sherlock Holmes ' succinct description, but…
## 13 504 "Good-night, Mister Sherlock Holmes ."
## 14 515 …t it!" he cried, grasping Sherlock Holmes by either shoulder and loo…
## 15 553 "Mr. Sherlock Holmes , I believe?" said she.
## 16 559 "What!" Sherlock Holmes staggered back, white with…
## 17 565 …tter was superscribed to " Sherlock Holmes , Esq. To be left till call…
## 18 567 "MY DEAR MR. SHERLOCK HOLMES ,--You really did it very w…
## 19 569 …est to the celebrated Mr. Sherlock Holmes . Then I, rather imprudentl…
## 20 571 …s; and I remain, dear Mr. Sherlock Holmes ,
## ⋮ (189 rows total)
請注意,無論大小寫如何,這都與該術語匹配。
爲了您的具體使用情況,做
ix <- text_detect(text, terms)
或
matches <- text_subset(text, terms)
現在不能檢查,但'過濾器_()'應該工作 – MikolajM
尋求幫助時,你應該提供一個[可重現的例子](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)與樣品輸入和所需的輸出。一般來說,您需要在data.frames中搜索數值的特定列,而不是整行,因此最好在這裏更具體。 – MrFlick
如果有幫助,我已經創建了一個簡化的示例。 – user113156