gsub on values in a data frame column

I have a file with multiple columns. Here are the two columns I am interested in:

Probe.Set.ID   Entrez.Gene 
A01157cds_s_at    50682 
A03913cds_s_at    29366 
A04674cds_s_at 24860 /// 100909612 
A07543cds_s_at    24867 
A09811cds_s_at    25662 
----       ---- 
A16585cds_s_at    25616

I need to replace "///" with "\t" (a tab), and the output should look like this:

A01157cds_s_at;50682 
A03913cds_s_at;29366 
A04674cds_s_at;24860  100909612

I also need to leave out the "----" rows.

2016-07-26 user1631306

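If you replace '///' with a tab, some rows will end up with more columns than others, and those rows will not import correctly. From the look of it, I would import the file with 'read.fwf' and fix it afterwards. – alistaire

What exactly are you struggling with? From the look of it, this can be read as fixed-width data (see 'help(read.fwf)'), you have already found 'gsub' for replacing the separator, and you can apparently write the data out with 'write.csv2'. Which part do you need help with? By the way: 'fortune(230)' – Bernhard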
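The concern raised in the comments about uneven column counts can be checked directly. The following is a minimal sketch in base R, assuming the lines have already been pasted with ";" and had "///" replaced by a tab; the sample strings and the column names `id_gene` and `extra` are made up for illustration.

```r
# Lines as they might look after pasting with ";" and replacing "///" by a tab
lines <- c(
  "A01157cds_s_at;50682",
  "A03913cds_s_at;29366",
  "A04674cds_s_at;24860\t100909612"
)

# Count tab-separated fields per line; unequal counts mean ragged rows
con <- textConnection(lines)
n_fields <- count.fields(con, sep = "\t")
close(con)
print(n_fields)

# One way to read ragged rows anyway: fill = TRUE adds blank fields to short rows
ragged <- read.table(text = lines, sep = "\t", fill = TRUE,
                     col.names = c("id_gene", "extra"),  # hypothetical names
                     stringsAsFactors = FALSE)
print(ragged)
```

If every row has the same field count, a plain `read.table(sep = "\t")` should be enough and `fill = TRUE` is not needed.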

Here is a slightly different approach using dplyr:

# Sample data from the question, including the "----" placeholder row
data <- data.frame(
    Probe.Set.ID = c("A01157cds_s_at",
                     "A03913cds_s_at",
                     "A04674cds_s_at",
                     "A07543cds_s_at",
                     "A09811cds_s_at",
                     "----",
                     "A16585cds_s_at"),
    Entrez.Gene = c("50682",
                    "29366",
                    "24860 /// 100909612",
                    "24867",
                    "25662",
                    "----",
                    "25616"),
    stringsAsFactors = FALSE  # keep both columns as character vectors
)

if (!require(dplyr)) install.packages("dplyr")
library(dplyr)

data %>%
    filter(Entrez.Gene != "----") %>%  # drop the placeholder rows
    mutate(new_column = paste(Probe.Set.ID,
                              # replace "///" and the spaces around it with one tab
                              gsub("\\s*///\\s*", "\t", Entrez.Gene),
                              sep = ";")) %>%
    select(new_column)

2016-07-26 19:36:09 user2280549

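It looks like you want to subset the data, paste the two columns together, and then use gsub to replace '///'. Here is what I came up with, where dat is the data frame containing the two columns.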

dat = dat[dat$Probe.Set.ID != "----",] # removes the rows with "----"
dat = paste0(dat$Probe.Set.ID, ";", dat$Entrez.Gene) # pastes the columns together and adds the ";" 
dat = gsub("///","\t",dat) # replaces the "///" with a tab

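Also, use cat() to see the actual tab instead of "\t". I got this from: How to replace specific characters of a string with tab in R. Note that this produces a character vector, not a data.frame. You can convert it with data.frame(), but then you cannot use cat() to view it.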
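The difference between printing and cat() can be seen with a single string. This is a small sketch in base R; the input string is made up for illustration.

```r
# Replace the literal "///" with a tab (fixed = TRUE disables regex matching)
x <- gsub("///", "\t", "24860///100909612", fixed = TRUE)

print(x)        # shows the escape sequence: "24860\t100909612"
cat(x, "\n")    # writes the string with a real tab character
nchar("\t")     # 1: the tab is one character, not a backslash plus "t"
```

writeLines(x) behaves like cat() here and may be more convenient when writing the result to a file.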

2016-07-26 19:25:21

We can use dplyr and tidyr here.

library(dplyr)
library(tidyr)

df <- data.frame(
    col1 = c('A01157cds_s_at', 'A03913cds_s_at', 'A04674cds_s_at', 'A07543cds_s_at', '----'),
    col2 = c('50682', '29366', '24860 /// 100909612', '24867', '----'),
    stringsAsFactors = FALSE)

# Assign the result back; otherwise printing df would show the original data
df <- df %>% filter(col1 != '----') %>%
    # split on "///" and any surrounding spaces; rows with one ID get NA on the right
    separate(col2, c('col2_first', 'col2_second'), sep = '\\s*///\\s*',
             remove = TRUE, fill = 'right') %>%
    unite(col1_new, c(col1, col2_first), sep = ';', remove = TRUE)

df

##               col1_new col2_second
## 1 A01157cds_s_at;50682        <NA>
## 2 A03913cds_s_at;29366        <NA>
## 3 A04674cds_s_at;24860   100909612
## 4 A07543cds_s_at;24867        <NA>

filter removes the observations where col1 == '----'.
separate splits col2 into two columns, col2_first and col2_second.
unite joins col1 and col2_first with ; as the separator.
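For comparison, the same three steps can be sketched in base R without dplyr or tidyr. This is only an illustrative equivalent, not the answer's method; it uses a shortened, made-up sample with the same column names.

```r
# Shortened sample mirroring the data frame above (illustrative only)
df <- data.frame(
  col1 = c("A01157cds_s_at", "A04674cds_s_at", "----"),
  col2 = c("50682", "24860 /// 100909612", "----"),
  stringsAsFactors = FALSE
)

# filter step: drop the placeholder rows
kept <- df[df$col1 != "----", ]

# separate step: split col2 on "///" and trim the surrounding spaces
parts  <- lapply(strsplit(kept$col2, "///", fixed = TRUE), trimws)
first  <- vapply(parts, function(p) p[1], character(1))
second <- vapply(parts, function(p) p[2], character(1))  # NA when there is no second ID

# unite step: join col1 and the first ID with ";"
result <- data.frame(
  col1_new    = paste(kept$col1, first, sep = ";"),
  col2_second = second,
  stringsAsFactors = FALSE
)
print(result)
```

The tidyr version is shorter, but the base R version makes the whitespace trimming around "///" explicit.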

2016-07-26 20:10:09
