Find the last value in each row of a data frame, keep that value, and replace all the others

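I am trying to automate some data formatting in R. I have multiple locations per individual (one per row, so an individual may have several rows). I need to transpose the data so that each location date is a column and each individual has one row. If the individual was detected on that date, the column gets a 1, otherwise a 0.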

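Once that is done, I need to find the last observation (the last 1) in each row, keep that value as 1, and change every other value in the row to 0. I can find which rows and columns hold the final value, but I have not figured out how to fold that into a larger statement that finds these values and then replaces the row values that do not meet the criterion. I do not want to replace the values row by row by hand, and I do not need output telling me which rows/columns meet my criterion; the only reason I need to find them is to replace the other values in the data frame. Below is simulated-data code that I found on phidot.org (written by J Laake), which helped me build the transposed data frame. Create "intervals" and "occasions" as needed to split the locations into time periods.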

# create some dummy dates from tomorrow to 20 days from today 
x = c(Sys.Date()+1:20) 
# extract the year and change to numeric 
as.numeric(format(x, "%Y")) 
# you can also extract the month and day with 
as.numeric(format(x, "%m")) 
as.numeric(format(x, "%d")) 


# create dummy capture data; id is animal and date is the date it was captured or recaptured 
df=data.frame(id=floor(runif(100,1,50)),date=runif(100,0,5000)+as.Date("1980-01-01")) 

#create some dummy date intervals that are approximately every 6 months 
intervals=as.Date("1979-01-01")+seq(180,15*365,182.5) 

# cut the dates into intervals 
occasions=cut(df$date,intervals) 

#create the count table with id for rows and years for columns 
ch=with(df,table(id,occasions))

I get the following table (only the first 10 rows and 5 columns are shown here):

ch[10:20,1:10] 

occasions 
# id 1979-06-30 1979-12-29 1980-06-29 1980-12-28 1981-06-29 
# 1   0   1   0   0   0 
# 2   0   1   0   0   0 
# 3   0   0   0   0   0 
# 4   0   0   0   0   0 
# 5   0   0   0   0   0 
# 6   0   0   0   0   0 
# 7   0   0   0   0   0 
# 9   0   0   0   0   0 
# 10   0   1   0   0   0

Here is the code I put together to find the last 1 in each row and assign it to an object:

last <- apply(ch,1,function(x){tail(which(x==1),1)}) 
last
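One detail about the shape of `last` is worth spelling out: when a row contains no 1 at all, `tail(which(x == 1), 1)` returns a zero-length vector, so `apply()` cannot simplify the result and hands back a list. The sketch below illustrates this on a small, hypothetical matrix `m` (not the real data); using 0 as a "no detection" marker in `last_idx` is an assumption made only for illustration.

```r
# Hypothetical 3 x 4 detection matrix (not the poster's data)
m <- matrix(c(0, 1, 0, 1,
              0, 0, 0, 0,
              1, 1, 0, 0),
            nrow = 3, byrow = TRUE)

last <- apply(m, 1, function(x) tail(which(x == 1), 1))

# Row 2 has no 1, so its result has length 0 and apply() returns a list
is.list(last)

# One way to get a plain integer vector: use 0 to mean "no detection in this row"
last_idx <- vapply(last, function(i) if (length(i)) i else 0L, integer(1))
last_idx
```

A plain integer vector such as `last_idx` is easier to use for indexing later than a list with empty elements.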

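However, this is where I am stuck. I cannot figure out how to keep these values in the data frame as 1 and replace every other value in the data frame with 0.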

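Ultimately, in rows that contain several 1s, I want only the last 1 to remain and the rest of the entries changed to 0. So, if I have a table like this: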

# id 1979-06-30 1979-12-29 1980-06-29 1980-12-28 1981-06-29 
# 1   0   1   0   0   0 
# 2   0   1   1   1   0 
# 3   0   0   0   0   1 
# 4   0   0   0   0   0 
# 5   1   1   0   1   0 
# 6   0   1   0   1   0 
# 7   0   1   0   0   0 
# 9   1   0   0   1   1 
# 10   0   1   0   0   1

I would like to change the table to look like this:

# id 1979-06-30 1979-12-29 1980-06-29 1980-12-28 1981-06-29 
# 1   0   1   0   0   0 
# 2   0   0   0   1   0 
# 3   0   0   0   0   1 
# 4   0   0   0   0   0 
# 5   0   0   0   1   0 
# 6   0   0   0   1   0 
# 7   0   1   0   0   0 
# 9   0   0   0   0   1 
# 10   0   0   0   0   1

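My current transposed data frame "ch" is 348 rows x 462 columns. Data are added every year, so I would like to automate this process in R rather than formatting it in Excel each year and bringing it into R for analysis. I have looked at several questions and answers on this site, as well as phidot.org and the internet in general, and after several days I still have not figured this out. Thanks in advance for your time.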

Source

2017-02-27 berk0035

Alternatively, building on where you left off with the table and using base R, you could do

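# which.max(cumsum(row)) is the position of the last non-zero entry in a row;
# the any() guard keeps rows with no detections all 0 (which.max alone would return column 1 for them)
ch.new <- t(apply(ch, 1, function(row){keep <- which.max(cumsum(row)); hit <- any(row > 0); row[] <- 0; if (hit) row[keep] <- 1; row}))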
ch.new[1:6,] 
    occasions 
id 1979-06-30 1979-12-29 1980-06-29 1980-12-28 1981-06-29 1981-12-28 1982-06-29 1982-12-28 1983-06-29 1983-12-28 1984-06-28 
    1   0   0   0   0   0   0   0   0   0   0   0 
    2   0   0   0   0   0   0   0   0   0   0   0 
    4   0   0   0   0   0   0   0   0   0   1   0 
    5   0   0   0   0   0   0   0   0   0   0   0 
    6   0   0   0   1   0   0   0   0   0   0   0 
    8   0   0   0   1   0   0   0   0   0   0   0 
    occasions 
id 1984-12-27 1985-06-28 1985-12-27 1986-06-28 1986-12-27 1987-06-28 1987-12-27 1988-06-27 1988-12-26 1989-06-27 1989-12-26 
    1   0   0   0   0   0   0   0   0   0   0   0 
    2   0   0   0   0   0   0   1   0   0   0   0 
    4   0   0   0   0   0   0   0   0   0   0   0 
    5   0   0   0   0   0   0   0   0   0   0   0 
    6   0   0   0   0   0   0   0   0   0   0   0 
    8   0   0   0   0   0   0   0   0   0   0   0 
    occasions 
id 1990-06-27 1990-12-26 1991-06-27 1991-12-26 1992-06-26 1992-12-25 1993-06-26 
    1   0   0   1   0   0   0   0 
    2   0   0   0   0   0   0   0 
    4   0   0   0   0   0   0   0 
    5   0   0   0   0   0   1   0 
    6   0   0   0   0   0   0   0 
    8   0   0   0   0   0   0   0
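As a cross-check, the same result can be sketched without a row-wise `apply()` by using `max.col()` with `ties.method = "last"` and matrix indexing. This is an alternative technique, not the approach used above. The matrix `ex` is the example table from the question typed in by hand, and the names `ex`, `has_det`, `last_col` and `out` are made up for illustration.

```r
# The example table from the question, typed in by hand
ex <- matrix(c(0, 1, 0, 0, 0,
               0, 1, 1, 1, 0,
               0, 0, 0, 0, 1,
               0, 0, 0, 0, 0,
               1, 1, 0, 1, 0,
               0, 1, 0, 1, 0,
               0, 1, 0, 0, 0,
               1, 0, 0, 1, 1,
               0, 1, 0, 0, 1),
             ncol = 5, byrow = TRUE,
             dimnames = list(id = as.character(c(1:7, 9, 10)),
                             occasions = c("1979-06-30", "1979-12-29", "1980-06-29",
                                           "1980-12-28", "1981-06-29")))

# Rows with at least one detection; all-zero rows must be left untouched
has_det <- rowSums(ex > 0) > 0

# Column of the last detection in each row (ties among the maxima go to the last column)
last_col <- max.col((ex > 0) * 1, ties.method = "last")

# Start from an all-zero copy and set a single 1 per row via (row, column) index pairs
out <- ex
out[] <- 0
out[cbind(which(has_det), last_col[has_det])] <- 1
out
```

The `has_det` mask matters because `max.col()` still returns a column for a row of all zeros, and without the mask that row would wrongly receive a 1.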

Source

2017-02-28 01:38:54 gfgm

We can do this easily in data.table: instead of creating the intermediate matrix, we find the max row directly in the data.frame:

#replicate your data 
df=data.frame(id=floor(runif(100,1,50)),date=runif(100,0,5000)+as.Date("1980-01-01")) 

#create some dummy date intervals that are approximately every 6 months 
intervals=as.Date("1979-01-01")+seq(180,15*365,182.5) 

# cut the dates into intervals (I added this as a new column) 
df$occasions = as.Date(as.character(cut(df$date,intervals))) 

# convert to data.table 
library(data.table) 
setDT(df)

Now we can find the last date on which each ID was detected:

df_last <- df[, .(last_date = max(occasions)), by = id]

We convert back to a factor so that all the date intervals are represented:

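# assign with := so the factor actually replaces last_date (otherwise the result is only printed)
df_last[, last_date := factor(as.character(last_date), levels = as.character(sort(unique(intervals))))]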

We then cast this to get the desired matrix:

dcast(df_last, id ~ last_date, length, drop = FALSE, value.var = "last_date") 

# Top Corner 

    id 1979-12-29 1980-06-29 1980-12-28 1981-06-29 1981-12-28 1982-06-29 1982-12-28 
1: 1   0   0   0   0   0   0   0 
2: 2   0   0   0   0   0   0   0 
3: 3   0   0   0   0   0   0   0 
4: 4   0   0   0   0   1   0   0 
5: 5   0   1   0   0   0   0   0 
6: 6   0   0   0   0   0   0   0 
7: 7   0   0   0   0   0   0   0 
8: 8   0   0   0   0   0   0   0
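For readers without data.table, the same "work on the long data first" idea can be sketched in base R. This is only an illustration under assumptions: the data frame `caps`, the `breaks` vector and all other names here are hypothetical stand-ins for `df` and `intervals`.

```r
# Hypothetical long-format captures: one row per detection
caps <- data.frame(
  id   = c(1, 1, 2, 3, 3, 3),
  date = as.Date(c("1980-02-01", "1981-03-15", "1980-08-20",
                   "1980-01-10", "1980-09-05", "1981-08-01"))
)
breaks <- as.Date(c("1980-01-01", "1980-07-01", "1981-01-01",
                    "1981-07-01", "1982-01-01"))

# cut() returns a factor whose integer codes follow the chronological order of the intervals
occ <- cut(caps$date, breaks)

# Largest interval code per id = last occasion on which that id was seen
last_occ <- tapply(as.integer(occ), caps$id, max)

# Rebuild a factor with every interval as a level so empty occasions keep a column
last_f <- factor(levels(occ)[last_occ], levels = levels(occ))

# One row per id, one column per occasion, a single 1 at the last detection
ch_last <- table(id = names(last_occ), occasions = last_f)
ch_last
```

Because each id contributes exactly one value to `last_f`, every row of `ch_last` sums to 1, which is the "only the last 1 remains" layout requested in the question.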

Source

2017-02-28 01:25:15 Chris
