2016-08-01 105 views
-1

Get the centroid of latitude and longitude in a data frame

I have a data frame (df) with three columns like this (all numbers are random):

ID Lat Lon 
1 25.32 -63.32 
1 25.29 -64.21 
1 24.12 -62.43 
2 12.42 54.64 
2 12.11 53.43 
. .... .... 

Basically, I want the centroid for each ID, like so:

ID Lat Lon Cent_lat Cent_lon 
1 25.32 -63.32 25.31  -63.25 
1 25.29 -64.21 25.31  -63.25 
1 24.12 -62.43 25.31  -63.25 
2 12.42 54.64 12.20  53.60 
2 12.11 53.43 12.20  53.60 

I tried the following:

library(geosphere) 
library(rgeos) 
library(dplyr) 

df1 <- by(df,df$ID,centroid(df$Lat, df$Long)) 

But that gives me this error:

Error in (function (classes, fdef, mtable): unable to find an inherited method for function ‘centroid’ for signature ‘"numeric"’

I even tried:

df1 <- by(df,df$ID,centroid(as.numeric(df$Lat), as.numeric(df$Long))) 

But that gives me this error:

Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘centroid’ for signature ‘"function"’
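
For context, by() expects a function as its third argument and calls it once per group on that group's rows. In both attempts above, centroid(...) appears to be evaluated immediately on whole columns instead of being handed over as a function. A minimal base-R sketch of the calling pattern follows; colMeans is used purely as a stand-in for centroid, and pts is an illustrative name for the question's sample rows:

```r
# Sample rows from the question (pts is an illustrative name)
pts <- data.frame(ID  = c(1, 1, 1, 2, 2),
                  Lat = c(25.32, 25.29, 24.12, 12.42, 12.11),
                  Lon = c(-63.32, -64.21, -62.43, 54.64, 53.43))

# Pass the function itself (no parentheses, no arguments);
# by() calls it once per ID on the two-column subset.
# colMeans is only a stand-in for centroid here.
res <- by(pts[, c("Lon", "Lat")], pts$ID, colMeans)
res[["1"]]
```

The same shape should apply to any function that accepts a two-column (Lon, Lat) input.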

+0

Isn't the centroid of three points just their mean, (mean(lon), mean(lat))? – lmo

+0

In most cases we have more than three points, and the averaging method would work if the Earth were flat :-) –

+0

To use 'centroid', you need a polygon as a matrix object, or a data frame with proper rownames for each point – Robert
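
To illustrate the flat-Earth remark in the comments above, here is a rough base-R sketch that averages points as 3D unit vectors instead of averaging degrees directly. sphericalMean is a hypothetical helper written for this illustration; it gives the mean position of a set of points on a sphere, which is not the same thing as the polygon centroid that geosphere's centroid computes.

```r
# Mean position of points on a unit sphere
# (illustrative helper, not geosphere's centroid)
sphericalMean <- function(lon, lat) {
  rad <- pi / 180
  # Convert each point to a 3D unit vector
  x <- cos(lat * rad) * cos(lon * rad)
  y <- cos(lat * rad) * sin(lon * rad)
  z <- sin(lat * rad)
  # Average the vectors, then convert back to degrees
  mx <- mean(x); my <- mean(y); mz <- mean(z)
  c(lon = atan2(my, mx) / rad,
    lat = atan2(mz, sqrt(mx^2 + my^2)) / rad)
}

# Two points on the equator, either side of the antimeridian
lon <- c(170, -170)
lat <- c(0, 0)

naive <- c(lon = mean(lon), lat = mean(lat))  # lon 0: the wrong side of the globe
sph   <- sphericalMean(lon, lat)              # lon at the antimeridian (+/-180)
naive
sph
```

For points that are close together and far from the antimeridian and the poles, the two results should be nearly identical, which is why plain averaging often looks fine on small regions.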

Answers

2

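Here is a data.table approach. As @czeinerb mentioned, Lon is the first argument of the centroid function and Lat is the second. We re-define the centroid function below so that, inside the data.table aggregation, it receives a matrix with 2 columns (Lon | Lat), which is the input that geosphere's centroid function requires.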

# Import packages 
library(geosphere) 
library(data.table) # Using a data.table approach 

# Sample data 
df = data.frame("ID" = c(1, 1, 1, 2, 2, 2), "Lat" = c(25.32, 25.29, 24.12, 12.42, 12.11, 12.22), "Lon" = c(-63.32, -64.21, -62.43, 54.64, 53.43, 53.23)) 

df 

  ID   Lat    Lon
1  1 25.32 -63.32
2  1 25.29 -64.21
3  1 24.12 -62.43
4  2 12.42  54.64
5  2 12.11  53.43
6  2 12.22  53.23

# Convert to data.table 
setDT(df) 

# Re-define centroid function - Lon is first argument and Lat is second 
# Geosphere takes a matrix with two columns: Lon|Lat, so we use cbind to coerce the data to this form 
findCentroid <- function(Lon, Lat, ...){ 
    centroid(cbind(Lon, Lat), ...) 
} 

# Find centroid Lon and Lat by ID, as required 
df[, c("Cent_lon", "Cent_lat") := as.list(findCentroid(Lon, Lat)), by = ID] 
df 

   ID   Lat    Lon  Cent_lon Cent_lat
1:  1 25.32 -63.32 -63.32000 24.91126
2:  1 25.29 -64.21 -63.32000 24.91126
3:  1 24.12 -62.43 -63.32000 24.91126
4:  2 12.42  54.64  53.76667 12.25003
5:  2 12.11  53.43  53.76667 12.25003
6:  2 12.22  53.23  53.76667 12.25003
+0

Thank you very much, I like the comments and the structure –

0

?centroid says that it takes only a 2-column matrix as its argument. The ID information you have turns the matrix into three columns.

# Note: each row here is (lat, lon), whereas centroid() reads column 1 as longitude
df <- rbind(c(25.32,-63.32),c(25.29,-64.32),c(24.12,-62.43),c(12.42,54.64),c(12.11,53.43))
centroid(df)

lon  lat 
[1,] 24.27109 -60.37098 
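
One way to get rid of the third column, sketched in base R under the assumption that the data frame has columns named ID, Lat and Lon as in the question (pts and mats are illustrative names): split the coordinate columns by ID and convert each piece to a two-column matrix with longitude first.

```r
# Sample rows from the question (pts is an illustrative name)
pts <- data.frame(ID  = c(1, 1, 1, 2, 2),
                  Lat = c(25.32, 25.29, 24.12, 12.42, 12.11),
                  Lon = c(-63.32, -64.21, -62.43, 54.64, 53.43))

# Keep only the coordinates (longitude first), split them by ID,
# and turn each group into a plain two-column matrix
mats <- lapply(split(pts[, c("Lon", "Lat")], pts$ID), as.matrix)

# Each element now has exactly two columns: Lon, Lat
sapply(mats, dim)
```

Each element of mats is then in the two-column shape that the help page describes.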
2
library(geosphere) 
library(ggplot2) 
library(dplyr) 

states <- map_data("state") 

head(states) 
##  long  lat group order region subregion 
## 1 -87.46201 30.38968  1  1 alabama  <NA> 
## 2 -87.48493 30.37249  1  2 alabama  <NA> 
## 3 -87.52503 30.37249  1  3 alabama  <NA> 
## 4 -87.53076 30.33239  1  4 alabama  <NA> 
## 5 -87.57087 30.32665  1  5 alabama  <NA> 
## 6 -87.58806 30.32665  1  6 alabama  <NA> 

cntrd <- function(x) { 
    data.frame(centroid(as.matrix(x[,c("long", "lat")]))) 
} 

by(states, states$group, cntrd) %>% head() 
## $`1` 
##   lon  lat 
## 1 -86.82976 32.82735 
## 
## $`2` 
##   lon  lat 
## 1 -111.6698 34.34309 
## 
## $`3` 
##   lon  lat 
## 1 -92.43826 34.92167 
## 
## $`4` 
##   lon  lat 
## 1 -119.6713 37.40289 
## 
## $`5` 
##   lon  lat 
## 1 -105.5526 39.02653 
## 
## $`6` 
##   lon  lat 
## 1 -72.72553 41.62706 

group_by(states, group) %>% 
    do(cntrd(.)) 
## Source: local data frame [63 x 3] 
## Groups: group [63] 
## 
## group  lon  lat 
## <dbl>  <dbl> <dbl> 
## 1  1 -86.82976 32.82735 
## 2  2 -111.66978 34.34309 
## 3  3 -92.43826 34.92167 
## 4  4 -119.67130 37.40289 
## 5  5 -105.55264 39.02653 
## 6  6 -72.72553 41.62706 
## 7  7 -75.51543 39.00879 
## 8  8 -77.03411 38.91083 
## 9  9 -82.51260 28.69498 
## 10 10 -83.46361 32.67562 
## # ... with 53 more rows 
1

To use centroid you need a polygon given as longitude and latitude, in that order. See this example:

library(geosphere) 

df<-rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20), 
c(-100,-50), c(-160,-60), c(-180, -10), c(-160,10), c(-60,0),c(-100,-50)) 
df<-data.frame(ID=rep(c(1,2),times=c(5,6)),Lon=df[,1],Lat=df[,2]) 
df1 <- by(df[,c("Lon", "Lat")],df$ID,centroid) 
df1 
df[,c("Cent_lon","Cent_lat")]<-NA 
# centroid returns (lon, lat); assign each value to its own column 
for(i in names(df1)){ 
    df[df$ID==i,"Cent_lon"]<-df1[[i]][1] 
    df[df$ID==i,"Cent_lat"]<-df1[[i]][2] 
} 
df 

   ID  Lon Lat   Cent_lon  Cent_lat 
1   1 -180 -20 -133.33333 -23.89340 
2   1 -160   5 -133.33333 -23.89340 
3   1  -60   0 -133.33333 -23.89340 
4   1 -160 -60 -133.33333 -23.89340 
5   1 -180 -20 -133.33333 -23.89340 
6   2 -100 -50 -127.66065 -26.10686 
7   2 -160 -60 -127.66065 -26.10686 
8   2 -180 -10 -127.66065 -26.10686 
9   2 -160  10 -127.66065 -26.10686 
10  2  -60   0 -127.66065 -26.10686 
11  2 -100 -50 -127.66065 -26.10686 
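
As an alternative to filling the columns in a loop, the per-ID values can be attached with a lookup. This is only a sketch: it assumes the centroids have already been collected into a small table (here called cents, an illustrative name), with the values read from the output above, longitude first.

```r
# Per-ID centroids, values read from the output above
# (cents and ids are illustrative names)
cents <- data.frame(ID = c(1, 2),
                    Cent_lon = c(-133.33333, -127.66065),
                    Cent_lat = c(-23.89340, -26.10686))

# Same ID layout as the example: five rows for ID 1, six for ID 2
ids <- data.frame(ID = rep(c(1, 2), times = c(5, 6)))

# match() finds each row's ID in the centroid table, so every row
# picks up its own group's centroid and the row order is preserved
out <- cbind(ids, cents[match(ids$ID, cents$ID), c("Cent_lon", "Cent_lat")])
rownames(out) <- NULL
out
```

Because each value is addressed by column name, longitude and latitude cannot end up in each other's columns.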

You can use plotArrows from geosphere to view the polygon:

pol<-split(df[,2:3],df$ID) 
#plotArrows(pol[[1]]) 
plotArrows(as.matrix(pol[[1]])) 
points(df1[[1]],col=4) 

[Figure: polygon for ID 1 drawn with plotArrows, with its centroid plotted as a point]

+0

Thanks for the answer. You used a plot to illustrate it, which well deserves my upvote –

1

The centroid function needs a matrix as its data argument: "Arguments: x  a 2-column matrix (longitude/latitude)"

https://cran.r-project.org/web/packages/geosphere/geosphere.pdf

Also, longitude is the first column and latitude is the second, not the other way around :) So in your case the code could look like this:

library(geosphere) 

df <- data.frame(ID = c(1,1,1,2,2,2,2) 
       , Lon = c(-63.32, -64.43, -62.43, 54.64, 53.43, 54.64, 53.43) 
       , Lat = c(25.32, 25.29, 24.12, 12.42, 12.11, 11.11, 10.55)) 
mx <- as.matrix(df) 

(mx1 <- by(mx[,2:3], mx[,1], centroid)) 

With the output:

> INDICES: 1 
> lon  lat 
> [1,] -63.39333 24.91126 
> ----------------------------------------------------------------- 
> INDICES: 2 
> lon lat 
> [1,] Inf 90