2017-10-08 102 views

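Creating a 5-way Venn diagram from a CSV file in R

I have results from a survey in which people answered 5 questions (1 = yes, 0 = no) about how they describe themselves: CRAFTSPERSON, DESIGNER, FABRICATOR, FINE ARTIST, OTHER.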

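I would like to see how people fall into these circles of identity, and to show that in a Venn diagram. Most of what I have found tells me to compute each section (n1, n2, ..., n12345) and then plot them with draw.quintuple.venn. Since there will be 20 sections, I am looking for an easier way that does not involve copying the code 20 times with slight adjustments.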

I have installed the VennDiagram package but am struggling to use it with 5 sets. I am using venn.diagram but am not sure what to put in the parentheses.

The data looks like this and runs to about 300 rows (the alignment is a bit off, but each header goes with one column):

CRAFTSPERSON DESIGNER FABRICATOR FINE ARTIST OTHER

0   1   0  0 1  1 
1   1   0  0 0  1  
0   0   0  0 0  1 
0   0   0  0 1  1 

Thanks

Answer


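Using the VennDiagram library requires defining the intersection of every set. That can be cumbersome, I agree. I recently came across the eulerr library. Among other things, that package makes the set and intersection areas proportional to the counts.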
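On the question of what goes inside venn.diagram(): as far as I know, that function takes a named list of element vectors rather than per-region counts, so the intersections do not have to be typed out by hand. A minimal sketch, assuming made-up survey data (the `survey` data frame and its random values are hypothetical, and FINE_ARTIST is written with an underscore to keep it a valid column name):

```r
# Hypothetical stand-in for the survey: 300 respondents, five 0/1 columns
set.seed(1)
survey <- data.frame(
  CRAFTSPERSON = rbinom(300, 1, 0.5),
  DESIGNER     = rbinom(300, 1, 0.5),
  FABRICATOR   = rbinom(300, 1, 0.5),
  FINE_ARTIST  = rbinom(300, 1, 0.5),
  OTHER        = rbinom(300, 1, 0.5)
)

# One set per column: the row numbers of the respondents who answered 1
sets <- lapply(survey, function(x) which(x == 1))

# Only draw if the package is installed; filename = NULL returns grid objects
if (requireNamespace("VennDiagram", quietly = TRUE)) {
  venn_grob <- VennDiagram::venn.diagram(sets, filename = NULL)
  grid::grid.newpage()
  grid::grid.draw(venn_grob)
}
```

The row numbers act as respondent IDs, so two sets overlap exactly where the same respondent answered 1 in both columns.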

Here is an eulerr example with 4 sets.

First, some data:

set.seed(2)
df <- data.frame(A = sample(c(0, 1), 100, replace = TRUE),
                 B = sample(c(0, 1), 100, replace = TRUE),
                 C = sample(c(0, 1), 100, replace = TRUE),
                 D = sample(c(0, 1), 100, replace = TRUE))


library("eulerr")
set.seed(10) # this seed changes the orientation of the sets
# newer eulerr releases use quantities = TRUE in place of counts = TRUE
plot(euler(df), counts = TRUE, fontface = 1)


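You can also do the same thing the Venn diagram way, by supplying the count for each set and each intersection yourself: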

set.seed(10)
sp_euler <- with(df,
                 euler(c("A" = sum(A),
                         "B" = sum(B),
                         "C" = sum(C),
                         "D" = sum(D),
                         "A&B" = sum(A == 1 & B == 1),
                         "A&C" = sum(A == 1 & C == 1),
                         "A&D" = sum(A == 1 & D == 1),
                         "B&C" = sum(B == 1 & C == 1),
                         "B&D" = sum(B == 1 & D == 1),
                         "C&D" = sum(C == 1 & D == 1),
                         "A&B&C" = sum(A == 1 & B == 1 & C == 1),
                         "A&B&D" = sum(A == 1 & B == 1 & D == 1),
                         "A&C&D" = sum(A == 1 & C == 1 & D == 1),
                         "B&C&D" = sum(B == 1 & C == 1 & D == 1),
                         "A&B&C&D" = sum(A == 1 & B == 1 & C == 1 & D == 1)),
                       input = "union"))

plot(sp_euler, counts = TRUE, fontface = 1)
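Typing out every combination gets tedious as the number of sets grows, which is the concern raised in the question. A possible sketch that builds the same kind of named vector for any number of 0/1 columns with combn(); the helper name `intersection_counts` and the toy data are made up for illustration:

```r
# Hypothetical helper: for every combination of columns, count the rows
# that have a 1 in all of them (the "union"-style input used above)
intersection_counts <- function(df) {
  sets <- names(df)
  combos <- unlist(
    lapply(seq_along(sets), function(k) combn(sets, k, simplify = FALSE)),
    recursive = FALSE
  )
  counts <- vapply(combos, function(cols) {
    sum(rowSums(df[, cols, drop = FALSE] == 1) == length(cols))
  }, numeric(1))
  names(counts) <- vapply(combos, paste, character(1), collapse = "&")
  counts
}

# Toy data, small enough to check by hand
toy <- data.frame(A = c(1, 1, 0, 1),
                  B = c(1, 0, 0, 1),
                  C = c(0, 0, 1, 1))
cnt <- intersection_counts(toy)
print(cnt)

# Only fit if eulerr is installed
if (requireNamespace("eulerr", quietly = TRUE)) {
  fit <- eulerr::euler(cnt, input = "union")
  print(fit)
}
```

With five columns the helper returns 31 named counts, one per combination of sets.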

There is also a Shiny app:

http://jolars.co/eulerr/

With 5 sets:

set.seed(2)
df <- data.frame(A = sample(c(0, 1), 100, replace = TRUE),
                 B = sample(c(0, 1), 100, replace = TRUE),
                 C = sample(c(0, 1), 100, replace = TRUE),
                 D = sample(c(0, 1), 100, replace = TRUE),
                 E = sample(c(0, 1), 100, replace = TRUE))

set.seed(10)
plot(euler(df), counts = TRUE, fontface = 1)
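To go from a CSV file to the same plot, something like the following might work. It is only a sketch: the file is a temporary one written on the spot so the example runs by itself, the rows in it are invented, and the real survey file would be read with the same read.csv() call. Setting check.names = FALSE keeps the "FINE ARTIST" header as written instead of turning it into FINE.ARTIST.

```r
# Write a small made-up CSV so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "CRAFTSPERSON,DESIGNER,FABRICATOR,FINE ARTIST,OTHER",
  "0,1,0,0,1",
  "1,1,0,0,0",
  "0,0,1,1,0",
  "1,0,1,0,1"
), csv_path)

# Read it back; check.names = FALSE preserves the space in "FINE ARTIST"
survey <- read.csv(csv_path, check.names = FALSE)

# Sanity check: every answer should be 0 or 1
stopifnot(all(as.matrix(survey) %in% c(0, 1)))

# Only plot if eulerr is installed
if (requireNamespace("eulerr", quietly = TRUE)) {
  print(plot(eulerr::euler(survey)))
}
```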


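However, this may not be suitable for every combination of sets: if the Euler model cannot represent an intersection, its count is not plotted. For example, in the 4-set example the intersection of B and D is 5, not 0 as one might conclude from the plot: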

set.seed(2)
df <- data.frame(A = sample(c(0, 1), 100, replace = TRUE),
                 B = sample(c(0, 1), 100, replace = TRUE),
                 C = sample(c(0, 1), 100, replace = TRUE),
                 D = sample(c(0, 1), 100, replace = TRUE))
eu_model <- euler(df)
eu_model
# output:

        original fitted residuals region_error
A              5  5.022    -0.022        0.023
B              5  5.000     0.000        0.023
C              8  8.004    -0.004        0.037
D              7  7.012    -0.012        0.033
A&B            6  0.000     6.000        0.065
A&C            5  4.985     0.015        0.023
A&D            9  8.978     0.022        0.041
B&C           11 11.004    -0.004        0.051
B&D            5  0.000     5.000        0.054
C&D            6  0.000     6.000        0.065
A&B&C          8  7.985     0.015        0.037
A&B&D          4  0.000     4.000        0.043
A&C&D          7  7.018    -0.018        0.033
B&C&D          6  0.000     6.000        0.065
A&B&C&D        1  0.000     1.000        0.011

diag_error: 0.065
stress:     0.23
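Because a fitted diagram can silently drop regions like these, it may help to tabulate the exclusive region counts straight from the data and compare them with the plot. A rough sketch in base R; the helper name `exclusive_counts` and the toy data are made up, and the eulerr part assumes the fitted object exposes `original.values` and `fitted.values`, which is how I remember it but worth checking against your installed version:

```r
# Hypothetical helper: label each row by exactly the sets it belongs to,
# then count the rows per label (rows belonging to no set are skipped)
exclusive_counts <- function(df) {
  labels <- apply(df == 1, 1, function(row) {
    paste(names(df)[row], collapse = "&")
  })
  table(labels[labels != ""])
}

# Toy data, small enough to check by hand; the last row is in no set
toy <- data.frame(A = c(1, 1, 0, 1, 0),
                  B = c(1, 0, 0, 1, 0),
                  C = c(0, 0, 1, 1, 0))
ec <- exclusive_counts(toy)
print(ec)

# Assumption: the euler object stores original.values and fitted.values
if (requireNamespace("eulerr", quietly = TRUE)) {
  fit <- eulerr::euler(toy)
  lost <- fit$original.values > 0 & round(fit$fitted.values, 3) == 0
  print(names(fit$original.values)[lost]) # regions present in data, absent in fit
}
```

These exclusive counts correspond to the "original" column in the output above, where each respondent is counted in exactly one region.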

Another option is limma:

library(limma) # part of Bioconductor

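To install (biocLite() has since been retired; current Bioconductor releases are installed through BiocManager):

install.packages("BiocManager")
BiocManager::install("limma")
library(limma)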

To plot:

vennDiagram(vennCounts(df), circle.col = 1:5) 
