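The awk below runs and produces the current output: each line of file1 is compared with file2, and the lines are reported as Match, Missing in file1, or Missing in file2. I am trying to add a condition that also extracts the text or value after the tags AF=, FR=, HRUN=, LEN= and TYPE=, but I cannot work out how to extract up to the ; (semicolon). There may not always be text after a tag, but the value always ends with ;. The decimal in $6 should also be shown to 3 significant digits so it is easier to read. The script seems close, but there are a few things I am not sure how to do. Thank you :).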
file1
chr1 43814978 COSM27286 G A 86.92679999999999 PASS
AF=0;AO=1;DP=5535;FAO=0;FR=.,REALIGNEDx0.008;HRUN=1;LEN=1;TYPE=snp;VARB=0;HS;
chr1 43814981 COSM27287 G A 86.83350000000002 PASS
AF=0;AO=2;DP=5556;FAO=0;FR=.;HRUN=1;LEN=1;TYPE=snp;VARB=0;HS;
chr1 43815008 COSM29008;COSM43212 TGG AAA,AAG 70.3099 PASS
AF=0,0;AO=0,0;DP=5528;FAO=0,0;FR=.,.,;HRUN=1,1;LEN=3,2,;TYPE=mnp,mnp;VARB=0,0;HS;
file2
chr1 43814979 COSM27286 G A 86.92679999999999 PASS
AF=0;AO=1;DP=5535;FAO=0;FR=.,REALIGNEDx0.008;HRUN=1;LEN=1;TYPE=snp;VARB=0;HS;
chr1 43814981 COSM27287 G A 86.83350000000002 PASS
AF=0;AO=2;DP=5556;FAO=0;FR=.;HRUN=1;LEN=1;TYPE=snp;VARB=0;HS;
chr1 43815008 COSM29008;COSM43212 TGG AAA,AAG 70.3099 PASS
AF=0,0;AO=0,0;DP=5528;FAO=0,0;FR=.,.,;HRUN=1,1;LEN=3,2,;TYPE=mnp,mnp;VARB=0,0;HS;
Desired output
Match:
chr1 43814981 COSM27287 G A 86.8 PASS
AF=0;FR=.;HRUN=1;LEN=1;TYPE=snp
chr1 43815008 COSM29008;COSM43212 TGG AAA,AAG 70.3099 PASS
AF=0,0;FR=.,.,;HRUN=1,1;LEN=3,2,;TYPE=mnp,mnp
Missing in file1:
chr1 43814979 COSM27286 G A 86.9 PASS
AF=0;FR=.,REALIGNEDx0.008;HRUN=1;LEN=1;TYPE=snp
Missing in file2:
chr1 43814978 COSM27286 G A 86.9 PASS
AF=0;FR=.,REALIGNEDx0.008;HRUN=1;LEN=1;TYPE=snp
awk
awk 'FNR==1 { next }
FNR == NR { file1[$1,$2,$3,$4,$5,$6,$7] = $1 " " $2 " " $3 " " $4 " " $5 " " $6 " "$7 }
FNR != NR { file2[$1,$2,$3,$4,$5,$6,$7] = $1 " " $2 " " $3 " " $4 " " $5 " " $6 " "$7 }
END { print "Match:"; for (k in file1) if (k in file2) print file1[k] # Or file2[k]
print "Missing in file1:"; for (k in file2) if (!(k in file1)) print file2[k]
print "Missing in file2:"; for (k in file1) if (!(k in file2)) print file1[k]
}' file1 file2 > output
Current output
Match:
chr1 43814981 COSM27287 G A 86.83350000000002 PASS
chr1 43815008 COSM29008;COSM43212 TGG AAA,AAG 70.3099 PASS
Missing in file1:
chr1 43814979 COSM27286 G A 86.92679999999999 PASS
Missing in file2:
chr1 43814978 COSM27286 G A 86.92679999999999 PASS
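Could you please explain, for the 2nd line under Match, what logic makes (COSM29008; COSM43212; COSM19193; COSM27289; COSM28487) come out as only (COSM29008; COSM43212)? – RavinderSingh13
You seem to have asked enough questions on 'awk' to at least make a decent attempt at solving the problem yourself. But you keep asking for free code? – Inian
I'm sorry, that was a typo on my part; you are correct about the code. Sorry, and thank you :). – Chris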
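One possible way to add the two missing pieces (keeping only the AF=, FR=, HRUN=, LEN= and TYPE= tags, and printing $6 to 3 significant digits) is sketched below. It rests on several assumptions that are not confirmed by the question: the columns are whitespace-separated, the tag string is field $8 on the same line as the first seven fields (the sample above shows it wrapped onto its own line), and each file starts with a header line, since the original script skips FNR==1. The names compare_variants and pick are made up for illustration. The key is still built from the unrounded $1 to $7, as in the original script, and arrival order is recorded so the output does not depend on the unspecified order of for (k in array).

```shell
# Hypothetical helper: compare two variant files on $1..$7 and print the
# selected tags from $8. Assumes a header line and whitespace-separated fields.
compare_variants() {
    awk '
    # Return only the wanted tags from a ";"-separated string, in file order.
    function pick(info,    n, i, parts, out, name) {
        n = split(info, parts, ";")
        out = ""
        for (i = 1; i <= n; i++) {
            name = parts[i]
            sub(/=.*/, "", name)          # tag name is everything before "="
            if (name in want)
                out = out (out == "" ? "" : ";") parts[i]
        }
        return out
    }
    BEGIN {
        n = split("AF FR HRUN LEN TYPE", t, " ")
        for (i = 1; i <= n; i++) want[t[i]] = 1
    }
    FNR == 1 { next }                     # skip the header, as the original does
    {
        # Same key as the original script: the unrounded fields 1 to 7.
        key = $1 SUBSEP $2 SUBSEP $3 SUBSEP $4 SUBSEP $5 SUBSEP $6 SUBSEP $7
        # %.3g prints $6 with 3 significant digits (86.8335... becomes 86.8).
        rec = $1 " " $2 " " $3 " " $4 " " $5 " " sprintf("%.3g", $6) " " $7 "\n" pick($8)
    }
    FNR == NR { file1[key] = rec; order1[++n1] = key; next }
              { file2[key] = rec; order2[++n2] = key }
    END {
        print "Match:"
        for (i = 1; i <= n1; i++) if (order1[i] in file2) print file1[order1[i]]
        print "Missing in file1:"
        for (i = 1; i <= n2; i++) if (!(order2[i] in file1)) print file2[order2[i]]
        print "Missing in file2:"
        for (i = 1; i <= n1; i++) if (!(order1[i] in file2)) print file1[order1[i]]
    }
    ' "$1" "$2"
}
```

It would be called as compare_variants file1 file2 > output. Note that %.3g also turns 70.3099 into 70.3, whereas the desired output above leaves that value unrounded; if that line is meant to stay as it is, the rounding rule needs to be stated more precisely.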