2016-04-28 55 views
0

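How to write a bash script that matches files

I want to write a script that matches two files. One file changes all the time; the other serves as a database.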

Input file 1:

1 
3 
5 
7 
9 

Database file 1 (to match against):

A B C D E F 
1 0.27776079 0.302853938 1.52415756 2.751714059 1.363932416 2.286189771 
2 0.332465 0.777918524 0.705056607 0.484138872 0.443787105 0.848742839 
3 0.941768856 0.19125 0.573714912 0.5040488 0.526207725 1.554118026 
4 1.717348092 0.19642752 0.315945 0.1331712 0.28427498 0.30113875 
5 0.802253697 0.3768849 0.426688 0.27693 0.591697038 0.3832675 
6 0.2752232 0.570078 0.3847095 0.659548575 0.327469824 0.3346875 
7 0.153272 0.36594447 0.19125 0.526602427 0.44771265 0.31136 
8 0.637448551 0.735756919 1.284158594 0.464060016 0.259459816 0.887975536 
9 0.397221469 0.20808 0.268226 0.710250679 0.493069267 0.47672443 
10 0.196928 0.492713856 0.22302 0.783853054 0.303534 1.736908487 
11 0.510789888 0.14948712 0.26432 0.684485438 0.683017627 0.614033957 

Expected output file 1:

A B C D E F 
1 0.27776079 0.302853938 1.52415756 2.751714059 1.363932416 2.286189771 
3 0.941768856 0.19125 0.573714912 0.5040488 0.526207725 1.554118026 
5 0.802253697 0.3768849 0.426688 0.27693 0.591697038 0.3832675 
7 0.153272 0.36594447 0.19125 0.526602427 0.44771265 0.31136 
9 0.397221469 0.20808 0.268226 0.710250679 0.493069267 0.47672443 

I want to extract the matching lines from the database. This is what I run at the moment:

head -1 database1.txt > output1.txt 
grep -wf inputfile1.txt database1.txt >> output1.txt 
head -1 database1.txt > output2.txt 
grep -wf inputfile2.txt database1.txt >> output2.txt 
head -1 database2.txt > output3.txt 
grep -wf inputfile3.txt database2.txt >> output3.txt 

I tried keeping these commands in a file edited with nano, but I have to change the commands by hand every time.
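One way to stop editing the commands is to pass the file names as arguments. The sketch below stays with `grep` but anchors every key to the start of the line, because `grep -w` looks for the key anywhere on the line, so key `1` can also match a value such as `1.717348092` in row 4. The function name `extract_rows` is hypothetical, and the sketch assumes the keys are plain integers (no regex metacharacters), that the ID is the first column of the database, and that `grep` accepts `-f -` to read patterns from standard input (GNU grep does).

```shell
#!/usr/bin/env bash
# extract_rows KEYS DB: print the header of DB, then every row of DB whose
# first column equals one of the keys listed in KEYS.
# (The function name is hypothetical, not part of the original question.)
extract_rows() {
    local keys=$1 db=$2
    head -n 1 "$db"
    # Strip trailing blanks, drop empty lines (an empty pattern would match
    # every row), then turn each key N into the anchored pattern
    # "^N[[:space:]]" so it can only match column 1.
    sed -e 's/[[:space:]]*$//' -e '/^$/d' -e 's/.*/^&[[:space:]]/' "$keys" |
        grep -f - "$db"
}
```

With the file names from the question, a call could look like `extract_rows inputfile1.txt database1.txt > output1.txt`.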

+3

An example of the input and the desired output would be helpful.

+0

@JohnZwinck Sorry, I hope it is clearer now. Thank you very much for your help!

+0

Have you looked into [diff](http://www.gnu.org/software/diffutils/manual/diffutils.html#Output-Formats)? – Jens

Answers

1

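You can use the `join` command to join the two files on their first column. The header line is printed separately with `sed`, and `--nocheck-order` stops `join` from complaining that the numeric IDs are not in the lexical order it expects: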

$ cat file1 
1 
3 
5 
7 
9 
$ cat file2 
A B C D E F 
1 0.27776079 0.302853938 1.52415756 2.751714059 1.363932416 2.286189771 
2 0.332465 0.777918524 0.705056607 0.484138872 0.443787105 0.848742839 
3 0.941768856 0.19125 0.573714912 0.5040488 0.526207725 1.554118026 
4 1.717348092 0.19642752 0.315945 0.1331712 0.28427498 0.30113875 
5 0.802253697 0.3768849 0.426688 0.27693 0.591697038 0.3832675 
6 0.2752232 0.570078 0.3847095 0.659548575 0.327469824 0.3346875 
7 0.153272 0.36594447 0.19125 0.526602427 0.44771265 0.31136 
8 0.637448551 0.735756919 1.284158594 0.464060016 0.259459816 0.887975536 
9 0.397221469 0.20808 0.268226 0.710250679 0.493069267 0.47672443 
10 0.196928 0.492713856 0.22302 0.783853054 0.303534 1.736908487 
11 0.510789888 0.14948712 0.26432 0.684485438 0.683017627 0.614033957 
$ sed -n '1p' file2 && join --nocheck-order file1 <(sed -n '1!p' file2) 
A B C D E F 
1 0.27776079 0.302853938 1.52415756 2.751714059 1.363932416 2.286189771 
3 0.941768856 0.19125 0.573714912 0.5040488 0.526207725 1.554118026 
5 0.802253697 0.3768849 0.426688 0.27693 0.591697038 0.3832675 
7 0.153272 0.36594447 0.19125 0.526602427 0.44771265 0.31136 
9 0.397221469 0.20808 0.268226 0.710250679 0.493069267 0.47672443 
$ 
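A caveat on the command above: `join` compares keys as strings and expects both inputs in that order, so with `--nocheck-order` a key such as `10` listed after `3` in `file1` could be skipped silently. A minimal sketch that sorts both inputs first and restores numeric order afterwards is shown here; the function name `match_sorted` is hypothetical, and forcing `LC_ALL=C` is an assumption made so that `sort` and `join` agree on the collation.

```shell
#!/usr/bin/env bash
# match_sorted KEYS DB: header of DB plus the rows whose first column is in KEYS.
# (Hypothetical helper name.) The body runs in a subshell so the locale
# change does not leak into the calling shell.
match_sorted() (
    export LC_ALL=C              # assumption: byte order for both sort and join
    keys=$1 db=$2
    head -n 1 "$db"              # keep the header out of the join
    # Sort both inputs on field 1 the way join expects, join them,
    # then put the result back into numeric order.
    join <(sort -k 1b,1 "$keys") <(tail -n +2 "$db" | sort -k 1b,1) |
        sort -k 1,1n
)
```

For the sample data this should give the same rows as the command above, for example `match_sorted file1 file2`.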
+0

What is all the `sed` stuff for? For me it is just `join -t ' ' file1 file2`.

+0

So the script would be `sed -n '1p' file2 && join --nocheck-order file1 <(sed -n '1!p' file2)`? But what if there are 10 different files in place of file1?

+0

Yes. If you have more than one file, you can use `<(cat file{1..10})` instead of `file1`. – ritesht93
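Note that `<(cat file{1..10})` merges all the keys into one list and therefore gives one combined output. If each input file should get its own output, as in the question, a loop is one option. The sketch below swaps `join` for `awk`, which does not depend on the order of the rows. The function name `filter_all` and the `.matched.txt` suffix are hypothetical, and the sketch assumes the ID is the first whitespace-separated column.

```shell
#!/usr/bin/env bash
# filter_all DB KEYFILE...: for each KEYFILE, write the header of DB plus the
# rows whose first column appears in KEYFILE to "<KEYFILE without .txt>.matched.txt".
# (Function name and output suffix are hypothetical.)
filter_all() {
    local db=$1 keys
    shift
    for keys in "$@"; do
        awk '
            # While reading the key file: remember every non-empty key.
            FILENAME == ARGV[1] { if (NF) wanted[$1]; next }
            # While reading the database: print the header and matching rows.
            FNR == 1 || ($1 in wanted)
        ' "$keys" "$db" > "${keys%.txt}.matched.txt"
    done
}
```

A call such as `filter_all database1.txt inputfile1.txt inputfile2.txt` would then write `inputfile1.matched.txt` and `inputfile2.matched.txt`.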