通過首先對sort
進行排序,你可以申請uniq
。
這似乎文件就好了排序:
$ cat test.csv
[email protected],2009-11-27 00:58:29.793000000,xx3.net,255.255.255.0
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 00:58:29.646465785,2x3.net,256.255.255.0
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
$ sort test.csv
[email protected],2009-11-27 00:58:29.646465785,2x3.net,256.255.255.0
[email protected],2009-11-27 00:58:29.793000000,xx3.net,255.255.255.0
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
$ sort test.csv | uniq
[email protected],2009-11-27 00:58:29.646465785,2x3.net,256.255.255.0
[email protected],2009-11-27 00:58:29.793000000,xx3.net,255.255.255.0
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
你也可以做一些AWK魔術:
$ awk -F, '{ lines[$1] = $0 } END { for (l in lines) print lines[l] }' test.csv
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
[email protected],2009-11-27 00:58:29.646465785,2x3.net,256.255.255.0
如果該列中包含逗號本身(帶引號) – user775187 2011-06-17 10:18:56
這是不行的唯一的事情是排序不會給你一個計數...我認爲.. – Rodo 2014-01-14 11:00:51
爲什麼你需要,1在-k1,1?爲什麼不只是-k1? – 2014-11-24 20:10:28