2017-10-14 161 views
0

I am given a database in this format and need to sort it in the shell:

10027|Chen|Ning|female|1982-12-08|2010-02-22T17:59:59.221+0000|1.2.9.86|Firefox 
10995116908|Chen|Wei|female|1985-08-02|2010-05-24T20:52:26.582+0000|27.98.244.108|Firefox 

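(Note the T in the 6th column.) That is the structure of the database.

My task is to output the rows of the database whose date falls between a given dateA and another given dateB.

So far I have tried to sort my file by the 6th column with -M, specifically sort -k 6M -t "|" "file.dat" and sort -k6 -M -t "|", among other attempts.

But nothing worked.

I need the file sorted so that I can then process it with awk.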

EDIT: An example for the specified start and end dates. From this input:

933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
4194|Do|Hα» ChΓ­|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 

the required sorted output must be this:

8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
4194|Do|Hα» ChΓ­|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 

Please add sample input and your desired output for that input to your question. – Cyrus

Show how you specify *a given dateA to another given dateB* – RomanPerekhrest

@RomanPerekhrest Since they have not been specified to us so far, date A can be 2010-02-15T09:33:33.400+0000 and date B can be 2010-03-16T20:20:20.300+0000

Answers

1

In the end, I don't see anything special about this task; a simple sort is enough:

sort -k6,6 -t "|" file.dat 

Output:

8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
4194|Do|Hα» ChΓ­|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 
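
As a minimal sketch of why the plain sort above is enough: the timestamps in field 6 are fixed-width, ordered from year down to milliseconds, and all carry the same +0000 offset in the samples, so comparing them as text gives chronological order. The helper name sort_by_timestamp and the use of LC_ALL=C (to make the comparison byte-wise regardless of locale) are assumptions added for illustration, not part of the answer.

```shell
#!/bin/bash
# Sort pipe-delimited records by field 6 only, comparing the field as plain text.
# LC_ALL=C (an assumption here) forces byte-wise comparison independent of locale.
sort_by_timestamp() {
    LC_ALL=C sort -t '|' -k6,6 "$@"
}

# Three records taken from the question's sample data, in unsorted order.
printf '%s\n' \
    '933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox' \
    '8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox' \
    '8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer' |
    sort_by_timestamp
```

If the file mixed different UTC offsets, text order and chronological order could disagree, and the timestamps would need normalizing first.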

Thanks, although I thought the -M option was needed.

'-M' sorts by month, not by the whole datetime – RomanPerekhrest

0

I added a couple of extra data lines so the search examples are a bit easier to follow:

933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
4194|Do|H? Ch?|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
4567|Kim|Lisa|female|1982-05-29|2009-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
1234|Axe|John|male|1982-05-29|2012-02-21T08:44:41.479+0000|14.103.81.196|Firefox 

I'll define a bash script [search.sh] with the following input parameters:

search.sh [--born-after <dateA>] [--born-before <dateB>] -f <dbfile> 

`--born-after <dateA>` : [optional] search for data records with field6 >= this search parameter; [format=YYYY-MM-DDTHH:MM:SS.sss+HHMM] [default=0000-00-00T00:00:00.000+0000] 
`--born-before <dateB>` : [optional] search for data records with field6 <= this search parameter; [format=YYYY-MM-DDTHH:MM:SS.sss+HHMM] [default=9999-99-99T99:99:99.999+9999] 
`-f <dbfile>`   : [required] data file to search 

The actual script:

$ cat search.sh 
#!/bin/bash 

# set default search dates, clear the dbfile variable: 

dateA="0000-00-00T00:00:00.000+0000" 
dateB="9999-99-99T99:99:99.999+9999" 

unset dbfile 

# simulate getopts so we can parse for long and short option names 

while [ $# -gt 0 ] 
do 
     case $1 in 
       --born-after) dateA=$2         ; shift ;; 
       --born-before) dateB=$2         ; shift ;; 
       -f)    dbfile=$2         ; shift ;; 
       *)    echo "Unexpected argument '$1'. Aborting." ; exit 1 ;; 
     esac 

     shift 
done 

# if we didn't receive/parse a `-f <dbfile>` option then abort: 

[[ "${dbfile}" = '' ]] && echo "Missing a dbfile. Aborting." && exit 1 

# start by sorting dbfile using RomanPerekhrest's solution; then pipe results to 
# an awk script to handle the 'search' 

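sort -k6,6 -t "|" "${dbfile}" | awk -F"|" -v dateA="${dateA}" -v dateB="${dateB}" '$6>=dateA && $6<=dateB' 

  • -v date[AB]="${date[AB]}": passes our bash variables into the awk script; for simplicity we keep the same names
  • -F"|": defines the input field separator for awk
  • $6>=dateA && $6<=dateB: only prints lines where field 6 falls between our search dates (inclusive)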
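
To make the comparison step concrete, here is a small sketch of the awk filter on its own, using the two dates the asker proposed in the comments as bounds. The helper name filter_range is hypothetical and not part of search.sh. Because neither the field nor the bounds look like numbers, awk compares them as strings.

```shell
#!/bin/bash
# Print records whose field 6 lies between two bounds (inclusive), compared as text.
# filter_range is a hypothetical helper name used only for this illustration.
filter_range() {
    local from=$1 to=$2
    shift 2
    # Any remaining arguments are input files; with none, awk reads stdin.
    awk -F'|' -v from="$from" -v to="$to" '$6 >= from && $6 <= to' "$@"
}

# Four records from the question, filtered with the bounds given in the comments.
printf '%s\n' \
    '933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox' \
    '1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer' \
    '8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer' \
    '8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox' |
    filter_range '2010-02-15T09:33:33.400+0000' '2010-03-16T20:20:20.300+0000'
```

With these bounds only records 1129, 8333 and 8698 are printed, in input order; record 933 (2010-03-17) lies past the upper bound.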

Some sample runs of the script:

# no search dates provided (ie, use defaults; display entire file (sorted)) 
$ search.sh -f file.dat 
4567|Kim|Lisa|female|1982-05-29|2009-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
4194|Do|H? Ch?|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 
1234|Axe|John|male|1982-05-29|2012-02-21T08:44:41.479+0000|14.103.81.196|Firefox 

# only print records (sorted) with field6 >= 2009-10-01 
$ search.sh --born-after '2009-10-01T00:00:00.000+0000' -f file.dat 
8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
4194|Do|H? Ch?|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 
1234|Axe|John|male|1982-05-29|2012-02-21T08:44:41.479+0000|14.103.81.196|Firefox 

# only print records (sorted) with field6 between 2009-10-01 and 2011-05-05 
$ search.sh --born-after '2009-10-01T00:00:00.000+0000' --born-before '2011-05-05T23:59:59.999+9999' -f file.dat 
8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox 
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer 
8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer 
933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox 
4194|Do|H? Ch?|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer 

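You're a lifesaver. Although it works for small databases, when I process large files it doesn't work the way it should. Should I check something more involved, such as the compiler or the shell instance/version?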

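You'll have to expand on the comment "it doesn't work the way it should"; also, if you typically search for a (relatively) small number of rows, it should be somewhat more efficient to run the 'awk' step first and then pipe its output to 'sort' (the net result being that you spend fewer resources sorting a smaller data set) – markp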
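
The reordering suggested in this comment could look roughly like the sketch below: filter first, then sort only the surviving rows. The function name filter_then_sort and the LC_ALL=C setting are assumptions for illustration; the final output should match the sort-then-filter version.

```shell
#!/bin/bash
# Filter on field 6 first, then sort only the matching records by field 6.
# filter_then_sort is a hypothetical name; LC_ALL=C is assumed for byte-wise order.
filter_then_sort() {
    local from=$1 to=$2
    shift 2
    awk -F'|' -v from="$from" -v to="$to" '$6 >= from && $6 <= to' "$@" |
        LC_ALL=C sort -t '|' -k6,6
}

# Same sample records and bounds as in the question and its comments.
printf '%s\n' \
    '933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox' \
    '1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer' \
    '8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer' \
    '8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox' |
    filter_then_sort '2010-02-15T09:33:33.400+0000' '2010-03-16T20:20:20.300+0000'
```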

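Something else to consider ... if you always sort this file by field 6 ... consider sorting it once and saving the result to a new file, then run your searches against the new/sorted file (i.e., eliminate the overhead of repeatedly sorting the original file) – markp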
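
A rough sketch of that sort-once idea, under the assumption that the sorted copy is rebuilt whenever the original file changes. The names build_sorted_copy, search_sorted and file.sorted are hypothetical. Since the copy is already ordered by field 6, the search can also stop reading at the first record past the upper bound.

```shell
#!/bin/bash
# One-time step: write a copy of the data file sorted by field 6 (byte-wise order).
build_sorted_copy() {
    LC_ALL=C sort -t '|' -k6,6 "$1" > "$2"
}

# Search the pre-sorted copy. Because records are in ascending field-6 order,
# awk can exit as soon as it sees a record beyond the upper bound.
search_sorted() {
    LC_ALL=C awk -F'|' -v from="$1" -v to="$2" '
        $6 > to    { exit }   # everything after this line is out of range too
        $6 >= from            # in range: print the record
    ' "$3"
}

# Demo on a temporary directory with four records from the question.
workdir=$(mktemp -d)
printf '%s\n' \
    '933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox' \
    '1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer' \
    '8333|Wang|Chen|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer' \
    '8698|Liu|Chen|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox' \
    > "$workdir/file.dat"
build_sorted_copy "$workdir/file.dat" "$workdir/file.sorted"
search_sorted '2010-02-15T09:33:33.400+0000' '2010-03-16T20:20:20.300+0000' "$workdir/file.sorted"
rm -r "$workdir"
```

The early exit is only valid when the file really is sorted on field 6 with the same comparison rules that awk uses, which is why both commands run under LC_ALL=C in this sketch.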