2017-02-24 215 views

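How to query Logstash via curl and return only specific fields

I am currently using a "match_all" query to retrieve the data that Logstash is processing. The output contains every field that is part of the event, as it should. This is my query: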

{
    "query": {
        "match_all": { }
    },
    "size": 1,
    "sort": [
        {
            "@timestamp": {
                "order": "desc"
            }
        }
    ]
}

As you can see, I also sort the results so that I always get the most recent event.

Here is an example of my output:

{ 
    "took" : 1, 
    "timed_out" : false, 
    "_shards" : { 
    "total" : 5, 
    "successful" : 5, 
    "failed" : 0 
    }, 
    "hits" : { 
    "total" : 15768, 
    "max_score" : null, 
    "hits" : [ 
     { 
     "_index" : "filebeat-2017.02.24", 
     "_type" : "bro", 
     "_id" : "AVpx-pFtiEtl3Zqhg8tF", 
     "_score" : null, 
     "_source" : { 
      "resp_pkts" : 0, 
      "source" : "/usr/local/bro/logs/current/conn.log", 
      "type" : "bro", 
      "id_orig_p" : 56058, 
      "duration" : 848.388112, 
      "local_resp" : true, 
      "uid" : "CPndOf4NNf9CzTILFi", 
      "id_orig_h" : "192.168.137.130", 
      "conn_state" : "OTH", 
      "@version" : "1", 
      "beat" : { 
      "hostname" : "localhost.localdomain", 
      "name" : "localhost.localdomain", 
      "version" : "5.2.0" 
      }, 
      "host" : "localhost.localdomain", 
      "id_resp_h" : "192.168.137.141", 
      "id_resp_p" : 22, 
      "resp_ip_bytes" : 0, 
      "offset" : 115612, 
      "orig_bytes" : 32052, 
      "local_orig" : true, 
      "input_type" : "log", 
      "orig_ip_bytes" : 102980, 
      "orig_pkts" : 1364, 
      "missed_bytes" : 0, 
      "history" : "DcA", 
      "tunnel_parents" : [ ], 
      "message" : "{\"ts\":1487969779.653504,\"uid\":\"CPndOf4NNf9CzTILFi\",\"id_orig_h\":\"192.168.137.130\",\"id_orig_p\":56058,\"id_resp_h\":\"192.168.137.141\",\"id_resp_p\":22,\"proto\":\"tcp\",\"duration\":848.388112,\"orig_bytes\":32052,\"resp_bytes\":0,\"conn_state\":\"OTH\",\"local_orig\":true,\"local_resp\":true,\"missed_bytes\":0,\"history\":\"DcA\",\"orig_pkts\":1364,\"orig_ip_bytes\":102980,\"resp_pkts\":0,\"resp_ip_bytes\":0,\"tunnel_parents\":[]}", 
      "tags" : [ 
      "beats_input_codec_plain_applied" 
      ], 
      "@timestamp" : "2017-02-24T21:15:29.414Z", 
      "resp_bytes" : 0, 
      "proto" : "tcp", 
      "fields" : { 
      "sensorType" : "networksensor" 
      }, 
      "ts" : 1.487969779653504E9 
     }, 
     "sort" : [ 
      1487970929414 
     ] 
     } 
    ] 
    } 
} 

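As you can see, this is a lot of output for the external application to process (it is written in C#, so all of these strings cause a lot of garbage collection), and I simply don't need most of it.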

My question is: how can I set up my query so that I only retrieve the fields I need?

Answer


For 5.x there is a change that allows you to do _source filtering. The documentation is here, and the query should look something like this:

{
    "query": {
        "match_all": { }
    },
    "size": 1,
    "_source": ["a", "b"],
    ...

And the result looks like this:

{ 
    "took" : 2, 
    "timed_out" : false, 
    "_shards" : { 
    "total" : 5, 
    "successful" : 5, 
    "failed" : 0 
    }, 
    "hits" : { 
    "total" : 1, 
    "max_score" : 1.0, 
    "hits" : [ 
     { 
     "_index" : "xxx", 
     "_type" : "xxx", 
     "_id" : "xxx", 
     "_score" : 1.0, 
     "_source" : { 
      "a" : 1, 
      "b" : "2" 
     } 
     } 
    ] 
    } 
} 
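
The _source filtering approach can be sketched in Python as follows. This is only a minimal illustration, not the asker's actual client: the helper names `build_latest_event_query` and `extract_sources` are hypothetical, and the field names passed in are simply picked from the sample event in the question. It builds the request body (keeping the original sort on @timestamp) and pulls the filtered documents out of a response shaped like the one above.

```python
import json


def build_latest_event_query(fields, size=1):
    """Build a search body that returns only `fields` from the newest event(s)."""
    return {
        "query": {"match_all": {}},
        "size": size,
        # Same ordering as in the question: newest @timestamp first.
        "sort": [{"@timestamp": {"order": "desc"}}],
        # _source filtering: only these keys come back under "_source".
        "_source": list(fields),
    }


def extract_sources(response):
    """Return the (already filtered) _source dict of every hit."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Hypothetical field choice, taken from the sample event in the question.
body = build_latest_event_query(["id_orig_h", "id_resp_h", "proto"])
print(json.dumps(body, indent=2))

# A response trimmed to the shape shown in the answer.
sample_response = {
    "hits": {
        "total": 1,
        "max_score": 1.0,
        "hits": [
            {
                "_index": "xxx",
                "_type": "xxx",
                "_id": "xxx",
                "_score": 1.0,
                "_source": {"a": 1, "b": "2"},
            }
        ],
    }
}
print(extract_sources(sample_response))
```

The JSON printed by the first call is what would be sent as the body of the search request, for example with curl's -d option.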

In versions before 5, you can do this with a fields parameter:

Your query can pass "fields": ["field1","field2"...] at the root level of the query. The format it returns will be different, but it will work.

{
    "query": {
        "match_all": { }
    },
    "size": 1,
    "fields": ["a", "b"],
    ...

This will produce output like the following:

{ 
    "took": 9, 
    "timed_out": false, 
    "_shards": { 
    "total": 1, 
    "successful": 1, 
    "failed": 0 
    }, 
    "hits": { 
    "total": 2077, 
    "max_score": 1, 
    "hits": [ 
     { 
     "_index": "xxx", 
     "_type": "xxx", 
     "_id": "xxxx", 
     "_score": 1, 
     "fields": { 
      "a": [ 
      0 
      ], 
      "b": [ 
      "xyz" 
      ] 
     } 
     } 
    ] 
    } 
} 

The fields will always be arrays (since the 1.0 API), and there is no way to change that, because Elasticsearch is inherently multi-value aware.
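
Because every value under "fields" arrives wrapped in an array, a client usually has to unwrap it. The snippet below is one possible way to do that on the client side, assuming a hit shaped like the pre-5.x output above; `flatten_fields` is a hypothetical helper, not part of any Elasticsearch client library.

```python
def flatten_fields(hit):
    """Unwrap single-element arrays in a hit's "fields" object.

    Multi-valued fields are left as lists so that no data is dropped.
    """
    flat = {}
    for name, values in hit.get("fields", {}).items():
        if isinstance(values, list) and len(values) == 1:
            flat[name] = values[0]
        else:
            flat[name] = values
    return flat


# One hit copied from the sample output of the "fields" query.
sample_hit = {
    "_index": "xxx",
    "_type": "xxx",
    "_id": "xxxx",
    "_score": 1,
    "fields": {"a": [0], "b": ["xyz"]},
}
print(flatten_fields(sample_hit))
```

Leaving multi-valued fields as lists is a deliberate choice here: silently taking only the first element would hide data for fields that really do hold several values.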


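Running 5.2, I actually get an error from that: { "error": { "root_cause": [ { "type": "parsing_exception", "reason": "The field [fields] is no longer supported, please use [stored_fields] to retrieve stored fields or _source filtering if the field is not stored", "line": 6, "col": 13 } "status": 400 } – BenjaFriend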


Did you try using 'stored_fields' instead of 'fields'? (I wasn't aware of the 5.x API change.) – Alcanzar


I did, and I just get output with no fields. – BenjaFriend