
Logstash CSV configuration to access nested JSON fields

Below is my Logstash configuration for loading data from Elasticsearch and converting it to CSV format:

input {
    elasticsearch {
        hosts => "localhost:9200"
        index => "chats"
        query => '{ "query": { "range" : {
            "timestamp" : {
                "gte" : "1492080665000",
                "lt" : "1492088665000"
            }
        } }, "_source": [ "timestamp","content.text"] }'
    }
}

filter {
    date {
        match => [ "timestamp","UNIX_MS" ]
        target => "timestamp_new"
        remove_field => [ "timestamp" ]
    }
    csv {
        columns => ["timestamp", "content.text"]
        separator => ","
    }
}

output {
    csv {
        fields => ["timestamp_new","content.text"]
        path => "/home/ubuntu/chats-content-date-range-v3.csv"
    }
    stdout { codec => rubydebug }
}

Sample input data:

"source":{"userName": "xxx", "senderType": 3, "spam": 0, "senderId": "1000", "threadId": 101, "userId": "xxx", "sessionId": 115, "content": {"text": "Yes okay", "image": null, "location": null, "card": null}, "receiverId": "xxx", "timestamp": 1453353242657, "type": 0, "id": "0dce30dd-781e-4a42-b230-a988b68fd9ed1000_1453353242657"} 

Below is my sample output data:

2017-04-13T12:41:34.423Z,"{""text"":""Yes okay""}" 

Instead, I want the following output:

2017-04-13T12:41:34.423Z,"Yes okay" 

Answer

input { 
    elasticsearch { 
     hosts => "localhost:9200" 
     index => "chats" 
     query => '{ 
      "query": { 
       "range" : { 
        "timestamp" : { 
         "gte" : "1492080665000", 
         "lt" : "1492088665000" 
        } 
       } 
      }, 
      "_source": [ "timestamp","content.text"] 
     }' 
    } 
} 

filter { 
    date { 
     match => [ "timestamp","UNIX_MS" ] 
     target => "timestamp_new" 
     remove_field => [ "timestamp" ] 
    } 
    csv { 
     columns => ["timestamp", "content.text"] 
     separator => "," 
    } 
    json { 
     # parse the JSON string held in content.text into a structured field
     source => "content.text" 
     target => "content.text" 
    } 
} 
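
The answer shows only the input and filter sections; a minimal sketch, assuming the csv output block from the question is reused unchanged:

output {
    csv {
        fields => ["timestamp_new","content.text"]
        path => "/home/ubuntu/chats-content-date-range-v3.csv"
    }
    stdout { codec => rubydebug }
}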

FYI, if you have more than one JSON key, you should reconsider your CSV separator, since the keys will also be comma-separated and you may get a CSV parse failure. In that case, a pipe or # separator would be a good choice.
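
For illustration, a sketch of the csv filter from the answer with a pipe separator, as the comment suggests (column names are taken from the question; the separator choice is an assumption):

filter {
    csv {
        columns => ["timestamp", "content.text"]
        # "|" does not collide with the commas that separate JSON keys
        separator => "|"
    }
}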