2017-04-25

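Logstash one-to-many import from MySQL

I am trying to import job postings from two MySQL tables (job data and locations), but I run into a problem when a job posting has more than one location. I use this MySQL query: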

SELECT sa_data.id, company, jobtitle, description, priority, DATE_FORMAT(date, '%Y-%m-%d %T') AS date, sa_locations.location AS location_name, sa_locations.lat AS location_lat, sa_locations.lon AS location_lon 
FROM sa_data JOIN sa_locations 
ON sa_data.id = sa_locations.id 
ORDER BY sa_data.id 

Apart from the location problem, everything works fine. The results I get look like this:

{ 
    "_index" : "jk", 
    "_type" : "jobposting", 
    "_id" : "26362", 
    "_score" : 1.0, 
    "_source" : { 
     "date" : "2017-04-22 00:00:00", 
     "location_name" : "Berlin", 
     "location_lat" : "52.520007", 
     "location_lon" : "13.404954", 
     "@timestamp" : "2017-04-24T07:50:31.660Z", 
     "@version" : "1", 
     "description" : "Some text here", 
     "company" : "Test Company", 
     "id" : 26362, 
     "jobtitle" : "Architect Data Center Network & Security", 
     "priority" : 10 
    } 
}, { 
    "_index" : "jk", 
    "_type" : "jobposting", 
    "_id" : "26363", 
    "_score" : 1.0, 
    "_source" : { 
     "date" : "2017-04-22 00:00:00", 
     "location_name" : "Hamburg", 
     "location_lat" : "53.551085", 
     "location_lon" : "9.993682", 
     "@timestamp" : "2017-04-24T07:50:31.660Z", 
     "@version" : "1", 
     "description" : "Some text here", 
     "company" : "Test Company", 
     "id" : 26363, 
     "jobtitle" : "Architect Data Center Network & Security", 
     "priority" : 10 
    } 
} 

What I would like to get is this:

{ 
    "_index" : "jk", 
    "_type" : "jobposting", 
    "_id" : "26362", 
    "_score" : 1.0, 
    "_source" : { 
     "date" : "2017-04-22 00:00:00", 
     "locations" : [ { "name": "Berlin", "lat" : "52.520007", "lon" : "13.404954" }, { "name": "Hamburg", "lat" : "53.551085", "lon" : "9.993682" } ], 
     "@timestamp" : "2017-04-24T07:50:31.660Z", 
     "@version" : "1", 
     "description" : "Some text here", 
     "company" : "Test Company", 
     "id" : 26362, 
     "jobtitle" : "Architect Data Center Network & Security", 
     "priority" : 10 
    } 
} 

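So if I search for jobs near Berlin or near Hamburg using a geo_distance filter, this job should show up in both cases. Is there a way to import the data in this shape with Logstash?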
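To make the target shape concrete, here is a minimal Ruby sketch of the reshaping being asked for, independent of Logstash. It assumes that both joined rows share the same job id (26362) and only carries over a couple of the columns; the helper name `merge_locations` and the `LOCATION_KEYS` table are made up for illustration.

```ruby
# Rows as the JOIN returns them: one row per (job, location) pair.
# Assumption: both rows belong to the same job id.
rows = [
  { 'id' => 26362, 'company' => 'Test Company',
    'location_name' => 'Berlin', 'location_lat' => '52.520007', 'location_lon' => '13.404954' },
  { 'id' => 26362, 'company' => 'Test Company',
    'location_name' => 'Hamburg', 'location_lat' => '53.551085', 'location_lon' => '9.993682' }
]

# Maps each flat SQL alias to the key wanted inside a location hash.
LOCATION_KEYS = {
  'location_name' => 'name',
  'location_lat'  => 'lat',
  'location_lon'  => 'lon'
}.freeze

# Hypothetical helper: collapses all rows of one job into a single document
# that carries a "locations" array with one hash per row.
def merge_locations(rows)
  rows.group_by { |row| row['id'] }.map do |_id, group|
    # Job-level fields are identical in every row, so take them from the first one.
    doc = group.first.reject { |key, _value| LOCATION_KEYS.key?(key) }
    doc['locations'] = group.map do |row|
      LOCATION_KEYS.each_with_object({}) { |(column, field), loc| loc[field] = row[column] }
    end
    doc
  end
end

docs = merge_locations(rows)
```

The point of the sketch is only the shape: one document per id, with the per-row columns folded into an array. In Logstash the same grouping has to happen across separate events, which is what the aggregate filter is for.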

My logstash.conf looks like this:

input { 
jdbc { 
jdbc_connection_string => "jdbc:mysql://localhost:3306/jk" 
jdbc_user => "..." 
jdbc_password => "..." 
jdbc_driver_library => "/etc/logstash/mysql-connector-java-5.1.41/mysql-connector-java-5.1.41-bin.jar" 
jdbc_driver_class => "com.mysql.jdbc.Driver" 
statement => "SELECT sa_data.id, company, jobtitle, description, priority, DATE_FORMAT(date, '%Y-%m-%d %T') AS date, sa_locations.location AS location_name, sa_locations.lat AS location_lat, sa_locations.lon AS location_lon 
FROM sa_data JOIN sa_locations 
ON sa_data.id = sa_locations.id 
ORDER BY sa_data.id" 
} 
} 

#filter { 
# aggregate { 
# task_id => "%{id}" 
# code => " 
# map['location_name'] = event.get('location_name') 
# map['location_lat'] = event.get('location_lat') 
# map['location_lon'] = event.get('location_lon') 
# map['locations'] ||= [] 
# map['locations'] << {'location_name' => event.get('location_name')} 
# map['locations'] << {'location_lat' => event.get('location_lat')} 
# map['locations'] << {'location_lon' => event.get('location_lon')} 
# event.cancel() 
# " 
# push_previous_map_as_event => true 
# timeout => 3 
# } 
#} 

output { 
elasticsearch { 
index => "jk" 
document_type => "jobposting" 
document_id => "%{id}" 
hosts => ["localhost:9200"] 
} 
} 

The filter seems to be the wrong approach.

DOB - did you manage to get this working in the end? I have a similar problem and can't get it to work :( – Birdy

Answer

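If you have multiple locations for one id, you still want to aggregate. Your current setup, however, does not build an array of hashes with one hash per location (that is, one hash for each row in the locations table).

You could do something like this: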

filter { 
    mutate { 
    # Nest the flat columns under one [location] hash. 
    # The SQL alias is location_lon, not location_long. 
    rename => { 
     'location_name' => '[location][name]' 
     'location_lat' => '[location][lat]' 
     'location_lon' => '[location][lon]' 
    } 
    } 

    aggregate { 
    task_id => '%{id}' 
    code => " 
     map['locations'] ||= [] 
     map['locations'] << event.get('location') 
    " 
    push_previous_map_as_event => true 
    } 
} 
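
What the aggregate block does across events can be sketched in plain Ruby. This is a simulation under stated assumptions, not the plugin's implementation: `aggregate_stream` is a made-up name, each row is assumed to already carry the nested `location` hash produced by the mutate step, and the map also stores the id, as the asker's follow-up comment does. Only the pushed maps are modelled; the original per-row events are left out.

```ruby
# Simulates aggregate + push_previous_map_as_event over rows in arrival order.
def aggregate_stream(rows)
  emitted = []
  current_id = nil
  map = nil

  rows.each do |row|
    if row['id'] != current_id
      emitted << map if map          # task id changed: push the previous map
      current_id = row['id']
      map = { 'id' => row['id'] }    # start a fresh map for the new task id
    end
    map['locations'] ||= []
    map['locations'] << row['location']
  end

  emitted << map if map              # stands in for the flush of the last map
  emitted
end

berlin  = { 'name' => 'Berlin',  'lat' => '52.520007', 'lon' => '13.404954' }
hamburg = { 'name' => 'Hamburg', 'lat' => '53.551085', 'lon' => '9.993682' }
munich  = { 'name' => 'Munich' } # made-up third row for a second job

sorted = aggregate_stream([
  { 'id' => 1, 'location' => berlin },
  { 'id' => 1, 'location' => hamburg },
  { 'id' => 2, 'location' => munich }
])

# Same rows, but not ordered by id: job 1 gets split into two maps.
unsorted = aggregate_stream([
  { 'id' => 1, 'location' => berlin },
  { 'id' => 2, 'location' => munich },
  { 'id' => 1, 'location' => hamburg }
])
```

Two things follow from this model. The rows must arrive grouped by id, which is why the ORDER BY in the SQL statement matters. The last map has no following id to trigger its push, so in Logstash a timeout is needed to flush it. The aggregate filter documentation also asks for a single pipeline worker (the `-w 1` flag), because events for the same id must pass through the filter in order.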

I tried this solution but got this error message: Exception in thread "[main]>worker2" java.lang.ClassCastException: expecting List or Map, found class org.logstash.bivalues.StringBiValue – DOB

What does your config file look like? – cattastrophe

The one above, but with your filter added. I only changed "))" to ")" and added "map['id'] = event.get('id')", but without success. – DOB