2016-06-13 50 views
Elasticsearch: convert and bulk insert lat/lon as geo_point

I have a simple CSV file with 4 fields, serial_num, post_code, lat and lon, such as:

serial_num,post_code,LAT,LON 
06AA209365,PE10 2AZ,532342,168459 
98A819621,PE10 1AA,532342,168459 
07FD490906,PE12 1VV,497882,157983 

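I need to bulk insert this into Elasticsearch. The lat and lon fields need to be defined as a single geo_point field, so I have created the mapping as follows:

  • index is serial_data
  • type is widget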

    PUT /serial_data
    {
      "mappings": {
        "widget": {
          "properties": {
            "serial_number": {
              "type": "string"
            },
            "post_code": {
              "type": "string"
            },
            "location": {
              "type": "geo_point"
            }
          }
        }
      }
    }
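
For reference, a geo_point field accepts a location in more than one shape. The sketch below builds three of them for a single document. The helper name `geo_point_forms` and the coordinate values are made up for illustration, and the values are assumed to be decimal degrees.

```ruby
require "json"

# Hypothetical helper: returns the same point in three shapes that a
# geo_point field accepts. lat/lon are assumed to be decimal degrees.
def geo_point_forms(lat, lon)
  {
    object: { "lat" => lat, "lon" => lon }, # {"lat": ..., "lon": ...}
    string: "#{lat},#{lon}",                # "lat,lon"
    array:  [lon, lat]                      # [lon, lat], note the reversed order
  }
end

# Illustrative coordinates only, not derived from the sample rows.
geo_point_forms(52.77, -0.38).each_value do |location|
  doc = { "serial_number" => "06AA209365", "post_code" => "PE10 2AZ", "location" => location }
  puts JSON.generate(doc)
end
```

Whichever shape is used, the two CSV columns have to be merged into one value before the document reaches Elasticsearch.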

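I am trying to use embulk to insert the data. Since I have a mapping defined, I assumed that if I declared lat and lon as double or long, embulk would work out that they belong in the single location field. It does not.

I also thought embulk had a bulk-input JSON plugin, but I can't find it.

Question

Any ideas on how to bulk load this data would be greatly appreciated.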

Answer

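I use three filter plugins.

  • embulk-filter-insert: inserts the location column
  • embulk-filter-ruby_proc: combines the lat and lon columns
  • embulk-filter-column: removes the lat and lon columns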

data.csv

serial_num,post_code,LAT,LON 
06AA209365,PE10 2AZ,532342,168459 
98A819621,PE10 1AA,532342,168459 
07FD490906,PE12 1VV,497882,157983 

conf.yml

in:
  type: file
  path_prefix: data.csv
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    escape: '"'
    trim_if_not_quoted: false
    skip_header_lines: 1
    allow_extra_columns: false
    allow_optional_columns: false
    columns:
    - {name: serial_num, type: string}
    - {name: post_code, type: string}
    - {name: lat, type: long}
    - {name: lon, type: long}
filters:
  - type: insert
    column:
      location:
  - type: ruby_proc
    requires:
      - json
    columns:
      - name: location
        proc: |
          ->(_,record) do
            return { lat: record["lat"], lon: record["lon"] }.to_json.to_s
          end
        skip_nil: false
  - type: column
    columns:
      - {name: serial_num}
      - {name: post_code}
      - {name: location}
out: {type: stdout}

Output

+-------------------+------------------+-----------------------------+
| serial_num:string | post_code:string |             location:string |
+-------------------+------------------+-----------------------------+
|        06AA209365 |         PE10 2AZ | {"lat":532342,"lon":168459} |
|         98A819621 |         PE10 1AA | {"lat":532342,"lon":168459} |
|        07FD490906 |         PE12 1VV | {"lat":497882,"lon":157983} |
+-------------------+------------------+-----------------------------+
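
If the records are to be sent to Elasticsearch without an embulk output plugin, one possible route is to build a bulk request body by hand. The sketch below is a minimal illustration, with several assumptions: the index and type names come from the question, the method name `bulk_body` is made up, the field names follow the CSV columns (the mapping in the question uses serial_number instead), and lat/lon are passed through unchanged on the assumption that they are already in degrees. The sample values in data.csv fall outside the valid latitude range, so they would presumably need converting first, which is not shown here.

```ruby
require "json"

# Sketch: turn parsed rows into a bulk request body (newline-delimited JSON).
# Each record becomes two lines: an action line and the document itself.
def bulk_body(rows, index: "serial_data", type: "widget")
  lines = rows.flat_map do |row|
    action = { "index" => { "_index" => index, "_type" => type } }
    source = {
      "serial_num" => row["serial_num"],
      "post_code"  => row["post_code"],
      # Assumed to be decimal degrees already.
      "location"   => { "lat" => Float(row["lat"]), "lon" => Float(row["lon"]) }
    }
    [JSON.generate(action), JSON.generate(source)]
  end
  lines.join("\n") + "\n" # the bulk API expects a trailing newline
end

# Hypothetical row with made-up coordinates in degrees.
rows = [{ "serial_num" => "06AA209365", "post_code" => "PE10 2AZ",
          "lat" => "52.77", "lon" => "-0.38" }]
puts bulk_body(rows)
```

The resulting text could then be sent as the body of a POST to the /_bulk endpoint.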