2016-11-30 90 views

Match the closest ancestor with the Path Hierarchy Tokenizer

I have set up an Elasticsearch v5 index for mapping config hashes to URLs.

{
  "settings": {
    "analysis": {
      "analyzer": {
        "url-analyzer": {
          "type": "custom",
          "tokenizer": "url-tokenizer"
        }
      },
      "tokenizer": {
        "url-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "/"
        }
      }
    }
  },
  "mappings": {
    "route": {
      "properties": {
        "uri": {
          "type": "string",
          "index": "analyzed",
          "analyzer": "url-analyzer"
        },
        "config": {
          "type": "object"
        }
      }
    }
  }
}

I want the longest matching path prefix to get the highest score, so that given the documents

{ "uri": "/trousers/", "config": { "foo": 1 }} 
{ "uri": "/trousers/grey", "config": { "foo": 2 }} 
{ "uri": "/trousers/grey/lengthy", "config": { "foo": 3 }} 

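when I search for /trousers, the top result should be trousers, and when I search for /trousers/grey/short, the top result should be /trousers/grey.

Instead, I find that the top result for /trousers is /trousers/grey/lengthy.

How can I index and query my documents to achieve this?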

Answer


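I have one solution, after sleeping on it over a drink: what if we treat the URI in the index as a keyword, but still use the PathHierarchyTokenizer on the search input?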

Now we store the following documents:

/trousers
/trousers/grey
/trousers/grey/lengthy

When we submit a query for /trousers/grey/short, the search_analyzer can build the input [/trousers, /trousers/grey, /trousers/grey/short].

Our first two documents will match, and we can use a custom sort to pick the longest match.
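To make the idea concrete, here is a minimal Python sketch of the matching logic, run entirely outside Elasticsearch. The helper names `path_hierarchy_tokens` and `closest_ancestor` are hypothetical, and the tokenizer emulation is only an approximation of what path_hierarchy emits for simple paths like the ones above.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Approximate the path_hierarchy tokenizer: emit every ancestor prefix of path."""
    tokens = []
    # Start at index 1 so a leading delimiter does not produce an empty token.
    pos = path.find(delimiter, 1)
    while pos != -1:
        tokens.append(path[:pos])
        pos = path.find(delimiter, pos + 1)
    tokens.append(path)
    return tokens


def closest_ancestor(query_path, indexed_uris):
    """Return the longest indexed URI that equals one of the query's prefixes."""
    tokens = set(path_hierarchy_tokens(query_path))
    # Indexed URIs are kept whole (keyword style), so a match is exact equality.
    matches = [uri for uri in indexed_uris if uri in tokens]
    return max(matches, key=len, default=None)


indexed = ["/trousers", "/trousers/grey", "/trousers/grey/lengthy"]
print(path_hierarchy_tokens("/trousers/grey/short"))
print(closest_ancestor("/trousers/grey/short", indexed))
```

Only the search input is split into prefixes here. Because each stored URI stays a single term, a deeper document such as /trousers/grey/lengthy cannot match a shorter query.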

Now our mapping document looks like this:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "uri-analyzer": {
          "type": "custom",
          "tokenizer": "keyword"
        },
        "uri-query": {
          "type": "custom",
          "tokenizer": "uri-tokenizer"
        }
      },
      "tokenizer": {
        "uri-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "/"
        }
      }
    }
  },
  "mappings": {
    "route": {
      "properties": {
        "uri": {
          "type": "text",
          "fielddata": true,
          "analyzer": "uri-analyzer",
          "search_analyzer": "uri-query"
        },
        "config": {
          "type": "object"
        }
      }
    }
  }
}


And our query looks like this:

{
  "sort": {
    "_script": {
      "script": "doc['uri'].value.length()",
      "order": "desc",
      "type": "number"
    }
  },
  "query": {
    "match": {
      "uri": {
        "query": "/trousers/grey/lengthy",
        "type": "boolean"
      }
    }
  }
}