2017-07-17 123 views
Elasticsearch multi_match query with type cross_fields and synonyms does not work as expected

I have the following configuration:

{ 
    "my_index": { 
     "mappings": { 
      "my_mapping": { 
       "properties": { 
        "@timestamp": { 
         "type": "date" 
        }, 
        "@version": { 
         "type": "text", 
         "fields": { 
          "keyword": { 
           "type": "keyword", 
           "ignore_above": 256 
          } 
         } 
        }, 
        "field1": { 
         "type": "text", 
         "fields": { 
          "keyword": { 
           "type": "keyword", 
           "ignore_above": 256 
          } 
         } 
        }, 
        "field2": { 
         "type": "text", 
         "fields": { 
          "keyword": { 
           "type": "keyword", 
           "ignore_above": 256 
          } 
         } 
        } 
       }
      }
     },
     "settings": { 
      "index": { 
       "analysis": { 
        "filter": { 
         "my_synonym_filter": { 
          "type": "synonym", 
          "synonyms": [ 
           "matthew,matt,matty", 
           "thomas,tom,thom,tommy" 
          ] 
         } 
        }, 
        "analyzer": { 
         "my_synonyms": { 
          "filter": [ 
           "lowercase", 
           "my_synonym_filter" 
          ], 
          "tokenizer": "standard" 
         } 
        } 
       } 
      } 
     } 
    } 
} 

And the following query:

{ 
    "query":{ 
     "bool":{ 
      "should":[ 
       { 
        "multi_match":{ 
        "fields":[ 
         "field1^8", 
         "field2^2" 
        ], 
        "query":"Matt And Tom Oldfield", 
        "type":"cross_fields", 
        "analyzer": "my_synonyms" 
        } 
       } 
      ] 
     } 
    } 
} 

However, when I run it, the synonyms are not expanded for every field in the query. If I look at the query explanation, I get the following:

(Synonym(field1:matt field1:matthew field1:matty) blended(terms:[field1:and^8.0, field2:and^2.0]) Synonym(field1:thom field1:thomas field1:tom field1:tommy) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))
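The thread does not say how this explanation was produced. One way to get output of this kind is the validate API with `explain=true`. The sketch below only builds the HTTP request; the host `http://localhost:9200` and the helper names are assumptions for illustration, not part of the original post.

```python
import json
import urllib.request

# Assumption: a local node. The thread does not give the real host.
ES_URL = "http://localhost:9200"

# The query from the question, copied unchanged.
QUERY = {
    "query": {
        "bool": {
            "should": [
                {
                    "multi_match": {
                        "fields": ["field1^8", "field2^2"],
                        "query": "Matt And Tom Oldfield",
                        "type": "cross_fields",
                        "analyzer": "my_synonyms",
                    }
                }
            ]
        }
    }
}


def build_validate_request(index, body, base_url=ES_URL):
    """Build (but do not send) a POST to <index>/_validate/query?explain=true."""
    url = f"{base_url}/{index}/_validate/query?explain=true"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def explain(index, body):
    """Send the request and return the decoded JSON response.

    Needs a reachable cluster, so it is not called here.
    """
    with urllib.request.urlopen(build_validate_request(index, body)) as resp:
        return json.load(resp)

# Usage against a running cluster (hypothetical):
#   print(json.dumps(explain("my_index", QUERY), indent=2))
```

Running the same request against two clusters makes it easy to compare the rewritten query side by side.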

So if I have "Tom Oldfield" in field1 and "Matt Oldfield" in field2, the query does not match that document. As you can see, the synonyms are expanded only for the first field (field1) and not for the other field.

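If I remove the analyzer from the query, it does match the document with "Tom Oldfield" in field1 and "Matt Oldfield" in field2, and the query explanation is as follows: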

(blended(terms:[field1:matt^8.0, field2:matt^2.0]) blended(terms:[field1:and^8.0, field2:and^2.0]) blended(terms:[field1:tom^8.0, field2:tom^2.0]) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

Is there a way to have the synonyms expanded for every field?

There is a problem in your configuration example: field1 is duplicated. – Ivan

Sorry, I just fixed it. –

Answers

I could not reproduce your problem in my environment with Elasticsearch 5.5.0. Here is my MCVE setup:

{ 
    "settings": { 
    "index": { 
     "analysis": { 
     "filter": { 
      "my_synonym_filter": { 
      "type": "synonym", 
      "synonyms": [ 
       "matthew,matt,matty", 
       "thomas,tom,thom,tommy" 
      ] 
      } 
     }, 
     "analyzer": { 
      "my_synonyms": { 
      "filter": [ 
       "lowercase", 
       "my_synonym_filter" 
      ], 
      "tokenizer": "standard" 
      } 
     } 
     } 
    } 
    }, 
    "mappings": { 
    "my_mapping": { 
     "properties": { 
     "field1": { 
      "type": "text", 
      "fields": { 
      "keyword": { 
       "type": "keyword", 
       "ignore_above": 256 
      } 
      } 
     }, 
     "field2": { 
      "type": "text", 
      "fields": { 
      "keyword": { 
       "type": "keyword", 
       "ignore_above": 256 
      } 
      } 
     } 
     } 
    } 
    } 
} 

The following document was indexed:

{ "field1": "Tom Oldfield", "field2": "Matt Oldfield"} 

For the query you provided, ES creates the following Lucene query:

((field1:matt)^8.0 | (field1:matthew)^8.0 | (field1:matty)^8.0 | (field2:matt)^2.0 | (field2:matthew)^2.0 | (field2:matty)^2.0) 
((field1:and)^8.0 | (field2:and)^2.0) 
((field1:tom)^8.0 | (field1:thomas)^8.0 | (field1:thom)^8.0 | (field1:tommy)^8.0 | (field2:tom)^2.0 | (field2:thomas)^2.0 | (field2:thom)^2.0 | (field2:tommy)^2.0) 
((field1:oldfield)^8.0 | (field2:oldfield)^2.0)) 

Here the synonyms are expanded for every field.
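To make the shape of that expansion concrete, here is a small Python model that rebuilds the four clauses from the synonym groups and field boosts in the thread. It is only an illustration of the expected output: the function names are made up, the tokenizer is a plain lowercase whitespace split, and none of this reflects how Elasticsearch or Lucene build queries internally.

```python
# Illustrative model only: it mimics the *shape* of the Lucene query quoted
# above and is not Elasticsearch's actual query-building logic.

# Synonym groups copied from my_synonym_filter.
SYNONYM_GROUPS = [
    ["matthew", "matt", "matty"],
    ["thomas", "tom", "thom", "tommy"],
]

# Field boosts copied from the multi_match query (field1^8, field2^2).
FIELDS = [("field1", 8.0), ("field2", 2.0)]


def expand(token):
    """Return the token followed by its synonyms, or just the token."""
    for group in SYNONYM_GROUPS:
        if token in group:
            return [token] + [term for term in group if term != token]
    return [token]


def clause(token):
    """Build one clause that covers every field and every synonym."""
    parts = [
        f"({field}:{term})^{boost}"
        for field, boost in FIELDS
        for term in expand(token)
    ]
    return "(" + " | ".join(parts) + ")"


def build_query(text):
    """Lowercase and whitespace-split the text, one clause per token."""
    return [clause(token) for token in text.lower().split()]


if __name__ == "__main__":
    for line in build_query("Matt And Tom Oldfield"):
        print(line)
```

Each synonym appears once per field with that field's boost, which is the behaviour the question was asking for.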

You are right. If I try it with ES on my laptop it works, but if I try it on the AWS Elasticsearch Service it generates the query I posted earlier. Do you have any idea why that happens? –

@SofiaBraun Can you provide the ES version? – Ivan

I am using ES 5.1 –
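Since the two environments behave differently, comparing their versions is a reasonable first check. The thread does not confirm that the version gap is the cause. The sketch below parses the `version.number` field that a cluster returns from its root endpoint (`GET /`); the sample responses are hypothetical, and the patch number `5.1.1` is an assumption, because the thread only says "5.1".

```python
import re


def version_tuple(root_response):
    """Extract (major, minor, patch) from a root-endpoint response dict."""
    number = root_response["version"]["number"]
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", number)
    if match is None:
        raise ValueError(f"unexpected version string: {number!r}")
    return tuple(int(part) for part in match.groups())


# Hypothetical, trimmed responses standing in for the two clusters.
LAPTOP = {"version": {"number": "5.5.0"}}
AWS = {"version": {"number": "5.1.1"}}  # patch number assumed

if __name__ == "__main__":
    if version_tuple(AWS) != version_tuple(LAPTOP):
        print("versions differ:", version_tuple(AWS), version_tuple(LAPTOP))
```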
