Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
441 views
in Technique[技术] by (71.8m points)

autocomplete - Elasticsearch: Find substring match

I want to perform both exact word match and partial word/substring match. For example if I search for "men's shaver" then I should be able to find "men's shaver" in the result. But in case case I search for "en's shaver" then also I should be able to find "men's shaver" in the result. I using following settings and mappings:

Index settings:

PUT /my_index
{
    "settings": {
        "number_of_shards": 1, 
        "analysis": {
            "filter": {
                "autocomplete_filter": { 
                    "type":     "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type":      "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter" 
                    ]
                }
            }
        }
    }
}

Mappings:

PUT /my_index/my_type/_mapping
{
    "my_type": {
        "properties": {
            "name": {
                "type":            "string",
                "index_analyzer":  "autocomplete", 
                "search_analyzer": "standard" 
            }
        }
    }
}

Insert records:

POST /my_index/my_type/_bulk
{ "index": { "_id": 1            }}
{ "name": "men's shaver" }
{ "index": { "_id": 2            }}
{ "name": "women's shaver" }

Query:

1. To search by exact phrase match --> "men's"

POST /my_index/my_type/_search
{
    "query": {
        "match": {
            "name": "men's"
        }
    }
}

Above query returns "men's shaver" in the return result.

2. To search by Partial word match --> "en's"

POST /my_index/my_type/_search
{
    "query": {
        "match": {
            "name": "en's"
        }
    }
}

Above query DOES NOT return anything.

I have also tried following query

POST /my_index/my_type/_search
{
    "query": {
        "wildcard": {
           "name": {
              "value": "%en's%"
           }
        }
    }
}

Still not getting anything. I figured it is because of "edge_ngram" type filter on Index which is not able to find "partial word/sbustring match". I tried "n-gram" type filter as well but it is slowing down the search alot.

Please suggest me how to achieve both excact phrase match and partial phrase match using same index setting.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

To search for partial field matches and exact matches, it will work better if you define the fields as "not analyzed" or as keywords (rather than text), then use a wildcard query.

See also this.

To use a wildcard query, append * on both ends of the string you are searching for:

POST /my_index/my_type/_search
{
"query": {
    "wildcard": {
       "name": {
          "value": "*en's*"
       }
    }
}
}

To use with case insensitivity, use a custom analyzer with a lowercase filter and keyword tokenizer.

Custom Analyzer:

"custom_analyzer": {
            "tokenizer": "keyword",
            "filter": ["lowercase"]
        }

Make the search string lowercase

If you get search string as AsD: change it to *asd*


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...