I've recently started using ElasticSearch and I can't seem to make it search for a part of a word.
Example: I have three documents from my CouchDB indexed in ElasticSearch:
{
  "_id": "1",
  "name": "John Doeman",
  "function": "Janitor"
}
{
  "_id": "2",
  "name": "Jane Doewoman",
  "function": "Teacher"
}
{
  "_id": "3",
  "name": "Jimmy Jackal",
  "function": "Student"
}
So now I want to search for all documents containing "Doe":
curl 'http://localhost:9200/my_idx/my_type/_search?q=Doe'
That doesn't return any hits. But if I search for
curl 'http://localhost:9200/my_idx/my_type/_search?q=Doeman'
It does return one document (John Doeman).
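(Side note: if I'm using the _analyze API correctly, it can show what tokens were actually indexed. My understanding is that the default standard analyzer keeps "Doeman" as a single lowercased token, which would explain why q=Doe finds nothing but q=Doeman does:)

# Inspect the tokens the default (standard) analyzer produces;
# I expect whole-word tokens like "john" and "doeman".
curl 'http://localhost:9200/_analyze?analyzer=standard&text=John+Doeman'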
I've tried setting different analyzers and different filters as properties of my index. I've also tried using a full-blown query, for example:
{
  "query": {
    "term": {
      "name": "Doe"
    }
  }
}
But nothing seems to work.
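(One thing I'm unsure about: I've read that term queries are not analyzed, so "Doe" would be compared verbatim against the lowercased indexed tokens and never match "doeman". If that's right, something like a prefix query, matching "doe" against the start of the token "doeman", might behave differently, though I haven't verified it:)

{
  "query": {
    "prefix": {
      "name": "doe"
    }
  }
}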
How can I make ElasticSearch find both John Doeman and Jane Doewoman when I search for "Doe"?
UPDATE
I tried using the nGram tokenizer and filter, as Igor proposed, like this:
{
  "index": {
    "index": "my_idx",
    "type": "my_type",
    "bulk_size": "100",
    "bulk_timeout": "10ms",
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "my_ngram_tokenizer",
          "filter": [
            "my_ngram_filter"
          ]
        }
      },
      "filter": {
        "my_ngram_filter": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 1
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 1
        }
      }
    }
  }
}
The problem I'm having now is that each and every query returns ALL documents.
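(My suspicion, which I haven't confirmed, is that with min_gram and max_gram both set to 1, every word is split into single-character tokens, so any query that shares even one letter with a document matches it. Assuming I'm calling the index-level _analyze API correctly, it should show this:)

# With min_gram = max_gram = 1, I'd expect "Doe" to come back as the
# single-character tokens "D", "o", "e", which would match almost anything.
curl 'http://localhost:9200/my_idx/_analyze?analyzer=my_analyzer&text=Doe'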
Any pointers? The ElasticSearch documentation on using nGram isn't great...