In the previous articles about scoring and boolean queries, Spoon Consulting showed how Elasticsearch scoring works by default and how to tweak your query to influence it.

In this article, we will look at how to give different weights to different fields, and at the different “boost” behaviors.

TL;DR

Give power to relevant fields

The basic concept of a boost is to add more weight to relevant fields.

As an example, if you use Elasticsearch on a blog:

If you search for elasticsearch and it’s found in the title, it should be more important than if it’s found in a comment.

However, depending on your dataset, a match in the comments can score higher than a match in the title.

“Boosts” are here to ensure that the most valuable fields give more points than the others. 

In our previous example, we could build a query like this: 

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": {
              "query": "Elastic Search",
              "boost": 3
            }
          }
        },
        {
          "match": {
            "comments": "Elastic Search"
          }
        }
      ]
    }
  }
}

The boost multiplies the score contribution of the title clause by 3.

Because the boost is a multiplier, a value between 0 and 1 lowers the importance of a field.
In our previous query, we could also have used a boost of 0.3 on the comments field instead.
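
To make the "multiplier" idea concrete, here is a minimal Python sketch. It assumes the final score of a document is simply the sum of the boosted title and comments clause scores, and the per-clause scores are made-up values, not output from a real cluster. It compares the ranking with no boost, with a boost of 3 on title, and with a boost of 0.3 on comments.

```python
# Hypothetical per-clause scores (title_score, comments_score) for three documents.
# These numbers are invented for illustration only.
docs = {
    "doc_a": (2.0, 0.5),  # strong title match, weak comments match
    "doc_b": (0.5, 4.0),  # weak title match, strong comments match
    "doc_c": (1.5, 1.5),  # balanced
}

def rank(title_boost, comments_boost):
    """Order document ids by boosted score, best first.

    Assumes the final score is the sum of the boosted clause scores.
    """
    def total(doc_id):
        title_score, comments_score = docs[doc_id]
        return title_boost * title_score + comments_boost * comments_score
    return sorted(docs, key=total, reverse=True)

no_boost = rank(1.0, 1.0)      # comments dominate
title_x3 = rank(3.0, 1.0)      # title weighted 3 times more
comments_x03 = rank(1.0, 0.3)  # comments weighted down instead

print(no_boost, title_x3, comments_x03)
```

With these sample values, both boosted variants put the document with the strong title match first, while the unboosted ranking favours the document with the strong comments match. What matters for the ordering is the ratio between the boosts, not their absolute values.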

Remember the formula from the article on scoring in Elasticsearch:

∑ (
    tf(t in d)     # tf = sqrt(termFreq)
  × idf(t)²        # idf = 1 + ln(maxDocs / (docFreq + 1))
  × t.getBoost()   # query-time boost applied on the field
  × norm(t, d)     # norm = 1 / sqrt(numFieldTerms)
) (t in q)
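
As a rough illustration of where the boost enters this formula, the sketch below computes the contribution of a single term in Python. The function name and the sample statistics are hypothetical, and it only follows the classic TF/IDF formula quoted above. Recent Elasticsearch versions default to BM25, so treat the numbers as illustrative rather than as what your cluster will return.

```python
import math

def term_score(term_freq, doc_freq, max_docs, num_field_terms, boost=1.0):
    """Contribution of one term, following the classic formula quoted above."""
    tf = math.sqrt(term_freq)                           # tf = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))     # idf = 1 + ln(maxDocs / (docFreq + 1))
    norm = 1.0 / math.sqrt(num_field_terms)             # norm = 1 / sqrt(numFieldTerms)
    return tf * idf ** 2 * boost * norm

# Made-up statistics: the term appears once in a 4-term title,
# and in 9 documents out of 1000.
plain = term_score(term_freq=1, doc_freq=9, max_docs=1000, num_field_terms=4)
boosted = term_score(term_freq=1, doc_freq=9, max_docs=1000, num_field_terms=4, boost=3)

print(plain, boosted, boosted / plain)
```

Since the boost is just one factor of the product, a boost of 3 triples the term's contribution and leaves tf, idf and norm untouched.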

Note: In old Elasticsearch documentation, you’ll find boosting at index time, set in the mapping of your index. Just don’t use it: it has been deprecated since version 5.

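Boosting queries

The boosting query is rarely used. Usually, function score queries are preferred.
But it can be useful for some tricky use cases.

Sometimes it is useful to lower the score of documents that contain a given set of words, without excluding them from the results. In the query below, documents matching “apple” are returned, and those that also match the negative clause have their score multiplied by negative_boost.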

GET /_search
{
  "query": {
    "boosting": {
      "positive": {
        "term": {
          "text": "apple"
        }
      },
      "negative": {
        "match": {
          "text": "pie tart fruit crumble tree"
        }
      },
      "negative_boost": 0.5
    }
  }
}
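
The effect of negative_boost on the final score can be sketched in a few lines of Python. This is a simplified model of the behaviour described in the Elasticsearch documentation for the boosting query, not Elasticsearch code: the function name and the input scores are hypothetical.

```python
def boosting_score(positive_score, matches_negative, negative_boost=0.5):
    """Simplified model of the boosting query.

    A document must match the positive query to be returned at all.
    If it also matches the negative query, its score is multiplied
    by negative_boost; otherwise the positive score is kept as is.
    """
    if matches_negative:
        return positive_score * negative_boost
    return positive_score

# Made-up positive scores for two documents that both mention "apple".
about_the_company = boosting_score(2.0, matches_negative=False)
about_the_fruit = boosting_score(2.0, matches_negative=True)

print(about_the_company, about_the_fruit)  # 2.0 1.0
```

The document that mentions the fruit-related words is still returned, only lower in the list. This is the difference with a must_not clause, which would remove it entirely.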

How to define a good boost strategy for my Elasticsearch query?

Well, some use cases are easy to anticipate, but it’s mostly trial and error.

  1. Try some queries with no boost at all and study the result list.
    Look for non-optimal results.
  2. If you find some non-optimal results, tweak the boosts according to your own logic. Try a higher boost on the most important fields until you get satisfying results.
  3. A/B and peer testing: ask some colleagues or clients to test and give their feedback.
  4. Analyse client behaviours to find non-optimal queries.
    Implement some analytics, as App Search does for you, to better know your users and the queries made on your application. Then tweak to improve your ROI.

Figure: Analytics for Elasticsearch from App Search (query stats)
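
For the trial-and-error steps above, it helps to generate the query body from the boost values instead of editing JSON by hand. The following Python sketch builds the title/comments query from earlier in this article for several candidate boosts. The build_query helper and the list of candidate values are assumptions made for illustration; sending the bodies to your cluster and comparing the result lists is left to whatever client you already use.

```python
import json

def build_query(text, title_boost=1.0, comments_boost=1.0):
    """Build the bool/should query from this article with configurable boosts."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"title": {"query": text, "boost": title_boost}}},
                    {"match": {"comments": {"query": text, "boost": comments_boost}}},
                ]
            }
        }
    }

# Step 1 uses no boost (1); step 2 tries progressively higher title boosts.
candidate_boosts = (1, 2, 3, 5)
candidates = [build_query("Elastic Search", title_boost=b) for b in candidate_boosts]

# Print one of the bodies to check it before sending it to the _search endpoint.
print(json.dumps(candidates[2], indent=2))
```

Keeping the boosts as parameters also makes it easier to run the same set of test searches against each variant and compare the result lists side by side.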

Spoon Consulting is a certified partner of Elastic

As a certified partner of the Elastic company, Spoon Consulting offers high-level consulting for all kinds of companies.

Read more about your own Elasticsearch use case in Spoon Consulting’s posts

Or contact Spoon Consulting now