Introduction

Elasticsearch and Kibana bring value in many different ways.
Let’s explore how we use them in the context of a Salesforce platform application.

Wartner Group is a Paris-based connected-laundry group that has significantly transformed its established business units, with a reputation for excellence and quality delivery. Wartner Group has led an impressive digital reimagining of its business, fully integrating and digitalizing its processes to provide a unique customer experience.

Spoon Consulting has been working with Wartner since the inception of its transformation program to design and build a full-cloud solution (namely Salesforce and Heroku). Spoon Consulting delivered a solution that manages all business processes, including order management, supply chain, customer service, and billing.

Over the years, data volumes exploded with the need to track every laundry item, accentuated by the swift expansion of the company. Reporting became a challenge…

Thanks to Elasticsearch and Kibana, Wartner now has real-time dashboards on both Salesforce data and production data, without impacting its production systems.

Business and IT stacks 

  1. Salesforce for all business processes: order management, supply chain, and billing
  2. A full-cloud strategy with very limited internal IT resources
  3. Production tracking in Heroku for real-time item status, both in the laundry and on the premises of Wartner’s customers
  4. Heroku Connect to synchronise data between Heroku and Salesforce
  5. Elasticsearch on Elastic Cloud
  6. Kibana

Volumes

  • Several hundred thousand rows a day
  • 300 GB of indexed data in PostgreSQL
  • Data growth of 50% a year

TL;DR

In short:

  1. Index Salesforce data with Heroku Connect and Ruby on Rails, thanks to the Searchkick gem
  2. Create Kibana dashboards on Followups, History, Orders, and Invoices
  3. Create dashboards for management
  4. Create users with specific rights in Elasticsearch
  5. Give VIP clients a unique login to follow their data in real time in their dashboards

Situation

Wartner faces a huge amount of information coming from multiple ecosystems (Salesforce and the production tool on Heroku), which made it difficult to get appropriate insights into the activity.

As a result, Wartner was compelled to create reports and dashboards both in Salesforce and in the production app, and then reconcile them manually.

Creating reports in the production app raised significant difficulties. We had to develop complex ad hoc queries and UI for each need, which led to slow information retrieval and results of little value.
It was also genuinely dangerous. Even with strong user limitations, it implied heavy queries that could impact the whole production tool, with dramatic consequences such as bringing down the entire production database.

With more than 100 employees using the production app, that is definitely not an acceptable risk.

Moreover, Wartner wanted to build a closer relationship with its own clients and give them access to those reports.

For Wartner it was urgent to find a more viable solution. 

Pains

Every project comes with challenges. In Wartner’s case, the pains were the following:

  1. Complex business rules, depending on several internal flows
  2. Need for reporting approaching real-time capacity; the previous solutions based on materialized views in Postgres were problematic
  3. Unstructured history logs (see the snippet after this list)
    The history logs are based on the audited gem, which stores every change to the audited models in a JSON field in Postgres.
    This was definitely not a good fit for aggregations, even with the best indexes
  4. Increasing amount of data (+50% each year)
    This justified the change of technology; three years ago it was still possible to use materialized views.
  5. More than 1,000,000 index updates a day in Elasticsearch
  6. 250 GB of data in Elasticsearch
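
For context, here is roughly how those history logs were produced; a minimal sketch of standard audited usage, with Followup standing in for any audited model:

# standard audited setup: every create/update/destroy on the model is
# written to the audits table, with the changed attributes serialized
# into a JSON field in Postgres — convenient to write, hard to aggregate
class Followup < ApplicationRecord
  audited
end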

Solution for real-time business intelligence (BI)

Hosting

You may now see why Elasticsearch became such an obvious choice.

As we were based on Heroku, we first tried the open source cloud offerings distributed via Heroku Add-ons.
But we found these offers less powerful and more expensive than Elastic Cloud from Elastic.co.
So we chose the original instead of the copies! We deployed it on AWS, in the same region as our Heroku dynos, to limit latency.

We took a two-zone cluster (so one replica) with Kibana and APM.

Really easy to set up, and also easy to upgrade when the volume of generated data becomes too large.

Data Ingestion Strategy

The production application is built on Ruby on Rails, and the Salesforce data is synced into Postgres thanks to Heroku Connect.

We love Active Record and our application models.

So we chose to manage data ingestion directly in the app, without Logstash.

First, we added some gems (see the Gemfile sketch below):

  • Sidekiq for async job management and parallelisation
  • Searchkick to connect Active Record models to Elasticsearch indices.
    Searchkick reindexes the data without any downtime, using an alias strategy.
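
A minimal sketch of the corresponding Gemfile additions (versions omitted):

# Gemfile
gem 'sidekiq'    # background jobs, used to parallelise indexing
gem 'searchkick' # connects Active Record models to Elasticsearch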

With Searchkick, you can just call the gem and everything works out of the box, hooked into the model life cycle.

Each time a model is saved, the data is automatically ingested into Elasticsearch.

But for this to be a reliable solution, all your updates need to go through your models:
any update_column or update_all call skips the Active Record callbacks and will be ignored.
In our case, because of some database triggers, we had to bypass the life cycle callbacks and manage the sync ourselves, as sketched below.
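
A minimal sketch of what that manual sync can look like; FollowupReindexJob is a hypothetical name, and in practice the jobs are enqueued by whatever detects the change:

# reindex one record in the background, independently of model callbacks
class FollowupReindexJob
  include Sidekiq::Worker

  def perform(followup_id)
    # Searchkick's per-record reindex still works with callbacks: false
    Search::FollowupSearch.find(followup_id).reindex
  end
end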

Model configuration

We chose to create search classes that inherit from the relevant model.

So our configuration looks like this:

module Search
  # Manage Followup indexing and search
  # in Elasticsearch
  class FollowupSearch < Followup
    # Searchkick: https://github.com/ankane/searchkick#getting-started
    searchkick callbacks: false,
               settings: { number_of_shards: 5 },
               index_name: 'followups'
  end
end
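
Querying then goes through the same class; a small usage sketch with Searchkick’s standard search API (the query string and field list are illustrative):

# full-text search against the 'followups' index
results = Search::FollowupSearch.search('lost item', fields: [:step_name])
results.each { |followup| puts followup.step_label }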

Data enrichment

That’s the interesting and tricky part. 

  • A table by itself is not relevant data without a lot of JOINs
  • We don’t need all the field values of all the tables
  • But we need specific information to build coherent NoSQL indexes in Elasticsearch.

This is the part that needs the most thought. Each table and each use case is different, and every field you add has a cost in indexing, disk size, and aggregation computing time.

For this part, we need a very good understanding of the DBMS and of the company’s needs.

In the code, it means the following (Searchkick uses the search_import scope to eager-load the associations when reindexing):

scope :search_import, lambda {
  includes(:account,
           :and_a_lot_of_includes)
}

And the core of the information is here:

# define custom fields
def search_data_fields
  {
    tag_rfid: tag_rfid.code,
    type_de_devis: quote.type__c,
    localisation: step.localisation,
    step_name: step.name,
    step_label: step.label,
    product_name: product.name,
    account_sfid: account.sfid,
    account_name: account.name,
    account: account.to_json,
    departement: department.name
  }
end
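
Searchkick indexes whatever search_data returns, so these custom fields have to be merged with the model’s own columns somewhere. A minimal sketch of one way to wire it up (our assumption about how the pieces fit, not the verbatim production code):

# combine the base table columns with the enriched fields
def search_data
  attributes.merge(search_data_fields)
end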

Standardisation

In the previous model example, you may have noticed duplicated account information.

That’s not a mistake. 

account_sfid: account.sfid,
account_name: account.name,
account: account.to_json,

The account information is the basis for filters and aggregations across all the indices created in Elasticsearch.

The trick here is to add the main values, normalized the same way, to each index.

You will then be able to query several objects at once, like this:

GET _search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "field": "my searc
        }
      },
      "filter": {
        "term": {
          "account_sfid": "My id"
        }
      }
    }
  }
}

This is key for the next point.

User creation in Elastic Search

A small method allows Wartner administrators to create an Elasticsearch user for each of their clients, securing their data access in Elasticsearch.

This allows us to give them access to Kibana with the absolute assurance that they will only ever see the data of their own Account.

Elasticsearch users are not directly bound to an Active Record model.
This is not a problem:

# create a singleton to avoid multiple unclosed connections to Elasticsearch
require 'singleton'

module Search
  class EsClient
    require 'elasticsearch/rails'
    require 'elasticsearch/xpack'
    require 'securerandom'
    include Singleton

    def initialize
      @es_client = Elasticsearch::Client.new log: true
    end

    def client
      @es_client
    end

    # generate a new random password and keep it in memory to show it later
    def get_password
      @password ||= SecureRandom.hex(16)
    end

    # create an Elasticsearch user from a Salesforce contact
    # and generate a new random password
    # kibana_dashboard_only_user blocks access to anything but dashboard visualisation
    # "client" is our custom role
    # metadata is used to store the filtering data
    def create_contact_user(contact)
      response = @es_client.xpack.security.put_user(
        username: contact.email,
        body: {
          password: get_password,
          roles: %w[kibana_dashboard_only_user client],
          full_name: contact.name,
          email: contact.email,
          metadata: {
            account_sfid: contact.accountid
          }
        }
      )
      response[contact.email]
    end
  end
end
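
Usage from an admin-only action is then a one-liner; a sketch, where contact is assumed to be a synced Salesforce Contact record:

es = Search::EsClient.instance
es.create_contact_user(contact) # create the restricted Kibana user
puts es.get_password            # show the generated password once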

The corresponding Elasticsearch role handles user visibility with document-level security: a templated query is rendered per user from _user.metadata, so each client only matches documents carrying their own account_sfid:

{
  "client" : {
    "indices" : [
      {
        "names" : [
          …
        ],
        "privileges" : [
          "read"
        ],
        "query" : """{"template":{"source":{"term":{"account.sfid":"{{_user.metadata.account_sfid}}"}}}}""",
        "allow_restricted_indices" : false
      }
    ]
  }
}
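
Such a role can be created once through the same singleton client; a sketch using the xpack role API, with the index names elided as above:

Search::EsClient.instance.client.xpack.security.put_role(
  name: 'client',
  body: {
    indices: [{
      names: ['…'], # the indices to expose, elided as above
      privileges: ['read'],
      query: {
        template: {
          source: { term: { 'account.sfid' => '{{_user.metadata.account_sfid}}' } }
        }
      }
    }]
  }
)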

Conclusion

The indexing process is fairly common, but it needs to be designed wisely.

In other blog posts, we will cover how to use Beats and other parts of the Elastic stack.

If you want Spoon Consulting experts to help you with your business needs and implementation with Salesforce, Elasticsearch, or both, don’t hesitate to contact us.