text / document classification with elastic search

Text classification with Elasticsearch: no AI needed

Elasticsearch is very powerful for classifying unstructured text. In a recent use case for a major international consulting company, we had to process unstructured documents so that our client could exploit its internal data, and analyse and find similar content, based on a tag classification. That’s typically a case where Elasticsearch can look like magic. In just a few days, based on the core functionality of Elasticsearch and Kibana, we were able to auto-classify all the documents of the…

Generate dynamic fields

Runtime fields in Elasticsearch and Kibana – tests and explanation

In the latest releases of Elasticsearch and Kibana, a lot has been done to integrate runtime fields smoothly and unleash all their power naturally. In a few clicks, you can create virtual values for formatting or calculating values without reindexing the whole index. A runtime field can now be queried transparently, like any other Elasticsearch _source field. But it’s still a virtual field, calculated on the fly at query time, just like scripted fields. So, it’s a very…
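
As a rough illustration of a value calculated at query time, the sketch below builds a search body that declares a runtime field through `runtime_mappings`. The field names `price` and `price_with_tax` and the 20% rate are assumptions made up for this example, not taken from the article.

```python
import json


def runtime_field_search(tax_rate: float = 0.2) -> dict:
    """Build a search body that defines a runtime field at query time.

    `price` (indexed) and `price_with_tax` (virtual) are hypothetical names.
    """
    return {
        "runtime_mappings": {
            "price_with_tax": {
                "type": "double",
                # Painless script: runtime fields publish their value with emit().
                "script": {
                    "source": "emit(doc['price'].value * (1 + params.rate))",
                    "params": {"rate": tax_rate},
                },
            }
        },
        # The virtual field can be returned and queried like a regular one.
        "fields": ["price_with_tax"],
        "query": {"range": {"price_with_tax": {"gte": 100}}},
    }


if __name__ == "__main__":
    print(json.dumps(runtime_field_search(), indent=2))
```

Nothing is reindexed here: the mapping lives in the request only, so removing it from the body removes the field.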

Text categorization with elasticsearch

Categorize_text: better log alerts in Elasticsearch

TL;DR: A new aggregation on unstructured (semi-structured) text in version 7.16. Categorize logs for an alert. Better granularity for information messages.
Build Better Alerts with the new aggregation of Elasticsearch
We are working on an alerting system on Elasticsearch for one of Spoon Consulting’s clients. The client’s needs are very classic: send an alert when there are more than 5 error logs within less than 10 minutes, and know which errors were encountered. Usually, to do this, I would have to build a query…
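
The alert rule described above can be sketched as a search body plus two small helpers. This is only an illustration under assumptions: the field names `log.level`, `@timestamp` and `message`, and the aggregation name `error_categories`, are hypothetical and should be adapted to the actual index.

```python
def error_category_query(window: str = "now-10m") -> dict:
    """Search body: count recent error logs and group them by text category."""
    return {
        "size": 0,
        "track_total_hits": True,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"log.level": "error"}},            # assumed field
                    {"range": {"@timestamp": {"gte": window}}},  # assumed field
                ]
            }
        },
        "aggs": {
            # categorize_text (7.16+) buckets similar messages together.
            "error_categories": {"categorize_text": {"field": "message"}}
        },
    }


def should_alert(response: dict, threshold: int = 5) -> bool:
    """True when strictly more than `threshold` error logs matched."""
    return response["hits"]["total"]["value"] > threshold


def encountered_errors(response: dict) -> list:
    """Return (category, count) pairs from the aggregation buckets."""
    buckets = response["aggregations"]["error_categories"]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]
```

The helpers only read the parsed JSON response, so they can be used with any client that returns the search result as a dictionary.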

Dedup data with elasticsearch

Deduplication made (almost) easy, thanks to Elasticsearch’s Aggregations

TL;DR: After a botched update of physical RFID readers, a connected factory started to generate histories in continuous mode instead of sequential mode. This led to hundreds of thousands of duplicates in the history database. With one Elasticsearch aggregation and one small Rails script, it was easy to clean up both the Elasticsearch and PostgreSQL databases.
Management of Histories with Elasticsearch
In this project, we use the Audit gem of Ruby on Rails to track any change on the main…

centralized work with elastic search search template

Elasticsearch’s Search Templates: Industrialisation of Elasticsearch queries

Elasticsearch has a lot of small, little-known, game-changing features. Search templates are one of those. When they suit your use case, they greatly improve your integration quality.
TL;DR: Elasticsearch’s search templates are useful to simplify integration, give standard access to your indices without having to create any microservice, separate concerns, and avoid code duplication.
What is a Search Template?
Search templates are reusable scripts that handle the query complexity and let integrators use very complicated queries…
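
To make the idea concrete, here is a minimal sketch of the two JSON bodies involved: the template stored once on the cluster, and the short call an integrator sends afterwards. The template id `title-search`, the `title` field and the parameter names are assumptions chosen for this example.

```python
def stored_template() -> dict:
    """Body for `PUT _scripts/title-search`: a mustache search template."""
    return {
        "script": {
            "lang": "mustache",
            "source": {
                # The query complexity lives here, once, on the cluster side.
                "query": {"match": {"title": "{{query_string}}"}},
                "from": "{{from}}",
                "size": "{{size}}",
            },
        }
    }


def template_call(text: str, page: int = 0, page_size: int = 10) -> dict:
    """Body for `GET <index>/_search/template`: only an id and parameters."""
    return {
        "id": "title-search",  # hypothetical template id
        "params": {
            "query_string": text,
            "from": page * page_size,
            "size": page_size,
        },
    }
```

The caller never sees the query itself, which is where the separation of concerns comes from: the template can evolve without touching client code as long as the parameter names stay the same.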

Scoring in Elasticsearch

Multi Match Query with Elasticsearch – Influence scoring – part 3

As we saw in part 1, we can use boolean queries to influence scoring and boost the relevance of your searches. But this soon becomes very verbose, and some kinds of queries can be quite complicated to write. Multi-match helps you write more concise search queries.
TL;DR: Terms can be reused to “boost” better results (in part 1). Filter can be used to scope a query without influencing the score (in part 1). Mix Must/Should/Filter in one Elasticsearch…
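
For a sense of how concise this can be, the sketch below builds a `multi_match` query over several fields in one clause. The field names `title` and `description` and the `^3` weight are assumptions for illustration only.

```python
def multi_match_query(text: str, fields=("title^3", "description")) -> dict:
    """Search the same text in several fields with a single clause.

    The caret syntax (`title^3`) weights matches on that field more heavily.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                # best_fields (the default type) scores a document by its
                # best-matching field.
                "type": "best_fields",
            }
        }
    }
```

Adding a field to the search is then a one-item change in the `fields` list rather than a new clause in a boolean query.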

Boost your fields for better relevance with Elasticsearch

Boost field weight in Elasticsearch – Influence Elasticsearch Scoring – Part 2

In the previous articles about scoring and boolean queries, Spoon Consulting showed how Elasticsearch scoring works by default and how to tweak your query to influence it. In this article, we will study how to give different weights to different fields, and the different “boost” behaviors.
TL;DR: Terms can be reused to “boost” better results (in part 1). Filter can be used to scope a query without influencing the score (in part 1). Mix…

Meaning of relevance

Boolean query with Elasticsearch – Influence Elasticsearch Scoring – Part 1

In the previous article about scoring, Spoon Consulting showed how Elasticsearch scoring works by default. Now let’s see how you can shape your results to fit your use cases.
TL;DR: Terms can be reused to “boost” better results (in part 1). Filter can be used to scope a query without influencing the score (in part 1). Mixing Must/Should/Filter in one Elasticsearch boolean query gives a lot of flexibility (in part 1). Boosts give weight on…
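
The three clause types mentioned in the TL;DR can be combined as in the sketch below. The field names `title`, `category` and `brand`, and the boost value, are hypothetical and only serve the illustration.

```python
def bool_query(text: str, category: str, preferred_brand: str) -> dict:
    """Combine must, should and filter clauses in one boolean query."""
    return {
        "query": {
            "bool": {
                # must: has to match, and contributes to the score.
                "must": [{"match": {"title": text}}],
                # should: optional, but a match raises the score ("boost").
                "should": [
                    {"term": {"brand": {"value": preferred_brand, "boost": 2.0}}}
                ],
                # filter: has to match, but does not influence the score.
                "filter": [{"term": {"category": category}}],
            }
        }
    }
```

Moving a clause from `must` to `filter` keeps the same result set while removing its effect on ranking, which is the scoping behaviour described above.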

Paginate aggregation with Elasticsearch

Paginating term aggregation

In Elasticsearch, paginating aggregation results is a recurring need. By default, Elasticsearch returns aggregation results in one go, without pages; a terms aggregation, for instance, returns only its top buckets (10 by default). A query filter is often enough to narrow the results down, but that’s not always the wanted behavior. First possibility: greatly increase the size parameter and paginate on the front end. It can be a good solution… for a few hundred results and low cardinality. But if we don’t want to crash our app, we can probably do better. Depending on your specific use case you will…
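
The excerpt stops before naming the alternatives, so the following is only one well-known option and not necessarily the one the article recommends: a composite aggregation, which returns buckets page by page through an `after` key. The aggregation name `paged` and the source name `value` are assumptions for this sketch.

```python
from typing import Optional


def composite_page(field: str, page_size: int = 100,
                   after_key: Optional[dict] = None) -> dict:
    """Search body requesting one page of buckets for `field`."""
    composite = {
        "size": page_size,
        "sources": [{"value": {"terms": {"field": field}}}],
    }
    if after_key is not None:
        # Resume right after the last bucket of the previous page.
        composite["after"] = after_key
    return {"size": 0, "aggs": {"paged": {"composite": composite}}}


def next_after_key(response: dict) -> Optional[dict]:
    """Return the key to pass for the next page, or None when exhausted."""
    return response["aggregations"]["paged"].get("after_key")
```

A caller would loop: send `composite_page(...)`, read the buckets, then pass `next_after_key(response)` into the next request until it returns `None`.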

More like this on ebay

More Like This query (MLT) – Suggest similar content with Elasticsearch

TL;DR: With the More Like This query, Elasticsearch gives you an easy and powerful way to find content similar to what your users are currently viewing. It’s key to improving your retention and your ROI.
More Like This Query: Suggest related content
Any company that publishes a website, or any content-centred tool, has only one goal: improve conversion. It can be a contact form, a newsletter subscription, an online purchase, a click on an advert or simply finding…
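
As a minimal sketch of such a suggestion query, the function below asks for documents similar to the one the user is currently viewing. The index name, document id and field names are placeholders, and the tuning values are examples rather than recommendations.

```python
def more_like_this(index: str, doc_id: str,
                   fields=("title", "description"), size: int = 5) -> dict:
    """Search body: find documents similar to an already indexed one."""
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": list(fields),
                # Reference the current document instead of pasting its text.
                "like": [{"_index": index, "_id": doc_id}],
                # Example tuning: keep terms that appear at least once and
                # build the query from at most 12 of them.
                "min_term_freq": 1,
                "max_query_terms": 12,
            }
        },
    }
```

For instance, `more_like_this("articles", "42")` builds the body for up to five pieces of content related to document 42 of a hypothetical `articles` index.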

Contact us
Spoon Consulting 
Spaces Bonne Nouvelle
17 rue Saint-Fiacre
75002 Paris

Contact the Spoon Consulting expert team

Spoon Consulting
Elastic Spoon is part of the Spoon Consulting team.
Visit our website to learn more about us.