In my last blog post on the subject, I tried to find the maximum shard size in elasticsearch. But in the end all I could say is that elasticsearch can index the whole English Wikipedia dump in one shard without any problem but that queries are painfully slow. I couldn’t find any hard limit because […]
Java clients behavior during a split-brain situation in Elasticsearch
In my previous blog post I explained what the split-brain problem is for elasticsearch and how to avoid it, but only briefly touched on how it manifests. In this post I’m going to expand on what actually happens to your indexing and query requests after the split-brain has occurred. As I’m sure you’re already aware, […]
How to avoid the split-brain problem in elasticsearch
We’ve all been there – we started to plan for an elasticsearch cluster and one of the first questions that comes up is “How many nodes should the cluster have?”. As I’m sure you already know, the answer to that question depends on a lot of factors, like expected load, data size, hardware etc. In […]
Maximum shard size in elasticsearch
Whenever people start working with elasticsearch they have to make important configuration decisions. Most of these decisions can be changed later (refresh interval, number of replicas), but one stands out as permanent: the number of shards. When you create an index in elasticsearch, you specify how many shards that index will have and […]
Server-side clustering of geo-points on a map using Elasticsearch
Plotting markers on a map is easy using the tooling that is readily available. However, what if you want to add a large number of markers to a map when building a search interface? The problem is that things start to clutter and it’s hard to view the results. The solution is to group results […]
Improved search for Hippo CMS websites using ElasticSearch
We have done multiple big Hippo projects. A regular Hippo project consists of multiple components like the website, the content management system and a repository for the documents. In most of the projects we also introduce the integration component. This component is used to pull other data sources into Hippo, but we also use it […]
Migrating Verity to Elasticsearch at Beeld & Geluid
Nederlands Instituut voor Beeld & Geluid: Beeld & Geluid is not only the very interesting museum of media and television located in the colorful building next to the Hilversum Noord train station, but is also responsible for archiving all the audio-visual content of all the Dutch radio and television broadcasters. Around 800,000 hours of […]
Elasticsearch server book review
I recently read the ElasticSearch server book published by Packt Publishing. It was a pleasant read, really interesting even though I was already familiar with the product. So here is a quick synopsis of the book and its content. Not one of my usual blogs but nonetheless something I wanted to share. Writing a book about […]
Fun combining Java, JavaScript and elastic.js within the elasticshell
I recently wrote a couple of articles about the elasticshell, the command line shell for Elasticsearch that I created. If you haven’t heard about it, it’s a JSON-friendly command line tool that lets you quickly interact with Elasticsearch: you can easily index documents, execute queries and make use of all the APIs that Elasticsearch […]
Introducing a Query tool as an Elasticsearch plugin (part 2)
In the first part of this series of blogs on Introducing a Query tool as an Elasticsearch plugin, I described the functionality of the query tool as well as the structure of the project. In this part I want to take a deeper dive into interacting with Elasticsearch. The post […]