Trifork Blog

Category ‘Elasticsearch’

Elasticsearch & Spring MVC & Sencha Touch 2 in the Cloud - Part 1

April 15th, 2014 by
(http://blog.trifork.com/2014/04/15/elasticsearch-spring-mvc-sencha-touch-2-in-the-cloud-part-1/)

Introduction

Welcome to my third blog entry. In this one, I want to show how to connect three different technologies (database, REST service, and a mobile framework) and deploy them into the cloud. Those three technologies are:

  • Elasticsearch (Database)
  • Spring MVC (REST service)
  • Sencha Touch 2 (Client side framework)

First, I will give a short introduction to the three technologies, starting with Elasticsearch, followed by Spring MVC and finally Sencha Touch 2. Then I will explain the two services that I used: searchly, a database service, and cloudbees, which serves as the cloud host. The post ends with a conclusion. Part 2 of this blog will cover a demo with the steps needed to connect everything. I hope you enjoy reading this entry. If you have any questions, don't hesitate to drop a comment (of course, you can also drop a comment if you just like it).

Read the rest of this entry »

Server-side clustering of geo-points on a map using Elasticsearch - continued

March 26th, 2014 by
(http://blog.trifork.com/2014/03/26/server-side-clustering-of-geo-points-on-a-map-using-elasticsearch-continued/)

In a previous post I described a data visualization problem and a possible solution provided by an elasticsearch plugin. I noted that elasticsearch might one day evolve to make the plugin unnecessary. That day seems to have come: starting from version 1.0.0, elasticsearch includes Aggregations, a new API for data mining. In this post I'll show you how to use aggregations to reproduce the functionality of the plugin.
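As a rough illustration of the idea, the sketch below builds a search request body that groups geo-points into grid cells with the `geohash_grid` aggregation, which is one aggregation that fits this use case. Whether the full post uses exactly this aggregation is an assumption, and the field name `location`, the bucket name `clusters` and the helper function are hypothetical names chosen for the example.

```python
import json


def geohash_grid_request(field: str, precision: int) -> dict:
    """Build a search body that buckets geo-points into geohash cells.

    The function name and the bucket name "clusters" are hypothetical;
    "geohash_grid", "field" and "precision" are the aggregation's own keys.
    """
    if not 1 <= precision <= 12:
        raise ValueError("geohash precision must be between 1 and 12")
    return {
        # No individual hits are needed, only the per-cell counts.
        "size": 0,
        "aggs": {
            "clusters": {
                "geohash_grid": {"field": field, "precision": precision}
            }
        },
    }


if __name__ == "__main__":
    # "location" is an assumed geo_point field in the index mapping.
    print(json.dumps(geohash_grid_request("location", 5), indent=2))
```

Each bucket in the response would then carry a geohash key and a document count, which a map client could draw as one cluster marker per cell. A lower precision gives larger cells and therefore fewer, coarser clusters.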

Read the rest of this entry »

Evaluating elasticsearch and marvel on the raspberry pi

February 8th, 2014 by
(http://blog.trifork.com/2014/02/08/evaluating-elasticsearch-and-marvel-on-the-raspberry-pi/)

For the past few years I have been working with search solutions, mostly elasticsearch. During this time I bought myself a raspberry pi and installed java and elasticsearch on it. Then I put it in the closet and it did not come out anymore. Then, a few weeks ago, the guys from elasticsearch released marvel. Marvel is a monitoring tool for your elasticsearch cluster. Suddenly I realized what the problem with the raspberry pi was: it is not fun to have just one. Therefore I decided to buy two more and create an elasticsearch cluster. With this cluster I can do experiments. The first experiment is evaluating marvel.

In this blog post I will show some of the concepts of marvel. To make this possible I will also explain the steps I had to take to install elasticsearch on my raspberry pi cluster.

Read the rest of this entry »

Using logstash, elasticsearch and Kibana to monitor your video card - a tutorial

January 28th, 2014 by
(http://blog.trifork.com/2014/01/28/using-logstash-elasticsearch-and-kibana-to-monitor-your-video-card-a-tutorial/)

A few weeks ago my colleague Jettro wrote a blog post about an interesting real-life use case for Kibana: using it to graph meta-data of the photos you took. Given that photography is not a hobby of mine I decided to find a use-case for Kibana using something closer to my heart: gaming.

This Christmas I treated myself to a new computer. The toughest decision I had to make was regarding the video card. In the end I went with a reference AMD R9 290, notorious for its noisiness. Because I'm really interested in seeing how the card performs while gaming, I decided to spend some time on my other hobby, programming, in order to come up with a video card monitoring solution based on logstash, elasticsearch & Kibana. Overkill? Probably. Fun? Definitely.

I believe it's also a very nice introduction to building a fully working logstash - elasticsearch - Kibana setup. Because of the "Windowsy" nature of gaming, some of the commands listed are the Windows versions. The Unix folk should have no problems translating these, as everything is kept very simple.

Read the rest of this entry »

elasticsearch - how many shards?

January 7th, 2014 by
(http://blog.trifork.com/2014/01/07/elasticsearch-how-many-shards/)

We've all been there - you're provisioning for an elasticsearch index and one of the first questions that comes to mind is "how many shards should I create my index with?". In my previous posts on the subject, I wrote about how to find the maximum shard size for elasticsearch. Although informative, the results of the tests also raised a new question: would more shards on a single elasticsearch node increase performance? In this blog post I'm going to try to show the performance consequences of different choices for the number of shards.

Read the rest of this entry »

Use Kibana to analyze your images

November 28th, 2013 by
(http://blog.trifork.com/2013/11/28/use-kibana-to-analyze-your-images/)

If you read technical blogs, maybe about search or data analysis, chances are you have read about Kibana. You have seen stories about how easy it is to use. Most of the blogging effort deals with getting data into Kibana, using logstash for instance. Maybe some of you have installed Kibana and are using it in combination with logstash. But what if you want to analyze other data? With the most recent release, M4, Kibana is better than ever at analyzing other sorts of data. In this blog I am going to show you how to create your own dashboard in Kibana. In order to do something useful with Kibana we need data. Peter Meijer had a very nice idea: index the metadata from all of your images to learn about the type of photos that you take. I decided to put this into practice. I used Node.js and the exiftool to obtain metadata from images and store it in elasticsearch.

Read the rest of this entry »

Maximum shard size in elasticsearch - revisited

November 5th, 2013 by
(http://blog.trifork.com/2013/11/05/maximum-shard-size-in-elasticsearch-revisited/)

In my last blog post on the subject, I tried to find the maximum shard size in elasticsearch. But in the end all I could say was that elasticsearch can index the whole English Wikipedia dump in one shard without any problem, but that queries are painfully slow. I couldn't find any hard limit because I didn't know exactly what the problem would be. I was expecting indexing to slow down before the querying, so I couldn't do a relevant querying test with a smaller index. Armed with the knowledge from my previous experiment, in this post I will try to show what the maximum shard size is for a given set of conditions.

Read the rest of this entry »

Java clients behavior during a split-brain situation in Elasticsearch

October 31st, 2013 by
(http://blog.trifork.com/2013/10/31/java-clients-behavior-during-creating-a-split-brain-situation-in-elasticsearch/)

In my previous blog post I explained what the split-brain problem is for elasticsearch and how to avoid it, but only briefly spoke about how it manifests. In this post I'm going to expand on what actually happens to your indexing and query requests after the split-brain has occurred. As I'm sure you're already aware, it depends! It depends on the type of client you use. Because Java is my specialty, I'm going to write about the two types of clients elasticsearch supports through the Java API: the transport client and the node client.

Read the rest of this entry »

How to avoid the split-brain problem in elasticsearch

October 24th, 2013 by
(http://blog.trifork.com/2013/10/24/how-to-avoid-the-split-brain-problem-in-elasticsearch/)

We've all been there - we started to plan for an elasticsearch cluster and one of the first questions that comes up is "How many nodes should the cluster have?". As I'm sure you already know, the answer to that question depends on a lot of factors, like expected load, data size, hardware etc. In this blog post I'm not going to go into the details of how to size your cluster, but instead will talk about something equally important - how to avoid the split-brain problem.

Read the rest of this entry »

Maximum shard size in elasticsearch

September 26th, 2013 by
(http://blog.trifork.com/2013/09/26/maximum-shard-size-in-elasticsearch/)

Whenever people start working with elasticsearch they have to make important configuration decisions. Most of the decisions can be altered down the line (refresh interval, number of replicas), but one stands out as permanent - the number of shards. When you create an index in elasticsearch, you specify how many shards that index will have, and you cannot change this setting without reindexing all the data from scratch. In some cases reindexing is not a time-consuming task, but there are situations where it can take days to rebuild an elasticsearch index.

Many developers feel the pressure of making the right choice with regard to the number of shards they will use when creating an index. But with a baseline for the maximum shard size, and knowing how much data needs to be stored in elasticsearch, choosing the number of shards becomes much easier.
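The arithmetic behind that choice can be sketched as follows: divide the expected data volume by the maximum shard size and round up, then pass the result as `number_of_shards` in the settings sent when the index is created. The helper names and all figures below are made up for illustration; a real maximum shard size has to come from measurements on your own hardware and data.

```python
import math


def shards_needed(total_data_gb: float, max_shard_size_gb: float) -> int:
    """Smallest number of shards that keeps every shard at or under the maximum size."""
    if total_data_gb < 0 or max_shard_size_gb <= 0:
        raise ValueError("data size must be >= 0 and max shard size must be > 0")
    # An index always has at least one shard, even when it is empty.
    return max(1, math.ceil(total_data_gb / max_shard_size_gb))


def index_settings(number_of_shards: int, number_of_replicas: int = 1) -> dict:
    """Request body for creating an index with a fixed shard count."""
    return {
        "settings": {
            # Fixed at creation time; changing it later means reindexing.
            "number_of_shards": number_of_shards,
            # Unlike the shard count, this can be changed afterwards.
            "number_of_replicas": number_of_replicas,
        }
    }


if __name__ == "__main__":
    # Hypothetical numbers: 120 GB of data and a measured 50 GB maximum shard size.
    shards = shards_needed(120, 50)
    print(shards)  # 3
    print(index_settings(shards))
```

This leaves out growth over time and assumes data is spread evenly across shards, so the result is best treated as a lower bound.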

When I started working with elasticsearch a while ago, I was fortunate enough to work alongside a very talented engineer, a true search expert. I would often ask him questions like "So how many shards can one elasticsearch node support?" or "What should the refresh interval be?". He would pause, think for a while, but in the end his answer would always be "Well, it depends". This answer irked me in the beginning, especially because we're in IT, where everything is 0s and 1s, right? In this blog post I will show what the answer to the question "How much data can a single-shard index hold?" depends on and how to find the best setting for your environment.

Read the rest of this entry »