Trifork Blog

Elasticsearch server book review

May 22nd, 2013 by

I recently read the ElasticSearch Server book published by Packt Publishing. It was a pleasant read and really interesting, even though I was already familiar with the product. So here is a quick synopsis of the book and its content. It’s not one of my usual blog posts, but nonetheless something I wanted to share.

Writing a book about Elasticsearch turns out not to be easy. There are lots of features and gems that deserve discussion, which is really hard to do in a book with a reasonable number of pages. The product is also evolving rapidly, which makes it extremely hard to keep up and produce up-to-date content.

I think this book brings something that was missing until now in the Elasticsearch ecosystem: it goes from installing the product and setting it up to using it in real life, also describing potential issues and their solutions. Nor does it neglect the necessary technical details about the underlying Lucene library and search in general.

Chapter 1: Getting started with Elasticsearch cluster

The first chapter gives an overview of Elasticsearch and explains how to install and use it. It moves quickly, and in surprising detail, through all the supported data types and the available text analyzers. It then describes the distributed nature of Elasticsearch and some best practices, such as using index templates and aliases.

Chapter 2: Searching your data

The second chapter explains how to search against the available indexes and find results. It contains an overview of the queries that the Elasticsearch query DSL offers, together with examples and all the available query options.

Chapter 3: Extending your structure and search

The third chapter goes into more detail about search. It describes how to highlight the relevant parts of the search results, gives real examples of different ways to implement auto-complete, and shows how to index binary content and how to search for geographic locations.

Chapter 4: Make your search better

The fourth chapter goes on to describe the analyze and explain APIs, great tools for understanding how text analysis and document scoring work. The next topic is boosting and the different ways to implement it, either at index time or at query time. This chapter also contains a real example of how to handle multilingual content and an overview of all the available span queries, in other words the queries that take token positions and their proximity into account.

Chapter 5: Combining indexing, analysis and search

The fifth chapter starts with a really hot topic nowadays: document relations. It goes over the out-of-the-box support for JSON nested objects, and then describes nested documents and parent-child relations. The final, and really interesting, topic of the chapter is how data flows into Elasticsearch using rivers and how to index data as fast as possible through batch indexing.
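To give an idea of what batch indexing looks like in practice, here is a minimal sketch (mine, not taken from the book) that builds the newline-delimited body accepted by the bulk API: one action line followed by one document line per document. The index name, type name and documents are made up for illustration, and the helper `build_bulk_body` is hypothetical; actually sending the body to a cluster is left out.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Serialize (id, source) pairs into a newline-delimited bulk request body.

    Each document becomes two lines: an "index" action line with its
    metadata, followed by the document source itself.
    """
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical documents, index name and type name, for illustration only.
books = [
    ("1", {"title": "First example book", "pages": 300}),
    ("2", {"title": "Second example book", "pages": 150}),
]
bulk_body = build_bulk_body("library", "book", books)
print(bulk_body)
```

Sending many documents in one request like this, instead of one request per document, is what saves the round trips.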

Chapter 6: Beyond searching

As the title says, the sixth chapter goes beyond search and describes other features that Elasticsearch offers, among which faceting is definitely the most important one. In fact, many companies use Elasticsearch only for analytics through facets, without any full-text search in their applications. When it comes to facets, it’s great to see not only the required JSON request but also the response it produces, and how the numbers differ depending on the type of facet used. Other features discussed in this chapter are more like this and the percolator.
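As a rough illustration of that request and response pairing, here is a small sketch (again mine, not from the book) of a terms facet request body and of reading the counts back, using the facets API as it existed at the time of writing. The field name, the facet name and the sample response are invented, and the two helper functions are hypothetical; the sample response is hand-written rather than returned by a real cluster.

```python
import json


def terms_facet_request(field, facet_name, size=10):
    """Build a search body that asks only for a terms facet on one field."""
    return {
        "query": {"match_all": {}},
        "size": 0,  # no hits wanted, only the facet counts
        "facets": {facet_name: {"terms": {"field": field, "size": size}}},
    }


def facet_counts(response, facet_name):
    """Turn the terms facet section of a response into a {term: count} dict."""
    entries = response["facets"][facet_name]["terms"]
    return {entry["term"]: entry["count"] for entry in entries}


request_body = terms_facet_request("tags", "tags_counts")
print(json.dumps(request_body, indent=2))

# Hand-written sample shaped like a terms facet response, for illustration only.
sample_response = {
    "facets": {
        "tags_counts": {
            "_type": "terms",
            "missing": 0,
            "total": 4,
            "other": 0,
            "terms": [
                {"term": "search", "count": 3},
                {"term": "lucene", "count": 1},
            ],
        }
    }
}
counts = facet_counts(sample_response, "tags_counts")
print(counts)
```

Other facet types return differently shaped sections, which is exactly why seeing the responses next to the requests is so useful.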

Chapter 7: Administrating your cluster

The seventh chapter explains how to administer an Elasticsearch cluster, mainly using the cluster APIs and the existing user interfaces or plugins that make use of them.

Chapter 8: Dealing with problems

The last chapter is all about tackling potential issues with Elasticsearch by looking at the logs and using APIs such as validate query and indices warmup.

I think ElasticSearch Server is a good fit not only for beginners, but also for people who already know the product and want to get more familiar with it. The reason is that it covers quite a lot, and if you haven’t used Elasticsearch extensively there’s a good chance you have missed some of its goodness!

The parts that I liked the most are the ones that contain real examples and practical hints. That’s why I would have loved to see even more of them, especially about document relations, the query DSL and the percolator. I don’t mean basic examples, but real cases and tips, together with a “Go to production” chapter containing suggestions about all the settings that one should change before going to production.

I hope this helps anyone who is considering the book to get some extra product insight. I promise my next blog post will dive into cool features of the actual product again!
