One of the very first things users encounter when using Apache Solr is its schema. Here they configure the fields that their Documents will contain and the field types, which define, amongst other things, how field data will be analyzed. Solr’s schema is often touted as one of its major features and you will find […]
Lucene Versions – Stable, Development, 3.x and 4.0
With Solr and Lucene 3.6 soon becoming the last featureful 3.x release and the release of 4.0 slowly drawing near, I thought it might be useful to recap what the various versions mean to you, the user, and why two very different versions are soon going to be made available. A Brief History […]
Document Frequency Limited MultiTermQuerys
If you’ve ever looked at user generated data such as tweets, forum comments or even SMS text messages, you’ll have noticed that there are many variations in the spelling of words. In some cases they are intentional, such as omissions of vowels to reduce message length; in other cases they are unintentional typos and spelling […]
Apache Lucene & Solr 3.5.0
Just a little over two weeks ago, Apache Lucene and Solr 3.5.0 were released. The released artifacts can be found here and here respectively. As part of the Lucene project’s effort to do regular releases, 3.5.0 is another solid release providing a handful of new features and bug fixes. The following is a review of the […]
Analysing European Languages With Lucene
It seems more and more often these days that search applications must support a large array of European languages. Part of supporting a language is analysing words to find their stem or root form. An example of stemming is the reduction of the words “run”, “running”, “runs” and “ran” to their stem “run”. In the […]
Compromise is hard
Whenever I talk about my job with friends who are also IT professionals, the most commonly desired aspect is that I get to work in a community where everybody has a voice. Apache Software Foundation projects like Solr and Lucene tend to work from the motto that if it didn’t happen on the mailing list, it […]
The Lucene Sandbox
Few people are aware that Apache Lucene has been part of the ASF since 2001, becoming a Top Level Project in 2005. 10 years is an eternity in IT, where ideas tend to evolve in leaps and bounds. Over that 10-year period many users, contributors and committers have come and gone from Lucene, each […]
Hotspotting Lucene With Query Specialization
Most Java developers probably take the technology behind Hotspot™ for granted and assume it’s best suited to optimize their code. While they might know that choosing the best implementation of List will have a big impact on their program’s performance, they probably shy away from worrying about the cost of virtual method calls, believing Hotspot […]
The State and Future of Spatial Search
The release of Solr 3.1, containing Solr’s official spatial search support, has coincided with a new debate about the future of spatial search in Solr and Lucene. JTeam has been involved in the development of spatial search support for a number of years and we maintain our own spatial search plugin for Solr. Consequently this […]
SSP 1.0 Video Tutorial
Although SSP v1.0 has been replaced by the simpler 2.0 version, some of you out there are probably still using the 1.0 version. Because we like to provide as much assistance as we can to our users, we’ve decided to publish a video tutorial I created on how to configure and use SSP v1.0. It walks […]