The Data Import Handler is a popular method to import data into a Solr instance. It provides out-of-the-box integration with databases, XML sources, e-mails and documents. A Solr instance often has multiple sources, and the process of importing data is usually expensive in terms of time and resources. Meanwhile, if you make […]
Introducing Lucene Index Doc Values
From day one, Apache Lucene has provided a solid inverted index data structure and the ability to store text and binary chunks in stored fields. In a typical use case the inverted index is used to retrieve and score documents matching one or more terms. Once the matching documents have been scored, stored fields are loaded for the top N […]
Lucene PMC Otis Gospodnetić at Berlin Buzzwords 2011
Some of you might have attended Berlin Buzzwords 2011, yet again an awesome conference for people interested in topics around Search, Store and Scale. Besides awesome talks, we also had some student volunteers who interviewed some of the speakers. We have published these interviews along with the videos to give them the visibility they deserve. So […]
The Lucene Sandbox
Few people are aware that Apache Lucene has been part of the ASF since 2001, becoming a Top Level Project in 2005. 10 years is an eternity in IT, where ideas tend to evolve in leaps and bounds. Over that 10-year period many users, contributors and committers have come and gone from Lucene, each […]
Hotspotting Lucene With Query Specialization
Most Java developers probably take the technology behind Hotspot™ for granted and assume it’s best suited to optimize their code. While they might know that choosing the best implementation of List will have a big impact on their program’s performance, they probably shy away from worrying about the cost of virtual method calls, believing Hotspot […]
Apache Lucene in Google Summer of Code – The Apache Way
In 2011 Google invited open source projects around the globe to its 7th Google Summer of Code: “Google Summer of Code is a global program that offers student developers stipends to write code for various open source software projects. We have worked with several open source, free software, and technology-related groups to identify and fund several […]
Gather content for Lucene from WordPress using Groovy
I am learning about the capabilities of Lucene. Here at JTeam we have a few people who specialize in search using technologies like Lucene and Solr. Therefore I want to gain a deeper knowledge of Lucene than I have now. So I started reading the Lucene in Action book. As I read […]
Search Result Grouping / Field Collapsing in Lucene / Solr
Grouping of search results, also known as field collapsing, is often a requirement for search projects. As described earlier, this functionality was added to Solr and happens to be one of the most wanted features in Solr. Recently result grouping was added to Lucene as a contrib in Lucene 3.1 and as a module in 4.0. […]
The State and Future of Spatial Search
The release of Solr 3.1, containing Solr’s official spatial search support, has coincided with a new debate about the future of spatial search in Solr and Lucene. JTeam has been involved in the development of spatial search support for a number of years and we maintain our own spatial search plugin for Solr. Consequently this […]
Indexing your Samba/Windows network shares using Solr
Many of JTeam’s clients want to search the content of their existing network shares as part of their Enterprise Search infrastructure. Over the last couple of years, more and more people have been switching to Apache Lucene / Solr as their preferred open source search solution. However, many still have the misconception that it is not […]