Local Solr
Posted on June 12th, 2009
For searching at our Open Business Information Directory project we have been using a Solr server. Built on the Apache Lucene library, it consistently works well for the type of searches we need to do. Since many applications of business data will involve local searches, we looked at ways to implement distance algorithms. For a quick solution, we turned to LocalLucene/LocalSolr. This package adds distance search to Solr, so that we can send queries that match records within a certain radius of a point. LocalLucene is available from SourceForge.
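To make the idea of a radius search concrete, here is a minimal sketch in Python of the kind of distance check involved. It uses the haversine great-circle formula, which is my own choice for illustration and not necessarily how LocalLucene computes distances internally. The function names and the "lat"/"lng" record keys are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(records, lat, lon, radius_km):
    """Return the records within radius_km of (lat, lon), nearest first.

    Each record is assumed to be a dict with "lat" and "lng" keys
    (hypothetical field names).
    """
    hits = []
    for rec in records:
        d = haversine_km(lat, lon, rec["lat"], rec["lng"])
        if d <= radius_km:
            hits.append((d, rec))
    hits.sort(key=lambda pair: pair[0])
    return [rec for _, rec in hits]
```

Checking every record like this is fine for a handful of points but would be far too slow for a large index, which is a good part of why a ready-made package is attractive.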
Getting started takes some doing. First of all, we need to perform geocoding of our records. It is worth noting that not every record needs to be geocoded – ones that aren’t coded just won’t appear in proximity searches. To get up and running quickly, we used a postal code database to get a rough location of the records that weren’t already geocoded.
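The postal code fallback can be sketched roughly as follows. This is an illustration only, not our actual import code: the field names ("lat", "lng", "postal_code"), the function names and the lookup table are all hypothetical, and the sample coordinates are placeholders rather than real data.

```python
def normalize_postal_code(code):
    """Uppercase and strip spaces so 'k1a 0b1' and 'K1A0B1' compare equal."""
    return code.replace(" ", "").upper()


def locate(record, postal_centroids):
    """Return (lat, lng) for a record, or None if it cannot be placed.

    Prefers coordinates already on the record; otherwise falls back to the
    rough centre of its postal code. Records that return None are simply
    left out of proximity searches.
    """
    if record.get("lat") is not None and record.get("lng") is not None:
        return (record["lat"], record["lng"])
    code = record.get("postal_code")
    if code:
        return postal_centroids.get(normalize_postal_code(code))
    return None


# Placeholder lookup table: postal code -> approximate (lat, lng).
postal_centroids = {"K1A0B1": (45.4, -75.7)}

print(locate({"name": "Example Co", "postal_code": "k1a 0b1"}, postal_centroids))
```

The result is only as precise as the postal code area, but that is good enough for a first pass at proximity searching.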
For the changes to the Solr installation, I referred to the helpful tutorial at GISSearch.com. Once the changes had been made, I reindexed our records. This was the longest part of the process: even though I only processed the Canadian records for this test, there were still over a million to go through. If the server hadn’t been in use, I could have shut it down, deleted the indexes and rebuilt them to save some time.
Along the way, a few problems came up. The first was that a version built from the latest sources didn’t work, so I had to revert to earlier stable versions. GISSearch offers an example package that includes a compiled Solr that works, so that is a good place to start if you are having issues there. The other big problem was a bug in the phps output writer that prevented the searches from running. Switching to xml or json output solves that.
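For reference, a geo query with the output writer switched to json might be built like this. Treat the details as assumptions: the qt=geo, lat, long and radius parameter names are how I remember the LocalSolr examples, so check them against your installed version, and the host and path are placeholders.

```python
from urllib.parse import urlencode


def build_geo_query_url(base_url, query, lat, lng, radius, writer="json"):
    """Build a LocalSolr-style radius query URL.

    Parameter names (qt=geo, lat, long, radius) are assumed from the
    LocalSolr examples; verify them against your installation.
    """
    if writer == "phps":
        # The phps writer was the one that failed for us, so refuse it here.
        raise ValueError("use 'xml' or 'json' instead of 'phps'")
    params = {
        "q": query,
        "qt": "geo",       # assumed name of the geo request handler
        "lat": lat,
        "long": lng,
        "radius": radius,  # units depend on the LocalSolr configuration
        "wt": writer,      # Solr response writer
    }
    return base_url + "?" + urlencode(params)


# Placeholder host and path.
print(build_geo_query_url("http://localhost:8983/solr/select",
                          "coffee shop", 45.42, -75.69, 10))
```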
Using Local Solr instead of writing our own solutions has saved a lot of development time. We still need to do some performance testing to see how it will hold up under heavy usage, but so far it looks like with a dedicated server for geo searching we will be able to keep up with the loads.
3 responses to “Local Solr”
-
Jason Judge August 27th, 2009 at 12:53
I’m interested in knowing more about this open business directory. Is the code going to be open-sourced, or is it just the data? Or perhaps neither – maybe the ‘open’ means something else?
– Jason
-
It’s the data that we want to open – allow it to be community edited, and available for use in other projects.
The code for searching the data is Lucene/Solr based, so there is already a good open source solution for that. If we happen to develop something interesting along the way for working with this we would release it, but this isn’t our focus.
Some of the tools that we develop for managing, backing up and editing the data may also be open sourced.
Edit: I should also add that there hasn’t been any activity on this project for a few months, but we hope to get back to it in the fall.
-
I’ve been running a similar concept in New Zealand for the last few years – an open business/location directory with CC Attribution licensed data. It’s pretty successful now with over 100,000 unique visitors a month (population 4 million in NZ!) and an active community.
I haven’t used Solr so far (Ferret + PostGIS), but I am currently looking at transitioning.