Friday, June 24, 2011

spatial intersections with Solr

The following describes a Solr query filter that determines if axis aligned geographic bounding boxes intersect. It is used to determine which documents in a Solr repository containing spatial data are relevant to a given rectangular geographic region that corresponds to a displayed map. I haven’t seen this described before and thought it might be useful to others.

OpenGeoPortal (http://geoportal-demo.atech.tufts.edu/) is a web application supporting the rapid discovery of GIS layers. It uses Solr to combine spatial, keyword, date and GIS datatype based searching. As a user manipulates its map OpenGeoPortal automatically computes and displays relevant search results. This spatial searching requires the application to determine which GIS layers are relevant given the current extent of the map. Each Solr document includes spatial information about a single GIS layer. Specifically, it contains the center of the layer (in degrees latitude and longitude stored as tdoubles) as well as the height and width of the layer (in degrees stored as a tdouble). These values are precomputed from the bounding boxes of the layers during ingest.

To identify relevant layers our search algorithm looks for a separating axis (http://en.wikipedia.org/wiki/Separating_axis_theorem) between the current bounds of the map and the bounds of each layer. If a horizontal or vertical separating axis exists then the layer does not contain any information in the geographic area defined by the map. If neither separating axis exists then the layer intersects the map, and is included in the result set.

Identifying whether separating axes exists is relatively straightforward given two axis-aligned bounding boxes. In our case, one bounding box is defined by the map’s current extent and the other bounding box by a GIS layer. To determine if a vertical separating exists one must determine if the difference between the center longitude of the map and the center longitude of the layer is greater then the sum of the width of the map with the width of the layer. If so, a vertical separating axis exists. If not, a vertical separating axis does not exist. (See http://www.gamasutra.com/view/feature/3383/simple_intersection_tests_for_games.php?page=3 for a diagram.) Similarly, the presence of a horizontal separating can be computed using center latitudes and heights.

It is possible to generate a Solr filter query that filters layers that contain neither a horizontal or vertical separating axis given a specific map. Naturally, this query is somewhat complicated. The query essentially counts the number of separating axes and, using !frange, eliminates layers that have a separating axis. In the following example, the map was centered on latitude 42.3, longitude -71.0 and had a width and height of 0.3 degrees. The schema defines the fields CenterX, CenterY, HalfWidth and HalfHeight.

fq={!frange l=1 u=2}
map(sum(
map(sub(abs(sub(-71.0,CenterX)),sum(0.3,HalfWidth)),0,360,1,0),
map(sub(abs(sub(42.3,CenterY)),sum(0.3,HalfHeight)),0,90,1,0)),
0,0,1,0)

The clauses that check for bounding box (e.g., sub(abs(sub(-71.0,CenterX)),sum(0.3,HalfWidth)) and sub(abs(sub(42.3,CenterY)),sum(0.3,HalfHeight))) return a positive number if a separating axis exists. Using a map function, this value is mapped to 1 if the separating axis exists and 0 if it does not. The Solr query checks for two separating axes and computes the number of such axes using sum. The total number of separating axes (which is 0, 1 or 2) is then mapped to the values 0 and 1. This final map returns 1 if there are no separating axes (that is, the bounding boxes intersect) or 0 if there is at least one separating axis (that is, the bounding boxes do not intersect). The outermost clause applies a frange function to eliminate those layers that do not intersect the current map.

Ranking the layers that intersect the map is a separate issue. This is done with several query clauses. One clause determines how the area of the map compares area of the layer. The other determines how the center of the map compares to the center of the layer. These clauses are used in conjunction keyword-based queries and date-based filters to create search results based on spatial, keyword and temporal constraints.

Mailing List

There's been some activity on the mailing list. You can visit the archives at http://groups.google.com/group/opengeoportal.