Friday, June 24, 2011

spatial intersections with Solr

The following describes a Solr query filter that determines whether axis-aligned geographic bounding boxes intersect. It is used to determine which documents in a Solr repository containing spatial data are relevant to a given rectangular geographic region that corresponds to a displayed map. I haven’t seen this described before and thought it might be useful to others.

OpenGeoPortal (http://geoportal-demo.atech.tufts.edu/) is a web application supporting the rapid discovery of GIS layers. It uses Solr to combine spatial, keyword, date and GIS datatype based searching. As a user manipulates its map, OpenGeoPortal automatically computes and displays relevant search results. This spatial searching requires the application to determine which GIS layers are relevant given the current extent of the map. Each Solr document includes spatial information about a single GIS layer. Specifically, it contains the center of the layer (in degrees latitude and longitude, stored as tdoubles) as well as the height and width of the layer (in degrees, stored as tdoubles). These values are precomputed from the bounding boxes of the layers during ingest.

To identify relevant layers our search algorithm looks for a separating axis (http://en.wikipedia.org/wiki/Separating_axis_theorem) between the current bounds of the map and the bounds of each layer. If a horizontal or vertical separating axis exists then the layer does not contain any information in the geographic area defined by the map. If neither separating axis exists then the layer intersects the map, and is included in the result set.

Identifying whether separating axes exist is relatively straightforward given two axis-aligned bounding boxes. In our case, one bounding box is defined by the map’s current extent and the other bounding box by a GIS layer. To determine if a vertical separating axis exists, one must determine if the absolute difference between the center longitude of the map and the center longitude of the layer is greater than the sum of the half-width of the map and the half-width of the layer. If so, a vertical separating axis exists. If not, a vertical separating axis does not exist. (See http://www.gamasutra.com/view/feature/3383/simple_intersection_tests_for_games.php?page=3 for a diagram.) Similarly, the presence of a horizontal separating axis can be computed using center latitudes and half-heights.
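The test described above can be sketched in a few lines of Python. This is only an illustration, not code from OpenGeoPortal: the Box class and its field names are hypothetical, and each box is assumed to be stored as a center plus half extents in degrees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box stored as a center plus half extents (hypothetical names)."""
    center_x: float     # center longitude, degrees
    center_y: float     # center latitude, degrees
    half_width: float   # half of the east-west extent, degrees
    half_height: float  # half of the north-south extent, degrees


def has_separating_axis(a: Box, b: Box) -> bool:
    """Return True if a vertical or horizontal line separates the two boxes."""
    # A vertical separating axis exists when the centers are farther apart in
    # longitude than the two half-widths combined.
    vertical = abs(a.center_x - b.center_x) > a.half_width + b.half_width
    # A horizontal separating axis is the same test on latitudes and half-heights.
    horizontal = abs(a.center_y - b.center_y) > a.half_height + b.half_height
    return vertical or horizontal


def intersects(a: Box, b: Box) -> bool:
    """Two axis-aligned boxes intersect exactly when no separating axis exists."""
    return not has_separating_axis(a, b)


map_box = Box(center_x=-71.0, center_y=42.3, half_width=0.3, half_height=0.3)
nearby_layer = Box(center_x=-71.2, center_y=42.4, half_width=0.1, half_height=0.1)
distant_layer = Box(center_x=-60.0, center_y=42.3, half_width=1.0, half_height=1.0)
```

Here intersects(map_box, nearby_layer) is true, while distant_layer is rejected because its center is about 11 degrees of longitude away, far more than the combined half-widths.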

It is possible to generate a Solr filter query that, for a specific map, keeps only the layers that have neither a horizontal nor a vertical separating axis. Naturally, this query is somewhat complicated. The query essentially counts the number of separating axes and, using !frange, eliminates layers that have a separating axis. In the following example, the map was centered on latitude 42.3, longitude -71.0 and had a width and height of 0.3 degrees. The schema defines the fields CenterX, CenterY, HalfWidth and HalfHeight.

fq={!frange l=1 u=2}
map(sum(
map(sub(abs(sub(-71.0,CenterX)),sum(0.3,HalfWidth)),0,360,1,0),
map(sub(abs(sub(42.3,CenterY)),sum(0.3,HalfHeight)),0,90,1,0)),
0,0,1,0)

The clauses that test each axis (e.g., sub(abs(sub(-71.0,CenterX)),sum(0.3,HalfWidth)) and sub(abs(sub(42.3,CenterY)),sum(0.3,HalfHeight))) return a positive number if a separating axis exists. Using a map function, this value is mapped to 1 if the separating axis exists and 0 if it does not. The Solr query checks for two separating axes and computes the number of such axes using sum. The total number of separating axes (which is 0, 1 or 2) is then mapped to the values 0 and 1. This final map returns 1 if there are no separating axes (that is, the bounding boxes intersect) or 0 if there is at least one separating axis (that is, the bounding boxes do not intersect). The outermost clause applies a frange function to eliminate those layers that do not intersect the current map.
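A filter query like this is easier to generate than to type by hand. The following Python sketch builds the same string from the map parameters and also mimics the map/sum logic locally so it can be checked without a Solr instance. The function names are hypothetical, and the sketch assumes the 0.3 passed to sum() in the query plays the role of the map's half extent, since it is added to HalfWidth and HalfHeight.

```python
def solr_map(x, lo, hi, target, default):
    """Mimic Solr's map(x,min,max,target,default): the range is inclusive."""
    return target if lo <= x <= hi else default


def build_intersection_fq(center_lon, center_lat, half_width, half_height):
    """Return the frange filter query for a map with the given center and half extents."""
    lon_gap = f"sub(abs(sub({center_lon},CenterX)),sum({half_width},HalfWidth))"
    lat_gap = f"sub(abs(sub({center_lat},CenterY)),sum({half_height},HalfHeight))"
    # Each inner map() yields 1 when its gap is non-negative (a separating axis).
    axes = f"sum(map({lon_gap},0,360,1,0),map({lat_gap},0,90,1,0))"
    # The outer map() turns "zero separating axes" into 1, everything else into 0.
    return "{!frange l=1 u=2}" + f"map({axes},0,0,1,0)"


def filter_value(center_lon, center_lat, half_width, half_height, layer):
    """Evaluate the same expression in Python for one layer document (a dict)."""
    lon_gap = abs(center_lon - layer["CenterX"]) - (half_width + layer["HalfWidth"])
    lat_gap = abs(center_lat - layer["CenterY"]) - (half_height + layer["HalfHeight"])
    axes = solr_map(lon_gap, 0, 360, 1, 0) + solr_map(lat_gap, 0, 90, 1, 0)
    return solr_map(axes, 0, 0, 1, 0)


fq = build_intersection_fq(-71.0, 42.3, 0.3, 0.3)
```

One detail the local version makes visible: because map() ranges are inclusive, a gap of exactly 0 (boxes that only touch along an edge) is counted as a separating axis, so such layers are filtered out.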

Ranking the layers that intersect the map is a separate issue. This is done with several query clauses. One clause determines how the area of the map compares to the area of the layer. The other determines how the center of the map compares to the center of the layer. These clauses are used in conjunction with keyword-based queries and date-based filters to create search results based on spatial, keyword and temporal constraints.

Mailing List

There's been some activity on the mailing list. You can visit the archives at http://groups.google.com/group/opengeoportal.

Thursday, March 10, 2011

Updating a Solr Index

Clearing out an existing Solr index when you want to update your schema is easy with curl.

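curl http://localhost:8080/solr/update --data-binary '<delete><query>*:*</query></delete>' -H 'Content-type:text/xml; charset=utf-8'
curl http://localhost:8080/solr/update --data-binary '<commit/>' -H 'Content-type:text/xml; charset=utf-8'

* Replace port 8080 with the port Solr is running on.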

1. Stop the servlet container.
2. Change the schema.xml.
3. Start the servlet container.

In the admin tool, run a *:* query to check the results.
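If you would rather script the clear-out than type curl commands, the same idea can be sketched in Python with only the standard library. This is a rough sketch under assumptions: it posts a delete-by-query message followed by a commit to Solr's XML update handler, and the helper names and the default base URL are made up for illustration.

```python
import urllib.request

# XML messages understood by Solr's update handler.
DELETE_ALL = b"<delete><query>*:*</query></delete>"
COMMIT = b"<commit/>"


def build_update_request(body, base_url="http://localhost:8080/solr"):
    """Build (but do not send) a POST to the update handler with an XML body."""
    return urllib.request.Request(
        url=f"{base_url}/update",
        data=body,
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


def clear_index(base_url="http://localhost:8080/solr"):
    """Delete every document, then commit so the deletion becomes visible."""
    for body in (DELETE_ALL, COMMIT):
        with urllib.request.urlopen(build_update_request(body, base_url)) as response:
            response.read()
```

Calling clear_index() against a running Solr performs both requests. Adjust base_url to match the port your servlet container uses.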

Tuesday, December 7, 2010

Plate Carree: Geoserver and ArcIMS Compatibility

Anyone trying to connect Geoserver to an ArcSDE dataset stored as (ESRI) EPSG:54001 will quickly find that not all Plate Carrees are created equal. Geoserver does not like ESRI's choice of ellipsoid, which means tweaking the parameters slightly. Follow these simple steps to make EPSG:54001 operational in Geoserver.

1. Edit ../webapps/geoserver/data/user_projections/epsg.properties in your Tomcat context.

2. Add a new line at the end of the file and append the following text as one line. Syntax is critical.

54001= PROJCS["WGS 84 / Plate Carree", GEOGCS["WGS 84", DATUM["World Geodetic System 1984", SPHEROID["WGS 84", 6378137.0, 298.257223563, AUTHORITY["EPSG","7030"]], AUTHORITY["EPSG","6326"]], PRIMEM["Greenwich", 0.0, AUTHORITY["EPSG","8901"]], UNIT["degree", 0.017453292519943295], AXIS["Geodetic longitude", EAST], AXIS["Geodetic latitude", NORTH], AUTHORITY["EPSG","4326"]], PROJECTION["Equidistant Cylindrical (Spherical)", AUTHORITY["EPSG","9823"]], PARAMETER["central_meridian", 0.0], PARAMETER["latitude_of_origin", 0.0], PARAMETER["standard_parallel_1", 0.0],PARAMETER["false_easting", 0.0],PARAMETER["false_northing", 0.0],UNIT["m", 1.0],AXIS["Easting",EAST],AXIS["Northing", NORTH],AUTHORITY["EPSG","54001"]]

3. Restart Geoserver.

4. In your Geoserver Admin page select "Demos" and click on the "SRS List" link.

5. Search for either "54001" or "Plate Carree" and view the results.

Your projection should be in this list. Remember to keep an eye on the Geoserver log for errors.
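Since a stray bracket or line break in the epsg.properties entry is easy to miss, a quick sanity check before restarting can save a round trip. The Python sketch below is a hypothetical helper, not part of Geoserver. It only checks that the entry is a single line of the form code=WKT with balanced square brackets and double quotes; it does not validate the projection itself.

```python
def check_wkt_entry(line: str) -> list[str]:
    """Return a list of problems found in one epsg.properties entry (empty if none)."""
    problems = []
    entry = line.strip()
    if "\n" in entry:
        problems.append("entry spans more than one line")
    code, sep, wkt = entry.partition("=")
    if not sep or not code.strip().isdigit():
        problems.append("missing numeric code before '='")
    depth = 0
    in_quotes = False
    for ch in wkt:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            # Brackets inside quoted names are ignored.
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
                if depth < 0:
                    problems.append("unexpected ']'")
                    break
    if depth > 0:
        problems.append("unclosed '['")
    if in_quotes:
        problems.append("unbalanced double quote")
    return problems


# A shortened, made-up entry used only to exercise the checker.
sample = '54001=PROJCS["WGS 84 / Plate Carree", UNIT["m", 1.0], AUTHORITY["EPSG","54001"]]'
```

An empty list from check_wkt_entry means the basic syntax looks sane; the Geoserver log remains the real authority on whether the definition was accepted.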

Thursday, October 14, 2010

7,000+ Active Geoserver Layers?

In considering our objective of adding map data from remote sites via WMS, it occurs to me that such a model could fail in certain situations. The example that sticks out is data set volume. With 7,000+ data sets to publish at Harvard, there are bound to be circumstances where a remote WMS request will return empty data because a Geoserver instance has lost its connection to ArcSDE (or a coverage store has become corrupt). In cases like this we should think about intercepting the request, and if a particular data layer needs "correcting" in Geoserver, we should use the REST capabilities to correct connection issues.

Thursday, July 29, 2010

Use Java to Add ArcSDE Data Layers to Geoserver Using REST

I was scouring the web for examples of how to implement Geoserver's REST API in Java to add data layers dynamically. I was able to use curl to successfully add data layers but I wanted to make this functionality accessible via Java without having to use .exec() to do the work. What I found were some examples using the open source Jersey Reference Implementation for building RESTful Web services. A description of the project is here and the jars needed to run the following code are here.

This is a simple implementation, without the components that add metadata to fully describe the layer (to come later). The original code came from Jon Britton and was posted here.



An excellent alternative to this approach, with no dependencies, is GSRCJ.

Monday, July 12, 2010

Exploring Solr

Over the weekend, I read some of "Solr 1.4 Enterprise Search Server". I learned about a few more features. When multiple keywords are provided, Solr does the right thing. Documents that match more of the words tend to get a higher score. The rarer a word is, the more a hit on it is worth. Solr provides built-in support for both stemming and paging.

I have defined a Solr schema and ingested some XML-formatted data using curl (curl http://localhost:8983/solr/update -F stream.file=/tmp/sampleData.xml). Note that the absolute path of the file must be provided. The added data will not be searchable until a commit is performed.

The data can be searched using the Solr admin tool or by providing a URL. The URL to search for all the data is http://localhost:8983/solr/select/?q=*%3A*, where %3A is the code for :.
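When building search URLs in code, it is safer to let a library do the percent-encoding. The short Python sketch below reproduces the URL above. The select_url helper is hypothetical, and the title field in the second example is an assumption made only for illustration, not something taken from our schema.

```python
from urllib.parse import quote


def select_url(query="*:*", base_url="http://localhost:8983/solr"):
    """Return a Solr select URL with the query percent-encoded."""
    # safe="*" keeps the wildcard readable; ':' still becomes %3A.
    return f"{base_url}/select/?q={quote(query, safe='*')}"


all_docs_url = select_url()                # match every document
field_url = select_url("title:roads")      # hypothetical field query
```

Here all_docs_url comes out as http://localhost:8983/solr/select/?q=*%3A*, matching the hand-written URL.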

The current schema is not at all complete. Layer coordinates are simply being stored as a bounding box where each coordinate is of type tdouble. We should probably consider ingesting the XML metadata directly. However, one item lacking from the XML is document boost.