Thursday, July 29, 2010

Use Java to Add ArcSDE Data Layers to Geoserver Using REST

I was scouring the web for examples of how to use Geoserver's REST API from Java to add data layers dynamically. I was able to add data layers successfully with curl, but I wanted to make this functionality accessible via Java without having to use .exec() to do the work. What I found were some examples using the open source Jersey Reference Implementation for building RESTful Web services. A description of the project is here and the jars needed to run the following code are here.

This is a simple implementation. It does not yet include the components that add metadata to fully describe the layer (those will come later). The original code came from Jon Britton and was posted here.
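For readers who want to see the shape of such a call without Jersey, here is a minimal sketch that uses only the JDK's HttpURLConnection. It is not Jon Britton's code. It assumes that an ArcSDE data store is already configured in Geoserver, that new feature types are created by POSTing a small XML document to /rest/workspaces/{workspace}/datastores/{store}/featuretypes, and that HTTP basic authentication is in use. The class name, method names, workspace, store and layer names are all made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Geoserver or Jersey.
class GeoServerLayerSketch {

    // REST resource under which feature types of an existing data store are created.
    static String featureTypesUrl(String baseUrl, String workspace, String dataStore) {
        return baseUrl + "/rest/workspaces/" + workspace
                + "/datastores/" + dataStore + "/featuretypes";
    }

    // Minimal payload: only the name of the table to publish, no extra metadata.
    static String featureTypeXml(String layerName) {
        return "<featureType><name>" + layerName + "</name></featureType>";
    }

    // Value for the HTTP "Authorization" header (basic authentication).
    static String basicAuth(String user, String password) {
        byte[] token = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(token);
    }

    // Sends the POST and returns the HTTP status code.
    // Assumption: Geoserver answers 201 (Created) when the layer is added.
    static int addLayer(String baseUrl, String workspace, String dataStore,
                        String layerName, String user, String password) throws IOException {
        URL url = new URL(featureTypesUrl(baseUrl, workspace, dataStore));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml");
            conn.setRequestProperty("Authorization", basicAuth(user, password));
            try (OutputStream out = conn.getOutputStream()) {
                out.write(featureTypeXml(layerName).getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call might look like GeoServerLayerSketch.addLayer("http://localhost:8080/geoserver", "sde", "arcsde_store", "ROADS", user, password), where every argument is a placeholder for your own installation.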



An excellent alternative to this approach, with no dependencies, is GSRCJ.

Monday, July 12, 2010

Exploring Solr

Over the weekend, I read some of "Solr 1.4 Enterprise Search Server" and learned about a few more features. When multiple keywords are provided, Solr does the right thing. Documents that match more of the words tend to get a higher score. The rarer a word is, the more a hit on it is worth. Solr provides built-in support for both stemming and paging.

I have defined a Solr schema and ingested some XML formatted data using curl (curl http://localhost:8983/solr/update -F stream.file=/tmp/sampleData.xml). Note that the absolute path of the file must be provided. The added data will not be searchable until a commit is performed.
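The commit step can also be done from Java. The sketch below assumes the usual Solr 1.4 convention of POSTing a `<commit/>` message as text/xml to the same update URL used for the ingest. The class and method names are my own, and the base URL is the default local one from the curl example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class SolrCommitSketch {

    // Update handler URL, e.g. http://localhost:8983/solr/update
    static String updateUrl(String solrBase) {
        return solrBase + "/update";
    }

    // XML message that asks Solr to make pending documents searchable.
    static String commitXml() {
        return "<commit/>";
    }

    // POSTs an XML message to the given URL and returns the HTTP status code.
    static int postXml(String url, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(xml.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Convenience wrapper: commit everything added so far.
    static int commit(String solrBase) throws IOException {
        return postXml(updateUrl(solrBase), commitXml());
    }
}
```

Calling SolrCommitSketch.commit("http://localhost:8983/solr") after the ingest should make the new documents show up in searches, assuming Solr is running at that address.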

The data can be searched using the Solr admin tool or by providing a URL. The URL to search for all the data is http://localhost:8983/solr/select/?q=*%3A*, where %3A is the URL-encoded form of the colon (:).
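When building such URLs in Java, the encoding does not have to be done by hand. Here is a small sketch using the JDK's URLEncoder; the class and method names are made up, and the base URL is the local default used above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only.
class SolrQueryUrlSketch {

    // Builds a select URL, URL-encoding the query so ':' becomes %3A.
    static String selectUrl(String solrBase, String query) {
        try {
            return solrBase + "/select/?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java runtime.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, SolrQueryUrlSketch.selectUrl("http://localhost:8983/solr", "*:*") produces the match-everything URL shown above.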

The current schema is not at all complete. Layer coordinates are simply being stored as a bounding box where each coordinate is of type tdouble. We should probably consider ingesting the XML meta data directly. However, one item lacking from the XML is document boost.
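To make the bounding box and boost ideas concrete, here is a sketch of ingest code that generates a Solr update message for one layer. It relies on the Solr XML update format, where a document boost is an attribute on the doc element. The field names (LayerId, MinX, MinY, MaxX, MaxY) are placeholders and not our actual schema, and the class is hypothetical.

```java
// Hypothetical ingest helper; field names are placeholders, not the real schema.
class SolrDocSketch {

    // Escapes the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // One <field> element of a Solr document.
    static String field(String name, String value) {
        return "<field name=\"" + name + "\">" + escape(value) + "</field>";
    }

    // Builds an <add> message for a layer with a bounding box and a document boost.
    static String layerDoc(String layerId, double minX, double minY,
                           double maxX, double maxY, double boost) {
        StringBuilder sb = new StringBuilder();
        sb.append("<add><doc boost=\"").append(boost).append("\">");
        sb.append(field("LayerId", layerId));
        sb.append(field("MinX", Double.toString(minX)));
        sb.append(field("MinY", Double.toString(minY)));
        sb.append(field("MaxX", Double.toString(maxX)));
        sb.append(field("MaxY", Double.toString(maxY)));
        sb.append("</doc></add>");
        return sb.toString();
    }
}
```

Generating the message in code like this is one way to supply a boost even though the source meta data has none; how the boost value should be chosen is still an open question.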

Friday, July 9, 2010

OpenLayers, Google Maps and versions

The current OpenLayers release, 2.9, is not compatible with Google Maps v3. This issue is being worked on and there are patches for v3 compatibility. The details are at http://trac.openlayers.org/ticket/2493. v3 compatibility is slated for OpenLayers 2.10, the next release. It appears OpenLayers releases new versions roughly once a year. Since 2.9 was released in April 2010, we can't expect to receive the 2.10 release this summer. (You can follow the 2.10 release at http://trac.openlayers.org/wiki/Release/2.10.) Google Maps v2 has been deprecated. It may be available for only three more years. For OpenGeoPortal, we can use Google Maps v3 with the already nearly complete v3 patches to 2.9. Or we can use Google Maps v2 with the standard 2.9 OpenLayers release. Then in 2011 or 2012 or so we can upgrade to 2.10 and v3.

Search Options

Search is a critical element for OpenGeoPortal. Results must be properly ranked, complete and returned quickly. There are two approaches we can take. The traditional solution uses SQL. The layer meta data is put into a relational database. SQL queries run against the table and the results are displayed. Often the SQL query, using "ORDER BY", ranks the layers and determines the order in which layers are displayed. Another potential approach is to use more modern search technology. There are open source solutions (Lucene and Solr) we might integrate.

What are the advantages of using Solr/Lucene? It has built-in support for GIS data, including geodetic coordinates, geohashes, bounding boxes and spatial hierarchies. Distances can be calculated in several ways, including Euclidean, great circle and Manhattan. It has built-in support for advanced search features. Synonyms and misspellings are added by putting them into a configuration file. Likewise, words to ignore can be added by editing another configuration file. Ranking supports different weights on both specific layers and individual meta data fields. Weights are modified via configuration files, not by changing code. Results can be both ranked and grouped. Results are available in multiple formats including XML and JSON.
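For anyone unfamiliar with the three distance measures mentioned above, here is a small sketch of how they differ. This is not Solr's implementation, only the textbook formulas. The great circle version uses the haversine formula on a spherical Earth with an assumed mean radius of 6371 km, and the class name is made up.

```java
// Textbook distance formulas for illustration; not Solr's own code.
class DistanceSketch {

    // Assumed mean Earth radius in kilometers (spherical model).
    static final double EARTH_RADIUS_KM = 6371.0;

    // Straight-line distance in a flat plane.
    static double euclidean(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    // "City block" distance: sum of the absolute differences per axis.
    static double manhattan(double x1, double y1, double x2, double y2) {
        return Math.abs(x2 - x1) + Math.abs(y2 - y1);
    }

    // Great circle distance in km between two lat/lon points given in degrees (haversine).
    static double greatCircleKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        // Clamp to guard against tiny negative values from rounding.
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(Math.max(0.0, 1.0 - a)));
        return EARTH_RADIUS_KM * c;
    }
}
```

Euclidean and Manhattan distances make sense for projected (planar) coordinates, while the great circle distance is the one to use for latitude and longitude.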

The biggest disadvantage is that somebody has to learn a lot about Solr and write some ingest code. I think I'm up for that.

Grant Ingersoll wrote a nice paper discussing using Solr with GIS data.

A few parting comments. First, people building high-end search solutions today don't look to SQL like they used to. Search solutions often include data repositories optimized for search, rather than relying on legacy data stores designed for transaction-based read/write operations. That makes me wonder whether we should build our search solution on a SQL database. Second, I think the days are numbered for people creating their own search solution. Maybe we're not quite there yet, but over time, programmers will increasingly rely on integrating existing search solutions. It happened for hashtables and data repositories, and it is now happening for search.

What do you think? Should we consider it? Is the technology mature enough? Does its Java/Tomcat infrastructure make it easy enough for us to deal with and integrate? Do we envision a search that relies on ESRI SDE that could not be replicated in Solr?

Thursday, July 8, 2010

Screen Shots

Our UI staff has created a design. Unless it changes, what we build will be very close to the design.

The UI is dominated by a map (on the right) and a tabbed panel (on the left). The panel contains 4 tabs. Initially, the "Getting Started" tab is displayed.




The second tab is "Search". From it the user can execute searches based on keywords and/or the geographic extents. The following image also shows how search results are displayed.



The advanced search lets the user provide many different search parameters. The display of search results matches the basic search.



Layers that appear in the search results can be displayed on the map. Displayed layers are listed on the "Saved Layers" tab. From this tab the user can modify graphics parameters, change layer drawing order and download layers.

Thursday, July 1, 2010

JavaScript Table Library

The search results will be displayed in a table. Naturally, we need a dynamic table that supports sorting based on any column. Currently, we plan on using DataTables.