Google Maps with Google Earth Plugin

June 6th, 2008



Fig 1 - Google Map with Google Earth plugin -

Google announced a new GE plugin for use inside Google Maps:
http://code.google.com/apis/earth/documentation/

This is an interesting development since it allows Google Earth to be used inside a browser. Google’s Map object can be programmed through the JavaScript API, which gives the kind of user interaction control that was not available inside standalone Google Earth. The API documents have plenty of examples, but the simplest way to use the Google Earth Plugin is to load the plugin API JavaScript like this:

<script
	src="http://maps.google.com/maps?file=api&v=2.x&key=**************"
	type="text/javascript">
	google.load("earth", "1");
</script>

Then add a Map Type G_SATELLITE_3D_MAP to the Map control in the initialization code.

function initialize() {
	if (GBrowserIsCompatible()) {
		map = new GMap2(document.getElementById("map_canvas"));
		map.setCenter(new GLatLng(39.43551, -104.91207), 9);
		var mapControl = new GMapTypeControl();
		map.addControl(mapControl);
		map.addControl(new GLargeMapControl());
		map.addMapType(G_SATELLITE_3D_MAP);
	}
}

This adds a fourth map type, “Earth”, to the control shown over the Google map base.

Fig 2 - Google Map with Google Earth plugin showing map overlays

Now a user can switch to a GE type viewing frame with full 3D camera action. Unfortunately the map type control is hidden while in Earth view, so returning to a Map view requires an additional button and a piece of JavaScript code:

function resetMapType(evt){
	map.setMapType(G_NORMAL_MAP);
}

In order to show how useful this might be I added a button to read kml from a url:

function LoadKML(){
	var geoXml = new GGeoXml(document.getElementById('txtKML').value);
	GEvent.addListener(geoXml, 'load', function() {
		if (geoXml.loadedCorrectly()) {
			geoXml.gotoDefaultViewport(map);
			document.getElementById("status").innerHTML = "";
		}
	});
	map.addOverlay(geoXml);
	document.getElementById("status").innerHTML = "Loading...";
}

Now I simply coded up a servlet to proxy PostGIS data sources into kml for me, and I can add map overlays to my heart’s content. If I want to be a bit more SOA, it would be simple to configure a Geoserver FeatureType and let Geoserver produce the kml for me.

My LoadKML script lets me copy any url that produces kml into a text box, which then loads the results into the Google Map object. With the GE plugin enabled I can view my kml inside a GE viewer, inside my browser. By stacking these overlays onto the map I can see multiple layers. The JavaScript API gives me pretty complete control of what goes on. However, there are still some rough edges. In addition to overwriting the map control that would allow the user to click back to a map, satellite, or hybrid view, there are some very odd things going on with the kml description balloons. Since I’m using the IE8 beta I can’t really say whether this is a universal oddity or a glitch specific to IE8. After all, IE8 beta on Vista does strange things to the Google Maps website, making it more or less unusable.

Here are some items I ran across in the little bit of experimentation I’ve done:

  • plugin loading is slow and doesn’t appear to be cached
  • returning from Earth view requires javascript code
  • click descriptions are only available on point placemarks
  • the balloon descriptions show up only sometimes in an earth view
  • There appears to be a limit on the number of kml features which can be added. Over 5000 seems to choke

The rendering in the new Google Earth plugin view is quite useful and provides at least a subset of kml functionality. This evolution clearly shows the advantage of a competitive market. The Microsoft-Google competition significantly speeds the evolution of browser map technology. Microsoft is approaching this same type of browser merged capability with their pre-announcement of Virtual Earth elements inside Silverlight. 3D buildings, Street View, Deep Zoom, Photosynth, Panoramio ... are all technologies racing into the browser. Virtual parallel worlds are fascinating, especially when they overlap the real world. Kml feeds, map games, and live cameras coupled with GPS streams seem to be transforming map paradigms into more or less virtual life worlds.

GIS savvy developers already have a wealth of technology to expose into user applications. Many potential users, though, are still quite unaware of the possibilities. The ramp up of these new capabilities in the enterprise should make business tools very powerful, if not downright entertaining!

More Google Earth - Time Animation

June 3rd, 2008



Fig 1 - Google Earth Time Animation tool visible in the upper part of the view frame -

I was out of town for a trip back to Washington DC the last couple of weeks, but now that I’m back I wanted to play with some more KML features in Google Earth 4.3. One of the more interesting elements offered by KML inside Google Earth is the use of timestamps for animated sequences.
KML offers a couple of time elements:

<TimeStamp>

<TimeStamp>
	<when>2002-09-27T21:44:41.087Z</when>
</TimeStamp>

<TimeSpan>

<TimeSpan>
	<begin>2002-09-27T21:44:41.087Z</begin>
	<end>2002-09-27T21:45:41.087Z</end>
</TimeSpan>

By attaching a time stamp element to kml rendered elements it is possible to make use of the built in Google Earth time animation tool. I have a table of GMTI events for a few simulated missions that is ideal for this type of viewing. The time stamp deltas are in the millisecond range and work best as timestamp elements.

KML times use the XML Schema dateTime form (YYYY-MM-DDThh:mm:ssZ, with optional fractional seconds), which translates to a Java formatter:

SimpleDateFormat timeout = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

In KML the T char delimits date from time, while the Z indicates UTC. In my case using simulated mission data I am more interested in delta time effects than actual time.
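One detail worth spelling out: the quoted 'Z' in the pattern is a literal character, not a time zone conversion, so the formatter has to be set to UTC or the output is local time mislabeled as UTC. A minimal sketch, assuming the event time arrives as epoch milliseconds; the class and method names are hypothetical, chosen only for illustration:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of any KML library.
final class KmlTime {
    /** Formats epoch milliseconds as a KML dateTime string in UTC. */
    static String format(long epochMillis) {
        // SimpleDateFormat is not thread safe, so a new instance is created per call.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        // Without this line the JVM default zone is used and the literal Z would be wrong.
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    /** Wraps the formatted time in a KML TimeStamp element. */
    static String timeStamp(long epochMillis) {
        return "<TimeStamp><when>" + format(epochMillis) + "</when></TimeStamp>";
    }
}
```

With this helper, KmlTime.format(0L) gives 1970-01-01T00:00:00.000Z regardless of the server's local time zone.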

By adding the appropriately formatted <TimeStamp> to each <Placemark> from the GMTI data set I can create a KML data stream with the time animation tool enabled.

<Placemark id="gmti_target.80916">
        <TimeStamp>
          <when>2002-09-27T21:00:06Z</when>
        </TimeStamp>
         <description><![CDATA[<table border='1'>
           <tr>
             <th colspan='8' scope='col'>gmti_target</th>
           </tr>
           <tr>
             <td>targetid</td>
             <td>missionid</td>
             <td>dwellid</td>
             <td>name</td>
             <td>time</td>
             <td>the_geom</td>
             <td>classification</td>
             <td>dwellindex</td>
             <td>trackindex</td>
             <td>geom</td>
           </tr>
           <tr>
             <td>78840</td>
             <td>16</td>
             <td>135417</td>
             <td>gh1_v4bt.cgmti</td>
             <td>75606708</td>
             <td></td>
             <td>Unknown, Simulated Target</td>
             <td>1476</td>
             <td>0</td>
             <td>SRID=4269;POINT(-111.89313294366 35.1604509260505 2180)</td>
           </tr>
           </table>
           ]]></description>
         <LookAt>
            <longitude>-111.89313294366002</longitude>
            <latitude>35.160450926050544</latitude>
            <range>700</range>
            <tilt>10.0</tilt>
            <heading>10.0</heading>
         </LookAt>
         <styleUrl>#Stylegmti_target.634</styleUrl>
         <Point>
            <coordinates>-111.89313294366002,35.160450926050544</coordinates>
         </Point>
      </Placemark>

For this experiment I used a mission with a relatively small number of gmti targets, about 6000 points. Even at that size the load is a bit slow. To boost performance I switched the servlet response from a plain kml stream to a zipped kmz stream.

response.setContentType("application/vnd.google-earth.kmz");
ZipOutputStream out = new ZipOutputStream(response.getOutputStream());
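A kmz is an ordinary zip archive, so the ZipOutputStream needs an entry opened before any kml is written; by convention the main entry is named doc.kml. A minimal sketch, written against a generic OutputStream so that response.getOutputStream() can be passed in; the class and method names are hypothetical:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only.
final class KmzWriter {
    /** Writes the KML text as a single "doc.kml" entry of a zip (kmz) stream. */
    static void write(String kml, OutputStream target) throws IOException {
        // try-with-resources finishes the zip directory and closes the target stream.
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            zip.putNextEntry(new ZipEntry("doc.kml"));
            zip.write(kml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
    }
}
```

Text heavy kml with repeated description tables compresses very well, which is consistent with the size reduction I saw.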

This helps with download time by reducing the data stream from 5MB to 0.25MB. However, the biggest latency on my system is the Google Earth load and rendering rather than the download. I’m using a medium 5Mbps DSL connection here on an Intel Core2 Quad CPU Q6600 2.39GHz. The download of the kmz is about 5sec, but the Google Earth ingest and rendering, complete with blanked viewport, is around 50sec.

Once rendered the point icons are available for time sequence animation. Google Earth furnishes a built in tool that automates this process. The time tool includes option settings for adjusting speed, time range view, a setting to keep or discard points from the beginning, as well as repeat, stop, or reverse selection for end of time range event. The animation is helpful for visual analysis of target association. The tool provides step and slider views as well as animation.


Fig 2 - Google Earth time animation

The built in tools provided by Google Earth are quite helpful. Virtual Earth or WPF tools can be used for time sequence as well but require writing the necessary client javascript or C# to step through the set of elements in sequence. Having the tool already available makes simple view animation much easier to create. The potential for customization and interaction is a bit limited, but many applications are well served by the prebuilt viewing tools provided by default in GE. The large terrain and imagery infrastructure behind Google Earth is a real asset to viewing. Additional high resolution imagery can be added as GroundOverlay elements for enhancing specific gmti view areas.

For classification and analysis the simple view capability is somewhat limited. What is needed is a mode of selection to group and classify point data to create track associations. This requires more interactive events than afforded by GE. WPF is more difficult to work with, but the ability to add a variety of selection tools and change classification interactively may make the effort worthwhile from a GMTI analyst’s point of view. Any background imagery, terrain, or mapping layers need to be added to the WPF UI so there would be quite a bit more effort involved.


Fig 3 - Google Earth time animation viewed from above

GMTI is only one of a number of time sequence vehicle tracking data resources. The increasing use of GPS and fleet tracking software makes these types of data sets fairly common. Timestamp animation is a nice addition to the viewing capability of historical tracks. Live tracks generally retain a history tail as well, so a live tracking database could also make good use of the timestamp element. NetworkLinkControl refresh events can keep the track synchronized with the live data feeds.

The lack of interactive UI capability in Google Earth limits its use for operator classification and analysis. However, GE is just one viewing possibility. A UI system using a FOSS GIS stack such as PostGIS, Java, and Geoserver can be accessed in a number of ways through the browser. For example one browser tool could view the data set through a WPF UI, which allows operator reclassification and filtering using a variety of selection tools, while simultaneously a GE viewer with a NetworkLinkControl refreshed from the serverside backing datastore can be open from the same client. One aspect of the power of Browser based UIs is the ability to access multiple heterogeneous views simultaneously.


Fig 4 - Multiple browser views of gmti data source - WPF xbap UI and GE Time animation

Google Earth 4.3

April 22nd, 2008

Fig 1 - Google Earth Terrain with a USGS DRG from Terraservice as a GroundOverlay

I have to admit that I’ve been a bit leery of Google Earth. It’s not just the lurking licensing issues, or the proprietary application install, or even the lack of event listeners or an accessible api. If I’m honest I have to admit I’m a bit jealous of the large infrastructure, the huge data repository, and the powerfully fun user interface. So this weekend I faced my fears and downloaded the current version, Google Earth 4.3.

I’m not really interested in the existing Google Earth stuff. There are lots of default layers available, but I want to see how I can make use of the cool interface and somehow adapt it as a control for my stuff. The easiest route to customization in GE is KML. KML was developed for XML interchange of Keyhole views before Google bought Keyhole and turned it into Google Earth. Keeping the KML acronym, Keyhole Markup Language, does avoid a conflict: a Google Markup Language would collide with the other GML, the OGC’s Geography Markup Language.

KML 2.2 has evolved much further than a simple interchange language with some features that can be adapted to customized GE applications. After looking over the KML 2.2 reference I started with a simple static KML file. I have a USGS topo index table in PostGIS that I wished to view over the Google terrain. PostGIS includes an AsKML function for converting Geometry to KML. Using Java I can write a JDBC query and embed the resulting geometry AsKML into a KML document. However adding a WMS layer in between the PostGIS and GE seemed like a better approach. Geoserver gives you a nice SOA approach with built in KML export functionality as well as custom styling through sld.

It is easy to set up Geoserver as a WMS layer over the PostGIS database containing my USGS topo index table. Geoserver has also included a KML export format for the WMS framework. So I simply add my table to the Geoserver Data Featuretype list. Now I can grab out a KML document with a WMS query like this:
http://rkgeorge-pc:80/geoserver/wms?bbox=-104.875,39,-104.75,39.125&styles=&Format=kml&request=GetMap&layers=usgstile&width=500&height=500&srs=EPSG:4269

This is simple, and the resulting kml provides a basic set of topo polygons complete with a set of clickable attribute tables at the polygon center points. The result is not especially beautiful or useful, but it is interesting to tilt the 3D view of the GE terrain and see that the polygons are draped onto the surface. It is also handy to have access to the topo attributes with the icon click. However, in order to be useful I need to make my kml a bit more interactive.

The next iteration was to create a simple folder kml with a <NetworkLink>:

<kml xmlns="http://earth.google.com/kml/2.1">
     <Folder>
       <name>USGS Topo Map Index</name>
       <description>
        <![CDATA[
          <h3>Map Tiles</h3>
          <p><font color="blue">index tile of 1:24000 scale <b> USGS topographic maps</b>
           1/8 degree x 1/8 degree coverage</font>
           <a href='http://topomaps.usgs.gov/'>
               more information>>
           </a>
          </p>
      ]]>
     </description>
      <NetworkLink>
        <name>WMS USGS quad tiles</name>
          <Link>
             <href>http://rkgeorge-pc/GoogleTest/servlet/GetWMS</href>
             <httpQuery>layer=usgstile</httpQuery>
             <viewRefreshMode>onStop</viewRefreshMode>
             <viewRefreshTime>1</viewRefreshTime>
             <viewFormat>BBOX=[bboxWest],[bboxSouth],[bboxEast],[bboxNorth]&amp;CAMERA=[lookatLon],[lookatLat]</viewFormat>
           </Link>
      </NetworkLink>
    </Folder>
</kml>

In this case the folder kml has a network link to a java servlet called GetWMS. The servlet handles building the GetMap WMS query with customized style sld loading so that now I can change my style as needed by editing a .sld file. The servlet then opens my usgstile query and feeds the resulting KML back to the kml folder’s <NetworkLink>. Since I have set the <viewRefreshMode> to onStop, each time I pan around the GE view I will generate a new call to the GetWMS servlet which builds a new GetMap call to Geoserver based on the current bbox view parameters.
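The parameter handling inside such a servlet can be sketched roughly as follows. This is a minimal sketch, not the GetWMS servlet itself: it assumes the BBOX value arrives as west,south,east,north, reuses the Geoserver host and layer from the URL shown earlier, and uses hypothetical class and method names.

```java
// Hypothetical helper for illustration only.
final class WmsKmlRequest {
    /** Parses "west,south,east,north" as sent by Google Earth in the BBOX parameter. */
    static double[] parseBbox(String bbox) {
        String[] parts = bbox.trim().split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected west,south,east,north: " + bbox);
        }
        double[] v = new double[4];
        for (int i = 0; i < 4; i++) {
            v[i] = Double.parseDouble(parts[i].trim());
        }
        // Views that cross the antimeridian are not handled in this sketch.
        if (v[0] >= v[2] || v[1] >= v[3]) {
            throw new IllegalArgumentException("empty or inverted bbox: " + bbox);
        }
        return v;
    }

    /** Builds a Geoserver GetMap request that returns kml for the given view box. */
    static String getMapUrl(String baseUrl, String layer, double[] b, int width, int height) {
        return baseUrl + "?request=GetMap&format=kml&styles="
                + "&layers=" + layer
                + "&bbox=" + b[0] + "," + b[1] + "," + b[2] + "," + b[3]
                + "&width=" + width + "&height=" + height
                + "&srs=EPSG:4269";
    }
}
```

Rejecting a malformed BBOX early matters here because Google Earth issues a request on every pan, and a bad value would otherwise turn into a failing Geoserver call each time.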

This is more interesting and adds some convenient refresh capability, but views at a national or even state level are quickly overwhelmed with the amount of data being generated. The next iteration adds a <Region> element with a <Lod> subelement:

<Region>
  <LatLonAltBox>
    <north>70.125</north>
    <south>18.875</south>
    <east>-66.875</east>
    <west>-160.25</west>
  </LatLonAltBox>
  <Lod>
    <minLodPixels>30000</minLodPixels>
    <maxLodPixels>-1</maxLodPixels>
  </Lod>
 </Region>

Now my refreshes only occur when zoomed in to a reasonable level of detail. The kml provides a layer of USGS topo tiles and associated attributes from my PostGIS table by taking advantage of the built in KML export feature of Geoserver’s WMS.
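To get a feel for what a minLodPixels of 30000 means, a rough estimate helps. As I read the KML reference, Google Earth compares the square root of the Region's projected screen area with minLodPixels. The sketch below assumes a flat, untilted view straight down and ignores projection distortion, so it is only an approximation; the names are hypothetical.

```java
// Hypothetical estimate for illustration only; flat, untilted view assumed.
final class LodEstimate {
    /** Screen pixels per degree at which the region's projected size reaches minLodPixels. */
    static double pixelsPerDegreeAtThreshold(double north, double south,
                                             double east, double west,
                                             double minLodPixels) {
        double areaSquareDegrees = (north - south) * (east - west);
        // Projected size is taken as the square root of the on-screen area.
        return minLodPixels / Math.sqrt(areaSquareDegrees);
    }
}
```

For the box above this works out to roughly 430 pixels per degree, so a 1/8 degree quad is a little over 50 pixels wide on screen when the refreshes start.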


Fig 2 - Google Earth with <Region><Lod> showing USGS topo tiles

Unfortunately Google Earth 4.3 seems to disable the onStop refresh after an initial load of my usgstile subset once a Region element with Lod is added. In Google Earth 4.2 I didn’t run into this problem. The workaround appears to be a manual refresh of the menu subtree “WMS USGS quad tiles”. Once this refresh is done, pan and zoom refresh in the normally expected ‘onStop’ fashion. GE 4.3 is beta, and perhaps this behavior/“feature” will be changed for the final release.

Now I have a reasonable USGS tile query overlay. However, why just show the tiles? It is more useful to show the actual topo map. Fortunately the “web map of the future” from four or five years ago is still around at terraservice.net. TerraService is a Microsoft research project for serving large sets of imagery into the internet cloud. It was a reasonably successful precursor to what we see now as Google Maps and Virtual Earth. The useful thing for me is that one of the imagery layers this service provides as a WMS is DRG, Digital Raster Graphic. DRG is a seamless WMS of the USGS topo map scans in a pyramid using 1:250,000, 1:100,000, and 1:24,000 scale scanned paper topos. Anyone doing engineering, hiking, etc. before 2000 is probably still familiar with the paper topo series which, once upon a time, was the gold standard map interface. Since then the USGS has fallen on harder times, but before falling into utter obscurity they did manage to swallow enough new tech to produce the DRG map scans and let Microsoft load them into TerraService.net.

This means that the following url will give back a nice jpeg 1:24000 topo for Mt Champion in Colorado:
http://terraservice.net/ogcmap.ashx?version=1.1.1&request=GetMap&Layers=DRG&Styles=&SRS=EPSG:4326&BBOX=-106.5625,39.0,-106.5,39.0625&width=1000&height=1000&format=image/jpeg&Exceptions=se_xml

Armed with this bit of WMS capability I can add some more capability to my GoogleTest. First I added a new field to the usgstile database, which can be seen in the screen capture above. The new field simply makes a reference url to another servlet I wrote to build the terraservice WMS query and pull down the requested topo image. By setting the width and height parameters to 2000 I can get the full 1:24000 scale detail for a single 1/8 degree topo quad. My GetDRG servlet also conveniently builds the kml document to add the topo to my Google Earth menu:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
  <Folder>
    <name>38106G1  Buena Vista East</name>
    <description>DRG Overlay of USGS Topo</description>
    <GroundOverlay>
      <name>Large-scale overlay on terrain</name>
      <description>
        <![CDATA[
          <h3>DRG Overlay</h3>
          <p><font color="blue">index tile of 1:24000 scale <b> USGS topographic maps</b>
          1/8 degree x 1/8 degree coverage</font>
          <a href='http://topomaps.usgs.gov/'>more information>></a>
          </p>
        ]]>
      </description>
      <Icon>
        <href>http://terraservice.net/ogcmap6.ashx?version=1.1.1&amp;request=GetMap
           &amp;Layers=DRG&amp;Styles=&amp;SRS=EPSG:4326&amp;BBOX=-106.125,38.75,-106.0,38.875
           &amp;width=2000&amp;height=2000&amp;format=image/jpeg&amp;Exceptions=se_blank</href>
      </Icon>
      <LatLonBox>
        <north>38.875</north>
        <south>38.75</south>
        <east>-106.0</east>
        <west>-106.125</west>
        <rotation>0</rotation>
      </LatLonBox>
    </GroundOverlay>
  </Folder>
</kml>
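The kml building step of a servlet like GetDRG can be sketched as a small string template. This is a minimal sketch under stated assumptions rather than the servlet itself: the class and method names are hypothetical, only the GroundOverlay element is produced, and the ampersands of the WMS query string are escaped because they sit inside an XML document.

```java
// Hypothetical helper for illustration only.
final class DrgOverlay {
    /** Escapes the characters that are not allowed raw in XML text content. */
    static String xmlEscape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds the TerraService DRG GetMap request for one quad. */
    static String drgUrl(double west, double south, double east, double north, int size) {
        return "http://terraservice.net/ogcmap.ashx?version=1.1.1&request=GetMap"
                + "&Layers=DRG&Styles=&SRS=EPSG:4326"
                + "&BBOX=" + west + "," + south + "," + east + "," + north
                + "&width=" + size + "&height=" + size
                + "&format=image/jpeg&Exceptions=se_blank";
    }

    /** Wraps the image request in a GroundOverlay whose LatLonBox matches the BBOX. */
    static String groundOverlay(String name, double west, double south, double east, double north) {
        String href = xmlEscape(drgUrl(west, south, east, north, 2000));
        return "<GroundOverlay>"
                + "<name>" + xmlEscape(name) + "</name>"
                + "<Icon><href>" + href + "</href></Icon>"
                + "<LatLonBox>"
                + "<north>" + north + "</north><south>" + south + "</south>"
                + "<east>" + east + "</east><west>" + west + "</west>"
                + "</LatLonBox>"
                + "</GroundOverlay>";
    }
}
```

Using the same four numbers for the BBOX and the LatLonBox is what keeps the scanned topo registered to the terrain.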

The result is a ground clamped USGS topo over the Google Earth terrain. It is now available for some flying around at surface level in the cool GE interface. Also the DRG opacity can easily be adjusted to see the Google Earth imagery under DRG contours. The match up is pretty good but of course GE imagery will be more detailed at this stage. I also understand that Google is in the process of flying submeter LiDAR throughout areas of the USA, so we can expect another level in the terrain detail pyramid at some point as well.

This is all fun and hopefully useful for anyone needing to access USGS topo overlays. The technology pattern demonstrated though, can be used for just about any data resource.
1) data in an imagery coverage or a PostGIS database
2) Geoserver WMS/WCS SOA layer
3) KML produced by some custom servlets

This can be extremely useful. The Google Earth interface, with all of its powerful capability, is now available as a nice OWS interface for viewing OWS exposed layers, which can be proprietary or public. For example, here is a TIGER viewer using this same pattern:




Fig 3 - TIGER 2007 view of El Paso County Colorado

The Google Earth viewer does have a gotcha license, but it may be worth the license cost for the capability exposed. In addition, if Microsoft ever catches up to the KML 2.2 spec, recently released as a final standard by the OGC, this same pattern will work conveniently in Virtual Earth and consequently in a browser through a recently announced future Silverlight VE element. Lots of fun for view interfaces!

My next project is to figure out a way to use GE KML drawing capability to make a two way OWS using Geoserver transactional WFS, WFS-T.

Deep Zoom Easter Eggs?

March 14th, 2008

Easter rolls around early this year and the MIX08 Deep Zoom announcement made it just in time to try some experiments in the Easter Egg tradition. Using Deep Zoom Composer, images can be inserted at various locations within the zoom pyramid. This means full images of documents, html pages, schematics, data records, photos … can be inserted down at lower levels of a main image’s zoom. This was pointed out in Laurence Moroney’s recent blog posting, which showed an image inserted into the pupil of an eye in a framing picture. Looking at potential for this type of DeepZoom, here is an experiment from a geospatial perspective.



Fig 1 Denver USGS UrbanArea 0.25m zoomed in to make Coors Field visible

The SparseImageSceneGraph.xml generated by Deep Zoom Composer shows how position and insert scale are part of a SceneNode Element:

<SceneGraph version="1">
<AspectRatio>0.999999999999994</AspectRatio>
<SceneNode>
<FileName>C:\temp\Denver\denver5\source images\den8x8.jpg</FileName>
<x>0</x>
<y>0</y>
<Width>1</Width>
<Height>1</Height>
<ZOrder>1</ZOrder>
</SceneNode>
.
.
.
<SceneNode>
<FileName>C:\temp\Denver\denver5\source images\taylorbuchholz.jpg</FileName>
<x>0.59990935133569</x>
<y>0.584386453325495</y>
<Width>0.000326534539695684</Width>
<Height>0.000457282456000806</Height>
<ZOrder>8</ZOrder>
</SceneNode>
<SceneNode>
<FileName>C:\temp\Denver\denver5\source images\jeffbaker.jpg</FileName>
<x>0.590298544487025</x>
<y>0.565998027596921</y>
<Width>0.000436734620380339</Width>
<Height>0.000624543890228259</Height>
<ZOrder>9</ZOrder>
</SceneNode>
.
.
.
</SceneGraph>

This brings to mind the various easter egg items available in well known software packages. Interestingly aerial imagery lends itself to this type of feature since the spatial dimension provides a handy reference for hanging additional information.

As an experiment I’ve updated a Denver USGS Urban Area imagery tile with some easter egg information in the Deep Zoom approach. In this case zooming into Coors Field you will notice some small patches scattered around the field. Behind home plate are a couple of informational items from the Colorado Rockies Home page, including a roster list. Then panning around the field you will see various player cards at their field positions.

In each of these cases the image is available at sufficient scale to read the text. The scene graph will allow multiple depth image insertion, so I could go further, hiding some detail of a player’s life inside a dot on an individual player card.


Fig 2 - Zoomed in to see the location of additional SceneNode images


Fig 3 - Zooming further you can read the text and view additional information embedded in the map


Fig 4 - Finally zoomed to an individual player card at short stop.

So far I can see this as little more than a novelty, but perhaps it would have some uses. For example a sitemap builder could be enhanced to output a Deep Zoom image to allow moving around a multipage graph showing every page in a site. This type of tool would be a great way to QC a static html site (are there any of these still around?). Perhaps facility management would benefit as well, since personnel records could be associated with a desk inside an office, inside a building floor, inside an office campus, giving a secondary spatial way to review office space. Adding some papers to the desktop and then sending a deepzoom link with the appropriate zoom factor might be a novel approach to document workflow management:
public void ZoomAboutLogicalPoint( double zoomIncrementFactor, double zoomCenterLogicalX, double zoomCenterLogicalY )

In telecom, outside plant items such as transformers and switches could contain schematics embedded within a snapshot of the item, which is itself embedded in a wide area schematic or aerial imagery showing a city wide sector.
The next trick is to find a way to backtrack through a Deep Zoom scene graph to a real world coordinate. This seems possible using the pair of methods shown in the MultiScaleImage class:

MultiScaleImage.LogicalToElementPoint Method
MultiScaleImage.ElementToLogicalPoint Method

Once a coordinate has been reversed into its outer root element coordinates, it should be possible to assign a real world georeference, which would be used for a nearest query on a PostGIS table for obtaining additional attributes or accessing live data.
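The georeferencing step is simple arithmetic once a logical point is in hand. The sketch below is in Java only to match the server side code, and it rests on assumptions I have not verified against the Silverlight runtime: logical units are fractions of the root image width, the origin is the top left corner, the root image exactly covers the map bounding box, and pixels are square. The names and the UTM bounds used in the usage note are illustrative.

```java
// Hypothetical helper for illustration only.
final class DeepZoomGeoref {
    /**
     * Maps a Deep Zoom logical point to map coordinates.
     * Returns {x, y} in the same units as the bounding box (e.g. UTM meters).
     */
    static double[] logicalToWorld(double logicalX, double logicalY,
                                   double minX, double maxX, double maxY) {
        double widthUnits = maxX - minX;
        double x = minX + logicalX * widthUnits;
        // Logical y grows downward from the top edge, northing grows upward.
        double y = maxY - logicalY * widthUnits;
        return new double[] { x, y };
    }
}
```

For a 3000m wide image with its top left corner at (511172, 4402768), the logical point (0.5, 0.5) maps to (512672, 4401268), which could then feed a nearest query on the PostGIS side.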

Which brings up the big problem I see to date on the DeepZoom technology: it is static. The processing of large image sets into pyramids is not fast. Although there are some tantalizing command line hints in the ImageTool.exe and SparseImageTool.exe packaged with Deep Zoom Composer there is no real documentation:

usage: ImageTool [-v] options:
-v enable verbose output
commands:
convert [outdir ] [basename ] [explode] [large[tif|jpg|wdp|png] [quality ] [tilesize ]
Converts an image or a dir with files to a Seadragon stored image.
The optional outdir parameter specifies the output directory path.
The optional basename parameter allows overriding the output filesename
which defaults to the basename of the input file.
The default form is single-file (not exploded).
Specify large to use a slower algorithm, but one that can process very large images.
The default format is HDPhoto (wdp).
The quality value should be a number in the interval [0.0-1.0]. Default is 0.8
A quality value of 1.0 will result in lossless compression if possible in the chosen format.
If not specified, tilesize values default to size of 255 and a physical size of 256.
Legal values for tilesize values are >= 64 and <= 2048.

dump [levels] [index]
Dumps info about the specified Seadragon stored image file.

extract [] []
Extract level image into specified output directory in
TIFF format. If level is not specified, all are output.
If the output directory is not specified, files are written to
the current working directory.

The possibility of adding smaller items to an already existing large pyramid holds promise for more dynamic real time updating from a content management system.

In looking at the pyramid of jpg files it seems possible to add a simple place holder of the right dimension and then change content by overwriting the already existing pyramid jpgs with updated content. In this way easter egg content with relatively small jpg sets could be updated dynamically out of a content management system on each refresh. If Microsoft SeaDragon adds MultiScaleImage elements to WPF xaml some of the static issues surrounding DeepZoom viewing might be easier to handle.

Deep Zoom a TerraServer UrbanArea on EC2

March 12th, 2008


Fig 1 - Silverlight MultiScaleImage of a high resolution Denver image - 200.6Mb .png

Just to show that I can serve a compiled Deep Zoom Silverlight app from various Apache servers, I loaded this Denver example on a Windows 2003 Apache Tomcat here: http://www.web-demographics.com/Denver, and then a duplicate on a Linux Ubuntu 7.10 instance running in Amazon EC2, this time using Apache httpd rather than Tomcat: http://www.gis-ows.com/Denver. Remember these are using beta technology and will require updating to Silverlight 2.0. The Silverlight install is only about 4.5Mb, so the install is relatively painless on a normal bandwidth connection.

Continuing the exploration of Deep Zoom, I’ve had a crash course in Silverlight. Silverlight is theoretically cross browser compatible (at least for IE, Safari, and FireFox), and it’s also cross server. The trick for compiled Silverlight is to use Visual Studio 2008 with .NET 3.5 updates. Under the list of new project templates is a template called ‘Silverlight application’. Using this template sets up a project that can be published directly to the webapp folder of my Apache Server. I have not tried a DeepZoom MultiScaleImage on Linux FireFox or Mac Safari clients. However, I can view this on a Windows XP FireFox updated to Silverlight 2.0Beta as well as Silverlight updated IE7 and IE8 beta.

Creating a project called Denver and borrowing liberally from a few published examples, I was able to add a ClientBin folder under my Denver_Web project folder. Into this folder goes the pyramid I generate using Deep Zoom Composer. Once the pyramid is copied into place I can reference this source from my MultiScaleImage element source. Now the pyramid is viewable.

To make the MultiScaleImage element useful, I added a couple of additional .cs touches for mousewheel and drag events. Thanks to the published work of Lutz Gerhard, Peter Blois, and Scott Hanselman this was just a matter of including a MouseWheelHelper.cs in the project namespace and adding a few delegate functions to the main Page initialization code behind file. Pan and Zoom .cs

Now I need to backtrack a bit. How do I get some reasonable Denver imagery for testing this Deep Zoom technology? Well, I don’t belong to DRCOG, which I understand is planning on collecting 6″ aerials. There are other imagery sets floating around Denver as well, I believe down to 3″ pixel resolution, but the cost of aerial capture precludes any free and open source type of use. Fortunately there is some nice aerial data available from the USGS. The USGS Urban Area imagery is available for a number of metropolitan areas, including Denver.


Fig 2 - Same high resolution Denver image zoomed in to show detail

USGS Urban Area imagery is a color orthorectified image set captured at approximately 1ft pixel resolution. The data is made available to the public through the TerraServer WMS. Looking over the TerraServer UrbanArea GetCapabilities layer I see that I can ‘GetMap’ this layer in EPSG:26913 (UTM83-13m). The best possible pixel resolution through the TerraServer WMS is 0.25m per pixel. To achieve this level of resolution I can use the max pixel Height and Width of 2000 over a metric bounding box of 500m x 500m. http://gisdata.usgs.net/IADD/factsheets/fact.html

For example:
http://terraservice.net/ogcmap.ashx?version=1.1.1&service=WMS&ServiceName=WMS&request=GetMap&layers=UrbanArea&srs=EPSG:26913&bbox=511172,4399768,511672,4400268&WIDTH=2000&HEIGHT=2000

This is nice data but I want to get the max resolution for a larger area and mosaic the imagery into a single large image that I will then feed into the Deep Zoom Composer tool for building the MultiScaleImage pyramid. Java is the best tool I have to make a simple program to connect to the WMS and pull down my images one at a time into the tiff format.
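// Needs java.io.*, java.net.*, java.awt.image.BufferedImage and javax.imageio.ImageIO.
// Writing "TIFF" needs a TIFF ImageIO plugin (e.g. JAI Image I/O) on older JREs.
try {
	File outFile = new File(dir + imageFileName);
	URL u = new URL(url);
	HttpURLConnection geocon = (HttpURLConnection) u.openConnection();
	geocon.setAllowUserInteraction(false);
	geocon.setRequestMethod("GET");
	geocon.setDoInput(true);   // a GET only reads, so setDoOutput(true) is not needed
	geocon.setUseCaches(false);
	// ImageIO.read returns null when no reader understands the response
	BufferedImage image = ImageIO.read(geocon.getInputStream());
	if (image == null) throw new IOException("no image returned for " + url);
	ImageIO.write(image, "TIFF", outFile);
	geocon.disconnect();
	System.out.println("download completed to " + dir + imageFileName + " " + bbox);
} catch (IOException e) {
	System.err.println("download failed for " + bbox + ": " + e.getMessage());
}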

Looping this over my desired area creates a directory of 11.7Mb tif images. In my present experiment I grabbed a set of 6×6 tiles, or 36 tiff files at a total of 412Mb. The next step is to collect all of these tif tiles into a single mosaic. The Java JAI package contains a nice tool for this called mosaic:
mosaic = JAI.create("mosaic", pbMosaic, new RenderingHints(JAI.KEY_IMAGE_LAYOUT, imageLayout));

Iterating pbMosaic.addSource(translated); over my set of TerraServer tif files and then using PNGImageEncoder, I am able to create a single png file of about 200Mb. Now I have a sufficiently large image to drop into the Deep Zoom Composer for testing. The resulting pyramid of jpg files is then copied into my ClientBin subdirectory of the Denver VS2008 project. From there it is published to the Apache webapp. Now I can open my Denver webapp for viewing the image pyramid. On this client system with a good GPU and dual core cpu the image zoom and pan is quite smooth, much like a good local viewing application, with smooth transitions while zooming and panning in real time. On an older Windows XP system running FireFox the pan and zoom is very similar. That system has no GPU, so I am impressed.

Peeking into the pyramid I see that the bottom level 14 contains 2304 images for a 200Mb png pyramid. Each image stays at 256×256 and the compression ranges from 10kb to 20kb per tile. Processing into the jpg pyramid compresses from the original 412Mb tif set => 200.5Mb png mosaic => 45.7Mb 3084 file jpg pyramid. Evidently there is a bit of lossy compression, but the end effect is that the individual tiles are small enough to stream into the browser at a decent speed. Connected with high bandwidth the result is very smooth pan and zoom. This is basically a Google Earth or Virtual Earth user experience all under my control!
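The tile counts can be roughly reproduced with a little arithmetic. The sketch below assumes the usual Deep Zoom layout, in which each level halves the previous one until the image is a single pixel, and it assumes a 254 pixel tile step with a 1 pixel overlap on each side (which would give the 256×256 tiles seen here). Neither value is confirmed from the Composer output, and the class name is hypothetical. For the 12000px x 12000px mosaic (6×6 tiles of 2000px) it gives a top level of 14 with 48 x 48 = 2304 tiles, which agrees with the count above.

```java
class DeepZoomTiles {
    // Assumed tile step; with a 1px overlap per side, tiles come out at 256px.
    static final int TILE_STEP = 254;

    /** Highest level index: the level at which the image is at full resolution. */
    static int maxLevel(int width, int height) {
        int level = 0;
        int dim = Math.max(width, height);
        while (dim > 1) {
            dim = (dim + 1) / 2;  // halve, rounding up
            level++;
        }
        return level;
    }

    /** Number of tiles needed to cover an image of the given size at one level. */
    static long tilesAt(int width, int height) {
        long cols = (width + TILE_STEP - 1) / TILE_STEP;
        long rows = (height + TILE_STEP - 1) / TILE_STEP;
        return cols * rows;
    }

    /** Total tiles over all levels, from full resolution down to 1x1. */
    static long totalTiles(int width, int height) {
        long total = 0;
        int w = width;
        int h = height;
        while (true) {
            total += tilesAt(w, h);
            if (w == 1 && h == 1) {
                break;
            }
            w = (w + 1) / 2;
            h = (h + 1) / 2;
        }
        return total;
    }

    public static void main(String[] args) {
        int size = 6 * 2000;  // 6x6 mosaic of 2000px tiles
        System.out.println("max level: " + maxLevel(size, size));
        System.out.println("tiles at max level: " + tilesAt(size, size));
        System.out.println("tiles in pyramid: " + totalTiles(size, size));
    }
}
```

Under these assumptions the whole pyramid comes to 3082 tiles, close to the 3084 files counted above; the small difference is not explained by this sketch.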

Now that I have a workflow and a set of tools, I wanted to see what limits I ran into. The next step was to increment my tile set to an 8×8 for 64 tifs to see if my mosaic tool would endure the larger size as well as the DeepZoom Composer. My JAI mosaic will be the sticking point on a maximum image size since the source images are built in memory which on this machine is 3Gb. Taking into account Vista’s footprint I can actually only get about 1.5Gb. One possible workaround to that bottleneck is to create several mosaics and then attempt to splice them in the Deep Zoom Composer by manually positioning them before exporting to a pyramid.
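A rough size estimate shows why memory becomes the sticking point. The sketch below assumes uncompressed 3 byte RGB pixels and ignores any working copies JAI makes along the way, so it is a lower bound at best; the class name is hypothetical.

```java
class MosaicMemory {
    static final int TILE_PX = 2000;       // pixels per tile side
    static final int BYTES_PER_PIXEL = 3;  // assumed 8-bit RGB, no alpha

    /** Uncompressed bytes for an n x n mosaic of 2000px tiles. */
    static long rawBytes(int tilesPerSide) {
        long side = (long) tilesPerSide * TILE_PX;
        return side * side * BYTES_PER_PIXEL;
    }

    /** Bytes expressed in units of 1024 x 1024 bytes. */
    static double megabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        for (int n : new int[] {6, 8, 10}) {
            System.out.printf("%dx%d tiles: %.0f MB uncompressed%n", n, n, megabytes(rawBytes(n)));
        }
    }
}
```

The 6×6 case works out to about 412Mb, the same as the tif set, while a 10×10 mosaic needs more than 1.1Gb for the pixels alone, which is uncomfortably close to the roughly 1.5Gb available.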

First I modified my mosaic program to write a Jpeg output with jpgParams.setQuality(1.0f); This results in a faster mosaic and a smaller export. The JAI PNG encoder is much slower than JPEG. With this modification I was able to export a couple of 3000m x 3000m mosaics as jpg files. I then used Deep Zoom Composer to position the two images horizontally and exported as a single collection. In the end the image pyramid is 6000m x 3000m and 152Mb of jpg tiles. It looks like I might be able to scale this up to cover a large part of the Denver metro UrbanArea imagery.

The largest mosaic I was able to get Deep Zoom Composer to accept was 8×8 or 16000px x 16000px which is just 4000m x 4000m on the ground. Feeding this 143Mb mosaic through Composer resulted in a pyramid consisting of 5344 jpg files at 82.3Mb. However, scaling to a 5000m x 5000m set of 100 tifs produced a 221Mb mosaic that failed on import to Deep Zoom Composer. I say failed, but in this prerelease version the import finishes with a blank image shown on the right. Export works in the usual quirky fashion in that the export progress bar generally never stops, but in this case the pyramid also remains empty. Another quirky item to note is that each use of Deep Zoom Composer starts a SparseImageTool.exe process which continues consuming about 25% of cpu even after the Deep Zoom Composer is closed. After working a while you will need to go into task manager and close down these processes manually. Apparently this is “pre-release.”


Fig 3 - Same high resolution Denver image zoomed in to show detail of Coors Field; players are visible

Deep Zoom is an exciting technology. It allows map hackers access to real time zoom and pan of large images. In spite of some current size limitations on the Composer tool the actual pyramid serving appears to have no real limit. I verified on a few clients and was impressed that this magic works in IE and FireFox although I don’t have a Linux or Mac client to test. The compiled code serves easily from Apache and Tomcat with no additional tweaking required. My next project will be adapting these Deep Zoom pyramids into a tile system. I plan to use either an OWS front end or a Live Maps with a grid overlay. The deep zoom tiles can then be accessed by clicking on a tile to open a Silverlight MultiScaleImage. This approach seems like a simple method for expanding coverage over a larger metropolitan area while still using the somewhat limiting Deep Zoom Composer pre release.

BMNG SilverLight 2.0 Beta Deep Zoom

March 7th, 2008

Well somebody had to do it! I’m willing to give it a try.

What is it?

One of the many announcements at the MIX08 conference is the availability of Deep Zoom technology for Silverlight 2.0 Beta 1. This results from an R&D program in Microsoft called Sea Dragon. Sea Dragon was evidently a Microsoft acquisition a while back. Reminiscent of Keyhole (now Google Earth), Sea Dragon is a tool for smooth viewing of very large image resources. The novelty is to have it usable inside a SilverLight browser view. http://labs.live.com/Seadragon.aspx

For those so inclined to load beta stuff there is a demo available here with a nice code behind zoom pan capability: http://visitmix.com/blogs/Joshua/Hard-Rock-Cafe/

The idea is that behind the viewport is a large image in PNG, JPEG, TIFF, BMP format which is fed to the screen resolution using a MultiScaleImage element in a SilverLight 2.0 page. Actually just jpeg as it turns out.

Well of course this is extremely interesting to geospatial people since all of the WMS, WCS, WFS, KML, tile caching, image pyramids etc are aimed at exactly this functionality.

How is the user experience?

Holy Mackerel, finally a reason to use Vista! But wait, this is SilverLight, so it should work in FireFox and Safari too. By the way, if any Sea Dragon big Tuna comes trolling by this post, I already have my map hacker wishlist for 2.0 at the end of this entry.

OK, let’s try it.

First I grabbed a full resolution BMNG image off of BitTorrent: http://www.geotorrent.org/torrents/61.torrent

Chris Holmes has detailed instructions on this part here: http://docs.codehaus.org/display/GEOSDOC/Load+NASA+Blue+Marble+Data

The eventual result will be twelve large (267Mb) ecw files, one for each month of 2004, with names like “world-topo-bathy-200408-3x86400x43200.ecw.” Ecw is a wavelet compression format for imagery, http://en.wikipedia.org/wiki/ECW_(file_format), but to use the image we need it in a different format. Gdal to the rescue. The easiest approach is to take advantage of the FWTools bin directory to run a command line translation like this:

"C:\Program Files\FWTools2.1.0\bin\gdal_translate" -of GTiff world-topo-bathy-200401-3x86400x43200.ecw BMNG.tiff

After a few minutes the result is a tiff image of 86400×43200 of about 11Gb. Now it is time to use the Deep Zoom Composer (actually a decomposer) to process this into a MultiScaleImage info.bin
http://blogs.msdn.com/expression/archive/2008/03/05/download-the-preview-of-the-deep-zoom-composer.aspx

When I attempted an import of this 11Gb tiff into Deep Zoom Composer, the Mermaid.exe choked after a few minutes. I guess we aren’t ready for geospatial scale exactly yet. Note to self: do this with -o Tiff, since mermaids may not like GTiff.

So I went back to a smaller downsample to start my experiment. This time I chose a 3600×1800 png at 4.154Mb. This was rather instant success. Now there is a BMNG test pyramid on my hard drive. The pyramid is 13 levels and each subdirectory contains the necessary images in jpg. Deep Zoom Composer rather handily chunks the tiles in each level, even calculating non square tiling.


Fig 1 - Example of a pyramid level resulting from Deep Zoom Composer

After playing around a bit the export spits out what we need for a Silverlight MultiScaleImage. Remember this is an element type introduced with SilverLight2.0 Beta so you can’t really see this unless you want to do a beta Silverlight 2.0 install.

Here are some links on other neat things in R&D over at Microsoft labs.live: http://labs.live.com/photosynth/whatis/

SilverLight works a lot better with an IIS server, but I am using an Apache server, so I created a javascript Silverlight project. Using the default project, I modified the Scene.xaml and associated scene.js to make use of the new MultiScaleImage element:

<MultiScaleImage
	x:Name="msi"
	ViewportWidth="1.0"
	ViewportOrigin="0,0"
	Source="/BMNGZoom/BMNG/info.bin" />

This worked pretty well to get the image object in the browser with a sort of spring loaded entrance. Perhaps the springy annoyance can be turned off by setting UseSpringsProperty="off". However, adding zoom and pan is a bit more problematic. I am brand new to Silverlight but oddly there seem to be very few events available:
MouseMove, MouseEnter, MouseLeave, MouseLeftButtonDown, MouseLeftButtonUp

If you want MouseRight, Keyboard, MouseWheel etc, you need to have some code behind. Since I didn’t really have time to figure out all of the code behind tricks for getting this to serve from Apache, I took a primitive approach. By attaching an event to MouseLeftButtonUp I can simulate a click event. Then connecting this click event to the MultiScaleImage ViewportWidth *= 0.9; I could make a one way zoom in without too much effort. Not very useful, but good enough to get a feel for the interaction, which by the way is very impressive. The zooming is familiar to anyone used to VE or GE type of continuous zoom. Pretty nifty for a browser interface.
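The one way zoom is just repeated scaling of the viewport width. Here is a small sketch of the arithmetic, written in Java for illustration rather than in the Silverlight JavaScript actually used. The `Viewport` class is hypothetical and only mirrors the ViewportWidth and ViewportOrigin values of the MultiScaleImage. It also shows one way to keep a chosen point fixed while zooming, which the simple click handler described above does not attempt.

```java
class Viewport {
    double originX;  // left edge of the view, in image-width units
    double originY;  // top edge of the view, in image-width units
    double width;    // 1.0 shows the full image width

    Viewport(double originX, double originY, double width) {
        this.originX = originX;
        this.originY = originY;
        this.width = width;
    }

    /** Plain zoom as in the click handler: only the width changes. */
    void zoom(double factor) {
        width *= factor;
    }

    /** Zoom while keeping the point (px, py), in the same units, fixed on screen. */
    void zoomAbout(double px, double py, double factor) {
        originX = px - (px - originX) * factor;
        originY = py - (py - originY) * factor;
        width *= factor;
    }

    public static void main(String[] args) {
        Viewport v = new Viewport(0, 0, 1.0);
        for (int click = 0; click < 10; click++) {
            v.zoom(0.9);
        }
        System.out.printf("width after 10 clicks: %.4f%n", v.width);

        Viewport w = new Viewport(0, 0, 1.0);
        w.zoomAbout(0.5, 0.25, 0.5);
        System.out.printf("origin %.3f,%.3f width %.2f%n", w.originX, w.originY, w.width);
    }
}
```

Ten clicks at 0.9 shrink the viewport width to about 35% of the image, so each click is a gentle step rather than a full zoom level.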

There is even an ‘old view’ to ‘new view’ animation effect, which cleverly distracts the user while the new tiles are streaming in over the old tiles. Local tile cache makes revisiting very smooth. I will have to try this on an older tier 0 GPU system so I can watch the tiles move slowly into place.

http://www.web-maps.com/BMNGZoom/


Fig 2- primitive test of a Deep Zoom BMNG

Now that I had this working on a relatively small 4.1 Mb image, my next step was to step up the size and see what the effect would be. I already knew 11Gb Gtiff was not going to work. Dividing the full Blue Marble into 8 tiles and using PNG output seemed like a possibility. This gives 8 files at 256Mb each.

However, I noticed that the pyramid files are jpeg so why not start with 8 jpeg files instead:
"C:\Program Files\FWTools2.1.0\bin\gdal_translate" -of JPEG
-projwin -180 90 -90 0 world-topo-bathy-200401-3x86400x43200.ecw BMNG1.jpg
Input file size is 86400, 43200
Computed -srcwin 0 0 21600 21600 from projected window.
0…10.
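The eight windows follow from the 240 pixels per degree of the full image (86400 / 360). The sketch below lists a -projwin and the matching -srcwin for each 90 x 90 degree tile. The class name is hypothetical, and the output names BMNG1.jpg through BMNG8.jpg simply extend the BMNG1.jpg used above. The first line it prints reproduces the -srcwin 0 0 21600 21600 reported by gdal_translate.

```java
class BlueMarbleTiles {
    static final int FULL_WIDTH = 86400;             // pixels across 360 degrees
    static final int PX_PER_DEG = FULL_WIDTH / 360;  // 240
    static final int TILE_DEG = 90;

    /** -projwin arguments (ulx uly lrx lry) for the tile at col 0..3, row 0..1. */
    static String projwin(int col, int row) {
        int ulx = -180 + col * TILE_DEG;
        int uly = 90 - row * TILE_DEG;
        return ulx + " " + uly + " " + (ulx + TILE_DEG) + " " + (uly - TILE_DEG);
    }

    /** Matching -srcwin arguments (xoff yoff xsize ysize) in pixels. */
    static String srcwin(int col, int row) {
        int size = TILE_DEG * PX_PER_DEG;
        return (col * size) + " " + (row * size) + " " + size + " " + size;
    }

    public static void main(String[] args) {
        int n = 1;
        for (int row = 0; row < 2; row++) {
            for (int col = 0; col < 4; col++) {
                System.out.println("BMNG" + n++ + ".jpg  -projwin " + projwin(col, row)
                        + "  (-srcwin " + srcwin(col, row) + ")");
            }
        }
    }
}
```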

After a few minutes I could open the Deep Zoom Image Composer again and do an import on the full eight tile collection. The composer did not show anything in the view window with these larger jpg images so I was working blind on composition. I did the export anyway out of curiosity.
I’ll post the results next week since it will take a good bit of uploading time.
The result of this bit of experiment was quite surprising. The pyramid building goes fairly smoothly in the Deep Zoom Composer and is painless compared to manually building pyramids in Java. Actually Geotools ImagePyramid has some advantages like iteration over multiple images and command line capability. But the resulting tile pyramid doesn’t have the client side streaming. The MultiScaleImage element hides the complexity of an ajax slippy map interface in a single element. On the down side, adding functionality seems to be aimed at IIS ASP type servers. I imagine with a bit of time I can work out the details of a pan and MouseWheel zoom. SilverLight includes RenderTransform matrix capability; it just requires code behind to make it useful with mouse and keyboard functions.

The question is “how does this work?” Of course the answer is “I don’t know,” but that doesn’t stop some speculation. The pyramid is obvious. The fact that it works on both a Linux and a Windows box eliminates a stub socket on the server side. It appears to be an object downloaded to the client which orchestrates things in an ajax mode. Of course the CLR won’t work with Firefox on Linux or Safari so there must be a plugin object which can be duplicated cross platform.

Wishlist for Deep Zoom 2.0 from a map hacker

1. Can we scale this puppy to use extremely large image sets? I imagine DRCOG is going to want to see their 6″ aerials for the Denver metro area in Deep Zoom. Why should they have to come up with a pyramid tile set of MultiScaleImage elements?

2. How about MultiScaleImage elements for WPF xaml/xbap? I would like to use it with a more comprehensive set of code behind tools as an xbap.

3. Once it’s in WPF how about using some OO magic and add MultiScaleImageBrush for draping on a 3D mesh?

Let’s extrapolate a couple more steps

4. Why stop at images? How about extending to MultiScale3DMesh. Then my spare time project for an AJAX LiDAR viewer won’t require much work.

5. Don’t stop there, let’s have MultiScaleImageBrush on MultiScale3DMesh.

Now sharpen your patent pen

6. Why not MultiScaleVideo? Sooner or later all of the bifocaled baby boomers will be downloading movies to their iPhone, oops ZunePhone. How else are we going to see anything on those minuscule screens? Besides, talk about “immersive,” movies could really be interactive. Imax resolution on a phone, why not!

Wide Area HVAC controller using WPF and ZigBee Sensor grid

March 6th, 2008

One project I’ve been working on recently revolves around an online controller for a wide area HVAC system. HVAC systems can sometimes be optimized for higher efficiency by monitoring performance in conjunction with environment parameters. Local rules can be established for individual systems based on various temperatures, humidity, and duct configurations. Briefly, a set of HVAC functions consisting of on/off relay switches and thermistors, can be observed from an online monitoring interface. Conversely state changes can be initiated online by issuing a command to a queue. These sensors and relays might be scattered over a relatively large geographic area and in multiple locations inside a commercial building.

It is interesting to connect a macro geospatial world with a micro world, drilling down through a local facility to a single thermistor chip. In the end it’s all spatial.

Using a simple map view allows drill down from a wide area to a building, a device inside a building, a switch bank, and individual relay or analog channel for monitoring or controlling. The geospatial aspect of this project is somewhat limited, however, the zoom and pan tools used in the map location also happen to work well in the facilities and graphing views.

The interface can be divided into three parts:
1) The onsite system - local base system and Zigbee devices
2) The online server system - standard Apache Tomcat
3) The online client interface - WPF xbap, although svg would also work with a bit more work

Onsite System

The electronically impaired, like myself, may find the details of controller PIC chip sets, relays, and thermistor spec sheets baffling, but really they look more hacky than they are:

Fig 0 - Left: Zigbee usb antenna ; Center: thermistor chip MCP9701A; Right: ProXR Zigbee relay controller

The onsite system is made up of sensors and controller boards. The controller boards include a Zigbee antenna along with a single bank of 8 relays and an additional set of 8 analog inputs. The sensors are wired to the controller board in this development mode. However, Zigbee enabled temperature sensors are also a possibility, just more expensive. See SunSpot for example: http://www.sunspotworld.com/ (Open Source hardware? )

ZigBee is a wifi type communications protocol based on IEEE 802.15.4. It allows meshes of devices to talk to each other via RF as long as they are within about 100-300 ft of another node on the mesh. Extender repeaters are also available. ZigBee enabled devices can be scattered around a facility and communicate back to a base system by relaying messages node to node through an ad hoc mesh network.

The onsite system has a local pc acting as the base server. The local onsite server communicates with an external connection via an internet router and monitors the Zigbee network. ASUS EeePCs look like a good candidate for this type of application. Messages originating from outside are communicated down to the individual relay through the ZigBee mesh, while state changes and analog readings originating from a controller relay or sensor channel are communicated up the ZigBee network and then passed to the outside from the local onsite server.

The local server must have a small program polling the ZigBee devices and handling communications to the outside world via an internet connection. The PC is equipped with a usb ZigBee antenna to communicate with the other ZigBee devices in the network. This polling software was written in Java even though that may not be the best language for serial USB com control in Windows. The target system will be Linux based. The ZigBee devices we selected came with a USB driver that treats a USB port like a simple COM port.

Since this was a Java project the next step was finding a comm api. Sun’s JavaComm has discontinued support of Windows, although it is available for Linux. Our final onsite system will likely be Linux for cost reasons, so this is only a problem with the R&D system, which is Windows based. I ended up using the RXTX library, RXTXcomm.jar, at http://www.jcontrol.org/download/rxtx_en.html

Commands for our ProXR controller device are a series of numeric codes, for example <254;140;3;1>. This series puts the controller in command mode (254), issues a set bank status command (140), passes a byte with value 3 (binary 00000011) indicating that relays 0 and 1 are on, and addresses bank 1. The result is that relays 0 and 1 are switched to the on position. Commands are issued similarly for reading relay state and analog channels; <254;166;1>, for example, reads all 8 analog I/O channels as a set of 8 bytes.
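The relay byte in that example is a bit mask, one bit per relay in the bank. Here is a minimal sketch of building such a command, using only the codes 254 and 140 quoted above. The class and method names are hypothetical, and nothing here is taken from the ProXR documentation beyond those two values.

```java
class ProXrCommand {
    static final int COMMAND_MODE = 254;
    static final int SET_BANK_STATUS = 140;

    /** Bit mask with one bit set for each relay index (0..7) that should be on. */
    static int relayMask(int... relaysOn) {
        int mask = 0;
        for (int relay : relaysOn) {
            if (relay < 0 || relay > 7) {
                throw new IllegalArgumentException("relay index must be 0..7: " + relay);
            }
            mask |= 1 << relay;
        }
        return mask;
    }

    /** Bytes for a set bank status command: 254, 140, mask, bank. */
    static byte[] setBankStatus(int bank, int... relaysOn) {
        return new byte[] {
            (byte) COMMAND_MODE, (byte) SET_BANK_STATUS, (byte) relayMask(relaysOn), (byte) bank
        };
    }

    /** Format the bytes the way they are written in the text, e.g. <254;140;3;1>. */
    static String format(byte[] cmd) {
        StringBuilder sb = new StringBuilder("<");
        for (int i = 0; i < cmd.length; i++) {
            if (i > 0) {
                sb.append(';');
            }
            sb.append(cmd[i] & 0xFF);  // print each byte as unsigned
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(format(setBankStatus(1, 0, 1)));
    }
}
```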

Going in prototype mode we picked up a batch of three wire MCP9701A thermistor chips for a few dollars. The trick is to pick the right resistance to get voltage readings into the mid range of the 8 bit or 10 bit analog channel read. Using 8 bit output lets us poll for temperature with around .5 degree F resolution.
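Turning a polled count into a temperature is a short linear conversion. The sketch below uses the MCP9701A datasheet values of 400 mV at 0 °C and 19.5 mV per °C. The reference voltage, the count scaling, and the absence of any divider between the sensor and the analog input are assumptions, not details of this prototype, and the size of one count in degrees depends directly on them. The class name is hypothetical.

```java
class Mcp9701a {
    // Datasheet values for the MCP9701A: 400 mV at 0 C, 19.5 mV per degree C.
    static final double V_AT_0C = 0.400;
    static final double V_PER_DEG_C = 0.0195;

    /** Convert a raw ADC count to volts, assuming full scale equals vref. */
    static double countToVolts(int count, int bits, double vref) {
        return count * vref / ((1 << bits) - 1);
    }

    /** Sensor output voltage to degrees Celsius. */
    static double voltsToCelsius(double volts) {
        return (volts - V_AT_0C) / V_PER_DEG_C;
    }

    static double celsiusToFahrenheit(double c) {
        return c * 9.0 / 5.0 + 32.0;
    }

    public static void main(String[] args) {
        double vref = 5.0;  // assumed reference voltage
        int count = 45;     // made-up 8-bit reading
        double c = voltsToCelsius(countToVolts(count, 8, vref));
        System.out.printf("count %d -> %.1f C / %.1f F%n", count, c, celsiusToFahrenheit(c));
    }
}
```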

The polling program issues commands and reads results on separate threads. If state is changed locally it is communicated back to the online server on the next polling message, while commands from the online command queue are written to the local controller boards with the return. In the meantime every polling interval sends an analog channel record back to the server.

Online Server

The online server is an Apache Tomcat service with a set of servlets to process communications from the onsite servers. Polled analog readings are stored in a PostgreSQL database with building:device:bank:channel addresses as well as a timestamp. The command queue is another PostgreSQL table which is checked at each poll interval for commands addressed to the building address which initiated the poll. Any pending commands are returned to the polling onsite server where they will be sent out to the proper device:bank:relay over the ZigBee network.
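The poll cycle on the server side can be sketched with in-memory structures standing in for the PostgreSQL tables. Everything named here (`PollServer`, `poll`, `queueCommand`, the address strings) is hypothetical. The sketch only illustrates the flow described above: each poll stores a reading and carries back any commands waiting for that building.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class PollServer {
    /** One stored analog reading with its full address and timestamp. */
    static final class Reading {
        final String address;
        final int value;
        final long timestamp;

        Reading(String address, int value, long timestamp) {
            this.address = address;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final List<Reading> readings = new ArrayList<>();
    private final Map<String, Queue<String>> commandQueue = new HashMap<>();

    /** Called by a client: queue a command for a building. */
    synchronized void queueCommand(String building, String command) {
        commandQueue.computeIfAbsent(building, b -> new ArrayDeque<>()).add(command);
    }

    /** Called on each poll: store the reading, then return and clear pending commands. */
    synchronized List<String> poll(String building, String channelAddress, int value) {
        readings.add(new Reading(building + ":" + channelAddress, value, System.currentTimeMillis()));
        List<String> pending = new ArrayList<>();
        Queue<String> queue = commandQueue.get(building);
        while (queue != null && !queue.isEmpty()) {
            pending.add(queue.remove());
        }
        return pending;
    }

    synchronized int readingCount() {
        return readings.size();
    }

    public static void main(String[] args) {
        PollServer server = new PollServer();
        server.queueCommand("bldg1", "dev2:1:<254;140;3;1>");
        System.out.println(server.poll("bldg1", "dev2:1:0", 128));  // the pending command comes back
        System.out.println(server.poll("bldg1", "dev2:1:0", 129));  // nothing left in the queue
    }
}
```

In the real system the synchronized methods would be replaced by database transactions, which is where PostgreSQL record locking comes in.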

Two other tables simply provide locations of buildings as longitude, latitude in the wide area HVAC control system. Locations of devices inside buildings are stored in a building table as floor and x,y coordinates. These are available for the client interface.

Client Interface

The client interface was developed using WPF xbap to take advantage of xaml controls and a WMS mapping interface. Initially the client presents a tabbed menu with a map view. The map view indicates the wide area HVAC extents with a background WMS image for reference. Zooming in to the building location of interest allows the user to select a building to show a floor plan with device locations indicated.

Fig 1 HVAC wide area map view

Once a building is selected the building floor plans are displayed. Selecting an individual device determines the building:device address.

Fig 2 Building:device address selection from facilities floor plan map

Finally individual relays can be selected from the device bank by pushing on/off buttons. Once the desired switch configuration is set in the panel, it can be sent to the command queue as a building:device:bank address command. Current onsite state is also double checked by the next polling return from the onsite server.

Fig 3 Relay switch panel for selected building:device:bank address.

The analog IO channels are updated to the server table at the set polling interval. A selection of the analog tab displays a set of graph areas for each of the 8 channels. The on/off panel initiates a server request for the latest set of 60 polled readings which are displayed graphically. It won’t be much effort to extend this analog graph to a bidirectional interface with user selectable ranges set by dragging floor and ceiling lines that trigger messages or events when a line is crossed.

Fig 4 Analog IO channel graphs

This prototype incorporates several technologies using a Java based Tomcat service online and a Java RXTXcomm Api for the local Zigbee polling. The client interface is also served out of Apache Tomcat as WPF xaml to take advantage of easier gui control building. In addition OGC WMS is used for the map views. The facilities plan views will be DXF translations to WPF xaml. Simple graphic selection events are used to build addresses to individual relays and channels. The server provides historical command queues and channel readings by storing time stamped records. PostgreSQL also has the advantage of handling record locking on the command queue when multiple clients are accessing the system.

This system is in the prototype stage but illustrates the direction of control systems. A single operator can maintain and monitor systems from any location accessible to the internet, which is nearly anywhere these days. XML rendering graphics grammars for browsers like svg and xaml enable sophisticated interfaces that are relatively simple to build.

There are several OGC specifications oriented toward sensor grids, http://www.opengeospatial.org/projects/groups/sensorweb. The state of art is still in flux but by virtue of the need for spatial management of sensor grids, there will be a geospatial component in an “ubiquitous sensor” world.

LiDAR processing with Hadoop on EC2

March 1st, 2008

I was inspired yesterday by a FRUGOS talk from a Lidar production manager showing how to use some OpenSource tools to make processing LiDAR easier. LiDAR stands for LIght Detection And Ranging. It’s a popular technology these days for developing detailed terrain models and ground classification. The USGS has some information here: USGS CLick and the state of PA has an ambitious plan to fly the entire state every three years at a 1.4 ft resolution. PAMAP LiDAR

My first stop was to download some .las data from the PAMAP site. This PA LiDAR has been tiled into 10000×10000ft sections which are roughly 75Mb each. First I wrote a .las translator so I could look at the data. The LAS 1.1 specification is about to be replaced by LAS 2.0 but in the meantime most available las data is still using the older spec. The spec contains a header with a space for variable length attributes and then the point data. Each point record contains 20 bytes (at least in the PAMAP .las) of information including x,y,z as long values and a classification and reflectance value. The xyz data is scaled and offset by values in the header to obtain double values. The LiDAR sensor mounted on an airplane flies tracks across the area of interest while the LiDAR pulses at a set frequency. The location of the reading can then be determined by post processing against the GPS location of the instrument. The pulse scans result in tracks that angle across the flight path.
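Decoding one of those 20 byte records can be sketched as follows. This is not the translator used here. It assumes the PAMAP files use point data record format 0 of the LAS 1.1 spec (little-endian; x, y, z as 4 byte integers, then a 2 byte intensity, a flags byte, a classification byte, and 4 more bytes that are skipped), and that the scale and offset arrays have already been read from the header. The class name and the sample values are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LasPoint {
    final double x;
    final double y;
    final double z;
    final int intensity;       // the reflectance value
    final int classification;

    LasPoint(double x, double y, double z, int intensity, int classification) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.intensity = intensity;
        this.classification = classification;
    }

    /** Decode one 20-byte record; scale and offset are the x,y,z values from the header. */
    static LasPoint decode(byte[] record, double[] scale, double[] offset) {
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        int rawX = buf.getInt();
        int rawY = buf.getInt();
        int rawZ = buf.getInt();
        int intensity = buf.getShort() & 0xFFFF;  // unsigned 16-bit
        buf.get();                                // return number and scan flags, unused here
        int classification = buf.get() & 0xFF;
        // The remaining 4 bytes (scan angle, user data, point source id) are ignored.
        return new LasPoint(rawX * scale[0] + offset[0], rawY * scale[1] + offset[1],
                rawZ * scale[2] + offset[2], intensity, classification);
    }

    public static void main(String[] args) {
        // Build a made-up record to exercise the decoder.
        ByteBuffer buf = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(123456).putInt(654321).putInt(1500).putShort((short) 200).put((byte) 0).put((byte) 2);
        LasPoint p = decode(buf.array(), new double[] {0.01, 0.01, 0.01},
                new double[] {1000000, 200000, 0});
        System.out.println(p.x + " " + p.y + " " + p.z + " class " + p.classification);
    }
}
```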

Since the PAMAP data is available in PA83-SF coordinates, despite the metric unit shown in the meta data, I also added a conversion to UTM83-17. This was a convenience for displaying over Terraserver backgrounds using the available EPSG:26917. Given the hi res aerial data available from PAMAP this step may be unnecessary in a later iteration.

Actually PAMAP has done this step already. Tiff grids are available for download. However, I’m interested in the gridding algorithm and possible implementation in a hadoop cluster so PAMAP is a convenient test set with a solution already available for comparison. It is also nice to think about going directly from raw .las data to my end goal of a set of image and mesh pyramids for the 3D viewer I am exploring for gmti data.


Fig 1 - LiDAR scan tracks in xaml over USGS DOQ

Although each section is only about 3.5 square miles the scans generate more than a million points per square mile. The pulse rate achieves roughly 1.4ft spacing for the PAMAP data collection. My translator produces a xaml file of approximately 85Mb which is much too large to show in a browser. I just pulled some scans from the beginning and the end to show location and get an idea of scale.

The interesting aspect of LiDAR is the large amount of data collected which is accessible to the public. In order to really make use of this in a web viewer I will need to subtile and build sets of image pyramids. The scans are not orthogonal so the next task will be to grid the raw data into an orthogonal cell set. Actually there will be three grids, one for the elevation, one for the classification, and one for the reflectance value. With gridded data sets, I will be able to build xaml 3D meshes overlaid with the reflectance or classification. The reflectance and classification will be translated to some type of image file, probably png.

Replacing the PA83-SF with a metric UTM base gives close to 3048mx3048m per PAMAP section. If the base cell starts at 1 meter, and using a 50cellx50cell patch, the pyramid will look like this:
dim    cells/patch  size            approx faces/section
1m     50×50        50mx50m         9,290,304
2m     50×50        100mx100m       2,322,576
4m     50×50        200mx200m       580,644
8m     50×50        400mx400m       145,161
16m    50×50        800mx800m       36,290
32m    50×50        1600mx1600m     9,072
64m    50×50        3200mx3200m     2,268
128m   50×50        6400mx6400m     567
256m   50×50        12800mx12800m   141
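The table can be regenerated with a few lines of arithmetic. The sketch below treats a face as one square cell and takes the section as exactly 3048m on a side, which reproduces the counts above when the result is rounded down. The class and method names are hypothetical.

```java
class PyramidTable {
    static final double SECTION_METERS = 3048.0;  // approximate side of one PAMAP section
    static final int CELLS_PER_PATCH = 50;        // cells along one side of a patch

    /** Approximate number of cells (faces) covering one section at the given cell size. */
    static long facesPerSection(int cellMeters) {
        double cellsPerSide = SECTION_METERS / cellMeters;
        return (long) Math.floor(cellsPerSide * cellsPerSide);
    }

    /** Ground size in meters of one side of a 50x50 patch at the given cell size. */
    static int patchMeters(int cellMeters) {
        return cellMeters * CELLS_PER_PATCH;
    }

    public static void main(String[] args) {
        for (int cell = 1; cell <= 256; cell *= 2) {
            int patch = patchMeters(cell);
            System.out.printf("%4dm  %5dm x %-5dm  %,10d%n", cell, patch, patch, facesPerSection(cell));
        }
    }
}
```

Note that from the 64m level up a single patch is already larger than one section, so the upper levels only make sense once several sections are combined.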

There will end up being two image pyramids and a 3D xaml mesh pyramid with the viewer able to move from one level to the next based on zoom scale. PAMAP also has high resolution aerial imagery which could be added as an additional overlay.

Again this is a large amount of data, so pursuing the AWS cluster thoughts from my previous posting it will be interesting to build a hadoop hdfs cluster. I did not realize that there is actually a hadoop public ami available with ec2 specific scripts that can initialize and terminate sets of EC2 instances. The hadoop cluster tools are roughly similar to the Google GFS with a chunker master and as many slaves as desired. Hadoop is an Apache project started by the Lucene team inspired by the Google GFS. The approach is useful in processing data sets larger than available disk space, or 160Gb in a small EC2 instance. The default instance limit for EC2 users is currently at 20. With permission this can be increased but going above 100 seems to be problematic at this stage. Using a hadoop cluster of 20 EC2 instances still provides a lot of horsepower for processing data sets like LiDAR. The cost works out to 20x$0.10 per hour = $2.00/hr which is quite reasonable for finite workflows. hadoop EC2 ami

This seems to be a feasible approach to doing gridding and pyramid building on large data resources like PAMAP. As PAMAP paves the way, other state and federal programs will likely follow with comprehensive sub meter elevation and classification available for large parts of the USA. Working out the browser viewing requires a large static set of pre-tiled images and xaml meshes. The interesting part of this adventure is building and utilizing a supercomputer on demand. Hadoop makes it easier but this is still speculative at this point.


Fig 2 - LiDAR scan tracks in xaml over USGS DRG

I still have not abandoned the pipeline approach to WPS either. 52North has an early implementation of a WPS available written in Java. Since it can be installed as a war, it should be simple to start up a pool of EC2 instances using an ami with this WPS war installed. The challenge, in this case, is to make a front end in WPF that would allow the user to wire up the instance pool in different configurations. Each WPS would be configured with an atomistic service which is then fed by a previous service on up the chain to the data source. Conceptually this would be similar to the pipeline approach used in JAI which builds a process chain, but leaves it empty until a process request is initialized. At that point the data flows down through the process nodes and out at the other end. Conceptually this is quite simple, one AMI per WPS process and a master instance to push the data down the chain. The challenge will be a configuration interface to select a WPS atom at each node and wire the processes together.
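The deferred chain idea can be sketched in a few lines. This is only an illustration of wiring first and running later: `ProcessChain` and its methods are hypothetical, the two steps are stand-ins operating on a tiny elevation array, and in the scenario described above each step would be a call to a WPS instance rather than a local function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.DoubleUnaryOperator;
import java.util.function.UnaryOperator;

class ProcessChain<T> {
    private final List<String> names = new ArrayList<>();
    private final List<UnaryOperator<T>> steps = new ArrayList<>();

    /** Wire another process onto the end of the chain; nothing runs yet. */
    ProcessChain<T> then(String name, UnaryOperator<T> step) {
        names.add(name);
        steps.add(step);
        return this;
    }

    /** A request arrives: push the data through every step in order. */
    T execute(T input) {
        T data = input;
        for (UnaryOperator<T> step : steps) {
            data = step.apply(data);
        }
        return data;
    }

    /** Names of the wired steps, in execution order. */
    List<String> describe() {
        return new ArrayList<>(names);
    }

    /** Helper for the demo: apply a function to every value of an array. */
    static double[] map(double[] in, DoubleUnaryOperator f) {
        return Arrays.stream(in).map(f).toArray();
    }

    public static void main(String[] args) {
        ProcessChain<double[]> chain = new ProcessChain<double[]>()
                .then("feetToMeters", a -> map(a, v -> v * 0.3048))
                .then("clampBelowZero", a -> map(a, v -> Math.max(v, 0.0)));
        System.out.println(chain.describe());  // wired, but no data has flowed yet
        double[] out = chain.execute(new double[] {100.0, -5.0, 1000.0});
        System.out.println(Arrays.toString(out));
    }
}
```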

I am open to suggestions on a reasonably useful WPS chain to use in experimentations.

AWS is an exciting technology and promises a lot of fun for computer junkies who have always had a hankering to play with supercomputing, but didn’t quite make the grade for a Google position.

Amazon SQS the poor man’s super computer

February 26th, 2008

EC2 and S3 are not the only AWS services of interest to the geospatial community. Amazon SQS Simple Queue Service is also quite interesting. I haven’t looked into it too far but unlimited locking message queues with large instance arrays is essentially a poor man’s supercomputer. For a certain scale of problem which can be replicated recursively into multiple subsets, parallel computing techniques have often been used. Numerous distributed computing projects come to mind, Active Distributed Computing Projects.

Perhaps AWS can be configured for short burst supercomputer problems in an economical fashion. By breaking a problem into enough small chunks and adding them to a set of SQS queues pointed at a configurable array of ami instances, voila, we have an AWS super computer! The EC2 instance array would pull data chunks out of a queue, process them, and queue the results back to an aggregator instance. An interesting problem might be to determine whether such a scenario would be queue constrained or processing instance constrained. Amazon resources are not infinite: “If you wish to run more than 20 instances, please contact us at aws@amazon.com ” However, let’s imagine a utility computing environment of the future.

In the AWS of the future an instance array can be more like Deep Blue. A modest 32×32 array provides 1024 discrete process instances which is possibly within current limits, but a more ambitious 256×256 array at 65536 distinct instances would not be out of the question on the five year horizon.

In the geospatial arena there are numerous problems amenable to distributed processing. With the massive collection of geospatial imagery presently underway, collection and storage are already a large problem for NASA, NOAA, JPL, USGS etc. Add to this problem the issue of scientific exploration of these massive data sets and distributed computing may have a large role to play within the same 5 year horizon.

This week OGC announced the final release of the Web Processing Service, WPS (see the OGC WPS press release). The Web Processing Service spec provides a blueprint for services to ask higher level questions like why?, how much?, and what if? The goal is to provide interchangeable service process algorithms that can potentially be chained into answers to these types of higher level questions. For example a lidar scene can be processed into a roughness measure using a convolution kernel. When the result is compared with other bands from hyperspectral sensors in some boolean operation, the output could be used to answer the question: “how many acres of drought tolerant grassland lie within Kit Carson county?” There are at least two distinct functions: 1) a roughness calculation and 2) a boolean combination, with possibly a 3rd to add all pixels in the expected range for a final area measure.
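
As a rough illustration of that three step chain, here is a sketch on a toy 4×4 grid. Everything specific in it is an assumption made for the example: the Laplacian kernel as the roughness measure, the thresholds, the band values, and the 30 m pixel size. It is not a real grassland classifier.

```javascript
// Step 1: roughness as the absolute response to a 3x3 Laplacian kernel.
// The kernel choice is an assumption; any roughness kernel would slot in here.
const LAPLACIAN = [
  [0, -1, 0],
  [-1, 4, -1],
  [0, -1, 0],
];

function roughness(elevation) {
  const rows = elevation.length;
  const cols = elevation[0].length;
  // Border cells have no full neighborhood, so they stay NaN and never match.
  const out = elevation.map((row) => row.map(() => NaN));
  for (let r = 1; r < rows - 1; r++) {
    for (let c = 1; c < cols - 1; c++) {
      let acc = 0;
      for (let dr = -1; dr <= 1; dr++) {
        for (let dc = -1; dc <= 1; dc++) {
          acc += elevation[r + dr][c + dc] * LAPLACIAN[dr + 1][dc + 1];
        }
      }
      out[r][c] = Math.abs(acc);
    }
  }
  return out;
}

// Step 2: boolean AND of "smooth enough" with "band value high enough".
function combine(rough, band, maxRoughness, minBand) {
  return rough.map((row, r) =>
    row.map((value, c) => value <= maxRoughness && band[r][c] >= minBand));
}

// Step 3: count matching pixels and convert to acres (1 acre = 4046.8564224 m^2).
function areaAcres(mask, pixelAreaM2) {
  const count = mask.flat().filter(Boolean).length;
  return { count, acres: (count * pixelAreaM2) / 4046.8564224 };
}

const elevation = [
  [10, 10, 10, 10],
  [10, 10, 14, 10],
  [10, 10, 10, 10],
  [10, 10, 10, 10],
];
const band = [
  [0.1, 0.2, 0.3, 0.4],
  [0.5, 0.6, 0.7, 0.8],
  [0.2, 0.9, 0.3, 0.1],
  [0.1, 0.1, 0.1, 0.1],
];

const mask = combine(roughness(elevation), band, 4, 0.5); // hypothetical thresholds
const grassland = areaAcres(mask, 30 * 30);                // assumes 30 m pixels
console.log(grassland.count, grassland.acres.toFixed(3));
```

Each of the three functions maps onto one WPS process, which is what makes the chain a candidate for splitting across instances.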

Now add a distributed compute model. The simplest is one process per instance. In this approach each analysis request gets its own EC2 instance. All processes run sequentially in the single dedicated instance. This is of course a big help and far different than the typical multi-request one server model. But now we can move down this stream another step or two.

Next, why not use one instance for each process step? In this case a queue connects each step to a downstream instance. Process one performs the convolution, and as chunks/cells/tiles become available they are pushed into the SQS queue. Process two, the boolean combination, picks chunks from the other end of the queue to build the end result from a series of boolean tile operations. The queue decouples the two processes so that asynchronous operations are possible. If the first process proceeds at twice the speed of the second process, simply add another process two instance at the other end of the queue. In this scenario we have one request, two WPS processes, and perhaps 3 AMI instances. This improves things a bit, actually quite a bit. The cost per request has at least tripled, but throughput has also been increased by close to the same factor.
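
A small tick-based simulation shows why the second consumer matters. The rates are made up (process one emits 2 tiles per tick, each process two instance handles 1 tile per tick) and the function name is hypothetical; it only models the queue backlog, not real SQS behavior.

```javascript
// Simulate a producer stage feeding a queue drained by one or more consumers.
// Rates are tiles per tick; all numbers are illustrative assumptions.
function simulatePipeline({ tiles, producerRate, consumerRate, consumers }) {
  let produced = 0;
  let done = 0;
  let backlog = 0;     // tiles currently waiting in the queue
  let maxBacklog = 0;
  let ticks = 0;
  while (done < tiles) {
    // Process one pushes finished tiles into the queue.
    const made = Math.min(producerRate, tiles - produced);
    produced += made;
    backlog += made;
    maxBacklog = Math.max(maxBacklog, backlog);
    // Process two instances pull what they can handle this tick.
    const taken = Math.min(consumerRate * consumers, backlog);
    backlog -= taken;
    done += taken;
    ticks += 1;
  }
  return { ticks, maxBacklog };
}

const oneConsumer = simulatePipeline({ tiles: 100, producerRate: 2, consumerRate: 1, consumers: 1 });
const twoConsumers = simulatePipeline({ tiles: 100, producerRate: 2, consumerRate: 1, consumers: 2 });
console.log(oneConsumer, twoConsumers);
```

With a single consumer the queue backlog grows until the producer finishes and the run takes twice as long; with two consumers the backlog stays flat and the slower step no longer sets the pace.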

Now comes a full blown distributed model. Like most array objects, geospatial data sets can be broken into smaller subsets and the same process replicated over an array of subsets in a parallel fashion. Now each step in the process chain can have an array of instances, each working on a small chunk. These chunks feed into multiple queues directed downstream to process two, which is also an array of instances. We now have supercomputing potential: a 32×32 array pool of instances for process one feeds some set of queues connecting to a second 32×32 array pool of instances working on process two. At 1024 instances per process we can quickly see the current AWS is not going to be happy. The cost is now magnified by a factor of a thousand, but only if the instance pools are maintained continuously. If the pools are only in use for the duration of the request, the cost could potentially be of the same magnitude as the one process per instance architecture, while throughput is increased by a factor of roughly 1000. Short burst supercomputing inside utility computing warehouses like AWS could be quite cost effective.
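
The cost argument is just instance-hours arithmetic. The sketch below uses the $0.10 per hour small instance rate quoted elsewhere in these posts; the 1000 hour job, the perfect 1024-way speedup, and the rounding of partial hours up to whole hours are assumptions for illustration.

```javascript
// Price in cents to keep the arithmetic exact: $0.10 per small-instance hour.
const RATE_CENTS_PER_HOUR = 10;

// Cost of running `instances` machines for `hoursEach` hours apiece.
function costDollars(instances, hoursEach) {
  const billedHours = Math.ceil(hoursEach); // assumption: partial hours bill as full hours
  return (instances * billedHours * RATE_CENTS_PER_HOUR) / 100;
}

const jobHours = 1000; // hypothetical job: 1000 hours on a single instance

const singleInstance = costDollars(1, jobHours);          // one instance, start to finish
const burstPool = costDollars(1024, jobHours / 1024);     // 32x32 pool, assumes perfect speedup
const standingPool = costDollars(1024, jobHours);         // same pool kept up the whole time

console.log(singleInstance, burstPool, standingPool);
```

Under these assumptions the burst pool costs about the same as the single instance while finishing in under an hour, and only the standing pool shows the thousandfold cost increase.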

It is conceivable that some analysis chains will involve dozens of process steps over very large imagery sets. Harnessing the ephemeral instance creation of utility computing points toward solutions to complex WPS process chains in near real time all on the internet cloud. So SQS does have some interesting potential in the geospatial analysis arena.

Xaml on Amazon EC2 S3

February 16th, 2008

Time to experiment with Amazon EC2 and S3. This site http://www.gis-xaml.com is using an Amazon EC2 instance with a complete open source GIS stack running on Ubuntu Gutsy.

  • Ubuntu Gutsy
  • Java 1.6.0
  • Apache2
  • Tomcat 6.0.16
  • PHP5
  • MySQL 5.0.45
  • PostgreSQL 8.2.6
  • PostGIS 1.2.1
  • GEOS 2.2.3-CAPI-1.1.1
  • Proj 4.5.0
  • GeoServer 1.6

Running an Apache2 service with a mod_jk connector to Tomcat lets me run the examples of xaml xbap files with their associated Java servlet utilities for pulling up GetCapabilities trees on various OWS services. This is an interesting example of combining open source and WPF. In the NasaNeo example, Java is used to create the 3D terrain models from JPL srtm data and drape them with BMNG imagery, all served as WPF xaml to take advantage of native client bindings.

I originally attempted to start with a public ami based on Fedora Core 6. I found loading the stack difficult, with hard to find RPMs and awkward installation issues. I finally ran into a wall with the PostgreSQL/PostGIS install. In order to load it I needed a complete gcc make package to compile from sources. It did not seem worth the trouble. At that point I switched to an Ubuntu 7.10 Gutsy ami.

Ubuntu, which is based on Debian, is somewhat different in its directory layout from the Fedora base. However, Ubuntu apt-get was much better maintained than the Fedora Core yum installs. This may be due to using the older Fedora Core 6 rather than Fedora 8 or 9, but there did not appear to be any usable public ami images available on AWS EC2 for the newer Fedoras. In contrast to Fedora, installing a recent version of PostgreSQL/PostGIS on Ubuntu was a simple matter:
apt-get install postgresql-8.2-postgis postgis

In this case I was using the basic small 32 bit instance ami with 1.7GB memory and 160GB storage at $0.10/hour. The performance was very comparable to some dedicated servers we are running, perhaps even a bit better since the Ubuntu service is set up using Apache2 with mod_jk in front of Tomcat, while the dedicated servers simply use Tomcat.

There are some issues to watch for on the small ami instances. The storage is 160GB but the partition allots just 10GB to root and the balance to a /mnt point. This means the default installations of mysql and postgresql will have data directories on the smaller 10GB partition. Amazon has done this to limit ec2-bundle-vol to a 10GB max. ec2-bundle-vol is used to bundle an image for storage on S3, which is where the whole utility computing idea gets interesting.

Once an ami stack has been installed it is bundled and stored on S3, and that ami is then registered with AWS. Now you have the ability to replicate the image on as many instances as desired. This allows very fast scaling or failover with minimal effort. The only caveat of course is dynamic data. Unless provision is made to replicate mysql and postgresql data to multiple instances or S3, any changes can be lost with the loss of an instance. This does not appear to occur terribly often, but then again AWS is still Beta. Also important to note, the DNS domain pointed to an existing instance will be lost along with that instance. Bringing up a new instance requires a change to the DNS entry as well (which takes several hours to propagate), since each instance creates its own unique amazon domain name. There appear to be some workarounds for this, requiring more extensive knowledge of DNS servers.

In my case the data sources are fairly static. I ended up changing the datadir pointers to /mnt locations. Since these are not bundled in the volume creation, I handled them separately. Once the data required was loaded I ran a tar on the /mnt/directory and copied the .tar files each to its own S3 bucket. The files are quite large so this is not a nice way to treat backups of dynamic data resources.

Next week I have a chance to experiment with a more comprehensive solution from Elastra. Their beta version promises to solve these issues by wrapping PostgreSQL/PostGIS on the ec2 instance with a layer that uses S3 as the actual datadir. I am curious how this is done, but assume that for performance the indices remain local to an instance while the data resides on S3. I will be interested to see what performance is possible with this product.

Another interesting area to explore is Amazon’s recently introduced SimpleDB. This is not a standard sql database but a type of hierarchical object stack over on S3 that can be queried from EC2 instances. It is geared toward untyped text storage, which is fairly common in website building. It will be interesting to adapt this to geospatial data to see what can be done. One idea is to store bounding box attributes in the SimpleDB and create some type of JTS tool for indexing on the ec2 instance. The local spatial index would handle the lookup, which is then fed to the SimpleDB query tools for retrieving data. I imagine the biggest bottleneck in this scenario would be the cost of converting text to double and back again.
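
Because SimpleDB stores and compares attribute values as text, coordinates need an encoding whose string order matches numeric order. The sketch below uses the usual trick of adding an offset and zero padding to a fixed width. The function names, the five decimal precision, and the two sample boxes are all hypothetical, and no SimpleDB call is made; it only shows the text round trip and a bounding box test done purely on strings.

```javascript
// Offsets shift longitude and latitude into non-negative ranges.
const LON_OFFSET = 180;
const LAT_OFFSET = 90;

// Encode as fixed-width text, e.g. -104.91207 with offset 180 -> "075.08793".
function encodeCoord(value, offset) {
  return (value + offset).toFixed(5).padStart(9, "0");
}

// Inverse conversion back to a double.
function decodeCoord(text, offset) {
  return parseFloat(text) - offset;
}

// A bounding box stored as four text attributes (hypothetical attribute names).
function encodeBox(name, minLon, minLat, maxLon, maxLat) {
  return {
    name,
    minLon: encodeCoord(minLon, LON_OFFSET),
    minLat: encodeCoord(minLat, LAT_OFFSET),
    maxLon: encodeCoord(maxLon, LON_OFFSET),
    maxLat: encodeCoord(maxLat, LAT_OFFSET),
  };
}

// Intersection test using only string comparisons, as a text store would.
function intersects(box, query) {
  return box.minLon <= query.maxLon && box.maxLon >= query.minLon &&
         box.minLat <= query.maxLat && box.maxLat >= query.minLat;
}

// Raw text ordering is wrong for negatives: -104.9 is greater than -105.1 numerically.
const naiveOrderIsWrong = "-104.9" < "-105.1";

const stored = [
  encodeBox("boxA", -105.1, 39.6, -104.6, 39.9), // made-up sample boxes
  encodeBox("boxB", -103.2, 38.9, -102.0, 39.6),
];
const queryBox = encodeBox("query", -105.0, 39.0, -104.0, 40.0);
const hits = stored.filter((box) => intersects(box, queryBox)).map((box) => box.name);

const encodedLon = encodeCoord(-104.91207, LON_OFFSET);
const roundTrip = decodeCoord(encodedLon, LON_OFFSET);
console.log(encodedLon, roundTrip, hits, naiveOrderIsWrong);
```

The encode and decode steps are exactly the text to double conversions mentioned above, so timing them over a realistic number of features would be a quick way to check whether they really are the bottleneck.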

Utility computing has an exciting future in the geospatial realm - thank you Amazon and Xen.