Main.CarolinasCoastLite r1.8 - 08 Mar 2007 - 13:44 - JeremyCothran

Carolinas Coast : A lite approach to the backend

This is only etherware and is intended to be fodder for USC roundtable discussion.

Related links

The bigger picture

This thought process is mainly geared around two current high-priority CC objectives: (1) reduce obs update delay; (2) make the entire CC package easier to customize. In a nutshell, this approach will eliminate the need for a database and rely only on MapServer, Perl, PHP, and shapefiles. It will also be mostly configurable by an administrator by editing XML files.

Before getting into the nitty gritty, here are what I consider the:

  • possible advantages
    • does not require a database
    • all information retrieved via WMS and/or WFS
    • addition of new obs site only a matter of creating a new .xml
    • low overhead
      • Perl and MapServer (handles all spatial queries, including the this_hazard_contains_this_obs check) for backend
      • PHP for frontend
  • possible disadvantages
    • Might not work well for HUGE domains with lots of obs points. However, this is supposed to be a regional effort, and since we have only been required to keep the last few days of data, this scalability issue should not rear its ugly head.

Finally, the issues not immediately addressed are:

  • tides
  • forecasts
  • station_list
  • map customization (namely zoom areas)

Nuts and bolts

Backend data storage

obs_site info (must be created by weather office host)

  • e.g. cc_sun3.xml
  • type = xml
  • name = station_id.xml
  • 1 file per obs_site
  • contents
    • normal metadata including lon,lat
    • commands to get real-time data
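
As a rough illustration of what one per-station file could hold and how the backend might read it, here is a minimal Python sketch. The element names (obs_site, station_id, lon, lat, fetch_command) and the fetch command itself are assumptions, since no schema has been settled; the station id and coordinates are the NDBC 41002 values used elsewhere on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for a per-station file; element names are assumptions.
SAMPLE_OBS_SITE = """<obs_site>
  <station_id>ndbc_41002_met</station_id>
  <lon>-75.35</lon>
  <lat>32.31</lat>
  <fetch_command>fetch_latest_obs.pl ndbc_41002_met</fetch_command>
</obs_site>"""


def parse_obs_site(xml_text):
    """Return the station metadata and fetch command as a dict."""
    root = ET.fromstring(xml_text)
    return {
        "station_id": root.findtext("station_id"),
        "lon": float(root.findtext("lon")),
        "lat": float(root.findtext("lat")),
        # Command the collection routine would run to get real-time data
        # (fetch_latest_obs.pl is a made-up name).
        "fetch_command": root.findtext("fetch_command"),
    }


site = parse_obs_site(SAMPLE_OBS_SITE)
print(site["station_id"], site["lon"], site["lat"])
```

Adding a new obs site would then mean dropping one more file of this shape into the directory that get_data() scans.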

observations data (automatically generated from *.xml's)

  • e.g. observations.[shp|shx|dbf]
  • type = point shapefile
  • name = observations.[shp|shx|dbf]
  • 1 file
  • contents
    • a postgis-like table containing a station_id, time_stamp, val0, val1, val2, ..., latest?, hazard
      • contain values / blanks for all possible obs types across entire CC sensor suite
      • are 'processed' in that they are in the final units required for display
      • are ordered

hazards (automatically generated by hazard collection routine)

  • e.g. hazards.[shp|shx|dbf]
  • type = polygon shapefile
  • name = hazards.[shp|shx|dbf]
  • 1 file
  • contents
    • hazard, geometry, text, times
    • ordered by priority

Backend data collection

get_data()

get_hazards()

  • append new hazards to hazards shapefile using existing routines
  • clean out expired hazards

update_obs_with_hazards()

  • loop through each hazard (prioritized somehow)
    • loop through each obs
      • if (this_obs_not_already_marked_with_a_hazard && this_hazard_contains_this_obs) => mark this_obs with this hazard
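
The nested loop above can be sketched in Python as follows. In the proposal MapServer would do the containment test; the ray-casting function here is only a stand-in so the logic can be run without it. The dict keys ("priority", "ring", "hazard") and the rule that a lower priority number wins are assumptions.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting containment test; ring is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def update_obs_with_hazards(obs_rows, hazards):
    """Mark each obs with the highest-priority hazard that contains it."""
    # Assumption: a lower "priority" number means a more important hazard.
    for hazard in sorted(hazards, key=lambda h: h["priority"]):
        for obs in obs_rows:
            if obs["hazard"] is None and point_in_polygon(
                obs["lon"], obs["lat"], hazard["ring"]
            ):
                obs["hazard"] = hazard["name"]
    return obs_rows


hazards = [
    # Large illustrative polygon, lower priority.
    {"name": "special_marine", "priority": 2,
     "ring": [(-80.0, 30.0), (-74.0, 30.0), (-74.0, 37.0), (-80.0, 37.0)]},
    # Same corners as the 'group1' zoom box on this page, higher priority.
    {"name": "tornado", "priority": 1,
     "ring": [(-78.21, 36.53), (-75.33, 36.53), (-75.33, 34.31), (-78.21, 34.31)]},
]
obs_rows = [
    {"station_id": "inside_box", "lon": -77.0, "lat": 35.0, "hazard": None},
    {"station_id": "ndbc_41002_met", "lon": -75.35, "lat": 32.31, "hazard": None},
]
for row in update_obs_with_hazards(obs_rows, hazards):
    print(row["station_id"], row["hazard"])
```

Because hazards are visited in priority order and an obs is only marked once, the first station keeps "tornado" even though the larger polygon also contains it.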

Frontend interface update

update_cc_maps()

  • observations
    • get data and hazard
      • loop through obs_site .xml's to populate web page header tables
      • get info from WFS query that will hit the observations.shp and pick off rows where latest
      • match the two above results to create the popup boxes
    • get map
      • ping the WMS which will use the observations.shp among other WMS layers to make maps
    • hazards more info
      • get map
        • ping the WMS which will use observations.shp to make yellow dot & hazards.shp to shade the particular hazard
      • get text
        • ping the WFS which will use hazards.shp to get the hazard text (only one; don't remember how this is done!)
  • hazards
    • get map
      • ping the WMS to display everything in hazards.shp
    • get text
      • ping the WFS which will use hazards.shp to display the hazard type text (could be many)
  • tides
    • leaving alone for now
  • forecasts
    • leaving alone for now
  • station_list
    • leaving alone for now

update data once every 30 minutes
  get_data();
  update_obs_with_hazards();
  update_cc_maps();
  
update hazards every 10 minutes
  get_hazards();
  update_obs_with_hazards();
  update_cc_maps();
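
One way to combine the two schedules in a single driver is sketched below in Python. It assumes the driver is invoked once a minute (for example from cron) with a minute counter, and that when both schedules fall on the same minute the shared steps should run only once; neither assumption is decided in the notes above.

```python
def steps_due(minute):
    """Return the backend steps to run at this minute, in order."""
    steps = []
    if minute % 30 == 0:
        steps.append("get_data")      # obs refresh, every 30 minutes
    if minute % 10 == 0:
        steps.append("get_hazards")   # hazard refresh, every 10 minutes
    if steps:
        # Both schedules end with the same two steps, so run them once.
        steps += ["update_obs_with_hazards", "update_cc_maps"]
    return steps


for minute in (0, 5, 10, 30):
    print(minute, steps_due(minute))
```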

Notes

Obs

  • NWS includes FAA and DOD
  • NOS includes Caro-COOPS and CORMP
  • NDBC : DODS interface for NDBC data, http://dods.ndbc.noaa.gov/. Doesn't seem to be complete.

Hazards

  • Eventually, the weather office host will create hazard .xml files to point to where hazards need to be downloaded from.
  • Experimental XML Feeds and Web Displays of Watches, Warnings, and Advisories are available as XML feeds from http://www.nws.noaa.gov/alerts/.
    • The National Weather Service provides access to watches, warnings and advisories for land areas, and for hurricane watches and warnings, via RSS and CAP/XML to aid the automated dissemination of this information. Planning is in progress to extend this to marine warnings.
  • Experimental National Warning GIS Shapefiles from http://www.prh.noaa.gov/regsci/gis/shapefiles/.
    • Only warnings: Tornado, Severe Thunderstorm, Flash Flood, and Special Marine

Forecasts

JeremyCothran Notes

routine procedures

changing zoom boxes

sea_coos_obs=# update nws_carolinas_coast_zoom_boxes set the_geom = 'POLYGON((-78.21 36.53, -75.33 36.53, -75.33 34.31, -78.21 34.31, -78.21 36.53))' where id = 'group1';

adding a platform/station

#note that we could probably make this simpler by substituting the xenia platform table lookup instead of the nws_carolinas_coast_obs_sites table

#ignore descrip column and case sensitivity on station__id

insert into nws_carolinas_coast_obs_sites (descrip,station_id,station__1,institutio,title,lon,lat,the_geom) values ('blah','ndbc','ndbc_41002_met','National Data Buoy Center','National Data Buoy Center Real-Time Station Data 41002',-75.35,32.31,makepoint(-75.35,32.31));

####
insert into platform (organization_id,type_id,short_name,platform_handle,fixed_longitude,fixed_latitude,long_name,description,url) values (10,4,'NDBC_41002','ndbc_41002_met',-75.35,32.31,'NDBC Station 41002 - S HATTERAS - 250 NM East of Charleston, SC','NDBC Station 41002 - S HATTERAS - 250 NM East of Charleston, SC','http://ndbc.noaa.gov/station_page.php?station=41002');

#472 is the new platform_id for 41002
#just adding the same sensors as 41004 (platform_id=419) - unused sensors are ignored
insert into sensor (platform_id,type_id,short_name,m_type_id,s_order) select 472,type_id,short_name,m_type_id,s_order from sensor where platform_id = 419;

####
#added to nautilus:/usr2/prod/buoys/perl/carolinas_stations.txt   
ndbc_41002_met<SEP>NDBC_41002<SEP>http://ndbc.noaa.gov/station_page.php?station=41002<SEP>NDBC Station 41002 - S HATTERAS - 250 East of Charleston, SC

#susanna's side
Here's documentation of what I did to add it:

1. Saved current default map to hard drive (basemap_cities.png).
2. Created a blank html document in Dreamweaver.
3. Put default map image in blank document and used image map tool to draw a box over new obs site.
4. Added link and alt info for new obs site.
5. Copied and pasted area element into main image map ("imgmap") in carolinascoast.php.

Same steps would be followed for adding a new obs site to a sub-map.

development

PostGIS specific functions

  • platform/station map - yellow dots for locations generated from PostGIS table
  • forecasts
  • tide gauges - locating nearest using radius function
  • air pressure map ? - this may be a straight file conversion from grib to image using GMT and not using database at all
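
If the database goes away, the nearest-tide-gauge lookup would need a replacement for the PostGIS radius function. A minimal Python sketch using great-circle distance is below. The gauge ids and coordinates are made-up placeholders, and the haversine formula on a spherical Earth is a substitute technique, not what PostGIS computes internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_gauge(lon, lat, gauges, radius_km):
    """Return (gauge_id, distance_km) of the closest gauge within radius_km, else None."""
    best = None
    for gauge in gauges:
        d = haversine_km(lon, lat, gauge["lon"], gauge["lat"])
        if d <= radius_km and (best is None or d < best[1]):
            best = (gauge["id"], d)
    return best


# Placeholder gauges; real entries would come from the tide gauge list.
gauges = [
    {"id": "gauge_a", "lon": -75.35, "lat": 32.41},
    {"id": "gauge_b", "lon": -75.35, "lat": 33.31},
]
print(nearest_gauge(-75.35, 32.31, gauges, radius_km=50.0))
```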

using non-geospatially enabled relational database (aka MySQL) and file (shapefile, CSV) based methods

MySQL

I've set up MySQL version 4.1.14 on the neptune server (the same server as the new Carolinas Coast stuff). Contact me to get admin access. A beginning intro to using MySQL is at http://dev.mysql.com/tech-resources/articles/mysql_intro.html

neptune jcothran # mysql -u root --password
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 11 to server version: 4.1.14-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use db_multi_obs_delta
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+------------------------------+
| Tables_in_db_multi_obs_delta |
+------------------------------+
| app_catalog                  |
| platform                     |
+------------------------------+
2 rows in set (0.00 sec)

In converting a few sample postgres tables to mysql I've noted the following so far

  • 'serial' datatype keys, which I use as the primary key on all my tables, are made primary by default in MySQL (no need to tell MySQL explicitly to make them the primary key)
  • ignoring/removing postgres permission/grant settings for now
  • 'table only' statements should be shortened to just 'table'
  • 'timestamp' datatype defaults to some strange initial values and doesn't support 'without time zone' clause

Still need to set up the other tables, establish foreign keys, populate, and see whether any of the DBI calls error out using the MySQL driver.

OGR techniques

I didn't realize that the OGR driver supported the following two useful methods, which I'll be taking advantage of for:

  • high-volume gridded data(model output, hf radar, quikscat, hourly maps)

OGR shapefile driver allows you to SQL index/query a shapefile. Handle high-volume data as sets of shapefiles instead of housing data internal to the database. This gives a readily accessible/usable file archive and prevents the data maintenance overhead associated with a relational database approach.

#ogr shp
http://www.remotesensing.org/gdal/ogr/ogr_sql.html
http://mapserver.gis.umn.edu/data2/wilma/mapserver-users/0304/msg00371.html
http://lists.maptools.org/pipermail/gdal-dev/2005-January/004903.html
http://gdal.maptools.org/ogr/drv_shapefile.html
http://shapelib.maptools.org

  • simpler data source approaches(CSV files, non-geospatially enabled simpler relational databases via ODBC)

#ODBC Virtual Spatial Data
http://mapserver.gis.umn.edu/docs/reference/vector_data/VirtualSpatialData

streamlining xml output and feed

To emulate the earlier Carolinas Coast data feed, we did a couple of extra things that we could probably drop or streamline. I'll probably do the main steps here as part of the Xenia development.

  • create an XML feed of the latest observations directly from the multi_obs table. This should remove the steps used to emulate the earlier WFS feed. It also moves away from observation-specific columns (sst_fahrenheit) and string label values ('79 F at 4 m') and toward an XML feed that lists observations, each with specific elements for observation_type, value, unit of measure, and elevation (a more abstract, less hard-coded approach).
  • modify the carolinas coast read from xml to html tables to use the above new xml format

   nemo:/usr2/home/scscout/cc
   drop 
       neptune:db_multi_obs_carolinas_test_aux
       mk_sql_for_latest_obs_by_station_id.pl

   alter to add/update elements using XML::XPath to latest xml
       mk_sql_for_hazard_trip.pl
       mk_sql_for_hazard_updates.pl
   convert units and dataURL units from SI to English
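
The more abstract feed described in the first bullet above could look roughly like what this Python sketch emits. The element and attribute names (latest_observations, observation, uom, platform_handle) are assumptions, as are the sample values; only the idea of one generic observation element per measurement comes from the notes.

```python
import xml.etree.ElementTree as ET

FIELDS = ("observation_type", "value", "uom", "elevation")


def build_latest_obs_feed(platform_handle, rows):
    """Serialize the latest observations for one platform as generic XML."""
    root = ET.Element("latest_observations", platform_handle=platform_handle)
    for row in rows:
        obs = ET.SubElement(root, "observation")
        # One child element per generic field instead of per-variable
        # columns such as sst_fahrenheit.
        for field in FIELDS:
            ET.SubElement(obs, field).text = str(row[field])
    return ET.tostring(root, encoding="unicode")


# Placeholder row; real rows would be selected from the multi_obs table.
rows = [
    {"observation_type": "wind_gust", "value": 5.2, "uom": "m_s-1", "elevation": 4},
]
print(build_latest_obs_feed("ndbc_41002_met", rows))
```

The reader on the Carolinas Coast side could then build its display labels from these elements rather than parsing pre-formatted strings.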

streamlining/dropping tables

The role of the neptune:sea_coos_obs:nws_carolinas_coast_obs_sites table in producing yellow dots on the main map could probably be taken over by the xenia 'platform' table, which seems to carry the same info.

archival/time series

I'll be working towards an archival setup of the data tables (2 weeks, older than 2 weeks) and modifying the existing time series graphing scripts to leverage this xenia setup.

UPDATE: February 6, 2007: The following scripts should enable time series graphing for a xenia instance.

get_graph.php calls the Perl script http://nautilus.baruch.sc.edu/xenia/graph/graphSingleLine.pl, which calls the files under http://carocoops.org/xenia/graph (ignore the 'old' folder). These files currently live at nautilus:/var/www/html/xenia and require that gnuplot (http://gnuplot.info) be installed on the server.

graph/time series corrections

  • add unit_conversion=en to graph links to provide a default English unit of measure for graphs
  • wind from direction is shown for wind speed; several vars like wind gust show no data; and vars like surface temp or current are not available as a link
  • may want to suppress unit_conversion=en for air pressure, as the English unit differs from the SI millibars presented

Generally speaking, the graph product for now only accommodates single-line plots.

At some point we could develop a list of sensor_id's to pass to a multi-line plot, one for variables of the same type and another for variables of different types. The graph.xml file could continue to be used for help in configuring graphing options.

The database mechanisms that support these more complex groupings in Xenia version 2 (http://nautilus.baruch.sc.edu/twiki_dmcc/bin/view/Main/XeniaPackageV2) are m_type -> m_scalar_type, which lets you associate several scalar values with one m_type (for example, 'winds' rows should always list wind_speed, wind_from_direction, and wind_gust), and collection_id, which is a general way of grouping a set of rows on the multi_obs, sensor, or platform table.

A future WFS request could fetch speed and direction in a single request if the m_type is defined as in the first example. Alternatively, a view joining the multi_obs and sensor tables could include the collection_id, which would then be used to get the collection of sensor_id's and their labels.

a description of the graph.xml fields

To add more observation types to the graph.xml file use the following notes

        <observation m_type_id="2">
                <standard_name>wind_gust</standard_name>
                <standard_uom>m_s-1</standard_uom>
                <standard_uom_en>knots</standard_uom_en>

                <range_min>0</range_min>
                <range_max>45</range_max>
                <title>Wind Gust</title>
                <y_title>meters/second</y_title>
                <with_clause>lines lt 1</with_clause>
                <break_interval>7300</break_interval>
                <size_x>600</size_x>
                <size_y>300</size_y>
        </observation> 

  • m_type_id should be same as listing in m_type table
  • standard_uom is the SI uom in the obs_type_id lookup (use underscores for spaces)
  • standard_uom_en is the English uom in the obs_type_id lookup (use underscores for spaces)
  • range_min, range_max are min/max measurement values in SI units (standard_uom)
  • title, y_title are labels - try to stay consistent with other examples in graph.xml listing
  • with_clause is the linetype used by gnuplot - always 'lines lt 1' for connected line unless looking for something different like wind_from_direction
  • break_interval is the number of seconds between observations greater than which a line break is created
  • size_x, size_y are the pixel size of the final image
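
To make the break_interval field concrete, here is a small Python sketch that splits a time series into segments wherever the gap between consecutive observations exceeds break_interval seconds. How graphSingleLine.pl actually implements the break is not shown on this page, so this is only an illustration; one common way to get the same effect in gnuplot is to write a blank line between segments in the data file.

```python
def split_on_gaps(samples, break_interval):
    """Split (epoch_seconds, value) samples, sorted by time, at large gaps."""
    segments = []
    current = []
    prev_time = None
    for time, value in samples:
        # A gap longer than break_interval closes the current line segment.
        if prev_time is not None and time - prev_time > break_interval:
            segments.append(current)
            current = []
        current.append((time, value))
        prev_time = time
    if current:
        segments.append(current)
    return segments


# Placeholder series: hourly values, then a gap of 12800 seconds.
samples = [(0, 1.0), (3600, 2.0), (7200, 3.0), (20000, 4.0)]
for segment in split_on_gaps(samples, break_interval=7300):
    print(segment)
```

With break_interval set to 7300 as in the wind_gust example, hourly data with one missed report (a 7200 second gap) still plots as one connected line, while two or more missed reports break the line.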

derived measurements

Component-based measurement types (surface current as u,v components, for example) need to be populated concurrently with their derived variations, such as current_speed and current_to_direction. This could be done with a trigger or with regular processing (say every 10 minutes) that checks for new measurements and their m_type and produces the derived measurement(s). A similar concept could apply to 12-hour averages or other aggregate derived measurements.

A suggested solution for surface/bottom currents is a cron job that every 10 minutes checks for rows of the eastward_current m_type added in the past 10 minutes (using multi_obs.row_entry_date). For each such row, match it to its corresponding northward_current row using the same platform_handle and depth, derive current_speed and current_to_direction, and insert them as new rows on the multi_obs table.
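
The derivation step can be sketched in Python as below. Rows are plain dicts standing in for multi_obs rows. The "m_date" key (observation time) is an assumed name, and matching on it in addition to platform_handle and depth is an assumption added so that components from different times are not paired.

```python
import math


def derive_current(eastward, northward):
    """Return (current_speed, current_to_direction) from u, v components.

    Direction is in degrees clockwise from north, toward which the water flows.
    """
    speed = math.hypot(eastward, northward)
    direction = math.degrees(math.atan2(eastward, northward)) % 360.0
    return speed, direction


def derive_rows(east_rows, north_rows):
    """Pair eastward/northward rows and build the derived rows to insert."""
    north = {
        (r["platform_handle"], r["depth"], r["m_date"]): r["value"]
        for r in north_rows
    }
    derived = []
    for r in east_rows:
        key = (r["platform_handle"], r["depth"], r["m_date"])
        if key not in north:
            continue  # no matching northward_current yet; pick it up next run
        speed, direction = derive_current(r["value"], north[key])
        base = {"platform_handle": key[0], "depth": key[1], "m_date": key[2]}
        derived.append({**base, "m_type": "current_speed", "value": speed})
        derived.append({**base, "m_type": "current_to_direction", "value": direction})
    return derived
```

A current with no northward component and a positive eastward component comes out as direction 90, and a purely southward one as 180.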
Copyright © 1999-2008 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.