Hi Bob,
I thought it might be useful to elaborate on the geoareas.
My hope is that these could be incorporated into the DarwinCore so that an Area would look something like this.
<dwc:Area about="geo:41.53000000,-70.67000000;u=100">
  <rdf:type resource="http://purl.org/dc/terms/Location"/>
  <dwc:georeferenceMethod resource="http://rs.tdwg.org/dwc/terms/index.htm#GeoMethod_GoogleMaps"/>
  <dwc_area:areaWithInFeature rdf:resource="http://sws.geonames.org/4929772/"/>
  <wdrs:describedby rdf:resource="http://my_organization.com/occurrence/123.rdf"/>
</dwc:Area>
The ietf.org came up with the following proposal for dealing with location data.
http://tools.ietf.org/html/rfc5870
What I am proposing is that we incorporate a subset of this into the DarwinCore.
That is, the latitude and longitude and an uncertainty in meters (radius).
For now Virtuoso does not understand that this is a mappable thing so I still include the regular geo:lat and geo:long in my mapping examples.
OK, so what is the main advantage?
If I create URIs for insect collection areas in my namespace (http://lod.taxonconcept.org/#####)
and Steve has plant collections from those same geographic locations that he identifies in his namespace (http://bioimages.org/#####),
our two datasets, about the same place, are not linked.
But if Steve marks up his data with the geoarea dwc:Area resource="geo:41.53000000,-70.67000000;u=100" it would be possible for a consuming service to see what observations have identical locations or overlap.
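A minimal sketch of how a consuming service might test two geoareas for overlap. The helper names (parse_geoarea, distance_m, areas_overlap) are hypothetical, the Earth is treated as a sphere of radius 6,371 km, and two areas are assumed to "overlap" when the distance between their centers is no more than the sum of their radii.

```python
import math

def parse_geoarea(uri):
    """Split 'geo:lat,lon;u=radius' into floats (hypothetical helper)."""
    body = uri[len("geo:"):]
    coords, _, params = body.partition(";")
    lat, lon = (float(x) for x in coords.split(",")[:2])
    radius = 0.0  # assumption: a missing 'u' is treated as a zero radius
    for p in params.split(";"):
        if p.startswith("u="):
            radius = float(p[2:])
    return lat, lon, radius

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine formula, spherical Earth)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def areas_overlap(a, b):
    """True when the two point-radius circles touch or intersect."""
    lat1, lon1, r1 = parse_geoarea(a)
    lat2, lon2, r2 = parse_geoarea(b)
    return distance_m(lat1, lon1, lat2, lon2) <= r1 + r2
```

Identical geoarea strings can of course be matched without any of this, by plain string comparison; the distance test only matters for the "overlap" case.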
A link back to the RDF that makes a statement about a particular geoarea is included using the *wdrs* vocabulary.
This allows a consumer to find the original RDF that contained statements about a particular geoarea. (provenance)
Does this make sense?
It is likely that what I am proposing will need to be tweaked in some way that I have not anticipated.
I would like them to still work as a subset of the more complicated ietf.org proposal.
That is, tools and services that understand the ietf.org standard will be able to consume the geoareas.
These already work as URNs in a triple/quadstore, allowing occurrences tied to the same GPS reading to be linked together.
If these catch on I suspect Virtuoso and others will start supporting them so that entities like "geo:41.53000000,-70.67000000;u=100" are seen as mappable things and processed accordingly.
This does not offer everything that PointRadiusSpatialFit and coordinateUncertaintyInMeters provide, but I think that these areas are more efficient, easier for people to understand and more likely to be adopted. I think we should keep PointRadiusSpatialFit and coordinateUncertaintyInMeters in the DarwinCore for those who want to use them.
Standardizing on a fixed number of decimal places makes the standard easier for producers to implement, and the areas easier to compare as simple strings and URNs.
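A minimal sketch of what a producer would have to do to emit standardized geoarea strings. The function name make_geoarea is hypothetical; the choice of 8 decimal places and a whole-meter radius is an assumption taken from the example strings in this thread, not from any adopted standard.

```python
from decimal import Decimal

def make_geoarea(lat, lon, radius_m, places=8):
    """Format a geoarea string with a fixed number of decimal places.

    lat and lon may be numbers or numeric strings; places=8 and the
    rounding of the radius to whole meters are assumptions.
    """
    lat_s = format(Decimal(str(lat)), f".{places}f")
    lon_s = format(Decimal(str(lon)), f".{places}f")
    return f"geo:{lat_s},{lon_s};u={int(round(radius_m))}"

# Differently written inputs for the same reading yield one identical string.
print(make_geoarea(41.53, -70.67, 100))        # geo:41.53000000,-70.67000000;u=100
print(make_geoarea("41.530", "-70.6700", 100)) # geo:41.53000000,-70.67000000;u=100
```

Because the output is a fixed-width rendering of the same numbers, two producers who follow the same rule get byte-identical strings, which is what makes the simple string comparison work.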
The current ontology is here and I am open to having it tweaked and improved.
http://lod.taxonconcept.org/ontology/dwc_area.owl
OntDoc http://lod.taxonconcept.org/ontology/dwc_area_doc/index.html
* I was thinking that it might make sense to add support for linking to Yahoo Placefinder's Where On Earth ID (WOEID) in addition to Geonames
http://developer.yahoo.com/geo/placefinder/
Respectfully,
- Pete
On Sun, Feb 27, 2011 at 12:14 PM, Bob Morris morris.bob@gmail.com wrote:
Ah, well, the point I was raising was only about geo as a URI scheme, the subject of the now adopted IETF RFC 5870. To my mind this raises several distinguishable questions (whose answers may not be independent):
- What mappings are there to other georeferencing schemes with richer semantics? RFC 5870 offers a non-normative one to parts of GML. Pete and you seem to be discussing possible mappings to DwC. At the very least, I would agree with any subtext in that discussion that such a mapping and best practices for its use be the subject of a TDWG applicability statement or other non-normative document.
- The questions implicit in a recent posting of Steve Baskauf:
a. To the extent that LOD or other uses of RDF require or urge an http URI, what provides the mapping of the now approved IANA geo URI scheme to the IANA http URI scheme, and what should the http service status values be? (Maybe service status is a separate question. But LOD as used seems to depend in practice on conventions about http as a service protocol, not just as a URI scheme.)
b. What defines the (semantics of the) dereferencing of a geo URI? http://tools.ietf.org/html/rfc5870#section-5 is rather spare on this point, but has the warning: "Currently, just one operation on a 'geo' URI is defined - location dereference: in that operation, a client dereferences the URI by extracting the geographical coordinates from the URI path component <geo-path>. Further use of those coordinates (and the uncertainty value from <uval>) is then up to the application processing the URI, and might depend on the context of the URI."
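A minimal sketch of the "location dereference" operation quoted above, that is, extracting the coordinates and the uncertainty from the URI itself. The names GeoLocation and dereference_geo are hypothetical, and this is a simplified reading of the geo URI syntax: parameters other than "u" (such as crs) are ignored and no range checks are made.

```python
from typing import NamedTuple, Optional

class GeoLocation(NamedTuple):
    lat: float
    lon: float
    alt: Optional[float]            # third coordinate, if present
    uncertainty_m: Optional[float]  # value of the 'u' parameter, if present

def dereference_geo(uri: str) -> GeoLocation:
    """Extract coordinates and uncertainty from a geo URI (simplified)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "geo":
        raise ValueError(f"not a geo URI: {uri!r}")
    path, *params = rest.split(";")
    coords = [float(c) for c in path.split(",")]
    if len(coords) not in (2, 3):
        raise ValueError(f"expected 2 or 3 coordinates: {uri!r}")
    alt = coords[2] if len(coords) == 3 else None
    uncertainty = None
    for p in params:
        name, _, value = p.partition("=")
        if name.lower() == "u" and value:
            uncertainty = float(value)
    return GeoLocation(coords[0], coords[1], alt, uncertainty)

loc = dereference_geo("geo:41.53000000,-70.67000000;u=100")
print(loc.lat, loc.lon, loc.uncertainty_m)  # 41.53 -70.67 100.0
```

Note that nothing here touches the network: what the application then does with the extracted values is left open, as the quoted passage says.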
Bob
On Sun, Feb 27, 2011 at 11:05 AM, John Wieczorek tuco@berkeley.edu wrote:
The number of digits given is definitely not a good substitute for this for many reasons, just one of which is that the original may have been captured in a different coordinate system (such as degrees decimal minutes - the most precise coordinate system other than UTM or other meter-based systems when recording data from a GPS) and then converted to decimal degrees, where the number of significant digits then becomes meaningless.
Happily, the Darwin Core also has a dwc:coordinatePrecision term (http://rs.tdwg.org/dwc/terms/index.htm#coordinatePrecision), which can say explicitly what the level of precision is in the coordinates given. The dwc:coordinateUncertaintyInMeters (http://rs.tdwg.org/dwc/terms/index.htm#coordinateUncertaintyInMeters) is supposed to account for all sources of uncertainty in the coordinates given, including GPS accuracy and coordinate precision.
For the sake of completeness, there are the terms dwc:pointRadiusSpatialFit (http://rs.tdwg.org/dwc/terms/index.htm#pointRadiusSpatialFit) to capture analytically how well the point-radius given matches the actual uncertainty of the coordinates given (in case someone artificially adds uncertainty), the dwc:georeferenceProtocol (http://rs.tdwg.org/dwc/terms/index.htm#georeferenceProtocol) to explain the method used to georeference, and the dwc:dataGeneralizations (http://rs.tdwg.org/dwc/terms/index.htm#dataGeneralizations) to explain what was done to the georeference post-facto to obscure it.
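To illustrate the point above about coordinates converted from degrees decimal minutes, here is a small sketch. The function name ddm_to_decimal_degrees and the sample readings are hypothetical; the only fact used is that one degree is 60 minutes.

```python
from decimal import Decimal

def ddm_to_decimal_degrees(degrees, minutes, negative=False):
    """Convert degrees + decimal minutes to decimal degrees.

    Decimal arithmetic is used so the digits shown come from the
    conversion itself, not from binary floating-point rounding.
    """
    value = Decimal(degrees) + Decimal(str(minutes)) / Decimal(60)
    return -value if negative else value

# A reading recorded to a tenth of a minute happens to convert cleanly.
print(ddm_to_decimal_degrees(41, "31.8"))    # 41.53

# A reading recorded to a thousandth of a minute converts to a long
# repeating decimal; its digit count says nothing about the precision
# of the original reading.
print(ddm_to_decimal_degrees(41, "31.805"))
```

The two-decimal result in the first case comes from a reading with a resolution of a tenth of a minute, while the many-digit result in the second comes from a reading with only three decimal places of minutes, so the digit count of the converted value cannot stand in for dwc:coordinatePrecision or dwc:coordinateUncertaintyInMeters.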
On Sat, Feb 26, 2011 at 4:04 PM, Peter DeVries pete.devries@gmail.com wrote:
Hi Bob,
Yes, an estimate of the precision / extent should be recorded by the original observer. This has been repeated several times, and it is interesting that even TDWG did not incorporate this into their data collection.
What I was proposing was a specific extension to the ietf proposal for occurrence records. It adds something very similar to pointRadiusSpatialFit to a latitude and longitude.
By standardizing on the significant digits we gain something even before there is general software support for this standard: records with "geo:41.53000000,-70.67000000;u=100" share an equivalent URN, while "geo:41.53000000,-70.67000000;u=100" and "geo:41.53,-70.67;u=100" do not. That allows those records to be linked within a triple or quadstore.
As in this earlier example, here is a browsable view of one of the areas: http://bit.ly/hBtVFL
http://lsd.taxonconcept.org/describe/?url=geo:41.53000000,-70.67000000;u%3D1...
Without doing anything other than standardizing on the number of digits, occurrences attached to the same GPS reading are linked in both a triple store and a Google search, whereas software needs to be written that interprets "geo:41.53000000,-70.67000000;u=100" and "geo:41.53,-70.67;u=100" as equivalent. Try Googling "geo:44.86294500,-87.23120400;u=10".
If the ietf.org standard is supported in future versions of Virtuoso and other tools, then we would not need to include the redundant use of geo:lat and geo:long for the dynamic maps.
- Pete
On Sat, Feb 26, 2011 at 1:01 PM, Bob Morris morris.bob@gmail.com wrote:
Your arguably reasonable recoding of the geo URIs of your example illustrates an issue on which so much metadata is silent: provenance. Once exposed, it is probably impossible for someone to know how the uncertainty (or any other data that might be the subject of opinion or estimate) was determined and whether the data is fit for some particular purpose, e.g. that the species were observed near each other.
BTW, the IETF geo proposal was adopted in 2010, in the final form given at http://tools.ietf.org/html/rfc5870 . One interesting point is http://tools.ietf.org/html/rfc5870#section-3.4.3 which says "Note: The number of digits of the values in <coordinates> MUST NOT be interpreted as an indication to the level of uncertainty." The section following is also interesting, albeit irrelevant for your procedure. It implies that when uncertainty is omitted (and therefore unknown), then "geo:41.53000000,-70.67000000" and "geo:41.53,-70.67" identify the same geo resource.
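A minimal sketch of the numeric comparison implied here, as opposed to plain string comparison. The names geo_key and same_geo_resource are hypothetical, and this is a simplified reading of the RFC 5870 comparison rules: it compares coordinates and the "u" value numerically and ignores crs, other parameters, and the special cases at the poles and at longitude 180.

```python
from decimal import Decimal

def geo_key(uri):
    """Reduce a geo URI to (coordinates, uncertainty) as exact decimals."""
    rest = uri[len("geo:"):]
    path, *params = rest.split(";")
    coords = tuple(Decimal(c) for c in path.split(","))
    uncertainty = None  # None stands for an omitted, hence unknown, 'u'
    for p in params:
        name, _, value = p.partition("=")
        if name.lower() == "u":
            uncertainty = Decimal(value)
    return coords, uncertainty

def same_geo_resource(a, b):
    """True when both URIs carry numerically identical values."""
    return geo_key(a) == geo_key(b)

a = "geo:41.53000000,-70.67000000"
b = "geo:41.53,-70.67"
print(a == b)                   # False: the strings differ
print(same_geo_resource(a, b))  # True: the numbers are identical
```

This is the extra step a consumer needs when producers do not standardize the number of digits; with standardized strings, the plain `==` on the URIs is enough.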
Bob Morris
On Wed, Feb 23, 2011 at 4:56 PM, Peter DeVries <pete.devries@gmail.com> wrote:
[...]
- I added in my proposed "area" so that it is easy to see what species were observed near each other. Since there was no measure of radius in these longitudes and latitudes, I made the radius 100 meters. Normally I would estimate the radius for a GPS reading to be within 10 meters, but some of these observations were made where the GPS reading was taken and the readings were given only to two decimals.
Area = lat, long; radius in meters following the ietf proposal but with the precision of the lat and long standardized, for example "geo:41.53000000,-70.67000000;u=100" [...]
-- Robert A. Morris Emeritus Professor of Computer Science UMASS-Boston 100 Morrissey Blvd Boston, MA 02125-3390 Associate, Harvard University Herbaria email: morris.bob@gmail.com web: http://efg.cs.umb.edu/ web: http://etaxonomy.org/mw/FilteredPush http://www.cs.umb.edu/~ram phone (+1) 857 222 7992 (mobile)
--
Pete DeVries Department of Entomology University of Wisconsin - Madison 445 Russell Laboratories 1630 Linden Drive Madison, WI 53706 TaxonConcept Knowledge Base / GeoSpecies Knowledge Base About the GeoSpecies Knowledge Base
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content