[tdwg-content] data provenance; was Re: Updated TDWG BioBlitz RDF Example with Pivot View and Data Browsing

Bob Morris morris.bob at gmail.com
Sun Feb 27 19:14:15 CET 2011


Ah, well, the point I was raising was only about geo as a URI scheme, the
subject of the now adopted IETF RFC 5870. To my mind this raises several
distinguishable questions (whose answers may not be independent):

1. What mappings are there to other georeferencing schemes with richer
semantics? RFC 5870 offers a non-normative one to parts of GML.
Pete and you seem to be discussing possible mappings to DwC.  At the very
least, I would agree with any subtext in that discussion that such a mapping
and best practices for its use be the subject of a TDWG applicability
statement or other non-normative document.

2. The questions implicit in a recent posting of Steve Baskauf:
a. To the extent that LOD or other uses of RDF require or urge an http URI,
what provides the mapping of the now approved IANA geo URI scheme to the
IANA http URI scheme, and what should the http service status values be?
(Maybe service status is a separate question. But LOD as used seems to
depend in practice on conventions about http as a service protocol, not just
as a URI scheme.)
b. What defines the (semantics of the) dereferencing of a geo URI?
http://tools.ietf.org/html/rfc5870#section-5 is rather spare on this point,
but has the warning:
"Currently, just one operation on a 'geo' URI is defined - location

   dereference: in that operation, a client dereferences the URI by
   extracting the geographical coordinates from the URI path component
   <geo-path>.  Further use of those coordinates (and the uncertainty
   value from <uval>) is then up to the application processing the URI,
   and might depend on the context of the URI."
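
As a purely illustrative aside, here is a minimal Python sketch of that
"location dereference" operation, plus one possible mapping of the result
onto DwC terms. The names GeoUri, parse_geo_uri and to_dwc are hypothetical,
made up for this example. The equality check is a simplification of the
comparison rules in RFC 5870 (it ignores the special cases at the poles and
at the 180-degree meridian, and it drops unknown parameters). The DwC mapping
is only an assumption about what such a mapping might look like, not an
agreed TDWG one.

```python
from decimal import Decimal
from typing import NamedTuple, Optional


class GeoUri(NamedTuple):
    """Parsed parts of a 'geo' URI (hypothetical helper type)."""
    lat: Decimal
    lon: Decimal
    alt: Optional[Decimal]
    crs: str
    uncertainty_m: Optional[Decimal]


def parse_geo_uri(uri: str) -> GeoUri:
    """Extract coordinates and the optional 'u' value from a geo URI."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "geo":
        raise ValueError(f"not a geo URI: {uri!r}")
    path, *params = rest.split(";")
    # Decimal compares numerically, so "41.53000000" equals "41.53".
    coords = [Decimal(c) for c in path.split(",")]
    if len(coords) not in (2, 3):
        raise ValueError(f"expected lat,lon[,alt] in {uri!r}")
    crs = "wgs84"          # the default when no crs parameter is given
    uncertainty = None     # absent 'u' means the uncertainty is unknown
    for param in params:
        name, _, value = param.partition("=")
        name = name.lower()
        if name == "crs":
            crs = value.lower()
        elif name == "u":
            uncertainty = Decimal(value)
        # Any other parameter is ignored in this sketch.
    alt = coords[2] if len(coords) == 3 else None
    return GeoUri(coords[0], coords[1], alt, crs, uncertainty)


def to_dwc(geo: GeoUri) -> dict:
    """One possible (assumed) mapping onto Darwin Core terms."""
    record = {
        "dwc:decimalLatitude": float(geo.lat),
        "dwc:decimalLongitude": float(geo.lon),
        "dwc:geodeticDatum": geo.crs.upper(),
    }
    if geo.uncertainty_m is not None:
        record["dwc:coordinateUncertaintyInMeters"] = float(geo.uncertainty_m)
    return record


padded = parse_geo_uri("geo:41.53000000,-70.67000000;u=100")
short = parse_geo_uri("geo:41.53,-70.67;u=100")
print(padded == short)   # numeric comparison, not string comparison
print(to_dwc(short))
```

The point of using Decimal rather than string comparison is that the padded
and unpadded forms compare as the same location, while a URI with no u
parameter stays distinct from one that states an uncertainty. Nothing here
says how the uncertainty was arrived at, which is the provenance question.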



Bob






On Sun, Feb 27, 2011 at 11:05 AM, John Wieczorek <tuco at berkeley.edu> wrote:

> The number of digits given is definitely not a good substitute
> for this for many reasons, just one of which is that the original
> may have been captured in a different coordinate system (such as
> degrees decimal minutes - the most precise coordinate system other
> than UTM or other meter-based systems when recording data from a GPS)
> and then converted to decimal degrees where the number of significant
> digits then becomes meaningless.
>
> Happily, the Darwin Core also has a dwc:coordinatePrecision term
> (http://rs.tdwg.org/dwc/terms/index.htm#coordinatePrecision), which
> can say explicitly what the level of precision is in the coordinates
> given. The dwc:coordinateUncertaintyInMeters
> (http://rs.tdwg.org/dwc/terms/index.htm#coordinateUncertaintyInMeters)
> is supposed to account for all sources of uncertainty in the
> coordinates given, including GPS accuracy and coordinate precision.
>
> For the sake of completeness, there are the terms
> dwc:pointRadiusSpatialFit
> (http://rs.tdwg.org/dwc/terms/index.htm#pointRadiusSpatialFit) to
> capture analytically how well the point-radius given matches the
> actual uncertainty of the coordinates given (in case someone
> artificially adds uncertainty), the dwc:georeferenceProtocol
> (http://rs.tdwg.org/dwc/terms/index.htm#georeferenceProtocol) to
> explain the method used to georeference, and the
> dwc:dataGeneralizations
> (http://rs.tdwg.org/dwc/terms/index.htm#dataGeneralizations) to
> explain what was done to the georeference post-facto to obscure it.
>
> On Sat, Feb 26, 2011 at 4:04 PM, Peter DeVries <pete.devries at gmail.com> wrote:
> > Hi Bob,
> > Yes an estimate of the precision / extent should be recorded by the original
> > observer.
> > This has been repeated several times and it is interesting that even TDWG
> > did not incorporate this into their data collection.
> > What I was proposing was a specific extension to the ietf proposal for
> > occurrence records.
> > It adds something very similar to pointRadiusSpatialFit to a latitude and
> > longitude.
> > By standardizing on the significant digits we gain something even before
> > there is general software support for this standard.
> > That way, records with "geo:41.53000000,-70.67000000;u=100" share an
> > equivalent URN, while
> >  "geo:41.53000000,-70.67000000;u=100" and "geo:41.53,-70.67;u=100" do not.
> > That allows those records to be linked within a triple or quadstore.
> > As in this earlier example:
> >     Here is a browsable view of one of the areas: http://bit.ly/hBtVFL
> >
> > http://lsd.taxonconcept.org/describe/?url=geo:41.53000000,-70.67000000;u%3D100
> > Without doing anything other than standardizing on the number of digits,
> > occurrences attached to the same GPS reading are linked in both a triple
> > store and a google search.
> > Whereas software needs to be written that
> > interprets  "geo:41.53000000,-70.67000000;u=100"
> > and "geo:41.53,-70.67;u=100" as equivalent.
> > Try Googling "geo:44.86294500,-87.23120400;u=10"
> > If the ietf.org standard is supported in future versions of Virtuoso and
> > other tools then we would not need to include the redundant use of geo:lat
> > geo:long for the dynamic maps.
> > - Pete
> >
> > On Sat, Feb 26, 2011 at 1:01 PM, Bob Morris <morris.bob at gmail.com> wrote:
> >>
> >> Your arguably reasonable recoding of the geo URIs of your example
> >> illustrates an issue on which so much metadata is silent: provenance. Once
> >> exposed, it is probably impossible for someone to know how the uncertainty
> >> (or any other data that might be the subject of opinion or estimate) was
> >> determined and whether the data is fit for some particular purpose, e.g.
> >> that the species were observed near each other.
> >> BTW, the IETF geo proposal was adopted in 2010, in the final form given
> >> at http://tools.ietf.org/html/rfc5870 . One interesting point
> >> is http://tools.ietf.org/html/rfc5870#section-3.4.3 which says
> >>   "Note: The number of digits of the values in <coordinates> MUST NOT be
> >> interpreted as an indication to the level of uncertainty." The section
> >> following is also interesting, albeit irrelevant for your procedure. It
> >> implies that when uncertainty is omitted (and therefore unknown), then
> >> "geo:41.53000000,-70.67000000"  and "geo:41.53,-70.67"  identify  the same
> >> geo resource.
> >>
> >> Bob Morris
> >> On Wed, Feb 23, 2011 at 4:56 PM, Peter DeVries <pete.devries at gmail.com> wrote:
> >>>
> >>> [...]
> >>>
> >>> 5) I added in my proposed "area" so that it is easy to see what species
> >>> were observed near each other. Since there was no measure of radius in these
> >>> longitude and latitudes I made the radius 100 meters.
> >>>     Normally I would estimate the radius for a GPS reading to be within
> >>> 10 meters but some of these observations were made where the GPS reading was
> >>> taken and the readings were given only to two decimals.
> >>> Area = lat, long; radius in meters following the ietf proposal but with
> >>> the precision of the long and lat standardized
> >>> example "geo:41.53000000,-70.67000000;u=100"
> >>> [...]
> >>
> >> --
> >> Robert A. Morris
> >> Emeritus Professor  of Computer Science
> >> UMASS-Boston
> >> 100 Morrissey Blvd
> >> Boston, MA 02125-3390
> >> Associate, Harvard University Herbaria
> >> email: morris.bob at gmail.com
> >> web: http://efg.cs.umb.edu/
> >> web: http://etaxonomy.org/mw/FilteredPush
> >> http://www.cs.umb.edu/~ram
> >> phone (+1) 857 222 7992 (mobile)
> >>
> >
> >
> >
> > --
> > ---------------------------------------------------------------
> > Pete DeVries
> > Department of Entomology
> > University of Wisconsin - Madison
> > 445 Russell Laboratories
> > 1630 Linden Drive
> > Madison, WI 53706
> > TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
> > About the GeoSpecies Knowledge Base
> > ------------------------------------------------------------
> >
>



-- 
Robert A. Morris
Emeritus Professor  of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: morris.bob at gmail.com
web: http://efg.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1) 857 222 7992 (mobile)