Hi Steve,
This is mainly an advantage if everyone supports the ietf.org proposal in the way they support the geo vocabulary.
The way to think about this is that the IETF geo URI could become a well-known URN.
The reason to have a standard number of digits is not as a measure of precision, but to make the units, strings and URNs consistent for both the LOD and Google.
What I would like is a standard, efficient way to mark up potentially billions of occurrences so that they can be correctly interpreted and mapped using current and future LOD tools.
Note that your use of geo is not standard DarwinCore. What is the official word from the DarwinCore Illuminati on the use of geo?
I am sympathetic to the need for some measure of radius, pointSpatialFit, coordinateUncertaintyInMeters etc. but adoption of these has been poor.
I think the "radius" and area form that I am proposing is easy for providers to understand and for tools to interpret.
It probably maps to dwc:coordinateUncertaintyInMeters
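To make the fixed-digit form concrete, here is a minimal Python sketch of how a provider might build such a string. The function name make_geo_uri, the default of 8 decimal places, and rounding the radius to whole meters are assumptions for illustration only; they are not part of RFC 5870 or Darwin Core.

```python
from typing import Optional


def make_geo_uri(lat: float, lon: float,
                 radius_m: Optional[float] = None,
                 digits: int = 8) -> str:
    """Build a geo URI with a fixed number of decimal places.

    The fixed digit count is only for string consistency; it says
    nothing about precision. The radius goes in the "u" parameter.
    """
    if not -90 <= lat <= 90:
        raise ValueError("latitude out of range")
    if not -180 <= lon <= 180:
        raise ValueError("longitude out of range")
    uri = f"geo:{lat:.{digits}f},{lon:.{digits}f}"
    if radius_m is not None:
        # Assumption: the radius is rounded to whole meters.
        uri += f";u={round(radius_m)}"
    return uri


print(make_geo_uri(41.53, -70.67, 100))
```

For the BioBlitz record used below, this produces the same string as the example area, "geo:41.53000000,-70.67000000;u=100".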
Here is a modification of your earlier location example using a TDWG BioBlitz record. Note how I replace some standard vocabulary terms with URIs.
The terms with the txn prefix could be incorporated into Darwin Core.
<!-- The who, what, where and when and how of the observation, as efficient as I can make it but with a literal for scientific name (label) -->
<dwc:Occurrence about="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1607">
  <rdfs:label>OBS: Branta canadensis</rdfs:label>
  <dwc:Area resource="geo:41.53000000,-70.67000000;u=100"/>
  <txn:occurrenceHasSpeciesConcept rdf:resource="http://lod.taxonconcept.org/ses/SeecQ#Species"/>
  <txn:hasCollector rdf:resource="http://lod.taxonconcept.org/people/tdwg2010bioblitz#Dmitry_Mozzherin"/>
  <txn:occurrenceHasIndividual rdf:resource="http://www.cs.umbc.edu/~jsachs/individuals/tdwg2010bioblitz_1607"/>
  <foaf:depiction rdf:resource="http://farm5.static.flickr.com/4126/5037315500_4c555f742a_b.jpg"/>
  <txn:basisOfRecord rdf:resource="http://lod.taxonconcept.org/ontology/txn.owl#BasisOfRecord_StillImage"/>
  <dcterms:date>2010-09-29</dcterms:date>
</dwc:Occurrence>
<!-- The area is a Location and the georeference method is included as metadata. wdrs is used to link the area to the RDF. -->
<!-- I will change the Area ontology so that it is a subclass of dcterms:Location -->
<!-- If others make statements about this particular area, their RDF's wdrs will create the provenance links -->
<dwc:Area about="geo:41.53000000,-70.67000000;u=100">
  <rdf:type resource="http://purl.org/dc/terms/Location"/>
  <dwc:georeferenceMethod resource="http://rs.tdwg.org/dwc/terms/index.htm#GeoMethod_GoogleMaps"/>
  <dwc_area:areaWithInFeature rdf:resource="http://sws.geonames.org/4929772/"/>
  <wdrs:describedby rdf:resource="http://my_organization.com/occurrence/123.rdf"/>
</dwc:Area>
<!-- Like the occurrence the individual is only one thing but different people make assertions about what that thing is -->
<dwc:Individual about="http://www.cs.umbc.edu/~jsachs/individuals/tdwg2010bioblitz_1607">
  <rdfs:label>IND_1607: Branta canadensis</rdfs:label>
  <txn:occurrenceHasSpeciesConcept rdf:resource="http://lod.taxonconcept.org/ses/SeecQ#Species"/>
  <txn:hasCollector rdf:resource="http://lod.taxonconcept.org/people/tdwg2010bioblitz#Dmitry_Mozzherin"/>
  <txn:individualHasOccurrence rdf:resource="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1607"/>
  <foaf:depiction rdf:resource="http://farm5.static.flickr.com/4126/5037315500_4c555f742a_b.jpg"/>
  <!-- Identification history should be part of the documentation of the individual -->
  <txn:individualHasCurrrentIdentificationAssertion rdf:resource="http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1607_id_2"...
  <txn:individualHasPreviousIdentificationAssertion rdf:resource="http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1607_id_1"...
</dwc:Individual>
Respectfully,
- Pete
On Sun, Feb 27, 2011 at 6:55 AM, Steve Baskauf <steve.baskauf@vanderbilt.edu> wrote:
Pete, This topic has come up several times before and each time I've left the email sitting in my inbox with the intention of trying to understand it better. I guess what I don't understand is what one "does" with it. I suppose I could dig in and do some research, but the multiple times the message has sat in my inbox without action tells me that I'm probably not going to get around to doing that. So maybe you can explain it further.
Is this supposed to be usable as a "Linked Data" resource (i.e. object of a predicate such as "hasLocation")? It isn't an HTTP URI, so it can't get dereferenced via http. I guess one parses it as a string and interprets the string directly based on some rules. Isn't that a no-no in the GUID rules? I guess with enough community support, applications would know what to do with it, but look what happened with "urn:lsid:my_organization.com:location:123". Web browsers and "regular" Linked Data clients (e.g. Linked Data browsers) didn't know what to do with it.
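To illustrate what "parsing it as a string" would involve for a client, here is a rough Python sketch. The names GeoPoint and parse_geo_uri are hypothetical, the sketch handles only the latitude, longitude and "u" parts of the string, and treating "u" as a stand-in for dwc:coordinateUncertaintyInMeters is an assumption rather than an agreed mapping.

```python
from typing import NamedTuple, Optional


class GeoPoint(NamedTuple):
    lat: float
    lon: float
    uncertainty_m: Optional[float]  # None when no "u" parameter is given


def parse_geo_uri(uri: str) -> GeoPoint:
    """Split a geo URI into latitude, longitude and uncertainty."""
    scheme, sep, rest = uri.partition(":")
    if sep != ":" or scheme.lower() != "geo":
        raise ValueError(f"not a geo URI: {uri!r}")
    coords, *params = rest.split(";")
    parts = coords.split(",")
    if len(parts) not in (2, 3):  # an optional third value is altitude
        raise ValueError(f"bad coordinates in {uri!r}")
    lat, lon = float(parts[0]), float(parts[1])
    uncertainty = None
    for param in params:
        name, _, value = param.partition("=")
        if name.lower() == "u":
            uncertainty = float(value)
    return GeoPoint(lat, lon, uncertainty)


point = parse_geo_uri("geo:41.53000000,-70.67000000;u=100")
print(point.lat, point.lon, point.uncertainty_m)
```

The point of the sketch is that every client needs a rule set like this built in, whereas an HTTP URI can simply be dereferenced.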
The whole issue of provenance is pretty satisfactorily handled in the existing Darwin Core. There are a multitude of terms available to express all kinds of uncertainty and shapes. For example, if I create a record like:
<rdf:Description about="http://my_organization.com/location/123">
  <rdf:type resource="http://purl.org/dc/terms/Location"/>
  <geo:lat>36.144719</geo:lat>
  <geo:long>-86.801498</geo:long>
  <dwc:coordinateUncertaintyInMeters>1000</dwc:coordinateUncertaintyInMeters>
  <dwc:georeferenceRemarks>Location determined from Google maps</dwc:georeferenceRemarks>
</rdf:Description>
I very explicitly express the type of thing, geocoordinates, datum (implicit in the use of geo:), uncertainty, and method of generating the data. If necessary, I could also use dwc:dataGeneralizations and dwc:informationWithheld to explain how and why I have provided less precise coordinates than I actually know. This is clearly more verbose, but hey, Linked Data in XML IS verbose, and existing Linked Data applications would be able to "understand" something like what I wrote without any kind of special "plug-in" to interpret the string. I guess I don't really understand why you are proposing this, especially since you are a passionate advocate of Linked Data. Your proposed thing is more succinct, but it doesn't seem like it would be usable by normal Linked Data clients.
Steve
Bob Morris wrote:
Your arguably reasonable recoding of the geo uri's of your example illustrates an issue on which so much metadata is silent: provenance. Once exposed, it is probably impossible for someone to know how the uncertainty (or any other data that might be the subject of opinion or estimate) was determined and whether the data is fit for some particular purpose, e.g. that the species were observed near each other.
BTW, the IETF geo proposal was adopted in 2010, in the final form given at http://tools.ietf.org/html/rfc5870 . One interesting point is http://tools.ietf.org/html/rfc5870#section-3.4.3 which says "Note: The number of digits of the values in <coordinates> MUST NOT be interpreted as an indication to the level of uncertainty." The section following is also interesting, albeit irrelevant for your procedure. It implies that when uncertainty is omitted (and therefore unknown), then "geo:41.53000000,-70.67000000" and "geo:41.53,-70.67" identify the same geo resource.
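A small Python sketch of that equivalence, comparing the coordinates by numeric value rather than as strings. This is a simplified reading of the RFC's comparison rules, not a full implementation: it ignores the crs parameter, any other parameters, and the special cases for the poles and the 180th meridian. The helper names geo_key and same_geo_resource are made up for the example.

```python
from decimal import Decimal


def geo_key(uri: str):
    """Reduce a geo URI to (coordinates, uncertainty) as exact decimals."""
    scheme, _, body = uri.partition(":")
    if scheme.lower() != "geo":
        raise ValueError(f"not a geo URI: {uri!r}")
    coords, *params = body.split(";")
    numbers = tuple(Decimal(c) for c in coords.split(","))
    uncertainty = None
    for param in params:
        name, _, value = param.partition("=")
        if name.lower() == "u":
            uncertainty = Decimal(value)
    return numbers, uncertainty


def same_geo_resource(a: str, b: str) -> bool:
    # Trailing zeros do not matter: Decimal("41.53000000") == Decimal("41.53")
    return geo_key(a) == geo_key(b)


print(same_geo_resource("geo:41.53000000,-70.67000000", "geo:41.53,-70.67"))
print(same_geo_resource("geo:41.53,-70.67;u=100", "geo:41.53,-70.67"))
```

The first comparison comes out equal because the padded zeros carry no meaning; the second does not, because one string states an uncertainty and the other leaves it unknown.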
Bob Morris
On Wed, Feb 23, 2011 at 4:56 PM, Peter DeVries <pete.devries@gmail.com> wrote:
[...]
- I added in my proposed "area" so that it is easy to see what species were observed near each other. Since there was no measure of radius in these longitude and latitudes I made the radius 100 meters. Normally I would estimate the radius for a GPS reading to be within 10 meters but some of these observations were made where the GPS reading was taken and the readings were given only to two decimals.
Area = lat, long; radius in meters, following the IETF proposal but with the precision of the lat and long standardized. Example: "geo:41.53000000,-70.67000000;u=100"
[...]
-- Robert A. Morris Emeritus Professor of Computer Science UMASS-Boston 100 Morrissey Blvd Boston, MA 02125-3390 Associate, Harvard University Herbaria email: morris.bob@gmail.com web: http://efg.cs.umb.edu/ web: http://etaxonomy.org/mw/FilteredPush http://www.cs.umb.edu/~ram phone (+1) 857 222 7992 (mobile)
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu