[tdwg-content] Proposal to Add a GeoSpatial "Area" to the DarwinCore

Bob Morris morris.bob at gmail.com
Tue Nov 23 00:27:11 CET 2010


Sorry to be opaque.  I was trying to agree with the position you and
Markus espoused by adding that the problem of emitting something other
than Darwin Core based on new use cases is not very hard. It is not,
generally speaking, either harder or easier than emitting Darwin Core.
 That's because most occurrence data is being held in something that
is not DarwinCore.  The secondary point was that just because
something is easy to engineer, one should not expect it to be quick to
engineer.
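
To make the "neither harder nor easier" point concrete, here is a
minimal sketch, assuming a hypothetical local schema with columns
lat_dd, lon_dd and err_m (these names, and the emit() helper, are
illustrative and not taken from any real provider). The same row is
mapped once to Darwin Core terms and once to the geo-style area terms
Pete describes. Treating the local error column as equivalent to both
dwc:coordinateUncertaintyInMeters and dwc_area:radius is itself an
assumption.

```python
# Hypothetical local record; the column names are illustrative assumptions.
local_row = {"lat_dd": 44.865281, "lon_dd": -87.231478, "err_m": 10}

# Mapping tables from local column names to target vocabulary terms.
TO_DWC = {
    "lat_dd": "dwc:decimalLatitude",
    "lon_dd": "dwc:decimalLongitude",
    "err_m": "dwc:coordinateUncertaintyInMeters",
}
TO_GEO_AREA = {
    "lat_dd": "geo:lat",
    "lon_dd": "geo:long",
    "err_m": "dwc_area:radius",
}


def emit(row, mapping):
    """Rename the keys of a local row to the terms of a target vocabulary."""
    return {mapping[key]: value for key, value in row.items() if key in mapping}


as_dwc = emit(local_row, TO_DWC)
as_geo_area = emit(local_row, TO_GEO_AREA)
```

Only the mapping table differs between the two outputs, which is why
the extra work for a provider is small while the tooling around it can
still take time to build.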

On Mon, Nov 22, 2010 at 5:59 PM, Peter DeVries <pete.devries at gmail.com> wrote:
> Bob, It is not clear what you are getting at.
>>  which become a matter of choice when the provider and the map maker are
>> one and the same.
>
> Are you saying that GBIF is the only entity that will be processing and
> mapping these records?
> There is nothing in what I propose that won't also make these easier and
> more efficient for GBIF.
> It is simply a standardized WGS84 latitude, longitude, radius and an integer
> link to the next higher level in the geographic hierarchy.
> There are a number of tools and libraries for both end users and data
> servers that already understand the geo vocabulary.
> It is the same vocabulary used for georss etc., with the addition of a
> radius.
> Millions of records in this format would be much easier to parse and store
> than the numerous variations that occur in the current DarwinCore.
> How many different forms of the literal "WGS84" will they need to be able to
> recognize and parse correctly under the current standard?
> Do DarwinCore users find the current standard confusing and often put the
> wrong things into the different fields?
> Also try googling "geo:44.86528100,-87.23147800;u=10"  Aren't all the
> returned records appropriate?
> * I am not suggesting that the current DarwinCore be changed, just that this
> format be added.
>> The main burden will fall on Markus and other tool builders who help make
>> mapping issues invisible to undersupported data providers in ways that don't
>> depend very much on what the mapping is from.
> Yes, so maybe this would be best decided by Markus and the other GBIF
> people. If it makes sense to them, then the best place for it might be in the
> GBIF vocabulary.
> Respectfully,
> - Pete
> On Mon, Nov 22, 2010 at 1:46 PM, Bob Morris <morris.bob at gmail.com> wrote:
>>
>> 4. Hundreds of millions of existing occurrence records already live in
>> relational databases with their own choice of geospatial and other
>> vocabularies, some standard, some not. Except in the few(?) cases that
>> their Relational schema is DwC, they have to map their local schemas
>> to DwC for their service responses.  It's likely inconsequentially
>> more work to map to some other vocabulary that is also semantically
>> similar to theirs than to map to DwC.  The main burden will fall on
>> Markus and other tool builders who help make mapping issues invisible
>> to undersupported data providers in ways that don't depend very much
>> on what the mapping is from.  They'll have robustness requirements
>> imposed on them which become a matter of choice when the provider and
>> the map maker are one and the same.
>>
>> On Mon, Nov 22, 2010 at 10:48 AM, Peter DeVries <pete.devries at gmail.com>
>> wrote:
>> >  Hi Markus,
>> >>
>> >> In general I feel lots of the latest threads (though I have to admit I
>> >> haven't had the time to read them all) seem to rather discuss the
>> >> (re)creation
>> >> of a new tdwg ontology instead of suggesting changes to darwin core.
>> >> Darwin
>> >> Core is ratified and I think we should refrain from taking every single
>> >> term
>> >> it has to pieces, redefining it and reconstructing it in
>> >> different
>> >> ways because of linked data demands.
>> >
>> >
>> >>
>> >> We might be better off to start something new for the LOD world.
>> >
>> > I agree, for three main reasons:
>> > 1) The current DarwinCore works for what it was originally intended.
>> > 2) Creating something that works well for the semantic web is somewhat
>> > tricky and will take some time.
>> > 3) I think this discussion is confusing potential DarwinCore users.
>> > My main goal was to figure out a "home" for this ontology, and it seems
>> > as
>> > if TDWG only wants to use things in its namespace.
>> > It is my feeling that other groups like EUNIS will want to use this so I
>> > would like to figure out the final ontology uri soon.
>> > I could add this to the txn ontology, but I think it has utility even
>> > outside biodiversity informatics.
>> > - Pete
>> >
>> >>
>> >> > Here is an example with some of the non-relevant stuff taken out.
>> >> >
>> >> > <dwc_area:Area rdf:about="geo:44.86528100,-87.23147800;u=10">
>> >> >  <dcterms:title>44.86528100, -87.23147800 Radius 10
>> >> > meters</dcterms:title>
>> >> >
>> >> >
>> >> >  <dcterms:identifier>geo:44.86528100,-87.23147800;u=10</dcterms:identifier>
>> >> >  <dcterms:created>2010-10-28T00:00:00-0500</dcterms:created>
>> >> >  <dcterms:modified>2010-11-19T22:17:37-0600</dcterms:modified>
>> >> >  <geo:lat>44.86528100</geo:lat>
>> >> >  <geo:long>-87.23147800</geo:long>
>> >> >  <dwc_area:radius>10</dwc_area:radius>
>> >> >  <dwc_area:areaWithInFeature
>> >> > rdf:resource="http://sws.geonames.org/5250768/"/>
>> >> >  <wdrs:describedBy
>> >> >
>> >> > rdf:resource="http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9.rdf"/>
>> >> > </dwc_area:Area>
>> >> >
>> >> > What I have done with my examples is standardize on a fixed
>> >> > precision
>> >> > for the lat and long.
>> >> >
>> >> > The level of precision does not have to be this high; we just need to
>> >> > have
>> >> > an agreed-upon level of precision.
>> >> > (Probably something that we should do anyway since the actual
>> >> > precision
>> >> > should not be inferred by the significant digits.)
>> >> >
>> >> > So Steve's 44.86,-87.23 => 44.86000000, -87.23000000.
>> >> >
>> >> > Now we have a standard urn-like entity for an area that follows the
>> >> > proposed ietf standard.
>> >> >
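A minimal sketch of how such an identifier could be built, assuming a
fixed precision of 8 decimal places as in the examples above. The
function name area_uri and its parameters are hypothetical and are not
part of the dwc_area ontology or of RFC 5870.

```python
def area_uri(lat, lon, radius_m, places=8):
    """Build a geo-style identifier with a fixed number of decimal places.

    radius_m is expected to be a whole number of meters and is written
    into the "u" parameter unchanged.
    """
    return f"geo:{lat:.{places}f},{lon:.{places}f};u={radius_m}"


# The point from the RDF example, and the shorter coordinates padded
# out to the same fixed precision.
uri_full = area_uri(44.865281, -87.231478, 10)
uri_short = area_uri(44.86, -87.23, 10)
```

Padding to a fixed width means two providers who record the same point
and radius produce the same string, whatever number of digits they
typed in.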
>> >> > Other data like soil type, weather etc can be tied to these areas.
>> >> >
>> >> > Area1 => hasDegreeDayValue "2105"
>> >> >
>> >> > Other areas can be related to this area Area1 within Area2, AreaB
>> >> > near
>> >> > AreaC etc.
>> >> >
>> >> > AreaZ within NaturePreserveB
>> >> >
>> >> > Also the following was recently proposed by the ietf.org
>> >> >
>> >> > A Uniform Resource Identifier for Geographic Locations ('geo' URI)
>> >> > http://tools.ietf.org/html/rfc5870 (from Sean Gillies)
>> >> > * Not clear that all the systems understand this but URIburner and
>> >> > Virtuoso interpret these as a URN-type thing, so they work but are
>> >> > not
>> >> > understood in the way that the "geo" vocabulary is understood. Note
>> >> > that if
>> >> > and when this becomes a standard it will allow these "Areas" to be
>> >> > universally understood.
>> >> >
>> >> > What this means is that I can markup my 10,0000 mosquito records from
>> >> > one location with one URN type id.
>> >> >
>> >> > In summary, these allow more efficient occurrence records that
>> >> > leverage
>> >> > the existing support for the geo vocabulary and the probable support
>> >> > for the
>> >> > ietf.org standard.
>> >> >
>> >> > I have added some predicates to the vocabulary that allow inferencing
>> >> > from the area through the geonames hierarchy.
>> >> >
>> >> > This would allow one to infer the following about the record above.
>> >> >
>> >> > Area > areaWithInFeature => Door County => Wisconsin => United States
>> >> > =>
>> >> > North America => Earth
>> >> >
>> >> > This does not mean that additions could not also be made that allow
>> >> > inferencing up some other non-geonames hierarchy.
>> >> >
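The inference chain above can be sketched as a walk up a table of
parent links. This is only an illustration: the plain-text labels
stand in for GeoNames feature URIs, and the PARENT table and
containing_features() helper are assumptions, not part of the ontology
or of any GeoNames service.

```python
# Toy parent links; the labels stand in for GeoNames feature URIs and
# are illustrative assumptions, not data fetched from a real service.
PARENT = {
    "geo:44.86528100,-87.23147800;u=10": "Door County",
    "Door County": "Wisconsin",
    "Wisconsin": "United States",
    "United States": "North America",
    "North America": "Earth",
}


def containing_features(area, parent=PARENT):
    """Follow parent links upward and return every containing feature."""
    chain = []
    seen = {area}  # guards against accidental cycles in the link table
    current = parent.get(area)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent.get(current)
    return chain


features = containing_features("geo:44.86528100,-87.23147800;u=10")
```

A different hierarchy would only need a different table of parent
links; the walk itself stays the same.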
>> >> > Here is the ontology
>> >> > http://lod.taxonconcept.org/ontology/dwc_area.owl
>> >> > Here is the owl doc
>> >> > http://lod.taxonconcept.org/ontology/dwc_area_doc/index.html
>> >> >
>> >> > Here is an example of an occurrence record that uses this:
>> >> >
>> >> > HTML
>> >> >
>> >> > http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9.html
>> >> > RDF
>> >> >
>> >> > http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9.rdf
>> >> >
>> >> > Respectfully,
>> >> >
>> >> > - Pete
>> >> >
>> >> > On Fri, Nov 19, 2010 at 12:32 PM, Steve Baskauf
>> >> > <steve.baskauf at vanderbilt.edu> wrote:
>> >> > Pete,
>> >> > I was going to ask questions about this the first time you mentioned
>> >> > it,
>> >> > but got distracted.  I guess the main question I have is: what would
>> >> > you
>> >> > "do" with it?  I guess it could be considered an identifier for a
>> >> > spot on
>> >> > the earth, but based on what I've read about guids it's considered to
>> >> > be a
>> >> > "no-no" to try to infer stuff about a resource by looking at the form
>> >> > of the
>> >> > identifier.  Rather one should look at the metadata associated with
>> >> > the
>> >> > identifier to understand things about the identifier (which you
>> >> > provide with
>> >> > geo:lat and geo:long).  I suppose that one could consider this to be
>> >> > some
>> >> > kind of identifier that could be reused, but particularly since the
>> >> > precision of your lat and long are 8 digits, it is highly unlikely
>> >> > that
>> >> > anybody besides you is ever going to choose to use this identifier
>> >> > (vs.
>> >> > 44.86,-87.23 which would be imprecise enough to include a lot of
>> >> > places to
>> >> > which people might want to refer).  I may just be misunderstanding
>> >> > the
>> >> > purpose you intend.
>> >> >
>> >> > The other question is a more general one.  Do we need more ways to
>> >> > specify uncertainty in location than we already have?  We already
>> >> > have
>> >> > dwc:coordinateUncertaintyInMeters and dwc:coordinatePrecision .  I've
>> >> > been
>> >> > using dwc:coordinateUncertaintyInMeters with a seat-of-the-pants
>> >> > estimate on
>> >> > my part about how accurate I think my geolocation is (expressed in
>> >> > meters).
>> >> >  That may be a misuse of this term because I'm really thinking radius
>> >> > around
>> >> > a point rather than uncertainty of coordinates.  But as a practical
>> >> > matter,
>> >> > if I think my estimate of location is good to 1000 m (vs. 100 m or
>> >> > 10000 m)
>> >> > does it really matter if I'm talking about a square or a circle?  In
>> >> > any
>> >> > case, I'm saying, this lat/long could be off from the actual location
>> >> > by a
>> >> > km.  If I had a GPS receiver that allowed me to download the actual
>> >> > estimated accuracy (based on satellite signals and whatever else),
>> >> > then I
>> >> > would populate dwc:coordinateUncertaintyInMeters with that, but my
>> >> > crummy
>> >> > old one doesn't.  To me the most important thing is for users to know
>> >> > whether this is a ballpark estimate or if they could expect to
>> >> > actually be
>> >> > able to walk up to the tree using the coordinates they give.
>> >> >
>> >> > Steve
>> >> >
>> >> >
>> >> > Peter DeVries wrote:
>> >> >> I wrote about this earlier but I never heard anything back.
>> >> >>
>> >> >> I have made something that uses the geo vocabulary but also allows
>> >> >> a pointRadiusSpatial fit measure that I call radius.
>> >> >>
>> >> >> The advantage is that this adds a standard way to use something
>> >> >> like an extent or pointRadiusSpatial while still benefiting from the
>> >> >> widely
>> >> >> used geo vocabulary.
>> >> >>
>> >> >> It also allows these "Areas" to be referenced in a commonly
>> >> >> understood
>> >> >> URN-like way, using an IETF standard.
>> >> >>
>> >> >> For example: "geo:44.86528100,-87.23147800;u=10"
>> >> >>
>> >> >> There are still some things I need to fix and check with this
>> >> >> vocabulary but I am wondering if there is any interest in
>> >> >> incorporating this
>> >> >> into the DarwinCore.
>> >> >>
>> >> >> If not I will probably change the name of the ontology.
>> >> >>
>> >> >> There are also things in the example below that are not part of my
>> >> >> proposal.
>> >> >>
>> >> >> I have what I call "Areas" that look like this:
>> >> >>
>> >> >>   <dwc_area:Area rdf:about="geo:44.86528100,-87.23147800;u=10">
>> >> >>     <dcterms:title>44.86528100, -87.23147800 Radius 10
>> >> >> meters</dcterms:title>
>> >> >>     <dcterms:isPartOf
>> >> >> rdf:resource="http://lod.taxonconcept.org/ontology/void#this"/>
>> >> >>
>> >> >>
>> >> >> <dcterms:identifier>geo:44.86528100,-87.23147800;u=10</dcterms:identifier>
>> >> >>     <dcterms:created>2010-10-28T00:00:00-0500</dcterms:created>
>> >> >>     <dcterms:modified>2010-11-09T16:33:34-0600</dcterms:modified>
>> >> >>     <geo:lat>44.86528100</geo:lat>
>> >> >>     <geo:long>-87.23147800</geo:long>
>> >> >>     <dwc_area:radius>10</dwc_area:radius>
>> >> >>     <txn:elevation>186.54</txn:elevation>
>> >> >>     <txn:continent>North America</txn:continent>
>> >> >>     <txn:countryCode>US</txn:countryCode>
>> >> >>     <txn:country>United States</txn:country>
>> >> >>     <txn:stateProvince>Wisconsin</txn:stateProvince>
>> >> >>     <txn:county>Door</txn:county>
>> >> >>     <txn:localityText>Town of Sevastopol</txn:localityText>
>> >> >>     <txn:locationName>Shivering Sands Natural Area
>> >> >> Woods</txn:locationName>
>> >> >>     <txn:areaHasOccurrence
>> >> >>
>> >> >> rdf:resource="http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9#Occurrence"/>
>> >> >>     <txn:areaHasObservedSpeciesConcept
>> >> >> rdf:resource="http://lod.taxonconcept.org/ses/ICmLC#Species"/>
>> >> >>     <txn:areaHasIndividual
>> >> >>
>> >> >> rdf:resource="http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9#Individual"/>
>> >> >>     <txn:areaInStateProvince
>> >> >> rdf:resource="http://sws.geonames.org/5279468/"/>
>> >> >>     <txn:areaInCounty
>> >> >> rdf:resource="http://sws.geonames.org/5250768/"/>
>> >> >>     <wdrs:describedBy
>> >> >>
>> >> >> rdf:resource="http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9.rdf"/>
>> >> >>   </dwc_area:Area>
>> >> >>
>> >> >> I recently added the following predicates, but have not altered my
>> >> >> RDF
>> >> >> examples.
>> >> >>
>> >> >> #featureContainsArea
>> >> >> #areaWithInFeature
>> >> >>
>> >> >> http://lod.taxonconcept.org/ontology/dwc_area.owl
>> >> >>
>> >> >> OWL Doc http://lod.taxonconcept.org/ontology/dwc_area_doc/index.html
>> >> >>
>> >> >> The predicates are a bit awkward, but I wanted to be clear that this
>> >> >> was to link an "Area" like "geo:44.86528100,-87.23147800;u=10" to a
>> >> >> Geonames
>> >> >> "Feature".
>> >> >>
>> >> >> I thought a different set of predicates could be created to deal
>> >> >> with
>> >> >> some other class of "SpatialThing" if needed.
>> >> >>
>> >> >> Respectfully,
>> >> >>
>> >> >> - Pete
>> >> >>
>> >> >> ---------------------------------------------------------------
>> >> >> Pete DeVries
>> >> >> Department of Entomology
>> >> >> University of Wisconsin - Madison
>> >> >> 445 Russell Laboratories
>> >> >> 1630 Linden Drive
>> >> >> Madison, WI 53706
>> >> >> TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
>> >> >> About the GeoSpecies Knowledge Base
>> >> >> ------------------------------------------------------------
>> >> >
>> >> > --
>> >> > Steven J. Baskauf, Ph.D., Senior Lecturer
>> >> > Vanderbilt University Dept. of Biological Sciences
>> >> >
>> >> > postal mail address:
>> >> > VU Station B 351634
>> >> > Nashville, TN  37235-1634,  U.S.A.
>> >> >
>> >> > delivery address:
>> >> > 2125 Stevenson Center
>> >> > 1161 21st Ave., S.
>> >> > Nashville, TN 37235
>> >> >
>> >> > office: 2128 Stevenson Center
>> >> > phone: (615) 343-4582,  fax: (615) 343-6707
>> >> >
>> >> > http://bioimages.vanderbilt.edu
>> >> >
>> >> >
>> >> >
>> >> > <area_sparql_query.jpg>
>> >>
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > tdwg-content mailing list
>> > tdwg-content at lists.tdwg.org
>> > http://lists.tdwg.org/mailman/listinfo/tdwg-content
>> >
>> >
>>
>>
>>
>> --
>> Robert A. Morris
>> Emeritus Professor  of Computer Science
>> UMASS-Boston
>> 100 Morrissey Blvd
>> Boston, MA 02125-3390
>> Associate, Harvard University Herbaria
>> email: morris.bob at gmail.com
>> web: http://bdei.cs.umb.edu/
>> web: http://etaxonomy.org/mw/FilteredPush
>> http://www.cs.umb.edu/~ram
>> phone (+1) 857 222 7992 (mobile)
>
>
>
>




