[tdwg-content] Bioblitz as rdf: does this make sense?

Peter DeVries pete.devries at gmail.com
Thu Jan 13 21:14:54 CET 2011


Thanks Joel,

Here is one of the BioBlitz Occurrence Records marked up with Darwin Core

http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_5&sid=217&urilookup=1

Here is one of the TaxonConcept Records marked up in the txn vocabulary
http://lsd.taxonconcept.org/describe/?url=http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9%23Occurrence

Some people have trouble with how the %23 above is escaped in their email;
they might like this bit.ly bundle better:
http://bit.ly/hZdUpP
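
For anyone building these links by hand: the %23 is just the percent-encoded "#" of the occurrence URI, needed because that URI is passed as the value of url=. A minimal Python sketch, assuming only that the describe page takes the target URI in url= as in the links above (the helper name describe_link is made up for illustration):

```python
from urllib.parse import quote

# Base of the describe links used earlier in this message.
DESCRIBE_BASE = "http://lsd.taxonconcept.org/describe/?url="

def describe_link(target_uri: str) -> str:
    """Build a describe link for target_uri.

    '#' and other reserved characters are percent-encoded so they are not
    read as part of the outer URL; ':' and '/' are left as-is for readability.
    """
    return DESCRIBE_BASE + quote(target_uri, safe=":/")

link = describe_link(
    "http://ocs.taxonconcept.org/ocs/f522444a-2dd9-400e-be59-47213ef38cb9#Occurrence"
)
print(link)
```

If a mail client turns the %23 back into "#", the part after it is treated as a fragment of the outer URL and never reaches the server, which is why a shortened link can be more robust.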

Another issue to discuss is that identifications are made to both a name and
a concept, so:

Should these be the same dwc:taxonConcepts?

http://spire.umbc.edu/ethan/Apis_mellifera
http://spire.umbc.edu/ethan/Apis_Mellifera
==> http://bit.ly/g1zzJC (Apis mellifera se:z9oqP)

http://spire.umbc.edu/ethan/Ascelpius_syruaca
http://spire.umbc.edu/ethan/Asclepias_syriaca
==> HTML page http://lod.taxonconcept.org/ses/tTEIq.html Concept View
Bit.ly http://bit.ly/dJHJqj

http://spire.umbc.edu/ethan/Aster_nova-belgii
http://spire.umbc.edu/ethan/Aster_nova-gelgii

http://spire.umbc.edu/ethan/Baccharis_halimifolia
http://spire.umbc.edu/ethan/Baccharis_halimifolia_L.

http://spire.umbc.edu/ethan/Bartonia_virginia
http://spire.umbc.edu/ethan/Bartonia_virginica

http://spire.umbc.edu/ethan/Branta_canadensis
http://spire.umbc.edu/ethan/Branta_canadensis_(Linnaeus,_1758)

http://spire.umbc.edu/ethan/Carex_pennsylvanica
http://spire.umbc.edu/ethan/Carex_pensylvanica

http://spire.umbc.edu/ethan/Cyperus_esculantus
http://spire.umbc.edu/ethan/Cyperus_Esculantus
http://spire.umbc.edu/ethan/Cyperus_esculentus
http://spire.umbc.edu/ethan/Cyperus_esculentus_L.

http://spire.umbc.edu/ethan/Carpodacus_mexicanus
http://spire.umbc.edu/ethan/Carpodacus_mexicanus_(Statius_Muller,_1776)

From the taxon concepts there should be a link to the various name strings,
vs. modeling the name string as the concept.
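
To make the idea concrete, here is a rough Python sketch of one concept key pointing at several name-string URIs. It is only an illustration, not how either service works: it assumes each local name has the form Genus_epithet[_authorship], and the function names (name_key, group_by_key, likely_misspelling) and the 0.9 similarity threshold are made up for this example.

```python
import difflib
from collections import defaultdict
from urllib.parse import unquote

def name_key(uri: str) -> str:
    """Reduce a name-string URI to a lowercase 'genus epithet' key.

    Assumption: the local name is Genus_epithet[_authorship...], so every
    token after the second one is treated as authorship and dropped.
    """
    local = unquote(uri.rstrip("/").rsplit("/", 1)[-1])
    tokens = local.split("_")
    return " ".join(tokens[:2]).lower()

def group_by_key(uris):
    """Map each normalized key to the set of name-string URIs that share it."""
    groups = defaultdict(set)
    for uri in uris:
        groups[name_key(uri)].add(uri)
    return dict(groups)

def likely_misspelling(uri_a: str, uri_b: str, threshold: float = 0.9) -> bool:
    """Flag two URIs whose keys are close but not identical, for human review."""
    a, b = name_key(uri_a), name_key(uri_b)
    return a != b and difflib.SequenceMatcher(None, a, b).ratio() >= threshold

BASE = "http://spire.umbc.edu/ethan/"
uris = [BASE + n for n in (
    "Apis_mellifera", "Apis_Mellifera",
    "Branta_canadensis", "Branta_canadensis_(Linnaeus,_1758)",
    "Carex_pennsylvanica", "Carex_pensylvanica",
)]
groups = group_by_key(uris)
for key, members in sorted(groups.items()):
    print(key, len(members))
```

Case and authorship variants collapse onto one key automatically; spelling variants are only flagged, since deciding which spelling belongs to which concept still needs a person (or a curated concept record).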

Also, I think that Darwin Core is good for some things, but maybe not as a
semantic web representation.

Respectfully,

- Pete


On Thu, Jan 13, 2011 at 6:53 AM, joel sachs <jsachs at csee.umbc.edu> wrote:

> Pete,
> Thanks - I corrected the geo properties.
> Joel.
>
>
>
>
> On Wed, 12 Jan 2011, Peter DeVries wrote:
>
>  Hi Joel,
>>
>> Cool :-)
>>
>> I just loaded this into my SPARQL endpoint.
>>
>> In the named graph
>> urn:org:linkedopenspeciesdata:dataspace:tdwg2010bioblitz
>>
>> It consists of 19,990 Triples
>>
>> Here is one of the dwc:taxonConceptID entries.
>>
>> *About: http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata*
>>
>> http://lsd.taxonconcept.org/describe/?url=http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata
>>
>> *About: http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627*
>>
>> http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627
>>
>>
>>
>> This should give you a count of occurrences.
>>
>> SELECT count(*) WHERE {?s a <http://rs.tdwg.org/dwc/terms/#Occurrence>};
>>
>> = 1882
>>
>> SELECT count(*) WHERE {?s a <http://rs.tdwg.org/dwc/terms/#taxonConceptID>};
>>
>> This should give you a list of occurrences
>>
>>
>> http://lsd.taxonconcept.org/describe/?url=http://rs.tdwg.org/dwc/terms/%23Occurrence
>>
>> If this did not come through your email system try the bit.ly.
>>
>> http://bit.ly/g9BcoL
>>
>> I tried the following that should have given me a google map of all the
>> occurrences but it did not result in the map.
>>
>> DESCRIBE ?x WHERE {
>>  ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rs.tdwg.org/dwc/terms/#Occurrence>.
>> }
>>
>> I looked at the RDF and I think I see the problem.
>>
>> In the RDF
>>
>> <geo:latitude>
>> 41.53
>> </geo:latitude>
>>
>> <geo:longitude>
>> -70.67
>> </geo:longitude>
>>
>> Should be
>>
>> <geo:lat>
>> 41.53
>> </geo:lat>
>>
>> <geo:long>
>> -70.67
>> </geo:long>
>>
>> See http://www.w3.org/2003/01/geo/
>>
>> I did the following query to get a list of all the dwc:taxonConceptIDs and
>> have attached them as a .txt file.
>>
>> select distinct ?o WHERE {?s <http://rs.tdwg.org/dwc/terms/#taxonConceptID> ?o}
>>
>> Pretty neat :-)
>>
>> There are some things that I will get back to Joel on.
>>
>> Here is where you can manually enter a SPARQL query. Click on "Advanced"
>> for the entry window.
>>
>> http://lsd.taxonconcept.org/isparql/
>>
>> Respectfully,
>>
>> - Pete
>>
>>
>>
>> On Wed, Jan 12, 2011 at 5:55 PM, joel sachs <jsachs at csee.umbc.edu> wrote:
>>
>>  Hi Everyone,
>>>
>>> I've posted rdf of the bioblitz data. It's at
>>> http://www.cs.umbc.edu/~jsachs/occurrences/TechnoBioblitzOccurrences.rdf.
>>>
>>> Individual occurrences can be retrieved via
>>> http://www.cs.umbc.edu/~jsachs/occurrences/[occurrence_id]
>>> e.g. http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1835
>>>
>>> Individual identifications can be retrieved via
>>> http://www.cs.umbc.edu/~jsachs/identifications/[identification_id]
>>> e.g.
>>> http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1835_id_1
>>>
>>> The scripts behind this are on the kludgy side, so reports of errors and
>>> abnormalities will be warmly welcomed.
>>>
>>> Implicit in each of the following notes is the question "Is this a good
>>> way to do it?":
>>>
>>> 1. The data is "normalized" w.r.t. identification. "Normalized" is in
>>> quotes because I mean it in the sense that Steve Baskauf was using in his
>>> Fall 2010 series of posts. His meaning of the term makes sense to me, but
>>> many people (e.g. the OBO folks), take "normalized ontology" to mean
>>> "disentangled" (i.e. no multiple inheritance.)
>>> As an example, here's an occurrence with two crowdsourced determinations:
>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644
>>>
>>> 2. I used sequential integers for observation and identification IDs; in
>>> practice, a mechanism needs to be in place to prevent two people from
>>> assigning the same id to their respective identifications.
>>>
>>> 3. My answer to Cam Webb's Question #1 from
>>> http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001720.html
>>> is "both". In other words, just as "Joel Sachs" is both me and also my
>>> name, so
>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1668 is both
>>> an occurrence and an occurrence_id, expressed as:
>>> ---
>>> <dwc:Occurrence
>>> rdf:about="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644">
>>> <dwc:occurrenceID>
>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644
>>> </dwc:occurrenceID>
>>> <blah blah blah/>
>>> </dwc:Occurrence>
>>> ---
>>>
>>> 4. I was surprised to see that the Darwin Core Identification class has no
>>> "occurrenceID" or "specimenID" term. How is one supposed to tie an
>>> identification to an observation (assuming the identification is not
>>> in-lined, of course)? DeVries and Baskauf each mint their own terms for
>>> doing this (txn:identificationHasOccurrence and sernec:basedOnOccurrence,
>>> respectively); I used dwc:occurrenceID as if it were a record level term.
>>>
>>> 5. We had scope for multiple taxonConceptID columns in the Fusion table,
>>> and assigned lsids where possible. I also mean to work with Pete to assign
>>> GUIDs from taxonconcept.org. In addition, I assigned ethan taxon concept
>>> ids, which look like this:
>>> http://spire.umbc.edu/ethan/Coffea_arabica
>>>
>>> In their argument over opaque vs. transparent taxonConceptIDs, I was
>>> sympathetic to both Pete's and Gregor's arguments. Ultimately, if the
>>> tooling exists to always display the rdfs:labels every time I'm looking
>>> at a list of opaque IDs, then transparent IDs are unnecessary. But, for
>>> now, it's really helpful to look at an ID and know what it's referring to.
>>>
>>> (For species names not in the spire database, the rdf returned by
>>> http://spire.umbc.edu/ethan/$name
>>> is simply an rdfs:seeAlso to
>>> http://gni.globalnames.org/name_strings?search_term=$name)
>>>
>>> 6. It was easy to assert membership in RDF classes corresponding to
>>> various Cape Cod categories of concern - invasive species, threatened
>>> species, indicators, etc. You can see these classes at
>>> http://spire.umbc.edu/ontologies/lists (Information on where these lists
>>> come from is included as rdfs:comments. I'll add further documentation,
>>> e.g. links to eml files.)
>>>
>>> Note that "ThingOfConcern" is defined as the superclass of all the other
>>> classes in the collection. The idea here is that people can create their
>>> own "ThingOfConcern" class, and then query for observations that are of
>>> concern to them. You can see sample sparql queries at
>>> http://www.csee.umbc.edu/~jsachs/occurrences/queries/sample.txt
>>>
>>>
>>> As an aside, I think we, as a community, should come up with a
>>> biodiversity benchmark suite of rdf data and corresponding sparql queries
>>> that can be used to test the suitability and scalability of semantic web
>>> knowledge bases. I'll take this up in a future post (unless someone beats
>>> me to it).
>>>
>>> Comments, questions, and better ideas are welcome.
>>>
>>> Thanks -
>>> Joel.
>>>
>>>
>>>
>>
>>
>> --
>> ---------------------------------------------------------------
>> Pete DeVries
>> Department of Entomology
>> University of Wisconsin - Madison
>> 445 Russell Laboratories
>> 1630 Linden Drive
>> Madison, WI 53706
>> TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
>> Knowledge Base <http://lod.geospecies.org/>
>> About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
>> ------------------------------------------------------------
>>
>>


-- 
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------