[tdwg-content] Bioblitz as rdf: does this make sense?

Peter DeVries pete.devries at gmail.com
Fri Jan 14 02:55:40 CET 2011


I have gotten the fixed records from Joel and added some elements to the
RDF.

Also, we need to use URIs for people so you can browse between their
records, identifications, etc.

I only did a few of these. I changed all the records from Dima to use
"Dmitry Mozzherin"; there was at least one misspelling (Dimitry Mozzherin).

    <dwc:recordedBy>Dmitry Mozzherin</dwc:recordedBy> *fixed
    <txn:hasCollector rdf:resource="http://lod.taxonconcept.org/people/tdwg2010bioblitz#Dmitry_Mozzherin"/> *added
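
A minimal sketch of how this kind of clean-up could be scripted, in Python.
It is not the script used for these records: the alias table, the function
names, and the rule that a person URI is the canonical name with spaces
turned into underscores are assumptions based only on the one example above.

```python
from xml.sax.saxutils import escape, quoteattr

# Base of the person URIs, taken from the example above.
PEOPLE_BASE = "http://lod.taxonconcept.org/people/tdwg2010bioblitz#"

# Assumed alias table; only the variants mentioned in this message.
ALIASES = {
    "Dima": "Dmitry Mozzherin",
    "Dimitry Mozzherin": "Dmitry Mozzherin",
}


def canonical_name(recorded_by):
    """Return the canonical spelling for a dwc:recordedBy value."""
    name = " ".join(recorded_by.split())  # collapse stray whitespace
    return ALIASES.get(name, name)


def person_uri(recorded_by):
    """Mint a person URI: canonical name with spaces replaced by underscores."""
    return PEOPLE_BASE + canonical_name(recorded_by).replace(" ", "_")


def collector_elements(recorded_by):
    """Return the two RDF/XML elements shown above for one collector."""
    name = canonical_name(recorded_by)
    return [
        "<dwc:recordedBy>%s</dwc:recordedBy>" % escape(name),
        "<txn:hasCollector rdf:resource=%s/>" % quoteattr(person_uri(name)),
    ]


for line in collector_elements("Dimitry Mozzherin"):
    print(line)
```

Keeping the literal dwc:recordedBy next to the URI means existing consumers
of the name string keep working while the URI makes the records linkable.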

Now you can see all the observations he recorded:

http://lsd.taxonconcept.org/describe/?url=http://lod.taxonconcept.org/people/tdwg2010bioblitz%23Dmitry_Mozzherin&sid=219&urilookup=1
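
Note that the "#" in the person URI appears as %23 in the link above. A small
Python sketch of building such a link follows. The helper name is
hypothetical, and the sid and urilookup parameters from the link above are
left out because their meaning is not explained in this thread; the links
earlier in the thread work without them.

```python
from urllib.parse import quote

# Prefix of the describe links used throughout this thread.
DESCRIBE_BASE = "http://lsd.taxonconcept.org/describe/?url="


def describe_url(resource_uri):
    """Return a describe-page link for resource_uri (hypothetical helper).

    The '#' must be sent as %23; left as-is, a browser treats everything
    after it as a fragment and never sends it to the server.
    """
    # Keep ':', '/' and '~' unescaped, matching the links in this message.
    return DESCRIBE_BASE + quote(resource_uri, safe=":/~")


print(describe_url(
    "http://lod.taxonconcept.org/people/tdwg2010bioblitz#Dmitry_Mozzherin"
))
```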

The newer, modified versions of the RDF are here (these are still a work in
progress):

BioBlitz Occurrences
http://lod.taxonconcept.org/tdwg2010bioblitz/TechnoBioblitzOccurrences.rdf

TDWG People http://lod.taxonconcept.org/people/tdwg2010bioblitz_people.rdf

I added a TaxonConcept species concept to only one set of records: those
for the honey bee, Apis mellifera.

It shows up in this occurrence record:

http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_7108&sid=220&urilookup=1

Also, the geoquery works, but it seems to return some strange locations:

http://bit.ly/dYQXUp

Respectfully,

- Pete


On Thu, Jan 13, 2011 at 6:53 AM, joel sachs <jsachs at csee.umbc.edu> wrote:

> Pete,
> Thanks - I corrected the geo properties.
> Joel.
>
>
>
>
> On Wed, 12 Jan 2011, Peter DeVries wrote:
>
>  Hi Joel,
>>
>> Cool :-)
>>
>> I just loaded this into my SPARQL endpoint.
>>
>> In the named graph
>> urn:org:linkedopenspeciesdata:dataspace:tdwg2010bioblitz
>>
>> It consists of 19,990 Triples
>>
>> Here is one of the dwc:taxonConceptID entries.
>>
>> *About: http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata*
>>
>> http://lsd.taxonconcept.org/describe/?url=http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata
>>
>> *About: http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627*
>>
>> http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627
>>
>> This should give you a count of occurrences.
>>
>> SELECT count(*) WHERE {?s a <http://rs.tdwg.org/dwc/terms/#Occurrence>};
>>
>> = 1882
>>
>> SELECT count(*) WHERE {?s a <http://rs.tdwg.org/dwc/terms/#taxonConceptID>};
>>
>> This should give you a list of occurrences
>>
>>
>> http://lsd.taxonconcept.org/describe/?url=http://rs.tdwg.org/dwc/terms/%23Occurrence
>>
>> If this did not come through your email system try the bit.ly.
>>
>> http://bit.ly/g9BcoL
>>
>> I tried the following, which should have given me a Google map of all the
>> occurrences, but it did not produce the map.
>>
>> DESCRIBE ?x WHERE {
>>  ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rs.tdwg.org/dwc/terms/#Occurrence>.
>> }
>>
>> I looked at the RDF and I think I see the problem.
>>
>> In the RDF
>>
>> <geo:latitude>
>> 41.53
>> </geo:latitude>
>>
>> <geo:longitude>
>> -70.67
>> </geo:longitude>
>>
>> Should be
>>
>> <geo:lat>
>> 41.53
>> </geo:lat>
>>
>> <geo:long>
>> -70.67
>> </geo:long>
>>
>> See http://www.w3.org/2003/01/geo/
>>
>> I ran the following query to get a list of all the dwc:taxonConceptIDs
>> and have attached them as a .txt file.
>>
>> select distinct ?o WHERE {?s <http://rs.tdwg.org/dwc/terms/#taxonConceptID> ?o}
>>
>> Pretty neat :-)
>>
>> There are some things that I will get back to Joel on.
>>
>> Here is where you can manually enter a SPARQL query. Click on "Advanced"
>> for the entry window.
>>
>> http://lsd.taxonconcept.org/isparql/
>>
>> Respectfully,
>>
>> - Pete
>>
>>
>>
>> On Wed, Jan 12, 2011 at 5:55 PM, joel sachs <jsachs at csee.umbc.edu> wrote:
>>
>>  Hi Everyone,
>>>
>>> I've posted rdf of the bioblitz data. It's at
>>> http://www.cs.umbc.edu/~jsachs/occurrences/TechnoBioblitzOccurrences.rdf.
>>>
>>> Individual occurrences can be retrieved via
>>> http://www.cs.umbc.edu/~jsachs/occurrences/[occurrence_id]
>>> e.g. http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1835
>>>
>>> Individual identifications can be retrieved via
>>> http://www.cs.umbc.edu/~jsachs/identifications/[identification_id]
>>> e.g.
>>> http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1835_id_1
>>>
>>> The scripts behind this are on the kludgy side, so reports of errors and
>>> abnormalities will be warmly welcomed.
>>>
>>> Implicit in each of the following notes is the question "Is this a good
>>> way to do it?":
>>>
>>> 1. The data is "normalized" w.r.t. identification. "Normalized" is in
>>> quotes because I mean it in the sense that Steve Baskauf was using in his
>>> Fall 2010 series of posts. His meaning of the term makes sense to me, but
>>> many people (e.g. the OBO folks), take "normalized ontology" to mean
>>> "disentangled" (i.e. no multiple inheritance.)
>>> As an example, here's an occurrence with two crowdsourced determinations:
>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644
>>>
>>> 2. I used sequential integers for observation and identification IDs; in
>>> practice, a mechanism needs to be in place to prevent two people from
>>> assigning the same id to their respective identifications.
>>>
>>> 3. My answer to Cam Webb's Question #1 from
>>> http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001720.html
>>> is "both". In other words, just as "Joel Sachs" is both me and also my
>>> name, so
>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1668 is both
>>> an occurrence and an occurrence_id, expressed as:
>>> ---
>>> <dwc:Occurrence rdf:about="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644">
>>> <dwc:occurrenceID>
>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644
>>> </dwc:occurrenceID>
>>> <blah blah blah/>
>>> </dwc:Occurrence>
>>> ---
>>>
>>> 4. I was surprised to see that the Darwin Core Identification class has
>>> no "occurrenceID" or "specimenID" term. How is one supposed to tie an
>>> identification to an observation (assuming the identification is not
>>> in-lined, of course)? DeVries and Baskauf each mint their own terms for
>>> doing this (txn:identificationHasOccurrence and sernec:basedOnOccurrence,
>>> respectively); I used dwc:occurrenceID as if it were a record-level term.
>>>
>>> 5. We had scope for multiple taxonConceptID columns in the Fusion table,
>>> and assigned LSIDs where possible. I also mean to work with Pete to
>>> assign GUIDs from taxonconcept.org. In addition, I assigned ethan taxon
>>> concept ids, which look like this:
>>> http://spire.umbc.edu/ethan/Coffea_arabica
>>>
>>> In their argument over opaque vs. transparent taxonConceptIDs, I was
>>> sympathetic to both Pete's and Gregor's arguments. Ultimately, if the
>>> tooling exists to always display the rdfs:labels every time I'm looking
>>> at a list of opaque IDs, then transparent IDs are unnecessary. But, for
>>> now, it's really helpful to look at an ID and know what it's referring
>>> to.
>>>
>>> (For species names not in the spire database, the rdf returned by
>>> http://spire.umbc.edu/ethan/$name
>>> is simply an rdfs:seeAlso to
>>> http://gni.globalnames.org/name_strings?search_term=$name)
>>>
>>> 6. It was easy to assert membership in RDF classes corresponding to
>>> various Cape Cod categories of concern: invasive species, threatened
>>> species, indicators, etc. You can see these classes at
>>> http://spire.umbc.edu/ontologies/lists (Information on where these lists
>>> come from is included as rdfs:comments. I'll add further documentation,
>>> e.g. links to eml files.)
>>>
>>> Note that "ThingOfConcern" is defined as the superclass of all the other
>>> classes in the collection. The idea here is that people can create their
>>> own "ThingOfConcern" class, and then query for observations that are of
>>> concern to them. You can see sample sparql queries at
>>> http://www.csee.umbc.edu/~jsachs/occurrences/queries/sample.txt
>>>
>>>
>>> As an aside, I think we, as a community, should come up with a
>>> biodiversity benchmark suite of rdf data and corresponding sparql
>>> queries that can be used to test the suitability and scalability of
>>> semantic web knowledge bases. I'll take this up in a future post (unless
>>> someone beats me to it).
>>>
>>> Comments, questions, and better ideas are welcome.
>>>
>>> Thanks -
>>> Joel.
>>>


-- 
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------