[tdwg-content] Bioblitz as rdf: does this make sense?

Peter DeVries pete.devries at gmail.com
Fri Jan 14 06:12:05 CET 2011


Most of those were from a longitude that was 70 rather than -70.

I fixed that in the RDF and reloaded the data set.

The SPARQL Query is still via http://bit.ly/dYQXUp

If you downloaded the RDF recently from my links above you might want to do
so again, since they just changed.

- Pete

On Thu, Jan 13, 2011 at 10:33 PM, Bob Morris <morris.bob at gmail.com> wrote:

> The geolocations in the *stans apparently have a systematic sign error
> in the longitude. The one on the Turkish coast is more mysterious. If
> the actual observer and observation time were available, the strategy
> for quality control on geolocation outliers would start with
> correlation of the locations of all the observations that day of the
> observer whose data assert
>   <geo:lat     38.47   geo:long        27.09>
> For example, it's fairly unlikely that the observer could get from
> Massachusetts to Turkey in a few hours.... OTOH, if it's a solitary
> observation from that observer, it might really be a Turkish
> observation.
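
A minimal sketch of the check Bob describes, in plain Python. The record keys ('id', 'observer', 'date', 'lat', 'long') and the 200 km threshold are my own assumptions for illustration; the posted RDF does not currently carry observer or time, so this only shows what the correlation could look like if it did.

```python
import math
from collections import defaultdict
from statistics import median


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def flag_outliers(records, max_km=200.0):
    """Return ids of records far from the median location of the same observer and day.

    records: iterable of dicts with the hypothetical keys
             'id', 'observer', 'date', 'lat', 'long'.
    A solitary observation has nothing to compare against and is never flagged.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["observer"], rec["date"])].append(rec)

    flagged = []
    for recs in groups.values():
        if len(recs) < 2:
            continue  # solitary observation: might really be from there
        mlat = median(rec["lat"] for rec in recs)
        mlon = median(rec["long"] for rec in recs)
        for rec in recs:
            if haversine_km(rec["lat"], rec["long"], mlat, mlon) > max_km:
                flagged.append(rec["id"])
    return flagged
```

The median is used rather than the mean so that a single sign-flipped longitude does not drag the reference point halfway around the world.
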
>
>
> Bob
>
>
> On Thu, Jan 13, 2011 at 8:55 PM, Peter DeVries <pete.devries at gmail.com>
> wrote:
> >...
> > Also the geoquery works but it seems to return some strange locations?
> > http://bit.ly/dYQXUp
> >
> > Respectfully,
> > - Pete
> >
> > On Thu, Jan 13, 2011 at 6:53 AM, joel sachs <jsachs at csee.umbc.edu> wrote:
> >>
> >> Pete,
> >> Thanks - I corrected the geo properties.
> >> Joel.
> >>
> >>
> >>
> >> On Wed, 12 Jan 2011, Peter DeVries wrote:
> >>
> >>> Hi Joel,
> >>>
> >>> Cool :-)
> >>>
> >>> I just loaded this into my SPARQL endpoint.
> >>>
> >>> In the named graph
> >>> urn:org:linkedopenspeciesdata:dataspace:tdwg2010bioblitz
> >>>
> >>> It consists of 19,990 Triples
> >>>
> >>> Here is one of the dwc:taxonConceptID entries.
> >>>
> >>> *About: http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata*
> >>>
> >>> http://lsd.taxonconcept.org/describe/?url=http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata
> >>>
> >>> *About: http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627*
> >>>
> >>> http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627
> >>>
> >>> This should give you a count of occurrences.
> >>>
> >>> SELECT count(*) WHERE {?s a <http://rs.tdwg.org/dwc/terms/#Occurrence>};
> >>>
> >>> = 1882
> >>>
> >>> SELECT count(*) WHERE {?s a
> >>> <http://rs.tdwg.org/dwc/terms/#taxonConceptID>};
> >>>
> >>> This should give you a list of occurrences
> >>>
> >>> http://lsd.taxonconcept.org/describe/?url=http://rs.tdwg.org/dwc/terms/%23Occurrence
> >>>
> >>> If this did not come through your email system try the bit.ly.
> >>>
> >>> http://bit.ly/g9BcoL
> >>>
> >>> I tried the following that should have given me a google map of all the
> >>> occurrences but it did not result in the map.
> >>>
> >>> DESCRIBE ?x WHERE {
> >>>  ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rs.tdwg.org/dwc/terms/#Occurrence>.
> >>> }
> >>>
> >>> I looked at the RDF and I think I see the problem.
> >>>
> >>> In the RDF
> >>>
> >>> <geo:latitude>
> >>> 41.53
> >>> </geo:latitude>
> >>>
> >>> <geo:longitude>
> >>> -70.67
> >>> </geo:longitude>
> >>>
> >>> Should be
> >>>
> >>> <geo:lat>
> >>> 41.53
> >>> </geo:lat>
> >>>
> >>> <geo:long>
> >>> -70.67
> >>> </geo:long>
> >>>
> >>> See http://www.w3.org/2003/01/geo/
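
If the file has to be patched by hand, the rename can be sketched as a plain text substitution in Python. This assumes the prefix "geo" is the one bound to the W3C Basic Geo namespace in the file, and that the properties appear as element tags as in the snippet quoted above; fix_geo_tags is just a name I made up for the example.

```python
import re

# Property names used in the posted RDF, mapped to the names
# defined by the W3C Basic Geo vocabulary.
RENAMES = {"latitude": "lat", "longitude": "long"}

# Matches the start of an opening or closing tag such as <geo:latitude or </geo:longitude
_TAG = re.compile(r"(</?geo:)(latitude|longitude)\b")


def fix_geo_tags(rdf_xml: str) -> str:
    """Rewrite geo:latitude / geo:longitude element tags to geo:lat / geo:long."""
    return _TAG.sub(lambda m: m.group(1) + RENAMES[m.group(2)], rdf_xml)
```

A textual substitution like this leaves the coordinate values untouched, so it would not catch sign errors in the data itself.
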
> >>>
> >>> I did the following query to get a list of all the dwc:taxonConceptID's
> >>> and have attached them as a .txt file.
> >>>
> >>> select distinct ?o WHERE {?s
> >>> <http://rs.tdwg.org/dwc/terms/#taxonConceptID>
> >>> ?o}
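
For anyone without a SPARQL endpoint handy, roughly the same list can be pulled straight from the RDF/XML with the Python standard library. This is only a sketch: it assumes the Darwin Core namespace is written with the trailing "#" as in the queries in this thread, and that each value is either an rdf:resource attribute or the element text. I have not checked which form the posted file uses, so it accepts both.

```python
import xml.etree.ElementTree as ET

# Namespaces; the DWC one is copied from the queries in this thread (assumption).
DWC = "http://rs.tdwg.org/dwc/terms/#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def distinct_taxon_concept_ids(rdf_xml: str) -> list:
    """Return the sorted, de-duplicated dwc:taxonConceptID values in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    found = set()
    for el in root.iter("{%s}taxonConceptID" % DWC):
        # Accept either <dwc:taxonConceptID rdf:resource="..."/> or a text value.
        value = el.get("{%s}resource" % RDF) or (el.text or "").strip()
        if value:
            found.add(value)
    return sorted(found)
```
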
> >>>
> >>> Pretty neat :-)
> >>>
> >>> There are some things that I will get back to Joel on.
> >>>
> >>> Here is where you can manually enter a SPARQL query. Click on
> >>> "Advanced" for the entry window.
> >>>
> >>> http://lsd.taxonconcept.org/isparql/
> >>>
> >>> Respectfully,
> >>>
> >>> - Pete
> >>>
> >>>
> >>>
> >>> On Wed, Jan 12, 2011 at 5:55 PM, joel sachs <jsachs at csee.umbc.edu> wrote:
> >>>
> >>>> Hi Everyone,
> >>>>
> >>>> I've posted rdf of the bioblitz data. It's at
> >>>> http://www.cs.umbc.edu/~jsachs/occurrences/TechnoBioblitzOccurrences.rdf
> >>>>
> >>>> Individual occurrences can be retrieved via
> >>>> http://www.cs.umbc.edu/~jsachs/occurrences/[occurrence_id]
> >>>> e.g. http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1835
> >>>>
> >>>> Individual identifications can be retrieved via
> >>>> http://www.cs.umbc.edu/~jsachs/identifications/[identification_id]
> >>>> e.g. http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1835_id_1
> >>>>
> >>>> The scripts behind this are on the kludgy side, so reports of errors
> >>>> and abnormalities will be warmly welcomed.
> >>>>
> >>>> Implicit in each of the following notes is the question "Is this a
> >>>> good way to do it?":
> >>>>
> >>>> 1. The data is "normalized" w.r.t. identification. "Normalized" is in
> >>>> quotes because I mean it in the sense that Steve Baskauf was using in
> >>>> his Fall 2010 series of posts. His meaning of the term makes sense to
> >>>> me, but many people (e.g. the OBO folks) take "normalized ontology" to
> >>>> mean "disentangled" (i.e. no multiple inheritance).
> >>>> As an example, here's an occurrence with two crowdsourced
> >>>> determinations:
> >>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644
> >>>>
> >>>> 2. I used sequential integers for observation and identification IDs;
> >>>> in practice, a mechanism needs to be in place to prevent two people
> >>>> from assigning the same id to their respective identifications.
> >>>>
> >>>> 3. My answer to Cam Webb's Question #1 from
> >>>> http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001720.html
> >>>> is "both". In other words, just as "Joel Sachs" is both me and also my
> >>>> name, so
> >>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1668 is
> >>>> both an occurrence and an occurrence_id, expressed as:
> >>>> ---
> >>>> <dwc:Occurrence
> >>>> rdf:about="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644">
> >>>> <dwc:occurrenceID>
> >>>> http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644
> >>>> </dwc:occurrenceID>
> >>>> <blah blah blah/>
> >>>> </dwc:Occurrence>
> >>>> ---
> >>>>
> >>>> 4. I was surprised to see that the Darwin Core Identification class
> >>>> has no "occurrenceID" or "specimenID" term. How is one supposed to tie
> >>>> an identification to an observation (assuming the identification is
> >>>> not in-lined, of course)? DeVries and Baskauf each mint their own
> >>>> terms for doing this (txn:identificationHasOccurrence, and
> >>>> sernec:basedOnOccurrence, respectively); I used dwc:occurrenceID as if
> >>>> it were a record level term.
> >>>>
> >>>> 5. We had scope for multiple taxonConceptID columns in the Fusion
> >>>> table, and assigned LSIDs where possible. I also mean to work with
> >>>> Pete to assign GUIDs from taxonconcept.org. In addition, I assigned
> >>>> ethan taxon concept ids, which look like this:
> >>>> http://spire.umbc.edu/ethan/Coffea_arabica
> >>>>
> >>>> In their argument over opaque vs. transparent taxonConceptIDs, I was
> >>>> sympathetic to both Pete's and Gregor's arguments. Ultimately, if the
> >>>> tooling exists to always display the rdfs:labels every time I'm
> >>>> looking at a list of opaque IDs, then transparent IDs are unnecessary.
> >>>> But, for now, it's really helpful to look at an ID and know what it's
> >>>> referring to.
> >>>>
> >>>> (For species names not in the spire database, the rdf returned by
> >>>> http://spire.umbc.edu/ethan/$name
> >>>> is simply an rdfs:seeAlso to
> >>>> http://gni.globalnames.org/name_strings?search_term=$name)
> >>>>
> >>>> 6. It was easy to assert membership in RDF classes corresponding to
> >>>> various Cape Cod categories of concern - invasive species,
> >>>> threatened species, indicators, etc. You can see these classes at
> >>>> http://spire.umbc.edu/ontologies/lists (Information on where these
> >>>> lists come from is included as rdfs:comments. I'll add further
> >>>> documentation, e.g. links to eml files.)
> >>>>
> >>>> Note that "ThingOfConcern" is defined as the superclass of all the
> >>>> other classes in the collection. The idea here is that people can
> >>>> create their own "ThingOfConcern" class, and then query for
> >>>> observations that are of concern to them. You can see sample sparql
> >>>> queries at
> >>>> http://www.csee.umbc.edu/~jsachs/occurrences/queries/sample.txt
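
To make the superclass idea concrete without a reasoner, here is a small Python sketch of the lookup. Only "ThingOfConcern" comes from Joel's note; the other class names, the occurrence ids, and the two dictionaries are hypothetical stand-ins for what would come out of the RDF, and this is not a reproduction of the sample queries linked above.

```python
def superclasses(cls, subclass_of):
    """Return cls plus every class reachable by following subClassOf links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def things_of_concern(types, subclass_of, top="ThingOfConcern"):
    """Return the sorted subjects with at least one asserted type under `top`.

    types:       subject -> list of asserted classes
    subclass_of: class   -> list of direct superclasses
    """
    return sorted(
        subject
        for subject, classes in types.items()
        if any(top in superclasses(c, subclass_of) for c in classes)
    )
```

Someone defining their own notion of concern would only need to supply a different subclass_of mapping (or a different top class); the lookup stays the same.
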
> >>>>
> >>>>
> >>>> As an aside, I think we, as a community, should come up with a
> >>>> biodiversity benchmark suite of rdf data and corresponding sparql
> >>>> queries that can be used to test the suitability and scalability of
> >>>> semantic web knowledge bases. I'll take this up in a future post
> >>>> (unless someone beats me to it).
> >>>>
> >>>> Comments, questions, and better ideas are welcome.
> >>>>
> >>>> Thanks -
> >>>> Joel.
> >>>>
> >>>> _______________________________________________
> >>>> tdwg-content mailing list
> >>>> tdwg-content at lists.tdwg.org
> >>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> ---------------------------------------------------------------
> >>> Pete DeVries
> >>> Department of Entomology
> >>> University of Wisconsin - Madison
> >>> 445 Russell Laboratories
> >>> 1630 Linden Drive
> >>> Madison, WI 53706
> >>> TaxonConcept Knowledge Base <http://www.taxonconcept.org/> /
> GeoSpecies
> >>> Knowledge Base <http://lod.geospecies.org/>
> >>> About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
> >>> ------------------------------------------------------------
> >>>
> >
> >
> >
> >
>
>
>
> --
> Robert A. Morris
> Emeritus Professor  of Computer Science
> UMASS-Boston
> 100 Morrissey Blvd
> Boston, MA 02125-3390
> Associate, Harvard University Herbaria
> email: morris.bob at gmail.com
> web: http://bdei.cs.umb.edu/
> web: http://etaxonomy.org/mw/FilteredPush
> http://www.cs.umb.edu/~ram
> phone (+1) 857 222 7992 (mobile)
>



-- 
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------