Most of those came from a longitude that was 70 rather than -70. <div><br></div><div>I fixed that in the RDF and reloaded the data set.</div><div><br></div><div>The SPARQL Query is still via <a href="http://bit.ly/dYQXUp">http://bit.ly/dYQXUp</a></div>
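<div><br></div><div>For what it's worth, here is a rough sketch of how a flipped sign could be spotted automatically. It is only an illustration: the reference point is the 41.53 / -70.67 pair from the RDF example in this thread, and the function name and the 2-degree tolerance are my own assumptions, not anything in the data set.</div>

```python
# Hypothetical helper: flag a point whose longitude looks sign-flipped,
# i.e. it is far from the reference as given but close once negated.
# ref_lat/ref_long come from the thread's example; tol is an assumed value.
def looks_sign_flipped(lat, long, ref_lat=41.53, ref_long=-70.67, tol=2.0):
    lat_ok = abs(lat - ref_lat) <= tol
    near_as_given = lat_ok and abs(long - ref_long) <= tol
    near_if_flipped = lat_ok and abs(-long - ref_long) <= tol
    return (not near_as_given) and near_if_flipped


# Made-up sample points: one correct, one flipped, one genuinely elsewhere.
points = [(41.53, -70.67), (41.53, 70.67), (38.47, 27.09)]
flagged = [p for p in points if looks_sign_flipped(*p)]
```

<div>A point that is simply somewhere else, such as the Turkish one, is not flagged by this test and would need a different check.</div>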
<div><br></div><div>If you downloaded the RDF recently from my links above you might want to do so again, since they just changed.</div><div><br></div><div>- Pete<br><br><div class="gmail_quote">On Thu, Jan 13, 2011 at 10:33 PM, Bob Morris <span dir="ltr"><<a href="mailto:morris.bob@gmail.com">morris.bob@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">The geolocations in the *stans apparently have a systematic sign error<br>
in the longitude. The one on the Turkish coast is more mysterious. If<br>
the actual observer and observation time were available, the strategy<br>
for quality control on geolocation outliers would start with<br>
correlation of the locations of all the observations that day of the<br>
observer whose data assert<br>
<geo:lat 38.47 geo:long 27.09><br>
For example, it's fairly unlikely that the observer could get from<br>
Massachusetts to Turkey in a few hours.... OTOH, if it's a solitary<br>
observation from that observer, it might really be a Turkish<br>
observation.<br>
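<br>
A rough sketch of that correlation, assuming each record carried a timestamp<br>
(which the current data may not): sort one observer's records by time and flag<br>
consecutive pairs whose implied travel speed is implausible. The function names,<br>
the 900 km/h threshold, and the sample dates are made up for illustration.<br>

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def implausible_jumps(observations, max_kmh=900.0):
    """Return consecutive pairs whose implied speed exceeds max_kmh.

    observations: list of (timestamp, lat, long) for ONE observer.
    max_kmh is an assumed threshold, roughly airliner speed.
    """
    ordered = sorted(observations)
    suspects = []
    for first, second in zip(ordered, ordered[1:]):
        (t1, la1, lo1), (t2, la2, lo2) = first, second
        hours = (t2 - t1).total_seconds() / 3600.0
        km = haversine_km(la1, lo1, la2, lo2)
        # Same timestamp: any displacement at all is suspicious.
        too_fast = km > 0 if hours == 0 else km / hours > max_kmh
        if too_fast:
            suspects.append((first, second))
    return suspects


# Hypothetical records for one observer (dates and times are invented).
obs = [
    (datetime(2010, 9, 27, 9, 0), 41.53, -70.67),
    (datetime(2010, 9, 27, 10, 0), 41.54, -70.66),
    (datetime(2010, 9, 27, 12, 0), 38.47, 27.09),
]
suspects = implausible_jumps(obs)
```

A flagged pair only says that one of the two records is suspect, not which one,<br>
and a solitary observation produces no pairs at all, so it is never flagged.<br>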
<br>
<br>
Bob<br>
<br>
<br>
On Thu, Jan 13, 2011 at 8:55 PM, Peter DeVries <<a href="mailto:pete.devries@gmail.com">pete.devries@gmail.com</a>> wrote:<br>
>...<br>
<div><div></div><div class="h5">> Also the geoquery works but it seems to return some strange locations?<br>
> <a href="http://bit.ly/dYQXUp" target="_blank">http://bit.ly/dYQXUp</a><br>
><br>
> Respectfully,<br>
> - Pete<br>
><br>
> On Thu, Jan 13, 2011 at 6:53 AM, joel sachs <<a href="mailto:jsachs@csee.umbc.edu">jsachs@csee.umbc.edu</a>> wrote:<br>
>><br>
>> Pete,<br>
>> Thanks - I corrected the geo properties.<br>
>> Joel.<br>
>><br>
>><br>
>><br>
>> On Wed, 12 Jan 2011, Peter DeVries wrote:<br>
>><br>
>>> Hi Joel,<br>
>>><br>
>>> Cool :-)<br>
>>><br>
>>> I just loaded this into my SPARQL endpoint.<br>
>>><br>
>>> In the named graph<br>
>>> urn:org:linkedopenspeciesdata:dataspace:tdwg2010bioblitz<br>
>>><br>
>>> It consists of 19,990 Triples<br>
>>><br>
>>> Here is one of the dwc:taxonConceptID entries.<br>
>>><br>
>>> About: <a href="http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata" target="_blank">http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata</a><br>
>>><br>
>>> <a href="http://lsd.taxonconcept.org/describe/?url=http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata" target="_blank">http://lsd.taxonconcept.org/describe/?url=http://spire.umbc.edu/ethan/Ampelopsis_brevipedunculata</a><br>
>>><br>
>>> About: <a href="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627</a><br>
>>><br>
>>> <a href="http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627" target="_blank">http://lsd.taxonconcept.org/describe/?url=http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1627</a><br>
>>><br>
>>> This should give you a count of occurrences.<br>
>>><br>
>>> SELECT count(*) WHERE {?s a <<a href="http://rs.tdwg.org/dwc/terms/#Occurrence" target="_blank">http://rs.tdwg.org/dwc/terms/#Occurrence</a>>};<br>
>>><br>
>>> = 1882<br>
>>><br>
>>> SELECT count(*) WHERE {?s a<br>
>>> <<a href="http://rs.tdwg.org/dwc/terms/#taxonConceptID" target="_blank">http://rs.tdwg.org/dwc/terms/#taxonConceptID</a>>};<br>
>>><br>
>>> This should give you a list of occurrences<br>
>>><br>
>>><br>
>>> <a href="http://lsd.taxonconcept.org/describe/?url=http://rs.tdwg.org/dwc/terms/%23Occurrence" target="_blank">http://lsd.taxonconcept.org/describe/?url=http://rs.tdwg.org/dwc/terms/%23Occurrence</a><br>
>>><br>
>>> If this did not come through your email system try the <a href="http://bit.ly" target="_blank">bit.ly</a>.<br>
>>><br>
>>> <a href="http://bit.ly/g9BcoL" target="_blank">http://bit.ly/g9BcoL</a><br>
>>><br>
>>> I tried the following query, which should have given me a Google map of all<br>
>>> the occurrences, but it did not produce the map.<br>
>>><br>
>>> DESCRIBE ?x WHERE {<br>
>>> ?x <<a href="http://www.w3.org/1999/02/22-rdf-syntax-ns#type" target="_blank">http://www.w3.org/1999/02/22-rdf-syntax-ns#type</a>> <<a href="http://rs.tdwg.org/dwc/terms/#Occurrence" target="_blank">http://rs.tdwg.org/dwc/terms/#Occurrence</a>>.<br>
>>> }<br>
>>><br>
>>> I looked at the RDF and I think I see the problem.<br>
>>><br>
>>> In the RDF<br>
>>><br>
>>> <geo:latitude><br>
>>> 41.53<br>
>>> </geo:latitude><br>
>>><br>
>>> <geo:longitude><br>
>>> -70.67<br>
>>> </geo:longitude><br>
>>><br>
>>> Should be<br>
>>><br>
>>> <geo:lat><br>
>>> 41.53<br>
>>> </geo:lat><br>
>>><br>
>>> <geo:long><br>
>>> -70.67<br>
>>> </geo:long><br>
>>><br>
>>> See <a href="http://www.w3.org/2003/01/geo/" target="_blank">http://www.w3.org/2003/01/geo/</a><br>
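>>><br>
>>> A minimal sketch of that rename, using only the Python standard library.<br>
>>> It assumes the file binds the geo prefix to the W3C Basic Geo namespace<br>
>>> and only has the wrong local names; the function name and the sample<br>
>>> document are made up for illustration.<br>

```python
import xml.etree.ElementTree as ET

# W3C Basic Geo (WGS84) namespace; assumed to be what geo: is bound to.
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RENAMES = {"latitude": "lat", "longitude": "long"}


def fix_geo_terms(rdf_xml):
    """Rename geo:latitude/geo:longitude elements to geo:lat/geo:long."""
    ET.register_namespace("geo", GEO)
    root = ET.fromstring(rdf_xml)
    for elem in root.iter():
        for wrong, right in RENAMES.items():
            if elem.tag == "{%s}%s" % (GEO, wrong):
                elem.tag = "{%s}%s" % (GEO, right)
                # Drop the newlines that surround the literal value.
                if elem.text:
                    elem.text = elem.text.strip()
    return ET.tostring(root, encoding="unicode")


# Hypothetical input shaped like the fragment quoted above.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <rdf:Description rdf:about="http://example.org/occurrence/1">
    <geo:latitude>
41.53
</geo:latitude>
    <geo:longitude>
-70.67
</geo:longitude>
  </rdf:Description>
</rdf:RDF>"""

fixed = fix_geo_terms(sample)
```
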
>>><br>
>>> I ran the following query to list all the dwc:taxonConceptIDs, and<br>
>>> have attached the results as a .txt file.<br>
>>><br>
>>> select distinct ?o WHERE {?s<br>
>>> <<a href="http://rs.tdwg.org/dwc/terms/#taxonConceptID" target="_blank">http://rs.tdwg.org/dwc/terms/#taxonConceptID</a>><br>
>>> ?o}<br>
>>><br>
>>> Pretty neat :-)<br>
>>><br>
>>> There are some things that I will get back to Joel on.<br>
>>><br>
>>> Here is where you can manually enter a SPARQL query. Click on "Advanced"<br>
>>> for<br>
>>> the entry window.<br>
>>><br>
>>> <a href="http://lsd.taxonconcept.org/isparql/" target="_blank">http://lsd.taxonconcept.org/isparql/</a><br>
>>><br>
>>> Respectfully,<br>
>>><br>
>>> - Pete<br>
>>><br>
>>><br>
>>><br>
>>> On Wed, Jan 12, 2011 at 5:55 PM, joel sachs <<a href="mailto:jsachs@csee.umbc.edu">jsachs@csee.umbc.edu</a>> wrote:<br>
>>><br>
>>>> Hi Everyone,<br>
>>>><br>
>>>> I've posted rdf of the bioblitz data. It's at<br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/occurrences/TechnoBioblitzOccurrences.rdf" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/TechnoBioblitzOccurrences.rdf</a><br>
>>>> .<br>
>>>><br>
>>>> Individual occurrences can be retrieved via<br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/occurrences/[occurrence_id]" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/[occurrence_id]</a><br>
>>>> e.g. <a href="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1835" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1835</a><br>
>>>><br>
>>>> Individual identifications can be retrieved via<br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/identifications/[identification_id]" target="_blank">http://www.cs.umbc.edu/~jsachs/identifications/[identification_id]</a><br>
>>>> e.g.<br>
>>>><br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1835_id_1" target="_blank">http://www.cs.umbc.edu/~jsachs/identifications/tdwg2010bioblitz_1835_id_1</a><br>
>>>><br>
>>>> The scripts behind this are on the kludgy side, so reports of errors and<br>
>>>> abnormalities will be warmly welcomed.<br>
>>>><br>
>>>> Implicit in each of the following notes is the question "Is this a good<br>
>>>> way to do it?":<br>
>>>><br>
>>>> 1. The data is "normalized" w.r.t. identification. "Normalized" is in<br>
>>>> quotes because I mean it in the sense that Steve Baskauf was using in<br>
>>>> his<br>
>>>> Fall 2010 series of posts. His meaning of the term makes sense to me,<br>
>>>> but<br>
>>>> many people (e.g. the OBO folks), take "normalized ontology" to mean<br>
>>>> "disentangled" (i.e. no multiple inheritance.)<br>
>>>> As an example, here's an occurrence with two crowdsourced<br>
>>>> determinations:<br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644</a><br>
>>>><br>
>>>> 2. I used sequential integers for observation and identification IDs; in<br>
>>>> practice, a mechanism needs to be in place to prevent two people from<br>
>>>> assigning the same id to their respective identifications.<br>
>>>><br>
>>>> 3. My answer to Cam Webb's Question #1 from<br>
>>>> <a href="http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001720.html" target="_blank">http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001720.html</a><br>
>>>> is "both". In other words, just as "Joel Sachs" is both me and also my<br>
>>>> name, so<br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1668" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1668</a> is both<br>
>>>> an occurrence and an occurrence_id, expressed as:<br>
>>>> ---<br>
>>>> <dwc:Occurrence<br>
>>>> rdf:about="<br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644</a>"><br>
>>>> <dwc:occurrenceID><br>
>>>> <a href="http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644" target="_blank">http://www.cs.umbc.edu/~jsachs/occurrences/tdwg2010bioblitz_1644</a><br>
>>>> </dwc:occurrenceID><br>
>>>> <blah blah blah/><br>
>>>> </dwc:Occurrence><br>
>>>> ---<br>
>>>><br>
>>>> 4. I was surprised to see that the Darwin Core Identification class has<br>
>>>> no<br>
>>>> "occurrenceID" or "specimenID" term. How is one supposed to tie an<br>
>>>> identification to an observation (assuming the identification is not<br>
>>>> in-lined, of course)? DeVries and Baskauf each mint their own terms for<br>
>>>> doing this (txn:identificationHasOccurrence, and<br>
>>>> sernec:basedOnOccurrence,<br>
>>>> respectively); I used dwc:occurrenceID as if it were a record level<br>
>>>> term.<br>
>>>><br>
>>>> 5. We had scope for multiple taxonConceptID columns in the Fusion table,<br>
>>>> and assigned lsids where possible. I also mean to work with Pete to<br>
>>>> assign<br>
>>>> GUIDs from <a href="http://taxonconcept.org" target="_blank">taxonconcept.org</a>. In addition, I assigned ethan taxon concept<br>
>>>> ids, which look like this:<br>
>>>> <a href="http://spire.umbc.edu/ethan/Coffea_arabica" target="_blank">http://spire.umbc.edu/ethan/Coffea_arabica</a><br>
>>>><br>
>>>> In their argument over opaque vs. transparent taxonConceptIDs, I was<br>
>>>> sympathetic to both Pete's and Gregor's arguments. Ultimately, if the<br>
>>>> tooling exists to always display the rdfs:labels every time I'm looking<br>
>>>> at a list of opaqueIDs, then transparent IDs are unnecessary. But, for<br>
>>>> now, it's really helpful to look at an ID and know what it's referring<br>
>>>> to.<br>
>>>><br>
>>>> (For species names not in the spire database, the rdf returned by<br>
>>>> <a href="http://spire.umbc.edu/ethan/$name" target="_blank">http://spire.umbc.edu/ethan/$name</a><br>
>>>> is simply an rdfs:seeAlso to<br>
>>>> <a href="http://gni.globalnames.org/name_strings?search_term=$name" target="_blank">http://gni.globalnames.org/name_strings?search_term=$name</a>)<br>
>>>><br>
>>>> 6. It was easy to assert membership in RDF classes corresponding to<br>
>>>> various Cape Cod categories of concern - invasive species, threatened<br>
>>>> species, indicators, etc. You can see these classes at<br>
>>>> <a href="http://spire.umbc.edu/ontologies/lists" target="_blank">http://spire.umbc.edu/ontologies/lists</a> (Information on where these lists<br>
>>>> come from is included as rdfs:comments. I'll add further documentation,<br>
>>>> e.g. links to eml files.)<br>
>>>><br>
>>>> Note that "ThingOfConcern" is defined as the superclass of all the other<br>
>>>> classes in the collection. The idea here is that people can create their<br>
>>>> own "ThingOfConcern" class, and then query for observations that are of<br>
>>>> concern to them. You can see sample sparql queries at<br>
>>>> <a href="http://www.csee.umbc.edu/~jsachs/occurrences/queries/sample.txt" target="_blank">http://www.csee.umbc.edu/~jsachs/occurrences/queries/sample.txt</a><br>
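>>>><br>
>>>> To make the idea concrete, here is a toy sketch in plain Python rather<br>
>>>> than SPARQL: collect every class under a user's own "ThingOfConcern"<br>
>>>> via rdfs:subClassOf, then keep the resources typed with any of them.<br>
>>>> All prefixes, class names, and taxa below are hypothetical; a real<br>
>>>> triple store would do this with inference or a property path.<br>

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def subclasses_of(triples, root):
    """Return root plus every class reachable from it via rdfs:subClassOf."""
    found = {root}
    frontier = [root]
    while frontier:
        parent = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def things_of_concern(triples, root="my:ThingOfConcern"):
    """Return the sorted subjects typed with root or any of its subclasses."""
    classes = subclasses_of(triples, root)
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o in classes)


# Invented triples: two list classes are declared to be of concern,
# a third (lists:Indicator) is not, so ex:taxonC is left out.
triples = [
    ("lists:InvasiveSpecies", SUBCLASS, "my:ThingOfConcern"),
    ("lists:ThreatenedSpecies", SUBCLASS, "my:ThingOfConcern"),
    ("ex:taxonA", RDF_TYPE, "lists:InvasiveSpecies"),
    ("ex:taxonB", RDF_TYPE, "lists:ThreatenedSpecies"),
    ("ex:taxonC", RDF_TYPE, "lists:Indicator"),
]
concerning = things_of_concern(triples)
```
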
>>>><br>
>>>><br>
>>>> As an aside, I think we, as a community, should come up with a<br>
>>>> biodiversity benchmark suite of rdf data and corresponding sparql<br>
>>>> queries,<br>
>>>> that can be<br>
>>>> used to test the suitability and scalability of semantic web knowledge<br>
>>>> bases. I'll take this up in a future post (unless someone beats me to<br>
>>>> it).<br>
>>>><br>
>>>> Comments, questions, and better ideas are welcome.<br>
>>>><br>
>>>> Thanks -<br>
>>>> Joel.<br>
>>>><br>
>>>> _______________________________________________<br>
>>>> tdwg-content mailing list<br>
>>>> <a href="mailto:tdwg-content@lists.tdwg.org">tdwg-content@lists.tdwg.org</a><br>
>>>> <a href="http://lists.tdwg.org/mailman/listinfo/tdwg-content" target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-content</a><br>
>>>><br>
>>><br>
>>><br>
>>><br>
>>> --<br>
>>> ---------------------------------------------------------------<br>
>>> Pete DeVries<br>
>>> Department of Entomology<br>
>>> University of Wisconsin - Madison<br>
>>> 445 Russell Laboratories<br>
>>> 1630 Linden Drive<br>
>>> Madison, WI 53706<br>
>>> TaxonConcept Knowledge Base <<a href="http://www.taxonconcept.org/" target="_blank">http://www.taxonconcept.org/</a>> / GeoSpecies<br>
>>> Knowledge Base <<a href="http://lod.geospecies.org/" target="_blank">http://lod.geospecies.org/</a>><br>
>>> About the GeoSpecies Knowledge Base <<a href="http://about.geospecies.org/" target="_blank">http://about.geospecies.org/</a>><br>
>>> ------------------------------------------------------------<br>
>>><br>
><br>
><br>
><br>
><br>
><br>
<br>
<br>
<br>
--<br>
</div></div>Robert A. Morris<br>
Emeritus Professor of Computer Science<br>
UMASS-Boston<br>
100 Morrissey Blvd<br>
Boston, MA 02125-3390<br>
Associate, Harvard University Herbaria<br>
email: <a href="mailto:morris.bob@gmail.com">morris.bob@gmail.com</a><br>
web: <a href="http://bdei.cs.umb.edu/" target="_blank">http://bdei.cs.umb.edu/</a><br>
web: <a href="http://etaxonomy.org/mw/FilteredPush" target="_blank">http://etaxonomy.org/mw/FilteredPush</a><br>
<a href="http://www.cs.umb.edu/~ram" target="_blank">http://www.cs.umb.edu/~ram</a><br>
phone (+1) 857 222 7992 (mobile)<br>
</blockquote></div><br><br clear="all"><br>-- <br>---------------------------------------------------------------<br>Pete DeVries<br>Department of Entomology<br>University of Wisconsin - Madison<br>445 Russell Laboratories<br>
1630 Linden Drive<br>Madison, WI 53706<br><a href="http://www.taxonconcept.org/" target="_blank">TaxonConcept Knowledge Base</a> / <a href="http://lod.geospecies.org/" target="_blank">GeoSpecies Knowledge Base</a><br><a href="http://about.geospecies.org/" target="_blank">About the GeoSpecies Knowledge Base</a><br>
------------------------------------------------------------<br>
</div>