[tdwg-content] What I learned at the TechnoBioBlitz

Steve Kelling stk2 at cornell.edu
Mon Oct 11 20:09:46 CEST 2010


Hello,
I was not involved with TDWG this year for a variety of reasons, but
we have organized a large number of bird records (about 80 million)
using an extension of Darwin Core. This work may be relevant to
encoding bioblitz-type data.

We focused particularly on adding information about the observer,
collection event, protocol, and location. You can view the
additions at http://www.avianknowledge.net/content/about/bmde-variable-descriptions

The information that we have organized with this extension of Darwin
Core has been used successfully in numerous publications, and includes
sufficient information on effort, etc., to allow assumptions about
detectability and bias, and the incorporation of negative data.

Denis Lepage spearheaded the development of the Bird Monitoring Data
Exchange schema, but was assisted by numerous members of the bird
monitoring and informatics community.
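
On the point made in the thread quoted below, that a simple csv file is
a legitimate representation of Darwin Core, here is a minimal sketch in
Python. It is an illustration only, not the bioblitz data profile or the
BMDE schema. The column headers are Darwin Core term names mentioned in
the thread; the row values, the observer name, and the function names
(write_dwc_csv, read_dwc_csv, to_geo) are assumptions made up for the
example. It also sketches the datum point: geo:lat and geo:long are
emitted only when the record states WGS84.

```python
import csv
import io

# Column headers are Darwin Core term names mentioned in the thread.
DWC_COLUMNS = [
    "scientificName",
    "recordedBy",
    "basisOfRecord",
    "decimalLatitude",
    "decimalLongitude",
    "geodeticDatum",
]

# Hypothetical rows, invented for illustration only.
ROWS = [
    {
        "scientificName": "Larus argentatus",
        "recordedBy": "A. Observer",
        "basisOfRecord": "HumanObservation",
        "decimalLatitude": "41.5",
        "decimalLongitude": "-70.7",
        "geodeticDatum": "WGS84",
    },
    {
        "scientificName": "Larus argentatus",
        "recordedBy": "A. Observer",
        "basisOfRecord": "HumanObservation",
        "decimalLatitude": "41.5",
        "decimalLongitude": "-70.7",
        "geodeticDatum": "XYZ",  # placeholder datum name, as in the thread
    },
]


def write_dwc_csv(rows):
    """Serialize records as plain CSV whose header row is DwC term names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_dwc_csv(text):
    """Parse the CSV back into one dict per record, keyed by DwC term name."""
    return list(csv.DictReader(io.StringIO(text)))


def to_geo(record):
    """Return geo:lat/geo:long only if the record's datum is WGS84, else None."""
    datum = record.get("geodeticDatum", "").strip().upper().replace("-", "")
    if datum != "WGS84":
        return None
    return {
        "geo:lat": float(record["decimalLatitude"]),
        "geo:long": float(record["decimalLongitude"]),
    }


if __name__ == "__main__":
    text = write_dwc_csv(ROWS)
    print(text)
    for record in read_dwc_csv(text):
        print(to_geo(record))
```

Keeping geodeticDatum as its own column means a record in another datum
still fits the same table, and a consumer can decide whether the
coordinates are safe to map as-is.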

Steve Kelling

On Mon, Oct 11, 2010 at 2:00 PM, John Wieczorek <tuco at berkeley.edu> wrote:
>
>
> On Mon, Oct 11, 2010 at 6:38 AM, "Markus Döring (GBIF)" <mdoering at gbif.org>
> wrote:
>>
>> I see. From what I understand (please correct me if I'm wrong, John)
>> dwc:recordedBy is rather the primary observer in the field than the person
>> keying in the dwc metadata record about it.
>
> Yes. DwC doesn't have a term for the person creating the record - recordedBy
> is meant to be the collector(s) or observer(s) in the field who made the
> determination that a species occurred there.
>
>>
>> If that's the missing bit I would use dcterm:creator instead:
>> http://dublincore.org/documents/dcmi-terms/#elements-creator and we could
>> think about including this into the official record level terms in darwin
>> core.
>>
>> dc:source is another record-level term I personally include very often,
>> although it is not explicitly listed in the dwc guide. For species checklists I
>> often want to include a link back to the often richer source webpage.
>
> dcterms:source seems like a reasonable addition to the record-level terms
> borrowed from Dublin Core. There is no evidence that we ever attempted to
> include it, which means at least that we didn't have a reason to reject it.
>>
>> Markus
>>
>>
>>
>> On Oct 11, 2010, at 14:53, joel sachs wrote:
>>
>> > Markus,
>> >
>> > The issue is that the observer and the recorder are not always the
>> > same person. A student or a team member might report the observation to a
>> > teacher/team leader, who then creates the record. As you and Tim point out,
>> > the definition of recordedBy does allow us to use the term for both
>> > observers and record creators, but then we lose the distinction between the
>> > two.
>> >
>> > Joel.
>> >
>> >
>> >
>> >
>> > On Mon, 11 Oct 2010, "Markus Döring (GBIF)" wrote:
>> >
>> >> Joel,
>> >> thanks for this nice summary. Darwin Core definitely needs far more
>> >> examples to illustrate its use. We should probably provide various examples
>> >> for all the different use cases we already know dwc can be used for.
>> >>
>> >> Darwin Core has a recordedBy term. Did that not fit your obs:observedBy
>> >> property and if not what exactly is the difference?
>> >> See http://rs.tdwg.org/dwc/terms/index.htm#recordedBy
>> >>
>> >> Thanks,
>> >> Markus
>> >>
>> >>
>> >> On Oct 11, 2010, at 13:46, joel sachs wrote:
>> >>
>> >>> One of the goals of the recent bioblitz was to think about the suitability
>> >>> and appropriateness of TDWG standards for citizen science. Robert Stevenson
>> >>> has volunteered to take the lead on preparing a technobioblitz lessons
>> >>> learned document, and though the scope of this document is not yet
>> >>> determined, I think the audience will include bioblitz organizers,
>> >>> software developers, and TDWG as a whole. I hope no one is shy about
>> >>> sharing lessons they think they learned, or suggestions that they have. We
>> >>> can use the bioblitz google group for this discussion, and copy in
>> >>> tdwg-content when our discussion is standards-specific.
>> >>>
>> >>> Here are some of my immediate observations:
>> >>>
>> >>> 1. Darwin Core is almost exactly right for citizen science. However, there
>> >>> is a desperate need for examples and templates of its use. To illustrate
>> >>> this need: one of the developers spoke of the design choice between "a
>> >>> simple csv file and a Darwin Core record". But a simple csv file is a
>> >>> legitimate representation of Darwin Core! To be fair to the developer,
>> >>> such a sentence might not have struck me as absurd a year ago, before
>> >>> Remsen said "let's use DwC for the bioblitz".
>> >>>
>> >>> We provided a couple of example DwC records (text and rdf) in the bioblitz
>> >>> data profile [1]. I think the lessons learned document should include an
>> >>> on-line catalog of cut-and-pasteable examples covering a variety of use
>> >>> cases, together with a dead simple description of DwC, something like
>> >>> "Darwin Core is a collection of terms, together with definitions."
>> >>>
>> >>> Here are areas where we augmented or diverged from DwC in the bioblitz:
>> >>>
>> >>> i. We added obs:observedBy [2], since there is no equivalent property in
>> >>> DwC, and it's important in Citizen Science (though often not available).
>> >>>
>> >>> ii. We used geo:lat and geo:long [3] instead of DwC terms for latitude and
>> >>> longitude. The geo namespace is a well used and supported standard, and
>> >>> records with geo coordinates are automatically mapped by several
>> >>> applications. Since everyone was using GPS to retrieve their coordinates,
>> >>> we were able to assume WGS-84 as the datum.
>> >>>
>> >>> If someone had used another datum, say XYZ, we would have added columns to
>> >>> the Fusion table so that they could have expressed their coordinates in
>> >>> DwC, as, e.g.:
>> >>> DwC:decimalLatitude=41.5
>> >>> DwC:decimalLongitude=-70.7
>> >>> DwC:geodeticDatum=XYZ
>> >>>
>> >>> (I would argue that it should be kosher DwC to express the above as simply
>> >>> XYZ:lat and XYZ:long. DwC already incorporates terms from other
>> >>> namespaces, such as Dublin Core, so there is precedent for this.)
>> >>>
>> >>> 2. DwC:scientificName might be more user friendly than taxonomy:binomial
>> >>> and the other taxonomy machine tags EOL uses for Flickr images. If
>> >>> DwC:scientificName isn't self-explanatory enough, a user can look it up,
>> >>> and see that any scientific name is acceptable, at any taxonomic rank, or
>> >>> not having any rank. And once we have a scientific name, higher ranks can
>> >>> be inferred.
>> >>>
>> >>> 3. Catalogue of Life was an important part of the workflow, but we
>> >>> had some problems with it. Future bioblitzes might consider using
>> >>> something like a CoL fork, as recently described by Rod Page [4].
>> >>>
>> >>> 4. We didn't include "basisOfRecord" in the original data profile, and so
>> >>> it wasn't a column in the Fusion Table [5]. But when a transcriber felt it
>> >>> was necessary to include in order to capture data in a particular field
>> >>> sheet, she just added the column to the table. This flexibility of schema
>> >>> is important, and is in harmony with the semantic web.
>> >>>
>> >>> 5. There seemed to be enthusiasm for another field event at next year's
>> >>> TDWG. This could be an opportunity to gather other types of data (e.g.
>> >>> character data) and thereby i) expose meeting participants to another set
>> >>> of everyday problems from the world of biodiversity workflows, and ii) try
>> >>> other TDWG technology on for size, e.g. the observation exchange format,
>> >>> annotation framework, etc.
>> >>>
>> >>>
>> >>> Happy Thanksgiving to all in Canada -
>> >>> Joel.
>> >>> ----
>> >>>
>> >>>
>> >>> 1. http://groups.google.com/group/tdwg-bioblitz/web/tdwg-bioblitz-profile-v1-1
>> >>> 2. Slightly bastardizing our old observation ontology -
>> >>> http://spire.umbc.edu/ontologies/Observation.owl
>> >>> 3. http://www.w3.org/2003/01/geo/
>> >>> 4. http://iphylo.blogspot.com/2010/10/replicating-and-forking-data-in-2010.html
>> >>> 5. http://tables.googlelabs.com/DataSource?dsrcid=248798
>> >>>
>> >>> _______________________________________________
>> >>> tdwg-content mailing list
>> >>> tdwg-content at lists.tdwg.org
>> >>> http://lists.tdwg.org/mailman/listinfo/tdwg-content



-- 
Steve Kelling
Information Science
Cornell Lab of Ornithology
607-254-2478 (office)
607-342-1029 (cell)

