[tdwg-content] A proposal to improve Darwin Core for invasive species data

John Wieczorek tuco at berkeley.edu
Wed May 25 17:57:06 CEST 2016


Thanks Annie. I don't plan to give up, it's just that I don't feel I have
been doing it justice for a while now.

On Wed, May 25, 2016 at 12:52 PM, Simpson, Annie <asimpson at usgs.gov> wrote:

> Chiming in now, very briefly...
>
> John, you are the DwC champion who has kept it moving forward in a
> comprehensible and useful way. The TDWG community is very grateful for your
> work on this and I personally hope you don't give up the good fight.
>
> *Annie Simpson, biologist & information scientist*
> http://orcid.org/0000-0001-8338-5134
>
> *BISON project (http://bison.usgs.ornl.gov)*
> *Core Science Analytics, Synthesis, & Libraries Program*
>
>
>
>
>
> *U.S. Geological Survey, MS 302*
> *12201 Sunrise Valley Drive*
> *Reston, Virginia 20192*
> *asimpson at usgs.gov*
> *703.648.4281 desk*
>
> On Wed, May 25, 2016 at 7:26 AM, John Wieczorek <tuco at berkeley.edu> wrote:
>
>> One primary idea behind the BCO is indeed as the proving ground you
>> mention. The challenge is having consistent available resources to do that
>> work. With BCO it could be a particular challenge, I think, since it can
>> cover so much semantic space. I would love to be in a position to be a
>> proper BCO caretaker, but I have not even been able to do a good job with
>> Darwin Core as it is.
>>
>> On Wed, May 25, 2016 at 2:00 AM, Quentin Groom <
>> quentin.groom at plantentuinmeise.be> wrote:
>>
>>> Interesting! I had not come across Apple Core before. Isn't this
>>> proving-ground role something that the Biological Collections Ontology
>>> can also do?
>>> I like the idea of a Darwin Core with different levels of adherence to
>>> rules. I agree that strict enforcement of rules will inhibit the flow of
>>> data, but in my own experience there are simple fields that I could have
>>> easily completed in a standardized way, if only I could have found a
>>> suitable recommendation to follow. Hierarchical vocabularies are
>>> particularly useful here, because they have built-in flexibility.
>>> Regards
>>> Quentin
>>>
>>>
>>>
>>> Dr. Quentin Groom
>>> (Botany and Information Technology)
>>>
>>> Botanic Garden Meise
>>> Domein van Bouchout
>>> B-1860 Meise
>>> Belgium
>>>
>>> ORCID: 0000-0002-0596-5376 <http://orcid.org/0000-0002-0596-5376>
>>>
>>> Landline; +32 (0) 226 009 20 ext. 364
>>> FAX:      +32 (0) 226 009 45
>>>
>>> E-mail:     quentin.groom at plantentuinmeise.be
>>> Skype name: qgroom
>>> Website:    www.botanicgarden.be
>>>
>>>
>>> On 25 May 2016 at 03:17, John Wieczorek <tuco at berkeley.edu> wrote:
>>>
>>>> I think it is indeed worthwhile to have content standards to go with
>>>> the term definitions. Apple Core (now under renewed development at
>>>> https://github.com/tdwg/applecore) is a good example of this.
>>>>
>>>> On Tue, May 24, 2016 at 6:16 PM, Chuck Miller <Chuck.Miller at mobot.org>
>>>> wrote:
>>>>
>>>>> John,
>>>>>
>>>>> It’s interesting how long that text has been out there, and without
>>>>> much comment.  It seems to presume there is a binary situation: tightly
>>>>> controlled vocabulary that is exclusive or loosely controlled that is
>>>>> inclusive.   Maybe it’s time now to consider something additional in the
>>>>> middle.  We know a lot more about how the Darwin Core standard is being
>>>>> used, or at least have plenty of examples.  With the addition of use cases
>>>>> into the standards for terms, progress could be made on use-case-based
>>>>> standard vocabulary that could reduce the “garbage in/garbage out” problem
>>>>> that comes from being totally inclusive.
>>>>>
>>>>>
>>>>>
>>>>> TDWG standards in the 80s and 90s were a little more about controlled
>>>>> vocabulary and reducing garbage than they have been in the 00s and 10s.
>>>>> Maybe we should spend some time on that aspect of data exchange again and
>>>>> use cases could be a method.
>>>>>
>>>>>
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* tdwg-content [mailto:tdwg-content-bounces at lists.tdwg.org] *On
>>>>> Behalf Of *John Wieczorek
>>>>> *Sent:* Tuesday, May 24, 2016 2:48 PM
>>>>>
>>>>> *To:* Quentin Groom
>>>>> *Cc:* TDWG Content Mailing List
>>>>> *Subject:* Re: [tdwg-content] A proposal to improve Darwin Core for
>>>>> invasive species data
>>>>>
>>>>>
>>>>>
>>>>> I would say that the primary factor driving the philosophy for loose
>>>>> controlled vocabulary recommendations is a desire to promote the stability
>>>>> of Darwin Core term definitions, because changes can be disruptive. Section
>>>>> 1.4 on the Simple Darwin Core page (
>>>>> http://rs.tdwg.org/dwc/terms/simple/index.htm) gives further
>>>>> practical arguments for this stance. I have copied the relevant text here
>>>>> for convenience:
>>>>>
>>>>>
>>>>>
>>>>> "There is a difference between having data in a field and requiring
>>>>> that field to have a value from among a legal set of values. The Darwin
>>>>> Core is simple in that it has minimal restrictions on the contents of
>>>>> fields. The term comments give recommendations about the use of controlled
>>>>> vocabularies and how to structure content wherever appropriate. Data
>>>>> contributors are encouraged to follow these recommendations as well as
>>>>> possible. You might argue that having no restrictions will promote "dirty"
>>>>> data (data of low quality or dubious value). Consider the simple axiom
>>>>> "It's not what you have, but what you do with it that matters." If data
>>>>> restrictions were in place at the fundamental level, then a record having
>>>>> any non-compliant data in any of its fields could not be shared via the
>>>>> standard. Not only would there be a dearth of shared data in that case (or
>>>>> an unused standard), but also there would be no way to use the standard to
>>>>> build shared data cleaning tools to actually improve the situation, nor to
>>>>> use data services to look up alternative representations (language
>>>>> translations, for example) to serve a broader audience. The rest is up to
>>>>> how the records will be used - in other words, it is up to applications to
>>>>> enforce further restrictions if appropriate, and it is up to the
>>>>> stakeholders of those applications to decide what the restrictions will be
>>>>> for the purpose the application is trying to serve."
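The quoted passage leaves enforcement to applications rather than to the standard. A minimal sketch of that idea follows, assuming a hypothetical application that flags non-compliant establishmentMeans values without rejecting any record. The vocabulary set, the check_record helper and the sample records are illustrative assumptions, not part of Darwin Core.

```python
# Illustrative vocabulary chosen by a hypothetical application.
# Darwin Core itself only recommends, it does not mandate, such a list.
ESTABLISHMENT_MEANS_VOCAB = {
    "native", "introduced", "naturalised", "invasive", "managed", "uncertain",
}


def check_record(record, vocab=ESTABLISHMENT_MEANS_VOCAB):
    """Return a list of warnings for a record; never reject the record."""
    warnings = []
    value = record.get("establishmentMeans", "")
    # An empty or missing value is allowed; only unrecognised values are flagged.
    if value and value.strip().lower() not in vocab:
        warnings.append(f"establishmentMeans: unrecognised value {value!r}")
    return warnings


# Made-up sample records for illustration only.
records = [
    {"scientificName": "Fallopia japonica", "establishmentMeans": "invasive"},
    {"scientificName": "Quercus robur", "establishmentMeans": "Indigenous"},
    {"scientificName": "Acer campestre"},
]

# Every record is still shared; the application decides what to do with the flags.
report = [(r["scientificName"], check_record(r)) for r in records]

# A stricter application could keep only the records that raised no warnings.
strict_subset = [r for r, (_, w) in zip(records, report) if not w]
```

Because the check returns warnings instead of raising an error, the same records can feed both a permissive aggregator and a stricter analysis, which is the division of responsibility the passage describes.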
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, May 24, 2016 at 3:44 PM, Quentin Groom <
>>>>> quentin.groom at plantentuinmeise.be> wrote:
>>>>>
>>>>> Hi Paco,
>>>>>
>>>>> I'm glad to hear Plinian Core is active, I only recently discovered it
>>>>> and think it is a good initiative. The species data I've seen is in quite
>>>>> diverse and unstandardised formats. It would be nice to see some of the
>>>>> big providers using Plinian Core.
>>>>>
>>>>>
>>>>>
>>>>> I'm not so worried about imposing limitations on users, because as far
>>>>> as I can see Darwin Core only recommends vocabularies, it doesn't enforce
>>>>> them. Having said that, it would be useful to know what is meant by "*Recommended
>>>>> best practice is to use a controlled vocabulary*", because if Darwin
>>>>> Core doesn't impose a vocabulary and there is no field to specify which
>>>>> vocabulary you are using then it doesn't help interoperability much.
>>>>>
>>>>>
>>>>>
>>>>> I'm happy to also discuss off list. Invasiveness and impact are
>>>>> difficult to standardize and so far I've chosen other fields that I
>>>>> consider easier to gain consensus on.
>>>>>
>>>>>
>>>>>
>>>>> Regards
>>>>>
>>>>> Quentin
>>>>>
>>>>>
>>>>> On 24 May 2016 at 19:08, Francisco Pando <pando at gbif.es> wrote:
>>>>>
>>>>> Quentin et al.,
>>>>>
>>>>>
>>>>>
>>>>> Plinian Core is active and backed up by an international group that
>>>>> seeks expansion. A session is planned in TDWG 2016 about it within the
>>>>> Species Information Interest Group slot.
>>>>>
>>>>>
>>>>>
>>>>> "Invasiveness" is a section within the Plinian Core schema:
>>>>> https://github.com/PlinianCore/Documentation/wiki/InvasivenessClass
>>>>>
>>>>> It is largely based on the GISIN schema. This can be revisited, updated
>>>>> and harmonized with current initiatives, some mentioned in this thread.
>>>>> Quentin, we may do a bit of exchange off-list.
>>>>>
>>>>>
>>>>>
>>>>> Whereas shared vocabularies bring plenty of good things, I share
>>>>> Chuck’s concerns about imposing some unwanted limitations for some
>>>>> potential users of the schema.
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>>
>>>>>
>>>>> Paco
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Francisco Pando
>>>>>
>>>>>
>>>>>
>>>>> Investigador
>>>>>
>>>>> Real Jardín Botánico - CSIC
>>>>>
>>>>> Plaza de Murillo, 2
>>>>>
>>>>> 28014 Madrid, Spain
>>>>>
>>>>> Tel.+34 91 420 3017 x 172
>>>>>
>>>>>
>>>>>
>>>>> *From:* tdwg-content [mailto:tdwg-content-bounces at lists.tdwg.org] *On
>>>>> Behalf Of *Quentin Groom
>>>>> *Sent:* Monday, May 23, 2016 10:10 PM
>>>>> *To:* Chuck Miller
>>>>>
>>>>>
>>>>> *Cc:* TDWG Content Mailing List
>>>>> *Subject:* Re: [tdwg-content] A proposal to improve Darwin Core for
>>>>> invasive species data
>>>>>
>>>>>
>>>>>
>>>>> Hi Chuck,
>>>>>
>>>>> thanks for your point. The use cases I'm thinking of are conservation
>>>>> red-listing; horizon-scanning for potential new invasives; early warning of
>>>>> new aliens; impact assessment and invasion monitoring. We have recently been
>>>>> discussing the possibility of automating all of these processes so that they
>>>>> can be repeated regularly, or as soon as new data becomes available.
>>>>> Obviously, for this we need observations, but we also need check-lists to
>>>>> tell us what is considered native or alien, present or extinct.
>>>>>
>>>>> I know more about invasive species research than red-listing, but I am
>>>>> aware that the current rate of red-listing is so slow that most things will
>>>>> become extinct before they are assessed. Given that the IUCN criteria are
>>>>> so clear, it should be possible to automate the whole process using GBIF.
>>>>> The only limitation would then be mobilizing the observations.
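The combination of observations and check-lists described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the checklist structure, the classify helper, the status labels and the sample data are all hypothetical, and a real workflow would draw on published check-lists and occurrence data instead.

```python
# Hypothetical checklist: (scientificName, countryCode) -> status in that region.
checklist = {
    ("Fallopia japonica", "BE"): "alien",
    ("Quercus robur", "BE"): "native",
}

# Made-up observations using Darwin Core style field names.
observations = [
    {"scientificName": "Quercus robur", "countryCode": "BE",
     "eventDate": "2016-04-02"},
    {"scientificName": "Fallopia japonica", "countryCode": "BE",
     "eventDate": "2016-05-10"},
    {"scientificName": "Heracleum mantegazzianum", "countryCode": "BE",
     "eventDate": "2016-05-12"},
]


def classify(obs, checklist):
    """Look up the checklist status for an observation's taxon and region."""
    key = (obs["scientificName"], obs["countryCode"])
    return checklist.get(key, "not on checklist")


flags = {o["scientificName"]: classify(o, checklist) for o in observations}

# Taxa missing from the checklist are candidates for an early-warning review.
early_warning = [name for name, status in flags.items()
                 if status == "not on checklist"]
```

Rerunning such a lookup whenever new observations arrive is one way the repeated, automated assessment mentioned above could be approached.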
>>>>>
>>>>> Regards
>>>>>
>>>>> Quentin
>>>>>
>>>>>
>>>>> On 23 May 2016 at 21:10, Chuck Miller <Chuck.Miller at mobot.org> wrote:
>>>>>
>>>>> Quentin,
>>>>>
>>>>> I think in addition to defining the community that needs the new
>>>>> Origin term, you also need to define the use cases to which the proposed
>>>>> controlled vocabularies for establishmentMeans and occurrenceStatus apply.
>>>>> Darwin Core is used in multiple ways.  I think there may be use cases for
>>>>> these terms that don’t match the invasive species use cases. One controlled
>>>>> vocabulary may not work for all Darwin Core users.
>>>>>
>>>>>
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>>>
>>>>> *Chuck Miller | VP-IT & CIO | Missouri Botanical Garden*
>>>>>
>>>>> *4344 Shaw Boulevard | Saint Louis, MO 63110 | Phone 314-577-9419*
>>>>>
>>>>> *From:* tdwg-content [mailto:tdwg-content-bounces at lists.tdwg.org] *On
>>>>> Behalf Of *John Wieczorek
>>>>> *Sent:* Monday, May 23, 2016 1:36 PM
>>>>> *To:* Quentin Groom
>>>>> *Cc:* TDWG Content Mailing List
>>>>> *Subject:* Re: [tdwg-content] A proposal to improve Darwin Core for
>>>>> invasive species data
>>>>>
>>>>>
>>>>>
>>>>> Hi Quentin,
>>>>>
>>>>>
>>>>>
>>>>> Thank you for your effort in putting forth these well thought out
>>>>> proposals. At various times I have heard discussions on the inadequacy of
>>>>> establishmentMeans. Your work encapsulates the problem well.
>>>>>
>>>>> One of the things that helps when proposing to add a Darwin Core term
>>>>> is demonstrating that there is a community that needs it. Can you tell us
>>>>> who has a demonstrated need to share this information? Anyone out there who
>>>>> has this interest is also welcome to share that here to provide evidence of
>>>>> demand from more than one group, project or individual.
>>>>>
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>>
>>>>> On Mon, May 23, 2016 at 3:23 PM, Quentin Groom <
>>>>> quentin.groom at plantentuinmeise.be> wrote:
>>>>>
>>>>> I've been working on a proposal to improve Darwin Core for use with
>>>>> invasive species data.
>>>>>
>>>>>
>>>>>
>>>>> The proposal is detailed on GitHub at
>>>>> https://github.com/qgroom/ias-dwc-proposal/blob/master/proposal.md.
>>>>>
>>>>>
>>>>>
>>>>> The proposal is for a new term "origin" and suggested vocabularies for
>>>>> establishmentMeans and occurrenceStatus.
>>>>>
>>>>>
>>>>>
>>>>> I'd welcome your feedback on the proposal.
>>>>>
>>>>>
>>>>>
>>>>> From my perspective it provides some needed clarity on
>>>>> the establishmentMeans and occurrenceStatus fields, but also adds the
>>>>> origin that is needed for invasive species research and for conservation
>>>>> assessments.
>>>>>
>>>>>
>>>>>
>>>>> I'm not sure of the best way to discuss this, but if you have concrete
>>>>> proposals for changes you might raise them as issues on GitHub, as well as
>>>>> mentioning them here.
>>>>>
>>>>>
>>>>>
>>>>> Regards
>>>>>
>>>>> Quentin
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> tdwg-content mailing list
>>>>> tdwg-content at lists.tdwg.org
>>>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>>
>