[tdwg-content] A proposal to improve Darwin Core for invasive species data

Simpson, Annie asimpson at usgs.gov
Wed May 25 17:52:43 CEST 2016


Chiming in now, very briefly...

John, you are the DwC champion who has kept it moving forward in a
comprehensible and useful way. The TDWG community is very grateful for your
work on this and I personally hope you don't give up the good fight.

*Annie Simpson, biologist & information scientist*
http://orcid.org/0000-0001-8338-5134

*BISON project (http://bison.usgs.ornl.gov)*
*Core Science Analytics, Synthesis, & Libraries Program*





*U.S. Geological Survey, MS 302*
*12201 Sunrise Valley Drive*
*Reston, Virginia 20192*
*asimpson at usgs.gov*
*703.648.4281 desk*

On Wed, May 25, 2016 at 7:26 AM, John Wieczorek <tuco at berkeley.edu> wrote:

> One primary idea behind the BCO is indeed as the proving ground you
> mention. The challenge is having consistent available resources to do that
> work. With BCO it could be a particular challenge, I think, since it can
> cover so much semantic space. I would love to be in a position to be a
> proper BCO caretaker, but I have not even been able to do a good job with
> Darwin Core as it is.
>
> On Wed, May 25, 2016 at 2:00 AM, Quentin Groom <
> quentin.groom at plantentuinmeise.be> wrote:
>
>> Interesting! I had not come across Apple Core before. Isn't this
>> proving-ground role something that the Biological Collections Ontology
>> can also do?
>> I like the idea of a Darwin Core with different levels of adherence to
>> rules. I agree that strict enforcement of rules will inhibit the flow of
>> data, but in my own experience there are simple fields that I could have
>> easily completed in a standardized way, if only I could have found a
>> suitable recommendation to follow. Hierarchical vocabularies are
>> particularly useful here, because they have built in flexibility.
>> Regards
>> Quentin
>>
>>
>>
>> Dr. Quentin Groom
>> (Botany and Information Technology)
>>
>> Botanic Garden Meise
>> Domein van Bouchout
>> B-1860 Meise
>> Belgium
>>
>> ORCID: 0000-0002-0596-5376 <http://orcid.org/0000-0002-0596-5376>
>>
>> Landline; +32 (0) 226 009 20 ext. 364
>> FAX:      +32 (0) 226 009 45
>>
>> E-mail:     quentin.groom at plantentuinmeise.be
>> Skype name: qgroom
>> Website:    www.botanicgarden.be
>>
>>
>> On 25 May 2016 at 03:17, John Wieczorek <tuco at berkeley.edu> wrote:
>>
>>> I think it is indeed worthwhile to have content standards to go with the
>>> term definitions. Apple Core (now under renewed development at
>>> https://github.com/tdwg/applecore) is a good example of this.
>>>
>>> On Tue, May 24, 2016 at 6:16 PM, Chuck Miller <Chuck.Miller at mobot.org>
>>> wrote:
>>>
>>>> John,
>>>>
>>>> It’s interesting how long that text has been out there, and without
>>>> much comment.  It seems to presume there is a binary situation: tightly
>>>> controlled vocabulary that is exclusive or loosely controlled that is
>>>> inclusive.   Maybe it’s time now to consider something additional in the
>>>> middle.  We know a lot more about how the Darwin Core standard is being
>>>> used, or at least have plenty of examples.  With the addition of use cases
>>>> into the standards for terms, progress could be made on use-case-based
>>>> standard vocabulary that could reduce the “garbage in/garbage out” problem
>>>> that comes from being totally inclusive.
>>>>
>>>>
>>>>
>>>> TDWG standards in the 80s and 90s were a little more about controlled
>>>> vocabulary and reducing garbage than they have been in the 00s and 10s.
>>>> Maybe we should spend some time on that aspect of data exchange again and
>>>> use cases could be a method.
>>>>
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Chuck
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* tdwg-content [mailto:tdwg-content-bounces at lists.tdwg.org] *On
>>>> Behalf Of *John Wieczorek
>>>> *Sent:* Tuesday, May 24, 2016 2:48 PM
>>>>
>>>> *To:* Quentin Groom
>>>> *Cc:* TDWG Content Mailing List
>>>> *Subject:* Re: [tdwg-content] A proposal to improve Darwin Core for
>>>> invasive species data
>>>>
>>>>
>>>>
>>>> I would say that the primary factor driving the philosophy for loose
>>>> controlled vocabulary recommendations is a desire to promote the stability
>>>> of Darwin Core term definitions, because changes can be disruptive. Section
>>>> 1.4 on the Simple Darwin Core page (
>>>> http://rs.tdwg.org/dwc/terms/simple/index.htm) gives further practical
>>>> arguments for this stance. I have copied the relevant text here for
>>>> convenience:
>>>>
>>>>
>>>>
>>>> "There is a difference between having data in a field and requiring
>>>> that field to have a value from among a legal set of values. The Darwin
>>>> Core is simple in that it has minimal restrictions on the contents of
>>>> fields. The term comments give recommendations about the use of controlled
>>>> vocabularies and how to structure content wherever appropriate. Data
>>>> contributors are encouraged to follow these recommendations as well as
>>>> possible. You might argue that having no restrictions will promote "dirty"
>>>> data (data of low quality or dubious value). Consider the simple axiom
>>>> "It's not what you have, but what you do with it that matters." If data
>>>> restrictions were in place at the fundamental level, then a record having
>>>> any non-compliant data in any of its fields could not be shared via the
>>>> standard. Not only would there be a dearth of shared data in that case (or
>>>> an unused standard), but also there would be no way to use the standard to
>>>> build shared data cleaning tools to actually improve the situation, nor to
>>>> use data services to look up alternative representations (language
>>>> translations, for example) to serve a broader audience. The rest is up to
>>>> how the records will be used - in other words, it is up to applications to
>>>> enforce further restrictions if appropriate, and it is up to the
>>>> stakeholders of those applications to decide what the restrictions will be
>>>> for the purpose the application is trying to serve."
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, May 24, 2016 at 3:44 PM, Quentin Groom <
>>>> quentin.groom at plantentuinmeise.be> wrote:
>>>>
>>>> Hi Paco,
>>>>
>>>> I'm glad to hear Plinian Core is active, I only recently discovered it
>>>> and think it is a good initiative. The species data I've seen is in quite
>>>> diverse and unstandardised formats. It would be nice to see some of the
>>>> big providers using Plinian Core.
>>>>
>>>>
>>>>
>>>> I'm not so worried about imposing limitations on users, because as far
>>>> as I can see Darwin Core only recommends vocabularies, it doesn't enforce
>>>> them. Having said that, it would be useful to know what is meant by "*Recommended
>>>> best practice is to use a controlled vocabulary*", because if Darwin
>>>> Core doesn't impose a vocabulary and there is no field to specify which
>>>> vocabulary you are using then it doesn't help interoperability much.
>>>>
>>>>
>>>>
>>>> I'm happy to also discuss off list. Invasiveness and impact are
>>>> difficult to standardize and so far I've chosen other fields that I
>>>> consider easier to gain consensus on.
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>> Quentin
>>>>
>>>>
>>>> On 24 May 2016 at 19:08, Francisco Pando <pando at gbif.es> wrote:
>>>>
>>>> Quentin et al.,
>>>>
>>>>
>>>>
>>>> Plinian Core is active and backed up by an international group that
>>>> seeks expansion. A session is planned in TDWG 2016 about it within the
>>>> Species Information Interest Group slot.
>>>>
>>>>
>>>>
>>>> "Invasiveness" is a section within the Plinian Core schema:
>>>> https://github.com/PlinianCore/Documentation/wiki/InvasivenessClass
>>>>
>>>> It is largely based on the GISIN schema. This can be revisited, updated
>>>> and harmonized with current initiatives, some mentioned in this thread.
>>>> Quentin, we may do a bit of exchange off-list.
>>>>
>>>>
>>>>
>>>> Whereas shared vocabularies bring plenty of good things, I share
>>>> Chuck’s concerns about imposing some unwanted limitations for some
>>>> potential users of the schema.
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>>
>>>>
>>>> Paco
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Francisco Pando
>>>>
>>>>
>>>>
>>>> Investigador
>>>>
>>>> Real Jardín Botánico - CSIC
>>>>
>>>> Plaza de Murillo, 2
>>>>
>>>> 28014 Madrid, Spain
>>>>
>>>> Tel.+34 91 420 3017 x 172
>>>>
>>>>
>>>>
>>>> *From:* tdwg-content [mailto:tdwg-content-bounces at lists.tdwg.org] *On
>>>> Behalf Of *Quentin Groom
>>>> *Sent:* Monday, May 23, 2016 10:10 PM
>>>> *To:* Chuck Miller
>>>>
>>>>
>>>> *Cc:* TDWG Content Mailing List
>>>> *Subject:* Re: [tdwg-content] A proposal to improve Darwin Core for
>>>> invasive species data
>>>>
>>>>
>>>>
>>>> Hi Chuck,
>>>>
>>>> thanks for your point. The use cases I'm thinking of are conservation
>>>> red-listing; horizon-scanning for potential new invasives; early warning of
>>>> new aliens; impact assessment and invasion monitoring. We have recently been
>>>> discussing the possibility of automating all of these processes so that they
>>>> can be repeated regularly, or as soon as new data becomes available.
>>>> Obviously, for this we need observations, but we also need check-lists to
>>>> tell us what is considered native or alien, present or extinct.
>>>>
>>>> I know more about invasive species research than red-listing, but I am
>>>> aware that the current rate of red-listing is so slow that most things will
>>>> become extinct before they are assessed. Given that the IUCN criteria are
>>>> so clear, it should be possible to automate the whole process using GBIF.
>>>> The only limitation would then be mobilizing the observations.
>>>>
>>>> Regards
>>>>
>>>> Quentin
>>>>
>>>>
>>>> On 23 May 2016 at 21:10, Chuck Miller <Chuck.Miller at mobot.org> wrote:
>>>>
>>>> Quentin,
>>>>
>>>> I think in addition to defining the community that needs the new Origin
>>>> term, you also need to define the use cases to which the proposed
>>>> controlled vocabularies for establishmentMeans and occurrenceStatus apply.
>>>> Darwin Core is used in multiple ways.  I think there may be use cases for
>>>> these terms that don’t match the invasive species use cases. One controlled
>>>> vocabulary may not work for all Darwin Core users.
>>>>
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Chuck
>>>>
>>>>
>>>>
>>>> *Chuck Miller | VP-IT & CIO | Missouri Botanical Garden*
>>>>
>>>> *4344 Shaw Boulevard | Saint Louis, MO 63110 | Phone 314-577-9419*
>>>>
>>>> *From:* tdwg-content [mailto:tdwg-content-bounces at lists.tdwg.org] *On
>>>> Behalf Of *John Wieczorek
>>>> *Sent:* Monday, May 23, 2016 1:36 PM
>>>> *To:* Quentin Groom
>>>> *Cc:* TDWG Content Mailing List
>>>> *Subject:* Re: [tdwg-content] A proposal to improve Darwin Core for
>>>> invasive species data
>>>>
>>>>
>>>>
>>>> Hi Quentin,
>>>>
>>>>
>>>>
>>>> Thank you for your effort in putting forth these well-thought-out
>>>> proposals. At various times I have heard discussions on the inadequacy of
>>>> establishmentMeans. Your work encapsulates the problem well.
>>>>
>>>> One of the things that helps when proposing to add a Darwin Core term
>>>> is demonstrating that there is a community that needs it. Can you tell us
>>>> who has a demonstrated need to share this information? Anyone out there who
>>>> has this interest is also welcome to share that here to provide evidence of
>>>> demand from more than one group, project or individual.
>>>>
>>>>
>>>>
>>>> Cheers,
>>>>
>>>>
>>>>
>>>> John
>>>>
>>>>
>>>>
>>>> On Mon, May 23, 2016 at 3:23 PM, Quentin Groom <
>>>> quentin.groom at plantentuinmeise.be> wrote:
>>>>
>>>> I've been working on a proposal to improve Darwin Core for use with
>>>> invasive species data.
>>>>
>>>>
>>>>
>>>> The proposal is detailed on GitHub at
>>>> https://github.com/qgroom/ias-dwc-proposal/blob/master/proposal.md.
>>>>
>>>>
>>>>
>>>> The proposal is for a new term "origin" and suggested vocabularies for
>>>> establishmentMeans and occurrenceStatus.
>>>>
>>>>
>>>>
>>>> I'd welcome your feedback on the proposal.
>>>>
>>>>
>>>>
>>>> From my perspective it provides some needed clarity on
>>>> the establishmentMeans and occurrenceStatus fields, but also adds the
>>>> origin that is needed for invasive species research and for conservation
>>>> assessments.
>>>>
>>>>
>>>>
>>>> I'm not sure of the best way to discuss this, but if you have concrete
>>>> proposals for changes you might raise them as issues on GitHub, as well as
>>>> mentioning them here.
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>> Quentin
>>>>
>>>>
>>>> _______________________________________________
>>>> tdwg-content mailing list
>>>> tdwg-content at lists.tdwg.org
>>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>>>