[tdwg-tag] tdwg-tag Digest, Vol 67, Issue 4

Éamonn Ó Tuama [GBIF] eotuama at gbif.org
Fri Nov 2 09:35:23 CET 2012

As it happens, the TDWG exec has just given the go-ahead (at the Beijing
meeting) to form a vocabulary management task group under the TAG to look
into how TDWG can better manage the development, maintenance and governance
of vocabularies. A general call for participation will be issued in the next
few weeks. One obvious task is to see where the TDWG Ontology now stands in
relation to DwC and vocab best practices (e.g., why does the TDWG Ontology
re-invent rather than re-use an existing vocab such as FOAF) and whether
we should extract the best parts (e.g., Taxon Name, Taxon Concept,
Collection) for clean-up, promotion, etc. In the absence of a functioning
TDWG web site, we have put up a draft of the VoMaG charter on the GBIF
community site
http://community.gbif.org/pg/groups/21382/vocabulary-management/ . The doc
itself is here:
. Please get involved and help us to refine the charter and define tasks for
the group.


-----Original Message-----
From: tdwg-tag-bounces at lists.tdwg.org
[mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of
tdwg-tag-request at lists.tdwg.org
Sent: 02 November 2012 02:13
To: tdwg-tag at lists.tdwg.org
Subject: tdwg-tag Digest, Vol 67, Issue 4

Today's Topics:

   1. Re: Any TCS users with experiences to report? (Richard Pyle)


Message: 1
Date: Thu, 1 Nov 2012 15:13:20 -1000
From: Richard Pyle <deepreef at bishopmuseum.org>
Subject: Re: [tdwg-tag] Any TCS users with experiences to report?
To: "'Roderic Page'" <r.page at bio.gla.ac.uk>,	<Tony.Rees at csiro.au>
Cc: pmurray at anbg.gov.au, tdwg-tag at lists.tdwg.org, Simon.Pigot at csiro.au
Message-ID: <006c01cdb897$3f978700$bec69500$@bishopmuseum.org>
Content-Type: text/plain; charset="utf-8"

As the Convenor of the TDWG Taxon Names and Concept group, I have failed in
one of my core duties to address this issue.  My inability to attend TDWG
this year has only exacerbated this problem.

Having said that... I have had many discussions with many folks over the
past couple of years on this issue, and for various reasons the time is now
ripe to re-visit this age-old problem and make some decisions about how to
move forward.

For the ZooBank LSID resolver, we used Roger's vocabularies; and to some
extent, the DwC terms harmonize (but not completely).  A few years ago I
made a push to either revitalize TCS (e.g., through TCS 2.0), or to allow it
to retire (if it hasn't already done so de facto).

Having just emerged from nearly two very thick years of development on
ZooBank, GNA/GNUB, etc., I am now more energized (and liberated, in terms of
available time) to re-focus on how to move forward.  My hope is that we can
make some core decisions about how to move forward well before next year's
TDWG meeting.

I would very much welcome feedback from people on:

1)      Who is actively using TCS?  Does it work?  Can it be improved?
Should it be retired?

2)      Who is using Roger's vocabulary? Does it work?  Can it be improved?

3)      How much of DwC:Taxon is in active use?  Just the "traditional"
terms; or some of the new ones introduced with the ratified DwC? Does it
work?  Can it be improved?

4)      What other standards are being used in this space?

Now that we have launched the new ZooBank, we will turn our attention to
GNUB services that will start to put that content to work.  It is therefore
very much in our interest to support the sorts of data exchange mechanisms
that people most need and, ideally, collapse the various "flavors" into
something we can all rally around.


Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Associate Zoologist in Ichthyology
Dive Safety Officer
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org

Note: This disclaimer formally apologizes for the disclaimer below, over
which I have no control.

From: tdwg-tag-bounces at lists.tdwg.org
[mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of Roderic Page
Sent: Thursday, November 01, 2012 1:56 PM
To: <Tony.Rees at csiro.au>
Cc: pmurray at anbg.gov.au; <tdwg-tag at lists.tdwg.org>; Simon.Pigot at csiro.au
Subject: Re: [tdwg-tag] Any TCS users with experiences to report?

A TDWG standard not actually being used, surely not ;)

Leaving aside the wisdom of XML schema (yuck) and developing standards
independently of actual products, it does puzzle me that the work Roger Hyam
did on the LSID vocabularies is consistently overlooked. There is an RDF
version of TCS http://rs.tdwg.org/ontology/voc/TaxonConcept

This was used by CoL in their LSIDs, but because they usually broke I
suspect nobody used them.

We seem to be in a muddled state at present where there are competing
vocabularies in use for taxonomic names and concepts, and these two notions
are often not cleanly separated. Whereas nomenclators such as IPNI and
ZooBank use the LSID taxon name vocabulary, other databases use vocabularies
such as Darwin Core, which rather conflate names and concepts. It's not
clear to me how this situation arose, but it somewhat defeats the point of
having standards.



On 1 Nov 2012, at 22:41, <Tony.Rees at csiro.au> wrote:

Hi TDWG persons,

I am involved in an activity here to set a local standard for storing
taxonomic name, identifier and (probably) hierarchy information in metadata
records using our profile of ISO 19115 for the latter, and the question will
come up as to whether to use elements from TCS, DwC, EML, NCBII extension to
ISO 19115, or other. By default I would expect the front runner to be TCS
but it appears few if any major systems have ever gone that route - I have
looked at ITIS, COL, TROPICOS, WoRMS, IPNI, GBIF, AFD/APNI, more... the
nearest would perhaps be AFD/APNI (hence copying Paul on this email); however
their "ibis" schema, though apparently based originally on TCS,
http://biodiversity.org.au/xml/ibis-20120909.xsd , does not make any
explicit reference to the TCS schema so far as I can see. (Note also the
cited schema definition http://biodiversity.org.au/xml/ibis [or presumably
http://biodiversity.org.au/xml/ibis.xsd] does not seem to exist, but maybe I
am missing something).

I am in the interesting position of also wishing to make apps which both
publish and consume taxonomic name information so *could* implement TCS for
these, but if no-one else is doing so maybe that is not a path to future
data harmonisation, and something like DwC might be better.

It does seem odd that we have a standard endorsed in 2005 by TDWG which is
apparently unused by any current major players in the real world. Any

Regards - Tony

Tony Rees
Manager, Divisional Data Centre,
CSIRO Marine and Atmospheric Research,
GPO Box 1538,
Hobart, Tasmania 7001, Australia
Ph: 0362 325318 (Int: +61 362 325318)
Fax: 0362 325000 (Int: +61 362 325000)
e-mail: Tony.Rees at csiro.au
Manager, OBIS Australia regional node, http://www.obis.org.au/
Biodiversity informatics research activities:
Personal info:
LinkedIn profile: http://www.linkedin.com/pub/tony-rees/18/770/36

From: tdwg-tag-bounces at lists.tdwg.org
[mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of Paul Murray
Sent: Wednesday, 7 March 2012 12:52 PM
To: Steve Baskauf
Cc: "Éamonn Ó Tuama (GBIF)"; TDWG TAG
Subject: Re: [tdwg-tag] Creating a TDWG standard for documenting Data

On 07/03/2012, at 3:11 AM, Steve Baskauf wrote:

Dag and Éamonn,

In the context of the discussion which has been going on in the TDWG RDF
mailing list, I have been thinking more about the issue of how to deal with
DwC terms which state "Recommended best practice is to use a controlled
vocabulary...".  That would be dcterms:type, dwc:language,
dwc:basisOfRecord, dwc:sex, dwc:lifeStage, dwc:reproductiveCondition,
dwc:behavior, dwc:establishmentMeans, dwc:occurrenceStatus, dwc:disposition,
dwc:continent, dwc:waterBody, dwc:islandGroup, dwc:island, dwc:country,
dwc:verbatimCoordinateSystem, dwc:georeferenceVerificationStatus,
dwc:identificationVerificationStatus, dwc:taxonRank, dwc:nomenclaturalCode,
dwc:taxonomicStatus, dwc:relationshipOfResource, and dwc:measurementType .

We here have had all sorts of problems using other people's vocabularies -
they never quite match the data we have. Our solution has been to use the
standard terms where possible, but to mint our own where needed. We create
RDF objects and declare them as being of the correct type.

For instance,

Is declared to be a subclass of

And we have a few specific items of that type:

These individuals are therefore correctly typed to be legitimately used
as a TDWG relationshipCategory.

Your list of dwc:disposition values does not need to be exhaustive. It's
legitimate (from a machine point of view) for a site to create their own
terms. However, this does mean that the world becomes fragmented into a
number of site-specific vocabularies that cannot be machine-reasoned over.
The underlying reason for this is that that is in fact the way the world
actually is at the moment, and there's not a lot of help for it.


There are two or three approaches to using a standard vocabulary when your
own data does not quite match it.

You can use the standard term that is *closest in meaning* to your own term.
The difficulty here is that if the meaning of the standard term implies
things that are not true of your data, using it means that you are
asserting things that are in fact not true, and for that reason I suggest
that it's not the way to go.

You can use the standard term whose definition encompasses your term. The
difficulty here is that some vocabularies (notably Taxon Concept Schema)
don't have "other" or "unspecified" values for their enumerations - they are
not exhaustive.

In either of these cases, you will want to supplement the standard term with
another value specific to your own data set, whose definition you make
available. There are a few ways to do that.

You can use the "define your own term" mechanism and assert both
  _:_ tdwg:has_relationship_type tdwg:is-subtaxon-of  .
  _:_ tdwg:has_relationship_type my-voc:is-recently-declared-subtaxon-of  .

You can have a completely separate predicate:
  _:_ tdwg:has_relationship_type tdwg:is-subtaxon-of  .
  _:_ my-voc:has_relationship_type my-voc:is-recently-declared-subtaxon-of  .
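
The difference between these two options can be sketched in a few lines of
Python, modelling triples as plain tuples. This is only an illustration: the
subject identifier "rel1" and the helper names are hypothetical, and the
prefixed names are copied from the pseudo-triples above rather than resolved
against any real vocabulary.

```python
# Hypothetical names, mirroring the pseudo-triples in the message.
TDWG_PRED = "tdwg:has_relationship_type"
LOCAL_PRED = "my-voc:has_relationship_type"
STANDARD_TERM = "tdwg:is-subtaxon-of"
LOCAL_TERM = "my-voc:is-recently-declared-subtaxon-of"


def same_predicate(subject):
    """Option 1: assert both terms using the standard predicate."""
    return {
        (subject, TDWG_PRED, STANDARD_TERM),
        (subject, TDWG_PRED, LOCAL_TERM),
    }


def separate_predicate(subject):
    """Option 2: keep the local term under a site-specific predicate."""
    return {
        (subject, TDWG_PRED, STANDARD_TERM),
        (subject, LOCAL_PRED, LOCAL_TERM),
    }


def values(triples, subject, predicate):
    """Return the sorted objects asserted for a subject/predicate pair."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


if __name__ == "__main__":
    print(values(same_predicate("rel1"), "rel1", TDWG_PRED))
    print(values(separate_predicate("rel1"), "rel1", TDWG_PRED))
```

With the shared predicate, a consumer querying the TDWG predicate sees two
values. With the separate predicate it sees only the standard one, and has to
know about the local predicate to find the extra term.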

You can also be terribly clever and declare your own predicate to be a
super-property of the TDWG predicate, one whose range is a union. This isn't
terribly useful to people using your data unless the tdwg triple is also

Another alternative is to create an OWL rule that says "if a thing has
relationship-type my-voc:is-recently-declared-subtaxon-of, then it also has
relationship-type tdwg:is-subtaxon-of"

But this creates a performance hit.
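
What such a rule would derive can be approximated, under heavy
simplification, by one forward-chaining pass over a set of triples. The
sketch below is plain Python, not OWL, and does not claim to show how any
particular reasoner works; the function name and the tuple representation
are assumptions made for illustration.

```python
def apply_rule(triples, predicate, specific, general):
    """Add (s, predicate, general) wherever (s, predicate, specific) holds.

    The input set is left untouched; a new set containing the original
    triples plus the inferred ones is returned.
    """
    inferred = {
        (s, predicate, general)
        for s, p, o in triples
        if p == predicate and o == specific
    }
    return set(triples) | inferred


if __name__ == "__main__":
    # "rel1" is a hypothetical subject identifier.
    data = {
        ("rel1", "tdwg:has_relationship_type",
         "my-voc:is-recently-declared-subtaxon-of"),
    }
    for triple in sorted(apply_rule(
            data,
            "tdwg:has_relationship_type",
            "my-voc:is-recently-declared-subtaxon-of",
            "tdwg:is-subtaxon-of")):
        print(triple)
```

Even in this toy form every triple has to be inspected before the inferred
ones are available, which hints at where the cost comes from.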


That little discussion aside, my main concern is that you don't get mired in
attempting to exhaustively list all the different island types (etc) as part
of the vocabulary that you are creating. It's a never-ending job. It might
be an idea to have the design guideline that no enumeration class defined by
the vocabulary shall have more than 10 values. It's arbitrary, but it will
keep people from being carried away subdividing types into a hierarchy that
they think is a good idea, but which doesn't match the data people already
have.

I'd also suggest that every enumeration (i.e., list of individuals)
include two special values:

NOT_SPECIFIED. This value is not present in the source, underlying data. It
isn't in the database, the respondent didn't fill out the form fully.
Perhaps "NULL" might be a better name - assuming people at this level know
what it means.
OTHER. This means the value is some specific value, but it's not covered in
the TDWG list. I am not sure if this value should be explicitly used if you
are publishing your own vocabulary and using terms from that. I'm inclined
to say it should not be, because doing that would result in two values for
predicates that naturally should be functional.
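
As a rough sketch of how the two special values and the ten-value guideline
might look in practice, here is a small Python enumeration. The class name
Sex and its FEMALE/MALE members are hypothetical placeholders, not a proposed
TDWG vocabulary, and the mapping rules in classify are assumptions.

```python
from enum import Enum

MAX_VALUES = 10  # the arbitrary limit suggested in the message


class Sex(Enum):
    """Hypothetical enumeration; the members are illustrative only."""
    FEMALE = "female"
    MALE = "male"
    NOT_SPECIFIED = "not_specified"  # value absent from the source data
    OTHER = "other"                  # a real value not covered by the list


def classify(raw, enumeration=Sex):
    """Map a raw source value onto the enumeration.

    Missing or blank input becomes NOT_SPECIFIED; anything that is present
    but not in the list becomes OTHER.
    """
    if raw is None or not raw.strip():
        return enumeration.NOT_SPECIFIED
    try:
        return enumeration(raw.strip().lower())
    except ValueError:
        return enumeration.OTHER


def within_limit(enumeration, limit=MAX_VALUES):
    """Check the 'no more than 10 values' design guideline."""
    return len(enumeration) <= limit


if __name__ == "__main__":
    for raw in (None, "  ", "Male", "hermaphrodite"):
        print(repr(raw), "->", classify(raw).name)
    print("within limit:", within_limit(Sex))
```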

These special values *can* be done as a single instance, which means you
could easily pull all "not specifieds" out of a dataset, but that means that
either the ranges would have to be declared as a union, which is messy, or
the individuals would have to be declared as having all possible types,
which would break disjoint class declarations.

This message is only intended for the addressee named above. Its contents
may be privileged or otherwise protected. Any unauthorized use, disclosure
or copying of this message or its contents is prohibited. If you have
received this message by mistake, please notify us immediately by reply mail
or by collect telephone call. Any personal opinions expressed in this
message do not necessarily represent the views of the Bishop Museum.