[tdwg-tag] Embedding specimen (and other) annotations in NeXML
sblum at calacademy.org
Mon Feb 23 23:29:15 CET 2009
The TDWG 'house' is organic and messy, if not in disarray. I'll explain this
state with a descriptive history instead of a logical exposition.
* We changed the way TDWG approves standards (in 2006). ABCD, TCS, and SDD
were the last standards approved by the old method (a vote by members). New
standards simply have to pass expert and public review (as judged by the
executive committee). ABCD, TCS, and SDD aren't consistent because...(blah,
blah; life is hard).
* The TDWG ontology or vocabulary is NOT a standard. The terms are supposed
to be taken from standards, and perhaps even non-standard schemas that are
widely used. The ontology managers are supposed to apply the filter of use;
terms in the ontology should be widely used. (We expect that some terms in
approved standards will not be widely used.)
* With the ontology/vocabulary comes a shift or at least a broadening towards
RDF (away from XML schemas), though there were/are some skeptics.
The existing pages on the TDWG vocabulary are pretty old, and I don't think
there has been any kind of review "event" for these. Editing and reviewing
these is on our to-do list for this year.
I think IT WOULD BE VERY HELPFUL if we put up a list of REQUIRED READING
before getting into this review, and perhaps even the current review of
DarwinCore. John Wieczorek and his core collaborators have drawn heavily
from the Dublin Core Metadata Initiative (DCMI) in shaping the most recent
draft of the DarwinCore.
I would really appreciate references to any other good statements about RDF
and ontologies. These should go on the TAG homepage.
Matt, thanks for pointing out the emperor's clothes!
From: mbjones.89 at gmail.com [mailto:mbjones.89 at gmail.com] On Behalf Of Matt
Sent: Monday, February 23, 2009 12:51 PM
To: Blum, Stan
Cc: Kevin Richards; Hilmar Lapp; Technical Architecture Group mailing list;
rogerhyam Hyam; Mark Schildhauer
Subject: Re: [tdwg-tag] Embedding specimen (and other) annotations in NeXML
This thread has prompted me to ask some naive questions about the
process under which the vocabularies are formed. Maybe I'm the only
one who is confused about the vocabularies, their status, and the
process of forming new terms, but it seems I may not be alone. And
clarification on some of these points will help me with our direction
on the development of the Observation Ontology under the OSR group,
which I think will fit right in with Stan's point about fitting some
of the concepts into a broader Observation framework.
For me there is a lot of confusion over the TDWG vocabularies, partly
because they capture concepts that are present in existing TDWG
standards, but are generally incomplete. For example, the TCS
standard provides the field 'Specimens/Specimen', which I think is
relevant to Hilmar's question. However, the listed TDWG vocabulary
for TCS is the TaxonConcept vocabulary
(http://rs.tdwg.org/ontology/voc/TaxonConcept), which does not provide
a class for the TCS 'Specimen' concept. In addition, the ABCD TDWG
standard also seems to have a way for specimens to be represented,
but they are generalized as 'Unit's with a 'RecordBasis' of
'PreservedSpecimen'. So there are at least two official TDWG
'standards' for representing Specimen information, in addition to
whatever DwC does. It seems to me that the best thing to do would be
to finish the LSID vocabularies for TCS and ABCD so that they
completely represent the concepts in TCS and ABCD, then get them
approved as a valid way to represent these TDWG standards. In the
process, one could try to resolve the differences in modeling
approaches employed by the different standards, such as mapping the
Specimen concept in TCS to its corresponding concept in DwC and ABCD.
This would help avoid multiple TDWG standards defining overlapping
versions of these concepts, and let people use the vocabularies in
place of the XML schema versions of these standards.
What is the process for approval of the LSID vocabularies? They seem
to be bypassing the normal TDWG standards track. Some of the
vocabularies have a status of 'Available' (like TaxonConcept, even
though it is incomplete), while others are marked as 'Developmental'.
The page on OntologyGovernance states:
"Relationship Between TDWG Standards and the Ontology --
Concepts are standardized by being included in TDWG Standards. Once
they have been mentioned in a standard the Ontology Manager has the
responsibility of maintaining their URIs and descriptions as per the
standard. Concepts must be promoted to the live branch before the
standard enters the standards process."
So it seems that the OntologyManager replaces the standards process
for the purpose of the vocabularies. Is this correct? And does the
OntologyManager make sure that concepts like 'Specimen' that are
defined in TCS make it into the corresponding LSID vocabulary before
it is classified as 'Available'? And how does the OntologyManager
decide which concept and representation for 'Specimen' to use -- the
one from TCS or the one from ABCD? Does 'Available' have the same
weight as a published TDWG standard, and if so, shouldn't these
vocabularies be listed on the Standards page as well? Finally, does
the existence of a concept such as 'Specimen' in TCS have any bearing
on the development of new standards such as DwC that may want to
define the concept differently, or more completely?