[tdwg-tag] Any TCS users with experiences to report?

Steve Baskauf steve.baskauf at vanderbilt.edu
Fri Nov 2 02:10:41 CET 2012


Rod,
Not entirely overlooked:

1. The Australian Plant Name Index (APNI) uses some components of the 
TDWG Taxon Concept LSID vocabulary for rdf+xml representations.  See 
http://www.biodiversity.org.au/confluence/display/bdv/NSL%2bServices for 
information and http://biodiversity.org.au/apni.taxon/118883.rdf for an 
example.  Paul Murray may want to comment on his experience with using it.

2. It is referred to in the Beginner's Guide to RDF 
(http://code.google.com/p/tdwg-rdf/wiki/Beginners7OWL#7.6.3._Properties_in_OWL) 
and the RDF Task Group's class inventory 
(http://code.google.com/p/tdwg-rdf/wiki/ClassInventory#1.4.__Classes_in_the_TDWG_Ontologies).  


The TDWG Ontology as a whole has a very "unfinished" feel to it, which 
may contribute to why people haven't jumped to embrace it.  However, 
because the TaxonConcept Ontology component of the TDWG Ontology is 
based on TCS, it appears to me to be pretty "finished" and quite 
usable.  I agree with you that the Darwin Core Taxon class rather 
confusingly mixes aspects of taxon names and taxon concepts.  That's why 
when Cam and I put together Darwin-SW (which is primarily based on 
Darwin Core) we opted to incorporate the TaxonConcept Ontology rather 
than to try to figure out how to use the DwC Taxon class terms (see 
http://code.google.com/p/darwin-sw/wiki/ClassTaxon).  For a live 
example, in http://bioimages.vanderbilt.edu/ind-baskauf/37770.rdf a 
dwc:Identification instance is related to a tc:TaxonConcept through a 
dsw:toTaxon property.  I would have preferred to link to a stable, 
well-known HTTP URI for the taxon concept, but since I don't think they 
exist for most of my non-Australian taxa, I was forced to mint some, 
e.g.  http://bioimages.vanderbilt.edu/taxon/37297-fna1993.rdf (use view 
page source to see underlying RDF).  I attempted to use the TDWG 
TaxonConcept ontology as best I could understand it as a non-taxonomist.
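
As a rough illustration of the linking pattern described above, the following Python sketch models the resources as plain (subject, predicate, object) tuples rather than with an RDF library. Only the dwc:Identification, tc:TaxonConcept and dsw:toTaxon terms and the two document URLs come from the text; the resource identifiers (the "#id" fragment and the taxon URI without ".rdf") are assumptions made for illustration and may not match what the served RDF actually contains.

```python
RDF_TYPE = "rdf:type"

# Hypothetical identifiers: the real RDF may name these resources differently.
IDENTIFICATION = "http://bioimages.vanderbilt.edu/ind-baskauf/37770#id"
TAXON_CONCEPT = "http://bioimages.vanderbilt.edu/taxon/37297-fna1993"

# A tiny in-memory graph: a set of (subject, predicate, object) tuples.
triples = {
    (IDENTIFICATION, RDF_TYPE, "dwc:Identification"),
    (IDENTIFICATION, "dsw:toTaxon", TAXON_CONCEPT),
    (TAXON_CONCEPT, RDF_TYPE, "tc:TaxonConcept"),
}


def objects(graph, subject, predicate):
    """Return, sorted, every object of the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def taxon_concepts_for(graph, identification):
    """Follow dsw:toTaxon and keep only targets typed as tc:TaxonConcept."""
    return [
        target
        for target in objects(graph, identification, "dsw:toTaxon")
        if (target, RDF_TYPE, "tc:TaxonConcept") in graph
    ]
```

Calling taxon_concepts_for(triples, IDENTIFICATION) walks from the identification to the minted taxon concept, which is the same hop a consumer of the RDF would make.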

Hmmm.  I see that Greg has mentioned APNI already in his email before I 
had time to finish this.

Steve

Roderic Page wrote:
> A TDWG standard not actually being used, surely not ;)
>
> Leaving aside the wisdom of XML schema (yuck) and developing standards 
> independently of actual products, it does puzzle me that the work 
> Roger Hyam did on the LSID vocabularies is consistently overlooked. 
> There is an RDF version of TCS 
> http://rs.tdwg.org/ontology/voc/TaxonConcept 
>
> This was used by CoL in their LSIDs, but because they usually broke I 
> suspect nobody used them.
>
> We seem to be in a muddled state at present where there are competing 
> vocabularies in use for taxonomic names and concepts, and these two 
> notions are often not cleanly separated. Whereas nomenclators such as 
> IPNI and Zoobank use the LSID taxon name vocabulary, other databases 
> use vocabularies such as Darwin Core, which rather conflate names and 
> concepts. It's not clear to me how this situation arose, but it 
> somewhat defeats the point of having standards.
>
> Regards,
>
> Rod
>
> On 1 Nov 2012, at 22:41, <Tony.Rees at csiro.au 
> <mailto:Tony.Rees at csiro.au>> wrote:
>
>> Hi TDWG persons,
>>
>>  
>>
>> I am involved in an activity here to set a local standard for storing 
>> taxonomic name, identifier and (probably) hierarchy information in 
>> metadata records using our profile of ISO 19115 for the latter, and 
>> the question will come up as to whether to use elements from TCS, 
>> DwC, EML, NCBII extension to ISO 19115, or other. By default I would 
>> expect the front runner to be TCS but it appears few if any major 
>> systems have ever gone that route – I have looked at ITIS, COL, 
>> TROPICOS, WoRMS, IPNI, GBIF, AFD/APNI, more… the nearest would 
>> perhaps be AFD/APNI (hence copying Paul on this email) however their 
>> “ibis” schema, though apparently based originally on TCS, 
>> http://biodiversity.org.au/xml/ibis-20120909.xsd , does not make any 
>> explicit reference to the TCS schema so far as I can see. (Note also 
>> the cited schema definition http://biodiversity.org.au/xml/ibis [or 
>> presumably http://biodiversity.org.au/xml/ibis.xsd] does not seem to 
>> exist, but maybe I am missing something).
>>
>>  
>>
>> I am in the interesting position of also wishing to make apps which 
>> both publish and consume taxonomic name information so **could** 
>> implement TCS for these, but if no-one else is doing so maybe that is 
>> not a path to future data harmonisation, and something like DwC might 
>> be better.
>>
>>  
>>
>> It does seem odd that we have a standard endorsed in 2005 by TDWG 
>> which is apparently unused by any current major players in the real 
>> world. Any thoughts?
>>
>>  
>>
>> Regards - Tony
>>
>>  
>>
>> Tony Rees
>> Manager, Divisional Data Centre,
>> CSIRO Marine and Atmospheric Research,
>> GPO Box 1538,
>> Hobart, Tasmania 7001, Australia
>> Ph: 0362 325318 (Int: +61 362 325318)
>> Fax: 0362 325000 (Int: +61 362 325000)
>>
>> e-mail: Tony.Rees at csiro.au <mailto:Tony.Rees at csiro.au>
>> Manager, OBIS Australia regional node, http://www.obis.org.au/
>> Biodiversity informatics research activities: 
>> http://www.cmar.csiro.au/datacentre/biodiversity.htm
>> Personal 
>> info: http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566
>>
>> LinkedIn profile: http://www.linkedin.com/pub/tony-rees/18/770/36
>>
>>  
>>
>> *From:* tdwg-tag-bounces at lists.tdwg.org 
>> <mailto:tdwg-tag-bounces at lists.tdwg.org> 
>> [mailto:tdwg-tag-bounces at lists.tdwg.org] *On Behalf Of *Paul Murray
>> *Sent:* Wednesday, 7 March 2012 12:52 PM
>> *To:* Steve Baskauf
>> *Cc:* "Éamonn Ó Tuama (GBIF)"; TDWG TAG
>> *Subject:* Re: [tdwg-tag] Creating a TDWG standard for documenting 
>> Data Standards [SEC=UNCLASSIFIED]
>>
>>  
>>
>>  
>>
>> On 07/03/2012, at 3:11 AM, Steve Baskauf wrote:
>>
>>
>>
>> Dag and Éamonn,
>>
>> In the context of the discussion which has been going on in the TDWG 
>> RDF mailing list, I have been thinking more about the issue of how to 
>> deal with DwC terms which state "Recommended best practice is to use 
>> a controlled vocabulary...".  That would be dcterms:type, 
>> dwc:language, dwc:basisOfRecord, dwc:sex, dwc:lifeStage, 
>> dwc:reproductiveCondition, dwc:behavior, dwc:establishmentMeans, 
>> dwc:occurrenceStatus, dwc:disposition, dwc:continent, dwc:waterBody, 
>> dwc:islandGroup, dwc:island, dwc:country, 
>> dwc:verbatimCoordinateSystem, dwc:georeferenceVerificationStatus, 
>> dwc:identificationVerificationStatus, dwc:taxonRank, 
>> dwc:nomenclaturalCode, dwc:taxonomicStatus, 
>> dwc:relationshipOfResource, and dwc:measurementType .
>>
>>  
>>
>>  
>>
>> We here have had all sorts of problems using other people's 
>> vocabularies - they never quite match the data we have. Our solution 
>> has been to use the standard terms where possible, but to mint our 
>> own where needed. We create RDF objects and declare them as being 
>> of the correct type.
>>
>>  
>>
>> For instance, 
>>
>>           http://biodiversity.org.au/voc/afd/AFD#RelationshipTypeTerm
>>
>>  
>>
>> Is declared to be a subclass of
>>
>>           http://rs.tdwg.org/ontology/voc/TaxonConcept#TaxonRelationshipTerm
>>
>>  
>>
>> And we have a few specific items of that type:
>>
>>   
>>   http://biodiversity.org.au/voc/afd/RelationshipTypeTerm#has-emendation
>>
>>   
>>   http://biodiversity.org.au/voc/afd/RelationshipTypeTerm#has-invalid-name
>>
>>   
>>   http://biodiversity.org.au/voc/afd/RelationshipTypeTerm#has-junior-homonym
>>
>>   
>>   http://biodiversity.org.au/voc/afd/RelationshipTypeTerm#has-miscellaneous-literature-name
>>
>>  
>>
>> These individuals are therefore correctly typed and can legitimately 
>> be used as a TDWG relationshipCategory. 
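
The typing argument in the quoted message above can be sketched in a few lines of Python. This is a minimal illustration using plain tuples, not the AFD implementation: the three URIs are taken from the message, while the explicit rdf:type triple for has-emendation and the helper names are assumptions. The sketch only shows why an individual typed with the local class also counts as an instance of the TDWG class once the subclass declaration is in place.

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

AFD_TERM_CLASS = "http://biodiversity.org.au/voc/afd/AFD#RelationshipTypeTerm"
TDWG_TERM_CLASS = (
    "http://rs.tdwg.org/ontology/voc/TaxonConcept#TaxonRelationshipTerm"
)
HAS_EMENDATION = (
    "http://biodiversity.org.au/voc/afd/RelationshipTypeTerm#has-emendation"
)

graph = {
    # The local class is declared a subclass of the TDWG class.
    (AFD_TERM_CLASS, SUBCLASS_OF, TDWG_TERM_CLASS),
    # Assumed: the individual is typed with the local class.
    (HAS_EMENDATION, TYPE, AFD_TERM_CLASS),
}


def superclasses(graph, cls):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(
            o for s, p, o in graph if s == current and p == SUBCLASS_OF
        )
    return seen


def is_instance_of(graph, individual, cls):
    """True if the individual is typed as cls directly or via a subclass."""
    return any(
        cls in superclasses(graph, o)
        for s, p, o in graph
        if s == individual and p == TYPE
    )
```

With this graph, is_instance_of(graph, HAS_EMENDATION, TDWG_TERM_CLASS) holds even though no triple states it directly.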
>>
>>  
>>
>> Your list of dwc:disposition values does not need to be exhaustive. 
>> It's legitimate (from a machine point of view) for a site to create 
>> their own terms. However, this does mean that the world becomes 
>> fragmented into a number of site-specific vocabularies that cannot be 
>> machine-reasoned over. The underlying reason for this is that that is 
>> in fact the way the world actually is at the moment, and there's not 
>> a lot of help for it.
>>
>>  
>>
>> -------------------------------------------------------------
>>
>>  
>>
>> There are two or three approaches to using a standard vocabulary when 
>> your own data does not quite match it.
>>
>>  
>>
>> You can use the standard term that is *closest in meaning* to your 
>> own term. The difficulty here is that if the meaning of the standard 
>> term implies things that are not true of your data, using it  means 
>> that you are asserting things that are in fact not true, and for that 
>> reason I suggest that it's not the way to go.
>>
>>  
>>
>> You can use the standard term whose definition encompasses your term. 
>> The difficulty here is that some vocabularies (notably Taxon Concept 
>> Schema) don't have "other" or "unspecified" values for their 
>> enumerations - they are not exhaustive.
>>
>>  
>>
>> In either of these cases, you will want to supplement the standard 
>> term with another value specific to your own data set, whose 
>> definition you make available. There are a few ways to do that.
>>
>>  
>>
>> You can use the "define your own term" mechanism and assert both
>>
>>   _:_ tdwg:has_relationship_type tdwg:is-subtaxon-of  .
>>
>>   _:_ tdwg:has_relationship_type my-voc:is-recently-declared-subtaxon-of  .
>>
>>  
>>
>> You can have a completely separate predicate:
>>
>>   _:_ tdwg:has_relationship_type tdwg:is-subtaxon-of  .
>>
>>   _:_ my-voc:has_relationship_type my-voc:is-recently-declared-subtaxon-of  .
>>
>>  
>>
>> You can also be terribly clever and declare your own predicate to be 
>> a super-property of the TDWG predicate, one whose range is a union. 
>> This isn't terribly useful to people using your data unless the tdwg 
>> triple is also asserted.
>>
>>  
>>
>> Another alternative is to create an OWL rule that says 
>>
>> "if a thing has relationship-type 
>> my-voc:is-recently-declared-subtaxon-of, then it also has 
>> relationship-type tdwg:is-subtaxon-of"
>>
>>  
>>
>> But this creates a performance hit.
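
One way to picture the rule quoted above is to apply it once, when the data is published, instead of at query time. The Python sketch below does that with plain tuples. It is not an OWL reasoner: the predicate and term names are the illustrative ones from the message, and "ex:rel1" and the function name are hypothetical. The end result matches the "assert both" option, so it may avoid the reasoning cost for consumers at the price of storing the extra triples.

```python
HAS_TYPE = "tdwg:has_relationship_type"

# Local term -> standard term it implies (the rule quoted above).
IMPLIES = {
    "my-voc:is-recently-declared-subtaxon-of": "tdwg:is-subtaxon-of",
}


def materialize(graph):
    """Return the graph plus the standard-term triples the rule implies."""
    inferred = {
        (s, p, IMPLIES[o])
        for s, p, o in graph
        if p == HAS_TYPE and o in IMPLIES
    }
    return set(graph) | inferred


# "ex:rel1" is a hypothetical relationship node.
published = {
    ("ex:rel1", HAS_TYPE, "my-voc:is-recently-declared-subtaxon-of"),
}
expanded = materialize(published)
```

After materialize runs, the expanded set carries both the site-specific triple and the tdwg:is-subtaxon-of triple for the same node.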
>>
>>  
>>
>> -------------------------------------------------------------
>>
>>  
>>
>> That little discussion aside, my main concern is that you don't get 
>> mired in attempting to exhaustively list all the different island 
>> types (etc) as part of the vocabulary that you are creating. It's a 
>> never-ending job. It might be an idea to have the design guideline 
>> that no enumeration class defined by the vocabulary shall have more 
>> than 10 values. It's arbitrary, but it will keep people from being 
>> carried away subdividing types into a hierarchy that they think is a 
>> good idea, but which doesn't match the data people already have.
>>
>>  
>>
>> I'd also suggest that every enumeration (i.e., list of individuals) 
>> include two special values:
>>
>>  
>>
>> NOT_SPECIFIED. This value is not present in the source, underlying 
>> data: it isn't in the database, or the respondent didn't fill out the 
>> form fully. Perhaps "NULL" might be a better name - assuming people 
>> at this level know what it means.
>>
>> OTHER. This means the value is some specific value, but it's not 
>> covered in the TDWG list. I am not sure if this value should be 
>> explicitly used if you are publishing your own vocabulary and using 
>> terms from that. I'm inclined to say it should not be, because doing 
>> that would result in two values for predicates that naturally should 
>> be functional.
>>
>>  
>>
>> These special values *can* be done as a single instance, which means 
>> you could easily pull all "not specifieds" out of a dataset, but that 
>> means that either the ranges would have to be declared as a union, 
>> which is messy, or the individuals would have to be declared as 
>> having all possible types, which would break disjoint class declarations.
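
The two special values suggested above can be sketched with a Python Enum. The enumeration name and its two ordinary members are illustrative assumptions, not an agreed TDWG list, and the classify helper is hypothetical. The sketch gives each enumeration its own NOT_SPECIFIED and OTHER members rather than one shared instance, which sidesteps the union-range and disjointness problems mentioned in the message, but means "not specified" has to be collected per enumeration.

```python
from enum import Enum


class Disposition(Enum):
    # Illustrative values only, not an agreed TDWG list.
    IN_COLLECTION = "in collection"
    MISSING = "missing"
    # Special values suggested in the message above.
    NOT_SPECIFIED = "not specified"  # absent from the source data
    OTHER = "other"                  # present, but not covered by the list


def classify(raw):
    """Map a raw source value onto the enumeration."""
    if raw is None or not raw.strip():
        return Disposition.NOT_SPECIFIED
    try:
        # Enum lookup by value raises ValueError for unknown strings.
        return Disposition(raw.strip().lower())
    except ValueError:
        return Disposition.OTHER
```

A missing or empty value maps to NOT_SPECIFIED, while a string that is present but not in the list, such as "on loan" here, maps to OTHER.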
>>
>>  
>>
>>

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu
