[tdwg-tnc] LSIDs and taxon concepts

Eamonn O Tuama eotuama at gbif.org
Mon Oct 8 16:25:57 CEST 2007


Hi Roger,

I'd like you to comment on the issue of validation. In RDF, with its Open
World assumption, we lose the ability to validate. So how easy is it to
take RDF output and, if an application requires it, re-format it so that it
can be validated against an XML Schema, i.e., take your example below and
link it to an XML Schema? I understand that there are multiple ways in RDF
to express the same thing, so would that create problems for a schema if the
RDF were coming from different data providers? Or do we have control of that
because of the LSID vocabularies?
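
To make the "multiple ways to express the same thing" concern concrete, here
is a minimal sketch in Python (standard library only). It assumes the
namespace URI is the TaxonName vocabulary URL with a trailing '#', and the
LSID is made up. Both documents make the same RDF statements, once as a typed
node with a nested property element and once as an rdf:Description with an
explicit rdf:type and a property attribute, yet their XML structure differs,
so a single XML Schema would not naturally accept both. The describe() helper
is hypothetical and only understands these two forms; real data would need a
proper RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TN = "http://rs.tdwg.org/ontology/voc/TaxonName#"  # assumed namespace URI

# Form 1: typed node element with a nested property element.
DOC_A = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:tn="{TN}">
  <tn:TaxonName rdf:about="urn:lsid:example.com:names:1">
    <tn:genusPart>Rhododendron</tn:genusPart>
  </tn:TaxonName>
</rdf:RDF>
"""

# Form 2: rdf:Description with explicit rdf:type, property as an attribute.
DOC_B = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:tn="{TN}">
  <rdf:Description rdf:about="urn:lsid:example.com:names:1"
                   tn:genusPart="Rhododendron">
    <rdf:type rdf:resource="{TN}TaxonName"/>
  </rdf:Description>
</rdf:RDF>
"""


def describe(doc):
    """Return (top-level tag, subject, rdf type, genusPart) for the first node."""
    node = ET.fromstring(doc)[0]
    subject = node.get(f"{{{RDF}}}about")
    if node.tag == f"{{{RDF}}}Description":
        rdf_type = node.find(f"{{{RDF}}}type").get(f"{{{RDF}}}resource")
    else:
        # Typed node: the tag '{ns}Local' stands for the type URI ns + Local.
        rdf_type = node.tag[1:].replace("}", "", 1)
    genus_el = node.find(f"{{{TN}}}genusPart")
    if genus_el is not None:
        genus = genus_el.text
    else:
        # Property given as an attribute instead of a child element.
        genus = node.get(f"{{{TN}}}genusPart")
    return node.tag, subject, rdf_type, genus


tag_a, *facts_a = describe(DOC_A)
tag_b, *facts_b = describe(DOC_B)
print("same XML structure:", tag_a == tag_b)
print("same statements:   ", facts_a == facts_b)
```

If validation against an XML Schema is needed, one option this suggests is to
parse the RDF into statements first and then re-serialize into one fixed XML
layout, rather than validating whatever layout each provider happens to emit.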

Éamonn



-----Original Message-----
From: tdwg-tnc-bounces at lists.tdwg.org
[mailto:tdwg-tnc-bounces at lists.tdwg.org] On Behalf Of Roger Hyam
Sent: 05 October 2007 16:51
To: Richard Pyle
Cc: tdwg-tnc at lists.tdwg.org
Subject: Re: [tdwg-tnc] LSIDs and taxon concepts


Paul, Rich et al.

I'll try and answer all the questions in a single mail and also keep  
it short.

Taxon Concept Schema (TCS) is an XML Schema that was standardized by
TDWG in 2005, but "TCS" is also used as shorthand for the distinction
between Taxon Names and Taxon Concepts.

The fundamental thing that TCS does (both the schema and the way of  
modeling) is separate TaxonNames (or nomenclatural acts) from  
TaxonConcepts (actual delimited or implied taxa that one would  
identify something to).

In order to issue LSIDs for TaxonNames or TaxonConcepts it is  
necessary to represent them in RDF rather than XML Schema. RDF is far  
more modular by nature than XML Schema and so two vocabularies were  
put together to represent TaxonNames and TaxonConcepts (rather than  
one schema) but unless you are issuing pure nomenclatural data you  
will usually use both.

http://rs.tdwg.org/ontology/voc/TaxonName

http://rs.tdwg.org/ontology/voc/TaxonConcept

The TaxonName vocabulary is being used by IPNI, Index Fungorum and  
soon ZooBank. It is also being used by GBIF and anyone else who uses  
the TaxonConcept or TaxonOccurrence because it is embedded within  
these vocabularies. In fact it could be used anywhere someone wants
to break apart a name string.

The TaxonOccurrence vocabulary is being issued by the CATE project
(I don't have the reference to hand), Species2k/Catalogue of Life
are going to use it for their checklist, and of course it is used by
others who issue TaxonOccurrence data.

I'll show an example of the embedding as it makes things clearer.
This is abbreviated for clarity. Suppose we want to express an
occurrence of a taxon (perhaps as a specimen):

<to:TaxonOccurrence rdf:about="urn:lsid:example.com:specimens:1234">
	<dc:title>Hyam.R.D. 284927 - Rhododendron ponticum L.</dc:title>
	<to:collector>Roger Hyam</to:collector>
	<!-- ... other stuff ... -->
	<to:identifiedTo>
		<to:Identification>
			<to:expertName>Chris Browning</to:expertName>
			<to:taxonName>Rhododendron ponticum L.</to:taxonName>
			<to:taxon>
				<tc:TaxonConcept>
					<tc:hasName>
						<tn:TaxonName>
							<tn:genusPart>Rhododendron</tn:genusPart>
							<tn:specificEpithet>ponticum</tn:specificEpithet>
							<tn:authorship>L.</tn:authorship>
						</tn:TaxonName>
					</tc:hasName>
					<tc:accordingToString>Brown and Smith 1995</tc:accordingToString>
					<tcom:publishedIn>Some monograph by some guys</tcom:publishedIn>
					<tcom:microReference>page 32</tcom:microReference>
					<tcom:publishedInCitation rdf:resource="http://some.uri.to.some.citation.for.the.pub"/>
				</tc:TaxonConcept>
			</to:taxon>
		</to:Identification>
	</to:identifiedTo>
</to:TaxonOccurrence>

So we have a TaxonOccurrence (really like a DarwinCore record but  
with embedding). In order to express the identification of this  
specimen in more detail than just a string we include a TaxonConcept  
and a TaxonName. Neither the concept nor the name has an identity
(they are both anonymous) but they are both objects of that type.
They could be replaced by references to external instances. There are
also properties that allow the supplier to "cop out" of embedding or
referencing anything and simply include a string if that is all they
have in their database.
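
As a rough illustration of how a consumer might cope with these options, here
is a minimal Python sketch (standard library only). The namespace URIs are
assumptions (the vocabulary URLs with a trailing '#', plus a guessed
TaxonOccurrence URL), the concept LSID is made up, and the identification()
helper is hypothetical. It reads the plain name string, then either follows a
reference to an external concept or pulls the name parts out of an embedded,
anonymous TaxonConcept/TaxonName.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    # Assumed namespace URIs, not confirmed by the message.
    "to": "http://rs.tdwg.org/ontology/voc/TaxonOccurrence#",
    "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
    "tn": "http://rs.tdwg.org/ontology/voc/TaxonName#",
}
XMLNS = " ".join(f'xmlns:{prefix}="{uri}"' for prefix, uri in NS.items())

# Concept and name embedded as anonymous objects.
EMBEDDED = f"""
<to:TaxonOccurrence {XMLNS} rdf:about="urn:lsid:example.com:specimens:1234">
  <to:identifiedTo>
    <to:Identification>
      <to:taxonName>Rhododendron ponticum L.</to:taxonName>
      <to:taxon>
        <tc:TaxonConcept>
          <tc:hasName>
            <tn:TaxonName>
              <tn:genusPart>Rhododendron</tn:genusPart>
              <tn:specificEpithet>ponticum</tn:specificEpithet>
              <tn:authorship>L.</tn:authorship>
            </tn:TaxonName>
          </tc:hasName>
        </tc:TaxonConcept>
      </to:taxon>
    </to:Identification>
  </to:identifiedTo>
</to:TaxonOccurrence>
"""

# Same record, but pointing at an external concept (hypothetical LSID).
REFERENCED = f"""
<to:TaxonOccurrence {XMLNS} rdf:about="urn:lsid:example.com:specimens:1234">
  <to:identifiedTo>
    <to:Identification>
      <to:taxonName>Rhododendron ponticum L.</to:taxonName>
      <to:taxon rdf:resource="urn:lsid:example.com:concepts:5678"/>
    </to:Identification>
  </to:identifiedTo>
</to:TaxonOccurrence>
"""


def identification(doc):
    """Summarize the identification: name string plus reference or name parts."""
    ident = ET.fromstring(doc).find("to:identifiedTo/to:Identification", NS)
    result = {"string": ident.findtext("to:taxonName", namespaces=NS)}
    taxon = ident.find("to:taxon", NS)
    ref = taxon.get(f"{{{NS['rdf']}}}resource")
    if ref is not None:
        # The concept lives elsewhere; only its identifier is available here.
        result["concept_ref"] = ref
    else:
        name = taxon.find("tc:TaxonConcept/tc:hasName/tn:TaxonName", NS)
        # Strip the '{namespace}' prefix from each child tag.
        result["name_parts"] = {c.tag.split("}")[1]: c.text for c in name}
    return result


print(identification(EMBEDDED))
print(identification(REFERENCED))
```

A supplier with nothing but a name string would simply omit to:taxon; the
sketch does not handle that case.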

So in issuing a TaxonOccurrence record I use both the TaxonConcept and
TaxonName vocabularies. I am not using TCS in the sense of the XML
Schema but I am using it in the sense of the notions involved.

This is where we are headed with integrated standards and semantic
approaches.

I hope this helps.

BTW I hope it answers Rich's question as it is possible to add  
reference info in the TaxonConcept to say where it was published  
using the common properties defined in:

http://rs.tdwg.org/ontology/voc/Common

All the best,

Roger






On 5 Oct 2007, at 13:58, Richard Pyle wrote:

>
> Hi Paul and others,
>
> This leads me to a couple of questions about serving TCS data.  For  
> example,
> strictly speaking, ZooBank will be returning metadata in accordance
> with the
> "TaxonNameLsidVoc"
> (http://wiki.tdwg.org/twiki/bin/view/TAG/TaxonNameLsidVoc), which  
> is based
> on TCS, but is not TCS per se (ZooBank is concerned with taxon  
> names, not
> concepts). There is also the TaxonConceptLsidVoc
> (http://wiki.tdwg.org/twiki/bin/view/TAG/TaxonConceptLsidVoc), which
> together with the TaxonNameLsidVoc and other more general ontologies,
> collectively represent the same information as a TCS XML document.   
> I guess
> that one of the things I'm not clear on is whether RDF returned for  
> an LSID
> counts as "TCS", or does TCS specifically mean a document structured
> according to the TCS XML Schema?
>
> Also, what are we really serving when we say we're serving TCS  
> documents?
> Name-only data is part of TCS, but I wouldn't think of it as TCS  
> per se.  I
> think you need it in the context of an "accordingTo" instance.  (By the way
> -- Roger -- I'd always thought of "accordingTo" as referring to a
> PublicationCitation, not an Actor or Team.  A topic of discussion for
> another day...)
>
> But my point is, I've got hundreds of thousands of database records  
> for
> [Name accordingTo Publication], which each represent a pointer to a  
> taxon
> concept (that is, "concept" sensu Kennedy, not sensu Pyle).  And  
> for many of
> these, I also have information on synonymies within the Publication  
> (i.e.,
> taxon concepts defined at the resolution of names, which means at the
> implied resolution of type specimens).  What I don't have, however, is
> robust sets of "taxon concept" records that go into more specific  
> detail
> regarding the definition of the concept itself (in terms of non-type
> specimens and/or character data, for example).  Also, I don't have  
> much in
> the way of third-party RelationshipAssertions to define how these  
> alternate
> concepts map to each other.
>
> This leads to the question I've been meaning to ask, which is "How  
> much
> information do I need before I call it a TCS document?"  I would  
> say raw
> names data alone don't cut it -- you would need at least an  
> "accordingTo"
> before you could call it a concept/TCS document.  But if all I have
> is an
> accordingTo (with no additional specimens or characters or
> RelationshipAssertions), do I still call it TCS?
>
> Sorry if I'm over-thinking this...
>
> Aloha,
> Rich
>
>> -----Original Message-----
>> From: tdwg-tnc-bounces at lists.tdwg.org
>> [mailto:tdwg-tnc-bounces at lists.tdwg.org] On Behalf Of Paul Allen
>> Sent: Friday, October 05, 2007 2:11 AM
>> To: tdwg-tnc at lists.tdwg.org
>> Subject: [tdwg-tnc] LSIDs and taxon concepts
>>
>> Hi all,
>>
>> I'm new to this list and hope that the following are
>> appropriate questions.
>> In Bratislava, I wasn't keeping detailed enough notes on
>> projects and their current and future plans wrt TCS.
>>
>> What sites are currently publishing TCS-formatted data or
>> will be within the year? I know that zoobank.org will be
>> publishing TCS data in the near future. Is GBIF? ITIS? Species2000?
>>
>> What sites are publishing real "taxon concept" data (in TCS
>> format or not)?
>> Conversely, what sites are simply publishing "nominal taxon
>> concepts" as opposed to detailed authoritative taxon concepts?
>>
>> Is this the kind of thing for which we should generate a
>> survey to send to sites (i.e. their plans for publishing TCS)
>> or distribute to TDWG members?
>>
>> Thanks,
>> Paul
>>
>>
>> ------------------------
>> Paul Allen, Assistant Director
>> Information Science             pea1 at cornell.edu
>> Cornell Lab of Ornithology     (800) 843-BIRD
>> 159 Sapsucker Woods Road       (607) 254-2480 (direct)
>> Ithaca, NY 14850               (607) 254-2415 (fax)
>>
>> http://www.birds.cornell.edu/
>> http://bna.birds.cornell.edu/
>> http://www.ebird.org/
>> http://bird.atlasing.org/
>> ------------------------
>>
>> _______________________________________________
>> tdwg-tnc mailing list
>> tdwg-tnc at lists.tdwg.org
>> http://lists.tdwg.org/mailman/listinfo/tdwg-tnc
>
>





