On 25/11/2010, at 12:03 PM, Tony.Rees@csiro.au wrote:
Maybe an answer would be to use TCS not DwC for exchange of purely taxonomic data? How about creating a TCSA format for bulk transfer - or is this not a great thought... (not being that familiar with TCS)
We had all sorts of hideous, intractable problems with TCS, and have abandoned it.
The main issue is that it is quite carefully closed. The only top-level element it defines is "DataSet" - you cannot validly include a single tcs:TaxonConcept element in some other xml - it has to be wrapped in a DataSet and a TaxonConcepts element. Most of its types are defined as in-line anonymous blocks, so you cannot say "my ibis Foo element is of type tcs:TaxonConceptType".
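To make the wrapping requirement concrete, here is a minimal sketch in Python using the standard library's ElementTree. The namespace URI is a placeholder assumption, not the real TCS namespace, and a real DataSet may well require further child elements that this sketch omits. Only the element names DataSet, TaxonConcepts and TaxonConcept come from the description above.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: an assumption for illustration only,
# not the actual TCS namespace.
TCS_NS = "urn:example:tcs"


def tcs(tag: str) -> str:
    """Return the namespace-qualified tag name in ElementTree notation."""
    return f"{{{TCS_NS}}}{tag}"


def wrap_concept(concept: ET.Element) -> ET.Element:
    """Wrap a single TaxonConcept in the DataSet/TaxonConcepts envelope.

    The concept cannot travel on its own, so even one record has to be
    nested two levels deep before it can be exchanged.
    """
    dataset = ET.Element(tcs("DataSet"))
    concepts = ET.SubElement(dataset, tcs("TaxonConcepts"))
    concepts.append(concept)
    return dataset


# Hypothetical concept with a made-up id.
concept = ET.Element(tcs("TaxonConcept"), {"id": "c1"})
envelope = wrap_concept(concept)
xml_text = ET.tostring(envelope, encoding="unicode")
```

The point of the sketch is only the shape: any document that wants to carry one concept has to carry the whole envelope with it.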
The reason for this, I think, is that one of the ways a TCS element can refer to another is by a "local" reference, which only makes sense inside a coherent dataset. The way TCS is built, you will never see a valid TCS XML element with a local reference unless that reference is inside a TCS DataSet. Which sounds like a good idea at first.
TCS has no extension mechanism other than the "provider specific data" sections. You cannot validly put attributes or elements from other namespaces inside TCS elements except in the ProviderSpecificData section. Now ... this would almost be ok if TCS was very complete. But we kept coming up with things that didn't fit.
TCS, for instance, insists that TaxonRelationshipAssertion blocks have ids. Ours don't. Ok, so we can put our relationship assertion objects inside the ProviderSpecificData section of the taxon. Obviously, we can't use the TCS xml type for these blocks. But that means that we can't use any of the TCS attributes and sub-elements either. So we wound up building an XML schema that essentially paralleled TCS, using the same element names, but with the elements exposed as top-level elements where necessary.
We have many taxon relationship types that are not TCS types. Ok, so these extra relationships go in the provider-specific data section, described with in-house xml. Now ... there are two design choices here, both bad.
These relationships can be not included in the TCS relationships block at all. But that means that something that only understands TCS will simply not be aware of them. It also meant that there were slabs of duplicated code to generate the TCS relationship blocks "where relationship type in ...", and almost identical code differing only in that it uses the ibis xml namespace and "where relationship type NOT in ...". Horrible.
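The split behind that first choice can be sketched roughly as follows. This is an illustration in Python, not the code we actually ran: the `Relationship` class, the `split_relationships` helper and the values in `TCS_TYPES` are all hypothetical, and the set is not the actual TCS enumeration.

```python
from dataclasses import dataclass

# Illustrative values only; NOT the real TCS relationship vocabulary.
TCS_TYPES = {"is child of", "is parent of"}


@dataclass
class Relationship:
    rel_type: str  # relationship type as stored in-house
    target: str    # identifier of the related taxon (made-up format)


def split_relationships(rels):
    """Partition relationships by whether TCS has a matching type.

    The first list would be written as TCS relationship elements; the
    second has to be written with in-house XML in the provider-specific
    section, where a TCS-only consumer never sees it.
    """
    tcs_rels = [r for r in rels if r.rel_type in TCS_TYPES]
    in_house_rels = [r for r in rels if r.rel_type not in TCS_TYPES]
    return tcs_rels, in_house_rels


rels = [
    Relationship("is child of", "t1"),
    Relationship("is misapplied name for", "t2"),  # hypothetical in-house type
]
tcs_rels, in_house_rels = split_relationships(rels)
```

Everything downstream of the split is then written twice, once per namespace, which is where the duplication comes from.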
Alternatively, the relationships can be included in the TCS block with our relationship type mapped as closely as possible to a TCS one (needless to say, there's no TCS "other/unspecified" relationship type value), and extra data included elsewhere. But ... TaxonRelationship elements don't have a provider specific data section: only the enclosing TaxonConcept does. We could include all of the extra data for each relationship in an element in this section, but there's no way to map them across to the particular TCS relationship element that they are each about, because relationships that appear inside a taxon concept (i.e. not relationship assertions) don't have ids, and there's no way to attach one.
Our taxon ranks don't always match the TCS ranks. So we had to map them across to TCS equivalents. But we still want to keep the ones we have - it's data - and so the verbatim rank goes into an element in the ProviderSpecificData section. Which would be ok ... except that now a TaxonName has two ranks, and the reason that it has two is that the TCS XML schema forces us to include one which is, well, wrong.
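A minimal sketch of how the two ranks arise, in Python. The mapping table and the rank strings are hypothetical and are not taken from either the TCS rank list or our own data; the assumption is simply that an unmapped rank already matches a TCS one and passes through unchanged.

```python
# Hypothetical mapping from in-house ranks to the nearest TCS rank.
RANK_TO_TCS = {
    "subvar.": "var",
    "nothomorph": "form",
}


def ranks_for_export(verbatim_rank: str) -> dict:
    """Return both ranks that end up attached to one TaxonName.

    'tcs_rank' is the value the schema forces into the TCS rank slot;
    'verbatim_rank' is the real value, which has to be carried in the
    ProviderSpecificData section instead.
    """
    tcs_rank = RANK_TO_TCS.get(verbatim_rank, verbatim_rank)
    return {"tcs_rank": tcs_rank, "verbatim_rank": verbatim_rank}


exported = ranks_for_export("subvar.")
```

Whenever the two values differ, the one in the standard slot is the approximation and the one in the provider-specific section is the data.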
Eventually the ProviderSpecificData sections became so large and complex that we had to admit that they amounted to a document in their own right. At which stage ... why TCS at all? As a wrapper, into which we can't fit the data values that we want, around some in-house XML with the good stuff?
We abandoned it. Sorry to go on at such length and detail, but I have such vivid memories of trying to make it work.