Well, I hesitate to stick my neck out and into the space of this august, expert group, but there hasn't been enough news from Iraq recently, so here's something to ponder.
Bryan Heidorn visited Kansas this week and we had a discussion about where TDWG-SDD is going, and the history of your deliberations. We at Kansas have not been involved with character data (although I have the original interactive key program for plants (IDENT4) archived on paper tape for the TDWG-SDD museum), but our recent SEEK project (http://seek.ecoinformatics.org), with which I see there has been some interaction with SDD, is attempting to build a global infrastructure for handling, resolving and communicating taxonomic concepts (as opposed to taxonomic names).
Bryan described to us how your schema is intended to include all of the data and metadata associated with a diagnostic character set for a group of taxa. My understanding is incomplete and based only on our single discussion, but it struck me that TDWG-SDD has an opportunity for much broader acceptance and support if your schema were not designed as a single data object that contains both the metadata about the package (or work or whatever you refer to it as) *and* the descriptive data that describe the individual concepts.
If the taxa/concepts had their own schemas and were linked to the package metadata with a GUID, maybe a DOI or some other globally unique identifier, then the XML concept data sets could be used by other systems, such as concept-based classification or database management systems. This would in theory, and in my view, give the work of your group much more leverage, exposure and relevance to a broader group of scientists and users of names and concepts.
The overhead for the traditional diagnostic identification software makers would be that the XML parts would need to be assembled for the various applications that use the data, and there would be a risk that SDD data sets would be incomplete if there were some careless file management. But presumably you guys are thinking about a registry or distributed federation of these data sets anyway, where they would be archived and served intact from a trusted source.
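To make the assembly step concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration under loud assumptions: the element names (`package`, `conceptRef`, `concept`, `character`), the `guid` attribute and the `urn:example:` identifiers are all made up for this message and are not taken from the SDD schema. It shows a package document that refers to concepts only by GUID, a separate set of standalone concept documents, and a check that reports any concept that could not be found.

```python
import xml.etree.ElementTree as ET

# Hypothetical package metadata: it references concepts by GUID only.
PACKAGE_XML = """<package guid="urn:example:package:1">
  <title>Example key</title>
  <conceptRef guid="urn:example:concept:a"/>
  <conceptRef guid="urn:example:concept:b"/>
</package>"""

# Hypothetical standalone concept documents, as a registry might serve them.
# Concept "b" is deliberately absent to mimic careless file management.
CONCEPT_XMLS = [
    '<concept guid="urn:example:concept:a">'
    '<character name="leaf shape">ovate</character>'
    '</concept>',
]


def assemble(package_xml, concept_xmls):
    """Return (concepts resolved by GUID, GUIDs that could not be resolved)."""
    package = ET.fromstring(package_xml)

    # Index the available concept documents by their GUID.
    available = {}
    for text in concept_xmls:
        node = ET.fromstring(text)
        available[node.get("guid")] = node

    resolved, missing = {}, []
    for ref in package.findall("conceptRef"):
        guid = ref.get("guid")
        if guid in available:
            resolved[guid] = available[guid]
        else:
            missing.append(guid)
    return resolved, missing


resolved, missing = assemble(PACKAGE_XML, CONCEPT_XMLS)
print("resolved:", list(resolved))
print("missing:", missing)
```

An application could refuse to run, or warn the user, whenever the list of missing GUIDs is not empty; a trusted registry would presumably make that case rare.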
I also understand that data sets of diagnostic identification information are far from complete descriptions of concepts in either a taxonomic or phylogenetic sense, but if the SDD concept schema could accommodate additional characters, then the opportunity would be there for other people to use SDD for other kinds of systems. The UI of diagnostic key programs would likely not need to use or display DNA sequences for interactive identification, but no harm done, they could just ignore fields of no use to the program at hand.
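As a rough sketch of the "just ignore fields of no use" idea, again in Python and again with invented names (the `concept`, `character` and `sequence` elements and the example values are assumptions for illustration, not SDD), a key program could read only the elements it understands and skip everything else in the same concept record:

```python
import xml.etree.ElementTree as ET

# Hypothetical concept record holding diagnostic characters plus extra data
# (a DNA sequence) that an interactive key has no use for.
CONCEPT_XML = (
    '<concept guid="urn:example:concept:a">'
    '<character name="leaf shape">ovate</character>'
    '<character name="petal count">5</character>'
    '<sequence marker="rbcL">ATGTCACCAC</sequence>'
    '</concept>'
)


def diagnostic_characters(concept_xml):
    """Return only the character name/value pairs; other elements are skipped."""
    concept = ET.fromstring(concept_xml)
    return {c.get("name"): c.text for c in concept.findall("character")}


chars = diagnostic_characters(CONCEPT_XML)
print(chars)
```

A different application, say one that manages concepts for classification, could read the same record and pick out the sequence data instead.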
Another objection I anticipate would be that broadening SDD to accommodate the functional requirements of the broader taxon concept management objectives would make the SDD schema too complex and difficult for anyone to work with.
Overall, it seems to this SDD novice that the mechanical overhead of applications having to assemble the taxon/concept pieces with the character definitions and package metadata would not be so onerous as to outweigh the benefit of separating out the taxon/concept schema as data sets that *could* stand alone if someone wanted to use them that way.
Respond if you dare.
Jim B.
--------------------------------
James H. Beach
Biodiversity Research Center
University of Kansas
1345 Jayhawk Boulevard
Lawrence, KS 66045, USA
Tel: 785 864-4645, Fax: 785 864-5335
Televideocon: (H.323): 129.237.201.102