Re: SEEK Project and TDWG-SDD
Hi Jim!
Well, I hesitate to stick my neck out and into the space of this august, expert group
Actually this expert group is small and struggling with a complex design.
Too few people have the time either to penetrate it and find the critical points still lacking, or to write comprehensive documentation.
Any contributions are most welcome!
(http://seek.ecoinformatics.org), with which I see there has been some interaction with SDD, is attempting to build a global infrastructure for handling, resolving and communicating taxonomic concepts (as opposed to taxonomic names).
.... and as I see it, the most valuable tax. concepts are not based on synonymy or specimen lists, but on comprehensive descriptions.
.... which makes me critical of the fact that name identifiers are currently not provided separately from taxon concept identifiers. To describe a new concept in SDD we need name IDs, not concept IDs... But this is a GBIF ECAT issue...
If the taxa/concepts had their own schemas and were linked to the package metadata with a GUID, maybe a DOI or some other globally unique identifier, then the XML concept data sets could be used for other systems like concept based classification or database management systems. This would in theory, and in my view, give the work of your group much more leverage, exposure and relevance to a broader group of scientists and users of names and concepts.
This is right into the heart of our discussions. We absolutely do NOT want to define names, we want to use them. You may say link to them. The current schema contains a general attempt to link to external resources, including specimens and taxonomic names. See Entities/Objects and Entities/Classes in the schema.
We call these objects ResourceConnectors. Each connector has an external ID, a provider, and a human-readable text representation. Derived connector types may add further information.
The name "ResourceConnector" may be vague or confusing. In a way, it is intended as a proxy design pattern: each connector stands as a proxy object for an external object, the link to which may be temporarily unavailable. Any suggestions for a better name?
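As a rough illustration of the proxy idea (not part of the SDD schema), a connector could look like the following Python sketch. The three base fields come from the description above; the class layout, the `resolve` method, the `rank` field on the derived type, and the identifiers in the example are assumptions made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ResourceConnector:
    """Proxy for an object that lives outside the SDD document."""
    external_id: str  # identifier issued by the provider
    provider: str     # who serves the external object
    label: str        # human-readable text stored inside the document

    def resolve(self, fetch: Callable[[str, str], Optional[dict]]) -> dict:
        """Return the provider's record, or fall back to the stored label."""
        try:
            record = fetch(self.provider, self.external_id)
        except ConnectionError:
            record = None  # link temporarily unavailable
        return record if record is not None else {"label": self.label}


@dataclass
class TaxonNameConnector(ResourceConnector):
    """Hypothetical derived connector that adds one extra field."""
    rank: str = ""


def offline_fetch(provider: str, external_id: str) -> Optional[dict]:
    """Stand-in for a provider that cannot be reached right now."""
    raise ConnectionError("provider unreachable")


# Made-up identifier and provider name, for illustration only.
name = TaxonNameConnector(
    "name-0001", "example-name-provider", "Puccinia graminis", rank="species"
)
resolved = name.resolve(offline_fetch)
```

With the provider unreachable, `resolved` holds only the stored label, so a document stays readable even when the link cannot be followed.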
some careless file management. But presumably you guys are thinking about a registry or distributed federation of these data sets anyway, where they would be archived and served intact from a trusted source.
Rather struggling with UDDI... I am still confused about what, besides a UDDI URL, should be stored to reproducibly retrieve a service WSDL URI (which may change over time, hence the use of UDDI). UDDI is defined in extremely general terms, too general for me.
Also, we would need to develop interface definitions containing a set of standard operations to retrieve lists of external objects to make a selection from, or to retrieve rich information about an external object (i.e. more than the proxy inside SDD saves). I believe this is essential for the future of GBIF, but I wonder where the resources are to tackle these problems...
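To make the two operations concrete, here is a minimal Python sketch of what such an interface might contain. Nothing here is an agreed definition: the names `ExternalObjectProvider`, `list_objects` and `get_details`, the dictionary-shaped records, and the in-memory provider standing in for a remote service are all assumptions for illustration.

```python
from typing import Dict, List, Protocol


class ExternalObjectProvider(Protocol):
    """Hypothetical minimal interface a provider would implement."""

    def list_objects(self, query: str) -> List[dict]:
        """Return short entries (id and label) to make a selection from."""
        ...

    def get_details(self, external_id: str) -> dict:
        """Return rich information about one external object."""
        ...


class InMemoryNameProvider:
    """Toy provider backed by a dict, standing in for a remote service."""

    def __init__(self, records: Dict[str, dict]) -> None:
        self._records = records

    def list_objects(self, query: str) -> List[dict]:
        needle = query.lower()
        return [
            {"id": key, "label": rec["label"]}
            for key, rec in sorted(self._records.items())
            if needle in rec["label"].lower()
        ]

    def get_details(self, external_id: str) -> dict:
        return dict(self._records[external_id])  # copy, so callers cannot mutate


def pick_first(provider: ExternalObjectProvider, query: str) -> dict:
    """Select the first candidate and fetch its full record."""
    candidates = provider.list_objects(query)
    if not candidates:
        raise LookupError(f"no external object matches {query!r}")
    return provider.get_details(candidates[0]["id"])


# Made-up records, for illustration only.
provider = InMemoryNameProvider({
    "n2": {"label": "Puccinia recondita", "rank": "species"},
    "n1": {"label": "Puccinia graminis", "rank": "species"},
})
details = pick_first(provider, "puccinia")
```

The short list entries correspond to what a connector would store inside SDD; `get_details` is the route to everything beyond that.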
I also understand that data sets of diagnostic identification information are far from complete descriptions of concepts in either a taxonomic or phylogenetic sense, but if the SDD concept schema could accommodate additional characters, then the opportunity would be there for other people to use SDD for other kinds of systems. The UI of diagnostic key programs would likely not need to use or display DNA sequences for interactive identification, but no harm done, they could just ignore fields of no use to the program at hand.
I believe this is a misunderstanding. SDD is in no way confined to diagnostic sets, if diagnostic is understood as a minimized set of field-observable characters. That may work for plants, but it does not work for fungi or other microorganisms.
We do attempt to provide means to differentiate between different kinds of characters. One method is that concept trees point to the characters, and the method tree contains different methods. Another is an attempt to allow ratings for characters and concepts (the characters would inherit from concepts), covering identification convenience, identification availability, and identification reliability of a feature.
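One possible reading of the rating inheritance, sketched in Python. The three rating kinds are those named above; the numeric scale, the class names, and the lookup rule (a character's own rating wins, otherwise the nearest rated concept up the tree) are assumptions for illustration, not something the schema prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

RATING_KINDS = ("convenience", "availability", "reliability")


@dataclass
class Concept:
    name: str
    parent: Optional["Concept"] = None
    ratings: Dict[str, int] = field(default_factory=dict)


@dataclass
class Character:
    name: str
    concept: Concept
    ratings: Dict[str, int] = field(default_factory=dict)


def effective_rating(character: Character, kind: str) -> Optional[int]:
    """Own rating if present, else the nearest rated ancestor concept."""
    if kind not in RATING_KINDS:
        raise ValueError(f"unknown rating kind: {kind}")
    if kind in character.ratings:
        return character.ratings[kind]
    node: Optional[Concept] = character.concept
    while node is not None:
        if kind in node.ratings:
            return node.ratings[kind]
        node = node.parent
    return None  # nothing rated anywhere along the path


# Made-up example on an assumed 1 (poor) to 5 (good) scale.
microscopy = Concept("microscopic features", ratings={"convenience": 2})
spores = Concept("spores", parent=microscopy, ratings={"reliability": 5})
spore_length = Character("spore length", spores, ratings={"convenience": 1})
spore_colour = Character("spore colour", spores)
```

Here `spore_colour` picks up convenience 2 from "microscopic features" and reliability 5 from "spores", while `spore_length` overrides convenience with its own value of 1.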
DELTA had a simple character weight. The problem with that is that it had no semantics other than "change it until your selected program produces the desired result".
Any comments or experience with this?
Another objection I anticipate would be that broadening SDD to accommodate the functional requirements of the broader taxon concept management objectives would make the SDD schema too complex and difficult for anyone to work with.
No, I think that is where SDD has to go. I guess difficult is where we are already... We try to break it up into modular components to show that SDD tackles a range of problems, and that smaller subsets of SDD are available for individual problems.
Can you provide some ideas about how you would modularize the data, with just a keyword/example list of things in each module? I think that, even without you looking in detail at the schema, such a list might be very valuable to the development of SDD!
Thanks a lot!
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn@bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany            Fax: +49-30-8304-2203
Often wrong but never in doubt!