Bryan - great!
| Treatment not equal to file:
| The second paragraph of the General Requirements of the DDST v1.1 states
| that "One file will comprise one treatment...". There is no reason to
| require in the specification that a treatment = a file. This is because
| there is good reason to synthesize XML treatments on the fly from databases
| and allow users to query the XML structure to extract just the elements
| that they need. Most of the proposed spec works fine if we say that "A
| treatment is a structured presentation of information. The DDST provides a
| uniform view of descriptive data for taxonomy but it does not specify the
| underlying storage and representation of the information." We'll give
| examples of why I say this below.
|
| Taxon Name:
| We start right off with the Treatment Name. For the butterfly project at
| least, the treatments are all about species. Is it reasonable to give each
| treatment a name that equals the Taxon Name | Taxon ID? That is what we're
| doing for now. In general, should the DDST include treatment naming
| conventions linked to the Taxon ID? Of course you get into trouble if the
| Taxon Name changes but you would be OK if the ID were kept constant.
| So for now our treatments will have names like
| <Treatment Name>Papilio polyxene</Treatment Name> but I am not happy with
| the decision. Any better ideas out there? If so, it would be helpful if
| those ideas were in the spec.
| In our case the treatment name will also be a file name for now but
| eventually parts will be in a database and we'll need a unique <Treatment ID>.
|
| Description:
| I think it is fine that there is a description element to the treatment but
| we were confused about what should be described: the taxon or really
| the treatment. We decided to take the spec at face value and
| describe the treatment itself, which is fairly boring. All our treatment
| descriptions will be pretty much the same.
| <DESCRIPTION>This treatment was
| created as part of the Biological Information Browsing Environment project
| funded by NSF, UIUC... Information selected for the treatment is intended
| to support tasks associated with the PrairieWatch project....</DESCRIPTION>
| This uniformity makes me believe that many aspects in this description
| should be removed from the treatment and moved "up" one level to a higher
| order entity such as a collection of treatments or project. Several other
| elements of the treatment format support that notion. It means that DDST
| v1.1 is really a Treatment Spec. We also need a Collection Spec. This means
| that the treatments should all have an element called the Project ID.
This is getting places.
Perhaps I'm still stuck in the way Lucid and DELTA do things. There, a treatment is not a description of one taxon, but a set of descriptions for a set of taxa - e.g. all the species in genus X, or all the trees of Victoria. This is because the treatment is viewed as having a definite objective or product - storing data that will become an interactive key to the trees of Victoria, or a set of natural-language descriptions of the species in genus X. You're handling it very differently, with each treatment the description of one taxon, and the treatments then bundled into a project defined (presumably) by the objective. The advantage of this is being able to bundle individual treatments in different ways (for different projects); the disadvantage is having to make sure that such cross-bundling works - i.e. you may end up bundling together several taxon treatments with different character lists. Perhaps what you're calling a 'treatment' is actually the Item Data block of the DDST, and what I call a treatment is what you're suggesting as a project. We need to sort this out.
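To make the cross-bundling worry concrete, here is a minimal Python sketch of the kind of check a project would need before bundling single-taxon treatments. The data layout (a dict from taxon name to the set of characters scored) and the function name `character_list_mismatches` are assumptions for illustration only; nothing like this is defined in the DDST.

```python
def character_list_mismatches(treatments):
    """Report, per taxon, the characters it lacks relative to the whole bundle.

    `treatments` maps a taxon name to the set of character names that its
    treatment scores. This layout is hypothetical, not part of the DDST.
    """
    # The union of every treatment's characters is the bundle's character list.
    all_characters = set().union(*treatments.values())
    mismatches = {}
    for taxon, characters in treatments.items():
        missing = all_characters - characters
        if missing:
            mismatches[taxon] = sorted(missing)
    return mismatches


# Two hypothetical single-taxon treatments with different character lists.
bundle = {
    "Taxon A": {"wing colour", "wingspan"},
    "Taxon B": {"wing colour"},
}
print(character_list_mismatches(bundle))  # {'Taxon B': ['wingspan']}
```

An empty result would mean the treatments share one character list and can be bundled without gaps.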
| Treatment build/revision number
| Treatment build/revision number is a really good idea and we all agreed
| that we should have had it in our treatments before.
| <TREATMENT REVISION>0.1</TREATMENT REVISION>
| We use the 0 when the treatment is under development and not released yet.
| I do not know how to deal with dynamic treatment build revision numbers.
| Suggestions are welcome.
I think this should be up to the contributor, not the spec. The general guideline that <1 is work-in-progress doesn't really work, since all treatments are and always will be works in progress. The only requirement surely is that treatment numbers should increment - but perhaps then we should just use the arrow of time and date-stamp?
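As a rough illustration of the date-stamp idea, the Python sketch below orders revisions purely by date. It assumes the stamps are written as ISO 8601 dates (YYYY-MM-DD), which is my assumption and not something the spec states; `latest_revision` is a hypothetical helper name.

```python
from datetime import date


def latest_revision(stamps):
    """Return the most recent stamp from a list of ISO 8601 date strings."""
    # Parsing each stamp makes the comparison chronological and rejects
    # malformed dates with a ValueError instead of silently mis-sorting them.
    return max(stamps, key=date.fromisoformat)


print(latest_revision(["2000-08-10", "2000-11-02", "1999-12-31"]))  # 2000-11-02
```

With date stamps there is no need to agree on what "0.1" versus "1.0" means; a later date is simply a later revision.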
|
| Contributors List:
| Notice that if the information about collectors in the list such as contact
| information is repeated in each treatment there is a data integrity
| problem. People move. It might be possible that some treatments would have
| one set of old contact information and not be updated while some treatments
| might get updated information.
This will always be a problem if you try to attribute data to a moving object such as a person. I see no way of getting around this. But presumably a combination of the contact details and date stamps on treatments could be used to track our wandering contributor?
| It should be that when a person requests a treatment in DDST v1.1 format
| they will get back a contributor list with ID, Name and Contact
| Details but in most cases the system providing the information would insert
| the contact information "on the fly". There does not seem to be a reason to
| force the contributor IDs to be numeric as stated in the spec. Alphanumeric
| IDs can carry at least a little information to make the codes readable.
True
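A minimal Python sketch of that "on the fly" insertion, assuming contact details live in one central registry keyed by an alphanumeric contributor ID. The registry layout, the ID `AE01` and the function name `contributor_list` are all made up for illustration.

```python
def contributor_list(contributor_ids, registry):
    """Build a treatment's contributor list at request time.

    The treatment stores only IDs; name and contact details are copied in
    from the central registry whenever the treatment is served.
    """
    return [{"id": cid, **registry[cid]} for cid in contributor_ids]


# Hypothetical registry with a single contributor.
registry = {"AE01": {"name": "A. Example", "contact": "old address"}}
treatment_contributors = ["AE01"]

print(contributor_list(treatment_contributors, registry))
# The contributor moves: one update to the registry, no treatment is touched.
registry["AE01"]["contact"] = "new address"
print(contributor_list(treatment_contributors, registry))
```

Because every treatment is assembled from the same registry, no treatment can be left holding stale contact details.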
| The current standard makes a relatively weak requirement that the contributor
| codes are unique to the treatment. I think we need to use a broader
| definition. They should be unique to a collection at least.
Again we have a definitional problem and I think my treatment = your collection.
| Attribution:
| Must this be a contributor? If so this information should be handled as a
| property of <CONTRIBUTOR ROLE=PRINCIPLE|COPRINCIPLE> or as another tag of
| <CONTRIBUTOR>
| .... <ROLE>PRINCIPLE</ROLE>....
| Suggestions?
Yes, it could be done like this, but what would be wrong with doing it the other way - seems somehow neater to me, and I can't see much inefficiency.
| List of Sources:
| Again, usually the underlying information might be kept in different files
| or databases but the view of a treatment would include the fields of the
| spec. For us the spec will include bibliographic references and "personal
| observations". Again it seems that source ID should at least be unique to an
| entire collection, not just to the treatment.
| <LIST OF SOURCES>
|   <SOURCE>
|     <SOURCE ID>KL2000</SOURCE ID>
|     <DESCRIPTION>Koller and Smith (2000). Butterflies of the Great West.
|     Biology Press: New York.
|     </DESCRIPTION>
|   </SOURCE>
|   <SOURCE>
|     <SOURCE ID>MJ2000</SOURCE ID>
|     <DESCRIPTION>Mike Jacobs, personal communication August 10, 2000.
|     </DESCRIPTION>
|   </SOURCE>
| </LIST OF SOURCES>
Sources and attributions need to be thought through more. I see attribution as a reference to the person who actually encoded the data, and source as reference to the data source as far back as it can be pushed. Source may thus be a reference to a printed work, and attribution a reference to the person who interpreted the meaning of that description into the format; or source may be a reference to a specimen (or collection of specimens) and attribution be a reference to the person who observed that specimen. The critical thing is to have a hierarchy of ownership for both source and attribution; thus, for one character the (default) source may be a printed work, but for individual taxa the contributor may have referred back to specimens and this should be an overriding source.
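The override rule could look something like the Python sketch below: a source recorded for an individual taxon takes precedence over the default source for the character. The two lookup tables, the function name `resolve_source` and the specimen ID `SPEC-001` are hypothetical; `KL2000` is the printed-work source ID from the quoted example. The same lookup would serve for attribution.

```python
def resolve_source(character, taxon, character_defaults, taxon_overrides):
    """Return the source ID that applies to one character of one taxon.

    A taxon-level entry overrides the character-level default; None means
    that no source has been recorded at either level.
    """
    default = character_defaults.get(character)
    return taxon_overrides.get((character, taxon), default)


# Default: the character was encoded from a printed work.
character_defaults = {"wing colour": "KL2000"}
# Override: for one taxon the contributor went back to a specimen.
taxon_overrides = {("wing colour", "Taxon A"): "SPEC-001"}

print(resolve_source("wing colour", "Taxon A", character_defaults, taxon_overrides))  # SPEC-001
print(resolve_source("wing colour", "Taxon B", character_defaults, taxon_overrides))  # KL2000
```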
Cheers - k