New Draft Descriptive Data Standard Specification

Kevin Thiele
September 1, 2000

Introduction

The basic unit (atom) of a description is a statement of the form

"For item x, element y has Value z"

e.g.

"Gouania exilis flower colour is green" or "Joe Blogg's eye colour is blue"

In some cases, the statement needs to be qualified:

"Gouania australiana flower colour is yellow (rarely)"

Any description can be represented by a series of such statements. That is, a group of such statements comprises a description.

In this draft specification one or more descriptions are represented by a set of statements comprising a document. An example structure is shown in Fig. 1.

A simple description would comprise a set of statements for one item (e.g. the taxon Gouania exilis). There is no requirement that these statements be structured - they may comprise a simple list.

Statements in a document may also be contained within a collection. The purpose of a collection is to allow properties of the statement and its parts (item and element) to be designated once only, rather than for every statement, and to allow statements about several items or elements to be contained within one document. A collection may comprise a set of statements about one item (e.g. the collection for Gouania australiana), or for one element (e.g. statements describing the element Flower Colour for several taxa). The general structure is given in Fig. 2.

The parts of the description (Item, Element, Statement) and the document each have properties. Some of these properties, particularly high-level ones relating to the document as a whole, may be recorded in only one place within the document (but see External Resources below). For example, the Document property Version No. clearly may only be recorded for the document. Other properties, particularly lower-level ones, may be moved higher in the document for efficiency (so that they only need to be specified once).
For example, the Item property Name may be placed as a property of a collection. All statements in the collection then take that name property. Or it may be placed as a property of the document as a whole, in which case all statements in the document take that name property.

Attribution and sources

The Attribution and Reference properties are special cases. The standard is designed to allow (but not require) very rich attribution and referencing of statements.

Attribution here means assigning one or more authors to a statement or set of statements ("Jane Taxonomist is the person who has recorded here that Gouania exilis flowers are green"). A reference points to a published document or other source for the statement, such as a set of specimens or observations ("The statement that Gouania exilis flowers are green is derived from Suessenguth's Das Pflanzenreich", or "... is based on examination of specimens in the Kew Herbarium").

Attributions and references may be specified for a statement, for an item or element (when these are defined at collection level or higher), for a collection, or for the whole document. Attributions and references specified at a lower level override those specified at a higher level.

For example, a reference (Suessenguth's Das Pflanzenreich) and attribution (Jane Taxonomist) may be specified at the document level. Another reference and attribution may be specified for a collection of statements, and these will override the reference and attribution specified for the document as a whole. Yet another reference and attribution may be specified for a particular statement, and these will override all others.

If no attribution or reference is specified at a given level, it is inherited from the nearest higher level at which one is specified. Note that a statement is anonymous if no attribution is specified for it or for any higher level, and unreferenced if no reference is specified for it or for any higher level.
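The override rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the draft: the `Level` class, its `parent` link and the `resolve` function are hypothetical names, and each level is assumed to carry at most one attribution and one reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Level:
    """One level of the hierarchy (document, collection or statement).

    Hypothetical structure: the draft does not prescribe a data model.
    """
    name: str
    attribution: Optional[str] = None
    reference: Optional[str] = None
    parent: Optional["Level"] = None


def resolve(level: Optional[Level], prop: str) -> Optional[str]:
    """Walk upward from `level` and return the first value set for `prop`.

    Returns None if no level sets it (an anonymous or unreferenced statement).
    """
    while level is not None:
        value = getattr(level, prop)
        if value is not None:
            return value
        level = level.parent
    return None


# Document-level defaults, taken from the examples in the text.
document = Level("document",
                 attribution="Jane Taxonomist",
                 reference="Suessenguth's Das Pflanzenreich")
# The collection overrides the reference only.
collection = Level("collection",
                   reference="Specimens in the Kew Herbarium",
                   parent=document)
# The statement specifies nothing itself.
statement = Level("statement", parent=collection)

print(resolve(statement, "attribution"))  # Jane Taxonomist
print(resolve(statement, "reference"))    # Specimens in the Kew Herbarium
```

A statement with no parent and no values of its own resolves to `None` for both properties, which corresponds to the anonymous, unreferenced case.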
External resources

Properties for a document need not be specified within the document. They may be specified in an external document and referred to (with a suitable pointer) from within the document. The document that contains the externally-referenced properties may or may not have its own statements.

External referencing will allow documents to share properties, reducing the need to specify the same properties many times over. It will allow the evolution of shared name-spaces etc., while not requiring it: a document may be part of a set that shares properties, or it may be self-reliant.

Collation of values

Documents, or collections of statements within documents, may be specified as the source of data for particular statements. For instance, document D1 may contain statements about specimens of Gouania, document D2 may contain statements about species of Gouania, while document D3 may contain statements about the genus Gouania. Some statements about Gouania exilis in D2 (e.g. statements about the element Flower colour) may be stated explicitly in that document, while others (e.g. statements about the element Leaf length) may be collated from Leaf length statements made in D1. Similarly, some statements about Gouania may be made explicitly in D3, while others may be collated from information held in D2 (or D1).

Collation rules (e.g. "find the global range" or "list all values") will be needed to describe how the collation is to be done.
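As a rough illustration of the two collation rules named above, the following Python sketch collates specimen-level values into a single species-level value. The function names, the representation of a range as a (low, high) pair, and the sample measurements are all assumptions made for this example; the draft does not define how rules are expressed.

```python
from typing import List, Tuple


def global_range(ranges: List[Tuple[float, float]]) -> Tuple[float, float]:
    """'Find the global range': smallest minimum to largest maximum."""
    lows = [low for low, _ in ranges]
    highs = [high for _, high in ranges]
    return (min(lows), max(highs))


def list_all_values(values: List[str]) -> List[str]:
    """'List all values': each distinct value once, in first-seen order."""
    collated: List[str] = []
    for value in values:
        if value not in collated:
            collated.append(value)
    return collated


# Invented specimen-level data standing in for statements in D1.
leaf_length_mm = [(20.0, 35.0), (25.0, 40.0), (18.0, 30.0)]
flower_colour = ["green", "green", "white"]

# Collated species-level values, as they might appear in D2.
print(global_range(leaf_length_mm))    # (18.0, 40.0)
print(list_all_values(flower_colour))  # ['green', 'white']
```

The same functions could be applied a second time to collate species-level values in D2 into genus-level values in D3.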
Example Documents

Document 1 - most basic

1.0 12/12/99 13/9/00
This is a very basic document, with only the most basic statement data and no fancy ascriptions for items or elements

Gouania exilis  Flower colour  green  rarely
Gouania exilis  Petal shape  ovate

------------------------------------------------------------------------

Document 2 - use of statement collection (item-based)

1.0 12/12/99 13/9/00
This is a slightly fancier document, with statements arranged in a collection (some properties of the statements may be moved up to become statements of the collection for efficiency).

Flower colour  green  rarely
Petal shape  ovate
Flower colour  white  rarely
Petal shape  obovate

------------------------------------------------------------------------

Document 3 - fancier attributions

1.0 12/12/99 13/9/00
This has a list of attributions (people to whom statements can be attributed); each statement is attributed. Note that a statement with no attribution would be attributed to the Principal (default) Attribution as specified. If the document had a collection, the collection could be attributed also. Attributions are ranked: an attribution for a statement will override one for a collection, which overrides one for a document

1  Kevin Thiele  kevin.thiele@pi.csiro.au
2  Jim Croft  jrc@anbg.gov.au  Dodgy character, not to be trusted

Gouania exilis  Flower colour  green  rarely
Gouania exilis  Petal shape  obovate

------------------------------------------------------------------------

Document 4 - Element and Item properties

1.0 12/12/99 13/9/00
This is a very basic document, with only the most basic statement data and no fancy ascriptions for items or elements

Flower colour <\ELEMENT NAME>  Flower characters  "green" "yellow"
Petal shape <\ELEMENT NAME>  Flower characters  "ovate" "obovate"

Gouania exilis  1  1  rarely
Gouania exilis  2  1
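
One plausible reading of Document 4 is that each statement gives an item, an element number, a state number and an optional qualifier, with both numbers counted from 1 in the order the elements and their values are declared. Under that reading its two statements expand to the same content as Document 1. The Python sketch below shows the expansion; the `Element` type, the `expand` function, the 1-based numbering and the treatment of "Flower characters" as a grouping label are all assumptions, since the draft does not spell them out.

```python
from typing import Dict, List, NamedTuple, Optional


class Element(NamedTuple):
    """An element declaration as read from Document 4 (hypothetical shape)."""
    name: str
    group: str          # assumed meaning of "Flower characters"
    states: List[str]   # allowed values, in declaration order


# Elements numbered in declaration order, starting at 1 (an assumption).
ELEMENTS: Dict[int, Element] = {
    1: Element("Flower colour", "Flower characters", ["green", "yellow"]),
    2: Element("Petal shape", "Flower characters", ["ovate", "obovate"]),
}


def expand(item: str, element_no: int, state_no: int,
           qualifier: Optional[str] = None) -> str:
    """Turn an indexed statement into the 'item element is value' form."""
    element = ELEMENTS[element_no]
    value = element.states[state_no - 1]  # state numbers assumed 1-based
    text = f"{item} {element.name.lower()} is {value}"
    return f"{text} ({qualifier})" if qualifier else text


print(expand("Gouania exilis", 1, 1, "rarely"))
# Gouania exilis flower colour is green (rarely)
print(expand("Gouania exilis", 2, 1))
# Gouania exilis petal shape is ovate
```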