SDD Schema in relationship to Prometheus
Kevin Thiele
kevin.thiele at BIGPOND.COM
Tue Mar 16 08:24:30 CET 2004
Trevor - thanks very much for your comments and comparative document - this is really useful, and we need to get much more feedback like this.
The main difference between SDD and Prometheus seems to be that you are working specifically on the basis of defining a controlled terminology whereas SDD explicitly decided early on that a controlled terminology was outside our scope. History will judge which approach is best.
We did have early discussions about a controlled terminology (see the list archives for a history of this). One difficulty for us is that SDD is designed to be biology-wide. Indeed, we have even removed specific references to biology, such as "taxon", because SDD is equally applicable to descriptions of non-taxa such as diseases, nutrient deficiency syndromes, soils and minerals. Perhaps here we have drawn our bow too wide, but we were informed by the fact that at our Lisbon meeting all but one of the contributors who were working with identification tools had removed their biology-specific tags to become more general. Prometheus (as I understand it from your document) is specifically botanical. This would be an intolerable restriction for us given our brief.
Obviously, a botany-wide controlled terminology is more achievable than a biology-wide one. Personally, however, I think that any controlled terminology, even in botany, runs the danger of trying to force nature kicking and screaming into small boxes, and of doing it an injustice in the process. I don't know how any botany-wide controlled terminology could cope with the leaves of Drosera auriculata, for instance, or the morphology of Podostemaceae. (In fact, I wonder whether the dream of a controlled terminology is more likely in a cold Northern Hemisphere climate than in the biodiverse South or tropics?)
In general, we have taken the view that a controlled terminology in particular domains (e.g. legumes) may develop as an emergent property of SDD, rather than being imposed top-down.
On more specific points from your document:
Complexity: SDD was scoped to be a superset of existing systems and standards (e.g. DELTA, Lucid, DeltaAccess), and also to accommodate future developments that those of us working in the field can envisage but no-one's really done yet (particularly federation issues - and you may be further down this track than we are). This is part of the reason for the complexity.
>It is not clear to me whether SDD is proposing this schema as
>a unifying schema to which different description formats would map their own schema
>or
>whether the SDD schema is being proposed as a schema for developers to (partially) implement when designing applications and repositories for capturing descriptive data.
It is designed as a unifying standard, to allow lossless roundtripping between applications. At the same time, we are struggling with how much should be mandatory and how much optional (your second option).
>From our own collaborative experiences with botanical taxonomists, data models and structures hold no interest to them in practice, and they find even our simple conceptual model of character description complex to understand. Probably few working taxonomists would wish to interact at any level with the SDD schema and applications would have to achieve this mapping transparently.
On this I'm sure you're right, and we have had many discussions within SDD about this problem. There are differing views as to the importance of taxonomists themselves coming to grips with SDD, as the standard itself will generally be invisible to a taxonomist using an SDD-compliant application.
Translation and multiple language representations: allowing multiple languages is seen as a fundamental part of the SDD brief. Life would indeed be much simpler if everyone spoke the same language, but they don't, so we need to handle that.
>It is not clear whether SDD proposes that a single document can include multiple language representations, or whether these would form separate documents, conforming to the same standard.
SDD can handle multiple language representations of every character string within the one document.
Multiple expertise levels
>I am similarly suspicious of the necessity for including the ability for recording different expertise levels in one document format. Is SDD proposing/allowing multiple representations within the same document, or just that the same format/standard can be used for documents aimed at different expertise levels?
>
>There clearly is value in being able to extract/translate simple language descriptions from complex data resources - as is necessary for compiling flora and keys from monographs and original descriptions. However, is including the ability to describe descriptive data in language suitable for primary schoolchildren relevant to an accurate scientific database of taxonomic data? [Again this would appear to be a political requirement??]
----- Original Message -----
From: Paterson, Trevor
To: TDWG-SDD at LISTSERV.NHM.KU.EDU
Sent: Monday, March 15, 2004 9:41 PM
Subject: SDD Schema in relationship to Prometheus
Gregor
I have written a rough document considering several aspects of the SDD schema - largely interpreted with reference to our Prometheus Database model for descriptive data. It seems easier to keep this all together, rather than post it to various sections on twiki, so I am attaching it here.
My main problems in interpreting the schema were the lack of documentation (as always...), especially for the conceptually complex parts like concept trees. I think clear, visual summary models for description, characters, concept trees etc. would help a novice to get to grips with the concepts, and might make some of the complexities more tractable. I do worry that the overall schema is overcomplex and 'trying to do too much in one go' - e.g. considering multiple language and expertise representations, although I am sure that there are good political reasons for everything...
yours
trevor
Trevor Paterson PhD
t.paterson at napier.ac.uk
School of Computing
Napier University
Merchiston Campus
10 Colinton Road
Edinburgh
Scotland
EH10 5DT
tel: +44 (0)131 455-2752
www.dcs.napier.ac.uk/~cs175
www.prometheusdb.org