tdwg-content
March 2004
- 5 participants
- 12 discussions
Dear all,
I am working in the Bioinformatics Center as a librarian. I am pursuing my
Ph.D. in LIS and its contribution to the field of biodiversity informatics. I
have conducted a survey of the information needs of mycologists. I would
appreciate it if you would go through the following URL and help me in getting
the information for the database structure.
http://bioinfo.ernet.in/library/fungi/survey.htm
OR
http://shubha.0catch.com
These URLs have a Word file of my survey. I would appreciate it if you would
go through it and send me the completed questionnaire at
shubha(a)bioinfo.ernet.in / shubhanagarkar(a)netscape.net /
shubha_nagarkar(a)hotmail.com
Thanking you in advance.
Sincerely
shubha nagarkar
http://bioinfo.ernet.in/library/personal/main.html
(Further thoughts on problems I see in structural decompositions).
The main unknown problem I see is how to express variability and lack
or uncertainty of knowledge about structural observations. This may
be exceptional in the extremely well studied higher plants, but not
so in small organisms that are often only imperfectly studied yet.
An example is "Chlorophyll in stems or leaves". You basically suggest
reformulating it to fit a given model that is considered more
effective for identification. In higher plants the assumption would
be that since the information is not available in the combined
statement, this involves some literature research, and perhaps
looking at a herbarium specimen. However, already in higher plants in
a key to families this is far from trivial, and in many organism
groups this imprecise synthetic statement may be the only knowledge
available.
Very similar cases occur very frequently for temporal or spatial
modifications (which may be "regions" or not).
Also, I think you suggest in addition to Leaf-Chlorophyll Presence
and Stem-Chlorophyll Presence
to have:
Entire Plant - Colour - specify any colour
Which I believe would truly have to be above-ground vegetative parts
(I assume that the root color would not be considered, although roots
are included in entire plant). Providing these combined sub-
partitions (do you do that in the Prometheus model?) is necessarily a
choice among multiple combinations possible (for vascular bundle
characters combining root and stem may be desirable) that can not be
fit into a single hierarchy (do you provide multiple hierarchies?).
I do agree that better structured models have analytical advantages.
So far our belief is that currently no better model is well tested
and that limitations in the range of possible statements (which
naturally do exist in the SDD model as well) need careful
consideration and testing. In that case the attempt of Prometheus to
do exactly such testing (i.e. not only to design an ontology model,
but to record data and test its use in identification processes) is
extremely valuable for the future.
My own goal would be to allow less structured descriptions to be
expressed in SDD, but to support working ("voluntarily") in a model
that is highly and systematically structured. Applications may have
"modes" for rigorous or lenient behavior, but not the exchange
format.
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn(a)bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203
Often wrong but never in doubt!
> The approach that we are proposing is that the descriptions are
> collected as atomic statements, and more traditional 'characters' can
> be discovered by analysis of this data (many characters are apparently
> a collection of atomic scores/states)
>
> Our taxonomists find this quite a departure from how they compose and
> record their characters at the moment ( they recognize/discover and
> define a set of characters by looking at the variation that exists in
> their specimen, then create a scoring sheet/proforma that allows them
> to pick one of these alternative characters) - our system might be
> tweaked to allow them to work in a more character oriented manner if
> they precompose sets of statements as part of the proforma
> specification, and then score these alternates as present or absent.
I believe when you say "character" in the above you mean what in
DELTA and SDD is considered a character "state"; is that correct? Do
you consider the states to be completely unrelated, i.e. green and
red and hairy and present are all on the same level? Or is this the
collection of states the property or type? This is something I don't
yet understand.
In many aspects, SDD and DELTA treat states as potentially
independent, i.e. characters have semantics, but are otherwise sets
of states. The important deviation is the coding status itself, which
always applies to the entire character.
Can you say in your model that a property is inapplicable or unknown?
> a major advantage of our system can be seen from some of your simple
> characters - eg growth habit: you have split this into two
> alternatives 1. Epiphytic or lithophytic habit vs 2. (not epiphytic
> or lithophytic) whilst this might make sense for a key, and is a
> DELTA-like representation, we would argue that if the ACTUAL growth
> habit was scored for each specimen as epiphytic, lithophytic,
> terrestrial, aquatic ( or concatenations of these ) far more accurate
> information would be recorded. For example, this would allow the same
> specimen description to be divided into other character sets if
> desired ( someone else may think that a key would work better if the
> alternates were soil-dwelling or lithophytic vs epiphytic, another
> person might want the alternates separately....). If the description data
> had been recorded in the original two-alternate-character division,
> this data reuse would not be possible.
I think potentially the point you are making is very important, i.e.
that for some (not all) "state sets" there may be different
partitionings motivated by different goals. One potentially can split
color into metallic colors, bright colors, earthen/brownish colors.
However, I think the example above is not well chosen (I assume you
refer to Kevin's General habit and Epiphytic or lithophytic habit). It
is quite possible to have combinations of these. Epiphytic plants may
be climber, herb, or grass- or sedge-like plant.
> I hope this shows some of the salient features of our model...and how
> we think it would benefit working taxonomists.
(The rest of the email is unfortunately not readable to me, you seem
to use some Microsoft Office or Outlook specific feature to construct
the table)
>
> LUCID CHARACTERS
> STRUCTURE
> PROPERTY/
> STATEGROUP
> STATES
>
> Salt tolerance
> · plants tolerating high salt levels (halophytes)
> · plants not salt tolerant
>
> Entire Plant
> Ecological Adaptations
> Halophytic
> (there are a list of alternate states that could be scored, or
> NOT-halophytic is allowed)
>
> General habit
>
and so on
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn(a)bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203
Often wrong but never in doubt!
(Note: I am trying to split the discussion "SDD Schema in
relationship to Prometheus_Response to Kevin"; please respond to
this).
> The good news I suppose is that since SDD is (hopefully) more general,
> there should be no problem rendering your data in SDD. The more
> interesting question is whether your data *model* (the part-structure
> stuff) would come out the other end of an SDD roundtrip.
>
> It should be possible, if you represent your part-structure stuff in a
> character hierarchy, and assume after roundtripping that the hierarchy
> you get back can still be interpreted in the same way. That is, you
> would need to check whether the hierarchy you receive conforms with
> your model (and I suppose reject it if it does not).
I am very interested in how well this can be achieved. The character
hierarchies contain enumerated attributes that attempt to describe
what kind of hierarchy a given tree represents. This may be a purely
presentational hierarchy, or a part/structure, a property, a method.
Please compare ConceptTreeDefType/Type with enumerated list in the
schema. Is this appropriate/sufficient?
In addition to these we have all the time recognized the need for more
exact definition of descriptive objects, and we call the objects that
should achieve this Glossary/GlossaryEntries. Despite being within
terminology, and next to characters they can be used fully separate
to describe
Two of Trevor's questions in the word document regarding the Glossary
were:
1. "I am also unsure as to how well defined the terminology MUST be
within SDD (e.g. does a glossary entry require a definition and
authority - or could bare words with no definition or relationship
to other terms be included)"
2. "I am still quite confused about HOW PRECISELY a terminology CAN
be defined within a SDD document. e.g. how accurately can a term be
defined in terms of relationships with others, I understand that
these relationships are captured by the concept trees, although again
a (visual) model detailing this would clarify."
Ad 1: Glossary entries are always optional, otherwise it would be
impossible to express DELTA / Lucid / etc. legacy data in SDD. If a
glossary entry exists, it currently must provide a head term
(formulated independently, and perhaps redundantly from the
character, state, modifier, hierarchical concept etc. it refers to)
and either of a free-form text definition or a URI pointing to such a
definition on the web. The "ontological" relationships are optional.
The assumption is that a glossary will "grow" over time while a
project matures.
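The rule in "Ad 1" can be illustrated with a minimal Python sketch. This is not the SDD schema: the class name GlossaryEntry and its field names are hypothetical, and the sketch assumes that "either of" means at least one of the two definition forms must be present.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GlossaryEntry:
    """Hypothetical stand-in for a glossary entry (not the SDD schema)."""
    head_term: str
    definition: Optional[str] = None        # free-form text definition
    definition_uri: Optional[str] = None    # URI pointing to a definition
    part_of: List[str] = field(default_factory=list)  # optional relationships

    def problems(self) -> List[str]:
        """Return the reasons this entry would be rejected, if any."""
        issues = []
        if not self.head_term.strip():
            issues.append("missing head term")
        # Assumption: at least one of the two definition forms is required.
        if not (self.definition or self.definition_uri):
            issues.append("needs a free-form definition or a URI")
        return issues


ok = GlossaryEntry("stolon", definition="A horizontal stem that roots at its nodes.")
bare = GlossaryEntry("stolon")
print(ok.problems())    # no problems reported
print(bare.problems())  # reports the missing definition
```

The ontological relationships (here part_of) stay empty by default, which matches the idea that a glossary can grow while a project matures.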
(Trevor: your argument about rigorous definition of data is quite
valid. I would assume that either the application controls that, or
that somebody derives a local version of the SDD schema for fully
schema driven data entry. However, quality enforcement cannot be an
issue for a schema focussing on data exchange. We enforce
completeness questions only if we believe not doing so would break
application interoperability. We do rigorously enforce consistency
questions such as that a state must be defined before using it).
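As an illustration of the kind of consistency question meant here, the following Python sketch reports every scored state that the terminology never defined. The function name and the data layout are hypothetical; in SDD itself such a rule would be expressed in the schema, not in application code.

```python
def undefined_state_uses(defined_states, scored):
    """Return every scored (item, character, state) whose state is not defined.

    defined_states: {character: set of state names} taken from the terminology
    scored: iterable of (item, character, state) tuples from the descriptions
    """
    return [
        (item, character, state)
        for item, character, state in scored
        if state not in defined_states.get(character, set())
    ]


# Hypothetical terminology and description data.
defined_states = {
    "Seasonal longevity": {"annual, biennial or ephemeral", "perennial"},
}
scored = [
    ("taxon A", "Seasonal longevity", "perennial"),
    ("taxon B", "Seasonal longevity", "evergreen"),  # state never defined
    ("taxon C", "Leaf shape", "ovate"),              # character never defined
]
errors = undefined_state_uses(defined_states, scored)
for item, character, state in errors:
    print(f"{item}: state '{state}' is not defined for '{character}'")
```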
Ad 2. I cannot answer this, other than that we try to provide both
operational and ontological/fundamental means of expression. We
believe that much can be expressed, but these are only our scenarios.
Thus:
TO ALL: Please do provide scenarios you would like to try to express,
which you think may be difficult. We urgently need such examples!
I am rather worried about the relationship between these two
mechanisms in SDD. I believe there is a very good point in separating
fundamental ontology from operational definitions, in fact given our
task to be able to also represent legacy data (including printed
information) it is unavoidable.
However, there is clearly much overlap in creating character
hierarchies expressing a part-hierarchy, and having an unconstrained
method to express any part hierarchy in the Glossary/*/PartOf method.
One current result of this is, that we are not yet expressing
Property/State relations in Glossary; these only exist because
Glossary and Character/states may be linked. It would probably be a
good idea to separate them as well. Any opinions on that?
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn(a)bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203
Often wrong but never in doubt!
Hi Kevin
looking over the character lists you use for keys I would say that they can
all be expressed by decomposing into a number of atomic statements ( our
angiosperm ontology has not yet included some of the structures and states
required, but this is not a problem as the terminology is readily
expandable) - I have briefly outlined the sort of mapping that is done in
the table below,
(note - I am not a botanist...so there may be some gaffes ;-).
The approach that we are proposing is that the descriptions are collected as
atomic statements, and more traditional 'characters' can be discovered by
analysis of this data (many characters are apparently a collection of atomic
scores/states)
Our taxonomists find this quite a departure from how they compose and record
their characters at the moment ( they recognize/discover and define a set of
characters by looking at the variation that exists in their specimen, then
create a scoring sheet/proforma that allows them to pick one of these
alternative characters) - our system might be tweaked to allow them to work
in a more character oriented manner if they precompose sets of statements as
part of the proforma specification, and then score these alternates as
present or absent.
a major advantage of our system can be seen from some of your simple
characters - eg growth habit: you have split this into two alternatives 1.
Epiphytic or lithophytic habit vs 2. (not epiphytic or lithophytic) whilst
this might make sense for a key, and is a DELTA-like representation, we
would argue that if the ACTUAL growth habit was scored for each specimen as
epiphytic, lithophytic, terrestrial, aquatic ( or concatenations of these )
far more accurate information would be recorded. For example, this would
allow the same specimen description to be divided into other character sets
if desired ( someone else may think that a key would work better if the
alternates were soil-dwelling or lithophytic vs epiphytic, another person
might want the alternates separately....). If the description data had been
recorded in the original two-alternate-character division, this data reuse
would not be possible.
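The reuse argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Statement and score_character are hypothetical names, the specimen data are invented, and nothing here is taken from the Prometheus or SDD implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One atomic score: a state recorded for a property of a structure."""
    specimen: str
    structure: str
    prop: str
    state: str


# Hypothetical specimen scores, recording the actual substrate observed.
statements = [
    Statement("specimen-1", "Entire Plant", "Preferred Substrate", "epiphytic"),
    Statement("specimen-2", "Entire Plant", "Preferred Substrate", "terrestrial"),
    Statement("specimen-3", "Entire Plant", "Preferred Substrate", "lithophytic"),
    Statement("specimen-3", "Entire Plant", "Preferred Substrate", "epiphytic"),
]


def score_character(statements, prop, partition):
    """Derive a key character from atomic statements.

    partition maps each alternative label to the set of states it covers.
    Returns {specimen: set of alternative labels that apply}.
    """
    scores = {}
    for s in statements:
        if s.prop != prop:
            continue
        for label, states in partition.items():
            if s.state in states:
                scores.setdefault(s.specimen, set()).add(label)
    return scores


# Two different partitionings of the same recorded data.
key_a = {
    "epiphytic or lithophytic": {"epiphytic", "lithophytic"},
    "not epiphytic or lithophytic": {"terrestrial", "aquatic"},
}
key_b = {
    "soil-dwelling or lithophytic": {"terrestrial", "lithophytic"},
    "epiphytic": {"epiphytic"},
}

scores_a = score_character(statements, "Preferred Substrate", key_a)
scores_b = score_character(statements, "Preferred Substrate", key_b)
```

Going the other way is not possible: a specimen scored only as "epiphytic or lithophytic" cannot be assigned to either alternative of key_b without returning to the specimen.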
I hope this shows some of the salient features of our model...and how we
think it would benefit working taxonomists.
LUCID CHARACTERS (with their states), each followed by the corresponding
STRUCTURE | PROPERTY/STATEGROUP | STATES rows

Salt tolerance
· plants tolerating high salt levels (halophytes)
· plants not salt tolerant
    Entire Plant | Ecological Adaptations | Halophytic
    (there is a list of alternate states that could be scored, or
    NOT-halophytic is allowed)

General habit
· tree
· shrub
· climber (woody or herbaceous)
· herb
· grass- or sedge-like plant
    Entire Plant | Habit | Tree, Shrub, Herb etc. are scorable (or the negative)
    Entire Plant | Architecture | Climbing, Bushy, creeper, Twining etc
    (We can collect more specific data by scoring more states for additional
    properties)

Epiphytic or lithophytic habit
· plants growing in soil (not epiphytic or lithophytic)
· plants growing on other plants or on bare rock surfaces (epiphytic or
  lithophytic)
    Entire Plant | Preferred Substrate | Epiphytic, Aquatic, Lithophytic, Terrestrial

Habit (aquatic herbs only)
· free-floating
· rooted in substrate with leaves all or mostly submerged
· rooted in substrate with leaves mostly floating on the water surface
· rooted in substrate with leaves mostly emergent above the water surface
    Root | Root attachment | free-floating, substrate-attached
    (we don't have appropriate terms etc for these states in our ontology as
    yet - but they could be added)
    Leaf | Aquatic Position | floating, submerged, emergent

Seasonal longevity
· annual, biennial or ephemeral
· perennial
    Entire Plant | Lifespan | Annual, Biennial, ephemeral, perennial

Seasonality of leaves (woody plants)
· evergreen
· deciduous or semi-deciduous
    Leaf | Lifespan | deciduous, semi d., evergreen

Structures for spreading vegetatively
· none (plants not spreading vegetatively)
· underground bulbs, corms or tubers etc
· rhizomes, stolons or root-suckers
· detached aerial stem parts, or proliferous flowerheads
    Entire Plant | sex and reproduction | vegetative
    (list of alternatives, or use NOT)
    Bulb | Presence | present, absent
    Corm | Presence | present, absent
    Tuber | Presence | present, absent
    Rhizome | Presence | present, absent
    Stolon | Presence | present, absent
    Root-sucker | Presence | present, absent
    detached aerial stem parts | Presence | present, absent
    bulbils | Presence | present, absent
    inflorescence | Type | proliferous
    (we can identify 'types' of structures, with associated sets of states;
    aerial stem parts might be a type of stem)

Chlorophyll in stems or leaves
· present (plants green or grey-green)
· absent (plants colourless, white or yellowish)
    Leaf-Chlorophyll | Presence | present, absent
    (uses our structure hierarchy to identify which chlorophyll we are
    describing)
    Stem-Chlorophyll | Presence | present, absent
    Entire Plant | Colour | specify any colour

Nutritional strategy
· neither carnivorous nor parasitic (normal plants)
· partially or totally parasitic on other plants
· carnivorous
    Entire Plant | Habit-Lifestyle | carnivorous, parasite, partial parasite, etc
    (any combination of states including NOT can be allowed)

Trap structures (carnivorous plants only)
· submerged or underground bladders
· pitcher-traps
· sticky glands or glandular hairs on leaves and/or stems
· trap like irritable leaf blade segments
    We haven't had to address trap yet but we have a number of ways in which
    the terminology can be expanded to represent this information....
    · we don't have 'trap' as a structure in our ontology yet - we could add
      trap structure in various structural contexts, and allow scoring
      presence or absence.
    · we can add the presence of hairs or glandular hairs anywhere - and
      again score presence/absence
    · we would have to add some states to the ontology - e.g. irritable
Trevor Paterson PhD
t.paterson(a)napier.ac.uk
School of Computing
Napier University
Merchiston Campus
10 Colinton Road
Edinburgh
Scotland
EH10 5DT
tel: +44 (0)131 455-2752
http://www.dcs.napier.ac.uk/~cs175
http://www.prometheusdb.org
>>>-----Original Message-----
>>>From: Kevin Thiele [ mailto:kevin.thiele@BIGPOND.COM ]
>>>Sent: 16 March 2004 22:59
>>>To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
>>>Subject: Re: SDD Schema in relationship to
>>>Prometheus_Response to Kevin
>>>
>>>
>>>Trevor,
>>>
>>>I agree that there will be considerable congruence between SDD and
>>>Prometheus, imposed by the needs of the system. The main difference is in
>>>our decision early on not to constrain a taxonomist's ability to express
>>>any necessary differences in SDD.
>>>
>>>We looked at a prescribed part-region-property-state data model such as
>>>you've developed for Prometheus early on, and rejected it in favour of a
>>>more general, simple character-state model (such as DELTA had; with the
>>>extension that characters may optionally be arranged into character
>>>hierarchies).
>>>
>>>For instance, in an interactive key that I published a few
>>>years ago (The
>>>Families of Flowering Plants of Australia, in Lucid), I have
>>>(amongst many
>>>others) the following characters:
>>>
>>>Salt tolerance
>>> plants tolerating high salt levels (halophytes)
>>> plants not salt tolerant
>>>
>>>General habit
>>> tree
>>> shrub
>>> climber (woody or herbaceous)
>>> herb
>>> grass- or sedge-like plant
>>>
>>>Epiphytic or lithophytic habit
>>> plants growing in soil (not epiphytic or lithophytic)
>>> plants growing on other plants or on bare rock surfaces
>>>(epiphytic or
>>>lithophytic)
>>>
>>>Habit (aquatic herbs only)
>>> free-floating
>>> rooted in substrate with leaves all or mostly submerged
>>> rooted in substrate with leaves mostly floating on the
>>>water surface
>>> rooted in substrate with leaves mostly emergent above the
>>>water surface
>>>
>>>Seasonal longevity
>>> annual, biennial or ephemeral
>>> perennial
>>>
>>>Seasonality of leaves (woody plants)
>>> evergreen
>>> deciduous or semi-deciduous
>>>
>>>Structures for spreading vegetatively
>>> none (plants not spreading vegetatively)
>>> underground bulbs, corms or tubers etc
>>> rhizomes, stolons or root-suckers
>>> detached aerial stem parts, bulbils or proliferous flowerheads
>>>
>>>Chlorophyll in stems or leaves
>>> present (plants green or grey-green)
>>> absent (plants colourless, white or yellowish)
>>>
>>>Nutritional strategy
>>> neither carnivorous nor parasitic (normal plants)
>>> partially or totally parasitic on other plants
>>> carnivorous
>>>
>>>Trap structures (carnivorous plants only)
>>> submerged or underground bladders
>>> pitcher-traps
>>> sticky glands or glandular hairs on leaves and/or stems
>>> trap like irritable leaf blade segments
>>>
>>>I don't see how it would be possible to represent these in
>>>your data model,
>>>but as a taxonomist I need to represent them for my purpose.
>>>There would be
>>>no problem with these in SDD.
>>>
>>>The good news I suppose is that since SDD is (hopefully)
>>>more general, there
>>>should be no problem rendering your data in SDD. The more interesting
>>>question is whether your data *model* (the part-structure
>>>stuff) would come
>>>out the other end of an SDD roundtrip.
>>>
>>>It should be possible, if you represent your part-structure stuff in a
>>>character hierarchy, and assume after roundtripping that the hierarchy you
>>>get back can still be interpreted in the same way. That is, you would need
>>>to check whether the hierarchy you receive conforms with your model (and I
>>>suppose reject it if it does not).
>>>
>>>Cheers - k
>>>
>>>----- Original Message -----
>>>From: Paterson, Trevor
>>>To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
>>>Sent: Tuesday, March 16, 2004 10:36 PM
>>>Subject: Re: SDD Schema in relationship to
>>>Prometheus_Response to Kevin
>>>
>>>
>>>Kevin
>>>Thanks for your reply, it is becoming much clearer to me that actually a lot
>>>of our thoughts are convergent ( probably because we are all thinking about
>>>the same issues.... ). You have clarified a lot of points, and I have added
>>>a little more clarification below....
>>>It looks like a worthwhile task would be to try and represent our angiosperm
>>>terminology in SDD format at some stage ( time permitting etc...as ever).
>>>This is probably more straightforward than representing our descriptive data
>>>according to SDD as our underlying data model is quite different in some
>>>aspects ( I think!!!).
>>>cheers
>>>Trevor
>>>
>>>
>>>Trevor Paterson PhD
>>>t.paterson(a)napier.ac.uk
>>>School of Computing
>>>Napier University
>>>Merchiston Campus
>>>10 Colinton Road
>>>Edinburgh
>>>Scotland
>>>EH10 5DT
>>>tel: +44 (0)131 455-2752
>>>www.dcs.napier.ac.uk/~cs175
>>>www.prometheusdb.org
>>>-----Original Message-----
>>>From: Kevin Thiele [ mailto:kevin.thiele@BIGPOND.COM ]
>>>Sent: 15 March 2004 22:33
>>>To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
>>>Subject: Re: SDD Schema in relationship to Prometheus
>>>
>>>
>>>Apologies: The previous post from me with this title was an
>>>unfinished
>>>version sent off prematurely by my email editor. Please
>>>ignore and use this
>>>one instead.
>>>
>>>-----------------------------
>>>
>>>Hi Trevor - thanks very much for your comments and
>>>comparative document -
>>>this is really useful, and we need to get much more feedback
>>>like this.
>>>
>>>The main difference between SDD and Prometheus seems to be
>>>that you are
>>>working specifically on the basis of defining a controlled
>>>terminology
>>>whereas SDD explicitly decided early on that a controlled
>>>terminology was
>>>outside our scope. History will judge which approach is best.
>>>[Paterson, Trevor]
>>>a controlled terminology was not such a large feature of our work
>>>initially - we were more interested in the model for saving 'character'
>>>data - however it became obvious that the only way to allow unambiguous
>>>interpretation of data - for reuse, comparison etc - was to provide full
>>>definitions. It then seemed desirable that people would share definitions to
>>>allow compatibility.......whether this will be achieved by bottom up
>>>adoption is an open question. Taxonomists don't seem to like the idea of top
>>>down imposition - tho they may be happier when it is restricted to quite a
>>>small domain of users
>>>
>>>We did have early discussions about a controlled terminology (see the list
>>>archives for a history of this). One difficulty for us is that SDD is designed
>>>to be biology-wide (indeed, we have even removed specific references to
>>>biology, such as "taxon", because SDD is equally applicable to descriptions
>>>of non-taxa such as diseases, nutrient deficiency syndromes, soils and
>>>minerals. Perhaps here we have drawn our bow too wide, but we were informed
>>>by the fact that at our Lisbon meeting all but one of the contributors who
>>>were working with identification tools had removed their biology-specific
>>>tags to become more general). Prometheus (as I understand it from your
>>>document) is specifically botanical. This would be an intolerable
>>>restriction for us given our brief.
>>>[Paterson, Trevor]
>>>We are constrained by the expertise of whoever we are
>>>collaborating with...
>>>the taxonomists at RBGE are full partners in this project so
>>>the 'test
>>>domains' reflect their interests and expertise ( or we will
>>>never get real
>>>test data). We hope that our character model will be
>>>applicable to the whole
>>>field of biological taxonomy - - and that specific
>>>ontologies/terminologies
>>>could be developed to allow description of other groups (
>>>mammals, insects
>>>etc)
>>>
>>>Obviously, a botany-wide controlled terminology is more
>>>achievable than a
>>>biology-wide one. Personally, however, I think that you run
>>>the danger even
>>>in botany with any controlled terminology of trying to force
>>>nature kicking
>>>and screaming into small boxes, and do it an injustice
>>>therewith. I don't
>>>know how any botany-wide controlled terminology could cope
>>>with the leaves
>>>of Drosera auriculata, for instance, or the morphology of
>>>Podostemaceae. (In
>>>fact, I wonder whether the dream of a controlled terminology
>>>is more likely
>>>in a cold Northern Hemisphere climate than in the biodiverse South or
>>>tropics?).
>>>[Paterson, Trevor]
>>>Yes - we know that diverse taxa would probably require specific ontologies.
>>>We may be able to develop a system that allows a core central terminology -
>>>with taxon specific extensions.....We want to allow MEANINGFUL comparison
>>>of data - and often there is no need or sense in comparing data across
>>>widely divergent taxa, i.e. you might want to compare the properties of
>>>stalks on angiosperm flowers, but it is probably of no taxonomic interest to
>>>compare these with the stalks of a slime mould fruiting body...............
>>>
>>>In general, we have taken the view that a controlled terminology in
>>>particular domains (e.g. legumes) may develop as an emergent
>>>property of
>>>SDD, rather than imposed top-down.
>>>[Paterson, Trevor]
>>>Yes - this is the working model we have come round to...users develop an
>>>ontology and share it with colleagues in a closely related field etc....
>>>
>>>On more specific points from your document:
>>>
>>>Complexity: SDD was scoped to be a superset of existing systems and
>>>standards e.g. DELTA, Lucid, DeltaAccess, and also to accommodate future
>>>developments that those of us working in the field can envisage but no-one's
>>>really done yet (particularly federation issues - and you may be further
>>>down this track than we are). This is part of the reason for the complexity,
>>>
>>>>It is not clear to me whether SDD is proposing this schema as a
>>>>unifying schema to which different description formats would map their
>>>>own schema, or whether the SDD schema is being proposed as a schema for
>>>>developers to (partially) implement when designing applications and
>>>>repositories for capturing descriptive data.
>>>
>>>It is designed as a unifying standard, to allow lossless
>>>roundtripping
>>>between applications. At the same time, we are struggling
>>>with how much
>>>should be mandatory and how much optional (your second option)
>>>
>>>>From our own collaborative experiences with botanical taxonomists, data
>>>>models and structures hold no interest to them in practice, and they find
>>>>even our simple conceptual model of character description complex to
>>>>understand. Probably few working taxonomists would wish to interact at any
>>>>level with the SDD schema and applications would have to achieve this
>>>>mapping transparently.
>>>
>>>On this I'm sure you're right, and we have had many
>>>discussions within SDD
>>>about this problem. There are differing views as to the importance of
>>>taxonomists themselves coming to grips with SDD, as the
>>>standard itself will
>>>generally be invisible to a taxonomist using an
>>>SDD-compliant application.
>>>[Paterson, Trevor]
>>>The problem is that someone has to write and implement the
>>>applications (
>>>e.g. me !) - and they have to do this in collaboration with
>>>taxonomists - or
>>>it will be perceived as an irrelevant imposition -
>>>-therefore it will always
>>>be necessary to get at least some taxonomists from a
>>>variety of fields to
>>>understand the schema...
>>>>>From my perspective actually it seems much easier for
>>>computer scientists to
>>>get to grips with taxonomy rather than vice versa - (
>>>although taxonmists
>>>would be a bit defensive about this....)
>>>
>>>Translation and multiple language representations: allowing multiple
>>>languages is seen as a fundamental part of the SDD brief.
>>>Life would indeed
>>>be much simpler if everyone spoke the same language, but
>>>they don't so we
>>>need to handle that.
>>>
>>>>It is not clear whether SDD proposes that a single document
>>>can include
>>>multiple language representations, or whether these
>>>>would form separate documents, conforming to the same standard
>>>
>>>SDD can handle multiple language representations of every
>>>character string
>>>within the one document.
>>>[Paterson, Trevor]
>>>Still not convinced allowing multiple language
>>>representations would mace
>>>for good/accurate science
>>>
>>>Multiple expertise levels
>>>
>>>>I am similarly suspicious of the necessity for including
>>>the ability for
>>>recording different expertise levels in one document format.
>>>>Is SDD proposing/allowing multiple representations within
>>>the same document
>>>: or just that the same format/standard can be
>>>>used for documents aimed at different expertise level.
>>>>
>>>>There clearly is value in being able to extract/translate
>>>simple language
>>>descriptions from complex data resources - as is
>>>>necessary for compiling flora and keys from monographs and original
>>>descriptions. However, is including the ability to describe
>>>>descriptive data in language suitable for primary
>>>schoolchildren relevant
>>>to an accurate scientific database of taxonomic data.
>>>>[Again this would appear to be a political requirement??]
>>>
>>>This is not a political requirement, but an attempt to broaden the
>>>application of taxonomy beyond taxonomists (surely a
>>>requirement if the
>>>taxonomic crisis is to be resolved). It also derives neatly
>>>from the XML
>>>underpinning (XML is based on the idea of multiple
>>>representations of a
>>>single document)
>>>[Paterson, Trevor]
>>>Yes I see where you are coming from - i think the probelm is that
>>>taxonomists are concerned about accurate representation of
>>>their data for
>>>their purposes - making a shareable version of this is not
>>>perceived as of
>>>any value to them - they want a scientific tool to do their
>>>job - i am not
>>>sure how /if we could encourage a dual markup approach - or whether
>>>different 'markets' for descriptive data would exist independently -
>>>
>>>Defining the descriptive terminology
>>>
>>>>Are you suggesting that the SDD Terminology Section will be
>>>adequate and
>>>appropriate to store
>>>>and represent any (allowed) defined terminology?
>>>
>>>Yes, we hope so. Do you think it will be inadequate?
>>>[Paterson, Trevor]
>>>It probably would be adequate to store Prometheus
>>>terminologies - with
>>>minor tinkering, the semantics of the terminolgies could be
>>>saved just with
>>>glossary entries and relationships etc - but would obviously require
>>>interpetation by suitable applications. I cant understand
>>>concept trees well
>>>enough to get a feel for whether they could store the semantics of a
>>>terminolgy more explicitly.....
>>>However, that is not to say SDD could cope with other
>>>terminolgies that
>>>might have further unforseen relationships - there might need to be a
>>>facility for recording 'user defined' relationships such as
>>>our stategroup
>>>membership and restrictions between these and sets of
>>>structures - i think
>>>that these type of relationships are representable in
>>>concept trees - but
>>>would need a primer/ turotial to show me how.
>>>
>>>>Is the standard going to allow descriptions to reference
>>>other defined
>>>terminologies?
>>>
>>>It will be possible to outsource the terminology section, so
>>>if a group
>>>creates a controlled vocabulary, that could be referenced in
>>>multiple SDD
>>>documents. So presumably Prometheus could be the source of a
>>>controlled
>>>vocabulary that other users (of they found it adequate)
>>>could reference.
>>>
>>>>Would SDD only accept Data marked up in an SDD terminology?
>>>
>>>Yes - or do I misunderstand this question?
>>>[Paterson, Trevor]
>>>If i understand your previous remark - you could have
>>>completely external
>>>termnologies - that SDD 'knows' nothing about the structure
>>>and semantics -
>>>so it would have to accept 'non-SDD' terminology
>>>
>>>>Would existing terminologies have to be
>>>translated/mapped/redescribed in
>>>SDD format?
>>>
>>>Any existing terminology can be represented in SDD, so there
>>>will be no
>>>remapping necessary.
>>>[Paterson, Trevor]
>>>Again this will only be knowable if people try doing it.
>>>Obviously if there
>>>is a standard structure people would be encouraged to use it - but
>>>prexisting description ontologies ( eg PlantOnology and
>>>GeneOntology) would
>>>not want to retrofit to the standard....even if there is no
>>>' rdesign or
>>>remapping necessary - there is the matter of re-expressing
>>>it in an SDD
>>>format
>>>
>>>>Who is going to create terminologies, e.g.gusers on an
>>>adhoc basis, or
>>>expert user groups?
>>>
>>>As above, these may develop particularly for some groups (e.g. ferns,
>>>legumes), but some users may choose to stay outside such a
>>>system (there
>>>will be benefits and costs of using a controlled vocabulary,
>>>so people will
>>>have to weigh it up for themselves). SDD itself is agnostic.
>>>[Paterson, Trevor]
>>>Almost our opinion - but we are not agnostic - we believe
>>>'interpretability'
>>>and 'reusability' are Gods, and we should proselytize on
>>>their behalfs -
>>>gently of course - by offering only carrots and no sticks...
>>>
>>>>Is it an aim to promote re-use and sharing of terminologies?
>>>
>>>It would be a desirable outcome, but we hope it evolves
>>>bottom-up rather
>>>than being imposed top-down.
>>>[Paterson, Trevor]
>>>Sure
>>>
>>>>Is there going to be policing of SDD terminologies, e.g. maintaining
>>>versioning, additions etc?
>>>
>>>Versioning will be handled within SDD, but there can be no
>>>possibility of
>>>policing a system
>>>[Paterson, Trevor]
>>>This what really worries the taxonomists - imposed standards
>>>reducing the
>>>flexibility and expressivity of their descriptions - and
>>>hatred of a police
>>>state. I think standardisation WILL be perceived to lead to a loss of
>>>expressivity - but as often one person's expressivity is another'; s
>>>incomprehensivity to an outsider this seems like potentially
>>>'a good thing'
>>>( in small doses obviously)........
>>>
>>>
>>>>How was the terminology section created - by examining examples of
>>>terminology specifications,
>>>>ontology representations etc?
>>>
>>>We have had a mix of off-the-top-of-the-head speculation as
>>>to how best to
>>>do things, and proofing of concepts against real-world
>>>examples. I would
>>>have liked to see more proofing going on during development,
>>>but this has
>>>been hard to maintain, and may be to our cost. We are nbow
>>>at a phase where
>>>several groups are trting to implement SDD-compliance for
>>>their systems -
>>>this will be the proof of the pudding.
>>>
>>>Note also that SDD is currently v0.9 - with the explicit statement on
>>>release that everything may change if we find that the
>>>proofing fails.
>>>
>>>>Does it form a standard template for storing a terminology?
>>>
>>>What do you mean by this. It provides the standard schema
>>>for representing
>>>an (undefined) terminology.
>>>
>>>>Is it compatible with any existing tools, standards or
>>>formats - e.g.
>>>ontology editors?
>>>
>>>We have specifically made it very general. There is
>>>currently no existing
>>>tool that can handle SDD. Hopefull this will change shortly.
>>>[Paterson, Trevor]
>>>Keep me posted
>>>
>>>----------------------------------
>>>So does Prometheus have any data yet, or is it still at the
>>>model stage? It
>>>would be very interesting to try representing Prometheus data in SDD.
>>>[Paterson, Trevor] to date we have
>>>the model ( and a data base implementation to save model
>>>compliant data)
>>>a tool for creating simple terminologies - with definitions, PartOf,
>>>Typeof, Stategroup relations etc
>>>a prototype angiosperm terminology/ontology
>>>a prototype tool for using onologies to specify project description
>>>templates ( proformas)
>>>which also allows recordng of specimen descriptions
>>>the ability to save these descriptions to the database in the model
>>>compliant format
>>>and we are currently getting user testing done on the
>>>prototype by the
>>>taxonomists, who are making proformas adn saving specimen
>>>descriptions. We
>>>are just about ready to use the tool for a 'real' project if
>>>we can get time
>>>and interest to do this...eg for a small taxonomic revision.
>>>We are rapidly
>>>approaching the end of our project funding however.. so it
>>>may switch to
>>>being a 'back burner' project....
>>>
>>>
>>>
>>>Cheers - k
>>>----- Original Message -----
>>>From: Paterson, Trevor
>>>To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
>>>Sent: Monday, March 15, 2004 9:41 PM
>>>Subject: SDD Schema in relationship to Prometheus
>>>
>>>
>>>Gregor
>>>
>>>I have written a rough document considering several aspects of the
>>>SDD-schema - largely interpreted with reference to our
>>>Prometheus Database
>>>model for descriptive data. It seems easier to keep this all
>>>together,
>>>rather than post it to various sections on twiki, so i am
>>>attaching it here
>>>
>>>My main problems in interpreting the schema were the lack of
>>>documentation
>>>( as always...) especially for the conceptually complex
>>>parts like concept
>>>trees. I think clear, visual summary models for
>>>description, characters,
>>>concept trees etc would help a novice to get to grips with
>>>the concepts, and
>>>might make some of the complexities more tractable. I do
>>>worry that the
>>>overall schema is over complex and 'trying to do too much in
>>>one go' - eg
>>>considering multiple language and expertise representations,
>>>although I am
>>>sure that there are good political reasons for everything.....
>>>
>>>yours
>>>trevor
>>>
>>>
>>>Trevor Paterson PhD
>>>t.paterson(a)napier.ac.uk
>>>School of Computing
>>>Napier University
>>>Merchiston Campus
>>>10 Colinton Road
>>>Edinburgh
>>>Scotland
>>>EH10 5DT
>>>tel: +44 (0)131 455-2752
>>>www.dcs.napier.ac.uk/~cs175
>>>www.prometheusdb.org
>>>
17 Mar '04
Trevor,
I agree that there will be considerable congruence between SDD and
Prometheus, imposed by the needs of the system. The main difference is in
our decision early on not to constrain a taxonomist's ability to express any
necessary differences in SDD.
We looked at a prescribed part-region-property-state data model such as
you've developed for Prometheus early on, and rejected it in favour of a
more general, simple character-state model (such as DELTA had; with the
extension that characters may optionally be arranged into character
hierarchies).
For instance, in an interactive key that I published a few years ago (The
Families of Flowering Plants of Australia, in Lucid), I have (amongst many
others) the following characters:
Salt tolerance
    plants tolerating high salt levels (halophytes)
    plants not salt tolerant
General habit
    tree
    shrub
    climber (woody or herbaceous)
    herb
    grass- or sedge-like plant
Epiphytic or lithophytic habit
    plants growing in soil (not epiphytic or lithophytic)
    plants growing on other plants or on bare rock surfaces (epiphytic or
    lithophytic)
Habit (aquatic herbs only)
    free-floating
    rooted in substrate with leaves all or mostly submerged
    rooted in substrate with leaves mostly floating on the water surface
    rooted in substrate with leaves mostly emergent above the water surface
Seasonal longevity
    annual, biennial or ephemeral
    perennial
Seasonality of leaves (woody plants)
    evergreen
    deciduous or semi-deciduous
Structures for spreading vegetatively
    none (plants not spreading vegetatively)
    underground bulbs, corms or tubers etc
    rhizomes, stolons or root-suckers
    detached aerial stem parts, bulbils or proliferous flowerheads
Chlorophyll in stems or leaves
    present (plants green or grey-green)
    absent (plants colourless, white or yellowish)
Nutritional strategy
    neither carnivorous nor parasitic (normal plants)
    partially or totally parasitic on other plants
    carnivorous
Trap structures (carnivorous plants only)
    submerged or underground bladders
    pitcher-traps
    sticky glands or glandular hairs on leaves and/or stems
    trap like irritable leaf blade segments
I don't see how it would be possible to represent these in your data model,
but as a taxonomist I need to represent them for my purpose. There would be
no problem with these in SDD.
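As a rough illustration only, the simple character-state model with an optional character hierarchy could be sketched as below. This is not the SDD schema or the Lucid format; the class name, its fields and the grouping node "Habit and life form" are hypothetical, and only the two characters and their states come from the key quoted above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Character:
    """A character with free-text states, optionally holding sub-characters."""
    name: str
    states: List[str] = field(default_factory=list)
    children: List["Character"] = field(default_factory=list)

    def add(self, child: "Character") -> "Character":
        """Attach a sub-character and return it."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Character"]]:
        """Yield (depth, character) pairs, parents before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Hypothetical grouping node; the two characters below are from the key.
habit = Character("Habit and life form")
habit.add(Character("Salt tolerance", [
    "plants tolerating high salt levels (halophytes)",
    "plants not salt tolerant",
]))
habit.add(Character("Seasonal longevity", [
    "annual, biennial or ephemeral",
    "perennial",
]))

for depth, ch in habit.walk():
    print("  " * depth + ch.name, ch.states)
```

Nothing in this sketch constrains what a character name or a state may say, which is the point being made: a character such as "Salt tolerance" needs no part or structure to hang from.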
The good news I suppose is that since SDD is (hopefully) more general, there
should be no problem rendering your data in SDD. The more interesting
question is whether your data *model* (the part-structure stuff) would come
out the other end of an SDD roundtrip.
It should be possible, if you represent your part-structure stuff in a
character hierarchy, and assume after roundtripping that the hierarchy you
get back can still be interpreted in the same way. That is, you would need
to check whether the hierarchy you receive conforms with your model (and I
suppose reject it if it does not).
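The conformance check after a roundtrip might look something like the following sketch. It assumes, purely for illustration, that each node in the returned hierarchy carries a "kind" label and that the model can be reduced to a table of which kinds may sit under which; the `Node` class, the `ALLOWED_CHILDREN` table and the kind names are all hypothetical and are not taken from Prometheus or SDD.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical rule table: which kinds of node may appear under each kind.
ALLOWED_CHILDREN: Dict[str, Set[str]] = {
    "part": {"part", "region", "property"},
    "region": {"property"},
    "property": {"state"},
    "state": set(),
}


@dataclass
class Node:
    kind: str
    label: str
    children: List["Node"] = field(default_factory=list)


def violations(node: Node, path: str = "") -> List[str]:
    """Return a list of places where the hierarchy breaks the rule table."""
    here = f"{path}/{node.label}"
    if node.kind not in ALLOWED_CHILDREN:
        return [f"{here}: unknown kind {node.kind!r}"]
    problems: List[str] = []
    for child in node.children:
        if child.kind not in ALLOWED_CHILDREN[node.kind]:
            problems.append(
                f"{here}: {child.kind!r} not allowed under {node.kind!r}"
            )
        problems.extend(violations(child, here))
    return problems


# A hierarchy that fits the assumed model, and one that does not.
good = Node("part", "leaf", [
    Node("property", "shape", [Node("state", "ovate")]),
])
bad = Node("part", "plant", [Node("state", "halophyte")])

print(violations(good))
print(violations(bad))
```

An application receiving a hierarchy could accept it when the list is empty and reject it otherwise.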
Cheers - k
----- Original Message -----
From: Paterson, Trevor
To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
Sent: Tuesday, March 16, 2004 10:36 PM
Subject: Re: SDD Schema in relationship to Prometheus_Response to Kevin
Kevin
Thanks for your reply. It is becoming much clearer to me that a lot of our
thoughts are actually convergent (probably because we are all thinking about
the same issues). You have clarified a lot of points, and I have added a
little more clarification below.
It looks like a worthwhile task would be to try to represent our angiosperm
terminology in SDD format at some stage (time permitting, as ever). This is
probably more straightforward than representing our descriptive data
according to SDD, as our underlying data model is quite different in some
aspects (I think!).
cheers
Trevor
Trevor Paterson PhD
t.paterson(a)napier.ac.uk
School of Computing
Napier University
Merchiston Campus
10 Colinton Road
Edinburgh
Scotland
EH10 5DT
tel: +44 (0)131 455-2752
www.dcs.napier.ac.uk/~cs175
www.prometheusdb.org
-----Original Message-----
From: Kevin Thiele [mailto:kevin.thiele@BIGPOND.COM]
Sent: 15 March 2004 22:33
To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
Subject: Re: SDD Schema in relationship to Prometheus
Apologies: The previous post from me with this title was an unfinished
version sent off prematurely by my email editor. Please ignore and use this
one instead.
-----------------------------
Hi Trevor - thanks very much for your comments and comparative document -
this is really useful, and we need to get much more feedback like this.
The main difference between SDD and Prometheus seems to be that you are
working specifically on the basis of defining a controlled terminology
whereas SDD explicitly decided early on that a controlled terminology was
outside our scope. History will judge which approach is best.
[Paterson, Trevor]
A controlled terminology was not such a large feature of our work
initially - we were more interested in the model for saving 'character'
data. However, it became obvious that the only way to allow unambiguous
interpretation of data - for reuse, comparison etc. - was to provide full
definitions. It then seemed desirable that people would share definitions to
allow compatibility. Whether this will be achieved by bottom-up adoption is
an open question. Taxonomists don't seem to like the idea of top-down
imposition, though they may be happier when it is restricted to quite a
small domain of users.
We did have early discussions about a controlled terminology (see the list
archives for a history of this). One difficulty for us is that SDD is designed
to be biology-wide (indeed, we have even removed specific references to
biology, such as "taxon", because SDD is equally applicable to descriptions
of non-taxa such as diseases, nutrient deficiency syndromes, soils and
minerals. Perhaps here we have drawn our bow too wide, but we were informed
by the fact that at our Lisbon meeting all but one of the contributors who
were working with identification tools had removed their biology-specific
tags to become more general). Prometheus (as I understand it from your
document) is specifically botanical. This would be an intolerable
restriction for us given our brief.
[Paterson, Trevor]
We are constrained by the expertise of whoever we are collaborating with...
the taxonomists at RBGE are full partners in this project so the 'test
domains' reflect their interests and expertise ( or we will never get real
test data). We hope that our character model will be applicable to the whole
field of biological taxonomy, and that specific ontologies/terminologies
could be developed to allow description of other groups (mammals, insects
etc.).
Obviously, a botany-wide controlled terminology is more achievable than a
biology-wide one. Personally, however, I think that you run the danger even
in botany with any controlled terminology of trying to force nature kicking
and screaming into small boxes, and do it an injustice therewith. I don't
know how any botany-wide controlled terminology could cope with the leaves
of Drosera auriculata, for instance, or the morphology of Podostemaceae. (In
fact, I wonder whether the dream of a controlled terminology is more likely
in a cold Northern Hemisphere climate than in the biodiverse South or
tropics?).
[Paterson, Trevor]
Yes - we know that diverse taxa would probably require specific ontologies.
We may be able to develop a system that allows a core central terminology
with taxon-specific extensions. We want to allow MEANINGFUL comparison
of data, and often there is no need or sense in comparing data across
widely divergent taxa, i.e. you might want to compare the properties of
stalks on angiosperm flowers, but it is probably of no taxonomic interest to
compare these with the stalks of a slime mould fruiting body.
In general, we have taken the view that a controlled terminology in
particular domains (e.g. legumes) may develop as an emergent property of
SDD, rather than imposed top-down.
[Paterson, Trevor]
Yes - this is the working model we have come round to: users develop an
ontology and share it with colleagues in a closely related field etc.
On more specific points from your document:
Complexity: SDD was scoped to be a superset of existing systems and
standards e.g. DELTA, Lucid, DeltaAccess, and also to accommodate future
developments that those of us working in the field can envisage but no-one's
really done yet (particularly federation issues - and you may be further
down this track than we are). This is part of the reason for the complexity.
>It is not clear to me whether SDD is proposing this schema as
>a unifying schema to which different description formats would map their
>own schema
>or
>whether the SDD schema is being proposed as a schema for developers to
>(partially) implement when designing applications
>and repositories for capturing descriptive data.
It is designed as a unifying standard, to allow lossless roundtripping
between applications. At the same time, we are struggling with how much
should be mandatory and how much optional (your second option)
>From our own collaborative experiences with botanical taxonomists, data
>models and structures hold no interest to them in
>practice, and they find even our simple conceptual model of character
>description complex to understand. Probably few working
>taxonomists would wish to interact at any level with the SDD schema and
>applications would have to achieve this mapping
>transparently.
On this I'm sure you're right, and we have had many discussions within SDD
about this problem. There are differing views as to the importance of
taxonomists themselves coming to grips with SDD, as the standard itself will
generally be invisible to a taxonomist using an SDD-compliant application.
[Paterson, Trevor]
The problem is that someone has to write and implement the applications
(e.g. me!), and they have to do this in collaboration with taxonomists, or
it will be perceived as an irrelevant imposition. Therefore it will always
be necessary to get at least some taxonomists from a variety of fields to
understand the schema.
From my perspective it actually seems much easier for computer scientists to
get to grips with taxonomy than vice versa (although taxonomists
would be a bit defensive about this).
Translation and multiple language representations: allowing multiple
languages is seen as a fundamental part of the SDD brief. Life would indeed
be much simpler if everyone spoke the same language, but they don't so we
need to handle that.
>It is not clear whether SDD proposes that a single document can include
>multiple language representations, or whether these
>would form separate documents, conforming to the same standard
SDD can handle multiple language representations of every character string
within the one document.
[Paterson, Trevor]
Still not convinced that allowing multiple language representations would make
for good/accurate science.
Multiple expertise levels
>I am similarly suspicious of the necessity for including the ability for
>recording different expertise levels in one document format.
>Is SDD proposing/allowing multiple representations within the same document,
>or just that the same format/standard can be
>used for documents aimed at different expertise levels?
>
>There clearly is value in being able to extract/translate simple language
>descriptions from complex data resources - as is
>necessary for compiling flora and keys from monographs and original
>descriptions. However, is including the ability to describe
>descriptive data in language suitable for primary schoolchildren relevant
>to an accurate scientific database of taxonomic data?
>[Again this would appear to be a political requirement??]
This is not a political requirement, but an attempt to broaden the
application of taxonomy beyond taxonomists (surely a requirement if the
taxonomic crisis is to be resolved). It also derives neatly from the XML
underpinning (XML is based on the idea of multiple representations of a
single document)
[Paterson, Trevor]
Yes, I see where you are coming from. I think the problem is that
taxonomists are concerned about accurate representation of their data for
their purposes; making a shareable version of this is not perceived as of
any value to them. They want a scientific tool to do their job. I am not
sure how, or if, we could encourage a dual markup approach, or whether
different 'markets' for descriptive data would exist independently.
Defining the descriptive terminology
>Are you suggesting that the SDD Terminology Section will be adequate and
>appropriate to store
>and represent any (allowed) defined terminology?
Yes, we hope so. Do you think it will be inadequate?
[Paterson, Trevor]
It probably would be adequate to store Prometheus terminologies. With
minor tinkering, the semantics of the terminologies could be saved just with
glossary entries and relationships etc., but would obviously require
interpretation by suitable applications. I can't understand concept trees well
enough to get a feel for whether they could store the semantics of a
terminology more explicitly.
However, that is not to say SDD could cope with other terminologies that
might have further unforeseen relationships. There might need to be a
facility for recording 'user defined' relationships such as our stategroup
membership and restrictions between these and sets of structures. I think
that these types of relationships are representable in concept trees, but
I would need a primer/tutorial to show me how.
>Is the standard going to allow descriptions to reference other defined
>terminologies?
It will be possible to outsource the terminology section, so if a group
creates a controlled vocabulary, that could be referenced in multiple SDD
documents. So presumably Prometheus could be the source of a controlled
vocabulary that other users (if they found it adequate) could reference.
>Would SDD only accept Data marked up in an SDD terminology?
Yes - or do I misunderstand this question?
[Paterson, Trevor]
If I understand your previous remark, you could have completely external
terminologies, about whose structure and semantics SDD 'knows' nothing,
so it would have to accept 'non-SDD' terminology.
>Would existing terminologies have to be translated/mapped/redescribed in
>SDD format?
Any existing terminology can be represented in SDD, so there will be no
remapping necessary.
[Paterson, Trevor]
Again, this will only be knowable if people try doing it. Obviously if there
is a standard structure people would be encouraged to use it, but
pre-existing description ontologies (e.g. Plant Ontology and Gene Ontology) would
not want to retrofit to the standard. Even if there is no redesign or
remapping necessary, there is the matter of re-expressing it in an SDD
format.
>Who is going to create terminologies, e.g. users on an ad hoc basis, or
>expert user groups?
As above, these may develop particularly for some groups (e.g. ferns,
legumes), but some users may choose to stay outside such a system (there
will be benefits and costs of using a controlled vocabulary, so people will
have to weigh it up for themselves). SDD itself is agnostic.
[Paterson, Trevor]
Almost our opinion - but we are not agnostic - we believe 'interpretability'
and 'reusability' are Gods, and we should proselytize on their behalf -
gently of course - by offering only carrots and no sticks.
>Is it an aim to promote re-use and sharing of terminologies?
It would be a desirable outcome, but we hope it evolves bottom-up rather
than being imposed top-down.
[Paterson, Trevor]
Sure
>Is there going to be policing of SDD terminologies, e.g. maintaining
>versioning, additions etc?
Versioning will be handled within SDD, but there can be no possibility of
policing a system
[Paterson, Trevor]
This is what really worries the taxonomists - imposed standards reducing the
flexibility and expressivity of their descriptions - and hatred of a police
state. I think standardisation WILL be perceived to lead to a loss of
expressivity, but as one person's expressivity is often another's
incomprehensibility, this seems like potentially 'a good thing'
(in small doses, obviously).
>How was the terminology section created - by examining examples of
>terminology specifications,
>ontology representations etc?
We have had a mix of off-the-top-of-the-head speculation as to how best to
do things, and proofing of concepts against real-world examples. I would
have liked to see more proofing going on during development, but this has
been hard to maintain, and may be to our cost. We are now at a phase where
several groups are trying to implement SDD-compliance for their systems -
this will be the proof of the pudding.
Note also that SDD is currently v0.9 - with the explicit statement on
release that everything may change if we find that the proofing fails.
>Does it form a standard template for storing a terminology?
What do you mean by this? It provides the standard schema for representing
an (undefined) terminology.
>Is it compatible with any existing tools, standards or formats - e.g.
>ontology editors?
We have specifically made it very general. There is currently no existing
tool that can handle SDD. Hopefully this will change shortly.
[Paterson, Trevor]
Keep me posted
----------------------------------
So does Prometheus have any data yet, or is it still at the model stage? It
would be very interesting to try representing Prometheus data in SDD.
[Paterson, Trevor] To date we have:
- the model (and a database implementation to save model-compliant data)
- a tool for creating simple terminologies, with definitions, PartOf,
  TypeOf, Stategroup relations etc.
- a prototype angiosperm terminology/ontology
- a prototype tool for using ontologies to specify project description
  templates (proformas), which also allows recording of specimen descriptions
- the ability to save these descriptions to the database in the
  model-compliant format
We are currently getting user testing done on the prototype by the
taxonomists, who are making proformas and saving specimen descriptions. We
are just about ready to use the tool for a 'real' project if we can get time
and interest to do this, e.g. for a small taxonomic revision. We are rapidly
approaching the end of our project funding, however, so it may switch to
being a 'back burner' project.
Cheers - k
----- Original Message -----
From: Paterson, Trevor
To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
Sent: Monday, March 15, 2004 9:41 PM
Subject: SDD Schema in relationship to Prometheus
Gregor
I have written a rough document considering several aspects of the
SDD-schema - largely interpreted with reference to our Prometheus Database
model for descriptive data. It seems easier to keep this all together,
rather than post it to various sections on twiki, so I am attaching it here.
My main problems in interpreting the schema were the lack of documentation
( as always...) especially for the conceptually complex parts like concept
trees. I think clear, visual summary models for description, characters,
concept trees etc would help a novice to get to grips with the concepts, and
might make some of the complexities more tractable. I do worry that the
overall schema is over complex and 'trying to do too much in one go' - eg
considering multiple language and expertise representations, although I am
sure that there are good political reasons for everything.....
yours
trevor
Trevor Paterson PhD
t.paterson(a)napier.ac.uk
School of Computing
Napier University
Merchiston Campus
10 Colinton Road
Edinburgh
Scotland
EH10 5DT
tel: +44 (0)131 455-2752
www.dcs.napier.ac.uk/~cs175
www.prometheusdb.org
16 Mar '04
Kevin
Thanks for your reply. It is becoming much clearer to me that a lot of our
thoughts are actually convergent (probably because we are all thinking about
the same issues). You have clarified a lot of points, and I have added a
little more clarification below.
It looks like a worthwhile task would be to try to represent our angiosperm
terminology in SDD format at some stage (time permitting, as ever). This is
probably more straightforward than representing our descriptive data
according to SDD, as our underlying data model is quite different in some
aspects (I think!).
cheers
Trevor
Trevor Paterson PhD
t.paterson(a)napier.ac.uk
School of Computing
Napier University
Merchiston Campus
10 Colinton Road
Edinburgh
Scotland
EH10 5DT
tel: +44 (0)131 455-2752
www.dcs.napier.ac.uk/~cs175
www.prometheusdb.org
-----Original Message-----
From: Kevin Thiele [mailto:kevin.thiele@BIGPOND.COM]
Sent: 15 March 2004 22:33
To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU
Subject: Re: SDD Schema in relationship to Prometheus
Apologies: The previous post from me with this title was an unfinished
version sent off prematurely by my email editor. Please ignore and use this
one instead.
-----------------------------
Hi Trevor - thanks very much for your comments and comparative document -
this is really useful, and we need to get much more feedback like this.
The main difference between SDD and Prometheus seems to be that you are
working specifically on the basis of defining a controlled terminology
whereas SDD explicitly decided early on that a controlled terminology was
outside our scope. History will judge which approach is best.
[Paterson, Trevor]
A controlled terminology was not such a large feature of our work initially
- we were more interested in the model for saving 'character' data - however
it became obvious that the only way to allow unambiguous interpretation of
data - for reuse, comparison etc. - was to provide full definitions. It then
seemed desirable that people would share definitions to allow
compatibility....... whether this will be achieved by bottom-up adoption is
an open question. Taxonomists don't seem to like the idea of top-down
imposition - though they may be happier when it is restricted to quite a small
domain of users.
We did have early discussions about a controlled terminology (see the list
archives for a history of this). One difficulty for us is that SDD is designed
to be biology-wide (indeed, we have even removed specific references to
biology, such as "taxon", because SDD is equally applicable to descriptions
of non-taxa such as diseases, nutrient deficiency syndromes, soils and
minerals. Perhaps here we have drawn our bow too wide, but we were informed
by the fact that at our Lisbon meeting all but one of the contributors who
were working with identification tools had removed their biology-specific
tags to become more general). Prometheus (as I understand it from your
document) is specifically botanical. This would be an intolerable
restriction for us given our brief.
[Paterson, Trevor]
We are constrained by the expertise of whoever we are collaborating with...
the taxonomists at RBGE are full partners in this project so the 'test
domains' reflect their interests and expertise ( or we will never get real
test data). We hope that our character model will be applicable to the whole
field of biological taxonomy - - and that specific ontologies/terminologies
could be developed to allow description of other groups ( mammals, insects
etc)
Obviously, a botany-wide controlled terminology is more achievable than a
biology-wide one. Personally, however, I think that you run the danger even
in botany with any controlled terminology of trying to force nature kicking
and screaming into small boxes, and do it an injustice therewith. I don't
know how any botany-wide controlled terminology could cope with the leaves
of Drosera auriculata, for instance, or the morphology of Podostemaceae. (In
fact, I wonder whether the dream of a controlled terminology is more likely
in a cold Northern Hemisphere climate than in the biodiverse South or
tropics?).
[Paterson, Trevor]
Yes - we know that diverse taxa would probably require specific ontologies.
We may be able to develop a system that allows a core central terminology -
with taxon-specific extensions..... We want to allow MEANINGFUL comparison
of data - and often there is no need or sense in comparing data across
widely divergent taxa, i.e. you might want to compare the properties of
stalks on angiosperm flowers, but it is probably of no taxonomic interest to
compare these with the stalks of a slime mould fruiting body...............
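As a rough illustration of the 'core central terminology with taxon-specific
extensions' idea, the lookup could be layered so that an extension is consulted
first and the shared core second. This is only a sketch: the terms, definitions
and variable names are made up for the example and are not taken from
Prometheus or SDD.

```python
from collections import ChainMap

# Hypothetical core terminology shared by every taxon group.
core_terms = {"stalk": "an elongated supporting structure"}

# Hypothetical extension that only makes sense for angiosperms.
angiosperm_terms = {"pedicel": "the stalk of a single flower"}

# ChainMap searches the extension first, then falls back to the core,
# without copying or modifying either dictionary.
angiosperm_terminology = ChainMap(angiosperm_terms, core_terms)

print(angiosperm_terminology["pedicel"])  # found in the extension
print(angiosperm_terminology["stalk"])    # falls back to the core
print("pedicel" in core_terms)            # the core itself is unchanged
```

Keeping the layers separate means a slime mould extension could sit on the
same core without ever seeing the angiosperm terms.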
In general, we have taken the view that a controlled terminology in
particular domains (e.g. legumes) may develop as an emergent property of
SDD, rather than imposed top-down.
[Paterson, Trevor]
Yes - this is the working model we have come round to... users develop an
ontology and share it with colleagues in a closely related field etc....
On more specific points from your document:
Complexity: SDD was scoped to be a superset of existing systems and
standards e.g. DELTA, Lucid, DeltaAccess, and also to accommodate future
developments that those of us working in the field can envisage but no-one's
really done yet (particularly federation issues - and you may be further
down this track than we are). This is part of the reason for the complexity.
>It is not clear to me whether SDD is proposing this schema as
>a unifying schema to which different description formats would map their
>own schema
>or
>whether the SDD schema is being proposed as a schema for developers to
>(partially) implement when designing applications
>and repositories for capturing descriptive data.
It is designed as a unifying standard, to allow lossless roundtripping
between applications. At the same time, we are struggling with how much
should be mandatory and how much optional (your second option).
>From our own collaborative experiences with botanical taxonomists, data
>models and structures hold no interest to them in
>practice, and they find even our simple conceptual model of character
>description complex to understand. Probably few working
>taxonomists would wish to interact at any level with the SDD schema and
>applications would have to achieve this mapping
>transparently.
On this I'm sure you're right, and we have had many discussions within SDD
about this problem. There are differing views as to the importance of
taxonomists themselves coming to grips with SDD, as the standard itself will
generally be invisible to a taxonomist using an SDD-compliant application.
[Paterson, Trevor]
The problem is that someone has to write and implement the applications
(e.g. me!) - and they have to do this in collaboration with taxonomists - or
it will be perceived as an irrelevant imposition - therefore it will always
be necessary to get at least some taxonomists from a variety of fields to
understand the schema...
From my perspective, it actually seems much easier for computer scientists to
get to grips with taxonomy rather than vice versa (although taxonomists
would be a bit defensive about this....)
Translation and multiple language representations: allowing multiple
languages is seen as a fundamental part of the SDD brief. Life would indeed
be much simpler if everyone spoke the same language, but they don't so we
need to handle that.
>It is not clear whether SDD proposes that a single document can include
>multiple language representations, or whether these
>would form separate documents, conforming to the same standard
SDD can handle multiple language representations of every character string
within the one document.
[Paterson, Trevor]
Still not convinced that allowing multiple language representations would make
for good/accurate science.
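To make 'multiple language representations of every character string within
the one document' concrete, here is a minimal sketch of one logical string that
carries several language versions. The class name, the language codes and the
fallback rule are assumptions for illustration only; they do not reproduce the
SDD schema.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MultilingualText:
    """One logical string with several language representations."""

    # Maps a language code (e.g. "en", "de") to the text in that language.
    by_language: Dict[str, str] = field(default_factory=dict)

    def get(self, language: str, fallback: str = "en") -> str:
        """Return the text for `language`, or the fallback language's text."""
        if language in self.by_language:
            return self.by_language[language]
        return self.by_language[fallback]


# A hypothetical character label stored once, in two languages.
leaf_shape = MultilingualText({"en": "leaf shape", "de": "Blattform"})

print(leaf_shape.get("de"))  # Blattform
print(leaf_shape.get("fr"))  # leaf shape (no French text, so English is used)
```

The point of the sketch is that the identity of the character stays the same
whichever label is shown, which bears on whether translation affects the
underlying data at all.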
Multiple expertise levels
>I am similarly suspicious of the necessity for including the ability for
>recording different expertise levels in one document format.
>Is SDD proposing/allowing multiple representations within the same document:
>or just that the same format/standard can be
>used for documents aimed at different expertise level.
>
>There clearly is value in being able to extract/translate simple language
>descriptions from complex data resources - as is
>necessary for compiling flora and keys from monographs and original
>descriptions. However, is including the ability to describe
>descriptive data in language suitable for primary schoolchildren relevant
>to an accurate scientific database of taxonomic data.
>[Again this would appear to be a political requirement??]
This is not a political requirement, but an attempt to broaden the
application of taxonomy beyond taxonomists (surely a requirement if the
taxonomic crisis is to be resolved). It also derives neatly from the XML
underpinning (XML is based on the idea of multiple representations of a
single document).
[Paterson, Trevor]
Yes I see where you are coming from - I think the problem is that
taxonomists are concerned about accurate representation of their data for
their purposes - making a shareable version of this is not perceived as of
any value to them - they want a scientific tool to do their job - I am not
sure how/if we could encourage a dual markup approach - or whether
different 'markets' for descriptive data would exist independently.
Defining the descriptive terminology
>Are you suggesting that the SDD Terminology Section will be adequate and
>appropriate to store
>and represent any (allowed) defined terminology?
Yes, we hope so. Do you think it will be inadequate?
[Paterson, Trevor]
It probably would be adequate to store Prometheus terminologies - with
minor tinkering, the semantics of the terminologies could be saved just with
glossary entries and relationships etc. - but would obviously require
interpretation by suitable applications. I can't understand concept trees well
enough to get a feel for whether they could store the semantics of a
terminology more explicitly.....
However, that is not to say SDD could cope with other terminologies that
might have further unforeseen relationships - there might need to be a
facility for recording 'user defined' relationships such as our stategroup
membership and restrictions between these and sets of structures - I think
that these types of relationships are representable in concept trees - but
I would need a primer/tutorial to show me how.
>Is the standard going to allow descriptions to reference other defined
>terminologies?
It will be possible to outsource the terminology section, so if a group
creates a controlled vocabulary, that could be referenced in multiple SDD
documents. So presumably Prometheus could be the source of a controlled
vocabulary that other users (if they found it adequate) could reference.
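A small sketch of what 'outsourcing' the terminology might amount to in
practice: several documents point at one shared vocabulary by an identifier
instead of embedding their own. The identifier, the dictionary layout and the
function name below are hypothetical and are not part of SDD or Prometheus.

```python
from typing import Dict, List

# Hypothetical shared vocabulary, published once under an identifier.
vocabularies: Dict[str, Dict[str, str]] = {
    "example-angiosperm-vocab": {
        "petal": "a segment of the corolla",
        "sepal": "a segment of the calyx",
    }
}

# Two hypothetical documents that reference the vocabulary rather than
# carrying their own terminology section.
document_a = {"terminology": "example-angiosperm-vocab",
              "terms_used": ["petal", "sepal"]}
document_b = {"terminology": "example-angiosperm-vocab",
              "terms_used": ["petal", "tepal"]}


def unresolved_terms(document: dict,
                     known: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the terms a document uses that its referenced vocabulary lacks."""
    vocabulary = known[document["terminology"]]
    return [term for term in document["terms_used"] if term not in vocabulary]


print(unresolved_terms(document_a, vocabularies))  # []
print(unresolved_terms(document_b, vocabularies))  # ['tepal']
```

A check like this is also where the 'adequacy' question shows up: a term the
shared vocabulary lacks has to be added to it or defined locally.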
>Would SDD only accept Data marked up in an SDD terminology?
Yes - or do I misunderstand this question?
[Paterson, Trevor]
If I understand your previous remark - you could have completely external
terminologies - that SDD 'knows' nothing about the structure and semantics of -
so it would have to accept 'non-SDD' terminology.
>Would existing terminologies have to be translated/mapped/redescribed in
>SDD format?
Any existing terminology can be represented in SDD, so there will be no
remapping necessary.
[Paterson, Trevor]
Again this will only be knowable if people try doing it. Obviously if there
is a standard structure people would be encouraged to use it - but
pre-existing description ontologies (e.g. PlantOntology and GeneOntology) would
not want to retrofit to the standard.... even if there is no 'redesign or
remapping' necessary - there is the matter of re-expressing it in an SDD
format.
>Who is going to create terminologies, e.g. users on an ad hoc basis, or
>expert user groups?
As above, these may develop particularly for some groups (e.g. ferns,
legumes), but some users may choose to stay outside such a system (there
will be benefits and costs of using a controlled vocabulary, so people will
have to weigh it up for themselves). SDD itself is agnostic.
[Paterson, Trevor]
Almost our opinion - but we are not agnostic - we believe 'interpretability'
and 'reusability' are Gods, and we should proselytize on their behalf -
gently of course - by offering only carrots and no sticks...
>Is it an aim to promote re-use and sharing of terminologies?
It would be a desirable outcome, but we hope it evolves bottom-up rather
than being imposed top-down.
[Paterson, Trevor]
Sure
>Is there going to be policing of SDD terminologies, e.g. maintaining
>versioning, additions etc?
Versioning will be handled within SDD, but there can be no possibility of
policing a system.
[Paterson, Trevor]
This is what really worries the taxonomists - imposed standards reducing the
flexibility and expressivity of their descriptions - and hatred of a police
state. I think standardisation WILL be perceived to lead to a loss of
expressivity - but as often one person's expressivity is another's
incomprehensibility, this seems like potentially 'a good thing'
(in small doses obviously)........
>How was the terminology section created - by examining examples of
terminology specifications,
>ontology representations etc?
We have had a mix of off-the-top-of-the-head speculation as to how best to
do things, and proofing of concepts against real-world examples. I would
have liked to see more proofing going on during development, but this has
been hard to maintain, and may be to our cost. We are now at a phase where
several groups are trying to implement SDD-compliance for their systems -
this will be the proof of the pudding.
Note also that SDD is currently v0.9 - with the explicit statement on
release that everything may change if we find that the proofing fails.
>Does it form a standard template for storing a terminology?
What do you mean by this? It provides the standard schema for representing
an (undefined) terminology.
>Is it compatible with any existing tools, standards or formats - e.g.
>ontology editors?
We have specifically made it very general. There is currently no existing
tool that can handle SDD. Hopefully this will change shortly.
[Paterson, Trevor]
Keep me posted
----------------------------------
So does Prometheus have any data yet, or is it still at the model stage? It
would be very interesting to try representing Prometheus data in SDD.
[Paterson, Trevor] to date we have
* the model (and a database implementation to save model-compliant data)
* a tool for creating simple terminologies - with definitions,
  PartOf, Typeof, Stategroup relations etc.
* a prototype angiosperm terminology/ontology
* a prototype tool for using ontologies to specify project description
  templates (proformas)
* which also allows recording of specimen descriptions
* the ability to save these descriptions to the database in the
  model-compliant format
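For readers unfamiliar with this kind of terminology tool, a minimal sketch of
'terms with definitions plus named relations' might look like the following.
The class, the relation labels and the example terms are hypothetical and do
not reproduce the actual Prometheus data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Terminology:
    """Defined terms plus (subject, relation, object) triples between them."""

    definitions: Dict[str, str] = field(default_factory=dict)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def define(self, term: str, definition: str) -> None:
        self.definitions[term] = definition

    def relate(self, subject: str, relation: str, obj: str) -> None:
        # Only defined terms may take part in a relation, so every
        # relation stays interpretable.
        for term in (subject, obj):
            if term not in self.definitions:
                raise KeyError(f"undefined term: {term}")
        self.relations.append((subject, relation, obj))

    def related(self, relation: str, obj: str) -> List[str]:
        """Return every subject linked to `obj` by `relation`."""
        return [s for s, r, o in self.relations if r == relation and o == obj]


terms = Terminology()
terms.define("flower", "the reproductive structure of an angiosperm")
terms.define("petal", "a segment of the corolla")
terms.define("stalk", "an elongated supporting structure")
terms.define("pedicel", "the stalk of a single flower")
terms.relate("petal", "PartOf", "flower")
terms.relate("pedicel", "TypeOf", "stalk")

print(terms.related("PartOf", "flower"))  # ['petal']
print(terms.related("TypeOf", "stalk"))   # ['pedicel']
```

Refusing relations between undefined terms is one simple way to enforce the
'full definitions' requirement argued for earlier in the thread.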
and we are currently getting user testing done on the prototype by the
taxonomists, who are making proformas and saving specimen descriptions. We
are just about ready to use the tool for a 'real' project if we can get time
and interest to do this... e.g. for a small taxonomic revision. We are rapidly
approaching the end of our project funding however, so it may switch to
being a 'back burner' project....
Cheers - k
----- Original Message -----
From: Paterson, Trevor <mailto:T.Paterson@NAPIER.AC.UK>
To: TDWG-SDD(a)LISTSERV.NHM.KU.EDU <mailto:TDWG-SDD@LISTSERV.NHM.KU.EDU>
Sent: Monday, March 15, 2004 9:41 PM
Subject: SDD Schema in relationship to Prometheus
Gregor
I have written a rough document considering several aspects of the
SDD-schema - largely interpreted with reference to our Prometheus Database
model for descriptive data. It seems easier to keep this all together,
rather than post it to various sections on twiki, so i am attaching it here
My main problems in interpreting the schema were the lack of documentation (
as always...) especially for the conceptually complex parts like concept
trees. I think clear, visual summary models for description, characters,
concept trees etc would help a novice to get to grips with the concepts, and
might make some of the complexities more tractable. I do worry that the
overall schema is over complex and 'trying to do too much in one go' - eg
considering multiple language and expertise representations, although I am
sure that there are good political reasons for everything.....
yours
trevor
Trevor Paterson PhD
t.paterson(a)napier.ac.uk <mailto:t.paterson@napier.ac.uk>
School of Computing
Napier University
Merchiston Campus
10 Colinton Road
Edinburgh
Scotland
EH10 5DT
tel: +44 (0)131 455-2752
www.dcs.napier.ac.uk/~cs175 <http://www.dcs.napier.ac.uk/~cs175>
www.prometheusdb.org <http://www.prometheusdb.org/>
Apologies: The previous post from me with this title was an unfinished version sent off prematurely by my email editor. Please ignore and use this one instead.
-----------------------------
Hi Trevor - thanks very much for your comments and comparative document - this is really useful, and we need to get much more feedback like this.
The main difference between SDD and Prometheus seems to be that you are working specifically on the basis of defining a controlled terminology whereas SDD explicitly decided early on that a controlled terminology was outside our scope. History will judge which approach is best.
We did have early discussions about a controlled terminology (see the list archives for a history of this). One difficulty for us is that SDD is designed to be biology-wide (indeed, we have even removed specific references to biology, such as "taxon", because SDD is equally applicable to descriptions of non-taxa such as diseases, nutrient deficiency syndromes, soils and minerals. Perhaps here we have drawn our bow too wide, but we were informed by the fact that at our Lisbon meeting all but one of the contributors who were working with identification tools had removed their biology-specific tags to become more general). Prometheus (as I understand it from your document) is specifically botanical. This would be an intolerable restriction for us given our brief.
Obviously, a botany-wide controlled terminology is more achievable than a biology-wide one. Personally, however, I think that you run the danger even in botany with any controlled terminology of trying to force nature kicking and screaming into small boxes, and do it an injustice therewith. I don't know how any botany-wide controlled terminology could cope with the leaves of Drosera auriculata, for instance, or the morphology of Podostemaceae. (In fact, I wonder whether the dream of a controlled terminology is more likely in a cold Northern Hemisphere climate than in the biodiverse South or tropics?).
In general, we have taken the view that a controlled terminology in particular domains (e.g. legumes) may develop as an emergent property of SDD, rather than imposed top-down.
On more specific points from your document:
Complexity: SDD was scoped to be a superset of existing systems and standards e.g. DELTA, Lucid, DeltaAccess, and also to accommodate future developments that those of us working in the field can envisage but no-one's really done yet (particularly federation issues - and you may be further down this track than we are). This is part of the reason for the complexity.
>It is not clear to me whether SDD is proposing this schema as
>a unifying schema to which different description formats would map their own schema
>or
>whether the SDD schema is being proposed as a schema for developers to (partially) implement when designing applications
>and repositories for capturing descriptive data.
It is designed as a unifying standard, to allow lossless roundtripping between applications. At the same time, we are struggling with how much should be mandatory and how much optional (your second option).
>From our own collaborative experiences with botanical taxonomists, data models and structures hold no interest to them in
>practice, and they find even our simple conceptual model of character description complex to understand. Probably few working
>taxonomists would wish to interact at any level with the SDD schema and applications would have to achieve this mapping
>transparently.
On this I'm sure you're right, and we have had many discussions within SDD about this problem. There are differing views as to the importance of taxonomists themselves coming to grips with SDD, as the standard itself will generally be invisible to a taxonomist using an SDD-compliant application.
Translation and multiple language representations: allowing multiple languages is seen as a fundamental part of the SDD brief. Life would indeed be much simpler if everyone spoke the same language, but they don't so we need to handle that.
>It is not clear whether SDD proposes that a single document can include multiple language representations, or whether these
>would form separate documents, conforming to the same standard
SDD can handle multiple language representations of every character string within the one document.
Multiple expertise levels
>I am similarly suspicious of the necessity for including the ability for recording different expertise levels in one document format.
>Is SDD proposing/allowing multiple representations within the same document : or just that the same format/standard can be
>used for documents aimed at different expertise level.
>
>There clearly is value in being able to extract/translate simple language descriptions from complex data resources - as is
>necessary for compiling flora and keys from monographs and original descriptions. However, is including the ability to describe
>descriptive data in language suitable for primary schoolchildren relevant to an accurate scientific database of taxonomic data.
>[Again this would appear to be a political requirement??]
This is not a political requirement, but an attempt to broaden the application of taxonomy beyond taxonomists (surely a requirement if the taxonomic crisis is to be resolved). It also derives neatly from the XML underpinning (XML is based on the idea of multiple representations of a single document).
Defining the descriptive terminology
>Are you suggesting that the SDD Terminology Section will be adequate and appropriate to store
>and represent any (allowed) defined terminology?
Yes, we hope so. Do you think it will be inadequate?
>Is the standard going to allow descriptions to reference other defined terminologies?
It will be possible to outsource the terminology section, so if a group creates a controlled vocabulary, that could be referenced in multiple SDD documents. So presumably Prometheus could be the source of a controlled vocabulary that other users (if they found it adequate) could reference.
>Would SDD only accept Data marked up in an SDD terminology?
Yes - or do I misunderstand this question?
>Would existing terminologies have to be translated/mapped/redescribed in SDD format?
Any existing terminology can be represented in SDD, so there will be no remapping necessary.
>Who is going to create terminologies, e.g. users on an ad hoc basis, or expert user groups?
As above, these may develop particularly for some groups (e.g. ferns, legumes), but some users may choose to stay outside such a system (there will be benefits and costs of using a controlled vocabulary, so people will have to weigh it up for themselves). SDD itself is agnostic.
>Is it an aim to promote re-use and sharing of terminologies?
It would be a desirable outcome, but we hope it evolves bottom-up rather than being imposed top-down.
>Is there going to be policing of SDD terminologies, e.g. maintaining versioning, additions etc?
Versioning will be handled within SDD, but there can be no possibility of policing a system.
>How was the terminology section created - by examining examples of terminology specifications,
>ontology representations etc?
We have had a mix of off-the-top-of-the-head speculation as to how best to do things, and proofing of concepts against real-world examples. I would have liked to see more proofing going on during development, but this has been hard to maintain, and may be to our cost. We are now at a phase where several groups are trying to implement SDD-compliance for their systems - this will be the proof of the pudding.
Note also that SDD is currently v0.9 - with the explicit statement on release that everything may change if we find that the proofing fails.
>Does it form a standard template for storing a terminology?
What do you mean by this? It provides the standard schema for representing an (undefined) terminology.
>Is it compatible with any existing tools, standards or formats - e.g. ontology editors?
We have specifically made it very general. There is currently no existing tool that can handle SDD. Hopefully this will change shortly.
----------------------------------
So does Prometheus have any data yet, or is it still at the model stage? It would be very interesting to try representing Prometheus data in SDD.
Cheers - k