Here's my first go at representing Challenge 1.
The challenge:
Discaria pubescens
Rigid, spreading shrub to c. 1m high and wide; stems glabrous. Leaves soon
deciduous (particularly on older plants), +/- oblong, (4-)6-10(-15) mm long,
2-3 mm wide, apex obtuse or minutely mucronate within an apical notch,
glabrous or a few hairs present near tip; stipules dark reddish-brown, c. 1
mm long, often shallowly joined around the node; spines stout, 1.5-4 cm
long.
Discaria nitida
Slender shrub, to 5 m high; stems glabrous. Leaves persistent (rarely
deciduous), elliptic to obovate, (8-)10-20(-30) mm long, 3-7 mm wide,
glabrous, shining; spines not developed at each node, to c. 1 cm long.
The representation:
<Document>
<Description Name="Discaria pubescens">
<Feature Name="Habit">Rigid, spreading <Feature Name="Life
form"><Value>shrub</Value></Feature><Feature Name="Height"> to c. <Value
MinValue="0"/><MaxValue>1</MaxValue><Units>m</Units> high and
wide</Feature></Feature>; <Feature Name="Stem indumentum">stems
<Value>glabrous</Value></Feature>. <Feature><Name>Leaves</Name><Feature
Name="Longevity">soon <Value>deciduous</Value> (particularly on older
plants)</Feature>, <Feature Name="Shape">+/-
<Value>oblong</Value></Feature>, <Feature
Name="Length">(<OutMinValue>4</OutMinValue>-)<MinValue>6</MinValue>-<MaxValue>10</MaxValue>(-<OutMaxValue>15</OutMaxValue>)<Units> mm</Units>
long</Feature>, <Feature
Name="Width"><MinValue>2</MinValue>-<MaxValue>3</MaxValue><Units> mm</Units>
wide</Feature>, <Feature Name="Apex shape"><Value>obtuse</Value> or minutely
<Value>mucronate within an apical notch</Value></Feature>, <Feature
Name="indumentum" Value="hairy">surfaces <Value>glabrous</Value> or a few
hairs present near tip</Feature>;<Feature><Name>stipules</Name><Feature
Name="Colour"><Value Value="Brown"/><Value Value="Red"/>dark
reddish-brown</Feature>, c. 1 mm long, <Feature Name="Arrangement">often
shallowly <Value>joined around the
node</Value></Feature></Feature></Feature> <Feature><Name>
spines</Name><Feature Name="Robustness"><Value>stout</Value></Feature>,
<Feature
Name="Length"><MinValue>1.5</MinValue>-<MaxValue>4</MaxValue><Units>cm</Units>
long</Feature></Feature>.
</Description>
<Description Name="Discaria nitida">
<Feature Name="Habit">Slender <Feature Name="Life
form"><Value>shrub</Value></Feature><Feature Name="Height"> to <Value
MinValue="0"/><MaxValue>5</MaxValue><Units>m</Units> high</Feature></Feature>;
<Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>.
<Feature><Name>Leaves</Name><Feature
Name="Longevity"><Value>persistent</Value> (rarely <Value
Qualifier="rarely">deciduous</Value>)</Feature>, <Feature
Name="Shape"><Value>elliptic</Value> to <Value>obovate</Value></Feature>,
<Feature
Name="Length">(<OutMinValue>8</OutMinValue>-)<MinValue>10</MinValue>-<MaxValue>20</MaxValue>(-<OutMaxValue>30</OutMaxValue>)<Units> mm</Units>
long</Feature>, <Feature
Name="Width"><MinValue>3</MinValue>-<MaxValue>7</MaxValue><Units> mm</Units>
wide</Feature>, <Feature Name="indumentum"><Value>glabrous</Value></Feature>,
shining; </Feature><Feature><Name>spines</Name> not developed at each node,
<Feature Name="Length" MinValue="0">to c.
<MaxValue>1</MaxValue><Units>cm</Units> long</Feature></Feature>
</Description>
</Document>
or, if you prefer
<Document>
<Description Name="Discaria pubescens">
<Feature Name="Habit">Rigid, spreading
<Feature Name="Life form"><Value>shrub</Value></Feature>
<Feature Name="Height"> to c. <Value
MinValue="0"/><MaxValue>1</MaxValue><Units>m</Units> high and wide</Feature>
</Feature>;
<Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>.
<Feature><Name>Leaves</Name>
<Feature Name="Longevity">soon <Value>deciduous</Value> (particularly on
older plants)</Feature>,
<Feature Name="Shape">+/- <Value>oblong</Value></Feature>,
<Feature
Name="Length">(<OutMinValue>4</OutMinValue>-)<MinValue>6</MinValue>-<MaxValue>10</MaxValue>(-<OutMaxValue>15</OutMaxValue>)<Units> mm</Units>
long</Feature>,
<Feature Name="Width"><MinValue>2</MinValue>-<MaxValue>3</MaxValue><Units>
mm</Units> wide</Feature>,
<Feature Name="Apex shape"><Value>obtuse</Value> or minutely
<Value>mucronate within an apical notch</Value></Feature>,
<Feature Name="indumentum" Value="hairy">surfaces <Value>glabrous</Value>
or a few hairs present near tip</Feature>;
<Feature><Name>stipules</Name>
<Feature Name="Colour"><Value Value="Brown"/><Value Value="Red"/>dark
reddish-brown</Feature>, c. 1 mm long,
<Feature Name="Arrangement">often shallowly <Value>joined around the
node</Value></Feature>
</Feature>
</Feature>
<Feature><Name> spines</Name>
<Feature Name="Robustness"><Value>stout</Value></Feature>,
<Feature
Name="Length"><MinValue>1.5</MinValue>-<MaxValue>4</MaxValue><Units>cm</Units>
long</Feature>
</Feature>.
</Description>
<Description Name="Discaria nitida">
<Feature Name="Habit">Slender
<Feature Name="Life form"><Value>shrub</Value></Feature>
<Feature Name="Height"> to <Value
MinValue="0"/><MaxValue>5</MaxValue><Units>m</Units> high</Feature>
</Feature>;
<Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>.
<Feature><Name>Leaves</Name>
<Feature Name="Longevity"><Value>persistent</Value> (rarely <Value
Qualifier="rarely">deciduous</Value>)</Feature>,
<Feature Name="Shape"><Value>elliptic</Value> to
<Value>obovate</Value></Feature>,
<Feature
Name="Length">(<OutMinValue>8</OutMinValue>-)<MinValue>10</MinValue>-<MaxValue>20</MaxValue>(-<OutMaxValue>30</OutMaxValue>)<Units> mm</Units>
long</Feature>, <Feature
Name="Width"><MinValue>3</MinValue>-<MaxValue>7</MaxValue><Units> mm</Units>
wide</Feature>,
<Feature Name="indumentum"><Value>glabrous</Value></Feature>, shining;
</Feature>
<Feature><Name>spines</Name> not developed at each node,
<Feature Name="Length" MinValue="0">to c.
<MaxValue>1</MaxValue><Units>cm</Units> long</Feature>
</Feature>
</Description>
</Document>
Issues
I've taken the route of marking up a textual description, using a minimum of
tags. It seems to me that a description comprises a series of features with
values. I've used mixed markup because I wanted to have the minimum tagging
and make maximum use of the text. That is, in cases like:
<Feature Name="Life form"><Value>shrub</Value></Feature>
I've made use of the fact that "shrub" occurs in the text by enclosing it in
<Value> tags. In cases like:
<Feature><Name>Leaves</Name>
I'm using a <Name> tag rather than a Name attribute, again because the word
already occurs in the text.
This may make processing considerably more difficult, but it seems to me
that there are some advantages. An alternative would be to first process the
description into a DELTA-like structure then mark up that. I chose not to go
that route because I still have the dream that one day we'll have a method
of semi-automatically marking up the content of natural language into
something like the above. The other advantage of this structure I believe is
that it maintains all the complexity and nuance of the original natural
language (which can be retrieved merely by removing all the tags). I prefer
this to the DELTA-process (take a natural-language description -> atomise
it -> reconstruct semi-natural-language by putting the atoms back together
again in some fashion).
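To make the "remove all the tags" claim concrete, here is a minimal sketch in Python (my choice of language; nothing above depends on it). The fragment is a cut-down stand-in written in the style of the markup above, not the full document, and strip_tags is a hypothetical helper name. It only relies on the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET

# A cut-down fragment in the style of the markup above (an assumption for
# illustration, not a copy of the full document).
SAMPLE = (
    '<Description Name="Discaria nitida">'
    '<Feature Name="Habit">Slender '
    '<Feature Name="Life form"><Value>shrub</Value></Feature>'
    '<Feature Name="Height">, to <MaxValue>5</MaxValue> <Units>m</Units> high</Feature>'
    '</Feature>; '
    '<Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>.'
    '</Description>'
)


def strip_tags(xml_text):
    """Return the text content with every tag removed and whitespace collapsed."""
    root = ET.fromstring(xml_text)
    # itertext() yields element text and tails in document order, so joining
    # the pieces gives back the running prose.
    raw = "".join(root.itertext())
    return " ".join(raw.split())


recovered = strip_tags(SAMPLE)
print(recovered)  # Slender shrub, to 5 m high; stems glabrous.
```

Note that this only works if every word and punctuation mark of the original stays in the marked-up text, including the spaces around tags.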
Features are nested:
<Feature><Name>Leaves</Name>
<Feature><Name>Shape</Name></Feature>
</Feature>
Is this allowable? In XML-Spy I can create a Schema OK for this document.
It's also well-formed.
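As a rough check that the nesting is at least processable, here is a minimal Python sketch that walks nested features recursively and copes with the name being either a Name attribute or a <Name> child. The fragment is a cut-down stand-in for the markup above; feature_name and walk are hypothetical helper names, and lower-casing the names is my own assumption so that "Leaves" and "stipules" compare consistently.

```python
import xml.etree.ElementTree as ET

# Cut-down fragment in the style of the markup above (illustrative only).
SAMPLE = (
    '<Description Name="Discaria pubescens">'
    '<Feature><Name>Leaves</Name> '
    '<Feature Name="Shape">+/- <Value>oblong</Value></Feature>; '
    '<Feature><Name>stipules</Name> '
    '<Feature Name="Colour"><Value Value="Brown"/><Value Value="Red"/>'
    'dark reddish-brown</Feature>'
    '</Feature>'
    '</Feature>'
    '</Description>'
)


def feature_name(feature):
    """Take the name from the Name attribute, else from a <Name> child."""
    name = feature.get("Name")
    if name is None:
        name = feature.findtext("Name", default="")
    return name.strip().lower()


def walk(element, prefix=()):
    """Yield (path, values) for every Feature below element, depth first."""
    for feature in element.findall("Feature"):
        path = prefix + (feature_name(feature),)
        # A value is either the Value attribute or the enclosed text.
        values = [v.get("Value") or (v.text or "").strip()
                  for v in feature.findall("Value")]
        yield path, values
        yield from walk(feature, path)


features = list(walk(ET.fromstring(SAMPLE)))
for path, values in features:
    print(" / ".join(path), values)
```

This says nothing about whether a given schema language will accept the recursion, only that a recursive walk over nested <Feature> elements is straightforward once the document is well-formed.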
In this example, there is not a 1:1 correspondence between the
descriptions - e.g. the description for D. pubescens has
<Feature Name="Apex shape">
while that for D. nitida does not. This is anathema to most of us, but is a
real world case. We'll have to deal with it.
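One way of at least detecting the mismatch, sketched in Python under the assumption that features are compared by their nested names (lower-cased, which is my own convention). The fragment is a cut-down stand-in for the two descriptions; feature_paths and unmatched are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the two descriptions above (illustrative only).
SAMPLE = """
<Document>
  <Description Name="Discaria pubescens">
    <Feature><Name>Leaves</Name>
      <Feature Name="Shape"><Value>oblong</Value></Feature>
      <Feature Name="Apex shape"><Value>obtuse</Value></Feature>
    </Feature>
  </Description>
  <Description Name="Discaria nitida">
    <Feature><Name>Leaves</Name>
      <Feature Name="Shape"><Value>elliptic</Value> to <Value>obovate</Value></Feature>
    </Feature>
  </Description>
</Document>
"""


def feature_paths(element, prefix=()):
    """Collect the set of nested feature names below element."""
    paths = set()
    for feature in element.findall("Feature"):
        name = feature.get("Name") or feature.findtext("Name", default="")
        path = prefix + (name.strip().lower(),)
        paths.add(path)
        paths |= feature_paths(feature, path)
    return paths


def unmatched(document):
    """For each description, list features scored elsewhere but not here."""
    by_taxon = {d.get("Name"): feature_paths(d)
                for d in document.findall("Description")}
    everything = set().union(*by_taxon.values())
    return {taxon: sorted(everything - paths)
            for taxon, paths in by_taxon.items()}


gaps = unmatched(ET.fromstring(SAMPLE))
for taxon, missing in gaps.items():
    print(taxon, missing)
```

On this fragment the only gap reported is leaves / apex shape for D. nitida, which is the case described above.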
In this example there is no character list against which character names
etc. can be validated (e.g. to pick up the above problem). This is something
we need to deal with.
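A minimal sketch of what such validation might look like, in Python. Everything here is an assumption for illustration: the character list is a made-up dictionary, the fragment contains two deliberate errors ("ovate" and the misspelt "Lenght") so there is something to catch, and validate is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical character list: feature name -> allowed states.
# None means free text or numeric, so the states are not checked.
CHARACTER_LIST = {
    "leaves": None,
    "shape": {"oblong", "elliptic", "obovate"},
    "longevity": {"deciduous", "persistent"},
    "length": None,
}

# Illustrative fragment with two deliberate errors: the state "ovate"
# and the misspelt feature name "Lenght".
SAMPLE = (
    '<Description Name="Discaria nitida">'
    '<Feature><Name>Leaves</Name> '
    '<Feature Name="Longevity"><Value>persistent</Value> (rarely '
    '<Value Qualifier="rarely">deciduous</Value>)</Feature>, '
    '<Feature Name="Shape"><Value>elliptic</Value> to <Value>ovate</Value></Feature>, '
    '<Feature Name="Lenght"><MaxValue>20</MaxValue><Units> mm</Units> long</Feature>'
    '</Feature>'
    '</Description>'
)


def validate(description, characters):
    """Return a list of messages for unknown feature names and states."""
    problems = []
    for feature in description.iter("Feature"):
        name = (feature.get("Name")
                or feature.findtext("Name", default="")).strip()
        key = name.lower()
        if key not in characters:
            problems.append(f"unknown feature: {name}")
            continue
        allowed = characters[key]
        if allowed is None:
            continue
        for value in feature.findall("Value"):
            state = (value.get("Value") or value.text or "").strip().lower()
            if state not in allowed:
                problems.append(f"unknown state for {name}: {state}")
    return problems


problems = validate(ET.fromstring(SAMPLE), CHARACTER_LIST)
for problem in problems:
    print(problem)
```

A real character list would presumably live in its own document and carry more than names and states, but the lookup would have the same shape.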
Be gentle with me.
Your mission, should you choose to accept it, is to show that I'm crazy and
do better than this.
This email will self-destruct in 5 seconds
Cheers - k