Here's my first go at representing Challenge 1.
The challenge:
Discaria pubescens
Rigid, spreading shrub to c. 1m high and wide; stems glabrous. Leaves soon deciduous (particularly on older plants), +/- oblong, (4-)6-10(-15) mm long, 2-3 mm wide, apex obtuse or minutely mucronate within an apical notch, glabrous or a few hairs present near tip; stipules dark reddish-brown, c. 1 mm long, often shallowly joined around the node; spines stout, 1.5-4 cm long.
Discaria nitida
Slender shrub, to 5 m high; stems glabrous. Leaves persistent (rarely deciduous), elliptic to obovate, (8-)10-20(-30) mm long, 3-7 mm wide, glabrous, shining; spines not developed at each node, to c. 1 cm long.
The representation:
<Document> <Description Name="Discaria pubescens"> <Feature Name="Habit">Rigid, spreading <Feature Name="Life form"><Value>shrub</Value></Feature><Feature Name="Height"> to c. <Value MinValue="0"/><MaxValue>1</MaxValue><Units>m</Units> high and wide</Feature></Feature>; <Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>. <Feature><Name>Leaves</Name><Feature Name="Longevity">soon <Value>deciduous</Value> (particularly on older plants)</Feature>, <Feature Name="Shape">+/- <Value>oblong</Value></Feature>, <Feature Name="Length">(<OutMinValue>4</OutMinValue>-)<MinValue>6</MinValue>-<MaxValue>10</MaxValue>(-<OutMaxValue>15</OutMaxValue>)<Units> mm</Units> long</Feature>, <Feature Name="Width"><MinValue>2</MinValue>-<MaxValue>3</MaxValue><Units> mm</Units> wide</Feature>, <Feature Name="Apex shape"><Value>obtuse</Value> or minutely <Value>mucronate within an apical notch</Value></Feature>, <Feature Name="indumentum" Value="hairy">surfaces <Value>glabrous</Value> or a few hairs present near tip</Feature>; <Feature><Name>stipules</Name><Feature Name="Colour"><Value Value="Brown"/><Value Value="Red"/>dark reddish-brown</Feature>, c. 1 mm long, <Feature Name="Arrangement">often shallowly <Value>joined around the node</Value></Feature></Feature></Feature> <Feature><Name> spines</Name><Feature Name="Robustness"><Value>stout</Value></Feature>, <Feature Name="Length"><MinValue>1.5</MinValue>-<MaxValue>4</MaxValue><Units>cm</Units> long</Feature></Feature>. </Description>
<Description Name="Discaria nitida"> <Feature Name="Habit">Slender <Feature Name="Life form"><Value>shrub</Value></Feature><Feature Name="Height"> to <Value MinValue="0"/><MaxValue>5</MaxValue><Units>m</Units> high</Feature></Feature>; <Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>. <Feature><Name>Leaves</Name><Feature Name="Longevity"><Value>persistent</Value> (rarely <Value Qualifier="rarely">deciduous</Value>)</Feature>, <Feature Name="Shape"><Value>elliptic</Value> to <Value>obovate</Value></Feature>, <Feature Name="Length">(<OutMinValue>8</OutMinValue>-)<MinValue>10</MinValue>-<MaxValue>20</MaxValue>(-<OutMaxValue>30</OutMaxValue>)<Units> mm</Units> long</Feature>, <Feature Name="Width"><MinValue>3</MinValue>-<MaxValue>7</MaxValue><Units> mm</Units> wide</Feature>, <Feature Name="indumentum"><Value>glabrous</Value></Feature> shining; </Feature><Feature><Name>spines</Name>, not developing at each node, <Feature Name="Length" MinValue="0">to c. <MaxValue>1</MaxValue><Units>cm</Units> long</Feature></Feature> </Description> </Document>
or, if you prefer
<Document>
  <Description Name="Discaria pubescens">
    <Feature Name="Habit">Rigid, spreading
      <Feature Name="Life form"><Value>shrub</Value></Feature>
      <Feature Name="Height"> to c. <Value MinValue="0"/><MaxValue>1</MaxValue><Units>m</Units> high and wide</Feature>
    </Feature>;
    <Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>.
    <Feature><Name>Leaves</Name>
      <Feature Name="Longevity">soon <Value>deciduous</Value> (particularly on older plants)</Feature>,
      <Feature Name="Shape">+/- <Value>oblong</Value></Feature>,
      <Feature Name="Length">(<OutMinValue>4</OutMinValue>-)<MinValue>6</MinValue>-<MaxValue>10</MaxValue>(-<OutMaxValue>15</OutMaxValue>)<Units> mm</Units> long</Feature>,
      <Feature Name="Width"><MinValue>2</MinValue>-<MaxValue>3</MaxValue><Units> mm</Units> wide</Feature>,
      <Feature Name="Apex shape"><Value>obtuse</Value> or minutely <Value>mucronate within an apical notch</Value></Feature>,
      <Feature Name="indumentum" Value="hairy">surfaces <Value>glabrous</Value> or a few hairs present near tip</Feature>;
      <Feature><Name>stipules</Name>
        <Feature Name="Colour"><Value Value="Brown"/><Value Value="Red"/>dark reddish-brown</Feature>, c. 1 mm long,
        <Feature Name="Arrangement">often shallowly <Value>joined around the node</Value></Feature>
      </Feature>
    </Feature>
    <Feature><Name> spines</Name>
      <Feature Name="Robustness"><Value>stout</Value></Feature>,
      <Feature Name="Length"><MinValue>1.5</MinValue>-<MaxValue>4</MaxValue><Units>cm</Units> long</Feature>
    </Feature>.
  </Description>
  <Description Name="Discaria nitida">
    <Feature Name="Habit">Slender
      <Feature Name="Life form"><Value>shrub</Value></Feature>
      <Feature Name="Height"> to <Value MinValue="0"/><MaxValue>5</MaxValue><Units>m</Units> high</Feature>
    </Feature>;
    <Feature Name="Stem indumentum">stems <Value>glabrous</Value></Feature>.
    <Feature><Name>Leaves</Name>
      <Feature Name="Longevity"><Value>persistent</Value> (rarely <Value Qualifier="rarely">deciduous</Value>)</Feature>,
      <Feature Name="Shape"><Value>elliptic</Value> to <Value>obovate</Value></Feature>,
      <Feature Name="Length">(<OutMinValue>8</OutMinValue>-)<MinValue>10</MinValue>-<MaxValue>20</MaxValue>(-<OutMaxValue>30</OutMaxValue>)<Units> mm</Units> long</Feature>,
      <Feature Name="Width"><MinValue>3</MinValue>-<MaxValue>7</MaxValue><Units> mm</Units> wide</Feature>,
      <Feature Name="indumentum"><Value>glabrous</Value></Feature> shining;
    </Feature>
    <Feature><Name>spines</Name>, not developing at each node,
      <Feature Name="Length" MinValue="0">to c. <MaxValue>1</MaxValue><Units>cm</Units> long</Feature>
    </Feature>
  </Description>
</Document>
Issues
I've taken the route of marking up a textual description, using a minimum of tags. It seems to me that a description comprises a series of features with values. I've used mixed markup (text and tags interleaved) because I wanted to keep tagging to a minimum and make maximum use of the text. That is, in cases like:
<Feature Name="Life form"><Value>shrub</Value></Feature>
I've made use of the fact that "shrub" occurs in the text by enclosing it in <Value> tags. In cases like:
<Feature><Name>Leaves</Name>
I'm using a <Name> tag rather than a Name attribute, again because the word "Leaves" already occurs in the text.
This may make processing considerably more difficult, but it seems to me that there are some advantages. An alternative would be to first process the description into a DELTA-like structure and then mark that up. I chose not to go that route because I still have the dream that one day we'll have a method of semi-automatically marking up natural-language content into something like the above. The other advantage of this structure, I believe, is that it maintains all the complexity and nuance of the original natural language (which can be retrieved merely by removing all the tags). I prefer this to the DELTA process (take a natural-language description -> atomise it -> reconstruct semi-natural language by putting the atoms back together again in some fashion).
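As a rough sketch of the "remove all the tags" idea, the following uses Python's standard xml.etree.ElementTree on a small fragment of the markup above. The names SAMPLE and strip_tags are illustrative only and not part of the proposal, and the whitespace clean-up is an assumption about what "retrieving the text" should mean.

```python
import xml.etree.ElementTree as ET

# A small fragment in the style of the markup above (illustrative only).
SAMPLE = (
    '<Feature><Name>Leaves</Name> '
    '<Feature Name="Longevity">soon <Value>deciduous</Value> '
    '(particularly on older plants)</Feature>, '
    '<Feature Name="Shape">+/- <Value>oblong</Value></Feature>'
    '</Feature>'
)


def strip_tags(xml_text: str) -> str:
    """Return the character data of the markup with every tag removed."""
    root = ET.fromstring(xml_text)
    # itertext() yields element text and tails in document order.
    text = "".join(root.itertext())
    # Collapse whitespace runs, e.g. those introduced by pretty-printing.
    return " ".join(text.split())


plain = strip_tags(SAMPLE)
print(plain)
```

Note that values carried only in attributes (for example Value="Brown") disappear when the tags are removed, which is consistent with treating the text as the primary record.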
Features are nested:
<Feature><Name>Leaves</Name> <Feature><Name>Shape</Name></Feature> </Feature>
Is this allowable? XML-Spy will create a schema for this document without complaint, and the document is well-formed.
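One way to convince yourself that nesting is at least workable for a processor: the sketch below checks well-formedness and walks nested features recursively, using Python's standard xml.etree.ElementTree. It says nothing about what any particular schema allows. The helper names (feature_name, feature_paths, is_well_formed) are made up for illustration, and the "/"-separated path is just one assumed way of naming a nested feature.

```python
import xml.etree.ElementTree as ET

# Fragment in the style of the markup above (illustrative only).
SAMPLE = """
<Description Name="Discaria pubescens">
  <Feature><Name>Leaves</Name>
    <Feature Name="Shape">+/- <Value>oblong</Value></Feature>,
    <Feature><Name>stipules</Name>
      <Feature Name="Colour"><Value Value="Brown"/><Value Value="Red"/>dark reddish-brown</Feature>
    </Feature>
  </Feature>
</Description>
"""


def is_well_formed(xml_text):
    """True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def feature_name(feature):
    """Read the name from the Name attribute or, failing that, a <Name> child."""
    name = feature.get("Name")
    if name is None:
        child = feature.find("Name")
        name = child.text if child is not None and child.text else ""
    return name.strip()


def feature_paths(element, prefix=()):
    """List every nested feature as a 'Parent/Child' path, in document order."""
    paths = []
    for feature in element.findall("Feature"):
        path = prefix + (feature_name(feature),)
        paths.append("/".join(path))
        paths.extend(feature_paths(feature, path))
    return paths


paths = feature_paths(ET.fromstring(SAMPLE))
print(is_well_formed(SAMPLE), paths)
```

The feature_name helper is where the cost of allowing both a Name attribute and a <Name> tag shows up: every consumer has to look in two places.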
In this example, there is not a 1:1 correspondence between the descriptions - e.g. the description for D. pubescens has
<Feature Name="Apex shape">
while that for D. nitida does not. This is anathema to most of us, but is a real world case. We'll have to deal with it.
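A minimal sketch of how such gaps might be detected, assuming Python's standard xml.etree.ElementTree and a cut-down version of the two descriptions. The function names and the "/"-separated feature paths are assumptions for illustration, not part of the markup proposal.

```python
import xml.etree.ElementTree as ET

# Cut-down versions of the two descriptions (illustrative only).
SAMPLE = """
<Document>
  <Description Name="Discaria pubescens">
    <Feature><Name>Leaves</Name>
      <Feature Name="Shape">+/- <Value>oblong</Value></Feature>,
      <Feature Name="Apex shape"><Value>obtuse</Value></Feature>
    </Feature>
  </Description>
  <Description Name="Discaria nitida">
    <Feature><Name>Leaves</Name>
      <Feature Name="Shape"><Value>elliptic</Value> to <Value>obovate</Value></Feature>
    </Feature>
  </Description>
</Document>
"""


def feature_name(feature):
    """Read the name from the Name attribute or, failing that, a <Name> child."""
    name = feature.get("Name")
    if name is None:
        child = feature.find("Name")
        name = child.text if child is not None and child.text else ""
    return name.strip()


def feature_paths(element, prefix=()):
    """Collect every nested feature as a 'Parent/Child' path."""
    paths = set()
    for feature in element.findall("Feature"):
        path = prefix + (feature_name(feature),)
        paths.add("/".join(path))
        paths |= feature_paths(feature, path)
    return paths


def missing_features(document):
    """For each description, list features scored elsewhere but absent here."""
    per_taxon = {d.get("Name"): feature_paths(d)
                 for d in document.findall("Description")}
    everything = set().union(*per_taxon.values())
    return {name: sorted(everything - paths)
            for name, paths in per_taxon.items()}


report = missing_features(ET.fromstring(SAMPLE))
print(report)
```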
In this example there is no character list against which character names etc. can be validated (e.g. to pick up the above problem). This is something we need to deal with.
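To make the idea concrete, here is a sketch of validating feature names against a character list, again with Python's standard xml.etree.ElementTree. CHARACTER_LIST is entirely hypothetical (no such list exists in this example), and the exact, case-sensitive matching is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical character list; a real one would have to be agreed separately.
CHARACTER_LIST = {"Leaves", "Shape", "Apex shape", "Indumentum"}

# Fragment in the style of the markup above (illustrative only).
SAMPLE = """
<Description Name="Discaria nitida">
  <Feature><Name>Leaves</Name>
    <Feature Name="Shape"><Value>elliptic</Value> to <Value>obovate</Value></Feature>,
    <Feature Name="indumentum"><Value>glabrous</Value></Feature> shining;
  </Feature>
</Description>
"""


def feature_name(feature):
    """Read the name from the Name attribute or, failing that, a <Name> child."""
    name = feature.get("Name")
    if name is None:
        child = feature.find("Name")
        name = child.text if child is not None and child.text else ""
    return name.strip()


def unknown_features(description, allowed):
    """Return feature names used in the description but not in the list."""
    names = {feature_name(f) for f in description.iter("Feature")}
    return sorted(names - allowed)


unknown = unknown_features(ET.fromstring(SAMPLE), CHARACTER_LIST)
print(unknown)
```

With exact matching, a difference in capitalisation alone ("indumentum" against "Indumentum") is reported as an unknown name; whether that is the right behaviour is a design decision for the list itself.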
Be gentle with me.
Your mission, should you choose to accept it, is to show that I'm crazy and do better than this.
This email will self-destruct in 5 seconds
Cheers - k
Kevin Thiele