Taxon debate synthesis?

Thu Nov 17 10:35:46 CET 2005

I agree with the idea of providing most of the information attached to a GUID in the metadata.  From talking to the guys at IBM who wrote the LSID implementation, I understand that the data for an LSID was mainly intended for quite static data objects, such as documents and images.  Most data, and especially dynamic data, is best returned as metadata.  And RDF 

As for breaking opacity and having "meaning" within the GUID itself, this goes against a fundamental software principle that an ID should have no meaning. The namespaces of LSIDs are reasonably uncontrolled, so relying on them will no doubt result in erroneous assumptions. Anyway, in some cases you will know the type of object a GUID points to from the context it resides in. For example, in the case Rod gave, <isMisspellingOf rdf:resource="urn:lsid:authority:namespace:123" />, you will know that in your database you store links of misspellings to other taxon names.
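
As a rough sketch of that point (not taken from any LSID implementation), the type of the target can come from the relationship rather than from parsing the GUID. The names PREDICATE_RANGE, object_type and store_link, and the "TaxonName" label, are all hypothetical and exist only for illustration:

```python
# Hypothetical mapping from a relationship to the type of object it points at.
# Nothing here is part of the LSID specification.
PREDICATE_RANGE = {
    "isMisspellingOf": "TaxonName",
    "isBasionymOf": "TaxonName",
}


def object_type(predicate):
    """Infer the target type from the relationship; the GUID is never inspected."""
    return PREDICATE_RANGE.get(predicate, "unknown")


def store_link(links, subject_guid, predicate, object_guid):
    """Record a link, treating both GUIDs as opaque strings."""
    links.append((subject_guid, predicate, object_guid, object_type(predicate)))
    return links
```

The GUID strings are passed through untouched, so nothing breaks if an authority chooses an unexpected namespace.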

Kevin

>>> deepreef at BISHOPMUSEUM.ORG 17/11/2005 9:47:43 a.m. >>>

> Before we have a flurry of name flavours (yuck), I'd suggest that many
> of these "flavours" are really relationships between names,

Absolutely! -- I was using "flavours" in the sense that they would be
defined in terms of relationships, not in terms of the data objects
themselves.  Sorry if it seemed I meant it that way.  Perhaps there is value
in establishing a metadata attribute for "IsBasionym" or something like
that, but I think these things should really be defined by relationships
among GUID-represented data objects, rather than as attributes of the data
objects themselves.

> Hence, I think before we embark on adding complexity, why not have
> GUIDs for names and then specify relationships between these names?
> This is what I've been doing with the LSIDs we serve here.

Why not have GUIDs for NameUsages, and then specify relationships between
these NameUsages? (simply because "names" are much more difficult to define
than "NameUsages")

> We could do with an ontology for these relationships so that, for
> example, if somebody asks for the synonyms of name "x", and two names
> are linked by the relationship "isBasionymOf" we recover that
> relationship (in other words, our tools know that "isBasionymOf" is a
> kind of synonym).

Sounds good to me!
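
A minimal sketch of what such an ontology lookup might look like, in plain Python rather than a real RDF toolkit. The property names other than "isBasionymOf" and "isMisspellingOf", the example.org LSIDs, and the choice to treat synonymy as symmetric are all assumptions made for illustration:

```python
# Hypothetical relationship ontology: each property maps to its parent property.
SUBPROPERTY_OF = {
    "isBasionymOf": "isSynonymOf",
    "isMisspellingOf": "isSynonymOf",
}

# Hypothetical triples: (subject GUID, property, object GUID).
TRIPLES = [
    ("urn:lsid:example.org:names:1", "isBasionymOf", "urn:lsid:example.org:names:2"),
    ("urn:lsid:example.org:names:3", "isMisspellingOf", "urn:lsid:example.org:names:2"),
    ("urn:lsid:example.org:names:4", "isPublishedIn", "urn:lsid:example.org:refs:9"),
]


def is_kind_of(prop, ancestor):
    """Return True if prop equals ancestor or descends from it in the ontology."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop == ancestor:
            return True
        seen.add(prop)
        prop = SUBPROPERTY_OF.get(prop)
    return False


def related(guid, prop):
    """GUIDs linked to guid, in either direction, by prop or any sub-property."""
    found = set()
    for s, p, o in TRIPLES:
        if is_kind_of(p, prop):
            if s == guid:
                found.add(o)
            elif o == guid:
                found.add(s)
    return sorted(found)
```

Asking for related("urn:lsid:example.org:names:2", "isSynonymOf") picks up both the basionym and the misspelling, because each is declared as a kind of synonym, while the "isPublishedIn" link is ignored.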

> Again, let's try and keep things simple.

Agreed!  And simple not so much for the sake of simplicity, but more for the
sake of flexibility.

> This leads to a much more general topic, but I think if we adopt RDF
> for our metadata we gain a lot of query and inference tools "for free."
> GUIDs by themselves are boring, it's the metadata that matters.

Agreed!

> > My general feeling is that most of the useful information is metadata,
> > the
> > only candidates for data (in my mind) being the literal text string
> > (NameString), a ref pointer to a defined Documentation instance (if
> > any),
> > and perhaps position information (e.g. page) within the Documentation
> > instance.  All of these might be initially recorded in error, but could
> > be
> > corrected via the LSID revision id (version).
>
> I think everything about a name probably is metadata, and doing this
> opens up the possibility of doing useful things with that metadata (see
> above). In the LSID stuff we've been doing at Glasgow the only time we
> provide data is if we are serving images, in which case it makes sense
> to serve a stream of bytes (e.g., a TIFF image). In one sense, data
> isn't that interesting (or, put another way in many cases it requires a
> person to make sense of it, whereas computers can handle metadata).

After having just spent the past couple hours brushing up my LSID
"geek-speak", I agree completely with Rod on this.

Aloha,
Rich

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
WARNING: This email and any attachments may be confidential and/or
privileged. They are intended for the addressee only and are not to be read,
used, copied or disseminated by anyone receiving them in error.  If you are
not the intended recipient, please notify the sender by return email and
delete this message and any attachments.

The views expressed in this email are those of the sender and do not
necessarily reflect the official views of Landcare Research.

Landcare Research
http://www.landcareresearch.co.nz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
