[tdwg-content] Expressing some relationships in DwC?

Richard Pyle deepreef at bishopmuseum.org
Wed Oct 26 12:32:32 CEST 2011


Hi Markus,

> Well sure, valid is overloaded just as accepted is which we nevertheless
> use for the "accepted" taxonomic relation.
> Leaving aside what the actual term name is, validNameUsage,
> correctNameUsage, amendedNameUsage or sth else - it seems to fix the
> problem, doesn't it?

Perhaps, but I guess because of the ambiguity issue with "valid", I'm not
completely sure which specific problem is being solved in this case. If, as
from Tony's original email, it is to link a nomen nudum usage instance to
its corresponding validly published/available usage instance, that makes
some sense.  But then do we need something similar for Emendations (e.g.,
emendedUsageInstance) and other sorts of nomenclatural relationships?  How
many of these "foreign keys" will we ultimately need?  This is why I'm
thinking in terms of something more relational.
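
To illustrate the difference, here is a minimal Python sketch. It assumes one usage record with two per-type foreign-key columns. The column names validNameUsageID and emendedNameUsageID and all identifiers are hypothetical, not existing DwC terms. The function turns each such column into one generic (subject, type, object) row, so a new relationship type becomes a new value rather than a new column.

```python
# Hypothetical usage record; column names and identifiers are illustrative only.
wide_record = {
    "taxonID": "usage-1",
    "validNameUsageID": "usage-2",
    "emendedNameUsageID": "usage-3",
}


def to_relationship_rows(record, id_field="taxonID"):
    """Turn each per-type foreign-key column into one (subject, type, object) row."""
    subject = record[id_field]
    return [
        (subject, column, target)
        for column, target in record.items()
        if column != id_field and target  # skip the key itself and empty links
    ]


rows = to_relationship_rows(wide_record)
```

With this shape, the number of columns stays fixed however many kinds of nomenclatural relationship are eventually needed.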

> "DwCa 2.0" would be a drastic change which feels like reinventing the
> relational wheel. It might be better to go with sth existing in that case,
> e.g. Google Dataset Publishing Language:
> http://code.google.com/apis/publicdata/

Yes, Tim just sent me that link off-list as well -- I'll take a look when I
have some time.

> Using the generic relationship extension is good, but also has serious
> drawbacks. Immediately these come to my mind:
>  - it is much harder to publish and consume data in this format. Without
> publishing tools you will be lost.
>  - the controlled vocabulary for the relationship type must be *very*
> controlled. Especially we need to avoid overloading which will easily
> happen very quickly, see valid or accepted.
>  - all linked resources must have unique ids across all classes, pretty
> much globally unique ids. Nice to have, but hard for publishers.

These are all valid points to be sure.  However, I would like to think that
many content providers are (gradually?) improving their ability to represent
data in this more normalized way.  I would also like to think that this
trend is continuing.  My sense from TDWG is that perhaps we're getting close
to critical mass for enough providers, who have more robust needs for data
exchange, to begin exploring more sophisticated ways of exchanging richer
datasets.  If, indeed, this is the general trend, then it would be useful
for a subset of providers with such capabilities and needs to begin
exploring more robust implementations of dwc:resourceRelationship, including
perhaps a CSV/Archive version modeled after DwCA.  The BiSciCol project has
been looking very closely at this sort of thing.  It may not involve DwC
(i.e., it may be framed in more semantic terms), but I still think there
could be a need for a more fleshed-out definition and use case for
resourceRelationship to meet the (growing?) list of providers who wish to
express many-to-many relationships in a standardized way.  Yes, we certainly
would need a very carefully crafted controlled vocabulary for
dwc:relationshipOfResource -- but I see that as a very worthwhile exercise
in any case (especially as we move more and more towards semantic
presentations of our content).  Same can be said for persistent, actionable
identifiers.
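
As a rough sketch of what a CSV version might look like, the Python below writes relationship rows using three existing dwc:ResourceRelationship terms (resourceID, relationshipOfResource, relatedResourceID) and rejects any relationship type outside a controlled list. The two vocabulary values and the identifiers are placeholders assumed for illustration; no such vocabulary has been agreed.

```python
import csv
import io

# Assumed vocabulary: placeholder labels, not an agreed TDWG vocabulary.
ALLOWED_RELATIONSHIPS = {"validly published usage of", "emendation of"}

# Column names are Darwin Core ResourceRelationship terms.
FIELDS = ["resourceID", "relationshipOfResource", "relatedResourceID"]


def write_relationships(rows, out):
    """Write relationship rows as CSV, rejecting types outside the vocabulary."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        relationship = row["relationshipOfResource"]
        if relationship not in ALLOWED_RELATIONSHIPS:
            raise ValueError(f"uncontrolled relationship type: {relationship!r}")
        writer.writerow(row)


# Hypothetical identifiers; a real file would need ids unique across all classes.
example_rows = [
    {"resourceID": "usage-2",
     "relationshipOfResource": "validly published usage of",
     "relatedResourceID": "usage-1"},
    {"resourceID": "usage-3",
     "relationshipOfResource": "emendation of",
     "relatedResourceID": "usage-2"},
]

buffer = io.StringIO()
write_relationships(example_rows, buffer)
csv_text = buffer.getvalue()
```

Any number of such rows can point to or from the same resource, which is what allows many-to-many relationships without adding columns to the core file.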

Aloha,
Rich


