[tdwg-content] Expressing some relationships in DwC?

Richard Pyle deepreef at bishopmuseum.org
Wed Oct 26 19:12:06 CEST 2011


> I have lost track of the number of times I've asked this question
> (I think it's somewhere around 4 times now) on the list without
> an answer: can anyone show me a functioning example where
> somebody is actually using the ResourceRelationship terms
> in any form?

That makes two of us.

> If I am understanding the use of the ResourceRelationship terms
> correctly (which I'm not at all sure about), dwc:resourceID represents
> an identifier for the subject of the relationship, dwc:relatedResourceID
> represents an identifier for the object of the relationship, and
> resourceRelationshipID represents an identifier for an _instance_ of a
> relationship which is further elaborated in the record for that instance
> by providing a string value for relationshipOfResource (using a controlled
> vocabulary) along with additional information about the relationships
> (according to whom, date, remarks).  What I think I'm hearing in this
> thread is a desire to start using these terms more widely and to improve
> interoperability by standardizing the vocabulary used for
> relationshipOfResource.
> I also think I'm hearing that it would be a good idea for values of the xxID
> terms to be GUIDs.

That's *exactly* my perspective.

> So what I'm wondering is why doing that would be better than just
> creating URI-identified object properties (a.k.a. predicates) that
> describe the relationship between the object and subject?
> Using the ResourceRelationship terms just makes "understanding" the
> relationship harder by requiring that the data consumer "look up"
> the value of the relationshipOfResource string for every instance
> of a relationship and then figure out if it's the same kind of
> relationship as one that the consumer already knows or cares about.

Hmmm... not sure I follow.  My understanding is that resourceRelationship is
just a DwC wrapper to exchange what effectively amounts to triples, but with
some additional properties tied to each triple.  In other words, a
generalized way of packaging any kind of relationship (1:1, 1:M, M:M) in a
format that is more in keeping with the way DwC works than RDF is.  Perhaps
we should just focus on RDF/triples and not bother with a DwC package for
these kinds of relationships.

In any case, I don't see why relationshipOfResource cannot be populated with
a URI-identified predicate.  Or, perhaps introduce a
relationshipOfResourceID term.
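
To make the "triple plus extra properties" idea concrete, here is a minimal Python sketch. It assumes one record per relationship, with field names that simply mirror the ResourceRelationship terms; the example.org URIs, the "orthographicVariantOf" predicate, and the curator name are made-up placeholders, not real identifiers or an agreed vocabulary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResourceRelationship:
    """One relationship record; field names mirror the DwC terms."""
    resource_id: str                  # dwc:resourceID (subject)
    relationship_of_resource: str     # dwc:relationshipOfResource (predicate)
    related_resource_id: str          # dwc:relatedResourceID (object)
    resource_relationship_id: Optional[str] = None
    relationship_according_to: Optional[str] = None
    relationship_established_date: Optional[str] = None
    relationship_remarks: Optional[str] = None

    def as_triple(self) -> Tuple[str, str, str]:
        """Drop the metadata and return the bare (subject, predicate, object)."""
        return (
            self.resource_id,
            self.relationship_of_resource,
            self.related_resource_id,
        )


# All values below are hypothetical placeholders for illustration only.
rel = ResourceRelationship(
    resource_id="http://example.org/name/A",
    relationship_of_resource="http://example.org/terms/orthographicVariantOf",
    related_resource_id="http://example.org/name/B",
    relationship_according_to="Example Curator",
    relationship_established_date="2011-10-26",
)

print(rel.as_triple())
```

The point of the sketch is only that the predicate slot can hold a URI just as easily as a string, and that the triple can be recovered from the record at any time without losing the attached metadata.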

A suggestion was made by John Deck within BiSciCol to propose two new terms
for this class: resourceType and relatedResourceType, which would be
populated by the name of the DwC class from which the respective two
identifiers are drawn.  This would include the kinds of things being related
without needing to resolve the two identifiers to determine that (e.g., via
dcterms:type from each record).  Whether or not BiSciCol utilizes this DwC
class, I think it's still worth exploring and putting some use cases out
there to test its efficacy.

> As I see it, the bottom line is that if we agree that it would be a
> best-practice for the xxID values to be GUIDs (which would either
> be URIs or could be URIs that are proxied UUIDs for UUID believers)

I am a "UUID believer" (or, more correctly, a believer in 128-bit
identifiers, of which UUID is merely one of potentially many protocols
for generation, and a convention for rendering that is more human-friendly
than 128 consecutive 1's & 0's), and I am perfectly happy representing them
in this context as URIs. I don't even think of it as a proxy anymore -- I
think of it as a combined identifier (UUID) and resolution metadata (http or
other protocol prefix with domain+namespace), such that they fulfill the
function of a persistent, actionable identifier sensu TDWG.
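
As a rough illustration of "identifier plus resolution metadata", the Python sketch below wraps a UUID in an HTTP URI and recovers it again. The resolver prefix (example.org) and both helper names are assumptions made up for this example; the UUID value is an arbitrary sample, not a real record.

```python
import uuid

# Hypothetical resolver prefix: protocol + domain + namespace.
RESOLVER_PREFIX = "http://example.org/id/"


def to_actionable_uri(identifier: uuid.UUID, prefix: str = RESOLVER_PREFIX) -> str:
    """Combine the bare identifier with the resolution metadata."""
    return prefix + str(identifier)


def extract_uuid(uri: str) -> uuid.UUID:
    """Recover the identifier by parsing the last path segment of the URI."""
    return uuid.UUID(uri.rstrip("/").rsplit("/", 1)[-1])


ident = uuid.UUID("12345678-1234-5678-1234-567812345678")  # arbitrary sample
uri = to_actionable_uri(ident)

# The same 128-bit value, rendered the less human-friendly way.
bits = format(ident.int, "0128b")

print(uri)
print(len(bits))
```

Swapping the prefix changes where the identifier resolves without changing the identifier itself, which is why the URI form need not be thought of as a proxy.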

But I digress...

> and that the relationships should be standardized, then the resource
> relationships boil down to RDF triples.

Effectively, yes.  But as I mentioned above, there may be value in defining
a DwC mechanism (indeed, it's already defined, just not really used -- so I
should have said "refining" instead of "defining") that allows packaging of
these "triples" with associated metadata (relationshipAccordingTo,
relationshipEstablishedDate, relationshipRemarks, and possibly resourceType
and relatedResourceType) in a package that's easier for many providers to
generate and digest than full-blown serialized RDF.  Perhaps I'm way off
base here, but my discussions at TDWG gave me the sense that I'm not the
only one who sees this as a desirable mechanism for data exchange that has
its own space of utility separate from full RDF-serialized content.  My
sense (and again, I may be way off base) is that the RDF becomes really
valuable when it is broadly implemented, and is working off a reasonably
functional and well-defined ontology. The day when those are reality will
certainly come (I hope!), but in the meantime, we still need an exchange
mechanism for certain content that is more complex than what can be met with
DwCA.

> There's a URI for the subject, a URI for the object, and there could
> be a URI for the relationshipOfResource which would have an
> rdfs:label property that was the controlled value for the string
> that describes the relationship.

I am fully supportive of this.

> The only difference that I see is
> that the RDF triples could be more efficient to handle.

...or more difficult to handle, depending on who you are, what your area of
technical expertise is, and what you want to use the standard for (e.g.,
content exchange vs. reasoning).

> Don't like RDF or are scared of it?  Fine, make yourself an
> Excel spreadsheet with three columns headed "subject",
> "predicate", and "object" (or use your favorite database method).
> Put the appropriate URI in each column.  Voilà!  Resource relationships.

Great!  Add in columns for relationshipAccordingTo,
relationshipEstablishedDate, relationshipRemarks, resourceType and
relatedResourceType (would that be octuples?), so I don't have to serialize
these standard properties of the relationships into additional triples, and
I'm sold!
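
A sketch of what that extended spreadsheet might look like as a CSV file, written and read back with Python's standard csv module. The column headings follow the names used in this thread; the single row, its example.org URIs, and the "identifiedAs" predicate are invented placeholders.

```python
import csv
import io

COLUMNS = [
    "subject", "predicate", "object",
    "relationshipAccordingTo", "relationshipEstablishedDate",
    "relationshipRemarks", "resourceType", "relatedResourceType",
]

# One made-up row; every value here is a placeholder.
rows = [
    {
        "subject": "http://example.org/occurrence/1",
        "predicate": "http://example.org/terms/identifiedAs",
        "object": "http://example.org/taxon/42",
        "relationshipAccordingTo": "Example Identifier",
        "relationshipEstablishedDate": "2011-10-26",
        "relationshipRemarks": "",
        "resourceType": "Occurrence",
        "relatedResourceType": "Taxon",
    },
]

# Write the table to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
table_text = buffer.getvalue()

# Reading it back and keeping the first three columns recovers bare triples.
triples = [
    (r["subject"], r["predicate"], r["object"])
    for r in csv.DictReader(io.StringIO(table_text))
]

print(table_text)
print(triples)
```

A consumer that only cares about the triples can ignore the extra columns, while one that wants the "according to whom, when, and why" gets it in the same row.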

Aloha,
Rich


The one thing about implementing the ResourceRelationship class vocabulary
that I think may be an improvement is use of the generic terms for "by
whom", "date", and "remarks".  At the present moment, we are in the process
of expanding DwC by adding class-specific terms to do these things in nearly
every class.  We have recordedBy, georeferencedBy, identifiedBy,
occurrenceRemarks, locationRemarks, eventDate, georeferencedDate, etc.
rather than just using relationshipAccordingTo, relationshipEstablishedDate,
and relationshipRemarks which could do the same thing generically in any
class.  Add columns to the above-described Excel spreadsheet for
relationshipAccordingTo, relationshipEstablishedDate, and
relationshipRemarks if you want to track those things.

I think I understand why the ResourceRelationship class was created.  It
seems like a way to move forward under the worst-case scenario where GUIDs
never get off the ground, where no one ever agrees what the core
biodiversity classes are, and where no one ever establishes object
properties that connect those classes.  But if no one ever has the guts to
try to lay out the core classes and object properties that connect them,
then what good are a million unstandardized ResourceRelationship instances
that can't be unambiguously IDed?  Nobody will ever be able to figure out
anything useful from them and Rod Page's prophecy of doom will come true.

Somebody please show me where I'm wrong here...
Steve

Richard Pyle wrote:
Hmmm.... watch out for that tricky word "valid".  It means different things
to botanists & zoologists. The term "accepted" is generally seen as a more
code-neutral term to mean "valid" (sensu zoology) or "correct"/"accepted"
(sensu botany).  But "valid" in the botanical sense means "validly
published" (= "available" sensu zoology).  I'm not entirely sure which
sense of "valid" is meant in this context.


More fundamentally, however, I'd like to report that a number of folks at
TDWG seemed to have converged on the same idea that, perhaps, we should be
using resourceRelationship more frequently (perhaps a *LOT* more
frequently).  A lot of these terms that effectively represent the functional
equivalent to "foreign keys" might be better packaged in the more open-ended
structure of resourceRelationship.  In fact, at one of the sessions at TDWG
(I believe it was at the AudubonCore break-out session), we discussed the
idea of DwCA "2.0", which would essentially define n-number of "Cores", and
then package the relationships among them via a set of resourceRelationship
records.  This idea emerged from a discussion about how people have been
trying to "force" many-to-many sorts of data into the one-to-many DwCA
format.  The beauty of using a more generic resourceRelationship set for
this function is that it allows one-to-one, one-to-many, and many-to-many
relationships all in one structure.  It may seem klunky now, but if we used
it as a general method to describe all relationships between instances of
DwC "classes", it would become pretty straightforward, I think.
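
To illustrate why one generic relationship set covers all three cardinalities, here is a small Python sketch. It assumes two hypothetical "cores" (occurrences and media) with made-up identifiers and a made-up "depictedIn" relationship; the cardinality helper is purely illustrative and not part of any DwC tooling.

```python
from collections import defaultdict

# Hypothetical cores: identifiers only, all invented for this example.
occurrences = {"occ:1", "occ:2"}
media = {"img:1", "img:2"}

# One flat list of relationship records links the two cores.
relationships = [
    ("occ:1", "depictedIn", "img:1"),
    ("occ:1", "depictedIn", "img:2"),
    ("occ:2", "depictedIn", "img:2"),
]

# Index the records in both directions.
by_subject = defaultdict(set)
by_object = defaultdict(set)
for subject, _predicate, obj in relationships:
    by_subject[subject].add(obj)
    by_object[obj].add(subject)


def cardinality(by_subject, by_object):
    """Classify the relationship set from its two indexes."""
    many_objects = any(len(v) > 1 for v in by_subject.values())
    many_subjects = any(len(v) > 1 for v in by_object.values())
    if many_objects and many_subjects:
        return "many-to-many"
    if many_objects:
        return "one-to-many"
    if many_subjects:
        return "many-to-one"
    return "one-to-one"


print(cardinality(by_subject, by_object))
```

Nothing in the record structure changes between the cases; the cardinality is simply a property of which records happen to be present.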

Something to think about, anyway...

Aloha,
Rich


-----Original Message-----
From: tdwg-content-bounces at lists.tdwg.org [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Tony.Rees at csiro.au
Sent: Tuesday, October 25, 2011 4:34 PM
To: mdoering at gbif.org
Cc: tdwg-content at lists.tdwg.org
Subject: Re: [tdwg-content] Expressing some relationships in DwC?

Hi Markus,

You wrote:


> I begin to wonder if a new term dwc:validNameUsageID would solve this
> issue gracefully and remove the need for a relationship extension.

Yes, I believe this would cover both the cases I need, I think, when
accompanied by nomenclatural status = misspelling / nomenclatural status =
nomen nudum... - comments, anyone?

Cheers - Tony

________________________________________
From: "Markus Döring (GBIF)" [mdoering at gbif.org]
Sent: Wednesday, 26 October 2011 1:43 AM
To: Rees, Tony (CMAR, Hobart)
Cc: tdwg-content at lists.tdwg.org
Subject: Re: [tdwg-content] Expressing some relationships in DwC?

Hi Tony,
thanks for these practical questions. See inline for answers.
Markus


> I have a few nomenclatural relationships between names that I would like to
> express using DwC, and would like to know the preferred way to do this if
> any. The relationships are as follows:
>
> (1) Point a nomen novum to the basionym it replaces. From reading there
> was formerly a concept basionym/basionymID, apparently this is now
> replaced with originalNameUsage/originalNameUsageID. So one question is,
> is this sufficient to infer this is a basionym, when accompanied by
> nomenclaturalStatus = 'nomen novum'?

yes, that is exactly right. As far as I understand the term basionym is more of
a botanical term and was not used as the final dwc term therefore.

> (2) Point an orthographic variant to the name which it is a variant of
> (whether or not the latter is now the accepted name). In other words, if
> name A is a variant of name B which is now a synonym of name C, I capture
> the A=>C relationship with a synonym assertion, but I want a way to
> capture the A=>B relationship too.

This is only possible with an extension I am afraid. For example the generic
dwc relationship one:
http://rs.gbif.org/extension/dwc/resource_relation.xml

> (3) Point a nomen nudum to a validly published instance that comes later
> (or do the same in reverse, i.e. this name was preceded by xxx as a nomen
> nudum). Again, this should be independent of whether the validly published
> name is an accepted name or now a synonym of something else.

same problem as above.
I begin to wonder if a new term dwc:validNameUsageID would solve this
issue gracefully and remove the need for a relationship extension.

> Advice appreciated,
>
> Regards - Tony Rees
_______________________________________________
tdwg-content mailing list
tdwg-content at lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content


This message is only intended for the addressee named above.  Its contents
may be privileged or otherwise protected.  Any unauthorized use, disclosure
or copying of this message or its contents is prohibited.  If you have
received this message by mistake, please notify us immediately by reply mail
or by collect telephone call.  Any personal opinions expressed in this
message do not necessarily represent the views of the Bishop Museum.




--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu


