[tdwg-content] Darwin Core Standard - proposed change in governance
Éamonn Ó Tuama [GBIF]
eotuama at gbif.org
Thu Jan 22 16:35:36 CET 2015
Dublin Core tackles the issue of borrowing from other vocabularies through
the construct of Dublin Core Application Profiles [1], something separate
from the specification of DC itself.
Éamonn
[1] http://dublincore.org/documents/2009/05/18/profile-guidelines/
From: tdwg-content-bounces at lists.tdwg.org
[mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Steve Baskauf
Sent: 22 January 2015 15:41
To: Tim Robertson
Cc: TDWG Content Mailing List; "Markus Döring (GBIF)"
Subject: Re: [tdwg-content] Darwin Core Standard - proposed change in
governance
I'm not sure whether anyone ever established that there was a consensus on
this, but it has been suggested repeatedly that the basic Darwin Core
vocabulary should have minimal restrictions and semantics imposed upon it.
This allows other layers to be built upon it (XML schemas, RDF-based
ontologies, etc.) that are intended to serve more specialized purposes.
From the standpoint of the normative RDF document, that means that it should
include very little in the way of semantics that would create entailments for
semantic clients. That doesn't mean that there is no value in having that
RDF, because the resources defined by that document can be nodes that can be
linked and restricted by other ontologies. This has been done or is being
done by Darwin-SW and BCO, for example.
Having the RDF document contain few machine-interpretable properties is a
separate issue from whether it is a good idea for that document to be
normative. Bob makes a good point that if it is the normative document yet
does not contain the information required for defining how the terms should
be used, it may be inadequate. We already have that situation with the
Dublin Core terms. Since they are defined as RDF by DCMI, we don't include
them in our normative RDF document. So if potential users were to try to
use Darwin Core by looking solely at the normative document (which they
should be able to do, at least in theory), they would not know that Darwin
Core includes Dublin Core terms. This is a point that should be considered
in discussions on a TDWG-wide policy on vocabulary documentation. It's an
even bigger issue in Audubon Core, since a much higher proportion of its
terms are "borrowed" from other vocabularies than DwC.
Steve
Tim Robertson wrote:
Hi all,
I don't think there has ever been any intention for DwC to impose that kind
of restriction, regardless of the format chosen for the authoritative
version (e.g. be it wiki text, RDF, HTML or PDF).
I'd expect a schema to define that restriction (xsd, class definition etc).
Is that how others see it? I believe this is what John meant when he wrote
the spec [1]:
"There is meant to be a clear separation between the terms defined in this
standard and the applications that make use of them. For example, though the
data types and constraints are not provided in the term definitions,
recommendations are made about how to restrict the values where
appropriate."
Cheers,
Tim
[1] http://rs.tdwg.org/dwc/index.htm
On 22 Jan 2015, at 06:29, Bob Morris <morris.bob at gmail.com> wrote:
If dwc_normative.rdf is to be normative, will there be any
specification for how to determine whether data coded to it is
conformant? For example, there are some terms which, at first glance,
deserve to have a cardinality restriction. Consider e.g.
minimumDistanceAboveSurfaceInMeters and suppose I code DwC data
against [2]. Suppose a Location <L> in my data has two triples
<L> dwc:minimumDistanceAboveSurfaceInMeters "1.0"^^xsd:decimal .
<L> dwc:minimumDistanceAboveSurfaceInMeters "2.0"^^xsd:decimal .
Will there be a way that a machine can determine that such data is not
conformant to dwc_normative.rdf? Or is it meant to be conformant? Or
is the expectation that there is no machine-applicable specification
of conformance? (In this case it's particularly hard for me to
understand the utility of normative RDFS.)
Bob
[2] https://github.com/tdwg/dwc/blob/dwc_cleanup/terms/dwc_normative.rdf
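The kind of machine check Bob asks about can be pictured with a minimal sketch. It assumes the data is already available as (subject, predicate, object) tuples, and it assumes a hypothetical rule that certain properties may carry at most one value per subject. Neither the rule table nor the function name comes from dwc_normative.rdf; the check is ordinary application code layered on top of the vocabulary, not something the RDF document itself expresses.

```python
from collections import defaultdict

# Hypothetical rule table: properties assumed to allow at most one value per
# subject. Nothing in dwc_normative.rdf states this; it is an illustration.
SINGLE_VALUED = {"dwc:minimumDistanceAboveSurfaceInMeters"}


def find_cardinality_violations(triples, single_valued=SINGLE_VALUED):
    """Map (subject, predicate) to its sorted values when more than one distinct value occurs."""
    values = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in single_valued:
            values[(subject, predicate)].add(obj)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}


# The two triples from the example above, written as plain tuples.
triples = [
    ("L", "dwc:minimumDistanceAboveSurfaceInMeters", "1.0"),
    ("L", "dwc:minimumDistanceAboveSurfaceInMeters", "2.0"),
]
print(find_cardinality_violations(triples))
```

Keeping the rule table outside the vocabulary matches the layering discussed in this thread: the term definitions stay unrestricted, and each application decides which restrictions to enforce.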
On Wed, Jan 21, 2015 at 10:44 PM, Markus Döring <mdoering at gbif.org> wrote:
Hi Steve,
you are right, the current live RDF files are awkward, especially that the
normative dwctermshistory.rdf document does not contain the real dwc term
URIs! It only contains versioned URIs with an appended date.
As you said the real terms are covered currently by dwcterms.rdf which is
not normative :(
In those rdf documents dwc currently defines a couple of dwc specific
properties, for example to indicate the status of a term:
<dwcattributes:status>recommended</dwcattributes:status>
In our cleanup branch [1] of dwc that we worked on we decided it would be
better to make the current terms the normative document [2].
Any advice on how to better model the RDF so it is usable for ontologies is
highly appreciated. We basically followed the existing approach which did
not use OWL. I vaguely remember during the TDWG ontology discussions that it
was preferred then to just stick with RDF schema to allow maximum reuse?
Markus
[1] https://github.com/tdwg/dwc/tree/dwc_cleanup
[2] https://github.com/tdwg/dwc/blob/dwc_cleanup/terms/dwc_normative.rdf
On 21 Jan 2015, at 21:03, Steve Baskauf <steve.baskauf at vanderbilt.edu>
wrote:
With respect to item 1:
I would support changing to a different RDF document from
http://rs.tdwg.org/dwc/rdf/dwctermshistory.rdf to something else. When a
semantic client dereferences a DwC term and requests content-type:
application/rdf+xml, it is not served the normative document. It gets a
different one (dwcterms.rdf ?). At least this is what happened in the past
when I tested this. It has never been clear to me how a client is supposed
to "follow its nose" to find the normative definitions. The normative
document is further complicated by having URIs with appended dates which I
suppose are connected to the non-dated versions in some way - again, I'm not
clear how.
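The dereferencing test described above can be sketched as follows. This is a minimal illustration using Python's standard library; the term URI is an assumed example in the usual DwC terms namespace, and the function names are hypothetical. A client states the representation it wants in the HTTP Accept header, and the final URL after redirects shows which document was actually served. What the server returns is not asserted here.

```python
import urllib.request

# Assumed example term URI; substitute any dereferenceable DwC term.
TERM_URI = "http://rs.tdwg.org/dwc/terms/minimumDistanceAboveSurfaceInMeters"


def build_rdf_request(term_uri):
    """Build (but do not send) a GET request asking for RDF/XML."""
    return urllib.request.Request(term_uri, headers={"Accept": "application/rdf+xml"})


def describe_response(term_uri):
    """Dereference the term and report what was served (needs network access)."""
    with urllib.request.urlopen(build_rdf_request(term_uri)) as response:
        # geturl() is the final URL after redirects; Content-Type is what came back.
        return response.geturl(), response.headers.get("Content-Type")


request = build_rdf_request(TERM_URI)
print(request.get_header("Accept"))
```

Calling describe_response(TERM_URI) on a networked machine would show whether the served document is the normative one or a different file.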
It seems more straightforward to me for the normative document to contain
the URIs and metadata for the most recent (undated) versions of all terms,
whether they be recommended or deprecated. If a term is deprecated, it
shouldn't "disappear from sight". Rather, its metadata should indicate that
it's deprecated using the standard property:
<owl:deprecated rdf:datatype="&xsd;boolean">true</owl:deprecated>
I'm not sure that I get the distinction between "deprecated" and "obsolete".
Can't all terms just be either currently recommended or deprecated? Terms
should never go away in the RDF, they should be marked as deprecated.
Whether we want humans to see them on a quick reference guide is another
story.
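To illustrate how a client might use such a flag, here is a minimal sketch that lists the terms marked owl:deprecated in an RDF/XML document. The sample document and its example.org URIs are made up, the function name is hypothetical, and the code only handles terms written as rdf:Description elements, which is an assumption about the serialization rather than a property of any DwC file.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

# Made-up two-term document; the example.org URIs are placeholders, not DwC terms.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="http://example.org/terms/currentTerm"/>
  <rdf:Description rdf:about="http://example.org/terms/oldTerm">
    <owl:deprecated rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</owl:deprecated>
  </rdf:Description>
</rdf:RDF>"""


def deprecated_terms(rdf_xml):
    """Return the URIs of described resources flagged owl:deprecated = true."""
    root = ET.fromstring(rdf_xml)
    found = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        flag = description.find(f"{{{OWL_NS}}}deprecated")
        if flag is not None and (flag.text or "").strip() == "true":
            found.append(description.get(f"{{{RDF_NS}}}about"))
    return found


print(deprecated_terms(SAMPLE))
```

A quick reference guide for humans could skip whatever this function returns, while the RDF itself keeps every term.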
Please note that I'm not taking a position on items 2 or 3. Item one simply
maintains the status quo (normative document=RDF) and just substitutes one
particular document with another.
Steve
Steve Baskauf wrote:
I should have said this in my earlier message, but many thanks to John,
Peter, and Markus for getting things streamlined and moved to GitHub.
I think that it would be best to break this proposal into four parts:
1. should we change the identity and form of the DwC normative RDF document?
2. should the DwC normative document be expressed as RDF?
3. should a single normative document be the entire standard, with other
supporting documents being outside of the standard and modifiable without
going through an official process?
4. should the DwC namespace policy be adopted TDWG-wide?
Each of these parts needs to be discussed separately - having them as a
package makes it difficult to say "I support the proposal" or "I oppose the
proposal". I feel that item 1 could be addressed during a public comment
period and action taken based on the outcome of that discussion. It's
really a technical issue. But items 2-4 are really policy decisions about
process that I don't think can be addressed effectively on an email list. I
think they would be better addressed at a task group level, with an
examination of what has and hasn't worked for TDWG and other organizations,
a record of discussion, list of pros-and-cons, and a consensus
recommendation by the task group, all documented on a wiki or in a report.
At that point, those who care can review the recommendations and comment on
tdwg-content. We've learned in the past that tdwg-content just isn't an
effective way to hash out these kinds of complicated decisions.
Steve
Markus Döring wrote:
Bob,
I am surprised about the dislike of RDF being normative for DwC. This has
been the case for years [1] and no one minded.
The changes we propose focus on keeping out the history and the namespace
policy document.
The issue I would like to think about a little more is how to best deal with
deprecated terms.
Is it good enough if those term URIs would resolve to a section within the
html history document or do we need to return rdf for them?
Markus
[1] http://rs.tdwg.org/dwc/
"The normative document for the terms [RDF-NORMATIVE] is written in the
Resource Description Framework (RDF)
On 21 Jan 2015, at 17:34, Bob Morris <morris.bob at gmail.com> wrote:
Be wary; very, very wary. If RDF is the normative artifact for DwC,
there will be ambiguity about Containers and, less so, about Lists,
both of which are somewhere between non-existent and horrifying in
RDF. My prediction is that "DwC.rdf" will end up needing to be
expressed in OWL, and that the specification of lists, of unordered
sets, and of cardinality restrictions, will mystify most readers
hoping to discuss the standard.
On Wed, Jan 21, 2015 at 10:49 AM, Éamonn Ó Tuama [GBIF]
<eotuama at gbif.org> wrote:
Seems like RDF expression in combination with a privileged XSLT to human
readable doc might do. I still like the idea of the RDF doc as the normative
one. At least it is concise. The W3C specification pages are a rather messy
mix of different sections some headed by "This section is non-normative.",
and see, e.g., the page for the DCAT vocab [1] "As well as sections marked
as non-normative, all authoring guidelines, diagrams, examples, and notes in
this specification are non-normative. Everything else in this specification
is normative."
While RDF might excel as a graph definition language, I think there is still
value in using it without domain and range statements (if these don't exist)
to simply define labels, definitions and comments (examples) in a machine
readable way.
[1] http://www.w3.org/TR/vocab-dcat/
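The idea of deriving the human-readable document from the RDF can be sketched briefly. This example uses Python's standard library rather than XSLT, purely for illustration; the sample term, its example.org URI, and the function name are all made up, and the output layout is arbitrary. It reads only labels and comments, with no domain or range statements involved.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Made-up single-term document; the URI, label, and comment are placeholders.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/terms/exampleTerm">
    <rdfs:label>Example Term</rdfs:label>
    <rdfs:comment>A placeholder definition.</rdfs:comment>
  </rdf:Description>
</rdf:RDF>"""


def render_terms(rdf_xml):
    """Render each described term as a short plain-text entry."""
    root = ET.fromstring(rdf_xml)
    entries = []
    for description in root.findall(f"{{{RDF}}}Description"):
        uri = description.get(f"{{{RDF}}}about")
        label = description.findtext(f"{{{RDFS}}}label", default="(no label)")
        comment = description.findtext(f"{{{RDFS}}}comment", default="(no definition)")
        entries.append(f"{label}\n  URI: {uri}\n  Definition: {comment}")
    return "\n\n".join(entries)


print(render_terms(SAMPLE))
```

Because the text is generated, the RDF stays the single source and the readable page can be rebuilt whenever a term changes.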
-----Original Message-----
From: tdwg-content-bounces at lists.tdwg.org
[mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Bob Morris
Sent: 21 January 2015 15:13
To: John Wieczorek
Cc: TDWG Content Mailing List
Subject: Re: [tdwg-content] Darwin Core Standard - proposed change in
governance
John et al. Thanks for all the work you've put into this.
I favored this at first, but thought a lot since it was proposed and now
oppose item 1).
Short argument: RDF is meant for machines to read, not humans to read.
If an RDF document is normative, mainly RDF experts will be able to argue
about it and about conformance to it.
More(?) important, RDF is a graph definition language, not a specification
definition language. Not even RDF has an RDF file as its normative
definition. In fact, it seems both W3C and IETF regard most (all?) of their
normative artifacts for specification (respectively "Recommendation" and
"Request For Comment") as nothing other than human readable documents.
This is not to say there should not be one or more normative RDFS
serializations of a human readable specification. It may even be that there
should be a privileged RDFS document, together with a privileged
transformation (e.g. in xslt) and a privileged platform for synthesizing a
human readable form of DwC. But it's that web document that should be
normative (and human readable.) This is what Audubon Core does, except that
the base "generation data" comprises, annoyingly, but robustly, calls to the
MediaWiki template language.
(The annoyance of designing MediaWiki templates may ease in the future due
to [1])
Certainly there are exceptions to the principle of "make only human readable
as the base normative artifact". The XML schemaSchema [2] is an example.
But DwC doesn't seem to fit that model. DwC is not a DwC object.
My position is a little influenced by [3], a lot of which I disagree with.
But it reminds me of something my Daddy taught me:
"multi-purpose tools are often poor at all their purposes, except in simple
cases." But really, my reluctance here is that I see no reason we should
imagine that DwC data is always a graph and that is why we should model it
with a graph description language. Worse, my experience is that the most
common potholes in the RDF world arise when using it without understanding
the underlying graph theory.
Bob Morris
[1] http://www.mediawiki.org/wiki/Lua_scripting
[2] http://www.w3.org/TR/xmlschema11-1/#normative-schemaSchema
[3] http://manu.sporny.org/2014/json-ld-origins-2/
On Tue, Jan 20, 2015 at 10:18 AM, John Wieczorek <tuco at berkeley.edu> wrote:
Dear all,
Peter Desmet, Markus Döring, and I have been working on the transition
of Darwin Core maintenance from the Google Code Site to Github. We've
taken the opportunity to streamline the process of making updates to
the standard when they are ratified, such as scripts to produce the
human-readable content and auxiliary files from the RDF document of
current terms. As a result of this work, we see further opportunities
to simplify the maintenance of the standard. They center on the following
proposal.
We would like to propose that the RDF document of current terms be
made to represent the normative standard for Darwin Core rather than
the Complete History normative document we use now. We would also like to
make that new normative document the only document in the standard.
Under this proposal:
1) the normative standard for Darwin Core would consist of a single
document at http://rs.tdwg.org/terms/dwc_normative.rdf (not currently
active).
2) information currently held in
http://rs.tdwg.org/dwc/rdf/dwctermshistory.rdf (the current normative
document) and the corresponding Complete History web page
(http://rs.tdwg.org/dwc/terms/history/index.htm) would be retained
only in a history document http://rs.tdwg.org/terms/history.html (not
currently active).
3) all documents other than the proposed normative document would not
be part of the standard.
The proposed changes require community consensus under the existing
rules of governance of the Darwin Core. This means that the proposal
must be under public review for at least 30 days after an apparent
consensus on the proposal and any amendments to it is reached, where
consensus consists of no publicly-shared opposition.
The implications of this proposal are many. One of the most important
is that the rules governing changes to the standard
(http://rs.tdwg.org/dwc/terms/namespace/index.htm) would no longer be
a part of the standard. Instead, we would promote the adoption of
these rules across TDWG standards rather than just within Darwin Core.
It may be that TDWG is not ready to accommodate this at the moment. If
so, the Namespace Policy could remain within the Darwin Core standard
until the broader governance process for TDWG can cover it, at which
point we would propose to remove the Namespace Policy from the Darwin Core.
Other comments about the proposed changes:
Having one RDF document for the terms in the dwc namespace will avoid
confusion. Only those with status 'recommended' would be in the
normative document.
Having the term history (all versions, including deprecated,
superseded, and recommended ones) in a web page only is what Dublin
Core does. It means no one would be able to reason over old versions
of the Darwin Core. Would anyone do that?
Having no document other than the normative one as part of the
standard would free the whole rest of the body of Darwin Core
documentation from the requirements of public review and Executive
Committee approval. This would make that documentation much more open
to broader contributions and easier to adapt to evolving demands.
We do not propose to lose any of the documentation we have.
Please share your comments!
Cheers,
John
_______________________________________________
tdwg-content mailing list
tdwg-content at lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content
--
Robert A. Morris
Emeritus Professor of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Filtered Push Project
Harvard University Herbaria
Harvard University
email: morris.bob at gmail.com
web: http://efg.cs.umb.edu/
web: http://wiki.filteredpush.org
http://wiki.datakurator.net
http://taxonconceptexplorer.org/
http://www.cs.umb.edu/~ram
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
PMB 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu
http://vanderbilt.edu/trees