[tdwg-content] NCD and DwC

Ann_Hitchcock at nps.gov
Mon Aug 17 15:40:41 CEST 2009


John et al.:

I am responding regarding the definitions outlined in your note below
pertaining to

institutionCode:
collectionCode:
collectionID:
datasetID:
catalogNumber:

The definitions that you provided work from the standpoint of the
originating institution, but may not work well from the standpoint of a
repository when the originating institution remains involved in the
management of the specimens and records while the specimens may be on loan
to the repository.  I can best describe the problem by giving a typical
example from the National Park Service.

Example:  NPS permits the collection of specimens in a park.  The specimens
remain federal property and are cataloged in the NPS system.  The specimens
go on loan to the State University Museum, which has agreed to serve as the
repository.  The State University Museum (SUM) creates a new catalog record
for the specimen using its catalog system, while cross-referencing the NPS
catalog number.  The State University Museum periodically reports to NPS on
research use of the specimen, third party loans (which NPS authorizes the
State University Museum to make), annotations, condition, etc.  NPS updates
its catalog record (the original record) with this information.  As a
condition of the permit, the permitted researcher submits copies of his/her
field notes, maps, photos (a.k.a. associated resource management records)
to NPS and conveys the copyright.  NPS catalogs the resource management
records in its archival system.  NPS provides copies of the field notes to
the State University Museum for use in managing the specimens and to
provide researcher access to the notes.

If this example follows the definitions outlined below, the NPS, as the
institution administering the original record (and specimens), would be
identified in the InstitutionCode, CollectionCode, and CollectionID, while
the repository institution could show its information in
AssociatedOccurrence (for example, "same as SUM:Mammal:1234") and
OtherCatalogNumber fields.  Is the repository going to be satisfied?
After all, the repository is where researchers go to see the specimens.  Is
the repository getting adequate recognition for its role?
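
To make the concern concrete, here is a minimal sketch in Python of how
such a record might look under the definitions below. The field names are
the ones mentioned above; the values "NPS", "PARK" and "5678" are
placeholders invented for illustration, and find_by_institution is a
hypothetical helper, not part of any Darwin Core tooling.

```python
# Hypothetical record for the NPS/SUM example. Only "SUM:Mammal:1234"
# comes from the message; the other values are illustrative placeholders.
nps_record = {
    "InstitutionCode": "NPS",      # institution administering the original record
    "CollectionCode": "PARK",      # placeholder collection code
    "CatalogNumber": "5678",       # placeholder NPS catalog number
    "AssociatedOccurrence": "same as SUM:Mammal:1234",
    "OtherCatalogNumber": "SUM:Mammal:1234",
}


def find_by_institution(records, code):
    """Return the records whose InstitutionCode equals the given code."""
    return [r for r in records if r["InstitutionCode"] == code]


# A search on the owning institution finds the specimen ...
owner_hits = find_by_institution([nps_record], "NPS")
# ... but a search on the repository, where the specimen physically is,
# finds nothing, because SUM appears only in the secondary fields.
repository_hits = find_by_institution([nps_record], "SUM")
```

In this sketch the repository is reachable only by parsing
AssociatedOccurrence or OtherCatalogNumber, which is the recognition
problem raised above.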

An alternative, as previously proposed, could provide for parallel data
sets for the owning institution and institution of custody.  See Issues 35
and 36 at http://code.google.com/p/darwincore/issues/list.

Thanks to all for your consideration of this issue.

Ann

Ann Hitchcock
National Park Service
1849 C Street, NW (2301)
Washington, DC 20240-0001
202-354-2271
Fax:  202-371-2422


-----
from         John R. WIECZOREK <tuco at berkeley.edu>
sent by      tdwg-content-bounces at lists.tdwg.org
reply-to     tuco at berkeley.edu
to           Wouter Addink <wouter at eti.uva.nl>,
             Neil Thomson <n.thomson at nhm.ac.uk>,
             TDWG Content Mailing List <tdwg-content at lists.tdwg.org>
cc           Lee Belbin <lee at tdwg.org>,
             Gail Kampmeier <gkamp at uiuc.edu>,
             "donald.hobern" <donald.hobern at csiro.au>
date         08/15/2009 08:14 AM
subject      [tdwg-content] NCD and DwC




Hi Wouter and Neil (and others),

I hope both of you are well. I know this may be a busy time (or,
better, vacation time), but I hope that you have had a chance to
consider recent discussions on tdwg-content about the relationships
between DwC, NCD and the TDWG Ontology. In addition to those public
discussions, I'm adding a few questions and comments I have had during
the progression of the Darwin Core Review. I'm cc'ing those having a
clear vested interest in resolution on both sides. I would urge you to
look at the relevant tdwg-content commentary as well as my concerns
from the messages below so that we can hopefully quickly come to a
consensus on a joint plan. I say quickly because I am eager that the DwC
review shouldn't undergo further unnecessary delays.

In case it's a bit much to go through all of the "literature" relevant
to the proposal I'm making, and in hopes of facilitating quick
solutions, I'll summarize.

1) Dublin Core recommends the use of the dcterms rather than their
antiquated dc counterparts. Shouldn't NCD follow suit? Specific
example: instead of http://purl.org/dc/elements/1.1/source, use
http://purl.org/dc/terms/source.

2) NCD is using terms from the TDWG ontology, which is to date an
unfinished academic exercise without any review. This dependency seems
to me to guarantee that NCD will require revision when the ontology is
revised. This wouldn't necessarily be required if NCD took the reins
and defined terms that aren't already in another standard (the
Ontology does not fit into this category) within its own domain.
Specifically, abandon http://rs.tdwg.org/ontology/voc/ in favor of
http://rs.tdwg.org/ncd/terms/.

3) Reword some of the NCD term definitions so that NCD can be used
more generally for data sets (data collections), and not just for
object collections.
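
Point 1 amounts to a namespace swap, which could be sketched in Python
as follows. The two namespace URIs are the ones given above; the function
name to_dcterms is a hypothetical helper chosen for illustration.

```python
# Namespaces quoted in point 1.
DC_ELEMENTS = "http://purl.org/dc/elements/1.1/"
DC_TERMS = "http://purl.org/dc/terms/"


def to_dcterms(uri):
    """Rewrite a legacy dc element URI to its dcterms counterpart.

    URIs outside the legacy namespace are returned unchanged.
    """
    if uri.startswith(DC_ELEMENTS):
        return DC_TERMS + uri[len(DC_ELEMENTS):]
    return uri


rewritten = to_dcterms("http://purl.org/dc/elements/1.1/source")
```

The same prefix-replacement idea would apply to point 2, but there the
local names may also change, so no mapping is assumed here.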

With these commitments, DwC could safely move forward reusing NCD
terms. Without the last two, DwC will have to redefine terms such as
collectionID.

Following are relevant message excerpts from previous tdwg-content
postings:

-----
from         John R. WIECZOREK <tuco at berkeley.edu>
reply-to           tuco at berkeley.edu
to           TDWG Content Mailing List <tdwg-content at lists.tdwg.org>
date         Thu, Jul 23, 2009 at 6:20 PM
subject            Darwin Core Collection-related terms

I have taken the content of the Darwin Core Issues 32 and 33 to post
here as they both require discussion before an unambiguous
recommendation can be made.

>>>From http://code.google.com/p/darwincore/issues/detail?id=32

Reported by ren... at cria.org.br
Term Name: collectionID

Recommendation: Reuse the term which is already defined in NCD (on the
other hand, the NCD term defined in the corresponding RDF file should
probably not be restricted to a specific domain).

Submitter: Renato De Giovanni


Comment 1 by gtuco.btuco
This is indeed intended to be the same term. Can you provide the URI
to the term in NCD?
Status: Accepted
Labels: Milestone-Release1.0 Priority-Critical

Comment 2 by ren... at cria.org.br
Currently the URI is:

http://rs.tdwg.org/ontology/voc/Collection#collectionId

But I think that relationship terms like this one should probably not
be bound to a domain since they can be used by objects from many
different classes. I'm not sure if it's possible to change NCD and if
the NCD creators would agree with this change.
Perhaps a better URI for this term would be:

http://rs.tdwg.org/ontology/voc/collectionId

>>>From http://code.google.com/p/darwincore/issues/detail?id=33

Reported by ren... at cria.org.br
Term Name: collectionCode

Recommendation: Reuse the existing term from NCD, but I would probably also
suggest changing the NCD term from
http://rs.tdwg.org/ontology/voc/Collection#acronymOrCoden to
http://rs.tdwg.org/ontology/voc/collectionCode (without a domain). It would
be nice to know Markus' or Roger's opinion about this, since they
participated in the NCD group.

Submitter: Renato De Giovanni
-------
from         Tim Robertson <trobertson at gbif.org>
to           tuco at berkeley.edu
cc           TDWG Content Mailing List <tdwg-content at lists.tdwg.org>
date         Fri, Jul 24, 2009 at 1:12 AM
subject            Re: [tdwg-content] Darwin Core Collection-related terms

Hi John, Renato

Thinking aloud, some possible options I see might be:

a) - omit it from the DwC terms altogether
b) - reuse the existing URI if the NCD term domain was derestricted
c) - keep a duplicate term in the DwC NS
d) - ? keep a duplicate term in the DwC NS and add some kind of "is
equivalent of" to the NCD acronymOrCoden
e) - ? keep a duplicate term in the DwC NS and have NCD acronymOrCoden
do some "refinement" of dwc:collectionCode

My preference is for c) (or, if possible, e)), both for clear boundaries
of dwc and for maintainability reasons.

To me, DwC fits nicely as a set of commonly used terms which are
unrestricted to domain classes, and extend the terms offered by the
DublinCore Metadata Terms.  Using these terms we can assemble
models/schemas etc.  If DwC now also includes terms from other
namespaces (which are currently restricted to domains), I think it might
become more difficult to grasp and maintain.  I also wonder if going
down the route of b) or d) for one term could open the floodgates for
a lot of other terms (http://rs.tdwg.org/dwc/terms/index.htm#genus ->
http://rs.tdwg.org/ontology/voc/TaxonName#genusPart) and effectively
move towards being an "index of data and object properties in the TDWG
ontology".

Just some thoughts,

Tim
-------
from         renato at cria.org.br
to           TDWG Content Mailing List <tdwg-content at lists.tdwg.org>
date         Fri, Jul 24, 2009 at 6:52 AM
subject            Re: [tdwg-content] Darwin Core Collection-related terms

Hi Tim,

Nice summary. My preference is for b. Considering that NCD follows the
same principles of this new DarwinCore version, I see no reason for
duplicating the same term. No matter how much we try to keep boundaries
clear between standards, there will always be some kind of semantic
overlap between them. Having the same terms defined under different
namespaces can be very confusing for users. I think TDWG should try to
make things as reusable as possible.

To be more specific, I would suggest the following changes to NCD:

1) Remove the domain from collectionId and institutionId and rename them
to "Id" so that the URI becomes:

http://rs.tdwg.org/ontology/voc/Collection#Id
http://rs.tdwg.org/ontology/voc/Institution#Id

2) Remove the domain from #acronymOrCoden (Collection) and rename it to
"Code" so that the URI becomes:

http://rs.tdwg.org/ontology/voc/Collection#Code

3) Add a Code property in Institution (without a domain) making it:

http://rs.tdwg.org/ontology/voc/Institution#Code

Then DarwinCore or any other standard can easily reuse these terms.

Depending on how this gets solved, yes, I think we should open the
floodgates...

Best Regards,

Renato

------
from         John R. WIECZOREK <tuco at berkeley.edu>
reply-to           tuco at berkeley.edu
to           Lynn Kutner <Lynn_Kutner at natureserve.org>
cc           TDWG Content Mailing List <tdwg-content at lists.tdwg.org>
date         Fri, Jul 31, 2009 at 12:57 PM
subject            Re: [tdwg-content] InstitutionCode Issue - ownership vs.
custodianship

The codes are now meant for any data set (a collection of data), not
just collections of objects. They were actually always meant to be
that way, but the descriptions had their origins in the specimen
collections realm. Specifically, the following terms can all be used
to identify where data are coming from originally:

institutionCode
collectionCode
collectionID
datasetID

while the Dublin Core terms dc:rights and dc:rightsHolder can be used
to describe the original or other vested interests.

To be clearer about what I meant about collection-related terms, I
would propose changing the descriptions as follows:

institutionCode: "The name (or acronym) in use by the institution
administering the original record."
collectionCode: "The name (or acronym) identifying the original
collection or data set from which the record was derived."
collectionID: "A unique identifier for the original collection or
dataset from which the record was derived. Recommended best practice
is to use the identifier in a collections registry such as the
Biodiversity Collections Index
(http://www.biodiversitycollectionsindex.org/)."
datasetID: "An identifier for the data set. May be a global unique
identifier or an identifier specific to a collection or institution."
(unchanged)
catalogNumber: "An identifier (preferably unique) for the record within
the data set or collection."
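
As a rough illustration of how these terms combine, the sketch below
builds and parses a colon-separated identifier of the kind seen in the
"SUM:Mammal:1234" example elsewhere in this thread. Reading that string
as institutionCode:collectionCode:catalogNumber is an assumption, and
make_triplet and parse_triplet are hypothetical helper names.

```python
# Order assumed for the colon-separated form (an assumption, see above).
TRIPLET_FIELDS = ("institutionCode", "collectionCode", "catalogNumber")


def make_triplet(record):
    """Join the three identifying terms of a record with colons."""
    return ":".join(record[field] for field in TRIPLET_FIELDS)


def parse_triplet(text):
    """Split a colon-separated identifier back into the three terms.

    Raises ValueError if the text has fewer than three parts. Any extra
    colons are kept as part of the catalogNumber.
    """
    institution, collection, catalog = text.split(":", 2)
    return {
        "institutionCode": institution,
        "collectionCode": collection,
        "catalogNumber": catalog,
    }


example = parse_triplet("SUM:Mammal:1234")
```

Such a string is only as unique as the codes it is built from, which is
one reason collectionID points to a registry identifier instead.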