Many questions come to mind.
Is any aggregator along the line doing anything to change the original data (aside from the institutionCode, collectionCode, and catalogNumber)? If so, why? If not, why not keep the original identifying information, in other words the original institutionCode, collectionCode, and catalogNumber? Is there something sacred or useful about the aggregator's identifiers? It seems like hosts should be completely transparent unless they are doing something to the data. There wouldn't be anything tricky if the original identifiers were retained, unless there weren't any.
The codes are now meant for any data set (a collection of data), not just collections of objects. They were actually always meant to be that way, but the descriptions had their origins in the specimen collections realm. Specifically, the following terms can all be used to identify where data are coming from originally:
institutionCode, collectionCode, collectionID, datasetID
while the Dublin Core terms dc:rights and dc:rightsHolder can be used to describe the original or other vested interests.
To be clearer about what I meant about collection-related terms, I would propose changing the descriptions as follows:
institutionCode: "The name (or acronym) in use by the institution administering the original record."
collectionCode: "The name (or acronym) identifying the original collection or data set from which the record was derived."
collectionID: "A unique identifier for the original collection or dataset from which the record was derived. Recommended best practice is to use the identifier in a collections registry such as the Biodiversity Collections Index (http://www.biodiversitycollectionsindex.org/)."
datasetID: "An identifier for the data set. May be a global unique identifier or an identifier specific to a collection or institution." (unchanged)
catalogNumber: An identifier (preferably unique) for the record within the data set or collection.
On Fri, Jul 31, 2009 at 11:59 AM, Lynn Kutner <Lynn_Kutner@natureserve.org> wrote:
I think that our primary need is to be able to have data attribution / credit.
If I'm looking in the right place (http://rs.tdwg.org/dwc/terms/index.htm#dcterms:rights) then it seems like rights / rightsHolder could meet that need.
Of course, it still gets a bit tricky when you've got datasets that are aggregated from many datasets that may themselves be aggregations ...
For GBIF, we're currently doing:
Institution Code = NTSRV (abbreviation for "NatureServe")
Collection Code = IL-NHP (abbreviation for the contributing member program)
Catalogue number = EO.28.2758 (code that relates to unique record ID in our databases)
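As a rough illustration of the mapping above, the three values can be written as a record keyed by the corresponding Darwin Core terms. This is only a sketch: the `triplet` helper and the colon-separated format are assumptions made for illustration and are not something the message itself specifies.

```python
# Values come from the message above; the keys are the Darwin Core terms
# they map to. The colon-joined "triplet" is an assumed convention.
record = {
    "institutionCode": "NTSRV",     # NatureServe
    "collectionCode": "IL-NHP",     # contributing member program
    "catalogNumber": "EO.28.2758",  # record ID in the source database
}


def triplet(rec, sep=":"):
    """Join the three identifying terms into one string (hypothetical helper)."""
    keys = ("institutionCode", "collectionCode", "catalogNumber")
    return sep.join(rec[k] for k in keys)


print(triplet(record))
```

If an aggregator overwrote any of the three values, the joined identifier would no longer match the one held by the original provider.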
I think this overlaps with my previous question / comment about whether the institutionCode / collectionCode / collectionID are intended to be used only for "collections" or for any sort of dataset (a data collection in the generic sense).
I like John's comment about using terms generically:
"From the rest of the conversation on Collection-related terms in DwC it seems clear to me that, if we were to keep the terms in the DwC namespace, then they should be domain-less and their descriptions should omit the reference to Occurrences. Though these terms came from the idea of identifying specimen records within collections within institutions, they really can be applied more generically to any record within a dataset (collection) from a contributor (institution)."
Question - I'm a bit unclear - in the above did you mean omit reference to "Occurrences" or omit reference to "collections"?
If collection is used generically in DWC, I'm not sure whether we'd use rightsHolder or collectionID / collectionCode to identify the contributing member program. I need to think more about this one.
Lynn
-----Original Message-----
From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On Behalf Of John R. WIECZOREK
Sent: Thursday, July 30, 2009 5:01 PM
To: Lynn Kutner
Cc: TDWG Content Mailing List
Subject: Re: [tdwg-content] InstitutionCode Issue - ownership vs. custodianship
Interesting that NCD doesn't cover this either at the level of Collections. Perhaps it is less of an issue at the level of whole collections where ownership may be mixed.
Is there any reason that dc:rights and dc:rightsHolder can't cover the need expressed here?
On Thu, Jul 30, 2009 at 1:45 PM, Lynn Kutner <Lynn_Kutner@natureserve.org> wrote:
I think it makes sense to have separate "ownership" and "custodian" terms.
As custodians, we provide data to GBIF on behalf of our network of member programs.
This could become complicated, however, because some of those member programs are themselves custodians of data that is owned by others.
Lynn
Lynn Kutner NatureServe phone: (703) 797-4804 email: lynn_kutner@natureserve.org http://www.natureserve.org/
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of John R. WIECZOREK
Sent: Tuesday, July 28, 2009 3:24 PM
To: TDWG Content Mailing List
Subject: [tdwg-content] InstitutionCode Issue - ownership vs. custodianship
This Darwin Core Issue (http://code.google.com/p/darwincore/issues/detail?id=34) is being brought into this forum for further discussion in the hopes of a definitive recommendation.
==New Term Recommendation==
Submitter: Ann Hitchcock
Justification: The evolution, from 2003 to the present, of the term "Institution Code" has resulted in a change from property ownership ("the institution to which the collection belongs") to administrative or management responsibility ("the institution administering the collection or data set"). We need to be able to separately identify the owning institution when that institution differs from the managing institution. Such a situation occurs when the owning party lends the collection to a repository for long-term management. The catalog numbers that both institutions may assign to the same specimen are addressed by the Catalog Number and "Other Catalog Numbers," but property ownership is not addressed.
Definition: The name (or acronym) in use by the owner of the collection or data set if different from the InstitutionCode.
Comment: Examples: "NPS", "INBio". For discussion see http://code.google.com/p/darwincore/wiki/Occurrence
Refines: InstitutionCode
Has Domain: http://rs.tdwg.org/dwc/terms/Occurrence
Has Range:
Replaces:
ABCD 2.06: DataSets/DataSet/Units/Unit/SourceInstitutionID
Comment 1 by gtuco.btuco, Today (3 hours ago)
Interesting. Do you think that dc:rights (http://rs.tdwg.org/dwc/terms/index.htm#dcterms:rights) and dc:rightsHolder (http://rs.tdwg.org/dwc/terms/index.htm#dcterms:rightsHolder) might fulfill the needs you are recommending?
Status: Accepted
Labels: Priority-Medium Milestone-Release1.0

Comment 2 by ann_hitc...@nps.gov, Today (14 minutes ago)
"Rights" and "Rights Holder" do address part of the issue, especially if a second example on "Rights" could be given that refers to physical property in addition to intellectual property. For example "Specimens are U.S. Government Property." Although these definitions are helpful, they still do not address the issue that the occurrence of the specimens or data set may be recorded in both the collection of the institution with custody and the collection of the institution with ownership.
Perhaps the term "administering" in the definition for Institution Code should be replaced with "with custody of" and the proposed definition for Institution Code should be bifurcated as follows:
InstitutionOwnership Code: The name (or acronym) in use by the institution that owns the collection or data set in which the Occurrence is recorded.
InstitutionCustody Code: The name (or acronym) in use by the institution with custody of the collection or data set in which the Occurrence is recorded, if different from the InstitutionOwnership Code.
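To make the proposed split concrete, here is a minimal sketch of how a consumer might read the two codes, assuming the custody code is only supplied when it differs from the ownership code, as the definition above suggests. The key names `institutionOwnershipCode` and `institutionCustodyCode`, the `custodian_code` helper, and the "REPO" value are all hypothetical; only "NPS" and "INBio" appear as examples in the recommendation.

```python
def custodian_code(rec):
    """Return the custody code, falling back to the owner when none is given."""
    return rec.get("institutionCustodyCode") or rec["institutionOwnershipCode"]


# Owned by one institution, managed long-term by another ("REPO" is a placeholder).
on_loan = {"institutionOwnershipCode": "NPS", "institutionCustodyCode": "REPO"}

# Owner and custodian are the same, so only the ownership code is supplied.
held_by_owner = {"institutionOwnershipCode": "INBio"}

print(custodian_code(on_loan))
print(custodian_code(held_by_owner))
```

Under this reading the ownership code is always present, and the custody code acts as an optional override.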
Comment 3 by gtuco.btuco, Today (moments ago)
Since this has moved into the realm of discussion and away from a definitive recommendation, I need to move it to the tdwg-content list for open discussion. If you have not yet joined that list, please do so to continue the conversation there in an open forum.
_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content