Hi Steve,
You are probably right that it might be best to use rdfs:label, but I am thinking we might be able to get the same result by defining the string variants as subproperties of rdfs:label.
This would make each of them an rdfs:label, but a special kind of rdfs:label.
This is one of those things that I would test with Sindice and URIburner to see if they interpret these correctly.
This would require a live vocabulary that Sindice could look at to determine that hasScientificName is to be treated as an rdfs:label.
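As a rough sketch of the inference this idea relies on, here is the RDFS subproperty rule written out in plain Python rather than with a real RDF library. The ex:specimen1 subject and the function name are made up for illustration, and dwc:hasScientificName is the proposed term, not an existing Darwin Core term. Whether Sindice or URIburner actually apply this rule is exactly what would need testing.

```python
# Sketch of the RDFS subproperty rule:
# if (p rdfs:subPropertyOf q) and (s p o), then infer (s q o).
# Triples are plain tuples of strings; all names are illustrative.

SUBPROPERTY_OF = "rdfs:subPropertyOf"


def infer_superproperties(triples):
    """Return the triples plus everything implied by rdfs:subPropertyOf."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triples appear (handles chains)
        changed = False
        sub_props = {(s, o) for (s, p, o) in inferred if p == SUBPROPERTY_OF}
        for (s, p, o) in list(inferred):
            for (sub, sup) in sub_props:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


vocabulary = [
    # Hypothetical vocabulary statement: the string variant is a kind of label.
    ("dwc:hasScientificName", SUBPROPERTY_OF, "rdfs:label"),
]
data = [
    ("ex:specimen1", "dwc:hasScientificName", "Puma concolor (Linnaeus 1771)"),
]

closure = infer_superproperties(vocabulary + data)
has_label = (
    "ex:specimen1", "rdfs:label", "Puma concolor (Linnaeus 1771)"
) in closure
print(has_label)
```

A consumer that does not apply this rule would see only dwc:hasScientificName and no label, which is the case the live test would reveal.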
- Pete
On Mon, Oct 4, 2010 at 10:41 AM, Steve Baskauf <steve.baskauf@vanderbilt.edu> wrote:
Although this specific example deals with taxonomic name identifiers, it is related to a previous discussion on this list about how we should use the dwc:xxxxxID terms and other terms (such as recordedBy and identifiedBy) that could have either a string (literal) or URI form. Although I don't really want to see an unnecessary proliferation of Darwin Core terms, I think that in the interest of clarity (particularly where RDF is involved) there should either be multiple terms that make it clear what form of identifier is expected, or else an understanding that in RDF the default value for such a term is a URI, which would then have an rdfs:label property holding the string form. I think the former would be preferable to the latter.
I came to this opinion when trying to write RDF describing an herbarium specimen. The collector should be the dwc:recordedBy property of the specimen. Optimally, there would be a database in which known collectors were assigned URIs so that "Glen N. Montz", "Glen Montz", "G. N. Montz", etc. would all be different labels for the same resource. However, realistically, I'm not going to drop what I'm doing to set up such a database (even if I were capable of doing it, which I'm not). So I ended up just writing it as <dwc:recordedBy>Glen N. Montz</dwc:recordedBy> even though I knew it probably wasn't the best thing. In a large Occurrence database compiled from RDF created by a lot of people, there might end up being a mixture of strings and URIs for the dwc:recordedBy properties of the specimens. It seems to me that it would be better to have properties like dwc:recordedBy for strings and dwc:recordedByURI for a corresponding URI (and I suppose dwc:recordedByLSID if anyone wants to use it). Of course, this would require a number of term additions to DwC and a clarification in the DwC documentation that the generic version is intended for strings.
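To make the string/URI split concrete, here is a small Python sketch, assuming a collector database existed. The lookup table and the example.org URI are hypothetical, and dwc:recordedByURI is the proposed term, not part of Darwin Core. The generic property always carries the string, and the URI property is added only when the name is known.

```python
# Sketch: keep the name string in dwc:recordedBy and add a separate URI
# property when the collector is known. The table and URI are hypothetical.

COLLECTOR_LABELS = {
    "Glen N. Montz": "http://example.org/collector/montz",
    "Glen Montz": "http://example.org/collector/montz",
    "G. N. Montz": "http://example.org/collector/montz",
}


def recorded_by_properties(name):
    """Return (property, value) pairs for a collector name string."""
    pairs = [("dwc:recordedBy", name)]  # generic term always holds the string
    uri = COLLECTOR_LABELS.get(name.strip())
    if uri is not None:
        pairs.append(("dwc:recordedByURI", uri))  # proposed term, not in DwC
    return pairs
```

With this split, a compiled database never has to guess whether a dwc:recordedBy value is a name or an identifier.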
With respect to the example
<dwc:hasScientificNameLSID rdf:resource="urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010"/>
I think you are right that (with the possible exception of rdfs:seeAlso) there is an expectation that an rdf:resource attribute will be a resolvable URI that produces RDF. So
<dwc:hasScientificNameLSID>urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010</dwc:hasScientificNameLSID>
is probably better.
Steve
Peter DeVries wrote:
I have been thinking about the following pattern, in part after looking at the GBIF vocabulary.
I am not sure if it is even a good idea, but it might be worth some discussion.
For those fields that have both a string and an "ID" form, the following pattern might be useful:
hasScientificName = string form
hasScientificNameURI = resolvable, LOD-compliant identifier
hasScientificNameLSID = LSID identifier, which could be resolvable once you add the "http:proxy" etc.
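The LSID case could be handled with something like the following Python sketch, which just prepends a proxy prefix to the LSID. The proxy address here is a placeholder and not a real resolver, and the function name is made up for illustration.

```python
# Sketch: turn an LSID into an HTTP URL by prepending a proxy prefix.
# PROXY is a placeholder address, not a real resolver.

PROXY = "http://lsid-proxy.example.org/"


def lsid_to_proxy_url(lsid, proxy=PROXY):
    """Prefix a urn:lsid: identifier with an HTTP proxy; reject anything else."""
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError("not an LSID: " + lsid)
    return proxy + lsid


url = lsid_to_proxy_url(
    "urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010"
)
print(url)
```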
This allows all three forms to be included if desired; it also provides a hint as to how the field should be interpreted or resolved.
One group could also provide a mapping service so that each record does not need to include all three forms, but systems could still find the matching LSID for a given URI or vice versa.
My concern was that it would be difficult to infer how a scientificNameID should be interpreted by other systems.
Is this an LSID, is it a URI, is it a UUID, etc.?
This impacts the structure of the RDF.
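To show what that guessing looks like for a consuming system, here is a Python sketch that tries to classify a bare identifier value. The category names and the LSID pattern are my own simplifications, not anything defined by Darwin Core. Separate terms per form would make this kind of guessing unnecessary.

```python
import re
import uuid
from urllib.parse import urlparse

# Simplified LSID shape: urn:lsid:authority:namespace:object[:revision]
LSID_PATTERN = re.compile(
    r"^urn:lsid:[^:\s]+:[^:\s]+:[^:\s]+(:[^:\s]+)?$", re.IGNORECASE
)


def classify_identifier(value):
    """Guess whether a value is an LSID, an HTTP URI, a UUID, or a plain string."""
    value = value.strip()
    if LSID_PATTERN.match(value):
        return "lsid"
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "http-uri"
    try:
        uuid.UUID(value)  # raises ValueError if the value is not a UUID
        return "uuid"
    except ValueError:
        return "string"
```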
- Note that the actual identifiers might not be correct; the example below is more about the form of the RDF.
- For instance, I don't think it is correct to see the COL LSID as just a name string.
- Also, in this example the GNI name does not exactly match the string name.
<dwc:hasScientificName>Puma concolor (Linnaeus 1771)</dwc:hasScientificName>
<dwc:hasScientificNameURI rdf:resource="http://gni.globalnames.org/name_strings/6c3dc35f-d901-5cc5-b9c8-ad241069b9f8"/>
<dwc:hasScientificNameLSID rdf:resource="urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010"/>
Some systems may choke on the LSID form, assuming that it uses a standard resolution mechanism.
So it might be best to use this form:
<dwc:hasScientificNameLSID>urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010</dwc:hasScientificNameLSID>
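For comparison, here is a Python sketch that generates the pattern with the standard library's ElementTree: the string and LSID forms as literals, and the URI form as an rdf:resource. The subject URI is hypothetical, the three hasScientificName* terms are the proposed ones rather than existing Darwin Core terms, and I am assuming the usual DwC terms namespace for the dwc: prefix.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DWC_NS = "http://rs.tdwg.org/dwc/terms/"  # assumed namespace for dwc:
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dwc", DWC_NS)


def add_property(parent, local_name, value, as_resource):
    """Add a dwc: property as an rdf:resource reference or as a literal."""
    elem = ET.SubElement(parent, "{%s}%s" % (DWC_NS, local_name))
    if as_resource:
        elem.set("{%s}resource" % RDF_NS, value)  # consumers may dereference
    else:
        elem.text = value  # plain literal, nothing to resolve
    return elem


root = ET.Element("{%s}RDF" % RDF_NS)
desc = ET.SubElement(root, "{%s}Description" % RDF_NS)
# Hypothetical subject URI, for illustration only.
desc.set("{%s}about" % RDF_NS, "http://example.org/specimen/1")

add_property(desc, "hasScientificName",
             "Puma concolor (Linnaeus 1771)", as_resource=False)
add_property(desc, "hasScientificNameURI",
             "http://gni.globalnames.org/name_strings/"
             "6c3dc35f-d901-5cc5-b9c8-ad241069b9f8", as_resource=True)
add_property(desc, "hasScientificNameLSID",
             "urn:lsid:catalogueoflife.org:taxon:"
             "24e7d624-60a7-102d-be47-00304854f810:ac2010", as_resource=False)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```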
- Pete
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base http://www.taxonconcept.org/
GeoSpecies Knowledge Base http://lod.geospecies.org/
About the GeoSpecies Knowledge Base http://about.geospecies.org/
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu