[tdwg-content] What is dwc:basisOfRecord for?

Steve Baskauf steve.baskauf at vanderbilt.edu
Wed Oct 27 19:34:59 CEST 2010


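occurrenceID               | recordedBy  | other Occurrence terms | preparations      | other specimen-related terms | basisOfRecord
---------------------------+-------------+------------------------+-------------------+------------------------------+--------------
http://herbarium.org/12345 | Joe Curator | ...                    | pressed and dried | ...                          | ???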


OK, above is an example of a database record that I would consider 
typical.  The manager has flattened the general model we have been 
discussing to merge the Occurrence resource with the token resource 
since in his/her database no occurrence has any token other than a 
single specimen.  So what is the value for basisOfRecord: Occurrence or 
PreservedSpecimen?  What I'm getting at here is that there seems to be 
ambiguity as to whether we intend for basisOfRecord to represent the 
type of the record (which in this case I would say is Occurrence) or the 
type of the token on which the record is based (which in this case is 
PreservedSpecimen).  In the lengthy discussion last October, the idea 
was for the proposed recordClass to represent the type of the overall 
record (Occurrence, Event, Location, etc.) and for basisOfRecord to 
represent the type of the token or evidence on which an Occurrence 
record is based (e.g. PreservedSpecimen).  I'm not sure what is 
intended now. 

I see this lack of clarity as being a consequence of the general 
blurring of the distinction between the Occurrence and the "token".  I 
feel like there is a general consensus that we need to tighten up our 
definitions regarding Occurrences and their evidence even if some people 
will prefer to continue using the "flattened" approach to Occurrences 
and their tokens illustrated above (which they are entitled to do).  I 
don't have a problem with the various types that you've listed below.  
The problem is that I don't think the definition of basisOfRecord makes 
clear how it (basisOfRecord) should apply - the definition just says 
"the specific nature of the data record" and that the controlled 
vocabulary of the Darwin Core Type Vocabulary should be used.  Since the 
Type Vocabulary includes all of the classes (Event, Location, 
Occurrence, etc.) in addition to the types of "tokens", I believe that 
some users might think that making a statement like
basisOfRecord="Event"
is correct.  If we intend for basisOfRecord to ONLY apply to 
Occurrences, and for basisOfRecord to ONLY have as valid values the 
types that apply to tokens (i.e. PreservedSpecimen, LivingSpecimen, 
HumanObservation), then we should say so explicitly in the definition.  
If we do not say this, then we end up with the kind of ambiguity that I 
illustrated above.  basisOfRecord ends up getting "overloaded" to 
represent general classes of records and also to represent the types of 
tokens.
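
To make the restrictive reading concrete, here is a minimal Python 
sketch of a check that accepts only token types as basisOfRecord values. 
The names TOKEN_TYPES, RECORD_CLASSES and check_basis_of_record are 
hypothetical, and the split between the two sets is an assumption drawn 
from the types mentioned in this thread, not something the Darwin Core 
definition states.

```python
# Hypothetical sketch: NOT part of the Darwin Core standard.
# Assumed "token" (evidence) types, taken from the types named in this thread.
TOKEN_TYPES = {
    "PreservedSpecimen",
    "FossilSpecimen",
    "LivingSpecimen",
    "HumanObservation",
    "MachineObservation",
}

# Assumed general record classes from the type vocabulary.
RECORD_CLASSES = {"Event", "Location", "Occurrence", "Taxon"}


def check_basis_of_record(value):
    """Classify a basisOfRecord value under the restrictive reading."""
    if value in TOKEN_TYPES:
        return "ok"
    if value in RECORD_CLASSES:
        # This is the "overloaded" use: a record class, not a token type.
        return "record class, not a token type"
    return "unknown"


for candidate in ("PreservedSpecimen", "Occurrence", "Event"):
    print(candidate, "->", check_basis_of_record(candidate))
```

Under this reading both candidate answers for the flat record above are 
in the vocabulary, but only PreservedSpecimen passes the check.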

In addition, I'm not entirely convinced that basisOfRecord actually has 
any use at all.  It seems to be intended for situations such as I've 
illustrated above where a database table has rows that could contain 
occurrences documented with different types of tokens.  Ostensibly, 
basisOfRecord is needed to tell us the type of the token.  But 
realistically, any table that mixes multiple token types isn't going to 
work very well.  For example, if the table includes Occurrences that are 
documented by specimens, by images, and by no token other than someone's 
memory (i.e. HumanObservations), then it's going to have a bunch of 
columns that are empty for any particular record.  The specimen rows 
won't have anything in the image term columns, the image rows won't have 
anything in the specimen term columns, and the human observations won't 
have anything in either the specimen or image term columns.  I think 
most database 
managers would just include the terms that apply to all Occurrences in 
the Occurrence table and then use identifiers to link separate tables 
for the metadata terms that are specific to the various types of 
tokens.  basisOfRecord also has no use for Occurrences that have several 
tokens.  Which of the several tokens are we saying is the "basis" of the 
record?
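
The linked-table layout described above can be sketched with Python's 
built-in sqlite3 module.  This is only an illustration under stated 
assumptions: the table names (occurrence, specimen, image), the 
identifiers spec-1 and img-1, the format column, and the helper 
token_types are all hypothetical, and mapping an image table to the 
StillImage type is my own choice for the example.

```python
import sqlite3

# In-memory database; all table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE occurrence (
    occurrenceID TEXT PRIMARY KEY,
    recordedBy   TEXT
);
CREATE TABLE specimen (
    specimenID   TEXT PRIMARY KEY,
    occurrenceID TEXT REFERENCES occurrence(occurrenceID),
    preparations TEXT
);
CREATE TABLE image (
    imageID      TEXT PRIMARY KEY,
    occurrenceID TEXT REFERENCES occurrence(occurrenceID),
    format       TEXT
);
""")

occ = "http://herbarium.org/12345"
conn.execute("INSERT INTO occurrence VALUES (?, ?)", (occ, "Joe Curator"))
conn.execute("INSERT INTO specimen VALUES (?, ?, ?)",
             ("spec-1", occ, "pressed and dried"))
conn.execute("INSERT INTO image VALUES (?, ?, ?)",
             ("img-1", occ, "image/jpeg"))


def token_types(conn, occurrence_id):
    """Return the types of all tokens linked to one occurrence."""
    found = []
    # Table names come from this fixed tuple, never from user input.
    for table, label in (("specimen", "PreservedSpecimen"),
                         ("image", "StillImage")):
        count = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE occurrenceID = ?",
            (occurrence_id,),
        ).fetchone()[0]
        if count:
            found.append(label)
    return found


print(token_types(conn, occ))
```

In this layout the token type follows from which table a token row lives 
in, and an occurrence with several tokens simply returns several types, 
so a single basisOfRecord value on the occurrence row has nothing 
unambiguous to record.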

So again, my basic question is not so much about the various types and 
subtypes to which we can refer, but specifically how the basisOfRecord 
term is to be used.

Steve

Blum, Stan wrote:
> Steve,
>
> I wasn't involved in those final discussions of dwc:basisOfRecord and the
> type vocabulary, but I don't see a difficulty.  The Dublin Core type
> vocabulary includes the following:
>
>     Collection
>     Dataset
>     Event
>     Image
>     InteractiveResource
>     MovingImage
>     PhysicalObject
>     Service
>     Software
>     Sound
>     StillImage
>     Text
>
> The Darwin Core types extend that with the three additional types, and with
> Occurrence being further subtyped with those different kinds of Occurrences.
>
>     Location
>     Taxon
>     Occurrence
>         PreservedSpecimen
>         FossilSpecimen
>         LivingSpecimen
>         HumanObservation
>         MachineObservation
>         NomenclaturalChecklist
>
> Data publishers/providers should categorize their Occurrence data as one of
> those subtypes (or perhaps another subtype that wasn't included in that
> list).  It's up to the consumer to decide which kind of Occurrence subtypes
> are appropriate for a particular use.
>
> Could you give examples of the inconsistent use of values in basisOfRecord ?
>
> Also note, John W. is traveling at the moment.  Markus might be able to
> provide additional thoughts.
>
> -Stan
>
>
>
>
> On 10/25/10 10:34 PM, "Steve Baskauf" <steve.baskauf at vanderbilt.edu> wrote:
>
>   
>> OK, I know that this sounds like a stupid question, but I really want
>> somebody who was involved in the development and maintenance of the
>> current DwC standard to tell me how the term dwc:basisOfRecord is
>> supposed to be used (not what it IS - I've seen the definition at
>> http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord)?  I would like for
>> the answer to this question to be separated from the issue of what the
>> Darwin Core type vocabulary
>> (http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm) is for.
>>
>> I re-read the lengthy thread starting with
>> http://lists.tdwg.org/pipermail/tdwg-content/2009-October/000301.html
>> which talked a lot about basisOfRecord and its relationship to other
>> ways of typing things.  I don't want to re-plough that ground, but
>> I couldn't find the post that stated what the final decision was. I
>> remember that there was a decision to NOT create the recordClass term
>> which was the subject of much discussion.
>>
>> I guess my confusion at this point is with the inclusion of both
>> "Occurrence" and "PreservedSpecimen" in the same list.  Let's say that I
>> have a flat database where I include metadata about the Occurrence (such
>> as dwc:recordedBy) and the specimen (such as dwc:preparations) in the
>> same line.  What is the basisOfRecord for that line?  I would guess that
>> the "basis of the record" was the specimen.  But the line in the record
>> also represents an Occurrence.  It seems like there is a lack of clarity
>> as to whether basisOfRecord is supposed to indicate the type of the
>> record (which would be an Occurrence record) or whether it's supposed to
>> indicate the kind of evidence on which the record is based (which would
>> be PreservedSpecimen).  There have been various times where I've seen a
>> database record that includes basisOfRecord and it seems to be
>> inconsistently applied.
>>
>> I can see how the Darwin Core type vocabulary could be useful - it
>> pretty much lays out useful values that one could give for rdf:type.
>> But basisOfRecord as a term is confusing me.
>>
>> Steve
>>     
>

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu
