[tdwg-content] practical details of recording a determination What is an Occurrence?

Richard Pyle deepreef at bishopmuseum.org
Wed Oct 20 23:26:36 CEST 2010


> I think I answered this (in a sense) in the other email that I sent a
> little while ago.

Yes, you did; and I now understand where you're coming from.

> In principle, if every record of an Occurrence has a time that is
> discernibly different from the times of other Occurrences (i.e. because
> the time was recorded to the nearest second by a machine), then yes, I
> consider it to be a different event.  To force people to lump such
> Occurrences together into one Event (say comprising a day or some other
> time interval larger than a second) is essentially throwing away data
> that we already have about the Occurrences.  I have already found it
> useful to know exactly what time of day a flower was opened rather than
> closed or the order in which I took several images.

Understood.  But we essentially never throw data away -- the problem is we
have no way of tracking the high resolution data.  For example, if we run a
plankton tow, or a fish poison station, we have no real way of putting
individual timestamps or geopoints on every individual -- our only option is
to collapse it all into one event.  We can then ditto the same values for
all 200 specimens; or we can normalize it to one Event linked to 200
Individuals.

Likewise with locality.  There are specific study sites that get revisited
over and over, so it's useful to have a single Locality instance linked to
all of the events.  Also, many, many records in our database only have
generic locality descriptors (e.g., "Honolulu"), so there is no point in
duplicating the lat/long/uncertainty/descriptors for hundreds of events --
just define a unique locality, and link it to the hundreds of events.  Then
if we get a re-interpretation of lat/long/uncertainty for "Honolulu", or if
we establish it as a polygon, we simply attach that metadata to the one
Locality record, rather than the hundreds of events (or thousands of
Occurrences).  You know -- the usual reasons for normalizing a data model.
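
To make the normalization argument concrete, here is a minimal sketch in
Python.  It is only an illustration, not a description of any actual
database: the class layout, the field names (loosely modeled on Darwin Core
terms), the identifiers, and the coordinate values are all assumptions.
It shows one Locality linked to one Event linked to 200 Occurrences, so
that a re-interpretation of the coordinates is recorded exactly once.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Locality:
    # Hypothetical fields, loosely modeled on Darwin Core Location terms.
    locality_id: str
    verbatim_locality: str
    decimal_latitude: Optional[float] = None
    decimal_longitude: Optional[float] = None
    uncertainty_m: Optional[float] = None


@dataclass
class Event:
    event_id: str
    event_date: str
    locality: Locality  # many Events point at one Locality


@dataclass
class Occurrence:
    occurrence_id: str
    event: Event  # many Occurrences point at one Event


# One generic locality, one collecting event, 200 specimens.
honolulu = Locality("loc-1", "Honolulu")
tow = Event("evt-1", "2010-10-20", honolulu)
specimens = [Occurrence(f"occ-{i}", tow) for i in range(1, 201)]

# A later re-interpretation of the coordinates (illustrative values only)
# is written to the single Locality record, not to 200 Occurrences.
honolulu.decimal_latitude = 21.3
honolulu.decimal_longitude = -157.86
honolulu.uncertainty_m = 10000.0

# Every specimen sees the updated values through its Event.
first = specimens[0].event.locality
last = specimens[-1].event.locality
```

Because every Occurrence reaches the coordinates through the same Event
and Locality objects, `first` and `last` are the same record, which is the
whole point of not "dittoing" the values 200 times.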
	
> For clarification, when I argue that the class Individual should 
> exist in Darwin Core, it's not because I'm insisting that all 
> users must have an Individual table in their database.  What I 
> want is for people to be ABLE to have an Individual table in 
> their database (if they need it) and have others understand 
> what it means and how the entities described in that table are 
> related conceptually to other things like Identifications and 
> Occurrences.  If all of the records in their database have 
> only one occurrence per individual, they don't "need" to 
> keep track of Individuals.

Having read through this complete discussion, I am now convinced you are
right.  We already have dwc:individualID; so we're primed. Logically, if
such a class is established, then there are several terms in the Occurrence
Class that ought to be migrated to the Individual Class.  I don't know how
much inertia must be overcome in order to
propose/review/discuss/vote/ratify such a change in DwC, but if/when we get
to that point, count me as a strong supporter.
	
> I confess my crime.  In penance, I freely confess that the 
> Event class exists and that people should use it in their 
> databases if it helps them cluster Occurrences.  In addition, 
> I confess that I should probably acknowledge the existence 
> of Events and their relationship to other Darwin Core 
> classes when I write RDF.  Guilty as charged!

Likewise for me!  I realized afterward that I was defending the collapsing
of Individual into Occurrence, while at the same time fighting the (equally
justified) collapse of Location and/or Event into Occurrence.  So I was
playing both sides too.  Mea culpa, and I now support the class Individual.

[Lots of stuff we agree on deleted]
	
> As I said in my earlier email, I'm encouraged by the 
> consistency in the way I hear people talking about the 
> relationships among the DwC classes.  I was a bit afraid 
> when we started this thread that I would turn out to have 
> some kind of fringe ideas.  Now what I'm seeing is a lot 
> of variation on how people choose to "collapse" the basic 
> model to meet their individual needs, but not a lot of 
> disagreement about what the basic model is.

I have to say, this has been about the most productive (if voluminous)
list-discussion I've had in...well...maybe ever.  It seems we've both been
equally persuasive, and equally willing to concede.  How rarely that happens
in an internet forum!  I'm not sure there's anything left that we disagree
about.  If the "diagram1" seems to resonate with everyone as the most
"normalized" ER diagram we'll likely ever need, and if we can somehow
accommodate flexibility in RDF for collapsing attributes to different
classes (but only from the "one" side to the "many" side) -- then we might
have achieved the elusive Holy Grail of biodiversity informatics: true
consensus.
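
As a rough illustration of why collapsing is safe only from the "one" side
to the "many" side, here is a minimal sketch in Python.  The helper name
`collapse_to_many_side` and the sample records are hypothetical; only the
key names (occurrenceID, eventID, eventDate, locality) are borrowed from
Darwin Core.  Each "many"-side record has exactly one parent, so copying
the parent's attributes down is unambiguous.

```python
from typing import Dict, List


def collapse_to_many_side(many: List[dict], one: Dict[str, dict], fk: str) -> List[dict]:
    """Copy the attributes of each 'one'-side record onto its 'many'-side records."""
    flat = []
    for record in many:
        parent = one[record[fk]]  # exactly one parent per child record
        merged = {k: v for k, v in record.items() if k != fk}
        merged.update(parent)     # parent attributes are repeated on each child
        flat.append(merged)
    return flat


# Hypothetical sample data: one Event, two Occurrences.
events = {"evt-1": {"eventDate": "2010-10-20", "locality": "Honolulu"}}
occurrences = [
    {"occurrenceID": "occ-1", "eventID": "evt-1"},
    {"occurrenceID": "occ-2", "eventID": "evt-1"},
]

flat = collapse_to_many_side(occurrences, events, "eventID")
```

Going the other way (pushing Occurrence attributes up onto the Event) has
no such guarantee, because one Event may have many Occurrences with
different values for the same attribute.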

> Thanks for your great feedback and for challenging my statements.  I need
> that!

Likewise!	

Aloha,
Rich



