A belated happy New Year to all!
It's been quite a while since the last post to the TDWG-observations list, so I have included the previous message at the bottom as a reminder of where we are in the discussion about a working definition for "observation".
In the interim, I discussed directly with Steve a minor edit to make the wording in the second sentence of the observation definition plural. Steve liked this edit, so we propose the following as our working definition for the observation subgroup:
"An observation characterizes the occurrence of an organism or set of organisms through a data collection event at a location. Observations are not necessarily independent entities and could be linked via characteristics such as time, place, protocol, and co-occurring organisms."
As a next step, we propose to develop more fully the definitions for the following words or phrases. As we work through these definitions, please keep in mind the issues that have been raised regarding topics such as negative data, protocol, spatial and temporal issues, and data aggregation.
occurrence
data collection event
location
entity
could be linked
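To make the terms listed above a little more concrete, here is a rough Python sketch of how the pieces of the working definition might map onto a data structure. It is only an illustration under stated assumptions: every class, field, and function name (Location, DataCollectionEvent, Observation, linked_to, link_by_event) is a hypothetical placeholder, not an agreed schema, and linking by shared event is just one of the possible linking characteristics.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    # Free text so that imprecise places can be recorded (assumption).
    description: str
    latitude: Optional[float] = None
    longitude: Optional[float] = None


@dataclass
class DataCollectionEvent:
    location: Location
    # Free text so that imprecise dates such as "spring 1998" fit (assumption).
    date_text: str
    # Optional, so opportunistic records without a formal protocol are allowed.
    protocol: Optional[str] = None


@dataclass(eq=False)  # compare observations by identity, since links may be cyclic
class Observation:
    organisms: List[str]
    event: DataCollectionEvent
    linked_to: List["Observation"] = field(default_factory=list)


def link_by_event(observations: List[Observation]) -> List[Observation]:
    """Link observations that share the same data collection event object."""
    for a in observations:
        a.linked_to = [
            b for b in observations if b is not a and b.event is a.event
        ]
    return observations
```

In this sketch, two observations recorded during the same event end up in each other's linked_to lists, which is one way to read "not necessarily independent entities".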
We thought it best to start by fleshing out the definition for "occurrence", since that seems to be fundamental. I searched for existing definitions of "occurrence" in a biological context, and it seems most often to be used on the assumption that everybody is working from a common frame of reference.
The following is from:
Chapman, A. D. 2005. Principles of Data Quality, version 1.0. Report for the Global Biodiversity Information Facility, Copenhagen.
(http://www.gbif.org/prog/digit/data_quality)
Species-occurrence data
Species-occurrence data is used here to include specimen label data attached to specimens or lots housed in museums and herbaria, observational data and environmental survey data. In general, the data are what we term "point-based", although line (transect data from environmental surveys, collections along a river), polygon (observations from within a defined area such as a national park) and grid data (observations or survey records from a regular grid) are also included. In general we are talking about georeferenced data - i.e. records with geographic references that tie them to a particular place in space - whether with a georeferenced coordinate (e.g. latitude and longitude, UTM) or not (textual description of a locality, altitude, depth) - and time (date, time of day). In general the data are also tied to a taxonomic name, but unidentified collections may also be included. The term has occasionally been used interchangeably with the term "primary species data".
Primary species data
"Primary species data" is used to describe raw collection data and data without any spatial attributes. It includes taxonomic and nomenclatural data without spatial attributes, such as names, taxa and taxonomic concepts without associated geographic references.
*****
Thoughts or suggestions on a working definition of "occurrence" for this subgroup?
Thank you in advance for your ideas!
Lynn
Lynn Kutner
Data Management Coordinator, NatureServe
Email: lynn_kutner@natureserve.org
Phone: (303) 541-0360
www.natureserve.org
________________________________
From: Steve Kelling [mailto:stk2@cornell.edu]
Sent: Wednesday, November 30, 2005 7:57 AM
To: Lynn Kutner; Tdwg-obs@lists.tdwg.org
Subject: Re: [Tdwg-obs] Monitoring definition and protocol repository
Hello, I really like the comments that Lynn has brought up, and appreciate that she forwarded the definition around NatureServe. I do have a couple of comments, which are found in the text below.
At 06:36 PM 11/22/2005 -0500, you wrote:
There was concern that including "defined protocol" in the definition of observation could be overly restrictive and prevent the inclusion in an observation data repository of high quality / high confidence information that was collected opportunistically and not as part of an official survey with a protocol.
While in some ways I feel that it is good to include a protocol in the definition, I can agree that it is not really essential in a primary definition of an observation. My reason is that in the NA bird monitoring community, even opportunistic observations are considered a protocol. These are characterized as opportunistic or incidental observations. Furthermore, in our development of a data exchange schema for bird monitoring data, the variable protocol figured prominently. So, while not part of the definition, I suggest that we make sure to spend some time on protocols in future discussions.
We'd like to propose adding some language to incorporate explicit tracking of negative data. These would be data where a survey was conducted for a certain species (such as a rare orchid) in an area where it would be expected to be found, and the observer wants to document and communicate that information to inform future survey efforts, distribution mapping efforts, and activities such as conservation planning. This would be different from inferring negative data as is sometimes done with bird observation data.
I think that this is a good idea. The concept of negative data is an important one, with a variety of different angles. But, as with protocol, I don't believe that it is absolutely essential in our definition. My feeling is that the way we use occurrence in the definition does not imply only positive observations. So, I suggest that we do not include this, but again spend quite a bit of time on this topic as we expand and explore the basic definition.
Also - does the "defined spatiotemporal location" need to be highly precise, or can it be defined generally (with spatiotemporal uncertainty as needed) to reflect knowledge of the observation or observer? For example - include imprecise dates (e.g. spring 1998) and locations (3 miles SW of the intersection of Clear Creek and Main Road)?
Yes, I believe that we can be less explicit on the location. So having more vague terminology in the definition is a good thing.
Finally, I always feel that a definition should strive to get its message across in as few words as possible. So, I might suggest:
"An observation characterizes the occurrence of an organism or set of organisms through a data collection event at a location. An observation is not necessarily an independent entity and could be linked via characteristics such as time, place, protocol, and co-occurring organisms."
The words or phrases in bold in the definition need to be developed more fully. As we work through the definitions of these words and phrases, I believe that the issues being brought up, such as negative data, protocol, spatial and temporal issues, and data aggregation, can be addressed.
Regards,