- tdwg-content - lists.tdwg.org

[Tdwg-obs] On observation definition / moving forward
by Robert K. Peet 01 Feb '06

Hi Lynn

> "An observation characterizes the occurrence of an organism or set of
> organisms through a data collection event at a location. Observations
> are not necessarily independent entities and could be linked via
> characteristics such as time, place, protocol, and co-occurring
> organisms."
>
> As a next step, we propose to develop more fully the definitions for the
> following words or phrases. As we work through these definitions,
> please keep in mind addressing the issues that have been raised
> regarding topics such as: negative data, protocol, spatial temporal
> issues, and data aggregation.
>
> occurrence

Perhaps change "occurrence" to "evidence for the presence or absence"? The key idea is that the organism or set of organisms was either detected or not. We also need to provide an opportunity for the recorder to note the certainty. As an aside, recall we need to support minimalist protocols (e.g. "organism/community (not) seen in field", "organism heard in field", "scat seen in field", "tracks seen in field", "museum collection").

> data collection event

= An event, during or after which at least the minimum required data were recorded.

> location

Ideally, at least geocoordinates plus an accuracy term. We may wish to support such primitive location indicators as place names, but this is dangerous and I would prefer to require translation of names into geocoordinates and precision. The geocoordinates should also be allowed to be associated with a set of points that define the edges of an area, or other spatial metadata.

> entity

Deletion of this word from the definition might help.

> could be linked

= can have a pointer or pointers to other observations, thereby creating aggregate observations. Note that commonality of date, time, place, etc. is not sufficient, in that none of the observation authors explicitly made the connection.

Best,
Bob

======================================================================
Robert K. Peet, Professor & Chair        Phone: 919-962-6942
Curriculum in Ecology, CB#3275           Fax:   919-962-6930
University of North Carolina             Cell:  919-368-4971
Chapel Hill, NC 27599-3275 USA           Email: peet(a)unc.edu
http://www.unc.edu/depts/ecology/
http://www.bio.unc.edu/faculty/peet/
======================================================================
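The points in Bob's reply (detected or not, a certainty note, geocoordinates plus accuracy, and explicit pointers between observations) could be sketched as a small record type. This is only an illustrative sketch, not a proposed schema: every class and field name below (`Location`, `Observation`, `accuracy_m`, `linked_to`, `aggregate`) is hypothetical, and the certainty scale of 0 to 1 is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Location:
    latitude: float    # decimal degrees
    longitude: float   # decimal degrees
    accuracy_m: float  # assumed unit: radius of uncertainty in metres
    # Optional vertices describing the edges of an area
    boundary: List[Tuple[float, float]] = field(default_factory=list)


@dataclass
class Observation:
    obs_id: str
    taxon: str
    detected: bool              # True = evidence of presence, False = evidence of absence
    evidence: str               # minimalist protocol, e.g. "tracks seen in field"
    certainty: Optional[float]  # recorder's certainty, assumed 0..1; None if not given
    location: Location
    # Explicit pointers to other observations, set by the observation author
    linked_to: List[str] = field(default_factory=list)


def aggregate(observations: List[Observation], root_id: str) -> Set[str]:
    """Collect ids reachable from root_id by following explicit pointers only.

    Shared date, time or place never links two records on its own.
    """
    by_id = {o.obs_id: o for o in observations}
    seen: Set[str] = set()
    stack = [root_id]
    while stack:
        current = stack.pop()
        if current in seen or current not in by_id:
            continue
        seen.add(current)
        stack.extend(by_id[current].linked_to)
    return seen


site = Location(latitude=35.9, longitude=-79.05, accuracy_m=30.0)
tracks = Observation("a", "Gulo gulo", True, "tracks seen in field", 0.8, site, ["b"])
scat = Observation("b", "Gulo gulo", True, "scat seen in field", 0.6, site)
absent = Observation("c", "Gulo gulo", False, "organism (not) seen in field", None, site)

print(aggregate([tracks, scat, absent], "a"))
```

In this sketch the third record shares the same location as the other two but is not part of the aggregate, because no author linked it. The pointers are followed in one direction only; whether links should be symmetric is left open.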

[Tdwg-obs] observation definition / moving forward
by Lynn Kutner 31 Jan '06

A belated happy New Year to all!

It's been quite a while since the last post to the TDWG-observations list, so I have included at the bottom the previous message as a reminder of where we are in the discussion about a working definition for "observation".

In the interim, I discussed directly with Steve a minor edit to make the wording in the second sentence of the observation definition plural. Steve liked this edit, so we propose the following as our working definition for the observation subgroup:

"An observation characterizes the occurrence of an organism or set of organisms through a data collection event at a location. Observations are not necessarily independent entities and could be linked via characteristics such as time, place, protocol, and co-occurring organisms."

As a next step, we propose to develop more fully the definitions for the following words or phrases. As we work through these definitions, please keep in mind addressing the issues that have been raised regarding topics such as: negative data, protocol, spatial temporal issues, and data aggregation.

occurrence
data collection event
location
entity
could be linked

We thought it best to start with fleshing out the definition for "occurrence" since that seems to be fundamental. I did some searching for existing definitions of "occurrence" in a biological context and it seems to most often be a word that is used with the assumption that everybody is working from a common frame of reference.

The following is from: Chapman, A. D. 2005. Principles of Data Quality, version 1.0. Report for the Global Biodiversity Information Facility, Copenhagen. (http://www.gbif.org/prog/digit/data_quality)

Species-occurrence data

Species-occurrence data is used here to include specimen label data attached to specimens or lots housed in museums and herbaria, observational data and environmental survey data. In general, the data are what we term "point-based", although line (transect data from environmental surveys, collections along a river), polygon (observations from within a defined area such as a national park) and grid data (observations or survey records from a regular grid) are also included. In general we are talking about georeferenced data - i.e. records with geographic references that tie them to a particular place in space - whether with a georeferenced coordinate (e.g. latitude and longitude, UTM) or not (textual description of a locality, altitude, depth) - and time (date, time of day). In general the data are also tied to a taxonomic name, but unidentified collections may also be included. The term has occasionally been used interchangeably with the term "primary species data".

Primary species data

"Primary species data" is used to describe raw collection data and data without any spatial attributes. It includes taxonomic and nomenclatural data without spatial attributes, such as names, taxa and taxonomic concepts without associated geographic references.

*****

Thoughts or suggestions on a working definition of "occurrence" for this subgroup?

Thank you in advance for your ideas!

Lynn

Lynn Kutner
Data Management Coordinator, NatureServe
Email: lynn_kutner(a)natureserve.org
Phone: (303) 541-0360
www.natureserve.org

________________________________
From: Steve Kelling [mailto:stk2@cornell.edu]
Sent: Wednesday, November 30, 2005 7:57 AM
To: Lynn Kutner; Tdwg-obs(a)lists.tdwg.org
Subject: Re: [Tdwg-obs] Monitoring definition and protocol repository

Hello,

I really like comments that Lynn has brought, and appreciate that she forwarded the definition around NatureServe. I do have a couple of comments, which are found in the text below.

At 06:36 PM 11/22/2005 -0500, you wrote:

>There was concern that including "defined protocol" in the definition of
>observation could be overly restrictive and prevent the inclusion in an
>observation data repository of high quality / high confidence
>information that was collected opportunistically and not as part of an
>official survey with a protocol.

While in some ways I feel that it is good to include a protocol in the definition, I can agree that it is not really essential in a primary definition of an observation. My reason is that in the NA bird monitoring community, even opportunistic observations are considered a protocol. These are characterized as opportunistic or incidental observations. Furthermore, in our development of a data exchange schema for bird monitoring data the variable protocol figured prominently. So, while not part of the definition, I suggest that we make sure to spend some time on a discussion of protocols in future discussions.

>We'd like to propose adding some language to incorporate explicit
>tracking of negative data. These would be data where a survey was
>conducted for a certain species (such as a rare orchid) in an area where
>it would be expected to be found, and the observer wants to document and
>communicate that information to inform future survey efforts,
>distribution mapping efforts, and activities such as conservation
>planning. This would be different from inferring negative data as is
>sometimes done with bird observation data.

I think that this is a good idea. The concept of negative data is an important one, with a variety of different angles. But, as with protocol, I don't believe that it is absolutely essential in our definition. My feeling is that the way we use occurrence in the definition does not imply only positive observations. So, I suggest that we do not include this, but again spend quite a bit of time on this topic as we expand and explore the basic definition.

>Also - does the "defined spatiotemporal location" need to be highly
>precise, or can it be defined generally (with spatiotemporal uncertainty
>as needed) to reflect knowledge of the observation or observer? For
>example - include imprecise dates (e.g. spring 1998) and locations (3
>miles SW of the intersection of Clear Creek and Main Road)?

Yes, I believe that we can be less explicit on the location. So having more vague terminology in the definition is a good thing.

Finally, I always feel that a definition should strive to get its message across in as few words as possible. So, I might suggest:

"An observation characterizes the occurrence of an organism or set of organisms through a data collection event at a location. An observation is not necessarily an independent entity and could be linked via characteristics such as time, place, protocol, and co-occurring organisms."

The words or phrases in bold in the definition need to be developed more fully. As we work through the definitions of these words and phrases I believe that issues being brought up such as negative data, protocol, spatial temporal issues, and data aggregation can be addressed.

Regards,

Re: [Tdwg-obs] Monitoring definition and protocol repository
by Steve Kelling 30 Nov '05

Hello,

I really like comments that Lynn has brought, and appreciate that she forwarded the definition around NatureServe. I do have a couple of comments, which are found in the text below.

At 06:36 PM 11/22/2005 -0500, you wrote:

>There was concern that including "defined protocol" in the definition of
>observation could be overly restrictive and prevent the inclusion in an
>observation data repository of high quality / high confidence
>information that was collected opportunistically and not as part of an
>official survey with a protocol.

While in some ways I feel that it is good to include a protocol in the definition, I can agree that it is not really essential in a primary definition of an observation. My reason is that in the NA bird monitoring community, even opportunistic observations are considered a protocol. These are characterized as opportunistic or incidental observations. Furthermore, in our development of a data exchange schema for bird monitoring data the variable protocol figured prominently. So, while not part of the definition, I suggest that we make sure to spend some time on a discussion of protocols in future discussions.

>We'd like to propose adding some language to incorporate explicit
>tracking of negative data. These would be data where a survey was
>conducted for a certain species (such as a rare orchid) in an area where
>it would be expected to be found, and the observer wants to document and
>communicate that information to inform future survey efforts,
>distribution mapping efforts, and activities such as conservation
>planning. This would be different from inferring negative data as is
>sometimes done with bird observation data.

I think that this is a good idea. The concept of negative data is an important one, with a variety of different angles. But, as with protocol, I don't believe that it is absolutely essential in our definition. My feeling is that the way we use occurrence in the definition does not imply only positive observations. So, I suggest that we do not include this, but again spend quite a bit of time on this topic as we expand and explore the basic definition.

>Also - does the "defined spatiotemporal location" need to be highly
>precise, or can it be defined generally (with spatiotemporal uncertainty
>as needed) to reflect knowledge of the observation or observer? For
>example - include imprecise dates (e.g. spring 1998) and locations (3
>miles SW of the intersection of Clear Creek and Main Road)?

Yes, I believe that we can be less explicit on the location. So having more vague terminology in the definition is a good thing.

Finally, I always feel that a definition should strive to get its message across in as few words as possible. So, I might suggest:

"An observation characterizes the occurrence of an organism or set of organisms through a data collection event at a location. An observation is not necessarily an independent entity and could be linked via characteristics such as time, place, protocol, and co-occurring organisms."

The words or phrases in bold in the definition need to be developed more fully. As we work through the definitions of these words and phrases I believe that issues being brought up such as negative data, protocol, spatial temporal issues, and data aggregation can be addressed.

Regards,

Re: [Tdwg-obs] Monitoring definition and protocol repository
by Lynn Kutner 23 Nov '05

Hi all -

I think we're really close ... But I have a couple of questions and a suggested minor rewording.

I forwarded Bob Peet's definition to several NatureServe staff, member programs, and partners who work with observation data. The following reflects the compiled comments from several people.

There was concern that including "defined protocol" in the definition of observation could be overly restrictive and prevent the inclusion in an observation data repository of high quality / high confidence information that was collected opportunistically and not as part of an official survey with a protocol. For example - a biologist out doing a bird survey happens upon some scat or tracks that he/she recognizes as a species of interest to him/her (say, wolverine) and records the GPS coordinates and some basic information about the date/time and location. Would this be acceptable for an observations system? Would it be acceptable to indicate the protocol as "none" or something similar?

We'd like to propose adding some language to incorporate explicit tracking of negative data. These would be data where a survey was conducted for a certain species (such as a rare orchid) in an area where it would be expected to be found, and the observer wants to document and communicate that information to inform future survey efforts, distribution mapping efforts, and activities such as conservation planning. This would be different from inferring negative data as is sometimes done with bird observation data.

Also - does the "defined spatiotemporal location" need to be highly precise, or can it be defined generally (with spatiotemporal uncertainty as needed) to reflect knowledge of the observation or observer? For example - include imprecise dates (e.g. spring 1998) and locations (3 miles SW of the intersection of Clear Creek and Main Road)?

So ... With the above as background, what do people think of the following tinkering with Bob's suggested definition:

"An observation characterizes the occurrence, or documents the lack of occurrence, of an organism or set of organisms through a data collection event at a location. Individual observations are not necessarily independent entities and potentially can be linked through common characteristics such as time, place, protocol, and co-occurring organisms."

Thanks -
Lynn

Lynn Kutner
NatureServe
Email: lynn_kutner(a)natureserve.org
Phone: (303) 541-0360
www.natureserve.org

-----Original Message-----
From: Tdwg-obs-bounces(a)lists.tdwg.org [mailto:Tdwg-obs-bounces@lists.tdwg.org] On Behalf Of Steve Kelling
Sent: Monday, November 21, 2005 5:19 AM
To: Robert K. Peet; Tdwg-obs(a)lists.tdwg.org
Subject: [Tdwg-obs] Monitoring definition and protocol repository

All,

I really like Bob's rewording of the definition, and suggest that we all refer to this.

Steve

At 12:38 PM 11/20/2005 -0500, Robert K. Peet wrote:
>Hi Steve,
>
>Try the following revision, which is in part an exercise to see whether I
>understand your definition.
>
>"An observation characterizes the occurrence of an organism or set of
>organisms through a data collection event using a defined protocol at a
>defined spatiotemporal location. Individual observations are not
>necessarily independent entities and potentially can be linked through
>common characteristics such as time, place, protocol, and individual
>organisms."
>
>Bob
>
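Lynn's question about imprecise dates such as "spring 1998" suggests one possible representation: store every date as an earliest/latest pair, so a precise date and a vague one share the same shape. The sketch below is illustrative only. The names `DateRange`, `SEASON_MONTHS` and `season_to_range` are hypothetical, and the month boundaries assume northern-hemisphere meteorological seasons, which the thread does not specify.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: northern-hemisphere meteorological seasons (first month, last month).
SEASON_MONTHS = {
    "spring": (3, 5),
    "summer": (6, 8),
    "autumn": (9, 11),
}


@dataclass(frozen=True)
class DateRange:
    earliest: date
    latest: date

    @property
    def uncertainty_days(self) -> int:
        """Width of the interval in days; 0 means the date is known exactly."""
        return (self.latest - self.earliest).days


def season_to_range(season: str, year: int) -> DateRange:
    """Translate a vague date such as 'spring 1998' into an explicit interval."""
    first_month, last_month = SEASON_MONTHS[season]
    # Last day of the season = first day of the following month minus one day.
    end = date(year, last_month + 1, 1) - timedelta(days=1)
    return DateRange(date(year, first_month, 1), end)


vague = season_to_range("spring", 1998)
exact = DateRange(date(1998, 4, 12), date(1998, 4, 12))
print(vague.earliest, vague.latest, vague.uncertainty_days)
print(exact.uncertainty_days)
```

The same idea would carry over to locations such as "3 miles SW of the intersection of Clear Creek and Main Road", where a point estimate plus an uncertainty radius plays the role of the interval.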

Re: [Tdwg-obs] Tdwg-obs Digest, Vol 2, Issue 5
by Robert K. Peet 21 Nov '05

Hi Steve,

Try the following revision, which is in part an exercise to see whether I understand your definition.

"An observation characterizes the occurrence of an organism or set of organisms through a data collection event using a defined protocol at a defined spatiotemporal location. Individual observations are not necessarily independent entities and potentially can be linked through common characteristics such as time, place, protocol, and individual organisms."

Bob

> Date: Sun, 20 Nov 2005 11:01:11 -0500
> From: Steve Kelling <stk2(a)cornell.edu>
...
> I suggest that we go with the definition of an observation as:
>
> An observation characterizes the occurrence of an organism (or a community
> of organisms) through a collection event using a defined protocol and
> spatiotemporal location. Individual observations are non-independent
> entities that can be linked to each other through their common characteristics.
>

======================================================================
Robert K. Peet, Professor & Chair        Phone: 919-962-6942
Curriculum in Ecology, CB#3275           Fax:   919-962-6930
University of North Carolina             Cell:  919-368-4971
Chapel Hill, NC 27599-3275 USA           Email: peet(a)unc.edu
http://www.unc.edu/depts/ecology/
http://www.bio.unc.edu/faculty/peet/
======================================================================

Re: [Tdwg-obs] Tdwg-obs Digest, Vol 2, Issue 5
by Jerry Cooper 20 Nov '05

I support Bob's definition below.

Jerry

>>> "Robert K. Peet" <peet(a)unc.edu> 11/21/05 6:38 AM >>>
Hi Steve,

Try the following revision, which is in part an exercise to see whether I understand your definition.

"An observation characterizes the occurrence of an organism or set of organisms through a data collection event using a defined protocol at a defined spatiotemporal location. Individual observations are not necessarily independent entities and potentially can be linked through common characteristics such as time, place, protocol, and individual organisms."

Bob

> Date: Sun, 20 Nov 2005 11:01:11 -0500
> From: Steve Kelling <stk2(a)cornell.edu>
...
> I suggest that we go with the definition of an observation as:
>
> An observation characterizes the occurrence of an organism (or a community
> of organisms) through a collection event using a defined protocol and
> spatiotemporal location. Individual observations are non-independent
> entities that can be linked to each other through their common characteristics.
>

_______________________________________________
Tdwg-obs mailing list
Tdwg-obs(a)lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-obs_lists.tdwg.org

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
WARNING: This email and any attachments may be confidential and/or privileged. They are intended for the addressee only and are not to be read, used, copied or disseminated by anyone receiving them in error. If you are not the intended recipient, please notify the sender by return email and delete this message and any attachments.
The views expressed in this email are those of the sender and do not necessarily reflect the official views of Landcare Research.
Landcare Research
http://www.landcareresearch.co.nz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Re: [Tdwg-obs] Definition of an observation
by Jerry Cooper 20 Nov '05

Steve,

Changing 'collection event' to 'recording event' is preferable to me. I know there has been previous debate about the distinction between collections/observations but the primary purpose is to record data - with or without associated specimens.

I think we are approaching a reasonable definition, although I think the last sentence is redundant.

Jerry Cooper

>>> Steve Kelling <stk2(a)cornell.edu> 11/21/05 5:01 AM >>>
Hello,

I think that the issues being brought up around the proposed definition of an observation, and the subsequent discussions and proposed modifications of the definition have been illuminating. The result, I believe, is that we are creating a more useful definition of an observation. I do want to make a couple of points.

First, what we are trying to define is the fundamental datum, or fixed reference point from which observational data can be integrated with data from other biodiversity initiatives such as the organization of museum specimen collections. I agree with Denis that in order to accomplish this we must be able to define the "core elements" of an observation. Consequently, this definition only needs to recognize protocols, sampling area, quality etc. Addressing these issues is the next step and Matt's comments are right on.

Second, Lynn's comment about the inclusion of ecological communities is extremely important for the same reasons that aggregation of observational data is.

These two points have a major impact on how biodiversity informatics, particularly international efforts such as TDWG and GBIF, organize and present information. These concepts move away from the specimen as the foundation of organization. For example, the community concept is based on species assemblages (instead of a single species), is well embraced by other ecological organizing initiatives as Bob Peet points out, and necessitates a change in organizational schemas. Furthermore, aggregating observations into collecting events changes the hierarchical structure of organizational schemas, because the collecting event and not the individual species record is the principal focus.

I suggest that we go with the definition of an observation as:

An observation characterizes the occurrence of an organism (or a community of organisms) through a collection event using a defined protocol and spatiotemporal location. Individual observations are non-independent entities that can be linked to each other through their common characteristics.

If this group agrees with this I will post it on the TDWG observational monitoring web page.

Regards,

Steve Kelling
Cornell Lab of Ornithology
607-254-2478 (work)
607-342-1029 (cell)

[Tdwg-obs] Definition of an observation
by Steve Kelling 20 Nov '05

Hello,

I think that the issues being brought up around the proposed definition of an observation, and the subsequent discussions and proposed modifications of the definition have been illuminating. The result, I believe, is that we are creating a more useful definition of an observation. I do want to make a couple of points.

First, what we are trying to define is the fundamental datum, or fixed reference point from which observational data can be integrated with data from other biodiversity initiatives such as the organization of museum specimen collections. I agree with Denis that in order to accomplish this we must be able to define the "core elements" of an observation. Consequently, this definition only needs to recognize protocols, sampling area, quality etc. Addressing these issues is the next step and Matt's comments are right on.

Second, Lynn's comment about the inclusion of ecological communities is extremely important for the same reasons that aggregation of observational data is.

These two points have a major impact on how biodiversity informatics, particularly international efforts such as TDWG and GBIF, organize and present information. These concepts move away from the specimen as the foundation of organization. For example, the community concept is based on species assemblages (instead of a single species), is well embraced by other ecological organizing initiatives as Bob Peet points out, and necessitates a change in organizational schemas. Furthermore, aggregating observations into collecting events changes the hierarchical structure of organizational schemas, because the collecting event and not the individual species record is the principal focus.

I suggest that we go with the definition of an observation as:

An observation characterizes the occurrence of an organism (or a community of organisms) through a collection event using a defined protocol and spatiotemporal location. Individual observations are non-independent entities that can be linked to each other through their common characteristics.

If this group agrees with this I will post it on the TDWG observational monitoring web page.

Regards,

Steve Kelling
Cornell Lab of Ornithology
607-254-2478 (work)
607-342-1029 (cell)

Re: [Tdwg-obs] Survey and Monitoring
by Denis Lepage 15 Nov '05

Last email from me for tonight, I promise.

> Overall I like the direction of Denis's proposed definition, but I have a comment and a question.
>
> Comment - I propose to explicitly expand the definition of an observation to include both organisms and ecological communities (such as vegetation).

Here's a slight change that might work:

"An observation is the characterization of the occurrence of an organism (or a community of organisms) through a collection event with a defined protocol and spatiotemporal location. Individual observations are non-independent entities that can be linked to each other through their common characteristics."

The main difference I see is that in the case of ecological communities, you are not only monitoring occurrence (presence/absence), but also changes in the other characteristics of your community.

I think providing a separate definition for Monitoring would also be useful. Monitoring and observational data are 2 different things, although they are obviously intertwined (and sometimes confused). In the case of population monitoring, you are interested in changes of abundance of a population over space and time (and you typically use observation events as sample points to infer larger population changes). In the case of community monitoring, you are usually more interested in changes in the ecological characteristics of a particular area (and you also use observation events to do that).

> And a question - does the "defined protocol" need to be a
> published document, or can the protocol be more informal? I
> think that the observer should provide some minimal
> documentation of how they made that particular spatiotemporal
> observation, but I'd imagine there's quite a bit of high
> quality observation in existence that was collected
> opportunistically and not necessarily using a published protocol.

No, defined protocol didn't imply a scientifically rigorous one or a published one. "Casual data collection" could be a defined protocol. I couldn't think of a better term.

Denis

[Tdwg-obs] (no subject)
by Denis Lepage 15 Nov '05

Hi again,

I think Steve's point about aggregating data is a critical one. When you have a unique data source, you have a lot more control over issues such as protocol, effort, data quality, etc. In a distributed data environment, you need to be able to make the assessment about the data you are using as a data analyst. This always comes down to the issue of what questions you want to ask of your data. The goal is not necessarily to describe all data, but to share data for particular purposes. If you are talking about population monitoring, anything that will affect your chances of detecting a particular organism during an event is potentially of interest to the analyst.

I think we'll need to take on one piece at a time, or we risk being overwhelmed. Here's an attempt to summarize the main issues at hand:

- data quality/uncertainty (what are the parameters needed to describe it, and how do they differ among different communities). Some of it is a property of the protocol (eg, citizen-science vs. professional biologists), and some is a property of individual observations. Both are important.
- protocol (what are the parameters needed to describe it, and how do they differ among communities). Protocol issues include how to measure effort (for how long did you observe, how big an area did you cover), taxonomic focus (what are the taxa of interest? all birds? all aquatic invertebrates? all waterfowl? all raptors?), and more.
- sampling framework (is this a stratified survey, what are your strata, etc.). This is somewhat related to protocol, but you may also need information about individual strata (size, etc.).
- ancillary data (eg, environmental data, weather, water chemistry, etc. or community-specific information about the organisms, such as fat score for birds). Which ones are important, and most importantly, to answer what specific questions? This is probably the most complex aspect of having multiple communities involved. A lot of this will probably require specific extensions.
- I'm not sure how we can describe detectability in a data schema. This may be more something you want to be sure that you can measure from the data itself. If so, what are the components needed to do so?
- are there other topics?

I think we've made some progress already within the bird community on many of those fronts with the BMDE schema, but there's still some work to do, especially on the protocol description standards. Not sure what the best way to go about making progress on those things is for the larger monitoring community.

What about building a dictionary of monitoring protocols that we can refer to from the data schema? If each protocol had a unique ID, and this information was available somewhere outside of the data schema, it would probably simplify things immensely. The data schema would still need to have some protocol information (eg, if the duration varies among individual events following a given protocol). Were you thinking about something along those lines, Steve?

Good night

Denis

Denis Lepage, Senior Scientist/Chercheur sénior
National Data Center/Centre national des données
Bird Studies Canada/Études d'Oiseaux Canada
PO Box/B.P. 160, Port Rowan, ON N0E 1M0
519-586-3531 ext. 225, fax/téléc. 519-586-3532
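Denis's idea of a protocol dictionary kept outside the data schema, with event records referring to a protocol by unique ID and overriding only what varies, might look roughly like the sketch below. It is one possible reading, not the BMDE schema: the IDs (`P001`, `P002`), the field names and the `effective_duration` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Protocol:
    protocol_id: str                     # unique ID referenced from event records
    name: str
    taxonomic_focus: str                 # e.g. "all birds", "all raptors"
    default_duration_min: Optional[int]  # None when the protocol fixes no duration


# Hypothetical registry, maintained somewhere outside the data schema.
PROTOCOLS: Dict[str, Protocol] = {
    "P001": Protocol("P001", "Point count", "all birds", 10),
    "P002": Protocol("P002", "Casual data collection", "any", None),
}


@dataclass
class Event:
    event_id: str
    protocol_id: str
    # Per-event protocol information, needed when duration varies between events.
    duration_min: Optional[int] = None


def effective_duration(event: Event,
                       registry: Dict[str, Protocol] = PROTOCOLS) -> Optional[int]:
    """Use the event's own duration when recorded, else the protocol default."""
    protocol = registry[event.protocol_id]
    if event.duration_min is not None:
        return event.duration_min
    return protocol.default_duration_min


standard = Event("e1", "P001")
shortened = Event("e2", "P001", duration_min=5)
casual = Event("e3", "P002")

for ev in (standard, shortened, casual):
    print(ev.event_id, PROTOCOLS[ev.protocol_id].name, effective_duration(ev))
```

Keeping "Casual data collection" as an ordinary registry entry follows the earlier suggestion in the thread that opportunistic observation can itself count as a defined protocol.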
