Re: Standards for date / time values?
I concur with Richard on the sufficiency of the three fields, and prefer to err on the side of simplicity for the DwC.
On 2/24/06, Richard Pyle deepreef@bishopmuseum.org wrote:
RE: CollectingDatesInterpreted
When would this field *not* be set to true? In other words, what is the threshold for "interpreted", in terms of populating EarliestCollectingDate and LatestCollectingDate?
At one extreme, we have a VerbatimDateTime of "12 January 2005, 13:34:15 local time". Not a lot of interpretation needed there (other than conversion of original documentation format to ASCII/Unicode, and interpretation of the time zone from the locality data). But conversion of "12 January 2005" to
EarliestCollectingDate: "2005-01-12 00:00:00"
LatestCollectingDate: "2005-01-12 23:59:59"
...requires a bit of interpretation. Maybe the time portion could simply be omitted; but even still, there is technically *some* interpretation when converting "January" to "01".
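As a rough illustration of the conversion described above, here is a minimal Python sketch that expands a day-precision verbatim date into the two bounds. The function name `day_bounds` and the accepted input format are assumptions made for this example, not part of DwC.

```python
from datetime import datetime, time

def day_bounds(verbatim):
    """Expand a day-precision date such as '12 January 2005' into
    (earliest, latest) timestamp strings covering that whole day.

    Assumes an English month name and the 'DD Month YYYY' layout;
    any other verbatim format would need its own parsing rule.
    """
    day = datetime.strptime(verbatim, "%d %B %Y").date()
    earliest = datetime.combine(day, time(0, 0, 0))
    latest = datetime.combine(day, time(23, 59, 59))
    fmt = "%Y-%m-%d %H:%M:%S"
    return earliest.strftime(fmt), latest.strftime(fmt)

earliest, latest = day_bounds("12 January 2005")
print(earliest)  # 2005-01-12 00:00:00
print(latest)    # 2005-01-12 23:59:59
```

Even this small step encodes decisions (month name to number, day to a 24-hour span), which is the "interpretation" in question.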
And the spectrum continues on from there up until, well, stuff that would be unambiguously flagged as "interpreted".
My point is, there should be some sort of guideline as to when to set CollectingDatesInterpreted true; i.e., what threshold of interpretation triggers that field. If such a guideline already exists, then I apologize for missing it.
Personally, I don't think CollectingDatesInterpreted is necessary, really, because I have always regarded my equivalents of EarliestCollectingDate and LatestCollectingDate as interpreted.
RE: Lynn's examples below. I wrestled with these issues (and more) a number of years ago. Other examples include "Summer; year unknown" (important for seasonal species), and multiple (non-contiguous) date ranges: "May or August, 2004" -- among others.
In addition to the equivalents of VerbatimDateTime, EarliestCollectingDate, and LatestCollectingDate, I had a controlled-vocabulary field called "DateRangeQualifier", which had an enumerated list of options tied to interpretation logic (e.g., "continuous range", "discontinuous range", "unknown point", "either/or", etc.; as well as sub-range qualifiers like "Summer"). I modelled it after an analogous field I use for interpreting paired or multiple georef coordinates (e.g., "transect" vs. "bounding box"; "polygon" vs. "linear"). This DateRangeQualifier approach ended up being too cumbersome to implement -- only because I was too lazy, and the number of cases that benefited from it (i.e., allowed computer logic to be applied, independent of a human reading of the VerbatimDateTime value) was low.
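To make the idea of a qualifier "tied to interpretation logic" concrete, here is a hypothetical Python sketch. The enum values are the options named above; the `CollectingEvent` class, the `could_have_occurred_on` function, and the exact semantics given to each qualifier are assumptions for illustration only, not a description of the actual implementation.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class DateRangeQualifier(Enum):
    CONTINUOUS_RANGE = "continuous range"
    DISCONTINUOUS_RANGE = "discontinuous range"
    UNKNOWN_POINT = "unknown point"
    EITHER_OR = "either/or"

@dataclass
class CollectingEvent:
    verbatim: str
    earliest: date
    latest: date
    qualifier: DateRangeQualifier

def could_have_occurred_on(event, day):
    """Return True if collecting could have happened on `day`.

    Assumed semantics: 'either/or' means only the two endpoint dates
    are candidates; every other qualifier is treated here as 'anywhere
    between earliest and latest, inclusive'.
    """
    if event.qualifier is DateRangeQualifier.EITHER_OR:
        return day in (event.earliest, event.latest)
    return event.earliest <= day <= event.latest

# Aerial-survey example taken from Lynn's message quoted below.
survey = CollectingEvent(
    verbatim="May 27 or June 04, 1999",
    earliest=date(1999, 5, 27),
    latest=date(1999, 6, 4),
    qualifier=DateRangeQualifier.EITHER_OR,
)
print(could_have_occurred_on(survey, date(1999, 5, 27)))  # True
print(could_have_occurred_on(survey, date(1999, 6, 1)))   # False
```

Without the qualifier, the same earliest/latest pair would be indistinguishable from a continuous range, and June 1 would match.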
The three fields VerbatimDateTime/EarliestCollectingDate/LatestCollectingDate will give you 90% of the value 90% of the time (at least for the specimen/observation data that I deal with). I wonder if DWC really needs more precision capability than that....
Rich
Richard L. Pyle, PhD Database Coordinator for Natural Sciences Department of Natural Sciences, Bishop Museum 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
-----Original Message-----
From: Taxonomic Databases Working Group List [mailto:TDWG@LISTSERV.NHM.KU.EDU] On Behalf Of Lynn Kutner
Sent: Thursday, February 23, 2006 5:26 PM
To: TDWG@LISTSERV.NHM.KU.EDU
Subject: Re: Standards for date / time values?
Hi Stan and all -
Yes, I am thinking about dates in the context of the observation / monitoring schema.
Here are some of the types of dates that I've seen in just the data from the NatureServe member programs based on either lack of available information or interesting data collection situations.
1 - partial dates (only have the year or year-month) - definitely accommodated by the ISO 8601 format
2 - vague dates - for these I suppose you'd use ISO 8601 in EarliestCollectingDate / LatestCollectingDate, VerbatimDateTime, and the CollectingDatesInterpreted flag?
    PRE-1975
    post-1992
    summer 2001
    Mid-1990s
    late 1950's
    Circa 1941
    Early 1990s
3 - a range of dates indicating that collection occurred on a specific date sometime between those two limits and you're not sure when (example: Observations / collections from pitfall traps may have date ranges associated with them rather than a single date. For instance, you can have dates of collection of 19 Dec 2004 to 23 Jan 2005 and the specimen was from some unknown specific date in that range)
4 - continuous data collection over a period of time (some sort of sensor)
5 - information collected on one of two possible dates (one or the other is correct, it's not a range) (example: aerial surveys flown on either May 27 or June 04, 1999)
It seems that EarliestCollectingDate / LatestCollectingDate could be used for #3, #4, and #5 - but is there a way to differentiate between these as actually being very different types of dates? Would those distinctions just be communicated using the VerbatimDateTime?
Of course for all of these examples the fallback would be to use the VerbatimDateTime, but it would be ideal to have as much as possible of the date information in field(s) that can be queried and analyzed.
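For the partial-date case, one possible way to make year-only or year-month values queryable is to expand them into an earliest/latest pair. The sketch below assumes the partial value is already in ISO 8601 calendar form ("YYYY", "YYYY-MM", or "YYYY-MM-DD"); the function names `iso_partial_bounds` and `overlaps` are made up for this example.

```python
import calendar
from datetime import date

def iso_partial_bounds(value):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (earliest, latest) dates."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        return date(parts[0], 1, 1), date(parts[0], 12, 31)
    if len(parts) == 2:
        year, month = parts
        last_day = calendar.monthrange(year, month)[1]  # days in that month
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        d = date(*parts)
        return d, d
    raise ValueError("unsupported partial date: %r" % value)

def overlaps(bounds, query_start, query_end):
    """True if the (earliest, latest) pair intersects the query range."""
    earliest, latest = bounds
    return earliest <= query_end and latest >= query_start

print(iso_partial_bounds("1941"))     # 1941-01-01 .. 1941-12-31
print(iso_partial_bounds("2004-02"))  # 2004-02-01 .. 2004-02-29
print(overlaps(iso_partial_bounds("2004-02"), date(2004, 2, 15), date(2004, 3, 15)))  # True
```

Vague values such as "Mid-1990s" or "Circa 1941" would still need a human or a rule set to choose the bounds; this only covers the mechanical part.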
Thank you for your thoughts and help.
Lynn
Lynn Kutner NatureServe www.natureserve.org (303) 541-0360 lynn_kutner@natureserve.org
From: Blum, Stan [mailto:sblum@calacademy.org]
Sent: Thu 2/23/2006 6:41 PM
To: Lynn Kutner; TDWG@LISTSERV.NHM.KU.EDU
Subject: RE: Standards for date / time values?
Hi Lynn,
Context is everything, so I'm going to assume you are talking about date/time representations in the observation and monitoring schema(s) that you are developing. The thinking in the collections community has gone something along the lines of this:
Concept: the date-time of a collecting event [= recording-event, gathering-event] -- i.e., when one or more organisms were collected or observed -- expressed in the common Gregorian calendar, in local time.
Requirements:
1 - express the date/time exactly as it was recorded
2 - retrieve records by date/time ranges (date-time greater than X and/or less than Y)
3 - accommodate date-times of varying precision, including explicit date-time ranges (specifying a duration)
4 - support seasonal (time of year) queries
These requirements can be met by the following fields:
VerbatimDateTime
EarliestDateCollected
LatestDateCollected
CollectingDatesInterpreted
DayOfYear
As John Wieczorek noted, the ISO 8601 format accommodates varying degrees of precision. Using a "verbatim date-time" [text string] is as good as you can do to satisfy the first requirement, short of scanning field notes or voice recordings. The earliest and latest dates support the recording of explicit ranges as well as interpreted ranges, which could be used to make "Summer of 1952" retrievable. The CollectingDatesInterpreted field would be a boolean field, set as true when the earliest and latest dates represent interpretations rather than an explicit range. The "DayOfYear" field is a compromise that we've offered as a simple way to support queries involving seasonal (annual) cycles; e.g., collected in the northern summer, regardless of year. But it can be argued that day of year is derivable from the other fields (unless year is unknown), and that it doesn't accommodate explicit ranges.
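As a sketch of how DayOfYear could be derived and used for a seasonal query, consider the following Python fragment. The function names and the day-number limits chosen for "northern summer" are assumptions for illustration; the thread does not define season boundaries.

```python
from datetime import date

def day_of_year(d):
    """Ordinal day within the year (1 = 1 January)."""
    return d.timetuple().tm_yday

def in_season(doy, start_doy, end_doy):
    """True if `doy` falls in the season, allowing a wrap across year end."""
    if start_doy <= end_doy:
        return start_doy <= doy <= end_doy
    return doy >= start_doy or doy <= end_doy

# Assumed limits: roughly 21 June to 22 September in a non-leap year.
NORTHERN_SUMMER = (172, 265)

doy = day_of_year(date(1952, 7, 4))
print(doy)                               # 186
print(in_season(doy, *NORTHERN_SUMMER))  # True
print(in_season(12, 355, 79))            # True: a season that spans the new year
```

This works when the year is known; when only the day and month survive, DayOfYear would have to be populated directly, which is the case where it is not derivable from the other fields.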
A bit more documentation is needed to address the odd cases (What do I do when ...?), but these five fields will support a lot of the data exchange needed for this concept. These fields are not intended to handle the dating of localities in paleontology, nor are they intended to handle named periods that are used in cultural collections (e.g., Iron Age, Victorian).
ABCD uses a few more fields to handle the concept (http://ww3.bgbm.org/abcddocs/AbcdConcepts), but some of these (date, time, and time zone) are handled by the ISO format, except when the larger units are unknown; e.g., the year is unknown, but the day is [ June 16 ]; or the date is unknown, but the time is [ 14:00-15:30 ].
AbcdConcept0860 /DataSets/DataSet/Units/Unit/Gathering/DateTime
AbcdConcept0861 /DataSets/DataSet/Units/Unit/Gathering/DateTime/DateText
AbcdConcept0862 /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeZone
AbcdConcept0863 /DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeBegin
AbcdConcept0864 /DataSets/DataSet/Units/Unit/Gathering/DateTime/DayNumberBegin
AbcdConcept0865 /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeOfDayBegin
AbcdConcept0866 /DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeEnd
AbcdConcept0867 /DataSets/DataSet/Units/Unit/Gathering/DateTime/DayNumberEnd
AbcdConcept0868 /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeOfDayEnd
AbcdConcept0869 /DataSets/DataSet/Units/Unit/Gathering/DateTime/PeriodExplicit
The ABCD fields pretty much cover the entire concept space. One could argue whether time-zone is relevant to the description of biological phenomena, but we do know that ship trawls do cross time-zones (including the date-line), and that daylight saving time could stretch or compress some nocturnal collecting events if their durations were calculated too simply.
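The daylight saving point can be shown with a small Python sketch. The scenario is hypothetical: a night collecting event in a US Pacific locality spanning the spring-forward change of 2 April 2006, with fixed UTC offsets (-8 before, -7 after) written out by hand as an assumption rather than looked up from a time-zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for the hypothetical locality.
PST = timezone(timedelta(hours=-8))  # standard time, before the change
PDT = timezone(timedelta(hours=-7))  # daylight time, after the change

# Local clock readings as they might appear in field notes.
start_local = datetime(2006, 4, 1, 22, 0)
end_local = datetime(2006, 4, 2, 6, 0)

# "Too simple": subtract the wall-clock readings directly.
naive_duration = end_local - start_local

# Offset-aware: attach the offset in force at each reading, then subtract.
aware_duration = end_local.replace(tzinfo=PDT) - start_local.replace(tzinfo=PST)

print(naive_duration)  # 8:00:00
print(aware_duration)  # 7:00:00
```

The one-hour difference is the "compression" of the event; an event spanning the autumn change would be stretched by the same amount.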
To some extent these arguments are still going on, so analyze your requirements and your data, then state your position. ;-)
Cheers,
-Stan Stanley D. Blum, Ph.D. Research Information Manager California Academy of Sciences 875 Howard St. San Francisco, CA +1 (415) 321-8183
-----Original Message-----
From: Taxonomic Databases Working Group List [mailto:TDWG@LISTSERV.NHM.KU.EDU] On Behalf Of Lynn Kutner
Sent: Thursday, February 23, 2006 9:12 AM
To: TDWG@LISTSERV.NHM.KU.EDU
Subject: Standards for date / time values?
Hi - I'm working with a suite of date attributes that can include a combination of precise dates, imprecise dates, and ranges of dates (and the same types of time values). We'd like to follow existing standards. If this sort of date / time standard exists, I'd appreciate leads to the appropriate resources. Thank you for your help - Lynn
Lynn Kutner Data Management Coordinator NatureServe Email: lynn_kutner@natureserve.org Phone: (303) 541-0360 www.natureserve.org
participants (1)
-
John R. WIECZOREK