Standards for date / time values?

Richard Pyle deepreef at BISHOPMUSEUM.ORG
Fri Feb 24 00:00:11 CET 2006


RE: CollectingDatesInterpreted
When would this field *not* be set to true?  In other words, what is the
threshold for "interpreted", in terms of populating EarliestCollectingDate
and LatestCollectingDate?

At one extreme, we have a VerbatimDateTime of "12 January 2005, 13:34:15
local time".  Not a lot of interpretation needed there (other than
conversion of original documentation format to ASCII/Unicode, and
interpretation of the time zone from the locality data).  But conversion of
"12 January 2005" to

EarliestCollectingDate: "2005-01-12 00:00:00"
LatestCollectingDate: "2005-01-12 23:59:59"

...requires a bit of interpretation. Maybe the time portion could simply be
omitted; but even so, there is technically *some* interpretation when
converting "January" to "01".
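
A minimal sketch of that conversion in Python, assuming the verbatim value is
a plain English "day month year" string. The function name day_bounds is
hypothetical and not part of any standard; the point is only that even this
simple case involves choices (month-name parsing, whole-day bounds).

```python
from datetime import datetime, time


def day_bounds(verbatim: str) -> tuple[str, str]:
    """Interpret a 'day month year' string as a whole-day range."""
    # %B parses the full month name ("January" -> 1) in the default C locale.
    day = datetime.strptime(verbatim, "%d %B %Y").date()
    # Assumption: a date with no time means the whole calendar day.
    earliest = datetime.combine(day, time(0, 0, 0))
    latest = datetime.combine(day, time(23, 59, 59))
    return earliest.isoformat(sep=" "), latest.isoformat(sep=" ")


print(day_bounds("12 January 2005"))
```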

And the spectrum continues on from there up until, well, stuff that would be
unambiguously flagged as "interpreted".

My point is, there should be some sort of guideline as to when to set
CollectingDatesInterpreted true; i.e., what threshold of interpretation
triggers that field. If such a guideline already exists, then I apologize
for missing it.

Personally, I don't think CollectingDatesInterpreted is necessary, really,
because I have always regarded my equivalents of EarliestCollectingDate and
LatestCollectingDate as interpreted.

RE: Lynn's examples below.
I wrestled with these issues (and more) a number of years ago.  Other
examples include "Summer; year unknown" (important for seasonal species),
and multiple (non-contiguous) date ranges: "May or August, 2004" -- among
others.

In addition to the equivalents of VerbatimDateTime, EarliestCollectingDate,
and LatestCollectingDate, I had a controlled-vocabulary field called
"DateRangeQualifier", which had an enumerated list of options tied to
interpretation logic (e.g., "continuous range", "discontinuous range",
"unknown point", "either/or", etc.; as well as sub-range qualifiers like
"Summer"). I modelled it after an analogous field I use for interpreting
paired or multiple georef coordinates (e.g., "transect" vs. "bounding box";
"polygon" vs. "linear"). This DateRangeQualifier approach ended up being too
cumbersome to implement -- only because I was too lazy, and the number of
cases that benefited from it (i.e., allowed computer logic to be applied,
independent of a human reading of the VerbatimDateTime value) was low.
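
To illustrate the idea, here is a rough sketch of how such a qualifier might
drive computer logic. The enumeration values come from the list above, but the
interpretation rules and the function could_have_occurred_on are assumptions
made for this example, not a description of the original implementation.

```python
from datetime import date
from enum import Enum


class DateRangeQualifier(Enum):
    CONTINUOUS_RANGE = "continuous range"
    DISCONTINUOUS_RANGE = "discontinuous range"
    UNKNOWN_POINT = "unknown point"
    EITHER_OR = "either/or"


def could_have_occurred_on(day: date, earliest: date, latest: date,
                           qualifier: DateRangeQualifier) -> bool:
    """Return True if `day` is consistent with the recorded dates."""
    if qualifier is DateRangeQualifier.EITHER_OR:
        # Assumed rule: only the two end dates are candidates.
        return day in (earliest, latest)
    # Assumed rule for every other qualifier: the bounds are a necessary
    # condition. A discontinuous range would need the verbatim text (or
    # extra fields) to exclude the gaps between its sub-ranges.
    return earliest <= day <= latest


# "Aerial surveys flown on either May 27 or June 04, 1999"
first, second = date(1999, 5, 27), date(1999, 6, 4)
print(could_have_occurred_on(date(1999, 6, 1), first, second,
                             DateRangeQualifier.EITHER_OR))
print(could_have_occurred_on(date(1999, 6, 1), first, second,
                             DateRangeQualifier.UNKNOWN_POINT))
```

The same pair of dates gives different answers depending on the qualifier,
which is exactly the information that the earliest/latest fields alone cannot
carry.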

The three fields
VerbatimDateTime/EarliestCollectingDate/LatestCollectingDate will give you
90% of the value 90% of the time (at least for the specimen/observation data
that I deal with). I wonder if DWC really needs more precision capability
than that....

Rich

Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html



-----Original Message-----
From: Taxonomic Databases Working Group List
[mailto:TDWG at LISTSERV.NHM.KU.EDU] On Behalf Of Lynn Kutner
Sent: Thursday, February 23, 2006 5:26 PM
To: TDWG at LISTSERV.NHM.KU.EDU
Subject: Re: Standards for date / time values?


Hi Stan and all -

Yes, I am thinking about dates in the context of the observation /
monitoring schema.

Here are some of the types of dates that I've seen in just the data from the
NatureServe member programs based on either lack of available information or
interesting data collection situations.

1) partial dates (only have the year or year-month) - definitely
accommodated by the ISO 8601 format

2) vague dates - for these I suppose you'd use ISO 8601 in
EarliestCollectingDate / LatestCollectingDate, VerbatimDateTime, and the
CollectingDatesInterpreted flag ?

PRE-1975
post-1992
summer 2001
Mid-1990s
late 1950's
Circa 1941
Early 1990s

3) a range of dates indicating that collection occurred on a specific date
sometime between those two limits and you're not sure when (example:
Observations / collections from pitfall traps may have date ranges
associated with them rather than a single date. For instance, you can have
dates of collection of 19 Dec 2004 to 23 Jan 2005 and the specimen was from
some unknown specific date in that range),

4) continuous data collection over a period of time (some sort of sensor)

5) information collected on one of two possible dates (one or the other is
correct, it's not a range) (example: aerial surveys flown on either May 27
or June 04, 1999).
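
For the partial dates in item 1, a small sketch of how a reduced-precision
ISO 8601 calendar date could be expanded into earliest/latest bounds so that
it remains queryable. The function name expand_partial is hypothetical, and
only the "YYYY", "YYYY-MM" and "YYYY-MM-DD" forms are assumed here.

```python
import calendar
from datetime import date


def expand_partial(iso: str) -> tuple[date, date]:
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' to (earliest, latest)."""
    parts = [int(p) for p in iso.split("-")]
    if len(parts) == 1:
        (year,) = parts
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        day = date(*parts)
        return day, day
    raise ValueError(f"unsupported date form: {iso!r}")


print(expand_partial("2004-02"))
print(expand_partial("1999"))
```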

It seems that EarliestCollectingDate / LatestCollectingDate could be used
for #3, #4, and #5 - but is there a way to differentiate between these as
actually being very different types of dates? Would those distinctions just
be communicated using the VerbatimDateTime?

Of course for all of these examples the fallback would be to use the
VerbatimDateTime, but it would be ideal to have as much as possible of the
date information in field(s) that can be queried and analyzed.

Thank you for your thoughts and help.

Lynn



Lynn Kutner
NatureServe
www.natureserve.org
(303) 541-0360
lynn_kutner at natureserve.org



From: Blum, Stan [mailto:sblum at calacademy.org]
Sent: Thu 2/23/2006 6:41 PM
To: Lynn Kutner; TDWG at LISTSERV.NHM.KU.EDU
Subject: RE: Standards for date / time values?


Hi Lynn,

Context is everything, so I'm going to assume you are talking about
date/time representations in the observation and monitoring schema(s) that
you are developing.  The thinking in the collections community has gone
something along the lines of this:

Concept:  the date-time of a collecting event [= recording-event,
gathering-event] -- i.e., when one or more organisms were collected or
observed -- expressed in the common Gregorian calendar, in local time.

Requirements:

1 - express the date/time exactly as it was recorded
2 - retrieve records by date/time ranges (date-time greater than X and/or
less than Y)
3 - accommodate date-times of varying precision, including explicit
date-time ranges (specifying a duration)
4 - support seasonal (time of year) queries

These requirements can be met by the following fields:

VerbatimDateTime
EarliestDateCollected
LatestDateCollected
CollectingDatesInterpreted
DayOfYear
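
As a rough illustration of requirement 2, the sketch below models these five
fields as a Python record and retrieves events whose earliest/latest range
overlaps a query window. The class name, the attribute spellings, the
overlapping helper, and the June-to-August reading of "Summer of 1952" are all
assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CollectingEvent:
    verbatim_date_time: str
    earliest_date_collected: date
    latest_date_collected: date
    collecting_dates_interpreted: bool
    day_of_year: Optional[int] = None  # left empty for ranges


def overlapping(events, start: date, end: date):
    """Return events whose [earliest, latest] range intersects [start, end]."""
    return [e for e in events
            if e.earliest_date_collected <= end
            and e.latest_date_collected >= start]


events = [
    # Interpreted range: the bounds are a reading of the verbatim text.
    CollectingEvent("Summer of 1952", date(1952, 6, 1), date(1952, 8, 31), True),
    # Explicit range: the bounds were recorded as such.
    CollectingEvent("19 Dec 2004 to 23 Jan 2005",
                    date(2004, 12, 19), date(2005, 1, 23), False),
]

for hit in overlapping(events, date(2005, 1, 1), date(2005, 1, 31)):
    print(hit.verbatim_date_time)
```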

As John Wieczorek noted, the ISO 8601 format accommodates varying degrees of
precision.  Using a "verbatim date-time" [text string] is as good as you can
do to satisfy the first requirement, short of scanning field notes or voice
recordings.  The earliest and latest dates support the recording of explicit
ranges as well as interpreted ranges, which could be used to make "Summer of
1952" retrievable.  The CollectingDatesInterpreted field would be a boolean
field, set as true when the earliest and latest dates represent
interpretations rather than an explicit range.  The "DayOfYear" field is a
compromise that we've offered as a simple way to support queries involving
seasonal (annual) cycles; e.g., collected in the northern summer, regardless
of year.  But it can be argued that day of year is derivable from the other
fields (unless year is unknown), and that it doesn't accommodate explicit
ranges.
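
A short sketch of how a day-of-year value can be derived from a full date and
used for a seasonal query. The helper names and the day-number window chosen
for "northern summer" are illustrative assumptions; note also that day numbers
after February shift by one in leap years, so any fixed window is approximate.

```python
from datetime import date


def day_of_year(d: date) -> int:
    """Ordinal day within the year (1 January = 1)."""
    return d.timetuple().tm_yday


def in_season(d: date, first_day: int, last_day: int) -> bool:
    """True if d falls in the day-of-year window, regardless of year."""
    n = day_of_year(d)
    if first_day <= last_day:
        return first_day <= n <= last_day
    # Window wraps around the new year (e.g. a southern summer).
    return n >= first_day or n <= last_day


print(day_of_year(date(2005, 1, 12)))
# Assumed window: roughly 1 June to 31 August in a non-leap year.
print(in_season(date(1952, 7, 4), 152, 243))
```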

A bit more documentation is needed to address the odd cases (What do I do
when ...?), but these five fields will support a lot of the data exchange
needed for this concept.  These fields are not intended to handle the dating
of localities in paleontology, nor are they intended to handle named periods
that are used in cultural collections (e.g., Iron Age, Victorian).

ABCD uses a few more fields to handle the concept,
http://ww3.bgbm.org/abcddocs/AbcdConcepts  but some of these (date, time,
and time zone) are handled by the ISO format, except when the larger units
are unknown; e.g., the year is unknown, but the day is [ June 16 ]; or the
date is unknown, but the time is [ 14:00-15:30 ].

AbcdConcept0860  /DataSets/DataSet/Units/Unit/Gathering/DateTime
AbcdConcept0861  /DataSets/DataSet/Units/Unit/Gathering/DateTime/DateText
AbcdConcept0862  /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeZone
AbcdConcept0863  /DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeBegin
AbcdConcept0864  /DataSets/DataSet/Units/Unit/Gathering/DateTime/DayNumberBegin
AbcdConcept0865  /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeOfDayBegin
AbcdConcept0866  /DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeEnd
AbcdConcept0867  /DataSets/DataSet/Units/Unit/Gathering/DateTime/DayNumberEnd
AbcdConcept0868  /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeOfDayEnd
AbcdConcept0869  /DataSets/DataSet/Units/Unit/Gathering/DateTime/PeriodExplicit

The ABCD fields pretty much cover the entire concept space.  One could argue
whether time-zone is relevant to the description of biological phenomena,
but we do know that ship trawls do cross time-zones (including the
date-line), and that daylight saving time could stretch or compress some
nocturnal collecting events if their durations were calculated too simply.
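
A small sketch of the daylight-saving effect, assuming a nocturnal event on
the US Pacific coast across the spring transition of 3 April 2005. The event
times are invented for illustration, and fixed UTC offsets (as an ISO 8601
date-time can carry) stand in for a full time-zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Pacific standard and daylight time.
PST = timezone(timedelta(hours=-8))
PDT = timezone(timedelta(hours=-7))

# Trap opened at 22:00 on 2 April (standard time), closed at 06:00 on
# 3 April (daylight time, after clocks moved forward at 02:00).
start = datetime(2005, 4, 2, 22, 0, tzinfo=PST)
end = datetime(2005, 4, 3, 6, 0, tzinfo=PDT)

# "Too simple": subtract the local wall-clock readings, ignoring offsets.
wall_clock = end.replace(tzinfo=None) - start.replace(tzinfo=None)

# Offset-aware subtraction compares the two instants on a common time line.
elapsed = end - start

print(wall_clock)
print(elapsed)
```

Here the wall-clock difference overstates the real duration by one hour; at
the autumn transition the error runs the other way.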

To some extent these arguments are still going on, so analyze your
requirements and your data, then state your position.  ;-)

Cheers,

-Stan
Stanley D. Blum, Ph.D.
Research Information Manager
California Academy of Sciences
875 Howard St.
San Francisco,  CA
+1 (415) 321-8183


-----Original Message-----
From: Taxonomic Databases Working Group List
[mailto:TDWG at LISTSERV.NHM.KU.EDU] On Behalf Of Lynn Kutner
Sent: Thursday, February 23, 2006 9:12 AM
To: TDWG at LISTSERV.NHM.KU.EDU
Subject: Standards for date / time values?


Hi -
I'm working with a suite of date attributes that can include a combination
of precise dates, imprecise dates, and ranges of dates (and the same types
of time values). We'd like to follow existing standards. If this sort of
date / time standard exists, I'd appreciate leads to the appropriate
resources.
Thank you for your help -
Lynn


Lynn Kutner
Data Management Coordinator
NatureServe
Email:    lynn_kutner at natureserve.org
Phone:   (303) 541-0360
www.natureserve.org



