I'm not sure if it's of help, but in our reference collection database we store start and end dates for collections as separate day, month, and year fields. We then concatenate these fields to give a sensible-looking date. Sorting is a little more complicated and must be nested.
This approach allows us to accurately reflect the detail on many old labels which read simply "June 1927".
--------------------------------------------------
Rob Emery (Mr) (Senior Entomologist)
ph: 61 8 9368 3247 Fax: 61 8 9368 3223 Mob: 0404 819 601
mailto:remery@agric.wa.gov.au http://www.agric.wa.gov.au/ento Department of Agriculture Western Australia
3 Baron-Hay Court, South Perth, 6151, Western Australia
-----Original Message----- From: Automatic digest processor [mailto:LISTSERV@LISTSERV.NHM.KU.EDU] Sent: Wednesday, 1 March 2006 2:00 PM To: Recipients of TDWG digests Subject: TDWG Digest - 27 Feb 2006 to 28 Feb 2006 (#2006-7)
There are 5 messages totalling 1039 lines in this issue.
Topics of the day:
1. Dates and Times (3)
2. Standards for date / time values? (2)
----------------------------------------------------------------------
Date: Mon, 27 Feb 2006 23:12:02 -0800 From: list.tdwg@ACHAPMAN.ORG Subject: Dates and Times
This is an interesting discussion, and I am wondering why we treat dates/times differently to localities where we may have "between two places", "near a place", "along a path", etc.
At this stage, we tend to recommend using a single location (lat/long) with an uncertainty surrounding that location. With location, we are currently using a circle to determine the uncertainty in a locality, whereas time is linear (and thus easier). We may change to the use of polygons and uncertainty footprints at a later stage but let's leave that discussion for another time.
Would not the better way to handle some of these date/time situations be to use a midpoint with an uncertainty? Thus for "01-2000" we should not interpret that as 01-01-2000 but as 15-01-2000 with an uncertainty of 15 (or 16) days, which can be refined with the incorporation of time. This is, of course, not all that dissimilar to what Michael is suggesting.
On a related issue - if we have two dates - the dates may be "accurately" determined - but the actual time of occurrence is uncertain. I thus prefer the use of the term uncertainty - and possibly better (as we will most likely use in the Georeferencing Best Practices document under preparation) - "Maximum Uncertainty" or perhaps "Maximum uncertainty estimate"
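Arthur's midpoint-plus-uncertainty reading of a partial date like "01-2000" could be sketched as follows (an illustrative sketch only; the function name and the choice of how to round the midpoint and radius are assumptions, not part of any proposal):

```python
from datetime import date, timedelta

def midpoint_with_uncertainty(year, month):
    """Interpret a year-month such as "01-2000" as a midpoint date plus
    an uncertainty radius in days (hypothetical helper for illustration)."""
    start = date(year, month, 1)
    # first day of the following month, handling the December wrap-around
    end = date(year + month // 12, month % 12 + 1, 1)
    span_days = (end - start).days
    midpoint = start + timedelta(days=span_days // 2)
    uncertainty_days = span_days - span_days // 2  # radius covering the span
    return midpoint, uncertainty_days

mid, unc = midpoint_with_uncertainty(2000, 1)  # January 2000
```

For "01-2000" this yields a midpoint of 2000-01-16 with a 16-day radius, close to Arthur's "15-01-2000 with an uncertainty of 15 (or 16) days".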
Perhaps there are good reasons for using start and end dates/times - if so I am happy to go along with that - but believe we should aim for consistency of approach if possible.
Cheers
Arthur
Arthur D. Chapman Australian Biodiversity Information Services Toowoomba, Australia
From Automatic digest processor LISTSERV@LISTSERV.NHM.KU.EDU on 27 Feb 2006:
There are 3 messages totalling 562 lines in this issue.
Topics of the day:
1. Standards for date / time values? (3)
Date: Mon, 27 Feb 2006 12:27:23 +0100 From: Hannu Saarenmaa hsaarenmaa@GBIF.ORG Subject: Re: Standards for date / time values?
In my understanding the "Not interpreted" case would be just a text string "DateTimeSourceText", repeating whatever is written on the label about the dates and times. This can be useful if questions arise about the interpretation.
Btw., isn't the name of the element "EarliestDateCollected" in the Darwin Core v.1.4 proposal a bit misleading, as it really is "EarliestDateTimeCollected"?
Hannu
Richard Pyle wrote:
RE: CollectingDatesInterpreted When would this field *not* be set to true?
Date: Mon, 27 Feb 2006 10:58:54 -0500 From: Michael Lee mikelee@EMAIL.UNC.EDU Subject: Re: Standards for date / time values?
Hello all,
It seems to me that the VerbatimDate/EarliestDate/LatestDate are helpful, but combine two separate issues:
1) duration of the event spanning more than one date/time unit
2) error associated with the date.
Is the distinction useful? Probably depends on your application of the date.
To handle these, VegBank uses three fields:
startDateTime
stopDateTime
dateTimeAccuracy

(note that there is NO verbatimDate)
StartDateTime is the most precise known date/time for the beginning of an event (in our case an observation of a vegetation plot) and the stopDateTime is the end date/time (which could have the same value). DateTimeAccuracy is a string with a list of values ranging from 1 second to 1 day to 1 year to 5 or 10 years, etc.
start and stop deal with issue #1 above, duration (potentially) exceeding usual date description precision.
The accuracy field deals with issue #2 above, error or uncertainty regarding the dates. It is this accuracy field that seems to be missing from this discussion thus far.
As far as time zones go, error on the order of hours isn't TOO important to us, so we don't worry about it too much. But I think what we're aiming to do is to store the date/time according to UTC. This means that to reconstruct what time the person saw something, the original time zone and Daylight Savings Time would need to be known, but this isn't crucial to us, either.
I like Lynn's list of weird dates, so I'll throw in what we'd do with them (this of course being "interpreted" but as Rich says, that's almost always the case):
PRE-1975: Start: null; End: 01-JAN-1975; Accuracy: null or possibly a large value
post-1992: Start: 01-JAN-1992; End: null; Accuracy: null or possibly a large value
summer 2001: Start: 15-JUL-2001; End: 15-JUL-2001; Accuracy: 1.5 months
Mid-1990s: Start: 01-JUL-1995; End: 01-JUL-1995; Accuracy: 2 years
late 1950's: Start: 01-JAN-1958; End: 01-JAN-1958; Accuracy: 2.5 years
Circa 1941: Start: 01-JUL-1941; End: 01-JUL-1941; Accuracy: 6 months
Early 1990s: Start: 01-JAN-1992; End: 01-JAN-1992; Accuracy: 2 years
Except for the "pre-Date" and "post-Date" I have assumed a relatively short duration. But this is interpreted. The problems with this approach I see are:
1) accuracy for start and stop dates could be different
2) accuracy is not a black and white idea. It's often more of a gradient. Seems illogical to specify 10 years as accuracy in the first 2 examples, as the dates could be much more than 10 years away.
3) a closed list on accuracy seems to force you into decisions you don't
=== message truncated ===
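Michael's start/stop/accuracy treatment of Lynn's vague dates could be captured as a small lookup table (a sketch only: the dict name is hypothetical, the dates are his examples rewritten as ISO 8601 strings, and the accuracy values stay as free text exactly as he gave them):

```python
# VegBank-style interpretation of vague dates, mirroring Michael's
# examples above. Illustrative only; not VegBank's actual schema.
# None stands in for his "null" values.
vague_date_examples = {
    "PRE-1975":    {"start": None,         "stop": "1975-01-01", "accuracy": None},
    "post-1992":   {"start": "1992-01-01", "stop": None,         "accuracy": None},
    "summer 2001": {"start": "2001-07-15", "stop": "2001-07-15", "accuracy": "1.5 months"},
    "Mid-1990s":   {"start": "1995-07-01", "stop": "1995-07-01", "accuracy": "2 years"},
    "late 1950's": {"start": "1958-01-01", "stop": "1958-01-01", "accuracy": "2.5 years"},
    "Circa 1941":  {"start": "1941-07-01", "stop": "1941-07-01", "accuracy": "6 months"},
    "Early 1990s": {"start": "1992-01-01", "stop": "1992-01-01", "accuracy": "2 years"},
}
```

Note how his first problem shows up immediately: a single accuracy value per record cannot differ between the start and stop ends.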
------------------------------
Date: Mon, 27 Feb 2006 22:17:49 -1000 From: Richard Pyle deepreef@BISHOPMUSEUM.ORG Subject: Re: Standards for date / time values?
Hannu wrote:
In my understanding the "Not interpreted" case would be just a text string "DateTimeSourceText", repeating whatever is written in the label about the dates and times. This can be useful if questions arise of the interpretation.
I had assumed that the CollectingDatesInterpreted value referred to the Earliest/Latest values; not the Verbatim value. I would tend to equate null/empty values for the Earliest/Latest fields to be equivalent to "Not Interpreted"; and assume that all non-null values in these two fields to be, by definition, interpreted.
Michael Lee wrote:
It seems to me that the VerbatimDate/EarliestDate/LatestDate are helpful, but combine two separate issues:
1) duration of the event spanning more than one date/time unit
2) error associated with the date.
These are two of several possible meanings of two date values that I tried to qualify with my "DateRangeQualifier" attribute.
There are actually a number of possible meanings/interpretations for a pair of date/time values, but in most cases, one could interpret a date range as either of the following:
- Event spanned entire range
- Event occurred at a single point in time somewhere within this range

(Same as what you said, but stated in a different way.)
However, I prefer to view the Earliest/Latest pair of values to have the specific meaning of "collection event did not likely begin prior to [Earliest], and did not likely end after [Latest]". In other words, I view it as always meaning, with some degree of certainty, "collection event took place within this window of time", independently of the duration of the event itself. If "Duration" of a sampling event is important (e.g., plankton tow), I think it might best be documented secondarily, elsewhere. It could be handled with something like dateTimeAccuracy; but it seems to me that you would then have two subjective values (the scale of dateTimeAccuracy seems like it's subjectively defined?)
Arthur wrote:
This is an interesting discussion, and I am wondering why we treat dates/times differently to localities where we may have "between two places", "near a place", "along a path", etc.
I think there are many parallels, and you could manage Date values analogously to the MaNIS protocol for establishing georef coordinates with max error. But my feeling is that this would be overkill except in very specific use-cases involving seasonally critical datasets, or organisms with large-scale temporal cycles. I think in most use-cases, Verbatim/Earliest/Latest provides most of the information that most of the people are interested in. Which leads me to...
Roger wrote:
Who is using data scored like this - actually searching on and analyzing data that was scored to 'summer', 'autumn', 'early 19th century'? This thread seems to be a great deal about how to capture data from various historic sources, but what are the use cases for exploitation of it?
Use cases I deal with are things like establishing first/last known occurrences of taxon X at locality Y; tracking down patterns of (long-term) periodicity in taxon occurrence, and establishing patterns of seasonality (among others). The reason I feel that the Verbatim/InterpretedEarliest/InterpretedLatest set covers me is that the first one really represents the *actual* [meta]data, whereas the other two are useful for getting a computer to sort & filter chronologically.
Aloha, Rich
------------------------------
Date: Mon, 27 Feb 2006 22:25:08 -1000 From: Richard Pyle deepreef@BISHOPMUSEUM.ORG Subject: Re: Standards for date / time values?
This is why I define the two "interpreted" date/time values as I described in my previous note. Something like "Apr. 75" gets recorded exactly as such in the Verbatim field, and the Earliest/Latest interpreted values would be 1975-04-01 and 1975-04-30 (in whatever standard format is preferred). Time seems to me to simply be higher-resolution precision of the same information (in the same way that a day is higher resolution than a month).
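Rich's expansion of a verbatim value like "Apr. 75" into an interpreted Earliest/Latest pair could be sketched like this (assuming the verbatim string has already been parsed into numeric parts; the function name is illustrative, not a Darwin Core term):

```python
from calendar import monthrange
from datetime import date

def earliest_latest(year, month=None, day=None):
    """Expand a partial date into an interpreted (earliest, latest) pair,
    in the spirit of Rich's description. Hypothetical helper."""
    if month is None:
        # year only, e.g. "1975"
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # year-month, e.g. "Apr. 75" -> 1975-04-01 .. 1975-04-30
        return date(year, month, 1), date(year, month, monthrange(year, month)[1])
    # full date: earliest and latest coincide (time could refine further)
    return date(year, month, day), date(year, month, day)

earliest, latest = earliest_latest(1975, 4)  # "Apr. 75"
```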
The problem comes when you want to record "Summer, unknown year" and "18:30-19:00, unknown date". The season and time of day might be important, even if lower-precision data are not available (which year the season, or which date the time range). But my hunch is that these fall outside the 90% value/90% cases -- at least in my experience.
Aloha, Rich
-----Original Message----- From: Taxonomic Databases Working Group List [mailto:TDWG@LISTSERV.NHM.KU.EDU]On Behalf Of Patricia Koleff Osorio Sent: Friday, February 24, 2006 6:02 AM To: TDWG@LISTSERV.NHM.KU.EDU Subject: Re: Standards for date / time values?
We should be clear on the format for Earliest and Latest CollectionDate, meaning whether to store data as DDMMYYYY. There are some old specimens missing the day or even the month (some only have years, and sometimes not even the complete year, only something like 79). I think that the time (hour) should be another field, noting that it must be referred to the locality. Regards, Patricia Koleff
At 12:00 a.m. 24/02/2006 -1000, Richard Pyle wrote:
RE: CollectingDatesInterpreted When would this field *not* be set to true? In other words, what is the threshold for "interpreted", in terms of populating EarliestCollectingDate and LatestCollectingDate?
At one extreme, we have a VerbatimDateTime of "12 January 2005, 13:34:15 local time". Not a lot of interpretation needed there (other than conversion of original documentation format to ASCII/Unicode, and interpretation of the time zone from the locality data). But conversion of "12 January 2005" to

EarliestCollectingDate: "2005-01-12 00:00:00"
LatestCollectingDate: "2005-01-12 23:59:59"

...requires a bit of interpretation. Maybe the time portion could simply be omitted; but even still, there is technically *some* interpretation when converting "January" to "01".

And the spectrum continues on from there up until, well, stuff that would be unambiguously flagged as "interpreted".
My point is, there should be some sort of guideline as to when to set CollectingDatesInterpreted true; i.e., what threshold of interpretation triggers that field. If such a guideline already exists, then I apologize for missing it.
Personally, I don't think CollectingDatesInterpreted is necessary, really, because I have always regarded my equivalents of EarliestCollectingDate and LatestCollectingDate as interpreted.
RE: Lynn's examples below. I wrestled with these issues (and more) a number of years ago. Other examples include "Summer; year unknown" (important for seasonal species), and multiple (non-contiguous) date ranges: "May or August, 2004" -- among others.
In addition to the equivalents of VerbatimDateTime, EarliestCollectingDate, LatestCollectingDate; I had a controlled-vocabulary field called "DateRangeQualifier", which had an enumerated list of options tied to interpretation logic (e.g., "continuous range", "discontinuous range", "unknown point", "either/or", etc.; as well as sub-range qualifiers like "Summer"). I modelled it after an analogous field I use for interpreting paired or multiple georef coordinates (e.g., "transect" vs. "bounding box"; "polygon" vs. "linear"). This DateRangeQualifier approach ended up being too cumbersome to implement -- only because I was too lazy, and the number of cases that benefited from it (i.e., allowed computer logic to be applied, independent of a human reading of the VerbatimDateTime value) was low.
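Rich's DateRangeQualifier controlled vocabulary could be sketched as an enumeration (only the values he names above are included; the rest of his list is unknown, so this is illustrative, not his actual implementation):

```python
from enum import Enum

class DateRangeQualifier(Enum):
    """Controlled vocabulary for interpreting a pair of date values,
    after Rich's description. Values beyond those he names are omitted."""
    CONTINUOUS_RANGE = "continuous range"       # event spanned the whole range
    DISCONTINUOUS_RANGE = "discontinuous range" # e.g. "May or August, 2004"
    UNKNOWN_POINT = "unknown point"             # single event somewhere in range
    EITHER_OR = "either/or"                     # one of two specific dates
    SUMMER = "Summer"                           # sub-range qualifier
```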
The three fields VerbatimDateTime/EarliestCollectingDate/LatestCollectingDate will give you 90% of the value 90% of the time (at least for the specimen/observation data that I deal with). I wonder if DWC really needs more precision capability than that....
Rich
Richard L. Pyle, PhD Database Coordinator for Natural Sciences Department of Natural Sciences, Bishop Museum 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
-----Original Message----- From: Taxonomic Databases Working Group List [mailto:TDWG@LISTSERV.NHM.KU.EDU]On Behalf Of Lynn Kutner Sent: Thursday, February 23, 2006 5:26 PM To: TDWG@LISTSERV.NHM.KU.EDU Subject: Re: Standards for date / time values?
Hi Stan and all -
Yes, I am thinking about dates in the context of the observation / monitoring schema.
Here are some of the types of dates that I've seen in just the data from the NatureServe member programs, based on either lack of available information or interesting data collection situations.
1) partial dates (only have the year or year-month) - definitely accommodated by the ISO 8601 format

2) vague dates - for these I suppose you'd use ISO 8601 in EarliestCollectingDate / LatestCollectingDate, VerbatimDateTime, and the CollectingDatesInterpreted flag?

PRE-1975
post-1992
summer 2001
Mid-1990s
late 1950's
Circa 1941
Early 1990s

3) a range of dates indicating that collection occurred on a specific date sometime between those two limits and you're not sure when (example: observations / collections from pitfall traps may have date ranges associated with them rather than a single date. For instance, you can have dates of collection of 19 Dec 2004 to 23 Jan 2005 and the specimen was from some unknown specific date in that range)

4) continuous data collection over a period of time (some sort of sensor)

5) information collected on one of two possible dates (one or the other is correct, it's not a range) (example: aerial surveys flown on either May 27 or June 04, 1999).
It seems that EarliestCollectingDate / LatestCollectingDate could be used for #3, #4, and #5 - but is there a way to differentiate between these as actually being very different types of dates? Would those distinctions just be communicated using the VerbatimDateTime?
Of course for all of these examples the fallback would be to use the VerbatimDateTime, but it would be ideal to have as much as possible of the date information in field(s) that can be queried and analyzed.
Thank you for your thoughts and help.
Lynn
Lynn Kutner NatureServe www.natureserve.org (303) 541-0360 lynn_kutner@natureserve.org
From: Blum, Stan [mailto:sblum@calacademy.org] Sent: Thu 2/23/2006 6:41 PM To: Lynn Kutner; TDWG@LISTSERV.NHM.KU.EDU Subject: RE: Standards for date / time values?
Hi Lynn,
Context is everything, so I'm going to assume you are talking about date/time representations in the observation and monitoring schema(s) that you are developing. The thinking in the collections community has gone something along the lines of this:
Concept: the date-time of a collecting event [= recording-event, gathering-event] -- i.e., when one or more organisms were collected or observed -- expressed in the common Gregorian calendar, in local time.
Requirements:
1 - express the date/time exactly as it was recorded
2 - retrieve records by date/time ranges (date-time greater than X and/or less than Y)
3 - accommodate date-times of varying precision, including explicit date-time ranges (specifying a duration)
4 - support seasonal (time of year) queries
These requirements can be met by the following fields:
VerbatimDateTime
EarliestDateCollected
LatestDateCollected
CollectingDatesInterpreted
DayOfYear
As John Wieczorek noted, the ISO 8601 format accommodates varying degrees of precision. Using a "verbatim date-time" [text string] is as good as you can do to satisfy the first requirement, short of scanning field notes or voice recordings. The earliest and latest dates support the recording of explicit ranges as well as interpreted ranges, which could be used to make "Summer of 1952" retrievable. The CollectingDatesInterpreted field would be a boolean field, set as true when the earliest and latest dates represent interpretations rather than an explicit range. The "DayOfYear" field is a compromise that we've offered as a simple way to support queries involving seasonal (annual) cycles; e.g., collected in the northern summer, regardless of year. But it can be argued that day of year is derivable from the other fields (unless year is unknown), and that it doesn't accommodate explicit ranges.
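Stan's DayOfYear derivation and a seasonal query could be sketched like this (the June 21 - September 22 window for "northern summer" is an assumed definition for illustration, not part of the proposal, and both function names are hypothetical):

```python
from datetime import date

def day_of_year(d):
    """Derive Stan's DayOfYear from a full date -- his point that it is
    derivable from the other fields whenever the year is known."""
    return d.timetuple().tm_yday

def in_northern_summer(d):
    """Seasonal (annual-cycle) query, regardless of year. The window
    Jun 21 - Sep 22 is an assumption chosen for this sketch."""
    return day_of_year(date(d.year, 6, 21)) <= day_of_year(d) <= day_of_year(date(d.year, 9, 22))

doy = day_of_year(date(1952, 7, 15))  # e.g. a "Summer of 1952" record
```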
A bit more documentation is needed to address the odd cases (What do I do when ...?), but these five fields will support a lot of the data exchange needed for this concept. These fields are not intended to handle the dating of localities in paleontology, nor are they intended to handle named periods that are used in cultural collections (e.g., Iron Age, Victorian).
ABCD uses a few more fields to handle the concept, http://ww3.bgbm.org/abcddocs/AbcdConcepts but some of these (date, time, and time zone) are handled by the ISO format, except when the larger units are unknown; e.g., the year is unknown, but the day is [ June 16 ]; or the date is unknown, but the time is [ 14:00-15:30 ].
AbcdConcept0860 /DataSets/DataSet/Units/Unit/Gathering/DateTime
AbcdConcept0861 /DataSets/DataSet/Units/Unit/Gathering/DateTime/DateText
AbcdConcept0862 /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeZone
AbcdConcept0863 /DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeBegin
AbcdConcept0864 /DataSets/DataSet/Units/Unit/Gathering/DateTime/DayNumberBegin
AbcdConcept0865 /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeOfDayBegin
AbcdConcept0866 /DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeEnd
AbcdConcept0867 /DataSets/DataSet/Units/Unit/Gathering/DateTime/DayNumberEnd
AbcdConcept0868 /DataSets/DataSet/Units/Unit/Gathering/DateTime/TimeOfDayEnd
AbcdConcept0869 /DataSets/DataSet/Units/Unit/Gathering/DateTime/PeriodExplicit

The ABCD fields pretty much cover the entire concept space. One could argue whether time-zone is relevant to the description of biological phenomena, but we do know that ship trawls do cross time-zones (including the date-line), and that daylight savings time could stretch or compress some nocturnal collecting events if their durations were calculated too simply.
To some extent these arguments are still going on, so analyze your requirements and your data, then state your position. ;-)
Cheers,
-Stan Stanley D. Blum, Ph.D. Research Information Manager California Academy of Sciences 875 Howard St. San Francisco, CA +1 (415) 321-8183
-----Original Message----- From: Taxonomic Databases Working Group List [mailto:TDWG@LISTSERV.NHM.KU.EDU] On Behalf Of Lynn Kutner Sent: Thursday, February 23, 2006 9:12 AM To: TDWG@LISTSERV.NHM.KU.EDU Subject: Standards for date / time values?
Hi - I'm working with a suite of date attributes that can include a combination of precise dates, imprecise dates, and ranges of dates (and the same types of time values). We'd like to follow existing standards. If this sort of date / time standard exists, I'd appreciate leads to the appropriate resources. Thank you for your help - Lynn
Lynn Kutner Data Management Coordinator NatureServe Email: lynn_kutner@natureserve.org Phone: (303) 541-0360 www.natureserve.org
Dra. Patricia Koleff Directora de Análisis y Prioridades CONABIO Comisión Nacional para el Conocimiento y Uso de la Biodiversidad Av. Liga Periferico - Insurgentes Sur 4903 Col. Parques del Pedregal Delegacion Tlalpan 14010 Mexico, D.F. MEXICO Tel. +52 (55) 5528 9105 Fax +52 (55) 5528 9194 www.conabio.gob.mx e-mail: pkoleff@xolo.conabio.gob.mx
------------------------------
Date: Tue, 28 Feb 2006 07:56:09 -0800 From: "John R. WIECZOREK" tuco@BERKELEY.EDU Subject: Re: Dates and Times
As an unabashed proponent of the Point-Radius Method, I have to confess that I was tempted to try exactly what Arthur is proposing here, and for the same reason - a consistent way of dealing with uncertain measures. But I didn't like the result. The basic reason is that the single date value plus duration (before and after) can't capture the situation where the uncertainties on the two ends of the duration differ. In other words, it can't capture "definitely before midnight local time, 2 May 2000."
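John's objection can be made concrete: converting an earliest/latest pair to a midpoint plus symmetric radius has no answer when one bound is open, because any finite radius imposes a bound on both ends (the function name is hypothetical, a sketch of the argument, not a proposed standard):

```python
from datetime import datetime

def to_midpoint_radius(earliest, latest):
    """Convert an (earliest, latest) pair into (midpoint, radius).
    Returns None when one bound is open -- the case the
    midpoint-plus-uncertainty form cannot express."""
    if earliest is None or latest is None:
        return None  # open-ended window: no finite symmetric radius exists
    radius = (latest - earliest) / 2
    return earliest + radius, radius

# "definitely before midnight local time, 2 May 2000": only the upper
# bound is known, so the midpoint form cannot represent it.
one_sided = to_midpoint_radius(None, datetime(2000, 5, 2, 23, 59, 59))
```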
------=_Part_17545_1395380.1141142169842 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline
<div>As a unabashed proponent of the Point-Radius Method I have to confess = that I was tempted to try exactly what Arthur is proposing here, and for th= e same reason - a consistent way of dealing with uncertain measures. But I = didn't like the result. The basic reason is that the single date value plus= duration (before and after) can't capture the situation where the uncertai= nties on the two ends of the duration differ. In other words, it can't capt= ure "definitely before midnight local time, 2 May 2000." </div> <div><br> </div> <div><span class=3D"gmail_quote">On 2/27/06, <b class=3D"gmail_sendername">= <a href=3D"mailto:list.tdwg@achapman.org">list.tdwg@achapman.org</a></b> &l= t;<a href=3D"mailto:list.tdwg@achapman.org">list.tdwg@achapman.org</a>> = wrote: </span> <blockquote class=3D"gmail_quote" style=3D"PADDING-LEFT: 1ex; MARGIN: 0px 0= px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">This is an interesting discussio= n, and I am wondering why we treat dates/times differently to localities wh= ere we may have "between two places", "near a place", &= quot;along a path", etc. <br><br>At this stage, we tend to recommend using a single location (lat/lo= ng) with an uncertainty surrounding that location. With location= , we are currently using a circle to determine the uncertainty in a localit= y, where as time is linear (and thus easier). We may change to the use of p= olygons and uncertainty footprints at a later stage but let's leave that di= scussion for another time. <br><br>Would not the better way to handle some of these data/time situatio= ns be to use a midpoint with an uncertainty. Thus for "01-2= 000" we should not interpret that as 01-01-2000 but as 15-01-2000 with= an uncertainty of 15 (or 16) days - can be refined with the incorporation = of time. This is, of course, not all that dis-similar to what mi= chael is suggesting. 
<br><br>On a related issue - if we have two dates - the dates may be "= accurately" determined - but the actual time of occurrence is uncertai= n. I thus prefer the use of the term uncertainty - and possibly = better (as we will most likely use in the Georeferencing Best Practices doc= ument under preparation) - "Maximum Uncertainty" or perhaps "= ;Maximum uncertainty estimate" <br><br>Perhaps there are good reasons for using start and end dates/times = - if so I am happy to go along with that - but believe we should aim for co= nsistency of approach if possible.<br><br>Cheers<br><br>Arthur<br><br>Arthu= r D. Chapman <br>Australian Biodiversity Information Services<br>Toowoomba, Australia<br=
<br>From Automatic digest processor <<a href=3D"mailto:LISTSERV@LISTSER=
From Automatic digest processor LISTSERV@LISTSERV.NHM.KU.EDU on 27 Feb 2006:

There are 3 messages totalling 562 lines in this issue.

Topics of the day:

1. Standards for date / time values? (3)

----------------------------------------------------------------------

Date: Mon, 27 Feb 2006 12:27:23 +0100 From: Hannu Saarenmaa hsaarenmaa@GBIF.ORG Subject: Re: Standards for date / time values?

In my understanding the "Not interpreted" case would be just a text string "DateTimeSourceText", repeating whatever is written on the label about the dates and times. This can be useful if questions arise about the interpretation.

Btw., isn't the name of the element "EarliestDateCollected" in the Darwin Core v.1.4 proposal a bit misleading, as it really is "EarliestDateTimeCollected"?

Hannu

Richard Pyle wrote:
> RE: CollectingDatesInterpreted
> When would this field *not* be set to true?

------------------------------

Date: Mon, 27 Feb 2006 10:58:54 -0500 From: Michael Lee mikelee@EMAIL.UNC.EDU Subject: Re: Standards for date / time values?

Hello all,

It seems to me that VerbatimDate/EarliestDate/LatestDate are helpful, but they combine two separate issues:

1) duration of the event spanning more than one date/time unit
2) error associated with the date.

Is the distinction useful? That probably depends on your application of the date.

To handle these, VegBank uses three fields: startDateTime, stopDateTime and dateTimeAccuracy (note that there is NO verbatimDate).

StartDateTime is the most precise known date/time for the beginning of an event (in our case, an observation of a vegetation plot), and stopDateTime is the end date/time (which could have the same value). DateTimeAccuracy is a string drawn from a list of values ranging from 1 second to 1 day to 1 year to 5 or 10 years, etc.

Start and stop deal with issue #1 above: duration potentially exceeding the usual precision of a date description.

The accuracy field deals with issue #2 above: error or uncertainty regarding the dates. It is this accuracy field that seems to be missing from the discussion thus far.

As far as time zones go, error on the order of hours isn't TOO important to us, so we don't worry about it too much. But I think what we're aiming to do is store the date/time according to UTC. This means that to reconstruct what time the person saw something, the original time zone and Daylight Savings Time would need to be known, but this isn't crucial to us, either.

I like Lynn's list of weird dates, so I'll throw in what we'd do with them (this of course being "interpreted", but as Rich says, that's almost always the case):

pre-1975: Start: null; End: 01-JAN-1975; Accuracy: null or possibly a large value
post-1992: Start: 01-JAN-1992; End: null; Accuracy: null or possibly a large value
summer 2001: Start: 15-JUL-2001; End: 15-JUL-2001; Accuracy: 1.5 months
mid-1990s: Start: 01-JUL-1995; End: 01-JUL-1995; Accuracy: 2 years
late 1950's: Start: 01-JAN-1958; End: 01-JAN-1958; Accuracy: 2.5 years
circa 1941: Start: 01-JUL-1941; End: 01-JUL-1941; Accuracy: 6 months
early 1990s: Start: 01-JAN-1992; End: 01-JAN-1992; Accuracy: 2 years

Except for the "pre-Date" and "post-Date" cases I have assumed a relatively short duration, but this is interpreted. The problems I see with this approach are:

1) accuracy for the start and stop dates could be different
2) accuracy is not a black-and-white idea; it's often more of a gradient. It seems illogical to specify 10 years as the accuracy in the first two examples, as the dates could be much more than 10 years away.
3) a closed list of accuracy values seems to force you into decisions you don't

=== message truncated ===
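Michael's three-field scheme can be sketched as a small data structure. This is a hypothetical illustration, not VegBank's actual schema: the field names follow his message, and the sample entries are taken from his interpretations of the "weird dates" above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class EventDate:
    """Sketch of the VegBank-style three-field date (illustrative only)."""
    start: Optional[datetime]    # most precise known start; None if unbounded ("pre-1975")
    stop: Optional[datetime]     # end of the event; may equal start
    accuracy: Optional[str]      # closed-list accuracy term, e.g. "1 day", "2 years"

# A few of Michael's interpretations of Lynn's "weird dates":
weird_dates = {
    "pre-1975":    EventDate(None, datetime(1975, 1, 1), None),
    "summer 2001": EventDate(datetime(2001, 7, 15), datetime(2001, 7, 15), "1.5 months"),
    "mid-1990s":   EventDate(datetime(1995, 7, 1), datetime(1995, 7, 1), "2 years"),
    "circa 1941":  EventDate(datetime(1941, 7, 1), datetime(1941, 7, 1), "6 months"),
}
```

Note how the open-ended "pre-1975" case is expressed simply by leaving `start` as `None`, which is the property Philip's reply below turns on.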
------------------------------
Date: Tue, 28 Feb 2006 21:04:48 -0800 From: list.tdwg@ACHAPMAN.ORG Subject: Re: Dates and Times
Philip and John have convinced me that it is better to use the start/end date/time approach rather than the mid-point and uncertainty approach. Philip's arguments are very convincing.
Cheers
Arthur
From Philip.Gleeson@environment.nsw.gov.au on 27 Feb 2006:
I think the important point to remember in all of this is that time is linear, which allows for a simplified approach that is more difficult to apply in a two-dimensional spatial situation. Where the boundaries of a time interval are known, the two approaches are equivalent and require two data points to specify the range accurately. But what of the situation where only one data point is known? It often happens that one knows a collection took place before a given date, or equally that it occurred after a given date, with no other information. This is easily catered for when a start and end datetime are used; but if a midpoint and an accuracy have to be specified on an unbounded time interval, how do you do this? Conversely, I am not aware of any case in collections where one has an accurate midpoint but is unsure how far to either side of it the collection occurred. For this reason alone I lean towards the use of start and end dates over the alternative of a midpoint and an accuracy, as the latter can always be calculated from the former.
Philip Gleeson
Wildlife Data Officer Policy & Science Division Department of Environment & Conservation 43 Bridge St, Hurstville NSW 2220 Ph: 9585 6694
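Philip's two claims — that the midpoint form is always derivable from start/end, but not vice versa for open-ended intervals — can be made concrete in a short sketch (a hypothetical helper, not code from any of the systems discussed):

```python
from datetime import datetime, timedelta

def midpoint_and_uncertainty(start, end):
    """Derive the midpoint-plus-uncertainty form from a start/end pair.

    Returns None for open-ended intervals ("pre-X" / "post-X"): there is
    no finite midpoint or uncertainty, which is exactly Philip's point.
    """
    if start is None or end is None:
        return None
    half = (end - start) / 2
    return start + half, half

# Bounded interval: both representations carry the same information.
mid, unc = midpoint_and_uncertainty(datetime(2000, 1, 1), datetime(2000, 1, 31))

# Open-ended interval ("collected before 1975"): only start/end can express it.
assert midpoint_and_uncertainty(None, datetime(1975, 1, 1)) is None
```

The asymmetry is the whole argument: the conversion is total in one direction and partial in the other, so storing start and end loses nothing.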
list.tdwg@ACHAPMAN.ORG Sent by: Taxonomic Databases Working Group List TDWG@LISTSERV.NHM.KU.EDU 28/02/06 18:12 Please respond to list.tdwg
To: TDWG@LISTSERV.NHM.KU.EDU cc: Subject: Dates and Times
This is an interesting discussion, and I am wondering why we treat dates/times differently to localities where we may have "between two places", "near a place", "along a path", etc.
At this stage, we tend to recommend using a single location (lat/long) with an uncertainty surrounding that location. For location we currently use a circle to express the uncertainty, whereas time is linear (and thus easier). We may move to polygons and uncertainty footprints at a later stage, but let's leave that discussion for another time.
Would not the better way to handle some of these date/time situations be to use a midpoint with an uncertainty? Thus "01-2000" should be interpreted not as 01-01-2000 but as 15-01-2000 with an uncertainty of 15 (or 16) days, which can be refined with the incorporation of time. This is, of course, not all that dissimilar to what Michael is suggesting.
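Arthur's interpretation of a month-precision date can be sketched as follows. `interpret_month` is a hypothetical helper, and it takes the exact half-month (16-01-2000 12:00 ± 15.5 days for January) rather than Arthur's rounded 15-01-2000 ± 15 or 16 days:

```python
import calendar
from datetime import datetime, timedelta

def interpret_month(text):
    """Interpret a month-precision date like "01-2000" as a midpoint plus
    an uncertainty (hypothetical helper sketching Arthur's suggestion)."""
    month, year = (int(part) for part in text.split("-"))
    days_in_month = calendar.monthrange(year, month)[1]   # 31 for Jan 2000
    start = datetime(year, month, 1)
    half_span = timedelta(days=days_in_month) / 2         # half the month
    return start + half_span, half_span                   # midpoint, uncertainty

mid, unc = interpret_month("01-2000")   # 16-01-2000 12:00, +/- 15.5 days
```

Using the exact half-month keeps the interval [midpoint - uncertainty, midpoint + uncertainty] identical to [01-01-2000, 01-02-2000), which is the refinement "with the incorporation of time" that Arthur mentions.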
On a related issue: even where the two dates themselves are "accurately" determined, the actual time of occurrence may still be uncertain. I thus prefer the term uncertainty, and possibly better (as we will most likely use in the Georeferencing Best Practices document under preparation) "Maximum Uncertainty", or perhaps "Maximum uncertainty estimate".
Perhaps there are good reasons for using start and end dates/times - if so I am happy to go along with that - but believe we should aim for consistency of approach if possible.
Cheers
Arthur
Arthur D. Chapman Australian Biodiversity Information Services Toowoomba, Australia
From Automatic digest processor LISTSERV@LISTSERV.NHM.KU.EDU on 27 Feb
2006:
There are 3 messages totalling 562 lines in this issue.
Topics of the day:
1. Standards for date / time values? (3)
Date: Mon, 27 Feb 2006 12:27:23 +0100 From: Hannu Saarenmaa hsaarenmaa@GBIF.ORG Subject: Re: Standards for date / time values?
In my understanding the "Not interpreted" case would be just a text string "DateTimeSourceText", repeating whatever is written on the label about the dates and times. This can be useful if questions arise about the interpretation.
Btw., isn't the name of the element "EarliestDateCollected" in the Darwin Core v.1.4 proposal a bit misleading, as it really is "EarliestDateTimeCollected"?
Hannu
=== message truncated ===
------------------------------
End of TDWG Digest - 27 Feb 2006 to 28 Feb 2006 (#2006-7) *********************************************************