Re: Standards for date / time values?
Who is using data scored like this - actually searching on and analyzing data that was scored to 'summer', 'autumn', 'early 19th century'?
This thread seems to be a great deal about how to capture data from various historic sources, but what are the *use cases* for exploitation of it? What is the cost-benefit analysis of doing more than Stan suggested above, /a priori,/ in the hope that it will be more useful than just capturing a string plus a year that can be analyzed later if it ever needs to be - which it might not?
Just some thoughts,
Roger
Michael Lee wrote:
Hello all,
It seems to me that the VerbatimDate/EarliestDate/LatestDate are helpful, but combine two separate issues:
- duration of the event spanning more than one date/time unit
- error associated with the date.
Is the distinction useful? Probably depends on your application of the date.
To handle these, VegBank uses three fields:
- startDateTime
- stopDateTime
- dateTimeAccuracy
(note that there is NO verbatimDate)
StartDateTime is the most precise known date/time for the beginning of an event (in our case an observation of a vegetation plot) and the EndDateTime is the end date/time (which could have the same value). DateTimeAccuracy is a string with a list of values ranging from 1 second to 1 day to 1 year to 5, 10 years, etc.
start and stop deal with issue #1 above, duration (potentially) exceeding usual date description precision.
The accuracy field deals with issue #2 above, error or uncertainty regarding the dates. It is this accuracy field that seems to be missing from this discussion thus far.
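As a rough illustration of the three-field layout described above, here is a minimal Python sketch. The class name EventDate, the snake_case attribute names, and the is_instant helper are hypothetical and are not taken from the VegBank schema; only the three-field idea comes from the description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class EventDate:
    """Hypothetical record mirroring the three fields described above."""
    start_date_time: Optional[datetime]  # most precise known start; None if unknown
    stop_date_time: Optional[datetime]   # end of the event; may equal the start
    date_time_accuracy: Optional[str]    # e.g. "1 day", "1 year"; None if unknown

    def is_instant(self) -> bool:
        """True when start and stop are both known and identical."""
        return (self.start_date_time is not None
                and self.start_date_time == self.stop_date_time)


# A plot observed on a single day, known to within one day.
plot_visit = EventDate(datetime(2001, 7, 15), datetime(2001, 7, 15), "1 day")
```

Keeping the accuracy separate from start and stop is what lets duration (issue #1) and uncertainty (issue #2) be recorded independently.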
As far as time zones go, error on the order of hours isn't TOO important to us, so we don't worry about it too much. But I think what we're aiming to do is to store the date/time according to UTC. This means that to reconstruct what time the person saw something, the original time zone and Daylight Savings Time would need to be known, but this isn't crucial to us, either.
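To make the time zone point concrete, the sketch below stores a local observation time as UTC and then recovers the local clock time. The date, time, and offset are invented for illustration; a fixed UTC-4 offset stands in for a real zone with Daylight Saving Time, which is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone

# Assumed original zone: a fixed UTC-4 offset (hypothetical stand-in for
# a local zone while daylight saving is in effect).
original_zone = timezone(timedelta(hours=-4))

# Hypothetical observation at 09:30 local time.
local = datetime(2001, 7, 15, 9, 30, tzinfo=original_zone)

# What would be stored: the same instant expressed in UTC.
stored_utc = local.astimezone(timezone.utc)

# Recovering the local clock time is only possible if the original
# offset was recorded somewhere alongside the UTC value.
restored = stored_utc.astimezone(original_zone)
```

Without the original offset, only stored_utc is available, and the hour at which the observer actually saw something cannot be reconstructed.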
I like Lynn's list of weird dates, so I'll throw in what we'd do with them (this of course being "interpreted" but as Rich says, that's almost always the case):
PRE-1975: Start: null; End: 01-JAN-1975; Accuracy: null or possibly a large value
post-1992: Start: 01-JAN-1992; End: null; Accuracy: null or possibly a large value
summer 2001: Start: 15-JUL-2001; End: 15-JUL-2001; Accuracy: 1.5 months
Mid-1990s: Start: 01-JUL-1995; End: 01-JUL-1995; Accuracy: 2 years
late 1950's: Start: 01-JAN-1958; End: 01-JAN-1958; Accuracy: 2.5 years
Circa 1941: Start: 01-JUL-1941; End: 01-JUL-1941; Accuracy: 6 months
Early 1990s: Start: 01-JAN-1992; End: 01-JAN-1992; Accuracy: 2 years
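One way such interpreted values could be searched is to widen the start/end pair by the accuracy and test whether a query date falls inside. The following Python sketch assumes the accuracy has already been converted to days (1.5 months is approximated as 45 days) and treats a null start or end as open-ended; the function name could_match is hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def could_match(query: date,
                start: Optional[date],
                end: Optional[date],
                accuracy_days: float = 0) -> bool:
    """Return True if query lies within [start - accuracy, end + accuracy].

    A start or end of None is treated as unbounded on that side.
    """
    margin = timedelta(days=accuracy_days)
    if start is not None and query < start - margin:
        return False
    if end is not None and query > end + margin:
        return False
    return True


# "summer 2001": 15-JUL-2001 with accuracy of 1.5 months (about 45 days).
summer_hit = could_match(date(2001, 8, 1), date(2001, 7, 15), date(2001, 7, 15), 45)
summer_miss = could_match(date(2001, 12, 1), date(2001, 7, 15), date(2001, 7, 15), 45)

# "PRE-1975": no start, end 01-JAN-1975, no accuracy recorded.
pre_hit = could_match(date(1960, 6, 1), None, date(1975, 1, 1))
pre_miss = could_match(date(1980, 1, 1), None, date(1975, 1, 1))
```

This applies the same accuracy to both ends, so it does not handle cases where the uncertainty on the start and stop dates differs.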
Except for the "pre-Date" and "post-Date" I have assumed a relatively short duration. But this is interpreted. The problems with this approach I see are:
1) accuracy for start and stop dates could be different
2) accuracy is not a black and white idea. It's often more of a gradient. Seems illogical to specify 10 years as accuracy in the first 2 examples, as the dates could be much more than 10 years away.
3) a closed list on accuracy seems to force you into decisions you don't want to make. But an open field is problematic as you might not always be able to interpret it. Perhaps 2 fields: DateAccuracyNumeric and DateAccuracyUnits would help, or you could require the units to be converted to years, with very tiny accuracies being really small decimal numbers.
4) Doesn't tackle issues of 2 dates (May or Sept of 1995) all that well.
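The idea of a numeric accuracy plus a units field, normalized to years, could look something like the Python sketch below. The conversion table, the 365.25-day year, and the function name accuracy_in_years are assumptions made for illustration, not part of any existing schema.

```python
# Assumed conversion factors; a year is taken as 365.25 days.
YEARS_PER_UNIT = {
    "second": 1 / (365.25 * 24 * 3600),
    "minute": 1 / (365.25 * 24 * 60),
    "hour": 1 / (365.25 * 24),
    "day": 1 / 365.25,
    "month": 1 / 12,
    "year": 1.0,
}


def accuracy_in_years(value: float, unit: str) -> float:
    """Convert a (DateAccuracyNumeric, DateAccuracyUnits) pair to years.

    Accepts singular or plural unit names, e.g. "month" or "months".
    Raises KeyError for a unit that is not in the table.
    """
    key = unit.strip().lower().rstrip("s")
    return value * YEARS_PER_UNIT[key]


circa_1941 = accuracy_in_years(6, "months")    # "Circa 1941" example
late_1950s = accuracy_in_years(2.5, "years")   # "late 1950's" example
one_second = accuracy_in_years(1, "second")    # a very small decimal
```

Storing the pair and converting on demand keeps the original units readable while still allowing numeric comparison across records.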
But it seems flexible enough for our purposes. I wasn't part of the team that designed it, so I'm not sure what alternatives they considered and for what reasons they rejected any alternatives they considered.
cheers, michael
Michael Lee VegBank Project Manager http://www.vegbank.org
On Mon, 27 Feb 2006, Hannu Saarenmaa wrote:
In my understanding the "Not interpreted" case would be just a text string "DateTimeSourceText", repeating whatever is written in the label about the dates and times. This can be useful if questions arise of the interpretation.
Btw., isn't the name of the element "EarliestDateCollected" in Darwin Core v.1.4 proposal a bit misleading as it really is "EarliestDateTimeCollected"?
Hannu
Richard Pyle wrote:
RE: CollectingDatesInterpreted When would this field *not* be set to true?
--
Roger Hyam
Technical Architect
Taxonomic Databases Working Group
http://www.tdwg.org
roger@tdwg.org
+44 1578 722782