[tdwg] Ideas on having Harvesters like GBIF clean, flag inconsistencies, and add additional value to the data
Arthur Chapman
tdwg.lists at achapman.org
Sat May 9 03:50:28 CEST 2009
Peter - there is one additional issue here.
You imply that if the data is not in WGS84, the lat and long should be
removed ("Those records without a Datum would still be exposed but the
added geo:latitude and geo:longitude fields would be empty."). However,
the lat and long could still be included, with the Uncertainty
increased as discussed in the Georeferencing Best Practices document
(http://www.gbif.org/prog/digit/data_quality/BioGeomancerGuide) and as
calculated in the MaNIS Georeferencing Calculator
(http://manisnet.org/gc.html) and in the BioGeomancer toolkit
(http://biogeomancer.org/).
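As a rough illustration of that alternative, the sketch below keeps the
coordinates and widens the uncertainty when no datum is recorded. The
Georeference class, the effective_uncertainty_m function and the
unknown_datum_penalty_m parameter are hypothetical names, and simple
addition is an assumption made here. The real contribution of an unknown
datum varies with location and should be taken from the guide or the
calculator, not from this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Georeference:
    latitude: float
    longitude: float
    datum: Optional[str]   # e.g. "WGS84", or None when not recorded
    uncertainty_m: float   # uncertainty from all other sources, in meters


def effective_uncertainty_m(rec: Georeference,
                            unknown_datum_penalty_m: float) -> float:
    """Return the uncertainty to publish alongside the coordinates.

    unknown_datum_penalty_m is a placeholder supplied by the caller;
    it is not a value taken from any of the documents cited above.
    """
    if rec.datum is None or not rec.datum.strip():
        # Datum missing: keep the coordinates, but widen the uncertainty.
        return rec.uncertainty_m + unknown_datum_penalty_m
    return rec.uncertainty_m
```

For example, a record with 100 m of uncertainty and no datum, given an
assumed penalty of 1000 m, would be published with an uncertainty of
1100 m instead of having its coordinates dropped.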
Cheers
Arthur
Peter DeVries wrote:
> Arthur Chapman sent me some good comments regarding Datums etc.
>
> The discussion made me realize that there may be a need for two types
> of formats. One for the providers and a second one that is output by
> the harvesting service.
>
> This is because the needs and abilities of the data providers are
> different than the needs and abilities of those who would like to
> consume the data.
>
> Consumers, who analyze and map the data, would like something that is
> easy to process, standardized and as error free as possible.
>
> It could work in the following way.
>
> Data harvesters, like GBIF, collect the records and run them through
> cleaning algorithms that check attributes, including whether the lat
> and long actually match the location described.
>
> These harvesters would then expose this cleaned data via XML and RDF
> with tags that flag possible inconsistencies. The harvesters would
> also add a field for the lat and long in WGS84 if the original record
> contains a valid Datum. Those records without a Datum would still be
> exposed but the added geo:latitude and geo:longitude fields would be
> empty.
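
The harvester output described in the two paragraphs above might look
something like the following minimal sketch. It is only an illustration:
the harvest_record function, the input keys ("latitude", "longitude",
"datum") and the flag names are all hypothetical, and it handles only
the trivial case where the provider data is already in WGS84. A real
service would need a proper datum transformation for everything else.

```python
from typing import Any, Dict, List


def harvest_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a provider record, then add flags and WGS84 fields."""
    out = dict(raw)
    flags: List[str] = []
    lat, lon = raw.get("latitude"), raw.get("longitude")
    # Normalize spellings such as "wgs 84" to "WGS84"; empty if absent.
    datum = (raw.get("datum") or "").strip().upper().replace(" ", "")

    coords_ok = (
        isinstance(lat, (int, float)) and isinstance(lon, (int, float))
        and -90 <= lat <= 90 and -180 <= lon <= 180
    )
    if not coords_ok:
        flags.append("COORDINATES_OUT_OF_RANGE")
    if not datum:
        flags.append("DATUM_MISSING")

    if coords_ok and datum == "WGS84":
        # Already in WGS84, so the values can be copied as they are.
        out["geo:latitude"], out["geo:longitude"] = lat, lon
    else:
        # Left empty, as proposed, when no usable datum is available.
        out["geo:latitude"], out["geo:longitude"] = None, None
        if coords_ok and datum:
            # A datum is present but this sketch does not transform it.
            flags.append("DATUM_NOT_TRANSFORMED")

    out["flags"] = flags
    return out
```

The original provider fields are left untouched, so a consumer can still
see what was submitted next to the added fields and flags.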
>
> I can imagine that the data uploaded to GBIF and other harvester
> services will be replete with typos and inconsistencies that will
> frustrate people trying to analyze or simply map the data. The
> harvester services could add value by minimizing these frustrations.
>
> Originally, it seemed that a global service should standardize on a
> global Datum like WGS84. After all, haven't we standardized on
> meters? However, after discussing this with Arthur, I realize that
> this is not possible for a number of reasons. That said, I think the
> data would be much more valuable and less likely to be misinterpreted
> if a version of it was available in WGS84. This solution would
> eventually encourage data providers to understand what a Datum is and
> include it in their data. It would also help solve a number of other
> data integration problems.
>
> Respectfully,
>
> Pete
>
>
> ---------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> ------------------------------------------------------------