Pardon my ignorance, but has there ever been a discussion of easting and northing values with regard to Darwin Core? AFAIK the current standard doesn’t mention them. The reason I’m asking is that I’ve just come across some VerbatimLatitude and VerbatimLongitude values in a dataset that is aggregated by VertNet (and hence GBIF) where (after some head scratching) I realised that the verbatim values were actually Easting and Northing (which I didn’t know existed until yesterday). Details are here: https://github.com/ttu-vertnet/ttu-mammals/issues/11
I’m guessing this isn’t a terribly common way to record location information, but it looks like in this case the lack of support for this type of data has resulted in somebody trying to shoehorn them into VerbatimLatitude and VerbatimLongitude, resulting in values which are uninterpretable to aggregators further up the chain.
Regards
Rod
--------------------------------------------------------- Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page@glasgow.ac.uk Tel: +44 141 330 4778 Skype: rdmpage Facebook: http://www.facebook.com/rdmpage LinkedIn: http://uk.linkedin.com/in/rdmpage Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com ORCID: http://orcid.org/0000-0002-7101-9767 Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ ResearchGate: https://www.researchgate.net/profile/Roderic_Page
Hi Rod,
At least in Norway, it is very common for the GBIF node to receive (only) Easting and Northing values for UTM zones 32V to 36W. For many datasets we routinely and automatically convert these to decimal degrees (WGS84) at the node before the datasets are published to the GBIF portal. When people download occurrences from the Norwegian "GBIF portal", Artskart, my impression is that the UTM 32V (and 33V) Easting and Northing coordinate format is actually more popular than the decimal degree format. This is because the geographic data layers for Norway are more often made available in UTM (most often 32V or 33V) [1]. And yes, this continued present-day official use of such a wide variety of coordinate formats frustrates me too... The historic use reported with the verbatim terms is, of course, difficult to do anything with...
I assume that Easting and Northing coordinates are both valid and very common values (and not only in Norway) for the Darwin Core verbatim coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude, or dwc:verbatimCoordinates), but of course they are only useful when dwc:verbatimCoordinateSystem and dwc:verbatimSRS are reported alongside them. (And dwc:decimalLatitude and dwc:decimalLongitude, correctly reported in WGS84, should preferably always be there as well.) I believe that Darwin Core already has the terms it needs for reporting geographic coordinates. If any additions are useful at all, I believe that identifying and recommending terms from more specialized geographic vocabularies and ontologies would be much more useful than adding new dwc:Location terms to Darwin Core. In fact, most of the dwc:Location terms might preferably be replaced by terms from the geography community, such as [2] and [3] (as a start).
[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/
[2] http://www.w3.org/2003/01/geo/wgs84_pos#
[3] http://www.geonames.org/ontology/documentation.html
Regards
Dag
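The node-side conversion described in this message (UTM Easting/Northing to WGS84 decimal degrees) can be sketched without external libraries. This is only an illustration, not the Norwegian node's actual code: it assumes the coordinates are already referenced to the WGS84 ellipsoid, it uses the truncated Krüger series for the inverse transverse Mercator projection, and the function name `utm_to_wgs84` and the sample point are made up for the example. For real data a maintained projection library would be the safer choice.

```python
import math

# WGS84 ellipsoid and standard UTM constants.
A_WGS84 = 6378137.0
F_WGS84 = 1 / 298.257223563
K0 = 0.9996
FALSE_EASTING = 500000.0
FALSE_NORTHING_SOUTH = 10000000.0


def utm_to_wgs84(zone, northern, easting, northing):
    """Return (latitude, longitude) in decimal degrees for a UTM position.

    Assumption: the easting/northing are on the WGS84 ellipsoid.
    """
    n = F_WGS84 / (2 - F_WGS84)
    # Rectifying radius of the ellipsoid.
    big_a = A_WGS84 / (1 + n) * (1 + n**2 / 4 + n**4 / 64)
    beta = (
        n / 2 - 2 * n**2 / 3 + 37 * n**3 / 96,
        n**2 / 48 + n**3 / 15,
        17 * n**3 / 480,
    )
    delta = (
        2 * n - 2 * n**2 / 3 - 2 * n**3,
        7 * n**2 / 3 - 8 * n**3 / 5,
        56 * n**3 / 15,
    )
    north0 = 0.0 if northern else FALSE_NORTHING_SOUTH
    xi = (northing - north0) / (K0 * big_a)
    eta = (easting - FALSE_EASTING) / (K0 * big_a)
    xi_p, eta_p = xi, eta
    for j, b in enumerate(beta, start=1):
        xi_p -= b * math.sin(2 * j * xi) * math.cosh(2 * j * eta)
        eta_p -= b * math.cos(2 * j * xi) * math.sinh(2 * j * eta)
    # Conformal latitude, then geodetic latitude via the delta series.
    chi = math.asin(math.sin(xi_p) / math.cosh(eta_p))
    lat = chi + sum(
        d * math.sin(2 * j * chi) for j, d in enumerate(delta, start=1)
    )
    # Central meridian of the zone, e.g. 9 degrees east for zone 32.
    lon0 = math.radians(zone * 6 - 183)
    lon = lon0 + math.atan2(math.sinh(eta_p), math.cos(xi_p))
    return math.degrees(lat), math.degrees(lon)


# Hypothetical point in UTM zone 32, northern hemisphere.
lat, lon = utm_to_wgs84(32, True, 597000.0, 6643000.0)
print(f"{lat:.5f}, {lon:.5f}")
```

The zone number and hemisphere have to come from somewhere other than the two numbers themselves, which is why the accompanying dwc:verbatimCoordinateSystem and dwc:verbatimSRS values matter.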
Hi Rod,
The verbatimLatitude, verbatimLongitude, and verbatimCoordinates terms were all intended to capture the original coordinates used at the source, whereas decimalLatitude and decimalLongitude, with geodeticDatum, were meant to contain the easy-to-act-on global system (UTM zones do not cover the entire planet, for example). The definition of verbatimCoordinates shows that this was the intent, but those of verbatimLatitude and verbatimLongitude do not. When we get the examples separated from the term definitions, it should be easier to make this clear.
Cheers,
John
Hi John,
So, if I understand you correctly, you would have hoped that TTU would have output something like this:
"dwc:verbatimCoordinates" : "18M 166624 9840350"
rather than put the easting and northing into verbatimLatitude and verbatimLongitude.
Regards
Rod
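A consumer receiving a verbatimCoordinates string like the one in this message would still have to split it into its parts. The sketch below is one hedged way to do that, assuming the layout is "zone number + latitude band, easting, northing" separated by whitespace; the function name `parse_utm_string` and the returned keys are invented for illustration and are not part of Darwin Core.

```python
import re

# Latitude bands run C..X (I and O are unused); N and later lie north
# of the equator. Assumed order of the numbers: easting, then northing.
_UTM_PATTERN = re.compile(
    r"^\s*(\d{1,2})\s*([C-HJ-NP-X])\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s*$",
    re.IGNORECASE,
)


def parse_utm_string(text):
    """Split 'zone+band easting northing' into parts, or return None."""
    match = _UTM_PATTERN.match(text)
    if match is None:
        return None
    zone = int(match.group(1))
    if not 1 <= zone <= 60:
        return None
    band = match.group(2).upper()
    return {
        "zone": zone,
        "band": band,
        "northern": band >= "N",
        "easting": float(match.group(3)),
        "northing": float(match.group(4)),
    }


print(parse_utm_string("18M 166624 9840350"))
```

One caveat: some sources write "N" or "S" after the zone number to mean the hemisphere rather than the latitude band, so a value such as "18S" is ambiguous unless verbatimCoordinateSystem says which convention was used.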
Hi Rod, I think your example is correct, yes. The height is likely to be a tough one for most consumers, including GBIF, but in general that is how Darwin Core would like to see it.
Instead of verbatimLatitude and verbatimLongitude it might have been slightly simpler if we had a verbatimCoordinateX, verbatimCoordinateY & verbatimCoordinateZ so we would not need to parse those out and understand the order within the one verbatim string.
Markus
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Hi Rod,
I have assumed that the purpose of the dwc:verbatimCoordinates term is to allow for reporting coordinates originally recorded as a single value, such as an MGRS (Military Grid Reference System) reference, while original source coordinates that have two components, such as UTM, would use dwc:verbatimLongitude (for the Easting or X coordinate) and dwc:verbatimLatitude (for the Northing or Y coordinate).
Regards Dag
Hi Dag,
Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed it has for the TTU dataset.
Regards
Rod
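One way a parser could guard against the surprise described here is a cheap range check before attempting to read the values as degrees. The sketch below is a heuristic only, with invented names (`classify_verbatim_pair` and its labels), and it assumes that a northing would sit in verbatimLatitude and an easting in verbatimLongitude. It cannot prove that a value is projected, only flag it as implausible for degrees.

```python
def classify_verbatim_pair(verbatim_latitude, verbatim_longitude):
    """Return 'degrees', 'possibly projected', or 'unknown' for two strings."""
    try:
        lat = float(verbatim_latitude)
        lon = float(verbatim_longitude)
    except (TypeError, ValueError):
        # Not plain numbers, e.g. degrees-minutes-seconds text.
        return "unknown"
    if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0:
        return "degrees"
    # UTM eastings fall roughly between 100 000 and 900 000 m and
    # northings between 0 and 10 000 000 m.
    if 0.0 <= lat <= 10_000_000.0 and 100_000.0 <= lon <= 900_000.0:
        return "possibly projected"
    return "unknown"


print(classify_verbatim_pair("9840350", "166624"))
print(classify_verbatim_pair("-1.44", "-78.0"))
```

Anything labelled "possibly projected" would still need verbatimCoordinateSystem (and a zone) before it could be converted.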
Hi Rod,
I would treat easting and northing as latitude and longitude, though that does not strictly follow the definitions in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS, where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings: to help with the parsing problem if the original source is known.
In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows us to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet, but it is worth it.
Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwards. We appreciate the careful eye and the quality feedback reports to GitHub on TTU's behalf.
Cheers,
John
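As an illustration of the field layout described in the message above, a record could be sketched roughly as follows. This is only a sketch: the Darwin Core term names are real, but the Python dict representation is an assumption, and the verbatimSRS value is a placeholder because the thread never states the datum of the TTU coordinates.

```python
# Hypothetical Darwin Core record for the UTM example quoted in the thread.
# Only the verbatim terms are filled in; decimalLatitude, decimalLongitude,
# geodeticDatum and coordinateUncertaintyInMeters would be derived later.
record = {
    # Full original string, including zone number and latitude band.
    "verbatimCoordinates": "18M 166624 9840350",
    # Easting kept separately in the longitude-like field.
    "verbatimLongitude": "166624",
    # Northing kept separately in the latitude-like field.
    "verbatimLatitude": "9840350",
    # States explicitly that the values are UTM, not degrees.
    "verbatimCoordinateSystem": "UTM",
    # Placeholder: the datum of the source is not given in the thread.
    "verbatimSRS": "unknown",
}

for term, value in record.items():
    print(f"{term}: {value}")
```

With the coordinate system stated explicitly, a downstream parser does not have to guess whether "166624" is meant as degrees or metres.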
On Mon, Feb 9, 2015 at 8:50 AM, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi Dag,
Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed they have for the TTU dataset.
Regards
Rod
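The ambiguity described above can be made concrete with a small range check. This is a minimal sketch, not anything used by VertNet or GBIF; the function name `looks_like_decimal_degrees` is hypothetical, and it only recognises plain decimal numbers (degree-minute-second strings would need a separate parser).

```python
def looks_like_decimal_degrees(verbatim_lat, verbatim_lon):
    """Return True if both values parse as numbers within lat/long ranges.

    A False result does not prove the values are easting/northing; it only
    says they cannot be plain decimal degrees.
    """
    try:
        lat = float(verbatim_lat)
        lon = float(verbatim_lon)
    except (TypeError, ValueError):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# Plain decimal degrees pass the check.
print(looks_like_decimal_degrees("-1.44", "-78.0"))
# The northing and easting from the thread's example do not.
print(looks_like_decimal_degrees("9840350", "166624"))
```

A check like this can flag suspicious records, but it cannot recover the zone or hemisphere that a UTM value needs.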
On 9 Feb 2015, at 11:41, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
I have assumed that the purpose of the dwc:verbatimCoordinates term is to allow reporting of coordinates originally recorded as one single string, such as MGRS (Military Grid Reference System), whereas original source coordinates that come as two separate values, such as UTM, would use dwc:verbatimLongitude (for the Easting, or X, value) and dwc:verbatimLatitude (for the Northing, or Y, value).
Regards Dag
On 9 February 2015 at 11:57, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
So, if I understand you correctly, you would have hoped that TTU would have output something like this:
“dwc:verbatimCoordinates” : “18M 166624 9840350”
rather than put the easting and northing into verbatimLatitude and verbatimLongitude.
Regards
Rod
On 8 Feb 2015, at 20:22, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
The verbatimLatitude, verbatimLongitude, and verbatimCoordinates terms were all intended to capture the original coordinates used at the source, whereas decimalLatitude and decimalLongitude, with geodeticDatum, were meant to contain the easy-to-act-on global system (UTMs do not cover the entire planet, for example). The definition of verbatimCoordinates shows that this was the intent, but the definitions of verbatimLatitude and verbatimLongitude do not. When we get the examples separated from the term definitions, it should be easier to make this clear.
Cheers,
John
On Sat, Feb 7, 2015 at 2:36 PM, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
At least in Norway, it is very common for the GBIF node to receive (only) Easting and Northing in UTM zones 32V to 36W. For many datasets we routinely convert these automatically to decimal degrees (and WGS84) at the node before the datasets are published to the GBIF portal. When people download occurrences from the Norwegian "GBIF portal", Artskart, my impression is that the UTM 32V (and 33V) Easting and Northing format is actually more popular than the decimal degree format, because the geographic data layers for Norway are more often made available in UTM (most often 32V or 33V) [1]. And yes, this continued present-day official use of such a wide variety of coordinate formats frustrates me too... The historic use reported with the verbatim terms is, of course, difficult to do anything with...
I assume that Easting and Northing coordinates are both valid and very common values (and not only in Norway) for the Darwin Core verbatim coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude, or dwc:verbatimCoordinates), but of course they are only useful when the corresponding dwc:verbatimCoordinateSystem and dwc:verbatimSRS are also reported. (And dwc:decimalLatitude and dwc:decimalLongitude, correctly reported in WGS84, should preferably always be there as well.) I believe that Darwin Core is already fine with respect to terms for reporting geographic coordinates. If any additions are useful at all, I believe that identifying and recommending terms from more specialized geographic vocabularies and ontologies would be much more useful than adding new dwc:Location terms to Darwin Core. In fact, most of the dwc:Location terms might preferably be replaced by terms from the geography community... such as perhaps [2] and [3] (as a start).
[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/
[2] http://www.w3.org/2003/01/geo/wgs84_pos#
[3] http://www.geonames.org/ontology/documentation.html
Regards Dag
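The conversion from UTM Easting and Northing to decimal degrees mentioned above can be sketched in plain Python. This is not the procedure used at the Norwegian node, which the thread does not describe; it is a sketch of the standard truncated series for the inverse transverse Mercator projection, and it assumes the source values are on the WGS84 ellipsoid (data on another datum would need a datum shift that is not shown). The function name `utm_to_wgs84` is hypothetical.

```python
import math


def utm_to_wgs84(zone, easting, northing, southern):
    """Convert UTM zone/easting/northing to (latitude, longitude) in degrees.

    Assumes the WGS84 ellipsoid. 'southern' must be True for coordinates
    south of the equator, where northings carry a 10,000,000 m false origin.
    """
    a = 6378137.0                      # WGS84 semi-major axis (m)
    f = 1 / 298.257223563              # WGS84 flattening
    e2 = f * (2 - f)                   # first eccentricity squared
    ep2 = e2 / (1 - e2)                # second eccentricity squared
    k0 = 0.9996                        # UTM scale factor on central meridian

    x = easting - 500000.0             # remove false easting
    y = northing - 10000000.0 if southern else northing

    # Footpoint latitude from the meridional arc length.
    mu = (y / k0) / (a * (1 - e2 / 4 - 3 * e2**2 / 64 - 5 * e2**3 / 256))
    e1 = (1 - math.sqrt(1 - e2)) / (1 + math.sqrt(1 - e2))
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    c1 = ep2 * cos1**2
    t1 = tan1**2
    n1 = a / math.sqrt(1 - e2 * sin1**2)
    r1 = a * (1 - e2) / (1 - e2 * sin1**2) ** 1.5
    d = x / (n1 * k0)

    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * ep2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * ep2 - 3 * c1**2)
        * d**6 / 720)
    dlon = (d
            - (1 + 2 * t1 + c1) * d**3 / 6
            + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * ep2 + 24 * t1**2)
            * d**5 / 120) / cos1

    # Central meridian of the zone: zone 1 is centred on 177 degrees West.
    lon0 = math.radians((zone - 1) * 6 - 180 + 3)
    return math.degrees(lat), math.degrees(lon0 + dlon)


# The example from the thread: zone 18, band M (southern hemisphere).
lat, lon = utm_to_wgs84(18, 166624.0, 9840350.0, southern=True)
print(f"{lat:.4f}, {lon:.4f}")
```

For the thread's example this gives a point at roughly 1.4 degrees South, 78 degrees West. In production a maintained projection library would be the safer choice; the sketch is only meant to show that zone and hemisphere are required inputs.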
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude values were eastings and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is), the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interpreted, leaving aside which fields people stick that information in.
Regards
Rod
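The completeness concern above could be checked mechanically before data reaches an aggregator. The following is a minimal sketch under stated assumptions: the function name `missing_utm_context` is hypothetical, the record is assumed to be a dict keyed by Darwin Core term names, and the zone and latitude band are assumed to be given at the start of verbatimCoordinates, as in "18M 166624 9840350".

```python
import re

# Zone number (1-2 digits) followed by a latitude band letter (C-X, no I or O).
ZONE_RE = re.compile(r"^\s*(\d{1,2})\s*([C-HJ-NP-X])\b", re.IGNORECASE)


def missing_utm_context(record):
    """List the pieces still needed to interpret easting/northing values."""
    missing = []
    if not record.get("verbatimCoordinateSystem"):
        missing.append("verbatimCoordinateSystem")
    if not record.get("verbatimSRS"):
        missing.append("verbatimSRS")
    if not ZONE_RE.match(record.get("verbatimCoordinates") or ""):
        missing.append("UTM zone and latitude band")
    return missing


# A record like the one described above: bare numbers, no context.
bare = {"verbatimLatitude": "9840350", "verbatimLongitude": "166624"}
print(missing_utm_context(bare))
```

An empty list means the record carries enough context for a parser to attempt a conversion; it does not validate the values themselves.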
I'm working on improving how the original data are captured in Darwin Core for TTU as we speak. :-)
Hi Rod & John,
For the future documentation of DwC, I’d be curious whether we all think it is a good idea to overload verbatimLongitude/verbatimLatitude with values that are not strictly lon/lat, like easting and northing. I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great not to have to parse them out of a string.
Any strong opinions?
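To give a sense of what parsing UTM values out of a verbatimCoordinates string involves, here is a minimal sketch. It assumes the layout of the example in this thread (zone number, latitude band letter, easting, northing, separated by whitespace); the names `UtmCoordinate` and `parse_utm` are hypothetical. It also assumes the letter is a latitude band, which is a real source of ambiguity: some sources write "N" or "S" for the hemisphere instead, and "S" is also a northern band letter.

```python
import re
from typing import NamedTuple


class UtmCoordinate(NamedTuple):
    zone: int
    band: str
    southern: bool
    easting: float
    northing: float


UTM_RE = re.compile(
    r"^\s*(\d{1,2})\s*([C-HJ-NP-X])\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s*$",
    re.IGNORECASE,
)


def parse_utm(verbatim):
    """Parse 'ZONE+BAND EASTING NORTHING', e.g. '18M 166624 9840350'."""
    match = UTM_RE.match(verbatim)
    if match is None:
        raise ValueError(f"not a recognised UTM string: {verbatim!r}")
    zone = int(match.group(1))
    if not 1 <= zone <= 60:
        raise ValueError(f"UTM zone out of range: {zone}")
    band = match.group(2).upper()
    # Bands C to M lie south of the equator, N to X north of it.
    southern = band < "N"
    return UtmCoordinate(zone, band, southern,
                         float(match.group(3)), float(match.group(4)))


print(parse_utm("18M 166624 9840350"))
```

Real data will contain many more variants than this one pattern, which is the cost of keeping everything in a single string.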
On 09 Feb 2015, at 17:25, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude were easting and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is) the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interepreted, leaving aside in what fields people stick that information.
Regards
Rod
Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page@glasgow.ac.uk Tel: +44 141 330 4778 Skype: rdmpage Facebook: http://www.facebook.com/rdmpage LinkedIn: http://uk.linkedin.com/in/rdmpage Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com ORCID: http://orcid.org/0000-0002-7101-9767 Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ ResearchGate https://www.researchgate.net/profile/Roderic_Page
On 9 Feb 2015, at 16:12, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
I would treat easting and northing as latitude and longitude, but not following strictly the definition in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings, to help with tha parsing problem if the original source is known.
In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows use to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet. but it is worth it.
Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwords. We appreciate the careful eye and the quality feedback reports to Github on TTU's behalf.
Cheers,
John
On Mon, Feb 9, 2015 at 8:50 AM, Roderic Page Roderic.Page@glasgow.ac.uk wrote: Hi Dag,
Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed they have for the TTU dataset.
Regards
Rod
On 9 Feb 2015, at 11:41, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
I have assumed that the purpose of the dwc:verbatimCoordinates term is to allow for reporting coordinates originally recorded as one single tuple such as the MGRS (Military Grid Reference System).
Original source coordinates that have two separate components, such as UTM, would instead use dwc:verbatimLongitude (for the Easting or X coordinate) and dwc:verbatimLatitude (for the Northing or Y coordinate).
Regards Dag
On 9 February 2015 at 11:57, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
So, if I understand you correctly, you would have hoped that TTU would have output something like this:
“dwc:verbatimCoordinates” : “18M 166624 9840350”
rather than put the easting and northing into verbatimLatitude and verbatimLongitude.
Regards
Rod
On 8 Feb 2015, at 20:22, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
The verbatimLatitude, verbatimLongitude, and verbatimCoordinates terms were all intended to capture the original coordinates used at the source, whereas decimalLatitude and decimalLongitude, with geodeticDatum, were meant to contain the easy-to-act-on global system (UTMs do not cover the entire planet, for example). The verbatimCoordinates term's definition shows that this was the intent, but the definitions of verbatimLatitude and verbatimLongitude do not. When we get the examples separated from the term definitions, it should be easier to make this clear.
Cheers,
John
On Sat, Feb 7, 2015 at 2:36 PM, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
At least in Norway, it is very common for the GBIF node to receive (only) Easting and Northing values for UTM zones 32V to 36W. For many datasets we routinely and automatically convert these to decimal degrees (and WGS84) at the node before the datasets are published to the GBIF portal. When people download occurrences from the Norwegian "GBIF portal", Artskart, my impression is that the UTM 32V (and 33V) Easting and Northing coordinate format is actually more popular than the decimal degree format, because the geographic data layers for Norway are more often made available in UTM (most often 32V or 33V) [1]. And yes, this continued present-day official use of such a wide variety of coordinate formats frustrates me too... The historic use reported with the verbatim terms is, of course, difficult to do anything with...
I assume that Easting and Northing coordinates are both valid and very common values (and not only in Norway) for the Darwin Core verbatim coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude, or dwc:verbatimCoordinates), but of course they are only useful when the respective dwc:verbatimCoordinateSystem and dwc:verbatimSRS are also reported. (And dwc:decimalLatitude and dwc:decimalLongitude, correctly reported in WGS84, should preferably always be there as well.) I believe that Darwin Core is already fine with respect to terms for reporting geographic coordinates. If any additions are useful at all, I believe that identifying and recommending terms from more specialized geographic vocabularies and ontologies might be much more useful than adding new dwc:Location terms to Darwin Core. In fact, most of the dwc:Location terms might perhaps preferably be replaced by terms from the geography community... such as perhaps [2] and [3] (as a start).
[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/
[2] http://www.w3.org/2003/01/geo/wgs84_pos#
[3] http://www.geonames.org/ontology/documentation.html
Regards Dag
-- Dag Endresen, Ph.D. Private email: dag.endresen@gmail.com Work email: dag.endresen@nhm.uio.no Mobile: +47 4061 2982 _______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Is it wrong that I’d be happier if Darwin Core itself had terms for easting, northing, and zones, so there was no ambiguity (or at least it would be minimised)? Putting stuff in text fields and hoping people can figure it out is just asking for trouble, as is overloading fields.
If they do go into verbatimCoordinates, I wonder if the spec/examples could add a regular expression that they would have to pass in order to be accepted.
Regards
Rod
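As an illustration of what such a regular expression might look like, here is a minimal sketch for strings shaped like the example in this thread (18M 166624 9840350). The pattern and the parse_utm helper are hypothetical and not part of Darwin Core or any existing tool. It assumes a zone number from 1 to 60, a latitude band letter from C to X (I and O are not used), a six-digit easting, and a northing of up to seven digits, both in metres.

```python
import re

# Hypothetical pattern: zone number, band letter, easting, northing.
UTM_PATTERN = re.compile(
    r"^\s*(?P<zone>0?[1-9]|[1-5][0-9]|60)\s*(?P<band>[C-HJ-NP-X])\s+"
    r"(?P<easting>\d{6}(?:\.\d+)?)\s+"
    r"(?P<northing>\d{1,7}(?:\.\d+)?)\s*$"
)


def parse_utm(text):
    """Return the parts of a UTM string as a dict, or None if it does not match."""
    match = UTM_PATTERN.match(text)
    if match is None:
        return None
    return {
        "zone": int(match.group("zone")),
        "band": match.group("band"),
        "easting": float(match.group("easting")),
        "northing": float(match.group("northing")),
    }


parsed = parse_utm("18M 166624 9840350")   # matches
rejected = parse_utm("166624 9840350")     # None: zone and band are missing
```

A pattern like this can only recognise one way of writing UTM coordinates. A value that fails to match may still be perfectly valid verbatim data, so it is better suited to flagging than to rejecting.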
On 9 Feb 2015, at 17:03, Markus Döring <m.doering@mac.com> wrote:
Hi Rod & John,
for the future documentation of dwc I’d be curious if we all think it is a good idea to overload verbatimLongitude/Latitude with not strictly lon/lat values like easting and northing. I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great to not have to parse them out from a string.
Any strong opinions?
On 09 Feb 2015, at 17:25, Roderic Page <Roderic.Page@glasgow.ac.uk> wrote:
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude were easting and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is), the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interpreted, leaving aside in what fields people stick that information.
Regards
Rod
Accepted where? That is a very slippery slope. I think data publishers must be allowed to publish their verbatim information for a variety of reasons. First among them is that relevant data might not pass someone else's idea of a regular expression, and yet still be valid. Second is the huge win data publishers get by being able to put their data out there, warts and all, and have the Rod Pages of the world (with great respect) and their respective data quality tools find out what is wrong with them and report it back to the source, where, with luck and resources, it can be improved in perpetuity.
Darwin Core as a standard deliberately does not impose constraints, and I think that is a fundamental strength, not a weakness. It lets us know what is wrong. Let the appropriate constraints come in appropriate contexts.
I don't believe it to be wrong to be happier having the option for less ambiguity. That is why it doesn't bother me to have four ways to capture the event date. Some people hate that. I'd rather be able to publish what I have and interpret it in all the other ways as well than to not be able to publish it.
Hi John,
OK, maybe “accepted” is a little strong, and I agree we don’t want to put obstacles in the way of publishing data. But it might help people if (a) they had guidance on what suitable data looks like, and (even better) (b) we had decent tools that could quickly check the data and flag issues (I know work is being done on this). Part of the problem is that publishing data is typically divorced from use, so errors often aren’t picked up until somebody actually tries to use the data. Decent testing tools may help tackle that.
Regards
Rod
On 9 Feb 2015, at 17:56, John Wieczorek <tuco@berkeley.edu> wrote:
Accepted where? That is a very slippery slope. I think data publishers must be allowed to publish their verbatim information for a variety of reasons. First among them is that relevant data might not pass someone else's idea of a regular expression, and yet still be valid. Second among them is the huge win data publishers get by being able to put their data out there, warts and all, and have the Rod Pages of the world (with great respect) and their respective data quality tools find out what is wrong with them and report it back to the source, where, with luck and resources, it can be improved in perpetuity.
Darwin Core as a standard deliberately does not impose constraints, and I think that is a fundamental strength, not a weakness. It lets us know what is wrong. Let the appropriate constraints come in appropriate contexts.
I don't believe it to be wrong to be happier having the option for less ambiguity. That is why it doesn't bother me to have four ways to capture the event date. Some people hate that. I'd rather be able to publish what I have and interpret it in all the other ways as well than to not be able to publish it.
On Mon, Feb 9, 2015 at 2:15 PM, Roderic Page <Roderic.Page@glasgow.ac.uk> wrote:
Is it wrong that I’d be happier if Darwin Core itself had terms for easting, northing, and zones so there was no ambiguity (or at least it would be minimised)? Putting stuff in text fields and hoping people can figure it out is just asking for trouble, as is overloading fields.
If they do go into verbatimCoordinates, I wonder if the spec/examples could add a regular expression that they would have to pass in order to be accepted.
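For concreteness, a regular expression of the kind suggested here might look like the Python sketch below. The pattern and the name parse_utm are assumptions for illustration, not anything defined by Darwin Core. The pattern accepts only one spelling (zone number, latitude band letter, six-digit easting, northing of up to seven digits), so many valid verbatim strings would fail it.

```python
import re

# Zone 1-60, latitude band C-X (I and O are not used), then easting and
# northing in whole metres, separated by whitespace.
UTM_PATTERN = re.compile(
    r"^\s*(?P<zone>0?[1-9]|[1-5][0-9]|60)\s*(?P<band>[C-HJ-NP-X])\s+"
    r"(?P<easting>\d{6})\s+(?P<northing>\d{1,7})\s*$"
)


def parse_utm(text):
    """Return the parts of a UTM string as a dict, or None if it does not match."""
    match = UTM_PATTERN.match(text)
    if match is None:
        return None
    return {
        "zone": int(match["zone"]),
        "band": match["band"],
        "easting": int(match["easting"]),
        "northing": int(match["northing"]),
    }


print(parse_utm("18M 166624 9840350"))
```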
Regards
Rod
On 9 Feb 2015, at 17:03, Markus Döring <m.doering@mac.com> wrote:
Hi Rod & John,
for the future documentation of dwc I’d be curious if we all think it is a good idea to overload verbatimLongitude/Latitude with not strictly lon/lat values like easting and northing. I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great to not have to parse them out from a string.
Any strong opinions?
On 09 Feb 2015, at 17:25, Roderic Page <Roderic.Page@glasgow.ac.uk> wrote:
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude were easting and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is) the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interpreted, leaving aside in what fields people stick that information.
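To show why the zone designator matters, here is a small Python sketch (the function name is an assumption for this example). With a latitude band letter such as the "M" in 18M, the hemisphere follows from the letter: bands C to M lie south of the equator and N to X north of it. Without the band or an explicit hemisphere, a northing alone cannot be placed on either side of the equator.

```python
SOUTHERN_BANDS = "CDEFGHJKLM"
NORTHERN_BANDS = "NPQRSTUVWX"


def hemisphere_from_band(band):
    """Return "S" or "N" for a UTM/MGRS latitude band letter."""
    letter = band.strip().upper()
    if len(letter) == 1 and letter in SOUTHERN_BANDS:
        return "S"
    if len(letter) == 1 and letter in NORTHERN_BANDS:
        return "N"
    raise ValueError(f"not a latitude band letter: {band!r}")


print(hemisphere_from_band("M"))
```

One caveat: some sources write "N" or "S" after the zone number to mean the hemisphere rather than a latitude band, so a string like "18S" is ambiguous unless the convention is stated.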
Regards
Rod
On 9 Feb 2015, at 16:12, John Wieczorek <tuco@berkeley.edu> wrote:
Hi Rod,
I would treat easting and northing as latitude and longitude, but not following strictly the definition in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS, where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings, to help with the parsing problem if the original source is known.
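One possible reading of this arrangement, sketched as a Python dictionary keyed by Darwin Core term names. The record is hypothetical: the datum in particular is a placeholder, since the thread does not say which datum the TTU values use, and the helper missing_context is an assumption for this example.

```python
# Hypothetical record following the arrangement described above.
record = {
    "verbatimCoordinates": "18M 166624 9840350",  # the full tuple
    "verbatimLongitude": "166624",                # easting
    "verbatimLatitude": "9840350",                # northing
    "verbatimCoordinateSystem": "UTM",
    "verbatimSRS": "WGS84",                       # assumption, for illustration only
}

CONTEXT_TERMS = ("verbatimCoordinateSystem", "verbatimSRS")


def missing_context(rec):
    """List the context terms that are absent or blank in a record."""
    return [term for term in CONTEXT_TERMS if not str(rec.get(term) or "").strip()]


print(missing_context(record))
print(missing_context({"verbatimLatitude": "9840350", "verbatimLongitude": "166624"}))
```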
In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows us to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet, but it is worth it.
Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwards. We appreciate the careful eye and the quality feedback reports to Github on TTU's behalf.
Cheers,
John
On Mon, Feb 9, 2015 at 8:50 AM, Roderic Page <Roderic.Page@glasgow.ac.uk> wrote:
Hi Dag,
Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed they have for the TTU dataset.
Regards
Rod
On 9 Feb 2015, at 11:41, Dag Endresen <dag.endresen@gmail.com> wrote:
Hi Rod,
I have assumed that the purpose of the dwc:verbatimCoordinates term is to allow for reporting coordinates originally recorded as one single tuple such as the MGRS (Military Grid Reference System).
Original source coordinates that do have two tuples, such as UTM, would instead use dwc:verbatimLongitude (for the Easting or X coordinate tuple) and dwc:verbatimLatitude (for the Northing or Y coordinate tuple).
Regards Dag
On 9 February 2015 at 11:57, Roderic Page <Roderic.Page@glasgow.ac.uk> wrote:
Hi John,
So, if I understand you correctly, you would have hoped that TTU would have output something like this:
“dwc:verbatimCoordinates” : “18M 166624 9840350”
rather than put the easting and northing into verbatimLatitude and verbatimLongitude.
Regards
Rod
On 8 Feb 2015, at 20:22, John Wieczorek <tuco@berkeley.edu> wrote:
Hi Rod,
The verbatimLatitude, verbatimLongitude, and verbatimCoordinates terms were all intended to capture the original coordinates used at the source, whereas decimalLatitude and decimalLongitude, with geodeticDatum, were meant to contain the easy-to-act-on global system (UTMs do not cover the entire planet, for example). The definition of verbatimCoordinates shows that this was the intent, but the definitions of verbatimLatitude and verbatimLongitude do not. When we get the examples separated from the term definitions, it should be easier to make this clear.
Cheers,
John
On Sat, Feb 7, 2015 at 2:36 PM, Dag Endresen <dag.endresen@gmail.com> wrote:
Hi Rod,
At least in Norway, it is very common for the GBIF node to receive (only) Easting and Northing of UTM zones 32V to 36W. For many datasets we will on routine automatically make the conversion to decimal degrees (and WGS84) at the node before these datasets are published to the GBIF portal. When people download occurrences from the Norwegian "GBIF portal", Artskart, my impression is that the UTM 32V (and the 33V) Easting and Northing coordinate format is actually more popular than the decimal degree format - this is because the geographic data layers for Norway more often are made available in the UTM format (most often 32V or 33V) [1]. And yes, this continued present day official use of such a wide variety of coordinate formats frustrates me too... The historic use reported with the verbatim terms, is of course difficult to do anything with...
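For readers unfamiliar with the conversion mentioned here, the following is a minimal pure-Python sketch of turning a UTM easting and northing into decimal degrees. It uses the standard series approximation for the inverse transverse Mercator projection and assumes the WGS84 ellipsoid. It is not the procedure used at the Norwegian node (the thread does not describe that), and the function name is an assumption. In practice an established projection library such as pyproj would normally be preferable.

```python
import math

# WGS84 ellipsoid and UTM constants (assumed datum for this sketch).
A = 6378137.0            # semi-major axis, metres
F = 1 / 298.257223563    # flattening
K0 = 0.9996              # scale factor on the central meridian
E2 = F * (2 - F)         # first eccentricity squared
EP2 = E2 / (1 - E2)      # second eccentricity squared


def utm_to_wgs84(zone, easting, northing, southern):
    """Return (latitude, longitude) in decimal degrees."""
    x = easting - 500000.0                                  # remove false easting
    y = northing - 10000000.0 if southern else northing     # remove false northing

    # Footpoint latitude from the meridian arc length.
    mu = (y / K0) / (A * (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256))
    e1 = (1 - math.sqrt(1 - E2)) / (1 + math.sqrt(1 - E2))
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    n1 = A / math.sqrt(1 - E2 * sin1**2)
    r1 = A * (1 - E2) / (1 - E2 * sin1**2) ** 1.5
    t1 = tan1**2
    c1 = EP2 * cos1**2
    d = x / (n1 * K0)

    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * EP2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * EP2 - 3 * c1**2) * d**6 / 720)
    lon = (d
           - (1 + 2 * t1 + c1) * d**3 / 6
           + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * EP2 + 24 * t1**2) * d**5 / 120) / cos1

    central_meridian = (zone - 1) * 6 - 180 + 3   # degrees east
    return math.degrees(lat), central_meridian + math.degrees(lon)


# The example from this thread: zone 18, band M (southern hemisphere).
print(utm_to_wgs84(18, 166624, 9840350, southern=True))
```

The conversion needs the zone and the hemisphere in addition to the two numbers, so a bare easting and northing cannot be interpreted on their own.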
I assume that Easting and Northing coordinates are both valid and very common values (and not only in Norway) for the Darwin Core verbatim coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude, or dwc:verbatimCoordinates), but of course they are only useful when the respective dwc:verbatimCoordinateSystem and dwc:verbatimSRS are also reported. (And dwc:decimalLatitude and dwc:decimalLongitude, correctly reported in WGS84, should preferably always be there as well.) I believe that Darwin Core is already fine with respect to terms for reporting geographic coordinates. If any additions are useful at all, I believe that identifying and recommending terms from more specialized geographic vocabularies and ontologies might be much more useful than adding any new dwc:Location terms to Darwin Core. In fact, most of the dwc:Location terms might perhaps preferably be replaced by terms from the geography community, such as perhaps [2] and [3] (as a start).
[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/ [2] http://www.w3.org/2003/01/geo/wgs84_pos# [3] http://www.geonames.org/ontology/documentation.html
Regards Dag
I agree 100% and that is why I am excited about data quality tools. I don't think we are doing enough to collaborate on the creation of those tools. Everyone and their dog is working on them, but there isn't enough global dialog and collaborative development, yet.
And on another front, yes, we would like to open up the DwC documentation to much broader demonstration of examples. We've made a start on that with the project to move all of DwC over to Github. Right now we are just beginning to think of what the best way would be to provide practical guidance on the full range of issues people face when trying to cram real data into Darwin Core - How-To guides at a variety of levels for a variety of purposes as anchors to attach discussions and eventually tools.
On Mon, Feb 9, 2015 at 3:25 PM, John Wieczorek tuco@berkeley.edu wrote:
I agree 100% and that is why I am excited about data quality tools. I don't think we are doing enough to collaborate on the creation of those tools. Everyone and their dog is working on them, but there isn't enough global dialog and collaborative development, yet.
On Mon, Feb 9, 2015 at 3:10 PM, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
OK, maybe “accepted” is a little strong, and I agree we don’t want to put obstacles in the way of publishing data. But, it might help people (a) if they had guidance on what suitable data looks like, and (even better) (b) we had decent tools that could quickly check the data and flag issues (I know work is being done on this). Part of the problem is that publishing data is typically divorced from use, so errors often aren’t picked up until somebody actually tries to use the data. Decent testing tools may help tackle that.
Regards
Rod
Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page@glasgow.ac.uk Tel: +44 141 330 4778 Skype: rdmpage Facebook: http://www.facebook.com/rdmpage LinkedIn: http://uk.linkedin.com/in/rdmpage Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com ORCID: http://orcid.org/0000-0002-7101-9767 Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ ResearchGate https://www.researchgate.net/profile/Roderic_Page
On 9 Feb 2015, at 17:56, John Wieczorek tuco@berkeley.edu wrote:
Accepted where? That is a very slippery slope. I think data publishers must be allowed to publish their verbatim information for a variety of reasons. First among them being that relevant data might not pass someone else's idea of a regular expression, and yet still be valid. Second among them is the huge win data publishers get by being able to put there data out their, warts and all and have the Rod Pages of the world (with great respect) and thier respective data quality tools find out what is wrong with them and report it back to the source, where, with luck and resources, it can be improved in perpetuity.
Darwin Core as a standard does not impose constraints on purpose, and I think that is a fundamental strength, not a weakness. It lets us know what is wrong. Let the appropriate constraints come in appropriate contexts.
I don't believe it to be wrong to be happier having the option for less ambiguity. That is why it doesn't bother me to have four ways to capture the event date. Some people hate that. I'd rather be able to publish what I have and interpret it in all the other ways as well than to not be able to publish it.
On Mon, Feb 9, 2015 at 2:15 PM, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Is it wrong that I’d be happier if Darwin Core itself had terms for easting, northing, and zones so there was no ambiguity (or at least it would be minimised)? Putting stuff in text fields and hoping people can figure it out is just asking for trouble, as is overloading fields.
If they do go into verbatimCoordinates, I wonder if the spec/examples could add a regular expression that they would have to pass in order to be accepted.
Regards
Rod
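A check of the kind suggested above could be sketched roughly as follows. This is only an illustration in Python, assuming that UTM values in verbatimCoordinates follow the "zone and band, easting, northing" layout of the example quoted in this thread (18M 166624 9840350). The pattern and the function name parse_utm are assumptions made for the sketch; nothing like this is defined by Darwin Core.

```python
import re

# Hypothetical pattern, not part of Darwin Core:
# zone 1-60, latitude band letter C-X (I and O are not used),
# then easting and northing in whole metres separated by whitespace.
UTM_PATTERN = re.compile(
    r"(?P<zone>60|[1-5][0-9]|0?[1-9])"
    r"(?P<band>[C-HJ-NP-X])\s+"
    r"(?P<easting>\d{6})\s+"
    r"(?P<northing>\d{1,7})"
)


def parse_utm(text):
    """Return the parts of a UTM string as a dict, or None if it does not match."""
    match = UTM_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    return {
        "zone": int(match.group("zone")),
        "band": match.group("band"),
        "easting": int(match.group("easting")),
        "northing": int(match.group("northing")),
    }


print(parse_utm("18M 166624 9840350"))
print(parse_utm("NZ3767"))  # a British grid reference, so no match
```

A pattern this strict would reject many valid verbatim values written in other layouts, which is the objection raised in the reply above, so it is probably better suited to flagging records for review than to refusing them.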
On 9 Feb 2015, at 17:03, Markus Döring m.doering@mac.com wrote:
Hi Rod & John,
for the future documentation of dwc I’d be curious if we all think it is a good idea to overload verbatimLongitude/Latitude with not strictly lon/lat values like easting and northing. I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great to not have to parse them out from a string.
Any strong opinions?
On 09 Feb 2015, at 17:25, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude were easting and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is) the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interpreted, leaving aside in what fields people stick that information.
Regards
Rod
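The gap described here (numbers in verbatimLatitude/verbatimLongitude that cannot be degrees, with nothing to say what they are) is the sort of thing a simple test could flag. The following is a minimal sketch in Python, assuming a record is a plain dict keyed by Darwin Core term names. The function name and the wording of the messages are hypothetical, and it only looks at plain numeric values.

```python
def flag_verbatim_coordinates(record):
    """Return a list of issues for numeric verbatim values that cannot be degrees."""
    issues = []
    system = (record.get("verbatimCoordinateSystem") or "").strip()
    limits = (("verbatimLatitude", 90.0), ("verbatimLongitude", 180.0))
    for term, limit in limits:
        raw = (record.get(term) or "").strip()
        if not raw:
            continue
        try:
            value = float(raw)
        except ValueError:
            # Not a plain number (e.g. degrees-minutes-seconds); out of scope here.
            continue
        if abs(value) > limit and not system:
            issues.append(
                f"{term}={raw} is outside +/-{limit:g} degrees "
                "and no verbatimCoordinateSystem is given"
            )
    return issues


# Values shaped like the TTU example: northing and easting with no context.
ttu_like = {"verbatimLatitude": "9840350", "verbatimLongitude": "166624"}
for issue in flag_verbatim_coordinates(ttu_like):
    print(issue)
```

This does not decide which field the values belong in. It only reports that the record, as published, lacks what a downstream user needs to interpret them.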
On 9 Feb 2015, at 16:12, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
I would treat easting and northing as latitude and longitude, but not following strictly the definition in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings, to help with the parsing problem if the original source is known.
In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows us to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet, but it is worth it.
Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwards. We appreciate the careful eye and the quality feedback reports to GitHub on TTU's behalf.
Cheers,
John
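The arrangement described in this message (the full tuple in verbatimCoordinates, the separate values in verbatimLongitude and verbatimLatitude, "UTM" in verbatimCoordinateSystem, and a datum in verbatimSRS) might look like the sketch below. It is written in Python with a hypothetical helper name. The verbatimSRS value is passed in by the caller because the datum of the TTU record is not stated anywhere in this thread, so "unknown" is used as a placeholder.

```python
def utm_verbatim_terms(zone_band, easting, northing, srs):
    """Build the verbatim Darwin Core terms for one UTM coordinate."""
    return {
        # The full tuple exactly as the source would write it.
        "verbatimCoordinates": f"{zone_band} {easting} {northing}",
        # Easting and northing kept separate, as suggested in this message.
        "verbatimLongitude": str(easting),
        "verbatimLatitude": str(northing),
        "verbatimCoordinateSystem": "UTM",
        "verbatimSRS": srs,
    }


# The datum is a placeholder; the thread does not say which one TTU used.
record = utm_verbatim_terms("18M", 166624, 9840350, srs="unknown")
for term, value in record.items():
    print(f"{term}: {value}")
```

With all five terms present, a consumer can tell that the separate values are an easting and a northing, and still has the zone available from verbatimCoordinates.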
On Mon, Feb 9, 2015 at 8:50 AM, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi Dag,
Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed they have for the TTU dataset.
Regards
Rod
On 9 Feb 2015, at 11:41, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
I have assumed that the purpose of the dwc:verbatimCoordinates term is to allow for reporting coordinates originally recorded as one single tuple such as the MGRS (Military Grid Reference System).
Original source coordinates that do have two tuples, such as UTM, would instead use dwc:verbatimLongitude (for the Easting or X coordinate tuple) and dwc:verbatimLatitude (for the Northing or Y coordinate tuple).
Regards Dag
On 9 February 2015 at 11:57, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
So, if I understand you correctly, you would have hoped that TTU would have output something like this:
“dwc:verbatimCoordinates” : “18M 166624 9840350”
rather than put the easting and northing into verbatimLatitude and verbatimLongitude.
Regards
Rod
On 8 Feb 2015, at 20:22, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
The verbatimLatitude, verbatimLongitude, and verbatimCoordinates were all intended to be able to capture the original coordinates used at the source, where decimalLatitude and decimalLongitude, with geodeticDatum, were meant to contain the easy-to-act-on global system (UTMs do not cover the entire planet, for example). The verbatimCoordinates term's definition shows that this was the intent, but those of verbatimLatitude and verbatimLongitude do not. When we get the examples separated from the term definitions, it should be easier to make this clear.
Cheers,
John
On Sat, Feb 7, 2015 at 2:36 PM, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
At least in Norway, it is very common for the GBIF node to receive (only) Easting and Northing of UTM zones 32V to 36W. For many datasets we will on routine automatically make the conversion to decimal degrees (and WGS84) at the node before these datasets are published to the GBIF portal. When people download occurrences from the Norwegian "GBIF portal", Artskart, my impression is that the UTM 32V (and the 33V) Easting and Northing coordinate format is actually more popular than the decimal degree format - this is because the geographic data layers for Norway more often are made available in the UTM format (most often 32V or 33V) [1]. And yes, this continued present day official use of such a wide variety of coordinate formats frustrates me too... The historic use reported with the verbatim terms, is of course difficult to do anything with...
I assume that Easting and Northing coordinates are both valid and very common values (and not only in Norway) for the Darwin Core verbatim coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude, or dwc:verbatimCoordinates), but of course they are only useful when the respective dwc:verbatimCoordinateSystem and dwc:verbatimSRS are also reported. (Preferably, dwc:decimalLatitude and dwc:decimalLongitude, correctly reported in WGS84, should always be there as well.) I believe that Darwin Core is already fine with respect to terms for reporting geographic coordinates. If any additions are useful at all, I believe that identifying and recommending terms from more specialized geographic vocabularies and ontologies might be much more useful than adding any new dwc:Location terms to Darwin Core. In fact, most of the dwc:Location terms might perhaps preferably be replaced by terms from the geography community, such as perhaps [2] and [3] (as a start).
[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/ [2] http://www.w3.org/2003/01/geo/wgs84_pos# [3] http://www.geonames.org/ontology/documentation.html
Regards Dag
On 7 February 2015 at 13:02, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Pardon my ignorance, but has there ever been a discussion of easting and northing values in regards to Darwin Core? AFAIK the current standard doesn’t mention them. The reason I’m asking is that I’ve just come across some VerbatimLatitude and VerbatimLongitude values in a dataset that is aggregated by VertNet (and hence GBIF) where (after some head scratching) I realised that the verbatim values were actually Easting and Northing (which I didn’t know existed until yesterday). Details are here: https://github.com/ttu-vertnet/ttu-mammals/issues/11
I’m guessing this isn’t a terribly common way to record location information, but it looks like in this case the lack of support for this type of data has resulted in somebody trying to shoehorn them into VerbatimLatitude and VerbatimLongitude, resulting in values which are uninterpretable to aggregators further up the chain.
Regards
Rod
-- Dag Endresen, Ph.D. Private email: dag.endresen@gmail.com Work email: dag.endresen@nhm.uio.no Mobile: +47 4061 2982 _______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Here is an example of DwC where I think I'm correct, but GBIF considered my verbatim data "suspicious". Plant recording in Britain is conducted in grid squares of the Ordnance Survey. Those coordinates have a very specific notation, so that my verbatimCoordinates are something like NZ3767 and my verbatimCoordinateSystem is OSGB36 (e.g. http://www.gbif.org/occurrence/1020049283/verbatim). To me this is correct usage of the verbatim fields. I do complete the decimal longitude and latitude fields, based upon the centre of the square, but as these are based on the point-radius view of observations they are largely useless for real analysis. To make my records internationally acceptable I also complete the footprintWKT and footprintSRS, but although this is probably the best description of the locality I'm fairly sure that nobody will actually use it.
Quentin
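For readers unfamiliar with the notation, a grid reference such as NZ3767 names a square rather than a point, and the square can be recovered with a little arithmetic. The sketch below, in Python, uses the standard lettering of the Ordnance Survey National Grid to return the south-west corner and side length of the square in metres, and then writes it out as a WKT polygon. The function names are hypothetical, and the polygon is expressed in grid eastings and northings (British National Grid, EPSG:27700), which is an assumption about how a footprint might be stored, not a description of how the records above were prepared.

```python
def os_grid_square(grid_ref):
    """Return (easting, northing, size) in metres for an OS grid reference."""
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not all("A" <= c <= "Z" and c != "I" for c in letters):
        raise ValueError(f"bad grid letters in {grid_ref!r}")
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError(f"bad grid digits in {grid_ref!r}")

    def index(letter):
        # Letters run A-Z without I, laid out in 5 x 5 blocks.
        i = ord(letter) - ord("A")
        return i - 1 if i > 7 else i

    l1, l2 = index(letters[0]), index(letters[1])
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = 19 - (l1 // 5) * 5 - (l2 // 5)

    half = len(digits) // 2
    size = 10 ** (5 - half)  # 4 digits -> 1 km square, 6 digits -> 100 m square
    easting = e100km * 100_000 + (int(digits[:half]) * size if half else 0)
    northing = n100km * 100_000 + (int(digits[half:]) * size if half else 0)
    return easting, northing, size


def square_wkt(easting, northing, size):
    """WKT polygon for the square, in grid coordinates (not WGS84)."""
    e2, n2 = easting + size, northing + size
    return (
        f"POLYGON (({easting} {northing}, {e2} {northing}, "
        f"{e2} {n2}, {easting} {n2}, {easting} {northing}))"
    )


corner = os_grid_square("NZ3767")
print(corner)
print(square_wkt(*corner))
```

Producing decimalLatitude and decimalLongitude from these values would still need a proper transformation from OSGB36 to WGS84, which is not attempted here.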
That looks great to me. What kind of a 'suspicious' notice do you get?
On Tue, Feb 10, 2015 at 6:59 AM, Quentin Groom quentin.groom@br.fgov.be wrote:
Here is an example of DwC where I think I'm correct, but GBIF considered my verbatim data "suspicious". Plant recording in Britain is conducted in grid squares of the Ordanace Survey. Those coordinates have a very specific notation so that my verbatimCoordinatesare some thing like this NZ3767 and my verbatimCoordinateSystem is OSGB36 (e.g. http://www.gbif.org/ occurrence/1020049283/verbatim). To me this is correct usage of the verbatim fields. I do complete the decimal longitude and latitude fields, based upon the centre of the square, but as these are based on the point-radius view of observations they are largely useless for real analysis. To make my records internationally acceptable I also complete the footprintWKT and footprintSRS, but although this is probably the best description of the locality I'm fairly sure that nobody will actually use it. Quentin
Roderic Page wrote:
Is it wrong that I’d be happier if Darwin Core itself had terms for easting, northing, and zones so there was no ambiguity (or at least it would be minimised). Putting stuff in text fields and hoping people can figure it out is just asking for trouble, as is overloading fields.
If they do go into verbatimCoordinates, I wonder if the spec/examples could add a regular expression to they would have to pass in order to be accepted.
Regards
Rod
Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page@glasgow.ac.uk mailto:Roderic.Page@glasgow.ac.uk Tel: +44 141 330 4778 Skype: rdmpage Facebook: http://www.facebook.com/rdmpage LinkedIn: http://uk.linkedin.com/in/rdmpage Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com ORCID: http://orcid.org/0000-0002-7101-9767 Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ ResearchGate https://www.researchgate.net/profile/Roderic_Page
On 9 Feb 2015, at 17:03, Markus Döring <m.doering@mac.com <mailto:
m.doering@mac.com>> wrote:
Hi Rod & John,
for the future documentation of dwc I’d be curious if we all think it is a good idea to overload verbatimLongitude/Latitude with not strictly lon/lat values like easting and northing. I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great to not have to parse them out from a string. Any strong opinions?
On 09 Feb 2015, at 17:25, Roderic Page <Roderic.Page@glasgow.ac.uk mailto:Roderic.Page@glasgow.ac.uk> wrote:
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude were easting and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is) the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interepreted, leaving aside in what fields people stick that information.
Regards
Rod
Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page@glasgow.ac.uk mailto:Roderic.Page@glasgow.ac.uk Tel: +44 141 330 4778 Skype: rdmpage Facebook: http://www.facebook.com/rdmpage LinkedIn: http://uk.linkedin.com/in/rdmpage Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com ORCID: http://orcid.org/0000-0002-7101-9767 Citations: http://scholar.google.co.uk/citations?hl=en&user= 4Z5WABAAAAAJ http://scholar.google.co.uk/citations?hl=en&user= 4Z5WABAAAAAJ ResearchGate https://www.researchgate.net/profile/Roderic_Page
On 9 Feb 2015, at 16:12, John Wieczorek <tuco@berkeley.edu <mailto:
tuco@berkeley.edu>> wrote:
Hi Rod,
I would treat easting and northing as latitude and longitude, but not following strictly the definition in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings, to help with tha parsing problem if the original source is known.
In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows use to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet. but it is worth it.
Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwords. We appreciate the careful eye and the quality feedback reports to Github on TTU's behalf.
Cheers,
John
-- Dr. Quentin Groom (Botany and Information Technology)
Botanic Garden, Meise Domein van Bouchout B-1860 Meise Belgium
ORCID: 0000-0002-0596-5376
Landline: +32 (0) 226 009 20 ext. 364 FAX: +32 (0) 226 009 45
E-mail: quentin.groom@br.fgov.be Skype name: qgroom Website: www.botanicgarden.be
Hi,
this "suspicious" flag in GBIF is raised for occurrences when our reprojection to WGS84 causes a suspiciously large coordinate shift: "Indicates successful coordinate reprojection according to provided datum, but which results in a datum shift larger than 0.1 decimal degrees."
In those cases we keep the original coordinates. We also ignore the verbatim values, as there are valid lat/lon values in the record. We only try to interpret the verbatim ones if the regular lat/lon values do not exist.
This specific case is odd, as the datum given, EPSG:3857, is the WGS84 Web Mercator projection, which I believe should not cause any change? http://spatialreference.org/ref/sr-org/7483/
This might boil down to a question I had about the permitted values for dwc:geodeticDatum. The name suggests strictly a datum, but the definition and the actual values found in it mix full reference systems, datums and any other identifiable spatial systems. The current definition is: "The ellipsoid, geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude as based"
Most of the codes (often EPSG) we find in GBIF occurrences represent full reference systems with a geoid, meridian, unit and projection, not just a datum. In that case we interpret the given coordinates with the entire system, not just the datum. If it is just a datum, we interpret the coordinates with a default geodetic ellipsoid using the Greenwich meridian (DefaultEllipsoidalCS.GEODETIC_2D in geotools).
best, Markus
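The handling described in the message above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not GBIF's actual code: the flag label, the function names and the way the shift is measured (the larger of the two per-axis differences) are all assumptions made for the example.

```python
from typing import NamedTuple, Optional, Tuple

MAX_SHIFT_DEGREES = 0.1  # threshold quoted in the flag description
SUSPICIOUS = "REPROJECTION_SUSPICIOUS"  # hypothetical label, not GBIF's identifier


class Interpreted(NamedTuple):
    latitude: float
    longitude: float
    flags: Tuple[str, ...]


def choose_coordinates(original: Tuple[float, float],
                       reprojected: Tuple[float, float]) -> Interpreted:
    """Keep the original (lat, lon) when reprojection moves the point too far."""
    lat0, lon0 = original
    lat1, lon1 = reprojected
    # Assumption: the shift is the larger of the two per-axis differences.
    shift = max(abs(lat1 - lat0), abs(lon1 - lon0))
    if shift > MAX_SHIFT_DEGREES:
        return Interpreted(lat0, lon0, (SUSPICIOUS,))
    return Interpreted(lat1, lon1, ())


def coordinate_source(record: dict) -> Optional[str]:
    """Say which fields would be interpreted: decimal first, verbatim as fallback."""
    has_decimal = all(
        record.get(term) not in (None, "")
        for term in ("decimalLatitude", "decimalLongitude")
    )
    if has_decimal:
        return "decimal"
    verbatim_terms = ("verbatimCoordinates", "verbatimLatitude", "verbatimLongitude")
    if any(record.get(term) for term in verbatim_terms):
        return "verbatim"
    return None
```

With these rules a record that carries both decimal and verbatim values never has its verbatim values interpreted, which matches the behaviour described for the record under discussion.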
On 10 Feb 2015, at 15:14, John Wieczorek tuco@berkeley.edu wrote:
That looks great to me. What kind of a 'suspicious' notice do you get?
On Tue, Feb 10, 2015 at 6:59 AM, Quentin Groom quentin.groom@br.fgov.be wrote:
Here is an example of DwC where I think I'm correct, but GBIF considered my verbatim data "suspicious". Plant recording in Britain is conducted in grid squares of the Ordnance Survey. Those coordinates have a very specific notation, so my verbatimCoordinates are something like NZ3767 and my verbatimCoordinateSystem is OSGB36 (e.g. http://www.gbif.org/occurrence/1020049283/verbatim). To me this is correct usage of the verbatim fields. I do complete the decimal longitude and latitude fields, based upon the centre of the square, but as these are based on the point-radius view of observations they are largely useless for real analysis. To make my records internationally acceptable I also complete the footprintWKT and footprintSRS, but although this is probably the best description of the locality I'm fairly sure that nobody will actually use it.
Quentin
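For readers unfamiliar with the notation, here is a rough Python sketch of how a grid reference such as NZ3767 maps to a square on the British National Grid. The function names are made up for this example, and the output stays in grid metres (OSGB36 National Grid eastings and northings); turning the square or its centre into WGS84 decimal degrees needs a proper datum transformation, which this sketch does not attempt.

```python
import math
import re

# Two grid letters (the letter I is not used) plus an even number of digits.
GRID_REF = re.compile(r"^([A-HJ-Z]{2})((?:[0-9]{2}){0,5})$")


def parse_os_grid_ref(ref):
    """Return (easting, northing, size) in metres for the square's south-west corner."""
    match = GRID_REF.match(ref.replace(" ", "").upper())
    if match is None:
        raise ValueError("not an OS grid reference: %r" % ref)
    letters, digits = match.groups()
    # Index each letter on a 5 x 5 grid that skips 'I'.
    l1, l2 = (ord(c) - ord("A") for c in letters)
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)
    if not (0 <= e100km <= 6 and 0 <= n100km <= 12):
        raise ValueError("letters outside the National Grid: %r" % letters)
    half = len(digits) // 2
    size = 10 ** (5 - half)  # 2+2 digits give a 1 km square
    easting = e100km * 100000 + (int(digits[:half]) * size if half else 0)
    northing = n100km * 100000 + (int(digits[half:]) * size if half else 0)
    return easting, northing, size


def footprint_wkt(easting, northing, size):
    """WKT polygon for the square, in the same grid metres as the input."""
    e0, n0, e1, n1 = easting, northing, easting + size, northing + size
    return "POLYGON ((%d %d, %d %d, %d %d, %d %d, %d %d))" % (
        e0, n0, e1, n0, e1, n1, e0, n1, e0, n0)


def centre_and_radius(easting, northing, size):
    """Centre of the square and the half-diagonal that encloses it."""
    return (easting + size / 2, northing + size / 2), size * math.sqrt(2) / 2
```

For NZ3767 this gives a 1 km square with its south-west corner at easting 437000, northing 567000. The point-radius view keeps only the centre and a radius of about 707 m, which is a circle larger than the square itself; that is one way to see why the footprint is the better description of the locality.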
Roderic Page wrote:
Is it wrong that I’d be happier if Darwin Core itself had terms for easting, northing, and zones so there was no ambiguity (or at least it would be minimised)? Putting stuff in text fields and hoping people can figure it out is just asking for trouble, as is overloading fields.
If they do go into verbatimCoordinates, I wonder if the spec/examples could add a regular expression that they would have to pass in order to be accepted.
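To make the regular-expression idea concrete, here is one possible pattern for UTM strings of the form used in this thread (18M 166624 9840350), as a Python sketch. Nothing here comes from the Darwin Core documentation: the pattern, the accepted digit counts and the function name are assumptions chosen for illustration.

```python
import re
from typing import Optional

# Zone 1-60, a latitude band letter (C-X without I and O), then
# a six-digit easting and a northing of up to seven digits, in metres.
UTM_RE = re.compile(
    r"^\s*(?P<zone>0?[1-9]|[1-5][0-9]|60)\s*(?P<band>[C-HJ-NP-X])\s+"
    r"(?P<easting>[0-9]{6})\s+(?P<northing>[0-9]{1,7})\s*$"
)


def parse_utm(text: str) -> Optional[dict]:
    """Return the parsed parts of a UTM string, or None if it does not match."""
    match = UTM_RE.match(text.upper())
    if match is None:
        return None
    band = match.group("band")
    return {
        "zone": int(match.group("zone")),
        "band": band,
        # Bands N to X lie north of the equator, C to M south of it.
        "hemisphere": "N" if band >= "N" else "S",
        "easting": int(match.group("easting")),
        "northing": int(match.group("northing")),
    }
```

One caveat: some sources write the zone followed by N or S for the hemisphere rather than a latitude band, so a letter such as S is ambiguous (band S is in the northern hemisphere). A pattern can check the shape of the string but cannot settle that on its own, which is why verbatimCoordinateSystem still matters.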
Regards
Rod
I could live with that recommendation given that the grid systems are not strictly latitudes and longitudes.
On Mon, Feb 9, 2015 at 2:03 PM, Markus Döring m.doering@mac.com wrote:
Hi Rod & John,
for the future documentation of dwc I’d be curious if we all think it is a good idea to overload verbatimLongitude/Latitude with not strictly lon/lat values like easting and northing. I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great to not have to parse them out from a string.
Any strong opinions?
On 09 Feb 2015, at 17:25, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude values were eastings and northings (other records have lats and longs in those fields), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.
So, even if parsing is a mess (and I agree it pretty much always is), the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator were complete enough to be interpreted, leaving aside which fields people stick that information in.
Regards
Rod
On 9 Feb 2015, at 16:12, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
I would treat easting and northing as latitude and longitude, even though that does not strictly follow the definition in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings, to help with the parsing problem if the original source is known.
In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows us to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet, but it is worth it.
Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwards. We appreciate the careful eye and the quality feedback reports to GitHub on TTU's behalf.
Cheers,
John
On Mon, Feb 9, 2015 at 8:50 AM, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi Dag,
Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed it has for the TTU dataset.
Regards
Rod
On 9 Feb 2015, at 11:41, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
I have assumed that the purpose of the dwc:verbatimCoordinates term is to allow for reporting coordinates originally recorded as one single tuple, such as MGRS (Military Grid Reference System), while original source coordinates that come as two separate values, such as UTM, would use dwc:verbatimLongitude (for the Easting or X coordinate) and dwc:verbatimLatitude (for the Northing or Y coordinate).
Regards Dag
On 9 February 2015 at 11:57, Roderic Page Roderic.Page@glasgow.ac.uk wrote:
Hi John,
So, if I understand you correctly, you would have hoped that TTU would have output something like this:
“dwc:verbatimCoordinates” : “18M 166624 9840350”
rather than put the easting and northing into verbatimLatitude and verbatimLongitude.
Regards
Rod
On 8 Feb 2015, at 20:22, John Wieczorek tuco@berkeley.edu wrote:
Hi Rod,
The verbatimLatitude, verbatimLongitude, and verbatimCoordinates terms were all intended to capture the original coordinates used at the source, whereas decimalLatitude and decimalLongitude, with geodeticDatum, were meant to contain the easy-to-act-on global system (UTMs do not cover the entire planet, for example). The verbatimCoordinates term's definition shows that this was the intent, but those of verbatimLatitude and verbatimLongitude do not. When we get the examples separated from the term definitions, it should be easier to make this clear.
Cheers,
John
On Sat, Feb 7, 2015 at 2:36 PM, Dag Endresen dag.endresen@gmail.com wrote:
Hi Rod,
At least in Norway, it is very common for the GBIF node to receive (only) Easting and Northing for UTM zones 32V to 36W. For many datasets we routinely and automatically convert to decimal degrees (and WGS84) at the node before these datasets are published to the GBIF portal. When people download occurrences from the Norwegian "GBIF portal", Artskart, my impression is that the UTM 32V (and 33V) Easting and Northing coordinate format is actually more popular than the decimal degree format. This is because the geographic data layers for Norway are more often made available in the UTM format (most often 32V or 33V) [1]. And yes, this continued present-day official use of such a wide variety of coordinate formats frustrates me too... The historic use reported with the verbatim terms is, of course, difficult to do anything with...
I assume that Easting and Northing coordinates are both valid and very common values (and not only in Norway) for the Darwin Core verbatim coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude, or dwc:verbatimCoordinates), but of course they are only useful when the respective dwc:verbatimCoordinateSystem and dwc:verbatimSRS are also reported. (And dwc:decimalLatitude and dwc:decimalLongitude, correctly reported in WGS84, should preferably always be there too.) I believe that Darwin Core is already fine with respect to terms to report geographic coordinates. If any additions are useful at all, I believe that identifying and recommending terms from more specialized geographic vocabularies and ontologies might be much more useful than adding any new dwc:Location terms to Darwin Core. In fact, most of the dwc:Location terms might perhaps preferably be replaced by terms from the geography community... such as perhaps [2] and [3] (as a start).
[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/
[2] http://www.w3.org/2003/01/geo/wgs84_pos#
[3] http://www.geonames.org/ontology/documentation.html
Regards Dag
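For readers unfamiliar with the conversion from UTM Easting and Northing to decimal degrees mentioned above, here is a minimal sketch using the standard series-expansion formulas for the inverse transverse Mercator projection on the WGS84 ellipsoid. It is not the code used at the Norwegian node; the function name `utm_to_wgs84` and the sample coordinates are assumptions for illustration, and in practice a projection library would normally be used instead.

```python
import math

# WGS84 ellipsoid and UTM constants
A = 6378137.0                  # semi-major axis, metres
F = 1 / 298.257223563          # flattening
E2 = F * (2 - F)               # first eccentricity squared
K0 = 0.9996                    # scale factor on the central meridian


def utm_to_wgs84(easting, northing, zone, northern=True):
    """Convert UTM easting/northing (metres) to (latitude, longitude) in degrees."""
    x = easting - 500000.0                                 # remove false easting
    y = northing if northern else northing - 10000000.0   # remove false northing

    # Footpoint latitude from the meridian arc length
    m = y / K0
    mu = m / (A * (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256))
    e1 = (1 - math.sqrt(1 - E2)) / (1 + math.sqrt(1 - E2))
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    ep2 = E2 / (1 - E2)                    # second eccentricity squared
    c1 = ep2 * math.cos(phi1) ** 2
    t1 = math.tan(phi1) ** 2
    s = 1 - E2 * math.sin(phi1) ** 2
    n1 = A / math.sqrt(s)                  # prime vertical radius of curvature
    r1 = A * (1 - E2) / s ** 1.5           # meridional radius of curvature
    d = x / (n1 * K0)

    lat = phi1 - (n1 * math.tan(phi1) / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * ep2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * ep2 - 3 * c1**2)
        * d**6 / 720)
    dlon = (d
            - (1 + 2 * t1 + c1) * d**3 / 6
            + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * ep2 + 24 * t1**2)
            * d**5 / 120) / math.cos(phi1)

    central_meridian = (zone - 1) * 6 - 180 + 3   # degrees
    return math.degrees(lat), central_meridian + math.degrees(dlon)


# Invented sample point in UTM zone 32, northern hemisphere
lat, lon = utm_to_wgs84(597000.0, 6643000.0, zone=32)
```

The zone number and hemisphere have to come from somewhere, which is why the verbatim values are only interpretable when dwc:verbatimCoordinateSystem and dwc:verbatimSRS accompany them.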
-- Dag Endresen, Ph.D. Private email: dag.endresen@gmail.com Work email: dag.endresen@nhm.uio.no Mobile: +47 4061 2982
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Hi Dag,
I should have guessed that as soon as I’d typed “isn’t a terribly common way”, this would more likely expose my ignorance than reflect reality!
Regards
Rod
participants (6): Dag Endresen, John Wieczorek, Markus Döring, Markus Döring, Quentin Groom, Roderic Page