[tdwg-content] Easting and northing

Roderic Page Roderic.Page at glasgow.ac.uk
Mon Feb 9 19:10:41 CET 2015


Hi John,

OK, maybe “accepted” is a little strong, and I agree we don’t want to put obstacles in the way of publishing data. But, it might help people (a) if they had guidance on what suitable data looks like, and (even better) (b) we had decent tools that could quickly check the data and flag issues (I know work is being done on this). Part of the problem is that publishing data is typically divorced from use, so errors often aren’t picked up until somebody actually tries to use the data. Decent testing tools may help tackle that.

Regards

Rod

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email:  Roderic.Page at glasgow.ac.uk
Tel:  +44 141 330 4778
Skype:  rdmpage
Facebook:  http://www.facebook.com/rdmpage
LinkedIn:  http://uk.linkedin.com/in/rdmpage
Twitter:  http://twitter.com/rdmpage
Blog:  http://iphylo.blogspot.com
ORCID:  http://orcid.org/0000-0002-7101-9767
Citations:  http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
ResearchGate https://www.researchgate.net/profile/Roderic_Page


On 9 Feb 2015, at 17:56, John Wieczorek <tuco at berkeley.edu> wrote:

Accepted where?  That is a very slippery slope. I think data publishers must be allowed to publish their verbatim information for a variety of reasons. First among them being that relevant data might not pass someone else's idea of a regular expression, and yet still be valid. Second among them is the huge win data publishers get by being able to put their data out there, warts and all, and have the Rod Pages of the world (with great respect) and their respective data quality tools find out what is wrong with them and report it back to the source, where, with luck and resources, it can be improved in perpetuity.

Darwin Core as a standard does not impose constraints on purpose, and I think that is a fundamental strength, not a weakness. It lets us know what is wrong. Let the appropriate constraints come in appropriate contexts.

I don't believe it to be wrong to be happier having the option for less ambiguity. That is why it doesn't bother me to have four ways to capture the event date. Some people hate that. I'd rather be able to publish what I have and interpret it in all the other ways as well than to not be able to publish it.



On Mon, Feb 9, 2015 at 2:15 PM, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:
Is it wrong that I’d be happier if Darwin Core itself had terms for easting, northing, and zones so there was no ambiguity (or at least it would be minimised)? Putting stuff in text fields and hoping people can figure it out is just asking for trouble, as is overloading fields.

If they do go into verbatimCoordinates, I wonder if the spec/examples could add a regular expression that they would have to pass in order to be accepted.
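
To make the idea concrete, here is a minimal sketch of what such a regular expression might look like for a string like "18M 166624 9840350". The pattern, the name UTM_PATTERN, and the function parse_utm are illustrative assumptions, not anything defined by Darwin Core. It assumes a zone number, an MGRS-style latitude band letter, and whole-metre easting and northing separated by spaces, so it would reject other layouts that may be perfectly valid.

```python
import re

# Hypothetical pattern, not part of Darwin Core: zone 1-60, a latitude band
# letter (C-X, skipping I and O), then easting and northing in whole metres.
UTM_PATTERN = re.compile(
    r"^\s*(?P<zone>0?[1-9]|[1-5][0-9]|60)\s*(?P<band>[C-HJ-NP-X])\s+"
    r"(?P<easting>\d{6})\s+(?P<northing>\d{1,8})\s*$"
)


def parse_utm(text):
    """Return (zone, band, easting, northing), or None if text does not match."""
    m = UTM_PATTERN.match(text)
    if m is None:
        return None
    return (int(m["zone"]), m["band"], int(m["easting"]), int(m["northing"]))


if __name__ == "__main__":
    print(parse_utm("18M 166624 9840350"))
    print(parse_utm("166624"))  # easting alone carries no zone, so no match
```

A check like this is probably better used to flag values for review than to refuse them outright.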

Regards

Rod



On 9 Feb 2015, at 17:03, Markus Döring <m.doering at mac.com> wrote:

Hi Rod & John,

for the future documentation of dwc I’d be curious if we all think it is a good idea to overload verbatimLongitude/Latitude with not strictly lon/lat values like easting and northing.
I *think* I would prefer a definition, comment and examples where UTM values go into the verbatimCoordinates field only, even though it would be great to not have to parse them out from a string.

Any strong opinions?


On 09 Feb 2015, at 17:25, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:

Hi John,

I guess my concern is that the TTU dataset in VertNet/GBIF didn’t explicitly say that the verbatimLatitude/Longitude were eastings and northings (other values it has in those fields for other records are lats and longs), nor did it include the zone information, or the fact that the values were for south of the equator (TTU itself doesn’t say that either). I had to go back to the TTU web site to discover that.

So, even if parsing is a mess (and I agree it pretty much always is) the TTU output didn’t have everything needed to figure out how to parse the data correctly. It would be nice if the data that gets to the aggregator is complete enough to be interpreted, leaving aside in what fields people stick that information.
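
As a rough illustration of the kind of completeness check an aggregator could run, the sketch below flags verbatim values that cannot be degrees and notes when nothing says what coordinate system they are in. The function name flag_coordinate_issues and the dict-based record layout are assumptions made for the example. Only the Darwin Core term names come from the discussion.

```python
def flag_coordinate_issues(record):
    """Return a list of human-readable issues for one record (a dict of terms)."""
    issues = []
    for term, limit in (("verbatimLatitude", 90.0), ("verbatimLongitude", 180.0)):
        raw = record.get(term)
        if raw is None or str(raw).strip() == "":
            continue
        try:
            value = float(str(raw))
        except ValueError:
            continue  # e.g. degrees-minutes-seconds text; not judged here
        if abs(value) > limit:
            issues.append(
                f"{term}={raw} is outside +/-{limit:g}; possibly a grid coordinate"
            )
    # Out-of-range numbers are only a problem if nothing explains them.
    if issues and not record.get("verbatimCoordinateSystem"):
        issues.append("verbatimCoordinateSystem is empty, so the values cannot be interpreted")
    return issues


if __name__ == "__main__":
    suspect = {"verbatimLatitude": "9840350", "verbatimLongitude": "166624"}
    for issue in flag_coordinate_issues(suspect):
        print(issue)
```

It deliberately reports rather than rejects, so the record can still be published and the issue sent back to the source.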

Regards

Rod



On 9 Feb 2015, at 16:12, John Wieczorek <tuco at berkeley.edu> wrote:

Hi Rod,

I would treat easting and northing as latitude and longitude, but not following strictly the definition in Darwin Core. There is actually value in being able to have the easting (longitude) separate from the northing (latitude) if the source has them separated. It makes it that much less ambiguous to interpret. I would also have that full tuple as in the example you gave (18M 166624 9840350) along with "UTM" in verbatimCoordinateSystem, and a datum or something similar in verbatimSRS. We want more information, not less, when it comes time to try to turn this all into more readily usable information (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters). All of this verbatim separation arose from the heyday of massive collaborative georeferencing in MaNIS, HerpNET, and ORNIS where we were able to take good advantage of whatever information the source had, and that is why verbatimCoordinateSystem is part of the offerings, to help with the parsing problem if the original source is known.
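
For readers who want to see what turning such a tuple into decimalLatitude and decimalLongitude involves, here is a minimal sketch using a truncated series form of the inverse transverse Mercator projection. It is not the VertNet parser. The function name utm_to_latlon is hypothetical, and it assumes the WGS84 ellipsoid (the thread does not state the datum of the TTU values) and an MGRS-style band letter in which letters before "N" mean south of the equator. It also ignores the irregular zones around Norway and Svalbard.

```python
import math


def utm_to_latlon(zone, band, easting, northing):
    """Convert UTM easting/northing (metres) to (latitude, longitude) in degrees.

    Assumes WGS84 and an MGRS-style latitude band letter (before 'N' = south).
    """
    a = 6378137.0            # WGS84 semi-major axis (m), an assumption here
    f = 1 / 298.257223563    # WGS84 flattening
    k0 = 0.9996              # UTM scale factor on the central meridian
    n = f / (2 - f)
    A = a / (1 + n) * (1 + n**2 / 4 + n**4 / 64)
    beta = (n / 2 - 2 * n**2 / 3 + 37 * n**3 / 96,
            n**2 / 48 + n**3 / 15,
            17 * n**3 / 480)
    delta = (2 * n - 2 * n**2 / 3 - 2 * n**3,
             7 * n**2 / 3 - 8 * n**3 / 5,
             56 * n**3 / 15)
    # Southern-hemisphere northings carry a 10,000,000 m false northing.
    false_northing = 10_000_000.0 if band.upper() < "N" else 0.0
    xi = (northing - false_northing) / (k0 * A)
    eta = (easting - 500_000.0) / (k0 * A)   # 500,000 m false easting
    xi_p = xi - sum(b * math.sin(2 * j * xi) * math.cosh(2 * j * eta)
                    for j, b in enumerate(beta, start=1))
    eta_p = eta - sum(b * math.cos(2 * j * xi) * math.sinh(2 * j * eta)
                      for j, b in enumerate(beta, start=1))
    chi = math.asin(math.sin(xi_p) / math.cosh(eta_p))
    lat = chi + sum(d * math.sin(2 * j * chi)
                    for j, d in enumerate(delta, start=1))
    lon0 = math.radians(zone * 6 - 183)      # central meridian of the zone
    lon = lon0 + math.atan2(math.sinh(eta_p), math.cos(xi_p))
    return math.degrees(lat), math.degrees(lon)


if __name__ == "__main__":
    lat, lon = utm_to_latlon(18, "M", 166624, 9840350)
    print(f"decimalLatitude={lat:.4f} decimalLongitude={lon:.4f}")
```

Note that the function needs the zone and the hemisphere as well as the two numbers, so easting and northing on their own are not enough to place a record.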

In short, I wouldn't have it any other way than the way it is done with TTU. It actually allows us to extract more information correctly rather than less. Parsing is a mess, but it is a mess anyway. It takes me about 25 steps to parse the variations I see in verbatim coordinates in VertNet, but it is worth it.

Now, to get back to TTU and upgrade their venerable migrator and see how things look afterwards. We appreciate the careful eye and the quality feedback reports to Github on TTU's behalf.

Cheers,

John


On Mon, Feb 9, 2015 at 8:50 AM, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:
Hi Dag,

Gack, this is where things get messy. I wouldn’t treat easting and northing as latitude and longitude (although they are obviously related). When I write code to parse verbatim latitude and longitude the last thing I expect is easting and northing (it’s hard enough already given the various ways people write lats and longs). There seems to be enough ambiguity here to really mess things up, as indeed they have for the TTU dataset.

Regards

Rod



On 9 Feb 2015, at 11:41, Dag Endresen <dag.endresen at gmail.com> wrote:

Hi Rod,

I have assumed that the purpose of the dwc:verbatimCoordinates term is
to allow for reporting coordinates originally recorded as one single
tuple such as the MGRS (Military Grid Reference System).

While original source coordinates that do have two tuples such as UTM
would use dwc:verbatimLongitude (for the Easting or X coordinate
tuple) and dwc:verbatimLatitude (for the Northing or Y coordinate
tuple).

Regards
Dag


On 9 February 2015 at 11:57, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:
Hi John,

So, if I understand you correctly, you would have hoped that TTU would have
output something like this:

“dwc:verbatimCoordinates” : “18M 166624 9840350”

rather than put the easting and northing into verbatimLatitude and
verbatimLongitude.

Regards

Rod



On 8 Feb 2015, at 20:22, John Wieczorek <tuco at berkeley.edu> wrote:

Hi Rod,

The verbatimLatitude, verbatimLongitude, and verbatimCoordinates were all
intended to be able to capture the original coordinates used at the source,
where decimalLatitude and decimalLongitude, with geodeticDatum, were meant
to contain the easy-to-act-on global system (UTMs do not cover the
entire planet, for example). The verbatimCoordinate term's definition shows
that this was the intent, but verbatimLatitude and verbatimLongitude do not.
When we get the examples separated from the term definitions, it should be
easier to make this clear.

Cheers,

John

On Sat, Feb 7, 2015 at 2:36 PM, Dag Endresen <dag.endresen at gmail.com> wrote:

Hi Rod,

At least in Norway, it is very common for the GBIF node to receive
(only) Easting and Northing of UTM zones 32V to 36W. For many datasets
we routinely and automatically make the conversion to decimal
degrees (and WGS84) at the node before these datasets are published to
the GBIF portal. When people download occurrences from the Norwegian
"GBIF portal", Artskart, my impression is that the UTM 32V (and the
33V) Easting and Northing coordinate format is actually more popular
than the decimal degree format - this is because the geographic data
layers for Norway more often are made available in the UTM format
(most often 32V or 33V) [1]. And yes, this continued present day
official use of such a wide variety of coordinate formats frustrates
me too... The historic use reported with the verbatim terms, is of
course difficult to do anything with...

I assume that Easting and Northing coordinates are both valid and very
common values (and not only in Norway) for the Darwin Core verbatim
coordinate terms (dwc:verbatimLatitude and dwc:verbatimLongitude or
dwc:verbatimCoordinates), but of course only at all useful when
accompanied by the respective dwc:verbatimCoordinateSystem and
dwc:verbatimSRS also reported. (And that the dwc:decimalLatitude and
dwc:decimalLongitude correctly reported in WGS84 should preferably
also always be there). I believe that Darwin Core is already fine with
respect to terms to report geographic coordinates. If at all any
additions are useful, I believe that identifying and recommending
terms from more specialized geographic vocabularies and ontologies
might be much more useful than adding any new dwc:Location terms to
Darwin Core. In fact, most of the dwc:Location terms might perhaps
preferably be replaced by terms from the geography community... such
as perhaps [2] and [3] (as a start).

[1] https://dagendresen.wordpress.com/2013/11/22/convert-coordinate-srs/
[2] http://www.w3.org/2003/01/geo/wgs84_pos#
[3] http://www.geonames.org/ontology/documentation.html

Regards
Dag


On 7 February 2015 at 13:02, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:
Pardon my ignorance, but has there ever been a discussion of easting and
northing values in regards to Darwin Core? AFAIK the current standard
doesn’t mention them. The reason I’m asking is that I’ve just come across
some VerbatimLatitude and VerbatimLongitude values in a dataset that is
aggregated by VertNet (and hence GBIF) where (after some head scratching) I
realised that the verbatim values were actually Easting and Northing (which
I didn’t know existed until yesterday). Details are here:
https://github.com/ttu-vertnet/ttu-mammals/issues/11

I’m guessing this isn’t a terribly common way to record location
information, but it looks like in this case the lack of support for this
type of data has resulted in somebody trying to shoehorn them into
VerbatimLatitude and VerbatimLongitude, resulting in values which are
uninterpretable to aggregators further up the chain.

Regards

Rod



--
Dag Endresen, Ph.D.
Private email: dag.endresen at gmail.com
Work email: dag.endresen at nhm.uio.no
Mobile: +47 4061 2982
_______________________________________________
tdwg-content mailing list
tdwg-content at lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content

