[tdwg-content] dwc: city to county

John R. WIECZOREK tuco at berkeley.edu
Wed Aug 26 01:04:23 CEST 2009


On Tue, Aug 25, 2009 at 12:18 PM, Gregor Hagedorn<g.m.hagedorn at gmail.com> wrote:
> Hi John
>
>> Darwin Core is able to transmit Gazetteer IDs for the kind of objects
>> you are talking about (generally called "features" or "named places")
>> that are present in gazetteers. Not only that, gazetteers can have
>> detailed information (georeferences with uncertainties) about places
>> with complex descriptions as well as simple named places. BioGeoBIF
>> does this, and a Locality service I have long wanted to build has
>> exactly this intention. What Darwin Core can't do is give a gazetteer
>> id for some part of the Location, only for the whole. In other words,
>> it can't do what you want it to do. I don't think Darwin Core should.
>> I think the far better solution is to use universal terms - the
>> spatial data - for the use case you are proposing.
>
> There is a big difference between city being S. Francisco and the
> location being a detail inside of it, and city being S. Francisco and
> the location being 200 km S of it.

Yes, I agree. They are very different. Assuming there was a "city"
term in DwC, I would not want someone to put San Francisco as the city
if the Location was outside of the city. In other words, no geographic
term is to be used to represent a "nearest named place"; instead, such
terms are to be used only to designate containment of the specific place.
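
As a minimal sketch of the containment rule above (not part of Darwin
Core itself), the following Python checks whether a point lies inside a
city boundary before a "city" value is assigned. The function names and
the rectangular boundary are hypothetical, chosen only for illustration;
a real boundary would come from a gazetteer or a data set such as GADM.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def city_term_for(lon, lat, city_name, city_boundary):
    """Return city_name only if the point is contained in the boundary."""
    return city_name if point_in_polygon(lon, lat, city_boundary) else None


# Hypothetical, very rough rectangle standing in for a city boundary.
SF_BOUNDARY = [(-122.52, 37.70), (-122.35, 37.70),
               (-122.35, 37.83), (-122.52, 37.83)]

inside_city = city_term_for(-122.42, 37.77, "San Francisco", SF_BOUNDARY)
far_south = city_term_for(-121.00, 35.97, "San Francisco", SF_BOUNDARY)
```

Here `inside_city` gets the city name, while `far_south` (roughly 200 km
away) gets no city value at all rather than the nearest named place.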

> So for the use case where the detailed location is inside the
> boundaries defined by a gazetteer ID, I am still assuming that DWC can
> transmit the data ONLY if no more detailed data are given. Or is this a
> misunderstanding?

You understand correctly. Darwin Core can transmit all of the detail
about the place, no matter how specific, but it cannot transmit any
gazetteer id that does not correspond to the whole Location in all its
detail.
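
To illustrate the "whole Location only" rule, here is a small Python
sketch. The gazetteer contents, the identifier, and the function name
are all made up for the example; the point is only that an id is
returned when the gazetteer entry describes the complete Location, never
when it matches just a part of it.

```python
# Hypothetical gazetteer: complete place description -> identifier.
# Both the entry and the identifier are invented for illustration.
GAZETTEER = {"San Francisco": "gaz:0001"}


def location_id_for(locality, gazetteer=GAZETTEER):
    """Return a gazetteer id only if it covers the whole locality string.

    A substring match (the city appearing somewhere inside a more
    detailed description) deliberately does not count.
    """
    return gazetteer.get(locality.strip())


whole = location_id_for("San Francisco")
detailed = location_id_for("Golden Gate Park, San Francisco")
```

In this sketch `whole` receives the id, while `detailed` receives
nothing unless the gazetteer itself has an entry for that more specific
place.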

> Note that the need for gazetteer IDs arises primarily because multiple
> cities or villages, even within a state, bear the same name. In part
> that could be solved by fully citing all administrative subdivisions,
> but in our experience this is extremely tough (in part because the
> data are unavailable, in part because these divisions change every year
> - in Germany we of course have had even more drastic changes, which
> made even the normally relatively reliable states problematic).
> Gazetteer IDs are nice and stable.

Though the gazetteer IDs may be stable over time, the geographic
entities to which they refer may not be. Thus, I prefer the
georeferences again for stability and analytical power.
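
As a rough sketch of what georeference-based analysis can look like, the
Python below selects records within a given distance of a point using
the haversine formula. The record layout is an assumption: plain
dictionaries keyed by the Darwin Core terms decimalLatitude and
decimalLongitude. The function names and sample coordinates are
hypothetical, and coordinate uncertainty is ignored for brevity.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def records_within(records, lat, lon, radius_km):
    """Keep records whose georeference lies within radius_km of a point."""
    return [
        rec for rec in records
        if haversine_km(lat, lon,
                        rec["decimalLatitude"],
                        rec["decimalLongitude"]) <= radius_km
    ]


# Two made-up records: one in the city, one far to the south of it.
records = [
    {"decimalLatitude": 37.77, "decimalLongitude": -122.42},
    {"decimalLatitude": 35.97, "decimalLongitude": -121.00},
]
nearby = records_within(records, 37.77, -122.42, 50.0)
```

Because the selection depends only on coordinates, it is unaffected by
renamed places or redrawn administrative boundaries.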

>>> It seems to me undesirable to be able to analyse records where the
>>> collectors did not care to add detailed information (such as in a
>>> collection only citing the city name) and unable to analyse any
>>> records where in addition to the same city name a careful collector
>>> also provided additional information. However, according to your
>>> definitions, this is presently the situation with DWC.
>>
>> I have to assume that by "analyse" you mean "query for". In any case,
>> the DwC supports both - by spatial data as mentioned above and by
>> substring querying on the locality term.
>
> No I mean aggregate and compare.
>
>>> Or could we even accept the IPTC Extension 1.1 as a relevant standard
>>> and accept its definitions? (I previously wrongly quoted a version 2; I
>>> meant the 1.1. Also note that these fields are not new; they have been
>>> moved from core to extension.)
>>>
>>> The link to the specs is here:
>>>
>>> http://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata%28200907%29_1.pdf
>>>
>>> It defines for the {location detail} class the following properties
>>> (available as RDF vocabularies)
>>>
>>> World Region
>>> Country ISO-Code
>>> Country Name
>>> Province or State
>>> City
>>> Location Details
>>> Sublocation
>>
>> These would be committing the same mistakes that make DwC problematic
>> for geographic subdivisions as it currently stands. If DwC is to
>> change, it seems to me that it should be for the better, not for a
>> similar alternative.
>
> I disagree. I think IPTC and XMP have been carefully drafted by very
> capable and pragmatic people. That includes the digital still image
> and video recorder manufacturers who currently implement these
> standards (see http://www.metadataworkinggroup.org/members/).

These very capable and pragmatic people haven't addressed any of the
issues I raised about the fallibility of named places and their
somewhat arbitrary and changing hierarchical categorization systems -
issues that Darwin Core can solve only with spatial entities
(georeferences) to represent the Location.

> I personally think undefined terms like geohierarchy1 to geohierarchy8
> are not very practical, easily misunderstood or abused, not amenable
> to human consumption, and liable to be uninterpretable (where higher
> and lower geographic regions have the same name). I think the proposal
> is not in the laudably pragmatic spirit of most of the current Darwin
> Core.

Some other very capable and pragmatic people came up with the TDWG
Geographic Region Ontology
(http://rs.tdwg.org/ontology/voc/GeographicRegion), including
controlled vocabulary, and others are working hard to make actual
shapes associated with these features. Should we adopt all of that as
well despite the fact that much more inclusive and venerable
vocabularies exist in the Getty Thesaurus of Geographic Names (TGN,
http://www.getty.edu/research/conducting_research/vocabularies/tgn/)
and the Global Administrative Areas (GADM,
http://biogeo.berkeley.edu/gadm/) data sets? How would we reconcile
the terminology and differences in opinion of administrative levels? I
don't think we can, and I certainly don't think we should waste our
time trying.

I'm nothing if not pragmatic. That's why I expressed and rejected my
own purist ideas in the 13 Aug posting. I can be convinced to include
a "city" term, but only if you promise never to populate it with a
city if the actual Location is not at least partly within that city.
;-)

In return, I'd ask to not have to change any other term names in the
Location class.

John

> Gregor
>
