Renato De Giovanni renato at cria.org.br
Thu May 29 22:43:35 CEST 2008

John et al.,

I think it's a good idea to use a survey to test for consensus, but it
should be as simple as possible. I would suggest perhaps two questions:

1- Would you be satisfied if the current DarwinCore and its two extensions
(curatorial and geospatial) became an official standard (even if you
don't fully agree with everything)?


2- If not, what do you think should still be improved/fixed?


Regarding your questions, maybe it would be interesting to move the
discussion to the Wiki so that more people can participate and all answers
can be visualized together.

I agree with Dave that it's important to define the nature of DwC.
Personally, I see it as an XML vocabulary, not a real data model like the
TDWG ontology.

Now my answers:

1) Yes, I agree that species occurrence would be the right scope for the
core. However, I agree with Hannu that some concepts in the core can
easily cause confusion when interpreted by field observers. A few
adjustments in concept names and perhaps a new observation extension could
be worthwhile.

As a side note, if we decide to make any changes in the existing schemas
it's important to change the namespace because they are already being used
by providers. It's possible to map concepts from two schemas if they have
different namespaces (see http://rs.tdwg.org/tapir/cs/mappings/) allowing
providers to easily upgrade their configuration when necessary.

2) Fully agree with Dave.

3) Fully agree with Dave and Hannu.

4) I tend to agree with John, unless we expect that a few
observation/specimen data providers will be able to offer additional GUIDs
(such as taxon concept GUIDs) in the next few years. I doubt this will
happen soon, but I may be wrong.

5) I see the current application schema more as an example. There can be
as many application schemas as necessary from TAPIR's perspective. This
one works well for flat data structures. ABCD or something along the lines
of Markus' new schema would be better options when there can be repeatable

6) Fully agree with John's answer. I think it's important to allow data
cleaning through valid XML instances.
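
To make the trade-off in question 6 concrete, here is a minimal Python
sketch of the dayOfYearDataType restriction quoted in John's message below
(an integer from 1 to 366). The function name check_day_of_year and the
"flag instead of reject" behaviour are my own assumptions for illustration;
they are not part of DwC, TAPIR or any existing tool.

```python
import re
from typing import Optional, Tuple

# Bounds taken from the dayOfYearDataType restriction (minInclusive/maxInclusive).
DAY_OF_YEAR_MIN = 1
DAY_OF_YEAR_MAX = 366

# Plain ASCII integer with optional sign, roughly the xs:integer lexical form.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def check_day_of_year(raw: str) -> Tuple[Optional[int], Optional[str]]:
    """Hypothetical helper: return (value, problem) instead of raising.

    A value that violates the restriction is reported rather than rejected,
    so the record can still be passed on and cleaned later.
    """
    text = raw.strip()
    if not _INTEGER.fullmatch(text):
        return None, f"not an integer: {raw!r}"
    value = int(text)
    if not DAY_OF_YEAR_MIN <= value <= DAY_OF_YEAR_MAX:
        return None, f"outside {DAY_OF_YEAR_MIN}..{DAY_OF_YEAR_MAX}: {value}"
    return value, None


if __name__ == "__main__":
    for sample in ["1", "366", "367", "0", "mid-June"]:
        print(sample, check_day_of_year(sample))
```

With a strict schema type, the last three samples would make the whole XML
instance invalid. With a looser type they could still be served and flagged
by a check like this one.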

Best Regards,

> I got sidetracked on this days ago, but feel in the light of recent star
> schema discussions on the original caching thread that the time is again
> right to submit this new discussion.
> Tradition has DwC discussions on this Tapir mailing list. I'm starting this
> new thread based on Markus' recent posting (below) about an Identification
> extension to DwC. I'm motivated to pull together the time and energy to
> finally push the pending DwC through the standards process, with a goal of
> having that whole process finished by the TDWG Meeting this year.
> I've been thinking about how to conduct the Request for Comment required to
> move the standard forward. I propose to put together a survey with Survey
> Monkey or something akin to actually test for reasonable consensus. Any
> comments or suggestions about this idea are welcome.
> However, I see benefits to having further discussion about some key issues
> before doing that, as I believe we now have enough accumulated experience to
> make some good decisions that will affect the design and guidelines for
> further development of the Darwin core and extensions.
> In the past, most Darwin Core discussions have revolved around whether to
> include a particular concept, and where. I think it will be much more useful
> to concentrate on a few key issues at a higher level, resolve them at that
> level, then make any necessary changes to the schemas based on the consensus
> guiding principles. It should be easy and fast to accomplish this if the
> principles are clear and simple. It should be possible to complete this work
> soon if we can easily achieve a consensus. Here are some seed questions and
> recommendations to facilitate the resolution of this next step in the
> process.
> 1) Is species occurrence in nature and in collections the right scope for
> the Core?
> 2) Should the general philosophy of the Core be inclusive or minimalist?
> What are the characteristics of a concept that allow it to be in the Core?
> What are the characteristics of a concept that allow it to be added to an
> existing extension?
> 3) What are the defining characteristics of a group of related concepts that
> justify the creation of a new extension? Should extensions be based on
> abstract conceptual groupings/objects (events,
> identifications/determinations, places)? Or on special interests (paleo,
> curation, interaction)? Or on the stability of the concepts (core contains
> the proven stable concepts, extensions are more volatile)?
> 4) Should there be elements in the Core and extensions to hold GUIDs linking
> them to instances of related classes of objects, such as an occurrence to a
> TaxonConceptGUID, or an occurrence to a CoreGatheringGUID? Should every
> extension have a non-mandatory GUID allowing for the external resolution of
> the object?
> 5) What should the Darwin Tapir application schema look like?
> 6) Is it the right approach to have restrictions on content at the concept
> definition level? Where should the line be drawn? Arguments have been raised
> in the past about the DwC and extensions' content with respect to
> being restrictive versus open to incorrect content. For example, DayOfYear
> in the current DwC 1.4 (http://rs.tdwg.org/dwc/tdwg_dw_core.xsd) is typed as
> a dwc:dayOfYearDataType, which is defined in
> http://rs.tdwg.org/dwc/tdwg_basetypes.xsd as:
> <xs:simpleType name="dayOfYearDataType">
>   <xs:restriction base="xs:integer">
>     <xs:minInclusive value="1" />
>     <xs:maxInclusive value="366" />
>   </xs:restriction>
> </xs:simpleType>
> Anything short of a flamethrower in response is welcome.

