Hi Markus,<div><br></div><div>This is very cool. :-)</div><div><br></div><div>I had some ideas, suggestions:</div><div><br></div><div>Add:</div><div>TaxonConceptID =&gt; unique identifier of taxon concept, some sort of resolvable GUID</div>

<div><br></div><div>Change:</div><div>TaxonID to TaxonNameID since it seems to link to something like a uBio namebankid</div><div><br></div><div>Question:</div><div>higherTaxonID, higherTaxon  are these the accepted names or the names on the collection label?</div>

<div><br></div><div>If accepted, change to higherTaxonNameID, higherTaxonName</div><div><br></div><div>Add:</div><div>HigherTaxonConceptID, this would link to the taxon concept for the next higher group with a more stable identifier than the family name / LSID.</div>

<div><br></div><div>Question:</div><div>scientificNameAuthorship  does this include the year? also I replace @ with &quot;and&quot; to avoid encoding, decoding e.g. &quot;Smith and Jones&quot;</div><div><br></div><div>Question:</div>

<div>taxonAccordingTo  is this a DOI or a linked identifier?</div><div><br></div><div>Comment:</div><div>namePublishedIn  should this be a subproperty of dcterms:bibliographicCitation?</div><div><br></div><div>Change:</div>

<div>acceptedTaxonID =&gt; acceptedTaxonNameID since it appears this would be a uBio namebank ID type identifier</div><div>acceptedTaxon    =&gt; acceptedTaxonName since this is the current accepted name.</div><div><br></div>

<div>These changes should not affect any of your goals, but they would help disambiguate a species concept (i.e. this is a real species) from the current taxonomic</div><div>hypothesis that this species belongs in this genus and family.</div>
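<div><br></div><div>A minimal Python sketch of that distinction, assuming made-up identifiers: taxonNameID stands for a name-level ID (uBio namebank style) and taxonConceptID for the proposed stable concept GUID. The two <i>triseriatus</i> names are the example used further down in this message; the third record and all ID values are purely illustrative.</div>
<div><pre>
```python
from collections import defaultdict

# Hypothetical occurrence records; every identifier here is invented for illustration.
records = [
    {"occurrence": "occ-1", "taxonNameID": "name:aedes-triseriatus",
     "scientificName": "Aedes triseriatus", "taxonConceptID": "concept:triseriatus"},
    {"occurrence": "occ-2", "taxonNameID": "name:ochlerotatus-triseriatus",
     "scientificName": "Ochlerotatus triseriatus", "taxonConceptID": "concept:triseriatus"},
    {"occurrence": "occ-3", "taxonNameID": "name:aedes-aegypti",
     "scientificName": "Aedes aegypti", "taxonConceptID": "concept:aegypti"},
]


def group_by_concept(rows):
    """Group occurrence ids by taxonConceptID, ignoring which name is on the label."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["taxonConceptID"]].append(row["occurrence"])
    return dict(groups)


by_concept = group_by_concept(records)
print(by_concept["concept:triseriatus"])
```
</pre></div>
<div>Both name variants land in the same group because the lookup key is the concept, not the current name.</div>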

<div><br></div><div>The recognition that something is a real species changes very little over time, but the taxonomic hypothesis implicit in the genus and species names</div><div>seems to change fairly often.</div><div><br>

</div><div>It also would have all the synonyms point to a relatively stable GUID for the species concept, not the latest accepted taxonomic hypothesis, i.e. the &quot;name&quot;.</div><div><br></div><div>The rationale is that this makes it easier to find all records of the same species that happen to be labeled either <i>Aedes triseriatus</i> or <i>Ochlerotatus triseriatus </i>via</div>

<div>names and the LSIDs for those names.</div><div><br></div><div>[Caution Red Herring]</div><div><br></div><div>I will also throw this idea out there, as a red herring. Is there any advantage to differentiating between different taxon levels, for instance species, family, order, etc.?</div>

<div><br></div><div>For instance having </div><div><br></div><div>SpeciesTaxon, </div><div>  speciesTaxonNameID</div><div>  speciesScientificName ...</div><div>  speciesInFamilyTaxonID</div><div>  speciesInFamily</div><div>

<br></div><div>FamilyTaxon</div><div>  familyTaxonNameID</div><div>  familyScientificName</div><div>  familyInOrderTaxonID</div><div>  familyInOrderName</div><div><br></div><div>...</div><div><br></div><div>This would make it clearer as to what attributes apply to which records. For instance, an order would have a kingdom, phylum, and class, but not a family.</div>
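<div><br></div><div>The rule that a record can only carry the ranks above its own could be sketched like this in Python. The rank list is taken from the classic dwc higher-rank terms in the draft; the function name is hypothetical.</div>
<div><pre>
```python
# Ranks from highest to lowest, following the classic dwc higher-rank terms.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]


def applicable_higher_ranks(rank):
    """Return the ranks above `rank`, i.e. the classification attributes it can carry."""
    return RANKS[:RANKS.index(rank)]


print(applicable_higher_ranks("order"))   # an order has no family attribute
print(applicable_higher_ranks("family"))
```
</pre></div>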

<div>Another way to achieve this is to assume that Taxon is a species, or paraspecies entity like subspecies, variety, and not one of the higher level groupings.</div><div>For example, a Taxon is an OTU operating at or near the species level. I think most of the DarwinCore observation records will be of species or paraspecies</div>

<div>anyway.</div><div><br></div><div>These are just some things to think about,</div><div><br></div><div>Respectfully,</div><div><br></div><div>- Pete</div><div><br><div class="gmail_quote">On Tue, Apr 28, 2009 at 12:15 AM, &quot;Markus Döring (GBIF)&quot; <span dir="ltr">&lt;<a href="mailto:mdoering@gbif.org">mdoering@gbif.org</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">I&#39;m trying my best to catch up with all the mails in this thread<br>

lately, but surely missed something.<br>

I&#39;ve ended up with a rather long post, please excuse, but I think the<br>

new dwc draft needs some explanations...<br>

<br>

<br>

<br>

The idea of darwin core is to provide a simple list of terms which are<br>

useful in different contexts - just like dublin core does. We have<br>

been thinking about giving the terms a &quot;domain&quot;, the dublin core<br>

terminology for binding a property to a class. Although we did declare<br>

a term&#39;s domain in the description, we finally decided that this is not<br>

more than a grouping of terms. Assigning a domain is a rather<br>

controversial task as properties can belong to several classes and the<br>

granularity of class definitions depends strongly on your views.<br>

This is likely the major obstacle in getting an agreed ontology on the<br>

road too, I believe.<br>

<br>

But in order to express the context of a collection of darwin core<br>

terms we also defined class terms representing a taxon/name, an<br>

occurrence/specimen, a site, a collecting event and more. Actually<br>

there are 2 ways of expressing the type of a thing in the new terms -<br>

using a class term (e.g. as the parent xml element instead of just<br>

record or inside the rowType attribute of the text archive meta file)<br>

or using the basis of record when using the simple/classic darwin core<br>

with a meaningfree DarwinRecord parent element.<br>

<br>

During last TDWG I got very attracted to the KISS idea that was<br>

present all over, but especially in Chuck&#39;s talk. Why has darwin core<br>

been so much more successful than other tdwg formats? Why can&#39;t we use<br>

the same approach to share taxonomic or nomenclatural data? In the end<br>

the core properties for a taxon or name are rather limited and if you<br>

take TCS apart there is not much more you can do with it than what the<br>

taxonomic dwc terms provide - leaving concept relations aside. We also<br>

decided not to separate a name from a taxon. That happens in the<br>

context and due to the presence of a taxonomic status or taxonomic<br>

classification. In order to save you from looking up the latest draft,<br>

here is the list of the taxonomic terms with a few quick annotations<br>

from myself:<br>

<br>

[ taken from <a href="http://darwincore.googlecode.com/svn/trunk/terms/" target="_blank">http://darwincore.googlecode.com/svn/trunk/terms/</a><br>

index.htm ]<br>

<br>

taxonID  # the taxon/name ID, preferably a GUID but local ids are<br>

permitted too. Any ID<br>

scientificName  # the full name incl authorship<br>

higherTaxonID  # ID pointing to the next higher taxon<br>

higherTaxon  #  full scientific name of the next higher taxon. Can be<br>

used alternatively or in addition to the ID above<br>

kingdom  # the classic dwc higher ranks to simplify transfer in many<br>

cases and useful to disambiguate homonyms<br>

phylum<br>

class<br>

order<br>

family<br>

genus<br>

subgenus<br>

specificEpithet  # atomised epithet part as you would guess<br>

taxonRank  # any rank string, preferably taken from a controlled list<br>

such as the TCS one<br>

infraspecificEpithet<br>

scientificNameAuthorship  # the authorship alone. Similar to the other<br>

atomised parts this might not be needed as the complete sciname string<br>

is good enough, but it seemed desired by many<br>

nomenclaturalCode # the code to disambiguate homonyms across the codes<br>

taxonAccordingTo # the taxon concept sec reference<br>

namePublishedIn # the nomenclatural reference, original publication of<br>

the name<br>

taxonomicStatus  # a status such as accepted, (heterotypic) synonym,<br>

misapplied name. Ideally also a vocabulary such as the ontology one<br>

for nomenclatural note types<br>

nomenclaturalStatus  # similar to above, but only nomenclaturally<br>

relevant statuses such as nom. illeg.<br>

acceptedTaxonID  # pointer to accepted taxon in case of synonyms or<br>

misapplied names<br>

acceptedTaxon  # explicit string version of above<br>

basionymID  #  pointer to the basionym<br>

basionym # explicit string version of above<br>

<br>

<br>

So frankly I think there is a lot you can do with these terms. Every<br>

synonym will be a name on its own and link to the accepted taxon,<br>

having its own taxonomic status that can provide you with details<br>

about the reason for it being a synonym or misapplied name. If<br>

combined with 1 to many extensions, you could even exchange pretty<br>

detailed species information in a *very* simple model.<br>

Imagine you have data based on extensions for species such as<br>

distribution (multiple distribution status per area per species),<br>

descriptions (just a simple title, body, and description type ala SPM<br>

classes would be great), multimedia, alternative links/guids, types<br>

data. As an exercise I have also translated parts of the GISIN schema<br>

into extensions based around species described by darwin core terms<br>

and it works fine.<br>
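<br>
A minimal Python sketch of the model described above, under stated assumptions: the field names follow the draft terms, while the local ids, the distribution rows, and the choice of which name is treated as accepted are invented purely for illustration.<br>
<pre>
```python
# Hypothetical core records using the draft term names; ids are local and invented.
taxa = [
    {"taxonID": "t1", "scientificName": "Ochlerotatus triseriatus",
     "taxonomicStatus": "accepted", "acceptedTaxonID": ""},
    {"taxonID": "t2", "scientificName": "Aedes triseriatus",
     "taxonomicStatus": "synonym", "acceptedTaxonID": "t1"},
]

# Hypothetical one-to-many distribution extension keyed on the core taxonID.
distribution = [
    {"taxonID": "t1", "area": "Area A", "status": "present"},
    {"taxonID": "t1", "area": "Area B", "status": "absent"},
]

by_id = {t["taxonID"]: t for t in taxa}


def accepted_taxon(taxon_id):
    """Follow acceptedTaxonID from a synonym to its accepted record."""
    record = by_id[taxon_id]
    target = record["acceptedTaxonID"]
    return by_id[target] if target else record


def distributions(taxon_id):
    """Collect the extension rows attached to the accepted taxon."""
    accepted_id = accepted_taxon(taxon_id)["taxonID"]
    return [row for row in distribution if row["taxonID"] == accepted_id]


print(accepted_taxon("t2")["scientificName"])
print(len(distributions("t2")))
```
</pre>
The synonym stays a record of its own with its own status, and the extension data is reached through the accepted taxon it points to.<br>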

<br>

Surely darwin core terms can leave you in doubt about the exact<br>

context, just as dublin core does. But it immediately allows you to<br>

share data in a simple way that people are comfortable with and it is<br>

not tied to a technology per se. I see this as the biggest advantage<br>

of all. It plays nice in the world of xml, rdf, plain text, xhtml,<br>

microformats - just anything as it is based on just a list of terms<br>

with an associated globally unique URI. And by decoupling the<br>

recommendations on how to use the dwc terms in the context of<br>

different technologies, we can keep the term definitions stable no<br>

matter what comes next.<br>

<br>

<br>

Also I wanted to avoid confusion about IPT and the dwc text archives.<br>

The text archive, extensions and vocabularies idea implemented in the<br>

IPT are not IPT specific. It&#39;s just a very simple exchange format<br>

based on the dwc recommendations for how to use darwin core in the<br>

context of text/csv files. For those interested, I am currently writing<br>

client code, so there will be a Java archive reader in a few days<br>

that allows you to easily iterate over the core dwc records in an<br>

archive and pulls out the related extension records together with the<br>

core record.<br>
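<br>
To illustrate the iteration pattern, here is a rough Python sketch. It is not the Java reader mentioned above and it ignores the archive meta file; it simply assumes two in-memory CSV tables that share a taxonID column, with invented rows.<br>
<pre>
```python
import csv
import io
from collections import defaultdict

# Invented sample data: one core table and one extension table keyed on taxonID.
CORE_CSV = """taxonID,scientificName,taxonRank
t1,Aedes triseriatus,species
t2,Aedes aegypti,species
"""
EXT_CSV = """taxonID,area,status
t1,Area A,present
t1,Area B,present
"""


def iter_core_with_extensions(core_text, ext_text, key="taxonID"):
    """Yield (core_row, [extension_rows]) pairs joined on the shared key column."""
    ext_by_key = defaultdict(list)
    for row in csv.DictReader(io.StringIO(ext_text)):
        ext_by_key[row[key]].append(row)
    for core in csv.DictReader(io.StringIO(core_text)):
        yield core, ext_by_key.get(core[key], [])


for core, ext in iter_core_with_extensions(CORE_CSV, EXT_CSV):
    print(core["scientificName"], len(ext))
```
</pre>
Indexing the extension rows once keeps the loop over the core records a simple lookup per record.<br>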

<br>

<br>

I am glad to see this list being so alive again!<br>

Markus<br>

<br>

<br>

<br>

On Apr 28, 2009, at 3:22, Blum, Stan wrote:<br>

<br>


<div><div></div><div class="h5">&gt; Please note, the most current draft of the DarwinCore is:<br>

&gt;   here <a href="http://code.google.com/p/darwincore/" target="_blank">http://code.google.com/p/darwincore/</a> and<br>

&gt;   here <a href="http://rs.tdwg.org/dwc/index.htm" target="_blank">http://rs.tdwg.org/dwc/index.htm</a>,<br>

&gt;   not here <a href="http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard" target="_blank">http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard</a>.<br>

&gt;<br>

&gt; My primary concern with the latest draft is the absence of an<br>

&gt; explicit class identifier (and an implicit class definition) that<br>

&gt; indicates what kind of data object the sender is transmitting.  If<br>

&gt; an indexer/aggregator is indexing multiple kinds of resources, as<br>

&gt; GBIF is, and a publisher provides a record with these elements<br>

&gt; [ScientificName, ScientificNameAuthorship, NamePublishedIn, and<br>

&gt; Country], how should the indexer interpret this record?  Is it an<br>

&gt; organism occurrence record, an authoritative taxonomic record (the<br>

&gt; country name indicating the entire known range of the taxon), or<br>

&gt; part of a taxonomic checklist for that country?  The term/element<br>

&gt; [BasisOfRecord] is the first step in narrowing the possible<br>

&gt; meanings, but it&#39;s the only step and it appears not to be a required<br>

&gt; step.  (I interpret the &quot;Status: recommended&quot; to mean that it&#39;s<br>

&gt; optional.)   At a minimum, BasisOfRecord should be required.  It<br>

&gt; would still be possible to publish garbage (at least hard to<br>

&gt; interpret records) because our tools don&#39;t constrain structure, and<br>

&gt; there isn&#39;t (yet) any guidance controlling the structures for<br>

&gt; different classes of objects.<br>

&gt;<br>

&gt; The new GBIF Internet Publishing Toolkit (IPT) supports one-to-many<br>

&gt; relationships among a series of flat tables and looks like it&#39;s<br>

&gt; going to make it easier to transmit more complicated data than we<br>

&gt; were doing with DiGIR and TAPIR.  In conjunction with this new<br>

&gt; expanded bag of elements we could see a lot more complex data get<br>

&gt; published.  If I may use a skiing metaphor, we&#39;ve been on the bunny<br>

&gt; slopes (for beginners) up until now.  The new DarwinCore looks like<br>

&gt; a nicely groomed black diamond slope, but we haven&#39;t had any lessons<br>

&gt; or even watched anyone else do what we&#39;re going to attempt.  I think<br>

&gt; we&#39;re going to end up in a heap at the bottom of the hill, and the<br>

&gt; aggregators are going to have to sort it out.  ( Woohoo! Extreme<br>

&gt; biodiversity informatics! )<br>

&gt;<br>

&gt;<br>

&gt; Finally, I did not mean to say that the Ontology should not be<br>

&gt; ratified, as in not supported; what I meant was that it should not<br>

&gt; be a standard because we will want to change its contents without<br>

&gt; versioning the ontology as a whole.  (Also, it&#39;s not an application<br>

&gt; schema, so it can&#39;t be used directly unless we venture into somewhat<br>

&gt; uncharted territory.)  Its role is to help us keep our application<br>

&gt; schemas coordinated.<br>

&gt;<br>

&gt; The earlier versions of the DarwinCore (or our protocols, or the way<br>

&gt; we used them together) were too limiting (see Greg&#39;s comments); this<br>

&gt; version allows terms to be combined in nonsensical ways.  It could<br>

&gt; make life very difficult for data integrators.<br>

&gt;<br>

&gt; I think the bottom line is that we URGENTLY need a similar concerted<br>

&gt; effort to advance the TDWG (biodiversity informatics) ontology, and<br>

&gt; a companion set of application schemas coming forward from the<br>

&gt; collections, marine biology, paleo, observation, taxon-name-concept<br>

&gt; groups as soon as possible.<br>

&gt;<br>

&gt; -Stan<br>

&gt;<br>

&gt; -----Original Message-----<br>

&gt; From: Chuck Miller [mailto:<a href="mailto:Chuck.Miller@mobot.org">Chuck.Miller@mobot.org</a>]<br>

&gt; Sent: Monday, April 27, 2009 7:22 AM<br>

&gt; To: Kevin Richards; Blum, Stan; Technical Architecture Group<br>

&gt; mailinglist; <a href="mailto:exec@tdwg.org">exec@tdwg.org</a><br>

&gt; Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology<br>

&gt;<br>

&gt; Kevin,<br>

&gt;<br>

&gt; I agree with you and Stan that the ontology is useful to all<br>

&gt; schemas.  It seems to me that a “TDWG Ontology” is a totally new and<br>

&gt; different kind of thing than all the data exchange standards of the<br>

&gt; prior 10 years – DwC, SDD, TCS, etc.  But, it is a very useful and<br>

&gt; important new kind of thing that should be part of the TDWG<br>

&gt; standards architecture. It challenges prior thinking about the<br>

&gt; nature of TDWG standards to grasp what standardizing on an ontology<br>

&gt; means.  But, I think it’s what is needed.<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; If TDWG standardized on one Ontology, then the vocabulary of all<br>

&gt; data exchange could be standardized on it.  Then all TDWG standards<br>

&gt; could be revised over time to comply to that vocabulary standard,<br>

&gt; including DwC.<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; Stan said: “ I&#39;d like to hear the rationale for combining taxonomic<br>

&gt; name/concept with organism occurrence.” An occurrence record<br>

&gt; generally has an organism’s name associated with it in the real<br>

&gt; world. It is necessary and inevitable that vocabulary about organism<br>

&gt; names will be used in an occurrence data exchange schema like DwC.<br>

&gt; We have been stymied with this idea for years. A standard Ontology/<br>

&gt; vocabulary for the elements of name information needed to be<br>

&gt; associated with an occurrence, or a description, or a taxon concept<br>

&gt; would go a long way toward solving this duality.  The “standard<br>

&gt; vocabulary” would not be standardized within DwC but it would be<br>

&gt; used in DwC.<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; Of course there is the problem of the hundreds of installations of<br>

&gt; DiGIR that use DwC “classic” and are no doubt not going to change<br>

&gt; for a long time.  I think they just have to be accepted and worked<br>

&gt; around going forward.  It’s impractical to think of anything else.<br>

&gt; But, the past should not roadblock the future and we need to get<br>

&gt; moving toward that future.<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; Stan thinks that the Ontology is not appropriate for TDWG<br>

&gt; ratification.  Why not?  Change has to start somewhere. Yes, other<br>

&gt; standards would probably be in conflict if the Ontology were<br>

&gt; ratified, but I think we want to ultimately have consistency across<br>

&gt; all the standards and that means there has to be change going<br>

&gt; forward.  I think a ratified TDWG Ontology would provide the<br>

&gt; foundation upon which to start building those changes.<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; Chuck<br>

&gt;<br>

&gt;<br>

&gt;<br>

</div></div><div><div></div><div class="h5">&gt; _______________________________________________<br>

&gt; tdwg-tag mailing list<br>

&gt; <a href="mailto:tdwg-tag@lists.tdwg.org">tdwg-tag@lists.tdwg.org</a><br>

&gt; <a href="http://lists.tdwg.org/mailman/listinfo/tdwg-tag" target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-tag</a><br>

<br>


</div></div></blockquote></div><br><br clear="all"><br>-- <br>---------------------------------------------------------------<br>Pete DeVries<br>Department of Entomology<br>University of Wisconsin - Madison<br>445 Russell Laboratories<br>

1630 Linden Drive<br>Madison, WI 53706<br>------------------------------------------------------------<br>

</div>