<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    I do not have an opinion on this issue but wanted to note that the
    TaxonName part of the TDWG Ontology appears to be fully functional. 
    Given that the TaxonName and TaxonConcept ontologies are based on
    TCS, there may be existing terms (based on TCS) with stable URIs to
    represent exactly what people want to say.  They wouldn't be Darwin
    Core terms, but they would be defined and have stable URIs
    nonetheless.  For example,<br>
    <br>
    <a class="moz-txt-link-freetext" href="http://rs.tdwg.org/ontology/voc/TaxonName#nameComplete">http://rs.tdwg.org/ontology/voc/TaxonName#nameComplete</a><br>
    which can be abbreviated tn:nameComplete<br>
    where tn:=<a class="moz-txt-link-freetext" href="http://rs.tdwg.org/ontology/voc/TaxonName#">http://rs.tdwg.org/ontology/voc/TaxonName#</a><br>
    <br>
    is defined as "The complete uninomial, binomial or trinomial name
    without any authority or year components."<br>
    <br>
    Thus one could mark up data as <br>
    &lt;tn:nameComplete&gt;Homo sapiens&lt;/tn:nameComplete&gt;<br>
    and theoretically this would have meaning to the extent to which
    people take the TDWG Ontology seriously.  But that is a different
    item for discussion...<br>
    <br>
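    As a minimal sketch of that markup in code (assuming Python's standard xml.etree.ElementTree; the helper name name_complete_element is hypothetical and not part of the TDWG Ontology), the element could be built with the tn namespace given above:<br>
    <pre>
```python
import xml.etree.ElementTree as ET

# Namespace URI and "tn" prefix as quoted in this message.
TN_NS = "http://rs.tdwg.org/ontology/voc/TaxonName#"
ET.register_namespace("tn", TN_NS)


def name_complete_element(name):
    """Build a tn:nameComplete element (hypothetical helper) for an
    authority-free uninomial, binomial or trinomial name."""
    element = ET.Element("{%s}nameComplete" % TN_NS)
    element.text = name
    return element


xml_text = ET.tostring(name_complete_element("Homo sapiens"), encoding="unicode")
print(xml_text)
```
</pre>
    Serializing with the namespace declaration is what ties the tn prefix to the stable URI; whether consumers act on it is the separate question mentioned above.<br>
    <br>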
    Steve<br>
    <br>
    To view the RDF, see:
<a class="moz-txt-link-freetext" href="http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonName.owl">http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonName.owl</a><br>
    and
<a class="moz-txt-link-freetext" href="http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl">http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl</a><br>
    <br>
    On 3/14/2012 3:15 PM, Kennedy, Jessie wrote:
    <blockquote
cite="mid:78F1C759D89E1C44A6DD4C06B9A4B270010386C4F4B4@E2K7MBX.napier-mail.napier.ac.uk"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <meta name="Generator" content="Microsoft Word 14 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
        {font-family:Webdings;
        panose-1:5 3 1 2 1 5 9 6 7 3;}
@font-face
        {font-family:"Comic Sans MS";
        panose-1:3 15 7 2 3 3 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0cm;
        mso-margin-bottom-alt:auto;
        margin-left:0cm;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:8.0pt;
        font-family:"Tahoma","sans-serif";}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Consolas;
        mso-fareast-language:EN-GB;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.BalloonTextChar
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Tahoma","sans-serif";
        mso-fareast-language:EN-GB;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";
        mso-fareast-language:EN-US;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style>
      <div class="WordSection1">
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">Peter,
            you just described what TCS offered. This was all covered
            in the discussion on TCS (along with many other things that
            have been discussed recently).<o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">I
            think the user guide covers some of the thinking behind
            these issues:<o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><a
              moz-do-not-send="true"
              href="http://www.tdwg.org/fileadmin/subgroups/tnc/User_Guide.pdf">http://www.tdwg.org/fileadmin/subgroups/tnc/User_Guide.pdf</a>
            <o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"><b><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"
              lang="EN-US">From:</span></b><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"
            lang="EN-US"> <a class="moz-txt-link-abbreviated" href="mailto:tdwg-tag-bounces@lists.tdwg.org">tdwg-tag-bounces@lists.tdwg.org</a>
            [<a class="moz-txt-link-freetext" href="mailto:tdwg-tag-bounces@lists.tdwg.org">mailto:tdwg-tag-bounces@lists.tdwg.org</a>] <b>On Behalf Of </b>Peter
            Desmet<br>
            <b>Sent:</b> 14 March 2012 20:11<br>
            <b>To:</b> Paul Kirk<br>
            <b>Cc:</b> TDWG content mailing list; Donald Hobern (GBIF);
            TDWG TAG mailing list; Christian Gendreau; dev Developers<br>
            <b>Subject:</b> Re: [tdwg-tag] [tdwg-content] Canonical name
            parsing<o:p></o:p></span></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <p class="MsoNormal">Hi Paul,<o:p></o:p></p>
        <div>
          <p class="MsoNormal"><o:p> </o:p></p>
        </div>
        <div>
          <p class="MsoNormal">Higher taxon: "Magnoliidae Novák ex
            Takhtajan" (a subclass).<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal">- scientificName: Magnoliidae Novák ex
            Takhtajan<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal">- taxonRank: subclass<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal">But there are no terms to share the
            canonical name "Magnoliidae". The only available options are
            kingdom, phylum, class, order, family, genus, subgenus,
            specificEpithet, infraspecificEpithet, none of which are
            appropriate.<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><o:p> </o:p></p>
        </div>
        <div>
          <p class="MsoNormal">Solution:<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal">- canonicalScientificName: Magnoliidae<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><o:p> </o:p></p>
        </div>
        <div>
          <p class="MsoNormal">Infrageneric taxon: "<span
style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#222222">Abies
              sect. Amabilis (Matzenko) Farjon &amp; Rushforth" (a
              section)</span><o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><span
style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#222222">-
              scientificName: Abies sect. Amabilis (Matzenko) Farjon
              &amp; Rushforth</span><o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><span
style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#222222">-
              taxonRank: section</span><o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><span
style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#222222">-
              genus: Abies</span><o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><span
style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#222222">But
              there are no terms to share "Abies Amabilis", "Abies sect.
              Amabilis", "Abies section Amabilis" or even "Amabilis". </span>The
            only available options are kingdom, phylum, class, order,
            family, genus, subgenus, specificEpithet,
            infraspecificEpithet, none of which are appropriate. Why we
            have subgenus but not <b>infragenericEpithet</b> is
            another issue; with that term I would at least be able to share "Amabilis".<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><o:p> </o:p></p>
        </div>
        <div>
          <p class="MsoNormal">Solution:<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal">- canonicalScientificName: Abies Amabilis<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal">- taxonRank: section<o:p></o:p></p>
        </div>
        <div>
          <p class="MsoNormal"><o:p> </o:p></p>
        </div>
        <div>
          <p class="MsoNormal">Peter<o:p></o:p></p>
        </div>
        <div>
          <div>
            <p class="MsoNormal"><o:p> </o:p></p>
          </div>
          <p class="MsoNormal"><o:p> </o:p></p>
          <div>
            <p class="MsoNormal">On Wed, Mar 14, 2012 at 14:37, Paul
              Kirk &lt;<a moz-do-not-send="true"
                href="mailto:p.kirk@cabi.org">p.kirk@cabi.org</a>&gt;
              wrote:<o:p></o:p></p>
            <div>
              <div>
                <p><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;">'For
                    higher taxa or infrageneric taxa, these terms are
                    not sufficient' ... why?<o:p></o:p></span></p>
                <p><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"> <o:p></o:p></span></p>
                <p><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;">Paul<o:p></o:p></span></p>
                <p><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"> <o:p></o:p></span></p>
                <div class="MsoNormal" style="text-align:center"
                  align="center"><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;">
                    <hr align="center" size="2" width="100%"></span></div>
                <div>
                  <div>
                    <p class="MsoNormal"><b><span
style="font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;;color:black">From:</span></b><span
style="font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;;color:black">
                        <a moz-do-not-send="true"
                          href="mailto:tdwg-tag-bounces@lists.tdwg.org"
                          target="_blank">tdwg-tag-bounces@lists.tdwg.org</a>
                        [<a moz-do-not-send="true"
                          href="mailto:tdwg-tag-bounces@lists.tdwg.org"
                          target="_blank">tdwg-tag-bounces@lists.tdwg.org</a>]
                        on behalf of Peter Desmet [<a
                          moz-do-not-send="true"
                          href="mailto:peter.desmet@umontreal.ca"
                          target="_blank">peter.desmet@umontreal.ca</a>]<br>
                        <b>Sent:</b> 14 March 2012 18:26<br>
                        <b>To:</b> Richard Pyle<br>
                        <b>Cc:</b> TDWG content mailing list; Donald
                        Hobern (GBIF); dev Developers; Christian
                        Gendreau; TDWG TAG mailing list<o:p></o:p></span></p>
                    <div>
                      <p class="MsoNormal"><span
style="font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;;color:black"><br>
                          <b>Subject:</b> Re: [tdwg-tag] [tdwg-content]
                          Canonical name parsing<o:p></o:p></span></p>
                    </div>
                    <p class="MsoNormal"><o:p> </o:p></p>
                  </div>
                  <div>
                    <div>
                      <div>
                        <p class="MsoNormal">Rich, <o:p></o:p></p>
                        <div>
                          <p class="MsoNormal"><o:p> </o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal">I wish those terms were
                            sufficient, but as mentioned in the
                            justification for <span
style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#222222"><a
                                moz-do-not-send="true"
                                href="http://code.google.com/p/darwincore/issues/detail?id=150"
                                target="_blank">http://code.google.com/p/darwincore/issues/detail?id=150</a>:</span><o:p></o:p></p>
                        </div>
                        <div>
                          <pre style="max-width:80em;white-space:pre-wrap"><span style="font-size:9.0pt">genus, specificEpithet, infraspecificEpithet: concatenated, these terms are identical to the canonicalScientificName for genera, species and infraspecific taxa. For higher taxa or infrageneric taxa, these terms are not sufficient. In addition, there is some ambiguity regarding the genus definition: for synonyms, is it the accepted genus or the genus that is part of the synonym name? See: <a moz-do-not-send="true" href="http://lists.tdwg.org/pipermail/tdwg-content/2010-November/002052.html" target="_blank"><span style="color:#0000CC">http://lists.tdwg.org/pipermail/tdwg-content/2010-November/002052.html</span></a>. In the former case, the genus cannot be used to concatenate a canonicalScientificName.<o:p></o:p></span></pre>
                        </div>
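                        <div>
                          <p class="MsoNormal">A rough sketch of the concatenation described in the issue text above, assuming records are plain Python dicts keyed by Darwin Core term names. The function name canonical_from_parts is hypothetical, and the sketch assumes genus holds the genus that is part of the name itself, which the issue text notes is ambiguous for synonyms:<o:p></o:p></p>
                          <pre>
```python
def canonical_from_parts(record):
    """Join genus, specificEpithet and infraspecificEpithet with spaces.

    Returns None when no genus is present, which is the case for
    higher taxa such as a subclass.
    """
    genus = record.get("genus")
    if not genus:
        return None
    parts = [genus]
    for term in ("specificEpithet", "infraspecificEpithet"):
        value = record.get(term)
        if not value:
            break  # an epithet is only meaningful after the one before it
        parts.append(value)
    return " ".join(parts)


species = {"genus": "Homo", "specificEpithet": "sapiens"}
subclass = {"scientificName": "Magnoliidae Novák ex Takhtajan",
            "taxonRank": "subclass"}
section = {"genus": "Abies", "taxonRank": "section"}

print(canonical_from_parts(species))   # Homo sapiens
print(canonical_from_parts(subclass))  # None
print(canonical_from_parts(section))   # Abies
```
</pre>
                          <p class="MsoNormal">The subclass yields nothing and the section collapses to its genus, which is the gap for higher and infrageneric taxa that the issue describes.<o:p></o:p></p>
                        </div>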
                        <div>
                          <p class="MsoNormal">To give an example for a
                            higher taxon:<o:p></o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal">scientificName: Magnoliidae
                            Novák ex Takhtajan<o:p></o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal">taxonRank: subclass<o:p></o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal"><o:p> </o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal">There is no place to
                            share the canonical name "Magnoliidae" for
                            this taxon.<o:p></o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal"><o:p> </o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal">Peter<o:p></o:p></p>
                        </div>
                        <div>
                          <p class="MsoNormal"><o:p> </o:p></p>
                          <div>
                            <p class="MsoNormal">On Wed, Mar 14, 2012 at
                              14:13, Richard Pyle &lt;<a
                                moz-do-not-send="true"
                                href="mailto:deepreef@bishopmuseum.org"
                                target="_blank">deepreef@bishopmuseum.org</a>&gt;
                              wrote:<o:p></o:p></p>
                            <p class="MsoNormal"><br>
                              I guess the parts that confuse me are:<br>
                              <br>
                              1) What providers are able to produce a
                              canonicalScientificName as per Peter’s
                              definition, but are unable to provide the
                              pre-parsed elements of genus | subgenus |
                              specificEpithet | infraspecificEpithet?<br>
                              <br>
                              2) What consumers could make use of a
                              canonicalScientificName as per Peter’s
                              definition, but are unable to make (even
                              better) use of the pre-parsed elements of
                              genus | subgenus | specificEpithet |
                              infraspecificEpithet?<br>
                              <br>
                              Aloha,<br>
                              Rich<br>
                              <br>
                              <br>
                              <br>
                              <br>
                              From: <a moz-do-not-send="true"
                                href="mailto:tdwg-content-bounces@lists.tdwg.org"
                                target="_blank">tdwg-content-bounces@lists.tdwg.org</a>
                              [mailto:<a moz-do-not-send="true"
                                href="mailto:tdwg-content-bounces@lists.tdwg.org"
                                target="_blank">tdwg-content-bounces@lists.tdwg.org</a>]
                              On Behalf Of Peter Desmet<br>
                              Sent: Wednesday, March 14, 2012 7:03 AM<br>
                              To: Donald Hobern (GBIF)<br>
                              Cc: TDWG content mailing list; Christian
                              Gendreau; Tim Robertson [GBIF]; TDWG TAG
                              mailing list; dev Developers<br>
                              Subject: Re: [tdwg-content] Canonical name
                              parsing<o:p></o:p></p>
                            <div>
                              <div>
                                <p class="MsoNormal"
                                  style="margin-bottom:12.0pt"><br>
                                  Hi Donald,<br>
                                  <br>
                                  scientificName, with its current
                                  definition [1], is a great term and
                                  should continue to be used as such.
                                  As with most Darwin Core terms, it
                                  offers flexibility, so it's not an
                                  impediment to publishing data. In the
                                  GBIF context, this term is considered
                                  mandatory: records without it are
                                  ignored during indexing (I believe).
                                  All of this can stay.<br>
                                  <br>
                                  canonicalScientificName would be an
                                  additional term with a clear rule (see
                                  my proposed definition [2]). This is
                                  the case for other Darwin Core terms
                                  as well, such as<br>
                                  decimalLatitude [3],
                                  minimumElevationInMeters [4] or
                                  countryCode [5]. They serve as a
                                  ready-to-use addition/alternative to
                                  verbatimLatitude [6],
                                  verbatimElevation [7] and country [8]
                                  respectively. These terms don't stop
                                  anyone from publishing data, but data
                                  publishers who can provide this kind
                                  of information have the choice to do
                                  so. It would be the same for
                                  canonicalScientificName.<br>
                                  <br>
                                  And yes, an aggregator like GBIF can
                                  play an important role in providing
                                  consistent data to its users and
                                  figuring out what they really need,
                                  but not all data is consumed that way.
                                  In addition, I hope a user would be
                                  able to download cleaned data from the
                                  GBIF portal as Darwin Core. Wouldn't
                                  it be nice if the parsed
                                  canonicalScientificName created by
                                  GBIF could be provided in its proper
                                  term? There are users out there who
                                  want this!<br>
                                  <br>
                                  Regards,<br>
                                  <br>
                                  Peter<br>
                                  <br>
                                  [1] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#scientificName"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#scientificName</a><br>
                                  [2] <a moz-do-not-send="true"
                                    href="http://code.google.com/p/darwincore/issues/detail?id=150"
                                    target="_blank">http://code.google.com/p/darwincore/issues/detail?id=150</a><br>
                                  [3] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#decimalLatitude"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#decimalLatitude</a><br>
                                  [4] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#minimumElevationInMeters"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#minimumElevationInMeters</a><br>
                                  [5] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#countryCode"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#countryCode</a><br>
                                  [6] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#verbatimLatitude"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#verbatimLatitude</a><br>
                                  [7] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#verbatimElevation"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#verbatimElevation</a><br>
                                  [8] <a moz-do-not-send="true"
                                    href="http://rs.tdwg.org/dwc/terms/index.htm#country"
                                    target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#country</a><br>
                                  <br>
                                  On Wed, Mar 14, 2012 at 11:19, Donald
                                  Hobern (GBIF) &lt;<a
                                    moz-do-not-send="true"
                                    href="mailto:dhobern@gbif.org"
                                    target="_blank">dhobern@gbif.org</a>&gt;
                                  wrote:<br>
                                  &gt;<br>
                                  &gt; Hi Peter.<br>
                                  &gt;<br>
                                  &gt; I certainly agree that
                                  aggregators only represent one use
                                  case here but, having seen a lot of
                                  the mess of real-world data, I don't
                                  believe that simply adding a new term
                                  will fix this problem for the users
                                  you describe.  To get the results you
                                  want, we would need a sufficiently
                                  large majority of data sets to follow
                                  the rules perfectly that we could
                                  ignore those that were non-conformant.
                                   This would mean we should mandate
                                  that every data set must use the new
                                  element (with or without the existing
                                  scientificName element) and that they
                                  must present scientific names in the
                                  expected way (or else have their data
                                  considered non-compliant). Until now,
                                  the philosophy on publishing Darwin
                                  Core data has been to make it as easy
                                  as possible for data providers to
                                  expose their data, even at the expense
                                  of greater complexity for consumers.
                                   I suspect that we would have a lot
                                  less data available for use now if we
                                  had taken a more stringent approach.<br>
                                  &gt;<br>
                                  &gt; In some ways, this proposal
                                  reminds me of the structures in ABCD
                                  which seek to offer users verbatim and
                                  more normalised ways to represent
                                  several types of information.  This
                                  actually makes consuming all the
                                  possible forms of such data very
                                  complex, since a record may contain
                                  all variant forms or just any one of
                                  them.  If multiple forms are
                                  available, which one should be
                                  considered the primary version?<br>
                                  &gt;<br>
                                  &gt; I suspect that things may also
                                  get complicated as soon as you discuss
                                  botanical subspecies, varieties,
                                  subvarieties, forms and subforms.
                                   There are recommended ways to
                                  abbreviate the rank markers in these
                                  cases but some variation can be
                                  expected.<br>
                                  &gt;<br>
                                  &gt; Of course aggregators should be
                                  providing more robust services for
                                  accessing exactly what you want in a
                                  consistent, predictable way and I
                                  would suggest that the best place to
                                  attack the problem is to define
                                  exactly what a typical user needs to
                                  see and then for GBIF and similar
                                  projects to work on delivering
                                  predictable data downloads and web
                                  services that clean out all of these
                                  nomenclatural inconsistencies - and
                                  perhaps also add value in other ways
                                  such as augmenting the data with
                                  associated environmental values (as
                                  the Atlas of Living Australia does).
                                   This would allow us all to work
                                  together on developing a consistent
                                  and predictable algorithm for handling
                                  interpretation of name strings,
                                  including synonymy, misspellings,
                                  virus names and everything else that
                                  makes this such a difficult problem.<br>
                                  &gt;<br>
                                  &gt; Best wishes,<br>
                                  &gt;<br>
                                  &gt; Donald<br>
                                  &gt;<br>
                                  &gt;
                                  ----------------------------------------------------------------------<br>
                                  &gt; Donald Hobern - GBIF Director - <a
                                    moz-do-not-send="true"
                                    href="mailto:dhobern@gbif.org"
                                    target="_blank">dhobern@gbif.org</a><br>
                                  &gt; Global Biodiversity Information
                                  Facility <a moz-do-not-send="true"
                                    href="http://www.gbif.org/"
                                    target="_blank">http://www.gbif.org/</a><br>
                                  &gt; GBIF Secretariat,
                                  Universitetsparken 15, DK-2100
                                  Copenhagen Ø, Denmark<br>
                                  &gt; Tel: <a moz-do-not-send="true"
                                    href="tel:%2B45%203532%201471"
                                    target="_blank">+45 3532 1471</a>
                                   Mob: <a moz-do-not-send="true"
                                    href="tel:%2B45%202875%201471"
                                    target="_blank">+45 2875 1471</a>
                                   Fax: <a moz-do-not-send="true"
                                    href="tel:%2B45%202875%201480"
                                    target="_blank">+45 2875 1480</a><br>
                                  &gt;
                                  ----------------------------------------------------------------------<br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt; -----Original Message-----<br>
                                  &gt; From: <a moz-do-not-send="true"
href="mailto:peter.desmet.cubc@gmail.com" target="_blank">peter.desmet.cubc@gmail.com</a>
                                  [mailto:<a moz-do-not-send="true"
                                    href="mailto:peter.desmet.cubc@gmail.com"
                                    target="_blank">peter.desmet.cubc@gmail.com</a>]
                                  On Behalf Of Peter Desmet<br>
                                  &gt; Sent: Wednesday, March 14, 2012
                                  3:41 PM<br>
                                  &gt; To: Tim Robertson [GBIF]<br>
                                  &gt; Cc: Donald Hobern (GBIF); dev
                                  Developers; TDWG content mailing list;
                                  TDWG TAG mailing list; Christian
                                  Gendreau<br>
                                  &gt; Subject: Re: Canonical name
                                  parsing<br>
                                  &gt;<br>
                                  &gt; Hi Tim,<br>
                                  &gt;<br>
                                  &gt; I agree, aggregators like GBIF
                                  and Canadensys will have to deal with
                                  clean and dirty data in each field
                                  anyway: they need code libraries to
                                  deal with this and it is good that
                                  these are being developed. But that
                                  doesn't help someone who wants to use
                                  data from a Darwin Core Archive alongside
                                  their own data in Excel, or a Roderic Page
                                  who wants to get things done for a
                                  prototype.<br>
                                  &gt; Having to use Java libraries or
                                  even the Name Parser [1] (though both
                                  great) is a barrier to data use.
                                  Darwin Core (Archives) is not only
                                  used for machine to machine
                                  interaction, humans use it too, and I
                                  think we should allow easy hacking (I
                                  mean this in the good sense),
                                  especially for something as important
                                  as the scientific name.<br>
                                  &gt; In addition, as a data publisher
                                  (e.g. for our VASCAN checklist [2]) I
                                  *do* have the information to
                                  provide a clean and simple to use
                                  canonicalScientificName, but I just
                                  can't share it via the otherwise
                                  excellent biodiversity sharing
                                  standard Darwin Core. I think that's a
                                  pity.<br>
                                  &gt;<br>
                                  &gt; Peter<br>
                                  &gt;<br>
                                  &gt; [1] <a moz-do-not-send="true"
                                    href="http://tools.gbif.org/nameparser/"
                                    target="_blank">http://tools.gbif.org/nameparser/</a><br>
                                  &gt; [2] <a moz-do-not-send="true"
                                    href="http://data.canadensys.net/vascan"
                                    target="_blank">http://data.canadensys.net/vascan</a><br>
                                  &gt;<br>
                                  &gt; PS: Yes, Canadensys will use the
                                  GBIF interpretation libraries. Since
                                  we develop in Java as well, using
                                  those libraries is as easy as the
                                  proverbial "one line of code". We're
                                  looking forward to testing them and
                                  providing patches to enhance them.
                                  Open source FTW! :-)<br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt; On Wed, Mar 14, 2012 at 07:32,
                                  Tim Robertson [GBIF] &lt;<a
                                    moz-do-not-send="true"
                                    href="mailto:trobertson@gbif.org"
                                    target="_blank">trobertson@gbif.org</a>&gt;
                                  wrote:<br>
                                  &gt; &gt; Hi Peter,<br>
                                  &gt; &gt;<br>
                                  &gt; &gt; I'm replying off the TDWG
                                  list, since it is a bit of a tangent
                                  to your discussion.  If you feel it is
                                  relevant, please CC the list again.<br>
                                  &gt; &gt;<br>
                                  &gt; &gt; At GBIF, as you know, we have
                                  to interpret content of widely varying
                                  quality.  I tend to agree with Donald
                                  that this would not really help in
                                  consumption, as in my experience we
                                  will have to deal with both clean and
                                  dirty data in each field *anyway* when
                                  this is used at network scale.  I
                                  would rather see us evolve the
                                  interpretation libraries to handle all
                                  the corner cases, which we need to
                                  develop anyway.  We already do a
                                  pretty decent job at extracting
                                  canonicals.  This is further enhanced
                                  when you couple the extracted
                                  canonical with a fuzzy match against
                                  the "authoritative names" we can now
                                  index thanks to the availability of
                                  checklists in DwC-A format.<br>
                                  &gt; &gt;<br>
                                  &gt; &gt; I know you are a Java shop.
                                   Are you using the GBIF interpretation
                                  libraries [1] at the moment?  If not,
                                  is there a reason why you don't?<br>
                                  &gt; &gt; They are used in all GBIF
                                  projects (portal, checklistbank etc),
                                  and the more we enhance them, the
                                  better it is for everyone.  We have a
                                  significant test coverage [2,3], and
                                  quite a few man months (years?) have
                                  already gone into their development,
                                  with input from some real regular
                                  expression experts (most notably
                                  Markus D. and Dave M.).  All our work
                                  is Maven-ized, versioned and available
                                  in our Maven repository [4].<br>
                                  &gt; &gt;<br>
                                  &gt; &gt; I hope these are interesting
                                  to you.  We would welcome any patches
                                  to enhance them, or assistance in
                                  identifying the corner cases and
                                  capturing those as unit tests.<br>
                                  &gt; &gt;<br>
                                  &gt; &gt; Hope this helps,<br>
                                  &gt; &gt; Tim<br>
                                  &gt; &gt;<br>
                                  &gt; &gt; [1] <a moz-do-not-send="true"
href="http://code.google.com/p/gbif-ecat/source/browse/trunk/ecat-common/src/main/java/org/gbif/ecat/parser/NameParser.java"
                                    target="_blank">http://code.google.com/p/gbif-ecat/source/browse/trunk/ecat-common/src/main/java/org/gbif/ecat/parser/NameParser.java</a><br>
                                  &gt; &gt; [2] <a moz-do-not-send="true"
href="http://code.google.com/p/gbif-ecat/source/browse/trunk/ecat-common/src/test/java/org/gbif/ecat/parser/NameParserTest.java"
                                    target="_blank">http://code.google.com/p/gbif-ecat/source/browse/trunk/ecat-common/src/test/java/org/gbif/ecat/parser/NameParserTest.java</a><br>
                                  &gt; &gt; [3] <a moz-do-not-send="true"
href="http://code.google.com/p/gbif-ecat/source/browse/trunk/ecat-common/src/#src%2Ftest%2Fresources"
                                    target="_blank">http://code.google.com/p/gbif-ecat/source/browse/trunk/ecat-common/src/#src%2Ftest%2Fresources</a><br>
                                  &gt; &gt; [4] <a moz-do-not-send="true"
href="http://repository.gbif.org/index.html#nexus-search;quick%7Eecat-common"
                                    target="_blank">http://repository.gbif.org/index.html#nexus-search;quick~ecat-common</a><br>
                                  &gt; &gt;<br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt; --<br>
                                  &gt; Peter Desmet<br>
                                  &gt; Biodiversity Informatics Manager<br>
                                  &gt; Canadensys - <a
                                    moz-do-not-send="true"
                                    href="http://www.canadensys.net"
                                    target="_blank">www.canadensys.net</a><br>
                                  &gt;<br>
                                  &gt; Université de Montréal
                                  Biodiversity Centre<br>
                                  &gt; 4101 rue Sherbrooke est<br>
                                  &gt; Montreal, QC, H1X2B2<br>
                                  &gt; Canada<br>
                                  &gt;<br>
                                  &gt; Phone: <a moz-do-not-send="true"
                                    href="tel:514-343-6111%20%2382354"
                                    target="_blank">514-343-6111 #82354</a><br>
                                  &gt; Fax: <a moz-do-not-send="true"
                                    href="tel:514-343-2288"
                                    target="_blank">514-343-2288</a><br>
                                  &gt; Email: <a moz-do-not-send="true"
href="mailto:peter.desmet@umontreal.ca" target="_blank">peter.desmet@umontreal.ca</a>
                                  / <a moz-do-not-send="true"
                                    href="mailto:peter.desmet.cubc@gmail.com"
                                    target="_blank">peter.desmet.cubc@gmail.com</a><br>
                                  &gt; Skype: anderhalv<br>
                                  &gt; Public profile: <a
                                    moz-do-not-send="true"
                                    href="http://www.linkedin.com/in/peterdesmet"
                                    target="_blank">http://www.linkedin.com/in/peterdesmet</a><br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt;
                                  _______________________________________________<br>
                                  &gt; tdwg-content mailing list<br>
                                  &gt; <a moz-do-not-send="true"
                                    href="mailto:tdwg-content@lists.tdwg.org"
                                    target="_blank">tdwg-content@lists.tdwg.org</a><br>
                                  &gt; <a moz-do-not-send="true"
                                    href="http://lists.tdwg.org/mailman/listinfo/tdwg-content"
                                    target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-content</a><br>
                                  <br>
                                  <br>
                                  <br>
                                  <br>
                                  --<br>
                                  Peter Desmet<br>
                                  <br>
                                  <o:p></o:p></p>
                              </div>
                            </div>
                          </div>
                          <p class="MsoNormal"><br>
                            <br clear="all">
                            <o:p></o:p></p>
                          <div>
                            <p class="MsoNormal"><o:p> </o:p></p>
                          </div>
                          <p class="MsoNormal">-- <br>
                            Peter Desmet<br>
                            <o:p></o:p></p>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
            <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
              _______________________________________________<br>
              tdwg-tag mailing list<br>
              <a moz-do-not-send="true"
                href="mailto:tdwg-tag@lists.tdwg.org">tdwg-tag@lists.tdwg.org</a><br>
              <a moz-do-not-send="true"
                href="http://lists.tdwg.org/mailman/listinfo/tdwg-tag"
                target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-tag</a><o:p></o:p></p>
          </div>
          <p class="MsoNormal"><br>
            <br clear="all">
            <o:p></o:p></p>
          <div>
            <p class="MsoNormal"><o:p> </o:p></p>
          </div>
          <p class="MsoNormal">-- <br>
            Peter Desmet<br>
            <o:p></o:p></p>
        </div>
      </div>
      <br clear="all">
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>
</pre>
  </body>
</html>