<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:p="urn:schemas-microsoft-com:office:powerpoint" xmlns:a="urn:schemas-microsoft-com:office:access" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" xmlns:b="urn:schemas-microsoft-com:office:publisher" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet" xmlns:odc="urn:schemas-microsoft-com:office:odc" xmlns:oa="urn:schemas-microsoft-com:office:activation" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:rtc="http://microsoft.com/officenet/conferencing" xmlns:D="DAV:" xmlns:Repl="http://schemas.microsoft.com/repl/" xmlns:mt="http://schemas.microsoft.com/sharepoint/soap/meetings/" xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml" xmlns:ppda="http://www.passport.com/NameSpace.xsd" xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/" xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp" xmlns:udc="http://schemas.microsoft.com/data/udc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sub="http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/" xmlns:ec="http://www.w3.org/2001/04/xmlenc#" xmlns:sp="http://schemas.microsoft.com/sharepoint/" xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:udcs="http://schemas.microsoft.com/data/udc/soap" xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile" xmlns:udcp2p="http://schemas.microsoft.com/data/udc/parttopart" xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/" 
xmlns:dsss="http://schemas.microsoft.com/office/2006/digsig-setup" xmlns:dssi="http://schemas.microsoft.com/office/2006/digsig" xmlns:mdssi="http://schemas.openxmlformats.org/package/2006/digital-signature" xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns:mrels="http://schemas.openxmlformats.org/package/2006/relationships" xmlns:spwp="http://microsoft.com/sharepoint/webpartpages" xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:ex12m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:pptsl="http://schemas.microsoft.com/sharepoint/soap/SlideLibrary/" xmlns:spsl="http://microsoft.com/webservices/SharePointPortalServer/PublishedLinksService" xmlns:Z="urn:schemas-microsoft-com:" xmlns:st="" xmlns="http://www.w3.org/TR/REC-html40" xmlns:ns0="urn:schemas-microsoft-com:office:smarttags">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 12 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]--><style>
<!--
 /* Font Definitions */
 @font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Arial","sans-serif";
        color:navy;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.msoIns
        {mso-style-type:export-only;
        mso-style-name:"";
        text-decoration:underline;
        color:teal;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
-->
</style><!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1" />
 </o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-NZ" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D">Thanks Tony.  I would be interested in any collaboration we can do in this area (assuming I find 5 minutes to work on it  :-))<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">There seem to be several approaches to name matching/integration – one working with the name strings, and one working with more structured data (the second approach is discussed in a recent paper of mine).  It would be good to clarify and perhaps standardise these approaches.<o:p></o:p></span></p>
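<p class="MsoNormal">As a rough illustration of the difference between the two approaches, here is a minimal Python sketch. The function names, the field names (genus, species, authorship) and the use of the standard library's difflib are assumptions made for illustration only; they are not taken from the paper mentioned above or from any existing matching service.<o:p></o:p></p>
<pre>
```python
from difflib import SequenceMatcher


def string_similarity(name_a, name_b):
    """Approach 1: compare the raw name strings as whole sequences."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def structured_agreement(record_a, record_b, fields=("genus", "species", "authorship")):
    """Approach 2: compare parsed name components field by field.

    The field names are hypothetical; a real system would use its own schema.
    """
    return {field: record_a.get(field) == record_b.get(field) for field in fields}
```
</pre>
<p class="MsoNormal">The string approach returns a single similarity score for the whole name, whereas the structured approach reports which component disagrees, which may be easier to standardise across systems.<o:p></o:p></p>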
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">It does indeed become a slippery slope, and this is one of the reasons I am keen to promote some sort of infrastructure/configurability of a matching system, so that end users can configure a matching algorithm/workflow
 to suit their particular data.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Kevin<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:
"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;
font-family:"Tahoma","sans-serif""> Tony.Rees@csiro.au [mailto:Tony.Rees@csiro.au]
<br>
<b>Sent:</b> Wednesday, 7 July 2010 7:06 p.m.<br>
<b>To:</b> Kevin Richards; tdwg-tag@lists.tdwg.org<br>
<b>Cc:</b> tdwg-content@lists.tdwg.org<br>
<b>Subject:</b> RE: WPS for Names<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">Dear all (actually: I’m not sure who are the recipients currently on tdwg-tag, maybe I will now find out!!)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">As some on this list may be aware, this is an area that has been of interest to me for quite some time (for example see:
<a href="http://taxacom.markmail.org/message/ywq7ijiaeks7heiv">http://taxacom.markmail.org/message/ywq7ijiaeks7heiv</a> ), so happy to see what can be done in this space.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">Currently my algorithm is web accessible. It tests supplied genus names against the genus names held, and supplied genus+species combinations against both the genus names and the genus+species combinations held, in my “IRMNG” reference database (search entry point is at
<a href="http://www.cmar.csiro.au/datacentre/irmng/">http://www.cmar.csiro.au/datacentre/irmng/</a> if interested). I am also planning to implement a degree of cross-rank matching shortly, e.g. if a subgenus is supplied, test this as a possible genus against
 genus+species combinations (as this often turns out to be the reason for a direct mismatch in practice), and the same with infraspecies vs. subspecies (my current interface does not yet handle infraspecies, and just detects then “parks” apparent subgenera, but the
 intention is to handle these as testable components in due course).<o:p></o:p></span></p>
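<p class="MsoNormal">The cross-rank idea described above (testing a supplied subgenus as a possible genus) could be sketched roughly as follows. This is only an illustration under stated assumptions: the regular expression, the function names and the in-memory reference set are hypothetical and do not reflect how the IRMNG matching is actually implemented.<o:p></o:p></p>
<pre>
```python
import re

# Hypothetical pattern for "Genus (Subgenus) species"; the subgenus part is optional.
NAME_PATTERN = re.compile(r"^([A-Z][a-z]+)(?:\s+\(([A-Z][a-z]+)\))?\s+([a-z-]+)$")


def candidate_binomials(name_string):
    """Return genus+species candidates, also trying a supplied subgenus as a genus."""
    match = NAME_PATTERN.match(name_string.strip())
    if match is None:
        return []
    genus, subgenus, species = match.groups()
    candidates = [genus + " " + species]
    if subgenus is not None:
        candidates.append(subgenus + " " + species)
    return candidates


def cross_rank_match(name_string, reference_names):
    """Return the first candidate held in the reference set, or None."""
    for candidate in candidate_binomials(name_string):
        if candidate in reference_names:
            return candidate
    return None
```
</pre>
<p class="MsoNormal">For example, with a reference set holding only "Erosaria caputserpentis" (illustrative input), the string "Cypraea (Erosaria) caputserpentis" fails as "Cypraea caputserpentis" but matches once the subgenus is tried in the genus position.<o:p></o:p></p>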
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">Maybe I will set up the above options and let you know when they are available for testing. Also I may look for genus+species concatenated (think Homosapiens),
 genus+subgenus+species with missing brackets around subgenus, and maybe other things, as per my somewhat extensive exposure to otherwise non-resolved namestrings floating around in OBIS/GBIF data provider space. Of course it is a slippery slope; other examples
 are family in genus field and vice versa, or similarly a common name; genus and species reversed; truncated names not flagged as such; abbreviated genera (which I already handle as exact, but not fuzzy, matches at this time, at least as “H. sapiens” etc.); and more.<o:p></o:p></span></p>
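<p class="MsoNormal">A few of the cases listed above (concatenated genus+species, missing brackets around a subgenus, abbreviated genera) could be handled by small pre-processing helpers before matching. The sketch below is hypothetical: the function names and the idea of passing in sets of known genera and binomials are assumptions for illustration, not a description of any existing service.<o:p></o:p></p>
<pre>
```python
import re


def split_concatenated(token, known_genera):
    """Split e.g. 'Homosapiens' into ('Homo', 'sapiens') using known genus names."""
    # Try longer genus names first so the most specific prefix wins.
    for genus in sorted(known_genera, key=len, reverse=True):
        if token.startswith(genus) and len(token) != len(genus):
            return genus, token[len(genus):]
    return None


def add_subgenus_brackets(name_string):
    """Rewrite 'Genus Subgenus species' as 'Genus (Subgenus) species'."""
    parts = name_string.split()
    if len(parts) == 3 and parts[1][:1].isupper() and parts[2].islower():
        return "{} ({}) {}".format(*parts)
    return name_string


def expand_abbreviated_genus(name_string, known_binomials):
    """Exact-match 'H. sapiens' against binomials sharing the initial and epithet."""
    match = re.match(r"^([A-Z])\.\s*([a-z-]+)$", name_string.strip())
    if match is None:
        return []
    initial, species = match.groups()
    return sorted(b for b in known_binomials
                  if b.startswith(initial) and b.split()[1:] == [species])
```
</pre>
<p class="MsoNormal">Each helper covers only the exact (non-fuzzy) form of its case; an abbreviated genus can legitimately expand to more than one held name, so the last helper returns a list rather than a single result.<o:p></o:p></p>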
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">Any comments on the above welcome,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">Regards - Tony<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy">Tony Rees<br>
Manager, Divisional Data Centre,<br>
CSIRO Marine and Atmospheric Research,<br>
GPO Box 1538,<br>
Hobart, Tasmania 7001, Australia<br>
Ph: 0362 325318 (Int: +61 362 325318)<br>
Fax: 0362 325000 (Int: +61 362 325000)<br>
e-mail: <a href="mailto:Tony.Rees@csiro.au" title="mailto:Tony.Rees@csiro.au">Tony.Rees@csiro.au</a><br>
Manager, OBIS Australia regional node, http://www.obis.org.au/ <br>
Biodiversity informatics research activities: <a href="http://www.cmar.csiro.au/datacentre/biodiversity.htm">
http://www.cmar.csiro.au/datacentre/biodiversity.htm</a><br>
Personal info: <a href="http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566" title="http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566">http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566</a><o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.0pt;font-family:"Arial","sans-serif";
color:navy"><o:p> </o:p></span></p>
<div>
<div class="MsoNormal" align="center" style="text-align:center"><span lang="EN-US" style="font-size:12.0pt;font-family:"Times New Roman","serif"">
<hr size="2" width="100%" align="center">
</span></div>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:
"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;
font-family:"Tahoma","sans-serif""> tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org]
<b>On Behalf Of </b>Kevin Richards<br>
<b>Sent:</b> Wednesday, 7 July 2010 11:55 AM<br>
<b>To:</b> tdwg-tag@lists.tdwg.org<br>
<b>Cc:</b> tdwg-content@lists.tdwg.org<br>
<b>Subject:</b> [tdwg-tag] WPS for Names</span><span lang="EN-US" style="font-size:12.0pt;font-family:"Times New Roman","serif""><o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal">I have been pondering taxon name matching type services lately…<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I wonder if the OGC WPS (Web Processing Service) would make a good platform for integrating the various name matching algorithms that are being worked on lately.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I was imagining something like a web interface where you can view a list of the available algorithms and select different algorithms in different orders to get the best set of match results for your own list of name strings/data.<o:p></o:p></p>
<p class="MsoNormal">If everyone set up their algorithms as a WPS then this interface would call each WPS in the appropriate order until the end of the configured workflow path.<o:p></o:p></p>
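<p class="MsoNormal">The configured workflow described above could be sketched as an ordered list of matching steps, each step receiving only the names that earlier steps left unmatched. In this hypothetical Python sketch the two matchers are local stand-in functions; in a real deployment each step would presumably wrap a call to a remote WPS, which is not shown here. All names in the code are assumptions for illustration.<o:p></o:p></p>
<pre>
```python
def exact_matcher(names, reference):
    """Stand-in for one matching service: exact string comparison."""
    return {name: name for name in names if name in reference}


def case_insensitive_matcher(names, reference):
    """Stand-in for a second service: comparison ignoring letter case."""
    lookup = {ref.lower(): ref for ref in reference}
    return {name: lookup[name.lower()] for name in names if name.lower() in lookup}


def run_workflow(names, reference, steps):
    """Call each step in the configured order on the names still unmatched.

    Returns (results, remaining), where results maps each matched input name
    to a (matched reference name, step label) pair.
    """
    results = {}
    remaining = list(names)
    for label, step in steps:
        matched = step(remaining, reference)
        for name, hit in matched.items():
            results[name] = (hit, label)
        remaining = [name for name in remaining if name not in matched]
        if not remaining:
            break  # nothing left to pass along the workflow path
    return results, remaining
```
</pre>
<p class="MsoNormal">Reordering the entries in the steps list is the code equivalent of rearranging the boxes in the user interface; recording the step label alongside each result lets the user see which algorithm produced each match.<o:p></o:p></p>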
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">UI something like (in diagram):<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><img border="0" width="658" height="618" id="_x0000_i1025" src="cid:image001.gif@01CB1E7E.666E0530"><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The bottom part is configurable by the user, with each box representing a WPS that performs one matching step.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Any thoughts?<o:p></o:p></p>
<p class="MsoNormal">Perhaps something that could be discussed at TDWG?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">One issue would be how to define/specify the list of names to match against – when you pass the processing to a match routine, how would it access the names list to match against?  Perhaps it could all be based on one server and people could
 submit algorithm/WPS services to it?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hmmmm, will keep dreaming …<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Kevin<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Times New Roman","serif""><o:p> </o:p></span></p>
<div class="MsoNormal" align="center" style="text-align:center"><span style="font-size:12.0pt;font-family:"Times New Roman","serif"">
<hr size="2" width="100%" align="center">
</span></div>
<p class="MsoNormal"><span style="font-size:7.5pt;font-family:"Arial","sans-serif";
color:green">Please consider the environment before printing this email<br>
Warning: This electronic message together with any attachments is confidential. If you receive it in error: (i) you must not read, use, disclose, copy or retain it; (ii) please contact the sender immediately by reply email and then delete the emails.<br>
The views expressed in this email may not be those of Landcare Research New Zealand Limited. http://www.landcareresearch.co.nz</span><span style="font-size:12.0pt;font-family:"Times New Roman","serif""><o:p></o:p></span></p>
</div>
</body>
</html>