[tdwg-tag] WPS for Names

Evgeniy Meyke evgeniy at earthcape.com
Wed Jul 7 23:49:54 CEST 2010


Hi Kevin, Tony, and all

 

As one of the very likely consumers of such a service I’ll be following this
with great interest.

 

A quick question at this point: is anything like this planned for GNA
(www.globalnames.org) or GBIF?

 

Evgeniy

 

From: tdwg-tag-bounces at lists.tdwg.org
[mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of Kevin Richards
Sent: 8 July 2010 0:32
To: Tony.Rees at csiro.au; tdwg-tag at lists.tdwg.org
Cc: tdwg-content at lists.tdwg.org
Subject: Re: [tdwg-tag] WPS for Names

 

Thanks Tony.  I would be interested in any collaboration we can do in this
area (assuming I find 5 minutes to work on it  :-))

 

There seem to be several approaches to name matching/integration – one
working with the name strings, and one working with more structured data.
It would be good to clarify and perhaps standardise these approaches.  (The
second approach is discussed in a recent paper of mine.)

 

It does indeed become a slippery slope, and this is one of the reasons I am
keen to promote some sort of infrastructure/configurability of a matching
system, so that end users can configure a matching algorithm/workflow to
suit their particular data.

 

Kevin

 

 

From: Tony.Rees at csiro.au [mailto:Tony.Rees at csiro.au] 
Sent: Wednesday, 7 July 2010 7:06 p.m.
To: Kevin Richards; tdwg-tag at lists.tdwg.org
Cc: tdwg-content at lists.tdwg.org
Subject: RE: WPS for Names

 

Dear all (actually: I’m not sure who are the recipients currently on
tdwg-tag, maybe I will now find out!!)

 

As some on this list may be aware, this is an area that has been of interest
to me for quite some time (for example see:
http://taxacom.markmail.org/message/ywq7ijiaeks7heiv ), so happy to see what
can be done in this space.

 

Currently my algorithm is web accessible. It tests designated genus names
against the genus names held, and genus+species combinations against both
the genus-only names and the genus+species combinations held in my “IRMNG”
reference database (search entry point is at
http://www.cmar.csiro.au/datacentre/irmng/ if interested). I am also
planning to implement a degree of cross-rank matching shortly, e.g. if a
subgenus is supplied, test this as a possible genus against genus+species
combinations (as this often turns out to be the reason for a direct mismatch
in practice), and the same with infraspecies vs. subspecies. My current
interface does not yet handle infraspecies, and just detects and then “parks”
apparent subgenera, but the intention is to handle these as testable
components in due course.
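
The cross-rank test described above could look roughly like the following
Python sketch. It is not the IRMNG algorithm: the reference list, the regular
expression, the function name match_binomial and the use of difflib for the
fuzzy step are all assumptions made for illustration.

```python
import re
from difflib import get_close_matches

# Hypothetical reference list; a real service would query a names database.
BINOMIALS = ["Helix pomatia", "Cepaea nemoralis", "Homo sapiens"]

# "Genus species" or "Genus (Subgenus) species"
NAME_RE = re.compile(r"^(\w+)(?:\s+\((\w+)\))?\s+(\w+)$")


def match_binomial(name, cutoff=0.85):
    """Return matching reference binomials, trying exact before fuzzy."""
    m = NAME_RE.match(name.strip())
    if not m:
        return []
    genus, subgenus, species = m.groups()
    candidates = [f"{genus} {species}"]
    if subgenus:
        # Cross-rank test: treat the supplied subgenus as a possible genus.
        candidates.append(f"{subgenus} {species}")
    for cand in candidates:
        if cand in BINOMIALS:
            return [cand]
    # Fuzzy fallback (difflib stands in for a real name-matching algorithm).
    for cand in candidates:
        hits = get_close_matches(cand, BINOMIALS, n=3, cutoff=cutoff)
        if hits:
            return hits
    return []
```

With this toy list, "Helix (Cepaea) nemoralis" fails as "Helix nemoralis" but
resolves once the subgenus is tried as the genus.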

 

Maybe I will set up the above options and let you know as available for
testing. Also I may look for genus+species concatenated (think Homosapiens),
genus+subgenus+species with missing brackets around subgenus, and maybe
other things, as per my somewhat extensive exposure to otherwise
non-resolved namestrings floating around in OBIS/GBIF data provider space.
Of course it is a slippery slope; other examples are family in genus field
and vice versa, or common name similar; genus and species reversed;
truncated names not flagged as such; abbreviated genera (which I already
handle as exact, but not fuzzy matches at this time, at least as “H.
sapiens” etc.); more..
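
A few of the cases listed above (concatenated genus+species, subgenus with
missing brackets, genus and species reversed, abbreviated genus) can be
sketched as a set of candidate parses that are each tested against the
reference data. The reference sets and the function names candidate_parses
and resolve are hypothetical, and abbreviated genera are matched exactly
only, as in the description above.

```python
# Hypothetical reference data; a real service would query a names database.
KNOWN_GENERA = {"Homo", "Pan"}
KNOWN_BINOMIALS = {("Homo", "sapiens"), ("Pan", "troglodytes")}


def candidate_parses(raw):
    """Yield (genus, species) guesses for a possibly malformed name string."""
    parts = raw.split()
    if len(parts) == 1:
        # Concatenated genus+species, e.g. "Homosapiens".
        word = parts[0]
        for genus in sorted(KNOWN_GENERA):
            if word.lower().startswith(genus.lower()) and len(word) > len(genus):
                yield genus, word[len(genus):].lower()
    elif len(parts) == 2:
        first, second = parts
        yield first.capitalize(), second.lower()
        # Genus and species reversed, e.g. "sapiens Homo".
        yield second.capitalize(), first.lower()
    elif len(parts) == 3:
        # Possible subgenus with missing brackets: "Genus Subgenus species".
        # (Ambiguous: the string could also be a trinomial.)
        yield parts[0].capitalize(), parts[2].lower()


def resolve(raw):
    """Return the first reference binomial that a candidate parse hits."""
    for genus, species in candidate_parses(raw):
        if genus.endswith("."):
            # Abbreviated genus, e.g. "H. sapiens": accept only a unique hit.
            prefix = genus[:-1]
            hits = [b for b in KNOWN_BINOMIALS
                    if b[0].startswith(prefix) and b[1] == species]
            if len(hits) == 1:
                return hits[0]
        elif (genus, species) in KNOWN_BINOMIALS:
            return (genus, species)
    return None
```

Each rule only proposes a candidate; nothing is accepted unless it is found in
the reference data, which keeps the rules from inventing names.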

 

Any comments on the above welcome,

 

Regards - Tony

 

Tony Rees
Manager, Divisional Data Centre,
CSIRO Marine and Atmospheric Research,
GPO Box 1538,
Hobart, Tasmania 7001, Australia
Ph: 0362 325318 (Int: +61 362 325318)
Fax: 0362 325000 (Int: +61 362 325000)
e-mail: Tony.Rees at csiro.au
Manager, OBIS Australia regional node, http://www.obis.org.au/ 
Biodiversity informatics research activities:
http://www.cmar.csiro.au/datacentre/biodiversity.htm
Personal info:
http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566

 

  _____  

From: tdwg-tag-bounces at lists.tdwg.org
[mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of Kevin Richards
Sent: Wednesday, 7 July 2010 11:55 AM
To: tdwg-tag at lists.tdwg.org
Cc: tdwg-content at lists.tdwg.org
Subject: [tdwg-tag] WPS for Names

 

I have been pondering taxon name matching type services lately.


 

I wonder if the OGC WPS (Web Processing Service) would make a good platform
for integrating the various name matching algorithms that are being worked
on lately.

 

I was imagining something like a web interface where you can view
a list of the available algorithms and select different algorithms in
different orders to get the best set of match results for your own list of
name strings/data.

If everyone set up their algorithms as a WPS then this interface would call
each WPS in the appropriate order until the end of the configured workflow
path.
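
A minimal Python sketch of such a configurable workflow is given here, under
the assumption that each matching service can be wrapped as a function taking
the still-unmatched names and returning the ones it resolved. The two
matchers are local stand-ins for remote WPS calls, and all names in the code
are hypothetical.

```python
# Hypothetical reference list shared by the stand-in matchers.
REFERENCE = {"Homo sapiens", "Pan troglodytes"}


def exact(names):
    """Stand-in for a WPS that does exact string matching."""
    return {n: n for n in names if n in REFERENCE}


def case_insensitive(names):
    """Stand-in for a WPS that ignores letter case."""
    lookup = {r.lower(): r for r in REFERENCE}
    return {n: lookup[n.lower()] for n in names if n.lower() in lookup}


def run_workflow(names, steps):
    """Call each (label, matcher) in order on the names not yet matched.

    Returns (matched, remaining): matched maps an input name to a
    (reference name, label of the step that matched it) pair.
    """
    matched = {}
    remaining = list(names)
    for label, matcher in steps:
        if not remaining:
            break
        results = matcher(remaining)
        for name, hit in results.items():
            matched[name] = (hit, label)
        remaining = [n for n in remaining if n not in results]
    return matched, remaining


matched, remaining = run_workflow(
    ["Homo sapiens", "pan TROGLODYTES", "Foo bar"],
    [("exact", exact), ("case-insensitive", case_insensitive)],
)
```

The user-configurable part is the steps list: reordering it or swapping in
another matcher changes the workflow without touching run_workflow.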

 

UI something like (in diagram):

 



 

Where the bottom part is configurable by the user.  Each box being a
representation of a WPS service for doing the match.

 

Any thoughts?

Perhaps something that could be discussed at TDWG?

 

One issue would be how to define/specify the list of names to match against
– when you pass processing to a match routine, how would it access the
names list to match against?  Perhaps it could all be based on one server and
people could submit algorithm/WPS services to it?
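
One possible answer, sketched in Python, is to pass the location of the names
list to each process as an ordinary input. The sketch builds a key-value GET
request in the style of WPS 1.0.0 Execute as I understand it; the endpoint,
the process identifier and the input names "names" and "reference" are
hypothetical, not part of any existing service.

```python
from urllib.parse import urlencode


def build_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0-style key-value Execute URL.

    Inputs are joined as "key=value;key=value". Values that themselves
    contain ";" or "=" would need extra escaping, which is not handled here.
    """
    data_inputs = ";".join(f"{key}={value}" for key, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "Identifier": process_id,
        "DataInputs": data_inputs,
    })
    return f"{endpoint}?{query}"


url = build_execute_url(
    "http://example.org/wps",            # hypothetical endpoint
    "FuzzyGenusMatch",                   # hypothetical process identifier
    {
        "names": "Homo sapeins",         # name string to match
        "reference": "http://example.org/names.txt",  # hypothetical names list
    },
)
```

Passing the reference list by URL would let every algorithm in a workflow
match against the same list without all of them living on one server.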

 

Hmmmm, will keep dreaming 


 

Kevin

 

 

  _____  

Please consider the environment before printing this email
Warning: This electronic message together with any attachments is
confidential. If you receive it in error: (i) you must not read, use,
disclose, copy or retain it; (ii) please contact the sender immediately by
reply email and then delete the emails.
The views expressed in this email may not be those of Landcare Research New
Zealand Limited. http://www.landcareresearch.co.nz

 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 15882 bytes
Desc: not available
Url : http://lists.tdwg.org/pipermail/tdwg-tag/attachments/20100708/04ec0b13/attachment.gif 

