[tdwg-content] Canonical name parsing

Roderic Page r.page at bio.gla.ac.uk
Wed Mar 14 17:59:26 CET 2012


OK, a slightly less ranty take on this.

It seems to me there are a couple of "use cases" we need to tease apart.

If the goal is to digitise and aggregate (e.g., transcribe museum labels, aggregate from old museum databases) then ultimately yes, you take whatever people give you and clean it. Ideally you'd give people tools to clean their data before they serve it, but you try to make the barriers as low as possible.

But once you have that data, for the love of God (oops, rant), don't simply repackage dirty data. What motivated my initial tweet was having to deal with the EOL API, where names may or may not have authorities, depending in part upon what the source of the classification was. It's a mess, and I don't understand why people insist on serving this stuff.

So I guess my point is that if you're a primary provider (i.e., you're the person or institution that digitised the original metadata) then you have an excuse, but if you are serving data you got from someplace else, give me canonical names.
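
A minimal sketch of what "give me canonical names" could mean in practice, assuming a canonical name is simply the genus plus any lowercase epithets with the authority dropped. The function name canonical_name and the EPITHET pattern are hypothetical, made up for this illustration; this is not how EOL or any existing name parser works.

```python
import re

# Assumption: epithets are lowercase words (hyphens allowed). The first token
# that does not look like an epithet is treated as the start of the authority.
EPITHET = re.compile(r"^[a-z][a-z-]*$")


def canonical_name(name_string):
    """Return the genus plus lowercase epithets, dropping a trailing authority."""
    tokens = name_string.split()
    if not tokens:
        return ""
    canonical = [tokens[0]]  # first token is assumed to be the genus
    for token in tokens[1:]:
        if not EPITHET.match(token):
            break  # capitalised word, bracket, or year: authority starts here
        canonical.append(token)
    return " ".join(canonical)


print(canonical_name("Homo sapiens Linnaeus, 1758"))
print(canonical_name("Panthera leo (Linnaeus, 1758)"))
```

This is deliberately naive: it would wrongly keep lowercase author particles such as "de" or "van", and it stops at rank markers such as "var." or "subsp.", so real data would need a proper name parser. The point is only that an aggregator can do this once, so that every consumer doesn't have to.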

Regards

Rod


---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
Skype: rdmpage
AIM: rodpage1962 at aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
