[tdwg-tag] Differences in thinking between TDWG and LinkedData groups about data sharing / integration

Bob Morris morris.bob at gmail.com
Fri Apr 24 04:24:35 CEST 2009


Also, the Plazi EoL REST service for descriptions extracted from legacy
literature has particularly minimal use of TaxonOccurrence objects.  See a
sample, including the RESTful service spec, at
http://wiki.tdwg.org/twiki/bin/view/SPM/PlaziEOLProject

Bob Morris


On Thu, Apr 23, 2009 at 10:16 PM, Kevin Richards <
RichardsK at landcareresearch.co.nz> wrote:

>  Pete
>
>
>
> You should have sent something around on the mailing list, I could have
> given you an example of a TaxonOccurrence.  Or perhaps you did and I missed
> it???
>
>
>
> Anyway, with Herb IMI, Paul Kirk and I have set up a resolver to provide
> TaxonOccurrence RDF data,
>
> see for example urn:lsid:herbimi.info:specimens:100069 (or
> http://lsid.herbimi.info/authority/metadata/?lsid=urn:lsid:herbimi.info:specimens:100069 in your browser).  It also has an example of using Interaction data - ie in
> this case a host plant (IPNI ID) of a fungus (Herb IMI specimen) with an
> identification to a taxon concept and name (Index Fungorum name).
>
>
>
>
>
> Jim - feel free to help improve the ideas and processes of TDWG if you find
> them that bad.   :-)
>
>
>
>
>
> Kevin
>
>
>
>
>
>
>
> *From:* tdwg-tag-bounces at lists.tdwg.org [mailto:
> tdwg-tag-bounces at lists.tdwg.org] *On Behalf Of *Peter DeVries
> *Sent:* Friday, 24 April 2009 1:46 p.m.
> *To:* Jim Croft
> *Cc:* Bob Morris; tdwg-tag at lists.tdwg.org
> *Subject:* Re: [tdwg-tag] Differences in thinking between TDWG and
> LinkedData groups about data sharing / integration
>
>
>
> Hi Jim,
>
>
>
> Thanks for keeping in good humor. :-)
>
>
>
> I was trying to get my head around the TaxonOccurrence standard so I could
> rewrite my observation records,
>
> and was hoping to find some examples. That experience, and some related
> issues, made me a little irritable,
>
> for that I am sorry.
>
>
>
> It might be useful to make up a test set. It does not even need to be real.
> In my experience, this is where
>
> you really start to see if the standard does what you want it to do. Many
> times I have thought that I had
>
> everything figured out, only to discover after loading data into my triple
> store that something will not work
>
> or something that used to work was now broken by the addition of a new
> feature.
>
>
>
> I am not bothered by the idea of branding; it just seemed that it was
> being confused with the goal of
>
> persistence in what seemed to be an unproductive thread.
>
>
>
> Thanks again for being amused, tolerant and insightful. :-)
>
>
>
> - Pete
>
>
>
>
>
> On Thu, Apr 23, 2009 at 7:50 PM, Jim Croft <jim.croft at gmail.com> wrote:
>
> Can't let this great opportunity pass...  :)
>
> It is not just us.  It is the regrettable human condition.  See also:
>
> http://www.informationweek.com/news/infrastructure/management/showArticle.jhtml?articleID=216600011
>
> <diatribe_alert/>
>
> One of the intriguing and enigmatic things about TDWG is that it does
> not seem to respect its own standards, preferring to invent another
> set rather than fix or enhance what is already there.  More than the
> 'not invented here' syndrome, we have to deal with 'not invented here
> this week' and the user is left with the impression of a happy
> self-absorbed group chasing a flock of flitting butterflies - 'oh,
> that's pretty, I will try and catch that one'.  Yep, just the human
> condition.  In another context, Roger H,
> http://www.hyam.net/blog/archives/346, has provided the best
> description of TDWG and its standards I think I have ever heard: "It
> is kind of like the guy selling seagulls on the beach. You give him £5
> and he points into the sky and says 'That one is yours'. Yes he is
> providing a service but the relationship that counts is the one
> between you and the gull."  This is both profound and frightening in
> how close it is to reality.
>
> On the complexity and costs of implementation, you are absolutely
> right.  Taxonomic database standards have moved out of the domain of
> taxonomists, they no longer understand what we are talking about
> (hell, I no longer understand what we are talking about.  And the rest
> of you?  c'mon now, be honest... :) and we are left with this
> relationship of 'trust us' paternalism that I do not think is all that
> healthy - we all trust and respect Microsoft, right? It is getting
> increasingly difficult to give advice to a bod with a beat up PC in a
> developing country on what they should do - the alphabet soup
> surrounding GUIDS and the like just does not cut it when the mind set
> is an Excel spreadsheet with a bunch of intuitive headings.  Oh, I'm
> sorry - you mean it is not just in developing countries?  There is a
> widening gap between 'Joe the Taxonomist' and the business of taxonomy
> data standards that we do not seem to be able to address (do we even
> care?).  We used to be able to give our staff in the herbarium a TDWG
> standard and say 'this is what we have to do.'  Not anymore...  Ah,
> the good ole days...  Maybe the answer lies in the hegemony of a new
> and benevolent  'MSOffice for Taxonomy'.  Could work, but I do not
> think it is going to be particularly satisfying.
>
> On the reuse issue, Greg W argues that we do not do nearly enough of
> this and I agree with him.  He argues that TDWG should focus on the
> standard and not the application and implementation of the standard
> and has proposed that in our vocabularies we should adopt the
> principles of nomenclatural priority, that is, going back to *Dublin*
> Core, adding stuff chronologically from other standards,
> including our own, until there is no option other than to invent
> another one, or there is nothing in our domain left to standardize.
> For taxonomists there is something inherently attractive in this
> approach - don't describe a taxon where it already exists, don't
> invent a standard where one already exists.  To retrofit this and
> untangle all the synonymy and homonymy in our existing standards and
> implementations is going to take a lot of work though.  But the
> vocabularies and ontologies are a good start.
>
> On the 'branding' issue, it is not so much branding but attribution.
> Apart from the moral and legal issues, it is unscientific not to
> attribute, source and provide lineage for data.  There is no
> optionality.  We have to do it.  Even if the initial supplier
> 'disappears'.  *Especially* if the initial supplier disappears.
> Attribution (branding if you like) is absolutely essential for
> credibility.  If someone is not going to do it, they can not have our
> data, and we will not use theirs.
>
> On the architectural issue, I can not really get all that hung up on
> it.  If a standard is good, it should be able to be implemented in a
> number of architectures (isn't that almost a definition of
> interoperability?).  Where things get 'interesting' is when
> architecture (and the continuum towards application) becomes the
> standard or part of the standard.  TDWG needs to constantly ask itself
> to what extent it needs to get involved with implementation of the
> standards it promotes.  I would argue 'not at all', but this is
> another discussion.
>
> And the 'insider' cabalistic nature of TDWG?   What can I say - it has
> always been this way.  A standard attracts a champion and the champion
> establishes a fiefdom of acolytes around it.  Yep, the human
> condition.  And it sort of works.  Some of the time.  (btw - another
> artifact of the human condition - your brilliant ideas are never
> perceived as such until someone else has them - just ask poor old
> Wallace how he is feeling this year).  A downside of this approach is
> that the various TDWG standards are very poorly coordinated between
> each other - this is something we should be able to do something
> about.
>
> We could piss on the TDWG tent from the outside, but you have to
> agree, it is much more satisfying to get inside the tent and piss on
> and piss off everyone in it...  :)
>
> <disclaimer>None of the above ideas are mine.  I am following the TDWG
> standard practice of restating them without attribution :)
> </disclaimer>
>
> Ah... that was fun...
>
> jim
>
>
> On Fri, Apr 24, 2009 at 5:09 AM, Peter DeVries <pete.devries at gmail.com>
> wrote:
> > Respectfully,
> > 1) Only certain classes of organizations will be able to contribute, since
> >     the standard requires special skills: namely, those groups that can pay
> >     for hardware and a person specific to this standard in perpetuity. I
> >     look at this and think that a number of groups that could be providers
> >     cannot because of the way the system is implemented. Why not have a
> >     simple RDF tar or zip file format that GBIF checks with a crawler
> >     every night?
> > 2) There is very little reuse of existing vocabularies, geo for instance.
> >     Similar to the "not invented here" mentality.
> > 3) Discussions and decisions seem to be too much about making sure that
> > providers keep their "brand" on the data even if they disappear.
> > 4) Suggestions or alternative ways of thinking are rejected until an
> >     insider restates them without attribution.
> > 5) It is not at all clear how some of these decisions are made. It appears
> >     as if some people disagree, there is discussion. Then years later there
> >     is the same discussion. It seems that some smaller group keeps pulling
> >     everyone back to the same architectural decision.
> > 6) Where are the example data sets? We should have some example data sets
> > available to see if the standard can be used to answer real questions?
> >     Either they don't exist or they are only available to a few.
> > I actually have nothing but praise for GBIF and uBio (except for the minor
> > encoding thing); this is more about trying to work within TDWG and getting
> > stonewalled. I am having the same feelings about it that I had a few years
> > ago, after which I left to try to make something that worked so I could
> > proceed with my project.
> > It probably was unfair to imply that the fiefdoms are by design, rather than
> > a side effect of the implementation standards, and for that I apologize.
> > - Pete
> >
> >
> > On Thu, Apr 23, 2009 at 10:56 AM, Bob Morris <morris.bob at gmail.com>
> wrote:
> >>
> >> "described by anyone" is not the same as "described by anyone in any way
> >> convenient to the describer", so I find this quotation somewhat
> >> disingenuous. More precisely, I wonder what TDWG standard or proposed
> >> standard you find enables fiefdoms in ways that are impossible under some
> >> other solution to the problem the standard addresses.
> >>
> >> Bob Morris
> >>
> >> On Thu, Apr 23, 2009 at 10:47 AM, Peter DeVries <pete.devries at gmail.com
> >
> >> wrote:
> >>>
> >>> This paragraph below seems to encapsulate the differences in thinking
> >>> between the linkeddata community and
> >>> some of the TDWG people on how to best share biodiversity data.
> >>> "The notion of a fabric of resources that are individually described,
> >>> queried, and resolved may seem unmanageable or like science fiction. For
> >>> organizations that are used to large, manual, centralized efforts to
> >>> standardize on everything, it may seem anarchic to allow resources to grow
> >>> organically and be described by anyone. The same people would probably not
> >>> believe the Web possible in the first place if there were not already ample
> >>> proof of its success."
> >>> REST for Java developers, Part 4: The future is RESTful
> >>>
> >>> From http://www.javaworld.com/javaworld/jw-04-2009/jw-04-rest-series-4.html?page=4
> >>> I think that some people may have lost sight of the goal of making data
> >>> available to improve the understanding of our
> >>> natural world and hopefully better manage our natural resources.
> >>> It does not seem that creating a distributed network of fiefdoms will
> >>> help us achieve this goal.
> >>> - Pete
> >>> I was led to this article by @janzemanek on twitter.
> >>> ---------------------------------------------------------------
> >>> Pete DeVries
> >>> Department of Entomology
> >>> University of Wisconsin - Madison
> >>> 445 Russell Laboratories
> >>> 1630 Linden Drive
> >>> Madison, WI 53706
> >>> ------------------------------------------------------------
> >>>
> >>> _______________________________________________
> >>> tdwg-tag mailing list
> >>> tdwg-tag at lists.tdwg.org
> >>> http://lists.tdwg.org/mailman/listinfo/tdwg-tag
> >>>
> >>
> >>
> >>
> >> --
> >> Robert A. Morris
> >> Professor of Computer Science
> >> UMASS-Boston
> >> ram at cs.umb.edu
> >> http://bdei.cs.umb.edu/
> >> http://www.cs.umb.edu/~ram
> >> http://www.cs.umb.edu/~ram/calendar.html
> >> phone (+1)617 287 6466
> >
> >
> >
> >
> >
>
>
>
> --
>
> _________________
> Jim Croft ~ jim.croft at gmail.com ~ +61-2-62509499
>
> "Words, as is well known, are the great foes of reality."
> - Joseph Conrad, author (1857-1924)
>
> "I know that you believe that you understood what you think I said,
> but I am not sure you realize that what you heard is not what I meant."
>  - attributed to Robert McCloskey, US State Department spokesman
>
>
>
>
>
>



-- 
Robert A. Morris
Professor of Computer Science
UMASS-Boston
ram at cs.umb.edu
http://bdei.cs.umb.edu/
http://www.cs.umb.edu/~ram
http://www.cs.umb.edu/~ram/calendar.html
phone (+1)617 287 6466