# [tdwg-tag] Differences in thinking between TDWG and LinkedData groups about data sharing / integration

"Markus Döring (GBIF)" mdoering at gbif.org
Fri Apr 24 11:39:58 CEST 2009

Pete,
another custom, non-RDF application of it is the GBIF REST services,
e.g.:
http://data.gbif.org/ws/rest/occurrence/get/1003321

With regard to interoperable standards applicable to different
architectures, I would like to point out the revised Darwin Core terms,
which are not tied to a technical implementation per se. Just like
Dublin Core, the Darwin Core group has preferred to provide several
best-practice guidelines on how to use the same DwC terms in the
context of XML, RDF or simple text, our (GBIF) latest favorite
encoding. Because the terms are rather simple, you can encode and
transform the exact same data for a variety of technical
architectures.
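
As a minimal sketch of that point, the Python below takes one occurrence record keyed by Darwin Core term names and writes it out as tab-delimited text, as simple XML and as N-Triples style RDF. The record values, the subject URI, the XML wrapper element and the function names are made up for illustration; only the term names and the http://rs.tdwg.org/dwc/terms/ namespace come from Darwin Core, and none of this is meant to stand in for the group's own guidelines for each encoding.

```python
import csv
import io
import xml.etree.ElementTree as ET

DWC_NS = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core terms namespace

# Hypothetical record; the keys are Darwin Core term names.
record = {
    "catalogNumber": "12345",
    "scientificName": "Puma concolor",
    "country": "Argentina",
    "eventDate": "2009-04-24",
}

def to_text(rec):
    """Tab-delimited text: a header row of term names, then the values."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(rec.keys())
    writer.writerow(rec.values())
    return buf.getvalue()

def to_xml(rec):
    """Simple XML: one namespaced element per term (wrapper name is made up)."""
    ET.register_namespace("dwc", DWC_NS)
    root = ET.Element("record")
    for term, value in rec.items():
        ET.SubElement(root, f"{{{DWC_NS}}}{term}").text = value
    return ET.tostring(root, encoding="unicode")

def to_ntriples(rec, subject):
    """RDF as N-Triples: one triple per term, values as plain literals."""
    lines = []
    for term, value in rec.items():
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{DWC_NS}{term}> "{literal}" .')
    return "\n".join(lines)

if __name__ == "__main__":
    print(to_text(record))
    print(to_xml(record))
    print(to_ntriples(record, "http://example.org/occurrence/12345"))
```

The same dictionary feeds all three functions, so the terms stay fixed and only the serialization changes.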

Wouldn't it be great to have this minimal set of biodiversity concepts
available for RDF(a), XML, text, HTML, RSS, microformats & tagging?
There is a discrepancy with the current LSID vocabularies, of course,
as with all TDWG standards. But shouldn't we try to get this fixed and
have a core set of terms available everywhere?

Personally I think this is a far more important issue than having
globally unique ids.
They would have to resolve to something meaningful anyway to be useful.

Markus

On Apr 24, 2009, at 4:24, Bob Morris wrote:

> Also, the Plazi EoL REST service for descriptions extracted from
> legacy literature has particularly minimal use of TaxonOccurrence
> objects.  See sample, including restful service spec, at
> http://wiki.tdwg.org/twiki/bin/view/SPM/PlaziEOLProject
>
> Bob Morris
>
>
> On Thu, Apr 23, 2009 at 10:16 PM, Kevin Richards <RichardsK at landcareresearch.co.nz> wrote:
> Pete
>
>
> You should have sent something around on the mailing list, I could
> have given you an example of a TaxonOccurrence.  Or perhaps you did
> and I missed it???
>
>
> Anyway, with Herb IMI, Paul Kirk and I have set up a resolver to
> provide TaxonOccurrence RDF data,
>
> see for example urn:lsid:herbimi.info:specimens:100069 (or http://lsid.herbimi.info/authority/metadata/?lsid=urn:lsid:herbimi.info:specimens:100069
>  in your browser).  It also has an example of using Interaction data
> - ie in this case a host plant (IPNI ID) of a fungus (Herb IMI
> specimen) with an identification to a taxon concept and name (Index
> Fungorum name).
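
To make the identifier in Kevin's example concrete, here is a small sketch that splits an LSID into its parts and rebuilds the browser URL he gives. The function names are made up, and the URL layout ('lsid.' plus the authority, then /authority/metadata/) is copied from that single Herb IMI example; it is an assumption that it would carry over to any other authority.

```python
from urllib.parse import quote

def parse_lsid(lsid):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = lsid.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn"
            or parts[1].lower() != "lsid"):
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }

def metadata_url(lsid):
    """Build the browser URL, following the pattern of the Herb IMI example.

    Assumption: host is 'lsid.' + authority and the path is
    /authority/metadata/; other authorities may be laid out differently.
    """
    authority = parse_lsid(lsid)["authority"]
    return (f"http://lsid.{authority}/authority/metadata/"
            f"?lsid={quote(lsid, safe=':')}")

if __name__ == "__main__":
    print(metadata_url("urn:lsid:herbimi.info:specimens:100069"))
```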
>
>
>
> Jim - feel free to help improve the ideas and processes of TDWG if
> you find them that bad.   :-)
>
>
>
> Kevin
>
>
>
>
> From: tdwg-tag-bounces at lists.tdwg.org [mailto:tdwg-tag-bounces at lists.tdwg.org
> ] On Behalf Of Peter DeVries
> Sent: Friday, 24 April 2009 1:46 p.m.
> To: Jim Croft
> Cc: Bob Morris; tdwg-tag at lists.tdwg.org
> Subject: Re: [tdwg-tag] Differences in thinking between TDWG and
>
>
> Hi Jim,
>
>
> Thanks for keeping in good humor. :-)
>
>
> I was trying to get my head around the TaxonOccurrence standard so I
> could rewrite my observation records, and was hoping to find some
> examples. That experience, and some related issues, made me a little
> irritable; for that I am sorry.
>
>
> It might be useful to make up a test set. It does not even need to
> be real. In my experience, this is where you really start to see if
> the standard does what you want it to do. Many times I have thought
> that I had everything figured out, only to discover after loading data
> into my triple store that something will not work or something that
> used to work was now broken by the addition of a new feature.
>
>
> I am not bothered by the idea of branding; it just seemed that it
> was being confused with the goal of persistence in what seemed to be
> an unproductive thread.
>
>
> Thanks again for being amused, tolerant and insightful. :-)
>
>
> - Pete
>
>
>
> On Thu, Apr 23, 2009 at 7:50 PM, Jim Croft <jim.croft at gmail.com>
> wrote:
>
> Can't let this great opportunity pass...  :)
>
> It is not just us.  It is the regrettable human condition.  See also:
> http://www.informationweek.com/news/infrastructure/management/showArticle.jhtml?articleID=216600011
>
>
> One of the intriguing and enigmatic things about TDWG is that it does
> not seem to respect its own standards, preferring to invent another
> set rather than fix or enhance what is already there.  More than the
> 'not invented here' syndrome, we have to deal with 'not invented here
> this week' and the user is left with the impression of a happy
> self-absorbed group chasing a flock of flitting butterflies - 'oh,
> that's pretty, I will try and catch that one'.  Yep, just the human
> condition.  In another context, Roger H,
> http://www.hyam.net/blog/archives/346, has provided the best
> description of TDWG and its standards I think I have ever heard: "It
> is kind of like the guy selling seagulls on the beach. You give him £5
> and he points into the sky and says 'That one is yours'. Yes he is
> providing a service but the relationship that counts is the one
> between you and the gull."  This is both profound and frightening in
> how close it is to reality.
>
> On the complexity and costs of implementation, you are absolutely
> right.  Taxonomic database standards have moved out of the domain of
> taxonomists, they no longer understand what we are talking about
> (hell, I no longer understand what we are talking about.  And the rest
> of you?  c'mon now, be honest... :) and we are left with this
> relationship of 'trust us' paternalism that I do not think is all that
> healthy - we all trust and respect Microsoft, right?  It is getting
> increasingly difficult to give advice to a bod with a beat up PC in a
> developing country on what they should do - the alphabet soup
> surrounding GUIDS and the like just does not cut it when the mind set
> is an Excel spreadsheet with a bunch of intuitive headings.  Oh, I'm
> sorry - you mean it is not just in developing countries?  There is a
> widening gap between 'Joe the Taxonomist' and the business of taxonomy
> data standards that we do not seem to be able to address (do we even
> care?).  We used to be able to give our staff in the herbarium a TDWG
> standard and say 'this is what we have to do.'  Not anymore...  Ah,
> the good ole days...  Maybe the answer lies in the hegemony of a new
> and benevolent  'MSOffice for Taxonomy'.  Could work, but I do not
> think it is going to be particularly satisfying.
>
> On the reuse issue, Greg W argues that we do not do nearly enough of
> this and I agree with him.  He argues that TDWG should focus on the
> standard and not the application and implementation of the standard
> and has proposed that in our vocabularies we should adopt the
> principles of nomenclatural priority, that is, going back to *Dublin*
> Core, adding stuff chronologically from other standards,
> including our own, until there is no option other than to invent
> another one, or there is nothing in our domain left to standardize.
> For taxonomists there is something inherently attractive in this
> approach - don't describe a taxon where it already exists, don't
> invent a standard where one already exists.  To retrofit this and
> untangle all the synonymy and homonymy in our existing standards and
> implementations is going to take a lot of work though.  But the
> vocabularies and ontologies are a good start.
>
> On the 'branding' issue, it is not so much branding but attribution.
> Apart from the moral and legal issues, it is unscientific not to
> attribute, source and provide lineage for data.  There is no
> optionality.  We have to do it.  Even if the initial supplier
> 'disappears'.  *Especially* if the initial supplier disappears.
> Attribution (branding if you like) is absolutely essential for
> credibility.  If someone is not going to do it, they can not have our
> data, and we will not use theirs.
>
> On the architectural issue, I can not really get all that hung up on
> it.  If a standard is good, it should be able to be implemented in a
> number of architectures (isn't that almost a definition of
> interoperability?).  Where things get 'interesting' is when
> architecture (and the continuum towards application) becomes the
> standard or part of the standard.  TDWG needs to constantly ask itself
> to what extent it needs to get involved with implementation of the
> standards it promotes.  I would argue 'not at all', but this is
> another discussion.
>
> And the 'insider' cabalistic nature of TDWG?   What can I say - it has
> always been this way.  A standard attracts a champion and the champion
> establishes a fiefdom of acolytes around it.  Yep, the human
> condition.  And it sort of works.  Some of the time.  (btw - another
> artifact of the human condition - your brilliant ideas are never
> perceived as such until someone else has them - just ask poor old
> Wallace how he is feeling this year).  A downside of this approach is
> that the various TDWG standards are very poorly coordinated between
> each other - this is something we should be able to do something
> about.
>
> We could piss on the TDWG tent from the outside, but you have to
> agree, it is much more satisfying to get inside the tent and piss on
> and piss off everyone in it...  :)
>
> <disclaimer>None of the above ideas are mine.  I am following the TDWG
> standard practice of restating them without attribution :)
> </disclaimer>
>
> Ah... that was fun...
>
> jim
>
>
> On Fri, Apr 24, 2009 at 5:09 AM, Peter DeVries
> <pete.devries at gmail.com> wrote:
> > Respectfully,
> > 1) Only certain classes of organizations will be able to contribute,
> >    since the standard requires special skills: those groups that can
> >    pay for hardware and a person dedicated to this standard in
> >    perpetuity. I look at this and think that a number of groups that
> >    could be providers cannot because of the way the system is
> >    implemented. Why not have a simple RDF tar or zip file format
> >    that GBIF checks with a crawler every night?
> > 2) There is very little reuse of existing vocabularies, geo for
> >    instance. Similar to the "not invented here" mentality.
> > 3) Discussions and decisions seem to be too much about making sure
> >    that providers keep their "brand" on the data even if they
> >    disappear.
> > 4) Suggestions or alternative ways of thinking are rejected until an
> >    insider restates them without attribution.
> > 5) It is not at all clear how some of these decisions are made. It
> >    appears as if some people disagree, there is discussion. Then
> >    years later there is the same discussion. It seems that some
> >    smaller group keeps pulling everyone back to the same
> >    architectural decision.
> > 6) Where are the example data sets? We should have some example data
> >    sets available to see if the standard can be used to answer real
> >    questions. Either they don't exist or they are only available to
> >    a few.
> > I actually have nothing but praise for GBIF and uBio (except for the
> > minor encoding thing); this is more about trying to work within TDWG
> > and getting stonewalled. I am having the same feelings about it that
> > I had a few years ago, after which I left to try to make something
> > that worked so I could proceed with my project.
> > It probably was unfair to imply that the fiefdoms are by design,
> > rather than a side effect of the implementation standards, and for
> > that I apologize.
> > - Pete
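
Pete's first point, a provider publishing a plain zip of RDF files that a crawler checks nightly, could be sketched roughly as below. This is only an illustration under stated assumptions, not how GBIF harvests data: the function names are made up, the download and scheduling steps are left out, and the check is limited to XML well-formedness because proper RDF validation would need an RDF library.

```python
import hashlib
import io
import zipfile
import xml.etree.ElementTree as ET

def archive_fingerprint(data):
    """SHA-256 of the archive bytes, so a crawler can skip unchanged files."""
    return hashlib.sha256(data).hexdigest()

def harvest(data, last_fingerprint=None):
    """Return (fingerprint, {member name: well-formed?}) for a provider zip.

    The dict is empty when the archive is unchanged since the last visit.
    Only members ending in .rdf or .xml are inspected.
    """
    fingerprint = archive_fingerprint(data)
    if fingerprint == last_fingerprint:
        return fingerprint, {}
    results = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".rdf", ".xml")):
                continue
            try:
                ET.fromstring(archive.read(name))
                results[name] = True
            except ET.ParseError:
                results[name] = False
    return fingerprint, results
```

A nightly job would fetch each provider's archive, call harvest() with the fingerprint stored from the previous run, and load only the members reported as well-formed.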
> >
> >
> > On Thu, Apr 23, 2009 at 10:56 AM, Bob Morris <morris.bob at gmail.com> wrote:
> >>
> >> "described by anyone" is not the same as "described by anyone in any
> >> way convenient to the describer", so I find this quotation somewhat
> >> disingenuous. More precisely, I wonder what TDWG standard or proposed
> >> standard you find enables fiefdoms /in ways that are impossible under
> >> some other solution to the problem the standard addresses/.
> >>
> >> Bob Morris
> >>
> >> On Thu, Apr 23, 2009 at 10:47 AM, Peter DeVries <pete.devries at gmail.com> wrote:
> >>>
> >>> This paragraph below seems to encapsulate the differences in thinking
> >>> between the linkeddata community and some of the TDWG people on how
> >>> to best share biodiversity data.
> >>>
> >>> "The notion of a fabric of resources that are individually described,
> >>> queried, and resolved may seem unmanageable or like science fiction.
> >>> For organizations that are used to large, manual, centralized efforts
> >>> to standardize on everything, it may seem anarchic to allow resources
> >>> to grow organically and be described by anyone. The same people would
> >>> probably not believe the Web possible in the first place if there
> >>> were not proof of its success."
> >>>
> >>> REST for Java developers, Part 4: The future is RESTful
> >>> From http://www.javaworld.com/javaworld/jw-04-2009/jw-04-rest-series-4.html?page=4
> >>>
> >>> I think that some people may have lost sight of the goal of making
> >>> data available to improve the understanding of our natural world and
> >>> hopefully better manage our natural resources.
> >>> It does not seem that creating a distributed network of fiefdoms will
> >>> help us achieve this goal.
> >>> - Pete
> >>> ---------------------------------------------------------------
> >>> Pete DeVries
> >>> Department of Entomology
> >>> University of Wisconsin - Madison
> >>> 445 Russell Laboratories
> >>> 1630 Linden Drive
> >>> ------------------------------------------------------------
> >>>
> >>> _______________________________________________
> >>> tdwg-tag mailing list
> >>> tdwg-tag at lists.tdwg.org
> >>> http://lists.tdwg.org/mailman/listinfo/tdwg-tag
> >>>
> >>
> >>
> >>
> >> --
> >> Robert A. Morris
> >> Professor of Computer Science
> >> UMASS-Boston
> >> ram at cs.umb.edu
> >> http://bdei.cs.umb.edu/
> >> http://www.cs.umb.edu/~ram
> >> http://www.cs.umb.edu/~ram/calendar.html
> >> phone (+1)617 287 6466
> >
> >
> >
> >
> >
>
>
>
> --
>
> _________________
> Jim Croft ~ jim.croft at gmail.com ~ +61-2-62509499
>
> "Words, as is well known, are the great foes of reality."
> - Joseph Conrad, author (1857-1924)
>
> "I know that you believe that you understood what you think I said,
> but I am not sure you realize that what you heard is not what I
> meant."
>  - attributed to Robert McCloskey, US State Department spokesman
>
>
>
>
>
>
>
>
>