[tdwg-content] Heretics and illuminati, oh my! [SEC=UNCLASSIFIED]

Peter DeVries pete.devries at gmail.com
Tue May 10 16:07:56 CEST 2011


Hi Joel,

Maybe we can do the hackathon virtually so that there is some progress by
the time of the meeting.

A lot of this could be done via email and perhaps a videoconference or two.

- Pete

On Tue, May 10, 2011 at 8:06 AM, joel sachs <jsachs at csee.umbc.edu> wrote:

> On Tue, 10 May 2011, Greg Whitbread wrote:
>
> > Kevin,
> >
> > I'm not too keen on the hack-a-thon idea.  It will in all probability
> > just consume important contact time with activity that is best
> > integrated into real projects and reported in a forum such as this or at
> > the TDWG meeting. Much in the way that Steve and Cam and Pete and Paul
> > are doing now.
> >
>
> Hi Greg,
>
> The conversation that's going on now is useful. But something is
> missing, namely reference to the use cases and competency questions that
> we sometimes talk about generating. Hackathons focus the mind (I've seen
> it happen), and a DwC-RDF VoCamp/hackathon would be an opportunity to move
> forward on stuff that keeps getting deferred. Ideally, we'd define the
> competency questions in advance, and spend the hackathon addressing them.
> But, if we all turn out to be lazy/indifferent/"busy",
> and come to New Orleans without use cases, then spending cloister time
> working them out would, IMHO, be cloister time well spent.
>
> There are any number of possible approaches we could take. We could have
> an integration competition; we could rely on digitization efforts to
> supply data; we could tie the hackathon into observations from a field
> event; we could explore how the work of the observation and annotation
> task groups interact with DwC in a semantic web context; we can do
> anything we want!
>
> I encourage anyone with a vision of how such an event might unfold to
> share it, on or off list, so that the program committee can discuss it.
> (And it would be great, as Kevin suggested, if someone steps forward as a
> spearhead.)
>
> Joel.
>
>
> > We (you and I) have drafted a proposal to put to TDWG executive for a
> > 2-3 day workshop prior to this year's meeting to establish context for
> > these issues within the framework of a TDWG Technical Architecture.  Do
> > we need to evolve a TDWG level understanding of the requirement for
> > semantic interoperability within our standards space? Would it be useful
> > to spend time and effort to formally model the TDWG domain? Is there a
> > role for TAG? Can we improve the process?
> >
> > Hopefully, we can find an opportunity to get a small group together
> > between now and then to do a little planning around agenda, background
> > requirements, preparation workload and specialist inputs.
>
> > greg
> >
> > On Tue, 2011-05-10 at 07:26, Kevin Richards wrote:
> >> I wonder if it would be a good idea to have a session (hackathon?) at
> >> this year's TDWG meeting to look at / prove / experiment with, the
> >> various ways of working with semantic web data and ontologies we
> >> discuss here?
> >>
> >>
> >>
> >> This would soon show any benefits/disadvantages etc of the various
> >> approaches.
> >>
> >>
> >>
> >> Is anyone lined up / keen to promote such a session?
> >>
> >>
> >>
> >> Kevin
> >>
> >>
> >>
> >> From: tdwg-content-bounces at lists.tdwg.org
> >> [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Steve
> >> Baskauf
> >> Sent: Tuesday, 10 May 2011 8:54 a.m.
> >> To: Paul Murray
> >> Cc: tdwg-content at lists.tdwg.org List
> >> Subject: Re: [tdwg-content] Heretics and illuminati, oh my!
> >> [SEC=UNCLASSIFIED]
> >>
> >>
> >>
> >>
> >> Being either fearless or a fool (is there actually a difference?), I
> >> shall tread into this subject area at which I am a mere novice.  So be
> >> kind...
> >>
> >> I think that there may be several "solutions" to this problem.  The
> >> one that is "correct" probably depends on what one is trying to
> >> accomplish.  So I will try to describe in the most succinct way what
> >> Cam and I were trying to accomplish with DSW, and how that fits in
> >> with this thread.  Cam and I basically wanted to do two things:
> >> 1. Make it possible to use GUIDs RIGHT NOW (not five years from now).
> >> 2. Create an extremely stripped down ontology that would be
> >> non-controversial enough that people might actually use it, but which
> >> wouldn't do anything so bad that it would inhibit future development
> >> in the Semantic Web context (i.e. it could be extended in the future
> >> by clever people to do clever Semantic Web stuff).
> >>
> >> Amazingly, the GUID Applicability Statement has achieved the status of
> >> Standard-hood! (http://www.tdwg.org/standards/150/)  Hooray!  I sort
> >> of missed the announcement, but ran across the fact the other day when
> >> I was surfing through the TDWG website.  Since the GUID A.S. is now a
> >> TDWG Standard, I would say it would now officially be a best-practice
> >> to follow it.  In particular, Recommendation 11 states "Objects in the
> >> biodiversity domain that are identified by a GUID should be typed
> >> using the TDWG ontology or other well-known vocabularies in accordance
> >> with the TDWG common architecture."  This is somewhat problematic,
> >> given that the TDWG ontology (with the possible exception of the
> >> Taxon/TaxonConcept part) is effectively ("socially"?) deprecated.
> >> What is the alternative "other well-known vocabulary"?  There is none,
> >> at least none having any kind of official status with TDWG.
> >>
> >> I recently discovered (or maybe re-discovered) the Technical
> >> Architecture Group (TAG) Technical Roadmaps from 2006-2008:
> >> http://www.tdwg.org/uploads/media/TAG_Roadmap_01.doc
> >> http://www.tdwg.org/fileadmin/subgroups/tag/TAG_Roadmap_2007_final.pdf
> >> http://www.tdwg.org/fileadmin/subgroups/tag/TAG_Roadmap_2008.pdf
> >> I might have seen them before, but if so it was at the point where I
> >> was really not knowledgeable enough to comprehend them.  I found it
> >> very instructive to read about what the TAG had in mind when it set
> >> out to create the TDWG Ontology.  In particular (from the 2007
> >> Roadmap):
> >> "From the point of view of exchanging data - such as in the federation
> >> of a number of natural history collections - there is no need for a
> >> standards architecture.  The federation is a closed system where a
> >> single exchange format can be agreed on. ... This model has worked
> >> well in the past but it does not meet the primary use case that is
> >> emerging.  Biodiversity research is typically carried out by combining
> >> data of different kinds from multiple sources. The providers of data
> >> do not know who will use their data or how it will be combined with
> >> data from other sources. The consumer needs some level of commonality
> >> across all the data received so that it can be combined for analysis
> >> without the need to write computer software for every new
> >> combination." [This brings to mind the very "different kinds" of
> >> resources Cam is documenting in Borneo and the "multiple sources" that
> >> will be handling the metadata once those resources are sent off to
> >> herbaria, labs, and arboreta.]
> >>
> >> and from the 2008 Roadmap:
> >> "If GUIDs are used to uniquely identify 'pieces' of data we need to
> >> have a shared understanding of what we mean by a 'piece of data' i.e.
> >> what kind of thing is it that a particular id applies to, a specimen,
> >> a person, an observation, a complete data set. We also need to have a
> >> shared understanding of at least some of the properties we use to
> >> describe these things."
> >>
> >> Having been barely aware of TDWG's existence in 2008, I am blissfully
> >> ignorant of whatever disagreements may have occurred regarding LSIDs,
> >> reification, or whatever, and really don't want to know about them.
> >> All I can say as an outside observer is that it appears that the
> >> failure of the initial efforts to get GUIDs and the TDWG Ontology off
> >> the ground was because the system envisioned was too complicated to
> >> maintain, too complicated to gain a consensus, and too complicated to
> >> actually explain to anybody.  Now that GUIDs seem to have a new lease
> >> on life, it seems like the greatest chance of successfully
> >> implementing them is to start by keeping things absolutely as simple
> >> as possible.  To Cam and me, Darwin Core seemed to be the only
> >> candidate for something relatively simple and relatively universally
> >> accepted on which one could base an ontology that could be used to
> >> type GUIDs and to use to express "a shared understanding of at least
> >> some of the properties used to describe" biodiversity resources.
> >> Although I was somewhat skeptical that there was a "community
> >> consensus" about what the DwC classes meant and how they were related
> >> to each other, the exhaustive discussion on this list in Oct/Nov
> >> convinced me that maybe there WAS a consensus, or at least enough of a
> >> consensus to move forward.  Although some people may at the present
> >> time be interested in figuring out how to do things like "define
> >> 'Fish' as an owl class as well as as a Taxon object", I would assert
> >> that is outside the core mission of TDWG as stated: to "develop, adopt
> >> and promote standards and guidelines for the recording and exchange of
> >> data about organisms evidenced by the historical record".  It is fun
> >> to talk about, but to me not the primary consideration in designing a
> >> community data exchange model.  This outlook explains to some extent
> >> why I asked questions about the complexity of taxonconcept.org and its
> >> orientation toward facilitating semantic queries.  There is nothing
> >> wrong with that, but it doesn't seem to be the direction that TDWG has
> >> said it wants to go.  Perhaps when we have "gotten there" (i.e. have a
> >> functioning system using GUIDs for clearly typed resources), we might
> >> want to embark further down the road to the Semantic Web.
> >>
> >> Aside from just importing the DwC classes into the DSW ontology and
> >> connecting them with object properties, Cam and I did a little nasty
> >> thing with them.  It has been said that declaring ranges and domains
> >> for terms doesn't prevent people from using the terms to express
> >> relationships among the "wrong" types of things.  Rather, it simply
> >> asserts that those things are instances of the classes used in the
> >> range and domain declarations for the term.  That is sort of true, but
> >> by declaring many of the core DwC classes to be disjoint, we actually
> >> ARE preventing people from using the wrong object properties with
> >> instances of the wrong classes.  If  Joe Curator rdf:type's a
> >> determination as a dwc:Identification, but then uses dsw:atEvent
> >> (which has the domain dwc:Occurrence) as a property of the
> >> determination, then a reasoner will infer that the determination is a
> >> type dwc:Occurrence as well as the explicitly declared type
> >> dwc:Identification.  Because dwc:Identification and dwc:Occurrence are
> >> disjoint classes, the reasoner will have a fit.  Cam and I are being
> >> Naughty (sensu Bob Morris) because we are inhibiting the extensibility
> >> of dsw:atEvent, but Joe Curator is being Naughty (sensu Baskauf and
> >> Webb) because Cam and I believe in the statement from the 2007
> >> Roadmap: "The consumer needs some level of commonality across all the
> >> data received...".  Joe Curator is not being consistent with the
> >> "shared understanding of at least some of the properties" to the
> >> extent that DSW reflects the "shared understanding" of the TDWG
> >> community.  We are basically trying to enforce a sort of orthodoxy on
> >> the use of DwC classes as rdf:types and on the connections between the
> >> dwc:classes so that people can have some reasonable expectation that
> >> they are talking the same language as their partners whose data are
> >> also being aggregated in the same federated database.
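
[Illustrative sketch, not part of Steve's original message.] The Joe Curator
scenario above can be made concrete with a toy check. This is not a real OWL
reasoner: it applies only the two rules Steve describes (a property's declared
domain adds a type to its subject, and two disjoint classes may not share an
instance). The instance names ex:det1 and ex:event1 are hypothetical; the
domain of dsw:atEvent and the disjointness of the two DwC classes are taken
from the description above.

```python
# Toy domain-inference and disjointness check (assumption: plain tuples
# stand in for RDF triples; prefixed names are just strings).

# Joe Curator's data: a determination typed as an Identification, but
# described with dsw:atEvent.
triples = {
    ("ex:det1", "rdf:type", "dwc:Identification"),
    ("ex:det1", "dsw:atEvent", "ex:event1"),
}

# Domain declaration as described in the thread.
domains = {"dsw:atEvent": "dwc:Occurrence"}

# Disjointness declaration as described in the thread.
disjoint = {frozenset({"dwc:Identification", "dwc:Occurrence"})}


def infer_types(triples, domains):
    """Collect asserted types plus types implied by property domains."""
    types = {}
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            types.setdefault(subject, set()).add(obj)
        elif predicate in domains:
            # Using the property implies the subject is in its domain.
            types.setdefault(subject, set()).add(domains[predicate])
    return types


def find_clashes(types, disjoint):
    """Return (subject, class pair) for every violated disjointness."""
    clashes = []
    for subject, classes in types.items():
        for pair in disjoint:
            if pair <= classes:
                clashes.append((subject, tuple(sorted(pair))))
    return clashes


types = infer_types(triples, domains)
clashes = find_clashes(types, disjoint)
print(clashes)
```

Without the disjointness declaration the same data would simply yield a
determination with two types and no complaint, which is the difference
Steve is pointing at.
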
> >>
> >> It seems to me that this "enforcement of orthodoxy" may be very much
> >> at odds with the free-wheeling spirit of the Semantic Web community
> >> where Anybody Can Say Anything About Anything.  But when I look over
> >> those old TAG roadmaps, I see little having to do with clever Semantic
> >> inferencing.  I see a lot about providers and consumers understanding
> >> what each other are talking about.  To some extent, Darwin Core can
> >> provide most of the necessary commonality between providers and
> >> consumers.  There were (in our opinion) three areas where it could
> >> not.  One was the lack of a class to link repeated sampling events and
> >> determinations (dwc:IndividualOrganism or
> >> TaxonomicallyHomogeneousEntity if you prefer) and another was a class
> >> that allowed for the separation of evidence from the Occurrence
> >> documented by it (called by us the dsw:Token class).  The other area
> >> was the dwc:Taxon class which did not seem clear enough in its
> >> definition nor to possess enough complexity to express the kinds of
> >> relationships commonly discussed on this list.  dwc:Taxon needs to be
> >> "fixed" before it is Ready For Prime Time (i.e. usable in rdf:type
> >> declarations).
> >>
> >> So I guess having read the various responses to my query and thinking
> >> about the history of the TDWG Ontology, I would say that it may not
> >> really be important how dwc:Taxon could be tied to tc:Taxon because
> >> the two classes probably don't need to be tied together anyway.  As it
> >> currently stands, dwc:Taxon (outside of DSW) has no semantic meaning
> >> other than what people want to believe that it means because it's not
> >> tied to any other classes by object properties of its instances.   The
> >> mish-mash of terms describing names and taxa listed under dwc:Taxon
> >> add to the confusion - since the DwC vocabulary purposefully does not
> >> declare domains for the terms listed under a class they really could
> >> be used as properties for an instance of any class anywhere.  In
> >> contrast, tc:Taxon does have properties that are described clearly in
> >> the TDWG Ontology.  The only reason that we declared the two classes
> >> to be equivalent was to signal that we felt that some of the DwC terms
> >> listed under dwc:Taxon in the DwC vocabulary could be used as data
> >> properties for the things in the tc:Taxon class that people like Paul
> >> were describing with properties from the TDWG Ontology.  Tying them
> >> together doesn't (at the moment) mess up anything that anybody is
> >> doing with dwc:Taxon because (outside of DSW) there isn't anything to
> >> actually DO with dwc:Taxon in RDF.  However, the point is well taken
> >> that if someone in the future did decide to define properties
> >> specifically intended for use with dwc:Taxon, those properties would
> >> be hopelessly tangled with tc:Taxon properties.
> >>
> >> It seems to me like the real road forward (if one believes as I do
> >> that DwC is the only practical alternative to use for typing GUIDs)
> >> would be to:
> >> 1. decide that the TDWG Ontology in its dead form adequately describes
> >> taxa, names, and their properties (use it as-is).  OR
> >> 2. decide that although the TDWG Ontology doesn't do everything that
> >> people want it to do at the present time, it could be resurrected and
> >> modified to do what people want (use it and hope for the future).  OR
> >> 3. decide to just create the additional classes, e.g. dwc:Name (or
> >> dsw:Name if you prefer not to adulterate the "pure" Darwin Core), and
> >> object properties for dwc:Taxon and dwc:Name that are needed to get
> >> the job done (i.e. just dump the TDWG Ontology as unfixable and make
> >> up new stuff).
> >>
> >> In any of these three alternatives, there isn't actually any reason to
> >> tie the two classes together that I can see.  Of these three, I think
> >> the third option would probably be preferable, although it might put
> >> Paul (and any others currently using the TDWG Ontology to describe
> >> Taxon instances) in the unpleasant position of having to redo their
> >> RDF.
> >>
> >> Steve
> >>
> >> Paul Murray wrote:
> >>
> >> On 09/05/2011, at 2:07 PM, Kevin Richards wrote:
> >>
> >>
> >>         Paul
> >>
> >>         I had the same thought (ie the x is of type dwc:Taxon, y is
> >>         of type tc:Taxon, we know dwc:Taxon and tc:Taxon are
> >>         equivalent, so we can reasonably compare x and y).
> >>         And this is built into standard semantic web reasoners -
> >>         which is a bonus.
> >>         But this was debated (taking into account Bob Morris' issue)
> >>         with respect to DwC and it was decided the benefits weren't
> >>         significantly better than having a "dwc:isInCategory" sort of
> >>         property that could then be "equivalent to" another class
> >>         property and therefore giving you a similar advantage
> >>         (admittedly not as good, but similar).
> >>         Do you think this is reasonable or are we just losing too
> >>         many semantic web benefits by not specifying the domain
> >>         constraint?
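
[Illustrative sketch, not part of Kevin's original message.] The first point
above, that x and y become comparable once dwc:Taxon and tc:Taxon are declared
equivalent, can be shown with a few lines of Python. This is a toy stand-in
for what a reasoner does with class equivalence, not an actual reasoner; the
instance names ex:x and ex:y are hypothetical.

```python
# Toy propagation of class equivalence (assumption: prefixed names are
# plain strings and equivalences are listed as pairs).

equivalent = [("dwc:Taxon", "tc:Taxon")]

# Types as each provider asserted them.
asserted = {
    "ex:x": {"dwc:Taxon"},
    "ex:y": {"tc:Taxon"},
}


def expand_types(asserted, equivalent):
    """Add every class equivalent to one an instance already has."""
    expanded = {name: set(classes) for name, classes in asserted.items()}
    changed = True
    while changed:  # repeat until no new type can be added
        changed = False
        for a, b in equivalent:
            for classes in expanded.values():
                if a in classes and b not in classes:
                    classes.add(b)
                    changed = True
                if b in classes and a not in classes:
                    classes.add(a)
                    changed = True
    return expanded


expanded = expand_types(asserted, equivalent)
shared = expanded["ex:x"] & expanded["ex:y"]
print(sorted(shared))
```

Before expansion the two instances share no class; afterwards both carry both
types, which is what makes a direct comparison reasonable.
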
> >>
> >>
> >> A thing to watch out for is that in OWL DL, you cannot apply ordinary
> >> data and object properties to vocabulary objects (classes,
> >> predicates) - you can only apply annotation properties. If you apply
> >> an ordinary data property to a class, OWL DL treats this as what it
> >> calls "punning": it decides that there is a class named X and also a
> >> named individual named X, and that these have nothing to do with one
> >> another. The individual has properties, the class has members, and
> >> the annotation properties, well: whatever. Reasoners do not reason
> >> over annotation properties: indeed - that's the entire point.
> >> Attempting to put properties on properties and having classes being
> >> instances of classes results in things that are mathematically
> >> undecidable ("this statement cannot be proven to be true").
> >>
> >> (another reason is that it allows you to put dc:comments and labels
> >> on classes, and even if you declare those classes to be equivalent
> >> nevertheless the comment only applies to the particular thing you put
> >> it on)
> >>
> >> This all means that dwc:isInCategory, if you want to apply it to
> >> dwc:Taxon or other classes, will never have any meaning that semweb
> >> "engines" can get at. The underlying thing is that dwc:isInCategory
> >> is actually a meta-syntactic construct: rather than using owl to
> >> define a vocabulary, you are effectively attempting to extend OWL
> >> itself.
> >>
> >> But ... maybe that's ok. Maybe what is attempting to be done here
> >> only ever needs to be understood by humans.
> >>
> >> Now ... if what you are trying to do is to define "Fish" as an owl
> >> class as well as as a Taxon object - that is do-able, even to the
> >> point of being able to get inheritance working, using reflexive
> >> properties.  At least ... I think it is. I should write a test case.
> >>
> >> If you have received this transmission in error please notify us
> >> immediately by return e-mail and delete all copies. If this e-mail or
> >> any attachments have been sent to you in error, that error does not
> >> constitute waiver of any confidentiality, privilege or copyright in
> >> respect of information in the e-mail or attachments.
> >>
> >> Please consider the environment before printing this email.
> >> _______________________________________________
> >> tdwg-content mailing list
> >> tdwg-content at lists.tdwg.org
> >> http://lists.tdwg.org/mailman/listinfo/tdwg-content
> >> .
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Steven J. Baskauf, Ph.D., Senior Lecturer
> >> Vanderbilt University Dept. of Biological Sciences
> >>
> >> postal mail address:
> >> VU Station B 351634
> >> Nashville, TN  37235-1634,  U.S.A.
> >>
> >> delivery address:
> >> 2125 Stevenson Center
> >> 1161 21st Ave., S.
> >> Nashville, TN 37235
> >>
> >> office: 2128 Stevenson Center
> >> phone: (615) 343-4582,  fax: (615) 343-6707
> >> http://bioimages.vanderbilt.edu
> >>
> >> ______________________________________________________________________
> >> Please consider the environment before printing this email
> >> Warning: This electronic message together with any attachments is
> >> confidential. If you receive it in error: (i) you must not read, use,
> >> disclose, copy or retain it; (ii) please contact the sender
> >> immediately by reply email and then delete the emails.
> >> The views expressed in this email may not be those of Landcare
> >> Research New Zealand Limited. http://www.landcareresearch.co.nz
> >>
> >>
> >> ______________________________________________________________________
> > --
> >
> >
> > australian centre for plant bIodiversity research<------------------+
> > national            greg whitBread             voice: +61 2 62509 482
> > botanic Integrated Botanical Information System  fax: +61 2 62509 599
> > gardens                      S........ I.T. happens.. ghw at anbg.gov.au
> > +----------------------------------------->GPO Box 1777 Canberra 2601
> >
> >
> >
>



-- 
------------------------------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries at wisc.edu
TaxonConcept <http://www.taxonconcept.org/> &
GeoSpecies <http://about.geospecies.org/> Knowledge Bases
A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
--------------------------------------------------------------------------------------