[tdwg-content] "Wrong" RDF, was Re: What I learned at the TechnoBioBlitz

Peter DeVries pete.devries at gmail.com
Thu Oct 14 11:29:50 CEST 2010


Note to all,

If you notice a large muscid fly buzzing around, pull your dark-bottled beer
closer to yourself.

However, do not get so engrossed in responding in tdwg-tag that you fail to
notice the fly is not buzzing anymore.

For those who may not have liked what I wrote, you could argue that this
is perhaps evidence of the existence of God? :-)

'Nough said.

If appropriate, I will respond to Rich's comments after some sleep.

- Pete

P.S. I think I know a little too much about Diptera, but I hope that EtOH is a
pretty good disinfectant.

On Thu, Oct 14, 2010 at 3:53 AM, Peter DeVries <pete.devries at gmail.com> wrote:

> Hi Steve,
>
> It is not as if there are no examples of things that seem to work in this
> space; it is that those alternatives that could be incorporated into the
> DarwinCore are largely ignored unless they come from the "right
> people".
>
> How these "right people" are defined is not quite clear to me, but what
> seems strange is that I already have live, working examples that address
> many of the issues being rehashed over and over again.
>
> If there are any flaws, or features lacking, it is because I have spent far
> too much time, as you have, trying to get people on the list to see some of
> the problems with the system they have proposed.
>
> I believe that you are right and that we need to represent individuals in
> the RDF version, but I have implemented them in a slightly different way.
>
> <http://lod.taxonconcept.org/ses/iuCXz#Species> txn:speciesConceptHasSpeciesIndividualTag
> <http://lod.taxonconcept.org/ses/iuCXz#Individual>
>
> What this creates is a usable type for an "individual of that species
> concept." It is of type txn:SpeciesIndividualTag.
>
> Now you can easily query for all the "individuals" or all the
> "individuals that are of a particular species concept".
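A minimal sketch of the two queries described above, holding triples as plain Python tuples rather than in a real triple store. Only the species concept URI, the #Individual URI, and the txn: terms come from the message; the example.org URIs, the helper functions, and the reading that individuals are typed with the #Individual tag are assumptions made for illustration.

```python
# Triples held as plain (subject, predicate, object) tuples; not a real triple store.
SPECIES = "http://lod.taxonconcept.org/ses/iuCXz#Species"
INDIVIDUAL_TAG = "http://lod.taxonconcept.org/ses/iuCXz#Individual"
OTHER_TAG = "http://example.org/ses/other#Individual"  # hypothetical second tag

HAS_TAG = "txn:speciesConceptHasSpeciesIndividualTag"
TAG_CLASS = "txn:SpeciesIndividualTag"

triples = {
    (SPECIES, HAS_TAG, INDIVIDUAL_TAG),
    (INDIVIDUAL_TAG, "rdf:type", TAG_CLASS),
    (OTHER_TAG, "rdf:type", TAG_CLASS),
    # Hypothetical individuals, each typed with the tag of its species concept.
    ("http://example.org/individual/1", "rdf:type", INDIVIDUAL_TAG),
    ("http://example.org/individual/2", "rdf:type", INDIVIDUAL_TAG),
    ("http://example.org/individual/3", "rdf:type", OTHER_TAG),
}


def instances_of(rdf_class):
    """Return every subject declared to be of the given type."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == rdf_class)


def all_individuals():
    """Resources typed with any tag that is itself a txn:SpeciesIndividualTag."""
    found = set()
    for tag in instances_of(TAG_CLASS):
        found.update(instances_of(tag))
    return sorted(found)


def individuals_of(species_concept):
    """Individuals typed with the tag(s) attached to one species concept."""
    tags = [o for s, p, o in triples if s == species_concept and p == HAS_TAG]
    found = set()
    for tag in tags:
        found.update(instances_of(tag))
    return sorted(found)
```

Calling `all_individuals()` walks every tag, while `individuals_of(SPECIES)` follows the txn:speciesConceptHasSpeciesIndividualTag link first, so it only returns the individuals of that one concept.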
>
> We could be using the species concept URIs that I have set up, but instead
> we see a perfect example of the folly of LSIDs.
>
> Those listed above do not resolve to anything that tells me what they
> mean.
>
> (I tried, not because I wanted to show the folly of LSIDs, but to see if the
> LSID was for something that I had a URI for: a URI that could be used in
> the interim.)
>
> What I got was an LSID that, despite being resolved through a proxy,
> returned nothing.
>
> I can always add the ZooBank LSID to the metadata in the description for
> that concept, which would allow some tracking and use of LSIDs.
>
> Frankly, I don't know what is really going on, but there is something
> very strange about how this entire process is operating.
>
> Some of these issues are better handled by people who already understand
> RDF, but instead are being rehashed here.
>
> Why do I keep seeing this "pull" to create a specific subsection of the
> semantic web or even informatics rather than benefit from all the related
> work happening a few email lists away?
>
> One thing that you might be missing is that in RDF something can have
> many types, so you can have something that is both a depiction of a
> species concept and a depiction of an individual.
>
> Just as you can have a depiction of a "Firehouse" that is also a
> depiction of the "West Washington Firehouse".
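A small sketch of the "many types" point, again in plain Python rather than a real RDF library. The image URI and the two ex: class names are hypothetical and stand in for whatever terms a vocabulary would actually define.

```python
from collections import defaultdict

# Maps each resource to the set of classes asserted for it (its rdf:type values).
types = defaultdict(set)


def add_type(resource, rdf_class):
    """Assert one more rdf:type for a resource; earlier types are kept."""
    types[resource].add(rdf_class)


def has_type(resource, rdf_class):
    """True if the class has been asserted for the resource."""
    return rdf_class in types[resource]


IMAGE = "http://example.org/image/123"  # hypothetical image URI

add_type(IMAGE, "ex:DepictionOfSpeciesConcept")  # hypothetical class
add_type(IMAGE, "ex:DepictionOfIndividual")      # hypothetical class
add_type(IMAGE, "ex:DepictionOfSpeciesConcept")  # repeating a statement adds nothing
```

Because the types form a set, the same image answers both "show me depictions of the species concept" and "show me depictions of this individual" without either statement displacing the other.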
>
> I have thought that there may be a need for some additional "tag"-like
> identifiers like Image or Media, which has Image as a subclass.
>
> Also, since the dwc is not really "live" and working like it should,
> we can't really test it in the ways it needs to be tested.
>
> In summary, I feel your pain.
>
> Respectfully,
>
> - Pete
>
> On Wed, Oct 13, 2010 at 9:07 PM, Steve Baskauf <
> steve.baskauf at vanderbilt.edu> wrote:
>
>>  I was just ready to leave work when I wrote this and since then I'm
>> feeling like I should clarify just what I mean by "wrong" ways of using
>> RDF.  I recognize that TDWG encourages flexibility in the ways that
>> standards such as DwC are used.  As such, it doesn't usually define "right"
>> and "wrong" ways of using the standards.  What I mean by calling some uses
>> "wrong" is not intended to discourage the creative use of DwC terms in RDF.
>> What I mean is that one must be careful to make sure that RDF statements
>> mean what is intended.  Here is an example.  The Dublin Core term
>> dcterms:language means "the language of the resource".  On multiple
>> occasions, I've seen this term used in RDF as a property of a resource whose
>> metadata is written in a certain language.  This is "wrong" because the
>> subject of the statement is the resource itself, not the resource's
>> metadata.  The need for this kind of clarity is apparent in the case of
>> media.  For example, if we are providing metadata in English that describes
>> a nature film which has audio in German, the correct statement is that
>> [film] dcterms:language "de", NOT [film] dcterms:language "en".  This
>> problem is handled appropriately in the MRTG schema by creating the
>> (required) term mrtg:metadataLanguage.   The correct statement would be
>> [film] mrtg:metadataLanguage "en" .  (I'm using "[film]" in lieu of a URI
>> identifier for the film.)  If, however, we were writing RDF to describe the
>> metadata itself rather than the film, then it would be appropriate to say
>> [film's metadata] dcterms:language "en" .  In straight XML, we might get
>> away with semantic sloppiness if the senders and receivers of the XML
>> "understand" what the intended subject is of the term dcterms:language.  But
>> in RDF, we have to assume that the receiver of the RDF is a "stupid"
>> computer which only infers exactly what is said and not what we MEANT to
>> say.
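The film example above can be sketched as a handful of triples, with a deliberately "stupid" lookup that reads only what is stated. The two example.org URIs and the helper function are hypothetical; the predicates and language codes are the ones used in the paragraph.

```python
FILM = "http://example.org/film/1"                    # hypothetical URI for the film
FILM_METADATA = "http://example.org/film/1/metadata"  # hypothetical URI for its metadata

statements = [
    (FILM, "dcterms:language", "de"),           # the film's audio is in German
    (FILM, "mrtg:metadataLanguage", "en"),      # its metadata is written in English
    (FILM_METADATA, "dcterms:language", "en"),  # a statement about the metadata itself
]


def values(subject, predicate):
    """Return exactly what is asserted, with no guessing about intent."""
    return [o for s, p, o in statements if s == subject and p == predicate]


film_language = values(FILM, "dcterms:language")
metadata_language = values(FILM, "mrtg:metadataLanguage")
```

Had the first statement been written with "en", the lookup would report that the film itself is in English, which is the kind of unintended meaning the paragraph warns about.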
>>
>> I believe that this is a very important point that all parties need to
>> keep in mind before we happily march off creating RDF templates for the
>> general public to use.  In particular, I have some serious problems with the
>> way that people are associating properties with instances of the
>> dwc:Occurrence class.  I believe that these "wrong" ways originate with the
>> historical roots of Darwin Core as a means to describe specimens.  I will
>> illustrate what I mean.  In many cases, a specimen is created by killing an
>> organism and gluing it to a piece of paper (if it's a plant) or putting it
>> in a jar (if it's an animal).  It is natural to ask the question "what kind
>> of species is the specimen?".  We can look at the specimen and make a
>> statement like [specimen] dwc:scientificName "Drosophila melanogaster" and
>> it pretty much makes sense.  However, in the new Darwin Core standard, we
>> have a broader category of "things" (a.k.a. resources) that we call
>> Occurrences, which include specimens but also include observations and
>> probably all kinds of things like images, DNA samples, and a whole lot of
>> other things.  If we try to apply the same kind of statement to other kinds
>> of Occurrences besides specimens we immediately run into problems.  If we
>> say that [digital image] dwc:scientificName "Drosophila melanogaster" we are
>> making a nonsensical statement.  The digital image can have properties like
>> its photographer, its format, its pixel dimensions, etc. but the image
>> itself does not have a scientific name.  The scientific name is a property
>> of the thing that was photographed.  It makes even less sense if we are
>> talking about observations.  An observation is a situation where somebody
>> observes an organism.  The observation can have properties like the
>> observer, the location, etc.  However, if we say [observation]
>> dwc:scientificName "Drosophila melanogaster" we are saying that that act of
>> observing has a scientific name.  That is an incorrect statement.  So the
>> general statement [Occurrence] dwc:scientificName "Drosophila melanogaster"
>> does not make sense when applied to all possible types of Occurrences.
>> Rather, the organism that we are observing is the thing that has a
>> scientific name.
>>
>> In all of the examples above, the correct statement is [individual
>> organism] dwc:scientificName "Drosophila melanogaster".  The specimen is an
>> occurrence of the individual organism.  The image is an occurrence of the
>> individual organism.  The observation is an occurrence of the individual
>> organism.  These statements may seem odd because we are used to thinking of
>> an Occurrence being an occurrence of the "species" but it's not really.  The
>> image is not an image of the Drosophila species concept nor is it an image
>> of the string "Drosophila melanogaster".  The image is an image of an
>> individual fruit fly.  The individual fruit fly is a representative of the
>> taxon, the image and the observation are not.
>>
>> This point becomes more clear if we look at a situation where several
>> types of occurrence records are collected from the same individual.  Let's
>> say that we capture a bird, photograph it, collect a feather from it,
>> collect a DNA sample and band it and let it go.  Later somebody sees the
>> band and reports that as an observation.  How do we connect all of these
>> things?  Do we create an identifier for the specimen (the feather) and then
>> say that the image and the DNA sample came from it?  That would be wrong.
>> We could take an image of the feather, but that would be a different thing
>> from an image of the bird.  We didn't get the DNA sample from the feather,
>> we got it via a blood sample from the bird.  The band observation is not an
>> observation of the feather, or the image or the DNA sample.  It's an
>> observation of the bird which was never any kind of specimen living or
>> dead.  The bird is an individual organism and that's what we need to call
>> it.  Right now we don't have anything in Darwin Core that can be used to
>> rdf:type the bird, which is why I proposed Individual as a Darwin Core
>> class.
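A rough sketch of the bird example with the individual at the center. The ex:Individual class, the ex:occurrenceOf predicate, all example.org URIs, and the species name are placeholders chosen for illustration; they are not existing Darwin Core terms or real records.

```python
BIRD = "http://example.org/individual/bird-1"  # hypothetical URI for the banded bird
OCCURRENCE_OF = "ex:occurrenceOf"              # hypothetical linking predicate

triples = [
    (BIRD, "rdf:type", "ex:Individual"),                  # hypothetical class
    (BIRD, "dwc:scientificName", "Turdus migratorius"),   # placeholder name
    # Each record points at the bird, never at another record.
    ("http://example.org/image/1", OCCURRENCE_OF, BIRD),
    ("http://example.org/feather/1", OCCURRENCE_OF, BIRD),
    ("http://example.org/dna/1", OCCURRENCE_OF, BIRD),
    ("http://example.org/band-sighting/1", OCCURRENCE_OF, BIRD),
]


def objects(subject, predicate):
    """All objects stated for a subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def occurrences_of(individual):
    """Every record that documents the given individual."""
    return sorted(s for s, p, o in triples if p == OCCURRENCE_OF and o == individual)


def scientific_names_for(occurrence):
    """Follow the link to the individual, then read the name from the individual."""
    names = []
    for individual in objects(occurrence, OCCURRENCE_OF):
        names.extend(objects(individual, "dwc:scientificName"))
    return names
```

The DNA sample carries no scientific name of its own here; `scientific_names_for` reaches the name only by way of the bird, and the later band sighting connects to the image and feather through the same individual.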
>>
>> I could say these things more clearly in RDF, but because many
>> members of the audience of this message aren't familiar with RDF/XML they
>> would probably zone out and the point would be lost.  The point is that we
>> need to have identifiable classes of "resources" (the technical name for
>> "things" like physical artifacts, concepts, and electronic representations)
>> for all of the things that we need to describe and inter-relate in the
>> Darwin Core world.  Right now, we are missing one of the important pieces
>> that we need, which is a class for the Individual.  If we are satisfied with
>> creating an RDF model that only works for specimens and one-time
>> observations, then we probably don't need Individual as a Darwin Core
>> class.  On the other hand, if TDWG and GBIF are really serious about
>> creating a system (Darwin Core and RDF based on it) that can handle other
>> types of Occurrences like multiple images of live organisms, observations of
>> the same organism over time, and multiple types of Occurrences collected
>> from the same organism, then this capability should be built into the system
>> from the start.  When I got back from the TDWG meeting, I was all excited
>> about trying to use Darwin Core Archives with my live plant image
>> collection.  However, it quickly became evident that it could not work
>> because Occurrences were at the center of the diagram rather than
>> Individuals.  So unless something changes, we are already embarking on the
>> process of locking out these other Occurrence types.
>>
>> I hate to sound like a broken record (do we have those any more?), but
>> read my paper on this subject.  It explains the rationale better than this
>> email, has nice diagrams, and gives RDF examples to illustrate everything (
>> https://journals.ku.edu/index.php/jbi/article/view/3664).  If somebody
>> has a better idea of how to develop an internally consistent system that can
>> handle the problems I've raised here that DOESN'T involve Individuals (i.e.
>> other "right"[=semantically accurate] ways to express properties and
>> relationships among Identifications, Taxa, diverse types of Occurrences,
>> etc.) I'd like to hear what it is.  Or perhaps as Stan has suggested, there
>> needs to be a task group that can hash out alternative views.  But let's
>> have the discussion before we post models and suggest people use them.
>>
>> Steve
>>
>> Steve Baskauf wrote:
>>
>> Stan,
>> Thanks for the clarification.  My concern here is that standard or not, if
>> examples are posted on the Google Code Darwin Core site, they will have an
>> implied "stamp of approval" and will be used by others as a template
>> (despite that site being labeled as "for discussion and development" not
>> everyone can post to it and that implies some authority).  In the case of
>> straight XML, that isn't really that big of an issue.  XML can mean whatever
>> one wants as long as there is an agreement between the sender and the
>> receiver (perhaps in the form of a formal XML schema) as to what the
>> elements represent.  I believe that RDF is a different beast.  When one
>> exposes RDF, the receiver is unknown.  Therefore, the RDF has to actually
>> "mean" something to the receiver without a pre-arranged agreement.  In a
>> generic XML document, the elements can simply be a list of string values of
>> terms with no implied "meaning" except what might be inferred by grouping
>> them in a container element.  In RDF, the elements represent properties of
>> particular resources.  I believe strongly that although there may be several
>> "right" ways to express properties of members of DwC classes, there are many
>> more "wrong" ways that should not be used.  By "right" I mean that they make
>> sense semantically in that the properties logically are ones that should
>> actually belong to the described resource.  I do not believe that the
>> discussion of these issues has progressed to the point where there is a
>> consensus on the "right" use of DwC terms for some types of resources and
>> therefore I am opposed to the posting of RDF examples on any official Darwin
>> Core sites without a lot more discussion UNLESS the examples are clearly
>> labeled as examples intended for discussion and not for use as templates.
>> If you want such examples, I can provide
>> http://bioimages.vanderbilt.edu/vanderbilt/4-145.rdf as an example for an
>> Individual
>> http://bioimages.vanderbilt.edu/baskauf/79695.rdf as an example of an
>> Occurrence that is a live plant image and
>> http://bioimages.vanderbilt.edu/rdf/examples/lsu000/0138.rdf as an
>> example of an Occurrence that is an herbarium specimen
>> I would be happy to discuss the reasons why I structured the RDF as I did
>> (although mostly those examples are already rationalized in
>> https://journals.ku.edu/index.php/jbi/article/view/3664), but I would not
>> go so far as to say that they are "right" without some discussion.
>>
>> What I intended when I suggested that I might write some kind of guide for
>> Darwin Core represented in RDF/XML was really a document that explained to
>> beginners what the point was of RDF, the basics of how one can structure
>> properties in RDF using examples that are Darwin Core terms, and options for
>> creating URIs that refer to resources that are described in separate files
>> or within the same file.  I wasn't really suggesting that it be a full-blown
>> recommendation with specific guidelines for the use of particular terms or
>> structuring of files for particular classes of resources, although that
>> would be a good thing ultimately.  I guess I was seeing some kind of a
>> beginner's guide as a way to involve more people (who aren't up on RDF) in
>> the discussion.  I don't think that it should be necessary to complete a full
>> "standards" process before such a document is made available.  It would
>> probably be better to have some kind of road map where that document would
>> be the first segment but would later be followed by guidelines for specific
>> classes of resources with examples.  I think that such a modular approach
>> would be the most beneficial because pieces of it could actually get done in
>> a timely fashion rather than requiring the whole thing to be complete before
>> any of it would be accepted.
>>
>> I do think a task group for Darwin Core RDF would be a good idea.  If
>> nobody is in a huge hurry, I don't mind trying to charter such a group,
>> although I'd be just as happy if somebody else wanted to do it and I would
>> just try to be an active participant.  I will look at the links you suggested,
>> thanks.
>>
>> Steve
>>
>>
>> Blum, Stan wrote:
>>
>> Steve,
>>
>> The TDWG process for creating standards is here:
>> http://www.tdwg.org/about-tdwg/process/   This is worth reading if you
>> haven’t done so already.
>>
>> Another document worth reading is the standards format specification
>> http://www.tdwg.org/standards/147/    I never pushed this “standard”
>> through public review, but it still functions as a guideline for formatting and
>> our view of what is and isn’t within scope of a “standard”.  In other words, we
>> are doing our best to follow the basic ideas laid out there about the kinds
>> of specifications:
>>
>> Type 1 -- normative specification, versioned;
>> Type 2 -- versioned, supplementary documentation;
>> Type 3 -- uncontrolled supplementary documentation.
>>
>> The page of examples John and others have put up on the DarwinCore site is
>> non-normative, uncontrolled documentation.
>>
>> The thing you were proposing sounded like an applicability statement —
>> offering guidance about how another standard, RDF, should be used in
>> biodiversity informatics.  These can also be treated as standards, and get
>> TDWG ratification as a standard, but don’t create a de-novo standard.
>>
>> Interest groups and task groups are explained in the Process.  If you want
>> to create an applicability statement for RDF and DarwinCore, you could
>> prepare a task group charter and submit it to the executive for approval.
>>  Approval would make it a formal Task Group.  See other task group charters
>> for examples.
>>
>> -Stan
>>
>>
>>
>>
>> On 10/13/10 6:33 AM, "Steve Baskauf" <steve.baskauf at vanderbilt.edu>
>> wrote:
>>
>>  OK, because of a momentary heavy work load I'm still in the process of
>> getting caught up on this thread, but this is moving so fast I feel like I'm
>> being left in the dust.  Last week I offered to help facilitate creating
>> some guidelines and examples for RDF/XML in Darwin Core.  I was told that we
>> should follow the community process of forming an interest group, getting
>> participants, etc. and have been waiting for some guidelines on how that
>> process is supposed to work. Now we are surging ahead with examples and help
>> pages again.  Are we following a process or not and if so, what is it?
>> Steve
>>
>> Tim Robertson (GBIF) wrote:
>>
>> I will also help with examples.  If we are doing XML / RDF formats, let's
>> get an example record conforming to the Text guidelines in there as well for
>> completeness (most useful when dealing with checklists).
>>
>>
>>
>>
>>
>> On Oct 12, 2010, at 10:31 PM, John Wieczorek wrote:
>>
>>
>>
>> I am interested in helping with an examples page. The page could have XML
>> and RDF examples illustrating particular use cases, as you have recommended.
>> Create an "Examples" page on the Table of Contents and then have all of the
>> examples on one page with an index of links to specific examples at the top?
>> I made a straw man page to show what I am thinking at
>> http://code.google.com/p/darwincore/wiki/Examples.
>>
>>
>> On Tue, Oct 12, 2010 at 11:41 AM, "Markus Döring (GBIF)" <
>> mdoering at gbif.org> wrote:
>>
>>
>> Would we have the energy to compile example dwc records on how to use
>> darwin core for certain use cases?
>> The lack of guidance on how to use darwin core was mentioned earlier. An
>> additional example webpage for the dwc website would surely be really
>> helpful, and not only for newbies. A dwc record for bird watching, vegetation
>> plot surveys, insect specimen collection, herbarium sheets, zoological
>> garden visits, tissue sample, DNA sequence, marine fishing net catches, etc.
>>
>> I'd volunteer to do the html page if I'm given example records with a short
>> use case description...
>>
>> Markus
>>
>>
>>
>>
>> On Oct 12, 2010, at 13:14, Roger Hyam wrote:
>>
>> > Wow - what a thread to come back to.
>> >
>> > I saw my name mentioned so I ought to chip in. I also think we are
>> conflating two distinct things under the name "occurrence".
>> >
>> > This point is largely just expanding on what Kevin just said. Going down
>> the road he was wise enough not to go down!
>> >
>> > The vocabulary I briefly presented at TDWG was aimed at occurrence of
>> taxa in regions but the general thrust of my talk was intended to pose the
>> questions: Why should we score taxa to regions at all? Shouldn't this always
>> be the results of a query on occurrence records? The answer will always
>> depend on the question asked.
>> >
>> > Take two examples.
>> >
>> > A tiger roaming "free" in London living off a diet of squirrels and
>> tourists. Occurrence records for this organism are just occurrence records.
>> Why the tiger is in London (climate change, introduction, invasion, escape)
>> is not a quality of it being there. They are value judgements added later.
>> >
>> > A tiger sitting in a cage at London Zoo is "managed" in that it is being
>> maintained there by a human effort. We are recording the fact that someone
>> has placed it there and held it in that position for our edification.
>> >
>> > As Kevin says, when I observe an individual (or flock of individuals) I
>> do not observe their "introducedness" or their "nativeness" this is
>> something that is derived from combining multiple observations of occurrence
>> of individuals.
>> >
>> > I would therefore advocate that we just have a flag on an occurrence
>> record that says "intended for distribution" i.e. this is not maintained
>> here in a garden/zoo/farm etc. To say any more on an occurrence record is
>> misleading and there are occasions when even this flag will be ignored in
>> analysis. I think we already have this field.
>> >
>> > There are of course grey areas (biology always has grey areas). A Scots
>> Pine growing in the highlands may be part of a 150 year old naturalistic
>> plantation. It is therefore native to the region, possibly of local genetic
>> stock but has been planted in that position. For some applications this
>> could be considered managed and for others not.
>> >
>> > The status of taxa in regions is a completely different thing. As soon
>> as we talk about aggregating multiple observations (or lack of them) then we
>> are talking about the results of analysis instead of primary observations.
>>  Only at this point should we be talking about the status of the
>> "occurrence" in terms of native/invasive/naturalised etc. This may not even
>> be based on extant records. For example, a taxon can be invasive in an area
>> without actually occurring there, i.e. it used to be there but is presumed
>> to have been eradicated.
>> >
>> > Does the problem occur because we are using the same term "occurrence"
>> to mean both a primary unit of data gathering and the result of an analysis
>> (possibly even just a hypothesis if it is the result of niche modelling)?
>> How could we differentiate between these two? The discussion probably comes
>> back to 'basisOfRecord' again and our fundamental classes of object.
>> >
>> > Sorry to be long winded.
>> >
>> > Roger
>> >
>> >
>> > On 12 Oct 2010, at 09:36, Kevin Richards wrote:
>> >
>> >> I also have always felt that "nativeness" should apply more to an
>> occurrence than a taxon, but have swayed from one opinion to the other on a
>> regular basis.  So my conclusion is that "nativeness" is a property of both,
>> and requires both, in a way - and that these different perspectives are
>> actually the same thing.
>> >>
>> >> Eg, if we describe (in a basic way) :
>> >> Occurrence = Taxon at Location
>> >>
>> >> then if we say that Nativeness is a property of a Taxon that is
>> restricted by Location  (Jerry's view)
>> >> then this is equivalent to saying that Nativeness is a property of an
>> Occurrence! (Rich's view)
>> >>
>> >> As Rich points out, it doesn't make a whole lot of sense to apply
>> Nativeness to a single occurrence, but I'm not sure this is what is meant by
>> stating that "this specimen of Poa anceps that I collected from Christchurch
>> is 'Native'" - but more that "I have found a specimen of Poa anceps in
>> Christchurch and from knowledge of other previously recorded occurrences, I
>> know that this occurrence/taxon is Native in this area"
>> >>
>> >> Also I tend to feel that a lot of biodiversity properties are
>> properties of occurrences  - EVEN taxon names are a property of an occurrence
>> and not of this 'concept' of a species - but I won't go down that road right
>> now   :-)
>> >>
>> >> Also, we discussed this topic a while ago on the tdwg content list,
>> having worked out that "nativeness" or what we call "biostatus" is a fairly
>> complicated topic, involving taxon names, locations, time, and aspects like
>> 'origin' and 'presence', ...
>> >>
>> >> Kevin
>> >>
>> >> ________________________________________
>> >> From: tdwg-content-bounces at lists.tdwg.org [
>> tdwg-content-bounces at lists.tdwg.org] On Behalf Of Richard Pyle [
>> deepreef at bishopmuseum.org]
>> >> Sent: Tuesday, 12 October 2010 5:41 p.m.
>> >> To: Jerry Cooper; tdwg-content at lists.tdwg.org;
>> tdwg-bioblitz at googlegroups.com
>> >> Subject: Re: [tdwg-content] What I learned at the TechnoBioBlitz
>> >>
>> >> Hi Jerry,
>> >>
>> >> Before we agree to disagree, let me try to elaborate a bit more:
>> >>
>> >> I think we both agree that "Nativeness" (to borrow Dave's term) is a
>> >> property of a taxon at a geographic locality (it could also be a
>> property of
>> >> a taxon in a class of habitat, but few people actually frame it this
>> way).
>> >>
>> >> The reason I think that "Nativeness" is best represented as a property
>> of an
>> >> Occurrence, rather than of a taxon, is that a taxon is a circumscribed
>> set
>> >> of organisms, usually based on evolutionary relatedness or
>> morphological or
>> >> genetic similarity.  By contrast, an Occurrence is about the presence
>> of a
>> >> member or multiple members of a taxon concept in space and time (i.e.,
>> at a
>> >> particular place and time).
>> >>
>> >> We often think of Occurrence records in terms of individual organisms
>> (e.g.,
>> >> specimens, or specific observed or photographed organisms), and I
>> agree,
>> >> it's weird to think of "Nativeness" as it applies to an individual
>> organism.
>> >> However, my understanding is that Occurrence instances can also apply
>> to
>> >> populations -- which is why terms such as establishmentMeans and
>> >> occurrenceStatus fit into this class.
>> >>
>> >> More generally, if we agree that "Nativeness" is a property of a taxon
>> at a
>> >> particular locality, the way that this intersection is usually manifest
>> in
>> >> DwC is via Occurrence and Event instances.
>> >>
>> >> How else would you represent "Nativeness" within DwC?
>> >>
>> >> Aloha,
>> >> Rich
>> >>
>> >>> -----Original Message-----
>> >>> From: tdwg-content-bounces at lists.tdwg.org
>> >>> [mailto:tdwg-content-bounces at lists.tdwg.org<tdwg-content-bounces at lists.tdwg.org>]
>> On Behalf Of Jerry Cooper
>> >>> Sent: Monday, October 11, 2010 6:02 PM
>> >>> To: tdwg-content at lists.tdwg.org; tdwg-bioblitz at googlegroups.com
>> >>> Subject: Re: [tdwg-content] What I learned at the TechnoBioBlitz
>> >>>
>> >>> We will have to agree to disagree.
>> >>>
>> >>> For me at least 'Native',  'Invasive' etc are clearly not
>> >>> properties associated with a collection event. They are
>> >>> collective statements, not necessarily about properties of
>> >>> the taxon as a whole, but about the properties of a taxon in
>> >>> some restricted sense - usually geographically restricted.
>> >>>
>> >>> GISIN, like our model here in  NZ, pulls together such items
>> >>> under a triplet of taxon/occurrence statement/geographical
>> >>> extent linked to a publication.
>> >>>
>> >>>
>> >>> Jerry
>> >>>
>> >>>
>> >>> -----Original Message-----
>> >>> From: Richard Pyle [mailto:deepreef at bishopmuseum.org<deepreef at bishopmuseum.org>
>> ]
>> >>> Sent: Tuesday, 12 October 2010 4:23 p.m.
>> >>> To: Jerry Cooper
>> >>> Cc: tdwg-content at lists.tdwg.org; tdwg-bioblitz at googlegroups.com
>> >>> Subject: RE: [tdwg-content] What I learned at the TechnoBioBlitz
>> >>>
>> >>> Hi Jerry,
>> >>>
>> >>> Yes, this is a road I've been down before.  Intuitively,
>> >>> these terms seem like they should apply to taxon concepts,
>> >>> but it turns out that's not the right way to do it.  Things
>> >>> like "native" and "invasive" are not properties of taxon
>> >>> concepts; they're the property of an occurrence (which, I
>> >>> suspect, is why establishmentMeans is included in the
>> >>> Occurrence class in DwC; e.g., see the examples at
>> >>> http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans
>> >>>
>> >>> Rich
>> >>>
>> >>> ________________________________
>> >>>
>> >>>        From: tdwg-content-bounces at lists.tdwg.org
>> >>> [mailto:tdwg-content-bounces at lists.tdwg.org<tdwg-content-bounces at lists.tdwg.org>]
>> On Behalf Of Jerry Cooper
>> >>>        Sent: Monday, October 11, 2010 4:38 PM
>> >>>        Cc: tdwg-content at lists.tdwg.org;
>> >>> tdwg-bioblitz at googlegroups.com
>> >>>        Subject: Re: [tdwg-content] What I learned at the
>> >>> TechnoBioBlitz
>> >>>
>> >>>
>> >>>
>> >>>        Rich,
>> >>>
>> >>>
>> >>>
>> >>>        Let's not confuse those terms which are best applied
>> >>> to a taxon concept rather than a  specific
>> >>> collection/observation of a taxon at a location.
>> >>>
>> >>>
>> >>>
>> >>>         There are existing vocabularies for taxon-related
>> >>> provenance, like those in GISIN, or the vocabulary Roger
>> >>> mentioned in his PESI talk at TDWG.
>> >>>
>> >>>
>> >>>
>> >>>        However, against a specific collection you can only
>> >>> record what the recorder actually knows at that location for
>> >>> that specific collected taxon, and not to infer a status like
>> >>> 'introduced' etc.
>> >>>
>> >>>
>> >>>
>> >>>        So, to me, the vocabulary reduces even further - and
>> >>> the obvious ones are 'in cultivation', 'in captivity',
>> >>> 'border intercept' . Our botanical collection management
>> >>> system would hold more data on provenance of a specific
>> >>> collection and linkages between events - from the wild at t=1,
>> >>> x=1 to cultivation in botanic garden Y at t=2, X=2 etc. But
>> >>> then we often have that data because we are generating it.
>> >>>
>> >>>
>> >>>
>> >>>        Jerry
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>        From: tdwg-content-bounces at lists.tdwg.org
>> >>> [mailto:tdwg-content-bounces at lists.tdwg.org<tdwg-content-bounces at lists.tdwg.org>]
>> On Behalf Of Richard Pyle
>> >>>        Sent: Tuesday, 12 October 2010 3:27 p.m.
>> >>>        To: Donald.Hobern at csiro.au; tuco at berkeley.edu
>> >>>        Cc: tdwg-content at lists.tdwg.org;
>> >>> tdwg-bioblitz at googlegroups.com
>> >>>        Subject: Re: [tdwg-content] What I learned at the
>> >>> TechnoBioBlitz
>> >>>
>> >>>
>> >>>
>> >>>        I certainly agree it's important!  I was just saying
>> >>> that a simple flag probably wouldn't be enough.  I like the
>> >>> idea of a controlled vocabulary (as you and John both allude
>> >>> to), and I can imagine about a half-dozen terms that our
>> >>> community will no-doubt adopt with almost no debate.....  :-)
>> >>>
>> >>>
>> >>>
>> >>>        In my mind, the broadest categories (and likely most
>> >>> useful) would be something like:
>> >>>
>> >>>
>> >>>
>> >>>        Native (was there without any assistance from humans)
>> >>>
>> >>>        Introduced (got there with the assistance of humans,
>> >>> but is inhabiting the natural environment)
>> >>>
>> >>>        Captive (brought by humans and still maintained in captivity)
>> >>>
>> >>>
>> >>>
>> >>>        You might also throw in "Cryptogenic", which is an
>> >>> assertion that we do not know which of these categories a
>> >>> particular organism falls into (not the same as null, which
>> >>> means we don't know whether or not we know).
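
A minimal sketch of the controlled vocabulary suggested above. The term list is only the one proposed in this message, not an adopted Darwin Core vocabulary, and the names EstablishmentMeans and parse_establishment are hypothetical. It keeps "cryptogenic" (we know that we do not know) distinct from a null value (nothing was recorded):

```python
from enum import Enum
from typing import Optional


class EstablishmentMeans(Enum):
    """Hypothetical controlled terms, taken from the list proposed above."""
    NATIVE = "native"            # there without any assistance from humans
    INTRODUCED = "introduced"    # assisted by humans, living in the natural environment
    CAPTIVE = "captive"          # brought by humans and still maintained in captivity
    CRYPTOGENIC = "cryptogenic"  # asserted: the category is not known


def parse_establishment(value: Optional[str]) -> Optional[EstablishmentMeans]:
    """Map a free-text value to a controlled term.

    Returns None when nothing was recorded (the "null" case), which is
    deliberately different from EstablishmentMeans.CRYPTOGENIC.
    Raises ValueError for text that is not in the vocabulary.
    """
    if value is None or not value.strip():
        return None
    return EstablishmentMeans(value.strip().lower())
```

Rejecting unknown text with an error, rather than silently accepting it, is one possible design choice; a provider could equally keep the verbatim text alongside the controlled term.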
>> >>>
>> >>>
>> >>>
>> >>>        Of course, each of these can be further subdivided,
>> >>> but the more we subdivide, the greater the ratio of
>> >>> fuzzy:clean distinctions. I would say that the terms should
>> >>> be established in consultation with those most likely to use
>> >>> them (e.g., as you suggest, distribution analysts, niche
>> >>> modellers, etc.).  For example, it might be useful to
>> >>> distinguish between an organism that was itself introduced,
>> >>> compared to the progeny (or a well-established population) of
>> >>> an introduced organism. This information can be
>> >>> useful for separating things likely to become established in
>> >>> new localities, vs. things that do not seem to "take" in a
>> >>> novel environment.
>> >>>
>> >>>        Anyway...I didn't want to say a lot on this topic
>> >>> (too late?); I just wanted to steer more towards controlled
>> >>> vocabulary, than simple flag field.
>> >>>
>> >>>
>> >>>
>> >>>        Aloha,
>> >>>
>> >>>        Rich
>> >>>
>> >>>
>> >>>
>> >>>                ________________________________
>> >>>
>> >>>                                From: Donald.Hobern at csiro.au
>> >>> [mailto:Donald.Hobern at csiro.au <Donald.Hobern at csiro.au>]
>> >>>                Sent: Monday, October 11, 2010 3:44 PM
>> >>>                To: Richard Pyle; tuco at berkeley.edu
>> >>>                Cc: tdwg-content at lists.tdwg.org;
>> >>> tdwg-bioblitz at googlegroups.com
>> >>>                Subject: RE: [tdwg-content] What I learned at
>> >>> the TechnoBioBlitz
>> >>>
>> >>>                Hi Rich.
>> >>>
>> >>>
>> >>>
>> >>>                I recognise this (and could probably define
>> >>> many different useful flags).  The bottom line is really
>> >>> whether or not the location is one which should be used for
>> >>> distribution analysis, niche modelling and similar
>> >>> activities.  There will certainly be many grey areas, but it
>> >>> would be good if software could weed out captive occurrences.
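
A minimal sketch of how software might weed out captive occurrences before distribution analysis. It assumes each record is a dict carrying a controlled establishmentMeans value such as "captive"; the function name, the term set and the sample records are hypothetical, for illustration only:

```python
# Assumed controlled values that mark a record as unsuitable for
# distribution analysis or niche modelling (an illustrative choice).
EXCLUDED_TERMS = {"captive"}


def wild_occurrences(records):
    """Return the records that are not flagged as captive.

    Records with a missing or empty establishmentMeans are kept, because
    an absent value says nothing either way (one of the grey areas).
    """
    kept = []
    for record in records:
        term = (record.get("establishmentMeans") or "").strip().lower()
        if term not in EXCLUDED_TERMS:
            kept.append(record)
    return kept


# Illustrative sample data only, not real observations.
sample = [
    {"scientificName": "Panthera leo", "establishmentMeans": "Captive"},
    {"scientificName": "Sciurus carolinensis", "establishmentMeans": "native"},
    {"scientificName": "Danaus plexippus", "establishmentMeans": None},
]

for record in wild_occurrences(sample):
    print(record["scientificName"])
```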
>> >>>
>> >>>
>> >>>
>> >>>                Donald
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>                        Donald Hobern, Director, Atlas of
>> >>> Living Australia
>> >>>
>> >>>                CSIRO Ecosystem Sciences, GPO Box 1700,
>> >>> Canberra, ACT 2601
>> >>>
>> >>>                Phone: (02) 62464352 Mobile: 0437990208
>> >>>
>> >>>                Email: Donald.Hobern at csiro.au
>> >>> <mailto:Donald.Hobern at csiro.au <Donald.Hobern at csiro.au>>
>> >>>
>> >>>                Web: http://www.ala.org.au/
>> >>>
>> >>>                From: Richard Pyle [mailto:deepreef at bishopmuseum.org<deepreef at bishopmuseum.org>
>> ]
>> >>>                Sent: Tuesday, 12 October 2010 12:33 PM
>> >>>                To: Hobern, Donald (CES, Black Mountain);
>> >>> tuco at berkeley.edu
>> >>>                Cc: tdwg-content at lists.tdwg.org;
>> >>> tdwg-bioblitz at googlegroups.com
>> >>>                Subject: RE: [tdwg-content] What I learned at
>> >>> the TechnoBioBlitz
>> >>>
>> >>>
>> >>>
>> >>>                I'm not so sure a simple flag will do it.  We
>> >>> have examples ranging from animals in zoos, to escaped
>> >>> animals, to intentionally and unintentionally introduced
>> >>> populations, to naturalized populations -- and just about
>> >>> everything in-between.  Where on this spectrum would you draw
>> >>> the line for flagging something as "naturally occurring"?
>> >>>
>> >>>
>> >>>
>> >>>                Rich
>> >>>
>> >>>
>> >>>
>> >>>                        ________________________________
>> >>>
>> >>>                                                From:
>> >>> tdwg-content-bounces at lists.tdwg.org
>> >>> [mailto:tdwg-content-bounces at lists.tdwg.org<tdwg-content-bounces at lists.tdwg.org>]
>> On Behalf Of
>> >>> Donald.Hobern at csiro.au
>> >>>                        Sent: Monday, October 11, 2010 2:59 PM
>> >>>                        To: tuco at berkeley.edu
>> >>>                        Cc: tdwg-content at lists.tdwg.org;
>> >>> tdwg-bioblitz at googlegroups.com
>> >>>                        Subject: Re: [tdwg-content] What I
>> >>> learned at the TechnoBioBlitz
>> >>>
>> >>>                        Thanks, John.
>> >>>
>> >>>
>> >>>
>> >>>                        This is useful, but completely
>> >>> uncontrolled - effectively a verbatimEstablishmentMeans.
>> >>> Having a more controlled version or a simple flag which could
>> >>> be machine-processible in those cases where providers can
>> >>> supply it would be useful.
>> >>>
>> >>>
>> >>>
>> >>>                        Donald
>> >>>
>> >>>
>> >>>
>> >>>                        From: gtuco.btuco at gmail.com
>> >>> [mailto:gtuco.btuco at gmail.com <gtuco.btuco at gmail.com>] On Behalf Of
>> John Wieczorek
>> >>>                        Sent: Tuesday, 12 October 2010 11:34 AM
>> >>>                        To: Hobern, Donald (CES, Black Mountain)
>> >>>                        Cc: jsachs at csee.umbc.edu;
>> >>> tdwg-bioblitz at googlegroups.com; tdwg-content at lists.tdwg.org
>> >>>                        Subject: Re: [tdwg-content] What I
>> >>> learned at the TechnoBioBlitz
>> >>>
>> >>>
>> >>>
>> >>>                        Natural occurrence is meant to be
>> >>> captured through the term dwc:establishmentMeans
>> >>> (http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans).
>> >>>
>> >>>                        On Mon, Oct 11, 2010 at 5:16 PM,
>> >>> <Donald.Hobern at csiro.au> wrote:
>> >>>
>> >>>                        Thanks, Joel.
>> >>>
>> >>>                        Nice summary.  One addition which we
>> >>> do need to resolve (and which has been suggested in recent
>> >>> months) is to have a flag to indicate whether a record should
>> >>> be considered to show a "natural"
>> >>> occurrence (in distinction from cultivation, botanic gardens,
>> >>> zoos, etc.).
>> >>> This is not so much an issue in a BioBlitz, but is certainly
>> >>> a factor with citizen science recording in general - see the
>> >>> number of zoo animals in the Flickr EOL group.
>> >>>
>> >>>                        Donald
>> >>>
>> >>>                        -----Original Message-----
>> >>>                        From: tdwg-content-bounces at lists.tdwg.org
>> >>> [mailto:tdwg-content-bounces at lists.tdwg.org<tdwg-content-bounces at lists.tdwg.org>]
>> On Behalf Of joel sachs
>> >>>                        Sent: Monday, 11 October 2010 10:47 PM
>> >>>                        To: tdwg-bioblitz at googlegroups.com;
>> >>> tdwg-content at lists.tdwg.org
>> >>>                        Subject: [tdwg-content] What I
>> >>> learned at the TechnoBioBlitz
>> >>>
>> >>>                        One of the goals of the recent
>> >>> bioblitz was to think about the suitability and
>> >>> appropriateness of TDWG standards for citizen science. Robert
>> >>> Stevenson has volunteered to take the lead on preparing a
>> >>> technobioblitz lessons learned document, and though the scope
>> >>> of this document is not yet determined, I think the audience
>> >>> will include bioblitz organizers, software developers, and
>> >>> TDWG as a whole. I hope no one is shy about sharing lessons
>> >>> they think they learned, or suggestions that they have. We
>> >>> can use the bioblitz google group for this discussion, and
>> >>> copy in tdwg-content when our discussion is standards-specific.
>> >>>
>> >>>                        Here are some of my immediate observations:
>> >>>
>> >>>                        1. Darwin Core is almost exactly
>> >>> right for citizen science. However, there is a desperate need
>> >>> for examples and templates of its use. To illustrate this
>> >>> need: one of the developers spoke of the design choice
>> >>> between "a simple csv file and a Darwin Core record". But a
>> >>> simple csv file is a legitimate representation of Darwin
>> >>> Core! To be fair to the developer, such a sentence might not
>> >>> have struck me as absurd a year ago, before Remsen said
>> >>> "let's use DwC for the bioblitz".
>> >>>
>> >>>                        We provided a couple of example DwC
>> >>> records (text and rdf) in the bioblitz data profile [1]. I
>> >>> think the lessons learned document should include an on-line
>> >>> catalog of cut-and-pasteable examples covering a variety of
>> >>> use cases, together with a dead simple description of DwC,
>> >>> something like "Darwin Core is a collection of terms,
>> >>> together with definitions."
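
As one possible cut-and-pasteable example of the point above that a simple csv file is a legitimate representation of Darwin Core, here is a minimal sketch that writes records as CSV with Darwin Core term names as column headers. The function name and the sample record (name, date, values) are hypothetical illustrations, not data from the bioblitz:

```python
import csv
import io

# Column headers are Darwin Core term names.
FIELDS = [
    "scientificName",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
    "geodeticDatum",
    "basisOfRecord",
]


def to_dwc_csv(records):
    """Serialise a list of dicts to CSV text, one row per occurrence."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative record only.
example = [
    {
        "scientificName": "Danaus plexippus",
        "eventDate": "2010-10-01",
        "decimalLatitude": 41.5,
        "decimalLongitude": -70.7,
        "geodeticDatum": "WGS84",
        "basisOfRecord": "HumanObservation",
    }
]

print(to_dwc_csv(example))
```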
>> >>>
>> >>>                        Here are areas where we augmented or
>> >>> diverged from DwC in the bioblitz:
>> >>>
>> >>>                        i. We added obs:observedBy [2], since
>> >>> there is no equivalent property in DwC, and it's important in
>> >>> Citizen Science (though often not available).
>> >>>
>> >>>                        ii. We used geo:lat and geo:long [3]
>> >>> instead of DwC terms for latitude and longitude. The geo
>> >>> namespace is a well used and supported standard, and records
>> >>> with geo coordinates are automatically mapped by several
>> >>> applications. Since everyone was using GPS  to retrieve their
>> >>> coordinates, we were able to assume WGS-84 as the datum.
>> >>>
>> >>>                        If someone had used another Datum,
>> >>> say XYZ, we would have added columns to the Fusion table so
>> >>> that they could have expressed their coordinates in DwC, as, e.g.:
>> >>>                        DwC:decimalLatitude=41.5
>> >>>                        DwC:decimalLongitude=-70.7
>> >>>                        DwC:geodeticDatum=XYZ
>> >>>
>> >>>                        (I would argue that it should be
>> >>> kosher DwC to express the above as simply XYZ:lat and
>> >>> XYZ:long. DwC already incorporates terms from other
>> >>> namespaces, such as Dublin Core, so there is precedent for this.)
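
A minimal sketch of the mapping described above, from geo:lat / geo:long to the Darwin Core coordinate terms with the datum made explicit. The function name, the dict-based record layout and the datum string "WGS84" are assumptions for illustration:

```python
def geo_to_dwc(record, datum="WGS84"):
    """Return a copy of the record using DwC coordinate terms.

    geo:lat and geo:long are replaced by decimalLatitude and
    decimalLongitude, and the assumed datum is stated explicitly
    in geodeticDatum. All other keys are carried over unchanged.
    """
    converted = {
        key: value
        for key, value in record.items()
        if key not in ("geo:lat", "geo:long")
    }
    converted["decimalLatitude"] = float(record["geo:lat"])
    converted["decimalLongitude"] = float(record["geo:long"])
    converted["geodeticDatum"] = datum
    return converted


print(geo_to_dwc({"geo:lat": "41.5", "geo:long": "-70.7"}))
```

A record captured with another datum would pass that datum explicitly, e.g. geo_to_dwc(record, datum="XYZ"), rather than relying on the default.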
>> >>>
>> >>>                        2. DwC:scientificName might be more
>> >>> user friendly than taxonomy:binomial and the other taxonomy
>> >>> machine tags EOL uses for flickr images.  If
>> >>> DwC:scientificName isn't self-explanatory enough, a user can
>> >>> look it up, and see that any scientific name is acceptable,
>> >>> at any taxonomic rank, or not having any rank. And once we
>> >>> have a scientific name, higher ranks can be inferred.
>> >>>
>> >>>                        3. Catalogue of Life was an important
>> >>> part of the workflow, but we had some problems with it.
>> >>> Future bioblitzes might consider using something like a CoL
>> >>> fork, as recently described by Rod Page [4].
>> >>>
>> >>>                        4. We didn't include "basisOfRecord"
>> >>> in the original data profile, and so it wasn't a column in
>> >>> the Fusion Table [5]. But when a transcriber felt it was
>> >>> necessary to include in order to capture data in a particular
>> >>> field sheet, she just added the column to the table. This
>> >>> flexibility of schema is important, and is in harmony with
>> >>> the semantic web.
>> >>>
>> >>>                        5. There seemed to be enthusiasm for
>> >>> another field event at next year's TDWG. This could be an
>> >>> opportunity to gather other types of data (e.g. character
>> >>> data) and thereby i) expose meeting participants to
>> >>> another set of everyday problems from the world of
>> >>> biodiversity workflows, and ii) try other TDWG technology on
>> >>> for size, e.g. the observation exchange format, annotation
>> >>> framework, etc.
>> >>>
>> >>>
>> >>>                        Happy Thanksgiving to all in Canada -
>> >>>                        Joel.
>> >>>                        ----
>> >>>
>> >>>
>> >>>                        1.
>> >>> http://groups.google.com/group/tdwg-bioblitz/web/tdwg-bioblitz-profile-v1-1
>> >>>                        2. Slightly bastardizing our old
>> >>> observation ontology -
>> >>> http://spire.umbc.edu/ontologies/Observation.owl
>> >>>                        3. http://www.w3.org/2003/01/geo/
>> >>>                        4.
>> >>> http://iphylo.blogspot.com/2010/10/replicating-and-forking-data-in-2010.html
>> >>>                        5.
>> >>> http://tables.googlelabs.com/DataSource?dsrcid=248798
>> >>>
>> >>>
>> >>> _______________________________________________
>> >>>                        tdwg-content mailing list
>> >>>                        tdwg-content at lists.tdwg.org
>> >>>
>> >>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> ________________________________
>> >>>
>> >>>        Please consider the environment before printing this email
>> >>>        Warning: This electronic message together with any
>> >>> attachments is confidential. If you receive it in error: (i)
>> >>> you must not read, use, disclose, copy or retain it; (ii)
>> >>> please contact the sender immediately by reply email and then
>> >>> delete the emails.
>> >>>        The views expressed in this email may not be those of
>> >>> Landcare Research New Zealand Limited.
>> >>> http://www.landcareresearch.co.nz
>> >>>
>> --
>> Steven J. Baskauf, Ph.D., Senior Lecturer
>> Vanderbilt University Dept. of Biological Sciences
>>
>> postal mail address:
>> VU Station B 351634
>> Nashville, TN  37235-1634,  U.S.A.
>>
>> delivery address:
>> 2125 Stevenson Center
>> 1161 21st Ave., S.
>> Nashville, TN 37235
>>
>> office: 2128 Stevenson Center
>> phone: (615) 343-4582,  fax: (615) 343-6707
>> http://bioimages.vanderbilt.edu
>>
>
>
> --
> ----------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
> Knowledge Base <http://lod.geospecies.org/>
> About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
> ------------------------------------------------------------
>



-- 
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------