I was just ready to leave work when I wrote this and since then I'm
feeling like I should clarify just what I mean by "wrong" ways of using
RDF. I recognize that TDWG encourages flexibility in the ways that
standards such as DwC are used. As such, it doesn't usually define
"right" and "wrong" ways of using the standards. What I mean by
calling some uses "wrong" is not intended to discourage the creative
use of DwC terms in RDF. What I mean is that one must be careful to
make sure that RDF statements mean what is intended. Here is an
example. The Dublin Core term dcterms:language means "the language of
the resource". On multiple occasions, I've seen this term used in RDF
as a property of a resource whose metadata is written in a certain
language. This is "wrong" because the subject of the statement is the
resource itself, not the resource's metadata. The need for this kind
of clarity is apparent in the case of media. For example, if we are
providing metadata in English that describes a nature film which has
audio in German, the correct statement is that [film] dcterms:language
"de", NOT [film] dcterms:language "en". This problem is handled
appropriately in the MRTG schema by creating the (required) term
mrtg:metadataLanguage. The correct statement would be [film]
mrtg:metadataLanguage "en" . (I'm using "[film]" in lieu of a URI
identifier for the film.) If, however, we were writing RDF to describe
the metadata itself rather than the film, then it would be appropriate
to say [film's metadata] dcterms:language "en" . In straight XML, we
might get away with semantic sloppiness if the senders and receivers of
the XML "understand" what the intended subject is of the term
dcterms:language. But in RDF, we have to assume that the receiver of
the RDF is a "stupid" computer which only infers exactly what is said
and not what we MEANT to say.
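To make the film example concrete, here is a minimal Python sketch that stores the statements as plain (subject, predicate, object) tuples. It is only an illustration, not real RDF tooling: the example.org URIs are made-up placeholders standing in for "[film]" and "[film's metadata]", and the prefixed names are kept as simple strings.

```python
# Hypothetical identifiers: the example.org URIs are placeholders, not real resources.
FILM = "http://example.org/film/1"
FILM_METADATA = "http://example.org/film/1/metadata"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (FILM, "dcterms:language", "de"),           # the film's audio is in German
    (FILM, "mrtg:metadataLanguage", "en"),      # the film's metadata is written in English
    (FILM_METADATA, "dcterms:language", "en"),  # a statement about the metadata itself
}

def objects(subject, predicate):
    """Return every object asserted for exactly this subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}

# A consumer only sees what was actually stated about each subject.
film_language = objects(FILM, "dcterms:language")
metadata_language = objects(FILM_METADATA, "dcterms:language")
print(film_language, metadata_language)
```

The lookup never guesses: asking for the language of the film returns only what was asserted with the film as the subject, which is the behavior we have to assume of any RDF consumer.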
I believe that this is a very important point that all parties need to
keep in mind before we happily march off creating RDF templates for the
general public to use. In particular, I have some serious problems
with the way that people are associating properties with instances of
the dwc:Occurrence class. I believe that these "wrong" ways originate
with the historical roots of Darwin Core as a means to describe
specimens. I will illustrate what I mean. In many cases, a specimen
is created by killing an organism and gluing it to a piece of paper (if
it's a plant) or putting it in a jar (if it's an animal). It is
natural to ask the question "what kind of species is the specimen?".
We can look at the specimen and make a statement like [specimen]
dwc:scientificName "Drosophila melanogaster" and it pretty much makes
sense. However, in the new Darwin Core standard, we have a broader
category of "things" (a.k.a. resources) that we call Occurrences, which
includes specimens but also observations and probably all kinds of
other things like images and DNA samples. If we try to apply the same
kind of statement to other kinds
of Occurrences besides specimens we immediately run into problems. If
we say that [digital image] dwc:scientificName "Drosophila
melanogaster" we are making a nonsensical statement. The digital image
can have properties like its photographer, its format, its pixel
dimensions, etc. but the image itself does not have a scientific name.
The scientific name is a property of the thing that was photographed.
It makes even less sense if we are talking about observations. An
observation is a situation where somebody observes an organism. The
observation can have properties like the observer, the location, etc.
However, if we say [observation] dwc:scientificName "Drosophila
melanogaster" we are saying that that act of observing has a scientific
name. That is an incorrect statement. So the general statement
[Occurrence] dwc:scientificName "Drosophila melanogaster" does not make
sense when applied to all possible types of Occurrences. Rather, the
organism that we are observing is the thing that has a scientific
name.
In all of the examples above, the correct statement is [individual
organism] dwc:scientificName "Drosophila melanogaster". The specimen
is an occurrence of the individual organism. The image is an
occurrence of the individual organism. The observation is an
occurrence of the individual organism. These statements may seem odd
because we are used to thinking of an Occurrence being an occurrence of
the "species" but it's not really. The image is not an image of the
Drosophila species concept nor is it an image of the string "Drosophila
melanogaster". The image is an image of an individual fruit fly. The
individual fruit fly is a representative of the taxon, the image and
the observation are not.
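A rough Python sketch of this arrangement follows, again using plain tuples rather than real RDF. The "ex:" identifiers and the linking property "ex:occurrenceOf" are hypothetical names invented for the illustration; they are not Darwin Core terms.

```python
# Hypothetical identifiers and link property; "ex:occurrenceOf" is NOT a Darwin Core term.
triples = [
    ("ex:fly1", "dwc:scientificName", "Drosophila melanogaster"),
    ("ex:specimen1", "ex:occurrenceOf", "ex:fly1"),
    ("ex:image1", "ex:occurrenceOf", "ex:fly1"),
    ("ex:observation1", "ex:occurrenceOf", "ex:fly1"),
    ("ex:image1", "ex:photographer", "A. Photographer"),  # a property of the image itself
]

def value(subject, predicate):
    """Return the first object stated for this subject and predicate, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None

def scientific_name_of(occurrence):
    """Follow the occurrence to its individual organism, then read the name there."""
    individual = value(occurrence, "ex:occurrenceOf")
    if individual is None:
        return None
    return value(individual, "dwc:scientificName")

print(scientific_name_of("ex:image1"))
```

The name is stated once, on the individual, and every kind of Occurrence reaches it through the same link; nothing claims that the image or the act of observing has a scientific name.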
This point becomes more clear if we look at a situation where several
types of occurrence records are collected from the same individual.
Let's say that we capture a bird, photograph it, collect a feather from
it, collect a DNA sample and band it and let it go. Later somebody
sees the band and reports that as an observation. How do we connect
all of these things? Do we create an identifier for the specimen (the
feather) and then say that the image and the DNA sample came from it?
That would be wrong. We could take an image of the feather, but that
would be a different thing from an image of the bird. We didn't get
the DNA sample from the feather, we got it via a blood sample from the
bird. The band observation is not an observation of the feather, or
the image or the DNA sample. It's an observation of the bird which was
never any kind of specimen living or dead. The bird is an individual
organism and that's what we need to call it. Right now we don't have
anything in Darwin Core that can be used to rdf:type the bird, which
is why I proposed Individual as a Darwin Core class.
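The bird example can be sketched in Python as a list of records that each point at the individual they were derived from. This is only a sketch under stated assumptions: the "ex:" identifiers and the record layout are hypothetical and are not part of Darwin Core.

```python
from collections import defaultdict

# Hypothetical records: (occurrence identifier, kind of occurrence, individual it documents).
records = [
    ("ex:photo1", "image", "ex:bird1"),
    ("ex:feather1", "specimen", "ex:bird1"),
    ("ex:dna1", "DNA sample", "ex:bird1"),
    ("ex:bandSighting1", "observation", "ex:bird1"),
]

# Group the occurrences by the individual organism they document.
by_individual = defaultdict(list)
for occurrence_id, kind, individual in records:
    by_individual[individual].append((occurrence_id, kind))

for individual, occurrences in by_individual.items():
    print(individual, occurrences)
```

None of the four records needs to refer to any of the others; the bird is the only hub, so the later band sighting connects to the photo, the feather, and the DNA sample without pretending to be derived from any of them.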
I could say these things more clearly in RDF, but because many
members of the audience of this message aren't familiar with RDF/XML,
they would probably zone out and the point would be lost. The point is
that we need to have identifiable classes of "resources" (the technical
name for "things" like physical artifacts, concepts, and electronic
representations) for all of the things that we need to describe
and inter-relate in the Darwin Core world. Right now, we are missing
one of the important pieces that we need, which is a class for the
Individual. If we are satisfied with creating an RDF model that only
works for specimens and one-time observations, then we probably don't
need Individual as a Darwin Core class. On the other hand, if TDWG and
GBIF are really serious about creating a system (Darwin Core and RDF
based on it) that can handle other types of Occurrences like multiple
images of live organisms, observations of the same organism over time,
and multiple types of Occurrences collected from the same organism,
then this capability should be built into the system from the start.
When I got back from the TDWG meeting, I was all excited about trying
to use Darwin Core Archives with my live plant image collection.
However, it quickly became evident that it could not work because
Occurrences were at the center of the diagram rather than Individuals.
So unless something changes, we are already embarking on the process of
locking out these other Occurrence types.
I hate to sound like a broken record (do we have those any more?), but
read my paper on this subject. It explains the rationale better than
this email, has nice diagrams, and gives RDF examples to illustrate
everything (https://journals.ku.edu/index.php/jbi/article/view/3664).
If somebody has a better idea of how to develop an internally
consistent system that can handle the problems I've raised here that
DOESN'T involve Individuals (i.e. other "right"[=semantically accurate]
ways to express properties and relationships among Identifications,
Taxa, diverse types of Occurrences, etc.) I'd like to hear what it is.
Or perhaps as Stan has suggested, there needs to be a task group that
can hash out alternative views. But let's have the discussion before
we post models and suggest people use them.
Steve
Steve Baskauf wrote:
Stan,
Thanks for the clarification. My concern here is that standard or not,
if examples are posted on the Darwin Core Google Code site, they will have
an implied "stamp of approval" and will be used by others as a template
(despite that site being labeled as "for discussion and development"
not everyone can post to it and that implies some authority). In the
case of straight XML, that isn't really that big of an issue. XML can
mean whatever one wants as long as there is an agreement between the
sender and the receiver (perhaps in the form of a formal XML schema) as
to what the elements represent. I believe that RDF is a different
beast. When one exposes RDF, the receiver is unknown. Therefore, the
RDF has to actually "mean" something to the receiver without a
pre-arranged agreement. In a generic XML document, the elements can
simply be a list of string values of terms with no implied "meaning"
except what might be inferred by grouping them in a container element.
In RDF, the elements represent properties of particular resources. I
believe strongly that although there may be several "right" ways to
express properties of members of DwC classes, there are many more
"wrong" ways that should not be used. By "right" I mean that they make
sense semantically in that the properties logically are ones that
should actually belong to the described resource. I do not believe
that the discussion of these issues has progressed to the point where
there is a consensus on the "right" use of DwC terms for some types of
resources and therefore I am opposed to the posting of RDF examples on
any official Darwin Core sites without a lot more discussion UNLESS the
examples are clearly labeled as examples intended for discussion and
not for use as templates. If you want such examples, I can provide
http://bioimages.vanderbilt.edu/vanderbilt/4-145.rdf as an example of
an Individual,
http://bioimages.vanderbilt.edu/baskauf/79695.rdf as an example of an
Occurrence that is a live plant image, and
http://bioimages.vanderbilt.edu/rdf/examples/lsu000/0138.rdf as an
example of an Occurrence that is an herbarium specimen.
I would be happy to discuss the reasons why I structured the RDF as I
did (although mostly those examples are already rationalized in
https://journals.ku.edu/index.php/jbi/article/view/3664), but I would
not go so far as to say that they are "right" without some discussion.
What I intended when I suggested that I might write some kind of guide
for Darwin Core represented in RDF/XML was really a document that
explained to beginners what the point was of RDF, the basics of how one
can structure properties in RDF using examples that are Darwin Core
terms, and options for creating URIs that refer to resources that are
described in separate files or within the same file. I wasn't really
suggesting that it be a full-blown recommendation with specific
guidelines for the use of particular terms or structuring of files for
particular classes of resources, although that would be a good thing
ultimately. I guess I was seeing some kind of a beginner's guide as a
way to involve more people (who aren't up on RDF) in the discussion. I
don't think that it should be necessary to complete a full "standards"
process before such a document is made available. It would probably
be better to have some kind of road map where that document would be
the first segment but would later be followed by guidelines for
specific classes of resources with examples. I think that such a
modular approach would be the most beneficial because pieces of it
could actually get done in a timely fashion rather than requiring the
whole thing to be complete before any of it would be accepted.
I do think a task group for Darwin Core RDF would be a good idea. If
nobody is in a huge hurry, I don't mind trying to charter such a group,
although I'd be just as happy if somebody else wanted to do it and I
would just try to be an active participant. I will look at the links you
suggested, thanks.
Steve
Blum, Stan wrote:
Re: [tdwg-content] What I learned at the TechnoBioBlitz
Steve,
The TDWG process for creating standards is here: http://www.tdwg.org/about-tdwg/process/
This is worth reading if you haven’t done so already.
Another document worth reading is the standards format specification: http://www.tdwg.org/standards/147/
I never pushed this “standard” through public review, but it still
functions as a guideline for formatting and for our view of what is and isn’t
within scope of a “standard”. In other words, we are doing our best to
follow the basic ideas laid out there about the kinds of specifications:
Type 1 -- normative specification, versioned;
Type 2 -- versioned, supplementary documentation;
Type 3 -- uncontrolled supplementary documentation.
The page of examples John and others have put up on the DarwinCore site
is non-normative, uncontrolled documentation.
The thing you were proposing sounded like an applicability statement --
offering guidance about how another standard, RDF, should be used in
biodiversity informatics. These can also be treated as standards, and
get TDWG ratification as a standard, but don’t create a de novo standard.
Interest groups and task groups are explained in the Process. If you
want to create an applicability statement for RDF and DarwinCore, you
could prepare a task group charter and submit it to the executive for
approval. Approval would make it a formal Task Group. See other task
group charters for examples.
-Stan
On 10/13/10 6:33 AM, "Steve Baskauf" <steve.baskauf@vanderbilt.edu>
wrote:
OK, because of a momentary heavy work load
I'm still in the process of getting caught up on this thread, but this
is moving so fast I feel like I'm being left in the dust. Last week I
offered to help facilitate creating some guidelines and examples for
RDF/XML in Darwin Core. I was told that we should follow the community
process of forming an interest group, getting participants, etc. and
have been waiting for some guidelines on how that process is supposed
to work. Now we are surging ahead with examples and help pages again.
Are we following a process or not and if so, what is it?
Steve
Tim Robertson (GBIF) wrote:
I will also help with examples. If we are
doing XML / RDF formats, let's get an example record conforming to the
Text guidelines in there as well for completeness (most useful when
dealing with checklists).
On Oct 12, 2010, at 10:31 PM, John Wieczorek wrote:
I am interested in helping with an examples
page. The page could have XML and RDF examples illustrating particular
use cases, as you have recommended. Create an "Examples" page on the
Table of Contents and then have all of the examples on one page with an
index of links to specific examples at the top? I made a straw man page
to show what I am thinking at http://code.google.com/p/darwincore/wiki/Examples.
On Tue, Oct 12, 2010 at 11:41 AM, "Markus Döring (GBIF)" <mdoering@gbif.org>
wrote:
Would we have the energy to compile example
DwC records showing how to use Darwin Core for certain use cases?
The lack of guidance on how to use Darwin Core was mentioned earlier.
An additional example webpage for the DwC website would surely be
really helpful, and not only for newbies: a DwC record for bird watching,
vegetation plot surveys, insect specimen collection, herbarium sheets,
zoological garden visits, tissue samples, DNA sequences, marine fishing
net catches, etc.
I'd volunteer to do the HTML page if I'm given example records with a
short use case description...
Markus
On Oct 12, 2010, at 13:14, Roger Hyam wrote:
> Wow - what a thread to come back to.
>
> I saw my name mentioned so I ought to chip in. I also think we are
conflating two distinct things under the name "occurrence".
>
> This point is largely just expanding on what Kevin just said.
Going down the road he was wise enough not to go down!
>
> The vocabulary I briefly presented at TDWG was aimed at occurrence
of taxa in regions but the general thrust of my talk was intended to
pose the questions: Why should we score taxa to regions at all?
Shouldn't this always be the results of a query on occurrence records?
The answer will always depend on the question asked.
>
> Take two examples.
>
> A tiger roaming "free" in London living off a diet of squirrels
and tourists. Occurrence records for this organism are just occurrence
records. Why the tiger is in London (climate change, introduction,
invasion, escape) is not a quality of it being there. They are value
judgements added later.
>
> A tiger sitting in a cage at London Zoo is "managed" in that it is
being maintained there by a human effort. We are recording the fact
that someone has placed it there and held it in that position for our
edification.
>
> As Kevin says, when I observe an individual (or flock of
individuals) I do not observe their "introducedness" or their
"nativeness" this is something that is derived from combining multiple
observations of occurrence of individuals.
>
> I would therefore advocate that we just have a flag on an
occurrence record that says "intended for distribution" i.e. this is
not maintained here in a garden/zoo/farm etc. To say any more on an
occurrence record is misleading and there are occasions when even this
flag will be ignored in analysis. I think we already have this field.
>
> There are of course grey areas (biology always has grey areas). A
Scots Pine growing in the highlands may be part of a 150 year old
naturalistic plantation. It is therefore native to the region, possibly
of local genetic stock but has been planted in that position. For some
applications this could be considered managed and for others not.
>
> The status of taxa in regions is a completely different thing. As
soon as we talk about aggregating multiple observations (or lack of
them) then we are talking about the results of analysis instead of
primary observations. Only at this point should we be talking about
the status of the "occurrence" in terms of native/invasive/naturalised
etc. This may not even be based on extant records. For example, a taxon
can be invasive in an area without actually occurring there. i.e. it
used to be there but is presumed to be eradicated.
>
> Does the problem occur because we are using the same term
"occurrence" to mean both a primary unit of data gathering and the
result of an analysis (possibly even just a hypothesis if it is the
result of niche modelling)? How could we differentiate between these
two? The discussion probably comes back to 'basisOfRecord' again and
our fundamental classes of object.
>
> Sorry to be long winded.
>
> Roger
>
>
> On 12 Oct 2010, at 09:36, Kevin Richards wrote:
>
>> I also have always felt that "nativeness" should apply more to
an occurrence than a taxon, but have swayed from one opinion to the
other on a regular basis. So my conclusion is that "nativeness" is a
property of both, and requires both, in a way - and that these different
perspectives are actually the same thing.
>>
>> E.g., if we describe (in a basic way):
>> Occurrence = Taxon at Location
>>
>> then if we say that Nativeness is a property of a Taxon that
is restricted by Location (Jerry's view)
>> then this is equivalent to saying that Nativeness is a
property of an Occurrence! (Rich's view)
>>
>> As Rich points out, it doesn't make a whole lot of sense to
apply Nativeness to a single occurrence, but I'm not sure this is what
is meant by stating that "this specimen of Poa anceps that I collected
from Christchurch is 'Native'" - but more that "I have found a specimen
of Poa anceps in Christchurch and from knowledge of other previously
recorded occurrences, I know that this occurrence/taxon is Native in this
area"
>>
>> Also I tend to feel that a lot of biodiversity properties are
properties of occurrences - EVEN taxon names are a property of an
occurrence and not of this 'concept' of a species - but I won't go down
that road right now :-)
>>
>> Also, we discussed this topic a while ago on the tdwg content
list, having worked out that "nativeness" or what we call "biostatus"
is a fairly complicated topic, involving taxon names, locations, time,
and aspects like 'origin' and 'presence', ...
>>
>> Kevin
>>
>> ________________________________________
>> From: tdwg-content-bounces@lists.tdwg.org
[tdwg-content-bounces@lists.tdwg.org]
On Behalf Of Richard Pyle [deepreef@bishopmuseum.org]
>> Sent: Tuesday, 12 October 2010 5:41 p.m.
>> To: Jerry Cooper; tdwg-content@lists.tdwg.org; tdwg-bioblitz@googlegroups.com
>> Subject: Re: [tdwg-content] What I learned at the
TechnoBioBlitz
>>
>> Hi Jerry,
>>
>> Before we agree to disagree, let me try to elaborate a bit
more:
>>
>> I think we both agree that "Nativeness" (to borrow Dave's
term) is a
>> property of a taxon at a geographic locality (it could also be
a property of
>> a taxon in a class of habitat, but few people actually frame
it this way).
>>
>> The reason I think that "Nativeness" is best represented as a
property of an
>> Occurrence, rather than of a taxon, is that a taxon is a
circumscribed set
>> of organisms, usually based on evolutionary relatedness or
morphological or
>> genetic similarity. By contrast, an Occurrence is about the
presence of a
>> member or multiple members of a taxon concept in space and
time (i.e., at a
>> particular place and time).
>>
>> We often think of Occurrence records in terms of individual
organisms (e.g.,
>> specimens, or specific observed or photographed organisms),
and I agree,
>> it's weird to think of "Nativeness" as it applies to an
individual organism.
>> However, my understanding is that Occurrence instances can
also apply to
>> populations -- which is why terms such as establishmentMeans
and
>> occurrenceStatus fit into this class.
>>
>> More generally, if we agree that "Nativeness" is a property of
a taxon at a
>> particular locality, the way that this intersection is usually
manifest in
>> DwC is via Occurrence and Event instances.
>>
>> How else would you represent "Nativeness" within DwC?
>>
>> Aloha,
>> Rich
>>
>>> -----Original Message-----
>>> From: tdwg-content-bounces@lists.tdwg.org
>>> [mailto:tdwg-content-bounces@lists.tdwg.org]
On Behalf Of Jerry Cooper
>>> Sent: Monday, October 11, 2010 6:02 PM
>>> To: tdwg-content@lists.tdwg.org; tdwg-bioblitz@googlegroups.com
>>> Subject: Re: [tdwg-content] What I learned at the
TechnoBioBlitz
>>>
>>> We will have to agree to disagree.
>>>
>>> For me at least 'Native', 'Invasive' etc are clearly not
>>> properties associated with a collection event. They are
>>> collective statements, not necessarily about properties of
>>> the taxon as a whole, but about the properties of a taxon
in
>>> some restricted sense - usually geographically restricted.
>>>
>>> GISIN, like our model here in NZ, pulls together such
items
>>> under a triplet of taxon/occurrence statement/geographical
>>> extent linked to a publication.
>>>
>>>
>>> Jerry
>>>
>>>
>>> -----Original Message-----
>>> From: Richard Pyle [mailto:deepreef@bishopmuseum.org]
>>> Sent: Tuesday, 12 October 2010 4:23 p.m.
>>> To: Jerry Cooper
>>> Cc: tdwg-content@lists.tdwg.org; tdwg-bioblitz@googlegroups.com
>>> Subject: RE: [tdwg-content] What I learned at the
TechnoBioBlitz
>>>
>>> Hi Jerry,
>>>
>>> Yes, this is a road I've been down before. Intuitively,
>>> these terms seem like they should apply to taxon concepts,
>>> but it turns out that's not the right way to do it. Things
>>> like "native" and "invasive" are not properties of taxon
>>> concepts; they're properties of an occurrence (which, I
>>> suspect, is why establishmentMeans is included in the
>>> Occurrence class in DwC; e.g., see the examples at
>>> http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans).
>>>
>>> Rich
>>>
>>> ________________________________
>>>
>>> From: tdwg-content-bounces@lists.tdwg.org
>>> [mailto:tdwg-content-bounces@lists.tdwg.org]
On Behalf Of Jerry Cooper
>>> Sent: Monday, October 11, 2010 4:38 PM
>>> Cc: tdwg-content@lists.tdwg.org;
>>> tdwg-bioblitz@googlegroups.com
>>> Subject: Re: [tdwg-content] What I learned at the
>>> TechnoBioBlitz
>>>
>>>
>>>
>>> Rich,
>>>
>>>
>>>
>>> Let's not confuse those terms which are best applied
>>> to a taxon concept rather than a specific
>>> collection/observation of a taxon at a location.
>>>
>>>
>>>
>>> There are existing vocabularies for taxon-related
>>> provenance, like those in GISIN, or the vocabulary Roger
>>> mentioned in his PESI talk at TDWG.
>>>
>>>
>>>
>>> However, against a specific collection you can only
>>> record what the recorder actually knows at that location
for
>>> that specific collected taxon, and not to infer a status
like
>>> 'introduced' etc.
>>>
>>>
>>>
>>> So, to me, the vocabulary reduces even further - and
>>> the obvious ones are 'in cultivation', 'in captivity',
>>> 'border intercept' . Our botanical collection management
>>> system would hold more data on provenance of a specific
>>> collection and linkages between events - from the wild at
t=1,
>>> x=1 to cultivation in botanic garden Y at t=2, X=2 etc. But
>>> then we often have that data because we are generating it.
>>>
>>>
>>>
>>> Jerry
>>>
>>>
>>>
>>>
>>>
>>> From: tdwg-content-bounces@lists.tdwg.org
>>> [mailto:tdwg-content-bounces@lists.tdwg.org]
On Behalf Of Richard Pyle
>>> Sent: Tuesday, 12 October 2010 3:27 p.m.
>>> To: Donald.Hobern@csiro.au; tuco@berkeley.edu
>>> Cc: tdwg-content@lists.tdwg.org;
>>> tdwg-bioblitz@googlegroups.com
>>> Subject: Re: [tdwg-content] What I learned at the
>>> TechnoBioBlitz
>>>
>>>
>>>
>>> I certainly agree it's important! I was just saying
>>> that a simple flag probably wouldn't be enough. I like the
>>> idea of a controlled vocabulary (as you and John both
allude
>>> to), and I can imagine about a half-dozen terms that our
>>> community will no-doubt adopt with almost no debate.....
:-)
>>>
>>>
>>>
>>> In my mind, the broadest categories (and likely most
>>> useful) would be something like:
>>>
>>>
>>>
>>> Native (was there without any assistance from
humans)
>>>
>>> Introduced (got there with the assistance of humans,
>>> but is inhabiting the natural environment)
>>>
>>> Captive (brought by humans and still maintained in
captivity)
>>>
>>>
>>>
>>> You might also throw in "Cryptogenic", which is an
>>> assertion that we do not know which of these categories a
>>> particular organism falls into (not the same as null, which
means
>>> we don't know whether or not we know)
>>>
>>>
>>>
>>> Of course, each of these can be further subdivided,
>>> but the more we subdivide, the greater the ratio of
>>> fuzzy:clean distinctions. I would say that the terms should
>>> be established in consultation with those most likely to
use
>>> them (e.g., as you suggest, distribution analysis, niche
modellers,
>>> etc.) For example, it might be useful to distinguish
between
>>> an organism that was itself introduced, compared to the
>>> progeny (or a well-established
>>> population) of an introduced organism. This information can
be
>>> useful for separating things likely to become established
in
>>> new localities, vs. things that do not seem to "take" in a
>>> novel environment.
>>>
>>> Anyway...I didn't want to say a lot on this topic
>>> (too late?); I just wanted to steer more towards controlled
>>> vocabulary, than simple flag field.
>>>
>>>
>>>
>>> Aloha,
>>>
>>> Rich
>>>
>>>
>>>
>>> ________________________________
>>>
>>> From: Donald.Hobern@csiro.au
>>> [mailto:Donald.Hobern@csiro.au]
>>> Sent: Monday, October 11, 2010 3:44 PM
>>> To: Richard Pyle; tuco@berkeley.edu
>>> Cc: tdwg-content@lists.tdwg.org;
>>> tdwg-bioblitz@googlegroups.com
>>> Subject: RE: [tdwg-content] What I learned
at
>>> the TechnoBioBlitz
>>>
>>> Hi Rich.
>>>
>>>
>>>
>>> I recognise this (and could probably define
>>> many different useful flags). The bottom line is really
>>> whether or not the location is one which should be used for
>>> distribution analysis, niche modelling and similar
>>> activities. There will certainly be many grey areas, but
it
>>> would be good if software could weed out captive
occurrences.
>>>
>>>
>>>
>>> Donald
>>>
>>> Donald Hobern, Director, Atlas of
>>> Living Australia
>>>
>>> CSIRO Ecosystem Sciences, GPO Box 1700,
>>> Canberra, ACT 2601
>>>
>>> Phone: (02) 62464352 Mobile: 0437990208
>>>
>>> Email: Donald.Hobern@csiro.au
>>> <mailto:Donald.Hobern@csiro.au>
>>>
>>> Web: http://www.ala.org.au/
>>>
>>> From: Richard Pyle [mailto:deepreef@bishopmuseum.org]
>>> Sent: Tuesday, 12 October 2010 12:33 PM
>>> To: Hobern, Donald (CES, Black Mountain);
>>> tuco@berkeley.edu
>>> Cc: tdwg-content@lists.tdwg.org;
>>> tdwg-bioblitz@googlegroups.com
>>> Subject: RE: [tdwg-content] What I learned
at
>>> the TechnoBioBlitz
>>>
>>>
>>>
>>> I'm not so sure a simple flag will do it.
We
>>> have examples ranging from animals in zoos, to escaped
>>> animals, to intentionally and unintentionally introduced
>>> populations, to naturalized populations -- and just about
>>> everything in-between. Where on this spectrum would you
draw
>>> the line for flagging something as "naturally occurring"?
>>>
>>>
>>>
>>> Rich
>>>
>>>
>>>
>>> ________________________________
>>>
>>> From:
>>> tdwg-content-bounces@lists.tdwg.org
>>> [mailto:tdwg-content-bounces@lists.tdwg.org]
On Behalf Of
>>> Donald.Hobern@csiro.au
>>> Sent: Monday, October 11, 2010 2:59
PM
>>> To: tuco@berkeley.edu
>>> Cc: tdwg-content@lists.tdwg.org;
>>> tdwg-bioblitz@googlegroups.com
>>> Subject: Re: [tdwg-content] What I
>>> learned at the TechnoBioBlitz
>>>
>>> Thanks, John.
>>>
>>>
>>>
>>> This is useful, but completely
>>> uncontrolled - effectively a verbatimEstablishmentMeans.
>>> Having a more controlled version or a simple flag which
could
>>> be machine-processible in those cases where providers can
>>> supply it would be useful.
>>>
>>>
>>>
>>> Donald
>>>
>>> From: gtuco.btuco@gmail.com
>>> [mailto:gtuco.btuco@gmail.com]
On Behalf Of John Wieczorek
>>> Sent: Tuesday, 12 October 2010
11:34 AM
>>> To: Hobern, Donald (CES, Black
Mountain)
>>> Cc: jsachs@csee.umbc.edu;
>>> tdwg-bioblitz@googlegroups.com;
tdwg-content@lists.tdwg.org
>>> Subject: Re: [tdwg-content] What I
>>> learned at the TechnoBioBlitz
>>>
>>>
>>>
>>> Natural occurrence is meant to be
>>> captured through the term dwc:establishmentMeans
>>> (http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans).
>>>
>>> On Mon, Oct 11, 2010 at 5:16 PM,
>>> <Donald.Hobern@csiro.au>
wrote:
>>>
>>> Thanks, Joel.
>>>
>>> Nice summary. One addition which we
>>> do need to resolve (and which has been suggested in recent
>>> months) is to have a flag to indicate whether a record
should
>>> be considered to show a "natural"
>>> occurrence (in distinction from cultivation, botanic
gardens,
>>> zoos, etc.).
>>> This is not so much an issue in a BioBlitz, but is
certainly
>>> a factor with citizen science recording in general - see
the
>>> number of zoo animals in the Flickr EOL group.
>>>
>>> Donald
>>>
>>> -----Original Message-----
>>> From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of joel sachs
>>> Sent: Monday, 11 October 2010 10:47 PM
>>> To: tdwg-bioblitz@googlegroups.com; tdwg-content@lists.tdwg.org
>>> Subject: [tdwg-content] What I learned at the TechnoBioBlitz
>>>
>>> One of the goals of the recent bioblitz was to think about the
>>> suitability and appropriateness of TDWG standards for citizen
>>> science. Robert Stevenson has volunteered to take the lead on
>>> preparing a technobioblitz lessons learned document, and though the
>>> scope of this document is not yet determined, I think the audience
>>> will include bioblitz organizers, software developers, and TDWG as a
>>> whole. I hope no one is shy about sharing lessons they think they
>>> learned, or suggestions that they have. We can use the bioblitz
>>> google group for this discussion, and copy in tdwg-content when our
>>> discussion is standards-specific.
>>>
>>> Here are some of my immediate observations:
>>>
>>> 1. Darwin Core is almost exactly right for citizen science. However,
>>> there is a desperate need for examples and templates of its use. To
>>> illustrate this need: one of the developers spoke of the design
>>> choice between "a simple csv file and a Darwin Core record". But a
>>> simple csv file is a legitimate representation of Darwin Core! To be
>>> fair to the developer, such a sentence might not have struck me as
>>> absurd a year ago, before Remsen said "let's use DwC for the
>>> bioblitz".
>>>
>>> We provided a couple of example DwC records (text and rdf) in the
>>> bioblitz data profile [1]. I think the lessons learned document
>>> should include an on-line catalog of cut-and-pasteable examples
>>> covering a variety of use cases, together with a dead simple
>>> description of DwC, something like "Darwin Core is a collection of
>>> terms, together with definitions."
>>>
>>> Here are areas where we augmented or diverged from DwC in the
>>> bioblitz:
>>>
>>> i. We added obs:observedBy [2], since there is no equivalent
>>> property in DwC, and it's important in Citizen Science (though often
>>> not available).
>>>
>>> ii. We used geo:lat and geo:long [3] instead of DwC terms for
>>> latitude and longitude. The geo namespace is a well used and
>>> supported standard, and records with geo coordinates are
>>> automatically mapped by several applications. Since everyone was
>>> using GPS to retrieve their coordinates, we were able to assume
>>> WGS-84 as the datum.
>>>
>>> If someone had used another datum, say XYZ, we would have added
>>> columns to the Fusion table so that they could have expressed their
>>> coordinates in DwC, as, e.g.:
>>> DwC:decimalLatitude=41.5
>>> DwC:decimalLongitude=-70.7
>>> DwC:geodeticDatum=XYZ
>>>
>>> (I would argue that it should be kosher DwC to express the above as
>>> simply XYZ:lat and XYZ:long. DwC already incorporates terms from
>>> other namespaces, such as Dublin Core, so there is precedent for
>>> this.)
>>>
>>> 2. DwC:scientificName might be more user friendly than
>>> taxonomy:binomial and the other taxonomy machine tags EOL uses for
>>> flickr images. If DwC:scientificName isn't self-explanatory enough,
>>> a user can look it up, and see that any scientific name is
>>> acceptable, at any taxonomic rank, or not having any rank. And once
>>> we have a scientific name, higher ranks can be inferred.
>>>
>>> 3. Catalogue of Life was an important part of the workflow, but we
>>> had some problems with it. Future bioblitzes might consider using
>>> something like a CoL fork, as recently described by Rod Page [4].
>>>
>>> 4. We didn't include "basisOfRecord" in the original data profile,
>>> and so it wasn't a column in the Fusion Table [5]. But when a
>>> transcriber felt it was necessary to include in order to capture
>>> data in a particular field sheet, she just added the column to the
>>> table. This flexibility of schema is important, and is in harmony
>>> with the semantic web.
>>>
>>> 5. There seemed to be enthusiasm for another field event at next
>>> year's TDWG. This could be an opportunity to gather other types of
>>> data (e.g. character data) and thereby i) expose meeting
>>> participants to another set of everyday problems from the world of
>>> biodiversity workflows, and ii) try other TDWG technology on for
>>> size, e.g. the observation exchange format, annotation framework,
>>> etc.
>>>
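To illustrate the point in observation 1 that a simple csv file is a legitimate representation of Darwin Core, here is a minimal sketch that reads a CSV whose column headers are DwC term names. The coordinate values echo the example in observation 1(ii); the species name, the "WGS84" default string, and the function name read_dwc_rows are assumptions made for illustration.

```python
import csv
import io

# Column headers are Darwin Core term names; the row itself is invented.
CSV_TEXT = """scientificName,decimalLatitude,decimalLongitude,geodeticDatum
Pandion haliaetus,41.5,-70.7,WGS84
"""

def read_dwc_rows(text, default_datum="WGS84"):
    """Parse CSV text with DwC-term headers into a list of dicts."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Coordinates arrive as strings; convert them to floats.
        row["decimalLatitude"] = float(row["decimalLatitude"])
        row["decimalLongitude"] = float(row["decimalLongitude"])
        # Fall back to the assumed datum when the column is absent or empty.
        row["geodeticDatum"] = row.get("geodeticDatum") or default_datum
        rows.append(row)
    return rows

rows = read_dwc_rows(CSV_TEXT)
```

Because csv.DictReader keys each row by whatever headers are present, a column added later (such as basisOfRecord in observation 4) passes through without any change to the reader.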
>>>
>>> Happy Thanksgiving to all in Canada -
>>> Joel.
>>> ----
>>>
>>>
>>> 1. http://groups.google.com/group/tdwg-bioblitz/web/tdwg-bioblitz-profile-v1-1
>>> 2. Slightly bastardizing our old observation ontology -
>>>    http://spire.umbc.edu/ontologies/Observation.owl
>>> 3. http://www.w3.org/2003/01/geo/
>>> 4. http://iphylo.blogspot.com/2010/10/replicating-and-forking-data-in-2010.html
>>> 5. http://tables.googlelabs.com/DataSource?dsrcid=248798
>>>
>>>
>>> _______________________________________________
>>> tdwg-content mailing list
>>> tdwg-content@lists.tdwg.org
>>>
>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>>
>>> ________________________________
>>>
>>> Please consider the environment before printing this email
>>> Warning: This electronic message together with any attachments is
>>> confidential. If you receive it in error: (i) you must not read,
>>> use, disclose, copy or retain it; (ii) please contact the sender
>>> immediately by reply email and then delete the emails.
>>> The views expressed in this email may not be those of Landcare
>>> Research New Zealand Limited.
>>> http://www.landcareresearch.co.nz
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu