An earlier version of the draft DwC had a simple flag, ValidDistribution (true/false).  This is perhaps the flag that Donald was looking for; it morphed into the more complex establishmentMeans to give various stakeholders the chance to figure out more precisely, as you say, "what, exactly, are we interested in knowing?"  John's current version is more flexible and suggests a controlled vocabulary, although one has not yet been worked out.  ValidDistribution = false would capture the pandas in DC and the Ross's gull (http://www.allaboutbirds.org/guide/Rosss_Gull/id) in Massachusetts at a particular point in time.  It would not, however, adequately capture the extensive escapes and the accidental and purposeful introductions of non-native species that become established and are later considered to be ValidDistribution = true (apparently this might also now be the case for Ross's gull, but I remember the big hoo-ha in 1975!).  When is this line considered to be crossed?  This is likely why John, in his wisdom, decided to deprecate ValidDistribution in favor of something with a broader possible interpretation.

Thanks for your discussion below. 

Gail

On Oct 11, 2010, at 11:00 PM, Richard Pyle wrote:

Hi Gail,

Yes, and this gets back to Donald's original point, and one I've wrestled
with for a long time -- which boils down to, "what, exactly, are we
interested in knowing"?

In my experience, most people seem most concerned with whether an organism
got to where it was
captured/observed/photographed/surveyed/monitored/whatever with the aid of
humans, or without.  I think the reason for this is that most people
distinguish between "natural" and "artificial" on the basis of how much
control humans had in it.

After that, the next most important question tends to be whether it's an
anomalous occurrence, or a locally reproducing population. This is where we
get into waifs & vagrants (I see the examples you gave as subcategories of
"Native", in my hyper-simplified vocabulary in my earlier post); and from
there, it gets into shades of grey on how "established" the population is
(e.g., does it achieve the threshold of "naturalized"?)  

Unfortunately, these two things (how it got there, vs. whether it's locally
reproducing) are really different metrics, yet people often try to combine
them into the same category.

From my read of http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans,
both metrics are evident. In a sense, the presence of the word "established"
might seem to restrict it to only locally-reproducing populations.  However,
one could also interpret the word "establish" in the context of an individual
organism (e.g., how did this individual Ross's Gull get established in
Massachusetts?)

The GBIF vocabulary that Dave sent the link for also seems to mix the two
metrics.

And there are other considerations I've seen in databases as well (including
our own): such as the distinction between "adventive" and "intentional"
introductions.  Also, the word "invasive" (one of the examples on the DwC
page for establishmentMeans) suggests another angle, implying some sort of
"harm" done to the ecosystems or to human interests such as agriculture.

So again, I ask: "What, exactly, are we interested in knowing?"  If I read
Donald correctly, he simply wants to filter out occurrence records of Pandas
in Washington DC.  But Niche Modellers might come from a different angle,
and seek evidence of whether or not an organism is capable of sustaining a
locally reproducing population in some particular place (regardless of
whether the founders of the population were brought by humans, or got there
on their own).

If we can answer that question, then the next question will be whether we
can pack all the answers into a single controlled vocabulary for
"establishmentMeans", or whether we may need to think about another DwC
term, with a slightly different meaning.
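
To make the distinction concrete, here is a minimal Python sketch that keeps the two metrics (how the organism arrived, and whether it reproduces locally) in separate fields. The class names, field names and values are hypothetical, not Darwin Core terms, and the sample records are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Arrival(Enum):
    """How the organism got there (hypothetical values, not a DwC vocabulary)."""
    UNAIDED = "unaided"
    HUMAN_AIDED = "human-aided"
    UNKNOWN = "unknown"


@dataclass
class Occurrence:
    label: str
    arrival: Arrival
    reproducing_locally: Optional[bool]  # None means "not assessed"


def arrived_unaided(records: List[Occurrence]) -> List[Occurrence]:
    """Keep records whose organism arrived without human help."""
    return [r for r in records if r.arrival is Arrival.UNAIDED]


def locally_reproducing(records: List[Occurrence]) -> List[Occurrence]:
    """Keep records from populations asserted to reproduce locally."""
    return [r for r in records if r.reproducing_locally is True]


# Illustrative values only; the third record is an invented placeholder.
records = [
    Occurrence("Panda, Washington DC", Arrival.HUMAN_AIDED, False),
    Occurrence("Ross's Gull, Massachusetts, 1975", Arrival.UNAIDED, False),
    Occurrence("Hypothetical naturalized introduction", Arrival.HUMAN_AIDED, True),
]

print([r.label for r in arrived_unaided(records)])
print([r.label for r in locally_reproducing(records)])
```

The two filters return different subsets of the same records, which is why one combined category has trouble serving both questions.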

Aloha,
Rich


________________________________

From: Gail Kampmeier [mailto:gkamp@illinois.edu]
Sent: Monday, October 11, 2010 4:41 PM
To: Richard Pyle
Cc: tdwg-content@lists.tdwg.org; tdwg-bioblitz@googlegroups.com
Subject: Re: [tdwg-content] What I learned at the TechnoBioBlitz


Your human-centered distinctions will likely prove to be useful in
certain contexts (quarantine; invasive species watches), but think of birds
or insects blown or carried into areas by meteorological events that survive
and become established for a season or more, or as curiosities of
observation (Ross's gull in Massachusetts).  Range extension is important in
climate change research and may have nothing to do with direct human
activities outlined below.

Cheers!
Gail


On Oct 11, 2010, at 9:26 PM, Richard Pyle wrote:


I certainly agree it's important!  I was
just saying that a simple flag probably wouldn't be enough.  I like the idea
of a controlled vocabulary (as you and John both allude to), and I can
imagine about a half-dozen terms that our community will no doubt adopt with
almost no debate.....  :-)

In my mind, the broadest categories (and likely most useful)
would be something like:

Native (was there without any assistance from humans)
Introduced (got there with the assistance of humans, but is
inhabiting the natural environment)
Captive (brought by humans and still maintained in
captivity)

You might also throw in "Cryptogenic", which is an assertion
that we do not know which of these categories a particular organism falls
into (not the same as null, which means we don't know whether or not we know).
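
As a rough illustration of how "Cryptogenic" differs from null, here is a small Python sketch. The enum and the parse function are hypothetical names, and the four values simply mirror the categories listed above; none of this is an agreed vocabulary.

```python
from enum import Enum
from typing import Optional


class EstablishmentCategory(Enum):
    """The four broad categories suggested above (hypothetical encoding)."""
    NATIVE = "native"
    INTRODUCED = "introduced"
    CAPTIVE = "captive"
    CRYPTOGENIC = "cryptogenic"  # an explicit assertion that the origin is unknown


def parse_category(raw: Optional[str]) -> Optional[EstablishmentCategory]:
    """Map a raw field value to a category.

    An empty or missing value returns None (nothing was asserted at all).
    A value outside the vocabulary raises ValueError.
    """
    if raw is None or not raw.strip():
        return None
    return EstablishmentCategory(raw.strip().lower())


print(parse_category("Cryptogenic"))  # someone asserted "we do not know"
print(parse_category(""))             # nobody asserted anything
```

Keeping None separate from CRYPTOGENIC lets software tell "someone looked and could not decide" apart from "nobody filled in the field".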

Of course, each of these can be further subdivided, but the
more we subdivide, the greater the ratio of fuzzy:clean distinctions. I
would say that the terms should be established in consultation with those
most likely to use them (e.g., as you suggest, distribution analysis, niche
modellers, etc.)  For example, it might be useful to distinguish between an
organism that was itself introduced, compared to the progeny (or a
well-established population) of an introduced organism. This information can
be useful for separating things likely to become established in new
localities, vs. things that do not seem to "take" in a novel environment.

Anyway...I didn't want to say a lot on this topic (too
late?); I just wanted to steer more towards a controlled vocabulary than a
simple flag field.

Aloha,
Rich


________________________________

From: Donald.Hobern@csiro.au
[mailto:Donald.Hobern@csiro.au]
Sent: Monday, October 11, 2010 3:44 PM
To: Richard Pyle; tuco@berkeley.edu
Cc: tdwg-content@lists.tdwg.org;
tdwg-bioblitz@googlegroups.com
Subject: RE: [tdwg-content] What I learned at the
TechnoBioBlitz


Hi Rich.
I recognise this (and could
probably define many different useful flags).  The bottom line is really
whether or not the location is one which should be used for distribution
analysis, niche modelling and similar activities.  There will certainly be
many grey areas, but it would be good if software could weed out captive
occurrences.
Donald

Donald Hobern, Director, Atlas of Living Australia
CSIRO Ecosystem Sciences, GPO Box 1700, Canberra,
ACT 2601
Phone: (02) 62464352 Mobile: 0437990208
Email: Donald.Hobern@csiro.au
Web: http://www.ala.org.au/


From: Richard Pyle [mailto:deepreef@bishopmuseum.org]
Sent: Tuesday, 12 October 2010 12:33 PM
To: Hobern, Donald (CES, Black Mountain);
tuco@berkeley.edu
Cc: tdwg-content@lists.tdwg.org;
tdwg-bioblitz@googlegroups.com
Subject: RE: [tdwg-content] What I learned at the
TechnoBioBlitz
I'm not so sure a simple flag will do it.  We have
examples ranging from animals in zoos, to escaped animals, to intentionally
and unintentionally introduced populations, to naturalized populations --
and just about everything in-between.  Where on this spectrum would you draw
the line for flagging something as "naturally occurring"?
Rich

________________________________

From:
tdwg-content-bounces@lists.tdwg.org
[mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of
Donald.Hobern@csiro.au
Sent: Monday, October 11, 2010 2:59 PM
To: tuco@berkeley.edu
Cc: tdwg-content@lists.tdwg.org;
tdwg-bioblitz@googlegroups.com
Subject: Re: [tdwg-content] What I learned
at the TechnoBioBlitz

Thanks, John.
This is
useful, but completely uncontrolled - effectively a
verbatimEstablishmentMeans.  Having a more controlled version or a simple
flag which could be machine-processible in those cases where providers can
supply it would be useful.
Donald



From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On Behalf Of John
Wieczorek
Sent: Tuesday, 12 October 2010 11:34 AM
To: Hobern, Donald (CES, Black Mountain)
Cc: jsachs@csee.umbc.edu;
tdwg-bioblitz@googlegroups.com; tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] What I learned
at the TechnoBioBlitz

Natural occurrence is meant to be captured
through the term dwc:establishmentMeans
(http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans).

On Mon, Oct 11, 2010 at 5:16 PM,
<Donald.Hobern@csiro.au> wrote:
Thanks, Joel.

Nice summary.  One addition which we do need
to resolve (and which has been suggested in recent months) is to have a flag
to indicate whether a record should be considered to show a "natural"
occurrence (in distinction from cultivation, botanic gardens, zoos, etc.).
This is not so much an issue in a BioBlitz, but is certainly a factor with
citizen science recording in general - see the number of zoo animals in the
Flickr EOL group.

Donald




-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org
[mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of joel sachs
Sent: Monday, 11 October 2010 10:47 PM
To: tdwg-bioblitz@googlegroups.com;
tdwg-content@lists.tdwg.org
Subject: [tdwg-content] What I learned at
the TechnoBioBlitz

One of the goals of the recent bioblitz was
to think about the suitability and appropriateness of TDWG standards for
citizen science. Robert Stevenson has volunteered to take the lead on
preparing a technobioblitz lessons learned document, and though the scope of
this document is not yet determined, I think the audience will include
bioblitz organizers, software developers, and TDWG as a whole. I hope no one
is shy about sharing lessons they think they learned, or suggestions that
they have. We can use the bioblitz google group for this discussion, and
copy in tdwg-content when our discussion is standards-specific.

Here are some of my immediate observations:

1. Darwin Core is almost exactly right for
citizen science. However, there is a desperate need for examples and
templates of its use. To illustrate this need: one of the developers spoke
of the design choice between "a simple csv file and a Darwin Core record".
But a simple csv file is a legitimate representation of Darwin Core! To be
fair to the developer, such a sentence might not have struck me as absurd a
year ago, before Remsen said "let's use DwC for the bioblitz".

We provided a couple of example DwC records
(text and rdf) in the bioblitz data profile [1]. I  think the lessons
learned document should include an on-line catalog of cut-and-pasteable
examples covering a variety of use cases, together with a dead simple
description of DwC, something like "Darwin Core is a collection of terms,
together with definitions."
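
As one possible cut-and-pasteable example of the point that a simple csv file can be Darwin Core, here is a short Python sketch. The column headers are Darwin Core term names; the single data row is made up for illustration, and the helper name is hypothetical.

```python
import csv
import io

# Headers are Darwin Core terms; the row values are illustrative only.
CSV_TEXT = """scientificName,decimalLatitude,decimalLongitude,geodeticDatum,basisOfRecord
Rhodostethia rosea,41.5,-70.7,WGS84,HumanObservation
"""


def read_dwc_csv(text):
    """Parse CSV text into a list of dicts keyed by Darwin Core term name."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_dwc_csv(CSV_TEXT)
for rec in records:
    print(rec["scientificName"], rec["decimalLatitude"], rec["decimalLongitude"])
```

Nothing beyond the header row marks this as Darwin Core: the terms, used with their definitions, are what make it a DwC record.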

Here are areas where we augmented or
diverged from DwC in the bioblitz:

i. We added obs:observedBy [2], since there
is no equivalent property in DwC, and it's important in Citizen Science
(though often not available).

ii. We used geo:lat and geo:long [3] instead
of DwC terms for latitude and longitude. The geo namespace is a well used
and supported standard, and records with geo coordinates are automatically
mapped by several applications. Since everyone was using GPS  to retrieve
their coordinates, we were able to assume WGS-84 as the datum.

If someone had used another datum, say XYZ,
we would have added columns to the Fusion table so that they could have
expressed their coordinates in DwC, as, e.g.:
DwC:decimalLatitude=41.5
DwC:decimalLongitude=-70.7
DwC:geodeticDatum=XYZ

(I would argue that it should be kosher DwC
to express the above as simply XYZ:lat and XYZ:long. DwC already
incorporates terms from other namespaces, such as Dublin Core, so there is
precedent for this.)
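
For readers who want the DwC columns anyway, the relabelling from geo:lat and geo:long can be sketched in a few lines of Python. The function name is hypothetical, and the WGS84 default reflects the assumption stated above that all coordinates came from GPS; the sketch only renames fields and does not transform coordinates between datums.

```python
def geo_to_dwc(record, datum="WGS84"):
    """Return a copy of record with geo:lat/geo:long expressed as DwC terms.

    Only the field names change; the coordinate values are not reprojected.
    """
    out = {k: v for k, v in record.items() if k not in ("geo:lat", "geo:long")}
    out["decimalLatitude"] = float(record["geo:lat"])
    out["decimalLongitude"] = float(record["geo:long"])
    out["geodeticDatum"] = datum
    return out


observation = {"geo:lat": "41.5", "geo:long": "-70.7"}
print(geo_to_dwc(observation))
```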

2. DwC:scientificName might be more user
friendly than taxonomy:binomial and the other taxonomy machine tags EOL uses
for flickr images.  If DwC:scientificName isn't self-explanatory enough, a
user can look it up, and see that any scientific name is acceptable, at any
taxonomic rank, or not having any rank. And once we have a scientific name,
higher ranks can be inferred.

3. Catalogue of Life was an important part
of the workflow, but we had some problems with it. Future bioblitzes might
consider using something like a CoL fork, as recently described by Rod Page
[4].

4. We didn't include "basisOfRecord" in the
original data profile, and so it wasn't a column in the Fusion Table [5].
But when a transcriber felt it was necessary to include in order to capture
data in a particular field sheet, she just added the column to the table.
This flexibility of schema is important, and is in harmony with the semantic
web.

5. There seemed to be enthusiasm for another
field event at next year's TDWG. This could be an opportunity to gather
other types of data (e.g. character data) and thereby i) expose meeting
participants to another set of everyday problems from the world of
biodiversity workflows, and ii) try other TDWG technology on for size, e.g.
the observation exchange format, annotation framework, etc.


Happy Thanksgiving to all in Canada -
Joel.
----


1. http://groups.google.com/group/tdwg-bioblitz/web/tdwg-bioblitz-profile-v1-1
2. Slightly bastardizing our old observation ontology - http://spire.umbc.edu/ontologies/Observation.owl
3. http://www.w3.org/2003/01/geo/
4. http://iphylo.blogspot.com/2010/10/replicating-and-forking-data-in-2010.html
5. http://tables.googlelabs.com/DataSource?dsrcid=248798


_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org

http://lists.tdwg.org/mailman/listinfo/tdwg-content
