<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Specific responses inline<br>

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">Hmmm...not sure I follow.  Are you saying that a new Event record (ID)

should be created for every Occurrence record, and that a new Location

record (ID) should be created for every Event record?  If so, then it's

going to be very difficult to convince me of this.  I don't think that our

database is unusual in having many (sometimes hundreds) of Occurrences at

the same Event (e.g., a large fish poison station), and many (again,

sometimes hundreds) of Events at the same Location.

  </pre>

</blockquote>

I think I answered this (in a sense) in the other email that I sent a

little while ago.&nbsp; In principle, if every record of an Occurrence has a

time that is discernibly different from the times of other Occurrences

(i.e. because the time was recorded to the nearest second by a

machine), then yes, I consider it to be a different event.&nbsp; To force

people to lump such Occurrences together into one Event (say comprising

a day or some other time interval larger than a second) is essentially

throwing away data that we already have about the Occurrences.&nbsp; I have

already found it useful to know exactly what time of day a flower was

opened rather than closed or the order in which I took several images.&nbsp;

<br>

<br>

As for creating a new Event record ID for all of those one-second

events, I would submit the same solution that I gave in the previous

post.&nbsp; In my database, I don't have separate Event records for the

times when I took live plant images (which I am rightly or wrongly

calling Occurrences).&nbsp; I just have a "flat" table where each image has

an eventTime (the time recorded automatically in the EXIF data for the

image).&nbsp; If I needed an RDF structure that was

compatible with the kind of structure used by people who have many

Occurrences associated with a single event (e.g. your fish kill), then for

the image <a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu/baskauf/57755">http://bioimages.vanderbilt.edu/baskauf/57755</a> I would

automatically create an identifier for the event of its creation:

<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu/baskauf/57755#event">http://bioimages.vanderbilt.edu/baskauf/57755#event</a>.&nbsp; There!&nbsp; I have a

perfectly valid GUID that represents the event without any additional

burden on my record-keeping system since I don't have to keep any

additional records about that GUID beyond the ones I'm already keeping

for the image.&nbsp; I'm not doing this now, but I suppose I should if there

is a consensus that things are related to each other in the way shown

in your diagram.&nbsp; The same approach could be taken for Location.&nbsp;

People who care about having unrelated identifiers for their Events and

Locations (because of one-to-many relationships) would be welcome to do

so, but I wouldn't need to in my internal database.<br>
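As a rough sketch of the identifier derivation described above (not something I am running now), the event identifier can be computed from the image identifier whenever it is needed, so nothing extra has to be stored.&nbsp; The function name <tt>derived_iri</tt> and the <tt>#location</tt> fragment are assumptions made only for illustration; only the <tt>#event</tt> form appears above.<br>
<pre wrap="">
```python
from urllib.parse import urldefrag


def derived_iri(resource_iri, fragment):
    """Return resource_iri with any existing fragment replaced by `fragment`.

    Hypothetical helper: the identifier is derived on demand from the
    resource's own IRI, so no separate Event or Location record is kept.
    """
    base, _old_fragment = urldefrag(resource_iri)
    return base + "#" + fragment


image_iri = "http://bioimages.vanderbilt.edu/baskauf/57755"

# The "#event" form is the one given in the message above.
event_iri = derived_iri(image_iri, "event")

# "#location" is an assumed analogue for the Location case.
location_iri = derived_iri(image_iri, "location")

print(event_iri)
print(location_iri)
```
</pre>
Someone with true one-to-many Events would skip the derivation and assign independent identifiers instead; the two approaches can coexist as long as each identifier is unique.<br>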

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">

Which database?  Are we talking about DwCA?  If so, I understand the

rationale for flattening out content to make it easier to batch-package

records and ship them around.  But if we're talking about actual database

implementations at the content provider end, I think I'm not alone in

wanting to stick with a more normalized approach.  Besides, what's the point

of even defining the different DwC classes, each with their own ID, if we're

just going to flatten them all out anyway (as per old DwC)?

  </pre>

</blockquote>

Well, yes I had DwCA (Darwin Core Archives) specifically in mind.&nbsp; But

it could be any other "shipping format" or local database format.&nbsp; I

think we need to draw the distinction between having a way that people

can understand the meaning of metadata records and fields that we are

shipping to them (e.g. DwCA), and describing the properties and

connections between resources (e.g. in RDF).&nbsp; I think this may be what

Pete means when he says that we need two "kinds" of DwC.&nbsp; The first use

is pretty much "ready to go".&nbsp; I don't think we know how close we are

to being able to use the existing DwC terms for describing

relationships until we have some more conversation of the sort we are

having now as well as conversation about which existing terms can be

used (perhaps in ways that weren't originally intended) to express the

relationships needed in RDF.&nbsp; I feel like I hear Pete saying that we

are still lacking a lot of the predicates we need, while I (and maybe

Cam) feel that we are most of the way there.<br>

<br>

For clarification, when I argue that the class Individual should exist

in Darwin Core, it's not because I'm insisting that all users must have

an Individual table in their database.&nbsp; What I want is for people to be

ABLE to have an Individual table in their database (if they need it)

and have others understand what it means and how the entities described

in that table are related conceptually to other things like

Identifications and Occurrences.&nbsp; If all of the records in their

database have only one occurrence per individual, they don't "need" to

keep track of Individuals.<br>

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">Yes, they could be collapsed to Occurrence -- in the same way that

properties of "Individual" are currently collapsed to Occurrence.  But after

pleading your case to normalize "Individual" as its own separate class, I'm

kinda surprised to see you arguing in favor of collapsing the Event class

into Occurrence.

  </pre>

</blockquote>

I confess my crime.&nbsp; In penance, I freely confess that the Event class

exists and that people should use it in their databases if it helps

them cluster Occurrences.&nbsp; In addition, I confess that I should

probably acknowledge the existence of Events and their relationship to

other Darwin Core classes when I write RDF.&nbsp; Guilty as charged!<br>

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <blockquote type="cite">

    <pre wrap="">Yes....sort of.  Doesn't help for localities defined as bounded boxes,

polygons or lines (e.g., transects, as we often have for data from plankton

tows) -- but it certainly does serve as a handy "natural key" of sorts for

point localities.  The problem is that so much of our existing content is not

reliably georeferenced yet.  Thus, we need all those other terms to

accommodate various location descriptors, which will eventually allow us to

after-the-fact georeference the localities.  Also, many after-the-fact

georeferenced points are interpretations.  Keeping the descriptors around

can allow someone else to come up with a better/more precise

lat/long/uncertainty interpretation.  Also, errors are abundant

(particularly in failing to represent decimal degrees with negatives).

Having the descriptors allows us to catch such errors much more quickly.

    </pre>

  </blockquote>

</blockquote>

Here I will confess the crime of ignorance.&nbsp; I'm still trying to

understand the need for and uses of a number of the Darwin Core

dcterms:Location class terms.&nbsp; I guess I need to spend some more time

reading the Guide to Best Practices for Georeferencing (I was going to

include the link here, but the link at

<a class="moz-txt-link-freetext" href="http://www.biogeomancer.org/library.html">http://www.biogeomancer.org/library.html</a> is broken).&nbsp; <br>

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">

Sure -- but we can't really ignore the massive numbers of non-georeferenced

data points that already exist. And even when they are georeferenced, we'll

still want to keep the original location descriptors.

  </pre>

</blockquote>

Agree on this and all other points I deleted here.<br>

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">


  <pre wrap="">

OK, I see where you're coming from.  But I guess my response is that we're a

LONG way from:

- A world where most existing Occurrence content is well georeferenced;

  </pre>

</blockquote>

[text omitted for brevity]<br>

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">So I still see the advantage of keeping Location as a separate class,

maintaining those extra "human-friendly" descriptor terms, and

conceptualizing 1:M Location:Events.

  </pre>

</blockquote>

I'm totally convinced.&nbsp; Location and Event belong where you had them in

the diagram.&nbsp; <br>

<br>

[more text listed below]

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">

I also still find it interesting that you are quite content to flatten

Locality and Event class terms into Occurrence, while simultaneously wanting

to normalize Individual as a new class, separate from Occurrence.  I'm not

saying that establishing an Individual class is a bad idea -- in principle I

support it.  But I am curious as to why you think it important to push for

normalization on the Individual side of an Occurrence, but push for

de-normalization on the Event side of an Occurrence.

  </pre>

</blockquote>

Again, I plead guilty as charged.&nbsp; The left side of the chart should be

allowed to not be flat.&nbsp; However, I maintain my stance that many

(most?) new Occurrence records will in the future have their own

individual latitude/longitude/elevation or depth/time.&nbsp; Those "atomized

event points" could easily be aggregated by software into larger-scale

events and locations by some simple rules about the timespan for events

and geographic bounds for locations.&nbsp; Those larger scale events and

locations could be used to ask the kinds of questions you describe

below.&nbsp; As long as you aren't requiring me to do this kind of

aggregation BEFORE I create my records (and hence requiring me to lose

the data that my GPS has collected for me automatically), I'm happy to

allow others to define events and locations on larger scales and with

1:M relationships.<br>
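To make the "simple rules" idea concrete, here is a minimal sketch of how software might aggregate atomized points after the fact.&nbsp; Everything in it is an assumption chosen for illustration: the function name <tt>aggregate_events</tt>, the one-hour gap, the 0.01 degree bound, and the sample coordinates.&nbsp; Real rules would depend on the questions being asked.<br>
<pre wrap="">
```python
from datetime import datetime, timedelta


def aggregate_events(points, max_gap=timedelta(hours=1), max_degrees=0.01):
    """Group (time, latitude, longitude) points into larger-scale events.

    A point joins the current event when it follows the previous point by
    no more than max_gap and lies within max_degrees of the event's first
    point in both latitude and longitude. Both thresholds are arbitrary
    assumptions, not recommended values.
    """
    events = []
    for point in sorted(points):  # tuples sort by time first
        time, lat, lon = point
        if events:
            current = events[-1]
            _first_time, first_lat, first_lon = current[0]
            last_time = current[-1][0]
            close_in_time = max_gap >= time - last_time
            close_in_space = (max_degrees >= abs(lat - first_lat)
                              and max_degrees >= abs(lon - first_lon))
            if close_in_time and close_in_space:
                current.append(point)
                continue
        # Otherwise this point starts a new event.
        events.append([point])
    return events


# Made-up sample data: three images taken one morning, one the next day.
points = [
    (datetime(2010, 10, 1, 9, 0, 0), 36.1447, -86.8027),
    (datetime(2010, 10, 1, 9, 0, 5), 36.1447, -86.8027),
    (datetime(2010, 10, 1, 9, 30, 0), 36.1449, -86.8030),
    (datetime(2010, 10, 2, 14, 0, 0), 36.1447, -86.8027),
]

for event in aggregate_events(points):
    print(event[0][0], "to", event[-1][0], "points:", len(event))
```
</pre>
The original per-second times and coordinates are untouched by this grouping, so different thresholds can be applied later without losing anything.<br>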

<blockquote cite="mid:ECF31DCD22DC44578C676AD9D722E2CE@RLPLaptop"

 type="cite">

  <pre wrap="">

That may be true.  But I've spent more than two decades taking

over-simplified, flattened database structures and transforming them into

more normalized structures, because the flattened structures consistently

limited my ability to ask novel questions of the database, and also

encouraged inconsistency of data entry practices.  The small price I pay for

increased normalization has yielded ample return in more "powerful" datasets

(i.e., more flexibility in how I can frame and/or analyze the data).

  </pre>

</blockquote>

As I said in my earlier email, I'm encouraged by the consistency in the

way I hear people talking about the relationships among the DwC

classes.&nbsp; I was a bit afraid when we started this thread that I would

turn out to have some kind of fringe ideas.&nbsp; Now what I'm seeing is a

lot of variation on how people choose to "collapse" the basic model to

meet their individual needs, but not a lot of disagreement about what

the basic model is.<br>

<br>

Thanks for your great feedback and for challenging my statements.&nbsp; I

need that!<br>

<br>

Steve<br>

<pre class="moz-signature" cols="72">-- 

Steven J. Baskauf, Ph.D., Senior Lecturer

Vanderbilt University Dept. of Biological Sciences

postal mail address:

VU Station B 351634

Nashville, TN  37235-1634,  U.S.A.

delivery address:

2125 Stevenson Center

1161 21st Ave., S.

Nashville, TN 37235

office: 2128 Stevenson Center

phone: (615) 343-4582,  fax: (615) 343-6707

<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>

</pre>

</body>

</html>