Hi Paul,
Thanks all for your responses.
I now understand the Simple Darwin Core a lot better.
So we have an official tapir model for Simple Darwin Core and that will suffice for us for now.
If we wish to implement it in Tapirlink what is required?
Another option is to use IPT instead of Tapirlink to provide our data - is that correct? How stable is IPT in terms of development - is it still beta? Is IPT difficult to install and what is required?
You could evaluate the IPT for this yes.
There will shortly be a release of the IPT (RC4, which will be available for first testing next week) with some enhancements and bug fixes, and I would recommend waiting for it before evaluating. The IPT has a TAPIR interface, although I am not sure how much it has been tested in live deployments, as GBIF uses the DarwinCore Archive interface to harvest content because it is far quicker.
In addition, GBIF has initiated a major refactoring of the IPT to address the feedback received in the first year of its use. Specifically, users have requested that it be lightened significantly (e.g. server requirements), made easier to install and configure, made faster, and made more intuitive on the user interface pages relating to publication state and registration. We are starting a 4 week coding camp on this tomorrow, and I anticipate that testing of certain components, a "pre-RC5 testing phase", will begin towards the end of September. This will not be a complete IPT, and we will canvass the community to determine how much functionality the IPT should support for RC5. This means the IPT may end up supporting only the DarwinCore-Archive output format and not TAPIR. Following RC5 we will only address bugs, halting new functionality until the IPT is considered fully stable and at a General Availability state.
If you have experience using, and are happy with, TAPIRLink and the operation of your TAPIR network, then TAPIRLink would likely be the best product to use for the time being, and you might consider evaluating the IPT either at RC4, or get involved around the time RC5 components are available for testing.
What the IPT provides:
- the publishing of checklist and occurrence information from either SQL sources or CSV, tab file (etc) upload
- the ability to author dataset metadata documents to accompany the data, or where the dataset is not digitally available (e.g. documenting undigitised collections)
- browse a web application in the IPT for each dataset
- register with GBIF associating the dataset with the Institutions of interest (the physical owners and the virtual hosting institutes)
- Fast full dataset transfer: due to the DarwinCore-Archive output format, it becomes trivial to transfer an entire dataset in a single object
- Support "many to one" extensions to the core records (occurrence or taxon). Extension definitions may be registered for others to use.
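To illustrate the "many to one" extension idea, here is a minimal Python sketch that attaches several extension rows to one core record. It is not IPT code: the sample data, the `id` and `coreid` column names, and the `attach_extensions` helper are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical core (occurrence) table; values and column names are illustrative.
CORE_TSV = (
    "id\tscientificName\tcountry\n"
    "1\tPuma concolor\tArgentina\n"
    "2\tVombatus ursinus\tAustralia\n"
)

# Hypothetical extension table: several rows may point at the same core id.
EXT_TSV = (
    "coreid\tmeasurementType\tmeasurementValue\n"
    "1\ttail length\t650\n"
    "1\tweight\t52000\n"
    "2\tweight\t26000\n"
)


def read_tsv(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_extensions(core_rows, ext_rows, key="measurements"):
    """Group extension rows by their core id and nest them under each core row."""
    by_core = defaultdict(list)
    for row in ext_rows:
        row = dict(row)  # copy so the caller's rows are not modified
        by_core[row.pop("coreid")].append(row)
    return [{**core, key: by_core.get(core["id"], [])} for core in core_rows]


records = attach_extensions(read_tsv(CORE_TSV), read_tsv(EXT_TSV))
for rec in records:
    print(rec["scientificName"], len(rec["measurements"]))
```

Each core record keeps its flat terms, and any number of extension rows hang off it through the shared identifier.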
I hope this helps guide your decision. Please let me know if I can provide more information on the IPT.
Cheers, Tim
Really appreciate your quick feedback on this stuff.
Best Regards Paul
From: Markus Döring [m.doering@mac.com]
Sent: 08 August 2010 05:09
To: Tim Robertson
Cc: tdwg-content@lists.tdwg.org; Dave Martin; Ajay Ranipeta; Paul Flemons
Subject: Re: [tdwg-content] Latest DwC as TAPIR output model?
Hi, not sure where exactly the conversation started, but I assume you are aware of the official tapir model for the simple darwin core? It looks as if this is up to date: http://code.google.com/p/darwincore/source/browse/trunk/tapir/dwc_simple.tom
In case you would like a flexible model that also allows for non flat extension records, we have created a new terms dwc tapir model which is used by the IPT: http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-provi...
The schema referenced is this one: http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-provi...
It defines a "DarwinExtensions" hook element in addition to the regular dwc terms, but places no restrictions on how the extensions are constructed; that is kept wide open.
Markus
On Aug 7, 2010, at 9:47, Tim Robertson (GBIF) wrote:
Hi all,
This thread started as a query about the availability of the latest DwC in TAPIR output models, and specifically for use with TAPIRLink. John correctly points out that the TDWG list should have been copied from the start, and I am remedying that now. I have condensed the thread to only the main points to make it easy to digest, and suggest we continue the discussion here. The current point in the discussion relates to the SimpleDarwinCore being flat in nature, and therefore why it holds a subset of all the DwC terms documented in the standard.
Thanks, Tim
On Aug 7, 2010, at 7:59 AM, John Wieczorek wrote:
Hi folks,
I wish that this conversation had taken place on the tdwg-content list so that others might take advantage of the discussion, but I don't feel comfortable moving someone else's conversations there.
Good point John. Sorry all, this was my mistake.
The Simple Darwin Core Schema is up to date. It does not contain the MeasurementOrFact terms, nor the ResourceRelationship terms, precisely because, as Renato said, the Simple Darwin Core is flat and the other terms in those two sets make little sense in a flat schema because they would allow you to share no more than one MeasurementOrFact and one ResourceRelationship.
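A small Python sketch of why these terms make little sense in a flat record: each term name can appear only once, so a second measurement overwrites the first. The record values and the `flatten` helper are hypothetical and only model the flat structure; they are not part of any Darwin Core tooling.

```python
# A flat record maps each term name to exactly one value (hypothetical values).
flat_record = {
    "catalogNumber": "M.12345",
    "scientificName": "Vombatus ursinus",
}

# Two measurements for the same specimen.
measurements = [
    {"measurementType": "weight", "measurementValue": "26000", "measurementUnit": "g"},
    {"measurementType": "head-body length", "measurementValue": "980", "measurementUnit": "mm"},
]


def flatten(record, measurement_rows):
    """Merge measurement terms into a flat record; repeated terms are overwritten."""
    flat = dict(record)
    for row in measurement_rows:
        flat.update(row)  # same keys every time, so only the last row survives
    return flat


flat = flatten(flat_record, measurements)
print(flat["measurementType"])  # only one measurement remains in the flat record
```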
There is a Generic Darwin Core schema that provides a model for building other schemas from the Darwin Core, but it is not an application schema as the SimpleDarwinCore schema is. At least two groups are using the Generic Darwin Core schema imported into other schemas to extend the capabilities of the Generic Darwin Core - the germ plasm folks and the Apiary folks. The former have a published schema at http://code.google.com/p/darwincore/source/browse/#svn/trunk/xsd/profiles/germplasm, while the latter are working on one for herbarium sheet data entry, for which they are interested in many more types of annotations than just the Identification class found in Darwin Core presently.
If there is more that you want to share than just the Simple Darwin Core, but you don't actually need a more complex structure, one simple way to do it would be to use the dynamicProperties term from Simple Darwin Core. The description is at http://rs.tdwg.org/dwc/terms/index.htm#dynamicProperties and further explanation of its use can be found in the "Do More with Simple Darwin Core" section (http://rs.tdwg.org/dwc/terms/simple/index.htm#domore) of the Simple Darwin Core page.
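As a rough sketch of the dynamicProperties approach, the Python below packs extra values into a single string of key=value pairs separated by semicolons and unpacks them again. The separator convention, the property names, and both helper functions are assumptions for illustration; check the term description for the recommended format.

```python
def encode_dynamic_properties(props):
    """Join a dict into 'key=value; key=value' (separator convention assumed)."""
    for key, value in props.items():
        text = f"{key}{value}"
        if "=" in text or ";" in text:
            raise ValueError(f"'=' and ';' are reserved as separators: {key!r}")
    return "; ".join(f"{key}={value}" for key, value in props.items())


def decode_dynamic_properties(text):
    """Split a 'key=value; key=value' string back into a dict of strings."""
    result = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing separator
        key, _, value = part.partition("=")
        result[key.strip()] = value.strip()
    return result


# Hypothetical extra measurements that have no dedicated Simple Darwin Core term.
encoded = encode_dynamic_properties({"tragusLengthInMeters": 0.014, "weightInGrams": 120})
decoded = decode_dynamic_properties(encoded)
print(encoded)
```

Note that decoding returns every value as a string, so a consumer has to know which properties are numeric.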
Hope that helps,
John
Renato, Simple Darwin Core is almost what I want, but the schema at http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd lacks the auxiliary terms listed at http://rs.tdwg.org/dwc/terms/index.htm, namely the MeasurementOrFact terms and the ResourceRelationship terms.
So ideally I would like all the terms listed at http://rs.tdwg.org/dwc/terms/index.htm included; however, if that is too difficult or would take too long, then I would be happy with the Simple Darwin Core schema as described at http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd
Paul
I'm not sure if the simple DwC schema is up-to-date, but it seems to contain more than 150 elements. I think the idea behind "simple" resides in the fact that the schema defines a flat structure, not that it only contains a subset of DarwinCore.
I can check if that schema is up-to-date and then build the output model, but first I would just need to know from you guys if this kind of schema will suit your needs. Renato
I was hoping the whole schema :-)
Here in Australia the OZCAM community (all Australian museums) has just migrated its schema to the new standard, using about 60 or 70 of the fields, so we would like to be able to expose all of them.
With my TDWG exec hat on I'd like to think that any data providers could expose their data using the new DwC schema and GBIF being the largest consumer of data should be doing their utmost to facilitate that being possible. TapirLink is the ideal way to do that as many are using it already and so upgrading providers to the new schema using Tapirlink should be relatively easy. Paul
I would think an output based on the simple DwC should be made available and GBIF will promote that with TAPIRLink, but I leave it to Paul to shout if he anticipates something else? Tim
You're right. There's no output model for the latest DwC, but it should be easy to make one. Is there a specific XML Schema for the data structure or are you planning to use the simple DwC?
http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd
If you want, just let me know the schema and I can quickly build a model for it. Renato
> Do you know if anyone put together output models for the TAPIR using the latest DwC? http://rs.tdwg.org/tapir/cs/dwc/ does not appear to have any and the folks in Australia are looking into this. Just want to make sure there aren't any floating around already.
> Tim
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content