[tdwg-content] Latest DwC as TAPIR output model?

Paul Flemons Paul.Flemons at austmus.gov.au
Mon Aug 9 09:35:48 CEST 2010


Markus,
I did wonder whether that was the case, but being a non-techie I wasn't sure.
Thanks for clearing that up - all good for us now to proceed.

Thanks
Paul



________________________________________
From: "Markus Döring (GBIF)" [mdoering at gbif.org]
Sent: 09 August 2010 17:32
To: Paul Flemons
Cc: Tim Robertson (GBIF); tdwg-content at lists.tdwg.org; Ajay Ranipeta; Dave Martin; renato at cria.org.br
Subject: Re: [tdwg-content] Latest DwC as TAPIR output model?

Paul,
you only need to change some configuration in TapirLink to make the new Darwin Core terms available for mapping.
In addition to the output model for tapir we also host the official CNS (concept name server) file that tells tapir software which concepts/terms are available for mapping. You can find it next to the output model here:
http://darwincore.googlecode.com/svn/trunk/tapir/dwc_terms.cns

In TapirLink the best option is to update the tapirlink/config/schemas.xml file like this, so the new terms become available for any new resource:

<?xml version="1.0" encoding="utf-8" ?>
<schemas>
  <!-- renamed to dwc 1.4 to avoid confusion -->
  <schema alias="DarwinCore 1.4">
    <namespace>http://rs.tdwg.org/dwc/dwcore/</namespace>
    <location>http://rs.tdwg.org/tapir/cns/dwc_1_4.xml</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
  <schema alias="DarwinCore 1.4 curatorial extension">
    <namespace>http://rs.tdwg.org/dwc/curatorial/</namespace>
    <location>http://rs.tdwg.org/tapir/cns/dwcur_1_4.xml</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
  <schema alias="DarwinCore 1.4 geospatial extension">
    <namespace>http://rs.tdwg.org/dwc/geospatial/</namespace>
    <location>http://rs.tdwg.org/tapir/cns/dwgeo_1_4.xml</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
  <schema alias="ABCD 2.06">
    <namespace>http://www.tdwg.org/schemas/abcd/2.06</namespace>
    <location>http://rs.tdwg.org/tapir/cns/abcd_2_06.xml</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
  <!-- added to support new darwin core terms -->
  <schema alias="DarwinCore">
    <namespace>http://rs.tdwg.org/dwc/terms/</namespace>
    <location>http://darwincore.googlecode.com/svn/trunk/tapir/dwc_terms.cns</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
</schemas>
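
One way to confirm that an edited schemas.xml is still well-formed and registers the new terms namespace is to parse it before opening the configurator. What follows is only a minimal Python sketch and not part of TapirLink: the function names list_schemas and has_namespace are made up for illustration, and the embedded XML is an abbreviated copy of the file above (two of the five entries).

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for tapirlink/config/schemas.xml (assumption: in
# practice you would read the real file from disk instead).
SCHEMAS_XML = """<?xml version="1.0" encoding="utf-8" ?>
<schemas>
  <schema alias="DarwinCore 1.4">
    <namespace>http://rs.tdwg.org/dwc/dwcore/</namespace>
    <location>http://rs.tdwg.org/tapir/cns/dwc_1_4.xml</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
  <schema alias="DarwinCore">
    <namespace>http://rs.tdwg.org/dwc/terms/</namespace>
    <location>http://darwincore.googlecode.com/svn/trunk/tapir/dwc_terms.cns</location>
    <handler>CnsSchemaHandler_v2</handler>
  </schema>
</schemas>
"""

# Namespace of the new Darwin Core terms, as given in the config above.
DWC_TERMS_NS = "http://rs.tdwg.org/dwc/terms/"


def list_schemas(xml_text):
    """Return {alias: {namespace, location, handler}} for each <schema>.

    Hypothetical helper; raises xml.etree.ElementTree.ParseError if the
    text is not well-formed XML.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = {}
    for schema in root.findall("schema"):
        entries[schema.get("alias")] = {
            "namespace": schema.findtext("namespace", default="").strip(),
            "location": schema.findtext("location", default="").strip(),
            "handler": schema.findtext("handler", default="").strip(),
        }
    return entries


def has_namespace(entries, namespace):
    """True if any registered schema declares the given namespace."""
    return any(e["namespace"] == namespace for e in entries.values())


if __name__ == "__main__":
    entries = list_schemas(SCHEMAS_XML)
    for alias, entry in entries.items():
        print(alias, "->", entry["location"])
    print("new DwC terms registered:", has_namespace(entries, DWC_TERMS_NS))
```

The check is deliberately limited to well-formedness and the presence of the namespace; it does not fetch the CNS file or validate its contents.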


With this the new terms become available in the TapirLink configurator. Alternatively, you can insert the link to the CNS file manually for every resource you want to configure: in step 5 (Mapping), at the bottom under "Location of additional schema:", enter http://darwincore.googlecode.com/svn/trunk/tapir/dwc_terms.cns and select CNS XML file.


I am checking this schemas.xml change into the trunk of TapirLink at SourceForge now.

Markus


On Aug 8, 2010, at 13:38, Paul Flemons wrote:

> Tim,
> I think we will use TapirLink for now in which case we need to have the Simple Darwin Core schema implemented in TapirLink.
> Any volunteers for doing that?
>
> We will be interested in the IPT version at RC5.
>
> Ta
> Paul
>
> ________________________________________
> From: Tim Robertson (GBIF) [trobertson at gbif.org]
> Sent: 08 August 2010 19:34
> To: Paul Flemons
> Cc: Markus Döring; tdwg-content at lists.tdwg.org; Ajay Ranipeta; Dave Martin; renato at cria.org.br
> Subject: Re: [tdwg-content] Latest DwC as TAPIR output model?
>
> Hi Paul,
>
>> Thanks all for your responses.
>>
>> I now understand the Simple Darwin Core a lot better.
>>
>> So we have an official tapir model for Simple Darwin Core and that
>> will suffice for us for now.
>>
>> If we wish to implement it in Tapirlink what is required?
>>
>> Another option is to use IPT instead of Tapirlink to provide our
>> data - is that correct?
>> How stable is IPT in terms of development - is it still beta?
>> Is IPT difficult to install and what is required?
>
> You could evaluate the IPT for this yes.
>
> There will shortly be a release of the IPT (the RC4 release which will
> be available for first testing next week) which will have some
> enhancements and bug fixes and I would recommend waiting for this
> before evaluating.  The IPT has a TAPIR interface, although I am not
> sure how much it has been tested in live deployments, as GBIF use the
> DarwinCore Archive interface to harvest content as it is far quicker.
> In addition, GBIF has initiated a major refactoring of the IPT to
> address the feedback received in the first year of its use.
> Specifically, users have requested that it be lightened up
> significantly (e.g. server requirements), be made easier to install
> and configure, perform better, and have more intuitive user interface
> pages relating to the publication state and the registration.  We are
> starting a 4 week coding camp on this
> tomorrow, and I anticipate testing of certain components to begin
> towards the end of September, known as a "pre-RC5 testing phase".
> This will not be a complete IPT and we will canvass the community to
> determine how much functionality should be supported by the IPT for
> RC5.  This means there is potential that the IPT will not support
> TAPIR but only the DarwinCore-Archive output format.  Following RC5 we
> will only address bugs and halt new functionality additions until the
> IPT is considered fully stable and considered to be at a General
> Availability state.
>
> If you have experience using, and are happy with TAPIRLink and the
> operation of your TAPIR network then TAPIRLink would likely be the
> best product to use for the time being, and you might consider
> evaluating the IPT either at RC4, or get involved around the time RC5
> components are available for testing.  What the IPT provides:
> - the publishing of checklist and occurrence information from either
> SQL sources or CSV, tab file (etc) upload
> - the ability to author dataset metadata documents to accompany the
> data, or where the dataset is not digitally available (e.g.
> documenting undigitised collections)
> - browse a web application in the IPT for each dataset
> - register with GBIF associating the dataset with the Institutions of
> interest (the physical owners and the virtual hosting institutes)
> - Fast full dataset transfer: due to the DarwinCore-Archive output
> format, it becomes trivial to transfer an entire dataset in a single
> object
> - Support "many to one" extensions to the core records (occurrence or
> taxon).  Extension definitions may be registered for others to use.
>
> I hope this helps guide your decision.  Please let me know if I can
> provide more information on the IPT.
>
> Cheers,
> Tim
>
>
>>
>> Really appreciate your quick feedback on this stuff.
>>
>> Best Regards
>> Paul
>>
>>
>>
>> ________________________________________
>> From: Markus Döring [m.doering at mac.com]
>> Sent: 08 August 2010 05:09
>> To: Tim Robertson
>> Cc: tdwg-content at lists.tdwg.org; Dave Martin; Ajay Ranipeta; Paul
>> Flemons
>> Subject: Re: [tdwg-content] Latest DwC as TAPIR output model?
>>
>> Hi,
>> not sure where exactly the conversation started, but I assume you
>> are aware of the official tapir model for the simple darwin core? It
>> looks as if this is up to date:
>> http://code.google.com/p/darwincore/source/browse/trunk/tapir/dwc_simple.tom
>>
>>
>> In case you would like a flexible model that also allows for non
>> flat extension records, we have created a new terms dwc tapir model
>> which is used by the IPT:
>> http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-providertool/resources/tapir_dwc_model.xml
>>
>> The schema referenced is this one:
>> http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-providertool/resources/tapir_dwc_model.xsd
>>
>> it defines a "DarwinExtensions" hook element in addition to the
>> regular dwc terms. But no restrictions are made to how the
>> extensions are constructed, it keeps it wide open.
>>
>> Markus
>>
>>
>> On Aug 7, 2010, at 9:47, Tim Robertson (GBIF) wrote:
>>
>>> Hi all,
>>>
>>> This thread started as a query about the availability of the latest
>>> DwC in TAPIR output models, and specifically for use with TAPIRLink.
>>> John correctly points out that the TDWG list should have been
>>> copied from the start, and I am remedying that now.  I have
>>> condensed the thread to only the main points to make it easy to
>>> digest, and suggest we continue the discussion here.  The current
>>> point in the discussion relates to the SimpleDarwinCore being flat
>>> in nature, and therefore why it holds a subset of all the DwC terms
>>> documented in the standard.
>>>
>>> Thanks,
>>> Tim
>>>
>>>
>>> On Aug 7, 2010, at 7:59 AM, John Wieczorek wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I wish that this conversation had taken place on the tdwg-content
>>>> list so that others might take advantage of the discussion, but I
>>>> don't feel comfortable moving someone else's conversations there.
>>>
>>> Good point John.  Sorry all, this was my mistake.
>>>
>>>>
>>>> The Simple Darwin Core Schema is up to date. It does not contain
>>>> the MeasurementOrFact terms, nor the ResourceRelationship terms,
>>>> precisely because, as Renato said, the Simple Darwin Core is flat
>>>> and the other terms in those two sets make little sense in a flat
>>>> schema because they would allow you to share no more than one
>>>> MeasurementOrFact and one ResourceRelationship.
>>>>
>>>> There is a Generic Darwin Core schema that provides a model for
>>>> building other schemas from the Darwin Core, but it is not an
>>>> application schema as the SimpleDarwinCore schema is. At least two
>>>> groups are using the Generic Darwin Core schema imported into
>>>> other schemas to extend the capabilities of the Generic Darwin
>>>> Core - the germ plasm folks and the Apiary folks. The former have
>>>> a published schema at
>>>> http://code.google.com/p/darwincore/source/browse/#svn/trunk/xsd/profiles/germplasm,
>>>> while the latter are working on one
>>>> for herbarium sheet data entry, for which they are interested in
>>>> many more types of annotations than just the Identification class
>>>> found in Darwin Core presently.
>>>>
>>>> If there is more that you want to share than just the Simple
>>>> Darwin Core, but don't actually need a more complex structure, one
>>>> simple way to do it would be to use the dynamicProperties term
>>>> from Simple Darwin Core. The description is at http://rs.tdwg.org/dwc/terms/index.htm#dynamicProperties
>>>> and further explanation on its use can be found in the "Do
>>>> More with Simple Darwin Core" section (http://rs.tdwg.org/dwc/terms/simple/index.htm#domore)
>>>> of the Simple Darwin Core page.
>>>>
>>>> Hope that helps,
>>>>
>>>> John
>>>>
>>>> Renato,
>>>> Simple Darwin Core is almost what I want, but looking at the schema
>>>> here http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd
>>>> it is lacking the auxiliary terms that are listed here http://rs.tdwg.org/dwc/terms/index.htm
>>>> - which are the Measurement or Fact terms and the Resource
>>>> Relationship terms.
>>>>
>>>> So ideally I would like all the terms listed here http://rs.tdwg.org/dwc/terms/index.htm
>>>> included; however, if that is too difficult or would take too long
>>>> then I would be happy with the Simple Darwin Core schema as
>>>> described here http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd
>>>> Paul
>>>>
>>>> I'm not sure if the simple DwC schema is up-to-date, but it seems to
>>>> contain more than 150 elements. I think the idea behind "simple" resides
>>>> in the fact that the schema defines a flat structure, not that it only
>>>> contains a subset of DarwinCore.
>>>>
>>>> I can check if that schema is up-to-date and then build the output model,
>>>> but first I would just need to know from you guys if this kind of schema
>>>> will suit your needs.
>>>> Renato
>>>>
>>>>
>>>>> I was hoping the whole schema :-)
>>>>>
>>>>> Here in Australia the OZCAM community (all Australian museums) has just
>>>>> migrated its schema to the new standard - using about 60 or 70 of the
>>>>> fields so we would like to be able to expose all of them.
>>>>>
>>>>> With my TDWG exec hat on I'd like to think that any data providers could
>>>>> expose their data using the new DwC schema and GBIF being the largest
>>>>> consumer of data should be doing their utmost to facilitate that being
>>>>> possible.
>>>>> TapirLink is the ideal way to do that as many are using it already and so
>>>>> upgrading providers to the new schema using TapirLink should be relatively
>>>>> easy.
>>>>> Paul
>>>
>>>>>> I would think an output based on the simple DwC should be made available
>>>>>> and GBIF will promote that with TAPIRLink, but I leave it to Paul to
>>>>>> shout if he anticipates something else?
>>>>>> Tim
>>>
>>>>>>> You're right. There's no output model for the latest DwC, but it should be
>>>>>>> easy to make one. Is there a specific XML Schema for the data structure or
>>>>>>> are you planning to use the simple DwC?
>>>>>>>
>>>>>>> http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd
>>>>>>>
>>>>>>> If you want, just let me know the schema and I can quickly build a model
>>>>>>> for it.
>>>>>>> Renato
>>>
>>>>>>>> Do you know if anyone put together output models for the TAPIR using
>>>>>>>> the latest DwC?
>>>>>>>> http://rs.tdwg.org/tapir/cs/dwc/ does not appear to have any and the
>>>>>>>> folks in Australia are looking into this.  Just want to make sure
>>>>>>>> there aren't any floating around already.
>>>>>>>> Tim
>>>
>>> _______________________________________________
>>> tdwg-content mailing list
>>> tdwg-content at lists.tdwg.org
>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>
>>
>> #####################################################################################
>> This e-mail message has been scanned for Viruses and Content and
>> cleared
>> by MailMarshal
>> #####################################################################################
>>
>> ALIVE
>> UNTIL 19 SEPTEMBER 2010
>> A dynamic program of events celebrating
>> the International Year of Biodiversity
>> TALKS • TOURS • WORKSHOPS
>> KIDS ACTIVITIES • MUSIC • DISPLAYS
>> www.australianmuseum.net.au/biodiversity
>>
>>
>>
>>
>> The Australian Museum.
>>
>>
>>
>> The views in this email are those of the user and do not necessarily
>> reflect the views of the Australian Museum. The information
>> contained in this email message and any accompanying files is or may
>> be confidential and is for the intended recipient only. If you are
>> not the intended recipient, any use, dissemination, reliance,
>> forwarding, printing or copying of this email or any attached files
>> is unauthorised. If you are not the intended recipient, please
>> delete it and notify the sender. The Australian Museum does not
>> guarantee the accuracy of any information contained in this e-mail
>> or attached files. As Internet communications are not secure, the
>> Australian Museum does not accept legal responsibility for the
>> contents of this message or attached files.
>>
>> Please consider the environment before printing this email.
>>
>
>



