[tdwg-tapir] Fwd: Tapir protocol - Harvest methods?

Markus Döring mdoering at gbif.org
Tue May 20 17:36:17 CEST 2008


Renato,
complex data can also be represented by tab files, with one file per
extension whose first column holds a pointer to the core record.
That is what we originally had in mind with the star scheme.
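
A minimal sketch of how a consumer might read such files, assuming
tab-separated text with a header row and the pointer to the core record in
the first column of every file. The file contents, the "id" column name and
the helper names read_tab and join_star are made up for illustration; they
are not part of TAPIR or of any agreed index file format.

```python
import csv
import io

# Hypothetical sample data: one core file and one extension file.
CORE = "id\tScientificName\n54321\tAster alpinus subsp. parviceps\n"
IDENT = (
    "id\tScientificName\tDateIdentified\n"
    "54321\tAster alpinus\t1913-03-12\n"
    "54321\tAster alpinus subsp. parviceps\t2003-09-07\n"
)

def read_tab(text):
    """Split tab-separated text into a header row and a list of data rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))
    return rows[0], rows[1:]

def join_star(core_text, extension_texts):
    """Attach extension rows to core records via the first-column pointer."""
    header, data = read_tab(core_text)
    records = {
        row[0]: {"core": dict(zip(header, row)), "extensions": {}}
        for row in data
    }
    for name, text in extension_texts.items():
        ext_header, ext_data = read_tab(text)
        for row in ext_data:
            record = records.get(row[0])
            if record is None:
                continue  # extension row points at an unknown core record
            values = dict(zip(ext_header[1:], row[1:]))
            record["extensions"].setdefault(name, []).append(values)
    return records

records = join_star(CORE, {"identification": IDENT})
for ident in records["54321"]["extensions"]["identification"]:
    print(ident["DateIdentified"], ident["ScientificName"])
```

Skipping extension rows with an unknown pointer is just one possible choice;
a stricter consumer might report them as errors instead.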

Markus



On 20 May, 2008, at 17:16, Renato De Giovanni wrote:

> Hi Markus,
>
> Since DarwinCore is a generic list of elements that can be used by
> any application schema, I think it's OK to use them in the new schema
> that you're suggesting.
>
> I agree that ideally we should try to define and use a common format
> for index files, although it seems that we will have at least two:
> csv for simple data and probably another one in XML for complex data,
> right?
>
> Regarding the XML for complex data, if you manage to find a generic
> schema that can be used in different contexts (not only biodiversity
> data) then I agree we could avoid extra attributes in the respective
> capabilities element. Otherwise, I would prefer to see some extra
> attribute (such as "outputModel") giving more information about the
> XML. Since TAPIR was designed to be generic, this should not be a
> problem because clients and networks are already free to decide and
> to mandate specific TAPIR capabilities. This doesn't mean that there
> will be lots of formats for index files. It's a matter of agreeing on
> a common format but still keeping the protocol generic to allow
> different uses by other communities.
>
> I also agree we could advertise the index file through some new TAPIR
> element instead of using the custom slot.
>
> Best Regards,
> --
> Renato
>
> On 16 May 2008 at 10:29, Markus Döring wrote:
>
>> Renato,
>> I was thinking along those lines too. It would be nice for TAPIR
>> providers to announce the availability of the index files. I wouldn't
>> mind adding it even to the regular TAPIR schema once it has proven to
>> work with the custom slot approach you have given.
>>
>> Regarding star-shaped data, I would prefer to agree on one format
>> instead of allowing different ones, to save consumers from this pain.
>> There is a straightforward XML serialisation for this scheme that we
>> could use instead of tab files:
>>
>> <record uri="">
>>   <dwc:property1 />
>>   <dwc:property2 />
>>   <extA:record>
>>     <extA:property1 />
>>     <extA:property2 />
>>   </extA:record>
>>   <extB:record>
>>     <extB:property1 />
>>     <extB:property2 />
>>   </extB:record>
>> </record>
>>
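
A rough sketch of how the structure above could be produced with Python's
standard xml.etree.ElementTree. The dwc namespace URI is taken from the
concept URIs used elsewhere in this thread; the extA namespace URI, the
record URI, the property values and the helper name build_record are
placeholders made up for illustration.

```python
import xml.etree.ElementTree as ET

DWC = "http://rs.tdwg.org/dwc/dwcore/"  # from the concept URIs in this thread
EXT_A = "http://example.org/extA/"      # hypothetical extension namespace

ET.register_namespace("dwc", DWC)
ET.register_namespace("extA", EXT_A)

def build_record(uri, core, ext_rows):
    """Build one star-shaped <record>: core properties plus nested extension records."""
    record = ET.Element("record", {"uri": uri})
    for name, value in core.items():
        ET.SubElement(record, f"{{{DWC}}}{name}").text = value
    for row in ext_rows:
        ext = ET.SubElement(record, f"{{{EXT_A}}}record")
        for name, value in row.items():
            ET.SubElement(ext, f"{{{EXT_A}}}{name}").text = value
    return record

record = build_record(
    "http://example.org/specimen/1",          # placeholder record URI
    {"property1": "a", "property2": "b"},     # placeholder core values
    [{"property1": "x", "property2": "y"}],   # one extension row
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```
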
>>
>> The advantage is that it can be produced by TAPIR software, and XML
>> serialisation is required for many services anyway, e.g. RSS.
>> But then again, the whole point of the index files is that they are
>> easy to generate and consume. On the other hand, this XML structure is
>> pretty simple to process and can be generated from databases like
>> SQL Server that have XML output straight away, without the need for
>> scripting.
>>
>> That touches on a different issue I am facing with the star scheme,
>> by the way. I have created an identification extension for Darwin Core
>> that holds the historical list of identification events and their
>> outcomes. This is a YAML section of the metafile describing the columns
>> for this extension through fully qualified concepts à la TAPIR:
>>
>> identification:
>>   - http://rs.tdwg.org/dwc/dwcore/ScientificName
>>   - http://rs.tdwg.org/dwc/dwcore/AuthorYearOfScientificName
>>   - http://rs.tdwg.org/dwc/dwcore/Family
>>   - http://rs.tdwg.org/dwc/dwcore/IdentificationQualifier
>>   - http://rs.tdwg.org/dwc/curatorial/DateIdentified
>>   - http://rs.tdwg.org/dwc/curatorial/IdentifiedBy
>>
>> When creating this I realised that pretty much all the concepts I was
>> interested in already exist in Darwin Core or the curatorial
>> extension. Wouldn't it be wise to reuse those concepts? Or are they
>> strictly tied to the idea of a current identification and therefore
>> can't be used for historical ones? This is probably more of a Darwin
>> Core question than a TAPIR one, but we are all on this list anyway ...
>>
>> The XML in that case would look something like this:
>>
>> <record uri="http://mygarden.com/specimen/plants/54321-423-43-54-6-3-24-44">
>>   <dwc:ScientificName>Aster alpinus subsp. parviceps</dwc:ScientificName>
>>   ...
>>   <ident:record>
>>     <dwc:ScientificName>Aster alpinus</dwc:ScientificName>
>>     <dwc:AuthorYearOfScientificName>L.</dwc:AuthorYearOfScientificName>
>>     <dwc:Family>Asteraceae</dwc:Family>
>>     <cur:DateIdentified>1913-03-12</cur:DateIdentified>
>>     <cur:IdentifiedBy>Karl Marx</cur:IdentifiedBy>
>>   </ident:record>
>>   <ident:record>
>>     <dwc:ScientificName>Aster alpinus subsp. parviceps</dwc:ScientificName>
>>     <dwc:AuthorYearOfScientificName>Novopokr.</dwc:AuthorYearOfScientificName>
>>     <dwc:Family>Asteraceae</dwc:Family>
>>     <cur:DateIdentified>2003-09-07</cur:DateIdentified>
>>     <cur:IdentifiedBy>Keith Richards</cur:IdentifiedBy>
>>   </ident:record>
>> </record>
>>
>>
>> Markus
>
> _______________________________________________
> tdwg-tapir mailing list
> tdwg-tapir at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
>



