[tdwg-tapir] Fwd: Tapir protocol - Harvest methods?

trobertson at gbif.org trobertson at gbif.org
Thu May 15 21:25:09 CEST 2008

I am interested in it for anyone who can provide a date-last-modified
field - the dump then really becomes the seed file for OAI-PMH (or similar)
incremental harvesting.

Off the top of my head I *think* (but don't quote me!) that UK NBN allow
people to basically delete their collection and upload a new version.  So
the collection code increments each time and OAI-PMH is out of the question,
as it is a full reharvest each time anyway.  If it is not NBN then it is
some other large aggregator that does this.
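The date-last-modified idea above could be sketched roughly like this. This is a hypothetical illustration only: the dump format, the `dateLastModified` column, and the function names are my assumptions, not any published NBN or GBIF schema.

```python
import csv
import io

# Assumption: dumps are CSV with an "id" column and an ISO 8601
# "dateLastModified" column (ISO dates compare correctly as strings).

def load_dump(dump_text):
    """Initial load: parse a full dump into a dict keyed by record id."""
    reader = csv.DictReader(io.StringIO(dump_text))
    return {row["id"]: row for row in reader}

def incremental_update(records, dump_text, since):
    """Apply only rows modified after `since`, updating in place.

    Returns the number of records that were inserted or overwritten.
    """
    reader = csv.DictReader(io.StringIO(dump_text))
    changed = 0
    for row in reader:
        if row["dateLastModified"] > since:
            records[row["id"]] = row
            changed += 1
    return changed
```

After the one-off seed from the full dump, subsequent harvests only touch records changed since the last run - which is the OAI-PMH `from` semantics Tim alludes to.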

> Hi Tim,
> Just wondering: In the case of NBN, after the initial import from a dump
> file and if they just change a few records, are you planning to download
> everything again and then overwrite all 20 million records in GBIF's
> database? Or are you also interested in using some sort of incremental
> harvesting?
> Best Regards,
> --
> Renato
>> Locally generated / localised DwC index files?
>> (if you have rich data behind an LSID, then this file is an index that
>> allows searching of those rich data using DwC fields)
>> I would like to see the data file accompanied by a compulsory metafile
>> that ensures rights, citation, contacts etc. are all given.  Whether this
>> file needs the data generation timestamp I am not so sure either, and the
>> HTTP header approach does sound good.  It means you can craft the
>> metafile once and then just cron the dump generation... This would be
>> for institutions with IT resources - e.g. UK NBN with 20M records.
>> For Joe Bloggs with a data set, if we included it in the wrapper tools,
>> then it is easy to rewrite the metafile seamlessly anyway, so they don't
>> care.
>> Cheers,
>> Tim
> _______________________________________________
> tdwg-tapir mailing list
> tdwg-tapir at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
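The compulsory metafile Tim describes in the quoted message could look something like this minimal sketch. JSON is chosen arbitrarily for illustration, and every field name here is an assumption rather than any agreed TDWG format:

```python
import json

def write_metafile(path, rights, citation, contact):
    """One-off crafting of the static metafile; the data dump itself can
    then be regenerated by cron without this file ever changing.

    Field names (rights, citation, contact) are illustrative only.
    """
    meta = {
        "rights": rights,
        "citation": citation,
        "contact": contact,
    }
    with open(path, "w") as f:
        json.dump(meta, f, indent=2)
    return meta
```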
