[Tdwg-tag] Why should data providers supply search and query services?

Roger Hyam roger at tdwg.org
Mon Mar 6 14:23:56 CET 2006


Hi Javier,

Thanks for your points. My responses below. Including a bit about GML 
that should really be in a different thread!

Javier de la Torre wrote:
> Hi,
>
> There are many reasons why I think data providers should be more 
> capable than just simple interfaces to indexers. Some of them have 
> already been pointed out by Markus and Bob, but I would like to offer a 
> non-technical reason that is closely related to what TDWG is good for.
>
> While making the biological collection databases available for GBIF 
> indexing I think we are also helping them in their daily work. Some 
> databases are setting up their own web interface based on BioCASe, 
> there are projects to help them georeference their specimens based 
> on the provider software, they have the possibility to export their 
> collection database and import it into another collection management 
> software, and many other useful possibilities that will hopefully 
> appear in the next years. These solutions are possible because the 
> software installed on their servers is capable of doing searches and 
> queries.
> So my argument here is that by setting up a query level and 
> installing capable software directly on the providers, we are 
> improving these databases.
>
But surely these people can search their own database already. If you 
are providing a cheap and easy web interface for them then that is a 
tangible benefit but could equally be done centrally with a branding for 
the institution. It doesn't physically have to reside with them and be 
maintained by them - though that may be the best for many organisations.
> I have used this argument for a while already when convincing data 
> providers: by joining GBIF they are not only making their data 
> available to the community, they will also benefit from the 
> tools that are appearing for them based on TDWG standards. I think it 
> is a good deal: make your data available and we will help you to 
> improve it with standard tools from the community at no cost.
I believe this is just as available if the indexing is outsourced from 
the data owner, but it is difficult to discuss abstractly here.
> There are also many people who do not want to share their data, 
> especially researchers, and who can also benefit from our software and 
> standards without having to participate in any network. If we create 
> good and useful software they might consider using it to handle their 
> data and at some point maybe open it to the public.
That might be a really nice side effect of our activities, but as it is 
based on serendipity (or good karma perhaps!) it is not something that 
we can plan for.
> This is also somehow related to what I call the OAI "model" versus the 
> OGC "model". The OAI is helping and promoting the accessibility of 
> data in distributed databases while the OGC is an organisation just 
> promoting the interoperability of applications.
> While the OAI is focused on making access to the data as good 
> as possible (to set up value-added services on top of cache databases), 
> the OGC community is working on making software interoperable, 
> extensible and open to new uses that they might not know of now.
>
I am not sure that either approach works in isolation. If I want to 
define a GML application schema that contains a definition of a bridge 
in it, for example, I am not sure how I relate this to all the other 
people who have defined (or may define in the future) bridge-like 
things. How do I write an application that will 'understand' not only my 
bridge feature but also any other features out there that are 
bridge-like but that I am not aware of just now. I can see how we get 
interoperability if we all agree to use the same Application Schema and 
I can see that we can use the same software for multiple Application 
Schemas but we want *data* interoperability not just *software* 
interoperability.
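
A rough way to picture that distinction is the Python sketch below. It 
is only an illustration: the classes stand in for feature types from two 
independently written application schemas, and every name in it 
(Feature, SchemaABridge, SchemaBCrossing, plot, is_bridge) is made up 
for the example rather than taken from GML or any real schema.

```python
# Hypothetical sketch: two independently written "application schemas",
# modelled as Python classes that share only a generic base type.

class Feature:
    """Stands in for the generic feature type every schema extends."""

    def __init__(self, name, coordinates):
        self.name = name
        self.coordinates = coordinates


class SchemaABridge(Feature):
    """Bridge type defined by one agency (made-up name)."""


class SchemaBCrossing(Feature):
    """Bridge-like type defined by another agency (made-up name)."""


def plot(feature):
    """Software interoperability: works for any Feature subtype."""
    return f"{feature.name} at {feature.coordinates}"


def is_bridge(feature):
    """Only recognises the one type this application was written against."""
    return isinstance(feature, SchemaABridge)


mine = SchemaABridge("Bridge 1", (0.0, 0.0))
theirs = SchemaBCrossing("Crossing 2", (1.0, 1.0))
```

Both objects can be passed to plot(), which is the software 
interoperability part. is_bridge(theirs) is false even though the thing 
it describes is bridge-like, which is the missing data interoperability.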

Here is an example: The British Ordnance Survey have their own GML 
Application Schema and it defines a "FerryLink" that extends from their 
own abstract feature type and back to GML feature. So I can treat it 
like a GML feature in an application - very useful - this is software 
interoperability. Trouble is I have no way of knowing that it is to do 
with ferries and water or anything useful about it unless I can read 
English - which machines don't. (Incidentally there is no documentation 
in the schema so I can't retrieve it and display it to the user 
automatically either). Presumably another mapping agency is also 
encoding ferry links. In fact the instances of ferry routes that are 
encoded using this schema may join the UK to countries that use 
different GML Application Schemas to define the *same* physical ferry 
links!

My understanding is that the GML model does not give me a way to 
discover this or express it once I know that there are two ways of 
talking about the same physical objects.
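
To make the missing piece concrete, here is a minimal Python sketch of 
the kind of statement that would be needed: a table, kept outside the 
schemas, recording that two type names denote the same kind of thing. It 
is an assumption-laden illustration and not part of GML or any OGC 
standard. The namespace URIs, the "FerryRoute" type, the concept 
identifier and the function names are all invented; only the name 
"FerryLink" comes from the OS example.

```python
# Hypothetical sketch: an external table recording that type names from
# different schemas denote the same kind of real-world thing.
# The namespace URIs, "FerryRoute" and the concept identifier are made up.

OS_NS = "http://example.org/os-schema"
OTHER_NS = "http://example.org/other-agency-schema"

# (namespace, local type name) -> shared concept identifier
CONCEPTS = {
    (OS_NS, "FerryLink"): "concept:ferry-connection",
    (OTHER_NS, "FerryRoute"): "concept:ferry-connection",
}


def concept_of(namespace, type_name):
    """Return the shared concept for a schema type, or None if unmapped."""
    return CONCEPTS.get((namespace, type_name))


def same_kind(type_a, type_b):
    """True when both (namespace, name) pairs map to the same known concept."""
    a = concept_of(*type_a)
    b = concept_of(*type_b)
    return a is not None and a == b
```

With this table, same_kind((OS_NS, "FerryLink"), (OTHER_NS, "FerryRoute")) 
is true, while any type that nobody has mapped is simply unknown. The 
table still has to be written and agreed by someone, which is the part 
the schemas themselves do not provide.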

You can get the OS application schemas here:

http://www.ordnancesurvey.co.uk/oswebsite/xml/schema/index.html

My knowledge of OGC standards is limited, so I may be wrong on this and 
stand to be corrected.

We are building a global system so we have to be able to reconcile 
different encodings of the same object types. GML does not solve this 
problem but might be useful in other ways. The OAI standards may be 
useful for finding stuff.

> I want to think that GBIF is like the OAI and that, instead of creating 
> its own technology to achieve its goals, it is using TDWG. On the other 
> hand, I tend to think of TDWG as more like the OGC. So, GBIF is just one 
> user of TDWG work.
>
> So my vote goes for more sophisticated  data providers that allow us 
> to construct more things on top of them without having to consider 
> GBIF at all.
> TAPIR looks fine to me for this task, even more if complemented with 
> the TAPIR "Lite" idea for  providers that just want to contribute to 
> GBIF.
>
I am sure we need both fat and thin providers but I also think we need 
to define the roles played by different actors within the network more 
formally - which I think we will in the near future.

Some good points,

Roger

> Best regards,
>
> Javier.
>


-- 

-------------------------------------
 Roger Hyam
 Technical Architect
 Taxonomic Databases Working Group
-------------------------------------
 http://www.tdwg.org
 roger at tdwg.org
 +44 1578 722782
-------------------------------------
