Hi All,
I'd like to start nailing things down a little by getting some terms defined. Recent side discussions at the GUID-1 meeting were about the differences between resolving, searching and querying. These terms are often confused, so I would like to propose some definitions that we could use going forward. The intent here is to make up something useful rather than to work out what people have meant by these terms in the past.
1. *Resolving.* This means converting a pointer into a data object. Examples would be resolving an LSID and getting back either data or metadata, or resolving a URL and getting back a web page in HTML.
2. *Searching.* This means selecting a set of objects (or their proxies) on the basis of the values of their properties. The objects are predefined (implicitly part of the call) and we are simply looking for them. An example would be finding pages on Google.
3. *Querying.* This means asking a question of a data source that returns objects that are defined as part of the call. An example would be an SQL query, where the response objects (tuples) are defined in the query.
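To make the three definitions concrete, here is a minimal Python sketch against a toy in-memory data source. Everything in it is an assumption made for illustration: the `urn:example:` identifiers, the `RECORDS` table and the function names `resolve`, `search` and `query` are hypothetical and do not come from LSID or any existing service.

```python
# Hypothetical in-memory data source: pointer -> data object.
RECORDS = {
    "urn:example:1": {"genus": "Quercus", "species": "robur"},
    "urn:example:2": {"genus": "Quercus", "species": "alba"},
    "urn:example:3": {"genus": "Fagus", "species": "sylvatica"},
}


def resolve(pointer):
    """Resolving: convert a pointer into the data object it identifies."""
    return RECORDS[pointer]


def search(**criteria):
    """Searching: select predefined objects by the values of their properties.

    Returns pointers (proxies) to the matching objects, not new objects.
    """
    return sorted(
        pointer
        for pointer, record in RECORDS.items()
        if all(record.get(key) == value for key, value in criteria.items())
    )


def query(select, where):
    """Querying: the caller defines the shape of the returned tuples."""
    return [
        tuple(record[field] for field in select)
        for _, record in sorted(RECORDS.items())
        if where(record)
    ]
```

In this sketch `resolve("urn:example:3")` hands back one whole object, `search(genus="Quercus")` hands back pointers to existing objects, and `query(("species",), lambda r: r["genus"] == "Quercus")` hands back tuples whose shape was chosen by the caller, much as a SELECT list does in SQL.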
Using only 'resolving', an agent could crawl around and harvest data (i.e. act like a web crawler). Adding searching to resolving would make the crawler more efficient at gathering data for specific tasks. Querying allows arbitrary data retrieval.
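A rough sketch of that difference, again in Python and again with made-up names: the `GRAPH` data, its `links` field and the `kind` property are assumptions for illustration only. A crawler that can only resolve has to follow links and fetch everything it meets, while one that can also search goes straight to the objects it wants.

```python
from collections import deque

# Hypothetical linked objects: each one lists pointers to other objects.
GRAPH = {
    "a": {"kind": "specimen", "links": ["b", "c"]},
    "b": {"kind": "name", "links": ["c"]},
    "c": {"kind": "specimen", "links": ["a"]},
    "d": {"kind": "specimen", "links": []},
}


def resolve(pointer):
    """Convert a pointer into its data object."""
    return GRAPH[pointer]


def crawl(start):
    """Harvest every object reachable from start, using resolve alone."""
    harvested = {}
    queue = deque([start])
    while queue:
        pointer = queue.popleft()
        if pointer in harvested:
            continue  # already resolved, do not fetch it again
        harvested[pointer] = resolve(pointer)
        queue.extend(harvested[pointer]["links"])
    return harvested


def search(kind):
    """Select pointers to objects whose 'kind' property matches."""
    return sorted(p for p, obj in GRAPH.items() if obj["kind"] == kind)
```

Starting from "a", the resolve-only crawler fetches three objects to find two specimens and never reaches "d", because nothing links to it. With `search("specimen")` the agent gets pointers to all three specimens directly and only needs to resolve those.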
At a single data source, resolving should be easy to implement technically, searching slightly harder, and querying 'non-trivial'.
Resolving is 'easy' to do in a distributed environment (e.g. just use DNS, as LSID does). Searching is harder but can be federated (many examples, including Rod's work) or cached (GBIF, Google). Distributed querying is 'non-trivial' in the extreme, though some GRID guys will probably solve it for us in the future.
Is everyone OK with these definitions? If no one objects, I'll add them to the Wiki and we can work with them.
Thanks for your thoughts,
Roger