XML gateways

Sun Mar 25 21:30:52 CEST 2001

Dave,

I tried looking at http://tsadev.speciesanalyst.net/DarwinCore/Darwin_core.asp

but I was unsuccessful. Is the address wrong? Must I be registered?

Rob Stevenson

Dave Vieglais wrote:

> >-----Original Message-----
> >From: TDWG - Structure of Descriptive Data [mailto:TDWG-SDD at USOBI.ORG]On
> >Behalf Of Robert A. (Bob) Morris
> >Sent: Tuesday, March 20, 2001 10:07 AM
> >To: TDWG-SDD at USOBI.ORG
> >Subject: Re: XML gateways
> >
> >
> >Jim Croft writes:
> > > Date:         Wed, 21 Mar 2001 00:25:09 +1100
> > > From: Jim Croft <jrc at anbg.gov.au>
> > > To: TDWG-SDD at usobi.org
> > > Subject:      Re: XML gateways
> > >
> ...
>
> >
> >Sad, but true. We should kick around other ideas to solve this
> >problem, though a conventional port may be the easiest for data
> >sources to implement. How about:
> >
> >- A convention whereby if
> >    http://<host>/<filePath>/<cgiquery>
> >yields HTML then
> >    http://<host>/<filePath>/xml/<cgiquery>
> >yields XML
> >
> >- A convention whereby people set up virtual hosts---pretty easy in
> >most web servers---so that if
> >    http://<host>/<filePath>/<cgiquery>
> >yields HTML then something like
> >     http://xml-<host>/<filePath>/<cgiquery>
> >yields XML
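
To make the two conventions concrete, here is a minimal Python sketch of the
URL rewriting each one implies. The function names and the example.org URL
are my own placeholders, not part of anyone's proposal, and I am assuming the
<cgiquery> part is the last path segment plus any query string.

```python
from urllib.parse import urlsplit, urlunsplit


def xml_url_by_path(html_url: str) -> str:
    """Convention 1: insert an 'xml' path segment just before the CGI query."""
    parts = urlsplit(html_url)
    # Split the path into <filePath> (head) and the final <cgiquery> segment (tail).
    head, _, tail = parts.path.rpartition("/")
    return urlunsplit(parts._replace(path=f"{head}/xml/{tail}"))


def xml_url_by_host(html_url: str) -> str:
    """Convention 2: same path and query, served from an 'xml-' virtual host."""
    parts = urlsplit(html_url)
    # Assumes the URL carries no user:password@ prefix in its host part.
    return urlunsplit(parts._replace(netloc=f"xml-{parts.netloc}"))


if __name__ == "__main__":
    url = "http://example.org/data/search.cgi?genus=Quercus"
    print(xml_url_by_path(url))
    print(xml_url_by_host(url))
```

Either mapping can be computed by a client with no lookup at all, which is
the appeal of a convention over a registry.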
> >
> >Hopefully all of this is temporary, since a well crafted GBIF should
> >provide for discovery of URL, query syntax, and return schema. My real
> >point is that current versions of Oracle, SQL-Server, FileMaker, and
> >Access(?) can already emit XML without much(?) effort on the part of
> >the data source operators, and doing so would let people proceed to
> >build interesting distributed applications.
>
> There are initiatives to resolve each of the issues brought up here that are
> being worked on in several industry sectors:
> 1. Service discovery is addressed by UDDI (Universal Description,
> Discovery and Integration).
> 2. Service description is addressed by WSDL (Web Services Description
> Language), an XML document format for describing the methods and data
> exposed by a web service.
> 3. IGIR (Interface for Generic Information Retrieval) provides a generic
> format for information retrieval via HTTP GET or POST requests that is easy
> to implement.  The queries are sent as a structure easily translated into a
> number of common query syntaxes (SQL for example).  Results are contained
> within a consistently formatted document.  IGIR is optionally stateful (it
> is up to the server to decide), so a query that generates a very large
> result set need not be sent back to the client all in one hit.  IGIR also
> supports a SOAP interface (that is what it was originally designed for), so
> programmatic access to information resources becomes pretty simple.
>
> A search and retrieval scenario would work something like this:
> A client queries the UDDI directory, looking for services that support IGIR
> and are "biologically relevant".  A list of services supporting IGIR is
> returned, and the client broadcasts a query to them and waits for results to
> start coming back.  Since the records are all formatted in a similar manner,
> it is a simple process for the client to merge the results and present
> the user with a nicely formatted set.
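
The broadcast-and-merge step of that scenario might look roughly like the
Python sketch below. It is only an illustration under loud assumptions: the
UDDI lookup and the IGIR wire format are not modelled at all, each "service"
is a hypothetical in-memory stand-in, and the field names ScientificName and
Collector are chosen by me for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_service(records):
    """Hypothetical stand-in for a remote provider: prefix match on each query field."""
    def search(query):
        return [r for r in records
                if all(str(r.get(field, "")).startswith(value)
                       for field, value in query.items())]
    return search


def broadcast_query(services, query):
    """Send the same query to every service and merge the records that come back."""
    merged = []
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {pool.submit(search, query): name
                   for name, search in services.items()}
        for future in as_completed(futures):
            source = futures[future]
            try:
                records = future.result()
            except Exception:
                continue  # one failing provider should not sink the whole search
            # Tag each record with its provider so merged results stay traceable.
            merged.extend({**record, "source": source} for record in records)
    # A shared record structure is what makes a single sort key possible.
    return sorted(merged, key=lambda r: (r["ScientificName"], r["source"]))


if __name__ == "__main__":
    services = {
        "museum_a": make_service([
            {"ScientificName": "Quercus alba", "Collector": "Smith"},
            {"ScientificName": "Acer rubrum", "Collector": "Jones"},
        ]),
        "museum_b": make_service([
            {"ScientificName": "Quercus rubra", "Collector": "Lee"},
        ]),
    }
    for record in broadcast_query(services, {"ScientificName": "Quercus"}):
        print(record)
```

The merge is trivial here only because both stand-in providers already agree
on field names, which is the point about needing an agreed record structure.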
>
> Anyway, that's how things should work.  To support a broadcast query,
> the targets obviously must understand the context of the query terms, so
> groups within a particular domain (biological collections, for example)
> will need a set of "well known" access points (searchable fields).
> Similarly, to simplify the merging of results that come back from such a
> query, it would be convenient if there were an agreed record structure
> shared between data providers.  The XML output by most database vendors
> is a hack designed to allow easy transport of database records over HTTP.
> Things are getting better with XSD and so forth, but the structure of the
> resulting XML documents will still reflect the structure of the databases
> rather than any agreed-upon common format.
>
> Which brings us back to the "biological collections profile" concept that
> was raised at the last TDWG meeting.  Would anybody like to volunteer a set
> of access points (searchable fields) that are relevant for biological
> collections?  Similarly for the record structure?  The DarwinCore elements
> start to address these issues, but are somewhat deficient in several
> respects.  They might make a reasonable starting place, though.  The
> description, with a comment section (add your gripes etc.), is available at:
> http://tsadev.speciesanalyst.net/DarwinCore/Darwin_core.asp
>
> Most of you know that the DarwinCore was originally developed as a Z39.50
> profile, hence some of the language may seem a little odd.  But the main
> thing to look at is the list of access points: which are useful from
> a biological collections point of view, and what's missing?
>
> There are obviously other domains of biological databases besides
> collection databases.  Any suggestions for commonality between them all?
>
> Cheers,
>   Dave V.
>
> =========================
> David A. Vieglais
> University of Kansas
> Natural History Museum &
> Biodiversity Research Center
> http://www.nhm.ukans.edu
